Cloud infrastructure isn’t a new concept for those working in technology, and many young startups are realizing the benefits early on. Like others, we contemplated a cloud solution for storage and infrastructure to help us scale and reduce associated costs, but were unsure what to expect from it.
As one of the fastest growing ecommerce companies in the U.S., Zulily is a site featuring designer fashions for moms, babies and kids. We’re focused on improving the customer experience and delivering fresh new products at low prices. We launch more than 9,000 product styles every day. To make the shopping experience as easy as possible for our millions of active customers, we began using data and predictive analytics to help drive business decisions. This includes using data to connect retail customers to the products and styles they’ll be most interested in and enabling our own network of buyers who sell products on our site. On the backend, we maintain a tremendous amount of data that provides us complete visibility into customers’ orders throughout our complex supply chain of thousands of vendors. On top of all that, we consolidate data from all marketing channels to assess what programs deliver the best ROI and make decisions in near real-time.
Like so many others, especially in the retail industry, we found ourselves overwhelmed by consumer demand. This is a great problem to have, but if we couldn’t maintain an infrastructure to support the influx of orders, our entire business could be put at risk. To keep up with demand, we had to redesign the way we processed, analyzed and used big data. The biggest challenge was figuring out how to build a data platform that would allow hundreds of users across the business to make decisions at lightning speed while our data continued to grow exponentially. Our legacy infrastructure could not scale to meet that need. Several data sources made this project more complex. First, the clickstream data from our website and mobile apps, which enables us to understand user behavior, was exploding with the company’s growth. Second was the growing amount of email data. And third was data from various external sources such as carriers, marketing partners and ad platforms.
We needed to explore every option because, as any technical lead knows, you adopt an infrastructure that you hopefully won’t need to change every few years. We were looking for something that could grow with us. After exploring various options including self-hosted solutions, the cost economics of cloud were just too good to be ignored. For scale, cloud was the obvious choice. The combination of Hadoop on Google Compute Engine, storage in Google Cloud Storage and Google BigQuery for analytics proved to be the best fit for us.
The next step was getting executive buy-in. Sometimes this can be the hardest part, because you’re generally asking for a substantial amount of capital to get started. Once leadership saw the capabilities and the complementary cost structure behind it, we moved ahead with Google Cloud Platform in June 2014 because of its capabilities, ease of use and cost. We had learned that co-locating storage and compute on the same machines reduces latency but increases cost over time, because storage needs grow much faster than compute needs. Having storage and compute separated across Google Compute Engine and Google Cloud Storage is a big benefit and enables us to continue scaling without astronomical costs. Google BigQuery also boosted overall performance by cutting the runtime of complex queries from minutes and hours to a matter of seconds. Finally, when scaling quickly with big data, designing infrastructure around peak loads is critical. The key is implementing an infrastructure that can handle peak loads when it needs to, but that doesn’t require you to pay for that load level all the time. If you’re growing quickly like we were, but experiencing highs and lows throughout the day, week or month, “pay as you go” models are critical.
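The two cost arguments above can be made concrete with a little arithmetic. What follows is only a rough sketch in Python, not a description of our actual sizing model: the function names, node capacities, load figures and prices are all hypothetical, chosen purely to illustrate the reasoning.

```python
import math


def coupled_nodes(storage_tb, compute_units, tb_per_node=10, units_per_node=8):
    """Nodes needed when every machine bundles a fixed amount of storage and compute.

    The cluster must satisfy whichever resource demands more machines.
    Capacities per node are hypothetical.
    """
    return max(
        math.ceil(storage_tb / tb_per_node),
        math.ceil(compute_units / units_per_node),
    )


def decoupled_compute_nodes(compute_units, units_per_node=8):
    """Compute nodes needed when storage lives in a separate storage service."""
    return math.ceil(compute_units / units_per_node)


def peak_provisioned_cost(hourly_load, unit_price):
    """Cost of paying for peak capacity during every hour of the period."""
    return max(hourly_load) * len(hourly_load) * unit_price


def pay_as_you_go_cost(hourly_load, unit_price):
    """Cost of paying only for the capacity actually used in each hour."""
    return sum(hourly_load) * unit_price


if __name__ == "__main__":
    # Hypothetical growth: storage quadruples while compute grows by half.
    print(coupled_nodes(100, 40), "->", coupled_nodes(400, 60))                # 10 -> 40
    print(decoupled_compute_nodes(40), "->", decoupled_compute_nodes(60))      # 5 -> 8

    # Hypothetical load profile with a short peak, priced at 1 unit per hour.
    load = [2, 2, 2, 10, 10, 2]
    print(peak_provisioned_cost(load, 1), "vs", pay_as_you_go_cost(load, 1))   # 60 vs 28
```

In the coupled case the machine count is driven entirely by storage, so compute is bought that nobody needs. With storage separated, the compute fleet grows only as fast as compute demand does. The second comparison shows the same effect over time: a short peak sets the bill for every hour unless you pay by usage.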
After just six months, our entire business changed. We are now able to provide item-level visibility across the organization, which was impossible with the previous legacy solution. Benefits have been seen in every part of our business. Our marketing team can better understand marketing spend and manage ad platforms for better ROI. Furthermore, we are able to provide detailed analytics on every flash sale event to Zulily’s merchants, enabling them to also make better decisions. All of this helps us deliver an exceptional daily experience to our millions of active customers.
Big data is only valuable if all the related data is accessible, in a single location, to everyone who needs to use it. The ability to correlate different types of datasets, for example clickstream, email, demand and tracking data, is critical for making good decisions. Focusing on just data growth is not enough; you have to think about growth in the number of users who will access it. Transitioning to the cloud has given us the ability to scale not just storage and compute; it has also enabled us to make complex business decisions quickly, with accurate data.