Wednesday, October 7, 2015

Zappos's Website Frozen for Two Years as it Integrates with Amazon

Here's an interesting nugget from a wonderfully written article by Roger Hodge in the New Republic, "A radical experiment at Zappos to end the office workplace as we know it":

Zappos's customer-facing web site has been basically frozen for the last few years while the company migrates its backend systems to Amazon's platforms, a multiyear project known as Supercloud.

It's a testament to Zappos that they still sell well with a frozen website while most of the rest of the world has adopted a model of continuous deployment and constant evolution across multiple platforms.

Amazon is requiring the move; otherwise a company like Zappos would probably be sensitive to the Conway's law implications of such a deep integration. Keep in mind Facebook is reportedly keeping WhatsApp and Instagram independent. This stop-the-world plan must mean something, but I don't have the strategic insight to understand what. Any thoughts?

The article has more tantalizing details about what's going on with the move:

MEANWHILE, THE MIGRATION of Zappos's entire IT infrastructure to Amazon—which means figuring out a way to move an extraordinarily complex set of custom software programs that power a billion-dollar-a-year e-commerce site over to an entirely new environment—continues. The difficulty of this effort is almost unfathomable. Imagine taking a million square pegs and attempting to insert them into a million round holes, except you don't even know if all the holes exist, or where the holes might be located, and you have to negotiate access to the holes with dozens of different teams of hostile software engineers. The project has consumed Zappos's tech department for more than two years, and during that time, the Zappos site has been almost completely static. That means no improvements or innovations and only minimal bug fixes.


Barry Van Beek, the project's program manager, told me he thought Supercloud was the single largest e-commerce replatforming in history. “No one has ever attempted this before, on this scale.” Van Beek, who has been at Zappos for eight years, said Supercloud has about 20 different teams, with 250 to 350 people, including contractors, working with more than 100 different Amazon teams. When they began, they had no idea what they were getting into. It took months to figure out what was available on the Amazon side, and for well over a year, Van Beek wasn't even sure the migration was technically possible. But, he said, Supercloud is an Amazon-mandated goal, and they didn't have any choice. So they figured it out. “The level of effort,” he said, “was almost incomprehensible.”


Van Beek told me he is confident Supercloud can be completed, but he worries the disruptions caused by the offer, as well as unforeseeable delays on the Amazon side, will slow their progress. When Zappos does complete the migration, he hopes the tech department will be able to turn its attention to innovating in the e-commerce space, tackling challenges such as the size and fit problem. If customers are more likely to arrive at a good fit before they order, eliminating returns and the associated restocking and shipping costs will improve margins.

Reader Comments (9)

I agree the project is crazy huge, which is why I would never have suggested that it be done. I'd have recommended that they identify the key business processes that drove success (say, 50 or so) and figure out how to do them using existing Amazon infrastructure. Then have a big ugly cutover.

There is no way the lost sales would ever have cost what this project is costing.

One other thing strikes me ... this is a strong contra to the held belief that people *want* continuous change in their interfaces and in the functionality they deal with. I've never thought that true for consumers and know it is false for B2B software. If I were someone rich and in charge of a public B2C company I'd study this example carefully - all expense not taken falls to the bottom line.

-XC

October 7, 2015 | Unregistered Commenter Cliff elam

Such a project would occur only at a company with money to burn. If they're expecting a payback, I doubt the scales will balance this decade. This sounds like it was motivated more by Bezos' ego than by hard business needs.

October 7, 2015 | Unregistered Commenter Warren Spencer

I guess it's not surprising that Zappos would treat the conversion of their computing infrastructure in much the same way as Tony Hsieh is handling the conversion of the whole company to Holacracy, that is, as a big cutover. Though apparently Tony wasn't willing to "freeze the company" while they worked out how to do it, as they are with the web site.

I'd suggest that freezing the web site to accomplish such a conversion is largely a failure of imagination. Even if their existing system is a monolithic "one tier" application (as Amazon's once was), there's no technical reason they shouldn't be able to convert it piece-by-piece onto a platform in the cloud. I mean, how hard is it to throw an "if" statement in the existing code that sends some portion of some service calls over to a new system, and thus reasonably safely roll out a "converted" backend? The apparent drama evident in "Van Beek wasn't even sure the migration was technically possible" suggests that perhaps they picked the wrong guy. Not "technically possible"? Please. We're talking code running on computers here, not speed-of-light travel.

And to your point about Instagram, I recently attended a ~1 hour talk at Facebook where they gave an overview of the steps it took to move Instagram from AWS to Facebook's internal compute infrastructure, which is nowhere near as feature-rich or flexible as AWS. It clearly wasn't an easy task, and they managed to do it without freezing or even taking down Instagram.

October 7, 2015 | Unregistered Commenter Hans
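
For readers wondering what the "if" statement in the comment above might look like, here is a minimal sketch of percentage-based routing between an old and a new backend. Everything in it is hypothetical: `legacy_lookup`, `migrated_lookup`, `lookup_order`, and the customer-ID key are illustrative names, not anything from Zappos's or Amazon's systems. It shows only the routing decision; it says nothing about keeping data in sync between the two systems, which is the harder part.

```python
import hashlib


def rollout_bucket(key: str) -> int:
    """Map a stable key (for example a customer ID) to a bucket in [0, 100)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_new_backend(key: str, percent: int) -> bool:
    """Return True when this key falls inside the rollout percentage."""
    return rollout_bucket(key) < percent


# Hypothetical stand-ins for the real service calls.
def legacy_lookup(order_id: str) -> dict:
    return {"backend": "legacy", "order_id": order_id}


def migrated_lookup(order_id: str) -> dict:
    return {"backend": "migrated", "order_id": order_id}


def lookup_order(customer_id: str, order_id: str, percent: int) -> dict:
    """Send a fixed share of customers to the migrated backend."""
    if use_new_backend(customer_id, percent):
        try:
            return migrated_lookup(order_id)
        except Exception:
            # Assumption: the legacy path is still authoritative,
            # so falling back to it is safe.
            return legacy_lookup(order_id)
    return legacy_lookup(order_id)
```

Hashing the key, rather than picking randomly per request, keeps each customer on the same backend from one call to the next, so raising `percent` from 1 to 10 to 100 moves whole customers over gradually.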

Actually, Instagram has migrated to FB's infra http://engineering.instagram.com/posts/1086762781352542/migrating-from-aws-to-fb/

October 7, 2015 | Unregistered Commenter Streeter

"It's a testament to Zappos that they still sell well with a frozen website while most of the rest of the world has adopted a model of continuous deployment and constant evolution across multiple platforms."

I know people don't dare to draw another conclusion: that most development organizations do not add to the bottom line and should be challenged.

October 8, 2015 | Unregistered Commenter Stephan Schmidt

Does anyone know why Amazon is requiring this change or what Amazon is requiring to be changed?

October 8, 2015 | Unregistered Commenter Jason Pirkey

I'd love to know WHY they are moving to Amazon!

October 9, 2015 | Unregistered Commenter Keru

Zappos lost me as a customer over the past couple of years. Their web site is no longer competitive in terms of locating products, reviews, order history, etc. (Go try it: search is completely broken, for example.) They probably lost a large number of customers but did not realize it, since this metric is difficult to track: customers like me just silently quit using their services because better alternatives exist.

October 9, 2015 | Unregistered Commenter FormerCustomer

They do track customer metrics, trust me, I was once a developer there.

This move isn't overly surprising, but it is sad, and is probably the reason they had a big migration of developer talent over the last several years. I was part of the team that updated the site from Perl to Java. The only thing we didn't really touch was the underlying database. We ended up with a slimmer application that utilized fewer resources as a result.

As for the comment about "adding an 'if' statement", well, that doesn't cover all of it. The move from Perl to Java took some planning and coordination. The biggest hurdle there is moving/mirroring the data. If you are doing tens of thousands of orders a day, making sure you get that data correct during the move isn't trivial. Then you have the support software (which at the time was all written in-house) that must now work with both architectures. Oh yeah, then you have the warehouse processing, which communicates with the inventory and order management data and must also work across those two systems.

It would be quite a challenge.

October 12, 2015 | Unregistered Commenter robert
