Thursday
Dec 13, 2007
Is premature scalation a real disease?

Update 3: InfoQ's Big Architecture Up Front - A Case of Premature Scalaculation? twines several different threads on the topic together into a fine noose.
Update 2: Kevin says the biggest problem he sees with startups is that they need to scale their backend (no, the other one).
Update: My bad. It's hard to sell scalability so just forget it.
The premise of "Startups and The Problem Of Premature Scalaculation" and "Don’t scale: 99.999% uptime is for Wal-Mart" is that you shouldn't spend precious limited resources worrying about scaling before you've first implemented the functionality that will make you successful enough to have scaling problems in the first place. It's kind of an embodied life force model of system creation. Energy is scarce, so any parasites siphoning off energy must be hunted down and destroyed so the body has its best chance of survival. Is this really how it works?
If I ever believed this I certainly don't believe it anymore. The world has changed, even since 2005.
Thanks to many books and papers on how to scale, the knowledge of scaling isn't the scarce, precious resource it once was. It's no longer knowledge tightly held by a cabal of experts until Nicolas Cage flies in and pries it out of their grasping desiccated fingers. Now any journeyman computerista can do a reasonable job at designing a scalable system.
Not only has knowledge dissemination improved, but so have our tools. Drastically. At one time building a scalable system up front would have required buying and configuring a truckload of servers, building out a data center, configuring a spider's web of networks, and bootstrapping an equally nasty storage network. All extremely complicated and disaster-prone. Now you can use services like Amazon's EC2/S3, 3tera's grid OS, and Joyent to cut significant parts of all that complexity out of the system.
While most of us toil away in anonymity and scaling problems are just a fond dream, when the webosphere does find you it does so with a crush. With a little thinking ahead Blue Origin was able to handle 3.5 million requests and 758 GB of bandwidth in a single day using S3. Did that effort prevent other features from getting implemented? I seriously doubt it. Usually doing the right thing isn't harder if you know what the right thing to do is.
And what if Blue Origin hadn't been able to scale? Could they have recovered from the lost opportunity to strike while the iron is hot and potential customers are interested? Ask Friendster.
What do you think? Has most of the risk associated with up-front scalability design been squeezed out? Is premature scalation still something to be avoided? Or have times changed, and does doing the simplest thing that could possibly work now include worrying about scaling up front?
Reader Comments (10)
I think it's an essential component, businesses need to plan for scalability if they hope to grow. However, it can't come at the cost of delivering today. It's like financial planning, insurance, or a pension fund. It's essential when you need it, but seemingly irrelevant beforehand.
I say plan to scale, plan for success, but don't spend all your money / time worrying about it.
Cheers - Callum (http://www.callum-macdonald.com/)
I agree that premature scalation is a real disease. I see it every day in and out of where I work. Everyone worries so much about scaling before they even begin, they often make poor design decisions before they even know what part of their app will have problems first.
It is certainly important to properly design your application so it can be scaled when necessary, but this doesn't mean doing things like creating de-normalized tables, caching everything you can think of or not using certain libraries because they are 'slow'. It means properly segmenting each part of your application (data, logic, presentation, etc) so each can be modified, updated, expanded when needed.
I think we hear so much, every day, about how company X or company Y had trouble scaling that we build up a paranoia. Most of us would be lucky to build or deal with an application that has more than 10,000 users, let alone 10 million.
I think we all need to take a step back and only worry about what will be problems in the near future rather than a hypothetical one.
I think this is a general planning and design problem. How do you really know how many visitors your site will attract?
Overplanning for scalability is not the real disease. Overengineering is.
The really, really difficult thing here is to design and implement only what you need, but in a way that lets you change things as things evolve.
For example, a well-designed web application should be able to run on a single, very simple server with one database, a set of parallel servers with one database, parallel servers with shared/replicated/shared databases, or whatever is needed. This is really, really hard, and one of the reasons good developers, architects and systems managers are so well paid. That said, one of the basic choices that has worked for us, and others, from what I can gather from reading this site, is to do things as simply as possible. It's incredibly hard to change a complex system, but giving a simple system complex additions is much easier. But you have to be aware of the choices made, and have plans for what to do if things explode. In other words, you don't have to scale up front, but you need to know what to do if the need arises.
Great points. My post was mostly about the enterprise but could apply to web-based plays as well. Scale does matter, just not as the lead-in for why your product is better than your competitor's. It is hard to see, touch, or feel scale in a sales meeting or even when a user visits a site, unless you are terribly slow. Sometimes engineers can spend too much time on having the fastest engine and not enough time on designing a beautiful body. Scale matters but not as your main selling point.
Todd, I think that is a great question, but it probably needs qualification. I think it depends on the application and the audience, i.e. what are the user expectations? Is it a paid site? Can you afford any performance hits or downtime? Also, how much time do you have prior to coding that you can devote to design? If you're on a tight schedule, then you're probably best off going with what you know and testing early and often to detect any glaring bottlenecks. However, if you have the luxury of more time for design, then it absolutely makes sense to iterate through different architectures that allow for rapid scaling and deployment, especially if you’re marketing to a potentially large user base with a low tolerance for downtime. With so many options now for scaling that are dependent on outside vendors, you certainly don’t want to pick the wrong one and end up with a bad case of vendor lock-in or major site outages when your compute cloud goes dark during a “routine” software upgrade. More options mean more choices that need to be considered up front before development begins.
Great site BTW!
This is spot on: "I think it's an essential component, businesses need to plan for scalability if they hope to grow. However, it can't come at the cost of delivering today." Designing for scalability is important, yes, but how many times have we seen developers and systems people try to be TOO SMART and things get way complex way fast? I still think that KISS should prevail, and, I'll be honest, do think that Good Enough is often better than Just Almost Perfect.
--
Dustin Puryear
Author, Best Practices for Managing Linux and UNIX Servers
http://www.puryear-it.com/pubs/linux-unix-best-practices
Budgeting for future scalability today makes sense, but don't spend all your budget in the process. Prepare for disasters and the unexpected.
It's only a disease if you are not yielding more out of it for what you have implemented.
-----
sea plants (http://underwaterseaplants.awardspace.com)
sea grapes (http://underwaterseaplants.awardspace.com/seagrapes.htm)...plant roots (http://underwaterseaplants.awardspace.com/plantroots.htm)
If you have proper planning, premature scalation will not be a disease. I cannot agree with your statement, buddy. If there is proper planning, good timing, and great analysis, there will be no more issues. Am I right?
Cheers,
Ben
Junior golf camp (http://www.golfcamp.com)
Absolutely, scalability is a very important component of a business if it is to grow.
IT has to be planned well in advance while keeping current business operations intact; this is where financial planning, insurance, sustainability, etc. come into the picture.
I say plan to scale, plan for success, but don't spend all your money / time worrying about it.