Friday, Aug 6, 2010

Hot Scalability Links for Aug 6, 2010

  • Twitter Sees Its 20 Billionth Tweet, writes Marshall Kirkpatrick of ReadWriteWeb.
  • Startups die for not having customers, so STOP thinking about how to scale. Alessandro Orsi says focusing on the architecture and scaling possibilities of your app for millions of users is just plain dumb...concentrate on marketing...concentrate on user experience. Alessandro is perfectly correct, but this isn't the year 2000, when the easy default architecture was also not scalable and sites were built from scratch, one painful user at a time. Today neither is true. In the era of social networks, where Facebook has 500 million users, successful applications can and often do spike to millions of users seemingly overnight. And you have to have some architecture. With today's tool-chains you don't have to choose easy and non-scalable. There are other options. Of course, it's all pointless without customers and that is what you need to worry about, but it's a false choice in this era to think that's all you have to worry about.
  • Node.js: JavaScript on the Server. Ryan Dahl talks about how to handle thousands of connections with server-side JavaScript. It seems a little strange to still be talking about this same kind of stuff--event loops, async vs sync, thread pools, processes vs threads, etc--after 20 years, but Ryan does a really good job framing the issues. In the end applications are about state machines, so those nasty abstractions arise somewhere. You can't hide behind event callbacks; it's never enough.
  • Tech Talks presented at the North American Faculty Summit. Includes: Storage Architecture and Challenges, Cloud Computing and Software Security, Engineering Private Spaces Online, Defeating the Password Anti-Pattern with Open Standards, Security at Scale, Anatomy of a Large-Scale Social Search Engine.
  • A Retrospective on SEDA. Matt Welsh takes a look back on his very influential paper on large scale distributed system architectures and what he would do differently. Achieving good, robust performance across a wide range of loads is the real challenge.
  • Database Scalability Patterns by Robert Treat. Awesome coverage of Vertical Scaling; Horizontal Partitioning; Horizontal Scaling; Read Slaves; Multi-Master; Vertical Partitioning; Federated Data Storage; Database Life-cycle; OLAP vs OLTP; application type; Cloud; tools.
  • MongoDB Schema Design. Alex Popescu collects a list of NoSQL data modeling sources. 
  • Google Wave and Network Effects. A most interesting discussion from Dare Obasanjo on how, in a social network world, using invite scarcity to grow a user base fails because users can't port their network over. Without your peeps who will you talk to? Strangers?
  • Beyond Locks and Messages: The Future of Concurrent Programming by Bartosz Milewski. Threads are out (demoted to latency controlling status), tasks (and semi-implicit parallelism) are in. Message passing is out (demoted to implementation detail), shared address space is in. Locks are out (demoted to low-level status), transactional memory is in.
  • Seattle Hadoop Day on August 14th. There's a killer line-up of speakers from Facebook, BackType, Amazon, and more. There are also several hours of intensive, hands-on training. And it's by and for the community.
  • The Pathologies of Big Data by Adam Jacobs. Scale up your datasets enough and all your apps will come undone. What are the typical problems and where do the bottlenecks generally surface?  

Reader Comments (1)

Hi Todd, first of all thanks for posting and commenting on my blog post.

Well, I agree with you (and I will post some thoughts on this in a new article) that

"With today's tool-chains you don't have to choose easy and non-scalable."

I think that's the point. For 99.5% of the startups that succeed, there's no need to be able to handle millions of users from day one. For a normal adoption curve, today's tool-chains are really good.

And if you have scarce resources (as in a poorly funded or unfunded startup), it's better to use them to acquire users or customers. That brings me to another important point.

Usually you have scalability issues (if you have them at all) when you have users rather than customers.

Facebook, Twitter, and all the social network web apps have users: people who do not pay to use their services.

People start using something more easily when they don't have to pay for it. It's very simple: free costs less than anything you have to pay for.

Then you have to figure out what to do with all those people (because you don't have a business model; users are not paying). At that point, if you are not VC funded, it's quite difficult to scale to millions (assuming, of course, that you manage to acquire millions of users: millions is a lot!).

With customers, if you make people pay to use your service because you are giving them something they find valuable (you are solving a problem for them or simplifying an old solution), the adoption curve is not so steep.

So, concentrate on customers, PR, and marketing, because if you are thinking of building a startup, you are quite probably already good enough to make the right technical choices by working the way you usually do.

P.S.
Flipboard, the iPad app the world is talking about, is having scalability issues, but it's still a free app. And I still think it's a nice problem to have. Now they are using 90% of their resources to solve it, but before that they concentrated heavily on marketing their product, PR, and spreading the word.

August 9, 2010 | Unregistered Commenter Alessandro Orsi
