High Scalability -

Entries in facebook (29)

Facebook's Aditya giving presentation on Facebook Architecture

Friday, April 10, 2009 at 3:32PM

Facebook's engineering director Aditya talks about Facebook's architecture: how they use MySQL, PHP, and memcache, and how they have modified each to suit their requirements.

itsfrosty

MySQL,

PHP,

facebook,

memcache

Facebook is Hiring

Tuesday, December 16, 2008 at 4:31AM

I thought with the job situation these days that people might be interested in some open jobs at Facebook. Here's what's available:

Facebook is hiring! We are looking for a Systems Engineer/Architect and Site Reliability Engineer. I have attached the job descriptions below. If you are interested, please contact Michelle Bostock mbostock-at-facebook.com. Thanks and Happy Holidays!

Systems Architect
Palo Alto, CA

Description
Facebook is seeking a seasoned Systems Architect to join the Operations team. The position is full-time and is based in our main office in downtown Palo Alto and will report to the Manager of Systems Operations.

Responsibilities
* Analyze application flow and infrastructure design to improve performance and scalability of the site
* Collaborate on design of services infrastructure from servers to networking
* Monitor, analyze, and make recommendations as appropriate to improve site stability and availability
* Evaluate hardware and software technologies to improve site efficiency and performance
* Troubleshoot and solve issues with hardware, applications, and network components
* Lead team efforts from design to implementation, prioritize tasks and resources while interacting with Engineering and Operations
* Document current and future configuration processes and policies
* Participate in 24x7 on-call support

Requirements
* B.S. in Computer Science or equivalent experience
* 4+ years of experience in Operations with large web farms
* Extensive knowledge of web architecture and technologies, including Linux, Apache, MySQL, PHP, TCP/IP, security, HTTP, LDAP and MTAs
* Strong background/interest in application and infrastructure design
* Scripting and programming skills
* Excellent verbal and written communication skills

Site Reliability Engineer
Palo Alto, CA

Description
Facebook is seeking talented operations engineers to join the Site Reliability Engineering team. The ideal candidate will have strong communication skills, a passion for tinkering with Linux, and an almost insane fondness for fast-paced, seat-of-your-pants troubleshooting and crisis management. The position is full-time and is based in our main office in downtown Palo Alto. This position reports to the Manager of Site Reliability Engineering.

Responsibilities
* Monitor the stability and performance of the website
* Remotely troubleshoot and diagnose hardware problems
* Debug issues with Linux software, applications and network
* Resolve technical challenges encountered in LAMP technologies
* Develop and maintain monitoring tools and automation systems
* Predict and respond to utilization variances across multiple datacenters
* Identify and triage all outage related events
* Facilitate communication, coordinate escalation, and work with subject matter experts to implement critical fixes
* Automate and streamline processes
* Track issues and run reports

Requirements
* 2-3 years+ Linux support/sys admin experience in an Internet operations environment
* BA/BS in Computer Science or a related field, or equivalent experience
* Working knowledge of Linux, Cisco, TCP/IP, Apache and MySQL
* Experience working with network management systems and monitoring tools, such as Nagios, Ganglia and Cacti
* Competency in Shell, PHP, Perl or Python. C is a plus
* Solid understanding of web services architecture and commonly employed technologies
* A sense of urgency in responding to and resolving critical issues that relate to the performance of the site and/or core infrastructure
* Excellent verbal and written communication skills
* Participation in a shifted coverage schedule, including working nights and on-call rotations

Todd Hoff

facebook,

jobs

Strategy: Facebook Tweaks to Handle 6 Times as Many Memcached Requests

Saturday, December 13, 2008 at 5:19AM

Our latest strategy is taken from a great post by Paul Saab of Facebook, detailing how with changes Facebook has made to memcached they have:

...been able to scale memcached to handle 200,000 UDP requests per second with an average latency of 173 microseconds. The total throughput achieved is 300,000 UDP requests/s, but the latency at that request rate is too high to be useful in our system. This is an amazing increase from 50,000 UDP requests/s using the stock version of Linux and memcached.

To scale, Facebook has hundreds of thousands of TCP connections open to their memcached processes. First, this is still amazing. It's not so long ago you could never have done this. Optimizing connection use was always a priority because the OS simply couldn't handle large numbers of connections or large numbers of threads or large numbers of CPUs. To get to this point is a big accomplishment. Still, at that scale there are problems that must be solved.

Some of the problems Facebook faced and fixed:

Per connection consumption of resources. What works well at low number of inputs can totally kill a system as inputs grow. Memcached uses a per-connection buffer which adds up to a lot of memory that could be used to store data. Nothing wrong with this design choice, but Facebook made changes to use a per-thread shared connection buffer and reclaimed gigabytes of RAM on each server.

Kernel lock contention. Facebook discovered that under load there was lock contention when transmitting through a single UDP socket from multiple threads. Sockets are data structures too, and they are subject to the usual lock contention issues. Facebook got around this issue by giving each thread its own reply socket, so transmitting threads no longer contend for a single socket's lock. They found another bottleneck in Linux’s “netdevice” layer, which sits between IP and the device drivers. They changed the dequeue algorithm to batch dequeues so more work was done when they had the CPU.

Application lock contention. Nothing brings out lock issues like moving to more cores. Facebook found that when they moved to 8 core machines, a global lock protecting stats collection consumed 20-30% of CPU. In applications that require little processing per request, as memcached does, this is not unexpected, but doing real work with your CPU is a better idea. So they collected stats on a per thread basis and then calculated a global view on demand.

Interrupt floods and starvation. With so much traffic directed at a single server the hardware can flood the CPU(s) with interrupts and keep the CPU from doing "real" work. To get around this problem Facebook implements some complicated strategies to load balance IO across all the cores. As I am less clever I might try more network cards with a TCP Offload engine.
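The fix for application lock contention (per-thread stats, global view on demand) can be sketched in a few lines. This is a minimal Python illustration of the general technique, not Facebook's memcached code (which is C); the class name PerThreadStats and the counter name get_hits are made up for the example.

```python
import threading


class PerThreadStats:
    """Counters kept per thread; a global view is computed only on demand."""

    def __init__(self):
        self._local = threading.local()
        self._all_counters = []                 # one dict per thread that recorded stats
        self._registry_lock = threading.Lock()  # taken once per thread, not per request

    def _counters(self):
        counters = getattr(self._local, "counters", None)
        if counters is None:
            counters = {}
            self._local.counters = counters
            with self._registry_lock:
                self._all_counters.append(counters)
        return counters

    def incr(self, name, amount=1):
        # Hot path: touches only this thread's dict, so no shared lock is taken.
        counters = self._counters()
        counters[name] = counters.get(name, 0) + amount

    def snapshot(self):
        # Cold path: sum the per-thread dicts. The result may be slightly
        # stale if worker threads are still running.
        with self._registry_lock:
            per_thread = [dict(c) for c in self._all_counters]
        total = {}
        for counters in per_thread:
            for name, value in counters.items():
                total[name] = total.get(name, 0) + value
        return total


def run_demo(num_threads=4, requests_per_thread=1000):
    """Simulate worker threads that each count their own requests."""
    stats = PerThreadStats()

    def worker():
        for _ in range(requests_per_thread):
            stats.incr("get_hits")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats.snapshot()
```

The cost moves from every request (a global lock) to the rare stats query (a walk over the per-thread dicts), which is the trade the article describes.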

When you read Paul's article keep in mind the incredible number of man hours that went into profiling the system, not just their application, but the entire software and hardware stack. Then add in the research, planning, and trying different solutions to see if anything changed for the better. It's a lot of work. Notice using a nifty new parallel language or moving to a cloud wouldn't have made a bit of difference. It's complete mastery of their system that made the difference.

A summary of potential strategies:

Profile everything. Problems are always specific. The understanding of the problem must be specific. The fix must be specific.

Burn profiling into your regression tests. Detect when and where performance tanks as a regular part of your build.

Use resources in proportion to what grows slowest. This requires multiplexing, but at least your resource usage is more predictable and bounded.

Batch work. When you have the CPU, do all the work you possibly can in the quantum, or the whole system grinds to a halt in processing overhead.

Do work and maintain resources per task. Otherwise locking for shared resources takes more and more time when there's less and less time to do the work that needs to be done.

Change algorithms. Sometimes you simply need to do things differently. Tweaking will only get you so far.

You can find their changes on github, the hub that says "git."

Todd Hoff

facebook

Product: Scribe - Facebook's Scalable Logging System

Monday, November 24, 2008 at 12:25AM

In Log Everything All the Time I advocate applications shouldn't bother logging at all. Why waste all that time and code? No, wait, that's not right. I preach logging everything all the time. Doh. Facebook obviously feels similarly, which is why they open sourced Scribe, their internal logging system, capable of logging tens of billions of messages per day. These messages include access logs, performance statistics, actions that went to News Feed, and many others.

Imagine hundreds of thousands of machines across many geographically dispersed datacenters just aching to send their precious log payload to the central repository of all knowledge. Because really, when you combine all the meta data with all the events you pretty much have a complete picture of your operations. Once in the central repository logs can be scanned, indexed, summarized, aggregated, refactored, diced, data cubed, and mined for every scrap of potentially useful information.

Just imagine the log stream from all of Facebook's Apache servers alone. Brutal. My guess is these are not real-time feeds so there are no streaming query issues, but the task is still daunting. Let's say they log 10 billion messages a day. That's over 115,000 messages per second!
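As a quick check on that rate, here is the arithmetic for the 10 billion messages a day assumed above (an average; real traffic would have peaks well above it):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_second(messages_per_day):
    """Average rate if the daily volume were spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


rate = messages_per_second(10_000_000_000)
print(f"{rate:,.0f} messages per second")  # prints "115,741 messages per second"
```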

When no off the shelf products worked for them they built their own. Scribe can be downloaded from Sourceforge. But the real action is on their wiki. It's here you'll find some decent documentation and their support forums. Not much activity on the site so you haven't missed your chance to be a charter member of the Scribe guild.

A logging system has three broad components:

Client Code Interface - How does your code interact with the log system? Scribe doesn't do much for you here. There's a simple Thrift interface for logging from a large set of languages, but the bulk of the work is still up to you.

Distribution System - This is where Scribe fits. It reliably (mostly) moves large numbers of messages around. A few error cases lead to data loss: 1) If a client can't connect to either the local or central scribe server the message will be lost; 2) If a scribe server crashes it could lose a small amount of data that's in memory but not on disk; 3) Some multiple component failure cases, such as a resender can't connect to any central server and its local disk fills up; 4) Some rare timeout conditions can lead to duplicate messages.

Do Something Usefullizer - How do you do anything useful with over 115,000 messages per second? Good question. Scribe doesn't help here. But Scribe will get your data there.

I browsed around the source and it's a well crafted, straightforward socket server that forwards messages to other servers and can write messages to disk. Nothing fancy, which is probably why it works for them. Its basic function is:

Scribe is a server for aggregating streaming log data. It is designed to scale to a very large number of nodes and be robust to network and node failures. There is a scribe server running on every node in the system, configured to aggregate messages and send them to a central scribe server (or servers) in larger groups. If the central scribe server isn't available the local scribe server writes the messages to a file on local disk and sends them when the central server recovers. The central scribe server(s) can write the messages to the files that are their final destination, typically on an nfs filer or a distributed file system, or send them to another layer of scribe servers.

In some ways it could be fancier. For example, there's no throttle on incoming connections so a server can chew up memory. And there is a max_msg_per_second throttle on message processing, but this is really too simple. Throttling needs to be adaptive based on local conditions and the conditions of downstream servers. Under load you want to push flow control back to the client so the data stays there until resources become available. Simple configuration file settings rarely work when the world starts getting weird.
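To make the flow control idea concrete, here is a small Python sketch of a server-side buffer that refuses a batch when it is full, so the data stays with the client until there is room. This is a hypothetical illustration, not Scribe's implementation; BoundedLogBuffer, max_pending, and drain are invented names, and only the OK and TRY_LATER result codes mirror Scribe's Thrift enum.

```python
from collections import deque
from enum import Enum


class ResultCode(Enum):
    OK = 0
    TRY_LATER = 1


class BoundedLogBuffer:
    """Accepts batches until max_pending is reached, then pushes back."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = deque()

    def log(self, messages):
        # Refuse the whole batch if it would not fit; the client keeps the
        # data and retries later instead of the server chewing up memory.
        if len(self.pending) + len(messages) > self.max_pending:
            return ResultCode.TRY_LATER
        self.pending.extend(messages)
        return ResultCode.OK

    def drain(self, downstream_capacity):
        """Forward at most downstream_capacity messages and return them."""
        sent = []
        while self.pending and len(sent) < downstream_capacity:
            sent.append(self.pending.popleft())
        return sent
```

A more adaptive version would shrink max_pending or downstream_capacity based on observed downstream latency, rather than reading a fixed number from a configuration file.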

Client Code Interface

Here's what the Thrift interface looks like:

enum ResultCode
{
  OK,
  TRY_LATER
}

struct LogEntry
{
  1: string category,
  2: string message
}

service scribe extends fb303.FacebookService
{
  ResultCode Log(1: list<LogEntry> messages);
}

I know, I thought the same thing. Thank God there's another IDL syntax. We simply did not have enough of them. Thrift translates this IDL into the glue code necessary for making cross-language calls (marshalling arguments and responses over the wire). The Thrift library also has templates for servers and clients.

Here's what a call looks like in PHP:

$messages = array();
$entry = new LogEntry;
$entry->category = "buckettest";
$entry->message = "something very interesting happened";
$messages []= $entry;
$result = $conn->Log($messages);

Pretty simple. Usually in C++, for example, there's an elaborate set of macros for logging that provide sophisticated control of log generation. It might look something like:

MSG(msg) - a simple message. It only prints out msg. None of the other information is printed out.
NOTE(const char* name, const char* reason, const char* what, Module* module, msg) - something to take note of.
WARN(const char* name, const char* reason, const char* what, Module* module, msg) - a warning.
ERR(const char* name, const char* reason, const char* what, Module* module, msg) - an error occurred.
CRIT(const char* name, const char* reason, const char* what, Module* module, msg) - a critical error occurred.
EMERG(const char* name, const char* reason, const char* what, Module* module, msg) - an emergency occurred.

There's lots more to handle streams and behind the scenes things like time stamps, thread ids, function names, and line numbers. Scribe has wisely not done any of that. It has an RPC-like interface to send a list of messages and that's it. It's up to you to write the wrappers.
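Such a wrapper does not need to be elaborate. The Python sketch below is one hypothetical shape for it: it stamps each message with a time, level, and thread id, and batches entries before handing them to a send function. CategoryLogger, its parameters, and the line format are assumptions for illustration; in real use, send would be a call such as the Thrift client's Log method shown above.

```python
import threading
import time


class LogEntry:
    """Mirrors the two-string struct in the Thrift interface."""

    def __init__(self, category, message):
        self.category = category
        self.message = message


class CategoryLogger:
    """Adds metadata to each message and sends entries in batches."""

    def __init__(self, category, send, batch_size=100, clock=time.time):
        self.category = category
        self._send = send            # callable taking a list of LogEntry
        self._batch_size = batch_size
        self._clock = clock          # injectable so tests can fix the time
        self._batch = []

    def _emit(self, level, text):
        line = f"{self._clock():.3f} {level} thread={threading.get_ident()} {text}"
        self._batch.append(LogEntry(self.category, line))
        if len(self._batch) >= self._batch_size:
            self.flush()

    def warn(self, text):
        self._emit("WARN", text)

    def err(self, text):
        self._emit("ERR", text)

    def flush(self):
        # Swap the batch out before sending so a failing send cannot
        # leave half-sent entries in the buffer.
        if self._batch:
            batch, self._batch = self._batch, []
            self._send(batch)
```

This sketch is not thread-safe; a production wrapper would keep one batch per thread or guard the batch with a lock.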

You'll no doubt have noticed Scribe only logs a category and message, both strings:

Scribe is unique in that clients log entries consisting of two strings, a category and a message. The category is a high level description of the intended destination of the message and can have a specific configuration in the scribe server, which allows data stores to be moved by changing the scribe configuration instead of client code. The server also allows for configurations based on category prefix, and a default configuration that can insert the category name in the file path. Flexibility and extensibility is provided through the "store" abstraction. Stores are loaded dynamically based on a configuration file, and can be changed at runtime without stopping the server. Stores are implemented as a class hierarchy, and stores can contain other stores. This allows a user to chain features together in different orders and combinations by changing only the configuration.

Distribution System

The payload has whatever structure you give it. Scribe is policy neutral and doesn't push a logging model on you.

The configuration file looks something like this:


# BUCKETIZER TEST
<store>
  category=buckettest
  type=buffer
  target_write_size=20480
  max_write_interval=1
  buffer_send_rate=2
  retry_interval=30
  retry_interval_range=10
  <primary>
    type=bucket
    num_buckets=6
    bucket_subdir=bucket
    bucket_type=key_hash
    delimiter=1
    <bucket>
      type=file
      fs_type=std
      file_path=/tmp/scribetest
      base_filename=buckettest
      max_size=1000000
      rotate_period=hourly
      rotate_hour=0
      rotate_minute=30
      write_meta=yes
    </bucket>
  </primary>
  <secondary>
    type=file
    fs_type=std
    file_path=/tmp
    base_filename=buckettest
    max_size=30000
  </secondary>
</store>

The types of stores currently available are:

file - writes to a file, either local or nfs.

network - sends messages to another scribe server.

buffer - contains a primary and a secondary store. Messages are sent to the primary store if possible, and otherwise the secondary. When the primary store becomes available the messages are read from the secondary store and sent to the primary.

bucket - contains a large number of other stores, and decides which messages to send to which stores based on a hash.

null - discards all messages.

thriftfile - similar to a file store but writes messages into a Thrift TFileTransport file.

multi - a store that forwards messages to multiple stores.

Certainly a flexible and useful set of logging capabilities. You can build a hierarchy of log servers to do pretty much anything you want. You could imagine having a log server on each machine with a file store to handle upstream server failures. This log server forwards messages on to a centralized server for a datacenter. And all the datacenter servers forward their logs on to the centralized data warehouse. To scale, adjust fan-in and fan-out as necessary.
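The buffer store's behavior (write to the primary if possible, fall back to the secondary, replay when the primary returns) is easy to model. The Python sketch below is a simplified illustration under stated assumptions, not Scribe's C++ store hierarchy: ListStore is a made-up in-memory stand-in for a file or network store, and the secondary is assumed to always be writable.

```python
class ListStore:
    """Stand-in for a file or network store; `available` simulates outages."""

    def __init__(self):
        self.available = True
        self.messages = []

    def handle(self, messages):
        if not self.available:
            return False
        self.messages.extend(messages)
        return True


class BufferStore:
    """Sends to the primary when possible, otherwise buffers in the secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary  # assumed always writable, e.g. local disk

    def replay(self):
        """Move buffered messages to the primary; True if none remain buffered."""
        backlog = self.secondary.messages
        if backlog and self.primary.handle(list(backlog)):
            backlog.clear()
        return not backlog

    def handle(self, messages):
        # Drain the backlog first so older messages reach the primary first.
        if self.replay() and self.primary.handle(messages):
            return True
        return self.secondary.handle(messages)
```

Because stores share one handle interface, a BufferStore could itself be the primary or secondary of another store, which is the composition the store list above describes.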

Do Something Usefullizer

You may not have over 115,000 log messages a second to process, but you are likely to have your own tanker truck full of log messages. How do you do something useful with them?

Log messages stored in log files are next to useless. Grep'ing on a terabyte of logs to answer simple questions about your data just doesn't work.

You may have a sharded data warehouse you can pump log messages into and do a reasonably effective job of querying.

Or you can set up a Hadoop/HDFS style system. The idea here is you need a distributed file system to handle the continual stream of log messages. And once you have all the data stored safely away you'll need to use map-reduce to do anything with such a large amount of data.

If you want to ask, for example, how many of your users are from Asia, log files won't work. It's likely your data warehouse can't handle it. Hadoop/HDFS is a practical option.

If that's the direction you are going, what does it imply about your log system? I would say it makes even the simple category-payload system of Scribe overkill. The idea with a scalable backend is to move log payloads from applications to the centralized store as quickly as possible. By definition the central store can handle the load, so there's no reason to use intermediate servers to scale. From an application write directly to the central store, even from multiple datacenters. The payload structure is unimportant until it hits the central store. If the application can't hit the central store then it queues into the file system until it can. Ideally log messages never hit the file system until HDFS is writing them to their final destination. This makes for low latency, high throughput logging and is even simpler than Scribe.

If you don't have a scalable central store then Scribe is a good option. It gives you all the flexibility you need to compose your logging system in a way that is mostly reliable and scalable.

Todd Hoff

operations

Some Facebook Secrets to Better Operations

Wednesday, September 3, 2008 at 1:27AM

Kim Nash in an interview with Jonathan Heiliger, Facebook VP of technical operations, provides some juicy details on how Facebook handles operations. Operations is one of those departments everyone runs differently as it is usually an ontogeny recapitulates phylogeny situation. With 2,000 databases, 25 terabytes of cache, 90 million active users, and 10,000 servers you know Facebook has some serious operational issues. What are some of Facebook's secrets to better operations?

Frequent Releases. A major release once a week and minor releases every few days.

Create a Cyber Liability Group. At one time operations was distributed amongst several groups. A permanent operations group was created to isolate problems and revert problem software components back to previously known good states. The ability of a separate team to handle rollbacks speaks to a great deal of standardization and advanced tool building.

Distribute Team Across Time Zones. Split the operations team across different time zones so no one has to work the graveyard shift. Facebook has 20 people in their team located in Palo Alto, California and London, England.

Be Innovative, Not Safe. Fear of failure often shuts down the organizational brain and makes it hide behind excessive rules and regulations. A technology company should have a bias towards action and innovation. Release software. Don't stifle genius. Rely on your tools and processes to recover from problems.

Expect Problems. Software pushed to production will have problems. Expect problems, but don't let that stop you from innovating.

Roll Backward. When a problem is detected in a release the changes can either be rolled forward or backward. Rolling back is going to a previously good release. Rolling forward is fixing problems in the new release rather than rolling back. Bugs in production are fixed in production. Roll forward ends up being covered in the press, so prefer roll backs over roll forwards.

Roll Out Massive Changes Slowly. Turn on features gradually, for a few percent of users at a time. Use the slow rollout to fix problems that can only be found under real user conditions. This approach gives operations and development a lot of confidence in changes.

Encourage Openness and Information Sharing. Design reviews, PR strategy, which servers to buy, etc. are often open for informal debate among employees. Facebook has created an Ideas system where employees can create an Idea by category. There's a discussion tool for discussing the idea and a rating system for rating the idea. Tools are built on top of the Facebook platform so they are available to everyone.

Live-blog Key Events. Large company meetings, monthly presentations and weekly Q&As with the management team are transcribed live.
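The interview doesn't say how Facebook selects the "few percent of users" for a slow rollout. One common approach, sketched here in Python purely as an assumption, is to hash the user id together with the feature name into a stable bucket from 0 to 99 and enable the feature for buckets below the current percentage. The function name in_rollout and the feature name "chat" are illustrative.

```python
import hashlib


def in_rollout(user_id, feature, percent):
    """Return True if this user falls inside the first `percent` buckets.

    Hashing feature and user id together gives each feature its own
    stable, roughly uniform assignment of users to buckets 0..99.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


# Raising the percentage only ever adds users; nobody who already has the
# feature loses it, because each user's bucket never changes.
enabled_at_5 = {u for u in range(1000) if in_rollout(u, "chat", 5)}
enabled_at_20 = {u for u in range(1000) if in_rollout(u, "chat", 20)}
```

Rolling back is then a configuration change (set the percentage to 0) rather than a code push, which fits the preference for roll backs described above.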

It sounds like a relatively fun environment for pushing software live. Getting software moved into production is often harder than the original coding and testing. Now I know what you are thinking. You somehow managed to procure the ssh login. So just login remotely and do the install yourself! Nobody will know. Oh so tempting. But it's not really good corporate citizenship. And you just might screw up, then there will be some esplaining to do.

Emphasizing frequent releases and gutsy release policies makes it actually seem like someone is supporting developers instead of treating them like their software carries the plague. Data centers are often treated like quarantine stations and developers are treated like asymptomatic carriers of some unknown virulent disease. To be safe nothing should ever change, but that's not an attitude that makes things better. Nice to see that recognized.

To set up or not to set up a separate operations group? Facebook says "to be" and creates a separate group. Amazon says "not to be" and has developers support their own software. Secretly I think Amazon gets better results by requiring developers to support their own software. Knowing it may be you getting the "It's Down!" call gives one proper perspective. But I like not being on call and I think most developers agree. Plus "following the sun" to get 24 hour support is a smart idea.

HighScalability Operations

On Designing and Deploying Internet-Scale Services

Amazon Architecture

Todd Hoff

facebook,

operations

Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App

Tuesday, July 22, 2008 at 12:48AM

Several months ago I attended a Joyent presentation where the spokesman hinted that Joyent had the chops to support a one billion page per month Facebook Ruby on Rails application. Even under a few seconds of merciless grilling he would not give up the name of the application. Now we have the big reveal: it was LinkedIn's Bumper Sticker app. For those not currently sticking things on bumps, Bumper Sticker is quite surprisingly a viral media sharing application that allows users to express their individuality by sticking small virtual stickers on Facebook profiles. At the time I was quite curious how Joyent's cloud approach could be leveraged for this kind of app. Now that they've released a few details, we get to find out.

Site: http://www.Facebook.com/apps/application.php?id=2427603417

Information Sources

Video: Scaling to 1 Billion Page Views Per Month (very flashy)

Web Scalability Practices: Bumper Sticker on Rails by Ikai Lan and Jim Meyer from LinkedIn

1 Billion Page Views a Month by David Young from Joyent

Ruby on Rails: scaling to 1 billion page views per month by Dennis Howlett from ZDNet

Joyent's Grid Accelerators for Web Applications by Jason Hoffman from Joyent

On Grids, the Ambitions of Amazon and Joyent by Jason Hoffman from Joyent

Scaling Ruby on Rails to 1 Billion Page Views a Month by Joe Pruitt from DevCentral

The Platform

MySQL

Nginx

Mongrel

CDN

Ruby on Rails (rapid prototype development approach)

Facebook

Joyent Accelerator - provides a highly scalable on-demand infrastructure for running web sites, including rich web applications written in Ruby on Rails, PHP, Python and Java. Joyent Accelerators are next-generation virtual computers that can grow and multiply (or shrink and consolidate) depending on the real world demands faced by your Web application. Accelerators are built on OpenSolaris, multi-core (8+), RAM-rich servers (32GB+ each) and vast amounts of NAS storage.

Masochism Plugin - provides an easy solution for Ruby on Rails applications to work in a replicated database environment. Connection proxy sends some database queries (those in a transaction, update statements, and ActiveRecord::Base#reload) to a master database, and the rest to the slave database.
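The routing rule the Masochism plugin applies can be modeled in a few lines. The sketch below is in Python and is only an illustration of the idea; the real plugin is Ruby code that hooks into ActiveRecord, and the class ConnectionProxy, its methods, and the READ_PREFIXES list are assumptions made for the example.

```python
READ_PREFIXES = ("select", "show")


class ConnectionProxy:
    """Routes writes and transactional queries to the master, reads to the slave."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave
        self._in_transaction = False

    def begin(self):
        self._in_transaction = True

    def commit(self):
        self._in_transaction = False

    def target_for(self, sql):
        is_read = sql.lstrip().lower().startswith(READ_PREFIXES)
        # Anything inside a transaction goes to the master so that reads
        # see the writes made earlier in the same transaction.
        if self._in_transaction or not is_read:
            return self.master
        return self.slave
```

Replication lag is the usual catch: a read sent to the slave right after a write may not see it yet, which is why a forced reload goes to the master.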

The Stats

1 billion page views per month

13.5 million installations

1.5 million daily active users. Recruited 1 million users in first 46 days.

20-27 million canvas page views a day

13 web application servers running Nginx and Mongrel

8 static asset servers serving over 3,500,000 stickers (migrating to a CDN)

4 MySQL servers in a master/slave configuration using Masochism as a proxy to load balance database operations.

Cost is about $25K/month.

The Architecture

Bumper Sticker was an experiment to see how fast the Light Engineering Development (LED) team at LinkedIn could build a Ruby on Rails Facebook application.

RoR was an easy environment to prototype in, but they needed a production environment in which they could quickly develop, deploy, and scale. Joyent was selected.

Some Notes on Joyent:
* Joyent is a scale on demand cloud. Allows customers to have a dynamic data center instead of being stuck using their own rigid infrastructure.
* There's an API if you need one. The service is unmanaged, you get root on all your boxes.
* They consider their infrastructure to be better and more open than Amazon. You get access to a high end load balancer and the capabilities of OpenSolaris (Dtrace, Zones, lower request processing overhead, sub 10 second reboot times).
* Joyent's primary scalability principle is to organize apps around silos built from their powerful Accelerator blocks: put applications on different servers based on the quality of service you want to give them. For example, put static content on their own servers so the static content is always served fast and reliably. This allows you to prioritize based on what's important to you. You could, for example, prioritize the virality of your application by putting the Invite Friends functionality on their own servers, thus assuring the growth of your application through your viral functionality possibly at the expense of less important functionality.
* Has three data centers in the US and are opening a fourth, none in Europe.
* Considers their secret sauce to be their highly sophisticated administration system which allows a few people to easily manage a large infrastructure.
* Has a peering relationship with Facebook. That means there are direct high-speed fiber links between Joyent’s data center in Emeryville and Facebook’s data center in San Francisco.

80% of the content for Bumper Sticker is static. The Facebook API can directly render content at a specified memory location. Bumper Sticker was able to use the scripting feature of F5 BIG-IP load balancer to directly load static content by passing a pointer to the Facebook API.

The Lessons

Rails scales exactly like any other app. Take into account all the components from the moment the request is received at the load balancer all the way down and all the way back again.

The development process is: put some measurements in place, find problems, fix problems, more people adopt and scale you out of your solution, and the cycle repeats. Sun's Dtrace feature makes it easy to instrument the stack to identify bottlenecks.

Rails scales as long as the development team using it understands that many of the bottlenecks are exactly those faced by developers on any other database-driven web platform.

Hit a disk spindle and you are screwed. Avoid going to the database or the file system. The more they avoided disk the fewer timeouts they experienced.

Convert anything dynamic into static content. Dynamic content is your enemy. Convert anything dynamic into static content so it can be removed from the disk path.

Push content to the edge. Move content as close to the client as possible. Move cache to the CDN. Reduce time going across the network.

Faster means more viral. On a viral system the better the performance the more people can play with your application. The more people who play with your system the more likely they are to pull more people in, which means the more the app will spread and go viral. Bumper Sticker has been successful at creating a community of fans who enjoy uploading and sharing their own stickers.

Some issues:

Since most of the content is static and served by the load balancer, the impact of Rails in the system is not clear.

The functionality of Bumper Sticker is relatively simple. What would the impact be on scalability if other often requested features like search were added?

Friends for Sale Architecture - A 300 Million Page View/Month Facebook RoR App

Todd Hoff

CDN,

MySQL,

Nginx,

RoR,

facebook

New Facebook Chat Feature Scales to 70 Million Users Using Erlang

Wednesday, May 14, 2008 at 5:09PM

Update: Erlang at Facebook by Eugene Letuchy. How Facebook uses Erlang to implement Chat, AIM Presence, and Chat Jabber support.

I've done some XMPP development so when I read Facebook was making a Jabber chat client I was really curious how they would make it work. While core XMPP is straightforward, a number of protocol extensions like discovery, forms, chat states, pubsub, multi user chat, and privacy lists really up the implementation complexity. Some real engineering challenges were involved to make this puppy scale and perform. It's not clear what extensions they've implemented, but a blog entry by Facebook's Eugene Letuchy hits some of the architectural challenges they faced and how they overcame them.

A web based Jabber client poses a few problems because XMPP, like most IM protocols, is an asynchronous event driven system that pretty much assumes you have a full time open connection. After logging in the server sends a client roster information and presence information. Your client has to be present to receive the information. If your client wants to discover the capabilities of another client then a request is sent over the wire and some time later the response comes back. An ID is used to map the reply to the request. All responses are intermingled. IM messages can come in at any time. Subscription requests can come in at any time.

Facebook has the client open a persistent connection to the IM server and uses long polling to send requests and continually get data from the server. Long polling is a mixture of client pull and server push. It works by having the client make a request to the server. The client connection blocks until the server has data to return. When it does, the data is returned, the client processes it, and the client is then in a position to make another request of the server and get any more data that has queued up in the meantime. Obviously there are all sorts of latency, overhead, and resource issues with this approach. The previous link discusses them in more detail and for performance information take a look at Performance Testing of Data Delivery Techniques for AJAX Applications by Engin Bozdag, Ali Mesbah and Arie van Deursen.
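The long polling cycle can be sketched in a few lines of Python. This is a toy, single-process illustration under my own assumptions (an in-memory queue per client, a 30 second default timeout); the `LongPollChannel` name is hypothetical and nothing here is taken from Facebook's implementation.

```python
import queue
import threading


class LongPollChannel:
    """Per-client mailbox: the server queues events, the client polls for them."""

    def __init__(self):
        self._events = queue.Queue()

    def push(self, event):
        # Server side: called whenever an IM, presence change, etc. arrives.
        self._events.put(event)

    def poll(self, timeout=30.0):
        # Client request: block until an event exists or the timeout expires.
        try:
            first = self._events.get(timeout=timeout)
        except queue.Empty:
            return []  # empty response; the client simply polls again
        batch = [first]
        # Drain everything else that queued up in the meantime.
        while True:
            try:
                batch.append(self._events.get_nowait())
            except queue.Empty:
                return batch


channel = LongPollChannel()
# Simulate a message reaching the server shortly after the client starts waiting.
threading.Timer(0.1, channel.push, args=["hello from a friend"]).start()
print(channel.poll(timeout=2.0))
```

Note that everything pushed between two polls sits in server memory until the next poll arrives, which is the storage cost discussed further down.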

From a client perspective I think this approach is workable, but obviously not ideal. Your client's IMs, presence changes, subscription requests, and chat states etc are all blocked on the polling loop, which wouldn't have a predictable latency. Predictable latency can be as important as raw performance.

The real scaling challenge is on the server side. With 70 million people how do you keep all those persistent connections open? Well, when you read another $100 million was invested in Facebook for hardware you know why. That's one hella lot of connections. And consider all the data those IM servers must store up in between polling intervals. Looking at the memory consumption for their servers would be like watching someone breathe. Breathe in: streams of data come in and must be stored waiting for the polling loop. Breathe out: the polling loops hit and all the data is written to the client and released from the server. A ceaseless cycle. In a stream based system data comes in and is pushed immediately out the connection. Only the socket queue is used and that's usually quite sufficient. Now add network bandwidth for all the XMPP and TCP protocol overhead and CPU to process it all and you are talking some serious scalability issues.

So, how do you handle all those concurrent connections? They chose Erlang. When you first hear Erlang and Jabber you think ejabberd, an open source Erlang based XMPP server. But since the blog doesn't mention ejabberd it seems they haven't used it.

Why Erlang? First, the famous Yaws vs Apache shootout where "Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections." Erlang is naturally good at solving high concurrency problems. Yet following the rule that no benchmark can go unchallenged, Erik Onnen calls this the Worst Measurement Ever and has some good reasoning behind it.

In any case, Erlang does nicely match the problem space. Erlang's approach to a concurrency problem is to throw a very light weight Erlang process at each state machine you want to be concurrent. Code-wise that's more natural than thread pools, async IO, or thread per connection systems. Until Linux 2.6 it wasn't even possible to schedule large numbers of threads on a single machine. And you are still devoting a lot of unnecessary stack space to each thread. Erlang will make excellent use of machine resources to handle all those connections. Something anyone with a VPS knows is hard to do with Apache. Apache sucks up memory with joyous VPS killing abandon.

The blog says C++ is used to log IM messages. Erlang is famously excellent for its concurrency prowess and equally famous for being poor at IO, so I imagine C++ was needed for efficiency.

One of the downsides of multi-language development is reusing code across languages. Facebook created Thrift to tie together the Babeling Tower of all their different implementation languages. Thrift is a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby. Another approach might be to cross language barriers using REST based services.

A problem Facebook probably doesn't have to worry about scaling is the XMPP roster (contact list). Handling that many user accounts would challenge most XMPP server vendors, but Facebook has that part already solved. They could concentrate on scaling the protocol across a bunch of shiny new servers without getting bogged down in database issues. Wouldn't that be nice :-) They can just load balance users across servers and scalability is solved horizontally, simply by adding more servers. Nice work.
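One simple way to "load balance users across servers" is to hash each user ID to a server. The sketch below is only an assumption about how that could look, not a description of Facebook's approach; the server names are made up.

```python
import hashlib


def server_for_user(user_id, servers):
    """Map a user to one chat server with a hash that is stable across processes."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


chat_servers = ["chat1", "chat2", "chat3"]  # hypothetical hostnames
print(server_for_user(7019261521, chat_servers))
```

The catch with plain modulo hashing is that adding a server remaps most users; schemes such as consistent hashing exist to limit that reshuffling.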

Todd Hoff |

13 Comments |

Permalink |

erlang,

xmpp

Friday

May022008

Friends for Sale Architecture - A 300 Million Page View/Month Facebook RoR App

Friday, May 2, 2008 at 1:33AM

Update: Jake in Does Django really scale better than Rails? thinks apps like FFS shouldn't need so much hardware to scale.

In a short three months Friends for Sale (think Hot-or-Not with a market economy) grew to become a top 10 Facebook application handling 200 gorgeous requests per second and a stunning 300 million page views a month. They did all this using Ruby on Rails, two part time developers, a cluster of a dozen machines, and a fairly standard architecture. How did Friends for Sale scale to sell all those beautiful people? And how much do you think your friends are worth on the open market?

Site: http://www.facebook.com/apps/application.php?id=7019261521

Information Sources

Siqi Chen and Alexander Le, co-creators of Friends for Sale, answering my standard questionnaire.

Virality on Facebook

The Platform

Ruby on Rails

CentOS 5 (64 bit)

Capistrano - update and restart application servers.

Memcached

MySQL

Nginx

Starling - distributed queue server

Softlayer - hosting service

Pingdom - for website monitoring

LVM - logical volume manager

Dr. Nic's Magic Multi-Connections Gem - split database reads and writes to servers

The Stats

10th most popular application on Facebook.

Nearly 600,000 active users.

Half a million unique visitors a day and growing fast.

300 million page views a month.

300% monthly growth rate, but that is plateauing.

2.1 million unique visitors in the past month

200 requests per second.

5TB of bandwidth per month.

2 part time (now full time), and 1 remote DBA contractor.

4 DB servers, 6 application servers, 1 staging server, and 1 front end server.
- 6, 4 core 8 GB application servers.
- Each application server runs 16 mongrels for a total of 96 mongrels.
- 4 GB memcache instance on each application server
- 2 32GB 4 core servers with 4x 15K SCSI RAID 10 disks in a master-slave setup

Getting to Know You

What is your system for?

Our system is designed for our Facebook application, Friends for Sale.
It's basically Hot-or-Not with a market economy. At the time of this
writing it's the 10th most popular application on Facebook.

Their Facebook description reads: Buy and sell your friends as pets! You can make your pets poke, send gifts, or just show off for you.
Make money as a shrewd pets investor or as a hot commodity! Friends for Sale is the bees knees!

Why did you decide to build this system?

We designed this as more of an experiment to see if we understood virality concepts and metrics on Facebook. I guess we do. =)

What particular design/architecture/implementation challenges does your system have?

As a Facebook application, every request is dynamic so no page caching is possible. Also, it is a very interactive, write heavy application so scaling the database was a challenge.

What did you do to meet these challenges?

We memcached extensively early on - every page reload results in 0 SQL calls. We use Rails' fragment caching with custom expiration logic mostly.
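The pattern described here, cached fragments that are expired by application logic rather than by a timer, can be sketched as follows. This is Python rather than Rails, and the dictionary standing in for memcached, the `FragmentCache` class, and the key names are all illustrative assumptions, not the team's actual code.

```python
class FragmentCache:
    """Cache rendered fragments forever; the application expires them on writes."""

    def __init__(self):
        self._store = {}  # stand-in for a memcached client
        self.misses = 0

    def fetch(self, key, render):
        # On a miss, render once (this is where the SQL would happen) and keep it.
        if key not in self._store:
            self.misses += 1
            self._store[key] = render()
        return self._store[key]

    def expire(self, key):
        # Custom expiration: called by whatever code changes the underlying data.
        self._store.pop(key, None)


def render_profile():
    return "<div>Alice, value $500</div>"  # pretend this needed several queries


cache = FragmentCache()
cache.fetch("profile:42", render_profile)  # miss: renders and stores
cache.fetch("profile:42", render_profile)  # hit: no rendering, no SQL
cache.expire("profile:42")                 # e.g. after someone buys Alice
```

With this shape a page reload touches the database only for fragments that a write has just invalidated.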

How big is your system?

We had more than half a million unique visitors yesterday and growing fast. We're on track to do more than 300 million page views this month.

What is your in/out bandwidth usage?

We used around 3 terabytes of bandwidth last month. This month should be at least 5TB or so. This number is just for a few icons and XHTML/CSS.

How many documents do you serve? How many images? How much data?

We don't really have unique documents ... we do have around 10 million user profiles though.

The only images we store are a few static image icons.

How fast are you growing?

We went from around 3M page views per day a month ago to more than 10M page views a day. A month before that we were doing 1M page views per day. So that's around a 300% monthly growth rate but that is plateauing. On a request per second basis, we get around 200 requests per second.
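As a rough sanity check on these figures, here is a small Python calculation. The 30 day month and the assumption that traffic is spread evenly over the day are mine, not from the interview.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily page views two months ago, one month ago, and now (figures from the answer).
daily_page_views = [1_000_000, 3_000_000, 10_000_000]

# Month-over-month multipliers: each month is roughly 3x the one before.
multipliers = [
    later / earlier
    for earlier, later in zip(daily_page_views, daily_page_views[1:])
]

# Average page views per second if traffic were perfectly even (it never is).
average_per_second = daily_page_views[-1] / SECONDS_PER_DAY

# Assuming a 30 day month, this matches the quoted 300 million page views a month.
monthly_page_views = daily_page_views[-1] * 30

print(multipliers, round(average_per_second), monthly_page_views)
```

The even-traffic average comes out well below the quoted 200 requests per second, which is plausible if that number reflects busier periods or counts more than page views.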

What is your ratio of free to paying users?

It's all free.

What is your user churn?

It's around 1% per day, with a growth rate of 3% or so per day in terms of installed users.

How many accounts have been active in the past month?

We had roughly 2.1 million unique visitors in the past month according to Google.

What is the architecture of your system?

It's a relatively standard Rails cluster. We have a dedicated front end proxy balancer / static web server running nginx, which proxies directly to 6, 4 core 8 GB application servers. Each application server runs 16 mongrels for a total of 96 mongrels. The front end load balancer proxies directly to the mongrel ports. In addition, we run a 4 GB memcache instance on each application server, along with a local starling distributed queue server and misc background processes.
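The fan-out from one front end to 96 mongrels can be pictured with a short Python sketch. The hostnames, the base port, and the plain round-robin policy are assumptions for illustration; the interview does not give the actual nginx configuration.

```python
import itertools

APP_SERVERS = [f"app{n}" for n in range(1, 7)]  # 6 hypothetical hostnames
MONGRELS_PER_SERVER = 16
BASE_PORT = 8000                                # hypothetical first mongrel port

# One (host, port) upstream per mongrel process: 6 * 16 = 96 in total.
upstreams = [
    (host, BASE_PORT + offset)
    for host in APP_SERVERS
    for offset in range(MONGRELS_PER_SERVER)
]

# Round robin: hand each incoming request to the next mongrel in the list.
next_upstream = itertools.cycle(upstreams)

print(len(upstreams))
```

Because the application layer is shared-nothing, adding capacity here amounts to appending more entries to the upstream list.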

We use god to monitor our processes.

On the DB layer, we have 2 32GB 4 core servers with 4x 15K SCSI RAID 10 disks in a master-slave setup. We use Dr. Nic's magic multi-connections gem in production to split reads and writes between the two boxes.
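The read/write split can be sketched like this. It is a Python illustration of the general idea, not the magic multi-connections gem's API; the `ReadWriteRouter` class and the rule of routing by the first SQL keyword are my own simplifications.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "REPLACE")

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def connection_for(self, sql):
        # Look only at the leading keyword of the statement.
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.WRITE_VERBS or self._slaves is None:
            return self.master
        return next(self._slaves)


router = ReadWriteRouter("master-db", ["slave-db"])
print(router.connection_for("SELECT * FROM pets WHERE owner_id = 1"))
print(router.connection_for("UPDATE pets SET price = 500 WHERE id = 7"))
```

One thing a real setup has to handle that this sketch ignores is replication lag: a read issued right after a write may need to go to the master to see its own change.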

We are adding more slaves right now so we can distribute the read load better and have better redundancy and backup policies. We also get help from Percona (the mysqlperformanceblog guys) for remote DBA work.

We're hosted on Softlayer - they're a fantastic host. The only problem was that their hardware load balancing server doesn't really work very well ... we had lots of problems with hanging connections and latency. Switching to a dedicated box running just nginx fixed everything.

How is your system architected to scale?

It really isn't. On the application layer we are shared-nothing so it's pretty trivial. On the database side we're still with a monolithic master and we're trying to push off sharding for as long as we can. We're still vertically scaled on the database side and I think we can get away with it for quite some time.

What do you do that is unique and different that people could best learn from?

The three things that are unique are -

1. Neither of the two developers involved had previous experience in large scale Rails deployment.
2. Our growth trajectory is relatively rare in the history of Rails deployments
3. We had very little opportunity for static page caching - each request does hit the full Rails stack

What lessons have you learned? Why have you succeeded? What do you wish you would have done differently? What wouldn't you change?

We learned that a good host, good hardware, and a good DBA are very important. We used to be hosted on Railsmachine, which to be fair is an excellent shared hosting company and they did go out of their way to support us. In the end though, we were barely responsive for a good month due to hardware problems, and it only took two hours to get up and running on Softlayer without a hitch. Choose a good host if you plan on scaling, because migrating isn't fun.

The most important thing we learned is that your scalability problem is pretty much always, always, always the database. Check it first, and if you don't find anything, check again. Then check again. Without exception, every performance problem we had can be traced to the database server, the database configuration, the query, or the use and non-use of indices.

We definitely should have gotten on to a better host earlier in the game so we would have been up.

We definitely wouldn't change our choice of framework - Rails was invaluable for rapid application development, and I think we've pretty much proven that two guys without a lot of scaling experience can scale a Rails app up. The whole 'but does Rails scale?' discussion sounds like a bunch of masturbation - the point is moot.

How is your team setup?

We have two Rails developers, inclusive of me. We very recently retained the services of a remote DBA for help on the database end.

How many people do you have?

On the technical side, 2 part time (now full time), and 1 remote DBA contractor.

Where are they located?

The full time employees are also located in the SOMA area of San Francisco.

Who performs what roles?

The two developers serve as co-founders. I (Siqi) was responsible for front end design and development early on, but since I had some experience with deployment I also ended up handling network operations and deployment as well. My co-founder Alex is responsible for the bulk of the Rails code - basically all the application logic is from him. Now I find myself doing more deep back end network operations tasks like MySQL optimization and replication - it's hard to find time to get back to the front end which is what I love. But it's been a real fun learning experience so I've been eating up all I can from this.

Do you have a particular management philosophy?

Yes - basically find the smartest people you can, give them the best deal possible, and get out of their way. The best managers GET OUT OF THE WAY, so I try to run the company as much as I can with that in mind. I think I usually fail at it.

If you have a distributed team how do you make that work?

We'd have to have some really good communication tools in the cloud - somebody would have to be a Basecamp nazi. I think remote work / outsourcing is really difficult - I prefer to stay away from it for core development. For something like MySQL DBA or even sysadmin - it might make more sense.

What do you use?

We use Rails with a bunch of plugins, most notably cache-fu from Chris Wanstrath and magic multi connections from Dr. Nic. I use VIM as the editor with the rails.vim plugin.

Which languages do you use to develop your system?

Ruby / Rails

How many servers do you have?

We now have 12 servers in the cluster.

How are they allocated?

4 DB servers, 6 application servers, 1 staging server, and 1 front end server.

How are they provisioned?

We order them from Softlayer - there's a less than 4 hour turn around for most boxes, which is awesome.

What operating systems do you use?

CentOS 5 (64 bit)

Which web server do you use?

nginx

Which database do you use?

MySQL 5.1

Do you use a reverse proxy?

We just use nginx's built in proxy balancer.

How is your system deployed in data centers?

We use a dedicated hosting service, Softlayer.

What is your storage strategy?

We use NAS for backups but internal SCSI drives for our production boxes.

How much capacity do you have?

Across all of our boxes we probably have around ... 5 TB of storage or
thereabouts.

How do you grow capacity?

Ad-hoc. We haven't done a proper capacity planning study, to our detriment.

Do you use a storage service?

Nope.

Do you use storage virtualization?

Nope.

How do you handle session management?

Right now we just persist it to the database - it would be fairly easy to use memcache directly for this purpose though.

How is your database architected? Master/slave? Shard? Other?

Master/slave right now. We're moving towards a Master/Multi-slave with a read only load balancing proxy to the slave cluster.

How do you handle load balancing?

We do it in software via nginx.

Which web framework/AJAX Library do you use?

Rails.

Which real-time messaging frameworks do you use?

None.

Which distributed job management system do you use?

Starling

How do you handle ad serving?

We run network ads. We also weight our various ad networks by eCPM on our application layer.
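The interview does not say how the weighting works, so the following is only one plausible reading: pick a network for each impression with probability proportional to its recent eCPM. The network names and eCPM values are invented for the example.

```python
import random


def pick_ad_network(ecpm_by_network, rng=random):
    """Choose a network with probability proportional to its eCPM."""
    networks = list(ecpm_by_network)
    weights = [ecpm_by_network[name] for name in networks]
    return rng.choices(networks, weights=weights, k=1)[0]


# Hypothetical eCPM figures in dollars per thousand impressions.
recent_ecpm = {"network_a": 0.60, "network_b": 0.30, "network_c": 0.10}
print(pick_ad_network(recent_ecpm))
```

Weighting instead of always picking the top network keeps some traffic flowing to the others, so their eCPM figures stay fresh enough to compare.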

Do you have a standard API to your website?

Nope.

How many people are in your team?

2 developers.

What skill sets does your team possess?

Me: Front end design, development, limited Rails. Obviously, recently proficient in MySQL optimization and large scale Rails deployment.
Alex: application logic development, front end design, general software engineering.

What is your development environment?

Alex develops on OSX while I develop on Ubuntu. We use SVN for version control. I use VIM for editing and Alex uses TextMate.

What is your development process?

On the logic layer, it's very test driven - we test extensively. On the application layer, it's all about quick iterations and testing.

What is your object and content caching strategy?

We cache both in memcache with no TTL, and we just manually expire.

What is your client side caching strategy?

None.

How do you manage your system?

How do you check global availability and simulate end-user performance?

We use Pingdom for external website monitoring - they're really good.

How do you health check your server and networks?

Right now we're just relying on our external monitoring and Softlayer's ping monitoring. We're investigating FiveRuns for monitoring as a possible solution to server monitoring.

How do you graph network and server statistics and trends?

We don't.

How do you test your system?

We deploy to staging and run some sanity tests, then we do a deploy to all application servers.

How do you analyze performance?

We trace back every SQL query in development to make sure we're not doing any unnecessary calls or model instantiations. Other than that, we haven't done any real benchmarking.

How do you handle security?

Carefully.

How do you decide what features to add/keep?

User feedback and critical thinking. We are big believers in simplicity so we are pretty careful to consider before we add any major features.

How do you implement web analytics?

We use a home grown metrics tracking system for virality optimization,
and we also use Google Analytics.

Do you do A/B testing?

Yes, from time to time we will tweak aspects of our design to optimize for virality.

How is your data center setup?

Which firewall product do you use?

Which DNS service do you use?

Which routers do you use?

Which switches do you use?

Which email system do you use?

How do you handle spam?

How do you handle virus checking of email and uploads?

Don't know to all of the above.

How do you backup and restore your system?

We use LVM to do incrementals on a weekly and daily basis.

How are software and hardware upgrades rolled out?

Right now they are done manually, except for new Rails application deployments. We use capistrano to update and restart our application servers.

How do you handle major changes in database schemas on upgrades?

We usually migrate on a slave first and then just switch masters.

What is your fault tolerance and business continuity plan?

Not very good.

Do you have a separate operations team managing your website?

Oh we wish.

Do you use a content delivery network? If so, which one and what for?

Nope

What is your revenue model?

CPM - more page views more money. We also have incentivized direct offers through our virtual currency.

How do you market your product?

Word of mouth - the social graph. We just leverage viral design tactics to grow.

Do you use any particularly cool technologies or algorithms?

I think Ruby is pretty particularly cool. But no, not really - we're not doing rocket science, we're just trying to get people laid.

Do you store images in your database?

No, that wouldn't be very smart.

How much up front design should you do?

Hm. I'd say none if you haven't scaled up anything before, and a lot if you have. It's hard to know what's actually going to be the problem until you've actually been through it and seen what real load problems look like. Once you've done that, then you have enough domain knowledge to do some actual meaningful up front design on your next go around.

Has anything surprised you either for the good or bad?

How unreliable vendor hardware can be, and how different support can be from host to host. The number one most important thing you will need is a scaled up dedicated host who can support your needs. We use Softlayer and we can't recommend them highly enough.

On the other hand, it's surprising how far just a master-multislave setup can take you on commodity hardware. You can easily do a Billion page views per month on this setup.

How does your system evolve to meet new scaling challenges?

It doesn't really, we just fix bottlenecks as they come and as we see them coming.

Who do you admire?

Brad Fitzpatrick for inventing memcache, and anyone who has successfully horizontally scaled anything.

How are you thinking of changing your architecture in the future?

We will have to start sharding by users soon as we hit database size and write limits.
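Sharding by users was still a plan at the time of the interview, so the following Python sketch is purely hypothetical. It shows one common shape for it: map each user to a fixed logical shard, then map ranges of logical shards to physical databases so data can be moved later without rehashing users. The shard count and database names are made up.

```python
LOGICAL_SHARDS = 1024  # hypothetical; fixed once, never changed

# Ranges of logical shards (inclusive) assigned to physical databases.
SHARD_MAP = {
    (0, 511): "db1",
    (512, 1023): "db2",
}


def logical_shard(user_id):
    """Every row belonging to a user lives in that user's logical shard."""
    return user_id % LOGICAL_SHARDS


def database_for(user_id, shard_map=SHARD_MAP):
    shard = logical_shard(user_id)
    for (low, high), database in shard_map.items():
        if low <= shard <= high:
            return database
    raise LookupError(f"no database owns logical shard {shard}")


print(database_for(7), database_for(600))
```

Splitting a hot database then means editing the map (and copying the affected shards), while the user to logical shard rule stays untouched.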

Their Thoughts on Facebook Virality

Facebook models the social graph in digital form as accurately and completely as possible.

Social graph is more important than features.

Facebook enables rapid social distribution of new applications through the social graph.

Your application idea should be: social, engaging, and universal.

The social aspect makes it viral.

Engaging makes it monetizable.

Universal gives it potential.

Friends for Sale is social because you are buying and selling your social graph.

It's engaging because it's a twist on an idea, low pressure, flirty, and a bit cynical.

It's universal because everyone is vain, has a price, and wants to flirt with hot people.

Every touch point in the application is a potential for recruiting new users.

Every user converts 1.4 other users which is the basis for exponential growth.

For every new user track the number of invites, notifications, minifeed items, profile clicks, and other channels.

For every channel track the percent clicked, converted, uninstalls.
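The growth claim and the per-channel tracking above can be made concrete with a small Python sketch. The generational model (each cohort of new users recruits 1.4 times its own size) and the funnel definitions are simplifying assumptions of mine; the function names and sample numbers are illustrative only.

```python
def project_new_users(seed_users, viral_coefficient, generations):
    """New users per generation when each user converts `viral_coefficient` others."""
    cohort = float(seed_users)
    cohorts = [cohort]
    for _ in range(generations):
        cohort *= viral_coefficient
        cohorts.append(cohort)
    return cohorts


def channel_rates(sent, clicked, converted, uninstalled):
    """Funnel percentages for one channel (invites, notifications, minifeed...)."""
    return {
        "click_rate": clicked / sent if sent else 0.0,
        "conversion_rate": converted / clicked if clicked else 0.0,
        "uninstall_rate": uninstalled / converted if converted else 0.0,
    }


# With a coefficient above 1 each generation is larger than the last.
print([round(n) for n in project_new_users(100, 1.4, 5)])

# Made-up numbers for a single channel.
print(channel_rates(sent=1000, clicked=200, converted=50, uninstalled=5))
```

A coefficient below 1 makes the same series shrink toward zero, which is why the per-channel rates are worth tracking and tuning.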

Lessons Learned

Scaling from the start is a requirement on Facebook. They went to 1 million pages/day in 4 weeks.

Ruby on Rails can scale.

Anything scales on the right architecture. Focus on architecture and operations.

You need a good DBA, good host, and good well configured hardware.

With caching and the heavy duty servers available today, you can go a long time without adopting more complicated database architectures.

The social graph is real. It's truly staggering the number of accessible users on Facebook with the right well implemented viral application.

Most performance problems are in the database. Look to the database server, the database configuration, the query, or the use and non-use of indexes.

People still use Vi!

I'd really like to thank Siqi for taking the time to answer all my questions and provide this fascinating look into their system. It's amazing what you've done in so little time. Excellent job and thanks again.

Todd Hoff |

52 Comments |

Permalink |

CentOS,

MySQL,

RoR,

nginx

Tuesday

Oct232007

Hire Facebook, Ning, and Salesforce to Scale for You

Tuesday, October 23, 2007 at 9:35AM

One of the premier scaling strategies is always: get someone else to do the work for you. But unlike Tom Sawyer, you won't have to trick anyone into whitewashing a fence for you. Times have changed. Companies like Ning, Facebook, and Salesforce are more than happy to help. Their price: lock-in.

Previously you had few options when building a "real" website. You needed to do everything yourself. Infrastructure and application were all yours. Then companies stepped in by commoditizing parts of the infrastructure, but the application was still yours. The next step is full on Borg take no prisoners assimilation where the infrastructure and application are built as one collective. What you have to decide as someone faced with building a scalable website is if these new options are worth the price.

Feeding this explosion of choice is one of the new strategy games on the intertubes: the Internet Platform Game. Ning's Marc Andreessen defines a platform as: "a system that can be programmed and therefore customized by outside developers -- users -- and in that way, adapted to countless needs and niches that the platform's original developers could not have possibly contemplated, much less had time to accommodate."

The idea is you'll win great rewards in exchange for coding to someone else's internet platform. From Ning you'll win a featureful and customizable social networking platform that they are completely responsible for scaling. The cost ranges from free to very reasonable. From Facebook you'll win prime space on the profile page of over 40 million virally infected customers. It's free, but you must make your application scalable enough to handle all those millions. By coding to the Salesforce platform you'll win the same infrastructure that executes 100 million Salesforce transactions a day. The cost of their service is unknown at this time.

The Three Levels of Internet Platforms

Mr. Andreessen then went a step further and defined a three level platform categorization scheme:

Level 1: Access API. A platform provided in the form of a REST/SOAP web services API. Examples: eBay, Paypal, Flickr, Digg. Your application lives outside the service and their API is your only access point to the system. Scalability is completely up to you. You are basically building a mashup from distributed parts in your own data center.

Level 2: Plug-In API. A platform provided in the form of a system for embedding your application inside another application. Examples: Facebook, Eclipse, Firefox. You still use an API, but the user sees an integrated application because your application is using their screen real estate, log in, user accounts and so on. For internet plug-ins scalability is still up to you. The millions of Facebook users running your application must run completely on your servers.

Level 3: Deep hosting. A platform provided in the form of an API, Plug-in, and fully hosted runtime environment. Examples: Ning, Salesforce, and Second Life. Your application is completely integrated with a host application framework and runs completely on the host servers. They are responsible for adding machines, maintenance, and management. You are free to just write your application.

Amazon is on his original list, but I don't put it there. If Amazon exposed their Dynamo service I would, but since with EC2 you are stuck worrying about database storage they really don't belong here.

Like the typical depiction of human ascent from amoeba to weapon wielding, art appreciating primate, the levels are meant to indicate progress. In reality, though, evolution isn't about progress at all. It's all about survival through adaptation to local ecological niches. And that's how I look at the levels. At each level you gain something and you lose something. You need to select your niche by looking at your talents and needs.

Why Use an Access API?

Using open APIs to access services is what has made the internet great. APIs provide the most flexibility at the greatest cost. You get access to a huge number of wonderful services for virtually nothing. The linkage between websites is a relatively simple API and a data definition. You can do anything you want, but you have to build the infrastructure to do it. That's a lot better than building your own map service, your own SMS service, or your own photo sharing service. Yet there's still so much work to do. Grid services make the job easier, but the level of expertise it takes to create a scalable site is still very high.

Why Use a Plug-In?

Since Facebook is the only internet company in this category the answer is clear why you want to be a Facebook plug-in: to get access to a lot of users, connected by an exploitable social graph, for the purpose of exponentially propagating your application along the graph. Most would be ecstatic to get to hundreds of thousands of regular users on their own standalone site. With Facebook that's very possible.

The reward is great, but the costs are great too. Your application must be something that can be deconstructed onto Facebook. I don't see gmail making it as a Facebook app. You must subject yourself to a lot of restrictions to use the Facebook infrastructure. You must trust yourself to a poorly documented system in which it is hard to get anything done. And to top it off: Facebook does not host your application. This really blew me away when I first heard about it. When someone says they are offering a platform my immediate assumption is they are hosting your application. That's what a platform is, isn't it? But your application must run on your own hardware.

Imagine going from 0 to millions of users in the space of a few days. How would you handle that? Well that's exactly the problem iLike (a popular music sharing site) had when they released their Facebook app. Mr. Andreessen gives a wonderful if somewhat self-serving account of iLike's troubles with viral growth. After launching they posted this on their blog:

"In our first 20 hours of opening doors we had 50,000 users sign up, and it is only accelerating. (10,000 users joined in the first 12 hrs. 10,000 more users in the next 3 hrs. 30,000 more users in the next 5 hrs!!) We started the system not knowing what to expect, with only 2 servers, but ready with backup. Facebook's rabid userbase chewed up our 2 servers almost instantly. We doubled our capacity to catch up. And then we doubled it again. And again. And again. Oh crap - we ran out of servers!! Although iLike.com has a very healthy level of Web traffic, and even though about half of all the servers in our datacenter were sitting unused, idle, as backup capacity, we are now completely maxed out. We just emailed everybody we know across over a dozen Bay Area startups, corporations, and venture firms in a desperate plea to find spare servers so we can triple our capacity for the continued onslaught. Tomorrow we are picking up over 100 servers from different companies to have them installed just to handle the weekend's traffic. (For those who responded to our late night pleas, thank you!)"

iLike says they now have over 3 million Facebook users and are growing at an astonishing rate of 300,000 users per day. That number of users and growth rate will make almost anyone salivate. Yet how many can afford the hundreds and hundreds of servers it would take to handle all those users, especially if you have an unclear monetization strategy? Which brings us to Deep Hosting and Mr. Andreessen's end game for the internet's evolution.

Why Use Deep Hosting?

The trouble with handling application growth under Facebook's large user base has an obvious solution: host your application on their infrastructure. This is exactly what Mr. Andreessen has done with Ning. Out of the Ning box you get an exceptionally functional social networking package. So functional in fact it makes almost anyone think "do I really need to reinvent all this stuff when they've already done it? Can't I just tweak a few things and make it my own?" And that's exactly what Ning wants to hear. They've made it so you can completely rebrand their software, add your own features using normal programming tools, yet still host your application on their platform, on their servers, in their datacenter. So you don't have to worry about scaling. It's Ning's job to scale the database, back it up, manage the infrastructure, add servers, and do all the other nasty bits that keep so many people away from deploying successful websites.

So the temptation is clear. Go with Ning and you immediately get a cool system that will scale and that you can still program if you feel the need. But with all that power comes a price, as usual. You are locked inside a gilded cage. If your application slows down there's not much you can do about it. I found their documentation better than Facebook's, but not very useful for someone looking to get going quickly and that makes me very nervous when adopting a platform. Yet when they add features, as they frequently do, your app gets them for free. You see some of the same effects here that all Google apps get when the Google stack is improved. And not having to worry about scalability is very attractive, especially at such a reasonable cost.

Problems with Deep Hosting

Mr. Andreessen thinks that "in the long run, all credible large-scale Internet companies will provide Level 3 platforms." There are three problems with this argument.

One: Ning has the same problem as Salesforce: only their part of the application infrastructure is scalable. What if I want to add a new service that is specific to my application? Let's say I want to send mass emailings for an invitation feature, for example. How do I make my infrastructure for this run inside their platform? I don't. Which means I have to be able to build a scalable infrastructure anyway. Which means I might as well do the whole thing. But Ning might say their functionality is so compelling that it's worth the trade-off. You can always build those as external services. Which brings us back to this: if I have to do one part, I might as well do it all. And it also brings us to the second problem with the L3 platform model.

Two: How compelling will each L3 domain be? You have to be very, very attractive to even get someone to consider assimilating into a platform. Ning has done an excellent job at this. But how many other companies in how many other domains will do as good a job? Precious few, I would think.

Three: Mr. Andreessen maintains it is "really easy to learn how to program -- in fact, it's never been easier." So centering the L3 platform definition around programmability is not seen as a concern. But programming is not easy. It's very hard. Especially with such poorly documented systems. The more code you have to write the further you are away from your goal and the further you are away from adoption. This is why we see systems like Drupal with well defined plug-in architectures being very popular. Most people can't and won't ever program, so building things from pre-existing parts (like how our bodies evolved) allows people to get a lot of core functionality with the chance for specialization and expandability.

What does this mean for you?

I've found it difficult to reconcile all the different pros and cons of each approach. There is definite value in all these alternatives. If you have a vision for an application then building it yourself is the only way you'll achieve that vision. So do it yourself. But what good is a vision without users? So go Facebook. But I could get something going very quickly in Ning and then expand over time with much less hassle, even if it's not exactly what I want. So go Ning. What to do? The point of this post isn't to come to a conclusion. The point has been to cover some new and different approaches to scalability so you can spend a few sleepless nights pondering your options too :-)

The three kinds of platforms you meet on the Internet by Marc Andreessen

Analyzing the Facebook Platform, three weeks in by Marc Andreessen

Q&A with iLike’s Ali Partovi, on Facebook by Eric Eldon

I want to understand Ning's architecture and how it works

Response to Three Platforms You Meet by Joshua

Ning's Developer Documentation

Facebook's Application Architecture

Salesforce's On-Demand Computing Platform

Building a Business on Virtual Infrastructure, Using Google and salesforce.com


Todd Hoff |
