Sunday, Jan 04, 2009
Alternative Memcache Usage: A Highly Scalable, Highly Available, In-Memory Shard Index

While working with Memcache the other night, it dawned on me that its use as a distributed caching mechanism is really just one of many ways to use it. There are in fact many alternative uses for Memcache once you realize what it really is at its core: a simple distributed hash-table. That point is worth further discussion.
To be clear, when I say “simple”, I am by no means implying that Memcache’s implementation is simple, just that the ideas behind it are. Think about that for a minute. What else could we use a simple distributed hash-table for, besides caching? How about using it as an alternative to the traditional shard lookup method we used in our Master Index Lookup scalability strategy, discussed previously here?
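To make the idea a little more concrete, here is a minimal sketch of what a hash-table-backed shard index might look like. Everything in it is an assumption for illustration: `InMemoryCache` is a hypothetical stand-in that models only the `get`/`set` calls a Memcache client typically offers, `ShardIndex` and the `shard:` key prefix are made-up names, and the `load_from_master` callback stands in for whatever durable master index you already have. Treat it as a rough outline of the lookup path, not a finished implementation.

```python
from typing import Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for a Memcache client; only get/set are modeled."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class ShardIndex:
    """Maps an entity key (for example a user id) to the shard holding its data."""

    def __init__(self, cache, load_from_master: Callable[[str], Optional[str]]) -> None:
        self._cache = cache
        # Assumed fallback: a lookup against the durable master index.
        self._load_from_master = load_from_master

    @staticmethod
    def _cache_key(entity_key: str) -> str:
        # Illustrative prefix so index entries do not collide with other keys.
        return "shard:" + entity_key

    def assign(self, entity_key: str, shard: str) -> None:
        """Record which shard owns the entity."""
        self._cache.set(self._cache_key(entity_key), shard)

    def lookup(self, entity_key: str) -> Optional[str]:
        """Return the owning shard, consulting the master index on a miss."""
        key = self._cache_key(entity_key)
        shard = self._cache.get(key)
        if shard is None:
            shard = self._load_from_master(entity_key)
            if shard is not None:
                # Repopulate the in-memory index for later lookups.
                self._cache.set(key, shard)
        return shard


if __name__ == "__main__":
    master_index = {"user:42": "shard-03"}  # pretend durable master index
    index = ShardIndex(InMemoryCache(), master_index.get)
    print(index.lookup("user:42"))
```

The fallback to the master index is a design choice of this sketch: if an entry is missing from the in-memory index, the lookup still succeeds and the entry is written back.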
Reader Comments (4)
We use it as a session store behind tomcat so that we can round robin requests to our tomcat servers. Thus no sticky sessions as is the usual practice.
At my place, we use it as a routing lookup table in proxies: nginx backed by a small FastCGI application written in C that does the lookup and returns which backend has the data. Works great!
Many people use memcached for different things, primarily due to its stability and ease of use. However, you always need to remember that in its original use case, memcached was meant just as an accelerator - if it disappeared, nothing was assumed to break as queries would just go all the way to the database. Memcached does not provide any recovery functionality (in response to a failure) out of the box, because it was originally meant to be "disposable" so to speak.
So if you are using memcached as your primary data store, you need to plan carefully how you are going to restore the data if your memcached becomes inaccessible (potentially due to a network failure or host problems).
In the area of distributed, eventually consistent key-value data stores, you might want to check out the Dynamo whitepaper from Amazon (search on http://allthingsdistributed.com). There are several open source implementations of Dynamo too; look for cliffmoon/dynomite or tuulos/ringo on GitHub, for example.