7 Design Patterns for Almost-infinite Scalability

Thursday

Dec162010

7 Design Patterns for Almost-infinite Scalability

Thursday, December 16, 2010 at 8:02AM

Good article from manageability.com summarizing design patterns from Pat Helland's amazing paper Life beyond Distributed Transactions: an Apostate's Opinion.

Entities are uniquely identified - each entity which represents disjoint data (i.e. no overlap of data between entities) should have a unique key.
Multiple disjoint scopes of transactional serializability - in other words there are these 'entities' and that you cannot perform atomic transactions across these entities.
At-Least-Once messaging - that is an application must tolerate message retries and out-of-order arrival of messages.
Messages are adressed to entities - that is one can't abstract away from the business logic the existence of the unique keys for addresing entities. Addressing however is independent of location.
Entities manage conversational state per party - that is, to ensure idemptency an entity needs to remember that a message has been previously processed. Furthermore, in a world without atomic transactions, outcomes need to be 'negotiated' using some kind of workflow capability.
Alternate indexes cannot reside within a single scope of serializability - that is, one can't assume the indices or references to entities can be update atomically. There is the potential that these indices may become out of sync.
Messaging between Entities are Tentative - that is, entities need to accept some level of uncertainty and that messages that are sent are requests form commitment and may possibly be cancelled.

The article then compares how these principles compare to the design principles used to develop S3:

Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.
Asynchrony: The system makes progress under all circumstances.
Autonomy: The system is designed such that individual components can make decisions based on local information.
Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.
Controlled concurrency: Operations are designed such that no or limited concurrency control is required.
Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.
Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.
Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for every one, but instead build small components that can be used as building blocks for other services.
Symmetry: Nodes in the system are identical in terms of functionality, and require no or minimal node-specific configuration to function.
Simplicity: The system should be made as simple as possible (- but no simpler).

General Chicken |

4 Comments |

Permalink |

Print Article

Email Article

Strategy

Reader Comments (4)

What's called entities here, is normally called (in DDD that is) aggregates.

December 16, 2010 |

Erik van Oosten

Is there nothing more relevant on this topic that we have to go back to articles from 2008 ? Pat isn't even at Amazon anymore.

December 17, 2010 |

bradrover

I've read stuff like that before. The one question which is never answered is: How do you implement something like "safe money transfer between two accounts" under the rule: "Multiple disjoint scopes of transactional serializability" ??? That might be alright for entertainment like FarmVille, but what about actual business applications?

December 18, 2010 |

Sebastien Diot

Solution to money transfer example should be environment where you can choose concurrency control algorithm, for each index (etc.) you can choose if it is synchronously or asynchronously modified and for each read operation in each transaction you can choose if it needs lock on read data or not. And all this distributed and scalable to thousands of nodes and robust enough and prepared for all possible error situations. Then it will be possible to do operations with highest transaction isolation level where needed (accounts), without isolation where possible (most outputs presented to user) and asynchronously also where possible (some searches, transactions with only informative importancy, etc. ).

December 19, 2010 |

Jan Kmetko

Post a New Comment

Enter your information below to add a new comment.

Author:

Author Email (optional):

Author URL (optional):

Post:

↓ | ↑

Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>