High Scalability -

Entries in nosql (54)

Hot Scalability Links for October 30 2009

Friday, October 30, 2009 at 6:58AM

Life beyond Distributed Transactions: an Apostate’s Opinion by Pat Helland. In particular, we focus on the implications that fall out of assuming we cannot have large-scale distributed transactions.
Tragedy of the Commons, and Cold Starts - Cold application starts on Google App Engine kill your application's responsiveness.
Intel’s 1M IOPS desktop SSD setup by Kevin Burton. What do you get when you take 7 Intel SSDs and throw them in a desktop? 1M IOPS.
Videos from NoSQL Berlin sessions. Nicely done talks on CAP, MongoDB, Redis, 4th generation object databases, CouchDB, and Riak.
Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean of Google describing how they do their thing. Here are some glosses on the talk by Greg Linden and James Hamilton. You really can't do better than Greg and James.
- Advice from Google on Large Distributed Systems by Greg Linden. A nice summary of Jeff Dean's talk. A standard Google server appears to have about 16 GB of RAM and 2 TB of disk; Things will crash. Deal with it!; When designing for scale, you should design for expected load, ensure it still works at 10x, but don't worry about scaling to 100x.
- Jeff Dean: Design Lessons and Advice from Building Large Scale Distributed Systems by James Hamilton. A data center wide storage hierarchy; Failure Inevitable; Excellent set of distributed systems rules of thumb; Typical first year for a new cluster; GFS Usage at Google; Working on next generation Big Table system called Spanner.

HighScalability Team |

google,

hot links,

nosql

Paper: No Relation: The Mixed Blessings of Non-Relational Databases

Thursday, October 29, 2009 at 9:14AM

This excellent survey of the field was written by Ian Thomas Varley as part of his Master of Science in Engineering program.

The aim of this paper is to explore the conceptual design space of non-relational databases as compared to traditional relational databases. It is clear that the design needs of the two paradigms are different, but how fundamental are the differences, and what strategies can we use to transition our conceptual designs from one to the other?

There are a few things to like about this paper. A running example is used to show the different ways to model data depending on which type of solution you are targeting, with particular attention to how many-to-many relationships are modeled, how data integrity is maintained, and how optional attributes are supported. There's also a brief survey of some of the major systems.

The most interesting section of the report is where it tackles the problem of design for non-relational systems. The approach has two different phases: design questions and design strategies.

The questions you should ask yourself about your problem are:

Click to read more ...

HighScalability Team |

Paper,

key-value store,

nosql

Digg - Looking to the Future with Cassandra

Thursday, October 29, 2009 at 8:47AM

Digg has been researching ways to scale our database infrastructure for some time now. We’ve adopted a traditional vertically partitioned master-slave configuration with MySQL, and also investigated sharding MySQL with IDDB. Ultimately, these solutions left us wanting. In the case of the traditional architecture, the lack of redundancy on the write masters is painful, and both approaches have significant management overhead to keep running.

Since it was already necessary to abandon data normalization and consistency to make these approaches work, we felt comfortable looking at more exotic, non-relational data stores. After considering HBase, Hypertable, Cassandra, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite, we settled on Cassandra.

Each system has its own strengths and weaknesses, but Cassandra has a good blend of everything. It offers column-oriented data storage, so you have a bit more structure than plain key/value stores. It operates in a distributed, highly available, peer-to-peer cluster. While it’s currently lacking some core features, it gets us closer to where we want to be than the other solutions.
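To make the "a bit more structure than plain key/value stores" point concrete, here is a minimal in-memory sketch in Python. It is not Cassandra's API and says nothing about how Digg uses it; the class name `ColumnFamily`, its methods, and the row key `story:1` are all hypothetical, chosen only to contrast an opaque value per key with a row of individually addressable, name-sorted columns.

```python
# Illustrative model only: NOT Cassandra's client API. All names are hypothetical.

# Plain key/value store: one opaque value per key. Changing a single field
# means reading, modifying, and rewriting the whole value.
kv_store = {"story:1": '{"title": "Hello", "digg_count": 5}'}


class ColumnFamily:
    """Rows addressed by key; each row holds named columns that can be
    written and read individually."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, column, value):
        # Write one column without touching the rest of the row.
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        # Read one column; raises KeyError if the row or column is missing.
        return self._rows[row_key][column]

    def get_slice(self, row_key, start, end):
        # Return (name, value) pairs whose column names fall in the
        # inclusive range [start, end], ordered by column name.
        row = self._rows.get(row_key, {})
        return [(name, row[name]) for name in sorted(row) if start <= name <= end]


stories = ColumnFamily()
stories.insert("story:1", "title", "Hello")
stories.insert("story:1", "url", "http://example.com/")
stories.insert("story:1", "digg_count", 5)

# Update a single column in place; the other columns are untouched.
stories.insert("story:1", "digg_count", 6)
```

The design point is only that columns are addressable on their own and can be read as a sorted range within a row; distribution, replication, and persistence are deliberately left out of this sketch.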

continue...

fulvio longhi |

tagged

cassandra,

digg,

nosql

Building Scalable Databases: Denormalization, the NoSQL Movement and Digg

Thursday, September 10, 2009 at 6:27AM

Database normalization is a technique for designing relational database schemas that ensures that the data is optimal for ad-hoc querying and that modifications such as deletion or insertion of data do not lead to data inconsistency. Database denormalization is the process of optimizing your database for reads by creating redundant data. A consequence of denormalization is that insertions or deletions could cause data inconsistency if not uniformly applied to all redundant copies of the data within the database.
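The trade-off described above can be sketched in a few lines of Python. This is a toy illustration using in-memory dictionaries, not a real schema from Digg or from the linked post; the `users` and `comments` data and every function name here are hypothetical. It shows a normalized read that needs a lookup per row, a denormalized copy that avoids the lookup, and the inconsistency that appears when a change is not applied to every redundant copy.

```python
# Toy data, hypothetical: each user's name is stored once (normalized).
users = {1: {"name": "alice"}}
comments = [
    {"id": 10, "user_id": 1, "text": "first"},
    {"id": 11, "user_id": 1, "text": "second"},
]


def render_normalized(comments, users):
    # Read path needs a join-like lookup into `users` for every comment.
    return [f"{users[c['user_id']]['name']}: {c['text']}" for c in comments]


def denormalize(comments, users):
    # Copy the author's name into each comment so reads need no lookup.
    return [dict(c, user_name=users[c["user_id"]]["name"]) for c in comments]


def rename_user_everywhere(user_id, new_name, users, denorm_comments):
    # A correct write must touch the source record AND every redundant copy.
    users[user_id]["name"] = new_name
    for c in denorm_comments:
        if c["user_id"] == user_id:
            c["user_name"] = new_name


def find_inconsistent(denorm_comments, users):
    # Ids of comments whose copied name no longer matches the source record.
    return [
        c["id"]
        for c in denorm_comments
        if c["user_name"] != users[c["user_id"]]["name"]
    ]
```

Updating only `users[1]["name"]` leaves both denormalized comments stale, which `find_inconsistent` would report; calling `rename_user_everywhere` instead keeps the copies in step at the cost of a more expensive write.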

Read more on Carnage4life blog...

mg1313 |

digg,

nosql,

scalable