Berkeley DB Architecture - NoSQL Before NoSQL was Cool

Monday, February 20, 2012 at 8:56AM

After the filesystem and simple library packages like dbm, Berkeley DB was the original luxury embedded database, widely used by applications as their core database engine. NoSQL before NoSQL was cool. The hidden secret making complex applications sing. If you want to dispense with all the network overhead of a server-based system, it's still a good choice.
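To make "embedded" concrete, here is a minimal sketch using Python's standard-library `dbm.dumb` module, one of the simple dbm-style packages mentioned above. It is not Berkeley DB's API; the key names and the `demo_embedded_store` helper are made up for illustration. The point is only that every read and write is an in-process function call against a local file, with no server and no network round trip.

```python
import dbm.dumb
import os
import tempfile


def demo_embedded_store(path):
    """Write two records to a dbm-style file, reopen it, and read them back.

    `path` is a hypothetical file prefix chosen for this illustration.
    """
    # Open (creating if needed). Each operation below is a plain function
    # call inside this process; nothing is sent to a database server.
    with dbm.dumb.open(path, "c") as db:
        db[b"user:1"] = b"alice"
        db[b"user:2"] = b"bob"

    # Reopen read-only to show that the data persisted on disk.
    with dbm.dumb.open(path, "r") as db:
        return db[b"user:1"], sorted(db.keys())


workdir = tempfile.mkdtemp()
value, keys = demo_embedded_store(os.path.join(workdir, "example"))
print(value, keys)
```

A dbm-style store like this offers only key/value access. The transactions, locking, logging, and multiple access methods described in the chapter are what Berkeley DB layers on top of the same embedded model.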

There's a great writeup of the architecture behind Berkeley DB in the book The Architecture of Open Source Applications. If you want to understand more about how a database works, or if you are pondering how to build your own, it's rich in detail, explanations, and lessons. Here's the Berkeley DB chapter from the book. It covers topics like: Architectural Overview; The Access Methods: Btree, Hash, Recno, Queue; The Library Interface Layer; The Buffer Manager: Mpool; Write-ahead Logging; The Lock Manager: Lock; The Log Manager: Log; The Transaction Manager: Txn.

HighScalability Team |

Example,

nosql

Reader Comments (3)

Absolutely, BerkeleyDB is amazingly fast if your scenario is a non-sharded single web server. I've used BerkeleyDB on BookMooch.com for 6 years, and real-world query speeds of 2 million queries per second, for a single web page, are typical. Those aren't simulated speeds: that's real world, after locking, overhead, etc. That kind of speed lets me use BerkeleyDB in place of in-memory arrays, and then you get automatic persistence (much like Perl does).

-john

February 20, 2012 |

John Buckman

Let me say first that MarkLogic came up with the "NoSQL Before NoSQL was Cool" moniker quite some time ago (I have the shirt to prove it). That being said, MarkLogic is the de facto XML database when it comes to speed and scalability. MarkLogic does not require horizontal sharding, because it was built for clustering and coordination of thousands of nodes and petabytes of data. MarkLogic has installations in some of the largest companies with massively complex content/data and can perform subsecond queries against any node or document. I think Berkeley DB is a great tool, but it is novel at best; I would be interested in who in the enterprise is using it and at what scale. As someone who has worked intimately with and for MarkLogic, I know it scales and solves a lot of informational problems, whether you have 100 GB or 100 TB of content.

-Gary Vidal

February 25, 2012 |

Gary Vidal

Please check out Bangdb. The embedded version is currently released at www.iqlect.com. There is an interesting performance comparison document covering BerkeleyDB and LevelDB. Please check it out when you can.
Thanks

September 16, 2012 |

sachin
