Entries in cassandra (4)

Thursday
Jun272019

2019 Open Source Database Report: Top Databases, Public Cloud vs. On-Premise, Polyglot Persistence

2019 Open Source Database Report: Top Databases, Public Cloud vs. On-Premise, Polyglot Persistence

Ready to transition from a commercial database to open source, and want to know which databases are most popular in 2019? Wondering whether an on-premise vs. public cloud vs. hybrid cloud infrastructure is best for your database strategy? Or, considering adding a new database to your application and want to see which combinations are most popular? We found all the answers you need at the Percona Live event last month, and broke down the insights into the following free trends reports:

Click to read more ...

Monday
Aug012016

How to Setup a Highly Available Multi-AZ Cassandra Cluster on AWS EC2

 

This is a guest post by Alessandro Pieri, Software Architect at Stream. Try out this 5 minute interactive tutorial to learn more about Stream’s API.

Originally built by Facebook in 2009, Apache Cassandra is a free and open-source distributed database designed to handle large amounts of data across a large number of servers. At Stream, we use Cassandra as the primary data store for our feeds. Cassandra stands out because it’s able to:

  • Shard data automatically

  • Handle partial outages without data loss or downtime

  • Scales close to linearly

If you’re already using Cassandra, your cluster is likely configured to handle the loss of 1 or 2 nodes. However, what happens when a full availability zone goes down?

In this article you will learn how to setup Cassandra to survive a full availability zone outage. Afterwards, we will analyze how moving from a single to a multi availability zone cluster impacts availability, cost, and performance.

Recap 1: What Are Availability Zones?

Click to read more ...

Thursday
Oct292009

Digg - Looking to the Future with Cassandra

Digg has been researching ways to scale our database infrastructure for some time now. We’ve adopted a traditional vertically partitioned master-slave configuration with MySQL, and also investigated sharding MySQL with IDDB. Ultimately, these solutions left us wanting. In the case of the traditional architecture, the lack of redundancy on the write masters is painful, and both approaches have significant management overhead to keep running.

Since it was already necessary to abandon data normalization and consistency to make these approaches work, we felt comfortable looking at more exotic, non-relational data stores. After considering HBase, Hypertable, Cassandra, Tokyo Cabinet/Tyrant, Voldemort, and Dynomite, we settled on Cassandra.

Each system has its own strengths and weaknesses, but Cassandra has a good blend of everything. It offers column-oriented data storage, so you have a bit more structure than plain key/value stores. It operates in a distributed, highly available, peer-to-peer cluster. While it’s currently lacking some core features, it gets us closer to where we want to be than the other solutions.

continue... 

Wednesday
Jul012009

Podcast about Facebook's Cassandra Project and the New Wave of Distributed Databases

In this podcast, we interview Jonathan Ellis about how Facebook's open sourced Cassandra Project took lessons learned from Amazon's Dynamo and Google's BigTable to tackle the difficult problem of building a highly scalable, always available, distributed data store.