Entries by Todd Hoff (380)

Tuesday
May 5, 2009

Drop ACID and Think About Data

The abstract for the talk given by Bob Ippolito, co-founder and CTO of Mochi Media, Inc:

Building large systems on top of a traditional single-master RDBMS data storage layer is no longer good enough. This talk explores the landscape of new technologies available today to augment your data layer to improve performance and reliability. Is your application a good fit for caches, bloom filters, bitmap indexes, column stores, distributed key/value stores, or document databases? Learn how they work (in theory and practice) and decide for yourself.
Bob does an excellent job highlighting different products and the key concepts to understand when pondering the wide variety of new database offerings. It's unlikely you'll be able to say "oh, this is the database for me" after watching the presentation, but you will be much better informed on your options. And I imagine slightly confused as to what to do :-) An interesting observation in the talk is that the more robust products are either internal to large companies like Amazon and Google or are commercial. A lot of the open source products aren't yet considered ready for prime time, and Bob encourages developers to join a project and make patches rather than start yet another half-finished key-value store clone. From my monitoring of the interwebs this does seem to be happening and existing products are starting to mature. Of all the choices discussed, the column database Vertica seems closest to Bob's heart and it's the product they use. It supports clustering, column storage, compression, bitmapped indexes, bloom filters, grids, and lots of other useful features. And most importantly: it works, which is always a plus :-) Here's a summary of some of the points talked about in the presentation:
  • Video Presentation of Drop ACID and Think About Data

    ACID

  • The claim to fame for relational databases is that they make the ACID promise (a minimal sketch of an ACID transaction follows this list): * Atomicity - a transaction is all or nothing * Consistency - only valid data is written to the database * Isolation - pretend all transactions are happening serially and the data is correct * Durability - what you write is what you get
  • The problem with ACID is that it gives you too much; it trips you up when you are trying to scale a system across multiple nodes.
  • Downtime is unacceptable, so your system needs to be reliable. Reliability requires multiple nodes to handle machine failures.
  • To make a scalable system that can handle lots and lots of reads and writes you need many more nodes.
  • Once you try to scale ACID across many machines you hit problems with network failures and delays. The algorithms don't work in a distributed environment at any acceptable speed.
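    To make the ACID promise concrete, here's a minimal sketch of an atomic transfer using SQLite from Python; the table and amounts are made up for illustration. Either both updates commit or neither does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # one transaction: commits on success, rolls back on any exception
        conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'bob'")
except sqlite3.Error:
    pass  # the transfer is all or nothing; a failure leaves both rows untouched

print(conn.execute("SELECT * FROM accounts").fetchall())
```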

    CAP

  • If you can't have all of the ACID guarantees it turns out you can have two of the following three characteristics: * Consistency - your data is correct all the time. What you write is what you read. * Availability - you can read and write your data all the time * Partition Tolerance - if one or more nodes fails the system still works and becomes consistent when the system comes back on-line.

    BASE

  • The types of large systems based on CAP aren't ACID, they are BASE (har har): * Basically Available - system seems to work all the time * Soft State - it doesn't have to be consistent all the time * Eventually Consistent - becomes consistent at some later time
  • Everyone who builds big applications builds them on CAP and BASE: Google, Yahoo, Facebook, Amazon, eBay, etc.

    Google's BigTable

  • Google BigTable - manages data across many nodes.
  • Paxos (used by Chubby) - a distributed consensus algorithm used to manage locks across systems.
  • BigTable Characteristics: * stores data in tablets using GFS, a distributed file system. * compression - great gains in throughput, can store more, reduces the IO bottleneck because you store less so you have to talk to the disks less, so performance improves. * single master - one node knows everything about all the other nodes (backed up and cached). * hybrid between row and column database ** row database - store objects together ** column database - store attributes of objects together. Makes sequential retrieval very fast, allows very efficient compression, reduces disk seeks and random IO. * versioning * bloom filters - allows data to be distributed across a bunch of nodes. It's a calculation on data that probabilistically maps the data to the nodes it can be found on. * eventually consistent - append only system using a row time stamp. When a client queries they get several versions and the client is in charge of picking the most recent.
  • Pros: * Compression is awesome. He really thinks compression is an important attribute of a system. * Clients are probably simple. * Integrates with map-reduce.
  • Cons: * Proprietary to Google - You can't use it on your system. * Single-master - could be a downside but not sure.

    Amazon's Dynamo

  • A giant distributed hash table, called a key-value store.
  • Uses consistent hashing to distribute data to one or more nodes for redundancy and performance (see the sketch after this list). * Consistent hashing - a ring of nodes and a hash function picks which node(s) to store data on * Consistency between nodes is based on vector clocks and read repair. * Vector clocks - a time stamp on every row for every node that has written to it. * Read repair - when a client does a read and the nodes disagree on the data, it's up to the client to select the correct data and tell the nodes the new correct state.
  • Pros: * No master - eliminates the single point of failure. * Highly available for writes - this is the partition tolerance aspect of CAP. You can write to many nodes at once, so depending on the number of replicas maintained (which is configurable) you should always be able to write somewhere. So users will never see a write failure. * Relatively simple, which is why we see so many clones.
  • Cons: * Proprietary. * Clients have to be smart to handle read repair, rebalancing a cluster, hashing, etc. Client proxies can handle these responsibilities but that adds another hop. * No compression, so IO isn't reduced. * Not suitable for column-like workloads; it's just a key-value store, so it's not optimized for analytics. Aggregate queries, for example, aren't in its wheelhouse.
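    Here's a minimal consistent hashing sketch in Python to make the ring idea concrete. The node names, replica count, and choice of MD5 are illustrative assumptions, not Dynamo's actual implementation: keys hash onto a ring and each key is stored on the next N distinct nodes clockwise.

```python
import hashlib
from bisect import bisect

def ring_position(value):
    """Map a string onto a fixed position on the hash ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self.ring = sorted((ring_position(n), n) for n in nodes)

    def nodes_for(self, key):
        """Walk clockwise from the key's position and take the next `replicas` distinct nodes."""
        start = bisect(self.ring, (ring_position(key),)) % len(self.ring)
        picked = []
        for i in range(len(self.ring)):
            node = self.ring[(start + i) % len(self.ring)][1]
            if node not in picked:
                picked.append(node)
            if len(picked) == self.replicas:
                break
        return picked

ring = ConsistentHashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.nodes_for("user:42"))  # the three replicas responsible for this key
```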

    Facebook's Cassandra

  • Peer-to-peer, so there's no master, like Dynamo
  • Storage model more like BigTable
  • Pros: * Open source. You can use it. * Incrementally scalable - as data grows you can add more nodes to the storage mesh. * Minimal administration - because it's incremental you don't have to do a lot of up-front planning for migration.
  • Cons: * Not polished yet. It was built for in-box searching so it may not work well for other use cases. * No compression yet.

    Distributed Database Musings

  • Distributed databases are the new web framework.
  • None are awesome yet. No obvious winners.
  • There are many clones with only partial features implemented. * For example, Project Voldemort has no rebalancing and no garbage collection.
  • Pick one and start submitting patches. Don't start another half-baked clone.

    Simple Key-Value Store

  • Some people are using simple key-value stores to replace relational databases.
  • A key (array of bytes) maps using a hash to a value (a BLOB). It's like an associative array.
  • They are really fast and simple.

    Memcached Key-Value Stores

  • Is a key-value store that people use as a cache.
  • No persistence
  • RAM only
  • LRU so it throws data away on purpose when there's too much data
  • Lightning fast
  • Everyone uses it, so it's well supported.
  • A good first strategy in removing load from the database.
  • Dealing with mutable data in a cache is really hard. Adding a cache to an ACID system is something you'll probably get wrong, and it's difficult to debug because it does away with several ACID properties (a cache-aside sketch follows this list): * Isolation is gone with multiple writers. Everyone sees the current written value, whereas in a database you see a consistent view of the database. * On a transaction failure the cache may reflect the new data when it has been rolled back in the database. * Dependent cache keys are difficult to program correctly because they aren't transactional. Consistency is impossible to keep. Update one key and what happens to the dependent keys? * It's complicated and you'll get it wrong and lose some of the consistency that the database had given you.
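    As a concrete illustration of where this bites, here's a minimal cache-aside sketch in Python. A plain dict stands in for memcached and FakeDB for the ACID database; the key and field names are made up:

```python
class FakeDB:
    """Stands in for the ACID database behind the cache."""
    def __init__(self):
        self.rows = {1: {"name": "alice", "plan": "free"}}

    def load_user(self, user_id):
        return self.rows[user_id]

    def update_user(self, user_id, fields):
        self.rows[user_id] = fields  # a real database could roll this back

cache = {}  # stands in for memcached

def get_user(db, user_id):
    """Cache-aside read: check the cache first, fall back to the database."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = db.load_user(user_id)
        cache[key] = user
    return user

def update_user(db, user_id, fields):
    cache[f"user:{user_id}"] = fields  # the trap: if the database transaction
    db.update_user(user_id, fields)    # below rolls back, the cache now holds
                                       # data the database never committed, and
                                       # dependent keys (say "plan:free:members")
                                       # were never invalidated at all.

db = FakeDB()
print(get_user(db, 1))
update_user(db, 1, {"name": "alice", "plan": "pro"})
print(get_user(db, 1))
```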

    Tokyo Cabinet/Tyrant Key-Value Store

  • Similar use cases as for BerkeleyDB.
  • Disk persistence. Can store data larger than RAM.
  • Performs well.
  • Actively developed. Lots of developers adding new features (but not bug fixes).
  • Similar replication strategy to MySQL. Not useful for scalability as it limits the write throughput to one node.
  • Optional compressed pages so has some compression advantages.

    Redis Data Structure Store

  • Very new.
  • It's a data structure store not a key-value store, which means it understands your values so you can operate on them. Typically in a key-value store the values are opaque.
  • Can match on key spaces. You can look for all keys that match an expression.
  • Understands lists and sets, so you can do list and set operations in the database server process, which is much more efficient because all the data doesn't have to be paged out to the client (see the sketch after this list). Updates can then be done atomically on the server side, which is difficult to do on the client side.
  • Big downside: it requires that the full data set fit in RAM. It can't serve data from disk.
  • It is disk backed so it is reliable over a reboot, but still RAM limited.
  • Maybe useful as a cache that supports higher level operations than memcache.
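    A minimal sketch with the redis-py client showing server-side list and set operations. The key names and values are made up, and the exact command set in 2009-era Redis was smaller, so treat this as an illustration of the idea rather than a feature list:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed local server

# Lists: push onto a timeline and read a page of it, all inside the server.
r.rpush("timeline:todd", "post:1", "post:2", "post:3")
print(r.lrange("timeline:todd", 0, 1))  # first two entries

# Sets: compute an intersection on the server instead of shipping both
# whole sets to the client and intersecting there.
r.sadd("tags:nosql", "bigtable", "dynamo", "redis")
r.sadd("tags:opensource", "cassandra", "redis", "couchdb")
print(r.sinter("tags:nosql", "tags:opensource"))  # {b'redis'}

# Key-space matching: find every key under a prefix.
print(r.keys("tags:*"))
```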

    Document Databases

  • Schema-free. You don't have to say which attributes are in the values that are stored.
  • Makes schema migration easier because it doesn't care what fields you have. Applications must be coded to handle the different versions of documents.
  • Great for storing documents.

    CouchDB Document Database

  • Apache project so you can use it.
  • Written in Erlang
  • Asynchronous replication. Still limited to the write speed of one node.
  • JSON based so easy to use on the web.
  • Queries are done in a map-reduce style using views (a sketch of defining and querying a view follows this list). - A view is created by writing a JavaScript function that is applied to all documents in the document store. This creates a matching list of documents. - Views are materialized on demand. Once you hit the view, it saves the list until an update occurs.
  • Neat admin UI.
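    Here's a minimal sketch of defining and querying a view over CouchDB's HTTP/JSON API from Python using requests. The database name, design document name, and document fields are assumptions for illustration:

```python
import requests

DB = "http://localhost:5984/blog"  # assumed local CouchDB and database name

requests.put(DB)  # create the database if it doesn't exist yet

# A design document holding one view; the map function is ordinary JavaScript
# shipped as a string in the JSON body.
design = {
    "views": {
        "titles_by_author": {
            "map": "function(doc) { if (doc.type == 'post') emit(doc.author, doc.title); }"
        }
    }
}
requests.put(DB + "/_design/example", json=design)

requests.post(DB, json={"type": "post", "author": "todd", "title": "Drop ACID"})

# CouchDB materializes the view on first access and reuses it until documents change.
rows = requests.get(DB + "/_design/example/_view/titles_by_author").json()["rows"]
for row in rows:
    print(row["key"], "->", row["value"])
```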

    MongoDB Document Database

  • Written in C++
  • Significantly faster than CouchDB
  • JSON and BSON (binary JSON-ish) formats.
  • Asynchronous replication with auto-sharding coming soon.
  • Supports indexes. Querying a property is quick because an index is automatically kept up to date on writes. Trades off some write speed for more consistent read speed.
  • Documents can be nested, unlike CouchDB, which requires the application to keep track of relationships (see the sketch after this list). The advantage is that the whole object doesn't have to be written and read because the system knows about the relationship. An example is a blog post and its comments: in CouchDB the post and comments are stored together, so a view has to walk through all the comments even though you are only interested in the blog post. Nesting gives better write and query performance.
  • More advanced queries than CouchDB.
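    A minimal pymongo sketch of the nested-documents-and-indexes point; the database, collection, and field names are made-up assumptions:

```python
from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)  # assumed local server
posts = client.blog.posts                 # hypothetical database/collection names

# A nested document: the comments live inside the post itself.
posts.insert_one({
    "title": "Drop ACID and Think About Data",
    "author": "todd",
    "comments": [
        {"who": "alice", "text": "Great summary."},
        {"who": "bob", "text": "What about graph stores?"},
    ],
})

# An index on a queried property trades a little write speed for faster,
# more predictable reads.
posts.create_index([("author", ASCENDING)])

# Query by the indexed field and project only the parts you care about.
for post in posts.find({"author": "todd"}, {"title": 1, "comments.who": 1}):
    print(post)
```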

    Column Databases

  • Some of this model is implemented by BigTable, Cassandra, and HyperTable.
  • Sequential reads are fast because data in a column is stored together.
  • Columns compress better than rows because the data is similar.
  • Each column is stored separately, so IO is efficient because only the columns of interest are scanned. When using a column database you are almost always scanning the entire column.
  • Bitmap indexes for fast sequential scans. * Turning cell values into 1 or more bits. * Compression reduces IO even further. * Indexes can be logically ANDed and ORed together to know which rows to select. * Used for big queries and for performing joins of multiple tables. When a row is 2 bits (for example) there's a lot less IO than working on uncompressed, unbitmapped values.
  • Bloom Filters (a minimal sketch follows this list) * Used by BigTable, Cassandra and other projects. * Probabilistic data structure. * Lossy, so you can lose data. * In exchange for losing data you can store all the information in constant space. * Gives you false positives at a known error rate. * Store a bloom filter for a bunch of nodes. Squid uses this for its cache protocol. Knowing a bloom filter you can locally perform a computation to know which nodes the data may be on. If a bit is not set in the filter then the data isn't on the node. If a bit is set it may or may not be on the node. * Used for finding stuff and approximating counts in constant space.
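    A tiny illustrative Bloom filter in Python (the bit-array size, hash construction, and keys are arbitrary assumptions): it never produces false negatives, so a "no" answer lets you skip a node or disk read entirely, at the cost of an occasional false positive.

```python
import hashlib

class BloomFilter:
    """Constant-space membership test with false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an integer used as a bit array

    def _positions(self, item):
        # Derive several bit positions from salted hashes of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True means "maybe present".
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = BloomFilter()
bf.add("row:user:42")
print(bf.might_contain("row:user:42"))   # True
print(bf.might_contain("row:user:999"))  # almost certainly False
```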

    MonetDB Column Database

  • Research project which crashes a lot and corrupts your data.

    LucidDB Column Database

  • Java/C++ open source data warehouse
  • No clustering so only single node performance, but that can be enough for the applications column stores are good at.
  • No experience so can't speak to it.

    Vertica Column Database

  • The product they use.
  • Commercial column store based on C-store.
  • Clustered
  • Actually works.

    Related Articles

  • Availability & Consistency by Werner Vogels
  • BASE: An Acid Alternative by Dan Pritchett
  • MongoDB - a high-performance, open source, schema-free document-oriented data store that's easy to deploy, manage and use.
  • Vertica - blazing-fast data warehousing software
  • LucidDB - the first and only open-source RDBMS purpose-built entirely for data warehousing and business intelligence.
  • CouchDB - a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.
  • Memcached Tag at High Scalability
  • Key-Value Store Tag at High Scalability
  • BigTable Tag at High Scalability
  • Dynamo.


    Monday
    May 4, 2009

    STRUCTURE 09 IS BACK!

    The GigaOM Network today announces its second Structure conference after the runaway success of the 2008 event. The Structure 09 conference returns to San Francisco, Calif., on June 25th, 2009. Structure 09 (http://structureconf.com) is a conference designed to explore the next generations of Internet infrastructure. Over a year ago, The GigaOM Network founder Om Malik saw that the platforms on which we have done business for over a decade were starting to provide diminishing returns, and smart money was seeking new options. Structure 09 looks at the changing needs and rapid growth in the Internet infrastructure sector, and this year's event will consider the impact of the global economy.

    "I cannot remember a time when a new technology had so much relevance to our industry as cloud computing does in the current economic climate," said Malik. "We all need to find ways to leverage what we have and cut costs without compromising future options. Infrastructure on demand and cloud computing are very strong avenues for doing so and we will look for what practicable advice we can bring to our audience."

    "Structure 08 was a great experience for our audience and partners, and I am very pleased to be bringing it back again this year," said Malik. "Along with GigaOM lead writer Stacey Higginbotham and the conference program committee, I am bringing together what I intend to be one of the most authoritative programs for the cloud computing and Internet infrastructure space."

    The GigaOM Network is also announcing early speaker selections. Confirmed speakers include:

  • Marc Benioff, Chairman and CEO, Salesforce.com
  • Paul Sagan, President and CEO, Akamai
  • Werner Vogels, CTO, Amazon
  • Russ Daniels, VP and CTO, Cloud Services Strategy, HP
  • Raj Patel, VP of Global Networks, Yahoo!
  • Jonathan Heiliger, VP, Technical Operations, Facebook
  • Greg Papadopoulos, CTO and EVP, Research and Development, Sun Microsystems
  • Jack Waters, President, Global Network Services and CTO, Level 3 Communications
  • Michael Stonebraker, PhD, CTO and Co-Founder, Vertica Systems
  • David Yen, EVP and GM, Data Center Business Group, Juniper Networks
  • Vijay Gill, VP Engineering, Google
  • Yousef Khalidi, Distinguished Engineer, Microsoft Corporation
  • Tobias Ford, Assistant VP, IT, AT&T
  • Richard Buckingham, VP of Technical Operations, MySpace
  • Lew Tucker, VP and CTO, Cloud Computing, Sun Microsystems
  • Lloyd Taylor, VP Technical Operations, LinkedIn
  • Michael Crandell, CEO and Founder, RightScale
  • Jim Smith, General Partner, MDV-Mohr Davidow Ventures
  • Bryan Doerr, CTO, Savvis
  • Doug Judd, Principal Search Architect, Zvents
  • Brandon Watson, Director, Azure Services Platform, Microsoft
  • Jeff Hammerbacher, Chief Scientist, Cloudera
  • Jason Hoffman, PhD, CTO, Joyent
  • Mayank Bawa, CEO, Aster Data
  • James Urquhart, Market Manager, Cloud Computing and Infrastructure, Cisco Systems
  • Kevin Efrusy, General Partner, Accel
  • Lew Moorman, CEO and Founder, Rackspace
  • Joe Weinman, Strategy and Business Development VP, AT&T Business Solutions
  • Peter Fenton, General Partner, Benchmark Capital
  • David Hitz, Founder and Executive Vice President, NetApp
  • James Lindenbaum, Co-Founder and CEO, Heroku
  • Joseph Tobolski, Director of Cloud Computing, Accenture
  • Steve Herrod, CTO and Sr. VP of R&D, VMware

    Further details can be found at the Structure 09 website, http://events.gigaom.com/structure/09/. High Scalability readers can register with a $50 discount at http://structure09.eventbrite.com/?discount=HIGHSCALE


    Friday
    May 1, 2009

    FastBit: An Efficient Compressed Bitmap Index Technology

    Data mining and fast queries are always in that bin of hard-to-do things where doing something smarter can yield big results. Bloom filters are one such do-it-smarter strategy; compressed bitmap indexes are another. In one application "FastBit outruns other search indexes by a factor of 10 to 100 and doesn't require much more room than the original data size." The data size is an interesting metric. Our old standard, the b-tree, can be two to four times larger than the original data. In a test searching an Enron email database FastBit outran MySQL by 10 to 1,000 times.

    FastBit is a software tool for searching large read-only datasets. It organizes user data in a column-oriented structure which is efficient for on-line analytical processing (OLAP), and utilizes compressed bitmap indices to further speed up query processing. Analyses have proven the compressed bitmap index used in FastBit to be theoretically optimal for one-dimensional queries. Compared with other optimal indexing methods, bitmap indices are superior because they can be efficiently combined to answer multi-dimensional queries whereas other optimal methods can not.
    It's not all just map-reduce and add more servers until your attic is full.
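    To make the combinability claim concrete, here's a minimal, uncompressed bitmap index sketch in Python (FastBit itself uses compressed bitmaps, and the column values and query here are invented for illustration). A multi-dimensional query becomes bitwise AND/OR over per-value bit arrays:

```python
# A bitmap index keeps one bit array per distinct value of a column; a
# multi-dimensional query is just bitwise AND/OR over those arrays.
rows = [
    {"state": "CA", "status": "open"},
    {"state": "NY", "status": "closed"},
    {"state": "CA", "status": "closed"},
    {"state": "CA", "status": "open"},
]

def build_index(rows, column):
    index = {}
    for row_id, row in enumerate(rows):
        index.setdefault(row[column], 0)
        index[row[column]] |= 1 << row_id
    return index

state_idx = build_index(rows, "state")
status_idx = build_index(rows, "status")

# "state = 'CA' AND status = 'open'" without touching the row data:
matches = state_idx["CA"] & status_idx["open"]
print([row_id for row_id in range(len(rows)) if matches & (1 << row_id)])  # [0, 3]
```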

    Related Articles

  • FastBit: Digging through databases faster. An excellent description of how FastBit works, especially compared to b-trees.


    Sunday
    Apr 26, 2009

    Poem: Partly Cloudy

    As any reader of this site knows we're huge huge supporters of the arts. To continue that theme here's a visionary poem by Mason Hale. Few have reached for inspiration and found their muse in the emotional maelstrom that is cloud computing, but Mason has and the results speak for themselves:

    Partly Cloudy

    We have a dream
    A vision
    An aspiration
    To compute in the cloud
    To pay as we go
    To drink by the sip
    To add cores at our whim
    To write to disks with no end
    To scale up with demand
    And scale down when it ends

    Elasticity
    Scalability
    Redundancy
    Computing as a utility
    This is our dream
    Becoming reality

    But…
    There’s a hitch.
    There’s a bump in the road
    There’s a twist in the path
    There’s a detour ahead on the way to achieving our goal

    It’s the Database
    Our old friend
    He is set in his ways
    He deals in transactions to keeps things consistent
    He maintains the integrity of all his relations
    He eats disks for breakfast
    He hungers for RAM
    He loves queries and joins, and gives each one a plan
    He likes his schemas normal and strict
    His changes are atomic
    That is his schtick

    He’s an old friend as I said
    We all know him well
    So it pains me to say that in this new-fangled cloud
    He doesn’t quite fit

    Don’t get me wrong, our friend can scale as high as you want
    But there’s a price to be paid
    That expands as you grow
    The cost is complexity
    It’s more things to maintain
    More things that can go wrong
    More ways to inflict pain
    On the poor DBA who cares for our friend
    The one who backs him up and, if he dies, restores him again

    I love our old friend
    I know you do too
    But it is time for us all to own up to the fact
    That putting him into the cloud
    Taking him out of the rack
    Just causes us both more pain and more woe

    So…
    It’s time to move on
    Time to learn some new tricks
    Time to explore a new world that is less ACIDic
    It’s time to meet some new friends
    Those who were born in the cloud
    Who are still growing up
    Still figuring things out

    There’s Google’s BigTable and Werner’s SimpleDB
    There’s Hive and HBase and Mongo and Couch
    There’s Cassandra and Drizzle
    And not to be left out
    There’s Vertica and Aster if you want to spend for support
    There’s a Tokyo Cabinet and something called Redis I’m told

    It’s a party, a playgroup of newborn DB’s
    They scale and expand, they re-partition with ease
    They are new and exciting
    And still flawed to be sure
    But they’ll learn and improve, grow and mature
    They are our future

    We developers should take heed
    If our databases can change, then maybe
    Just maybe
    So can we


    Friday
    Apr 24, 2009

    INFOSCALE 2009 in June in Hong Kong

    In case you are interested, here's the info: INFOSCALE 2009: The 4th International ICST Conference on Scalable Information Systems. 10-12 June 2009, Hong Kong, China. In the last few years, we have seen the proliferation of the use of heterogeneous distributed systems, ranging from simple networks of workstations to highly complex grid computing environments. Such computational paradigms have been preferred due to their reduced costs and inherent scalability, and they pose many challenges to scalable systems and applications in terms of information access, storage and retrieval. Grid computing, P2P technology, data and knowledge bases, distributed information retrieval technology and networking technology should all converge to address the scalability concern. Furthermore, with the advent of emerging computing architectures - e.g. SMTs, GPUs, and multicores - designing techniques that explicitly target these systems is becoming more and more important. INFOSCALE 2009 will focus on a wide array of scalability issues and investigate new approaches to tackle problems arising from the ever-growing size and complexity of information of all kinds. For further information visit http://infoscale.org/2009/


    Friday
    Apr 24, 2009

    Heroku - Simultaneously Develop and Deploy Automatically Scalable Rails Applications in the Cloud

    Update 4: Heroku versus GAE & GAE/J

    Update 3: Heroku has gone live! Congratulations to the team. It's difficult right now to get a feeling for the relative cost and reliability of Heroku, but it's an impressive accomplishment and a viable option for people looking for a delivery platform.

    Update 2: Heroku Architecture. A great interactive presentation of the Heroku stack. Requests flow into Nginx, used as an HTTP reverse proxy. Nginx routes requests into a Varnish-based HTTP cache. Then requests are injected into an Erlang-based routing mesh that balances requests across a grid of dynos. Dynos are your application "VMs" that implement application-specific behaviors. Dynos themselves are a stack of: POSIX, Ruby VM, App Server, Rack, Middleware, Framework, Your App. Applications can access PostgreSQL. Memcached is used as an application caching layer.

    Update: Aaron Worsham Interview with James Lindenbaum, CEO of Heroku. Aaron nicely sums up their goal: Heroku is looking to eliminate all the reasons companies have for not doing software projects.


    Adam Wiggins of Heroku presented at the lollapalooza that was the Cloud Computing Demo Night. The idea behind Heroku is that you upload a Rails application into Heroku and it automatically deploys into EC2 and it automatically scales using behind the scenes magic. They call this "liquid scaling." You just dump your code and go. You don't have to think about SVN, databases, mongrels, load balancing, or hosting. You just concentrate on building your application. Heroku's unique feature is their web based development environment that lets you develop applications completely from their control panel. Or you can stick with your own development environment and use their API and Git to move code in and out of their system.

    For website developers this is as high up the stack as it gets. With Heroku we lose that "build your first lightsaber" moment marking the transition out of apprenticeship and into mastery. Upload your code and go isn't exactly a hero's journey, but it is damn effective...

    I must confess to having an inherent love of Heroku's idea because I had a similar notion many moons ago, but the trendy language of the time was Perl instead of Rails. At the time though it just didn't make sense. The economics of creating your own "cloud" for such a different model wasn't there. It's amazing the niches utility computing will seed, fertilize, and help grow. Even today when using Eclipse I really wish it was hosted in the cloud and I didn't have to deal with all its deployment headaches. Firefox based interfaces are pretty impressive these days. Why not?

    Adam views their stack as:
    1. Developer Tools
    2. Application Management
    3. Cluster Management
    4. Elastic Compute Cloud

    At the top level developers see a control panel that lets them edit code, deploy code, interact with the database, see logs, and so on. Your website is live from the first moment you start writing code. It's a powerful feeling to write normal code, see it run immediately, and know it will scale without further effort on your part. Now, will you be able to toss your Facebook app into the Heroku engine and immediately handle a deluge of 500 million hits a month? It will be interesting to see how far a generic scaling model can go without special tweaking by a certified scaling professional. Elastra has the same sort of issue.

    Underneath, Heroku makes sure all the software components work together in Lennon-McCartney style harmony. They take care of (or will take care of) starting and stopping VMs, deploying to those VMs, billing, load balancing, scaling, storage, upgrades, failover, etc. The dynamic nature of Ruby and the development and deployment infrastructure of Rails is what makes this type of hosting possible. You don't have to worry about builds. There's a great infrastructure for installing packages and plugins. And the big hard one of database upgrades is tackled with the new migrations feature.

    A major issue in the Rails world is versioning. Given the precambrian explosion of Rails tools, how does Heroku make sure all the various versions of everything work together? Heroku sees this as their big value add. They are in charge of making sure everything works together. We see a lot of companies on the web taking on the role of curator ([1], [2], [3]). A curator is a guardian or an overseer. Of curators Steve Rubel says: They acquire pieces that fit within the tone, direction and - above all - the purpose of the institution. They travel the corners of the world looking for "finds." Then, once located, clean them up and make sure they are presentable and offer the patron a high quality experience. That's the role Heroku will play for their deployable Rails environment.

    With great automated power comes great restrictions. And great opportunity. Curating has a cost for developers: flexibility. The database they support is Postgres. Out of luck if you want MySQL. Want a different Ruby version or Rails version? Not if they don't support it. Want memcache? You just can't add it yourself. One forum poster wanted, for example, to use the command line version of ImageMagick but was told it wasn't installed and to use RMagick instead. Not the end of the world. And this sort of curating has to be done to keep a happy and healthy environment running, but it is something to be aware of.

    The upside of curation is stuff will work. And we all know how hard it can be to get stuff to work. When I see an EC2 AMI that already has most of what I need my heart goes pitter patter over the headaches I'll save because someone already did the heavy curation for me. A lot of the value in a service like rPath, for example, is in curation. rPath helps you build images that work, that can be deployed automatically, and can be easily upgraded. It can take a big load off your shoulders.

    There's a lot of competition for Heroku. Mosso has a hosting system that can do much of what Heroku wants to do. It can automatically scale up at the webserver, data, and storage tiers. It supports a variety of frameworks, including Rails. And Mosso also says all you have to do is load and go.

    3Tera is another competitor. As one user said: It lets you visually (through a web UI) create "applications" based on "appliances". There is a standard portfolio of prebuilt applications (SugarCRM, etc.) and templates for LAMP, etc. So, we build our application by taking a firewall appliance, a CentOS appliance, a gateway, a MySQL appliance, gluing them together, customizing them, and then creating our own template. You can specify, down to the appliance level, the amount of CPU, memory, disk, and bandwidth each is assigned, which lets you scale up your capacity simply by tweaking values through the UI. We can now deploy our Rails/Java hosted offering for new customers in about 20 minutes on our grid. AppLogic has automatic failover so that if anything goes wrong, it redeploys your application to a new node in your grid and restarts it. It's not as cheap as EC2, but much more powerful. True, 3Tera won't help with your application directly, but most of the hard bits are handled.

    RightScale is another company that combines curation along with load balancing, scaling, failover, and system management.

    What differentiates Heroku is their web based IDE that allows you to focus solely on the application and ignore the details. Though now that they have a command line based interface as well, it's not as clear how they will differentiate themselves from other offerings.

    The hosting model has a possible downside if you want to do something other than straight web hosting. Let's say you want your system to insert commercials into podcasts. That sort of large scale batch logic doesn't cleanly fit into the hosting model. A separate service accessed via something like a REST interface needs to be created. Possibly double the work. Mosso suffers from this same concern. But maybe leaving the web front end to Heroku is exactly what you want to do. That would leave you to concentrate on the back end service without worrying about the web tier. That's a good approach too.

    Heroku is just getting started so everything isn't in place yet. They've been working on how to scale their own infrastructure. Next is working on scaling user applications beyond starting and stopping mongrels based on load. They aren't doing any vertical scaling of the database yet. They plan on memcaching reads, implementing read-only slaves via Slony, and using the automatic partitioning features built into Postgres 8.3. The idea is to start a little smaller with them now and grow as they grow. By the time you need to scale bigger they should have the infrastructure in place.

    One concern is that pricing isn't nailed down yet, but my gut says it will be fair. It's not clear how you will transfer an existing database over, especially from a non-Postgres database. And if you use the web IDE, I wonder how you will handle normal project stuff like continuous integration, upgrades, branching, release tracking, and bug tracking? Certainly a lot of work to do and a lot of details to work out, but I am sure it's nothing they can't handle.

    Related Articles

  • Heroku Rails Podcast
  • Heroku Open Source Plugins etc
    Tuesday
    Apr 21, 2009

    What CDN would you recommend?

Update 10: The Value of CDNs by Mike Axelrod of Google. Google implements a distributed content cache from within large ISPs. This allows them to serve content from the edge of the network and save bandwidth on the ISPs' backbones. Update 9: Just Jump: Start using Clouds and CDNs. Bob Buffone gives a really nice and practical tutorial of how to use CloudFront as your CDN. Update 8: Akamai's Services Become Affordable for Anyone! Blazing Web Site Performance by Distribution Cloud. Distribution Cloud starts at $150 per month for access to the best content distribution network in the world and the leader of Content Distribution Networks. Update 7: Where Amazon's Data Centers Are Located, Expanding the Cloud: Amazon CloudFront, and Why Amazon's CDN Offering Is No Threat To Akamai, Limelight or CDN Pricing. Amazon has launched their CDN with "low latency, high data transfer speeds, and no commitments." The perfect relationship for many. The majority of the locations are in North America, but some are in Europe and Asia. Update 6: Amazon Launching New Content Delivery Network: No Threat To Major CDNs, Yet. All the "Amazon will kill all other CDNs" talk is a bit overblown. As usual Dan Rayburn sets us straight: the offering won't support streaming, live broadcasting, or provide many of the other products and services that video content owners need... the real story here is that Amazon is going to offer a high performance method of distributing content with low latency and high data transfer rates. Update 5: When It Comes To Content Delivery Networks, What Is The "Edge"? Dan Rayburn is on edge about the misuse of the term edge: closest location to the user does not guarantee quality, often content is not delivered from the closest location, and all content is not replicated at every "edge" location. Lots of other essential information. Update 4: David Cancel runs a great test to see if you should be Using Amazon S3 as a CDN? Conclusion: "CacheFly performed the best but only slightly better than EdgeCast. The S3 option was the worst with the Nginx/DIY option performing just over 100 ms faster." Also take a look at Part 2 - Cacheability? Update 3: Mr. Rayburn takes A Detailed Look At Akamai's Application Delivery Product. They create a "bi-nodal overlay network" where users and servers are always within 5 to 10 milliseconds of each other. Your data center hosted app can't compete. The problem is that people (that is, me) can understand the data center model. I don't yet understand how applications as a CDN will work. Update 2: Dan Rayburn starts an interesting series of articles with Highlights Of My Day In Cambridge With Akamai. Akamai is moving strong into the application distribution business. That would make an interesting cloud alternative. Update: Streamingmedia links to new CDN DF Splash that specializes in instant-on TV-quality video streaming. A question was raised on the forum asking for a CDN recommendation. As usual there are no definitive answers, but here are three useful articles that may help your deliberations.

  • First, Tony Chang shows how to drive down response times using edge acceleration strategies.
  • Then Pingdom gives a nice overview and introduction to CDNs.
  • And last but not least, Dan Rayburn from StreamingMedia.com gives a master class in how much you should pay for your CDN, what you should be getting for your money, and how to find the right provider for your needs. Lots and lots of good stuff to learn, even if you didn't roll out of bed this morning pondering the deeper mysteries of content delivery networks and the Canadian dollar.

    Edge Acceleration Strategies: Akamai by Tony Chang

    The edge network is the "network physically closest to the end user and the 'origin' is where the application(s) is hosted." Tony talks about how you use CDNs to manage the user experience through meeting millisecond+ level SLAs using edge acceleration services. He does this in an interesting way. He follows a request through its life cycle and shows how to turn your caterpillar into a butterfly at each stage:
  • An edge DNS means a name server closest to the end user will serve the DNS request.
  • Static content is easily cached on the edge.
  • Dynamic content is moving to the edge using what Akamai calls Web Application Accelerators.
  • And something I've never heard of is to use your CDN to improve routing performance by up to 33%. The service bypasses BGP using its own more optimized route tables to decrease latency.

    Pingdom's A look at Content Delivery Networks, or “how to serve lots of content really fast”

    CDNs are the hidden powerhouse of the internet. The unsung mitochondria powering bits forward. Cost, convenience and performance are the reasons people turn to CDNs. A CDN does what you can't: it puts lots of servers in lots of different places. Panther Express, for example, puts 800 servers in 22 different geographical locations. Since CDNs sell delivery capacity, capacity planning is one of their big challenges. And Pingdom would like you to recognize the importance of monitoring for detecting and quickly reacting to problems :-) The future of CDNs lies in larger caches for HD video, better locality, and more integration with hosting providers.

    Video on Content Delivery Network Pricing, Costs for Outsourced Video Delivery by Dan Rayburn

    Also CDN Pricing Data: Average Cost Per GB Declines In Q4 Due To Startups. It's evident Dan really knows his stuff. His articles and presentations are highly educational for anyone interested in the complex and confusing CDN world. Dan sees hundreds of real-life customer-CDN vendor contracts a year and he reports on real prices averaged over all the contracts he has seen. One of the hardest things as a consumer is knowing what a good price is for your basket of goods and Dan gives you the edge, so to speak. What I learned:
  • You decide who is a CDN. There's no central agency setting a standard. Dan's minimal definition is a service delivering live video in the US and Europe.
  • CDN market has gone from 10 to 30 vendors. VCs are pumping hundreds of millions into the space.
  • CDN providers provide a wide variety of services: application caching, static caching, streaming video, progressive video, etc. Dan concentrates only on video delivery.
  • You can't say vendor A is better than vendor B. It depends on your needs.
  • When comparing vendors you need to do an "apples to apples" comparison. He really likes that phrase. You can't compare vendors, only like products between vendors.
  • Video serving is complex because there are few standards in the market. There are multiple platforms, multiple encoding standards, etc.
  • Customers don't buy on price alone. Delivery of bits over a network is a commodity. Buy on SLA, customer service, product, format, support, geographic reach, and performance.
  • There appears to be no way to compare vendors on the performance of their network. There are too many variables in play to make an accurate comparison. He's quite adamant about this. Performance could mean: SLA, customer service, upload content, buffering, etc. There's no way to measure network performance across networks. Static image performance is very different than streaming performance. People all over the globe are accessing your content, so what is the "performance" of that?
  • A trend this year is demand for P2P pricing and services.
  • To price your video delivery you need to answer four questions: 1) How many hours? 2) How many users? 3) How long will they watch? 4) What encoding and what bit rate? (A back-of-the-envelope sketch follows this list.)
  • Price varies on product bundle. Vendors need to specialize so they can move themselves out of the commodity market. If you would pay 8 cents a gig for delivered video your price will be different if you add application and static caching.
  • Contracts are at 12 months. Maybe 2 years when bundling services.
  • P2P is not necessarily cheaper so compare. Pricing is very confusing.
  • You can sometimes get a lower price by using the vendor's player.
  • Flash streaming is more expensive because of licensing fees. The number varies because each vendor cuts their own licensing deals. Could be 20% more, or it could be double, depends on volume.
  • When signing a vendor think about whether you need global reach or whether regional reach is sufficient. Use a regional service provider if you need a CDN just once in a while. It's a matter of picking based on your needs. You can often get a less expensive deal and get a quarterly commit versus a monthly commit.
  • Storage costs have really fallen. High of $10/gig and low of 10 cents per gig.
  • Most CDNs will pull from your origin storage and cache, which reduces your storage cost.
  • CDNs don't want to get paid with promises of ad sharing.
  • Pick a CDN vendor that will take the time to educate you. They should ask you about your business first, they shouldn't talk about themselves first. He mentions this point a few times and it makes a lot of sense.
  • Consider a dual vendor strategy where you pick one vendor for video and another for application.
  • Quality in the industry is very high. People rarely complain about the network. Customers want better support and reporting. Poor reporting is the #1 complaint. Run away if a vendor wants to charge for reporting. Lots of good stuff.
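    To show how those four pricing questions turn into a bill, here's a back-of-the-envelope sketch in Python. All of the numbers are invented for illustration except the roughly 8 cents per delivered GB figure mentioned above:

```python
# Back-of-the-envelope video delivery cost from the four pricing questions.
users_per_month = 5_000      # how many users
hours_each = 2.0             # how long they watch
avg_bitrate_kbps = 700       # encoding / bit rate
price_per_gb = 0.08          # dollars per delivered GB (illustrative)

total_hours = users_per_month * hours_each
gb_delivered = total_hours * 3600 * avg_bitrate_kbps * 1000 / 8 / 1e9
monthly_cost = gb_delivered * price_per_gb

print(f"{gb_delivered:,.0f} GB delivered, roughly ${monthly_cost:,.0f} per month")
# -> 3,150 GB delivered, roughly $252 per month
```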

    Related Articles

  • Highscalability CDN Tag Cloud
  • Edge Acceleration Strategies: Akamai by Tony Chang
  • A look at Content Delivery Networks, or “how to serve lots of content really fast”
  • Content Delivery Network Pricing, Costs for Outsourced Video Delivery
  • CDN Pricing Data: Average Cost Per GB Declines In Q4 Due To Startups
  • A Taxonomy and Survey of Content Delivery Networks
  • Content Delivery Networks (CDN) Research Directory


    Thursday
    Apr 16, 2009

    Serving 250M quotes/day at CNBC.com with aiCache

    As traffic to cnbc.com continued to grow, we found ourselves in an all-too-familiar situation where one feels that a BIG change in how things are done is in order: the status quo was a road to nowhere. The spending on hardware, the amount of space and power required to host additional servers, less-than-stellar response times, having to resort to frequent "micro"-caching and similar tricks to try to improve code performance - all of these were surfacing in plain sight, hard to ignore. While the code base could clearly be improved, limited dev resources and having to innovate to stay competitive always limit the ability to go about refactoring. So how can one address performance and other needs without a full-blown effort across the entire team? For us, the answer was aiCache - a Web caching and application acceleration product (aicache.com).

    The idea behind caching is simple - handle the requests before they ever hit your regular Apache<->JK<->Java<->Database response generation train (we're mostly a Java shop). Of course, it could be Apache-PHP-Database or some other backend system, with byte-code and/or DB-result-set caching. In our case we have many more caching sub-systems, aimed at speeding up access to stock and company-related information. Developing for such micro-caching and having to maintain systems with micro-caching sprinkled throughout is not an easy task. Nor is troubleshooting. But we digress...

    aiCache takes this basic idea of caching and front-ending the user traffic to your Web environment to a whole new level. I don't believe any of aiCache's features are revolutionary in nature; rather it is the sheer number of features it offers that seems to address our every imaginable need. We've also discovered that aiCache provides virtually unlimited performance, combined with incredible configuration flexibility and support for real-time reporting and alerting. In the interest of space, here are some quick facts about our experience with the product, in no particular order:

  • Runs on any Linux distro; our standard happens to be RedHat 5, 64-bit, on HP DL360 G5.
  • The responses are cached in RAM, not on disk. No disk IO, ever (well, outside of access and error logging, but even that is configurable). No latency for cached responses - stress tests show TTFB at 0 ms. Extremely low resource utilization - aiCache servers serving in excess of 2000 req/sec are reported to be 99% idle! Being not a trusting type, I verified the vendor's claim and stress tested these to about 25,000 req/sec per server - with load averages of about 2 (!).
  • We cache both GET and POST results, with query and parameter busting (selectively removing those semi-random parameters that complicate caching; a small sketch of the idea follows at the end of this post).
  • For user comments, we use response-driven expiration to refresh comment threads when a new comment is posted.
  • Had a chance to use the site-fallback feature (where aiCache serves cached responses and shields origin servers from any traffic) to expedite service recovery.
  • Used origin-server tagging a few times to get us out of code-deployment-gone-bad situations.
  • We average about 80% caching ratios across about 10 different sub-domains, with some as high as a 97% cache-hit ratio. We have already downsized a number of production Web farms; having offloaded so much traffic from the origin server infrastructure, we see much lower resource utilization across Web, DB and other backend systems.
  • Keynote reports significant improvement in response times - about 30%.
  • Everyone just loves real-time traffic reporting; this is a standard window on many a desktop now. You get to see req/sec, response time, number of good/bad origin servers, client and origin server connections, input and output BW and so on - all reported per cached sub-domain. Any of these can be alerted on.
  • We have wired up Nagios to read/chart some of aiCache's extensive statistics via SNMP; pretty much everything imaginable is available as an OID.
  • Their CLI interface is something I like a lot too: you see the inventory of responses, can write out any response, expire responses, report responses sorted by request, size, fill time, refreshes and so on, in real time; no log crunching is required. Some commands are cluster-aware, so you only execute them on one node and they are applied across the cluster.

    Again, the list above is a small sample of the product features that we use; there are many more that we use or are exploring. Their admin guide weighs in at 140 pages (!) - and it is all hard-core technical stuff that I happen to enjoy.

    Some details about our network setup: we use F5 load balancers and have configured the virtual IPs to have both aiCache servers and origin servers enabled at the same time. Using F5's VIP priority feature, we direct all of the traffic to the aiCache servers as long as at least one is available, but have the ability to automatically, or on demand, fail all of the traffic over to the origin servers. We also use a well known CDN to serve auxiliary content - JavaScript, CSS and imagery.

    I stumbled upon the product following a Wikipedia link, requested a trial download and was up and running in no time. It probably helped that I have experience with other caching products - going back to circa 2000, using Novell ICS. But it all mostly boils down to knowing what URLs can be cached and for how long. And lastly - when you want to stress test aiCache, make sure to hit it directly, right by the server's IP - otherwise you will most likely melt down one or more of your other network infrastructure components!

    A bit about myself: an EE major, I have been working with Internet infrastructures since 1992 - from an ISP in Russia (uucp over an MNP-5 2400b modem seemed blazing fast back then!) to designing and running infrastructures of some of the busier sites for CNBC and NBC - cnbc.com, NBC's Olympics website and others. Rashid Karimov, Platform, CNBC.com
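    A quick aside on "query and parameter busting": it just means normalizing the cache key by dropping the semi-random parameters before looking up a response. Here's a minimal sketch in Python; the parameter names are invented, and this is the general idea rather than aiCache's actual configuration:

```python
from urllib.parse import urlparse, urlencode, parse_qsl

# Parameters that vary per request but don't change the response; dropping
# them keeps one cache entry per logical page. Names are hypothetical.
IGNORED_PARAMS = {"sessionid", "rnd", "ts"}

def cache_key(url):
    parsed = urlparse(url)
    kept = sorted((k, v) for k, v in parse_qsl(parsed.query) if k not in IGNORED_PARAMS)
    return f"{parsed.path}?{urlencode(kept)}"

# Both requests map to the same cached response.
print(cache_key("/quote?symbol=GE&rnd=8231&sessionid=abc"))
print(cache_key("/quote?symbol=GE&rnd=9944&sessionid=xyz"))
```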


    Thursday
    Apr 16, 2009

    Paper: The End of an Architectural Era (It’s Time for a Complete Rewrite)

    Update 3: A Comparison of Approaches to Large-Scale Data Analysis: MapReduce vs. DBMS Benchmarks. Although the process to load data into and tune the execution of parallel DBMSs took much longer than the MR system, the observed performance of these DBMSs was strikingly better. Update 2: H-Store: A Next Generation OLTP DBMS is the project implementing the ideas in this paper: The goal of the H-Store project is to investigate how these architectural and application shifts affect the performance of OLTP databases, and to study what performance benefits would be possible with a complete redesign of OLTP systems in light of these trends. Our early results show that a simple prototype built from scratch using modern assumptions can outperform current commercial DBMS offerings by around a factor of 80 on OLTP workloads. Update: interesting related thread on Lambda the Ultimate.

    A really fascinating paper bolstering many of the anti-RDBMS threads that have popped up on the intertubes lately. The spirit of the paper is found in the following excerpt: In summary, the current RDBMSs were architected for the business data processing market in a time of different user interfaces and different hardware characteristics. Hence, they all include the following System R architectural features: * Disk oriented storage and indexing structures * Multithreading to hide latency * Locking-based concurrency control mechanisms * Log-based recovery Of course, there have been some extensions over the years, including support for compression, shared-disk architectures, bitmap indexes, support for user-defined data types and operators, etc. However, no system has had a complete redesign since its inception. This paper argues that the time has come for a complete rewrite.

    Of particular interest is the discussion of H-Store, which seems like a nice database for the data center. H-Store runs on a grid of computers. All objects are partitioned over the nodes of the grid. Like C-Store [SAB+05], the user can specify the level of K-safety that he wishes to have. At each site in the grid, rows of tables are placed contiguously in main memory, with conventional B-tree indexing. B-tree block size is tuned to the width of an L2 cache line on the machine being used. Although conventional B-trees can be beaten by cache conscious variations [RR99, RR00], we feel that this is an optimization to be performed only if indexing code ends up being a significant performance bottleneck. Every H-Store site is single threaded, and performs incoming SQL commands to completion, without interruption. Each site is decomposed into a number of logical sites, one for each available core. Each logical site is considered an independent physical site, with its own indexes and tuple storage. Main memory on the physical site is partitioned among the logical sites. In this way, every logical site has a dedicated CPU and is single threaded.

    The paper goes through how databases should be written with modern CPU, memory, and network resources. It's a fun and interesting read. Well worth your time.


    Monday
    Apr 13, 2009

    High Performance Web Pages – Real World Examples: Netflix Case Study

    This read will provide you with information about how Netflix deals with high load on their movie rental website.
    It was written by Bill Scott in the fall of 2008.

    Read or download the PDF file here

