Sunday, October 26, 2008

Should you use a SAN to scale your architecture? 

This is a question everyone must struggle with when building out their datacenter. Storage choices are always the ones I have the least confidence in. David Marks, in his blog You Can Change It Later!, asks the question "Should I get a SAN to scale my site architecture?" and answers no. A better solution, he argues, is to use commodity hardware, attach storage directly to servers, and partition data across servers for both scale and greater availability.

David's reasoning is interesting:

  • A SAN creates a SPOF (single point of failure), and fixing it depends on a vendor flying someone in when there's a problem. That can mean long downtime, and during the outage you have no access to your data at all.
  • Using easily available commodity hardware minimizes risk to your company; it's not just about saving money. Zooming over to Fry's to buy emergency equipment provides the kind of agility startups need in order to respond quickly to ever-changing situations.

It's hard to beat the power and flexibility (backups, easy-to-add storage, mirroring, etc.) of a good SAN, but David makes a good case.
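
To make the partitioning idea a little more concrete, here is a minimal Python sketch of spreading keys across commodity servers and keeping a second copy on a neighboring server. It is an illustration under assumptions, not David's actual setup: the server names, the key format, and the helper functions `shard_for`, `replicas_for`, and `readable_fraction` are all hypothetical.

```python
import hashlib


def shard_for(key: str, num_servers: int) -> int:
    """Map a key to a server index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def replicas_for(key: str, servers: list[str], copies: int = 2) -> list[str]:
    """Return the servers holding a key: its primary plus the next ones in order."""
    if not 1 <= copies <= len(servers):
        raise ValueError("copies must be between 1 and the number of servers")
    start = shard_for(key, len(servers))
    return [servers[(start + i) % len(servers)] for i in range(copies)]


def readable_fraction(keys, servers, failed, copies=2):
    """Fraction of keys that still have at least one replica on a live server."""
    live = sum(
        1 for key in keys
        if any(s not in failed for s in replicas_for(key, servers, copies))
    )
    return live / len(keys)


servers = ["db1", "db2", "db3", "db4"]      # hypothetical commodity hosts
keys = [f"user:{i}" for i in range(1000)]   # hypothetical record keys

# One server down, one copy per key: only the keys on that server are lost.
print(readable_fraction(keys, servers, failed={"db2"}, copies=1))

# One server down, two copies per key: every key still has a live replica.
print(readable_fraction(keys, servers, failed={"db2"}, copies=2))
```

With a single copy per key, losing one of four servers takes out roughly a quarter of the data rather than all of it; with two copies on distinct servers, a single failure leaves every key readable. Note that plain modulo hashing reshuffles most keys when a server is added, so a real deployment would likely want something like consistent hashing instead.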
Reader Comments (8)

    A SAN doesn't have to be a single point of failure, any more than a power supply in a server is. Get multiple head units, multiple HBAs and NICs. Fully multipath everything. If uptime is that critical you can cluster multiple SANs. Don't scrimp on support either; buy the support that matches the response time you need. 99% of the time commodity hardware isn't going to have the level of integration that you need - I don't see any homebrew solutions supported by VMware SRM.

    I think it's crappy advice; you buy the solution that fits your needs, your budget and your architecture. Building your own solution implies you have more technical staff onsite who can understand what you've done, and are capable of deciphering the configuration if you're on vacation when everything falls apart. I can call one of the major SAN vendors and get data deduplication, low-level integration with VMware VI3, Exchange, Oracle, MSSQL, etc., block-level replication, intelligent snapshotting, NFS, CIFS and iSCSI support. Or I can buy a cheap Tyan server and pack it full of $80 SATA drives, use samba, drbd, rsync and a handful of other projects and hope for the best.

    Both options will work, but to dumb the rationale behind purchasing a SAN down to it being a single point of failure is a mistake.

    December 31, 1999 | Unregistered Commenter Corey Gilmore

    Every architecture solution must exist within a context.

    A SAN is extremely fast and can scale up to high IOP volumes and very large amounts of storage. It's also extremely expensive.

    For small sites, a SAN may not be worth the cost. As that site scales up, there's a price/performance point where a SAN makes a lot of sense. (And, BTW, Corey nailed it... a SAN is only a SPOF if it's badly implemented.)

    As a site gets even larger, it might well outgrow the SAN, probably about the same time it outgrows a monolithic application and database architecture. With thousands of compute nodes and ten or more databases, a single SAN may no longer make sense.

    The right answer is almost always, "It depends." Should you use a SAN? It depends on your application architecture, database architecture, storage volume, IO rates, and foreseeable capacity needs.

    December 31, 1999 | Unregistered Commenter Michael Nygard

    Indeed, the SAN's days are numbered. Just look at the top 3 true web-scale companies: Google, Yahoo and Microsoft. None of them uses a SAN. Cloud computing centers don't and won't use SANs. For enterprises, the future is modular cloud appliances. They'll be as easy to manage as SANs, and can scale horizontally with much less effort and cost.

    December 31, 1999 | Unregistered Commenter vicaya

    If you really need a large capacity of scalable storage, go for storage servers such as the Sun Fire X4540 (http://www.sun.com/servers/x64/x4540/), which provides 48TB in 4RU for close to $1/GB, or build your own. Use distributed filesystems (e.g. Hadoop HDFS) or application-level load balancing.

    December 31, 1999 | Unregistered Commenter geekr

    Having worked in an enterprise SAN environment, I have to say I agree. Yes, our SAN environment is redundant, redundant and redundant, with everything offered by the vendors. Still, at least once a year we lose one of those super highly redundant boxes for at least 12 hours. In that discussion you have to talk about how much downtime you can afford when that single point of failure, no matter how redundant, goes down. Everything crashes. How much downtime can your environment/application that is dependent on that single point of failure accept?

    December 31, 1999 | Unregistered Commenter Ian K

    A SAN concentrates I/O into one box. It thereby creates a single point of failure, as well as a throughput bottleneck. Don't believe the vendors about the *magic* performance of their storage controller. A few years ago we benchmarked a brand-new mid-range SAN against a direct-attached storage (DAS) array that cost only a fraction as much per GB. The SAN had a few more disks, yet DAS clearly won all tests except random writes. There the SAN storage controller probably had more cache to put into the game ... because no other system was using it yet.
    On another occasion, high-end SAN controllers were updated "transparently", so no server should have noticed, thanks to redundant paths to redundant controllers, etc. The update was done by vendor staff. Half of the servers were in trouble afterwards, and the vendor denied it had anything to do with their update procedure. So the single point of failure was demonstrated as well.
    So no SAN for my company anymore.

    September 25, 2011 | Unregistered Commenter Martijn U

    Been doing storage for 12 years. A SAN is a single point of failure. There is no getting around that. You can have redundant controllers, you can have 20 controllers. Each pair or quad of controllers becomes a single point of failure as a whole. I have seen it again and again in my years administering SANs. I was at a customer recently who had dual controllers in a high-end SAN, and a hardware failure caused a takeover of a controller that had a failed CPU. It turned out that not all the volumes came online on the partner controller, and the customer was down for hours waiting on the vendor to replace hardware, as it was impossible to get the offline volume to come up on the partner node.

    If you really care about redundancy in your data, do not rely just on a SAN. Get multiple SANs or, better yet, use other storage technologies for a more distributed approach.

    October 9, 2013 | Unregistered Commenter LC

    If your SAN is set up in a way that leaves it vulnerable to a SPOF, then you've not configured it right. Do your homework, do your tests. The benefits of a SAN are far greater than those of attaching storage to a single server, as in the initial comment.

    I can't believe the utter rubbish David speaks, and I can only assume the year he was living in is a world apart from where we are today.

    April 9, 2014 | Unregistered Commenter Matt
