Thursday, March 24, 2011

Strategy: Disk Backup for Speed, Tape Backup to Save Your Bacon, Just Ask Google

In Stack Overflow Architecture Update - Now At 95 Million Page Views A Month, a commenter expressed surprise about Stack Overflow's backup strategy: 

Backup is to disk for fast retrieval and to tape for historical archiving.

The comment was:

Really? People still do this? I know some organizations invested a tremendous amount in automated, robotic tape backup, but seriously, a site founded in 2008 is backing up to tape?

The Case of the Missing Gmail Accounts

I admit that I was surprised by this strategy too. In this age of copying data to disk three times for safety, I also wondered whether tape backups were still necessary. Then, like in a movie, an event happened that made sense of everything: Google suffered the quintessential #firstworldproblem, Gmail accounts went missing! Cue emphatic music. And what's more, they were taking a long time to come back. There was a palpable fear in the land that the email accounts might never be restored. Think about that. They might never be restored...

The Hero: Tape Restoration

But then, still like in the movies, a miracle happened. Over a period of a few days the accounts were restored, and the hero this time was: tape. Tape? Yes, the email accounts were restored from tape. A quite unexpected plot twist.

The story was told in an official Google blog post, Gmail back soon for everyone:

I know what some of you are thinking: how could this happen if we have multiple copies of your data, in multiple data centers? Well, in some rare instances software bugs can affect several copies of the data. That’s what happened here. Some copies of mail were deleted, and we’ve been hard at work over the last 30 hours getting it back for the people affected by this issue. 

To protect your information from these unusual bugs, we also back it up to tape. Since the tapes are offline, they’re protected from such software bugs. But restoring data from them also takes longer than transferring your requests to another data center, which is why it’s taken us hours to get the email back instead of milliseconds. 

So what caused this problem? We released a storage software update that introduced the unexpected bug, which caused 0.02% of Gmail users to temporarily lose access to their email. When we discovered the problem, we immediately stopped the deployment of the new software and reverted to the old version.

The Moral: Disk Only is Risky

The moral of the story: storing all your replicas online is a risk. A single bug can wipe out everything.

In this case it was a software update that caused the problem. That is to be expected. More disasters have probably been caused by software updates than by any other cause. The reason: bugs that affect control are more powerful than bugs that affect data.

The Villain: Software Update Induced Amnesia

It wasn't that all three data copies went bad simultaneously. That is unlikely. But a software bug at the control plane layer of a system is not a low-probability event at all. Software updates on a live running system operate in an almost unimaginably complex world. The success path is clear and usually works flawlessly, but so many faults can happen in such unexpected ways that failure is common and can be devastating.

For an analogy, think of how DNA works. Change the DNA of a gene that helps build something and you've done damage, but that damage is usually contained: corruption in a single cell is typically caught by the immune system and the cell is destroyed. But when a mutation happens in the regulatory region of a gene, all hell can break loose. All the mechanisms that stop a cell from replicating, for example, can be disabled, and the result is cancer.

So it's the programs controlling the system that are the most powerfully dangerous areas, as Google's Gmail problem showed.

Handling it Better

While Google did a great job in having a tape backup and then in diligently working to restore accounts from it, communication with users could have gone better. Wtallis summed up the problem succinctly:

People who were affected had their entire Google accounts disabled, and upon trying to log in, they got exactly the same messages they would have gotten if Google had decided to delete the account for ToS abuse. Additionally, since the entire account (not just GMail) had to be disabled in order to repair things, stuff like shared google calendars went offline, so users whose accounts were not directly affected were getting misleading error messages, too. 

And the bounces didn't stop until the end of the working day on Monday on the east coast.

Some possible fixes:

  • Programs should give finer grained error messages rather than generic error messages.
  • Have finer grained account lockout so that services for an entire account aren't all down when just one service is down (see the sketch below).
  • More and better communication.
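
As a rough illustration of the first two fixes, here's a minimal sketch of per-service status checks. Everything in it (the service names, statuses, and messages) is a hypothetical assumption for illustration, not Google's actual design:

    # Hypothetical sketch: per-service status instead of whole-account lockout.
    # Service names, statuses, and messages are invented for illustration.
    SERVICE_STATUS = {
        "mail":     "restoring",   # affected by the bug; being restored from tape
        "calendar": "ok",          # unaffected services stay available
        "docs":     "ok",
    }

    MESSAGES = {
        "ok":        "Service is available.",
        "restoring": "Your data is being restored from backup. Nothing has been "
                     "lost; access will return shortly.",
        "disabled":  "This account was disabled for a Terms of Service violation.",
    }

    def service_message(service):
        """Return a specific, per-service message instead of a generic account error."""
        return MESSAGES[SERVICE_STATUS.get(service, "ok")]

    print(service_message("mail"))      # restoration notice, not a scary ToS page
    print(service_message("calendar"))  # unaffected service keeps working

The point is simply that the error a user sees should reflect the actual state of the affected service, rather than the same message shown for a banned account.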

The Happy Ending: Use Protection

With today's huge datasets, isn't backing up to tape impractical? Seth Weintraub, in Google goes to the tape to get lost emails, estimated 200K tapes would be needed to back up Gmail accounts. Wow. Others suggest the number is much lower in practice, thanks to techniques like compression, deduplication, and incremental backups, and the fact that most users use a small fraction of their quota. So tape backup is practical, even for very large datasets.
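
As a rough back-of-the-envelope check on why the practical number is smaller, here's a small sketch. Every figure in it (account count, average mailbox size, compression and deduplication ratios) is an illustrative assumption, not a real Google or Weintraub number; the 800 GB native LTO-4 capacity is taken from the comments below:

    # Back-of-envelope: how many tapes would a full Gmail-scale backup need?
    # All inputs are illustrative assumptions, not actual Google figures.
    accounts          = 200_000_000   # assumed number of accounts
    avg_mailbox_gb    = 0.5           # assumed average data actually stored, not quota
    compression_ratio = 2.0           # assumed compression on mail text
    dedup_factor      = 1.5           # assumed savings from deduplicated attachments
    tape_capacity_gb  = 800           # LTO-4 native capacity

    raw_pb    = accounts * avg_mailbox_gb / 1e6
    stored_pb = raw_pb / (compression_ratio * dedup_factor)
    tapes     = stored_pb * 1e6 / tape_capacity_gb

    print(f"Raw data:            {raw_pb:,.0f} PB")    # ~100 PB
    print(f"After comp. + dedup: {stored_pb:,.1f} PB") # ~33 PB
    print(f"Full-backup tapes:   {tapes:,.0f}")        # ~42,000 tapes

Incremental backups shrink the ongoing tape consumption further, since only changed data needs to be written after the initial full backup.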

Why Google's own "immune system" didn't detect this problem isn't clear, but since problems like this are to be expected, Google protected themselves against it by backing up to tape. If you have a really important dataset, then consider backing up to tape instead of relying only on disk. Google was glad they did.

Reader Comments (17)

Great article. I still hate tape, but ... sometimes staying old school is a great thing. Excellent summary.

March 24, 2011 | Unregistered Commenter Jamieson Becker

This seems less an argument about tape vs. disk and more about offline vs. online. Nothing is preventing you from keeping disks offline, so long as you have proper rotation and maintenance.

March 24, 2011 | Unregistered Commenter John Hugg

This doesn't strike me so much as a case in favour of tapes, as "RAID is not a backup strategy", which all good sysadmins should know already...

March 24, 2011 | Unregistered Commenter Shish

That's very true John. I imagine cost would be the deciding factor at scale. Latency for an offline system would be less of a concern.

March 24, 2011 | Registered Commenter HighScalability Team

Shish, that's a good way to look at it.

March 24, 2011 | Registered Commenter HighScalability Team

Is there any information about tape rotation strategy?
We know (Google told us) that GMail has been restored and users could connect, BUT... Did the users lose any email?

March 24, 2011 | Unregistered Commenter Antoine

@Shish - yes, but Google doesn't use RAID per se, but point is well taken. I had been under the impression that they would journal old files on disk rather than moving them offline. Tape is antiquated and much more expensive than disk (in terms of manpower as well as dollars per byte), but disk needs to have power continuously applied and consumes DC power/cooling/footprint. It must be that off-site storage plus the other costs of tape are lower than the costs of maintaining disk, or that someone just prefers to have that backup of last resort. I'd be curious as to how old the backups were from anyone who had to have their mailbox restored.

And this with FB wanting to reduce their # of photo copies in Haystack from 3 to 2... wonder if they're rethinking that.

March 24, 2011 | Unregistered Commenter Jamieson Becker

This seems a tad incredible to me: tape technology is expensive, and Gmail is for most users a free service, and part of the reason they built it is because they had thousands of servers with unused hard drive space: it was close to free infrastructure!

But backing up even smaller hard drives to tape is expensive. Multiply that times thousands and thousands . . . I find it hard to believe . . . can anyone verify this and maybe provide some cost estimates?

-danny

March 24, 2011 | Unregistered Commenter Daniel Howard

Tapes vs. disks are like Super 8 movies vs. DVD. We may be blessed with blazing fast disks and redundant systems, but until now, nothing lasts better than tapes. Do you remember any disk or DVD lasting three or more decades like tapes do? Well, I don't.

March 24, 2011 | Unregistered Commenter anavarre

I propose a new backup system that will combine the immediacy and low cost of disk with the isolation of tape.

The backup system is built using a standard disk-based computer and exposes two web-based APIs: save and get. By design, it does not have delete or update. Also, it does not have software updates.

To recycle its own disk space, the backup system deletes information once it has not been read for a predetermined amount of time. I suggest: one month.

March 24, 2011 | Unregistered Commenter yigal irani

@Yigal, Funny you should say that.

March 24, 2011 | Unregistered Commenter Jamieson Becker

I'm not quite sure why people think tapes are outdated. LTO-4 and -5 can store 800 (@120 MB/s) and 1500 GB (@140 MB/s) without compression, respectively. Oracle/StorageTek recently announced their T10000C drive which holds 5 TB and goes at 240 MB/s--both numbers uncompressed. All of these drives support wire-speed AES encryption, so if you lose your media you don't have to worry about HIPAA/privacy violations and headlines in newspapers.

As for offlining hard disks: that's a logistical nightmare when you have lots of units. It's okay for departmental-level stuff where you may have a few terabytes and FW/USB/eSATA connections, but not for data centre scale stuff. Tapes are inherently offline, and location management is built into most backup software.

I recently ran across a report on tape use in the US DoE's NERSC facility at Lawrence Berkeley National Laboratory:

When is tape a direct access storage device? And when is a very active tape archive really just another tier of “regular” storage? At the NERSC Center in California, tape is simply seen as extremely efficient, cost-effective, scalable—and reliable—storage. With over 13 PB of data on tape (growing at around 60% p.a.) and decades of history, NERSC has the facts to prove tape’s capabilities; so much so that it does not employ additional copies.

http://tinyurl.com/6jc84nx
http://www.enterprisestrategygroup.com/2010/12/nersc-proving-tape-as-cost-effective-and-reliable-primary-data-storage/

Tape certainly isn't the only game in town like it used to be, and going to disk is certainly a good option. And in fact going to disk before tape is usually necessary (at least staging-wise), as streaming straight from the client is quite impractical in most cases--either the network or the disk won't be able to keep up and you end up with shoe-shining scenarios.

Looking into the future, Oracle's roadmap has a T10000-D in 2013 doing 6-10TB (@270-400MB/s), and T10k-E in 2015 at 12-20TB capacity (@400-600MB/s):

http://www.theregister.co.uk/2010/08/11/oracle_tape_roadmap/
http://www.theregister.co.uk/2010/08/11/oracle_on_storage/page2.html

LTO also has several more generations planned out, though not as aggressive as StorageTek:

http://en.wikipedia.org/wiki/Linear_Tape-Open

While its status in the backup mix may be changing, tape is certainly not going away.

March 24, 2011 | Unregistered Commenter David Magda

Love the analogy of defects pushed to production and cancer.. classic!

March 24, 2011 | Unregistered Commenter Russell

I also don't understand why people think tape is outdated. Thankfully, Google didn't think that way. Our LTO-4, as mentioned by @David, is very fast. Also, you can write in parallel on some robot/drive setups, so it can be very, very fast. In fact, our 12-disk 10TB array has trouble keeping up with the tape drive, which is a real issue that causes the tape to rewind because it is moving so fast; it is called shoe-shining.

I honestly think the tape vendors sandbag the new tape technology to keep making it worth having over big disks. I think the tech is there to have it much faster, but they release it slowly to reduce R&D costs.

Also, tape gets cheaper the more of it you have, but the buy in is expensive (time and money) for smaller shops.

March 25, 2011 | Unregistered Commenter Scott McCarty

I have to agree that the primary distinction in this case is not between disk and tape but between online and offline.

And thinking that just because it is 'offline' it is safe is stupid. You are trading one type of risk for others. For example - are your tapes 'one use only' or do you buy a new tape for every N uses? Do you regularly VERIFY your tapes are restorable? Even with good tapes it is possible your tape drive isn't... Is your tape drive obsolete? If it dies will you still be able to read your tapes? Are your tapes stored onsite or offsite or both? Are your storage facilities fireproof? Flood proof? Earthquake resistant? Temperature and humidity controlled (mildew on tapes....).

Tape is not a magic wand. It is just a strategy. And depending on many factors, it can be a good one or a bad one.

Me - I use onsite + offsite over-network backups to RAID6-configured hard drives in locations 40 miles apart. I use multiple methods of backups where the same data is backed up at least two completely different ways in at least two different places and sometimes as many as four places. I'll probably be adding 'offline external hard drives' to the mix soon now that their capacities have reached 3TB.

March 27, 2011 | Unregistered Commenter Benjamin Franz

Agreed, and offline doesn't have to mean "offline". You could always treat disk like tape -- compressed, incremental backups in an off-site data center that are never deleted or are deleted only with a simple and auditable algorithm (ie cronjob) that is never touched by human hands... grandfather/father/son strategy or whatever is appropriate. One advantage over tape is that you can constantly be checking the health of your backups, remotely. You could even make it possible to only ever write files remotely, not delete, and only the cronjob can free up disk space. Distributing this across a lot of servers would be challenging, but no more challenging than the logistical gymnastics involved in managing databases of thousands of tapes.

March 27, 2011 | Unregistered Commenter Jamieson Becker

As several folks have already pointed out, Google's pain wasn't disk versus tape. It was data redundancy (DR) versus archiving. It doesn't really matter if you have multiple copies floating around 30+ globally distributed data centers if all those copies are BAD.

Many of us have learned the hard way that corruption often rears its ugly head long after-the-fact. Like when your CPA is closing out last year's books and discovers database corruption in February transactions. Oops!

As for the "my tape is faaaaaast!" comments... Tape ONLY performs well when data access is linearly sequential and near the native speed of the media. Push data faster or slower and the $*%&# unit performs a lively impression of R2-D2. Try validating data after a backup. Or try selective restores, with only some files located on different selections of the tape. Or try using restoring deduplicated data, where common blocks are stored at indexed points throughout the tape -- or worse, on various tapes in the library. Oh joy! That's the point at which you discover how bloody expensive cleaning cartridges are for your faaaaaaast tape library.

It took Google FOUR DAYS to restore data from tape. That's the object lesson here.

May 18, 2011 | Unregistered Commenter Dan Sydnes
