
What if your backup systems get compromised? What if your backup systems get destroyed by inclement weather? Not a backup then, obviously.

Only a backup is a backup, until it's not a backup anymore.



This is why, if your data is sufficiently important, you'll want to:

1) Test your backups, to detect when your backups are no longer backups.

2) Make geographically diverse backups, so a single tidal wave can't wipe out your data. For bonus points, have enough geographically diverse backups that the world is probably ending if they're all being wiped out -- at which point you have bigger problems to take care of.

3) Make backups with a diverse set of mechanisms, so the failure or compromise of one (or even N-1) of them can't destroy or compromise every backup copy. Making backups on write-once media and hiding them means a current failure or compromise can't reach previous backups, and may help protect your data against theft, landlords, angry neighbors, spurned girlfriends, or even the occasional corrupt government official.
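As a rough illustration of point 1, one way to test a backup is to record a checksum for every file when the backup is made and compare against it later. A minimal Python sketch, assuming backups are plain directory trees; the names `build_manifest` and `find_damaged` are made up for this example:

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damaged(manifest: dict[str, str], backup_root: Path) -> list[str]:
    """List files that are missing from the backup or whose contents changed."""
    damaged = []
    for rel, expected in manifest.items():
        candidate = backup_root / rel
        if not candidate.is_file() or file_digest(candidate) != expected:
            damaged.append(rel)
    return damaged
```

This only shows that a copy still matches what was written; it says nothing about whether the source was good in the first place, and an actual test restore is still the stronger check.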

Mirroring (be it software or RAID) is not a backup system: It is far too dumb, far too happy to overwrite your old good data with new bad data. You want a history, where old good data is not replaced.

Git is not a backup system: It is a version control system. While it may share some of the properties of a backup system, backing up data is not its primary use case. As a result, we get articles like this one, which shows how git can fail to meet the goals of a backup system in practice, even when it is deliberately used as a poor man's backup system in the form of mirrors.

Such problems are not unique to git, of course. On a personal note, I've managed to wipe data with both git and perforce in moments of weakness. If you want to treat me kindly about it, you could say I used both to the point where the statistics were against me not shooting myself in the foot. And, fortunately, so far the use of proper, separate backup mechanisms has always allowed me to restore the majority of my data and left me relatively unscathed.


We're already doing #1, 2, and 3 on the list you provided, just so you know (although we missed out on some areas for #1 in retrospect).


That's kind of ridiculous. The point that most people are making is that if someone does something incredibly stupid, or there is corruption in the system that propagates down the line (like what happened here), it doesn't matter whether you have a repository.

A backup clearly would have helped here.


A backup of a corrupt repository would have been just as corrupt though.

This is the big thing people don't seem to be understanding, and I can't figure out why. git does consistency checking for you already, tar|rsync|etc. don't, so it makes sense to take advantage of that.

What we had was an instance of some of the underlying data becoming corrupt on the filesystem (with indications of that starting on Feb 22!). The big mistake was considering the source repositories as consistent and canonical at the remote anongit end, but the data would have been just as corrupt if we had scp'ed the repos from git.kde.org to the anongit mirrors around the world, since we would have bypassed git's internal checking in that way.

Is it safe to rsync a running mysql database at random times, or are you supposed to use mysql-provided tools to perform a backup?
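One explicit way to run that kind of consistency check is `git fsck`. A minimal Python sketch of gating a copy on it, assuming `git` is on the PATH; the helper name `repo_is_consistent` and the choice of `--full` are made for this example and are not a description of the actual KDE setup:

```python
import subprocess
from pathlib import Path


def repo_is_consistent(repo: Path) -> bool:
    """Return True if `git fsck` exits cleanly for the repository at `repo`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "fsck", "--full"],
        capture_output=True,  # keep fsck's notices out of the caller's output
        text=True,
    )
    # A non-zero exit means fsck reported errors or could not open the repo.
    return result.returncode == 0
```

A sync job could call this on the source repository and refuse to push anything downstream when it returns False, instead of treating the source as canonical no matter what.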


OK, but what stops them from daily performing a mirror clone, checking it for consistency, then backing that up? As mentioned in the linked update, 30 complete backups would consume only 900GB, so you could keep weeks of daily backups, plus weekly and/or monthlies going back much further, for a terabyte of space. That way, in the worst case, you could go back to a backup before the corruption began. Obviously you would want to have plenty of safeguards in place so that that never happened, but just in case, it's good to have an honest to goodness backup too.
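The "dailies, then weeklies, then monthlies" idea can be sketched as a small retention rule. This is a rough Python sketch; the function name `backups_to_keep` and the default windows (21 days of dailies, 8 weeks of weeklies) are assumptions picked for illustration, not numbers from the article:

```python
from datetime import date, timedelta


def backups_to_keep(dates, today, daily_days=21, weekly_weeks=8):
    """Pick which daily backups to retain.

    Keeps every backup from the last `daily_days` days, the oldest backup of
    each ISO week for `weekly_weeks` weeks before that, and the oldest backup
    of each calendar month for anything older still.
    """
    daily_cutoff = today - timedelta(days=daily_days)
    weekly_cutoff = daily_cutoff - timedelta(weeks=weekly_weeks)
    keep = set()
    seen_weeks = set()
    seen_months = set()
    for d in sorted(dates):  # oldest first, so the first hit per bucket wins
        if d > daily_cutoff:
            keep.add(d)
        elif d > weekly_cutoff:
            week = tuple(d.isocalendar()[:2])  # (ISO year, ISO week number)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(d)
        else:
            month = (d.year, d.month)
            if month not in seen_months:
                seen_months.add(month)
                keep.add(d)
    return keep
```

At the roughly 30GB per backup implied by the 900GB figure (900 / 30), the storage needed is just `len(keep) * 30` GB, so it is easy to check a given set of windows against a terabyte budget.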



