The 3-2-1 Backup Rule – Why Your Data Will Always Survive (2019) (vmwareblog.org)
184 points by Apocryphon on Nov 24, 2021 | 200 comments


The different media types are probably the hardest issue; I'd wager that for most people, backing up > 2 TB on anything but spinning rust will be impractical/prohibitively expensive.

A slight twist of this: I now have data old enough that accessing it with modern computers is starting to become a challenge. Thankfully I migrated my once-enormous collection of 50,000 MB on tape to just be files on my file server, but I'm worried about the longevity of optical media, and now I nervously glance at my collection of even older media...


And the most important part of the rule which they forgot: Test your backups.

I once had to restore a DB from backups just to discover that the backups from 2 days ago were corrupted. Fortunately the backup from t-3 days worked. And I had to do some binlog mongering to recover the rest of the data.

Test your backups, people. They WILL fail.


From someone smarter than me: people think they need backups while in fact they need restores…


That’s one of my preso jokes. “Backups are a worthless waste of time! Restores, on the other hand…”



In the late 1990s, W. Curtis Preston wrote a book called “Unix Backup and Recovery”. A later version was just named “Backup and Recovery”. Both were published by O’Reilly.

He was saying similar things at that time (e.g., “Backups are worthless, Restores are priceless”), and I’m certain he wasn’t the first.

But you’d have to ask him who was saying things like that before he did.

Disclaimer: Curtis was a co-worker of mine at the time, and I was a technical reviewer of the book in question.


Beautifully succinct.


The article slightly annoyed me by saying, “Why one does backups? To have a copy of data at hand if something goes wrong.” Not, I think, the best way to put it, per your comment.


How do you go about testing your backups? Like if you have 1TB of data, testing that all of it is valid is not trivial, and I guess that should be done every time you create a new backup set? Shouldn't the backup tool somehow perform this test automatically?


> testing that all of it is valid is not trivial

If it was easy then people wouldn't have to be reminded to do it—it'd just get done. :)

One way could be to pre-compute checksums of the files in question and then verify them periodically:

* https://aide.github.io

* https://packages.debian.org/search?keywords=aide
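
A minimal sketch of that idea in Python (the paths and manifest format here are just placeholders; tools like AIDE do this far more robustly):

  import hashlib, json, os, sys

  def sha256(path):
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(1 << 20), b""):
              h.update(chunk)
      return h.hexdigest()

  def build_manifest(root):
      return {p: sha256(p)
              for d, _, files in os.walk(root)
              for p in (os.path.join(d, name) for name in files)}

  if sys.argv[1] == "build":    # checksums.py build /data > manifest.json
      print(json.dumps(build_manifest(sys.argv[2]), indent=2))
  else:                         # checksums.py verify < manifest.json
      for path, digest in json.load(sys.stdin).items():
          if not os.path.exists(path) or sha256(path) != digest:
              print("MISSING or MISMATCH:", path)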

It's also why people suggest using ZFS: it has checksums on the live data to make sure bits aren't flipped, but those checksums can then be replicated to remote copies (zfs send/recv) to make sure your copies are also coherent. A service like rsync.net allows remote ZFS data sets, which can also be encrypted.


Being 100% confident you've got a bit-for-bit copy of what you think you have doesn't mean you have a restore process that works.

Unless you've tested and proven to yourself that you can bring up a working system from your backups, you've only done the first half of your disaster recovery work.

For me, that means restoring onto your cold spare hardware (either identical or similar enough for you); shutting down your system, pulling its drives, replacing them with blank ones, and restoring onto those (but be careful to make sure you don't need that specific hardware/firmware/peripherals, because in 18 months' time when someone backs a truck into your office and loads all your electronics into it, you might not be able to get an identical system); or restoring a running system onto newly provisioned instances on your cloud provider of choice.

Prove to yourself, at least, that you can have business continuity in a known amount of time once you pull the pin on your DR/restore-from-backups plan.


Yup, this is it. And you don't need to do it for every single backup. Before running the backup, do four coin flips. If they all end up heads (adjust the number of flips based on how much time you can afford to spend on backup testing), you test the full recovery process of the impending backup.

Of course, it will depend on several parameters, but in my experience, doing a thorough test for a random subset of items is often more economical than a half-assed test of all items.

As a bonus, the random sampling will let you infer things about the totality of all items. (As opposed to any other scheme for selecting which items to test.) So once you've run 27 tests and only one failed, you can be fairly confident at least 85% of your backups work. At 1/20th of the cost of testing them all, this is a good deal on information.
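
If you want to check that arithmetic, here's a rough sketch assuming a simple binomial model, reading the 85% figure as a one-sided 95% lower confidence bound (Clopper-Pearson):

  from scipy.stats import beta

  n, failures = 27, 1
  successes = n - failures
  # One-sided 95% lower Clopper-Pearson bound on the true success rate:
  # the 5th percentile of Beta(successes, failures + 1).
  lower = beta.ppf(0.05, successes, failures + 1)
  print(f"success rate >= {lower:.0%} with 95% confidence")  # ~85%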


There is no alternative to performing an actual restore periodically in order to verify that backups and the restore procedure do work.


>should be done every time you create a new backup set

That could be prohibitively expensive. For personal use, each time I buy a new computer (desktop / notebook alternating) I use it as an opportunity to test my restore procedures. I restore my most recent backup to the new computer, then verify that it operates the same, and has all the same data stores, as my next newest computer (the one whose backup I restored).


Not just test the backup, but practice and ensure you are familiar with the recovery process as well. Having a clear plan of what you will do if that disk fails/machine is lost/etc. is better than troubleshooting and fumbling with tools while the clock is ticking. A written procedure may be helpful as well (I have one for my Proxmox server for setting up the machine and restoring the VMs from the backup disk).


People often confuse the "different media" part of the rule, and this blog post makes it murky as well

>> The thing is, while keeping data on the same storage media, you may lose them due to the same hardware issues. In other words, you may lose two copies in the same accident. That’s why you should always combine media.

For "same media" the key part of this is storing the data on two DIFFERENT DEVICES of different types, they both could be hard drives, but they need to be in isolated systems of different types (say a Windows System and a Synology NAS, or a FreeNAS Storage Appliance and a Linux Server, etc)

The rule came about because people would use the same tape library, or have multiple copies on the same SAN (oftentimes they would have a SAN cluster of 2 or more devices that act as 1, and because they had 2 "devices" they felt they were protected).

Disparate media types (HDD, Tape, DVD, etc) may have some advantages, but as long as you are putting 2 copies on, say, your Home Desktop and a NAS, you have satisfied the two-media requirement even if both are using hard drives.

That said, today the most common way for a person to fulfill the offsite and separate-media requirements is to use a Cloud Backup of some kind.


> the most common way for a person to fulfill the offsite and separate-media requirements is to use a Cloud Backup of some kind.

Which has a different failure mode to disks/NAS/fileservers.

AWS/Dropbox/Google/BackBlaze et al. are all at risk from credit card and/or financial failures. I can go broke and be unable to pay my bills, but my hard drives will still store my data.

Google and AWS (to a lesser extent) are also occasionally subject to arbitrary and capricious account revocation. Google specifically worries me there; there are way too many stories of people having problems with, say, their Google Play developer account or their YouTube account, and finding they're cut off from their Gmail and Google Drive access, with pretty much zero way of fixing it unless you have a million Twitter followers or a Google insider to go to bat for you.


Sure, but that is where the 2nd copy comes into play in the 3-2-1.

3 Copies, 2 Media, 1 off site.

One should always have a local copy of their backups.


A couple decades ago, I used to keep backups of my desktop computer on DAT tapes. I had several different tape cassettes that I rotated through to keep many separate backups of all my data, and I thought of them as different devices until the day when I ejected one of the DAT tapes from the drive and it came out trailing a couple meters of unwound tape, and with a small (but apparently quite important) spring dangling off the end of the loop.

Lesson learned! I now still had several intact tape backups of my data, but no functional device that could read and restore from them.


Oh, I forgot to add that I would _never_ trust flash memory (e.g. SSDs) to keep data for more than ~ a year without being powered on. It's absolutely a terrible archival format. (Powered on with scrubbing, you'd at least know when it's starting to degrade.)


> The different media types are probably the hardest issue; I'd wager that for most people, backing up > 2 TB on anything but spinning rust will be impractical/prohibitively expensive.

There's LTO (Linear Tape Open): LTO-8 has 12 TB of raw capacity and costs about 65 - 70 €. The newest variant (not yet _that_ widely available), LTO-9, has 18 TB.

You need a few to handle rotation, but for most that means at most 10, plus one new per year for the long-term archive.

We provide tape support in our open source Proxmox Backup Server product; it can handle single drives and tape robots (with auto-changers that lessen the work of tape rotation), and the data is deduplicated, compressed, and optionally encrypted.

https://pbs.proxmox.com/docs/tape-backup.html

PBS can also efficiently mirror to remotes:

https://pbs.proxmox.com/docs/managing-remotes.html

Check the introduction/main feature section for more info if you're interested: https://pbs.proxmox.com/docs/introduction.html


The cost of the tapes isn't what keeps me from having it home. It's the cost of the tape drive.


I mean, it's a one-time investment that may be good for a decade or so, depending on the setup and workloads.

But yes, forking over 2 to 4k for a new LTO-7+ changer isn't a small investment for private use; for a company it's IMO a no-brainer though.

FWIW: You can often get them cheaper, e.g., on eBay or whatever your local online reseller space is. For example, one can get LTO-5 ones here for 300 - 400 bucks and LTO-6 for 600 to 800. While LTO-6 only has 2.5 TB of space, you can also create tape sets (e.g., spanning multiple tapes), and for the more important data it may even be enough.


Ages ago there was also the recommendation that you should have two drives, so if one gets out of alignment you'll notice when it starts to create tapes that proper drives can't read anymore. Not sure if that's still a concern with modern ones?


> Not sure if that's still a concern with modern ones?

No, that's not really a concern currently, and I actually never heard that in combination with LTO - which is relatively modern.


Good to know. Quite possibly folklore that got transferred over to LTO.


>The different media types are probably the hardest issue

That's probably overrated and you shouldn't feel bad about not implementing it. The justification for it is:

>you may lose them due to the same hardware issues

Buying different hard drive models/manufacturers/batches achieves the same thing without the risk of having data stored on dinosaur media.


I agree with that with one important caveat.

I wouldn't trust multi terabyte drives in raid 5 sets to not suffer cascading failures during a rebuild after a single drive failure/replacement.

I'd consider two file servers or NAS boxes with the same type/brand/model of drives in raid 5, to be "the same media type" in the context of that advice.


>A slight twist of this: I now have data old enough that accessing it with modern computers is starting to become a challenge.

I have that problem!

I have files in my Documents folder going back to the 90s and Classic macOS!


> I have files in my Documents folder going back to the 90s and Classic macOS!

Same. And while I can be sure I have them safely stored, I know for sure I have files I can no longer use because I don't have a way of running the applications needed to work with them. I must have _dozens_ of gigabytes' worth of copies of Zip disks and SyQuest cartridges full of Macromedia Flash projects and Mac OS 9 FileMaker Pro databases.


Why not set up an emulated virtual Mac? It's not very hard these days, and the performance is not a concern with modern devices.


My oldest files are from 1978.


My personal interpretation/variation of this is to use two different backup formats, even if everything ultimately ends up on hard drives (whether my own or a cloud service provider's). I usually will do 1 file backup via Duplicacy (but Restic/Borg/Duplicity/Time Machine work just as well) and 1 system image backup via Macrium Reflect/Carbon Copy Cloner. The file backup gets copied to Backblaze B2, the image backup stays local.


There are Blu-ray Discs specifically engineered for long-term archiving that have BER guarantees (anomalous bits read per GB of data stored per year archived, or something).


Any more info on this like a brand name?


It used to be "M-DISC", with the 'M' meant to mean "millennium": as in, they'd last 1000 years.

I somehow thought the company making the M-DISCs was dead, but apparently Verbatim (and others) still sell BluRay discs labelled as "M-DISC", and they're compatible with many BluRay readers/writers.


You can still buy M-DISC branded discs - either blu-ray or DVD. I think it matters more for DVD than for blu-ray because blu-ray discs use a more reliable recording medium anyway. People say there's little if any difference between M-DISC BDXL and 'normal' BDXL. You can get 100GB or 128GB BDXL discs.


There are two different Blu-ray recording technologies, LTH and HTL. HTL is the original and uses more expensive inorganic materials (germanium, bismuth, or palladium); then there's the cheaper LTH, which came later and uses an organic dye that's more susceptible to light, breaks down rather quickly over time, and has more burning/reading issues. You don't want to use the latter.

Typically if the surface of the disc is gold and darkens with writing, it's an LTH disc.


They include external USB drives and NAS as different types; specifically, about the NAS:

> can be considered statistically independent too since it is connected over the network and may survive if something bad happens to a part of your infrastructure.

So my interpretation is: e.g. a HDD connected over SATA, an external HDD over USB and a HDD on NAS are all considered different media types even though ultimately they are all based on spinning hard drives.


Because of the different interfaces to said drives? (USB vs. Network vs. SATA)


Yes, that seems to be the point. NAS is over the network; USB is supposedly "cold storage" (unless you keep it connected 24/7). Imagine something like: all drives in your desktop PC are compromised due to a PSU malfunction that causes a current or voltage spike - in that case, both the NAS and the USB cold storage would be safe.


> The different media types are probably the hardest issue;

For my home stuff, I'm comfortable enough with having a backup on RAID 1 mirrored spinning rust on my media server, plus a copy of that backup (also on spinning rust) on an external USB hard drive. The chances of simultaneous corruption of those, even though they're both spinning rust, seem "low enough" for me.

(I also have another RAID 1 pair of spinning rust that powers up once a week very early Monday morning and rsyncs the media server backup directory, then powers back down when it's done, to mitigate against fat-finger-fuckups and/or home network p0wnage. Somebody who p0wns something inside my network might find the cronjob or shell scripts that turn the wifi power point those drives use on and off, but at least I'm down to an hour or so a week where that copy might get cryptolockered...)


> I'd wager that for most people, backing up > 2 TB on anything but spinning rust will be impractical/prohibitively expensive.

You'd lose that bet. 2 TB on LTO Ultrium tape costs under $10 (sometimes under $5 depending on volume of tapes ordered).


With the slight downside that you won't be able to acquire a drive for less than $800.

LTO's combination of cheap media and expensive drives is great for people with rooms full of tapes, but it makes it pretty unattractive for everything else.


Depends on your requirements. Something lowest common denominator like LTO2 is still useful, cheap, reliable and the drives are also cheap and reliable.

You can always feed your LTO2 tape into a LTOx drive and get the data back.

Obviously you have to be fairly patient, but for my use case (once a week on my third tier of backup) it's good and gives peace of mind that I can put in a fireproof safe.

Primary backups are still on the cloud though - S3 Glacier is a good option.


By LTOx, you mean at most LTO4, right? I thought LTO drives could read at most 2 generations prior. Also, you need to get to LTO5 to get useful capacity (~1.5TB/tape); LTO2 is limited to 200GB + whatever you can get with compression.


I've been keeping an eye on tapes for literally a decade or more, but for my personal backups, hard drives keep getting enough bigger every year that I'm comfortable with (for now) a pair of 8TB USB drives with secondary local backups, which I swap every Monday between plugged in at home and powered down in my locked desk drawer at work. I've got a while to go yet before I'm worried about capacity there, and 12TB drives are already available, and will easily be in the right budget soft spot for me when I need them.


And how much is the tape drive? For _most_ people (not data centers) that's prohibitively expensive. A 2 TB hard disk is $48 from Newegg (and I'm sure you can find cheaper).


Lots of people here focusing on the cost of the drive without factoring in:

- For home use, it is likely a once-or-twice-in-a-lifetime purchase (try saying that about any other kind of media)

- You don't have to buy a brand new drive of the latest generation at sticker price; used last gen(s) gear from the enterprise works just as well.

- Perhaps most importantly, what's your data worth? I don't know about you guys, but I've got photos and documents that are irreplaceable.

Tapes beat all other storage media on $/GB, reliability over time, and arguably durability, all of which are the most important factors for offline backup.

$500 comes out in the wash over a decade or two. That's the kind of time scales we're talking about. Yes, it's not cheap and easy consumer electronics you can buy off the shelf at Walmart, but it's not unreasonable either.


Can you recommend specific tape drives and/or generations? I have never considered it but you make a good point. Maybe this is the time create a backup of my irreplaceable Google photos collection.


Really, it comes down to your budget. LTO drives don't do anything interestingly different across manufacturers, and I've seen no evidence of significant reliability differences either. So, just buy the biggest generation that makes sense for your data storage needs.

Two important things to note:

1. LTO5 and onwards have a feature called LTFS, which allows you to address the tape as an ordinary file system. Before that, you were limited to purpose-built tape tools.

2. Generally, tapes can be read in drives 2 generations newer, and written in drives 1 generation newer. This rule was broken with LTO8, due to new tape composition. Starting here, it's 1 generation newer read/write.


For "purpose built tape tools", tar and dd are fine. tar is a bit easier to restore. If you are running Windows, well, don't run Windows as Microsoft have removed all the built-in tape stuff they used to supply with Windows Server and all the alternatives either cost money or are hideously complicated.

Debian + old HPE server + tape drive + tar. Job done.
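
If you'd rather script it, Python's stdlib can stream tar to the tape device too. A minimal sketch, assuming the conventional Linux non-rewinding device node (/dev/nst0) and an example /data path:

  import tarfile

  # "w|" writes an uncompressed tar stream with no seeking, which suits tape
  with tarfile.open("/dev/nst0", mode="w|") as tar:
      tar.add("/data", arcname="data")

  # restore later: tarfile.open("/dev/nst0", mode="r|").extractall("/restore")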


That's hilarious. I did buy a tape drive. Heck, I bought a tape library. Guess what? It's useless today because my data consumption has scaled with technology and that old DAT based solution just can't cut it. Any tape drive bought today will be garbage in N years. Just a simple fact.


With all due respect, your data usage needs are yours and can't be extrapolated to everyone else's. A few 320GB DAT cartridges may be enough to back up the essentials.


How does one choose a tape drive? I see wild variation on tape drives and no idea on why that is.

If the devices are not too large or ridiculously expensive, then it definitely makes sense to use tape as backup. Hard disks are convenient but fail at inopportune times.


And the drives are expensive.... Especially if you want something that handles fairly modern tapes for actual useful capacity.


And how much for the tape machine?


The person above was talking about the cost of HDDs; they didn't include the cost of a machine or NAS to run them in, so why should I?

Seems like a double standard wherein people are going to pretend housing multiple HDDs costs $0 (unrealistic) but won't even evaluate spending money on a tape drive.

You know what doesn't get cryptolockered? Yesterday's tape sitting on the shelf.


Because we're talking about most people, and most people already have a machine into which they can plug a HDD. Often that's the very machine with the documents they want to backup.


I think "plugs into SATA or USB" is a reasonable standard for a backup storage medium.


Whereas I think that isn't a backup at all, because it is plugged into the same machine (i.e. online backup) and therefore going to get crypto-lockered.


A hard drive hooked up via eSATA or USB is directly comparable to an LTO tape.


So is a tape left in the drive.

If you can move a tape to a shelf, you can unplug a USB cable.

And to your question about having a separate machine, a Pi will be fine.


Neither SATA nor USB implies permanence.


Most people do not have a $1000 LTO tape drive.


You can pick up a second-hand HPE deskside server with a tape drive included for $200-$300. Might only be LTO2 but that's still 400 GB of rock-solid storage for no money.


If you cannot fit the backup on a single tape then you've increased the headache and the number of ways this can fail. I cannot fathom how you can see this as superior to hard drives (which will probably end up cheaper too).


People often forget the 3-2-1 rule should also apply to the secrets you use to encrypt/store your backups. If you can't decrypt it, it isn't a backup.

I use a crazy long passphrase to encrypt my backups, but should I forget - it is also printed on archival paper inside a sealed envelope in a friend's safe deposit box (I also have a copy of his backup passphrase for mutually assured destruction :)).

Also, every once in a while run a fire drill and actually restore something from each of your backups. This is when you find out the rsync job has been stuck for the last 80 days.


Yeah the fire drill to make sure everything works is crazy important.

I just wish that storing things like keys on paper was easier.

At CoreOS we put some keys on printed QR codes and scanned them with an airgapped laptop every 90 days to confirm the keys were safe.


> Yeah the fire drill

I agree. But I've never been allowed to run a fire drill. Rebuilding a network from bare tin is obviously expensive, but not as expensive as losing the business.

And then there's the sheer stress of being responsible for the backups, but not being able to test bare-metal recovery.

I'm interested in backup ("what kind of weirdo is this!"), but that wasn't a fun responsibility.


> But I've never been allowed to run a fire drill. Rebuilding a network from bare tin is obviously expensive, but not as expensive as losing the business.

Sometimes your job is not to be able to take the backups, but to take the blame when a restore can't be done. If you think that might be you, time to polish up that resume...

Especially for cloud-based businesses, refusing to let you spend ~8 hours of production platform costs to run a full platform rebuild fire drill is insane; that's about 0.1% of your annual prod AWS budget, and 0.5% of your (or your team's) annual time.


We weren't cloud-based; "bare-metal restore" would have meant acquiring and housing new hardware. And we'd have needed test-scripts and so on - more than 8 hours, more like a few days to do it repeatably (but once that was done, it could have been automated).

The boss owned the company; I think it was shortsighted of him. Being able to blame me wouldn't have done him much good, once his firm had gone down the tubes, and I'd moved on. But it was his business to lose, and he knew what his margins were - I didn't.

He didn't like me much, mainly because he was a control freak, and I knew his systems much better than he did - I had the control that he craved. Control freaks shouldn't hire people that know more than they do.


You could use Shamir Secret Sharing to split the secret into N unrelated strings and then recover it with any K of them (e.g. generate 5 shares and require 2 to recover). That way you can hand out some of those shares but nobody will have the target password.
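
For illustration only, a toy sketch of the math behind it (a random polynomial over a prime field; don't hand-roll this for real secrets, use an audited tool like ssss):

  import random

  PRIME = 2**127 - 1  # a Mersenne prime, big enough for a 16-byte secret

  def split(secret, n, k):
      """Split secret into n shares; any k of them recover it."""
      rng = random.SystemRandom()
      coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
      f = lambda x: sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
      return [(x, f(x)) for x in range(1, n + 1)]

  def recover(shares):
      """Lagrange interpolation at x = 0 over GF(PRIME)."""
      secret = 0
      for xi, yi in shares:
          num = den = 1
          for xj, _ in shares:
              if xj != xi:
                  num = num * -xj % PRIME
                  den = den * (xi - xj) % PRIME
          secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
      return secret

  shares = split(123456789, n=5, k=2)
  assert recover(shares[:2]) == 123456789  # any 2 of the 5 suffice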


If I am in a situation where my brain doesn't work well enough to remember a backup passphrase, it isn't going to work well enough to do silly hacker shit.

There are very clear directions to my friend for when it is acceptable to access my stuff, and very clear consequences for malice.


You can also write very clear instructions on the recovery messages. I plan on doing this very thing for my disk decryption passphrase and password database. It's a standard package available in the base repos, and I'll have step by step instructions on how to recover it. If they can't follow the instructions, then they probably shouldn't be trying to handle my affairs in the event of an emergency, either.


> I also have a copy of his backup passphrase for mutually assured destruction

I have two friends who've given me private key fragments (from a Shamir's Secret Sharing setup) with the explanation "If anything happens to me, you'll work out who you need to talk to and what you'll need to do."

I haven't done that myself, because I don't have any need/desire for anybody to decrypt my backups once I'm not around. That might change if I end up with dependants one day.


For your last point: That is the reason I integrated https://healthchecks.io/ into my jobs.
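
The integration is basically just hitting a ping URL when the job finishes. A sketch (the check UUID and the rsync arguments are placeholders):

  import subprocess, requests

  PING_URL = "https://hc-ping.com/your-check-uuid"  # placeholder UUID

  result = subprocess.run(["rsync", "-a", "/data/", "backup-host:/data/"])
  if result.returncode == 0:
      requests.get(PING_URL, timeout=10)            # success: resets the dead-man timer
  else:
      requests.get(PING_URL + "/fail", timeout=10)  # failure: raises an alert immediately

The useful part is that a job that hangs and never pings at all also raises an alert once the check's grace period expires - exactly the stuck-rsync failure mode.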


Fire drill would still be a good practice, though: during the fire drill you can find out your backup is corrupted, or is incomplete (you are not backing up everything you thought you were), or inaccessible (you encrypted the backup, but where are the keys...).


He never said fire drill was not important?


> This is when you find out the rsync job has been stuck for the last 80 days.

Fire drills are important, but I get a Telegram notification every day when my rsync jobs complete. The notification is muted, but I see it in the list every morning when I use Telegram to text my girlfriend. If it wasn't there, I'd immediately know something wasn't working.


A full test is absolutely needed - including any system or DB restores - and the system needs to be fully functionally tested.

A quick backup verification is not enough - I learnt this the hard way and lost 9 months of data, discovering it only after I did the full annual DR test. The backups were set up and configured by the (very well known) manufacturer of the backup software. They screwed up on the DNS name.


> I use a crazy long passphrase to encrypt my backups, but should I forget - it is also printed on archival paper inside a sealed envelope in a friend's safe deposit box (I also have a copy of his backup passphrase for mutually assured destruction :)).

Why not just remember the password and perform a regular fire drill decryption to ensure you won't ever forget it?


For backups that I can safeguard physically (as opposed to cloud), I just copy files uncompressed and unencrypted using FreeFileSync and Syncback. Restoring now is as easy as media swap.


I prefer the "1 rule" for some data... the password is only in my mind, so when I die, it's hopefully gone forever :)


I have data like that... however there is some data that needs to survive me for my family / estate. Proper planning for that is something everyone should do.

I have been involved in or aware of a few nightmares where an expected loss resulted in all kinds of pain and problems for the family, as they lost access to critical accounts, financial data, and other things, requiring lengthy interactions with government agencies, banks, etc. that would not have been needed if the data had been recoverable from the systems.


While I have some passwords that I don't think I'd ever forget, I'm not sure I'd want to bet on it, given that I'm remembering more than one password.


You think you have his password...


To be honest I wouldn't have mentioned that on a public forum, but it's kind of you to have done so.


There’s another backup rule that is largely skipped (especially in personal backup space):

“Do not backup everything!”

- repeatedly clean your data

- and discard redundancy and garbage before adding to backup set

- otherwise you’ll just to be creating a backup set that’ll turn into a pile of “mostly” garbage soon

- and it’ll just be GBs and TBs of “too much” - essentially useless - and costing more as well.

Keep your storage footprint in check.


This is very true, and de-duplication can be a great answer for this.

Here's the readout from my last Borg backup, which says that I'm storing 2.55 TB worth of data in 59 GB:

  ------------------------------------------------------------------------------
                         Original size      Compressed size    Deduplicated size
  This archive:              220.28 GB             66.75 GB            591.11 MB
  All archives:                2.55 TB            777.85 GB             59.44 GB
  
                         Unique chunks         Total chunks
  Chunk index:                  234389              4706300
  ------------------------------------------------------------------------------


Why do you store so much redundant data? Aren’t the additional TBs in the ‘Original size’ duplicates? Or does dedup add some kind of magic compression sauce also?


Most people I know make "full backups". I've argued with them repeatedly about what would be critical to lose, and what would merely be inconvenient.

I.e. I back up my photo collection religiously. I keep all my photos in the cloud, and have a machine synchronizing photos locally. That machine then makes 2 backups, one local, and one to another cloud. The same applies to documents, mostly because they're highly compressible and don't take up much space.

When it comes to media backups, I honestly don't care if my iTunes library got wiped out. Most of it has been purchased, so it can (hopefully) be downloaded again, and the rest has been ripped from CDs that still reside somewhere in my attic. So inconvenient to lose, but not exactly critical.

I also don't make "full computer backups". If something breaks I can just as easily reinstall the computer/applications and restore my documents/photos.

As for photos, I also burn identical M-DISC BDXL media every so often, containing the photos taken since the last archive date, and store the copies in different locations. They're low cost, low maintenance, and while not "spinning rust" cheap, they're still within $12/100GB, and unlike spinning rust and cloud, it's a one-time cost.


Absolutely. Disk image backups feel completely pointless to me — why would I back up my entire system when I can reinstall 90% of it from a live USB? It's only my documents that I care about (including photos, etc). My restic job runs in seconds.


True. I thought a copy on write system would solve the “too much” issue. It made it so much worse.


If you're too lazy to do this manually, Restic will handle it for you!


As far as I can see, this scheme gives you one backup copied to three places, on two different media.

What if the source for the backup was already corrupt or broken in some way? If you only have one backup, then your backup is corrupt too.

I was taught grandfather-father-son back in the 80s; still three levels of backup, but they're different generations. That fitted the kinds of media available then, but it doesn't really map to modern equipment. I've struggled to work out a backup scheme that is equally adaptable to the needs of a small business, a home network or an individual.

Ironically, it's hardest for the individual; a modern business is finished if it loses all its data. For an individual (or even a hobby network), total data-loss is painful, but not usually an existential risk. So it's harder to justify keeping everything in triplicate.


Reminds me of a story here on HN, when they had plenty of backups, all encrypted with an in-memory key which was accidentally never saved to disk, even on their primary server.


Doesn't the use of snapshots (and their backup) help with this? I am assuming here that you mean that the source got corrupted some time after it was created (and anyway, if that is not the case, it is an issue unrelated to backup, in my opinion).

A proper backup tool should help you keep several versions of your data without using a proportional amount of space, by using some form of deduplication. I use borg backup for my backup, and I can go back to any day in the past three years and get an old copy of any document (as long as I saved it on my disk for more than one day, since I do daily backups)

You can also setup "append only" backups, if you are worried that somebody may willingly try to destroy your old copies


I use Apple's TimeMachine backups as my sources, and BackInTime on the Linux boxes I care about, so I'm backing up an archive which preserves previous versions of files. I push 12 rolling monthly archival copies of them into AWS Glacier, and keep one of those as an annual archive. That's costing me around $200/year in Glacier costs. I've only ever done a restore once from there as a test, from memory it cost me ~$30 in retrieval charges.


For source integrity there are other methods available. Checksums, parity, RAID. I don't see this as a backup problem.


Those are checks of binary integrity. They can't confirm that data hasn't been deliberately or accidentally replaced, added or deleted.

I would like to have a grandfather, a week old; a father, a day old; and a son, being last night's backup. All on different media, and ideally not connected to the source machine.


This is not a backup problem.


It absolutely is a backup problem.

Ransomware is the #1 thing an enterprise will use a backup to recover from; the second most common is accidental deletion.

Binary corruption is very very very very very rare, and with all the systems in place to prevent it, it is not something I even think about anymore. I worry about ransomware, and users doing stupid things.


I think the parent is suggesting you've moved out of "backup" territory, into archive and/or disaster recovery territory.

I agree with your problem description and approach. I'm less sure I agree with your terminology choice.

I use a software RAID 1 pair of external USB drives on a wifi power point to take a once-a-week snapshot of my backups, and the drives are only powered up for ~1 hour a week. Not guaranteed protection against ransomware, or me fat fingering a "sudo rm -rf / tmp", but "good enough" for my home/personal stuff. I also have monthly and yearly snapshots of my backups in AWS Glacier. I sometimes colloquially refer to all that as "backups", but I can see your parent poster's point that I've gone way beyond what "backups" covers in this paragraph.


I don't view backups and DR as wholly distinct things.

DR is a set of policies and procedures that cover backups. They are inextricably linked.

You use backups to perform a disaster recovery.

I personally don't believe you can talk about backups without including DR, nor can you talk about DR without talking about backups.


In my head at least, backups are a subset of disaster recovery.

They are linked, but you can do backups without doing disaster recovery (for example, just turning on TimeMachine backups to a usb drive on your Mac is “backups” without being “disaster recovery”), but you can’t do disaster recovery without backups. But you also need archives, and retention policies, and recovery procedures and test plans, and training and practice for the people responsible for DR, and hardware/site/network disaster recovery plans and resources, and recovery time objectives and recovery point objectives, and a whole bunch of other “not directly backup” related stuff.

But I admit that colloquially that might all be assumed in certain contexts to be “backups”, but that’s a probably dangerous assumption unless everybody in that discussion is totally on the same page about just how much of that related disaster recovery stuff is actually in place and ready.


It's not. You can backup ransomwared files all the same.


Which is why you have immutable backups, and you do not simply overwrite everything. This is not a new concept, or a new thing. To believe ransomware protection is not a function of proper backups is just ignorant. For anyone working in Enterprise IT, managing backups for any company, that is the #1 thing we talk about; that is the number 1 thing we are focused on in our backup strategy. We spend significant resources on it.


That doesn't matter. If you create a new backup, then your ransomwared files will be in that backup. Immutable just means that after the creation, the backup can't be modified anymore.

So, how do you detect if you have ransomwared data? My take is that a backup can't help with that. It can help with restoring obviously.


My backup drive would fill up and alert me since it's btrfs formatted and all the encrypted data would be new data not in an existing snapshot.


I never really liked the 3-2-1 rule because it feels too specific: while it works, simpler solutions also provide the same level of reliability.

I think about backups in terms of blast radius. 1) The local machine has the working copy of data and a local backup as permitted by free space. The smallest blast radius where I lose data is "my laptop hard drive fails". 2) My external drive has another backup. The new blast radius is "my house burns down". 3) I maintain a cloud backup. The new blast radius is "a catastrophe on a global scale".

Any two of these backups can fail and the data is still salvageable.


> while it works, simpler solutions also provide the same level of reliability.

It exists and is important because many backup strategies are broken and people don't realize it.

For example, your own strategy treats a PC's local storage and an external drive as distinct backups, when in reality you've only evaluated hardware failures when formulating it, and not malicious actors. 3-2-1, in particular the media-type + off-site thing, tries to "trick" you into having a backup which isn't accessible from the same system that it is backing up (i.e. offline backups).

Your backup strategy has been used almost verbatim by multiple institutions who got crypto-locked. The external drive was hit and then the cloud backup service happily synced the now-encrypted files. They went from "backed up in three places" to backed up in zero places, and are now calling the cloud provider hoping that their backups had unencrypted copies in them.

3-2-1 isn't simple, but it is good, and that's what it tries to be.


In my DIY backup setups, I'm relying on rsync.net's snapshots [0] and also the fact that the AWS S3 bucket I'm automatically copying stuff to has 'versioning' enabled. [1]

Is that good enough in your point of view?

Thanks

[0] https://www.rsync.net/resources/howto/snapshots.html (not affiliated, just a happy customer)

[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versio... (ditto)


For automated ransomware this works.

For targeted - it often doesn't, because often your AWS keys are on the system doing the backup and have permissions to delete items etc.

Of course, this is why S3 allows you to set an object lock rule (e.g., 30 days is plenty) that means even you (or your computer) can't go and delete those online backups.
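
For reference, a sketch of setting that default rule with boto3 (the bucket name is hypothetical, and object lock must have been enabled when the bucket was created):

  import boto3

  s3 = boto3.client("s3")

  # Default 30-day COMPLIANCE retention: object versions cannot be deleted or
  # overwritten until the retention period expires, not even by the root account.
  s3.put_object_lock_configuration(
      Bucket="my-backup-bucket",  # hypothetical
      ObjectLockConfiguration={
          "ObjectLockEnabled": "Enabled",
          "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 30}},
      },
  )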


"For targeted - it often doesn't because often your AWS keys are on the system doing the back up and have permissions to delete items etc."

Just a reminder - snapshots at rsync.net are immutable.[1]

Even if Mallory gains access to all of your access credentials, she cannot delete/change the snapshots in your rsync.net account.

[1] https://twitter.com/rsyncnet/status/1453044746213990405


I have a second internal drive that gets an rsync of my first drive. The first drive is xfs currently, the second drive is btrfs. After the rsync, the drive gets a snapshot, then it's unmounted.

For cloud, I use a time4vps storage server. I rsync some data and use borg for more sensitive stuff.


That's great to hear about the object lock rule, thanks a lot. I'll add that to my S3 setups.


Also, if this were for a company I'd suggest testing the backups at least once during the object lock duration: e.g. if the object lock is set to 30 days, you have to test them at least once (e.g. on day 15).

That's because a targeted attack with ransomware could gain access to your servers and wait 30 days, silently encrypting your backups, until the 30th day, when the attackers could just complete the attack, encrypting the rest of the files and showing the message.

So my recommendation for extremely critical data would be: 1) Test the whole data set thoroughly at least once during the object lock period. 2) Set up an automatic task that retrieves X random items every day (or at the longest interval you can afford to lose) and performs checks with checksums or other methods. If something is corrupt and/or encrypted you will realize it before it is too late.
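
Point 2 can be a short script; a sketch (bucket name, manifest path, and sample size are all placeholders):

  import hashlib, json, random
  import boto3

  s3 = boto3.client("s3")
  BUCKET = "my-backup-bucket"  # placeholder

  # manifest.json maps object keys to known-good sha256 digests, written at backup time
  manifest = json.load(open("manifest.json"))

  for key in random.sample(sorted(manifest), k=min(20, len(manifest))):
      body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
      if hashlib.sha256(body).hexdigest() != manifest[key]:
          raise SystemExit(f"corrupt or tampered object: {key}")
  print("sample OK")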


Thanks a lot! I appreciate the insight


Backup versioning is the important distinction here. A backup on its own is only good against loss, but a backup with multiple snapshots in varying states in place protects against the threat of crypto-locked cloud synchronization.

Of course if the provider is locked then it's a moot point.


Some backup versioning schemes are disabled on purpose by some crypto-malware.


What you're saying is that a critical piece of the 3-2-1 rule is a piece that isn't actually prescribed by the 3-2-1 rule: that you have "a backup which isn't accessible from the same system that it is backing up (i.e. offline backups)." Another reason that I don't think this "rule of thumb" as useful as it purports to be.

Presumably the rule was invented before ransomware was a thing, so perhaps it gets a pass for not anticipating versions, but yeah: the rule of thumb for the modern world probably includes something about backup versions.


It should be known as the "3-2-1 rule with knobs on" for that precise reason, not to mention several others. However, you need to start somewhere with a pithy pitch.

Veeam, for example (other backup systems are available), has several "immutable offerings" whereby you get backups that can't be deleted for a while. These are not appliances but configuration recipes. One of them is a generic Linux box with XFS with reflinks, a particular set of file perms, and what they call a one-off credential (a sort of app password, I think), forward incrementals, and a few other things. Get it all in line and then, even if a baddy gets in, they can't delete your backups for a while. This does require detection inside, say, a week. The reflinks thing is really useful for storage space and things like "synthetic fulls", where you roll up incrementals into the last full backup.

You can also use AWS etc object storage for immutable. In all of these things you can lengthen the timescales available by applying more cash.

For me: tape in a safe is the only decent way to protect against ransomware. Even then some tape hygiene is needed!


IMO it's a stupid rule, but it's not for us, it's for people who don't even have this.

I think there's an element that is as important as keeping a backup safe against a state-wide wildfire and that's AUTOMATION. If it isn't automated then chances are that your backup is very old once you finally need it.


Also need monitoring then!


After the data center fire in France last year, where a lot of people lost their cloud data, including a friend of mine, I also make sure that I've got multiple cloud backups in distributed geographical locations. I recommend looking into Storj DCS.


> simpler solutions also provide the same level of reliability.

Far simpler solutions exist that don't provide anything like that reliability.

Things are more nuanced these days than when this "rule" was first formulated, but I suspect it's still true that the vast majority of people would be far better off following it than whatever they are doing now. Doubly true for personal use.


Yes, this is exactly my point. I think we need a "2.0" version of this rule that better encapsulates what a good backup strategy is.


You missed a radius there, with neighborhood or town/city level disasters.

I only use cloud backups (Glacier and Google Workspace), I gave up on off-site drives as they end up too far away to be easily/consistently updated, or close enough they are in the same disaster area I am in (Earthquake zone).

Houston flooding is another good example.


Am I missing something or is your system actually 3-3-1? I.e. more comprehensive than the 3-2-1.


Do you count "laptop hard drive", "external hard drive", and "cloud hard drive" as different media types? What if the laptop happens to be SSD but the external drive is magnetic? What if the cloud drive is SSD or magnetic? That "2" in 3-2-1 just doesn't make sense in today's world.


I think actually physically going through your disaster recovery plan (whatever that would be, 321 or not) reveals how good/bad your backup plan is for you.

Personally, I downsized. I made peace with myself that I can live without the gobsmacking amount of data I have if it were to be lost tomorrow. I pared down to a small set of data (less than 1GB) that is critical to me. That data is synced to various devices with syncthing (even my cellphone!), and then I use restic to back it up to two different cloud storage providers. When I'm bored I do an independent, standalone export from cloud storage.


I did the same with two tiers- a first tier of the data I would need for a true disaster (a few GB), and a second tier of data that is meaningful and difficult to replace (small enough to fit on a flash drive). Everything else is dust in the wind.


Same here.

Since I "restore" this backup pretty frequently just for day to day living (ie, doing taxes) I'm also pretty sure it's accessible.

I do pay for versioning for the online sync of this, and I do a period S3 object lock copy (30 days). For me, that's good enough.


Oh bother, it's down :( (edit: it might no longer be)

Archive links:

https://web.archive.org/web/20211001064106/https://www.vmwar...

https://archive.md/1jHmP


TL;DR 3-2-1:

3 backups

2 different media types

1 off-site


Looks like they have backups :)


I learned this the hard way a few years ago.

Having several drives is not enough. I used to keep my important data replicated on 3 drives, from different brands, different capacities, 1 internal, 2 external.

One day the internal drive failed; the next day one of the external drives also failed. So... I panicked, shut everything down and bought a brand new hard drive (1TB). While copying from the third drive, it also stopped working. So, I had a triple drive failure. I managed to recover most of my data by freezing the external drives and copying from them (until they heated up and I had to freeze them again).


What are the odds? Did you maybe bring some object with some radioactive material inside?


This may or may not be the case here, but generally speaking…

Pretty high for any parity-based multi-disk system. The remaining disks get stress-tested when a disk fails and you need to copy everything over to the replacement. It's why RAID5 is no longer sufficient with today's disk sizes, and why a RAID10 (which can lose any one disk, plus a second disk as long as it isn't the first one's mirror) is actually real-world safer than RAID6 (which can lose _any_ two disks).


Although with RAID10, the disk that absolutely can't fail is the one under all the restore stress. And likely to be the same age as the failed disk.

There is also RAIDZ3 for those who want more safety, which is ZFS triple parity.


Who knows? This happened a decade ago. Maybe it was an electrical issue... But that would fry the drive's controller board instead of making the hard drives click?


In 2005 on a Saturday evening, I lost every bit of digital data I had, in spite of following (and even exceeding) this rule. I had:

- One live copy of data on my laptop

- One copy on external hard disk, updated ~daily, on-site (home)

- One copy on external hard disk, different brand and age, updated ~bi-weekly, off-site (work)

- Immutable copies on optical media, persisted ~once a month, on-site

Data on the laptop was lost due to operational error: I fat-fingered a command and destroyed the partition table and part of the leading data on disk. The disk being reasonably fast, this ate a lot of structurally critical data quickly. Recovering the filesystem would be really hard, but I had a two-day-old backup, so I didn't think much of it.

Now, to the local backup. I booted up a live CD, rebuilt a partition layout, plugged the disk in, and started restoring data. Reboot, and it seemed to work, mostly, but some things that should have worked were not working. I immediately jumped to look at the recovered data; it was severely corrupted. I diffed some files and compared them to the "originals" (i.e. from the backup), and they were identical: data on the backup disk was hopelessly mangled even though the hardware was fine. A cursory analysis seemed to highlight a software bug (filesystem code? drive firmware? whatever - the issue had some logical consistency to it that made it obvious it was unrecoverable, which was all I needed to know at this point). And I had just restored it over what remained of perfectly valid - if hard-to-reach - data, essentially scrubbing the laptop disk. Sweat was starting to build up.

Okay, optical copies were next, even if older. Surely this would get my heart rate down. I put the disk in, closed the tray, and heard the sound of a rattling helicopter. I stored the disk in a closet which I thought would be safe, but it turns out the hot water pipes for the flat above were running behind a thin wall, which built up enough heat over time inside the closet to slowly warp the disk. Well, one of them, because I was paranoid enough to have three disks for a 3-month rotation; but while the other disks were geometrically fine (maybe due to being a different brand), they were stored for a longer amount of time and their data suffered much bitrot. This was going to be a long Sunday.

Back at work on Monday, the final disk emitted an ominous clicking noise right away. Shortly after, it snapped, never to power on again. I could maybe have recovered data straight from the platters if I sent it to some firm for a hefty pile of cash, of which I had none at hand, neither at that time nor in the foreseeable future.

So, in order I experienced: an operational error, a logical error, an environmental issue, and a hardware fault. Luck had it that I had a second computer temporarily lent to me, which I had toyed with and where some of my most recent work files from a week before turned out to be lying, so I could resume putting food on the table quickly. No amount of hackery was able to restore any meaningful data, so I lost about 10 years of digital photos and older work archives.

Psychologically it was fairly interesting, because I thought I would be enraged at myself for multiple reasons, but the perspective of such ridiculous odds of this happening turned the whole thing into a very contemplative experience shortly after.


This evokes feelings!

I've never had a total catastrophe, but I have had a chain of independently-unlikely faults that combined together to create a once-in-a-lifetime disaster.

This has happened to me many times during my life.


Are there guidelines for how long a given type of media is considered stable?

From experience, hard drives seem safe on the order of years; I've spun drives back up from the early 2000s and they are fully intact. The lifespan of burned optical media is/was counted in minutes. Various flash memory is somewhere in between - I've had all kinds of cheap flash drives die.

Even with the multiple media forms they need to be refreshed at some frequency, and I don't know how often that should be.


We’ve come a long way since CD-R; I said this earlier but I guess it’s worth copy-and-pasting here:

> There are Blu-ray Discs specifically engineered for long-term archiving that have BER guarantees (anomalous bits read per GB of data stored per year archived, or something).

Now these are obviously simulated numbers (the tech isn’t even old enough to test) but it’s a start: it means people are at least considering the right questions.

I wouldn’t write sensitive to a Blu-ray directly unless it was the kind of data where a bit-flip is not a huge deal (eg a backup of users’ profile images where there are many small files, a bit flip affects the content but doesn’t compromise the overall data, errors aren’t cascading, etc). There’s already bit error correction baked into the analog <-> digital transition layer but it’s not great - but fortunately efficient bit error correction at the file level has been a thing since before binaries on Usenet. Stick some PAR2 files on the Blu-ray or even serve your backup as a Blu-ray Disc plus a DVD-R stuffed to the brim with PAR2 data for the former (I prefer the first approach).


I see a lot of results on google for "data storage lifespan"

Trying to find an authoritative source (loc.gov, archive.org, etc.), I found this, which is not a full answer, but gets into interesting details: "Table 2 - the relative stability of optical disc formats" [0] -- from >100 years down to ... 5-10!

[0] https://www.canada.ca/en/conservation-institute/services/con...


This is from 2019, not that the advice ever goes out of date.

It's crazy how the seemingly easiest most basic security/backup advice is so easy to give, and so hard to actually do. 3-2-1, so easy to teach and remember! In reality, at any kind of scale, not so easy to do.

I am constantly reminded just how hard every aspect of security really is to do. Even for the little/basic stuff.


The difficulty of 3-2-1 increases as your data size goes up.

For me, the only sane thing to do is partitioning.

My first group would be data that if I were to lose it would cause great pain. I keep the size of this group as small as possible. A few hundred megs or less. You have a live copy, a backup on a USB thumb stick or drive, and a copy you email someone or snail mail a USB drive. It's simple to deal with. It has to be, because it's critical.

My second group is data that is important but not a serious threat to me. Photos and videos, mostly. This second group is where the headache starts and logistics, cost, and time become an issue. Offsite backup is either running to the bank deposit box (time consuming) or uploading to the cloud (also time consuming, and expensive). Containing the bit rot becomes a futile exercise, especially considering most people aren't running ECC RAM, and end-to-end ZFS with redundancy (for recovery) requires significant expertise and time. Parchive files are the best bet for most people.

Finally, my third group. I have a NAS with a simple mirrored ZFS setup. Two huge drives. I'll probably add a 3rd drive for added redundancy. There is no backup. This data I don't care that much about. I'd hate to lose it. But I'd hate backing it up much more. I don't live for my data, my data lives for me.

You have to ruthlessly prune data that you care to keep. Just like the burden of owning a boat or an overly large house, there is a burden to too much data you care about. The mistake a lot of people make is treating all data the same. Then they end up with terabytes of data of unequal importance and get sloppy protecting the tiny amount of data that truly matters.


If lizards can voluntarily make their tail fall off to escape predators, you can survive losing your music backups!


For local documents I find it way easier than cloud stuff.

1. Local disk

2. Time machine ssd

3. Backblaze backup agent

For cloud stuff the services make it so hard. I have been working, off and on, on a service to back up Google Photos to an SD card and then mail it to folks. And the amount of limitations, rate limiting, and random errors out of Google's API can be frustrating.


And you can basically just as easily back up to two Time Machine drives for a little extra peace of mind. Not sure it makes sense to use an SSD for that application though.


Try Google Takeout.


It isn't automated. And in the case of the thing I am building I can't help other people because you can't delegate the creation of a takeout either.


> you can't delegate the creation of a takeout either.

Not through Google's APIs. But you can, through adversarial interoperability.


Sorry?


Funny Story.

My first service order, way back when, was to an engineering office who were in a panic because their primary CAD file server had failed right in the middle of a deadline submittal.

I got there when they had just pulled out an identical file server from some other department. They had 'everything' backed up nightly onto a Zip drive. Nice, I thought. So, I plug in the Zip drive and can see all these image files created by their backup software. I ask for the backup app install files and they say it's kept in a directory -- on the failed server !!!

I should mention this was not in the USA, and it resulted in a 6+ hour international 42kbps zmodem download session from some random BBS server, grabbing our best guess at the right software product and version.

I still have one of the failed HDDs from that server as a paper weight on my desk.

PS. We got it working in time and the obvious moral of the story is to always test your backup systems (and that goes for both failover and fallback)


Every couple of weeks, I or a colleague wander into our bijou data centre. We turn the power off. The UPSs beep for 10 seconds and the genny fires up. A couple of SMSs land on several mobiles and a Teams channel gets notified. Icinga2 notes the power is up but provided by the genny.

I particularly like to do this when showing someone around. You witter on about power and then you swiftly turn the key on the distribution panel and it makes a satisfying clunk. The status lights switch around and the UPSs start beeping. Then they stop. There is a barely perceptible hum from the genny in the boiler room, three walls (one of them a firewall in the traditional sense) away. It is extremely satisfying to do and customers appreciate it - they have all jumped a bit when they first see it. The UPSs obviously (monitored and fortnightly tested) have enough runtime to switch back if something fails on the genny.

At a second location there is just enough IT gear to run my company for a while. Replication via Veeam - 'nuff said.

Every couple of weeks, Veeam fires up a small part of my company's IT from backups/replicas at the second location. It runs some scripts which test some functionality and send out status reports via email.

I still worry about business continuity. There is way more to it than just backups. However, get backups sorted out first. Nine months back a customer had a fire on the production floor. They are still running out of our place, 50 miles away. We had set up offsite backups to our place only the week before: "There but for the Grace of God go I", etc. I had them up and running inside three hours from a standing start. Rather lucky there really - they have a functioning business and I have a success story rather than a potential lawsuit from an ex-customer's liquidators trying to claw back some loot.


For those out there using S3 to store their personal backups, I'd like to recommend Wasabi [0] as either a replacement or addition to AWS S3. Wasabi offers object storage in multiple regions, compatible with AWS S3 API and therefore is a drop-in replacement. The storage cost is $6 per TB p/m, with no bandwidth costs.

I'm not affiliated with them, just a happy customer enjoying massive cost savings.

[0] https://wasabi.com/


There’s something “off” about them that’s difficult for me to put my finger on. I think it might be either their marketing approach, or signs of operational immaturity (GoDaddy domain issue taking them down for most of a day, for instance).


They advertise this on the website:

  > 80% Less than Amazon S3
Why would I use a service that is 80% less reliable than Amazon S3? And if they mean something other than reliability - such as uptime, price, or transfer speed - then they should state it clearly.


It’s not for people that won’t put more than 2 seconds of work into something. It’s a discount service.

That line could easily be a decoy to ward off problem customers who don’t know how to use APIs, read specs, etc and would be high customer service load.


So it's 80% less service than S3, I see ))

In any case, thanks for mentioning the service. I'm quite happy with Boto (https://github.com/boto/boto3) for interfacing with Glacier, but it is good to know about alternatives.
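
Since Wasabi is S3-compatible, boto3 works against it with just an endpoint override. A minimal sketch - the endpoint below is the one Wasabi documents for us-east-1 as far as I know, and the credentials, bucket, and file names are placeholders:

    import boto3

    # Point the standard S3 client at Wasabi instead of AWS.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.wasabisys.com",  # check Wasabi's docs per region
        aws_access_key_id="WASABI_KEY_ID",        # placeholder credentials
        aws_secret_access_key="WASABI_SECRET",
    )

    # Upload a backup archive exactly as you would to AWS S3.
    s3.upload_file("backup-2021-11.tar.zst", "my-backups",
                   "laptop/backup-2021-11.tar.zst")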


What a ridiculous nitpick.


You should mention the 90 day rule, however.

Basically, anything that gets uploaded is billed as if it was stored for 90 days minimum, even if it's deleted or overwritten earlier.

An additional difference is that egress is limited to the size of your buckets. So if you store 1 TB, your egress should not be more than 1 TB in a month.


This 3-2-1 backup strategy is missing a dimension: time. Data is rarely unchanging. The 3-2-1 strategy is inadequate for changing data, because if you delete the wrong database table and don't realize it, the 3-2-1 backups are going to be missing the table too.

You need to add the time dimension and keep daily snapshots for the last 90 days, weekly snapshots for data within the last year, and monthly snapshots for data older than a year.
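
As a sketch, that schedule boils down to a keep/prune predicate over snapshot dates (the weekday and day-of-month choices below are arbitrary, and list_snapshots/delete_snapshot are hypothetical helpers):

    from datetime import date

    def keep(snap: date, today: date) -> bool:
        """Daily for 90 days, weekly for a year, monthly beyond that."""
        age = (today - snap).days
        if age <= 90:
            return True                      # keep every daily snapshot
        if age <= 365:
            return snap.isoweekday() == 7    # keep one per week (Sundays)
        return snap.day == 1                 # keep one per month (the 1st)

    # Prune pass: delete any snapshot the predicate rejects.
    # for snap in list_snapshots():              # hypothetical
    #     if not keep(snap.date, date.today()):
    #         delete_snapshot(snap)              # hypothetical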


Yes, and validation as well.

If backups are not tested by actually retrieving data on a regular basis, you might get a nasty surprise.

The Legato NetWorker bug which resulted in 64 bit XFS inodes not getting backed up bit Pixar very hard back in the day (LGTpa40680).


You may want to check out our open source project, Proxmox Backup Server; it supports 3-2-1 relatively out of the box:

- integrated tape backup (LTO-4 or newer): this fulfils the two-different-media rule and can also help with the offsite one (e.g., put a tape in a safe at the CEO's home once a week or so)

- efficient mirroring of backup data to a remote PBS (offsite copy)

- client-side encryption, because backups are good, but maybe not so good if they're leaked

- file-level and block-level backup - disclaimer: the former is currently Linux-clients-only, and the latter can handle anything but also runs on Linux (VMs are used)

The introduction/main features section of the docs contains more info, if you're interested: https://pbs.proxmox.com/docs/introduction.html

If you have your non-Linux workload contained in VMs, and maybe even already use Proxmox VE for that, it really covers safe and painless self-hosted backup needs.
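
For reference, a client-side backup run with encryption looks roughly like this, if I'm reading the docs right (the repository string and key path are placeholders; here wrapped in Python):

    import subprocess

    # Sketch: file-level backup of / to a remote PBS datastore with
    # client-side encryption. Repository and key path are placeholders.
    subprocess.run(
        ["proxmox-backup-client", "backup", "root.pxar:/",
         "--repository", "backup@pbs@pbs.example.org:datastore1",
         "--keyfile", "/root/pbs-encryption-key.json"],
        check=True,
    )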


I say “you are only as good as your last restore”, a phrase I stole from the Theatre industry where your performance “is only as good as your last rehearsal”. Of course this is most useful in Disaster Recovery scenarios. Rehearsal? Restore? Oh we’ve never tested it.


For a couple of VMs I have an automated restore to another VM. The test copies rsync from the latest backup daily and restart themselves, and once a week they do a full wipe and resync.

A script touches the copy mail store (I currently use Zimbra for a mail server) to check it is up, and that the last item "received" is no older than 24 hours, and emails me to say all is well. If I don't get that daily mail, the restore has failed somehow. I keep meaning to add an extra layer that checks for the daily mail and sends me an SMS if it doesn't arrive, just in case I don't notice. The copy VMs don't need to be high spec, though in theory if the main machines die I could switch over to them in minutes via DNS+firewall updates (and by giving them a bit more CPU+RAM allocation). They get just enough resources that the services start and run fast enough for the occasional manual paranoia check, and they aren't accessible to the outside world (but could be, if I needed to make that switch in a DR situation).

Similar for the little web servers I still run and key parts of general data backup (the parts that can't be reobtained at all if both originals and backups are lost): restore from backup after the update of the backups is due to finish, checksum everything older than a few hours on both sides, and send a mail if the checksums match. That sometimes gives false negatives, if a file is updated between backup and checksum; I have a couple of possible fixes for that in mind, but it only happens if I'm working silly late (due to when the checks run), so I've not been bothered enough to implement one.
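
The comparison step itself can stay tiny: hash everything older than a few hours on both sides and diff the two maps. A rough sketch, with illustrative paths and cutoff (streaming the hashes would be kinder to huge files):

    import hashlib, time
    from pathlib import Path

    CUTOFF = time.time() - 6 * 3600   # skip files touched in the last 6h

    def tree_hashes(root: str) -> dict:
        base = Path(root)
        return {
            str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in base.rglob("*")
            if p.is_file() and p.stat().st_mtime < CUTOFF
        }

    live = tree_hashes("/srv/data")          # the original
    restored = tree_hashes("/mnt/restore")   # what came back from backup
    bad = [f for f in live if restored.get(f) != live[f]]

    # Mail "all is well" only when nothing differs; silence means failure.
    print("OK" if not bad else "%d mismatches" % len(bad))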


This reminded me to buy a new hard drive for secondary backup. Is it just the nature of so many reviews these days, or is it impossible to find a drive now that won't likely fail in a few months? I just spent an hour on Amazon and other stores and can't find a single drive that isn't littered with relatively believable 1-star horror stories... many of which include things like "I've had 3 drives from X brand before but this is nowhere near as good as the old ones."

(Side note, I also see the same comments about the shoes I want from the company I've bought shoes from forever).

Is this comment overload from naive users in the hard drive sector, or has quality control dropped across the board?


If you have 0.01% failure rate and sell a million devices, that’s a hundred expected failures. Some of these buyers will want to blame someone, so they’ll go on Amazon telling everyone that their study with a sample size of one shows how incompetent YourDriveBrand has become, and that they should never buy from them again.

The average rating is probably a better indicator here.


Backblaze publishes good data about their drive failure rate.

I usually judge based on the manufacturer warranty. If they do 3 years, it's probably pretty good; 5 years means top shelf. Stay away from anything with shorter warranties.


It seems like a solid plan, but what's missing is that no matter whether you're doing 3-2-1 or using simpler solutions, you should periodically test that the backup works. Otherwise, over the years, the backups die one by one and nobody notices, which defeats the point of so much redundancy. Also, there's a chance the backup data is corrupted or incomplete (especially due to the design), so it's not possible to restore the system. Just test it periodically.

If there's a backup process, you should test it too. Scenarios where backup job workers are stuck and the system is not actually backing anything up are not uncommon.


I test my backups whenever I buy a new computer... I'll transfer the data from my latest backup to the new computer before I wipe my old computer. After I wipe the old computer I will restore the data from the other backups to test them. I should do it more, but I buy a new computer every 2-3 years and testing the backup is most convenient when I have a "spare" computer on hand.


What's the cloud solution (at major providers) that automatically geographically replicates data? For example, S3 buckets are tied to a region, which cannot be considered reliable - a single DC can always burn down, or be intermittently unreachable. I'm looking for something that accepts an "upload" command and will (eventually) replicate the data to several regions. Ideally the regions can be changed at any point later, too.

This would take care of 3 and 1, sadly not 2 but 2 is pretty hard.


A region for AWS or GCP (and surely Azure) is more than one physical data center. A given data center (availability zone) can fail with no data loss. You have to choose very specific storage tiers with warnings galore to get to a point where one data center will cause data loss.

There are also multi-region buckets which go across regions.

I use RClone and Duplicacy to backup to S3 Glacier and Google Workspace. Kopia is also another up and comer.

For a simple copy type setup RClone is a good one to look at, by default it copies the files and doesn't do any snapshots, packing, splitting, etc.
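
For what it's worth, a basic rclone copy is a one-liner once a remote has been set up with "rclone config" - here wrapped in Python, with a placeholder remote name and paths:

    import subprocess

    # Sketch: one-way copy to a pre-configured S3-compatible remote.
    # rclone only transfers new/changed files by default.
    subprocess.run(
        ["rclone", "copy", "/home/me/photos", "remote:my-backups/photos",
         "--checksum",         # compare hashes rather than size+modtime
         "--transfers", "8"],  # parallel uploads
        check=True,
    )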


S3 offers multi-region buckets that automatically read/write from one of n buckets in n regions. Cross-region replication rules give you eventual consistency (without conflict resolution) across all buckets, which is good enough for backup systems with unique file names/paths.


Thanks, this works! I'm going to use it now :)


No problem. I was loose with my lingo earlier; it’s not called a multi-region bucket but rather a multi-region S3 access point (hence the behavior described). More here: https://aws.amazon.com/s3/features/multi-region-access-point...


S3 is region based, not AZ based. So it should be able to tolerate an entire AZ (=DC) being lost.

They offer an amazing number of 9s for durability, it is very unlikely that a single fire will cause harm to your data.


I can't find the article anymore, but a few weeks ago here on HN there was a piece where somebody did a few calculations on how AWS US-East-1 (the most used AWS region in the world) works, and came to the conclusion that all the things that could make that region go BOOM are likely to have major consequences for our civilization.

In short: AWS will go down only when you won't care about it anyway


Yeah, OK. But apart from the reliability question - what's a one stop shop for uploading files to Europe and the US in one go?


For AWS there are options, but they require setup. It's not exactly one-click and can be quite complex depending on your skill level.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/replic...

https://aws.amazon.com/blogs/aws/new-cross-region-replicatio...
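
A hedged sketch of the boto3 side of it (bucket names and the IAM role ARN are placeholders; the buckets must already exist in different regions, and replication requires versioning on both):

    import boto3

    s3 = boto3.client("s3")

    # Cross-region replication requires versioning on both buckets.
    for bucket in ("my-backups-eu", "my-backups-us"):
        s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={"Status": "Enabled"},
        )

    # Replicate everything from the EU bucket into the US one.
    s3.put_bucket_replication(
        Bucket="my-backups-eu",
        ReplicationConfiguration={
            "Role": "arn:aws:iam::123456789012:role/s3-replication",
            "Rules": [{
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {},   # empty filter = the whole bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::my-backups-us"},
            }],
        },
    )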


I've added offline/local-first support to my web app so users can keep a copy and backups of all of their data on site and use the app offline, but none of them want to do anything at all to use it, and I've made it very easy.

They really don't seem to give it any thought at all, even when I explain to them why they need to do this.

Conversely, when they're having network connection issues they don't hesitate to call me, and sometimes in a panic.

I'm going to keep pushing them to get on it on their end though.


How does everyone back up their iDevices? Every article I read talks about iCloud, which is great and I use, but doesn’t follow this principle and you’re out of luck if you lose access to your Apple ID.

I’ve found some content-specific solutions that will back up photos or contacts for example, but I’m looking for a fairly streamlined, comprehensive solution that would back up everything - messages, contacts, photos, emails, bookmarks - basically iCloud 2, but without the Apple lock-in.


I basically don’t, at least not at the device level. Photos get synced to my home server using PhotoSync every once in a while (manually, but eh, it works). Texts are backed up on my laptop technically, but I pretty much consider them ephemeral. I don’t really care about bookmarks, and everything else on that list is cloud-first.


You can always plug into your computer and use iTunes for a local backup. Not elegant, but it’s still there for those who want it.


Might be important in the age of ransomware: is your data protected against someone who has all your passwords? All cloud backup is probably gone in such a case. All NAS and other network-connected backup is gone too. Physical backups should be good if in write-once mode (DVDs, for example).


For my off-site copy, I keep an external hard drive at my parents' house and back up to it at least once a month. Time Machine handles multiple disk backups beautifully. Yeah, it could be a little out of date, but I wouldn't lose that much in a month.


The article glosses over one thing: you should have at least one backup that is not accessible through the company network. A different site connected to the same network is accessible by ransomware gangs once they are in.


This is pretty much precisely how Maersk recovered from the NotPetya worm attack, though the circumstances were serendipitous rather than planned:

... After a frantic search that entailed calling hundreds of IT admins in data centers around the world, Maersk’s desperate administrators finally found one lone surviving domain controller in a remote office—in Ghana. At some point before NotPetya struck, a blackout had knocked the Ghanaian machine offline, and the computer remained disconnected from the network. It thus contained the singular known copy of the company’s domain controller data left untouched by the malware—all thanks to a power outage. “There were a lot of joyous whoops in the office when we found it,” a Maersk administrator says.

When the tense engineers in Maidenhead set up a connection to the Ghana office, however, they found its bandwidth was so thin that it would take days to transmit the several-hundred-gigabyte domain controller backup to the UK. Their next idea: put a Ghanaian staffer on the next plane to London. But none of the West African office’s employees had a British visa.

So the Maidenhead operation arranged for a kind of relay race: One staffer from the Ghana office flew to Nigeria to meet another Maersk employee in the airport to hand off the very precious hard drive. That staffer then boarded the six-and-a-half-hour flight to Heathrow, carrying the keystone of Maersk’s entire recovery process. ...

https://tech.industry-best-practice.com/2018/10/14/the-untol...


Anyone using m-disc for archival?

Write-once seems valuable in the age of ransomware.


I've backed up a load of photos (mostly raw, 5 discs so far) onto 100GB M-discs. They seemed like a good choice for long term archival storage - they don't need to be kept in ideal conditions - humidity isn't supposed to be a problem, and leaving them out in the sun is apparently fine! Plus drives for reading them are cheap and readily available.

The discs themselves are quite expensive in terms of cost/TB - somewhere around 50-60 GBP from what I've found.

I have more confidence in their ability to not lose data if left on a shelf and forgotten about for several years compared to hard drives. I have the same data stored on a ZFS pool too.


Is that 3 copies per each of the 2 media, i.e. 6 copies?


Recently learned this lesson.

Had 2 backups (1 SD card, 1 HDD) for my "Document" folder.

I was trying to replace my GRUB MBR bootloader with rEFInd's EFI so I could dual-boot on a new laptop (and swap in my old one's SSD without having to reinstall the whole system: Arch). Unfortunately, the boot partition was too small and needed to be resized from 512MB to 1GB. Foolishly, and in a rush, I thought using gparted to change the partition boundaries (shrink the root partition by 512MB from the beginning, and stretch the boot partition to 1GB from the end) was the answer.

I completely forgot that EXT4 has a superblock at the beginning, so now it was gone, and the root partition was completely unmountable -- and fsck was of no use.

So I scramble to find my backups (to decide whether or not I should figure out how to fix this), and realize that tiny little SD card was missing, and my HDD backup was completely unmountable.

Truly, a major fuck-up.

Thankfully, I didn't write anything to the boot partition, so throwing a Hail Mary and simply resizing the partitions back to their exact original sizes (thankfully x2 my TTS history was useful), allowed the root drive to mount without a hitch.

I was close to losing all of my KeePassXC passwords and private keys due to sheer idiocy.

In the end, I set up "cloud" backups (second storage media type, and long and far away), switched to Debian, and continued on my merry way.


> so throwing a Hail Mary and simply resizing the partitions back to their exact original sizes

I've made similar mistakes (well, the same effect, but for different reasons: I thought I was on a remote session when I was local, and made an entire new partition layout).

`testdisk` was able to introspect what partition boundaries _were_ and rebuild the partition table.

If you're ever in a similar situation and either don't remember the exact boundaries or don't trust yourself to recreate them, check it out!


  > switched to Debian
Not to trivialize the wonders of Arch, but the switch to Debian was a major step in improving the Linux-Wife balance. A system that is reliable, and doesn't have many history lines that begin with `sudo vim /etc...` makes for a happy family life.


To be fair, I think the fuck-up was on gparted's part; it's supposed to handle this kind of thing for you during the resize. That being said, no tool is ever bug-free - hence backups :).


Close shave! Be sure your cloud backups can be accessed by you even in the case of loss of all/most computing devices in your possession. This includes your phone for 2FA.



