Backing Up: a Journey

The Problem

We all know we need to back up stuff that's important to us. We've all been told this before, many times. For some classes of information, our providers (Google, Apple, etc.) have decided to help us with built-in cloud storage like iCloud or Google Photos. They aren't bulletproof, but they're useful, and you've got images in two places. No longer must you lose your history if you lose your phone.

Even so, I'm not a trusting soul, so I back my images up locally and in the cloud. If there's a meltdown in an Apple or Google datacenter tonight, I'll still be able to get to my images (and the music I own, for example).

There's another class of data: Stuff That's Not Automagically In The Cloud. Think of all the random shit on my laptop: works in progress, for instance, often aren't backed up to the cloud.

I also run a fairly significant test lab here. It's three enterprise-class servers and a handful of cast-off PCs pressed into 'server duty' over the years. The enterprise servers are set up as a virtualization environment (Proxmox).

Proxmox does a bang-up job of backing up its own children – VMs and LXC containers – well enough that I can restore them to a different machine and they work just fine.

This leaves me with miscellaneous scattered systems to back up and this is where my problems lie. So many applications have “use docker” as their recommended deployment strategy. I'm on board with that to some extent; containers are a big win in some regards. They also remove the need for bare-metal restores of servers and filesystems if you properly isolate the storage.

My Solutions

Infrastructure

I've got a TrueNAS server in my network to manage the storage for backups. It will, incidentally, run some random apps, but it's important to note (ironically) that it doesn't back those apps up usefully, so don't use it for anything that's not a throwaway.

Mobile devices

I use iCloud backup for my phone and my iPad. It's just too convenient. If someone grabs my phone and throws it in the river, I can have a new phone with all my old info in just a couple of hours. Same with my iPad. If you are concerned about data safety, you'll have to do something more manual, like plug your phone or tablet into a computer and manually back up.

Laptop and Desktop

I use Time Machine as my Mac client for backups to the TrueNAS server. It can be fiddly and slow over the wire; if I were to switch to a local SSD it'd be much faster, I suspect. Still, it's robust enough.

Virtualization Lab

Proxmox VE has a robust backup mechanism with scheduling and retention, and it will happily restore an entire VM or container to any PVE node. It's happy to use NFS as the transport to the TrueNAS server, and it's been fairly 'set-and-forget' for me.

In my k8s cluster I use Velero for namespace backups. I haven't run a full disaster-recovery scenario, but I have deleted apps and namespaces and successfully restored them to the same cluster. Velero uses MinIO on TrueNAS as its S3 object storage back end.

Miscellany

There are a LOT of backup utilities for Linux. I've been screwing with several, ranging from hand-scripted rsync jobs to burp to borg to ... you name it. I've finally sorta settled on restic.

restic does incremental snapshots and deduplication, so your storage can hold many versions without holding many full copies. That's a win for both storage and speed. So far it's been great for backing up directories. It encrypts the repository and has a raft of backends to choose from (I currently use restic's sftp backend to the TrueNAS server).

Additionally, I like restic because of the low barrier to entry. Run apt install -y restic and you're ready to back shit up. You can back up locally or to a remote system, with several backends available: local, sftp, S3, MinIO, Backblaze B2, and more.

Need a repo? Set up your ssh creds on your backup server for password-less login and go to town.

restic init \
   sftp:user@10.10.10.10:/home/user/restic

It'll ask you for the password you wanna use, then for confirmation, and you're cooking with microwaves. Don't forget that password. There's no coming back from that.
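If the password-less login mentioned above isn't in place yet, something like the following could set it up. This is only a sketch: the function name setup_backup_key is made up for illustration, the key path defaults to the standard ed25519 identity, and the empty passphrase is an assumption that trades some safety for unattended runs.

```shell
# Hypothetical helper: make sure an SSH key exists and install its public
# half on the backup server so sftp logins stop asking for a password.
setup_backup_key() {
    target="$1"                              # e.g. user@10.10.10.10
    keyfile="${2:-$HOME/.ssh/id_ed25519}"    # assumed default identity path

    # Only generate a key if there isn't one already; never overwrite.
    if [ ! -f "$keyfile" ]; then
        # -N "" means no passphrase (an assumption, see the note above)
        ssh-keygen -t ed25519 -N "" -f "$keyfile" -C "restic backups" || return 1
    fi

    # Appends the public key to authorized_keys on the server.
    # This step still asks for the account password one last time.
    ssh-copy-id -i "${keyfile}.pub" "$target"
}
```

Calling setup_backup_key user@10.10.10.10 once per backup server should be enough; after that, ssh user@10.10.10.10 ought to log in without a prompt.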

Then you can set a couple of environment variables:

export RESTIC_REPOSITORY=sftp:user@10.10.10.10:/home/user/restic
export RESTIC_PASSWORD=somesecurepassword
# alternatively, point restic at a file containing the password:
export RESTIC_PASSWORD_FILE=/home/user/.conf/resticp.txt
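A password file is only as safe as its permissions. Here's a minimal sketch of creating one that only the owner can read; the function name make_password_file and the choice of a random 32-byte password are my assumptions, not anything restic requires.

```shell
# Hypothetical helper: write a random password to a file readable only by
# its owner. Usage: make_password_file /home/user/.conf/resticp.txt
make_password_file() {
    pwfile="$1"
    mkdir -p "$(dirname "$pwfile")" || return 1

    # Set the umask in a subshell so the file is never created
    # world-readable, without changing the caller's umask.
    (
        umask 077
        # 32 random bytes, base64-encoded onto a single line
        head -c 32 /dev/urandom | base64 > "$pwfile"
    ) || return 1

    # Belt and braces in case the file already existed with looser modes
    chmod 600 "$pwfile"
}
```

Do this before restic init so the repo is created with that password, and stash a copy of the password somewhere that isn't the machine being backed up.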

I have multiple repos, so I created some environment files.

File for repo #1 (.repo1)

export RESTIC_REPOSITORY=sftp:user@10.10.10.10:/home/user/restic
export RESTIC_PASSWORD_FILE=/home/user/.conf/resticp.txt

File for repo #2 (.repo2)

export RESTIC_REPOSITORY=sftp:user@10.10.10.20:/home/user/restic
export RESTIC_PASSWORD_FILE=/home/user/.conf/resticp2.txt

Now I can source the second file with . .repo2 and then run restic snapshots to see what's what:

 ➤ restic snapshots
repository fc56ded2 opened successfully, password is correct
ID        Time                 Host        Tags        Paths
------------------------------------------------------------------------------
4eab9583  2023-09-14 21:52:04  labbox                 /home/user/Documents
------------------------------------------------------------------------------
1 snapshots
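Sourcing those files by hand gets old once there are more than a couple of repos. A small shell function can do the switching; this is a sketch, and both the name use_repo and the REPO_ENV_DIR override are hypothetical.

```shell
# Hypothetical helper: load one of the per-repo environment files by name.
# Assumes the files are named .repo1, .repo2, ... and live in $HOME
# (or in $REPO_ENV_DIR if that is set).
use_repo() {
    envfile="${REPO_ENV_DIR:-$HOME}/.$1"
    if [ ! -r "$envfile" ]; then
        echo "use_repo: cannot read $envfile" >&2
        return 1
    fi

    # Clear leftovers from a previously selected repo so a stale
    # RESTIC_PASSWORD can't tag along with the new repository.
    unset RESTIC_REPOSITORY RESTIC_PASSWORD RESTIC_PASSWORD_FILE

    . "$envfile"
    echo "now using $RESTIC_REPOSITORY"
}
```

Drop it into your shell's startup file and use_repo repo2 followed by restic snapshots does the same thing as sourcing the file directly.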

Then I can back up a directory with:

 ➤ restic backup ./Desktop
repository fc56ded2 opened successfully, password is correct
no parent snapshot found, will read all files

Files:         284 new,     0 changed,     0 unmodified
Dirs:          140 new,     0 changed,     0 unmodified
Added to the repo: 47.006 MiB

processed 284 files, 63.214 MiB in 0:02
snapshot 5df6aaf1 saved

and check it out with:

➤ restic snapshots
repository fc56ded2 opened successfully, password is correct
ID        Time                 Host        Tags        Paths
------------------------------------------------------------------------------
4eab9583  2023-09-14 21:52:04  labbox                 /home/user/Documents
5df6aaf1  2023-09-14 22:21:24  labbox                 /home/user/Desktop
------------------------------------------------------------------------------
2 snapshots

So say there's a change:

$ echo "This is a new file" > Desktop/newfile.txt

$ restic backup ./Desktop
repository fc56ded2 opened successfully, password is correct
using parent snapshot 5df6aaf1

Files:           1 new,     0 changed,   284 unmodified
Dirs:            0 new,     1 changed,   139 unmodified
Added to the repo: 2.624 KiB

processed 285 files, 63.214 MiB in 0:00
snapshot 6a4f7265 saved

$ restic snapshots
repository fc56ded2 opened successfully, password is correct
ID        Time                 Host        Tags  Paths
------------------------------------------------------------------------
4eab9583  2023-09-14 21:52:04  labbox           /home/user/Documents
5df6aaf1  2023-09-14 22:21:24  labbox           /home/user/Desktop
6a4f7265  2023-09-14 22:28:13  labbox           /home/user/Desktop
------------------------------------------------------------------------
3 snapshots

Now, if we want the unchanged one back (for the sake of the illustration, ignore the fact that we could just delete newfile.txt), we can delete our Desktop folder and restore from the snapshot.

 ➤ restic restore 5df6aaf1 -t ~/
repository fc56ded2 opened successfully, password is correct
restoring <Snapshot 5df6aaf1 of [/home/user/Desktop] at 2023-09-14 22:28:13.563663716 -0500 CDT by user@labbox> to /home/user/

Et voilà!

$ ls Desktop/
'Old Firefox Data'   computer.desktop   network.desktop   trash-can.desktop   user-home.desktop

No pesky newfile.txt in this newly restored directory!
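Deleting the live folder before restoring is fine for a demo, but for real data I'd rather restore somewhere harmless first and look before copying anything back. A rough sketch, with a made-up function name restore_to_scratch:

```shell
# Hypothetical helper: restore a snapshot into a fresh scratch directory
# instead of on top of live data, and print where it landed.
restore_to_scratch() {
    snapshot="${1:-latest}"    # restic accepts "latest" as a snapshot ID
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/restic-restore.XXXXXX") || return 1

    # Send restic's own chatter to stderr so that stdout carries
    # nothing but the scratch path.
    restic restore "$snapshot" --target "$scratch" >&2 || return 1

    echo "$scratch"
}
```

Something like dir=$(restore_to_scratch 5df6aaf1) followed by ls -R "$dir" shows what came back. How much of the original path is recreated under the target depends on your restic version, so have a look at the layout before running diff -r against the live directory.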

But wait, there's more! If you want to see what's changed between two snapshots:

restic diff 5df6aaf1 6a4f7265
repository fc56ded2 opened successfully, password is correct
comparing snapshot 5df6aaf1 to 6a4f7265:

+    /Desktop/newfile.txt

Files:           1 new,     0 removed,     0 changed
Dirs:            0 new,     0 removed
Others:          0 new,     0 removed
Data Blobs:      1 new,     0 removed
Tree Blobs:      2 new,     2 removed
  Added:   2.624 KiB
  Removed: 2.235 KiB

Or you can make sure your repo is consistent and solid:

$ restic check
using temporary cache in /tmp/restic-check-cache-4275756157
repository fc56ded2 opened successfully, password is correct
created new cache in /tmp/restic-check-cache-4275756157
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
[0:00] 100.00%  3 / 3 snapshots
no errors were found
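A plain check like this verifies the repository's structure; it doesn't read all of the stored data back. restic check also takes --read-data-subset=n/t to read and verify one slice of the pack files. Here's a sketch that works through a different seventh each day; the function name check_daily_slice and the seven-way split are my own choices.

```shell
# Hypothetical helper: verify one seventh of the repository's pack files
# per call, picking the slice from the day of the week.
check_daily_slice() {
    slice=$(date +%u)    # 1 = Monday ... 7 = Sunday
    restic check --read-data-subset="${slice}/7"
}
```

Run daily, that should touch every pack file about once a week without paying for a full read in one sitting.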

We can maintain the snapshots in the repo with restic, as well:

$ restic forget --keep-last 3 --prune

This tells restic to forget all but the latest three snapshots, and --prune removes the data they no longer reference. You can use more complex qualifiers too, including --keep-daily or --keep-weekly, for instance.

Restic has a much deeper feature set than is covered here, but these few commands will get you on the road to CLI backup easily enough. It's easy to script for cron jobs, and with incremental snapshots and built-in deduplication, you can run it every hour if you want to.
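For the cron case, a wrapper along these lines is roughly what I have in mind. Treat it as a sketch: the function name run_backup and the retention numbers are assumptions, and the environment file is expected to be one of the .repo files described earlier.

```shell
# Hypothetical backup wrapper meant to be called from cron.
# Usage: run_backup /home/user/.repo1 /home/user/Documents /home/user/Desktop
run_backup() {
    envfile="$1"
    shift

    if [ ! -r "$envfile" ]; then
        echo "run_backup: cannot read $envfile" >&2
        return 1
    fi
    # Pulls in RESTIC_REPOSITORY and RESTIC_PASSWORD_FILE
    . "$envfile"

    # Back up every remaining argument as a path; stop if that fails
    # so we never prune after a broken backup.
    restic backup "$@" || return 1

    # Assumed retention policy: 24 hourlies, 7 dailies, 4 weeklies
    restic forget --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --prune
}
```

Put the function in a script, call it on the last line, and a crontab entry such as 0 * * * * /home/user/bin/restic-backup.sh (a hypothetical path) runs it at the top of every hour. Use absolute paths throughout, since cron's environment is sparse.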

There are ... many backup utilities out there with varying degrees of complexity and flexibility. Some of them (like Velero) use restic in the background. I've come to like it quite a bit. But if it's still not your cup of tea, I still wanna leave you with a bit of advice: Back your shit up. Do it now. Don't wait.