Published on 2024-01-15

I remember when I was younger, my mother would always talk about how she had shelves of photographs taken over the course of a lifetime. She was worried that all it'd take is one fire for them to be gone within an afternoon. She told me that when she retired, she'd make a project out of them: to photocopy them, or to scan them, so that they'd last into the future. It'd take weeks of tedium, carefully feeding these old, sacred photographs into a scanner. It wouldn't be easy, but they mattered. Preserving them mattered.

As far as I know, she never tried. She still could, I suppose, but I haven't heard mention of it in forever. She cared at a time when it seemed like nobody else did. Most of the people in those photos are dead. I imagine that most of the people who recognize them are dead too.

Yesterday I started what I've been looking to turn into a weekly ritual for quite a while, hoping that this time it would stick. The idea is to block out some time once a week on Sunday to back up all my important devices to a hard drive, and to store encrypted backups on some cloud service for off-site assurance.

Since this is the first time I've really sat down to do it seriously, there was a bit of a backlog. See, I have three drives I use as USB mass storage devices: an SSD, a 1TB HDD, and a 2TB HDD I recently acquired. The SSD is starting to fail, it seems, so I needed to move my backups off of it. Meanwhile, my 1TB HDD works fine, but I'm hoping to use it in a new PC build, and now that I have a much more compact 2TB HDD, it makes more sense to use it as the new permanent home for all my backups.

The thing is, my old hard drive had over 100GB of files stored on it, many of which were very tiny: ancient git repos, copies of ancient git repos, duplicate backups of ancient git repos, node_modules folders, old builds... Certainly over a hundred thousand files. And moving each file between the two hard drives took at least two seconds. At a hundred thousand files, that's like, 55 hours minimum.

In there was over a decade of my digital life. Almost my entire digital life. The earliest dates I saw were from around 2016, but some of the backups went back way further--probably closer to 2012, when I was a literal child. So much of the stuff in there was junk, generated files that could easily be regenerated when it actually mattered, but discerning what's junk and what's not is really difficult at this point. Like, back when I was in middle school, I made a bunch of these walking simulators in the Blender Game Engine (a game engine that effectively doesn't exist anymore). All of my games are still there, duplicated several times across many backups, and it looked like every one had a copy of the entire Python runtime packaged with it.

I can't delete those; they're an important part of the digital fossil record of my youth. But it's also really hard to tell what parts of it I can delete, to facilitate transferring it between drives. So, my options are to spend 55 hours passively copying my files between drives, or to spend potentially much more time actively combing through ancient versions of software to repair and clean old projects; not to mention combing through the many other files that need to be individually judged on whether they merit the continued work of preserving them.

And this doesn't address the problem of git repositories at all...

It didn't take me 55 hours to do this in the end. I did some research and found I could move the files much more quickly if I compressed them in a tar archive on my computer's drive and then moved the archive to the destination. Tar seemed to help quite a bit with the duplication in backups. I didn't realize the file system I was using on the new drive had a file size cap of 4GB, which caused some problems, but I managed to get away with splitting the archive up in the end.
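This isn't the exact procedure I used, but here's a minimal sketch in Python of the same idea: pack a folder into one gzip-compressed tar archive, then cut that archive into parts that fit under the file size cap. The function names, the `.partNNN` naming scheme, and the exact cap (4 GiB minus one byte) are all assumptions made for illustration.

```python
import tarfile
from pathlib import Path

# Assumption: the cap is 4 GiB minus one byte; adjust for your file system.
CHUNK_SIZE = 4 * 1024**3 - 1
BUFFER_SIZE = 1024 * 1024  # copy 1 MiB at a time to keep memory use low


def make_archive(source_dir, archive_path):
    """Pack source_dir into a single gzip-compressed tar archive."""
    source_dir = Path(source_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    return Path(archive_path)


def split_file(path, chunk_size=CHUNK_SIZE):
    """Split path into numbered parts of at most chunk_size bytes each."""
    path = Path(path)
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            # Hypothetical naming scheme: archive.tar.gz.part000, .part001, ...
            part_path = path.with_name(f"{path.name}.part{index:03d}")
            written = 0
            with open(part_path, "wb") as dst:
                while written < chunk_size:
                    block = src.read(min(BUFFER_SIZE, chunk_size - written))
                    if not block:
                        break
                    dst.write(block)
                    written += len(block)
            if written == 0:
                part_path.unlink()  # nothing left to copy; drop the empty part
                break
            parts.append(part_path)
            index += 1
    return parts
```

The point of archiving first is that the drive then only has to handle a few dozen large files instead of a hundred thousand tiny ones. The zero-padded part numbers are there so the parts sort back into the right order by name.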

Once, or perhaps if, I erase my old HDD, then I'll have a new problem: accessing these old backups will go from something as easy as plugging in a USB cable to what may potentially be an afternoon of reconstructing split archives and extracting 40GB worth of data. This is a problem I imagine will only become more challenging as I get older, and more of this data needs to get packaged and compressed.
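For the record, that afternoon of reconstruction might look something like the sketch below, again in Python and again not a definitive procedure. It assumes the parts are byte-for-byte slices of a single gzip-compressed tar archive and that their file names sort into the right order; `join_parts` and `extract_archive` are hypothetical helper names.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(parts, output_path):
    """Concatenate the split parts, sorted by name, back into one archive."""
    with open(output_path, "wb") as dst:
        for part in sorted(Path(p) for p in parts):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)
    return Path(output_path)


def extract_archive(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # Only do this with archives you trust: older Python versions
        # do not sanitize the paths stored inside the archive.
        tar.extractall(dest_dir)
```

Note that this needs enough free space for the rejoined archive and the extracted files at the same time, which is a big part of what makes it an afternoon rather than a USB cable.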

Using tar was a pretty big step up from how I'm used to backing up devices. I suspect in the future, I might need to look into more "industrial" solutions, because my old ways are becoming too inefficient to be feasible. I only expect these backups to become more inaccessible with time as well.

Maybe one day I'll retire, and I'll have all the time in the world to carefully comb through these old backups, to format the digital fossil record of my life in a way that makes it easy to read and easy to transfer. Maybe I'll write it to a spool of magnetic tape encased in a vacuum chamber. Maybe I'll burn it to an archival-grade optical disc. Maybe I'll launch it into space. Maybe it's not worth my time.

As my data gets cemented beneath more and more geological strata of digital cruft, it simultaneously becomes harder to read and harder to transfer. I didn't even bother to verify that my split archives didn't get corrupted partway through the backup process; it would have been too time-consuming. It's unlikely I even have enough space on my computer's disk to try. The files stop being what they were, and they become a sort of abstract token of something that once was. My entire life as a weird, geeky teenager, everything I ever loved, dot tar dot gz, part one of forty.

It's hard to say when exactly a file reaches its end of life. I want to hope that my files will live forever, but I know that one day I'll hit the little X icon in the top right corner of the window and I'll never see it again.

I don't think that makes them any less meaningful, though. Even if they're just an abstract sequence of bits that may or may not have fallen victim to bit rot in the many years of their being sequenced together, they're still my high-school-work.tar.gz, and that matters. I'll carry them with me for as long as I can.
