Digital sovereignty is harder than it should be

Your dependency graph is huge, but that’s by design

Aug 30, 2022

Sooner or later, they’ll come for your stuff — your documents, your email, your social media accounts, your photos, or some other part of your digital life. When it happens, you may or may not know it — maybe they secretly peek at your files, or maybe your docs are leaked in a hack, or maybe you’re suddenly de-platformed with no recourse of any kind.

Who’s “they”?

Maybe it’s feds, some megaplatform, or even hackers.
Maybe it’s a hostile foreign power that ends up making all our chips and putting backdoors into everything.
Ultimately, “they” is whoever has admin rights — de facto or de jure — on the cloud system where you’re keeping your data. They’re the complete strangers who can read, write, copy, and delete all your password-protected secrets.

If code is law, “they” are congress, the courts, and the executive branch all rolled into one. None of them were elected in any contest you voted in, or appointed by any process that you had any say in. Nobody outside the company hosting your data even knows their names, or how many of them there are. Heck, most people inside the company couldn’t name them. But their system belongs to them — they have root, and you’re merely a guest with whatever rights and privileges they say you have.

The only way to keep them from coming for you is to get your data out of their hands before they notice you. If you’re truly worried about what they might do to your data, you’re gonna have to host your own cloud.

Thankfully, we’re in the era of cheap storage and the Raspberry Pi. So this should be pretty straightforward, and not even really that expensive, right?

Wrong. Taking command of your own digital life is a lot harder than you probably think it is. Even if you have a lot of money to throw at the problem, it’s still a bigger pain than you might imagine.

What follows is a case study based on my own abortive efforts to achieve some measure of digital sovereignty. Consider it a cautionary tale.

Making a plan

When I decided to have a go at untangling myself from the cloud, I figured this would be pretty straightforward and only moderately time-consuming. My initial plan went something like this:

Buy a Synology DiskStation with maxed-out RAM (for hosting virtual machines)
Back up all my data from Google, Dropbox, and Apple
Move my email over to Protonmail
Install NextCloud on the DiskStation and start using it on my laptop and phone
Victory!

After I put the order in for the DiskStation and the drives, I started making a comprehensive list of all the cloud services I use, so that I wouldn’t miss any when I started replacing them with self-hosted alternatives. I was just gonna go down this little list, move these few things over to the DiskStation, and then boom, I’m my own cloud provider. Not a problem!

This list-making exercise was my first indication that things were not going to go as planned. Check this out:

The above is extensive, but it isn’t even a complete list. I just kind of quit when I filled up the slide. At some point this month I have big plans to go through my credit card statements and unsubscribe from a ton of stuff, so I’m sure I’ll find even more then.

The cloud manifest

As I sat and stared at the slide above, I realized that I — a software developer — had seen lists like this before. In developer terms, this is what’s known as a “manifest.”

If you’re a Ruby dev, I’m talking about Gemfile.lock. For JavaScript, it’s package-lock.json.

A pair of manifests. On the left is a Gemfile.lock for a Ruby project, and on the right is a package-lock.json for a JavaScript project.

When you have a manifest like the above, you can use it to build a dependency graph — a graph that shows how pieces of software, services, and other entities depend on one another.

A dependency graph for software. Source: Jobin RV
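To make the manifest-to-graph step concrete, here is a minimal Python sketch. The `MANIFEST` entries are hypothetical and purely illustrative (they borrow a few names from this post, but they are not my actual spreadsheet), and `build_graph` and `transitive_dependencies` are made-up helper names, not part of any library.

```python
# Hypothetical manifest: each entry lists what that service depends on.
# The names and edges are assumptions for illustration only.
MANIFEST = {
    "nextcloud": ["nas"],
    "nas": ["backblaze"],
    "backblaze": ["credit_card"],
    "email": ["protonmail"],
}


def build_graph(manifest):
    """Turn a manifest into two adjacency maps: depends_on and dependents."""
    depends_on = {name: set(deps) for name, deps in manifest.items()}
    # Make sure things that only appear as dependencies get a node too.
    for deps in manifest.values():
        for dep in deps:
            depends_on.setdefault(dep, set())
    dependents = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(name)
    return depends_on, dependents


def transitive_dependencies(depends_on, start):
    """Everything that `start` relies on, directly or indirectly."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for dep in depends_on.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


depends_on, dependents = build_graph(MANIFEST)
print(sorted(transitive_dependencies(depends_on, "nextcloud")))
```

Even in this toy version, a "self-hosted" NextCloud ends up resting on a backup provider and a credit card, which is roughly the point of drawing the graph in the first place.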

Once I began thinking of my list in terms of manifests and dependency graphs, I settled on the following set of layers for a generalized cloud dependency graph:

With this in mind, I started converting the above slide into a more formal manifest, so that I could map out dependencies all the way up to the top of the graph. I ended up with something that looks like this for just the “Messaging” category:

What I learned

I went through the first steps in my initial plan: I bought a NAS, backed up my data, and got a Protonmail account. And at that point, I looked at the spreadsheet above and just gave up. OK, maybe I didn’t give up. What I told myself was that I was pressing pause on this project and putting it on the back burner. But the reality is that until I pick this project back up again, not only have I left myself just as dependent on the cloud as I was when I started, but I’m now over $1K poorer, with a few brand new cloud dependencies (Protonmail, Backblaze for backing up the NAS, and some Synology-specific services) to show for it.

As sad as this unfinished story is, the exercise hasn’t been a total loss. I’ve learned a few things along the way:

If you showed this tangled mess of an online presence to me in 2001, I wouldn't believe that normies would live this way. It’s way too complicated. Normal people will never, ever do all this. Totally infeasible. And yet here we are! (More here.)
The rows that don't have credit cards — like Gmail or Facebook — STILL HAVE CREDIT CARDS! Google and FB have actually aliased my credit cards from the other rows into those rows because they're tracking me and they know what cards are mine.
Things are this way because the current state of web2 makes it easy for them to be this way. This is actually the path of least resistance in web2 — more signups, more identities, more accounts, more of your data in more places that you don’t own or control. (More here.)
There are cycles in this graph — circular dependencies. I use my jonstokes.com email address to log into Google Workspace and to the DNS provider that hosts the jonstokes.com domain and MX records. Oops. Recipe for disaster.
Legislative remedies for this state of affairs (always popular in the EU, and gaining traction in the US) are probably possible, but in their current form they’re not going to work and are actually worse than nothing. (More here.)
Despite the scale of the problem, I still think it's possible for me to untangle all this and collapse it down into a manageable set of self-hosted services and identities. But it's a huge project and will require an investment that’s even steeper than the cash I’ve already laid out — i.e., a significant investment of time.
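
The circular dependency in the list above is the kind of thing a few lines of code can catch. Here is a rough Python sketch of cycle detection using a depth-first search. The `ACCOUNTS` graph is my simplified reading of the jonstokes.com situation (the email address needs the DNS provider for MX records, and both the DNS and Google Workspace logins need the email address), and `find_cycle` is a hypothetical helper, not a library function.

```python
# Simplified, assumed edges: "X depends on Y" means losing Y can lock you out of X.
ACCOUNTS = {
    "email": ["dns_provider", "google_workspace"],
    "dns_provider": ["email"],
    "google_workspace": ["email"],
}


def find_cycle(depends_on):
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in depends_on}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for dep in depends_on.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we have looped back.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in depends_on:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle(ACCOUNTS))  # ['email', 'dns_provider', 'email']
```

Any cycle this turns up is a lockout waiting to happen: lose access to one node and you may have no way to recover the others.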

To sum up: I wish I had better news, but I don’t. We all do what’s easy and natural, and what’s easy and natural in web2 is to keep clicking “Log in with Google” (or Facebook, or your identity provider of choice) and spread your data to the four winds.

Rolling back this situation toward something that’s more like “my data, my rules,” is going to take a ton of work, both individual and collective. A lot of that work will be political. See my next article for why.
