Don't build "The Vault"

This post is a sequel to Tiers Considered Harmful. Read that one first, or don’t [1].

Four separate times at four separate jobs, someone had the bright idea to build a high-security inner sanctum environment to process some extra-sensitive data (often payment card data). Oftentimes it’s called “The Vault”. And all four times I’ve seen:

  1. they underestimated the maintenance cost
  2. the environment didn’t deliver on its promised security properties

In most instances, The Vault achieved its desired security properties on Day 1. But then time moves on, and by Year 2 or 3, these environments look quite long in the tooth. By that point the people involved in building The Vault have moved on to different projects, if not entirely different employers. The bug that looked so harmless 6 months ago is now really causing issues, but people are hesitant to deploy new code until someone really twists their arm. The monitoring system is now entirely out of step with what the rest of the company uses. The third-party software components are now rife with unpatched CVEs. Researchers disclosed some scary results around the HSMs at last year’s DEF CON. The rest of the hardware is getting iffy; folks are starting to realize that they’ll eventually need to do an unappetizing data migration to the new hardware. The runbooks, which are supposed to be practiced top-to-bottom each quarter, are sitting gathering dust. The engineering director who inherited this system is wondering just how many confidentiality and availability eggs are tied up in this one basket, and just how fired they’ll be if the basket breaks.

Of course, this is just one of many concerns on the mind of this director on this day. They’re mostly worried about preparing for the next round of layoffs: the fourth since The Vault was originally specced (or is it the fifth? Who bothered counting?). Without fail, The Vault was envisioned when the company was bright-eyed and bushy-tailed, with grand plans and absolutely not a clue that growth would slow and the staffing available to this team would be cut by 70%. They definitely didn’t envision a radical shift in executive leadership, like the company selling itself to a mercurial asswipe who agreed to the purchase but then refused to consummate it.

The engineers involved in designing The Vault had originally conceived of it as an entirely offline environment “except for the odd emergency”. And, to their credit, that is what has happened: everyone has taken a live-and-let-live attitude, and a grand total of 2 working days per year (roughly 10 milliFTEs) has been devoted to The Vault since it was first developed. The Vault (as promised) isn’t a good place to do any worthwhile software engineering, so SWEs/SREs go off and find better things to do. They go out and build new things; they take on new scope. Nobody bothers to write The Vault into their promo packet anymore [2]; why would they?

SWEs and SREs aren’t Air Force missileers: they aren’t well adapted to sitting around in a bunker waiting for an order that everyone hopes will never arrive. The team’s director knows this, and wouldn’t dream of advocating for headcount whose role would be: (1) make sure The Vault is well-kept, and (2) resist the urge to build anything unless they absolutely have to. So the director treats The Vault as an annoying tail risk. They know it’s going to (need heavy maintenance/go down/have an exploitable vuln) one day, and they just hope that day isn’t between now and the end of the bonus cycle. In this respect they’re like every other corporate functionary, or indeed any rational and well-intentioned person with limited bandwidth and a lot of other shit to do.

I can’t possibly hope to dissuade you from building The Vault. So, in a spirit of harm reduction, I recommend that your first goal be to put yourself in the mindset of a person who is building a time capsule. Take off your Security Engineer hat, and instead put on your Nuclear Waste Storage Engineer hat. You need to future-proof the shit out of this thing. So, as much as you possibly can:

At the end of the day, The Vault is a people problem. If all of this “extra” work isn’t up your alley, convince your leadership to pay for Stripe instead. Another valid approach would be to build The Vault, have the time of your paranoid Security Engineer life while doing so, and then quit after bonuses are paid out.


  1. If you don’t want to bother reading the prequel, its thesis is “you should be building many small security cells, not a few large security tiers”. 

  2. As a rule, it should only be possible to write The Vault into your promo packet after it has been serving production traffic for two years.