Disaster Recovery

Overview

This document outlines the creation of a "Disaster Recovery" (DR) system that functions as a one-way box: we can encrypt secrets to it at any time, but can only recover them with the cooperation of a quorum of people with access to multiple offline HSM devices stored in diverse physical locations.

In short, it should be trivial to back up data but very expensive to recover it. Recovery is meant to be difficult for an attacker, and is accepted to be time-consuming, logistically complicated, and financially costly.

Data is backed up by encrypting plaintext to a Disaster Recovery Key. The resulting ciphertext is then stored in the Disaster Recovery Warehouse. In the case of a disaster, ciphertexts can be gathered from the DR Warehouse and then decrypted using the DR Key to regain access to the plaintext.

Threat Model

Each of the events below can be tolerated on its own, provided it is the only one that occurs and the disaster recovery goes to plan. If more than one of them occurs, the DR Key may be lost or stolen.

  • N-M operators lose control of their Operator Key Smartcards
    • For example, with a 3-of-5 M-of-N threshold, the loss of two is tolerated
  • An entire cloud provider or account becomes unavailable
  • An offline backup location is fully destroyed
  • An adversary gains control of any single DR participant
  • An adversary gains control of a single offline backup location
  • An adversary has malware that can control any internet connected computer
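
The first item in the list above is simple arithmetic on the M-of-N threshold. A minimal sketch follows; the function names are hypothetical and do not come from any DR tooling.

```python
def tolerated_losses(m: int, n: int) -> int:
    """Number of Operator Key Smartcards that can be lost while M still remain."""
    if not 1 <= m <= n:
        raise ValueError("threshold must satisfy 1 <= m <= n")
    return n - m


def recovery_possible(m: int, n: int, lost: int) -> bool:
    """True if at least M of the N smartcards are still under operator control."""
    if not 0 <= lost <= n:
        raise ValueError("lost must be between 0 and n")
    return n - lost >= m


if __name__ == "__main__":
    # 3-of-5 example from the list above: losing two is tolerated, three is not.
    print(tolerated_losses(3, 5))
    print(recovery_possible(3, 5, lost=2), recovery_possible(3, 5, lost=3))
```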

Components

flowchart TD
    shard1-enc(Shard 1 <br>Encrypted)
    secrets-enc(Encrypted Secrets)
    loc1-prvkey-sc(Location 1<br>Key Smartcard)
    loc1-prvkey-enc-pass(Location 1<br>Key Passphrase<br> Encrypted)
    op1-sc(Operator 1<br> Key Smartcard)
    seckey(DR Private Key)
    secrets-dec --> pubkey
    pubkey --> secrets-enc
    pubkey(DR Public Key)
    ssss(Shamir's Secret<br> Sharing System)
    ssss --> seckey
    secrets-enc --> seckey
    seckey --> secrets-dec(Plaintext Secrets)
    op1-sc --> loc1-prvkey-enc-pass
    loc1-prvkey-enc-pass --> loc1-prvkey-sc
    shard1-enc --> loc1-prvkey-sc
    loc1-prvkey-sc --> shard1
    shard1(Shard 1) --> ssss
    shard2(Shard 2) -.-> ssss
    shard3(Shard 3) -.-> ssss
    style shard2 stroke:#222,stroke-width:1px,color:#555,stroke-dasharray: 5 5
    style shard3 stroke:#222,stroke-width:1px,color:#555,stroke-dasharray: 5 5

Note: we assume all PGP encryption subkeys use Curve 25519 + AES 256.

DR Key

  • The PGP asymmetric key pair to which all secrets are directly encrypted
  • We chose the PGP standard because:
    • It is widely supported, with a plurality of implementations and tooling
    • The PGP standard and tooling are assumed to outlive any custom-made tools
    • It should be more reliable than any crypto implementation we maintain
  • More than one DR Key could exist in the future for specialized uses

DR Key Shards

  • A Shamir's Secret Sharing share of the private portion of the DR Key
  • Encrypted to a respective Location Key
  • Stored solely in geographically separate Locations.
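
To illustrate what a shard is, the following is a minimal, educational sketch of Shamir's Secret Sharing over a prime field. It is not the tool used in the ceremony, which would be an existing, reviewed implementation. The prime, the function names, and the encoding of the secret as an integer are assumptions made for this example only.

```python
import secrets

# Assumption for this sketch: a Mersenne prime larger than any 512-bit secret.
PRIME = 2**521 - 1


def split(secret: int, m: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares, any m of which reconstruct it."""
    if not 1 <= m <= n:
        raise ValueError("threshold must satisfy 1 <= m <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine(shares: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shards = split(secret=123456789, m=3, n=5)
    print(combine(shards[:3]) == 123456789)
```

Any three of the five shares reconstruct the secret here. Fewer than three give no information about it, which is why each shard can be stored in a separate Location.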

Location

  • DR Key Shards and Location Keys are distributed to separate Locations
    • The Locations are geographically separated
    • The Locations have a fixed human access list
      • Those with access can however cooperate to transfer access to others
    • Each Location has staff and physical controls that strictly enforce access

Location Keys

  • A PGP Keypair whose private key is stored only in a given Location
    • Shards are encrypted to a Location public key
    • The Location Private Key is used to decrypt shards
  • We are confident only one of each Location private key bundle exists
    • Keys are generated on an airgapped system with witnesses
    • Airgapped system is proven to run code all parties agree to
  • Each Location private key is replicated on three media:
    1. Yubikey 5 (primary)
    2. Encrypted on paper (backup)
    3. Encrypted on an SD card (backup)
  • All media are decrypted/unlocked with the same password carrying 256 bits of entropy

DR Courier

  • A human who is on the access list for a Location
  • Capable of retrieving a Shard and its Location Key
  • We expect a Shard and its Location Key to be accessible by only one DR Courier
  • May be distinct from the Operator, but this is not strictly necessary
  • Must be highly trusted, but does not have to be technically skilled

Operator

  • A human who is capable of decrypting data with a Location Key
  • They use their Operator Key to decrypt the password of a Location Key

Operator Key

  • Each Operator has their own Operator Key on a smartcard
  • Operator Key smartcards have a simple, known authorization PIN
  • Security relies on geographic separation of the Operator Keys and Shard
  • Can decrypt the PIN or backup password for a specific DR Location Key

DR Warehouse

  • Online storage for encrypted data replicated across multiple providers
  • All data in DR Warehouse can only be decrypted by the DR Key
  • Tolerate loss of any single provider by duplicating data to all of them
  • Storage backends can be any combination of the following:
    • S3 Compatible object stores:
      • AWS, Google Cloud, DigitalOcean, Azure, etc.
    • Version control systems:
      • GitHub, AWS CodeCommit, Gitea
  • We tolerate a loss of all but one DR storage backend
  • A minimum of three storage backends should be maintained
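
The replication rule above (write every ciphertext to all backends, recover from any one that survives) can be sketched as follows. `MemoryBackend` is a hypothetical in-memory stand-in for an S3 bucket or git remote, and naming objects by their SHA-256 digest is an assumption of this example, not a stated requirement of the DR Warehouse.

```python
import hashlib


class MemoryBackend:
    """Hypothetical stand-in for one storage backend."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: dict[str, bytes] = {}
        self.available = True

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self.objects[key]


def replicate(ciphertext: bytes, backends: list[MemoryBackend]) -> str:
    """Store the ciphertext on every backend; return its content-derived key."""
    key = hashlib.sha256(ciphertext).hexdigest()
    for backend in backends:
        backend.put(key, ciphertext)
    return key


def fetch(key: str, backends: list[MemoryBackend]) -> bytes:
    """Return the first intact copy found on any surviving backend."""
    for backend in backends:
        try:
            data = backend.get(key)
        except (ConnectionError, KeyError):
            continue  # backend lost or object missing: try the next one
        if hashlib.sha256(data).hexdigest() == key:
            return data
    raise LookupError("no backend holds an intact copy")
```

Because only ciphertext is stored, the backends need availability and integrity, not confidentiality. The digest check lets a reader detect a corrupted copy and fall through to the next backend.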

DR Ceremony System

  • Laptop trusted to be untampered by all ceremony participants
  • Could be a brand new laptop, or one tightly held under multi-party custody
  • Has firmware capable of attesting the hash of an OS ISO we wish to use
    • Allows all parties to form trust no single party tampered with the OS

DR Ceremony OS

  • Linux operating system
  • Basic shell tools
    • bash, sed, grep, awk, etc.
  • Cryptography tools
    • gpg, openssl, etc.
  • A Shamir's Secret Sharing tool
    • ssss, horcrux, or similar
  • Built by multiple people who confirmed identical hashes
  • AirgapOS is suggested:

DR Generation Script

Routine Inputs:

  • Desired m-of-n threshold for Shamir Secret Sharing split of the DR Key
  • N * 2 unprovisioned Yubikeys
  • N * 2 + 1 SD cards

Subroutines:

  • DR Key generation:

    1. Generate a PGP key with an encryption subkey and signing subkey.
  • DR Operator Key generation:

    1. For each Operator
      1. Set a simple, well documented PIN for a Yubikey
      2. Provision a PGP key with an encryption subkey directly onto the Yubikey
      3. Sign the public key cert of the generated key with the DR Key
  • Location Key generation:

    1. Generate a PGP key with an encryption subkey using multiple entropy sources
    2. Generate a random, 43 char password
    3. Encrypt PGP secret key to the generated password
    4. Encrypt password to the Operator Keys associated with this Location
    5. Export the encrypted PGP secret key to paper and an SD Card
    6. Provision a Yubikey with the generated password as the PIN
    7. Upload the Location Key onto the Yubikey
    8. Sign Location Public Key with DR Key
  • Shards Generation:

    1. Split DR Key secret with Shamir Secret Sharing
      • M is the reconstruction threshold
      • N is the total number of shards
    2. Encrypt each shard to its assigned Location Key
  • Generate DR Key Verification Challenge

    1. Encrypt a random string to the DR Key
    2. Decrypt random string ciphertext with DR Key and verify
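
One way to read the 43-character password in the Location Key generation steps: 32 random bytes (256 bits) encoded as unpadded URL-safe base64 come out to exactly 43 characters. Whether the generation script uses this encoding is an assumption; the sketch below only shows that the two numbers are consistent. The function name is hypothetical.

```python
import secrets


def generate_location_password() -> str:
    """Return a 43-character password drawn from 32 bytes of OS randomness.

    secrets.token_urlsafe(32) base64url-encodes 32 random bytes and strips
    the padding, which yields 43 characters.
    """
    return secrets.token_urlsafe(32)


if __name__ == "__main__":
    password = generate_location_password()
    print(len(password))
```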

Routine Outputs:

  • N Operator Key Yubikeys
  • N SD Cards ("Operator Cards") each containing a plaintext Operator Key
  • N pieces of paper each containing a plaintext Operator Key
  • N Location Key Yubikeys
  • N SD Cards ("Shard Cards") containing:
    • Location Key backup (encrypted to password)
    • Encrypted password (for Location Key)
    • Shard encrypted to Location Key
  • N pieces of paper with a Location Key backup (encrypted to password)
  • SD card ("Public Card") containing:
    • DR Key public key
    • All DR Operator Key public keys and DR Key signatures over them
    • All DR Location Key public keys and DR Key signatures over them
    • DR Key verification challenge

DR Reconstitution Script

Routine Inputs:

  • All Shards
  • m-of-n encrypted Location Key PINs
  • m-of-n Location Key Yubikeys
  • m-of-n Operator Yubikeys

Routine:

  1. For each of m Operators:
    1. Prompt the Operator to insert their Operator Key Yubikey
    2. Prompt the Operator to insert the Location Key Yubikey
    3. The Operator Key is used to decrypt the password
    4. The decrypted password is used to authorize the Location Key Yubikey
    5. The Location Key Yubikey is used to decrypt the Shard
    6. The decrypted Shard is held in memory with the other decrypted Shards
  2. The decrypted Shards are assembled with the Shamir Secret Sharing tool, outputting the DR Key
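
The control flow of this routine can be sketched as a loop that collects one decrypted Shard per Operator, followed by a single combine step once m Shards are in memory. Everything named here (`OperatorSlot`, its fields, `reconstitute`) is hypothetical; in a real script the two callables would wrap smartcard operations, and `combine` would be the Shamir Secret Sharing tool.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class OperatorSlot:
    """Hypothetical bundle of what one Operator brings to the ceremony."""

    encrypted_password: bytes
    encrypted_shard: bytes
    # Stands in for the Operator Key Yubikey decrypting the password.
    decrypt_password: Callable[[bytes], str]
    # Stands in for the Location Key Yubikey, unlocked with that password.
    decrypt_shard: Callable[[str, bytes], bytes]


def reconstitute(
    slots: Sequence[OperatorSlot],
    m: int,
    combine: Callable[[list[bytes]], bytes],
) -> bytes:
    """Decrypt m Shards in turn, then assemble them once into the DR Key."""
    if len(slots) < m:
        raise ValueError(f"need {m} operators, only {len(slots)} present")
    shards: list[bytes] = []
    for slot in slots[:m]:
        password = slot.decrypt_password(slot.encrypted_password)
        shards.append(slot.decrypt_shard(password, slot.encrypted_shard))
    # Shards are held only in memory until the threshold is reached.
    return combine(shards)
```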

Routine Output:

  • Plaintext DR Key

DR Generation Ceremony

flowchart LR
    subgraph storage[Online Storage]
        loc-pubkey(Location<br> Public Keys)
        op-pubkey(Operator<br> Public keys)
        pubkey(DR Public Key)
        shard-enc(Encrypted<br> Shards 1-N)
    end

    subgraph generation[Generation]
        seckey(DR Private Key)
        ssss[Shamir's Secret<br> Sharing System]
        seckey --> ssss
        ssss --> shard(Shards 1-N)
    end

    subgraph location[Location 1-N]
        loc-prvkey-sc(Location Key<br> Smartcard)
        loc-prvkey-enc-paper(Location Key<br> Encrypted<br> Paper Backup)
        loc-prvkey-enc-pass(Location Key<br> Encrypted<br> Passphrase)
    end

    subgraph operator[Operator 1-N]
        op-sc(Operator Key <br> Smartcard)
        op-sd(Operator Key<br> Plaintext SD)
        op-ppr(Operator Key<br> Plaintext Paper)
    end

    generation --> operator
    generation --> storage
    generation --> location

Requirements:

  • N Operators
  • 1 Ceremony Laptop with Coreboot/Heads firmware
  • 1 flash drive containing:
    • Multi-party signed ISO of AirgapOS
    • DR Generation Script
    • Shamir's Secret Sharing binary
  • N new Operator Yubikeys
  • N new Location Key Yubikeys
  • N*2+1 new SD Cards
  • N*3 safety deposit bags
  • 1 bottle of glitter nail polish
  • 1 phone/camera of any kind
    • Prefer offline device that can save images to SD card
  • 1 Disposable inkjet printer with ink and paper
    • Should have no wifi, or have wifi physically removed

Steps:

  1. Boot Ceremony Laptop to Heads firmware menu
  2. Drop to a debug shell
  3. Insert and mount AirgapOS flash drive
  4. Generate hashes of the AirgapOS flash drive contents
  5. Multiple parties verify hashes are expected
  6. Boot ISO file
  7. Run DR Generation Script
  8. Reboot system
  9. Run DR Reconstitution Ceremony
  10. Decrypt DR Key Verification Challenge
  11. Shut down system
  12. Print contents of Shard Cards from generation script
  13. Print contents of Operator Cards from generation script
  14. Seal Location Key artifacts
    • Open new safety deposit box bag
    • Insert printed backups of "Shard Cards"
    • Insert respective Location Key Smartcard
    • Cover seals of closed safety deposit bags in glitter nail polish
    • Take and back up close-up pictures of all safety deposit bag seals
  15. Seal Operator Card Artifacts into "Inner bag"
    • Open new safety deposit box bag
    • Insert printed backup of Operator Card
    • Insert Operator Card
    • Cover seals of closed safety deposit bags in glitter nail polish
  16. Seal Operator Smartcard
    • Open new safety deposit box bag
    • Insert Inner bag
    • Insert Operator Smartcard
    • Cover seals of closed safety deposit bags in glitter nail polish
  17. Take closeup pictures of all safety deposit bag seals
  18. Commit relevant artifacts to GitHub
    • From Public Card:
      • Public DR Key
      • Public DR Shard Encryption Keys + signatures from the DR Key
      • Public DR Operator Keys + signatures from the DR Key
    • Images of the sealed bags
    • Signatures over all of this content by multiple witness Personal PGP keys
  19. Hand off sealed bags with shard material to DR Couriers.
  20. Hand off sealed bags with Operator Key material to respective Operators.
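
Steps 4 and 5 depend on every party knowing the expected hashes in advance. Inside the Heads debug shell a tool such as sha256sum would more likely be used; the Python sketch below only illustrates how expected values could be computed beforehand on a separate machine. SHA-256 and the function names are assumptions of this example.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large ISO does not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root: Path) -> dict[str, str]:
    """Map each file under `root` (by relative path) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Each party can run this independently against their own copy of the flash drive contents and compare the resulting mapping line by line during the ceremony.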

DR Reconstitution Ceremony

flowchart LR

    subgraph storage[Online Storage]
        shard-enc(Encrypted<br> Shards 1-N)
        secrets-enc(Encrypted Secrets)
    end

    storage --> recovery

    subgraph location1[Location 1]
    loc1-prvkey-sc(Location 1<br>Key<br> Smartcard)
    loc1-prvkey-enc-pass(Location 1<br>Key<br> Encrypted<br> Passphrase)
    end

    subgraph location2[Location 2]
    loc2-prvkey-sc(Location 2<br>Key<br> Smartcard)
    loc2-prvkey-enc-pass(Location 2<br>Key<br> Encrypted<br> Passphrase)
    end

    subgraph locationN[Location N]
    locN-prvkey-sc(Location N<br>Key<br> Smartcard)
    locN-prvkey-enc-pass(Location N<br>Key<br> Encrypted<br> Passphrase)
    end

    subgraph recovery[Recovery]
        seckey(DR Private Key)
        ssss[Shamir's Secret<br> Sharing System]
        ssss --> seckey
        seckey --> secrets-dec(Decrypted Secrets)
        shard1(Shard 1) --> ssss
        shard2(Shard 2) --> ssss
        shardN(Shard N) --> ssss
    end

    location1 --> recovery
    location2 --> recovery
    locationN --> recovery
    op1-sc(Operator 1<br> Smartcard) --> recovery
    op2-sc(Operator 2<br> Smartcard) --> recovery
    opN-sc(Operator N<br> Smartcard) --> recovery

Requirements:

  • DR Key Public Key
  • M-of-N Operators
  • M-of-N Operator Key Yubikeys
  • M-of-N Location Keys
  • M-of-N Encrypted Location Key PINs
  • M-of-N Shards

Steps:

  1. Boot Ceremony Laptop to Heads firmware menu
  2. Drop to a debug shell
  3. Insert and mount AirgapOS flash drive
  4. Generate hash of AirgapOS ISO file and DR Reconstitution Script
  5. Multiple parties verify hashes are expected
  6. Boot ISO file
  7. Run DR Reconstitution Script
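
Step 5 is a human check, but the decision rule behind it can be stated precisely: boot only if enough witnesses each report a hash and every reported hash matches the expected one. A minimal sketch follows; the function names and the default minimum of two witnesses are assumptions, since the steps above only say "multiple parties".

```python
def dissenting_witnesses(expected: str, reported: dict[str, str]) -> list[str]:
    """Return the names of witnesses whose reported hash differs from expected."""
    want = expected.strip().lower()
    return sorted(
        name
        for name, digest in reported.items()
        if digest.strip().lower() != want
    )


def safe_to_boot(
    expected: str, reported: dict[str, str], min_witnesses: int = 2
) -> bool:
    """True only if enough witnesses reported and none of them disagree."""
    return (
        len(reported) >= min_witnesses
        and not dissenting_witnesses(expected, reported)
    )
```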