From c6c6702cd8366ebed70dafb6afe87a95ca4c2072 Mon Sep 17 00:00:00 2001 From: "Lance R. Vick" Date: Fri, 4 Aug 2023 14:59:58 -0700 Subject: [PATCH] initial commit --- commit-signing.md | 67 +++++++ disaster-recovery.md | 397 +++++++++++++++++++++++++++++++++++++ production-engineering.md | 380 +++++++++++++++++++++++++++++++++++ random-red-team.md | 86 ++++++++ secrets.md | 45 +++++ security.md | 77 +++++++ share-recovery.md | 144 ++++++++++++++ signed-git-workflows.md | 35 ++++ ssh.md | 172 ++++++++++++++++ webauthn-custody.md | 185 +++++++++++++++++ webauthn-kyc-encryption.md | 12 ++ 11 files changed, 1600 insertions(+) create mode 100644 commit-signing.md create mode 100644 disaster-recovery.md create mode 100644 production-engineering.md create mode 100644 random-red-team.md create mode 100644 secrets.md create mode 100644 security.md create mode 100644 share-recovery.md create mode 100644 signed-git-workflows.md create mode 100644 ssh.md create mode 100644 webauthn-custody.md create mode 100644 webauthn-kyc-encryption.md diff --git a/commit-signing.md b/commit-signing.md new file mode 100644 index 0000000..09bdb8a --- /dev/null +++ b/commit-signing.md @@ -0,0 +1,67 @@ +# Commit signing + +Signing all commits has significant value when you do it universally, and +all the more when you have CI that expects it. + +## Rationale + +### Eliminate CI as an attack surface + +Once you have a CI system that rejects any unsigned commits, you avoid an +attacker with git access being able to explore or execute code on these systems +at all. + +### Unambiguous Authorship + +Short of enforced signing, anyone can commit with the email of anyone else. A +typical VCS like Git has no way to protect against this. + +If someone submits 5 very similar PRs, one might review the first one in detail +then not inspect the rest quite as close, because humans. 
By contrast however +if 4 of the 5 are signed, suddenly the outlier is suspect, could be from +someone else entirely, and will get additional scrutiny. + +A dishonest or disgruntled employee could easily rewrite history to make a +mistake appear to have been committed by someone else, instead of having to +come to terms with it. + +When engineers all sign their commits both at work and in open source projects, +it avoids reputation hijacking where someone might commit malicious code to +a repo with a naive maintainer hoping they will blindly trust your name without +careful inspection. + +### Distributed Accountability + +Only signing tags implies you are approving the whole history + +By signing only tags you place the full burden of reviewing all history and +every line of code between releases on the individual cutting a release. In a +large codebase this is an unreasonable expectation. + +Having signing and review at every PR between releases distributes +responsibility more fairly. When you sign a commit or sign a merge commit after +a PR you are signing only that one chunk of code or subset of changes and +creating a smaller set of checks and balances through the release cycle. + +### Proof of code-review + +Once a merge to master is done, CI should fail to build unless the merge commit +itself at HEAD is signed, and signed by someone different than the authors of +the included commits. + +## Setup + +### Git + +Adjust your ~/.gitconfig to be similar to the following: + +``` +[user] + email = john@doe.com + name = John H. 
Doe + signingKey = 6B61ECD76088748C70590D55E90A401336C8AAA9 # get from 'gpg --list-keys --with-colons' +[commit] + gpgSign = true +[gpg] + program = gpg2 +``` diff --git a/disaster-recovery.md b/disaster-recovery.md new file mode 100644 index 0000000..46430a9 --- /dev/null +++ b/disaster-recovery.md @@ -0,0 +1,397 @@ +# Disaster Recovery + +## Overview + +This document outlines the creation of a "Disaster Recovery" (DR) system which +functions as a one-way box that we can encrypt secrets to at any time, but only +recover them with cooperation of a quorum of people with access to multiple +offline HSM devices stored in a diversity of physical locations. + +In short, it should be trivial to backup data, but very expensive to recover; +recovery is difficult to do for an attacker, accepted to be time-consuming, +logistically complicated and financially costly. + +Data is backed up by encrypting plaintext to a [Disaster Recovery Key](#dr-key). +The resulting ciphertext is then stored in the +[Disaster Recovery Warehouse](#dr-warehouse). In the case of a disaster, +ciphertexts can be gathered from the DR Warehouse and then decrypted using the +DR Key to regain access to the plaintext. + +## Threat Model + +Each of the below can always be independently tolerated if only one of them +happens and the disaster recovery goes to plan. However if more than one of +the below happen, it is possible for the DR key to either be lost or stolen. + +- N-M operators lose control of their Operator Key Smartcards + - Where an M-of-N threshold is 3-of-5, then a loss of two is tolerated +- An entire cloud provider or account becoming unavailable +- An offline backup location is fully destroyed +- An adversary gains control of any single DR participant +- An adversary gains control of a single offline backup location +- An adversary has malware that can control any internet connected computer + +## Components + +```mermaid +flowchart TD + shard1-enc(Shard 1
Encrypted) + secrets-enc(Encrypted Secrets) + loc1-prvkey-sc(Location 1
Key Smartcard) + loc1-prvkey-enc-pass(Location 1
Key Passphrase
Encrypted) + op1-sc(Operator 1
Key Smartcard) + seckey(DR Private Key) + secrets-dec --> pubkey + pubkey --> secrets-enc + pubkey(DR Public Key) + ssss(Shamir's Secret
Sharing System) + ssss --> seckey + secrets-enc --> seckey + seckey --> secrets-dec(Plaintext Secrets) + op1-sc --> loc1-prvkey-enc-pass + loc1-prvkey-enc-pass --> loc1-prvkey-sc + shard1-enc --> loc1-prvkey-sc + loc1-prvkey-sc --> shard1 + shard1(Shard 1) --> ssss + shard2(Shard 2) -.-> ssss + shard3(Shard 3) -.-> ssss + style shard2 stroke:#222,stroke-width:1px,color:#555,stroke-dasharray: 5 5 + style shard3 stroke:#222,stroke-width:1px,color:#555,stroke-dasharray: 5 5 +``` + +Note: we assume all PGP encryption subkeys use Curve 25519 + AES 256. + +### DR Key + +- PGP asymmetric key pair all secrets are directly encrypted to +- We chose the PGP standard because: + - It is widely supported with a plurality of implementations and tooling + - The PGP standard and tooling are assumed to outlive any custom made tools + - Should be more reliable than any crypto implementation we maintain +- More than one DR key could possibly exist in the future for specialized uses + +### DR Key Shards + +- A Shamir's Secret Sharing share of the private portion of the DR Key +- Encrypted to a respective Location Key +- Stored solely in geographically separate Locations. + +### Location + +- DR Key Shards and Location Keys are distributed to separate Locations + - The Locations are geographically separated + - The Locations have a fixed human access list + - Those with access can however cooperate to transfer access to others + - Each Location has staff and physical controls that strictly enforce access + +### Location Keys + +- A PGP Keypair whose private key is stored only in a given Location + - Shards are encrypted to a Location public key + - The Location Private Key is used to decrypt shards +- We are confident only one of each Location private key bundle exists + - Keys are generated on an airgapped system with witnesses + - Airgapped system is proven to run code all parties agree to +- Each Location private key is replicated on three mediums: + 1. Yubikey 5 (primary) + 2.
Encrypted on paper (backup) + 3. Encrypted on an SD card (backup) +- All mediums are decrypted/unlocked with the same 256-bit entropy password + +### DR Courier + +- A human who is on the access list for a Location +- Capable of retrieving a Shard and its Location Key. +- We expect a Shard and its Location Key are only accessible by one DR Courier +- May be distinct from the Operator, but this is not strictly necessary +- Must be highly trusted, but does not have to be technically skilled + +### Operator + +- A human who is capable of decrypting data with a Location Key +- They use their Operator Key to decrypt the password of a Location Key + +### Operator Key + +- Each Operator has their own Operator Key on a smartcard +- Operator Key smartcards have a simple, known authorization PIN +- Security relies on geographic separation of the Operator Keys and Shards +- Can decrypt the PIN or backup password for a specific DR Location Key + +### DR Warehouse + +- Online storage for encrypted data replicated across multiple providers +- All data in DR Warehouse can only be decrypted by the DR Key +- Tolerate loss of any single provider by duplicating data to all of them +- Storage backends can be any combination of the following: + - S3 Compatible object stores: + - AWS, Google Cloud, DigitalOcean, Azure, etc. + - Version control systems: + - GitHub, AWS CodeCommit, Gitea +- We tolerate a loss of all but one DR storage backend +- A minimum of three storage backends should be maintained + +### DR Ceremony System + +- Laptop trusted to be untampered by all ceremony participants +- Could be a brand new laptop, or one tightly held under multi-party custody +- Has firmware capable of attesting the hash of an OS ISO we wish to use + - Allows all parties to trust that no single party tampered with the OS + +### DR Ceremony OS + +- Linux operating system +- Basic shell tools + - bash, sed, grep, awk, etc.
+- Cryptography tools + - gpg, openssl, etc +- A Shamirs Secret Sharing tool + - ssss, horcrux, or similar +- Built by multiple people who confirmed identical hashes +- AirgapOS is suggested: + - + +### DR Generation Script + +**Routine Inputs:** + +- Desired m-of-n threshold for Shamir Secret Sharing split of Quorum key +- N \* 2 unprovisioned yubikeys +- N \* 2 + 1 SD cards + +**Subroutines:** + +- DR Key generation: + + 1. Generate a PGP key with an encryption subkey and signing subkey. + +- DR Operator Key generation: + + 1. For each Operator + 1. Set a simple, well documented PIN for a Yubikey + 2. Provision a PGP key with an encryption subkey directly onto the Yubikey + 3. Sign the public key cert of the generated key with the DR Key + +- Location Key generation: + + 1. Generate a PGP key with an encryption subkey using multiple entropy sources + 2. Generate a random, 43 char password + 3. Encrypt PGP secret key to the generated password + 4. Encrypt password to the Operator Keys associated with this Location + 5. Export the encrypted PGP secret key to paper and an SD Card + 6. Provision a yubikey with the generated password as the PIN + 7. Upload the Location Key onto the yubikey + 8. Sign Location Public Key with DR Key + +- Shards Generation: + + 1. Split DR Key secret with Shamir Secret Sharing + - M is the reconstruction threshold + - N is the total number of shards + 2. Encrypt each shard to its assigned Location Key + +- Generate DR Key Verification Challenge + + 1. Encrypt a random string to the DR Key + 2. 
Decrypt random string ciphertext with DR Key and verify + +**Routine Outputs:** + +- N Operator Key Yubikeys +- N SD Cards ("Operator Cards ") each containing a plaintext Operator Key +- N pieces of paper each containing a plaintext Operator Key +- N Location Key Yubikeys +- N SD Cards ("Shard Cards") Containing: + - Location Key backup (encrypted to password) + - Encrypted password (for Location Key) + - Shard encrypted to Location Key +- N pieces of paper with a Location Key backup (encrypted to password) +- SD card ("Public Card") containing: + - DR Key public key + - All DR Operator Key public keys and DR Key signatures over them + - All DR Location Key public keys and DR Key signatures over them + - DR Key verification challenge + +### DR Reconstitution Script + +**Routine Inputs:** + +- All Shards +- m-of-n encrypted Location Key PINs +- m-of-n Location Key Yubikeys +- m-of-n Operator Yubikeys + +**Routine:** + +1. For m operators: + 1. prompt for Operator to insert Operator Key Yubikey + 2. prompt for Operator to insert Location Key Yubikey + 3. Operator key is used to decrypt password + 4. Decrypted password is used to authorize Location Key Yubikey + 5. Location Key Yubikey is used to decrypt Shard + 6. Decrypted shard is persisted in memory with other decrypted shards + 7. Decrypted shards are assembled with Shamir Secret Sharing tool, outputting + DR Key + +**Routine Output:** + +- Plaintext DR Key + +## DR Generation Ceremony + +```mermaid +flowchart LR + subgraph storage[Online Storage] + loc-pubkey(Location
Public Keys) + op-pubkey(Operator
Public keys) + pubkey(DR Public Key) + shard-enc(Encrypted
Shards 1-N) + end + + subgraph generation[Generation] + seckey(DR Private Key) + ssss[Shamir's Secret
Sharing System] + seckey --> ssss + ssss --> shard(Shards 1-N) + end + + subgraph location[Location 1-N] + loc-prvkey-sc(Location Key
Smartcard) + loc-prvkey-enc-paper(Location Key
Encrypted
Paper Backup) + loc-prvkey-enc-pass(Location Key
Encrypted
Passphrase) + end + + subgraph operator[Operator 1-N] + op-sc(Operator Key
Smartcard) + op-sd(Operator Key
Plaintext SD) + op-ppr(Operator Key
Plaintext Paper) + end + + generation --> operator + generation --> storage + generation --> location +``` + +**Requirements:** + +- N Operators +- 1 Ceremony Laptop with Coreboot/Heads firmware +- 1 flash drive containing: + - Multi-party signed iso of AirgapOS + - DR Generation Script + - Shamirs Secret Sharing Binary +- N new Operator Yubikeys +- N new Location Key Yubikeys +- N\*2+1 new SD Cards +- N\*3 safety deposit bags +- 1 bottle of glitter nail polish +- 1 phone/camera of any kind + - Prefer offline device that can save images to SD card +- 1 Disposable inkjet printer with ink and paper + - Should have no wifi, or have wifi physically removed + +**Steps:** + +1. Boot Ceremony Laptop to Heads firmware menu +2. Drop to a debug shell +3. Insert and mount AirgapOS flash drive +4. Generate hashes of the AirgapOS flash drive contents +5. Multiple parties verify hashes are expected +6. Boot ISO file +7. Run _DR Generation Script_ +8. Reboot system +9. Run _DR Reconstitution Ceremony_ +10. Decrypt _DR Key Verification Challenge_ +11. Shut down system +12. Print contents of _Shard Cards_ from generation script +13. Print contents of _Operator Cards_ from generation script +14. Seal Location Key artifacts + * Open new safety deposit box bag + * Insert printed backups of "Shard Cards" + * Insert respective Location Key Smartcard + * Cover seals of closed safety deposit bags in glitter nail polish + * Take and backup closeup pictures of all safety deposit bag seals +15. Seal _Operator Card_ Artifacts into "Inner bag" + * Open new safety deposit box bag + * Insert printed backup of _Operator Card_ + * Insert _Operator Card_ + * Cover seals of closed safety deposit bags in glitter nail polish +16. Seal _Operator Smartcard_ + * Open new safety deposit box bag + * Insert _Inner bag_ + * Insert _Operator Smartcard_ + * Cover seals of closed safety deposit bags in glitter nail polish +17. Take closeup pictures of all safety deposit bag seals +18. 
Commit relevant artifacts to GitHub + - From _Public Card_: + - Public DR Key + - Public DR Shard Encryption Keys + signatures from the DR Key + - Public DR Operator Keys + signatures from the DR Key + - Images of the sealed bags + - Signatures over all of this content by multiple witness Personal PGP keys +19. Hand off sealed bags with shard material to DR Couriers. +20. Hand off sealed bags with Operator Key material to respective Operators. + +## DR Reconstitution Ceremony + +```mermaid +flowchart LR + + subgraph storage[Online Storage] + shard-enc(Encrypted
Shards 1-N) + secrets-enc(Encrypted Secrets) + end + + storage --> recovery + + subgraph location1[Location 1] + loc1-prvkey-sc(Location 1
Key
Smartcard) + loc1-prvkey-enc-pass(Location 1
Key
Encrypted
Passphrase) + end + + subgraph location2[Location 2] + loc2-prvkey-sc(Location 2
Private Key
Smartcard) + loc2-prvkey-enc-pass(Location 2
Key
Encrypted
Passphrase) + end + + subgraph locationN[Location N] + locN-prvkey-sc(Location N
Key
Smartcard) + locN-prvkey-enc-pass(Location N
Key
Encrypted
Passphrase) + end + + subgraph recovery[Recovery] + seckey(DR Private Key) + ssss[Shamir's Secret
Sharing System] + ssss --> seckey + seckey --> secrets-dec(Decrypted Secrets) + shard1(Shard 1) --> ssss + shard2(Shard 2) --> ssss + shardN(Shard N) --> ssss + end + + location1 --> recovery + location2 --> recovery + locationN --> recovery + op1-sc(Operator 1
Smartcard) --> recovery + op2-sc(Operator 2
Smartcard) --> recovery + opN-sc(Operator N
Smartcard) --> recovery +``` + +**Requirements:** + +- DR Key Public Key +- M-of-N Operators +- M-of-N Operator Key Yubikeys +- M-of-N Location Keys +- M-of-N Encrypted Location Key PINs +- M-of-N Shards + +**Steps:** + +1. Boot Ceremony Laptop to Heads firmware menu +2. Drop to a debug shell +3. Insert and mount AirgapOS flash drive +4. Generate hash of AirgapOS ISO file and DR Reconstitution Script +5. Multiple parties verify hashes are expected +6. Boot ISO file +7. Run DR Reconstitution Script diff --git a/production-engineering.md b/production-engineering.md new file mode 100644 index 0000000..9f52268 --- /dev/null +++ b/production-engineering.md @@ -0,0 +1,380 @@ +# Production Engineering + +## Overview + +The goal of this document is to outline strict processes that those with access +to PRODUCTION systems MUST follow. + +It is intended to mitigate most classes of known threats while still allowing +for high productivity via compartmentalization. + +Production access where one has access to the personal data or assets of others +is to be taken very seriously, and assigned only to those who have tasks that +can not be performed without it. + +These are the rules we wish to meet and model at #! and further recommend for +those who are managing targeted production systems. + +## Assumptions + +1. All of our screens are visible to an adversary +2. All of our keyboards are logging to an adversary +3. Any firmware/bootloaders not verified on every boot are compromised +4. Any host OS with network access is compromised +5. Any guest OS used for any purpose other than prod access is compromised +7. At least one member of the PRODUCTION team is always compromised +8. At least one maintainer of third party code we depend on is compromised + +## Requirements + +1.
A PRODUCTION ENGINEER SHOULD have the following: + * A clear set of tasks that can't be completed without access + * Experience in Red/Blue team, CTF, or documented CVE discoveries + * A demonstrated working knowledge of: + * Low level unix understanding. E.g. /proc, filesystems, kernel modules + * Low level debugging techniques. E.g. strace, inotify, /proc, LD_PRELOAD + * Linux kernel security features. E.g. seccomp, pam, cgroups, permissions + * Common attack classes. E.g. XSS, Buffer Overflows, Social Engineering. + * A majority passed interview panel with current PRODUCTION access team + * An extensive background check clean of any evidence of dishonesty + * Training on secret coercion and duress protocols shared with team +2. A PRODUCTION ENGINEER MUST NOT expose a CRITICAL SECRET to: + * a screen + * a keyboard + * an unsupervised peer + * an internet connected OS other than the destination that requires it +3. A recommended ENTROPY SOURCE MUST be used to generate a CRITICAL SECRET +4. An OS that accesses PRIVILEGED SYSTEMS MUST NOT be used for anything else +5. Any OS without verified boot from firmware up MUST NOT be trusted +6. Manual PRIVILEGED SYSTEM mutations MUST be approved, witnessed, and recorded +7. PRIVILEGED SYSTEM mutations MUST be automated and repeatable via code +8. Any code without SECURITY REVIEW MUST NOT be trusted +9. Any code we can not review ourselves as desired MUST NOT be trusted +10.
PRIVILEGED SYSTEM access requires physical interaction with an approved HSM + +## Implementation + +### Tools + +* HARDENED WORKSTATION +* PERSONAL HSM +* ENTROPY SOURCE + +### Setup + +#### Install QubesOS + +* Enable full disk encryption +* Set up Verified Boot + * Choose "factory reset" after QubesOS install in boot options + * Sign with PERSONAL HSM + * Verify every boot by inserting PERSONAL HSM and observing green LED +* Set up FDE password in hardware device via PERSONAL HSM +* Set up Challenge Response authentication with PERSONAL HSM + * PERSONAL HSM + password to unlock screen + * Automatically lock screen when PERSONAL HSM removed + +#### Configure Qubes + +##### Vault + +* MUST NOT have internet access +* Example use cases: + * Personal GPG keychain management + * Bulk encrypting/decrypting documents + * Provisioning OTP devices + +##### Production + +* MUST have internet access limited to only bastion servers +* MUST manage all credentials via individually approved hardware decryption + * password-store via PERSONAL HSM or mooltipass +* SHOULD use whonix as the network gateway +* Used only to reach PRODUCTION bastion servers and systems beyond them + +##### Work + +* MUST have internet access limited to organization needs and partners +* MUST manage all credentials via individually approved hardware decryption + * password-store via PERSONAL HSM or mooltipass +* SHOULD use whonix as the network gateway +* Example use cases: + * Read only access to AWS panel + * Observe Kibana logs + * Check organization email and chat + +##### Personal + +* No internet access limits +* SHOULD only be used for personal services +* SHOULD use whonix as the network gateway +* Example use cases: + * Check personal email / chat + * Personal finance + +##### Development + +* No internet access limits +* SHOULD only be used for development +* SHOULD use whonix as the network gateway +* SHOULD manage all credentials via individually approved hardware decryption + * 
password-store via PERSONAL HSM +* Example use cases: + * Read online documentation + * Authoring code + * Submitting PRs + * Doing code review + +##### Disposable + +* SHOULD only be used for development +* SHOULD use whonix as the network gateway +* Example use cases: + * Pentesting + * Testing untrusted code or applications + * Competitive research + * Explore dark web portals for data dumps + +### Setup Keychain + +* Follow "Advanced" GPG setup guide in Vault Qube + * Daily driver PERSONAL HSM holds subkeys + * Separate locked-away PERSONAL HSM holds master key + * Separate locked-away flash drive holds pubkeys and encrypted private keys + +## Workflow + +The following describes the workflow taken to get signed code/artifacts that +have become stabilized in the DEVELOPMENT environment and are ready to enter +the pipeline towards production. + +### Local + +* PRODUCTION ENGINEER or Software Engineer + * Authors/tests changes to INFRASTRUCTURE REPOSITORY in LOCAL environment + * makes feature branch + * makes one or more commits to feature branch + * signs all commits with PERSONAL HSM + * Optional: Squashes and re-signs sets of commits as desired + * Submits code to peer for review +* PRODUCTION ENGINEER + * Verifies changes work as intended in "local" environment + * Verifies changes have solid health checks and recovery practices + * Merges reviewed branch into master branch with signed merge commit + +### Staging + +* PRODUCTION ENGINEER #1 + * Copies desired changes from LOCAL templates to STAGING templates + * makes feature branch + * makes one or more commits to feature branch + * signs all commits with PERSONAL HSM + * Optional: Squashes and re-signs sets of commits as desired + * Submits code to peer for review +* PRODUCTION ENGINEER #2 + * Optional: significant changes/migrations are tested in "sandbox" env.
+ * Merges reviewed branch into master branch with signed merge commit +* PRODUCTION ENGINEER #1 + * Logs into STAGING toolbox + * Deploys changes from STAGING "toolbox" + +### Production + +* PRODUCTION ENGINEER #1 + * Copies desired changes from STAGING templates to PRODUCTION templates + * makes feature branch + * makes one or more commits to feature branch + * signs all commits with PERSONAL HSM + * Optional: Squashes and re-signs sets of commits as desired + * Submits code to peer for review +* PRODUCTION ENGINEER #2 + * Optional: significant changes/migrations are tested in "sandbox" env. + * Merges reviewed branch into master branch with signed merge commit +* 2+ PRODUCTION ENGINEERs + * Log into PRODUCTION toolbox via PRODUCTION bastion + * Deploy changes from PRODUCTION "toolbox" with witness + +### Emergency Changes + +* 2+ PRODUCTION ENGINEERs + * Log into PRODUCTION toolbox via PRODUCTION bastion + * Deploy live fixes from PRODUCTION "toolbox" with witness +* Regular Production process is followed from here + +## Changes + +Amendments to this document including the Exceptions section may be possible +with a majority vote of current members of the Production Engineering team. + +All changes must be via cryptographically signed commits by current Production +Engineering team members to limit risk of social engineering. + +Direct orders from your superiors that conflict with this document should be +considered a product of duress, and thus respectfully ignored. + +## Appendix + +### Glossary + +#### MUST, MUST NOT, SHOULD, SHOULD NOT, MAY + +These key words correspond to their IETF definitions per RFC 2119 + +#### INFRASTRUCTURE REPOSITORY + +This repo, which is where all infrastructure-as-code gets integrated via +direct templates or submodules as appropriate. + +#### PRODUCTION + +Deployment environment that faces the public internet and is consumed by end +users.
+ +#### STAGING + +Internally facing environment that is identical to PRODUCTION and will normally +be one release ahead. Intended for use by contributors to test our services and +deployment process before we deploy to any public facing environments. + +#### DEVELOPMENT + +Environment where changes are rapidly integrated to aid integration and +development, with or without code review. + +This environment is never trusted. + +#### LOCAL + +Environment designed to run in virtual machines on the workstation of every +engineer. Designed to behave as close as possible to our production environment +so engineers can rapidly test changes and catch issues early without waiting on +longer deployment round trips to deploy to DEVELOPMENT. + +This environment is intended to be the hand-off point between unprivileged +contributors and the PRODUCTION ENGINEER team. + +#### SECURITY REVIEW + +We consider code to be suitably reviewed if it meets the following criteria: + +* Validated by a member of the PRODUCTION ENGINEERING team to do what is expected +* Audited for vulnerabilities by an approved security reviewer + +##### Approved security reviewers + +* PRODUCTION ENGINEERING team +* Doyensec +* Cure53 +* Arch Linux +* Bitcoin Core +* Canonical +* Cloud Native Computing Foundation +* CoreOS +* QubesOS +* Raptor Engineering +* Fedora Foundation +* FreeBSD Foundation +* OpenBSD Foundation +* Gentoo Foundation +* Google +* Guardian Project +* Hashicorp +* Inverse Path +* Linux Foundation +* The Debian Project +* Tor Project +* ZCash Foundation + +#### HARDENED WORKSTATION + +A workstation that will come in contact with production write access must meet +the following standards: + +* Requires PERSONAL HSM to do firmware/boot integrity attestation every boot +* Open firmware (with potential exception of Management Engine blob) +* CPU Management Engine disabled or removed +* Physical switches for microphone, webcam, wireless, and bluetooth +* PS/2 interface for Keyboard/touchpad to
mitigate USB spoof/crosstalk attacks + +##### Recommended devices + +* Librem 15 +* Librem 13 +* Insurgo PrivacyBeast X230 +* Raptor Computing Blackbird +* Raptor Computing Talos II + +#### ENTROPY SOURCE + +Good entropy sources should always be impossible to predict for a human. These +are also typically called True Random Number Generators (TRNGs). + +In the event that we can not -prove- an entropy source is impossible to predict +then multiple unrelated entropy sources must be used and combined with each +other. + +A given string of entropy must: +* Be at least 256 bits long +* Be whitened so there are no statistically significant patterns + * sha3, xor-encrypt-xor, etc + +##### Approved entropy sources + +* Infinite Noise +* Built-in hardware RNG in a PERSONAL HSM +* Webcam +* Microphone +* Thermal resistor +* Dice + +#### PERSONAL HSM + +Small form factor HSM capable of doing common required cryptographic operations +such as GnuPG smartcard emulation, TOTP, challenge/response, etc. + +The following devices are recommended for each respective use case. + +##### WebAuthN / U2F + +* Yubikey 5+ +* u2f-zero +* Nitrokey +* OnlyKey +* MacOS Touchbar +* ChromeOS Fingerprint ID + +##### Password Management + +* Yubikey 5+ +* Trezor Model T +* Ledger Nano X +* Mooltipass + +##### Encryption/Decryption/SSH + +* Yubikey 5+ +* Trezor Model T +* Ledger Nano X + +##### Firmware Attestation + +* Librem Key +* Nitrokey + +#### PRODUCTION ENGINEER + +One who has limited or complete access to detailed financial data or documents +of our customers, as well as any access that might allow mutation or movement +of their assets. + +#### PRIVILEGED SYSTEM + +Any system that has any level of access beyond that which is provided to all +members of the engineering team.
+ +#### CRITICAL SECRET + +Any secret that has partial or complete power to move customer assets. + +If a given computer -ever- has PRODUCTION access, then secrets used to manage +that system such as login, full disk encryption etc are in scope. diff --git a/random-red-team.md b/random-red-team.md new file mode 100644 index 0000000..ab456e1 --- /dev/null +++ b/random-red-team.md @@ -0,0 +1,86 @@ +# Random Red Team + +## Summary + +This document describes intentionally introducing security vulnerabilities +into projects to test code review processes and foster a healthy and expected +culture of distrust and higher security scrutiny during code reviews regardless +of social standing, or experience level of the author. + +## Motivation + +In modern organizations it is very commonplace for code to be reviewed for +suboptimal patterns, poor commenting etc. It is far less common that code +is carefully scrutinized for security, particularly around tough deadlines, +or when the code is coming from Sr. engineers that are well trusted. + +Likewise third party package inclusions such as new NPM dependencies are often +not audited at all. + +This culture of trust actually creates non-intuitive danger for contributors +as now any of them could be coerced by a sophisticated adversary such as a +government (See Australia's Access And Assistance Bill 2018). + +If a culture of high security scrutiny during code review is created then a +coercion or supply chain dependency attack is no longer as desirable or +worth the risk for an adversary, which in turn puts contributors at less risk. + +This tactic might also further help to prevent subtle Heartbleed style +accidents. + +## Design + +In short, we seek to gamify a reward system for better security practices. + +A typical example of this is encouraging screen locking by using unlocked +machines as a method to social engineer the delivery of donuts by the victim +via impersonation.
Another is to encourage badge checks by introducing +badgeless "secret shoppers" that hold rewards for those that challenge them. + +This approach extends this idea to code review. + +The scheme is as follows: + +1. One engineer is picked randomly from a pool of participants every "sprint" +or similar time period to be a "bad actor". + +2. During this time period, in addition to regular duties, the engineer has +a free pass to try to sneak some type of vulnerability past code review that +would allow them some ability to control a private key, execute code, or other +attack that would give an outside adversary some significant advantage in +defeating system security or privacy. + +3. Security and Release engineering teams are always informed of the current +"bad actor" and know not to actually release any code they are involved in. + * Organizers can play a role in teaching typical red team tactics but can not + have any direct participation in an attack. + +4. The organization puts up a bounty that is provided to anyone that successfully +spots a vulnerability, OR a higher one to the "bad actor" that successfully +gets a peer to approve of an exploit. + +5. "Bad actor" and security/release engineering teams are all responsible for +calling out the introduced vulnerability before it can get past a dev environment. + +## Drawbacks + +* Engineers are constantly suspicious of their peers + * Counter: They should be! Anyone could be compromised at any time. + +* Engineers may spend more time thinking about security rather than features + * Counter: Accept that higher security slows down development for quality. + +* Engineers have a motivation to leave security vulnerabilities for their turn + * Counter: Provide rewards for security issue discovery outside of the game + +* Engineers have the ability to collude and split winnings + * Counter: Terminate dishonest employees with extreme prejudice. + +## Unresolved Questions + +How far should this be allowed to go?
Is phishing and exploitation of +unattended machines or keyloggers fair game? + +## Future Work + +Deploy in real world organizations and share results :) diff --git a/secrets.md b/secrets.md new file mode 100644 index 0000000..11f0ccd --- /dev/null +++ b/secrets.md @@ -0,0 +1,45 @@ + 1. Hardware decryption with user interaction + * Tools: + * Password Store + * https://www.passwordstore.org/ + * Shared git repo + * Yubikey with PGP keychain for each engineer + * Defense: + * Prevent theft of secrets not currently being used + * Usage: + * Encrypt secrets to Yubikey PGP keys of all holders as individual files + * Place secrets in Git repo + * Use "pass" command to sync and decrypt secrets on demand as needed + * ```some-signing-command --key=<(pass Exodus/somesecret)``` + * Each access requires a Yubikey tap to decrypt + 2. Hardware decryption with explicit user consent + * Tools: + * Mooltipass + * https://www.themooltipass.com/ + * Ledger + * https://support.ledger.com/hc/en-us/articles/360017501380-Passwords?docs=true + * Trezor + * https://trezor.io/passwords/ + * Defense: + * Prevent theft of secrets not currently being used + * Prevent operator from being tricked into revealing wrong secret + * Usage: + * All devices use a pin to unlock, and can share encrypted databases + * All devices explicitly ask for consent to release a secret by name + * User reads on external display and approves with a button press + 3. 
Shamir's Secret Sharing to tamper-evident system + * Tools: + * Remotely attestable TEE or HSM + * Nitro Enclave + * Google Confidential Compute + * osresearch/heads booted server + * Defense: + * Prevent theft of secrets not currently being used + * Prevent operator from being tricked into revealing wrong secret + * Prevent compromised operator from stealing any secrets + * Usage: + * Public keys of trusted quorum provided to enclave + * Secrets are created in enclave + * Secrets are split into shares requiring M-of-N to reconstruct + * Enclave returns shares encrypted to each quorum member public key + * M-of-N quorum members can submit shares of given secret to servers diff --git a/security.md b/security.md new file mode 100644 index 0000000..79fc467 --- /dev/null +++ b/security.md @@ -0,0 +1,77 @@ +## Web Content Signing via Service Workers +- Implementation: + - M-of-n parties deterministically compile web interface bundle and sign it + - Interface installs service worker that mandates all future updates are + - signed with m-of-n valid keys certified by a pinned CA + - newer in timestamp than current version +- Protections + - Compromised insider tampering with frontends + - BGP attacks + - DNS takeover + - TLS MITM +- Resources + - https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API/Using_Service_Workers + - https://arxiv.org/pdf/2105.05551 + +## Web Request Signing via WebAuthn +- Implementation: + - Collect WebAuthn public keys for one or more devices for all users + - External Authenticators: Yubikey, Nitrokey, Ledger, Trezor, Solokey, etc.
+ - Platform Authenticators: iOS 13+, Android 7+, Windows Hello, many Chromebooks + - Certify WebAuthn public keys with trusted enclave + - WebAuthn sign all impacting web requests like trades and transfers + - Private key enclaves validate request signatures before signing trades and transfers +- Protections: + - Compromised insider tampering with backends + - TLS MITM +- Resources: + - https://developers.yubico.com/WebAuthn/Concepts/Using_WebAuthn_for_Signing.html + +## Internal Supply chain integrity +- Implementation + - Collect and certify asymmetric public keys from all engineers + - Have all engineers locally sign all code commits and reviews + - Multiple independently managed CI/CD systems are deployed + - CI/CD systems deterministically build only validly signed commits/reviews + - CI/CD systems sign resulting artifacts with well known/pinned keys + - Production systems only deploy artifacts signed by multiple CI/CD systems +- Protections + - Compromised insider impersonates commit as another engineer + - Compromised insider injects malicious code, bypassing review controls + - Compromised CI/CD system tampers with artifact generation +- Resources: + - https://github.com/distrust-foundation/sig + - https://github.com/hashbang/git-signatures + - https://github.com/hashbang/book/blob/master/content/docs/security/Commit_Signing.md + - https://blog.dbrgn.ch/2021/11/16/git-ssh-signatures/ + +## External Supply chain integrity +- Implementation + - Collect and pin asymmetric public keys from all code reviewers + - Review all third party dependencies used in transfer-critical codebases + - Have all reviewers sign reviews with certified public keys + - Publish reviews in well documented format to help crowd-source efforts + - Have CI/CD fail production builds when un-reviewed deps are present +- Protections + - Obvious malicious code injected into external software library +- Resources: + - https://gist.github.com/lrvick/d4b87c600cc074dfcd00a01ee6275420 + -
https://gitlab.com/wiktor/lance-verifier + - https://github.com/in-toto/attestation/issues/77 + +## Accountable Airgapped Workflows +- Implementation + - Multiple parties compile deterministic airgap OS and firmware + - Multiple parties sign airgap os/firmware artifacts + - New laptop acquired by multiple parties + - Trusted firmware loaded, verifying signed hash with existing firmware + - CA key pinned into firmware, and external TPM verification device + - Laptop stored in highly tamper evident vault requiring multiple parties for access + - Laptop firmware verifies multi-party signature on flash-drive iso and any scripts + - Participants verify date and ensure it is the latest and expected version +- Protections + - Tampering by any single compromised insider + - Tampering by any single compiler or build system +- Resources: + - https://github.com/distrust-foundation/airgap + - https://github.com/hashbang/airgap \ No newline at end of file diff --git a/share-recovery.md b/share-recovery.md new file mode 100644 index 0000000..9e6529b --- /dev/null +++ b/share-recovery.md @@ -0,0 +1,144 @@ +# Share Recovery + +## Overview + +This document outlines the creation of a "Share Recovery" (SR) system which +functions as a one-way box that one can encrypt a partial secret to at any +time, with decryption only possible by a share holder with access to an offline +encryption key. + +Such system offers high security, but low redundancy. It is suitable for +encrypting only a single share of a separate disaster recovery system that +requires m-of-n portions of data in order to recover. + +Data is backed up by encrypting plaintext to a [Share Recovery Key](#sr-key). +The resulting ciphertext is then stored in the +[Share Recovery Warehouse](#sr-warehouse). 
In the case of a disaster, +ciphertexts can be gathered from the SR Warehouse and then decrypted using the +SR Key to regain access to the plaintext, which can be combined with shares +from other systems to reconstitute desired data by the data owner. + +## Threat Model + +- An adversary with any type of online attack is tolerated + - Key and share material is managed entirely offline + - Offline systems are heavily controlled for supply chain integrity +- Coercion of a single operator is tolerated + - Share holder will never have access to more than one share + - We expect this is unlikely to happen to two share holders at once +- Destruction of a single share is tolerated + - This is only a single share in a redundant system + - We expect the destruction of multiple shares at once is unlikely + - We expect shares are sufficiently geographically distributed + +## Components + +### Share Owner + - The owner of the share data encrypted to the Share Recovery System + - Could differ from the entity which initially provides the share + +### DR System + - External DR system requiring multiple secrets to operate + - Examples: Threshold signing, MPC, or Shamir's Secret Sharing + +### SR Key + +- PGP asymmetric key pair a single DR System secret is directly encrypted to +- Only accessible by one or more SR Operators +- Generated offline by an SR Operator using standard PGP processes +- Stored on a dedicated pin controlled HSM +- We chose the PGP standard because: + - It is widely supported with a plurality of implementations and tooling + - The PGP standard and tooling is assumed to outlive any custom made tools + - Should be more reliable than any crypto implementation we maintain + +### SR Pin +- Pin that controls access to the HSM containing the SR Key + +### SR Ciphertext + +- Ciphertext of a secret encrypted to the SR Key + +### SR Location + +- SR Key and SR Ciphertext storage location + - The Location must be geographically separate
from other Shares in DR system + - The SR Location has a fixed human access list + - Those with access can however cooperate to transfer access to others + - The SR Location has physical controls that strictly enforce access + - e.g. A safety deposit box, TL-15+ Safe, etc. + +### SR Operator + +- A human who is on the access list for an SR Location +- Must be highly trusted, but does not have to be highly technically skilled +- A human who is capable of decrypting data with an SR Key + +### SR Warehouse + +- Online storage for encrypted data replicated across multiple providers +- All data in SR Warehouse can only be decrypted by the SR Key +- Tolerate loss of any single provider by duplicating data to all of them +- Storage backends can be any combination of the following: + - S3 Compatible object stores: + - AWS, Google Cloud, DigitalOcean, Azure, etc. + - Version control systems: + - Github, AWS CodeCommit, Gitea +- We tolerate a loss of all but one SR storage backend +- A minimum of three storage backends must be maintained + +### Airgap OS + +- QubesOS Vault VM, or dedicated immutable Linux distro such as AirgapOS: + - +- Basic shell tools + - bash, sed, grep, awk, etc. +- Cryptography tools + - gpg, openssl, etc + +### SR Decryption Script + +**Routine Inputs:** + +- SR Ciphertext +- SR Key PIN +- SR Key HSM + +**Routine:** + +1. Operator invokes script to decrypt given SR Ciphertext on Airgap OS +2. Operator is prompted for SR Key HSM and SR Key Pin + +**Routine Output:** + +- Share in plaintext + +### Share Storage Process + + 1. Operator creates a dedicated SR Key + 2. Operator backs up encrypted copy of SR key to SR Warehouse + 3. Operator transports SR Smartcard to SR Location. + 4. Operator provides public SR Key to Share Owner or designated entity + 5. Share Owner creates and retains a sha256 hash of plaintext share + 6. Share Owner creates SR ciphertext by encrypting Share to SR Key + 7. SR Ciphertext is provided to an Operator + 8.
Operator executes Share Recovery Process to decrypt SR Ciphertext + 9. Operator creates sha256 hash of the decrypted contents of the SR Ciphertext + 10. Operator backs up SR Ciphertext to SR Warehouse + 11. Operator returns sha256 hash to Share Owner + 12. Share Owner confirms sha256 hash, proving decryption was successful + +### Share Recovery Process + + 1. A Share Owner submits a request for plaintext share + 2. An Operator verifies the identity of the Share Owner using multiple means + - Ideally verify a signed request with a well known key + - Verify in person or over video call + 3. An Operator obtains required resources + - Airgap OS on hardware trusted by Operator + - SR Key + - SR Key Pin + - SR Ciphertext + 4. Operator executes SR Decryption script + 5. Plaintext is provided to requesting Share Owner via requested means + - Ideally immediately re-encrypted to a key controlled by Share Owner diff --git a/signed-git-workflows.md b/signed-git-workflows.md new file mode 100644 index 0000000..ea3fd2b --- /dev/null +++ b/signed-git-workflows.md @@ -0,0 +1,35 @@ +## Multi-party Signed Git workflows + +### Path 1 + +This path allows most devs to use the tools they are used to, but requires a second security-only review later + +1. Author submits changes for review +2. Reviewer and author iterate on changes for style, quality, and functionality using any collaboration tool they wish +3. Reviewer merges changes with signed merge commit +4. Several cycles of steps 1-3 complete until it is time for a release +5. Release engineer tags a release candidate +6. Release candidate is tested in staging +7. Security reviewers do "git sig review" locally examining total diff from last release to current release candidate for any malicious code (ignoring style/quality/functionality) +8. Security reviewers do "git sig add" to attach signed tags to release candidate +9.
Release engineer adds signed release tag to release candidate making it eligible for production deployment + +### Path 2 +This path is shorter and more flexible, but will mean all parties need to learn how to use secure review tooling. + +1. Author submits changes for review +2. Reviewer and author iterate on changes using any tool they wish +3. Author or reviewer merges changes with or without a merge commit +4. Reviewer does "git sig review" to bump their review marker to HEAD. + +## Notes + +* Reviews done in a third party WebUI defeat the point of local signing + * You don't know if a force push happened between your review and local pull/sign/push + * For general code-quality this may be tolerable, but review should not be trusted in a security context +* All changes must have a signature by the author and someone other than the author + * This means in Path 1 that a security reviewer cannot also be a contributor to a given context +* "git sig review" tool /could/ support a folder argument + * would signify a given review only applies to a given folder + * "git sig verify" will similarly need to be folder aware + * Without this, trust on important apps like "signer" will not be practical in a monorepo. \ No newline at end of file diff --git a/ssh.md b/ssh.md new file mode 100644 index 0000000..225752b --- /dev/null +++ b/ssh.md @@ -0,0 +1,172 @@ +--- +title: Secure Shell +--- + +# SSH + +There are a number of different methods for using SSH with a Yubikey. Most +of them however require either proprietary components, or modifications to +servers. Such methods are broken and should not be promoted. + +Here we will cover only methods that use standard tools, standard protocols, +and don't allow secret keys to come in contact with system memory. + +Secondly, solutions here do not require any server modifications. This is key +because you will not be able to modify all systems you use that provide ssh +interfaces such as Github ssh push.
+ +## PKCS11 + +With this interface it is possible to generate an ssh private key in a +particular format that can be stored inside a PKCS11 capable device such +as a Yubikey 5. + +While this does not offer nearly as many assurances as the GPG setup detailed +below, it is the simplest to set up. + +Note: Due to limitations in the PKCS11 spec, it is not possible to generate +keys stronger than 2048 bit RSA with this method. Consider the security +requirements of your organization before using this method. + +### Generation + +For a set of manual steps on how to set this up see: + +[https://developers.yubico.com/PIV/Guides/SSH_with_PIV_and_PKCS11.html] + +To simplify this process consider the following script: + +[https://gist.github.com/lrvick/9e9c4641fab07f0b8dde6419f968559f] + +### Usage + +Since SSH will by default only scan folders such as ~/.ssh/ for keys +you will need to inform it that you wish it to also check for smartcards via +the OpenSC interface. + +Before using ssh commands be sure you have started your ssh agent like so: + +``` +ssh-add -s $OPENSC_LIBS/opensc-pkcs11.so +``` + +## GPG + +### Configure SSH to use your Security Token + +This assumes you already have a Security Token configured with a GPG Authentication subkey. + +The end result will be that you distribute the SSH Public Key from your Security Token to all VCS systems and servers you normally connect to via SSH. From there you will need to have your key inserted, and tap it once for every connection. You will also be required to tap the key for all SSH Agent forwarding hops, removing many of the security issues with traditional ssh agent forwarding on shared systems. + +#### Most Linux Distros + +If you are using a recent systemd based distribution then all the heavy lifting +is likely already done for you.
+ +We recommend simply adding the following to "/home/$USER/.pam_environment": + +``` +SSH_AGENT_PID DEFAULT= +SSH_AUTH_SOCK DEFAULT="${XDG_RUNTIME_DIR}/gnupg/S.gpg-agent.ssh" +``` + +#### Mac OS, WSL, Non-Systemd Linux Distros + +``` +echo >~/.gnupg/gpg-agent.conf <> ~/.gnupg/gpg-agent.conf +``` + +Edit your ~/.bash_profile (or similar) and add the following lines to use gpg-agent (and thus the Security Token) as your SSH key daemon: + +``` +# If connected locally +if [ -z "$SSH_TTY" ]; then + + # setup local gpg-agent with ssh support and save env to fixed location + # so it can be easily picked up and re-used for multiple terminals + envfile="$HOME/.gnupg/gpg-agent.env" + if [[ ! -e "$envfile" ]] || ( \ + # deal with changed socket path in gnupg 2.1.13 + [[ ! -e "$HOME/.gnupg/S.gpg-agent" ]] && \ + [[ ! -e "/var/run/user/$(id -u)/gnupg/S.gpg-agent" ]] + ); + then + killall gpg-agent + # This isn't strictly required but helps prevents issues when trying to + # mount a socket in the /run tmpfs into a Docker container, and puts + # the socket in a consistent location under GNUPGHOME + rm -r /run/user/`id -u`/gnupg + touch /run/user/`id -u`/gnupg + gpg-agent --daemon --enable-ssh-support > $envfile + fi + + # Get latest gpg-agent socket location and expose for use by SSH + eval "$(cat "$envfile")" && export SSH_AUTH_SOCK + + # Wake up smartcard to avoid races + gpg --card-status > /dev/null 2>&1 + +fi + +# If running remote via SSH +if [ ! -z "$SSH_TTY" ]; then + # Copy gpg-socket forwarded from ssh to default location + # This allows local gpg to be used on the remote system transparently. + # Strongly discouraged unless GPG managed with a touch-activated GPG + # smartcard such as a Yubikey 4. 
+ # Also assumes local .ssh/config contains host block similar to: + # Host someserver.com + # ForwardAgent yes + # StreamLocalBindUnlink yes + # RemoteForward /home/user/.gnupg/S.gpg-agent.ssh /home/user/.gnupg/S.gpg-agent + if [ -e $HOME/.gnupg/S.gpg-agent.ssh ]; then + mv $HOME/.gnupg/S.gpg-agent{.ssh,} + elif [ -e "/var/run/user/$(id -u)/gnupg/S.gpg-agent" ]; then + mv /var/run/user/$(id -u)/gnupg/S.gpg-agent{.ssh,} + fi + + # Ensure existing sessions like screen/tmux get latest ssh auth socket + # Use fixed location updated on connect so in-memory location always works + if [ ! -z "$SSH_AUTH_SOCK" -a \ + "$SSH_AUTH_SOCK" != "$HOME/.ssh/agent_sock" ]; + then + unlink "$HOME/.ssh/agent_sock" 2>/dev/null + ln -s "$SSH_AUTH_SOCK" "$HOME/.ssh/agent_sock" + fi + export SSH_AUTH_SOCK="$HOME/.ssh/agent_sock" +fi +``` + +Now switch to using this new setup immediately: + +``` +$ killall gpg-agent +$ source ~/.bash_profile +``` + +### Get SSH Public Key + +You (or anyone that has your GPG public key) can get your SSH key as follows: + +``` +$ gpg --export-ssh-key you@email.com + +ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCcjW34jewR+Sgsp1Kn4IVnkb7kxwZCi2gnqMTzSnCg5vABTbG +jFvcjvj1hgD3CbEiuaDkDqCuXnYPJ9MDojwZre/ae0UW/Apy2vAG8gxo8kkXn9fgzJezlW8xjV49sx6AgS6BvOD +iuBjT0rYN3EPQoNvkuqQmukgu+R1naNj4wejHzBRKOSIBqrYHfP6fkYmfpmnPyvgxbyrwmss2KhwwAvYgryx7eH +HRkZtBi0Zb9/KRshvQ89nuXik50sGW6wWZm+6m3v6HNctbOI4TAE4xlJW7alGnNugt3ArQR4ic3BeIuOH2c/il4 +4BGwdRIp7XlSkkf8cZX59l9kYLaDzw4Z cardno:000603703024 +``` + +You can paste that into your list of authorized keys on all VCS and systems you have access to. + +You should disable any other keys that are not backed by a Security Token. 
diff --git a/webauthn-custody.md b/webauthn-custody.md new file mode 100644 index 0000000..cae7a0b --- /dev/null +++ b/webauthn-custody.md @@ -0,0 +1,185 @@ +# Webauthn Strategy + +## Goal + +* Tolerate a compromise of any single internet connected computer + * Production engineering laptops + * Client laptops + * Servers +* Tolerate a compromise of any single employee +* Require a quorum of user devices or employees to cause value movement + * Exceptions may be tolerated for ease of use if well covered by insurance. + +## Approach + +Most users today have WebAuthn devices, though most do not yet realize it. + +Apple TouchId, Chromebook touch, Yubikeys, Ledgers, Trezors, etc are all +examples. + +These devices allow users to login without a password, but also allow them to +sign any individual operation, proving the origin of that request was from a +device controlled by the user. + +This document describes an infrastructure where every decision to move or limit +the movement of assets is cryptographically signed on the system of the +decision maker in a way that is easy and transparent for users. + +This seeks to allow off-chain multisig for all sensitive actions in a custodial +digital asset stack. If implemented, it should result in a system that makes +it highly evident when any operations in a system are manipulated by any entity +other than the original decision makers for that operation. 
+ +## Infrastructure + +### Queue + +* AMQP or similar + +### Policy node + +#### Function + +* Contains "Policy key" +* Contains knowledge of latest version of all policies + * Store complete signed log or an accumulator to prove membership in set + * https://dev.to/johndriscoll/compressing-authority-1kph +* Validates policies + * One of: + * signed by itself (initial policies) + * signed by m-of-n owners defined in initial policy +* Validates operations + * Verifies operation meets signed policies + * Signs with "Policy key" +* Approves WebAuthn keys + * Blindly approve keys for a new group/wallet + * Require m-of-n policy + +#### Implementation + +* Nitro enclave w/ REST proxy +* Memory safe language such as Rust recommended +* Only use dependencies it is practical to fully review. + +#### Deployment + +* Generates ephemeral RSA key at boot +* M-of-N parties + * Remotely attest Nitro enclave is running expected deterministic image + * Decrypt Shamir's secret shares with personal HSMs on airgapped machines + * Encrypt respective Shamir's secret shares to ephemeral key + * Submit encrypted shares to Policy enclave +* Policy enclave comes online and can begin evaluations + +### Signing node + +#### Function + +* Contains "Quorum Key" +* All individual user keys are encrypted to Quorum Key +* Validates signing requests are signed by policy enclave +* Issues blockchain-valid signatures +* Returns signed bundle to Queue + +#### Implementation + +* Nitro enclave w/ REST proxy +* Memory safe language such as Rust recommended +* Only use dependencies it is practical to fully review + +#### Deployment + +* Generates ephemeral RSA key at boot +* M-of-N parties + * Remotely attest Nitro enclave is running expected deterministic image + * Decrypt Shamir's secret shares with personal HSMs on airgapped machines + * Encrypt respective Shamir's secret shares to ephemeral key + * Submit encrypted shares to Policy enclave +* Policy enclave comes online and can begin evaluations + +
+## Workflows + +### Verify UX Integrity + +* Service worker is registered in browser on first visit +* Service worker configured to proxy all future server interactions +* Service worker contains update() function that only accepts signed code +* Service worker verifies all new js/css/html from server is multi-signed + +See: https://arxiv.org/pdf/2105.05551.pdf + +### Add user + +* User initiates WebAuthn registration in web UI +* WebAuthn Public key and UserID are submitted to Policy Node +* Policy Node verifies key has never been seen before +* Policy Node signs key/UserID pair and submits to user database + +### Add device + +* User performs WebAuthn login with existing device +* User selects "Add Device" option after login +* User is presented with one-time short URL and QR code to add second device +* User opens URL on new device and performs WebAuthn registration +* Both devices show user-friendly hash (emoji or similar) of new device key +* User confirms hashes match, and old device signs key of new device +* Signature of new key is submitted to Policy Node +* Policy Node verifies old device signature and signs new key/UserID pair + +### Add/Change policy + +* User performs WebAuthn login with existing device +* User selects "Add Policy" option for a wallet after login +* User chooses m-of-n or whitelist policy options +* User signs policy by interacting with WebAuthn device +* Existing m-of-n group (if any) signs policy change with WebAuthn devices +* Policy is submitted to Policy Node and remembered as latest policy.
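
The m-of-n check a Policy Node performs on a policy change can be sketched in shell as below. This is a minimal illustration under stated assumptions, not the Policy Node implementation: `quorum_met` is a hypothetical helper name, owners and signers are plain identifiers, and every signer passed in is assumed to have already had its WebAuthn signature verified elsewhere. It only shows the counting rule: unknown signers and duplicates must not count toward the threshold.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: quorum_met THRESHOLD "owner1 owner2 ..." signer...
# Succeeds (exit 0) when at least THRESHOLD distinct listed owners signed.
# Assumes each signer ID was already cryptographically verified upstream.
quorum_met() {
  local threshold="$1" owners="$2"
  shift 2
  local seen=" " count=0 signer
  for signer in "$@"; do
    case " $owners " in
      *" $signer "*) ;;            # signer is a listed owner, keep going
      *) continue ;;               # ignore signers outside the owner set
    esac
    case "$seen" in
      *" $signer "*) continue ;;   # ignore repeated signatures by one owner
    esac
    seen="$seen$signer "
    count=$((count + 1))
  done
  [ "$count" -ge "$threshold" ]
}
```

For example, `quorum_met 2 "alice bob carol" alice bob` succeeds, while the same call with signers `alice alice` fails because one owner signing twice is counted once.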
+ +See: https://developers.yubico.com/WebAuthn/Concepts/Using_WebAuthn_for_Signing.html + +### Recover account + +* User signals they wish to recover account via "forgot device" form +* User provides identifying account information + * Could include a signed and automatically recorded WebRTC video clip +* User uses a new WebAuthn device to approve/sign the support request +* Two customer support representatives verify and sign replacement device +* Policy Node verifies replacement key was signed by two "support" role keys +* Policy Node signs key/UserID pair and submits to user database +* User can now use new device to log in + +## Rollout + +1. Enable WebAuthn as 2FA option, augmenting existing login flows +2. Mandate WebAuthn 2FA for all employees +3. Mandate WebAuthn 2FA for all high value users +4. Mandate WebAuthn 2FA for all new users +5. Flag accounts as "at risk" that only have a single registered device +6. Disable all 2FA methods but WebAuthn (force migration on login) +7. Drop password requirements and migrate to pure FIDO2/WebAuthn login +8. Begin signing all policy changes, user additions, withdrawal requests, etc. +9. Begin enforcing all policies and keys in TEE or HSM (e.g. Nitro enclave) + + +### Short term mitigations + +Until measures like those above are in place, it must be understood that any +member with administrative account access, or production access, in effect has +complete personal custody of all user assets. + +During this high risk period, some very strict and reasonably low cost controls +are recommended for immediate implementation.
+ +* Create a fully "cold" wallet system for bulk of assets + * Users are told "cold" marked assets will have extended withdrawal times + * Use a hardened multi-party controlled laptop in dual-access vault + * See: https://github.com/distrust-foundation/airgap +* Ensure all assets above insurance level are maintained in "cold" wallet +* Ensure administrative/production access is only possible by dedicated machines + * See: https://github.com/hashbang/book/blob/master/content/docs/security/Production_Engineering.md +* Ensure all commits and reviews are signed with employee Yubikeys or similar + * https://github.com/hashbang/book/blob/master/content/docs/security/Commit_Signing.md + * https://github.com/distrust-foundation/sig +* Deploy competing CI/CD systems that check keys and test for determinism + * See: https://reproducible-builds.org/ \ No newline at end of file diff --git a/webauthn-kyc-encryption.md b/webauthn-kyc-encryption.md new file mode 100644 index 0000000..12a8814 --- /dev/null +++ b/webauthn-kyc-encryption.md @@ -0,0 +1,12 @@ +1. User submits KYC document via a web form +2. Web form automatically encrypts document to a key held in KMS (with offline backups) +3. Encrypted documents are submitted to an API gateway hook that triggers a lambda job which places the documents directly into an s3 bucket. +4. A support agent opens the KYC review interface and clicks a document to decrypt. +5. The support agent browser automatically generates a random encryption key pair, and submits the public key and the ID of the requested document they wish to decrypt to API Gateway +6. API Gateway launches a lambda job which hashes the document request with a random challenge and returns it to the browser +7. The browser prompts the support agent to tap their Yubikey which signs the challenge. +8. The browser sends the signed challenge back to API Gateway. +9.
API gateway passes the signed document request payload to a lambda job which has access to the KMS role to use the KYC decryption key. +10. Lambda job decrypts the one document, and then encrypts it to the encryption key of the request and returns it to the support agent browser. +11. Document is decrypted and displayed in support agent browser. +12. Agent reviews document and it is automatically deleted locally when closed. \ No newline at end of file
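
Step 6 of the flow above, hashing the document request together with a random challenge, might look roughly like the following shell sketch. The function names `make_challenge` and `bind_request` and the newline-separated field layout are assumptions made for illustration only; the source does not specify an encoding. The sketch assumes a Linux system with coreutils `sha256sum` available.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of binding a KYC document request to a fresh challenge.

# Produce 32 random bytes as a 64-character hex string.
make_challenge() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Hash the document ID, the agent's ephemeral public key, and the challenge
# into a single digest that the agent's Yubikey would then be asked to sign.
# The field order and newline separators are an illustrative assumption.
bind_request() {
  local document_id="$1" agent_pubkey="$2" challenge="$3"
  printf '%s\n%s\n%s\n' "$document_id" "$agent_pubkey" "$challenge" \
    | sha256sum | cut -d' ' -f1
}
```

Because the digest covers the document ID and the agent's public key as well as the challenge, a signature over it cannot be replayed for a different document or a different recipient key, which appears to be the intent of steps 6 through 9.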