docs/webauthn-custody.md


2023-08-04 21:59:58 +00:00
# WebAuthn Strategy
## Goal
* Tolerate a compromise of any single internet-connected computer
  * Production engineering laptops
  * Client laptops
  * Servers
* Tolerate a compromise of any single employee
* Require a quorum of user devices or employees to cause value movement
* Exceptions may be tolerated for ease of use if well covered by insurance.
## Approach
Most users today have WebAuthn devices, though most do not yet realize it.
Apple Touch ID, Chromebook touch, YubiKeys, Ledgers, Trezors, etc. are all
examples.
These devices allow users to log in without a password, but also allow them to
sign any individual operation, proving that the request originated from a
device controlled by the user.
This document describes an infrastructure where every decision to move or limit
the movement of assets is cryptographically signed on the system of the
decision maker in a way that is easy and transparent for users.
This seeks to allow off-chain multisig for all sensitive actions in a custodial
digital asset stack. If implemented, it should make it highly evident when any
operation in the system is manipulated by an entity other than the original
decision makers for that operation.
## Infrastructure
### Queue
* AMQP or similar
### Policy node
#### Function
* Contains "Policy key"
* Contains knowledge of latest version of all policies
  * Store complete signed log or an accumulator to prove membership in set
    * https://dev.to/johndriscoll/compressing-authority-1kph
* Validates policies
  * One of:
    * signed by itself (initial policies)
    * signed by m-of-n owners defined in initial policy
* Validates operations
  * Verifies operation meets signed policies
  * Signs with "Policy key"
* Approves WebAuthn keys
  * Blindly approve keys for a new group/wallet
  * Require m-of-n policy
#### Implementation
* Nitro enclave w/ REST proxy
* Memory safe language such as Rust recommended
* Only use dependencies it is practical to fully review.
#### Deployment
* Generates ephemeral RSA key at boot
* M-of-N parties
  * Remotely attest Nitro enclave is running expected deterministic image
  * Decrypt Shamir secret shares with personal HSMs on airgapped machines
  * Encrypt respective Shamir secret shares to ephemeral key
  * Submit encrypted shares to Policy enclave
* Policy enclave comes online and can begin evaluations
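
The M-of-N share handling above relies on Shamir's Secret Sharing. As a rough illustration of the underlying arithmetic only (not of the HSM, attestation, or encryption steps), here is a minimal sketch over a prime field. The names `split_secret` and `combine_shares` and the choice of field are assumptions made for this example.

```python
import secrets

# ASSUMPTION: 2**127 - 1 (a Mersenne prime) is used as the field modulus.
# Secrets must be integers below it; real key material would need a larger
# field or byte-wise sharing.
PRIME = 2**127 - 1


def split_secret(secret: int, m: int, n: int):
    """Split `secret` into n shares, any m of which can rebuild it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the chosen field")
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    # Random polynomial of degree m-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def combine_shares(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

In the deployment described above, each `(x, y)` share would be held by a different party and only ever recombined inside the attested enclave.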
### Signing node
#### Function
* Contains "Quorum Key"
* All individual user keys are encrypted to Quorum Key
* Validates that signing requests are signed by the policy enclave
* Issues blockchain-valid signatures
* Returns signed bundle to Queue
#### Implementation
* Nitro enclave w/ REST proxy
* Memory safe language such as Rust recommended
* Only use dependencies it is practical to fully review
#### Deployment
* Generates ephemeral RSA key at boot
* M-of-N parties
  * Remotely attest Nitro enclave is running expected deterministic image
  * Decrypt Shamir secret shares with personal HSMs on airgapped machines
  * Encrypt respective Shamir secret shares to ephemeral key
  * Submit encrypted shares to Signing enclave
* Signing enclave comes online and can begin signing
## Workflows
### Verify UX Integrity
* Service worker is registered in browser on first visit
* Service worker configured to proxy all future server interactions
* Service worker contains update() function that only accepts signed code
* Service worker verifies all new js/css/html from server is multi-signed
See: https://arxiv.org/pdf/2105.05551.pdf
### Add user
* User initiates WebAuthn registration in web UI
* WebAuthn public key and UserID are submitted to Policy Node
* Policy Node verifies key has never been seen before
* Policy Node signs key/UserID pair and submits to user database
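
The Policy Node side of this flow can be sketched as follows. This is an illustration under stated assumptions: `PolicyNodeRegistry` is a hypothetical in-memory class, and an HMAC stands in for the asymmetric "Policy key" signature so the example stays within the standard library. A real Policy Node would sign with a private key held inside the enclave.

```python
import hashlib
import hmac
import json


class PolicyNodeRegistry:
    """Hypothetical in-memory stand-in for the Policy Node's key registry."""

    def __init__(self, policy_key: bytes):
        self._policy_key = policy_key
        self._seen = set()  # SHA-256 fingerprints of every key ever registered

    def _tag(self, payload: bytes) -> str:
        # ASSUMPTION: HMAC-SHA256 stands in for the Policy key signature.
        return hmac.new(self._policy_key, payload, hashlib.sha256).hexdigest()

    def register(self, user_id: str, public_key: bytes) -> dict:
        fingerprint = hashlib.sha256(public_key).hexdigest()
        if fingerprint in self._seen:
            raise ValueError("public key has been seen before")
        self._seen.add(fingerprint)
        # Canonical JSON so the signature covers exactly one encoding.
        payload = json.dumps(
            {"user_id": user_id, "key": fingerprint},
            sort_keys=True,
            separators=(",", ":"),
        ).encode()
        return {"payload": payload.decode(), "signature": self._tag(payload)}

    def verify(self, record: dict) -> bool:
        expected = self._tag(record["payload"].encode())
        return hmac.compare_digest(expected, record["signature"])
```

The returned record is what would be written to the user database. Because the signature covers the key/UserID pair, a database row that has been altered after the fact no longer verifies.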
### Add device
* User performs WebAuthn login with existing device
* User selects "Add Device" option after login
* User is presented with one-time short URL and QR code to add second device
* User opens URL on new device and performs WebAuthn registration
* Both devices show user-friendly hash (emoji or similar) of new device key
* User confirms hashes match, and old device signs key of new device
* Signature of new key is submitted to Policy Node
* Policy Node verifies old device signature and signs new key/UserID pair
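
The "user-friendly hash" step could look something like the sketch below, in which both devices derive the same short emoji string from the new device's public key. The 16-symbol alphabet and the function name `emoji_fingerprint` are assumptions for illustration; the source only says "emoji or similar".

```python
import hashlib

# Hypothetical 16-symbol alphabet; each symbol encodes 4 bits of the digest.
EMOJI = ["🐶", "🐱", "🐭", "🐹", "🐰", "🦊", "🐻", "🐼",
         "🐨", "🐯", "🦁", "🐮", "🐷", "🐸", "🐵", "🐔"]


def emoji_fingerprint(public_key: bytes, symbols: int = 8) -> str:
    """Map the leading bits of SHA-256(public_key) to a string of emoji."""
    if not 1 <= symbols <= 64:
        raise ValueError("symbols must be between 1 and 64")
    digest = hashlib.sha256(public_key).digest()
    out = []
    for i in range(symbols):
        byte = digest[i // 2]
        # Even positions take the high nibble, odd positions the low nibble.
        nibble = byte >> 4 if i % 2 == 0 else byte & 0x0F
        out.append(EMOJI[nibble])
    return " ".join(out)
```

With the default of 8 symbols the user compares only 32 bits, which keeps the check easy but may be too short against an attacker able to generate many candidate keys. A real deployment would likely tune `symbols` or use a larger alphabet.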
### Add/Change policy
* User performs WebAuthn login with existing device
* User selects "Add Policy" option for a wallet after login
* User chooses m-of-n or whitelist policy options
* User signs policy by interacting with WebAuthn device
* Existing m-of-n group (if any) signs policy change with WebAuthn devices
* Policy is submitted to Policy Node and remembered as latest policy.
See: https://developers.yubico.com/WebAuthn/Concepts/Using_WebAuthn_for_Signing.html
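
The m-of-n approval of a policy change can be sketched as a threshold check over distinct owner signatures. The `Policy` type, the `approve_policy_change` function, and the pluggable `Verifier` callback are all hypothetical names; in practice the callback would validate a WebAuthn assertion over the serialized policy, which is outside the scope of this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# (key_id, message, signature) -> bool.
# ASSUMPTION: the signature scheme is left pluggable; a real verifier would
# check a WebAuthn assertion made by the device registered under key_id.
Verifier = Callable[[str, bytes, bytes], bool]


@dataclass(frozen=True)
class Policy:
    threshold: int       # m
    owners: frozenset    # key IDs of the n owners

    def __post_init__(self):
        if not 1 <= self.threshold <= len(self.owners):
            raise ValueError("threshold must satisfy 1 <= m <= n")


def approve_policy_change(
    current: Policy,
    new_policy_bytes: bytes,
    signatures: Dict[str, bytes],
    verify: Verifier,
) -> bool:
    """True if at least `threshold` distinct current owners signed the change."""
    valid = {
        key_id
        for key_id, sig in signatures.items()
        if key_id in current.owners and verify(key_id, new_policy_bytes, sig)
    }
    return len(valid) >= current.threshold
```

Keying `signatures` by key ID means one owner cannot be counted twice, and signatures from keys outside the current owner set are ignored rather than counted toward the threshold.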
### Recover account
* User signals they wish to recover account via "forgot device" form
* User provides identifying account information
  * Could include a signed, automatically recorded WebRTC video clip
* User uses a new WebAuthn device to approve/sign the support request
* Two customer support representatives verify and sign replacement device
* Policy Node verifies replacement key was signed by two "support" role keys
* Policy Node signs key/UserID pair and submits to user database
* User can now use new device to log in
## Rollout
1. Enable WebAuthn as 2FA option, augmenting existing login flows
2. Mandate WebAuthn 2FA for all employees
3. Mandate WebAuthn 2FA for all high value users
4. Mandate WebAuthn 2FA for all new users
5. Flag accounts as "at risk" that only have a single registered device
6. Disable all 2FA methods but WebAuthn (force migration on login)
7. Drop password requirements and migrate to pure FIDO2/WebAuthn login
8. Begin signing all policy changes, user additions, withdrawal requests, etc.
9. Begin enforcing all policies and keys in TEE or HSM (e.g. Nitro enclave)
### Short term mitigations
Until measures like those above are in place, it must be understood that any
member with administrative account access or production access in effect has
complete personal custody of all user assets.
During this high-risk period, some very strict and reasonably low-cost controls
are recommended for immediate implementation.
* Create a fully "cold" wallet system for bulk of assets
  * Users are told "cold" marked assets will have extended withdrawal times
  * Use a hardened multi-party controlled laptop in dual-access vault
    * See: https://github.com/distrust-foundation/airgap
  * Ensure all assets above insurance level are maintained in "cold" wallet
* Ensure administrative/production access only possible by dedicated machines
  * See: https://github.com/hashbang/book/blob/master/content/docs/security/Production_Engineering.md
* Ensure all commits and reviews are signed with employee YubiKeys or similar
  * https://github.com/hashbang/book/blob/master/content/docs/security/Commit_Signing.md
  * https://github.com/distrust-foundation/sig
* Deploy competing CI/CD systems that check keys and test for determinism
  * See: https://reproducible-builds.org/