docs/engineering-standards.md


Engineering Standards

These are the opinionated engineering standards we use internally at Distrust and expect all contractors and vendors to follow.

Summary

This process is recommended for any binaries that, if compromised, could result in a loss of funds exceeding $1,000,000, which is roughly the level at which incidents of violent coercion or deployments of expensive ($200k+ per Zerodium) "zero-day" exploits have been seen in our industry.

It is reasonable to expect that any privileged general-purpose OS kernels, tools, or binaries may, in some use cases, have access to memory that holds high-value secrets. This makes such binaries, and those who maintain them, significant targets for supply chain attacks by malicious adversaries, be it covert via a 0day or overt via violence.

It is also reasonable to expect that a binary that runs on multiple engineer workstations could have the power to deploy to production.

This document seeks to mitigate all classes of supply chain attacks we have seen deployed in the wild through high-accountability build and release processes, and to maintain a high standard of code quality and security.

Motivation

80% of software bugs are memory safety issues, which speaks to poor coding standards and a failure to use memory-safe languages where possible.

Adversaries are also demonstrating an increased willingness to resort to supply chain attacks to extract value from software systems.

Examples:

Worse still, when such vectors are not readily viable, some adversaries are even willing to resort to physical violence just to get a signature from the private key that can directly or indirectly control significant value.

Examples:

Given this, we should expect that humans in control of any single point of failure in a software ecosystem that controls a significant amount of value are placing themselves at substantial personal risk.

To address this, we must avoid trusting any single human or any system they control by design in order to protect those individuals and downstream users.

Threat Model

  • Every human will make coding mistakes
  • Any single human or computer involved in the supply chain is compromised
  • All systems managed by a single party (IT, third parties) are compromised
  • Any code or binaries controlled by one system or party are compromised
  • All memory of all internet-connected computers is visible to an adversary
  • Any logging or backups that are not signed have been tampered with
  • Adversary wields a "zero-click" "Zero-Day" exploit for any system

Requirements

The following requirements apply only to code bound for production; they can be met in any order or cadence desired.

  • Third-party code:
    • MUST have extensive and frequent reviews.
      • Example: The Linux kernel has well funded distributed review.
    • MUST be hash-pinned at known reviewed versions
    • MUST be at a version that includes all known relevant security patches
    • SHOULD be latest versions if security disclosures lag behind releases
      • Example: The Linux kernel
  • First-party code:
    • MUST be signed in version control systems by well-known author keys
    • MUST be signed by a separate subject matter expert after a security review
  • All code MUST build deterministically
  • All binaries:
    • MUST be built and signed by multiple parties with no management overlap
      • Example: One build by IT, another by Infrastructure team managed CI/CD
    • MUST be signed by well-known keys signed by a common CA
      • Example: PGP Smartcards signed under OpenPGP-CA.
  • All signing keys:
    • MUST be stored in approved PERSONAL HSMs
    • MUST NOT ever come in contact with network-accessible memory
  • All execution environments SHOULD be able to verify m-of-n binary signatures
    • Example: Custom Secure Boot verifies minimum signatures against CA
  • Only code required for sensitive operation SHOULD be present
    • Example: A server has no need for sound card drivers
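The deterministic-build and multi-party signing requirements above can be spot-checked by comparing the hashes of artifacts produced by two independently managed builders. The sketch below is illustrative only: the directory names and placeholder bytes stand in for real build outputs.

```shell
# Sketch: confirm two independent builds of the same commit are bit-identical.
# build_a/ and build_b/ stand in for outputs from two separately managed
# builders (e.g. an IT-run builder vs Infrastructure-team-managed CI).
workdir=$(mktemp -d)
mkdir -p "$workdir/build_a" "$workdir/build_b"
printf 'artifact-bytes' > "$workdir/build_a/app.bin"   # placeholder artifact
printf 'artifact-bytes' > "$workdir/build_b/app.bin"   # placeholder artifact

hash_a=$(sha256sum "$workdir/build_a/app.bin" | cut -d' ' -f1)
hash_b=$(sha256sum "$workdir/build_b/app.bin" | cut -d' ' -f1)

if [ "$hash_a" = "$hash_b" ]; then
    echo "reproducible: $hash_a"
else
    echo "MISMATCH: $hash_a vs $hash_b" >&2
    exit 1
fi
```

Any hash mismatch between builders is treated as a hard failure, since it means either the build is not deterministic or one builder is compromised.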

Workflows

Design

Many implementations are possible under the above requirements and threat model; however, the following opinionated workflows can be used as a reference when in doubt.

Engineer Workflow

Will vary significantly by project, but at a minimum:

  1. Engineer makes source code change
  2. Engineer builds and tests changes locally
  3. Engineer makes commit signed with PERSONAL HSM and pushes
  4. Submits code for peer review if multiple engineers are active in codebase

Continuous Integration

This is ideal, though not always present for all projects.

  1. Pulls code
  2. Verifies commit signature by key signed with hardcoded CA key
  3. Builds binaries
  4. Verifies binary hashes match hashes in signed commit
  5. Runs test suite
  6. Signs binary with a well-known key (ideally held in HSM)
  7. Publishes binary and signature to artifact storage
  8. Continuous Integration notifies project team of success/error of above steps
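Step 4 amounts to a hash-manifest check. The sketch below assumes a hypothetical `manifest.txt` of expected hashes; in real CI that manifest would be read from the already signature-verified commit rather than generated locally.

```shell
set -e
# Sketch of CI step 4: freshly built binaries must match the hashes
# recorded in the signed release commit.
workdir=$(mktemp -d)
mkdir -p "$workdir/out"
printf 'binary-bytes' > "$workdir/out/app"   # stand-in for the CI build output

# In real CI this manifest comes from the verified commit, not a fresh hash.
( cd "$workdir" && sha256sum out/app > manifest.txt )

# sha256sum -c exits non-zero (failing the pipeline) on any mismatch.
( cd "$workdir" && sha256sum -c manifest.txt )
```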

Security Reviewer

  1. Reviews all code changes between release candidate and last commit
  2. Reviews all third-party changes and evidence they are appropriate
  3. Appends tag signed with PERSONAL HSM to release candidate commit reviewed

Release Engineer

  1. Release engineer verifies:
    • Commit signature and security reviewer signature on commit are valid
    • The artifact corresponding to this commit is signed by the CI key
    • The artifact hash is in the release candidate commit
  2. Release engineer generates detached artifact signature with PERSONAL HSM
  3. Release engineer publishes detached signature to artifact store
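Steps 2 and 3 can be sketched with `ssh-keygen`'s detached signing mode. A file-based ed25519 key again stands in for the release engineer's PERSONAL HSM, and `ssh-keygen -Y` stands in for the PGP smartcard tooling the requirements actually call for.

```shell
set -e
dir=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$dir/release_key"
printf 'artifact-bytes' > "$dir/app.bin"   # stand-in for the verified CI artifact

# Writes the detached signature to app.bin.sig, published beside the artifact.
ssh-keygen -Y sign -f "$dir/release_key" -n file "$dir/app.bin"

# Downstream consumers verify against an allow-list of known release keys.
printf 'release@example.com %s\n' "$(cut -d' ' -f1,2 "$dir/release_key.pub")" \
    > "$dir/allowed_signers"
ssh-keygen -Y verify -f "$dir/allowed_signers" -I release@example.com \
    -n file -s "$dir/app.bin.sig" < "$dir/app.bin" && echo "artifact signature OK"
```

Because the signature is detached, the artifact store can hold signatures from multiple parties (CI plus the release engineer) side by side, supporting the m-of-n verification described under Requirements.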

Guidelines

Boilerplate

All projects should have:

  • A README.md containing goals, requirements, and build/usage instructions
  • A Makefile configured such that 'make' will build the project
  • A Stagex Containerfile responsible for building the project deterministically

Dependency Choices

Use the following guidelines to help decide when and how to use third-party dependencies.

```mermaid
flowchart TD
    A[Can it be done with the standard library in under ~10k easily readable lines?]
    A --> B{Yes} --> C
    A --> D{No} --> E

    E[Can it be done with a library used in the official interpreter or compiler?]
    E --> F{Yes} --> X
    E --> G{No} --> I

    I[Does a widely used, well vetted, well reviewed, and well maintained library exist?]
    I --> J{Yes} --> X
    I --> K{No} --> L

    L[Is this a cryptography or security sensitive use case?]
    L --> N{Yes} --> P[Review it yourself and pay for a reputable external security audit] --> X
    L --> M{No} --> O

    O[Does -any- suitable library exist small enough for you to review yourself?]
    O --> R{Yes} --> S[Review it yourself and with a peer] --> X
    O --> Q{No} --> C

    C[Write it yourself]

    X[Document the rationale and use the library at a specific version we have reason to trust]
```

Language Guidelines

Rust

TBD

Glossary

MUST, MUST NOT, SHOULD, SHOULD NOT, MAY

These keywords correspond to their IETF definitions per RFC 2119