---
_class: lead
paginate: true
backgroundColor: #fff
---
<style>
/* Changed in Marp 4.0.0. Re-center. */
section.lead {
  display: flex;
}
div.two-columns {
  column-count: 2;
}
</style>
# Expanding (Dis)Trust
How can we prove that our software has not been tampered with at build time?
* Binary - software that's in a format computers can work with
* Compiler - builds software into binaries
* Hashing - takes a data set and produces a fixed-length string
  * 8a1aaf746ada2a80fab03a58c91575ffe82885ac "banana"
  * 9144b7b25e83a315de79e7a527f5631f9d4dacf2 "banan"
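
A minimal sketch of the hashing idea. The digests above are 40 hex characters, which is the length SHA-1 produces, so SHA-1 is assumed here purely for illustration; the slide does not state which algorithm or input encoding was used, so the printed values may not match it.

```python
import hashlib


def digest(text: str) -> str:
    """Return the hex SHA-1 digest of the UTF-8 bytes of `text`."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


for word in ("banana", "banan"):
    # Every input yields a digest of the same length; removing one
    # character produces a completely different digest.
    print(digest(word), repr(word))
```

The point is only that the output length is fixed and that any change to the input changes the digest.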
<!--
* How do we do this today? We don't have great tools for it. There is monitoring and static analysis, but neither directly ensures that our software wasn't tampered with; they watch the environment and hope to catch problems.
* This talk is about how we can practically verify the integrity of software
-->
---
# Anton Livaja
Co-Founder & Security Engineer at Distrust (https://distrust.co)
* Firm specializing in high assurance security consulting and engineering.
* Mission: to improve the security, privacy and freedom of as many people as
possible through working on fundamental security problems and creating open
source solutions.
* Clients: electrical grid operators, healthcare providers, fin-tech companies
and more.
<!--
* We specialize in supply chain security, operating system engineering, infrastructure hardening, and applied cryptography
* We exclusively use and write open source software
* Introduce some problems teams maybe weren't even thinking about
-->
---
# Ken Thompson's Reflections on Trusting Trust
> **The moral is obvious. You can't trust code that you did not totally create
yourself**. (Especially code from companies that employ people like me.) No
amount of source-level verification or scrutiny will protect you from using
untrusted code...
<!--
* Ken Thompson is a computer scientist from Bell Labs; he read an Air Force paper where he got this idea
* Even if you review your source code and verify it's secure, that's not enough, as the compiler can still modify the code
* How can you trust a compiler... and how can you trust all the software downstream that it builds? And again, how do we easily check whether the compiler, or some other aspect of the environment, injected malicious software (guest software)?
* This is an unexplored attack surface area; I will do my best to contextualize
it and give you a good intuition about it
* I won't open the can of worms on whether it's better to use open source
software in the context of security, but I'm firmly in the camp of don't trust,
verify
-->
---
![](http://www.gne.com.sg/wp-content/uploads/2017/11/SolarWinds-logo.png)
<!--
* One of the most significant breaches in recent history - Orion software platform - a monitoring tool to help orgs manage their infra including networks, servers, applications, dbs etc.
* 1000s of enterprise and government customers had their systems completely exposed
* This company is one of the GO TO companies for cybersecurity solutions
* The other thing that happened is that the APT stole cyber tooling, weaponized it, and used it to improve their evasive abilities
* This means that IP, government secrets etc. could have been leaked
* I never saw a proper response and retro on how to prevent this from happening
* Not directly the result of a compiler compromise, but there was no way to verify whether the software had been tampered with
-->
---
![no-tamper-evidence](https://antonlivaja.com/images/binary-exploit-2.png)
---
# What's the Answer?
* Integrity hashes are already widely used
* How do we use them to verify the integrity of software at build time, not after?
* Determinism / Reproducibility
  * > Method of building software which ensures that the resulting binary for
    a piece of software is always bit-for-bit identical.
  * When something is bit-for-bit identical each time, it is _deterministic_
  * Once something is _deterministic_, it can be _reproduced_
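
A toy sketch of why a build can fail to be deterministic, and what pinning an input buys you. The `build` function is a hypothetical stand-in for a compiler that embeds a build timestamp, and the timestamp values are arbitrary assumptions.

```python
import hashlib


def build(source: str, build_time: int) -> bytes:
    """Toy 'compiler': embeds a build timestamp in its output."""
    return f"BUILT_AT={build_time}\n{source}".encode("utf-8")


def sha256_hex(artifact: bytes) -> str:
    return hashlib.sha256(artifact).hexdigest()


SOURCE = 'print("hello")'

# Identical source built at two different times: the artifacts differ.
nondeterministic = {sha256_hex(build(SOURCE, t)) for t in (1730155838, 1732213789)}

# Pinning the timestamp (the idea behind conventions such as
# SOURCE_DATE_EPOCH) makes every build bit-for-bit identical.
PINNED_TIME = 0
deterministic = {sha256_hex(build(SOURCE, PINNED_TIME)) for _ in range(2)}

print(len(nondeterministic), len(deterministic))  # prints "2 1"
```

Real builds have more sources of variation than a timestamp, but each one is handled the same way: make it an explicit, fixed input.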
<!--
* We use integrity hashes to ensure that the software is not modified between the download source (CDN etc.) and the end user
* You may be thinking that most software is already deterministic by default, but it's not. This is because of things like timestamps, linking order, compilation flags, environment variables etc.
* This becomes very powerful when we start to reproduce the same software in
multiple different environments, and by different agents. Different hardware,
different OS, different person etc.
* So determinism is the method that allows us to easily and quickly check if
something new has been added to a binary
* How do we apply this to our tech stack?
-->
---
![height:600px](https://antonlivaja.com/images/expanded-3-hashes.png)
<!--
* In this example, we see the same software built deterministically in several different environments.
* Because it's deterministic, we know that we expect the same hash on all systems
* We easily notice that Azure produced a binary that hashes to a different value, and therefore know something is different about this binary
-->
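
The comparison in the diagram can be sketched as follows. The builder names and the shortened digests are hypothetical, and `find_outliers` is an illustrative helper, not part of any existing tool.

```python
from collections import Counter


def find_outliers(reported: dict[str, str]) -> list[str]:
    """Return the builders whose digest differs from the most common one."""
    majority_digest, _ = Counter(reported.values()).most_common(1)[0]
    return sorted(name for name, d in reported.items() if d != majority_digest)


# Hypothetical digests, shortened for readability.
reported = {
    "laptop": "aaaa1111",
    "aws": "aaaa1111",
    "gcp": "aaaa1111",
    "azure": "bbbb2222",
}
print(find_outliers(reported))  # prints "['azure']"
```

A mismatch does not say which side is wrong, only that something differs and needs investigating.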
---
# How Deep Do We Have to Go?
* Software Application
  * First Party Code
  * Third Party Code
* Build and Runtime Environment
  * Operating System + Packages
  * Additional CLI / Tools
  * Compiler
<!--
* We need everything to be deterministic - this is not how software is currently built
* And yes, this is not simple to do... so let's talk about how we can achieve this
* As a side note, a similar approach applies to interpreted languages like JavaScript, where we can hash the source code of the application and use a deterministically built environment and runtime.
-->
---
# Adequate Solution
* Allows us to make the whole tree deterministic
* Can be easily reproduced
* Drop-in replacement / easy to upgrade
---
# Bootstrapping our Way Up
![right:0% left:0% ](https://mermaid.ink/svg/pako:eNotjrsOgzAMRX8l8gw_kKFSga2dypgwWImBSHkpJANC_HtTiif73CP5HqCCJuCwJIwre3-kZ3Weog8uGktpYm37YJ14UfJkp3_cXbAX475lcmygSF6TV4a22-gvYxDPGK1RmE3wEzTgKDk0uv47fp6EvJIjCbyummYsNkuQ_qwqlhzG3SvgORVqIIWyrMBntFu9StSYaTBYe7ubnl_6WELh )
---
# Who Compiles the Compiler?
* Mostly downloaded as a binary
* Even if the compiler is built from source, usually another compiler is used to do so
* This means there is no clear provenance for how we went from nothing to having a usable compiler
<!--
* Maintainers of open source software are often the ones building this software, and even large organizations like Microsoft and Apple are not using determinism
* We can also rely on reverse engineering, but it's not a reliable or practical method
* So the very foundation of how we build software is not verifiable... that's a problem
-->
---
# Bootstrapping Compilers
* Consists of "stages" and hundreds of steps, starting from a human-auditable rudimentary compiler and building all the way up to a modern compiler
* Bootstrapping programming languages
<!--
* Complicated but auditable process
* We want to do this deterministically, of course, so that we have a tamper-evidence method
-->
---
# We Have a Compiler, Now What?
* Build all of the different dependencies we need:
  * `linux kernel`
  * `bash`
  * `openssl`
  * `git`
* Yes... I mean *everything* in your build environment
---
# Status Check-In
* So far we have established we need the following for a solution:
  * Bootstrap a compiler in a deterministic manner
  * Use the compiler to build all our dependencies
* Last thing remaining: your application
<!-- Now this seems like a lot... and it is, so we went ahead and built
an open source solution that tries to address the problem -->
---
# [Stageˣ]
Open source Linux Distribution
---
# Multi-Signed, Bootstrapped, Deterministic, and Minimal
<!-- Speaker notes
* Most Linux distributions are built for *compatibility* rather than *security*. This results in a dramatic increase in the attack surface area of an operating
system.
* StageX is the first multisig Linux distribution, is one of two fully
bootstrapped Linux distributions, is 100% reproducible and deterministic,
and can build complicated software with as few dependencies exposed as
possible.
* The other thing that differentiates StageX from other solutions like NixOS
is that it is fully container native, so there is no package manager required,
such as flake or otherwise.
-->
<hr />
| Distribution | Signatures | Libc | Bootstrapped | Reproducible | Rust deps |
|--------------|------------|-------|--------------|--------------|----------:|
| StageX | 2+ Human | Musl | Yes | Yes | 4 |
| Debian | 1 Human | Glibc | No | Partial | 231 |
| Arch | 1 Human | Glibc | No | Partial | 127 |
| Fedora | 1 Bot | Glibc | No | No | 167 |
| Alpine | None | Musl | No | No | 41 |
<!-- NOTE: "Rust deps" is the number of dependencies required to build a Rust
hello world -->
---
# Full source bootstrapped from Stage 0
From a 256-byte compiler written in hex0, StageX bootstraps all the compiler
tools necessary to build the distribution, 100% deterministically.
- Stage 0: Getting a basic C compiler on x86
- Stage 1: Building GCC for x86
- Stage 2: Upgrading GCC for x86_64
- Stage 3: Building up-to-date toolchains
- Stage X: Shipping the software you know and love
---
# A Rust Example
```dockerfile
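# Global default; each stage that uses it re-declares it with a bare ARG.
ARG TARGET=x86_64-unknown-linux-musl

FROM stagex/pallet-rust@sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c AS build
ARG TARGET
ADD . /src
WORKDIR /src
RUN cargo build --release --target ${TARGET}

FROM scratch
ARG TARGET
# The build ran in /src, so the binary lives under /src/target.
COPY --from=build /src/target/${TARGET}/release/hello /usr/bin/hello
CMD ["/usr/bin/hello"]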
```
<!--
* This may look very similar to what you would do with Alpine Linux, but the difference is that with Alpine you are trusting single points of failure, since none of the Alpine packages are reproduced and signed by multiple parties - this
is why we made StageX. They also do not use bootstrapped compilers.
* Who built Alpine's Rust, and what compiler did they use?
* There is no way to easily reproduce most software, so you can't verify it for yourself; you are blindly trusting that the binary is clean
-->
---
# All packages in StageX are:
* Built using hash-locked sources
* Confirmed reproducible by multiple developers
* Signed by multiple release maintainers
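
A minimal sketch of what "hash-locked sources" means in practice, assuming SHA-256 as the pinning algorithm. `verify_source`, the tarball bytes and the lock value are hypothetical illustrations, not StageX's actual tooling.

```python
import hashlib
import hmac


def verify_source(data: bytes, pinned_sha256: str) -> None:
    """Raise ValueError unless `data` hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    if not hmac.compare_digest(actual, pinned_sha256.lower()):
        raise ValueError(f"source hash mismatch: expected {pinned_sha256}, got {actual}")


# Hypothetical tarball contents; in a real package definition the lock
# value is committed next to the source URL rather than computed here.
tarball = b"example source archive"
lock = hashlib.sha256(tarball).hexdigest()

verify_source(tarball, lock)  # passes silently; a modified tarball raises
```

Because the digest is committed ahead of time, a mirror or middleman that alters the download is detected before anything is built.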
<!-- Speaker notes
To ensure StageX remains a secure toolchain, there's some additional
maintenance that is performed compared to most distributions. This includes:
* Built using hash-locked sources. This ensures every developer gets the exact same copy of the code for each container, so no middleman could inject
malware, which helps with:
* Reproducing projects, ensuring they're built deterministically. This confirms
that no single developer, nor their machine, has been compromised. Once each
package is confirmed, it is...
* Signed by the release maintainers. These maintainers each build a copy of the
package locally and sign the containers with an OCI-compliant signature using
well-known OpenPGP keys.
---
-->
![bg right:35% 80% ](https://mermaid.ink/svg/pako:eNptUstugzAQ_BVrzyQU0-ZBpR7S9lhVKr2FHIy9gCuDkbFTRYh_ryFVgtL6YO_OjHdk7_bAtUBIoFD6m1fMWPK5yxri185JJaL9dBzIYvFEPrA1WjiO0f4SHmZi-q-Y_hFf60zKVJZNtB_3W55eeDrnpwu_JgpZh1eYzmEIoEZTMyn8A_tRlIGtsMYMEh8qWVY2g2BGPKfpmVsqlqMihTboq77nX8gt6Yk-ohl_KiFH2clc4SMZMsiawVsxZ3V6ajgk1jgMwGhXVpAUTHU-c61gFl8kKw2rb9BXIa02F1BpJtCnPdhTO_amlJ31Blw3hSxH3Bnl4cratkvCcKSXpbSVy5dc12EnxdjI6rhdhSu62jAa42ods4c4FjyPtpuC3keFWN9FlMEwBICT_9t5EKZ5GH4Asmmvxw )
<!--
flowchart TB
Build1[Build] --\> Reproduce1[Reproduce]
Build2[Build] --\> Reproduce2[Reproduce]
Reproduce1 --\> Sign1[Sign]
Reproduce2 --\> Sign2[Sign]
Sign1 --\> Release
Sign2 --\> Release
{
"theme": "light",
"themeCSS": ".label foreignObject { overflow: visible; }"
}
-->
<!-- TODO: talk about bootstrapping, incl. corrupt compilers in distro
toolchain -->
<!-- https://distrowatch.com/images/other/distro-family-tree.png -->
---
# Multi-Signed OCI Images
<!-- Speaker notes
* We have multiple individuals rebuild all of the software in the StageX distribution
* You can also clone the StageX repository, install Docker and run the command `make` to verify for yourself that all the hashes match
* You can overlay rules around how many times software has to have been rebuilt, and a trusted list of cryptographic keys the software has to be signed by, to ensure you always have a desired level of reproducibility in your stack
---
-->
[![ ](https://mermaid.ink/svg/pako:eNpdklFrgzAQx79KuGdbV91s62DQpmNPZbDube4hJqdmRFNi7Cjid1-sa7EGAvn_f3c5LpcWuBYIMWRK__KCGUs-d0lF3NpvvvZMVtZtNGTzTWazF_Km_-F2DLcTSMeQTuBmkJReb5poOtFD_EdT27uEkUFvBnhQoimZFK6ltscJ2AJLTCB2RyXzwibgjQA9HAY2VyxFRTJtUObVe_qD3JKW6BOa_m1icpK1TBU-ky6BpOpcKdZYfThXHGJrGvTA6CYvIM6Yqp1qjoJZ3EmWG1ZeQwbzVUirzS1SaSbQyRbs-dgPI5eXVriuMpn3fmOUswtrj3Xs-z2e59IWTTrnuvRrKfrJFad15EdBtGJBiNEyZE9hKHi6WK-y4HGRieXDImDQdR7gpf5-mPzlA3R_HuyhBw )](https://mermaid.ink/svg/pako:eNpdklFrgzAQx79KuGdbV91s62DQpmNPZbDube4hJqdmRFNi7Cjid1-sa7EGAvn_f3c5LpcWuBYIMWRK__KCGUs-d0lF3NpvvvZMVtZtNGTzTWazF_Km_-F2DLcTSMeQTuBmkJReb5poOtFD_EdT27uEkUFvBnhQoimZFK6ltscJ2AJLTCB2RyXzwibgjQA9HAY2VyxFRTJtUObVe_qD3JKW6BOa_m1icpK1TBU-ky6BpOpcKdZYfThXHGJrGvTA6CYvIM6Yqp1qjoJZ3EmWG1ZeQwbzVUirzS1SaSbQyRbs-dgPI5eXVriuMpn3fmOUswtrj3Xs-z2e59IWTTrnuvRrKfrJFad15EdBtGJBiNEyZE9hKHi6WK-y4HGRieXDImDQdR7gpf5-mPzlA3R_HuyhBw)
<!--
flowchart TD
MA[Maintainer A] --\> Go
MB[Maintainer B] --\> Go
MC[Maintainer C] --\> Go
MA --\> GCC
MB --\> GCC
MC --\> GCC
MA --\> Rust
MB --\> Rust
MC --\> Rust
{
"theme": "light",
"themeCSS": ".label foreignObject { overflow: visible; }"
}
-->
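
A multi-signature rule of the kind shown above can be sketched as a threshold check over trusted keys. The key names and `meets_policy` are hypothetical, and the sketch assumes each signature has already been cryptographically verified against the same image digest.

```python
def meets_policy(signers: set[str], trusted_keys: set[str], threshold: int) -> bool:
    """True if at least `threshold` distinct trusted keys signed the artifact."""
    return len(signers & trusted_keys) >= threshold


# Hypothetical key fingerprints.
TRUSTED = {"KEY_A", "KEY_B", "KEY_C"}

print(meets_policy({"KEY_A", "KEY_B"}, TRUSTED, threshold=2))        # True
print(meets_policy({"KEY_A", "KEY_UNKNOWN"}, TRUSTED, threshold=2))  # False
```

Signatures from keys outside the trusted list are ignored, so an attacker cannot reach the threshold by adding their own key.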
---
# Pallets
StageX will soon offer prebuilt containers including all the packages necessary to run some of our most used software, such as:
- `rust`
- `go`
- `nodejs`
- `nginx`
- `redis`
- `postgres`
<!--
* We already offer packages that can be used today, and are used in production by multiple companies
* Adding a usability improvement where all the dependencies are grouped into what we are calling "pallets"
-->
---
# Key Takeaways
* Bootstrapped compiler
* Fully deterministic
* Packages the software you're already using, but in a more secure manner
* Is a drop-in replacement, and has native container support
<!--
Other distributions run their own package manager inside of containers
We use containers as our package manager: 100% container native, no attack surface
* Package managers are notorious for introducing attack surfaces, such as arbitrary execution of `setup.py` or post-download scripts, and by using Docker
as our package manager, we avoid all forms of spontaneous execution.
All StageX software is built deterministically, meaning you can be sure all
components listed in your Software Bill Of Materials haven't been tampered with.
Because StageX provides a toolchain for you to build your software in the same
manner
* Available on Docker Hub
-->
---
# What's Next?
* Adding SBOM
* Packaging more software
* Fully automating software updates
* Additional container runtimes like Podman and Kaniko
* Additional chip architecture support such as ARM and RISC-V
---
# How You Can Help
* Provide feedback
* Support with development efforts
* Become a sponsor
---
# Links
**Email**: anton@distrust.co / sales@distrust.co
**Matrix Chat**: #stagex:matrix.org
**Docker Hub**: https://hub.docker.com/u/stagex
**Git Repo**: https://codeberg.org/stagex/stagex
Big thank you to sponsors who have supported the development of this project:
**Turnkey, Distrust, Mysten Labs**
Thank you to InCyber for hosting this fantastic event!