---
_class: lead
paginate: true
backgroundColor: #fff
---

<!-- __ -->

![bg left:40% 80%](img/stagex-logo.png)

# Bootstrapping Reproducibility with StageX

<!--
Minimalism and security first repository of reproducible and multi-signed OCI
images of common open source software toolchains full-source bootstrapped from
Stage 0 to the compiler and libraries you'll use.
-->

---

# Minimalism and security first repository

StageX distributes a toolchain in which each component uses exactly what it
needs to build: no more, no less.

<!--
TODO: include image describing traditional package building, by installing
_every_ dependency in a single OS, with a comparison of stagex only having mini
Containerfiles with just what each project needs.
-->

<!-- Speaker notes
Most Linux distributions are built for *compatibility* rather than *security*,
which dramatically increases the attack surface of the operating system. StageX
is designed for building application-specific environments with a minimal
footprint. Each component of the toolchain installs only what it needs and
packages only what it builds, which keeps the attack surface small.
-->

---

# A Rust Example

```dockerfile
FROM scratch AS build
COPY --from=stagex/busybox . /
COPY --from=stagex/rust . /
COPY --from=stagex/musl . /
COPY --from=stagex/gcc . /
COPY --from=stagex/llvm . /
COPY --from=stagex/binutils . /
COPY --from=stagex/libunwind . /
ADD <<EOF hello.rs
fn main() {
    println!("Hello, world!");
}
EOF
RUN rustc hello.rs
FROM scratch
COPY --from=build ./hello .
CMD ["./hello"]
```

<!-- Speaker notes
In this example, note how we are only pulling in Rust and the dependencies
required to invoke Rust. We don't include anything extra, which reduces the
attack surface when compiling software.
-->

---

# All packages in StageX are:

* Built using hash-locked sources
* Confirmed reproducible by multiple developers
* Signed by multiple release maintainers
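
As a minimal sketch of the first point, hash-locking amounts to pinning a digest next to each source and refusing to build on a mismatch. The helper name `verify_source` and the sample bytes are hypothetical and are not taken from StageX itself:

```python
import hashlib


def verify_source(data: bytes, pinned_sha256: str) -> None:
    """Raise ValueError unless data hashes to the pinned SHA-256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != pinned_sha256:
        raise ValueError(f"hash mismatch: expected {pinned_sha256}, got {actual}")


# Hypothetical source archive and the digest pinned when it was reviewed.
tarball = b"example source archive"
pinned = hashlib.sha256(tarball).hexdigest()

verify_source(tarball, pinned)  # passes silently; a modified archive raises
```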

<!-- Speaker notes
To keep StageX a secure toolchain, its maintainers perform some additional
work compared to most distributions:

* Building from hash-locked sources. Every developer gets the exact same copy
  of the code for each container, so no middleman can inject malware. This in
  turn helps with:
* Reproducing packages to confirm they build deterministically. Matching
  results show that no single developer, nor their machine, has been
  compromised. Once a package is confirmed, it is:
* Signed by the release maintainers. Each maintainer builds a copy of the
  package locally and signs the container with an OCI-compliant signature
  using well-known OpenPGP keys.
-->

<!-- TODO: talk about bootstrapping, incl. corrupt compilers in distro
toolchain -->

<!-- https://distrowatch.com/images/other/distro-family-tree.png -->

<!-- TODO: libfakerand to act as the "why" -->
<!--
* Create modified compiler which injects libfakerand during build time
* Use it to compile software from source, for example bitcoin core
* Show that the wallet generated with bitcoin core is not random
-->

---

# OCI Images

<!--
  Put some kind of graphic here to explain the association between images
  and multisig
-->

<!--
StageX uses the Open Container Initiative standard for images to support the
use of multiple container runtimes. Because OCI images can be signed using
OpenPGP keys, this allows the association of built images to signatures, which
can enable developers to build their software using StageX, without having to
build the entire StageX toolchain for themselves.
-->
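
One way to picture the association between an image and its signatures: each maintainer attests to an image digest, and a consumer accepts the image only when enough distinct, trusted maintainers attest to the same digest. The sketch below models that bookkeeping only and does no OpenPGP verification. The function `is_trusted`, the signer names, and the threshold of 2 are illustrative assumptions, not StageX's actual policy:

```python
import hashlib


def oci_digest(blob: bytes) -> str:
    """Format a content digest the way OCI does: '<algorithm>:<hex>'."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def is_trusted(digest, attestations, trusted_signers, threshold=2):
    """True if at least `threshold` distinct trusted signers attest to digest.

    attestations: iterable of (signer, digest) pairs whose signatures are
    assumed to have been cryptographically verified already.
    """
    signers = {s for s, d in attestations if d == digest and s in trusted_signers}
    return len(signers) >= threshold


# Hypothetical image and attestations; "mallory" is not a trusted signer.
image = oci_digest(b"hypothetical image manifest")
attestations = [("alice", image), ("bob", image), ("mallory", image)]
print(is_trusted(image, attestations, {"alice", "bob"}))  # True
```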

---

# Common toolchain dependencies

StageX comes with developer-loved tooling and languages, such as:

* `rust`
* `go`
* `python`
* `curl`
* `git`

<!-- TODO: Add end-user software like tofu, stagex, ocismack, kubectl, etc. -->

If you would like to see additional software added, feel free to open a PR or let us know.

---

# Pallets

StageX offers prebuilt containers including all the packages necessary to run
some of our most used software, such as:

* `kubectl`, `kustomize`, `helm`
* `keyfork`
* `nginx`
* `redis`
* `postgres`

---

# **Full source bootstrapped from Stage 0**

The StageX compiler and all libraries necessary to build software are themselves fully bootstrapped and deterministic.

Bootstrapped: built up from "nothing" so that every step of the compiler's construction can be verified, ensuring no malicious code is added at any point.

Ken Thompson describes the risk of using a compiler which can't be verified to be trustworthy in his seminal paper "Reflections on Trusting Trust".

---

# **OK, So What?**

StageX can eliminate an entire family of supply chain vulnerabilities.

By reducing the number of dependencies needed to build and run software, we remove unnecessary software that could act as an entry point for malware.

For example, using Debian as a base for `rust` pulls in **232 dependencies**, whereas StageX requires only **4 dependencies**.

---

Additionally, there has not been a simple way to verify that a compiler can be trusted.

This is because compilers are used to build other compilers, and for a long time we had lost the ability to build a compiler toolchain up from "nothing".

StageX bootstraps the compiler toolchain from source, so anyone can review the code to verify that nothing malicious was introduced at any point. It also does so deterministically, which makes it simple to further verify the integrity of the resulting binaries.

---

# SolarWinds

According to: https://www.crowdstrike.com/blog/sunspot-malware-technical-analysis/

> * SUNSPOT is StellarParticle’s malware used to insert the SUNBURST backdoor into software builds of the SolarWinds Orion IT management product.
> * SUNSPOT monitors running processes for those involved in compilation of the Orion product and replaces one of the source files to include the SUNBURST backdoor code.
> * Several safeguards were added to SUNSPOT to avoid the Orion builds from failing, potentially alerting developers to the adversary’s presence.

<!--
We can see that the compromise occurred because the threat actors infiltrated the network
and replaced source code files at build time.

Deterministic builds could have caught this:

* Ensuring that all our build-time dependencies are reviewed and built deterministically
* Ensuring that our commits are signed (additional protection)
* Ensuring that the final result is deterministic

If SolarWinds had deployed a secondary runner in an isolated, pull-only environment,
it is very unlikely they would have missed that something was amiss in their final
release build. In fact, if any developer had built the code locally, they would have
noticed that the results did not line up.

TODO create graph illustrating what their deployment pipeline likely looks like today
TODO create graph of what it would look like with multi reproduction
-->
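
One way to catch a build-time source swap like this is to compare artifact digests reported by independent builders and block the release on any disagreement. The following is a rough sketch under that assumption. The builder names and artifact bytes are made up for illustration:

```python
import hashlib
from collections import Counter


def find_divergent_builders(reported):
    """Return builders whose digest differs from the most common digest.

    reported: non-empty dict mapping builder name -> hex digest of its artifact.
    """
    majority, _ = Counter(reported.values()).most_common(1)[0]
    return sorted(b for b, d in reported.items() if d != majority)


# Hypothetical artifacts: the CI runner's source was swapped during the build.
clean = hashlib.sha256(b"orion build").hexdigest()
tampered = hashlib.sha256(b"orion build + backdoor").hexdigest()

reported = {"ci-runner": tampered, "isolated-runner": clean, "dev-laptop": clean}
print(find_divergent_builders(reported))  # ['ci-runner']
```

Any non-empty result should halt the release for investigation. The majority vote only helps name a suspect; it does not prove the majority is clean.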

---

# **What's Next?**

Packaging more software

Adding support for additional container runtimes like Podman and Kaniko

Adding support for additional chip architectures such as ARM and RISC-V

---

# **Links**

**Presenter**: <your_name>

**Matrix Chat**: #stagex:matrix.org

**Git Repo**: https://codeberg.org/stagex/stagex

Big thank you to sponsors who have supported the development of this project:

**Turnkey, Distrust, Mysten Labs**