add incyber presentation - montreal nov 2024

2024-11-21 13:29:49 -05:00 · 2024-11-21 13:29:49 -05:00 · 4a55268f07
parent a8e20f2e51
commit 4a55268f07
4 changed files with 91 additions and 144 deletions
--- a/stagex/img/binary-exploit-2.png
+++ b/stagex/img/binary-exploit-2.png
--- a/stagex/img/binary-tampering.png
+++ b/stagex/img/binary-tampering.png
--- a/stagex/img/expanded-3-hashes.png
+++ b/stagex/img/expanded-3-hashes.png
--- a/stagex/incyber.md
+++ b/stagex/incyber.md
@ -22,22 +22,15 @@ How can we prove that our software has not been tampered during build time?
 * Binary - software that's in a format computers can work with
 * Compiler - builds software into binaries
 * Hashing - takes a data set and produces a fixed length string
+    * 8a1aaf746ada2a80fab03a58c91575ffe82885ac "banana"
+    * 9144b7b25e83a315de79e7a527f5631f9d4dacf2 "banan"
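
The hashing bullet above can be tried out with a few lines of Python. This is a minimal sketch that assumes SHA-1 (the 40-character digests on the slide are consistent with it, but the slide names no algorithm) and UTF-8 input with no trailing newline. It therefore makes no claim to reproduce the exact digests listed above. The helper names `sha1_hex` and `differing_positions` are made up for illustration.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of the UTF-8 bytes of `text` as 40 hex characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    full = sha1_hex("banana")
    short = sha1_hex("banan")
    # Dropping a single character yields an unrelated-looking digest of the same length.
    print(full, '"banana"')
    print(short, '"banan"')
    print("hex positions that differ:", differing_positions(full, short))
```

The point for the talk is that the output length is fixed and that any change to the input, however small, produces a different digest.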

 <!--
-* This talk is a "yet another" supply chain security talk, but likely unlike
-most you have seen so far

-* This is a question relevant to everyone who ships software. At some point in
-our supply chains, we rely on compilers, and software libraries which are part
-of operating systems we use, different language ecosystems etc.
+* How do we do this today? We don't really have great tools for it. There is monitoring, and we can do static analysis etc., but neither is a direct way of
+ensuring our software wasn't tampered with; they only watch the environment and hope to catch problems.

-* What if there are issues in the source code of the app, third party libraries,
-OS packages, cli tools or additional software that your app requires to be
-built or run
-
-* How do we do this today? We don't really have great tools to do this. There is
-monitoring, we can do static analysis etc., but these are not a direct way of
-ensuring our software wasn't tampered, but rather monitor the environment.
+* This talk is about how we can practically verify the integrity of software

 -->
 ---
@ -58,6 +51,8 @@ and more.
 <!--
 * We specialize in supply chain security, operating system engineering, infrastructure hardening, and applied cryptography

+* We exclusively use and write open source software
+
 * Introduce some problems teams maybe weren't even thinking about
 -->

@ -71,11 +66,12 @@ amount of source-level verification or scrutiny will protect you from using
 untrusted code...]

 <!--
-* TODO: who is Ken Thompson is a computer scientist from Bell Labs, read a
+* Ken Thompson is a computer scientist from Bell Labs who read an
 Air Force paper where he got this idea

-* Even if you review your source code and verify it's secure, that's not enough,
-as the compiler can still modify code
+* Even if you review your source code and verify it's secure, that's not enough, as the compiler can still modify code
+
+* How can you trust a compiler? And how can you trust all the downstream software that it builds? And again, how do we easily check whether the compiler, or some other part of the environment, injected malicious software (guest software)?

 * This is an unexplored attack surface area; I will do my best to contextualize
 it and give you a good intuition about it
@ -91,23 +87,24 @@ verify

 <!--
 * One of the most significant breaches in recent history - Orion software platform - a monitoring tool to help orgs manage their infra including networks, servers, applications, dbs etc.
-* While not directly the result of compiler compromise, it is directly related to the issue at hand. Rather than a compiler, in this case it was the environment that caused the issue
-* Build system injected malicious code
-* Happened because we don't have a simple method to ensure that software is tamper evident
 * 1000s of enterprise and government customers had their systems completely exposed
 * This company is one of the GO TO companies for cybersecurity solutions
-* The other thing that happened is that the APT stole cybertooling and weaponized
-it and used to improve their evasive abilities
+* The other thing that happened is that the APT stole cybertooling, weaponized it, and used it to improve their evasive abilities
 * This means that IP, government secrets etc could have been leaked
 * I never saw a proper response and retro on how to prevent this from happening
-again
+* Not directly the result of a compiler compromise, but there was no way to verify whether the software had been tampered with
 -->

 ---

+![no-tamper-evidence](https://antonlivaja.com/images/binary-exploit-2.png)
+
+---
+
 # What's the Answer?

 * Integrity hashes are already widely used
+    * How do we use them to verify the integrity of software during build time, not after?

 * Determinism / Reproducibility
  * > Method of building software which ensures that the resulting binary for
@ -119,12 +116,9 @@ again

 <!--

-* We use integrity hashes to ensure that the software is not modified between the
-download source (CDN etc.) and end user
+* We use integrity hashes to ensure that the software is not modified between the download source (CDN etc.) and end user

-* You may be thinking that it's likely that most software is already deterministic
-by default - but it's not. This is because of things like time stamps, linking order,
-compilation flags, environment variables etc.
+* You may be thinking that it's likely that most software is already deterministic by default - but it's not. This is because of things like time stamps, linking order, compilation flags, environment variables etc.

 * This becomes very powerful when we start to reproduce the same software in
 multiple different environments, and by different agents. Different hardware,
@ -133,17 +127,22 @@ different OS, different person etc.
 * So determinism is the method that allows us to easily and quickly check if
 something new has been added to a binary

-* To make it clear, there are integrity hashes currently available for software,
-but they are nearly never deterministic, which means they only defend you from
-compromise of the last leg of the trip, from the CDN/server to the end user, but
-anything upstream is still susceptible to tampering, and there is no way to
-reproduce the software to verify the hash matches, you can only check that the
-binary you downloaded matches the hash they posted online and signed.
-
 * How do we apply this to our tech stack?
 -->
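
The speaker notes above name timestamps as one reason builds are not deterministic by default. The following sketch illustrates that with a toy "compiler". `toy_build` is a hypothetical stand-in, not a real toolchain, and the timestamps are arbitrary values chosen for the example. Pinning the embedded time to an agreed value is the same idea that the Reproducible Builds project's `SOURCE_DATE_EPOCH` convention standardizes.

```python
import hashlib


def toy_build(source: bytes, build_time: int) -> bytes:
    """Hypothetical stand-in for a compiler that stamps the build time into its output."""
    return b"TOYBIN\n" + source + b"\nbuilt-at=" + str(build_time).encode("ascii")


def digest(artifact: bytes) -> str:
    """SHA-256 of the artifact as a hex string."""
    return hashlib.sha256(artifact).hexdigest()


if __name__ == "__main__":
    source = b"int main(void) { return 0; }"

    # Same source, different build times: the artifacts (and digests) differ.
    first = digest(toy_build(source, build_time=1_700_000_000))
    second = digest(toy_build(source, build_time=1_700_086_400))
    print("unpinned builds match:", first == second)

    # Same source, timestamp pinned to one agreed value: the digests match.
    pinned_a = digest(toy_build(source, build_time=0))
    pinned_b = digest(toy_build(source, build_time=0))
    print("pinned builds match:  ", pinned_a == pinned_b)
```

Real builds have more sources of variation (linking order, flags, environment variables), and each one has to be pinned in a similar way before the digest becomes a useful tamper-evidence check.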
 ---

+![height:600px](https://antonlivaja.com/images/expanded-3-hashes.png)
+<!--
+* In this example, we see the same software built deterministically in several different environments.
+
+* Because it's deterministic, we know that we expect the same hash on all systems
+
+* We easily notice that Azure produced a binary that hashes to a different value, and therefore know something is different about this binary
+
+-->
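
The comparison described in the notes above can be sketched as follows. This is an illustration only: the builder names other than Azure are invented, the digests come from toy byte strings, and `compare_builds` is a hypothetical helper rather than part of any tool mentioned in this deck.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def compare_builds(digests: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (most common digest, sorted names of builders that disagree with it)."""
    if not digests:
        raise ValueError("no builds to compare")
    consensus, _count = Counter(digests.values()).most_common(1)[0]
    outliers = sorted(name for name, d in digests.items() if d != consensus)
    return consensus, outliers


if __name__ == "__main__":
    expected = hashlib.sha256(b"toy artifact").hexdigest()
    tampered = hashlib.sha256(b"toy artifact + extra code").hexdigest()
    reported = {
        "local-laptop": expected,  # hypothetical builder
        "ci-runner": expected,     # hypothetical builder
        "azure": tampered,         # the odd one out, as in the slide
    }
    consensus, outliers = compare_builds(reported)
    print("consensus digest:", consensus)
    print("builders to investigate:", outliers)
```

A majority is not proof of correctness, and on a tie the helper simply reports the digest it saw first. Any mismatch at all is the signal that something differs and needs investigating.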
+
+---
+
 # How Deep Do We Have to Go?

 * Software Application
@ -157,9 +156,9 @@ binary you downloaded matches the hash they posted online and signed.
 * Compiler

 <!--
-* We need everything to be deterministic - this is not how software is currently
-built
+* We need everything to be deterministic - this is not how software is currently built
 * And yes this is not simple to do... so let's talk about how we can achieve this
+* As a side note, a similar approach applies to interpreted languages like JavaScript, where we can hash the source code of the application and use a deterministically built environment and runtime.
 -->

 ---
@ -168,9 +167,9 @@ built

 * Allows us to make the whole tree deterministic

-* Can be easily reproduced (deterministically)
+* Can be easily reproduced 

-* Drop in replacement for the current approach
+* Drop in replacement / easy to upgrade 


 ---
@ -179,8 +178,6 @@ built

 ![right:0% left:0%](https://mermaid.ink/svg/pako:eNotjrsOgzAMRX8l8gw_kKFSga2dypgwWImBSHkpJANC_HtTiif73CP5HqCCJuCwJIwre3-kZ3Weog8uGktpYm37YJ14UfJkp3_cXbAX475lcmygSF6TV4a22-gvYxDPGK1RmE3wEzTgKDk0uv47fp6EvJIjCbyummYsNkuQ_qwqlhzG3SvgORVqIIWyrMBntFu9StSYaTBYe7ubnl_6WELh)

-<!-- TODO: add graph of going from compiler up to OS + deps and then to application -->
-
 ---

 # Who Compiles the Compiler?
@ -192,28 +189,26 @@ built
 * This means there is no clear provenance for how we went from nothing to having a usable compiler

 <!--
-* Maintainers of open source software are the people that often are the ones building
-this software, and even in large organizations like Microsoft and Apple, they are
-not using determinism to verify their software is secure
-
-* For the most part the approach to addressing this has been to
-use two different compilers to build the code, and while unlikely it is possible for both compilers to be compromised in the same manner
+* Maintainers of open source software are often the ones building it, and even large organizations like Microsoft and Apple are not using determinism

 * We can also rely on reverse engineering but it's not a reliable and practical method
+
+* So the very foundation of how we build software is not verifiable... that's a problem
 -->

 ---

 # Bootstrapping Compilers

-* Consists of "stages", and hundreds of steps of starting from a human auditable (256 byte) compiler written in hex0 and building up all the way up to a modern compiler
+* Consists of "stages" and hundreds of steps, starting from a human-auditable rudimentary compiler and building all the way up to a modern compiler

 * Bootstrapping programming languages

 <!--
-* If you bootstrap, you have a compiler you can verify and trust
+* Complicated but auditable process
+
+* We want to do this deterministically, of course, so that we have a method of tamper evidence

-* Now you may be wondering okay this is great, but if a compiler like this wasn't used to build all the other software isn't that a problem...? Yes, it is, we are for the most part unaware of this risk, or didn't have a way to practically deal with it. More on the solution of that problem a few slides from now.
 -->

 ---
@ -231,9 +226,9 @@ use two different compilers to build the code, and while unlikely it is possible

 # Status Check-In

-* So far we have:
-  * A fully deterministic compiler
-  * Used that compiler to build all our dependencies
+* So far we have established we need the following for a solution:
+  * Bootstrap a compiler in a deterministic manner
+  * Use that compiler to build all our dependencies
  * Last thing remaining: your application

 <!-- Now this seems like a lot... and it is, so we went ahead and built
@ -241,36 +236,33 @@ an open source solution that tries to address the problem -->

 ---

-# Deterministic and Minimal Linux distribution
+# [Stageˣ]
+
+Open source Linux Distribution
+
+---
+
+# Multi-Signed, Bootstrapped, Deterministic, and Minimal 

 <!-- Speaker notes
-* We tried to get the existing distributions to implement the necessary upgrades
-to gain the security properties we are after but they wouldn't, so we were
-forced to build our own Linux distribution.

-Most Linux distributions are built for *compatibility* rather than *security*.
-This results in a dramatic increase of attack surface area of an operating
-system. StageX is designed to allow the creation of application specific
-environments with a minimal footprint to eliminate attack surface area. Each
-component of the toolchain installs only what it needs, and only packages what
-it builds, resulting in a decreased attack surface.
+* Most Linux distributions are built for *compatibility* rather than *security*. This results in a dramatic increase in the attack surface area of an operating
+system.

-StageX is the first Linux multisig distribution, is one of two fully
+* StageX is the first Linux multisig distribution, is one of two fully
 bootstrapped Linux distributions, is 100% reproducible and deterministic,
 and can build complicated software with as few dependencies exposed as
 possible.
+
+* The other thing that differentiates StageX from other solutions like NixOS
+is that it is fully container native, so there is no package manager required 
+such as flake or otherwise.
+
 -->


 <hr />

-<!--
-TODO: include image describing traditional package building, by installing
-_every_ dependency in a single OS, with a comparison of stagex only having mini
-Containerfiles with just what each project needs. If done so, this graph can be
-moved to a separate slide.
-->
-
 | Distribution | Signatures | Libc  | Bootstrapped | Reproducible | Rust deps |
 |--------------|------------|-------|--------------|--------------|----------:|
 | Stagex       | 2+ Human   | Musl  | Yes          | Yes          | 4         |
@ -282,15 +274,6 @@ moved to a separate slide.
 <!-- NOTE: "Rust deps" is the amount of dependencies required to build a Rust
 hello world -->

-<!---
-- Unable to confirm the following:
-| Guix         | 1 Human    | Glibc | Yes          | Yes          | 4         |
-| Nix          | 1 Bot      | Glibc | Partial      | Mostly       | 4         |
--->
-
-<!-- Add a link to a script that confirms/reproduces the dependency count for
-building Rust hello world -->
-
 ---

 # Full source bootstrapped from Stage 0
@ -321,13 +304,13 @@ CMD ["/usr/bin/hello"]
 ```

 <!--
-* We could include other dependencies, let's say nettle, or gmp easily
 * This may look very similar to what you may do with alpine linux, but the difference is that with alpine you are trusting single points of failure since none of the alpine packages are multi reproduced and signed - this
 is why we made stagex - they also do not use bootstrapped compilers.
-->

-<!-- TODO: make pallets a thing, test this. Include RUSTFLAGS to make static in
-     the pallet -->
+* Who built Alpine's rust, and which compiler did they use?
+
+* There is no way to easily reproduce most software, so you can't verify it for yourself; you are blindly trusting that the binary is clean
+-->

 ---

@ -378,23 +361,14 @@ toolchain -->

 # Multi-Signed OCI Images

-Multiple maintainers can each sign individual images, with the container
-runtime enforcing _multiple_ signatures by maintainers to ensure no individual
-maintainer could have tampered with an image.
-
 <!-- Speaker notes
-StageX uses the Open Container Initiative standard for images to support the
-use of multiple container runtimes. Because OCI images can be signed using
-OpenPGP keys, this allows the association of built images to trusted
-maintainers, which can enable developers to build their software using StageX,
-without having to build the entire StageX toolchain for themselves.

-Creating a network of trust builds a relationship between developers and
-maintainers, allowing developers to choose maintainers that implement key
-management policies that match their standards. For example, Distrust signing
-keys are always stored on smart cards or airgapped machines, avoiding key
-exfiltration attacks and limiting key exposure to trusted computation
-environments.
+* We have multiple individuals rebuild all of the software in the StageX distribution
+
+* You can also clone the stagex repository, install Docker, and run `make` to verify for yourself that all the hashes match
+
+* You can layer on rules about how many times software must have been rebuilt, and which trusted cryptographic keys it must be signed by, to ensure you always have the desired level of reproducibility in your stack
+
 ---
 -->
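
The rule described in the notes above (a minimum number of rebuilds, signed by keys from a trusted list) can be sketched as a small policy check. This is not StageX's actual mechanism: the function name, key identifiers, and data layout are all assumptions for illustration, and the sketch assumes each signature has already been cryptographically verified elsewhere.

```python
import hashlib
from typing import Iterable, Set, Tuple


def meets_policy(
    digest: str,
    attestations: Iterable[Tuple[str, str]],
    trusted_keys: Set[str],
    threshold: int,
) -> bool:
    """True if at least `threshold` distinct trusted keys attest to `digest`.

    Each attestation is a (key_id, signed_digest) pair whose signature is
    assumed to have been verified before it reaches this function.
    """
    signers = {
        key_id
        for key_id, signed_digest in attestations
        if key_id in trusted_keys and signed_digest == digest
    }
    return len(signers) >= threshold


if __name__ == "__main__":
    image = hashlib.sha256(b"toy image").hexdigest()
    other = hashlib.sha256(b"different image").hexdigest()
    trusted = {"key-alice", "key-bob", "key-carol"}  # hypothetical key IDs
    seen = [
        ("key-alice", image),
        ("key-alice", image),    # the same signer twice counts once
        ("key-bob", image),
        ("key-mallory", image),  # not on the trusted list, so ignored
        ("key-carol", other),    # trusted, but attests to a different digest
    ]
    print("accepted with threshold 2:", meets_policy(image, seen, trusted, 2))
    print("accepted with threshold 3:", meets_policy(image, seen, trusted, 3))
```

Counting distinct signers rather than raw signatures is the design choice that matters here: it keeps one maintainer from satisfying the threshold alone.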

@ -420,21 +394,9 @@ flowchart TD

 ---

-# Common toolchain dependencies
-
-StageX comes with developer-loved tooling and languages, such as:
-
- `curl`
- `git`
- `bash`
- `openssl`
-
---
-
 # Pallets

-StageX will soon offer prebuilt containers including all the packages necessary to run
-some of our most used software, such as:
+StageX will soon offer prebuilt containers including all the packages necessary to run some of our most used software, such as:

 - `rust`
 - `go`
@ -443,34 +405,36 @@ some of our most used software, such as:
 - `redis`
 - `postgres`

+<!--
+* We already offer packages that can be used today, and are used in production by multiple companies
+
+* We are adding a usability improvement where all the dependencies are grouped into what we are calling "pallets"
+-->
+
 ---

 # Key Takeaways

-StageX...
-
-* Your software, at every point in the bootstrapped toolchain, can all be built
-deterministically.
+* Bootstrapped compiler 
+* Fully deterministic 
 * Packages the software you're already using, but in a more secure manner.
-* Is a drop in replacement, and has container support
+* Is a drop in replacement, and has native container support

 <!--

 Other distributions run their own package manager inside of containers
-We use containers as our package manager
-100% container native, no attack surface
-
-By using StageX, you have the software you already use, with the assurance it
-was built in a secure manner.
-
-Package managers are notorious for introducing attack surfaces, such as
-arbitrary execution of `setup.py` or post-download scripts, and by using Docker
+We use containers as our package manager: 100% container native, no attack surface
+    * Package managers are notorious for introducing attack surfaces, such as arbitrary execution of `setup.py` or post-download scripts. By using Docker
 as our package manager, we avoid all forms of spontaneous execution.

 All StageX software is built deterministically, meaning you can be sure all
 components listed in your Software Bill Of Materials haven't been tampered with.
 Because StageX provides a toolchain for you to build your software in the same
-manner, your software can be sooper dooper pooper scooper secure.
+manner, your own software can be built just as securely.
+
+* Available on docker hub
+
+
 -->

 ---
@ -499,31 +463,14 @@ manner, your software can be sooper dooper pooper scooper secure.

 ---

-# Other Projects
-
-This is only one part of the "Distrust Stack"
-
-* [`keyfork`](https://git.distrust.co/public/keyfork):  toolchain for generating and managing a wide range of cryptographic keys
-
-* [`bootproof`](https://git.distrust.co/public/bootproof): tpm2 remote attestation
-
-* [`reprOS`](https://codeberg.org/stagex/repros): OS designed for secure reproduction
-
-* [`sigRev`](): open standard for signed code reviews
-
-<!--
-* This is why we are called Distrust  we don't want you to have to trust anyone
-* As Benjamin Franklin once said distrust and caution are the parents of security
- -->
-
---
-
 # Links

 **Email**: anton@distrust.co / sales@distrust.co

 **Matrix Chat**: #stagex:matrix.org

+**Docker Hub**: https://hub.docker.com/u/stagex
+
 **Git Repo**: https://codeberg.org/stagex/stagex

 Big thank you to sponsors who have supported the development of this project: