more edits

Anton Livaja 2025-04-01 21:22:45 -07:00
parent 7cbb2cea11
commit b67f1599c2
Signed by: anton
GPG Key ID: 44A86CFF1FDF0E85
2 changed files with 35 additions and 29 deletions

@ -4,7 +4,7 @@ title: The Safe{Wallet}/Bybit incident report and mitigation strategies
date: 2025-04-02
---
The Safe{Wallet}/Bybit incident is an example of a nation-state actor executing a series of sophisticated, multi-layered attacks on high-value targets. In cases where the potential gain is significant, it may be justified for attackers to invest in multiple 0-day vulnerabilities and chain them into elaborate exploit sequences. These campaigns often span multiple layers of the tech stack and involve precision-targeted social engineering, insider compromise, or even physical infiltration.
The Safe{Wallet}/Bybit incident is an example of a nation-state actor executing a series of sophisticated, multi-layered attacks on high-value targets. In cases where the potential gain is significant, all attacks are on the table. It may be justified for attackers to invest in multiple 0-day vulnerabilities and chain them into elaborate exploit sequences. These campaigns often span multiple layers of the tech stack and involve precision-targeted social engineering, insider compromise, or even physical infiltration.
As such, defending against this level of adversary requires a threat model that accounts for their full range of capabilities and guides the design of equally rigorous mitigations. It demands that defenders adopt a much more rigorous set of assumptions about attackers' capabilities and invest time in implementing controls that typical organizations may not need. When protecting high-value assets, the game changes.
@ -23,14 +23,13 @@ At Distrust, we operate under the assumption that nation-state actors are persis
* Physical attacks are viable and likely
* Side-channel attacks are viable and likely
These assumptions drive the design strategies and tooling outlined in this report. The controls we've developed are built specifically to address this elevated threat model. Many of the tools are ready to use today, some are reference designs, while other tooling requires further development. If you care about these issues and want to help us push this work forward, [talk to us](https://distrust.co/contact.html).
These assumptions drive everything at Distrust, including the strategies and tooling outlined in this report. The controls we've developed are built specifically to address this elevated threat model. Many of our open source tools are ready to use today, some are reference designs, while other tooling requires further development.
### Summary
This report identifies critical single points of failure—cases where trust is placed in a single individual or computer—creating opportunities for compromise. In contrast, blockchains offer stronger security properties through cryptography and decentralized trust models.
Traditional infrastructure has historically lacked mechanisms to distribute trust, but this limitation can be addressed. By applying targeted design strategies, it's possible to distribute trust across systems and reduce the risks of a single compromised actor undermining the integrity of the entire system.
Traditional infrastructure has historically lacked mechanisms to distribute trust, but this limitation can be addressed. By applying targeted design strategies, it's possible to distribute trust (*dis*trust, get it?) across systems and reduce the risks of a single compromised actor undermining the integrity of the entire system.
---
@ -50,56 +49,55 @@ The compromise occurred due to several key factors, already documented in other
While many security teams reach for quick wins (like access token rotation, stricter IAM policies, or improved monitoring), these are often reactive measures. They may help, but they're the equivalent of "plugging holes on a sinking ship" rather than rebuilding the hull from stronger material.
For example, improving access control to the S3 bucket used to serve JavaScript resources, or adding better monitoring, are good steps. But they rely on trust placed in individuals or cloud platforms, which remain vulnerable to compromise.
For example, improving access control to the S3 bucket used to serve JavaScript resources, or adding better monitoring, are good steps. However, they don't address the root cause.
> At the core of this breach lies a recurring theme: single points of failure.
> At the core of this breach lies a recurring theme: single points of failure.
To explore this from first principles, consider the deployment pipeline. In most companies, one individual—an admin or developer—often has the ability to modify critical infrastructure or code. That person becomes a single point of failure.
Even if the pipeline is hardened, the risk shifts, not disappears. They's always one super-admin who has full access. Most cloud platforms encourage this pattern, and the industry has come to accept it.
Even if the pipeline is hardened, the risk will shift, rather than disappear. There is almost always one super-admin who has the "keys to the kingdom". Most cloud platforms encourage this pattern, and the industry has come to accept it.
But this isn't about distrusting your team—it's about designing systems where **trust is distributed**. In the blockchain space, this is already accepted practice. So the question becomes:
But this isn't about doubting your team and their intentions—it's about designing systems where **trust is distributed**. In the blockchain space, this is already accepted practice. So the question becomes:
> *Does it make sense for a single individual to hold the integrity of an entire system in their hands?*
Those who've worked with decentralized systems would say: absolutely not.
#### Mitigation principles
To adequately defend against the risks outlined in the Distrust threat model, it is critical to distinguish between **cold** and **hot** wallets. The following principles are drawn from practical experience building secure systems at BitGo, Unit410, and Turnkey, as well as from diligence work conducted across leading custodial and vaulting solutions.
* A **cold cryptographic key management system** is one where all components can be built, operated, and verified offline. If any part of the system requires trusting a networked component, it becomes **hot** system by definition. For example, if a wallet relies on internet-connected components, it should be considered hot wallet—regardless of how it's marketed. While some systems make trade-offs for user experience, these often come at the cost of real security guarantees.
* A **cold cryptographic key management system** is one where all components can be built, operated, and verified offline. If any part of the system requires trusting a networked component, it becomes a **hot** system by definition. For example, if a wallet relies on internet-connected components, it should be considered a hot wallet—regardless of how it's marketed. While some systems make trade-offs for user experience, these often come at the cost of security guarantees.
* Cold cryptographic key management systems that leverage truly random entropy sources are **not susceptible to remote attacks**, and are only exposed to localized threats such as physical access or side-channel attacks.
* A common misconception is that simply keeping a key offline makes a system cold and secure. But an attacker doesn't always need to steal the key; they just need to achieve the outcome where the key performs an operation on the desired data on their behalf.
* A common misconception is that simply keeping a key offline makes a system cold and secure. But an attacker doesn't always need to steal the key; they just need to achieve the outcome where the key performs an operation on the desired data on their behalf.
* **All software in the stack must be open source**, built deterministically (to support reproduction), and compiled using a fully bootstrapped toolchain. Otherwise, the system remains exposed to single points of failure, especially via supply chain compromise.
* **All software in the stack must be open source**, built deterministically (to support reproduction), and compiled using a fully bootstrapped toolchain. Otherwise, the system remains exposed to single points of failure, especially via supply chain compromise.
#### Mitigations and reference designs
We propose two high-level design strategies that can eliminate the types of vulnerabilities exploited in the Safe{Wallet}/Bybit attack. Both approaches offer similar levels of security assurance—but differ significantly in implementation complexity and effort.
In our view, **when billions of dollars are at stake**, it is worth investing in proven low-level mitigations, even if they are operationally harder to deploy. The accounting is simple: **invest in securing your system up front**, rather than gambling on assumptions you won't be targeted.
In our view, **when billions of dollars are at stake**, it is worth investing in proven low-level mitigations, even if they are operationally harder to deploy. The accounting is simple: **invest in securing your system up front**, rather than gambling on assumptions you won't be targeted.
State-funded actors are highly motivated, and when digital assets are involved, it's game theory at work. The cost of compromising a weak system is often far less than the potential gain.
We've seen this playbook used in previous incidents, including Axie Infinity, and we will see it again. Attackers are increasingly exploiting both human and technical single points of failure—while defenders often under-invest in securing this surface area.
We've seen this playbook used in previous incidents, a major example being Axie Infinity, and we will see it again. Attackers are increasingly exploiting supply chains and single points of failure—while defenders often under-invest in securing this surface area.
#### Strategy 1 - Run everything locally
This strategy can be implemented without major adjustments to the existing system. The goal is to move the component currently introducing risk (effectively making the wallet "hot") into an offline component, upgrading the system to a fully cold solution.
The idea centers on extracting the **signing** component from the application (which currently operates in the UI) and converting it into an offline application. A practical example of this approach would be using a tool like **Electrum**.
The idea centers on extracting the **signing** component from the application (which currently operates in the UI) and converting it into an offline application.
However, simply making a component offline does not eliminate all single points of failure. The security requires that the individual builds the application themselves from source, using a fully bootstrapped compiler and a **deterministic build process**.
However, simply making a component offline does not eliminate all single points of failure. Closing off supply chain threats stemming from compiler, dependency, or environment compromise requires that the application be reproduced on multiple diverse systems (using different chipsets and operating systems) with a fully bootstrapped compiler, in a fully hermetic, deterministic, and reproducible process.
We've developed open-source tooling for this under **[StageX](https://codeberg.org/stagex/stagex)**. To learn more about the importance of reproducible builds, check out [this video](https://antonlivaja.com/videos/2024-incyber-stagex-talk.mp4), where one of our co-founders explains how the SolarWinds incident unfolded—and how it could have been prevented.
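As a rough illustration of why reproduction on multiple systems matters, the check below accepts a build only when several independent builders produce byte-identical artifacts. This is a minimal sketch and not StageX's actual interface: the function names, the builder labels, and the `min_builders` parameter are all assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(artifacts: dict[str, bytes], min_builders: int = 2) -> bool:
    """Accept a build only if enough independent builders agree bit-for-bit.

    `artifacts` maps a hypothetical builder label (for example, a machine
    with a different chipset or operating system) to the bytes it produced.
    """
    if len(artifacts) < min_builders:
        # A single builder is a single point of failure, so refuse.
        return False
    digests = {sha256_hex(blob) for blob in artifacts.values()}
    # Any disagreement means at least one build environment differs
    # (or is compromised), so the artifact should not be trusted.
    return len(digests) == 1
```

If one builder's compiler or dependencies were tampered with, its digest would differ from the others and the artifact would be rejected instead of silently shipped.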
##### Reference design
This reference design was developed for the Safe{Wallet} team, but it can be applied to any team seeking to build an offline component with minimal single points of failure.
This reference design was developed for the Safe{Wallet} team, but it can be applied to any team seeking to distribute trust across their system.
1. **System administrators use dedicated offline laptops**
@ -113,27 +111,27 @@ This reference design was developed for the Safe{Wallet} team, but it can be app
* Signing operations are performed exclusively on the engineer's offline system
* Distrust has developed open-source tooling to support secure key provisioning: **[Trove](https://trove.distrust.co/generated-documents/all-levels/pgp-key-provisioning.html)**
* Distrust has developed open-source tooling to drastically simplify PGP key provisioning: **[Trove](https://trove.distrust.co/generated-documents/all-levels/pgp-key-provisioning.html)**
3. **Offline signing applications are deterministically compiled, verified, and signed by multiple engineers**
* Includes a full set of tools needed to secure offline key operations
* Includes a full set of tools needed for secure offline key operations
* Distrust also created **[AirgapOS](https://git.distrust.co/public/airgap)**, a custom custom Linux distribution designed specifically for offline secret management.It has been independently audited and is in production with several major digital asset organizations.
* Distrust also created **[AirgapOS](https://git.distrust.co/public/airgap)**, a custom Linux distribution designed specifically for offline secret management. It has been independently audited and is used in production by several major digital asset organizations.
4. **All sensitive operations are fully verified offline before any cryptographic action is taken**
This design drastically reduces exposure to remote attacks and central points of trust, aligning closely with Distrust's first-principles security model.
This design drastically reduces exposure to remote attacks and central points of trust, aligning closely with Distrust's first-principles security model. The community has built tools like [safe-utils](https://github.com/openzeppelin/safe-utils), but unfortunately people are being encouraged to use them online. That distributes the risk in a sense, but largely shifts it to more online services and to tools that are not built deterministically, missing the opportunity to fully eliminate a number of attack vectors.
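The last step of the reference design, verifying an operation offline before any cryptographic action, can be sketched as follows. This is an illustrative assumption rather than Safe's real hashing scheme: it uses SHA-256 over canonical JSON as a stand-in, and the field names and helper functions are hypothetical.

```python
import hashlib
import hmac
import json


def operation_digest(operation: dict) -> str:
    """Hash a human-reviewed operation in a canonical form.

    Stand-in only: a real wallet would use its own structured hashing,
    not SHA-256 over JSON.
    """
    canonical = json.dumps(operation, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def safe_to_sign(reviewed_operation: dict, digest_to_sign: str) -> bool:
    """Allow signing only if the requested digest matches what was reviewed."""
    expected = operation_digest(reviewed_operation)
    # Constant-time comparison of the locally recomputed digest against
    # the digest the signer was asked to sign.
    return hmac.compare_digest(expected, digest_to_sign.lower())
```

The point of the design is that the digest is recomputed on the offline machine from fields the operator has actually read, so a compromised online UI cannot swap in a different payload without the mismatch being detected.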
#### Strategy 2 - Use a remotely verified service
This strategy maintains a user experience nearly identical to the current system, while introducing verifiability at critical points in the architecture. It requires significantly moe engineering effort and operational discipline, and the tooling needed to support this model is still under active development.
This strategy maintains a user experience nearly identical to the current system, while introducing verifiability at critical points in the architecture. It requires significantly more engineering effort and operational discipline, and the tooling needed to support this model is still under active development.
##### Reference design
This design relies on **secure enclaves** to host servers that are immutable, deterministic, and capable of cryptographically attesting to the software they are running. While this brings us closer to a cold setup, some residual attack surface—such as browser exploits, host OS compromise, or 0-day attacks—will always remain.
The core implementation steps include:
The core implementation steps are:
1. **Rewrite the application to run entirely within a secure enclave**
@ -145,7 +143,7 @@ The core implementation steps include:
2. **Create a deterministic OS image with remote attestation (e.g., TPM2, Nitro Enclave or similar)**
* The entire stack is built using fully bootstrapped compiler in a reproducible manner
* The entire stack is built using a full-source bootstrapped compiler in a bit-for-bit reproducible manner
3. **One engineer deploys a new enclave** with the updated application code
@ -166,10 +164,10 @@ This high-level overview is meant to illustrate the kinds of problems we focus o
## About Distrust
The Distrust team has helped build and secure some of the highest-risk systems in the world. This includes vaulting infrastructure at BitGo, Unit410, and Turnkey, as well as security work with electrical frid operators, industrial control systems, and other mission-critical environments.
The Distrust team has helped build and secure some of the highest-risk systems in the world. This includes vaulting infrastructure at BitGo, Unit410, and Turnkey, as well as security work with electrical grid operators, industrial control systems, and other mission-critical systems.
We've conducted deep security due diligence across most major custodians. Through our experience with organizations that operate under constant threat—where **every class of attack is viable**—we've developed a methodology and set of open-source tools designed to defend against even the most sophisticated adversaries.
Today, we're taking the hard-earned lessons from that work and sharing them with the broader community. Our goal is to help others strengthen their security posture by making what we've learned—and the tools we've built—available to everyone.
Today, we're taking the hard-earned lessons from that work and sharing them with the broader community. Our goal is to help others strengthen their security posture by making what we've learned—and the open source tools we've built—available to everyone.
**Looking for help analyzing and mitigating security risks in your own organization? [Talk to us](https://distrust.co/contact.html)**.
**Looking to help us develop the tooling, or for help analyzing and mitigating security risks in your own organization? [Talk to us](https://distrust.co/contact.html)**.

@ -385,6 +385,10 @@ a:hover {
color: white !important;
}
.blog a:hover {
background: unset;
}
.arrow-link:hover .arrow {
transform: translateX(5px);
background: none !important;
@ -1703,25 +1707,29 @@ pre {
}
.blog h1 {
font-size: 2.2rem !important;
font-weight: 600 !important;
font-size: 2.5rem !important;
line-height: 2.2rem !important;
font-weight: 600 !important;
}
.blog h2 {
font-size: 1.8rem !important;
font-weight: 600 !important;
}
.blog h3 {
font-size: 1.6rem !important;
font-weight: 600 !important;
}
.blog h4 {
font-size: 1.4rem !important;
font-weight: 600 !important;
}
.blog h5 {
font-size: 1.2rem !important;
font-weight: 600 !important;
}
.blog hr {