extend blog post no.4 with new section

Christian Reitter 2024-01-04 22:37:57 +01:00
parent 947d22a5a3
commit 28871425fd
1 changed files with 25 additions and 5 deletions


@ -5,7 +5,7 @@ author: ["Christian Reitter"]
date: 2024-01-04 13:00:00 +0000
---
We take a deepdive into the `bip3x` library's use of pseudo random number generators (PRNG) and related problems.
We take a deep dive into the `bip3x` library's use of pseudo random number generators (PRNG) and related problems.
<div id="toc-container" markdown="1">
<h2 class="no_toc">Table of Contents</h2>
@ -57,7 +57,7 @@ While these steps certainly improve the situation, we think that defaulting to i
When compiling the `bip3x` library for non-Windows-targets, it uses a PRNG from the [PCG family](https://www.pcg-random.org/) of RNG algorithms by default. The OpenSSL random functions are available as an opt-in alternative, as outlined previously.
As far as we know, none of the currently available PCG algorithm variants are designed to be a Cryptographically Secure Pseudo Random Number Generator (CSPRNG), which should disqualify them from any usage to generate long-lived cryptographic key material such as BIP39 seeds.
The existing `bip3x` documentation can be understood to outline this:
The existing `bip3x` documentation can be understood to [outline this](https://github.com/edwardstock/bip3x/blob/57e7c5c2854c58048c249389b8d469385156181f/README.md#cross-compile-for-windows-under-mingw64):
> IMPORTANT: using c++ (mt19937, and PCGRand not works properly too) random generator is not secure. Use -Dbip3x_USE_OPENSSL_RANDOM=On while configuring to use OpenSSL random generator.
@ -72,7 +72,8 @@ Given this context, we were curious: how vulnerable are PCG-generated seeds in `
* A [public comment](https://gist.github.com/Leandros/6dc334c22db135b033b57e9ee0311553?permalink_comment_id=3726134#gistcomment-3726134) suggests that this implementation differs from the official PCG algorithms in the essential step `m_state = oldstate * 6364136223846793005ULL + m_inc;` by not setting the lowest bit in `m_inc` to one. However, given that `bip3x` calls the function in question only after the `seed()` seeding function, which permanently applies the `| 1` bitwise operation to the `m_inc` variable, we think the practical behavior is functionally identical at this step.
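To illustrate the last point, here is a minimal sketch of a PCG-XSH-RR style generator, modeled on the publicly documented minimal PCG design. It is not the `bip3x` code: the names `Pcg32Sketch`, `next`, `force_odd` and `step_variants_agree` are our own assumptions for illustration. It shows that once seeding has set the lowest bit of the increment, applying `| 1` again at every step does not change the output.

```cpp
#include <cstdint>

// Hypothetical minimal PCG-XSH-RR sketch (64-bit state, 32-bit output).
// Assumption: structure follows the public minimal PCG design, not bip3x itself.
struct Pcg32Sketch {
    std::uint64_t state = 0;
    std::uint64_t inc = 1;  // stream selector, expected to be odd after seeding

    // One generator step. With force_odd == true, the increment is OR-ed
    // with 1 at every step; with false, the stored value is used as-is.
    std::uint32_t next(bool force_odd = false) {
        const std::uint64_t oldstate = state;
        const std::uint64_t step_inc = force_odd ? (inc | 1u) : inc;
        state = oldstate * 6364136223846793005ULL + step_inc;
        const std::uint32_t xorshifted =
            static_cast<std::uint32_t>(((oldstate >> 18u) ^ oldstate) >> 27u);
        const std::uint32_t rot = static_cast<std::uint32_t>(oldstate >> 59u);
        // 32-bit right rotation by `rot`
        return (xorshifted >> rot) | (xorshifted << ((32u - rot) & 31u));
    }

    // Seeding permanently sets the lowest bit of the increment.
    void seed(std::uint64_t initstate, std::uint64_t initseq) {
        state = 0;
        inc = (initseq << 1u) | 1u;
        next();
        state += initstate;
        next();
    }
};

// Returns true if `count` outputs are identical with and without the
// per-step `| 1`, given that both generators were seeded the same way.
bool step_variants_agree(std::uint64_t initstate, std::uint64_t initseq, int count) {
    Pcg32Sketch a, b;
    a.seed(initstate, initseq);
    b.seed(initstate, initseq);
    for (int i = 0; i < count; ++i) {
        if (a.next(false) != b.next(true)) {
            return false;
        }
    }
    return true;
}
```

The two step variants can only diverge if the generator is stepped before seeding has made the increment odd. This matches the reasoning above, under the assumption that the seeding function always runs first.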
#### Behavior
At first glance, the `32-bit Output, 64-bit State: PCG-XSH-RR` variant has the huge problem of an internal state that is limited to only 64-bit, which would give an upper limit of **64-bit entropy** in terms of starting positions. Attacking this state would be 2^32-times harder than attacking Mersenne Twister MT19937-32 with its **32-bit** of seeding state, but still be in the realm of brute-forcing, at least in the near future or for attackers with significant resources. For context, brute-forcing 40-bit of BIP39 key entropy in a similar situation was possible [within 30 hours in 2020](https://medium.com/@johncantrell97/how-i-checked-over-1-trillion-mnemonics-in-30-hours-to-win-a-bitcoin-635fe051a752) for less than 425$ in costs.
At first glance, the `32-bit Output, 64-bit State: PCG-XSH-RR` variant has the huge problem of an internal state that is limited to only 64 bits, which would give an upper limit of **64 bits of entropy** in terms of starting positions. Attacking this state would be 2^32 times harder than attacking Mersenne Twister MT19937-32 with its **32 bits** of seeding state, but would still be in the realm of brute-forcing, at least in the near future or for attackers with significant resources.
For context, brute-forcing 40 bits of BIP39 key entropy in a similar situation was possible [within 30 hours in 2020](https://medium.com/@johncantrell97/how-i-checked-over-1-trillion-mnemonics-in-30-hours-to-win-a-bitcoin-635fe051a752) for less than $425 in costs, according to the original article. Scaling this to 64 bits would likely require high GPU costs or the use of more specialized hardware (FPGAs, ASICs), but it seems generally possible to do this. Similar optimizations have been done before for cryptocurrency mining, and the necessary technology keeps getting cheaper.
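As a rough back-of-the-envelope sketch, the 40-bit figures can be extrapolated linearly to 64 bits. This assumes that cost and time grow in direct proportion to the number of candidates and that the 2020 figures from the cited article (less than $425, 30 hours) are used unchanged. Both are simplifying assumptions on our side. The helper names `scale_factor` and `scaled` are hypothetical.

```cpp
#include <cstdint>

// Number of times more candidates when going from `from_bits` to `to_bits`.
// Assumes to_bits >= from_bits and a difference below 64.
constexpr std::uint64_t scale_factor(unsigned from_bits, unsigned to_bits) {
    return std::uint64_t{1} << (to_bits - from_bits);
}

// Naive linear extrapolation of a cost or duration figure.
// Assumption: effort is proportional to the number of candidates.
constexpr double scaled(double base, unsigned from_bits, unsigned to_bits) {
    return base * static_cast<double>(scale_factor(from_bits, to_bits));
}
```

Under these assumptions, `scale_factor(40, 64)` is 2^24 = 16,777,216, and `scaled(425.0, 40, 64)` gives roughly 7.1 billion dollars at 2020 rented-GPU prices. This extrapolation leaves out exactly the factors mentioned above: specialized hardware and falling costs.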
Looking closer, the used PCG variant also has a second 63-bit state-like parameter which selects a special sub-stream that extends upon the main state. `bip3x` uses and seeds this sub-stream index with random values as well. We're unclear if this effectively increases the upper theoretical bound to 127 bits of "seeding state" in the general case, but consider it a significant enough obstacle to prevent practical brute-force attacks of the type we've shown for MT19937-32.
@ -116,11 +117,30 @@ We consider this random number handling to be an implementation flaw. However, a
In the _best_ case, `std::random_device` provides a perfect 32 bits of actual hardware-backed non-guessable entropy per call. Taking into account the implementation details, this could lead to something in the order of roughly 2^125 different initial configurations of the PCG PRNG. If the PRNG state and sub-streams really are as independent as intended, which we didn't investigate, this complexity is impossible to brute-force blindly.
Unfortunately, given the loose guarantees of C++ with regard to the `unsigned int` bit size and `std::random_device` randomness, it's also possible that far fewer non-guessable bits make it into the PCG start configuration, for example if `std::random_device` uses an insecure PRNG on its own or if `unsigned int` has only the minimally required 16-bit width. Therefore, assumptions about the entropy input may not hold on all operating systems and compilers, which makes this construction susceptible to silently breaking in problematic ways.
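The platform dependence can be made visible with a small sketch that inspects what the local toolchain provides. The helper names `random_device_word_bits`, `max_seed_bits` and `reported_entropy` are hypothetical. The value returned by the standard `entropy()` member is only an estimate supplied by the implementation, so it should not be treated as proof of a secure source.

```cpp
#include <limits>
#include <random>

// Width in bits of one std::random_device result on this platform.
// The result type is `unsigned int`, which C++ only requires to be >= 16 bits.
constexpr int random_device_word_bits() {
    return std::numeric_limits<std::random_device::result_type>::digits;
}

// Upper bound on the seed bits collectable from `calls` draws,
// assuming (optimistically) that every returned bit is unpredictable.
constexpr int max_seed_bits(int calls) {
    return calls * random_device_word_bits();
}

// Entropy estimate reported by the implementation for its random source.
// Assumption: treated as a hint only, since it is implementation-defined.
double reported_entropy() {
    std::random_device rd;
    return rd.entropy();
}
```

On common desktop platforms `random_device_word_bits()` is 32. A build step or start-up check could refuse to generate keys when the width or the reported estimate falls below an expected value, although such a check would not detect a deterministic source that reports a high estimate.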
To summarize, we see the use of this algorithm as unsafe and unsuited _for key generation_ and recommend against its use in `bip3x`. It _may_ have enough internal complexity to withstand remote brute-force attacks on _most_ modern systems, but there are better alternatives that are designed for cryptography. Especially when generating 18-word (192-bit) and 24-word (256-bit) BIP39 secrets, this PRNG will significantly decrease overall security margins for no good reason.
#### PCG Seeding Failure
Shortly after the publication of the first iteration of this blog post, Itamar Carvalho pointed us to an aspect we weren't fully aware of before: on MinGW builds of `bip3x` in early 2021, the `std::random_device` C++ randomness API failed in a spectacular fashion and defaulted to deterministic values (!). As we've outlined in the previous article section, the PCG implementation used in `bip3x` offers no defense against such a PRNG seeding failure, which leads to completely predictable PRNG results and static BIP39 key generation. Most properties of a PRNG don't really matter much if it is fed with 0 bits of entropy in the form of a fixed input.
🎲 [Obligatory XKCD #221 reference](https://xkcd.com/221/) 🎲.
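The effect of a fixed seed input can be sketched in a few lines. We use `std::mt19937` purely as a stand-in generator here, since any deterministic PRNG behaves the same way in this respect. The helper names `draw_words` and `repeats_itself` are hypothetical and not part of `bip3x`.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: draw `count` 32-bit words from a PRNG that was
// seeded with a fixed value, as happens when the seed source is broken.
std::vector<std::uint32_t> draw_words(std::uint32_t seed_value, std::size_t count) {
    std::mt19937 prng(seed_value);  // stand-in PRNG, not the bip3x generator
    std::vector<std::uint32_t> words;
    words.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        words.push_back(static_cast<std::uint32_t>(prng()));
    }
    return words;
}

// Two independent "key generations" with the same fixed seed input
// produce exactly the same word sequence.
bool repeats_itself(std::uint32_t seed_value, std::size_t count) {
    return draw_words(seed_value, count) == draw_words(seed_value, count);
}
```

Every user whose seed source returns the same values ends up with the same "random" words, regardless of how good the PRNG algorithm is otherwise.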
Apparently, this behavior motivated the initial (also unsafe) use of Mersenne Twister with time-based seeding for the MinGW-based Windows builds that were introduced in version `2.1.1`, after the bad behavior of previous versions was found.
We haven't reproduced this code behavior on our own, but suspect that the long-standing public [MinGW](https://sourceforge.net/p/mingw-w64/bugs/338/) and [gcc](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85494) bugs were directly involved here. Some of this has finally been patched in newer gcc versions, but the patches, which are specific to CPU architecture and environment, do not inspire much confidence. The descriptions suggest the new behavior is only safe on _some_ systems: ([1](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85494#c9), [2](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85494#c19))
> [...]
> It's also fixed for mingw.org if the CPU supports either RDSEED or RDRAND. For mingw.org binaries running on older CPUs it will still use the mt19937 PRNG.
> [...]
> This patch adds a fallback for the case where the runtime cpuid checks for x86 hardware instructions fail, and no /dev/urandom is available. When this happens a std::linear_congruential_engine object will be used, with a seed based on hashing the engine's address and the current time. Distinct std::random_device objects will use different seeds, unless an object is created and destroyed and a new object created at the same memory location within the clock tick. This is not great, but is better than always throwing from the constructor, and better than always using std::mt19937 with the same seed (as GCC 9 and earlier do).
With APIs like this, you don't need any enemies 🙃
#### PCG Summary
To summarize, we see the use of the chosen PCG PRNG algorithm as unsafe and generally unsuited for key generation, and we recommend against its use in `bip3x`. It _may_ have enough internal complexity to withstand remote brute-force attacks on _most_ modern systems, but there are better alternatives that are designed for cryptography. Especially when generating 18-word (192-bit) and 24-word (256-bit) BIP39 secrets, this PRNG will significantly decrease overall security margins for no good reason. Additionally, the C++ randomness API used in this particular implementation has some horrible implementations and fallback modes with silent security downgrades, which makes it very difficult to rely on the PRNG output for any security purposes.
Please be aware that this is just a brief analysis we did to judge potential directions for our research. It is not a formal review, and you should not rely upon it for your own security.
## Summary & Outlook
In this post, we took a look at a software which may have contributed to some of the discovered weak wallets based on Mersenne Twister outputs. In its standard configuration, it uses another PRNG algorithm that is risky but likely not weak enough for wide-scale remote attacks.
In this post, we took a look at a piece of software which may have contributed to some of the discovered weak wallets based on Mersenne Twister outputs. In its standard configuration, it uses the `PCG-XSH-RR` PRNG algorithm, which is risky but likely not weak enough for wide-scale remote attacks, unless the basic random source is broken.
In the upcoming posts, we'll show more wallet analysis of previously vulnerable funds and discuss other affected software.