It is more typical in this repo to use `module::Error` instead of a type
alias when importing.
Use `hex::Error` directly instead of `use hex::Error as HexError`.
106acdc3ac Add fuzzing for Witness struct (Riccardo Casatta)
2fd0125bfa Introduce Witness struct mainly to improve ser/de performance while keeping most usability. (Riccardo Casatta)
Pull request description:
At the moment the Witness struct is `Vec<Vec<u8>>`, the vec inside a vec cause a lot of allocations, specifically:
- empty witness -> 1 allocation, while an empty vec doesn't allocate, the outer vec is not empty
- witness with n elements -> n+1 allocations
The proposed Witness struct contains the serialized format of the witness. This reduces the allocations to:
- empty witness -> 0 allocations
- witness with n elements -> 1 allocation for most common cases (you don't know how many bytes is long the entire witness beforehand, thus you need to estimate a good value, not too big to avoid wasting space and not too low to avoid vector reallocation, I used 128 since it covers about 80% of cases on mainnet)
The inconvenience is having slightly less comfortable access to the witness, but the iterator is efficient (no allocations) and you can always collect the iteration to have a Vec of slices. If you collect the iteration you end up doing allocation anyway, but the rationale is that it is an operation you need to do rarely while ser/de is done much more often.
I had to add a bigger block to better see the improvement (ae860247e191e2136d7c87382f78c96e0908d700), these are the results of the benches on my machine:
```
RCasatta/master_with_block
test blockdata::block::benches::bench_block_deserialize ... bench: 5,496,821 ns/iter (+/- 298,859)
test blockdata::block::benches::bench_block_serialize ... bench: 437,389 ns/iter (+/- 31,576)
test blockdata::block::benches::bench_block_serialize_logic ... bench: 108,759 ns/iter (+/- 5,807)
test blockdata::transaction::benches::bench_transaction_deserialize ... bench: 670 ns/iter (+/- 49)
test blockdata::transaction::benches::bench_transaction_get_size ... bench: 7 ns/iter (+/- 0)
test blockdata::transaction::benches::bench_transaction_serialize ... bench: 51 ns/iter (+/- 5)
test blockdata::transaction::benches::bench_transaction_serialize_logic ... bench: 13 ns/iter (+/- 0)
branch witness_with_block (this one)
test blockdata::block::benches::bench_block_deserialize ... bench: 4,302,788 ns/iter (+/- 424,806)
test blockdata::block::benches::bench_block_serialize ... bench: 366,493 ns/iter (+/- 42,216)
test blockdata::block::benches::bench_block_serialize_logic ... bench: 84,646 ns/iter (+/- 7,366)
test blockdata::transaction::benches::bench_transaction_deserialize ... bench: 648 ns/iter (+/- 77)
test blockdata::transaction::benches::bench_transaction_get_size ... bench: 7 ns/iter (+/- 0)
test blockdata::transaction::benches::bench_transaction_serialize ... bench: 50 ns/iter (+/- 5)
test blockdata::transaction::benches::bench_transaction_serialize_logic ... bench: 14 ns/iter (+/- 0)
```
With an increased performance to deserialize a block of about 21% and to serialize a block of about 16% (seems even higher than expected, need to do more tests to confirm, I'll appreciate tests results from reviewers)
ACKs for top commit:
apoelstra:
ACK 106acdc3ac
sanket1729:
ACK 106acdc3ac
dr-orlovsky:
utACK 106acdc3ac
Tree-SHA512: e4f23bdd55075c7ea788bc55846fd9e30f9cb76d5847cb259bddbf72523857715b0d4dbac505be3dfb9d4b1bcae289384ab39885b4887e188f8f1c06caf4049a
Witness struct is in place of the Vec<Vec<u8>> we have before this commit.
from_vec() and to_vec() methods are provided to switch between this type and Vec<Vec<u8>>
Moreover, implementation of Default, Iterator and others allows to have similar behaviour but
using a single Vec prevent many allocations during deserialization which in turns results in
better performance, even 20% better perfomance on recent block.
last() and second_to_last() allows to access respective element without going through costly Vec
transformation
This is the initial step towards using and maybe enforcing clippy.
It does not fix all lints as some are not applicable. They may be
explicitly ignored later.
Docs can always do with a bit of love.
Clean up the module level (`//!`) rustdocs for all public modules.
I claim uniform is better than any specific method/style. I tried to fit
in with what ever was either most sane of most prevalent, therefore
attaining uniformity without unnecessary code churn (one exception being
the changes to headings described below).
Notes:
* Headings - use heading as a regular sentence for all modules e.g.,
```
//! Bitcoin network messages.
```
as opposed to
```
//! # Bitcoin Network Messages
```
It was not clear which style to use so I picked a 'random' mature
project and copied their style.
* Added 'This module' in _most_ places as the start of the module
description, however I was not religious about this one.
* Fixed line length if necessary since most of our code seems to follow
short (80 char) line lengths for comments anyways.
* Added periods and fixed obvious (and sometimes not so obvious)
grammatically errors.
* Added a trailing `//!` to every block since this was almost universal
already. I don't really like this one but I'm guessing it is Andrew's
preferred style since its on the copyright notices as well.
Instead of using magic numbers we can define constants for the address
prefix bytes. This makes it easier for future readers of the code to see
what these values are if they don't know them and/or see that they are
correct if they do know them.
Based on the original work by Justin Moon.
*MSRV unchanged from 1.29.0.*
When `std` is off, `no-std` must be on, and we use the [`alloc`](https://doc.rust-lang.org/alloc/) and core2 crates. The `alloc` crate requires the user define a global allocator.
* Import from `core` and `alloc` instead of `std`
* `alloc` only used if `no-std` is on
* Create `std` feature
* Create `no-std` feature which adds a core2 dependency to polyfill `std::io` features. This is an experimental feature and should be
used with caution.
* CI runs tests `no-std`
* MSRV for `no-std` is 1.51 or so
This introduces some constants defined by Bitcoin Core which as a
consequence define some network rules in a new 'policy' module.
Only some were picked, which are very unlikely to change. Nonetheless a
Warning has been put in the module documentation.
Script-level constants are left into rust-miniscript where they are
already defined (src/miniscript/limits.rs).
- Move network::encodable::* to consensus::encode::*
- Rename Consensus{En,De}codable to {En,De}codable (now under
consensus::encode)
- Move network::serialize::Error to consensus::encode::Error
- Remove Raw{En,De}coder, implement {En,De}coder for T: {Write,Read}
instead
- Move network::serialize::Simple{En,De}coder to
consensus::encode::{En,De}coder
- Rename util::Error::Serialize to util::Error::Encode
- Modify comments to refer to new names
- Modify files to refer to new names
- Expose {En,De}cod{able,er}, {de,}serialize, Params
- Do not return Result for serialize{,_hex} as serializing to a Vec
should never fail
Previously this structure was unused, it's now being used by the `TxIn`
structure to simplify the code a little bit and avoid confusions. Also
the rust-lightning source code has an `OutPoint` similar to this one
but with the `vout` index as an `u16` to avoid unsafe conversions.
I've added to new methods to `OutPoint`:
- `null`: Creates a new "null" `OutPoint`.
- `is_null`: Checks if the given `OutPoint` is null.
Signed-off-by: Jean Pierre Dudey <jeandudey@hotmail.com>
Addresses #96.
Turns out it was being used for hex encoding/decoding, so replaced that with the `hex` crate.
i chose to import the `decode` method as:
```
use hex::decode as hex_decode
```
so that it is clear to the reader what is being decoded when it is called. "decode" is such a generic sounding function name that it would get confusing otherwise.
This is a rather large breaking API change, but is significantly
more sensible. In the "do not allow internal representation to
represent an invalid state" category, this ensures that witness
cannot have an length other than the number of inputs. Further,
it reduces vec propagation, which may help performance in some
cases by reducing allocs. Fianlly, this just makes more sense (tm).
Witness are a per-input field like the scriptSig, placing them
outside of the TxIn is just where they are serialized, not where
they logically belong.
Rather than having methods taking &mut self, have them consume self
and return another Builder, so that methods can be chained.
Bump major version number.
Work is stalled on some other library work (to give better lifetime
requirements on `eventual::Future` and avoid some unsafety), so
committing here.
There are only three errors left in this round :)
Also all the indenting is done, so there should be no more massive
rewrite commits. Depending how invasive the lifetime-error fixes
are, I may even be able to do sanely sized commits from here on.
Looks like to implement the crypto opcodes I may need to switch from
rust-crypto to rust-openssl.. or implement RIPEMD-160 for rust-crypto.
In either case I will need to generalize the hash.rs stuff to support
other hashes, so I'm committing here as a checkpoint before doing all
that.
I noticed that the little/big endian hex string functions for Sha256dHash
did not match my intuition. What we should have is that the raw bytes
correspond to a little-endian representation (since we convert to Uint256
by transmuting, and Uint256's have little-endian representation) while
the reversed raw bytes are big-endian.
This means that the output from `sha256sum` is "little-endian", while the
standard "zeros on the left" output from bitcoind is "big-endian". This
is correct since we think of blockhashes as being "below the target" when
they have lots of zeros on the left, and we also notice that when hashing
Bitcoin objects with sha256sum that the output hashes are always reversed.
These two functions le_hex_string and be_hex_string should really not be
used outside of the library; the Encodable trait should give access to a
"big endian" representation while ConsensusEncodable gives access to a
"little endian" representation. That way we describe the split in terms
of user-facing/consensus code rather than big/little endian code, which
is a better way of thinking about it. After all, a hash is a collection
of bytes, not a number --- it doesn't have an intrinsic endianness.
Oh, and by the way, to compute a sha256d hash from sha256sum, you do
echo -n 'data' | sha256sum | xxd -r -p | sha256dsum
This is a massive simplification, fixes a couple endianness bugs (though
not all of them I don't think), should give a speedup, gets rid of the
`serialize_iter` crap.
We were conflicting with the Rust stdlib trait Hash, which is used
by various datastructures which need a general hash. Also implement
Hash for Sha256dHash so that we can use bitcoin hashes as keys for
such data structures.