Improve README.md

Christian Reitter 2024-12-16 12:43:07 +01:00
parent c3f0c3c96d
commit f17029f23e
1 changed file with 4 additions and 1 deletion
early_research_code/python-bloom-filter-util

@@ -13,7 +13,7 @@ thirdaddress
The resulting bloom filter can be "checked against" with an address, and will respond whether that address exists in the bloom filter set or not.
-It's important to keep in mind that bloom filters are probabilistic data structures and as such result in false positives usually at a rate of ~1%, which can be adjusted for by increasing the data set size, but at typical parameters which result from an optimized bloom filter, balancing false positives and size, 1% is the usual rate we encounter.
+It's important to keep in mind that bloom filters are probabilistic data structures and as such result in false positives at a certain rate, which can be lowered by increasing the filter size. Adjust this depending on your workload. If you check millions or billions of addresses against a filter and cannot tolerate more than a few false positives, we recommend setting an appropriately small false positive factor.
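
To get a feel for what an "appropriately small" false positive factor costs, the sketch below applies the standard Bloom filter sizing formulas. It is an illustration only: the function names `bloom_parameters` and `expected_false_positives` are hypothetical, and nothing here is taken from `bloom-util.py`, whose internal parameters are not shown in this README.

```python
import math


def bloom_parameters(n_items: int, target_fp_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) for a standard Bloom filter.

    Uses the textbook formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    """
    if n_items <= 0 or not 0.0 < target_fp_rate < 1.0:
        raise ValueError("n_items must be > 0 and target_fp_rate in (0, 1)")
    bits = math.ceil(-n_items * math.log(target_fp_rate) / (math.log(2) ** 2))
    hash_count = max(1, round(bits / n_items * math.log(2)))
    return bits, hash_count


def expected_false_positives(n_queries: int, fp_rate: float) -> float:
    """Expected number of false positives when every queried address is absent."""
    return n_queries * fp_rate


if __name__ == "__main__":
    # Compare a 1% filter with a much stricter one for one million addresses.
    for rate in (1e-2, 1e-9):
        bits, hashes = bloom_parameters(1_000_000, rate)
        print(f"rate={rate:g}: {bits / 8 / 1024 / 1024:.1f} MiB, {hashes} hashes, "
              f"~{expected_false_positives(1_000_000_000, rate):g} false hits per 1e9 lookups")
```

Each tenfold reduction of the false positive rate costs roughly 4.8 additional bits per stored address, so a strict rate mostly costs memory.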
## Generate bloom filter
`python bloom-util.py create --filter_file filter.pkl --addresses_file addresses.txt`
@@ -31,6 +31,9 @@ $ Address fourthaddress is not in the filter
This is experimental, unmaintained code. Use only as research inspiration.
Specifically, we make no security guarantees.
Deserializing malicious filters may be problematic, for example.
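
One possible precaution, sketched under assumptions: if the `.pkl` extension means the filter is stored with Python's `pickle` (the README does not say so), loading a tampered file can execute arbitrary code. The hypothetical helpers below compare a file's SHA-256 digest against a value obtained from a trusted source before any loading takes place. They are not part of `bloom-util.py`.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_filter(path: str, expected_sha256: str) -> bool:
    """True only if the file matches a digest obtained from a trusted source."""
    return hmac.compare_digest(sha256_of_file(path), expected_sha256.lower())
```

A digest check only shows that the file is the one its publisher produced. It does not make a filter from an untrusted publisher safe to load.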
## License
Licensed under either of `Apache License, Version 2.0` or `MIT` license at your option.