Improve binary fuse benchmark #48
Merged
Conversation
Run with both 8 and 16-bit fingerprints and with a few different sizes. Add a `MKeys/s` metric, which is intuitive. Results on an Apple M1 laptop below. For comparison, on the same machine a cacheline-blocked bloom filter does ~130 MKeys/s.

```
name                                  time/op
BinaryFusePopulate/8/n=10000-10       199µs ± 1%
BinaryFusePopulate/8/n=100000-10      2.55ms ± 1%
BinaryFusePopulate/8/n=1000000-10     27.5ms ± 1%
BinaryFusePopulate/16/n=10000-10      231µs ± 1%
BinaryFusePopulate/16/n=100000-10     2.58ms ± 0%
BinaryFusePopulate/16/n=1000000-10    29.0ms ± 1%

name                                  MKeys/s
BinaryFusePopulate/8/n=10000-10       50.4 ± 1%
BinaryFusePopulate/8/n=100000-10      39.2 ± 1%
BinaryFusePopulate/8/n=1000000-10     36.4 ± 1%
BinaryFusePopulate/16/n=10000-10      43.3 ± 1%
BinaryFusePopulate/16/n=100000-10     38.8 ± 0%
BinaryFusePopulate/16/n=1000000-10    34.5 ± 1%

name                                  alloc/op
BinaryFusePopulate/8/n=10000-10       283kB ± 0%
BinaryFusePopulate/8/n=100000-10      2.58MB ± 0%
BinaryFusePopulate/8/n=1000000-10     24.8MB ± 0%
BinaryFusePopulate/16/n=10000-10      321kB ± 0%
BinaryFusePopulate/16/n=100000-10     2.91MB ± 0%
BinaryFusePopulate/16/n=1000000-10    28.1MB ± 0%

name                                  allocs/op
BinaryFusePopulate/8/n=10000-10       8.00 ± 0%
BinaryFusePopulate/8/n=100000-10      8.00 ± 0%
BinaryFusePopulate/8/n=1000000-10     8.00 ± 0%
BinaryFusePopulate/16/n=10000-10      8.00 ± 0%
BinaryFusePopulate/16/n=100000-10     8.00 ± 0%
BinaryFusePopulate/16/n=1000000-10    8.00 ± 0%
```
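For reference, here is a minimal sketch of how such a metric can be produced with Go's testing package. It assumes the library's `PopulateBinaryFuse8` constructor and Go 1.20+ for `b.Elapsed`; the benchmark name, key generation, and sizes are illustrative rather than the exact code in this PR, and the 16-bit variant would follow the same pattern.

```go
package xorfilter_test

import (
	"fmt"
	"testing"

	"github.com/FastFilter/xorfilter"
)

// BenchmarkBinaryFusePopulateSketch is an illustrative skeleton (not the exact
// benchmark in this PR): it runs PopulateBinaryFuse8 over a few input sizes and
// reports a MKeys/s throughput metric in addition to the usual time/op.
func BenchmarkBinaryFusePopulateSketch(b *testing.B) {
	for _, n := range []int{10_000, 100_000, 1_000_000} {
		b.Run(fmt.Sprintf("8/n=%d", n), func(b *testing.B) {
			// Distinct keys: multiplying by an odd constant is a bijection on
			// uint64, so there are no duplicates.
			keys := make([]uint64, n)
			for i := range keys {
				keys[i] = (uint64(i) + 1) * 0x9E3779B97F4A7C15
			}
			b.ResetTimer()
			for i := 0; i < b.N; i++ {
				if _, err := xorfilter.PopulateBinaryFuse8(keys); err != nil {
					b.Fatal(err)
				}
			}
			// Millions of keys ingested per wall-clock second; benchstat groups
			// the custom unit into its own "MKeys/s" table.
			b.ReportMetric(float64(n)*float64(b.N)/b.Elapsed().Seconds()/1e6, "MKeys/s")
		})
	}
}
```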
Member
Merged.

Contributor (Author)
Thank you!
RaduBerinde added a commit to RaduBerinde/pebble that referenced this pull request on Jan 10, 2026:

Binary fuse filters take non-trivial memory, about 24 bits per key (see FastFilter/xorfilter#48). We thus have to be more careful with memory usage. For small-to-medium filters, we reuse builders in a `sync.Pool`. For large filters, we limit concurrency and keep a very small pool of builders to reuse. For very large filters (that are unlikely to show up in practice currently), we further limit concurrency and don't reuse builders. Note that Pebble's compaction concurrency is typically much smaller than the number of CPUs, so the limits should not impact performance (especially since we only limit concurrency of building the filter itself, which is a small part of sstable write time).
RaduBerinde added a commit to RaduBerinde/pebble that referenced this pull request on Jan 11, 2026 (same commit message as above).

RaduBerinde added a commit to RaduBerinde/pebble that referenced this pull request on Jan 13, 2026 (same commit message as above).

RaduBerinde added a commit to RaduBerinde/pebble that referenced this pull request on Jan 13, 2026 (same commit message as above).

RaduBerinde added a commit to cockroachdb/pebble that referenced this pull request on Jan 13, 2026 (same commit message as above).
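The commit message above describes a pooling-plus-throttling policy. Below is a minimal sketch of that pattern, simplified to two tiers; the names and thresholds (`filterBuilder`, `largeThreshold`, the semaphore size) are hypothetical and not taken from Pebble's code.

```go
package filterpool

import "sync"

// filterBuilder stands in for the reusable scratch state needed to populate a
// binary fuse filter; the real Pebble type is not shown here.
type filterBuilder struct {
	scratch []uint64
}

// Small-to-medium builders are cheap enough to keep in a sync.Pool.
var builderPool = sync.Pool{
	New: func() any { return &filterBuilder{} },
}

// largeSem bounds how many large-filter builds run at once. The limit of 4 is
// an arbitrary illustration, not Pebble's actual setting.
var largeSem = make(chan struct{}, 4)

// buildFilter applies the policy: pool-backed reuse for ordinary sizes,
// throttled one-off builders for large inputs.
func buildFilter(keys []uint64) {
	const largeThreshold = 1 << 22 // illustrative cutoff, not Pebble's

	if len(keys) >= largeThreshold {
		largeSem <- struct{}{}           // acquire a concurrency slot
		defer func() { <-largeSem }()    // release it when done
		populate(&filterBuilder{}, keys) // large builders are not pooled here
		return
	}

	b := builderPool.Get().(*filterBuilder)
	defer builderPool.Put(b)
	populate(b, keys)
}

// populate is a placeholder for the actual binary fuse construction; it only
// demonstrates reusing the builder's scratch allocation across calls.
func populate(b *filterBuilder, keys []uint64) {
	if cap(b.scratch) < len(keys) {
		b.scratch = make([]uint64, len(keys))
	}
	copy(b.scratch[:len(keys)], keys)
}
```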