
Conversation

@matt-o-how (Contributor) commented Sep 25, 2025

This PR adds a costed and cached sha256tree operator which accounts for cost as if it were not caching, so that further improvements can be made in the future without breaking consensus.

Attached are the benchmarked performance graphs for the base cost of a call, the cost per pair, and the cost per 32-byte chunk.

[Plots: sha256tree-per-pair, sha256tree-per-byte, sha256tree-base]
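As background, a minimal sketch of the "costed as if it isn't caching" idea: shared subtrees are hashed once, but every occurrence is charged the cost an uncached traversal would pay, so the consensus-visible cost never depends on the cache. The `Node` type, `tree_hash`, and both constants below are illustrative, not this PR's implementation, and the `sha2` crate is assumed.

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;
use std::rc::Rc;

// Illustrative constants only; not the consensus values this PR sets.
const COST_PER_NODE: u64 = 540;
const COST_PER_BYTES32: u64 = 64; // per 32-byte chunk fed to SHA-256

enum Node {
    Atom(Vec<u8>),
    Pair(Rc<Node>, Rc<Node>),
}

// Hash a CLVM-style tree, memoizing on subtree identity (the Rc pointer,
// so structurally equal but distinct subtrees are not shared -- fine for
// a sketch). On a cache hit the stored hash is reused, but the cost an
// uncached traversal would have accrued is still charged, so caching can
// never change the consensus-visible cost.
fn tree_hash(
    node: &Rc<Node>,
    cache: &mut HashMap<*const Node, ([u8; 32], u64)>,
    total_cost: &mut u64,
) -> [u8; 32] {
    let key = Rc::as_ptr(node);
    if let Some(&(hash, uncached_cost)) = cache.get(&key) {
        *total_cost += uncached_cost; // pay as if we had recursed
        return hash;
    }
    let before = *total_cost;
    let hash: [u8; 32] = match node.as_ref() {
        Node::Atom(bytes) => {
            // 1-byte domain prefix plus the atom, charged per 32-byte chunk
            let chunks = (bytes.len() as u64 + 1).div_ceil(32);
            *total_cost += COST_PER_NODE + chunks * COST_PER_BYTES32;
            let mut h = Sha256::new();
            h.update([1u8]);
            h.update(bytes);
            h.finalize().into()
        }
        Node::Pair(left, right) => {
            *total_cost += COST_PER_NODE;
            let l = tree_hash(left, cache, total_cost);
            let r = tree_hash(right, cache, total_cost);
            // prefix + two 32-byte child hashes = 65 bytes = 3 chunks
            *total_cost += 3 * COST_PER_BYTES32;
            let mut h = Sha256::new();
            h.update([2u8]);
            h.update(l);
            h.update(r);
            h.finalize().into()
        }
    };
    // Record what this subtree cost so later cache hits can replay it.
    cache.insert(key, (hash, *total_cost - before));
    hash
}

fn main() {
    let leaf = Rc::new(Node::Atom(b"hello".to_vec()));
    // The same subtree appears twice: it is hashed once, charged twice.
    let root = Rc::new(Node::Pair(leaf.clone(), leaf));
    let (mut cache, mut cost) = (HashMap::new(), 0u64);
    let h = tree_hash(&root, &mut cache, &mut cost);
    println!("hash = {:02x?}, cost = {cost}", h);
}
```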

@coveralls-official (bot) commented Sep 25, 2025

Pull Request Test Coverage Report for Build 20377346286

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 86 of 449 (19.15%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-3.1%) to 87.111%

Changes Missing Coverage:

File                                   Covered Lines   Changed/Added Lines   %
src/chia_dialect.rs                    0               1                     0.0%
src/test_ops.rs                        0               1                     0.0%
src/treehash.rs                        81              109                   74.31%
tools/src/bin/sha256tree-benching.rs   0               333                   0.0%

Totals
Change from base Build 18973962912: -3.1%
Covered Lines: 6380
Relevant Lines: 7324

💛 - Coveralls

@matt-o-how marked this pull request as ready for review October 27, 2025 15:49
@arvidn (Contributor) left a comment

There is a tool that can be extended to establish a reasonable cost for new operators as well; we need some kind of benchmark to set the cost.

@arvidn (Contributor) left a comment

Do you feel confident that the cost benchmarks are good? Specifically the cost per byte, the cost per pair, and the cost per atom?

Contributor left a comment

My understanding is that all benchmarking for establishing the cost model for sha256tree is done by tools/src/bin/sha256tree-benching.rs. It seems it would make sense to leave this file unchanged now.

@matt-o-how (Contributor, Author) replied

I've removed all changes except the comments that were added to the flags.

// native
let start = Instant::now();
let red = run_program(a, &dialect, call, a.nil(), 11_000_000_000).unwrap();
let cost = red.0;
Contributor left a comment

This cost looks like a circular dependency: this program is meant to establish the cost, not report what it happens to be set to right now. It seems easy to misunderstand. I don't know what it tells us here, but I do see how it can be confusing. I think it would be best to remove it, but at least it should have a comment explaining what it's for.

The main risk I see is that it is accidentally used to make a bad cost look good, because the bad cost matches this reported value.

@matt-o-how (Contributor, Author) replied

I understand your concern, but I believe it's useful to keep it in the file to track how close the currently set values are to the calculated expected values. I've reformatted the output to be far more explicit about what that output means.
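For context, a cost model can be anchored purely in wall-clock measurements rather than the interpreter's reported cost, which sidesteps the circularity raised above. A minimal best-of-k timing harness along those lines (illustrative names and a stand-in workload, not the PR's code) might look like this:

```rust
use std::hint::black_box;
use std::time::Instant;

// Time `f` over `iters` calls, repeated `reps` times; return the best
// average in nanoseconds. Deriving the model from wall-clock time keeps
// the benchmark independent of whatever cost the interpreter reports.
fn bench_ns<F: FnMut()>(mut f: F, iters: u32, reps: u32) -> f64 {
    let mut best = f64::INFINITY;
    for _ in 0..reps {
        let start = Instant::now();
        for _ in 0..iters {
            f();
        }
        best = best.min(start.elapsed().as_nanos() as f64 / f64::from(iters));
    }
    best
}

fn main() {
    // Stand-in workload; the real tool would run the CLVM program here.
    let data = vec![1u8; 1024];
    let ns = bench_ns(
        || {
            black_box(data.iter().map(|&b| u64::from(b)).sum::<u64>());
        },
        1_000,
        10,
    );
    println!("{ns:.1} ns per iteration");
}
```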

// a new list entry is 2 nodes (a cons and a nil) and a 3 chunk hash operation and a 1 chunk hash operation
// this equation lets us figure out a theoretical cost just for a node
let duration = (duration - (500.0 + i as f64) * (4.0 * bytes32_native_time)) / 2.0;
let cost = (cost as f64 - (500.0 + i as f64) * (4.0 * bytes32_native_cost)) / 2.0;
Contributor left a comment

Similarly, this cost must not be used to establish the cost model or the cost constants for the tree. I think it's risky to mix it in.
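The deduction quoted above can be restated as a standalone helper for clarity (a hypothetical function, not part of the PR):

```rust
// Mirrors the deduction above: each list entry contributes 2 nodes (a cons
// and a nil) plus 4 hashed 32-byte chunks (one 3-chunk and one 1-chunk
// operation). Subtract the known chunk cost, then halve to isolate the
// per-node remainder.
fn per_node(measured: f64, entries: f64, cost_per_chunk: f64) -> f64 {
    (measured - entries * 4.0 * cost_per_chunk) / 2.0
}
```

Applied to both `duration` and `cost` with `entries = 500.0 + i as f64`, this reproduces the two lines quoted above.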

Comment on lines +235 to +238
writeln!(output_native_time, "{}\t{}", i, duration).unwrap();
writeln!(output_native_cost, "{}\t{}", i, cost).unwrap();
samples_time_native.push((i as f64, duration));
samples_cost_native.push((i as f64, cost as f64));
Contributor left a comment

Suggested change:
- writeln!(output_native_time, "{}\t{}", i, duration).unwrap();
- writeln!(output_native_cost, "{}\t{}", i, cost).unwrap();
- samples_time_native.push((i as f64, duration));
- samples_cost_native.push((i as f64, cost as f64));
+ writeln!(output_native_time, "{}\t{}", (500 + i), duration).unwrap();
+ writeln!(output_native_cost, "{}\t{}", (500 + i), cost).unwrap();
+ samples_time_native.push(((500 + i) as f64, duration));
+ samples_cost_native.push(((500 + i) as f64, cost as f64));

You add 500 items before this loop, so I think you need to offset all `i` by 500. A simpler way may be to change the loop to:

for i in 500..1500 {

And remove the offsetting.

@matt-o-how (Contributor, Author) replied

Done


// this function is for comparing the cost per 32byte chunk of hashing between the native and clvm implementation
#[allow(clippy::type_complexity)]
fn time_per_byte_for_atom(
Contributor left a comment

Suggested change:
- fn time_per_byte_for_atom(
+ fn time_per_32bytes_for_atom(

This is measuring blocks of 32 bytes, right?

@matt-o-how (Contributor, Author) replied

Done

Finally, the `COST_PER_NODE` was the trickiest to pin down, as it is unique to this operator.
The trick to costing it was to compare against the "in-language" implementation and deduct the costs of the known hash operations using the previously costed `COST_PER_BYTES32`.

The calculations for this can be seen in the file `sha256tree-benching.rs`.
Contributor left a comment

Would you mind also including the relevant plots of the measurements, to demonstrate that this model matches reality? Specifically, that CPU time grows in proportion to CLVM cost.

@matt-o-how (Contributor, Author) replied

Plots added
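With the sample vectors the tool collects (`samples_time_native`, `samples_cost_native`, and so on), the proportionality the plots are meant to show can also be checked numerically with an ordinary least-squares fit. A minimal sketch, assuming only the `(x, y)` pairs:

```rust
// Ordinary least-squares fit of y = a + b*x over (x, y) samples. A near-zero
// intercept and a slope that stays stable across machines is what "CPU time
// grows proportionally with CLVM cost" looks like numerically.
// Illustrative helper, not code from the PR.
fn linear_fit(samples: &[(f64, f64)]) -> (f64, f64) {
    let n = samples.len() as f64;
    let sx: f64 = samples.iter().map(|&(x, _)| x).sum();
    let sy: f64 = samples.iter().map(|&(_, y)| y).sum();
    let sxx: f64 = samples.iter().map(|&(x, _)| x * x).sum();
    let sxy: f64 = samples.iter().map(|&(x, y)| x * y).sum();
    let slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    let intercept = (sy - slope * sx) / n;
    (intercept, slope)
}
```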

Comment on lines +55 to +56
Native time per node (ns): 115.0891
CLVM time per node (ns): 397.1927
Contributor left a comment

The two different ways you measure time-per-node seem to disagree quite a lot.

benchmark               balanced tree   list
CLVM time per node      517.8038        397.1927
Native time per node    203.8718        115.0891

I see a similar pattern on RPi 5.

benchmark               balanced tree   list
CLVM time per node      1034.3594       855.1965
Native time per node    320.4549        181.4918

I think this suggests there's something fishy about the assumptions in those measurements.
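For reference, the two shapes behind the table's columns, sketched with hypothetical builders (not the benchmark's code): a list of n atoms carries n cons cells plus a nil terminator, while a balanced tree over n atoms has only n - 1 cons cells, so a single per-node figure averages over a different atom/pair mix in each case; that difference is one possible contributor to the gap.

```rust
use std::rc::Rc;

enum Node {
    Atom(Vec<u8>),
    Pair(Rc<Node>, Rc<Node>),
}

// A list of n atoms: n cons cells + n atoms + 1 nil terminator.
fn list(n: usize) -> Rc<Node> {
    let mut tail = Rc::new(Node::Atom(vec![])); // nil
    for _ in 0..n {
        tail = Rc::new(Node::Pair(Rc::new(Node::Atom(vec![0u8; 32])), tail));
    }
    tail
}

// A balanced tree over n atoms: n - 1 cons cells + n atoms.
fn balanced(n: usize) -> Rc<Node> {
    if n <= 1 {
        return Rc::new(Node::Atom(vec![0u8; 32]));
    }
    Rc::new(Node::Pair(balanced(n / 2), balanced(n - n / 2)))
}

// Count (atoms, pairs) to make the differing node mix explicit.
fn count(node: &Node) -> (usize, usize) {
    match node {
        Node::Atom(_) => (1, 0),
        Node::Pair(l, r) => {
            let (la, lp) = count(l);
            let (ra, rp) = count(r);
            (la + ra, lp + rp + 1)
        }
    }
}

fn main() {
    let n = 1000;
    let (a, p) = count(&list(n));
    println!("list({n}): {a} atoms, {p} pairs");
    let (a, p) = count(&balanced(n));
    println!("balanced({n}): {a} atoms, {p} pairs");
}
```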

@arvidn mentioned this pull request Dec 22, 2025