feat(bench): add csearch benchmark harness #68
Liam-Deacon wants to merge 5 commits into feat/search-de from
Conversation
Add particle swarm optimization with new optimizer entry, config limits, and registry aliases. Update csearch docs/man page for PSO usage and budgets. Add PSO regression test to validate convergence. Tests: cmake --build build --target test_search_pso; ctest --test-dir build --output-on-failure -R search.pso; python3 -m pre_commit run --all-files
Allow PSO swarm size, coefficients, and vmax to be configured via CLI/env. Propagate settings into PSO run config and log output. Update csearch help/docs and config env test. Tests: cmake --build build --target test_search_optimizer test_search_pso; ctest --test-dir build --output-on-failure -R 'search.optimizer|search.pso'; python3 -m pre_commit run --all-files
Implement DE optimizer and register it in the optimizer registry with the 'de' alias. Add DE config defaults, limits, and logging; update csearch help/docs/man page. Add DE regression test and extend optimizer lookup coverage. Tests: cmake -S . -B build -DCMAKE_BUILD_TYPE=Release; cmake --build build --target test_search_de test_search_optimizer; ctest --test-dir build --output-on-failure -R 'search.de|search.optimizer'; python3 -m pre_commit run --all-files
Add a JSON manifest format, runner script, and plot helper for csearch benchmarks. Include report template and README plus gitignore entries for output. Tests: python3 tools/benchmarks/run_benchmarks.py --dry-run --leed /usr/bin/true --rfac /usr/bin/true --optimizers si --seeds 1 --max-evals 1 --max-iters 1 --output-dir benchmarks/out; python3 -m pre_commit run --all-files
Reviewer's Guide
Adds a reproducible CSEARCH benchmarking harness, including a manifest-driven runner that stages datasets and executes sweeps, a plotting script for convergence curves, and documentation/templates for recording and sharing benchmark results while ignoring generated outputs.
Sequence diagram for the CSEARCH benchmark sweep runner
sequenceDiagram
actor User
participant RunBenchmarksCLI as run_benchmarks_cli
participant Runner as run_benchmarks_py
participant Csearch as csearch_process
participant FS as filesystem
User->>RunBenchmarksCLI: invoke with manifest, seeds, optimizers
RunBenchmarksCLI->>Runner: main() parses args and calls run_benchmarks(args)
Runner->>FS: load_manifest(manifest.json)
FS-->>Runner: datasets with resolved paths
Runner->>Runner: parse_seeds(args)
Runner->>Runner: parse optimizers list
Runner->>Runner: ensure_program for CSEARCH_LEED and CSEARCH_RFAC
Runner->>FS: create output_root directory with timestamped run_id
Runner->>FS: copy manifest.json into output_root
loop for each dataset
Runner->>FS: validate dataset input file exists
loop for each optimizer
loop for each seed
Runner->>FS: create run_dir dataset/optimizer/seed
Runner->>FS: copy input, bulk, control, extra_files into run_dir
Runner->>Runner: build csearch command line
Runner->>Runner: build environment with CSEARCH_LEED, CSEARCH_RFAC
alt dry_run enabled
Runner->>Runner: simulate exit_code, stdout, stderr
else execute csearch
Runner->>Csearch: subprocess.run(cmd, cwd=run_dir, env=env)
Csearch-->>Runner: exit_code, stdout, stderr
end
Runner->>FS: write run.stdout and run.stderr in run_dir
Runner->>FS: read project.log in run_dir via parse_log()
FS-->>Runner: rmin, reported_iters, trace
alt trace not empty
Runner->>FS: write trace.csv via write_trace_csv(trace)
else
Runner->>Runner: set trace_path to None
end
Runner->>Runner: derive evals from last trace point
Runner->>Runner: append result row to results list
end
end
end
Runner->>FS: write summary.json with results
Runner->>FS: write summary.csv with results
Runner-->>RunBenchmarksCLI: return output_root
RunBenchmarksCLI-->>User: print benchmark results location
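The steps above boil down to a fairly small amount of Python. A minimal sketch of that sweep loop follows, assuming a manifest with a top-level `datasets` list and using a placeholder csearch command line; it is not the actual `tools/benchmarks/run_benchmarks.py`, and it omits the `project.log` parsing and `trace.csv` writing shown in the diagram.

```python
# Minimal sketch of the sweep loop in the diagram above -- illustrative only,
# not the actual tools/benchmarks/run_benchmarks.py. The csearch command line
# is a placeholder, and the project.log parsing / trace.csv writing steps are
# omitted for brevity.
import csv
import json
import os
import shutil
import subprocess
import time
from pathlib import Path


def run_sweep(manifest_path, optimizers, seeds, leed, rfac, output_dir, dry_run=True):
    manifest = json.loads(Path(manifest_path).read_text())
    out_root = Path(output_dir) / time.strftime("%Y%m%d-%H%M%S")   # timestamped run_id
    out_root.mkdir(parents=True, exist_ok=True)
    shutil.copy2(manifest_path, out_root / "manifest.json")

    results = []
    for ds in manifest["datasets"]:                    # assumed top-level "datasets" key
        inp = (Path(manifest_path).parent / ds["input"]).resolve()
        for opt in optimizers:
            for seed in seeds:
                run_dir = out_root / ds["name"] / opt / f"seed-{seed}"
                run_dir.mkdir(parents=True, exist_ok=True)
                shutil.copy2(inp, run_dir / inp.name)  # bulk/control/extra_files omitted

                cmd = ["csearch", inp.name]            # placeholder; real flags differ
                env = {**os.environ,
                       "CSEARCH_LEED": str(leed),
                       "CSEARCH_RFAC": str(rfac)}

                if dry_run:
                    code, out, err = 0, "", ""
                else:
                    proc = subprocess.run(cmd, cwd=run_dir, env=env,
                                          capture_output=True, text=True)
                    code, out, err = proc.returncode, proc.stdout, proc.stderr

                (run_dir / "run.stdout").write_text(out)
                (run_dir / "run.stderr").write_text(err)
                results.append({"dataset": ds["name"], "optimizer": opt,
                                "seed": seed, "exit_code": code})

    (out_root / "summary.json").write_text(json.dumps(results, indent=2))
    if results:
        with (out_root / "summary.csv").open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(results[0]))
            writer.writeheader()
            writer.writerows(results)
    return out_root
```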
Entity relationship diagram for the benchmark manifest schema
erDiagram
Manifest {
string path
}
Dataset {
string name
string description
string input
string bulk
string control
float delta
string extra_files
}
Manifest ||--o{ Dataset : contains
%% Notes on fields (as attributes):
%% - input: required path to .inp file (resolved relative to manifest)
%% - bulk: optional path to .bul file
%% - control: optional path to .ctr file
%% - delta: optional displacement value passed as -d
%% - extra_files: optional list of additional paths copied into run_dir
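To make the schema concrete, here is a hypothetical single-dataset manifest written out via Python. The field names follow the diagram, but the top-level `datasets` key and every concrete value (dataset name, paths, delta) are illustrative assumptions, not the contents of the real `benchmarks/manifest.json`.

```python
# Hypothetical benchmarks/manifest.json, mirroring the schema above. Every
# concrete value below is invented for illustration.
import json
from pathlib import Path

manifest = {
    "datasets": [                                     # assumed top-level key
        {
            "name": "ni111_cu",
            "description": "Ni(111) + Cu example dataset",
            "input": "ni111_cu/ni111_cu.inp",         # required; resolved relative to the manifest
            "bulk": "ni111_cu/ni111_cu.bul",          # optional
            "control": "ni111_cu/ni111_cu.ctr",       # optional
            "delta": 0.1,                             # optional; passed to csearch as -d
            "extra_files": ["ni111_cu/ni111_cu.pha"]  # optional; copied into each run_dir
        }
    ]
}

Path("benchmarks/manifest.json").write_text(json.dumps(manifest, indent=2) + "\n")
```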
Flow diagram for the CSEARCH benchmarking and reporting pipeline
flowchart LR
subgraph BenchConfig["Benchmark configuration"]
M["benchmarks/manifest.json"]
RBPY["tools/benchmarks/run_benchmarks.py"]
end
subgraph BenchOutputs["Benchmark run outputs"]
OUTROOT["benchmarks/out/<run_id>/"]
SUMJSON["summary.json"]
SUMCSV["summary.csv"]
TRACECSV["trace.csv (per run)"]
LOGS["*.log, run.stdout, run.stderr"]
end
subgraph Plotting["Plotting and reporting"]
PBPY["tools/benchmarks/plot_benchmarks.py"]
PLOTS["benchmarks/plots/<dataset>_convergence.png"]
REPORTTPL["benchmarks/report_template.md"]
end
M --> RBPY
RBPY --> OUTROOT
OUTROOT --> SUMJSON
OUTROOT --> SUMCSV
OUTROOT --> TRACECSV
OUTROOT --> LOGS
SUMJSON --> PBPY
PBPY --> PLOTS
SUMCSV --> REPORTTPL
PLOTS --> REPORTTPL
classDef config fill:#e7f0ff,stroke:#4a78c2
classDef outputs fill:#e8ffe7,stroke:#3c9a3c
classDef plotting fill:#fff3cd,stroke:#b7950b
class M,RBPY config
class OUTROOT,SUMJSON,SUMCSV,TRACECSV,LOGS outputs
class PBPY,PLOTS,REPORTTPL plotting
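For the plotting leg of the pipeline, a rough sketch of a per-dataset convergence plot is shown below. It assumes `summary.json` rows carry `dataset`, `optimizer`, `seed`, and a `trace` path pointing at a per-run `trace.csv` with `eval` and `rfactor` columns; the actual `tools/benchmarks/plot_benchmarks.py` interface and column names may differ.

```python
# Sketch of convergence plotting from a benchmark run directory. Assumes
# summary.json rows reference per-run trace.csv files with "eval" and
# "rfactor" columns; the real plot_benchmarks.py schema may differ.
import csv
import json
from pathlib import Path

import matplotlib.pyplot as plt


def plot_convergence(run_root: Path, plots_dir: Path) -> None:
    results = json.loads((run_root / "summary.json").read_text())
    plots_dir.mkdir(parents=True, exist_ok=True)

    # Group runs by dataset so each dataset gets one figure.
    by_dataset = {}
    for row in results:
        by_dataset.setdefault(row["dataset"], []).append(row)

    for dataset, rows in by_dataset.items():
        fig, ax = plt.subplots()
        for row in rows:
            trace_path = row.get("trace")              # assumed path to trace.csv
            if not trace_path:
                continue
            with open(trace_path, newline="") as fh:
                trace = list(csv.DictReader(fh))
            evals = [int(p["eval"]) for p in trace]
            rfac = [float(p["rfactor"]) for p in trace]
            ax.plot(evals, rfac, label=f'{row["optimizer"]} / seed {row["seed"]}')
        ax.set_xlabel("csearch evaluations")
        ax.set_ylabel("R-factor")
        ax.set_title(f"{dataset} convergence")
        ax.legend()
        fig.savefig(plots_dir / f"{dataset}_convergence.png", dpi=150)
        plt.close(fig)
```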
File-Level Changes
Codacy's Analysis Summary: 19 new issues (threshold: ≤ 0 issues). Review Pull Request in Codacy →
Tried running the benchmark harness against the Ni111_Cu example.
Artifacts are under:
I can retry with a different example or additional inputs if you can point me to a known-good dataset for csearch.
Filed TODO issue for the example-run failure: #70
Force-pushed from d72e121 to e2e86a0
Problem
Solution
Testing
python3 tools/benchmarks/run_benchmarks.py --dry-run --leed /usr/bin/true --rfac /usr/bin/true --optimizers si --seeds 1 --max-evals 1 --max-iters 1 --output-dir benchmarks/out
python3 -m pre_commit run --all-files
Links
Follow-ups
Summary by Sourcery
Add a benchmark harness for running and analyzing csearch sweeps across datasets, optimizers, and seeds.
New Features:
Documentation:
Tests:
Chores: