[Examples] Add MachSuite Benchmarks #287

Merged
chhzh123 merged 16 commits into cornell-zhang:main from zzzDavid:machsuite
Feb 6, 2026
Conversation

@zzzDavid
Contributor

@zzzDavid zzzDavid commented Jan 24, 2025

Description

MachSuite is a set of 19 benchmarks designed to mimic low-level kernels suitable for hardware acceleration. This PR adds Allo implementations of the MachSuite benchmarks as examples.

Contributors

Francis Pham @fpham0701
Rhoda Ma @rhodama
Raymond Lin @rlin569
William Yoon @wty5
Nicole Li @nicolelii
Juhyoung Lee @Juhyoung29
Yuqiang Ge @YqGe585

Checklist

  • PR's title starts with a category (e.g. [Bugfix], [IR], [Builder], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage (It would be better to provide ~2 different test cases to test the robustness of your code)
  • Code is well-documented

@chhzh123
Member

I think we need to clean up the folder to eliminate the data files. It'd be better to include those inputs in the kernel file or use small random inputs to reduce the testing time.

@zzzDavid
Contributor Author

Yes, I need to clean up the tests. I'll let you know when this PR is ready.

@chhzh123
Member

chhzh123 commented Jun 5, 2025

BugBot run

@cursor

cursor bot commented Jun 5, 2025

🚨 BugBot failed to run

Remote branch not found for this Pull Request. It may have been merged or deleted (requestId: serverGenReqId_74d9584d-f604-4c85-a057-e061761da4fc).

@chhzh123
Member

@zzzDavid Could you find some time to clean up this PR?

Copilot AI review requested due to automatic review settings February 5, 2026 20:24
Contributor

Copilot AI left a comment

Pull request overview

This PR adds MachSuite benchmark implementations as Allo examples. MachSuite is a collection of 19 benchmarks representing low-level kernels suitable for hardware acceleration. The implementation includes various algorithms across different domains including sparse matrix operations, sorting, FFT, graph algorithms, neural networks, and cryptography.

Changes:

  • Added Allo implementations for multiple MachSuite benchmarks (spmv, mergesort, md, kmp, gemm, fft, bfs, backprop, aes)
  • Included test infrastructure and data files for verification
  • Provided both basic implementations and optimized versions for some benchmarks

Reviewed changes

Copilot reviewed 54 out of 87 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| examples/machsuite/spmv/ellpack/ellpack.py | ELLPACK sparse matrix-vector multiplication implementation |
| examples/machsuite/spmv/crs/run_test.py | Test runner for CRS sparse matrix format |
| examples/machsuite/spmv/crs/crs.py | CRS sparse matrix-vector multiplication implementation |
| examples/machsuite/mergesort/testing_backup.py | Backup testing file for merge sort with reference implementation |
| examples/machsuite/mergesort/mergesort.py | Merge sort implementation (commented out with debugging notes) |
| examples/machsuite/merge/testing.py | Test runner for merge sort |
| examples/machsuite/merge/mergesort.py | Alternative merge sort implementation |
| examples/machsuite/md/knn/md.py | Molecular dynamics k-nearest neighbor implementation |
| examples/machsuite/md/grid/md.py | Molecular dynamics grid-based implementation |
| examples/machsuite/kmp/kmp.py | Knuth-Morris-Pratt string matching algorithm |
| examples/machsuite/gemm/gemm_ncubed.py | Basic N-cubed GEMM implementation |
| examples/machsuite/gemm/gemm_blocked.py | Blocked GEMM implementation |
| examples/machsuite/fft/transpose/transpose_fft.py | FFT with transpose optimization |
| examples/machsuite/fft/strided/strided_fft.py | FFT with strided memory access |
| examples/machsuite/bfs/bfs_queue_allo.py | Breadth-first search using queue-based approach |
| examples/machsuite/bfs/bfs_bulk_allo.py | Breadth-first search using bulk synchronous approach |
| examples/machsuite/backprop/backprop.py | Neural network backpropagation implementation |
| examples/machsuite/aes/aes.py | AES encryption implementation |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 64 out of 64 changed files in this pull request and generated 65 comments.


Contributor Author

@zzzDavid zzzDavid left a comment

Thanks for the review. Addressed the valid findings:

Fixed:

  • Removed duplicate imports in transpose_fft.py (float32, int32 imported twice)
  • Removed unused cmplx_MUL_x/y functions (identical to cmplx_M_x/y and never called)
  • Removed commented-out FFT4 code block in transpose_fft.py
  • Removed commented-out class definitions in md/grid/md.py
  • Removed unused imports across 11 files: numpy, math, Struct, index, Int, int32, float32, allo.ir.types as T

Not applicable (false positives):

  • "Call to a non-callable of builtin-class module" (~40 comments): allo.customize() is the core API of the Allo framework — these are all valid calls, not errors.
  • N_TOKENS in viterbi.py: standard pattern documenting the emission matrix shape, consistent with N_OBS and N_STATES which are used.

@zzzDavid zzzDavid self-assigned this Feb 6, 2026
@zzzDavid zzzDavid requested a review from chhzh123 February 6, 2026 01:54
@zzzDavid
Contributor Author

zzzDavid commented Feb 6, 2026

Hi @chhzh123 this PR is ready for human review :)

Member

@chhzh123 chhzh123 left a comment

I suggest organizing the files better and making them consistent. It'd be better to separate the Allo implementation and the NumPy implementation into two different files, and create a run_test.py for each application.

@zzzDavid
Copy link
Contributor Author

zzzDavid commented Feb 6, 2026

@chhzh123 Addressed your review feedback:

  1. File organization: Each benchmark now has a dedicated run_test.py or test_*.py with a consistent pattern — the Allo implementation is in its own file (e.g., aes.py, backprop.py) and the test/NumPy reference is in a separate test file. A top-level test_machsuite.py orchestrates all benchmarks via pytest.

  2. F64Type and SinOp: Split out into a separate PR: [Builder] Add F64Type support and SinOp for math operations #548.

  3. Copilot review threads: All resolved — unused imports removed, commented-out code cleaned up, and the ~20 allo.customize() false positives dismissed.

  4. Remaining unused numpy imports in 4 kernel files (ellpack.py, radix_sort.py, gemm_ncubed.py, nw.py): will push a fix shortly.

zzzDavid and others added 11 commits February 6, 2026 14:30
- Fix allo.sin()/allo.cos(): add missing SinOp to math op dispatch dict
  and add F64Type support to the type guard in builder.py (MLIR math
  dialect supports all float types natively)
- Add new implementations: Needleman-Wunsch (nw/) and Radix Sort (sort/radix/)
- Fix all benchmarks: resolve loop iterator pre-declarations, scalar-from-array
  workarounds, caller variable shadowing, module-level customize interference,
  AoS data parsing, and hardcoded user paths
- Fix all test scripts to use os.path.dirname(__file__) for portable data paths
- Add stencil data files for stencil2d/stencil3d

All 19 MachSuite benchmark variants build and pass with LLVM backend:
AES, Backprop, BFS/Bulk, BFS/Queue, FFT/Strided, FFT/Transpose,
GEMM/Ncubed, GEMM/Blocked, KMP, MD/Grid, MD/KNN, MergeSort,
NW, RadixSort, SPMV/CRS, SPMV/ELLPACK, Stencil2D, Stencil3D, Viterbi

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
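
The portable-path fix mentioned in the commit above can be sketched as follows. This is an illustration of the `os.path.dirname(__file__)` idiom only; the helper name `data_path` and the file name `input.data` are placeholders, not names taken from the PR.

```python
import os


def data_path(module_file, *parts):
    """Return an absolute path to a file that sits next to module_file."""
    here = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(here, *parts)


# Inside a test script this would be called as, for example:
#     path = data_path(__file__, "input.data")   # "input.data" is a placeholder name
```

Resolving against the script's own location means the test finds its data regardless of the working directory or the user's home path.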
Replace 36 .data files (~130K lines) with programmatic input generation
and Python/NumPy reference validation in each test. All 19 benchmarks
generate their own inputs (seeded random or inline constants) and
validate against a Python reference implementation instead of golden
data files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
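
As an illustration of the approach in the commit above (seeded inputs checked against a Python reference instead of golden data files), here is a minimal stdlib-only sketch. The GEMM example, the function names, and the value range are assumptions; `kernel_under_test` is a stand-in for the compiled Allo kernel, which is not invoked here.

```python
import random


def make_inputs(n, seed=0):
    """Generate two n x n integer matrices from a fixed seed (reproducible)."""
    rng = random.Random(seed)
    a = [[rng.randint(-8, 8) for _ in range(n)] for _ in range(n)]
    b = [[rng.randint(-8, 8) for _ in range(n)] for _ in range(n)]
    return a, b


def gemm_reference(a, b):
    """Plain-Python reference: triple-loop matrix multiply in (i, j, k) order."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def kernel_under_test(a, b):
    """Stand-in for the compiled kernel; same math in (i, k, j) loop order."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def check(n=4, seed=0):
    """Return True when the kernel output matches the reference exactly."""
    a, b = make_inputs(n, seed)
    return kernel_under_test(a, b) == gemm_reference(a, b)
```

Integer inputs allow exact comparison; a floating-point kernel would need a tolerance-based check instead.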
Add psize.json with "full" and "small" size tiers for all 19 benchmarks,
following the polybench pattern. Guard module-level allo.customize()/build()
calls so kernel files can be imported without side effects. Add test_*()
functions to each benchmark that accept a size parameter, and create
test_machsuite.py as the pytest entry point using small sizes for CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
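
The size-tier and import-guard pattern from the commit above might look roughly like this. The JSON content, its keys, and the helper `get_size` are hypothetical; the real psize.json and test functions may be organized differently, and the placeholder check stands in for building and running a kernel.

```python
import json

# Hypothetical psize.json content; the real file's keys and values may differ.
PSIZE_JSON = """
{
  "gemm": {"full": {"N": 64}, "small": {"N": 4}},
  "kmp":  {"full": {"STRING_SIZE": 4096}, "small": {"STRING_SIZE": 64}}
}
"""


def get_size(benchmark, tier="small"):
    """Look up the parameter dict for one benchmark and size tier."""
    sizes = json.loads(PSIZE_JSON)
    return sizes[benchmark][tier]


def test_gemm(tier="small"):
    """Size-parameterized test; the kernel build would go here, not at import time."""
    n = get_size("gemm", tier)["N"]
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    # Placeholder check standing in for "build kernel, run, compare to reference".
    assert sum(identity[i][i] for i in range(n)) == n
    return n


if __name__ == "__main__":
    # Runs only when executed as a script, so importing this file has no side effects.
    test_gemm("full")
```

Keeping the build inside the test function (or behind the `__main__` guard) lets a CI entry point import the file and choose the small tier without triggering a full-size run.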
…te mergesort dir

Remove 14 files (680 lines) not used by the test suite: debug scripts
(debug_aes.py, reproduce.py, no-code.py, neg_loop_step.py), file I/O helpers
(support.py, read.py, write.py), HLS synthesis variants (*_opt.py), the
duplicate mergesort/ directory, and setup-py312.sh. Move top-level imports
of removed modules into __main__ guards in generate.py and viterbi.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Run the 19 MachSuite benchmark tests (small sizes) alongside the
existing polybench benchmarks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When pytest discovers examples/machsuite/ as a directory, it was
collecting tests from both test_machsuite.py and individual subdirectory
files (34 items), causing module name collisions (md/grid/md.py vs
md/knn/md.py) and fatal aborts. The conftest.py ignores all benchmark
subdirectories so only test_machsuite.py is collected (19 items).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
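
One way a conftest.py could exclude the benchmark subdirectories is through pytest's `collect_ignore` list. The commit message does not say which mechanism the PR's conftest.py uses, so treat this as an assumption; the helper name `benchmark_subdirs` is also made up for the sketch.

```python
import os


def benchmark_subdirs(root):
    """Return the names of all immediate subdirectories of root, sorted."""
    return sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )


# In a conftest.py placed next to test_machsuite.py, the list would be assigned to
# pytest's `collect_ignore` variable (entries are relative to the conftest directory):
#     collect_ignore = benchmark_subdirs(os.path.dirname(os.path.abspath(__file__)))
```

With the subdirectories ignored, files such as md/grid/md.py and md/knn/md.py are never collected directly, which avoids the module name collision described above.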
…ed-out code

Address Copilot review findings:
- Remove duplicate float32/int32 imports and unused cmplx_MUL_x/y in transpose_fft.py
- Remove unused imports (numpy, math, Struct, index, Int, int32, float32, T) across 11 files
- Remove commented-out FFT4 block and class definitions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove unused `import numpy as np` from ellpack.py, radix_sort.py,
gemm_ncubed.py, and nw.py kernel files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reorganize the BFS benchmark into two subfolders matching the other
multi-variant benchmarks (md/grid, md/knn, fft/strided, fft/transpose).
Each subfolder contains the Allo kernel, Python reference, and run_test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zzzDavid and others added 4 commits February 6, 2026 14:39
- Rename all test files to `run_test.py` for consistency across benchmarks
- Flatten `sort/radix/` to `radix_sort/` since there's only one sorting algorithm
- Update test_machsuite.py to reflect all path changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Combine run_test_blocked.py into run_test.py so each benchmark has
a single test file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inline the Python reference implementation into run_test.py and remove
the separate viterbi.py file. Clean up unused imports in viterbi_allo.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename viterbi_allo.py to viterbi.py
- Merge BFS python references into run_test.py, remove bfs_bulk_python.py
  and bfs_queue_python.py
- Clean up kernel files: remove __main__ blocks and unused imports

Each benchmark now consistently has <name>.py (Allo kernel) and
run_test.py (test + reference implementation).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@zzzDavid
Contributor Author

zzzDavid commented Feb 6, 2026

Every benchmark now follows the same pattern: <name>.py for the Allo kernel, and run_test.py for the test + reference.

Member

@chhzh123 chhzh123 left a comment

LGTM, thanks!

@chhzh123 chhzh123 merged commit 48f61da into cornell-zhang:main Feb 6, 2026
2 checks passed
@zzzDavid zzzDavid deleted the machsuite branch February 6, 2026 21:32