PAE-000: Add Excel parser benchmark tooling #574
Draft
Ticket: PAE-000
Summary
Adds diagnostic benchmark tooling for measuring Excel parser performance:
- npm run benchmark:parse <file> - Measures Excel file loading and parsing time, reports metadata fields, table counts, and row counts per table
- npm run benchmark:validate <file> - Measures both parse and validation time through the full data syntax validation pipeline, reports validation outcomes (INCLUDED/EXCLUDED/REJECTED) per table and any validation issues
- npm run benchmark:generate <source> <rows> [output] - Generates test files with a target row count by duplicating existing data into empty placeholder rows, preserving all Excel formatting, styles, and validation
Findings
Parse time is essentially O(1) in row count: it is dominated by Excel file loading/decompression rather than row processing.
Validation scales as O(n) at ~0.07 ms per row, but even at 10k rows it accounts for only 20% of total time.
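To make the two figures above easier to reason about, here is a rough cost model built only from the numbers quoted in the findings. The function names are hypothetical, and the fixed parse/load cost is inferred from the 20% share rather than measured, so treat it as a back-of-the-envelope sketch.

```javascript
// Rough cost model built only from the figures quoted in the findings.
// PER_ROW_MS comes from the text; the fixed cost is inferred, not measured.
const PER_ROW_MS = 0.07;

// Validation time grows linearly with row count.
function validationMs(rows) {
  return PER_ROW_MS * rows;
}

// If validation is `share` of total time at `rows` rows, the remainder
// (file loading, decompression, parsing) is the implied fixed cost.
function impliedFixedMs(rows, share) {
  const total = validationMs(rows) / share;
  return total - validationMs(rows);
}

// Fraction of total time spent validating for a given row count.
function validationShare(rows, fixedMs) {
  const v = validationMs(rows);
  return v / (v + fixedMs);
}

const fixedMs = impliedFixedMs(10_000, 0.2);
console.log(`validation at 10k rows: ${validationMs(10_000).toFixed(0)} ms`);
console.log(`implied fixed cost: ${fixedMs.toFixed(0)} ms`);
console.log(`validation share at 1k rows: ${(validationShare(1_000, fixedMs) * 100).toFixed(1)}%`);
```

If both quoted figures hold, validation at 10k rows costs about 700 ms and the implied fixed cost is about 2.8 s, so validation makes up an even smaller share for smaller files.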
Files Changed
- benchmarks/parse-file.js - Updated usage message (script renamed)
- benchmarks/validate-file.js - New validation benchmark script
- benchmarks/generate-test-file.js - New test file generator
- benchmarks/README.md - Documentation for all three tools
- package.json - Added npm scripts
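For orientation, the per-phase timing that the parse and validate benchmarks report could be structured roughly as below. This is a minimal sketch, not the code in this PR: `parseWorkbook` and `validateTables` are hypothetical stand-ins for the real parser and validation pipeline, and only `performance.now()` from Node's `perf_hooks` module is a real API.

```javascript
const { performance } = require('node:perf_hooks');

// Time a sync or async step; return its result with elapsed milliseconds.
async function timed(fn) {
  const start = performance.now();
  const result = await fn();
  return { result, ms: performance.now() - start };
}

// Hypothetical stand-ins for the real parser and validator, which are not
// shown in this PR description.
async function parseWorkbook(path) {
  return { tables: [{ name: 'Example', rows: [[1, 'a'], [2, 'b']] }] };
}

async function validateTables(parsed) {
  return parsed.tables.map((t) => ({ table: t.name, outcome: 'INCLUDED', issues: [] }));
}

// Measure parse and validation separately so each phase's share is visible.
async function benchmark(path) {
  const parse = await timed(() => parseWorkbook(path));
  const validate = await timed(() => validateTables(parse.result));
  const total = parse.ms + validate.ms;
  return {
    parseMs: parse.ms,
    validateMs: validate.ms,
    validateShare: total > 0 ? validate.ms / total : 0,
    outcomes: validate.result,
  };
}

benchmark('example.xlsx').then((report) => console.log(report));
```

Timing the two phases separately is what lets the findings attribute most of the total to file loading rather than validation.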