Add SSOT verification for PDF sources #56
Draft
Implements automated verification that literature/pdfs/ALL_FLAT/ remains the canonical PDF source and stays in sync with corpus_manifest.json.

Changes
- verify_pdf_ssot.py: Verification script with 5 checks (completeness in both directions, hash integrity, uniqueness, validity)
- test_verify_pdf_ssot.py: Test suite covering all verification scenarios
- Makefile: Added verify-ssot and test-ssot targets

Findings
Verification identified 1 corrupted PDF (fpls-15-1268101.pdf), which contains an XML error from a failed download rather than valid PDF content. All other 113 PDFs validated successfully.

Usage
The verification is designed to be integrated into CI/CD or pre-commit hooks to prevent SSOT drift.
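
For readers who want a feel for the five checks described above, here is a minimal sketch. It is not the code in verify_pdf_ssot.py. The function names (sha256_of, verify_ssot), the result keys, and above all the manifest layout (a JSON list of objects with "filename" and "sha256" fields) are assumptions made for illustration; the real corpus_manifest.json may be structured differently. The validity check only looks for the "%PDF-" header at the start of the file, which is enough to catch an XML error page saved under a .pdf name but is not a full PDF parse.

```python
import hashlib
import json
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_ssot(pdf_dir: Path, manifest_path: Path) -> dict[str, list[str]]:
    """Run five checks; return a mapping of check name -> offending filenames."""
    # ASSUMPTION: the manifest is a JSON list of
    # {"filename": "...", "sha256": "..."} objects (hypothetical layout).
    entries = json.loads(manifest_path.read_text())
    expected = {entry["filename"]: entry["sha256"] for entry in entries}
    on_disk = {p.name: p for p in pdf_dir.glob("*.pdf")}

    problems: dict[str, list[str]] = {
        # Checks 1 and 2: completeness in both directions.
        "missing_from_disk": sorted(set(expected) - set(on_disk)),
        "missing_from_manifest": sorted(set(on_disk) - set(expected)),
        "hash_mismatch": [],
        "duplicate_content": [],
        "invalid_pdf": [],
    }

    # Check 3: hash integrity for files that the manifest knows about.
    digests = {name: sha256_of(path) for name, path in on_disk.items()}
    for name in sorted(digests):
        if name in expected and expected[name] != digests[name]:
            problems["hash_mismatch"].append(name)

    # Check 4: uniqueness, flagging files whose bytes duplicate another file.
    counts = Counter(digests.values())
    problems["duplicate_content"] = sorted(
        name for name, digest in digests.items() if counts[digest] > 1
    )

    # Check 5: validity, using the "%PDF-" header as a cheap signal.
    for name in sorted(on_disk):
        with on_disk[name].open("rb") as fh:
            if fh.read(5) != b"%PDF-":
                problems["invalid_pdf"].append(name)

    return problems
```

Any non-empty list in the result indicates drift. A CI job or pre-commit hook could call such a function and exit with a non-zero status when any list is non-empty.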