Open
Labels: enhancement (New feature or request)
Description
Below are the tasks for a first stab at creating a generalized QA script for the PnP book processing and ingestion pipeline.
QA Script Pipeline Tasks
Master Script
- One Python script should be set up with arguments to run the various subprocesses of this QA process and to output metadata on each QA run (time of the run, list of Slurm logs for each sbatch call, etc.)
- An optional config file (e.g. a YAML file) that specifies directories and, optionally, the sequence of QA subprocesses to run (e.g. clear old results, run new QA, gather results, analyze)
See: Create Master QA script with Optional Config File #3
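A minimal sketch of how the master script's command line and run metadata could be laid out. The step names (`clear`, `run`, `gather`, `analyze`), the `--config` and `--steps` flags, and the `book_dirs` / `steps` config keys are all hypothetical, and reading the config assumes PyYAML is installed. Nothing here reflects an existing implementation.

```python
import argparse
import datetime
import json

# Hypothetical step names; the real subprocess names are not fixed yet.
STEPS = ["clear", "run", "gather", "analyze"]


def build_parser():
    parser = argparse.ArgumentParser(description="PnP QA master script (sketch)")
    parser.add_argument("book_dirs", nargs="*", help="book directories to process")
    parser.add_argument("--config", help="optional YAML config file")
    parser.add_argument("--steps", nargs="+", choices=STEPS,
                        help="QA subprocesses to run, in order")
    return parser


def load_config(path):
    # PyYAML is an assumed third-party dependency, imported only when needed.
    import yaml
    with open(path) as fh:
        return yaml.safe_load(fh) or {}


def resolve_settings(args, config):
    """Command-line values win over config values, which win over defaults."""
    return {
        "book_dirs": args.book_dirs or config.get("book_dirs", []),
        "steps": args.steps or config.get("steps", STEPS),
    }


def new_run_metadata(settings):
    """Record for one QA run; Slurm log paths get appended as jobs are submitted."""
    return {
        "time_run": datetime.datetime.now().isoformat(timespec="seconds"),
        "book_dirs": list(settings["book_dirs"]),
        "steps": list(settings["steps"]),
        "slurm_logs": [],
    }


def main(argv=None):
    args = build_parser().parse_args(argv)
    config = load_config(args.config) if args.config else {}
    settings = resolve_settings(args, config)
    metadata = new_run_metadata(settings)
    # Each step would be dispatched here; this sketch only reports the plan.
    print(json.dumps(metadata, indent=2))
    return metadata
```

Calling `main(["bookA", "--steps", "clear", "gather"])` would plan only those two steps for `bookA`. Letting the command line override the config file keeps one-off runs simple while the config holds the defaults.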
Subprocesses
- Run test autocrop script with arguments across one or more given book directories
- The ability to clear old results folders from a list of book directories, including an 'Are you sure?' prompt for safety (see: Add the ability to clear out old QA run results from output folder #4)
- Gather output metadata into one file from all book directory 'results' from a QA run (see: Gather output metadata into one file from all book directory 'results' from a QA run #2)
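The clear and gather steps could be sketched roughly as follows. This assumes each book directory holds a folder named `results` containing JSON metadata files; the folder name, the JSON format, and both function names are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path

RESULTS_DIRNAME = "results"  # assumed name of the per-book output folder


def clear_results(book_dirs, assume_yes=False, ask=input):
    """Delete each book's results folder after an 'Are you sure?' prompt."""
    targets = [Path(d) / RESULTS_DIRNAME for d in book_dirs]
    targets = [t for t in targets if t.is_dir()]
    if not targets:
        return []
    if not assume_yes:
        listing = "\n".join(f"  {t}" for t in targets)
        answer = ask(f"About to delete:\n{listing}\nAre you sure? [y/N] ")
        if answer.strip().lower() not in ("y", "yes"):
            return []  # anything other than an explicit yes aborts
    for target in targets:
        shutil.rmtree(target)
    return targets


def gather_results(book_dirs, out_path):
    """Merge every *.json file under each results folder into one JSON file."""
    combined = {}
    for book_dir in book_dirs:
        results_dir = Path(book_dir) / RESULTS_DIRNAME
        records = []
        if results_dir.is_dir():
            for json_file in sorted(results_dir.glob("*.json")):
                with open(json_file) as fh:
                    records.append({"file": json_file.name, "data": json.load(fh)})
        combined[str(book_dir)] = records
    with open(out_path, "w") as fh:
        json.dump(combined, fh, indent=2)
    return combined
```

Passing the prompt function in as `ask` keeps the confirmation testable, and defaulting to "no" means an accidental Enter deletes nothing.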
Error Processing
- Distinguishing between Slurm output logs of errored and successful runs
- Bucketing errors once errored runs are identified: which kinds of error occur and how many of each
- Analyzing errors (bridges and autocrop) and implementing fixes for them
See: Create a more generalized QA run Error/Success Analysis Script #6
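One way to separate errored from successful logs and bucket the errors is sketched below. It assumes a failed run leaves either a Python exception line (such as `ValueError: ...`) or a `slurmstepd: error:` line in its log; the actual markers in the bridges and autocrop logs may differ, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumed markers of a failed run; adjust once real logs have been inspected.
ERROR_PATTERN = re.compile(r"^(?P<kind>[A-Za-z_.]*(?:Error|Exception)):", re.MULTILINE)
SLURM_PATTERN = re.compile(r"^slurmstepd: error:", re.MULTILINE)


def classify_log(text):
    """Return the error kinds found in one log; an empty list means success."""
    kinds = [m.group("kind") for m in ERROR_PATTERN.finditer(text)]
    if SLURM_PATTERN.search(text):
        kinds.append("slurmstepd")
    return kinds


def bucket_logs(logs):
    """logs maps a log name to its text; returns (successes, errored, counts)."""
    successes, errored, counts = [], {}, Counter()
    for name, text in logs.items():
        kinds = classify_log(text)
        if kinds:
            errored[name] = kinds
            counts.update(set(kinds))  # count each kind once per log
        else:
            successes.append(name)
    return successes, errored, counts
```

Counting each kind once per log answers "how many runs hit this error" instead of "how many times was it printed", which is usually the more useful number when deciding which fix to work on first.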
Metadata Additions/Fixes
- Image dimensions and area
- Area difference from crop to original
- Frobenius/L2 norm from crop to original
- File count for originals and for each crop run
- Percent area of crop to original (see: Add Percent Area to Crop QA Metadata #7)
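The metadata items above could be computed along these lines. This is a sketch under several assumptions: sizes are `(width, height)` in pixels, images are grayscale grids given as lists of rows, the image extensions listed are guesses, and the Frobenius norm is taken between the original and the crop zero-padded back to the original's size at a known offset. That last choice is only one possible reading of "norm from crop to original". In practice `numpy.linalg.norm(a - b)` gives the same Frobenius norm for two equally sized 2-D arrays.

```python
import math
from pathlib import Path


def crop_metrics(orig_size, crop_size):
    """Dimension and area metadata for one original/crop pair."""
    orig_area = orig_size[0] * orig_size[1]
    crop_area = crop_size[0] * crop_size[1]
    return {
        "orig_size": tuple(orig_size),
        "crop_size": tuple(crop_size),
        "orig_area": orig_area,
        "crop_area": crop_area,
        "area_difference": orig_area - crop_area,
        "percent_area": 100.0 * crop_area / orig_area,
    }


def frobenius_to_original(original, crop, top, left):
    """Frobenius norm of original minus the crop zero-padded at (top, left)."""
    crop_h = len(crop)
    crop_w = len(crop[0]) if crop else 0
    total = 0.0
    for r, row in enumerate(original):
        for c, value in enumerate(row):
            inside = top <= r < top + crop_h and left <= c < left + crop_w
            padded = crop[r - top][c - left] if inside else 0
            total += (value - padded) ** 2
    return math.sqrt(total)


def count_images(directory, suffixes=(".tif", ".tiff", ".jpg", ".png")):
    """File count for one folder of originals or one crop run (assumed extensions)."""
    return sum(1 for p in Path(directory).iterdir() if p.suffix.lower() in suffixes)
```

With this definition an unmodified crop contributes nothing inside its own window, so the norm reflects only the pixel content that was cropped away.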
Asserts
- TBD: acceptable percentages for metadata differences between crop and original images
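Once thresholds are agreed, a check could take a shape like the sketch below. The bounds shown are placeholders only, since the acceptable percentages are still TBD, and the function name is hypothetical.

```python
# Placeholder bounds only: the acceptable percentages are still TBD.
MIN_PERCENT_AREA = 50.0
MAX_PERCENT_AREA = 100.0


def check_percent_area(percent_area, low=MIN_PERCENT_AREA, high=MAX_PERCENT_AREA):
    """Return a list of problems; an empty list means the crop passes."""
    problems = []
    if percent_area < low:
        problems.append(
            f"crop keeps {percent_area:.1f}% of the original, below {low:.1f}%"
        )
    if percent_area > high:
        problems.append(
            f"crop keeps {percent_area:.1f}% of the original, above {high:.1f}%"
        )
    return problems
```

Returning a list of problems instead of raising on the first failure lets a QA run report every out-of-range image in one pass.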