Quality Assurance testing for the Print & Probability book processing

The QA workflow operates several modules that represent distinct parts of the Print & Probability (P&P) book processing pipeline (e.g. autocrop, line extraction). Each QA module runs its part of the pipeline and then a quality assurance process over the results. Modules are called one at a time. Each takes a yaml config file and/or command line arguments, along with the usual input for that pipeline step (one or more folders of scanned book pages). Each produces the usual output for that part of the P&P pipeline, plus several metadata files that describe the results of the process. Modules share some common subprocesses (clear, archive, run, output_stats, collate); which of them are called, and in what order, is specified in the yaml config file. These subprocesses allow for customization and quick (re)runs of the QA workflow and are described in the Modules and Subprocesses section below.
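As a rough illustration of how a config could select and order the common subprocesses, here is a minimal Python sketch. It is not taken from the repository: the "subprocesses" key and the ordered_subprocesses helper are hypothetical names, and the config is shown as a plain dict of the kind a yaml loader might return. Only the five subprocess names come from the text above.

```python
# Subprocess names as listed in this README; everything else is assumed.
KNOWN_SUBPROCESSES = ("clear", "archive", "run", "output_stats", "collate")


def ordered_subprocesses(config):
    """Return the subprocess names requested by a config, in the given order.

    `config` is a dict as it might look after loading a yaml file.
    The "subprocesses" key is a hypothetical name used only in this sketch.
    """
    requested = config.get("subprocesses", [])
    unknown = [name for name in requested if name not in KNOWN_SUBPROCESSES]
    if unknown:
        raise ValueError(f"unknown subprocess(es): {', '.join(unknown)}")
    return list(requested)


# Example: a quick rerun that requests only two of the five subprocesses.
example_config = {"subprocesses": ["output_stats", "collate"]}
print(ordered_subprocesses(example_config))
```

Validating the names before anything runs means a typo in the config fails early rather than partway through a job. The real config layout is the one described in the config section.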

Main operation

All QA workflow runs begin by calling the run_qa.sh bash script at the command line with sbatch:

sbatch run_qa.sh <module_name> <yaml_config_filepath>

Current valid module names are:

autocrop
line_extraction
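For scripted runs, the call above can be assembled and checked before submission. The following is a minimal sketch under stated assumptions: build_qa_command is a hypothetical helper, not part of the repository, and the config path in the example is only a placeholder. It builds the argument list and does not submit anything.

```python
import shlex

# Module names listed as currently valid in this README.
VALID_MODULES = ("autocrop", "line_extraction")


def build_qa_command(module_name, config_path):
    """Build the sbatch argument list for one QA module run.

    Hypothetical helper for illustration; it rejects module names
    that are not in VALID_MODULES before a job is submitted.
    """
    if module_name not in VALID_MODULES:
        raise ValueError(
            f"invalid module {module_name!r}; expected one of {VALID_MODULES}"
        )
    return ["sbatch", "run_qa.sh", module_name, config_path]


# Placeholder config path; substitute the yaml file for your run.
command = build_qa_command("autocrop", "qa_autocrop.yaml")
print(shlex.join(command))
```

The returned list can be passed to subprocess.run on a machine where sbatch is available.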

The contents of the yaml config file are described in the config section below.

Modules and Subprocesses

Each QA module can initiate its underlying process, and also has several 'subprocesses' that can be run before and after it. (These 'subprocesses' can themselves be broken down into smaller substeps if desired.) Below are general descriptions of each subprocess and its substeps (if it has any). Each QA module implements a common base class/interface. For a more detailed look at the functions behind these subprocesses, see the Python class QA_Module in qa_utilities.py.
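To make the shared interface idea concrete, here is a minimal sketch of a base class with one method per subprocess. It is not the real QA_Module from qa_utilities.py: SketchModule, EchoModule, execute and the calls list are hypothetical names, and the method bodies are placeholders that only record that they were called.

```python
from abc import ABC, abstractmethod


class SketchModule(ABC):
    """Hypothetical stand-in for a QA module; not the real QA_Module class."""

    def __init__(self):
        self.calls = []  # records which subprocesses ran, in order

    # Placeholder hooks: the real behavior lives in the actual QA modules.
    def clear(self):
        self.calls.append("clear")

    def archive(self):
        self.calls.append("archive")

    def output_stats(self):
        self.calls.append("output_stats")

    def collate(self):
        self.calls.append("collate")

    @abstractmethod
    def run(self):
        """Each concrete module supplies its own pipeline step."""

    def execute(self, steps):
        """Call the named subprocesses one after another."""
        for name in steps:
            getattr(self, name)()
        return self.calls


class EchoModule(SketchModule):
    """Toy module whose run step only records the call."""

    def run(self):
        self.calls.append("run")


module = EchoModule()
print(module.execute(["clear", "run", "collate"]))
```

Keeping run abstract while the other hooks have defaults is one possible design: each module must define its own pipeline step, and may override the surrounding subprocesses only where it needs to.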
