CIS Converter

A Python tool that converts CIS benchmark PDF documents into structured data formats (CSV or Excel) for easier analysis.

Key Features

Intelligent PDF Parsing: Automatically extracts and structures CIS recommendation data from PDF documents
Multiple Output Formats: Supports both CSV and Excel (.xlsx) output formats
Table of Contents Processing: Leverages TOC to accurately identify and extract individual recommendations
Rich Data Extraction: Captures the standard CIS fields, including description, rationale, audit, remediation, impact, and references
Excel Enhancements: Creates formatted Excel files with:
- Data validation dropdowns for compliance tracking (OK, KO, Partial, N/A, ?)
- Conditional formatting for visual status indicators
- Named tables for easy data manipulation
Batch Processing: Process multiple CIS benchmark PDFs in a single operation
Debugging Support: Comprehensive logging and debug output for troubleshooting
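
To illustrate the Table of Contents processing listed above, the sketch below parses TOC lines with a regular expression. The line layout (number, title, dot leaders, page number), the `TOC_LINE` pattern, and the `parse_toc_line` helper are all assumptions made for this example. They are not taken from cis-converter.py, whose actual parsing may differ.

```python
import re
from typing import NamedTuple, Optional

# Assumed TOC layout: "<number> <title> ..... <page>"
TOC_LINE = re.compile(
    r"^(?P<number>\d+(?:\.\d+)*)\s+(?P<title>.+?)\s*\.{2,}\s*(?P<page>\d+)$"
)


class TocEntry(NamedTuple):
    number: str  # e.g. "2.3.1.6"
    title: str   # text between the number and the dot leaders
    page: int    # page number at the end of the line


def parse_toc_line(line: str) -> Optional[TocEntry]:
    """Return a TocEntry for a line matching the assumed layout, else None."""
    match = TOC_LINE.match(line.strip())
    if match is None:
        return None
    return TocEntry(match["number"], match["title"], int(match["page"]))
```

Lines that do not fit the assumed layout return None, so a caller can skip headings and wrapped titles instead of failing on them.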

Installation

Prerequisites

Python 3.6 or higher
pip package manager

Install Dependencies

pip install -r requirements.txt

Required Dependencies

PyMuPDF: For PDF text extraction and processing
xlsxwriter: For Excel file generation (required only for Excel output format)

Usage

Basic Usage

Convert a CIS benchmark PDF to Excel format:

python cis-converter.py path/to/cis-benchmark.pdf

Convert to CSV format:

python cis-converter.py -f CSV path/to/cis-benchmark.pdf

Process multiple PDF files:

python cis-converter.py -f EXCEL -o output/ file1.pdf file2.pdf file3.pdf

Examples

Convert single PDF to Excel with custom output directory:

python cis-converter.py -f EXCEL -o ./results/ CIS_Ubuntu_Linux_20.04_Benchmark_v1.1.0.pdf

Convert to CSV with custom delimiter and debug logging:

python cis-converter.py -f CSV --csv-delimiter ";" -l DEBUG benchmark.pdf

Batch process multiple benchmarks:

python cis-converter.py -o ./compliance-data/ *.pdf
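
Shells such as bash expand `*.pdf` before the script runs, but Windows `cmd.exe` passes the pattern through unexpanded. Whether cis-converter.py expands wildcards itself is not stated here, so the sketch below builds the same batch command from Python with an explicit file list. The `build_command` helper is hypothetical, and it assumes cis-converter.py sits in the current directory.

```python
import sys
from pathlib import Path
from typing import List


def build_command(pdf_dir: str, output_dir: str,
                  script: str = "cis-converter.py") -> List[str]:
    """Build the batch command line for every PDF found in pdf_dir."""
    pdfs = sorted(str(p) for p in Path(pdf_dir).glob("*.pdf"))
    if not pdfs:
        raise FileNotFoundError(f"no PDF files found in {pdf_dir!r}")
    # Same shape as: python cis-converter.py -o OUTPUT_DIR file1.pdf file2.pdf ...
    return [sys.executable, script, "-o", output_dir, *pdfs]
```

Passing the returned list to `subprocess.run(..., check=True)` would then run the conversion once for all files.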

Output Structure

The tool extracts the following information from each CIS recommendation:

Field	Description
Benchmark	Source PDF filename
CIS #	Recommendation number (e.g., 2.3.1.6)
Scored	Scoring type (Scored/Not Scored/Manual/Automated)
Type	Profile level (L1/L2) or applicability
Policy	Recommendation title/name
Profile Applicability	Target systems and environments
Description	Detailed explanation of the control
Rationale	Why this control is important
Audit	Steps to verify compliance
Result	Compliance status (for tracking)
Comments	Additional notes (for tracking)
Remediation	Steps to implement the control
Impact	Potential effects of implementation
Default Value	System default configuration
References	Related documentation and resources
Additional Information	Extra context and notes
CIS Controls	Mapping to CIS Controls framework
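
Once a benchmark has been converted to CSV, the "Result" column can be tallied for a quick compliance overview. This is a minimal sketch that assumes the CSV header row uses the field names from the table above. The `summarize_results` helper is hypothetical and not part of the tool.

```python
import csv
from collections import Counter


def summarize_results(csv_path: str, delimiter: str = ",") -> Counter:
    """Count rows per value of the "Result" column."""
    # "utf-8-sig" strips the BOM that the CSV output starts with.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # Rows not yet reviewed have an empty Result; group them as "(blank)".
        return Counter(row.get("Result") or "(blank)" for row in reader)
```

For a file converted with `--csv-delimiter ";"`, pass `delimiter=";"` so the columns are split the same way they were written.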

Command Line Options

usage: cis-converter.py [-h] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--debug-file DEBUG_FILE] [-f {CSV,EXCEL}]
                        [-o OUTPUT_DIR] [--csv-quoting {ALL,MINIMAL,NONNUMERIC,NONE,NOTNULL,STRINGS}]
                        [--csv-delimiter CSV_DELIMITER] [--csv-quotechar CSV_QUOTECHAR]
                        [--csv-escapechar CSV_ESCAPECHAR]
                        input_files [input_files ...]

positional arguments:
  input_files           path to the input file(s)

options:
  -h, --help            show this help message and exit
  -l {DEBUG,INFO,WARNING,ERROR,CRITICAL}, --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        set the logging level (default: INFO)
  --debug-file DEBUG_FILE
                        output file for TXT extract from PDF, only if --log-level=DEBUG (default: cis-debug.txt)
  -f {CSV,EXCEL}, --format {CSV,EXCEL}
                        set the output format (default: EXCEL)
  -o OUTPUT_DIR, --output-folder OUTPUT_DIR
                        path to the folder for storing files generated by the script (default: ./)

CSV options:
  --csv-quoting {ALL,MINIMAL,NONNUMERIC,NONE,NOTNULL,STRINGS}
                        set the CSV quoting style (default: ALL)
  --csv-delimiter CSV_DELIMITER
                        set the CSV delimiter (default: ,)
  --csv-quotechar CSV_QUOTECHAR
                        set the CSV quote character (default: ")
  --csv-escapechar CSV_ESCAPECHAR
                        set the CSV escape character (default: \)
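
The `--csv-quoting` choices match the names of the `QUOTE_*` constants in Python's standard `csv` module. The sketch below shows one way such option values could be turned into a `csv.writer`. The `make_writer` helper is hypothetical, and the mapping is an assumption about how the options are used, not a copy of the script's code.

```python
import csv
import io


def make_writer(stream, quoting="ALL", delimiter=",", quotechar='"',
                escapechar="\\"):
    """Create a csv.writer from option values spelled like the CLI choices."""
    # "ALL" -> csv.QUOTE_ALL, "MINIMAL" -> csv.QUOTE_MINIMAL, and so on.
    # QUOTE_NOTNULL and QUOTE_STRINGS only exist on Python 3.12 and later.
    quoting_const = getattr(csv, f"QUOTE_{quoting}", None)
    if quoting_const is None:
        raise ValueError(
            f"quoting style {quoting!r} is not available in this Python version"
        )
    return csv.writer(stream, quoting=quoting_const, delimiter=delimiter,
                      quotechar=quotechar, escapechar=escapechar)


buffer = io.StringIO()
make_writer(buffer, delimiter=";").writerow(["2.3.1.6", "Example policy", "OK"])
print(buffer.getvalue())  # "2.3.1.6";"Example policy";"OK"
```

Because `NOTNULL` and `STRINGS` were added to the `csv` module in Python 3.12, those two choices may not work on older interpreters.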

Output Files

Excel Format (.xlsx)

Main Worksheet: Contains all extracted CIS recommendations with formatted columns
Data Worksheet: Provides validation lists for compliance tracking
Features:
- Data validation dropdowns in the "Result" column
- Conditional formatting with color-coded compliance status
- Text wrapping and proper cell formatting
- Named tables for easy filtering and sorting
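
The features above correspond to standard xlsxwriter calls. The following is a minimal sketch, not the tool's actual implementation: the column subset, the table name, the colors, and the `write_tracking_sheet` helper are assumptions, and the dropdown values are passed inline rather than read from a separate Data worksheet.

```python
STATUSES = ["OK", "KO", "Partial", "N/A", "?"]
HEADERS = ["CIS #", "Policy", "Result"]  # assumed subset of the real columns


def write_tracking_sheet(path, rows):
    """Write rows ([cis_number, policy, result]) to a small formatted workbook."""
    import xlsxwriter  # imported lazily: only needed for Excel output

    workbook = xlsxwriter.Workbook(path)
    sheet = workbook.add_worksheet("CIS")
    last_row = max(len(rows), 1)  # a table needs at least one data row
    result_col = HEADERS.index("Result")

    # Named table: header row plus data, filterable and sortable in Excel.
    sheet.add_table(0, 0, last_row, len(HEADERS) - 1, {
        "name": "Recommendations",
        "columns": [{"header": header} for header in HEADERS],
        "data": rows,
    })
    # Dropdown restricted to the tracking statuses.
    sheet.data_validation(1, result_col, last_row, result_col,
                          {"validate": "list", "source": STATUSES})
    # Color-code two of the statuses.
    green = workbook.add_format({"bg_color": "#C6EFCE"})
    red = workbook.add_format({"bg_color": "#FFC7CE"})
    for status, fmt in (("OK", green), ("KO", red)):
        sheet.conditional_format(1, result_col, last_row, result_col, {
            "type": "cell", "criteria": "==",
            "value": f'"{status}"', "format": fmt,
        })
    workbook.close()
```

Calling `write_tracking_sheet("demo.xlsx", [["2.3.1.6", "Example policy", "OK"]])` would produce a one-row workbook with the dropdown and color rules applied to the Result column.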

CSV Format (.csv)

UTF-8 encoded with a Byte Order Mark (BOM) so that spreadsheet applications such as Excel detect the encoding and display non-ASCII characters correctly
Customizable delimiters and quoting options
Compatible with spreadsheet applications and data analysis tools
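
In Python, the `utf-8-sig` codec writes the BOM on output and strips it on input. The sketch below demonstrates this with a temporary file; the `write_rows` helper and the sample rows are made up for the example and are not taken from the tool.

```python
import codecs
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write rows as CSV; "utf-8-sig" makes Python emit the BOM first."""
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        csv.writer(handle, quoting=csv.QUOTE_ALL).writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    write_rows(path, [["CIS #", "Policy"], ["1.1.1", "Café policy"]])
    # The first three bytes of the file are the UTF-8 BOM.
    with open(path, "rb") as handle:
        starts_with_bom = handle.read(3) == codecs.BOM_UTF8
    # Reading with "utf-8-sig" removes the BOM from the first header name.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle))

print(starts_with_bom)  # True
print(header)           # ['CIS #', 'Policy']
```

Reading the same file with plain `utf-8` would leave the BOM character attached to the first column name, which is a common cause of failed header lookups.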

Troubleshooting

Common Issues

"Table of Contents could not be found"

Ensure the PDF follows standard CIS benchmark format
Check if the PDF has a proper Table of Contents section
Run with --log-level=DEBUG to see detailed extraction information

Missing or incomplete data

Some CIS PDFs may have formatting variations
Use debug mode to examine the extracted text: --log-level=DEBUG
Check the debug output file (default: cis-debug.txt) for parsing details

Debug Mode

Enable debug logging to troubleshoot parsing issues:

python cis-converter.py --log-level=DEBUG --debug-file=debug.txt input.pdf

This will create a detailed log file showing:

Raw text extraction from each page
Formatted and cleaned text
Table of contents parsing results
Section identification and data extraction

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues for:

Bug fixes and improvements
Support for additional CIS benchmark formats
Enhanced parsing algorithms
New output formats

License

This project is licensed under the MIT License.

Acknowledgments

Based on the original CISConverter by Fragtastic
Enhanced for improved parsing performance and reliability
