A high-performance file compression tool implementing a multi-stage compression pipeline: BWT → MTF → RLE → Huffman Coding
- Multi-Stage Compression Pipeline
  - BWT (Burrows-Wheeler Transform): rearranges data to cluster similar characters
  - MTF (Move-to-Front): transforms repeated characters into small values
  - RLE (Run-Length Encoding): compresses runs of identical bytes
  - Huffman Coding: entropy encoding as the final compression stage
- Parallel Processing: OpenMP-based parallelization for improved performance on multi-core systems
- Sequential Mode: optimized single-threaded implementation
- Comprehensive Testing: an 18-test suite covering all pipeline components
- Cross-Platform: works on Linux, macOS, and other POSIX systems
| Test File | Original | Compressed | Compressed Size (% of original) |
|---|---|---|---|
| 1 MB | 1.0 MB | 764 KB | 74.6% |
| 10 MB | 10 MB | 7.5 MB | ~75% |
| 50 MB | 50 MB | 38 MB | ~76% |
| 100 MB | 100 MB | 78 MB | ~78% |
Notes:
- The compression ratio varies by data type (random data is the hardest to compress)
- Repetitive/text data achieves better compression (typically 40-60% of the original size)
- All decompressed files were verified identical to the originals using `diff`
Performance metrics collected using the `-p` flag with different thread counts:
| Threads | Time | Speedup | Efficiency | Cost |
|---|---|---|---|---|
| 1 (seq) | 0.269s | 1.00x | 100.0% | 0.269 |
| 2 | 0.232s | 1.15x | 57.5% | 0.464 |
| 4 | 0.197s | 1.36x | 34.0% | 0.788 |
| 8 | 0.136s | 1.97x | 24.6% | 1.088 |
| 16 | 0.146s | 1.84x | 11.5% | 2.336 |
| Threads | Time | Speedup | Efficiency | Cost |
|---|---|---|---|---|
| 1 (seq) | 1.309s | 1.00x | 100.0% | 1.309 |
| 2 | 1.124s | 1.16x | 58.0% | 2.248 |
| 4 | 0.845s | 1.54x | 38.5% | 3.380 |
| 8 | 0.685s | 1.91x | 23.8% | 5.480 |
| 16 | 0.630s | 2.07x | 12.9% | 10.080 |
| Threads | Time | Speedup | Efficiency | Cost |
|---|---|---|---|---|
| 1 (seq) | 2.632s | 1.00x | 100.0% | 2.632 |
| 2 | 2.000s | 1.31x | 65.5% | 4.000 |
| 4 | 1.587s | 1.65x | 41.2% | 6.348 |
| 8 | 1.265s | 2.08x | 26.0% | 10.120 |
| 16 | 1.260s | 2.08x | 13.0% | 20.160 |
| Threads | Time | Speedup | Efficiency | Cost |
|---|---|---|---|---|
| 1 (seq) | 13.372s | 1.00x | 100.0% | 13.372 |
| 2 | 10.284s | 1.30x | 65.0% | 20.568 |
| 4 | 7.902s | 1.69x | 42.3% | 31.608 |
| 8 | 6.230s | 2.15x | 26.8% | 49.840 |
| 16 | 6.075s | 2.20x | 13.8% | 97.200 |
| Threads | Time | Speedup | Efficiency | Cost |
|---|---|---|---|---|
| 1 (seq) | 27.496s | 1.00x | 100.0% | 27.496 |
| 2 | 27.427s | 1.00x | 50.1% | 54.854 |
| 4 | 17.643s | 1.56x | 39.0% | 70.572 |
| 8 | 12.627s | 2.18x | 27.2% | 101.016 |
| 16 | 12.714s | 2.16x | 13.5% | 203.424 |
Performance Metrics:
- Speedup = Sequential Time / Parallel Time
- Efficiency = (Speedup / Threads) × 100%
- Cost = Threads × Parallel Time
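The three definitions above can be sanity-checked against the 8-thread row of the first table (0.269 s sequential, 0.136 s with 8 threads):

```c
/* Helpers implementing the metric definitions above. */
static double speedup(double t_seq, double t_par) { return t_seq / t_par; }

static double efficiency(double t_seq, double t_par, int threads) {
    return speedup(t_seq, t_par) / threads * 100.0;   /* percent */
}

static double cost(double t_par, int threads) { return threads * t_par; }
```

For 0.269 s / 0.136 s on 8 threads these give roughly 1.98x speedup, ~24.7% efficiency, and a cost of 1.088 thread-seconds, matching the table up to rounding of the reported times.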
- Files ≥100 MB with random/complex data: use 8-16 threads for the best speedup (~2.1-2.2x)
- Files 10-50 MB: use 8 threads for the best balance (~1.9-2.0x speedup)
- Files <10 MB or highly repetitive data: use sequential mode (the parallel overhead is not worth it)
- Best efficiency: 2 threads (up to ~65%)
- Best speedup: 16 threads on large random files (~2.2x max)
- GCC or Clang with C11 support
- OpenMP support (for parallel version)
- Make
```sh
# Build all (parallel and sequential versions)
make

# Build only parallel version
make huffcode

# Build only sequential version
make huffcode_seq

# Build test suite
make test_pipeline

# Clean build artifacts
make clean
```

```sh
# Compress a file (sequential mode)
./huffcode -i input.txt -o output.huff

# Compress with parallel mode
./huffcode -p -i input.txt -o output.huff

# Compress with a specific thread count (parallel mode)
export OMP_NUM_THREADS=8
./huffcode -p -i large_file.txt -o compressed.huff

# Use the sequential version (always single-threaded, no -p flag)
./huffcode_seq -i input.txt -o output.huff
```

```sh
# Decompress a file (sequential mode)
./huffcode -d -i compressed.huff -o decompressed.txt

# Decompress with parallel mode
./huffcode -p -d -i compressed.huff -o decompressed.txt

# Decompress with the sequential version
./huffcode_seq -d -i compressed.huff -o decompressed.txt
```

Options:
- `-i <file>`: Input file path (required)
- `-o <file>`: Output file path (required)
- `-d`: Decompress mode (default is compress)
- `-p`: Use the parallel version (OpenMP multi-threading)
- `OMP_NUM_THREADS`: Environment variable setting the number of threads when using the `-p` flag (default: system max)

```sh
export OMP_NUM_THREADS=8   # Use 8 threads with the -p flag
./huffcode -p -i large_file.txt -o compressed.huff
```
```sh
# Run the complete test suite (sequential, parallel, and pipeline tests)
make check
```

This runs:
- Sequential huffcode tests (basic compression/decompression)
- Parallel huffcode tests (multi-threaded compression)
- Pipeline component tests (18 tests covering BWT, MTF, RLE, Huffman, and integration)

```sh
# Sequential tests only
make check_seq

# Parallel tests only
make check_parallel

# Pipeline tests only (BWT, MTF, RLE, Huffman integration)
make check_pipeline

# Or run the pipeline tests directly
./test_pipeline
```

The test_pipeline suite includes:
- BWT Tests (4): Simple strings, repetitive patterns, edge cases
- MTF Tests (3): Character patterns, binary data
- RLE Tests (3): Run compression, no-run data, long runs
- Pipeline Integration (1): Full BWT→MTF→RLE→Huffman pipeline
- Huffman Tests (5): Small text, repetitive data, binary data, edge cases
- Edge Cases (2): Two-symbol data, medium text (10KB)
```sh
# Run with Valgrind (checks for memory leaks)
make valgrind_check
```
- BWT (Burrows-Wheeler Transform)
  - Rearranges input data using lexicographic rotation sorting
  - Clusters similar characters together
  - Reversible with the stored index position
  - Time: O(n² log n), Space: O(n)
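The rotation-sorting step can be sketched as follows. This is a minimal illustration; `bwt_encode` is a hypothetical name, not necessarily the signature used in `bwt.c`:

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparator state: qsort cannot carry context, so a sketch can use globals. */
static const uint8_t *g_buf;
static size_t g_len;

/* Compare two cyclic rotations of the input, byte by byte. */
static int rot_cmp(const void *a, const void *b) {
    size_t i = *(const size_t *)a, j = *(const size_t *)b;
    for (size_t k = 0; k < g_len; k++) {
        uint8_t ci = g_buf[(i + k) % g_len];
        uint8_t cj = g_buf[(j + k) % g_len];
        if (ci != cj) return ci < cj ? -1 : 1;
    }
    return 0;
}

/* Naive BWT encode: sort all n cyclic rotations lexicographically and
 * emit the last column. Returns the index of the original string among
 * the sorted rotations, which the decoder needs to invert the transform. */
static size_t bwt_encode(const uint8_t *in, size_t n, uint8_t *out) {
    size_t *rot = malloc(n * sizeof *rot);
    for (size_t i = 0; i < n; i++) rot[i] = i;
    g_buf = in;
    g_len = n;
    qsort(rot, n, sizeof *rot, rot_cmp);     /* O(n log n) comparisons, O(n) each */
    size_t primary = 0;
    for (size_t i = 0; i < n; i++) {
        if (rot[i] == 0) primary = i;        /* where the original row landed */
        out[i] = in[(rot[i] + n - 1) % n];   /* last column of this rotation */
    }
    free(rot);
    return primary;
}
```

For example, `bwt_encode` on `"banana"` yields `"nnbaaa"` with primary index 3: the repeated characters end up adjacent, which is exactly what the following MTF stage exploits.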
- MTF (Move-to-Front)
  - Maintains a list of all 256 possible byte values
  - When a byte is encountered, its position is output and the byte is moved to the front
  - Repeated bytes therefore get small position values
  - Time: O(n), Space: O(1)
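The move-to-front step described above can be sketched like this (hypothetical signature; `mtf.c` may organize it differently):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* MTF encode: out[i] is the position of in[i] in a table of all 256 byte
 * values, initially 0,1,...,255; the byte is then moved to the front, so
 * recently seen (and thus repeated) bytes encode to small values. */
static void mtf_encode(const uint8_t *in, size_t n, uint8_t *out) {
    uint8_t table[256];
    for (int i = 0; i < 256; i++) table[i] = (uint8_t)i;
    for (size_t i = 0; i < n; i++) {
        uint8_t c = in[i], pos = 0;
        while (table[pos] != c) pos++;      /* find the byte's current position */
        out[i] = pos;
        memmove(table + 1, table, pos);     /* shift earlier entries back by one */
        table[0] = c;                       /* move the byte to the front */
    }
}
```

Encoding `"aaab"` gives `97, 0, 0, 98`: after the first `'a'` pays its full position, the repeats cost 0, and runs of zeros are what the RLE stage then compresses.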
- RLE (Run-Length Encoding)
  - Compresses runs of 3+ identical bytes
  - Stores the original length in a 4-byte header
  - Preserves incompressible data as-is
  - Time: O(n), Space: O(n)
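The README does not spell out the byte-level format, so the sketch below uses one common 3+-run scheme (not necessarily the format in `rle.c`): literal bytes pass through unchanged, and a run of three or more identical bytes is written as three copies followed by a count of the extra repeats, so non-run data is preserved as-is. The 4-byte original-length header mentioned above would be written separately by the container format.

```c
#include <stddef.h>
#include <stdint.h>

/* RLE encode sketch: copy literals through; emit a run of k >= 3 identical
 * bytes as three copies plus one count byte holding k - 3 (0..255).
 * Returns the encoded length; out must be large enough for the worst case. */
static size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out) {
    size_t o = 0, i = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && in[i + run] == in[i] && run < 3 + 255)
            run++;                               /* measure the run, capped at 258 */
        if (run >= 3) {
            out[o++] = in[i];
            out[o++] = in[i];
            out[o++] = in[i];
            out[o++] = (uint8_t)(run - 3);       /* extra repeats beyond three */
        } else {
            for (size_t k = 0; k < run; k++)     /* short runs stay literal */
                out[o++] = in[i];
        }
        i += run;
    }
    return o;
}
```

For example, five `'a'`s followed by `'b'` encode to `'a','a','a',2,'b'`: six bytes shrink to five, while data with no runs of three is copied unchanged.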
- Huffman Coding
  - Builds a frequency table and an optimal prefix-free code tree
  - Encodes data using variable-length codes
  - Parallel implementation for frequency counting
  - Time: O(n log n), Space: O(n)
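The parallel frequency-counting step can be sketched as below. The OpenMP pragma parallelizes the histogram when compiled with `-fopenmp` and is ignored otherwise, so the function stays correct either way; the name and signature are illustrative, and `huffman_parallel.c` may structure this differently:

```c
#include <stddef.h>
#include <stdint.h>

/* Count byte frequencies, the first stage of Huffman coding. With
 * -fopenmp, each thread accumulates a private copy of freq[] and the
 * copies are summed at the end (array-section reduction); without it,
 * the pragma is ignored and the loop runs serially. */
static void count_frequencies(const uint8_t *data, size_t n,
                              uint64_t freq[256]) {
    for (int i = 0; i < 256; i++) freq[i] = 0;
    #pragma omp parallel for reduction(+:freq[:256])
    for (long long i = 0; i < (long long)n; i++)
        freq[data[i]]++;
}
```

The reduction avoids contended atomic increments on a shared table, which is why frequency counting scales well while the bit-stream emission stage remains largely sequential.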
Compressed files contain:
- Original file size (4 bytes)
- BWT transform index (4 bytes)
- RLE-encoded data length (4 bytes)
- Huffman tree structure
- Compressed bit stream
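The fixed-size portion of that layout could be modeled as a struct like the following. This is purely illustrative: the actual on-disk format written by huffcode is not specified beyond the field list above, and it may write the fields individually rather than as one struct:

```c
#include <stdint.h>

/* Illustrative model of the compressed-file header described above;
 * the real serialization in huffcode may differ (field order, endianness). */
typedef struct {
    uint32_t original_size;   /* size of the uncompressed file, in bytes */
    uint32_t bwt_index;       /* primary index needed to invert the BWT */
    uint32_t rle_length;      /* length of the RLE-encoded data */
    /* followed on disk by the serialized Huffman tree, then the bit stream */
} huff_header;
```

Note that all three fields are `uint32_t`, which is consistent with the ~4 GB file-size ceiling implied by the 4-byte size fields.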
```
.
├── huffcode.c           # Main command-line interface
├── huffman.c            # Huffman coding + pipeline integration
├── huffman_parallel.c   # Parallel Huffman implementation (OpenMP)
├── bwt.c / bwt.h        # Burrows-Wheeler Transform
├── mtf.c / mtf.h        # Move-to-Front encoding
├── rle.c / rle.h        # Run-Length Encoding
├── test_pipeline.c      # Comprehensive test suite (18 tests)
├── run_tests.sh         # Basic compression test script
├── Makefile             # Build system
└── test_data/           # Test files (1 MB-100 MB)
```
| Algorithm | Encoding Time | Encoding Space | Decoding Time | Decoding Space |
|---|---|---|---|---|
| BWT | O(n² log n) | O(n) | O(n) | O(n) |
| MTF | O(n) | O(1) | O(n) | O(1) |
| RLE | O(n) | O(n) | O(n) | O(n) |
| Huffman | O(n log n) | O(n) | O(n) | O(n) |
| Total | O(n² log n) | O(n) | O(n) | O(n) |
Note: BWT dominates encoding complexity; decoding is linear time.
- BWT encoding is O(n² log n) in this implementation, so compression slows noticeably on very large files
- Random data compresses poorly (~75% ratio) due to its high entropy
- Files <1 MB show minimal benefit from parallelization
- The maximum file size is limited by available memory (BWT requires roughly 2x the file size in RAM)
```sh
$ ./huffcode -i document.txt -o document.huff
$ ls -lh document.*
-rw-r--r-- 1 user group 100K document.txt
-rw-r--r-- 1 user group  45K document.huff   # ~45% of original
```

```sh
$ export OMP_NUM_THREADS=8
$ ./huffcode -p -i large_file.txt -o large_file.huff
```

```sh
$ ./huffcode -d -i document.huff -o restored.txt
$ diff document.txt restored.txt
$ echo $?
0   # Files are identical!
```

```sh
# Sequential compression
$ time ./huffcode -i test_data/random_10MB.txt -o /tmp/test.huff
real    0m0.269s
user    0m0.265s
sys     0m0.004s

# Parallel compression (8 threads)
$ export OMP_NUM_THREADS=8
$ time ./huffcode -p -i test_data/random_10MB.txt -o /tmp/test.huff
real    0m0.136s
user    0m0.856s
sys     0m0.008s

# Speedup: 1.97x faster!
```

See the LICENSE file for details.
Contributions welcome! Areas for improvement:
- Faster BWT implementation (suffix array construction)
- Block-based processing for better memory efficiency
- Additional compression algorithms (LZ77, LZMA)
- Improved parallel scaling for small files
- Burrows, M. and Wheeler, D. J. (1994), A block-sorting lossless data compression algorithm
- Huffman, D. A. (1952), A Method for the Construction of Minimum-Redundancy Codes
- Bentley, J. L., et al. (1986), A locally adaptive data compression scheme