A high-performance, parallel, and robust directory comparison and sync tool written in Rust.
Developed using Windsurf and Cursor.
- Modular architecture: Core logic is implemented as a reusable library crate (
lib.rs), with a minimal binary (main.rs). - Structured error handling: All fallible operations use idiomatic
Result<T, E>with custom error types, powered bythiserrorandanyhowfor robust CLI error reporting. - Structured, configurable logging: Uses
logandenv_loggerfor info, warning, and error output. Logging is configurable via theRUST_LOGenvironment variable. - Parallel directory scanning with jwalk for both trees
- Efficient file comparison using size, modification time, and fast hashing (BLAKE3)
- Hash sampling for huge files: only the first and last 64KB are hashed for files >100MB
- Memory-mapped and parallel hashing for large files
- Direct content comparison for very small files
- Batch output: diffs are buffered and written in batches for speed
- Streaming output: minimal memory usage, even for millions of diffs
- Progress bars for counting, scanning (separate for left/right), and diffing/output (with ETA)
- Smart diff output: all diffs streamed to file, summary at end
- Sync and rollback support (with backup, if enabled)
- Synthetic benchmark mode for performance testing
- Configurable parallelism: set thread count via CLI
- Ignore patterns: skip files/folders by pattern (basic substring, can be extended)
- Help/usage output:
--helpfor all options - Comprehensive test harness: Unit and integration tests for all major modules, using
tempfilefor isolated test directories/files. - Graceful shutdown: Handles Ctrl+C (SIGINT) for safe interruption.
folder-differ <left_dir> <right_dir> [--threads N] [--sync] [--dry-run] [--rollback] [--synthetic-benchmark] [--help]
<left_dir>: Path to the left directory<right_dir>: Path to the right directory
--threads N: Set number of threads for parallelism (default: 2x logical CPUs)--sync: Plan and perform sync actions (copy/delete files)--dry-run: Show planned sync actions without making changes--rollback: Roll back the last sync operation using backups--synthetic-benchmark: Run a synthetic benchmark (creates and scans a large fake tree)--help: Show help/usage message
This tool uses structured, configurable logging. To control log verbosity, set the RUST_LOG environment variable. For example:
RUST_LOG=info folder-differ ...
RUST_LOG=debug folder-differ ...
Log output includes info, warning, and error messages for all major phases and error conditions.
- Counting Phase: Recursively counts files and directories in both trees, with a progress bar for each.
- Scanning Phase: Uses jwalk for fast, parallel file listing, with separate progress bars for left and right.
- Diff Calculation: Compares all files by path:
- If only in left/right: marked as such
- If sizes differ: marked as different
- If times differ: hashes compared (BLAKE3, hash sampling for huge files, memory-mapped for large files, direct compare for small)
- If same size/time: assumed identical
- Progress bar with ETA during this phase
- Diff Output:
- All diffs streamed to output file (buffered, thread-safe)
- Summary at end
- Sync/Backup/Rollback (if enabled):
- Plans and performs sync actions (copy, delete, backup)
- Logs actions and supports rollback using backups
folder-differ /path/to/dirA /path/to/dirB --threads 32
- Uses jwalk for fast, parallel directory traversal
- Uses BLAKE3 for fast, parallel hashing
- Hash sampling for huge files
- Batch and streaming output for minimal memory usage
- Progress bars for all major phases
jwalk(parallel directory traversal)blake3(fast, parallel hashing)memmap2(memory-mapped file access)indicatif(progress bars)rayon(parallelism)rustc-hash(fast hash maps/sets)ignore(ignore pattern support, can be extended)num_cpus(CPU count for thread pool)thiserror(custom error types)anyhow(ergonomic error handling in CLI)log(structured logging)env_logger(configurable logging backend)tempfile(test harness)ctrlc(graceful shutdown)
- Rust (edition 2024)
MIT
This crate supports optional features to enable or disable certain functionality at build time:
progress(default): Enables progress bars and related UI (requiresindicatif).benchmarking: Enables the synthetic benchmarking mode (--synthetic-benchmark).sync: Enables directory sync and rollback functionality (--sync,--rollback).
To build without progress bars:
cargo build --no-default-features
To enable benchmarking or sync support:
cargo build --features="benchmarking sync"
You can combine features as needed. By default, only progress is enabled.