PolarWarp

A Polars DataFrame implementation for parsing s3dlio I/O traces and MinIO Warp object test output logs. Part of the sai3 project.

High-performance tool for analyzing storage I/O operation logs (oplog files from sai3-bench, MinIO Warp, etc.).

Features

  • Multi-format support: TSV and CSV files, with automatic zstd decompression and separator detection
  • Size-bucketed analysis: 9 size buckets (zero, 1B-8KiB, ... >2GiB)
  • Summary rows: Aggregate statistics for META (LIST/HEAD/DELETE/STAT), GET, and PUT operations
  • Per-client statistics: Compare performance across multiple clients with --per-client option
  • Per-endpoint statistics: Compare performance across storage endpoints with --per-endpoint option
  • Excel export: Export results to a formatted .xlsx workbook with --excel option
  • Latency percentiles: mean, median, p90, p95, p99, max (statistically valid)
  • Throughput metrics: ops/sec and MiB/sec per bucket
  • Multi-file consolidation: Combine results from multiple agents, with automatic overlap detection — sequential runs are flagged and skipped, partial overlaps are warned and trimmed to the intersection window
  • Time skip: Exclude warmup periods with --skip option
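The overlap handling described above boils down to interval logic over each file's [start, end] time window. A minimal sketch (the function name and return convention are illustrative assumptions, not PolarWarp's actual code):

```python
# Sketch of multi-file overlap detection: given each oplog file's
# (start, end) time window, decide whether the runs overlap and, if so,
# trim all files to the common intersection window.
# Hypothetical helper, not PolarWarp's implementation.

def consolidation_window(windows):
    """windows: list of (start, end) pairs in any comparable time unit.

    Returns the (start, end) intersection window, or None when the runs
    are sequential (empty intersection) and should be skipped."""
    lo = max(s for s, _ in windows)   # latest start across all files
    hi = min(e for _, e in windows)   # earliest end across all files
    if hi <= lo:                      # sequential runs: no overlap
        return None
    return (lo, hi)

# Two partially overlapping runs are trimmed to their common window:
print(consolidation_window([(0, 100), (10, 110)]))   # (10, 100)
# Sequential runs produce no valid window:
print(consolidation_window([(0, 50), (60, 110)]))    # None
```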

Implementations

PolarWarp is available in two implementations with identical functionality and output format:

| Implementation | Speed | Best For |
|----------------|-------|----------|
| Rust | ~1,075K records/sec | Production use, large files, compiled binary |
| Python | ~558K records/sec | Quick analysis, scripting, no compilation |

Performance note: Rust is about 2.3× faster than Python, and both are significantly faster than MinIO warp.

Performance Comparison

Benchmark results processing 2.32M operations (2 × 1.16M files, zstd compressed) on the same machine:

| Tool | Time | Speedup |
|------|------|---------|
| PolarWarp (Rust) | 2.36s | 14.4x faster |
| PolarWarp (Python) | 5.48s | 6.2x faster |
| MinIO warp (merge + analyze) | 34.0s | baseline |

Both PolarWarp implementations provide significantly faster analysis than MinIO's native warp tool.

Multi-File Consolidation Performance

When analyzing multiple files (2 × 1.16M operations = 2.32M total), PolarWarp handles consolidation in a single command, while MinIO warp requires separate merge and analyze steps:

| Tool | Merge Time | Analyze Time | Total Time | Notes |
|------|------------|--------------|------------|-------|
| PolarWarp (Rust) | n/a | n/a | 2.36s | Single command |
| PolarWarp (Python) | n/a | n/a | 5.48s | Single command |
| MinIO warp | 12.58s | 21.41s | 34.0s | Two commands required |

Summary:

  • PolarWarp Rust is 14.4× faster than warp
  • PolarWarp Python is 6.2× faster than warp
  • PolarWarp Rust is 2.3× faster than Python

Resource Scaling Analysis

Measured scaling factors (1 file → 2 files, each 1.16M operations):

| Tool | Time Scaling | Memory Scaling | Memory per Op |
|------|--------------|----------------|---------------|
| PolarWarp (Rust) | 2.1x (linear) | ~1.0x (constant) | 0.52 KB/op |
| PolarWarp (Python) | 1.7x (sub-linear) | ~1.0x (constant) | 0.77 KB/op |
| MinIO warp | 2.0x (linear) | 2.28x (super-linear) | 2.24 KB/op |

Projected Resource Usage at Scale

Moderate scale: 2 × 15M operations (30M total)

| Tool | Projected Time | Projected Memory |
|------|----------------|------------------|
| PolarWarp (Rust) | ~30s | ~16 GB |
| PolarWarp (Python) | ~70s | ~18 GB |
| MinIO warp (merge + analyze) | ~7.5 min | ~67 GB |

Large scale: 8 × 15M operations (120M total)

| Tool | Projected Time | Projected Memory | Feasibility |
|------|----------------|------------------|-------------|
| PolarWarp (Rust) | ~2 min | ~64 GB | ✅ Fits in a 64 GB workstation |
| PolarWarp (Python) | ~4.5 min | ~72 GB | ⚠️ Needs 128 GB or swap |
| MinIO warp | ~30 min | ~270 GB | ❌ Impractical |

Projections based on measured scaling factors. warp's super-linear memory growth (2.28x per 2x data) makes it impractical for large-scale analysis, while PolarWarp's linear scaling remains manageable.
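As a sanity check on one row of the table above, the Rust projection follows directly from its measured per-op footprint (simple arithmetic, assuming the constant per-op memory scaling reported for Rust holds at scale):

```python
# Sanity check: Rust's projected memory at 120M ops from its measured
# per-op footprint, assuming linear (constant per-op) memory scaling.
kb_per_op = 0.52                   # measured KB per operation (Rust)
ops = 120_000_000                  # 8 x 15M operations
gb = kb_per_op * ops / 1_000_000   # KB -> GB (decimal)
print(f"~{gb:.0f} GB")             # ~62 GB, in line with the table's ~64 GB
```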

Quick Start - Rust

cd rust
cargo build --release
./target/release/polarwarp-rs oplog.tsv.zst

# Skip warmup period
./target/release/polarwarp-rs --skip=90s oplog.tsv.zst

# Compare performance across multiple clients
./target/release/polarwarp-rs --per-client oplog.tsv.zst

# Export results to Excel
./target/release/polarwarp-rs --excel oplog.tsv.zst

# Per-endpoint breakdown
./target/release/polarwarp-rs --per-endpoint oplog.tsv.zst

Quick Start - Python

cd python
uv run ./polarwarp.py oplog.csv.zst

# Skip warmup period
uv run ./polarwarp.py --skip=90s oplog.csv.zst

# Compare performance across multiple clients
uv run ./polarwarp.py --per-client oplog.csv.zst

# Export results to Excel
uv run ./polarwarp.py --excel oplog.csv.zst

# Per-endpoint breakdown
uv run ./polarwarp.py --per-endpoint oplog.csv.zst

Output Format

Both implementations produce identical output:

      op bytes_bucket bucket_# mean_lat_us med._lat_us 90%_lat_us 95%_lat_us 99%_lat_us max_lat_us avg_obj_KB ops_/_sec xput_MBps     count max_threads runtime_s
    LIST         zero        0      533.98      533.98     533.98     533.98     533.98     533.98       0.00      0.20      0.00         1           1      5.00
     GET      1B-8KiB        1       76.18       71.97     114.27     128.50     160.82   1,173.53       4.00 47,394.46    185.13   236,971           8      5.00

Per-Client Statistics

When running tests with multiple clients (each with a unique client_id in the oplog), use the --per-client flag to see performance variation across clients:

# Rust
./target/release/polarwarp-rs --per-client multi_client_oplog.csv.zst

# Python
uv run ./polarwarp.py --per-client multi_client_oplog.csv.zst

This produces additional output showing:

  • Overall statistics per client (latency, throughput, ops/sec)
  • Per-client breakdown by operation type (META, GET, PUT)

Example output:

================================================================================
Per-Client Statistics (2 clients detected)
================================================================================
      client_id mean_lat_us med._lat_us 90%_lat_us 95%_lat_us 99%_lat_us max_lat_us avg_obj_KB ops_/_sec xput_MBps     count
        client1    5,576.82    3,169.10  11,726.88  17,209.86  37,049.76  59,383.34   1,004.26  1,822.37  1,787.24     5,000
        client2    5,356.29    3,164.20  11,226.17  15,567.94  34,929.35  52,315.44   1,000.71  1,822.37  1,780.93     5,000

Per-Client Statistics by Operation Type:
--------------------------------------------------------------------------------

GET Operations:
      client_id mean_lat_us med._lat_us 99%_lat_us ops_/_sec xput_MBps     count
        client1    5,042.83    3,634.59  17,249.31    820.07  1,323.39     2,250
        client2    4,997.21    3,674.98  16,892.96    820.07  1,366.02     2,250

This helps identify:

  • Performance variability between clients
  • Outlier clients with higher latency or lower throughput
  • Load balancing issues or network bottlenecks
  • Client-specific problems in distributed tests
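A toy illustration of the per-client split (stdlib sketch with made-up numbers; PolarWarp presumably does this with a Polars group_by over the client_id column):

```python
from collections import defaultdict
from statistics import fmean

# (client_id, latency_us) pairs; made-up values.
ops = [("client1", 100.0), ("client1", 120.0),
       ("client2", 90.0), ("client2", 95.0)]

# Group latencies per client, then summarize each group.
by_client = defaultdict(list)
for client, lat in ops:
    by_client[client].append(lat)

for client, lats in sorted(by_client.items()):
    print(client, fmean(lats), len(lats))
# client1 110.0 2
# client2 92.5 2
```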

Size Buckets

Both implementations use identical bucket definitions (matching sai3-bench):

| Bucket # | Label | Size Range |
|----------|-------|------------|
| 0 | zero | 0 bytes (metadata ops) |
| 1 | 1B-8KiB | 1 B to 8 KiB |
| 2 | 8KiB-64KiB | 8 KiB to 64 KiB |
| 3 | 64KiB-512KiB | 64 KiB to 512 KiB |
| 4 | 512KiB-4MiB | 512 KiB to 4 MiB |
| 5 | 4MiB-32MiB | 4 MiB to 32 MiB |
| 6 | 32MiB-256MiB | 32 MiB to 256 MiB |
| 7 | 256MiB-2GiB | 256 MiB to 2 GiB |
| 8 | >2GiB | Greater than 2 GiB |
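A minimal sketch of the bucketing rule (hypothetical helper; the exact boundary inclusivity used by sai3-bench is an assumption here):

```python
# Map a byte count to its size-bucket number, per the table above.
# Treating each upper bound as inclusive is an assumption.
BOUNDS = [8 * 1024, 64 * 1024, 512 * 1024,
          4 * 1024**2, 32 * 1024**2, 256 * 1024**2, 2 * 1024**3]

def size_bucket(nbytes):
    if nbytes == 0:
        return 0                    # bucket 0: metadata ops
    for i, upper in enumerate(BOUNDS, start=1):
        if nbytes <= upper:
            return i
    return 8                        # >2GiB

print(size_bucket(0), size_bucket(4096), size_bucket(3 * 1024**3))  # 0 1 8
```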

Input File Format

Expected columns (sai3-bench oplog format):

idx  thread  op  client_id  n_objects  bytes  endpoint  file  error  start  first_byte  end  duration_ns

Also supports MinIO Warp CSV output format.
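A minimal sketch of reading one oplog row in the column order shown above (stdlib only, with a made-up sample row; PolarWarp itself loads the whole file into a Polars DataFrame):

```python
import csv
import io

# One made-up row in the sai3-bench oplog column order shown above.
HEADER = ("idx\tthread\top\tclient_id\tn_objects\tbytes\tendpoint\t"
          "file\terror\tstart\tfirst_byte\tend\tduration_ns\n")
ROW = "0\t1\tGET\tclient1\t1\t4096\thttp://s3:9000\tobj-0\t\t0\t0\t0\t76180\n"

reader = csv.DictReader(io.StringIO(HEADER + ROW), delimiter="\t")
rec = next(reader)
latency_us = int(rec["duration_ns"]) / 1_000    # ns -> us
print(rec["op"], int(rec["bytes"]), latency_us)  # GET 4096 76.18
```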

Related Projects

Changelog

See Changelog.md for a detailed history of changes.

License

Licensed under the Apache License, Version 2.0
