Skip to content

A high-performance, secure file compression library with password protection, written in Rust with Python bindings

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

johnnywale/rustyzip

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rustyzipper

CI PyPI Python License

A high-performance, secure file compression library with password protection, written in Rust with Python bindings.

rustyzipper is a modern, actively maintained replacement for pyminizip, addressing critical security vulnerabilities while more encryption options.

Why rustyzipper?

Problems with pyminizip:

  • Abandoned (last update years ago)
  • Security vulnerabilities (CVE-2022-37434)
  • Outdated zlib version
  • No AES-256 support

rustyzipper advantages:

  • Actively maintained with regular updates
  • No known security vulnerabilities
  • Modern zlib (latest version)
  • AES-256 encryption for sensitive data
  • Drop-in pyminizip replacement
  • Windows Explorer compatible (ZipCrypto option)
  • Zero Python dependencies (fully self-contained)

Installation

pip install rustyzipper

Quick Start

Modern API (Recommended)

from rustyzipper import compress_file, decompress_file, EncryptionMethod, SecurityPolicy

# Secure compression with AES-256 (recommended for sensitive data)
compress_file("document.pdf", "secure.zip", password="MySecureP@ssw0rd")

# Windows Explorer compatible (weak security, use only for non-sensitive files)
compress_file(
    "public.pdf",
    "compatible.zip",
    password="simple123",
    encryption=EncryptionMethod.ZIPCRYPTO,
    suppress_warning=True
)

# Decompress with default security (protected against ZIP bombs)
decompress_file("secure.zip", "extracted/", password="MySecureP@ssw0rd")

# Decompress with custom security policy
policy = SecurityPolicy(max_size="10GB", max_ratio=1000)
decompress_file("large.zip", "extracted/", policy=policy)

# Decompress with unlimited policy (for trusted archives only)
decompress_file("trusted.zip", "out/", policy=SecurityPolicy.unlimited())

pyminizip Compatibility (No Code Changes Required)

# Change this line:
# import pyminizip

# To this:
from rustyzipper.compat import pyminizip

# Rest of your code works as-is!
pyminizip.compress("file.txt", None, "output.zip", "password", 5)
pyminizip.uncompress("output.zip", "password", "extracted/", 0)

Features

Encryption Methods

Method Security Compatibility Use Case
AES-256 Strong 7-Zip, WinRAR, WinZip Sensitive data
ZipCrypto Weak Windows Explorer, All tools Maximum compatibility
None None All tools Non-sensitive data

Compression Levels

from rustyzipper import CompressionLevel

CompressionLevel.STORE    # No compression (fastest)
CompressionLevel.FAST     # Fast compression
CompressionLevel.DEFAULT  # Balanced (recommended)
CompressionLevel.BEST     # Maximum compression (slowest)

API Reference

compress_file

compress_file(
    input_path: str,
    output_path: str,
    password: str | None = None,
    encryption: EncryptionMethod = EncryptionMethod.AES256,
    compression_level: CompressionLevel = CompressionLevel.DEFAULT,
    suppress_warning: bool = False
) -> None

compress_files

compress_files(
    input_paths: list[str],
    output_path: str,
    password: str | None = None,
    encryption: EncryptionMethod = EncryptionMethod.AES256,
    compression_level: CompressionLevel = CompressionLevel.DEFAULT,
    prefixes: list[str | None] | None = None
) -> None

compress_directory

compress_directory(
    input_dir: str,
    output_path: str,
    password: str | None = None,
    encryption: EncryptionMethod = EncryptionMethod.AES256,
    compression_level: CompressionLevel = CompressionLevel.DEFAULT,
    include_patterns: list[str] | None = None,
    exclude_patterns: list[str] | None = None
) -> None

decompress_file

decompress_file(
    input_path: str,
    output_path: str,
    password: str | None = None,
    *,
    policy: SecurityPolicy | None = None
) -> None

SecurityPolicy

# Create a policy with custom limits
policy = SecurityPolicy(
    max_size: int | str = None,    # e.g., "10GB", "500MB", or bytes
    max_ratio: int = None,          # e.g., 1000 for 1000:1
    allow_symlinks: bool = False
)

# Factory methods
SecurityPolicy.unlimited()           # No limits (use with trusted archives only)
SecurityPolicy.strict()              # 100MB max, 100:1 ratio (for untrusted content)
SecurityPolicy.strict("50MB", 50)    # Custom strict limits

Examples

Compress a Directory with Filters

from rustyzipper import compress_directory, EncryptionMethod

compress_directory(
    "my_project/",
    "backup.zip",
    password="BackupP@ss",
    encryption=EncryptionMethod.AES256,
    include_patterns=["*.py", "*.md", "*.json"],
    exclude_patterns=["__pycache__", "*.pyc", ".git", "node_modules"]
)

Compress Multiple Files

from rustyzipper import compress_files

compress_files(
    ["report.pdf", "data.csv", "summary.txt"],
    "documents.zip",
    password="secret",
    prefixes=["reports", "data", "reports"]  # Archive paths
)

In-Memory and Streaming Compression

rustyzipper provides multiple APIs for different use cases, from simple in-memory operations to memory-efficient streaming for large files.

In-Memory Compression (compress_bytes / decompress_bytes)

Compress and decompress data directly in memory without filesystem I/O. Ideal for web applications, APIs, and processing data in memory.

from rustyzipper import compress_bytes, decompress_bytes, EncryptionMethod

# Compress multiple files to bytes
files = [
    ("hello.txt", b"Hello, World!"),
    ("data/config.json", b'{"key": "value"}'),
    ("binary.bin", bytes(range(256))),
]
zip_data = compress_bytes(files, password="secret")

# Send over network, store in database, etc.
# ...

# Decompress back to list of (filename, content) tuples
extracted = decompress_bytes(zip_data, password="secret")
for filename, content in extracted:
    print(f"{filename}: {len(content)} bytes")

Note: These functions load all data into memory. For large files, use streaming APIs below.

Streaming Compression (compress_stream / decompress_stream)

Stream data through file-like objects. Compresses in 64KB chunks to reduce memory usage.

import io
from rustyzipper import compress_stream, decompress_stream, EncryptionMethod

# Compress from file handles to BytesIO
output = io.BytesIO()
with open("large_file.bin", "rb") as f1, open("another.txt", "rb") as f2:
    compress_stream(
        [("large_file.bin", f1), ("another.txt", f2)],
        output,
        password="secret"
    )

zip_data = output.getvalue()

# Decompress from BytesIO
output.seek(0)
files = decompress_stream(output, password="secret")

# Or stream directly to/from files
with open("output.zip", "wb") as out:
    with open("input.txt", "rb") as inp:
        compress_stream([("input.txt", inp)], out, encryption=EncryptionMethod.NONE)

Per-File Streaming Iterator (open_zip_stream)

For processing large ZIP archives with many files, use the streaming iterator to decompress one file at a time. This keeps only one decompressed file in memory at any moment.

from rustyzipper import open_zip_stream, open_zip_stream_from_file

# From bytes
zip_data = open("archive.zip", "rb").read()
for filename, content in open_zip_stream(zip_data, password="secret"):
    print(f"Processing {filename}: {len(content)} bytes")
    process_file(content)
    # Previous file's content is garbage collected

# From file handle
with open("archive.zip", "rb") as f:
    for filename, content in open_zip_stream_from_file(f):
        process_file(content)

# Random access and inspection
reader = open_zip_stream(zip_data)
print(f"Files: {reader.namelist()}")        # List all files
print(f"Count: {len(reader)}")              # Number of files
content = reader.read("specific_file.txt")  # Extract by name

Memory Comparison

Function ZIP Data Decompressed Files Best For
decompress_bytes() All in memory All at once Small archives
decompress_stream() Streamed All at once Large ZIP, small files
open_zip_stream() All in memory One at a time Many files, ZIP fits in RAM
open_zip_stream_from_file() Streamed One at a time Huge ZIP files

Choosing the Right Function

Is your ZIP file small (< 100MB)?
├── Yes → Use decompress_bytes() or open_zip_stream()
└── No → Is the ZIP itself too large for memory?
    ├── Yes → Use open_zip_stream_from_file() (true streaming)
    └── No → Use open_zip_stream() (ZIP in memory, files streamed)

Example: 10GB ZIP with 100 files (100MB each when decompressed)

Function Peak Memory
decompress_bytes() ~10GB (all decompressed at once)
open_zip_stream() ~compressed size + 100MB (one file at a time)
open_zip_stream_from_file() ~100MB only (true streaming)

API Reference

compress_bytes

compress_bytes(
    files: list[tuple[str, bytes]],
    password: str | None = None,
    encryption: EncryptionMethod = EncryptionMethod.AES256,
    compression_level: CompressionLevel = CompressionLevel.DEFAULT,
    suppress_warning: bool = False
) -> bytes

decompress_bytes

decompress_bytes(
    data: bytes,
    password: str | None = None
) -> list[tuple[str, bytes]]

compress_stream

compress_stream(
    files: list[tuple[str, BinaryIO]],
    output: BinaryIO,
    password: str | None = None,
    encryption: EncryptionMethod = EncryptionMethod.AES256,
    compression_level: CompressionLevel = CompressionLevel.DEFAULT,
    suppress_warning: bool = False
) -> None

decompress_stream

decompress_stream(
    input: BinaryIO,
    password: str | None = None
) -> list[tuple[str, bytes]]

open_zip_stream

open_zip_stream(
    data: bytes,
    password: str | None = None
) -> ZipStreamReader

# ZipStreamReader supports:
# - Iteration: for filename, content in reader
# - len(reader): Number of files
# - reader.namelist(): List of filenames
# - reader.read(name): Extract specific file
# - reader.file_count: Number of files (property)
# - reader.total_entries: Total entries including directories (property)

open_zip_stream_from_file

open_zip_stream_from_file(
    input: BinaryIO,
    password: str | None = None
) -> ZipFileStreamReader

# ZipFileStreamReader supports the same interface as ZipStreamReader
# but reads directly from the file handle (true streaming).
# NOTE: File handle must remain open during iteration!

Usage example:

from rustyzipper import open_zip_stream_from_file

# True streaming - even 100GB ZIP files work with minimal memory
with open("huge_archive.zip", "rb") as f:
    for filename, content in open_zip_stream_from_file(f):
        process_file(content)  # Only one file in memory at a time

Security Features

rustyzipper is secure by default with built-in protection against common ZIP vulnerabilities.

Built-in Protections

Protection Default Description
ZIP Bomb (Size) 2 GB max Prevents extraction of archives that decompress to massive sizes
ZIP Bomb (Ratio) 500:1 max Detects suspiciously high compression ratios
Path Traversal Always on Blocks ../ attacks that could write outside target directory
Symlinks Blocked Prevents symlink-based escape attacks
Memory Zeroization Active Passwords are securely erased from memory after use
Thread Pool Capping Auto Dedicated thread pool prevents CPU starvation

Compat API Security Settings

The uncompress() function supports optional security parameters while maintaining full backward compatibility:

from rustyzipper.compat import pyminizip

# Basic usage - protected by secure defaults (2GB/500:1 limits)
pyminizip.uncompress("archive.zip", "password", "output/", 0)

# Disable size limit for known-large archives
pyminizip.uncompress("large.zip", "password", "output/", 0, max_size=0)

# Custom limits for specific use cases
pyminizip.uncompress(
    "archive.zip",
    "password",
    "output/",
    0,
    max_size=10 * 1024 * 1024 * 1024,  # 10 GB
    max_ratio=1000                      # Allow 1000:1 compression ratio
)

# Full parameter list
pyminizip.uncompress(
    src,           # Source ZIP path
    password,      # Password (or None)
    dst,           # Destination directory
    withoutpath,   # 0=preserve paths, 1=flatten
    max_size=None, # Max decompressed size in bytes (default: 2GB, 0=disable)
    max_ratio=None,# Max compression ratio (default: 500, 0=disable)
    allow_symlinks=False  # Allow symlink extraction (default: False)
)

Security Settings Summary

Parameter Default Description
max_size 2 GB Maximum total decompressed size. Set to 0 to disable.
max_ratio 500 Maximum compression ratio. Set to 0 to disable.
allow_symlinks False Whether to extract symbolic links (reserved for future use).

Handling ZIP Bomb Errors

from rustyzipper.compat import pyminizip
from rustyzipper import RustyZipError

try:
    pyminizip.uncompress("suspicious.zip", None, "output/", 0)
except RustyZipError as e:
    if "ZipBomb" in str(e):
        print("Archive exceeds safe decompression limits")
    elif "SuspiciousCompressionRatio" in str(e):
        print("Archive has suspiciously high compression ratio")

Security Guidelines

DO:

  • Use AES-256 for sensitive data
  • Use strong passwords (12+ characters, mixed case, numbers, symbols)
  • Store passwords in a password manager
  • Use unique passwords for each archive
  • Keep default security limits unless you have a specific reason to change them

DON'T:

  • Use ZipCrypto for sensitive data (it's weak!)
  • Use weak or common passwords
  • Share passwords via insecure channels
  • Reuse passwords across archives
  • Disable security limits (max_size=0) without understanding the risks

Platform Support

Platform Architecture Status
Windows 10+ x64, x86, ARM64 Supported
Linux (glibc) x64, ARM64 Supported
macOS 11+ x64, ARM64 (Apple Silicon) Supported

Python Version Support

  • Python 3.8+

Building from Source

Prerequisites

  • Rust 1.70+
  • Python 3.8+
  • maturin (pip install maturin)

Build

git clone https://github.com/johnnywale/rustyzipper.git
cd rustyzipper

# Development build
maturin develop

# Release build
maturin build --release

Run Tests

# Rust tests
cargo test

# Python tests
pip install pytest
pytest python/tests/

Comparison with pyminizip

Feature pyminizip rustyzipper
Maintenance Status Abandoned Active
Security Vulnerabilities Multiple CVEs None known
zlib Version Outdated Latest
AES-256 Support No Yes
Memory Safety C/C++ risks Rust guarantees
Windows Explorer Support Yes (ZipCrypto) Yes (ZipCrypto)
API Compatibility N/A Drop-in replacement
Installation Requires compiler Prebuilt wheels
Type Hints No Yes

License

Dual-licensed under MIT or Apache 2.0 at your option.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Development Setup

# Clone the repository
git clone https://github.com/johnnywale/rustyzipper.git
cd rustyzipper

# Install development dependencies
pip install maturin pytest

# Build and install in development mode
maturin develop

# Run tests
cargo test                    # Rust tests
pytest python/tests/ -v       # Python tests

Supported Platforms

Platform Architectures
Linux (glibc/musl) x86_64, aarch64, armv7, i686
Windows x86_64, i686
macOS x86_64, aarch64 (Apple Silicon)

Links

About

A high-performance, secure file compression library with password protection, written in Rust with Python bindings

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Packages

No packages published