Releases: ZON-Format/zon-TS
v1.3.0
Full Changelog: v1.1.0...v1.3.0
[1.3.0] - 2025-12-04
Changed
- Code Quality: Refactored entire codebase to remove inline implementation comments, relying on clear code structure and JSDoc.
- Documentation: Updated `SPEC.md`, `docs/advanced-features.md`, and all other documentation files to reflect v1.3.0.
- CI/CD: Updated GitHub Workflows (`ci.yml`, `llm-evals.yml`) to use Node 20 and correct script paths for example generation and verification.
- Evaluation Scripts: Updated `eval:baseline` and `eval:check-regressions` to support `--type=full` for comprehensive accuracy benchmarks.
Fixed
- ZonDecoder: Fixed a critical bug where escaped quotes (`""`) inside a quoted string caused incorrect splitting of values.
- Round-Trip Verification: Fixed round-trip failures in `llm-optimized` mode by normalizing timestamps in datasets to include milliseconds (`.000Z`).
- Utils: Fixed `parseValue` to correctly handle ZON-style double quoting (`""`) mixed with standard JSON escapes.
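To illustrate the class of bug behind the escaped-quote fix, here is a minimal sketch of delimiter splitting that treats a doubled quote inside a quoted field as one literal quote. The function name `splitQuoted` and its behavior are assumptions made for illustration; this is not the library's decoder code.

```typescript
// Hypothetical helper, not the actual ZonDecoder implementation.
// Splits a row on a delimiter; inside a quoted field, "" means one literal quote.
function splitQuoted(line: string, delimiter: string = ","): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"'; // doubled quote: keep a single literal quote
        i++; // skip the second quote of the pair
      } else if (ch === '"') {
        inQuotes = false; // closing quote ends the quoted region
      } else {
        current += ch; // delimiters inside quotes are plain text
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}
```

A splitter that toggles its quote state on every `"` without looking ahead would treat the first quote of a `""` pair as the end of the field, which is one way values can end up split incorrectly.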
Added
- Documentation: Added "Adaptive Encoding" and "Binary Format" sections to `docs/advanced-features.md`.
v1.2.0
Full Changelog: v1.1.0...v1.2.0
[1.2.0] - 2025-12-03
Major Release: Enterprise Features & Production Readiness
This release transforms ZON into an enterprise-grade data format with versioning, adaptive encoding, binary format, comprehensive developer tools, automated testing, and production documentation.
Added
Phase 1: Document-Level Schema Versioning
- Version Embedding/Extraction: `embedVersion()` and `extractVersion()` for metadata management
- Migration Manager: `ZonMigrationManager` with BFS path-finding for schema evolution
- Backward/Forward Compatibility: Automatic migration between schema versions
- Test Coverage: 45 tests covering all versioning scenarios
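The BFS path-finding mentioned for the migration manager can be sketched as a breadth-first search over a graph whose nodes are schema versions and whose edges are registered migrations. The `Migration` type and `findMigrationPath` function below are hypothetical names for illustration and do not reflect the real `ZonMigrationManager` API.

```typescript
// Hypothetical types and names; not the real ZonMigrationManager API.
type Doc = Record<string, unknown>;
type Migration = { from: string; to: string; apply: (doc: Doc) => Doc };

// Breadth-first search: returns a chain with the fewest migration steps, or null.
function findMigrationPath(migrations: Migration[], from: string, to: string): Migration[] | null {
  if (from === to) return [];
  const visited = new Set<string>([from]);
  const queue: { version: string; path: Migration[] }[] = [{ version: from, path: [] }];

  while (queue.length > 0) {
    const { version, path } = queue.shift()!;
    for (const m of migrations) {
      if (m.from !== version || visited.has(m.to)) continue;
      const nextPath = [...path, m];
      if (m.to === to) return nextPath; // first hit is a shortest path
      visited.add(m.to);
      queue.push({ version: m.to, path: nextPath });
    }
  }
  return null; // no chain of migrations connects the two versions
}
```

Because BFS explores versions in order of distance, a direct migration is preferred over a longer chain when both exist.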
Phase 2: Adaptive Format Selection
- 4 Encoding Modes: `auto`, `compact`, `readable`, `llm-optimized`
- Data Complexity Analyzer: Automatic analysis of nesting depth, irregularity, field count
- Mode Recommendation: `recommendMode()` suggests optimal encoding based on data structure
- Test Coverage: 20 tests for adaptive encoding
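As a rough idea of what one input to a complexity analyzer might look like, the sketch below measures nesting depth for JSON-like data. The `Json` type and `nestingDepth` function are illustrative assumptions; the release notes do not specify how the library computes its metrics or how `recommendMode()` weighs them.

```typescript
// Hypothetical metric; the library's analyzer may define depth differently.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Primitives have depth 0; each enclosing array or object adds 1.
function nestingDepth(value: Json): number {
  if (value === null || typeof value !== "object") return 0;
  let deepest = 0;
  if (Array.isArray(value)) {
    for (const item of value) deepest = Math.max(deepest, nestingDepth(item));
  } else {
    for (const key of Object.keys(value)) deepest = Math.max(deepest, nestingDepth(value[key]));
  }
  return 1 + deepest;
}
```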
Phase 3: Binary ZON Format (ZON-B)
- MessagePack-Inspired Encoding: Compact binary format with magic header (`ZNB\x01`)
- 40-60% Space Savings: Significantly smaller than JSON while maintaining structure
- Full Type Support: Primitives, arrays, objects, nested structures
- APIs: `encodeBinary()`, `decodeBinary()` with round-trip validation
- Test Coverage: 18 tests for binary format
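A magic header lets a reader tell a binary payload from text before attempting to decode it. The sketch below checks for the four bytes quoted above; the helper name `hasZonBinaryMagic` is an assumption, and nothing here describes the rest of the ZON-B layout.

```typescript
// The four header bytes quoted in the release notes: "ZNB" followed by 0x01.
const ZON_B_MAGIC: number[] = [0x5a, 0x4e, 0x42, 0x01];

// Hypothetical helper, not part of the documented API.
function hasZonBinaryMagic(bytes: Uint8Array): boolean {
  if (bytes.length < ZON_B_MAGIC.length) return false;
  return ZON_B_MAGIC.every((expected, i) => bytes[i] === expected);
}
```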
Phase 5: Developer Experience Tools
- Format Converters: JSON ↔ ZON ↔ Binary with `BatchConverter`
- Helper Utilities: `size()`, `compareFormats()`, `analyze()`, `inferSchema()`, `compare()`, `isSafe()`
- Pretty Printer: Syntax highlighting with colors, `diffPrint()` for visual diffs
- Enhanced Validator: Linting rules for depth, fields, performance with best practice suggestions
- CLI Enhancements:
  - `zon analyze` - Data complexity analysis
  - `zon diff` - Visual file comparison
  - `zon validate --strict` - Strict validation with linting
  - `zon convert --to=binary` - Binary format conversion
  - `zon format --colors` - Pretty printing with syntax highlighting
Phase 6: LLM Evaluation Framework
- ZonEvaluator Engine: Core evaluation framework with metric registration
- 7 Built-in Metrics: exactMatch, tokenEfficiency, structuralValidity, formatCorrectness, partialMatch, hallucination, latency
- Regression Detection: Compare baseline vs current results
- Dataset Management: `DatasetRegistry` with versioning and tagging
- Storage Backends: `FileEvalStorage` and `MemoryEvalStorage`
Phase 7: CI/CD Integration
- GitHub Actions Workflow: Automated evaluations on PRs and main branch
- Smoke Tests: Fast <1min tests on every PR
- Regression Detection: Automatic baseline comparison with severity levels (critical/major/minor)
- PR Comments: Auto-post eval results with metrics tables
- Baseline Management: Auto-save successful builds as new baselines
- NPM Scripts: `eval:smoke`, `eval:check-regressions`, `eval:baseline`
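Baseline comparison with severity levels could look roughly like the sketch below. It assumes every metric is "higher is better" and uses made-up thresholds; the names `detectRegressions`, `Regression`, and `Severity` are hypothetical and are not taken from the evaluation framework.

```typescript
// Hypothetical sketch; thresholds and names are illustrative, not the framework's.
type Severity = "critical" | "major" | "minor";
type Regression = { metric: string; baseline: number; current: number; drop: number; severity: Severity };

// Assumes every metric is "higher is better" and lies on a 0..1 scale.
function detectRegressions(baseline: Record<string, number>, current: Record<string, number>): Regression[] {
  const regressions: Regression[] = [];
  for (const metric of Object.keys(baseline)) {
    const before = baseline[metric];
    const after = current[metric];
    if (after === undefined || after >= before) continue; // missing or not worse
    const drop = before - after;
    // Made-up cutoffs: 0.10 or more is critical, 0.05 or more is major.
    const severity: Severity = drop >= 0.1 ? "critical" : drop >= 0.05 ? "major" : "minor";
    regressions.push({ metric, baseline: before, current: after, drop, severity });
  }
  return regressions;
}
```

A CI job built on this idea might fail the build only when at least one regression is critical and report the rest in a PR comment.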
Phase 4: Production Documentation
- Production Architecture Guide: Multi-format strategy, versioning workflows, API patterns
- Best Practices Guide: Code organization, error handling, testing, security
- Migration Examples: Batch migration scripts with stats tracking
- Express Middleware: Content negotiation for JSON/ZON/Binary formats
Changed
- Code Quality: Removed inline comments from core files (`encoder.ts`, `decoder.ts`)
- Documentation: All functions have proper JSDoc/TSDoc documentation
- Build System: Still compiles cleanly with TypeScript
Performance
- Binary Format: 40-60% smaller than JSON
- ZON Text: Maintains 16-19% smaller than JSON
- Test Suite: All 288 tests passing
Documentation
- New Guides: Production architecture, best practices, developer tools
- Working Examples: Express middleware, migration scripts
- API Reference: Complete documentation for all new APIs
Testing
- Total Tests: 288 passing (up from 175)
- Test Coverage: 100% for all new features
- No Regressions: Full backward compatibility maintained
Development
- Total Code: 4,250+ lines of production code
- Files Added: 21 new files (binary/, evals/, tools/, docs/, examples/)
- Quality: Professional-grade documentation, no inline comments
v1.1.0
[1.1.0] - 2025-12-01
Full Changelog: v1.0.5...v1.1.0
Major Release: Ecosystem Integrations & Streaming
This release transforms ZON into a production-ready format with first-class support for modern AI frameworks and streaming workflows.
Added
Phase 5: Ecosystem Integrations
- LangChain Integration (`zon-format/langchain`): `ZonOutputParser` for seamless LLM chain integration
- Vercel AI SDK Integration (`zon-format/ai-sdk`): `streamZon` helper for Next.js streaming UI
- OpenAI Helper (`zon-format/openai`): `ZOpenAI` wrapper with automatic format injection
Phase 4: Developer Experience
- VS Code Extension: Syntax highlighting for `.zon` and `.zonf` files
- Performance Benchmarks: Automated benchmark suite comparing ZON vs JSON/MsgPack
- CLI Enhancements: `validate`, `stats`, and `format` commands
Phase 3: Streaming & Utilities
- Streaming APIs: `ZonStreamEncoder` and `ZonStreamDecoder` for memory-efficient processing
- Browser/Edge Support: Verified compatibility with Cloudflare Workers and Vercel Edge
- CLI Tools: Complete command-line interface for file operations
Advanced Features (Phases 1-2)
- Runtime Schema Validation: Type-safe parsing with `zon.object()`, `zon.string()`, etc.
- Dictionary Compression: Automatic deduplication of repeated string values
- Delta Encoding: Sequential numeric columns compressed with delta notation
- Type Coercion: Intelligent handling of LLM-generated "stringified" values
- LLM-Aware Field Ordering: `encodeLLM` optimizes field order for task type
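Delta encoding in general stores the first value of a numeric column followed by the difference between each value and its predecessor. The sketch below shows only that arithmetic; the function names are hypothetical, and the textual delta notation ZON actually emits is not shown in these notes.

```typescript
// Generic delta encoding; not ZON's on-the-wire notation.
function deltaEncode(values: number[]): number[] {
  // First element is kept as-is; each later element becomes (value - previous value).
  return values.map((v, i) => (i === 0 ? v : v - values[i - 1]));
}

function deltaDecode(deltas: number[]): number[] {
  const out: number[] = [];
  let running = 0;
  for (const d of deltas) {
    running += d; // a running sum restores the original values
    out.push(running);
  }
  return out;
}
```

For a column such as `[100, 101, 103, 106]` the encoded form is `[100, 1, 2, 3]`, which tends to use fewer characters (and tokens) when values grow slowly.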
Documentation
- 5 New Comprehensive Guides: Streaming, Integrations, CLI, Schema Validation, Advanced Features
- Updated README: Professional documentation section with organized navigation
- Enhanced API Reference: Added streaming and integration APIs
- Updated Syntax Cheatsheet: Added dictionary and metadata syntax
Fixed
- Dictionary Round-trip Bug: Fixed regex pattern to support dotted column names (e.g., `recipient.city[3]`)
- Test Suite: All 175 tests passing (up from 121)
Performance
- Token Efficiency: 16-19% fewer tokens than JSON
- LLM Accuracy: 100% retrieval accuracy maintained
- Round-trip Integrity: 100% across all 18 comprehensive test datasets
v1.0.5
Full Changelog: v1.0.4...v1.0.5
[1.0.5] - 2025-11-30
Added
- Colon-less Syntax: Objects and arrays in nested positions now use `key{...}` and `key[...]` syntax, removing redundant colons.
- Smart Flattening: Top-level nested objects are automatically flattened to dot notation (e.g., `config.db{...}`).
- Control Character Escaping: All control characters (ASCII 0-31) are now properly escaped to prevent binary file creation.
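Control character escaping can be sketched as follows. The escape forms used here (`\n`, `\r`, `\t`, and JSON-style `\u00XX` for the rest) are an assumption; the notes do not say which escape sequences the encoder emits, and `escapeControlChars` is a hypothetical name.

```typescript
// Hypothetical sketch; the encoder's actual escape sequences may differ.
function escapeControlChars(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code === 0x0a) out += "\\n";
    else if (code === 0x0d) out += "\\r";
    else if (code === 0x09) out += "\\t";
    else if (code < 0x20) out += "\\u" + ("0000" + code.toString(16)).slice(-4); // other ASCII 0-31
    else out += text.charAt(i); // printable characters pass through unchanged
  }
  return out;
}
```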
Improved
- Token Efficiency: Achieved up to 23.8% reduction vs JSON (GPT-4o) thanks to syntax optimizations.
- Readability: Cleaner, block-like structure for nested data.
[1.0.5] - 2025-11-30
Added
- Algorithmic Benchmark Generation: Replaced LLM-based question generation with a deterministic algorithm for consistent, reproducible benchmarks.
- Expanded Dataset: Added "products" and "feed" data to the unified dataset to simulate real-world e-commerce scenarios.
- Tricky Questions: Introduced edge cases (non-existent fields, logic traps, case sensitivity) to stress-test LLM reasoning.
- Robust Benchmark Runner: Added exponential backoff and rate limiting to handle Azure OpenAI S0 tier constraints.
Changed
- Benchmark Formats: Refined tested formats to ZON, TOON, JSON, JSON (Minified), and CSV for focused analysis.
- Documentation: Updated README and API references with the latest benchmark results (GPT-5 Nano) and accurate token counts.
- Token Efficiency: Recalculated efficiency scores based on the expanded dataset, confirming ZON's leadership (1430.6 score).
Fixed
- Rate Limiting: Resolved 429 errors during benchmarking by implementing robust retry logic and concurrency control.
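Retry logic with exponential backoff typically doubles the wait after each failed attempt, up to a cap. The sketch below is a generic version of that pattern; `backoffDelayMs`, `withRetry`, and the default values are assumptions, not the benchmark runner's actual code.

```typescript
// Generic backoff sketch; names and defaults are illustrative only.
// Delay before retrying after the given 1-based attempt, doubling each time up to a cap.
function backoffDelayMs(attempt: number, baseDelayMs: number = 500, maxDelayMs: number = 8000): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt - 1));
}

// Runs the task, retrying on errors accepted by shouldRetry (for example an HTTP 429).
async function withRetry<T>(
  task: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean = () => true,
  maxAttempts: number = 5
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= maxAttempts || !shouldRetry(err)) throw err; // give up
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Concurrency control, which the fix also mentions, would sit alongside this, for example by limiting how many `withRetry` calls are in flight at once.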
Full Changelog: v1.0.4...v1.0.5

v1.0.4
[1.0.4] - 2025-11-29
Fixed
- Critical Data Integrity: Fixed roundtrip failures for strings containing newlines, empty strings, and escaped characters.
- Decoder Logic: Fixed `_splitByDelimiter` to correctly handle nested arrays and objects within table cells (e.g., `[10, 20]`).
- Encoder Logic: Added mandatory quoting for empty strings and strings with newlines to prevent data loss.
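Handling nested arrays and objects inside a table cell comes down to splitting only on delimiters at bracket depth zero. The sketch below shows that idea in isolation; `splitTopLevel` is a hypothetical name, it is not the real `_splitByDelimiter`, and it deliberately ignores quoted strings to stay short.

```typescript
// Hypothetical sketch of depth-aware splitting; quoted strings are not handled here.
function splitTopLevel(row: string, delimiter: string = ","): string[] {
  const cells: string[] = [];
  let current = "";
  let depth = 0; // how many [ or { are currently open

  for (let i = 0; i < row.length; i++) {
    const ch = row.charAt(i);
    if (ch === "[" || ch === "{") depth++;
    else if (ch === "]" || ch === "}") depth = Math.max(0, depth - 1);

    if (ch === delimiter && depth === 0) {
      cells.push(current); // delimiter outside any brackets ends the cell
      current = "";
    } else {
      current += ch; // delimiters inside brackets stay part of the cell
    }
  }
  cells.push(current);
  return cells;
}
```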
Documentation
- Updated `SPEC.md` and `syntax-cheatsheet.md` to explicitly require quoting for empty strings and escape sequences.
v1.0.3
[1.0.3] - 2025-11-28
🎯 100% LLM Retrieval Accuracy Achieved
Major Achievement: ZON now achieves 100% LLM retrieval accuracy while maintaining superior token efficiency over TOON!
Changed
- Explicit Sequential Columns: Disabled automatic sequential column omission (`[id]` notation)
  - All columns now explicitly listed in table headers for better LLM comprehension
  - Example: `users:@(5):active,id,lastLogin,name,role` (was `users:@(5)[id]:active,lastLogin,name,role`)
  - Trade-off: +1.7% token increase for 100% LLM accuracy
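For context, a column is "sequential" in the sense used here when its values are exactly 1, 2, 3, ..., N in row order, which is what made it omittable under the earlier `[id]` notation. A minimal check might look like this; `isSequentialFromOne` is a hypothetical name, not the encoder's function.

```typescript
// Hypothetical check: true only when the column is exactly 1, 2, ..., N (as numbers).
function isSequentialFromOne(values: unknown[]): boolean {
  return values.length > 0 && values.every((v, i) => v === i + 1);
}
```

With omission disabled, a column passing this check is still written out in full, trading a small token increase for an explicit value in every row.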
Performance
- LLM Accuracy: 100% (24/24 questions) vs TOON 100%, JSON 91.7%
- Token Efficiency: 19,995 tokens (5.0% fewer than TOON's 20,988)
- Overall Savings vs TOON: 4.6% (Claude) to 17.6% (GPT-4o)
Quality
- ✅ All unit tests pass (28/28)
- ✅ All roundtrip tests pass (27/27 datasets)
- ✅ No data loss or corruption
- ✅ Production ready
[1.0.3] - 2025-11-27
Achievement: 8/8 Perfect Sweep vs All Competitors!
Breaking Changes:
- Compact header syntax: `@count:` instead of `@data(count):`
- Sequential ID auto-omission: `[id]` notation for 1..N sequences
- Adaptive format selection based on data complexity
Added
- Sparse Table Encoding: Automatically detects semi-uniform data and uses `key:value` notation for optional fields
- Irregularity Score Calculation: Jaccard similarity-based scoring to choose optimal table format
- Sequential Column Detection: Identifies and omits columns with sequential values (1, 2, 3, ..., N)
- Smart Date Detection: ISO 8601 dates output unquoted for token efficiency
- Context-Aware String Quoting: Only quotes strings when necessary to preserve type semantics
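One plausible reading of a Jaccard-based irregularity score is sketched below: compare the key sets of every pair of rows and report 1 minus the average similarity. The exact formula the encoder uses is not stated in these notes, so `jaccard` and `irregularityScore` are illustrative assumptions only.

```typescript
// Jaccard similarity of two key sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1; // two empty sets count as identical
  let shared = 0;
  a.forEach((key) => {
    if (b.has(key)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Assumed definition: 1 minus the mean pairwise similarity of row key sets.
// 0 means every row has the same keys; 1 means no two rows share a key.
function irregularityScore(rows: Record<string, unknown>[]): number {
  if (rows.length < 2) return 0;
  const keySets = rows.map((row) => new Set<string>(Object.keys(row)));
  let total = 0;
  let pairs = 0;
  for (let i = 0; i < keySets.length; i++) {
    for (let j = i + 1; j < keySets.length; j++) {
      total += jaccard(keySets[i], keySets[j]);
      pairs++;
    }
  }
  return 1 - total / pairs;
}
```

Under this reading, a low score suggests a standard table, a moderate score suggests a sparse table with optional fields, and a high score suggests the rows are too irregular to tabulate.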
Performance
- Total Tokens: 1,945 (down from 2,081 in v1.0.2)
- 136 tokens saved (6.5% improvement)
- 8/8 wins vs CSV (previously 4/8 tied)
- 8/8 wins vs TOON (24.4% fewer tokens)
- 57.2% fewer tokens than formatted JSON
- 27.0% fewer tokens than compact JSON
Benchmark Results (8 datasets)
- Employees: 132 tokens (CSV: 138) - ZON WINS -4.3%
- Time-Series: 245 tokens (CSV: 247) - ZON WINS -0.8%
- GitHub Repos: 148 tokens (CSV: 164) - ZON WINS -9.8%
- Event Logs: 220 tokens (CSV: 231) - ZON WINS -4.8% ← Sparse tables!
- E-commerce: 193 tokens (CSV: 313) - ZON WINS -38.3%
- Hike Data: 62 tokens (CSV: 85) - ZON WINS -27.1%
- Deep Config: 111 tokens (CSV: 182) - ZON WINS -39.0%
- Heavily Nested: 764 tokens (CSV: 1,044) - ZON WINS -26.8%
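The percentages in the list above are consistent with a reduction measured relative to the CSV token count. A small sketch of that arithmetic, with `percentReduction` as a hypothetical helper name:

```typescript
// Percentage by which ZON's token count is below the baseline's.
function percentReduction(baselineTokens: number, zonTokens: number): number {
  return ((baselineTokens - zonTokens) / baselineTokens) * 100;
}

// Employees: (138 - 132) / 138
const employees = percentReduction(138, 132).toFixed(1); // "4.3"
// E-commerce: (313 - 193) / 313
const ecommerce = percentReduction(313, 193).toFixed(1); // "38.3"
```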
Competitive Analysis
- vs CSV: -20.1% tokens overall
- vs TOON: -24.4% tokens overall (beats on ALL datasets)
- vs JSON: -57.2% formatted, -27.0% compact
- Real Cost Savings: $4,890/month vs CSV at 1M API calls (GPT-4)
Fixed
- Improved irregular schema detection to enable sparse tables for Event Logs
- Enhanced sparse encoding threshold to support up to 5 optional columns
- Better handling of undefined/null values in standard tables
Documentation
- Added comprehensive competitive analysis vs TOON, CSV, JSON, YAML, XML
- Documented sparse table encoding mechanism
- Added real-world cost savings calculations
- Updated benchmarks with CSV comparison
v1.0.2
Fixed an issue with deeply nested JSON data.
v1.0.1
Have fun using ZON-format