⚡ Lightweight orchestration toolkit to generate, validate, repair, and enforce structured output from large language models (LLMs). The project provides a provider-agnostic adapter interface, validators (JSON/Pydantic), prompt template management with versioning, caching, dataset collection, and an enforcement engine that retries and repairs LLM output until it conforms to a schema.
This repository contains:
- Adapter abstractions for OpenAI, Anthropic, and Google Gemini.
- Validation and repair utilities for JSON and Pydantic schemas.
- An `EnforcementEngine` that generates, validates, repairs, and retries.
- Prompt template system with versioning and YAML persistence.
- LRU caching to reduce redundant API calls and costs.
- Dataset collection for training and fine-tuning.
- Examples and comprehensive test suite.
Key features:

- Provider-agnostic adapters: OpenAI, Anthropic (Claude), Google Gemini
- Multiple validators: JSON Schema, Pydantic models
- Automatic repair: Schema-based heuristics fix common formatting issues (see the conceptual sketch after this list)
- Retry loop: Progressive feedback to the model for iterative repair
- Rate limiting: Built-in token bucket algorithm prevents API rate limit violations
- Dataset collection: Capture and export training data (JSONL, JSON, CSV)
- Template system: Type-safe variable substitution with validation
- Version control: Semantic versioning (1.0.0, 2.0.0, etc.)
- YAML persistence: Save/load templates from files
- Template registry: Centralized management of all templates
- Template manager: One-line API for template + enforcement
- LRU cache: In-memory caching with TTL support
- Cost reduction: Avoid redundant API calls for identical requests
- Cache integration: Seamless integration with enforcement engine
- Statistics tracking: Monitor cache hits, misses, and hit rates
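The repair and retry bullets above can be made concrete with a small sketch. This is illustrative pseudologic, not Parsec's implementation; `naive_repair`, `enforce_sketch`, and the `generate`/`validate` callables are all assumptions:

```python
import json

def naive_repair(raw: str) -> str:
    """Illustrative heuristics: strip a markdown fence and trailing
    commas, the kinds of formatting issues schema-based repair targets."""
    text = raw.strip()
    if text.startswith("```"):
        # Keep only the {...} payload inside a ```json ... ``` wrapper.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end != -1:
            text = text[start:end + 1]
    return text.replace(",}", "}").replace(",]", "]")

async def enforce_sketch(generate, validate, prompt, schema, max_retries=3):
    """Conceptual retry loop: generate, validate, repair, and feed the
    error back so each attempt gets progressive feedback."""
    feedback = ""
    for _ in range(max_retries + 1):
        raw = await generate(prompt + feedback)
        try:
            data = json.loads(naive_repair(raw))
            validate(data, schema)  # assumed to raise on schema violations
            return data
        except Exception as err:
            feedback = f"\n\nYour previous output was invalid ({err}). Return only valid JSON."
    raise ValueError("no schema-valid output after retries")
```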
Install from PyPI:

```bash
pip install parsec-llm
```

Or for development:

```bash
git clone https://github.com/olliekm/parsec.git
cd parsec
pip install -e ".[dev]"
```

A minimal end-to-end example:

```python
from parsec.models.adapters import OpenAIAdapter
from parsec.validators import JSONValidator
from parsec.enforcement import EnforcementEngine
# Set up components
adapter = OpenAIAdapter(api_key="your-api-key", model="gpt-4o-mini")
validator = JSONValidator()
engine = EnforcementEngine(adapter, validator, max_retries=3)
# Define your schema
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
# Enforce structured output
result = await engine.enforce(
    "Extract: John Doe is 30 years old",
    schema
)
print(result.data)  # {"name": "John Doe", "age": 30}
print(result.success)  # True
print(result.retry_count)  # 0
```
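When all retries fail, `result.success` is False. A minimal guard, using only the attributes shown above (any richer error fields would be an assumption):

```python
if not result.success:
    # Retries were exhausted without schema-valid output;
    # don't trust result.data blindly in this case.
    raise RuntimeError(f"enforcement failed after {result.retry_count} retries")
```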
Caching avoids paying twice for identical requests:

```python
from parsec.cache import InMemoryCache

# Add cache to reduce redundant API calls
cache = InMemoryCache(max_size=100, default_ttl=3600)
engine = EnforcementEngine(adapter, validator, cache=cache)
# First call hits API
result1 = await engine.enforce(prompt, schema)
# Second identical call returns cached result (no API call!)
result2 = await engine.enforce(prompt, schema)
# Check cache performance
stats = cache.get_stats()
print(stats)  # {'hits': 1, 'misses': 1, 'hit_rate': '50.00%'}
```
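Cache hits require a deterministic key for "identical requests". A hypothetical sketch of deriving one from the prompt and schema together (not Parsec's actual scheme):

```python
import hashlib
import json

def cache_key(prompt: str, schema: dict) -> str:
    # Serialize the schema with sorted keys so logically identical
    # requests hash to the same key regardless of dict ordering.
    payload = prompt + "\x00" + json.dumps(schema, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```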
Templates are reusable, versioned prompt definitions:

```python
from parsec.prompts import PromptTemplate, TemplateRegistry, TemplateManager

# Create a reusable template
template = PromptTemplate(
    name="extract_person",
    template="Extract person info from: {text}\n\nReturn as JSON.",
    variables={"text": str},
    required=["text"]
)
# Register with version
registry = TemplateRegistry()
registry.register(template, "1.0.0")
# Use with enforcement
manager = TemplateManager(registry, engine)
result = await manager.enforce_with_template(
    template_name="extract_person",
    variables={"text": "John Doe, age 30"},
    schema=schema
)
# Save templates to file
registry.save_to_disk("templates.yaml")
# Load templates later
registry.load_from_disk("templates.yaml")
```

For stricter typing, validate against a Pydantic model instead of a raw JSON schema:

```python
from pydantic import BaseModel
from parsec.validators import PydanticValidator
class Person(BaseModel):
    name: str
    age: int
    email: str

validator = PydanticValidator()
engine = EnforcementEngine(adapter, validator)
result = await engine.enforce(
    "Extract: John Doe, 30 years old, john@example.com",
    Person
)
print(result.data)  # {"name": "John Doe", "age": 30, "email": "john@example.com"}
```

Rate limiting keeps requests inside provider quotas:

```python
from parsec.resilience import RateLimiter, PerProviderRateLimiter, PROVIDER_LIMITS
# Basic rate limiting - prevent exceeding API limits
rate_limiter = RateLimiter(
    requests_per_minute=60,
    tokens_per_minute=90_000
)
engine = EnforcementEngine(adapter, validator, rate_limiter=rate_limiter)
# Make requests - automatically throttled to stay within limits
result = await engine.enforce(prompt, schema)
# Per-provider rate limiting with predefined limits
rate_limiter = PerProviderRateLimiter()
# Configure OpenAI with tier 1 limits (60 req/min, 90K tokens/min)
openai_config = PROVIDER_LIMITS['openai']['tier_1']
rate_limiter.set_provider_limits(
    'openai',
    requests_per_minute=openai_config.requests_per_minute,
    tokens_per_minute=openai_config.tokens_per_minute
)
# Each provider respects its own limits independently
openai_engine = EnforcementEngine(openai_adapter, validator, rate_limiter=rate_limiter)
anthropic_engine = EnforcementEngine(anthropic_adapter, validator, rate_limiter=rate_limiter)
# Get statistics
stats = rate_limiter.get_stats()
print(stats)  # {'openai': {'total_requests': 5, 'total_tokens': 2500, ...}}
```
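The token bucket behind the rate limiter is a standard algorithm: tokens refill at a fixed rate up to a burst capacity, and a request proceeds only if it can pay its token cost. A self-contained illustration of the idea (not Parsec's internal code):

```python
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity            # maximum burst size
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# 60 requests/minute is one token per second with a burst of 60.
bucket = TokenBucket(capacity=60, refill_per_sec=1.0)
```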
Requirements: Python 3.9+

- Install dependencies:

  ```bash
  pip install -e ".[dev]"
  ```

- Run tests:

  ```bash
  poetry run pytest -q
  ```

- Run the OpenAI example (requires `OPENAI_API_KEY`):

  ```bash
  export OPENAI_API_KEY="sk-..."
  export OPENAI_MODEL="gpt-4o-mini"  # optional
  poetry run python examples/run_with_openai.py
  ```

The example demonstrates using OpenAIAdapter, JSONValidator, and EnforcementEngine to extract structured data with a JSON schema.
Project layout:

- `src/parsec/core/` — Core abstractions and schemas
- `src/parsec/models/` — LLM provider adapters (OpenAI, Anthropic, Gemini)
- `src/parsec/validators/` — Validator implementations (JSON, Pydantic)
- `src/parsec/enforcement/` — Enforcement and orchestration engine
- `src/parsec/prompts/` — Prompt template system with versioning
- `src/parsec/cache/` — Caching implementations (InMemoryCache)
- `src/parsec/resilience/` — Rate limiting, circuit breakers, retries, failover
- `src/parsec/training/` — Dataset collection for fine-tuning
- `src/parsec/utils/` — Utility functions (partial JSON parsing)
- `examples/` — Working examples with real API calls
- `tests/` — Comprehensive test suite with pytest
Check out the `examples/` directory for complete working examples:

- `basic_usage.py` — Simple extraction with JSON schema
- `prompt_template_example.py` — Template system with versioning
- `prompt_persistence_example.py` — Save/load templates from YAML
- `template_manager_example.py` — TemplateManager integration
- `template_manager_live_example.py` — Live demo with real API calls
- `rate_limiting_demo.py` — Rate limiting with token buckets
- `streaming_example.py` — Streaming support (experimental)
Run any example:

```bash
python3 examples/rate_limiting_demo.py
```

Run the test suite with:

```bash
poetry run pytest -q
```

Collect and export training data for fine-tuning:

```python
from parsec.training import DatasetCollector
collector = DatasetCollector(
    output_path="./training_data",
    format="jsonl",  # or "json", "csv"
    auto_flush=True
)
engine = EnforcementEngine(adapter, validator, collector=collector)
# Data is automatically collected during enforcement
result = await engine.enforce(prompt, schema)
# Export collected data
collector.flush()  # Writes to disk
```
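The README doesn't show what a collected record looks like. As a purely hypothetical illustration (every field name below is an assumption, not the actual DatasetCollector schema), a JSONL row might pair the prompt with its validated output:

```python
# Hypothetical record shape, for orientation only.
record = {
    "prompt": "Extract: John Doe is 30 years old",
    "output": {"name": "John Doe", "age": 30},
    "success": True,
    "retry_count": 0,
}
```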
Template versioning lets a prompt evolve while older versions stay pinned:

```python
# v1.0.0 - Initial template
template_v1 = PromptTemplate(
    name="extract_person",
    template="Extract: {text}",
    variables={"text": str},
    required=["text"]
)
registry.register(template_v1, "1.0.0")
# v2.0.0 - Improved with validation rules
template_v2 = PromptTemplate(
    name="extract_person",
    template="Extract: {text}\n\nValidation: {rules}",
    variables={"text": str, "rules": str},
    required=["text"],
    defaults={"rules": "Strict validation"}
)
registry.register(template_v2, "2.0.0")
# Use specific version
result = await manager.enforce_with_template(
    template_name="extract_person",
    version="2.0.0",  # Explicit version
    variables={"text": "John Doe, 30"}
)
# Or use latest automatically
result = await manager.enforce_with_template(
    template_name="extract_person",  # Gets v2.0.0
    variables={"text": "John Doe, 30"}
)
```

Parsec is provider-agnostic, so switching models is just a different adapter:

```python
from parsec.models.adapters import OpenAIAdapter, AnthropicAdapter
# Switch between providers easily
openai_adapter = OpenAIAdapter(api_key=openai_key, model="gpt-4o-mini")
anthropic_adapter = AnthropicAdapter(api_key=anthropic_key, model="claude-3-5-sonnet-20241022")
# Same enforcement code works with any adapter
engine = EnforcementEngine(anthropic_adapter, validator)
result = await engine.enforce(prompt, schema)
```

Current and planned capabilities:

- Core enforcement engine with retry logic
- Multiple LLM providers (OpenAI, Anthropic, Gemini)
- JSON and Pydantic validation
- LRU caching with TTL
- Prompt template system with versioning
- Dataset collection for training
- Rate limiting with token bucket algorithm
- Streaming support for real-time output
- Cost tracking and analytics
- A/B testing for prompt variants
- Output post-processing pipeline
Contributions are welcome! Please feel free to submit a Pull Request.
A few notes:

- Examples with real API calls will incur costs — use test/development API keys
- The framework is intentionally modular — extend adapters and validators as needed (a hypothetical adapter sketch follows this list)
- Template system supports version control via YAML files for team collaboration
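What a custom adapter could look like is not specified in this README; the following is a purely hypothetical sketch, and every class and method name in it is an assumption rather than Parsec's actual API:

```python
# Hypothetical sketch only: parsec's real adapter base class and its
# method names are not documented here, so nothing below is its API.
class MyLocalModelAdapter:
    """Adapter sketch for a self-hosted model behind an HTTP endpoint."""

    def __init__(self, endpoint: str, model: str):
        self.endpoint = endpoint
        self.model = model

    async def generate(self, prompt: str) -> str:
        # A real adapter would call the provider here and return the raw
        # text for the EnforcementEngine to validate and repair.
        raise NotImplementedError("wire this to your model server")
```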
This project is licensed under the MIT License - see the LICENSE file for details.
Copyright (c) 2025 Oliver Kwun-Morfitt