@dpaluy dpaluy commented Jan 7, 2026

Summary

Implements T8 (#45): Legacy lambda wrapper in RedactionPipeline for backwards compatibility.

Changes

RedactionPipeline now supports two interface styles:

  1. New Pattern-based (from T3): call(text, audit:, field_path:) returning [text, audit]
  2. Legacy lambda: call(text) returning text

The pipeline auto-detects which interface to use:

  • Pattern objects detected by class check
  • Lambdas with audit: keyword param bypass wrapping
  • Legacy single-arg lambdas get wrapped with audit tracking
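
For reviewers, the detection order above could look roughly like the sketch below. It is an illustration, not the code in this PR: `redactor_style` and the `Pattern` struct are hypothetical stand-ins, and the only real API used is `Proc#parameters`.

```ruby
# Hypothetical stand-in for the project's Pattern class.
Pattern = Struct.new(:name, :regex, :replacement)

# Classify a redactor so the pipeline knows which signature to call.
def redactor_style(redactor)
  return :pattern if redactor.is_a?(Pattern)

  # Proc#parameters yields pairs such as [:req, :text] or [:keyreq, :audit].
  accepts_audit = redactor.parameters.any? do |type, name|
    %i[key keyreq].include?(type) && name == :audit
  end
  accepts_audit ? :audit_aware : :legacy
end

legacy      = ->(text) { text.gsub(/\d/, "#") }
audit_aware = ->(text, audit:, field_path:) { [text, audit] }

redactor_style(legacy)      # classified as :legacy, so it gets wrapped
redactor_style(audit_aware) # classified as :audit_aware, so it is called directly
```

Checking for the `audit:` keyword rather than raw arity keeps lambdas with optional keywords on the new interface.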

Also Includes

  • NormalizedInteraction extended with actor_type, actor_id, actor_gid, redaction_audit fields (T4 dependencies)
  • Patterns applied first, then custom_redactors
  • RedactionAudit populated on result.redaction_audit

Backwards Compatibility

Existing custom_redactors lambdas continue to work unchanged. The wrapper:

  • Detects arity and calls the appropriate signature
  • Tracks whether a legacy lambda modified the text
  • Records the redaction in the audit under a name auto-generated from the lambda's source location
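
A minimal sketch of such a wrapper, under stated assumptions: `wrap_legacy`, `SketchAudit`, and the `custom:<file>:<line>` name format are invented for illustration and may differ from the actual implementation. `Proc#source_location` is the real Ruby API and returns `nil` for procs not defined in Ruby source.

```ruby
# Hypothetical minimal audit; the real RedactionAudit tracks more metadata.
SketchAudit = Struct.new(:redactors) do
  def record_redaction(name)
    SketchAudit.new((redactors + [name]).uniq.sort)
  end
end

# Adapt a legacy call(text) lambda to call(text, audit:, field_path:).
def wrap_legacy(callable)
  file, line = callable.source_location
  # Assumed naming scheme; falls back when no source location is available.
  name = file ? "custom:#{File.basename(file)}:#{line}" : "custom:unknown"

  lambda do |text, audit:, field_path: nil|
    result = callable.call(text)
    # Record the redactor only when the text actually changed.
    audit = audit.record_redaction(name) if result != text
    [result, audit]
  end
end

mask_digits = ->(text) { text.gsub(/\d/, "#") }
wrapped = wrap_legacy(mask_digits)
text, audit = wrapped.call("pin 1234", audit: SketchAudit.new([]), field_path: "body")
```

Here `text` comes back with the digits masked and `audit.redactors` holds one auto-generated name; input the lambda leaves untouched returns the audit unchanged.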

Test Plan

  • 9 new RedactionPipeline tests for legacy wrapper
  • All 167 tests pass

dpaluy added 5 commits January 6, 2026 23:00
…ased refactor

Remove dual-layer PII redaction system (database rules + class-based redactors) to prepare for new unified pattern-based architecture.

Deleted:
- RedactionRule model and migration (database-backed rules)
- Base, Email, Phone, CardPAN redactor classes
- Related tests for legacy system

Modified:
- RedactionPipeline: removed apply_database_rules! method
- Config: default_redactors now returns [] with TODO comment
- Tests: updated to use custom_redactors instead of built-in ones

Breaking change: No redactors active until new Pattern system is implemented (T1-T3).
Tests pass: 123 runs, 368 assertions, 0 failures

Related: #38-#54 (PII Redaction Architecture Refactor)

Implements RedactionAudit, an immutable value object for tracking PII redaction operations,
supporting GDPR/CCPA compliance requirements (Spec Gap 9).

Key features:
- Immutable instances with builder pattern methods
- Tracks redaction metadata: timestamp, redactors applied, fields, counts
- LLM redaction status tracking (success, failed, skipped)
- Methods: record_redaction, record_llm_failure, record_llm_success, to_h
- Deduplicates and sorts redactor names
- Compact JSON serialization via to_h
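
To make the immutable builder style concrete, here is a hedged sketch. `AuditSketch` and its attribute and keyword names are guesses from the feature list above, not the real class; only the method names `record_redaction`, `record_llm_failure`, `record_llm_success`, and `to_h` come from this commit.

```ruby
require "time"

# Hypothetical sketch of an immutable audit value object.
class AuditSketch
  attr_reader :redacted_at, :redactors, :fields, :count, :llm_status

  def initialize(redacted_at: Time.now.utc, redactors: [], fields: [], count: 0, llm_status: nil)
    @redacted_at = redacted_at
    @redactors = redactors.uniq.sort.freeze # deduplicated and sorted
    @fields = fields.uniq.sort.freeze
    @count = count
    @llm_status = llm_status
    freeze
  end

  # Builder-style methods return a new instance instead of mutating self.
  def record_redaction(redactor:, field:, matches: 1)
    with(redactors: redactors + [redactor.to_s],
         fields: fields + [field.to_s],
         count: count + matches)
  end

  def record_llm_success
    with(llm_status: :success)
  end

  def record_llm_failure
    with(llm_status: :failed)
  end

  # Compact hash: nil values are dropped.
  def to_h
    attributes.merge(redacted_at: redacted_at.iso8601).compact
  end

  private

  def attributes
    { redacted_at: redacted_at, redactors: redactors, fields: fields,
      count: count, llm_status: llm_status }
  end

  def with(**changes)
    self.class.new(**attributes.merge(changes))
  end
end

audit = AuditSketch.new
  .record_redaction(redactor: :phone, field: "body")
  .record_redaction(redactor: :email, field: "body")
```

Each call in the chain returns a fresh frozen instance, so an audit handed to another component cannot be changed behind its back.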

Comprehensive test suite (22 tests):
- Default initialization and timestamp handling
- Redaction recording and deduplication
- Immutability verification
- LLM status transitions
- Hash serialization with nil filtering
- Complex nested scenarios and chaining

Next: Integrate with RedactionPipeline and NormalizedInteraction

Add public method to convert ActiveRecord actors to job-safe serialized
format for background job enqueueing. Supports GlobalID extraction with
fallback to type/id tuple for objects without GlobalID support.
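
The GlobalID-with-fallback idea can be sketched as follows. `serialize_actor` and the exact hash shape are assumptions for illustration; `to_global_id` is the method that `GlobalID::Identification` adds to ActiveRecord models, and the key names mirror the `actor_type`, `actor_id`, and `actor_gid` fields mentioned in this PR.

```ruby
# Hypothetical helper: turn an actor into plain data that is safe to enqueue.
def serialize_actor(actor)
  return nil if actor.nil?

  if actor.respond_to?(:to_global_id)
    # Preferred: a GlobalID string the job can resolve later.
    { actor_gid: actor.to_global_id.to_s }
  else
    # Fallback for objects without GlobalID support.
    { actor_type: actor.class.name, actor_id: actor.id }
  end
end
```

Passing strings and integers instead of live records keeps the job payload serializable regardless of the queue backend.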

Closes #42

Add pattern-based redaction DSL to Config class:
- config.redact :email, :phone - enables individual patterns
- config.redact_group :api_keys - enables pattern groups
- config.redact_pattern(/regex/, "[REPLACEMENT]") - custom patterns
- config.active_patterns - returns all enabled Pattern objects

Also includes:
- T2: Validators module with Luhn and SSN range validation
- T3: PATTERNS hash with 16 built-in patterns (email, phone, credit_card,
  ssn, openai_key, anthropic_key, aws_key, stripe_key, github_token,
  github_pat, bearer_token, basic_auth, private_key, ipv4, ipv6, jwt)
- PATTERN_GROUPS for convenient batch enabling (pii, financial, api_keys,
  auth, network, crypto)
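
For context on the Luhn validator in T2, this is a standalone sketch of the standard checksum, useful for rejecting digit runs that only look like card numbers. `luhn_valid?` is a hypothetical name and the minimum-length guard is an assumption; the Validators module may expose a different interface.

```ruby
# Luhn checksum: from the right, double every second digit, subtract 9 from
# any doubled value above 9, and require the total to be divisible by 10.
def luhn_valid?(number)
  digits = number.to_s.scan(/\d/).map(&:to_i)
  return false if digits.length < 2 # assumed guard against trivial input

  sum = digits.reverse.each_with_index.sum do |digit, index|
    next digit if index.even?

    doubled = digit * 2
    doubled > 9 ? doubled - 9 : doubled
  end
  (sum % 10).zero?
end

luhn_valid?("4111 1111 1111 1111") # => true
luhn_valid?("4111 1111 1111 1112") # => false
```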

Invalid pattern names raise ConfigurationError at config time for
early validation.
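
A reduced sketch of how the DSL and its early validation might fit together. Everything here is a simplified stand-in: two patterns and one group instead of the full built-in set, deliberately loose regexes, and a `Pattern` struct in place of the real class. Only the method and constant names come from this commit.

```ruby
class ConfigurationError < StandardError; end

# Hypothetical stand-in for the project's Pattern class.
Pattern = Struct.new(:name, :regex, :replacement)

# Simplified regexes for illustration only.
PATTERNS = {
  email: Pattern.new(:email, /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/, "[EMAIL]"),
  ipv4:  Pattern.new(:ipv4, /\b(?:\d{1,3}\.){3}\d{1,3}\b/, "[IPV4]")
}.freeze
PATTERN_GROUPS = { network: %i[ipv4] }.freeze

class Config
  def initialize
    @enabled = []
    @custom = []
  end

  # Enable built-in patterns by name; unknown names fail at config time.
  def redact(*names)
    names.each do |name|
      raise ConfigurationError, "unknown pattern: #{name.inspect}" unless PATTERNS.key?(name)

      @enabled << name unless @enabled.include?(name)
    end
  end

  def redact_group(group)
    members = PATTERN_GROUPS.fetch(group) do
      raise ConfigurationError, "unknown pattern group: #{group.inspect}"
    end
    redact(*members)
  end

  def redact_pattern(regex, replacement)
    @custom << Pattern.new(:custom, regex, replacement)
  end

  def active_patterns
    @enabled.map { |name| PATTERNS.fetch(name) } + @custom
  end
end

config = Config.new
config.redact :email
config.redact_group :network
config.redact_pattern(/secret-\w+/, "[REDACTED]")

# Apply every active pattern in order.
redacted = config.active_patterns.reduce("mail a@b.co from 10.0.0.1") do |text, pattern|
  text.gsub(pattern.regex, pattern.replacement)
end
```

Raising from `redact` and `redact_group` means a typo in an initializer surfaces at boot rather than as silently unredacted data later.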

Update RedactionPipeline to support both interface styles:
- New Pattern-based: call(text, audit:, field_path:) returns [text, audit]
- Legacy lambda: call(text) returns text

The pipeline auto-detects which interface to use:
- Pattern objects are detected by class check
- Lambdas with `audit:` keyword param bypass wrapping
- Legacy single-arg lambdas get wrapped with audit tracking

Also includes:
- NormalizedInteraction extended with actor_type, actor_id, actor_gid,
  redaction_audit fields (T4 dependencies)
- Patterns applied first, then custom_redactors
- RedactionAudit populated on result.redaction_audit

Backwards compatible with existing custom_redactors configuration.

@dpaluy dpaluy added the enhancement label Jan 7, 2026