
feat: Per-message sampling with adaptive strategies and profile versioning #61

Merged
willibrandon merged 16 commits into main from feature/per-message-sampling
Aug 23, 2025

Conversation

@willibrandon
Owner

Description

Implements a per-message sampling system for mtlog that gives fine-grained control over log event sampling through multiple strategies and adaptive rate adjustment. It lets production systems limit log volume while preserving important events.

Key additions:

  • Multiple sampling strategies: Counter-based, rate-based, time-based, first-N, group-based, conditional, and exponential backoff
  • Adaptive sampling: Automatically adjusts sampling rate to maintain target events/second
  • Oscillation prevention: Hysteresis threshold and dampening factor prevent thrashing under variable load
  • Profile versioning: Semantic versioning with migration policies for backward compatibility
  • Production safety: Thread-safe, immutable profile registry with freeze capability
  • Performance optimized: LRU cache with atomic operations, achieving 17ns for simple checks
  • Developer friendly: Fluent configuration API and predefined dampening presets
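
To illustrate the oscillation-prevention bullet, here is a minimal sketch of how a hysteresis band and a dampening factor can interact in an adaptive rate controller. It is not mtlog's implementation: the `adaptiveRate` type, its fields, and the `adjust` method are hypothetical names chosen for this example, and the update rule (move a fraction of the way toward the ideal rate) is one common choice among several.

```go
package main

import (
	"fmt"
	"math"
)

// adaptiveRate is a hypothetical controller, not an mtlog type.
type adaptiveRate struct {
	target     float64 // desired emitted events per second (must be > 0)
	rate       float64 // current sampling probability in [minRate, 1]
	hysteresis float64 // relative error tolerated before adjusting, e.g. 0.1
	dampening  float64 // fraction of the correction applied per step, in (0, 1]
	minRate    float64 // floor so sampling never stops entirely
}

// adjust takes the observed pre-sampling event rate (events/second)
// and returns the possibly updated sampling probability.
func (a *adaptiveRate) adjust(observed float64) float64 {
	if observed <= 0 {
		return a.rate
	}
	// Estimated throughput after sampling at the current rate.
	emitted := observed * a.rate
	relErr := math.Abs(emitted-a.target) / a.target
	// Hysteresis: ignore small deviations so the rate does not thrash.
	if relErr <= a.hysteresis {
		return a.rate
	}
	// Dampening: move only part of the way toward the ideal rate.
	ideal := a.target / observed
	next := a.rate + a.dampening*(ideal-a.rate)
	a.rate = math.Min(1, math.Max(a.minRate, next))
	return a.rate
}

func main() {
	a := &adaptiveRate{target: 100, rate: 1, hysteresis: 0.1, dampening: 0.5, minRate: 0.001}
	for i := 0; i < 8; i++ {
		fmt.Printf("step %d: rate=%.4f\n", i, a.adjust(1000))
	}
}
```

With a constant load the rate approaches the ideal value in shrinking steps and stops changing once the emitted throughput falls inside the hysteresis band. The sketch is not safe for concurrent use; a real filter would need atomics or a lock around the state.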

Type of change

  • Bug fix
  • New feature
  • Performance improvement
  • Documentation update

Checklist

  • Tests pass (go test ./...)
  • Linter passes (golangci-lint run)
  • Benchmarks checked (if performance-related)
  • Documentation updated (if needed)
  • Zero-allocation promise maintained (if applicable)

Additional notes

  • Adds test coverage including concurrent safety verification
  • Benchmarks show efficient performance: 17ns for simple sampling decisions, 209ns with properties
  • Documentation includes examples and quick reference guide

Fixes #49

…ersioning

Implement a sampling system with counter, rate, time-based, first-N, group, conditional, and exponential backoff strategies. Add adaptive sampling with hysteresis and dampening to prevent oscillation. Introduce profile versioning with migration policies and a thread-safe registry. Include an LRU cache, fluent configuration, and predefined dampening presets.
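
As an illustration of one strategy from the list above, the following sketch shows exponential backoff sampling: an event is emitted on its 1st, 2nd, 4th, 8th, ... occurrence. The `backoffSampler` type and its methods are hypothetical names for this example and do not reflect mtlog's internals; falling back to a factor of 2 for invalid input is likewise an assumption.

```go
package main

import "fmt"

// backoffSampler is a hypothetical, single-goroutine sketch.
type backoffSampler struct {
	factor uint64 // multiplier applied to the emit threshold
	count  uint64 // occurrences seen so far
	next   uint64 // occurrence number at which to emit next
}

// newBackoffSampler falls back to a factor of 2 when given a
// factor below 2, since a smaller factor would never back off.
func newBackoffSampler(factor uint64) *backoffSampler {
	if factor < 2 {
		factor = 2
	}
	return &backoffSampler{factor: factor, next: 1}
}

// shouldSample reports whether the current occurrence is emitted.
func (b *backoffSampler) shouldSample() bool {
	b.count++
	if b.count < b.next {
		return false
	}
	b.next *= b.factor
	return true
}

func main() {
	s := newBackoffSampler(2)
	for i := 1; i <= 16; i++ {
		if s.shouldSample() {
			fmt.Print(i, " ")
		}
	}
	fmt.Println()
}
```

With a factor of 2, the first 16 occurrences produce emissions at occurrences 1, 2, 4, 8, and 16. A concurrent version would need to guard `count` and `next` with a lock or atomics.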

Resolves #49
- Remove unused mutex fields from SamplingGroupManager and BackoffState
- Remove unused globalSamplingMemoryLimit variable and deprecated function
- Simplify loop with append variadic syntax
- Remove unused sync import from per_message_sampling.go

@willibrandon self-assigned this on Aug 23, 2025
@willibrandon added the documentation, enhancement, sampling, and performance labels on Aug 23, 2025

- Add all sampling-related methods to mockLogger and mockFuzzLogger
- Include Sample, SampleDuration, SampleRate, SampleFirst, SampleGroup, etc.
- Add ResetSampling, ResetSamplingGroup, EnableSamplingSummary, GetSamplingStats
- Add SampleProfile, SampleAdaptive, and SampleAdaptiveWithOptions methods
- Fix TestSamplingConfigComplex flakiness by increasing sample rate and count

Root cause: On Windows, lower timer resolution can give rapid successive events identical or nearly identical timestamps. The hash function then produced biased values that did not pass the threshold check.

Solution:
- Add event counter to hash calculation for better distribution
- Handle zero timestamps by using current time as fallback
- XOR the counter into the hash input so rapid events with equal timestamps hash differently
- Together these changes restore the expected sampling distribution on all platforms
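
The fix described above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `shouldSample`, `mix64`, and `eventCounter` are hypothetical names, and the mixer used here is the SplitMix64 finalizer, which may differ from the hash mtlog uses.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// eventCounter gives every event a distinct number (requires Go 1.19+).
var eventCounter atomic.Uint64

// mix64 is the SplitMix64 finalizer, used here as a stand-in mixer.
func mix64(x uint64) uint64 {
	x ^= x >> 30
	x *= 0xbf58476d1ce4e5b9
	x ^= x >> 27
	x *= 0x94d049bb133111eb
	x ^= x >> 31
	return x
}

// shouldSample decides whether to keep an event with the given
// timestamp (nanoseconds since the Unix epoch) at the given rate.
func shouldSample(tsNanos int64, rate float64) bool {
	if rate >= 1 {
		return true
	}
	if rate <= 0 {
		return false
	}
	// Zero timestamp: fall back to the current time.
	if tsNanos == 0 {
		tsNanos = time.Now().UnixNano()
	}
	// XOR a mixed counter into the timestamp so events that share
	// a timestamp still produce different hash inputs.
	n := eventCounter.Add(1)
	h := mix64(uint64(tsNanos) ^ mix64(n))
	// Map the top 53 bits to a float in [0, 1).
	u := float64(h>>11) / (1 << 53)
	return u < rate
}

func main() {
	const ts = int64(1700000000000000000) // identical timestamp for every event
	kept := 0
	for i := 0; i < 10000; i++ {
		if shouldSample(ts, 0.5) {
			kept++
		}
	}
	fmt.Printf("kept %d of 10000 events at rate 0.5\n", kept)
}
```

Because the counter differs for every call, the kept fraction stays close to the requested rate even when all events carry the same timestamp.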

Also improved test reliability:
- Use RateSamplingFilter instead of AdaptiveSamplingFilter in TestProfileFreezing
- Increase sample sizes and widen acceptable ranges in TestSampleRate
- Add micro-delays between events in TestLoggerPresetMethods for timestamp variety
- Add all sampling-related methods to noOpLogger in middleware adapter
- Include Sample, SampleDuration, SampleRate, SampleFirst, SampleGroup, etc.
- Add ResetSampling, ResetSamplingGroup, EnableSamplingSummary, GetSamplingStats
- Add SampleProfile, SampleAdaptive, and SampleAdaptiveWithOptions methods

@willibrandon requested a review from Copilot on August 23, 2025 05:12

This comment was marked as outdated.

- Extract validateBackoffFactor function to reduce code duplication
- Remove thread-unsafe lazy initialization of randSource
- Optimize LRU cache Get method to use single lock instead of double-locking
- Add clarifying comment about selflog.Printf usage
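
For context on the single-lock `Get` change, here is a minimal LRU cache sketch in which a lookup and its recency update happen under one mutex acquisition. The `lruCache` type and its methods are hypothetical and much simpler than the cache in internal/cache/lru_cache.go (no TTL, no statistics); it assumes a capacity of at least 1.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type entry struct {
	key   string
	value int
}

// lruCache is a hypothetical sketch, not mtlog's cache.
type lruCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRU(capacity int) *lruCache {
	return &lruCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get looks up the key and marks it most recently used while
// holding the lock once, instead of locking for the read and
// then again for the reorder.
func (c *lruCache) Get(key string) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates a key and evicts the least recently
// used entry when the cache exceeds its capacity.
func (c *lruCache) Put(key string, value int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Put("a", 1)
	c.Put("b", 2)
	c.Get("a")    // "a" becomes most recently used
	c.Put("c", 3) // evicts "b"
	_, ok := c.Get("b")
	fmt.Println("b present:", ok)
}
```

Taking the lock once avoids the window between two separate acquisitions in which another goroutine could evict the entry that was just found.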
- Add SamplingMetrics struct for cache performance monitoring
- Implement GetAvailableProfileDescriptions() for runtime profile discovery
- Add EnableSamplingDebug() to log sampling decisions via selflog
- Include comprehensive tests for new monitoring features
- Fix flaky TestMetricsMiddleware/timing_accuracy test on macOS CI by increasing tolerance from 10ms to 20ms
- Implement String() for human-readable metric summaries with hit rates
- Add Format() to support %s, %v, %+v (verbose), and %#v formatting
- Add PrometheusMetrics() to export metrics as map for monitoring systems
- Include tests for all formatting options and edge cases
- Handle zero values gracefully to prevent divide-by-zero errors
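
The zero-value handling can be illustrated with a small sketch. The `cacheMetrics` type, its fields, and its methods are hypothetical stand-ins; the actual `SamplingMetrics` fields and output format are not shown in this PR description.

```go
package main

import "fmt"

// cacheMetrics is a hypothetical stand-in for a metrics struct.
type cacheMetrics struct {
	Hits   uint64
	Misses uint64
}

// HitRate returns hits / (hits + misses), or 0 when nothing has
// been recorded. Without the guard, the float division 0/0 would
// yield NaN, and an integer division would panic.
func (m cacheMetrics) HitRate() float64 {
	total := m.Hits + m.Misses
	if total == 0 {
		return 0
	}
	return float64(m.Hits) / float64(total)
}

// String renders a human-readable summary including the hit rate.
func (m cacheMetrics) String() string {
	return fmt.Sprintf("hits=%d misses=%d hit_rate=%.1f%%", m.Hits, m.Misses, m.HitRate()*100)
}

// Export returns the metrics as a flat map, a shape that is easy
// to hand to a monitoring system. The key names are assumptions.
func (m cacheMetrics) Export() map[string]float64 {
	return map[string]float64{
		"cache_hits":     float64(m.Hits),
		"cache_misses":   float64(m.Misses),
		"cache_hit_rate": m.HitRate(),
	}
}

func main() {
	fmt.Println(cacheMetrics{Hits: 3, Misses: 1}) // hits=3 misses=1 hit_rate=75.0%
	fmt.Println(cacheMetrics{})                   // hits=0 misses=0 hit_rate=0.0%
}
```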
- Replace plain bool with atomic.Bool for thread-safe access
- Add SetSamplingDebugEnabled/IsSamplingDebugEnabled helper functions
- Fix concurrent access issue detected by race detector in tests
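
A plain `bool` read and written from several goroutines is a data race; `atomic.Bool` (Go 1.19+) makes the flag safe without a mutex. The sketch below uses hypothetical helper names (`setDebug`, `isDebug`) rather than the PR's `SetSamplingDebugEnabled`/`IsSamplingDebugEnabled`, whose signatures are not shown here.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// debugEnabled replaces a plain bool so concurrent reads and
// writes are race-free.
var debugEnabled atomic.Bool

func setDebug(on bool) { debugEnabled.Store(on) }
func isDebug() bool    { return debugEnabled.Load() }

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(on bool) {
			defer wg.Done()
			setDebug(on) // concurrent writers
			_ = isDebug() // concurrent readers
		}(i%2 == 0)
	}
	wg.Wait()
	setDebug(true)
	fmt.Println("debug enabled:", isDebug())
}
```

Running this with `go run -race` should report no data race, whereas the same program with a plain `bool` would.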
- Document SamplingMetrics String() and PrometheusMetrics() usage
- Add examples for debugging sampling decisions with EnableSamplingDebug()
- Include profile discovery with GetAvailableProfileDescriptions()
- Provide monitoring best practices and alerting examples
- Add sampling-monitoring example with metrics and Prometheus export
- Add sampling-debug example for troubleshooting sampling decisions
- Add sampling-profiles example for profile discovery and comparison
- Fix GetSamplingStats() to work with SampleProfile methods
- Update README with links to all sampling examples

@willibrandon requested a review from Copilot on August 23, 2025 07:37

Copilot AI left a comment

Pull Request Overview

This PR implements a comprehensive per-message sampling system for mtlog, enabling fine-grained control over log event sampling with multiple strategies and adaptive capabilities. The feature allows production systems to intelligently manage log volume while preserving important events through configurable sampling rates, adaptive algorithms, and profile-based configurations.

Key changes include:

  • Multiple sampling strategies: Counter-based, rate-based, time-based, first-N, group-based, conditional, and exponential backoff sampling
  • Adaptive sampling: Automatically adjusts sampling rates based on target events/second with hysteresis and dampening controls
  • Profile versioning: Semantic versioning system with migration policies for backward compatibility and production safety
  • Performance optimization: Thread-safe LRU caches with atomic operations achieving 17ns for simple sampling decisions

Reviewed Changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| sampling_test.go | Comprehensive test coverage for all sampling strategies and concurrent safety verification |
| sampling_profiles.go | Profile registry with versioning, migration policies, and thread-safe management |
| sampling_monitoring_test.go | Tests for metrics, debug logging, and monitoring capabilities |
| sampling_config_test.go | Tests for fluent configuration API and composite sampling strategies |
| sampling_config.go | Fluent configuration builder with AND/OR logic and custom policy support |
| sampling_bench_test.go | Performance benchmarks showing efficient 17ns sampling decisions |
| sampling_advanced_test.go | Tests for summary events, cache warmup, and advanced features |
| neovim-plugin/tests/spec/analyzer_spec.lua | Increased CI timeout for better reliability |
| logger.go | Core logger integration with sampling methods and global state management |
| internal/filters/per_message_sampling.go | Core sampling filter implementations with thread-safe counters |
| internal/cache/lru_cache.go | LRU cache implementation with TTL support and statistics tracking |
| fortype_util_test.go | Updated to use selflog instead of standard log for consistency |
| fortype.go | Migrated from standard log to selflog for diagnostic messages |
| examples/sampling/main.go | Comprehensive examples demonstrating all sampling strategies |

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…update tests

- Fix bug where name field was set to description instead of name parameter
- Update test assertions to expect profile name instead of description

@willibrandon merged commit 5d75d44 into main on Aug 23, 2025
26 checks passed
@willibrandon deleted the feature/per-message-sampling branch on August 23, 2025 08:09

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • performance: Performance improvements and optimizations. Zero-allocation promise territory. 🚀
  • sampling: Per-message sampling and adaptive rate control features

Development

Successfully merging this pull request may close these issues.

Add per-message sampling for high-frequency events

1 participant