feat: Per-message sampling with adaptive strategies and profile versioning #61
Merged
willibrandon merged 16 commits into main on Aug 23, 2025
…ersioning

Implement sampling system including counter, rate, time-based, first-N, group, conditional, and exponential backoff strategies. Add adaptive sampling with hysteresis and dampening to prevent oscillation. Introduce profile versioning with migration policies and thread-safe registry. Include LRU cache, fluent configuration, and predefined dampening presets.

Resolves #49

- Remove unused mutex fields from SamplingGroupManager and BackoffState
- Remove unused globalSamplingMemoryLimit variable and deprecated function
- Simplify loop with append variadic syntax
- Remove unused sync import from per_message_sampling.go

- Add all sampling-related methods to mockLogger and mockFuzzLogger
- Include Sample, SampleDuration, SampleRate, SampleFirst, SampleGroup, etc.
- Add ResetSampling, ResetSamplingGroup, EnableSamplingSummary, GetSamplingStats
- Add SampleProfile, SampleAdaptive, and SampleAdaptiveWithOptions methods
- Fix TestSamplingConfigComplex flakiness by increasing sample rate and count

Root cause: On Windows, rapid successive events could have identical or very similar timestamps due to lower timer resolution, causing the hash function to produce biased values that wouldn't pass the threshold check.

Solution:
- Add event counter to hash calculation for better distribution
- Handle zero timestamps by using current time as fallback
- Include counter XOR to ensure unique hashes for rapid events
- This ensures proper sampling distribution on all platforms

Also improved test reliability:
- Use RateSamplingFilter instead of AdaptiveSamplingFilter in TestProfileFreezing
- Increase sample sizes and widen acceptable ranges in TestSampleRate
- Add micro-delays between events in TestLoggerPresetMethods for timestamp variety

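The timestamp fix described in this commit can be illustrated with a minimal sketch. It is not the code from `per_message_sampling.go`: the names `eventCounter`, `mix64`, `shouldSample`, `keptWithFrozenClock`, and `summarize` are hypothetical, and the SplitMix64 finalizer is a stand-in chosen here because the actual hash function is not shown in the PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// eventCounter is a hypothetical process-wide counter. Mixing it into the
// hash keeps inputs unique even when consecutive events share a timestamp.
var eventCounter atomic.Uint64

// mix64 is the SplitMix64 finalizer, a stand-in for the filter's real hash.
func mix64(x uint64) uint64 {
	x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9
	x = (x ^ (x >> 27)) * 0x94d049bb133111eb
	return x ^ (x >> 31)
}

// shouldSample reports whether an event with timestamp ts passes a sampling
// rate in [0, 1].
func shouldSample(ts time.Time, rate float64) bool {
	if ts.IsZero() {
		ts = time.Now() // fallback for events that carry no timestamp
	}
	n := eventCounter.Add(1)
	h := mix64(uint64(ts.UnixNano()) ^ n) // counter XOR makes each input unique
	// Map the top 53 bits to a uniform value in [0, 1) and compare.
	return float64(h>>11)/float64(1<<53) < rate
}

// keptWithFrozenClock samples total events that all carry the same timestamp,
// imitating a coarse timer, and returns how many were kept.
func keptWithFrozenClock(total int, rate float64) int {
	ts := time.Unix(1_700_000_000, 0)
	kept := 0
	for i := 0; i < total; i++ {
		if shouldSample(ts, rate) {
			kept++
		}
	}
	return kept
}

// summarize formats a kept/total pair for display.
func summarize(kept, total int) string {
	return fmt.Sprintf("kept %d of %d (%.1f%%)", kept, total, 100*float64(kept)/float64(total))
}

func main() {
	const total = 100_000
	fmt.Println(summarize(keptWithFrozenClock(total, 0.25), total))
}
```

Even with every event stamped identically, the kept fraction should land close to the requested rate, because the counter guarantees a distinct hash input per event.
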
- Add all sampling-related methods to noOpLogger in middleware adapter
- Include Sample, SampleDuration, SampleRate, SampleFirst, SampleGroup, etc.
- Add ResetSampling, ResetSamplingGroup, EnableSamplingSummary, GetSamplingStats
- Add SampleProfile, SampleAdaptive, and SampleAdaptiveWithOptions methods

- Extract validateBackoffFactor function to reduce code duplication
- Remove thread-unsafe lazy initialization of randSource
- Optimize LRU cache Get method to use single lock instead of double-locking
- Add clarifying comment about selflog.Printf usage

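For readers unfamiliar with the single-lock point: an LRU `Get` mutates recency order, so taking one exclusive lock for both the lookup and the reorder avoids a second acquisition. The sketch below shows that shape under stated assumptions. The `lruCache` type and its methods are hypothetical, and it omits the TTL support and statistics that `internal/cache/lru_cache.go` is described as having.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// entry is the payload stored in each list element.
type entry struct {
	key   string
	value int
}

// lruCache is a hypothetical, minimal LRU cache guarded by one mutex.
type lruCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRUCache(capacity int) *lruCache {
	if capacity < 1 {
		capacity = 1
	}
	return &lruCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get looks up key and refreshes its recency under a single lock acquisition.
func (c *lruCache) Get(key string) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[key]
	if !ok {
		return 0, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates key, evicting the least recently used entry if full.
func (c *lruCache) Put(key string, value int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
}

// String summarizes occupancy as "lru(used/capacity)".
func (c *lruCache) String() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return fmt.Sprintf("lru(%d/%d)", c.order.Len(), c.capacity)
}

func main() {
	c := newLRUCache(2)
	c.Put("a", 1)
	c.Put("b", 2)
	c.Get("a")    // "a" becomes most recently used
	c.Put("c", 3) // evicts "b", the least recently used key
	_, ok := c.Get("b")
	fmt.Println(c, ok)
}
```

A read-write lock would not help here, since every successful `Get` writes to the recency list.
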
- Add SamplingMetrics struct for cache performance monitoring
- Implement GetAvailableProfileDescriptions() for runtime profile discovery
- Add EnableSamplingDebug() to log sampling decisions via selflog
- Include comprehensive tests for new monitoring features
- Fix flaky TestMetricsMiddleware/timing_accuracy test on macOS CI by increasing tolerance from 10ms to 20ms

- Implement String() for human-readable metric summaries with hit rates
- Add Format() to support %s, %v, %+v (verbose), and %#v formatting
- Add PrometheusMetrics() to export metrics as map for monitoring systems
- Include tests for all formatting options and edge cases
- Handle zero values gracefully to prevent divide-by-zero errors

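The formatting behavior listed in this commit maps onto Go's `fmt.Stringer` and `fmt.Formatter` interfaces. The sketch below shows one way such a type could look. It is an illustration only: `cacheMetrics`, its fields, the output layout, the metric names, and the `map[string]float64` return type are all assumptions, not the actual `SamplingMetrics` definition.

```go
package main

import (
	"fmt"
	"io"
)

// cacheMetrics is a hypothetical stand-in for a metrics struct.
type cacheMetrics struct {
	Hits, Misses uint64
}

// hitRate returns the hit percentage, guarding against division by zero.
func (m cacheMetrics) hitRate() float64 {
	total := m.Hits + m.Misses
	if total == 0 {
		return 0
	}
	return float64(m.Hits) / float64(total) * 100
}

// String gives a one-line human-readable summary.
func (m cacheMetrics) String() string {
	return fmt.Sprintf("hits=%d misses=%d hit_rate=%.1f%%", m.Hits, m.Misses, m.hitRate())
}

// Format implements fmt.Formatter for %s, %v, %+v, and %#v.
func (m cacheMetrics) Format(f fmt.State, verb rune) {
	switch {
	case verb == 'v' && f.Flag('#'):
		fmt.Fprintf(f, "cacheMetrics{Hits:%d, Misses:%d}", m.Hits, m.Misses)
	case verb == 'v' && f.Flag('+'):
		fmt.Fprintf(f, "%s total=%d", m.String(), m.Hits+m.Misses)
	case verb == 's' || verb == 'v':
		io.WriteString(f, m.String())
	default:
		fmt.Fprintf(f, "%%!%c(cacheMetrics)", verb)
	}
}

// prometheus exports the metrics as a flat map; the names are illustrative.
func (m cacheMetrics) prometheus() map[string]float64 {
	return map[string]float64{
		"sampling_cache_hits_total":       float64(m.Hits),
		"sampling_cache_misses_total":     float64(m.Misses),
		"sampling_cache_hit_rate_percent": m.hitRate(),
	}
}

func main() {
	m := cacheMetrics{Hits: 3, Misses: 1}
	fmt.Printf("%s\n%+v\n%#v\n", m, m, m)
	fmt.Println(cacheMetrics{}) // zero value: no divide-by-zero
	fmt.Println(m.prometheus())
}
```

Because `Format` takes precedence over `String` inside the `fmt` package, the `%s` and plain `%v` cases delegate to `String()` explicitly.
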
- Replace plain bool with atomic.Bool for thread-safe access
- Add SetSamplingDebugEnabled/IsSamplingDebugEnabled helper functions
- Fix concurrent access issue detected by race detector in tests

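A package-level flag read and written from multiple goroutines is a data race when it is a plain `bool`; `atomic.Bool` (Go 1.19+) removes the race without a mutex. The sketch below reuses the helper names from this commit, but their signatures are assumed, and `debugLine` and `toggleConcurrently` are hypothetical additions for demonstration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// samplingDebugEnabled replaces a plain bool so concurrent access is safe.
var samplingDebugEnabled atomic.Bool

// SetSamplingDebugEnabled and IsSamplingDebugEnabled use assumed signatures.
func SetSamplingDebugEnabled(on bool) { samplingDebugEnabled.Store(on) }
func IsSamplingDebugEnabled() bool    { return samplingDebugEnabled.Load() }

// debugLine returns a diagnostic line, or "" when debugging is disabled.
func debugLine(template string, sampled bool) string {
	if !IsSamplingDebugEnabled() {
		return ""
	}
	return fmt.Sprintf("[sampling] template=%q sampled=%t", template, sampled)
}

// toggleConcurrently writes and reads the flag from many goroutines.
func toggleConcurrently(workers int) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(on bool) {
			defer wg.Done()
			SetSamplingDebugEnabled(on)
			_ = IsSamplingDebugEnabled()
		}(i%2 == 0)
	}
	wg.Wait()
}

func main() {
	toggleConcurrently(8)
	SetSamplingDebugEnabled(true)
	fmt.Println(debugLine("User {Id} logged in", true))
}
```

Running this with `go run -race` is a quick way to confirm that the concurrent toggling no longer trips the race detector.
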
- Document SamplingMetrics String() and PrometheusMetrics() usage
- Add examples for debugging sampling decisions with EnableSamplingDebug()
- Include profile discovery with GetAvailableProfileDescriptions()
- Provide monitoring best practices and alerting examples

- Add sampling-monitoring example with metrics and Prometheus export
- Add sampling-debug example for troubleshooting sampling decisions
- Add sampling-profiles example for profile discovery and comparison
- Fix GetSamplingStats() to work with SampleProfile methods
- Update README with links to all sampling examples

Pull Request Overview
This PR implements a comprehensive per-message sampling system for mtlog, enabling fine-grained control over log event sampling with multiple strategies and adaptive capabilities. The feature allows production systems to intelligently manage log volume while preserving important events through configurable sampling rates, adaptive algorithms, and profile-based configurations.
Key changes include:
- Multiple sampling strategies: Counter-based, rate-based, time-based, first-N, group-based, conditional, and exponential backoff sampling
- Adaptive sampling: Automatically adjusts sampling rates based on target events/second with hysteresis and dampening controls
- Profile versioning: Semantic versioning system with migration policies for backward compatibility and production safety
- Performance optimization: Thread-safe LRU caches with atomic operations achieving 17ns for simple sampling decisions
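To make the adaptive bullet concrete, here is a minimal sketch of a rate controller with a hysteresis dead band and a dampening factor. It is an illustration under stated assumptions rather than mtlog's algorithm: the `adaptiveRate` type, its fields, the update rule, and the `simulate` helper are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// adaptiveRate is a hypothetical controller that steers a sampling rate
// toward a target number of emitted events per second.
type adaptiveRate struct {
	target           float64 // desired emitted events per second
	hysteresis       float64 // relative dead band, e.g. 0.10 means +/-10%
	dampening        float64 // fraction of the correction applied per step, in (0, 1]
	minRate, maxRate float64 // clamp for the resulting rate
	rate             float64 // current sampling rate
}

// update adjusts the rate given the incoming (pre-sampling) events per second
// observed over the last interval, and returns the new rate.
func (a *adaptiveRate) update(incoming float64) float64 {
	if incoming <= 0 {
		return a.rate
	}
	emitted := incoming * a.rate
	// Hysteresis: ignore deviations inside the dead band to avoid oscillation.
	if math.Abs(emitted-a.target) <= a.hysteresis*a.target {
		return a.rate
	}
	// Dampening: move only part of the way toward the ideal rate.
	ideal := a.target / incoming
	a.rate += a.dampening * (ideal - a.rate)
	a.rate = math.Min(a.maxRate, math.Max(a.minRate, a.rate))
	return a.rate
}

// simulate applies a constant incoming load for a number of steps and
// returns one trace line per step.
func simulate(a *adaptiveRate, incoming float64, steps int) []string {
	lines := make([]string, 0, steps)
	for i := 0; i < steps; i++ {
		r := a.update(incoming)
		lines = append(lines, fmt.Sprintf("step %2d: rate=%.4f emitted=%.1f/s", i+1, r, incoming*r))
	}
	return lines
}

func main() {
	a := &adaptiveRate{target: 100, hysteresis: 0.10, dampening: 0.5, minRate: 0.001, maxRate: 1, rate: 1}
	for _, line := range simulate(a, 1000, 8) {
		fmt.Println(line)
	}
}
```

In this sketch the rate halves its distance to the ideal value on each step and then stops changing once the emitted volume falls within 10% of the target, which is the behavior the hysteresis and dampening controls are meant to provide.
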
Reviewed Changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| sampling_test.go | Comprehensive test coverage for all sampling strategies and concurrent safety verification |
| sampling_profiles.go | Profile registry with versioning, migration policies, and thread-safe management |
| sampling_monitoring_test.go | Tests for metrics, debug logging, and monitoring capabilities |
| sampling_config_test.go | Tests for fluent configuration API and composite sampling strategies |
| sampling_config.go | Fluent configuration builder with AND/OR logic and custom policy support |
| sampling_bench_test.go | Performance benchmarks showing efficient 17ns sampling decisions |
| sampling_advanced_test.go | Tests for summary events, cache warmup, and advanced features |
| neovim-plugin/tests/spec/analyzer_spec.lua | Increased CI timeout for better reliability |
| logger.go | Core logger integration with sampling methods and global state management |
| internal/filters/per_message_sampling.go | Core sampling filter implementations with thread-safe counters |
| internal/cache/lru_cache.go | LRU cache implementation with TTL support and statistics tracking |
| fortype_util_test.go | Updated to use selflog instead of standard log for consistency |
| fortype.go | Migrated from standard log to selflog for diagnostic messages |
| examples/sampling/main.go | Comprehensive examples demonstrating all sampling strategies |
…update tests

- Fix bug where name field was set to description instead of name parameter
- Update test assertions to expect profile name instead of description
Description
Implements a per-message sampling system for mtlog, providing fine-grained control over log event sampling with multiple strategies and adaptive capabilities. This feature enables production systems to manage log volume intelligently while preserving important events.
Key additions:
Type of change
Checklist
- `go test ./...`
- `golangci-lint run`

Additional notes
Fixes #49