-
Notifications
You must be signed in to change notification settings - Fork 18
Feat: Add redaction patterns for cloud credentials #104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Azure credentials Google Cloud credentials SSH keys in different formats API keys from more providers (Stripe, Twilio, etc.) Connection strings for various databases
- Expands scrubber to cover Azure, GCP, Stripe, Twilio, and other sensitive keys for enhanced security. - Updates tests for new redaction markers. - Adds AGENTS.md for project guidelines.
WalkthroughExpands .gitignore entries. Removes a header comment from cmd/cli/root.go. Significantly broadens internal/scrubber credential and secret redaction patterns, adds specific private-key handling, and introduces ScrubLines and ScrubDiff. Updates and adds tests to cover new redaction cases and adjusted placeholders. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant C as Caller
participant S as Scrubber
participant P as Pattern Engine
rect rgba(230, 240, 255, 0.5)
Note over C,S: Existing flow
C->>S: Scrub(text)
S->>P: Detect + Redact matches
P-->>S: Redacted text
S-->>C: Result
end
rect rgba(235, 255, 235, 0.5)
Note over C,S: New entry points
C->>S: ScrubLines(lines) [New]
S->>P: Detect + Redact per line
P-->>S: Redacted lines
S-->>C: Result
C->>S: ScrubDiff(diff) [New]
S->>P: Detect + Redact within diff hunks
P-->>S: Redacted diff
S-->>C: Result
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| - | - | OpenSSH Private Key | dd2a098 | internal/scrubber/scrubber_test.go | View secret |
| - | - | Stripe Keys | dd2a098 | internal/scrubber/scrubber_test.go | View secret |
| - | - | Generic High Entropy Secret | dd2a098 | internal/scrubber/scrubber_test.go | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/scrubber/scrubber.go (1)
30-33: Fix Authorization header redaction: stray quote and mismatched quotingCurrent replacement always appends a double-quote, which can produce mismatched or extra quotes. Capture the optional closing quote and re-insert it.
Apply:
- { - Name: "Authorization Header", - Pattern: regexp.MustCompile(`(?i)(authorization\s*[=:]\s*["\']?)([a-zA-Z0-9_\-\.]{20,})["\']?`), - Redact: "${1}[REDACTED_AUTH_TOKEN]\"", - }, + { + Name: "Authorization Header", + Pattern: regexp.MustCompile(`(?i)(authorization\s*[=:]\s*["\']?)([a-zA-Z0-9_\-\.]{20,})(["\']?)`), + Redact: "${1}[REDACTED_AUTH_TOKEN]${3}", + },
🧹 Nitpick comments (9)
internal/scrubber/scrubber.go (5)
47-68: Preserve delimiters/quotes and avoid partial leaks in Azure patternsCapture the assignment + opening quote as prefix and the optional closing quote as suffix, then reinsert them. This standardizes formatting and prevents mismatched quotes.
{ Name: "Azure Client Secret", - Pattern: regexp.MustCompile(`(?i)(azure[_-]?client[_-]?secret|AZURE_CLIENT_SECRET)\s*[=:]\s*["\']?([a-zA-Z0-9_\-\.~]{20,})["\']?`), - Redact: "${1}=\"[REDACTED_AZURE_CLIENT_SECRET]\"", + Pattern: regexp.MustCompile(`(?i)((?:azure[_-]?client[_-]?secret|AZURE_CLIENT_SECRET)\s*[=:]\s*["\']?)([a-zA-Z0-9_\-\.~]{20,})(["\']?)`), + Redact: "${1}[REDACTED_AZURE_CLIENT_SECRET]${3}", }, { Name: "Azure Subscription Key", - Pattern: regexp.MustCompile(`(?i)(azure[_-]?subscription[_-]?key|AZURE_SUBSCRIPTION_KEY)\s*[=:]\s*["\']?([a-zA-Z0-9]{32})["\']?`), - Redact: "${1}=\"[REDACTED_AZURE_SUBSCRIPTION_KEY]\"", + Pattern: regexp.MustCompile(`(?i)((?:azure[_-]?subscription[_-]?key|AZURE_SUBSCRIPTION_KEY)\s*[=:]\s*["\']?)([a-zA-Z0-9]{32})(["\']?)`), + Redact: "${1}[REDACTED_AZURE_SUBSCRIPTION_KEY]${3}", }, { Name: "Azure Storage Key", - Pattern: regexp.MustCompile(`(?i)(azure[_-]?storage[_-]?key|AZURE_STORAGE_KEY|AccountKey)\s*[=:]\s*["\']?([a-zA-Z0-9/+=]{88})["\']?`), - Redact: "${1}=\"[REDACTED_AZURE_STORAGE_KEY]\"", + Pattern: regexp.MustCompile(`(?i)((?:azure[_-]?storage[_-]?key|AZURE_STORAGE_KEY|AccountKey)\s*[=:]\s*["\']?)([a-zA-Z0-9/+=]{88})(["\']?)`), + Redact: "${1}[REDACTED_AZURE_STORAGE_KEY]${3}", }, { Name: "Azure Service Principal", - Pattern: regexp.MustCompile(`(?i)(azure[_-]?service[_-]?principal|AZURE_SERVICE_PRINCIPAL)\s*[=:]\s*["\']?([a-f0-9-]{36})["\']?`), - Redact: "${1}=\"[REDACTED_AZURE_SERVICE_PRINCIPAL]\"", + Pattern: regexp.MustCompile(`(?i)((?:azure[_-]?service[_-]?principal|AZURE_SERVICE_PRINCIPAL)\s*[=:]\s*["\']?)([a-f0-9-]{36})(["\']?)`), + Redact: "${1}[REDACTED_AZURE_SERVICE_PRINCIPAL]${3}", },Note: If any real-world value exceeds the fixed lengths, consider relaxing the value class to
[A-Za-z0-9/_+=.-]{20,}to avoid partial redactions.
69-90: Narrow GCP JSON redaction to only the private_key valueCurrent pattern replaces a large JSON chunk and may drop other fields. Replace only the private_key value to preserve structure.
{ Name: "Google Cloud JSON Credentials", - Pattern: regexp.MustCompile(`(?s)"type":\s*"service_account".*?"private_key":\s*"-----BEGIN PRIVATE KEY-----.*?-----END PRIVATE KEY-----`), - Redact: "\"type\": \"service_account\",\n\"private_key\": \"[REDACTED_GCP_PRIVATE_KEY]\"", + Pattern: regexp.MustCompile(`(?s)("private_key"\s*:\s*")-----BEGIN PRIVATE KEY-----.*?-----END PRIVATE KEY-----(")`), + Redact: "${1}[REDACTED_GCP_PRIVATE_KEY]${2}", },Optionally, align the other GCP key patterns to the prefix/suffix-quote style used for Azure to keep formatting consistent.
135-161: Include Stripe test keys too (not just live)Cover both live and test variants to avoid misses in non-production diffs.
{ Name: "Stripe API Key", - Pattern: regexp.MustCompile(`(?i)(stripe[_-]?api[_-]?key|STRIPE_API_KEY)\s*[=:]\s*["\']?(sk_live_[a-zA-Z0-9]{24})["\']?`), - Redact: "${1}=\"[REDACTED_STRIPE_KEY]\"", + Pattern: regexp.MustCompile(`(?i)((?:stripe[_-]?api[_-]?key|STRIPE_API_KEY)\s*[=:]\s*["\']?)(sk_(?:live|test)_[A-Za-z0-9]{24,})(["\']?)`), + Redact: "${1}[REDACTED_STRIPE_KEY]${3}", }, { Name: "Stripe Publishable Key", - Pattern: regexp.MustCompile(`(?i)(stripe[_-]?publishable[_-]?key|STRIPE_PUBLISHABLE_KEY)\s*[=:]\s*["\']?(pk_live_[a-zA-Z0-9]{24})["\']?`), - Redact: "${1}=\"[REDACTED_STRIPE_PUBLISHABLE_KEY]\"", + Pattern: regexp.MustCompile(`(?i)((?:stripe[_-]?publishable[_-]?key|STRIPE_PUBLISHABLE_KEY)\s*[=:]\s*["\']?)(pk_(?:live|test)_[A-Za-z0-9]{24,})(["\']?)`), + Redact: "${1}[REDACTED_STRIPE_PUBLISHABLE_KEY]${3}", },
162-183: Heroku API key pattern is likely too strictUsing a UUID format may miss real Heroku API keys (commonly 32+ chars random). Consider a broader value class rather than UUID only, or add an alternate pattern.
Example:
- Pattern: regexp.MustCompile(`(?i)(heroku[_-]?api[_-]?key|HEROKU_API_KEY)\s*[=:]\s*["\']?([a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12})["\']?`), + Pattern: regexp.MustCompile(`(?i)((?:heroku[_-]?api[_-]?key|HEROKU_API_KEY)\s*[=:]\s*["\']?)([A-Za-z0-9_-]{24,})(["\']?)`),This reduces false negatives while still avoiding obvious noise.
223-283: Multi-line keys won’t be scrubbed by ScrubLinesThese patterns are multi-line (
(?s)), but ScrubLines operates per-line. Consider documenting this or adding a fallback to apply ScrubDiff when multi-line markers (BEGIN/END … PRIVATE KEY) are present.Option: in ScrubLines, short-circuit:
func ScrubLines(content string) string { + // Fallback for multi-line secrets (private keys) + if strings.Contains(content, "-----BEGIN ") && strings.Contains(content, "PRIVATE KEY") { + return ScrubDiff(content) + }This keeps per-line behavior while handling multi-line secrets.
internal/scrubber/scrubber_test.go (4)
438-459: Strengthen assertions to ensure no leakageCurrently only checks for presence of a [REDACTED_*] marker. Add a negative assertion that a recognizable slice of the original value doesn’t remain (especially for long AccountKey).
Example:
t.Run(tt.name, func(t *testing.T) { result := ScrubDiff(tt.input) if !strings.Contains(result, "[REDACTED_AZURE") { t.Errorf("ScrubDiff() failed to redact Azure credentials.\nInput: %s\nOutput: %s", tt.input, result) } + // Ensure no obvious part of the secret remains + if strings.Contains(result, "abcdefghijklmnopqrstuvwxyz123456") { + t.Errorf("Redaction leaked part of the Azure secret.\nOutput: %s", result) + } })Also consider using an 88-char AccountKey fixture to mirror real Azure lengths.
461-489: Add negative assertions for GCP fixturesVerify that raw values like
my-service-account-key.jsonor a substring of the API key do not appear post-scrub.Example snippets:
result := ScrubDiff(tt.input) if !strings.Contains(result, "[REDACTED_") { t.Errorf("ScrubDiff() failed to redact Google Cloud credentials.\nInput: %s\nOutput: %s", tt.input, result) } + if strings.Contains(result, "service-account-key.json") || + strings.Contains(result, "abcdefghijklmnopqrstuvwxyz123456") { + t.Errorf("Redaction leaked part of the GCP secret.\nOutput: %s", result) + }
490-521: Also assert original tokens aren’t present for Stripe/Twilio/DO/SendGridPresence of a [REDACTED_*] marker isn’t sufficient; add a couple of negative contains checks per case.
Example:
result := ScrubDiff(tt.input) if !strings.Contains(result, "[REDACTED_") { t.Errorf("ScrubDiff() failed to redact additional API keys.\nInput: %s\nOutput: %s", tt.input, result) } + if strings.Contains(result, "sk_live_") || + strings.Contains(result, "twilio_auth_token") || + strings.Contains(result, "digitalocean_api_key") || + strings.Contains(result, "SG.") { + t.Errorf("Redaction may have leaked raw values.\nOutput: %s", result) + }
523-550: Add leakage checks for SSH fixturesAssert that BEGIN/END blocks (or a known base64 slice) aren’t present after scrubbing.
Example:
result := ScrubDiff(tt.input) if !strings.Contains(result, "[REDACTED_") { t.Errorf("ScrubDiff() failed to redact SSH keys.\nInput: %s\nOutput: %s", tt.input, result) } + if strings.Contains(result, "-----BEGIN ") || strings.Contains(result, " OPENSSH PRIVATE KEY-----") { + t.Errorf("SSH private key material still present.\nOutput: %s", result) + }Additionally, consider adding a test for quoted Authorization headers to guard against quoting issues:
func TestAuthorizationHeaderWithQuotes(t *testing.T) { in := `authorization="sometokenwithmorethan20chars"` out := ScrubDiff(in) if !strings.Contains(out, "[REDACTED_AUTH_TOKEN]") || strings.Count(out, `"`) != 2 { t.Errorf("Authorization redaction with quotes failed: %s", out) } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.gitignore(1 hunks)cmd/cli/root.go(0 hunks)internal/scrubber/scrubber.go(2 hunks)internal/scrubber/scrubber_test.go(2 hunks)
💤 Files with no reviewable changes (1)
- cmd/cli/root.go
🧰 Additional context used
🧬 Code graph analysis (1)
internal/scrubber/scrubber_test.go (1)
internal/scrubber/scrubber.go (1)
ScrubDiff(330-339)
🪛 Gitleaks (8.28.0)
internal/scrubber/scrubber.go
[high] 231-236: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
[high] 241-246: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
[high] 251-281: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
internal/scrubber/scrubber_test.go
[high] 439-439: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 501-501: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 505-505: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 509-509: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 476-476: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
[high] 530-530: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build Go Binary (windows-latest)
🔇 Additional comments (3)
.gitignore (1)
31-31: Confirm intent to ignoreAGENTS.md.We’re versioning
AGENTS.mdin this PR, yet it’s also being added to.gitignore. That makes future edits harder (and confusing, since contributors mustgit add -f). Please confirm whether the doc should remain tracked; if so, drop it from.gitignore.internal/scrubber/scrubber_test.go (2)
137-139: LGTM: RSA-specific placeholder expectationMatches the new per-type key redaction behavior.
399-430: Credit card tests look goodCovers spaces, dashes, and contiguous digits.
Description
Add redaction patterns for cloud credentials
Type of Change
Related Issue
Fixes #(issue number)
Changes Made
and other sensitive keys for enhanced security.
Testing
Checklist
Screenshots (if applicable)
Additional Notes
For Hacktoberfest Participants
Thank you for your contribution! 🎉
Summary by CodeRabbit