feat: add -sw (strict-wildcard) flag for automatic wildcard detection and filtering #945
assakafpix wants to merge 2 commits into projectdiscovery:dev
Walkthrough
Adds strict wildcard detection and filtering: new Options fields …
Sequence Diagram
sequenceDiagram
actor User
participant Runner
participant HybridMap
participant DNSX as "wildcardDnsx (DNSX)"
participant Output
User->>Runner: Start run with StrictWildcard
Runner->>HybridMap: Collect hosts with A records
HybridMap-->>Runner: Return host list
Runner->>DNSX: detectWildcardRoots(hosts)
loop Parallel tests (bounded by Threads)
DNSX->>DNSX: isStrictWildcard(random-subdomain)
end
DNSX-->>Runner: Detected wildcard roots
Runner->>Output: Restart output worker
Runner->>Runner: Filter hosts using isSubdomainOfWildcard()
loop Lookup non-wildcard hosts
Runner->>DNSX: Perform DNS lookups
DNSX-->>Runner: Lookup results
end
Runner->>Output: Emit filtered results
Runner->>User: Log summary (wildcard vs non-wildcard)
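The "Parallel tests (bounded by Threads)" loop in the diagram could be sketched as below. This is a minimal illustration, not the PR's code: `detectRoots` and `probeFunc` are hypothetical names, the probe is injected so that no DNS traffic is needed, and a buffered channel plays the role of the thread limit.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// probeFunc reports whether a random label under domain resolves.
// It stands in for the PR's isStrictWildcard; the real one queries DNS.
type probeFunc func(domain string) bool

// detectRoots probes every candidate concurrently, with at most `threads`
// probes in flight, and returns the candidates flagged as wildcard roots.
func detectRoots(candidates []string, threads int, probe probeFunc) []string {
	if threads < 1 {
		threads = 1
	}
	sem := make(chan struct{}, threads) // counting semaphore
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		roots []string
	)
	for _, domain := range candidates {
		wg.Add(1)
		sem <- struct{}{} // blocks once `threads` probes are running
		go func(d string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if probe(d) {
				mu.Lock()
				roots = append(roots, d)
				mu.Unlock()
			}
		}(domain)
	}
	wg.Wait()
	sort.Strings(roots) // deterministic order for callers and tests
	return roots
}

func main() {
	// Fake probe: pretend these two parents answer for any label.
	wildcards := map[string]bool{"herokuapp.com": true, "vercel.app": true}
	probe := func(d string) bool { return wildcards[d] }

	roots := detectRoots([]string{"vercel.app", "example.com", "herokuapp.com"}, 2, probe)
	fmt.Println(roots)
}
```

Injecting the probe keeps the concurrency logic testable offline; in the runner the probe would be a method that queries the separate wildcard DNS client.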
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (4)
internal/runner/runner.go (1)
566-611: `-sw` with `-re`/`-ro` or other record types only outputs A records.
When `StrictWildcard` is active, hosts are collected only if they have A records (line 579), and `lookupAndOutput` in Response mode (lines 640–655) only emits A records. If a user combines `-sw` with `-aaaa -re`, the AAAA data stored in the HybridMap is silently dropped in the output phase.
This matches the existing `-wd` behavior, but consider either:
- Documenting this limitation (e.g., in the flag help text), or
- Extending `lookupAndOutput` to emit all requested record types.
internal/runner/wildcard.go (1)
77-84: Single random-subdomain probe may produce false positives.
`isStrictWildcard` issues a single random query. Some DNS setups (e.g., certain CDN catch-all configurations or resolver-level rewriting) could cause a one-off false positive, filtering out legitimate subdomains of that parent. A second probe with a different random label would greatly reduce the false-positive rate at minimal cost.
💡 Suggested hardening

     func (r *Runner) isStrictWildcard(domain string) bool {
    -	randomHost := xid.New().String() + "." + domain
    -	resp, err := r.wildcardDnsx.QueryOne(randomHost)
    -	if err != nil || resp == nil {
    -		return false
    -	}
    -	return len(resp.A) > 0
    +	// Two independent random probes to reduce false positives
    +	for i := 0; i < 2; i++ {
    +		randomHost := xid.New().String() + "." + domain
    +		resp, err := r.wildcardDnsx.QueryOne(randomHost)
    +		if err != nil || resp == nil || len(resp.A) == 0 {
    +			return false
    +		}
    +	}
    +	return true
     }

internal/runner/wildcard_test.go (2)
52-69: Live DNS tests will be flaky in CI.
`TestIsStrictWildcard` and `TestDetectWildcardRoots` depend on `dev.projectdiscovery.io` being a wildcard and on network availability. DNS behavior can change, and CI environments may have restricted network access or flaky resolvers. These tests will intermittently fail.
Consider gating them behind a build tag or environment variable (e.g., `//go:build integration` or checking `os.Getenv("DNSX_INTEGRATION_TEST")`), and keeping `TestIsSubdomainOfWildcard` as the always-run unit test.
Also applies to: 71-91
10-25: `newTestRunner` uses hardcoded `dns.TypeA` magic number.
Line 14 uses the raw value `1` instead of `dns.TypeA`. Since the `dns` package isn't imported, consider importing it for clarity, or add a comment.
♻️ Suggested fix

     import (
     	"testing"

    +	"github.com/miekg/dns"
     	"github.com/projectdiscovery/dnsx/libs/dnsx"
     	"github.com/stretchr/testify/require"
     )

     func newTestRunner(t *testing.T) *Runner {
     	t.Helper()
     	options := dnsx.DefaultOptions
    -	options.QuestionTypes = []uint16{1} // TypeA
    +	options.QuestionTypes = []uint16{dns.TypeA}
     	options.MaxRetries = 3
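The environment-variable gate suggested above for the live DNS tests might look like the following sketch. The variable name `DNSX_INTEGRATION_TEST` comes from the review comment; the helper `integrationEnabled` and the accepted values `1` and `true` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// integrationEnvVar is the variable name suggested in the review comment.
const integrationEnvVar = "DNSX_INTEGRATION_TEST"

// integrationEnabled reports whether live-DNS tests should run.
// getenv is injected so the decision itself can be unit tested;
// pass os.Getenv in real code.
func integrationEnabled(getenv func(string) string) bool {
	switch getenv(integrationEnvVar) {
	case "1", "true":
		return true
	default:
		return false
	}
}

// In a _test.go file the helper would be used like this:
//
//	func TestIsStrictWildcard(t *testing.T) {
//		if !integrationEnabled(os.Getenv) {
//			t.Skip("set DNSX_INTEGRATION_TEST=1 to run live DNS tests")
//		}
//		// ... live DNS assertions ...
//	}

func main() {
	fmt.Println("live DNS tests enabled:", integrationEnabled(os.Getenv))
}
```

A `//go:build integration` constraint achieves the same separation at compile time; the runtime check has the advantage that skipped tests still show up in `go test -v` output.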
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
497-497: ⚠️ Potential issue | 🟡 Minor
Stale note: `-wd` is no longer the only path to wildcard elimination.
Line 497 reads: "Domain name (`wd`) input is mandatory for wildcard elimination." This is now incorrect; `-sw` performs wildcard filtering without any domain argument. The note should be updated to reflect both modes.
✏️ Suggested fix

    - - Domain name (`wd`) input is mandatory for wildcard elimination.
    + - Domain name (`wd`) or the `-sw` flag is required for wildcard elimination (`-sw` and `-wd` are mutually exclusive).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 497, Update the stale README sentence "Domain name (`wd`) input is mandatory for wildcard elimination." to mention both modes: clarify that the `-wd`/`wd` domain argument is required for wildcard elimination when using the domain-based mode, and that the `-sw` flag performs wildcard filtering without any domain argument; replace the single sentence with a concise description that references both `-wd` and `-sw` and explains which mode requires a domain and which does not.
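Since the suggested README wording relies on `-sw` and `-wd` being mutually exclusive, a validation step along the following lines would enforce it. This is a sketch under assumptions: the `options` struct and `validate` function are stand-ins, not the runner's actual `Options` type or its validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// options mirrors only the two fields relevant here. The field names follow
// the PR and review comments, but the struct is a hypothetical stand-in.
type options struct {
	StrictWildcard bool   // -sw
	WildcardDomain string // -wd
}

var errExclusive = errors.New("-sw and -wd are mutually exclusive; use one or the other")

// validate rejects option sets that enable both wildcard modes at once.
func validate(o options) error {
	if o.StrictWildcard && o.WildcardDomain != "" {
		return errExclusive
	}
	return nil
}

func main() {
	fmt.Println(validate(options{StrictWildcard: true, WildcardDomain: "example.com"}))
	fmt.Println(validate(options{StrictWildcard: true}))
}
```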
🧹 Nitpick comments (1)
README.md (1)
410-438: Add a ToC entry and document the `-sw`/`-wd` mutual exclusivity.
Two minor gaps in this new section:
- The top-of-file navigation (line 21) has no anchor link for "Strict wildcard filtering". Consider adding one alongside the existing `#wildcard-filtering` entry.
- The PR description states `-sw` and `-wd` are mutually exclusive, but neither the new section nor the CONFIGURATIONS block mentions this. A brief note (e.g., "Cannot be used with `-wd`" in the flag description or in the section prose) would prevent user confusion.
✏️ Suggested additions
Add a ToC entry (line 21 area):

      <a href="#wildcard-filtering">Wildcard</a> •
    + <a href="#strict-wildcard-filtering">Strict Wildcard</a> •

Note mutual exclusivity in the CONFIGURATIONS block (line 132):

    - -sw, -strict-wildcard perform strict wildcard check on all found subdomains
    + -sw, -strict-wildcard perform strict wildcard check on all found subdomains (mutually exclusive with -wd)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 410 - 438, Add a top-of-file Table-of-Contents anchor entry for the "Strict wildcard filtering" section (the "Strict wildcard filtering" header) alongside the existing `#wildcard-filtering` entry so the section appears in the navigation, and update the CONFIGURATIONS block entry for the -sw flag to explicitly state it cannot be used with -wd (or add a short sentence in the new section noting that -sw and -wd are mutually exclusive); reference the flags by name (-sw and -wd) and the section header "Strict wildcard filtering" when making the edits.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md`:
- Around line 426-431: Add a short clarifying sentence after the sample output
that explains the filter only removes wildcard subdomains, not the parent/root
domain itself (so entries like herokuapp.com, netlify.app, vercel.app can still
appear even when *.herokuapp.com, *.netlify.app, *.vercel.app were detected);
reference the example output lines (e.g., the shown wildcard roots like
*.herokuapp.com and the non-wildcard results such as herokuapp.com) and state
that this behavior is intentional to avoid confusion.
- Around line 410-438: Add documentation noting that the strict wildcard
detection flag `-sw` and the wildcard-domain flag `-wd` cannot be used
together: update the README section that shows `-sw` and the `-wildcard-retry`
example (or the Notes section) to include a single sentence like "Note: `-sw`
and `-wd` are mutually exclusive; use one or the other." Mention both flags
(`-sw`, `-wd`) explicitly so users see the constraint when reading the wildcard
examples.
- Around line 497-498: Update README.md: change "can not" to "cannot" and revise
the sentence about DNS record flags to state that the restriction applies to
both -wd and -sw flags (wildcard filtering) because the code in runner.go treats
them the same and defaults to A records only; reference the -wd and -sw flags
and the runner.go behavior (defaulting to type A) when rewording to make the
limitation explicit for both flags.
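The first inline comment above notes that the filter removes only wildcard subdomains and keeps the root itself. A suffix check with that behaviour can be sketched as follows; the function shares its name with the PR's `isSubdomainOfWildcard`, but the signature and the map-based root set are assumptions, not the PR's code.

```go
package main

import (
	"fmt"
	"strings"
)

// isSubdomainOfWildcard reports whether host sits strictly below one of the
// detected wildcard roots. Roots are stored without the "*." prefix, and the
// root itself never counts as a match.
func isSubdomainOfWildcard(host string, roots map[string]struct{}) bool {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	// Strip one leading label at a time:
	// a.b.example.com -> b.example.com -> example.com -> com
	for {
		i := strings.IndexByte(host, '.')
		if i < 0 {
			return false
		}
		host = host[i+1:]
		if _, ok := roots[host]; ok {
			return true
		}
	}
}

func main() {
	roots := map[string]struct{}{"herokuapp.com": {}, "vercel.app": {}}
	for _, h := range []string{"herokuapp.com", "myapp.herokuapp.com", "a.b.vercel.app", "example.com"} {
		fmt.Println(h, isSubdomainOfWildcard(h, roots))
	}
}
```

Comparing whole parent names rather than using a raw string suffix means `notherokuapp.com` is not mistaken for a child of `herokuapp.com`.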
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@README.md`:
- Around line 410-438: The README's "Strict wildcard filtering" section omits
that the CLI flags -sw and -wd are mutually exclusive; update the README.md by
adding a short note under that section (near the -sw description) explicitly
stating that -sw and -wd cannot be used together (e.g., "Note: -sw and -wd are
mutually exclusive; use one or the other."), so users combining both flags see
the constraint in the documentation.
- Around line 497-498: Update the README text to correct "can not" to "cannot"
and broaden the DNS record flag restriction so it applies when using wildcard
filtering with either -wd or -sw; specifically, edit the sentence that currently
reads "DNS record flag can not be used when using wildcard filtering with `-wd`"
to something like "The DNS record flag cannot be used when using wildcard
filtering with `-wd` or with automatic wildcard detection `-sw`" so both flags
(`-wd`, `-sw`) are referenced and the wording uses "cannot".
🧹 Nitpick comments (1)
README.md (1)
132-132: Nit: "all found subdomains" may mislead; consider "all input hosts".
"Found subdomains" implies subdomains discovered at runtime, whereas `-sw` processes whatever is in the input list (domains, subdomains, hosts). A clearer phrase:
📝 Suggested wording

    - -sw, -strict-wildcard perform strict wildcard check on all found subdomains
    + -sw, -strict-wildcard perform strict wildcard detection and filtering on all input hosts

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 132, The README line describing the -sw / -strict-wildcard flag is misleading; change the phrase "all found subdomains" to reflect that the flag operates on the input hosts rather than discovered subdomains — e.g., replace with "perform strict wildcard check on all input hosts" and update any mention of -sw or -strict-wildcard to use this clearer wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@README.md`:
- Line 496: Update the README note to explicitly state that the flags -sw
(StrictWildcard) and -wd (WildcardDomain / domain name input) are mutually
exclusive and cannot be used together; replace the current "without needing
`-wd`" language with a clear sentence such as "These flags are mutually
exclusive — use either -sw or -wd, but not both; using both will result in an
error." Reference the exact flag names (-sw and -wd) and the Domain name (`wd`)
wording already in the doc so readers can locate the statement.
/claim #924
Proposed Changes
The current `-wd` flag requires manually specifying a single domain and uses IP-threshold grouping to detect wildcards. This fails for load-balanced wildcards like `*.herokuapp.com` (8 disjoint IP pools, 31 unique IPs; no single IP reaches the threshold).
This PR adds a new `-sw` (strict-wildcard) flag that automatically detects and filters wildcard subdomains from any input, without requiring the user to know which domains are wildcards.
How it works
Phase 1: Normal resolution (main DNS client, `-retry 2` default)
Phase 2: Wildcard detection (separate DNS client, `-wildcard-retry 5` default)
- … (`<xid>.<parent>`)
- … `--wildcard-retry` times.
- … (`-t` threads)
Key design decisions
- `-wd` is unchanged, `-sw` is a new additive flag. They are mutually exclusive.
- … `example.com` before `sub.example.com`. If the parent is wildcard, all children are skipped.
- … `-t` (threads). New flag: `--wildcard-retry` (default 5, matching shuffledns) for wildcard DNS retries, independent from `-retry` (default 2) used for normal resolution.
- … `xid` for random subdomain generation, consistent with both the existing `IsWildcard()` in dnsx and shuffledns.
Files modified
| File | Changes |
|---|---|
| `internal/runner/options.go` | `StrictWildcard`, `WildcardRetry` fields and `-sw`, `--wildcard-retry` flags |
| `internal/runner/wildcard.go` | `isStrictWildcard()`, `detectWildcardRoots()`, `isSubdomainOfWildcard()`. Existing `IsWildcard()` untouched. |
| `internal/runner/runner.go` | `wildcardDnsx` client (retries default 5), strict wildcard filtering block, `lookupAndOutput()` with `-resp` support. |
| `internal/runner/wildcard_test.go` | |
Proof
Test domain list (domains.txt - 18 domains)
This list contains 18 domains across 8 root domains, many of which are known wildcard providers (herokuapp.com, vercel.app, netlify.app, ngrok.io, wordpress.com, github.io) plus `*.dev.projectdiscovery.io`.
puredns (baseline)
puredns detects 8 wildcard roots and outputs 5 valid domains.
Before (original dnsx, no wildcard filtering)
Original dnsx outputs 15 unique hosts with no wildcard filtering. 10 of these are wildcard subdomains that resolve only because the wildcard catches them.
After (dnsx with `-sw` flag)
dnsx `-sw` detects 7 wildcard roots (same as puredns minus `*.netlify.com`, which had no subdomains in the input) and outputs 5 non-wildcard domains, matching puredns exactly.
New flags
Performance
Benchmark on subdomains from 8 different root domains (projectdiscovery.io, hackerone.com, bugcrowd.com, github.com, shopify.com, cloudflare.com, uber.com, tesla.com):
`-sw` (this PR)
For both puredns and dnsx with `-sw` we found 1 wildcard root for the bench_100.txt file and 2 wildcard roots for the bench_500.txt file.
Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests