Chore/scan tunning parameters #174
0e16a3e to
1bd454d
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
1bd454d to
c12833c
Compare
c12833c to
92ed9a1
Compare
92ed9a1 to
ebe0ef3
Compare
📝 Walkthrough
Adds file-snippet tuning: CLI flags and help, settings schema and accessors, ScanSettingsBuilder to merge CLI/file/root settings (priority: CLI > file_snippet > root), threads a …
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User as CLI User
participant CLI as cli.py
participant File as scanoss.json
participant Settings as ScanossSettings
participant Builder as ScanSettingsBuilder
participant Scanner as Scanner
participant API as ScanossApi
participant Server as HTTP Server
User->>CLI: run scan with flags (--min-snippet-hits, --ranking, ...)
CLI->>File: load scanoss.json (if present)
File-->>Settings: build ScanossSettings
CLI->>Builder: provide CLI args + ScanossSettings
Note right of Builder: Merge precedence: CLI > file_snippet > root
Builder-->>Scanner: return merged scanoss_settings
CLI->>Scanner: instantiate Scanner(scanoss_settings, CLI args)
Scanner->>API: instantiate ScanossApi(... tuning params ...)
API->>API: _build_scan_settings_header() -> base64(JSON of non-default tunings)
API->>Server: scan() with header "scanoss-settings: <base64-payload>"
Server-->>API: response
API-->>Scanner: scan result
Scanner-->>CLI: final output
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/scanoss/scanossapi.py (1)
58-80: Scan settings header logic is sound; update docs to match behaviour
Functionally this looks good:
- Tuning fields are stored on the instance and `_build_scan_settings_header()` emits a base64-encoded JSON blob only when values are “meaningful” (`!= 0` for snippet counts, `!= 'unset'` for booleans, `!= -1` for ranking threshold).
- `scan()` attaches the header as `scanoss-settings`, so the extra metadata rides with each request.

A few minor doc cleanups would help:
- `__init__` docstring still says the new params default to `None`, but `honour_file_exts` and `ranking` actually default to `'unset'`.
- `_build_scan_settings_header` docstring references `x-scanoss-scan-settings`; the actual header key is `scanoss-settings`.
- Consider clarifying in the docstring that `0` and `-1` are treated as “defer to server configuration” and intentionally omitted from the header.

No changes to runtime logic are needed.
Also applies to: 90-95, 173-180, 293-326
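The omission rules described in this comment can be illustrated with a small standalone function. This is only a sketch under stated assumptions, not the code under review: the function name is hypothetical, and the payload keys `min_snippet_hits`, `min_snippet_lines`, and `ranking_threshold` are assumed (only `honour_file_exts` and `ranking_enabled` appear in diffs elsewhere in this review).

```python
import base64
import json
from typing import Optional


def build_scan_settings_header(
    min_snippet_hits: int = 0,
    min_snippet_lines: int = 0,
    honour_file_exts='unset',
    ranking='unset',
    ranking_threshold: int = -1,
) -> Optional[str]:
    """Return a base64-encoded JSON payload, or None when nothing needs sending."""
    settings = {}
    # 0 (or None) means "defer to server configuration", so it is omitted
    if min_snippet_hits:
        settings['min_snippet_hits'] = min_snippet_hits  # assumed key name
    if min_snippet_lines:
        settings['min_snippet_lines'] = min_snippet_lines  # assumed key name
    # 'unset' (or None) means the user expressed no preference
    if honour_file_exts is not None and honour_file_exts != 'unset':
        settings['honour_file_exts'] = honour_file_exts
    if ranking is not None and ranking != 'unset':
        settings['ranking_enabled'] = ranking
    # -1 defers to the server; 0 is treated as a real threshold and is sent
    if ranking_threshold is not None and ranking_threshold != -1:
        settings['ranking_threshold'] = ranking_threshold  # assumed key name
    if not settings:
        return None
    return base64.b64encode(json.dumps(settings).encode('utf-8')).decode('ascii')


# Round trip: only the non-default values survive in the payload
header = build_scan_settings_header(min_snippet_hits=3, ranking=True)
decoded = json.loads(base64.b64decode(header))
```

With all defaults the function returns `None`, which corresponds to not attaching the `scanoss-settings` header at all.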
src/scanoss/scanner.py (2)
147-171: Skip-headers merge logic is correct; fix `_merge_cli_with_settings` docstring
The new logic to merge `skip_headers` and `skip_headers_limit`:
```python
settings_skip_headers = file_snippet_settings.get('skip_headers')
settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit')
skip_headers = Scanner._merge_cli_with_settings(skip_headers, settings_skip_headers)
skip_headers_limit = Scanner._merge_cli_with_settings(skip_headers_limit, settings_skip_headers_limit)
```
combined with:
```python
if settings_value is not None:
    return settings_value
return cli_value
```
correctly gives precedence to values from `scanoss.json` over CLI defaults, which is sensible.

However, the `_merge_cli_with_settings` docstring claims “Merged value with CLI taking priority over settings”, which is the opposite of what the function actually does (and what the code above relies on). I’d update the “Returns:” description to clearly state that settings take priority over CLI values.
Also applies to: 248-260
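To make the behaviour discussed in this comment concrete, here is a minimal standalone sketch of a settings-first merge. The function is a free-standing stand-in for the static method quoted above, and the `file_snippet_settings` dictionary contents are hypothetical.

```python
def merge_cli_with_settings(cli_value, settings_value):
    """Two-level merge in which a settings-file value wins over the CLI value."""
    if settings_value is not None:
        return settings_value
    return cli_value


# Hypothetical file_snippet section: skip_headers is set, skip_headers_limit is not
file_snippet_settings = {'skip_headers': True}

# Settings value present: it overrides whatever the CLI supplied
skip_headers = merge_cli_with_settings(False, file_snippet_settings.get('skip_headers'))

# Settings value absent (None): the CLI value is used as the fallback
skip_headers_limit = merge_cli_with_settings(10, file_snippet_settings.get('skip_headers_limit'))
```

Under this ordering the CLI value only acts as a fallback, which is why a docstring claiming "CLI taking priority" would be misleading.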
173-238: `post_processor` creation condition uses the wrong variable
Here:
```python
scan_settings = (ScanSettingsBuilder(scanoss_settings)
                 .with_proxy(proxy)
                 .with_url(url)
                 .with_ignore_cert_errors(ignore_cert_errors)
                 .with_min_snippet_hits(min_snippet_hits)
                 .with_min_snippet_lines(min_snippet_lines)
                 .with_honour_file_exts(honour_file_exts)
                 .with_ranking(ranking)
                 .with_ranking_threshold(ranking_threshold)
                 )
...
self.post_processor = (
    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
)
```
`scan_settings` is always a `ScanSettingsBuilder` instance, so this condition is effectively always true, even when `scanoss_settings` is `None`. That’s a behavioural change from the usual “only post-process when we actually have settings” pattern and risks constructing `ScanPostProcessor` with a `None` settings object.

If `ScanPostProcessor` assumes a non-`None` settings object, this could lead to unexpected errors; even if it handles `None`, the condition is misleading.

Suggest switching the condition to check `scanoss_settings` instead of the builder:

Proposed fix
```diff
-        self.post_processor = (
-            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
-        )
+        self.post_processor = (
+            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet)
+            if scanoss_settings
+            else None
+        )
```
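The reasoning behind this comment rests on Python truthiness: an instance of a class that defines neither `__bool__` nor `__len__` is always truthy. A minimal sketch, using a hypothetical `BuilderSketch` class rather than the real `ScanSettingsBuilder` (whose definition is not shown here):

```python
class BuilderSketch:
    """Hypothetical stand-in for a builder that wraps an optional settings object."""

    def __init__(self, settings=None):
        self.settings = settings


settings = None                    # no settings file was loaded
builder = BuilderSketch(settings)  # the builder object still exists

# Guarding on the builder: the condition passes even though settings is None
guard_on_builder = 'post-process' if builder else None

# Guarding on the settings object: the condition correctly fails
guard_on_settings = 'post-process' if settings else None
```

This holds only as long as the builder class does not override `__bool__` or `__len__`; if it did, the guard could behave differently.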
🧹 Nitpick comments (4)
tests/test_scan_settings_builder.py (1)
33-359: Good coverage of ScanSettingsBuilder behaviour
The tests exercise initialization, helper methods, all `with_*` methods, priority rules, and chaining in a realistic way using `tests/data/scanoss.json`. This gives solid confidence in the builder logic.

If you want to harden things further, consider adding a couple of tests for sentinel values (`min_snippet_hits == 0`, `ranking_threshold == -1`) to reflect the semantics used when building the API header (i.e., “defer to server config” cases).
src/scanoss/cli.py (1)
190-222: Snippet-tuning CLI options wired correctly; consider tightening validation
The new `--min-snippet-hits`, `--min-snippet-lines`, `--ranking`, `--ranking-threshold`, and `--honour-file-exts` options are consistent with how `Scanner` and `ScanSettingsBuilder` consume them, including the “0 / -1 mean defer to server config” semantics.

Two optional polish items:
- `--ranking-threshold`: you document `-1..99` but don’t enforce it; a custom `type` or validator would catch out-of-range values early.
- `--honour-file-exts`: help text is clear here, but its JSON-schema description currently says “Ignores file extensions”; aligning those descriptions would avoid confusion between positive/negative wording.
docs/source/_static/scanoss-settings-schema.json (1)
142-260: Schema additions align with code; tweak `honour_file_exts` description
The new `settings.proxy`, `settings.http_config`, `settings.file_snippet`, `settings.hpfm`, and `settings.container` blocks line up well with the new getters and builder logic in code (same field names and types, including `ranking_*`, `min_snippet_*`, `honour_file_exts`, skip header controls, etc.).

One small documentation inconsistency:
- `file_snippet.properties.honour_file_exts.description` says “Ignores file extensions…”, but the field name and CLI option (`--honour-file-exts`) are phrased positively (“honour file extensions”).

To avoid confusion, it would be clearer if this description matched the positive semantics (or explicitly stated whether `true` means “honour” vs “ignore”).
src/scanoss/scanner.py (1)
342-346: Dependency auto-enable via settings: confirm intended behaviour
`is_dependency_scan()` now returns `True` either when the dependency bit is set in `scan_options` or when `settings.file_snippet.dependency_analysis` is true:
```python
if self.scan_options & ScanType.SCAN_DEPENDENCIES.value:
    return True
file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
return file_snippet_settings.get('dependency_analysis', False)
```
This means a settings file can implicitly turn on dependency scanning even if the user didn’t pass `--dependencies`/`--dependencies-only`. That’s reasonable, but it is a change in behaviour.

If this is intentional, it might be worth documenting somewhere that `dependency_analysis` in `scanoss.json` acts as a default “enable deps scan” toggle, overridden only by explicitly disabling dependency bits in `scan_options`.
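For the `--ranking-threshold` range check suggested in the cli.py nitpick above, one option is a custom argparse `type` callable. The sketch below is illustrative only: the validator name and the standalone parser are hypothetical and are not taken from the project's `cli.py`.

```python
import argparse


def ranking_threshold_type(value: str) -> int:
    """Parse an integer and reject anything outside the documented -1..99 range."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f'invalid integer value: {value!r}')
    if not -1 <= number <= 99:
        raise argparse.ArgumentTypeError(f'value must be between -1 and 99, got {number}')
    return number


# Hypothetical standalone parser, used only to exercise the validator
parser = argparse.ArgumentParser(prog='scan-sketch')
parser.add_argument(
    '--ranking-threshold',
    type=ranking_threshold_type,
    default=None,
    help='Ranking threshold (-1 to 99; -1 defers to server configuration)',
)

# The "=" form is used here for the negative value
args = parser.parse_args(['--ranking-threshold=-1'])
```

An out-of-range value makes argparse report a usage error and exit before any scan work starts, which is the "catch it early" behaviour the comment asks for.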
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- CHANGELOG.md (2 hunks)
- CLIENT_HELP.md (1 hunks)
- docs/source/_static/scanoss-settings-schema.json (1 hunks)
- scanoss.json (1 hunks)
- src/scanoss/__init__.py (1 hunks)
- src/scanoss/cli.py (5 hunks)
- src/scanoss/scan_settings_builder.py (1 hunks)
- src/scanoss/scanner.py (9 hunks)
- src/scanoss/scanoss_settings.py (1 hunks)
- src/scanoss/scanossapi.py (6 hunks)
- tests/data/scanoss.json (1 hunks)
- tests/test_scan_settings_builder.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder (31-263), _str_to_bool (220-226), _merge_with_priority (203-209), _merge_cli_with_settings (212-216), with_proxy (69-83), with_url (85-99), with_ignore_cert_errors (101-119), with_min_snippet_hits (121-134), with_min_snippet_lines (136-149), with_honour_file_exts (151-167), with_ranking (169-184), with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
ScanossSettings (76-433)
src/scanoss/scanner.py (1)
_merge_cli_with_settings (249-260)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
ScanossSettings (76-433), load_json_file (103-138), set_file_type (140-152), set_scan_type (154-161)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (6)
ScanossSettings (76-433), get_file_snippet_settings (339-345), get_file_snippet_proxy (419-425), get_proxy (403-409), get_http_config (411-417), get_file_snippet_http_config (427-433)
src/scanoss/scanner.py (1)
_merge_cli_with_settings (249-260)
🪛 markdownlint-cli2 (0.18.1)
CHANGELOG.md
776-776: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.42.0"
(MD053, link-image-reference-definitions)
🔇 Additional comments (7)
src/scanoss/__init__.py (1)
25-25: LGTM!
Version bump to 1.43.0 appropriately reflects the new scan tuning parameters and `ScanSettingsBuilder` additions.
scanoss.json (1)
18-21: LGTM!
Adding `scanoss-winnowing.py` to the BOM includes is appropriate since it's a declared dependency used for fast winnowing functionality.
CHANGELOG.md (1)
11-20: LGTM!
The changelog entry for version 1.43.0 clearly documents the new scan engine tuning parameters and `ScanSettingsBuilder` functionality.
CLIENT_HELP.md (1)
262-304: LGTM!
Comprehensive documentation for the new snippet tuning options. The examples are clear, and the note about using `=` syntax for negative values (line 292) is a helpful detail for users.
tests/data/scanoss.json (1)
28-51: LGTM!
The test data structure effectively exercises the nested configuration hierarchy with distinct values at root and `file_snippet` levels (e.g., different proxy hosts and http_config settings), which enables proper testing of the `ScanSettingsBuilder` merge priority logic.
src/scanoss/cli.py (1)
1410-1437: scan / wfp / folder-* settings wiring looks consistent
The refactor to use `scanoss_settings` (instead of the old `scan_settings`) in `wfp()`, `scan()`, `folder_hashing_scan()`, and `folder_hash()` is coherent:
- Settings files are loaded once into `ScanossSettings`, with legacy vs new file types handled in `scan()`.
- The same `scanoss_settings` instance is threaded into `Scanner`, `ScannerHFH`, and `FolderHasher`.
- New snippet-tuning CLI options are only passed to `Scanner` for the `scan` command, which is appropriate given they affect server-side scanning, not fingerprint-only flows.

No functional issues stand out in these call sites.
Also applies to: 1518-1532, 1597-1641, 2716-2831
src/scanoss/scanoss_settings.py (1)
339-433: New settings accessors are straightforward and consistent
The added getters for `file_snippet`, proxy, and `http_config` (including `min_snippet_*`, ranking, `honour_file_exts`, skip headers, etc.) correctly mirror the underlying data structure and provide clear defaults (`None`/`0`/`False` as documented). They integrate cleanly with `ScanSettingsBuilder` and `Scanner`.
ebe0ef3 to
59b57ba
Compare
Actionable comments posted: 2
♻️ Duplicate comments (1)
CHANGELOG.md (1)
775-776: Fix duplicate link reference definition.
The duplicate link reference issue flagged in the previous review remains unresolved. Line 776 uses `[1.42.0]` as the link label, but it should be `[1.43.0]` to match the version and prevent Markdown rendering issues.
🔎 Apply this diff to fix the link label:
```diff
 [1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.41.1...v1.42.0
-[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
+[1.43.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
```
🧹 Nitpick comments (2)
src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.
Lines 206 and 219 have inconsistent spacing in the `choices` parameter:
- Line 206: `choices=['unset' ,'true', 'false']` (space before comma)
- Line 219: `choices=['unset','true', 'false']` (no space after first comma)

🔎 Apply this diff for consistent formatting:
```diff
 p_scan.add_argument(
     '--ranking',
     type=str,
-    choices=['unset' ,'true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Enable or disable ranking (optional - default: server configuration)',
 )
 p_scan.add_argument(
     '--honour-file-exts',
     type=str,
-    choices=['unset','true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
 )
```
src/scanoss/scanossapi.py (1)
75-79: Fix spacing before comma in parameter default.
Line 77 has a space before the comma: `honour_file_exts = 'unset' ,`
🔎 Apply this diff:
```diff
-        honour_file_exts = 'unset' ,
+        honour_file_exts='unset',
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- CHANGELOG.md (2 hunks)
- CLIENT_HELP.md (1 hunks)
- scanoss.json (1 hunks)
- src/scanoss/__init__.py (1 hunks)
- src/scanoss/cli.py (5 hunks)
- src/scanoss/scan_settings_builder.py (1 hunks)
- src/scanoss/scanner.py (9 hunks)
- src/scanoss/scanoss_settings.py (1 hunks)
- src/scanoss/scanossapi.py (6 hunks)
- tests/data/scanoss.json (1 hunks)
- tests/test_scan_settings_builder.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- CLIENT_HELP.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
ScanossSettings (76-433), load_json_file (103-138), set_file_type (140-152), set_scan_type (154-161)
src/scanoss/inspection/utils/file_utils.py (1)
load_json_file (29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
src/scanoss/__init__.py (1)
25-25: LGTM!
Version bump is consistent with the new features introduced in this PR.
scanoss.json (1)
18-21: LGTM!
Adding the scanoss-winnowing.py library to the BOM is appropriate for dependency tracking.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!
The new getter methods provide clean access to file_snippet settings with consistent patterns, appropriate defaults, and clear documentation.
src/scanoss/scanner.py (1)
173-182: The `ScanSettingsBuilder` implementation does not match the stated priority order.
The builder implements file_snippet > root settings > CLI, where CLI arguments are the lowest priority (used as fallback). This is the opposite of the PR objectives which require CLI > file_snippet > root settings. The priority order logic must be reversed in the merge methods to ensure CLI arguments take precedence when provided.
Likely an incorrect or invalid review comment.
59b57ba to
0fb75e9
Compare
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/scanoss/scanner.py (1)
147-157: Priority order contradicts PR objectives.
Lines 153-155 use `_merge_cli_with_settings` which implements "settings > cli" priority (as seen in lines 258-260). However, the PR objectives explicitly state: "priority: CLI > file_snippet > root settings".

This means CLI arguments should override settings file values, but the current implementation does the opposite for `skip_headers` and `skip_headers_limit`.
🔎 Apply this diff to fix the priority order:
```diff
     @staticmethod
     def _merge_cli_with_settings(cli_value, settings_value):
-        """Merge CLI value with settings value (two-level priority: settings > cli).
+        """Merge CLI value with settings value (two-level priority: cli > settings).

         Args:
             cli_value: Value from CLI argument
             settings_value: Value from scanoss.json file_snippet settings

         Returns:
-            Merged value with CLI taking priority over settings
+            Merged value with CLI taking priority over settings file
         """
+        if cli_value is not None:
+            return cli_value
         if settings_value is not None:
             return settings_value
-        return cli_value
+        return None
```
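One consequence of the CLI-first variant proposed in this comment is worth spelling out: it only lets the settings file take effect when an unspecified CLI option arrives as `None`. The sketch below is a standalone illustration with hypothetical values; it makes no claim about what defaults the project's CLI actually uses for these options.

```python
def merge_cli_first(cli_value, settings_value):
    """Two-level merge in which an explicit CLI value wins over the settings file."""
    if cli_value is not None:
        return cli_value
    if settings_value is not None:
        return settings_value
    return None


# CLI option not given and defaulting to None: the settings value is used
from_settings = merge_cli_first(None, 5)

# CLI option given explicitly: the CLI value wins
from_cli = merge_cli_first(2, 5)

# CLI default is a concrete value such as False: the settings value is never consulted
shadowed = merge_cli_first(False, True)
```

So if any option's argparse default is a concrete value rather than `None`, reversing the priority would silently disable the corresponding settings-file entry.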
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (1)
75-79: Consider type enforcement for boolean parameters.
The `honour_file_exts` and `ranking` parameters use `Union[bool, str, None]` to support both boolean values and the 'unset' sentinel string. While the `ScanSettingsBuilder` correctly converts string inputs to booleans before passing them to `ScanossApi`, direct instantiation of `ScanossApi` could lead to strings like 'true'/'false' being serialized as JSON strings rather than booleans.

Consider adding runtime validation in `_build_scan_settings_header()` to ensure these values are booleans when included:
🔎 Suggested fix
```diff
         # honour_file_exts: None = unset, don't send
         if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-            settings['honour_file_exts'] = self.honour_file_exts
+            settings['honour_file_exts'] = bool(self.honour_file_exts) if not isinstance(self.honour_file_exts, bool) else self.honour_file_exts
         # ranking: None = unset, don't send
         if self.ranking is not None and self.ranking != 'unset':
-            settings['ranking_enabled'] = self.ranking
+            settings['ranking_enabled'] = bool(self.ranking) if not isinstance(self.ranking, bool) else self.ranking
```
Note that `bool('false')` evaluates to `True` in Python, so string inputs need an explicit 'true'/'false' mapping (as `_str_to_bool` does) rather than a bare `bool()` cast.
tests/test_scan_settings_builder.py (1)
36-38: Consider using `setUpClass` for shared fixtures.
The `scan_settings` fixture is created at class definition time. While this works, using `setUpClass` is the conventional approach for shared test fixtures in unittest:
🔎 Suggested refactor
```diff
 class TestScanSettingsBuilder(unittest.TestCase):
     """Tests for the ScanSettingsBuilder class."""
-    script_dir = os.path.dirname(os.path.abspath(__file__))
-    scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve()
-    scan_settings = ScanossSettings(filepath=scan_settings_path)
+    @classmethod
+    def setUpClass(cls):
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        cls.scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve()
+        cls.scan_settings = ScanossSettings(filepath=str(cls.scan_settings_path))
```
src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.
Lines 206 and 219 have inconsistent spacing in the `choices` parameter:
- Line 206: `choices=['unset' ,'true', 'false']` has a space before the comma after 'unset'
- Line 219: `choices=['unset','true', 'false']` is missing a space after the 'unset' comma

🔎 Apply this diff to fix the spacing:
```diff
 p_scan.add_argument(
     '--ranking',
     type=str,
-    choices=['unset' ,'true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Enable or disable ranking (optional - default: server configuration)',
 )
 p_scan.add_argument(
     '--ranking-threshold',
     type=int,
     default=None,
     help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
 )
 p_scan.add_argument(
     '--honour-file-exts',
     type=str,
-    choices=['unset','true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
 )
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- CHANGELOG.md (2 hunks)
- CLIENT_HELP.md (1 hunks)
- scanoss.json (1 hunks)
- src/scanoss/__init__.py (1 hunks)
- src/scanoss/cli.py (5 hunks)
- src/scanoss/scan_settings_builder.py (1 hunks)
- src/scanoss/scanner.py (9 hunks)
- src/scanoss/scanoss_settings.py (1 hunks)
- src/scanoss/scanossapi.py (6 hunks)
- tests/data/scanoss.json (1 hunks)
- tests/test_scan_settings_builder.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- scanoss.json
- CHANGELOG.md
- src/scanoss/scan_settings_builder.py
- src/scanoss/__init__.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug (58-63)
src/scanoss/spdxlite.py (1)
print_debug (61-66)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder (31-266), _str_to_bool (223-229), _merge_with_priority (203-209), _merge_cli_with_settings (212-219), with_proxy (69-83), with_url (85-99), with_ignore_cert_errors (101-119), with_min_snippet_hits (121-134), with_min_snippet_lines (136-149), with_honour_file_exts (151-167), with_ranking (169-184), with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
ScanossSettings (76-433)
src/scanoss/scanner.py (1)
_merge_cli_with_settings (249-260)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder (31-266), _merge_cli_with_settings (212-219), with_proxy (69-83), with_url (85-99), with_ignore_cert_errors (101-119), with_min_snippet_hits (121-134), with_min_snippet_lines (136-149), with_honour_file_exts (151-167), with_ranking (169-184), with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings (339-345)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (14)
tests/data/scanoss.json (1)
29-51: LGTM!
The test data file is well-structured with a clear hierarchy between root settings and file_snippet settings, enabling comprehensive testing of the priority-based configuration merging logic.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!
The new accessor methods for file_snippet settings are well-implemented with:
- Consistent patterns across all getters
- Proper Optional return types
- Clear docstrings documenting expected return values
- Clean separation between root-level and file_snippet-level configuration access
src/scanoss/scanossapi.py (2)
176-179: LGTM!
The integration of the `scanoss-settings` header into the scan method is clean and correct. The header is only added when meaningful settings are configured.
293-326: LGTM!
The `_build_scan_settings_header` method correctly:
- Excludes sentinel values (0, 'unset', -1, None) from the serialized payload
- Uses base64 encoding for safe header transmission
- Logs the settings for debugging before encoding
- Returns None when no settings need to be sent
CLIENT_HELP.md (1)
262-304: LGTM!
The documentation for the new snippet tuning options is comprehensive and well-structured:
- Clear explanations of each flag's purpose
- Practical usage examples
- Helpful note about `=` syntax for negative threshold values
- Example showing how to combine multiple options
tests/test_scan_settings_builder.py (2)
44-117: LGTM!
Excellent test coverage for:
- Initialization with None and with settings object
- Static helper methods (`_str_to_bool`, `_merge_with_priority`, `_merge_cli_with_settings`)
- Edge cases like all-None inputs and case-insensitive boolean parsing

119-358: LGTM!
Comprehensive test coverage for all `with_*` methods including:
- CLI-only scenarios (no settings file)
- Settings file values taking priority over CLI
- Method chaining verification
- All tuning parameters: proxy, url, ignore_cert_errors, min_snippet_hits, min_snippet_lines, honour_file_exts, ranking, ranking_threshold
src/scanoss/cli.py (3)
1410-1418: LGTM!
Settings loading is correctly refactored to use `scanoss_settings` instead of `scan_settings`, maintaining consistency with the rest of the codebase.
1518-1535: LGTM!
Settings loading logic correctly handles different scenarios (identify, ignore, and standard settings) while maintaining backward compatibility.
1631-1641: LGTM!
New snippet tuning parameters are correctly wired from CLI arguments to the Scanner constructor, enabling configuration of snippet matching behavior.
src/scanoss/scanner.py (4)
110-117: LGTM!
Constructor signature correctly updated to accept the new snippet tuning parameters and renamed `scanoss_settings` for clarity. The new parameters enable fine-grained control over snippet matching behavior.
186-206: LGTM!
The `ScanossApi` is correctly initialized with merged settings from `ScanSettingsBuilder`, properly wiring through the new snippet tuning parameters.
337-345: LGTM!
The `is_dependency_scan` method correctly falls back to checking `dependency_analysis` in the `file_snippet` settings when the explicit scan option is not set, providing flexible configuration.
173-184: Correct the stated priority order in review comment.
The review comment incorrectly states the PR objectives. The ScanSettingsBuilder class documentation specifies priority as: file_snippet section > settings section > CLI arguments, and the `_merge_with_priority` implementation correctly follows this priority order. The code is working as designed; the review comment's stated objectives are reversed.
Likely an incorrect or invalid review comment.
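The three-level order named in this comment (file_snippet section > settings section > CLI arguments) can be sketched as a "first non-None wins" helper. This is an assumption-laden illustration: the free function below is hypothetical, it assumes arguments are passed highest priority first, and the real `_merge_with_priority` may order its arguments differently. The proxy URLs are made up.

```python
def merge_with_priority(*values):
    """Return the first value that is not None, or None if every value is None."""
    for value in values:
        if value is not None:
            return value
    return None


# Hypothetical inputs: nothing in file_snippet, a root-level value, and a CLI value
file_snippet_value = None
root_value = 'http://root-proxy.example:8080'
cli_value = 'http://cli-proxy.example:8080'

# Highest priority first: file_snippet, then root settings, then CLI
resolved = merge_with_priority(file_snippet_value, root_value, cli_value)
```

Here the root-level value wins because the file_snippet entry is absent, and the CLI value is only reached when both settings levels are `None`.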
0fb75e9 to
c62cf42
Compare
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/scanoss/cli.py (1)
1523-1528: Fix flag name typo in conflicting settings error message.
The error says `--skip-file-settings`, but the actual flag is `--skip-settings-file`. This could confuse users when they hit the conflict.
Suggested message fix
```diff
-    if args.settings and args.skip_settings_file:
-        print_stderr('ERROR: Cannot specify both --settings and --skip-file-settings options.')
+    if args.settings and args.skip_settings_file:
+        print_stderr('ERROR: Cannot specify both --settings and --skip-settings-file options.')
```
🤖 Fix all issues with AI Agents
In @docs/source/_static/scanoss-settings-schema.json:
- Around line 225-229: The JSON schema description for honour_file_exts is
inverted relative to its semantics: update the description of the
honour_file_exts property to state that True means file extensions are honoured
and False means extensions are ignored (matching
ScanossSettings.get_honour_file_exts and the CLI help); keep the type/default
the same but change the description wording to reflect "When true, honour file
extensions; when false, ignore file extensions; when null/unspecified, defer to
server configuration."
In @src/scanoss/scan_settings_builder.py:
- Around line 102-121: The precedence is inverted because _merge_with_priority
prefers later args, so adjust with_ignore_cert_errors to let the CLI flag win by
moving the CLI-derived value (cli_value if cli_value else None) to the last
argument position in the _merge_with_priority call (keep using
_get_file_snippet_http_config_value('ignore_cert_errors') and
_get_http_config_value('ignore_cert_errors') as the earlier args), ensure you
still treat a False CLI as “not provided” (None), and add a unit test where
settings set ignore_cert_errors=False and CLI passes True to verify CLI
overrides the setting.
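The behaviour requested for `ignore_cert_errors` (a `True` CLI flag wins, a `False` flag counts as "not provided") can be sketched independently of the real builder. Everything below is hypothetical: the function name, the highest-priority-first ordering, and the final `False` fallback are assumptions made for illustration.

```python
def resolve_ignore_cert_errors(cli_flag, file_snippet_value, root_value):
    """Resolve ignore_cert_errors so that a True CLI flag overrides both settings levels."""
    # A store_true style flag that is False carries no information: treat it as unset
    cli_value = True if cli_flag else None
    for candidate in (cli_value, file_snippet_value, root_value):
        if candidate is not None:
            return candidate
    return False  # assumed default when nothing is configured anywhere


# The case the comment asks to lock in with a unit test:
# settings say False, CLI passes True, and the CLI value must win
cli_overrides = resolve_ignore_cert_errors(True, False, None)

# CLI flag absent: the file_snippet value is used, then the root value
from_file_snippet = resolve_ignore_cert_errors(False, False, True)
from_root = resolve_ignore_cert_errors(False, None, True)
```

A side effect of this design is that the CLI can only switch certificate checking off, never force it back on, since a `False` flag is indistinguishable from an omitted one.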
🧹 Nitpick comments (5)
CLIENT_HELP.md (1)
297-303: Clarify ranking-threshold valid range to include -1 in the parenthetical.
The text says “valid range: 0-99” but then notes that `-1` is also valid to defer to server config. To avoid confusion, you could fold `-1` into the stated range.
Proposed wording tweak
```diff
-Set the ranking threshold to 50 (valid range: 0-99). A value of -1 defers to server configuration:
+Set the ranking threshold to 50 (valid range: -1 to 99, where -1 defers to server configuration):
```
tests/test_scan_settings_builder.py (1)
211-233: Add tests for `with_snippet_range_tolerance` (and optionally CLI/settings precedence).
You exercise all the `with_*` methods except `with_snippet_range_tolerance`, so that path isn’t covered if its merge logic ever regresses.
Sketch of additional tests
```python
def test_with_snippet_range_tolerance_cli_only(self):
    """Test with_snippet_range_tolerance uses CLI value."""
    builder = ScanSettingsBuilder(None)
    builder.with_snippet_range_tolerance(5)
    self.assertEqual(builder.snippet_range_tolerance, 5)

def test_with_snippet_range_tolerance_from_settings(self):
    """Test with_snippet_range_tolerance from settings."""
    builder = ScanSettingsBuilder(self.scan_settings)
    builder.with_snippet_range_tolerance(None)
    # Assuming tests/data/scanoss.json has file_snippet.snippet_range_tolerance = 7
    self.assertEqual(builder.snippet_range_tolerance, 7)
```
Optionally, consider a test that sets `ignore_cert_errors=False` in settings and passes `cli_value=True` to `with_ignore_cert_errors` to lock in the intended precedence.
Also applies to: 317-331
src/scanoss/scan_settings_builder.py (1)
src/scanoss/scan_settings_builder.py (1)
63-68: Tighten type hints for `honour_file_exts`/`ranking` and `_str_to_bool`.
`self.honour_file_exts` and `self.ranking` are annotated as `Optional[any]`, and `_str_to_bool` is typed as taking `str` only, but in practice they accept `bool` and `None` as well. Cleaning these up will help static analysis and future readers.
Possible type-hint improvements
```diff
-from typing import TYPE_CHECKING, Optional
+from typing import TYPE_CHECKING, Optional, Any, Union
@@
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Any] = None
+        self.ranking: Optional[Any] = None
@@
-    def _str_to_bool(value: str) -> Optional[bool]:
-        """Convert string 'true'/'false' to boolean."""
-        if value is None:
-            return None
-        if isinstance(value, bool):
-            return value
-        return value.lower() == 'true'
+    def _str_to_bool(value: Union[str, bool, None]) -> Optional[bool]:
+        """Convert 'true'/'false' (case-insensitive) or bools to a boolean, preserving None."""
+        if value is None:
+            return None
+        if isinstance(value, bool):
+            return value
+        return value.lower() == 'true'
```
Also applies to: 238-245
src/scanoss/cli.py (1)
src/scanoss/cli.py (1)
1400-1447: Consider reusing `get_scanoss_settings_from_args` to DRY settings loading in `scan`/`wfp`.
`scan()` and `wfp()` manually construct and load `ScanossSettings`, while `folder_hashing_scan`/`folder_hash` now use `get_scanoss_settings_from_args()`. The logic is very similar (skip flag handling, settings path, scan_dir root), which makes future changes to settings-loading behaviour harder to keep in sync.
Potential consolidation approach
- Extend `get_scanoss_settings_from_args(args)` slightly so it can handle the `identify`/`ignore` legacy SBOM modes as in `scan()`.
- Use it in `scan()` and `wfp()` instead of open-coding the `ScanossSettings(...)` + `load_json_file(...)` blocks.

That would give you a single, well-tested path for initialising `scanoss_settings` across scan, wfp, folder-scan, and folder-hash, and reduce the chance of precedence/behaviour drift between commands.
Also applies to: 1530-1548, 2830-2839
src/scanoss/data/scanoss-settings-schema.json (1)
src/scanoss/data/scanoss-settings-schema.json (1)
200-206: Clarify `ranking_threshold` type and default value.

The `ranking_threshold` property allows both `integer` and `null` types but defaults to `0`. This creates ambiguity: if `null` means "not configured", why default to `0`? According to the description, `-1` defers to server configuration. Consider:

- If `0` is a meaningful threshold distinct from "unset", remove `null` from the type array and keep `default: 0`.
- If "unset" should be represented, use `default: null` (or `default: -1` per the description) and clarify that `null` and `-1` both defer to server configuration.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- CHANGELOG.md
- CLIENT_HELP.md
- docs/source/_static/scanoss-settings-schema.json
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/data/scanoss.json
- scanoss.json
- src/scanoss/scanpostprocessor.py
- src/scanoss/scanner.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (6)
- `ScanossSettings` (76-441)
- `get_file_snippet_settings` (339-345)
- `get_file_snippet_proxy` (427-433)
- `get_proxy` (411-417)
- `get_http_config` (419-425)
- `get_file_snippet_http_config` (435-441)

src/scanoss/scanner.py (1)
- `_merge_cli_with_settings` (247-258)

src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
- `print_debug` (58-63)

src/scanoss/spdxlite.py (1)
- `print_debug` (61-66)

src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
- `ScanossSettings` (76-441)
- `load_json_file` (103-138)
- `set_file_type` (140-152)
- `set_scan_type` (154-161)

src/scanoss/inspection/utils/file_utils.py (1)
- `load_json_file` (29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (3)
CHANGELOG.md (1)
11-22: 1.44.0 changelog entry and compare links look consistent.

The new release notes correctly describe the tuning parameters and settings/schema changes, and the added `[1.43.1]`/`[1.44.0]` compare links match the version sequence.

Also applies to: 787-788

src/scanoss/__init__.py (1)
25-25: Version bump aligns with changelog entry.

The `__version__ = '1.44.0'` value matches the new 1.44.0 changelog section and the PR intent.

src/scanoss/scanoss_settings.py (1)
339-441: LGTM! Clean accessor design with clear separation of concerns.

The new getters provide a clean API for accessing file_snippet settings. The separation between root-level and file_snippet-level proxy/http_config accessors is well-designed, leaving priority resolution to the caller (`ScanSettingsBuilder`). The inconsistency between getters that return `Optional` (returning `None` for missing keys) and those that provide defaults (`skip_headers`, `skip_headers_limit`) aligns with the schema's default specifications and is appropriate for their usage patterns.
121463c to
1656520
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/scanoss/scanossapi.py`:
- Around line 75-79: The constructor for the ScanOSS client is missing the
snippet_range_tolerance parameter referenced in the docstring; add
snippet_range_tolerance: int = None to the __init__ signature, assign it to an
instance variable (self.snippet_range_tolerance) alongside the other min_*
fields, and update the _build_scan_settings_header method to include this value
when assembling scan settings so the documented behavior matches the
implementation.
♻️ Duplicate comments (1)
src/scanoss/scan_settings_builder.py (1)
101-119: `with_ignore_cert_errors` precedence doesn't match docstring.

The docstring states "CLI True > file_snippet > settings > False", implying that passing `True` on the CLI should always win. However, the implementation uses `_merge_with_priority`, which returns the `file_snippet` value if present, regardless of the CLI value.

If a user explicitly passes `--ignore-cert-errors` on the CLI but the settings file has `ignore_cert_errors: false`, the settings file will override the CLI flag, which contradicts user expectations.

🔧 Suggested fix to honor CLI True

```diff
 def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
     """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.

     Note: CLI value only takes effect if True (flag present). False means
     the flag was not provided, so settings file values are checked.
     """
+    # Explicit CLI True should always win
+    if cli_value:
+        self.ignore_cert_errors = True
+        return self
+
+    # Otherwise, fall back to file_snippet then root, defaulting to False
     result = self._merge_with_priority(
-        cli_value if cli_value else None,
+        None,
         self._get_file_snippet_http_config_value('ignore_cert_errors'),
         self._get_http_config_value('ignore_cert_errors')
     )
     self.ignore_cert_errors = result if result is not None else False
     return self
```
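The precedence stated in the docstring can be illustrated independently of the builder. The following is a minimal sketch, not the project's code: `resolve_ignore_cert_errors` and its parameter names are hypothetical, and it assumes the CLI flag is a plain boolean where `False` means "flag not provided".

```python
from typing import Optional


def resolve_ignore_cert_errors(
    cli_flag: bool,
    file_snippet_value: Optional[bool],
    root_value: Optional[bool],
) -> bool:
    """Resolve ignore_cert_errors as: CLI True > file_snippet > root > False.

    Hypothetical helper for illustration only.
    """
    # A boolean flag is only meaningful when present, so True wins outright.
    if cli_flag:
        return True
    # Flag absent: fall back through the settings levels, skipping unset (None) values.
    for value in (file_snippet_value, root_value):
        if value is not None:
            return value
    # Nothing configured anywhere: default to verifying certificates.
    return False
```

With this shape, an explicit CLI `True` is never overridden by `ignore_cert_errors: false` in a settings file, which is the case the review comment is concerned about.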
🧹 Nitpick comments (3)
src/scanoss/scan_settings_builder.py (2)
164-167: Missing space before `!=` operator.

Line 165 has `self.honour_file_exts!= 'unset'`, which should be `self.honour_file_exts != 'unset'` for consistency with PEP 8 style.

🔧 Suggested fix

```diff
-        if self.honour_file_exts is not None and self.honour_file_exts!= 'unset':
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
```
63-67: Consider using tighter type hints for `honour_file_exts` and `ranking`.

Lines 65-66 use `Optional[any]`, which is overly permissive. These values can be `None`, `bool`, or the string `'unset'`. Using `Optional[Union[bool, str]]` would better document the expected types.

🔧 Suggested improvement

```diff
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Union[bool, str]] = None
+        self.ranking: Optional[Union[bool, str]] = None
```

This requires adding `Union` to the imports on line 25.

tests/test_scan_settings_builder.py (1)
200-205: Consider adding test for CLI True vs settings False scenario.

The test `test_with_ignore_cert_errors_cli_true_overrides` passes when both CLI and settings are `True`, but doesn't verify the edge case where CLI is `True` and settings explicitly set `ignore_cert_errors: false`. This test case would validate (or expose) the precedence issue flagged in the builder implementation.

🧪 Suggested additional test

```python
def test_with_ignore_cert_errors_cli_true_overrides_settings_false(self):
    """Test with_ignore_cert_errors CLI True should override settings False."""
    # This test would require test data with ignore_cert_errors: false
    # or mocking the settings to return False
    # Expected behavior per docstring: CLI True should win
    pass
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- CHANGELOG.md
- CLIENT_HELP.md
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
✅ Files skipped from review due to trivial changes (2)
- src/scanoss/__init__.py
- src/scanoss/scanner.py
🚧 Files skipped from review as they are similar to previous changes (5)
- src/scanoss/data/scanoss-settings-schema.json
- tests/data/scanoss.json
- scanoss.json
- CHANGELOG.md
- src/scanoss/cli.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
- `get_file_snippet_settings` (339-345)
- `get_file_snippet_proxy` (427-433)
- `get_proxy` (411-417)
- `get_http_config` (419-425)
- `get_file_snippet_http_config` (435-441)

src/scanoss/scanner.py (1)
- `_merge_cli_with_settings` (244-255)

src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
- `print_debug` (58-63)

src/scanoss/spdxlite.py (1)
- `print_debug` (61-66)

src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
- `is_legacy` (315-317)
- `get_bom_remove` (210-218)
- `get_bom_replace` (220-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (13)
CLIENT_HELP.md (1)
262-314: LGTM!

The documentation for the new snippet tuning options is well-structured, consistent with existing help sections, and provides clear examples. The note about using `=` syntax for negative values (line 302) is a helpful detail for users.

src/scanoss/scanossapi.py (2)
294-327: LGTM!

The `_build_scan_settings_header` method correctly builds the base64-encoded JSON payload, appropriately omitting sentinel values (`0`, `-1`, `'unset'`) to let the server use its defaults. The debug logging before encoding is useful for troubleshooting.

177-180: LGTM!

The scan settings header is conditionally attached only when there are configured tuning parameters, avoiding unnecessary header overhead for default scans.
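The header-building behaviour described in these two comments can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code in `scanossapi.py`: the function name and signature are hypothetical, and the JSON key names other than `ranking_enabled` are assumed to mirror the parameter names.

```python
import base64
import json
from typing import Optional, Union


def build_scan_settings_header(
    min_snippet_hits: Optional[int] = None,
    min_snippet_lines: Optional[int] = None,
    ranking: Union[bool, str, None] = None,
    ranking_threshold: Optional[int] = None,
    honour_file_exts: Union[bool, str, None] = None,
) -> Optional[str]:
    """Return a base64-encoded JSON payload, or None if nothing needs sending."""
    settings = {}
    # 0 is treated as "defer to server" for the hit/line minimums.
    if min_snippet_hits is not None and min_snippet_hits != 0:
        settings['min_snippet_hits'] = min_snippet_hits
    if min_snippet_lines is not None and min_snippet_lines != 0:
        settings['min_snippet_lines'] = min_snippet_lines
    # 'unset' is the sentinel for the tri-state boolean options.
    if ranking is not None and ranking != 'unset':
        settings['ranking_enabled'] = ranking
    # -1 is the sentinel for the threshold.
    if ranking_threshold is not None and ranking_threshold != -1:
        settings['ranking_threshold'] = ranking_threshold
    if honour_file_exts is not None and honour_file_exts != 'unset':
        settings['honour_file_exts'] = honour_file_exts
    if not settings:
        return None
    return base64.b64encode(json.dumps(settings).encode('utf-8')).decode('utf-8')
```

A caller would attach the result to the request only when it is not `None`, for example under the `scanoss-settings` header named in the walkthrough.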
src/scanoss/scanoss_settings.py (2)
338-410: LGTM!

The new accessor methods for file_snippet settings are clean, consistent with existing patterns, and provide sensible defaults for `skip_headers_limit` (0) and `skip_headers` (False). The return type annotations are appropriate.

411-441: LGTM!

The proxy and http_config accessors at both root and file_snippet levels are well-structured and follow the established patterns in the class.
src/scanoss/scan_settings_builder.py (1)
201-229: LGTM!

The static helper methods are clean and well-documented. The `_str_to_bool` method correctly handles `None`, boolean passthrough, and case-insensitive string conversion.

tests/test_scan_settings_builder.py (2)
33-38: LGTM!

Good use of class-level fixtures to load the settings file once for all tests, improving test efficiency.

336-358: LGTM!

Excellent comprehensive test for method chaining that validates all builder methods can be chained together and produce expected results.

src/scanoss/scanpostprocessor.py (5)
src/scanoss/scanpostprocessor.py (5)
81-100: LGTM!

The parameter and attribute rename from `scan_settings` to `scanoss_settings` aligns with the naming convention introduced in this PR. The docstring is updated accordingly.

112-120: LGTM!

Renaming `result` to `entry` improves clarity by distinguishing the extracted single item from the loop variable and aligns with the naming used in `_replace_purls`.

129-133: LGTM!

The use of `self.scanoss_settings.is_legacy()` correctly delegates legacy detection to the settings object, and the early return with a warning message is appropriate.

142-142: LGTM!

The method correctly delegates BOM removal retrieval to `scanoss_settings.get_bom_remove()`.

155-163: LGTM!

The changes correctly delegate BOM replacement retrieval to `scanoss_settings.get_bom_replace()`, and the variable rename to `entry` improves clarity. The list wrapping on line 163 maintains consistent output format.
1656520 to
13e2505
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/scanoss/scanossapi.py (1)
75-109: Add `snippet_range_tolerance` parameter to match documented CLI option.

The `--snippet-range-tolerance` option is documented in CLIENT_HELP.md as a snippet tuning parameter, but it is not implemented in the `ScanossApi` class. Unlike the other tuning parameters (`min_snippet_hits`, `min_snippet_lines`, `honour_file_exts`, `ranking`, `ranking_threshold`), `snippet_range_tolerance` is missing from:

- The `ScanossApi` constructor parameters
- The `ScanSettingsBuilder` class (no attribute or `with_snippet_range_tolerance()` method)
- The `_build_scan_settings_header()` method that encodes settings for the API request

The parameter exists only as a getter in `scanoss_settings.py` but is not connected to the API layer.
♻️ Duplicate comments (4)
src/scanoss/scan_settings_builder.py (1)
101-119: `with_ignore_cert_errors` precedence doesn't match docstring.

The docstring states "CLI True > file_snippet > settings > False", implying that if the CLI flag is `True`, it should always win. However, `_merge_with_priority` returns `file_snippet_value` first if not `None`, so settings can override an explicit `--ignore-cert-errors` flag.

If the docstring is correct and CLI `True` should win:

🔧 Fix to honour CLI True over settings

```diff
 def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
     """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.

     Note: CLI value only takes effect if True (flag present). False means
     the flag was not provided, so settings file values are checked.
     """
+    # CLI True always wins
+    if cli_value:
+        self.ignore_cert_errors = True
+        return self
+
     result = self._merge_with_priority(
-        cli_value if cli_value else None,
+        None,
         self._get_file_snippet_http_config_value('ignore_cert_errors'),
         self._get_http_config_value('ignore_cert_errors')
     )
     self.ignore_cert_errors = result if result is not None else False
     return self
```

Alternatively, if the current implementation is correct (settings override CLI), update the docstring to match.
src/scanoss/data/scanoss-settings-schema.json (1)
225-229: Fix misleading `honour_file_exts` description.

The description says "Ignores file extensions" but the field name `honour_file_exts` and CLI help describe it as "Honour file extensions". This contradiction will confuse users and tooling consuming the schema.

🔧 Suggested fix

```diff
 "honour_file_exts": {
   "type": ["boolean", "null"],
-  "description": "Ignores file extensions. When not set, defers to server configuration.",
+  "description": "Honour file extensions during matching. When not set, defers to server configuration.",
   "default": true
 },
```

src/scanoss/scanner.py (2)
231-233: Condition checks wrong variable.

Line 232 checks `if scan_settings` (the builder result) but passes `scanoss_settings` (the original settings object) to `ScanPostProcessor`. This mismatch means the post-processor could be created with `scanoss_settings` even when the `scan_settings` builder result is falsy, or vice versa.

🔧 Suggested fix

```diff
 self.post_processor = (
-    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
+    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None
 )
```
243-256: Priority order contradicts PR objectives.

The `_merge_cli_with_settings` method gives priority to `settings_value` over `cli_value` (lines 253-255), but the PR objectives explicitly state: "priority: CLI > file_snippet > root settings". CLI arguments should override settings file values.

The docstring on line 251 also contradicts the implementation: it says "CLI taking priority over settings" but the code does the opposite.

🔧 Suggested fix

```diff
 @staticmethod
 def _merge_cli_with_settings(cli_value, settings_value):
-    """Merge CLI value with settings value (two-level priority: settings > cli).
+    """Merge CLI value with settings value (two-level priority: cli > settings).

     Args:
         cli_value: Value from CLI argument
         settings_value: Value from scanoss.json file_snippet settings

     Returns:
         Merged value with CLI taking priority over settings
     """
-    if settings_value is not None:
-        return settings_value
-    return cli_value
+    if cli_value is not None:
+        return cli_value
+    return settings_value
```
🧹 Nitpick comments (5)
src/scanoss/scan_settings_builder.py (1)
65-67: Consider using precise type hints for `honour_file_exts` and `ranking`.

`Optional[any]` is not idiomatic Python; `any` is not a type. These fields can be `None`, `'unset'`, or `bool` after processing.

♻️ Suggested type hints

```diff
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Union[bool, str]] = None
+        self.ranking: Optional[Union[bool, str]] = None
```

Add `Union` to the imports:

```diff
-from typing import TYPE_CHECKING, Optional
+from typing import TYPE_CHECKING, Optional, Union
```

src/scanoss/cli.py (1)
193-225: Minor formatting inconsistency in choices lists.

The choices for `--ranking` and `--honour-file-exts` have inconsistent spacing:

- Line 209: `['unset' ,'true', 'false']` - extra space before comma
- Line 222: `['unset','true', 'false']` - missing space after 'unset'

This doesn't affect functionality but is inconsistent.

♻️ Consistent formatting

```diff
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
```

```diff
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],
```

src/scanoss/scanner.py (2)
142-151: Consider guarding against legacy settings files.

When `scanoss_settings` is provided but uses a legacy format (without the `settings.file_snippet` structure), calling `get_file_snippet_settings()` may return an empty dict, which is safe. However, for clarity and to avoid mixing new semantics into the legacy path, consider explicitly checking `is_legacy()`.

♻️ Suggested improvement

```diff
         # Get settings values for skip_headers options
-        file_snippet_settings = scanoss_settings.get_file_snippet_settings() if scanoss_settings else {}
+        file_snippet_settings = (
+            scanoss_settings.get_file_snippet_settings()
+            if scanoss_settings and not scanoss_settings.is_legacy()
+            else {}
+        )
```
339-340: Potential AttributeError on legacy settings.

`is_dependency_scan` accesses `get_file_snippet_settings()` when `self.scanoss_settings` exists, but doesn't guard against legacy settings. This is consistent with the pattern at lines 142-144 but could benefit from the same legacy guard.

♻️ Suggested improvement

```diff
-        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
-        return file_snippet_settings.get('dependency_analysis', False)
+        if not self.scanoss_settings or self.scanoss_settings.is_legacy():
+            return False
+        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings()
+        return file_snippet_settings.get('dependency_analysis', False)
```

src/scanoss/scanoss_settings.py (1)
395-410: Inconsistent return type handling for skip_headers vs other boolean settings.

`get_skip_headers()` returns `bool` with a default of `False`, while `get_ranking_enabled()` and `get_honour_file_exts()` return `Optional[bool]`. This inconsistency means callers cannot distinguish between "explicitly set to False" and "not set" for `skip_headers`.

If this is intentional (skip_headers defaults to False when unset), consider documenting this distinction. Otherwise, align with the `Optional[bool]` pattern.

♻️ For consistency with other boolean settings

```diff
-    def get_skip_headers(self) -> bool:
+    def get_skip_headers(self) -> Optional[bool]:
         """
         Get whether to skip headers
         Returns:
-            bool: True to skip headers, False otherwise (default)
+            bool or None: True to skip headers, False to not skip, None if not set
         """
-        return self.get_file_snippet_settings().get('skip_headers', False)
+        return self.get_file_snippet_settings().get('skip_headers')
```
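To show why the `Optional[bool]` return matters to a caller that merges values, here is a small sketch. It uses a plain dict in place of the real settings object, and both helper names are hypothetical.

```python
from typing import Optional


def get_flag_with_default(section: dict, key: str) -> bool:
    """Default-to-False accessor: 'unset' and 'explicitly False' look the same."""
    return section.get(key, False)


def get_flag_optional(section: dict, key: str) -> Optional[bool]:
    """Optional accessor: returns None when the key is absent."""
    return section.get(key)


def merge(cli_value: Optional[bool], settings_value: Optional[bool]) -> Optional[bool]:
    """Two-level merge that only falls through when the first value is None."""
    return cli_value if cli_value is not None else settings_value
```

With the defaulting accessor, an absent key yields `False`, which a `None`-based merge then treats as an explicit choice; the optional accessor keeps "not set" distinguishable so a lower-priority source can still supply the value.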
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- CHANGELOG.md
- CLIENT_HELP.md
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
✅ Files skipped from review due to trivial changes (1)
- src/scanoss/__init__.py
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/data/scanoss.json
- scanoss.json
- tests/test_scan_settings_builder.py
- CLIENT_HELP.md
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
- `print_debug` (58-63)

src/scanoss/spdxlite.py (1)
- `print_debug` (61-66)

src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
- `get_file_snippet_settings` (339-345)
- `get_file_snippet_proxy` (427-433)
- `get_proxy` (411-417)
- `get_http_config` (419-425)
- `get_file_snippet_http_config` (435-441)

src/scanoss/scanner.py (1)
- `_merge_cli_with_settings` (244-255)

src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
- `is_legacy` (315-317)
- `get_bom_remove` (210-218)
- `get_bom_replace` (220-228)

src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
- `ScanossSettings` (76-441)
- `load_json_file` (103-138)
- `set_file_type` (140-152)
- `set_scan_type` (154-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (28)
CHANGELOG.md (2)
11-21: LGTM!

The changelog entry accurately documents the new scan engine tuning parameters and follows the Keep a Changelog format. The new CLI options, `file_snippet` schema section, and `ScanSettingsBuilder` class are clearly described.

786-787: LGTM!

Link reference definitions for versions 1.43.1 and 1.44.0 are correctly added and properly formatted.

src/scanoss/scan_settings_builder.py (4)
src/scanoss/scan_settings_builder.py (4)
25-29: LGTM!

Proper use of `TYPE_CHECKING` to avoid circular imports while maintaining type hints.

69-99: LGTM!

The `with_proxy` and `with_url` methods correctly implement the documented priority (file_snippet > root settings > CLI) using the `_merge_with_priority` helper.

121-199: LGTM!

The snippet tuning methods (`with_min_snippet_hits`, `with_min_snippet_lines`, `with_honour_file_exts`, `with_ranking`, `with_ranking_threshold`) correctly implement settings > CLI priority using `_merge_cli_with_settings`. The string-to-bool conversion for `honour_file_exts` and `ranking` properly preserves the `'unset'` sentinel value.

201-266: LGTM!

The private helper methods are well-implemented:

- `_merge_with_priority` and `_merge_cli_with_settings` correctly implement the documented precedence.
- `_str_to_bool` defensively handles `None`, existing booleans, and string conversion.
- Accessor methods safely guard against a `None` `scanoss_settings`.

src/scanoss/scanpostprocessor.py (4)
81-100: LGTM!Constructor parameter and attribute correctly renamed from
scan_settingstoscanoss_settings, aligning with the broader refactoring across the codebase.
112-121: LGTM!Local variable renamed from
resulttoentryimproves clarity by distinguishing the individual result entry from the overall results dictionary.
122-136: LGTM!
post_processcorrectly usesself.scanoss_settings.is_legacy()for legacy detection.
138-163: LGTM!
_remove_dismissed_filesand_replace_purlscorrectly referenceself.scanoss_settingsfor retrieving BOM entries. The variable rename fromresulttoentryin_replace_purlsmaintains consistency with the earlier change.src/scanoss/scanossapi.py (3)
25-34: LGTM!New imports for
base64,json, and expanded typing support the new scan settings header functionality.
176-179: LGTM!The scan settings header is cleanly integrated into the request flow, only adding the
scanoss-settingsheader when there are non-default tuning parameters configured.
293-326: LGTM!The
_build_scan_settings_headermethod correctly:
- Filters out default/unset values (0 for hits/lines, 'unset' for boolean flags, -1 for threshold)
- Maps
rankingtoranking_enabledin the JSON payload- Base64-encodes the JSON for transmission
- Returns
Nonewhen no settings need to be sent, avoiding empty headerssrc/scanoss/cli.py (5)
1412-1442: LGTM!Settings loading in
wfp()correctly refactored to usescanoss_settingsvariable name and pass it to theScannerconstructor via the renamedscanoss_settingsparameter.
1523-1642: LGTM!The
scan()function correctly:
- Uses
scanoss_settingsvariable name consistently- Handles all three settings loading paths (identify SBOM, ignore SBOM, default scanoss.json)
- Passes the new tuning parameters (
min_snippet_hits,min_snippet_lines,ranking,ranking_threshold,honour_file_exts) to theScannerconstructor
2727-2743: LGTM!
folder_hashing_scan()correctly usesget_scanoss_settings_from_args()to load settings and passes them toScannerHFH.
2769-2777: LGTM!
folder_hash()correctly usesget_scanoss_settings_from_args()to load settings and passes them toFolderHasher.
2822-2831: LGTM!
get_scanoss_settings_from_args()cleanly extracts the repeated settings loading pattern used infolder_hashing_scan()andfolder_hash(), with proper error handling.src/scanoss/data/scanoss-settings-schema.json (3)
143-166: LGTM! Root-level proxy and http_config schema additions.The schema definitions for
proxyandhttp_configare well-structured with appropriate types and descriptions.
167-224: LGTM! file_snippet configuration schema.The file_snippet section properly defines the new tuning parameters with appropriate constraints:
ranking_thresholdcorrectly usesminimum: -1andmaximum: 99min_snippet_hitsandmin_snippet_linescorrectly useminimum: 0- Nullable types are properly specified using
["boolean", "null"]and["integer", "null"]
246-266: LGTM! HPFM and container schema additions.The
hpfmsection correctly mirrors the ranking settings pattern fromfile_snippet, andcontainerprovides a placeholder for future configuration.src/scanoss/scanner.py (4)
42-42: LGTM! New import for ScanSettingsBuilder.The import enables the builder pattern for merging CLI and settings file configurations.
107-117: LGTM! Constructor signature updated with new tuning parameters.The parameter rename from
scan_settingstoscanoss_settingsand the addition of new snippet tuning parameters align with the PR objectives.
168-179: LGTM! ScanSettingsBuilder usage.The builder pattern cleanly merges CLI arguments with settings file configurations for the new tuning parameters.
408-408: LGTM! Consistent scanoss_settings propagation.The
scanoss_settingsparameter is consistently passed toFileFiltersacross all usage sites (scan_folder,scan_files,wfp_folder), ensuring proper settings propagation throughout the scanning pipeline.Also applies to: 669-669, 977-977
src/scanoss/scanoss_settings.py (3)
339-346: LGTM! Base accessor for file_snippet settings.

The `get_file_snippet_settings()` method safely returns an empty dict when the settings path doesn't exist, providing a solid foundation for all other file_snippet accessors.

347-394: LGTM! Snippet tuning parameter accessors.

The accessors for `min_snippet_hits`, `min_snippet_lines`, `snippet_range_tolerance`, `ranking_enabled`, `ranking_threshold`, and `honour_file_exts` follow a consistent pattern with appropriate `Optional` return types and clear docstrings.

411-441: LGTM! Proxy and HTTP config accessors.

The root-level and file_snippet-level accessors for `proxy` and `http_config` are well-structured. The docstrings correctly note that file_snippet-level settings take priority, which aligns with the PR's stated precedence rules.
13e2505 to
1e8cba8
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@CLIENT_HELP.md`:
- Around line 297-302: The range text for the --ranking-threshold flag is
ambiguous; change the notation "-1-10" to "-1 to 10" and update the example that
uses --ranking-threshold=75 to a valid value within -1 to 10 (e.g.,
--ranking-threshold=5); ensure the note about using "=" for negative values
remains and that all instances referencing --ranking-threshold reflect the
corrected range and example.
- Around line 277-285: Remove the undocumented CLI option text for
--snippet-range-tolerance / -srt from CLIENT_HELP.md: the CLI does not implement
this flag (only get_snippet_range_tolerance() exists server-side), so delete the
"Set snippet range tolerance" block and examples; ensure the remaining doc only
lists the implemented snippet tuning options (--min-snippet-hits,
--min-snippet-lines, --ranking, --ranking-threshold, --honour-file-exts).
In `@docs/source/_static/scanoss-settings-schema.json`:
- Around line 200-205: Remove the duplicate conflicting `ranking_threshold`
definition in the docs schema so there is a single authoritative entry that
matches the implementation; specifically ensure the `ranking_threshold` schema
uses the constraints expected by ScanSettingsBuilder.with_ranking_threshold
(type integer|null, minimum -1, maximum 10, default -1) and delete the alternate
entry (the one with maximum 99/default 0), then verify the main schema and docs
schema are consistent for `ranking_threshold`.
In `@src/scanoss/cli.py`:
- Around line 206-218: Fix the inconsistent spacing and incorrect help text for
the ranking CLI args: normalize the choices list in the p_scan.add_argument call
for '--ranking' to use consistent spacing (e.g.,
"choices=['unset','true','false']") and update the '--ranking-threshold' help
string to reflect the actual clamped range used by ScanSettingsBuilder (change
"Valid range: -1 to 10" to "Valid range: -1 to 99" or otherwise match the clamp
implemented in ScanSettingsBuilder).
In `@tests/test_scan_settings_builder.py`:
- Around line 317-330: The with_ranking_threshold method in ScanSettingsBuilder
is incorrectly clamping values to a max of 10; update its clamping logic to use
the schema's maximum of 99 (allowing -1 to 99) so values like 50 and 75 pass,
and change the method/docstring text from "Valid range is -1 to 10" to "Valid
range is -1 to 99"; ensure you modify ScanSettingsBuilder.with_ranking_threshold
to reference the 99 maximum (or derive it from the scanoss-settings-schema.json
maximum) and remove the hardcoded 10 clamp.
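The clamping behaviour these prompts discuss can be sketched as below. This is an assumption-laden illustration rather than the builder's code: `clamp_ranking_threshold` is a hypothetical name, and the bounds are parameters because the prompts above disagree on whether the upper limit should be 10 or 99.

```python
from typing import Optional


def clamp_ranking_threshold(
    value: Optional[int],
    minimum: int = -1,   # -1 defers to server configuration
    maximum: int = 99,   # assumed to follow the main schema; a stricter limit may be passed explicitly
) -> Optional[int]:
    """Clamp a ranking threshold into [minimum, maximum], preserving None."""
    if value is None:
        # Not configured: leave it for a lower-priority source or the server.
        return None
    return max(minimum, min(maximum, value))
```

Deriving `maximum` from the schema file instead of hardcoding it, as one prompt suggests, would keep the CLI help, the builder, and both schema copies from drifting apart.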
♻️ Duplicate comments (7)
src/scanoss/scanossapi.py (1)
314-316: Fix the double space on line 315.

There's a double space after `and` in the condition: `and  self.ranking` should be `and self.ranking`.

```diff
-        if self.ranking is not None and  self.ranking != 'unset':
+        if self.ranking is not None and self.ranking != 'unset':
```

src/scanoss/data/scanoss-settings-schema.json (1)
225-229: Fix inverted `honour_file_exts` description.

The description says "Ignores file extensions" but the field name and CLI semantics are "honour file extensions" (True = honour, False = ignore). This inversion will confuse consumers of the schema.

Proposed fix

```diff
 "honour_file_exts": {
   "type": ["boolean", "null"],
-  "description": "Ignores file extensions. When not set, defers to server configuration.",
+  "description": "Whether to honour file extensions during matching. True = honour, False = ignore. When not set, defers to server configuration.",
   "default": true
 },
```

docs/source/_static/scanoss-settings-schema.json (1)
219-223: Fix inverted `honour_file_exts` description (same issue as main schema).

The description says "Ignores file extensions" but should describe "honour" semantics.

src/scanoss/scanner.py (4)
142-151: Priority order issue persists for skip_headers merging.

The `_merge_cli_with_settings` method (lines 243-255) still gives priority to `settings_value` over `cli_value`, contradicting the PR objectives which state "priority: CLI > file_snippet > root settings". This affects the skip_headers merging at lines 148-150.

231-232: Condition still checks wrong variable.

Line 232 checks `if scan_settings` but passes `scanoss_settings` to `ScanPostProcessor`. The condition should be `if scanoss_settings`.

243-255: Priority implementation contradicts PR objectives and has inconsistent docstring.

- The implementation gives `settings_value` priority over `cli_value` (lines 253-255), but PR objectives state "CLI > file_snippet > root settings".
- The docstring is internally inconsistent: line 245 says "settings > cli" but line 251 says "CLI taking priority over settings".

339-340: Consider guarding file_snippet access for legacy settings.

While `get_file_snippet_settings()` safely returns `{}` when the section is missing, explicitly guarding legacy settings would improve code clarity and align with the pattern suggested in past reviews.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- CHANGELOG.md
- CLIENT_HELP.md
- docs/source/_static/scanoss-settings-schema.json
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (5)
- CHANGELOG.md
- tests/data/scanoss.json
- scanoss.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/__init__.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder (33-309), _merge_cli_with_settings (255-262), with_proxy (81-95), with_url (97-111), with_ignore_cert_errors (113-131), with_min_snippet_hits (133-154), with_min_snippet_lines (156-177), with_honour_file_exts (179-195), with_ranking (197-212), with_ranking_threshold (214-242)
src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings (339-345)
tests/test_scan_settings_builder.py (2)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder (33-309), _str_to_bool (266-272), _merge_with_priority (246-252), _merge_cli_with_settings (255-262), with_proxy (81-95), with_url (97-111), with_ignore_cert_errors (113-131), with_min_snippet_hits (133-154), with_min_snippet_lines (156-177), with_honour_file_exts (179-195), with_ranking (197-212), with_ranking_threshold (214-242)
src/scanoss/scanner.py (1)
_merge_cli_with_settings (244-255)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug (58-63)
src/scanoss/spdxlite.py (1)
print_debug (61-66)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy (315-317), get_bom_remove (210-218), get_bom_replace (220-228)
🪛 GitHub Actions: Build/Test Local Python Package
tests/test_scan_settings_builder.py
[error] 358-358: Test 'test_method_chaining' failed: expected ranking_threshold to be 50 but got 10 (AssertionError).
[error] 322-322: Test 'test_with_ranking_threshold_cli_only' failed: expected ranking_threshold to be 50 but got 10 (AssertionError).
[error] 330-330: Test 'test_with_ranking_threshold_from_settings' failed: expected ranking_threshold to be 75 but got 10 (AssertionError).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (22)
CLIENT_HELP.md (2)
262-276: LGTM! Clear documentation for snippet hits and lines parameters.
The documentation for `--min-snippet-hits` and `--min-snippet-lines` is clear, includes helpful examples, and correctly explains that a value of 0 defers to server configuration.
304-314: LGTM! Clear documentation for file extensions and combined usage.
The documentation for `--honour-file-exts` is clear and the combined example effectively demonstrates how to use multiple tuning options together.
src/scanoss/scanpostprocessor.py (3)
81-100: LGTM! Consistent renaming of parameter and attribute.
The rename from `scan_settings` to `scanoss_settings` aligns with the broader PR changes and is applied consistently throughout the class.
116-120: LGTM! Improved variable naming.
Renaming `result` to `entry` better describes the per-item context in the iteration loop.
129-163: LGTM! Attribute references updated correctly.
All method calls in `post_process`, `_remove_dismissed_files`, and `_replace_purls` correctly reference `self.scanoss_settings`.
src/scanoss/scanossapi.py (3)
75-109: LGTM! Clean addition of scan tuning parameters.
The new tuning parameters are properly declared with sensible types (`Union[bool, str, None]` for tri-state values) and stored as instance variables. The docstring covers the new parameters.
176-179: LGTM! Clean header integration.
The scan settings header is correctly built and attached only when non-empty. The header key `scanoss-settings` aligns with the API expectations.
293-326: LGTM! Well-structured settings header builder.
The method correctly filters out unset values using appropriate sentinel checks (0 for hits/lines, -1 for threshold, 'unset' for boolean flags). Debug logging aids troubleshooting.
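The sentinel filtering described here can be sketched roughly as follows. This is not the PR's `_build_scan_settings_header()`; it is a standalone approximation that assumes the payload is base64-encoded JSON (as the walkthrough states) and that the JSON keys `min_snippet_hits`, `min_snippet_lines` and `ranking_threshold` match the argument names. Only `honour_file_exts` and `ranking_enabled` appear as keys in the reviewed diffs.

```python
import base64
import json
from typing import Dict, Optional, Union

TriState = Union[bool, str, None]


def build_scan_settings_header(
    min_snippet_hits: int = 0,
    min_snippet_lines: int = 0,
    ranking_threshold: Optional[int] = -1,
    honour_file_exts: TriState = 'unset',
    ranking: TriState = 'unset',
) -> str:
    """Return base64(JSON) of the non-default tunings, or '' if none are set."""
    settings: Dict[str, Union[int, bool, str]] = {}
    if min_snippet_hits:  # 0 (or None) defers to the server
        settings['min_snippet_hits'] = min_snippet_hits
    if min_snippet_lines:  # 0 (or None) defers to the server
        settings['min_snippet_lines'] = min_snippet_lines
    if ranking_threshold is not None and ranking_threshold != -1:
        settings['ranking_threshold'] = ranking_threshold
    if honour_file_exts is not None and honour_file_exts != 'unset':
        settings['honour_file_exts'] = honour_file_exts
    if ranking is not None and ranking != 'unset':
        settings['ranking_enabled'] = ranking
    if not settings:
        return ''
    return base64.b64encode(json.dumps(settings).encode('utf-8')).decode('ascii')


# Usage: only the two explicitly set values end up in the payload.
header = build_scan_settings_header(min_snippet_hits=3, ranking=True)
decoded = json.loads(base64.b64decode(header))
```

Returning an empty string when nothing is set lets the caller skip the `scanoss-settings` header entirely, which matches the "attached only when non-empty" behaviour noted above.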
src/scanoss/data/scanoss-settings-schema.json (1)
167-244: LGTM! Well-structured file_snippet configuration block.
The nested `proxy` and `http_config` within `file_snippet` enables per-feature overrides while maintaining consistency with the root-level configuration pattern. The tuning parameter constraints are appropriately defined.
tests/test_scan_settings_builder.py (2)
33-108: LGTM! Well-structured tests with good coverage.
The test class is well-organized with clear sections for initialization, static helpers, and each `with_*` method. The static method tests (`_str_to_bool`, `_merge_with_priority`, `_merge_cli_with_settings`) thoroughly cover edge cases.
336-358: Good comprehensive chaining test.
The method chaining test effectively validates that all `with_*` methods return `self` and that the final builder state is correct. Once the `ranking_threshold` clamping bug is fixed, this test will pass.
src/scanoss/cli.py (5)
1413-1442: LGTM!
The refactoring to use `scanoss_settings` is consistent with the new settings flow, and the Scanner constructor is correctly receiving the settings parameter.
1632-1639: LGTM!
The new snippet tuning parameters (`min_snippet_hits`, `min_snippet_lines`, `ranking`, `ranking_threshold`, `honour_file_exts`) are correctly passed to the Scanner constructor, aligning with the new CLI options.
2727-2743: LGTM!
The folder hashing scan correctly uses `get_scanoss_settings_from_args()` to load settings and passes them to `ScannerHFH`.
2769-2780: LGTM!
The folder hash command correctly uses the centralized settings loading and passes settings to `FolderHasher`.
2822-2831: LGTM!
The `get_scanoss_settings_from_args()` helper centralizes settings creation for folder-based scanning paths, reducing code duplication.
src/scanoss/scanner.py (4)
107-116: LGTM!
The constructor signature is correctly updated with the new snippet tuning parameters and the renamed `scanoss_settings` parameter.
167-179: LGTM!
The `ScanSettingsBuilder` is correctly instantiated and configured with all the new snippet tuning parameters using the fluent API pattern.
236-241: LGTM!
The `_maybe_set_api_sbom` method correctly uses `self.scanoss_settings` for the SBOM retrieval.
408-416: LGTM!
The `FileFilters` instantiation correctly receives `self.scanoss_settings`, maintaining consistency across all scanning methods.
src/scanoss/scanoss_settings.py (2)
339-409: LGTM!
The file_snippet accessor methods are well-structured with:
- Consistent use of `get_file_snippet_settings()` as the base accessor
- Appropriate `Optional` return types where values may not be set
- Sensible defaults for `skip_headers_limit` (0) and `skip_headers` (False)
411-441: LGTM!
The proxy and http_config accessors correctly expose both root-level and file_snippet-level configurations, with clear docstrings indicating the priority relationship. This enables the `ScanSettingsBuilder` to implement the intended merging logic.
9adb13c to
1f204cb
Compare
1f204cb to
5fa4487
Compare
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/scanoss/scan_settings_builder.py`:
- Line 32: Update the MAX_RANKING_THRESHOLD constant in scan_settings_builder.py
from 10 to 99 so it matches the documented valid range (-1 to 99) in the PR and
CHANGELOG; locate the MAX_RANKING_THRESHOLD symbol and set its value to 99 to
align code with the specification.
♻️ Duplicate comments (5)
src/scanoss/data/scanoss-settings-schema.json (1)
225-229: Fix misleading `honour_file_exts` description.
The description says "Ignores file extensions" but the field name `honour_file_exts` indicates the opposite behavior (honouring/respecting file extensions). Update the description to match the actual semantics.
src/scanoss/scan_settings_builder.py (1)
115-133: `with_ignore_cert_errors` precedence doesn't match docstring.
The docstring states "priority: CLI True > file_snippet > settings > False", implying CLI `True` should always win. However, `_merge_with_priority` returns `file_snippet` value first if present, meaning if the settings file explicitly sets `ignore_cert_errors: false`, the CLI flag `--ignore-cert-errors` will not override it.
If CLI `True` should genuinely override all settings, the logic needs adjustment:
🔧 Suggested fix if CLI True should always win
 def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
-    """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.
-
-    Note: CLI value only takes effect if True (flag present). False means
-    the flag was not provided, so settings file values are checked.
-    """
-    result = self._merge_with_priority(
-        cli_value if cli_value else None,
-        self._get_file_snippet_http_config_value('ignore_cert_errors'),
-        self._get_http_config_value('ignore_cert_errors')
-    )
-    self.ignore_cert_errors = result if result is not None else False
+    """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False."""
+    # CLI True always wins
+    if cli_value:
+        self.ignore_cert_errors = True
+        return self
+    # Otherwise fall back to settings
+    result = self._merge_with_priority(
+        None,
+        self._get_file_snippet_http_config_value('ignore_cert_errors'),
+        self._get_http_config_value('ignore_cert_errors')
+    )
+    self.ignore_cert_errors = bool(result) if result is not None else False
     return self
src/scanoss/cli.py (1)
206-225: Inconsistent formatting in choices and incorrect help text for ranking threshold.
This issue was flagged in a previous review and remains unaddressed:
- Line 209 has inconsistent spacing: `choices=['unset' ,'true', 'false']` (space before comma)
- Line 217 states "Valid range: -1 to 10" but `ScanSettingsBuilder` clamps values to [-1, 99]
- Line 222 has inconsistent spacing: `choices=['unset','true', 'false']` (no space after 'unset')
Suggested fix
 p_scan.add_argument(
     '--ranking',
     type=str,
-    choices=['unset' ,'true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Enable or disable ranking (optional - default: server configuration)',
 )
 p_scan.add_argument(
     '--ranking-threshold',
     type=int,
     default=-1,
-    help='Ranking threshold value. Valid range: -1 to 10. A value of -1 defers to server configuration (optional)',
+    help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
 )
 p_scan.add_argument(
     '--honour-file-exts',
     type=str,
-    choices=['unset','true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
 )
src/scanoss/scanner.py (2)
231-233: Condition checks wrong variable.
This was flagged in a previous review: The condition checks `scan_settings` (the `ScanSettingsBuilder` result) but passes `scanoss_settings` (the `ScanossSettings` instance) to `ScanPostProcessor`. These are different objects.
Fix the condition
 self.post_processor = (
-    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
+    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None
 )
243-255: Docstring has contradictory priority descriptions.
The docstring is internally inconsistent:
- Line 245 states "two-level priority: settings > cli"
- Line 251 states "Merged value with CLI taking priority over settings"
The implementation gives settings priority (returning `settings_value` if not None), which aligns with `ScanSettingsBuilder._merge_cli_with_settings`. However, the PR objectives state "priority: CLI > file_snippet > root settings".
Please clarify the intended behavior and update the docstring accordingly:
If settings should take priority (current behavior)
 @staticmethod
 def _merge_cli_with_settings(cli_value, settings_value):
     """Merge CLI value with settings value (two-level priority: settings > cli).
     Args:
         cli_value: Value from CLI argument
         settings_value: Value from scanoss.json file_snippet settings
     Returns:
-        Merged value with CLI taking priority over settings
+        Merged value with settings taking priority over CLI
     """
🧹 Nitpick comments (8)
src/scanoss/data/scanoss-settings-schema.json (2)
246-262: Consider aligning `ranking_enabled` type with `file_snippet` for consistency.
In `file_snippet`, `ranking_enabled` is `["boolean", "null"]` with default `null` (allowing "unset" state), while here in `hpfm` it's just `boolean` without nullable support. If both sections need a "defer to server" option, consider making the types consistent.
♻️ Suggested change for consistency
 "ranking_enabled": {
-  "type": "boolean",
-  "description": "Enable ranking for HPFM"
+  "type": ["boolean", "null"],
+  "description": "Enable ranking for HPFM. When not set, defers to server configuration.",
+  "default": null
 },
263-266: Empty `container` object is a placeholder.
The `container` section has no properties defined. If this is intentional scaffolding for future work, consider adding a brief comment in the description (e.g., "Reserved for future container scanning options") to clarify its purpose.
📝 Suggested description improvement
 "container": {
   "type": "object",
-  "description": "Container scanning configuration"
+  "description": "Container scanning configuration (reserved for future options)"
 }
tests/test_scan_settings_builder.py (1)
36-38: Class-level test fixtures may cause issues with test isolation.
Using class-level attributes for `scan_settings_path` and `scan_settings` means all test methods share the same `ScanossSettings` instance. While this works for read-only access, consider using `setUp` or `setUpClass` methods for clearer test lifecycle management.
src/scanoss/scan_settings_builder.py (2)
79-81: Improve type hints for `honour_file_exts` and `ranking`.
Using `any` (lowercase) is not a valid Python type hint. Consider using `Optional[Union[bool, str]]` to accurately reflect that these can hold string values ('true', 'false', 'unset') before conversion or boolean values after.
♻️ Suggested fix
+from typing import TYPE_CHECKING, Optional, Union
+
 ...
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Union[bool, str]] = None
+        self.ranking: Optional[Union[bool, str]] = None
194-196: Missing space before `!=` operator.
Minor formatting issue on line 195.
🔧 Fix spacing
-        if self.honour_file_exts is not None and self.honour_file_exts!= 'unset':
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
src/scanoss/scanossapi.py (1)
293-326: Potential type inconsistency for `honour_file_exts` and `ranking` values.
When `ScanossApi` is instantiated through `Scanner` → `ScanSettingsBuilder`, these values are correctly converted to booleans by `ScanSettingsBuilder._str_to_bool()`. However, if `ScanossApi` is instantiated directly (e.g., in tests or custom integrations), string values like `'true'` or `'false'` could be serialized as JSON strings rather than booleans.
Consider adding explicit boolean conversion in `_build_scan_settings_header()` to ensure consistent JSON output regardless of instantiation path:
Suggested defensive conversion
 # honour_file_exts: None = unset, don't send
 if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-    settings['honour_file_exts'] = self.honour_file_exts
+    settings['honour_file_exts'] = self._to_bool(self.honour_file_exts)
 # ranking: None = unset, don't send
 if self.ranking is not None and self.ranking != 'unset':
-    settings['ranking_enabled'] = self.ranking
+    settings['ranking_enabled'] = self._to_bool(self.ranking)
+
+@staticmethod
+def _to_bool(value) -> bool:
+    """Convert value to boolean."""
+    if isinstance(value, bool):
+        return value
+    if isinstance(value, str):
+        return value.lower() == 'true'
+    return bool(value)
src/scanoss/scanner.py (2)
339-340: Consider guarding file_snippet access for legacy settings.
When using legacy settings files (SBOM.json format), accessing `file_snippet` settings may introduce unexpected behavior. While `get_file_snippet_settings()` returns an empty dict for missing sections (avoiding crashes), mixing new semantics into the legacy path could cause confusion.
Suggested guard
-        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
-        return file_snippet_settings.get('dependency_analysis', False)
+        if not self.scanoss_settings or self.scanoss_settings.is_legacy():
+            return False
+        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings()
+        return file_snippet_settings.get('dependency_analysis', False)
142-151: Consider guarding file_snippet access for legacy settings (applies here too).
Similar to the `is_dependency_scan` method, accessing `file_snippet_settings` in the constructor should be guarded for legacy settings files to maintain consistent behavior:
Suggested guard
 # Get settings values for skip_headers options
-file_snippet_settings = scanoss_settings.get_file_snippet_settings() if scanoss_settings else {}
+file_snippet_settings = (
+    scanoss_settings.get_file_snippet_settings()
+    if scanoss_settings and not scanoss_settings.is_legacy()
+    else {}
+)
 settings_skip_headers = file_snippet_settings.get('skip_headers')
 settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit')
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- CHANGELOG.md
- CLIENT_HELP.md
- docs/source/_static/scanoss-settings-schema.json
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (5)
- tests/data/scanoss.json
- src/scanoss/__init__.py
- docs/source/_static/scanoss-settings-schema.json
- CLIENT_HELP.md
- src/scanoss/scanoss_settings.py
🧰 Additional context used
🧬 Code graph analysis (5)
src/scanoss/scan_settings_builder.py (3)
src/scanoss/scanossbase.py (2)
ScanossBase (28-107), print_msg (51-56)
src/scanoss/scanoss_settings.py (5)
get_file_snippet_settings (339-345), get_file_snippet_proxy (427-433), get_proxy (411-417), get_http_config (419-425), get_file_snippet_http_config (435-441)
src/scanoss/scanner.py (1)
_merge_cli_with_settings (244-255)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
ScanossSettings (76-441), load_json_file (103-138), set_file_type (140-152), set_scan_type (154-161)
src/scanoss/inspection/utils/file_utils.py (1)
load_json_file (29-44)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy (315-317), get_bom_remove (210-218), get_bom_replace (220-228)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder (35-311), _merge_cli_with_settings (257-264), with_proxy (83-97), with_url (99-113), with_ignore_cert_errors (115-133), with_min_snippet_hits (135-156), with_min_snippet_lines (158-179), with_honour_file_exts (181-197), with_ranking (199-214), with_ranking_threshold (216-244)
src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings (339-345)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug (58-63)
src/scanoss/spdxlite.py (1)
print_debug (61-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (20)
src/scanoss/data/scanoss-settings-schema.json (5)
143-166: LGTM!
The `proxy` and `http_config` sections are well-structured with clear descriptions for API request configuration.
167-194: LGTM!
The nested `proxy` and `http_config` within `file_snippet` properly allow independent configuration for file snippet requests, enabling different endpoints or proxy settings from the root configuration.
195-206: Clarify `ranking_threshold` default value semantics.
The schema allows both `null` type and `-1` integer value, with the description stating "-1 defers to server configuration." However, the default is `0`, which is a valid threshold value. If the intent is to defer to server by default (not set), consider changing the default to `null` or `-1` for consistency with the "defer to server" behavior.
Please verify whether a default of `0` is intentional (meaning: apply zero threshold by default) or if it should be `null`/`-1` to defer to server configuration when not explicitly set.
207-224: LGTM!
The `min_snippet_hits`, `min_snippet_lines`, and `snippet_range_tolerance` properties are well-defined with appropriate constraints (non-negative integers with default 0 to defer to server per PR objectives).
230-244: LGTM!
The `dependency_analysis`, `skip_headers`, and `skip_headers_limit` properties are clearly defined with sensible defaults.
scanoss.json (1)
17-21: LGTM!
The addition of `scanoss-winnowing.py` to the BOM include list is correctly formatted and consistent with the existing entry structure.
CHANGELOG.md (2)
11-21: LGTM!
The changelog entry for version 1.44.0 is well-structured, follows the Keep a Changelog format, and clearly documents all the new scan engine tuning parameters and related additions.
786-787: LGTM!
Link references are correctly added for versions 1.43.1 and 1.44.0, properly chaining from the previous versions.
tests/test_scan_settings_builder.py (1)
336-358: LGTM!
Comprehensive test coverage for method chaining and the fluent API pattern. All builder methods are exercised and verified in a single chain.
src/scanoss/scan_settings_builder.py (2)
247-274: LGTM!
Static helper methods are well-implemented with clear, focused responsibilities. The `_str_to_bool` method properly handles `None`, boolean pass-through, and string conversion.
277-311: LGTM!
Private extraction methods properly handle null cases and safely navigate nested dictionary structures. Good defensive coding with early returns for missing `scanoss_settings`.
src/scanoss/scanpostprocessor.py (4)
81-100: LGTM!
Clean refactoring from `scan_settings` to `scanoss_settings` for naming consistency across the codebase. The parameter, docstring, and instance attribute are all correctly updated.
112-120: LGTM!
Renaming the local variable from `result` to `entry` improves clarity and avoids potential confusion with the outer `result` from the loop iteration.
129-136: LGTM!
Legacy detection now correctly uses `self.scanoss_settings.is_legacy()` and includes an early return to skip unsupported post-processing.
155-163: LGTM!
The `_replace_purls` method consistently uses `self.scanoss_settings` and the renamed `entry` variable throughout the replacement logic.
src/scanoss/cli.py (2)
1632-1639: LGTM!
The new snippet tuning parameters (`min_snippet_hits`, `min_snippet_lines`, `ranking`, `ranking_threshold`, `honour_file_exts`) are correctly propagated from CLI arguments to the Scanner constructor.
2822-2831: LGTM!
The `get_scanoss_settings_from_args` helper consolidates settings loading logic for `folder_hashing_scan` and `folder_hash` commands, following the same pattern used elsewhere in the CLI.
src/scanoss/scanossapi.py (2)
75-109: LGTM!
The new tuning parameters are properly added to the constructor signature with appropriate type hints (`Union[bool, str, None]` for `honour_file_exts` and `ranking`), and the docstring accurately documents the new parameters with their default values.
176-179: LGTM!
The scan settings header is cleanly integrated into the request flow, only adding the `scanoss-settings` header when tuning parameters are configured.
src/scanoss/scanner.py (1)
167-200: LGTM!
The `ScanSettingsBuilder` integration cleanly merges CLI arguments with settings file values and passes the merged configuration to `ScanossApi`. The builder pattern provides a clear, chainable API for configuration merging.
5fa4487 to
59fd38f
Compare
59fd38f to
86725c2
Compare
86725c2 to
882f8ce
Compare
882f8ce to
64e3b80
Compare
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@docs/source/_static/scanoss-settings-schema.json`:
- Around line 200-205: The docs and main JSON schemas for ranking_threshold are
inconsistent with the runtime constant MAX_RANKING_THRESHOLD (which enforces a
max of 10) and clamping logic; update the ranking_threshold entries in both
schemas to use "type": ["integer","null"], "minimum": -1, "maximum": 10 and set
"default": -1 so defaults and constraints match the runtime (refer to
MAX_RANKING_THRESHOLD and the builder/clamping logic that clamps values >10).
Ensure both the docs schema and the main schema use these same values to remove
the contradiction.
In `@src/scanoss/cli.py`:
- Around line 231-234: Remove the duplicate argument registration for
'--wfp-output' that is being added a second time via the p_scan.add_argument
call; locate the redundant p_scan.add_argument(... '--wfp-output' ...) block and
delete it so only the original definition remains (keep the first declaration
and remove the later duplicate to avoid argparse conflicts).
In `@src/scanoss/scanner.py`:
- Around line 170-203: The ScanSettingsBuilder usage can cause CLI flags to be
overridden because the merge order currently may let file_snippet/root values
win; update the merge so CLI values take precedence by applying CLI overrides
last: when constructing scan_settings with ScanSettingsBuilder, ensure builder
is fed base settings first (root/file_snippet) and then call
.with_min_snippet_hits(min_snippet_hits),
.with_min_snippet_lines(min_snippet_lines), .with_ranking(ranking),
.with_ranking_threshold(ranking_threshold),
.with_honour_file_exts(honour_file_exts), .with_proxy(proxy), .with_url(url),
.with_ignore_cert_errors(ignore_cert_errors) only if the corresponding CLI
variable is not None (or always call them last to overwrite), so
scan_settings.min_snippet_hits/min_snippet_lines/ranking/ranking_threshold/honour_file_exts/proxy/url/ignore_cert_errors
reflect CLI overrides before passing scan_settings into ScanossApi.
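A minimal sketch of the precedence this prompt asks for (CLI > file_snippet > root), written as a standalone function rather than the builder's own `_merge_with_priority`. The helper `cli_or_none` is hypothetical: it assumes CLI defaults such as `0` or `'unset'` act as "not provided" sentinels and must be mapped to `None` before merging, otherwise a default CLI value would silently override the settings file.

```python
from typing import Any, Optional, TypeVar

T = TypeVar('T')


def merge_with_priority(
    cli_value: Optional[T],
    file_snippet_value: Optional[T],
    root_value: Optional[T],
) -> Optional[T]:
    """Return the first non-None value in the order CLI, file_snippet, root."""
    for candidate in (cli_value, file_snippet_value, root_value):
        if candidate is not None:
            return candidate
    return None


def cli_or_none(value: Any, sentinel: Any) -> Any:
    """Map a CLI default sentinel (e.g. 0 or 'unset') to None (hypothetical helper)."""
    return None if value == sentinel else value
```

For example, `merge_with_priority(cli_or_none(0, 0), 3, 1)` falls through to the file_snippet value `3`, while an explicit CLI value of `5` would win over both settings levels.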
♻️ Duplicate comments (9)
src/scanoss/scan_settings_builder.py (2)
35-43: Confirm merge precedence matches the stated CLI priority.
PR objectives say CLI should override file_snippet/root, but `_merge_with_priority`/`_merge_cli_with_settings` currently make settings win. This also means `with_ignore_cert_errors` can’t guarantee the “CLI True wins” behavior its docstring promises when settings are present. Please confirm the intended precedence; if CLI should win, swap the merge order and update tests/docs accordingly.
🔧 Possible fix if CLI should have highest priority
 def _merge_with_priority(cli_value, file_snippet_value, root_value):
-    if file_snippet_value is not None:
-        return file_snippet_value
-    if root_value is not None:
-        return root_value
-    return cli_value
+    if cli_value is not None:
+        return cli_value
+    if file_snippet_value is not None:
+        return file_snippet_value
+    return root_value
 def _merge_cli_with_settings(cli_value, settings_value):
-    if settings_value is not None:
-        return settings_value
-    return cli_value
+    if cli_value is not None:
+        return cli_value
+    return settings_value
Also applies to: 115-133, 246-264
32-32: Align `ranking_threshold` max with the intended spec.
`MAX_RANKING_THRESHOLD` is 10 and the docstring says “-1 to 10”, but the PR objectives state “-1 to 99”. Please confirm the intended max and align constant/docstrings/tests/docs accordingly.
🔧 Possible fix if max should be 99
-MAX_RANKING_THRESHOLD = 10
+MAX_RANKING_THRESHOLD = 99
-        Valid range is -1 to 10. Values outside this range will be clamped and logged.
+        Valid range is -1 to 99. Values outside this range will be clamped and logged.
Also applies to: 216-237
src/scanoss/data/scanoss-settings-schema.json (1)
201-205: Fix misleading `honour_file_exts` description.
The description says "Ignores file extensions" but the field name and CLI help describe it as "Honour file extensions". This inconsistency will confuse users. Update to: "Whether to honour file extensions during matching. When not set, defers to server configuration."
src/scanoss/scanossapi.py (1)
75-79: Missing `snippet_range_tolerance` parameter.
The schema and `ScanossSettings.get_snippet_range_tolerance()` support this parameter, but it's not wired into the API constructor or `_build_scan_settings_header()`. If this is intentional (server-only config), consider documenting why it's excluded; otherwise, add it for completeness.
docs/source/_static/scanoss-settings-schema.json (1)
219-223: Fix inverted `honour_file_exts` description.
Same issue as the main schema: description says "Ignores file extensions" but field name means "honour". Update to reflect actual semantics.
src/scanoss/cli.py (1)
211-230: Inconsistent spacing in choices and help text range mismatch.
- Line 214: `choices=['unset' ,'true', 'false']` has extra space before comma
- Line 227: `choices=['unset','true', 'false']` missing space after first item
- Line 222: Help says "Valid range: -1 to 10" but should match the actual implementation/schema range
src/scanoss/scanner.py (3)
144-153: CLI should override file_snippet for skip_headers merge.
The helper and inline comment still prefer settings over CLI, which conflicts with the stated priority (CLI > file_snippet > root). Flip the precedence and update the docstring/comment.
Also applies to: 245-257
233-235: Use the same variable in the post-processor guard.
The condition checks `scan_settings` but passes `scanoss_settings`, so the guard is ineffective when `scanoss_settings` is `None`.
341-342: Guard legacy settings before reading `file_snippet`.
If legacy settings lack `file_snippet`, this fallback can misbehave; consider skipping when settings are legacy.
🧹 Nitpick comments (1)
src/scanoss/scanossapi.py (1)
310-316: Ensure boolean serialization for `honour_file_exts` and `ranking`.
These values have type `Union[bool, str, None]` with a default of `'unset'`. While `ScanSettingsBuilder._str_to_bool()` converts CLI strings to booleans, direct API instantiation could pass string values that would serialize incorrectly in JSON. Consider adding explicit boolean conversion:
♻️ Suggested defensive conversion
 # honour_file_exts: None = unset, don't send
 if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-    settings['honour_file_exts'] = self.honour_file_exts
+    settings['honour_file_exts'] = bool(self.honour_file_exts) if isinstance(self.honour_file_exts, bool) else self.honour_file_exts == 'true'
 # ranking: None = unset, don't send
 if self.ranking is not None and self.ranking != 'unset':
-    settings['ranking_enabled'] = self.ranking
+    settings['ranking_enabled'] = bool(self.ranking) if isinstance(self.ranking, bool) else self.ranking == 'true'
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- CHANGELOG.md
- CLIENT_HELP.md
- docs/source/_static/scanoss-settings-schema.json
- scanoss.json
- src/scanoss/__init__.py
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- src/scanoss/scanner.py
- src/scanoss/scanoss_settings.py
- src/scanoss/scanossapi.py
- src/scanoss/scanpostprocessor.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (3)
- scanoss.json
- tests/data/scanoss.json
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (6)
src/scanoss/cli.py (1)
- src/scanoss/scanoss_settings.py (3): ScanossSettings (76-441), load_json_file (103-138), set_file_type (140-152)
src/scanoss/scanner.py (4)
- src/scanoss/scan_settings_builder.py (10): ScanSettingsBuilder (35-311), _merge_cli_with_settings (257-264), with_proxy (83-97), with_url (99-113), with_ignore_cert_errors (115-133), with_min_snippet_hits (135-156), with_min_snippet_lines (158-179), with_honour_file_exts (181-197), with_ranking (199-214), with_ranking_threshold (216-244)
- src/scanoss/scanoss_settings.py (1): get_file_snippet_settings (339-345)
- src/scanoss/scanossapi.py (1): ScanossApi (52-341)
- src/scanoss/scanpostprocessor.py (1): ScanPostProcessor (76-289)
tests/test_scan_settings_builder.py (2)
- src/scanoss/scan_settings_builder.py (12): ScanSettingsBuilder (35-311), _str_to_bool (268-274), _merge_with_priority (248-254), _merge_cli_with_settings (257-264), with_proxy (83-97), with_url (99-113), with_ignore_cert_errors (115-133), with_min_snippet_hits (135-156), with_min_snippet_lines (158-179), with_honour_file_exts (181-197), with_ranking (199-214), with_ranking_threshold (216-244)
- src/scanoss/scanner.py (1): _merge_cli_with_settings (246-257)
src/scanoss/scanossapi.py (2)
- src/scanoss/scanossbase.py (1): print_debug (58-63)
- src/scanoss/spdxlite.py (1): print_debug (61-66)
src/scanoss/scan_settings_builder.py (2)
- src/scanoss/scanoss_settings.py (5): get_file_snippet_settings (339-345), get_file_snippet_proxy (427-433), get_proxy (411-417), get_http_config (419-425), get_file_snippet_http_config (435-441)
- src/scanoss/scanner.py (1): _merge_cli_with_settings (246-257)
src/scanoss/scanpostprocessor.py (1)
- src/scanoss/scanoss_settings.py (3): is_legacy (315-317), get_bom_remove (210-218), get_bom_replace (220-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (13)
src/scanoss/__init__.py (1)
25-25: Version bump looks consistent.
CLIENT_HELP.md (1)
262-404: No additional issues to note here.
tests/test_scan_settings_builder.py (1)
33-359: No additional issues to note here.
src/scanoss/scanpostprocessor.py (1)
112-164: Good robustness improvements in post-processing. Handling list-vs-dict entries and short-circuiting legacy settings is clean and avoids unnecessary processing.
src/scanoss/scanoss_settings.py (1)
338-441: LGTM! Well-structured getters for file_snippet settings. The new accessor methods follow the established patterns in the class, with consistent docstrings, proper Optional type hints, and appropriate default values matching the schema (e.g., `skip_headers_limit` defaults to `0`, `skip_headers` defaults to `False`). The separation of root-level and file_snippet-level proxy/http_config getters supports the priority-based configuration model described in the PR objectives.
src/scanoss/data/scanoss-settings-schema.json (1)
142-242: New schema blocks for file_snippet, hpfm, and container look good. The schema additions properly define the new tuning parameters with appropriate types, ranges, and defaults. The nested proxy/http_config blocks within file_snippet align with the priority-based configuration approach.
src/scanoss/scanossapi.py (1)
176-179: Scan settings header implementation looks good. The `_build_scan_settings_header()` method correctly:
- Omits parameters with sentinel values (0 for counts, 'unset'/None for booleans, -1 for threshold)
- Base64-encodes the JSON payload
- Only attaches the header when settings are present
Also applies to: 293-326
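The three behaviours listed above can be sketched as a standalone function. This is a hedged illustration rather than the actual `_build_scan_settings_header()`: the function name is hypothetical, and the JSON keys `min_snippet_hits`, `min_snippet_lines`, and `ranking_threshold` are assumed (only `honour_file_exts` and `ranking_enabled` appear in the review).

```python
import base64
import json
from typing import Optional, Union


def build_scan_settings_header(
    min_snippet_hits: int = 0,
    min_snippet_lines: int = 0,
    ranking: Union[bool, str, None] = 'unset',
    ranking_threshold: int = -1,
    honour_file_exts: Union[bool, str, None] = 'unset',
) -> Optional[str]:
    """Return a base64-encoded JSON payload of non-default tunings, or None if empty."""
    settings = {}
    # 0 is the sentinel for counts: defer to server configuration.
    if min_snippet_hits != 0:
        settings['min_snippet_hits'] = min_snippet_hits
    if min_snippet_lines != 0:
        settings['min_snippet_lines'] = min_snippet_lines
    # None and 'unset' both mean "do not send" for the tri-state flags.
    if ranking is not None and ranking != 'unset':
        settings['ranking_enabled'] = ranking
    if honour_file_exts is not None and honour_file_exts != 'unset':
        settings['honour_file_exts'] = honour_file_exts
    # -1 is the sentinel for the threshold.
    if ranking_threshold != -1:
        settings['ranking_threshold'] = ranking_threshold
    if not settings:
        return None
    return base64.b64encode(json.dumps(settings).encode('utf-8')).decode('ascii')


header = build_scan_settings_header(min_snippet_hits=3, ranking=True)
# Decoding the header recovers only the values that were explicitly set.
print(json.loads(base64.b64decode(header)))
```

A caller would attach the result as the `scanoss-settings` request header only when it is not `None`.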
docs/source/_static/scanoss-settings-schema.json (1)
142-238: New proxy, http_config, and file_snippet schema blocks are well-structured. The hierarchical configuration with root-level and file_snippet-level proxy/http_config follows a sensible pattern for override semantics.
src/scanoss/cli.py (2)
1422-1448: LGTM! `wfp()` function properly wired with `scanoss_settings`. The settings loading and Scanner initialization correctly use the renamed `scanoss_settings` variable throughout.
1641-1651: LGTM! New tuning parameters properly passed to Scanner. The new snippet tuning parameters (`min_snippet_hits`, `min_snippet_lines`, `ranking`, `ranking_threshold`, `honour_file_exts`) are correctly propagated from CLI args to the Scanner constructor.
src/scanoss/scanner.py (3)
42-42: Good plumbing for `scanoss_settings` and tuning flags. Signature updates and storing the settings reference align with the new builder wiring.
Also applies to: 73-114, 141-141
238-242: SBOM propagation looks consistent. Using `scanoss_settings` here keeps post-init wiring centralized.
410-410: Consistent settings propagation into `FileFilters`. Good to pass `scanoss_settings` through so filters can access file-based configuration.
Also applies to: 678-678, 993-993
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/scanoss/data/scanoss-settings-schema.json`:
- Around line 176-182: Update the MAX_RANKING_THRESHOLD constant in
scan_settings_builder.py from 10 to 99, update the docstring in
scan_settings_builder.py that currently says "Valid range is -1 to 10" to "Valid
range is -1 to 99", and update the documentation JSON schema
(scanoss-settings-schema.json under the docs static assets) to change the
"maximum" value from 10 to 99 so all code, docstring, and docs schema match the
schema maximum of 99.
♻️ Duplicate comments (6)
src/scanoss/scan_settings_builder.py (2)
32-32: `MAX_RANKING_THRESHOLD` should be 99 per PR objectives. The PR description and CHANGELOG state the valid range is -1 to 99, but the constant is set to 10. This causes the docstring on line 219 ("Valid range is -1 to 10") to be internally consistent but incorrect relative to the documented behavior.
🔧 Suggested fix
```diff
-MAX_RANKING_THRESHOLD = 10
+MAX_RANKING_THRESHOLD = 99
```
Also update line 219:
```diff
-        Valid range is -1 to 10. Values outside this range will be clamped and logged.
+        Valid range is -1 to 99. Values outside this range will be clamped and logged.
```
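For reference, clamping to the documented range could look like the sketch below. It is illustrative only: `MIN_RANKING_THRESHOLD` and `clamp_ranking_threshold` are hypothetical names, and the logging the docstring mentions is left out.

```python
MIN_RANKING_THRESHOLD = -1  # -1 defers to server configuration
MAX_RANKING_THRESHOLD = 99  # upper bound per the PR description and CHANGELOG


def clamp_ranking_threshold(value: int) -> int:
    """Clamp value into the inclusive range [-1, 99]."""
    return max(MIN_RANKING_THRESHOLD, min(MAX_RANKING_THRESHOLD, value))


print(clamp_ranking_threshold(150))  # clamped down to the maximum
print(clamp_ranking_threshold(42))   # already in range, returned unchanged
```

With the constant left at 10, a value such as 42 would be silently reduced to 10 even though the documentation says it is valid.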
115-134: `with_ignore_cert_errors` precedence doesn't match docstring. The docstring states "CLI True > file_snippet > settings > False", implying that `--ignore-cert-errors` on the CLI should override settings file values. However, `_merge_with_priority` always prefers `file_snippet` then root over CLI. If a settings file sets `ignore_cert_errors: false`, the CLI flag won't override it.
🔧 Suggested fix to honour "CLI True wins" semantics
```diff
     def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
         """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.

         Note: CLI value only takes effect if True (flag present). False means the flag was
         not provided, so settings file values are checked.
         """
+        # Explicit CLI True should always win
+        if cli_value:
+            self.ignore_cert_errors = True
+            return self
+
+        # Otherwise, fall back to file_snippet then root, defaulting to False
         result = self._merge_with_priority(
-            cli_value if cli_value else None,
+            None,
             self._get_file_snippet_http_config_value('ignore_cert_errors'),
-            self._get_http_config_value('ignore_cert_errors')
+            self._get_http_config_value('ignore_cert_errors'),
         )
-        self.ignore_cert_errors = result if result is not None else False
+        self.ignore_cert_errors = bool(result) if result is not None else False
         return self
```
src/scanoss/data/scanoss-settings-schema.json (1)
201-205: Fix `honour_file_exts` description to match actual semantics. The description says "Ignores file extensions" but the field name and CLI help describe it as "Honour file extensions". This is confusing for users.
🔧 Suggested fix
```diff
 "honour_file_exts": {
   "type": ["boolean", "null"],
-  "description": "Ignores file extensions. When not set, defers to server configuration.",
+  "description": "Whether to honour file extensions during matching. True = honour extensions, False = ignore them. When not set, defers to server configuration.",
   "default": true
 },
```
docs/source/_static/scanoss-settings-schema.json (2)
200-206: Schema inconsistency: `ranking_threshold` constraints differ from main schema. The docs schema defines `maximum: 10, default: -1`, while the main schema (src/scanoss/data/) uses `maximum: 99, default: 0`. This creates confusion. Align both schemas to match the PR objectives (-1 to 99).
🔧 Suggested fix to align with main schema
```diff
 "ranking_threshold": {
   "type": ["integer", "null"],
   "description": "Ranking threshold value. A value of -1 defers to server configuration",
   "minimum": -1,
-  "maximum": 10,
-  "default": -1
+  "maximum": 99,
+  "default": 0
 },
```
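A drift like this between two schema copies could be caught with a small comparison. The sketch below uses inline dict fragments built from the values quoted in the comment; the main schema's `minimum` of -1 is an assumption, and `constraint_mismatches` is a hypothetical helper, not part of the project.

```python
from typing import Any, Dict, Tuple

# Fragments transcribed from the review comment (main schema minimum is assumed).
MAIN_RANKING_THRESHOLD = {'minimum': -1, 'maximum': 99, 'default': 0}
DOCS_RANKING_THRESHOLD = {'minimum': -1, 'maximum': 10, 'default': -1}


def constraint_mismatches(
    main: Dict[str, Any],
    docs: Dict[str, Any],
    keys: Tuple[str, ...] = ('minimum', 'maximum', 'default'),
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (main_value, docs_value)} for every constraint that differs."""
    return {k: (main.get(k), docs.get(k)) for k in keys if main.get(k) != docs.get(k)}


print(constraint_mismatches(MAIN_RANKING_THRESHOLD, DOCS_RANKING_THRESHOLD))
```

In a real test the two fragments would be loaded from the schema files with `json.load` and the result asserted to be empty.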
219-223: Fix `honour_file_exts` description (same issue as main schema). The description says "Ignores file extensions" but should reflect the actual behavior: `True` = honour extensions, `False` = ignore them.
🔧 Suggested fix
```diff
 "honour_file_exts": {
   "type": ["boolean", "null"],
-  "description": "Ignores file extensions. When not set, defers to server configuration.",
+  "description": "Whether to honour file extensions during matching. True = honour extensions, False = ignore them. When not set, defers to server configuration.",
   "default": true
 },
```
src/scanoss/cli.py (1)
211-230: Inconsistent spacing in `choices` and incorrect help text for ranking threshold.
- Line 214: `choices=['unset' ,'true', 'false']` has inconsistent spacing (space before comma)
- Line 222: Help says "Valid range: -1 to 10" but should be "-1 to 99" per PR objectives
- Line 227: `choices=['unset','true', 'false']` has inconsistent spacing
🔧 Suggested fixes
```diff
 p_scan.add_argument(
     '--ranking',
     type=str,
-    choices=['unset' ,'true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Enable or disable ranking (optional - default: server configuration)',
 )
 p_scan.add_argument(
     '--ranking-threshold',
     type=int,
     default=-1,
-    help='Ranking threshold value. Valid range: -1 to 10. A value of -1 defers to server configuration (optional)',
+    help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
 )
 p_scan.add_argument(
     '--honour-file-exts',
     type=str,
-    choices=['unset','true', 'false'],
+    choices=['unset', 'true', 'false'],
     default='unset',
     help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
 )
```
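To see how these tri-state flags parse, the arguments can be exercised on a standalone parser. This is only a sketch: `build_parser` is a hypothetical helper, and the real definitions live on the `p_scan` subparser in `cli.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a standalone parser with the three tuning flags (illustrative only)."""
    parser = argparse.ArgumentParser(description='Tri-state tuning flags sketch')
    parser.add_argument('--ranking', type=str, choices=['unset', 'true', 'false'], default='unset')
    parser.add_argument('--ranking-threshold', type=int, default=-1)
    parser.add_argument('--honour-file-exts', type=str, choices=['unset', 'true', 'false'], default='unset')
    return parser


args = build_parser().parse_args(['--ranking', 'true', '--ranking-threshold', '50'])
# Flags that were not passed keep their sentinel defaults.
print(args.ranking, args.ranking_threshold, args.honour_file_exts)  # prints: true 50 unset
```

Note that `argparse` does not enforce the -1 to 99 range here; a value such as 150 would parse and rely on later clamping.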
🧹 Nitpick comments (2)
src/scanoss/scan_settings_builder.py (2)
79-81: Use `Any` from typing instead of lowercase `any`.
`any` is not a valid type hint. Use `Any` from the `typing` module, or consider a more specific union type like `Optional[bool | str]` to reflect the actual `'unset'` + boolean usage.
♻️ Suggested fix
```diff
-from typing import TYPE_CHECKING, Optional
+from typing import TYPE_CHECKING, Any, Optional
 ...
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Any] = None
+        self.ranking: Optional[Any] = None
```
Or for stricter typing:
```python
self.honour_file_exts: Optional[bool | str] = None
self.ranking: Optional[bool | str] = None
```
195-196: Minor: Missing space before `!=` operator.
```diff
-        if self.honour_file_exts is not None and self.honour_file_exts!= 'unset':
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
```
📜 Review details
📒 Files selected for processing (8)
- CHANGELOG.md
- CLIENT_HELP.md
- docs/source/_static/scanoss-settings-schema.json
- src/scanoss/cli.py
- src/scanoss/data/scanoss-settings-schema.json
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (4)
- tests/data/scanoss.json
- CHANGELOG.md
- CLIENT_HELP.md
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/scan_settings_builder.py (2)
- src/scanoss/scanoss_settings.py (5): get_file_snippet_settings (339-345), get_file_snippet_proxy (427-433), get_proxy (411-417), get_http_config (419-425), get_file_snippet_http_config (435-441)
- src/scanoss/scanner.py (1): _merge_cli_with_settings (246-257)
src/scanoss/cli.py (2)
- src/scanoss/scanoss_settings.py (4): ScanossSettings (76-441), load_json_file (103-138), set_file_type (140-152), set_scan_type (154-161)
- src/scanoss/inspection/utils/file_utils.py (1): load_json_file (29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (8)
src/scanoss/scan_settings_builder.py (2)
246-265: LGTM! The merge helper methods correctly implement the documented priority: `file_snippet > root settings > CLI` for `_merge_with_priority` and `settings > CLI` for `_merge_cli_with_settings`.
267-311: LGTM! The utility and accessor methods handle null checks properly and correctly extract nested configuration values from the settings.
src/scanoss/data/scanoss-settings-schema.json (1)
142-242: LGTM on new settings sections. The `file_snippet`, `hpfm`, and `container` sections are well-structured with appropriate types, defaults, and constraints for the new tuning parameters.
docs/source/_static/scanoss-settings-schema.json (1)
142-166: LGTM on root-level `proxy` and `http_config` additions. The docs schema correctly adds root-level proxy and http_config sections that the main schema lacks. This appears intentional for documentation purposes.
src/scanoss/cli.py (4)
198-210: LGTM on `--min-snippet-hits` and `--min-snippet-lines` arguments. The new CLI arguments for snippet tuning are well-defined with appropriate types, defaults, and help text.
1418-1447: LGTM on `wfp()` function updates. The refactoring to use `scanoss_settings` is consistent with the new settings architecture.
1637-1648: LGTM on passing tuning parameters to Scanner. The new snippet tuning parameters (`min_snippet_hits`, `min_snippet_lines`, `ranking`, `ranking_threshold`, `honour_file_exts`) are correctly threaded through to the Scanner constructor.
2828-2837: LGTM on `get_scanoss_settings_from_args` helper. Clean extraction of settings loading logic into a reusable helper function for `folder_hashing_scan` and `folder_hash` commands.
What's Changed
Added
- `--min-snippet-hits` - Minimum snippet hits required (0 defers to server config)
- `--min-snippet-lines` - Minimum snippet lines required (0 defers to server config)
- `--ranking` - Enable/disable result ranking (unset/true/false)
- `--ranking-threshold` - Ranking threshold value (-1 to 99, -1 defers to server config)
- `--honour-file-exts` - Honour file extensions during matching (unset/true/false)
- `file_snippet` section to scanoss.json settings schema for configuring tuning parameters
- `ScanSettingsBuilder` class for merging CLI and settings file configurations with priority: CLI > file_snippet > root settings
Summary by CodeRabbit
New Features
Configuration
Documentation
Tests
Other