Add auto-install for SCIP indexer tools by wende · Pull Request #237 · wende/cicada

wende · 2026-02-07T22:44:02Z

Summary

When a SCIP indexer binary isn't found, automatically install it using the language's native package manager (go install, gem install, dart pub, dotnet tool, rustup, coursier, npm)
Adds generic SCIPToolInstaller class with 7 install methods and InstallConfig dataclass
Integrates auto-install into ConfigurableSCIPIndexer._find_executable() and RustSCIPIndexer._ensure_rust_analyzer_installed()

Test plan

All 3680 existing tests pass
New unit tests for each install method (success, runtime missing, subprocess failure)
Integration tests for _find_executable auto-install path
Integration tests for Rust auto-install path
Verified C/C++ correctly have no auto-install config
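
The three unit-test scenarios named in the test plan (success, runtime missing, subprocess failure) can be exercised without touching a real package manager by mocking `shutil.which` and `subprocess.run`. The sketch below is an assumption about how such tests might look; `install_with_gem` is a hypothetical stand-in, not a function from the PR.

```python
from __future__ import annotations

import shutil
import subprocess
from unittest import mock


def install_with_gem(package: str, executable: str) -> str | None:
    """Hypothetical stand-in for a single install method."""
    if not shutil.which("gem"):
        return None  # runtime missing
    result = subprocess.run(
        ["gem", "install", package, "--no-document"],
        capture_output=True,
        text=True,
        timeout=180,
    )
    if result.returncode != 0:
        return None  # subprocess failure
    return shutil.which(executable)


def _completed(returncode: int) -> subprocess.CompletedProcess:
    # Build a fake result object so no real process is started.
    return subprocess.CompletedProcess(args=[], returncode=returncode, stdout="", stderr="")


def test_success() -> None:
    # First which() call finds gem, second finds the freshly installed tool.
    with mock.patch("shutil.which", side_effect=["/usr/bin/gem", "/usr/bin/scip-ruby"]), \
         mock.patch("subprocess.run", return_value=_completed(0)):
        assert install_with_gem("scip-ruby", "scip-ruby") == "/usr/bin/scip-ruby"


def test_runtime_missing() -> None:
    with mock.patch("shutil.which", return_value=None), \
         mock.patch("subprocess.run") as run:
        assert install_with_gem("scip-ruby", "scip-ruby") is None
        run.assert_not_called()  # no install attempt without the runtime


def test_subprocess_failure() -> None:
    with mock.patch("shutil.which", return_value="/usr/bin/gem"), \
         mock.patch("subprocess.run", return_value=_completed(1)):
        assert install_with_gem("scip-ruby", "scip-ruby") is None
```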

When a SCIP indexer binary isn't found, automatically install it using the language's native package manager (go install, gem install, dart pub, dotnet tool, rustup, coursier, npm). This removes a major accessibility barrier - users no longer need to manually install indexer tools.

- Add generic SCIPToolInstaller with 7 install methods
- Add install_config field to SCIPLanguageConfig for go, ruby, dart, java, scala, csharp, vb (c/cpp have no auto-install)
- Integrate auto-install into ConfigurableSCIPIndexer._find_executable()
- Add rust-analyzer auto-install via rustup in RustSCIPIndexer

gemini-code-assist · 2026-02-07T22:44:26Z

Summary of Changes

Hello @wende, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust auto-installation system for SCIP indexer tools, significantly improving the developer experience by automatically resolving missing dependencies. By centralizing installation logic and integrating it into the core indexer framework, it streamlines the setup process for various programming languages, making the system more self-sufficient and user-friendly.

Highlights

Generic Auto-Installation Framework: A new SCIPToolInstaller class and InstallConfig dataclass have been introduced to provide a generic mechanism for automatically installing SCIP indexer tools using various native package managers (npm, go, gem, dart pub, dotnet, rustup, coursier).
Integrated Auto-Install Logic: The auto-installation functionality is integrated into ConfigurableSCIPIndexer._find_executable() and RustSCIPIndexer._ensure_rust_analyzer_installed(), allowing the system to automatically install missing indexers for supported languages like Go, Ruby, Dart, Java, Scala, C#, VB, and Rust.
Enhanced SCIPLanguageConfig: The SCIPLanguageConfig now includes an install_config field, enabling language-specific auto-installation settings to be defined centrally.
Non-Interactive npx Commands: Existing npx commands used for JavaScript and TypeScript SCIP index generation have been updated with the --yes flag to ensure non-interactive execution.
Comprehensive Testing: New unit tests cover each installation method's success and failure scenarios, along with integration tests for the auto-install paths in _find_executable and RustSCIPIndexer.

Changelog

Makefile
- Added --yes flag to npx command for JavaScript SCIP index generation to ensure non-interactive execution.
cicada/languages/rust/indexer.py
- Modified _ensure_rust_analyzer_installed to use the new SCIPToolInstaller for automatic installation of rust-analyzer if it's not found.
cicada/languages/scip/configs.py
- Added install_config: InstallConfig | None field to SCIPLanguageConfig.
- Introduced _make_install_configs() and _get_install_config() to lazily define and retrieve auto-install configurations for Go, Ruby, Dart, Java, Scala, C#, and VB.
- Updated LANGUAGE_CONFIGS to include the new install_config for supported languages.
cicada/languages/scip/configurable_indexer.py
- Modified _find_executable to check for an install_config and attempt auto-installation via SCIPToolInstaller.try_install() if the required executable is not found. If successful, the command is updated with the installed path.
cicada/languages/scip/installer.py
- New file: Implemented InstallMethod enum, InstallConfig dataclass, and the SCIPToolInstaller class with methods for installing tools via npm, go, gem, dart pub, dotnet, rustup, and coursier.
cicada/languages/typescript/indexer.py
- Added --yes flag to npx command for TypeScript SCIP index generation.
tests/fixtures/.cicada/hashes.json
- Updated last_updated timestamp.
tests/languages/scip/test_scip_tool_installer.py
- New file: Added extensive unit tests for InstallConfig and SCIPToolInstaller, covering runtime checks, successful installations, and failure scenarios for all supported package managers.
- Included integration tests for the auto-install logic within ConfigurableSCIPIndexer and RustSCIPIndexer.
tests/languages/typescript/test_javascript_indexer.py
- Updated npx command in tests to include --yes.
tests/languages/typescript/test_javascript_integration.py
- Updated npx command in test setup and skip messages to include --yes.
tests/languages/typescript/test_typescript_indexer.py
- Updated npx command in tests to include --yes.
tests/setup_fixtures.sh
- Added --yes flag to npx command for SCIP TypeScript index generation in the setup script.
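
The find-or-install behavior described for `_find_executable` above can be sketched as follows. This is a simplified, hypothetical version: the free function `find_executable` and its `installer` callback are assumptions for illustration, whereas the PR integrates the logic into `ConfigurableSCIPIndexer` and calls `SCIPToolInstaller.try_install()`.

```python
from __future__ import annotations

import shutil
from typing import Callable, Optional


def find_executable(
    command: list[str],
    installer: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[list[str]]:
    """Resolve command[0], optionally installing it when it is missing.

    Returns the command to run, or None when the tool is unavailable.
    """
    if shutil.which(command[0]):
        return command  # already on PATH, nothing to do

    if installer is None:
        return None  # no auto-install configured (e.g. C/C++ in this PR)

    installed_path = installer()
    if installed_path is None:
        return None  # install was attempted but failed

    # Swap the bare tool name for the concrete installed path.
    return [installed_path, *command[1:]]
```

Returning a new list instead of mutating `command` keeps the caller's original configuration intact, which is one possible design choice among several.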

Activity

The author, wende, has introduced a significant feature to automatically install missing SCIP indexer tools, enhancing the system's robustness and ease of use.
This pull request includes the creation of a new Python module (installer.py) dedicated to handling various package manager installations.
The core logic for finding and executing indexers has been updated to incorporate this new auto-installation capability.
Comprehensive test coverage has been added to ensure the reliability of the new installation methods across different languages and scenarios.
Minor adjustments were made to existing scripts and test files to align with non-interactive npx command execution.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

sourcery-ai

Hey - I've found 1 security issue, and 2 other issues

Security issues:

Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `cicada/languages/scip/installer.py:135-58` </location>
<code_context>
+        return None
+
+    @classmethod
+    def _install_gem(cls, config: InstallConfig, verbose: bool) -> str | None:
+        """Install via gem install."""
+        result = subprocess.run(
+            ["gem", "install", config.package, "--no-document"],
+            capture_output=True,
+            text=True,
+            timeout=180,
+        )
+
+        if result.returncode != 0:
+            if verbose:
+                print(f"  gem install failed: {result.stderr}")
+            return None
+
+        path = shutil.which(config.executable)
</code_context>

<issue_to_address>
**suggestion:** Gem-installed binaries may not be found via PATH; consider probing standard gem bin directories.

Unlike the Go/Dart/Dotnet helpers, this relies only on `shutil.which(config.executable)` after `gem install`. If the gem bin dir isn’t on PATH (common on some distros/CI images), this will look like a failed install even when it succeeded. Consider checking standard gem bin locations (e.g., from `gem environment` or `$HOME/.gem/.../bin`) before falling back to `which` so installs work reliably across environments.

Suggested implementation:

```python
    @classmethod
    def _install_gem(cls, config: InstallConfig, verbose: bool) -> str | None:
        """Install via gem install."""
        result = subprocess.run(
            ["gem", "install", config.package, "--no-document"],
            capture_output=True,
            text=True,
            timeout=180,
        )

        if result.returncode != 0:
            if verbose:
                print(f"  gem install failed: {result.stderr}")
            return None

        # Probe standard gem bin directories first, then fall back to PATH.
        gem_bin_dirs = set()

        # Prefer gem-reported directories for robustness across environments.
        for key in ("gemdir", "user_gemdir"):
            try:
                env = subprocess.run(
                    ["gem", "environment", key],
                    capture_output=True,
                    text=True,
                    timeout=30,
                )
                if env.returncode == 0:
                    value = env.stdout.strip()
                    if value:
                        gem_bin_dirs.add(Path(value) / "bin")
            except Exception as exc:  # pragma: no cover - defensive
                if verbose:
                    print(f"  failed to query gem environment {key}: {exc}")

        # Common fallback: ~/.gem/.../bin
        try:
            home = Path.home()
            gem_bin_dirs.add(home / ".gem" / "bin")
        except Exception:
            # If we can't resolve $HOME, just skip this fallback.
            pass

        for bin_dir in gem_bin_dirs:
            candidate = bin_dir / config.executable
            if candidate.is_file() and os.access(candidate, os.X_OK):
                if verbose:
                    print(f"  Installed {config.executable} to {candidate}")
                return str(candidate)

        path = shutil.which(config.executable)
        if path:
            return path

        return None

```

This edit assumes `Path` from `pathlib` and `os` are already imported in `cicada/languages/scip/installer.py`. If they are not, add:

- `from pathlib import Path`
- `import os`

to the import section near the top of the file, following existing style and ordering.
</issue_to_address>

### Comment 2
<location> `cicada/languages/typescript/indexer.py:51` </location>
<code_context>
         # Security audit: Command uses list-form arguments (not shell=True),
         # so no command injection risk. All arguments are hardcoded strings.
-        cmd = ["npx", "@sourcegraph/scip-typescript", "index"]
+        cmd = ["npx", "--yes", "@sourcegraph/scip-typescript", "index"]
         scip_file = repo_path / "index.scip"

</code_context>

<issue_to_address>
**question:** Using `npx --yes` can break on older npm versions that don’t support this flag.

This flag avoids prompts but is unsupported in older npm/npx versions (e.g., npm < 7), causing the command to fail there. If this tool is expected to run on varied environments, consider either checking the npm/npx version before adding `--yes` or falling back to plain `npx` when the flag isn’t supported to avoid failures on older toolchains.
</issue_to_address>

### Comment 3
<location> `cicada/languages/scip/installer.py:243-248` </location>
<code_context>
        result = subprocess.run(
            [cs_cmd, "install", config.package],
            capture_output=True,
            text=True,
            timeout=180,
        )
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

sourcery-ai · 2026-02-07T22:45:19Z

cicada/languages/scip/installer.py

+        if config.runtime_check and not shutil.which(config.runtime_check):
+            if verbose:
+                print(f"  {config.runtime_check} not found - cannot install {config.executable}")
+            return None


suggestion: Gem-installed binaries may not be found via PATH; consider probing standard gem bin directories.

Unlike the Go/Dart/Dotnet helpers, this relies only on shutil.which(config.executable) after gem install. If the gem bin dir isn’t on PATH (common on some distros/CI images), this will look like a failed install even when it succeeded. Consider checking standard gem bin locations (e.g., from gem environment or $HOME/.gem/.../bin) before falling back to which so installs work reliably across environments.

Suggested implementation:

    @classmethod
    def _install_gem(cls, config: InstallConfig, verbose: bool) -> str | None:
        """Install via gem install."""
        result = subprocess.run(
            ["gem", "install", config.package, "--no-document"],
            capture_output=True,
            text=True,
            timeout=180,
        )

        if result.returncode != 0:
            if verbose:
                print(f"  gem install failed: {result.stderr}")
            return None

        # Probe standard gem bin directories first, then fall back to PATH.
        gem_bin_dirs = set()

        # Prefer gem-reported directories for robustness across environments.
        for key in ("gemdir", "user_gemdir"):
            try:
                env = subprocess.run(
                    ["gem", "environment", key],
                    capture_output=True,
                    text=True,
                    timeout=30,
                )
                if env.returncode == 0:
                    value = env.stdout.strip()
                    if value:
                        gem_bin_dirs.add(Path(value) / "bin")
            except Exception as exc:  # pragma: no cover - defensive
                if verbose:
                    print(f"  failed to query gem environment {key}: {exc}")

        # Common fallback: ~/.gem/.../bin
        try:
            home = Path.home()
            gem_bin_dirs.add(home / ".gem" / "bin")
        except Exception:
            # If we can't resolve $HOME, just skip this fallback.
            pass

        for bin_dir in gem_bin_dirs:
            candidate = bin_dir / config.executable
            if candidate.is_file() and os.access(candidate, os.X_OK):
                if verbose:
                    print(f"  Installed {config.executable} to {candidate}")
                return str(candidate)

        path = shutil.which(config.executable)
        if path:
            return path

        return None

This edit assumes Path from pathlib and os are already imported in cicada/languages/scip/installer.py. If they are not, add:

from pathlib import Path

import os

to the import section near the top of the file, following existing style and ordering.

sourcery-ai · 2026-02-07T22:45:19Z

cicada/languages/typescript/indexer.py

        # Security audit: Command uses list-form arguments (not shell=True),
        # so no command injection risk. All arguments are hardcoded strings.
-        cmd = ["npx", "@sourcegraph/scip-typescript", "index"]
+        cmd = ["npx", "--yes", "@sourcegraph/scip-typescript", "index"]


question: Using npx --yes can break on older npm versions that don’t support this flag.

This flag avoids prompts but is unsupported in older npm/npx versions (e.g., npm < 7), causing the command to fail there. If this tool is expected to run on varied environments, consider either checking the npm/npx version before adding --yes or falling back to plain npx when the flag isn’t supported to avoid failures on older toolchains.
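
One way to act on this suggestion is to add `--yes` only when the detected npm major version is at least 7, the threshold the comment gives. The sketch below assumes that threshold and uses hypothetical helper names (`npx_supports_yes`, `detect_npm_version`, `build_scip_typescript_command`); none of them come from the PR.

```python
from __future__ import annotations

import shutil
import subprocess


def npx_supports_yes(npm_version: str) -> bool:
    """Assume --yes is accepted from npm 7 onward, per the review comment."""
    try:
        major = int(npm_version.strip().split(".")[0])
    except ValueError:
        return False  # unparseable version: stay conservative
    return major >= 7


def detect_npm_version() -> str | None:
    """Return the output of `npm --version`, or None if npm is unavailable."""
    npm = shutil.which("npm")
    if npm is None:
        return None
    try:
        result = subprocess.run(
            [npm, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def build_scip_typescript_command(npm_version: str | None) -> list[str]:
    """Build the npx command, adding --yes only when it is believed to be supported."""
    cmd = ["npx"]
    if npm_version is not None and npx_supports_yes(npm_version):
        cmd.append("--yes")
    return cmd + ["@sourcegraph/scip-typescript", "index"]
```

A caller would combine the two steps as `build_scip_typescript_command(detect_npm_version())`. Whether the extra `npm --version` call is worth it depends on how old the supported toolchains are.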

sourcery-ai · 2026-02-07T22:45:19Z

cicada/languages/scip/installer.py

+        result = subprocess.run(
+            [cs_cmd, "install", config.package],
+            capture_output=True,
+            text=True,
+            timeout=180,
+        )


security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'run' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep
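
For auditing this finding, it helps to recall that a list-form `subprocess.run` call without `shell=True` hands each element to the child process as a single argument, so shell metacharacters are not interpreted. The Python standard library's shell-quoting helper is `shlex.quote()`, and it matters only when a command is assembled as a shell string. The sketch below demonstrates the list-form behavior and adds a hypothetical allow-list check for package names; `_SAFE_PACKAGE`, `is_safe_package`, and `echo_argument` are illustrative assumptions, not part of the PR.

```python
import re
import subprocess
import sys

# Hypothetical allow-list of characters for package names; adjust per package manager.
_SAFE_PACKAGE = re.compile(r"[A-Za-z0-9@._/:-]+")


def is_safe_package(package: str) -> bool:
    """Accept only allow-listed characters and reject option-like names."""
    if package.startswith("-"):
        return False  # would be parsed as a flag by most CLIs
    return _SAFE_PACKAGE.fullmatch(package) is not None


def echo_argument(value: str) -> str:
    """Pass value as one argv entry to a child Python process and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        timeout=30,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

Calling `echo_argument("scip-java; echo injected")` returns the string unchanged because no shell ever parses it. The allow-list is defense in depth for cases where the package name could come from untrusted configuration.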

gemini-code-assist

Code Review

This pull request introduces an excellent feature for automatically installing missing SCIP indexer tools, which will significantly improve the user experience. The implementation is well-structured, with a new generic SCIPToolInstaller that cleanly handles different package managers. The integration into the existing indexers is also well-done, and the new unit tests are comprehensive. I have a couple of minor suggestions to improve code consistency and efficiency, but overall this is a high-quality contribution.

gemini-code-assist · 2026-02-07T22:46:38Z

cicada/languages/scip/configs.py

+def _make_install_configs() -> dict[str, InstallConfig]:
+    """Build install configs lazily to avoid circular imports."""
+    from cicada.languages.scip.installer import InstallConfig, InstallMethod
+
+    return {
+        "go": InstallConfig(
+            method=InstallMethod.GO,
+            package="github.com/sourcegraph/scip-go/cmd/scip-go@latest",
+            executable="scip-go",
+            runtime_check="go",
+        ),
+        "ruby": InstallConfig(
+            method=InstallMethod.GEM,
+            package="scip-ruby",
+            executable="scip-ruby",
+            runtime_check="gem",
+        ),
+        "dart": InstallConfig(
+            method=InstallMethod.DART_PUB,
+            package="scip_dart",
+            executable="scip_dart",
+            runtime_check="dart",
+        ),
+        "java": InstallConfig(
+            method=InstallMethod.COURSIER,
+            package="scip-java",
+            executable="scip-java",
+        ),
+        "scala": InstallConfig(
+            method=InstallMethod.COURSIER,
+            package="scip-java",
+            executable="scip-java",
+        ),
+        "csharp": InstallConfig(
+            method=InstallMethod.DOTNET,
+            package="scip-dotnet",
+            executable="scip-dotnet",
+            runtime_check="dotnet",
+        ),
+        "vb": InstallConfig(
+            method=InstallMethod.DOTNET,
+            package="scip-dotnet",
+            executable="scip-dotnet",
+            runtime_check="dotnet",
+        ),
+    }
+
+
+def _get_install_config(language: str) -> InstallConfig | None:
+    """Get install config for a language, or None if not supported."""
+    configs = _make_install_configs()
+    return configs.get(language)


The _make_install_configs() function is called every time _get_install_config() is invoked. Since _get_install_config() is called for each language config at module load time, this leads to the dictionary of install configs being created multiple times unnecessarily. It would be more efficient to create it once and cache it in a module-level variable.

_INSTALL_CONFIGS: dict[str, InstallConfig] | None = None


def _make_install_configs() -> dict[str, InstallConfig]:
    """Build install configs lazily to avoid circular imports."""
    from cicada.languages.scip.installer import InstallConfig, InstallMethod

    return {
        "go": InstallConfig(
            method=InstallMethod.GO,
            package="github.com/sourcegraph/scip-go/cmd/scip-go@latest",
            executable="scip-go",
            runtime_check="go",
        ),
        "ruby": InstallConfig(
            method=InstallMethod.GEM,
            package="scip-ruby",
            executable="scip-ruby",
            runtime_check="gem",
        ),
        "dart": InstallConfig(
            method=InstallMethod.DART_PUB,
            package="scip_dart",
            executable="scip_dart",
            runtime_check="dart",
        ),
        "java": InstallConfig(
            method=InstallMethod.COURSIER,
            package="scip-java",
            executable="scip-java",
        ),
        "scala": InstallConfig(
            method=InstallMethod.COURSIER,
            package="scip-java",
            executable="scip-java",
        ),
        "csharp": InstallConfig(
            method=InstallMethod.DOTNET,
            package="scip-dotnet",
            executable="scip-dotnet",
            runtime_check="dotnet",
        ),
        "vb": InstallConfig(
            method=InstallMethod.DOTNET,
            package="scip-dotnet",
            executable="scip-dotnet",
            runtime_check="dotnet",
        ),
    }


def _get_install_config(language: str) -> InstallConfig | None:
    """Get install config for a language, or None if not supported."""
    global _INSTALL_CONFIGS
    if _INSTALL_CONFIGS is None:
        _INSTALL_CONFIGS = _make_install_configs()
    return _INSTALL_CONFIGS.get(language)
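
An alternative to a hand-rolled module-level cache is `functools.lru_cache`, which memoizes the builder so that the dictionary is constructed on first use only. The sketch below is a simplified illustration with plain strings in place of `InstallConfig` objects; the function names are hypothetical and the approach is a possible variation, not what the reviewer proposed.

```python
from __future__ import annotations

from functools import lru_cache


@lru_cache(maxsize=1)
def make_install_configs() -> dict[str, str]:
    """Build the mapping once; later calls return the cached dictionary."""
    # Placeholder values stand in for InstallConfig instances.
    return {"go": "scip-go", "ruby": "scip-ruby"}


def get_install_config(language: str) -> str | None:
    """Look up a language in the cached mapping, or return None if absent."""
    return make_install_configs().get(language)
```

Because every caller receives the same dictionary object, it should be treated as read-only. `make_install_configs.cache_info()` can be used to confirm that the builder ran only once.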

gemini-code-assist · 2026-02-07T22:46:38Z

cicada/languages/scip/installer.py

+        path = shutil.which(config.executable)
+        if path:
+            return path


The fallback to shutil.which() does not log a success message when verbose is true. This is inconsistent with other installer methods like _install_gem which do log a message. Adding a log message here would improve consistency and aid in debugging. This also applies to _install_dart_pub and _install_dotnet.

Suggested change

-        path = shutil.which(config.executable)
-        if path:
-            return path
+        path = shutil.which(config.executable)
+        if path:
+            if verbose:
+                print(f"  Installed {config.executable} at {path}")
+            return path

wende wants to merge 1 commit into main from feat/auto-install-scip-tools
