feat: implement rpg_analyze_health tool - Code Health Meter #59
Merged
userFRM merged 3 commits into userFRM:main on Feb 25, 2026
Implements a new MCP tool providing code health metrics based on the Code Health Meter (CHM) framework from the research paper.

## New Files
- `health.rs`: Graph health metrics (instability, centrality, god objects)
- `duplication.rs`: Token-based (Rabin-Karp) and semantic (Jaccard) clone detection

## Key Features
- Instability index: I = Ce / (Ca + Ce)
- Degree centrality (normalized)
- God object detection (high degree + extreme instability)
- Rabin-Karp rolling hash for Type-1/Type-2 clone detection
- Jaccard similarity on lifted features for Type-3/Type-4 detection
- LLM-friendly output via TOON formatter

## Testing
- 17 unit tests in duplication module
- 6 unit tests in health module

## References
- Inspired by: Khalfallah, B. H. (2025). Code Health Meter. ACM Trans. Softw. Eng. Methodol. https://doi.org/10.1145/3737670
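To make the graph metrics listed above concrete, here is a minimal sketch of the instability index, a normalized degree centrality, and a god-object check. It is not the code in `health.rs`: the `Coupling` struct, the function names, the choice of `n - 1` as the normalization factor, the value 0.0 for isolated entities, and the reading of "extreme instability" as "close to 0 or close to 1" are all assumptions made for illustration.

```rust
/// Coupling counts for one entity (hypothetical type, not from the PR).
#[derive(Debug, Clone, Copy)]
pub struct Coupling {
    /// Afferent coupling (Ca): number of entities that depend on this one.
    pub ca: usize,
    /// Efferent coupling (Ce): number of entities this one depends on.
    pub ce: usize,
}

/// Instability index I = Ce / (Ca + Ce).
/// Assumption: an entity with no edges at all is reported as 0.0.
pub fn instability(c: Coupling) -> f64 {
    let total = c.ca + c.ce;
    if total == 0 {
        0.0
    } else {
        c.ce as f64 / total as f64
    }
}

/// Degree centrality, normalized by the largest degree an entity could have
/// toward distinct neighbours. Assumption: the normalizer is n - 1.
pub fn degree_centrality(c: Coupling, node_count: usize) -> f64 {
    if node_count < 2 {
        return 0.0;
    }
    (c.ca + c.ce) as f64 / (node_count - 1) as f64
}

/// Hypothetical god-object rule: total degree at or above `degree_threshold`
/// and instability at either extreme of the [0, 1] range.
pub fn is_god_object(c: Coupling, degree_threshold: usize, instability_threshold: f64) -> bool {
    let i = instability(c);
    let extreme = i >= instability_threshold || i <= 1.0 - instability_threshold;
    c.ca + c.ce >= degree_threshold && extreme
}
```

With these definitions, an entity with Ca = 3 and Ce = 1 has I = 1 / 4 = 0.25, meaning most of its edges are incoming and it is comparatively stable.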
…mentation, configuration, and audit reports.
…on detection, alongside new audit reports, documentation, and configuration files.
userFRM (Owner) approved these changes on Feb 25, 2026 and left a comment:
Code Review
Verified locally: merge, fmt, clippy (both default and --no-default-features), and all workspace tests pass.
What's good
- Clean architecture: health.rs handles graph metrics, duplication.rs handles clone detection — good separation. Both are in rpg-nav where they belong.
- Solid algorithm choices: Rabin-Karp with rolling hash for token clones, inverted-index + Jaccard for semantic clones. The inverted index avoids O(n²) pair generation — important for large graphs.
- Good edge case handling: `clean_float` for NaN/Inf, `saturating_sub` for line ranges, graceful skip on missing files, empty graph, unlifted entities.
- Thorough tests: 23 unit tests covering both modules (tokenization, fingerprinting, Type-2 detection, semantic duplicates, boundary conditions, similarity bounds).
- Non-invasive integration: minimal changes to existing code (search.rs visibility bump, module exports, one new tool handler).
- Duplicates are deduplicated per fingerprint (lines 537-542), which correctly prevents similarity > 1.0.
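The point about the inverted index can be illustrated with a small sketch. This is not the code in `duplication.rs`: the function names, the use of `HashSet<&str>` for lifted features, and the convention that two empty sets score 0.0 are assumptions. The idea shown is that only entities sharing at least one feature ever become candidate pairs, so entities with nothing in common are never compared.

```rust
use std::collections::{HashMap, HashSet};

/// Jaccard similarity |A ∩ B| / |A ∪ B|.
/// Assumption: two empty sets are scored 0.0 rather than 1.0.
pub fn jaccard<'a>(a: &HashSet<&'a str>, b: &HashSet<&'a str>) -> f64 {
    let inter = a.intersection(b).count();
    let union = a.len() + b.len() - inter;
    if union == 0 {
        0.0
    } else {
        inter as f64 / union as f64
    }
}

/// Returns (i, j, similarity) with i < j for every pair of entities that
/// shares at least one feature and scores at or above `threshold`.
pub fn similar_pairs<'a>(
    entities: &[HashSet<&'a str>],
    threshold: f64,
) -> Vec<(usize, usize, f64)> {
    // Inverted index: feature -> ids of the entities that contain it.
    let mut index: HashMap<&str, Vec<usize>> = HashMap::new();
    for (id, features) in entities.iter().enumerate() {
        for &feature in features {
            index.entry(feature).or_default().push(id);
        }
    }

    // Only entities that appear together in a posting list become candidates.
    let mut candidates: HashSet<(usize, usize)> = HashSet::new();
    for ids in index.values() {
        for (k, &i) in ids.iter().enumerate() {
            for &j in &ids[k + 1..] {
                candidates.insert((i.min(j), i.max(j)));
            }
        }
    }

    // Score each candidate pair once and keep those above the threshold.
    let mut out: Vec<(usize, usize, f64)> = candidates
        .into_iter()
        .map(|(i, j)| (i, j, jaccard(&entities[i], &entities[j])))
        .filter(|&(_, _, s)| s >= threshold)
        .collect();
    out.sort_by(|x, y| (x.0, x.1).cmp(&(y.0, y.1)));
    out
}
```

The worst case is still quadratic when one feature is shared by nearly every entity, so a real implementation may want to skip or cap very long posting lists.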
Minor notes (non-blocking)
- `jaccard_similarity` was changed from `fn` to `pub(crate) fn` and `#[allow(dead_code)]` removed: clean.
- The `rayon` dep added to rpg-nav is already in the workspace, so no new dependency introduced.
- The `format_health_report` recommendations section uses an emoji, which fits the context (LLM output).
LGTM. Merging.
Summary
Implements a new MCP tool `analyze_health` that provides code health metrics based on the Code Health Meter (CHM) framework, including coupling analysis, instability detection, centrality scoring, god object identification, and dual-mode duplication detection.
Changes
New Files
- `crates/rpg-nav/src/health.rs` - Graph health metrics (instability, centrality, god objects)
- `crates/rpg-nav/src/duplication.rs` - Token-based (Rabin-Karp) and semantic (Jaccard) clone detection
Modified Files
- `crates/rpg-mcp/src/params.rs` - Added `AnalyzeHealthParams`
- `crates/rpg-mcp/src/tools.rs` - Added `analyze_health` tool handler
- `crates/rpg-nav/src/lib.rs` - Added module exports
- `crates/rpg-nav/src/toon.rs` - Added health report formatting
- `crates/rpg-nav/Cargo.toml` - Added `rayon` dependency
- `crates/rpg-nav/src/search.rs` - Exposed Jaccard similarity
Inspiration
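This implementation is inspired by the Code Health Meter framework described in:
Khalfallah, B. H. (2025). Code Health Meter. ACM Trans. Softw. Eng. Methodol. https://doi.org/10.1145/3737670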
Testing
Unit Tests
- 17 tests in the `duplication` module covering tokenization, Rabin-Karp fingerprinting, Jaccard similarity, per-entity detection, and edge cases
- 6 tests in the `health` module covering instability, centrality, and god object detection
Manual Testing (Verified via MCP Server)
The following test cases were executed against the live MCP server:
- `{}`
- `{"include_duplication": true}`
- `{"include_semantic_duplication": true}`
- `{"include_duplication": true, "include_semantic_duplication": true}`
Verified Duplicates Detected
The token-based detection successfully identified actual duplicates in the codebase:
- `looks_like_custom_hook` - Found 3 identical copies across:
  - `crates/rpg-parser/src/entities.rs`
  - `crates/rpg-parser/src/paradigms/features.rs`
  - `crates/rpg-parser/src/paradigms/helpers.rs`
- `looks_like_react_component` - Found 2 identical copies across:
  - `crates/rpg-parser/src/entities.rs`
  - `crates/rpg-parser/src/paradigms/helpers.rs`

All detected duplicates were manually verified to be 100% identical functions, confirming the detection algorithm works correctly.
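For readers unfamiliar with how token-based detection can surface identical functions like these, the following is a minimal sketch of Rabin-Karp fingerprinting over token windows. It is not the code in `duplication.rs`: the constants `BASE` and `MODULUS`, the `token_id` helper, and both function names are assumptions. Equal hashes only mark candidates, since distinct windows can collide, so a real detector would compare the underlying tokens before reporting a clone. Catching Type-2 clones would additionally require normalizing identifiers and literals before hashing, which is not shown.

```rust
use std::collections::HashMap;

// Hypothetical parameters; any base and large prime modulus would do.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Maps a token to an integer below MODULUS (stand-in for a real token hash).
fn token_id(token: &str) -> u64 {
    token
        .bytes()
        .fold(0u64, |h, b| (h * 31 + b as u64) % MODULUS)
}

/// Hash of every `window`-token slice, computed with a rolling update so the
/// whole pass is linear in the number of tokens.
pub fn window_hashes(tokens: &[&str], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    let ids: Vec<u64> = tokens.iter().map(|t| token_id(t)).collect();
    // BASE^(window - 1) mod MODULUS: the weight of the token leaving the window.
    let high = (1..window).fold(1u64, |p, _| p * BASE % MODULUS);
    let mut h = ids[..window]
        .iter()
        .fold(0u64, |h, &id| (h * BASE + id) % MODULUS);
    let mut out = vec![h];
    for i in window..ids.len() {
        let outgoing = ids[i - window] * high % MODULUS;
        // Drop the outgoing token, shift by BASE, add the incoming token.
        h = ((h + MODULUS - outgoing) * BASE + ids[i]) % MODULUS;
        out.push(h);
    }
    out
}

/// Groups window start positions by hash. Groups with two or more entries are
/// clone candidates that still need a token-by-token comparison.
pub fn clone_candidates(tokens: &[&str], window: usize) -> Vec<Vec<usize>> {
    let mut groups: HashMap<u64, Vec<usize>> = HashMap::new();
    for (start, h) in window_hashes(tokens, window).into_iter().enumerate() {
        groups.entry(h).or_default().push(start);
    }
    let mut out: Vec<Vec<usize>> = groups
        .into_values()
        .filter(|g| g.len() > 1)
        .collect();
    out.sort();
    out
}
```

Calling `clone_candidates` on a token stream in which the same four-token sequence appears at positions 0 and 4, with `window = 4`, yields a group containing both start positions.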
Example Usage
```json
{
  "instability_threshold": 0.7,
  "god_object_threshold": 10,
  "include_duplication": true,
  "include_semantic_duplication": true,
  "semantic_similarity_threshold": 0.6
}
```
Related