feat: implement rpg_analyze_health tool - Code Health Meter #59

Merged
userFRM merged 3 commits into userFRM:main from VooDisss:rpg_analyze_health on Feb 25, 2026

@VooDisss
Contributor

Summary

Implements a new MCP tool analyze_health that provides code health metrics based on the Code Health Meter (CHM) framework, including coupling analysis, instability detection, centrality scoring, god object identification, and dual-mode duplication detection.

Changes

New Files

  • crates/rpg-nav/src/health.rs - Graph health metrics (instability, centrality, god objects)
  • crates/rpg-nav/src/duplication.rs - Token-based (Rabin-Karp) and semantic (Jaccard) clone detection

Modified Files

  • crates/rpg-mcp/src/params.rs - Added AnalyzeHealthParams
  • crates/rpg-mcp/src/tools.rs - Added analyze_health tool handler
  • crates/rpg-nav/src/lib.rs - Added module exports
  • crates/rpg-nav/src/toon.rs - Added health report formatting
  • crates/rpg-nav/Cargo.toml - Added rayon dependency
  • crates/rpg-nav/src/search.rs - Exposed Jaccard similarity

Inspiration

This implementation is inspired by the Code Health Meter framework described in:

Khalfallah, B. H. (2025). Code Health Meter: A Quantitative and Graph-Theoretic Foundation for Automated Code Quality and Architecture Assessment. ACM Transactions on Software Engineering and Methodology. https://doi.org/10.1145/3737670

Testing

Unit Tests

  • 17 unit tests in duplication module covering tokenization, Rabin-Karp fingerprinting, Jaccard similarity, per-entity detection, and edge cases
  • 6 unit tests in health module covering instability, centrality, and god object detection

Manual Testing (Verified via MCP Server)

The following test cases were executed against the live MCP server:

| Test Case | Parameters | Result |
|---|---|---|
| TC-01: Baseline health | {} | ✅ Pass - Shows health analysis without duplication sections |
| TC-04: Token duplication | {"include_duplication": true} | ✅ Pass - "## Duplication Hotspots" section appears with valid similarity (0-100%) |
| TC-05: Semantic duplication | {"include_semantic_duplication": true} | ✅ Pass - "## Semantic Duplication (Conceptual Clones)" appears |
| TC-08: Both modes | {"include_duplication": true, "include_semantic_duplication": true} | ✅ Pass - Both sections appear in correct order |
| Regression: search_node | query="validate input" | ✅ Pass - No regression |
| Regression: fetch_node | entity_id="..." | ✅ Pass - No regression |
| Regression: explore_rpg | direction="both" | ✅ Pass - No regression |

Verified Duplicates Detected

The token-based detection successfully identified actual duplicates in the codebase:

  • looks_like_custom_hook - Found 3 identical copies across:

    • crates/rpg-parser/src/entities.rs
    • crates/rpg-parser/src/paradigms/features.rs
    • crates/rpg-parser/src/paradigms/helpers.rs
  • looks_like_react_component - Found 2 identical copies across:

    • crates/rpg-parser/src/entities.rs
    • crates/rpg-parser/src/paradigms/helpers.rs

All detected duplicates were manually verified to be 100% identical functions, confirming the detection algorithm works correctly.
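
For readers unfamiliar with the technique, here is a minimal sketch of Rabin-Karp rolling hashes over token windows, the general idea behind the token-based mode. It is not the code in duplication.rs: the names `token_id`, `window_hashes`, and `window_overlap`, the constants `BASE` and `MODULUS`, and the window size are all assumptions made for illustration.

```rust
use std::collections::HashSet;

// Illustrative constants (assumptions, not taken from the PR).
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Map a token to a small integer id with a simple polynomial hash.
fn token_id(token: &str) -> u64 {
    token
        .bytes()
        .fold(0u64, |acc, b| (acc * 31 + b as u64) % MODULUS)
}

/// Rolling hash of every window of `k` consecutive tokens.
fn window_hashes(tokens: &[&str], k: usize) -> Vec<u64> {
    if k == 0 || tokens.len() < k {
        return Vec::new();
    }
    let ids: Vec<u64> = tokens.iter().map(|t| token_id(t)).collect();

    // BASE^(k-1) mod MODULUS, the weight of the token leaving the window.
    let mut high = 1u64;
    for _ in 1..k {
        high = high * BASE % MODULUS;
    }

    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &id in &ids[..k] {
        hash = (hash * BASE + id) % MODULUS;
    }

    let mut out = vec![hash];
    for i in k..ids.len() {
        // Drop the outgoing token, shift, then append the incoming token.
        let outgoing = ids[i - k] * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + ids[i]) % MODULUS;
        out.push(hash);
    }
    out
}

/// Fraction of `a`'s windows whose hash also occurs in `b`, in 0.0..=1.0.
/// Equal hashes only suggest equal windows; a real detector would
/// compare the tokens to rule out collisions.
fn window_overlap(a: &[&str], b: &[&str], k: usize) -> f64 {
    let ha = window_hashes(a, k);
    if ha.is_empty() {
        return 0.0;
    }
    let hb: HashSet<u64> = window_hashes(b, k).into_iter().collect();
    let shared = ha.iter().filter(|h| hb.contains(*h)).count();
    shared as f64 / ha.len() as f64
}
```

Type-2 clones (same structure, renamed identifiers) would additionally need identifiers normalized to a placeholder token before hashing; that step is omitted here.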

Example Usage

{
  "instability_threshold": 0.7,
  "god_object_threshold": 10,
  "include_duplication": true,
  "include_semantic_duplication": true,
  "semantic_similarity_threshold": 0.6
}

Related

  • Implements CHM metrics: In-degree, Out-degree, Instability Index, Degree Centrality
  • Detects Type-1/Type-2 clones via Rabin-Karp rolling hash
  • Detects Type-3/Type-4 clones via Jaccard similarity on lifted features
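
As a rough illustration of the semantic mode, Jaccard similarity compares two feature sets as the size of their intersection divided by the size of their union. The sketch below assumes features are plain strings; the function name `jaccard_sketch` is hypothetical and is not the `jaccard_similarity` function exposed from search.rs.

```rust
use std::collections::HashSet;

/// Jaccard similarity |A ∩ B| / |A ∪ B| over two feature sets.
/// Returns 0.0 when both sets are empty (an assumption for this sketch).
fn jaccard_sketch(a: &HashSet<&str>, b: &HashSet<&str>) -> f64 {
    if a.is_empty() && b.is_empty() {
        return 0.0;
    }
    let intersection = a.intersection(b).count();
    let union = a.len() + b.len() - intersection;
    intersection as f64 / union as f64
}
```

With a `semantic_similarity_threshold` of 0.6 as in the example above, two entities sharing two of four distinct features (similarity 0.5) would not be reported.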

Implements a new MCP tool providing code health metrics based on the
Code Health Meter (CHM) framework from the research paper.

## New Files

- health.rs: Graph health metrics (instability, centrality, god objects)
- duplication.rs: Token-based (Rabin-Karp) and semantic (Jaccard) clone detection

## Key Features

- Instability index: I = Ce / (Ca + Ce)
- Degree centrality (normalized)
- God object detection (high degree + extreme instability)
- Rabin-Karp rolling hash for Type-1/Type-2 clone detection
- Jaccard similarity on lifted features for Type-3/Type-4 detection
- LLM-friendly output via TOON formatter
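
The graph metrics above can be sketched as follows, with Ca as in-degree (afferent coupling) and Ce as out-degree (efferent coupling). This is a minimal illustration, not the code in health.rs: the `Degrees` type, the function names, the (n - 1) normalization for centrality, and the reading of "extreme instability" as instability at or above the threshold are all assumptions.

```rust
use std::collections::HashMap;

/// In-degree (Ca) and out-degree (Ce) of one node.
#[derive(Debug, Default, Clone, Copy)]
struct Degrees {
    in_degree: usize,
    out_degree: usize,
}

/// Count degrees from a list of directed (from, to) dependency edges.
fn degrees<'a>(edges: &[(&'a str, &'a str)]) -> HashMap<&'a str, Degrees> {
    let mut map = HashMap::new();
    for &(from, to) in edges {
        map.entry(from).or_insert_with(Degrees::default).out_degree += 1;
        map.entry(to).or_insert_with(Degrees::default).in_degree += 1;
    }
    map
}

/// Instability I = Ce / (Ca + Ce); 0.0 for an isolated node.
fn instability(d: Degrees) -> f64 {
    let total = d.in_degree + d.out_degree;
    if total == 0 {
        0.0
    } else {
        d.out_degree as f64 / total as f64
    }
}

/// Degree centrality normalized by (n - 1), an assumed normalization.
fn degree_centrality(d: Degrees, node_count: usize) -> f64 {
    if node_count < 2 {
        0.0
    } else {
        (d.in_degree + d.out_degree) as f64 / (node_count - 1) as f64
    }
}

/// Assumed rule: total degree and instability both reach their thresholds.
fn is_god_object(d: Degrees, degree_threshold: usize, instability_threshold: f64) -> bool {
    d.in_degree + d.out_degree >= degree_threshold && instability(d) >= instability_threshold
}
```

For example, a node with two outgoing edges and one incoming edge has I = 2/3, so it depends on others more than others depend on it.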

## Testing

- 17 unit tests in duplication module
- 6 unit tests in health module

## References

- Inspired by: Khalfallah, B. H. (2025). Code Health Meter.
  ACM Trans. Softw. Eng. Methodol. https://doi.org/10.1145/3737670
…mentation, configuration, and audit reports.
…on detection, alongside new audit reports, documentation, and configuration files.
Owner

@userFRM userFRM left a comment

Code Review

Verified locally: merge, fmt, clippy (both default and --no-default-features), and all workspace tests pass.

What's good

  • Clean architecture: health.rs handles graph metrics, duplication.rs handles clone detection — good separation. Both are in rpg-nav where they belong.
  • Solid algorithm choices: Rabin-Karp with rolling hash for token clones, inverted-index + Jaccard for semantic clones. The inverted index avoids O(n²) pair generation — important for large graphs.
  • Good edge case handling: clean_float for NaN/Inf, saturating_sub for line ranges, graceful skip on missing files, empty graph, unlifted entities.
  • Thorough tests: 23 unit tests covering both modules (tokenization, fingerprinting, Type-2 detection, semantic duplicates, boundary conditions, similarity bounds).
  • Non-invasive integration: minimal changes to existing code (search.rs visibility bump, module exports, one new tool handler).
  • Duplicate matches are deduplicated per fingerprint (lines 537-542), which correctly prevents similarity > 1.0.
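
To illustrate the inverted-index point, the sketch below builds a feature-to-entities index and only emits pairs of entities that share at least one feature, so entities with nothing in common are never compared. It is a simplified assumption of how such candidate generation can work, not the code in duplication.rs; `candidate_pairs` and the `(name, features)` input shape are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Return index pairs (i, j), i < j, of entities sharing at least one feature.
fn candidate_pairs(entities: &[(&str, Vec<&str>)]) -> BTreeSet<(usize, usize)> {
    // Inverted index: feature -> indices of the entities that carry it.
    let mut index: HashMap<&str, Vec<usize>> = HashMap::new();
    for (i, (_, features)) in entities.iter().enumerate() {
        // Deduplicate so an entity is listed once per feature.
        let unique: HashSet<&str> = features.iter().copied().collect();
        for feature in unique {
            index.entry(feature).or_default().push(i);
        }
    }

    // Pairs are generated only within each posting list.
    let mut pairs = BTreeSet::new();
    for ids in index.values() {
        for (pos, &a) in ids.iter().enumerate() {
            for &b in &ids[pos + 1..] {
                pairs.insert((a, b));
            }
        }
    }
    pairs
}
```

Each candidate pair would then be scored with Jaccard similarity. The saving depends on features being reasonably selective: a feature carried by every entity still yields every pair.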

Minor notes (non-blocking)

  1. jaccard_similarity was changed from fn to pub(crate) fn and #[allow(dead_code)] removed — clean.
  2. The rayon dep added to rpg-nav is already in the workspace, so no new dependency introduced.
  3. The format_health_report recommendations section uses an emoji — fits the context (LLM output).

LGTM. Merging.

@userFRM userFRM merged commit 4b6519c into userFRM:main Feb 25, 2026
3 checks passed