CSharp Language Support Plan
Branch: 110-csharp-code-indexing | Date: 2026-01-27 | Spec: spec.md
Input: Feature specification from .speckit/features/110-csharp-code-indexing/spec.md
Add C# as the 11th supported language in doc-serve's code indexing pipeline. This is a slot-in addition following the established pattern from 10 existing languages: add file extension mappings, tree-sitter parser setup, AST query patterns for symbol extraction, and content-based detection patterns. No new dependencies, no API changes, no architectural modifications.
Language/Version: Python 3.10+
Primary Dependencies: tree-sitter-language-pack (existing, includes c_sharp grammar), LlamaIndex CodeSplitter (existing)
Storage: ChromaDB (existing vector store), disk-based BM25 index (existing)
Testing: pytest (with async support)
Target Platform: macOS/Linux local development
Project Type: Monorepo (doc-serve-server, doc-svr-ctl, doc-serve-skill)
Performance Goals: C# indexing performance on par with existing languages (~100 files/sec)
Constraints: No new external dependencies; no API changes
Scale/Scope: Standard C# projects (1-10k files)
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
- All changes are within `doc-serve-server` (the server package)
- No changes to `doc-svr-ctl` or `doc-serve-skill`
- No cross-package dependency changes
- FR-009 explicitly requires no API changes
- Existing `/index` and `/query` endpoints work unchanged
- The `language=csharp` filter uses the existing `language` query parameter
- Unit tests for C# language detection, AST parsing, and symbol extraction
- Integration tests for end-to-end C# indexing and querying
- Tests follow existing patterns from other language test files
- C# chunks counted in existing `code_chunks` metric (FR-008)
- Existing structured logging covers C# files through the standard pipeline
- No new observability requirements
- ~100 lines of new code (well under 500-line threshold)
- 0 new dependencies
- 2 source files modified (document_loader.py, chunking.py), plus their existing test files
- Follows exact pattern of existing languages — no new abstractions
.speckit/features/110-csharp-code-indexing/
├── plan.md          # This file
├── research.md      # Phase 0 output (6 decisions)
├── quickstart.md    # Phase 1 output
└── tasks.md         # Phase 2 output (/speckit.tasks)

doc-serve-server/
├── doc_serve_server/
│   ├── indexing/
│   │   ├── document_loader.py    # MODIFIED: add .cs/.csx extensions, csharp content patterns
│   │   └── chunking.py           # MODIFIED: add c_sharp parser setup, C# AST query patterns
│   └── ...
└── tests/
    └── unit/
        ├── test_document_loader.py    # MODIFIED: add C# extension and content detection tests
        └── test_chunking.py           # MODIFIED: add C# AST parsing and symbol extraction tests
Structure Decision: Monorepo structure preserved. Only 2 source files modified, plus their existing test files. No new modules or packages created.
Goal: Full C# indexing with AST-aware chunking, symbol metadata extraction, and content detection.
This feature is small enough to implement in a single phase. All three user stories touch the same two files and are tightly coupled.
- Add `.cs` and `.csx` to `EXTENSION_TO_LANGUAGE` dict in `document_loader.py`
- Add `.cs` and `.csx` to `CODE_EXTENSIONS` set in `document_loader.py`
- Add `csharp` content-detection patterns to `CONTENT_PATTERNS` dict in `document_loader.py`
- Add `"c_sharp": "c_sharp"` mapping to `_setup_language()` in `chunking.py`
- Add C# AST query patterns to `_get_symbols()` in `chunking.py`: query for class_declaration, method_declaration, constructor_declaration, interface_declaration, property_declaration, enum_declaration, struct_declaration, record_declaration
- Add XML doc comment extraction logic for C# `///` comments in `chunking.py`
- Write unit tests for all new code
- Run `task before-push` to verify all quality gates
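The first three tasks above can be sketched as follows. This is a minimal illustration, not the project's code: the container names `EXTENSION_TO_LANGUAGE`, `CODE_EXTENSIONS`, and `CONTENT_PATTERNS` come from the plan, but their shapes, the five regexes, the `detect_language` helper, and its `min_hits` threshold are all assumptions made for the example.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed shapes: the plan names these containers but does not show them.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",   # stand-ins for the existing entries
    ".java": "java",
    ".cs": "csharp",   # new: C# source files
    ".csx": "csharp",  # new: C# script files
}
CODE_EXTENSIONS = set(EXTENSION_TO_LANGUAGE)

# Illustrative regexes for the five categories named in the plan
# (using System, namespace, property accessors, attributes, type declarations).
CONTENT_PATTERNS = {
    "csharp": [
        re.compile(r"^\s*using\s+System(\.[\w.]+)?\s*;", re.MULTILINE),
        re.compile(r"^\s*namespace\s+[\w.]+\s*[;{]?\s*$", re.MULTILINE),
        re.compile(r"\{\s*get\s*;\s*(?:(?:set|init)\s*;\s*)?\}"),
        re.compile(r"^\s*\[\w+(\(.*\))?\]\s*$", re.MULTILINE),
        re.compile(r"\b(class|struct|interface|record|enum)\s+\w+"),
    ],
}


def detect_language(path: str, content: str = "", min_hits: int = 2) -> Optional[str]:
    """Hypothetical helper: extension lookup first, content patterns as fallback."""
    ext = Path(path).suffix.lower()
    if ext in EXTENSION_TO_LANGUAGE:
        return EXTENSION_TO_LANGUAGE[ext]
    for language, patterns in CONTENT_PATTERNS.items():
        # Require several distinct patterns so that a bare `class Foo`
        # (also valid Java) is not enough on its own.
        hits = sum(1 for pattern in patterns if pattern.search(content))
        if hits >= min_hits:
            return language
    return None
```

Requiring more than one matching pattern is one way to keep the generic type-declaration regex from misclassifying Java; the real patterns may achieve the same disambiguation differently.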
Files Modified: document_loader.py, chunking.py, test_document_loader.py, test_chunking.py
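The `///` doc comment extraction task could look roughly like the sketch below. It works on raw source lines rather than on the tree-sitter AST, which is a simplification; the function name `extract_doc_comment` and the tag-stripping regex are hypothetical and not taken from `chunking.py`.

```python
import re
from typing import List

# Strips XML tags such as <summary> or <param name="x">; a simplification
# that would also remove any other angle-bracketed text in the comment.
_XML_TAG = re.compile(r"</?[A-Za-z][^>]*>")


def extract_doc_comment(lines: List[str], decl_index: int) -> str:
    """Return the text of the contiguous `///` block above lines[decl_index]."""
    collected: List[str] = []
    i = decl_index - 1
    while i >= 0:
        stripped = lines[i].strip()
        if stripped.startswith("///"):
            collected.append(stripped[3:].strip())
        elif not collected and stripped.startswith("[") and stripped.endswith("]"):
            pass  # skip attribute lines sitting between the comment and the declaration
        else:
            break
        i -= 1
    collected.reverse()  # lines were gathered bottom-up
    text = _XML_TAG.sub(" ", " ".join(collected))
    return " ".join(text.split())  # collapse runs of whitespace
```

For a method preceded by a `<summary>` block and an `[Obsolete]` attribute, this returns the summary and parameter text as one plain string, which is comparable to how a Python docstring would be stored as symbol metadata.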
| Decision | Choice | Rationale |
|---|---|---|
| Grammar source | `tree-sitter-language-pack` `c_sharp` | Already a dependency; no new package needed |
| File extensions | `.cs`, `.csx` | Standard C# source and script extensions |
| Tree-sitter ID | `c_sharp` | Language pack naming convention |
| Symbol types | 9 node types (class, method, constructor, interface, property, enum, struct, record, namespace) | Covers all common C# declarations |
| Doc comments | Extract `///` comment text | Equivalent to Python docstrings and Java Javadoc |
| Content patterns | 5 regex patterns (using System, namespace, property accessors, attributes, type declarations) | Disambiguates C# from Java |
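As an illustration of the "Symbol types" row, the nine node types can be turned into a tree-sitter query string as sketched below. The `CSHARP_SYMBOL_NODES` table and `build_symbol_query` helper are hypothetical, `namespace_declaration` is inferred from the "namespace" entry in the table, and the assumption that every node exposes a `name` field should be checked against the `c_sharp` grammar before use. The sketch only builds the query text and does not call tree-sitter.

```python
from typing import Dict

# Node type -> symbol kind for the nine declarations listed in the table.
# `namespace_declaration` is an inferred node name, not quoted from the plan.
CSHARP_SYMBOL_NODES: Dict[str, str] = {
    "class_declaration": "class",
    "method_declaration": "method",
    "constructor_declaration": "constructor",
    "interface_declaration": "interface",
    "property_declaration": "property",
    "enum_declaration": "enum",
    "struct_declaration": "struct",
    "record_declaration": "record",
    "namespace_declaration": "namespace",
}


def build_symbol_query(nodes: Dict[str, str]) -> str:
    """Build one S-expression pattern per node type.

    `name: (_)` uses the wildcard node so the pattern does not depend on
    whether the name is a plain identifier or a qualified name.
    """
    patterns = [
        f"({node_type} name: (_) @name) @{kind}"
        for node_type, kind in nodes.items()
    ]
    return "\n".join(patterns)
```

Keeping the node types in a single mapping means adding or dropping a declaration kind is a one-line change, in line with the slot-in pattern the plan describes for other languages.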
See research.md for full decision records with alternatives considered.