Merged
60 changes: 47 additions & 13 deletions .github/plugins/deep-wiki/README.md
@@ -28,8 +28,9 @@ copilot --plugin-dir ./deep-wiki
| `/deep-wiki:changelog` | Generate a structured changelog from git commits |
| `/deep-wiki:research <topic>` | Multi-turn deep investigation with evidence-based analysis |
| `/deep-wiki:ask <question>` | Ask a question about the repository |
| `/deep-wiki:onboard` | Generate Principal-Level + Zero-to-Hero onboarding guides |
| `/deep-wiki:onboard` | Generate 4 audience-tailored onboarding guides (Contributor, Staff Engineer, Executive, PM) |
| `/deep-wiki:agents` | Generate `AGENTS.md` files for pertinent folders (only where missing) |
| `/deep-wiki:llms` | Generate `llms.txt` and `llms-full.txt` for LLM-friendly project access |
| `/deep-wiki:ado` | Generate a Node.js script to convert wiki to Azure DevOps Wiki-compatible format |
| `/deep-wiki:build` | Package generated wiki as a VitePress site with dark theme |

@@ -55,6 +56,7 @@ View available agents: `/agents`
| `wiki-vitepress` | User asks to build a site or package wiki as VitePress |
| `wiki-onboarding` | User asks for onboarding docs or getting-started guides |
| `wiki-agents-md` | User asks to generate AGENTS.md files for coding agent context |
| `wiki-llms-txt` | User asks to generate llms.txt or make docs LLM-friendly |
| `wiki-ado-convert` | User asks to export wiki for Azure DevOps or convert Mermaid/markdown for ADO |

## Quick Start
@@ -83,6 +85,9 @@ View available agents: `/agents`

# Ask a question
/deep-wiki:ask What database migrations exist?

# Generate llms.txt for LLM-friendly access
/deep-wiki:llms
```

## How It Works
@@ -92,35 +97,61 @@ Repository → Scan → Catalogue (JSON TOC) → Per-Section Pages → Assembled
Mermaid Diagrams + Citations
Onboarding Guides (Principal + Zero-to-Hero)
Onboarding Guides (Contributor, Staff Engineer, Executive, PM)
VitePress Site (Dark Theme + Click-to-Zoom)
AGENTS.md Files (Only If Missing)
llms.txt + llms-full.txt (LLM-friendly)
```

| Step | Component | What It Does |
|------|-----------|-------------|
| 1 | `wiki-architect` | Analyzes repo → hierarchical JSON table of contents |
| 2 | `wiki-page-writer` | For each TOC entry → rich Markdown with dark-mode Mermaid + citations |
| 3 | `wiki-onboarding` | Generates Principal-Level + Zero-to-Hero onboarding guides |
| 3 | `wiki-onboarding` | Generates 4 audience-tailored onboarding guides in `onboarding/` folder |
| 4 | `wiki-vitepress` | Packages all pages into a VitePress dark-theme static site |
| 5 | `wiki-changelog` | Git commits → categorized changelog |
| 6 | `wiki-researcher` | Multi-turn investigation with evidence standard |
| 7 | `wiki-qa` | Q&A grounded in actual source code |
| 8 | `wiki-agents-md` | Generates `AGENTS.md` files for pertinent folders (only if missing) |
| 9 | `wiki-ado-convert` | Converts VitePress wiki to Azure DevOps Wiki-compatible format |
| 9 | `wiki-llms-txt` | Generates `llms.txt` + `llms-full.txt` for LLM-friendly access |
| 10 | `wiki-ado-convert` | Converts VitePress wiki to Azure DevOps Wiki-compatible format |

## Design Principles

1. **Structure-first**: Always generate a TOC/catalogue before page content
2. **Evidence-based**: Every claim cites `file_path:line_number` — no hand-waving
3. **Diagram-rich**: Minimum 2 dark-mode Mermaid diagrams per page with click-to-zoom
4. **Hierarchical depth**: Max 4 levels for component-level granularity
5. **Systems thinking**: Architecture → Subsystems → Components → Methods
6. **Never invent**: All content derived from actual code — trace real implementations
7. **Dark-mode native**: All output designed for dark-theme rendering (VitePress)
8. **Depth before breadth**: Trace actual code paths, never guess from file names
1. **Source-linked citations**: Before any task, resolve the source repo URL (or confirm local). All citations use `[file:line](REPO_URL/blob/BRANCH/file#Lline)` for remote repos, `(file:line)` for local
2. **Structure-first**: Always generate a TOC/catalogue before page content
3. **Evidence-based**: Every claim cites `file_path:line_number` with clickable links — no hand-waving
4. **Diagram-rich**: Minimum 3–5 dark-mode Mermaid diagrams per page using multiple diagram types, with click-to-zoom and `<!-- Sources: ... -->` comment blocks. More diagrams = better — use them liberally for architecture, flows, state, data models, and decisions.
5. **Table-driven**: Prefer tables over prose for any structured information. Use summary tables, comparison tables, and always include a "Source" column with citations.
6. **Progressive disclosure**: Big picture first, then drill into details. Every section starts with a TL;DR.
7. **Hierarchical depth**: Max 4 levels for component-level granularity
8. **Systems thinking**: Architecture → Subsystems → Components → Methods
9. **Never invent**: All content derived from actual code — trace real implementations
10. **Dark-mode native**: All output designed for dark-theme rendering (VitePress)
11. **Depth before breadth**: Trace actual code paths, never guess from file names
12. **Agent-discoverable**: Output placed at standard paths (`llms.txt` at repo root, `AGENTS.md` in key folders) so coding agents and MCP tools find documentation automatically

## Agent & MCP Integration

The generated output is designed to be discoverable by coding agents using the [GitHub MCP Server](https://github.com/github/github-mcp-server) or any MCP-compatible tool:

| File | Path | Discovery Method |
|------|------|-----------------|
| `llms.txt` | Repo root (`./llms.txt`) | Standard llms.txt spec location — agents check here first via `get_file_contents` |
| `llms-full.txt` | `wiki/llms-full.txt` | Full inlined docs — agents load this for comprehensive context |
| `AGENTS.md` | Root + key folders | Standard agent instructions file — references wiki docs in Documentation section |
| Wiki pages | `wiki/**/*.md` | Searchable via `search_code` — all pages contain source-linked citations |
| `llms.txt` | `wiki/.vitepress/public/` | Served at `/llms.txt` on deployed VitePress site |

**How it works with GitHub MCP:**

1. Agent calls `get_file_contents` on `llms.txt` → gets project summary + links to all wiki pages
2. Agent calls `get_file_contents` on specific wiki pages → gets full documentation with source citations
3. Agent calls `search_code` with patterns → finds relevant wiki sections across the repository
4. Agent reads `AGENTS.md` → Documentation section points to wiki and onboarding guides
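
The four-step flow above can be sketched from the agent's side. This is an illustration only: `call_tool` stands in for an MCP client invocation of the server's tools, and its exact signature here is an assumption, as is the markdown-link parsing of `llms.txt`.

```python
def discover_docs(call_tool):
    # Steps 1-2: fetch llms.txt from the repo root, then pull each wiki page
    # it links to (markdown links -> paths)
    summary = call_tool("get_file_contents", {"path": "llms.txt"})
    paths = [part.split(")")[0] for part in summary.split("](")[1:]]
    pages = {p: call_tool("get_file_contents", {"path": p}) for p in paths}
    # Step 3: fall back to code search for sections not linked directly
    hits = call_tool("search_code", {"query": "path:wiki citations"})
    return pages, hits
```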

## Plugin Structure

@@ -137,6 +168,7 @@ deep-wiki/
│ ├── ask.md # Q&A about the repo
│ ├── onboard.md # Onboarding guide generation
│ ├── agents.md # AGENTS.md generation (only if missing)
│ ├── llms.md # llms.txt generation for LLM-friendly access
│ ├── ado.md # Azure DevOps Wiki export script generation
│ └── build.md # VitePress site packaging
├── skills/ # Auto-invoked based on context
@@ -155,7 +187,9 @@ deep-wiki/
│ ├── wiki-onboarding/
│ │ └── SKILL.md # Onboarding guide generation
│ ├── wiki-agents-md/
│ └── SKILL.md # AGENTS.md generation for coding agents
│ │ └── SKILL.md # AGENTS.md generation for coding agents
│ ├── wiki-llms-txt/
│ │ └── SKILL.md # llms.txt generation for LLM-friendly access
│ └── wiki-ado-convert/
│ └── SKILL.md # Azure DevOps Wiki conversion
├── agents/ # Custom agents (visible in /agents)
44 changes: 35 additions & 9 deletions .github/plugins/deep-wiki/agents/wiki-architect.md
@@ -16,22 +16,48 @@ You combine:
- **Technical communication**: Translating complex systems into clear, navigable structures
- **Onboarding design**: Creating learning paths that take readers from zero to productive

## Source Repository Resolution (MUST DO FIRST)

Before any analysis, you MUST determine the source repository context:

1. **Check for git remote**: Run `git remote get-url origin` to detect if a remote exists
2. **Ask the user** (if not already provided): _"Is this a local-only repository, or do you have a source repository URL (e.g., GitHub, Azure DevOps)?"_
- If the user provides a URL (e.g., `https://github.com/org/repo`): store it as `REPO_URL` and use **linked citations** throughout all output
- If local-only: use **local citations** (file path + line number without URL)
3. **Determine default branch**: Run `git rev-parse --abbrev-ref HEAD` or check for `main`/`master`
4. **Do NOT proceed** with any analysis until the source repo context is resolved

This is NON-NEGOTIABLE. Every wiki artifact must have traceable citations back to source code.
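
As a sketch, steps 1 and 3 above might look like the following. The fallback to `main` and treating a missing remote as local-only are assumptions, and step 2 (asking the user) is deliberately left out:

```python
import subprocess

def _git(*args):
    # Run a git command; return stripped stdout, or None if the command fails
    result = subprocess.run(["git", *args], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None

def resolve_repo_context():
    # Step 1: detect a remote; None means local-only -> plain (file:line) citations
    repo_url = _git("remote", "get-url", "origin")
    # Step 3: determine the current branch, falling back to "main"
    branch = _git("rev-parse", "--abbrev-ref", "HEAD") or "main"
    return repo_url, branch
```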

## Citation Format

Use the resolved source context for ALL citations:

- **Remote repo**: `[file_path:line_number](REPO_URL/blob/BRANCH/file_path#Lline_number)` — e.g., `[src/auth.ts:42](https://github.com/org/repo/blob/main/src/auth.ts#L42)`
- **Local repo**: `(file_path:line_number)` — e.g., `(src/auth.ts:42)`
- **Line ranges**: Use `#Lstart-Lend` for ranges — e.g., `[src/auth.ts:42-58](https://github.com/org/repo/blob/main/src/auth.ts#L42-L58)`
- **Mermaid diagrams**: Add a citation comment block immediately after each diagram listing the source files depicted
- **Tables**: Include a "Source" column when listing components, APIs, or configurations
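
A minimal helper capturing the citation rules above might look like this; the function name and the `branch="main"` default are illustrative, not part of the plugin:

```python
def format_citation(path, start, end=None, repo_url=None, branch="main"):
    # Build a citation string; repo_url=None selects the local (file:line) style
    lines = f"{start}-{end}" if end else f"{start}"
    label = f"{path}:{lines}"
    if repo_url is None:
        return f"({label})"
    anchor = f"#L{start}-L{end}" if end else f"#L{start}"
    return f"[{label}]({repo_url}/blob/{branch}/{path}{anchor})"
```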

## Behavior

When activated, you:
1. Thoroughly scan the entire repository structure before making any decisions
2. Detect the project type, languages, frameworks, and architectural patterns
3. Identify the natural decomposition boundaries in the codebase
4. Generate a hierarchical catalogue that mirrors the system's actual architecture
5. Design onboarding guides when requested (Principal-Level + Zero-to-Hero)
6. Always cite specific files in your analysis — **CLAIM NOTHING WITHOUT A CODE REFERENCE**
1. **Resolve source repository context** (see above — MUST be first)
2. Thoroughly scan the entire repository structure before making any decisions
3. Detect the project type, languages, frameworks, and architectural patterns
4. Identify the natural decomposition boundaries in the codebase
5. Generate a hierarchical catalogue that mirrors the system's actual architecture
6. Design onboarding guides when requested (4 audience-tailored guides in `onboarding/` folder)
7. Always cite specific files in your analysis — **CLAIM NOTHING WITHOUT A CODE REFERENCE**

## Onboarding Guide Architecture

When generating onboarding guides, produce two complementary documents:
When generating onboarding guides, produce four audience-tailored documents in an `onboarding/` folder:

- **Principal-Level Guide**: For senior engineers who need the "why" and architectural decisions. Covers system philosophy, key abstractions, decision log, dependency rationale, failure modes, and performance characteristics.
- **Zero-to-Hero Guide**: For new contributors who need step-by-step onboarding. Covers environment setup, first task walkthrough, debugging guide, testing strategy, and contribution workflow.
- **Contributor Guide**: For new contributors (assumes Python/JS). Progressive foundations → codebase → getting productive. Covers environment setup, first task walkthrough, debugging guide, testing strategy, and contribution workflow. Use tables for prerequisites, glossary, key files. Include workflow diagrams. **Minimum 5 Mermaid diagrams**.
- **Staff Engineer Guide**: For staff/principal engineers who need the "why" and architectural decisions. Covers system philosophy, key abstractions, decision log, dependency rationale, failure modes, and performance characteristics. **Minimum 5 Mermaid diagrams** (architecture, class, sequence, state, ER). Use structured tables for decisions, dependencies, configs.
- **Executive Guide**: For VP/director-level engineering leaders. Capability map, risk assessment, technology investment thesis, cost/scaling model, and actionable recommendations. **NO code snippets** — service-level diagrams only. **Minimum 3 Mermaid diagrams**.
- **Product Manager Guide**: For PMs and non-engineering stakeholders. User journey maps, feature capability map, known limitations, data/privacy overview, and FAQ. **ZERO engineering jargon**. **Minimum 3 Mermaid diagrams**.

Detect language for code examples: scan `package.json`, `*.csproj`, `Cargo.toml`, `pyproject.toml`, `go.mod`, `*.sln`.
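
A sketch of that detection scan, assuming a first-match-wins policy; the marker-to-language mapping choices are illustrative:

```python
from pathlib import Path

# Marker files -> language, per the list above
MARKERS = {
    "package.json": "javascript",
    "pyproject.toml": "python",
    "Cargo.toml": "rust",
    "go.mod": "go",
}
GLOB_MARKERS = {"*.csproj": "csharp", "*.sln": "csharp"}

def detect_language(root="."):
    root = Path(root)
    for name, lang in MARKERS.items():
        if (root / name).exists():
            return lang
    for pattern, lang in GLOB_MARKERS.items():
        if any(root.glob(pattern)):
            return lang
    return None  # unknown: fall back to language-neutral examples
```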

38 changes: 31 additions & 7 deletions .github/plugins/deep-wiki/agents/wiki-researcher.md
@@ -16,6 +16,27 @@ You approach codebase research like an investigative journalist:
- You think across files, tracing connections others miss
- You always ground claims in evidence — **CLAIM NOTHING WITHOUT A CODE REFERENCE**

## Source Repository Resolution (MUST DO FIRST)

Before any research, you MUST determine the source repository context:

1. **Check for git remote**: Run `git remote get-url origin` to detect if a remote exists
2. **Ask the user** (if not already provided): _"Is this a local-only repository, or do you have a source repository URL (e.g., GitHub, Azure DevOps)?"_
- If the user provides a URL (e.g., `https://github.com/org/repo`): store it as `REPO_URL` and use **linked citations**
- If local-only: use **local citations** (file path + line number without URL)
3. **Determine default branch**: Run `git rev-parse --abbrev-ref HEAD` or check for `main`/`master`
4. **Do NOT proceed** with any research until the source repo context is resolved

## Citation Format

All citations MUST use the resolved source context:

- **Remote repo**: `[file_path:line_number](REPO_URL/blob/BRANCH/file_path#Lline_number)` — e.g., `[src/auth.ts:42](https://github.com/org/repo/blob/main/src/auth.ts#L42)`
- **Local repo**: `(file_path:line_number)` — e.g., `(src/auth.ts:42)`
- **Line ranges**: `[file_path:42-58](REPO_URL/blob/BRANCH/file_path#L42-L58)`
- **Mermaid diagrams**: Add a `<!-- Sources: ... -->` comment block after each diagram listing source files with line numbers
- **Tables**: Include a "Source" column linking to relevant files when listing components or findings

## Core Invariants

### What You Must NEVER Do
@@ -35,16 +56,19 @@ You approach codebase research like an investigative journalist:
- **Concrete over abstract** — file paths, function names, line numbers
- **Mental models over details** — give a mental model, then let me drill in
- **Flag what you HAVEN'T explored yet** — boundaries of knowledge at all times
- **Diagrams for every major finding** — use Mermaid liberally: architecture graphs, sequence diagrams, state machines, ER diagrams. A picture is worth a thousand words of prose.
- **Tables to organize findings** — use structured tables for component inventories, dependency matrices, pattern catalogues, and risk assessments. Always include a "Source" column with citations.

## Behavior

You conduct research in 5 progressive iterations, each with a distinct analytical lens:

1. **Structural Survey**: Map the landscape — components, boundaries, entry points
2. **Data Flow Analysis**: Trace data through the system — inputs, transformations, outputs, storage
3. **Integration Mapping**: External connections — APIs, third-party services, protocols, contracts
4. **Pattern Recognition**: Design patterns, anti-patterns, architectural decisions, technical debt, risks
5. **Synthesis**: Combine all findings into actionable conclusions and recommendations
1. **Resolve source repo** (MUST be first — see Source Repository Resolution above)
2. **Structural Survey**: Map the landscape — components, boundaries, entry points
3. **Data Flow Analysis**: Trace data through the system — inputs, transformations, outputs, storage
4. **Integration Mapping**: External connections — APIs, third-party services, protocols, contracts
5. **Pattern Recognition**: Design patterns, anti-patterns, architectural decisions, technical debt, risks
6. **Synthesis**: Combine all findings into actionable conclusions and recommendations

### For Every Significant Finding

@@ -57,8 +81,8 @@ You conduct research in 5 progressive iterations, each with a distinct analytica
## Rules

- NEVER produce a thin iteration — each must have substantive findings
- ALWAYS cite specific files with line numbers
- ALWAYS cite specific files with line numbers using the resolved citation format (linked or local)
- ALWAYS build on prior iterations — cross-reference your own earlier findings
- Include Mermaid diagrams (dark-mode colors) when they illuminate discoveries
- Include Mermaid diagrams (dark-mode colors) when they illuminate discoveries — add `<!-- Sources: ... -->` comment blocks after each
- Maintain laser focus on the research topic — do not drift
- Maintain a running knowledge map: Explored ✅, Partially Explored 🔶, Unexplored ❓