Graph-based & Hypothesis-driven threat hunting
Ingest security logs, build an entity-relationship graph with causal ordering, and hunt for attack paths using pattern matching with optional MITRE ATT&CK–aligned detection templates. Integrates GNN-based threat classification via ONNX models with optional NPU/GPU acceleration.
- About
- Why Graph-Based Hunting?
- How It Works
- Key Features
- Supported Log Formats
- SIEM Integrations
- Hypothesis DSL & ATT&CK Catalog
- GNN Threat Scoring
- Architecture
- HTTP API & MCP (AI integration)
- Installation
- Usage
- Demo Data & Try It
- Privacy & Data
- Screenshots
- Core Engine Details
- Changelog
- Contributing
- License
Graph Hunter is a graph-based threat hunting engine that turns heterogeneous security telemetry (Sysmon, Microsoft Sentinel, generic JSON, CSV) into a single knowledge graph. Analysts define hypotheses as chains of entity types and relation types (e.g., User →[Auth]→ Host →[Execute]→ Process). The engine finds all paths that match the pattern while enforcing causal monotonicity: each step occurs at or after the previous one in time. Results are explored via an interactive graph canvas, IOC search, timeline and heatmap views, and optional ATT&CK-mapped hypothesis templates.
The engine includes an endogenous anomaly scoring system with five components — Entity Rarity, Edge Rarity, Neighborhood Concentration, Temporal Novelty, and GNN Threat — that automatically prioritizes the most suspicious paths. The GNN component integrates ONNX models (e.g., exported from GraphOS-APT) that classify k-hop subgraphs into threat categories (Benign, Exfiltration, C2 Beacon, Lateral Movement, Privilege Escalation), with optional NPU/GPU acceleration via DirectML.
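As a rough illustration of how five component scores could be folded into a single ranking value, here is a minimal sketch. The weighted-average form, the weight values, and every name in it (`ComponentScores`, `composite_score`, `rank_paths`) are assumptions made for illustration; this README only states that the five components exist and that the GNN component is optional.

```python
from dataclasses import dataclass

@dataclass
class ComponentScores:
    """Per-path component scores, each assumed to be normalized to [0, 1]."""
    entity_rarity: float
    edge_rarity: float
    neighborhood_concentration: float
    temporal_novelty: float
    gnn_threat: float = 0.0  # left at 0.0 when no GNN model is loaded

# Hypothetical weights W1..W5; the engine's real values are not given here.
DEFAULT_WEIGHTS = (0.25, 0.25, 0.2, 0.15, 0.15)

def composite_score(s, weights=DEFAULT_WEIGHTS, use_gnn=True):
    """Weighted average of the components; drops the GNN term if use_gnn is False."""
    values = [
        s.entity_rarity,
        s.edge_rarity,
        s.neighborhood_concentration,
        s.temporal_novelty,
        s.gnn_threat,
    ]
    pairs = list(zip(weights, values))
    if not use_gnn:
        pairs = pairs[:4]  # renormalize over the four endogenous components
    total_weight = sum(w for w, _ in pairs)
    return sum(w * v for w, v in pairs) / total_weight

def rank_paths(scored):
    """Sort (path, score) tuples so the highest composite score comes first."""
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Renormalizing when the GNN term is absent keeps scores comparable between sessions with and without a loaded model; that is a design choice of this sketch, not a documented behavior.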
Screenshot: Exploring nodes on map
Traditional SIEM-style queries are rigid and schema-bound. Attack chains span multiple data sources and event types; correlating them often requires custom rules and manual pivoting. Graph Hunter instead:
- Normalizes diverse log formats into a unified model (entities + typed relations + timestamps).
- Searches by pattern (who executed what, who connected where, what wrote which file) instead of by field names.
- Surfaces multi-hop attack paths that satisfy temporal order, so you see full chains, not isolated events.
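To make the unified model concrete, the sketch below shows one way entities, typed relations, and timestamps could be held in a graph that deduplicates entities and merges their metadata. It is not the Rust core's data structure: the class names, the `(entity_type, name)` identity key, and the integer timestamps are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    entity_type: str                      # e.g. "User", "Host", "Process"
    name: str
    metadata: dict = field(default_factory=dict)

@dataclass(frozen=True)
class Relation:
    source: tuple                         # (entity_type, name) of the source node
    rel_type: str                         # e.g. "Auth", "Execute"
    target: tuple                         # (entity_type, name) of the target node
    timestamp: int                        # epoch seconds (assumed representation)

class KnowledgeGraph:
    def __init__(self):
        self.nodes = {}                   # (entity_type, name) -> Entity
        self.edges = []                   # list of Relation

    def add_entity(self, entity_type, name, **metadata):
        """Insert the entity once; later sightings only merge metadata."""
        key = (entity_type, name)
        node = self.nodes.setdefault(key, Entity(entity_type, name))
        node.metadata.update(metadata)
        return key

    def add_relation(self, source, rel_type, target, timestamp):
        """Add a directed, timestamped edge, skipping exact duplicates."""
        rel = Relation(source, rel_type, target, timestamp)
        if rel not in self.edges:         # linear scan; fine for a sketch
            self.edges.append(rel)
        return rel

# Two records from different log sources that mention the same host.
g = KnowledgeGraph()
host = g.add_entity("Host", "ws01", os="Windows")
g.add_entity("Host", "ws01", ip="10.0.0.5")
user = g.add_entity("User", "alice")
g.add_relation(user, "Auth", host, 100)
```

After the two `add_entity` calls for `ws01`, the graph holds a single Host node carrying both metadata fields.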
Security Logs ──► Parser ──► Knowledge Graph ──► Hypothesis Search ──► Hunt Attack Paths
- Ingest: Load logs in any supported format. The engine auto-detects the format or you can specify it. Parsers extract entities (IP, Host, User, Process, File, Domain, Registry, URL, Service) and relations (Auth, Connect, Execute, Read, Write, DNS, Modify, Spawn, Delete) with timestamps.
- Build Graph: Entities become nodes, relations become directed edges. Duplicate entities are deduplicated; metadata is merged.
- Hunt: Define a hypothesis as a chain of typed steps (e.g., User →[Auth]→ Host →[Execute]→ Process). The engine finds all paths matching the pattern with causal monotonicity (each step at or after the previous one). Optional k-simplicity allows a vertex to repeat up to k times per path.
- Explore: Search for IOCs, expand node neighborhoods, inspect metadata and anomaly scores, pivot via Events view, Heatmap, and Timeline.
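The Hunt step can be pictured as a depth-first search that only follows edges matching the next typed step and never steps backwards in time. The sketch below is a simplified Python illustration of that idea, not the engine's Rust implementation; the tuple layout of `edges`, the flat list form of `hypothesis`, and the function name `hunt` are assumptions.

```python
from collections import defaultdict

def hunt(edges, hypothesis):
    """Find paths matching a typed chain with non-decreasing timestamps.

    edges: iterable of (src, src_type, rel, dst, dst_type, timestamp)
    hypothesis: [type0, rel1, type1, rel2, type2, ...]; "*" matches anything.
    """
    out = defaultdict(list)               # node -> [(rel, dst, timestamp)]
    node_types = {}
    for src, src_type, rel, dst, dst_type, ts in edges:
        out[src].append((rel, dst, ts))
        node_types[src] = src_type
        node_types[dst] = dst_type

    def matches(pattern, value):
        return pattern == "*" or pattern == value

    # Pair each relation with the entity type it must lead to.
    steps = [(hypothesis[i], hypothesis[i + 1]) for i in range(1, len(hypothesis), 2)]
    results = []

    def dfs(node, step_idx, last_ts, path):
        if step_idx == len(steps):
            results.append(list(path))
            return
        rel_pat, type_pat = steps[step_idx]
        for rel, dst, ts in out[node]:
            if not matches(rel_pat, rel) or not matches(type_pat, node_types[dst]):
                continue
            if ts < last_ts:              # causal monotonicity
                continue
            if dst in path:               # simple paths only in this sketch (k = 1)
                continue
            path.append(dst)
            dfs(dst, step_idx + 1, ts, path)
            path.pop()

    for node, ntype in node_types.items():
        if matches(hypothesis[0], ntype):
            dfs(node, 0, float("-inf"), [node])
    return results

edges = [
    ("alice", "User", "Auth", "ws01", "Host", 100),
    ("ws01", "Host", "Execute", "powershell.exe", "Process", 150),
    ("ws01", "Host", "Execute", "cmd.exe", "Process", 50),   # before the logon
]
paths = hunt(edges, ["User", "Auth", "Host", "Execute", "Process"])
```

Here `paths` contains only the chain ending in `powershell.exe`; the `cmd.exe` execution is dropped because it happened before the authentication event.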
Screenshot: Ingesting data
| Area | Features |
|---|---|
| Engine | Temporal pattern matching (DFS + causal monotonicity), 5-component endogenous anomaly scoring (ER, EdgeR, NC, TN, GNN Threat), parallel parsing (Rayon), entity/relation deduplication |
| GNN Scoring | ONNX model inference for k-hop subgraph classification (5 threat classes), DirectML NPU/GPU acceleration, batch scoring, configurable k-hop depth, feature-gated (ml-scoring) |
| Formats | Sysmon, EVTX, Microsoft Sentinel, generic JSON (80+ field variants), CSV |
| Hypotheses | Visual step builder or DSL (User -[Auth]-> Host -[Execute]-> Process); wildcards (*) for any type; ATT&CK hypothesis catalog with one-click load |
| UI | Sessions (multiple graphs, persisted); Hunt vs Explorer modes; Events, Heatmap, Timeline views; Path Nodes (pinned nodes); Notes (standalone or node-linked); GNN Threat Model panel; paginated hunt results for large path sets |
| Data | Configurable generic parser (field → entity type mapping); preview before ingest; dataset list per session (remove/rename) |
| SIEM integrations | Azure Sentinel (Log Analytics): KQL queries, workspace + tenant/client/secret (env or UI). Elasticsearch: index + query JSON, API key or user/password (env or UI). Query-based ingest via gateway or CLI; results loaded into the graph. |
Graph Hunter supports Sysmon, Microsoft Sentinel, generic JSON (80+ field variants), and CSV. Use Auto-detect to let the engine choose the parser from content heuristics, or select a format manually.
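The content heuristics used by Auto-detect are not spelled out in this README. Purely as an illustration of what content-based detection can look like, the sketch below guesses a format from a text sample. The specific checks (a `TimeGenerated` field for Sentinel, `EventID` plus a Sysmon `Channel` for Sysmon, a comma in the first line for CSV) and the function name are assumptions, not the engine's actual rules.

```python
import json

def detect_format(sample: str) -> str:
    """Guess a log format from a text sample using simple content checks."""
    text = sample.lstrip()
    if text.startswith("{") or text.startswith("["):
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            # Possibly JSON Lines: fall back to parsing the first line only.
            try:
                data = json.loads(text.splitlines()[0])
            except json.JSONDecodeError:
                return "unknown"
        record = data[0] if isinstance(data, list) and data else data
        if not isinstance(record, dict):
            return "generic_json"
        if "TimeGenerated" in record:
            return "sentinel"
        if "EventID" in record and "Sysmon" in str(record.get("Channel", "")):
            return "sysmon"
        return "generic_json"
    first_line = text.splitlines()[0] if text else ""
    if "," in first_line:
        return "csv"
    return "unknown"
```

When a heuristic like this guesses wrong, selecting the format manually is the fallback, as described above.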
Full details (event IDs, Sentinel tables, triples, generic field mapping, CSV): Supported log formats in the documentation.
Graph Hunter can pull data directly from Azure Sentinel (Log Analytics) and Elasticsearch via their APIs—run a query, then ingest the results into your session.
| SIEM | Auth | Usage |
|---|---|---|
| Azure Sentinel | Tenant ID, Client ID, Client Secret (env or UI) | Workspace ID + KQL query; default: SecurityEvent, last 24h |
| Elasticsearch | API key or User/Password (env or UI) | Cluster URL, index, query JSON, size |
Available in the web app with gateway (Datasets → Data Ingestion) or via the gateway API (POST /api/ingest/query). In the desktop app without a gateway, export the data from your SIEM first, then load it with From file. See SIEM query-based ingest in the docs for env vars and pagination.
DSL — Build hypotheses as arrow chains with optional wildcards:
User -[Auth]-> Host -[Execute]-> Process
Process -[DNS]-> Domain -[Connect]-> IP
* -[Execute]-> Process -[Spawn]-> Process
Catalog — Pre-built hypotheses mapped to MITRE ATT&CK (e.g., Valid Accounts T1078, Credential Dumping T1003, RDP Lateral Movement T1021.001, C2 T1071). Load from the catalog or use them as templates for custom chains.
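To show how an arrow chain in the DSL breaks down into typed steps, here is a small parsing sketch. It handles only the ASCII `-[Rel]->` form shown in the DSL examples and is not the engine's parser; the function name and the `(source, relation, target)` tuple output are assumptions.

```python
import re

# One arrow such as " -[Auth]-> "; the relation name is captured.
_ARROW = re.compile(r"\s*-\[\s*(\w+|\*)\s*\]->\s*")
_TOKEN = re.compile(r"\w+|\*")

def parse_hypothesis(dsl: str):
    """Split 'A -[Rel]-> B -[Rel]-> C' into (source, relation, target) steps."""
    # With one capture group, re.split yields: type, rel, type, rel, type, ...
    parts = _ARROW.split(dsl.strip())
    if len(parts) < 3 or len(parts) % 2 == 0:
        raise ValueError(f"not a valid chain: {dsl!r}")
    for token in parts:
        if not _TOKEN.fullmatch(token):
            raise ValueError(f"bad token {token!r} in {dsl!r}")
    return [(parts[i], parts[i + 1], parts[i + 2]) for i in range(0, len(parts) - 2, 2)]

steps = parse_hypothesis("User -[Auth]-> Host -[Execute]-> Process")
```

Each step shares its target with the next step's source, which is what makes the chain a path rather than a set of independent edge queries.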
Graph Hunter can use GNN-based threat classification via ONNX models (e.g. from GraphOS-APT): the engine extracts k-hop subgraphs, runs inference (DirectML/GPU or CPU), and injects a 5-class threat score (Benign, Exfiltration, C2 Beacon, Lateral Movement, Privilege Escalation) into the anomaly scorer as weight W5. Hunt results are then ranked by the composite score so high-threat paths appear first. GNN scoring is optional and off by default; load a model and click Compute Scores in the GNN Threat Model panel to enable it.
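The k-hop subgraph extraction mentioned above can be sketched as a bounded breadth-first search around a center node. This is an illustrative outline only: treating edges as undirected during the walk, the `(src, dst)` edge tuples, and the function name are assumptions, and feature encoding plus ONNX inference are left out entirely.

```python
from collections import deque

def k_hop_subgraph(edges, center, k):
    """Return (nodes, edges) within k hops of center, ignoring edge direction."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    depth = {center: 0}                   # node -> hop distance from center
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if depth[node] == k:              # do not expand beyond k hops
            continue
        for nxt in neighbors.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    nodes = set(depth)
    # Keep original (directed) edges whose endpoints both fall in the subgraph.
    sub_edges = [(s, d) for s, d in edges if s in nodes and d in nodes]
    return nodes, sub_edges
```

A larger k gives the classifier more context per node at the cost of bigger subgraphs per inference call, which is presumably why the depth is configurable.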
Pre-trained ONNX model: Download from Hugging Face.
Full details (pipeline, threat classes, UI workflow, training): GNN Threat Scoring in the documentation.
Graph Hunter is split into a Rust core (domain logic, parsing, graph, search), a Tauri + React desktop app (UI and persistence), and optional graph-hunter-mcp for AI assistants. The core holds all business logic; the app exposes commands, session state, and an HTTP API.
Full details (directory layout, core modules, app structure, data flow): Architecture in the documentation.
You need Rust, Node.js, and the Tauri v2 prerequisites—no extra services or accounts. Follow the steps below; the first run may take a few minutes while dependencies build.
- Install prerequisites (if not already installed):
  - Rust (2024 edition)
  - Node.js (v18+)
  - Platform-specific build tools: see Tauri prerequisites
- Clone and run in development:
  cd app
  npm install
  npm run tauri dev
- Verify: The app window opens. Create a session, load demo_data/apt_attack_simulation.json with Auto-detect, then run a hunt (e.g. Hunt Mode → add step User -[Auth]-> Host → Run). If you see paths and the graph, you’re ready to go.
Run tests:
cd graph_hunter_core
cargo test
Build for production:
cd app
npm run tauri build
Minimal run: start the app, load a log file, and hunt.
cd app && npm run tauri dev
Then in the UI: create or select a session → Select Log File → choose a file from demo_data/ (or your own) → Auto-detect → load. Switch to Hunt Mode, build a hypothesis (or pick one from the ATT&CK catalog), and click Run. Results appear in the graph and in the hunt table when there are many paths.
Three attack simulation datasets are included in demo_data/:
| File | Format | Scenario |
|---|---|---|
| apt_attack_simulation.json | Sysmon | APT kill chain: spearphishing, discovery, Mimikatz, PsExec, C2, exfiltration |
| sentinel_attack_simulation.json | Sentinel | Cloud-to-on-prem: brute-force DC, Azure AD abuse, lateral movement, beacon, exfiltration |
| generic_csv_logs.csv | CSV | Firewall/proxy logs: normal + C2, SMB lateral, exfiltration attempts |
Quick run:
- Start the app: npm run tauri dev (from app/).
- Create or select a session; choose Auto-detect (or a specific format), then load a demo file.
- Open Hunt Mode and build a hypothesis, e.g.:
  - User →[Execute]→ Process →[Write]→ File (malware drop)
  - User →[Auth]→ Host (lateral auth)
  - Host →[Connect]→ IP (C2)
  - Process →[Spawn]→ Process (parent-child chains)
  - Or pick a pattern from the ATT&CK catalog.
- Switch to Explorer Mode to search IOCs and expand neighborhoods; use Events, Heatmap, and Timeline for context.
Real-world datasets (OTRF/Mordor, Splunk attack_data)
For large-scale testing with real attack telemetry, see demo_data/DOWNLOAD_REAL_DATA.md for download and conversion instructions (OTRF Security-Datasets, Mordor, Splunk attack_data).
All processing is local. Logs are read from files you select; no data is sent to external services. Sessions and notes are stored in your OS application data directory. No telemetry or analytics are included.
The engine provides temporal pattern matching (DFS with causal monotonicity), time-window filtering, 5-component endogenous anomaly scoring (optional GNN), k-simplicity for path constraints, parallel parsing (Rayon), and entity/relation deduplication. Entity and relation types, and full module descriptions, are in the documentation.
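Two of the constraints listed above, k-simplicity and time-window filtering, are easy to state as small predicates over a candidate path. The sketch below is illustrative only; the function names are hypothetical, and applying the window to every edge timestamp in a path is an assumption about how the filter behaves.

```python
from collections import Counter

def is_k_simple(path, k=1):
    """True if no vertex appears more than k times in the path."""
    return all(count <= k for count in Counter(path).values())

def in_time_window(timestamps, start=None, end=None):
    """True if every edge timestamp lies in [start, end]; None means unbounded."""
    return all(
        (start is None or ts >= start) and (end is None or ts <= end)
        for ts in timestamps
    )
```

With k = 1 only simple paths survive; raising k admits paths that revisit a vertex, for example a process that authenticates back to a host it already touched.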
Full details: Architecture (core modules and data flow); Hypothesis & catalog (DSL, k-simplicity); Log formats (entity and relation types).
When the desktop app is running with a session loaded, it exposes an HTTP API on 127.0.0.1:37891 (configurable via GRAPHHUNTER_API_PORT). This allows external tools to query the graph (entity types, search, expand nodes, run hunts, create notes) without using the UI. The API is protected by token authentication: at startup the app prints GRAPHHUNTER_API_TOKEN=<uuid> to the console; clients (e.g. the MCP server) must send this token (e.g. via Authorization: Bearer <token> or the GRAPHHUNTER_API_TOKEN env var) or requests return 401 Unauthorized.
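A client can assemble an authenticated request from the pieces described above: the loopback address, the port (default 37891 or `GRAPHHUNTER_API_PORT`), and the bearer token. The sketch below only builds the request object and does not send it. The helper name `build_request` and the `/example` path are placeholders, since this README does not list the API's endpoint paths.

```python
import os
import urllib.request
from typing import Optional

def build_request(path: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the local Graph Hunter API with a bearer token."""
    port = os.environ.get("GRAPHHUNTER_API_PORT", "37891")
    token = token or os.environ.get("GRAPHHUNTER_API_TOKEN", "")
    if not token:
        raise RuntimeError("GRAPHHUNTER_API_TOKEN is not set")
    url = f"http://127.0.0.1:{port}{path}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

# "/example" is a placeholder path, not a documented endpoint.
req = build_request("/example", token="token-printed-at-startup")
```

Passing `req` to `urllib.request.urlopen` would send it; per the description above, a missing or wrong token results in 401 Unauthorized.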
The graph-hunter-mcp package is an MCP (Model Context Protocol) server that turns these operations into tools for AI assistants (e.g. Claude Code). You can ask the AI to hunt for malicious paths, expand nodes, or summarize findings while the app holds the session and graph.
| Prerequisite | Description |
|---|---|
| App running | Start the Tauri app and load or create a session with data. |
| API token | Copy GRAPHHUNTER_API_TOKEN from the app startup log into your MCP config env so the MCP can authenticate. |
| MCP config | Add the graph-hunter-mcp server to your MCP client pointing at the app’s API URL. |
Usage sample — Once the MCP is connected, you can ask the AI assistant in natural language to run hunts and explore the graph. For example:
- "Use Graph Hunter to find any user who logged into a Host and then ran a suspicious process that wrote to the System32 folder."
The assistant translates your request into one or more searches and the appropriate hypothesis (e.g. User -[Auth]-> Host -[Execute]-> Process -[Write]-> File with filters), then runs the hunt.
Quick setup: See graph-hunter-mcp/README.md for install, mcp.json example, tool list, and troubleshooting (firewall, port, 401, session required).
This project is licensed under the GNU General Public License v3.0 — see the LICENSE file for details.








