Note: Originally built for the DKGcon2025 Hackathon, powered by OriginTrail.
This tool helps you compare Grokipedia and Wikipedia topics. It parses structured snapshots, computes discrepancies, and drafts Community Notes so that you can publish verifiable context on the OriginTrail DKG.
GWALN is built on three key layers, leveraging the OriginTrail Decentralized Knowledge Graph (DKG):
- 🤖 Agent Layer: The MCP Server exposes all CLI capabilities (fetch, analyze, publish) to AI agents (like Claude or Cursor) via the Model Context Protocol. This allows agents to autonomously verify information and interact with the DKG.
- 🧠 Knowledge Layer: The tool normalizes unstructured data (Wikipedia/Grokipedia) into Structured Knowledge Assets (JSON-LD). These assets are analyzed for discrepancies and formatted as standardized ClaimReviews.
- 🔗 Trust Layer:
  - OriginTrail DKG: Verifiable publication of Knowledge Assets on the DKG Edge Node.
  - NeuroWeb/Polkadot: Blockchain consensus for DKG operations.
  - x402: Implements the x402 payment standard for incentivized access to premium MCP tools.
GWALN CLI is intended for analysts and contributors who review AI-generated encyclopedia content. It is meant to help them fetch topic snapshots, analyze alignment gaps, and package Community Notes for publication.
GWALN CLI reads wikitext and Grokipedia HTML, normalizes both into a shared JSON schema, and runs an analyzer that aligns sections, claims, and citations. The CLI then outputs structured analysis files, an HTML report, and JSON-LD Community Notes. For more details about the technical implementation, see the developer documentation.
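To make the shared schema concrete, here is a minimal sketch of the normalized snapshot shape. The field names are illustrative assumptions drawn from this guide, not the exact types defined in `src/lib/`:

```ts
// Illustrative sketch only: field names are assumptions based on this
// guide, not the actual types in src/lib/.
interface Citation {
  id: string;       // e.g. "ref-12"
  url?: string;
  title?: string;
}

interface Section {
  heading: string;        // section title ("" for the lead)
  sentences: string[];    // sentence-split plain text
  claims: string[];       // extracted factual statements
  citations: Citation[];  // references attached to this section
}

interface ParsedTopic {
  topicId: string;            // e.g. "moon"
  source: "wiki" | "grok";    // which encyclopedia the snapshot came from
  sections: Section[];        // lead first, then body sections
  media: string[];            // URLs of images and other attachments
}
```

Both parsers emit the same shape, which is what lets the analyzer align sections and claims across the two sources.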
This tool cannot edit Grokipedia or Wikipedia. It does not have content moderation powers and cannot publish to the OriginTrail DKG without valid signing keys. Users still need to review the findings and decide whether to publish.
Before using this tool, you should be familiar with:
- Basic command-line usage and Node.js tooling.
- OriginTrail concepts, including DKG nodes and Knowledge Assets.
You should have:
- Node.js 20.18.1 or later on macOS, Linux, or Windows
- Network access to a DKG edge node (NeuroWeb) and sufficient $TRAC tokens if you plan to publish.
- Optional: a Google Gemini API key if you use automated bias verification.
- Install dependencies:

    npm install
    npm run build
    npm link   # optional

- Run the setup wizard:

    gwaln init

  - Provide your DKG endpoint, environment, and port.
  - Supply blockchain identifiers and signing keys.
  - Set publish defaults such as epochs, retries, and dry-run mode.

- Confirm that `~/.gwaln/.gwalnrc.json` contains the expected values.
Run the topics helper when you need to copy the bundled catalog or ingest a custom JSON feed (local file or HTTPS endpoint). By default it writes to `~/.gwaln/topics.json`.

    gwaln topics sync

To pull from a remote or local feed:

    gwaln topics sync \
      --source https://example.org/gwaln-topics.json \
      --output ~/analyst/topics.json

`--source` accepts either a path on disk or an HTTPS URL. Use `--output` if you need to mirror the catalog elsewhere; the CLI still keeps `~/.gwaln/topics.json` up to date for its own use.
Before fetching snapshots, you can search for topics in your local catalog or discover new ones using the lookup command.
- Search for a topic in the local catalog:

    gwaln lookup "Moon"

  This checks whether the topic exists in `~/.gwaln/topics.json` by title and displays its details if found.

- Search both Grokipedia and Wikipedia APIs for a new topic:

    gwaln lookup "Bitcoin"

  If the topic is not found locally, the CLI automatically searches both platforms and prompts you to select matching entries to add to `~/.gwaln/topics.json`.

- Limit the number of search results:

    gwaln lookup "Blockchain" --limit 3

  The default limit is 5 results per platform.
- Select a topic ID from `~/.gwaln/topics.json` (for example, `moon`).

- Download raw Wikipedia data:

    gwaln fetch wiki --topic moon

- Download the Grokipedia counterpart:

    gwaln fetch grok --topic moon

- Verify that `~/.gwaln/data/wiki/<topic>.parsed.json` and `~/.gwaln/data/grok/<topic>.parsed.json` exist.
- Run the analyzer:

    gwaln analyse --topic moon

  To bypass cached results and regenerate even if inputs are unchanged:

    gwaln analyse --topic moon --force

  By default, bias cues are keyword-only. Enable transformer-based semantic bias detection (slower; it downloads a model) when you need it:

    gwaln analyse --topic moon --force --semantic-bias
- Review the terminal summary:

    gwaln show --topic moon

- Generate an HTML dashboard for presentations:

    gwaln show --topic moon --open-html

- Open `~/.gwaln/analysis/moon-report.html` in a browser to explore section alignment, numeric/entity discrepancies, bias cues, and diff samples.
- Create a ClaimReview draft:

    gwaln notes build \
      --topic moon \
      --summary "Grok omits the NASA mission context and adds speculative claims." \
      --accuracy 3 --completeness 3 --tone-bias 3 \
      --stake-token TRAC --stake-amount 0

- Inspect the output in `~/.gwaln/notes/moon.json` and `~/.gwaln/notes/index.json`. Example JSON-LD Community Note:

    {
      "@context": ["https://schema.org", "https://www.w3.org/ns/anno.jsonld"],
      "@type": "ClaimReview",
      "@id": "urn:gwaln:note:moon:2024-11-26T12:00:00.000Z",
      "topic_id": "moon",
      "claimReviewed": "Comparison of Moon entries on Grokipedia and Wikipedia",
      "reviewRating": {
        "@type": "Rating",
        "ratingValue": "4.8",
        "ratingExplanation": "Detected 2 discrepancies including 0 bias issues..."
      },
      "gwalnTrust": {
        "accuracy": 3,
        "completeness": 3,
        "tone_bias": 3,
        "stake": { "token": "TRAC", "amount": 0 }
      },
      "citation": [
        { "@type": "CreativeWork", "name": "Wikipedia", "url": "..." },
        { "@type": "CreativeWork", "name": "Grokipedia", "url": "..." }
      ]
    }

- Publish to OriginTrail (ensure your config has live signing keys):

    gwaln notes publish --topic moon

- Record the printed UAL for reporting.
Retrieve previously published Community Notes from the DKG by topic title:

    gwaln query --topic "Moon" --save moon-retrieved

The query command uses the DKG as the source of truth: it first checks a local UAL cache and, if nothing is found, searches the DKG directly via SPARQL for the most recent published Community Note on the topic.

You can also query by UAL directly for advanced use cases:

    gwaln query --ual "did:dkg:base:8453/0xc28f310a87f7621a087a603e2ce41c22523f11d7/666506" --save moon-retrieved

This retrieves the assertion and optional metadata, displays them in the terminal, and optionally saves the result to `~/.gwaln/data/dkg/moon-retrieved.json`. You can override connection settings with flags such as `--endpoint`, `--blockchain`, or `--private-key`.
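As a rough illustration of the SPARQL lookup described above, a query along the following lines could locate a published ClaimReview by topic title. The predicates and graph layout here are hypothetical; GWALN's actual query lives in its `query` command implementation:

```ts
// Hypothetical sketch: the real predicates and graph layout used by
// `gwaln query` are not documented here.
const topicTitle = "Moon";

const sparql = `
  PREFIX schema: <https://schema.org/>
  SELECT ?note ?reviewed WHERE {
    ?note a schema:ClaimReview ;
          schema:claimReviewed ?reviewed .
    FILTER(CONTAINS(LCASE(STR(?reviewed)), LCASE("${topicTitle}")))
  }
  LIMIT 1
`;

console.log(sparql);
```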
The same workflows are also available to AI agents via the Model Context Protocol (see the official docs). This lets tools such as Claude Code, Cursor, and MCP Inspector call `fetch`, `analyze`, `notes`, `publish`, `query`, and `show` without duplicating logic.
- Start the stdio server:

    npm run mcp

  The process stays attached to your terminal so you can connect via an MCP-aware client (Claude Code, Cursor's MCP configuration, or MCP Inspector).

- Register the server with your MCP client. For example, run `npx @modelcontextprotocol/inspector` and point it at the stdio process, or in Cursor add a "custom MCP server" that runs `npm run mcp`.

- Typical agent flow:

  - `fetch` with `source="both"` (or specify `wiki`/`grok`) to grab the on-disk snapshots for a topic.
  - `analyze` with `topicId` (optionally `force`, `verifyCitations`, or Gemini settings) to produce or refresh `analysis/<topic>.json`.
  - `show` with `topicId` (plus `renderHtml=true` if you want an HTML file path) to summarize the structured analysis and note draft.
  - `notes` with `action="build"` to regenerate the Community Note for that topic; once reviewed, call `notes` with `action="publish"` and either supply `ual` or let it hit the DKG node.
  - If you need to publish arbitrary JSON-LD assets (outside the note flow), call `publish` with either a `filePath` or an inline `payload`.
Each MCP tool mirrors the CLI flags (see the example request below):

- `fetch`: `{ source?, topicId? }`
- `analyze`: `{ topicId?, force?, biasVerifier?, geminiKey?, geminiModel?, geminiSummary?, verifyCitations? }`
- `notes`: discriminated union for `build`, `publish`, or `status`
- `publish`: `{ filePath?, payload?, privacy?, endpoint?, environment?, ... }`
- `show`: `{ topicId, renderHtml? }`
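For example, calling the `analyze` tool over MCP is a standard JSON-RPC `tools/call` request. The envelope below follows the Model Context Protocol; the argument values are illustrative:

```ts
// A JSON-RPC 2.0 request an MCP client sends to invoke the analyze tool.
// The method and envelope follow the Model Context Protocol; the
// argument values are illustrative.
const callAnalyze = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "analyze",
    arguments: {
      topicId: "moon",
      force: true,
      verifyCitations: false,
    },
  },
};

console.log(JSON.stringify(callAnalyze, null, 2));
```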
Because the MCP server calls the same workflow modules as the CLI, cached files, Gemini credentials, and `~/.gwaln/.gwalnrc.json` are honored automatically.
The server reads DKG credentials and defaults from `~/.gwaln/.gwalnrc.json` via the same `resolvePublishConfig` helper used by the CLI, so you never have to expose secrets through the MCP request itself. Just keep the config file up to date with `gwaln init`.
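Assembled from the keys this guide mentions elsewhere (`dkgPrivateKey`, `dkgPublicKey`, `publishMaxRetries`), the config might be shaped roughly like this; treat it as a hypothetical sketch and let `gwaln init` generate the real file:

```ts
// Hypothetical shape of ~/.gwaln/.gwalnrc.json, inferred from the keys
// mentioned in this guide. Run `gwaln init` rather than writing it by hand.
interface GwalnConfig {
  endpoint: string;            // DKG edge node URL
  environment: string;         // e.g. "testnet"
  port?: number;
  blockchain?: string;         // target chain identifier
  dkgPublicKey: string;
  dkgPrivateKey: string;       // signing key; keep this file private
  publishEpochs?: number;      // publish defaults set by the wizard
  publishMaxRetries?: number;
  dryRun?: boolean;
}
```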
When you run `npm run mcp`, the process spins up a single endpoint (`POST /mcp`, default URL `http://127.0.0.1:3233/mcp`). MCP clients must first call `initialize`; the server then creates a dedicated session using the Model Context Protocol's session headers and reuses it for the subsequent `tools/list`, `tools/call`, and other requests. There are no extra discovery routes; just point your MCP client at that one URL.
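A minimal handshake against that endpoint looks roughly like the sketch below. The URL comes from this guide; the `initialize` method and `Mcp-Session-Id` header follow the Model Context Protocol's HTTP transport, so verify them against your client's docs:

```ts
// Minimal sketch of the MCP handshake against the local endpoint.
// The URL comes from this guide; the method and session header follow
// the Model Context Protocol's HTTP transport.
const MCP_URL = "http://127.0.0.1:3233/mcp";

async function initialize(): Promise<string | null> {
  const res = await fetch(MCP_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05",
        capabilities: {},
        clientInfo: { name: "gwaln-docs-example", version: "0.0.1" },
      },
    }),
  });
  // The server issues a session ID that subsequent tools/list and
  // tools/call requests must echo back.
  return res.headers.get("mcp-session-id");
}

initialize().then((session) => console.log("session:", session));
```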
The MCP server implements the x402 payment standard (via `src/lib/x402.ts`) to monetize premium tools such as `query`, `publish`, and `lookup` on the NeuroWeb testnet.

- Free tools: `fetch`, `analyze`, `show`
- Paywalled tools: `query`, `publish`, `lookup` (each requires a 1 TRAC payment)
When an AI agent attempts to use a paywalled tool without payment, the server returns a 402 Payment Required error with payment details. The agent can then facilitate the payment on-chain and retry the request with the payment proof.
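The retry loop an agent runs looks roughly like the sketch below. The `X-PAYMENT` header and the shape of the payment proof are assumptions based on the general x402 pattern, not GWALN's documented wire format:

```ts
// Hedged sketch of the x402 retry flow: call a paywalled tool, and if
// the server answers 402, settle the payment and retry with a proof.
// The X-PAYMENT header name is an assumption from the x402 pattern,
// not GWALN's documented wire format.
async function callPaywalledTool(url: string, body: unknown): Promise<unknown> {
  const post = (extraHeaders: Record<string, string>) =>
    fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json", ...extraHeaders },
      body: JSON.stringify(body),
    });

  let res = await post({});
  if (res.status === 402) {
    // The 402 body advertises the price, asset, and pay-to address.
    const requirements = await res.json();
    const proof = await settleOnChain(requirements); // e.g. a 1 TRAC transfer
    res = await post({ "X-PAYMENT": proof });
  }
  return res.json();
}

// Placeholder: a real agent would sign and submit the transfer here.
async function settleOnChain(requirements: unknown): Promise<string> {
  throw new Error(`payment required: ${JSON.stringify(requirements)}`);
}
```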
`Analysis not found for topic`

- Run both `gwaln fetch wiki --topic <id>` and `gwaln fetch grok --topic <id>` before analyzing.

`DKG publish failed: UNAUTHORIZED`

- Ensure `~/.gwaln/.gwalnrc.json` contains valid `dkgPrivateKey`, `dkgPublicKey`, and endpoint values; confirm the key has a sufficient balance on the target chain.
- Report issues at the GitHub issue tracker for this repository.
- Ask questions by opening a discussion or contacting the maintainers on the project chat. You can expect a response within one week.
To review the full analyzer pipeline, see docs/analyzer-overview.md. To learn about the HTML report metrics, see docs/html-report-metrics.md.
The CLI uses Node.js and Commander.js to expose subcommands. Parsing is handled by a custom module that converts Wikipedia wikitext and Grokipedia HTML into identical structured JSON (lead/sections, sentences, claims, citations, media attachments). The analyzer stage:
- normalizes sentences into token sets and compares them to detect missing or extra context
- aligns sections and claims using cosine similarity from the `string-similarity` library
- computes numeric discrepancies via relative-difference heuristics and entity discrepancies via set symmetric differences (see the sketch below)
- flags bias/hallucination cues through lexicon scans plus subjectivity/polarity scoring
Optional verification hooks call the Gemini API for bias confirmation and run citation checks against Grokipedia references.
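The two discrepancy heuristics are easy to picture in isolation; the following sketch uses illustrative thresholds and signatures, not the actual code in `src/lib/`:

```ts
// Illustrative versions of two analyzer heuristics described above.
// Thresholds and signatures are assumptions, not the code in src/lib/.

// Relative-difference check for a pair of aligned numeric values.
function numericDiscrepancy(a: number, b: number, threshold = 0.05): boolean {
  const denom = Math.max(Math.abs(a), Math.abs(b), 1e-9);
  return Math.abs(a - b) / denom > threshold;
}

// Symmetric difference between the entity sets of the two articles:
// entities present in one source but missing from the other.
function entityDiscrepancies(wiki: Set<string>, grok: Set<string>): string[] {
  const onlyWiki = [...wiki].filter((e) => !grok.has(e));
  const onlyGrok = [...grok].filter((e) => !wiki.has(e));
  return [...onlyWiki, ...onlyGrok];
}

console.log(numericDiscrepancy(384_400, 420_000)); // true (~8.5% apart)
console.log(entityDiscrepancies(new Set(["NASA", "Apollo 11"]), new Set(["NASA"])));
// -> ["Apollo 11"]
```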
- `src/commands/`: CLI entry points (`init`, `fetch`, `analyse`, `show`, `notes`, `topics`, `publish`, `query`).
- `src/lib/`: reusable modules including the parser, analyzer, discrepancy checks, bias metrics, and DKG helpers.
- `~/.gwaln/data/`: cached structured snapshots per topic.
- `~/.gwaln/analysis/`: analyzer outputs (JSON plus the HTML report).
- `~/.gwaln/notes/`: JSON-LD Community Notes and index metadata.
- Clone the repository and move into `gwaln/cli`.
- Install dependencies with `npm install`.
- Build TypeScript sources:

    npm run build

- Link the CLI locally (optional):

    npm link

- Run `gwaln init` to create `~/.gwaln/.gwalnrc.json` and populate node, blockchain, and publish defaults.
- All user data (topics, snapshots, analysis, notes) will be stored in `~/.gwaln/`.
- Build:

    npm run build

- Run tests:

    npm test

- `Analysis not found`: check `~/.gwaln/data/wiki` and `~/.gwaln/data/grok` for missing snapshots; rerun `gwaln fetch`.
- `Publish timeout`: increase `publishMaxRetries` in `~/.gwaln/.gwalnrc.json` or verify that the DKG node endpoint is reachable; use `--dry-run` to confirm the payload is valid before retrying.
See CONTRIBUTING.md for contribution guidelines.
Developed by Doğu Abaris, Damjan Dimitrov, and contributors. The project builds on the OriginTrail ecosystem and the open-source libraries noted in package.json.
GWALN is released under the MIT License.