Design: Evidence Linking Protocol

## Goal

Specify how claims in interrogatory model cards must link to verifiable artifacts, making documentation auditable rather than self-reported.

## Core Principle

From v0.1: "Performance claims → dataset version + eval script commit + run hash"

Every claim should trace to evidence. The protocol defines:
1. **What** kinds of evidence are acceptable
2. **How** links are formatted/validated
3. **When** evidence must be public and when an attestation is sufficient

## Evidence Types

### Code Artifacts
- Git commits (full SHA)
- Git tags/releases
- Container images (digest)
- Package versions (pinned)
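
As a rough illustration of what "full SHA", "digest", and "pinned" could mean to a validator, here is a minimal sketch of format checks. The helper names are hypothetical, and the pip-style `name==version` reading of "pinned" is an assumption rather than something this issue specifies.

```python
import re

# Git object names: 40 hex chars in SHA-1 repos, 64 in SHA-256 repos.
GIT_FULL_SHA = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")
# OCI-style image digest: "sha256:" followed by 64 hex chars.
IMAGE_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")
# Assumption: "pinned" means an exact pip-style pin such as "numpy==1.26.4".
PINNED_PACKAGE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9][A-Za-z0-9.!+_-]*")


def is_full_sha(value: str) -> bool:
    """True if value looks like a full (not abbreviated) git commit hash."""
    return GIT_FULL_SHA.fullmatch(value) is not None


def is_image_digest(value: str) -> bool:
    """True if value is a sha256 image digest rather than a mutable tag."""
    return IMAGE_DIGEST.fullmatch(value) is not None


def is_pinned(value: str) -> bool:
    """True if value is an exact version pin (no ranges like >= or ~=)."""
    return PINNED_PACKAGE.fullmatch(value) is not None
```

These checks only look at the shape of an identifier; they do not confirm that the commit, image, or package actually exists.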

### Data Artifacts
- Dataset versions (HuggingFace revision, Kaggle version)
- Data Card links
- Croissant metadata files
- DVC references

### Execution Artifacts
- Run hashes (W&B run ID, MLflow run ID)
- Logs with timestamps
- Reproducibility seeds

### Third-Party Attestations
- Audit reports (linked, dated)
- Certification references (ISO 42001 cert number)
- Red-team evaluation reports

## Link Format Specification

```json
{
  "claim": "Achieves 85% accuracy on MMLU",
  "evidence": [
    {
      "type": "benchmark_result",
      "dataset": {
        "name": "MMLU",
        "version": "1.0.0",
        "source": "https://huggingface.co/datasets/cais/mmlu"
      },
      "eval_script": {
        "repo": "https://github.com/org/eval-harness",
        "commit": "abc123def456..."
      },
      "run": {
        "id": "wandb://org/project/runs/xyz789",
        "seed": 42,
        "timestamp": "2026-01-15T10:30:00Z"
      }
    }
  ]
}
```
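
A structural check for one entry of the `evidence` array above might look like the sketch below. Which fields are required is an assumption taken from the example (the schema in `schema/evidence-link.schema.json` is still to be written), and `check_entry` is a hypothetical name.

```python
from datetime import datetime

# Assumption: every field shown in the example above is required.
REQUIRED = {
    "dataset": ("name", "version", "source"),
    "eval_script": ("repo", "commit"),
    "run": ("id", "seed", "timestamp"),
}


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is well-formed."""
    problems = []
    if not isinstance(entry.get("type"), str):
        problems.append("type: missing or not a string")
    for section, fields in REQUIRED.items():
        block = entry.get(section)
        if not isinstance(block, dict):
            problems.append(f"{section}: missing or not an object")
            continue
        for field in fields:
            if field not in block:
                problems.append(f"{section}.{field}: missing")
    run = entry.get("run")
    if isinstance(run, dict) and "timestamp" in run:
        try:
            # fromisoformat on older Pythons rejects a trailing "Z", so normalise it.
            datetime.fromisoformat(str(run["timestamp"]).replace("Z", "+00:00"))
        except ValueError:
            problems.append("run.timestamp: not an ISO 8601 timestamp")
    return problems
```

This only checks shape. Whether the links resolve is a separate step, governed by the validation mode.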

## Validation Rules

### Strict Mode (for high-risk)
- All links must resolve
- Git commits must be in public repos, or in private repos covered by an attestation
- Run artifacts must be retrievable or attested

### Standard Mode
- Links should resolve; broken links produce warnings
- Private repos allowed with attestation statement
- Run artifacts recommended but not required

### Minimal Mode (for early-stage/research)
- Links encouraged
- Missing evidence flagged but not blocking
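
One possible reading of the three modes, as a sketch: each piece of evidence gets an outcome of `ok`, `warning`, or `error`, and only `error` blocks. The function name, the outcome labels, and the simplification that all evidence kinds are treated alike are assumptions, not part of the proposal.

```python
MODES = ("strict", "standard", "minimal")


def outcome(mode: str, resolves: bool, attested: bool = False) -> str:
    """Classify one piece of evidence under the given validation mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    # Evidence that resolves, or that is covered by an attestation, passes everywhere.
    if resolves or attested:
        return "ok"
    # Strict mode blocks on unresolved evidence; the other modes only flag it.
    return "error" if mode == "strict" else "warning"


def is_blocking(result: str) -> bool:
    """Only errors stop publication in this sketch."""
    return result == "error"
```

For example, `outcome("strict", resolves=False)` gives `"error"`, while the same evidence under `"standard"` or `"minimal"` gives `"warning"`.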

## Open Questions

1. How to handle proprietary/confidential evidence?
   - Attestation by third party?
   - Cryptographic commitments?
2. Link rot: what happens when URLs die?
   - Archive.org / perma.cc recommendations?
   - Hash-based content addressing?
3. Incremental disclosure: can evidence be added post-publication?
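
For the hash-based content addressing option under question 2, one possible direction is to record a digest of the artifact next to its URL, so a mirrored or archived copy can be checked later. A minimal sketch, assuming SHA-256 and a hypothetical `sha256:<hex>` address format:

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a location-independent identifier for the artifact bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches(data: bytes, address: str) -> bool:
    """Check that retrieved bytes are the artifact the address refers to."""
    return content_address(data) == address
```

If the original URL dies, any copy whose bytes satisfy `matches` can stand in for it. This does not address how such a copy is found.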

## Deliverables

1. `schema/evidence-link.schema.json` - JSON Schema for evidence objects
2. `docs/evidence-protocol.md` - human-readable specification
3. `tools/link-validator.py` - basic link checking tool

## Related Issues

- Schema: Croissant/schema.org Extension Design
- Tooling: Validation & Generation
