Research: Literature & Standards References (2024-2026) #12

@PipFoweraker

Purpose

Centralized reference list from the landscape survey (Jan 2026). These references inform the design; they should be incorporated into references.md and cited in the LW/AF publication.


Model Cards & AI Documentation

Academic Papers

  • Mitchell et al. 2019 - Model Cards for Model Reporting (arXiv) - foundational paper
  • AI Cards (APF 2024) - Machine-readable AI/risk documentation for EU AI Act (Springer, arXiv)
  • Policy Cards (2025) - Runtime governance for AI agents (arXiv)
  • Automated Model Card Generation (NAACL 2024) - LLM-based card generation (arXiv, GitHub)

Industry Implementations

  • Anthropic System Cards - Claude model documentation (Hub, Claude 4)
  • NVIDIA Model Card++ - Extended transparency fields (Blog)
  • Red Hat AI System Cards - Security/governance extension (Blog)
  • HuggingFace Model Cards - Most widely adopted (Docs, Guidebook)

Standards & Schemas

Machine-Readable Formats

  • Croissant (MLCommons) - JSON-LD dataset metadata (Site, Spec, Paper)
  • CycloneDX ML-BOM - Supply chain BOM with ML support (Site, GitHub)

Management Systems & Certification

  • ISO/IEC 42001:2023 - AI Management Systems (ISO, BSI)
  • NIST AI RMF - Risk Management Framework (NIST)

Regulatory

EU AI Act

  • Article 11 - Technical Documentation (Text)
  • Annex IV - Documentation Requirements (Text)
  • Implementation Timeline - Feb 2025 (prohibited practices banned), Aug 2025 (GPAI obligations apply), Aug 2026 (general application), Aug 2027 (extended transition period ends)
  • National Implementation Plans (Overview)

Evaluation Science

Core Resources

  • Apollo Research - "We Need a Science of Evals" (Blog)
  • Apollo Research - "The Evals Gap" (Blog)
  • Apollo Research - Opinionated Evals Reading List (Blog)
  • METR - Frontier model evaluations (Site)
  • UK AISI Inspect Evals (Site)
  • HuggingFace OpenEvals Guidebook (Space)

Papers

  • Observational Scaling Laws (Ruan et al. 2024) - predictive eval methodology
  • Evaluating AI Evaluation (arXiv)

Tooling & Infrastructure

  • huggingface_hub - Model card creation/validation (Docs)
  • Croissant Editor - Visual JSON-LD editor for datasets
  • Croissant + MCP (Blog)

To Investigate Further

  • IEEE P2894 - AI Safety Standards (status?)
  • OECD AI Principles implementation guidance
  • Singapore Model AI Governance Framework updates
  • Partnership on AI documentation recommendations
  • AI incident databases (AIID, OECD) - relationship to model cards

Action Items

  • Update references.md with categorized links
  • Add BibTeX entries for academic citations
  • Identify gaps requiring additional research
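The first two action items could be driven from a single structured list, so that references.md and the BibTeX file stay in sync. The following is a minimal sketch under stated assumptions, not an agreed design: the `Reference` record, the citation keys, the `@misc` entry type, and the Markdown heading layout are all hypothetical, and the two sample records use only the author, year, and title as they appear in the list above.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    """One entry in the reference list (hypothetical record layout)."""
    key: str        # hypothetical citation key, not taken from the issue
    title: str      # title as listed in the issue, possibly shortened
    authors: str    # "X and others" is BibTeX's spelling of "X et al."
    year: int
    category: str   # section heading to file the entry under
    note: str = ""


def to_bibtex(ref: Reference) -> str:
    """Render one reference as a generic @misc BibTeX entry."""
    fields = {"title": ref.title, "author": ref.authors, "year": str(ref.year)}
    if ref.note:
        fields["note"] = ref.note
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{ref.key},\n{body}\n}}"


def to_markdown(refs: list[Reference]) -> str:
    """Group references by category and render them as Markdown sections."""
    grouped: dict[str, list[Reference]] = {}
    for ref in refs:
        grouped.setdefault(ref.category, []).append(ref)
    lines: list[str] = []
    for category, items in grouped.items():
        lines.append(f"## {category}")
        for ref in items:
            lines.append(f"- {ref.authors} ({ref.year}). {ref.title}.")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Sample records built only from details given in the list above.
refs = [
    Reference("mitchell2019modelcards", "Model Cards for Model Reporting",
              "Mitchell and others", 2019, "Model Cards & AI Documentation",
              note="arXiv preprint"),
    Reference("ruan2024observational", "Observational Scaling Laws",
              "Ruan and others", 2024, "Evaluation Science"),
]

if __name__ == "__main__":
    print(to_markdown(refs))
    print("\n\n".join(to_bibtex(r) for r in refs))
```

Before publication, the `@misc` entries would still need full author lists, venues, and identifiers filled in from the linked sources.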