@mfkaptan-motius (Collaborator) commented Dec 18, 2025

Replace hash-based IDs with semantic versioning for variant ID generation

Summary

This PR replaces hash-based variant ID generation with semantic versioning, which makes variant IDs human-readable and easier to trace. Variant increments are now driven by the changes that GraphQL Inspector detects.

What Changed

New Variant ID System

  • Variant IDs now use semantic versioning format: Concept/vM.m (e.g., Vehicle.speed/v1.0)
  • Each concept tracks a variant counter that increments with modifications
  • Variant increments are tied to graphql-inspector detected changes for accurate change tracking
  • Support for marking concepts as removed in specific versions
  • Version tags are now required for all registry operations
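
To make the Concept/vM.m format concrete, here is a minimal sketch of parsing and formatting such an ID. The class name `VariantId` and the regular expression are assumptions for illustration only; they are not the models defined in variant_ids.py, and the real validation rules may be stricter or looser.

```python
import re
from dataclasses import dataclass

# Assumed shape: "<Concept>/v<major>.<minor>", e.g. "Vehicle.speed/v1.0".
_VARIANT_ID = re.compile(r"^(?P<concept>[^/\s]+)/v(?P<major>\d+)\.(?P<minor>\d+)$")


@dataclass(frozen=True)
class VariantId:
    """Hypothetical value object for a semantic-version variant ID."""

    concept: str
    major: int
    minor: int

    @classmethod
    def parse(cls, text: str) -> "VariantId":
        match = _VARIANT_ID.match(text)
        if match is None:
            raise ValueError(f"not a variant ID: {text!r}")
        return cls(match["concept"], int(match["major"]), int(match["minor"]))

    def __str__(self) -> str:
        return f"{self.concept}/v{self.major}.{self.minor}"
```

Parsing `Vehicle.speed/v1.0` with this sketch yields the concept `Vehicle.speed` with major 1 and minor 0, and formatting the object reproduces the original string.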

GraphQL Inspector Integration

New: graphql_inspector_diff.js - Node.js script using @graphql-inspector/core API

  • Generates structured JSON diff output comparing schema versions
  • Detects field additions, removals, and modifications
  • Identifies breaking vs non-breaking changes
  • Provides concept-level change tracking for variant increments

New: diff_parser.py - Python module to process GraphQL Inspector output

  • Parses structured diff JSON
  • Maps changes to concept names for variant tracking
  • Supports automated variant increment logic
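
As an illustration of mapping changes to concept names, the sketch below groups change records by concept. It assumes the diff JSON is a flat list of objects carrying the `path` and `criticality.level` fields that GraphQL Inspector attaches to a change; the actual layout written by graphql_inspector_diff.js and the functions in diff_parser.py may differ. The helper names and the rule "a concept is the first two path segments" are hypothetical.

```python
import json
from collections import defaultdict


def concept_for_path(path: str) -> str:
    """Reduce a change path to 'Type' or 'Type.field' (assumed concept granularity)."""
    return ".".join(path.split(".")[:2])


def group_changes_by_concept(diff_json: str) -> dict:
    """Group change records by the concept they touch; records without a path are skipped."""
    grouped = defaultdict(list)
    for change in json.loads(diff_json):
        path = change.get("path")
        if path:
            grouped[concept_for_path(path)].append(change)
    return dict(grouped)


# Made-up sample input, not output captured from the real script.
sample = json.dumps([
    {"type": "FIELD_TYPE_CHANGED", "path": "Vehicle.speed",
     "criticality": {"level": "BREAKING"}},
    {"type": "FIELD_ADDED", "path": "Vehicle.mass",
     "criticality": {"level": "NON_BREAKING"}},
    {"type": "FIELD_ARGUMENT_ADDED", "path": "Vehicle.speed.unit",
     "criticality": {"level": "DANGEROUS"}},
])
grouped = group_changes_by_concept(sample)
```

With the sample above, the argument-level change on `Vehicle.speed.unit` is attributed to the `Vehicle.speed` concept, so that concept ends up with two changes and `Vehicle.mass` with one.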

Architecture Changes

Removed: src/s2dm/idgen/ module (old hash-based system)

Added: src/s2dm/registry/ module with:

  • variant_ids.py: Models for semantic version-based variant IDs
  • concept_uris.py: Concept URI registry management
  • spec_history.py: Spec history tracking
  • search.py: Moved from tools (previously skos_search.py)

Added: src/s2dm/tools/graphql_inspector_diff.js - Structured diff generation

Added: src/s2dm/tools/diff_parser.py - Diff output processing

Updated: src/s2dm/tools/graphql_inspector.py - Integration with new diff system

CLI Changes

  • All registry commands now require --version-tag parameter
  • registry update requires --previous-ids parameter
  • Optional --diff-file parameter for providing structured diffs from GraphQL Inspector
  • Removed --strict-mode option from registry id command

How It Works

  1. Schema Comparison: s2dm diff graphql uses GraphQL Inspector to compare old and new schemas
  2. Change Detection: Identifies specific changes (field additions, removals, type changes)
  3. Variant Increment: Each detected change triggers a variant increment for affected concepts
  4. Semantic Versioning: Concepts receive incremental versions (v1.0, v1.1, v2.0, etc.)
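
The increment step can be sketched as follows. This assumes that a breaking change bumps the major number and resets the minor, that any other change bumps the minor, and that a concept receives at most one bump per release; the PR does not spell out these rules, so treat `next_version` as a hypothetical helper rather than the implemented logic.

```python
from typing import Iterable, Tuple


def next_version(major: int, minor: int, levels: Iterable[str]) -> Tuple[int, int]:
    """Return the next (major, minor) for one concept.

    `levels` holds the criticality levels of the changes that touched the
    concept, e.g. "BREAKING", "DANGEROUS", "NON_BREAKING".
    Assumption: "DANGEROUS" is treated like a non-breaking change.
    """
    seen = set(levels)
    if not seen:
        return (major, minor)      # untouched concept keeps its version
    if "BREAKING" in seen:
        return (major + 1, 0)      # assumed: breaking change -> major bump
    return (major, minor + 1)      # assumed: anything else -> minor bump
```

Under these assumptions a concept at v1.0 moves to v1.1 after a non-breaking change and to v2.0 after a later breaking change, which matches the sequence given in step 4.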

Example workflow:

# Generate diff using s2dm diff command
s2dm diff graphql --schema old.graphql --val-schema new.graphql --output changes.json

# Update registry with detected changes
s2dm registry update \
  --current-schema new.graphql \
  --previous-ids variant_ids_v1.0.0.json \
  --previous-spec-history spec_history_v1.0.0.json \
  --diff-file changes.json \
  --version-tag v1.1.0 \
  --output-dir ./registry

Breaking Changes

  1. Removed idgen module: The hash-based ID generation system has been completely removed
  2. CLI parameter changes:
    • --version-tag is now required for registry id, registry init, and registry update
    • --previous-ids is now required for registry update
    • --strict-mode option removed from registry id
  3. ID format change: Variant IDs now use semantic versioning (vM.m) instead of content hashes
  4. New dependency: Requires @graphql-inspector/core npm package for diff generation

Documentation

  • Updated src/s2dm/exporters/README.md with variant ID documentation
  • Added src/s2dm/registry/README.md with registry module overview
  • Updated examples/spec-history-registry/README.md with new CLI usage
  • All new modules include docstrings

Testing

Added tests covering:

  • Variant ID generation and format validation
  • Major/minor version increment logic
  • Breaking vs non-breaking change handling
  • Enum changes and propagation
  • Variant counter tracking
  • Edge cases (empty diffs, multiple changes, removed fields)

Updated E2E CLI tests for new registry commands.

Test Infrastructure Improvements

GraphQL Inspector Test Marker: Implemented explicit test management for GraphQL Inspector dependencies:

  • Added @pytest.mark.graphql_inspector marker for tests requiring npm dependencies
  • Created inspector_path pytest fixture that auto-locates node_modules directory
  • Tests skip automatically with a helpful message when @graphql-inspector/cli is not installed
  • Developers can selectively skip these tests: pytest -m "not graphql_inspector"
  • Documented skip behavior in CONTRIBUTING.md

Related to #153

@mfkaptan-motius changed the title from "feat!: replace hash-based IDs with semantic versioning for variant ID generation" to "[DRAFT] feat!: replace hash-based IDs with semantic versioning for variant ID generation" on Dec 18, 2025
@mfkaptan-motius force-pushed the feat/variant-ids branch 5 times, most recently from cec8985 to 44cfb1f, on December 23, 2025 12:57
@mfkaptan-motius changed the title from "[DRAFT] feat!: replace hash-based IDs with semantic versioning for variant ID generation" to "feat!: replace hash-based IDs with semantic versioning for variant ID generation" on Dec 23, 2025
… generation

Add variant-based ID generation and registry management to the
s2dm-publish GitHub Action. The workflow now automatically initializes
or updates the registry based on schema changes detected via GraphQL
Inspector diff.

- Add new `registry` module with variant ID tracking using semantic versions
- Add `variant_ids.py` with VariantEntry and VariantIDFile models
- Add `graphql_inspector_diff.js` for automated schema change detection
- Add `diff_parser.py` to process GraphQL Inspector output
- Variant increments now tied to graphql-inspector detected changes
- Update ID exporter to generate semantic version-based variant IDs
- Update spec history exporter to work with new variant ID system
- Move `skos_search.py` to registry module
- Add variant counter tracking for concept modifications
- Add support for marking concepts as removed in specific versions
- Add 15 comprehensive tests for variant ID generation logic
- Update E2E CLI tests for new registry commands

BREAKING CHANGES:
- Removed `idgen` module and hash-based ID generation
- `registry id` command now requires `--version-tag` parameter
- `--strict-mode` option removed from the `registry id` command
- `registry init` command now requires `--version-tag` parameter
- `registry update` command now requires `--version-tag` and `--previous-ids`
- Variant IDs use semantic versioning format (Concept/vM.m) instead of hashes
- Integrate variant registry into automated workflow
- Remove dry_run parameter from IDExporter
- Registry operations now require version tags and
  previous registry files from release artifacts.

Signed-off-by: Mustafa Kaptan <mustafa.kaptan@motius.de>
…tern

- Add package.json for local Node.js dependency management
- Add @requires_graphql_inspector decorator for dependency injection
- Add locate_graphql_inspector() to find node_modules automatically
- Resolve CLI path once during initialization for performance
- Improve dependency checking with runtime verification
- Update CI/Actions to use npm install instead of global install
- Update documentation in CONTRIBUTING.md and tools/README.md

Replaces global npm install with local package.json management.
The decorator pattern auto-injects inspector_path to CLI commands,
eliminating repeated lookups and global state.

Signed-off-by: Mustafa Kaptan <mustafa.kaptan@motius.de>
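
The decorator pattern described in the commit above can be sketched roughly like this. The names `find_node_modules` and `requires_inspector` are hypothetical stand-ins, not the `locate_graphql_inspector()` function or `@requires_graphql_inspector` decorator from the PR, whose signatures are not shown here. Unlike the commit, which resolves the path once during initialization, this sketch looks it up on every call to stay short.

```python
import functools
from pathlib import Path
from typing import Callable, Optional


def find_node_modules(start: Path) -> Optional[Path]:
    """Walk upward from `start`; return the first node_modules that holds @graphql-inspector."""
    for directory in [start, *start.parents]:
        candidate = directory / "node_modules"
        if (candidate / "@graphql-inspector").is_dir():
            return candidate
    return None


def requires_inspector(start: Path) -> Callable:
    """Decorator factory that injects `inspector_path` into the wrapped function."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            path = find_node_modules(start)
            if path is None:
                # Fail early with an actionable message instead of a late subprocess error.
                raise RuntimeError("GraphQL Inspector not found; run `npm install` first")
            return func(*args, inspector_path=path, **kwargs)

        return wrapper

    return decorator
```

A command function decorated this way declares an `inspector_path` keyword parameter and never performs the lookup itself, which is the "no repeated lookups, no global state" property the commit message describes.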