Dependency versioning system and auto-updates #2
Merged
- specs/dependencies.json: version manifest tracking 8 external deps (container images, upstream sources, pre-commit hooks)
- scripts/check-updates.sh: checker script with --json, --apply, --category modes. Queries nvcr.io, Docker Hub, and GitHub APIs.
- tests/check_updates.bats: 16 BATS tests (manifest schema, cross-checks, script behavior)
- Makefile: add check-updates target
- .github/dependabot.yml: auto-update GitHub Actions versions
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
The container image version in conf/defaults.sh uses bash parameter expansion syntax (e.g., :24.03}"), so the sed replacement must match the closing brace, not a trailing quote. Also use a regex for intel/hpckit to handle any current version string.
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
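The brace-anchored replacement described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in check-updates.sh: the helper name pin_image_tag, the variable names, and the nvcr.io image path are all assumptions, and the demo file only stands in for conf/defaults.sh.

```shell
#!/usr/bin/env bash

# pin_image_tag FILE IMAGE NEW_TAG   (hypothetical helper)
# Replace the tag of IMAGE where it appears as the default value inside a
# ${VAR:-image:tag} expansion. The match is anchored on the closing brace,
# and [^}]+ accepts whatever version string is currently pinned.
pin_image_tag() {
  local file="$1" image="$2" new_tag="$3"
  local image_re
  # Escape dots so they match literally in the regex.
  image_re="$(printf '%s' "$image" | sed 's/[.]/\\./g')"
  sed -i.bak -E "s#(${image_re}):[^}]+\\}#\\1:${new_tag}}#" "$file"
  rm -f "$file.bak"
}

# Demo on a throwaway file standing in for conf/defaults.sh (contents assumed).
demo="$(mktemp)"
cat > "$demo" <<'EOF'
EXAMPLE_IMAGE="${EXAMPLE_IMAGE:-nvcr.io/nvidia/example:24.03}"
HPCKIT_IMAGE="${HPCKIT_IMAGE:-intel/hpckit:2024.0.1-devel}"
EOF
pin_image_tag "$demo" "nvcr.io/nvidia/example" "24.04"
pin_image_tag "$demo" "intel/hpckit" "2025.0.0-devel"
cat "$demo"
```

Anchoring on `}` keeps the trailing `"` untouched, so the quoting of the assignment survives the rewrite. The sketch assumes tags contain no `#` or `&` characters, which would need escaping in the sed replacement.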
Updated via check-updates.sh --apply:
- pre-commit-hooks: v4.5.0 → v6.0.0
- pre-commit-shfmt: v3.8.0-1 → v3.12.0-2
- shellcheck-py: v0.9.0.6 → v0.11.0.1
All lint checks pass with the updated hooks.
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
Add 6 NVIDIA dependencies to the update checker:
- nvidia-driver (apt repo, fallback pin in bootstrap.sh)
- cuda-toolkit (apt repo, fallback pin in bootstrap.sh)
- datacenter-gpu-manager (apt repo, report-only)
- libnccl2 (apt repo, report-only)
- nvidia-fabricmanager (apt repo, report-only)
- nvidia-container-toolkit (GitHub releases, report-only)
New check_nvidia_apt_repo() downloads and caches the NVIDIA apt repo Packages index once per run. Supports highest_match, cuda_major_minor, package_version, and nccl_version extractors.
5 new BATS tests (119 total). Cross-checks verify driver/CUDA fallbacks in bootstrap.sh match manifest.
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
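As a rough picture of how a cached Packages index might be queried, the sketch below implements two of the extractor names mentioned above over a local file. The function bodies are guesses at the behavior, not the PR's implementation; fetch_packages_index is a hypothetical helper, and the sample package versions are made up for illustration.

```shell
#!/usr/bin/env bash

# fetch_packages_index URL CACHE_FILE   (hypothetical helper)
# Download the index only if no non-empty cached copy exists yet.
fetch_packages_index() {
  local url="$1" cache="$2"
  [ -s "$cache" ] || curl -fsSL "$url" -o "$cache"
}

# highest_match INDEX_FILE PREFIX
# Print the largest numeric suffix among packages named PREFIX<number>,
# e.g. nvidia-driver-580 yields 580.
highest_match() {
  local index_file="$1" prefix="$2"
  awk -v p="$prefix" '
    $1 == "Package:" && index($2, p) == 1 {
      suffix = substr($2, length(p) + 1)
      if (suffix ~ /^[0-9]+$/) print suffix
    }' "$index_file" | sort -n | tail -n 1
}

# package_version INDEX_FILE PACKAGE
# Print the highest Version: recorded for an exactly named package.
package_version() {
  local index_file="$1" pkg="$2"
  awk -v p="$pkg" '
    $1 == "Package:" { current = $2 }
    $1 == "Version:" && current == p { print $2 }
  ' "$index_file" | sort -V | tail -n 1
}

# Demo on a tiny made-up index (no network access needed).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Package: nvidia-driver-570
Version: 570.86.15-0ubuntu1

Package: nvidia-driver-580
Version: 580.65.06-0ubuntu1

Package: libnccl2
Version: 2.27.3-1+cuda12.9

Package: libnccl2
Version: 2.27.7-1+cuda13.0
EOF
highest_match "$sample" "nvidia-driver-"
package_version "$sample" "libnccl2"
```

Caching the index in a file means every extractor reads the same snapshot, so a single run never mixes results from two different repo states.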
Weekly GitHub Actions workflow (.github/workflows/dependency-update.yml):
- Runs check-updates.sh on schedule (Mondays 09:00 UTC) or manual trigger
- Applies available updates and validates with lint, tests, static checks, and smoke run before opening a PR
- Deduplicates PRs: reuses existing open dependency PR branch
- Optional GPU validation on self-hosted runner (HPC_ENABLE_GPU_CI=1)
- Posts GPU test results as PR comment
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
…and update history (Phase 4)
Compatibility constraints:
- cuda-toolkit manifest entry now includes constraints requiring
nvidia-driver >= 580 for CUDA 13.x, >= 525 for 12.x, >= 450 for 11.x
- check_constraints() evaluates constraints and reports OK/INCOMPATIBLE
- Constraint warnings shown inline in human-readable and JSON output
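A minimal sketch of the driver/CUDA compatibility rule listed above. The thresholds come from this commit message, but they are hard-coded here for brevity, whereas the PR stores them in the manifest; the helper min_driver_for_cuda, the argument order, and the output wording are assumptions.

```shell
#!/usr/bin/env bash

# min_driver_for_cuda CUDA_VERSION   (hypothetical helper)
# Map a CUDA version to the minimum nvidia-driver major version.
min_driver_for_cuda() {
  case "${1%%.*}" in
    13) echo 580 ;;
    12) echo 525 ;;
    11) echo 450 ;;
    *)  return 1 ;;
  esac
}

# check_constraints CUDA_VERSION DRIVER_VERSION
# Report OK or INCOMPATIBLE; this sketch only prints, it never fails.
check_constraints() {
  local cuda="$1" driver="$2" min
  if ! min="$(min_driver_for_cuda "$cuda")"; then
    echo "UNKNOWN: no constraint recorded for CUDA $cuda"
    return 0
  fi
  if [ "${driver%%.*}" -ge "$min" ]; then
    echo "OK: nvidia-driver $driver >= $min for CUDA $cuda"
  else
    echo "INCOMPATIBLE: CUDA $cuda requires nvidia-driver >= $min, found $driver"
  fi
}

check_constraints 13.0 580.65.06
check_constraints 13.0 570.86.15
check_constraints 12.4 550.54.14
```

Only the major component of the driver version is compared, which matches the granularity of the thresholds quoted in the constraints.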
Dry-run mode:
- New --dry-run flag (requires --apply) previews what changes would be
made without modifying any files
Post-apply validation:
- After --apply, runs bash -n on modified .sh files, jq on .json files,
and YAML validation on .yaml files
- Reverts individual files via git checkout on validation failure
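The validate-then-revert step might look roughly like this. The function names are hypothetical, and the YAML check through python3 with PyYAML is an assumption, since the commit message does not say which YAML validator is used.

```shell
#!/usr/bin/env bash

# validate_file FILE   (hypothetical helper)
# Return 0 when the file passes a syntax check appropriate to its extension.
validate_file() {
  local f="$1"
  case "$f" in
    *.sh)   bash -n "$f" ;;
    *.json) jq empty "$f" ;;
    # Assumed validator: PyYAML via python3.
    *.yaml) python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1]))' "$f" ;;
    *)      return 0 ;;
  esac
}

# validate_or_revert FILE...   (hypothetical helper)
# Revert each file that fails validation, leaving the valid ones modified.
validate_or_revert() {
  local f failed=0
  for f in "$@"; do
    if ! validate_file "$f" >/dev/null 2>&1; then
      echo "validation failed, reverting: $f" >&2
      git checkout -- "$f"
      failed=1
    fi
  done
  return "$failed"
}
```

Called as `validate_or_revert conf/defaults.sh specs/dependencies.json` from inside the repository, this restores only the files that fail their check, which mirrors the per-file revert described above.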
Update history:
- specs/update-history.json tracks each --apply run with timestamp and
list of {name, from, to} changes
- Appended automatically after successful --apply
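One way such a history entry could be appended is with jq, as sketched below. The file layout (a top-level array of objects with "timestamp" and "changes" keys) and the function name record_update are assumptions; the commit message only states that each run records a timestamp and a list of {name, from, to} changes.

```shell
#!/usr/bin/env bash

# record_update HISTORY_FILE NAME FROM TO   (hypothetical helper)
# Append one run containing a single change to the history file.
record_update() {
  local history_file="$1" name="$2" old="$3" new="$4" tmp
  # Start from an empty array if the file is missing or empty.
  [ -s "$history_file" ] || echo '[]' > "$history_file"
  tmp="$(mktemp)"
  jq --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
     --arg name "$name" --arg old "$old" --arg new "$new" \
     '. + [{"timestamp": $ts, "changes": [{"name": $name, "from": $old, "to": $new}]}]' \
     "$history_file" > "$tmp" && mv "$tmp" "$history_file"
}
```

For example, `record_update specs/update-history.json pre-commit-hooks v4.5.0 v6.0.0` would add one entry. Writing to a temporary file and moving it into place avoids leaving a truncated history file behind if jq fails partway.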
Version utilities:
- version_gte() for numeric version comparison via sort -V
- version_major() for extracting major version numbers
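The two utilities named above could be written along these lines. The function names come from the commit message, but the bodies are a guess at the approach (comparison through sort -V, as stated), and stripping a leading "v" in version_major is an added assumption.

```shell
#!/usr/bin/env bash

# version_gte A B
# Succeed when A >= B. With sort -V the smaller version sorts first, so
# A >= B exactly when B is the first line of the sorted pair.
version_gte() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# version_major VERSION
# Print the leading component, ignoring an optional "v" prefix (assumed).
version_major() {
  local v="${1#v}"
  echo "${v%%.*}"
}

if version_gte 580.65.06 525; then echo "driver is new enough"; fi
if ! version_gte 12.4 12.10; then echo "12.4 is older than 12.10"; fi
version_major v6.0.0
```

Version sort matters for cases like 12.4 versus 12.10, where a plain string or decimal comparison would order the two the wrong way round.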
7 new BATS tests (126 total). All quality gates pass.
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>
…stem
- README.md: updated repository layout tree with new files, added Dependency tracking section and CI references for the weekly workflow and Dependabot
- SKILL.md: updated codebase structure tree, running tests section, and key files table with dependency checker entries
- CHANGELOG.md: added entry for the dependency update system
Co-authored-by: Hitesh Kumar <FistOfHit@users.noreply.github.com>