feat: add GitHub Actions workflow for verifying reproducible builds by Evrard-Nil · Pull Request #415 · nearai/cloud-api

Evrard-Nil · 2026-02-04T14:42:16Z

Note

Low Risk
CI-only change that doesn’t affect runtime code paths, but it can introduce new build flakiness or longer CI times due to multi-runner builds and Docker pruning.

Overview
Adds a new GitHub Actions workflow (verify-reproducible-build.yml) that validates container build reproducibility by running ./build-image.sh on multiple Ubuntu runners and comparing the resulting OCI image digest.

The workflow also performs two sequential builds on the same runner with aggressive Docker/buildx cleanup to catch nondeterminism, uploads digests (and the OCI archives only on mismatch), and posts a step summary table plus a pass/fail result based on digest equality.

Written by Cursor Bugbot for commit 4fa37d1. This will update automatically on new commits.

gemini-code-assist · 2026-02-04T14:42:22Z

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

claude · 2026-02-04T14:44:13Z

Code Review

This PR adds a reproducibility verification workflow for Docker builds. The approach is solid, but there are a few critical issues that could cause false positives/negatives or confusing failures:

Critical Issues

1. Builder cleanup hardcodes name (lines 98-104)

The cleanup step assumes the builder is named buildkit_20, a name that is hardcoded in build-image.sh:27
If build-image.sh changes the builder name, sequential builds won't be properly isolated
Risk: False positives if cached layers persist between builds
Suggested fix: Extract builder name dynamically from build-image.sh to stay in sync

2. Summary shows empty digests on early failure (lines 240-266)

If verification fails before setting GITHUB_OUTPUT variables (e.g., at line 199), the summary will display an empty table
The summary step runs with if: always() but doesn't handle missing outputs gracefully
Suggested fix: Set default empty values at the start of the verify step (before line 166)

3. Potential disk space exhaustion

Uploads 5 OCI tar files simultaneously (each could be 500MB-2GB+)
GitHub Actions runners have limited disk space (~14GB available)
Risk: Intermittent failures unrelated to reproducibility
Recommendation: Consider uploading only digests by default, and OCI tars only on mismatch/failure for debugging

Minor Suggestions

Line 31: Add || true to apt-get update to handle transient repository failures
Lines 14-15: A weekly cron may be unnecessary for reproducibility verification, since the build inputs don't change often. Consider running monthly or only on release

Overall the workflow logic is sound, but these issues could make it unreliable or confusing when failures occur. Recommend addressing issues #1 and #2 before merge.

⚠️ Issues found

Copilot

Pull request overview

This PR adds a GitHub Actions workflow to verify that Docker image builds are reproducible across different environments and sequential builds. The workflow tests reproducibility by building on multiple Ubuntu versions in parallel and performing sequential builds on the same runner with a clean environment between builds, then verifying that all builds produce identical image digests.

Changes:

Added a new workflow file verify-reproducible-build.yml that runs on pull requests affecting build files, manually via workflow_dispatch, and weekly on schedule
Implemented three jobs: parallel builds on different runners, sequential builds on the same runner, and verification that all digests match
Configured artifact uploads for digests and OCI archives with 1-day retention

Comments suppressed due to low confidence (1)

.github/workflows/verify-reproducible-build.yml:266

The summary generation step runs with 'if: always()' which means it will run even if the verify step fails. However, if the verify step fails early (e.g., during digest extraction), the output variables may not be set, and the summary will show empty digests. Consider adding error handling to display a more helpful message when outputs are missing, or checking if the outputs exist before using them in the summary.

            echo ""
            echo "| Environment | Digest |"
            echo "|-------------|--------|"
            echo "| ubuntu-latest | \`${{ steps.verify.outputs.digest_ubuntu_latest }}\` |"
            echo "| ubuntu-22.04 | \`${{ steps.verify.outputs.digest_ubuntu_22 }}\` |"
            echo "| ubuntu-24.04 | \`${{ steps.verify.outputs.digest_ubuntu_24 }}\` |"
            echo "| Sequential (first) | \`${{ steps.verify.outputs.digest_seq_first }}\` |"
            echo "| Sequential (second) | \`${{ steps.verify.outputs.digest_seq_second }}\` |"
            echo ""

            if [ "${{ steps.verify.outputs.result }}" = "passed" ]; then
              echo "### Result: REPRODUCIBLE"
              echo ""
              echo "All builds produced identical digests."
            else
              echo "### Result: NOT REPRODUCIBLE"
              echo ""
              echo "Build digests do not match across environments."
            fi
          } >> "$GITHUB_STEP_SUMMARY"
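One way to handle the missing outputs described in this comment is to render a placeholder instead of an empty cell. The following is a minimal sketch, not the workflow's actual code: `fmt_digest` and `summary_row` are hypothetical helper names, and the digest values are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render a digest for the summary table, or a
# placeholder when the verify step never produced one.
fmt_digest() {
  local digest="${1:-}"
  if [ -z "$digest" ]; then
    printf '%s' '_missing_'
  else
    printf '`%s`' "$digest"
  fi
}

# Hypothetical helper: emit one Markdown table row.
summary_row() {
  printf '| %s | %s |\n' "$1" "$(fmt_digest "${2:-}")"
}

# Example usage with made-up values; in the workflow the second argument
# would come from the verify step's outputs.
summary_row "ubuntu-22.04" "sha256:0123abcd"
summary_row "ubuntu-24.04" ""
```

With this shape, a verify step that fails early still yields a readable table, and the reader can see which builds never reported a digest.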

Copilot · 2026-02-04T14:46:58Z

.github/workflows/verify-reproducible-build.yml

+on:
+  workflow_dispatch:
+  pull_request:
+  schedule:
+    # Run weekly on Sunday at 00:00 UTC
+    - cron: '0 0 * * 0'
+
+jobs:
+  # Build on multiple runners in parallel to test cross-machine reproducibility
+  build:
+    name: Build (${{ matrix.runner }})
+    runs-on: ${{ matrix.runner }}
+    strategy:


The workflow runs on every pull request that modifies build-related files, plus weekly on schedule. Each run performs 5 complete builds (3 parallel + 2 sequential), which can be resource-intensive. Consider whether this frequency is necessary, or if the workflow should only run on schedule or manual trigger, with PR runs being optional via a label or comment.

Copilot · 2026-02-04T14:46:59Z

.github/workflows/verify-reproducible-build.yml

+          done
+
+          if [ "$FAILED" = true ]; then
+            exit 1
+          fi
+
+          # Compare all digests - they should all match
+          REFERENCE_DIGEST="$DIGEST_UBUNTU_LATEST"
+          MISMATCH=false
+
+          echo "=== Comparing all digests against reference (ubuntu-latest) ==="
+          echo ""


The script sets FAILED=true but continues checking all digests before exiting. However, if FAILED is true, the script will exit with code 1 before reaching the digest comparison logic. This means if any digest is empty, the comparison step never runs and the summary will not show which specific builds failed. Consider moving the empty check failure handling to after all comparisons, or providing more specific error information in the output before exiting.
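A possible way to give more specific error information before exiting is to collect the names of every build with an empty digest and print them together. This is only a sketch under assumptions: `missing_digests` is a hypothetical helper, and the names and digest values below are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of all builds whose digest is empty.
# Arguments are name/value pairs: name1 digest1 name2 digest2 ...
missing_digests() {
  local missing=()
  while [ "$#" -ge 2 ]; do
    if [ -z "$2" ]; then
      missing+=("$1")
    fi
    shift 2
  done
  # Print one name per line; prints nothing when every digest is present.
  if [ "${#missing[@]}" -gt 0 ]; then
    printf '%s\n' "${missing[@]}"
  fi
}

# Example with made-up values: the second build produced no digest.
MISSING="$(missing_digests \
  "ubuntu-latest" "sha256:aaaa" \
  "ubuntu-22.04" "" \
  "ubuntu-24.04" "sha256:aaaa")"

if [ -n "$MISSING" ]; then
  echo "Builds with no digest:"
  echo "$MISSING"
fi
```

The calling script could then write this list to the step summary before exiting with a non-zero status.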

Copilot · 2026-02-04T14:46:59Z

.github/workflows/verify-reproducible-build.yml

+
+          # Remove any cached layers
+          docker builder prune -af || true
+


The cleanup step removes the buildkit builder with a hardcoded name 'buildkit_20', but this builder is created with a conditional check in build-image.sh. If the builder doesn't exist when the cleanup runs, the '|| true' will suppress the error. However, if the builder was never created (because it already existed from a previous run), the state won't be truly fresh. Consider also cleaning up the builder before the first build to ensure a completely clean state, or document that this is intentional to test reproducibility with an existing builder.

Copilot · 2026-02-04T14:46:59Z

.github/workflows/verify-reproducible-build.yml

+            else
+              echo "[FAIL] ${name}: MISMATCH"
+              echo "  Expected: ${REFERENCE_DIGEST}"
+              echo "  Got:      ${digest}"
+              MISMATCH=true
+            fi
+          }
+
+          compare_digest "ubuntu-22.04" "$DIGEST_UBUNTU_22"
+          compare_digest "ubuntu-24.04" "$DIGEST_UBUNTU_24"
+          compare_digest "Sequential first" "$DIGEST_SEQ_FIRST"
+          compare_digest "Sequential second" "$DIGEST_SEQ_SECOND"


The variable MISMATCH is set inside the compare_digest function but is used after the function calls to determine the exit status. In bash, variables set inside functions are global by default (unless declared with 'local'), so this will work. However, this pattern can be error-prone and makes the code harder to understand. Consider making the function return a status code or using a more explicit approach to track mismatches.

Suggested change

-            else
-              echo "[FAIL] ${name}: MISMATCH"
-              echo "  Expected: ${REFERENCE_DIGEST}"
-              echo "  Got:      ${digest}"
-              MISMATCH=true
-            fi
-          }
-
-          compare_digest "ubuntu-22.04" "$DIGEST_UBUNTU_22"
-          compare_digest "ubuntu-24.04" "$DIGEST_UBUNTU_24"
-          compare_digest "Sequential first" "$DIGEST_SEQ_FIRST"
-          compare_digest "Sequential second" "$DIGEST_SEQ_SECOND"
+              return 0
+            else
+              echo "[FAIL] ${name}: MISMATCH"
+              echo "  Expected: ${REFERENCE_DIGEST}"
+              echo "  Got:      ${digest}"
+              return 1
+            fi
+          }
+
+          compare_digest "ubuntu-22.04" "$DIGEST_UBUNTU_22" || MISMATCH=true
+          compare_digest "ubuntu-24.04" "$DIGEST_UBUNTU_24" || MISMATCH=true
+          compare_digest "Sequential first" "$DIGEST_SEQ_FIRST" || MISMATCH=true
+          compare_digest "Sequential second" "$DIGEST_SEQ_SECOND" || MISMATCH=true
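For reference, the return-code pattern from this suggestion can be sketched as a complete, standalone script. The passing branch and its message are assumptions, since the quoted hunk only shows the failing branch, and the digest values are made up.

```shell
#!/usr/bin/env bash
REFERENCE_DIGEST="sha256:aaaa"   # made-up reference value
MISMATCH=false

# Returns 0 when the digest matches the reference, 1 otherwise.
compare_digest() {
  local name="$1" digest="$2"
  if [ "$digest" = "$REFERENCE_DIGEST" ]; then
    echo "[PASS] ${name}: matches reference"   # assumed wording
    return 0
  else
    echo "[FAIL] ${name}: MISMATCH"
    echo "  Expected: ${REFERENCE_DIGEST}"
    echo "  Got:      ${digest}"
    return 1
  fi
}

# The caller, not the function, records the failure.
compare_digest "ubuntu-22.04" "sha256:aaaa" || MISMATCH=true
compare_digest "ubuntu-24.04" "sha256:bbbb" || MISMATCH=true
echo "MISMATCH=${MISMATCH}"
```

Because each call is followed by `||`, a non-zero return does not abort the script even when the shell runs with `-e`, so every digest is still compared.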

Copilot · 2026-02-04T14:47:00Z

.github/workflows/verify-reproducible-build.yml

+          # Save digest to file for artifact upload
+          echo "${DIGEST}" > digest.txt
+
+      - name: Upload digest artifact
+        uses: actions/upload-artifact@v4
+        with:


The digest extraction uses 'jq -r' to get the raw digest value, which is correct. However, there's no validation that the digest matches the expected format (e.g., sha256:...). If skopeo returns an unexpected format or the '.Digest' field doesn't exist, 'jq -r' prints the literal string 'null', which is non-empty and so would not be caught by a plain emptiness check. Consider adding validation that the digest matches the expected format to provide clearer error messages.
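A format check along these lines could look like the sketch below. `is_valid_digest` is a hypothetical helper name, the sample digest is synthetic, and the sketch assumes only sha256 digests are expected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only for a well-formed sha256 digest
# (the prefix "sha256:" followed by exactly 64 lowercase hex characters).
is_valid_digest() {
  [[ "${1:-}" =~ ^sha256:[0-9a-f]{64}$ ]]
}

# Synthetic sample: 64 zeros after the prefix.
SAMPLE="sha256:$(printf '%064d' 0)"

for candidate in "$SAMPLE" "null" ""; do
  if is_valid_digest "$candidate"; then
    echo "ok:      '${candidate}'"
  else
    echo "invalid: '${candidate}'"
  fi
done
```

This rejects both an empty value and the string `null`, which an emptiness test alone would let through.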

Copilot · 2026-02-04T14:47:00Z

.github/workflows/verify-reproducible-build.yml

+      - name: Install build dependencies
+        run: |
+          sudo apt-get update


The matrix includes 'ubuntu-latest', 'ubuntu-22.04', and 'ubuntu-24.04', but 'ubuntu-latest' is currently an alias that points to one of these specific versions (typically the latest LTS). This means two runners in the matrix will be running the same underlying OS version, which doesn't add value for testing cross-machine reproducibility. Consider either removing 'ubuntu-latest' from the matrix or clarifying in comments that this is intentional to test the alias behavior.

Copilot · 2026-02-04T14:47:00Z

.github/workflows/verify-reproducible-build.yml

+        run: |
+          echo "=== Reproducibility Verification Results ==="
+          echo ""
+
+          # Read digests from artifact files
+          DIGEST_UBUNTU_LATEST=$(cat digest-ubuntu-latest/digest.txt)
+          DIGEST_UBUNTU_22=$(cat digest-ubuntu-22.04/digest.txt)
+          DIGEST_UBUNTU_24=$(cat digest-ubuntu-24.04/digest.txt)
+          DIGEST_SEQ_FIRST=$(cat digest-sequential/digest-first.txt)
+          DIGEST_SEQ_SECOND=$(cat digest-sequential/digest-second.txt)
+
+          # Export for summary step
+          echo "digest_ubuntu_latest=${DIGEST_UBUNTU_LATEST}" >> "$GITHUB_OUTPUT"
+          echo "digest_ubuntu_22=${DIGEST_UBUNTU_22}" >> "$GITHUB_OUTPUT"
+          echo "digest_ubuntu_24=${DIGEST_UBUNTU_24}" >> "$GITHUB_OUTPUT"
+          echo "digest_seq_first=${DIGEST_SEQ_FIRST}" >> "$GITHUB_OUTPUT"
+          echo "digest_seq_second=${DIGEST_SEQ_SECOND}" >> "$GITHUB_OUTPUT"


The workflow downloads artifacts with 'merge-multiple: false', which means each artifact will be downloaded into its own directory. This is correct given the subsequent code that reads from 'digest-ubuntu-latest/digest.txt', 'digest-ubuntu-22.04/digest.txt', etc. However, if the artifact names change (for example, if the runner names change), the hardcoded paths in lines 166-170 will break. Consider adding error handling or using a more dynamic approach to read from the downloaded artifacts.
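A more dynamic approach might discover the digest files instead of naming each path. The sketch below assumes the layout implied by the quoted paths (one directory per artifact, each holding files matching `digest*.txt`); `collect_digests` is a hypothetical helper and the demo directory and values are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "artifact-dir/file<TAB>digest" for every digest
# file found one level below the given directory.
collect_digests() {
  local root="$1" file
  for file in "$root"/*/digest*.txt; do
    [ -f "$file" ] || continue   # the glob matched nothing
    printf '%s\t%s\n' \
      "$(basename "$(dirname "$file")")/$(basename "$file")" \
      "$(cat "$file")"
  done
}

# Throwaway directory that mimics the assumed download layout.
DEMO_DIR="$(mktemp -d)"
trap 'rm -rf "$DEMO_DIR"' EXIT
mkdir -p "$DEMO_DIR/digest-ubuntu-22.04" "$DEMO_DIR/digest-sequential"
echo "sha256:aaaa" > "$DEMO_DIR/digest-ubuntu-22.04/digest.txt"
echo "sha256:aaaa" > "$DEMO_DIR/digest-sequential/digest-first.txt"
echo "sha256:bbbb" > "$DEMO_DIR/digest-sequential/digest-second.txt"

collect_digests "$DEMO_DIR"

# Count distinct digest values; 1 would mean all builds agree.
UNIQUE_COUNT="$(collect_digests "$DEMO_DIR" | cut -f2 | sort -u | wc -l | tr -d ' ')"
echo "distinct digests: ${UNIQUE_COUNT}"
```

The trade-off is that a missing artifact no longer fails loudly on its own, so the caller would still want to check that the expected number of digests was found.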

think-in-universe

Thanks Evrard. I was also thinking about automating the reproducible build verification yesterday. Thanks for adding this!

think-in-universe · 2026-02-05T00:24:09Z

.github/workflows/verify-reproducible-build.yml

+  pull_request:
+  schedule:
+    # Run weekly on Sunday at 00:00 UTC
+    - cron: '0 0 * * 0'


For the weekly cron job, I think we can compare the generated image digest with the one already published in the Docker registry.

Also, I think we could consider running on each push to main instead of on pull_request.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-02-06T11:53:23Z

.github/workflows/verify-reproducible-build.yml

+          echo "=== Cleaning build environment ==="
+
+          # Extract builder name from build-image.sh to stay in sync
+          BUILDER_NAME=$(grep -oP '(?<=--name )\w+' build-image.sh | head -1) || true


Regex fails to extract builder names with hyphens

Medium Severity

The regex pattern \w+ used to extract the builder name from build-image.sh only matches word characters (letters, digits, underscore) and stops at hyphens. If the builder name contains a hyphen (e.g., my-builder), only the portion before the hyphen is captured, resulting in docker buildx rm attempting to remove a non-existent builder. The actual builder remains with its cached state, potentially invalidating the sequential build reproducibility test.
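One possible fix is to widen the character class so that hyphens and dots are accepted. The sketch below is an assumption-laden illustration: `extract_builder_name` is a hypothetical helper, the demo line is made up rather than taken from build-image.sh, and the accepted character set is a guess at what builder names may contain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first value passed to --name in a script.
# Accepts letters, digits, underscore, dot, and hyphen in the name.
extract_builder_name() {
  grep -oE -- '--name[ =][A-Za-z0-9_.-]+' "$1" | head -n 1 | sed -E 's/^--name[ =]//'
}

# Made-up stand-in for build-image.sh.
DEMO_SCRIPT="$(mktemp)"
trap 'rm -f "$DEMO_SCRIPT"' EXIT
echo 'docker buildx create --use --name my-builder --driver docker-container' > "$DEMO_SCRIPT"

BUILDER_NAME="$(extract_builder_name "$DEMO_SCRIPT")"
echo "builder: ${BUILDER_NAME}"
```

This variant uses `grep -E` rather than `grep -P`, so it does not depend on PCRE support, and it also tolerates the `--name=value` form.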

feat: add GitHub Actions workflow for verifying reproducible builds

3e275d9

Copilot AI review requested due to automatic review settings February 4, 2026 14:42

Evrard-Nil temporarily deployed to Cloud API test env February 4, 2026 14:42 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Evrard-Nil February 4, 2026 14:42

fix: remove specific paths from pull request trigger in GitHub Actions workflow

8a937a7

Evrard-Nil temporarily deployed to Cloud API test env February 4, 2026 14:43 — with GitHub Actions Inactive

Copilot AI reviewed Feb 4, 2026

think-in-universe reviewed Feb 5, 2026

fix: update GitHub Actions workflow for reproducible builds to improve error handling and artifact management

4fa37d1

Evrard-Nil temporarily deployed to Cloud API test env February 6, 2026 11:45 — with GitHub Actions Inactive

cursor bot reviewed Feb 6, 2026
