Fix: Use init container to populate /github/workflow directories#29

Open
rubionic wants to merge 3 commits into main from rubionic/init-container-github-workspace

@rubionic
Collaborator

Summary

This PR implements the init container solution described in #28 to fix the issue where /github/workflow/event.json doesn't exist in cached-privileged-kubernetes mode, causing actions like Docker Buildx to fail.

Problem

When using cached-privileged-kubernetes mode, the GitHub Actions runner-container-hooks package has a bug where the prepare script only executes when userMountVolumes are defined. This script is responsible for copying GitHub workspace directories from /__w/_temp/ to /github/, including the critical /github/workflow/event.json file.

Solution

Replace the dummy volume workaround with an explicit prepare-github-workspace init container that:

  • Runs before the main container starts
  • Has access to both /__w and /github volumes
  • Explicitly copies _github_home and _github_workflow directories
  • Provides clear logging and graceful error handling

Changes

  • Added: prepare-github-workspace init container to hook extension spec
  • Removed: dummy-prepare-trigger volume mount and volume workaround
  • Updated: All test expected outputs to reflect new init container
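For orientation, the shape of the added init container might look roughly like the sketch below, which builds the entry with plain Go structs and prints it as JSON. Only the container name `prepare-github-workspace` and the `/__w` and `/github` mount paths come from this PR. The struct types, the volume names `work` and `github`, the `busybox:stable` image, and the shell script are assumptions made for illustration, not the actual template in this repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// volumeMount and container are hypothetical stand-ins that mirror a small
// subset of the Kubernetes container fields; they are not the real API types.
type volumeMount struct {
	Name      string `json:"name"`
	MountPath string `json:"mountPath"`
}

type container struct {
	Name         string        `json:"name"`
	Image        string        `json:"image"`
	Command      []string      `json:"command"`
	VolumeMounts []volumeMount `json:"volumeMounts"`
}

// prepareInitContainer returns a sketch of the init container entry.
// The image and volume names are assumptions; adjust them to the real spec.
func prepareInitContainer(image string) container {
	script := `set -u
for pair in _github_home:home _github_workflow:workflow; do
  src="/__w/_temp/${pair%%:*}"
  dst="/github/${pair##*:}"
  if [ -d "$src" ]; then
    mkdir -p "$dst" && cp -r "$src/." "$dst/" && echo "copied $src -> $dst"
  else
    echo "skip: $src not found"
  fi
done`
	return container{
		Name:    "prepare-github-workspace",
		Image:   image,
		Command: []string{"sh", "-c", script},
		VolumeMounts: []volumeMount{
			{Name: "work", MountPath: "/__w"},      // assumed volume name
			{Name: "github", MountPath: "/github"}, // assumed volume name
		},
	}
}

func main() {
	out, err := json.MarshalIndent(prepareInitContainer("busybox:stable"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```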

Benefits

  • Reliable: Explicit copy operation independent of hook internals
  • Clear: Obvious intent and easy to debug
  • Visible: Init container logs show preparation progress
  • Standard: Uses common Kubernetes init container pattern

Testing

All tests have been updated and pass:

✓ go test ./pkg/templates/...
✓ go test ./internal/runner/template_spec/...

The init container will appear in the hook extension ConfigMap and execute before the main job container starts, ensuring /github/workflow/event.json exists for all actions that require GITHUB_EVENT_PATH.

Related Issues

Fixes #28

Future Work

Once this workaround is verified in production, we should:

  1. File an upstream bug report with actions/runner-container-hooks
  2. Monitor for upstream fix
  3. Remove init container workaround when fixed version is available

rubionic and others added 3 commits December 29, 2025 10:28
…tion

Add a dummy EmptyDir volume mount to the cached-privileged-kubernetes
container mode to work around a bug in GitHub's runner-container-hooks
that prevents /github/workflow/event.json from being populated.

The bug is in the k8s-novolume hook's prepare-job.ts where the prepare
script (which copies /github/workflow and /github/home content) only
gets created and executed if there are userMountVolumes. Without any
user volumes, the prepare script is never run, leaving /github/workflow
empty and causing Docker Buildx and other actions to fail.

This workaround adds a dummy volume mount at /tmp/dummy-prepare to
trigger the conditional logic that creates the prepare script. The
prepare script itself handles the case where userMountVolumes exist
and performs the necessary GitHub workspace directory copies as a
side effect.

Fixes #26

Replace the dummy volume workaround with an explicit init container that
copies GitHub workspace directories from /__w/_temp/ to /github/ before
the main container starts.

This fixes the issue where /github/workflow/event.json doesn't exist in
cached-privileged-kubernetes mode, causing actions like Docker Buildx
that require GITHUB_EVENT_PATH to fail.

The GitHub Actions runner-container-hooks package has a bug where the
prepare script only runs when userMountVolumes are defined. This init
container provides a more reliable and explicit solution.

Changes:
- Add prepare-github-workspace init container to hook extension spec
- Remove dummy-prepare-trigger volume mount and volume workaround
- Update all test expected outputs to reflect new init container

Fixes #28

The ARC controller was missing patch permissions for rolebindings and
serviceaccounts, causing finalizers to get stuck during runner scale set
deletion. This resulted in 'deskrun up' hanging indefinitely.

Changes:
- Add patch permission to rolebindings in controller ClusterRole overlay
- Add create/delete/get/patch permissions to serviceaccounts in overlay
- Update generate-base-templates.sh to include ConfigMap placeholder for
  privileged mode (required for ytt overlay to work after helm regeneration)
- Update test expected files to reflect new RBAC permissions and ConfigMap
  ordering