
fix: add image pre-pull before upgrades to prevent delays from slow image pulls (OP-237) #2128

Open
kristina-solovyova wants to merge 1 commit into 01-29-chore_pass_wekai-endpoint_from_gh_vars from 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_

Conversation

@kristina-solovyova
Collaborator

@kristina-solovyova kristina-solovyova commented Jan 27, 2026

TL;DR

Added image pre-pull functionality to improve cluster and client upgrades by ensuring container images are downloaded to nodes before the upgrade process begins.

What changed?

  • Added image pre-pull capability that creates one or more DaemonSets to pull container images on target nodes before upgrades
  • Implemented pre-pull operations for both WekaCluster and WekaClient resources
  • Added configuration options in Helm chart values to control pre-pull behavior:
    • upgrade.imagePrePull.enabled - Enable/disable pre-pulling (default: true)
    • upgrade.imagePrePull.timeout - Maximum time to wait for image pulls (default: 20m)
  • Added cleanup logic to remove pre-pull DaemonSets after successful upgrades
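
For readers unfamiliar with the pattern, here is a minimal sketch of what a pre-pull DaemonSet can look like, written as a plain Python dict rather than the operator's actual code. The function name, labels, node selector, and the example image reference are all hypothetical, and the sketch assumes the target image contains a `true` binary so the init container can exit immediately after the kubelet has pulled it.

```python
import json


def build_prepull_daemonset(name, namespace, image, node_selector):
    """Return an apps/v1 DaemonSet manifest (as a dict) that pulls `image` on matching nodes.

    Hypothetical helper for illustration; not the operator's implementation.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "nodeSelector": node_selector,
                    # The init container exists only to make the kubelet pull the image.
                    # Assumption: the image ships a `true` binary, so it exits at once.
                    "initContainers": [
                        {"name": "pull", "image": image, "command": ["true"]}
                    ],
                    # A tiny long-running container lets the pod become Ready
                    # once the init container (and therefore the pull) has finished.
                    "containers": [
                        {"name": "pause", "image": "registry.k8s.io/pause:3.9"}
                    ],
                },
            },
        },
    }


# Placeholder values; the real names and selectors are not given in this PR description.
manifest = build_prepull_daemonset(
    name="image-prepull-example",
    namespace="default",
    image="registry.example.com/app:target-version",
    node_selector={"example.com/upgrade-target": "true"},
)
print(json.dumps(manifest, indent=2))
```

Because a DaemonSet schedules one pod per matching node, every target node pulls the image concurrently, and deleting the DaemonSet afterwards leaves the image in each node's local cache.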

How to test?

Use PRE_SETUP_HOOK to make sure no existing WekaCluster is running before the test starts.

Why make this change?

Image pre-pulling significantly improves upgrade reliability and performance by:

  • Preventing upgrade failures due to image pull timeouts during rolling updates
  • Reducing upgrade duration by downloading images in parallel across all nodes
  • Ensuring images are available on nodes before the critical upgrade process begins
  • Providing visibility into image pull progress before committing to an upgrade

This is especially valuable in environments with slow image registry access or large container images.
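
As an illustration of how pull progress and the `upgrade.imagePrePull.timeout` setting could gate an upgrade, the sketch below parses a duration such as `20m` and classifies a DaemonSet status. The `desiredNumberScheduled` and `numberReady` fields are standard DaemonSet status fields; the function names and the three state labels are assumptions, and what the operator actually does when the timeout expires is not stated in this description.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_timeout(value):
    """Convert a duration string such as '20m' or '1h30m' into seconds."""
    parts = re.findall(r"(\d+)([smh])", value)
    # Reject strings with leftover characters, e.g. '20' or '20x'.
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def prepull_state(status, elapsed_seconds, timeout_seconds):
    """Classify pre-pull progress from a DaemonSet status dict.

    Returns 'complete', 'timed_out', or 'pulling' (hypothetical labels).
    """
    desired = status.get("desiredNumberScheduled", 0)
    ready = status.get("numberReady", 0)
    if desired > 0 and ready >= desired:
        return "complete"
    if elapsed_seconds >= timeout_seconds:
        return "timed_out"
    return "pulling"


timeout = parse_timeout("20m")  # the default from the Helm values above
status = {"desiredNumberScheduled": 5, "numberReady": 3}
print(prepull_state(status, elapsed_seconds=90, timeout_seconds=timeout))
```

The ratio of `numberReady` to `desiredNumberScheduled` is also the natural progress figure to surface to an operator before the upgrade is allowed to proceed.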

Collaborator Author

kristina-solovyova commented Jan 27, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.


How to use the Graphite Merge Queue

Add the label main-merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has required the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from 6ff45cb to b4118b1 Compare January 27, 2026 12:50
@kristina-solovyova kristina-solovyova added the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Jan 28, 2026 — with Graphite App
@kristina-solovyova kristina-solovyova changed the base branch from main to graphite-base/2128 January 28, 2026 14:47
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from b4118b1 to 1029992 Compare January 28, 2026 14:47
@kristina-solovyova kristina-solovyova changed the base branch from graphite-base/2128 to 01-28-chore_upgrade_dagger_version_to_0.19.10 January 28, 2026 14:47
@kristina-solovyova kristina-solovyova removed the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Jan 28, 2026
@kristina-solovyova kristina-solovyova force-pushed the 01-28-chore_upgrade_dagger_version_to_0.19.10 branch from a00ff0f to 576914f Compare January 28, 2026 15:10
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch 2 times, most recently from 1029992 to 0730cc9 Compare January 28, 2026 15:10
@graphite-app graphite-app bot changed the base branch from 01-28-chore_upgrade_dagger_version_to_0.19.10 to graphite-base/2128 January 28, 2026 16:07
@graphite-app graphite-app bot changed the base branch from graphite-base/2128 to main January 28, 2026 16:08
@graphite-app graphite-app bot force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from 0730cc9 to 249e211 Compare January 28, 2026 16:09
@kristina-solovyova kristina-solovyova changed the base branch from main to graphite-base/2128 January 29, 2026 11:48
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from 249e211 to 6212b51 Compare January 29, 2026 11:48
@kristina-solovyova kristina-solovyova changed the base branch from graphite-base/2128 to 01-29-chore_pass_wekai-endpoint_from_gh_vars January 29, 2026 11:48
@kristina-solovyova kristina-solovyova added the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Jan 29, 2026 — with Graphite App
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch 2 times, most recently from 3ca7fef to 7590092 Compare January 29, 2026 22:14
@kristina-solovyova kristina-solovyova removed the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Jan 29, 2026
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from 7590092 to be86f4e Compare January 29, 2026 22:15
@kristina-solovyova kristina-solovyova force-pushed the 01-29-chore_pass_wekai-endpoint_from_gh_vars branch from b374967 to 4859706 Compare January 29, 2026 22:17
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch 3 times, most recently from 795c018 to 2d2dd90 Compare January 30, 2026 21:45
@weka weka deleted a comment from scalar-workers bot Jan 30, 2026
@weka weka deleted a comment from scalar-workers bot Jan 30, 2026
@weka weka deleted a comment from scalar-workers bot Jan 30, 2026
@weka weka deleted a comment from scalar-workers bot Jan 30, 2026
@weka weka deleted a comment from scalar-workers bot Jan 30, 2026
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from 2d2dd90 to 844518d Compare February 1, 2026 21:09
@kristina-solovyova kristina-solovyova marked this pull request as ready for review February 1, 2026 21:14
@kristina-solovyova kristina-solovyova requested a review from a team as a code owner February 1, 2026 21:14
@graphite-app graphite-app bot requested review from assafgi and tigrawap February 1, 2026 21:14
@graphite-app

graphite-app bot commented Feb 1, 2026

Graphite Automations

"Add anton/matt/sergey/kristina as reviwers on operator PRs" took an action on this PR • (02/01/26)

2 reviewers were added to this PR based on Anton Bykov's automation.

@kristina-solovyova kristina-solovyova added the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Feb 2, 2026 — with Graphite App
@kristina-solovyova kristina-solovyova force-pushed the 01-29-chore_pass_wekai-endpoint_from_gh_vars branch from 4859706 to 6e76c76 Compare February 2, 2026 10:43
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch 2 times, most recently from 35b74ed to bb1caea Compare February 2, 2026 10:50
@kristina-solovyova kristina-solovyova force-pushed the 01-29-chore_pass_wekai-endpoint_from_gh_vars branch from 6e76c76 to dd0fc9b Compare February 2, 2026 10:50
@weka weka deleted a comment from scalar-workers bot Feb 2, 2026
@weka weka deleted a comment from scalar-workers bot Feb 2, 2026
@kristina-solovyova kristina-solovyova force-pushed the 01-27-fix_add_image_pre-pull_before_upgrades_to_prevent_delays_from_slow_image_pulls_op-237_ branch from bb1caea to 9a56bcd Compare February 2, 2026 11:18
@kristina-solovyova kristina-solovyova force-pushed the 01-29-chore_pass_wekai-endpoint_from_gh_vars branch from dd0fc9b to 3c00867 Compare February 2, 2026 11:18
@kristina-solovyova kristina-solovyova removed the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Feb 2, 2026
@kristina-solovyova kristina-solovyova added the run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks label Feb 11, 2026 — with Graphite App
@scalar-workers

🤖 WekAI Remote Test Execution

Status: ❌ Failed
Worker: operator-ci

Error Details:

Wekai process failed with exit code 1

Result Preview:

{"plan_id":"","plan_url":"","status":"initializing","success":false,"result":"","error_details":"timeout waiting for document generation"}

Last updated: 2026-02-11 14:15:05 UTC • Commit: 68e2dca

@weka weka deleted a comment from scalar-workers bot Feb 11, 2026
@weka weka deleted a comment from scalar-workers bot Feb 11, 2026
@weka weka deleted a comment from scalar-workers bot Feb 11, 2026
@weka weka deleted a comment from scalar-workers bot Feb 11, 2026

Labels

run_ci_on_merge_queue_plan Run upgrade-extended test with generated AI hooks


2 participants