Make pods wait for user code containers to be actual ready by simeoncarstens · Pull Request #491 · tweag/chainsail

simeoncarstens · 2024-06-19T16:19:59Z

This is a work-in-progress attempt to solve #386. The current strategy is to add a readiness or startup probe to the user code containers (currently a startup probe, although a readiness probe is probably the more appropriate choice). The probe ensures that a user code container is actually ready, meaning that its gRPC services for log-prob / gradient evaluation can respond in a timely manner.
Once the probe succeeds, the container is deemed ready / started, and the pod can be considered ready.
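
To make "ready to respond in a timely manner" concrete, here is a minimal sketch of such a check. It is not the code from this PR: `responds_in_time` is a hypothetical helper, and a plain Python callable stands in for a call to the gRPC health check service.

```python
import concurrent.futures
from typing import Callable


def responds_in_time(call: Callable[[], object], timeout_s: float) -> bool:
    """Return True if `call` finishes without raising within `timeout_s` seconds.

    `call` is a stand-in (assumption) for a request to the user code
    server, for example a health check RPC.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(call)
    try:
        future.result(timeout=timeout_s)
        return True
    except Exception:
        # Covers both a timeout and an error raised by the call itself.
        return False
    finally:
        # Do not block on a call that is still hanging.
        executor.shutdown(wait=False)
```

A probe built on a check like this would report the container as ready only once the call succeeds within the deadline; a hanging or failing server counts as not ready.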

One potential pitfall is that the controller pod may also run a user code container that is actually used in the calculation. The controller container should therefore start sending out sampling requests only once all user code containers in the other pods are ready and the user code container in the controller pod itself is ready as well.
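
The condition described above could be sketched as follows. This is an illustration under assumptions, not the controller's actual logic: `wait_until_all_ready` and the mapping of container names to check functions are hypothetical, and each check is assumed to return `True` once the corresponding user code container is ready.

```python
import time
from typing import Callable, Mapping


def wait_until_all_ready(
    checks: Mapping[str, Callable[[], bool]],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
) -> bool:
    """Poll all readiness checks until every one has passed or the timeout expires.

    `checks` maps a (hypothetical) container name to a function that
    returns True once that container is ready. The controller pod's own
    user code container should be included alongside the remote ones.
    """
    deadline = time.monotonic() + timeout_s
    pending = set(checks)
    while True:
        # Keep only the containers that are still not ready; a container
        # that has passed once is not checked again.
        pending = {name for name in pending if not checks[name]()}
        if not pending:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
```

With a helper like this, the controller would send sampling requests only after the function returns `True` for the full set of checks, local and remote.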

I'm not sure whether a startup probe is enough, or whether we need an init container on the controller pod that delays the start of all controller pod containers until all other pods are ready.

simeoncarstens added 10 commits June 19, 2024 17:54

Add health check service proto

8dcff76

Add health check service to gRPC server

6db5f8d

Run protoc

9d2a725

Add startup probe to user code server container

27c3d7f

Remove hardcoding controller wait

8822e88

Bump user code Docker base image version

8fa7b6b

Turn startup probe into readiness probe

adee7a9

Update / remove dependencies

f33ac0b

Bump Docker base image

a5b4bd7

Remove cloudstorage dependency in code

e00372a

simeoncarstens mentioned this pull request Sep 25, 2024

Add BridgeStan to user code Docker image #480

Open

simeoncarstens wants to merge 10 commits into main from user-code-startup-probe