janitor resource cleanup draft #73
Draft
PR Details
Clickup Link -
Description
THIS IS A DRAFT PR!
The purpose of this PR is to complete the cancellation and resource cleanup process within heimdall. Currently, cancellation puts a job in a cancelling state, which terminates the handler. Remote resources may still be running after the handler exits, however, so we want a mechanism that picks up stale or cancelling jobs and ensures that all of their resources have been properly terminated.
To do this, we pass the command handlers to the janitor, which then actively queries the DB for any stale or cancelling jobs. When it finds one, it looks up the appropriate cluster and invokes the cleanup handler associated with the job's plugin (using the same interface that the normal handler uses). Each plugin therefore needs to implement a cleanup() function that uses values from the job/command/cluster context to terminate its resources.
I have updated the ECS plugin as an example of how the above approach would be implemented.
TODO:
For resources that are created in a way that cannot be linked back to a heimdall job, a cancellationCtx field has been added to the job object. A generic way to update this field during plugin execution has not been developed yet.
Types of changes
Checklist