Support resuming failed execution of a task graph #6

@sportsracer

Description

Sometimes, running many tasks takes a long time. If the graph fails when it's almost done, you currently need to rerun everything.

Solution: When execution of a graph fails, serialize the state and data of the task graph. Then, resume execution from that point. Note: You should be able to change the code of tasks between failure and retry, since bugs are often the cause of task failure.

This needs to be well thought through with respect to multiprocessing and sharing of data.
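To make the idea concrete, here is a rough single-process sketch of what checkpoint-and-resume could look like. It is not this project's API: `run_graph`, the `tasks` dict layout, and the pickle checkpoint file are all hypothetical names chosen for illustration. It stores only task results keyed by task name, not the task functions, which is one way to allow task code to change between failure and retry. It does not address the multiprocessing and data-sharing questions.

```python
import pickle
from graphlib import TopologicalSorter
from pathlib import Path


def run_graph(tasks, checkpoint):
    """Run tasks in dependency order, skipping results found in the checkpoint.

    tasks: dict mapping task name -> (function, list of dependency names).
           (Hypothetical layout, assumed for this sketch.)
    checkpoint: path to a pickle file holding results of completed tasks.
    """
    checkpoint = Path(checkpoint)
    # Only results are stored, keyed by task name, never the functions,
    # so task code can be edited between the failure and the retry.
    results = pickle.loads(checkpoint.read_bytes()) if checkpoint.exists() else {}

    order = TopologicalSorter({name: deps for name, (_, deps) in tasks.items()})
    try:
        for name in order.static_order():
            if name in results:
                continue  # completed in an earlier run
            func, deps = tasks[name]
            results[name] = func(*(results[d] for d in deps))
    except Exception:
        # Persist what finished so the next call resumes from here.
        checkpoint.write_bytes(pickle.dumps(results))
        raise
    checkpoint.unlink(missing_ok=True)  # graph finished, checkpoint not needed
    return results
```

Calling `run_graph` again with the same checkpoint path after fixing the failing task would rerun only the tasks that have no stored result. A real implementation would also need to decide when a stored result is stale, for example when an upstream task's code changed.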
