Help modifying startFromTasks to support sub-WorkflowTasks #6

@hyjkim

Pyflow has a nice option for starting from a particular task ID when a workflow is run, e.g.:

wf = SomeWorkflow()
wf.run(startFromTasks='task_id')

This works great as long as your workflows are just one level deep. In some instances, though, I'd like to start from a specific sub-workflow task. Take the following example:

from pyflow import WorkflowRunner
import sys

class ChildA(WorkflowRunner):
    def workflow(self):
        self.flowLog('ChildA called')


class ChildB(WorkflowRunner):
    def workflow(self):
        self.flowLog('ChildB called')
        gcc_wc = GrandchildC()
        gcc_task = self.addWorkflowTask('grandchild_c', gcc_wc)

        gcd_wc = GrandchildD()
        self.addWorkflowTask('grandchild_d', gcd_wc, dependencies=gcc_task)

class GrandchildC(WorkflowRunner):
    def workflow(self):
        self.flowLog('GrandchildC called')


class GrandchildD(WorkflowRunner):
    def workflow(self):
        self.flowLog('GrandchildD called')

class Master(WorkflowRunner):
    def workflow(self):
        a_wf = ChildA()
        a_task = self.addWorkflowTask('child_a', a_wf)
        b_wf = ChildB()
        self.addWorkflowTask('child_b', b_wf, dependencies=a_task)

if __name__ == "__main__":
    startFromTasks = None
    if len(sys.argv) > 1:
        startFromTasks = sys.argv[1]
    wf = Master()
    wf.run(startFromTasks=startFromTasks, isContinue='Auto')

The workflow is launched as:

python workflow.py

Running the whole workflow generates a task graph like this:
[Image: example state]

As in my initial example, it's simple enough to launch from a child workflow:

python workflow.py child_b

But trying to start from a specific grandchild workflow results in no tasks being run at all:

python workflow.py child_b+grandchild_d

I'm guessing this is due to the way pyflow builds its DAG: a grandchild task will not be added to the DAG if its parent (the child task) is already marked as complete.
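
To make the guess above concrete, here is a toy model of that behaviour. It is not pyflow's actual scheduler code: the `TREE` dict, the `tasks_to_run` function and the `expand_ancestors` flag are all made up for illustration, and the `+` separator between nesting levels is assumed from the `child_b+grandchild_d` label used above. Dependencies between sibling tasks are ignored to keep the sketch short.

```python
# Toy model only: NOT pyflow's scheduler, just an illustration of the guess above.
SEP = "+"  # assumed separator between nesting levels, as in 'child_b+grandchild_d'

# Hypothetical nested workflow: task name -> sub-tasks (a leaf has no sub-tasks).
TREE = {
    "child_a": {},
    "child_b": {"grandchild_c": {}, "grandchild_d": {}},
}


def tasks_to_run(tree, start_from, expand_ancestors, prefix="", running=False):
    """Return the full labels of the tasks that would run, in definition order."""
    run = []
    for name, children in tree.items():
        label = prefix + name
        if running or label == start_from:
            # The start task, or a descendant of it: run it and its whole subtree.
            run.append(label)
            run += tasks_to_run(children, start_from, expand_ancestors,
                                label + SEP, True)
        elif expand_ancestors and start_from.startswith(label + SEP):
            # An ancestor of the start task: do not run it, but still look inside.
            run += tasks_to_run(children, start_from, expand_ancestors,
                                label + SEP, False)
        # Otherwise the task is treated as already complete and never expanded,
        # so none of its sub-tasks are ever seen.
    return run


if __name__ == "__main__":
    # Starting from a child works: its sub-tasks are expanded as usual.
    print(tasks_to_run(TREE, "child_b", expand_ancestors=False))
    # Starting from a grandchild: child_b is marked complete, so nothing runs.
    print(tasks_to_run(TREE, "child_b+grandchild_d", expand_ancestors=False))
    # Expanding ancestors of the start task makes the grandchild reachable.
    print(tasks_to_run(TREE, "child_b+grandchild_d", expand_ancestors=True))
```

In this model, the only change needed to reach a grandchild is to expand any task whose label is a prefix of the requested start label instead of marking it complete outright. Whether pyflow's real DAG construction can be changed the same way is exactly what I'm unsure about.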

Any ideas on how I could extend pyflow to support this feature?
