Sub-workflows can cause multiple instances of their root workflow's orchestrator to run concurrently

While working on #756 and thinking about #999, I noticed a more serious problem with sub-workflows that I'm not entirely sure how to solve. Take this hypothetical situation:

  • Root workflow A
    • Task B
    • Task C
    • Sub-workflow D (depends on B)
    • Sub-workflow E (depends on C)

Let's say B completes, unblocking D; the scheduler notices that D is now pending and schedules a Celery task to run A's orchestrator. Shortly afterwards, C completes, unblocking E; the scheduler notices that E is now pending and schedules a Celery task to run A's orchestrator. It is possible for this to result in two instances of A's orchestrator running concurrently. Depending on exactly what the orchestrator does, this might just result in one of the instances failing with a database integrity error or similar, or it might do something much more confusing. This situation must be avoided.
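
To make the race concrete, here's a minimal sketch of the scheduling path described above; the names (`run_workflow_orchestrator`, `on_child_completed`) are hypothetical, not our actual code:

```python
from celery import Celery

app = Celery("debusine")


@app.task
def run_workflow_orchestrator(root_workflow_id: int) -> None:
    """Run the orchestrator for the given root workflow (stub)."""
    ...


def on_child_completed(root_workflow_id: int) -> None:
    # Called once when B completes and again when C completes: each call
    # enqueues an independent orchestrator run, so two workers may end up
    # executing A's orchestrator at the same time.
    run_workflow_orchestrator.delay(root_workflow_id)
```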

For workflow callbacks, we prevent this by excluding pending callbacks whose parent already has another running callback (though with #999 this would instead need to check for any other running callback under the same root). That approach can't work for sub-workflows, because workflows remain running until all their children have finished.
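
For illustration, the existing exclusion might look roughly like this in Django ORM terms; `WorkRequest`, its fields, and the status enums here are assumptions rather than the actual schema:

```python
from django.db.models import Exists, OuterRef

from myapp.models import WorkRequest  # hypothetical model import


def schedulable_callbacks():
    # Another callback under the same parent that is currently running.
    running_sibling = WorkRequest.objects.filter(
        parent=OuterRef("parent"),
        status=WorkRequest.Statuses.RUNNING,
    )
    # Pending callbacks are schedulable only if no such sibling exists
    # (#999 would widen this to any running callback under the same root).
    return WorkRequest.objects.filter(
        status=WorkRequest.Statuses.PENDING,
    ).filter(~Exists(running_sibling))
```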

What we actually need to know is "are there any pending Celery tasks that will run this root workflow's orchestrator?", and if so, we shouldn't schedule another one, since they might end up running concurrently (and I want to avoid explicit locks). I can't think of a way to answer that with our current database model.

In some ways this is a refinement of the RUNNING status: there's a difference between "the workflow orchestrator itself is running" and "the workflow as a whole is running" that we don't currently model very well. Now that we have workflow_runtime_status, maybe we could express this as:

  • status: RUNNING; workflow_runtime_status: RUNNING — the workflow orchestrator itself is running
  • status: BLOCKED; workflow_runtime_status: RUNNING — the workflow orchestrator has nothing to do right now, but some of its children are running

Changing the semantics of status tends to be difficult, but it might be possible to make this work. If it could, the scheduler would just need to exclude callbacks and sub-workflows whose root workflow is RUNNING. I'm not sure whether this would conflict with other uses of status: BLOCKED, though.
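
Under those refined semantics, the scheduler check could be as simple as this sketch (`get_root()` is an assumed helper, and `WorkRequest` is the same hypothetical model as above):

```python
def may_schedule_under_root(work_request) -> bool:
    # Hypothetical helper: walk parent links up to the root workflow.
    root = work_request.get_root()
    # status == RUNNING would now mean the orchestrator itself is running;
    # BLOCKED with workflow_runtime_status == RUNNING would mean only
    # children are running, so scheduling another run is safe.
    return root.status != WorkRequest.Statuses.RUNNING
```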

Alternatively, we could create a new database table where we add a row each time we schedule a workflow orchestrator run, and delete it when that run finishes. This is more ad hoc and less elegant, but it would probably work.
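
As a sketch of what that could look like (all names illustrative, and reusing the hypothetical `run_workflow_orchestrator` task from above), a table with one row per root workflow lets `get_or_create()` decide atomically whether a run is already queued:

```python
from django.db import models


class PendingOrchestratorRun(models.Model):
    """One row per root workflow whose orchestrator is queued or running."""

    # "db.WorkRequest" is an assumed app/model label.
    workflow = models.OneToOneField("db.WorkRequest", on_delete=models.CASCADE)
    created_at = models.DateTimeField(auto_now_add=True)


def maybe_schedule_orchestrator(root_workflow) -> None:
    # The one-to-one (unique) constraint makes get_or_create() a
    # database-level atomic test-and-set, so only one caller wins.
    _, created = PendingOrchestratorRun.objects.get_or_create(
        workflow=root_workflow
    )
    if created:
        run_workflow_orchestrator.delay(root_workflow.id)
    # Otherwise a run is already queued or in progress; it will pick up
    # the newly pending children when it executes.


def orchestrator_finished(root_workflow) -> None:
    # Drop the marker so future child completions can schedule a new run.
    PendingOrchestratorRun.objects.filter(workflow=root_workflow).delete()
```

The appeal of this shape is that the "is a run already scheduled?" question is answered by a unique constraint rather than an explicit lock.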

Any other ideas?
