Display input to workflows
Split out of #549 (closed) as workflows got significantly more complex.
Workflows depend heavily on collections, especially in our default configurations. We currently cannot show the input to a workflow, because that input is made up of lookups that are only resolved by the child work requests.
We discussed possible approaches to resolving this, and decided to store the workflow's input data in the workflow's dynamic data. The dynamic data calculation happens fairly early in the scheduler, when not all of the input is available yet, so it will need to be recomputed whenever a child work request gets unblocked.
The re-computation could rely on pubsub (#610) or the same architecture that we use to update_workflows.
If dynamic data changes, that's probably a good signal that the workflow needs re-orchestration (although no workflows currently use dynamic data in this way). An algorithm could be: recompute dynamic data; if it has changed, run the orchestrator. Then trigger the parents' dynamic data update.
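A minimal sketch of that algorithm, to make the control flow concrete. Everything here is hypothetical: the Workflow class, its fields and refresh_dynamic_data are stand-ins, not existing code. It also assumes that the parent update is triggered whether or not the child's data changed, which the description above leaves open.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Workflow:
    """Hypothetical stand-in for a workflow work request."""

    name: str
    # Hypothetical hook that resolves the workflow's lookups right now.
    compute_dynamic_data: Callable[[], dict]
    parent: Optional["Workflow"] = None
    dynamic_data: dict = field(default_factory=dict)
    orchestrator_runs: int = 0

    def orchestrate(self) -> None:
        # Placeholder: a real orchestrator would create or update children.
        self.orchestrator_runs += 1


def refresh_dynamic_data(workflow: Workflow) -> None:
    """Recompute dynamic data, re-orchestrate on change, then walk up."""
    current: Optional[Workflow] = workflow
    while current is not None:
        new_data = current.compute_dynamic_data()
        if new_data != current.dynamic_data:
            current.dynamic_data = new_data
            current.orchestrate()
        # Assumption: parents are always refreshed, changed or not.
        current = current.parent
```

Calling refresh_dynamic_data a second time with unchanged lookups does not run any orchestrator again, which is the property that keeps the update cheap when nothing moved.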
UI:
Ideally we'd display for each input lookup:
- The lookup
- The resolved artifacts
- Any promises that we're waiting for
But this requires a redesign of the current flat artifact list.
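One possible shape for the per-lookup data that would replace the flat list, as a rough sketch only. ResolvedInput, its field names and render are hypothetical, and the lookup and promise strings in the usage are placeholders rather than real lookup syntax.

```python
from dataclasses import dataclass, field


@dataclass
class ResolvedInput:
    """Hypothetical record for one input lookup of a workflow."""

    lookup: str
    artifact_ids: list[int] = field(default_factory=list)
    pending_promises: list[str] = field(default_factory=list)

    @property
    def is_complete(self) -> bool:
        # Complete once no promise is still outstanding.
        return not self.pending_promises


def render(inputs: list[ResolvedInput]) -> list[str]:
    """Group artifacts and promises under the lookup that produced them."""
    lines: list[str] = []
    for item in inputs:
        lines.append(f"{item.lookup}:")
        lines.extend(f"  artifact {artifact_id}" for artifact_id in item.artifact_ids)
        lines.extend(f"  waiting for {promise}" for promise in item.pending_promises)
    return lines
```

For example, render([ResolvedInput("lookup-a", [10, 11]), ResolvedInput("lookup-b", [], ["promise-x"])]) lists the two artifacts under the first lookup and the outstanding promise under the second.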
Concerns:
There's still the risk of skew if a collection changes between a task's dynamic data calculation and its parent's dynamic data calculation. The workflow page would then show the wrong input data.
The best example of this is environments. If a debian-pipeline runs sbuild today and the reverse-autopkgtests take two more days, there could be 3 sets of environment artifacts involved if we update them daily. The workflow would only show one set.
Other potential approaches:
- We could look into children's dynamic_task_data, and extract the real inputs for the workflow. But this adds complexity to workflow creation.
- We could pass both lookups and the resolved lookup artifact lists to created children. But this also adds complexity to the workflow as it needs to understand the constraints that the child would put onto the lookup.
- We could verify that lookup results we obtain really appear in children's get_source_artifacts_ids. But this only stops us from listing unrelated artifacts, it doesn't provide missing input artifacts.
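The last approach amounts to a filter, sketched below under assumptions. confirmed_inputs is a hypothetical helper, and children_source_artifact_ids is assumed to hold, for each child, the IDs that its get_source_artifacts_ids would return; the actual return shape is not confirmed here.

```python
from collections.abc import Iterable


def confirmed_inputs(
    lookup_result_ids: Iterable[int],
    children_source_artifact_ids: Iterable[Iterable[int]],
) -> list[int]:
    """Keep only the lookup results that at least one child actually uses."""
    used: set[int] = set()
    for ids in children_source_artifact_ids:
        used.update(ids)
    # Preserve the order of the lookup results; drop unconfirmed ones.
    return [artifact_id for artifact_id in lookup_result_ids if artifact_id in used]
```

The filter can only remove entries: an artifact that a child used but that the workflow's own lookup no longer returns (the skew case above) never enters the result, which is the limitation noted in the bullet.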