Schedule and execute tasks on appropriate workers
As a followup to #7 (closed) and #5 (closed), we are covering here the logic used to match tasks with workers. Tasks contribute to it in two ways:
- they have an `analyze_worker` method that runs on the worker and generates some dynamic metadata that is returned to the debusine server. That dynamic metadata is then merged with any static data set by the debusine administrator (cf #4 (closed)). The static data set by the admin takes precedence over the dynamic metadata.
  - the dynamic metadata returned by `analyze_worker` are all namespaced to the `task_name`
  - among the data, there must be a version field that can be used to ensure compatibility between the data returned by the worker and the code running on the scheduler. The value should be a plain integer (defaulting to 1 if the task does not explicitly set a version).
- they have a `can_run_on(worker)` method that checks whether the worker can run the task based on the worker's metadata
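To make the two hooks concrete, here is a minimal sketch, not the actual debusine code. The `Worker` dataclass, the `Sbuild` task, the `available` key, and the `name:key` namespacing format are all assumptions made for illustration; only `analyze_worker`, `can_run_on(worker)`, the version field, and the "static wins" merge rule come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Worker:
    """Hypothetical server-side view of a worker."""

    dynamic_metadata: dict[str, Any] = field(default_factory=dict)
    static_metadata: dict[str, Any] = field(default_factory=dict)

    def metadata(self) -> dict[str, Any]:
        # Static data set by the admin takes precedence over dynamic data.
        return {**self.dynamic_metadata, **self.static_metadata}


class Sbuild:
    """Hypothetical task, used only as an example."""

    name = "sbuild"
    METADATA_VERSION = 1  # default when a task does not set its own version

    def analyze_worker(self) -> dict[str, Any]:
        """Runs on the worker; keys are namespaced with the task name."""
        raw = {"version": self.METADATA_VERSION, "available": True}
        # The "name:key" format is an assumption, not a documented convention.
        return {f"{self.name}:{key}": value for key, value in raw.items()}

    def can_run_on(self, worker: Worker) -> bool:
        """Runs on the server; rejects metadata from an incompatible version."""
        metadata = worker.metadata()
        if metadata.get(f"{self.name}:version") != self.METADATA_VERSION:
            return False
        return bool(metadata.get(f"{self.name}:available", False))
```

In this sketch an administrator could disable the task on one worker by setting `sbuild:available` to `False` in the static metadata, since the static value overrides whatever the worker reports.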
The debusine server stores the latest version of the dynamic metadata along with a timestamp. It also triggers a refresh of the dynamic metadata every 24 hours. A debusine worker that is (re)started spontaneously sends updated metadata.
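The refresh rule can be sketched as a simple staleness check against the stored timestamp. The function name `needs_refresh` and the constant `REFRESH_INTERVAL` are hypothetical; how the server actually schedules the refresh is not specified here.

```python
from datetime import datetime, timedelta

# 24 hours, as stated above; the constant name is an assumption.
REFRESH_INTERVAL = timedelta(hours=24)


def needs_refresh(updated_at: datetime, now: datetime) -> bool:
    """Return True when the stored dynamic metadata is at least 24 hours old."""
    return now - updated_at >= REFRESH_INTERVAL
```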
In the static metadata set by the administrator there are two keys with a special meaning:
- tasks_denylist: any task named in that list will not be allowed to run on this worker
- tasks_allowlist: only tasks named in that list will be allowed to run
When a new task is submitted, the server checks whether it can find at least one worker that would be suitable for the task. If none are suitable, the task is immediately marked as failed with the message "No suitable worker found". Otherwise it keeps the task in the queue as "Waiting".
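A minimal sketch of that submission check follows. The function name `status_on_submit`, the `"Failed"` status label, and the representation of a worker as a plain metadata dict are assumptions; the sketch also assumes that suitability is decided by `can_run_on` alone.

```python
from typing import Any, Callable, Iterable, Optional


def status_on_submit(
    can_run_on: Callable[[dict[str, Any]], bool],
    workers: Iterable[dict[str, Any]],
) -> tuple[str, Optional[str]]:
    """Return (status, failure message) for a newly submitted task."""
    if any(can_run_on(worker) for worker in workers):
        # At least one worker is suitable: the task waits in the queue.
        return ("Waiting", None)
    return ("Failed", "No suitable worker found")
```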
When a worker is looking for a new task, the server identifies the next task that is suitable for this worker:
- it generates the list of tasks this worker may run from the `tasks_denylist` and `tasks_allowlist` keys (the full list of existing tasks should be known to the server!). If `tasks_allowlist` is present, then that's the list of tasks to use. Otherwise all tasks are used except those listed in `tasks_denylist`.
- it filters the queue to include only tasks in that list, sorts them by descending age (oldest first), and picks the first one where `task.can_run_on(worker)` is True.
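The two steps above can be sketched as follows. This is an illustration under assumptions, not the server's implementation: `QueuedTask`, `allowed_task_names`, and `next_task` are hypothetical names, the worker is reduced to its merged metadata dict, and `can_run_on` is modelled as a plain callable.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Iterable, Optional


@dataclass
class QueuedTask:
    """Hypothetical queue entry."""

    task_name: str
    created_at: datetime
    can_run_on: Callable[[dict[str, Any]], bool]


def allowed_task_names(all_tasks: set[str], metadata: dict[str, Any]) -> set[str]:
    """Step 1: an allowlist, when present, replaces the full task list."""
    if "tasks_allowlist" in metadata:
        return set(metadata["tasks_allowlist"])
    return all_tasks - set(metadata.get("tasks_denylist", []))


def next_task(
    queue: Iterable[QueuedTask],
    all_tasks: set[str],
    metadata: dict[str, Any],
) -> Optional[QueuedTask]:
    """Step 2: oldest allowed task whose can_run_on accepts this worker."""
    allowed = allowed_task_names(all_tasks, metadata)
    candidates = [task for task in queue if task.task_name in allowed]
    # Descending age means the earliest creation time comes first.
    for task in sorted(candidates, key=lambda task: task.created_at):
        if task.can_run_on(metadata):
            return task
    return None
```

Note that the allowlist takes effect even when a denylist is also set, which matches the rule that the denylist is only consulted when no allowlist is present.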