Improve work request scheduling with a tag-based approach

Right now, all scheduling relies on a can_work_on() function that requires instantiating each task to find the next work request for a given worker. This scales poorly with a very large number of work requests. We can significantly improve this process by having each work request document a set of tags that the desired worker must have.

The tags exposed by a worker can be created/extracted from the static and dynamic worker_data.

The tags requested by a work request can be a combination of user-submitted tags and tags generated by the task after analysis of the task_data. Sometimes the architecture requirement is quite explicit in the task_data (via "host_architecture" fields, for example), but sometimes it's much less explicit. For example, if the user provides a custom "environment_id", then that environment is built for a specific architecture and we must run the work request on that architecture, even if the task itself is fairly architecture-agnostic (like a lintian analysis).
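
To make the explicit and implicit cases concrete, here is a minimal sketch of how a task could compute its required tags. Everything except the "host_architecture" and "environment_id" field names is an assumption made for illustration: the `required_tags` helper, the `arch:` tag naming, and the `environment_architectures` mapping (standing in for whatever lookup resolves an environment to its architecture) do not exist in debusine.

```python
def required_tags(task_data, user_tags=(), environment_architectures=None):
    """Combine user-submitted tags with tags derived from task_data.

    environment_architectures is a hypothetical mapping from an
    environment id to the architecture that environment was built for.
    """
    environment_architectures = environment_architectures or {}
    tags = set(user_tags)

    # Explicit requirement: the task data names the architecture directly.
    if "host_architecture" in task_data:
        tags.add(f"arch:{task_data['host_architecture']}")

    # Implicit requirement: the chosen environment pins the architecture,
    # even when the task itself is architecture-agnostic.
    environment_id = task_data.get("environment_id")
    if environment_id in environment_architectures:
        tags.add(f"arch:{environment_architectures[environment_id]}")

    return frozenset(tags)
```

With this sketch, a lintian-like task whose task_data only carries an "environment_id" still ends up with an architecture tag, as long as the environment can be resolved.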

Then the scheduling logic can exclude work requests that need a tag that the current worker does not have.
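
As a rough illustration of that exclusion step, the check reduces to a subset test between two sets of tags, with no need to instantiate any task. The function name and the `(identifier, required_tags)` pair representation below are assumptions for the sketch, not the actual scheduler interface.

```python
def eligible_work_requests(worker_tags, pending):
    """Return the identifiers of work requests this worker may run.

    pending is an iterable of (identifier, required_tags) pairs; a work
    request is kept only if every tag it requires is provided by the worker.
    """
    worker_tags = frozenset(worker_tags)
    return [
        identifier
        for identifier, required in pending
        if frozenset(required) <= worker_tags
    ]
```

For example, a worker exposing `{"arch:amd64", "size:small"}` would keep a request requiring `{"arch:amd64"}` and skip one requiring `{"arch:arm64"}`. In practice the same filter would probably be expressed as a database query rather than in Python.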

Among the expected tags, we should have:

  • the architectures that can run (natively) on the worker
  • some small/medium/big classification based on the number of CPU cores/RAM/disk (to be documented/defined)
  • maybe the kind of worker? i.e. internal / external (server-side tasks vs worker-side tasks)
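
A sketch of how such tags could be derived from worker_data follows. The key names (`architectures`, `cpu_count`, `worker_type`), the tag prefixes, and the size thresholds are all placeholders chosen for illustration; the real classification is still to be defined.

```python
def worker_tags(worker_data):
    """Derive a worker's tags from its (static and dynamic) worker_data."""
    # One tag per architecture the worker can run natively.
    tags = {f"arch:{arch}" for arch in worker_data.get("architectures", [])}

    # Placeholder size classification based on CPU cores only.
    cores = worker_data.get("cpu_count", 0)
    if cores < 4:
        tags.add("size:small")
    elif cores < 16:
        tags.add("size:medium")
    else:
        tags.add("size:big")

    # Optional kind of worker (e.g. internal vs external).
    if "worker_type" in worker_data:
        tags.add(f"kind:{worker_data['worker_type']}")

    return frozenset(tags)
```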

Tags (or worker-tag relationships and workrequest-tag relationships) likely have some associated meta-data/permissions too:

  • whether a given user can set a given tag (might rely on a workspace membership?)
  • whether a worker exclusively runs tasks with a given tag, or if it can run work requests without that tag
  • whether the tag can be provided by the user or whether it can only be set indirectly by debusine through analysis of history and/or task parameters
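
The exclusivity rule in particular changes the matching logic, so it may be worth sketching. The `WorkerTag` class and `can_run` function below are hypothetical and only illustrate one possible reading: a worker must provide every required tag, and each of its exclusive tags must be among the tags the work request asks for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerTag:
    """A tag provided by a worker, with hypothetical exclusivity metadata."""

    name: str
    exclusive: bool = False  # if True, only run work requests asking for it


def can_run(worker_tags, required):
    """Check tag matching, including the exclusive-tag restriction."""
    provided = {tag.name for tag in worker_tags}
    required = set(required)
    # The worker must provide every tag the work request requires.
    if not required <= provided:
        return False
    # Every exclusive tag of the worker must be requested explicitly.
    return all(tag.name in required for tag in worker_tags if tag.exclusive)
```

Under this reading, a worker with an exclusive `team:security` tag refuses a work request that only requires `arch:amd64`, while a worker providing the same tag non-exclusively accepts it.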

Some initial design notes are available here and here. Other parts remain to be specified, in particular at the database model level and at the scheduler level.

Edited by Raphaël Hertzog