Server tasks stuck in running status forever
While investigating why my package upload did not happen, I realized that we had 18 running "aptmirror" tasks and that my package_upload task had not been picked up. Plenty of pending "aptmirror" tasks were waiting their turn too. I restarted the celery worker and two new tasks were picked up. They were clearly active, since I could see some "apt-get download" processes in the process tree.
But all the other workers seemed idle. Eventually I aborted the pending work requests and the running work requests that were multiple days old and things started to work again.
I'm not quite sure what to conclude from this, but I think that we have at the very least a problem when we restart the celery worker: the running work requests are not picked back up by the worker, and they are not put back in pending status either (or aborted and rescheduled, like we do for worker tasks).
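To make the expected recovery concrete, here is a minimal sketch of what could happen when the celery worker starts up. It is not the actual code: `Status`, `WorkRequest`, `requeue_orphaned` and `live_ids` are hypothetical names used only for illustration, and it assumes we have some way to know which work requests are still really being executed.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"


@dataclass
class WorkRequest:
    # Hypothetical stand-in for the real work request model.
    id: int
    task_name: str
    status: Status


def requeue_orphaned(work_requests, live_ids):
    """Put RUNNING work requests that nobody executes back to PENDING.

    live_ids is the (assumed) set of work request ids that a worker
    process is still actually running after the restart.
    """
    requeued = []
    for wr in work_requests:
        if wr.status is Status.RUNNING and wr.id not in live_ids:
            wr.status = Status.PENDING
            requeued.append(wr.id)
    return requeued


requests = [
    WorkRequest(1, "aptmirror", Status.RUNNING),
    WorkRequest(2, "aptmirror", Status.RUNNING),
    WorkRequest(3, "package_upload", Status.PENDING),
]
# Assume only work request 2 survived the restart.
requeued_ids = requeue_orphaned(requests, live_ids={2})
```

Whether the orphaned requests should go back to pending or be aborted and rescheduled is a separate design choice; the sketch only shows the first option.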
The fact that nothing happened while we had only 18 running server tasks, with the concurrency limit at 20, makes me question the scheduling logic of server tasks. Are we running it regularly as part of the scheduler?
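For reference, this is the behaviour I would expect from the scheduler, as a minimal sketch. `free_slots` and `pick_next` are hypothetical helpers, not existing functions, and the real scheduling logic may well be different.

```python
def free_slots(running_count, concurrency_limit):
    """Number of server tasks that could still be started."""
    return max(concurrency_limit - running_count, 0)


def pick_next(pending, running_count, concurrency_limit):
    """Return the pending tasks that fit in the free slots, oldest first."""
    return pending[: free_slots(running_count, concurrency_limit)]


# With 18 running tasks and a limit of 20, two pending tasks should start.
pending = ["package_upload", "aptmirror-a", "aptmirror-b"]
picked = pick_next(pending, running_count=18, concurrency_limit=20)
```

If the stale running work requests still count against the limit, this arithmetic would be consistent with exactly two tasks being picked up after the restart, but that is only a guess on my side.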