
Terminate Sbuild process on worker's exit

Carles Pina i Estany requested to merge worker-handler-sigterm-exiting into devel

If debusine-worker is running an sbuild task and systemd tries to stop it (SIGTERM), sbuild did not finish, so systemd waited its default 90 seconds before resorting to SIGKILL. I think people deploying debusine-worker might see this negatively (I personally don't like it when processes hold up the shutdown).

I know that we could change debusine-worker.service and add TimeoutStopSec= to make systemd use SIGKILL sooner, but that does not seem right (let me know if it is preferred).

We also had the "problem" that, when using debusine-worker outside systemd, sbuild or the processes spawned by sbuild could outlive debusine-worker (I haven't tested this, but it should be fixed in this MR).

In this MR:

  • Task has a cancelled boolean flag
  • Sbuild.execute calls Sbuild._execute_cmd, which sends SIGTERM to the subprocess if cancelled == True
  • If the process has not died 5 seconds after the SIGTERM, Sbuild._execute_cmd sends SIGKILL
  • It uses a process group so that the signals reach all the subprocesses that the subprocess.Popen command might have spawned
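
The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this MR: the class name CancellableCommand, the method names and the poll_interval parameter are hypothetical, and it assumes a POSIX system where start_new_session=True puts the child in its own process group.

```python
import os
import signal
import subprocess
import threading


class CancellableCommand:
    """Hypothetical stand-in for a Task that has a cancelled flag."""

    def __init__(self, term_timeout=5.0, poll_interval=0.1):
        self.cancelled = False
        self.term_timeout = term_timeout    # seconds between SIGTERM and SIGKILL
        self.poll_interval = poll_interval  # how often the flag is checked

    def cancel(self):
        self.cancelled = True

    def execute_cmd(self, cmd):
        # A new session gives the child its own process group, so killpg()
        # reaches every process that the command spawns.
        proc = subprocess.Popen(cmd, start_new_session=True)
        while True:
            try:
                return proc.wait(timeout=self.poll_interval)
            except subprocess.TimeoutExpired:
                if self.cancelled:
                    return self._terminate_group(proc)

    def _terminate_group(self, proc):
        try:
            pgid = os.getpgid(proc.pid)
            os.killpg(pgid, signal.SIGTERM)
        except ProcessLookupError:
            # The process group is already gone.
            return proc.wait()
        try:
            return proc.wait(timeout=self.term_timeout)
        except subprocess.TimeoutExpired:
            # Still alive after the grace period: escalate to SIGKILL.
            os.killpg(pgid, signal.SIGKILL)
            return proc.wait()


if __name__ == "__main__":
    task = CancellableCommand()
    # Simulate the worker receiving SIGTERM shortly after the command starts.
    threading.Timer(0.5, task.cancel).start()
    print(task.execute_cmd(["sleep", "30"]))
```

In the worker, the SIGTERM handler would be the place to set the flag; here a timer stands in for it.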

TODO:

  • test that all is good (i.e. sbuild is killed) when using debusine-worker outside systemd
  • decide what should happen API-wise if debusine-worker is killed while sbuild is running. Send anything to the API? Or just leave the job as "running" so that it runs again the next time debusine-worker connects?
  • I'm tempted to move Sbuild._execute_cmd to Task.execute_cmd. I expect that other subclasses of Task would benefit from Task.execute_cmd with the Task.cancelled flag. Thoughts about this? Or leave it for now and do it when the time comes?
Edited by Carles Pina i Estany
