
Introduce timeout for lack of forward progress

In cases where --timeout-test is necessarily large (hours), it may be possible to detect a hung test and bail out earlier, rather than wait for the test timeout to expire, thus freeing resources.

Some hung tests can be detected by a lack of forward progress, meaning they stopped producing output.

This introduces a --timeout-test-noprogress option for this case: if nothing has been written to stdout or stderr for longer than the given period, VirtSubproc.Timeout is raised, which then triggers the existing test timeout handling.

For this to work, we need to direct output to pipes, because among other things Popen.communicate() waits for the process to exit, which is precisely what we don't want.
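To illustrate the idea, here is a minimal sketch of a no-progress watchdog built on pipes. It is not the code in this merge request: the names run_with_noprogress_timeout and NoProgressTimeout are hypothetical (the latter stands in for VirtSubproc.Timeout), and it assumes a POSIX host, where selectors can wait on pipes.

```python
import os
import selectors
import subprocess


class NoProgressTimeout(Exception):
    """Hypothetical stand-in for VirtSubproc.Timeout."""


def run_with_noprogress_timeout(argv, noprogress_timeout):
    """Run argv; raise NoProgressTimeout if neither stdout nor stderr
    produces any data for noprogress_timeout seconds.

    Returns (exit_code, combined_output_bytes) on normal completion.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    sel = selectors.DefaultSelector()
    open_streams = {proc.stdout, proc.stderr}
    for stream in open_streams:
        sel.register(stream, selectors.EVENT_READ)

    chunks = []  # stdout and stderr interleaved in arrival order
    try:
        while open_streams:
            # The timeout restarts on every call, so any output at all
            # counts as forward progress and resets the clock.
            events = sel.select(timeout=noprogress_timeout)
            if not events:
                raise NoProgressTimeout(
                    "no output for %s seconds" % noprogress_timeout
                )
            for key, _mask in events:
                data = os.read(key.fileobj.fileno(), 4096)
                if data:
                    chunks.append(data)
                else:
                    # Empty read means EOF: the child closed this stream.
                    sel.unregister(key.fileobj)
                    open_streams.discard(key.fileobj)
        return proc.wait(), b"".join(chunks)
    finally:
        sel.close()
        if proc.poll() is None:
            # Still running (timeout path): kill and reap the child.
            proc.kill()
            proc.wait()
        proc.stdout.close()
        proc.stderr.close()
```

Unlike Popen.communicate(), the loop returns control whenever the select call times out, so a silent process can be detected and killed long before it exits on its own. A test that is merely slow but keeps printing is unaffected, because each chunk of output restarts the wait.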

The Debian ROCm Team has been using this feature successfully for a while now. You can see this in action with our results for src:hipcub, where some tests enter some sort of infinite loop:

  • on a host with --timeout-test-noprogress=1200, the test run ends after 25 minutes (see log)
  • on a host without this feature, the test run needs the full 7 hours (our global timeout) before it is aborted (see log).
Edited by Christian Kastner
