simultaneous startup of autopkgtest causes lxc failures

I currently believe we're facing an lxc startup issue on our hosts that

  • run bullseye (I don't think I saw the issue on buster)
  • have multiple debci runners
  • start multiple autopkgtest runs at approximately the same time.

The issue is that tests regularly tmpfail (see alerts). The pattern I think I recognize is that the failure happens when multiple autopkgtest runs are started at the same moment, e.g. when multiple debci workers are idle and britney triggers a whole set of tests. The effect is most clearly seen in the stats of our armhf host, where the number of leftover lxc containers typically increases several minutes after the full hour, and does so in steps (the host currently has 12 debci workers).

Interestingly, we don't currently see it on our ppc64el hosts (which each have two debci workers). Our amd64/i386 workers don't show it either, because they either have only one debci worker or still run buster (ci-worker13).

A failing log looks like this:

autopkgtest [13:07:56]: host ci-worker-armel-01; command line: /usr/bin/autopkgtest --no-built-binaries '--setup-commands=echo '"'"'node-uniqs testing/armhf'"'"' > /var/tmp/debci.pkg 2>&1 || true' '--setup-commands=echo '"'"'Acquire::Retries "10";'"'"' > /etc/apt/apt.conf.d/75retry 2>&1 || true' --user debci --apt-upgrade '--add-apt-source=deb http://incoming.debian.org/debian-buildd buildd-unstable main contrib non-free' --add-apt-release=unstable --pin-packages=unstable=src:node-uniqs --output-dir /tmp/tmp.UAuICJsTJ0/autopkgtest-incoming/testing/armhf/n/node-uniqs/16399047 node-uniqs -- lxc --sudo --name ci-307-cb6c1a3c autopkgtest-testing-armhf
lxc-start: ci-307-cb6c1a3c: lxccontainer.c: wait_on_daemonized_start: 859 Received container state "ABORTING" instead of "RUNNING"
lxc-start: ci-307-cb6c1a3c: tools/lxc_start.c: main: 308 The container failed to start
lxc-start: ci-307-cb6c1a3c: tools/lxc_start.c: main: 311 To get more details, run the container in foreground mode
lxc-start: ci-307-cb6c1a3c: tools/lxc_start.c: main: 313 Additional information can be obtained by setting the --logfile and --logpriority options
<VirtSubproc>: failure: lxc-start failed with exit status 1
autopkgtest [13:08:03]: ERROR: testbed failure: cannot send to testbed: [Errno 32] Broken pipe
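
To check whether these failures really cluster shortly after the full hour, one could scan the collected logs for the signature above and count hits per minute of the hour. The following is only a rough sketch, not something we run today: the function names are made up for illustration, and it assumes each log is available as plain text with the same "autopkgtest [HH:MM:SS]" prefix as in the log above.

```python
import re
from collections import Counter

# Failure signature copied from the lxc-start output in the log above.
ABORT_RE = re.compile(r'Received container state "ABORTING" instead of "RUNNING"')
# autopkgtest prefixes its own lines with a wall-clock time, e.g.
# "autopkgtest [13:07:56]: host ..."
STAMP_RE = re.compile(r'^autopkgtest \[(\d{2}):(\d{2}):(\d{2})\]')


def failure_minute(log_text):
    """Return the minute-of-hour of the last autopkgtest timestamp seen
    before the ABORTING line, or None if the log does not show that
    failure (or no timestamp precedes it)."""
    minute = None
    for line in log_text.splitlines():
        m = STAMP_RE.match(line)
        if m:
            minute = int(m.group(2))
        elif ABORT_RE.search(line):
            return minute
    return None


def minute_histogram(logs):
    """Count ABORTING failures per minute-of-hour over an iterable of
    log texts (hypothetical helper, one string per test run)."""
    return Counter(m for m in map(failure_minute, logs) if m is not None)
```

If the hypothesis holds, the histogram for a host with many workers should be dominated by a few minutes just after the hour, rather than being spread evenly.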