Worker stuck in running state if work-request-completed can't be transmitted
As far as I can tell, this is a recurrence of #263 (closed). We saw it with https://debusine.debian.net/debian/developers/work-request/98525/, which was assigned to debusine-worker-amd64-hades.05.freexian.com. The logs of that worker show the following.
Worker logs
2025-06-02 12:06:58,840 Connected to https://debusine.debian.net/api
2025-06-02 16:45:19,768 Work request 98525: Fetching input
2025-06-02 16:45:20,198 Downloading artifact and uncompressing into /tmp/debusine-fetch-exec-upload-31jwhu9g
2025-06-02 16:45:20,811 tcpdf_6.3.5+dfsg1-1+deb11u1.debian.tar.xz
tcpdf_6.3.5+dfsg1-1+deb11u1.dsc
tcpdf_6.3.5+dfsg1.orig.tar.xz
2025-06-02 16:45:22,712 Artifact file downloaded: /var/lib/debusine/worker/system-images/2057740/.download.tmp.system.tar.xz
2025-06-02 16:45:22,720 Work request 98525: Configuring for execution
2025-06-02 16:45:22,720 Work request 98525: Preparing to run
2025-06-02 16:45:22,720 Work request 98525: Running
2025-06-02 16:45:22,720 Executing: sbuild --purge-deps=never --no-run-lintian --no-arch-any --arch-all --no-source --arch=amd64 --dist=bullseye --chroot-mode=unshare --chroot=/var/lib/debusine/worker/system-images/2057740/system.tar.xz --chroot-setup-commands=rm -f /etc/resolv.conf --pre-build-commands=cat /tmp/debusine-fetch-exec-upload-31jwhu9g/extra_repository_0.sources | %SBUILD_CHROOT_EXEC sh -c 'cat > /etc/apt/sources.list.d/extra_repository_0.sources' --pre-build-commands=cat /tmp/debusine-fetch-exec-upload-31jwhu9g/extra_repository_1.sources | %SBUILD_CHROOT_EXEC sh -c 'cat > /etc/apt/sources.list.d/extra_repository_1.sources' --bd-uninstallable-explainer=dose3 /tmp/debusine-fetch-exec-upload-31jwhu9g/tcpdf_6.3.5+dfsg1-1+deb11u1.dsc
2025-06-02 16:46:31,947 sbuild exited with code 0
2025-06-02 16:46:31,948 Work request 98525: Checking output
2025-06-02 16:46:32,005 Work request 98525: Uploading artifacts
2025-06-02 16:46:33,483 Work request 98525: Cleaning up
2025-06-02 16:46:34,911 Exception in Task sbuild
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
[Previous line repeated 1 more time]
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 874, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 594, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/artifact/ (Caused by ResponseError('too many 503 error responses'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 285, in _api_request
response = self._method(method)(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 635, in post
return self.request("POST", url, data=data, json=json, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 556, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/artifact/ (Caused by ResponseError('too many 503 error responses'))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/debusine/tasks/_task.py", line 511, in execute_logging_exceptions
return self.execute()
^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/tasks/_task.py", line 526, in execute
self._upload_work_request_debug_logs()
File "/usr/lib/python3/dist-packages/debusine/tasks/_task.py", line 912, in _upload_work_request_debug_logs
remote_artifact = self.debusine.upload_artifact(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine.py", line 443, in upload_artifact
artifact_response = self.artifact_create(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine.py", line 379, in artifact_create
return self._debusine_http_client.post(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 102, in post
return self._api_request(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 289, in _api_request
raise exceptions.ClientConnectionError(
debusine.client.exceptions.ClientConnectionError: Cannot connect to https://debusine.debian.net/api/1.0/artifact/. Error: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/artifact/ (Caused by ResponseError('too many 503 error responses'))
2025-06-02 16:46:34,914 Task: sbuild Error execute: Cannot connect to https://debusine.debian.net/api/1.0/artifact/. Error: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/artifact/ (Caused by ResponseError('too many 503 error responses'))
2025-06-02 16:46:36,391 Cannot reach server to report work request completed.
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 884, in urlopen
return self.urlopen(
^^^^^^^^^^^^^
[Previous line repeated 1 more time]
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 874, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 594, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/work-request/98525/completed/ (Caused by ResponseError('too many 503 error responses'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 285, in _api_request
response = self._method(method)(
^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 647, in put
return self.request("PUT", url, data=data, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 556, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/work-request/98525/completed/ (Caused by ResponseError('too many 503 error responses'))
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/debusine/worker/_worker.py", line 745, in _send_task_result
await asyncio.to_thread(
File "/usr/lib/python3.11/asyncio/threads.py", line 25, in to_thread
return await loop.run_in_executor(None, func_call)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine.py", line 254, in work_request_completed_update
return self._debusine_http_client.put(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 123, in put
return self._api_request(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/debusine/client/debusine_http_client.py", line 289, in _api_request
raise exceptions.ClientConnectionError(
debusine.client.exceptions.ClientConnectionError: Cannot connect to https://debusine.debian.net/api/1.0/work-request/98525/completed/. Error: HTTPSConnectionPool(host='debusine.debian.net', port=443): Max retries exceeded with url: /api/1.0/work-request/98525/completed/ (Caused by ResponseError('too many 503 error responses'))
In https://debusine.debian.net/debian/developers/work-request/98525/ we can see this:
In https://debusine.debian.net/-/status/workers/ we have this:
Note that the "last_seen_at" field is very recent, which means that the regular _send_dynamic_metadata call is working. The worker is therefore connected, yet the work request has not been retried, so this comment in _send_task_result is clearly no longer accurate:
except Exception:
    # Log this, but leave the work request running. The server
    # will retry it when this worker next manages to connect and
    # request a new work request to run.
    logging.exception(
        "Cannot reach server to report work request completed."
    )
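
One possible direction, as a rough sketch only: instead of logging once and leaving the work request running, the worker could keep retrying the completion report with exponential backoff. The names below (`report_with_retry`, `send`, `retry_on` and the delay parameters) are hypothetical and not part of debusine. In the worker, `send` would wrap the existing work_request_completed_update call and `retry_on` would be the client's ClientConnectionError.

```python
import time
from typing import Callable, Tuple, Type


def report_with_retry(
    send: Callable[[], None],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_attempts: int = 5,
    initial_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: returns True if send() eventually succeeded,
    False if every attempt raised one of the retry_on exceptions.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return True
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: the caller must decide what to do
                # (for example, persist the result and retry later).
                return False
            sleep(delay)
            # Double the wait each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
    return False
```

A bounded retry like this only narrows the window. If the server stays unavailable for longer than the total backoff, the result would still need to be stored and re-sent on reconnect, or the server would need to notice that a connected worker is no longer running the work request.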