Improve handling of incomplete FileInArtifact rows
https://salsa.debian.org/freexian-team/debusine/-/issues/662 (confidential due to containing several specific URLs) reported that viewing incomplete files causes noisy tracebacks. I analysed this with the following results:
In all the cases shown here, db_fileinartifact.complete is f. There is no case where a complete FileInArtifact can later be marked as incomplete, so this must mean that the file was never completely uploaded to begin with. It can't be a consequence of artifact cleanup failures.
I've looked through a sample of some of the incomplete files in the database. Many of them, though not all, have a similar complete file attached to the same artifact; that suggests to me that an upload process failed and was later retried. File uploads typically involve multiple requests, so to some extent this seems like part of the cost of doing business with a distributed system. Right now there are 69 incomplete files, with the earliest from an artifact created on 2024-10-28, so I don't think we have an epidemic of upload failures. However, the failure mode here is certainly suboptimal.
There are a few things we should do here:
- Make the file views return a simple 404 if FileInArtifact.complete is false, rather than logging a noisy traceback (!1523 (merged))
- Don't link to incomplete files in the artifact detail view; they should still be listed, but without a link and followed by "(incomplete)" (!1523 (merged))
- Mark artifacts as incomplete in artifact lists (particularly the work request detail view) if any of their files are incomplete (!1523 (merged))
- Skip incomplete artifacts in the update-collection-with-artifacts event reaction (!1524 (closed))
- Retry running work requests when a worker asks for a new work request (!1528 (merged))
- Clean up incomplete FileInArtifact rows; clients should of course be given some time to complete uploads, but deleting incomplete FIAs a day or two after the artifact was created should be fine
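To make the "mark artifacts as incomplete" and "clean up after a grace period" items concrete, here is a minimal sketch of the selection logic in plain Python. It is not the debusine implementation: FileRow, artifact_is_incomplete, stale_incomplete_files and the two-day default are hypothetical names and values chosen for illustration, and a real version would presumably be a query against the FileInArtifact model rather than a list comprehension.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FileRow:
    """Hypothetical stand-in for a FileInArtifact row (not the real model)."""

    path: str
    complete: bool
    artifact_created_at: datetime


def artifact_is_incomplete(files):
    """Return True if any file attached to the artifact is incomplete."""
    return any(not f.complete for f in files)


def stale_incomplete_files(files, now, grace=timedelta(days=2)):
    """Return incomplete rows whose artifact is older than the grace period.

    The grace period (an assumed default of two days) leaves clients time
    to finish or retry an upload before the row becomes a cleanup candidate.
    """
    cutoff = now - grace
    return [
        f
        for f in files
        if not f.complete and f.artifact_created_at < cutoff
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    rows = [
        FileRow("hello_1.0.dsc", True, now - timedelta(days=5)),
        FileRow("hello_1.0.tar.xz", False, now - timedelta(days=5)),
        FileRow("hello_1.0_amd64.deb", False, now - timedelta(hours=3)),
    ]
    print(artifact_is_incomplete(rows))
    print([f.path for f in stale_incomplete_files(rows, now)])
```

Keying the cutoff on the artifact's creation time, as suggested above, means a recently created artifact with an upload still in flight is never touched, while an abandoned upload that was later retried under the same artifact becomes eligible for deletion once the grace period has passed.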