Draft: Fallback to dpkg-deb on deb decompression failures
Whenever new compression algorithms are accepted in deb packages (e.g., #892664),python-debian will need to implement handlers for those. This may lead to temporary breakage for some use cases.
Moreover, the new compression algorithm may not be supported in python standard libraries, therefore, not being supported by tarfile, which is currently used to decompress the DebParts.
This patch introduces a decompression fallback in DebFile, which will try to use dpkg-deb (when available) to decompress the parts in the ar archive that would fail to decompress with tarfile. This should prevent some (but not all) debian-python use cases from breaking whenever new compression algorithms (not supported by tarfile) are introduced into dpkg.
Note that this MR was motivated by the introduction of the zstd compression format in Ubuntu, which led to breakage in some of the python-debian features. If interested, you can check [1] for further reference and the downstream discussion that led to the proposed implementation.
Merge request reports
Activity
Thanks @athos! This is a really useful contribution.
Looking through the code and thinking how it will be used, I've got a few comments.
I can see that there is an immediate benefit for Ubuntu here in that it will let python-debian in Ubuntu handle .deb files created in Ubuntu. Unfortunately, packages created on Ubuntu still won't be able to be inspected at all on Debian as dpkg is unable to process them. A fallback for the fallback would be making for quite icky code, but is it feasible to use python-zstd if it is installed as the primary method and dpkg-deb as the fallback, perhaps?
I think
zstd
should also be added todebfile.PART_EXTS
so that the text of theDebError
raised at line 369 and any other consumers of PART_EXTS know that zstd is supported; that means rethinking the detection mechanism to not look for DebError fromcompressed_part_name
.The implementation of reading the entire file into memory via
ret.stdout
to then feed it intoArFile
feels suboptimal. If we have gone down the--fsys-tarfile
route as a universal fallback, we need to be careful that there are regular .deb files that are 1.5 GB (and that's the xz compressed size!) and dbgsym packages that are 1.7 GB compressed. Can this be set up to yield a chunk at a time so that the data is streamed and not copied whole?I'm keen to hear your thoughts on these questions!
BTW you can add additional commits to this same MR just by pushing more commits to your original salsa branch, no need to make additional MRs (and !66 (closed) can get included in this one too).
Hi Stuart,
Thanks for the review!
The first version of this patch relied on the former python-debian logic for supporting xz, where a subprocess would call
unxz
due to Python < 3.3 support issues. The first idea was to callunzstd
using the same approach. When I realized that this bit of code was removed on7068abd
, I decided to not reintroduce that popen logic here.Moreover, I thought that introducing zstd support directly into python-debian would be frowned upon here until it is actually accepted in dpkg. I was wrong and I should have brought the thought here first :)
I will re-implement this patch (in this same PR) using python-zstd, as suggested. In that case, I have no strong opinion on removing or keeping the dpkg-deb fallback, do you?
I will close !66 (closed) and move that commit here as well, as suggested.
Finally, sorry for the delay on this reply :)
mentioned in merge request !66 (closed)
added 22 commits
-
d7ff96b0...9d5bef83 - 20 commits from branch
python-debian-team:master
- 38cc78ac - Fallback to dpkg-deb on deb decompression failures
- 6cd9ff01 - Fix param in control compression tests
-
d7ff96b0...9d5bef83 - 20 commits from branch
added 21 commits
-
6cd9ff01...9cb96c05 - 19 commits from branch
python-debian-team:master
- 65752277 - Fallback to dpkg-deb on deb decompression failures
- d5ede0eb - Fix param in control compression tests
-
6cd9ff01...9cb96c05 - 19 commits from branch
mentioned in merge request !79 (merged)
Closing this in favor of !79 (merged)
Thanks, @schopin :)