WIP: add mechanism to add embedded-code-copies to CVE/list
So here's the result of my work on using the embedded-code-copies file as a basis for detecting renamed packages in the security tracker, which was previously discussed on the debian-lts mailing list.
The patch adds a script that parses the file and adds missing entries in data/CVE/list for packages that have embedded code copies. This covers the "renamed package" use case and, more broadly, embedded code copies in general.
The script iterates over each CVE and, for each source package specified in the embedded-code-copies database, adds an entry for the "copied" packages for that CVE. The script is eventually idempotent: it won't re-add entries when it is run again on its own output, although it might add new entries transitively. For example, if A has a code copy of B, which has a code copy of C, then if the script finds C, it will add B on the first run and then A on the second run. So it eventually converges and stabilises the file, at least in my experiments.
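To illustrate the transitive behaviour described above, here is a minimal sketch of one pass plus a convergence loop. It is not the script from the patch: the function names (`split_blocks`, `add_copies`, `converge`), the `copies` mapping, and the assumption that package entries are tab-indented lines of the form `- package status` are all hypothetical, and the `<ignored>` status is simply copied from the example diff.

```python
import re

# Assumed shape of a package entry: a tab, an optional "[release] " tag,
# then "- <package> <status>".
PKG_LINE = re.compile(r"^\t(?:\[[^\]]+\] )?- (\S+)")


def split_blocks(lines):
    """Group lines into blocks, each starting at a 'CVE-...' header line."""
    blocks = []
    for line in lines:
        if line.startswith("CVE-") or not blocks:
            blocks.append([])
        blocks[-1].append(line)
    return blocks


def add_copies(lines, copies, status="<ignored>"):
    """One pass: for every package listed under a CVE, add its embedders.

    `copies` maps a source package to the packages embedding a copy of it.
    """
    out = []
    for block in split_blocks(lines):
        present = [m.group(1) for m in map(PKG_LINE.match, block) if m]
        missing = []
        for pkg in present:
            for copy in copies.get(pkg, []):
                if copy not in present and copy not in missing:
                    missing.append(copy)
        # Insert new entries just before the first existing package entry.
        first = next(
            (i for i, line in enumerate(block) if PKG_LINE.match(line)),
            len(block),
        )
        new = ["\t- %s %s" % (pkg, status) for pkg in missing]
        out.extend(block[:first] + new + block[first:])
    return out


def converge(lines, copies):
    """Repeat passes until nothing changes; return (lines, passes needed)."""
    passes = 0
    while True:
        updated = add_copies(lines, copies)
        if updated == lines:
            return lines, passes
        lines, passes = updated, passes + 1
```

With `copies = {"C": ["B"], "B": ["A"]}` and a single CVE listing only C, the first pass adds B, the second adds A, and a third pass changes nothing.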
It takes the approach of rewriting the whole file: it is not incremental, in that it re-adds missing entries and processes the whole file every time. The first dumb version of the algorithm took about 80 seconds to process the whole file, but I was able to trim this down to 12 seconds, which seems acceptable for a cron job. A first run could be done with --status '' to get up to speed and next runs would have '' or '' (current default, trivial to change).
The resulting diff is rather brutal: 44k lines are added to the CVE/list file, which is already 320k lines long, roughly a 13% increase. This brings up the question of how to process older entries: right now, it processes everything. I added an untested --stop argument that we could use to process only entries up to a certain one. The script could save such state somewhere, but I haven't implemented that. I think there are fundamental issues with the way the list file is updated that might make this approach impractical, so I didn't dig much further.
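One way a stop marker could work is sketched below. This is an assumption, not the behaviour of the untested --stop argument in the patch: here the value is taken to be a CVE identifier, the entry with that identifier is the last one processed, and everything after it is copied through unchanged. The helper names `parse_args` and `split_at_stop` are hypothetical.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical --stop option holding a CVE identifier."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--stop", metavar="CVE", default=None,
                        help="last CVE entry to process (assumed semantics)")
    return parser.parse_args(argv)


def split_at_stop(lines, stop):
    """Split lines into (to_process, untouched) after the `stop` entry.

    The entry named by `stop` is included in the first part. If `stop` is
    None or never found, every line is returned for processing.
    """
    if stop is None:
        return lines, []
    seen = False
    for i, line in enumerate(lines):
        if line.startswith("CVE-"):
            if seen:
                # First header after the stop entry: cut here.
                return lines[:i], lines[i:]
            seen = line.split()[0] == stop
    return lines, []
```

The processed part would then be rewritten and concatenated with the untouched part, so older entries keep their current form.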
This is work in progress in that it is not integrated anywhere or documented in the workflow. It also affects the security team so I obviously want to see what that team thinks of the approach. I am unfamiliar with the embedded-code-copies workflow and this might be completely out of whack, but hopefully it will be useful.
Here's an example of the first few lines of diff it creates on the list file:
diff --git i/data/CVE/list w/data/CVE/list
index febc0a81fd..abe9f2d782 100644
--- i/data/CVE/list
+++ w/data/CVE/list
@@ -31,6 +31,10 @@ CVE-2018-11807
RESERVED
CVE-2018-11806 [slirp: heap buffer overflow while reassembling fragmented datagrams]
RESERVED
+ - xen-unstable <ignored>
+ - xen-3 <ignored>
+ - kvm <ignored>
+ - qemu-kvm <ignored>
- qemu <unfixed>
NOTE: https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01012.html
CVE-2018-1000202 (A persisted cross-site scripting vulnerability exists in Jenkins ...)
@@ -448,10 +452,12 @@ CVE-2018-11658
CVE-2018-11657 (ngiflib.c in MiniUPnP ngiflib 0.4 has an infinite loop in DecodeGifImg ...)
NOT-FOR-US: ngiflib
CVE-2018-11656 (In ImageMagick 7.0.7-20 Q16 x86_64, a memory leak vulnerability was ...)
+ - graphicsmagick <ignored>
- imagemagick 8:6.9.9.34+dfsg-3 (unimportant)
NOTE: https://github.com/ImageMagick/ImageMagick/issues/931
NOTE: https://github.com/ImageMagick/ImageMagick/commit/4da2cd650532ffd18fa11578fc2ec7c2467727bb
CVE-2018-11655 (In ImageMagick 7.0.7-20 Q16 x86_64, a memory leak vulnerability was ...)
+ - graphicsmagick <ignored>
- imagemagick 8:6.9.9.34+dfsg-3 (unimportant)
NOTE: https://github.com/ImageMagick/ImageMagick/issues/930
NOTE: https://github.com/ImageMagick/ImageMagick/commit/a7414b7322201a9c8a5cacf563f08468c329b4b1
@@ -482,6 +488,7 @@ CVE-2018-11646 (webkitFaviconDatabaseSetIconForPageURL and ...)
NOTE: different issue.
NOTE: Not covered by security support
CVE-2018-11645 (psi/zfile.c in Artifex Ghostscript before 9.21rc1 permits the status ...)
+ - gs-gpl <ignored>
- ghostscript 9.21~dfsg-1 (low)
[stretch] - ghostscript <postponed> (Be be fixed along in future update)
[jessie] - ghostscript <postponed> (Be be fixed along in future update)