Try to minimise regressions using DEP-8 tests

As a first step, it would be nice to collect a list of packages that have a (recent) security history and lack autopkgtests. We could then use this issue to steer LTS contributors towards that list.

(referenced LTS/Development which I just updated to include more guidelines)

I wouldn't focus too much on autopkgtests, because proper automated package testing often relies on non-packaged test suites (e.g. wordpress), as well as large external data sets (e.g. tiff, ffmpeg).

(incidentally packages with autopkgtests may have broken tests, e.g. sane-backends; sadly the maintainer even removed the tests in sid.)

I'd recommend adding an entry at https://wiki.debian.org/LTS/TestSuites whenever there's a non-trivial testing procedure.

It's still a good idea to identify packages that lack testing procedures, and add more DEP-8 (smoke) tests in LTS+sid.

TL;DR: DEP-8 tests are not an end but a means to better testing.

Should we s/DEP-8 tests/tests/ in the task title?

I don't agree with this assessment. DEP-8 should not be limited to smoke tests.

You can download an external data set and mark the test with the needs-internet restriction. And you can certainly package the external test suite if you want (and ship it in the debian tarball for the benefit of autopkgtest).
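
As a rough illustration of the first option, the sketch below scaffolds a DEP-8 test that downloads its data at run time and declares the needs-internet restriction. The test name `external-data`, the URL and the `mytool` command are hypothetical placeholders, not taken from any real package.

```shell
#!/bin/sh
# Sketch only: scaffold a DEP-8 test that fetches an external data set.
# Hypothetical names: the "external-data" test, DATA_URL and "mytool".
set -e

scaffold_dep8_test() {
    srcdir=$1
    mkdir -p "$srcdir/debian/tests"

    # needs-internet declares that the test requires network access,
    # so runners without it can skip the test instead of failing.
    cat > "$srcdir/debian/tests/control" <<'EOF'
Tests: external-data
Depends: @, ca-certificates, wget
Restrictions: needs-internet
EOF

    # The test itself: download into the scratch directory that
    # autopkgtest provides, unpack, and run the tool on the data.
    cat > "$srcdir/debian/tests/external-data" <<'EOF'
#!/bin/sh
set -e
DATA_URL="https://example.org/test-data.tar.gz"  # hypothetical placeholder
cd "$AUTOPKGTEST_TMP"
wget -q -O data.tar.gz "$DATA_URL"
tar -xzf data.tar.gz
mytool --check data/  # hypothetical command under test
EOF
    chmod +x "$srcdir/debian/tests/external-data"
}

# Only scaffold when a source directory is given on the command line.
if [ $# -gt 0 ]; then
    scaffold_dep8_test "$1"
fi
```

Run from the unpacked source tree as `sh scaffold.sh .` (the script name is arbitrary); the generated files would still need adapting to the package's real data and commands.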

https://wiki.debian.org/LTS/TestSuites is still valuable, but there's no reason why it should be LTS-specific, and it will stay LTS-specific unless it is integrated into something official.

I'm not sure which assessment you disagree with. My summary was: "DEP-8 tests are not an end but a means to better testing."

In my experience I have mostly seen DEP-8 smoke tests, but I agree that they can be more extensive.

DEP-8 by itself is not enough, partly because packages like qemu, wpa or sane-backends require a physical test at some point; the tiff maintainer also includes visual checks in his procedure.

Non-trivial repackaging of e.g. wordpress to bundle an external test suite is a new vector for errors, and it reduces the effectiveness of tools like debdiff (or maybe I didn't understand the specifics of your proposal). Since that suite (or the one bundled with mysql-connector-java) also isn't 100% robust (blame upstream), it doesn't make a good candidate for fully automated testing.

LTS/TestSuites is as specific to LTS as DEP-8 tests are: both are suite-specific instructions, and both can be better integrated in sid (e.g. improved test & install documentation) for long-term benefit.

I'll need to check how autopkgtest can deal with caching large datasets (as autopkgtest is meant to work in clean environments, and is run multiple times while working on an update).
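
Independently of what autopkgtest itself offers, one possible approach is a small helper that reuses a local copy when its checksum matches. This is only a sketch: `fetch_cached` and the `DATA_CACHE` variable are made-up names, not autopkgtest features, and it only helps if the cache directory is somehow made available to the otherwise clean testbed, which is exactly the open question.

```shell
#!/bin/sh
# Sketch only: download a data set once and reuse it while its SHA-256 matches.
# fetch_cached and DATA_CACHE are hypothetical names, not part of autopkgtest.
fetch_cached() {
    url=$1
    sha256=$2
    dest=$3
    cache_dir=${DATA_CACHE:-$HOME/.cache/lts-test-data}
    cached="$cache_dir/$sha256"
    mkdir -p "$cache_dir"

    # Download only if there is no cached copy with the expected checksum.
    if ! echo "$sha256  $cached" | sha256sum -c --status 2>/dev/null; then
        wget -q -O "$cached.tmp" "$url" || { rm -f "$cached.tmp"; return 1; }
        # Refuse to cache a download whose checksum does not match.
        if ! echo "$sha256  $cached.tmp" | sha256sum -c --status; then
            rm -f "$cached.tmp"
            return 1
        fi
        mv "$cached.tmp" "$cached"
    fi
    cp "$cached" "$dest"
}
```

Keying the cache on the checksum rather than the URL means a changed upstream file is never silently reused.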

As an aside, for over a year I have received no feedback on my efforts to improve testing, including maintaining and extending our building and testing procedures, or on the https://wiki.debian.org/LTS/TestSuites pages, which I have written entirely on my own so far. I haven't seen much other work on improving LTS testing, so your answer feels a bit cold. I hope we're aiming at compatible goals.

I certainly agree that DEP-8 has limitations and that you can't test everything in that framework. But so far DEP-8 is the only thing we have where tests are automated, and automation is a key goal IMO. Manual tests are good to have, but I do not trust us to consistently use the manual tests that we have documented for various packages. You note yourself that you feel alone in that effort and that you don't see much buy-in from other contributors. Maybe that should be discussed in our next meeting (ping @holger).

So in the end I disagree with your "I wouldn't focus too much on autopkgtests", because IMO only autopkgtests have a chance of being used consistently in the future by other people working on the package. Your work is still useful and appreciated, even if I fear that only a handful of people will know of the testing procedures documented in the wiki. Maybe that's again something that we must make more visible.

Thanks for sharing your vision for this task, which clarifies the goals for me.

(Let me clarify that my last comment wasn't deploring a lack of DLA/ELA testing, but of documenting/sharing said testing.)

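#!/bin/bash
# Generate a list of packages that have had (recurrent) security updates and that lack autopkgtest
# https://salsa.debian.org/lts-team/lts-extra-tasks/-/issues/1
min_nb_updates=3           # minimum number of DLAs for a package to be listed
year_re='(201[89]|202.)'   # only count DLAs dated 2018 to 2029
# The first field of each line in packages-to-support is used as the package name.
while read -r package rest; do
    [ -n "$package" ] || continue
    # Source packages with DEP-8 tests carry a Testsuite: field.
    if ! apt-cache showsrc "$package" | grep -q '^Testsuite:'; then
        # \Q...\E makes "+" and "." in package names match literally.
        count=$(grep -c -P "^\[\d+\s\w+\s${year_re}\]\sDLA-\d+-\d+\s\Q${package}\E\s" ../security-tracker/data/DLA/list)
        if [ "$count" -ge "$min_nb_updates" ]; then
            echo "$package${rest:+ $rest} ($count)"
        fi
    fi
done < packages-to-support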
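
To show what the counting step matches, here is a small self-contained sketch that runs the same pattern against a made-up excerpt in the style of the security tracker's DLA list. The `count_dlas` helper, the DLA numbers and the package names are invented for the example; it assumes GNU grep with `-P` support.

```shell
#!/bin/bash
# Sketch only: count DLA entries for one package in a DLA list file.
# count_dlas is a hypothetical helper; the sample entries below are made up.
year_re='(201[89]|202.)'

count_dlas() {
    # $1 = DLA list file, $2 = source package name (matched literally via \Q...\E)
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -c -P "^\[\d+\s\w+\s${year_re}\]\sDLA-\d+-\d+\s\Q$2\E\s" "$1" || true
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
[05 Mar 2021] DLA-9001-1 foo - security update
	[stretch] - foo 1.0-1+deb9u3
[12 Jan 2020] DLA-9000-2 foo - regression update
[30 Nov 2017] DLA-8000-1 foo - security update
[02 Feb 2019] DLA-8500-1 foobar - security update
EOF

count_dlas "$sample" foo   # prints 2: the 2017 entry and "foobar" are not counted
```

The trailing `\s` after the package name is what keeps `foo` from also matching `foobar`.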

FYI, the LTS packages project already contains some packages, and pipelines are enabled for most of them. I keep adding more packages as I touch them.

There is some ongoing work on this topic.

That work is about running regression testing for uploads to the embargoed queues (so right now for stable and oldstable, as handled by the Security Team). This means not just running the uploaded package's autopkgtests, but basically running all its dependencies' autopkgtests.

The Security Team now has automated regression testing for all uploads made to the security queues. It's documented at https://salsa.debian.org/security-team/britney2/-/blob/master/scripts/autopkgtests.md, and the code changes required to run it have all been merged back upstream by the Release Team. Now would probably be a good point in time to try and get this rolling for *LTS suites.
