Do not fail on JAR archives containing invalid members with a .jar extension

_jar_normalize_member() in the JAR handler calls File::StripNondeterminism::handlers::zip::normalize_member() for every archive member with a .jar extension. If the member is not a valid archive (e.g. because it is just an empty file), the ZIP handler fails hard with "Reading ZIP archive failed".

To solve this, extend normalize_member() in the ZIP handler to detect the file type when no $normalizer is explicitly specified. Detection uses get_normalizer_for_file(), which looks at both the file extension and the output of the file command to identify the type more accurately. For this to work, the extracted temporary file must keep the member's name instead of just being called "member".

With this change in place, we can leave the normalizer for files with a .jar extension unspecified in _jar_normalize_member(). This way if the file is not an archive, it will get skipped by get_normalizer_for_file() and a warning will be printed instead of a fatal error.
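The fallback described above can be illustrated with a minimal sketch. This is Python rather than the Perl of the actual handlers, and it is not the code in this merge request: the function bodies are hypothetical, zipfile.is_zipfile() stands in for the check done via the file command, and the warning text is borrowed from the example output in this description.

```python
import warnings
import zipfile
from pathlib import Path


def normalize_zip(path):
    # Placeholder for real normalization: only proves the archive is readable.
    with zipfile.ZipFile(path) as archive:
        archive.namelist()
    return True


def get_normalizer_for_file(path):
    # Hypothetical stand-in: require both a matching extension and
    # matching content before returning a normalizer.
    if Path(path).suffix in (".jar", ".zip") and zipfile.is_zipfile(path):
        return normalize_zip
    return None


def normalize_member(path, normalizer=None):
    # No explicit normalizer: detect the type from the file itself.
    if normalizer is None:
        normalizer = get_normalizer_for_file(path)
    # Still nothing: warn and skip instead of failing hard.
    if normalizer is None:
        warnings.warn(f"unknown file type of {path}")
        return False
    return normalizer(path)
```

In this sketch an empty file named empty.jar is skipped with a warning, while a readable archive with the same extension is passed on to the normalizer.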

A minimal reproducer for a file that strip-nondeterminism fails to process without these changes can be created as follows:

touch empty.jar
zip testcase.jar empty.jar
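Where the zip command is not available, an equivalent test case can probably be built with Python's standard zipfile module. This is a sketch under that assumption; the helper name build_testcase is made up for illustration, and the resulting archive is not guaranteed to be byte-identical to the one produced by zip.

```python
import zipfile


def build_testcase(path="testcase.jar"):
    # Create an outer archive holding a single zero-length member
    # named empty.jar, mirroring the touch/zip commands.
    with zipfile.ZipFile(path, "w") as outer:
        outer.writestr("empty.jar", b"")
    return path


if __name__ == "__main__":
    build_testcase()
```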

Running an unpatched strip-nondeterminism on testcase.jar fails with the following error, and the file is not modified at all:

strip-nondeterminism: testcase.jar: Reading ZIP archive failed: format error: file is too short

With the changes in this merge request, only a warning is printed and strip-nondeterminism processes the file:

strip-nondeterminism: unknown file type of empty.jar at lib/File/StripNondeterminism/handlers/zip.pm line 78

The real-world case where we are seeing this issue is the openapi-generator-cli.jar file in the openapi-generator 5.3.0-1 package on Arch Linux: it contains an empty JAR file, java-micronaut-client/configuration/gradlew/gradle-wrapper.jar, which triggers the same failure in strip-nondeterminism as the minimal test case above.
