[Medium] calls to subprocess.check_{call,output} outside of try/except blocks can lead to crashes of diffoscope
Issue details
diffoscope
uses various functions of Pythons subprocess
module to call external programs like xz
or apktool
to compare complex file formats and archives. The check_
variants of these functions throw a CalledProcessError
exception if the called subprocess exits with a non-zero returncode (i.e. the command failed). These function calls are largely placed outside of try/except
blocks and thus lead to crashes of diffoscope
if the subprocess exits unsuccessfully.
This issue also applies to Command.our_check_output
which essentially just wraps subprocess.check_output
with an additional logging call.
Example locations where this issue can be found:
def open_archive(self):
[...]
subprocess.check_call(
(
"apktool",
"d",
"-f",
"-k",
"-m",
"-o",
self._tmpdir.name,
self.source.path,
),
stderr=None,
stdout=subprocess.PIPE,
)
[...]
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
[...]
output = our_check_output(cmd, shell=False, stderr=subprocess.DEVNULL)
[...]
While the above sections are examples, all calls to subprocess
functions that throw exceptions such as CalledProcessErrors
are problematic and should be mitigated.
Risk
Depending on the concrete external tool called, it can be comparatively easy to craft malicious files that lead to the programs exiting abnormaly and thus effectively crash diffoscope
. Alternatively, simple forms of argument injection (e.g. from file metadata) can lead to unexpected return codes and thus crashes.
Mitigation
To ensure diffoscope
operates correctly, it is essential to verify the return codes of external tools, as their proper execution is critical to its functionality. Additionally, calls should be encapsulated within try/except
blocks to handle errors gracefully and ensure diffoscope
exits safely in case of failures.
This mitigation is already in place in selected locations, sometimes even in the same file as the problematic sections, e.g., in diffoscope/comparators/elf.py
:
def get_debug_link(path):
try:
output = our_check_output(
[get_tool_name("readelf"), "--string-dump=.gnu_debuglink", path],
stderr=subprocess.DEVNULL,
)
except subprocess.CalledProcessError as e:
logger.debug("Unable to get Build Id for %s: %s", path, e)
return None
m = re.search(
r"^\s+\[\s+0\]\s+(\S+)$",
output.decode("utf-8", errors="replace"),
flags=re.MULTILINE,
)
if not m:
return None
return m.group(1)