Binary diff when comparing an invalid XML file
Diffing the following files (one of them contains invalid XML: its tags are closed in the wrong order):
2.xml
<root><a><b>aaa2</b></a></root>
3.xml
<root><a><b>aaa2</a></b></root>
Result:
--- 2.xml
+++ 3.xml
@@ -1,2 +1,2 @@
00000000: 3c72 6f6f 743e 3c61 3e3c 623e 6161 6132 <root><a><b>aaa2
-00000010: 3c2f 623e 3c2f 613e 3c2f 726f 6f74 3e0a </b></a></root>.
+00000010: 3c2f 613e 3c2f 623e 3c2f 726f 6f74 3e0a </a></b></root>.
Debug log:
D: diffoscope.main: Starting diffoscope 147
D: diffoscope.main: Free space in temporary directory: 1.12 GiB
D: diffoscope.presenters.formats: Will generate the following formats: text
D: diffoscope.environ: Normalising locale, timezone, etc. Inheriting PATH of /srv/diffoscope/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
D: diffoscope.main: Starting comparison
D: diffoscope.comparators: Loaded 76 comparator classes
D: diffoscope.comparators.utils.specialize: Using diffoscope.comparators.xml.XMLFile for 2.xml
D: diffoscope.comparators.utils.specialize: Using diffoscope.comparators.text.TextFile for 3.xml
D: diffoscope.comparators.utils.compare: Comparing 2.xml (XMLFile) and 3.xml (TextFile)
D: diffoscope.comparators.utils.file: File.has_same_content: <<class 'abc.XMLFile'> 2.xml> <<class 'abc.TextFile'> 3.xml>
D: diffoscope.comparators.utils.command: Executing xxd {}
D: diffoscope.comparators.utils.command: Executing xxd {}
D: diffoscope.tempfiles: Created top-level temporary directory: /tmp/diffoscope_th9zfxu9
D: diffoscope.diff: Running diff -aU7 /tmp/diffoscope_th9zfxu9/tmpr0_xd2hh/fifo1 /tmp/diffoscope_th9zfxu9/tmpr0_xd2hh/fifo2
D: diffoscope.diff: diff -aU7 /tmp/diffoscope_th9zfxu9/tmpr0_xd2hh/fifo1 /tmp/diffoscope_th9zfxu9/tmpr0_xd2hh/fifo2: returncode 1, parsed True
D: diffoscope.presenters.formats: Generating 'text' output at '-'
This probably happens because parsing fails in the recognizes function, so 3.xml is identified as TextFile rather than XMLFile.
In that case the comparison should fall back to a text comparison, not a binary diff. The situation could also be improved by supporting other XML formatters that work on invalid data as well; see issue #166.
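
To illustrate the fallback order I have in mind, here is a rough standalone sketch. It is not diffoscope's code: is_well_formed_xml, choose_mode and text_diff are hypothetical names made up for this example, and it only uses the Python standard library (xml.dom.minidom and difflib). The idea is: compare as XML only if both sides parse, otherwise compare as text if both sides decode, and only then go to binary.

```python
import difflib
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML (hypothetical helper)."""
    try:
        minidom.parseString(data)
    except ExpatError:
        return False
    return True


def choose_mode(left: bytes, right: bytes) -> str:
    """Pick a comparison mode: 'xml', then 'text', then 'binary' as a last resort."""
    if is_well_formed_xml(left) and is_well_formed_xml(right):
        return "xml"
    try:
        # Assumption for this sketch: text files are UTF-8 encoded.
        left.decode("utf-8")
        right.decode("utf-8")
    except UnicodeDecodeError:
        return "binary"
    return "text"


def text_diff(left: bytes, right: bytes, left_name: str, right_name: str) -> str:
    """Produce a unified text diff of two UTF-8 encoded inputs."""
    lines = difflib.unified_diff(
        left.decode("utf-8").splitlines(keepends=True),
        right.decode("utf-8").splitlines(keepends=True),
        fromfile=left_name,
        tofile=right_name,
    )
    return "".join(lines)


if __name__ == "__main__":
    good = b"<root><a><b>aaa2</b></a></root>\n"
    bad = b"<root><a><b>aaa2</a></b></root>\n"
    print(choose_mode(good, bad))
    print(text_diff(good, bad, "2.xml", "3.xml"))
```

With the two files from this report, the sketch selects the text mode and shows the changed line as text instead of a hex dump, which is the behaviour I would expect here.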