Add support to Difference.from_command_exc and friends to ignore specific...
Add support to Difference.from_command_exc and friends to ignore specific returncodes from the called program and treat them as "no" difference.
-
This commit causes regressions in the use cases I previously fixed in commit efd5f959.
- To reproduce, generate two .a files as described below. I understand .a files are "not supported any more", however I don't think this regression is specific to .a files, it's just the only example I have. I suspect this can be reproduced with any other container file and recursive comparison.
rm mix1.a rm mix2.a ar qU mix1.a some.o COPYING ar qU mix2.a some.o COPYINGmix1.aandmix2.adiffer only by some timestamp(s). The only purpose of the random text file "COPYING" is to triggerreadelffailures.Before this commit, a useful comparison:
--- mix1.a +++ mix2.a │┄ Command `readelf --wide --file-header mix1.a` exited with return code 1. Output: │┄ readelf: mix1.a: Error: Not an ELF file - it has the wrong magic bytes at the start │┄ @@ -1,10 +1,10 @@ 00000000: 213c 6172 6368 3e0a 2f20 2020 2020 2020 !<arch>./ 00000010: 2020 2020 2020 2020 3135 3636 3934 3035 15669405 -00000020: 3736 2020 3020 2020 2020 3020 2020 2020 76 0 0 +00000020: 3830 2020 3020 2020 2020 3020 2020 2020 80 0 0 00000030: 3020 2020 2020 2020 3134 2020 2020 2020 0 14 00000040: 2020 600a 0000 0001 0000 0052 6d61 696e `........Rmain 00000050: 0000 7838 362e 6f2f 2020 2020 2020 2020 ..x86.o/ 00000060: 2020 3135 3636 3933 3930 3138 2020 3130 1566939018 10 00000070: 3031 2020 3130 3034 2020 3130 3036 3434 01 1004 100644 00000080: 2020 3838 3936 2020 2020 2020 600a 7f45 8896 `..E 00000090: 4c46 0101 0100 0000 0000 0000 0000 0100 LF..............After this commit, three large and relatively cryptic stack traces and zero comparison result:
... long stack traces skipped ... File "diffoscope/diffoscope/difference.py", line 292, in from_command_exc assert False, exc.__dict__ AssertionError: {'returncode': 1, 'cmd': ['readelf', '--wide', '--file-header', 'mix1.a'], 'output': 'readelf: mix1.a: Error: Not an ELF file - it has the wrong magic bytes at the start\n', 'stderr': None}- Another, similar use case is to trigger
objdumpfailures with a single, cross-compiled.ofile:
Before this commit, a fantastically useful comparison:
2019-08-27 21:19:48 E: diffoscope.comparators.elf: Command '['objdump', '--disassemble', '--demangle', '--section=.text.main', 'em1.a']' returned non-zero exit status 1. --- em1.a +++ em2.a ├── file list @@ -1,2 +1,2 @@ ----------- 0 0 0 14 2019-07-17 00:19:18.000000 / +---------- 0 0 0 14 2019-07-17 00:19:07.000000 / ?rw-r--r-- 0 0 0 199684 1970-01-01 00:00:00.000000 main.c.objAfter this commit, three large and relatively cryptic stack traces and zero comparison result:
... long stack traces skipped ... File "//diffoscope/diffoscope/difference.py", line 292, in from_command_exc assert False, exc.__dict__ AssertionError: {'returncode': 1, 'cmd': ['objdump', '--line-numbers', '--disassemble', '--demangle', '--reloc', '--section=.text', 'em1.a'], 'output': "objdump: can't disassemble for architecture UNKNOWN!\n\nobjdump: section '.text' mentioned in a -j option, but not found in any input file\n", 'stderr': None}Edited by Marc Herbert -
SORRY I bisected too fast, didn't notice the errors evolved over the git history.
-
Use case number 2 above (only one foreign .o in the .a) is already fixed in the commits after this one and works again on the current master branch. Ignore use case 2, again sorry for the noise.
-
The more artificial use case 1 (specially crafted when trying to understand what's going on) still fails on the current master branch (tag 121 and later) but with a completely different and typical python3 unicode error:
... subprocess.CalledProcessError: Command '['readelf', '--wide', '--section-headers', 'mix1.a']' returned non-zero exit status 1. During handling of the above exception, another exception occurred: ... File "diffoscope/diffoscope/comparators/utils/file.py", line 458, in compare re.sub(r'^', ' ', e.output, flags=re.MULTILINE) File "/usr/lib64/python3.7/re.py", line 192, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: cannot use a string pattern on a bytes-like objectThis quick and dirty hack avoids this unicode issue temporarily:
--- a/diffoscope/comparators/utils/file.py +++ b/diffoscope/comparators/utils/file.py @@ -455,7 +455,7 @@ class File(object, metaclass=abc.ABCMeta): output = '<none>' if e.output: output = '\n{}'.format( - re.sub(r'^', ' ', e.output, flags=re.MULTILINE) + re.sub(r'^', ' ', e.output.decode("utf-8"), flags=re.MULTILINE) ) difference.add_comment( "Command `{}` exited with return code {}. Output: {}".format( -
-
File "/usr/lib64/python3.7/re.py", line 192, in sub return _compile(pattern, flags).sub(repl, string, count) TypeError: cannot use a string pattern on a bytes-like objectThis unicode error is also reproduced when trying (and failing) to compare the two files
strip-nondeterminism/t/fixtures/ar/one.a.{in,out}BUT ONLY when they have been renamedar/in.aandar/out.a?!? (with a.asuffix) -
👀 @lambyAuthor OwnerThis unicode error is also reproduced when trying (and failing) to compare the two files
strip-nondeterminism/t/fixtures/ar/one.a.{in,out}BUT ONLY when they have been renamedar/in.aandar/out.a?!? (with a.asuffix)I could reproduce this. Inspecting the output of
--debugand a small bit of Git archeology reveals this is because of this commit of mine. This might need a separate issue.For the rest of these comments after poking at it for about an hour, can I ask you to:
-
Ensure you are using the very latest Git
master(I made some changes this morning) -
If you can still reproduce this problem please copy-paste any relevant sections into a new issue with the exact SHA1 whenever you say "current master", for obvious reasons. :)
-
Please attach your test files (eg.
mix1.a) so that my own method of constructing them is not flawed in anyway (I just want to 100% rule this time-consuming and useless diversion if so...)
Thanks for such a detailed and good report here and my apologies I can't quickly fix it for you without asking you to do more "busywork" of sorts here...
-
-
Just tested again with HEAD at 2f101b8b. Pretty much everything works now. Interesting to see my utf8 hardcoding hack become production :-)
One issue left:
diffoscope strip-nondeterminism/t/fixtures/ar/one.a.{in,out}looks good. However when I rename the files to more common namesin.aandout.athen I get a readelf error message and the nice,ls -llisting of file permissions and dates is gone! Just a hexdump instead. More test files at strip-nondeterminism!4 (closed) (the pipeline seems out of order)-
I think I understand why the
readelftriggers a hexdump, however it's sad to lose thels -lstyle display when this happens. Can't we have both the hexdump and the listing? -
So it looks like the decision to run
readelfon the archive is made based on the.aextension of the archive file. I think it would make more sense to make that decision based on the.o,.obj, ... extensions of the files inside the archive instead. For instance: if no file inside looks like a compiled ELF, then don't runreadelf. Note I'm not suggesting extracting members and running readelf on each. Hybrid archives are rare.
Let me know if you want the above in a new bug or something.
Thanks for such a detailed and good report
I wish!
Edited by Marc Herbert -
-
👀 @lambyAuthor Owner- So it looks like the decision to run
readelfon the archive is made based on the.aextension of the archive file. […]
Please file this (or the other issue) as a new bug, we are unfortunately very much losing context in the multiple conversations here (eg. I think you missed my reference to this previous commit of mine) but, yes, each issue new bug please with enough "fresh" context; it is far easier to merge than to split.. :)
- So it looks like the decision to run
-
mentioned in issue #64 (closed)
-
OK, I did more testing and got to the bottom of this. The tentative fix looks simple: #64 (closed)
Remove elf.StaticLibFile: it's a serious design flaw