
APK diff is slow because libmagic takes minutes to identify 40k .smali files

Edit: update

Running file (or using magic from Python, as diffoscope does) on all 40k .smali files takes just under two minutes.

Twice that -- since we have two APKs -- is pretty close to the overhead we see.
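For anyone who wants to reproduce that measurement, here is a minimal sketch. It assumes the .smali files have already been extracted to a directory (for example by running apktool by hand), and the helper names are made up for illustration. It spawns one file(1) process per path, so it will overestimate the cost compared with a single file invocation or with in-process libmagic as diffoscope uses.

```python
import subprocess
import time
from pathlib import Path


def find_smali(root):
    """Return all .smali files below root, sorted for reproducibility."""
    return sorted(Path(root).rglob("*.smali"))


def identify_with_file(path):
    """Ask the file(1) command for a brief type description of one path."""
    result = subprocess.run(
        ["file", "--brief", str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def time_identification(paths, identify=identify_with_file):
    """Return (file count, elapsed seconds) for identifying every path.

    identify is injectable so a different backend (e.g. a libmagic
    binding) can be timed with the same loop.
    """
    start = time.perf_counter()
    count = 0
    for path in paths:
        identify(path)
        count += 1
    return count, time.perf_counter() - start
```

Calling time_identification(find_smali("some-decoded-dir")) returns the number of files and the wall-clock seconds spent identifying them; the directory name is a placeholder.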

Original: APK diff is (unnecessarily) slow because it decompiles classes in .dex files twice

I was using diffoscope on an APK with only differences in classes8.dex.

It seems to be wasting a significant amount of time somewhere:

$ time diffoscope --text diff-apk.txt --text-color always a.apk b.apk

real    9m35.038s
user    15m58.099s
sys     0m41.146s
$ mkdir A B
$ unzip -d A a.apk classes8.dex
$ unzip -d B b.apk classes8.dex
$ time diffoscope --text diff-dex.txt --text-color always A/classes8.dex B/classes8.dex

real    2m31.845s
user    8m26.106s
sys     0m25.237s

Running apktool only takes 30 seconds, so that's not the cause; it's what happens afterwards.

My hypothesis is that the difference is caused by comparing the 42228 .smali files generated by apktool (for this particular APK) as well.

So the issue is that apktool decompiles the .dex files into .smali files, while we also convert each .dex into a .jar using enjarify and then decompile the .class files in that .jar with procyon, so the same classes end up being decompiled and compared twice.

I can't speak for other users, but I personally don't see the value in performing the .smali comparison as well in this case.

So I would suggest ignoring the .smali files (I don't know if you can tell apktool not to generate them) if enjarify and procyon are available, at least by default.

I can't rule out that having both could be useful in some cases (since the different methods of decompilation do not produce identical output), so being able to opt in to the double comparison does seem useful.
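A rough sketch of the default I have in mind, not a patch against diffoscope: the function name and the opt_in parameter are hypothetical, and the executable names "enjarify" and "procyon" are assumptions about how the tools are installed.

```python
import shutil


def should_compare_smali(opt_in=False, which=shutil.which,
                         tools=("enjarify", "procyon")):
    """Decide whether the .smali files should be compared at all.

    opt_in: hypothetical user switch requesting the double comparison.
    which:  lookup function for executables (injectable for testing).
    tools:  assumed executable names for the .jar/.class route.
    """
    if opt_in:
        # The user explicitly asked for both comparisons.
        return True
    # Skip .smali only when the enjarify + procyon route is fully available;
    # otherwise .smali is the only decompiled view we have.
    return not all(which(tool) is not None for tool in tools)
```

With both tools on the PATH this returns False unless the user opts in, and it falls back to the .smali comparison whenever either tool is missing.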

Edited by FC (Fay) Stegerman