Draft: Add support for nar files

Open Christopher Baines requested to merge cbaines-guest/diffoscope:nar-support into master
3 unresolved threads

I encounter nar files when working with GNU Guix, but the format comes from the Nix project.

This is the archive format used for Guix binary substitutes, so it's very useful to be able to use diffoscope to compare nars (it's equivalent to comparing .deb's). Currently though, you have to unpack the nar first, then run diffoscope on the directories. For simplicity, I've been looking at whether diffoscope could take care of that.

Activity

  • I've got to the point though where I'm going round in circles. In principle, unpacking the file then comparing the directories sounds simple, but I'm not sure there are any other containers that work this way? I'm also not sure if this is the right approach to fit this into diffoscope. I'm not sure how much Python I should try to write for nars, and how much I should just delegate to other classes.

    Does anyone have any thoughts on the approach?

    • In principle, unpacking the file then comparing the directories sounds simple, but I'm not sure there are any other containers that work this way?

      Hm, something must be broken somewhere, as that is how all Archive subclasses are meant to work -- ie. diffoscope should essentially recurse into them automatically. Take a look at the bzip2 comparator (diffoscope/comparators/bzip2.py), for example: it implements the Archive.extract method which takes a dest_dir as a parameter, with no need to manage a _contents and separate classes for the different types of files within the archive. All extract has to do is ensure that the contents end up within this dest_dir.

      Would you like another try at this, or should I jump in? I see you've added test files to your .nar so I think I have everything I might need, but I'm sure you'd feel more satisfied if you could finish your own MR. (Of course, I might be misunderstanding what a .nar file really is.)
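
The contract described above (an extract step whose only job is to make the contents appear inside a given dest_dir) can be illustrated with a small standalone sketch. This is not diffoscope's code: the function name extract_to_dest_dir and its signature are hypothetical, and Python's bz2 module stands in for whatever tool a real comparator would call.

```python
import bz2
import os
import shutil
import tempfile


def extract_to_dest_dir(archive_path, member_name, dest_dir):
    """Decompress a .bz2 file so its single member lands inside dest_dir.

    Hypothetical helper for illustration only; it is not part of diffoscope.
    """
    dest_path = os.path.join(dest_dir, member_name)
    with bz2.open(archive_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny compressed file to extract.
        archive = os.path.join(tmp, "sample.txt.bz2")
        with bz2.open(archive, "wb") as f:
            f.write(b"Lorem ipsum\n")

        dest_dir = os.path.join(tmp, "out")
        os.mkdir(dest_dir)
        print(extract_to_dest_dir(archive, "sample.txt", dest_dir))
```

The caller owns dest_dir and everything after extraction; the helper never needs to know what kind of file it produced.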

    • I think this nar comparator differs from the bzip2 one in that while you can view it as containing a single thing, that thing might be a directory (as well as a file/symlink). I think there's some assumption in the code that an ArchiveMember is a file, and diffoscope crashes if it's a directory; the NarDirectory class avoids that.

      Assuming the general approach I've taken is OK, there are maybe two issues.

      The first, superficial one is that the temporary directories end up in the output, e.g.

      ./bin/diffoscope tests/data/test3.nar tests/data/test4.nar
      --- tests/data/test3.nar
      +++ tests/data/test4.nar
      │   --- /tmp/diffoscope_ro1vy1s5_data/tmpo2s2k6a5_nar/contents
      ├── +++ /tmp/diffoscope_ro1vy1s5_data/tmpuqkgjzcq_nar/contents
      │ │   --- /tmp/diffoscope_ro1vy1s5_data/tmpo2s2k6a5_nar/contents/txt
      │ ├── +++ /tmp/diffoscope_ro1vy1s5_data/tmpuqkgjzcq_nar/contents/txt
      │ │ @@ -1,6 +1,12 @@
      │ │ +A common form of lorem ipsum reads:
      │ │ +
      │ │  Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
      │ │  incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
      │ │  nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
      │ │  Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
      │ │  fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
      │ │  culpa qui officia deserunt mollit anim id est laborum.
      │ │ +
      │ │ +"Lorem ipsum" text is derived from sections 1.10.32--3 of Cicero's De finibus
      │ │ +bonorum et malorum (On the Ends of Goods and Evils, or alternatively [About]
      │ │ +The Purposes of Good and Evil).

      The other, maybe more important one is that the temporary directory management, which is tied in with Python garbage collection, causes diffoscope to crash after the comparison: it seems to try deleting a directory without first deleting its contents. get_temporary_directory is used elsewhere, so I'm not sure what I'm doing differently. Maybe it's something to do with the temporary directory containing directories, though...
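
The symptom described here (removing a directory that still has contents) is easy to reproduce with the standard library alone. The sketch below is not a diagnosis of the diffoscope crash; it only shows the general behaviour, with a made-up directory layout loosely modelled on the paths in the output above. The helper names are hypothetical.

```python
import os
import shutil
import tempfile


def make_unpacked_tree():
    """Create a temp dir holding a nested 'contents' directory and one file."""
    root = tempfile.mkdtemp(suffix="_nar")
    os.makedirs(os.path.join(root, "contents"))
    with open(os.path.join(root, "contents", "txt"), "w") as f:
        f.write("Lorem ipsum\n")
    return root


def try_rmdir(path):
    """Return True if os.rmdir succeeds, False if it raises OSError."""
    try:
        os.rmdir(path)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    root = make_unpacked_tree()
    print(try_rmdir(root))       # False: os.rmdir refuses a non-empty directory
    shutil.rmtree(root)          # removes the directory and everything under it
    print(os.path.exists(root))  # False
```

If the cleanup path only ever expects flat files in the temporary directory, a nested directory would trip it in this way; whether that is what happens in this MR is an open question.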

      There might also be another problem with unusual comparisons. When comparing two nars, one containing a text file and the other a directory, diffoscope seems to fall back to a binary comparison. This is a bit of an edge case, but diffoscope seems to handle it better when just given a file and a directory to compare outside of a nar, so I'm guessing this is a problem in my implementation.

      → ./bin/diffoscope tests/data/test1.nar tests/data/test4.nar
      --- tests/data/test1.nar
      +++ tests/data/test4.nar
      │┄ Format-specific differences are supported for nar files but no file-specific differences were detected; falling back to a binary diff.
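
For the file-versus-directory case, one possible direction is to classify the top-level object of each unpacked nar before comparing and to report a type mismatch explicitly. The sketch below is standalone Python under that assumption; kind_of and describe_type_mismatch are hypothetical names, not diffoscope APIs.

```python
import os
import tempfile


def kind_of(path):
    """Classify a path; check symlinks first because isdir/isfile follow them."""
    if os.path.islink(path):
        return "symlink"
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "other"


def describe_type_mismatch(path1, path2):
    """Return a short message if the two paths differ in kind, else None."""
    kind1, kind2 = kind_of(path1), kind_of(path2)
    if kind1 == kind2:
        return None
    return f"type differs: {kind1} vs {kind2}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "contents_a")
        dir_path = os.path.join(tmp, "contents_b")
        with open(file_path, "w") as f:
            f.write("Lorem ipsum\n")
        os.mkdir(dir_path)
        print(describe_type_mismatch(file_path, dir_path))
```

A message like this could be surfaced as a difference instead of falling through to a binary diff of the two archives.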
    • Thanks for the background. I'll have a think about the right solution given that the nar can be different things - I suspect there is a vaguely elegant solution, but I might need to fiddle first. Can you upload different nar files for each case (dir and file/symlink)? Otherwise I will likely miss something.

      Changing the general approach will probably affect the other things you mention too, so I won't address them here/now.

    • Awesome, thanks. The 4 test nar files should be sufficiently representative. I'll double check about a top level symlink and add a couple of examples if that's possible (I think practically it's unused).

    • I don't think the symlink test file is really representative. Playing with that, I see that diffoscope treats it as a Symlink and not a NarSymlink simply because it's not the top-level file (which is the right thing to do, imho).

      Do you have something else that you believe might be problematic?

  • Chris Lamb changed title from Add support for nar files (work in progress) to Add support for nar files (WIP)

  • Mattia Rizzolo marked this merge request as draft

  • Mattia Rizzolo changed title from Add support for nar files (WIP) to Draft: Add support for nar files
