Reproducible Builds / diffoscope / Merge requests / !93

Fix missing diff output on large diffs

Status: Closed. Brandon Maier requested to merge blmaier-col/diffoscope:fix-side-by-side-diff-for-large-diffs into master on Nov 15, 2021.

When there is a large diff chunk, match_lines() skips running difflib.Differ.compare(). However, this causes the following issues:

  • It does not empty the self.buf buffer, so every later call to match_lines() for that file also sees a buffer that is too large. Effectively, no further diffs from that file are output.

  • It outputs a debug message, but does not output anything to the side-by-side diff, so a user looking at the side-by-side diff may be misled into thinking the rest of the file has no differences.

We can fix these issues by falling back to a lazy line-by-line diff. This produces suboptimal output, but it runs in linear, O(n), time and still provides some form of output. We include a comment in the diff so the user knows that the output that follows comes from the lazy diff algorithm.
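
The fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code from this merge request: the ChunkBuffer class, its flush() method, the max_lines threshold, the row tuples, and the wording of the notice are all hypothetical. Only difflib.Differ.compare() and itertools.zip_longest() are real library calls.

```python
import difflib
from itertools import zip_longest

# Hypothetical wording for the comment shown to the user.
LAZY_NOTICE = "[ chunk too large, falling back to line-by-line diff ]"


def _aligned_rows(removed, added):
    """Align lines with difflib.Differ (can be slow on large inputs)."""
    for line in difflib.Differ().compare(removed, added):
        marker, text = line[:2], line[2:]
        if marker == "  ":
            yield ("equal", text, text)
        elif marker == "- ":
            yield ("delete", text, None)
        elif marker == "+ ":
            yield ("insert", None, text)
        # "? " hint lines carry no content of their own and are skipped.


def _lazy_rows(removed, added):
    """Pair lines purely by position: one pass, O(n), no alignment."""
    for left, right in zip_longest(removed, added):
        if left is None:
            yield ("insert", None, right)
        elif right is None:
            yield ("delete", left, None)
        elif left == right:
            yield ("equal", left, right)
        else:
            yield ("replace", left, right)


class ChunkBuffer:
    """Illustrative stand-in for a per-file buffer of pending chunk lines."""

    def __init__(self, max_lines=4):
        self.max_lines = max_lines  # hypothetical threshold
        self.removed = []
        self.added = []

    def flush(self):
        removed, added = self.removed, self.added
        # Always reset the buffer, even on the large-chunk path, so that
        # later chunks are not measured against stale content.
        self.removed, self.added = [], []
        if len(removed) + len(added) <= self.max_lines:
            return list(_aligned_rows(removed, added))
        # Too large: tell the reader, then emit the cheap positional diff.
        notice = ("comment", LAZY_NOTICE, LAZY_NOTICE)
        return [notice] + list(_lazy_rows(removed, added))
```

The two points that matter in the sketch are that flush() resets the buffer on both paths and that the large-chunk path still emits rows, preceded by a visible notice, instead of only logging.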

Source branch: fix-side-by-side-diff-for-large-diffs