Many improvements for dl10n-check

Pino Toscano requested to merge pino/dl10n:dl10n-check-improvements into master

This work branch provides lots of improvements to dl10n-check:

  1. it now uses (via the Debian::Pkg::DebSrc module) the dpkg API to extract the source package to a temporary directory: while this takes disk space (and more time for the unpacking), it removes the need to manually parse the source package. This alone should fix most (if not all) of the problematic sources mentioned in IGNMATERIAL (in etc/dl10n.conf)
  2. it also uses the dpkg API to:
     • parse the control files
     • compare versions (instead of spawning dpkg commands)
     • parse the .dsc files
  3. improves the handling of languages:
     • switches to the core Locale::* modules, so it knows about many more languages than before
     • tweaks the language detection of .po files, so it properly recognizes more cases
     • allows 3-letter language codes, e.g. hsb
  4. uses more existing modules, instead of spawning external commands
  5. drops support for yada
  6. improves the convert_to_unicode subroutine:
     • requires the Text::Iconv module, caching its instances
     • moves the fallback code to a subroutine, using it only if needed (saves some s///)
  7. adds an optional value for --careful, so it is possible to batch DB saves every N changes (instead of after every change)
  8. simplifies the checking of Priority and Section in sources
  9. ... and other minor changes
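
To illustrate the idea behind the optional value for --careful (saving the DB every N changes rather than after each one), here is a minimal sketch in Python. It is not the dl10n-check code, which is written in Perl: the `BatchedSaver` class, its method names, and the `save_fn` callback are all hypothetical and only show the batching pattern.

```python
class BatchedSaver:
    """Hypothetical helper: call save_fn once every `batch_size` changes."""

    def __init__(self, save_fn, batch_size=1):
        if batch_size < 1:
            raise ValueError("batch_size must be >= 1")
        self.save_fn = save_fn        # stands in for "write the DB to disk"
        self.batch_size = batch_size  # the N given to --careful (assumed meaning)
        self.pending = 0              # changes recorded since the last save

    def record_change(self):
        """Count one change and save once the batch is full."""
        self.pending += 1
        if self.pending >= self.batch_size:
            self.flush()

    def flush(self):
        """Save if anything is pending; call once at the end of the run."""
        if self.pending:
            self.save_fn()
            self.pending = 0


saves = []
saver = BatchedSaver(lambda: saves.append("saved"), batch_size=3)
for _ in range(7):
    saver.record_change()
saver.flush()      # pick up the one change left over after the last full batch
print(len(saves))  # 3: after changes 3 and 6, plus the final flush
```

With `batch_size=1` the sketch degenerates to saving after every change, which matches the behaviour described for the plain flag. The trade-off is the usual one: a larger N means fewer writes, but more changes lost if the run is interrupted between saves.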

Regarding memory usage: dl10n-check now takes more RAM (~500MB or so in my tests), but simply because the DB will now contain way more files than before. Also, the memory usage is more or less constant, and it does not spike to absurd values when handling very big sources (e.g. 0ad-data, libreoffice, etc.). So it should now be fine to let it run on all the sources.

Edited by Pino Toscano