Bees v0.6

This release brings some significant performance improvements.

This release exists so we can refer to its beeshome file format as "the bees
v0.6 format."  This format is three years old and is currently blocking
further performance improvement.  Future bees versions may import data
from this format, but no support for downgrades is planned.

Highlights:

        * Fixed a bug in extent mapping that was causing severe performance loss
        * Multi-threaded parallel execution
        * Dynamic thread pool size based on system load average
        * Automatically adjusts scan polling interval to match filesystem update rate
        * Subvol parallel scan modes:  lockstep (0), independent (1), and sequential (2)
        * Fixes for ARM, Gentoo, systemd compatibility
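
To illustrate the idea behind the load-based thread pool highlight, here is a
minimal sketch of one way a worker count could follow the system load average.
It is not taken from the bees source: the function names, the one-step
adjustment policy, and the target-load parameter are all assumptions made for
this example.  Only getloadavg() is a real call (a glibc/BSD extension), and
the minimum thread count corresponds loosely to what the -G/--thread-min
option in the shortlog provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdlib.h>   // getloadavg(), a glibc/BSD extension

// Hypothetical helper (not from bees): decide the next worker count.
// Adds one thread while the system is below the target load and removes
// one while it is above, staying inside [min_threads, max_threads].
std::size_t next_thread_count(std::size_t current, double load,
                              double target_load,
                              std::size_t min_threads,
                              std::size_t max_threads)
{
    assert(min_threads <= max_threads);
    std::size_t next = current;
    if (load < target_load && current < max_threads) {
        ++next;   // spare capacity: grow the pool by one
    } else if (load > target_load && current > min_threads) {
        --next;   // system is busy: shrink the pool by one
    }
    // Clamp in case the caller passed a count outside the limits.
    return std::max(min_threads, std::min(next, max_threads));
}

// Hypothetical helper: read the 1-minute load average.
// Returns false if the platform could not supply it.
bool read_load_average(double &load)
{
    double samples[1];
    if (getloadavg(samples, 1) != 1) {
        return false;
    }
    load = samples[0];
    return true;
}
```

A caller would poll read_load_average() periodically and pass the result to
next_thread_count() together with its configured limits.  Changing the count
by one step per poll is a deliberately simple choice that avoids large swings
when the load average moves slowly.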

Shortlog:

Kai Krakow (75):
      crucible: Allow setting a relative path option for name_fd()
      getopt: Add logic to set relative path from $CWD
      Remove filter path logic from frontend script
      Fix example config for timestamp logging
      Fix indentation/alignment after integration
      Remove process forking from frontend script
      Make clear that options must be supplied in one variable
      Fix a fallthrough error in GCC 7+
      Fix a fallthrough error in GCC 7+
      Don't zap localconf in "make clean"
      Add scripts to "make all" target
      Generalize sed invocation rule
      systemd: Don't start in system-update.target
      systemd: Don't start without essential system services
      systemd: Provide URL and better description
      Makefile: depend install_scripts on scripts
      Makefile: let "make install" install the complete distribution
      Add beesd@.service to gitignore
      Makefile improvement
      Installation: Prepare README
      Installation: Add new section to README
      Makefile: Document Makefile changes
      Installation: Add Arch Linux instructions
      Makefile: Document scripts/beesd
      Installation: Document optional dependency on blkid
      Installation: Introduce DESTDIR into Makefile
      Installation: Improve filesystem layout flexibility
      Installation: Keep version tag in a variable
      Installation: Fix soname QA warning in Gentoo
      Installation: -fPIC should not be used unconditionally
      Installation: Add Gentoo ebuild
      Installation: Remove superfluous cruft from Gentoo ebuild
      Installation: Depend Gentoo ebuild on markdown
      Makefile: Fail gracefully if markdown is not installed
      Makefile: depends.mk is not an optional include
      Makefile: .o already depends on its .h file
      Makefile: generalize .so target
      Makefile: rename OBJS to CRUCIBLE_OBJS
      Makefile: fix dependency generation
      Makefile: Generalize the .version.cc target
      Makefile: do not be verbose about mv
      Makefile: speedup dependency generation
      Logging: Add log levels to output
      Makefile: Some cleanups
      Makefile: Fix some dependencies
      Logging: Improve text layout when discarding log timestamps
      Makefile: -lXXXXX is really a filename parameter
      Makefile: force rebuilding tests when Makefile changed
      Makefile: Get rid of test for-loop
      README: Fix markdown syntax error
      README: Some things are simply no longer true
      Cmdline: Rename "notimestamps" to "no-timestamps"
      Cmdline: Rename "relative-paths" to "strip-paths"
      Cmdline: Fix text alignment
      README: Add notes about packaging
      Makefile: Unclutter "make test" output
      Code style: Fix wrong indentation
      Makefile: remove tests from "make all"
      Makefile: Run install tests only for default target "reallyall"
      Makefile: make installing libs a separate target
      Makefile: Allow installation of fiemap/fiewalk support tools
      Makefile: Auto-detect systemd unit path
      Scripts: Fix systemd unit not being templated
      Installation: Remove USR_PREFIX from Makefile
      Compilation: Let the code know about package config
      Makefile: .version.o is made from a generated file
      Scripts: Don't prefix timestamps when running with systemd
      Makefile: create a template compiler
      Makefile: Due to VPATH, libcrucible links to hard-coded libuuid path
      Makefile: "which" is not portable
      Makefile: Do not force making README.html
      Makefile: Do not force optimizations by default
      Gentoo: Rework Gentoo ebuild into overlay
      beesd: Fix the wrapper not finding any config file
      contrib/gentoo: Update ebuild

Timofey Titovets (5):
      Fix: exec bees - breaks bash trap handling of umount bees workdir
      Fix: exec bees - breaks bash trap handling of umount bees workdir
      Make beesd -h useful
      Update options in sample config
      Rewrite beesd arg parser

Zygo Blaxell (83):
      Merge remote-tracking branches 'kakra/feature/add-relative-path-option' and 'kakra/integration'
      hash: reduce mutex contention using one mutex per hash table extent
      Merge remote-tracking branch 'nefelim4ag/master'
      crucible: add cleanup class
      lockset: drop unused method wait_unlock
      crucible: resource: remove excess locking
      crucible: resource: optimize map cleanup
      roots: remove open_root_cache correctly
      subvol-threads: increase resource and thread limits
      Makefiles: don't append to depends.mk.new
      test: add -lpthread to Makefile
      bees: clean up #if 0 ... fsync ... #endif code
      README: update dependencies and Linux kernel bugs list
      crucible: add Task class
      roots: remove dead code and #if blocks
      crucible: remove unused TimeQueue and WorkQueue classes
      roots: scan in parallel using Tasks
      crawl: implement two crawler algorithms and adjust scheduling parameters
      README: describe the scanning mode (-m option)
      hash: do the mlock after loading the table
      crucible: cache: linked-list LRU implementation
      counters: track pair growing time
      crawl: make logging less verbose
      task: allow external access to Task print function
      BeesNote: if thread name was not set, get it from Task or pthread_getname_np
      logging: get Task names for log messages
      roots: update Task print functions for new usage
      Merge remote-tracking branch 'kakra/proposal/prepare-for-more-libs'
      BeesNote: thread naming fixes
      Task: convert print_fn to a string
      time: drop unused Timer methods
      types: don't throw an exception when it's likely we are already reporting an exception
      BeesBlockData: don't leak file contents in the log
      README: update Linux kernel bugs list (v4.14)
      bees: drop BEESINFO
      roots: comment updates and general cleanup
      time: add RateEstimator, a class for optimally polling irregular external events
      roots: use RateEstimator to track transids
      crawl: combine two messages per crawl cycle into one
      FdCache: clear cache on every new transid / crawl cycle
      roots: use RateEstimator as a transid_max cache and clean up logs
      roots: poll every 10 transids
      scan: fix length mismatch exception for prealloc extents at EOF
      ExtentWalker: increase efficiency for typical btrfs extent sizes
      roots: move common code for creating crawl Tasks into a method
      roots: add scan-mode 2 "oldest crawler first"
      README: add scan-mode 2 and expand descriptions of modes 0 and 1
      resolve: drop support for old-style compressed BeesAddr
      context: improve toxic match logs
      time: add update_monotonic to RateEstimator
      task: allow user access to ID and default constructor
      log: BEESLOGNOTE doesn't do what we think it does
      crawl: don't block a Task waiting for new transids
      roots: determine transid_max without open()ing every subvol root
      crawl: filter extents correctly
      crawl: somebody should set max_transid
      extentwalker: remove wrong constraint check
      roots: get rid of common error messages, add more error counters
      cache: release lock before clearing
      README: clarify that bees is not to be used on old kernels
      README: FD caches are now cleared every 10 transactions
      resolve: break up long intra-extent dedup loops
      crucible: MAP_32BIT is not defined on ARM
      crucible: progress: a progress tracker for worker queues
      BeesBlockData: fix data type issues
      stats: rename "chase_wrong_data" to "chase_no_data"
      fs: fix FTBFS on GCC 8
      tempfile: update comments around bees_sync
      bees: revert TOXIC_INTERVAL back to pre-4.14 levels
      crawl: use custom order instead of (ab)using BeesFileRange::operator<
      README: update known bugs and issues list
      crucible: error: record location of exception in what() message
      crucible: progress: drop the set() method
      README.md: update build-deps
      context: log dedups with single unbroken log message
      bees: configurable log verbosity
      bees: use readahead instead of posix_fadvise
      bees: dynamic thread pool size based on system load average
      roots: if queue is full run again
      README: spell 'available' correctly
      bees: add -G/--thread-min option for minimum thread count
      Merge https://github.com/Zygo/bees/pull/62
      extentwalker: don't fetch absurd numbers of extents just to throw them away