Bees v0.6

This release brings some significant performance improvements.

This release exists so we can refer to its beeshome file format as
"the bees v0.6 format." This format is three years old and is
currently blocking further performance improvement. Future bees
versions may import data from this format, but no support for
downgrades is planned.

Highlights:

    * Fixed a bug in extent mapping that was causing severe performance loss
    * Multi-threaded parallel execution
    * Dynamic thread pool size based on system load average
    * Automatically adjusts scan polling interval to match filesystem update rate
    * Subvol parallel scan modes: lockstep (0), independent (1), and sequential (2)
    * Fixes for ARM, Gentoo, systemd compatibility

Shortlog:

Kai Krakow (75):
      crucible: Allow setting a relative path option for name_fd()
      getopt: Add logic to set relative path from $CWD
      Remove filter path logic from frontend script
      Fix example config for timestamp logging
      Fix indentation/alignment after integration
      Remove process forking from frontend script
      Make clear that options must be supplied in one variable
      Fix a fallthrough error in GCC 7+
      Fix a fallthrough error in GCC 7+
      Don't zap localconf in "make clean"
      Add scripts to "make all" target
      Generalize sed invocation rule
      systemd: Don't start in system-update.target
      systemd: Don't start without essential system services
      systemd: Provide URL and better description
      Makefile: depend install_scripts on scripts
      Makefile: let "make install" install the complete distribution
      Add beesd@.service to gitignore
      Makefile improvement
      Installation: Prepare README
      Installation: Add new section to README
      Makefile: Document Makefile changes
      Installation: Add Arch Linux instructions
      Makefile: Document scripts/beesd
      Installation: Document optional dependency on blkid
      Installation: Introduce DESTDIR into Makefile
      Installation: Improve filesystem layout flexibility
      Installation: Keep version tag in a variable
      Installation: Fix soname QA warning in Gentoo
      Installation: -fPIC should not be used unconditionally
      Installation: Add Gentoo ebuild
      Installation: Remove superfluous cruft from Gentoo ebuild
      Installation: Depend Gentoo ebuild on markdown
      Makefile: Fail gracefully if markdown is not installed
      Makefile: depends.mk is not an optional include
      Makefile: .o already depends on its .h file
      Makefile: generalize .so target
      Makefile: rename OBJS to CRUCIBLE_OBJS
      Makefile: fix dependency generation
      Makefile: Generalize the .version.cc target
      Makefile: do not be verbose about mv
      Makefile: speedup dependency generation
      Logging: Add log levels to output
      Makefile: Some cleanups
      Makefile: Fix some dependencies
      Logging: Improve text layout when discarding log timestamps
      Makefile: -lXXXXX is really a filename parameter
      Makefile: force rebuilding tests when Makefile changed
      Makefile: Get rid of test for-loop
      README: Fix markdown syntax error
      README: Some things are simply no longer true
      Cmdline: Rename "notimestamps" to "no-timestamps"
      Cmdline: Rename "relative-paths" to "strip-paths"
      Cmdline: Fix text alignment
      README: Add notes about packaging
      Makefile: Unclutter "make test" output
      Code style: Fix wrong indentation
      Makefile: remove tests from "make all"
      Makefile: Run install tests only for default target "reallyall"
      Makefile: make installing libs a separate target
      Makefile: Allow installation of fiemap/fiewalk support tools
      Makefile: Auto-detect systemd unit path
      Scripts: Fix systemd unit not being templated
      Installation: Remove USR_PREFIX from Makefile
      Compilation: Let the code know about package config
      Makefile: .version.o is made from a generated file
      Scripts: Don't prefix timestamps when running with systemd
      Makefile: create a template compiler
      Makefile: Due to VPATH, libcrucible links to hard-coded libuuid path
      Makefile: "which" is not portable
      Makefile: Do not force making README.html
      Makefile: Do not force optimizations by default
      Gentoo: Rework Gentoo ebuild into overlay
      beesd: Fix the wrapper not finding any config file
      contrib/gentoo: Update ebuild

Timofey Titovets (5):
      Fix: exec bees - breaks bash trap handling of umount bees workdir
      Fix: exec bees - breaks bash trap handling of umount bees workdir
      Make beesd -h useful
      Update options in sample config
      Rewrite beesd arg parser

Zygo Blaxell (83):
      Merge remote-tracking branches 'kakra/feature/add-relative-path-option' and 'kakra/integration'
      hash: reduce mutex contention using one mutex per hash table extent
      Merge remote-tracking branch 'nefelim4ag/master'
      crucible: add cleanup class
      lockset: drop unused method wait_unlock
      crucible: resource: remove excess locking
      crucible: resource: optimize map cleanup
      roots: remove open_root_cache correctly
      subvol-threads: increase resource and thread limits
      Makefiles: don't append to depends.mk.new
      test: add -lpthread to Makefile
      bees: clean up #if 0 ... fsync ... #endif code
      README: update dependencies and Linux kernel bugs list
      crucible: add Task class
      roots: remove dead code and #if blocks
      crucible: remove unused TimeQueue and WorkQueue classes
      roots: scan in parallel using Tasks
      crawl: implement two crawler algorithms and adjust scheduling parameters
      README: describe the scanning mode (-m option)
      hash: do the mlock after loading the table
      crucible: cache: linked-list LRU implementation
      counters: track pair growing time
      crawl: make logging less verbose
      task: allow external access to Task print function
      BeesNote: if thread name was not set, get it from Task or pthread_getname_np
      logging: get Task names for log messages
      roots: update Task print functions for new usage
      Merge remote-tracking branch 'kakra/proposal/prepare-for-more-libs'
      BeesNote: thread naming fixes
      Task: convert print_fn to a string
      time: drop unused Timer methods
      types: don't throw an exception when it's likely we are already reporting an exception
      BeesBlockData: don't leak file contents in the log
      README: update Linux kernel bugs list (v4.14)
      bees: drop BEESINFO
      roots: comment updates and general cleanup
      time: add RateEstimator, a class for optimally polling irregular external events
      roots: use RateEstimator to track transids
      crawl: combine two messages per crawl cycle into one
      FdCache: clear cache on every new transid / crawl cycle
      roots: use RateEstimator as a transid_max cache and clean up logs
      roots: poll every 10 transids
      scan: fix length mismatch exception for prealloc extents at EOF
      ExtentWalker: increase efficiency for typical btrfs extent sizes
      roots: move common code for creating crawl Tasks into a method
      roots: add scan-mode 2 "oldest crawler first"
      README: add scan-mode 2 and expand descriptions of modes 0 and 1
      resolve: drop support for old-style compressed BeesAddr
      context: improve toxic match logs
      time: add update_monotonic to RateEstimator
      task: allow user access to ID and default constructor
      log: BEESLOGNOTE doesn't do what we think it does
      crawl: don't block a Task waiting for new transids
      roots: determine transid_max without open()ing every subvol root
      crawl: filter extents correctly
      crawl: somebody should set max_transid
      extentwalker: remove wrong constraint check
      roots: get rid of common error messages, add more error counters
      cache: release lock before clearing
      README: clarify that bees is not to be used on old kernels
      README: FD caches are now cleared every 10 transactions
      resolve: break up long intra-extent dedup loops
      crucible: MAP_32BIT is not defined on ARM
      crucible: progress: a progress tracker for worker queues
      BeesBlockData: fix data type issues
      stats: rename "chase_wrong_data" to "chase_no_data"
      fs: fix FTBFS on GCC 8
      tempfile: update comments around bees_sync
      bees: revert TOXIC_INTERVAL back to pre-4.14 levels
      crawl: use custom order instead of (ab)using BeesFileRange::operator<
      README: update known bugs and issues list
      crucible: error: record location of exception in what() message
      crucible: progress: drop the set() method
      README.md: update build-deps
      context: log dedups with single unbroken log message
      bees: configurable log verbosity
      bees: use readahead instead of posix_fadvise
      bees: dynamic thread pool size based on system load average
      roots: if queue is full run again
      README: spell 'available' correctly
      bees: add -G/--thread-min option for minimum thread count
      Merge https://github.com/Zygo/bees/pull/62
      extentwalker: don't fetch absurd numbers of extents just to throw them away