Commit 4f907360 authored by Mark Fasheh

Revert commit ac32d43d - "Make block-dedupe the default"

This is resulting in too much metadata fragmentation on the user's end.
This sometimes pushes us into negative space savings - definitely not something
users expect from their dedupe tool. So revert the default back to our
extent search algorithm. Block dedupe is still there in case there is a use
for it.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
parent d2a228da
@@ -5,7 +5,7 @@ This README is for duperemove v0.11.
 Duperemove is a simple tool for finding duplicated extents and
 submitting them for deduplication. When given a list of files it will
 hash their contents on a block by block basis and compare those hashes
-to each other, finding and categorizing blocks that match each
+to each other, finding and categorizing extents that match each
 other. When given the -d option, duperemove will submit those
 extents for deduplication using the Linux kernel extent-same ioctl.
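The block-by-block hashing described in this README hunk can be sketched roughly as follows. This is a minimal illustration, not duperemove's code: FNV-1a stands in for whatever hash the tool actually uses, the data lives in memory rather than in files, and the names `hash_block`, `hash_blocks` and `count_matching_pairs` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a: a stand-in hash chosen for brevity (assumption, not duperemove's). */
static uint64_t hash_block(const unsigned char *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;              /* FNV prime */
    }
    return h;
}

/*
 * Hash every whole block of `data` into `out` (at most max_out entries).
 * A trailing partial block is ignored. Returns the number of hashes written.
 */
static size_t hash_blocks(const unsigned char *data, size_t len,
                          size_t block_size, uint64_t *out, size_t max_out)
{
    assert(block_size > 0);
    size_t n = 0;
    for (size_t off = 0; off + block_size <= len && n < max_out;
         off += block_size)
        out[n++] = hash_block(data + off, block_size);
    return n;
}

/* Count (i, j) pairs whose block hashes are equal: candidate duplicates. */
static size_t count_matching_pairs(const uint64_t *ha, size_t na,
                                   const uint64_t *hb, size_t nb)
{
    size_t pairs = 0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (ha[i] == hb[j])
                pairs++;
    return pairs;
}
```

The quadratic comparison keeps the sketch short; a real tool would more likely index hashes in a table or tree so that matches are found without comparing every pair.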
@@ -8,7 +8,7 @@ duperemove \- Find duplicate extents and print them to stdout
 \fBduperemove\fR is a simple tool for finding duplicated extents and
 submitting them for deduplication. When given a list of files it will
 hash their contents on a block by block basis and compare those hashes
-to each other, finding and categorizing blocks that match each
+to each other, finding and categorizing extents that match each
 other. When given the \fB-d\fR option, \fBduperemove\fR will submit
 those extents for deduplication using the Linux kernel extent-same
 ioctl.
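For readers unfamiliar with the kernel side, here is a hedged sketch of submitting one range for deduplication. It assumes the generic `FIDEDUPERANGE` ioctl from `<linux/fs.h>`, which current kernels provide as the filesystem-independent form of extent-same; whether this version of duperemove calls that or the btrfs-specific variant is not shown in this diff. The helper name `dedupe_one_range` is hypothetical.

```c
#include <errno.h>
#include <linux/fs.h>     /* FIDEDUPERANGE, struct file_dedupe_range */
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/*
 * Ask the kernel to share `len` bytes at src_off in src_fd with the
 * range at dst_off in dst_fd. Returns the number of bytes deduped, or
 * -1 if the ioctl failed or the kernel found the ranges to differ.
 */
static long long dedupe_one_range(int src_fd, uint64_t src_off,
                                  int dst_fd, uint64_t dst_off, uint64_t len)
{
    /* One destination, so allocate the header plus one info slot. */
    struct file_dedupe_range *req =
        calloc(1, sizeof(*req) + sizeof(struct file_dedupe_range_info));
    if (!req)
        return -1;

    req->src_offset = src_off;
    req->src_length = len;
    req->dest_count = 1;
    req->info[0].dest_fd = dst_fd;
    req->info[0].dest_offset = dst_off;

    long long result = -1;
    if (ioctl(src_fd, FIDEDUPERANGE, req) == 0 &&
        req->info[0].status == FILE_DEDUPE_RANGE_SAME)
        result = (long long)req->info[0].bytes_deduped;

    int saved_errno = errno;   /* keep errno from the ioctl across free() */
    free(req);
    errno = saved_errno;
    return result;
}
```

The kernel compares the two ranges itself before sharing them, so a stale or colliding hash cannot cause data loss; at worst the request reports that the ranges differ.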
@@ -36,6 +36,10 @@ seeing what duperemove might do when run with \fB-d\fR. The output could
 also be used by some other software to submit the extents for
 deduplication at a later time.
+It is important to note that this mode will not print out \fBall\fR
+instances of matching extents, just those it would consider for
+deduplication.
 Generally, duperemove does not concern itself with the underlying
 representation of the extents it processes. Some of them could be
 compressed, undergoing I/O, or even have already been deduplicated. In
@@ -186,14 +190,10 @@ fiemap during the file scan stage, you will also want to use the
 \fB--lookup-extents=no\fR option.
 .TP
 \fB[no]block\fR
-Defaults to \fBon\fR. Duperemove submits duplicate blocks directly to
-the dedupe engine.
-Duperemove can optionally optimize the duplicate block lists into
-larger extents prior to dedupe submission. The search algorithm used
-for this however has a very high memory and cpu overhead, but may
-reduce the number of extent references created during dedupe. If you'd
-like to try this, run with 'noblock'.
+Defaults to \fBoff\fR. Dedupe by block - don't optimize our data into
+extents before dedupe. Generally this is undesirable as it will
+greatly increase the total number of dedupe requests. There is also a
+larger potential for file fragmentation.
 .RE
 .TP
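The claim that block mode greatly increases the number of dedupe requests can be made concrete with a small counting sketch. It is a simplification under stated assumptions: `dup[i]` marks block `i` of one file as a duplicate of the corresponding block in another file, and "extent mode" is modelled as plain run coalescing, which is not duperemove's actual extent search algorithm. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Block mode: every duplicate block becomes its own dedupe request. */
static size_t requests_block_mode(const int *dup, size_t nblocks)
{
    assert(dup != NULL || nblocks == 0);
    size_t n = 0;
    for (size_t i = 0; i < nblocks; i++)
        if (dup[i])
            n++;
    return n;
}

/*
 * Extent mode (simplified): each run of adjacent duplicate blocks is
 * merged and submitted as a single request, so only run starts count.
 */
static size_t requests_extent_mode(const int *dup, size_t nblocks)
{
    assert(dup != NULL || nblocks == 0);
    size_t n = 0;
    for (size_t i = 0; i < nblocks; i++)
        if (dup[i] && (i == 0 || !dup[i - 1]))
            n++;
    return n;
}
```

For the pattern `1,1,1,0,1,1,0,0,1` block mode issues 6 requests while the coalesced form issues 3. Each request can create a separate extent reference, which is one way to read the metadata fragmentation mentioned in the commit message.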
@@ -56,7 +56,7 @@ unsigned int blocksize = DEFAULT_BLOCKSIZE;
 int run_dedupe = 0;
 int recurse_dirs = 0;
 int one_file_system = 1;
-int block_dedupe = 1;
+int block_dedupe = 0;
 int dedupe_same_file = 0;
 int skip_zeroes = 0;
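To show how a `[no]block` style option could interact with the new default, here is a minimal sketch. The parsing helper `parse_block_option` is hypothetical and is not taken from duperemove's source; only the `block_dedupe` flag and its default of 0 come from the hunk above.

```c
#include <assert.h>
#include <string.h>

/* Default after this commit: extent search, not block dedupe. */
static int block_dedupe = 0;

/*
 * Hypothetical option handler: "block" turns block dedupe on,
 * "noblock" turns it off. Returns 0 when the option is recognised and
 * -1 otherwise, leaving the flag untouched in that case.
 */
static int parse_block_option(const char *opt)
{
    assert(opt != NULL);
    if (strcmp(opt, "block") == 0) {
        block_dedupe = 1;
        return 0;
    }
    if (strcmp(opt, "noblock") == 0) {
        block_dedupe = 0;
        return 0;
    }
    return -1;
}
```

With this shape, users who relied on the previous default would need to pass `block` explicitly, while everyone else gets the extent search without changing their command line.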