Re: Identifying reflink / CoW files

From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Saint Germain <saintger@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Identifying reflink / CoW files
Date: Thu, 24 Nov 2016 22:55:43 -0500
Message-ID: <20161125035543.GZ21290@hungrycats.org>
In-Reply-To: <20161104154149.754c3eab@system>

On Fri, Nov 04, 2016 at 03:41:49PM +0100, Saint Germain wrote:
> On Thu, 3 Nov 2016 01:17:07 -0400, Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote :
> > [...]
> > The quality of the result therefore depends on the amount of effort
> > put into measuring it.  If you look for the first non-hole extent in
> > each file and use its physical address as a physical file identifier,
> > then you get a fast reflink detector function that has a high risk of
> > false positives.  If you map out two files and compare physical
> > addresses block by block, you get a slow function with a low risk of
> > false positives (but maybe a small risk of false negatives too).
> > 
> > If your dedup program only does full-file reflink copies then the
> > first extent physical address method is sufficient.  If your program
> > does block- or extent-level dedup then it shouldn't be using files in
> > its data model at all, except where necessary to provide a mechanism
> > to access the physical blocks through the POSIX filesystem API.
> > 
> > FIEMAP will tell you about all the extents (physical address for
> > extents that have them, zero for other extent types).  It's also slow
> > and has assorted accuracy problems especially with compressed files.
> > Any user can run FIEMAP, and it uses only standard structure arrays.
> > 
> > SEARCH_V2 is root-only and requires parsing variable-length binary
> > btrfs data encoding, but it's faster than FIEMAP and gives more
> > accurate results on compressed files.
> 
> As the dedup program only does full-file reflink, the first extent
> physical address method can be used as a fast first check to identify
> potential files.
> 
> But how to implement the second check in order to have 0% risk of false
> positive ?
> Because you said that mapping out two files and comparing the physical
> addresses block by block also has a low risk of false positives.

In theory, what you do is call FIEMAP on each file and compare the
physical blocks that come back.  If they are large files you will have
to call FIEMAP multiple times on both files, each time setting the start
position to the end position of the previous run.  Translate each result
record into a range of physical addresses, then compare them.  If there
are no differences, the files are already deduped.
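
The FIEMAP loop described above can be sketched roughly as follows.
This is a minimal illustration, not code from any existing dedup tool:
the ioctl number and struct layouts are taken from linux/fiemap.h as
I understand it, and the helper names (fiemap_extents, physical_ranges,
looks_deduped) and the batch size of 128 are made up for the example.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B           # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1               # flush dirty data before mapping
FIEMAP_EXTENT_LAST = 0x1             # set on the final extent of the file
HDR = struct.Struct("=QQIIII")       # struct fiemap header (32 bytes)
EXT = struct.Struct("=QQQQQIIII")    # struct fiemap_extent (56 bytes)
BATCH = 128                          # extents requested per call (arbitrary)


def fiemap_extents(fd):
    """Yield (logical, physical, length, flags) for every extent of fd."""
    start = 0
    while True:
        buf = bytearray(HDR.size + BATCH * EXT.size)
        HDR.pack_into(buf, 0, start, 0xFFFFFFFFFFFFFFFF,
                      FIEMAP_FLAG_SYNC, 0, BATCH, 0)
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
        mapped = HDR.unpack_from(buf, 0)[3]      # fm_mapped_extents
        if mapped == 0:
            return
        for i in range(mapped):
            rec = EXT.unpack_from(buf, HDR.size + i * EXT.size)
            logical, physical, length, flags = rec[0], rec[1], rec[2], rec[5]
            yield logical, physical, length, flags
            if flags & FIEMAP_EXTENT_LAST:
                return
        # Restart at the end position of the previous run.
        start = logical + length


def physical_ranges(extents):
    """Merge contiguous records into (logical, physical, length) ranges."""
    merged = []
    for logical, physical, length, _flags in extents:
        if merged:
            pl, pp, plen = merged[-1]
            if pl + plen == logical and pp + plen == physical:
                merged[-1] = (pl, pp, plen + length)
                continue
        merged.append((logical, physical, length))
    return merged


def looks_deduped(path_a, path_b):
    """True if both files report identical physical ranges via FIEMAP."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        return (physical_ranges(fiemap_extents(a.fileno())) ==
                physical_ranges(fiemap_extents(b.fileno())))
```

Merging contiguous records first means two files are not reported as
different just because FIEMAP happened to split the same range at
different points.  Two empty files also compare equal here, so a real
tool would want to handle that case separately.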

In practice, FIEMAP doesn't provide full accuracy for compressed extents,
and in some cases the physical address data will compare equal when
the files are in fact different.  This is the small risk of false
positives, and the only way to get 100% accuracy is to not use FIEMAP.

Instead you can use the SEARCH ioctl, which dumps out the raw binary
extent items from btrfs.  If you look up the items corresponding to one
inode, you get the real physical block address and, for compressed
extents, the offset from the beginning of the extent.
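
To give an idea of what parsing those items involves, here is a small
sketch that decodes the body of a single file extent item once the
SEARCH ioctl has returned it.  The ioctl call itself is left out.  The
field layout is my reading of struct btrfs_file_extent_item in the
kernel headers and should be checked against them; the names
FileExtent and parse_file_extent are invented for this example.

```python
import struct
from collections import namedtuple

# Assumed layout of struct btrfs_file_extent_item (little-endian, packed):
# generation, ram_bytes, compression, encryption, other_encoding, type,
# disk_bytenr, disk_num_bytes, offset, num_bytes  (53 bytes in total)
FILE_EXTENT_ITEM = struct.Struct("<QQBBHBQQQQ")
FileExtent = namedtuple(
    "FileExtent",
    "generation ram_bytes compression encryption other_encoding type "
    "disk_bytenr disk_num_bytes offset num_bytes")

BTRFS_FILE_EXTENT_INLINE = 0     # data stored in the item, no disk address


def parse_file_extent(item):
    """Decode one file extent item body; None for inline or short items."""
    if len(item) < FILE_EXTENT_ITEM.size:
        return None
    ext = FileExtent(*FILE_EXTENT_ITEM.unpack_from(item))
    if ext.type == BTRFS_FILE_EXTENT_INLINE:
        return None              # bytes after the type field are file data
    return ext
```

For a regular extent, disk_bytenr is the physical start of the extent
(zero for a hole) and offset is how far into the extent this file's
reference begins.  A nonzero compression field marks a compressed
extent, which is the case FIEMAP reports inaccurately.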

In Bees I encode the compressed extent start offset into the same
uint64_t as the physical extent start address using the bottom 6 bits
of the physical (bytenr) address:

	https://github.com/Zygo/bees/blob/master/src/bees-types.cc#L744

This fills in an object which uniquely (and reversibly) identifies
the block on the filesystem.
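
The general idea of that encoding can be illustrated as follows.  This
is only a sketch of the bit-packing trick, not the actual BeesAddress
layout: it assumes bytenr is aligned to a 4096-byte block (so its low
bits are free) and that the offset is stored as a block index, and the
names pack_address and unpack_address are made up for the example.

```python
BLOCK_SIZE = 4096                       # assumed extent alignment
OFFSET_BITS = 6                         # low bits reused for the offset
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def pack_address(bytenr, block_index=0):
    """Combine an extent's bytenr and a block index into one integer."""
    if bytenr % BLOCK_SIZE:
        raise ValueError("bytenr must be block-aligned")
    if not 0 <= block_index <= OFFSET_MASK:
        raise ValueError("block index does not fit in the offset bits")
    return bytenr | block_index


def unpack_address(addr):
    """Reverse pack_address: return (bytenr, block_index)."""
    return addr & ~OFFSET_MASK, addr & OFFSET_MASK
```

For example, pack_address(12582912, 5) gives 12582917, and
unpack_address(12582917) gives back (12582912, 5).  Because the
alignment guarantees the low bits of bytenr are zero, the packing
loses no information.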

The raw btrfs extent data is extracted here:

	https://github.com/Zygo/bees/blob/master/lib/extentwalker.cc#L533

BeesAddress gives no false positives, but it's built on top of hundreds
of lines of userspace support code.  :-/

> Thank you very much for the detailed explanation !
