From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Omar Sandoval <osandov@osandov.com>,
	linux-btrfs@vger.kernel.org, linux-xfs@vger.kernel.org,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Qu Wenruo <quwenruo@cn.fujitsu.com>,
	Christoph Hellwig <hch@infradead.org>,
	kernel-team@fb.com
Subject: Re: [RFC PATCH 0/2] Btrfs: make a source length of 0 imply EOF for dedupe
Date: Wed, 23 Nov 2016 08:55:59 -0500
Message-ID: <20161123135559.GC8685@hungrycats.org>
In-Reply-To: <20161123042632.GQ31101@dastard>

On Wed, Nov 23, 2016 at 03:26:32PM +1100, Dave Chinner wrote:
> On Tue, Nov 22, 2016 at 09:02:10PM -0500, Zygo Blaxell wrote:
> > On Thu, Nov 17, 2016 at 04:07:48PM -0800, Omar Sandoval wrote:
> > > 3. Both XFS and Btrfs cap each dedupe operation to 16MB, but the
> > >    implicit EOF gets around this in the existing XFS implementation. I
> > >    copied this for the Btrfs implementation.
> > 
> > Somewhat tangential to this patch, but on the dedup topic:  Can we raise
> > or drop that 16MB limit?
> > 
> > The maximum btrfs extent length is 128MB.  Currently the btrfs dedup
> > behavior for a 128MB extent is to generate 8x16MB shared extent references
> > with different extent offsets to a single 128MB physical extent.
> > These references no longer look like the original 128MB extent to a
> > userspace dedup tool.  That raises the difficulty level substantially
> > for a userspace dedup tool when it tries to figure out which extents to
> > keep and which to discard or rewrite.
> 
> That, IMO, is a btrfs design/implementation problem, not a problem
> with the API. Applications are always going to end up doing things
> that aren't perfectly aligned to extent boundaries or sizes
> regardless of the size limit that is placed on the dedupe ranges.

Given that XFS doesn't have all the problems btrfs does, why does XFS
have the same arbitrary size limit?  Especially since XFS demonstrably
doesn't need it?

> > XFS may not have this problem--I haven't checked.
> 
> It doesn't - it tracks shared blocks exactly and merges adjacent
> extent records whenever possible.
> 
> > Even if we want to keep the 16MB limit, there's also no way to query the
> > kernel from userspace to find out what the limit is, other than by trial
> > and error.  It's not even in a header file, userspace just has to *know*.
> 
> So add a define to the API to make it visible to applications and
> document it in the man page.

To answer some of my own questions on the btrfs side:  It looks like
the btrfs implementation does have a reason for the limit (fixed-size
arrays).
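
For illustration, here is a minimal sketch (Python, not taken from any
existing tool) of the splitting a userspace dedup tool ends up doing
today.  The 16MB value is an assumption taken from this thread, since no
header exports it; split_dedupe_range and ASSUMED_DEDUPE_MAX are
hypothetical names, and the sketch only does the arithmetic, it does not
issue any ioctl.

```python
MiB = 1024 * 1024

# Assumption: per-call dedupe cap of 16 MiB, as described in this thread.
# It is not exported by any header, so userspace has to hard-code it.
ASSUMED_DEDUPE_MAX = 16 * MiB


def split_dedupe_range(offset, length, max_len=ASSUMED_DEDUPE_MAX):
    """Return (offset, length) pieces covering [offset, offset + length).

    Every piece is at most max_len bytes long; only the last piece may
    be shorter.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if max_len <= 0:
        raise ValueError("max_len must be positive")

    pieces = []
    end = offset + length
    while offset < end:
        step = min(max_len, end - offset)
        pieces.append((offset, step))
        offset += step
    return pieces


if __name__ == "__main__":
    # One maximum-size 128 MiB btrfs extent, deduped in capped pieces.
    for off, n in split_dedupe_range(0, 128 * MiB):
        print(f"offset={off // MiB}MiB length={n // MiB}MiB")
```

For a 128MB extent this yields eight 16MB pieces at offsets 0, 16MB,
..., 112MB, which matches the 8x16MB references described above.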

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
