public inbox for linux-btrfs@vger.kernel.org
From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: ov2k <ov2k.github@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: FIDEDUPERANGE and compression
Date: Fri, 11 Mar 2022 21:47:40 -0500
Message-ID: <YiwJzPEk6xfrjdx/@hungrycats.org>
In-Reply-To: <CADwZqEs8PHvmGAg4=+qwiQgrY1gFksoNkLZi3rne7uTFzZhoeA@mail.gmail.com>

On Wed, Mar 09, 2022 at 03:04:40PM -0500, ov2k wrote:
> On Sat, Mar 5, 2022 at 11:44 PM Zygo Blaxell
> <ce3g8jdj@umail.furryterror.org> wrote:
> >
> > On Mon, Feb 21, 2022 at 05:31:13PM -0500, ov2k wrote:
> > > It looks like btrfs coalesces adjacent uncompressed extents.  I'm not
> > > sure whether this is done by FIDEDUPERANGE or FS_IOC_FIEMAP.  I think
> > > the problem is that adjacent decompressed ranges (defined by #3 and
> > > #4) within the same compressed block are not coalesced in a similar
> > > manner.  Is there a particular reason why this isn't done, or is this
> > > simply a case of nobody having done it?
> >
> > It hasn't been done because FIEMAP can't produce results for compressed
> > extents that aren't nonsense.  The interface can't cope with compressed
> > data.
> >
> 
> I think there's a misunderstanding here.  The issue isn't making FS_IOC_FIEMAP
> represent compressed data sensibly.  The goal is for btrfs_fiemap() to handle
> adjacent subranges of a compressed extent in much the same way as it handles
> adjacent uncompressed extents.  The result should be no more or less
> nonsensical than it already is.
[...]
> I'm talking about emitting a single struct fiemap_extent that corresponds to
> two adjacent subranges of the same compressed btrfs extent.  The two btrfs
> extents would simply have to satisfy:
> 
>         extent 1 #2 (bytenr) == extent 2 #2 (bytenr)
> 
>         extent 1 #1 (seek offset) + extent 1 #3 (decompressed subrange length)
>         == extent 2 #1 (seek offset)
> 
>         extent 1 #4 (decompressed subrange offset) + extent 1 #3 (decompressed
>         subrange length) == extent 2 #4 (decompressed subrange offset)
> 
> The resulting struct fiemap_extent would have:
> 
>         fe_logical: extent 1 #1 (seek offset)
> 
>         fe_physical: extent 1 #2 (bytenr)
> 
>         fe_length: extent 1 #3 (decompressed subrange length) + extent 2 #3
>         (decompressed subrange length)

OK, FIEMAP could handle that one special case.  And it is a frequently
requested feature--filefrag's physically-contiguous-extent counter report
doesn't work at all on compressed files, and it could work in the common
case of a simple sequential write (or reflink thereof).
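
For illustration, the merge rule quoted above can be written out as a
small Python sketch.  This is not btrfs_fiemap() code, only a model of
the three equalities and the resulting fiemap_extent fields.  The
FileExtent and FiemapExtent classes, their field names, and try_merge()
are hypothetical names made up for this example; the four FileExtent
fields stand in for the quantities labelled #1 through #4 in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileExtent:
    """Hypothetical view of one btrfs file extent reference (#1-#4 in the thread)."""
    file_offset: int    # #1: seek offset in the file
    bytenr: int         # #2: bytenr of the (compressed) extent
    length: int         # #3: decompressed subrange length
    extent_offset: int  # #4: decompressed subrange offset within the extent


@dataclass(frozen=True)
class FiemapExtent:
    """Only the three struct fiemap_extent fields discussed in the thread."""
    fe_logical: int
    fe_physical: int
    fe_length: int


def try_merge(a: FileExtent, b: FileExtent) -> Optional[FiemapExtent]:
    """Return one merged record if b directly continues a, else None."""
    same_extent = a.bytenr == b.bytenr
    contiguous_in_file = a.file_offset + a.length == b.file_offset
    contiguous_in_extent = a.extent_offset + a.length == b.extent_offset
    if same_extent and contiguous_in_file and contiguous_in_extent:
        return FiemapExtent(
            fe_logical=a.file_offset,
            fe_physical=a.bytenr,
            fe_length=a.length + b.length,
        )
    return None
```

Two references into the same compressed extent that continue each other
in both the file and the decompressed data merge into one record.  A
reference to a different bytenr, or one that skips part of the
decompressed data, does not.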

On the other hand, if you're trying to do dedupe on btrfs, you'll need
access to all the other extent fields to avoid bookending issues.

