From: "Darrick J. Wong" <djwong@kernel.org>
To: Zbigniew Halas <zhalas@gmail.com>
Cc: Amir Goldstein <amir73il@gmail.com>,
Filipe Manana <fdmanana@kernel.org>,
viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org
Subject: Re: FIDEDUPERANGE claims to succeed for non-identical files
Date: Wed, 4 Jan 2023 09:25:35 -0800 [thread overview]
Message-ID: <Y7W2j0yFT3Y0GLR2@magnolia> (raw)
In-Reply-To: <CAPr0N2gtz79Z1fNmOc_UHjQrZfqUwzx2rJ7+4X0jFbMAAoh3-Q@mail.gmail.com>
On Thu, Dec 22, 2022 at 03:41:30PM +0100, Zbigniew Halas wrote:
> On Thu, Dec 22, 2022 at 9:25 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > Thanks for the analysis.
> > Would you be interested in trying to fix the bug and writing a test?
> > I can help if you would like.
>
> I can give it a try unless it turns out that some deep VFS changes are
> required, but let's try to narrow down the reasonable API semantics
> first.
>
> > It's hard to follow all the changes since
> > 54dbc1517237 ("vfs: hoist the btrfs deduplication ioctl to the vfs")
> > in v4.5, but it *looks* like this behavior might have been in btrfs,
> > before the ioctl was promoted to vfs.. not sure.
> >
> > We have fstests coverage for the "good" case of same size src/dst
> > (generic/136), but I didn't find a test for the non-same size src/dst.
> >
> > In any case, vfs_dedupe_file_range_one() and ->remap_file_range()
> > do not even have an interface to return the actual bytes_deduped,
The number of bytes remapped is passed back via the loff_t return value
of ->remap_file_range. If CAN_SHORTEN is set, the VFS and the fs
implementation are allowed to reduce the @len parameter as much as they
want. TBH I'm mystified why the original btrfs dedupe ioctl wouldn't
allow deduplication of common prefixes, which means that len only gets
shortened to avoid weird problems when dealing with eof being in the
middle of a block.
(Or not, since there's clearly a bug.)
> > so I do not see how any of the REMAP_FILE_CAN_SHORTEN cases
> > are valid, regardless of EOF.
>
> Not sure about this, it looks to me that they are actually returning
> the number of bytes deduped, but the value is not used, but maybe I'm
> missing something.
> Anyway I think there are valid cases when REMAP_FILE_CAN_SHORTEN makes sense.
> For example if a source file content is a prefix of a destination file
> content and we want to dedup the whole range of the source file
> without REMAP_FILE_CAN_SHORTEN,
> then the ioctl will only succeed when the end of the source file is at
> the block boundary, otherwise it will just fail. This will render the
> API very inconsistent.
<nod> I'll try to figure out where the len adjusting went wrong here and
report back.
--D
> Cheers,
> Zbigniew
next prev parent reply other threads:[~2023-01-04 17:25 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-20 0:10 FIDEDUPERANGE claims to succeed for non-identical files Zbigniew Halas
2022-12-22 8:25 ` Amir Goldstein
2022-12-22 14:41 ` Zbigniew Halas
2023-01-04 17:25 ` Darrick J. Wong [this message]
2023-01-04 19:36 ` Zbigniew Halas
2023-02-04 2:23 ` Darrick J. Wong
2023-02-04 7:00 ` Amir Goldstein
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y7W2j0yFT3Y0GLR2@magnolia \
--to=djwong@kernel.org \
--cc=amir73il@gmail.com \
--cc=fdmanana@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
--cc=zhalas@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox