From: Luis Henriques <luis@igalia.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>,
linux-fsdevel@vger.kernel.org,
Bernd Schubert <bschubert@ddn.com>,
Florian Weimer <fweimer@redhat.com>
Subject: Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
Date: Wed, 06 Aug 2025 20:48:41 +0100 [thread overview]
Message-ID: <87ldnw44fa.fsf@wotan.olymp> (raw)
In-Reply-To: <20250806160142.GF2672029@frogsfrogsfrogs> (Darrick J. Wong's message of "Wed, 6 Aug 2025 09:01:42 -0700")
On Wed, Aug 06 2025, Darrick J. Wong wrote:
> On Wed, Aug 06, 2025 at 10:17:06AM +0100, Luis Henriques wrote:
>> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>>
>> > The FUSE protocol uses struct fuse_write_out to convey the return value of
>> > copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
>> > interface supports a 64-bit size copies.
>> >
>> > Currently the number of bytes copied is silently truncated to 32-bit, which
>> > is unfortunate at best.
>> >
>> > Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
>> > number of bytes copied is returned in a 64-bit value.
>> >
>> > If the fuse server does not support COPY_FILE_RANGE_64, fall back to
>> > COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
>>
>> I was wondering if it wouldn't make more sense to truncate the size to
>> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
>> correctly (which is probably a big 'if'!), the VFS will fallback to
>> splice() if the file system does not implement copy_file_range. And in
>> this case splice() seems to limit the operation to MAX_RW_COUNT.
>
> It doesn't, because copy_file_range implementations can do other things
> (like remapping/reflinking file blocks) that produce a very small amount
> of disk IO for what is effectively a very large change to file contents.
> That's why the VFS doesn't cap len at MAX_RW_COUNT bytes.
Oh, OK. So looks like I misunderstood that code. In vfs_copy_file_range(),
I assumed that the fallback to splice ('splice = true;') would cap the IO
with the following:
ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
min_t(size_t, len, MAX_RW_COUNT), 0);
And that's why I suggested to do the same here instead of UINT_MAX - 4096.
Cheers,
--
Luís
> --D
>
>> Cheers,
>> --
>> Luís
>>
>>
>> > Reported-by: Florian Weimer <fweimer@redhat.com>
>> > Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
>> > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> > ---
>> > fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
>> > fs/fuse/fuse_i.h | 3 +++
>> > include/uapi/linux/fuse.h | 12 +++++++++++-
>> > 3 files changed, 40 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>> > index adc4aa6810f5..bd6624885855 100644
>> > --- a/fs/fuse/file.c
>> > +++ b/fs/fuse/file.c
>> > @@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
>> > .flags = flags
>> > };
>> > struct fuse_write_out outarg;
>> > + struct fuse_copy_file_range_out outarg_64;
>> > + u64 bytes_copied;
>> > ssize_t err;
>> > /* mark unstable when write-back is not used, and file_out gets
>> > * extended */
>> > @@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
>> > if (is_unstable)
>> > set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
>> >
>> > - args.opcode = FUSE_COPY_FILE_RANGE;
>> > + args.opcode = FUSE_COPY_FILE_RANGE_64;
>> > args.nodeid = ff_in->nodeid;
>> > args.in_numargs = 1;
>> > args.in_args[0].size = sizeof(inarg);
>> > args.in_args[0].value = &inarg;
>> > args.out_numargs = 1;
>> > - args.out_args[0].size = sizeof(outarg);
>> > - args.out_args[0].value = &outarg;
>> > + args.out_args[0].size = sizeof(outarg_64);
>> > + args.out_args[0].value = &outarg_64;
>> > + if (fc->no_copy_file_range_64) {
>> > +fallback:
>> > + /* Fall back to old op that can't handle large copy length */
>> > + args.opcode = FUSE_COPY_FILE_RANGE;
>> > + args.out_args[0].size = sizeof(outarg);
>> > + args.out_args[0].value = &outarg;
>> > + inarg.len = min_t(size_t, len, 0xfffff000);
>> > + }
>> > err = fuse_simple_request(fm, &args);
>> > if (err == -ENOSYS) {
>> > - fc->no_copy_file_range = 1;
>> > - err = -EOPNOTSUPP;
>> > + if (fc->no_copy_file_range_64) {
>> > + fc->no_copy_file_range = 1;
>> > + err = -EOPNOTSUPP;
>> > + } else {
>> > + fc->no_copy_file_range_64 = 1;
>> > + goto fallback;
>> > + }
>> > }
>> > if (err)
>> > goto out;
>> >
>> > + bytes_copied = fc->no_copy_file_range_64 ?
>> > + outarg.size : outarg_64.bytes_copied;
>> > +
>> > truncate_inode_pages_range(inode_out->i_mapping,
>> > ALIGN_DOWN(pos_out, PAGE_SIZE),
>> > - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
>> > + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
>> >
>> > file_update_time(file_out);
>> > - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
>> > + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
>> >
>> > - err = outarg.size;
>> > + err = bytes_copied;
>> > out:
>> > if (is_unstable)
>> > clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
>> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
>> > index b54f4f57789f..a8be19f686b1 100644
>> > --- a/fs/fuse/fuse_i.h
>> > +++ b/fs/fuse/fuse_i.h
>> > @@ -850,6 +850,9 @@ struct fuse_conn {
>> > /** Does the filesystem support copy_file_range? */
>> > unsigned no_copy_file_range:1;
>> >
>> > + /** Does the filesystem support copy_file_range_64? */
>> > + unsigned no_copy_file_range_64:1;
>> > +
>> > /* Send DESTROY request */
>> > unsigned int destroy:1;
>> >
>> > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
>> > index 122d6586e8d4..94621f68a5cc 100644
>> > --- a/include/uapi/linux/fuse.h
>> > +++ b/include/uapi/linux/fuse.h
>> > @@ -235,6 +235,10 @@
>> > *
>> > * 7.44
>> > * - add FUSE_NOTIFY_INC_EPOCH
>> > + *
>> > + * 7.45
>> > + * - add FUSE_COPY_FILE_RANGE_64
>> > + * - add struct fuse_copy_file_range_out
>> > */
>> >
>> > #ifndef _LINUX_FUSE_H
>> > @@ -270,7 +274,7 @@
>> > #define FUSE_KERNEL_VERSION 7
>> >
>> > /** Minor version number of this interface */
>> > -#define FUSE_KERNEL_MINOR_VERSION 44
>> > +#define FUSE_KERNEL_MINOR_VERSION 45
>> >
>> > /** The node ID of the root inode */
>> > #define FUSE_ROOT_ID 1
>> > @@ -657,6 +661,7 @@ enum fuse_opcode {
>> > FUSE_SYNCFS = 50,
>> > FUSE_TMPFILE = 51,
>> > FUSE_STATX = 52,
>> > + FUSE_COPY_FILE_RANGE_64 = 53,
>> >
>> > /* CUSE specific operations */
>> > CUSE_INIT = 4096,
>> > @@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
>> > uint64_t flags;
>> > };
>> >
>> > +/* For FUSE_COPY_FILE_RANGE_64 */
>> > +struct fuse_copy_file_range_out {
>> > + uint64_t bytes_copied;
>> > +};
>> > +
>> > #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
>> > #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
>> > struct fuse_setupmapping_in {
>> > --
>> > 2.49.0
>> >
>>
>>
next prev parent reply other threads:[~2025-08-06 19:48 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-05 18:30 [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Miklos Szeredi
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
2025-08-12 11:21 ` Miklos Szeredi
2025-08-15 14:22 ` Christian Brauner
2025-08-12 14:26 ` Amir Goldstein
2025-08-12 15:50 ` Miklos Szeredi
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
2025-08-06 16:01 ` Darrick J. Wong
2025-08-06 19:48 ` Luis Henriques [this message]
2025-08-11 15:45 ` Darrick J. Wong
2025-08-06 19:43 ` Bernd Schubert
2025-08-12 9:08 ` Chunsheng Luo
2025-08-12 19:49 ` Darrick J. Wong
2025-08-07 6:24 ` Chunsheng Luo
2025-08-11 15:47 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ldnw44fa.fsf@wotan.olymp \
--to=luis@igalia.com \
--cc=bschubert@ddn.com \
--cc=djwong@kernel.org \
--cc=fweimer@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=mszeredi@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.