* [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
@ 2025-08-05 18:30 Miklos Szeredi
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-05 18:30 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Bernd Schubert, Florian Weimer
The FUSE protocol uses struct fuse_write_out to convey the return value of
copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
interface supports a 64-bit size copies.
Currently the number of bytes copied is silently truncated to 32-bit, which
is unfortunate at best.
Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
number of bytes copied is returned in a 64-bit value.
If the fuse server does not support COPY_FILE_RANGE_64, fall back to
COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
Reported-by: Florian Weimer <fweimer@redhat.com>
Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
fs/fuse/fuse_i.h | 3 +++
include/uapi/linux/fuse.h | 12 +++++++++++-
3 files changed, 40 insertions(+), 9 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index adc4aa6810f5..bd6624885855 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
.flags = flags
};
struct fuse_write_out outarg;
+ struct fuse_copy_file_range_out outarg_64;
+ u64 bytes_copied;
ssize_t err;
/* mark unstable when write-back is not used, and file_out gets
* extended */
@@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
if (is_unstable)
set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
- args.opcode = FUSE_COPY_FILE_RANGE;
+ args.opcode = FUSE_COPY_FILE_RANGE_64;
args.nodeid = ff_in->nodeid;
args.in_numargs = 1;
args.in_args[0].size = sizeof(inarg);
args.in_args[0].value = &inarg;
args.out_numargs = 1;
- args.out_args[0].size = sizeof(outarg);
- args.out_args[0].value = &outarg;
+ args.out_args[0].size = sizeof(outarg_64);
+ args.out_args[0].value = &outarg_64;
+ if (fc->no_copy_file_range_64) {
+fallback:
+ /* Fall back to old op that can't handle large copy length */
+ args.opcode = FUSE_COPY_FILE_RANGE;
+ args.out_args[0].size = sizeof(outarg);
+ args.out_args[0].value = &outarg;
+ inarg.len = min_t(size_t, len, 0xfffff000);
+ }
err = fuse_simple_request(fm, &args);
if (err == -ENOSYS) {
- fc->no_copy_file_range = 1;
- err = -EOPNOTSUPP;
+ if (fc->no_copy_file_range_64) {
+ fc->no_copy_file_range = 1;
+ err = -EOPNOTSUPP;
+ } else {
+ fc->no_copy_file_range_64 = 1;
+ goto fallback;
+ }
}
if (err)
goto out;
+ bytes_copied = fc->no_copy_file_range_64 ?
+ outarg.size : outarg_64.bytes_copied;
+
truncate_inode_pages_range(inode_out->i_mapping,
ALIGN_DOWN(pos_out, PAGE_SIZE),
- ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
+ ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
file_update_time(file_out);
- fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
+ fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
- err = outarg.size;
+ err = bytes_copied;
out:
if (is_unstable)
clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index b54f4f57789f..a8be19f686b1 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -850,6 +850,9 @@ struct fuse_conn {
/** Does the filesystem support copy_file_range? */
unsigned no_copy_file_range:1;
+ /** Does the filesystem support copy_file_range_64? */
+ unsigned no_copy_file_range_64:1;
+
/* Send DESTROY request */
unsigned int destroy:1;
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 122d6586e8d4..94621f68a5cc 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -235,6 +235,10 @@
*
* 7.44
* - add FUSE_NOTIFY_INC_EPOCH
+ *
+ * 7.45
+ * - add FUSE_COPY_FILE_RANGE_64
+ * - add struct fuse_copy_file_range_out
*/
#ifndef _LINUX_FUSE_H
@@ -270,7 +274,7 @@
#define FUSE_KERNEL_VERSION 7
/** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 44
+#define FUSE_KERNEL_MINOR_VERSION 45
/** The node ID of the root inode */
#define FUSE_ROOT_ID 1
@@ -657,6 +661,7 @@ enum fuse_opcode {
FUSE_SYNCFS = 50,
FUSE_TMPFILE = 51,
FUSE_STATX = 52,
+ FUSE_COPY_FILE_RANGE_64 = 53,
/* CUSE specific operations */
CUSE_INIT = 4096,
@@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
uint64_t flags;
};
+/* For FUSE_COPY_FILE_RANGE_64 */
+struct fuse_copy_file_range_out {
+ uint64_t bytes_copied;
+};
+
#define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
#define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
struct fuse_setupmapping_in {
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH 2/2] copy_file_range: limit size if in compat mode
2025-08-05 18:30 [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Miklos Szeredi
@ 2025-08-05 18:30 ` Miklos Szeredi
2025-08-12 11:21 ` Miklos Szeredi
2025-08-12 14:26 ` Amir Goldstein
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
2025-08-07 6:24 ` Chunsheng Luo
2 siblings, 2 replies; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-05 18:30 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Bernd Schubert, Florian Weimer
If the process runs in 32-bit compat mode, copy_file_range results can be
in the in-band error range. In this case limit copy length to MAX_RW_COUNT
to prevent a signed overflow.
Reported-by: Florian Weimer <fweimer@redhat.com>
Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
fs/read_write.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/read_write.c b/fs/read_write.c
index 0ef70e128c4a..e2ccc44d96e6 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1576,6 +1576,10 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
if (len == 0)
return 0;
+ /* Make sure return value doesn't overflow in 32bit compat mode */
+ if (in_compat_syscall() && len > MAX_RW_COUNT)
+ len = MAX_RW_COUNT;
+
file_start_write(file_out);
/*
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-05 18:30 [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Miklos Szeredi
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
@ 2025-08-06 9:17 ` Luis Henriques
2025-08-06 16:01 ` Darrick J. Wong
2025-08-06 19:43 ` Bernd Schubert
2025-08-07 6:24 ` Chunsheng Luo
2 siblings, 2 replies; 15+ messages in thread
From: Luis Henriques @ 2025-08-06 9:17 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel, Bernd Schubert, Florian Weimer
On Tue, Aug 05 2025, Miklos Szeredi wrote:
> The FUSE protocol uses struct fuse_write_out to convey the return value of
> copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
> interface supports a 64-bit size copies.
>
> Currently the number of bytes copied is silently truncated to 32-bit, which
> is unfortunate at best.
>
> Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
> number of bytes copied is returned in a 64-bit value.
>
> If the fuse server does not support COPY_FILE_RANGE_64, fall back to
> COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
I was wondering if it wouldn't make more sense to truncate the size to
MAX_RW_COUNT instead. My reasoning is that, if I understand the code
correctly (which is probably a big 'if'!), the VFS will fallback to
splice() if the file system does not implement copy_file_range. And in
this case splice() seems to limit the operation to MAX_RW_COUNT.
Cheers,
--
Luís
> Reported-by: Florian Weimer <fweimer@redhat.com>
> Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
> fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
> fs/fuse/fuse_i.h | 3 +++
> include/uapi/linux/fuse.h | 12 +++++++++++-
> 3 files changed, 40 insertions(+), 9 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index adc4aa6810f5..bd6624885855 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> .flags = flags
> };
> struct fuse_write_out outarg;
> + struct fuse_copy_file_range_out outarg_64;
> + u64 bytes_copied;
> ssize_t err;
> /* mark unstable when write-back is not used, and file_out gets
> * extended */
> @@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> if (is_unstable)
> set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
>
> - args.opcode = FUSE_COPY_FILE_RANGE;
> + args.opcode = FUSE_COPY_FILE_RANGE_64;
> args.nodeid = ff_in->nodeid;
> args.in_numargs = 1;
> args.in_args[0].size = sizeof(inarg);
> args.in_args[0].value = &inarg;
> args.out_numargs = 1;
> - args.out_args[0].size = sizeof(outarg);
> - args.out_args[0].value = &outarg;
> + args.out_args[0].size = sizeof(outarg_64);
> + args.out_args[0].value = &outarg_64;
> + if (fc->no_copy_file_range_64) {
> +fallback:
> + /* Fall back to old op that can't handle large copy length */
> + args.opcode = FUSE_COPY_FILE_RANGE;
> + args.out_args[0].size = sizeof(outarg);
> + args.out_args[0].value = &outarg;
> + inarg.len = min_t(size_t, len, 0xfffff000);
> + }
> err = fuse_simple_request(fm, &args);
> if (err == -ENOSYS) {
> - fc->no_copy_file_range = 1;
> - err = -EOPNOTSUPP;
> + if (fc->no_copy_file_range_64) {
> + fc->no_copy_file_range = 1;
> + err = -EOPNOTSUPP;
> + } else {
> + fc->no_copy_file_range_64 = 1;
> + goto fallback;
> + }
> }
> if (err)
> goto out;
>
> + bytes_copied = fc->no_copy_file_range_64 ?
> + outarg.size : outarg_64.bytes_copied;
> +
> truncate_inode_pages_range(inode_out->i_mapping,
> ALIGN_DOWN(pos_out, PAGE_SIZE),
> - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
> + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
>
> file_update_time(file_out);
> - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
> + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
>
> - err = outarg.size;
> + err = bytes_copied;
> out:
> if (is_unstable)
> clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index b54f4f57789f..a8be19f686b1 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -850,6 +850,9 @@ struct fuse_conn {
> /** Does the filesystem support copy_file_range? */
> unsigned no_copy_file_range:1;
>
> + /** Does the filesystem support copy_file_range_64? */
> + unsigned no_copy_file_range_64:1;
> +
> /* Send DESTROY request */
> unsigned int destroy:1;
>
> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> index 122d6586e8d4..94621f68a5cc 100644
> --- a/include/uapi/linux/fuse.h
> +++ b/include/uapi/linux/fuse.h
> @@ -235,6 +235,10 @@
> *
> * 7.44
> * - add FUSE_NOTIFY_INC_EPOCH
> + *
> + * 7.45
> + * - add FUSE_COPY_FILE_RANGE_64
> + * - add struct fuse_copy_file_range_out
> */
>
> #ifndef _LINUX_FUSE_H
> @@ -270,7 +274,7 @@
> #define FUSE_KERNEL_VERSION 7
>
> /** Minor version number of this interface */
> -#define FUSE_KERNEL_MINOR_VERSION 44
> +#define FUSE_KERNEL_MINOR_VERSION 45
>
> /** The node ID of the root inode */
> #define FUSE_ROOT_ID 1
> @@ -657,6 +661,7 @@ enum fuse_opcode {
> FUSE_SYNCFS = 50,
> FUSE_TMPFILE = 51,
> FUSE_STATX = 52,
> + FUSE_COPY_FILE_RANGE_64 = 53,
>
> /* CUSE specific operations */
> CUSE_INIT = 4096,
> @@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
> uint64_t flags;
> };
>
> +/* For FUSE_COPY_FILE_RANGE_64 */
> +struct fuse_copy_file_range_out {
> + uint64_t bytes_copied;
> +};
> +
> #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
> #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
> struct fuse_setupmapping_in {
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
@ 2025-08-06 16:01 ` Darrick J. Wong
2025-08-06 19:48 ` Luis Henriques
2025-08-06 19:43 ` Bernd Schubert
1 sibling, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-06 16:01 UTC (permalink / raw)
To: Luis Henriques
Cc: Miklos Szeredi, linux-fsdevel, Bernd Schubert, Florian Weimer
On Wed, Aug 06, 2025 at 10:17:06AM +0100, Luis Henriques wrote:
> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>
> > The FUSE protocol uses struct fuse_write_out to convey the return value of
> > copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
> > interface supports a 64-bit size copies.
> >
> > Currently the number of bytes copied is silently truncated to 32-bit, which
> > is unfortunate at best.
> >
> > Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
> > number of bytes copied is returned in a 64-bit value.
> >
> > If the fuse server does not support COPY_FILE_RANGE_64, fall back to
> > COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
>
> I was wondering if it wouldn't make more sense to truncate the size to
> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
> correctly (which is probably a big 'if'!), the VFS will fallback to
> splice() if the file system does not implement copy_file_range. And in
> this case splice() seems to limit the operation to MAX_RW_COUNT.
It doesn't, because copy_file_range implementations can do other things
(like remapping/reflinking file blocks) that produce a very small amount
of disk IO for what is effectively a very large change to file contents.
That's why the VFS doesn't cap len at MAX_RW_COUNT bytes.
--D
> Cheers,
> --
> Luís
>
>
> > Reported-by: Florian Weimer <fweimer@redhat.com>
> > Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
> > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> > ---
> > fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
> > fs/fuse/fuse_i.h | 3 +++
> > include/uapi/linux/fuse.h | 12 +++++++++++-
> > 3 files changed, 40 insertions(+), 9 deletions(-)
> >
> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> > index adc4aa6810f5..bd6624885855 100644
> > --- a/fs/fuse/file.c
> > +++ b/fs/fuse/file.c
> > @@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> > .flags = flags
> > };
> > struct fuse_write_out outarg;
> > + struct fuse_copy_file_range_out outarg_64;
> > + u64 bytes_copied;
> > ssize_t err;
> > /* mark unstable when write-back is not used, and file_out gets
> > * extended */
> > @@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> > if (is_unstable)
> > set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> >
> > - args.opcode = FUSE_COPY_FILE_RANGE;
> > + args.opcode = FUSE_COPY_FILE_RANGE_64;
> > args.nodeid = ff_in->nodeid;
> > args.in_numargs = 1;
> > args.in_args[0].size = sizeof(inarg);
> > args.in_args[0].value = &inarg;
> > args.out_numargs = 1;
> > - args.out_args[0].size = sizeof(outarg);
> > - args.out_args[0].value = &outarg;
> > + args.out_args[0].size = sizeof(outarg_64);
> > + args.out_args[0].value = &outarg_64;
> > + if (fc->no_copy_file_range_64) {
> > +fallback:
> > + /* Fall back to old op that can't handle large copy length */
> > + args.opcode = FUSE_COPY_FILE_RANGE;
> > + args.out_args[0].size = sizeof(outarg);
> > + args.out_args[0].value = &outarg;
> > + inarg.len = min_t(size_t, len, 0xfffff000);
> > + }
> > err = fuse_simple_request(fm, &args);
> > if (err == -ENOSYS) {
> > - fc->no_copy_file_range = 1;
> > - err = -EOPNOTSUPP;
> > + if (fc->no_copy_file_range_64) {
> > + fc->no_copy_file_range = 1;
> > + err = -EOPNOTSUPP;
> > + } else {
> > + fc->no_copy_file_range_64 = 1;
> > + goto fallback;
> > + }
> > }
> > if (err)
> > goto out;
> >
> > + bytes_copied = fc->no_copy_file_range_64 ?
> > + outarg.size : outarg_64.bytes_copied;
> > +
> > truncate_inode_pages_range(inode_out->i_mapping,
> > ALIGN_DOWN(pos_out, PAGE_SIZE),
> > - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
> > + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
> >
> > file_update_time(file_out);
> > - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
> > + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
> >
> > - err = outarg.size;
> > + err = bytes_copied;
> > out:
> > if (is_unstable)
> > clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index b54f4f57789f..a8be19f686b1 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -850,6 +850,9 @@ struct fuse_conn {
> > /** Does the filesystem support copy_file_range? */
> > unsigned no_copy_file_range:1;
> >
> > + /** Does the filesystem support copy_file_range_64? */
> > + unsigned no_copy_file_range_64:1;
> > +
> > /* Send DESTROY request */
> > unsigned int destroy:1;
> >
> > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> > index 122d6586e8d4..94621f68a5cc 100644
> > --- a/include/uapi/linux/fuse.h
> > +++ b/include/uapi/linux/fuse.h
> > @@ -235,6 +235,10 @@
> > *
> > * 7.44
> > * - add FUSE_NOTIFY_INC_EPOCH
> > + *
> > + * 7.45
> > + * - add FUSE_COPY_FILE_RANGE_64
> > + * - add struct fuse_copy_file_range_out
> > */
> >
> > #ifndef _LINUX_FUSE_H
> > @@ -270,7 +274,7 @@
> > #define FUSE_KERNEL_VERSION 7
> >
> > /** Minor version number of this interface */
> > -#define FUSE_KERNEL_MINOR_VERSION 44
> > +#define FUSE_KERNEL_MINOR_VERSION 45
> >
> > /** The node ID of the root inode */
> > #define FUSE_ROOT_ID 1
> > @@ -657,6 +661,7 @@ enum fuse_opcode {
> > FUSE_SYNCFS = 50,
> > FUSE_TMPFILE = 51,
> > FUSE_STATX = 52,
> > + FUSE_COPY_FILE_RANGE_64 = 53,
> >
> > /* CUSE specific operations */
> > CUSE_INIT = 4096,
> > @@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
> > uint64_t flags;
> > };
> >
> > +/* For FUSE_COPY_FILE_RANGE_64 */
> > +struct fuse_copy_file_range_out {
> > + uint64_t bytes_copied;
> > +};
> > +
> > #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
> > #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
> > struct fuse_setupmapping_in {
> > --
> > 2.49.0
> >
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
2025-08-06 16:01 ` Darrick J. Wong
@ 2025-08-06 19:43 ` Bernd Schubert
2025-08-12 9:08 ` Chunsheng Luo
1 sibling, 1 reply; 15+ messages in thread
From: Bernd Schubert @ 2025-08-06 19:43 UTC (permalink / raw)
To: Luis Henriques, Miklos Szeredi
Cc: linux-fsdevel@vger.kernel.org, Florian Weimer
On 8/6/25 11:17, Luis Henriques wrote:
> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>
>> The FUSE protocol uses struct fuse_write_out to convey the return value of
>> copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
>> interface supports a 64-bit size copies.
>>
>> Currently the number of bytes copied is silently truncated to 32-bit, which
>> is unfortunate at best.
>>
>> Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
>> number of bytes copied is returned in a 64-bit value.
>>
>> If the fuse server does not support COPY_FILE_RANGE_64, fall back to
>> COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
>
> I was wondering if it wouldn't make more sense to truncate the size to
> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
> correctly (which is probably a big 'if'!), the VFS will fallback to
> splice() if the file system does not implement copy_file_range. And in
> this case splice() seems to limit the operation to MAX_RW_COUNT.
I personally don't like the hard coded value too much and would use
inarg.len = min_t(size_t, len, (UINT_MAX - 4096));
(btw, 0xfffff000 is UINT_MAX - 4095, isn't it?).
Though that is all nitpick. The worst part that could happen are
applications that do not handle partial file copy and then wouldn't
copy the entire file. For these it probably would be better to return
-ENOSYS.
LGTM,
Reviewed-by: Bernd Schubert <bschubert@ddn.com>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-06 16:01 ` Darrick J. Wong
@ 2025-08-06 19:48 ` Luis Henriques
2025-08-11 15:45 ` Darrick J. Wong
0 siblings, 1 reply; 15+ messages in thread
From: Luis Henriques @ 2025-08-06 19:48 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Miklos Szeredi, linux-fsdevel, Bernd Schubert, Florian Weimer
On Wed, Aug 06 2025, Darrick J. Wong wrote:
> On Wed, Aug 06, 2025 at 10:17:06AM +0100, Luis Henriques wrote:
>> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>>
>> > The FUSE protocol uses struct fuse_write_out to convey the return value of
>> > copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
>> > interface supports a 64-bit size copies.
>> >
>> > Currently the number of bytes copied is silently truncated to 32-bit, which
>> > is unfortunate at best.
>> >
>> > Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
>> > number of bytes copied is returned in a 64-bit value.
>> >
>> > If the fuse server does not support COPY_FILE_RANGE_64, fall back to
>> > COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
>>
>> I was wondering if it wouldn't make more sense to truncate the size to
>> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
>> correctly (which is probably a big 'if'!), the VFS will fallback to
>> splice() if the file system does not implement copy_file_range. And in
>> this case splice() seems to limit the operation to MAX_RW_COUNT.
>
> It doesn't, because copy_file_range implementations can do other things
> (like remapping/reflinking file blocks) that produce a very small amount
> of disk IO for what is effectively a very large change to file contents.
> That's why the VFS doesn't cap len at MAX_RW_COUNT bytes.
Oh, OK. So looks like I misunderstood that code. In vfs_copy_file_range(),
I assumed that the fallback to splice ('splice = true;') would cap the IO
with the following:
ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
min_t(size_t, len, MAX_RW_COUNT), 0);
And that's why I suggested to do the same here instead of UINT_MAX - 4096.
Cheers,
--
Luís
> --D
>
>> Cheers,
>> --
>> Luís
>>
>>
>> > Reported-by: Florian Weimer <fweimer@redhat.com>
>> > Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
>> > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> > ---
>> > fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
>> > fs/fuse/fuse_i.h | 3 +++
>> > include/uapi/linux/fuse.h | 12 +++++++++++-
>> > 3 files changed, 40 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>> > index adc4aa6810f5..bd6624885855 100644
>> > --- a/fs/fuse/file.c
>> > +++ b/fs/fuse/file.c
>> > @@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
>> > .flags = flags
>> > };
>> > struct fuse_write_out outarg;
>> > + struct fuse_copy_file_range_out outarg_64;
>> > + u64 bytes_copied;
>> > ssize_t err;
>> > /* mark unstable when write-back is not used, and file_out gets
>> > * extended */
>> > @@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
>> > if (is_unstable)
>> > set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
>> >
>> > - args.opcode = FUSE_COPY_FILE_RANGE;
>> > + args.opcode = FUSE_COPY_FILE_RANGE_64;
>> > args.nodeid = ff_in->nodeid;
>> > args.in_numargs = 1;
>> > args.in_args[0].size = sizeof(inarg);
>> > args.in_args[0].value = &inarg;
>> > args.out_numargs = 1;
>> > - args.out_args[0].size = sizeof(outarg);
>> > - args.out_args[0].value = &outarg;
>> > + args.out_args[0].size = sizeof(outarg_64);
>> > + args.out_args[0].value = &outarg_64;
>> > + if (fc->no_copy_file_range_64) {
>> > +fallback:
>> > + /* Fall back to old op that can't handle large copy length */
>> > + args.opcode = FUSE_COPY_FILE_RANGE;
>> > + args.out_args[0].size = sizeof(outarg);
>> > + args.out_args[0].value = &outarg;
>> > + inarg.len = min_t(size_t, len, 0xfffff000);
>> > + }
>> > err = fuse_simple_request(fm, &args);
>> > if (err == -ENOSYS) {
>> > - fc->no_copy_file_range = 1;
>> > - err = -EOPNOTSUPP;
>> > + if (fc->no_copy_file_range_64) {
>> > + fc->no_copy_file_range = 1;
>> > + err = -EOPNOTSUPP;
>> > + } else {
>> > + fc->no_copy_file_range_64 = 1;
>> > + goto fallback;
>> > + }
>> > }
>> > if (err)
>> > goto out;
>> >
>> > + bytes_copied = fc->no_copy_file_range_64 ?
>> > + outarg.size : outarg_64.bytes_copied;
>> > +
>> > truncate_inode_pages_range(inode_out->i_mapping,
>> > ALIGN_DOWN(pos_out, PAGE_SIZE),
>> > - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
>> > + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
>> >
>> > file_update_time(file_out);
>> > - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
>> > + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
>> >
>> > - err = outarg.size;
>> > + err = bytes_copied;
>> > out:
>> > if (is_unstable)
>> > clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
>> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
>> > index b54f4f57789f..a8be19f686b1 100644
>> > --- a/fs/fuse/fuse_i.h
>> > +++ b/fs/fuse/fuse_i.h
>> > @@ -850,6 +850,9 @@ struct fuse_conn {
>> > /** Does the filesystem support copy_file_range? */
>> > unsigned no_copy_file_range:1;
>> >
>> > + /** Does the filesystem support copy_file_range_64? */
>> > + unsigned no_copy_file_range_64:1;
>> > +
>> > /* Send DESTROY request */
>> > unsigned int destroy:1;
>> >
>> > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
>> > index 122d6586e8d4..94621f68a5cc 100644
>> > --- a/include/uapi/linux/fuse.h
>> > +++ b/include/uapi/linux/fuse.h
>> > @@ -235,6 +235,10 @@
>> > *
>> > * 7.44
>> > * - add FUSE_NOTIFY_INC_EPOCH
>> > + *
>> > + * 7.45
>> > + * - add FUSE_COPY_FILE_RANGE_64
>> > + * - add struct fuse_copy_file_range_out
>> > */
>> >
>> > #ifndef _LINUX_FUSE_H
>> > @@ -270,7 +274,7 @@
>> > #define FUSE_KERNEL_VERSION 7
>> >
>> > /** Minor version number of this interface */
>> > -#define FUSE_KERNEL_MINOR_VERSION 44
>> > +#define FUSE_KERNEL_MINOR_VERSION 45
>> >
>> > /** The node ID of the root inode */
>> > #define FUSE_ROOT_ID 1
>> > @@ -657,6 +661,7 @@ enum fuse_opcode {
>> > FUSE_SYNCFS = 50,
>> > FUSE_TMPFILE = 51,
>> > FUSE_STATX = 52,
>> > + FUSE_COPY_FILE_RANGE_64 = 53,
>> >
>> > /* CUSE specific operations */
>> > CUSE_INIT = 4096,
>> > @@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
>> > uint64_t flags;
>> > };
>> >
>> > +/* For FUSE_COPY_FILE_RANGE_64 */
>> > +struct fuse_copy_file_range_out {
>> > + uint64_t bytes_copied;
>> > +};
>> > +
>> > #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
>> > #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
>> > struct fuse_setupmapping_in {
>> > --
>> > 2.49.0
>> >
>>
>>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-05 18:30 [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Miklos Szeredi
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
@ 2025-08-07 6:24 ` Chunsheng Luo
2025-08-11 15:47 ` Darrick J. Wong
2 siblings, 1 reply; 15+ messages in thread
From: Chunsheng Luo @ 2025-08-07 6:24 UTC (permalink / raw)
To: mszeredi; +Cc: bschubert, fweimer, linux-fsdevel
On Thu, Aug 07 2025, Chunsheng Luo wrote:
> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>
> + bytes_copied = fc->no_copy_file_range_64 ?
> + outarg.size : outarg_64.bytes_copied;
> +
> truncate_inode_pages_range(inode_out->i_mapping,
> ALIGN_DOWN(pos_out, PAGE_SIZE),
> - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
> + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
>
> file_update_time(file_out);
> - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
> + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
The copy_file_range syscall returns bytes_copied, a value provided by the userspace filesystem that the kernel cannot control. If bytes_copied > len, how should the application handle this? Similarly, if pos_out + bytes_copied < pos_outdue to integer overflow, could this cause any issues? Since vfs_copy_file_range->generic_copy_file_checks already check that pos_out + len does not overflow, so just need check bytes_copied > len.
Thanks
Chunsheng Luo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-06 19:48 ` Luis Henriques
@ 2025-08-11 15:45 ` Darrick J. Wong
0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-11 15:45 UTC (permalink / raw)
To: Luis Henriques
Cc: Miklos Szeredi, linux-fsdevel, Bernd Schubert, Florian Weimer
On Wed, Aug 06, 2025 at 08:48:41PM +0100, Luis Henriques wrote:
> On Wed, Aug 06 2025, Darrick J. Wong wrote:
>
> > On Wed, Aug 06, 2025 at 10:17:06AM +0100, Luis Henriques wrote:
> >> On Tue, Aug 05 2025, Miklos Szeredi wrote:
> >>
> >> > The FUSE protocol uses struct fuse_write_out to convey the return value of
> >> > copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
> >> > interface supports a 64-bit size copies.
> >> >
> >> > Currently the number of bytes copied is silently truncated to 32-bit, which
> >> > is unfortunate at best.
> >> >
> >> > Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
> >> > number of bytes copied is returned in a 64-bit value.
> >> >
> >> > If the fuse server does not support COPY_FILE_RANGE_64, fall back to
> >> > COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
> >>
> >> I was wondering if it wouldn't make more sense to truncate the size to
> >> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
> >> correctly (which is probably a big 'if'!), the VFS will fallback to
> >> splice() if the file system does not implement copy_file_range. And in
> >> this case splice() seems to limit the operation to MAX_RW_COUNT.
> >
> > It doesn't, because copy_file_range implementations can do other things
> > (like remapping/reflinking file blocks) that produce a very small amount
> > of disk IO for what is effectively a very large change to file contents.
> > That's why the VFS doesn't cap len at MAX_RW_COUNT bytes.
>
> Oh, OK. So looks like I misunderstood that code. In vfs_copy_file_range(),
> I assumed that the fallback to splice ('splice = true;') would cap the IO
> with the following:
>
> ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> min_t(size_t, len, MAX_RW_COUNT), 0);
>
> And that's why I suggested to do the same here instead of UINT_MAX - 4096.
(/me stumbles back in after FOSSY)
Yeah -- splice actually /does/ dirty pages, because it's actually doing
a buffer copy in a (hopefully) more efficiently than havng the userspace
process do malloc/read/write and endure syscall overhead. That's why
it's ok to limit a splice to MAX_RW_COUNT, because it's basically a
write().
--D
> Cheers,
> --
> Luís
>
>
> > --D
> >
> >> Cheers,
> >> --
> >> Luís
> >>
> >>
> >> > Reported-by: Florian Weimer <fweimer@redhat.com>
> >> > Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
> >> > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> >> > ---
> >> > fs/fuse/file.c | 34 ++++++++++++++++++++++++++--------
> >> > fs/fuse/fuse_i.h | 3 +++
> >> > include/uapi/linux/fuse.h | 12 +++++++++++-
> >> > 3 files changed, 40 insertions(+), 9 deletions(-)
> >> >
> >> > diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> >> > index adc4aa6810f5..bd6624885855 100644
> >> > --- a/fs/fuse/file.c
> >> > +++ b/fs/fuse/file.c
> >> > @@ -3017,6 +3017,8 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> >> > .flags = flags
> >> > };
> >> > struct fuse_write_out outarg;
> >> > + struct fuse_copy_file_range_out outarg_64;
> >> > + u64 bytes_copied;
> >> > ssize_t err;
> >> > /* mark unstable when write-back is not used, and file_out gets
> >> > * extended */
> >> > @@ -3066,30 +3068,46 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
> >> > if (is_unstable)
> >> > set_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> >> >
> >> > - args.opcode = FUSE_COPY_FILE_RANGE;
> >> > + args.opcode = FUSE_COPY_FILE_RANGE_64;
> >> > args.nodeid = ff_in->nodeid;
> >> > args.in_numargs = 1;
> >> > args.in_args[0].size = sizeof(inarg);
> >> > args.in_args[0].value = &inarg;
> >> > args.out_numargs = 1;
> >> > - args.out_args[0].size = sizeof(outarg);
> >> > - args.out_args[0].value = &outarg;
> >> > + args.out_args[0].size = sizeof(outarg_64);
> >> > + args.out_args[0].value = &outarg_64;
> >> > + if (fc->no_copy_file_range_64) {
> >> > +fallback:
> >> > + /* Fall back to old op that can't handle large copy length */
> >> > + args.opcode = FUSE_COPY_FILE_RANGE;
> >> > + args.out_args[0].size = sizeof(outarg);
> >> > + args.out_args[0].value = &outarg;
> >> > + inarg.len = min_t(size_t, len, 0xfffff000);
> >> > + }
> >> > err = fuse_simple_request(fm, &args);
> >> > if (err == -ENOSYS) {
> >> > - fc->no_copy_file_range = 1;
> >> > - err = -EOPNOTSUPP;
> >> > + if (fc->no_copy_file_range_64) {
> >> > + fc->no_copy_file_range = 1;
> >> > + err = -EOPNOTSUPP;
> >> > + } else {
> >> > + fc->no_copy_file_range_64 = 1;
> >> > + goto fallback;
> >> > + }
> >> > }
> >> > if (err)
> >> > goto out;
> >> >
> >> > + bytes_copied = fc->no_copy_file_range_64 ?
> >> > + outarg.size : outarg_64.bytes_copied;
> >> > +
> >> > truncate_inode_pages_range(inode_out->i_mapping,
> >> > ALIGN_DOWN(pos_out, PAGE_SIZE),
> >> > - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
> >> > + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
> >> >
> >> > file_update_time(file_out);
> >> > - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
> >> > + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
> >> >
> >> > - err = outarg.size;
> >> > + err = bytes_copied;
> >> > out:
> >> > if (is_unstable)
> >> > clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state);
> >> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> >> > index b54f4f57789f..a8be19f686b1 100644
> >> > --- a/fs/fuse/fuse_i.h
> >> > +++ b/fs/fuse/fuse_i.h
> >> > @@ -850,6 +850,9 @@ struct fuse_conn {
> >> > /** Does the filesystem support copy_file_range? */
> >> > unsigned no_copy_file_range:1;
> >> >
> >> > + /** Does the filesystem support copy_file_range_64? */
> >> > + unsigned no_copy_file_range_64:1;
> >> > +
> >> > /* Send DESTROY request */
> >> > unsigned int destroy:1;
> >> >
> >> > diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> >> > index 122d6586e8d4..94621f68a5cc 100644
> >> > --- a/include/uapi/linux/fuse.h
> >> > +++ b/include/uapi/linux/fuse.h
> >> > @@ -235,6 +235,10 @@
> >> > *
> >> > * 7.44
> >> > * - add FUSE_NOTIFY_INC_EPOCH
> >> > + *
> >> > + * 7.45
> >> > + * - add FUSE_COPY_FILE_RANGE_64
> >> > + * - add struct fuse_copy_file_range_out
> >> > */
> >> >
> >> > #ifndef _LINUX_FUSE_H
> >> > @@ -270,7 +274,7 @@
> >> > #define FUSE_KERNEL_VERSION 7
> >> >
> >> > /** Minor version number of this interface */
> >> > -#define FUSE_KERNEL_MINOR_VERSION 44
> >> > +#define FUSE_KERNEL_MINOR_VERSION 45
> >> >
> >> > /** The node ID of the root inode */
> >> > #define FUSE_ROOT_ID 1
> >> > @@ -657,6 +661,7 @@ enum fuse_opcode {
> >> > FUSE_SYNCFS = 50,
> >> > FUSE_TMPFILE = 51,
> >> > FUSE_STATX = 52,
> >> > + FUSE_COPY_FILE_RANGE_64 = 53,
> >> >
> >> > /* CUSE specific operations */
> >> > CUSE_INIT = 4096,
> >> > @@ -1148,6 +1153,11 @@ struct fuse_copy_file_range_in {
> >> > uint64_t flags;
> >> > };
> >> >
> >> > +/* For FUSE_COPY_FILE_RANGE_64 */
> >> > +struct fuse_copy_file_range_out {
> >> > + uint64_t bytes_copied;
> >> > +};
> >> > +
> >> > #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
> >> > #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
> >> > struct fuse_setupmapping_in {
> >> > --
> >> > 2.49.0
> >> >
> >>
> >>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-07 6:24 ` Chunsheng Luo
@ 2025-08-11 15:47 ` Darrick J. Wong
0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-11 15:47 UTC (permalink / raw)
To: Chunsheng Luo; +Cc: mszeredi, bschubert, fweimer, linux-fsdevel
On Thu, Aug 07, 2025 at 02:24:25PM +0800, Chunsheng Luo wrote:
> On Thu, Aug 07 2025, Chunsheng Luo wrote:
>
> > On Tue, Aug 05 2025, Miklos Szeredi wrote:
> >
> > + bytes_copied = fc->no_copy_file_range_64 ?
> > + outarg.size : outarg_64.bytes_copied;
> > +
> > truncate_inode_pages_range(inode_out->i_mapping,
> > ALIGN_DOWN(pos_out, PAGE_SIZE),
> > - ALIGN(pos_out + outarg.size, PAGE_SIZE) - 1);
> > + ALIGN(pos_out + bytes_copied, PAGE_SIZE) - 1);
> >
> > file_update_time(file_out);
> > - fuse_write_update_attr(inode_out, pos_out + outarg.size, outarg.size);
> > + fuse_write_update_attr(inode_out, pos_out + bytes_copied, bytes_copied);
>
> The copy_file_range syscall returns bytes_copied, a value provided by
> the userspace filesystem that the kernel cannot control. If
> bytes_copied > len, how should the application handle this? Similarly,
> if pos_out + bytes_copied < pos_outdue to integer overflow, could this
> cause any issues? Since vfs_copy_file_range->generic_copy_file_checks
> already check that pos_out + len does not overflow, so just need check
> bytes_copied > len.
if (WARN_ON_ONCE(bytes_copied > len))
return -EIO;
perhaps?
--D
>
> Thanks
> Chunsheng Luo
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-06 19:43 ` Bernd Schubert
@ 2025-08-12 9:08 ` Chunsheng Luo
2025-08-12 19:49 ` Darrick J. Wong
0 siblings, 1 reply; 15+ messages in thread
From: Chunsheng Luo @ 2025-08-12 9:08 UTC (permalink / raw)
To: bschubert; +Cc: fweimer, linux-fsdevel, luis, mszeredi
On 8/6/25 19:43, Bernd Schubert wrote:
> On 8/6/25 11:17, Luis Henriques wrote:
>> On Tue, Aug 05 2025, Miklos Szeredi wrote:
>>
>>> The FUSE protocol uses struct fuse_write_out to convey the return value of
>>> copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
>>> interface supports a 64-bit size copies.
>>>
>>> Currently the number of bytes copied is silently truncated to 32-bit, which
>>> is unfortunate at best.
>>>
>>> Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
>>> number of bytes copied is returned in a 64-bit value.
>>>
>>> If the fuse server does not support COPY_FILE_RANGE_64, fall back to
>>> COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
>>
>> I was wondering if it wouldn't make more sense to truncate the size to
>> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
>> correctly (which is probably a big 'if'!), the VFS will fallback to
>> splice() if the file system does not implement copy_file_range. And in
>> this case splice() seems to limit the operation to MAX_RW_COUNT.
>
> I personally don't like the hard coded value too much and would use
>
> inarg.len = min_t(size_t, len, (UINT_MAX - 4096));
>
> (btw, 0xfffff000 is UINT_MAX - 4095, isn't it?).
>
> Though that is all nitpick. The worst part that could happen are
> applications that do not handle partial file copy and then wouldn't
> copy the entire file. For these it probably would be better to return
> -ENOSYS.
>
> LGTM,
>
> Reviewed-by: Bernd Schubert <bschubert@ddn.com>
Abot "truncate the size to UINT_MAX - 4096":
4096 refers to PAGE_SIZE (the standard memory page size in most systems)?
If so, wouldn't UINT_MAX & PAGE_MASK be more appropriate?
like: `#define MAX_RW_COUNT (INT_MAX & PAGE_MASK)`
UINT_MAX & PAGE_MASK ensures the operation does not cross a page boundary.
Thanks
Chunsheng Luo
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] copy_file_range: limit size if in compat mode
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
@ 2025-08-12 11:21 ` Miklos Szeredi
2025-08-15 14:22 ` Christian Brauner
2025-08-12 14:26 ` Amir Goldstein
1 sibling, 1 reply; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-12 11:21 UTC (permalink / raw)
To: Christian Brauner; +Cc: linux-fsdevel, Florian Weimer, Al Viro
On Tue, 5 Aug 2025 at 20:30, Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> If the process runs in 32-bit compat mode, copy_file_range results can be
> in the in-band error range. In this case limit copy length to MAX_RW_COUNT
> to prevent a signed overflow.
This is VFS territory, so if it looks okay, can you please apply this,
Christian?
Thanks,
Miklos
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] copy_file_range: limit size if in compat mode
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
2025-08-12 11:21 ` Miklos Szeredi
@ 2025-08-12 14:26 ` Amir Goldstein
2025-08-12 15:50 ` Miklos Szeredi
1 sibling, 1 reply; 15+ messages in thread
From: Amir Goldstein @ 2025-08-12 14:26 UTC (permalink / raw)
To: Miklos Szeredi
Cc: linux-fsdevel, Bernd Schubert, Florian Weimer, Darrick J. Wong
On Tue, Aug 5, 2025 at 8:30 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> If the process runs in 32-bit compat mode, copy_file_range results can be
> in the in-band error range. In this case limit copy length to MAX_RW_COUNT
> to prevent a signed overflow.
>
> Reported-by: Florian Weimer <fweimer@redhat.com>
> Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
> fs/read_write.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 0ef70e128c4a..e2ccc44d96e6 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1576,6 +1576,10 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> if (len == 0)
> return 0;
>
> + /* Make sure return value doesn't overflow in 32bit compat mode */
> + if (in_compat_syscall() && len > MAX_RW_COUNT)
> + len = MAX_RW_COUNT;
> +
1. Note that generic_copy_file_checks() can already shorten len,
so maybe this should also be done there?? not sure..
2. Both ->remap_file_range() and splice cases already trim to MAX_RW_COUNT
so the only remaining case for len > MAX_RW_COUNT are filesystems
that implement ->copy_file_range() and actually support copying ranges
larger than 2MB (don't know if they actually exist)
IOW, if we do:
if (splice || !file_out->f_op->copy_file_range || in_compat_syscall())
len = min_t(loff_t, MAX_RW_COUNT, len);
We will not need to repeat the same trim of len in 3 different places
in this function.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] copy_file_range: limit size if in compat mode
2025-08-12 14:26 ` Amir Goldstein
@ 2025-08-12 15:50 ` Miklos Szeredi
0 siblings, 0 replies; 15+ messages in thread
From: Miklos Szeredi @ 2025-08-12 15:50 UTC (permalink / raw)
To: Amir Goldstein
Cc: Miklos Szeredi, linux-fsdevel, Bernd Schubert, Florian Weimer,
Darrick J. Wong
On Tue, 12 Aug 2025 at 16:36, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 8:30 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
> > + /* Make sure return value doesn't overflow in 32bit compat mode */
> > + if (in_compat_syscall() && len > MAX_RW_COUNT)
> > + len = MAX_RW_COUNT;
> > +
>
> 1. Note that generic_copy_file_checks() can already shorten len,
> so maybe this should also be done there?? not sure..
I guess it doesn't really matter, since all of these functions are
local. I've decided to do it directly in vfs_copy_file_range because
in other cases MAX_RW_COUNTS are checked directly in vfs_...
> 2. Both ->remap_file_range() and splice cases already trim to MAX_RW_COUNT
> so the only remaining case for len > MAX_RW_COUNT are filesystems
> that implement ->copy_file_range() and actually support copying ranges
> larger than 2MB (don't know if they actually exist)
>
> IOW, if we do:
>
> if (splice || !file_out->f_op->copy_file_range || in_compat_syscall())
> len = min_t(loff_t, MAX_RW_COUNT, len);
>
> We will not need to repeat the same trim of len in 3 different places
> in this function.
Makes sense.
Thanks,
Miklos
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface
2025-08-12 9:08 ` Chunsheng Luo
@ 2025-08-12 19:49 ` Darrick J. Wong
0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2025-08-12 19:49 UTC (permalink / raw)
To: Chunsheng Luo; +Cc: bschubert, fweimer, linux-fsdevel, luis, mszeredi
On Tue, Aug 12, 2025 at 05:08:18PM +0800, Chunsheng Luo wrote:
> On 8/6/25 19:43, Bernd Schubert wrote:
>
> > On 8/6/25 11:17, Luis Henriques wrote:
> >> On Tue, Aug 05 2025, Miklos Szeredi wrote:
> >>
> >>> The FUSE protocol uses struct fuse_write_out to convey the return value of
> >>> copy_file_range, which is restricted to uint32_t. But the COPY_FILE_RANGE
> >>> interface supports a 64-bit size copies.
> >>>
> >>> Currently the number of bytes copied is silently truncated to 32-bit, which
> >>> is unfortunate at best.
> >>>
> >>> Introduce a new op COPY_FILE_RANGE_64, which is identical, except the
> >>> number of bytes copied is returned in a 64-bit value.
> >>>
> >>> If the fuse server does not support COPY_FILE_RANGE_64, fall back to
> >>> COPY_FILE_RANGE and truncate the size to UINT_MAX - 4096.
> >>
> >> I was wondering if it wouldn't make more sense to truncate the size to
> >> MAX_RW_COUNT instead. My reasoning is that, if I understand the code
> >> correctly (which is probably a big 'if'!), the VFS will fallback to
> >> splice() if the file system does not implement copy_file_range. And in
> >> this case splice() seems to limit the operation to MAX_RW_COUNT.
> >
> > I personally don't like the hard coded value too much and would use
> >
> > inarg.len = min_t(size_t, len, (UINT_MAX - 4096));
> >
> > (btw, 0xfffff000 is UINT_MAX - 4095, isn't it?).
> >
> > Though that is all nitpick. The worst part that could happen are
> > applications that do not handle partial file copy and then wouldn't
> > copy the entire file. For these it probably would be better to return
> > -ENOSYS.
> >
> > LGTM,
> >
> > Reviewed-by: Bernd Schubert <bschubert@ddn.com>
>
> Abot "truncate the size to UINT_MAX - 4096":
> 4096 refers to PAGE_SIZE (the standard memory page size in most systems)?
> If so, wouldn't UINT_MAX & PAGE_MASK be more appropriate?
> like: `#define MAX_RW_COUNT (INT_MAX & PAGE_MASK)`
> UINT_MAX & PAGE_MASK ensures the operation does not cross a page boundary.
Yeah, I was wondering that too -- if we're going to cap a
copy_file_range to something resembling MAX_RW_COUNT, then why not use
that symbol directly? :)
--D
> Thanks
> Chunsheng Luo
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/2] copy_file_range: limit size if in compat mode
2025-08-12 11:21 ` Miklos Szeredi
@ 2025-08-15 14:22 ` Christian Brauner
0 siblings, 0 replies; 15+ messages in thread
From: Christian Brauner @ 2025-08-15 14:22 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel, Florian Weimer, Al Viro
On Tue, Aug 12, 2025 at 01:21:00PM +0200, Miklos Szeredi wrote:
> On Tue, 5 Aug 2025 at 20:30, Miklos Szeredi <mszeredi@redhat.com> wrote:
> >
> > If the process runs in 32-bit compat mode, copy_file_range results can be
> > in the in-band error range. In this case limit copy length to MAX_RW_COUNT
> > to prevent a signed overflow.
>
> This is VFS territory, so if it looks okay, can you please apply this,
> Christian?
I did so now! Thank you. Fyi, I'm on vacation in Sweden and will be back
on Tuesday so that's why there's a few delays.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-08-15 14:22 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-05 18:30 [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Miklos Szeredi
2025-08-05 18:30 ` [PATCH 2/2] copy_file_range: limit size if in compat mode Miklos Szeredi
2025-08-12 11:21 ` Miklos Szeredi
2025-08-15 14:22 ` Christian Brauner
2025-08-12 14:26 ` Amir Goldstein
2025-08-12 15:50 ` Miklos Szeredi
2025-08-06 9:17 ` [PATCH 1/2] fuse: fix COPY_FILE_RANGE interface Luis Henriques
2025-08-06 16:01 ` Darrick J. Wong
2025-08-06 19:48 ` Luis Henriques
2025-08-11 15:45 ` Darrick J. Wong
2025-08-06 19:43 ` Bernd Schubert
2025-08-12 9:08 ` Chunsheng Luo
2025-08-12 19:49 ` Darrick J. Wong
2025-08-07 6:24 ` Chunsheng Luo
2025-08-11 15:47 ` Darrick J. Wong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).