* Re: [fuse-devel] FICLONE / FICLONERANGE support
From: Amir Goldstein @ 2024-01-28 10:07 UTC
To: Antonio SJ Musumeci; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi

On Sun, Jan 28, 2024 at 2:31 AM Antonio SJ Musumeci <trapexit@spawn.link> wrote:
>
> Hello,
>
> Has anyone investigated adding support for FICLONE and FICLONERANGE? I'm
> not seeing any references to either on the mailing list. I've got a
> passthrough filesystem and with more users taking advantage of btrfs and
> xfs w/ reflinks there has been some demand for the ability to support it.
>

[CC fsdevel because my answer's scope is wider than just FUSE]

FWIW, the kernel implementation of copy_file_range() calls remap_file_range()
(a.k.a. clone_file_range()) for both xfs and btrfs, so if your users control the
application they are using, calling copy_file_range() will propagate via your
fuse filesystem correctly to underlying xfs/btrfs and will effectively result in
clone_file_range().

Thus using tools like cp --reflink on your passthrough filesystem should yield
the expected result.

For a more practical example see:
https://bugzilla.samba.org/show_bug.cgi?id=12033
Since Samba 4.1, server-side-copy is implemented as copy_file_range()

API-wise, there are two main differences between copy_file_range() and
FICLONERANGE:
1. copy_file_range() can result in partial copy
2. copy_file_range() can result in more used disk space

Other API differences are minor, but the fact that copy_file_range()
is a syscall with a @flags argument makes it a candidate for being
a super-set of both functionalities.

The question is, for your users, are you actually looking for
clone_file_range() support? or is best-effort copy_file_range() with
clone_file_range() fallback enough?

If your users are looking for the atomic clone_file_range() behavior,
then a single flag in fuse_copy_file_range_in::flags is enough to
indicate to the server that the "atomic clone" behavior is wanted.

Note that the @flags argument to copy_file_range() syscall does not
support any flags at all at the moment.

The only flag defined in the kernel, COPY_FILE_SPLICE, is for
internal use only.

We can define a flag COPY_FILE_CLONE to use either only
internally in kernel and in FUSE protocol or even also in
copy_file_range() syscall.

Sure, we can also add a new FUSE protocol command for
FUSE_CLONE_FILE_RANGE, but I don't think that is
necessary.
It is certainly not necessary if there is agreement to extend the
copy_file_range() syscall to support COPY_FILE_CLONE flag.

What do folks think about this possible API extension?

Thanks,
Amir.
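
The first API difference listed above (copy_file_range() can result in a partial copy) means a caller has to loop. The following is a minimal sketch in Python, assuming Linux and Python 3.8+ where os.copy_file_range() exists; the helper names copy_range and _write_all are made up for illustration, and treating every failure as "unsupported" is a simplification.

```python
import os


def _write_all(fd: int, data: bytes) -> None:
    """Write all of `data`, retrying after short writes."""
    view = memoryview(data)
    while view:
        view = view[os.write(fd, view):]


def copy_range(src_fd: int, dst_fd: int, length: int) -> int:
    """Copy up to `length` bytes between the current file offsets.

    copy_file_range() may return fewer bytes than requested, so the call
    is repeated until the request is satisfied or the source reaches EOF.
    """
    copied = 0
    while copied < length:
        want = length - copied
        try:
            n = os.copy_file_range(src_fd, dst_fd, want)
        except (AttributeError, OSError):
            # Assumption for this sketch: any failure is treated as
            # "not supported here" and one chunk is copied by hand.
            buf = os.read(src_fd, min(want, 1 << 20))
            _write_all(dst_fd, buf)
            n = len(buf)
        if n == 0:  # end of file on the source
            break
        copied += n
    return copied
```

Whether the bytes end up shared (cloned) or physically duplicated is decided below the syscall, which is the second difference in the list: the copy may use more disk space than a clone would.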
* Re: [fuse-devel] FICLONE / FICLONERANGE support
From: Antonio SJ Musumeci @ 2024-01-28 19:11 UTC
To: Amir Goldstein; +Cc: fuse-devel, linux-fsdevel, Miklos Szeredi

On Sunday, January 28th, 2024 at 4:07 AM, Amir Goldstein <amir73il@gmail.com> wrote:

> On Sun, Jan 28, 2024 at 2:31 AM Antonio SJ Musumeci trapexit@spawn.link wrote:
>
> > Hello,
> >
> > Has anyone investigated adding support for FICLONE and FICLONERANGE? I'm
> > not seeing any references to either on the mailing list. I've got a
> > passthrough filesystem and with more users taking advantage of btrfs and
> > xfs w/ reflinks there has been some demand for the ability to support it.
>
> [CC fsdevel because my answer's scope is wider than just FUSE]
>
> FWIW, the kernel implementation of copy_file_range() calls remap_file_range()
> (a.k.a. clone_file_range()) for both xfs and btrfs, so if your users control the
> application they are using, calling copy_file_range() will propagate via your
> fuse filesystem correctly to underlying xfs/btrfs and will effectively result in
> clone_file_range().
>
> Thus using tools like cp --reflink on your passthrough filesystem should yield
> the expected result.
>
> For a more practical example see:
> https://bugzilla.samba.org/show_bug.cgi?id=12033
> Since Samba 4.1, server-side-copy is implemented as copy_file_range()
>
> API-wise, there are two main differences between copy_file_range() and
> FICLONERANGE:
> 1. copy_file_range() can result in partial copy
> 2. copy_file_range() can result in more used disk space
>
> Other API differences are minor, but the fact that copy_file_range()
> is a syscall with a @flags argument makes it a candidate for being
> a super-set of both functionalities.
>
> The question is, for your users, are you actually looking for
> clone_file_range() support? or is best-effort copy_file_range() with
> clone_file_range() fallback enough?
>
> If your users are looking for the atomic clone_file_range() behavior,
> then a single flag in fuse_copy_file_range_in::flags is enough to
> indicate to the server that the "atomic clone" behavior is wanted.
>
> Note that the @flags argument to copy_file_range() syscall does not
> support any flags at all at the moment.
>
> The only flag defined in the kernel, COPY_FILE_SPLICE, is for
> internal use only.
>
> We can define a flag COPY_FILE_CLONE to use either only
> internally in kernel and in FUSE protocol or even also in
> copy_file_range() syscall.
>
> Sure, we can also add a new FUSE protocol command for
> FUSE_CLONE_FILE_RANGE, but I don't think that is
> necessary.
> It is certainly not necessary if there is agreement to extend the
> copy_file_range() syscall to support COPY_FILE_CLONE flag.
>
> What do folks think about this possible API extension?
>
> Thanks,
> Amir.

cp --reflink calls FICLONE. It receives EOPNOTSUPP and falls back to
copying normally (if set to auto mode). It appears it still does this:

https://github.com/coreutils/coreutils/blob/master/src/copy.c#L1509

My users don't control the software they are running. They are using
random tooling that happens to support FICLONE, such as cp --reflink. In
the most recent case it is being used for some rsnapshot-like backup
strategy, I believe.
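
The clone-then-fall-back behaviour described above for cp --reflink in auto mode can be sketched roughly as follows. This is an illustration and not the coreutils logic: the function name reflink_auto is made up, the hard-coded FICLONE value 0x40049409 (_IOW(0x94, 9, int)) is an assumption that holds on x86 and arm but not on every architecture, and falling back on any OSError is a simplification (a stricter tool would inspect errno).

```python
import fcntl
import shutil

# Newer Python versions expose fcntl.FICLONE; otherwise fall back to the
# x86/arm value of _IOW(0x94, 9, int). Assumption: other architectures
# may encode the ioctl number differently.
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)


def reflink_auto(src_path: str, dst_path: str) -> str:
    """Clone src into dst if possible, else copy the data.

    Returns "cloned" or "copied" so the caller can see which path ran.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # FICLONE is issued on the destination fd and takes the
            # source fd as its integer argument.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "cloned"
        except OSError:
            # Simplification: every failure (EOPNOTSUPP on a FUSE
            # passthrough today, EXDEV, ENOTTY, ...) triggers a copy.
            shutil.copyfileobj(src, dst)
            return "copied"
```

On a FUSE passthrough filesystem as discussed in this thread, the expectation is that this returns "copied" even when the backing filesystem could have cloned the file.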
* Re: [fuse-devel] FICLONE / FICLONERANGE support
From: Dave Chinner @ 2024-01-28 21:25 UTC
To: Amir Goldstein
Cc: Antonio SJ Musumeci, fuse-devel, linux-fsdevel, Miklos Szeredi

On Sun, Jan 28, 2024 at 12:07:22PM +0200, Amir Goldstein wrote:
> On Sun, Jan 28, 2024 at 2:31 AM Antonio SJ Musumeci <trapexit@spawn.link> wrote:
> >
> > Hello,
> >
> > Has anyone investigated adding support for FICLONE and FICLONERANGE? I'm
> > not seeing any references to either on the mailing list. I've got a
> > passthrough filesystem and with more users taking advantage of btrfs and
> > xfs w/ reflinks there has been some demand for the ability to support it.
> >
>
> [CC fsdevel because my answer's scope is wider than just FUSE]
>
> FWIW, the kernel implementation of copy_file_range() calls remap_file_range()
> (a.k.a. clone_file_range()) for both xfs and btrfs, so if your users control the
> application they are using, calling copy_file_range() will propagate via your
> fuse filesystem correctly to underlying xfs/btrfs and will effectively result in
> clone_file_range().
>
> Thus using tools like cp --reflink on your passthrough filesystem should yield
> the expected result.
>
> For a more practical example see:
> https://bugzilla.samba.org/show_bug.cgi?id=12033
> Since Samba 4.1, server-side-copy is implemented as copy_file_range()
>
> API-wise, there are two main differences between copy_file_range() and
> FICLONERANGE:
> 1. copy_file_range() can result in partial copy
> 2. copy_file_range() can result in more used disk space
>
> Other API differences are minor, but the fact that copy_file_range()
> is a syscall with a @flags argument makes it a candidate for being
> a super-set of both functionalities.
>
> The question is, for your users, are you actually looking for
> clone_file_range() support? or is best-effort copy_file_range() with
> clone_file_range() fallback enough?
>
> If your users are looking for the atomic clone_file_range() behavior,
> then a single flag in fuse_copy_file_range_in::flags is enough to
> indicate to the server that the "atomic clone" behavior is wanted.
>
> Note that the @flags argument to copy_file_range() syscall does not
> support any flags at all at the moment.
>
> The only flag defined in the kernel, COPY_FILE_SPLICE, is for
> internal use only.
>
> We can define a flag COPY_FILE_CLONE to use either only
> internally in kernel and in FUSE protocol or even also in
> copy_file_range() syscall.

I don't care how fuse implements ->remap_file_range(), but no change
to syscall behaviour, please.

copy_file_range() is supposed to select the best available method
for copying the data based on kernel side technology awareness that
the application knows nothing about (e.g. clone, server-side copy,
block device copy offload, etc). The API is technology agnostic and
largely future proof because of this; adding flags to say "use this
specific technology to copy data or fail" is the exact opposite of
how we want copy_file_range() to work.

i.e. if you want a specific type of "copy" to be done (i.e. clone
rather than data copy) then call FICLONE or copy the data yourself
to do exactly what you need. If you just want it done as fast as
possible and don't care about implementation (99% of cases), then
just call copy_file_range().

> Sure, we can also add a new FUSE protocol command for
> FUSE_CLONE_FILE_RANGE, but I don't think that is
> necessary.
> It is certainly not necessary if there is agreement to extend the
> copy_file_range() syscall to support COPY_FILE_CLONE flag.

We already have FICLONE/FICLONERANGE for this operation. Fuse
just needs to implement ->remap_file_range() server stubs, and then
the back end driver can choose to implement it if its storage
mechanisms support such functionality. Then it will get used
automatically for copy_file_range() for those FUSE drivers, the rest
will just copy the data in the kernel using splice as they currently
do...

-Dave.
--
Dave Chinner
david@fromorbit.com
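
For the "clone or fail" case that Dave separates from copy_file_range(), an application would issue FICLONERANGE (or FICLONE) directly and treat any error as final. A minimal sketch, assuming Linux and the x86/arm ioctl encoding; the names pack_clone_range and clone_range_or_fail are hypothetical, and the struct layout follows struct file_clone_range from the kernel UAPI headers.

```python
import fcntl
import struct

# _IOW(0x94, 13, struct file_clone_range) on x86/arm. Assumption: other
# architectures may encode the ioctl number differently.
FICLONERANGE = getattr(fcntl, "FICLONERANGE", 0x4020940D)


def pack_clone_range(src_fd: int, src_offset: int,
                     src_length: int, dest_offset: int) -> bytes:
    """Pack struct file_clone_range:
    { __s64 src_fd; __u64 src_offset; __u64 src_length; __u64 dest_offset; }
    """
    return struct.pack("=qQQQ", src_fd, src_offset, src_length, dest_offset)


def clone_range_or_fail(src_fd: int, dst_fd: int, src_offset: int,
                        length: int, dest_offset: int) -> None:
    """Share extents between the files or raise OSError.

    There is deliberately no data-copy fallback here. A length of 0
    means "from src_offset to the end of the source file".
    """
    arg = pack_clone_range(src_fd, src_offset, length, dest_offset)
    fcntl.ioctl(dst_fd, FICLONERANGE, arg)
```

A caller that only wants the data copied as fast as possible would not use this helper at all and would call copy_file_range() instead, leaving the choice of mechanism to the kernel.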
* Re: [fuse-devel] FICLONE / FICLONERANGE support
From: Amir Goldstein @ 2024-01-29 13:54 UTC
To: Dave Chinner
Cc: Antonio SJ Musumeci, fuse-devel, linux-fsdevel, Miklos Szeredi

On Sun, Jan 28, 2024 at 11:25 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sun, Jan 28, 2024 at 12:07:22PM +0200, Amir Goldstein wrote:
> > On Sun, Jan 28, 2024 at 2:31 AM Antonio SJ Musumeci <trapexit@spawn.link> wrote:
> > >
> > > Hello,
> > >
> > > Has anyone investigated adding support for FICLONE and FICLONERANGE? I'm
> > > not seeing any references to either on the mailing list. I've got a
> > > passthrough filesystem and with more users taking advantage of btrfs and
> > > xfs w/ reflinks there has been some demand for the ability to support it.
> > >
> >
> > [CC fsdevel because my answer's scope is wider than just FUSE]
> >
> > FWIW, the kernel implementation of copy_file_range() calls remap_file_range()
> > (a.k.a. clone_file_range()) for both xfs and btrfs, so if your users control the
> > application they are using, calling copy_file_range() will propagate via your
> > fuse filesystem correctly to underlying xfs/btrfs and will effectively result in
> > clone_file_range().
> >
> > Thus using tools like cp --reflink on your passthrough filesystem should yield
> > the expected result.

Sorry, cp --reflink indeed uses clone

> >
> > For a more practical example see:
> > https://bugzilla.samba.org/show_bug.cgi?id=12033
> > Since Samba 4.1, server-side-copy is implemented as copy_file_range()
> >
> > API-wise, there are two main differences between copy_file_range() and
> > FICLONERANGE:
> > 1. copy_file_range() can result in partial copy
> > 2. copy_file_range() can result in more used disk space
> >
> > Other API differences are minor, but the fact that copy_file_range()
> > is a syscall with a @flags argument makes it a candidate for being
> > a super-set of both functionalities.
> >
> > The question is, for your users, are you actually looking for
> > clone_file_range() support? or is best-effort copy_file_range() with
> > clone_file_range() fallback enough?
> >
> > If your users are looking for the atomic clone_file_range() behavior,
> > then a single flag in fuse_copy_file_range_in::flags is enough to
> > indicate to the server that the "atomic clone" behavior is wanted.
> >
> > Note that the @flags argument to copy_file_range() syscall does not
> > support any flags at all at the moment.
> >
> > The only flag defined in the kernel, COPY_FILE_SPLICE, is for
> > internal use only.
> >
> > We can define a flag COPY_FILE_CLONE to use either only
> > internally in kernel and in FUSE protocol or even also in
> > copy_file_range() syscall.
>
> I don't care how fuse implements ->remap_file_range(), but no change
> to syscall behaviour, please.
>

ok.

> copy_file_range() is supposed to select the best available method
> for copying the data based on kernel side technology awareness that
> the application knows nothing about (e.g. clone, server-side copy,
> block device copy offload, etc). The API is technology agnostic and
> largely future proof because of this; adding flags to say "use this
> specific technology to copy data or fail" is the exact opposite of
> how we want copy_file_range() to work.
>
> i.e. if you want a specific type of "copy" to be done (i.e. clone
> rather than data copy) then call FICLONE or copy the data yourself
> to do exactly what you need. If you just want it done as fast as
> possible and don't care about implementation (99% of cases), then
> just call copy_file_range().
>

Technically, a flag COPY_FILE_ATOMIC would be a requirement,
not an implementation detail, although this requirement could currently be
fulfilled only by filesystems that implement remap_file_range(). But never
mind, I won't be trying to push a syscall API change myself.

> > Sure, we can also add a new FUSE protocol command for
> > FUSE_CLONE_FILE_RANGE, but I don't think that is
> > necessary.
> > It is certainly not necessary if there is agreement to extend the
> > copy_file_range() syscall to support COPY_FILE_CLONE flag.
>
> We already have FICLONE/FICLONERANGE for this operation. Fuse
> just needs to implement ->remap_file_range() server stubs, and then
> the back end driver can choose to implement it if its storage
> mechanisms support such functionality.

For Antonio's request to support FICLONERANGE with FUSE,
that would be enough using a new protocol command.

> Then it will get used
> automatically for copy_file_range() for those FUSE drivers, the rest
> will just copy the data in the kernel using splice as they currently
> do...

This is not the current behavior of FUSE as far as I can tell.
The reason is that vfs_copy_file_range() checks whether the fs implements
->copy_file_range(); if it does, it will not fall back to ->remap_file_range()
nor to splice. This is intentional - fs with ->copy_file_range() has full
control including the decision to return whatever error code to userspace.

The problem is that the FUSE kernel driver always implements
->copy_file_range(), regardless of whether the FUSE server implements
FUSE_COPY_FILE_RANGE. So for a FUSE server that does not
implement FUSE_COPY_FILE_RANGE, fc->no_copy_file_range is
true and copy_file_range() returns -EOPNOTSUPP.

So either the fallback from FUSE_COPY_FILE_RANGE to
FUSE_CLONE_FILE_RANGE will be done internally by FUSE,
or clone/copy support will need to be advertised during FUSE_INIT
and a different set of fuse_file_operations will need to be used
accordingly, which seems overly complicated.

Thanks,
Amir.
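
The dispatch order Amir describes for vfs_copy_file_range() can be written down as a toy model. This is not kernel code: the Fs class, its field names, and the string results are invented for illustration, and the model only encodes the rules stated in the message above (a filesystem with ->copy_file_range() gets no fallback; otherwise clone is tried, then splice).

```python
from dataclasses import dataclass


@dataclass
class Fs:
    """Hypothetical stand-in for a filesystem's file_operations."""
    has_copy_file_range: bool    # provides ->copy_file_range()
    has_remap_file_range: bool   # provides ->remap_file_range()
    copy_result: str = "fs copy"  # what ->copy_file_range() reports


def vfs_copy_model(fs: Fs) -> str:
    """Return which path a copy_file_range() call would take."""
    if fs.has_copy_file_range:
        # The fs has full control, including the error code returned to
        # userspace; there is no fallback to clone or splice.
        return fs.copy_result
    if fs.has_remap_file_range:
        return "clone"
    return "splice"


# A reflink-capable fs that relies on the generic path gets a clone.
reflink_fs = Fs(has_copy_file_range=False, has_remap_file_range=True)

# FUSE always provides ->copy_file_range(); if the server lacks
# FUSE_COPY_FILE_RANGE the caller sees -EOPNOTSUPP.
fuse_without_server_support = Fs(
    has_copy_file_range=True,
    has_remap_file_range=False,
    copy_result="-EOPNOTSUPP",
)
```

In this model, adding ->remap_file_range() to FUSE would not by itself change the result for fuse_without_server_support, which is why the message argues the fallback has to happen inside FUSE.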
* Re: [fuse-devel] FICLONE / FICLONERANGE support
From: Shachar Sharon @ 2024-01-30 8:08 UTC
To: Amir Goldstein
Cc: Dave Chinner, Antonio SJ Musumeci, fuse-devel, linux-fsdevel, Miklos Szeredi

On Mon, Jan 29, 2024 at 3:54 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Sun, Jan 28, 2024 at 11:25 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Sun, Jan 28, 2024 at 12:07:22PM +0200, Amir Goldstein wrote:
> > > On Sun, Jan 28, 2024 at 2:31 AM Antonio SJ Musumeci <trapexit@spawn.link> wrote:
> > > >
> > > > Hello,
> > > >
> > > > Has anyone investigated adding support for FICLONE and FICLONERANGE? I'm
> > > > not seeing any references to either on the mailing list. I've got a
> > > > passthrough filesystem and with more users taking advantage of btrfs and
> > > > xfs w/ reflinks there has been some demand for the ability to support it.
> > > >
> > >
> > > [CC fsdevel because my answer's scope is wider than just FUSE]
> > >
> > > FWIW, the kernel implementation of copy_file_range() calls remap_file_range()
> > > (a.k.a. clone_file_range()) for both xfs and btrfs, so if your users control the
> > > application they are using, calling copy_file_range() will propagate via your
> > > fuse filesystem correctly to underlying xfs/btrfs and will effectively result in
> > > clone_file_range().
> > >
> > > Thus using tools like cp --reflink on your passthrough filesystem should yield
> > > the expected result.
>
> Sorry, cp --reflink indeed uses clone
>
> > >
> > > For a more practical example see:
> > > https://bugzilla.samba.org/show_bug.cgi?id=12033
> > > Since Samba 4.1, server-side-copy is implemented as copy_file_range()
> > >
> > > API-wise, there are two main differences between copy_file_range() and
> > > FICLONERANGE:
> > > 1. copy_file_range() can result in partial copy
> > > 2. copy_file_range() can result in more used disk space
> > >
> > > Other API differences are minor, but the fact that copy_file_range()
> > > is a syscall with a @flags argument makes it a candidate for being
> > > a super-set of both functionalities.
> > >
> > > The question is, for your users, are you actually looking for
> > > clone_file_range() support? or is best-effort copy_file_range() with
> > > clone_file_range() fallback enough?
> > >
> > > If your users are looking for the atomic clone_file_range() behavior,
> > > then a single flag in fuse_copy_file_range_in::flags is enough to
> > > indicate to the server that the "atomic clone" behavior is wanted.
> > >
> > > Note that the @flags argument to copy_file_range() syscall does not
> > > support any flags at all at the moment.
> > >
> > > The only flag defined in the kernel, COPY_FILE_SPLICE, is for
> > > internal use only.
> > >
> > > We can define a flag COPY_FILE_CLONE to use either only
> > > internally in kernel and in FUSE protocol or even also in
> > > copy_file_range() syscall.
> >
> > I don't care how fuse implements ->remap_file_range(), but no change
> > to syscall behaviour, please.
> >
>
> ok.
>
> > copy_file_range() is supposed to select the best available method
> > for copying the data based on kernel side technology awareness that
> > the application knows nothing about (e.g. clone, server-side copy,
> > block device copy offload, etc). The API is technology agnostic and
> > largely future proof because of this; adding flags to say "use this
> > specific technology to copy data or fail" is the exact opposite of
> > how we want copy_file_range() to work.
> >
> > i.e. if you want a specific type of "copy" to be done (i.e. clone
> > rather than data copy) then call FICLONE or copy the data yourself
> > to do exactly what you need. If you just want it done as fast as
> > possible and don't care about implementation (99% of cases), then
> > just call copy_file_range().
> >
>
> Technically, a flag COPY_FILE_ATOMIC would be a requirement,
> not an implementation detail, although this requirement could currently be
> fulfilled only by filesystems that implement remap_file_range(). But never
> mind, I won't be trying to push a syscall API change myself.
>
> > > Sure, we can also add a new FUSE protocol command for
> > > FUSE_CLONE_FILE_RANGE, but I don't think that is
> > > necessary.
> > > It is certainly not necessary if there is agreement to extend the
> > > copy_file_range() syscall to support COPY_FILE_CLONE flag.
> >
> > We already have FICLONE/FICLONERANGE for this operation. Fuse
> > just needs to implement ->remap_file_range() server stubs, and then
> > the back end driver can choose to implement it if its storage
> > mechanisms support such functionality.
>
> For Antonio's request to support FICLONERANGE with FUSE,
> that would be enough using a new protocol command.
>
> > Then it will get used
> > automatically for copy_file_range() for those FUSE drivers, the rest
> > will just copy the data in the kernel using splice as they currently
> > do...
>
> This is not the current behavior of FUSE as far as I can tell.
> The reason is that vfs_copy_file_range() checks whether the fs implements
> ->copy_file_range(); if it does, it will not fall back to ->remap_file_range()
> nor to splice. This is intentional - fs with ->copy_file_range() has full
> control including the decision to return whatever error code to userspace.
>
> The problem is that the FUSE kernel driver always implements
> ->copy_file_range(), regardless of whether the FUSE server implements
> FUSE_COPY_FILE_RANGE. So for a FUSE server that does not
> implement FUSE_COPY_FILE_RANGE, fc->no_copy_file_range is
> true and copy_file_range() returns -EOPNOTSUPP.
>
> So either the fallback from FUSE_COPY_FILE_RANGE to
> FUSE_CLONE_FILE_RANGE will be done internally by FUSE,
> or clone/copy support will need to be advertised during FUSE_INIT
> and a different set of fuse_file_operations will need to be used
> accordingly, which seems overly complicated.
>

Note that FUSE_COPY_FILE_RANGE uses struct fuse_write_out to report the
number of bytes copied between files (uint32_t size), and therefore it
cannot copy more than 2^32-1 bytes in a single call. For example, a call to
cp --reflink of a 1T file yields multiple calls to copy_file_range() by
userspace.

- Shachar.

> Thanks,
> Amir.
>
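
The effect of the uint32_t size field can be worked through numerically. The sketch below splits a copy into requests of at most 2^32-1 bytes; plan_chunks is a hypothetical helper, and "1T" is taken to mean 2^40 bytes, which is an assumption. Any further per-request cap applied by the kernel or the FUSE server would only increase the count.

```python
MAX_PER_CALL = 2**32 - 1  # largest value a uint32_t size field can report


def plan_chunks(total: int, max_per_call: int = MAX_PER_CALL):
    """Yield (offset, length) pairs covering `total` bytes."""
    offset = 0
    while offset < total:
        length = min(max_per_call, total - offset)
        yield offset, length
        offset += length


chunks = list(plan_chunks(2**40))
print(len(chunks))    # 257
print(chunks[-1][1])  # 256
```

Since 256 * (2^32-1) = 2^40 - 256, the first 256 requests fall 256 bytes short of 1 TiB, so at least 257 calls are needed under this limit.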