* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
[not found] <176169810144.1424854.11439355400009006946.stgit@frogsfrogsfrogs>
@ 2025-11-19 9:19 ` Demi Marie Obenour
2025-11-19 9:41 ` Gao Xiang
2025-11-19 18:04 ` Darrick J. Wong
0 siblings, 2 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2025-11-19 9:19 UTC (permalink / raw)
To: djwong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
> By keeping the I/O path mostly within the kernel, we can dramatically
> increase the speed of disk-based filesystems.
ZFS, BTRFS, and bcachefs all support compression, checksumming,
and RAID. ZFS and bcachefs also support encryption, and f2fs and
ext4 support fscrypt.
Will this patchset be able to improve FUSE implementations of these
filesystems? I'd rather not be in the situation where one can have
a FUSE filesystem that is fast, but only if it doesn't support modern
data integrity or security features.
I'm not a filesystem developer, but here are some ideas (that you
can take or leave):
1. Keep the compression, checksumming, and/or encryption in-kernel,
and have userspace tell the kernel what algorithm and/or encryption
key to use. These algorithms are generally well-known and secure
against malicious input. It might be necessary to make an extra
data copy, but ideally that copy could just stay within the
CPU caches.
2. Somehow integrate with the blk-crypto framework. This has the
advantage that it supports inline encryption hardware, which
I suspect is needed for this to be usable on mobile devices.
After all, the keys on these systems are often not even visible
to the kernel, let alone to userspace.
3. Figure out a way to make a userspace data path fast enough.
To prevent data corruption by unprivileged users of the FS,
it's necessary to make a copy before checksumming, compression,
or authenticated encryption. If this copy is done in the kernel,
the server doesn't have to perform its own copy. By using large
ring buffers, it might be possible to amortize the context switch
cost away.
Authenticated encryption also needs a copy in the *other* direction:
if the (untrusted) client can see unauthenticated plaintext, it's
a security vulnerability. That needs another copy from server
buffers to client buffers, and the kernel can do that as well.
4. Make context switches much faster. L4-style IPC is incredibly fast,
at least if one doesn't have to worry about Spectre. Unfortunately,
nowadays one *does* need to worry about Spectre.
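To make idea 3 concrete, here is a minimal userspace sketch of why the copy has to happen before the checksum is computed. It is only an illustration: the function name `snapshot_and_checksum` is hypothetical, CRC32 from Python's `zlib` stands in for whatever checksum a real filesystem would use, and nothing here comes from the patchset.

```python
import zlib
from typing import Tuple


def snapshot_and_checksum(client_buf: bytearray) -> Tuple[bytes, int]:
    """Copy the client's buffer first, then checksum the private copy.

    The client may keep writing to client_buf at any time, so the
    checksum is only meaningful for data the client can no longer touch.
    """
    stable = bytes(client_buf)          # the single extra copy
    return stable, zlib.crc32(stable)   # checksum covers exactly what is written


# A client that modifies its buffer after the snapshot cannot make the
# stored data and the stored checksum disagree.
client_buf = bytearray(b"A" * 4096)
stable, csum = snapshot_and_checksum(client_buf)
client_buf[0] = ord("B")                # late modification by the client
print(zlib.crc32(stable) == csum)       # the data to be written still matches
print(zlib.crc32(client_buf) == csum)   # the live buffer no longer does
```

If the checksum were computed over `client_buf` directly and the data written out afterwards, a modification between the two steps would leave a checksum on disk that does not match the data.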
Obviously, none of these will be as fast as doing DMA directly to user
buffers. However, all of these features (except for encryption using
inline encryption hardware) come at a performance penalty already.
I just don't want a FUSE server to have to pay a much larger penalty
than a kernel filesystem would.
I'm CCing the bcachefs, BTRFS, and ZFS-on-Linux mailing lists.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 9:19 ` [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance Demi Marie Obenour
@ 2025-11-19 9:41 ` Gao Xiang
2025-11-19 18:04 ` Darrick J. Wong
1 sibling, 0 replies; 9+ messages in thread
From: Gao Xiang @ 2025-11-19 9:41 UTC (permalink / raw)
To: Demi Marie Obenour, djwong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 2025/11/19 17:19, Demi Marie Obenour wrote:
>> By keeping the I/O path mostly within the kernel, we can dramatically
>> increase the speed of disk-based filesystems.
>
> ZFS, BTRFS, and bcachefs all support compression, checksumming,
> and RAID. ZFS and bcachefs also support encryption, and f2fs and
> ext4 support fscrypt.
>
> Will this patchset be able to improve FUSE implementations of these
> filesystems? I'd rather not be in the situation where one can have
> a FUSE filesystem that is fast, but only if it doesn't support modern
> data integrity or security features.
>
> I'm not a filesystem developer, but here are some ideas (that you
> can take or leave):
>
> 1. Keep the compression, checksumming, and/or encryption in-kernel,
> and have userspace tell the kernel what algorithm and/or encryption
I don't think this is generally feasible unless it's limited to
specific implementations, because each kind of transformed (encoded)
on-disk data has its own design, unlike raw data.
Although the algorithms are well known, the on-disk data could
be wrapped with headers, footers, or specific markers.
I think it could be possible for fscrypt or fsverity specifically
(although, for example, I'm not sure ZFS is 100% compatible with
fscrypt or fsverity, if it implements similar features), but for
generic compression, checksumming, and encryption, filesystem
implementations can do things in various ways (even in various
orders), possibly with additional representations.
> key to use. These algorithms are generally well-known and secure
> against malicious input. It might be necessary to make an extra
> data copy, but ideally that copy could just stay within the
> CPU caches.
Thanks,
Gao Xiang
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 9:19 ` [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance Demi Marie Obenour
2025-11-19 9:41 ` Gao Xiang
@ 2025-11-19 18:04 ` Darrick J. Wong
2025-11-19 21:00 ` Gao Xiang
2025-11-20 1:05 ` Demi Marie Obenour
1 sibling, 2 replies; 9+ messages in thread
From: Darrick J. Wong @ 2025-11-19 18:04 UTC (permalink / raw)
To: Demi Marie Obenour
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
> > By keeping the I/O path mostly within the kernel, we can dramatically
> > increase the speed of disk-based filesystems.
>
> ZFS, BTRFS, and bcachefs all support compression, checksumming,
> and RAID. ZFS and bcachefs also support encryption, and f2fs and
> ext4 support fscrypt.
>
> Will this patchset be able to improve FUSE implementations of these
> filesystems? I'd rather not be in the situation where one can have
> a FUSE filesystem that is fast, but only if it doesn't support modern
> data integrity or security features.
Not on its own, no.
> I'm not a filesystem developer, but here are some ideas (that you
> can take or leave):
>
> 1. Keep the compression, checksumming, and/or encryption in-kernel,
> and have userspace tell the kernel what algorithm and/or encryption
> key to use. These algorithms are generally well-known and secure
> against malicious input. It might be necessary to make an extra
> data copy, but ideally that copy could just stay within the
> CPU caches.
I think this is easily doable for fscrypt and compression since (IIRC)
the kernel filesystems already know how to transform data for I/O, and
nowadays iomap allows hooking of bios before submission and/or after
endio. Obviously you'd have to store encryption keys in the kernel
somewhere.
Checksumming is harder though, since the checksum information has to be
persisted in the metadata somewhere and AFAICT each checksumming fs does
things differently. For that, I think the fuse server would have to
convey to the kernel (a) a description of the checksum geometry and (b)
a buffer for storing the checksums. On write the kernel would compute
the checksum and write it to the buffer for the fs to persist as part of
the ioend; and for read the fuse server would have to read the checksums
into the buffer and pass that to the kernel.
(Note that fsverity won't have this problem, because all current
implementations stuff the Merkle tree into post-EOF data blocks; the
fsverity code only wants filesystems to read it into the pagecache and
pass it the page.)
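The checksum handoff described above can be sketched in userspace as follows. This is a rough model under stated assumptions, not an interface from the patchset: `ChecksumGeometry`, `fill_checksums`, and `verify_checksums` are hypothetical names, and CRC32 with a 4-byte little-endian slot per block is chosen only to keep the example short.

```python
import struct
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecksumGeometry:
    """Hypothetical description the fuse server would convey to the kernel."""
    block_size: int  # bytes of file data covered by one checksum
    csum_size: int   # bytes occupied by one stored checksum


def fill_checksums(data: bytes, geo: ChecksumGeometry, csum_buf: bytearray) -> None:
    """Write side: compute one checksum per block into the shared buffer.

    The server would then persist csum_buf in its own metadata format.
    """
    assert geo.csum_size == 4, "this sketch only packs 32-bit checksums"
    for blk, start in enumerate(range(0, len(data), geo.block_size)):
        crc = zlib.crc32(data[start:start + geo.block_size])
        struct.pack_into("<I", csum_buf, blk * geo.csum_size, crc)


def verify_checksums(data: bytes, geo: ChecksumGeometry, csum_buf: bytes) -> bool:
    """Read side: the server has loaded csum_buf; check each block against it."""
    assert geo.csum_size == 4, "this sketch only unpacks 32-bit checksums"
    for blk, start in enumerate(range(0, len(data), geo.block_size)):
        (expected,) = struct.unpack_from("<I", csum_buf, blk * geo.csum_size)
        if zlib.crc32(data[start:start + geo.block_size]) != expected:
            return False
    return True
```

The point of the split is that the side doing the I/O only needs the geometry and a buffer; where and how the checksums are persisted stays entirely with the filesystem server.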
> 2. Somehow integrate with the blk-crypto framework. This has the
> advantage that it supports inline encryption hardware, which
> I suspect is needed for this to be usable on mobile devices.
> After all, the keys on these systems are often not even visible
> to the kernel, let alone to userspace.
Yes, that would be even easier than messing around with bounce buffers.
> 3. Figure out a way to make a userspace data path fast enough.
> To prevent data corruption by unprivileged users of the FS,
> it's necessary to make a copy before checksumming, compression,
> or authenticated encryption. If this copy is done in the kernel,
> the server doesn't have to perform its own copy. By using large
> ring buffers, it might be possible to amortize the context switch
> cost away.
>
> Authenticated encryption also needs a copy in the *other* direction:
> if the (untrusted) client can see unauthenticated plaintext, it's
> a security vulnerability. That needs another copy from server
> buffers to client buffers, and the kernel can do that as well.
>
> 4. Make context switches much faster. L4-style IPC is incredibly fast,
> at least if one doesn't have to worry about Spectre. Unfortunately,
> nowadays one *does* need to worry about Spectre.
I don't think context switching overhead is going down.
> Obviously, none of these will be as fast as doing DMA directly to user
> buffers. However, all of these features (except for encryption using
> inline encryption hardware) come at a performance penalty already.
> I just don't want a FUSE server to have to pay a much larger penalty
> than a kernel filesystem would.
>
> I'm CCing the bcachefs, BTRFS, and ZFS-on-Linux mailing lists.
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 18:04 ` Darrick J. Wong
@ 2025-11-19 21:00 ` Gao Xiang
2025-11-19 21:51 ` Gao Xiang
2025-11-20 1:10 ` Demi Marie Obenour
2025-11-20 1:05 ` Demi Marie Obenour
1 sibling, 2 replies; 9+ messages in thread
From: Gao Xiang @ 2025-11-19 21:00 UTC (permalink / raw)
To: Darrick J. Wong, Demi Marie Obenour
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 2025/11/20 02:04, Darrick J. Wong wrote:
> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>> increase the speed of disk-based filesystems.
>>
>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>> ext4 support fscrypt.
>>
>> Will this patchset be able to improve FUSE implementations of these
>> filesystems? I'd rather not be in the situation where one can have
>> a FUSE filesystem that is fast, but only if it doesn't support modern
>> data integrity or security features.
>
> Not on its own, no.
>
>> I'm not a filesystem developer, but here are some ideas (that you
>> can take or leave):
>>
>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>> and have userspace tell the kernel what algorithm and/or encryption
>> key to use. These algorithms are generally well-known and secure
>> against malicious input. It might be necessary to make an extra
>> data copy, but ideally that copy could just stay within the
>> CPU caches.
>
> I think this is easily doable for fscrypt and compression since (IIRC)
> the kernel filesystems already know how to transform data for I/O, and
> nowadays iomap allows hooking of bios before submission and/or after
> endio. Obviously you'd have to store encryption keys in the kernel
> somewhere.
I think it depends: this approach tries to reuse some of the
existing kernel filesystem implementations (assuming the code is
safe), so it still needs at least a dedicated kernel module to be
loaded for such usage.
I think that's not an issue for userspace ext4, of course (because ext4
and fscrypt are built into almost all kernels), but for out-of-tree
filesystems, or even pure userspace filesystems, I'm not sure it's
doable to load the module in a container context.
Maybe eBPF would be useful in this area, but it's still not as
flexible as native kernel filesystems.
Thanks,
Gao Xiang
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 21:00 ` Gao Xiang
@ 2025-11-19 21:51 ` Gao Xiang
2025-11-20 1:13 ` Demi Marie Obenour
2025-11-20 1:10 ` Demi Marie Obenour
1 sibling, 1 reply; 9+ messages in thread
From: Gao Xiang @ 2025-11-19 21:51 UTC (permalink / raw)
To: Darrick J. Wong, Demi Marie Obenour
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 2025/11/20 05:00, Gao Xiang wrote:
>
>
> On 2025/11/20 02:04, Darrick J. Wong wrote:
>> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>>> increase the speed of disk-based filesystems.
>>>
>>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>>> ext4 support fscrypt.
>>>
>>> Will this patchset be able to improve FUSE implementations of these
>>> filesystems? I'd rather not be in the situation where one can have
>>> a FUSE filesystem that is fast, but only if it doesn't support modern
>>> data integrity or security features.
>>
>> Not on its own, no.
>>
>>> I'm not a filesystem developer, but here are some ideas (that you
>>> can take or leave):
>>>
>>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>>> and have userspace tell the kernel what algorithm and/or encryption
>>> key to use. These algorithms are generally well-known and secure
>>> against malicious input. It might be necessary to make an extra
>>> data copy, but ideally that copy could just stay within the
>>> CPU caches.
>>
>> I think this is easily doable for fscrypt and compression since (IIRC)
>> the kernel filesystems already know how to transform data for I/O, and
>> nowadays iomap allows hooking of bios before submission and/or after
>> endio. Obviously you'd have to store encryption keys in the kernel
>> somewhere.
>
> I think it depends: this approach tries to reuse some of the
> existing kernel filesystem implementations (assuming the code is
> safe), so it still needs at least a dedicated kernel module to be
> loaded for such usage.
>
> I think that's not an issue for userspace ext4, of course (because ext4
> and fscrypt are built into almost all kernels), but for out-of-tree
> filesystems, or even pure userspace filesystems, I'm not sure it's
> doable to load the module in a container context.
Two examples for reference:
- For compression, in-tree f2fs already has a compression header
in the data of each compressed extent:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/f2fs/f2fs.h?h=v6.17#n1497
while other filesystems may store additional metadata in the extent
metadata or elsewhere.
- gocryptfs (a pure userspace FUSE fs) uses a different format
from fscrypt (the encrypted data even appears to be unaligned on disk):
https://github.com/rfjakob/gocryptfs/blob/master/Documentation/file-format.md
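As a small illustration of the alignment point in the second example, the sketch below computes where each ciphertext block would start on disk. The constants are assumptions taken from one reading of the linked format document (an 18-byte file header, 4096-byte plaintext blocks, and 32 bytes of per-block overhead for nonce and authentication tag); please check them against the document before relying on them. `cipher_block_offset` is a hypothetical helper, not a gocryptfs function.

```python
# Assumed layout constants (see the lead-in; not verified against the code).
HEADER_LEN = 18          # per-file header bytes before the first block
PLAIN_BLOCK = 4096       # plaintext bytes per block
BLOCK_OVERHEAD = 32      # per-block nonce + authentication tag bytes
CIPHER_BLOCK = PLAIN_BLOCK + BLOCK_OVERHEAD


def cipher_block_offset(block_no: int) -> int:
    """On-disk byte offset at which ciphertext block `block_no` starts."""
    return HEADER_LEN + block_no * CIPHER_BLOCK


def is_sector_aligned(offset: int, sector: int = 512) -> bool:
    """True if `offset` falls on a sector boundary."""
    return offset % sector == 0


for n in range(3):
    off = cipher_block_offset(n)
    print(n, off, is_sector_aligned(off))
```

Under these assumptions no block ever starts on a 512-byte boundary, because the header length is not a multiple of the per-block overhead's granularity, so a 1:1 mapping from file blocks to device blocks is not available.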
>
> Maybe eBPF would be useful in this area, but it's still not as
> flexible as native kernel filesystems.
>
> Thanks,
> Gao Xiang
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 18:04 ` Darrick J. Wong
2025-11-19 21:00 ` Gao Xiang
@ 2025-11-20 1:05 ` Demi Marie Obenour
1 sibling, 0 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2025-11-20 1:05 UTC (permalink / raw)
To: Darrick J. Wong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
Thank you so much for the helpful responses!
On 11/19/25 13:04, Darrick J. Wong wrote:
> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>> increase the speed of disk-based filesystems.
>>
>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>> ext4 support fscrypt.
>>
>> Will this patchset be able to improve FUSE implementations of these
>> filesystems? I'd rather not be in the situation where one can have
>> a FUSE filesystem that is fast, but only if it doesn't support modern
>> data integrity or security features.
>
> Not on its own, no.
Not surprised. I'm mostly curious if there is a path forward to add
such support in the future.
>> I'm not a filesystem developer, but here are some ideas (that you
>> can take or leave):
>>
>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>> and have userspace tell the kernel what algorithm and/or encryption
>> key to use. These algorithms are generally well-known and secure
>> against malicious input. It might be necessary to make an extra
>> data copy, but ideally that copy could just stay within the
>> CPU caches.
>
> I think this is easily doable for fscrypt and compression since (IIRC)
> the kernel filesystems already know how to transform data for I/O, and
> nowadays iomap allows hooking of bios before submission and/or after
> endio. Obviously you'd have to store encryption keys in the kernel
> somewhere.
>
> Checksumming is harder though, since the checksum information has to be
> persisted in the metadata somewhere and AFAICT each checksumming fs does
> things differently. For that, I think the fuse server would have to
> convey to the kernel (a) a description of the checksum geometry and (b)
> a buffer for storing the checksums. On write the kernel would compute
> the checksum and write it to the buffer for the fs to persist as part of
> the ioend; and for read the fuse server would have to read the checksums
> into the buffer and pass that to the kernel.
That definitely sounds doable. Bcachefs, and I believe ZFS and BTRFS,
store the checksum in the pointer. This means that when the kernel
is asked to read data from the buffer, the checksum or authentication
tag is already available.
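A toy model of the checksum-in-the-pointer layout may help show why the checksum is already in hand at read time. This is only a sketch: `BlockPointer`, `write_block`, and `read_block` are hypothetical names, CRC32 is a stand-in, and the structure is not the on-disk format of any of the filesystems mentioned.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockPointer:
    """Hypothetical parent-side reference to a data block."""
    offset: int  # where the block lives on the "disk"
    length: int  # how many bytes it covers
    csum: int    # checksum of the block, stored alongside the address


def write_block(disk: bytearray, offset: int, data: bytes) -> BlockPointer:
    """Store data and return the pointer the parent metadata would keep."""
    disk[offset:offset + len(data)] = data
    return BlockPointer(offset, len(data), zlib.crc32(data))


def read_block(disk: bytearray, ptr: BlockPointer) -> bytes:
    """Read through a pointer; the expected checksum needs no extra lookup."""
    data = bytes(disk[ptr.offset:ptr.offset + ptr.length])
    if zlib.crc32(data) != ptr.csum:
        raise OSError("checksum mismatch for block at offset %d" % ptr.offset)
    return data
```

Because the pointer has to be read before the data can be located, whoever issues the read already holds the expected checksum and could pass it along with the request.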
For CoW filesystems, there is still the problem that every write
requires a metadata operation. Does that mean these filesystems
will not be able to benefit for writes? Or can the latency be
hidden somehow?
> (Note that fsverity won't have this problem, because all current
> implementations stuff the Merkle tree into post-EOF data blocks; the
> fsverity code only wants filesystems to read it into the pagecache and
> pass it the page.)
>
>> 2. Somehow integrate with the blk-crypto framework. This has the
>> advantage that it supports inline encryption hardware, which
>> I suspect is needed for this to be usable on mobile devices.
>> After all, the keys on these systems are often not even visible
>> to the kernel, let alone to userspace.
>
> Yes, that would be even easier than messing around with bounce buffers.
Makes sense.
>> 3. Figure out a way to make a userspace data path fast enough.
>> To prevent data corruption by unprivileged users of the FS,
>> it's necessary to make a copy before checksumming, compression,
>> or authenticated encryption. If this copy is done in the kernel,
>> the server doesn't have to perform its own copy. By using large
>> ring buffers, it might be possible to amortize the context switch
>> cost away.
>>
>> Authenticated encryption also needs a copy in the *other* direction:
>> if the (untrusted) client can see unauthenticated plaintext, it's
>> a security vulnerability. That needs another copy from server
>> buffers to client buffers, and the kernel can do that as well.
>>
>> 4. Make context switches much faster. L4-style IPC is incredibly fast,
>> at least if one doesn't have to worry about Spectre. Unfortunately,
>> nowadays one *does* need to worry about Spectre.
>
> I don't think context switching overhead is going down.
I agree, at least for big CPUs.
>> Obviously, none of these will be as fast as doing DMA directly to user
>> buffers. However, all of these features (except for encryption using
>> inline encryption hardware) come at a performance penalty already.
>> I just don't want a FUSE server to have to pay a much larger penalty
>> than a kernel filesystem would.
>>
>> I'm CCing the bcachefs, BTRFS, and ZFS-on-Linux mailing lists.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 21:00 ` Gao Xiang
2025-11-19 21:51 ` Gao Xiang
@ 2025-11-20 1:10 ` Demi Marie Obenour
2025-11-20 1:49 ` Gao Xiang
1 sibling, 1 reply; 9+ messages in thread
From: Demi Marie Obenour @ 2025-11-20 1:10 UTC (permalink / raw)
To: Gao Xiang, Darrick J. Wong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 11/19/25 16:00, Gao Xiang wrote:
>
>
> On 2025/11/20 02:04, Darrick J. Wong wrote:
>> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>>> increase the speed of disk-based filesystems.
>>>
>>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>>> ext4 support fscrypt.
>>>
>>> Will this patchset be able to improve FUSE implementations of these
>>> filesystems? I'd rather not be in the situation where one can have
>>> a FUSE filesystem that is fast, but only if it doesn't support modern
>>> data integrity or security features.
>>
>> Not on its own, no.
>>
>>> I'm not a filesystem developer, but here are some ideas (that you
>>> can take or leave):
>>>
>>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>>> and have userspace tell the kernel what algorithm and/or encryption
>>> key to use. These algorithms are generally well-known and secure
>>> against malicious input. It might be necessary to make an extra
>>> data copy, but ideally that copy could just stay within the
>>> CPU caches.
>>
>> I think this is easily doable for fscrypt and compression since (IIRC)
>> the kernel filesystems already know how to transform data for I/O, and
>> nowadays iomap allows hooking of bios before submission and/or after
>> endio. Obviously you'd have to store encryption keys in the kernel
>> somewhere.
>
> I think it depends: this approach tries to reuse some of the
> existing kernel filesystem implementations (assuming the code is
> safe), so it still needs at least a dedicated kernel module to be
> loaded for such usage.
My hope is that these modules could be generic library code.
Compression, checksumming, and encryption algorithms aren't specific
to any particular filesystem, and the kernel might well be using them
already for other purposes.
Of course, it's still the host admin's job to make sure the relevant
modules are loaded, unless they are autoloaded.
> I think that's not an issue for userspace ext4, of course (because ext4
> and fscrypt are built into almost all kernels), but for out-of-tree
> filesystems, or even pure userspace filesystems, I'm not sure it's
> doable to load the module in a container context.
>
> Maybe eBPF would be useful in this area, but it's still not as
> flexible as native kernel filesystems.
>
> Thanks,
> Gao Xiang
Thank you for the feedback!
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-19 21:51 ` Gao Xiang
@ 2025-11-20 1:13 ` Demi Marie Obenour
0 siblings, 0 replies; 9+ messages in thread
From: Demi Marie Obenour @ 2025-11-20 1:13 UTC (permalink / raw)
To: Gao Xiang, Darrick J. Wong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 11/19/25 16:51, Gao Xiang wrote:
> On 2025/11/20 05:00, Gao Xiang wrote:
>> On 2025/11/20 02:04, Darrick J. Wong wrote:
>>> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>>>> increase the speed of disk-based filesystems.
>>>>
>>>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>>>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>>>> ext4 support fscrypt.
>>>>
>>>> Will this patchset be able to improve FUSE implementations of these
>>>> filesystems? I'd rather not be in the situation where one can have
>>>> a FUSE filesystem that is fast, but only if it doesn't support modern
>>>> data integrity or security features.
>>>
>>> Not on its own, no.
>>>
>>>> I'm not a filesystem developer, but here are some ideas (that you
>>>> can take or leave):
>>>>
>>>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>>>> and have userspace tell the kernel what algorithm and/or encryption
>>>> key to use. These algorithms are generally well-known and secure
>>>> against malicious input. It might be necessary to make an extra
>>>> data copy, but ideally that copy could just stay within the
>>>> CPU caches.
>>>
>>> I think this is easily doable for fscrypt and compression since (IIRC)
>>> the kernel filesystems already know how to transform data for I/O, and
>>> nowadays iomap allows hooking of bios before submission and/or after
>>> endio. Obviously you'd have to store encryption keys in the kernel
>>> somewhere.
>>
>> I think it depends: this approach tries to reuse some of the
>> existing kernel filesystem implementations (assuming the code is
>> safe), so it still needs at least a dedicated kernel module to be
>> loaded for such usage.
>>
>> I think that's not an issue for userspace ext4, of course (because ext4
>> and fscrypt are built into almost all kernels), but for out-of-tree
>> filesystems, or even pure userspace filesystems, I'm not sure it's
>> doable to load the module in a container context.
>
> Two examples for reference:
>
> - For compression, in-tree f2fs already has a compression header
> in the data of each compressed extent:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/f2fs/f2fs.h?h=v6.17#n1497
>
> while other filesystems may store additional metadata in the extent
> metadata or elsewhere.
Extent metadata shouldn't be a problem, as that is already available
during reads and written asynchronously for writes. The headers are
awkward, though, and might need some special-casing.
> - gocryptfs (a pure userspace FUSE fs) uses a different format
> from fscrypt (the encrypted data even appears to be unaligned on disk):
> https://github.com/rfjakob/gocryptfs/blob/master/Documentation/file-format.md
This is probably an anti-pattern in general, as I expect it precludes
the use of inline encryption hardware via blk-crypto.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
2025-11-20 1:10 ` Demi Marie Obenour
@ 2025-11-20 1:49 ` Gao Xiang
0 siblings, 0 replies; 9+ messages in thread
From: Gao Xiang @ 2025-11-20 1:49 UTC (permalink / raw)
To: Demi Marie Obenour, Darrick J. Wong
Cc: bernd, joannelkoong, linux-ext4, linux-fsdevel, miklos, neal,
linux-bcachefs, linux-btrfs, zfs-devel
On 2025/11/20 09:10, Demi Marie Obenour wrote:
> On 11/19/25 16:00, Gao Xiang wrote:
>>
>>
>> On 2025/11/20 02:04, Darrick J. Wong wrote:
>>> On Wed, Nov 19, 2025 at 04:19:36AM -0500, Demi Marie Obenour wrote:
>>>>> By keeping the I/O path mostly within the kernel, we can dramatically
>>>>> increase the speed of disk-based filesystems.
>>>>
>>>> ZFS, BTRFS, and bcachefs all support compression, checksumming,
>>>> and RAID. ZFS and bcachefs also support encryption, and f2fs and
>>>> ext4 support fscrypt.
>>>>
>>>> Will this patchset be able to improve FUSE implementations of these
>>>> filesystems? I'd rather not be in the situation where one can have
>>>> a FUSE filesystem that is fast, but only if it doesn't support modern
>>>> data integrity or security features.
>>>
>>> Not on its own, no.
>>>
>>>> I'm not a filesystem developer, but here are some ideas (that you
>>>> can take or leave):
>>>>
>>>> 1. Keep the compression, checksumming, and/or encryption in-kernel,
>>>> and have userspace tell the kernel what algorithm and/or encryption
>>>> key to use. These algorithms are generally well-known and secure
>>>> against malicious input. It might be necessary to make an extra
>>>> data copy, but ideally that copy could just stay within the
>>>> CPU caches.
>>>
>>> I think this is easily doable for fscrypt and compression since (IIRC)
>>> the kernel filesystems already know how to transform data for I/O, and
>>> nowadays iomap allows hooking of bios before submission and/or after
>>> endio. Obviously you'd have to store encryption keys in the kernel
>>> somewhere.
>>
>> I think it depends: this approach tries to reuse some of the
>> existing kernel filesystem implementations (assuming the code is
>> safe), so it still needs at least a dedicated kernel module to be
>> loaded for such usage.
>
> My hope is that these modules could be generic library code.
Actually, generic library code of the kind you propose for
compression, checksumming, and encryption already exists in
"crypto/", but apart from checksumming, filesystems rarely use
it, mostly because of inflexibility (for example,
algorithms may have case-by-case advanced functionality).
> Compression, checksumming, and encryption algorithms aren't specific
> to any particular filesystem, and the kernel might well be using them
> already for other purposes.
>
> Of course, it's still the host admin's job to make sure the relevant
> modules are loaded, unless they are autoloaded.
My thought is still roughly that, although the algorithms may
be generic, the specific implementations still vary
because of differences in on-disk format (each
filesystem has its own special traits) and/or in how carefully
the designs were thought out. fscrypt and fsverity are the
Linux kernel reference implementations, but, for example,
the fsverity metadata representation is still taking a while for
the XFS folks to settle (of course, that doesn't affect the main
mechanism).
>
>> I think that's not an issue for userspace ext4, of course (because ext4
>> and fscrypt are built into almost all kernels), but for out-of-tree
>> filesystems, or even pure userspace filesystems, I'm not sure it's
>> doable to load the module in a container context.
>>
>> Maybe eBPF would be useful in this area, but it's still not as
>> flexible as native kernel filesystems.
>>
>> Thanks,
>> Gao Xiang
> Thank you for the feedback!
Thanks,
Gao Xiang