From: "Darrick J. Wong" <djwong@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, neal@gompa.dev, John@groves.net,
miklos@szeredi.hu, bernd@bsbernd.com, joannelkoong@gmail.com
Subject: Re: [PATCH 06/13] fuse: implement buffered IO with iomap
Date: Fri, 18 Jul 2025 11:01:21 -0700 [thread overview]
Message-ID: <20250718180121.GV2672029@frogsfrogsfrogs> (raw)
In-Reply-To: <CAOQ4uxjfTp0My7xv39BA1_nD95XLQd-TqERAMG-C4V3UFYpX8w@mail.gmail.com>
On Fri, Jul 18, 2025 at 05:10:14PM +0200, Amir Goldstein wrote:
> On Fri, Jul 18, 2025 at 1:32 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Implement pagecache IO with iomap, complete with hooks into truncate and
> > fallocate so that the fuse server needn't implement disk block zeroing
> > of post-EOF and unaligned punch/zero regions.
> >
> > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > ---
> > fs/fuse/fuse_i.h | 46 +++
> > fs/fuse/fuse_trace.h | 391 ++++++++++++++++++++++++
> > include/uapi/linux/fuse.h | 5
> > fs/fuse/dir.c | 23 +
> > fs/fuse/file.c | 90 +++++-
> > fs/fuse/file_iomap.c | 723 +++++++++++++++++++++++++++++++++++++++++++++
> > fs/fuse/inode.c | 14 +
> > 7 files changed, 1268 insertions(+), 24 deletions(-)
> >
> >
> > diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> > index 67e428da4391aa..f33b348d296d5e 100644
> > --- a/fs/fuse/fuse_i.h
> > +++ b/fs/fuse/fuse_i.h
> > @@ -161,6 +161,13 @@ struct fuse_inode {
> >
> > /* waitq for direct-io completion */
> > wait_queue_head_t direct_io_waitq;
> > +
> > +#ifdef CONFIG_FUSE_IOMAP
> > + /* pending io completions */
> > + spinlock_t ioend_lock;
> > + struct work_struct ioend_work;
> > + struct list_head ioend_list;
> > +#endif
> > };
>
> This union member you are changing is declared for
> /* read/write io cache (regular file only) */
> but actually it is also for parallel dio and passthrough mode
>
> IIUC, there should be zero intersection between these io modes and
> /* iomap cached fileio (regular file only) */
>
> Right?
Right. iomap will get very very confused if you switch file IO paths on
a live file. I think it's /possible/ to switch if you flush and
truncate the whole page cache while holding inode_lock() but I don't
think anyone has ever tried.
> So it can use its own union member without increasing fuse_inode size.
>
> Just need to be carefull in fuse_init_file_inode(), fuse_evict_inode() and
> fuse_file_io_release() which do not assume a specific inode io mode.
Yes, I think it's possible to put the iomap stuff in a separate struct
within that union so that we're not increasing the fuse_inode size
unnecessarily. That's desirable for something to do before merging,
but for now prototyping is /much/ easier if I don't have to do that.
Making that change will require a lot of careful auditing, first I want
to make sure you all agree with the iomap approach because it's much
different from what I see in the other fuse IO paths. :)
Eeeyiks, struct fuse_inode shrinks from 1272 bytes to 1152 if I push the
iomap stuff into its own union struct.
> Was it your intention to allow filesystems to configure some inodes to be
> in file_iomap mode and other inodes to be in regular cached/direct/passthrough
> io modes?
That was a deliberate design decision on my part -- maybe a fuse server
would be capable of serving up some files from a local disk, and others
from (say) a network filesystem. Or maybe it would like to expose an
administrative fd for the filesystem (like the xfs_healer event stream)
that isn't backed by storage.
> I can't say that I see a big benefit in allowing such setups.
> It certainly adds a lot of complication to the test matrix if we allow that.
> My instinct is for initial version, either allow only opening files in
> FILE_IOMAP or
> DIRECT_IOMAP to inodes for a filesystem that supports those modes.
I was thinking about combining FUSE_ATTR_IOMAP_(DIRECTIO|FILEIO) for the
next RFC because I can't imagine any scenario where you don't want
directio support if you already use iomap for the pagecache. fuse iomap
requires directio write support for writeback, so the server *must*
support IOMAP_WRITE|IOMAP_DIRECT.
> Perhaps later we can allow (and maybe fallback to) FOPEN_DIRECT_IO
> (without parallel dio) if a server does not configure IOMAP to some inode
> to allow a server to provide the data for a specific inode directly.
Hrmm. Is FOPEN_DIRECT_IO the magic flag that fuse passes to the fuse
server to tell it that a file is open in directio mode? There's a few
fstests that initiate aio+dio writes to a dm-error device that currently
fail in non-iomap mode because fuse2fs writes everything to the bdev
pagecache.
> fuse_file_io_open/release() can help you manage those restrictions and
> set ff->iomode = IOM_FILE_IOMAP when a file is opened for file iomap.
> I did not look closely enough to see if file_iomap code ends up setting
> ff->iomode = IOM_CACHED/UNCACHED or always remains IOM_NONE.
I don't touch ff->iomode because iomap is a per-inode property, not a
per-file property... but I suppose that would be a good place to look.
Thank you for the feedback!
--D
next prev parent reply other threads:[~2025-07-18 18:01 UTC|newest]
Thread overview: 174+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-17 23:10 [RFC v3] fuse: use fs-iomap for better performance so we can containerize ext4 Darrick J. Wong
2025-07-17 23:23 ` [PATCHSET RFC v3 1/4] fuse: fixes and cleanups ahead of iomap support Darrick J. Wong
2025-07-17 23:26 ` [PATCH 1/7] fuse: fix livelock in synchronous file put from fuseblk workers Darrick J. Wong
2025-07-17 23:26 ` [PATCH 2/7] fuse: flush pending fuse events before aborting the connection Darrick J. Wong
2025-07-18 16:37 ` Bernd Schubert
2025-07-18 17:50 ` Joanne Koong
2025-07-18 17:57 ` Bernd Schubert
2025-07-18 18:38 ` Darrick J. Wong
2025-07-18 18:07 ` Bernd Schubert
2025-07-18 18:13 ` Bernd Schubert
2025-07-18 19:34 ` Darrick J. Wong
2025-07-18 21:03 ` Bernd Schubert
2025-07-18 22:23 ` Joanne Koong
2025-07-19 0:32 ` Darrick J. Wong
2025-07-21 20:32 ` Joanne Koong
2025-07-23 17:34 ` Darrick J. Wong
2025-07-23 21:02 ` Joanne Koong
2025-07-23 21:11 ` Joanne Koong
2025-07-24 22:28 ` Darrick J. Wong
2025-07-22 12:30 ` Jeff Layton
2025-07-22 12:38 ` Jeff Layton
2025-07-23 15:37 ` Darrick J. Wong
2025-07-23 16:24 ` Jeff Layton
2025-07-31 9:45 ` Christian Brauner
2025-07-31 17:52 ` Darrick J. Wong
2025-07-19 7:18 ` Amir Goldstein
2025-07-21 20:05 ` Joanne Koong
2025-07-23 17:06 ` Darrick J. Wong
2025-07-23 20:27 ` Joanne Koong
2025-07-24 22:34 ` Darrick J. Wong
2025-07-17 23:27 ` [PATCH 3/7] fuse: capture the unique id of fuse commands being sent Darrick J. Wong
2025-07-18 17:10 ` Bernd Schubert
2025-07-18 18:13 ` Darrick J. Wong
2025-07-22 22:20 ` Bernd Schubert
2025-07-17 23:27 ` [PATCH 4/7] fuse: implement file attributes mask for statx Darrick J. Wong
2025-08-18 15:11 ` Miklos Szeredi
2025-08-18 20:01 ` Darrick J. Wong
2025-08-18 20:04 ` Darrick J. Wong
2025-08-19 15:01 ` Miklos Szeredi
2025-08-19 22:51 ` Darrick J. Wong
2025-08-20 9:16 ` Miklos Szeredi
2025-08-20 9:40 ` Miklos Szeredi
2025-08-20 15:16 ` Darrick J. Wong
2025-08-20 15:31 ` Miklos Szeredi
2025-08-20 15:09 ` Darrick J. Wong
2025-08-20 15:23 ` Miklos Szeredi
2025-08-20 15:29 ` Darrick J. Wong
2025-07-17 23:27 ` [PATCH 5/7] iomap: exit early when iomap_iter is called with zero length Darrick J. Wong
2025-07-17 23:27 ` [PATCH 6/7] iomap: trace iomap_zero_iter zeroing activities Darrick J. Wong
2025-07-17 23:28 ` [PATCH 7/7] iomap: error out on file IO when there is no inline_data buffer Darrick J. Wong
2025-07-17 23:24 ` [PATCHSET RFC v3 2/4] fuse: allow servers to use iomap for better file IO performance Darrick J. Wong
2025-07-17 23:28 ` [PATCH 01/13] fuse: implement the basic iomap mechanisms Darrick J. Wong
2025-07-17 23:28 ` [PATCH 02/13] fuse: add an ioctl to add new iomap devices Darrick J. Wong
2025-07-17 23:28 ` [PATCH 03/13] fuse: flush events and send FUSE_SYNCFS and FUSE_DESTROY on unmount Darrick J. Wong
2025-07-17 23:29 ` [PATCH 04/13] fuse: implement basic iomap reporting such as FIEMAP and SEEK_{DATA,HOLE} Darrick J. Wong
2025-07-17 23:29 ` [PATCH 05/13] fuse: implement direct IO with iomap Darrick J. Wong
2025-07-17 23:29 ` [PATCH 06/13] fuse: implement buffered " Darrick J. Wong
2025-07-18 15:10 ` Amir Goldstein
2025-07-18 18:01 ` Darrick J. Wong [this message]
2025-07-18 18:39 ` Bernd Schubert
2025-07-18 18:46 ` Darrick J. Wong
2025-07-18 19:45 ` Amir Goldstein
2025-07-18 20:20 ` Darrick J. Wong
2025-07-17 23:29 ` [PATCH 07/13] fuse: enable caching of timestamps Darrick J. Wong
2025-07-17 23:30 ` [PATCH 08/13] fuse: implement large folios for iomap pagecache files Darrick J. Wong
2025-07-17 23:30 ` [PATCH 09/13] fuse: use an unrestricted backing device with iomap pagecache io Darrick J. Wong
2025-07-17 23:30 ` [PATCH 10/13] fuse: advertise support for iomap Darrick J. Wong
2025-07-17 23:31 ` [PATCH 11/13] fuse: query filesystem geometry when using iomap Darrick J. Wong
2025-07-17 23:31 ` [PATCH 12/13] fuse: implement fadvise for iomap files Darrick J. Wong
2025-07-17 23:31 ` [PATCH 13/13] fuse: implement inline data file IO via iomap Darrick J. Wong
2025-07-17 23:24 ` [PATCHSET RFC v3 3/4] fuse: cache iomap mappings for even better file IO performance Darrick J. Wong
2025-07-17 23:31 ` [PATCH 1/4] fuse: cache iomaps Darrick J. Wong
2025-07-17 23:32 ` [PATCH 2/4] fuse: use the iomap cache for iomap_begin Darrick J. Wong
2025-07-17 23:32 ` [PATCH 3/4] fuse: invalidate iomap cache after file updates Darrick J. Wong
2025-07-17 23:32 ` [PATCH 4/4] fuse: enable iomap cache management Darrick J. Wong
2025-07-17 23:24 ` [PATCHSET RFC v3 4/4] fuse: handle timestamps and ACLs correctly when iomap is enabled Darrick J. Wong
2025-07-17 23:32 ` [PATCH 1/7] fuse: force a ctime update after a fileattr_set call when in iomap mode Darrick J. Wong
2025-07-17 23:33 ` [PATCH 2/7] fuse: synchronize inode->i_flags after fileattr_[gs]et Darrick J. Wong
2025-07-17 23:33 ` [PATCH 3/7] fuse: cache atime when in iomap mode Darrick J. Wong
2025-07-17 23:33 ` [PATCH 4/7] fuse: update file mode when updating acls Darrick J. Wong
2025-07-17 23:33 ` [PATCH 5/7] fuse: propagate default and file acls on creation Darrick J. Wong
2025-07-17 23:34 ` [PATCH 6/7] fuse: let the kernel handle KILL_SUID/KILL_SGID for iomap filesystems Darrick J. Wong
2025-07-17 23:34 ` [PATCH 7/7] fuse: update ctime when updating acls on an iomap inode Darrick J. Wong
2025-07-17 23:25 ` [PATCHSET RFC v3 1/3] libfuse: allow servers to use iomap for better file IO performance Darrick J. Wong
2025-07-17 23:34 ` [PATCH 01/14] libfuse: add kernel gates for FUSE_IOMAP and bump libfuse api version Darrick J. Wong
2025-07-17 23:34 ` [PATCH 02/14] libfuse: add fuse commands for iomap_begin and end Darrick J. Wong
2025-07-17 23:35 ` [PATCH 03/14] libfuse: add upper level iomap commands Darrick J. Wong
2025-07-17 23:35 ` [PATCH 04/14] libfuse: add a notification to add a new device to iomap Darrick J. Wong
2025-07-17 23:35 ` [PATCH 05/14] libfuse: add iomap ioend low level handler Darrick J. Wong
2025-07-17 23:35 ` [PATCH 06/14] libfuse: add upper level iomap ioend commands Darrick J. Wong
2025-07-17 23:36 ` [PATCH 07/14] libfuse: add a reply function to send FUSE_ATTR_* to the kernel Darrick J. Wong
2025-07-18 14:10 ` Amir Goldstein
2025-07-18 15:48 ` Darrick J. Wong
2025-07-19 7:34 ` Amir Goldstein
2025-07-17 23:36 ` [PATCH 08/14] libfuse: connect high level fuse library to fuse_reply_attr_iflags Darrick J. Wong
2025-07-18 14:27 ` Amir Goldstein
2025-07-18 15:55 ` Darrick J. Wong
2025-07-21 18:51 ` Bernd Schubert
2025-07-23 17:50 ` Darrick J. Wong
2025-07-24 19:56 ` Amir Goldstein
2025-07-29 5:35 ` Darrick J. Wong
2025-07-29 7:50 ` Amir Goldstein
2025-07-29 14:22 ` Darrick J. Wong
2025-07-17 23:36 ` [PATCH 09/14] libfuse: add FUSE_IOMAP_DIRECTIO Darrick J. Wong
2025-07-17 23:37 ` [PATCH 10/14] libfuse: add FUSE_IOMAP_FILEIO Darrick J. Wong
2025-07-17 23:37 ` [PATCH 11/14] libfuse: allow discovery of the kernel's iomap capabilities Darrick J. Wong
2025-07-17 23:37 ` [PATCH 12/14] libfuse: add lower level iomap_config implementation Darrick J. Wong
2025-07-17 23:37 ` [PATCH 13/14] libfuse: add upper " Darrick J. Wong
2025-07-17 23:38 ` [PATCH 14/14] libfuse: add strictatime/lazytime mount options Darrick J. Wong
2025-07-17 23:25 ` [PATCHSET RFC v3 2/3] libfuse: cache iomap mappings for even better file IO performance Darrick J. Wong
2025-07-17 23:38 ` [PATCH 1/1] libfuse: enable iomap cache management Darrick J. Wong
2025-07-18 16:16 ` Bernd Schubert
2025-07-18 18:22 ` Darrick J. Wong
2025-07-18 18:35 ` Bernd Schubert
2025-07-18 18:40 ` Darrick J. Wong
2025-07-18 18:51 ` Bernd Schubert
2025-07-17 23:25 ` [PATCHSET RFC v3 3/3] libfuse: implement statx and syncfs Darrick J. Wong
2025-07-17 23:38 ` [PATCH 1/4] libfuse: wire up FUSE_SYNCFS to the low level library Darrick J. Wong
2025-07-17 23:38 ` [PATCH 2/4] libfuse: add syncfs support to the upper library Darrick J. Wong
2025-07-17 23:39 ` [PATCH 3/4] libfuse: add statx support to the lower level library Darrick J. Wong
2025-07-18 13:28 ` Amir Goldstein
2025-07-18 15:58 ` Darrick J. Wong
2025-07-18 16:27 ` Darrick J. Wong
2025-07-18 16:54 ` Bernd Schubert
2025-07-18 18:42 ` Darrick J. Wong
2025-07-17 23:39 ` [PATCH 4/4] libfuse: add upper level statx hooks Darrick J. Wong
2025-07-17 23:25 ` [PATCHSET RFC v3 1/3] fuse2fs: use fuse iomap data paths for better file I/O performance Darrick J. Wong
2025-07-17 23:39 ` [PATCH 01/22] fuse2fs: implement bare minimum iomap for file mapping reporting Darrick J. Wong
2025-07-17 23:39 ` [PATCH 02/22] fuse2fs: add iomap= mount option Darrick J. Wong
2025-07-17 23:40 ` [PATCH 03/22] fuse2fs: implement iomap configuration Darrick J. Wong
2025-07-17 23:40 ` [PATCH 04/22] fuse2fs: register block devices for use with iomap Darrick J. Wong
2025-07-17 23:40 ` [PATCH 05/22] fuse2fs: always use directio disk reads with fuse2fs Darrick J. Wong
2025-07-17 23:40 ` [PATCH 06/22] fuse2fs: implement directio file reads Darrick J. Wong
2025-07-17 23:41 ` [PATCH 07/22] fuse2fs: use tagged block IO for zeroing sub-block regions Darrick J. Wong
2025-07-17 23:41 ` [PATCH 08/22] fuse2fs: only flush the cache for the file under directio read Darrick J. Wong
2025-07-17 23:41 ` [PATCH 09/22] fuse2fs: add extent dump function for debugging Darrick J. Wong
2025-07-17 23:41 ` [PATCH 10/22] fuse2fs: implement direct write support Darrick J. Wong
2025-07-17 23:42 ` [PATCH 11/22] fuse2fs: turn on iomap for pagecache IO Darrick J. Wong
2025-07-17 23:42 ` [PATCH 12/22] fuse2fs: improve tracing for fallocate Darrick J. Wong
2025-07-17 23:42 ` [PATCH 13/22] fuse2fs: don't zero bytes in punch hole Darrick J. Wong
2025-07-17 23:43 ` [PATCH 14/22] fuse2fs: don't do file data block IO when iomap is enabled Darrick J. Wong
2025-07-17 23:43 ` [PATCH 15/22] fuse2fs: disable most io channel flush/invalidate in iomap pagecache mode Darrick J. Wong
2025-07-17 23:43 ` [PATCH 16/22] fuse2fs: re-enable the block device pagecache for metadata IO Darrick J. Wong
2025-07-17 23:43 ` [PATCH 17/22] fuse2fs: avoid fuseblk mode if fuse-iomap support is likely Darrick J. Wong
2025-07-17 23:44 ` [PATCH 18/22] fuse2fs: don't allow hardlinks for now Darrick J. Wong
2025-07-17 23:44 ` [PATCH 19/22] fuse2fs: enable file IO to inline data files Darrick J. Wong
2025-07-17 23:44 ` [PATCH 20/22] fuse2fs: set iomap-related inode flags Darrick J. Wong
2025-07-17 23:44 ` [PATCH 21/22] fuse2fs: add strictatime/lazytime mount options Darrick J. Wong
2025-07-17 23:45 ` [PATCH 22/22] fuse2fs: configure block device block size Darrick J. Wong
2025-07-17 23:26 ` [PATCHSET RFC v3 2/3] fuse2fs: use fuse iomap data paths for better file I/O performance Darrick J. Wong
2025-07-17 23:45 ` [PATCH 1/1] fuse2fs: enable caching of iomaps Darrick J. Wong
2025-07-17 23:26 ` [PATCHSET RFC v3 3/3] fuse2fs: handle timestamps and ACLs correctly when iomap is enabled Darrick J. Wong
2025-07-17 23:45 ` [PATCH 01/10] fuse2fs: allow O_APPEND and O_TRUNC opens Darrick J. Wong
2025-07-17 23:45 ` [PATCH 02/10] fuse2fs: skip permission checking on utimens when iomap is enabled Darrick J. Wong
2025-07-17 23:46 ` [PATCH 03/10] fuse2fs: let the kernel tell us about acl/mode updates Darrick J. Wong
2025-07-17 23:46 ` [PATCH 04/10] fuse2fs: better debugging for file mode updates Darrick J. Wong
2025-07-17 23:46 ` [PATCH 05/10] fuse2fs: debug timestamp updates Darrick J. Wong
2025-07-17 23:46 ` [PATCH 06/10] fuse2fs: use coarse timestamps for iomap mode Darrick J. Wong
2025-07-17 23:47 ` [PATCH 07/10] fuse2fs: add tracing for retrieving timestamps Darrick J. Wong
2025-07-17 23:47 ` [PATCH 08/10] fuse2fs: enable syncfs Darrick J. Wong
2025-07-17 23:47 ` [PATCH 09/10] fuse2fs: skip the gdt write in op_destroy if syncfs is working Darrick J. Wong
2025-07-17 23:47 ` [PATCH 10/10] fuse2fs: implement statx Darrick J. Wong
2025-07-18 8:54 ` [RFC v3] fuse: use fs-iomap for better performance so we can containerize ext4 Christian Brauner
2025-07-18 11:55 ` Amir Goldstein
2025-07-18 19:31 ` Darrick J. Wong
2025-07-18 19:56 ` Amir Goldstein
2025-07-18 20:21 ` Darrick J. Wong
2025-07-23 13:05 ` Christian Brauner
2025-07-23 18:04 ` Darrick J. Wong
2025-07-31 10:13 ` Christian Brauner
2025-07-31 17:22 ` Darrick J. Wong
2025-08-04 10:12 ` Christian Brauner
2025-08-12 20:20 ` Darrick J. Wong
2025-08-15 14:20 ` Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250718180121.GV2672029@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=John@groves.net \
--cc=amir73il@gmail.com \
--cc=bernd@bsbernd.com \
--cc=joannelkoong@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=miklos@szeredi.hu \
--cc=neal@gompa.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).