From: "Darrick J. Wong" <djwong@kernel.org>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: miklos@szeredi.hu, bernd@bsbernd.com, neal@gompa.dev,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCHSET v6 4/8] fuse: allow servers to use iomap for better file IO performance
Date: Mon, 26 Jan 2026 18:22:35 -0800
Message-ID: <20260127022235.GG5900@frogsfrogsfrogs>
In-Reply-To: <CAJnrk1Z05QZmos90qmWtnWGF+Kb7rVziJ51UpuJ0O=A+6N1vrg@mail.gmail.com>
On Mon, Jan 26, 2026 at 04:59:16PM -0800, Joanne Koong wrote:
> On Tue, Oct 28, 2025 at 5:38 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > Hi all,
> >
> > This series connects fuse (the userspace filesystem layer) to fs-iomap
> > to get fuse servers out of the business of handling file I/O themselves.
> > By keeping the IO path mostly within the kernel, we can dramatically
> > improve the speed of disk-based filesystems. This enables us to move
> > all the filesystem metadata parsing code out of the kernel and into
> > userspace, which means that we can containerize them for security
> > without losing a lot of performance.
>
> I haven't looked through how the fuse2fs or fuse4fs servers are
> implemented yet (also, could you explain the difference between the
> two? Which one should we look at to see how it all ties together?),
fuse4fs is a lowlevel fuse server; fuse2fs is a high(?) level fuse
server. fuse4fs is the successor to fuse2fs, at least on Linux and BSD.
> but I wonder if having bpf infrastructure hooked up to fuse would be
> especially helpful for what you're doing here with fuse iomap. afaict,
> every read/write whether it's buffered or direct will incur at least 1
> call to ->iomap_begin() to get the mapping metadata, which will be 2
> context-switches (and if the server has ->iomap_end() implemented,
> then 2 more context-switches).
Yes, I agree that's a lot of context switching for file IO...
> But it seems like the logic for retrieving mapping
> offsets/lengths/metadata should be pretty straightforward?
...but it gets very cheap if the fuse server can cache mappings in the
kernel to avoid all that. That is, incidentally, what patchset #7
implements.
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-cache_2026-01-22
> If the extent lookups are table lookups or tree
> traversals without complex side effects, then having
> ->iomap_begin()/->iomap_end() be executed as a bpf program would avoid
> the context switches and allow all the caching logic to be moved from
> the kernel to the server-side (eg using bpf maps).
Hrmm. Now that /is/ an interesting proposal. Does BPF have a data
structure that supports interval mappings? I think the existing bpf map
only does key -> value. Also, is there an upper limit on the size of a
map? You could have hundreds of millions of mappings for a very
fragmented regular file.
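
To make the "interval mapping" question concrete, here is a minimal
sketch of the lookup in plain userspace C, assuming the mappings sit in
an array sorted by file offset with no overlaps. struct demo_extent and
demo_extent_lookup() are made-up names for illustration; they are not
the fuse iomap uapi and not a bpf map API. The point is that the query
is "find the predecessor of pos, then check its length", which an
exact-match key -> value lookup cannot answer by itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent record; NOT the fuse iomap uapi structure. */
struct demo_extent {
	uint64_t file_off;	/* first file byte covered */
	uint64_t len;		/* number of bytes covered */
	uint64_t dev_addr;	/* device byte address of file_off */
};

/*
 * Find the extent containing @pos in @tbl, which must be sorted by
 * file_off and non-overlapping.  Returns NULL if @pos is in a hole.
 */
static const struct demo_extent *
demo_extent_lookup(const struct demo_extent *tbl, size_t nr, uint64_t pos)
{
	const struct demo_extent *e;
	size_t lo = 0, hi = nr;

	/* Binary search for the first entry that starts after pos. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].file_off <= pos)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;	/* nothing starts at or before pos */

	/* The predecessor is the only candidate; does it reach pos? */
	e = &tbl[lo - 1];
	return (pos - e->file_off < e->len) ? e : NULL;
}
```

With two extents covering bytes 0-4095 and 8192-16383, a lookup at 4096
lands in the hole between them and returns NULL, while 8192 returns the
second extent.
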
At one point I suggested to the famfs maintainer that it might be
easier/better to implement the interleaved mapping lookups as bpf
programs instead of being stuck with a fixed format in the fuse
userspace abi, but I don't know if he ever implemented that.
> Is this your
> assessment of it as well or do you think the server-side logic for
> iomap_begin()/iomap_end() is too complicated to make this realistic?
> Asking because I'm curious whether this direction makes sense, not
> because I think it would be a blocker for your series.
For disk-based filesystems I think it would be difficult to model a bpf
program to do mappings, since they can basically point anywhere and be
of any size.
OTOH it would be enormously hilarious to me if one could load a file
mapping predictive model into the kernel as a bpf program and use that
as a first tier before checking the in-memory btree mapping cache from
patchset 7. Quite a few years ago now there was a FAST paper
establishing that even a stupid linear regression model could in theory
beat a disk btree lookup.
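
Purely as a toy illustration of that "first tier" idea, and not
anything from the patchset or from the paper: assume a model fitted
offline that guesses which slot of a sorted mapping table covers a file
offset, plus a recorded worst-case error. The lookup scans only the
small window around the guess and returns -1 on a miss, which the
caller would treat as "go ask the real mapping cache". All names here
(demo_mapping, demo_model, demo_model_lookup) are hypothetical, and the
arithmetic is integer-only since bpf has no floating point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached mapping; NOT a structure from the patchset. */
struct demo_mapping {
	uint64_t file_off;	/* first file byte covered */
	uint64_t len;		/* number of bytes covered */
};

/* Hypothetical fitted model: slot ~= pos / bytes_per_slot, +/- max_err. */
struct demo_model {
	uint64_t bytes_per_slot;
	size_t max_err;
};

/* Returns the index of the mapping covering @pos, or -1 on a miss. */
static long
demo_model_lookup(const struct demo_model *m, const struct demo_mapping *tbl,
		  size_t nr, uint64_t pos)
{
	uint64_t guess;
	size_t g, lo, hi, i;

	if (nr == 0 || m->bytes_per_slot == 0)
		return -1;

	/* Predict a slot and clamp it to the table. */
	guess = pos / m->bytes_per_slot;
	g = guess > nr - 1 ? nr - 1 : (size_t)guess;

	/* Only search the window the model's error bound allows. */
	lo = g > m->max_err ? g - m->max_err : 0;
	hi = (nr - 1 - g < m->max_err) ? nr - 1 : g + m->max_err;

	for (i = lo; i <= hi; i++) {
		if (pos >= tbl[i].file_off &&
		    pos - tbl[i].file_off < tbl[i].len)
			return (long)i;
	}
	return -1;	/* miss: fall through to the next tier */
}
```

The window is 2 * max_err + 1 entries at most, so the cost stays
constant for however many mappings the file has; whether a real model
could keep max_err small for a fragmented file is exactly the open
question.
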
--D
> Thanks,
> Joanne
>
> >
> > If you're going to start using this code, I strongly recommend pulling
> > from my git trees, which are linked below.
> >
> > This has been running on the djcloud for months with no problems. Enjoy!
> > Comments and questions are, as always, welcome.
> >
> > --D
> >
> > kernel git tree:
> > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-fileio
> > ---
> > Commits in this patchset:
> > * fuse: implement the basic iomap mechanisms
> > * fuse_trace: implement the basic iomap mechanisms
> > * fuse: make debugging configurable at runtime
> > * fuse: adapt FUSE_DEV_IOC_BACKING_{OPEN,CLOSE} to add new iomap devices
> > * fuse_trace: adapt FUSE_DEV_IOC_BACKING_{OPEN,CLOSE} to add new iomap devices
> > * fuse: flush events and send FUSE_SYNCFS and FUSE_DESTROY on unmount
> > * fuse: create a per-inode flag for toggling iomap
> > * fuse_trace: create a per-inode flag for toggling iomap
> > * fuse: isolate the other regular file IO paths from iomap
> > * fuse: implement basic iomap reporting such as FIEMAP and SEEK_{DATA,HOLE}
> > * fuse_trace: implement basic iomap reporting such as FIEMAP and SEEK_{DATA,HOLE}
> > * fuse: implement direct IO with iomap
> > * fuse_trace: implement direct IO with iomap
> > * fuse: implement buffered IO with iomap
> > * fuse_trace: implement buffered IO with iomap
> > * fuse: implement large folios for iomap pagecache files
> > * fuse: use an unrestricted backing device with iomap pagecache io
> > * fuse: advertise support for iomap
> > * fuse: query filesystem geometry when using iomap
> > * fuse_trace: query filesystem geometry when using iomap
> > * fuse: implement fadvise for iomap files
> > * fuse: invalidate ranges of block devices being used for iomap
> > * fuse_trace: invalidate ranges of block devices being used for iomap
> > * fuse: implement inline data file IO via iomap
> > * fuse_trace: implement inline data file IO via iomap
> > * fuse: allow more statx fields
> > * fuse: support atomic writes with iomap
> > * fuse_trace: support atomic writes with iomap
> > * fuse: disable direct reclaim for any fuse server that uses iomap
> > * fuse: enable swapfile activation on iomap
> > * fuse: implement freeze and shutdowns for iomap filesystems
> > ---
> > fs/fuse/fuse_i.h | 161 +++
> > fs/fuse/fuse_trace.h | 939 +++++++++++++++++++
> > fs/fuse/iomap_i.h | 52 +
> > include/uapi/linux/fuse.h | 219 ++++
> > fs/fuse/Kconfig | 48 +
> > fs/fuse/Makefile | 1
> > fs/fuse/backing.c | 12
> > fs/fuse/dev.c | 30 +
> > fs/fuse/dir.c | 120 ++
> > fs/fuse/file.c | 133 ++-
> > fs/fuse/file_iomap.c | 2230 +++++++++++++++++++++++++++++++++++++++++++++
> > fs/fuse/inode.c | 162 +++
> > fs/fuse/iomode.c | 2
> > fs/fuse/trace.c | 2
> > 14 files changed, 4056 insertions(+), 55 deletions(-)
> > create mode 100644 fs/fuse/iomap_i.h
> > create mode 100644 fs/fuse/file_iomap.c
> >
>