Linux EXT4 FS development
From: "Darrick J. Wong" <djwong@kernel.org>
To: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	fuse-devel <fuse-devel@lists.linux.dev>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	Bernd Schubert <bernd@bsbernd.com>,
	Joanne Koong <joannelkoong@gmail.com>,
	Theodore Ts'o <tytso@mit.edu>, Neal Gompa <neal@gompa.dev>,
	Amir Goldstein <amir73il@gmail.com>,
	Christian Brauner <brauner@kernel.org>,
	john@groves.net
Subject: [PATCHBOMB v9] fuse/libfuse/e2fsprogs: faster file IO for containerized ext4 servers
Date: Tue, 19 May 2026 15:22:29 -0700
Message-ID: <20260519222229.GB9544@frogsfrogsfrogs>

Hi everyone,

This is the ninth public draft of a prototype to connect the Linux
fuse driver to fs-iomap for regular file IO operations to and from files
whose contents persist to locally attached storage devices.  With this
release, I show that it's possible to build a fuse server for a real
filesystem (ext4) that runs entirely in userspace yet maintains most of
its performance.

This effort is now separate from the one to run fuse servers in a
constrained environment via systemd.  Putting fuse servers in a
container gets you all the blast-radius reduction advantages and
provides a pathway to removing less popular filesystem drivers to reduce
maintenance work in the kernel; now we want to trade some relaxation of
that isolation for better performance.

The fuse command plumbing is very simple -- the ->iomap_begin,
->iomap_end, and iomap ->ioend calls within iomap are turned into
upcalls to the fuse server via a trio of new fuse commands.  Pagecache
writeback is now a directio write.  The fuse server can upsert mappings
into the kernel for cached access (== zero upcalls for rereads and pure
overwrites!) and the iomap cache revalidation code works.
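
To make the "zero upcalls for rereads" point concrete, here is a minimal
userspace sketch of a mapping cache with upsert and cookie-based
revalidation.  Every name in it (sketch_cache, sketch_upsert, and so on)
is hypothetical and invented for illustration; none of it is the
fuse-iomap ABI or the kernel's actual data structure, and the real code
is certainly more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; NOT the fuse-iomap ABI. */
struct sketch_mapping {
	uint64_t offset;	/* file offset where the mapping starts */
	uint64_t length;	/* length of the mapping, in bytes */
	uint64_t addr;		/* device address backing 'offset' */
	uint64_t cookie;	/* cache generation when this was recorded */
};

#define SKETCH_MAX_MAPPINGS 16

struct sketch_cache {
	struct sketch_mapping maps[SKETCH_MAX_MAPPINGS];
	size_t nr;		/* number of slots in use */
	uint64_t cookie;	/* bumped whenever the server invalidates */
	unsigned long upcalls;	/* lookups that would have needed an upcall */
};

/* "Server" side: insert a mapping, or replace the one at the same offset. */
static bool sketch_upsert(struct sketch_cache *c, uint64_t offset,
			  uint64_t length, uint64_t addr)
{
	size_t i;

	assert(length > 0);
	for (i = 0; i < c->nr; i++)
		if (c->maps[i].offset == offset)
			break;
	if (i == c->nr) {
		if (c->nr == SKETCH_MAX_MAPPINGS)
			return false;	/* cache full; caller falls back to upcalls */
		c->nr++;
	}
	c->maps[i] = (struct sketch_mapping){ offset, length, addr, c->cookie };
	return true;
}

/* "Server" side: declare every mapping cached so far to be stale. */
static void sketch_invalidate(struct sketch_cache *c)
{
	c->cookie++;
}

/*
 * "Kernel" side: return a still-valid mapping covering 'pos', or NULL.
 * A NULL return stands in for an upcall to the fuse server.
 */
static const struct sketch_mapping *sketch_lookup(struct sketch_cache *c,
						  uint64_t pos)
{
	size_t i;

	for (i = 0; i < c->nr; i++) {
		const struct sketch_mapping *m = &c->maps[i];

		if (m->cookie == c->cookie && pos >= m->offset &&
		    pos - m->offset < m->length)
			return m;
	}
	c->upcalls++;
	return NULL;
}
```

In this sketch a reread of an already-upserted range never increments
the upcall counter, and a single sketch_invalidate() call forces the
next lookup of every range to miss.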

At this stage I still get about 95% of the kernel ext4 driver's
streaming directio performance, and 110% of its
streaming buffered IO performance.  Random buffered IO is about 85% as
fast as the kernel.  Random direct IO is about 80% as fast as the
kernel; see the cover letter for the fuse2fs iomap changes for more
details.  Unwritten extent conversions on random direct writes are
especially painful for fuse+iomap (~90% more overhead) due to upcall
overhead.  And that's with (now dynamic) debugging turned on!

This series has been rebased to 7.1-rc4 since the eighth RFC, with
the following kernel changes:

1. The BPF stuff has been replaced with a filesystem striping mechanism.
   This is my first attempt ever to implement raid0.

2. Much tightening of the validation code based on Codex reviews so that
   we don't expose more "ABI" than we feel like getting yelled at for
   in 2031.

3. Refactored iomap writeback mapping so that you can use the standard
   iomap_begin functions for that.

4. Better userspace helpers so that fuse server authors don't have to
   know quite so much detail of the innards.

5. The libfuse changes are based off the WIP fuse-service-container
   branch.
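
For readers unfamiliar with raid0 (item 1 above), the textbook striping
arithmetic can be sketched as follows.  This cover letter does not
describe the layout the patches actually use, so treat the struct, the
function name, and the round-robin layout as assumptions made for
illustration rather than a description of the fuse-iomap striping code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; not taken from the kernel patches. */
struct stripe_target {
	uint32_t dev;		/* index of the device holding the byte */
	uint64_t dev_offset;	/* byte offset within that device */
	uint64_t contig;	/* bytes left before the next stripe unit */
};

/*
 * Classic raid0 mapping: stripe units are dealt round-robin across
 * 'ndevs' devices, each unit being 'stripe_unit' bytes long.
 */
static struct stripe_target stripe_map(uint64_t pos, uint64_t stripe_unit,
				       uint32_t ndevs)
{
	struct stripe_target t;
	uint64_t unit, within;

	assert(stripe_unit > 0 && ndevs > 0);

	unit = pos / stripe_unit;	/* which stripe unit overall */
	within = pos % stripe_unit;	/* offset inside that unit */

	t.dev = (uint32_t)(unit % ndevs);
	t.dev_offset = (unit / ndevs) * stripe_unit + within;
	t.contig = stripe_unit - within;
	return t;
}
```

With a 64KiB stripe unit and four devices, byte 262244 (unit 4, 100
bytes in) lands on device 0 at device offset 65636, with 65436 bytes
remaining before the IO would have to move to the next device.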

There are some questions remaining:

a. fuse2fs doesn't support the ext4 journal.  Urk.

b. I've dropped everything but the kernel patches for basic plumbing and
   file IO paths because frankly they weren't getting looked at.

c. How on earth am I going to separate out the file_operations?
   Will it actually work to say that fuse-iomap only supports local
   filesystems initially?  How many of the "is_iomap?" predicates are
   actually for local filesystems and not the IO path???

I would like to get any part of this submission reviewed for 7.2 now that
this has been collecting comments and tweaks in non-rfc status for 6
months.

Kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-striping

libfuse:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/libfuse.git/log/?h=fuse-iomap-striping

e2fsprogs:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/e2fsprogs.git/log/?h=fuse4fs-memory-reclaim

fstests:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fuse2fs

--Darrick
