Linux filesystem development
From: "Darrick J. Wong" <djwong@kernel.org>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: miklos@szeredi.hu, bernd@bsbernd.com, neal@gompa.dev,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 17/31] fuse: use an unrestricted backing device with iomap pagecache io
Date: Mon, 26 Jan 2026 18:09:44 -0800
Message-ID: <20260127020944.GF5900@frogsfrogsfrogs>
In-Reply-To: <CAJnrk1ZYF7MG0mBZ4GRdKfmSiEEx3vXxgiH3oYdMS-neWSA2mw@mail.gmail.com>

On Mon, Jan 26, 2026 at 05:35:05PM -0800, Joanne Koong wrote:
> On Mon, Jan 26, 2026 at 3:55 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > On Mon, Jan 26, 2026 at 02:03:35PM -0800, Joanne Koong wrote:
> > > On Tue, Oct 28, 2025 at 5:49 PM Darrick J. Wong <djwong@kernel.org> wrote:
> > > >
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > >
> > > > With iomap support turned on for the pagecache, the kernel issues
> > > > > writeback directly to block devices and we no longer have to push all
> > > > those pages through the fuse device to userspace.  Therefore, we don't
> > > > need the tight dirty limits (~1M) that are used for regular fuse.  This
> > > > dramatically increases the performance of fuse's pagecache IO.
> > > >
> > > > Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> > > > ---
> > > >  fs/fuse/file_iomap.c |   21 +++++++++++++++++++++
> > > >  1 file changed, 21 insertions(+)
> > > >
> > > >
> > > > diff --git a/fs/fuse/file_iomap.c b/fs/fuse/file_iomap.c
> > > > index 0bae356045638b..a9bacaa0991afa 100644
> > > > --- a/fs/fuse/file_iomap.c
> > > > +++ b/fs/fuse/file_iomap.c
> > > > @@ -713,6 +713,27 @@ const struct fuse_backing_ops fuse_iomap_backing_ops = {
> > > >  void fuse_iomap_mount(struct fuse_mount *fm)
> > > >  {
> > > >         struct fuse_conn *fc = fm->fc;
> > > > +       struct super_block *sb = fm->sb;
> > > > +       struct backing_dev_info *old_bdi = sb->s_bdi;
> > > > +       char *suffix = sb->s_bdev ? "-fuseblk" : "-fuse";
> > > > +       int res;
> > > > +
> > > > +       /*
> > > > +        * sb->s_bdi points to the initial private bdi.  However, we want to
> > > > +        * redirect it to a new private bdi with default dirty and readahead
> > > > +        * settings because iomap writeback won't be pushing a ton of dirty
> > > > +        * data through the fuse device.  If this fails we fall back to the
> > > > +        * initial fuse bdi.
> > > > +        */
> > > > +       sb->s_bdi = &noop_backing_dev_info;
> > > > +       res = super_setup_bdi_name(sb, "%u:%u%s.iomap", MAJOR(fc->dev),
> > > > +                                  MINOR(fc->dev), suffix);
> > > > +       if (res) {
> > > > +               sb->s_bdi = old_bdi;
> > > > +       } else {
> > > > +               bdi_unregister(old_bdi);
> > > > +               bdi_put(old_bdi);
> > > > +       }
> > >
> > > Maybe I'm missing something here, but isn't sb->s_bdi already set to
> > > noop_backing_dev_info when fuse_iomap_mount() is called?
> > > fuse_fill_super() -> fuse_fill_super_common() -> fuse_bdi_init() does
> > > this already before the fuse_iomap_mount() call, afaict.
> >
> > Right.
> >
> > > I think what we need to do is just unset BDI_CAP_STRICTLIMIT and
> > > adjust the bdi max ratio?
> >
> > That's sufficient to undo the effects of fuse_bdi_init, yes.  However
> > the BDI gets created with the name "$major:$minor{-fuseblk}" and there
> > are "management" scripts that try to tweak fuse BDIs for better
> > performance.
> >
> > I don't want some dumb script to mismanage a fuse-iomap filesystem
> > because it can't tell the difference, so I create a new bdi with the
> > name "$major:$minor.iomap" to make it obvious.  But super_setup_bdi_name
> > gets cranky if s_bdi isn't set to noop and we don't want to fail a mount
> > here due to ENOMEM so ... I implemented this weird switcheroo code.
> 
> I see. It might be useful to copy/paste this into the commit message
> just for added context. I don't see a better way of doing it than what
> you have in this patch then since we rely on the init reply to know
> whether iomap should be used or not...

I'll do that.  I will also add that as soon as any BDI is created, it
will be exposed to userspace in sysfs.  That means that running the code
from fuse_bdi_init in reverse will not necessarily produce the same
results as a freshly created BDI.
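
To illustrate the naming point: below is a minimal userspace sketch,
assuming the "%u:%u%s.iomap" format string from the quoted patch, of how
a management script could tell a fuse-iomap BDI apart from a regular
fuse BDI by name alone.  Both helper names are hypothetical; neither
exists in the kernel or in libfuse, and the final naming scheme may
differ from what this posting used.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a BDI name with the same format string that
 * the quoted patch passes to super_setup_bdi_name() ("%u:%u%s.iomap").
 * Returns the snprintf() result so that callers can detect truncation.
 */
static int format_iomap_bdi_name(char *buf, size_t len, unsigned int major,
				 unsigned int minor, int is_fuseblk)
{
	const char *suffix = is_fuseblk ? "-fuseblk" : "-fuse";

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%u:%u%s.iomap", major, minor, suffix);
}

/*
 * Hypothetical helper: return nonzero if a BDI name (for example, a
 * directory entry under /sys/class/bdi) ends in ".iomap".
 */
static int bdi_name_is_iomap(const char *name)
{
	static const char tag[] = ".iomap";
	size_t tlen = sizeof(tag) - 1;
	size_t nlen;

	if (!name)
		return 0;
	nlen = strlen(name);
	return nlen >= tlen && strcmp(name + nlen - tlen, tag) == 0;
}
```

A script walking /sys/class/bdi could call bdi_name_is_iomap() on each
entry and skip its fuse-specific tuning whenever the check succeeds.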

> If the new bdi setup fails, I wonder if the mount should just fail
> entirely then. That seems better to me than letting it succeed with

Err, which new bdi setup?  If fuse-iomap can't create a new BDI, it will
set s_bdi back to the old one and move on.  You'll get degraded
performance, but that's not the end of the world.

> strictlimiting enforced, especially since large folios will be enabled
> for fuse iomap. [1] has some numbers for the performance degradations
> I saw for writes with strictlimiting on and large folios enabled.

If fuse_bdi_init can't set up a bdi it will fail the mount.

That said... from reading [1], if strictlimiting is enabled with large
folios, then can we figure out what is the effective max folio size and
lower it to that?

> Speaking of strictlimiting though, from a policy standpoint if we
> think strictlimiting is needed in general in fuse (there's a thread
> from last year [1] about removing strict limiting), then I think that

(did you mean [2] here?)

> would need to apply to iomap as well, at least for unprivileged
> servers.

iomap requires a privileged server, FWIW.

> [1] https://lore.kernel.org/linux-fsdevel/CAJnrk1bwat_r4+pmhaWH-ThAi+zoAJFwmJG65ANj1Zv0O0s4_A@mail.gmail.com/
> [2] https://lore.kernel.org/linux-fsdevel/20251010150113.GC6174@frogsfrogsfrogs/T/#ma34ff5ae338a83f8b2e946d7e5332ea835fa0ff6
> 
> >
> > > This is more of a nit, but I think it'd also be nice if we
> > > swapped the ordering of this patch with the previous one enabling
> > > large folios, so that large folios gets enabled only when all the bdi
> > > stuff for it is ready.
> >
> > Will do, thanks for reading these patches!
> >
> > Also note that I've changed this part of the patchset quite a lot since
> > this posting; iomap configuration is now a completely separate fuse
> > command that gets triggered after the FUSE_INIT reply is received.
> 
> Great, I'll look at your upstream tree then for this part.

Ok.

--D

> Thanks,
> Joanne
> 
> >
> > --D
> >
> > > Thanks,
> > > Joanne
> > >
> > > >
> > > >         /*
> > > >          * Enable syncfs for iomap fuse servers so that we can send a final
> > > >
> > >
> 
