public inbox for linux-fsdevel@vger.kernel.org
From: Joanne Koong <joannelkoong@gmail.com>
To: John Groves <john@groves.net>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Amir Goldstein <amir73il@gmail.com>,
	 Miklos Szeredi <miklos@szeredi.hu>,
	 "lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Bernd Schubert <bernd@bsbernd.com>,
	 Luis Henriques <luis@igalia.com>,
	Horst Birthelmer <horst@birthelmer.de>
Subject: Re: [LSF/MM/BPF TOPIC] Where is fuse going? API cleanup, restructuring and more
Date: Wed, 11 Feb 2026 20:46:26 -0800
Message-ID: <CAJnrk1YMqDKA5gDZasrxGjJtfdbhmjxX5uhUv=OSPyA=G5EE+Q@mail.gmail.com>
In-Reply-To: <CAJnrk1Za2SdCkpJ=sZR8LJ1qvBn8dd3CCsH=PvMrg=_0Jv+40Q@mail.gmail.com>

On Fri, Feb 6, 2026 at 4:22 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Fri, Feb 6, 2026 at 12:48 PM John Groves <john@groves.net> wrote:
> >
> > On 26/02/05 09:52PM, Darrick J. Wong wrote:
> > > On Thu, Feb 05, 2026 at 10:27:52AM +0100, Amir Goldstein wrote:
> > > > On Thu, Feb 5, 2026 at 4:33 AM John Groves <john@jagalactic.com> wrote:
> > > > >
> > > > > On 26/02/04 11:06AM, Darrick J. Wong wrote:
> > > > >
> > > > > [ ... ]
> > > > >
> > > > > > >  - famfs: export distributed memory
> > > > > >
> > > > > > This has been, uh, hanging out for an extraordinarily long time.
> > > > >
> > > > > Um, *yeah*. Although a significant part of that time was on me, because
> > > > > getting it ported into fuse was kinda hard, my users and I are hoping we
> > > > > can get this upstreamed fairly soon now. I'm hoping that after the 6.19
> > > > > merge window dust settles we can negotiate any needed changes etc. and
> > > > > shoot for the 7.0 merge window.
> > >
> > > I think we've all missed getting merged for 7.0 since 6.19 will be
> > > released in 3 days. :/
> > >
> > > (Granted most of the maintainers I know are /much/ less conservative
> > > than I was about the schedule)
> >
> > Doh - right you are...
> >
> > >
> > > > I think that the work on famfs is setting an example, and I very much
> > > > hope it will be a good example, of how improving existing infrastructure
> > > > (FUSE) is a better contribution than adding another fs to the pile.
> > >
> > > Yeah.  Joanne and I spent a couple of days this week coprogramming a
> > > prototype of a way for famfs to create BPF programs to handle
> > > INTERLEAVED_EXTENT files.  We might be ready to show that off in a
> > > couple of weeks, and that might be a way to clear up the
> > > GET_FMAP/IOMAP_BEGIN logjam at last.
> >
> > I'd love to learn more about this; happy to do a call if that's a
> > good way to get me briefed.
> >
> > I [generally but not specifically] understand how this could avoid
> > GET_FMAP, but not GET_DAXDEV.
> >
> > But I'm not sure it could (or should) avoid dax_iomap_rw() and
> > dax_iomap_fault(). The thing is that those call my begin() function
> > to resolve an offset in a file to an offset on a daxdev, and then
> > dax completes the fault or memcpy. In that dance, famfs never knows
> > the kernel address of the memory at all (also true of xfs in fs-dax
> > mode, unless that's changed fairly recently). I think that's a pretty
> > decent interface all in all.
> >
> > Also: dunno whether y'all have looked at the dax patches in the famfs
> > series, but the solution to working with Alistair's folio-ification
> > and cleanup of the dax layer (which set me back months) was to create
> > drivers/dax/fsdev.c, which, when bound to a daxdev in place of
> > drivers/dax/device.c, configures folios & pages compatibly with
> > fs-dax. So I kinda think I need the dax_iomap* interface.
> >
> > As usual, if I'm overlooking something let me know...
>
> Hi John,
>
> The conversation started [1] on Darrick's containerization patchset
> about using bpf to a) avoid extra requests / context switching for
> ->iomap_begin and ->iomap_end calls and b) offload what would
> otherwise have to be hard-coded kernel logic into userspace, which
> gives userspace more flexibility / control with updating the logic and
> is less of a maintenance burden for fuse. There was some musing [2]
> about whether with bpf infrastructure added, it would allow famfs to
> move all famfs-specific logic to userspace/bpf.
>
> I agree that it makes sense for famfs to go through dax iomap
> interfaces. imo it seems cleanest if fuse has a generic iomap
> interface with iomap dax going through that plumbing, and any
> famfs-specific logic that would be needed beyond that (eg computing
> the interleaved mappings) being moved to custom famfs bpf programs. I
> started trying to implement this yesterday afternoon because I wanted
> to make sure it would actually be doable for the famfs logic before
> bringing it up and I didn't want to derail your project. So far I only
> have the general iomap interface for fuse added with dax operations
> going through dax_iomap* and haven't tried out integrating the famfs
> GET_FMAP/GET_DAXDEV bpf program part yet but I'm planning/hoping to
> get to that early next week. The work I did with Darrick this week was
> on getting a server's bpf programs hooked up to fuse through bpf links
> and Darrick has fleshed that out and gotten that working now. If it
> turns out famfs can go through a generic iomap fuse plumbing layer,
> I'd be curious to hear your thoughts on which approach you'd prefer.

I put together a quick prototype to test this out. This is what it
looks like with fuse having a generic iomap interface that supports
dax [1], and the famfs custom logic moved to a bpf program [2]. I
didn't change much; I mostly moved your famfs code over to the bpf
side. The kernel side changes are in [3] and the libfuse changes are
in [4].

For testing out the prototype, I hooked it up to passthrough_hp to
test running the bpf program and verify that it is able to find the
extent from the bpf map. In my opinion, this makes the fuse side
infrastructure cleaner and more extensible for other servers that will
want to go through dax iomap in the future, but I think this also has
a few benefits for famfs. Instead of needing to issue a FUSE_GET_FMAP
request after a file is opened, the server can directly populate the
metadata map from userspace with the mapping info when it processes
the FUSE_OPEN request, which gets rid of the roundtrip cost. The
server can dynamically update the metadata at any time from userspace
if the mapping info needs to change in the future. For setting up the
daxdevs, I moved your logic to the init side: the server passes the
daxdev info upfront through an IOMAP_CONFIG exchange, and the kernel
initializes the daxdevs based on that info. I think this will
also make deploying future updates for famfs easier, as updating the
logic won't need to go through the upstream kernel mailing list
process and deploying updates won't require a new kernel release.
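
As a rough illustration of the kind of famfs-specific logic that could
live in the bpf program, here is a plain C sketch of resolving a file
offset in an interleaved (striped) layout to a device and an offset on
that device. The struct and function names (stripe_layout,
dev_location, resolve_interleaved) and the fixed MAX_STRIPS limit are
hypothetical and are not taken from the prototype or from famfs; a real
bpf program would read the layout from a bpf map and fill in an iomap
instead.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STRIPS 8 /* hypothetical upper bound, for illustration only */

/* Hypothetical description of a simple round-robin interleaved layout. */
struct stripe_layout {
	uint64_t chunk_size;             /* bytes written to one device per turn */
	uint32_t nstrips;                /* number of devices interleaved across */
	uint64_t strip_base[MAX_STRIPS]; /* start offset of the strip on each device */
};

/* Result of resolving one file offset. */
struct dev_location {
	uint32_t dev_index;  /* which device holds the byte */
	uint64_t dev_offset; /* byte offset on that device */
	uint64_t contig_len; /* bytes that stay contiguous on that device */
};

/*
 * Map a file offset to (device, device offset) for a round-robin layout.
 * Returns 0 on success, -1 if the layout or arguments are invalid.
 */
int resolve_interleaved(const struct stripe_layout *l, uint64_t file_off,
			struct dev_location *out)
{
	uint64_t chunk_no, within, stripe_no;
	uint32_t strip;

	if (!l || !out || l->chunk_size == 0 ||
	    l->nstrips == 0 || l->nstrips > MAX_STRIPS)
		return -1;

	chunk_no = file_off / l->chunk_size;        /* global chunk index */
	within = file_off % l->chunk_size;          /* offset inside that chunk */
	strip = (uint32_t)(chunk_no % l->nstrips);  /* device for this chunk */
	stripe_no = chunk_no / l->nstrips;          /* full rounds completed */

	out->dev_index = strip;
	out->dev_offset = l->strip_base[strip] + stripe_no * l->chunk_size + within;
	out->contig_len = l->chunk_size - within;
	return 0;
}
```

With a 4096-byte chunk size and two strips, file offset 12388 falls in
chunk 3, which is the second chunk on device 1, so the lookup returns
device 1 at strip_base[1] + 4096 + 100 with 3996 contiguous bytes left.
Returning the contiguous length lets the caller size the mapping so it
never crosses a chunk boundary.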

These are just my two cents based on my (cursory) understanding of
famfs. Just wanted to float this alternative approach in case it's
useful.

Thanks,
Joanne

[1] https://github.com/joannekoong/linux/commit/b8f9d284a6955391f00f576d890e1c1ccc943cfd
[2] https://github.com/joannekoong/libfuse/commit/444fa27fa9fd2118a0dc332933197faf9bbf25aa
[3] https://github.com/joannekoong/linux/commits/prototype_generic_iomap_dax/
[4] https://github.com/joannekoong/libfuse/commits/famfs_bpf/

>
> Thanks,
> Joanne
>
> [1] https://lore.kernel.org/linux-fsdevel/CAJnrk1bxhw2u0qwjw0dJPGdmxEXbcEyKn-=iFrszqof2c8wGCA@mail.gmail.com/t/#md1b8003a109760d8ee1d5397e053673c1978ed4d
> [2] https://lore.kernel.org/linux-fsdevel/CAJnrk1bxhw2u0qwjw0dJPGdmxEXbcEyKn-=iFrszqof2c8wGCA@mail.gmail.com/t/#u
>
> >
> > Regards,
> > John
> >
