From: John Groves <John@groves.net>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Gregory Price <gourry@gourry.net>,
"Darrick J. Wong" <djwong@kernel.org>,
Miklos Szeredi <miklos@szeredi.hu>,
Joanne Koong <joannelkoong@gmail.com>,
Bernd Schubert <bernd@bsbernd.com>,
John Groves <john@jagalactic.com>,
Dan Williams <dan.j.williams@intel.com>,
Bernd Schubert <bschubert@ddn.com>,
Alison Schofield <alison.schofield@intel.com>,
John Groves <jgroves@micron.com>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
Randy Dunlap <rdunlap@infradead.org>,
Jeff Layton <jlayton@kernel.org>,
Amir Goldstein <amir73il@gmail.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Stefan Hajnoczi <shajnocz@redhat.com>,
Josef Bacik <josef@toxicpanda.com>,
Bagas Sanjaya <bagasdotme@gmail.com>,
Chen Linxuan <chenlinxuan@uniontech.com>,
James Morse <james.morse@arm.com>, Fuad Tabba <tabba@google.com>,
Sean Christopherson <seanjc@google.com>,
Shivank Garg <shivankg@amd.com>,
Ackerley Tng <ackerleytng@google.com>,
Aravind Ramesh <arramesh@micron.com>,
Ajay Joshi <ajayjoshi@micron.com>,
"venkataravis@micron.com" <venkataravis@micron.com>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
djbw@kernel.org
Subject: Re: [PATCH V10 00/10] famfs: port into fuse
Date: Sun, 19 Apr 2026 15:36:30 -0500
Message-ID: <aeUU8hMwPij2WvfF@groves.net>
In-Reply-To: <f254f6fc-dc06-4612-82d7-35bb10dbd32e@kernel.org>
On 26/04/15 10:16AM, David Hildenbrand (Arm) wrote:
> On 4/15/26 00:20, Gregory Price wrote:
> > On Tue, Apr 14, 2026 at 11:57:40AM -0700, Darrick J. Wong wrote:
> >>>
> >>> I very strongly object to making this a prerequisite to merging. This
> >>> is an untested idea that will certainly delay us by at least a couple
> >>> of merge windows when products are shipping now, and the existing approach
> >>> has been in circulation for a long time. It is TOO LATE!!!!!!
> >>
> > ...
> >>
> >> That said, you're clearly pissed at the goalposts changing yet again,
> >> and that's really not fair that we collectively keep moving them.
> >>
> >
> > This seems a bit more than moving a goalpost.
> >
> > We're now gating working software, for real working hardware, on a novel,
> > unproven BPF ops structure that controls page table mappings on page table
> > faults which would be used by exactly 1 user : FAMFS.
>
> Are MM people on board with even letting BPF do that? Honest question,
> if someone has a pointer to how that should work, that would be appreciated.
David, that question is pivotal!! How can we get at least a preliminary
answer sooner rather than later? If the answer is "Hell No", a lot of
this thread (but not all) becomes moot.
Until today, to my knowledge, this entire discussion has happened without
anybody actually hooking famfs up for BPF-based fault handling. Today
Gregory shared some code with me that does that. The code doesn't build
for me yet, so I'll have to debug that as soon as I can.
Gregory's code, in the current form, still uses two new fuse messages,
GET_FMAP and GET_DAXDEV, but it makes the fmap message format opaque by
removing fmap format structs from the uapi. It also uses two BPF programs.
One BPF program parses and validates the GET_FMAP payload for every file,
and hangs it from a 'void *' in each fuse_inode (just like the current famfs
code). The other BPF program is called during vma faults and reads the
fuse_inode->'void *' in order to handle faults the same way famfs-fuse does
today, but via BPF instead.
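To make the fault-time step concrete, here is a minimal user-space sketch
in C. It is not Gregory's code and not the famfs fmap format: it assumes
the per-inode data cached behind the 'void *' is a sorted array of extents,
and every name in it (sketch_extent, sketch_fmap, sketch_resolve) is
hypothetical. It only illustrates the kind of work a fault handler has to
do, which is turning a file offset into a (dax device, device offset) pair
without blocking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent: a contiguous run of file bytes backed by a dax device. */
struct sketch_extent {
	uint64_t file_off;   /* starting byte offset in the file */
	uint64_t dev_off;    /* starting byte offset on the dax device */
	uint64_t len;        /* extent length in bytes */
	uint32_t dev_index;  /* which dax device holds the extent */
};

/* Hypothetical per-file map, standing in for what hangs off the 'void *'. */
struct sketch_fmap {
	size_t nr_extents;
	const struct sketch_extent *ext; /* sorted by file_off, non-overlapping */
};

/*
 * Resolve a file offset to (device index, device offset).
 * Returns 0 on success, -1 if the offset is not covered by any extent.
 */
static int sketch_resolve(const struct sketch_fmap *fm, uint64_t file_off,
			  uint32_t *dev_index, uint64_t *dev_off)
{
	size_t lo = 0, hi;

	assert(fm != NULL && dev_index != NULL && dev_off != NULL);
	hi = fm->nr_extents;

	/* Binary search for the extent that contains file_off. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct sketch_extent *e = &fm->ext[mid];

		if (file_off < e->file_off) {
			hi = mid;
		} else if (file_off - e->file_off >= e->len) {
			lo = mid + 1;
		} else {
			*dev_index = e->dev_index;
			*dev_off = e->dev_off + (file_off - e->file_off);
			return 0;
		}
	}
	return -1;
}
```

The point of the sketch is that the lookup touches only memory already
attached to the inode, so the open question is the cost of reaching it
through a BPF program rather than the cost of the lookup itself.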
As with all vma "providers", famfs services zillions of faults. But famfs
faults never involve blocking or retrieving from storage, so there is no
storage latency over which to amortize a less efficient fault-handling path.
As I've said many times, we're enabling memory and it must run at
"memory speeds". Gregory's code includes a BPF invocation to resolve
each vma fault, but does avoid the BPF hashmap lookup that would be
required with a generalized implementation of Joanne's ideas.
The first question (very much unanswered) is whether a BPF fault handler
can resolve vma faults with performance equivalent to that of hugetlbfs or
anonymous mmap. If not, the famfs community will assert that
BPF would defeat or degrade the purpose of famfs. Added
overhead/latency/cache misses in a fault handler will serialize into the
stall time that software sees for a virtual address to be resolved -
it really is performance critical. If BPF is slower, we'll be able to
measure it, but one benchmark or test case does not fit all, so this
won't be a one-and-done test...
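As an illustration of the kind of baseline such a comparison needs, here
is a rough user-space sketch in C that times first-touch faults on an
anonymous private mapping. It is an assumption about how one might take a
first measurement, not the benchmark I will actually run; the function
name anon_fault_ns_per_page is made up for this example, and a real test
would also need to cover hugetlbfs, famfs files, and multiple threads.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Average first-touch cost, in nanoseconds per page, for an anonymous
 * private mapping of npages base pages. Returns a negative value on error.
 */
static double anon_fault_ns_per_page(size_t npages)
{
	long psz = sysconf(_SC_PAGESIZE);
	struct timespec t0, t1;
	volatile unsigned char *p;
	void *map;
	size_t i, len;

	if (psz <= 0 || npages == 0)
		return -1.0;
	len = npages * (size_t)psz;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1.0;

	/* Ask for base pages so each touch below is its own fault.
	 * The call may fail if THP is not configured; that is harmless. */
	(void)madvise(map, len, MADV_NOHUGEPAGE);

	p = map;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < npages; i++)
		p[i * (size_t)psz] = 1;	/* first write to each page faults it in */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	assert(t1.tv_sec >= t0.tv_sec);	/* monotonic clock sanity check */

	munmap(map, len);

	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / (double)npages;
}
```

A single average like this hides cache and TLB effects and says nothing
about tail latency, which is part of why one test case will not settle
the question.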
I'll share performance measurements as soon as I can build Gregory's code,
test, get time on a proper big-memory cluster, and measure something that
makes sense. This will take some days, but I'm working on it.
I hope to post a substantial on-list reply on Monday that summarizes the
various objections to my current famfs fuse implementation, the open
questions, and my specific performance and complexity concerns.
Thanks,
John