public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: "John Groves" <john@jagalactic.com>
To: "John Groves" <John@Groves.net>,
	"Miklos Szeredi" <miklos@szeredi.hu>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Bernd Schubert" <bschubert@ddn.com>,
	"Alison Schofield" <alison.schofield@intel.com>
Cc: "John Groves" <jgroves@micron.com>,
	"John Groves" <jgroves@fastmail.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Vishal Verma" <vishal.l.verma@intel.com>,
	"Dave Jiang" <dave.jiang@intel.com>,
	"Matthew Wilcox" <willy@infradead.org>, "Jan Kara" <jack@suse.cz>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"David Hildenbrand" <david@kernel.org>,
	"Christian Brauner" <brauner@kernel.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	"Stefan Hajnoczi" <shajnocz@redhat.com>,
	"Joanne Koong" <joannelkoong@gmail.com>,
	"Josef Bacik" <josef@toxicpanda.com>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"James Morse" <james.morse@arm.com>,
	"Fuad Tabba" <tabba@google.com>,
	"Sean Christopherson" <seanjc@google.com>,
	"Shivank Garg" <shivankg@amd.com>,
	"Ackerley Tng" <ackerleytng@google.com>,
	"Gregory Price" <gourry@gourry.net>,
	"Aravind Ramesh" <arramesh@micron.com>,
	"Ajay Joshi" <ajayjoshi@micron.com>,
	"venkataravis@micron.com" <venkataravis@micron.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"John Groves" <john@groves.net>
Subject: [PATCH V7 00/19] famfs: port into fuse
Date: Sun, 18 Jan 2026 22:30:25 +0000	[thread overview]
Message-ID: <0100019bd33b1f66-b835e86a-e8ae-443f-a474-02db88f7e6db-000000@email.amazonses.com> (raw)
In-Reply-To: <0100019bd33a16b4-6da11a99-d883-4cfc-b561-97973253bc4a-000000@email.amazonses.com>

From: John Groves <john@groves.net>

This patch series is available as a git tag at [0].

Changes v6 -> v7
- Fixed a regression in famfs_interleave_fileofs_to_daxofs() that
  was reported by Intel's kernel test robot
- Added a check in __fsdev_dax_direct_access() for negative return
  from pgoff_to_phys(), which would indicate an out-of-range offset
- Fixed a bug in __famfs_meta_free(), where not all interleaved
  extents were freed
- Added chunksize alignment checks in famfs_fuse_meta_alloc() and
  famfs_interleave_fileofs_to_daxofs() as interleaved chunks must
  be PTE or PMD aligned
- Simplified famfs_file_init_dax() a bit
- Re-ran CM's kernel code review prompts on the entire series and
  fixed several minor issues

Changes v4 -> v5 -> v6
- None. Re-sending due to technical difficulties

Changes v3 [9] -> v4
- The patch "dax: prevent driver unbind while filesystem holds device"
  has been dropped. Dan Williams indicated that the favored behavior is
  for a file system to stop working if an underlying driver is unbound,
  rather than preventing the unbind.
- The patch "famfs_fuse: Famfs mount opt: -o shadow=<shadowpath>" has
  been dropped. Found a way for the famfs user space to do without the
  -o opt (via getxattr).
- Squashed the fs/fuse/Kconfig patch into the first subsequent patch
  that needed the change
  ("famfs_fuse: Basic fuse kernel ABI enablement for famfs")
- Many review comments addressed.
- Addressed minor kerneldoc infractions reported by test robot.

Description:

This patch series introduces famfs into the fuse file system framework.
This is really two patch series concatenated.

- The patches with the 'dax:' prefix introduce necessary dax
  functionality.
- The patches with 'famfs_fuse:' introduce the famfs functionality into
  fuse. The famfs_fuse patches depend on the dax patches.

In addition, there are related patch sets for libfuse and ndctl(daxctl).

Related patches and code

- Related patch to libfuse - posted under the same cover
- Related patch to ndctl/daxctl - posted under the same cover
- The famfs user space code can be found at [1]

Dax Overview:

This series introduces a new "famfs mode" of devdax, whose driver is
drivers/dax/fsdev.c. This driver supports dax_iomap_rw() and
dax_iomap_fault() calls against a character dax instance. A dax device
now can be converted among three modes: 'system-ram', 'devdax' and
'famfs' via daxctl or sysfs (e.g. unbind devdax and bind famfs instead).

In famfs mode, a dax device initializes its pages consistent with the
fsdaxmode of pmem. Raw read/write/mmap are not supported in this mode,
but famfs is happy in this mode - using dax_iomap_rw() for read/write and
dax_iomap_fault() for mmap faults.

Fuse Overview:

Famfs started as a standalone file system, but this series is intended to
permanently supersede that implementation. At a high level, famfs adds
two new fuse server messages:

GET_FMAP   - Retrieves a famfs fmap (the file-to-dax map for a famfs
	     file)
GET_DAXDEV - Retrieves the details of a particular daxdev that was
	     referenced by an fmap

Famfs Overview

Famfs exposes shared memory as a file system. Famfs consumes shared
memory from dax devices, and provides memory-mappable files that map
directly to the memory - no page cache involvement. Famfs differs from
conventional file systems in fs-dax mode, in that it handles in-memory
metadata in a sharable way (which begins with never caching dirty shared
metadata).

Famfs started as a standalone file system [2,3], but the consensus at
LSFMM was that it should be ported into fuse [4,5].

The key performance requirement is that famfs must resolve mapping faults
without upcalls. This is achieved by fully caching the file-to-devdax
metadata for all active files. This is done via two fuse client/server
message/response pairs: GET_FMAP and GET_DAXDEV.

Famfs remains the first fs-dax file system that is backed by devdax
rather than pmem in fs-dax mode (hence the need for the new dax mode).

Notes

- When a file is opened in a famfs mount, the OPEN is followed by a
  GET_FMAP message and response. The "fmap" is the full file-to-dax
  mapping, allowing the fuse/famfs kernel code to handle
  read/write/fault without any upcalls.

- After each GET_FMAP, the fmap is checked for extents that reference
  previously-unknown daxdevs. Each such occurrence is handled with a
  GET_DAXDEV message and response.

- Daxdevs are stored in a table (which might become an xarray at some
  point). When entries are added to the table, we acquire exclusive
  access to the daxdev via the fs_dax_get() call (modeled after how
  fs-dax handles this with pmem devices). Famfs provides
  holder_operations to devdax, providing a notification path in the
  event of memory errors or forced reconfiguration.

- If devdax notifies famfs of memory errors on a dax device, famfs
  currently blocks all subsequent accesses to data on that device. The
  recovery is to re-initialize the memory and file system. Famfs is
  memory, not storage...

- Because famfs uses backing (devdax) devices, only privileged mounts are
  supported (i.e. the fuse server requires CAP_SYS_RAWIO).

- The famfs kernel code never accesses the memory directly - it only
  facilitates read, write and mmap on behalf of user processes, using
  fmap metadata provided by its privileged fuse server. As such, the
  RAS of the shared memory affects applications, but not the kernel.

- Famfs has backing device(s), but they are devdax (char) rather than
  block. Right now there is no way to tell the vfs layer that famfs has a
  char backing device (unless we say it's block, but it's not). Currently
  we use the standard anonymous fuse fs_type - but I'm not sure that's
  ultimately optimal (thoughts?)

Changes v2 [7] -> v3
- Dax: Completely new fsdev driver (drivers/dax/fsdev.c) replaces the
  dev_dax_iomap modifications to bus.c/device.c. Devdax devices can now
  be switched among 'devdax', 'famfs' and 'system-ram' modes via daxctl
  or sysfs.
- Dax: fsdev uses MEMORY_DEVICE_FS_DAX type and leaves folios at order-0
  (no vmemmap_shift), allowing fs-dax to manage folio lifecycles
  dynamically like pmem does.
- Dax: The "poisoned page" problem is properly fixed via
  fsdev_clear_folio_state(), which clears stale mapping/compound state
  when fsdev binds. The temporary WARN_ON_ONCE workaround in fs/dax.c
  has been removed.
- Dax: Added dax_set_ops() so fsdev can set dax_operations at bind time
  (and clear them on unbind), since the dax_device is created before we
  know which driver will bind.
- Dax: Added custom bind/unbind sysfs handlers; unbind return -EBUSY if a
  filesystem holds the device, preventing unbind while famfs is mounted.
- Fuse: Famfs mounts now require that the fuse server/daemon has
  CAP_SYS_RAWIO because they expose raw memory devices.
- Fuse: Added DAX address_space_operations with noop_dirty_folio since
  famfs is memory-backed with no writeback required.
- Rebased to latest kernels, fully compatible with Alistair Popple
  et. al's recent dax refactoring.
- Ran this series through Chris Mason's code review AI prompts to check
  for issues - several subtle problems found and fixed.
- Dropped RFC status - this version is intended to be mergeable.

Changes v1 [8] -> v2:

- The GET_FMAP message/response has been moved from LOOKUP to OPEN, as
  was the pretty much unanimous consensus.
- Made the response payload to GET_FMAP variable sized (patch 12)
- Dodgy kerneldoc comments cleaned up or removed.
- Fixed memory leak of fc->shadow in patch 11 (thanks Joanne)
- Dropped many pr_debug and pr_notice calls


References

[0] - https://github.com/jagalactic/linux/tree/famfs-v7 (this patch set)
[1] - https://famfs.org (famfs user space)
[2] - https://lore.kernel.org/linux-cxl/cover.1708709155.git.john@groves.net/
[3] - https://lore.kernel.org/linux-cxl/cover.1714409084.git.john@groves.net/
[4] - https://lwn.net/Articles/983105/ (lsfmm 2024)
[5] - https://lwn.net/Articles/1020170/ (lsfmm 2025)
[6] - https://lore.kernel.org/linux-cxl/cover.8068ad144a7eea4a813670301f4d2a86a8e68ec4.1740713401.git-series.apopple@nvidia.com/
[7] - https://lore.kernel.org/linux-fsdevel/20250703185032.46568-1-john@groves.net/ (famfs fuse v2)
[8] - https://lore.kernel.org/linux-fsdevel/20250421013346.32530-1-john@groves.net/ (famfs fuse v1)
[9] - https://lore.kernel.org/linux-fsdevel/20260107153244.64703-1-john@groves.net/T/#mb2c868801be16eca82dab239a1d201628534aea7 (famfs fuse v3)


John Groves (19):
  dax: move dax_pgoff_to_phys from [drivers/dax/] device.c to bus.c
  dax: Factor out dax_folio_reset_order() helper
  dax: add fsdev.c driver for fs-dax on character dax
  dax: Save the kva from memremap
  dax: Add dax_operations for use by fs-dax on fsdev dax
  dax: Add dax_set_ops() for setting dax_operations at bind time
  dax: Add fs_dax_get() func to prepare dax for fs-dax usage
  dax: export dax_dev_get()
  famfs_fuse: magic.h: Add famfs magic numbers
  famfs_fuse: Update macro s/FUSE_IS_DAX/FUSE_IS_VIRTIO_DAX/
  famfs_fuse: Basic fuse kernel ABI enablement for famfs
  famfs_fuse: Plumb the GET_FMAP message/response
  famfs_fuse: Create files with famfs fmaps
  famfs_fuse: GET_DAXDEV message and daxdev_table
  famfs_fuse: Plumb dax iomap and fuse read/write/mmap
  famfs_fuse: Add holder_operations for dax notify_failure()
  famfs_fuse: Add DAX address_space_operations with noop_dirty_folio
  famfs_fuse: Add famfs fmap metadata documentation
  famfs_fuse: Add documentation

 Documentation/filesystems/famfs.rst |  142 ++++
 Documentation/filesystems/index.rst |    1 +
 MAINTAINERS                         |   18 +
 drivers/dax/Makefile                |    6 +
 drivers/dax/bus.c                   |   30 +-
 drivers/dax/bus.h                   |    3 +
 drivers/dax/dax-private.h           |    3 +
 drivers/dax/device.c                |   23 -
 drivers/dax/fsdev.c                 |  344 ++++++++
 drivers/dax/super.c                 |   99 ++-
 fs/dax.c                            |   61 +-
 fs/fuse/Kconfig                     |   14 +
 fs/fuse/Makefile                    |    1 +
 fs/fuse/dir.c                       |    2 +-
 fs/fuse/famfs.c                     | 1184 +++++++++++++++++++++++++++
 fs/fuse/famfs_kfmap.h               |  167 ++++
 fs/fuse/file.c                      |   45 +-
 fs/fuse/fuse_i.h                    |  116 ++-
 fs/fuse/inode.c                     |   34 +-
 fs/fuse/iomode.c                    |    2 +-
 fs/namei.c                          |    1 +
 include/linux/dax.h                 |   21 +-
 include/uapi/linux/fuse.h           |   88 ++
 include/uapi/linux/magic.h          |    2 +
 24 files changed, 2344 insertions(+), 63 deletions(-)
 create mode 100644 Documentation/filesystems/famfs.rst
 create mode 100644 drivers/dax/fsdev.c
 create mode 100644 fs/fuse/famfs.c
 create mode 100644 fs/fuse/famfs_kfmap.h


base-commit: 0f61b1860cc3f52aef9036d7235ed1f017632193
-- 
2.52.0



  reply	other threads:[~2026-01-18 22:30 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260118222911.92214-1-john@jagalactic.com>
2026-01-18 22:29 ` [PATCH BUNDLE v7] famfs: Fabric-Attached Memory File System John Groves
2026-01-18 22:30   ` John Groves [this message]
2026-01-18 22:31     ` [PATCH V7 01/19] dax: move dax_pgoff_to_phys from [drivers/dax/] device.c to bus.c John Groves
2026-02-11 14:23       ` Ira Weiny
2026-02-18 23:00       ` Dave Jiang
2026-01-18 22:31     ` [PATCH V7 02/19] dax: Factor out dax_folio_reset_order() helper John Groves
2026-02-13 21:24       ` Ira Weiny
2026-02-18 23:04       ` Dave Jiang
2026-02-24  3:00       ` Ackerley Tng
2026-03-02 15:06         ` John Groves
2026-03-09  6:27           ` Ackerley Tng
2026-01-18 22:31     ` [PATCH V7 03/19] dax: add fsdev.c driver for fs-dax on character dax John Groves
2026-02-13 21:05       ` Ira Weiny
2026-02-17 17:56         ` John Groves
2026-03-19 15:11           ` Jonathan Cameron
2026-01-18 22:31     ` [PATCH V7 04/19] dax: Save the kva from memremap John Groves
2026-02-13 21:23       ` Ira Weiny
2026-02-18 23:33       ` Dave Jiang
2026-01-18 22:31     ` [PATCH V7 05/19] dax: Add dax_operations for use by fs-dax on fsdev dax John Groves
2026-02-13 21:23       ` Ira Weiny
2026-02-18  0:38         ` John Groves
2026-02-14 16:10       ` Ira Weiny
2026-02-18  0:49         ` John Groves
2026-01-18 22:32     ` [PATCH V7 06/19] dax: Add dax_set_ops() for setting dax_operations at bind time John Groves
2026-02-19 15:41       ` Dave Jiang
2026-01-18 22:32     ` [PATCH V7 07/19] dax: Add fs_dax_get() func to prepare dax for fs-dax usage John Groves
2026-02-19 16:07       ` Dave Jiang
2026-02-26 23:20         ` John Groves
2026-01-18 22:32     ` [PATCH V7 08/19] dax: export dax_dev_get() John Groves
2026-02-19 16:18       ` Dave Jiang
2026-01-18 22:32     ` [PATCH V7 09/19] famfs_fuse: magic.h: Add famfs magic numbers John Groves
2026-02-19 16:21       ` Dave Jiang
2026-01-18 22:32     ` [PATCH V7 10/19] famfs_fuse: Update macro s/FUSE_IS_DAX/FUSE_IS_VIRTIO_DAX/ John Groves
2026-02-19 16:33       ` Dave Jiang
2026-01-18 22:32     ` [PATCH V7 11/19] famfs_fuse: Basic fuse kernel ABI enablement for famfs John Groves
2026-02-19 16:57       ` Dave Jiang
2026-01-18 22:33     ` [PATCH V7 12/19] famfs_fuse: Plumb the GET_FMAP message/response John Groves
2026-02-19 17:12       ` Dave Jiang
2026-02-26  0:24         ` John Groves
2026-01-18 22:33     ` [PATCH V7 13/19] famfs_fuse: Create files with famfs fmaps John Groves
2026-02-19 18:31       ` Dave Jiang
2026-02-25 21:30         ` John Groves
2026-01-18 22:33     ` [PATCH V7 14/19] famfs_fuse: GET_DAXDEV message and daxdev_table John Groves
2026-02-19 18:51       ` Dave Jiang
2026-02-25 23:51         ` John Groves
2026-01-18 22:33     ` [PATCH V7 15/19] famfs_fuse: Plumb dax iomap and fuse read/write/mmap John Groves
2026-01-18 22:33     ` [PATCH V7 16/19] famfs_fuse: Add holder_operations for dax notify_failure() John Groves
2026-01-18 22:33     ` [PATCH V7 17/19] famfs_fuse: Add DAX address_space_operations with noop_dirty_folio John Groves
2026-01-30 23:13       ` Joanne Koong
2026-01-18 22:34     ` [PATCH V7 18/19] famfs_fuse: Add famfs fmap metadata documentation John Groves
2026-02-19 20:22       ` Dave Jiang
2026-01-18 22:34     ` [PATCH V7 19/19] famfs_fuse: Add documentation John Groves
2026-02-19 21:39       ` Dave Jiang
2026-02-26  0:29         ` John Groves
2026-01-18 22:34   ` [PATCH V7 0/3] libfuse: add basic famfs support to libfuse John Groves
2026-01-18 22:35     ` [PATCH V7 1/3] fuse_kernel.h: bring up to baseline 6.19 John Groves
2026-01-30 22:53       ` Joanne Koong
2026-01-31  0:41         ` Darrick J. Wong
2026-01-31  1:18           ` Joanne Koong
2026-01-18 22:35     ` [PATCH V7 2/3] fuse_kernel.h: add famfs DAX fmap protocol definitions John Groves
2026-01-18 22:35     ` [PATCH V7 3/3] fuse: add famfs DAX fmap support John Groves
2026-01-18 22:36   ` [PATCH V4 0/2] ndctl: Add daxctl support for the new "famfs" mode of devdax John Groves
2026-01-18 22:36     ` [PATCH V4 1/2] daxctl: Add support for famfs mode John Groves
2026-02-19 21:47       ` Dave Jiang
2026-02-27  2:00       ` Alison Schofield
2026-01-18 22:36     ` [PATCH V4 2/2] Add test/daxctl-famfs.sh to test famfs mode transitions: John Groves
2026-02-19 22:02       ` Dave Jiang
2026-01-20 17:01     ` [PATCH V4 0/2] ndctl: Add daxctl support for the new "famfs" mode of devdax Alireza Sanaee
2026-01-20 17:05       ` John Groves
2026-02-09 23:13     ` Alison Schofield
2026-02-11 14:31       ` John Groves
2026-01-20  9:12   ` [PATCH BUNDLE v7] famfs: Fabric-Attached Memory File System Alireza Sanaee
2026-01-20 15:13     ` John Groves

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0100019bd33b1f66-b835e86a-e8ae-443f-a474-02db88f7e6db-000000@email.amazonses.com \
    --to=john@jagalactic.com \
    --cc=John@Groves.net \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=ackerleytng@google.com \
    --cc=ajayjoshi@micron.com \
    --cc=alison.schofield@intel.com \
    --cc=amir73il@gmail.com \
    --cc=arramesh@micron.com \
    --cc=bagasdotme@gmail.com \
    --cc=brauner@kernel.org \
    --cc=bschubert@ddn.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=david@kernel.org \
    --cc=djwong@kernel.org \
    --cc=gourry@gourry.net \
    --cc=jack@suse.cz \
    --cc=james.morse@arm.com \
    --cc=jgroves@fastmail.com \
    --cc=jgroves@micron.com \
    --cc=jlayton@kernel.org \
    --cc=joannelkoong@gmail.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=nvdimm@lists.linux.dev \
    --cc=rdunlap@infradead.org \
    --cc=seanjc@google.com \
    --cc=shajnocz@redhat.com \
    --cc=shivankg@amd.com \
    --cc=tabba@google.com \
    --cc=venkataravis@micron.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox