From: John Groves <John@groves.net>
To: Ira Weiny <ira.weiny@intel.com>
Cc: John Groves <john@jagalactic.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	 Dan Williams <dan.j.williams@intel.com>,
	Bernd Schubert <bschubert@ddn.com>,
	 Alison Schofield <alison.schofield@intel.com>,
	John Groves <jgroves@micron.com>,
	 John Groves <jgroves@fastmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	 Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	 Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	David Hildenbrand <david@kernel.org>,
	 Christian Brauner <brauner@kernel.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	 Randy Dunlap <rdunlap@infradead.org>,
	Jeff Layton <jlayton@kernel.org>,
	 Amir Goldstein <amir73il@gmail.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	 Stefan Hajnoczi <shajnocz@redhat.com>,
	Joanne Koong <joannelkoong@gmail.com>,
	 Josef Bacik <josef@toxicpanda.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	 James Morse <james.morse@arm.com>, Fuad Tabba <tabba@google.com>,
	 Sean Christopherson <seanjc@google.com>,
	Shivank Garg <shivankg@amd.com>,
	 Ackerley Tng <ackerleytng@google.com>,
	Gregory Price <gourry@gourry.net>,
	 Aravind Ramesh <arramesh@micron.com>,
	Ajay Joshi <ajayjoshi@micron.com>,
	 "venkataravis@micron.com" <venkataravis@micron.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	 "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH V7 03/19] dax: add fsdev.c driver for fs-dax on character dax
Date: Tue, 17 Feb 2026 11:56:20 -0600
Message-ID: <aZSoCIjbxKIqRZF4@groves.net>
In-Reply-To: <698f922296bd0_bcb8910059@iweiny-mobl.notmuch>

On 26/02/13 03:05PM, Ira Weiny wrote:
> John Groves wrote:
> > From: John Groves <john@groves.net>
> > 
> > The new fsdev driver provides pages/folios initialized compatibly with
> > fsdax - normal rather than devdax-style refcounting, and starting out
> > with order-0 folios.
> > 
> > When fsdev binds to a daxdev, it is usually (always?) switching from the
> > devdax mode (device.c), which pre-initializes compound folios according
> > to its alignment. Fsdev uses fsdev_clear_folio_state() to switch the
> > folios into a fsdax-compatible state.
> > 
> > A side effect of this is that raw mmap doesn't (can't?) work on an fsdev
> > dax instance. Accordingly, The fsdev driver does not provide raw mmap -
> > devices must be put in 'devdax' mode (drivers/dax/device.c) to get raw
> > mmap capability.
> > 
> > In this commit is just the framework, which remaps pages/folios compatibly
> > with fsdax.
> > 
> > Enabling dax changes:
> > 
> > - bus.h: add DAXDRV_FSDEV_TYPE driver type
> > - bus.c: allow DAXDRV_FSDEV_TYPE drivers to bind to daxdevs
> > - dax.h: prototype inode_dax(), which fsdev needs
> > 
> > Suggested-by: Dan Williams <dan.j.williams@intel.com>
> > Suggested-by: Gregory Price <gourry@gourry.net>
> > Signed-off-by: John Groves <john@groves.net>
> > ---
> >  MAINTAINERS          |   8 ++
> >  drivers/dax/Makefile |   6 ++
> >  drivers/dax/bus.c    |   4 +
> >  drivers/dax/bus.h    |   1 +
> >  drivers/dax/fsdev.c  | 242 +++++++++++++++++++++++++++++++++++++++++++
> >  fs/dax.c             |   1 +
> >  include/linux/dax.h  |   5 +
> >  7 files changed, 267 insertions(+)
> >  create mode 100644 drivers/dax/fsdev.c
> > 
> 
> [snip]
> 
> > +
> > +static int fsdev_dax_probe(struct dev_dax *dev_dax)
> > +{
> > +	struct dax_device *dax_dev = dev_dax->dax_dev;
> > +	struct device *dev = &dev_dax->dev;
> > +	struct dev_pagemap *pgmap;
> > +	u64 data_offset = 0;
> > +	struct inode *inode;
> > +	struct cdev *cdev;
> > +	void *addr;
> > +	int rc, i;
> > +
> > +	if (static_dev_dax(dev_dax))  {
> > +		if (dev_dax->nr_range > 1) {
> > +			dev_warn(dev, "static pgmap / multi-range device conflict\n");
> > +			return -EINVAL;
> > +		}
> > +
> > +		pgmap = dev_dax->pgmap;
> > +	} else {
> > +		size_t pgmap_size;
> > +
> > +		if (dev_dax->pgmap) {
> > +			dev_warn(dev, "dynamic-dax with pre-populated page map\n");
> > +			return -EINVAL;
> > +		}
> > +
> > +		pgmap_size = struct_size(pgmap, ranges, dev_dax->nr_range - 1);
> > +		pgmap = devm_kzalloc(dev, pgmap_size,  GFP_KERNEL);
> > +		if (!pgmap)
> > +			return -ENOMEM;
> > +
> > +		pgmap->nr_range = dev_dax->nr_range;
> > +		dev_dax->pgmap = pgmap;
> > +
> > +		for (i = 0; i < dev_dax->nr_range; i++) {
> > +			struct range *range = &dev_dax->ranges[i].range;
> > +
> > +			pgmap->ranges[i] = *range;
> > +		}
> > +	}
> > +
> > +	for (i = 0; i < dev_dax->nr_range; i++) {
> > +		struct range *range = &dev_dax->ranges[i].range;
> > +
> > +		if (!devm_request_mem_region(dev, range->start,
> > +					range_len(range), dev_name(dev))) {
> > +			dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve range\n",
> > +				 i, range->start, range->end);
> > +			return -EBUSY;
> > +		}
> > +	}
> 
> All of the above code is AFAICT exactly the same as the dev_dax driver.
> Isn't there a way to make this common?
> 
> The rest of the common code is simple enough.

dev_dax_probe() and fsdev_dax_probe() do indeed share some code: range
validity checking and pgmap setup, from the top of probe through the for
loop above. After that they differ. I also just did a scan, and the probe
function appears to be the only remaining common code between device.c
and fsdev.c.

These are separate kmods; that code could certainly be factored out and
shared, but it would need to go somewhere common (maybe bus.c?).

So both device.c and fsdev.c would call bus.c:dax_prepare_pgmap() or
some such.

I feel like this might not be worth factoring out, but I'm happy to do it
if you and/or the dax team prefer it factored out and shared.
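
For illustration only, here is a rough userspace model of what such a
shared helper might look like. It is a sketch, not code from this series:
the name dax_prepare_pgmap() is just the placeholder mentioned above, the
structs are simplified stand-ins for the kernel types, the is_static field
stands in for static_dev_dax(), calloc() stands in for devm_kzalloc(), and
the devm_request_mem_region() loop is left out.

```c
/* Userspace model only; simplified stand-ins for the kernel types. */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
	uint64_t start;
	uint64_t end;
};

/* Assumption: a plain flexible array instead of the kernel's layout. */
struct dev_pagemap {
	int nr_range;
	struct range ranges[];
};

struct dev_dax {
	int is_static;			/* models static_dev_dax() */
	int nr_range;
	struct range *ranges;		/* models dev_dax->ranges[i].range */
	struct dev_pagemap *pgmap;
};

/*
 * Hypothetical shared helper: validate the device and hand back a pgmap
 * whose ranges mirror the device's ranges. Both probe functions would
 * call this and then do their own driver-specific setup.
 */
static int dax_prepare_pgmap(struct dev_dax *dev_dax,
			     struct dev_pagemap **out)
{
	struct dev_pagemap *pgmap;
	int i;

	if (dev_dax->is_static) {
		/* A static pgmap cannot describe a multi-range device. */
		if (dev_dax->nr_range > 1)
			return -EINVAL;
		*out = dev_dax->pgmap;
		return 0;
	}

	/* A dynamic device must not arrive with a pre-populated pgmap. */
	if (dev_dax->pgmap)
		return -EINVAL;

	pgmap = calloc(1, sizeof(*pgmap) +
		       (size_t)dev_dax->nr_range * sizeof(struct range));
	if (!pgmap)
		return -ENOMEM;

	pgmap->nr_range = dev_dax->nr_range;
	for (i = 0; i < dev_dax->nr_range; i++)
		pgmap->ranges[i] = dev_dax->ranges[i];

	dev_dax->pgmap = pgmap;
	*out = pgmap;
	return 0;
}
```

With something along these lines, each probe would keep only its own
part after the call: fsdev.c setting MEMORY_DEVICE_FS_DAX without
vmemmap_shift, and device.c doing its existing setup.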

> 
> > +
> > +	/*
> > +	 * FS-DAX compatible mode: Use MEMORY_DEVICE_FS_DAX type and
> > +	 * do NOT set vmemmap_shift. This leaves folios at order-0,
> > +	 * allowing fs-dax to dynamically create compound folios as needed
> > +	 * (similar to pmem behavior).
> > +	 */
> > +	pgmap->type = MEMORY_DEVICE_FS_DAX;
> > +	pgmap->ops = &fsdev_pagemap_ops;
> > +	pgmap->owner = dev_dax;
> > +
> > +	/*
> > +	 * CRITICAL DIFFERENCE from device.c:
> > +	 * We do NOT set vmemmap_shift here, even if align > PAGE_SIZE.
> > +	 * This ensures folios remain order-0 and are compatible with
> > +	 * fs-dax's folio management.
> > +	 */
> > +
> > +	addr = devm_memremap_pages(dev, pgmap);
> > +	if (IS_ERR(addr))
> > +		return PTR_ERR(addr);
> > +
> > +	/*
> > +	 * Clear any stale compound folio state left over from a previous
> > +	 * driver (e.g., device_dax with vmemmap_shift).
> > +	 */
> > +	fsdev_clear_folio_state(dev_dax);
> > +
> > +	/* Detect whether the data is at a non-zero offset into the memory */
> > +	if (pgmap->range.start != dev_dax->ranges[0].range.start) {
> > +		u64 phys = dev_dax->ranges[0].range.start;
> > +		u64 pgmap_phys = dev_dax->pgmap[0].range.start;
> > +
> > +		if (!WARN_ON(pgmap_phys > phys))
> > +			data_offset = phys - pgmap_phys;
> > +
> > +		pr_debug("%s: offset detected phys=%llx pgmap_phys=%llx offset=%llx\n",
> > +		       __func__, phys, pgmap_phys, data_offset);
> > +	}
> > +
> > +	inode = dax_inode(dax_dev);
> > +	cdev = inode->i_cdev;
> > +	cdev_init(cdev, &fsdev_fops);
> > +	cdev->owner = dev->driver->owner;
> > +	cdev_set_parent(cdev, &dev->kobj);
> > +	rc = cdev_add(cdev, dev->devt, 1);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = devm_add_action_or_reset(dev, fsdev_cdev_del, cdev);
> > +	if (rc)
> > +		return rc;
> > +
> > +	run_dax(dax_dev);
> > +	return devm_add_action_or_reset(dev, fsdev_kill, dev_dax);
> > +}
> > +
> 
> [snip]
> 
> > diff --git a/include/linux/dax.h b/include/linux/dax.h
> > index 9d624f4d9df6..fe1315135fdd 100644
> > --- a/include/linux/dax.h
> > +++ b/include/linux/dax.h
> > @@ -51,6 +51,10 @@ struct dax_holder_operations {
> >  
> >  #if IS_ENABLED(CONFIG_DAX)
> >  struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
> > +
> > +#if IS_ENABLED(CONFIG_DEV_DAX_FS)
> > +struct dax_device *inode_dax(struct inode *inode);
> > +#endif
> 
> I don't understand why this hunk is added here but then removed in a later
> patch?  Why can't this be placed below? ...
> 
> >  void *dax_holder(struct dax_device *dax_dev);
> >  void put_dax(struct dax_device *dax_dev);
> >  void kill_dax(struct dax_device *dax_dev);
> > @@ -153,6 +157,7 @@ static inline void fs_put_dax(struct dax_device *dax_dev, void *holder)
> >  #if IS_ENABLED(CONFIG_FS_DAX)
> >  int dax_writeback_mapping_range(struct address_space *mapping,
> >  		struct dax_device *dax_dev, struct writeback_control *wbc);
> > +int dax_folio_reset_order(struct folio *folio);
> 
> ... Here?

Done, thanks - good catch. That was just sloppy factoring into a series on
my part.

> 
> Ira
> 
> [snip]

Thanks for the review, Ira!

Regards,
John
