From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
"Navneet Singh" <navneet.singh@intel.com>,
Jonathan Corbet <corbet@lwn.net>,
"Andrew Morton" <akpm@linux-foundation.org>,
Dan Williams <dan.j.williams@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
"Alison Schofield" <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
<linux-btrfs@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-doc@vger.kernel.org>, <nvdimm@lists.linux.dev>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 24/28] dax/region: Create resources on sparse DAX regions
Date: Thu, 10 Oct 2024 16:27:45 +0100
Message-ID: <20241010162745.00007b31@Huawei.com>
In-Reply-To: <20241007-dcd-type2-upstream-v4-24-c261ee6eeded@intel.com>
On Mon, 07 Oct 2024 18:16:30 -0500
ira.weiny@intel.com wrote:
> From: Navneet Singh <navneet.singh@intel.com>
>
> DAX regions which map dynamic capacity partitions require that memory be
> allowed to come and go. Recall sparse regions were created for this
> purpose. Now that extents can be realized within DAX regions the DAX
> region driver can start tracking sub-resource information.
>
> The tight relationship between DAX region operations and extent
> operations require memory changes to be controlled synchronously with
> the user of the region. Synchronize through the dax_region_rwsem and by
> having the region driver drive both the region device as well as the
> extent sub-devices.
>
> Recall requests to remove extents can happen at any time and that a host
> is not obligated to release the memory until it is not being used. If
> an extent is not used allow a release response.
>
> The DAX layer has no need for the details of the CXL memory extent
> devices. Expose extents to the DAX layer as device children of the DAX
> region device. A single callback from the driver aids the DAX layer to
> determine if the child device is an extent. The DAX layer also
> registers a devres function to automatically clean up when the device is
> removed from the region.
>
> There is a race between extents being surfaced and the dax_cxl driver
> being loaded. The driver must therefore scan for any existing extents
> while still under the device lock.
>
> Respond to extent notifications. Manage the DAX region resource tree
> based on the extents lifetime. Return the status of remove
> notifications to lower layers such that it can manage the hardware
> appropriately.
>
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>
More somewhat superficial review from me.
Needs DAX expert reviewers.
Jonathan
> ---
> drivers/cxl/core/extent.c | 74 ++++++++++++--
> drivers/cxl/cxl.h | 6 ++
> drivers/dax/bus.c | 243 +++++++++++++++++++++++++++++++++++++++++-----
> drivers/dax/bus.h | 3 +-
> drivers/dax/cxl.c | 62 +++++++++++-
> drivers/dax/dax-private.h | 42 ++++++++
> drivers/dax/hmem/hmem.c | 2 +-
> drivers/dax/pmem.c | 2 +-
> 8 files changed, 396 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> index a1eb6e8e4f1a..75fb73ce2185 100644
> --- a/drivers/cxl/core/extent.c
> +++ b/drivers/cxl/core/extent.c
> @@ -270,20 +270,65 @@ static void calc_hpa_range(struct cxl_endpoint_decoder *cxled,
> hpa_range->end = hpa_range->start + range_len(dpa_range) - 1;
> }
>
> +static int cxlr_notify_extent(struct cxl_region *cxlr, enum dc_event event,
> + struct region_extent *region_extent)
> +{
> + struct device *dev = &cxlr->cxlr_dax->dev;
> + struct cxl_notify_data notify_data;
> + struct cxl_driver *driver;
> +
> + dev_dbg(dev, "Trying notify: type %d HPA %pra\n",
> + event, &region_extent->hpa_range);
> +
> + guard(device)(dev);
> +
> + /*
> + * The lack of a driver indicates a notification has failed. No user
> + * space coordiantion was possible.
Spell check: coordination
> + */
> + if (!dev->driver)
> + return 0;
> + driver = to_cxl_drv(dev->driver);
> + if (!driver->notify)
> + return 0;
> +
> + notify_data = (struct cxl_notify_data) {
> + .event = event,
> + .region_extent = region_extent,
> + };
> +
> + dev_dbg(dev, "Notify: type %d HPA %pra\n",
> + event, &region_extent->hpa_range);
> + return driver->notify(dev, &notify_data);
> +}
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index f0e3f8c787df..4e19d18369de 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -183,6 +183,86 @@ static bool is_sparse(struct dax_region *dax_region)
> return (dax_region->res.flags & IORESOURCE_DAX_SPARSE_CAP) != 0;
> }
> +
> +int dax_region_add_resource(struct dax_region *dax_region,
> + struct device *device,
> + resource_size_t start, resource_size_t length)
> +{
> + struct resource *new_resource;
> + int rc;
> +
> + struct dax_resource *dax_resource __free(kfree) =
> + kzalloc(sizeof(*dax_resource), GFP_KERNEL);
> + if (!dax_resource)
> + return -ENOMEM;
> +
> + guard(rwsem_write)(&dax_region_rwsem);
> +
> + dev_dbg(dax_region->dev, "DAX region resource %pr\n", &dax_region->res);
> + new_resource = __request_region(&dax_region->res, start, length, "extent", 0);
> + if (!new_resource) {
> + dev_err(dax_region->dev, "Failed to add region s:%pa l:%pa\n",
> + &start, &length);
> + return -ENOSPC;
> + }
> +
> + dev_dbg(dax_region->dev, "add resource %pr\n", new_resource);
> + dax_resource->region = dax_region;
> + dax_resource->res = new_resource;
> + dev_set_drvdata(device, dax_resource);
> + rc = devm_add_action_or_reset(device, dax_release_resource,
> + no_free_ptr(dax_resource));
> + /* On error; ensure driver data is cleared under semaphore */
It's not used in the dax_release_resource callback (that I can
immediately spot) so could you just not set it until after
this has succeeded?
> + if (rc)
> + dev_set_drvdata(device, NULL);
i.e. move
dev_set_drvdata(device, dax_resource);
to here.
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(dax_region_add_resource);
Adding quite a few exports. Is it time to namespace DAX exports?
Perhaps a follow up series.
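
To make the dev_set_drvdata() reordering suggested above concrete, here is a
minimal userspace sketch. Everything in it (struct mock_device,
mock_add_action_or_reset(), mock_release(), mock_add_resource()) is a
hypothetical stand-in, not the kernel's struct device or devres API. It only
assumes the "add action or reset" contract, i.e. that the action runs
immediately when registration fails.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's struct device or devres. */
struct mock_device {
	void *drvdata;
	void (*cleanup)(void *data);
	void *cleanup_data;
	int fail_registration;	/* test knob: force registration to fail */
};

static int released;

static void mock_release(void *data)
{
	(void)data;
	released++;
}

/* Mimics "add action or reset": on failure the action is run right away. */
static int mock_add_action_or_reset(struct mock_device *dev,
				    void (*action)(void *), void *data)
{
	assert(dev && action);
	if (dev->fail_registration) {
		action(data);
		return -12;	/* stand-in for -ENOMEM */
	}
	dev->cleanup = action;
	dev->cleanup_data = data;
	return 0;
}

/*
 * Publish the pointer only after cleanup is registered, so the error
 * path never has to clear it again.
 */
static int mock_add_resource(struct mock_device *dev, void *res)
{
	int rc = mock_add_action_or_reset(dev, mock_release, res);

	if (rc)
		return rc;
	dev->drvdata = res;
	return 0;
}
```

With this ordering a failed registration leaves drvdata untouched, which is
only valid if the release callback does not read drvdata, as noted above.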
> bool static_dev_dax(struct dev_dax *dev_dax)
> {
> return is_static(dev_dax->region);
> @@ -296,19 +376,44 @@ static ssize_t region_align_show(struct device *dev,
> static struct device_attribute dev_attr_region_align =
> __ATTR(align, 0400, region_align_show, NULL);
>
> +#define for_each_child_resource(extent, res) \
> + for (res = (extent)->child; res; res = res->sibling)
> +
Extent naming in here is a little off for a general-sounding macro.
Maybe for_each_child_resource(parent, res) or something like that?
Seems generally useful. Maybe move to resource.h?
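
A rough userspace sketch of the renamed macro, for illustration only. The
struct below is a cut-down stand-in carrying just the tree links, and
count_child_resources() is a hypothetical helper that does not exist in the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the tree links of the kernel's struct resource. */
struct resource {
	unsigned long long start, end;
	struct resource *parent, *sibling, *child;
};

/* Same loop as in the patch, with the first argument named for what it is. */
#define for_each_child_resource(parent, res) \
	for (res = (parent)->child; res; res = res->sibling)

/* Counts direct children only; grandchildren are not visited. */
static size_t count_child_resources(struct resource *parent_res)
{
	struct resource *res;
	size_t n = 0;

	assert(parent_res);
	for_each_child_resource(parent_res, res)
		n++;
	return n;
}
```

The loop body must not unlink res, since the iterator reads res->sibling
after each pass.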
> @@ -1494,8 +1679,14 @@ static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
> device_initialize(dev);
> dev_set_name(dev, "dax%d.%d", dax_region->id, dev_dax->id);
>
> + if (is_sparse(dax_region) && data->size) {
> + dev_err(parent, "Sparse DAX region devices must be created initially with 0 size");
> + rc = -EINVAL;
> + goto err_id;
Right label? This code doesn't have side effects and the next error path is goto err_range.
Looks like you fail to reverse the alloc_dev_dax_id() in this error path.
> + }
> +
> rc = alloc_dev_dax_range(&dax_region->res, dev_dax, dax_region->res.start,
> - data->size);
> + data->size, NULL);
> if (rc)
> goto err_range;
>
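
The label question raised on the sparse size check above can be sketched in
userspace as follows. This assumes, without having the full function quoted
here, that err_range releases the id and then falls through to err_id. All
mock_* names and the ids_in_use counter are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

static int ids_in_use;	/* hypothetical stand-in for the id allocator state */

struct mock_dev_dax {
	int id;
};

static int mock_alloc_id(struct mock_dev_dax *d)
{
	d->id = ids_in_use++;
	return 0;
}

static void mock_free_id(struct mock_dev_dax *d)
{
	(void)d;
	ids_in_use--;
}

static struct mock_dev_dax *mock_create(int sparse, unsigned long size)
{
	struct mock_dev_dax *d = calloc(1, sizeof(*d));
	int rc;

	if (!d)
		return NULL;

	rc = mock_alloc_id(d);
	if (rc)
		goto err_id;

	/* The id is already held here, so unwind via the label that frees it. */
	if (sparse && size)
		goto err_range;

	return d;

err_range:
	mock_free_id(d);
err_id:
	free(d);
	return NULL;
}
```

Jumping to err_id from the size check would skip mock_free_id() and leak the
id, which is the concern raised in the review comment.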
> diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
> index 783bfeef42cc..ae5029ea6047 100644
> --- a/drivers/dax/bus.h
> +++ b/drivers/dax/bus.h
> @@ -9,6 +9,7 @@ struct dev_dax;
> struct resource;
> struct dax_device;
> struct dax_region;
> +struct dax_sparse_ops;
>
> /* dax bus specific ioresource flags */
> #define IORESOURCE_DAX_STATIC BIT(0)
> @@ -17,7 +18,7 @@ struct dax_region;
>
> struct dax_region *alloc_dax_region(struct device *parent, int region_id,
> struct range *range, int target_node, unsigned int align,
> - unsigned long flags);
> + unsigned long flags, struct dax_sparse_ops *sparse_ops);
>
> struct dev_dax_data {
> struct dax_region *dax_region;
> diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c
> index 367e86b1c22a..df979ea2cb59 100644
> --- a/drivers/dax/cxl.c
> +++ b/drivers/dax/cxl.c
> @@ -5,6 +5,58 @@
>
> #include "../cxl/cxl.h"
> #include "bus.h"
> +#include "dax-private.h"
> +
> +static int __cxl_dax_add_resource(struct dax_region *dax_region,
> + struct region_extent *region_extent)
> +{
> + resource_size_t start, length;
> + struct device *dev;
> +
> + dev = &region_extent->dev;
Might as well do
struct device *dev = &region_extent->dev;
> + start = dax_region->res.start + region_extent->hpa_range.start;
> + length = range_len(&region_extent->hpa_range);
> + return dax_region_add_resource(dax_region, dev, start, length);
> +}
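
As a worked example of the start/length computation in
__cxl_dax_add_resource(), here is a small userspace sketch. It assumes
hpa_range holds offsets relative to the start of the region, which the
addition to res.start implies but the quoted hunk does not state.
extent_to_resource() is a hypothetical helper, and struct range and
range_len() are re-declared locally with inclusive bounds.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's struct range: both bounds are inclusive. */
struct range {
	uint64_t start, end;
};

static uint64_t range_len(const struct range *r)
{
	assert(r->end >= r->start);
	return r->end - r->start + 1;
}

/* Hypothetical helper mirroring the two assignments in the patch. */
static void extent_to_resource(uint64_t region_start,
			       const struct range *hpa_range,
			       uint64_t *start, uint64_t *length)
{
	*start = region_start + hpa_range->start;
	*length = range_len(hpa_range);
}
```

For a region starting at 0x100000000 and an extent covering offsets
0x10000000 to 0x1fffffff, this gives start 0x110000000 and length 0x10000000.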
> diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
> index ccde98c3d4e2..e3866115243e 100644
> --- a/drivers/dax/dax-private.h
> +++ b/drivers/dax/dax-private.h
...
> +/*
> + * Similar to run_dax() dax_region_{add,rm}_resource() and dax_avail_size() are
> + * exported but are not intended to be generic operations outside the dax
> + * subsystem. They are only generic between the dax layer and the dax drivers.
> + */
> +int dax_region_add_resource(struct dax_region *dax_region, struct device *dev,
> + resource_size_t start, resource_size_t length);
> +int dax_region_rm_resource(struct dax_region *dax_region,
> + struct device *dev);
> +resource_size_t dax_avail_size(struct resource *dax_resource);
> +
> +typedef int (*match_cb)(struct device *dev, resource_size_t *size_avail);
Why is this here?