From: "Jørgen Hansen" <Jorgen.Hansen@wdc.com>
To: Ira Weiny <ira.weiny@intel.com>, Dan Williams <dan.j.williams@intel.com>
Cc: Navneet Singh <navneet.singh@intel.com>,
	Fan Ni <fan.ni@samsung.com>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Dave Jiang <dave.jiang@intel.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC v2 12/18] cxl/region: Notify regions of DC changes
Date: Mon, 18 Sep 2023 13:56:49 +0000
Message-ID: <c6d4bd17-46c5-e77b-5dc1-ca5b9c9bf5d0@wdc.com>
In-Reply-To: <20230604-dcd-type2-upstream-v2-12-f740c47e7916@intel.com>

On 8/29/23 07:21, Ira Weiny wrote:
> 
> In order for a user to use dynamic capacity effectively they need to
> know when dynamic capacity is available.  Thus when Dynamic Capacity
> (DC) extents are added or removed by a DC device the regions affected
> need to be notified.  Ultimately the DAX region uses the memory
> associated with DC extents.  However, remember that CXL DAX regions
> maintain any interleave details between devices.
> 
> When a DCD event occurs, iterate all CXL endpoint decoders and notify
> regions which contain the endpoints affected by the event.  In turn
> notify the DAX regions of the changes to the DAX region extents.
> 
> For now interleave is handled by creating simple 1:1 mappings between
> the CXL DAX region and DAX region layers.  Future implementations will
> need to resolve when to actually surface a DAX region extent and pass
> the notification along.
> 
> Remember that adding capacity is safe because there is no chance of the
> memory being in use.  Also remember at this point releasing capacity is
> straight forward because DAX devices do not yet have references to the
> extents.  Future patches will handle that complication.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from v1:
> [iweiny: Rewrite]
> ---
>   drivers/cxl/core/mbox.c   |  39 +++++++++++++--
>   drivers/cxl/core/region.c | 123 +++++++++++++++++++++++++++++++++++++++++-----
>   drivers/cxl/cxl.h         |  22 +++++++++
>   drivers/cxl/mem.c         |  50 +++++++++++++++++++
>   drivers/dax/cxl.c         |  99 ++++++++++++++++++++++++++++++-------
>   drivers/dax/dax-private.h |   3 ++
>   drivers/dax/extent.c      |  14 ++++++
>   7 files changed, 317 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 5472ab1d0370..9d9c13e13ecf 100644

[snip]

> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 0aeea50550f6..a0c1f2793dd7 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1547,8 +1547,8 @@ static int cxl_region_validate_position(struct cxl_region *cxlr,
>          return 0;
>   }
> 
> -static bool cxl_dc_extent_in_ed(struct cxl_endpoint_decoder *cxled,
> -                               struct cxl_dc_extent_data *extent)
> +bool cxl_dc_extent_in_ed(struct cxl_endpoint_decoder *cxled,
> +                        struct cxl_dc_extent_data *extent)
>   {
>          struct range dpa_range = (struct range){
>                  .start = extent->dpa_start,
> @@ -1567,14 +1567,66 @@ static bool cxl_dc_extent_in_ed(struct cxl_endpoint_decoder *cxled,
>          return (cxled->dpa_res->start <= dpa_range.start &&
>                  dpa_range.end <= cxled->dpa_res->end);
>   }
> +EXPORT_SYMBOL_NS_GPL(cxl_dc_extent_in_ed, CXL);
> +
> +static int cxl_region_notify_extent(struct cxl_endpoint_decoder *cxled,
> +                                   enum dc_event event,
> +                                   struct cxl_dr_extent *cxl_dr_ext)
> +{
> +       struct cxl_dax_region *cxlr_dax;
> +       struct device *dev;
> +       int rc = 0;
> +
> +       cxlr_dax = cxled->cxld.region->cxlr_dax;
> +       dev = &cxlr_dax->dev;
> +       dev_dbg(dev, "Trying notify: type %d HPA:%llx LEN:%llx\n",
> +               event, cxl_dr_ext->hpa_offset, cxl_dr_ext->hpa_length);
> +
> +       device_lock(dev);
> +       if (dev->driver) {
> +               struct cxl_driver *reg_drv = to_cxl_drv(dev->driver);
> +               struct cxl_drv_nd nd = (struct cxl_drv_nd) {
> +                       .event = event,
> +                       .cxl_dr_ext = cxl_dr_ext,
> +               };
> +
> +               if (reg_drv->notify) {
> +                       dev_dbg(dev, "Notify: type %d HPA:%llx LEN:%llx\n",
> +                               event, cxl_dr_ext->hpa_offset,
> +                               cxl_dr_ext->hpa_length);
> +                       rc = reg_drv->notify(dev, &nd);
> +               }
> +       }
> +       device_unlock(dev);
> +       return rc;
> +}
> +
> +static resource_size_t
> +cxl_dc_extent_to_hpa_offset(struct cxl_endpoint_decoder *cxled,
> +                           struct cxl_dc_extent_data *extent)
> +{
> +       struct cxl_dax_region *cxlr_dax;
> +       resource_size_t dpa_offset, hpa;
> +       struct range *ed_hpa_range;
> +
> +       cxlr_dax = cxled->cxld.region->cxlr_dax;
> +
> +       /*
> +        * Without interleave...
> +        * HPA offset == DPA offset
> +        * ... but do the math anyway
> +        */
> +       dpa_offset = extent->dpa_start - cxled->dpa_res->start;
> +       ed_hpa_range = &cxled->cxld.hpa_range;
> +       hpa = ed_hpa_range->start + dpa_offset;
> +       return hpa - cxlr_dax->hpa_range.start;
> +}
> 
>   static int cxl_ed_add_one_extent(struct cxl_endpoint_decoder *cxled,
>                                   struct cxl_dc_extent_data *extent)
>   {
>          struct cxl_dr_extent *cxl_dr_ext;
>          struct cxl_dax_region *cxlr_dax;
> -       resource_size_t dpa_offset, hpa;
> -       struct range *ed_hpa_range;
>          struct device *dev;
>          int rc;
> 
> @@ -1601,15 +1653,7 @@ static int cxl_ed_add_one_extent(struct cxl_endpoint_decoder *cxled,
>          cxl_dr_ext->extent = extent;
>          kref_init(&cxl_dr_ext->region_ref);
> 
> -       /*
> -        * Without interleave...
> -        * HPA offset == DPA offset
> -        * ... but do the math anyway
> -        */
> -       dpa_offset = extent->dpa_start - cxled->dpa_res->start;
> -       ed_hpa_range = &cxled->cxld.hpa_range;
> -       hpa = ed_hpa_range->start + dpa_offset;
> -       cxl_dr_ext->hpa_offset = hpa - cxlr_dax->hpa_range.start;
> +       cxl_dr_ext->hpa_offset = cxl_dc_extent_to_hpa_offset(cxled, extent);
> 
>          /* Without interleave carry length and label through */
>          cxl_dr_ext->hpa_length = extent->length;
> @@ -1626,6 +1670,7 @@ static int cxl_ed_add_one_extent(struct cxl_endpoint_decoder *cxled,
>          }
>          /* Put in cxl_dr_release() */
>          cxl_dc_extent_get(cxl_dr_ext->extent);
> +       cxl_region_notify_extent(cxled, DCD_ADD_CAPACITY, cxl_dr_ext);
>          return 0;
>   }
> 
> @@ -1663,6 +1708,58 @@ static int cxl_ed_add_extents(struct cxl_endpoint_decoder *cxled)
>          return 0;
>   }
> 
> +static int cxl_ed_rm_dc_extent(struct cxl_endpoint_decoder *cxled,
> +                              enum dc_event event,
> +                              struct cxl_dc_extent_data *extent)
> +{
> +       struct cxl_region *cxlr = cxled->cxld.region;
> +       struct cxl_dax_region *cxlr_dax = cxlr->cxlr_dax;
> +       struct cxl_dr_extent *cxl_dr_ext;
> +       resource_size_t hpa_offset;
> +
> +       hpa_offset = cxl_dc_extent_to_hpa_offset(cxled, extent);
> +
> +       /*
> +        * NOTE on Interleaving: There is no need to 'break up' the cxl_dr_ext.
> +        * If one of the extents comprising it is gone it should be removed
> +        * from the region to prevent future use.  Later code may save other
> +        * extents for future processing.  But for now the corelation is 1:1:1
> +        * so just erase the extent.
> +        */
> +       cxl_dr_ext = xa_erase(&cxlr_dax->extents, hpa_offset);
> +
> +       dev_dbg(&cxlr_dax->dev, "Remove DAX region ext HPA:%llx\n",
> +               cxl_dr_ext->hpa_offset);
> +       cxl_region_notify_extent(cxled, event, cxl_dr_ext);
> +       cxl_dr_extent_put(cxl_dr_ext);
> +       return 0;
> +}
> +
> +int cxl_ed_notify_extent(struct cxl_endpoint_decoder *cxled,
> +                        struct cxl_drv_nd *nd)
> +{
> +       int rc = 0;
> +
> +       switch (nd->event) {
> +       case DCD_ADD_CAPACITY:
> +               if (cxl_dc_extent_get_not_zero(nd->extent)) {
> +                       rc = cxl_ed_add_one_extent(cxled, nd->extent);
> +                       if (rc)
> +                               cxl_dc_extent_put(nd->extent);

Hi,

While adding and releasing DCD extents through the QMP interface of the
QEMU DCD emulation, I noticed that released extents weren't handed back
to the device. It looks like a refcounting issue: the kref on the dc
extents never drops below 2. Should we put the dc extent here
unconditionally rather than only on error? cxl_ed_add_one_extent() also
takes a reference to the dc extent, and that one is put in
cxl_dr_release(), but I couldn't find a matching put for this
get_not_zero.
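
To make the suspected imbalance concrete, here is a simplified userspace
model in C. It is a sketch, not kernel code: struct model_extent and all
model_*() names are hypothetical stand-ins, a plain integer replaces
struct kref, and the assumption that the extent starts with one owner
reference that is dropped at release time is mine. The absolute counts
are therefore not meant to match what the kernel reports; only the
unmatched get is the point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dc extent with an embedded kref. */
struct model_extent {
	int refs;	/* models the kref count */
	bool released;	/* set once the count reaches zero */
};

static void model_get(struct model_extent *e)
{
	e->refs++;
}

/* Models cxl_dc_extent_get_not_zero(): fails if the object is dying. */
static bool model_get_not_zero(struct model_extent *e)
{
	if (e->refs == 0)
		return false;
	e->refs++;
	return true;
}

static void model_put(struct model_extent *e)
{
	assert(e->refs > 0);
	if (--e->refs == 0)
		e->released = true;	/* extent would go back to the device */
}

/*
 * Models cxl_ed_add_one_extent(): on success it takes its own
 * reference, which is later dropped by cxl_dr_release().
 */
static int model_add_one_extent(struct model_extent *e, bool fail)
{
	if (fail)
		return -1;
	model_get(e);
	return 0;
}

/*
 * Models the DCD_ADD_CAPACITY arm of cxl_ed_notify_extent().
 * always_put == false: put only on error, as in the patch.
 * always_put == true:  put unconditionally, as suggested above.
 */
static int model_notify_add(struct model_extent *e, bool fail,
			    bool always_put)
{
	int rc = 0;

	if (model_get_not_zero(e)) {
		rc = model_add_one_extent(e, fail);
		if (rc || always_put)
			model_put(e);
	}
	return rc;
}

/* Models a release: the region reference and the assumed owner reference go away. */
static void model_release(struct model_extent *e)
{
	model_put(e);	/* models the put in cxl_dr_release() */
	model_put(e);	/* models the owner's initial reference (assumption) */
}
```

Starting from a count of 1, a successful model_notify_add() with
always_put == false followed by model_release() leaves one reference
outstanding and the extent is never marked released. With
always_put == true the same sequence reaches zero. The error path is
balanced in both variants.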


> +               }
> +               break;
> +       case DCD_RELEASE_CAPACITY:
> +       case DCD_FORCED_CAPACITY_RELEASE:
> +               rc = cxl_ed_rm_dc_extent(cxled, nd->event, nd->extent);
> +               break;
> +       default:
> +               dev_err(&cxled->cxld.dev, "Unknown DC event %d\n", nd->event);
> +               break;
> +       }
> +       return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_ed_notify_extent, CXL);
> +
>   static int cxl_region_attach_position(struct cxl_region *cxlr,
>                                        struct cxl_root_decoder *cxlrd,
>                                        struct cxl_endpoint_decoder *cxled,

[snip]

> 
> --
> 2.41.0
> 

Thanks,
Jorgen
