From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
"Dan Williams" <dan.j.williams@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
<linux-cxl@vger.kernel.org>, <nvdimm@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, Li Ming <ming.li@zohomail.com>
Subject: Re: [PATCH v9 12/19] cxl/extent: Process dynamic partition events and realize region extents
Date: Mon, 14 Apr 2025 17:07:52 +0100
Message-ID: <20250414170752.00002356@huawei.com>
In-Reply-To: <20250413-dcd-type2-upstream-v9-12-1d4911a0b365@intel.com>
On Sun, 13 Apr 2025 17:52:20 -0500
Ira Weiny <ira.weiny@intel.com> wrote:
> A dynamic capacity device (DCD) sends events to signal the host for
> changes in the availability of Dynamic Capacity (DC) memory. These
> events contain extents describing a DPA range and meta data for memory
> to be added or removed. Events may be sent from the device at any time.
>
> Three types of events can be signaled, Add, Release, and Force Release.
>
> On add, the host may accept or reject the memory being offered. If no
> region exists, or the extent is invalid, the extent should be rejected.
> Add extent events may be grouped by a 'more' bit which indicates those
> extents should be processed as a group.
>
> On remove, the host can delay the response until the host is safely not
> using the memory. If no region exists the release can be sent
> immediately. The host may also release extents (or partial extents) at
> any time. Thus the 'more' bit grouping of release events is of less
> value and can be ignored in favor of sending multiple release capacity
> responses for groups of release events.
>
> Force removal is intended as a mechanism between the FM and the device
> and intended only when the host is unresponsive, out of sync, or
> otherwise broken. Purposely ignore force removal events.
>
> Regions are made up of one or more devices which may be surfacing memory
> to the host. Once all devices in a region have surfaced an extent the
> region can expose a corresponding extent for the user to consume.
> Without interleaving a device extent forms a 1:1 relationship with the
> region extent. Immediately surface a region extent upon getting a
> device extent.
>
> Per the specification the device is allowed to offer or remove extents
> at any time. However, anticipated use cases can expect extents to be
> offered, accepted, and removed in well defined chunks.
>
> Simplify extent tracking with the following restrictions.
>
> 1) Flag for removal any extent which overlaps a requested
> release range.
> 2) Refuse the offer of extents which overlap already accepted
> memory ranges.
> 3) Accept again a range which has already been accepted by the
> host. Eating duplicates serves three purposes.
> 3a) This simplifies the code if the device should get out of
> sync with the host. And it should be safe to acknowledge
> the extent again.
> 3b) This simplifies the code to process existing extents if
> the extent list should change while the extent list is
> being read.
> 3c) Duplicates for a given partition which are seen during a
> race between the hardware surfacing an extent and the cxl
> dax driver scanning for existing extents will be ignored.
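The three restrictions above can be sketched in plain userspace C. This is an illustration only, not the code in the patch: classify_offer(), count_flagged_for_release() and the verdict enum are hypothetical names, struct range is a local stand-in for the kernel type, and "already accepted" is assumed here to mean an exact start/end match, which may differ from how the patch defines a duplicate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's struct range; both ends are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

static bool range_overlaps(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Hypothetical verdicts for an offered extent (rules 2 and 3). */
enum offer_verdict {
	OFFER_ACCEPT_NEW,
	OFFER_ACCEPT_DUPLICATE,
	OFFER_REJECT_OVERLAP,
};

/*
 * Rule 3: a range that exactly matches an accepted extent is accepted again.
 * Rule 2: any other overlap with accepted memory is refused.
 * Assumes the accepted list itself contains no overlapping ranges.
 */
static enum offer_verdict classify_offer(const struct range *accepted,
					 size_t nr, const struct range *offer)
{
	for (size_t i = 0; i < nr; i++) {
		if (accepted[i].start == offer->start &&
		    accepted[i].end == offer->end)
			return OFFER_ACCEPT_DUPLICATE;
		if (range_overlaps(&accepted[i], offer))
			return OFFER_REJECT_OVERLAP;
	}
	return OFFER_ACCEPT_NEW;
}

/* Rule 1: every accepted extent touching the release range is flagged. */
static size_t count_flagged_for_release(const struct range *accepted,
					size_t nr, const struct range *release)
{
	size_t flagged = 0;

	for (size_t i = 0; i < nr; i++)
		if (range_overlaps(&accepted[i], release))
			flagged++;
	return flagged;
}
```

Note that under rule 1 a release request covering only part of an extent still flags the whole extent, which is what keeps the tracking simple.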
>
> NOTE: Processing existing extents is done in a later patch.
>
> Management of the region extent devices must be synchronized with
> potential uses of the memory within the DAX layer. Create region extent
> devices as children of the cxl_dax_region device such that the DAX
> region driver can co-drive them and synchronize with the DAX layer.
> Synchronization and management is handled in a subsequent patch.
>
> Tag support within the DAX layer is not yet supported. To maintain
> compatibility with legacy DAX/region processing only tags with a value
> of 0 are allowed. This defines existing DAX devices as having a 0 tag
> which makes the most logical sense as a default.
>
> Process DCD events and create region devices.
>
> Based on an original patch by Navneet Singh.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Li Ming <ming.li@zohomail.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
I've forgotten what our policy is on spec references in new
code. Maybe update them to CXL 3.2?
A few tiny little things inline from a fresh look.
Thanks,
Jonathan
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> new file mode 100644
> index 000000000000..6df277caf974
> --- /dev/null
> +++ b/drivers/cxl/core/extent.c
> +static int cxlr_rm_extent(struct device *dev, void *data)
> +{
> + struct region_extent *region_extent = to_region_extent(dev);
> + struct range *region_hpa_range = data;
> +
> + if (!region_extent)
> + return 0;
> +
> + /*
> + * Any extent which 'touches' the released range is removed.
> + */
Use single-line comment syntax here.
> + if (range_overlaps(region_hpa_range, &region_extent->hpa_range)) {
> + dev_dbg(dev, "Remove region extent HPA %pra\n",
> + &region_extent->hpa_range);
> + region_rm_extent(region_extent);
> + }
> + return 0;
> +}
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b3dd119d166a..de01c6684530 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -930,6 +930,60 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");
>
> +static int cxl_validate_extent(struct cxl_memdev_state *mds,
> + struct cxl_extent *extent)
> +{
> + struct cxl_dev_state *cxlds = &mds->cxlds;
> + struct device *dev = mds->cxlds.dev;
> + u64 start, length;
> +
> + start = le64_to_cpu(extent->start_dpa);
> + length = le64_to_cpu(extent->length);
Set these at declaration.
> +
> + struct range ext_range = (struct range){
> + .start = start,
> + .end = start + length - 1,
> + };
With the above set at declaration, this is then no longer a declaration
mid-code, which is still generally looked at in a funny way in the kernel!
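Roughly the shape I mean, as an untested userspace sketch. struct extent_sketch and extent_to_range() are hypothetical stand-ins, and the fields are assumed to be in host byte order already, so the le64_to_cpu() calls from the patch are left out.

```c
#include <stdint.h>

/* Local stand-in for the kernel's struct range; both ends are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for struct cxl_extent, already in host byte order. */
struct extent_sketch {
	uint64_t start_dpa;
	uint64_t length;
};

static struct range extent_to_range(const struct extent_sketch *extent)
{
	/* Every local is initialised where it is declared ... */
	uint64_t start = extent->start_dpa;
	uint64_t length = extent->length;
	struct range ext_range = {
		.start = start,
		.end = start + length - 1,
	};

	/* ... so no declaration appears after the first statement. */
	return ext_range;
}
```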
> +
> + if (le16_to_cpu(extent->shared_extn_seq) != 0) {
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU) can not be shared\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> + }
> +
> + if (!uuid_is_null((const uuid_t *)extent->uuid)) {
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU); tags not supported\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> + }
> +
> + /* Extents must be within the DC partition boundary */
> + for (int i = 0; i < cxlds->nr_partitions; i++) {
> + struct cxl_dpa_partition *part = &cxlds->part[i];
> +
> + if (part->mode != CXL_PARTMODE_DYNAMIC_RAM_A)
> + continue;
> +
> + struct range partition_range = (struct range) {
Maybe move the declaration up and just assign it here.
> + .start = part->res.start,
> + .end = part->res.end,
> + };
> +
> + if (range_contains(&partition_range, &ext_range)) {
> + dev_dbg(dev, "DC extent DPA %pra (DCR:%pra)(%pU)\n",
> + &ext_range, &partition_range, extent->uuid);
> + return 0;
> + }
> + }
> +
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU) is not in a valid DC partition\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> +}
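For the partition loop, a sketch of moving the declaration up and only assigning inside the loop. Again this is illustrative userspace C, not the patch: struct part_sketch, its dynamic flag (standing in for the CXL_PARTMODE_DYNAMIC_RAM_A check) and extent_in_dc_partition() are made-up names, and the logging is dropped.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's struct range; both ends are inclusive. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical partition description for this sketch. */
struct part_sketch {
	struct range res;
	bool dynamic;
};

static bool range_contains(const struct range *outer,
			   const struct range *inner)
{
	return outer->start <= inner->start && outer->end >= inner->end;
}

static int extent_in_dc_partition(const struct part_sketch *parts,
				  int nr_parts,
				  const struct range *ext_range)
{
	/* Declared once at the top of the function ... */
	struct range partition_range;

	for (int i = 0; i < nr_parts; i++) {
		if (!parts[i].dynamic)
			continue;

		/* ... and only assigned here, after the mode check. */
		partition_range = (struct range) {
			.start = parts[i].res.start,
			.end = parts[i].res.end,
		};

		if (range_contains(&partition_range, ext_range))
			return 0;
	}

	/* Not fully inside any dynamic partition. */
	return -ENXIO;
}
```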
> +/**
> + * struct cxled_extent - Extent within an endpoint decoder
> + * @cxled: Reference to the endpoint decoder
> + * @dpa_range: DPA range this extent covers within the decoder
> + * @uuid: uuid from device for this extent
> + */
> +struct cxled_extent {
> + struct cxl_endpoint_decoder *cxled;
> + struct range dpa_range;
> + uuid_t uuid;
> +};
> +/* See CXL 3.1 8.2.9.2.1.6 */
> +enum dc_event {
> + DCD_ADD_CAPACITY,
> + DCD_RELEASE_CAPACITY,
> + DCD_FORCED_CAPACITY_RELEASE,
> + DCD_REGION_CONFIGURATION_UPDATED,
Perhaps a comment here that the other values don't apply to the
normal mailbox interface (they are FM only).
Might avoid confusion.
> +};
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 34a606c5ead0..63a38e449454 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> +/*
> + * Add Dynamic Capacity Response
> + * CXL rev 3.1 section 8.2.9.9.9.3; Table 8-168 & Table 8-169
> + */
> +struct cxl_mbox_dc_response {
> + __le32 extent_list_size;
> + u8 flags;
> + u8 reserved[3];
> + struct updated_extent_list {
> + __le64 dpa_start;
> + __le64 length;
> + u8 reserved[8];
> + } __packed extent_list[];
A counted_by marking is always nice to have, and here it's extent_list_size I think
(which has an odd name given it is a count, not a size... *dramatic sigh*)
> +} __packed;
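To show what the annotation buys, here is a small userspace sketch using the compiler attribute directly. In the kernel this would be the __counted_by() helper (or an endian-aware variant if the tree has one, since the count is __le32). Assumptions: the COUNTED_BY macro, struct dc_response_sketch, dc_response_bytes() and dc_response_alloc() are hypothetical, the count is kept in host byte order, and __packed is dropped because the layout is naturally aligned here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute only where the compiler knows it; otherwise a no-op. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct updated_extent_sketch {
	uint64_t dpa_start;
	uint64_t length;
	uint8_t reserved[8];
};

struct dc_response_sketch {
	uint32_t extent_list_size;	/* number of entries, despite the name */
	uint8_t flags;
	uint8_t reserved[3];
	struct updated_extent_sketch extent_list[] COUNTED_BY(extent_list_size);
};

/* Bytes needed for a response carrying nr extents. */
static size_t dc_response_bytes(uint32_t nr)
{
	return sizeof(struct dc_response_sketch) +
	       (size_t)nr * sizeof(struct updated_extent_sketch);
}

static struct dc_response_sketch *dc_response_alloc(uint32_t nr)
{
	struct dc_response_sketch *resp = calloc(1, dc_response_bytes(nr));

	if (!resp)
		return NULL;

	/* Set the count before touching the array so bounds checks see it. */
	resp->extent_list_size = nr;
	return resp;
}
```

With the attribute in place, bounds sanitizers and fortified helpers can check extent_list[] accesses against extent_list_size instead of treating the array as unbounded.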