From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
"Dan Williams" <dan.j.williams@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
<linux-cxl@vger.kernel.org>, <nvdimm@lists.linux.dev>,
<linux-kernel@vger.kernel.org>, Li Ming <ming.li@zohomail.com>
Subject: Re: [PATCH v9 12/19] cxl/extent: Process dynamic partition events and realize region extents
Date: Mon, 14 Apr 2025 17:07:52 +0100
Message-ID: <20250414170752.00002356@huawei.com>
In-Reply-To: <20250413-dcd-type2-upstream-v9-12-1d4911a0b365@intel.com>
On Sun, 13 Apr 2025 17:52:20 -0500
Ira Weiny <ira.weiny@intel.com> wrote:
> A dynamic capacity device (DCD) sends events to signal the host for
> changes in the availability of Dynamic Capacity (DC) memory. These
> events contain extents describing a DPA range and meta data for memory
> to be added or removed. Events may be sent from the device at any time.
>
> Three types of events can be signaled, Add, Release, and Force Release.
>
> On add, the host may accept or reject the memory being offered. If no
> region exists, or the extent is invalid, the extent should be rejected.
> Add extent events may be grouped by a 'more' bit which indicates those
> extents should be processed as a group.
>
> On remove, the host can delay the response until the host is safely not
> using the memory. If no region exists the release can be sent
> immediately. The host may also release extents (or partial extents) at
> any time. Thus the 'more' bit grouping of release events is of less
> value and can be ignored in favor of sending multiple release capacity
> responses for groups of release events.
>
> Force removal is intended as a mechanism between the FM and the device
> and intended only when the host is unresponsive, out of sync, or
> otherwise broken. Purposely ignore force removal events.
>
> Regions are made up of one or more devices which may be surfacing memory
> to the host. Once all devices in a region have surfaced an extent the
> region can expose a corresponding extent for the user to consume.
> Without interleaving a device extent forms a 1:1 relationship with the
> region extent. Immediately surface a region extent upon getting a
> device extent.
>
> Per the specification the device is allowed to offer or remove extents
> at any time. However, anticipated use cases can expect extents to be
> offered, accepted, and removed in well defined chunks.
>
> Simplify extent tracking with the following restrictions.
>
> 1) Flag for removal any extent which overlaps a requested
> release range.
> 2) Refuse the offer of extents which overlap already accepted
> memory ranges.
> 3) Accept again a range which has already been accepted by the
> host. Eating duplicates serves three purposes.
> 3a) This simplifies the code if the device should get out of
> sync with the host. And it should be safe to acknowledge
> the extent again.
> 3b) This simplifies the code to process existing extents if
> the extent list should change while the extent list is
> being read.
> 3c) Duplicates for a given partition which are seen during a
> race between the hardware surfacing an extent and the cxl
> dax driver scanning for existing extents will be ignored.
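To make restrictions 2) and 3) concrete, here is a rough userspace sketch of how an offered range could be classified against the already accepted ones. Everything prefixed `mock_` is hypothetical and not from the patch, and treating a duplicate as an exact start/end match is my assumption, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's inclusive struct range. */
struct mock_range {
	uint64_t start;
	uint64_t end;
};

enum mock_offer_result {
	MOCK_OFFER_ACCEPT,	/* new range, overlaps nothing */
	MOCK_OFFER_DUPLICATE,	/* already accepted: acknowledge again */
	MOCK_OFFER_REJECT,	/* overlaps an accepted range */
};

static bool mock_range_overlaps(const struct mock_range *a,
				const struct mock_range *b)
{
	return a->start <= b->end && a->end >= b->start;
}

/* Hypothetical helper applying restrictions 2) and 3) to one offer. */
static enum mock_offer_result
mock_classify_offer(const struct mock_range *accepted, size_t nr,
		    const struct mock_range *offer)
{
	assert(offer->start <= offer->end);

	for (size_t i = 0; i < nr; i++) {
		/* Assumption: a duplicate is an exact start/end match. */
		if (accepted[i].start == offer->start &&
		    accepted[i].end == offer->end)
			return MOCK_OFFER_DUPLICATE;
		if (mock_range_overlaps(&accepted[i], offer))
			return MOCK_OFFER_REJECT;
	}
	return MOCK_OFFER_ACCEPT;
}
```

The sketch relies on the accepted ranges never overlapping each other, which is what restriction 2) is meant to guarantee.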
>
> NOTE: Processing existing extents is done in a later patch.
>
> Management of the region extent devices must be synchronized with
> potential uses of the memory within the DAX layer. Create region extent
> devices as children of the cxl_dax_region device such that the DAX
> region driver can co-drive them and synchronize with the DAX layer.
> Synchronization and management is handled in a subsequent patch.
>
> Tag support within the DAX layer is not yet supported. To maintain
> compatibility with legacy DAX/region processing only tags with a value
> of 0 are allowed. This defines existing DAX devices as having a 0 tag
> which makes the most logical sense as a default.
>
> Process DCD events and create region devices.
>
> Based on an original patch by Navneet Singh.
>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Li Ming <ming.li@zohomail.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
I've forgotten what our policy is on spec references in new
code. Maybe update them to 3.2?
A few tiny little things inline from a fresh look.
Thanks,
Jonathan
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> new file mode 100644
> index 000000000000..6df277caf974
> --- /dev/null
> +++ b/drivers/cxl/core/extent.c
> +static int cxlr_rm_extent(struct device *dev, void *data)
> +{
> + struct region_extent *region_extent = to_region_extent(dev);
> + struct range *region_hpa_range = data;
> +
> + if (!region_extent)
> + return 0;
> +
> + /*
> + * Any extent which 'touches' the released range is removed.
> + */
Use single line comment syntax here.
> + if (range_overlaps(region_hpa_range, &region_extent->hpa_range)) {
> + dev_dbg(dev, "Remove region extent HPA %pra\n",
> + &region_extent->hpa_range);
> + region_rm_extent(region_extent);
> + }
> + return 0;
> +}
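For anyone reading along, a small userspace model of the 'touches' behaviour: any extent that overlaps the released range at all is flagged, even when only part of it is covered. The `mock_` names and the array walk are hypothetical stand-ins for the region extent devices and whatever iterates them; the overlap test is the same inclusive comparison the kernel's range_overlaps() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's inclusive struct range. */
struct mock_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for struct region_extent. */
struct mock_region_extent {
	struct mock_range hpa_range;
	bool removed;
};

/* Same inclusive comparison as the kernel's range_overlaps(). */
static bool mock_range_overlaps(const struct mock_range *a,
				const struct mock_range *b)
{
	return a->start <= b->end && a->end >= b->start;
}

/* Flag every extent that touches the released range; return how many. */
static size_t mock_release_range(struct mock_region_extent *ext, size_t nr,
				 const struct mock_range *released)
{
	size_t count = 0;

	assert(released->start <= released->end);

	for (size_t i = 0; i < nr; i++) {
		if (ext[i].removed)
			continue;
		/* Partial overlap is enough: the whole extent goes. */
		if (mock_range_overlaps(released, &ext[i].hpa_range)) {
			ext[i].removed = true;
			count++;
		}
	}
	return count;
}
```

Releasing a range that straddles two extents removes both of them in full, which is the simplification restriction 1) in the commit message describes.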
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b3dd119d166a..de01c6684530 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -930,6 +930,60 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> }
> EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, "CXL");
>
> +static int cxl_validate_extent(struct cxl_memdev_state *mds,
> + struct cxl_extent *extent)
> +{
> + struct cxl_dev_state *cxlds = &mds->cxlds;
> + struct device *dev = mds->cxlds.dev;
> + u64 start, length;
> +
> + start = le64_to_cpu(extent->start_dpa);
> + length = le64_to_cpu(extent->length);
Set these at declaration.
> +
> + struct range ext_range = (struct range){
> + .start = start,
> + .end = start + length - 1,
> + };
With the above set at declaration, this is then no longer a declaration
mid code, which is still generally looked at in a funny way in the kernel!
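Roughly what I have in mind, as a self-contained userspace sketch rather than the real function. `struct mock_extent`, `mock_le64_to_cpu()` and `mock_extent_range()` are hypothetical stand-ins (the byte-array helper does not have the signature of the kernel's le64_to_cpu()), and the zero-length assert is an assumption of the sketch, not something the patch checks here.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's inclusive struct range. */
struct mock_range {
	uint64_t start;
	uint64_t end;
};

/* Stand-in for struct cxl_extent: little-endian fields as raw bytes. */
struct mock_extent {
	uint8_t start_dpa[8];
	uint8_t length[8];
};

/* Portable stand-in for le64_to_cpu(); takes bytes, unlike the kernel helper. */
static uint64_t mock_le64_to_cpu(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

static struct mock_range mock_extent_range(const struct mock_extent *extent)
{
	/* All locals initialized at declaration: nothing is declared mid code. */
	uint64_t start = mock_le64_to_cpu(extent->start_dpa);
	uint64_t length = mock_le64_to_cpu(extent->length);
	struct mock_range ext_range = {
		.start = start,
		.end = start + length - 1,
	};

	/* Assumption for this sketch only: length is never zero. */
	assert(length != 0);
	return ext_range;
}
```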
> +
> + if (le16_to_cpu(extent->shared_extn_seq) != 0) {
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU) can not be shared\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> + }
> +
> + if (!uuid_is_null((const uuid_t *)extent->uuid)) {
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU); tags not supported\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> + }
> +
> + /* Extents must be within the DC partition boundary */
> + for (int i = 0; i < cxlds->nr_partitions; i++) {
> + struct cxl_dpa_partition *part = &cxlds->part[i];
> +
> + if (part->mode != CXL_PARTMODE_DYNAMIC_RAM_A)
> + continue;
> +
> + struct range partition_range = (struct range) {
Maybe move the declaration up and just assign it here.
> + .start = part->res.start,
> + .end = part->res.end,
> + };
> +
> + if (range_contains(&partition_range, &ext_range)) {
> + dev_dbg(dev, "DC extent DPA %pra (DCR:%pra)(%pU)\n",
> + &ext_range, &partition_range, extent->uuid);
> + return 0;
> + }
> + }
> +
> + dev_err_ratelimited(dev,
> + "DC extent DPA %pra (%pU) is not in a valid DC partition\n",
> + &ext_range, extent->uuid);
> + return -ENXIO;
> +}
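On the partition_range suggestion above, a sketch of the loop with the declaration hoisted to the top and a plain assignment inside the loop body. This is a userspace model: the `mock_` types stand in for struct range and struct cxl_dpa_partition, and the containment test is the same comparison the kernel's range_contains() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's inclusive struct range. */
struct mock_range {
	uint64_t start;
	uint64_t end;
};

enum mock_partmode {
	MOCK_PARTMODE_RAM,
	MOCK_PARTMODE_DYNAMIC_RAM_A,
};

/* Hypothetical stand-in for struct cxl_dpa_partition. */
struct mock_partition {
	struct mock_range res;
	enum mock_partmode mode;
};

/* Same comparison as the kernel's range_contains(): r1 fully covers r2. */
static bool mock_range_contains(const struct mock_range *r1,
				const struct mock_range *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/* True if ext_range lies entirely inside some dynamic partition. */
static bool mock_extent_in_dc_partition(const struct mock_partition *part,
					size_t nr,
					const struct mock_range *ext_range)
{
	struct mock_range partition_range;	/* declared up front */

	assert(ext_range->start <= ext_range->end);

	for (size_t i = 0; i < nr; i++) {
		if (part[i].mode != MOCK_PARTMODE_DYNAMIC_RAM_A)
			continue;

		/* Assignment only, so no declaration follows a statement. */
		partition_range = (struct mock_range) {
			.start = part[i].res.start,
			.end = part[i].res.end,
		};

		if (mock_range_contains(&partition_range, ext_range))
			return true;
	}
	return false;
}
```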
> +/**
> + * struct cxled_extent - Extent within an endpoint decoder
> + * @cxled: Reference to the endpoint decoder
> + * @dpa_range: DPA range this extent covers within the decoder
> + * @uuid: uuid from device for this extent
> + */
> +struct cxled_extent {
> + struct cxl_endpoint_decoder *cxled;
> + struct range dpa_range;
> + uuid_t uuid;
> +};
> +/* See CXL 3.1 8.2.9.2.1.6 */
> +enum dc_event {
> + DCD_ADD_CAPACITY,
> + DCD_RELEASE_CAPACITY,
> + DCD_FORCED_CAPACITY_RELEASE,
> + DCD_REGION_CONFIGURATION_UPDATED,
Perhaps a comment here that the other values don't apply to the
normal mailbox interface (they are FM only).
Might avoid confusion.
> +};
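Something along these lines would do. The comment wording is only a suggestion, and I have deliberately not listed the FM-only values here.

```c
#include <assert.h>

/* See CXL 3.1 8.2.9.2.1.6 */
enum dc_event {
	DCD_ADD_CAPACITY,
	DCD_RELEASE_CAPACITY,
	DCD_FORCED_CAPACITY_RELEASE,
	DCD_REGION_CONFIGURATION_UPDATED,
	/*
	 * The remaining event types in the spec are FM only and do not
	 * apply to the normal mailbox interface, so they are not listed.
	 */
};
```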
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 34a606c5ead0..63a38e449454 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> +/*
> + * Add Dynamic Capacity Response
> + * CXL rev 3.1 section 8.2.9.9.9.3; Table 8-168 & Table 8-169
> + */
> +struct cxl_mbox_dc_response {
> + __le32 extent_list_size;
> + u8 flags;
> + u8 reserved[3];
> + struct updated_extent_list {
> + __le64 dpa_start;
> + __le64 length;
> + u8 reserved[8];
> + } __packed extent_list[];
A counted_by marking is always nice to have, and here it's the extent_list_size I think
(which has an odd name given it is a count, not a size... *dramatic sigh*)
> +} __packed;
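A userspace sketch of the annotation, to show where it would sit. `mock_counted_by()` is a hypothetical macro that falls back to nothing on compilers without the attribute, and the count is a host-order uint32_t here rather than the __le32 in the patch. Because the real field is little-endian, I believe the kernel's __counted_by_le() variant is the one that would apply, but that is worth checking rather than taking from me.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Fallback so the sketch builds on compilers without the attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(counted_by)
#define mock_counted_by(member) __attribute__((counted_by(member)))
#else
#define mock_counted_by(member)
#endif

/* Userspace model of struct cxl_mbox_dc_response. */
struct mock_dc_response {
	uint32_t extent_list_size;	/* host order here; __le32 in the patch */
	uint8_t flags;
	uint8_t reserved[3];
	struct mock_updated_extent {
		uint64_t dpa_start;
		uint64_t length;
		uint8_t reserved[8];
	} __attribute__((packed)) extent_list[] mock_counted_by(extent_list_size);
} __attribute__((packed));

/* Hypothetical allocator: room for nr_extents entries, count set first. */
static struct mock_dc_response *mock_dc_response_alloc(uint32_t nr_extents)
{
	struct mock_dc_response *resp;

	resp = calloc(1, sizeof(*resp) +
			 (size_t)nr_extents * sizeof(resp->extent_list[0]));
	if (!resp)
		return NULL;

	/* Set the count before touching the array so bounds checks see it. */
	resp->extent_list_size = nr_extents;
	return resp;
}
```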