From: "Li, Ming4" <ming4.li@intel.com>
To: <ira.weiny@intel.com>, Dave Jiang <dave.jiang@intel.com>,
Fan Ni <fan.ni@samsung.com>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Navneet Singh <navneet.singh@intel.com>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>,
Davidlohr Bueso <dave@stgolabs.net>,
Alison Schofield <alison.schofield@intel.com>,
"Vishal Verma" <vishal.l.verma@intel.com>,
<linux-btrfs@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-doc@vger.kernel.org>, <nvdimm@lists.linux.dev>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 21/28] cxl/extent: Process DCD events and realize region extents
Date: Wed, 9 Oct 2024 09:56:06 +0800
Message-ID: <4337ddd9-312b-4fb7-9597-81e8b00d57cb@intel.com>
In-Reply-To: <20241007-dcd-type2-upstream-v4-21-c261ee6eeded@intel.com>

On 10/8/2024 7:16 AM, ira.weiny@intel.com wrote:
> From: Navneet Singh <navneet.singh@intel.com>
>
> A dynamic capacity device (DCD) sends events to signal the host for
> changes in the availability of Dynamic Capacity (DC) memory. These
> events contain extents describing a DPA range and meta data for memory
> to be added or removed. Events may be sent from the device at any time.
>
> Three types of events can be signaled, Add, Release, and Force Release.
>
> On add, the host may accept or reject the memory being offered. If no
> region exists, or the extent is invalid, the extent should be rejected.
> Add extent events may be grouped by a 'more' bit which indicates those
> extents should be processed as a group.
>
> On remove, the host can delay the response until the host is safely not
> using the memory. If no region exists the release can be sent
> immediately. The host may also release extents (or partial extents) at
> any time. Thus the 'more' bit grouping of release events is of less
> value and can be ignored in favor of sending multiple release capacity
> responses for groups of release events.
>
> Force removal is intended as a mechanism between the FM and the device
> and intended only when the host is unresponsive, out of sync, or
> otherwise broken. Purposely ignore force removal events.
>
> Regions are made up of one or more devices which may be surfacing memory
> to the host. Once all devices in a region have surfaced an extent the
> region can expose a corresponding extent for the user to consume.
> Without interleaving a device extent forms a 1:1 relationship with the
> region extent. Immediately surface a region extent upon getting a
> device extent.
>
> Per the specification the device is allowed to offer or remove extents
> at any time. However, anticipated use cases can expect extents to be
> offered, accepted, and removed in well defined chunks.
>
> Simplify extent tracking with the following restrictions.
>
> 1) Flag for removal any extent which overlaps a requested
> release range.
> 2) Refuse the offer of extents which overlap already accepted
> memory ranges.
> 3) Accept again a range which has already been accepted by the
> host. Eating duplicates serves three purposes. First, this
> simplifies the code if the device should get out of sync with
> the host. And it should be safe to acknowledge the extent
> again. Second, this simplifies the code to process existing
> extents if the extent list should change while the extent
> list is being read. Third, duplicates for a given region
> which are seen during a race between the hardware surfacing
> an extent and the cxl dax driver scanning for existing
> extents will be ignored.
>
> NOTE: Processing existing extents is done in a later patch.
>
> Management of the region extent devices must be synchronized with
> potential uses of the memory within the DAX layer. Create region extent
> devices as children of the cxl_dax_region device such that the DAX
> region driver can co-drive them and synchronize with the DAX layer.
> Synchronization and management is handled in a subsequent patch.
>
> Tag support within the DAX layer is not yet supported. To maintain
> compatibility legacy DAX/region processing only tags with a value of 0
> are allowed. This defines existing DAX devices as having a 0 tag which
> makes the most logical sense as a default.
>
> Process DCD events and create region devices.
>
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>
Hi Ira,

I think my comments on v3 were missed, so I am repeating them for this patch.
> +static bool extents_contain(struct cxl_dax_region *cxlr_dax,
> + struct cxl_endpoint_decoder *cxled,
> + struct range *new_range)
> +{
> + struct device *extent_device;
> + struct match_data md = {
> + .cxled = cxled,
> + .new_range = new_range,
> + };
> +
> + extent_device = device_find_child(&cxlr_dax->dev, &md, match_contains);
> + if (!extent_device)
> + return false;
> +
> + put_device(extent_device);
Declaring extent_device with __free(put_device) would make this explicit 'put_device(extent_device)' unnecessary.
> + return true;
> +}
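To illustrate the scope-based cleanup idea, here is a minimal userspace sketch. It assumes GCC or Clang (for the cleanup attribute that the kernel's __free() helper is built on); mock_device, mock_find_child(), auto_put_device and extents_contain_sketch() are hypothetical stand-ins, not the CXL code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device: just a reference count. */
struct mock_device {
	int refcount;
};

/* The cleanup handler receives a pointer to the variable leaving scope. */
static void mock_put_device_cleanup(struct mock_device **devp)
{
	if (*devp)
		(*devp)->refcount--;
}

/* Plays the role of __free(put_device) in this sketch. */
#define auto_put_device __attribute__((cleanup(mock_put_device_cleanup)))

/* Stand-in for device_find_child(): returns a referenced device or NULL. */
static struct mock_device *mock_find_child(struct mock_device *child, bool match)
{
	if (!match)
		return NULL;
	child->refcount++;
	return child;
}

static bool extents_contain_sketch(struct mock_device *child, bool match)
{
	struct mock_device *extent_device auto_put_device =
		mock_find_child(child, match);

	/* The reference is dropped automatically on return. */
	return extent_device != NULL;
}
```

In the patch, the equivalent would presumably be to declare extent_device with __free(put_device) at the point where device_find_child() is called and return 'extent_device != NULL'.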
[...]
> +static bool extents_overlap(struct cxl_dax_region *cxlr_dax,
> + struct cxl_endpoint_decoder *cxled,
> + struct range *new_range)
> +{
> + struct device *extent_device;
> + struct match_data md = {
> + .cxled = cxled,
> + .new_range = new_range,
> + };
> +
> + extent_device = device_find_child(&cxlr_dax->dev, &md, match_overlaps);
> + if (!extent_device)
> + return false;
> +
> + put_device(extent_device);
Same as above: __free(put_device) would work here too.
> + return true;
> +}
> +
[...]
> +static int cxl_send_dc_response(struct cxl_memdev_state *mds, int opcode,
> + struct xarray *extent_array, int cnt)
> +{
> + struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
> + struct cxl_mbox_dc_response *p;
> + struct cxl_mbox_cmd mbox_cmd;
> + struct cxl_extent *extent;
> + unsigned long index;
> + u32 pl_index;
> + int rc;
> +
> + size_t pl_size = struct_size(p, extent_list, cnt);
> + u32 max_extents = cnt;
> +
> + /* May have to use more bit on response. */
> + if (pl_size > cxl_mbox->payload_size) {
> + max_extents = (cxl_mbox->payload_size - sizeof(*p)) /
> + sizeof(struct updated_extent_list);
> + pl_size = struct_size(p, extent_list, max_extents);
> + }
> +
> + struct cxl_mbox_dc_response *response __free(kfree) =
> + kzalloc(pl_size, GFP_KERNEL);
> + if (!response)
> + return -ENOMEM;
> +
> + pl_index = 0;
> + xa_for_each(extent_array, index, extent) {
> +
> + response->extent_list[pl_index].dpa_start = extent->start_dpa;
> + response->extent_list[pl_index].length = extent->length;
> + pl_index++;
> + response->extent_list_size = cpu_to_le32(pl_index);
> +
> + if (pl_index == max_extents) {
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = opcode,
> + .size_in = struct_size(response, extent_list,
> + pl_index),
> + .payload_in = response,
> + };
> +
> + response->flags = 0;
> + if (pl_index < cnt)
> + response->flags &= CXL_DCD_EVENT_MORE;
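This should be 'response->flags |= CXL_DCD_EVENT_MORE'. Using '&=' on a flags field that was just zeroed never sets the bit.
Another issue: if 'cnt' is an exact multiple of 'max_extents' (e.g. cnt=20, max_extents=10), all responses are sent inside this xa_for_each() loop. Because pl_index is reset for each chunk, 'pl_index < cnt' is true for every full chunk, so CXL_DCD_EVENT_MORE would also be set in the last response, where it should not be set.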
> +
> + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
> + if (rc)
> + return rc;
> + pl_index = 0;
> + }
> + }
> +
> + if (cnt == 0 || pl_index) {
> + mbox_cmd = (struct cxl_mbox_cmd) {
> + .opcode = opcode,
> + .size_in = struct_size(response, extent_list,
> + pl_index),
> + .payload_in = response,
> + };
> +
> + response->flags = 0;
> + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
> + if (rc)
> + return rc;
> + }
> +
> + return 0;
> +}
> +
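One possible way to handle both points is to track the total number of extents already sent and set the more bit only while extents remain. The following is a minimal userspace sketch of that chunking logic, not the driver code: send_msg(), struct msg_log, MAX_MSGS and the DCD_EVENT_MORE value are hypothetical, and max_extents is assumed to be at least 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCD_EVENT_MORE 0x01	/* hypothetical value, for illustration only */
#define MAX_MSGS 16

struct sent_msg {
	uint32_t nr_extents;
	uint8_t flags;
};

/* Stand-in for the mailbox: records what would have been sent. */
struct msg_log {
	struct sent_msg msgs[MAX_MSGS];
	size_t nr;
};

static int send_msg(struct msg_log *log, uint32_t nr_extents, uint8_t flags)
{
	if (log->nr == MAX_MSGS)
		return -1;
	log->msgs[log->nr++] = (struct sent_msg){ nr_extents, flags };
	return 0;
}

static int send_dc_response(struct msg_log *log, uint32_t cnt,
			    uint32_t max_extents)
{
	uint32_t sent = 0;	/* extents sent in earlier messages */
	uint32_t pl_index = 0;	/* extents queued in the current message */
	uint32_t i;

	for (i = 0; i < cnt; i++) {
		pl_index++;
		if (pl_index == max_extents) {
			uint8_t flags = 0;

			sent += pl_index;
			/* Set the more bit only if extents remain. */
			if (sent < cnt)
				flags |= DCD_EVENT_MORE;
			if (send_msg(log, pl_index, flags))
				return -1;
			pl_index = 0;
		}
	}

	/* Partial final chunk, or an empty response when cnt is 0. */
	if (cnt == 0 || pl_index)
		return send_msg(log, pl_index, 0);

	return 0;
}
```

With cnt=20 and max_extents=10 this sends two messages and sets the more bit only on the first one.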