linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>, Fan Ni <fan.ni@samsung.com>,
	"Navneet Singh" <navneet.singh@intel.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"Alison Schofield" <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	<linux-btrfs@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
	<linux-doc@vger.kernel.org>, <nvdimm@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 21/28] cxl/extent: Process DCD events and realize region extents
Date: Thu, 10 Oct 2024 15:58:21 +0100	[thread overview]
Message-ID: <20241010155821.00005079@Huawei.com> (raw)
In-Reply-To: <20241007-dcd-type2-upstream-v4-21-c261ee6eeded@intel.com>

On Mon, 07 Oct 2024 18:16:27 -0500
ira.weiny@intel.com wrote:

> From: Navneet Singh <navneet.singh@intel.com>
> 
> A dynamic capacity device (DCD) sends events to signal the host for
> changes in the availability of Dynamic Capacity (DC) memory.  These
> events contain extents describing a DPA range and meta data for memory
> to be added or removed.  Events may be sent from the device at any time.
> 
> Three types of events can be signaled, Add, Release, and Force Release.
> 
> On add, the host may accept or reject the memory being offered.  If no
> region exists, or the extent is invalid, the extent should be rejected.
> Add extent events may be grouped by a 'more' bit which indicates those
> extents should be processed as a group.
> 
> On remove, the host can delay the response until the host is safely not
> using the memory.  If no region exists the release can be sent
> immediately.  The host may also release extents (or partial extents) at
> any time.  Thus the 'more' bit grouping of release events is of less
> value and can be ignored in favor of sending multiple release capacity
> responses for groups of release events.

True today - I think that would be an error for shared extents
though as they need to be released in one go.  We can deal with
that when it matters.  


Mind you patch seems to try to handle more bit anyway, so maybe just
remove that discussion from this description?
> 
> Simplify extent tracking with the following restrictions.
> 
> 	1) Flag for removal any extent which overlaps a requested
> 	   release range.
> 	2) Refuse the offer of extents which overlap already accepted
> 	   memory ranges.
> 	3) Accept again a range which has already been accepted by the
> 	   host.  Eating duplicates serves three purposes.  First, this
> 	   simplifies the code if the device should get out of sync with
> 	   the host. 

Maybe scream about this a little.  AFAIK that happening is a device
bug.

> And it should be safe to acknowledge the extent
> 	   again.  Second, this simplifies the code to process existing
> 	   extents if the extent list should change while the extent
> 	   list is being read.  Third, duplicates for a given region
> 	   which are seen during a race between the hardware surfacing
> 	   an extent and the cxl dax driver scanning for existing
> 	   extents will be ignored.

This last one is a good justification.

> 
> 	   NOTE: Processing existing extents is done in a later patch.
> 
> Management of the region extent devices must be synchronized with
> potential uses of the memory within the DAX layer.  Create region extent
> devices as children of the cxl_dax_region device such that the DAX
> region driver can co-drive them and synchronize with the DAX layer.
> Synchronization and management is handled in a subsequent patch.
> 
> Tag support within the DAX layer is not yet supported.  To maintain
> compatibility legacy DAX/region processing only tags with a value of 0
> are allowed.  This defines existing DAX devices as having a 0 tag which
> makes the most logical sense as a default.
> 
> Process DCD events and create region devices.
> 
> Signed-off-by: Navneet Singh <navneet.singh@intel.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
A couple of minor comments from me.

J
> diff --git a/drivers/cxl/core/extent.c b/drivers/cxl/core/extent.c
> new file mode 100644
> index 000000000000..69a7614ba6a9
> --- /dev/null
> +++ b/drivers/cxl/core/extent.c


> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 584d7d282a97..d66beec687a0 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -889,6 +889,58 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)


> @@ -1017,6 +1069,223 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>  	return rc;
>  }
>  
> +static int cxl_send_dc_response(struct cxl_memdev_state *mds, int opcode,
> +				struct xarray *extent_array, int cnt)
> +{
> +	struct cxl_mailbox *cxl_mbox = &mds->cxlds.cxl_mbox;
> +	struct cxl_mbox_dc_response *p;
> +	struct cxl_mbox_cmd mbox_cmd;
> +	struct cxl_extent *extent;
> +	unsigned long index;
> +	u32 pl_index;
> +	int rc;
> +
> +	size_t pl_size = struct_size(p, extent_list, cnt);
> +	u32 max_extents = cnt;
> +
> +	/* May have to use more bit on response. */

I thought you argued in the patch description that it didn't matter if you
didn't set it?

> +	if (pl_size > cxl_mbox->payload_size) {
> +		max_extents = (cxl_mbox->payload_size - sizeof(*p)) /
> +			      sizeof(struct updated_extent_list);
> +		pl_size = struct_size(p, extent_list, max_extents);
> +	}
> +
> +	struct cxl_mbox_dc_response *response __free(kfree) =
> +						kzalloc(pl_size, GFP_KERNEL);
> +	if (!response)
> +		return -ENOMEM;
> +
> +	pl_index = 0;
> +	xa_for_each(extent_array, index, extent) {
> +
> +		response->extent_list[pl_index].dpa_start = extent->start_dpa;
> +		response->extent_list[pl_index].length = extent->length;
> +		pl_index++;
> +		response->extent_list_size = cpu_to_le32(pl_index);
> +
> +		if (pl_index == max_extents) {
> +			mbox_cmd = (struct cxl_mbox_cmd) {
> +				.opcode = opcode,
> +				.size_in = struct_size(response, extent_list,
> +						       pl_index),
> +				.payload_in = response,
> +			};
> +
> +			response->flags = 0;
> +			if (pl_index < cnt)
> +				response->flags &= CXL_DCD_EVENT_MORE;
Covered in other branch of thread.

> +
> +			rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
> +			if (rc)
> +				return rc;
> +			pl_index = 0;
> +		}
> +	}
> +
> +	if (cnt == 0 || pl_index) {
> +		mbox_cmd = (struct cxl_mbox_cmd) {
> +			.opcode = opcode,
> +			.size_in = struct_size(response, extent_list,
> +					       pl_index),
> +			.payload_in = response,
> +		};
> +
> +		response->flags = 0;
> +		rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}




> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index cbaacbe0f36d..b75653e9bc32 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h


>  
> +/* See CXL 3.0 8.2.9.2.1.5 */

Maybe update to 3.1? Otherwise patch reviewer needs to open two 
spec versions!  In 3.1 it is 8.2.9.2.1.6

> +enum dc_event {
> +	DCD_ADD_CAPACITY,
> +	DCD_RELEASE_CAPACITY,
> +	DCD_FORCED_CAPACITY_RELEASE,
> +	DCD_REGION_CONFIGURATION_UPDATED,
> +};


  parent reply	other threads:[~2024-10-10 14:58 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-07 23:16 [PATCH v4 00/28] DCD: Add support for Dynamic Capacity Devices (DCD) Ira Weiny
2024-10-07 23:16 ` [PATCH v4 01/28] test printk: Add very basic struct resource tests Ira Weiny
2024-10-08 16:35   ` Andy Shevchenko
2024-10-09 12:24   ` Jonathan Cameron
2024-10-09 17:09   ` Fan Ni
2024-10-10 14:59   ` Petr Mladek
2024-10-11 14:49     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 02/28] printk: Add print format (%pra) for struct range Ira Weiny
2024-10-08 16:56   ` Andy Shevchenko
2024-10-09 12:27     ` Jonathan Cameron
2024-10-09 14:42       ` Andy Shevchenko
2024-10-09 13:30   ` Rasmus Villemoes
2024-10-09 14:41     ` Andy Shevchenko
2024-10-14  0:08       ` Ira Weiny
2024-10-11 16:54     ` Ira Weiny
2024-10-09 17:33   ` Fan Ni
2024-10-11  2:09   ` Bagas Sanjaya
2024-10-17 20:57     ` Ira Weiny
2024-10-25 12:42       ` Bagas Sanjaya
2024-10-07 23:16 ` [PATCH v4 03/28] cxl/cdat: Use %pra for dpa range outputs Ira Weiny
2024-10-09 12:33   ` Jonathan Cameron
2024-10-09 17:34   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 04/28] range: Add range_overlaps() Ira Weiny
2024-10-08 16:10   ` David Sterba
2024-10-09 14:45     ` Andy Shevchenko
2024-10-09 14:46       ` Andy Shevchenko
2024-10-14  0:12         ` Ira Weiny
2024-10-09 15:36       ` David Sterba
2024-10-09 16:04         ` Andy Shevchenko
2024-10-10 15:24     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 05/28] dax: Document dax dev range tuple Ira Weiny
2024-10-09 12:42   ` Jonathan Cameron
2024-10-11 20:40     ` Ira Weiny
2024-10-16 15:48       ` Jonathan Cameron
2024-10-07 23:16 ` [PATCH v4 06/28] cxl/pci: Delay event buffer allocation Ira Weiny
2024-10-09 17:47   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 07/28] cxl/mbox: Flag support for Dynamic Capacity Devices (DCD) ira.weiny
2024-10-07 23:16 ` [PATCH v4 08/28] cxl/mem: Read dynamic capacity configuration from the device ira.weiny
2024-10-09 12:49   ` Jonathan Cameron
2024-10-14  0:05     ` Ira Weiny
2024-10-16 15:54       ` Jonathan Cameron
2024-10-16 16:59         ` Kees Cook
2024-10-07 23:16 ` [PATCH v4 09/28] cxl/core: Separate region mode from decoder mode ira.weiny
2024-10-09 12:51   ` Jonathan Cameron
2024-10-09 18:06   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 10/28] cxl/region: Add dynamic capacity decoder and region modes ira.weiny
2024-10-07 23:16 ` [PATCH v4 11/28] cxl/hdm: Add dynamic capacity size support to endpoint decoders ira.weiny
2024-10-10 12:45   ` Jonathan Cameron
2024-10-07 23:16 ` [PATCH v4 12/28] cxl/cdat: Gather DSMAS data for DCD regions Ira Weiny
2024-10-09 14:42   ` Rafael J. Wysocki
2024-10-11 20:38     ` Ira Weiny
2024-10-14 20:52       ` Wysocki, Rafael J
2024-10-09 18:16   ` Fan Ni
2024-10-14  1:16     ` Ira Weiny
2024-10-10 12:51   ` Jonathan Cameron
2024-10-07 23:16 ` [PATCH v4 13/28] cxl/mem: Expose DCD partition capabilities in sysfs ira.weiny
2024-10-09 20:46   ` Fan Ni
2024-10-14  1:34     ` Ira Weiny
2024-10-10 13:04   ` Jonathan Cameron
2024-10-16 21:34     ` Ira Weiny
2024-10-11  2:15   ` Bagas Sanjaya
2024-10-07 23:16 ` [PATCH v4 14/28] cxl/port: Add endpoint decoder DC mode support to sysfs ira.weiny
2024-10-10 13:14   ` Jonathan Cameron
2024-10-17 17:51     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 15/28] cxl/region: Refactor common create region code Ira Weiny
2024-10-10 13:18   ` Jonathan Cameron
2024-10-17 20:29     ` Ira Weiny
2024-10-10 16:27   ` Fan Ni
2024-10-24  2:17   ` Alison Schofield
2024-10-07 23:16 ` [PATCH v4 16/28] cxl/region: Add sparse DAX region support ira.weiny
2024-10-10 13:46   ` Jonathan Cameron
2024-10-10 17:41   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 17/28] cxl/events: Split event msgnum configuration from irq setup Ira Weiny
2024-10-10 13:49   ` Jonathan Cameron
2024-10-10 17:58   ` Fan Ni
2024-10-24  2:33     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 18/28] cxl/pci: Factor out interrupt policy check Ira Weiny
2024-10-10 18:07   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 19/28] cxl/mem: Configure dynamic capacity interrupts ira.weiny
2024-10-10 14:15   ` Jonathan Cameron
2024-10-10 18:25   ` Fan Ni
2024-10-24  3:09     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 20/28] cxl/core: Return endpoint decoder information from region search Ira Weiny
2024-10-10 14:21   ` Jonathan Cameron
2024-10-10 18:29   ` Fan Ni
2024-10-24  2:30   ` Alison Schofield
2024-10-07 23:16 ` [PATCH v4 21/28] cxl/extent: Process DCD events and realize region extents ira.weiny
2024-10-09  1:56   ` Li, Ming4
2024-10-09 19:49     ` Ira Weiny
2024-10-10  3:06       ` Li, Ming4
2024-10-14  2:05         ` Ira Weiny
2024-10-10 14:50       ` Jonathan Cameron
2024-10-11 19:14         ` Fan Ni
2024-10-17 21:15         ` Ira Weiny
2024-10-18  9:03           ` Jonathan Cameron
2024-10-21 14:04             ` Ira Weiny
2024-10-21 14:47               ` Jonathan Cameron
2024-10-10 14:58   ` Jonathan Cameron [this message]
2024-10-17 21:39     ` Ira Weiny
2024-10-18  9:09       ` Jonathan Cameron
2024-10-21 18:45         ` Ira Weiny
2024-10-22 17:01           ` Jonathan Cameron
2024-10-07 23:16 ` [PATCH v4 22/28] cxl/region/extent: Expose region extent information in sysfs ira.weiny
2024-10-10 15:01   ` Jonathan Cameron
2024-10-18 18:26     ` Ira Weiny
2024-10-21  9:37       ` Jonathan Cameron
2024-10-14 16:08   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 23/28] dax/bus: Factor out dev dax resize logic Ira Weiny
2024-10-10 15:06   ` Jonathan Cameron
2024-10-21 21:16     ` Ira Weiny
2024-10-14 16:56   ` Fan Ni
2024-10-07 23:16 ` [PATCH v4 24/28] dax/region: Create resources on sparse DAX regions ira.weiny
2024-10-10 15:27   ` Jonathan Cameron
2024-10-23  1:20     ` Ira Weiny
2024-10-23 11:22       ` Jonathan Cameron
2024-10-24  3:50         ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 25/28] cxl/region: Read existing extents on region creation ira.weiny
2024-10-10 15:33   ` Jonathan Cameron
2024-10-24  1:41     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 26/28] cxl/mem: Trace Dynamic capacity Event Record ira.weiny
2024-10-10 15:41   ` Jonathan Cameron
2024-10-24  1:52     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 27/28] tools/testing/cxl: Make event logs dynamic Ira Weiny
2024-10-10 15:49   ` Jonathan Cameron
2024-10-24  1:59     ` Ira Weiny
2024-10-07 23:16 ` [PATCH v4 28/28] tools/testing/cxl: Add DC Regions to mock mem data Ira Weiny
2024-10-10 15:58   ` Jonathan Cameron
2024-10-24  2:23     ` Ira Weiny
2024-10-08 22:57 ` [PATCH v4 00/28] DCD: Add support for Dynamic Capacity Devices (DCD) Fan Ni
2024-10-08 23:06   ` Fan Ni
2024-10-10 15:30     ` Ira Weiny
2024-10-10 15:31     ` Ira Weiny
2024-10-21 16:47 ` Fan Ni
2024-10-22 17:05   ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241010155821.00005079@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=alison.schofield@intel.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=fan.ni@samsung.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=navneet.singh@intel.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).