From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Ben Widawsky <bwidawsk@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ingo Molnar <mingo@redhat.com>, <linux-cxl@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 2/6] cxl/mbox: Add GET_POISON_LIST mailbox command
Date: Wed, 16 Nov 2022 12:41:18 +0000
Message-ID: <20221116124118.0000144b@Huawei.com>
In-Reply-To: <46c7c7339224744fce424b196da3e5566effec17.1668115235.git.alison.schofield@intel.com>

On Thu, 10 Nov 2022 19:12:40 -0800
alison.schofield@intel.com wrote:

> From: Alison Schofield <alison.schofield@intel.com>
> 
> CXL devices maintain a list of locations that are poisoned or result
> in poison if the addresses are accessed by the host.
> 
> Per the spec (CXL 3.0 8.2.9.8.4.1), the device returns this Poison
> list as a set of  Media Error Records that include the source of the
> error, the starting device physical address and length. The length is
> the number of adjacent DPAs in the record and is in units of 64 bytes.
> 
> Retrieve the list and log each Media Error Record as a trace event of
> type 'cxl_poison'.
> 
> When the poison list is requested by region, include the region name
> and uuid in the trace event.
> 
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Hi Alison,

I've forgotten most of the previous discussions around earlier versions of
this series, so I may well repeat things that were already covered!

A few things inline.

Thanks,

Jonathan


> ---
>  drivers/cxl/core/mbox.c | 81 +++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h    | 37 +++++++++++++++++++
>  2 files changed, 118 insertions(+)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 0c90f13870a4..88f034e97812 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -9,6 +9,9 @@
>  
>  #include "core.h"
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/cxl.h>
> +
>  static bool cxl_raw_allow_all;
>  
>  /**
> @@ -752,6 +755,7 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
>  {
>  	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
>  	struct cxl_mbox_identify id;
> +	__le32 val = 0;
>  	int rc;
>  
>  	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_IDENTIFY, NULL, 0, &id,
> @@ -771,6 +775,9 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
>  	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
>  	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
>  
> +	memcpy(&val, id.poison_list_max_mer, 3);

This is ugly.  I've lost track of the last discussion about get_unaligned_le24()
and using it on elements of a packed structure.  At the very least, can we
memcpy into a u8[3] array and then use get_unaligned_le24() on that, if
we can't use it directly on the structure element?
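
A minimal userspace sketch of the suggestion above, for illustration only.
le24_read() is a stand-in for the kernel's get_unaligned_le24(), and
struct identify_sketch, poison_max_from_identify() and POISON_LIST_MAX_SKETCH
are hypothetical names; the value 1024 is an assumed cap, since the real
CXL_POISON_LIST_MAX is not shown in this message. It assumes a compiler that
accepts __attribute__((packed)), such as GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cap for illustration; not taken from the patch. */
#define POISON_LIST_MAX_SKETCH 1024u

/* Stand-in for get_unaligned_le24(): assemble 3 little-endian bytes. */
static inline uint32_t le24_read(const void *p)
{
	const uint8_t *b = p;

	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) | ((uint32_t)b[2] << 16);
}

/* Hypothetical, trimmed-down stand-in for struct cxl_mbox_identify. */
struct identify_sketch {
	uint8_t pad;                    /* pushes the next field off alignment */
	uint8_t poison_list_max_mer[3]; /* 24-bit little-endian record count */
} __attribute__((packed));

/* The 24-bit field really does sit at an odd offset in this layout. */
static_assert(offsetof(struct identify_sketch, poison_list_max_mer) == 1,
	      "unexpected layout");

/* Byte-wise read straight from the packed member, then clamp to the cap. */
static uint32_t poison_max_from_identify(const struct identify_sketch *id)
{
	uint32_t max_mer = le24_read(id->poison_list_max_mer);

	return max_mer < POISON_LIST_MAX_SKETCH ? max_mer : POISON_LIST_MAX_SKETCH;
}
```

Because the helper only dereferences individual bytes, no __le32 temporary
or memcpy() is needed, and it does not depend on the field's alignment.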


> +	cxlds->poison_max = min_t(u32, le32_to_cpu(val), CXL_POISON_LIST_MAX);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_dev_state_identify, CXL);
> @@ -835,6 +842,79 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_create_range_info, CXL);
>  
> +int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> +		       struct cxl_region *cxlr)
> +{
> +	struct cxl_dev_state *cxlds = cxlmd->cxlds;
> +	const char *memdev_name = dev_name(&cxlmd->dev);
Could just do this where it's used rather than here.

> +	const char *pcidev_name = dev_name(cxlds->dev);
Same with this.

> +	struct cxl_mbox_poison_payload_out *po;
> +	struct cxl_mbox_poison_payload_in pi;
> +	int nr_records = 0;
> +	int rc;
> +
> +	po = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	if (!po)
> +		return -ENOMEM;
> +
> +	pi.offset = cpu_to_le64(offset);
> +	pi.length = cpu_to_le64(len);
> +
> +	rc = mutex_lock_interruptible(&cxlds->poison_list_mutex);
> +	if (rc)
> +		goto out;
> +
> +	do {
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_POISON, &pi,
> +				       sizeof(pi), po, cxlds->payload_size);
> +		if (rc)
> +			break;
> +
> +		if (trace_cxl_poison_enabled())
> +			cxl_trace_poison(po, cxlr, memdev_name, pcidev_name);
> +
> +		/* Protect against an uncleared _FLAG_MORE */
> +		nr_records = nr_records + le16_to_cpu(po->count);
> +		if (nr_records >= cxlds->poison_max) {
> +			dev_dbg(&cxlmd->dev, "Max Error Records reached: %d\n",
> +				nr_records);
> +			break;
> +		}
> +	} while (po->flags & CXL_POISON_FLAG_MORE);
> +
> +	mutex_unlock(&cxlds->poison_list_mutex);
> +out:
> +	kvfree(po);
> +	return rc;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL);
> +
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  {
>  	struct cxl_dev_state *cxlds;
> @@ -846,6 +926,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  	}
>  
>  	mutex_init(&cxlds->mbox_mutex);
> +	mutex_init(&cxlds->poison_list_mutex);
>  	cxlds->dev = dev;
>  
>  	return cxlds;
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 669868cc1553..49d891347e39 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -192,6 +192,8 @@ struct cxl_endpoint_dvsec_info {
>   *                (CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register)
>   * @lsa_size: Size of Label Storage Area
>   *                (CXL 2.0 8.2.9.5.1.1 Identify Memory Device)
> + * @poison_max: maximum media error records held in device cache
For consistency of capitalization with the other entries: Maximum
> + * @poison_list_mutex: Mutex to synchronize poison list retrieval
>   * @mbox_mutex: Mutex to synchronize mailbox access.
>   * @firmware_version: Firmware version for the memory device.
>   * @enabled_cmds: Hardware commands found enabled in CEL.
> @@ -224,6 +226,8 @@ struct cxl_dev_state {
>  

...



