Linux CXL
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Alison Schofield <alison.schofield@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	Ben Widawsky <bwidawsk@kernel.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	<linux-cxl@vger.kernel.org>,
	Jonathan Cameron <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v12 1/6] cxl/mbox: Add GET_POISON_LIST mailbox command
Date: Tue, 11 Apr 2023 22:18:40 -0700	[thread overview]
Message-ID: <64363f30a6370_417e294d0@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <ZDY3dR0vA6+zihxN@aschofie-mobl2>

Alison Schofield wrote:
> On Tue, Apr 11, 2023 at 06:47:56PM -0700, Dan Williams wrote:
> > alison.schofield@ wrote:
> > > From: Alison Schofield <alison.schofield@intel.com>
> > > 
> > > CXL devices maintain a list of locations that are poisoned or result
> > > in poison if the addresses are accessed by the host.
> > > 
> > > Per the spec, (CXL 3.0 8.2.9.8.4.1), the device returns this Poison
> > > list as a set of Media Error Records that include the source of the
> > > error, the starting device physical address, and length. The length is
> > > the number of adjacent DPAs in the record and is in units of 64 bytes.
> > > 
> > > Retrieve the poison list.
> > > 
> > > Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  drivers/cxl/core/mbox.c | 71 +++++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxlmem.h    | 67 ++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/pci.c       |  4 +++
> > >  3 files changed, 142 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > index f2addb457172..69a5d69dd53b 100644
> > > --- a/drivers/cxl/core/mbox.c
> > > +++ b/drivers/cxl/core/mbox.c
> > > @@ -5,6 +5,8 @@
> > >  #include <linux/debugfs.h>
> > >  #include <linux/ktime.h>
> > >  #include <linux/mutex.h>
> > > +#include <asm/unaligned.h>
> > > +#include <cxlpci.h>
> > >  #include <cxlmem.h>
> > >  #include <cxl.h>
> > >  
> > > @@ -994,6 +996,7 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
> > >  	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
> > >  	struct cxl_mbox_identify id;
> > >  	struct cxl_mbox_cmd mbox_cmd;
> > > +	u32 val;
> > >  	int rc;
> > >  
> > >  	mbox_cmd = (struct cxl_mbox_cmd) {
> > > @@ -1017,6 +1020,11 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
> > >  	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
> > >  	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> > >  
> > > +	if (test_bit(CXL_MEM_COMMAND_ID_GET_POISON, cxlds->enabled_cmds)) {
> > > +		val = get_unaligned_le24(id.poison_list_max_mer);
> > > +		cxlds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
> > > +	}
> > > +
> > 
> > With this new interface I do not expect we want to support user tooling
> > that wants to retrieve the list via ioctl. So I think this wants a
> > lead-in patch that deprecates the poison command support so that the
> > linux-cxl community only has one mechanism to maintain going forward.
> > 
> > Something like the below as a lead-in, and then you would add code to
> > cxl_walk_cel() to set a flag for the "get poison" machinery.
> 
> In the inject & clear series, I made the commands kernel exclusive (and
> also blocked their usage in raw mode).
> 
> https://lore.kernel.org/linux-cxl/1576040e-e8db-bc78-2fa3-622c8f7da8ec@intel.com/T/#m5b86f3e88ee7ad5b92843babdb9fd41b7f03cf36
> 
> Is that not enough or not the right protection?

I think I'm having deja vu and maybe we talked about this already and I
forgot the outcome?

The concern is having a consistent approach across the commands that are
valid from userspace all the time, those that are valid sometimes
(temporarily marked exclusive), and those that are unsupported from
userspace (unsupported via raw command being a special case of that).

For recent new mailbox functionality like CXL_MBOX_OP_GET_EVENT_RECORD
and the DCD commands, that are only expected to be executed from the
kernel, those are kept out of the CXL_CMDS list, not marked exclusive.

The commands marked exclusive like the label commands return EBUSY, but
only while libnvdimm owns the label area. If libnvdimm is disconnected
(echo pmemX > /sys/bus/cxl/drivers/cxl_pmem/unbind), then those commands
stop being exclusive.

So, I think I want to keep "exclusive" as a set of commands that are
suitable to build into a user tool just that the tool needs to know the
rules about when those commands might be busy.

For poison the user tool is expected to use sysfs + trace-events, so
including it in the list of available commands returned by
CXL_MEM_QUERY_COMMANDS only to never be able to execute it seems wrong.

Marking it deprecated as a "whoops" feels more honest and brings it in
line with the other commands that are permanently kernel exclusive.

  reply	other threads:[~2023-04-12  5:18 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-10 20:55 [PATCH v12 0/6] CXL Poison List Retrieval & Tracing alison.schofield
2023-04-10 20:55 ` [PATCH v12 1/6] cxl/mbox: Add GET_POISON_LIST mailbox command alison.schofield
2023-04-12  1:47   ` Dan Williams
2023-04-12  4:45     ` Alison Schofield
2023-04-12  5:18       ` Dan Williams [this message]
2023-04-12 18:01         ` Alison Schofield
2023-04-12 19:16           ` Dan Williams
2023-04-12 18:06     ` Alison Schofield
2023-04-13 16:48     ` Alison Schofield
2023-04-13 18:34       ` Dan Williams
2023-04-17 16:32     ` Alison Schofield
2023-04-17 19:39       ` Dan Williams
2023-04-10 20:55 ` [PATCH v12 2/6] cxl/trace: Add TRACE support for CXL media-error records alison.schofield
2023-04-10 20:55 ` [PATCH v12 3/6] cxl/memdev: Add trigger_poison_list sysfs attribute alison.schofield
2023-04-12  5:37   ` Dan Williams
2023-04-12 18:32     ` Alison Schofield
2023-04-12 19:34       ` Dan Williams
2023-04-10 20:55 ` [PATCH v12 4/6] cxl/region: Provide region info to the cxl_poison trace event alison.schofield
2023-04-12  5:55   ` Dan Williams
2023-04-12 18:39     ` Alison Schofield
2023-04-12 22:09       ` Dan Williams
2023-04-10 20:55 ` [PATCH v12 5/6] cxl/trace: Add an HPA to cxl_poison trace events alison.schofield
2023-04-10 20:55 ` [PATCH v12 6/6] tools/testing/cxl: Mock support for Get Poison List alison.schofield

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=64363f30a6370_417e294d0@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=alison.schofield@intel.com \
    --cc=bwidawsk@kernel.org \
    --cc=dave.jiang@intel.com \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox