From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "Schofield, Alison" <alison.schofield@intel.com>,
	"Benjamin.Cheatham@amd.com" <Benjamin.Cheatham@amd.com>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"Jiang, Dave" <dave.jiang@intel.com>
Cc: "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 7/7] Documentation: Add docs for inject/clear-error commands
Date: Thu, 29 Jan 2026 19:45:25 +0000
Message-ID: <4e3cf71a568f98a8349416874a7f08a5e5099799.camel@intel.com>
In-Reply-To: <20260122203728.622-8-Benjamin.Cheatham@amd.com>

On Thu, 2026-01-22 at 14:37 -0600, Ben Cheatham wrote:
> Add man pages for the 'cxl-inject-error' and 'cxl-clear-error' commands.
> These man pages show usage and examples for each of their use cases.
> 
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

Sorry to jump in late in the review cycle, but I had some thoughts on
the command interface below.

<snip>
> 
> +
> +cxl-inject-error(1)
> +===================
> +
> +NAME
> +----
> +cxl-inject-error - Inject CXL errors into CXL devices
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'cxl inject-error' <device name> [<options>]
> +
> +WARNING: Error injection can cause system instability and should only be used
> +for debugging hardware and software error recovery flows. Use at your own risk!
> +
> +Inject an error into a CXL device. The types of errors supported depend on the
> +device specified. The types of devices supported are:
> +
> +"Downstream Ports":: A CXL RCH downstream port (dport) or a CXL VH root port.
> +Eligible ports will have their 'protocol_injectable' attribute in 'cxl-list'
> +set to true. Dports are specified by host name ("0000:0e:01.1").
> +"memdevs":: A CXL memory device. Memory devices are specified by device name
> +("mem0"), device id ("0"), and/or host device name ("0000:35:00.0").
> +
> +There are two types of errors which can be injected: CXL protocol errors
> +and device poison.
> +
> +CXL protocol errors can only be used with downstream ports (as defined above).
> +Protocol errors follow the format of "<protocol>-<severity>". For example,
> +a "mem-fatal" error is a CXL.mem fatal protocol error. Protocol errors can be
> +found in the "injectable_protocol_errors" list under a CXL bus object. This
> +list is only available when the CXL debugfs is accessible (normally mounted
> +at "/sys/kernel/debug/cxl"). For example:
> +
> +----
> +
> +# cxl list -B
> +[
> +  {
> +	"bus":"root0",
> +	"provider":"ACPI.CXL",
> +	"injectable_protocol_errors":[
> +	  "mem-correctable",
> +	  "mem-fatal"
> +	]
> +  }
> +]
> +
> +----
> +
> +CXL protocol (CXL.cache/mem) error injection requires the platform to support
> +ACPI v6.5+ error injection (EINJ). In addition to platform support, the
> +CONFIG_ACPI_APEI_EINJ and CONFIG_ACPI_APEI_EINJ_CXL kernel configuration options
> +will need to be enabled. For more information, view the Linux kernel documentation
> +on EINJ. Example using the bus output above:
> +
> +----
> +
> +# cxl list -TP
> + [
> +  {
> +    "port":"port1",
> +    "host":"pci0000:e0",
> +    "depth":1,
> +    "decoders_committed":1,
> +    "nr_dports":1,
> +    "dports":[
> +      {
> +        "dport":"0000:e0:01.1",
> +        "alias":"device:02",
> +        "id":0,
> +        "protocol_injectable":true
> +      }
> +    ]
> +  }
> +]
> +
> +# cxl inject-error "0000:e0:01.1" -t mem-correctable
> +cxl inject-error: inject_proto_err: injected mem-correctable protocol error.
> +
> +----
> +
> +Device poison can only be used with CXL memory devices. A device physical address
> +(DPA) is required to do poison injection. DPAs range from 0 to the size of the
> +device's memory, which can be found using 'cxl-list'. An example injection:
> +
> +----
> +
> +# cxl inject-error mem0 -t poison -a 0x1000
> +poison injected at mem0:0x1000
> +# cxl list -m mem0 -u --media-errors
> +{
> +  "memdev":"mem0",
> +  "ram_size":"256.00 MiB (268.44 MB)",
> +  "serial":"0",
> +  "host":"0000:0d:00.0",
> +  "firmware_version":"BWFW VERSION 00",
> +  "media_errors":[
> +    {
> +      "offset":"0x1000",
> +      "length":64,
> +      "source":"Injected"
> +    }
> +  ]
> +}
> +
> +----

It feels to me like the two injection 'modes' should really be two
separate commands, especially since they act on different classes of
targets.

So essentially, split both the injection and clear commands into:

inject-protocol-error
inject-media-error
clear-protocol-error
clear-media-error

That way the target operands for them are well defined - i.e. port
objects for protocol errors and memdevs for media errors.
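
To make the point concrete, here is a rough sketch (Python for brevity).
The table and the function name are hypothetical and do not exist in
ndctl or libcxl; the only thing taken from the proposal is the four
command names and the target class each would accept:

```python
# Hypothetical sketch; none of these names exist in ndctl or libcxl.
# Each proposed command accepts exactly one class of target.
TARGET_CLASS = {
    "inject-protocol-error": "port",
    "clear-protocol-error": "port",
    "inject-media-error": "memdev",
    "clear-media-error": "memdev",
}


def expected_target(command):
    """Return the target class a proposed command would operate on."""
    if command not in TARGET_CLASS:
        raise ValueError(f"unknown command: {command}")
    return TARGET_CLASS[command]


print(expected_target("inject-media-error"))  # prints: memdev
```

With a single combined command, the same lookup would have to depend on
the -t argument as well, which is the coupling the split avoids.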


Another thing - and I'm not too attached to either way for this -

The -t 'long-string' feels a bit awkward. Could it be split into
something like:

  --target={mem,cache} --type={correctable,uncorrectable,fatal}

And then 'compose' the actual thing being injected from those options?
Or is that unnecessary gymnastics?
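
For illustration only, the composition could look something like the
sketch below. It is Python rather than the C that the cxl tool is
written in, and the function names and the 'prog' string are
assumptions of mine; only the two option names, their values, and the
"<protocol>-<severity>" format come from this thread:

```python
import argparse

# Values taken from the proposed --target and --type options.
PROTOCOLS = ("mem", "cache")
SEVERITIES = ("correctable", "uncorrectable", "fatal")


def compose_error_type(protocol, severity):
    """Build the "<protocol>-<severity>" string used by the current -t option."""
    if protocol not in PROTOCOLS:
        raise ValueError(f"unknown protocol: {protocol}")
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    return f"{protocol}-{severity}"


def parse_error_type(argv):
    """Parse the two hypothetical options and compose the error type."""
    parser = argparse.ArgumentParser(prog="cxl inject-protocol-error")
    parser.add_argument("--target", choices=PROTOCOLS, required=True)
    parser.add_argument("--type", dest="severity", choices=SEVERITIES,
                        required=True)
    args = parser.parse_args(argv)
    return compose_error_type(args.target, args.severity)


print(parse_error_type(["--target=mem", "--type=fatal"]))  # prints: mem-fatal
```

The composed string would presumably still need to be checked against
the "injectable_protocol_errors" list reported for the bus, since not
every combination is necessarily supported by a given platform.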


Thread overview: 15+ messages
2026-01-22 20:37 [ndctl PATCH v7 0/7] Add error injection support Ben Cheatham
2026-01-22 20:37 ` [PATCH 1/7] libcxl: Add debugfs path to CXL context Ben Cheatham
2026-01-22 20:37 ` [PATCH 2/7] libcxl: Add CXL protocol errors Ben Cheatham
2026-01-22 20:37 ` [PATCH 3/7] libcxl: Add poison injection support Ben Cheatham
2026-01-22 20:37 ` [PATCH 4/7] cxl: Add inject-error command Ben Cheatham
2026-01-22 20:37 ` [PATCH 5/7] cxl: Add clear-error command Ben Cheatham
2026-01-22 20:37 ` [PATCH 6/7] cxl/list: Add injectable errors in output Ben Cheatham
2026-01-22 20:37 ` [PATCH 7/7] Documentation: Add docs for inject/clear-error commands Ben Cheatham
2026-01-29 19:45   ` Verma, Vishal L [this message]
2026-01-30 19:59     ` Cheatham, Benjamin
2026-01-30 20:58       ` Verma, Vishal L
2026-01-30 21:45         ` Cheatham, Benjamin
2026-02-03  1:31           ` Alison Schofield
  -- strict thread matches above, loose matches on Subject: below --
2026-01-09 16:07 [ndctl PATCH v6 0/7] Add error injection support Ben Cheatham
2026-01-09 16:07 ` [PATCH 7/7] Documentation: Add docs for inject/clear-error commands Ben Cheatham
2026-01-09 22:25   ` Dave Jiang