From: <dan.j.williams@intel.com>
To: <alison.schofield@intel.com>, Davidlohr Bueso <dave@stgolabs.net>,
Jonathan Cameron <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
"Vishal Verma" <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
"Dan Williams" <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>
Subject: Re: [PATCH 3/3] cxl/region: Add inject and clear poison by HPA
Date: Wed, 25 Jun 2025 16:05:35 -0700
Message-ID: <685c80bf237b2_23a2a100bb@dwillia2-mobl4.notmuch>
In-Reply-To: <50bd42e0db1ed5979a00e6f5a43147320a1d5b9b.1750725512.git.alison.schofield@intel.com>
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Add CXL region debugfs attributes to inject and clear poison based
> on Host Physical Addresses (HPA). These new interfaces allow users
> to operate on poison at the region level without needing to resolve
> Device Physical Addresses (DPA) or target individual memdevs.
>
> The implementation leverages an internal HPA-to-DPA helper, which
> applies decoder interleave logic, including XOR-based address decoding
> when applicable. Note that XOR decodes rely on driver internal xormaps
> which are not exposed to userspace. So, this support is not only a
> simplification of poison operations that could be done using existing
> per memdev operations, but also it enables the functionality for XOR
> interleaved regions for the first time.
>
> The new debugfs attributes are added under /sys/kernel/debug/regionX/:
> inject_poison and clear_poison. These are only exposed if all memdevs
> participating in the region support both inject and clear commands,
> ensuring consistent and reliable behavior across multi-device regions.
>
> If tracing is enabled, these operations are logged as cxl_poison
> events in /sys/kernel/tracing/trace.
>
> Signed-off-by: Alison Schofield <alison.schofield@intel.com>
> ---
> Documentation/ABI/testing/debugfs-cxl | 33 +++++++
> drivers/cxl/core/core.h | 4 +
> drivers/cxl/core/memdev.c | 11 +++
> drivers/cxl/core/region.c | 120 +++++++++++++++++++++++++-
> 4 files changed, 166 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
> index 12488c14be64..c784885cdf07 100644
> --- a/Documentation/ABI/testing/debugfs-cxl
> +++ b/Documentation/ABI/testing/debugfs-cxl
> @@ -35,6 +35,39 @@ Description:
> The clear_poison attribute is only visible for devices
> supporting the capability.
>
> +
> +What: /sys/kernel/debug/regionX/inject_poison
> +Date: August, 2025
> +KernelVersion: v6.17
I do not find the KernelVersion line all that useful because git blame
can get you that and sometimes the merged kernel version changes.
> +Contact: linux-cxl@vger.kernel.org
> +Description:
> + (WO) When a Host Physical Address (HPA) is written to this
> + attribute, the region driver translates it to a Device
> + Physical Address (DPA) and identifies the corresponding
> + memdev. It then sends an inject poison command to that memdev
> + at the translated DPA. Refer to the memdev ABI entry at:
> + /sys/kernel/debug/cxl/memX/inject_poison for the detailed
> + behavior. This attribute is only visible if all memdevs
> + participating in the region support both inject and clear
> + poison commands.
> +
> +
> +What: /sys/kernel/debug/regionX/clear_poison
> +Date: August, 2025
> +KernelVersion: v6.17
> +Contact: linux-cxl@vger.kernel.org
> +Description:
> + (WO) When a Host Physical Address (HPA) is written to this
> + attribute, the region driver translates it to a Device
> + Physical Address (DPA) and identifies the corresponding
> + memdev. It then sends a clear poison command to that memdev
> + at the translated DPA. Refer to the memdev ABI entry at:
> + /sys/kernel/debug/cxl/memX/clear_poison for the detailed
> + behavior. This attribute is only visible if all memdevs
> + participating in the region support both inject and clear
> + poison commands.
A few food-for-thought comments here:

In the nvdimm subsystem all of the poison addressing is object relative,
i.e. instead of an absolute HPA it would be a 0-based region offset. That
might help when we get to platforms that have additional Host Bridge
translation, because HPA is not always equal to SPA. The existing DPA
interface for memX injection just happens to comply because DPA is a
0-based, mem-object-relative address. Let's use region-offset values for
this new capability.
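
For illustration only, a minimal user-space sketch of the region-offset
idea. The struct region_range and the helper region_offset_to_hpa() are
hypothetical names, not part of this patch or the CXL driver. The sketch
assumes the region's base and size are known and that base + size does not
wrap. The offset is bounds-checked against the region size and only then
added to the base:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a region's address window (not a kernel type). */
struct region_range {
	uint64_t start;	/* absolute address of the first byte of the region */
	uint64_t size;	/* region length in bytes */
};

/*
 * Convert a 0-based region offset into an absolute address.
 * Returns false, leaving *hpa untouched, when the offset lies outside
 * the region. Assumes start + size does not overflow.
 */
static bool region_offset_to_hpa(const struct region_range *r,
				 uint64_t offset, uint64_t *hpa)
{
	if (offset >= r->size)
		return false;

	*hpa = r->start + offset;
	return true;
}
```

With a region based at 0x100000000 that is 0x1000 bytes long, offset 0x40
maps to 0x100000040, and offset 0x1000 is rejected as out of range.
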
This documentation probably needs to be clearer about the data loss
danger here, especially since this error injection can permanently destroy
data in the case of CXL PMEM, and crash kernels if this is just memory.

Again, nvdimm was careful to separate "uninject" from "data
repair/recovery" because they are not equivalent. The documentation can
note that this interface is test-only, not a repair mechanism.