From: Richard Cheng <icheng@nvidia.com>
To: dave@stgolabs.net, jic23@kernel.org, dave.jiang@intel.com,
alison.schofield@intel.com, vishal.l.verma@intel.com,
djbw@kernel.org, danwilliams@nvidia.com
Cc: iweiny@kernel.org, ming.li@zohomail.com, gourry@gourry.net,
rrichter@amd.com, linux-cxl@vger.kernel.org,
linux-kernel@vger.kernel.org, kees@kernel.org,
newtonl@nvidia.com, kristinc@nvidia.com, mochs@nvidia.com,
kaihengf@nvidia.com, kobak@nvidia.com,
Richard Cheng <icheng@nvidia.com>
Subject: [PATCH v2 0/5] cxl: Sashiko bug fixes
Date: Thu, 2 Jul 2026 17:08:44 +0800 [thread overview]
Message-ID: <20260702090849.47501-1-icheng@nvidia.com> (raw)
Five independent, pre-existing bugs in the CXL core, reported by sashiko.
Patch 1: Get/Set Feature stored offset + transfer-size into a 16-bit
field via cpu_to_le16() with no bounds check, so a large offset/count
from the fwctl interface silently wrapped and steered the device to the
wrong feature offset. Reject offset + size > U16_MAX up front.
Patch 2: cxl_get_poison_unmapped() aborted its whole partition sweep on
the first fully-mapped partition, silently skipping unmapped poison in
all later partitions. Skip that partition instead.
Patch 3: the same function tolerated the -EFAULT a RAM partition returns
for Get Poison List but left it in rc, so a benign fault on the last
scanned partition surfaced as a spurious read failure. Clear rc, as
poison_by_decoder() already does.
Patch 4: the same function also ignored the ctx->offset handoff from
poison_by_decoder() and derived its scan start from the highest DPA
allocation, so the DPA of allocated-but-uncommitted decoders was never
scanned by either phase. Resume the sweep at ctx->offset.
Patch 5: cxl_get_poison_by_memdev() overwrote rc on each partition
query, so an earlier partition's failure was masked by a later success
and unscanned poison was reported as a clean list. Stop on any error
not tolerated as a RAM -EFAULT.
Changes since v1 [1]:
- Patch 1: write the bounds checks as size > U16_MAX - offset so the
check itself cannot wrap on 32-bit architectures (sashiko)
- Patch 2: commit message wording fix (Dave)
- New patches 4 and 5, fixing the pre-existing issues sashiko raised on
the v1 patch 3 thread [2]
[1]:
https://lore.kernel.org/linux-cxl/20260630074657.43077-1-icheng@nvidia.com/
[2]:
https://lore.kernel.org/linux-cxl/20260630100022.A621A1F000E9@smtp.kernel.org/
Richard Cheng (5):
cxl/features: Reject feature offset that overflows 16-bit field
cxl/region: Scan all partitions for unmapped poison
cxl/region: Don't leak tolerated RAM -EFAULT from unmapped poison scan
cxl/region: Start unmapped poison scan at the committed decoder
boundary
cxl/memdev: Don't overwrite the error from an earlier partition poison
query
drivers/cxl/core/features.c | 6 ++++++
drivers/cxl/core/memdev.c | 2 ++
drivers/cxl/core/region.c | 13 ++++++-------
3 files changed, 14 insertions(+), 7 deletions(-)
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
--
2.43.0
next reply other threads:[~2026-07-02 9:09 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-07-02 9:08 Richard Cheng [this message]
2026-07-02 9:08 ` [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field Richard Cheng
2026-07-02 11:22 ` sashiko-bot
2026-07-02 9:08 ` [PATCH v2 2/5] cxl/region: Scan all partitions for unmapped poison Richard Cheng
2026-07-02 9:08 ` [PATCH v2 3/5] cxl/region: Don't leak tolerated RAM -EFAULT from unmapped poison scan Richard Cheng
2026-07-02 9:20 ` sashiko-bot
2026-07-02 9:08 ` [PATCH v2 4/5] cxl/region: Start unmapped poison scan at the committed decoder boundary Richard Cheng
2026-07-02 9:08 ` [PATCH v2 5/5] cxl/memdev: Don't overwrite the error from an earlier partition poison query Richard Cheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260702090849.47501-1-icheng@nvidia.com \
--to=icheng@nvidia.com \
--cc=alison.schofield@intel.com \
--cc=danwilliams@nvidia.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=djbw@kernel.org \
--cc=gourry@gourry.net \
--cc=iweiny@kernel.org \
--cc=jic23@kernel.org \
--cc=kaihengf@nvidia.com \
--cc=kees@kernel.org \
--cc=kobak@nvidia.com \
--cc=kristinc@nvidia.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ming.li@zohomail.com \
--cc=mochs@nvidia.com \
--cc=newtonl@nvidia.com \
--cc=rrichter@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox