From: sashiko-bot@kernel.org
To: "Richard Cheng" <icheng@nvidia.com>
Cc: linux-cxl@vger.kernel.org
Subject: Re: [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field
Date: Thu, 02 Jul 2026 11:22:21 +0000 [thread overview]
Message-ID: <20260702112221.D1BE41F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260702090849.47501-2-icheng@nvidia.com>
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [Medium] The newly introduced boundary checks contain an off-by-one arithmetic error that wrongly rejects valid CXL mailbox commands, preventing accesses up to the exact 64KB maximum boundary.
Pre-existing issues:
- [High] Heap buffer overflow in `cxl_get_feature` due to re-using a constant `size_out` for multiple mailbox commands, which can exceed the remaining buffer space if a device returns more data than requested.
- [High] In `cxlctl_get_feature()`, the output buffer `rpc_out` is allocated based on user-provided `*out_len`, but the function copies `count` bytes into it without validating that the buffer is large enough.
- [High] `cxlctl_fw_rpc()` and its command handlers blindly read from `inbuf` (sized `in_len`) without validating that `in_len` is large enough to contain the structures and payload specified by `rpc_in->op_size`.
- [High] Multi-part CXL mailbox feature transfers lack serialization, leading to interleaved commands and hardware state corruption.
--
commit 85b87615c810272287769d45a80ec38b15c23133
Author: Richard Cheng <icheng@nvidia.com>
cxl/features: Reject feature offset that overflows 16-bit field
cxl_get_feature() and cxl_set_feature() build the mailbox command's
offset as cpu_to_le16(offset + data_rcvd_size/data_sent_size), but never
check the sum fits in the 16-bit field. Via fwctl, a user-supplied
offset plus count/op_size summing over 65535 silently wraps, steering
the device to the wrong feature offset.
> diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
> index 85185af46b72d..c3d5f88a4e045 100644
> --- a/drivers/cxl/core/features.c
> +++ b/drivers/cxl/core/features.c
> @@ -237,6 +237,9 @@ size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
> if (!feat_out || !feat_out_size)
> return 0;
>
> + if (feat_out_size > U16_MAX - offset)
> + return 0;
> +
[Severity: Medium]
Will this off-by-one arithmetic error reject valid boundary cases?
If a caller requests a full 64KB transfer (offset=0, size=65536) or a 1-byte
transfer at the highest offset (offset=65535, size=1), both evaluate as
greater than 65535 (U16_MAX) and will incorrectly fail this check.
Should this check be feat_out_size > 65536 - offset instead?
[Severity: High]
This is a pre-existing issue, but can this function overflow the heap buffer?
Looking at cxl_get_feature() in drivers/cxl/core/features.c, if
feat_out_size requires multiple iterations and the remaining capacity on the
final iteration is smaller than the statically computed size_out,
mbox_cmd.size_out is incorrectly assigned size_out instead of the dynamically
bounded data_to_rd_size:
drivers/cxl/core/features.c:cxl_get_feature() {
...
do {
data_to_rd_size = min(feat_out_size - data_rcvd_size,
cxl_mbox->payload_size);
...
mbox_cmd = (struct cxl_mbox_cmd) {
...
.size_out = size_out,
...
};
rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
...
} while (data_rcvd_size < feat_out_size);
...
}
If a hardware device responds with more bytes than the final chunk requires,
will the mailbox subsystem write up to mbox_cmd.size_out bytes and overflow
the remaining space in feat_out?
[Severity: High]
Also a pre-existing issue, but does this multi-part transfer lack
serialization?
Since the global mbox_mutex is dropped between loop iterations inside
cxl_internal_send_cmd(), if two threads concurrently initiate feature
transfers, will their mailbox commands unpredictably interleave and corrupt
the CXL device's multi-part transfer state machine?
> size_out = min(feat_out_size, cxl_mbox->payload_size);
> uuid_copy(&pi.uuid, feat_uuid);
[ ... ]
> @@ -287,6 +290,9 @@ int cxl_set_feature(struct cxl_mailbox *cxl_mbox,
> if (return_code)
> *return_code = CXL_MBOX_CMD_RC_INPUT;
>
> + if (feat_data_size > U16_MAX - offset)
> + return -EINVAL;
> +
[Severity: Medium]
Does this bounds check also wrongly reject valid boundary limits?
Similar to cxl_get_feature(), if feat_data_size + offset equals 65536,
it evaluates as greater than 65535 (U16_MAX) and fails.
Would changing this to feat_data_size > 65536 - offset permit exact
maximum boundaries?
[Severity: High]
This is a pre-existing issue, but are multi-part transfers lacking
serialization here as well?
If two threads concurrently initiate feature transfers, will their INITIATE,
CONTINUE, and FINISH mailbox commands interleave because a higher-level lock
isn't held across the loop iterations?
> struct cxl_mbox_set_feat_in *pi __free(kfree) =
> kzalloc(cxl_mbox->payload_size, GFP_KERNEL);
[ ... ]
[Severity: High]
This is a pre-existing issue, but does cxlctl_get_feature() have a heap
buffer overflow when writing to rpc_out?
drivers/cxl/core/features.c:cxlctl_get_feature() {
...
out_size = cxl_get_feature(cxl_mbox, &feat_in->uuid,
feat_in->selection, rpc_out->payload,
count, offset, &return_code);
...
}
If a user issues the fwctl ioctl with a small out_len but a large
get_feat_in.count, cxl_get_feature() is instructed to write count bytes into
the buffer allocated for out_len. Can this lead to memory corruption?
[Severity: High]
This is a pre-existing issue, but does cxlctl_set_feature() perform an
out-of-bounds read?
drivers/cxl/core/features.c:cxlctl_set_feature() {
...
data_size = rpc_in->op_size - sizeof(feat_in->hdr);
rc = cxl_set_feature(cxl_mbox, &feat_in->uuid,
feat_in->version, feat_in->feat_data,
data_size, flags, offset, &return_code);
...
}
If a user provides a valid but small in_len (e.g., just enough for the
header) but sets a large op_size inside the CXL payload header,
cxlctl_fw_rpc() blindly reads data_size bytes past the end of the inbuf
allocation. Could this leak kernel memory to the hardware device?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260702090849.47501-1-icheng@nvidia.com?part=1
next prev parent reply other threads:[~2026-07-02 11:22 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-07-02 9:08 [PATCH v2 0/5] cxl: Sashiko bug fixes Richard Cheng
2026-07-02 9:08 ` [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field Richard Cheng
2026-07-02 11:22 ` sashiko-bot [this message]
2026-07-02 9:08 ` [PATCH v2 2/5] cxl/region: Scan all partitions for unmapped poison Richard Cheng
2026-07-02 9:08 ` [PATCH v2 3/5] cxl/region: Don't leak tolerated RAM -EFAULT from unmapped poison scan Richard Cheng
2026-07-02 9:20 ` sashiko-bot
2026-07-02 9:08 ` [PATCH v2 4/5] cxl/region: Start unmapped poison scan at the committed decoder boundary Richard Cheng
2026-07-02 9:08 ` [PATCH v2 5/5] cxl/memdev: Don't overwrite the error from an earlier partition poison query Richard Cheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260702112221.D1BE41F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=icheng@nvidia.com \
--cc=linux-cxl@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox