Re: [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field

Linux CXL
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Richard Cheng" <icheng@nvidia.com>
Cc: linux-cxl@vger.kernel.org
Subject: Re: [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field
Date: Thu, 02 Jul 2026 11:22:21 +0000	[thread overview]
Message-ID: <20260702112221.D1BE41F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260702090849.47501-2-icheng@nvidia.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

New issues:
- [Medium] The newly introduced boundary checks contain an off-by-one arithmetic error that wrongly rejects valid CXL mailbox commands, preventing accesses up to the exact 64KB maximum boundary.

Pre-existing issues:
- [High] Heap buffer overflow in `cxl_get_feature` due to re-using a constant `size_out` for multiple mailbox commands, which can exceed the remaining buffer space if a device returns more data than requested.
- [High] In `cxlctl_get_feature()`, the output buffer `rpc_out` is allocated based on user-provided `*out_len`, but the function copies `count` bytes into it without validating that the buffer is large enough.
- [High] `cxlctl_fw_rpc()` and its command handlers blindly read from `inbuf` (sized `in_len`) without validating that `in_len` is large enough to contain the structures and payload specified by `rpc_in->op_size`.
- [High] Multi-part CXL mailbox feature transfers lack serialization, leading to interleaved commands and hardware state corruption.
--

commit 85b87615c810272287769d45a80ec38b15c23133
Author: Richard Cheng <icheng@nvidia.com>

cxl/features: Reject feature offset that overflows 16-bit field

cxl_get_feature() and cxl_set_feature() build the mailbox command's
offset as cpu_to_le16(offset + data_rcvd_size/data_sent_size), but never
check the sum fits in the 16-bit field. Via fwctl, a user-supplied
offset plus count/op_size summing over 65535 silently wraps, steering
the device to the wrong feature offset.

> diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c
> index 85185af46b72d..c3d5f88a4e045 100644
> --- a/drivers/cxl/core/features.c
> +++ b/drivers/cxl/core/features.c
> @@ -237,6 +237,9 @@ size_t cxl_get_feature(struct cxl_mailbox *cxl_mbox, const uuid_t *feat_uuid,
>  	if (!feat_out || !feat_out_size)
>  		return 0;
>  
> +	if (feat_out_size > U16_MAX - offset)
> +		return 0;
> +

[Severity: Medium]
Will this off-by-one arithmetic error reject valid boundary cases?

If a caller requests a full 64KB transfer (offset=0, size=65536) or a 1-byte
transfer at the highest offset (offset=65535, size=1), both evaluate as 
greater than 65535 (U16_MAX) and will incorrectly fail this check.

Should this check be feat_out_size > 65536 - offset instead?

[Severity: High]
This is a pre-existing issue, but can this function overflow the heap buffer?

Looking at cxl_get_feature() in drivers/cxl/core/features.c, if
feat_out_size requires multiple iterations and the remaining capacity on the
final iteration is smaller than the statically computed size_out,
mbox_cmd.size_out is incorrectly assigned size_out instead of the dynamically
bounded data_to_rd_size:

drivers/cxl/core/features.c:cxl_get_feature() {
	...
	do {
		data_to_rd_size = min(feat_out_size - data_rcvd_size,
				      cxl_mbox->payload_size);
		...
		mbox_cmd = (struct cxl_mbox_cmd) {
			...
			.size_out = size_out,
			...
		};
		rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd);
		...
	} while (data_rcvd_size < feat_out_size);
	...
}

If a hardware device responds with more bytes than the final chunk requires,
will the mailbox subsystem write up to mbox_cmd.size_out bytes and overflow
the remaining space in feat_out?

[Severity: High]
Also a pre-existing issue, but does this multi-part transfer lack
serialization?

Since the global mbox_mutex is dropped between loop iterations inside
cxl_internal_send_cmd(), if two threads concurrently initiate feature
transfers, will their mailbox commands unpredictably interleave and corrupt
the CXL device's multi-part transfer state machine?

>  	size_out = min(feat_out_size, cxl_mbox->payload_size);
>  	uuid_copy(&pi.uuid, feat_uuid);

[ ... ]

> @@ -287,6 +290,9 @@ int cxl_set_feature(struct cxl_mailbox *cxl_mbox,
>  	if (return_code)
>  		*return_code = CXL_MBOX_CMD_RC_INPUT;
>  
> +	if (feat_data_size > U16_MAX - offset)
> +		return -EINVAL;
> +

[Severity: Medium]
Does this bounds check also wrongly reject valid boundary limits?

Similar to cxl_get_feature(), if feat_data_size + offset equals 65536,
it evaluates as greater than 65535 (U16_MAX) and fails.

Would changing this to feat_data_size > 65536 - offset permit exact
maximum boundaries?

[Severity: High]
This is a pre-existing issue, but are multi-part transfers lacking
serialization here as well?

If two threads concurrently initiate feature transfers, will their INITIATE,
CONTINUE, and FINISH mailbox commands interleave because a higher-level lock
isn't held across the loop iterations?

>  	struct cxl_mbox_set_feat_in *pi __free(kfree) =
>  			kzalloc(cxl_mbox->payload_size, GFP_KERNEL);

[ ... ]

[Severity: High]
This is a pre-existing issue, but does cxlctl_get_feature() have a heap
buffer overflow when writing to rpc_out?

drivers/cxl/core/features.c:cxlctl_get_feature() {
	...
	out_size = cxl_get_feature(cxl_mbox, &feat_in->uuid,
				   feat_in->selection, rpc_out->payload,
				   count, offset, &return_code);
	...
}

If a user issues the fwctl ioctl with a small out_len but a large
get_feat_in.count, cxl_get_feature() is instructed to write count bytes into
the buffer allocated for out_len. Can this lead to memory corruption?

[Severity: High]
This is a pre-existing issue, but does cxlctl_set_feature() perform an
out-of-bounds read?

drivers/cxl/core/features.c:cxlctl_set_feature() {
	...
	data_size = rpc_in->op_size - sizeof(feat_in->hdr);
	rc = cxl_set_feature(cxl_mbox, &feat_in->uuid,
			     feat_in->version, feat_in->feat_data,
			     data_size, flags, offset, &return_code);
	...
}

If a user provides a valid but small in_len (e.g., just enough for the
header) but sets a large op_size inside the CXL payload header,
cxlctl_fw_rpc() blindly reads data_size bytes past the end of the inbuf
allocation. Could this leak kernel memory to the hardware device?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260702090849.47501-1-icheng@nvidia.com?part=1

next prev parent reply	other threads:[~2026-07-02 11:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-02  9:08 [PATCH v2 0/5] cxl: Sashiko bug fixes Richard Cheng
2026-07-02  9:08 ` [PATCH v2 1/5] cxl/features: Reject feature offset that overflows 16-bit field Richard Cheng
2026-07-02 11:22   ` sashiko-bot [this message]
2026-07-02  9:08 ` [PATCH v2 2/5] cxl/region: Scan all partitions for unmapped poison Richard Cheng
2026-07-02  9:08 ` [PATCH v2 3/5] cxl/region: Don't leak tolerated RAM -EFAULT from unmapped poison scan Richard Cheng
2026-07-02  9:20   ` sashiko-bot
2026-07-02  9:08 ` [PATCH v2 4/5] cxl/region: Start unmapped poison scan at the committed decoder boundary Richard Cheng
2026-07-02  9:08 ` [PATCH v2 5/5] cxl/memdev: Don't overwrite the error from an earlier partition poison query Richard Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260702112221.D1BE41F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=icheng@nvidia.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox