Re: [PATCH v5 2/5] nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Hongling Zeng" <zenghongling@kylinos.cn>
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v5 2/5] nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in  rpc.c
Date: Mon, 22 Jun 2026 08:27:37 +0000	[thread overview]
Message-ID: <20260622082737.7CF5B1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260622081136.61695-3-zenghongling@kylinos.cn>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Missing upper and lower bounds checks on the untrusted `rpc->length` field from GSP messages causes severe integer underflows and heap buffer overflows.
- [High] Returning NULL on message length error in `r535_gsp_msgq_recv()` leads to false success reporting.
- [High] Early returns on error paths in GSP message reception fail to advance the read pointer, causing a permanently deadlocked queue.
- [High] Memory leak of large GSP RPC messages on error paths in `r535_gsp_rpc_push`.
- [High] Missing read memory barrier (`smp_rmb`) when polling the GSP message queue write pointer allows out-of-order reads of stale payload data.
--

commit b9953ccb9eee21566520494158c392c95237ec17
Author: Hongling Zeng <zenghongling@kylinos.cn>

nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c

This commit cleans up incorrect IS_ERR_OR_NULL() usage for functions
that never return NULL, such as r535_gsp_msgq_peek() and
r535_gsp_msgq_recv_one_elem(). It updates them to be checked with
IS_ERR() instead.

> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> index b4b2b5c00ab4e..5b89c8835a2e4 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/rm/r535/rpc.c
> @@ -350,7 +350,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, int *retries)
>  		u32 size;
>  
>  		rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), info.retries);

[Severity: High]
This isn't a bug introduced by this patch, but earlier in r535_gsp_msgq_recv(),
if the message length is invalid, it returns NULL:

	if (WARN_ON(rpc->length > max_rpc_size))
		return NULL;

Since callers like nvkm_gsp_rm_alloc_wr() use IS_ERR() to check for failures,
does this cause the caller to incorrectly treat the operation as a success
because IS_ERR(NULL) evaluates to false?

[Severity: High]
This is a pre-existing issue, but do the early returns on error paths in
GSP message reception fail to advance the read pointer?

For example, in r535_gsp_msgq_recv():

	if (WARN_ON(rpc->length > max_rpc_size))
		return NULL;

And in r535_gsp_msgq_recv_one_elem():

	if (rpc->function != NV_VGPU_MSG_FUNCTION_CONTINUATION_RECORD) {
		nvkm_error(&gsp->subdev,
			   "Not a continuation of a large RPC\n");
		r535_gsp_msg_dump(gsp, rpc, NV_DBG_ERROR);
		return ERR_PTR(-EIO);
	}

Since they return without advancing *gsp->msgq.rptr, will subsequent attempts
to process the queue peek the exact same malformed message and hit the same
error, causing a permanently deadlocked queue?

> -		if (IS_ERR_OR_NULL(rpc)) {
> +		if (IS_ERR(rpc)) {
>  			kvfree(buf);
>  			return rpc;
>  		}
> @@ -359,7 +359,7 @@ r535_gsp_msgq_recv(struct nvkm_gsp *gsp, u32 gsp_rpc_len, int *retries)
>  		info.continuation = true;
>  
>  		rpc = r535_gsp_msgq_recv_one_elem(gsp, &info);

[Severity: Critical]
This is a pre-existing issue, but does this code lack upper and lower bounds
checks on the untrusted rpc->length field from GSP messages?

If a compromised GSP provides a small length for a continuation record, it
looks like len - sizeof(struct nvfw_gsp_rpc) in r535_gsp_msgq_recv_one_elem()
could underflow:

	if (info->continuation)
		memcpy(buf, mqe->data + sizeof(struct nvfw_gsp_rpc),
		       len - sizeof(struct nvfw_gsp_rpc));

Additionally, a few lines below this hunk in r535_gsp_msgq_recv():

		size = info.gsp_rpc_len - sizeof(*rpc);
		expected -= size;
		info.gsp_rpc_buf += size;

Could this overflow the kvmalloc-allocated buffer since there is no upper
bound check ensuring rpc->length does not exceed expected?

> -		if (IS_ERR_OR_NULL(rpc)) {
> +		if (IS_ERR(rpc)) {
>  			kvfree(buf);
>  			return rpc;
>  		}

[ ... ]

> @@ -499,7 +499,7 @@ r535_gsp_msg_recv(struct nvkm_gsp *gsp, int fn, u32 gsp_rpc_len)
>  
>  retry:
>  	rpc = r535_gsp_msgq_peek(gsp, sizeof(*rpc), &retries);
> -	if (IS_ERR_OR_NULL(rpc))
> +	if (IS_ERR(rpc))
>  		return rpc;
>  
>  	rpc = r535_gsp_msgq_recv(gsp, gsp_rpc_len, &retries);

[Severity: High]
This is a pre-existing issue, but does r535_gsp_rpc_push() leak large GSP RPC
messages on error paths?

If the payload size exceeds max_payload_size, the message is chunked and the
original msg must be freed manually. However, the error paths seem to bypass
this:

r535_gsp_rpc_push() {
...
done:
	mutex_unlock(&gsp->cmdq.mutex);
	return repv;
}

If any intermediate chunk allocation or sending step fails and triggers the
goto done path, does this bypass the kvfree(msg) call and leak the large
message allocation?

[Severity: High]
This isn't a bug introduced by this patch, but does r535_gsp_msgq_wait()
miss a read memory barrier when polling the GSP message queue write pointer?

r535_gsp_msgq_wait() {
...
	if (WARN_ON(!*ptime))
		return -ETIMEDOUT;

	return used;
}

Since there is no smp_rmb() between checking the write pointer and reading
the payload from shared memory, could the CPU speculatively execute the
payload reads before the pointer read completes on weakly ordered architectures,
observing stale memory?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260622081136.61695-1-zenghongling@kylinos.cn?part=2

next prev parent reply	other threads:[~2026-06-22  8:27 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-22  8:11 [PATCH v5 0/5] nouveau/gsp: Clean up IS_ERR vs IS_ERR_OR_NULL usage Hongling Zeng
2026-06-22  8:11 ` [PATCH v5 1/5] nouveau/gsp/rpc: Document RPC function return value contracts Hongling Zeng
2026-06-22  8:11   ` Hongling Zeng
2026-06-22  8:21   ` sashiko-bot
2026-06-22  8:11 ` Hongling Zeng
2026-06-22  9:18   ` sashiko-bot
2026-06-22  8:11 ` [PATCH v5 2/5] nouveau/gsp/rpc: Cleanup incorrect IS_ERR_OR_NULL in rpc.c Hongling Zeng
2026-06-22  8:11   ` Hongling Zeng
2026-06-22  8:27   ` sashiko-bot [this message]
2026-06-22  8:11 ` Hongling Zeng
2026-06-22  9:22   ` sashiko-bot
2026-06-22  8:11 ` [PATCH v5 3/5] nouveau/gsp/rm/alloc: Cleanup IS_ERR_OR_NULL usage Hongling Zeng
2026-06-22  9:19   ` sashiko-bot
2026-06-22  8:11 ` Hongling Zeng
2026-06-22  8:25   ` sashiko-bot
2026-06-22  8:11 ` [PATCH v5 4/5] nouveau/gsp/rm/bar: " Hongling Zeng
2026-06-22  9:30   ` sashiko-bot
2026-06-22  8:11 ` Hongling Zeng
2026-06-22  8:23   ` sashiko-bot
2026-06-22  8:11 ` [PATCH v5 5/5] nouveau/gsp: Cleanup IS_ERR_OR_NULL in nvkm_gsp_rpc_rd() Hongling Zeng
2026-06-22  8:11   ` Hongling Zeng
2026-06-22  8:36   ` sashiko-bot
2026-06-22  8:11 ` Hongling Zeng
2026-06-22  9:32   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260622082737.7CF5B1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=zenghongling@kylinos.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.