* [PATCH net] pds_core: Fix potential invalid stack memory access
@ 2026-05-01 3:19 Eric Joyner
2026-05-05 18:45 ` Simon Horman
0 siblings, 1 reply; 2+ messages in thread
From: Eric Joyner @ 2026-05-01 3:19 UTC (permalink / raw)
To: netdev
Cc: Brett Creeley, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Eric Joyner
pds_adminq_post() and pds_process_adminq() can run concurrently
together; either one can check if the current request is complete:
pds_adminq_post() does it after the loop inside it times out waiting
for the completion, and pds_process_adminq() when it sees the
uncompleted request on the ring.
However, since pds_adminq_post() does not do any synchronization
around checking for an incomplete command after it reaches its timeout
and marking the command as complete, there is a small window where both
pds_process_adminq() and pds_adminq_post() can execute their
if(!completion_done()) clauses when the completion_done() returns
false; if pdsc_adminq_post() finishes first and its callers return,
then pdsc_process_adminq() will attempt to access q_info->dest after
the address in there points to an invalid location.
(q_info->dest will be invalid after pdsc_adminq_post()'s call chain
exits because it is pointing to a stack variable "comp" in
pdsc_fw_rpc())
Fix this by locking around the completion_done() check and the
complete() contained in the if-statement with the same adminq_lock that
pds_process_adminq() uses, which will prevent this synchronization
issue from occurring.
Fixes: 3f77c3dfffc7 ("pds_core: make wait_context part of q_info")
Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
drivers/net/ethernet/amd/pds_core/adminq.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c
index 097bb092bdb8..6db7d29cdf5c 100644
--- a/drivers/net/ethernet/amd/pds_core/adminq.c
+++ b/drivers/net/ethernet/amd/pds_core/adminq.c
@@ -283,9 +283,15 @@ int pdsc_adminq_post(struct pdsc *pdsc,
__func__, jiffies_to_msecs(time_done - time_start));
/* Check the results and clear an un-completed timeout */
- if (time_after_eq(time_done, time_limit) && !completion_done(wc)) {
- err = -ETIMEDOUT;
- complete(wc);
+ if (time_after_eq(time_done, time_limit)) {
+ unsigned long irqflags;
+
+ spin_lock_irqsave(&pdsc->adminq_lock, irqflags);
+ if (!completion_done(wc)) {
+ err = -ETIMEDOUT;
+ complete(wc);
+ }
+ spin_unlock_irqrestore(&pdsc->adminq_lock, irqflags);
}
dev_dbg(pdsc->dev, "read admin queue completion idx %d:\n", index);
base-commit: e728258debd553c95d2e70f9cd97c9fde27c7130
--
2.17.1
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH net] pds_core: Fix potential invalid stack memory access
2026-05-01 3:19 [PATCH net] pds_core: Fix potential invalid stack memory access Eric Joyner
@ 2026-05-05 18:45 ` Simon Horman
0 siblings, 0 replies; 2+ messages in thread
From: Simon Horman @ 2026-05-05 18:45 UTC (permalink / raw)
To: eric.joyner
Cc: 'Simon Horman', netdev, brett.creeley, andrew+netdev,
davem, edumazet, kuba, pabeni
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
pds_core: Fix potential invalid stack memory access
This commit attempts to fix a potential stack corruption issue when an admin
queue command times out. It adds a spinlock around the timeout completion
check to prevent concurrent access by the timeout handler and the completion
handler.
> diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c
> --- a/drivers/net/ethernet/amd/pds_core/adminq.c
> +++ b/drivers/net/ethernet/amd/pds_core/adminq.c
> @@ -283,9 +283,15 @@ int pdsc_adminq_post(struct pdsc *pdsc,
[ ... ]
> remaining = wait_for_completion_timeout(wc, poll_jiffies);
> if (remaining)
> break;
>
> if (!pdsc_is_fw_running(pdsc)) {
[ ... ]
> err = -ENXIO;
> break;
> }
[ ... ]
> } while (time_before(jiffies, time_limit));
> time_done = jiffies;
[ ... ]
> /* Check the results and clear an un-completed timeout */
> - if (time_after_eq(time_done, time_limit) && !completion_done(wc)) {
> - err = -ETIMEDOUT;
> - complete(wc);
> + if (time_after_eq(time_done, time_limit)) {
> + unsigned long irqflags;
> +
> + spin_lock_irqsave(&pdsc->adminq_lock, irqflags);
Can an ABA race condition occur if the thread is preempted right before
acquiring the adminq_lock?
If the thread times out and is preempted just before spin_lock_irqsave(),
the hardware could complete the command, and a new thread might wrap the
admin queue, reusing the same slot. The new thread would call
reinit_completion(wc), resetting wc->done to 0.
When the preempted thread resumes, completion_done(wc) would evaluate to
false, causing it to erroneously complete the new command prematurely.
> + if (!completion_done(wc)) {
> + err = -ETIMEDOUT;
> + complete(wc);
> + }
> + spin_unlock_irqrestore(&pdsc->adminq_lock, irqflags);
> }
Does this leave the stack memory corruption possible on the -ENXIO error
path?
If the wait loop breaks early because pdsc_is_fw_running() evaluates to false,
time_done may be less than time_limit. The code would bypass this newly added
locked block and return without calling complete(wc). Could the interrupt
handler later write to the destroyed stack variable comp?
Will successful commands that complete right near the timeout limit
incorrectly return -ETIMEDOUT?
If wait_for_completion_timeout() successfully acquires the completion, it
decrements wc->done to 0 and breaks. If jiffies has advanced past time_limit
due to thread preemption, completion_done(wc) will evaluate to false inside
this block because wc->done is 0.
This sets err to -ETIMEDOUT, which isn't cleared on success, triggering an
unintended device reset down the err_out path.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2026-05-05 18:45 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-01 3:19 [PATCH net] pds_core: Fix potential invalid stack memory access Eric Joyner
2026-05-05 18:45 ` Simon Horman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox