Netdev List
 help / color / mirror / Atom feed
From: "Nikhil P. Rao" <nikhil.rao@amd.com>
To: <netdev@vger.kernel.org>
Cc: <kuba@kernel.org>, <brett.creeley@amd.com>, <eric.joyner@amd.com>,
	<andrew+netdev@lunn.ch>, <davem@davemloft.net>,
	<edumazet@google.com>, <pabeni@redhat.com>,
	"Nikhil P. Rao" <nikhil.rao@amd.com>
Subject: [PATCH net v2 2/2] pds_core: fix use-after-free on workqueue during remove
Date: Mon, 29 Jun 2026 20:03:58 +0000	[thread overview]
Message-ID: <20260629200358.2626129-3-nikhil.rao@amd.com> (raw)
In-Reply-To: <20260629200358.2626129-1-nikhil.rao@amd.com>

In pdsc_remove(), the workqueue is destroyed before pdsc_teardown()
is called. This ordering allows two paths to queue work on the
destroyed workqueue:

1. If pdsc_teardown() -> pdsc_devcmd_reset() times out, the error
   path in pdsc_devcmd_locked() queues health_work.

2. A NotifyQ event can trigger the ISR and queue work before free_irq()
   is called in pdsc_teardown().

Fix problem 1 by moving destroy_workqueue() after pdsc_teardown(),
ensuring the workqueue exists when health_work may be queued during
teardown.

Fix problem 2 by adding cancel_work_sync() in pdsc_qcq_free() after
free_irq(). This ensures no new ISR can queue work, and any
already-queued work is drained before freeing the qcq. Work draining
during teardown may race with intx becoming invalid, so skip returning
interrupt credits if intx is no longer assigned.

Also change pdsc_core_uninit() to free adminqcq before notifyqcq,
since adminqcq's work accesses notifyqcq via pdsc_process_notifyq().
This ensures notifyqcq remains valid while adminqcq's work drains.

Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands")
Reported-by: Sashiko AI Review <sashiko-bot@kernel.org>
Signed-off-by: Nikhil P. Rao <nikhil.rao@amd.com>
---
 drivers/net/ethernet/amd/pds_core/adminq.c | 15 +++++++++++----
 drivers/net/ethernet/amd/pds_core/core.c   | 14 ++++++++++----
 drivers/net/ethernet/amd/pds_core/main.c   |  5 +++--
 3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c
index 097bb092bdb8..c0d9b7e6b8c3 100644
--- a/drivers/net/ethernet/amd/pds_core/adminq.c
+++ b/drivers/net/ethernet/amd/pds_core/adminq.c
@@ -77,6 +77,7 @@ void pdsc_process_adminq(struct pdsc_qcq *qcq)
 	unsigned long irqflags;
 	int nq_work = 0;
 	int aq_work = 0;
+	int intx;
 
 	/* Don't process AdminQ when it's not up */
 	if (!pdsc_adminq_inc_if_up(pdsc)) {
@@ -121,10 +122,16 @@ void pdsc_process_adminq(struct pdsc_qcq *qcq)
 	qcq->accum_work += aq_work;
 
 credits:
-	/* Return the interrupt credits, one for each completion */
-	pds_core_intr_credits(&pdsc->intr_ctrl[qcq->intx],
-			      nq_work + aq_work,
-			      PDS_CORE_INTR_CRED_REARM);
+	/* Return the interrupt credits, one for each completion.
+	 * Use READ_ONCE to get a single consistent copy of intx since it can
+	 * be set to PDS_CORE_INTR_INDEX_NOT_ASSIGNED concurrently during
+	 * teardown, and skip the credits if so.
+	 */
+	intx = READ_ONCE(qcq->intx);
+	if (intx != PDS_CORE_INTR_INDEX_NOT_ASSIGNED)
+		pds_core_intr_credits(&pdsc->intr_ctrl[intx],
+				      nq_work + aq_work,
+				      PDS_CORE_INTR_CRED_REARM);
 	refcount_dec(&pdsc->adminq_refcnt);
 }
 
diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
index 1074a022a52f..570c0cd7339e 100644
--- a/drivers/net/ethernet/amd/pds_core/core.c
+++ b/drivers/net/ethernet/amd/pds_core/core.c
@@ -110,7 +110,8 @@ static void pdsc_qcq_intr_free(struct pdsc *pdsc, struct pdsc_qcq *qcq)
 		return;
 
 	pdsc_intr_free(pdsc, qcq->intx);
-	qcq->intx = PDS_CORE_INTR_INDEX_NOT_ASSIGNED;
+	/* Pairs with READ_ONCE in pdsc_process_adminq() */
+	WRITE_ONCE(qcq->intx, PDS_CORE_INTR_INDEX_NOT_ASSIGNED);
 }
 
 static int pdsc_qcq_intr_alloc(struct pdsc *pdsc, struct pdsc_qcq *qcq)
@@ -145,6 +146,10 @@ void pdsc_qcq_free(struct pdsc *pdsc, struct pdsc_qcq *qcq)
 
 	pdsc_qcq_intr_free(pdsc, qcq);
 
+	/* Drain any work queued by ISR before it was freed above */
+	if (qcq->work.func)
+		cancel_work_sync(&qcq->work);
+
 	if (qcq->q_base)
 		dma_free_coherent(dev, qcq->q_size,
 				  qcq->q_base, qcq->q_base_pa);
@@ -304,8 +309,11 @@ int pdsc_qcq_alloc(struct pdsc *pdsc, unsigned int type, unsigned int index,
 
 static void pdsc_core_uninit(struct pdsc *pdsc)
 {
-	pdsc_qcq_free(pdsc, &pdsc->notifyqcq);
+	/* Free adminqcq first: its work accesses notifyqcq, so we must
+	 * disable its IRQ and drain its work before freeing notifyqcq.
+	 */
 	pdsc_qcq_free(pdsc, &pdsc->adminqcq);
+	pdsc_qcq_free(pdsc, &pdsc->notifyqcq);
 
 	if (pdsc->kern_dbpage) {
 		iounmap(pdsc->kern_dbpage);
@@ -479,8 +487,6 @@ void pdsc_teardown(struct pdsc *pdsc, bool removing)
 {
 	if (!pdsc->pdev->is_virtfn)
 		pdsc_devcmd_reset(pdsc);
-	if (pdsc->adminqcq.work.func)
-		cancel_work_sync(&pdsc->adminqcq.work);
 
 	pci_clear_master(pdsc->pdev);
 
diff --git a/drivers/net/ethernet/amd/pds_core/main.c b/drivers/net/ethernet/amd/pds_core/main.c
index 22db78343eb0..638b9c7a509d 100644
--- a/drivers/net/ethernet/amd/pds_core/main.c
+++ b/drivers/net/ethernet/amd/pds_core/main.c
@@ -435,8 +435,6 @@ static void pdsc_remove(struct pci_dev *pdev)
 		pdsc_auxbus_dev_del(pdsc, pdsc, &pdsc->padev);
 
 		timer_shutdown_sync(&pdsc->wdtimer);
-		if (pdsc->wq)
-			destroy_workqueue(pdsc->wq);
 
 		mutex_lock(&pdsc->config_lock);
 		set_bit(PDSC_S_STOPPING_DRIVER, &pdsc->state);
@@ -444,6 +442,9 @@ static void pdsc_remove(struct pci_dev *pdev)
 		pdsc_stop(pdsc);
 		pdsc_teardown(pdsc, PDSC_TEARDOWN_REMOVING);
 		mutex_unlock(&pdsc->config_lock);
+
+		if (pdsc->wq)
+			destroy_workqueue(pdsc->wq);
 		mutex_destroy(&pdsc->config_lock);
 		mutex_destroy(&pdsc->devcmd_lock);
 
-- 
2.43.0


  parent reply	other threads:[~2026-06-29 20:04 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-29 20:03 [PATCH net v2 0/2] pds_core: fix use-after-free on workqueue during remove Nikhil P. Rao
2026-06-29 20:03 ` [PATCH net v2 1/2] pds_core: fix deadlock between reset thread and remove Nikhil P. Rao
2026-06-29 21:30   ` Harshitha Ramamurthy
2026-06-29 20:03 ` Nikhil P. Rao [this message]
2026-06-29 21:32   ` [PATCH net v2 2/2] pds_core: fix use-after-free on workqueue during remove Harshitha Ramamurthy
2026-06-29 23:42     ` Rao, Nikhil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260629200358.2626129-3-nikhil.rao@amd.com \
    --to=nikhil.rao@amd.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=brett.creeley@amd.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.joyner@amd.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox