public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Joyner <eric.joyner@amd.com>
To: <netdev@vger.kernel.org>
Cc: Brett Creeley <brett.creeley@amd.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Eric Joyner <eric.joyner@amd.com>
Subject: [PATCH net 6/7] ionic: service adminq CQ before cancelling to avoid false timeouts
Date: Wed, 29 Apr 2026 14:00:06 -0700	[thread overview]
Message-ID: <20260429210007.40015-7-eric.joyner@amd.com> (raw)
In-Reply-To: <20260429210007.40015-1-eric.joyner@amd.com>

From: Brett Creeley <brett.creeley@amd.com>

When ionic_adminq_wait() hits its timeout, it's possible the
firmware has already written the completion to the CQ but
ionic_adminq_service() was never scheduled via NAPI to process it.
In this case the command is falsely reported as timed out even
though it completed successfully.

Fix this by renaming ionic_adminq_cancel() to
ionic_adminq_service_or_cancel() and having it call
ionic_cq_service() under the adminq_lock before checking whether
the context needs to be cancelled.  ionic_cq_service() invokes
ionic_adminq_service() which will process any pending CQ entries,
copy the completion data, call complete_all(), and clear
desc_info->ctx for completed commands.

After servicing, check completion_done() on the ctx.  If the
completion was found and processed, return false (not cancelled)
so ionic_adminq_wait() can use the actual firmware result instead
of reporting a false timeout.  If the completion was not found,
fall through to the existing cancel path that NULLs desc_info->ctx
and return true (cancelled).

Fixes: 938962d55229 ("ionic: Add adminq action")
Assisted-by: Claude:claude-opus-4.6
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Eric Joyner <eric.joyner@amd.com>
---
 .../net/ethernet/pensando/ionic/ionic_main.c  | 42 +++++++++++++++----
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
index 0971ca4d6650..708c7e4c578b 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
@@ -190,8 +190,8 @@ static const char *ionic_opcode_to_str(enum ionic_cmd_opcode opcode)
 	}
 }
 
-static void ionic_adminq_cancel(struct ionic_lif *lif,
-				struct ionic_admin_ctx *ctx)
+static bool ionic_adminq_service_or_cancel(struct ionic_lif *lif,
+					   struct ionic_admin_ctx *ctx)
 {
 	struct ionic_admin_desc_info *desc_info;
 	unsigned long irqflags;
@@ -201,9 +201,29 @@ static void ionic_adminq_cancel(struct ionic_lif *lif,
 	spin_lock_irqsave(&lif->adminq_lock, irqflags);
 	if (!lif->adminqcq) {
 		spin_unlock_irqrestore(&lif->adminq_lock, irqflags);
-		return;
+		return true;
+	}
+
+	/* Service the CQ to pick up any completions that the FW has
+	 * sent but NAPI hasn't processed yet.  This will call
+	 * complete_all() on any matching contexts, including ours.
+	 */
+	ionic_cq_service(&lif->adminqcq->cq, lif->adminqcq->cq.num_descs,
+			 ionic_adminq_service, NULL, NULL);
+
+	/* If the completion was serviced above, the ctx will have been
+	 * completed and its desc_info->ctx cleared by
+	 * ionic_adminq_service().  Check and return not-cancelled.
+	 */
+	if (completion_done(&ctx->work)) {
+		spin_unlock_irqrestore(&lif->adminq_lock, irqflags);
+		return false;
 	}
 
+	/* The command is still pending, cancel it by clearing
+	 * desc_info->ctx so ionic_adminq_service() won't touch
+	 * the caller's ctx after we return.
+	 */
 	q = &lif->adminqcq->q;
 
 	for (i = 0; i < q->num_descs; i++) {
@@ -214,6 +234,8 @@ static void ionic_adminq_cancel(struct ionic_lif *lif,
 		}
 	}
 	spin_unlock_irqrestore(&lif->adminq_lock, irqflags);
+
+	return true;
 }
 
 static void ionic_adminq_flush(struct ionic_lif *lif)
@@ -444,6 +466,7 @@ int ionic_adminq_wait(struct ionic_lif *lif, struct ionic_admin_ctx *ctx,
 	unsigned long time_start;
 	unsigned long time_done;
 	unsigned long remaining;
+	bool timed_out = false;
 	const char *name;
 
 	name = ionic_opcode_to_str(ctx->cmd.cmd.opcode);
@@ -474,7 +497,7 @@ int ionic_adminq_wait(struct ionic_lif *lif, struct ionic_admin_ctx *ctx,
 			if (do_msg)
 				netdev_warn(netdev, "%s (%d) interrupted, FW in reset\n",
 					    name, ctx->cmd.cmd.opcode);
-			ionic_adminq_cancel(lif, ctx);
+			ionic_adminq_service_or_cancel(lif, ctx);
 			ctx->comp.comp.status = IONIC_RC_ERROR;
 			return -ENXIO;
 		}
@@ -485,12 +508,15 @@ int ionic_adminq_wait(struct ionic_lif *lif, struct ionic_admin_ctx *ctx,
 	dev_dbg(lif->ionic->dev, "%s: elapsed %d msecs\n",
 		__func__, jiffies_to_msecs(time_done - time_start));
 
+	/* If the wait timed out, attempt to service the CQ and cancel
+	 * the ctx.  If ionic_adminq_service() completed the ctx between
+	 * timeout detection and taking the lock, cancel returns false
+	 * and we avoid a false timeout.
+	 */
 	if (time_after_eq(time_done, time_limit))
-		ionic_adminq_cancel(lif, ctx);
+		timed_out = ionic_adminq_service_or_cancel(lif, ctx);
 
-	return ionic_adminq_check_err(lif, ctx,
-				      time_after_eq(time_done, time_limit),
-				      do_msg);
+	return ionic_adminq_check_err(lif, ctx, timed_out, do_msg);
 }
 
 static int __ionic_adminq_post_wait(struct ionic_lif *lif,
-- 
2.17.1


  parent reply	other threads:[~2026-04-29 21:01 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-29 21:00 [PATCH net 0/7] ionic: Various bugfixes Eric Joyner
2026-04-29 21:00 ` [PATCH net 1/7] ionic: Allow the first devcmd to trigger deferred probe Eric Joyner
2026-04-29 21:00 ` [PATCH net 2/7] ionic: Handle failures from ionic_reset() when relevant Eric Joyner
2026-04-29 21:00 ` [PATCH net 3/7] ionic: Fix unexpected dev_cmd failures Eric Joyner
2026-04-29 21:00 ` [PATCH net 4/7] ionic: Fix check in ionic_get_link_ext_stats Eric Joyner
2026-04-29 21:00 ` [PATCH net 5/7] ionic: fix adminq use-after-free on command timeout Eric Joyner
2026-05-01  3:31   ` Eric Joyner
2026-04-29 21:00 ` Eric Joyner [this message]
2026-04-29 21:00 ` [PATCH net 7/7] ionic: fix completion descriptor access with 2x desc size Eric Joyner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260429210007.40015-7-eric.joyner@amd.com \
    --to=eric.joyner@amd.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=brett.creeley@amd.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox