From: Shailend Chand <shailend@google.com>
To: netdev@vger.kernel.org
Cc: almasrymina@google.com, davem@davemloft.net, edumazet@google.com,
	 hramamurthy@google.com, jeroendb@google.com, kuba@kernel.org,
	 pabeni@redhat.com, pkaligineedi@google.com, willemb@google.com,
	 Shailend Chand <shailend@google.com>
Subject: [PATCH net-next 06/10] gve: Avoid rescheduling napi if on wrong cpu
Date: Tue, 30 Apr 2024 23:14:15 +0000
Message-ID: <20240430231420.699177-7-shailend@google.com>
In-Reply-To: <20240430231420.699177-1-shailend@google.com>

To make the implementation of per-queue ndo hooks possible, gve_turnup
was changed in a previous patch to account for queues that already have
unprocessed descriptors: it does a one-off napi_schedule to handle them.
If consistently high traffic persists in the immediate aftermath, the
poll routine for a queue can get "stuck" on the cpu on which the ndo
hooks ran, instead of the cpu its irq has affinity with.
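
This is because a poll routine that consumes its full budget is simply
requeued on the current cpu's poll list; roughly, and simplifying the
net_rx_action() softirq loop in net/core/dev.c:

	work = napi_poll(n, &repoll);	/* invokes the driver's ->poll() */
	if (work == weight)		/* whole budget was consumed */
		list_add_tail(&n->poll_list, &repoll); /* same cpu again */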

This situation is exacerbated by the fact that the ndo hooks for all the
queues are invoked on the same cpu, potentially leaving all the napi
poll routines running on that one cpu.

A self-correcting mechanism in the poll method itself solves this
problem.
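
In outline, the check added below behaves as follows (a condensed
restatement of the diff in this patch, not new code):

	if (reschedule) {
		/* on the right cpu: keep polling by returning budget */
		if (gve_is_napi_on_home_cpu(priv, block->irq))
			return budget;
		/* on the wrong cpu: complete napi with less than full
		 * budget so napi_complete_done() runs and the irq is
		 * re-armed; the next interrupt fires on the home cpu
		 * and reschedules napi there
		 */
		if (work_done == budget)
			work_done--;
	}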

Tested-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Shailend Chand <shailend@google.com>
---
 drivers/net/ethernet/google/gve/gve.h      |  1 +
 drivers/net/ethernet/google/gve/gve_main.c | 33 ++++++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index 53b5244dc7bc..f27a6d5fbecf 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -610,6 +610,7 @@ struct gve_notify_block {
 	struct gve_priv *priv;
 	struct gve_tx_ring *tx; /* tx rings on this block */
 	struct gve_rx_ring *rx; /* rx rings on this block */
+	u32 irq;
 };
 
 /* Tracks allowed and current queue settings */
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index ef902b72b9a9..79b7a677ec0b 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -9,6 +9,7 @@
 #include <linux/etherdevice.h>
 #include <linux/filter.h>
 #include <linux/interrupt.h>
+#include <linux/irq.h>
 #include <linux/module.h>
 #include <linux/pci.h>
 #include <linux/sched.h>
@@ -253,6 +254,18 @@ static irqreturn_t gve_intr_dqo(int irq, void *arg)
 	return IRQ_HANDLED;
 }
 
+static int gve_is_napi_on_home_cpu(struct gve_priv *priv, u32 irq)
+{
+	int cpu_curr = smp_processor_id();
+	const struct cpumask *aff_mask;
+
+	aff_mask = irq_get_effective_affinity_mask(irq);
+	if (unlikely(!aff_mask))
+		return 1;
+
+	return cpumask_test_cpu(cpu_curr, aff_mask);
+}
+
 int gve_napi_poll(struct napi_struct *napi, int budget)
 {
 	struct gve_notify_block *block;
@@ -322,8 +335,21 @@ int gve_napi_poll_dqo(struct napi_struct *napi, int budget)
 		reschedule |= work_done == budget;
 	}
 
-	if (reschedule)
-		return budget;
+	if (reschedule) {
+		/* Reschedule by returning budget only if already on the correct
+		 * cpu.
+		 */
+		if (likely(gve_is_napi_on_home_cpu(priv, block->irq)))
+			return budget;
+
+		/* If not on the cpu with which this queue's irq has
+		 * affinity, we avoid rescheduling napi and arm the irq
+		 * instead so that napi eventually gets rescheduled back
+		 * onto the right cpu.
+		 */
+		if (work_done == budget)
+			work_done--;
+	}
 
 	if (likely(napi_complete_done(napi, work_done))) {
 		/* Enable interrupts again.
@@ -428,6 +454,7 @@ static int gve_alloc_notify_blocks(struct gve_priv *priv)
 				"Failed to receive msix vector %d\n", i);
 			goto abort_with_some_ntfy_blocks;
 		}
+		block->irq = priv->msix_vectors[msix_idx].vector;
 		irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
 				      get_cpu_mask(i % active_cpus));
 		block->irq_db_index = &priv->irq_db_indices[i].index;
@@ -441,6 +468,7 @@ static int gve_alloc_notify_blocks(struct gve_priv *priv)
 		irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
 				      NULL);
 		free_irq(priv->msix_vectors[msix_idx].vector, block);
+		block->irq = 0;
 	}
 	kvfree(priv->ntfy_blocks);
 	priv->ntfy_blocks = NULL;
@@ -474,6 +502,7 @@ static void gve_free_notify_blocks(struct gve_priv *priv)
 		irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
 				      NULL);
 		free_irq(priv->msix_vectors[msix_idx].vector, block);
+		block->irq = 0;
 	}
 	free_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector, priv);
 	kvfree(priv->ntfy_blocks);
-- 
2.45.0.rc0.197.gbae5840b3b-goog

