* [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config
@ 2025-02-04 22:06 Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 1/5] net: move ARFS rmap management to core Ahmed Zaki
` (6 more replies)
0 siblings, 7 replies; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki
Drivers usually need to re-apply the user-set IRQ affinity to their IRQs
after reset. However, since there can be only one IRQ affinity notifier
for each IRQ, registering IRQ notifiers conflicts with the ARFS rmap
management in the core (which also registers separate IRQ affinity
notifiers).
Move the IRQ affinity management to the napi struct. This way we can have
a unified IRQ notifier to re-apply the user-set affinity and also manage
the ARFS rmaps. The first patch moves the ARFS rmap management to the core.
The second patch adds the IRQ affinity mask to napi_config and re-applies
the mask after reset. Patches 3-5 use the new API for bnxt, ice and idpf
drivers.
Tested on bnxt, ice and idpf.
V7:
- P1: add documentation for netif_enable_cpu_rmap()
- P1: move a couple of "if (rx_cpu_rmap_auto)" from patch 1 to patch 2
where they are really needed.
- P1: remove a defensive "if (!rmap)"
- P1: in netif_disable_cpu_rmap(), remove the for loop that frees
notifiers since this is already done in napi_disable_locked().
Also rename it to netif_del_cpu_rmap().
- P1 and P2: simplify the if conditions in netif_napi_set_irq_locked()
- Other nits
V6:
- https://lore.kernel.org/netdev/20250118003335.155379-1-ahmed.zaki@intel.com/
- Modifications to have fewer #ifdef CONFIG_RFS_ACCEL guards
- Remove rmap entry in napi_disable
- Rebase on rc7 and use netif_napi_set_irq_locked()
- Assume IRQ can be -1 and free resources if an old valid IRQ was
associated with the napi. For this, I had to merge the first 2
patches to use the new rmap API.
V5:
- https://lore.kernel.org/netdev/20250113171042.158123-1-ahmed.zaki@intel.com/
- Add kernel doc for new netdev flags (Simon).
- Remove defensive (if !napi) check in napi_irq_cpu_rmap_add()
(patch 2) since caller is already dereferencing the pointer (Simon).
- Fix build error when CONFIG_ARFS_ACCEL is not defined (patch 3).
v4:
- https://lore.kernel.org/netdev/20250109233107.17519-1-ahmed.zaki@intel.com/
- Better introduction in the cover letter.
- Fix kernel build errors in ena_init_rx_cpu_rmap() (Patch 1)
- Fix kernel test robot warnings reported by Dan Carpenter:
https://lore.kernel.org/all/202501050625.nY1c97EX-lkp@intel.com/
- Remove unrelated empty line in patch 4 (Kalesh Anakkur Purayil)
- Fix a memleak (rmap was not freed) by calling cpu_rmap_put() in
netif_napi_affinity_release() (patch 2).
v3:
- https://lore.kernel.org/netdev/20250104004314.208259-1-ahmed.zaki@intel.com/
- Assign one cpu per mask starting from local NUMA node (Shay Drori).
- Keep the new ARFS and affinity flags per netdev (Jakub).
v2:
- https://lore.kernel.org/netdev/202412190454.nwvp3hU2-lkp@intel.com/T/
- Also move the ARFS IRQ affinity management from drivers to the core.
Via netif_napi_set_irq(), drivers can ask the core to add the IRQ to
the ARFS rmap (already allocated by the driver).
RFC -> v1:
- https://lore.kernel.org/netdev/20241210002626.366878-1-ahmed.zaki@intel.com/
- move static inline affinity functions to net/dev/core.c
- add the new napi->irq_flags (patch 1)
- add code changes to bnxt, mlx4 and ice.
Ahmed Zaki (5):
net: move ARFS rmap management to core
net: napi: add CPU affinity to napi_config
bnxt: use napi's irq affinity
ice: use napi's irq affinity
idpf: use napi's irq affinity
Documentation/networking/scaling.rst | 6 +-
drivers/net/ethernet/amazon/ena/ena_netdev.c | 43 +----
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 54 +------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 -
drivers/net/ethernet/intel/ice/ice.h | 3 -
drivers/net/ethernet/intel/ice/ice_arfs.c | 17 +-
drivers/net/ethernet/intel/ice/ice_base.c | 7 +-
drivers/net/ethernet/intel/ice/ice_lib.c | 6 -
drivers/net/ethernet/intel/ice/ice_main.c | 47 +-----
drivers/net/ethernet/intel/idpf/idpf_lib.c | 1 +
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 22 +--
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 6 +-
include/linux/cpu_rmap.h | 1 +
include/linux/netdevice.h | 25 ++-
lib/cpu_rmap.c | 2 +-
net/core/dev.c | 160 ++++++++++++++++++-
16 files changed, 210 insertions(+), 192 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next v7 1/5] net: move ARFS rmap management to core
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
@ 2025-02-04 22:06 ` Ahmed Zaki
2025-02-07 2:29 ` Jakub Kicinski
2025-02-04 22:06 ` [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
` (5 subsequent siblings)
6 siblings, 1 reply; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki,
David Arinzon
Add a new netdev flag "rx_cpu_rmap_auto". Drivers supporting ARFS should
set the flag via netif_enable_cpu_rmap(), and the core will allocate and
manage the ARFS rmap. The rmap is also freed by the core when the netdev
is freed.
For better IRQ affinity management, move the IRQ rmap notifier inside the
napi_struct. Consequently, add new notify.notify and notify.release
functions: netif_irq_cpu_rmap_notify() and netif_napi_affinity_release().
Acked-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
Documentation/networking/scaling.rst | 6 +-
drivers/net/ethernet/amazon/ena/ena_netdev.c | 43 +------
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 29 +----
drivers/net/ethernet/intel/ice/ice_arfs.c | 17 +--
include/linux/cpu_rmap.h | 1 +
include/linux/netdevice.h | 15 ++-
lib/cpu_rmap.c | 2 +-
net/core/dev.c | 122 +++++++++++++++++++
8 files changed, 145 insertions(+), 90 deletions(-)
diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 4eb50bcb9d42..e5d4d3ecb980 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -427,8 +427,10 @@ rps_dev_flow_table. The stack consults a CPU to hardware queue map which
is maintained by the NIC driver. This is an auto-generated reverse map of
the IRQ affinity table shown by /proc/interrupts. Drivers can use
functions in the cpu_rmap (“CPU affinity reverse map”) kernel library
-to populate the map. For each CPU, the corresponding queue in the map is
-set to be one whose processing CPU is closest in cache locality.
+to populate the map. Alternatively, drivers can delegate the cpu_rmap
+management to the Kernel by calling netif_enable_cpu_rmap(). For each CPU,
+the corresponding queue in the map is set to be one whose processing CPU is
+closest in cache locality.
Accelerated RFS Configuration
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index c1295dfad0d0..6aab85a7c60a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -5,9 +5,6 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-#ifdef CONFIG_RFS_ACCEL
-#include <linux/cpu_rmap.h>
-#endif /* CONFIG_RFS_ACCEL */
#include <linux/ethtool.h>
#include <linux/kernel.h>
#include <linux/module.h>
@@ -162,30 +159,6 @@ int ena_xmit_common(struct ena_adapter *adapter,
return 0;
}
-static int ena_init_rx_cpu_rmap(struct ena_adapter *adapter)
-{
-#ifdef CONFIG_RFS_ACCEL
- u32 i;
- int rc;
-
- adapter->netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(adapter->num_io_queues);
- if (!adapter->netdev->rx_cpu_rmap)
- return -ENOMEM;
- for (i = 0; i < adapter->num_io_queues; i++) {
- int irq_idx = ENA_IO_IRQ_IDX(i);
-
- rc = irq_cpu_rmap_add(adapter->netdev->rx_cpu_rmap,
- pci_irq_vector(adapter->pdev, irq_idx));
- if (rc) {
- free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
- adapter->netdev->rx_cpu_rmap = NULL;
- return rc;
- }
- }
-#endif /* CONFIG_RFS_ACCEL */
- return 0;
-}
-
static void ena_init_io_rings_common(struct ena_adapter *adapter,
struct ena_ring *ring, u16 qid)
{
@@ -1596,7 +1569,7 @@ static int ena_enable_msix(struct ena_adapter *adapter)
adapter->num_io_queues = irq_cnt - ENA_ADMIN_MSIX_VEC;
}
- if (ena_init_rx_cpu_rmap(adapter))
+ if (netif_enable_cpu_rmap(adapter->netdev, adapter->num_io_queues))
netif_warn(adapter, probe, adapter->netdev,
"Failed to map IRQs to CPUs\n");
@@ -1742,13 +1715,6 @@ static void ena_free_io_irq(struct ena_adapter *adapter)
struct ena_irq *irq;
int i;
-#ifdef CONFIG_RFS_ACCEL
- if (adapter->msix_vecs >= 1) {
- free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
- adapter->netdev->rx_cpu_rmap = NULL;
- }
-#endif /* CONFIG_RFS_ACCEL */
-
for (i = ENA_IO_IRQ_FIRST_IDX; i < ENA_MAX_MSIX_VEC(io_queue_count); i++) {
irq = &adapter->irq_tbl[i];
irq_set_affinity_hint(irq->vector, NULL);
@@ -4131,13 +4097,6 @@ static void __ena_shutoff(struct pci_dev *pdev, bool shutdown)
ena_dev = adapter->ena_dev;
netdev = adapter->netdev;
-#ifdef CONFIG_RFS_ACCEL
- if ((adapter->msix_vecs >= 1) && (netdev->rx_cpu_rmap)) {
- free_irq_cpu_rmap(netdev->rx_cpu_rmap);
- netdev->rx_cpu_rmap = NULL;
- }
-
-#endif /* CONFIG_RFS_ACCEL */
/* Make sure timer and reset routine won't be called after
* freeing device resources.
*/
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7b8b5b39c7bb..b9b839cb942a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -49,7 +49,6 @@
#include <linux/cache.h>
#include <linux/log2.h>
#include <linux/bitmap.h>
-#include <linux/cpu_rmap.h>
#include <linux/cpumask.h>
#include <net/pkt_cls.h>
#include <net/page_pool/helpers.h>
@@ -10886,10 +10885,8 @@ static int bnxt_set_real_num_queues(struct bnxt *bp)
if (rc)
return rc;
-#ifdef CONFIG_RFS_ACCEL
if (bp->flags & BNXT_FLAG_RFS)
- dev->rx_cpu_rmap = alloc_irq_cpu_rmap(bp->rx_nr_rings);
-#endif
+ return netif_enable_cpu_rmap(dev, bp->rx_nr_rings);
return rc;
}
@@ -11242,10 +11239,6 @@ static void bnxt_free_irq(struct bnxt *bp)
struct bnxt_irq *irq;
int i;
-#ifdef CONFIG_RFS_ACCEL
- free_irq_cpu_rmap(bp->dev->rx_cpu_rmap);
- bp->dev->rx_cpu_rmap = NULL;
-#endif
if (!bp->irq_tbl || !bp->bnapi)
return;
@@ -11268,11 +11261,8 @@ static void bnxt_free_irq(struct bnxt *bp)
static int bnxt_request_irq(struct bnxt *bp)
{
- int i, j, rc = 0;
+ int i, rc = 0;
unsigned long flags = 0;
-#ifdef CONFIG_RFS_ACCEL
- struct cpu_rmap *rmap;
-#endif
rc = bnxt_setup_int_mode(bp);
if (rc) {
@@ -11280,22 +11270,11 @@ static int bnxt_request_irq(struct bnxt *bp)
rc);
return rc;
}
-#ifdef CONFIG_RFS_ACCEL
- rmap = bp->dev->rx_cpu_rmap;
-#endif
- for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
+
+ for (i = 0; i < bp->cp_nr_rings; i++) {
int map_idx = bnxt_cp_num_to_irq_num(bp, i);
struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
-#ifdef CONFIG_RFS_ACCEL
- if (rmap && bp->bnapi[i]->rx_ring) {
- rc = irq_cpu_rmap_add(rmap, irq->vector);
- if (rc)
- netdev_warn(bp->dev, "failed adding irq rmap for ring %d\n",
- j);
- j++;
- }
-#endif
rc = request_irq(irq->vector, irq->handler, flags, irq->name,
bp->bnapi[i]);
if (rc)
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 7cee365cc7d1..3b1b892e6958 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -584,9 +584,6 @@ void ice_free_cpu_rx_rmap(struct ice_vsi *vsi)
netdev = vsi->netdev;
if (!netdev || !netdev->rx_cpu_rmap)
return;
-
- free_irq_cpu_rmap(netdev->rx_cpu_rmap);
- netdev->rx_cpu_rmap = NULL;
}
/**
@@ -597,7 +594,6 @@ int ice_set_cpu_rx_rmap(struct ice_vsi *vsi)
{
struct net_device *netdev;
struct ice_pf *pf;
- int i;
if (!vsi || vsi->type != ICE_VSI_PF)
return 0;
@@ -610,18 +606,7 @@ int ice_set_cpu_rx_rmap(struct ice_vsi *vsi)
netdev_dbg(netdev, "Setup CPU RMAP: vsi type 0x%x, ifname %s, q_vectors %d\n",
vsi->type, netdev->name, vsi->num_q_vectors);
- netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(vsi->num_q_vectors);
- if (unlikely(!netdev->rx_cpu_rmap))
- return -EINVAL;
-
- ice_for_each_q_vector(vsi, i)
- if (irq_cpu_rmap_add(netdev->rx_cpu_rmap,
- vsi->q_vectors[i]->irq.virq)) {
- ice_free_cpu_rx_rmap(vsi);
- return -EINVAL;
- }
-
- return 0;
+ return netif_enable_cpu_rmap(netdev, vsi->num_q_vectors);
}
/**
diff --git a/include/linux/cpu_rmap.h b/include/linux/cpu_rmap.h
index 20b5729903d7..2fd7ba75362a 100644
--- a/include/linux/cpu_rmap.h
+++ b/include/linux/cpu_rmap.h
@@ -32,6 +32,7 @@ struct cpu_rmap {
#define CPU_RMAP_DIST_INF 0xffff
extern struct cpu_rmap *alloc_cpu_rmap(unsigned int size, gfp_t flags);
+extern void cpu_rmap_get(struct cpu_rmap *rmap);
extern int cpu_rmap_put(struct cpu_rmap *rmap);
extern int cpu_rmap_add(struct cpu_rmap *rmap, void *obj);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2a59034a5fa2..0d19fa98b65e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -394,6 +394,10 @@ struct napi_struct {
struct list_head dev_list;
struct hlist_node napi_hash_node;
int irq;
+#ifdef CONFIG_RFS_ACCEL
+ struct irq_affinity_notify notify;
+ int napi_rmap_idx;
+#endif
int index;
struct napi_config *config;
};
@@ -1988,6 +1992,9 @@ enum netdev_reg_state {
*
* @threaded: napi threaded mode is enabled
*
+ * @rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap.
+ * Set by calling netif_enable_cpu_rmap().
+ *
* @see_all_hwtstamp_requests: device wants to see calls to
* ndo_hwtstamp_set() for all timestamp requests
* regardless of source, even if those aren't
@@ -2395,6 +2402,7 @@ struct net_device {
struct lock_class_key *qdisc_tx_busylock;
bool proto_down;
bool threaded;
+ bool rx_cpu_rmap_auto;
/* priv_flags_slow, ungrouped to save space */
unsigned long see_all_hwtstamp_requests:1;
@@ -2717,10 +2725,7 @@ static inline void netdev_assert_locked_or_invisible(struct net_device *dev)
netdev_assert_locked(dev);
}
-static inline void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
-{
- napi->irq = irq;
-}
+void netif_napi_set_irq_locked(struct napi_struct *napi, int irq);
static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
{
@@ -2858,6 +2863,8 @@ static inline void netif_napi_del(struct napi_struct *napi)
synchronize_net();
}
+int netif_enable_cpu_rmap(struct net_device *dev, unsigned int num_irqs);
+
struct packet_type {
__be16 type; /* This is really htons(ether_type). */
bool ignore_outgoing;
diff --git a/lib/cpu_rmap.c b/lib/cpu_rmap.c
index 4c348670da31..f03d9be3f06b 100644
--- a/lib/cpu_rmap.c
+++ b/lib/cpu_rmap.c
@@ -73,7 +73,7 @@ static void cpu_rmap_release(struct kref *ref)
* cpu_rmap_get - internal helper to get new ref on a cpu_rmap
* @rmap: reverse-map allocated with alloc_cpu_rmap()
*/
-static inline void cpu_rmap_get(struct cpu_rmap *rmap)
+void cpu_rmap_get(struct cpu_rmap *rmap)
{
kref_get(&rmap->refcount);
}
diff --git a/net/core/dev.c b/net/core/dev.c
index c0021cbd28fc..33e84477c9c2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6866,6 +6866,123 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
}
EXPORT_SYMBOL(netif_queue_set_napi);
+#ifdef CONFIG_RFS_ACCEL
+static void
+netif_irq_cpu_rmap_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+{
+ struct napi_struct *napi =
+ container_of(notify, struct napi_struct, notify);
+ struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
+ int err;
+
+ err = cpu_rmap_update(rmap, napi->napi_rmap_idx, mask);
+ if (err)
+ netdev_warn(napi->dev, "RMAP update failed (%d)\n",
+ err);
+}
+
+static void netif_napi_affinity_release(struct kref *ref)
+{
+ struct napi_struct *napi =
+ container_of(ref, struct napi_struct, notify.kref);
+ struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
+
+ rmap->obj[napi->napi_rmap_idx] = NULL;
+ napi->napi_rmap_idx = -1;
+ cpu_rmap_put(rmap);
+}
+
+static int napi_irq_cpu_rmap_add(struct napi_struct *napi, int irq)
+{
+ struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
+ int rc;
+
+ napi->notify.notify = netif_irq_cpu_rmap_notify;
+ napi->notify.release = netif_napi_affinity_release;
+ cpu_rmap_get(rmap);
+ rc = cpu_rmap_add(rmap, napi);
+ if (rc < 0)
+ goto err_add;
+
+ napi->napi_rmap_idx = rc;
+ rc = irq_set_affinity_notifier(irq, &napi->notify);
+ if (rc)
+ goto err_set;
+
+ return 0;
+
+err_set:
+ rmap->obj[napi->napi_rmap_idx] = NULL;
+ napi->napi_rmap_idx = -1;
+err_add:
+ cpu_rmap_put(rmap);
+ return rc;
+}
+
+int netif_enable_cpu_rmap(struct net_device *dev, unsigned int num_irqs)
+{
+ if (dev->rx_cpu_rmap_auto)
+ return 0;
+
+ dev->rx_cpu_rmap = alloc_irq_cpu_rmap(num_irqs);
+ if (!dev->rx_cpu_rmap)
+ return -ENOMEM;
+
+ dev->rx_cpu_rmap_auto = true;
+ return 0;
+}
+EXPORT_SYMBOL(netif_enable_cpu_rmap);
+
+static void netif_del_cpu_rmap(struct net_device *dev)
+{
+ struct cpu_rmap *rmap = dev->rx_cpu_rmap;
+
+ if (!dev->rx_cpu_rmap_auto)
+ return;
+
+ /* Free the rmap */
+ cpu_rmap_put(rmap);
+ dev->rx_cpu_rmap = NULL;
+ dev->rx_cpu_rmap_auto = false;
+}
+
+#else
+static int napi_irq_cpu_rmap_add(struct napi_struct *napi, int irq)
+{
+ return 0;
+}
+
+int netif_enable_cpu_rmap(struct net_device *dev, unsigned int num_irqs)
+{
+ return 0;
+}
+EXPORT_SYMBOL(netif_enable_cpu_rmap);
+
+static void netif_del_cpu_rmap(struct net_device *dev)
+{
+}
+#endif
+
+void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
+{
+ int rc;
+
+ /* Remove existing rmap entries */
+ if (napi->dev->rx_cpu_rmap_auto &&
+ napi->irq != irq && napi->irq > 0)
+ irq_set_affinity_notifier(napi->irq, NULL);
+
+ napi->irq = irq;
+ if (irq > 0) {
+ rc = napi_irq_cpu_rmap_add(napi, irq);
+ if (rc)
+ netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
+ rc);
+ }
+}
+EXPORT_SYMBOL(netif_napi_set_irq_locked);
+
static void napi_restore_config(struct napi_struct *n)
{
n->defer_hard_irqs = n->config->defer_hard_irqs;
@@ -6995,6 +7112,9 @@ void napi_disable_locked(struct napi_struct *n)
else
napi_hash_del(n);
+ if (n->irq > 0 && n->dev->rx_cpu_rmap_auto)
+ irq_set_affinity_notifier(n->irq, NULL);
+
clear_bit(NAPI_STATE_DISABLE, &n->state);
}
EXPORT_SYMBOL(napi_disable_locked);
@@ -11610,6 +11730,8 @@ void free_netdev(struct net_device *dev)
netdev_napi_exit(dev);
+ netif_del_cpu_rmap(dev);
+
ref_tracker_dir_exit(&dev->refcnt_tracker);
#ifdef CONFIG_PCPU_DEV_REFCNT
free_percpu(dev->pcpu_refcnt);
--
2.43.0
* [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 1/5] net: move ARFS rmap management to core Ahmed Zaki
@ 2025-02-04 22:06 ` Ahmed Zaki
2025-02-04 22:43 ` Joe Damato
2025-02-07 2:37 ` Jakub Kicinski
2025-02-04 22:06 ` [PATCH net-next v7 3/5] bnxt: use napi's irq affinity Ahmed Zaki
` (4 subsequent siblings)
6 siblings, 2 replies; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki
A common task for most drivers is to remember the user-set CPU affinity
of their IRQs. On each netdev reset, the driver should re-apply the
user's settings to the IRQs.
Add a CPU affinity mask to napi_config. To delegate the CPU affinity
management to the core, drivers must:
1 - set the new netdev flag "irq_affinity_auto":
netif_enable_irq_affinity(netdev)
2 - create the napi with persistent config:
netif_napi_add_config()
3 - bind an IRQ to the napi instance: netif_napi_set_irq()
The core will then re-apply the user-set affinity to the napi's IRQ
each time the napi is re-enabled.
The default affinity mask is one CPU per IRQ, assigned starting from the
device's local NUMA node.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
include/linux/netdevice.h | 14 +++++++--
net/core/dev.c | 62 +++++++++++++++++++++++++++++++--------
2 files changed, 61 insertions(+), 15 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0d19fa98b65e..0436605ee607 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -352,6 +352,7 @@ struct napi_config {
u64 gro_flush_timeout;
u64 irq_suspend_timeout;
u32 defer_hard_irqs;
+ cpumask_t affinity_mask;
unsigned int napi_id;
};
@@ -394,10 +395,8 @@ struct napi_struct {
struct list_head dev_list;
struct hlist_node napi_hash_node;
int irq;
-#ifdef CONFIG_RFS_ACCEL
struct irq_affinity_notify notify;
int napi_rmap_idx;
-#endif
int index;
struct napi_config *config;
};
@@ -1992,6 +1991,11 @@ enum netdev_reg_state {
*
* @threaded: napi threaded mode is enabled
*
+ * @irq_affinity_auto: driver wants the core to manage the IRQ affinity.
+ * Set by netif_enable_irq_affinity(), then driver must
+ * create persistent napi by netif_napi_add_config()
+ * and finally bind napi to IRQ (netif_napi_set_irq).
+ *
* @rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap.
* Set by calling netif_enable_cpu_rmap().
*
@@ -2402,6 +2406,7 @@ struct net_device {
struct lock_class_key *qdisc_tx_busylock;
bool proto_down;
bool threaded;
+ bool irq_affinity_auto;
bool rx_cpu_rmap_auto;
/* priv_flags_slow, ungrouped to save space */
@@ -2662,6 +2667,11 @@ static inline void netdev_set_ml_priv(struct net_device *dev,
dev->ml_priv_type = type;
}
+static inline void netif_enable_irq_affinity(struct net_device *dev)
+{
+ dev->irq_affinity_auto = true;
+}
+
/*
* Net namespace inlines
*/
diff --git a/net/core/dev.c b/net/core/dev.c
index 33e84477c9c2..4cde7ac31e74 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6866,28 +6866,39 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
}
EXPORT_SYMBOL(netif_queue_set_napi);
-#ifdef CONFIG_RFS_ACCEL
static void
-netif_irq_cpu_rmap_notify(struct irq_affinity_notify *notify,
- const cpumask_t *mask)
+netif_napi_irq_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
{
struct napi_struct *napi =
container_of(notify, struct napi_struct, notify);
+#ifdef CONFIG_RFS_ACCEL
struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
int err;
+#endif
- err = cpu_rmap_update(rmap, napi->napi_rmap_idx, mask);
- if (err)
- netdev_warn(napi->dev, "RMAP update failed (%d)\n",
- err);
+ if (napi->config && napi->dev->irq_affinity_auto)
+ cpumask_copy(&napi->config->affinity_mask, mask);
+
+#ifdef CONFIG_RFS_ACCEL
+ if (napi->dev->rx_cpu_rmap_auto) {
+ err = cpu_rmap_update(rmap, napi->napi_rmap_idx, mask);
+ if (err)
+ netdev_warn(napi->dev, "RMAP update failed (%d)\n",
+ err);
+ }
+#endif
}
+#ifdef CONFIG_RFS_ACCEL
static void netif_napi_affinity_release(struct kref *ref)
{
struct napi_struct *napi =
container_of(ref, struct napi_struct, notify.kref);
struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
+ if (!napi->dev->rx_cpu_rmap_auto)
+ return;
rmap->obj[napi->napi_rmap_idx] = NULL;
napi->napi_rmap_idx = -1;
cpu_rmap_put(rmap);
@@ -6898,7 +6909,7 @@ static int napi_irq_cpu_rmap_add(struct napi_struct *napi, int irq)
struct cpu_rmap *rmap = napi->dev->rx_cpu_rmap;
int rc;
- napi->notify.notify = netif_irq_cpu_rmap_notify;
+ napi->notify.notify = netif_napi_irq_notify;
napi->notify.release = netif_napi_affinity_release;
cpu_rmap_get(rmap);
rc = cpu_rmap_add(rmap, napi);
@@ -6948,6 +6959,10 @@ static void netif_del_cpu_rmap(struct net_device *dev)
}
#else
+static void netif_napi_affinity_release(struct kref *ref)
+{
+}
+
static int napi_irq_cpu_rmap_add(struct napi_struct *napi, int irq)
{
return 0;
@@ -6968,17 +6983,28 @@ void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
{
int rc;
- /* Remove existing rmap entries */
- if (napi->dev->rx_cpu_rmap_auto &&
+ /* Remove existing resources */
+ if ((napi->dev->rx_cpu_rmap_auto || napi->dev->irq_affinity_auto) &&
napi->irq != irq && napi->irq > 0)
irq_set_affinity_notifier(napi->irq, NULL);
napi->irq = irq;
- if (irq > 0) {
+ if (irq < 0)
+ return;
+
+ if (napi->dev->rx_cpu_rmap_auto) {
rc = napi_irq_cpu_rmap_add(napi, irq);
if (rc)
netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
rc);
+ } else if (napi->config && napi->dev->irq_affinity_auto) {
+ napi->notify.notify = netif_napi_irq_notify;
+ napi->notify.release = netif_napi_affinity_release;
+
+ rc = irq_set_affinity_notifier(irq, &napi->notify);
+ if (rc)
+ netdev_warn(napi->dev, "Unable to set IRQ notifier (%d)\n",
+ rc);
}
}
EXPORT_SYMBOL(netif_napi_set_irq_locked);
@@ -6988,6 +7014,10 @@ static void napi_restore_config(struct napi_struct *n)
n->defer_hard_irqs = n->config->defer_hard_irqs;
n->gro_flush_timeout = n->config->gro_flush_timeout;
n->irq_suspend_timeout = n->config->irq_suspend_timeout;
+
+ if (n->irq > 0 && n->dev->irq_affinity_auto)
+ irq_set_affinity(n->irq, &n->config->affinity_mask);
+
/* a NAPI ID might be stored in the config, if so use it. if not, use
* napi_hash_add to generate one for us.
*/
@@ -7112,7 +7142,8 @@ void napi_disable_locked(struct napi_struct *n)
else
napi_hash_del(n);
- if (n->irq > 0 && n->dev->rx_cpu_rmap_auto)
+ if (n->irq > 0 &&
+ (n->dev->irq_affinity_auto || n->dev->rx_cpu_rmap_auto))
irq_set_affinity_notifier(n->irq, NULL);
clear_bit(NAPI_STATE_DISABLE, &n->state);
@@ -11550,9 +11581,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
void (*setup)(struct net_device *),
unsigned int txqs, unsigned int rxqs)
{
+ unsigned int maxqs, i, numa;
struct net_device *dev;
size_t napi_config_sz;
- unsigned int maxqs;
BUG_ON(strlen(name) >= sizeof(dev->name));
@@ -11654,6 +11685,11 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
if (!dev->napi_config)
goto free_all;
+ numa = dev_to_node(&dev->dev);
+ for (i = 0; i < maxqs; i++)
+ cpumask_set_cpu(cpumask_local_spread(i, numa),
+ &dev->napi_config[i].affinity_mask);
+
strscpy(dev->name, name);
dev->name_assign_type = name_assign_type;
dev->group = INIT_NETDEV_GROUP;
--
2.43.0
* [PATCH net-next v7 3/5] bnxt: use napi's irq affinity
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 1/5] net: move ARFS rmap management to core Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
@ 2025-02-04 22:06 ` Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 4/5] ice: " Ahmed Zaki
` (3 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 +++--------------------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 --
2 files changed, 3 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b9b839cb942a..b2019bb74861 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11246,14 +11246,8 @@ static void bnxt_free_irq(struct bnxt *bp)
int map_idx = bnxt_cp_num_to_irq_num(bp, i);
irq = &bp->irq_tbl[map_idx];
- if (irq->requested) {
- if (irq->have_cpumask) {
- irq_update_affinity_hint(irq->vector, NULL);
- free_cpumask_var(irq->cpu_mask);
- irq->have_cpumask = 0;
- }
+ if (irq->requested)
free_irq(irq->vector, bp->bnapi[i]);
- }
irq->requested = 0;
}
@@ -11282,21 +11276,6 @@ static int bnxt_request_irq(struct bnxt *bp)
netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector);
irq->requested = 1;
-
- if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
- int numa_node = dev_to_node(&bp->pdev->dev);
-
- irq->have_cpumask = 1;
- cpumask_set_cpu(cpumask_local_spread(i, numa_node),
- irq->cpu_mask);
- rc = irq_update_affinity_hint(irq->vector, irq->cpu_mask);
- if (rc) {
- netdev_warn(bp->dev,
- "Update affinity hint failed, IRQ = %d\n",
- irq->vector);
- break;
- }
- }
}
return rc;
}
@@ -16225,6 +16204,8 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
NETDEV_XDP_ACT_RX_SG;
+ netif_enable_irq_affinity(dev);
+
#ifdef CONFIG_BNXT_SRIOV
init_waitqueue_head(&bp->sriov_cfg_wait);
#endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 2373f423a523..9e6984458b46 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1231,9 +1231,7 @@ struct bnxt_irq {
irq_handler_t handler;
unsigned int vector;
u8 requested:1;
- u8 have_cpumask:1;
char name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
- cpumask_var_t cpu_mask;
};
#define HWRM_RING_ALLOC_TX 0x1
--
2.43.0
* [PATCH net-next v7 4/5] ice: use napi's irq affinity
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (2 preceding siblings ...)
2025-02-04 22:06 ` [PATCH net-next v7 3/5] bnxt: use napi's irq affinity Ahmed Zaki
@ 2025-02-04 22:06 ` Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 5/5] idpf: " Ahmed Zaki
` (2 subsequent siblings)
6 siblings, 0 replies; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/intel/ice/ice.h | 3 --
drivers/net/ethernet/intel/ice/ice_base.c | 7 +---
drivers/net/ethernet/intel/ice/ice_lib.c | 6 ---
drivers/net/ethernet/intel/ice/ice_main.c | 47 ++---------------------
4 files changed, 5 insertions(+), 58 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 71e05d30f0fd..a6e6c9e1edc1 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -478,9 +478,6 @@ struct ice_q_vector {
struct ice_ring_container rx;
struct ice_ring_container tx;
- cpumask_t affinity_mask;
- struct irq_affinity_notify affinity_notify;
-
struct ice_channel *ch;
char name[ICE_INT_NAME_STR_LEN];
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index b2af8e3586f7..86cf715de00f 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -147,10 +147,6 @@ static int ice_vsi_alloc_q_vector(struct ice_vsi *vsi, u16 v_idx)
q_vector->reg_idx = q_vector->irq.index;
q_vector->vf_reg_idx = q_vector->irq.index;
- /* only set affinity_mask if the CPU is online */
- if (cpu_online(v_idx))
- cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
-
/* This will not be called in the driver load path because the netdev
* will not be created yet. All other cases with register the NAPI
* handler here (i.e. resume, reset/rebuild, etc.)
@@ -276,7 +272,8 @@ static void ice_cfg_xps_tx_ring(struct ice_tx_ring *ring)
if (test_and_set_bit(ICE_TX_XPS_INIT_DONE, ring->xps_state))
return;
- netif_set_xps_queue(ring->netdev, &ring->q_vector->affinity_mask,
+ netif_set_xps_queue(ring->netdev,
+ &ring->q_vector->napi.config->affinity_mask,
ring->q_index);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 38a1c8372180..31fb09e30683 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2595,12 +2595,6 @@ void ice_vsi_free_irq(struct ice_vsi *vsi)
vsi->q_vectors[i]->num_ring_rx))
continue;
- /* clear the affinity notifier in the IRQ descriptor */
- if (!IS_ENABLED(CONFIG_RFS_ACCEL))
- irq_set_affinity_notifier(irq_num, NULL);
-
- /* clear the affinity_hint in the IRQ descriptor */
- irq_update_affinity_hint(irq_num, NULL);
synchronize_irq(irq_num);
devm_free_irq(ice_pf_to_dev(pf), irq_num, vsi->q_vectors[i]);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c3a0fb97c5ee..a348a37d5ba3 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2527,34 +2527,6 @@ int ice_schedule_reset(struct ice_pf *pf, enum ice_reset_req reset)
return 0;
}
-/**
- * ice_irq_affinity_notify - Callback for affinity changes
- * @notify: context as to what irq was changed
- * @mask: the new affinity mask
- *
- * This is a callback function used by the irq_set_affinity_notifier function
- * so that we may register to receive changes to the irq affinity masks.
- */
-static void
-ice_irq_affinity_notify(struct irq_affinity_notify *notify,
- const cpumask_t *mask)
-{
- struct ice_q_vector *q_vector =
- container_of(notify, struct ice_q_vector, affinity_notify);
-
- cpumask_copy(&q_vector->affinity_mask, mask);
-}
-
-/**
- * ice_irq_affinity_release - Callback for affinity notifier release
- * @ref: internal core kernel usage
- *
- * This is a callback function used by the irq_set_affinity_notifier function
- * to inform the current notification subscriber that they will no longer
- * receive notifications.
- */
-static void ice_irq_affinity_release(struct kref __always_unused *ref) {}
-
/**
* ice_vsi_ena_irq - Enable IRQ for the given VSI
* @vsi: the VSI being configured
@@ -2618,19 +2590,6 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
err);
goto free_q_irqs;
}
-
- /* register for affinity change notifications */
- if (!IS_ENABLED(CONFIG_RFS_ACCEL)) {
- struct irq_affinity_notify *affinity_notify;
-
- affinity_notify = &q_vector->affinity_notify;
- affinity_notify->notify = ice_irq_affinity_notify;
- affinity_notify->release = ice_irq_affinity_release;
- irq_set_affinity_notifier(irq_num, affinity_notify);
- }
-
- /* assign the mask for this irq */
- irq_update_affinity_hint(irq_num, &q_vector->affinity_mask);
}
err = ice_set_cpu_rx_rmap(vsi);
@@ -2646,9 +2605,6 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
free_q_irqs:
while (vector--) {
irq_num = vsi->q_vectors[vector]->irq.virq;
- if (!IS_ENABLED(CONFIG_RFS_ACCEL))
- irq_set_affinity_notifier(irq_num, NULL);
- irq_update_affinity_hint(irq_num, NULL);
devm_free_irq(dev, irq_num, &vsi->q_vectors[vector]);
}
return err;
@@ -3689,6 +3645,9 @@ void ice_set_netdev_features(struct net_device *netdev)
*/
netdev->hw_features |= NETIF_F_RXFCS;
+ /* Allow core to manage IRQs affinity */
+ netif_enable_irq_affinity(netdev);
+
netif_set_tso_max_size(netdev, ICE_MAX_TSO_SIZE);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next v7 5/5] idpf: use napi's irq affinity
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (3 preceding siblings ...)
2025-02-04 22:06 ` [PATCH net-next v7 4/5] ice: " Ahmed Zaki
@ 2025-02-04 22:06 ` Ahmed Zaki
2025-02-07 0:24 ` [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Joe Damato
2025-02-07 18:47 ` Jakub Kicinski
6 siblings, 0 replies; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-04 22:06 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/intel/idpf/idpf_lib.c | 1 +
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 22 +++++++--------------
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 6 ++----
3 files changed, 10 insertions(+), 19 deletions(-)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index b4fbb99bfad2..d54be068f53f 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -814,6 +814,7 @@ static int idpf_cfg_netdev(struct idpf_vport *vport)
netdev->hw_features |= dflt_features | offloads;
netdev->hw_enc_features |= dflt_features | offloads;
idpf_set_ethtool_ops(netdev);
+ netif_enable_irq_affinity(netdev);
SET_NETDEV_DEV(netdev, &adapter->pdev->dev);
/* carrier off on init to avoid Tx hangs */
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 2fa9c36e33c9..f6b5b45a061c 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -3554,8 +3554,6 @@ void idpf_vport_intr_rel(struct idpf_vport *vport)
q_vector->tx = NULL;
kfree(q_vector->rx);
q_vector->rx = NULL;
-
- free_cpumask_var(q_vector->affinity_mask);
}
kfree(vport->q_vectors);
@@ -3582,8 +3580,6 @@ static void idpf_vport_intr_rel_irq(struct idpf_vport *vport)
vidx = vport->q_vector_idxs[vector];
irq_num = adapter->msix_entries[vidx].vector;
- /* clear the affinity_mask in the IRQ descriptor */
- irq_set_affinity_hint(irq_num, NULL);
kfree(free_irq(irq_num, q_vector));
}
}
@@ -3771,8 +3767,6 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport)
"Request_irq failed, error: %d\n", err);
goto free_q_irqs;
}
- /* assign the mask for this irq */
- irq_set_affinity_hint(irq_num, q_vector->affinity_mask);
}
return 0;
@@ -4184,7 +4178,8 @@ static int idpf_vport_intr_init_vec_idx(struct idpf_vport *vport)
static void idpf_vport_intr_napi_add_all(struct idpf_vport *vport)
{
int (*napi_poll)(struct napi_struct *napi, int budget);
- u16 v_idx;
+ u16 v_idx, qv_idx;
+ int irq_num;
if (idpf_is_queue_model_split(vport->txq_model))
napi_poll = idpf_vport_splitq_napi_poll;
@@ -4193,12 +4188,12 @@ static void idpf_vport_intr_napi_add_all(struct idpf_vport *vport)
for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
struct idpf_q_vector *q_vector = &vport->q_vectors[v_idx];
+ qv_idx = vport->q_vector_idxs[v_idx];
+ irq_num = vport->adapter->msix_entries[qv_idx].vector;
- netif_napi_add(vport->netdev, &q_vector->napi, napi_poll);
-
- /* only set affinity_mask if the CPU is online */
- if (cpu_online(v_idx))
- cpumask_set_cpu(v_idx, q_vector->affinity_mask);
+ netif_napi_add_config(vport->netdev, &q_vector->napi,
+ napi_poll, v_idx);
+ netif_napi_set_irq(&q_vector->napi, irq_num);
}
}
@@ -4242,9 +4237,6 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
q_vector->rx_intr_mode = IDPF_ITR_DYNAMIC;
q_vector->rx_itr_idx = VIRTCHNL2_ITR_IDX_0;
- if (!zalloc_cpumask_var(&q_vector->affinity_mask, GFP_KERNEL))
- goto error;
-
q_vector->tx = kcalloc(txqs_per_vector, sizeof(*q_vector->tx),
GFP_KERNEL);
if (!q_vector->tx)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 0f71a6f5557b..13251f63c7c3 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -401,7 +401,6 @@ struct idpf_intr_reg {
* @rx_intr_mode: Dynamic ITR or not
* @rx_itr_idx: RX ITR index
* @v_idx: Vector index
- * @affinity_mask: CPU affinity mask
*/
struct idpf_q_vector {
__cacheline_group_begin_aligned(read_mostly);
@@ -438,13 +437,12 @@ struct idpf_q_vector {
__cacheline_group_begin_aligned(cold);
u16 v_idx;
- cpumask_var_t affinity_mask;
__cacheline_group_end_aligned(cold);
};
libeth_cacheline_set_assert(struct idpf_q_vector, 120,
24 + sizeof(struct napi_struct) +
2 * sizeof(struct dim),
- 8 + sizeof(cpumask_var_t));
+ 8);
struct idpf_rx_queue_stats {
u64_stats_t packets;
@@ -940,7 +938,7 @@ static inline int idpf_q_vector_to_mem(const struct idpf_q_vector *q_vector)
if (!q_vector)
return NUMA_NO_NODE;
- cpu = cpumask_first(q_vector->affinity_mask);
+ cpu = cpumask_first(&q_vector->napi.config->affinity_mask);
return cpu < nr_cpu_ids ? cpu_to_mem(cpu) : NUMA_NO_NODE;
}
--
2.43.0
* Re: [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
2025-02-04 22:06 ` [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
@ 2025-02-04 22:43 ` Joe Damato
2025-02-05 15:20 ` Ahmed Zaki
2025-02-07 2:37 ` Jakub Kicinski
1 sibling, 1 reply; 15+ messages in thread
From: Joe Damato @ 2025-02-04 22:43 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, kuba, horms,
pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, shayd, akpm, shayagr, kalesh-anakkur.purayil
On Tue, Feb 04, 2025 at 03:06:19PM -0700, Ahmed Zaki wrote:
> A common task for most drivers is to remember the user-set CPU affinity
> for their IRQs. On each netdev reset, the driver should re-assign the
> user's settings to the IRQs.
>
> Add CPU affinity mask to napi_config. To delegate the CPU affinity
> management to the core, drivers must:
> 1 - set the new netdev flag "irq_affinity_auto":
> netif_enable_irq_affinity(netdev)
> 2 - create the napi with persistent config:
> netif_napi_add_config()
> 3 - bind an IRQ to the napi instance: netif_napi_set_irq()
>
> the core will then make sure to re-apply the user-set affinity to the
> napi's IRQ.
>
> The default IRQ mask is set to one CPU, starting from the closest NUMA
> node.
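The quoted three-step opt-in can be sketched as a small user-space mock. This is a hedged illustration only: the struct layouts and the `configs[]` array are simplified stand-ins invented for the sketch, not the kernel definitions; only the call order and the flag/config checks mirror what the series describes.

```c
#include <assert.h>
#include <stdbool.h>

/* User-space mock of the driver-facing hand-off; the fields below are
 * simplified stand-ins for the real kernel structures. */

struct napi_config { unsigned long affinity_mask; };

struct net_device {
	bool irq_affinity_auto;
	struct napi_config configs[8];	/* persistent, survives resets */
};

struct napi_struct {
	struct net_device *dev;
	struct napi_config *config;
	int irq;
	bool notifier_installed;
};

/* Step 1: driver opts in to core-managed IRQ affinity. */
static void netif_enable_irq_affinity(struct net_device *dev)
{
	dev->irq_affinity_auto = true;
}

/* Step 2: driver attaches the persistent per-index config. */
static void netif_napi_add_config(struct net_device *dev,
				  struct napi_struct *napi, int index)
{
	napi->dev = dev;
	napi->config = &dev->configs[index];
}

/* Step 3: driver binds the IRQ; the core installs its affinity notifier
 * only when both the flag and the persistent config are present. */
static void netif_napi_set_irq(struct napi_struct *napi, int irq)
{
	napi->irq = irq;
	if (irq > 0 && napi->config && napi->dev->irq_affinity_auto)
		napi->notifier_installed = true;
}
```

In this mock, a driver that skips step 2 leaves `napi->config` NULL and no notifier is installed, which is the buggy case the review discusses warning about.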
Not sure, but maybe the above should be documented somewhere like
Documentation/networking/napi.rst or similar?
Maybe that's too nit-picky, though, since the per-NAPI config stuff
never made it into the docs (I'll propose a patch to fix that).
> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
> ---
> include/linux/netdevice.h | 14 +++++++--
> net/core/dev.c | 62 +++++++++++++++++++++++++++++++--------
> 2 files changed, 61 insertions(+), 15 deletions(-)
[...]
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 33e84477c9c2..4cde7ac31e74 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
[...]
> @@ -6968,17 +6983,28 @@ void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
> {
> int rc;
>
> - /* Remove existing rmap entries */
> - if (napi->dev->rx_cpu_rmap_auto &&
> + /* Remove existing resources */
> + if ((napi->dev->rx_cpu_rmap_auto || napi->dev->irq_affinity_auto) &&
> napi->irq != irq && napi->irq > 0)
> irq_set_affinity_notifier(napi->irq, NULL);
>
> napi->irq = irq;
> - if (irq > 0) {
> + if (irq < 0)
> + return;
> +
> + if (napi->dev->rx_cpu_rmap_auto) {
> rc = napi_irq_cpu_rmap_add(napi, irq);
> if (rc)
> netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
> rc);
> + } else if (napi->config && napi->dev->irq_affinity_auto) {
> + napi->notify.notify = netif_napi_irq_notify;
> + napi->notify.release = netif_napi_affinity_release;
> +
> + rc = irq_set_affinity_notifier(irq, &napi->notify);
> + if (rc)
> + netdev_warn(napi->dev, "Unable to set IRQ notifier (%d)\n",
> + rc);
> }
Should there be a WARN_ON or WARN_ON_ONCE in here somewhere if the
driver calls netif_napi_set_irq_locked but did not link NAPI config
with a call to netif_napi_add_config?
It seems like in that case the driver is buggy and a warning might
be helpful.
* Re: [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
2025-02-04 22:43 ` Joe Damato
@ 2025-02-05 15:20 ` Ahmed Zaki
2025-02-07 2:33 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-05 15:20 UTC (permalink / raw)
To: Joe Damato, netdev, intel-wired-lan, andrew+netdev, edumazet,
kuba, horms, pabeni, davem, michael.chan, tariqt,
anthony.l.nguyen, przemyslaw.kitszel, shayd, akpm, shayagr,
kalesh-anakkur.purayil
On 2025-02-04 3:43 p.m., Joe Damato wrote:
> On Tue, Feb 04, 2025 at 03:06:19PM -0700, Ahmed Zaki wrote:
>> A common task for most drivers is to remember the user-set CPU affinity
>> for their IRQs. On each netdev reset, the driver should re-assign the
>> user's settings to the IRQs.
>>
>> Add CPU affinity mask to napi_config. To delegate the CPU affinity
>> management to the core, drivers must:
>> 1 - set the new netdev flag "irq_affinity_auto":
>> netif_enable_irq_affinity(netdev)
>> 2 - create the napi with persistent config:
>> netif_napi_add_config()
>> 3 - bind an IRQ to the napi instance: netif_napi_set_irq()
>>
>> the core will then make sure to re-apply the user-set affinity to the
>> napi's IRQ.
>>
>> The default IRQ mask is set to one CPU, starting from the closest NUMA
>> node.
>
> Not sure, but maybe the above should be documented somewhere like
> Documentation/networking/napi.rst or similar?
>
> Maybe that's too nit-picky, though, since the per-NAPI config stuff
> never made it into the docs (I'll propose a patch to fix that).
Yeah, and not all of the API is there (like netif_napi_set_irq()).
>
>> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
>> ---
>> include/linux/netdevice.h | 14 +++++++--
>> net/core/dev.c | 62 +++++++++++++++++++++++++++++++--------
>> 2 files changed, 61 insertions(+), 15 deletions(-)
>
> [...]
>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 33e84477c9c2..4cde7ac31e74 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>
> [...]
>
>> @@ -6968,17 +6983,28 @@ void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
>> {
>> int rc;
>>
>> - /* Remove existing rmap entries */
>> - if (napi->dev->rx_cpu_rmap_auto &&
>> + /* Remove existing resources */
>> + if ((napi->dev->rx_cpu_rmap_auto || napi->dev->irq_affinity_auto) &&
>> napi->irq != irq && napi->irq > 0)
>> irq_set_affinity_notifier(napi->irq, NULL);
>>
>> napi->irq = irq;
>> - if (irq > 0) {
>> + if (irq < 0)
>> + return;
>> +
>> + if (napi->dev->rx_cpu_rmap_auto) {
>> rc = napi_irq_cpu_rmap_add(napi, irq);
>> if (rc)
>> netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
>> rc);
>> + } else if (napi->config && napi->dev->irq_affinity_auto) {
>> + napi->notify.notify = netif_napi_irq_notify;
>> + napi->notify.release = netif_napi_affinity_release;
>> +
>> + rc = irq_set_affinity_notifier(irq, &napi->notify);
>> + if (rc)
>> + netdev_warn(napi->dev, "Unable to set IRQ notifier (%d)\n",
>> + rc);
>> }
>
> Should there be a WARN_ON or WARN_ON_ONCE in here somewhere if the
> driver calls netif_napi_set_irq_locked but did not link NAPI config
> with a call to netif_napi_add_config?
>
> It seems like in that case the driver is buggy and a warning might
> be helpful.
>
I think that is a good idea; if there is a new version, I can add this in
the second part of the if:
if (WARN_ON_ONCE(!napi->config))
return;
* Re: [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (4 preceding siblings ...)
2025-02-04 22:06 ` [PATCH net-next v7 5/5] idpf: " Ahmed Zaki
@ 2025-02-07 0:24 ` Joe Damato
2025-02-07 18:47 ` Jakub Kicinski
6 siblings, 0 replies; 15+ messages in thread
From: Joe Damato @ 2025-02-07 0:24 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, kuba, horms,
pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, shayd, akpm, shayagr, kalesh-anakkur.purayil
On Tue, Feb 04, 2025 at 03:06:17PM -0700, Ahmed Zaki wrote:
> Drivers usually need to re-apply the user-set IRQ affinity to their IRQs
> after reset. However, since there can be only one IRQ affinity notifier
> for each IRQ, registering IRQ notifiers conflicts with the ARFS rmap
> management in the core (which also registers separate IRQ affinity
> notifiers).
>
> Move the IRQ affinity management to the napi struct. This way we can have
> a unified IRQ notifier to re-apply the user-set affinity and also manage
> the ARFS rmaps. The first patch moves the ARFS rmap management to the core.
> The second patch adds the IRQ affinity mask to napi_config and re-applies
> the mask after reset. Patches 3-5 use the new API for bnxt, ice and idpf
> drivers.
If there's another version maybe adding this to netdevsim might be
good?
Was just thinking that if one day in the distant future netdev-genl
was extended to expose the per NAPI affinity mask, a test could
probably be written.
* Re: [PATCH net-next v7 1/5] net: move ARFS rmap management to core
2025-02-04 22:06 ` [PATCH net-next v7 1/5] net: move ARFS rmap management to core Ahmed Zaki
@ 2025-02-07 2:29 ` Jakub Kicinski
2025-02-10 15:04 ` Ahmed Zaki
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-07 2:29 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil,
David Arinzon
On Tue, 4 Feb 2025 15:06:18 -0700 Ahmed Zaki wrote:
> +void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
> +{
> + int rc;
> +
> + /* Remove existing rmap entries */
> + if (napi->dev->rx_cpu_rmap_auto &&
> + napi->irq != irq && napi->irq > 0)
this condition gets a bit hairy by the end of the series.
could you add a napi state bit that indicates that a notifier is
installed? Then here:
if (napi->irq == irq)
return;
if (test_and_clear_bit(NAPI_STATE_HAS_NOTIFIER, &napi->state))
irq_set_affinity_notifier(napi->irq, NULL);
if (irq < 0)
return;
And you can similarly simplify napi_disable_locked().
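The suggested state-bit flow can be mocked in user space as follows. `test_and_clear_bit()`, `set_bit()`, and the notifier-removal call are plain C stand-ins written for this sketch (assumptions, not the kernel helpers); only the control flow follows the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>

#define NAPI_STATE_HAS_NOTIFIER 0	/* mock state bit number */

struct napi_struct { unsigned long state; int irq; };

/* Minimal stand-ins for the kernel bitops. */
static bool test_and_clear_bit(int nr, unsigned long *addr)
{
	bool old = (*addr >> nr) & 1UL;

	*addr &= ~(1UL << nr);
	return old;
}

static void set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static int notifiers_removed;	/* counts irq_set_affinity_notifier(irq, NULL) */

static void irq_set_affinity_notifier_null(int irq)
{
	(void)irq;
	notifiers_removed++;
}

static void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
{
	if (napi->irq == irq)
		return;
	/* One bit test replaces the rx_cpu_rmap_auto/irq_affinity_auto checks. */
	if (test_and_clear_bit(NAPI_STATE_HAS_NOTIFIER, &napi->state))
		irq_set_affinity_notifier_null(napi->irq);
	napi->irq = irq;
	if (irq < 0)
		return;
	/* Install the notifier or rmap entry here, then record it: */
	set_bit(NAPI_STATE_HAS_NOTIFIER, &napi->state);
}
```

Setting the same IRQ twice is a no-op, and passing -1 removes the notifier exactly once, regardless of which feature installed it.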
Speaking of which, why do the auto-removal in napi_disable()
rather than netif_napi_del() ? We don't reinstall on napi_enable()
and doing a disable() + enable() is fairly common during driver
reconfig.
> + irq_set_affinity_notifier(napi->irq, NULL);
> +
> + napi->irq = irq;
> + if (irq > 0) {
> + rc = napi_irq_cpu_rmap_add(napi, irq);
> + if (rc)
> + netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
nit: not sure I'd grasp this message as a user, maybe:
"Unable to install aRFS CPU to Rx queue mapping"
? Not great either, I guess.
* Re: [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
2025-02-05 15:20 ` Ahmed Zaki
@ 2025-02-07 2:33 ` Jakub Kicinski
0 siblings, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-07 2:33 UTC (permalink / raw)
To: Ahmed Zaki
Cc: Joe Damato, netdev, intel-wired-lan, andrew+netdev, edumazet,
horms, pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, shayd, akpm, shayagr, kalesh-anakkur.purayil
On Wed, 5 Feb 2025 08:20:20 -0700 Ahmed Zaki wrote:
> >> + if (napi->dev->rx_cpu_rmap_auto) {
> >> rc = napi_irq_cpu_rmap_add(napi, irq);
> >> if (rc)
> >> netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
> >> rc);
> >> + } else if (napi->config && napi->dev->irq_affinity_auto) {
> >> + napi->notify.notify = netif_napi_irq_notify;
> >> + napi->notify.release = netif_napi_affinity_release;
> >> +
> >> + rc = irq_set_affinity_notifier(irq, &napi->notify);
> >> + if (rc)
> >> + netdev_warn(napi->dev, "Unable to set IRQ notifier (%d)\n",
> >> + rc);
> >> }
> >
> > Should there be a WARN_ON or WARN_ON_ONCE in here somewhere if the
> > driver calls netif_napi_set_irq_locked but did not link NAPI config
> > with a call to netif_napi_add_config?
> >
> > It seems like in that case the driver is buggy and a warning might
> > be helpful.
> >
>
> I think that is a good idea, if there is a new version I can add this in
> the second part of the if:
>
>
> if (WARN_ON_ONCE(!napi->config))
> return;
To be clear, this will make it illegal to set IRQ on a NAPI instance
before it's listed. Probably for the best if we also have auto-remove
in netif_napi_del().
* Re: [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config
2025-02-04 22:06 ` [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
2025-02-04 22:43 ` Joe Damato
@ 2025-02-07 2:37 ` Jakub Kicinski
1 sibling, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-07 2:37 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil
On Tue, 4 Feb 2025 15:06:19 -0700 Ahmed Zaki wrote:
> + * @irq_affinity_auto: driver wants the core to manage the IRQ affinity.
"manage" is probably too strong? "store" or "remember" ?
Your commit message explains it quite nicely.
> + * Set by netif_enable_irq_affinity(), then driver must
> + * create persistent napi by netif_napi_add_config()
> + * and finally bind napi to IRQ (netif_napi_set_irq).
> + *
> * @rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap.
> * Set by calling netif_enable_cpu_rmap().
> *
> @@ -2402,6 +2406,7 @@ struct net_device {
> struct lock_class_key *qdisc_tx_busylock;
> bool proto_down;
> bool threaded;
> + bool irq_affinity_auto;
> bool rx_cpu_rmap_auto;
>
> /* priv_flags_slow, ungrouped to save space */
> @@ -2662,6 +2667,11 @@ static inline void netdev_set_ml_priv(struct net_device *dev,
> dev->ml_priv_type = type;
> }
>
> +static inline void netif_enable_irq_affinity(struct net_device *dev)
Similar here, "enable affinity" is a bit strong.
netif_remember_irq_affinity() would be more accurate IMHO
> +{
> + dev->irq_affinity_auto = true;
> +}
* Re: [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (5 preceding siblings ...)
2025-02-07 0:24 ` [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Joe Damato
@ 2025-02-07 18:47 ` Jakub Kicinski
6 siblings, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-07 18:47 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil
On Tue, 4 Feb 2025 15:06:17 -0700 Ahmed Zaki wrote:
> Drivers usually need to re-apply the user-set IRQ affinity to their IRQs
> after reset. However, since there can be only one IRQ affinity notifier
> for each IRQ, registering IRQ notifiers conflicts with the ARFS rmap
> management in the core (which also registers separate IRQ affinity
> notifiers).
>
> Move the IRQ affinity management to the napi struct. This way we can have
> a unified IRQ notifier to re-apply the user-set affinity and also manage
> the ARFS rmaps. The first patch moves the ARFS rmap management to the core.
> The second patch adds the IRQ affinity mask to napi_config and re-applies
> the mask after reset. Patches 3-5 use the new API for bnxt, ice and idpf
> drivers.
Hi Ahmed!
I put together a selftest for maintaining the affinity:
https://github.com/kuba-moo/linux/commit/de7d2475750ac05b6e414d7e5201e354b05cf146
It depends on a couple of selftest infra patches (in that branch)
which I just posted to the list. But if you'd like you can use
it against your drivers.
* Re: [PATCH net-next v7 1/5] net: move ARFS rmap management to core
2025-02-07 2:29 ` Jakub Kicinski
@ 2025-02-10 15:04 ` Ahmed Zaki
2025-02-11 0:13 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Ahmed Zaki @ 2025-02-10 15:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil,
David Arinzon
On 2025-02-06 7:29 p.m., Jakub Kicinski wrote:
> Speaking of which, why do the auto-removal in napi_disable()
> rather than netif_napi_del() ? We don't reinstall on napi_enable()
> and doing a disable() + enable() is fairly common during driver
> reconfig.
>
The patch does not re-install the notifiers in napi_add() either; they are
installed in set_irq():
napi_add_config() -> napi_set_irq() -> napi_enable()
so napi_disable or napi_del seemed both OK to me.
However, I moved the notifier auto-removal to napi_del() and did some
testing on ice, but it seems the driver does not delete the napis on
"ip link down"; it only disables them, and that generates warnings on
free_irq().
So is this a bug? Do we need to ask drivers to disable __and__ delete
napis before freeing the IRQs?
If not, then we have to keep the notifier auto-removal in napi_disable().
* Re: [PATCH net-next v7 1/5] net: move ARFS rmap management to core
2025-02-10 15:04 ` Ahmed Zaki
@ 2025-02-11 0:13 ` Jakub Kicinski
0 siblings, 0 replies; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-11 0:13 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, horms, pabeni,
davem, michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, shayagr, kalesh-anakkur.purayil,
David Arinzon
On Mon, 10 Feb 2025 08:04:43 -0700 Ahmed Zaki wrote:
> On 2025-02-06 7:29 p.m., Jakub Kicinski wrote:
>
> > Speaking of which, why do the auto-removal in napi_disable()
> > rather than netif_napi_del() ? We don't reinstall on napi_enable()
> > and doing a disable() + enable() is fairly common during driver
> > reconfig.
> >
>
> The patch does not re-install the notifiers in napi_add either, they are
> installed in set_irq() :
>
> napi_add_config() -> napi_set_irq() -> napi_enable()
>
> so napi_disable or napi_del seemed both OK to me.
>
> However, I moved notifier auto-removal to npi_del() and did some testing
> on ice but it seems the driver does not delete napi on "ip link down"
> and that generates warnings on free_irq(). It only disables the napis.
>
> So is this a bug? Do we need to ask drivers to disable __and__ delete
> napis before freeing the IRQs?
>
> If not, then we have to keep notifier aut-removal in napi_diasable().
If the driver releases the IRQ but keeps the NAPI instance, I would have
expected it to call:
napi_set_irq(napi, -1);
before freeing the IRQ. Otherwise the NAPI instance will "point" to
a freed IRQ.
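The teardown order described here can be sketched with mocked helpers. All names and bodies below are user-space stand-ins written for this sketch; the `assert()` in the mocked `free_irq()` models the kernel warning that fires when an affinity notifier is still registered for the IRQ being freed.

```c
#include <assert.h>
#include <stdbool.h>

struct napi_struct { int irq; bool has_notifier; };

/* Mock of the core helper: detaches any old notifier, then stores the
 * new IRQ (or -1 for "no IRQ"). */
static void netif_napi_set_irq(struct napi_struct *napi, int irq)
{
	if (napi->has_notifier)
		napi->has_notifier = false;	/* core removes old notifier */
	napi->irq = irq;
	if (irq > 0)
		napi->has_notifier = true;
}

/* Mock of free_irq(): the real one warns if a notifier is still
 * registered; modeled here as an assertion. */
static void free_irq(int irq, struct napi_struct *napi)
{
	(void)irq;
	assert(!napi->has_notifier);
}

/* Driver teardown in the order suggested above. */
static void driver_teardown(struct napi_struct *napi)
{
	int irq = napi->irq;

	netif_napi_set_irq(napi, -1);	/* detach the notifier first */
	free_irq(irq, napi);		/* now nothing points at the IRQ */
}
```

Reversing the two calls in `driver_teardown()` would trip the assertion, mirroring the free_irq() warnings seen in the ice testing described earlier in the thread.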
end of thread, other threads: [~2025-02-11 0:13 UTC | newest]
Thread overview: 15+ messages
2025-02-04 22:06 [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 1/5] net: move ARFS rmap management to core Ahmed Zaki
2025-02-07 2:29 ` Jakub Kicinski
2025-02-10 15:04 ` Ahmed Zaki
2025-02-11 0:13 ` Jakub Kicinski
2025-02-04 22:06 ` [PATCH net-next v7 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
2025-02-04 22:43 ` Joe Damato
2025-02-05 15:20 ` Ahmed Zaki
2025-02-07 2:33 ` Jakub Kicinski
2025-02-07 2:37 ` Jakub Kicinski
2025-02-04 22:06 ` [PATCH net-next v7 3/5] bnxt: use napi's irq affinity Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 4/5] ice: " Ahmed Zaki
2025-02-04 22:06 ` [PATCH net-next v7 5/5] idpf: " Ahmed Zaki
2025-02-07 0:24 ` [PATCH net-next v7 0/5] net: napi: add CPU affinity to napi->config Joe Damato
2025-02-07 18:47 ` Jakub Kicinski