* [PATCH net-next v2 1/8] net: napi: add irq_flags to napi struct
From: Ahmed Zaki @ 2024-12-18 16:58 UTC
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Add irq_flags to the napi struct. This will allow the drivers to choose
how the core handles the IRQ assigned to the napi via
netif_napi_set_irq().
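No flag values are defined yet; all callers in this patch pass 0. Later
patches in the series introduce flags, so a converted driver ends up
with something like this (sketch, using the ARFS flag added in patch 2):

	netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq,
			   NAPIF_IRQ_ARFS_RMAP);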
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +-
drivers/net/ethernet/broadcom/tg3.c | 2 +-
drivers/net/ethernet/google/gve/gve_utils.c | 2 +-
drivers/net/ethernet/intel/e1000/e1000_main.c | 2 +-
drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
drivers/net/ethernet/intel/ice/ice_lib.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 4 ++--
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 3 ++-
include/linux/netdevice.h | 6 ++----
net/core/dev.c | 9 ++++++++-
12 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index c1295dfad0d0..4898c8be78ad 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1712,7 +1712,7 @@ static int ena_request_io_irq(struct ena_adapter *adapter)
for (i = 0; i < io_queue_count; i++) {
irq_idx = ENA_IO_IRQ_IDX(i);
irq = &adapter->irq_tbl[irq_idx];
- netif_napi_set_irq(&adapter->ena_napi[i].napi, irq->vector);
+ netif_napi_set_irq(&adapter->ena_napi[i].napi, irq->vector, 0);
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index b86f980fa7ea..4763c6300bd3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11225,7 +11225,7 @@ static int bnxt_request_irq(struct bnxt *bp)
if (rc)
break;
- netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector);
+ netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector, 0);
irq->requested = 1;
if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 9cc8db10a8d6..0d6383804270 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7447,7 +7447,7 @@ static void tg3_napi_init(struct tg3 *tp)
for (i = 0; i < tp->irq_cnt; i++) {
netif_napi_add(tp->dev, &tp->napi[i].napi,
i ? tg3_poll_msix : tg3_poll);
- netif_napi_set_irq(&tp->napi[i].napi, tp->napi[i].irq_vec);
+ netif_napi_set_irq(&tp->napi[i].napi, tp->napi[i].irq_vec, 0);
}
}
diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c
index 30fef100257e..2657e583f5c6 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.c
+++ b/drivers/net/ethernet/google/gve/gve_utils.c
@@ -111,7 +111,7 @@ void gve_add_napi(struct gve_priv *priv, int ntfy_idx,
struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
netif_napi_add(priv->dev, &block->napi, gve_poll);
- netif_napi_set_irq(&block->napi, block->irq);
+ netif_napi_set_irq(&block->napi, block->irq, 0);
}
void gve_remove_napi(struct gve_priv *priv, int ntfy_idx)
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 3f089c3d47b2..a83af159837a 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -1394,7 +1394,7 @@ int e1000_open(struct net_device *netdev)
/* From here on the code is the same as e1000_up() */
clear_bit(__E1000_DOWN, &adapter->flags);
- netif_napi_set_irq(&adapter->napi, adapter->pdev->irq);
+ netif_napi_set_irq(&adapter->napi, adapter->pdev->irq, 0);
napi_enable(&adapter->napi);
netif_queue_set_napi(netdev, 0, NETDEV_QUEUE_TYPE_RX, &adapter->napi);
netif_queue_set_napi(netdev, 0, NETDEV_QUEUE_TYPE_TX, &adapter->napi);
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 286155efcedf..8fc5603ed962 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4676,7 +4676,7 @@ int e1000e_open(struct net_device *netdev)
else
irq = adapter->pdev->irq;
- netif_napi_set_irq(&adapter->napi, irq);
+ netif_napi_set_irq(&adapter->napi, irq, 0);
napi_enable(&adapter->napi);
netif_queue_set_napi(netdev, 0, NETDEV_QUEUE_TYPE_RX, &adapter->napi);
netif_queue_set_napi(netdev, 0, NETDEV_QUEUE_TYPE_TX, &adapter->napi);
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index a7d45a8ce7ac..ff91e70f596f 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2735,7 +2735,7 @@ void ice_vsi_set_napi_queues(struct ice_vsi *vsi)
ice_for_each_q_vector(vsi, v_idx) {
struct ice_q_vector *q_vector = vsi->q_vectors[v_idx];
- netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq);
+ netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq, 0);
}
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 0e92956e84cf..b8531283e3ac 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -150,7 +150,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
case TX:
cq->mcq.comp = mlx4_en_tx_irq;
netif_napi_add_tx(cq->dev, &cq->napi, mlx4_en_poll_tx_cq);
- netif_napi_set_irq(&cq->napi, irq);
+ netif_napi_set_irq(&cq->napi, irq, 0);
napi_enable(&cq->napi);
netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_TX, &cq->napi);
break;
@@ -158,7 +158,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
cq->mcq.comp = mlx4_en_rx_irq;
netif_napi_add_config(cq->dev, &cq->napi, mlx4_en_poll_rx_cq,
cq_idx);
- netif_napi_set_irq(&cq->napi, irq);
+ netif_napi_set_irq(&cq->napi, irq, 0);
napi_enable(&cq->napi);
netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_RX, &cq->napi);
break;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index dd16d73000c3..58b8313f4c5a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2733,7 +2733,7 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
c->lag_port = mlx5e_enumerate_lag_port(mdev, ix);
netif_napi_add_config(netdev, &c->napi, mlx5e_napi_poll, ix);
- netif_napi_set_irq(&c->napi, irq);
+ netif_napi_set_irq(&c->napi, irq, 0);
err = mlx5e_open_queues(c, params, cparam);
if (unlikely(err))
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
index b5050fabe8fe..6ca91ce85d48 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
@@ -1227,7 +1227,8 @@ static int fbnic_alloc_napi_vector(struct fbnic_dev *fbd, struct fbnic_net *fbn,
/* Record IRQ to NAPI struct */
netif_napi_set_irq(&nv->napi,
- pci_irq_vector(to_pci_dev(fbd->dev), nv->v_idx));
+ pci_irq_vector(to_pci_dev(fbd->dev), nv->v_idx),
+ 0);
/* Tie nv back to PCIe dev */
nv->dev = fbd->dev;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2593019ad5b1..ca91b6662bde 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -392,6 +392,7 @@ struct napi_struct {
struct list_head dev_list;
struct hlist_node napi_hash_node;
int irq;
+ unsigned long irq_flags;
int index;
struct napi_config *config;
};
@@ -2671,10 +2672,7 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
enum netdev_queue_type type,
struct napi_struct *napi);
-static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
-{
- napi->irq = irq;
-}
+void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags);
/* Default NAPI poll() weight
* Device drivers are strongly advised to not use bigger value
diff --git a/net/core/dev.c b/net/core/dev.c
index c7f3dea3e0eb..88a7d4b6e71b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6705,6 +6705,13 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
}
EXPORT_SYMBOL(netif_queue_set_napi);
+void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
+{
+ napi->irq = irq;
+ napi->irq_flags = flags;
+}
+EXPORT_SYMBOL(netif_napi_set_irq);
+
static void napi_restore_config(struct napi_struct *n)
{
n->defer_hard_irqs = n->config->defer_hard_irqs;
@@ -6770,7 +6777,7 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
*/
if (dev->threaded && napi_kthread_create(napi))
dev->threaded = false;
- netif_napi_set_irq(napi, -1);
+ netif_napi_set_irq(napi, -1, 0);
}
EXPORT_SYMBOL(netif_napi_add_weight);
--
2.43.0
* Re: [PATCH net-next v2 1/8] net: napi: add irq_flags to napi struct
From: Jakub Kicinski @ 2024-12-20 3:34 UTC
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On Wed, 18 Dec 2024 09:58:36 -0700 Ahmed Zaki wrote:
> Add irq_flags to the napi struct. This will allow the drivers to choose
> how the core handles the IRQ assigned to the napi via
> netif_napi_set_irq().
I haven't read all the code, but I think the flag should be for the
netdev as a whole, not NAPI by NAPI. In fact you can combine it with
allocating the map, too.
int netif_enable_cpu_rmap(dev, num_queues)
{
#ifdef CONFIG_RFS_ACCEL
WARN_ON(dev->rx_cpu_rmap);
dev->rx_cpu_rmap = alloc_irq_cpu_rmap(num_queues);
if ...
dev->rx_cpu_rmap_auto = 1;
return 0;
#endif
}
void netif_disable_cpu_rmap(dev)
{
dev->rx_cpu_rmap_auto = 0;
free_irq_cpu_rmap(dev->rx_cpu_rmap);
}
Then in the NAPI code you just:
void netif_napi_set_irq(...)
{
...
if (napi->dev->rx_cpu_rmap_auto) {
err = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq);
...
}
}
* Re: [PATCH net-next v2 1/8] net: napi: add irq_flags to napi struct
From: Ahmed Zaki @ 2024-12-20 14:50 UTC
To: Jakub Kicinski
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On 2024-12-19 8:34 p.m., Jakub Kicinski wrote:
> On Wed, 18 Dec 2024 09:58:36 -0700 Ahmed Zaki wrote:
>> Add irq_flags to the napi struct. This will allow the drivers to choose
>> how the core handles the IRQ assigned to the napi via
>> netif_napi_set_irq().
>
> I haven't read all the code, but I think the flag should be for the
> netdev as a whole, not NAPI by NAPI. In fact you can combine it with
> allocating the map, too.
>
> int netif_enable_cpu_rmap(dev, num_queues)
int netif_enable_cpu_rmap(dev, num_vectors)
> {
> #ifdef CONFIG_RFS_ACCEL
> WARN_ON(dev->rx_cpu_rmap);
>
> dev->rx_cpu_rmap = alloc_irq_cpu_rmap(num_queues);
> if ...
>
> dev->rx_cpu_rmap_auto = 1;
> return 0;
> #endif
> }
I was trying to avoid adding an extra function, but since this will
replace alloc_irq_cpu_rmap() I guess I can try. Maybe even use
dev->netdev_ops->ndo_rx_flow_steer
instead of dev->rx_cpu_rmap_auto.
I will keep the flag in patch 4 (NAPI_IRQ_AFFINITY) per NAPI since it is
used in netif_napi_set_irq().
Thanks for the review.
* [PATCH net-next v2 2/8] net: allow ARFS rmap management in core
From: Ahmed Zaki @ 2024-12-18 16:58 UTC
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Add a new napi->irq flag: NAPIF_IRQ_ARFS_RMAP. A driver can use the flag
when binding an irq to a napi:
netif_napi_set_irq(napi, irq, NAPIF_IRQ_ARFS_RMAP)
and the core will update the ARFS rmap with the assigned irq affinity.
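With this, the per-driver rmap population loops collapse to roughly the
following (a sketch based on the ena conversion below; error handling
omitted):

	/* the driver still allocates the map once ... */
	netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(num_io_queues);

	/* ... but the core now adds each IRQ when it is bound to a napi */
	netif_napi_set_irq(&ena_napi[i].napi, irq->vector,
			   NAPIF_IRQ_ARFS_RMAP);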
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 19 ++++---------
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 25 ++++++-----------
drivers/net/ethernet/intel/ice/ice_arfs.c | 10 +------
drivers/net/ethernet/intel/ice/ice_lib.c | 5 ++++
drivers/net/ethernet/qlogic/qede/qede_main.c | 28 +++++++++----------
drivers/net/ethernet/sfc/falcon/efx.c | 9 ++++++
drivers/net/ethernet/sfc/falcon/nic.c | 10 -------
drivers/net/ethernet/sfc/siena/efx_channels.c | 9 ++++++
drivers/net/ethernet/sfc/siena/nic.c | 10 -------
include/linux/netdevice.h | 12 ++++++++
net/core/dev.c | 14 ++++++++++
11 files changed, 77 insertions(+), 74 deletions(-)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 4898c8be78ad..752b1c61b610 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -165,23 +165,9 @@ int ena_xmit_common(struct ena_adapter *adapter,
static int ena_init_rx_cpu_rmap(struct ena_adapter *adapter)
{
#ifdef CONFIG_RFS_ACCEL
- u32 i;
- int rc;
-
adapter->netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(adapter->num_io_queues);
if (!adapter->netdev->rx_cpu_rmap)
return -ENOMEM;
- for (i = 0; i < adapter->num_io_queues; i++) {
- int irq_idx = ENA_IO_IRQ_IDX(i);
-
- rc = irq_cpu_rmap_add(adapter->netdev->rx_cpu_rmap,
- pci_irq_vector(adapter->pdev, irq_idx));
- if (rc) {
- free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
- adapter->netdev->rx_cpu_rmap = NULL;
- return rc;
- }
- }
#endif /* CONFIG_RFS_ACCEL */
return 0;
}
@@ -1712,7 +1698,12 @@ static int ena_request_io_irq(struct ena_adapter *adapter)
for (i = 0; i < io_queue_count; i++) {
irq_idx = ENA_IO_IRQ_IDX(i);
irq = &adapter->irq_tbl[irq_idx];
+#ifdef CONFIG_RFS_ACCEL
+ netif_napi_set_irq(&adapter->ena_napi[i].napi, irq->vector,
+ NAPIF_IRQ_ARFS_RMAP);
+#else
netif_napi_set_irq(&adapter->ena_napi[i].napi, irq->vector, 0);
+#endif
}
return rc;
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 4763c6300bd3..ac729a25ba52 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11192,11 +11192,8 @@ static void bnxt_free_irq(struct bnxt *bp)
static int bnxt_request_irq(struct bnxt *bp)
{
- int i, j, rc = 0;
+ int i, rc = 0;
unsigned long flags = 0;
-#ifdef CONFIG_RFS_ACCEL
- struct cpu_rmap *rmap;
-#endif
rc = bnxt_setup_int_mode(bp);
if (rc) {
@@ -11204,28 +11201,22 @@ static int bnxt_request_irq(struct bnxt *bp)
rc);
return rc;
}
-#ifdef CONFIG_RFS_ACCEL
- rmap = bp->dev->rx_cpu_rmap;
-#endif
- for (i = 0, j = 0; i < bp->cp_nr_rings; i++) {
+
+ for (i = 0; i < bp->cp_nr_rings; i++) {
int map_idx = bnxt_cp_num_to_irq_num(bp, i);
struct bnxt_irq *irq = &bp->irq_tbl[map_idx];
-#ifdef CONFIG_RFS_ACCEL
- if (rmap && bp->bnapi[i]->rx_ring) {
- rc = irq_cpu_rmap_add(rmap, irq->vector);
- if (rc)
- netdev_warn(bp->dev, "failed adding irq rmap for ring %d\n",
- j);
- j++;
- }
-#endif
rc = request_irq(irq->vector, irq->handler, flags, irq->name,
bp->bnapi[i]);
if (rc)
break;
+#ifdef CONFIG_RFS_ACCEL
+ netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector,
+ NAPIF_IRQ_ARFS_RMAP);
+#else
netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector, 0);
+#endif
irq->requested = 1;
if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c
index 7cee365cc7d1..54d51d218cae 100644
--- a/drivers/net/ethernet/intel/ice/ice_arfs.c
+++ b/drivers/net/ethernet/intel/ice/ice_arfs.c
@@ -590,14 +590,13 @@ void ice_free_cpu_rx_rmap(struct ice_vsi *vsi)
}
/**
- * ice_set_cpu_rx_rmap - setup CPU reverse map for each queue
+ * ice_set_cpu_rx_rmap - allocate CPU reverse map for a VSI
* @vsi: the VSI to be forwarded to
*/
int ice_set_cpu_rx_rmap(struct ice_vsi *vsi)
{
struct net_device *netdev;
struct ice_pf *pf;
- int i;
if (!vsi || vsi->type != ICE_VSI_PF)
return 0;
@@ -614,13 +613,6 @@ int ice_set_cpu_rx_rmap(struct ice_vsi *vsi)
if (unlikely(!netdev->rx_cpu_rmap))
return -EINVAL;
- ice_for_each_q_vector(vsi, i)
- if (irq_cpu_rmap_add(netdev->rx_cpu_rmap,
- vsi->q_vectors[i]->irq.virq)) {
- ice_free_cpu_rx_rmap(vsi);
- return -EINVAL;
- }
-
return 0;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index ff91e70f596f..7c0b2d8e86ba 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2735,7 +2735,12 @@ void ice_vsi_set_napi_queues(struct ice_vsi *vsi)
ice_for_each_q_vector(vsi, v_idx) {
struct ice_q_vector *q_vector = vsi->q_vectors[v_idx];
+#ifdef CONFIG_RFS_ACCEL
+ netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq,
+ NAPIF_IRQ_ARFS_RMAP);
+#else
netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq, 0);
+#endif
}
}
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 99df00c30b8c..27c987435242 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1944,12 +1944,24 @@ static void qede_napi_disable_remove(struct qede_dev *edev)
static void qede_napi_add_enable(struct qede_dev *edev)
{
+ struct qede_fastpath *fp;
int i;
/* Add NAPI objects */
for_each_queue(i) {
- netif_napi_add(edev->ndev, &edev->fp_array[i].napi, qede_poll);
- napi_enable(&edev->fp_array[i].napi);
+ fp = &edev->fp_array[i];
+ netif_napi_add(edev->ndev, &fp->napi, qede_poll);
+ napi_enable(&fp->napi);
+#ifdef CONFIG_RFS_ACCEL
+ if (edev->ndev->rx_cpu_rmap && (fp->type & QEDE_FASTPATH_RX)) {
+ netif_napi_set_irq(&fp->napi,
+ edev->int_info.msix[i].vector,
+ NAPIF_IRQ_ARFS_RMAP);
+ continue;
+ }
+#endif
+ netif_napi_set_irq(&fp->napi,
+ edev->int_info.msix[i].vector, 0);
}
}
@@ -1983,18 +1995,6 @@ static int qede_req_msix_irqs(struct qede_dev *edev)
}
for (i = 0; i < QEDE_QUEUE_CNT(edev); i++) {
-#ifdef CONFIG_RFS_ACCEL
- struct qede_fastpath *fp = &edev->fp_array[i];
-
- if (edev->ndev->rx_cpu_rmap && (fp->type & QEDE_FASTPATH_RX)) {
- rc = irq_cpu_rmap_add(edev->ndev->rx_cpu_rmap,
- edev->int_info.msix[i].vector);
- if (rc) {
- DP_ERR(edev, "Failed to add CPU rmap\n");
- qede_free_arfs(edev);
- }
- }
-#endif
rc = request_irq(edev->int_info.msix[i].vector,
qede_msix_fp_int, 0, edev->fp_array[i].name,
&edev->fp_array[i]);
diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c
index b07f7e4e2877..8c2f850d4639 100644
--- a/drivers/net/ethernet/sfc/falcon/efx.c
+++ b/drivers/net/ethernet/sfc/falcon/efx.c
@@ -2004,6 +2004,15 @@ static void ef4_init_napi_channel(struct ef4_channel *channel)
channel->napi_dev = efx->net_dev;
netif_napi_add(channel->napi_dev, &channel->napi_str, ef4_poll);
+
+ if (efx->interrupt_mode == EF4_INT_MODE_MSIX &&
+ channel->channel < efx->n_rx_channels)
+#ifdef CONFIG_RFS_ACCEL
+ netif_napi_set_irq(&channel->napi_str, channel->irq,
+ NAPIF_IRQ_ARFS_RMAP);
+#else
+ netif_napi_set_irq(&channel->napi_str, channel->irq, 0);
+#endif
}
static void ef4_init_napi(struct ef4_nic *efx)
diff --git a/drivers/net/ethernet/sfc/falcon/nic.c b/drivers/net/ethernet/sfc/falcon/nic.c
index a6304686bc90..fa31d83e64e4 100644
--- a/drivers/net/ethernet/sfc/falcon/nic.c
+++ b/drivers/net/ethernet/sfc/falcon/nic.c
@@ -115,16 +115,6 @@ int ef4_nic_init_interrupt(struct ef4_nic *efx)
goto fail2;
}
++n_irqs;
-
-#ifdef CONFIG_RFS_ACCEL
- if (efx->interrupt_mode == EF4_INT_MODE_MSIX &&
- channel->channel < efx->n_rx_channels) {
- rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
- channel->irq);
- if (rc)
- goto fail2;
- }
-#endif
}
return 0;
diff --git a/drivers/net/ethernet/sfc/siena/efx_channels.c b/drivers/net/ethernet/sfc/siena/efx_channels.c
index d120b3c83ac0..6fed4f7b311f 100644
--- a/drivers/net/ethernet/sfc/siena/efx_channels.c
+++ b/drivers/net/ethernet/sfc/siena/efx_channels.c
@@ -1321,6 +1321,15 @@ static void efx_init_napi_channel(struct efx_channel *channel)
channel->napi_dev = efx->net_dev;
netif_napi_add(channel->napi_dev, &channel->napi_str, efx_poll);
+
+ if (efx->interrupt_mode == EFX_INT_MODE_MSIX &&
+ channel->channel < efx->n_rx_channels)
+#ifdef CONFIG_RFS_ACCEL
+ netif_napi_set_irq(&channel->napi_str, channel->irq,
+ NAPIF_IRQ_ARFS_RMAP);
+#else
+ netif_napi_set_irq(&channel->napi_str, channel->irq, 0);
+#endif
}
void efx_siena_init_napi(struct efx_nic *efx)
diff --git a/drivers/net/ethernet/sfc/siena/nic.c b/drivers/net/ethernet/sfc/siena/nic.c
index 32fce70085e3..28ef6222395e 100644
--- a/drivers/net/ethernet/sfc/siena/nic.c
+++ b/drivers/net/ethernet/sfc/siena/nic.c
@@ -117,16 +117,6 @@ int efx_siena_init_interrupt(struct efx_nic *efx)
goto fail2;
}
++n_irqs;
-
-#ifdef CONFIG_RFS_ACCEL
- if (efx->interrupt_mode == EFX_INT_MODE_MSIX &&
- channel->channel < efx->n_rx_channels) {
- rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
- channel->irq);
- if (rc)
- goto fail2;
- }
-#endif
}
efx->irqs_hooked = true;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ca91b6662bde..0df419052434 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -354,6 +354,18 @@ struct napi_config {
unsigned int napi_id;
};
+enum {
+#ifdef CONFIG_RFS_ACCEL
+ NAPI_IRQ_ARFS_RMAP, /* Core handles RMAP updates */
+#endif
+};
+
+enum {
+#ifdef CONFIG_RFS_ACCEL
+ NAPIF_IRQ_ARFS_RMAP = BIT(NAPI_IRQ_ARFS_RMAP),
+#endif
+};
+
/*
* Structure for NAPI scheduling similar to tasklet but with weighting
*/
diff --git a/net/core/dev.c b/net/core/dev.c
index 88a7d4b6e71b..7c3abff48aea 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6707,8 +6707,22 @@ EXPORT_SYMBOL(netif_queue_set_napi);
void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
{
+ int rc;
+
napi->irq = irq;
napi->irq_flags = flags;
+
+#ifdef CONFIG_RFS_ACCEL
+ if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
+ rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq);
+ if (rc) {
+ netdev_warn(napi->dev, "Unable to update ARFS map (%d).\n",
+ rc);
+ free_irq_cpu_rmap(napi->dev->rx_cpu_rmap);
+ napi->dev->rx_cpu_rmap = NULL;
+ }
+ }
+#endif
}
EXPORT_SYMBOL(netif_napi_set_irq);
--
2.43.0
* Re: [Intel-wired-lan] [PATCH net-next v2 2/8] net: allow ARFS rmap management in core
From: kernel test robot @ 2024-12-18 19:56 UTC
To: Ahmed Zaki, netdev
Cc: oe-kbuild-all, intel-wired-lan, andrew+netdev, edumazet, kuba,
pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, jdamato, shayd, akpm, Ahmed Zaki
Hi Ahmed,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Ahmed-Zaki/net-napi-add-irq_flags-to-napi-struct/20241219-010125
base: net-next/main
patch link: https://lore.kernel.org/r/20241218165843.744647-3-ahmed.zaki%40intel.com
patch subject: [Intel-wired-lan] [PATCH net-next v2 2/8] net: allow ARFS rmap management in core
config: arm-randconfig-002-20241219 (https://download.01.org/0day-ci/archive/20241219/202412190318.OR90xHNu-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241219/202412190318.OR90xHNu-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412190318.OR90xHNu-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from net/core/dev.c:92:
include/linux/netdevice.h:361:1: error: empty enum is invalid
361 | };
| ^
include/linux/netdevice.h:367:1: error: empty enum is invalid
367 | };
| ^
net/core/dev.c: In function 'netif_napi_set_irq':
>> net/core/dev.c:6710:14: warning: unused variable 'rc' [-Wunused-variable]
6710 | int rc;
| ^~
vim +/rc +6710 net/core/dev.c
6707
6708 void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
6709 {
> 6710 int rc;
6711
6712 napi->irq = irq;
6713 napi->irq_flags = flags;
6714
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* [PATCH net-next v2 3/8] lib: cpu_rmap: allow passing a notifier callback
From: Ahmed Zaki @ 2024-12-18 16:58 UTC
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Allow the rmap users to pass a notifier callback function that can be
called instead of irq_cpu_rmap_notify().
Two modifications are made:
* make struct irq_glue visible in cpu_rmap.h
* pass a new "void *data" parameter that can be used by the callback
function.
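Existing callers keep the old behaviour by passing NULL for both new
arguments; a user that wants its own notifier passes the callback plus
the data it wants handed back through glue->data. A sketch:

	/* legacy: the default irq_cpu_rmap_notify() is used */
	rc = irq_cpu_rmap_add(rmap, irq, NULL, NULL);

	/* custom notifier, e.g. the one the core registers in patch 4 */
	rc = irq_cpu_rmap_add(rmap, irq, napi, netif_irq_cpu_rmap_notify);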
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 3 ++-
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/eq.c | 2 +-
.../net/ethernet/mellanox/mlx5/core/pci_irq.c | 2 +-
drivers/net/ethernet/sfc/nic.c | 2 +-
include/linux/cpu_rmap.h | 13 +++++++++++-
lib/cpu_rmap.c | 20 +++++++++----------
7 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index 9913952ccb42..e384b975b8af 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -1657,7 +1657,8 @@ static void enic_set_rx_cpu_rmap(struct enic *enic)
return;
for (i = 0; i < enic->rq_count; i++) {
res = irq_cpu_rmap_add(enic->netdev->rx_cpu_rmap,
- enic->msix_entry[i].vector);
+ enic->msix_entry[i].vector,
+ NULL, NULL);
if (unlikely(res)) {
enic_free_rx_cpu_rmap(enic);
return;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 43377a7b2426..3f732516c8ee 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -697,7 +697,7 @@ static int hns3_set_rx_cpu_rmap(struct net_device *netdev)
for (i = 0; i < priv->vector_num; i++) {
tqp_vector = &priv->tqp_vector[i];
ret = irq_cpu_rmap_add(netdev->rx_cpu_rmap,
- tqp_vector->vector_irq);
+ tqp_vector->vector_irq, NULL, NULL);
if (ret) {
hns3_free_rx_cpu_rmap(netdev);
return ret;
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index 9572a45f6143..d768a6a828c4 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -1243,7 +1243,7 @@ int mlx4_init_eq_table(struct mlx4_dev *dev)
}
err = irq_cpu_rmap_add(
- info->rmap, eq->irq);
+ info->rmap, eq->irq, NULL, NULL);
if (err)
mlx4_warn(dev, "Failed adding irq rmap\n");
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 7db9cab9bedf..4f2c4631aecb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -285,7 +285,7 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
if (i && rmap && *rmap) {
#ifdef CONFIG_RFS_ACCEL
- err = irq_cpu_rmap_add(*rmap, irq->map.virq);
+ err = irq_cpu_rmap_add(*rmap, irq->map.virq, NULL, NULL);
if (err)
goto err_irq_rmap;
#endif
diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c
index 80aa5e9c732a..e7c6c3002826 100644
--- a/drivers/net/ethernet/sfc/nic.c
+++ b/drivers/net/ethernet/sfc/nic.c
@@ -122,7 +122,7 @@ int efx_nic_init_interrupt(struct efx_nic *efx)
if (efx->interrupt_mode == EFX_INT_MODE_MSIX &&
channel->channel < efx->n_rx_channels) {
rc = irq_cpu_rmap_add(efx->net_dev->rx_cpu_rmap,
- channel->irq);
+ channel->irq, NULL, NULL);
if (rc)
goto fail2;
}
diff --git a/include/linux/cpu_rmap.h b/include/linux/cpu_rmap.h
index 20b5729903d7..48f89d19bdb9 100644
--- a/include/linux/cpu_rmap.h
+++ b/include/linux/cpu_rmap.h
@@ -11,6 +11,15 @@
#include <linux/gfp.h>
#include <linux/slab.h>
#include <linux/kref.h>
+#include <linux/interrupt.h>
+
+/* Glue between IRQ affinity notifiers and CPU rmaps */
+struct irq_glue {
+ struct irq_affinity_notify notify;
+ struct cpu_rmap *rmap;
+ void *data;
+ u16 index;
+};
/**
* struct cpu_rmap - CPU affinity reverse-map
@@ -61,6 +70,8 @@ static inline struct cpu_rmap *alloc_irq_cpu_rmap(unsigned int size)
extern void free_irq_cpu_rmap(struct cpu_rmap *rmap);
int irq_cpu_rmap_remove(struct cpu_rmap *rmap, int irq);
-extern int irq_cpu_rmap_add(struct cpu_rmap *rmap, int irq);
+extern int irq_cpu_rmap_add(struct cpu_rmap *rmap, int irq, void *data,
+ void (*notify)(struct irq_affinity_notify *notify,
+ const cpumask_t *mask));
#endif /* __LINUX_CPU_RMAP_H */
diff --git a/lib/cpu_rmap.c b/lib/cpu_rmap.c
index 4c348670da31..0c9c1078143d 100644
--- a/lib/cpu_rmap.c
+++ b/lib/cpu_rmap.c
@@ -220,14 +220,6 @@ int cpu_rmap_update(struct cpu_rmap *rmap, u16 index,
}
EXPORT_SYMBOL(cpu_rmap_update);
-/* Glue between IRQ affinity notifiers and CPU rmaps */
-
-struct irq_glue {
- struct irq_affinity_notify notify;
- struct cpu_rmap *rmap;
- u16 index;
-};
-
/**
* free_irq_cpu_rmap - free a CPU affinity reverse-map used for IRQs
* @rmap: Reverse-map allocated with alloc_irq_cpu_map(), or %NULL
@@ -300,6 +292,8 @@ EXPORT_SYMBOL(irq_cpu_rmap_remove);
* irq_cpu_rmap_add - add an IRQ to a CPU affinity reverse-map
* @rmap: The reverse-map
* @irq: The IRQ number
+ * @data: Caller data passed to the @notify callback
+ * @notify: Callback function to update the CPU-IRQ rmap
*
* This adds an IRQ affinity notifier that will update the reverse-map
* automatically.
@@ -307,16 +301,22 @@ EXPORT_SYMBOL(irq_cpu_rmap_remove);
* Must be called in process context, after the IRQ is allocated but
* before it is bound with request_irq().
*/
-int irq_cpu_rmap_add(struct cpu_rmap *rmap, int irq)
+int irq_cpu_rmap_add(struct cpu_rmap *rmap, int irq, void *data,
+ void (*notify)(struct irq_affinity_notify *notify,
+ const cpumask_t *mask))
{
struct irq_glue *glue = kzalloc(sizeof(*glue), GFP_KERNEL);
int rc;
if (!glue)
return -ENOMEM;
- glue->notify.notify = irq_cpu_rmap_notify;
+
+ if (!notify)
+ notify = irq_cpu_rmap_notify;
+ glue->notify.notify = notify;
glue->notify.release = irq_cpu_rmap_release;
glue->rmap = rmap;
+ glue->data = data;
cpu_rmap_get(rmap);
rc = cpu_rmap_add(rmap, glue);
if (rc < 0)
--
2.43.0
* [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Ahmed Zaki @ 2024-12-18 16:58 UTC
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
A common task for most drivers is to remember the user-set CPU affinity
of their IRQs. On each netdev reset, the driver should re-assign the
user's setting to its IRQs.
Add CPU affinity mask to napi->config. To delegate the CPU affinity
management to the core, drivers must:
1 - add a persistent napi config: netif_napi_add_config()
2 - bind an IRQ to the napi instance: netif_napi_set_irq() with the new
flag NAPIF_IRQ_AFFINITY
The core will then make sure to re-assign the user's affinity setting
to the napi's IRQ across netdev resets (see the sketch below).
The default affinity mask assigned to all IRQs is all online CPUs.
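Putting the two steps together, the opt-in looks roughly like this
(minimal sketch; the queue and poll names are placeholders):

	/* 1 - persistent per-queue config, so the mask survives resets */
	netif_napi_add_config(netdev, &q->napi, poll_fn, qid);

	/* 2 - let the core manage this IRQ's affinity */
	netif_napi_set_irq(&q->napi, irq, NAPIF_IRQ_AFFINITY);

napi_restore_config() then re-applies config->affinity_mask via
irq_set_affinity() each time the napi is brought back up.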
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
include/linux/netdevice.h | 5 +++
net/core/dev.c | 66 +++++++++++++++++++++++++++++++++++++--
2 files changed, 69 insertions(+), 2 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0df419052434..4fa047fad8fb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -351,6 +351,7 @@ struct napi_config {
u64 gro_flush_timeout;
u64 irq_suspend_timeout;
u32 defer_hard_irqs;
+ cpumask_t affinity_mask;
unsigned int napi_id;
};
@@ -358,12 +359,16 @@ enum {
#ifdef CONFIG_RFS_ACCEL
NAPI_IRQ_ARFS_RMAP, /* Core handles RMAP updates */
#endif
+ NAPI_IRQ_AFFINITY, /* Core manages IRQ affinity */
+ NAPI_IRQ_NORMAP /* Set by core (internal) */
};
enum {
#ifdef CONFIG_RFS_ACCEL
NAPIF_IRQ_ARFS_RMAP = BIT(NAPI_IRQ_ARFS_RMAP),
#endif
+ NAPIF_IRQ_AFFINITY = BIT(NAPI_IRQ_AFFINITY),
+ NAPIF_IRQ_NORMAP = BIT(NAPI_IRQ_NORMAP),
};
/*
diff --git a/net/core/dev.c b/net/core/dev.c
index 7c3abff48aea..84745cea03a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6705,8 +6705,44 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
}
EXPORT_SYMBOL(netif_queue_set_napi);
+static void
+netif_irq_cpu_rmap_notify(struct irq_affinity_notify *notify,
+ const cpumask_t *mask)
+{
+ struct irq_glue *glue =
+ container_of(notify, struct irq_glue, notify);
+ struct napi_struct *napi = glue->data;
+ unsigned int flags;
+ int rc;
+
+ flags = napi->irq_flags;
+
+ if (napi->config && flags & NAPIF_IRQ_AFFINITY)
+ cpumask_copy(&napi->config->affinity_mask, mask);
+
+#ifdef CONFIG_RFS_ACCEL
+ if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
+ rc = cpu_rmap_update(glue->rmap, glue->index, mask);
+ if (rc)
+ pr_warn("%s: update failed: %d\n",
+ __func__, rc);
+ }
+#endif
+}
+
+static void
+netif_napi_affinity_release(struct kref __always_unused *ref)
+{
+ struct irq_glue *glue =
+ container_of(ref, struct irq_glue, notify.kref);
+
+ kfree(glue);
+}
+
void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
{
+ struct irq_glue *glue = NULL;
+ bool glue_created;
int rc;
napi->irq = irq;
@@ -6714,15 +6750,29 @@ void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
#ifdef CONFIG_RFS_ACCEL
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
- rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq);
+ rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
+ netif_irq_cpu_rmap_notify);
if (rc) {
netdev_warn(napi->dev, "Unable to update ARFS map (%d).\n",
rc);
free_irq_cpu_rmap(napi->dev->rx_cpu_rmap);
napi->dev->rx_cpu_rmap = NULL;
+ } else {
+ glue_created = true;
}
}
#endif
+
+ if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
+ glue = kzalloc(sizeof(*glue), GFP_KERNEL);
+ if (!glue)
+ return;
+ glue->notify.notify = netif_irq_cpu_rmap_notify;
+ glue->notify.release = netif_napi_affinity_release;
+ glue->data = napi;
+ glue->rmap = NULL;
+ napi->irq_flags |= NAPIF_IRQ_NORMAP;
+ }
}
EXPORT_SYMBOL(netif_napi_set_irq);
@@ -6731,6 +6781,10 @@ static void napi_restore_config(struct napi_struct *n)
n->defer_hard_irqs = n->config->defer_hard_irqs;
n->gro_flush_timeout = n->config->gro_flush_timeout;
n->irq_suspend_timeout = n->config->irq_suspend_timeout;
+
+ if (n->irq > 0 && n->irq_flags & NAPIF_IRQ_AFFINITY)
+ irq_set_affinity(n->irq, &n->config->affinity_mask);
+
/* a NAPI ID might be stored in the config, if so use it. if not, use
* napi_hash_add to generate one for us. It will be saved to the config
* in napi_disable.
@@ -6747,6 +6801,11 @@ static void napi_save_config(struct napi_struct *n)
n->config->gro_flush_timeout = n->gro_flush_timeout;
n->config->irq_suspend_timeout = n->irq_suspend_timeout;
n->config->napi_id = n->napi_id;
+
+ if (n->irq > 0 &&
+ n->irq_flags & (NAPIF_IRQ_AFFINITY | NAPIF_IRQ_NORMAP))
+ irq_set_affinity_notifier(n->irq, NULL);
+
napi_hash_del(n);
}
@@ -11211,7 +11270,7 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
{
struct net_device *dev;
size_t napi_config_sz;
- unsigned int maxqs;
+ unsigned int maxqs, i;
BUG_ON(strlen(name) >= sizeof(dev->name));
@@ -11307,6 +11366,9 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
dev->napi_config = kvzalloc(napi_config_sz, GFP_KERNEL_ACCOUNT);
if (!dev->napi_config)
goto free_all;
+ for (i = 0; i < maxqs; i++)
+ cpumask_copy(&dev->napi_config[i].affinity_mask,
+ cpu_online_mask);
strscpy(dev->name, name);
dev->name_assign_type = name_assign_type;
--
2.43.0
* Re: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: kernel test robot @ 2024-12-18 20:16 UTC
To: Ahmed Zaki, netdev
Cc: llvm, oe-kbuild-all, intel-wired-lan, andrew+netdev, edumazet,
kuba, pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, jdamato, shayd, akpm, Ahmed Zaki
Hi Ahmed,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Ahmed-Zaki/net-napi-add-irq_flags-to-napi-struct/20241219-010125
base: net-next/main
patch link: https://lore.kernel.org/r/20241218165843.744647-5-ahmed.zaki%40intel.com
patch subject: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
config: arm-randconfig-001-20241219 (https://download.01.org/0day-ci/archive/20241219/202412190421.N2xtn20H-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241219/202412190421.N2xtn20H-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412190421.N2xtn20H-lkp@intel.com/
All warnings (new ones prefixed by >>):
net/core/dev.c:6716:6: warning: unused variable 'rc' [-Wunused-variable]
6716 | int rc;
| ^~
net/core/dev.c:6746:7: warning: unused variable 'rc' [-Wunused-variable]
6746 | int rc;
| ^~
>> net/core/dev.c:6766:7: warning: variable 'glue_created' is uninitialized when used here [-Wuninitialized]
6766 | if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
| ^~~~~~~~~~~~
net/core/dev.c:6745:19: note: initialize the variable 'glue_created' to silence this warning
6745 | bool glue_created;
| ^
| = 0
3 warnings generated.
vim +/glue_created +6766 net/core/dev.c
6765
> 6766 if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
6767 glue = kzalloc(sizeof(*glue), GFP_KERNEL);
6768 if (!glue)
6769 return;
6770 glue->notify.notify = netif_irq_cpu_rmap_notify;
6771 glue->notify.release = netif_napi_affinity_release;
6772 glue->data = napi;
6773 glue->rmap = NULL;
6774 napi->irq_flags |= NAPIF_IRQ_NORMAP;
6775 }
6776 }
6777 EXPORT_SYMBOL(netif_napi_set_irq);
6778
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: kernel test robot @ 2024-12-18 20:27 UTC
To: Ahmed Zaki, netdev
Cc: llvm, oe-kbuild-all, intel-wired-lan, andrew+netdev, edumazet,
kuba, pabeni, davem, michael.chan, tariqt, anthony.l.nguyen,
przemyslaw.kitszel, jdamato, shayd, akpm, Ahmed Zaki
Hi Ahmed,
kernel test robot noticed the following build warnings:
[auto build test WARNING on net-next/main]
url: https://github.com/intel-lab-lkp/linux/commits/Ahmed-Zaki/net-napi-add-irq_flags-to-napi-struct/20241219-010125
base: net-next/main
patch link: https://lore.kernel.org/r/20241218165843.744647-5-ahmed.zaki%40intel.com
patch subject: [Intel-wired-lan] [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
config: riscv-randconfig-001-20241219 (https://download.01.org/0day-ci/archive/20241219/202412190454.nwvp3hU2-lkp@intel.com/config)
compiler: clang version 16.0.6 (https://github.com/llvm/llvm-project 7cbf1a2591520c2491aa35339f227775f4d3adf6)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241219/202412190454.nwvp3hU2-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202412190454.nwvp3hU2-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> net/core/dev.c:6755:7: warning: variable 'glue_created' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (rc) {
^~
net/core/dev.c:6766:7: note: uninitialized use occurs here
if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
^~~~~~~~~~~~
net/core/dev.c:6755:3: note: remove the 'if' if its condition is always false
if (rc) {
^~~~~~~~~
>> net/core/dev.c:6752:6: warning: variable 'glue_created' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/core/dev.c:6766:7: note: uninitialized use occurs here
if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
^~~~~~~~~~~~
net/core/dev.c:6752:2: note: remove the 'if' if its condition is always true
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/dev.c:6752:6: warning: variable 'glue_created' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
^~~~~~~~~~~~~~~~~~~~~~
net/core/dev.c:6766:7: note: uninitialized use occurs here
if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
^~~~~~~~~~~~
net/core/dev.c:6752:6: note: remove the '&&' if its condition is always true
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
^~~~~~~~~~~~~~~~~~~~~~~~~
net/core/dev.c:6745:19: note: initialize the variable 'glue_created' to silence this warning
bool glue_created;
^
= 0
net/core/dev.c:4176:1: warning: unused function 'sch_handle_ingress' [-Wunused-function]
sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
^
net/core/dev.c:4183:1: warning: unused function 'sch_handle_egress' [-Wunused-function]
sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
^
net/core/dev.c:5440:19: warning: unused function 'nf_ingress' [-Wunused-function]
static inline int nf_ingress(struct sk_buff *skb, struct packet_type **pt_prev,
^
6 warnings generated.
Kconfig warnings: (for reference only)
WARNING: unmet direct dependencies detected for FB_IOMEM_HELPERS
Depends on [n]: HAS_IOMEM [=y] && FB_CORE [=n]
Selected by [m]:
- DRM_XE_DISPLAY [=y] && HAS_IOMEM [=y] && DRM [=m] && DRM_XE [=m] && DRM_XE [=m]=m [=m] && HAS_IOPORT [=y]
vim +6755 net/core/dev.c
8e5191fb19bffce Ahmed Zaki 2024-12-18 6741
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6742 void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6743 {
8e5191fb19bffce Ahmed Zaki 2024-12-18 6744 struct irq_glue *glue = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6745 bool glue_created;
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6746 int rc;
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6747
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6748 napi->irq = irq;
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6749 napi->irq_flags = flags;
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6750
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6751 #ifdef CONFIG_RFS_ACCEL
a274d2669a73ef7 Ahmed Zaki 2024-12-18 @6752 if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
8e5191fb19bffce Ahmed Zaki 2024-12-18 6753 rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
8e5191fb19bffce Ahmed Zaki 2024-12-18 6754 netif_irq_cpu_rmap_notify);
a274d2669a73ef7 Ahmed Zaki 2024-12-18 @6755 if (rc) {
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6756 netdev_warn(napi->dev, "Unable to update ARFS map (%d).\n",
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6757 rc);
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6758 free_irq_cpu_rmap(napi->dev->rx_cpu_rmap);
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6759 napi->dev->rx_cpu_rmap = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6760 } else {
8e5191fb19bffce Ahmed Zaki 2024-12-18 6761 glue_created = true;
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6762 }
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6763 }
a274d2669a73ef7 Ahmed Zaki 2024-12-18 6764 #endif
8e5191fb19bffce Ahmed Zaki 2024-12-18 6765
8e5191fb19bffce Ahmed Zaki 2024-12-18 6766 if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
8e5191fb19bffce Ahmed Zaki 2024-12-18 6767 glue = kzalloc(sizeof(*glue), GFP_KERNEL);
8e5191fb19bffce Ahmed Zaki 2024-12-18 6768 if (!glue)
8e5191fb19bffce Ahmed Zaki 2024-12-18 6769 return;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6770 glue->notify.notify = netif_irq_cpu_rmap_notify;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6771 glue->notify.release = netif_napi_affinity_release;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6772 glue->data = napi;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6773 glue->rmap = NULL;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6774 napi->irq_flags |= NAPIF_IRQ_NORMAP;
8e5191fb19bffce Ahmed Zaki 2024-12-18 6775 }
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6776 }
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6777 EXPORT_SYMBOL(netif_napi_set_irq);
001dc6db21f4bfe Ahmed Zaki 2024-12-18 6778
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Jakub Kicinski @ 2024-12-20 3:42 UTC
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
> + if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
> + glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> + if (!glue)
> + return;
> + glue->notify.notify = netif_irq_cpu_rmap_notify;
> + glue->notify.release = netif_napi_affinity_release;
> + glue->data = napi;
> + glue->rmap = NULL;
> + napi->irq_flags |= NAPIF_IRQ_NORMAP;
Why allocate the glue? is it not possible to add the fields:
struct irq_affinity_notify notify;
u16 index;
to struct napi_struct ?
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Ahmed Zaki @ 2024-12-20 14:51 UTC
To: Jakub Kicinski
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
> On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
>> + if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
>> + glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>> + if (!glue)
>> + return;
>> + glue->notify.notify = netif_irq_cpu_rmap_notify;
>> + glue->notify.release = netif_napi_affinity_release;
>> + glue->data = napi;
>> + glue->rmap = NULL;
>> + napi->irq_flags |= NAPIF_IRQ_NORMAP;
>
> Why allocate the glue? is it not possible to add the fields:
>
> struct irq_affinity_notify notify;
> u16 index;
>
> to struct napi_struct ?
In the first branch of "if", the cb function netif_irq_cpu_rmap_notify()
is also passed to irq_cpu_rmap_add() where the irq notifier is embedded
in "struct irq_glue".
I think this cannot be changed as long as some drivers are directly
calling irq_cpu_rmap_add() instead of the proposed API.
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Jakub Kicinski @ 2024-12-20 17:23 UTC
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On Fri, 20 Dec 2024 07:51:09 -0700 Ahmed Zaki wrote:
> On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
> > On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
> >> + if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
> >> + glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> >> + if (!glue)
> >> + return;
> >> + glue->notify.notify = netif_irq_cpu_rmap_notify;
> >> + glue->notify.release = netif_napi_affinity_release;
> >> + glue->data = napi;
> >> + glue->rmap = NULL;
> >> + napi->irq_flags |= NAPIF_IRQ_NORMAP;
> >
> > Why allocate the glue? is it not possible to add the fields:
> >
> > struct irq_affinity_notify notify;
> > u16 index;
> >
> > to struct napi_struct ?
>
> In the first branch of "if", the cb function netif_irq_cpu_rmap_notify()
> is also passed to irq_cpu_rmap_add() where the irq notifier is embedded
> in "struct irq_glue".
I don't understand what you're trying to say, could you rephrase?
> I think this cannot be changed as long as some drivers are directly
> calling irq_cpu_rmap_add() instead of the proposed API.
Drivers which are not converted shouldn't matter if we have our own
notifier and call cpu_rmap_update() directly, no?
Drivers which are converted should not call irq_cpu_rmap_add().
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Ahmed Zaki @ 2024-12-20 19:15 UTC
To: Jakub Kicinski
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On 2024-12-20 10:23 a.m., Jakub Kicinski wrote:
> On Fri, 20 Dec 2024 07:51:09 -0700 Ahmed Zaki wrote:
>> On 2024-12-19 8:42 p.m., Jakub Kicinski wrote:
>>> On Wed, 18 Dec 2024 09:58:39 -0700 Ahmed Zaki wrote:
>>>> + if (!glue_created && flags & NAPIF_IRQ_AFFINITY) {
>>>> + glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>>>> + if (!glue)
>>>> + return;
>>>> + glue->notify.notify = netif_irq_cpu_rmap_notify;
>>>> + glue->notify.release = netif_napi_affinity_release;
>>>> + glue->data = napi;
>>>> + glue->rmap = NULL;
>>>> + napi->irq_flags |= NAPIF_IRQ_NORMAP;
>>>
>>> Why allocate the glue? is it not possible to add the fields:
>>>
>>> struct irq_affinity_notify notify;
>>> u16 index;
>>>
>>> to struct napi_struct ?
>>
>> In the first branch of "if", the cb function netif_irq_cpu_rmap_notify()
>> is also passed to irq_cpu_rmap_add() where the irq notifier is embedded
>> in "struct irq_glue".
>
> I don't understand what you're trying to say, could you rephrase?
Sure. After this patch, we have (simplified):
void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long
flags)
{
struct irq_glue *glue = NULL;
int rc;
napi->irq = irq;
#ifdef CONFIG_RFS_ACCEL
if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
netif_irq_cpu_rmap_notify);
.
.
.
}
#endif
if (flags & NAPIF_IRQ_AFFINITY) {
glue = kzalloc(sizeof(*glue), GFP_KERNEL);
if (!glue)
return;
glue->notify.notify = netif_irq_cpu_rmap_notify;
glue->notify.release = netif_napi_affinity_release;
.
.
}
}
Both branches assign the new callback "netif_irq_cpu_rmap_notify()"
as the IRQ notifier, but the first branch calls irq_cpu_rmap_add(),
where the notifier is embedded in "struct irq_glue". The callback
therefore has to assume the notifier lives inside an irq_glue, so the
second "if" branch needs to allocate one as well.
>
>> I think this cannot be changed as long as some drivers are directly
>> calling irq_cpu_rmap_add() instead of the proposed API.
>
> Drivers which are not converted shouldn't matter if we have our own
> notifier and call cpu_rmap_update() directly, no?
>
The only dependency is that irq_cpu_rmap_add() puts the notifier inside irq_glue.
> Drivers which are converted should not call irq_cpu_rmap_add().
Correct, they don't.
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
From: Jakub Kicinski @ 2024-12-20 19:37 UTC
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On Fri, 20 Dec 2024 12:15:33 -0700 Ahmed Zaki wrote:
> > I don't understand what you're trying to say, could you rephrase?
>
> Sure. After this patch, we have (simplified):
>
> void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long
> flags)
> {
> struct irq_glue *glue = NULL;
> int rc;
>
> napi->irq = irq;
>
> #ifdef CONFIG_RFS_ACCEL
> if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
> rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
> netif_irq_cpu_rmap_notify);
> .
> .
> .
> }
> #endif
>
> if (flags & NAPIF_IRQ_AFFINITY) {
> glue = kzalloc(sizeof(*glue), GFP_KERNEL);
> if (!glue)
> return;
> glue->notify.notify = netif_irq_cpu_rmap_notify;
> glue->notify.release = netif_napi_affinity_release;
> .
> .
> }
> }
>
>
> Both branches assign the new cb function "netif_irq_cpu_rmap_notify()"
> as the new IRQ notifier, but the first branch calls irq_cpu_rmap_add()
> where the notifier is embedded in "struct irq_glue". So the cb function
> needs to assume the notifier is inside irq_glue, so the second "if"
> branch needs to do the same.
First off, I'm still a bit confused why you think the flags should be
per NAPI call and not set at init time, once.
Perhaps rename netif_enable_cpu_rmap() suggested earlier to something
more generic (netif_enable_irq_tracking()?) and pass the flags there?
Or is there a driver which wants to vary the flags per NAPI instance?
Then you can probably register a single unified handler, and inside
that handler check if the device wanted to have rmap or just affinity?
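Something along these lines, as a rough sketch (this assumes the
notifier and an rmap index get embedded in napi_struct, and reuses the
rx_cpu_rmap_auto bit from the earlier netif_enable_cpu_rmap() idea):

static void netif_napi_irq_notify(struct irq_affinity_notify *notify,
				  const cpumask_t *mask)
{
	struct napi_struct *napi =
		container_of(notify, struct napi_struct, notify);

	/* persist the user's choice so it can be restored after reset */
	if (napi->config)
		cpumask_copy(&napi->config->affinity_mask, mask);

#ifdef CONFIG_RFS_ACCEL
	/* only devices that opted in have the core touch the rmap */
	if (napi->dev->rx_cpu_rmap_auto)
		cpu_rmap_update(napi->dev->rx_cpu_rmap,
				napi->rmap_idx, mask);
#endif
}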
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
2024-12-20 19:37 ` Jakub Kicinski
@ 2024-12-20 20:14 ` Ahmed Zaki
2024-12-20 20:51 ` Jakub Kicinski
0 siblings, 1 reply; 23+ messages in thread
From: Ahmed Zaki @ 2024-12-20 20:14 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On 2024-12-20 12:37 p.m., Jakub Kicinski wrote:
> On Fri, 20 Dec 2024 12:15:33 -0700 Ahmed Zaki wrote:
>>> I don't understand what you're trying to say, could you rephrase?
>>
>> Sure. After this patch, we have (simplified):
>>
>> void netif_napi_set_irq(struct napi_struct *napi, int irq, unsigned long flags)
>> {
>> 	struct irq_glue *glue = NULL;
>> 	int rc;
>> 
>> 	napi->irq = irq;
>> 
>> #ifdef CONFIG_RFS_ACCEL
>> 	if (napi->dev->rx_cpu_rmap && flags & NAPIF_IRQ_ARFS_RMAP) {
>> 		rc = irq_cpu_rmap_add(napi->dev->rx_cpu_rmap, irq, napi,
>> 				      netif_irq_cpu_rmap_notify);
>> 		...
>> 	}
>> #endif
>> 
>> 	if (flags & NAPIF_IRQ_AFFINITY) {
>> 		glue = kzalloc(sizeof(*glue), GFP_KERNEL);
>> 		if (!glue)
>> 			return;
>> 		glue->notify.notify = netif_irq_cpu_rmap_notify;
>> 		glue->notify.release = netif_napi_affinity_release;
>> 		...
>> 	}
>> }
>>
>>
>> Both branches assign the new callback "netif_irq_cpu_rmap_notify()" as
>> the IRQ notifier, but the first branch calls irq_cpu_rmap_add(), which
>> embeds the notifier in "struct irq_glue". The callback therefore has to
>> assume the notifier lives inside an irq_glue, which means the second
>> "if" branch must allocate an irq_glue as well.
>
> First off, I'm still a bit confused why you think the flags should be
> per NAPI call and not set at init time, once.
> Perhaps rename netif_enable_cpu_rmap() suggested earlier to something
> more generic (netif_enable_irq_tracking()?) and pass the flags there?
> Or is there a driver which wants to vary the flags per NAPI instance?
>
set_irq() just seemed like a natural choice since it is already called for
each IRQ. I was also trying to avoid adding a new function. But sure, I
can do that and move the flags to netdev.
> Then you can probably register a single unified handler, and inside
> that handler check if the device wanted to have rmap or just affinity?
This is what this patch already does: all drivers following the new
approach will have netif_irq_cpu_rmap_notify() as their IRQ notifier.
IIUC, your goal is to have the notifier inside napi, not irq_glue. For
this, we'll have to add our own version of irq_cpu_rmap_add() (for the
reason above).
Sounds OK?
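For illustration, an untested sketch of that netdev-owned variant
(napi->notify and napi->napi_rmap_idx would be new fields):

static int netif_napi_rmap_add(struct napi_struct *napi, int irq)
{
	int rc;

	/* add the napi itself to the reverse map; cpu_rmap_add()
	 * returns the new object's index
	 */
	rc = cpu_rmap_add(napi->dev->rx_cpu_rmap, napi);
	if (rc < 0)
		return rc;
	napi->napi_rmap_idx = rc;

	/* register the napi-embedded notifier directly, so no
	 * irq_glue is involved
	 */
	napi->notify.notify = netif_irq_cpu_rmap_notify;
	napi->notify.release = netif_napi_affinity_release;
	return irq_set_affinity_notifier(irq, &napi->notify);
}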
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config
2024-12-20 20:14 ` Ahmed Zaki
@ 2024-12-20 20:51 ` Jakub Kicinski
0 siblings, 0 replies; 23+ messages in thread
From: Jakub Kicinski @ 2024-12-20 20:51 UTC (permalink / raw)
To: Ahmed Zaki
Cc: netdev, intel-wired-lan, andrew+netdev, edumazet, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm
On Fri, 20 Dec 2024 13:14:48 -0700 Ahmed Zaki wrote:
> > Then you can probably register a single unified handler, and inside
> > that handler check if the device wanted to have rmap or just affinity?
>
> This is what is in this patch already, all drivers following new
> approach will have netif_irq_cpu_rmap_notify() as their IRQ notifier.
>
> IIUC, your goal is to have the notifier inside napi, not irq_glue. For
> this, we'll have to have our own version of irq_cpu_rmap_add() (for the
> above reason).
>
> sounds OK?
Yes.
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH net-next v2 5/8] bnxt: use napi's irq affinity
2024-12-18 16:58 [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (3 preceding siblings ...)
2024-12-18 16:58 ` [PATCH net-next v2 4/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
@ 2024-12-18 16:58 ` Ahmed Zaki
2024-12-18 16:58 ` [PATCH net-next v2 6/8] ice: " Ahmed Zaki
` (3 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Ahmed Zaki @ 2024-12-18 16:58 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 28 ++++-------------------
drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 --
2 files changed, 4 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index ac729a25ba52..f68f07686105 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11177,14 +11177,8 @@ static void bnxt_free_irq(struct bnxt *bp)
int map_idx = bnxt_cp_num_to_irq_num(bp, i);
irq = &bp->irq_tbl[map_idx];
- if (irq->requested) {
- if (irq->have_cpumask) {
- irq_update_affinity_hint(irq->vector, NULL);
- free_cpumask_var(irq->cpu_mask);
- irq->have_cpumask = 0;
- }
+ if (irq->requested)
free_irq(irq->vector, bp->bnapi[i]);
- }
irq->requested = 0;
}
@@ -11213,26 +11207,12 @@ static int bnxt_request_irq(struct bnxt *bp)
#ifdef CONFIG_RFS_ACCEL
netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector,
- NAPIF_IRQ_ARFS_RMAP);
+ NAPIF_IRQ_AFFINITY | NAPIF_IRQ_ARFS_RMAP);
#else
- netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector, 0);
+ netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector,
+ NAPIF_IRQ_AFFINITY);
#endif
irq->requested = 1;
-
- if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
- int numa_node = dev_to_node(&bp->pdev->dev);
-
- irq->have_cpumask = 1;
- cpumask_set_cpu(cpumask_local_spread(i, numa_node),
- irq->cpu_mask);
- rc = irq_update_affinity_hint(irq->vector, irq->cpu_mask);
- if (rc) {
- netdev_warn(bp->dev,
- "Update affinity hint failed, IRQ = %d\n",
- irq->vector);
- break;
- }
- }
}
return rc;
}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 7df7a2233307..8a97c1fb2083 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1228,9 +1228,7 @@ struct bnxt_irq {
irq_handler_t handler;
unsigned int vector;
u8 requested:1;
- u8 have_cpumask:1;
char name[IFNAMSIZ + BNXT_IRQ_NAME_EXTRA];
- cpumask_var_t cpu_mask;
};
#define HWRM_RING_ALLOC_TX 0x1
--
2.43.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net-next v2 6/8] ice: use napi's irq affinity
2024-12-18 16:58 [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (4 preceding siblings ...)
2024-12-18 16:58 ` [PATCH net-next v2 5/8] bnxt: use napi's irq affinity Ahmed Zaki
@ 2024-12-18 16:58 ` Ahmed Zaki
2024-12-18 16:58 ` [PATCH net-next v2 7/8] idpf: " Ahmed Zaki
` (2 subsequent siblings)
8 siblings, 0 replies; 23+ messages in thread
From: Ahmed Zaki @ 2024-12-18 16:58 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/intel/ice/ice.h | 3 --
drivers/net/ethernet/intel/ice/ice_base.c | 7 ++--
drivers/net/ethernet/intel/ice/ice_lib.c | 11 ++----
drivers/net/ethernet/intel/ice/ice_main.c | 44 -----------------------
4 files changed, 5 insertions(+), 60 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 2f5d6f974185..0db665a6b38a 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -477,9 +477,6 @@ struct ice_q_vector {
struct ice_ring_container rx;
struct ice_ring_container tx;
- cpumask_t affinity_mask;
- struct irq_affinity_notify affinity_notify;
-
struct ice_channel *ch;
char name[ICE_INT_NAME_STR_LEN];
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index b2af8e3586f7..86cf715de00f 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -147,10 +147,6 @@ static int ice_vsi_alloc_q_vector(struct ice_vsi *vsi, u16 v_idx)
q_vector->reg_idx = q_vector->irq.index;
q_vector->vf_reg_idx = q_vector->irq.index;
- /* only set affinity_mask if the CPU is online */
- if (cpu_online(v_idx))
- cpumask_set_cpu(v_idx, &q_vector->affinity_mask);
-
/* This will not be called in the driver load path because the netdev
* will not be created yet. All other cases with register the NAPI
* handler here (i.e. resume, reset/rebuild, etc.)
@@ -276,7 +272,8 @@ static void ice_cfg_xps_tx_ring(struct ice_tx_ring *ring)
if (test_and_set_bit(ICE_TX_XPS_INIT_DONE, ring->xps_state))
return;
- netif_set_xps_queue(ring->netdev, &ring->q_vector->affinity_mask,
+ netif_set_xps_queue(ring->netdev,
+ &ring->q_vector->napi.config->affinity_mask,
ring->q_index);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 7c0b2d8e86ba..75cf5ce447cc 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2589,12 +2589,6 @@ void ice_vsi_free_irq(struct ice_vsi *vsi)
vsi->q_vectors[i]->num_ring_rx))
continue;
- /* clear the affinity notifier in the IRQ descriptor */
- if (!IS_ENABLED(CONFIG_RFS_ACCEL))
- irq_set_affinity_notifier(irq_num, NULL);
-
- /* clear the affinity_hint in the IRQ descriptor */
- irq_update_affinity_hint(irq_num, NULL);
synchronize_irq(irq_num);
devm_free_irq(ice_pf_to_dev(pf), irq_num, vsi->q_vectors[i]);
}
@@ -2737,9 +2731,10 @@ void ice_vsi_set_napi_queues(struct ice_vsi *vsi)
#ifdef CONFIG_RFS_ACCEL
netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq,
- NAPIF_IRQ_ARFS_RMAP);
+ NAPIF_IRQ_AFFINITY | NAPIF_IRQ_ARFS_RMAP);
#else
- netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq, 0);
+ netif_napi_set_irq(&q_vector->napi, q_vector->irq.virq,
+ NAPIF_IRQ_AFFINITY);
#endif
}
}
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 0ab35607e5d5..33118a435cd2 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2504,34 +2504,6 @@ int ice_schedule_reset(struct ice_pf *pf, enum ice_reset_req reset)
return 0;
}
-/**
- * ice_irq_affinity_notify - Callback for affinity changes
- * @notify: context as to what irq was changed
- * @mask: the new affinity mask
- *
- * This is a callback function used by the irq_set_affinity_notifier function
- * so that we may register to receive changes to the irq affinity masks.
- */
-static void
-ice_irq_affinity_notify(struct irq_affinity_notify *notify,
- const cpumask_t *mask)
-{
- struct ice_q_vector *q_vector =
- container_of(notify, struct ice_q_vector, affinity_notify);
-
- cpumask_copy(&q_vector->affinity_mask, mask);
-}
-
-/**
- * ice_irq_affinity_release - Callback for affinity notifier release
- * @ref: internal core kernel usage
- *
- * This is a callback function used by the irq_set_affinity_notifier function
- * to inform the current notification subscriber that they will no longer
- * receive notifications.
- */
-static void ice_irq_affinity_release(struct kref __always_unused *ref) {}
-
/**
* ice_vsi_ena_irq - Enable IRQ for the given VSI
* @vsi: the VSI being configured
@@ -2595,19 +2567,6 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
err);
goto free_q_irqs;
}
-
- /* register for affinity change notifications */
- if (!IS_ENABLED(CONFIG_RFS_ACCEL)) {
- struct irq_affinity_notify *affinity_notify;
-
- affinity_notify = &q_vector->affinity_notify;
- affinity_notify->notify = ice_irq_affinity_notify;
- affinity_notify->release = ice_irq_affinity_release;
- irq_set_affinity_notifier(irq_num, affinity_notify);
- }
-
- /* assign the mask for this irq */
- irq_update_affinity_hint(irq_num, &q_vector->affinity_mask);
}
err = ice_set_cpu_rx_rmap(vsi);
@@ -2623,9 +2582,6 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
free_q_irqs:
while (vector--) {
irq_num = vsi->q_vectors[vector]->irq.virq;
- if (!IS_ENABLED(CONFIG_RFS_ACCEL))
- irq_set_affinity_notifier(irq_num, NULL);
- irq_update_affinity_hint(irq_num, NULL);
devm_free_irq(dev, irq_num, &vsi->q_vectors[vector]);
}
return err;
--
2.43.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net-next v2 7/8] idpf: use napi's irq affinity
2024-12-18 16:58 [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (5 preceding siblings ...)
2024-12-18 16:58 ` [PATCH net-next v2 6/8] ice: " Ahmed Zaki
@ 2024-12-18 16:58 ` Ahmed Zaki
2024-12-18 16:58 ` [PATCH net-next v2 8/8] mlx4: " Ahmed Zaki
2024-12-22 9:23 ` [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Shay Drori
8 siblings, 0 replies; 23+ messages in thread
From: Ahmed Zaki @ 2024-12-18 16:58 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 19 +++++--------------
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 6 ++----
2 files changed, 7 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index 34f4118c7bc0..5728f0bfc610 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -3554,8 +3554,6 @@ void idpf_vport_intr_rel(struct idpf_vport *vport)
q_vector->tx = NULL;
kfree(q_vector->rx);
q_vector->rx = NULL;
-
- free_cpumask_var(q_vector->affinity_mask);
}
kfree(vport->q_vectors);
@@ -3582,8 +3580,6 @@ static void idpf_vport_intr_rel_irq(struct idpf_vport *vport)
vidx = vport->q_vector_idxs[vector];
irq_num = adapter->msix_entries[vidx].vector;
- /* clear the affinity_mask in the IRQ descriptor */
- irq_set_affinity_hint(irq_num, NULL);
kfree(free_irq(irq_num, q_vector));
}
}
@@ -3762,8 +3758,6 @@ static int idpf_vport_intr_req_irq(struct idpf_vport *vport)
"Request_irq failed, error: %d\n", err);
goto free_q_irqs;
}
- /* assign the mask for this irq */
- irq_set_affinity_hint(irq_num, q_vector->affinity_mask);
}
return 0;
@@ -4184,12 +4178,12 @@ static void idpf_vport_intr_napi_add_all(struct idpf_vport *vport)
for (v_idx = 0; v_idx < vport->num_q_vectors; v_idx++) {
struct idpf_q_vector *q_vector = &vport->q_vectors[v_idx];
+ int irq_num = vport->adapter->msix_entries[v_idx].vector;
- netif_napi_add(vport->netdev, &q_vector->napi, napi_poll);
-
- /* only set affinity_mask if the CPU is online */
- if (cpu_online(v_idx))
- cpumask_set_cpu(v_idx, q_vector->affinity_mask);
+ netif_napi_add_config(vport->netdev, &q_vector->napi,
+ napi_poll, v_idx);
+ netif_napi_set_irq(&q_vector->napi,
+ irq_num, NAPIF_IRQ_AFFINITY);
}
}
@@ -4233,9 +4227,6 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
q_vector->rx_intr_mode = IDPF_ITR_DYNAMIC;
q_vector->rx_itr_idx = VIRTCHNL2_ITR_IDX_0;
- if (!zalloc_cpumask_var(&q_vector->affinity_mask, GFP_KERNEL))
- goto error;
-
q_vector->tx = kcalloc(txqs_per_vector, sizeof(*q_vector->tx),
GFP_KERNEL);
if (!q_vector->tx)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index 9c1fe84108ed..5efb3402b378 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -397,7 +397,6 @@ struct idpf_intr_reg {
* @rx_intr_mode: Dynamic ITR or not
* @rx_itr_idx: RX ITR index
* @v_idx: Vector index
- * @affinity_mask: CPU affinity mask
*/
struct idpf_q_vector {
__cacheline_group_begin_aligned(read_mostly);
@@ -434,13 +433,12 @@ struct idpf_q_vector {
__cacheline_group_begin_aligned(cold);
u16 v_idx;
- cpumask_var_t affinity_mask;
__cacheline_group_end_aligned(cold);
};
libeth_cacheline_set_assert(struct idpf_q_vector, 112,
24 + sizeof(struct napi_struct) +
2 * sizeof(struct dim),
- 8 + sizeof(cpumask_var_t));
+ 8);
struct idpf_rx_queue_stats {
u64_stats_t packets;
@@ -934,7 +932,7 @@ static inline int idpf_q_vector_to_mem(const struct idpf_q_vector *q_vector)
if (!q_vector)
return NUMA_NO_NODE;
- cpu = cpumask_first(q_vector->affinity_mask);
+ cpu = cpumask_first(&q_vector->napi.config->affinity_mask);
return cpu < nr_cpu_ids ? cpu_to_mem(cpu) : NUMA_NO_NODE;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH net-next v2 8/8] mlx4: use napi's irq affinity
2024-12-18 16:58 [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (6 preceding siblings ...)
2024-12-18 16:58 ` [PATCH net-next v2 7/8] idpf: " Ahmed Zaki
@ 2024-12-18 16:58 ` Ahmed Zaki
2024-12-22 9:23 ` [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Shay Drori
8 siblings, 0 replies; 23+ messages in thread
From: Ahmed Zaki @ 2024-12-18 16:58 UTC (permalink / raw)
To: netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, shayd, akpm, Ahmed Zaki
Delete the driver CPU affinity info and use the core's napi config
instead.
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
---
drivers/net/ethernet/mellanox/mlx4/en_cq.c | 8 ++--
.../net/ethernet/mellanox/mlx4/en_netdev.c | 33 +--------------
drivers/net/ethernet/mellanox/mlx4/eq.c | 22 ----------
drivers/net/ethernet/mellanox/mlx4/main.c | 42 ++-----------------
drivers/net/ethernet/mellanox/mlx4/mlx4.h | 1 -
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 -
6 files changed, 10 insertions(+), 97 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index b8531283e3ac..46e28e0bfcd0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -90,6 +90,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
int cq_idx)
{
struct mlx4_en_dev *mdev = priv->mdev;
+ struct napi_config *napi_config;
int irq, err = 0;
int timestamp_en = 0;
bool assigned_eq = false;
@@ -100,11 +101,12 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
*cq->mcq.set_ci_db = 0;
*cq->mcq.arm_db = 0;
memset(cq->buf, 0, cq->buf_size);
+ napi_config = cq->napi.config;
if (cq->type == RX) {
if (!mlx4_is_eq_vector_valid(mdev->dev, priv->port,
cq->vector)) {
- cq->vector = cpumask_first(priv->rx_ring[cq->ring]->affinity_mask);
+ cq->vector = cpumask_first(&napi_config->affinity_mask);
err = mlx4_assign_eq(mdev->dev, priv->port,
&cq->vector);
@@ -150,7 +152,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
case TX:
cq->mcq.comp = mlx4_en_tx_irq;
netif_napi_add_tx(cq->dev, &cq->napi, mlx4_en_poll_tx_cq);
- netif_napi_set_irq(&cq->napi, irq, 0);
+ netif_napi_set_irq(&cq->napi, irq, NAPIF_IRQ_AFFINITY);
napi_enable(&cq->napi);
netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_TX, &cq->napi);
break;
@@ -158,7 +160,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
cq->mcq.comp = mlx4_en_rx_irq;
netif_napi_add_config(cq->dev, &cq->napi, mlx4_en_poll_rx_cq,
cq_idx);
- netif_napi_set_irq(&cq->napi, irq, 0);
+ netif_napi_set_irq(&cq->napi, irq, NAPIF_IRQ_AFFINITY);
napi_enable(&cq->napi);
netif_queue_set_napi(cq->dev, cq_idx, NETDEV_QUEUE_TYPE_RX, &cq->napi);
break;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 281b34af0bb4..e4c2532b5909 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1596,24 +1596,6 @@ static void mlx4_en_linkstate_work(struct work_struct *work)
mutex_unlock(&mdev->state_lock);
}
-static int mlx4_en_init_affinity_hint(struct mlx4_en_priv *priv, int ring_idx)
-{
- struct mlx4_en_rx_ring *ring = priv->rx_ring[ring_idx];
- int numa_node = priv->mdev->dev->numa_node;
-
- if (!zalloc_cpumask_var(&ring->affinity_mask, GFP_KERNEL))
- return -ENOMEM;
-
- cpumask_set_cpu(cpumask_local_spread(ring_idx, numa_node),
- ring->affinity_mask);
- return 0;
-}
-
-static void mlx4_en_free_affinity_hint(struct mlx4_en_priv *priv, int ring_idx)
-{
- free_cpumask_var(priv->rx_ring[ring_idx]->affinity_mask);
-}
-
static void mlx4_en_init_recycle_ring(struct mlx4_en_priv *priv,
int tx_ring_idx)
{
@@ -1663,16 +1645,9 @@ int mlx4_en_start_port(struct net_device *dev)
for (i = 0; i < priv->rx_ring_num; i++) {
cq = priv->rx_cq[i];
- err = mlx4_en_init_affinity_hint(priv, i);
- if (err) {
- en_err(priv, "Failed preparing IRQ affinity hint\n");
- goto cq_err;
- }
-
err = mlx4_en_activate_cq(priv, cq, i);
if (err) {
en_err(priv, "Failed activating Rx CQ\n");
- mlx4_en_free_affinity_hint(priv, i);
goto cq_err;
}
@@ -1688,7 +1663,6 @@ int mlx4_en_start_port(struct net_device *dev)
if (err) {
en_err(priv, "Failed setting cq moderation parameters\n");
mlx4_en_deactivate_cq(priv, cq);
- mlx4_en_free_affinity_hint(priv, i);
goto cq_err;
}
mlx4_en_arm_cq(priv, cq);
@@ -1874,10 +1848,9 @@ int mlx4_en_start_port(struct net_device *dev)
mac_err:
mlx4_en_put_qp(priv);
cq_err:
- while (rx_index--) {
+ while (rx_index--)
mlx4_en_deactivate_cq(priv, priv->rx_cq[rx_index]);
- mlx4_en_free_affinity_hint(priv, rx_index);
- }
+
for (i = 0; i < priv->rx_ring_num; i++)
mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]);
@@ -2011,8 +1984,6 @@ void mlx4_en_stop_port(struct net_device *dev, int detach)
napi_synchronize(&cq->napi);
mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]);
mlx4_en_deactivate_cq(priv, cq);
-
- mlx4_en_free_affinity_hint(priv, i);
}
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/eq.c b/drivers/net/ethernet/mellanox/mlx4/eq.c
index d768a6a828c4..b005eb697c64 100644
--- a/drivers/net/ethernet/mellanox/mlx4/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/eq.c
@@ -233,23 +233,6 @@ static void mlx4_slave_event(struct mlx4_dev *dev, int slave,
slave_event(dev, slave, eqe);
}
-#if defined(CONFIG_SMP)
-static void mlx4_set_eq_affinity_hint(struct mlx4_priv *priv, int vec)
-{
- int hint_err;
- struct mlx4_dev *dev = &priv->dev;
- struct mlx4_eq *eq = &priv->eq_table.eq[vec];
-
- if (!cpumask_available(eq->affinity_mask) ||
- cpumask_empty(eq->affinity_mask))
- return;
-
- hint_err = irq_update_affinity_hint(eq->irq, eq->affinity_mask);
- if (hint_err)
- mlx4_warn(dev, "irq_update_affinity_hint failed, err %d\n", hint_err);
-}
-#endif
-
int mlx4_gen_pkey_eqe(struct mlx4_dev *dev, int slave, u8 port)
{
struct mlx4_eqe eqe;
@@ -1123,8 +1106,6 @@ static void mlx4_free_irqs(struct mlx4_dev *dev)
for (i = 0; i < dev->caps.num_comp_vectors + 1; ++i)
if (eq_table->eq[i].have_irq) {
- free_cpumask_var(eq_table->eq[i].affinity_mask);
- irq_update_affinity_hint(eq_table->eq[i].irq, NULL);
free_irq(eq_table->eq[i].irq, eq_table->eq + i);
eq_table->eq[i].have_irq = 0;
}
@@ -1516,9 +1497,6 @@ int mlx4_assign_eq(struct mlx4_dev *dev, u8 port, int *vector)
clear_bit(*prequested_vector, priv->msix_ctl.pool_bm);
*prequested_vector = -1;
} else {
-#if defined(CONFIG_SMP)
- mlx4_set_eq_affinity_hint(priv, *prequested_vector);
-#endif
eq_set_ci(&priv->eq_table.eq[*prequested_vector], 1);
priv->eq_table.eq[*prequested_vector].have_irq = 1;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index febeadfdd5a5..5f88c297332f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2923,36 +2923,6 @@ static int mlx4_setup_hca(struct mlx4_dev *dev)
return err;
}
-static int mlx4_init_affinity_hint(struct mlx4_dev *dev, int port, int eqn)
-{
- int requested_cpu = 0;
- struct mlx4_priv *priv = mlx4_priv(dev);
- struct mlx4_eq *eq;
- int off = 0;
- int i;
-
- if (eqn > dev->caps.num_comp_vectors)
- return -EINVAL;
-
- for (i = 1; i < port; i++)
- off += mlx4_get_eqs_per_port(dev, i);
-
- requested_cpu = eqn - off - !!(eqn > MLX4_EQ_ASYNC);
-
- /* Meaning EQs are shared, and this call comes from the second port */
- if (requested_cpu < 0)
- return 0;
-
- eq = &priv->eq_table.eq[eqn];
-
- if (!zalloc_cpumask_var(&eq->affinity_mask, GFP_KERNEL))
- return -ENOMEM;
-
- cpumask_set_cpu(requested_cpu, eq->affinity_mask);
-
- return 0;
-}
-
static void mlx4_enable_msi_x(struct mlx4_dev *dev)
{
struct mlx4_priv *priv = mlx4_priv(dev);
@@ -2997,19 +2967,13 @@ static void mlx4_enable_msi_x(struct mlx4_dev *dev)
priv->eq_table.eq[i].irq =
entries[i + 1 - !!(i > MLX4_EQ_ASYNC)].vector;
- if (MLX4_IS_LEGACY_EQ_MODE(dev->caps)) {
+ if (MLX4_IS_LEGACY_EQ_MODE(dev->caps))
bitmap_fill(priv->eq_table.eq[i].actv_ports.ports,
dev->caps.num_ports);
- /* We don't set affinity hint when there
- * aren't enough EQs
- */
- } else {
+ else
set_bit(port,
priv->eq_table.eq[i].actv_ports.ports);
- if (mlx4_init_affinity_hint(dev, port + 1, i))
- mlx4_warn(dev, "Couldn't init hint cpumask for EQ %d\n",
- i);
- }
+
/* We divide the Eqs evenly between the two ports.
* (dev->caps.num_comp_vectors / dev->caps.num_ports)
* refers to the number of Eqs per port
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index d7d856d1758a..66b1ebfd5816 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -403,7 +403,6 @@ struct mlx4_eq {
struct mlx4_eq_tasklet tasklet_ctx;
struct mlx4_active_ports actv_ports;
u32 ref_count;
- cpumask_var_t affinity_mask;
};
struct mlx4_slave_eqe {
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 28b70dcc652e..41594dd00636 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -357,7 +357,6 @@ struct mlx4_en_rx_ring {
unsigned long dropped;
unsigned long alloc_fail;
int hwtstamp_rx_filter;
- cpumask_var_t affinity_mask;
struct xdp_rxq_info xdp_rxq;
};
--
2.43.0
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config
2024-12-18 16:58 [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Ahmed Zaki
` (7 preceding siblings ...)
2024-12-18 16:58 ` [PATCH net-next v2 8/8] mlx4: " Ahmed Zaki
@ 2024-12-22 9:23 ` Shay Drori
2025-01-02 21:38 ` Ahmed Zaki
8 siblings, 1 reply; 23+ messages in thread
From: Shay Drori @ 2024-12-22 9:23 UTC (permalink / raw)
To: Ahmed Zaki, netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, akpm
On 18/12/2024 18:58, Ahmed Zaki wrote:
>
> Move the IRQ affinity management to the napi struct. All drivers that are
> already using netif_napi_set_irq() are modified to the new API. Except
> mlx5 because it is implementing IRQ pools and moving to the new API does
> not seem trivial.
>
> Tested on bnxt, ice and idpf.
> ---
> Opens: is cpu_online_mask the best default mask? Drivers do this differently.
cpu_online_mask is not the best default mask for IRQ affinity management,
for two reasons:
- Performance gains from driver-specific CPU assignment: many drivers
  assign a different CPU to each IRQ to optimize performance, which is
  crucial for CPU utilization.
- Impact of NUMA node distance on traffic performance: NUMA topology
  strongly affects IRQ performance; assigning IRQs to CPUs on the same
  NUMA node as the associated device minimizes the latency caused by
  remote memory access. [1]
[1] For more details on the NUMA preference, see commit
2acda57736de1e486036b90a648e67a3599080a1.
>
> v2:
> - Also move the ARFS IRQ affinity management from drivers to core. Via
> netif_napi_set_irq(), drivers can ask the core to add the IRQ to the
> ARFS rmap (already allocated by the driver).
>
> RFC -> v1:
> - https://lore.kernel.org/netdev/20241210002626.366878-1-ahmed.zaki@intel.com/
> - move static inline affinity functions to net/dev/core.c
> - add the new napi->irq_flags (patch 1)
> - add code changes to bnxt, mlx4 and ice.
>
> Ahmed Zaki (8):
> net: napi: add irq_flags to napi struct
> net: allow ARFS rmap management in core
> lib: cpu_rmap: allow passing a notifier callback
> net: napi: add CPU affinity to napi->config
> bnxt: use napi's irq affinity
> ice: use napi's irq affinity
> idpf: use napi's irq affinity
> mlx4: use napi's irq affinity
>
> drivers/net/ethernet/amazon/ena/ena_netdev.c | 21 ++---
> drivers/net/ethernet/broadcom/bnxt/bnxt.c | 51 +++--------
> drivers/net/ethernet/broadcom/bnxt/bnxt.h | 2 -
> drivers/net/ethernet/broadcom/tg3.c | 2 +-
> drivers/net/ethernet/cisco/enic/enic_main.c | 3 +-
> drivers/net/ethernet/google/gve/gve_utils.c | 2 +-
> .../net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
> drivers/net/ethernet/intel/e1000/e1000_main.c | 2 +-
> drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
> drivers/net/ethernet/intel/ice/ice.h | 3 -
> drivers/net/ethernet/intel/ice/ice_arfs.c | 10 +--
> drivers/net/ethernet/intel/ice/ice_base.c | 7 +-
> drivers/net/ethernet/intel/ice/ice_lib.c | 14 +--
> drivers/net/ethernet/intel/ice/ice_main.c | 44 ----------
> drivers/net/ethernet/intel/idpf/idpf_txrx.c | 19 ++--
> drivers/net/ethernet/intel/idpf/idpf_txrx.h | 6 +-
> drivers/net/ethernet/mellanox/mlx4/en_cq.c | 8 +-
> .../net/ethernet/mellanox/mlx4/en_netdev.c | 33 +------
> drivers/net/ethernet/mellanox/mlx4/eq.c | 24 +----
> drivers/net/ethernet/mellanox/mlx4/main.c | 42 +--------
> drivers/net/ethernet/mellanox/mlx4/mlx4.h | 1 -
> drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 -
> .../net/ethernet/mellanox/mlx5/core/en_main.c | 2 +-
> .../net/ethernet/mellanox/mlx5/core/pci_irq.c | 2 +-
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 3 +-
> drivers/net/ethernet/qlogic/qede/qede_main.c | 28 +++---
> drivers/net/ethernet/sfc/falcon/efx.c | 9 ++
> drivers/net/ethernet/sfc/falcon/nic.c | 10 ---
> drivers/net/ethernet/sfc/nic.c | 2 +-
> drivers/net/ethernet/sfc/siena/efx_channels.c | 9 ++
> drivers/net/ethernet/sfc/siena/nic.c | 10 ---
> include/linux/cpu_rmap.h | 13 ++-
> include/linux/netdevice.h | 23 ++++-
> lib/cpu_rmap.c | 20 ++---
> net/core/dev.c | 87 ++++++++++++++++++-
> 35 files changed, 215 insertions(+), 302 deletions(-)
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config
2024-12-22 9:23 ` [PATCH net-next v2 0/8] net: napi: add CPU affinity to napi->config Shay Drori
@ 2025-01-02 21:38 ` Ahmed Zaki
0 siblings, 0 replies; 23+ messages in thread
From: Ahmed Zaki @ 2025-01-02 21:38 UTC (permalink / raw)
To: Shay Drori, netdev
Cc: intel-wired-lan, andrew+netdev, edumazet, kuba, pabeni, davem,
michael.chan, tariqt, anthony.l.nguyen, przemyslaw.kitszel,
jdamato, akpm
On 2024-12-22 2:23 a.m., Shay Drori wrote:
>
>
> On 18/12/2024 18:58, Ahmed Zaki wrote:
>>
>> Move the IRQ affinity management to the napi struct. All drivers that are
>> already using netif_napi_set_irq() are modified to the new API. Except
>> mlx5 because it is implementing IRQ pools and moving to the new API does
>> not seem trivial.
>>
>> Tested on bnxt, ice and idpf.
>> ---
>> Opens: is cpu_online_mask the best default mask? Drivers do this
>> differently.
>
> cpu_online_mask is not the best default mask for IRQ affinity management,
> for two reasons:
> - Performance gains from driver-specific CPU assignment: many drivers
>   assign a different CPU to each IRQ to optimize performance, which is
>   crucial for CPU utilization.
> - Impact of NUMA node distance on traffic performance: NUMA topology
>   strongly affects IRQ performance; assigning IRQs to CPUs on the same
>   NUMA node as the associated device minimizes the latency caused by
>   remote memory access. [1]
>
> [1] For more details on the NUMA preference, see commit
> 2acda57736de1e486036b90a648e67a3599080a1.
>
Thanks for replying.
I will use cpumask_local_spread() (which now considers NUMA distances)
in the next iteration.
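For illustration, roughly like this (sketch only; the napi_config field
follows this series, the loop context and function name are hypothetical):

static void netif_napi_affinity_defaults(struct net_device *dev)
{
	int node = dev_to_node(dev->dev.parent);	/* device's NUMA node */
	unsigned int i;

	/* seed each napi's default mask NUMA-locally instead of
	 * starting from cpu_online_mask
	 */
	for (i = 0; i < dev->num_rx_queues; i++)
		cpumask_set_cpu(cpumask_local_spread(i, node),
				&dev->napi_config[i].affinity_mask);
}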
^ permalink raw reply [flat|nested] 23+ messages in thread