Netdev List
* [PATCH 0/2] net/mlx5: Only consider online CPUs in affinity subset check
@ 2026-06-03  7:26 Fushuai Wang
  2026-06-03  7:26 ` [PATCH 1/2] net/mlx5: Simplify cpumask operations in comp_irq_request_sf() Fushuai Wang
  2026-06-03  7:26 ` [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check Fushuai Wang
  0 siblings, 2 replies; 6+ messages in thread
From: Fushuai Wang @ 2026-06-03  7:26 UTC (permalink / raw)
  To: saeedm, leon, tariqt, mbloch, andrew+netdev, davem, edumazet,
	kuba, pabeni, shayd, parav, moshe
  Cc: netdev, linux-rdma, linux-kernel, wangfushuai

From: Fushuai Wang <wangfushuai@baidu.com>

Hi all,

When an SF is created after a CPU has been taken offline, the IRQ affinity
check fails because existing IRQs in the pool may have affinity masks that
include the now-offline CPU. This causes SF creation to fail even though
suitable online CPUs are available.

This series fixes this issue and includes a small cleanup:

Patch 1 folds cpumask_copy() into cpumask_andnot() for better code clarity
in comp_irq_request_sf().

Patch 2 filters affinity masks to only consider online CPUs before the subset
check, ensuring SF creation succeeds when CPUs have been taken offline.
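
For readers who want to convince themselves that the Patch 1 cleanup is
behavior-preserving, here is a minimal user-space sketch. It is not driver
code: toy_mask_t, toy_andnot() and the free_cpus_*() helpers are hypothetical
names that model a cpumask as a single 64-bit word, assuming only that
cpumask_andnot(dst, src1, src2) computes dst = src1 & ~src2.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_mask_t;

/* Models cpumask_andnot(dst, src1, src2): dst = src1 & ~src2. */
static void toy_andnot(toy_mask_t *dst, const toy_mask_t *src1,
		       const toy_mask_t *src2)
{
	*dst = *src1 & ~*src2;
}

/* Old form: copy the online mask, then clear the used CPUs in place. */
static toy_mask_t free_cpus_two_step(toy_mask_t online, toy_mask_t used)
{
	toy_mask_t mask = online;		/* models cpumask_copy() */

	toy_andnot(&mask, &mask, &used);
	return mask;
}

/* New form: pass the online mask directly as the first source. */
static toy_mask_t free_cpus_one_step(toy_mask_t online, toy_mask_t used)
{
	toy_mask_t mask;

	toy_andnot(&mask, &online, &used);
	return mask;
}

/* Asserts that both forms agree for one pair of input masks. */
static void toy_check_equivalence(toy_mask_t online, toy_mask_t used)
{
	assert(free_cpus_two_step(online, used) ==
	       free_cpus_one_step(online, used));
}
```

Calling toy_check_equivalence() with any pair of masks should pass, since both
forms reduce to the same bitwise expression.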

--WANG

Fushuai Wang (2):
  net/mlx5: Simplify cpumask operations in comp_irq_request_sf()
  net/mlx5: Only consider online CPUs in affinity subset check

 drivers/net/ethernet/mellanox/mlx5/core/eq.c       |  3 +--
 .../net/ethernet/mellanox/mlx5/core/irq_affinity.c | 14 ++++++++++++--
 2 files changed, 13 insertions(+), 4 deletions(-)

-- 
2.36.1



* [PATCH 1/2] net/mlx5: Simplify cpumask operations in comp_irq_request_sf()
  2026-06-03  7:26 [PATCH 0/2] net/mlx5: Only consider online CPUs in affinity subset check Fushuai Wang
@ 2026-06-03  7:26 ` Fushuai Wang
  2026-06-04  5:57   ` Shay Drori
  2026-06-03  7:26 ` [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check Fushuai Wang
  1 sibling, 1 reply; 6+ messages in thread
From: Fushuai Wang @ 2026-06-03  7:26 UTC (permalink / raw)
  To: saeedm, leon, tariqt, mbloch, andrew+netdev, davem, edumazet,
	kuba, pabeni, shayd, parav, moshe
  Cc: netdev, linux-rdma, linux-kernel, wangfushuai

From: Fushuai Wang <wangfushuai@baidu.com>

Combine cpumask_copy() and cpumask_andnot() into a single
cpumask_andnot() since the function can take cpu_online_mask
directly as the source.

Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 22a637111aa2..d11ec263d53c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -886,8 +886,7 @@ static int comp_irq_request_sf(struct mlx5_core_dev *dev, u16 vecidx)
 		return -ENOMEM;
 
 	af_desc->is_managed = false;
-	cpumask_copy(&af_desc->mask, cpu_online_mask);
-	cpumask_andnot(&af_desc->mask, &af_desc->mask, &table->used_cpus);
+	cpumask_andnot(&af_desc->mask, cpu_online_mask, &table->used_cpus);
 	irq = mlx5_irq_affinity_request(dev, pool, af_desc);
 	if (IS_ERR(irq)) {
 		kvfree(af_desc);
-- 
2.36.1



* [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check
  2026-06-03  7:26 [PATCH 0/2] net/mlx5: Only consider online CPUs in affinity subset check Fushuai Wang
  2026-06-03  7:26 ` [PATCH 1/2] net/mlx5: Simplify cpumask operations in comp_irq_request_sf() Fushuai Wang
@ 2026-06-03  7:26 ` Fushuai Wang
  2026-06-04  5:53   ` Shay Drori
  1 sibling, 1 reply; 6+ messages in thread
From: Fushuai Wang @ 2026-06-03  7:26 UTC (permalink / raw)
  To: saeedm, leon, tariqt, mbloch, andrew+netdev, davem, edumazet,
	kuba, pabeni, shayd, parav, moshe
  Cc: netdev, linux-rdma, linux-kernel, wangfushuai

From: Fushuai Wang <wangfushuai@baidu.com>

When an SF is created after a CPU has been taken offline, the IRQ pool may
contain IRQs with affinity masks that include the offline CPU. Since only
online CPUs should be considered for IRQ placement, cpumask_subset() check
would fail because the iter_mask contains offline CPUs that are not present
in req_mask, causing SF creation to fail.

Filter the affinity mask to only include online CPUs before checking if it's
a subset of the requested mask, ensuring SF creation succeeds in this scenario.

Fixes: 061f5b23588a ("net/mlx5: SF, Use all available cpu for setting cpu affinity")
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
---
 .../net/ethernet/mellanox/mlx5/core/irq_affinity.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
index 994fe83da4be..8c0df240b888 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
@@ -102,18 +102,26 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
 	struct mlx5_irq *iter;
 	int irq_refcount = 0;
 	unsigned long index;
+	cpumask_var_t tmp;
 
 	lockdep_assert_held(&pool->lock);
+
+	if (!alloc_cpumask_var(&tmp, GFP_ATOMIC))
+		return NULL;
+
 	xa_for_each_range(&pool->irqs, index, iter, start, end) {
 		struct cpumask *iter_mask = mlx5_irq_get_affinity_mask(iter);
 		int iter_refcount = mlx5_irq_read_locked(iter);
 
-		if (!cpumask_subset(iter_mask, req_mask))
+		cpumask_and(tmp, iter_mask, cpu_online_mask);
+		if (!cpumask_subset(tmp, req_mask))
 			/* skip IRQs with a mask which is not subset of req_mask */
 			continue;
-		if (iter_refcount < pool->min_threshold)
+		if (iter_refcount < pool->min_threshold) {
 			/* If we found an IRQ with less than min_thres, return it */
+			free_cpumask_var(tmp);
 			return iter;
+		}
 		if (!irq || iter_refcount < irq_refcount) {
 			/* In case we won't find an IRQ with less than min_thres,
 			 * keep a pointer to the least used IRQ
@@ -122,6 +130,8 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
 			irq = iter;
 		}
 	}
+
+	free_cpumask_var(tmp);
 	return irq;
 }
 
-- 
2.36.1



* Re: [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check
  2026-06-03  7:26 ` [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check Fushuai Wang
@ 2026-06-04  5:53   ` Shay Drori
  2026-06-04 12:54     ` Fushuai Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Shay Drori @ 2026-06-04  5:53 UTC (permalink / raw)
  To: Fushuai Wang, saeedm, leon, tariqt, mbloch, andrew+netdev, davem,
	edumazet, kuba, pabeni, parav, moshe
  Cc: netdev, linux-rdma, linux-kernel, wangfushuai



On 03/06/2026 10:26, Fushuai Wang wrote:
> From: Fushuai Wang <wangfushuai@baidu.com>
> 
> When an SF is created after a CPU has been taken offline, the IRQ pool may
> contain IRQs with affinity masks that include the offline CPU. Since only
> online CPUs should be considered for IRQ placement, cpumask_subset() check
> would fail because the iter_mask contains offline CPUs that are not present
> in req_mask, causing SF creation to fail.

Thanks for the patch!

Can you please provide a full example? For simplicity, let's say the SF
pool has a size of 2 IRQs.

> 
> Filter the affinity mask to only include online CPUs before checking if it's
> a subset of the requested mask, 

Won't this cause the affinity mask to be empty, which kind of misses
the point of this API... :(
Can you check whether irq_get_effective_affinity_mask() solves the issue?

Thanks

> ensuring SF creation succeeds in this scenario.
> 
> Fixes: 061f5b23588a ("net/mlx5: SF, Use all available cpu for setting cpu affinity")
> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
> ---
>   .../net/ethernet/mellanox/mlx5/core/irq_affinity.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
> index 994fe83da4be..8c0df240b888 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
> @@ -102,18 +102,26 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
>          struct mlx5_irq *iter;
>          int irq_refcount = 0;
>          unsigned long index;
> +       cpumask_var_t tmp;
> 
>          lockdep_assert_held(&pool->lock);
> +
> +       if (!alloc_cpumask_var(&tmp, GFP_ATOMIC))
> +               return NULL;
> +
>          xa_for_each_range(&pool->irqs, index, iter, start, end) {
>                  struct cpumask *iter_mask = mlx5_irq_get_affinity_mask(iter);
>                  int iter_refcount = mlx5_irq_read_locked(iter);
> 
> -               if (!cpumask_subset(iter_mask, req_mask))
> +               cpumask_and(tmp, iter_mask, cpu_online_mask);
> +               if (!cpumask_subset(tmp, req_mask))
>                          /* skip IRQs with a mask which is not subset of req_mask */
>                          continue;
> -               if (iter_refcount < pool->min_threshold)
> +               if (iter_refcount < pool->min_threshold) {
>                          /* If we found an IRQ with less than min_thres, return it */
> +                       free_cpumask_var(tmp);
>                          return iter;
> +               }
>                  if (!irq || iter_refcount < irq_refcount) {
>                          /* In case we won't find an IRQ with less than min_thres,
>                           * keep a pointer to the least used IRQ
> @@ -122,6 +130,8 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
>                          irq = iter;
>                  }
>          }
> +
> +       free_cpumask_var(tmp);
>          return irq;
>   }
> 
> --
> 2.36.1
> 



* Re: [PATCH 1/2] net/mlx5: Simplify cpumask operations in comp_irq_request_sf()
  2026-06-03  7:26 ` [PATCH 1/2] net/mlx5: Simplify cpumask operations in comp_irq_request_sf() Fushuai Wang
@ 2026-06-04  5:57   ` Shay Drori
  0 siblings, 0 replies; 6+ messages in thread
From: Shay Drori @ 2026-06-04  5:57 UTC (permalink / raw)
  To: Fushuai Wang, saeedm, leon, tariqt, mbloch, andrew+netdev, davem,
	edumazet, kuba, pabeni, parav, moshe
  Cc: netdev, linux-rdma, linux-kernel, wangfushuai



On 03/06/2026 10:26, Fushuai Wang wrote:
> From: Fushuai Wang <wangfushuai@baidu.com>
> 
> Combine cpumask_copy() and cpumask_andnot() into a single
> cpumask_andnot() since the function can take cpu_online_mask
> directly as the source.

Reviewed-by: Shay Drory <shayd@nvidia.com>

> 
> Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
> ---
>   drivers/net/ethernet/mellanox/mlx5/core/eq.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index 22a637111aa2..d11ec263d53c 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -886,8 +886,7 @@ static int comp_irq_request_sf(struct mlx5_core_dev *dev, u16 vecidx)
>                  return -ENOMEM;
> 
>          af_desc->is_managed = false;
> -       cpumask_copy(&af_desc->mask, cpu_online_mask);
> -       cpumask_andnot(&af_desc->mask, &af_desc->mask, &table->used_cpus);
> +       cpumask_andnot(&af_desc->mask, cpu_online_mask, &table->used_cpus);
>          irq = mlx5_irq_affinity_request(dev, pool, af_desc);
>          if (IS_ERR(irq)) {
>                  kvfree(af_desc);
> --
> 2.36.1
> 



* Re: [PATCH 2/2] net/mlx5: Only consider online CPUs in affinity subset check
  2026-06-04  5:53   ` Shay Drori
@ 2026-06-04 12:54     ` Fushuai Wang
  0 siblings, 0 replies; 6+ messages in thread
From: Fushuai Wang @ 2026-06-04 12:54 UTC (permalink / raw)
  To: shayd
  Cc: andrew+netdev, davem, edumazet, fushuai.wang, kuba, leon,
	linux-kernel, linux-rdma, mbloch, moshe, netdev, pabeni, parav,
	saeedm, tariqt, wangfushuai


> > From: Fushuai Wang <wangfushuai@baidu.com>
> > 
> > When an SF is created after a CPU has been taken offline, the IRQ pool may
> > contain IRQs with affinity masks that include the offline CPU. Since only
> > online CPUs should be considered for IRQ placement, cpumask_subset() check
> > would fail because the iter_mask contains offline CPUs that are not present
> > in req_mask, causing SF creation to fail.
> 
> Thanks for the patch!
> 
> Can you please provide a full example? For simplicity, let's say the SF
> pool has a size of 2 IRQs.
> 

Sure, here are the AI-summarized steps:

  1. When the mlx5 driver loads, it initializes the IRQ pools.
     For sf_ctrl_pool with ≤64 SFs:
     - xa_num_irqs = {N, N} (there is only one slot)
  2. When the first SF is created:
     - The ctrl IRQ is allocated with mask=cpu_online_mask={0-191}
  3. We take CPU 20 offline.
  4. The existing ctrl IRQ still has mask={0-191}.
  5. Create a new SF:
     - req_mask={0-19,21-191}
     - iter_mask={0-191}
     - {0-191} is NOT a subset of {0-19,21-191}
     - least_loaded_irq=NULL
  6. Try to allocate a new IRQ via irq_pool_request_irq().
  7. xa_alloc() fails because the pool is full (there is only one slot).
  8. SF creation fails with an error.
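
The subset failure in the scenario above can be replayed in user space with a
small sketch. This is not driver code: it assumes 8 CPUs with CPU 4 taken
offline (instead of 192 CPUs and CPU 20), no used CPUs, and hypothetical
helper names that model cpumask_subset() and cpumask_and() on a 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_mask_t;

/* Models cpumask_subset(a, b): true if every CPU in a is also in b. */
static bool toy_subset(toy_mask_t a, toy_mask_t b)
{
	return (a & ~b) == 0;
}

/* Check before this series: compare the stored affinity mask as-is. */
static bool irq_matches_unfiltered(toy_mask_t iter_mask, toy_mask_t req_mask)
{
	return toy_subset(iter_mask, req_mask);
}

/* Check proposed in v1: drop offline CPUs from iter_mask first. */
static bool irq_matches_online_only(toy_mask_t iter_mask, toy_mask_t online,
				    toy_mask_t req_mask)
{
	return toy_subset(iter_mask & online, req_mask);
}

/* Replays the scenario with the assumed 8-CPU layout. */
static void replay_scenario(void)
{
	toy_mask_t all_cpus = 0xFF;			/* CPUs 0-7 */
	toy_mask_t online = all_cpus & ~(1ULL << 4);	/* CPU 4 offline */
	toy_mask_t iter_mask = all_cpus;		/* stale ctrl IRQ mask */
	toy_mask_t req_mask = online;			/* no used CPUs assumed */

	/* The stale mask still contains CPU 4, so the IRQ is skipped. */
	assert(!irq_matches_unfiltered(iter_mask, req_mask));
	/* After filtering to online CPUs, the same IRQ is accepted. */
	assert(irq_matches_online_only(iter_mask, online, req_mask));
}
```

With a single-slot pool, skipping the only existing IRQ is what forces the
allocation attempt that then fails.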


> > 
> > Filter the affinity mask to only include online CPUs before checking if it's
> > a subset of the requested mask, 
> 
> Won't this cause the affinity mask to be empty, which kind of misses
> the point of this API... :(

Yes, I didn't realize this.
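
One reading of the empty-mask concern, sketched in user space under assumed
values (8 CPUs, CPU 4 offline, an IRQ whose affinity is CPU 4 only): after
filtering, the mask is empty, and an empty set is a subset of every req_mask,
so the IRQ would match a request it shares no online CPU with. The helper
names are hypothetical and only model the cpumask operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_mask_t;

/* Models cpumask_subset(a, b): true if every CPU in a is also in b. */
static bool toy_subset(toy_mask_t a, toy_mask_t b)
{
	return (a & ~b) == 0;
}

/* Models the v1 check: filter iter_mask to online CPUs, then test subset. */
static bool v1_check(toy_mask_t iter_mask, toy_mask_t online,
		     toy_mask_t req_mask)
{
	return toy_subset(iter_mask & online, req_mask);
}

/* Shows that an IRQ bound only to an offline CPU passes the v1 check. */
static void empty_mask_pitfall(void)
{
	toy_mask_t online = 0xFF & ~(1ULL << 4);	/* CPU 4 offline */
	toy_mask_t iter_mask = 1ULL << 4;		/* IRQ bound to CPU 4 only */
	toy_mask_t req_mask = 1ULL << 0;		/* request wants CPU 0 only */

	/* Filtering leaves no CPUs at all. */
	assert((iter_mask & online) == 0);
	/* The empty mask is a subset of anything, so the check passes. */
	assert(v1_check(iter_mask, online, req_mask));
}
```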

> Can you check whether irq_get_effective_affinity_mask() solves the issue?
> 

Yes, I tested it, and irq_get_effective_affinity_mask() solves the issue.
I will send a v2 shortly.

-- 
Regards,
WANG

