linux-rdma.vger.kernel.org archive mirror
* [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
@ 2025-06-04 19:39 Yury Norov
  2025-06-04 19:39 ` [PATCH 1/7] cpumask: add cpumask_clear_cpus() Yury Norov
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

The driver uses the cpumask API in a non-optimal way, partly because of
the absence of proper helpers. Fix this and the nearby logic.

Yury Norov [NVIDIA] (7):
  cpumask: add cpumask_clear_cpus()
  RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
  RDMA: hfi1: simplify find_hw_thread_mask()
  RDMA: hfi1: simplify init_real_cpu_mask()
  RDMA: hfi1: use rounddown in find_hw_thread_mask()
  RDMA: hfi1: simplify hfi1_get_proc_affinity()
  RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c

 drivers/infiniband/hw/hfi1/affinity.c | 96 +++++++++++----------------
 include/linux/cpumask.h               | 12 ++++
 2 files changed, 49 insertions(+), 59 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/7] cpumask: add cpumask_clear_cpus()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 2/7] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask() Yury Norov
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

When a user wants to clear a range of CPUs in a cpumask, the only option
the API currently provides is a for-loop, like:

	for_each_cpu_from(cpu, mask) {
		if (cpu >= ncpus)
			break;
		__cpumask_clear_cpu(cpu, mask);
	}

In the bitmap API we have bitmap_clear() for that, which is
significantly faster than a for-loop. Propagate it to cpumasks.
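
With the new helper, the loop above collapses to a single call. A minimal
sketch, reusing 'cpu' and 'ncpus' from the loop (the cleared range
[cpu, ncpus) spans ncpus - cpu bits):

	cpumask_clear_cpus(mask, cpu, ncpus - cpu);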

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 7ae80a7ca81e..ede95bbe8b80 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -609,6 +609,18 @@ void __cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
 	__set_bit(cpumask_check(cpu), cpumask_bits(dstp));
 }
 
+/**
+ * cpumask_clear_cpus - clear cpus in a cpumask
+ * @dstp:  the cpumask pointer
+ * @cpu:   cpu number (< nr_cpu_ids)
+ * @ncpus: number of cpus to clear (< nr_cpu_ids)
+ */
+static __always_inline void cpumask_clear_cpus(struct cpumask *dstp,
+						unsigned int cpu, unsigned int ncpus)
+{
+	cpumask_check(cpu + ncpus - 1);
+	bitmap_clear(cpumask_bits(dstp), cpumask_check(cpu), ncpus);
+}
 
 /**
  * cpumask_clear_cpu - clear a cpu in a cpumask
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/7] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
  2025-06-04 19:39 ` [PATCH 1/7] cpumask: add cpumask_clear_cpus() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 3/7] RDMA: hfi1: simplify find_hw_thread_mask() Yury Norov
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

The function divides the number of online CPUs by num_core_siblings, and
only afterwards checks the divisor for zero. This leaves a window for a
divide-by-zero runtime error. Fix it by moving the check before the
division. This also saves one indentation level.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 44 +++++++++++++++------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 7ead8746b79b..f2c530ab85a5 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -964,31 +964,35 @@ static void find_hw_thread_mask(uint hw_thread_no, cpumask_var_t hw_thread_mask,
 				struct hfi1_affinity_node_list *affinity)
 {
 	int possible, curr_cpu, i;
-	uint num_cores_per_socket = node_affinity.num_online_cpus /
+	uint num_cores_per_socket;
+
+	cpumask_copy(hw_thread_mask, &affinity->proc.mask);
+
+	if (affinity->num_core_siblings == 0)
+		return;
+
+	num_cores_per_socket = node_affinity.num_online_cpus /
 					affinity->num_core_siblings /
 						node_affinity.num_online_nodes;
 
-	cpumask_copy(hw_thread_mask, &affinity->proc.mask);
-	if (affinity->num_core_siblings > 0) {
-		/* Removing other siblings not needed for now */
-		possible = cpumask_weight(hw_thread_mask);
-		curr_cpu = cpumask_first(hw_thread_mask);
-		for (i = 0;
-		     i < num_cores_per_socket * node_affinity.num_online_nodes;
-		     i++)
-			curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
-
-		for (; i < possible; i++) {
-			cpumask_clear_cpu(curr_cpu, hw_thread_mask);
-			curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
-		}
+	/* Removing other siblings not needed for now */
+	possible = cpumask_weight(hw_thread_mask);
+	curr_cpu = cpumask_first(hw_thread_mask);
+	for (i = 0;
+	     i < num_cores_per_socket * node_affinity.num_online_nodes;
+	     i++)
+		curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
 
-		/* Identifying correct HW threads within physical cores */
-		cpumask_shift_left(hw_thread_mask, hw_thread_mask,
-				   num_cores_per_socket *
-				   node_affinity.num_online_nodes *
-				   hw_thread_no);
+	for (; i < possible; i++) {
+		cpumask_clear_cpu(curr_cpu, hw_thread_mask);
+		curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
 	}
+
+	/* Identifying correct HW threads within physical cores */
+	cpumask_shift_left(hw_thread_mask, hw_thread_mask,
+			   num_cores_per_socket *
+			   node_affinity.num_online_nodes *
+			   hw_thread_no);
 }
 
 int hfi1_get_proc_affinity(int node)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/7] RDMA: hfi1: simplify find_hw_thread_mask()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
  2025-06-04 19:39 ` [PATCH 1/7] cpumask: add cpumask_clear_cpus() Yury Norov
  2025-06-04 19:39 ` [PATCH 2/7] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 4/7] RDMA: hfi1: simplify init_real_cpu_mask() Yury Norov
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

The function open-codes cpumask_nth() and cpumask_clear_cpus(). The
dedicated helpers are easier to use and usually much faster than
open-coded for-loops.
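
For illustration, the combined idiom looks roughly like this; a sketch
with made-up names, assuming the mask has at least n set CPUs so that
cpumask_nth() does not run past nr_cpu_ids:

	/* keep only the first n set CPUs of mask */
	cpu = cpumask_nth(n - 1, mask) + 1;
	cpumask_clear_cpus(mask, cpu, nr_cpu_ids - cpu);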

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index f2c530ab85a5..9ea80b777061 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -963,7 +963,7 @@ void hfi1_put_irq_affinity(struct hfi1_devdata *dd,
 static void find_hw_thread_mask(uint hw_thread_no, cpumask_var_t hw_thread_mask,
 				struct hfi1_affinity_node_list *affinity)
 {
-	int possible, curr_cpu, i;
+	int curr_cpu;
 	uint num_cores_per_socket;
 
 	cpumask_copy(hw_thread_mask, &affinity->proc.mask);
@@ -976,17 +976,9 @@ static void find_hw_thread_mask(uint hw_thread_no, cpumask_var_t hw_thread_mask,
 						node_affinity.num_online_nodes;
 
 	/* Removing other siblings not needed for now */
-	possible = cpumask_weight(hw_thread_mask);
-	curr_cpu = cpumask_first(hw_thread_mask);
-	for (i = 0;
-	     i < num_cores_per_socket * node_affinity.num_online_nodes;
-	     i++)
-		curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
-
-	for (; i < possible; i++) {
-		cpumask_clear_cpu(curr_cpu, hw_thread_mask);
-		curr_cpu = cpumask_next(curr_cpu, hw_thread_mask);
-	}
+	curr_cpu = cpumask_nth(num_cores_per_socket *
+			node_affinity.num_online_nodes, hw_thread_mask) + 1;
+	cpumask_clear_cpus(hw_thread_mask, curr_cpu, nr_cpu_ids - curr_cpu);
 
 	/* Identifying correct HW threads within physical cores */
 	cpumask_shift_left(hw_thread_mask, hw_thread_mask,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/7] RDMA: hfi1: simplify init_real_cpu_mask()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (2 preceding siblings ...)
  2025-06-04 19:39 ` [PATCH 3/7] RDMA: hfi1: simplify find_hw_thread_mask() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 5/7] RDMA: hfi1: use rounddown in find_hw_thread_mask() Yury Norov
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

The function open-codes cpumask_nth() and cpumask_clear_cpus(). The
dedicated helpers are easier to use and usually much faster than
open-coded for-loops.

While at it, drop the redundant cpumask_clear() of real_cpu_mask: the
mask is immediately overwritten by cpumask_copy().

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 9ea80b777061..b2884226827a 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -92,9 +92,7 @@ static void cpu_mask_set_put(struct cpu_mask_set *set, int cpu)
 /* Initialize non-HT cpu cores mask */
 void init_real_cpu_mask(void)
 {
-	int possible, curr_cpu, i, ht;
-
-	cpumask_clear(&node_affinity.real_cpu_mask);
+	int possible, curr_cpu, ht;
 
 	/* Start with cpu online mask as the real cpu mask */
 	cpumask_copy(&node_affinity.real_cpu_mask, cpu_online_mask);
@@ -110,17 +108,10 @@ void init_real_cpu_mask(void)
 	 * "real" cores.  Assumes that HT cores are not enumerated in
 	 * succession (except in the single core case).
 	 */
-	curr_cpu = cpumask_first(&node_affinity.real_cpu_mask);
-	for (i = 0; i < possible / ht; i++)
-		curr_cpu = cpumask_next(curr_cpu, &node_affinity.real_cpu_mask);
-	/*
-	 * Step 2.  Remove the remaining HT siblings.  Use cpumask_next() to
-	 * skip any gaps.
-	 */
-	for (; i < possible; i++) {
-		cpumask_clear_cpu(curr_cpu, &node_affinity.real_cpu_mask);
-		curr_cpu = cpumask_next(curr_cpu, &node_affinity.real_cpu_mask);
-	}
+	curr_cpu = cpumask_nth(possible / ht, &node_affinity.real_cpu_mask) + 1;
+
+	/* Step 2.  Remove the remaining HT siblings. */
+	cpumask_clear_cpus(&node_affinity.real_cpu_mask, curr_cpu, nr_cpu_ids - curr_cpu);
 }
 
 int node_affinity_init(void)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/7] RDMA: hfi1: use rounddown in find_hw_thread_mask()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (3 preceding siblings ...)
  2025-06-04 19:39 ` [PATCH 4/7] RDMA: hfi1: simplify init_real_cpu_mask() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 6/7] RDMA: hfi1: simplify hfi1_get_proc_affinity() Yury Norov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

num_cores_per_socket is calculated by dividing by
node_affinity.num_online_nodes, but every user of the variable multiplies
it by node_affinity.num_online_nodes again. This is effectively the same
as rounding the value down to a multiple of
node_affinity.num_online_nodes, so use rounddown() directly.
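
In unsigned integer arithmetic, (x / n) * n equals rounddown(x, n); for
example, x = 10 and n = 4 give 2 * 4 == 8 == rounddown(10, 4). Here x is
the per-sibling CPU count and n is node_affinity.num_online_nodes.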

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index b2884226827a..7fa894c23fea 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -955,27 +955,22 @@ static void find_hw_thread_mask(uint hw_thread_no, cpumask_var_t hw_thread_mask,
 				struct hfi1_affinity_node_list *affinity)
 {
 	int curr_cpu;
-	uint num_cores_per_socket;
+	uint num_cores;
 
 	cpumask_copy(hw_thread_mask, &affinity->proc.mask);
 
 	if (affinity->num_core_siblings == 0)
 		return;
 
-	num_cores_per_socket = node_affinity.num_online_cpus /
-					affinity->num_core_siblings /
-						node_affinity.num_online_nodes;
+	num_cores = rounddown(node_affinity.num_online_cpus / affinity->num_core_siblings,
+				node_affinity.num_online_nodes);
 
 	/* Removing other siblings not needed for now */
-	curr_cpu = cpumask_nth(num_cores_per_socket *
-			node_affinity.num_online_nodes, hw_thread_mask) + 1;
+	curr_cpu = cpumask_nth(num_cores * node_affinity.num_online_nodes, hw_thread_mask) + 1;
 	cpumask_clear_cpus(hw_thread_mask, curr_cpu, nr_cpu_ids - curr_cpu);
 
 	/* Identifying correct HW threads within physical cores */
-	cpumask_shift_left(hw_thread_mask, hw_thread_mask,
-			   num_cores_per_socket *
-			   node_affinity.num_online_nodes *
-			   hw_thread_no);
+	cpumask_shift_left(hw_thread_mask, hw_thread_mask, num_cores * hw_thread_no);
 }
 
 int hfi1_get_proc_affinity(int node)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/7] RDMA: hfi1: simplify hfi1_get_proc_affinity()
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (4 preceding siblings ...)
  2025-06-04 19:39 ` [PATCH 5/7] RDMA: hfi1: use rounddown in find_hw_thread_mask() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-04 19:39 ` [PATCH 7/7] RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c Yury Norov
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

The function guards the for loop with the affinity->num_core_siblings > 0
check, which is redundant: when num_core_siblings is 0, the loop
condition is false and the body is never entered.

Drop it and save one indentation level.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 28 +++++++++++++--------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 7fa894c23fea..8974aa1e63d1 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -1069,22 +1069,20 @@ int hfi1_get_proc_affinity(int node)
 	 * If HT cores are enabled, identify which HW threads within the
 	 * physical cores should be used.
 	 */
-	if (affinity->num_core_siblings > 0) {
-		for (i = 0; i < affinity->num_core_siblings; i++) {
-			find_hw_thread_mask(i, hw_thread_mask, affinity);
+	for (i = 0; i < affinity->num_core_siblings; i++) {
+		find_hw_thread_mask(i, hw_thread_mask, affinity);
 
-			/*
-			 * If there's at least one available core for this HW
-			 * thread number, stop looking for a core.
-			 *
-			 * diff will always be not empty at least once in this
-			 * loop as the used mask gets reset when
-			 * (set->mask == set->used) before this loop.
-			 */
-			cpumask_andnot(diff, hw_thread_mask, &set->used);
-			if (!cpumask_empty(diff))
-				break;
-		}
+		/*
+		 * If there's at least one available core for this HW
+		 * thread number, stop looking for a core.
+		 *
+		 * diff will always be not empty at least once in this
+		 * loop as the used mask gets reset when
+		 * (set->mask == set->used) before this loop.
+		 */
+		cpumask_andnot(diff, hw_thread_mask, &set->used);
+		if (!cpumask_empty(diff))
+			break;
 	}
 	hfi1_cdbg(PROC, "Same available HW thread on all physical CPUs: %*pbl",
 		  cpumask_pr_args(hw_thread_mask));
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/7] RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (5 preceding siblings ...)
  2025-06-04 19:39 ` [PATCH 6/7] RDMA: hfi1: simplify hfi1_get_proc_affinity() Yury Norov
@ 2025-06-04 19:39 ` Yury Norov
  2025-06-12  8:12 ` [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Leon Romanovsky
  2025-06-25 10:41 ` Leon Romanovsky
  8 siblings, 0 replies; 13+ messages in thread
From: Yury Norov @ 2025-06-04 19:39 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Leon Romanovsky, Yury Norov,
	Rasmus Villemoes, linux-rdma, linux-kernel

From: "Yury Norov [NVIDIA]" <yury.norov@gmail.com>

In a few places, the driver tests a cpumask for emptiness immediately
before calling functions that already report emptiness themselves.
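
Both helpers used below already encode emptiness in their return values:
cpumask_first() returns a value >= nr_cpu_ids for an empty mask, and
cpumask_andnot() returns false when the resulting mask is empty. A
minimal sketch of the pattern, with made-up mask names:

	if (cpumask_andnot(diff, avail, used))
		/* diff is non-empty: prefer it */
		cpumask_copy(avail, diff);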

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
---
 drivers/infiniband/hw/hfi1/affinity.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 8974aa1e63d1..ee7fedc67b86 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -337,9 +337,10 @@ static int _dev_comp_vect_cpu_get(struct hfi1_devdata *dd,
 		       &entry->def_intr.used);
 
 	/* If there are non-interrupt CPUs available, use them first */
-	if (!cpumask_empty(non_intr_cpus))
-		cpu = cpumask_first(non_intr_cpus);
-	else /* Otherwise, use interrupt CPUs */
+	cpu = cpumask_first(non_intr_cpus);
+
+	/* Otherwise, use interrupt CPUs */
+	if (cpu >= nr_cpu_ids)
 		cpu = cpumask_first(available_cpus);
 
 	if (cpu >= nr_cpu_ids) { /* empty */
@@ -1080,8 +1081,7 @@ int hfi1_get_proc_affinity(int node)
 		 * loop as the used mask gets reset when
 		 * (set->mask == set->used) before this loop.
 		 */
-		cpumask_andnot(diff, hw_thread_mask, &set->used);
-		if (!cpumask_empty(diff))
+		if (cpumask_andnot(diff, hw_thread_mask, &set->used))
 			break;
 	}
 	hfi1_cdbg(PROC, "Same available HW thread on all physical CPUs: %*pbl",
@@ -1113,8 +1113,7 @@ int hfi1_get_proc_affinity(int node)
 	 *    used for process assignments using the same method as
 	 *    the preferred NUMA node.
 	 */
-	cpumask_andnot(diff, available_mask, intrs_mask);
-	if (!cpumask_empty(diff))
+	if (cpumask_andnot(diff, available_mask, intrs_mask))
 		cpumask_copy(available_mask, diff);
 
 	/* If we don't have CPUs on the preferred node, use other NUMA nodes */
@@ -1130,8 +1129,7 @@ int hfi1_get_proc_affinity(int node)
 		 * At first, we don't want to place processes on the same
 		 * CPUs as interrupt handlers.
 		 */
-		cpumask_andnot(diff, available_mask, intrs_mask);
-		if (!cpumask_empty(diff))
+		if (cpumask_andnot(diff, available_mask, intrs_mask))
 			cpumask_copy(available_mask, diff);
 	}
 	hfi1_cdbg(PROC, "Possible CPUs for process: %*pbl",
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (6 preceding siblings ...)
  2025-06-04 19:39 ` [PATCH 7/7] RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c Yury Norov
@ 2025-06-12  8:12 ` Leon Romanovsky
  2025-06-21 15:03   ` Yury Norov
  2025-06-25 10:41 ` Leon Romanovsky
  8 siblings, 1 reply; 13+ messages in thread
From: Leon Romanovsky @ 2025-06-12  8:12 UTC (permalink / raw)
  To: Yury Norov, Dennis Dalessandro
  Cc: Jason Gunthorpe, Rasmus Villemoes, linux-rdma, linux-kernel

On Wed, Jun 04, 2025 at 03:39:36PM -0400, Yury Norov wrote:
> The driver uses cpumasks API in a non-optimal way; partially because of
> absence of proper functions. Fix this and nearby logic.
> 
> Yury Norov [NVIDIA] (7):
>   cpumask: add cpumask_clear_cpus()
>   RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
>   RDMA: hfi1: simplify find_hw_thread_mask()
>   RDMA: hfi1: simplify init_real_cpu_mask()
>   RDMA: hfi1: use rounddown in find_hw_thread_mask()
>   RDMA: hfi1: simplify hfi1_get_proc_affinity()
>   RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
> 
>  drivers/infiniband/hw/hfi1/affinity.c | 96 +++++++++++----------------
>  include/linux/cpumask.h               | 12 ++++
>  2 files changed, 49 insertions(+), 59 deletions(-)

Dennis?

> 
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
  2025-06-12  8:12 ` [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Leon Romanovsky
@ 2025-06-21 15:03   ` Yury Norov
  2025-06-23 16:23     ` Dennis Dalessandro
  0 siblings, 1 reply; 13+ messages in thread
From: Yury Norov @ 2025-06-21 15:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Dennis Dalessandro, Jason Gunthorpe, Rasmus Villemoes, linux-rdma,
	linux-kernel

On Thu, Jun 12, 2025 at 11:12:29AM +0300, Leon Romanovsky wrote:
> On Wed, Jun 04, 2025 at 03:39:36PM -0400, Yury Norov wrote:
> > The driver uses cpumasks API in a non-optimal way; partially because of
> > absence of proper functions. Fix this and nearby logic.
> > 
> > Yury Norov [NVIDIA] (7):
> >   cpumask: add cpumask_clear_cpus()
> >   RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
> >   RDMA: hfi1: simplify find_hw_thread_mask()
> >   RDMA: hfi1: simplify init_real_cpu_mask()
> >   RDMA: hfi1: use rounddown in find_hw_thread_mask()
> >   RDMA: hfi1: simplify hfi1_get_proc_affinity()
> >   RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
> > 
> >  drivers/infiniband/hw/hfi1/affinity.c | 96 +++++++++++----------------
> >  include/linux/cpumask.h               | 12 ++++
> >  2 files changed, 49 insertions(+), 59 deletions(-)
> 
> Dennis?

So?.. Any feedback?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
  2025-06-21 15:03   ` Yury Norov
@ 2025-06-23 16:23     ` Dennis Dalessandro
  2025-06-25 10:41       ` Leon Romanovsky
  0 siblings, 1 reply; 13+ messages in thread
From: Dennis Dalessandro @ 2025-06-23 16:23 UTC (permalink / raw)
  To: Yury Norov, Leon Romanovsky
  Cc: Jason Gunthorpe, Rasmus Villemoes, linux-rdma, linux-kernel

On 6/21/25 11:03 AM, Yury Norov wrote:
> On Thu, Jun 12, 2025 at 11:12:29AM +0300, Leon Romanovsky wrote:
>> On Wed, Jun 04, 2025 at 03:39:36PM -0400, Yury Norov wrote:
>>> The driver uses cpumasks API in a non-optimal way; partially because of
>>> absence of proper functions. Fix this and nearby logic.
>>>
>>> Yury Norov [NVIDIA] (7):
>>>   cpumask: add cpumask_clear_cpus()
>>>   RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
>>>   RDMA: hfi1: simplify find_hw_thread_mask()
>>>   RDMA: hfi1: simplify init_real_cpu_mask()
>>>   RDMA: hfi1: use rounddown in find_hw_thread_mask()
>>>   RDMA: hfi1: simplify hfi1_get_proc_affinity()
>>>   RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
>>>
>>>  drivers/infiniband/hw/hfi1/affinity.c | 96 +++++++++++----------------
>>>  include/linux/cpumask.h               | 12 ++++
>>>  2 files changed, 49 insertions(+), 59 deletions(-)
>>
>> Dennis?
> 
> So?.. Any feedback?

I'm ambivalent about this patch series. It looks OK but I don't think it's
really fixing anything.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
  2025-06-23 16:23     ` Dennis Dalessandro
@ 2025-06-25 10:41       ` Leon Romanovsky
  0 siblings, 0 replies; 13+ messages in thread
From: Leon Romanovsky @ 2025-06-25 10:41 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Yury Norov, Jason Gunthorpe, Rasmus Villemoes, linux-rdma,
	linux-kernel

On Mon, Jun 23, 2025 at 12:23:36PM -0400, Dennis Dalessandro wrote:
> On 6/21/25 11:03 AM, Yury Norov wrote:
> > On Thu, Jun 12, 2025 at 11:12:29AM +0300, Leon Romanovsky wrote:
> >> On Wed, Jun 04, 2025 at 03:39:36PM -0400, Yury Norov wrote:
> >>> The driver uses cpumasks API in a non-optimal way; partially because of
> >>> absence of proper functions. Fix this and nearby logic.
> >>>
> >>> Yury Norov [NVIDIA] (7):
> >>>   cpumask: add cpumask_clear_cpus()
> >>>   RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
> >>>   RDMA: hfi1: simplify find_hw_thread_mask()
> >>>   RDMA: hfi1: simplify init_real_cpu_mask()
> >>>   RDMA: hfi1: use rounddown in find_hw_thread_mask()
> >>>   RDMA: hfi1: simplify hfi1_get_proc_affinity()
> >>>   RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
> >>>
> >>>  drivers/infiniband/hw/hfi1/affinity.c | 96 +++++++++++----------------
> >>>  include/linux/cpumask.h               | 12 ++++
> >>>  2 files changed, 49 insertions(+), 59 deletions(-)
> >>
> >> Dennis?
> > 
> > So?.. Any feedback?
> 
> I'm ambivalent about this patch series. It looks OK but I don't think it's
> really fixing anything.

Yeah, I applied it because more code was deleted than added.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes
  2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
                   ` (7 preceding siblings ...)
  2025-06-12  8:12 ` [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Leon Romanovsky
@ 2025-06-25 10:41 ` Leon Romanovsky
  8 siblings, 0 replies; 13+ messages in thread
From: Leon Romanovsky @ 2025-06-25 10:41 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe, Rasmus Villemoes, linux-rdma,
	linux-kernel, Yury Norov


On Wed, 04 Jun 2025 15:39:36 -0400, Yury Norov wrote:
> The driver uses cpumasks API in a non-optimal way; partially because of
> absence of proper functions. Fix this and nearby logic.
> 
> Yury Norov [NVIDIA] (7):
>   cpumask: add cpumask_clear_cpus()
>   RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
>   RDMA: hfi1: simplify find_hw_thread_mask()
>   RDMA: hfi1: simplify init_real_cpu_mask()
>   RDMA: hfi1: use rounddown in find_hw_thread_mask()
>   RDMA: hfi1: simplify hfi1_get_proc_affinity()
>   RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
> 
> [...]

Applied, thanks!

[1/7] cpumask: add cpumask_clear_cpus()
      https://git.kernel.org/rdma/rdma/c/c15d5e70db9627
[2/7] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask()
      https://git.kernel.org/rdma/rdma/c/37b3cba54b404a
[3/7] RDMA: hfi1: simplify find_hw_thread_mask()
      https://git.kernel.org/rdma/rdma/c/f2c2afbba77c11
[4/7] RDMA: hfi1: simplify init_real_cpu_mask()
      https://git.kernel.org/rdma/rdma/c/9c965445a636b7
[5/7] RDMA: hfi1: use rounddown in find_hw_thread_mask()
      https://git.kernel.org/rdma/rdma/c/9370795029d41a
[6/7] RDMA: hfi1: simplify hfi1_get_proc_affinity()
      https://git.kernel.org/rdma/rdma/c/7c2c2f3a205b2b
[7/7] RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c
      https://git.kernel.org/rdma/rdma/c/185e34e8f249cb

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-06-25 10:41 UTC | newest]

Thread overview: 13+ messages
2025-06-04 19:39 [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Yury Norov
2025-06-04 19:39 ` [PATCH 1/7] cpumask: add cpumask_clear_cpus() Yury Norov
2025-06-04 19:39 ` [PATCH 2/7] RDMA: hfi1: fix possible divide-by-zero in find_hw_thread_mask() Yury Norov
2025-06-04 19:39 ` [PATCH 3/7] RDMA: hfi1: simplify find_hw_thread_mask() Yury Norov
2025-06-04 19:39 ` [PATCH 4/7] RDMA: hfi1: simplify init_real_cpu_mask() Yury Norov
2025-06-04 19:39 ` [PATCH 5/7] RDMA: hfi1: use rounddown in find_hw_thread_mask() Yury Norov
2025-06-04 19:39 ` [PATCH 6/7] RDMA: hfi1: simplify hfi1_get_proc_affinity() Yury Norov
2025-06-04 19:39 ` [PATCH 7/7] RDMI: hfi1: drop cpumask_empty() call in hfi1/affinity.c Yury Norov
2025-06-12  8:12 ` [PATCH 0/7] RDMA: hfi1: cpumasks usage fixes Leon Romanovsky
2025-06-21 15:03   ` Yury Norov
2025-06-23 16:23     ` Dennis Dalessandro
2025-06-25 10:41       ` Leon Romanovsky
2025-06-25 10:41 ` Leon Romanovsky
