public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage
@ 2024-12-28 18:49 Yury Norov
  2024-12-28 18:49 ` [PATCH 01/14] objpool: rework objpool_pop() Yury Norov
                   ` (14 more replies)
  0 siblings, 15 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Yury Norov, Rasmus Villemoes, Matt Wu, Steffen Klassert,
	Daniel Jordan, Andrew Morton, Greg Kurz, Peter Xu,
	Shrikanth Hegde, Hendrik Brueckner

cpumask_next_wrap() is overly complicated compared to its generic
counterpart, find_next_bit_wrap(), not to mention that it duplicates it.
This dates back to the time when the function was used to implement
the for_each_cpu_wrap() iterator. The function has two additional
parameters that were used to catch the loop termination condition for
the iterator. (Although only one of them is needed.)

Since 4fe49b3b97c262 ("lib/bitmap: introduce for_each_set_bit_wrap()
macro"), for_each_cpu_wrap() is wired to the corresponding generic
wrapping bitmap iterator, and the additional complexity of
cpumask_next_wrap() is no longer needed.

All existing users call cpumask_next_wrap() in a manner that makes
it possible to turn it into a straight and simple alias to
find_next_bit_wrap().

This series replaces historical 4-parameter cpumask_next_wrap() with a
thin 2-parameter wrapper around find_next_bit_wrap().

Where it's possible to use the for_each_cpu_wrap() iterator, the code is
switched to it, because iterators are always preferable to open-coded
loops.

This series touches various scattered subsystems, and the To-list for
the whole series is quite long. To minimize noise, I'm sending the cover
letter and the key patches #5 and #6 to every person involved. All other
patches are sent individually to those identified by
scripts/get_maintainers.pl.

I'd like to merge the series as a whole via my bitmap-for-next branch.

Yury Norov (14):
  objpool: rework objpool_pop()
  virtio_net: simplify virtnet_set_affinity()
  ibmvnic: simplify ibmvnic_set_queue_affinity()
  powerpc/xmon: simplify xmon_batch_next_cpu()
  cpumask: deprecate cpumask_next_wrap()
  cpumask: re-introduce cpumask_next_wrap()
  cpumask: use cpumask_next_wrap() where appropriate
  padata: switch padata_find_next() to using cpumask_next_wrap()
  s390: switch stop_machine_yield() to using cpumask_next_wrap()
  nvme-tcp: switch nvme_tcp_set_queue_io_cpu() to using
    cpumask_next_wrap()
  scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap()
  scsi: lpfc: rework lpfc_next_{online,present}_cpu()
  PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using
    cpumask_next_wrap()
  cpumask: drop cpumask_next_wrap_old()

 arch/powerpc/xmon/xmon.c            |  6 +---
 arch/s390/kernel/processor.c        |  2 +-
 drivers/net/ethernet/ibm/ibmvnic.c  | 17 +++++-----
 drivers/net/virtio_net.c            | 12 +++++---
 drivers/nvme/host/tcp.c             |  2 +-
 drivers/pci/controller/pci-hyperv.c |  3 +-
 drivers/scsi/lpfc/lpfc.h            | 23 +++-----------
 drivers/scsi/lpfc/lpfc_init.c       |  2 +-
 include/linux/cpumask.h             | 48 ++++++++++++++++-------------
 include/linux/objpool.h             |  7 ++---
 kernel/padata.c                     |  2 +-
 lib/cpumask.c                       | 37 ++--------------------
 12 files changed, 60 insertions(+), 101 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/14] objpool: rework objpool_pop()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-28 18:49 ` [PATCH 02/14] virtio_net: simplify virtnet_set_affinity() Yury Norov
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, Matt Wu; +Cc: Yury Norov, Rasmus Villemoes

The function has to track the number of iterations to prevent an
infinite loop. The for_each_cpu_wrap() macro takes care of that, which
simplifies the user code.

Similarly to the for_each_possible_cpu() flavor, this patch introduces a
for_each_possible_cpu_wrap() version of the iterator to keep usage of
the API simple and coherent.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 6 ++++++
 include/linux/objpool.h | 7 +++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d514f..5cf69a110c1c 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1033,11 +1033,17 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
 #define for_each_possible_cpu(cpu)	for ((cpu) = 0; (cpu) < 1; (cpu)++)
 #define for_each_online_cpu(cpu)	for ((cpu) = 0; (cpu) < 1; (cpu)++)
 #define for_each_present_cpu(cpu)	for ((cpu) = 0; (cpu) < 1; (cpu)++)
+
+#define for_each_possible_cpu_wrap(cpu, start)	\
+	for ((void)(start), (cpu) = 0; (cpu) < 1; (cpu)++)
 #else
 #define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
 #define for_each_online_cpu(cpu)   for_each_cpu((cpu), cpu_online_mask)
 #define for_each_enabled_cpu(cpu)   for_each_cpu((cpu), cpu_enabled_mask)
 #define for_each_present_cpu(cpu)  for_each_cpu((cpu), cpu_present_mask)
+
+#define for_each_possible_cpu_wrap(cpu, start)	\
+	for_each_cpu_wrap((cpu), cpu_possible_mask, (start))
 #endif
 
 /* Wrappers for arch boot code to manipulate normally-constant masks */
diff --git a/include/linux/objpool.h b/include/linux/objpool.h
index cb1758eaa2d3..b713a1fe7521 100644
--- a/include/linux/objpool.h
+++ b/include/linux/objpool.h
@@ -170,17 +170,16 @@ static inline void *objpool_pop(struct objpool_head *pool)
 {
 	void *obj = NULL;
 	unsigned long flags;
-	int i, cpu;
+	int start, cpu;
 
 	/* disable local irq to avoid preemption & interruption */
 	raw_local_irq_save(flags);
 
-	cpu = raw_smp_processor_id();
-	for (i = 0; i < pool->nr_possible_cpus; i++) {
+	start = raw_smp_processor_id();
+	for_each_possible_cpu_wrap(cpu, start) {
 		obj = __objpool_try_get_slot(pool, cpu);
 		if (obj)
 			break;
-		cpu = cpumask_next_wrap(cpu, cpu_possible_mask, -1, 1);
 	}
 	raw_local_irq_restore(flags);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/14] virtio_net: simplify virtnet_set_affinity()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
  2024-12-28 18:49 ` [PATCH 01/14] objpool: rework objpool_pop() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-28 18:49 ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity() Yury Norov
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, netdev, virtualization, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Yury Norov, Rasmus Villemoes

The inner loop may be replaced with the dedicated
for_each_online_cpu_wrap() iterator. It helps to avoid setting the same
bits in the @mask more than once, in case group_size is greater than the
number of online CPUs.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/virtio_net.c | 12 +++++++-----
 include/linux/cpumask.h  |  4 ++++
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7646ddd9bef7..5e266486de1f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3826,7 +3826,7 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
 	cpumask_var_t mask;
 	int stragglers;
 	int group_size;
-	int i, j, cpu;
+	int i, start = 0, cpu;
 	int num_cpu;
 	int stride;
 
@@ -3840,16 +3840,18 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
 	stragglers = num_cpu >= vi->curr_queue_pairs ?
 			num_cpu % vi->curr_queue_pairs :
 			0;
-	cpu = cpumask_first(cpu_online_mask);
 
 	for (i = 0; i < vi->curr_queue_pairs; i++) {
 		group_size = stride + (i < stragglers ? 1 : 0);
 
-		for (j = 0; j < group_size; j++) {
+		for_each_online_cpu_wrap(cpu, start) {
+			if (!group_size--)
+				break;
 			cpumask_set_cpu(cpu, mask);
-			cpu = cpumask_next_wrap(cpu, cpu_online_mask,
-						nr_cpu_ids, false);
 		}
+
+		start = cpu < nr_cpu_ids ? cpu + 1 : start;
+
 		virtqueue_set_affinity(vi->rq[i].vq, mask);
 		virtqueue_set_affinity(vi->sq[i].vq, mask);
 		__netif_set_xps_queue(vi->dev, cpumask_bits(mask), i, XPS_CPUS);
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 5cf69a110c1c..30042351f15f 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -1036,6 +1036,8 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
 
 #define for_each_possible_cpu_wrap(cpu, start)	\
 	for ((void)(start), (cpu) = 0; (cpu) < 1; (cpu)++)
+#define for_each_online_cpu_wrap(cpu, start)	\
+	for ((void)(start), (cpu) = 0; (cpu) < 1; (cpu)++)
 #else
 #define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
 #define for_each_online_cpu(cpu)   for_each_cpu((cpu), cpu_online_mask)
@@ -1044,6 +1046,8 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
 
 #define for_each_possible_cpu_wrap(cpu, start)	\
 	for_each_cpu_wrap((cpu), cpu_possible_mask, (start))
+#define for_each_online_cpu_wrap(cpu, start)	\
+	for_each_cpu_wrap((cpu), cpu_online_mask, (start))
 #endif
 
 /* Wrappers for arch boot code to manipulate normally-constant masks */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
  2024-12-28 18:49 ` [PATCH 01/14] objpool: rework objpool_pop() Yury Norov
  2024-12-28 18:49 ` [PATCH 02/14] virtio_net: simplify virtnet_set_affinity() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2025-01-07 22:37   ` Nick Child
  2024-12-28 18:49 ` [PATCH 04/14] powerpc/xmon: simplify xmon_batch_next_cpu() Yury Norov
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, netdev, linuxppc-dev, Haren Myneni, Rick Lindsley,
	Nick Child, Thomas Falcon, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Yury Norov, Rasmus Villemoes

A loop based on cpumask_next_wrap() open-codes the dedicated macro
for_each_online_cpu_wrap(). Using the macro avoids setting bits in the
affinity mask more than once when stride >= num_online_cpus().

This also helps to drop cpumask handling code in the caller function.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index e95ae0d39948..4cfd90fb206b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
 		(*stragglers)--;
 	}
 	/* atomic write is safer than writing bit by bit directly */
-	for (i = 0; i < stride; i++) {
-		cpumask_set_cpu(*cpu, mask);
-		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
-					 nr_cpu_ids, false);
+	for_each_online_cpu_wrap(i, *cpu) {
+		if (!stride--)
+			break;
+		cpumask_set_cpu(i, mask);
 	}
+
+	/* For the next queue we start from the first unused CPU in this queue */
+	if (i < nr_cpu_ids)
+		*cpu = i + 1;
+
 	/* set queue affinity mask */
 	cpumask_copy(queue->affinity_mask, mask);
 	rc = irq_set_affinity_and_hint(queue->irq, queue->affinity_mask);
@@ -256,7 +261,7 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
 	int num_rxqs = adapter->num_active_rx_scrqs, i_rxqs = 0;
 	int num_txqs = adapter->num_active_tx_scrqs, i_txqs = 0;
 	int total_queues, stride, stragglers, i;
-	unsigned int num_cpu, cpu;
+	unsigned int num_cpu, cpu = 0;
 	bool is_rx_queue;
 	int rc = 0;
 
@@ -274,8 +279,6 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
 	stride = max_t(int, num_cpu / total_queues, 1);
 	/* number of leftover cpu's */
 	stragglers = num_cpu >= total_queues ? num_cpu % total_queues : 0;
-	/* next available cpu to assign irq to */
-	cpu = cpumask_next(-1, cpu_online_mask);
 
 	for (i = 0; i < total_queues; i++) {
 		is_rx_queue = false;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/14] powerpc/xmon: simplify xmon_batch_next_cpu()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (2 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-28 18:49 ` [PATCH 05/14] cpumask: deprecate cpumask_next_wrap() Yury Norov
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan
  Cc: Yury Norov, Rasmus Villemoes

The function open-codes the for_each_cpu_wrap() macro. As the loop
termination condition it uses cpumask_empty(), which is O(N), making the
whole algorithm O(N^2). Switching to for_each_cpu_wrap() simplifies the
logic and makes the algorithm linear.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/powerpc/xmon/xmon.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index f4e841a36458..d7809f15dc68 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -1271,11 +1271,7 @@ static int xmon_batch_next_cpu(void)
 {
 	unsigned long cpu;
 
-	while (!cpumask_empty(&xmon_batch_cpus)) {
-		cpu = cpumask_next_wrap(smp_processor_id(), &xmon_batch_cpus,
-					xmon_batch_start_cpu, true);
-		if (cpu >= nr_cpu_ids)
-			break;
+	for_each_cpu_wrap(cpu, &xmon_batch_cpus, xmon_batch_start_cpu) {
 		if (xmon_batch_start_cpu == -1)
 			xmon_batch_start_cpu = cpu;
 		if (xmon_switch_cpu(cpu))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/14] cpumask: deprecate cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (3 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 04/14] powerpc/xmon: simplify xmon_batch_next_cpu() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2025-01-03 17:39   ` Bjorn Helgaas
  2024-12-28 18:49 ` [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap() Yury Norov
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Yury Norov, Rasmus Villemoes, Matt Wu, Steffen Klassert,
	Daniel Jordan, Andrew Morton, Greg Kurz, Peter Xu,
	Shrikanth Hegde, Hendrik Brueckner

The next patch aligns the implementation of cpumask_next_wrap() with the
generic version in find.h, which changes the function signature.

To make the transition smooth, this patch deprecates the current
implementation by adding an _old suffix. The following patches switch
the current users to the new implementation one by one.

No functional changes were intended.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/s390/kernel/processor.c        | 2 +-
 drivers/nvme/host/tcp.c             | 2 +-
 drivers/pci/controller/pci-hyperv.c | 2 +-
 drivers/scsi/lpfc/lpfc_init.c       | 2 +-
 include/linux/cpumask.h             | 4 ++--
 kernel/padata.c                     | 2 +-
 lib/cpumask.c                       | 6 +++---
 7 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c
index 5ce9a795a0fe..42ca61909030 100644
--- a/arch/s390/kernel/processor.c
+++ b/arch/s390/kernel/processor.c
@@ -72,7 +72,7 @@ void notrace stop_machine_yield(const struct cpumask *cpumask)
 	this_cpu = smp_processor_id();
 	if (__this_cpu_inc_return(cpu_relax_retry) >= spin_retry) {
 		__this_cpu_write(cpu_relax_retry, 0);
-		cpu = cpumask_next_wrap(this_cpu, cpumask, this_cpu, false);
+		cpu = cpumask_next_wrap_old(this_cpu, cpumask, this_cpu, false);
 		if (cpu >= nr_cpu_ids)
 			return;
 		if (arch_vcpu_is_preempted(cpu))
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 28c76a3e1bd2..054904376c3c 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1578,7 +1578,7 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 	if (wq_unbound)
 		queue->io_cpu = WORK_CPU_UNBOUND;
 	else
-		queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
+		queue->io_cpu = cpumask_next_wrap_old(n - 1, cpu_online_mask, -1, false);
 }
 
 static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index cdd5be16021d..86d1c2be8eb5 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1757,7 +1757,7 @@ static int hv_compose_multi_msi_req_get_cpu(void)
 
 	spin_lock_irqsave(&multi_msi_cpu_lock, flags);
 
-	cpu_next = cpumask_next_wrap(cpu_next, cpu_online_mask, nr_cpu_ids,
+	cpu_next = cpumask_next_wrap_old(cpu_next, cpu_online_mask, nr_cpu_ids,
 				     false);
 	cpu = cpu_next;
 
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 7f57397d91a9..31622fb0614a 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12876,7 +12876,7 @@ lpfc_irq_rebalance(struct lpfc_hba *phba, unsigned int cpu, bool offline)
 
 	if (offline) {
 		/* Find next online CPU on original mask */
-		cpu_next = cpumask_next_wrap(cpu, orig_mask, cpu, true);
+		cpu_next = cpumask_next_wrap_old(cpu, orig_mask, cpu, true);
 		cpu_select = lpfc_next_online_cpu(orig_mask, cpu_next);
 
 		/* Found a valid CPU */
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 30042351f15f..b267a4f6a917 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -296,7 +296,7 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 
 #if NR_CPUS == 1
 static __always_inline
-unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
+unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
 {
 	cpumask_check(start);
 	if (n != -1)
@@ -312,7 +312,7 @@ unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, boo
 	return cpumask_first(mask);
 }
 #else
-unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap);
+unsigned int __pure cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap);
 #endif
 
 /**
diff --git a/kernel/padata.c b/kernel/padata.c
index d51bbc76b227..454ff2fca40b 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -274,7 +274,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
 	if (remove_object) {
 		list_del_init(&padata->list);
 		++pd->processed;
-		pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, false);
+		pd->cpu = cpumask_next_wrap_old(cpu, pd->cpumask.pcpu, -1, false);
 	}
 
 	spin_unlock(&reorder->lock);
diff --git a/lib/cpumask.c b/lib/cpumask.c
index e77ee9d46f71..c9a9b451772a 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -8,7 +8,7 @@
 #include <linux/numa.h>
 
 /**
- * cpumask_next_wrap - helper to implement for_each_cpu_wrap
+ * cpumask_next_wrap_old - helper to implement for_each_cpu_wrap
  * @n: the cpu prior to the place to search
  * @mask: the cpumask pointer
  * @start: the start point of the iteration
@@ -19,7 +19,7 @@
  * Note: the @wrap argument is required for the start condition when
  * we cannot assume @start is set in @mask.
  */
-unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
+unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
 {
 	unsigned int next;
 
@@ -37,7 +37,7 @@ unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, boo
 
 	return next;
 }
-EXPORT_SYMBOL(cpumask_next_wrap);
+EXPORT_SYMBOL(cpumask_next_wrap_old);
 
 /* These are not inline because of header tangles. */
 #ifdef CONFIG_CPUMASK_OFFSTACK
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (4 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 05/14] cpumask: deprecate cpumask_next_wrap() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2025-01-03 17:44   ` Bjorn Helgaas
  2025-01-07 13:28   ` Alexander Gordeev
  2024-12-28 18:49 ` [PATCH 07/14] cpumask: use cpumask_next_wrap() where appropriate Yury Norov
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Yury Norov, Rasmus Villemoes, Matt Wu, Steffen Klassert,
	Daniel Jordan, Andrew Morton, Greg Kurz, Peter Xu,
	Shrikanth Hegde, Hendrik Brueckner

cpumask_next_wrap_old() has two additional parameters compared to its
analogue in linux/find.h, find_next_bit_wrap(). The reason for that is
historical.

Before 4fe49b3b97c262 ("lib/bitmap: introduce for_each_set_bit_wrap()
macro"), cpumask_next_wrap() was used to implement the
for_each_cpu_wrap() iterator. Now that the iterator is an alias to the
generic for_each_set_bit_wrap(), the additional parameters aren't used
and may confuse readers.

All existing users call cpumask_next_wrap() in a way that makes it
possible to turn it into a straight and simple alias to
find_next_bit_wrap().

In a couple of places, kernel users open-code the missing
cpumask_next_and_wrap(). Add it as well.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index b267a4f6a917..18c9908d50c4 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -284,6 +284,43 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 		small_cpumask_bits, n + 1);
 }
 
+/**
+ * cpumask_next_and_wrap - get the next cpu in *src1p & *src2p, starting from
+ *			   @n and wrapping around, if needed
+ * @n: the cpu prior to the place to search (i.e. return will be > @n)
+ * @src1p: the first cpumask pointer
+ * @src2p: the second cpumask pointer
+ *
+ * Return: >= nr_cpu_ids if no further cpus set in both.
+ */
+static __always_inline
+unsigned int cpumask_next_and_wrap(int n, const struct cpumask *src1p,
+			      const struct cpumask *src2p)
+{
+	/* -1 is a legal arg here. */
+	if (n != -1)
+		cpumask_check(n);
+	return find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
+		small_cpumask_bits, n + 1);
+}
+
+/*
+ * cpumask_next_wrap - get the next cpu in *src, starting from
+ *			   @n and wrapping around, if needed
+ * @n: the cpu prior to the place to search
+ * @src: cpumask pointer
+ *
+ * Return: >= nr_cpu_ids if no further cpus set in @src.
+ */
+static __always_inline
+unsigned int cpumask_next_wrap(int n, const struct cpumask *src)
+{
+	/* -1 is a legal arg here. */
+	if (n != -1)
+		cpumask_check(n);
+	return find_next_bit_wrap(cpumask_bits(src), small_cpumask_bits, n + 1);
+}
+
 /**
  * for_each_cpu - iterate over every cpu in a mask
  * @cpu: the (optionally unsigned) integer iterator
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/14] cpumask: use cpumask_next_wrap() where appropriate
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (5 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-28 18:49 ` [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap() Yury Norov
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, Yury Norov, Rasmus Villemoes, Andrew Morton

Now that cpumask_next{_and}_wrap() is wired to the generic
find_next_bit_wrap(), we can use it in cpumask_any{_and}_distribute().

This automatically makes the cpumask_*_distribute() functions use
small_cpumask_bits instead of nr_cpumask_bits, which is itself a good
optimization.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/cpumask.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/cpumask.c b/lib/cpumask.c
index c9a9b451772a..d7cce2aaebc2 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -174,8 +174,7 @@ unsigned int cpumask_any_and_distribute(const struct cpumask *src1p,
 	/* NOTE: our first selection will skip 0. */
 	prev = __this_cpu_read(distribute_cpu_mask_prev);
 
-	next = find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
-					nr_cpumask_bits, prev + 1);
+	next = cpumask_next_and_wrap(prev, src1p, src2p);
 	if (next < nr_cpu_ids)
 		__this_cpu_write(distribute_cpu_mask_prev, next);
 
@@ -195,7 +194,7 @@ unsigned int cpumask_any_distribute(const struct cpumask *srcp)
 
 	/* NOTE: our first selection will skip 0. */
 	prev = __this_cpu_read(distribute_cpu_mask_prev);
-	next = find_next_bit_wrap(cpumask_bits(srcp), nr_cpumask_bits, prev + 1);
+	next = cpumask_next_wrap(prev, srcp);
 	if (next < nr_cpu_ids)
 		__this_cpu_write(distribute_cpu_mask_prev, next);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (6 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 07/14] cpumask: use cpumask_next_wrap() where appropriate Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2025-01-04  0:33   ` Herbert Xu
  2025-01-07 19:02   ` Daniel Jordan
  2024-12-28 18:49 ` [PATCH 09/14] s390: switch stop_machine_yield() " Yury Norov
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linux-crypto, Steffen Klassert, Daniel Jordan
  Cc: Yury Norov, Rasmus Villemoes

Calling cpumask_next_wrap_old() with a starting CPU of -1 effectively
means a request to find the next CPU, wrapping around if needed.

cpumask_next_wrap() is the proper replacement for that.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 454ff2fca40b..a886e5bf028c 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -274,7 +274,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
 	if (remove_object) {
 		list_del_init(&padata->list);
 		++pd->processed;
-		pd->cpu = cpumask_next_wrap_old(cpu, pd->cpumask.pcpu, -1, false);
+		pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu);
 	}
 
 	spin_unlock(&reorder->lock);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/14] s390: switch stop_machine_yield() to using cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (7 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-28 18:49 ` [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() " Yury Norov
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linux-s390, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Hendrik Brueckner
  Cc: Yury Norov, Rasmus Villemoes

Calling cpumask_next_wrap_old() with the starting CPU equal to the
wrapping CPU effectively means a request to find the next CPU, wrapping
around if needed.

cpumask_next_wrap() is the proper replacement for that.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/s390/kernel/processor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c
index 42ca61909030..745649ad9779 100644
--- a/arch/s390/kernel/processor.c
+++ b/arch/s390/kernel/processor.c
@@ -72,7 +72,7 @@ void notrace stop_machine_yield(const struct cpumask *cpumask)
 	this_cpu = smp_processor_id();
 	if (__this_cpu_inc_return(cpu_relax_retry) >= spin_retry) {
 		__this_cpu_write(cpu_relax_retry, 0);
-		cpu = cpumask_next_wrap_old(this_cpu, cpumask, this_cpu, false);
+		cpu = cpumask_next_wrap(this_cpu, cpumask);
 		if (cpu >= nr_cpu_ids)
 			return;
 		if (arch_vcpu_is_preempted(cpu))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() to using cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (8 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 09/14] s390: switch stop_machine_yield() " Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-30  8:31   ` Sagi Grimberg
  2024-12-28 18:49 ` [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() " Yury Norov
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-nvme, linux-kernel, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg
  Cc: Yury Norov, Rasmus Villemoes

Calling cpumask_next_wrap_old() with a starting CPU of -1 is effectively
the same as a request to find the next CPU, wrapping around if needed.

cpumask_next_wrap() is the proper replacement for that.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/nvme/host/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 054904376c3c..088101c57f53 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1578,7 +1578,7 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
 	if (wq_unbound)
 		queue->io_cpu = WORK_CPU_UNBOUND;
 	else
-		queue->io_cpu = cpumask_next_wrap_old(n - 1, cpu_online_mask, -1, false);
+		queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask);
 }
 
 static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (9 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() " Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-31  1:05   ` Justin Tee
  2024-12-28 18:49 ` [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu() Yury Norov
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, linux-scsi, James Smart, Dick Kennedy,
	James E.J. Bottomley, Martin K. Petersen
  Cc: Yury Norov, Rasmus Villemoes

Calling cpumask_next_wrap_old() with the starting CPU equal to the wrapping
CPU is the same as a request to find the next CPU, wrapping around if needed.

cpumask_next_wrap() is the proper replacement for that.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 31622fb0614a..e94a7b8973a7 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -12876,7 +12876,7 @@ lpfc_irq_rebalance(struct lpfc_hba *phba, unsigned int cpu, bool offline)
 
 	if (offline) {
 		/* Find next online CPU on original mask */
-		cpu_next = cpumask_next_wrap_old(cpu, orig_mask, cpu, true);
+		cpu_next = cpumask_next_wrap(cpu, orig_mask);
 		cpu_select = lpfc_next_online_cpu(orig_mask, cpu_next);
 
 		/* Found a valid CPU */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (10 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() " Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-31  1:06   ` Justin Tee
  2024-12-28 18:49 ` [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() Yury Norov
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-scsi, linux-kernel, James Smart, Dick Kennedy,
	James E.J. Bottomley, Martin K. Petersen, Yury Norov,
	Rasmus Villemoes

lpfc_next_online_cpu() open-codes cpumask_next_and_wrap() with a
for-loop. Use the helper instead and make lpfc_next_online_cpu() a
plain one-liner.

While there, rework lpfc_next_present_cpu() similarly. Note that
cpumask_next() followed by cpumask_first() may, in the worst case of
an empty mask, traverse the mask twice. cpumask_next_wrap() handles
that correctly in a single bounded pass.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/scsi/lpfc/lpfc.h | 23 +++++------------------
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index e5a9c5a323f8..62438e84e52a 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -1715,18 +1715,12 @@ lpfc_phba_elsring(struct lpfc_hba *phba)
  * Note: If no valid cpu found, then nr_cpu_ids is returned.
  *
  **/
-static inline unsigned int
+static __always_inline unsigned int
 lpfc_next_online_cpu(const struct cpumask *mask, unsigned int start)
 {
-	unsigned int cpu_it;
-
-	for_each_cpu_wrap(cpu_it, mask, start) {
-		if (cpu_online(cpu_it))
-			break;
-	}
-
-	return cpu_it;
+	return cpumask_next_and_wrap(start, mask, cpu_online_mask);
 }
+
 /**
  * lpfc_next_present_cpu - Finds next present CPU after n
  * @n: the cpu prior to search
@@ -1734,16 +1728,9 @@ lpfc_next_online_cpu(const struct cpumask *mask, unsigned int start)
  * Note: If no next present cpu, then fallback to first present cpu.
  *
  **/
-static inline unsigned int lpfc_next_present_cpu(int n)
+static __always_inline unsigned int lpfc_next_present_cpu(int n)
 {
-	unsigned int cpu;
-
-	cpu = cpumask_next(n, cpu_present_mask);
-
-	if (cpu >= nr_cpu_ids)
-		cpu = cpumask_first(cpu_present_mask);
-
-	return cpu;
+	return cpumask_next_wrap(n, cpu_present_mask);
 }
 
 /**
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (11 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2024-12-29 17:34   ` Michael Kelley
  2025-01-03 17:45   ` Bjorn Helgaas
  2024-12-28 18:49 ` [PATCH 14/14] cpumask: drop cpumask_next_wrap_old() Yury Norov
  2025-01-03  7:02 ` [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Christoph Hellwig
  14 siblings, 2 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-hyperv, linux-pci, linux-kernel, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
	Bjorn Helgaas
  Cc: Yury Norov, Rasmus Villemoes

Calling cpumask_next_wrap_old() with a starting CPU of nr_cpu_ids
is effectively the same as a request to find the first CPU, starting
from a given one and wrapping around if needed.

cpumask_next_wrap() is a proper replacement for that.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 drivers/pci/controller/pci-hyperv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 86d1c2be8eb5..f8ebf98248b3 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1757,8 +1757,7 @@ static int hv_compose_multi_msi_req_get_cpu(void)
 
 	spin_lock_irqsave(&multi_msi_cpu_lock, flags);
 
-	cpu_next = cpumask_next_wrap_old(cpu_next, cpu_online_mask, nr_cpu_ids,
-				     false);
+	cpu_next = cpumask_next_wrap(cpu_next, cpu_online_mask);
 	cpu = cpu_next;
 
 	spin_unlock_irqrestore(&multi_msi_cpu_lock, flags);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 14/14] cpumask: drop cpumask_next_wrap_old()
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (12 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() Yury Norov
@ 2024-12-28 18:49 ` Yury Norov
  2025-01-03  7:02 ` [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Christoph Hellwig
  14 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2024-12-28 18:49 UTC (permalink / raw)
  To: linux-kernel, Yury Norov, Rasmus Villemoes, Andrew Morton

Now that cpumask_next_wrap() is wired to the generic find_next_bit_wrap(),
the old implementation is no longer needed.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 21 ---------------------
 lib/cpumask.c           | 32 --------------------------------
 2 files changed, 53 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index d9a3d0ea2df1..80fa8c9bfa68 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -311,27 +311,6 @@ unsigned int cpumask_next_wrap(int n, const struct cpumask *src)
 #define for_each_cpu(cpu, mask)				\
 	for_each_set_bit(cpu, cpumask_bits(mask), small_cpumask_bits)
 
-#if NR_CPUS == 1
-static __always_inline
-unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
-{
-	cpumask_check(start);
-	if (n != -1)
-		cpumask_check(n);
-
-	/*
-	 * Return the first available CPU when wrapping, or when starting before cpu0,
-	 * since there is only one valid option.
-	 */
-	if (wrap && n >= 0)
-		return nr_cpumask_bits;
-
-	return cpumask_first(mask);
-}
-#else
-unsigned int __pure cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap);
-#endif
-
 /**
  * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
  * @cpu: the (optionally unsigned) integer iterator
diff --git a/lib/cpumask.c b/lib/cpumask.c
index d7cce2aaebc2..fbf7630f2ac9 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -7,38 +7,6 @@
 #include <linux/memblock.h>
 #include <linux/numa.h>
 
-/**
- * cpumask_next_wrap_old - helper to implement for_each_cpu_wrap
- * @n: the cpu prior to the place to search
- * @mask: the cpumask pointer
- * @start: the start point of the iteration
- * @wrap: assume @n crossing @start terminates the iteration
- *
- * Return: >= nr_cpu_ids on completion
- *
- * Note: the @wrap argument is required for the start condition when
- * we cannot assume @start is set in @mask.
- */
-unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
-{
-	unsigned int next;
-
-again:
-	next = cpumask_next(n, mask);
-
-	if (wrap && n < start && next >= start) {
-		return nr_cpumask_bits;
-
-	} else if (next >= nr_cpumask_bits) {
-		wrap = true;
-		n = -1;
-		goto again;
-	}
-
-	return next;
-}
-EXPORT_SYMBOL(cpumask_next_wrap_old);
-
 /* These are not inline because of header tangles. */
 #ifdef CONFIG_CPUMASK_OFFSTACK
 /**
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* RE: [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() Yury Norov
@ 2024-12-29 17:34   ` Michael Kelley
  2025-01-03 17:45   ` Bjorn Helgaas
  1 sibling, 0 replies; 34+ messages in thread
From: Michael Kelley @ 2024-12-29 17:34 UTC (permalink / raw)
  To: Yury Norov, linux-hyperv@vger.kernel.org,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas
  Cc: Rasmus Villemoes

From: Yury Norov <yury.norov@gmail.com> Sent: Saturday, December 28, 2024 10:50 AM
> 
> Calling cpumask_next_wrap_old() with a starting CPU of nr_cpu_ids
> is effectively the same as a request to find the first CPU, starting
> from a given one and wrapping around if needed.
> 
> cpumask_next_wrap() is a proper replacement for that.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/pci/controller/pci-hyperv.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 86d1c2be8eb5..f8ebf98248b3 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1757,8 +1757,7 @@ static int hv_compose_multi_msi_req_get_cpu(void)
> 
>  	spin_lock_irqsave(&multi_msi_cpu_lock, flags);
> 
> -	cpu_next = cpumask_next_wrap_old(cpu_next, cpu_online_mask, nr_cpu_ids,
> -				     false);
> +	cpu_next = cpumask_next_wrap(cpu_next, cpu_online_mask);
>  	cpu = cpu_next;
> 
>  	spin_unlock_irqrestore(&multi_msi_cpu_lock, flags);
> --
> 2.43.0
> 

I remember reviewing the patch that originally added this use of
cpumask_next_wrap(). The two extra parameters were really
hard to understand. Nice to see them go away!

Reviewed-by: Michael Kelley <mhklinux@outlook.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() " Yury Norov
@ 2024-12-30  8:31   ` Sagi Grimberg
  0 siblings, 0 replies; 34+ messages in thread
From: Sagi Grimberg @ 2024-12-30  8:31 UTC (permalink / raw)
  To: Yury Norov, linux-nvme, linux-kernel, Keith Busch, Jens Axboe,
	Christoph Hellwig
  Cc: Rasmus Villemoes




On 28/12/2024 20:49, Yury Norov wrote:
> Calling cpumask_next_wrap_old() with a starting CPU of -1 is effectively
> the same as a request to find the next CPU, wrapping around if needed.
>
> cpumask_next_wrap() is the proper replacement for that.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>   drivers/nvme/host/tcp.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 054904376c3c..088101c57f53 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1578,7 +1578,7 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
>   	if (wq_unbound)
>   		queue->io_cpu = WORK_CPU_UNBOUND;
>   	else
> -		queue->io_cpu = cpumask_next_wrap_old(n - 1, cpu_online_mask, -1, false);
> +		queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask);
>   }
>   
>   static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)

Acked-by: Sagi Grimberg <sagi@grimberg.me>

Note that this will conflict with another outstanding patch:
https://lore.kernel.org/linux-nvme/20241224120457.576100-1-sagi@grimberg.me/T/#u 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() " Yury Norov
@ 2024-12-31  1:05   ` Justin Tee
  0 siblings, 0 replies; 34+ messages in thread
From: Justin Tee @ 2024-12-31  1:05 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linux-scsi, Justin Tee, James Smart, Dick Kennedy,
	James E.J. Bottomley, Martin K. Petersen, Rasmus Villemoes

Reviewed-by: Justin Tee <justin.tee@broadcom.com>

Regards,
Justin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu()
  2024-12-28 18:49 ` [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu() Yury Norov
@ 2024-12-31  1:06   ` Justin Tee
  0 siblings, 0 replies; 34+ messages in thread
From: Justin Tee @ 2024-12-31  1:06 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-scsi, linux-kernel, Justin Tee, James Smart, Dick Kennedy,
	James E.J. Bottomley, Martin K. Petersen, Rasmus Villemoes

Reviewed-by: Justin Tee <justin.tee@broadcom.com>

Regards,
Justin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage
  2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
                   ` (13 preceding siblings ...)
  2024-12-28 18:49 ` [PATCH 14/14] cpumask: drop cpumask_next_wrap_old() Yury Norov
@ 2025-01-03  7:02 ` Christoph Hellwig
  2025-01-03 15:21   ` Yury Norov
  14 siblings, 1 reply; 34+ messages in thread
From: Christoph Hellwig @ 2025-01-03  7:02 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Rasmus Villemoes, Matt Wu, Steffen Klassert, Daniel Jordan,
	Andrew Morton, Greg Kurz, Peter Xu, Shrikanth Hegde,
	Hendrik Brueckner

You've sent me fewer than a handful of the 14 patches; there's no way
to properly review this.


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage
  2025-01-03  7:02 ` [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Christoph Hellwig
@ 2025-01-03 15:21   ` Yury Norov
  0 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2025-01-03 15:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Sagi Grimberg,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Rasmus Villemoes, Matt Wu, Steffen Klassert, Daniel Jordan,
	Andrew Morton, Greg Kurz, Peter Xu, Shrikanth Hegde,
	Hendrik Brueckner

On Fri, Jan 03, 2025 at 08:02:29AM +0100, Christoph Hellwig wrote:
> You've sent me fewer than a handful of the 14 patches; there's no way
> to properly review this.

Hi Christoph,

You can find the whole series here:

https://lore.kernel.org/linux-scsi/CABPRKS-uqfJmDp5pS+hSnvzggdMv0bNawpsVNpY4aU4V+UdR7Q@mail.gmail.com/T/

Or you can download it by message ID like this:

b4 mbox 20241228184949.31582-1-yury.norov@gmail.com

Sorry for not CC-ing you on the whole series. Some people prefer to
receive minimal noise, and you never know who is who. If it comes to
v2, you'll be in CC for every patch.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 05/14] cpumask: deprecate cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 05/14] cpumask: deprecate cpumask_next_wrap() Yury Norov
@ 2025-01-03 17:39   ` Bjorn Helgaas
  0 siblings, 0 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2025-01-03 17:39 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Rasmus Villemoes, Matt Wu, Steffen Klassert, Daniel Jordan,
	Andrew Morton, Greg Kurz, Peter Xu, Shrikanth Hegde,
	Hendrik Brueckner

On Sat, Dec 28, 2024 at 10:49:37AM -0800, Yury Norov wrote:
> The next patche aligns implementation of cpumask_next_wrap() with the
> generic version in find.h which changes function signature.

s/patche/patch/

I guess this is an indirect reference to find_next_bit_wrap()?  If so,
I think mentioning the function name would be more useful than
referring to "the generic version in find.h".

> To make the transition smooth, this patch deprecates current
> implementation by adding an _old suffix. The following patches switch
> current users to the new implementation one by one.
> 
> No functional changes were intended.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  arch/s390/kernel/processor.c        | 2 +-
>  drivers/nvme/host/tcp.c             | 2 +-
>  drivers/pci/controller/pci-hyperv.c | 2 +-
>  drivers/scsi/lpfc/lpfc_init.c       | 2 +-
>  include/linux/cpumask.h             | 4 ++--
>  kernel/padata.c                     | 2 +-
>  lib/cpumask.c                       | 6 +++---
>  7 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c
> index 5ce9a795a0fe..42ca61909030 100644
> --- a/arch/s390/kernel/processor.c
> +++ b/arch/s390/kernel/processor.c
> @@ -72,7 +72,7 @@ void notrace stop_machine_yield(const struct cpumask *cpumask)
>  	this_cpu = smp_processor_id();
>  	if (__this_cpu_inc_return(cpu_relax_retry) >= spin_retry) {
>  		__this_cpu_write(cpu_relax_retry, 0);
> -		cpu = cpumask_next_wrap(this_cpu, cpumask, this_cpu, false);
> +		cpu = cpumask_next_wrap_old(this_cpu, cpumask, this_cpu, false);
>  		if (cpu >= nr_cpu_ids)
>  			return;
>  		if (arch_vcpu_is_preempted(cpu))
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 28c76a3e1bd2..054904376c3c 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1578,7 +1578,7 @@ static void nvme_tcp_set_queue_io_cpu(struct nvme_tcp_queue *queue)
>  	if (wq_unbound)
>  		queue->io_cpu = WORK_CPU_UNBOUND;
>  	else
> -		queue->io_cpu = cpumask_next_wrap(n - 1, cpu_online_mask, -1, false);
> +		queue->io_cpu = cpumask_next_wrap_old(n - 1, cpu_online_mask, -1, false);
>  }
>  
>  static void nvme_tcp_tls_done(void *data, int status, key_serial_t pskid)
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index cdd5be16021d..86d1c2be8eb5 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1757,7 +1757,7 @@ static int hv_compose_multi_msi_req_get_cpu(void)
>  
>  	spin_lock_irqsave(&multi_msi_cpu_lock, flags);
>  
> -	cpu_next = cpumask_next_wrap(cpu_next, cpu_online_mask, nr_cpu_ids,
> +	cpu_next = cpumask_next_wrap_old(cpu_next, cpu_online_mask, nr_cpu_ids,
>  				     false);
>  	cpu = cpu_next;
>  
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 7f57397d91a9..31622fb0614a 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -12876,7 +12876,7 @@ lpfc_irq_rebalance(struct lpfc_hba *phba, unsigned int cpu, bool offline)
>  
>  	if (offline) {
>  		/* Find next online CPU on original mask */
> -		cpu_next = cpumask_next_wrap(cpu, orig_mask, cpu, true);
> +		cpu_next = cpumask_next_wrap_old(cpu, orig_mask, cpu, true);
>  		cpu_select = lpfc_next_online_cpu(orig_mask, cpu_next);
>  
>  		/* Found a valid CPU */
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 30042351f15f..b267a4f6a917 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -296,7 +296,7 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
>  
>  #if NR_CPUS == 1
>  static __always_inline
> -unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
> +unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
>  {
>  	cpumask_check(start);
>  	if (n != -1)
> @@ -312,7 +312,7 @@ unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, boo
>  	return cpumask_first(mask);
>  }
>  #else
> -unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap);
> +unsigned int __pure cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap);
>  #endif
>  
>  /**
> diff --git a/kernel/padata.c b/kernel/padata.c
> index d51bbc76b227..454ff2fca40b 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -274,7 +274,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>  	if (remove_object) {
>  		list_del_init(&padata->list);
>  		++pd->processed;
> -		pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, false);
> +		pd->cpu = cpumask_next_wrap_old(cpu, pd->cpumask.pcpu, -1, false);
>  	}
>  
>  	spin_unlock(&reorder->lock);
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index e77ee9d46f71..c9a9b451772a 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -8,7 +8,7 @@
>  #include <linux/numa.h>
>  
>  /**
> - * cpumask_next_wrap - helper to implement for_each_cpu_wrap
> + * cpumask_next_wrap_old - helper to implement for_each_cpu_wrap
>   * @n: the cpu prior to the place to search
>   * @mask: the cpumask pointer
>   * @start: the start point of the iteration
> @@ -19,7 +19,7 @@
>   * Note: the @wrap argument is required for the start condition when
>   * we cannot assume @start is set in @mask.
>   */
> -unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
> +unsigned int cpumask_next_wrap_old(int n, const struct cpumask *mask, int start, bool wrap)
>  {
>  	unsigned int next;
>  
> @@ -37,7 +37,7 @@ unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, boo
>  
>  	return next;
>  }
> -EXPORT_SYMBOL(cpumask_next_wrap);
> +EXPORT_SYMBOL(cpumask_next_wrap_old);
>  
>  /* These are not inline because of header tangles. */
>  #ifdef CONFIG_CPUMASK_OFFSTACK
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap()
  2024-12-28 18:49 ` [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap() Yury Norov
@ 2025-01-03 17:44   ` Bjorn Helgaas
  2025-01-15  3:41     ` Yury Norov
  2025-01-07 13:28   ` Alexander Gordeev
  1 sibling, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-01-03 17:44 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Rasmus Villemoes, Matt Wu, Steffen Klassert, Daniel Jordan,
	Andrew Morton, Greg Kurz, Peter Xu, Shrikanth Hegde,
	Hendrik Brueckner

On Sat, Dec 28, 2024 at 10:49:38AM -0800, Yury Norov wrote:
> cpumask_next_wrap_old() has two additional parameters, comparing to it's
> analogue in linux/find.h find_next_bit_wrap(). The reason for that is
> historical.

s/it's/its/

Personally I think cscope/tags/git grep make "find_next_bit_wrap()"
enough even without mentioning "linux/find.h".

> + * cpumask_next_and_wrap - get the next cpu in *src1p & *src2p, starting from
> + *			   @n and wrapping around, if needed
> + * @n: the cpu prior to the place to search (i.e. return will be > @n)

Is the return really > @n if it wraps?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() Yury Norov
  2024-12-29 17:34   ` Michael Kelley
@ 2025-01-03 17:45   ` Bjorn Helgaas
  2025-01-03 18:56     ` Yury Norov
  1 sibling, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-01-03 17:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-hyperv, linux-pci, linux-kernel, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
	Bjorn Helgaas, Rasmus Villemoes

On Sat, Dec 28, 2024 at 10:49:45AM -0800, Yury Norov wrote:
> Calling cpumask_next_wrap_old() with a starting CPU of nr_cpu_ids
> is effectively the same as a request to find the first CPU, starting
> from a given one and wrapping around if needed.
> 
> cpumask_next_wrap() is a proper replacement for that.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

s/switch/Switch/ in subject to match history.

Since this depends on previous patches, I assume you'll merge them all
together, so:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/controller/pci-hyperv.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index 86d1c2be8eb5..f8ebf98248b3 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1757,8 +1757,7 @@ static int hv_compose_multi_msi_req_get_cpu(void)
>  
>  	spin_lock_irqsave(&multi_msi_cpu_lock, flags);
>  
> -	cpu_next = cpumask_next_wrap_old(cpu_next, cpu_online_mask, nr_cpu_ids,
> -				     false);
> +	cpu_next = cpumask_next_wrap(cpu_next, cpu_online_mask);
>  	cpu = cpu_next;
>  
>  	spin_unlock_irqrestore(&multi_msi_cpu_lock, flags);
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
  2025-01-03 17:45   ` Bjorn Helgaas
@ 2025-01-03 18:56     ` Yury Norov
  0 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2025-01-03 18:56 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-hyperv, linux-pci, linux-kernel, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
	Bjorn Helgaas, Rasmus Villemoes

On Fri, Jan 03, 2025 at 11:45:43AM -0600, Bjorn Helgaas wrote:
> On Sat, Dec 28, 2024 at 10:49:45AM -0800, Yury Norov wrote:
> > Calling cpumask_next_wrap_old() with a starting CPU of nr_cpu_ids
> > is effectively the same as a request to find the first CPU, starting
> > from a given one and wrapping around if needed.
> > 
> > cpumask_next_wrap() is a proper replacement for that.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> 
> s/switch/Switch/ in subject to match history.
> 
> Since this depends on previous patches, I assume you'll merge them all
> together, so:
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Hi Bjorn,

Thanks for review!

I agree with everything you spotted. I'll fix it in v2 if one is
needed, or in place when applying.

Thanks,
Yury

* Re: [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap() Yury Norov
@ 2025-01-04  0:33   ` Herbert Xu
  2025-01-07 19:02   ` Daniel Jordan
  1 sibling, 0 replies; 34+ messages in thread
From: Herbert Xu @ 2025-01-04  0:33 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linux-crypto, steffen.klassert, daniel.m.jordan,
	yury.norov, linux

Yury Norov <yury.norov@gmail.com> wrote:
> Calling cpumask_next_wrap_old() with starting CPU == -1 effectively means
> the request to find next CPU, wrapping around if needed.
> 
> cpumask_next_wrap() is the proper replacement for that.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
> kernel/padata.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap()
  2024-12-28 18:49 ` [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap() Yury Norov
  2025-01-03 17:44   ` Bjorn Helgaas
@ 2025-01-07 13:28   ` Alexander Gordeev
  2025-01-15  3:38     ` Yury Norov
  1 sibling, 1 reply; 34+ messages in thread
From: Alexander Gordeev @ 2025-01-07 13:28 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Sven Schnelle, Haren Myneni, Rick Lindsley,
	Nick Child, Thomas Falcon, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, Keith Busch,
	Jens Axboe, Christoph Hellwig, Sagi Grimberg, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
	Bjorn Helgaas, James Smart, Dick Kennedy, James E.J. Bottomley,
	Martin K. Petersen, Rasmus Villemoes, Matt Wu, Steffen Klassert,
	Daniel Jordan, Andrew Morton, Greg Kurz, Peter Xu,
	Shrikanth Hegde, Hendrik Brueckner

On Sat, Dec 28, 2024 at 10:49:38AM -0800, Yury Norov wrote:

Hi Yury,

> cpumask_next_wrap_old() has two additional parameters, comparing to it's
> analogue in linux/find.h find_next_bit_wrap(). The reason for that is
> historical.
> 
> Before 4fe49b3b97c262 ("lib/bitmap: introduce for_each_set_bit_wrap()
> macro"), cpumask_next_wrap() was used to implement for_each_cpu_wrap()
> iterator. Now that the iterator is an alias to generic
> for_each_set_bit_wrap(), the additional parameters aren't used and may
> confuse readers.
> 
> All existing users call cpumask_next_wrap() in a way that makes it
> possible to turn it to straight and simple alias to find_next_bit_wrap().
> 
> In a couple places kernel users opencode missing cpumask_next_and_wrap().
> Add it as well.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/linux/cpumask.h | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index b267a4f6a917..18c9908d50c4 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -284,6 +284,43 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
>  		small_cpumask_bits, n + 1);
>  }
>  
> +/**
> + * cpumask_next_and_wrap - get the next cpu in *src1p & *src2p, starting from
> + *			   @n and wrapping around, if needed
> + * @n: the cpu prior to the place to search (i.e. return will be > @n)
> + * @src1p: the first cpumask pointer
> + * @src2p: the second cpumask pointer
> + *
> + * Return: >= nr_cpu_ids if no further cpus set in both.
> + */
> +static __always_inline
> +unsigned int cpumask_next_and_wrap(int n, const struct cpumask *src1p,
> +			      const struct cpumask *src2p)
> +{
> +	/* -1 is a legal arg here. */
> +	if (n != -1)
> +		cpumask_check(n);
> +	return find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
> +		small_cpumask_bits, n + 1);
> +}
> +
> +/*
> + * cpumask_next_wrap - get the next cpu in *src, starting from
> + *			   @n and wrapping around, if needed

Does it mean the search wraps a cpumask and starts from the beginning
if the bit is not found and returns >= nr_cpu_ids if @n crosses itself?

> + * @n: the cpu prior to the place to search
> + * @src: cpumask pointer
> + *
> + * Return: >= nr_cpu_ids if no further cpus set in both.

It looks like Return is a cpumask_next_and_wrap() comment leftover.

> + */
> +static __always_inline
> +unsigned int cpumask_next_wrap(int n, const struct cpumask *src)
> +{
> +	/* -1 is a legal arg here. */
> +	if (n != -1)
> +		cpumask_check(n);
> +	return find_next_bit_wrap(cpumask_bits(src), small_cpumask_bits, n + 1);
> +}
> +
>  /**
>   * for_each_cpu - iterate over every cpu in a mask
>   * @cpu: the (optionally unsigned) integer iterator

Thanks!
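
Alexander's question about the wrap semantics can be illustrated with a
small userspace model. This is a sketch only: the helper name and the
single-word 'unsigned long' mask are assumptions for illustration, not
the kernel's find_next_bit_wrap() implementation.

```c
#include <assert.h>

/*
 * Sketch of a wrapping bit search: scan [start, nbits) first, then
 * wrap around and scan [0, start). Returns nbits only when no bit
 * is set anywhere in the mask.
 */
static unsigned int next_bit_wrap(unsigned long mask, unsigned int nbits,
				  unsigned int start)
{
	unsigned int i;

	/* Scan [start, nbits) first... */
	for (i = start; i < nbits; i++)
		if (mask & (1UL << i))
			return i;

	/* ...then wrap around and scan [0, start). */
	for (i = 0; i < start; i++)
		if (mask & (1UL << i))
			return i;

	return nbits;	/* only reached when no bit is set at all */
}
```

So the search does wrap to the beginning, and it only returns >= nbits
(nr_cpu_ids in the cpumask case) when the mask is empty; a lone set bit
at @n itself is still found after the wrap, because the cpumask wrappers
start searching from @n + 1.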

* Re: [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap()
  2024-12-28 18:49 ` [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap() Yury Norov
  2025-01-04  0:33   ` Herbert Xu
@ 2025-01-07 19:02   ` Daniel Jordan
  1 sibling, 0 replies; 34+ messages in thread
From: Daniel Jordan @ 2025-01-07 19:02 UTC (permalink / raw)
  To: Yury Norov; +Cc: linux-kernel, linux-crypto, Steffen Klassert, Rasmus Villemoes

On Sat, Dec 28, 2024 at 10:49:40AM -0800, Yury Norov wrote:
> Calling cpumask_next_wrap_old() with starting CPU == -1 effectively means
> the request to find next CPU, wrapping around if needed.
> 
> cpumask_next_wrap() is the proper replacement for that.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>

* Re: [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()
  2024-12-28 18:49 ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity() Yury Norov
@ 2025-01-07 22:37   ` Nick Child
  2025-01-07 22:42     ` Yury Norov
  0 siblings, 1 reply; 34+ messages in thread
From: Nick Child @ 2025-01-07 22:37 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, netdev, linuxppc-dev, Haren Myneni, Rick Lindsley,
	Thomas Falcon, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rasmus Villemoes

On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:
> A loop based on cpumask_next_wrap() opencodes the dedicated macro
> for_each_online_cpu_wrap(). Using the macro allows to avoid setting
> bits affinity mask more than once when stride >= num_online_cpus.
> 
> This also helps to drop cpumask handling code in the caller function.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index e95ae0d39948..4cfd90fb206b 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
>  		(*stragglers)--;
>  	}
>  	/* atomic write is safer than writing bit by bit directly */
> -	for (i = 0; i < stride; i++) {
> -		cpumask_set_cpu(*cpu, mask);
> -		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
> -					 nr_cpu_ids, false);
> +	for_each_online_cpu_wrap(i, *cpu) {
> +		if (!stride--)
> +			break;
> +		cpumask_set_cpu(i, mask);
>  	}
> +
> +	/* For the next queue we start from the first unused CPU in this queue */
> +	if (i < nr_cpu_ids)
> +		*cpu = i + 1;
> +
This should read '*cpu = i', since the loop breaks after incrementing i.
Thanks!

>  	/* set queue affinity mask */
>  	cpumask_copy(queue->affinity_mask, mask);
>  	rc = irq_set_affinity_and_hint(queue->irq, queue->affinity_mask);
> @@ -256,7 +261,7 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
>  	int num_rxqs = adapter->num_active_rx_scrqs, i_rxqs = 0;
>  	int num_txqs = adapter->num_active_tx_scrqs, i_txqs = 0;
>  	int total_queues, stride, stragglers, i;
> -	unsigned int num_cpu, cpu;
> +	unsigned int num_cpu, cpu = 0;
>  	bool is_rx_queue;
>  	int rc = 0;
>  
> @@ -274,8 +279,6 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
>  	stride = max_t(int, num_cpu / total_queues, 1);
>  	/* number of leftover cpu's */
>  	stragglers = num_cpu >= total_queues ? num_cpu % total_queues : 0;
> -	/* next available cpu to assign irq to */
> -	cpu = cpumask_next(-1, cpu_online_mask);
>  
>  	for (i = 0; i < total_queues; i++) {
>  		is_rx_queue = false;
> -- 
> 2.43.0
> 

* Re: [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()
  2025-01-07 22:37   ` Nick Child
@ 2025-01-07 22:42     ` Yury Norov
  2025-01-07 23:04       ` Yury Norov
  0 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2025-01-07 22:42 UTC (permalink / raw)
  To: Nick Child
  Cc: linux-kernel, netdev, linuxppc-dev, Haren Myneni, Rick Lindsley,
	Thomas Falcon, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rasmus Villemoes

On Tue, Jan 07, 2025 at 04:37:17PM -0600, Nick Child wrote:
> On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:
> > A loop based on cpumask_next_wrap() opencodes the dedicated macro
> > for_each_online_cpu_wrap(). Using the macro allows to avoid setting
> > bits affinity mask more than once when stride >= num_online_cpus.
> > 
> > This also helps to drop cpumask handling code in the caller function.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
> >  1 file changed, 10 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> > index e95ae0d39948..4cfd90fb206b 100644
> > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > @@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
> >  		(*stragglers)--;
> >  	}
> >  	/* atomic write is safer than writing bit by bit directly */
> > -	for (i = 0; i < stride; i++) {
> > -		cpumask_set_cpu(*cpu, mask);
> > -		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
> > -					 nr_cpu_ids, false);
> > +	for_each_online_cpu_wrap(i, *cpu) {
> > +		if (!stride--)
> > +			break;
> > +		cpumask_set_cpu(i, mask);
> >  	}
> > +
> > +	/* For the next queue we start from the first unused CPU in this queue */
> > +	if (i < nr_cpu_ids)
> > +		*cpu = i + 1;
> > +
> This should read '*cpu = i'. Since the loop breaks after incrementing i.
> Thanks!

cpumask_next_wrap() does the '+ 1' for you, while for_each_cpu_wrap()
starts exactly where you point. So this '+ 1' needs to be explicit now.

Does that make sense?

> 
> >  	/* set queue affinity mask */
> >  	cpumask_copy(queue->affinity_mask, mask);
> >  	rc = irq_set_affinity_and_hint(queue->irq, queue->affinity_mask);
> > @@ -256,7 +261,7 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
> >  	int num_rxqs = adapter->num_active_rx_scrqs, i_rxqs = 0;
> >  	int num_txqs = adapter->num_active_tx_scrqs, i_txqs = 0;
> >  	int total_queues, stride, stragglers, i;
> > -	unsigned int num_cpu, cpu;
> > +	unsigned int num_cpu, cpu = 0;
> >  	bool is_rx_queue;
> >  	int rc = 0;
> >  
> > @@ -274,8 +279,6 @@ static void ibmvnic_set_affinity(struct ibmvnic_adapter *adapter)
> >  	stride = max_t(int, num_cpu / total_queues, 1);
> >  	/* number of leftover cpu's */
> >  	stragglers = num_cpu >= total_queues ? num_cpu % total_queues : 0;
> > -	/* next available cpu to assign irq to */
> > -	cpu = cpumask_next(-1, cpu_online_mask);
> >  
> >  	for (i = 0; i < total_queues; i++) {
> >  		is_rx_queue = false;
> > -- 
> > 2.43.0
> > 

* Re: [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()
  2025-01-07 22:42     ` Yury Norov
@ 2025-01-07 23:04       ` Yury Norov
  2025-01-08 14:08         ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()y Nick Child
  0 siblings, 1 reply; 34+ messages in thread
From: Yury Norov @ 2025-01-07 23:04 UTC (permalink / raw)
  To: Nick Child
  Cc: linux-kernel, netdev, linuxppc-dev, Haren Myneni, Rick Lindsley,
	Thomas Falcon, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rasmus Villemoes

On Tue, Jan 07, 2025 at 02:43:01PM -0800, Yury Norov wrote:
> On Tue, Jan 07, 2025 at 04:37:17PM -0600, Nick Child wrote:
> > On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:
> > > A loop based on cpumask_next_wrap() opencodes the dedicated macro
> > > for_each_online_cpu_wrap(). Using the macro allows to avoid setting
> > > bits affinity mask more than once when stride >= num_online_cpus.
> > > 
> > > This also helps to drop cpumask handling code in the caller function.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > ---
> > >  drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
> > >  1 file changed, 10 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> > > index e95ae0d39948..4cfd90fb206b 100644
> > > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > > @@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
> > >  		(*stragglers)--;
> > >  	}
> > >  	/* atomic write is safer than writing bit by bit directly */
> > > -	for (i = 0; i < stride; i++) {
> > > -		cpumask_set_cpu(*cpu, mask);
> > > -		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
> > > -					 nr_cpu_ids, false);
> > > +	for_each_online_cpu_wrap(i, *cpu) {
> > > +		if (!stride--)
> > > +			break;
> > > +		cpumask_set_cpu(i, mask);
> > >  	}
> > > +
> > > +	/* For the next queue we start from the first unused CPU in this queue */
> > > +	if (i < nr_cpu_ids)
> > > +		*cpu = i + 1;
> > > +
> > This should read '*cpu = i'. Since the loop breaks after incrementing i.
> > Thanks!
> 
> cpumask_next_wrap() makes '+ 1' for you. The for_each_cpu_wrap() starts
> exactly where you point. So, this '+1' needs to be explicit now.
> 
> Does that make sense?

Ah, I think I see what you mean. It should be like this, right?

  for_each_online_cpu_wrap(i, *cpu) {
  	if (!stride--) {
        	*cpu = i + 1;
  		break;
        }
  	cpumask_set_cpu(i, mask);
  }

* Re: [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()y
  2025-01-07 23:04       ` Yury Norov
@ 2025-01-08 14:08         ` Nick Child
  0 siblings, 0 replies; 34+ messages in thread
From: Nick Child @ 2025-01-08 14:08 UTC (permalink / raw)
  To: Yury Norov, y
  Cc: linux-kernel, netdev, linuxppc-dev, Haren Myneni, Rick Lindsley,
	Thomas Falcon, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N Rao, Madhavan Srinivasan,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rasmus Villemoes

On Tue, Jan 07, 2025 at 03:04:40PM -0800, Yury Norov wrote:
> On Tue, Jan 07, 2025 at 02:43:01PM -0800, Yury Norov wrote:
> > On Tue, Jan 07, 2025 at 04:37:17PM -0600, Nick Child wrote:
> > > On Sat, Dec 28, 2024 at 10:49:35AM -0800, Yury Norov wrote:
> > > > A loop based on cpumask_next_wrap() opencodes the dedicated macro
> > > > for_each_online_cpu_wrap(). Using the macro allows to avoid setting
> > > > bits affinity mask more than once when stride >= num_online_cpus.
> > > > 
> > > > This also helps to drop cpumask handling code in the caller function.
> > > > 
> > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > ---
> > > >  drivers/net/ethernet/ibm/ibmvnic.c | 17 ++++++++++-------
> > > >  1 file changed, 10 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> > > > index e95ae0d39948..4cfd90fb206b 100644
> > > > --- a/drivers/net/ethernet/ibm/ibmvnic.c
> > > > +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> > > > @@ -234,11 +234,16 @@ static int ibmvnic_set_queue_affinity(struct ibmvnic_sub_crq_queue *queue,
> > > >  		(*stragglers)--;
> > > >  	}
> > > >  	/* atomic write is safer than writing bit by bit directly */
> > > > -	for (i = 0; i < stride; i++) {
> > > > -		cpumask_set_cpu(*cpu, mask);
> > > > -		*cpu = cpumask_next_wrap(*cpu, cpu_online_mask,
> > > > -					 nr_cpu_ids, false);
> > > > +	for_each_online_cpu_wrap(i, *cpu) {
> > > > +		if (!stride--)
> > > > +			break;
> > > > +		cpumask_set_cpu(i, mask);
> > > >  	}
> > > > +
> > > > +	/* For the next queue we start from the first unused CPU in this queue */
> > > > +	if (i < nr_cpu_ids)
> > > > +		*cpu = i + 1;
> > > > +
> > > This should read '*cpu = i'. Since the loop breaks after incrementing i.
> > > Thanks!
> > 
> > cpumask_next_wrap() makes '+ 1' for you. The for_each_cpu_wrap() starts
> > exactly where you point. So, this '+1' needs to be explicit now.
> > 
> > Does that make sense?
> 
> Ah, I think I see what you mean. It should be like this, right?
> 
>   for_each_online_cpu_wrap(i, *cpu) {
>   	if (!stride--) {
>         	*cpu = i + 1;
>   		break;
>         }
>   	cpumask_set_cpu(i, mask);
>   }
Not quite: for_each_online_cpu_wrap() will increment i to point to the
next online cpu, then enter the body of the loop. When we break (because
stride is zero), we exit the loop early before i is added to any mask,
so i is the next unassigned online cpu.
I tested this to make sure; we see unused CPUs (#7, #23) with the patch as is:
  IRQ : 256 -> ibmvnic-30000003-tx0
	/proc/irq/256/smp_affinity_list:0-6
  IRQ : 257 -> ibmvnic-30000003-tx1
	/proc/irq/257/smp_affinity_list:16-22
  IRQ : 258 -> ibmvnic-30000003-rx0
	/proc/irq/258/smp_affinity_list:8-14
  IRQ : 259 -> ibmvnic-30000003-rx1
	/proc/irq/259/smp_affinity_list:24-30
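
Nick's point, that i already names the first unused CPU when the loop
breaks, can be modeled in userspace. This is a sketch only: the helper
name and the single-word 'unsigned long' masks are assumptions for
illustration, not the ibmvnic driver code.

```c
#include <assert.h>

/*
 * Sketch: take 'stride' online CPUs starting at '*cpu' (wrapping
 * around), setting them in '*mask', and leave '*cpu' pointing at the
 * first unused online CPU.
 */
static void take_stride(unsigned long online, unsigned int nbits,
			unsigned int *cpu, int stride, unsigned long *mask)
{
	unsigned int count, i;

	for (count = 0; count < nbits; count++) {
		i = (*cpu + count) % nbits;
		if (!(online & (1UL << i)))
			continue;
		if (!stride--) {
			/* i is already the first unused online CPU */
			*cpu = i;
			return;
		}
		*mask |= 1UL << i;
	}
	/* stride >= number of online CPUs: '*cpu' is left unchanged */
}
```

With eight online CPUs and stride 3, starting at CPU 6 assigns CPUs 6, 7
and 0 and leaves *cpu at 1, matching the '*cpu = i' behaviour with no
extra '+ 1'.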


* Re: [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap()
  2025-01-07 13:28   ` Alexander Gordeev
@ 2025-01-15  3:38     ` Yury Norov
  0 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2025-01-15  3:38 UTC (permalink / raw)
  To: Alexander Gordeev
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Sven Schnelle, Haren Myneni, Rick Lindsley,
	Nick Child, Thomas Falcon, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Michael S. Tsirkin,
	Jason Wang, Xuan Zhuo, Eugenio Pérez, Keith Busch,
	Jens Axboe, Christoph Hellwig, Sagi Grimberg, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Manivannan Sadhasivam, Rob Herring,
	Bjorn Helgaas, James Smart, Dick Kennedy, James E.J. Bottomley,
	Martin K. Petersen, Rasmus Villemoes, Matt Wu, Steffen Klassert,
	Daniel Jordan, Andrew Morton, Greg Kurz, Peter Xu,
	Shrikanth Hegde, Hendrik Brueckner

On Tue, Jan 07, 2025 at 02:28:31PM +0100, Alexander Gordeev wrote:
> On Sat, Dec 28, 2024 at 10:49:38AM -0800, Yury Norov wrote:
> 
> Hi Yury,
> 
> > cpumask_next_wrap_old() has two additional parameters, comparing to it's
> > analogue in linux/find.h find_next_bit_wrap(). The reason for that is
> > historical.
> > 
> > Before 4fe49b3b97c262 ("lib/bitmap: introduce for_each_set_bit_wrap()
> > macro"), cpumask_next_wrap() was used to implement for_each_cpu_wrap()
> > iterator. Now that the iterator is an alias to generic
> > for_each_set_bit_wrap(), the additional parameters aren't used and may
> > confuse readers.
> > 
> > All existing users call cpumask_next_wrap() in a way that makes it
> > possible to turn it to straight and simple alias to find_next_bit_wrap().
> > 
> > In a couple places kernel users opencode missing cpumask_next_and_wrap().
> > Add it as well.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  include/linux/cpumask.h | 37 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> > 
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index b267a4f6a917..18c9908d50c4 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -284,6 +284,43 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
> >  		small_cpumask_bits, n + 1);
> >  }
> >  
> > +/**
> > + * cpumask_next_and_wrap - get the next cpu in *src1p & *src2p, starting from
> > + *			   @n and wrapping around, if needed
> > + * @n: the cpu prior to the place to search (i.e. return will be > @n)
> > + * @src1p: the first cpumask pointer
> > + * @src2p: the second cpumask pointer
> > + *
> > + * Return: >= nr_cpu_ids if no further cpus set in both.
> > + */
> > +static __always_inline
> > +unsigned int cpumask_next_and_wrap(int n, const struct cpumask *src1p,
> > +			      const struct cpumask *src2p)
> > +{
> > +	/* -1 is a legal arg here. */
> > +	if (n != -1)
> > +		cpumask_check(n);
> > +	return find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
> > +		small_cpumask_bits, n + 1);
> > +}
> > +
> > +/*
> > + * cpumask_next_wrap - get the next cpu in *src, starting from
> > + *			   @n and wrapping around, if needed
> 
> Does it mean the search wraps a cpumask and starts from the beginning
> if the bit is not found and returns >= nr_cpu_ids if @n crosses itself?
> 
> > + * @n: the cpu prior to the place to search
> > + * @src: cpumask pointer
> > + *
> > + * Return: >= nr_cpu_ids if no further cpus set in both.
> 
> It looks like Return is a cpumask_next_and_wrap() comment leftover.
> 
> > + */
> > +static __always_inline
> > +unsigned int cpumask_next_wrap(int n, const struct cpumask *src)
> > +{
> > +	/* -1 is a legal arg here. */
> > +	if (n != -1)
> > +		cpumask_check(n);
> > +	return find_next_bit_wrap(cpumask_bits(src), small_cpumask_bits, n + 1);
> > +}
> > +
> >  /**
> >   * for_each_cpu - iterate over every cpu in a mask
> >   * @cpu: the (optionally unsigned) integer iterator
> 
> Thanks!

Thanks, I'll update the comments.

* Re: [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap()
  2025-01-03 17:44   ` Bjorn Helgaas
@ 2025-01-15  3:41     ` Yury Norov
  0 siblings, 0 replies; 34+ messages in thread
From: Yury Norov @ 2025-01-15  3:41 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-kernel, linuxppc-dev, linux-s390, netdev, virtualization,
	linux-nvme, linux-hyperv, linux-pci, linux-scsi, linux-crypto,
	Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
	Madhavan Srinivasan, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Haren Myneni, Rick Lindsley, Nick Child, Thomas Falcon,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Michael S. Tsirkin, Jason Wang, Xuan Zhuo,
	Eugenio Pérez, Keith Busch, Jens Axboe, Christoph Hellwig,
	Sagi Grimberg, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, James Smart,
	Dick Kennedy, James E.J. Bottomley, Martin K. Petersen,
	Rasmus Villemoes, Matt Wu, Steffen Klassert, Daniel Jordan,
	Andrew Morton, Greg Kurz, Peter Xu, Shrikanth Hegde,
	Hendrik Brueckner

On Fri, Jan 03, 2025 at 11:44:32AM -0600, Bjorn Helgaas wrote:
> On Sat, Dec 28, 2024 at 10:49:38AM -0800, Yury Norov wrote:
> > cpumask_next_wrap_old() has two additional parameters, comparing to it's
> > analogue in linux/find.h find_next_bit_wrap(). The reason for that is
> > historical.
> 
> s/it's/its/
> 
> Personally I think cscope/tags/git grep make "find_next_bit_wrap()"
> enough even without mentioning "linux/find.h".
> 
> > + * cpumask_next_and_wrap - get the next cpu in *src1p & *src2p, starting from
> > + *			   @n and wrapping around, if needed
> > + * @n: the cpu prior to the place to search (i.e. return will be > @n)
> 
> Is the return really > @n if it wraps?

No, this is a copy-paste error. Will fix in v2.

Thanks,
Yury

end of thread, other threads:[~2025-01-15  3:41 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
2024-12-28 18:49 [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Yury Norov
2024-12-28 18:49 ` [PATCH 01/14] objpool: rework objpool_pop() Yury Norov
2024-12-28 18:49 ` [PATCH 02/14] virtio_net: simplify virtnet_set_affinity() Yury Norov
2024-12-28 18:49 ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity() Yury Norov
2025-01-07 22:37   ` Nick Child
2025-01-07 22:42     ` Yury Norov
2025-01-07 23:04       ` Yury Norov
2025-01-08 14:08         ` [PATCH 03/14] ibmvnic: simplify ibmvnic_set_queue_affinity()y Nick Child
2024-12-28 18:49 ` [PATCH 04/14] powerpc/xmon: simplify xmon_batch_next_cpu() Yury Norov
2024-12-28 18:49 ` [PATCH 05/14] cpumask: deprecate cpumask_next_wrap() Yury Norov
2025-01-03 17:39   ` Bjorn Helgaas
2024-12-28 18:49 ` [PATCH 06/14] cpumask: re-introduce cpumask_next{,_and}_wrap() Yury Norov
2025-01-03 17:44   ` Bjorn Helgaas
2025-01-15  3:41     ` Yury Norov
2025-01-07 13:28   ` Alexander Gordeev
2025-01-15  3:38     ` Yury Norov
2024-12-28 18:49 ` [PATCH 07/14] cpumask: use cpumask_next_wrap() where appropriate Yury Norov
2024-12-28 18:49 ` [PATCH 08/14] padata: switch padata_find_next() to using cpumask_next_wrap() Yury Norov
2025-01-04  0:33   ` Herbert Xu
2025-01-07 19:02   ` Daniel Jordan
2024-12-28 18:49 ` [PATCH 09/14] s390: switch stop_machine_yield() " Yury Norov
2024-12-28 18:49 ` [PATCH 10/14] nvme-tcp: switch nvme_tcp_set_queue_io_cpu() " Yury Norov
2024-12-30  8:31   ` Sagi Grimberg
2024-12-28 18:49 ` [PATCH 11/14] scsi: lpfc: switch lpfc_irq_rebalance() " Yury Norov
2024-12-31  1:05   ` Justin Tee
2024-12-28 18:49 ` [PATCH 12/14] scsi: lpfc: rework lpfc_next_{online,present}_cpu() Yury Norov
2024-12-31  1:06   ` Justin Tee
2024-12-28 18:49 ` [PATCH 13/14] PCI: hv: switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap() Yury Norov
2024-12-29 17:34   ` Michael Kelley
2025-01-03 17:45   ` Bjorn Helgaas
2025-01-03 18:56     ` Yury Norov
2024-12-28 18:49 ` [PATCH 14/14] cpumask: drop cpumask_next_wrap_old() Yury Norov
2025-01-03  7:02 ` [PATCH 00/14] cpumask: cleanup cpumask_next_wrap() implementation and usage Christoph Hellwig
2025-01-03 15:21   ` Yury Norov
