netdev.vger.kernel.org archive mirror
* [PATCH v2 0/8] sched/topology: add for_each_numa_cpu() macro
@ 2023-04-30 17:18 Yury Norov
  2023-04-30 17:18 ` [PATCH v3 1/8] sched: fix sched_numa_find_nth_cpu() in non-NUMA case Yury Norov
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Yury Norov @ 2023-04-30 17:18 UTC (permalink / raw)
  To: Jakub Kicinski, netdev, linux-rdma, linux-kernel
  Cc: Yury Norov, Saeed Mahameed, Pawel Chmielewski, Leon Romanovsky,
	David S. Miller, Eric Dumazet, Paolo Abeni, Andy Shevchenko,
	Rasmus Villemoes, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider,
	Tariq Toukan, Gal Pressman, Greg Kroah-Hartman, Heiko Carstens,
	Barry Song

for_each_cpu() is widely used in the kernel, and it's beneficial to have
a NUMA-aware version of the macro.

The recently added for_each_numa_hop_mask() works, but switching the
existing codebase to it is not an easy process.

This series adds for_each_numa_cpu(), which is designed to be similar to
for_each_cpu(). Converting existing code to be NUMA-aware becomes as simple
as adding a hop iterator variable and passing it to the new macro;
for_each_numa_cpu() takes care of the rest.
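
For reference, here's a minimal sketch of how such a macro can be built on
top of sched_numa_find_next_cpu(), which patch 3 introduces; the exact
definition in patch 4 may differ:

        /*
         * Sketch only: sched_numa_find_next_cpu() is assumed to return the
         * next CPU >= 'cpu' in 'mask', walking hop levels in order of
         * increasing NUMA distance from 'node' and advancing 'hop' as the
         * levels get exhausted; nr_cpu_ids terminates the loop.
         */
        #define for_each_numa_cpu(cpu, hop, node, mask)                       \
                for ((cpu) = 0, (hop) = 0;                                     \
                     (cpu) = sched_numa_find_next_cpu((mask), (cpu), (node), &(hop)),\
                     (cpu) < nr_cpu_ids;                                       \
                     (cpu)++)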

At the moment, we have two users of NUMA-aware enumerators. One is
Mellanox's in-tree driver, and the other is Intel's in-review driver:

https://lore.kernel.org/lkml/20230216145455.661709-1-pawel.chmielewski@intel.com/

Both real-life examples follow the same pattern:

        for_each_numa_hop_mask(cpus, prev, node) {
                for_each_cpu_andnot(cpu, cpus, prev) {
                        if (cnt++ == max_num)
                                goto out;
                        do_something(cpu);
                }
                prev = cpus;
        }

With the new macro, the same code takes a more conventional form:

        for_each_numa_cpu(cpu, hop, node, cpu_possible_mask) {
                if (cnt++ == max_num)
                        break;
                do_something(cpu);
        }

Straight conversion of the existing for_each_cpu() codebase to a NUMA-aware
version with for_each_numa_hop_mask() is difficult because that macro
doesn't take a user-provided cpu mask and eventually ends up as an
open-coded double loop. With for_each_numa_cpu() it shouldn't be a
brainteaser. Consider this NUMA-ignorant example:

        cpumask_t cpus = get_mask();
        int cnt = 0, cpu;

        for_each_cpu(cpu, &cpus) {
                if (cnt++ == max_num)
                        break;
                do_something(cpu);
        }

Converting it to a NUMA-aware version is as simple as:

        cpumask_t cpus = get_mask();
        int node = get_node();
        int cnt = 0, hop, cpu;

        for_each_numa_cpu(cpu, hop, node, &cpus) {
                if (cnt++ == max_num)
                        break;
                do_something(cpu);
        }

The latter is only slightly more verbose and avoids open-coding that
annoying double loop. Another advantage is that it exposes a 'hop'
parameter with the clear meaning of NUMA distance, and doesn't force
people unfamiliar with the enumerator internals to bother with the
current and previous mask machinery.
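
Since 'hop' is visible inside the loop body, a caller can also bound the
iteration by NUMA distance directly. A hedged sketch (the hop limit and
do_something() are purely illustrative):

        for_each_numa_cpu(cpu, hop, node, cpu_possible_mask) {
                /* stop once we get past the node's nearest neighbors */
                if (hop > 1)
                        break;
                do_something(cpu);
        }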

v2: https://lore.kernel.org/netdev/ZD3l6FBnUh9vTIGc@yury-ThinkPad/T/
v3:
 - fix sched_numa_find_{next,nth}_cpu() when CONFIG_NUMA is off so that
   they only traverse online CPUs;
 - don't export sched_domains_numa_levels for testing purposes; in the
   test, use the for_each_node() macro instead;
 - extend the test for for_each_node();
 - in the comments, mention that only online CPUs are traversed;
 - rebase on top of 6.3. 

Yury Norov (8):
  sched: fix sched_numa_find_nth_cpu() in non-NUMA case
  lib/find: add find_next_and_andnot_bit()
  sched/topology: introduce sched_numa_find_next_cpu()
  sched/topology: add for_each_numa_{,online}_cpu() macro
  net: mlx5: switch comp_irqs_request() to using for_each_numa_cpu
  lib/cpumask: update comment to cpumask_local_spread()
  sched: drop for_each_numa_hop_mask()
  lib: test for_each_numa_cpus()
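
Patch 4 also adds a for_each_numa_online_cpu() convenience variant;
presumably it is a thin wrapper over the base macro along these lines
(an assumption, the actual definition in the patch may differ):

        #define for_each_numa_online_cpu(cpu, hop, node)                \
                for_each_numa_cpu(cpu, hop, node, cpu_online_mask)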

 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 16 ++---
 include/linux/find.h                         | 43 ++++++++++++
 include/linux/topology.h                     | 40 ++++++-----
 kernel/sched/topology.c                      | 53 ++++++++-------
 lib/cpumask.c                                |  7 +-
 lib/find_bit.c                               | 12 ++++
 lib/test_bitmap.c                            | 70 +++++++++++++++++++-
 7 files changed, 183 insertions(+), 58 deletions(-)

-- 
2.37.2




Thread overview: 21+ messages
2023-04-30 17:18 [PATCH v2 0/8] sched/topology: add for_each_numa_cpu() macro Yury Norov
2023-04-30 17:18 ` [PATCH v3 1/8] sched: fix sched_numa_find_nth_cpu() in non-NUMA case Yury Norov
2023-04-30 17:18 ` [PATCH v3 2/8] lib/find: add find_next_and_andnot_bit() Yury Norov
2023-04-30 17:18 ` [PATCH v3 3/8] sched/topology: introduce sched_numa_find_next_cpu() Yury Norov
2023-04-30 17:18 ` [PATCH v3 4/8] sched/topology: add for_each_numa_{,online}_cpu() macro Yury Norov
2023-04-30 17:18 ` [PATCH v3 5/8] net: mlx5: switch comp_irqs_request() to using for_each_numa_cpu Yury Norov
2023-04-30 17:18 ` [PATCH v3 6/8] lib/cpumask: update comment to cpumask_local_spread() Yury Norov
2023-04-30 17:18 ` [PATCH v3 7/8] sched: drop for_each_numa_hop_mask() Yury Norov
2023-04-30 17:18 ` [PATCH v3 8/8] lib: test for_each_numa_cpus() Yury Norov
2023-07-22 15:47   ` Guenter Roeck
2023-07-27  2:11     ` Yury Norov
2023-05-02 16:59 ` [PATCH v2 0/8] sched/topology: add for_each_numa_cpu() macro Valentin Schneider
2023-05-02 21:58   ` Yury Norov
2023-05-03 10:00     ` Valentin Schneider
2023-05-31 15:43 ` Yury Norov
2023-05-31 17:01   ` Jakub Kicinski
2023-05-31 17:08     ` Yury Norov
  -- strict thread matches above, loose matches on Subject: below --
2023-04-20  5:19 Yury Norov
2023-04-15  5:06 Yury Norov
2023-04-17 11:11 ` Peter Zijlstra
2023-04-18  0:51   ` Yury Norov
