From: Yury Norov <yury.norov@gmail.com>
To: Andrea Righi <arighi@nvidia.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Joel Fernandes <joel@joelfernandes.org>,
Ian May <ianm@nvidia.com>,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/8] sched/topology: Introduce for_each_node_numadist() iterator
Date: Fri, 14 Feb 2025 16:16:53 -0500
Message-ID: <Z6-yxTEbuJZUZW8f@thinkpad>
In-Reply-To: <20250214194134.658939-5-arighi@nvidia.com>
On Fri, Feb 14, 2025 at 08:40:03PM +0100, Andrea Righi wrote:
> Introduce the new helper for_each_node_numadist() to iterate over node
> IDs in order of increasing NUMA distance from a given starting node.
>
> This iterator is somewhat similar to for_each_numa_hop_mask(), but
> instead of providing a cpumask at each iteration, it provides a node ID.
>
> Example usage:
>
> 	nodemask_t unvisited = NODE_MASK_ALL;
> 	int node, start = cpu_to_node(smp_processor_id());
>
> 	node = start;
> 	for_each_node_numadist(node, unvisited)
> 		pr_info("node (%d, %d) -> %d\n",
> 			start, node, node_distance(start, node));
>
> On a system with equidistant nodes:
>
> $ numactl -H
> ...
> node distances:
> node   0   1   2   3
>   0:  10  20  20  20
>   1:  20  10  20  20
>   2:  20  20  10  20
>   3:  20  20  20  10
>
> Output of the example above (on node 0):
>
> [ 7.367022] node (0, 0) -> 10
> [ 7.367151] node (0, 1) -> 20
> [ 7.367186] node (0, 2) -> 20
> [ 7.367247] node (0, 3) -> 20
>
> On a system with non-equidistant nodes (simulated using virtme-ng):
>
> $ numactl -H
> ...
> node distances:
> node   0   1   2   3
>   0:  10  51  31  41
>   1:  51  10  21  61
>   2:  31  21  10  11
>   3:  41  61  11  10
>
> Output of the example above (on node 0):
>
> [ 8.953644] node (0, 0) -> 10
> [ 8.953712] node (0, 2) -> 31
> [ 8.953764] node (0, 3) -> 41
> [ 8.953817] node (0, 1) -> 51
>
> Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
> ---
> include/linux/topology.h | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3e..a1815f4395ab6 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -261,6 +261,36 @@ sched_numa_hop_mask(unsigned int node, unsigned int hops)
> }
> #endif /* CONFIG_NUMA */
>
> +/**
> + * for_each_node_numadist() - iterate over nodes in increasing distance
> + * order, starting from a given node
> + * @node: the iteration variable and the starting node.
> + * @unvisited: a nodemask to keep track of the unvisited nodes.
> + *
> + * This macro iterates over NUMA node IDs in increasing distance from the
> + * starting @node and yields MAX_NUMNODES when all the nodes have been
> + * visited.
> + *
> + * Note that by the time the loop completes, the @unvisited nodemask will
> + * be fully cleared, unless the loop exits early.
> + *
> + * The difference between for_each_node() and for_each_node_numadist() is
> + * that the former iterates over nodes in numerical order, whereas
> + * the latter iterates over nodes in increasing order of distance.
> + *
> + * The complexity of this iterator is O(N^2), where N represents the
> + * number of nodes, as each iteration involves scanning all nodes to
> + * find the one with the shortest distance.
> + *
> + * Requires the RCU read lock to be held.
> + */
> +#define for_each_node_numadist(node, unvisited) \
> +	for (int __start = (node), \
> +	     (node) = nearest_node_nodemask((__start), &(unvisited)); \
> +	     (node) < MAX_NUMNODES; \
> +	     node_clear((node), (unvisited)), \
> +	     (node) = nearest_node_nodemask((__start), &(unvisited)))
> +
> /**
> * for_each_numa_hop_mask - iterate over cpumasks of increasing NUMA distance
> * from a given node.
> --
> 2.48.1