From: Andrea Righi <arighi@nvidia.com>
To: Yury Norov <yury.norov@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Joel Fernandes <joel@joelfernandes.org>,
Ian May <ianm@nvidia.com>,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/7] mm/numa: Introduce nearest_node_nodemask()
Date: Thu, 13 Feb 2025 17:19:58 +0100
Message-ID: <Z64brsSMAR7cLPUU@gpd3>
In-Reply-To: <Z64WTLPaSxixbE2q@thinkpad>

On Thu, Feb 13, 2025 at 10:57:00AM -0500, Yury Norov wrote:
> On Wed, Feb 12, 2025 at 05:48:09PM +0100, Andrea Righi wrote:
> > Introduce the new helper nearest_node_nodemask() to find the closest
> > node in a specified nodemask from a given starting node.
> >
> > Returns MAX_NUMNODES if no node is found.
> >
> > Cc: Yury Norov <yury.norov@gmail.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
>
> Suggested-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Ok.
>
> > ---
> > include/linux/numa.h | 7 +++++++
> > mm/mempolicy.c | 32 ++++++++++++++++++++++++++++++++
> > 2 files changed, 39 insertions(+)
> >
> > diff --git a/include/linux/numa.h b/include/linux/numa.h
> > index 31d8bf8a951a7..e6baaf6051bcf 100644
> > --- a/include/linux/numa.h
> > +++ b/include/linux/numa.h
> > @@ -31,6 +31,8 @@ void __init alloc_offline_node_data(int nid);
> > /* Generic implementation available */
> > int numa_nearest_node(int node, unsigned int state);
> >
> > +int nearest_node_nodemask(int node, nodemask_t *mask);
> > +
>
> See how you use it. It looks a bit inconsistent to the other functions:
>
> #define for_each_node_numadist(node, unvisited) \
> 	for (int start = (node), \
> 	     node = nearest_node_nodemask((start), &(unvisited)); \
> 	     node < MAX_NUMNODES; \
> 	     node_clear(node, (unvisited)), \
> 	     node = nearest_node_nodemask((start), &(unvisited)))
>
>
> I would suggest to make it aligned with the rest of the API:
>
> #define node_clear(node, dst) __node_clear((node), &(dst))
> static __always_inline void __node_clear(int node, volatile nodemask_t *dstp)
> {
> 	clear_bit(node, dstp->bits);
> }

Sorry Yury, can you elaborate on this? What do you mean by inconsistent?
Is it the volatile nodemask_t *?
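
In case the inconsistency is about the calling convention (this is only a
guess on my side), here is a minimal userspace sketch of what an aligned
version could look like: a macro that takes the mask by name and forwards a
pointer to a __-prefixed worker, like node_clear() / __node_clear(). The
single-word nodemask_t, toy_distance() and visit_order() are made up for
illustration and are not kernel code.

```c
#include <limits.h>
#include <stdlib.h>

#define MAX_NUMNODES 8

/* Toy single-word nodemask; the kernel type is a bitmap of MAX_NUMNODES bits. */
typedef struct { unsigned long bits; } nodemask_t;

#define node_set(node, dst)	__node_set((node), &(dst))
static inline void __node_set(int node, nodemask_t *dstp)
{
	dstp->bits |= 1UL << node;
}

#define node_clear(node, dst)	__node_clear((node), &(dst))
static inline void __node_clear(int node, nodemask_t *dstp)
{
	dstp->bits &= ~(1UL << node);
}

/* Hypothetical distance: nodes laid out on a line, 10 == local. */
static inline int toy_distance(int a, int b)
{
	return 10 + 10 * abs(a - b);
}

/* Assumed shape: the caller passes the mask by name, the worker takes a pointer. */
#define nearest_node_nodemask(node, mask) \
	__nearest_node_nodemask((node), &(mask))
static inline int __nearest_node_nodemask(int node, const nodemask_t *maskp)
{
	int n, min_dist = INT_MAX, min_node = MAX_NUMNODES;

	for (n = 0; n < MAX_NUMNODES; n++) {
		if (!(maskp->bits & (1UL << n)))
			continue;
		if (toy_distance(node, n) < min_dist) {
			min_dist = toy_distance(node, n);
			min_node = n;
		}
	}
	return min_node;
}

/* Same loop shape as for_each_node_numadist(), without the '&' at call sites. */
static int visit_order(int start, nodemask_t unvisited, int *out)
{
	int count = 0;

	for (int node = nearest_node_nodemask(start, unvisited);
	     node < MAX_NUMNODES;
	     node_clear(node, unvisited),
	     node = nearest_node_nodemask(start, unvisited))
		out[count++] = node;

	return count;
}
```

With this shape the iterator would call nearest_node_nodemask(start,
unvisited) and node_clear(node, unvisited) in the same way. Ties are broken
towards the lowest node ID because the comparison is a strict "<".
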
>
> > #ifndef memory_add_physaddr_to_nid
> > int memory_add_physaddr_to_nid(u64 start);
> > #endif
> > @@ -47,6 +49,11 @@ static inline int numa_nearest_node(int node, unsigned int state)
> >  	return NUMA_NO_NODE;
> >  }
> >
> > +static inline int nearest_node_nodemask(int node, nodemask_t *mask)
> > +{
> > +	return NUMA_NO_NODE;
> > +}
> > +
> >  static inline int memory_add_physaddr_to_nid(u64 start)
> >  {
> >  	return 0;
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 162407fbf2bc7..1e2acf187ea3a 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -196,6 +196,38 @@ int numa_nearest_node(int node, unsigned int state)
> > }
> > EXPORT_SYMBOL_GPL(numa_nearest_node);
> >
> > +/**
> > + * nearest_node_nodemask - Find the node in @mask at the nearest distance
> > + * from @node.
> > + *
> > + * @node: the node to start the search from.
> > + * @mask: a pointer to a nodemask representing the allowed nodes.
> > + *
> > + * This function iterates over all nodes in the given state and calculates
> > + * the distance to the starting node.
> > + *
> > + * Returns the node ID in @mask that is the closest in terms of distance
> > + * from @node, or MAX_NUMNODES if no node is found.
> > + */
> > +int nearest_node_nodemask(int node, nodemask_t *mask)
> > +{
> > +	int dist, n, min_dist = INT_MAX, min_node = MAX_NUMNODES;
> > +
> > +	if (node == NUMA_NO_NODE)
> > +		return MAX_NUMNODES;
>
> This makes it unclear: you make it legal to pass NUMA_NO_NODE, but
> your function returns something useless. I don't think it would help
> users in any reasonable scenario.
>
> So, if you don't want user to call this with node == NUMA_NO_NODE,
> just describe it in comment on top of the function. Otherwise, please
> do something useful like
>
> 	if (node == NUMA_NO_NODE)
> 		node = current_node;
>
> I would go with option 1. Notice, node_distance() doesn't bother to
> check against NUMA_NO_NODE.

Hm... is that so? Looking at __node_distance(), it doesn't seem really safe
to pass a negative value (maybe I'm missing something?).

Anyway, I'd also prefer to go with option 1 and not implicitly assume
NUMA_NO_NODE == current node (it feels like it might hide nasty bugs).

So, I can add a comment in the description to clarify that NUMA_NO_NODE is
forbidden, but what if someone passes it anyway? Should we at least
WARN_ON_ONCE()?
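
To make the question concrete, here is a minimal userspace sketch (not the
kernel code) of option 1 with a one-time warning. The 4-node distance table,
the unsigned long mask and the nearest_node_in_mask() name are made up for
illustration, and the warned flag only approximates what WARN_ON_ONCE()
would do in the kernel.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_NUMNODES	4
#define NUMA_NO_NODE	(-1)

/* Hypothetical SLIT-style distance table; the values are invented. */
static const int toy_distance[MAX_NUMNODES][MAX_NUMNODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 30 },
	{ 30, 40, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * Find the node set in @mask that is closest to @node.
 *
 * @node must be a valid node ID; NUMA_NO_NODE is not allowed. An invalid
 * @node triggers a one-time warning and returns MAX_NUMNODES, the same
 * value used when @mask is empty.
 */
static int nearest_node_in_mask(int node, unsigned long mask)
{
	static bool warned;	/* stand-in for WARN_ON_ONCE() state */
	int n, min_dist = INT_MAX, min_node = MAX_NUMNODES;

	if (node < 0 || node >= MAX_NUMNODES) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "invalid node %d\n", node);
		}
		return MAX_NUMNODES;
	}

	for (n = 0; n < MAX_NUMNODES; n++) {
		if (!(mask & (1UL << n)))
			continue;
		if (toy_distance[node][n] < min_dist) {
			min_dist = toy_distance[node][n];
			min_node = n;
		}
	}

	return min_node;
}
```

The bounds check happens before the table lookup, so a negative node never
reaches the distance array, which is the part that looks unsafe in
__node_distance().
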
>
> > +	for_each_node_mask(n, *mask) {
> > +		dist = node_distance(node, n);
> > +		if (dist < min_dist) {
> > +			min_dist = dist;
> > +			min_node = n;
> > +		}
> > +	}
> > +
> > +	return min_node;
> > +}
> > +EXPORT_SYMBOL_GPL(nearest_node_nodemask);
> > +
> >  struct mempolicy *get_task_policy(struct task_struct *p)
> >  {
> >  	struct mempolicy *pol = p->mempolicy;
> > --
> > 2.48.1
Thanks,
-Andrea