From: Andrea Righi <arighi@nvidia.com>
To: Yury Norov <yury.norov@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 08/10] sched_ext: idle: introduce SCX_PICK_IDLE_NODE
Date: Tue, 24 Dec 2024 09:37:35 +0100
Message-ID: <Z2pyzzmrbcVJ14TI@gpd3>
In-Reply-To: <Z2owJmy22Tk-bl4A@yury-ThinkPad>

On Mon, Dec 23, 2024 at 07:53:21PM -0800, Yury Norov wrote:
> On Mon, Dec 23, 2024 at 06:48:48PM -0800, Yury Norov wrote:
> > On Fri, Dec 20, 2024 at 04:11:40PM +0100, Andrea Righi wrote:
> > > Introduce a flag to restrict the selection of an idle CPU to a specific
> > > NUMA node.
> > >
> > > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> > > ---
> > > kernel/sched/ext.c | 1 +
> > > kernel/sched/ext_idle.c | 11 +++++++++--
> > > 2 files changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > > index 143938e935f1..da5c15bd3c56 100644
> > > --- a/kernel/sched/ext.c
> > > +++ b/kernel/sched/ext.c
> > > @@ -773,6 +773,7 @@ enum scx_deq_flags {
> > >
> > > enum scx_pick_idle_cpu_flags {
> > > SCX_PICK_IDLE_CORE = 1LLU << 0, /* pick a CPU whose SMT siblings are also idle */
> > > + SCX_PICK_IDLE_NODE = 1LLU << 1, /* pick a CPU in the same target NUMA node */
> >
> > SCX_FORCE_NODE or SCX_FIX_NODE?
> >
> > > };
> > >
> > > enum scx_kick_flags {
> > > diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> > > index 444f2a15f1d4..013deaa08f12 100644
> > > --- a/kernel/sched/ext_idle.c
> > > +++ b/kernel/sched/ext_idle.c
> > > @@ -199,6 +199,12 @@ static s32 scx_pick_idle_cpu(const struct cpumask *cpus_allowed, int node, u64 f
>
> This function begins with:
>
> static s32 scx_pick_idle_cpu(const struct cpumask *cpus_allowed, int node, u64 flags)
> {
> nodemask_t hop_nodes = NODE_MASK_NONE;
> s32 cpu = -EBUSY;
>
> if (!static_branch_maybe(CONFIG_NUMA, &scx_builtin_idle_per_node))
> return pick_idle_cpu_from_node(cpus_allowed, NUMA_FLAT_NODE, flags);
>
> ...
>
> So if I disable scx_builtin_idle_per_node and then call:
>
> scx_pick_idle_cpu(some_cpus, numa_node_id(), SCX_PICK_IDLE_NODE)
>
> I may get a CPU from any non-local node, right? I think we need to honor user's
> request:
>
> if (!static_branch_maybe(CONFIG_NUMA, &scx_builtin_idle_per_node))
> return pick_idle_cpu_from_node(cpus_allowed,
> flags & SCX_PICK_IDLE_NODE ? node : NUMA_FLAT_NODE, flags);
>
> That way the code will be coherent: if you enable idle cpumasks, you
> will be able to follow all the NUMA hierarchy. If you disable them, at
> least you honor user's request to return a CPU from a given node, if
> he's very explicit about his intention.
>
> You can be even nicer:
>
> if (!static_branch_maybe(CONFIG_NUMA, &scx_builtin_idle_per_node)) {
> node = pick_idle_cpu_from_node(cpus, node, flags);
> if (node == MAX_NUM_NODES && flags & SCX_PICK_IDLE_NODE == 0)
> node = pick_idle_cpu_from_node(cpus, NUMA_FLAT_NODE, flags);
>
> return node;
> }
>

Sorry, I'm not following. If scx_builtin_idle_per_node is disabled, we're
only tracking idle CPUs in a single cpumask, NUMA_FLAT_NODE (which is
node 0). All the other per-node cpumasks are just empty, so for any other
node we would always return -EBUSY if we honored the user's request.
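
To make the problem concrete, here is a rough userspace model of the flat
tracking described above. It is not kernel code: idle_mask[],
mark_idle_flat() and pick_idle_cpu_from_node_model() are hypothetical
stand-ins for the real cpumask machinery, and the two-node, eight-CPU
topology is just an assumption for illustration.

```c
#include <assert.h>
#include <errno.h>

#define NR_NODES	2	/* assumed toy topology: 2 nodes, 8 CPUs */
#define NR_CPUS		8
#define NUMA_FLAT_NODE	0	/* the flat node is node 0, as noted above */

/* Hypothetical stand-in for the per-node idle cpumasks: one bit per CPU. */
static unsigned long idle_mask[NR_NODES];

/*
 * With per-node tracking disabled, every idle CPU is recorded in the
 * flat node's mask, regardless of which node the CPU belongs to.
 */
static void mark_idle_flat(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	idle_mask[NUMA_FLAT_NODE] |= 1UL << cpu;
}

/* Return the lowest idle CPU tracked for @node, or -EBUSY if none. */
static int pick_idle_cpu_from_node_model(int node)
{
	assert(node >= 0 && node < NR_NODES);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (idle_mask[node] & (1UL << cpu))
			return cpu;

	return -EBUSY;
}
```

If CPUs 4-7 (say they belong to node 1) go idle via mark_idle_flat(), a
lookup on node 1 still sees an empty mask and returns -EBUSY, while a lookup
on NUMA_FLAT_NODE finds CPU 4.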

Maybe we should just return an error if scx_builtin_idle_per_node is
disabled and the user requests an idle CPU in a specific node?
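
Something along these lines, as an untested userspace sketch rather than a
patch. The per_node_enabled flag stands in for the
scx_builtin_idle_per_node static key, pick_from_node_stub() is a
hypothetical placeholder for the real per-node lookup, and -EINVAL is only
an assumed error code, since the actual value has not been decided.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NUMA_FLAT_NODE		0
#define SCX_PICK_IDLE_NODE	(1ULL << 1)	/* bit value from the patch */

/* Hypothetical stand-in for the scx_builtin_idle_per_node static key. */
static bool per_node_enabled;

/* Hypothetical stub: pretend CPU (node * 4) is always idle in @node. */
static int pick_from_node_stub(int node)
{
	assert(node >= 0);
	return node * 4;
}

static int scx_pick_idle_cpu_model(int node, unsigned long long flags)
{
	if (!per_node_enabled) {
		/*
		 * Only the flat mask is populated, so a node-restricted
		 * request cannot be satisfied: reject it explicitly
		 * instead of reporting a misleading -EBUSY.
		 * Assumption: -EINVAL is a placeholder error code.
		 */
		if (flags & SCX_PICK_IDLE_NODE)
			return -EINVAL;

		return pick_from_node_stub(NUMA_FLAT_NODE);
	}

	/* Per-node masks are valid here; the NUMA hop walk is omitted. */
	return pick_from_node_stub(node);
}
```

That way the caller can tell "no idle CPU right now" (-EBUSY) apart from
"this request cannot work with the current configuration".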
-Andrea

Thread overview: 30+ messages
2024-12-20 15:11 [PATCHSET v8 sched_ext/for-6.14] sched_ext: split global idle cpumask into per-NUMA cpumasks Andrea Righi
2024-12-20 15:11 ` [PATCH 01/10] sched/topology: introduce for_each_numa_hop_node() / sched_numa_hop_node() Andrea Righi
2024-12-23 21:18 ` Yury Norov
2024-12-24 7:54 ` Andrea Righi
2024-12-24 17:33 ` Yury Norov
2024-12-20 15:11 ` [PATCH 02/10] sched_ext: Move built-in idle CPU selection policy to a separate file Andrea Righi
2024-12-24 21:21 ` Tejun Heo
2024-12-20 15:11 ` [PATCH 03/10] sched_ext: idle: introduce check_builtin_idle_enabled() helper Andrea Righi
2024-12-20 15:11 ` [PATCH 04/10] sched_ext: idle: use assign_cpu() to update the idle cpumask Andrea Righi
2024-12-23 22:26 ` Yury Norov
2024-12-20 15:11 ` [PATCH 05/10] sched_ext: idle: clarify comments Andrea Righi
2024-12-23 22:28 ` Yury Norov
2024-12-20 15:11 ` [PATCH 06/10] sched_ext: Introduce SCX_OPS_NODE_BUILTIN_IDLE Andrea Righi
2024-12-20 15:11 ` [PATCH 07/10] sched_ext: Introduce per-node idle cpumasks Andrea Righi
2024-12-24 4:05 ` Yury Norov
2024-12-24 8:18 ` Andrea Righi
2024-12-24 17:59 ` Yury Norov
2024-12-20 15:11 ` [PATCH 08/10] sched_ext: idle: introduce SCX_PICK_IDLE_NODE Andrea Righi
2024-12-24 2:48 ` Yury Norov
2024-12-24 3:53 ` Yury Norov
2024-12-24 8:37 ` Andrea Righi [this message]
2024-12-24 18:15 ` Yury Norov
2024-12-24 8:22 ` Andrea Righi
2024-12-24 21:29 ` Tejun Heo
2024-12-20 15:11 ` [PATCH 09/10] sched_ext: idle: Get rid of the scx_selcpu_topo_numa logic Andrea Righi
2024-12-23 23:39 ` Yury Norov
2024-12-24 8:58 ` Andrea Righi
2024-12-20 15:11 ` [PATCH 10/10] sched_ext: idle: Introduce NUMA aware idle cpu kfunc helpers Andrea Righi
2024-12-24 0:57 ` Yury Norov
2024-12-24 9:32 ` Andrea Righi