From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: Cheng-Yang Chou <yphbchou0911@gmail.com>,
	sched-ext@lists.linux.dev, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	Chia-Ping Tsai <chia7712@gmail.com>,
	Christian Loehle <christian.loehle@arm.com>
Subject: Re: [PATCH 1/2] sched_ext: Cache per-node NUMA distance order in scx_idle_init_masks()
Date: Tue, 17 Mar 2026 19:02:12 +0100
Message-ID: <abmXJGGsb9dnpPzS@gpd4>
In-Reply-To: <abmQRgwQZhZk9-_S@slm.duckdns.org>

On Tue, Mar 17, 2026 at 07:32:54AM -1000, Tejun Heo wrote:
> On Tue, Mar 17, 2026 at 11:04:11PM +0800, Cheng-Yang Chou wrote:
> > Hi Andrea,
> > 
> > On Mon, Mar 16, 2026 at 04:51:34PM +0100, Andrea Righi wrote:
> > > Hi Cheng-Yang,
> > > 
> > > this has been sitting in my TODO list for a while, so thanks for looking
> > > at it. :)
> > > 
> > > Comments below.
> > > 
> > > On Mon, Mar 16, 2026 at 02:10:14AM +0800, Cheng-Yang Chou wrote:
> > > > Add scx_numa_node_order[] and scx_numa_node_order_cnt[], per-node
> > > > arrays that store NUMA nodes sorted by increasing distance from each
> > > > node. They are allocated and populated once during boot in
> > > > scx_idle_init_masks() using the existing for_each_node_numadist()
> > > > O(N^2) traversal, so the cost is paid only at init time.
> 
> Wasn't the conclusion that given the low numa node count, O(N^2) doesn't
> matter for now here although I can see node count becoming high enough with
> AMD fake LLC NUMA enabled on an actual large NUMA machines. But if we decide
> that's an actual problem, shouldn't it be addressed in the topology code
> rather than from sched_ext side?

Yes, that was the conclusion from the previous discussion on this topic.
Honestly, I've never seen a NUMA system where the O(N^2) could be an issue.
One thing that I like about this change is the removal of
preempt_disable/enable() when we iterate nodes.

However, speaking of performance, the O(N^2) bitmask iteration on a single
nodemask_t is probably more efficient than the O(N) iteration through the
scx_numa_node_order[] array (it should be more cache friendly). Ideally we
should run some tests on those big Intel NUMA machines that were showing
the bad regressions with the single global idle cpumasks, but I don't think
I have access to them anymore...
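
To make the comparison concrete, here is a minimal userspace sketch of the
two approaches (not the kernel code and not the patch itself). The names
node_dist[], next_nearest(), node_order[] and init_node_order() are
hypothetical, and the 4-node distance table is made up. next_nearest() is
loosely modeled on the "pick the nearest unvisited node and clear it from the
mask" walk, while node_order[] stands in for a per-node cached order that is
built once at init time:

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES 4	/* hypothetical node count for this sketch */

/* Made-up SLIT-style distances: 10 = local, larger = farther away. */
static const int node_dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 40, 30 },
	{ 30, 40, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * Bitmask walk: return the unvisited node closest to @start and clear it
 * from @unvisited, or -1 when the mask is empty. Each call scans all
 * nodes, so visiting every node costs O(N^2), but the only state touched
 * is one small mask.
 */
static int next_nearest(int start, uint32_t *unvisited)
{
	int best = -1;

	for (int n = 0; n < NR_NODES; n++) {
		if (!(*unvisited & (1u << n)))
			continue;
		if (best < 0 || node_dist[start][n] < node_dist[start][best])
			best = n;
	}
	if (best >= 0)
		*unvisited &= ~(1u << best);
	return best;
}

/* Cached order: node_order[i][k] is the k-th closest node to node i. */
static int node_order[NR_NODES][NR_NODES];

/* Pay for the O(N^2) walk once per node, at init time. */
static void init_node_order(void)
{
	for (int i = 0; i < NR_NODES; i++) {
		uint32_t unvisited = (1u << NR_NODES) - 1;
		int k = 0, n;

		while ((n = next_nearest(i, &unvisited)) >= 0)
			node_order[i][k++] = n;
		assert(k == NR_NODES);
	}
}
```

After init_node_order(), a lookup would just walk node_order[node][0..N-1],
which is O(N) but reads from a separate array instead of a single mask. That
is the cache-friendliness trade-off mentioned above, and this sketch says
nothing about which side wins on real hardware.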

So, thinking more about this, I'm still conflicted about whether we should
apply this change or not, especially because we don't have any performance
numbers...

-Andrea
