From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: mingo@kernel.org, peterz@infradead.org,
vincent.guittot@linaro.org, linux-kernel@vger.kernel.org,
kprateek.nayak@amd.com
Cc: sshegde@linux.ibm.com, dietmar.eggemann@arm.com,
vschneid@redhat.com, rostedt@goodmis.org, tglx@linutronix.de,
tim.c.chen@linux.intel.com
Subject: [PATCH 0/4] sched/fair: improve nohz fields for large systems
Date: Tue, 2 Dec 2025 00:01:42 +0530 [thread overview]
Message-ID: <20251201183146.74443-1-sshegde@linux.ibm.com> (raw)
It was noted that, when running on large systems, the nohz.nr_cpus
cacheline bounces quite often. Atomic increments, decrements and reads
happen on many CPUs at a time, so the line can bounce frequently.
The gist of the series is to get rid of nr_cpus and instead use the
cpumask, which is always updated alongside it. Functionally it should
serve the same purpose. At worst, one might miss an idle load balance
due to a race; judging by the existing comments, that can happen even today.
The other patches are minor. There are a couple of time checks used to
bail out early; checking the variables only after those time checks
avoids unnecessary cache references to them.
There is a series which aims to solve the contention by moving the
state to the LLC level:
https://lore.kernel.org/all/20250904041516.3046-1-kprateek.nayak@amd.com/
Maybe these bits are useful for that too. We could discuss further at
LPC.
Ran "hackbench 100 process 5000 loops" and collected perf cycles for
the top nohz functions. Benchmark numbers don't change by much. Will
ask our performance team to run the numbers with the series.
baseline: tip sched/core at 3eb593560146
1.01% [k] nohz_balance_exit_idle
0.31% [k] nohz_balancer_kick
0.05% [k] nohz_balance_enter_idle
With series:
0.45% [k] nohz_balance_exit_idle
0.18% [k] nohz_balancer_kick
0.01% [k] nohz_balance_enter_idle
Shrikanth Hegde (4):
sched/fair: Move checking for nohz cpus after time check
sched/fair: Change likelihood of nohz nr_cpus check
sched/fair: Check for blocked task after time check
sched/fair: Remove atomic nr_cpus and use cpumask instead
kernel/sched/fair.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
--
2.43.0
Thread overview: 12+ messages
2025-12-01 18:31 Shrikanth Hegde [this message]
2025-12-01 18:31 ` [PATCH 1/4] sched/fair: Move checking for nohz cpus after time check Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 2/4] sched/fair: Change likelihood of nohz nr_cpus check Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 3/4] sched/fair: Check for blocked task after time check Shrikanth Hegde
2025-12-02 6:26 ` Ingo Molnar
2025-12-02 6:55 ` Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 4/4] sched/fair: Remove atomic nr_cpus and use cpumask instead Shrikanth Hegde
2025-12-01 19:58 ` Ingo Molnar
2025-12-02 5:29 ` Shrikanth Hegde
2025-12-02 7:54 ` Ingo Molnar
2025-12-02 14:35 ` Shrikanth Hegde
2025-12-02 16:14 ` Ingo Molnar