public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/4] sched/fair: improve nohz fields for large systems
@ 2025-12-01 18:31 Shrikanth Hegde
From: Shrikanth Hegde @ 2025-12-01 18:31 UTC
  To: mingo, peterz, vincent.guittot, linux-kernel, kprateek.nayak
  Cc: sshegde, dietmar.eggemann, vschneid, rostedt, tglx, tim.c.chen

It was noted that on large systems the nohz.nr_cpus cacheline bounces
quite often. Many CPUs do atomic inc/dec and reads on it at the same
time, so the line can bounce frequently.

The gist of the series is to get rid of nr_cpus and instead use the
cpumask, which is always updated alongside it. Functionally it should
serve the same purpose. At worst, one might miss an idle load balance
due to a race. Looking at the comments, that can happen even today.
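
To illustrate the idea, here is a minimal userspace sketch, not the
kernel code. It assumes a single 64-bit word standing in for the idle
cpumask, and all model_* names are hypothetical. The mask is written on
idle entry/exit anyway, so an emptiness check on it can stand in for
reading a separate atomic counter.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
static _Atomic uint64_t model_idle_mask;

/* CPU enters tickless idle: set its own bit, no separate counter. */
static void model_enter_idle(unsigned int cpu)
{
	atomic_fetch_or_explicit(&model_idle_mask, UINT64_C(1) << cpu,
				 memory_order_relaxed);
}

/* CPU leaves tickless idle: clear its own bit. */
static void model_exit_idle(unsigned int cpu)
{
	atomic_fetch_and_explicit(&model_idle_mask, ~(UINT64_C(1) << cpu),
				  memory_order_relaxed);
}

/*
 * Stands in for a "counter != 0" test. The answer can be stale by the
 * time the caller acts on it, which matches the "might miss an idle
 * load balance due to a race" caveat above.
 */
static bool model_any_idle_cpu(void)
{
	return atomic_load_explicit(&model_idle_mask,
				    memory_order_relaxed) != 0;
}
```

The relaxed ordering here is a choice made for the sketch only; it says
nothing about the ordering the actual patch needs.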

The other patches are minor. There are a couple of time checks that
bail out early. Check the shared variables only after those time checks,
so that the early-exit path avoids cache references to them.
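
A rough userspace sketch of that reordering, under the assumption that
the time check only needs local data while the other check reads a
contended shared variable. The names below are hypothetical, and
model_shared_reads exists only to make the effect visible.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state that many CPUs would read and write. */
static _Atomic unsigned long model_shared_state;
/* Model-only counter: how often the shared variable was read. */
static unsigned long model_shared_reads;

static bool model_should_kick(unsigned long now, unsigned long next_balance)
{
	/* Cheap check first: bail out if it is not time yet. */
	if ((long)(now - next_balance) < 0)
		return false;

	/* Only now touch the shared, possibly contended variable. */
	model_shared_reads++;
	return atomic_load_explicit(&model_shared_state,
				    memory_order_relaxed) != 0;
}
```

With the checks in this order, callers that arrive before next_balance
return without reading the shared variable at all.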

There is a series which aims to solve contention by moving to LLC.
https://lore.kernel.org/all/20250904041516.3046-1-kprateek.nayak@amd.com/
Maybe these bits are useful for that too. We could discuss further at
LPC.

Ran "hackbench 100 process 5000 loops", collected perf cycles, and
picked out the top nohz functions. Benchmark numbers don't change by
much. Will ask our performance team to run the numbers with the series.

baseline: tip sched/core at 3eb593560146

   1.01%  [k] nohz_balance_exit_idle
   0.31%  [k] nohz_balancer_kick
   0.05%  [k] nohz_balance_enter_idle

With series:
   0.45%  [k] nohz_balance_exit_idle
   0.18%  [k] nohz_balancer_kick
   0.01%  [k] nohz_balance_enter_idle


Shrikanth Hegde (4):
  sched/fair: Move checking for nohz cpus after time check
  sched/fair: Change likelyhood of nohz nr_cpus check
  sched/fair: Check for blocked task after time check
  sched/fair: Remove atomic nr_cpus and use cpumask instead

 kernel/sched/fair.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

-- 
2.43.0



Thread overview: 12+ messages
2025-12-01 18:31 [PATCH 0/4] sched/fair: improve nohz fields for large systems Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 1/4] sched/fair: Move checking for nohz cpus after time check Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 2/4] sched/fair: Change likelyhood of nohz nr_cpus check Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 3/4] sched/fair: Check for blocked task after time check Shrikanth Hegde
2025-12-02  6:26   ` Ingo Molnar
2025-12-02  6:55     ` Shrikanth Hegde
2025-12-01 18:31 ` [PATCH 4/4] sched/fair: Remove atomic nr_cpus and use cpumask instead Shrikanth Hegde
2025-12-01 19:58   ` Ingo Molnar
2025-12-02  5:29     ` Shrikanth Hegde
2025-12-02  7:54       ` Ingo Molnar
2025-12-02 14:35         ` Shrikanth Hegde
2025-12-02 16:14           ` Ingo Molnar
