From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: kprateek.nayak@amd.com, juri.lelli@redhat.com,
	tglx@linutronix.de, dietmar.eggemann@arm.com,
	anna-maria@linutronix.de, frederic@kernel.org,
	wangyang.guo@intel.com, mingo@kernel.org, peterz@infradead.org,
	vincent.guittot@linaro.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 3/3] sched/fair: Remove nohz.nr_cpus and use weight of cpumask instead
Date: Fri, 9 Jan 2026 20:45:48 +0530
Message-ID: <d2cec8f3-781b-4a7f-9ca9-e848167e5f30@linux.ibm.com>
In-Reply-To: <xhsmhbjj2onfp.mognet@vschneid-thinkpadt14sgen2i.remote.csb>

Hi Valentin, thanks for going through this.

On 1/9/26 8:14 PM, Valentin Schneider wrote:
> On 07/01/26 12:21, Shrikanth Hegde wrote:
>> nohz.nr_cpus was observed as contended cacheline when running
>> enterprise workload on large systems.
>>
>> Fundamental scalability challenge with nohz.idle_cpus_mask
>> and nohz.nr_cpus is the following:
>>
>>   (1) nohz_balancer_kick() observes (reads) nohz.nr_cpus
>>       (or nohz.idle_cpu_mask) and nohz.has_blocked to  see whether there's
>>       any nohz balancing work to do, in every scheduler tick.
>>
>>   (2) nohz_balance_enter_idle() and nohz_balance_exit_idle()
>>       (through nohz_balancer_kick() via sched_tick()) modify (write)
>>       nohz.nr_cpus (and/or nohz.idle_cpu_mask) and nohz.has_blocked.
>>
> 
> My first reaction on reading the whole changelog was: "but .nr_cpus and
> .idle_cpus_mask are in the same cacheline?!", which as Ingo pointed out
> somewhere down [1] isn't true for CPUMASK_OFFSTACK, so this change
> effectively gets rid of the dirtying of one extra cacheline during idle
> entry/exit.
> 
> [1]: http://lore.kernel.org/r/aS3za7X9BLS5rg65@gmail.com
> 
> I'd suggest adding something like so in this part of the changelog:
> 
> """
> Note that nohz.idle_cpus_mask and nohz.nr_cpus reside in the same
> cacheline, however under CONFIG_CPUMASK_OFFSTACK the backing storage for
> nohz.idle_cpus_mask will be elsewhere. This implies two separate cachelines
> being dirtied upon idle entry / exit.
> """
> 

OK, will do that. Thanks.

Even with CONFIG_CPUMASK_OFFSTACK=n, the usual NR_CPUS configuration is
512, 1024, 2048 or higher.

With a 64-byte cacheline, one cacheline holds the bits for 512 CPUs.
So for NR_CPUS >= 512, idle_cpus_mask and the rest of the nohz fields,
including nr_cpus, will be in different cachelines.

Even on powerpc (128-byte cacheline), where CONFIG_CPUMASK_OFFSTACK=n, the
default is NR_CPUS=2048. That means idle_cpus_mask takes 2 cachelines and the
rest of the nohz fields are in the third cacheline.

So in most cases, this change means dirtying one less cacheline.
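
The arithmetic above can be checked with a small userspace sketch. It is
not kernel code. It assumes the mask is stored inline as the first member of
a cacheline-aligned struct (the CONFIG_CPUMASK_OFFSTACK=n case) and that the
next field follows it directly. The helper names mask_bytes(),
mask_cachelines() and next_field_cacheline() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_BYTE 8

/*
 * Bytes of inline storage for a mask of nr_cpus bits, rounded up to whole
 * unsigned longs (bitmaps are arrays of unsigned long).
 */
static size_t mask_bytes(size_t nr_cpus)
{
	size_t bits_per_long = sizeof(unsigned long) * BITS_PER_BYTE;
	size_t longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * sizeof(unsigned long);
}

/* Number of cachelines an inline mask spans when it starts at offset 0. */
static size_t mask_cachelines(size_t nr_cpus, size_t cacheline_bytes)
{
	assert(cacheline_bytes > 0);
	return (mask_bytes(nr_cpus) + cacheline_bytes - 1) / cacheline_bytes;
}

/*
 * Zero-based index of the cacheline that holds the first field placed
 * right after the inline mask (assumption: no padding in between).
 * Index 0 means that field shares a cacheline with the mask.
 */
static size_t next_field_cacheline(size_t nr_cpus, size_t cacheline_bytes)
{
	assert(cacheline_bytes > 0);
	return mask_bytes(nr_cpus) / cacheline_bytes;
}
```

Under these assumptions, next_field_cacheline(512, 64) is 1 (the mask fills
cacheline 0 and the other fields start in the next one), and
next_field_cacheline(2048, 128) is 2, i.e. the third cacheline on powerpc.
Only for a small NR_CPUS such as 256 with 64-byte cachelines does the result
become 0, meaning the mask and the following field share a cacheline.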

Data points with CONFIG_CPUMASK_OFFSTACK=y/n are here:
https://lore.kernel.org/all/fdb378e7-7797-4aeb-a79f-12af4cb1b81a@linux.ibm.com/
