The Linux Kernel Mailing List
From: Andrea Righi <arighi@nvidia.com>
To: Shrikanth Hegde <sshegde@linux.ibm.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
	Felix Abecassis <fabecassis@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
Date: Sat, 16 May 2026 07:45:21 +0200	[thread overview]
Message-ID: <aggEcYO0aJCgdzXm@gpd4> (raw)
In-Reply-To: <4b04aade-8474-4e37-991e-16f2faedaf0c@linux.ibm.com>

Hi Shrikanth,

On Fri, May 15, 2026 at 12:19:16PM +0530, Shrikanth Hegde wrote:
> 
> 
> On 5/9/26 11:37 PM, Andrea Righi wrote:
> > nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> > called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> > additional rcu_read_lock/unlock() used around sched_domain accesses in
> > this path is redundant. Rely on the existing IRQ-disabled context (and
> > the rcu_dereference_all() checking) instead.
> > 
> > The same applies to set_cpu_sd_state_idle(), called from the idle entry
> > path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> > nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> > disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> > teardown, which runs under cpus_write_lock(), so it cannot race with
> > sched-domain rebuilds). In both cases the rcu_dereference_all()
> > validation is sufficient.
> > 
> > No functional change intended.
> > 
> 
> For this patch, a few more comments below.
> 
> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> 
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> 
> 
> > @@ -12868,17 +12860,13 @@ static void nohz_balancer_kick(struct rq *rq)
> >   static void set_cpu_sd_state_busy(int cpu)
> >   {
> >   	struct sched_domain *sd;
> > -
> > -	rcu_read_lock();
> >   	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
> >   	if (!sd || !sd->nohz_idle)
> > -		goto unlock;
> > +		return;
> >   	sd->nohz_idle = 0;
> >   	atomic_inc(&sd->shared->nr_busy_cpus);
> > -unlock:
> > -	rcu_read_unlock();
> >   }
> >   void nohz_balance_exit_idle(struct rq *rq)
> > @@ -12897,17 +12885,13 @@ void nohz_balance_exit_idle(struct rq *rq)
> >   static void set_cpu_sd_state_idle(int cpu)
> >   {
> >   	struct sched_domain *sd;
> > -
> > -	rcu_read_lock();
> >   	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
> >   	if (!sd || sd->nohz_idle)
> > -		goto unlock;
> > +		return;
> >   	sd->nohz_idle = 1;
> >   	atomic_dec(&sd->shared->nr_busy_cpus);
> > -unlock:
> > -	rcu_read_unlock();
> >   }
> >   /*
> 
> I was looking at other users of sd_llc, i.e. test_idle_cores() and set_idle_cores().
> They use rcu_dereference_all(), so callers need not call rcu_read_lock/unlock()
> if IRQs or preemption are disabled.
> 
> One more place would be update_idle_core(). I think it is called with interrupts
> disabled in the __schedule() path.

Good point, __update_idle_core() reaches set_next_task_idle() via
pick_next_task() in __schedule(), and __schedule() disables IRQs before that
path.

Since set_idle_cores()/test_idle_cores() use rcu_dereference_all(), the
rcu_read_lock/unlock() pair in __update_idle_core() is indeed redundant. I can
send a follow-up patch for this.
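For reference, the follow-up would be the same mechanical change as in this
patch, something along these lines (just a sketch, assuming __update_idle_core()
in kernel/sched/fair.c still has its current shape):

 void __update_idle_core(struct rq *rq)
 {
 	int core = cpu_of(rq);
 	int cpu;
 
-	rcu_read_lock();
 	if (test_idle_cores(core))
-		goto unlock;
+		return;
 
 	for_each_cpu(cpu, cpu_smt_mask(core)) {
 		if (cpu == core)
 			continue;
 
 		if (!available_idle_cpu(cpu))
-			goto unlock;
+			return;
 	}
 
 	set_idle_cores(core, 1);
-unlock:
-	rcu_read_unlock();
 }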

> 
> And in sched_ext, scx_idle_update_selcpu_topology() seems to be tied to CPU hotplug,
> and by the same logic of cpus_write_lock() being held, one could remove the redundant
> rcu_read_lock() there as well.
> 
> No?

For scx_idle_update_selcpu_topology() it's a bit more nuanced, if I'm not
missing anything:
 - the helpers it uses (llc_weight/llc_span/numa_weight/numa_span) use plain
   rcu_dereference(), so simply dropping rcu_read_lock() in the caller would
   trip the lockdep check. They'd need to be converted to rcu_dereference_all()
   first;
 - the two call sites have different protection:
     - handle_hotplug() runs from a CPU hotplug callback, so cpus_write_lock()
       is held, which serializes against sched-domain rebuilds;
    - scx_enable() only holds cpus_read_lock(), which doesn't on
      its own prevent cpuset sched-domain rebuilds (those run under
      cpus_read_lock() too).
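
The helper conversion itself would be mechanical, e.g. (again a sketch,
assuming the helpers still dereference sd_llc directly):

-	sd = rcu_dereference(per_cpu(sd_llc, cpu));
+	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));

in llc_weight()/llc_span(), and similarly for the NUMA helpers. The harder
part is the scx_enable() call site, where only cpus_read_lock() is held.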

I think this one needs a separate, more careful patch. Maybe we should keep this
series scoped to the NOHZ kick path and address those as follow-ups?

Thanks,
-Andrea


Thread overview: 24+ messages
2026-05-09 18:07 [PATCH v6 0/5 RESEND] sched/fair: SMT-aware asymmetric CPU capacity Andrea Righi
2026-05-09 18:07 ` [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path Andrea Righi
2026-05-11 13:04   ` Vincent Guittot
2026-05-15  6:49   ` Shrikanth Hegde
2026-05-16  5:45     ` Andrea Righi [this message]
2026-05-09 18:07 ` [PATCH 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity Andrea Righi
2026-05-11 13:04   ` Vincent Guittot
2026-05-15 10:05   ` Shrikanth Hegde
2026-05-16  5:58     ` [PATCH v2 " Andrea Righi
2026-05-09 18:07 ` [PATCH 3/5] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection Andrea Righi
2026-05-11 13:07   ` Vincent Guittot
2026-05-11 13:45     ` Andrea Righi
2026-05-11 14:25     ` [PATCH v2 " Andrea Righi
2026-05-09 18:07 ` [PATCH 4/5] sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity Andrea Righi
2026-05-11 13:07   ` Vincent Guittot
2026-05-15 10:09   ` Shrikanth Hegde
2026-05-16  9:04     ` Andrea Righi
2026-05-09 18:07 ` [PATCH 5/5] sched/fair: Add SIS_UTIL support to select_idle_capacity() Andrea Righi
2026-05-11 13:08   ` Vincent Guittot
  -- strict thread matches above, loose matches on Subject: below --
2026-05-09 18:01 Andrea Righi
2026-05-09 18:01 ` [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path Andrea Righi
2026-04-28 14:41 [PATCH v5 0/5] sched/fair: SMT-aware asymmetric CPU capacity Andrea Righi
2026-04-28 14:41 ` [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path Andrea Righi
2026-04-28 16:29   ` K Prateek Nayak
2026-05-05  9:15   ` Dietmar Eggemann
2026-05-05  9:22     ` Andrea Righi
