From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Andrea Righi <arighi@nvidia.com>, K Prateek Nayak <kprateek.nayak@amd.com>, Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Valentin Schneider <vschneid@redhat.com>, Christian Loehle <christian.loehle@arm.com>, Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>, Felix Abecassis <fabecassis@nvidia.com>, Balbir Singh <balbirs@nvidia.com>, Joel Fernandes <joelagnelf@nvidia.com>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
Date: Fri, 15 May 2026 12:19:16 +0530	[thread overview]
Message-ID: <4b04aade-8474-4e37-991e-16f2faedaf0c@linux.ibm.com> (raw)
In-Reply-To: <20260509180955.1840064-2-arighi@nvidia.com>

On 5/9/26 11:37 PM, Andrea Righi wrote:
> nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> additional rcu_read_lock/unlock() used around sched_domain accesses in
> this path is redundant. Rely on the existing IRQ-disabled context (and
> the rcu_dereference_all() checking) instead.
>
> The same applies to set_cpu_sd_state_idle(), called from the idle entry
> path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> teardown, which runs under cpus_write_lock(), so it cannot race with
> sched-domain rebuilds). In both cases the rcu_dereference_all()
> validation is sufficient.
>
> No functional change intended.
>

For this patch, a few more comments below.
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>

> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
>
> @@ -12868,17 +12860,13 @@ static void nohz_balancer_kick(struct rq *rq)
>  static void set_cpu_sd_state_busy(int cpu)
>  {
>  	struct sched_domain *sd;
> -
> -	rcu_read_lock();
>  	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
>
>  	if (!sd || !sd->nohz_idle)
> -		goto unlock;
> +		return;
>  	sd->nohz_idle = 0;
>
>  	atomic_inc(&sd->shared->nr_busy_cpus);
> -unlock:
> -	rcu_read_unlock();
>  }
>
>  void nohz_balance_exit_idle(struct rq *rq)
> @@ -12897,17 +12885,13 @@ void nohz_balance_exit_idle(struct rq *rq)
>  static void set_cpu_sd_state_idle(int cpu)
>  {
>  	struct sched_domain *sd;
> -
> -	rcu_read_lock();
>  	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
>
>  	if (!sd || sd->nohz_idle)
> -		goto unlock;
> +		return;
>  	sd->nohz_idle = 1;
>
>  	atomic_dec(&sd->shared->nr_busy_cpus);
> -unlock:
> -	rcu_read_unlock();
>  }
>
>  /*

I was looking at other users of sd_llc, i.e. test_idle_core() and set_idle_core(). They already use rcu_dereference_all(), so callers need not take rcu_read_lock/unlock() when IRQs or preemption are disabled.

One more place would be update_idle_core(). I think it is called with interrupts disabled in the __schedule() path.

And in sched_ext, scx_idle_update_selcpu_topology() seems to be tied to CPU hotplug; by the same logic of cpus_write_lock() being held, one could remove the redundant rcu_read_lock() there as well. No?
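For update_idle_core(), something like the below is what I had in mind. This is an untested sketch against my reading of __update_idle_core() in kernel/sched/fair.c (the exact body and helper signatures may differ in your tree), and it assumes test_idle_cores()/set_idle_cores() do their own validated dereference of sd_llc_shared as discussed above, so the IRQ-disabled __schedule() context covers the read side:

```diff
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ void __update_idle_core(struct rq *rq)
 	int core = cpu_of(rq);
 	int cpu;
 
-	rcu_read_lock();
 	if (test_idle_cores(core))
-		goto unlock;
+		return;
 
 	for_each_cpu(cpu, cpu_smt_mask(core)) {
 		if (cpu == core)
 			continue;
 
 		if (!available_idle_cpu(cpu))
-			goto unlock;
+			return;
 	}
 
 	set_idle_cores(core, 1);
-unlock:
-	rcu_read_unlock();
 }
```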