The Linux Kernel Mailing List
From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Andrea Righi <arighi@nvidia.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
	Felix Abecassis <fabecassis@nvidia.com>,
	Balbir Singh <balbirs@nvidia.com>,
	Joel Fernandes <joelagnelf@nvidia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
Date: Fri, 15 May 2026 12:19:16 +0530	[thread overview]
Message-ID: <4b04aade-8474-4e37-991e-16f2faedaf0c@linux.ibm.com> (raw)
In-Reply-To: <20260509180955.1840064-2-arighi@nvidia.com>



On 5/9/26 11:37 PM, Andrea Righi wrote:
> nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> additional rcu_read_lock/unlock() used around sched_domain accesses in
> this path is redundant. Rely on the existing IRQ-disabled context (and
> the rcu_dereference_all() checking) instead.
> 
> The same applies to set_cpu_sd_state_idle(), called from the idle entry
> path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> teardown, which runs under cpus_write_lock(), so it cannot race with
> sched-domain rebuilds). In both cases the rcu_dereference_all()
> validation is sufficient.
> 
> No functional change intended.
> 

For this patch, a few more comments below.

Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>

> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>


> @@ -12868,17 +12860,13 @@ static void nohz_balancer_kick(struct rq *rq)
>   static void set_cpu_sd_state_busy(int cpu)
>   {
>   	struct sched_domain *sd;
> -
> -	rcu_read_lock();
>   	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
>   
>   	if (!sd || !sd->nohz_idle)
> -		goto unlock;
> +		return;
>   	sd->nohz_idle = 0;
>   
>   	atomic_inc(&sd->shared->nr_busy_cpus);
> -unlock:
> -	rcu_read_unlock();
>   }
>   
>   void nohz_balance_exit_idle(struct rq *rq)
> @@ -12897,17 +12885,13 @@ void nohz_balance_exit_idle(struct rq *rq)
>   static void set_cpu_sd_state_idle(int cpu)
>   {
>   	struct sched_domain *sd;
> -
> -	rcu_read_lock();
>   	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
>   
>   	if (!sd || sd->nohz_idle)
> -		goto unlock;
> +		return;
>   	sd->nohz_idle = 1;
>   
>   	atomic_dec(&sd->shared->nr_busy_cpus);
> -unlock:
> -	rcu_read_unlock();
>   }
>   
>   /*

I was looking at other users of sd_llc, i.e. test_idle_core() and set_idle_core().
They use rcu_dereference_all(), so callers need not call rcu_read_lock/unlock() if
IRQs or preemption are disabled.

One more place would be update_idle_core(). I think it is called with interrupts
disabled in the __schedule() path.
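
If that holds, the same conversion as in this patch should apply there too. Roughly
something like the below (untested sketch, written from memory of the current
__update_idle_core() shape, and assuming test_idle_core() keeps doing the
rcu_dereference_all() checking):

```diff
 void __update_idle_core(struct rq *rq)
 {
 	int core = cpu_of(rq);
 	int cpu;

-	rcu_read_lock();
 	if (test_idle_core(core, true))
-		goto unlock;
+		return;

 	for_each_cpu(cpu, cpu_smt_mask(core)) {
 		if (cpu == core)
 			continue;

 		if (!available_idle_cpu(cpu))
-			goto unlock;
+			return;
 	}

 	set_idle_core(core, 1);
-unlock:
-	rcu_read_unlock();
 }
```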

And in sched_ext, scx_idle_update_selcpu_topology() seems to be tied to CPU hotplug, so
by the same logic of cpus_write_lock() being held, the redundant rcu_read_lock() could
be removed there as well.

No?