From: Andrea Righi <arighi@nvidia.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Christian Loehle <christian.loehle@arm.com>,
Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
Felix Abecassis <fabecassis@nvidia.com>,
Balbir Singh <balbirs@nvidia.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
Date: Thu, 21 May 2026 22:13:43 +0200
Message-ID: <ag9ndzFtUahhsYPZ@gpd4>
In-Reply-To: <38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com>

Hi Marek,

On Thu, May 21, 2026 at 09:47:03PM +0200, Marek Szyprowski wrote:
> On 09.05.2026 20:07, Andrea Righi wrote:
> > nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> > called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> > additional rcu_read_lock/unlock() used around sched_domain accesses in
> > this path is redundant. Rely on the existing IRQ-disabled context (and
> > the rcu_dereference_all() checking) instead.
> >
> > The same applies to set_cpu_sd_state_idle(), called from the idle entry
> > path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> > nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> > disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> > teardown, which runs under cpus_write_lock(), so it cannot race with
> > sched-domain rebuilds). In both cases the rcu_dereference_all()
> > validation is sufficient.
> >
> > No functional change intended.
> >
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> This patch landed in today's linux-next as commit c9d93a73ce87 ("sched/fair: Drop
> redundant RCU read lock in NOHZ kick path"). In my tests I found that it introduced
> the following warning during the CPU hot-plug tests:
>
>
> root@target:~# for i in /sys/devices/system/cpu/cpu[1-9]; do echo 0 >$i/online; done
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-rc2+ #12775 Not tainted
> -----------------------------
> kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by cpuhp/1/20:
> #0: ffffffff81a16220 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
> #1: ffffffff81a16270 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Not tainted 7.1.0-rc2+ #12775 PREEMPTLAZY
> Hardware name: StarFive VisionFive 2 v1.2A (DT)
> Call Trace:
> [<ffffffff8001827c>] dump_backtrace+0x1c/0x24
> [<ffffffff800014c0>] show_stack+0x28/0x34
> [<ffffffff80010d42>] dump_stack_lvl+0x5e/0x86
> [<ffffffff80010d7e>] dump_stack+0x14/0x1c
> [<ffffffff800987ec>] lockdep_rcu_suspicious+0x14c/0x1b8
> [<ffffffff80079992>] nohz_balance_exit_idle+0xf4/0xf6
> [<ffffffff800664e6>] sched_cpu_deactivate+0x6c/0x1c8
> [<ffffffff8002a5d0>] cpuhp_invoke_callback+0xf8/0x1ce
> [<ffffffff8002a944>] cpuhp_thread_fun+0x150/0x1ae
> [<ffffffff8005dc64>] smpboot_thread_fn+0x138/0x2a4
> [<ffffffff800554ae>] kthread+0xea/0x10c
> [<ffffffff800134c4>] ret_from_fork_kernel+0x22/0x386
> [<ffffffff80c278ee>] ret_from_fork_kernel_asm+0x16/0x18
> CPU1: off
> CPU2: off
> CPU3: off
>
> This issue is observed on most of my ARM 32bit, ARM 64bit and RiscV64 based boards.
>
Ah, yes, makes sense. We missed the CPU hotplug case. When CPUs are taken
offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

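The cpuhp kthread holds cpu_hotplug_lock, but it runs with preemption and IRQs
enabled, and cpu_hotplug_lock is not an RCU read-side critical section, so the
rcu_dereference_all() check fires. I think we should just restore the RCU read
lock in set_cpu_sd_state_{busy,idle}() to fix this. I'll send a patch soon.
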
Thanks,
-Andrea
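
For illustration only, here is a minimal userspace sketch of the situation
described above. It is not kernel code and not the actual fix: every mock_*
name, the two set_busy_*() variants, and the simplified structs are
assumptions made for this example. It models only the idea that a checked
dereference complains unless the caller is in some RCU read-side context (an
explicit read lock, or IRQs disabled), which is why the hotplug path reports
a warning and why taking the read lock again would silence it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct sched_domain_shared { int nr_busy_cpus; };
struct sched_domain { struct sched_domain_shared *shared; int nohz_idle; };

static int mock_rcu_depth;       /* nesting level of mock_rcu_read_lock() */
static bool mock_irqs_disabled;  /* stand-in for an IRQ-disabled context */
static int mock_rcu_splats;      /* number of "suspicious RCU usage" reports */

void mock_rcu_read_lock(void)   { mock_rcu_depth++; }
void mock_rcu_read_unlock(void) { mock_rcu_depth--; }

/*
 * Hypothetical stand-in for a lockdep-checked dereference: it records a
 * report when no read-side context of any kind is active.
 */
struct sched_domain *mock_rcu_dereference_all(struct sched_domain *p)
{
	if (mock_rcu_depth == 0 && !mock_irqs_disabled)
		mock_rcu_splats++;
	return p;
}

static struct sched_domain_shared shared = { .nr_busy_cpus = 0 };
static struct sched_domain llc = { .shared = &shared, .nohz_idle = 1 };

/* Variant that relies on the caller's context (no explicit read lock). */
void set_busy_unlocked(void)
{
	struct sched_domain *sd = mock_rcu_dereference_all(&llc);

	if (!sd || !sd->nohz_idle)
		return;
	sd->nohz_idle = 0;
	sd->shared->nr_busy_cpus++;
}

/* Variant that takes the read lock itself, so any caller context is fine. */
void set_busy_locked(void)
{
	struct sched_domain *sd;

	mock_rcu_read_lock();
	sd = mock_rcu_dereference_all(&llc);
	if (sd && sd->nohz_idle) {
		sd->nohz_idle = 0;
		sd->shared->nr_busy_cpus++;
	}
	mock_rcu_read_unlock();
}
```

In this model, calling set_busy_unlocked() with mock_irqs_disabled set to
false corresponds to the cpuhp kthread case and increments mock_rcu_splats,
while set_busy_locked() does not, whatever the caller's context.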