From: Andrea Righi <arighi@nvidia.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Christian Loehle <christian.loehle@arm.com>,
Phil Auld <pauld@redhat.com>, Koba Ko <kobak@nvidia.com>,
Felix Abecassis <fabecassis@nvidia.com>,
Balbir Singh <balbirs@nvidia.com>,
Joel Fernandes <joelagnelf@nvidia.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path
Date: Thu, 21 May 2026 22:13:43 +0200
Message-ID: <ag9ndzFtUahhsYPZ@gpd4>
In-Reply-To: <38fe0a1d-1a48-435a-910a-c278024d9ac9@samsung.com>

Hi Marek,

On Thu, May 21, 2026 at 09:47:03PM +0200, Marek Szyprowski wrote:
> On 09.05.2026 20:07, Andrea Righi wrote:
> > nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> > called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> > additional rcu_read_lock/unlock() used around sched_domain accesses in
> > this path is redundant. Rely on the existing IRQ-disabled context (and
> > the rcu_dereference_all() checking) instead.
> >
> > The same applies to set_cpu_sd_state_idle(), called from the idle entry
> > path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> > nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> > disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> > teardown, which runs under cpus_write_lock(), so it cannot race with
> > sched-domain rebuilds). In both cases the rcu_dereference_all()
> > validation is sufficient.
> >
> > No functional change intended.
> >
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
> > Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> This patch landed in today's linux-next as commit c9d93a73ce87 ("sched/fair: Drop
> redundant RCU read lock in NOHZ kick path"). In my tests I found that it introduced
> the following warning during the CPU hot-plug tests:
>
>
> root@target:~# for i in /sys/devices/system/cpu/cpu[1-9]; do echo 0 >$i/online; done
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-rc2+ #12775 Not tainted
> -----------------------------
> kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by cpuhp/1/20:
> #0: ffffffff81a16220 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
> #1: ffffffff81a16270 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Not tainted 7.1.0-rc2+ #12775 PREEMPTLAZY
> Hardware name: StarFive VisionFive 2 v1.2A (DT)
> Call Trace:
> [<ffffffff8001827c>] dump_backtrace+0x1c/0x24
> [<ffffffff800014c0>] show_stack+0x28/0x34
> [<ffffffff80010d42>] dump_stack_lvl+0x5e/0x86
> [<ffffffff80010d7e>] dump_stack+0x14/0x1c
> [<ffffffff800987ec>] lockdep_rcu_suspicious+0x14c/0x1b8
> [<ffffffff80079992>] nohz_balance_exit_idle+0xf4/0xf6
> [<ffffffff800664e6>] sched_cpu_deactivate+0x6c/0x1c8
> [<ffffffff8002a5d0>] cpuhp_invoke_callback+0xf8/0x1ce
> [<ffffffff8002a944>] cpuhp_thread_fun+0x150/0x1ae
> [<ffffffff8005dc64>] smpboot_thread_fn+0x138/0x2a4
> [<ffffffff800554ae>] kthread+0xea/0x10c
> [<ffffffff800134c4>] ret_from_fork_kernel+0x22/0x386
> [<ffffffff80c278ee>] ret_from_fork_kernel_asm+0x16/0x18
> CPU1: off
> CPU2: off
> CPU3: off
>
> This issue is observed on most of my ARM 32bit, ARM 64bit and RiscV64 based boards.
>
Ah, yes, makes sense. We missed the CPU hotplug case. When CPUs are taken
offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

The cpuhp kthread holds cpu_hotplug_lock, but runs with preemption and IRQs
enabled. I think we should just restore the RCU read lock in
set_cpu_sd_state_{busy,idle}() to fix this. I'll send a patch soon.

Thanks,
-Andrea
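
For readers following the thread, the difference between the two call
contexts can be illustrated with a small userspace model. This is only a
sketch and not the kernel code or the eventual fix: every model_* name is
hypothetical, the sched-domain data is reduced to one counter, and the rule
used by the check (pass if an RCU read-side section or an IRQs-disabled
context is active) is an assumption based on the description in the quoted
commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Toy userspace model, NOT kernel code. Every model_* name is a
 * hypothetical stand-in invented for this illustration.
 */
static int model_rcu_nesting;     /* depth of rcu_read_lock() sections */
static bool model_irqs_disabled;  /* true models the sched_tick() path */
static int model_warnings;        /* number of modelled lockdep splats */

struct model_sd_shared {
	int nr_busy_cpus;
};

static struct model_sd_shared model_llc_shared;

static void model_rcu_read_lock(void)
{
	model_rcu_nesting++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_nesting > 0); /* unlock must pair with a lock */
	model_rcu_nesting--;
}

/*
 * Assumption: the check passes when either an RCU read-side section or an
 * IRQs-disabled context is active, and complains otherwise.
 */
static struct model_sd_shared *model_rcu_dereference(struct model_sd_shared *p)
{
	if (model_rcu_nesting == 0 && !model_irqs_disabled) {
		model_warnings++;
		fprintf(stderr, "model: suspicious RCU usage\n");
	}
	return p;
}

/* Shape after the patch: relies entirely on the caller's context. */
static void model_set_busy_unlocked(void)
{
	struct model_sd_shared *sds = model_rcu_dereference(&model_llc_shared);

	sds->nr_busy_cpus++;
}

/* Shape with the read lock restored: passes the check from any context. */
static void model_set_busy_locked(void)
{
	struct model_sd_shared *sds;

	model_rcu_read_lock();
	sds = model_rcu_dereference(&model_llc_shared);
	sds->nr_busy_cpus++;
	model_rcu_read_unlock();
}
```

With model_irqs_disabled left false (the cpuhp kthread case), calling
model_set_busy_unlocked() bumps model_warnings, while
model_set_busy_locked() does not. With model_irqs_disabled set to true (the
tick case), neither variant warns.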