From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Stefan Wiehler <stefan.wiehler@nokia.com>
Cc: linux-arm-kernel@lists.infradead.org
Subject: Re: Lockdep-RCU splat in ARM CPU hotplug
Date: Tue, 5 Mar 2024 21:04:39 +0000
Message-ID: <ZeeI5+u63jE9NSvX@shell.armlinux.org.uk>
In-Reply-To: <a08b5efd-cda4-4e1b-968e-05b900b5b215@nokia.com>

On Tue, Mar 05, 2024 at 05:00:06PM +0100, Stefan Wiehler wrote:
> Hi,
>
> With CONFIG_PROVE_RCU_LIST=y and by executing
>
> $ echo 0 > /sys/devices/system/cpu/cpu1/online
>
> one can trigger the following Lockdep-RCU splat on ARM (reproducible on an Orange Pi PC in QEMU):
>
> =============================
> WARNING: suspicious RCU usage
> 6.8.0-rc7-00001-g0db1d0ed8958 #10 Not tainted
> -----------------------------
> kernel/locking/lockdep.c:3762 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
>
> RCU used illegally from offline CPU!
> rcu_scheduler_active = 2, debug_locks = 1
> no locks held by swapper/1/0.
>
> stack backtrace:
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.8.0-rc7-00001-g0db1d0ed8958 #10
> Hardware name: Allwinner sun8i Family
> unwind_backtrace from show_stack+0x10/0x14
> show_stack from dump_stack_lvl+0x60/0x90
> dump_stack_lvl from lockdep_rcu_suspicious+0x150/0x1a0
> lockdep_rcu_suspicious from __lock_acquire+0x11fc/0x29f8
> __lock_acquire from lock_acquire+0x10c/0x348
> lock_acquire from _raw_spin_lock_irqsave+0x50/0x6c
> _raw_spin_lock_irqsave from check_and_switch_context+0x7c/0x4a8
> check_and_switch_context from arch_cpu_idle_dead+0x10/0x7c
> arch_cpu_idle_dead from do_idle+0xbc/0x138
> do_idle from cpu_startup_entry+0x28/0x2c
> cpu_startup_entry from secondary_start_kernel+0x11c/0x124
> secondary_start_kernel from 0x401018a0
>
> Originally the splat was found on an AXM5516 with v5.15, so the issue has presumably existed for quite some time on all ARM boards.
>
> Lockdep-RCU is triggered by this call of raw_spin_lock_irqsave() in check_and_switch_context() while the CPU is already marked offline: https://elixir.bootlin.com/linux/v6.8-rc7/source/arch/arm/mm/context.c#L257
>
> On ARM64, we have cpu_die_early() calling rcutree_report_cpu_dead() which presumably prevents such a splat from occurring: https://elixir.bootlin.com/linux/v6.8-rc7/source/arch/arm64/kernel/smp.c#L412
>
> Simply calling rcutree_report_cpu_dead() in arch_cpu_idle_dead() on ARM seems to have no effect though. As my understanding of the CPU hotplugging subsystem on ARM is a bit limited, I would appreciate some help here.

So I think this is down to what check_and_switch_context() is doing.

Tracing through the paths, idle_task_exit() is called from the
arch_cpu_idle_dead() path on both 32-bit ARM and x86. So this is legal
to do (if it weren't, x86 would have problems too).

idle_task_exit() calls switch_mm(), which is an arch-defined function,
and on 32-bit ARM this calls check_and_switch_context(). Anything which
switch_mm() calls has to be safe to be called from the
arch_cpu_idle_dead() path.
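
To make the call chain concrete, here is a rough user-space model of it.
This is only a sketch and not kernel code: every model_* name, the
cpu_online[] array and the splat counter are made up for illustration,
and the model ignores the lockless fast path in the real
check_and_switch_context(). It only shows that a lock taken at the
bottom of the chain is reached after the CPU has been marked offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 2

/* Toy stand-ins for kernel state; none of this is real kernel code. */
static bool cpu_online[NR_CPUS] = { true, true };
static int splat_count;	/* number of "suspicious RCU usage" reports */

/* Models the lockdep check: locking on an offline CPU is reported. */
void model_lock_acquire(int cpu)
{
	if (!cpu_online[cpu]) {
		splat_count++;
		printf("WARNING: suspicious RCU usage on offline CPU %d\n", cpu);
	}
}

/* Models check_and_switch_context(), which takes the ASID spinlock. */
void model_check_and_switch_context(int cpu)
{
	model_lock_acquire(cpu);	/* raw_spin_lock_irqsave() in the real code */
	/* ... ASID bookkeeping would happen here ... */
}

/* Models the arch-defined switch_mm(). */
void model_switch_mm(int cpu)
{
	model_check_and_switch_context(cpu);
}

/* Models idle_task_exit(), which switches the idle task's mm. */
void model_idle_task_exit(int cpu)
{
	model_switch_mm(cpu);
}

/* Models arch_cpu_idle_dead(), run by the dying CPU's idle task. */
void model_arch_cpu_idle_dead(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	model_idle_task_exit(cpu);
}

/* Models "echo 0 > .../cpuN/online"; returns the number of reports. */
int model_cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	splat_count = 0;
	cpu_online[cpu] = false;	/* the CPU is marked offline first ... */
	model_arch_cpu_idle_dead(cpu);	/* ... then runs the dead-idle path */
	return splat_count;
}
```

In this model, model_cpu_offline(1) returns 1: the lock is only reached
after the CPU has already been marked offline, which is the ordering the
splat complains about.
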

We can't get rid of the spinlock in check_and_switch_context() as that
is fundamental to how the ASID handling works - removing it would
cause all sorts of races.
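
As a rough illustration of why the lock matters, the following is a
heavily simplified user-space model of an ASID allocator. It is a sketch
under assumptions, not the code in arch/arm/mm/context.c: the model_*
names, the tiny MODEL_NUM_ASIDS value and the 8-bit packing are chosen
for illustration, and the real allocator additionally tracks which ASIDs
are in use and flushes TLBs on rollover. The point is only that "pick
the next ASID, roll over to a new generation if needed" is a
read-modify-write on shared state, so it has to happen as one step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_NUM_ASIDS 4	/* tiny on purpose; real hardware has far more */

/* A minimal spinlock standing in for the kernel's ASID lock. */
static atomic_flag model_asid_lock = ATOMIC_FLAG_INIT;
static uint64_t model_generation = 1;
static unsigned int model_next_asid = 1;	/* ASID 0 is reserved in this model */

void model_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&model_asid_lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

void model_unlock(void)
{
	atomic_flag_clear_explicit(&model_asid_lock, memory_order_release);
}

/*
 * Hands out a context value with the generation in the upper bits and
 * the ASID in the low 8 bits. Without the lock, two CPUs could read the
 * same model_next_asid, or one could allocate across a rollover that
 * the other is halfway through.
 */
uint64_t model_new_context(void)
{
	uint64_t ctx;

	model_lock();
	if (model_next_asid == MODEL_NUM_ASIDS) {
		/* Rollover: start a new generation, reuse ASIDs from 1. */
		model_generation++;
		model_next_asid = 1;
	}
	ctx = (model_generation << 8) | model_next_asid++;
	model_unlock();

	return ctx;
}
```

With MODEL_NUM_ASIDS set to 4, the first three calls hand out ASIDs 1 to
3 in generation 1, and the fourth call rolls over to generation 2.
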

I don't see how we can solve this at the moment, which isn't helped by
my limited RCU knowledge.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!