* commit 3284e4adca9b causes hang on boot with CONFIG_PREEMPT_RT=y
From: Bert Karwatzki @ 2025-07-11 23:00 UTC (permalink / raw)
To: Joel Fernandes
Cc: Bert Karwatzki, linux-kernel, linux-next, linux-rt-devel,
ankur.a.arora, bobo.shaobowang, boqun.feng, frederic, joel,
neeraj.upadhyay, paulmck, rcu, urezki, wangxiongfeng2, xiexiuqi,
xiqi2
When booting linux-next next-20250711 (with CONFIG_PREEMPT_RT=y) on my MSI Alpha 15
laptop running Debian sid amd64, the boot process hangs, with the last
messages displayed on screen being:
fbcon: amdgpudrmfb (fb0) is primary device
Console: switching to colour frame buffer device
amdgpu: 0000:08:00.0: [drm]fb0: amdgpudrmfb frame buffer device
After some time (about 60s) this error message appears (hand copied
from screen, not entirely accurate):
rcu_preempt self detected stall
with call trace
run_irq_workd
smpboot_thread_fn
kthread
? kthreads_online_cpu
? kthreads_online_cpu
ret_from_fork
? kthreads_online_cpu
ret_from_fork
This only occurs when compiling with CONFIG_PREEMPT_RT=y.
I bisected this and found the first bad commit to be
3284e4adca9b ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
Reverting this commit in next-20250711 fixes the issue for me.
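Since the hang depends on CONFIG_PREEMPT_RT=y, anyone trying to reproduce it may want to confirm that the option is really set in the kernel config being built. The following is only a minimal sketch: the helper name preempt_rt_enabled and the .config path are assumptions for illustration and are not part of the original report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): succeeds only if the
# given kernel config file contains the exact line CONFIG_PREEMPT_RT=y.
preempt_rt_enabled() {
    # -q: print nothing, -x: match the whole line, so that a line such as
    # "# CONFIG_PREEMPT_RT is not set" does not count as a match.
    grep -qx 'CONFIG_PREEMPT_RT=y' "$1" 2>/dev/null
}

# Example use against a build tree's .config (the path is an assumption).
if preempt_rt_enabled .config; then
    echo "PREEMPT_RT enabled"
else
    echo "PREEMPT_RT not enabled (or .config not found)"
fi
```

Matching the whole line keeps the check from being fooled by the commented-out form that kernel config files use for disabled options.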
Hardware:
$ lspci -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne Root Complex [1022:1630]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU [1022:1631]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP Bridge [1022:1633]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
00:02.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
00:02.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne PCIe GPP Bridge [1022:1634]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe Dummy Host Bridge [1022:1632]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal PCIe GPP Bridge to Bus [1022:1635]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 51)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 0 [1022:166a]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 1 [1022:166b]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 2 [1022:166c]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 3 [1022:166d]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 4 [1022:166e]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 5 [1022:166f]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 6 [1022:1670]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data Fabric; Function 7 [1022:1671]
01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch [1002:1478] (rev c3)
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch [1002:1479]
03:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] [1002:73ff] (rev c3)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller [1002:ab28]
04:00.0 Network controller [0280]: MEDIATEK Corp. MT7921K (RZ608) Wi-Fi 6E 80MHz [14c3:0608]
05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
06:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company, Inc. KC3000/FURY Renegade NVMe SSD [E18] [2646:5013] (rev 01)
07:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology P1 NVMe PCIe SSD[Frampton] [c0a9:2263] (rev 03)
08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] (rev c5)
08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir Radeon High Definition Audio Controller [1002:1637]
08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) Platform Security Processor [1022:15df]
08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1 [1022:1639]
08:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne USB 3.1 [1022:1639]
08:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD] Audio Coprocessor [1022:15e2] (rev 01)
08:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h/19h/1ah HD Audio Controller [1022:15e3]
08:00.7 Signal processing controller [1180]: Advanced Micro Devices, Inc. [AMD] Sensor Fusion Hub [1022:15e4]
$ head /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 80
model name : AMD Ryzen 7 5800H with Radeon Graphics
stepping : 0
microcode : 0xa50000c
cpu MHz : 2826.830
cache size : 512 KB
physical id : 0
Bert Karwatzki
* Re: commit 3284e4adca9b causes hang on boot with CONFIG_PREEMPT_RT=y
From: Joel Fernandes @ 2025-07-11 23:06 UTC (permalink / raw)
To: Bert Karwatzki
Cc: linux-kernel, linux-next, linux-rt-devel, ankur.a.arora,
bobo.shaobowang, boqun.feng, frederic, joel, neeraj.upadhyay,
paulmck, rcu, urezki, wangxiongfeng2, xiexiuqi, xiqi2
On 7/11/2025 7:00 PM, Bert Karwatzki wrote:
> When booting linux next-20250711 (with CONFIG_PREEMPT_RT=y) on my MSI Alpha 15
> Laptop running debian sid amd64 the boot process hangs with the last
> messages displayed on screen being:
>
> fbcon: amdgpudrmfb (fb0) is primary device
> Console: switching to colour frame buffer device
> amdgpu: 0000:08:00.0: [drm]fb0: admgpudrmfb frame buffer device
>
> after some time (about 60s) this error messages appears (hand copied
> from screen, not entirely accurate)
>
> rcu_preempt self detected stall
>
> with call trace
> run_irq_workd
> smpboot_thread_fn
> kthread
> ? kthreads_online_cpu
> ? kthreads_online_cpu
> ret_from_fork
> ? kthreads_online_cpu
> ret_from_fork
>
> This only occurs when compiling with CONFIG_PREEMPT_RT=y.
> I bisected this and found the first bad commit to be
>
> 3284e4adca9b ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
This commit still contains the old code, which was fixed within the last day.
Here is the new commit:
https://web.git.kernel.org/pub/scm/linux/kernel/git/rcu/linux.git/commit/?h=next&id=2e154d164418e1eaadbf5dc58cbf19e7be8fdc67
Thanks!
- Joel
* Re: commit 3284e4adca9b causes hang on boot with CONFIG_PREEMPT_RT=y
From: Bert Karwatzki @ 2025-07-11 23:27 UTC (permalink / raw)
To: Joel Fernandes
Cc: linux-kernel, linux-next, linux-rt-devel, ankur.a.arora,
bobo.shaobowang, boqun.feng, frederic, joel, neeraj.upadhyay,
paulmck, rcu, urezki, wangxiongfeng2, xiexiuqi, xiqi2, spasswolf
On Friday, 2025-07-11 at 19:06 -0400, Joel Fernandes wrote:
>
> On 7/11/2025 7:00 PM, Bert Karwatzki wrote:
> > When booting linux next-20250711 (with CONFIG_PREEMPT_RT=y) on my MSI Alpha 15
> > Laptop running debian sid amd64 the boot process hangs with the last
> > messages displayed on screen being:
> >
> > fbcon: amdgpudrmfb (fb0) is primary device
> > Console: switching to colour frame buffer device
> > amdgpu: 0000:08:00.0: [drm]fb0: admgpudrmfb frame buffer device
> >
> > after some time (about 60s) this error messages appears (hand copied
> > from screen, not entirely accurate)
> >
> > rcu_preempt self detected stall
> >
> > with call trace
> > run_irq_workd
> > smpboot_thread_fn
> > kthread
> > ? kthreads_online_cpu
> > ? kthreads_online_cpu
> > ret_from_fork
> > ? kthreads_online_cpu
> > ret_from_fork
> >
> > This only occurs when compiling with CONFIG_PREEMPT_RT=y.
> > I bisected this and found the first bad commit to be
> >
> > 3284e4adca9b ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
>
> This commit is still using old code which was fixed in the last day.
>
> Here is the new commit:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/rcu/linux.git/commit/?h=next&id=2e154d164418e1eaadbf5dc58cbf19e7be8fdc67
>
> Thanks!
>
> - Joel
I already found the new commit; it works!
Bert Karwatzki