public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
@ 2025-11-26 14:32 Borislav Petkov
  2025-11-26 15:26 ` Borislav Petkov
  2025-11-26 16:00 ` Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2025-11-26 14:32 UTC (permalink / raw)
  To: iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	linux-kernel

Hi,

this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
because the backtrace points to it.

Ideas?

[   12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
[   12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
[   12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
[   12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.
[   12.983644] Sending NMI from CPU 1 to CPUs 0:
[   12.983652] NMI backtrace for cpu 0
[   12.983655] CPU: 0 UID: 0 PID: 504 Comm: (modprobe) Not tainted 6.18.0-rc7+ #1 PREEMPT(voluntary) 
[   12.983658] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
[   12.983660] RIP: 0010:delay_halt_mwaitx+0x37/0x40
[   12.983665] Code: 01 31 d2 89 d1 48 05 00 d0 3a 83 0f 01 fa b8 ff ff ff ff b9 02 00 00 00 48 39 c6 48 0f 46 c6 48 89 c3 b8 f0 00 00 00 0f 01 fb <5b> e9 ae 2b 1d ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90
[   12.983666] RSP: 0018:ffffc90000003d88 EFLAGS: 00000097
[   12.983668] RAX: 00000000000000f0 RBX: 0000000000000b9b RCX: 0000000000000002
[   12.983670] RDX: 0000000000000000 RSI: 0000000000000b9b RDI: 00000034e14b49dc
[   12.983671] RBP: 00000034e14b49dc R08: 000000000000006c R09: 0000000000000002
[   12.983672] R10: 0000000000000050 R11: 0000000000000002 R12: 00000000000001a4
[   12.983673] R13: 0000000000000002 R14: ffff888100064018 R15: ffff888100064000
[   12.983674] FS:  00007f8ae7452e00(0000) GS:ffff889086e58000(0000) knlGS:0000000000000000
[   12.983675] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   12.983676] CR2: 00007f8ae6c9a8a0 CR3: 0000000141913004 CR4: 0000000000770ef0
[   12.983677] PKRU: 55555554
[   12.983678] Call Trace:
[   12.983680]  <IRQ>
[   12.983682]  delay_halt+0x3b/0x60
[   12.983685]  iommu_completion_wait.part.0.isra.0+0xd3/0x100
[   12.983693]  domain_flush_complete+0x64/0xc0
[   12.983696]  amd_iommu_flush_iotlb_all+0x33/0x50
[   12.983700]  fq_flush_timeout+0x34/0xd0
[   12.983703]  ? __pfx_fq_flush_timeout+0x10/0x10
[   12.983706]  call_timer_fn+0xa8/0x220
[   12.983713]  ? __pfx_fq_flush_timeout+0x10/0x10
[   12.983715]  __run_timers+0x1f6/0x2b0
[   12.983724]  run_timer_base+0x46/0x60
[   12.983727]  run_timer_softirq+0x16/0x30
[   12.983729]  handle_softirqs+0xe5/0x430
[   12.983736]  __irq_exit_rcu+0xc3/0x120
[   12.983738]  irq_exit_rcu+0xa/0x30
[   12.983740]  sysvec_apic_timer_interrupt+0x89/0xb0
[   12.983744]  </IRQ>
[   12.983744]  <TASK>
[   12.983746]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[   12.983749] RIP: 0010:console_flush_all+0x49e/0x620
[   12.983751] Code: 90 60 01 02 0f 85 91 01 00 00 48 83 c4 40 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f e9 78 a1 d5 ff e8 c8 ce 0e 00 fb 0f 1f 44 00 00 <41> 80 3c 24 00 48 8b 43 58 0f 84 07 ff ff ff eb ba e8 5c cf 0e 00
[   12.983752] RSP: 0018:ffffc90003d87bf8 EFLAGS: 00000202
[   12.983754] RAX: 000000000000162f RBX: ffffffff82a59f80 RCX: 0000000000000000
[   12.983755] RDX: 0000000000000000 RSI: ffffffff8255d9fc RDI: ffffffff82532a7e
[   12.983755] RBP: 0000000000000200 R08: 0000000000000001 R09: 0000000000000000
[   12.983756] R10: 0000000000000001 R11: 0000000000000003 R12: ffffc90003d87c73
[   12.983757] R13: ffffc90003d87c78 R14: 0000000000000000 R15: ffffffff813a9e3d
[   12.983758]  ? console_flush_all+0x33d/0x620
[   12.983768]  ? console_flush_all+0x498/0x620
[   12.983770]  ? console_flush_all+0x3d/0x620
[   12.983776]  console_unlock+0x155/0x230
[   12.983781]  vprintk_emit+0x2e5/0x500
[   12.983787]  devkmsg_emit.constprop.0+0x56/0x70
[   12.983793]  devkmsg_write.cold+0x1c/0x4a
[   12.983797]  do_iter_readv_writev+0x139/0x230
[   12.983802]  vfs_writev+0xfa/0x500
[   12.983817]  ? do_writev+0x7e/0x120
[   12.983819]  do_writev+0x7e/0x120
[   12.983824]  do_syscall_64+0x62/0x350
[   12.983828]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   12.983830] RIP: 0033:0x7f8ae6a99687
[   12.983832] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[   12.983833] RSP: 002b:00007ffc877c5cf0 EFLAGS: 00000202 ORIG_RAX: 0000000000000014
[   12.983835] RAX: ffffffffffffffda RBX: 00007f8ae7452e00 RCX: 00007f8ae6a99687
[   12.983836] RDX: 0000000000000007 RSI: 00007ffc877c5dc0 RDI: 0000000000000003
[   12.983836] RBP: 00007f8ae6f133c6 R08: 0000000000000000 R09: 0000000000000000
[   12.983837] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[   12.983838] R13: 00007ffc877c5f88 R14: 000000000000001f R15: 0000000000000000
[   12.983847]  </TASK>
[   12.984730] (journald)[506]: ID-mapping is supported for exec directory /run/systemd/journal
[   13.400185] (journald)[506]: Successfully forked off '(sd-mkdcreds)' as PID 510.
[   13.400955] (sd-mkdcre[510]: Changing mount propagation /dev (MS_REC|MS_SLAVE "")

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-26 14:32 amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/ Borislav Petkov
@ 2025-11-26 15:26 ` Borislav Petkov
  2025-11-28 20:28   ` Paul E. McKenney
  2025-11-26 16:00 ` Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2025-11-26 15:26 UTC (permalink / raw)
  To: iommu, Paul E. McKenney
  Cc: Joerg Roedel, Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	linux-kernel

On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> Hi,
> 
> this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> because the backtrace points to it.
> 
> Ideas?
> 
> [   12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> [   12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> [   12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> [   12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.

And as suspected, booting in it again, it doesn't trigger anymore. But there's
something new in dmesg which looks weird and makes me want to Cc Paul:

[    6.965526] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
[    6.971581] Key type fscrypt-provisioning registered
[    6.975191] PM: Image not found (code -6)
[    6.975631]  } 8 jiffies s: 89 root: 0x0/.


and

[   12.549532] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 8 jiffies s: 113 root: 0x1/.
[   12.550863] rcu: blocking rcu_node structures (internal RCU debug):

[   12.817601] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
[   12.819773] (sd-mkdcre[520]: Credential search path is: /etc/credstore:/run/credstore:/usr/local/lib/credstore:/usr/lib/credstore
[   12.827074]  } 8 jiffies s: 129 root: 0x0/.

[   12.881508] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
[   12.892854] (sd-mkdcr[522]: Credential search path is: /etc/credstore.encrypted:/run/credstore.encrypted:/usr/local/lib/credstore.encrypted:/usr/lib/credstore.encrypted
[   12.905244]  } 8 jiffies s: 133 root: 0x0/.

Paul, this looks weird.

Why is that issuing empty lists between the { }?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-26 14:32 amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/ Borislav Petkov
  2025-11-26 15:26 ` Borislav Petkov
@ 2025-11-26 16:00 ` Borislav Petkov
  2025-11-26 19:50   ` Jason Gunthorpe
  2025-11-27  5:15   ` Vasant Hegde
  1 sibling, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2025-11-26 16:00 UTC (permalink / raw)
  To: iommu, Vasant Hegde
  Cc: Joerg Roedel, Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	linux-kernel

+ Vasant.

On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> Hi,
> 
> this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> because the backtrace points to it.
> 
> Ideas?
> 
> [   12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> [   12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> [   12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> [   12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.
> [   12.983644] Sending NMI from CPU 1 to CPUs 0:
> [   12.983652] NMI backtrace for cpu 0
> [   12.983655] CPU: 0 UID: 0 PID: 504 Comm: (modprobe) Not tainted 6.18.0-rc7+ #1 PREEMPT(voluntary) 
> [   12.983658] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
> [   12.983660] RIP: 0010:delay_halt_mwaitx+0x37/0x40
> [   12.983665] Code: 01 31 d2 89 d1 48 05 00 d0 3a 83 0f 01 fa b8 ff ff ff ff b9 02 00 00 00 48 39 c6 48 0f 46 c6 48 89 c3 b8 f0 00 00 00 0f 01 fb <5b> e9 ae 2b 1d ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90
> [   12.983666] RSP: 0018:ffffc90000003d88 EFLAGS: 00000097
> [   12.983668] RAX: 00000000000000f0 RBX: 0000000000000b9b RCX: 0000000000000002
> [   12.983670] RDX: 0000000000000000 RSI: 0000000000000b9b RDI: 00000034e14b49dc
> [   12.983671] RBP: 00000034e14b49dc R08: 000000000000006c R09: 0000000000000002
> [   12.983672] R10: 0000000000000050 R11: 0000000000000002 R12: 00000000000001a4
> [   12.983673] R13: 0000000000000002 R14: ffff888100064018 R15: ffff888100064000
> [   12.983674] FS:  00007f8ae7452e00(0000) GS:ffff889086e58000(0000) knlGS:0000000000000000
> [   12.983675] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   12.983676] CR2: 00007f8ae6c9a8a0 CR3: 0000000141913004 CR4: 0000000000770ef0
> [   12.983677] PKRU: 55555554
> [   12.983678] Call Trace:
> [   12.983680]  <IRQ>
> [   12.983682]  delay_halt+0x3b/0x60
> [   12.983685]  iommu_completion_wait.part.0.isra.0+0xd3/0x100
> [   12.983693]  domain_flush_complete+0x64/0xc0
> [   12.983696]  amd_iommu_flush_iotlb_all+0x33/0x50
> [   12.983700]  fq_flush_timeout+0x34/0xd0
> [   12.983703]  ? __pfx_fq_flush_timeout+0x10/0x10
> [   12.983706]  call_timer_fn+0xa8/0x220
> [   12.983713]  ? __pfx_fq_flush_timeout+0x10/0x10
> [   12.983715]  __run_timers+0x1f6/0x2b0
> [   12.983724]  run_timer_base+0x46/0x60
> [   12.983727]  run_timer_softirq+0x16/0x30
> [   12.983729]  handle_softirqs+0xe5/0x430
> [   12.983736]  __irq_exit_rcu+0xc3/0x120
> [   12.983738]  irq_exit_rcu+0xa/0x30
> [   12.983740]  sysvec_apic_timer_interrupt+0x89/0xb0
> [   12.983744]  </IRQ>
> [   12.983744]  <TASK>
> [   12.983746]  asm_sysvec_apic_timer_interrupt+0x16/0x20
> [   12.983749] RIP: 0010:console_flush_all+0x49e/0x620
> [   12.983751] Code: 90 60 01 02 0f 85 91 01 00 00 48 83 c4 40 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f e9 78 a1 d5 ff e8 c8 ce 0e 00 fb 0f 1f 44 00 00 <41> 80 3c 24 00 48 8b 43 58 0f 84 07 ff ff ff eb ba e8 5c cf 0e 00
> [   12.983752] RSP: 0018:ffffc90003d87bf8 EFLAGS: 00000202
> [   12.983754] RAX: 000000000000162f RBX: ffffffff82a59f80 RCX: 0000000000000000
> [   12.983755] RDX: 0000000000000000 RSI: ffffffff8255d9fc RDI: ffffffff82532a7e
> [   12.983755] RBP: 0000000000000200 R08: 0000000000000001 R09: 0000000000000000
> [   12.983756] R10: 0000000000000001 R11: 0000000000000003 R12: ffffc90003d87c73
> [   12.983757] R13: ffffc90003d87c78 R14: 0000000000000000 R15: ffffffff813a9e3d
> [   12.983758]  ? console_flush_all+0x33d/0x620
> [   12.983768]  ? console_flush_all+0x498/0x620
> [   12.983770]  ? console_flush_all+0x3d/0x620
> [   12.983776]  console_unlock+0x155/0x230
> [   12.983781]  vprintk_emit+0x2e5/0x500
> [   12.983787]  devkmsg_emit.constprop.0+0x56/0x70
> [   12.983793]  devkmsg_write.cold+0x1c/0x4a
> [   12.983797]  do_iter_readv_writev+0x139/0x230
> [   12.983802]  vfs_writev+0xfa/0x500
> [   12.983817]  ? do_writev+0x7e/0x120
> [   12.983819]  do_writev+0x7e/0x120
> [   12.983824]  do_syscall_64+0x62/0x350
> [   12.983828]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   12.983830] RIP: 0033:0x7f8ae6a99687
> [   12.983832] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
> [   12.983833] RSP: 002b:00007ffc877c5cf0 EFLAGS: 00000202 ORIG_RAX: 0000000000000014
> [   12.983835] RAX: ffffffffffffffda RBX: 00007f8ae7452e00 RCX: 00007f8ae6a99687
> [   12.983836] RDX: 0000000000000007 RSI: 00007ffc877c5dc0 RDI: 0000000000000003
> [   12.983836] RBP: 00007f8ae6f133c6 R08: 0000000000000000 R09: 0000000000000000
> [   12.983837] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
> [   12.983838] R13: 00007ffc877c5f88 R14: 000000000000001f R15: 0000000000000000
> [   12.983847]  </TASK>
> [   12.984730] (journald)[506]: ID-mapping is supported for exec directory /run/systemd/journal
> [   13.400185] (journald)[506]: Successfully forked off '(sd-mkdcreds)' as PID 510.
> [   13.400955] (sd-mkdcre[510]: Changing mount propagation /dev (MS_REC|MS_SLAVE "")
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-26 16:00 ` Borislav Petkov
@ 2025-11-26 19:50   ` Jason Gunthorpe
  2025-11-27  5:15   ` Vasant Hegde
  1 sibling, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2025-11-26 19:50 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: iommu, Vasant Hegde, Joerg Roedel, Suravee Suthikulpanit,
	Will Deacon, Robin Murphy, linux-kernel

On Wed, Nov 26, 2025 at 05:00:31PM +0100, Borislav Petkov wrote:
> + Vasant.
> 
> On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> > Hi,
> > 
> > this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> > because the backtrace points to it.
> > 
> > Ideas?
> > 
> > [   12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> > [   12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> > [   12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> > [   12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.
> > [   12.983644] Sending NMI from CPU 1 to CPUs 0:
> > [   12.983652] NMI backtrace for cpu 0
> > [   12.983655] CPU: 0 UID: 0 PID: 504 Comm: (modprobe) Not tainted 6.18.0-rc7+ #1 PREEMPT(voluntary) 
> > [   12.983658] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.5 09/08/2022
> > [   12.983660] RIP: 0010:delay_halt_mwaitx+0x37/0x40
> > [   12.983665] Code: 01 31 d2 89 d1 48 05 00 d0 3a 83 0f 01 fa b8 ff ff ff ff b9 02 00 00 00 48 39 c6 48 0f 46 c6 48 89 c3 b8 f0 00 00 00 0f 01 fb <5b> e9 ae 2b 1d ff 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90
> > [   12.983666] RSP: 0018:ffffc90000003d88 EFLAGS: 00000097
> > [   12.983668] RAX: 00000000000000f0 RBX: 0000000000000b9b RCX: 0000000000000002
> > [   12.983670] RDX: 0000000000000000 RSI: 0000000000000b9b RDI: 00000034e14b49dc
> > [   12.983671] RBP: 00000034e14b49dc R08: 000000000000006c R09: 0000000000000002
> > [   12.983672] R10: 0000000000000050 R11: 0000000000000002 R12: 00000000000001a4
> > [   12.983673] R13: 0000000000000002 R14: ffff888100064018 R15: ffff888100064000
> > [   12.983674] FS:  00007f8ae7452e00(0000) GS:ffff889086e58000(0000) knlGS:0000000000000000
> > [   12.983675] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   12.983676] CR2: 00007f8ae6c9a8a0 CR3: 0000000141913004 CR4: 0000000000770ef0
> > [   12.983677] PKRU: 55555554
> > [   12.983678] Call Trace:
> > [   12.983680]  <IRQ>
> > [   12.983682]  delay_halt+0x3b/0x60
> > [   12.983685]  iommu_completion_wait.part.0.isra.0+0xd3/0x100
> > [   12.983693]  domain_flush_complete+0x64/0xc0
> > [   12.983696]  amd_iommu_flush_iotlb_all+0x33/0x50
> > [   12.983700]  fq_flush_timeout+0x34/0xd0

This code in the AMD driver is spinning until the HW completes some
work

Maybe the HW has alot of work to do, or is slow, and this is
legitimately taking a long time? I don't expect
amd_iommu_flush_iotlb_all() to have a problem, that's pretty weird.

Does it eventually boot?

Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-26 16:00 ` Borislav Petkov
  2025-11-26 19:50   ` Jason Gunthorpe
@ 2025-11-27  5:15   ` Vasant Hegde
  1 sibling, 0 replies; 11+ messages in thread
From: Vasant Hegde @ 2025-11-27  5:15 UTC (permalink / raw)
  To: Borislav Petkov, iommu
  Cc: Joerg Roedel, Suravee Suthikulpanit, Will Deacon, Robin Murphy,
	linux-kernel, Jason Gunthorpe

Hi Boris,

On 11/26/2025 9:30 PM, Borislav Petkov wrote:
> + Vasant.
> 
> On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
>> Hi,
>>
>> this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
>> because the backtrace points to it.

I have tried tip/master on my Zen3 and I didn't hit this issue. Can you pass
your config file?

We haven't touched that code path recently and generally it should work fine. At
least we shouldn't hit issues under normal condition.

Having said that recently we observed that under really stressed environment we
hit softlock in the culprit code path. We are working on moving wait_on_sem()
out of spinlock.


-Vasant



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-26 15:26 ` Borislav Petkov
@ 2025-11-28 20:28   ` Paul E. McKenney
  2025-12-03 12:44     ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2025-11-28 20:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

Sorry to be slow, USA Turkey Day and all that...

On Wed, Nov 26, 2025 at 04:26:37PM +0100, Borislav Petkov wrote:
> On Wed, Nov 26, 2025 at 03:32:19PM +0100, Borislav Petkov wrote:
> > Hi,
> > 
> > this is latest Linus + latest tip/master. Box is Zen3. CCing AMD IOMMU folks
> > because the backtrace points to it.
> > 
> > Ideas?
> > 
> > [   12.946913] (journald)[506]: Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
> > [   12.948083] (journald)[506]: Successfully forked off '(sd-mkuserns)' as PID 507.
> > [   12.977579] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
> > [   12.983638] rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x1/.

This one of course is a stall on CPU 0.  But you knew that already.

Also, it looks like you have CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20 or maybe
booted with rcupdate.rcu_exp_cpu_stall_timeout=20 on a system with HZ=250?
Or set rcu_exp_cpu_stall_timeout=20 via sysfs?

> And as suspected, booting in it again, it doesn't trigger anymore. But there's
> something new in dmesg which looks weird and makes me want to Cc Paul:
> 
> [    6.965526] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {

This is the beginning of the message.

> [    6.971581] Key type fscrypt-provisioning registered
> [    6.975191] PM: Image not found (code -6)
> [    6.975631]  } 8 jiffies s: 89 root: 0x0/.

And this is the end.  This looks like the stall ended just as the
stall-warning message started printing.

> and
> 
> [   12.549532] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 2-.... } 8 jiffies s: 113 root: 0x1/.
> [   12.550863] rcu: blocking rcu_node structures (internal RCU debug):

This is a stall on CPU 2.

> 
> [   12.817601] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
> [   12.819773] (sd-mkdcre[520]: Credential search path is: /etc/credstore:/run/credstore:/usr/local/lib/credstore:/usr/lib/credstore
> [   12.827074]  } 8 jiffies s: 129 root: 0x0/.
> 
> [   12.881508] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
> [   12.892854] (sd-mkdcr[522]: Credential search path is: /etc/credstore.encrypted:/run/credstore.encrypted:/usr/local/lib/credstore.encrypted:/usr/lib/credstore.encrypted
> [   12.905244]  } 8 jiffies s: 133 root: 0x0/.
> 
> Paul, this looks weird.
> 
> Why is that issuing empty lists between the { }?

Again, my guess is that the stall is ending just as the print starts.

It also looks like you have the expedited stall warning set to 20
milliseconds, which as far as I know is used only on constrained systems
such as smartphones.  If you set this value on a typical large server,
you will get very large numbers of expedited RCU CPU stall warnings.

Oh, and if you are running with HZ=1000 and the expedited RCU CPU stall
warning set to 20 milliseconds (let alone 8!), then as far as I know,
you are a pioneer breaking new ground.  ;-)

							Thanx, Paul

> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-11-28 20:28   ` Paul E. McKenney
@ 2025-12-03 12:44     ` Borislav Petkov
  2025-12-03 17:16       ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2025-12-03 12:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

On Fri, Nov 28, 2025 at 12:28:34PM -0800, Paul E. McKenney wrote:
> Sorry to be slow, USA Turkey Day and all that...

Nothing to be sorry for - email is asynchronous communication. :-P

> This one of course is a stall on CPU 0.  But you knew that already.
> 
> Also, it looks like you have CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20 or maybe
> booted with rcupdate.rcu_exp_cpu_stall_timeout=20 on a system with HZ=250?
> Or set rcu_exp_cpu_stall_timeout=20 via sysfs?

Not really - this is me simply doing "make olddefconfig" on a .config and then
using it on the test box. I'm simply doing defaults and I can imagine they
have changed over the years.

[boris@zn: ~/kernel/configs/brent> grep CONFIG_RCU_EXP_CPU_STALL_TIMEOUT config-6.18.0-rc7+ 
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20

Yap, 20 it is.

> This is the beginning of the message.
> 
> > [    6.971581] Key type fscrypt-provisioning registered
> > [    6.975191] PM: Image not found (code -6)
> > [    6.975631]  } 8 jiffies s: 89 root: 0x0/.
> 
> And this is the end.  This looks like the stall ended just as the
> stall-warning message started printing.

I suspected that... Judging by your explanation, I don't think we can stop
printing empty stall messages - sounds like they're multiline and that comes
from different places in the code.

To avoid confusion, I mean...

> It also looks like you have the expedited stall warning set to 20
> milliseconds, which as far as I know is used only on constrained systems
> such as smartphones. 

That "smartphone" can't possibly fit in my pocket! :-P :-P

> If you set this value on a typical large server, you will get very large
> numbers of expedited RCU CPU stall warnings.

Should I reset it to its default 0?

And for that other value I have there:

config RCU_CPU_STALL_TIMEOUT
        int "RCU CPU stall timeout in seconds"
        depends on RCU_STALL_COMMON
        range 3 300
        default 21 

which is weird. I guess I need to reset all those to something sensible for
server...

> Oh, and if you are running with HZ=1000 and the expedited RCU CPU stall
> warning set to 20 milliseconds (let alone 8!), then as far as I know,
> you are a pioneer breaking new ground.  ;-)

I do things like that from tim to time...

But nah, it is 250:

# CONFIG_HZ_PERIODIC is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-12-03 12:44     ` Borislav Petkov
@ 2025-12-03 17:16       ` Paul E. McKenney
  2025-12-04 14:45         ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2025-12-03 17:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

On Wed, Dec 03, 2025 at 01:44:22PM +0100, Borislav Petkov wrote:
> On Fri, Nov 28, 2025 at 12:28:34PM -0800, Paul E. McKenney wrote:
> > Sorry to be slow, USA Turkey Day and all that...
> 
> Nothing to be sorry for - email is asynchronous communication. :-P
> 
> > This one of course is a stall on CPU 0.  But you knew that already.
> > 
> > Also, it looks like you have CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20 or maybe
> > booted with rcupdate.rcu_exp_cpu_stall_timeout=20 on a system with HZ=250?
> > Or set rcu_exp_cpu_stall_timeout=20 via sysfs?
> 
> Not really - this is me simply doing "make olddefconfig" on a .config and then
> using it on the test box. I'm simply doing defaults and I can imagine they
> have changed over the years.
> 
> [boris@zn: ~/kernel/configs/brent> grep CONFIG_RCU_EXP_CPU_STALL_TIMEOUT config-6.18.0-rc7+ 
> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20
> 
> Yap, 20 it is.
> 
> > This is the beginning of the message.
> > 
> > > [    6.971581] Key type fscrypt-provisioning registered
> > > [    6.975191] PM: Image not found (code -6)
> > > [    6.975631]  } 8 jiffies s: 89 root: 0x0/.
> > 
> > And this is the end.  This looks like the stall ended just as the
> > stall-warning message started printing.
> 
> I suspected that... Judging by your explanation, I don't think we can stop
> printing empty stall messages - sounds like they're multiline and that comes
> from different places in the code.
> 
> To avoid confusion, I mean...

Actually, there are some things that can be done.  If you really meant
to test with 20 milliseconds instead of the normal server-class 21
seconds, please let me know and I can see about adjusting.  No promises
on schedule, though, LPC being next week and all.

> > It also looks like you have the expedited stall warning set to 20
> > milliseconds, which as far as I know is used only on constrained systems
> > such as smartphones. 
> 
> That "smartphone" can't possibly fit in my pocket! :-P :-P

Then do you *really* want to be setting the expedited RCU CPU stall
warning to 20 milliseconds?  That value was set up specifically for
constrained systems that don't scan huge memories or have huge piles of
running tasks, not for datacenter servers.

> > If you set this value on a typical large server, you will get very large
> > numbers of expedited RCU CPU stall warnings.
> 
> Should I reset it to its default 0?

Or to some value that works for you.  But if you are not looking to be
an expedited RCU CPU stall-warning pioneer, yes, setting it to zero is
a good approach.

If you would like to be a more sane pioneer, setting it to (say) 11000
(or 11 seconds) could be appropriate.  But what fun is sanity?  ;-)

> And for that other value I have there:
> 
> config RCU_CPU_STALL_TIMEOUT
>         int "RCU CPU stall timeout in seconds"
>         depends on RCU_STALL_COMMON
>         range 3 300
>         default 21 
> 
> which is weird. I guess I need to reset all those to something sensible for
> server...

No, not weird, see "seconds", not "milliseconds".

> > Oh, and if you are running with HZ=1000 and the expedited RCU CPU stall
> > warning set to 20 milliseconds (let alone 8!), then as far as I know,
> > you are a pioneer breaking new ground.  ;-)
> 
> I do things like that from tim to time...
> 
> But nah, it is 250:
> 
> # CONFIG_HZ_PERIODIC is not set
> # CONFIG_HZ_100 is not set
> CONFIG_HZ_250=y
> # CONFIG_HZ_300 is not set
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=250

OK, then all thoae "8 jiffies" in your console log make sense.  ;-)

						Thanx, Paul

> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-12-03 17:16       ` Paul E. McKenney
@ 2025-12-04 14:45         ` Borislav Petkov
  2025-12-04 20:42           ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2025-12-04 14:45 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

On Wed, Dec 03, 2025 at 09:16:37AM -0800, Paul E. McKenney wrote:
> Or to some value that works for you.  But if you are not looking to be
> an expedited RCU CPU stall-warning pioneer, yes, setting it to zero is
> a good approach.
> 
> If you would like to be a more sane pioneer, setting it to (say) 11000
> (or 11 seconds) could be appropriate.  But what fun is sanity?  ;-)

Oh, I have a lot of excitement even without RCU experiments. :-P

But if you need me to try things, lemme know.

For now I've simply reset the values as to what defconfig sets them to:

CONFIG_RCU_CPU_STALL_TIMEOUT=21
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-12-04 14:45         ` Borislav Petkov
@ 2025-12-04 20:42           ` Paul E. McKenney
  2025-12-08  4:57             ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2025-12-04 20:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

On Thu, Dec 04, 2025 at 03:45:05PM +0100, Borislav Petkov wrote:
> On Wed, Dec 03, 2025 at 09:16:37AM -0800, Paul E. McKenney wrote:
> > Or to some value that works for you.  But if you are not looking to be
> > an expedited RCU CPU stall-warning pioneer, yes, setting it to zero is
> > a good approach.
> > 
> > If you would like to be a more sane pioneer, setting it to (say) 11000
> > (or 11 seconds) could be appropriate.  But what fun is sanity?  ;-)
> 
> Oh, I have a lot of excitement even without RCU experiments. :-P

;-) ;-) ;-)

> But if you need me to try things, lemme know.
> 
> For now I've simply reset the values as to what defconfig sets them to:
> 
> CONFIG_RCU_CPU_STALL_TIMEOUT=21
> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0

Sounds good, and thank you!

							Thanx, Paul

> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.
  2025-12-04 20:42           ` Paul E. McKenney
@ 2025-12-08  4:57             ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2025-12-08  4:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: iommu, Joerg Roedel, Suravee Suthikulpanit, Will Deacon,
	Robin Murphy, linux-kernel

On Thu, Dec 04, 2025 at 12:42:52PM -0800, Paul E. McKenney wrote:
> On Thu, Dec 04, 2025 at 03:45:05PM +0100, Borislav Petkov wrote:
> > On Wed, Dec 03, 2025 at 09:16:37AM -0800, Paul E. McKenney wrote:
> > > Or to some value that works for you.  But if you are not looking to be
> > > an expedited RCU CPU stall-warning pioneer, yes, setting it to zero is
> > > a good approach.
> > > 
> > > If you would like to be a more sane pioneer, setting it to (say) 11000
> > > (or 11 seconds) could be appropriate.  But what fun is sanity?  ;-)
> > 
> > Oh, I have a lot of excitement even without RCU experiments. :-P
> 
> ;-) ;-) ;-)
> 
> > But if you need me to try things, lemme know.
> > 
> > For now I've simply reset the values as to what defconfig sets them to:
> > 
> > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
> 
> Sounds good, and thank you!

But this turned out to be too simple to fail to address (famous last
words!).  Please see below.

							Thanx, Paul

------------------------------------------------------------------------

commit 9776d62e236bef99859e9067f540aa6f6683b432
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Sun Dec 7 20:49:35 2025 -0800

    rcu: Make expedited RCU CPU stall warnings detect stall-end races
    
    If an expedited RCU CPU stall ends just at the stall-warning timeout,
    the current code will print an expedited stall-warning message, but one
    that doesn't identify any CPUs or tasks causing the stall.  This is most
    likely to happen for short-timeout stalls, for example, the 20-millisecond
    timeouts that are sometimes used for small embedded devices.  Needless to
    say, these semi-empty stall-warning messages can be rather confusing.
    
    One option would be to suppress the stall-warning message entirely in
    this case, but the near-miss information can be quite valuable.
    
    This commit therefore detects this race condition and emits a "INFO:
    Expedited stall ended before state dump start" message to clarify matters.
    
    Reported-by: Borislav Petkov <bp@alien8.de>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 6058a734090c1..fd02cd12b7980 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
 	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
 		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
 		".T"[!!data_race(rnp_root->exp_tasks)]);
-	if (ndetected) {
+	if (!ndetected) {
+		// This is invoked from the grace-period worker, so
+		// a new grace period cannot have started.  And if this
+		// worker were stalled, we would not get here.  ;-)
+		pr_err("INFO: Expedited stall ended before state dump start\n");
+	} else {
 		pr_err("blocking rcu_node structures (internal RCU debug):");
 		rcu_for_each_node_breadth_first(rnp) {
 			if (rnp == rnp_root)

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-12-08  4:57 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-26 14:32 amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/ Borislav Petkov
2025-11-26 15:26 ` Borislav Petkov
2025-11-28 20:28   ` Paul E. McKenney
2025-12-03 12:44     ` Borislav Petkov
2025-12-03 17:16       ` Paul E. McKenney
2025-12-04 14:45         ` Borislav Petkov
2025-12-04 20:42           ` Paul E. McKenney
2025-12-08  4:57             ` Paul E. McKenney
2025-11-26 16:00 ` Borislav Petkov
2025-11-26 19:50   ` Jason Gunthorpe
2025-11-27  5:15   ` Vasant Hegde

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox