public inbox for linux-next@vger.kernel.org
From: Bert Karwatzki <spasswolf@web.de>
To: Thomas Gleixner <tglx@kernel.org>, linux-kernel@vger.kernel.org
Cc: linux-next@vger.kernel.org, spasswolf@web.de,
	"Mario Limonciello" <mario.limonciello@amd.com>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Clark Williams" <clrkwllms@kernel.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Christian König" <christian.koenig@amd.com>,
	regressions@lists.linux.dev, linux-pci@vger.kernel.org,
	linux-acpi@vger.kernel.org,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	acpica-devel@lists.linux.dev,
	"Robert Moore" <robert.moore@intel.com>,
	"Saket Dumbre" <saket.dumbre@intel.com>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Clemens Ladisch" <clemens@ladisch.de>,
	"Jinchao Wang" <wangjinchao600@gmail.com>,
	"Yury Norov" <yury.norov@gmail.com>,
	"Anna Schumaker" <anna.schumaker@oracle.com>,
	"Baoquan He" <bhe@redhat.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"Dave Young" <dyoung@redhat.com>,
	"Doug Anderson" <dianders@chromium.org>,
	"Guilherme G. Piccoli" <gpiccoli@igalia.com>,
	"Helge Deller" <deller@gmx.de>, "Ingo Molnar" <mingo@kernel.org>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>,
	"Joel Granados" <joel.granados@kernel.org>,
	"John Ogness" <john.ogness@linutronix.de>,
	"Kees Cook" <kees@kernel.org>, "Li Huafei" <lihuafei1@huawei.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Luo Gengkun" <luogengkun@huaweicloud.com>,
	"Max Kellermann" <max.kellermann@ionos.com>,
	"Nam Cao" <namcao@linutronix.de>,
	oushixiong <oushixiong@kylinos.cn>,
	"Petr Mladek" <pmladek@suse.com>,
	"Qianqiang Liu" <qianqiang.liu@163.com>,
	"Sergey Senozhatsky" <senozhatsky@chromium.org>,
	"Sohil Mehta" <sohil.mehta@intel.com>,
	"Tejun Heo" <tj@kernel.org>,
	"Thomas Zimmermann" <tzimmermann@suse.de>,
	"Thorsten Blum" <thorsten.blum@linux.dev>,
	"Ville Syrjala" <ville.syrjala@linux.intel.com>,
	"Vivek Goyal" <vgoyal@redhat.com>,
	"Yunhui Cui" <cuiyunhui@bytedance.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	W_Armin@gmx.de
Subject: Re: NMI stack overflow during resume of PCIe bridge with CONFIG_HARDLOCKUP_DETECTOR=y
Date: Tue, 13 Jan 2026 18:50:24 +0100	[thread overview]
Message-ID: <bc20529d7520e7db7de2022bf9c96a1bc3a2f0df.camel@web.de> (raw)
In-Reply-To: <87h5spk01t.ffs@tglx>

On Tuesday, 2026-01-13 at 16:24 +0100, Thomas Gleixner wrote:
> On Tue, Jan 13 2026 at 10:41, Bert Karwatzki wrote:
> > Here's the result in case of the crash:
> > 2026-01-12T04:24:36.809904+01:00 T1510;acpi_ex_system_memory_space_handler 255: logical_addr_ptr = ffffc066977b3000
> > 2026-01-12T04:24:36.846170+01:00 C14;exc_nmi: 0
> 
> Here the NMI triggers in non-task context on CPU14
> 
> > 2026-01-12T04:24:36.960760+01:00 C14;exc_nmi: 10.3
> > 2026-01-12T04:24:36.960760+01:00 C14;default_do_nmi 
> > 2026-01-12T04:24:36.960760+01:00 C14;nmi_handle: type=0x0
> > 2026-01-12T04:24:36.960760+01:00 C14;nmi_handle: a=0xffffffffa1612de0
> > 2026-01-12T04:24:36.960760+01:00 C14;nmi_handle: a->handler=perf_event_nmi_handler+0x0/0xa6
> > 2026-01-12T04:24:36.960760+01:00 C14;perf_event_nmi_handler: 0
> > 2026-01-12T04:24:36.960760+01:00 C14;perf_event_nmi_handler: 1
> > 2026-01-12T04:24:36.960760+01:00 C14;perf_event_nmi_handler: 2
> > 2026-01-12T04:24:36.960760+01:00 C14;x86_pmu_handle_irq: 2
> > 2026-01-12T04:24:36.960760+01:00 C14;x86_pmu_handle_irq: 2.6
> > 2026-01-12T04:24:36.960760+01:00 C14;__perf_event_overflow: 0
> > 2026-01-12T04:24:36.960760+01:00 C14;__perf_event_overflow: 6.99: overflow_handler=watchdog_overflow_callback+0x0/0x10d
> > 2026-01-12T04:24:36.960760+01:00 C14;watchdog_overflow_callback: 0
> > 2026-01-12T04:24:36.960760+01:00 C14;__ktime_get_fast_ns_debug: 0.1
> > 2026-01-12T04:24:36.960760+01:00 C14;tk_clock_read_debug: read=read_hpet+0x0/0xf0
> > 2026-01-12T04:24:36.960760+01:00 C14;read_hpet: 0
> > 2026-01-12T04:24:36.960760+01:00 C14;read_hpet: 0.1
> 
> > 2026-01-12T04:24:36.960760+01:00 T0;exc_nmi: 0
> 
> This one triggers in task context of PID0, aka idle task, but it's not
> clear on which CPU that happens. It's probably CPU13 as that continues
> with the expected 10.3 output, but that's ~1.71 seconds later.
> 
The long delays seem to be typical of the first NMI after an access to the
broken memory at phys_addr 0xf0100000. Here's an example from an earlier
run with more printk()s in that part of the code (too many printk()s seem
to cause additional system freezes ...)


2026-01-03T14:10:10.312182+01:00 T1511;acpi_ex_system_memory_space_handler 255: logical_addr_ptr = ffffbaa49c15d000
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 0
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 1
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 2
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 3
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 4
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 5
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 6
2026-01-03T14:10:10.616281+01:00 T0;exc_nmi: 7
2026-01-03T14:10:10.616281+01:00 T0;irqentry_nmi_enter: 0
2026-01-03T14:10:10.616281+01:00 T0;irqentry_nmi_enter: 1
2026-01-03T14:10:11.055800+01:00 C8;irqentry_nmi_enter: 2
2026-01-03T14:10:11.055800+01:00 C8;irqentry_nmi_enter: 3
2026-01-03T14:10:11.055800+01:00 C8;irqentry_nmi_enter: 4
2026-01-03T14:10:11.055800+01:00 C8;irqentry_nmi_enter: 5
2026-01-03T14:10:11.055800+01:00 C8;irqentry_nmi_enter: irq_state=0x0
2026-01-03T14:10:11.055800+01:00 C8;exc_nmi: 8
2026-01-03T14:10:11.055800+01:00 C8;exc_nmi: 9
2026-01-03T14:10:11.055800+01:00 C8;exc_nmi: 10.3

The printk()s were placed in irqentry_nmi_enter() as follows:
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index e33691d5adf7..42cba2ea7aa1 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -370,12 +370,18 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
 {
        irqentry_state_t irq_state;
 
+       printk(KERN_INFO "%s: 0\n", __func__);
        irq_state.lockdep = lockdep_hardirqs_enabled();
+       printk(KERN_INFO "%s: 1\n", __func__);
 
        __nmi_enter();
+       printk(KERN_INFO "%s: 2\n", __func__);
        lockdep_hardirqs_off(CALLER_ADDR0);
+       printk(KERN_INFO "%s: 3\n", __func__);
        lockdep_hardirq_enter();
+       printk(KERN_INFO "%s: 4\n", __func__);
        ct_nmi_enter();
+       printk(KERN_INFO "%s: 5\n", __func__);
 
        instrumentation_begin();
        kmsan_unpoison_entry_regs(regs);
@@ -383,6 +389,7 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
        ftrace_nmi_enter();
        instrumentation_end();
 
+       printk(KERN_INFO "%s: irq_state=0x%x\n", __func__, irq_state.lockdep);
        return irq_state;
 }
 

>  What's more likely is that after a while
> _ALL_ CPUs are hung up in the NMI handler after they tripped over the
> HPET read.

I'm not sure about that; my latest test run (with v6.18) crashed with only one
message from exc_nmi().

> 
> > The behaviour described here seems to be similar to the bug that commit
> > 3d5f4f15b778 ("watchdog: skip checks when panic is in progress") is fixing, but
> > this is actually a different bug, as kernel 6.18 (which contains 3d5f4f15b778)
> > is also affected (I've conducted 5 tests with 6.18 so far and got 4 crashes,
> > occurring after 0.5 h, 1 h, 4.5 h, and 1.5 h of testing).
> > Nevertheless these look similar enough to CC the involved people.
> 
> There is nothing similar.
> 
> Your problem originates from a screwed up hardware state which in turn
> causes the HPET to go haywire for unknown reasons.
> 
> What is the physical address of this ACPI handler access:
> 
>        logical_addr_ptr = ffffc066977b3000
> 
> along with the full output of /proc/iomem

The physical address is 0xf0100000

$ cat /proc/iomem
00000000-00000fff : Reserved
00001000-0009ffff : System RAM
000a0000-000fffff : Reserved
  000a0000-000dffff : PCI Bus 0000:00
  000f0000-000fffff : System ROM
00100000-09bfefff : System RAM
09bff000-0a000fff : Reserved
0a001000-0a1fffff : System RAM
0a200000-0a20efff : ACPI Non-volatile Storage
0a20f000-e6057fff : System RAM
  15000000-15b252c1 : Kernel code
  15c00000-15f60fff : Kernel rodata
  16000000-1610e27f : Kernel data
  165ce000-167fffff : Kernel bss
  9c000000-dbffffff : Crash kernel
e6058000-e614bfff : Reserved
e614c000-e868afff : System RAM
e868b000-e868bfff : Reserved
e868c000-e9cdefff : System RAM
e9cdf000-eb1fdfff : Reserved
  eb1dd000-eb1e0fff : MSFT0101:00
  eb1e1000-eb1e4fff : MSFT0101:00
eb1fe000-eb25dfff : ACPI Tables
eb25e000-eb555fff : ACPI Non-volatile Storage
eb556000-ed1fefff : Reserved
ed1ff000-edffffff : System RAM
ee000000-efffffff : Reserved
f0000000-fcffffff : PCI Bus 0000:00
  f0000000-f7ffffff : PCI ECAM 0000 [bus 00-7f]
    f0000000-f7ffffff : pnp 00:00
  fc500000-fc9fffff : PCI Bus 0000:08
    fc500000-fc5fffff : 0000:08:00.7
      fc500000-fc5fffff : pcie_mp2_amd
    fc600000-fc6fffff : 0000:08:00.4
      fc600000-fc6fffff : xhci-hcd
    fc700000-fc7fffff : 0000:08:00.3
      fc700000-fc7fffff : xhci-hcd
    fc800000-fc8fffff : 0000:08:00.2
      fc800000-fc8fffff : ccp
    fc900000-fc97ffff : 0000:08:00.0
    fc980000-fc9bffff : 0000:08:00.5
      fc980000-fc9bffff : AMD ACP3x audio
        fc980000-fc990200 : acp_pdm_iomem
    fc9c0000-fc9c7fff : 0000:08:00.6
      fc9c0000-fc9c7fff : ICH HD audio
    fc9c8000-fc9cbfff : 0000:08:00.1
      fc9c8000-fc9cbfff : ICH HD audio
    fc9cc000-fc9cdfff : 0000:08:00.7
    fc9ce000-fc9cffff : 0000:08:00.2
      fc9ce000-fc9cffff : ccp
  fca00000-fccfffff : PCI Bus 0000:01
    fca00000-fcbfffff : PCI Bus 0000:02
      fca00000-fcbfffff : PCI Bus 0000:03
        fca00000-fcafffff : 0000:03:00.0
        fcb00000-fcb1ffff : 0000:03:00.0
        fcb20000-fcb23fff : 0000:03:00.1
          fcb20000-fcb23fff : ICH HD audio
    fcc00000-fcc03fff : 0000:01:00.0
  fcd00000-fcdfffff : PCI Bus 0000:07
    fcd00000-fcd03fff : 0000:07:00.0
      fcd00000-fcd03fff : nvme
  fce00000-fcefffff : PCI Bus 0000:06
    fce00000-fce03fff : 0000:06:00.0
      fce00000-fce03fff : nvme
  fcf00000-fcffffff : PCI Bus 0000:05
    fcf00000-fcf03fff : 0000:05:00.0
    fcf04000-fcf04fff : 0000:05:00.0
      fcf04000-fcf04fff : r8169
fd300000-fd37ffff : amd_iommu
fec00000-fec003ff : IOAPIC 0
fec01000-fec013ff : IOAPIC 1
fec10000-fec10fff : Reserved
  fec10000-fec10fff : pnp 00:04
fed00000-fed00fff : Reserved
  fed00000-fed003ff : HPET 0
    fed00000-fed003ff : PNP0103:00
fed40000-fed44fff : Reserved
fed80000-fed8ffff : Reserved
  fed81200-fed812ff : AMDI0030:00
  fed81500-fed818ff : AMDI0030:00
    fed81500-fed818ff : AMDI0030:00 AMDI0030:00
fedc0000-fedc0fff : pnp 00:04
fedc4000-fedc9fff : Reserved
  fedc5000-fedc5fff : AMDI0010:03
    fedc5000-fedc5fff : AMDI0010:03 AMDI0010:03
fedcc000-fedcefff : Reserved
fedd5000-fedd5fff : Reserved
fee00000-fee00fff : pnp 00:04
ff000000-ffffffff : pnp 00:04
100000000-3ee2fffff : System RAM
3ee300000-40fffffff : Reserved
410000000-ffffffffff : PCI Bus 0000:00
  fc00000000-fe0fffffff : PCI Bus 0000:01
    fc00000000-fe0fffffff : PCI Bus 0000:02
      fc00000000-fe0fffffff : PCI Bus 0000:03
        fc00000000-fdffffffff : 0000:03:00.0
        fe00000000-fe0fffffff : 0000:03:00.0
  fe20000000-fe301fffff : PCI Bus 0000:08
    fe20000000-fe2fffffff : 0000:08:00.0
    fe30000000-fe301fffff : 0000:08:00.0
  fe30300000-fe304fffff : PCI Bus 0000:04
    fe30300000-fe303fffff : 0000:04:00.0
      fe30300000-fe303fffff : 0000:04:00.0
    fe30400000-fe30403fff : 0000:04:00.0
    fe30404000-fe30404fff : 0000:04:00.0

> 
> Thanks,
> 
>         tglx

Thank you,

Bert Karwatzki


Thread overview: 12+ messages
2026-01-13  9:41 NMI stack overflow during resume of PCIe bridge with CONFIG_HARDLOCKUP_DETECTOR=y Bert Karwatzki
2026-01-13 15:24 ` Thomas Gleixner
2026-01-13 17:50   ` Bert Karwatzki [this message]
2026-01-13 19:30     ` Thomas Gleixner
2026-01-13 21:15       ` Jason Gunthorpe
2026-01-13 22:19       ` Bert Karwatzki
2026-01-20 10:27         ` crash during resume of PCIe bridge in v5.17 (v5.16 works) Bert Karwatzki
2026-02-01  0:36           ` crash during resume of PCIe bridge from v5.17 to next-20260130 " Bert Karwatzki
2026-02-01 10:19             ` Armin Wolf
2026-02-01 11:42               ` Rafael J. Wysocki
2026-02-01 16:42             ` Thomas Gleixner
2026-02-02 10:37               ` Christian König
