From: Baoquan He <bhe@redhat.com>
To: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
Cc: x86@kernel.org, kexec@lists.infradead.org,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org
Subject: Re: kdump kernel randomly hang with tick_periodic call trace on bare metal system
Date: Thu, 22 Dec 2022 12:09:05 +0800
Message-ID: <Y6PYYUKXR2OCH3WG@MiWiFi-R3L-srv>
In-Reply-To: <efe7e25e-8b05-90e7-fc02-4f1dc84fe324@igalia.com>

On 12/21/22 at 12:46pm, Guilherme G. Piccoli wrote:
> On 20/12/2022 02:51, Baoquan He wrote:
> > On 12/20/22 at 01:41pm, Baoquan He wrote:
> >> On one intel bare metal system, I can randomly reproduce the kdump hang
> >> as below with tick_periodic call trace. Attach the kernel config for
> >> reference.
> > 
> > Forgot mentioning this random hang is also caused by adding
> > 'nr_cpus=2' into normal kernel's cmdline, then triggering crash will get
> > kdump kernel hang as below kdump log shown.
> > 
> 
> The weird thing is that you seem to be using "nr_cpus=1" instead - this
> is the cmdline from the log:
> 
> "nr_cpus=2 irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off
> numa=off udev.children-max=2 panic=10 acpi_no_memhotplug
> transparent_hugepage=never nokaslr hest_disable novmcoredd cma=0
> hugetlb_cma=0 disable_cpu_apicid=16 [...]"
> 
> You seems to pass twice the "nr_cpus" thing, and I guess kernel pick the
> last one?

From the kdump kernel boot log, yes, nr_cpus=1 is the one taken.
parse_early_param() parses the kernel parameters one by one, so the
last occurrence takes effect. The problem here is not whether nr_cpus
is 2 or 1, though. The bare metal system has 16 CPUs and only 2 of
them are present; it seems the 14 halted CPUs receive a wrong message
and misbehave, which causes the issue.

> 
> Also, what is "disable_cpu_apicid=16"? Could this be related?

Not really. Please check disable_cpu_apicid in
Documentation/admin-guide/kdump/kdump.rst; it is the BSP's APIC ID.


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
