public inbox for linux-kernel@vger.kernel.org
From: Michal Hocko <mhocko@kernel.org>
To: "河合英宏 / KAWAI,HIDEHIRO" <hidehiro.kawai.ez@hitachi.com>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vivek Goyal" <vgoyal@redhat.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"平松雅巳 / HIRAMATU,MASAMI" <masami.hiramatsu.pt@hitachi.com>
Subject: Re: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI
Date: Thu, 30 Jul 2015 14:27:47 +0200
Message-ID: <20150730122747.GA3954@dhcp22.suse.cz>
In-Reply-To: <04EAB7311EE43145B2D3536183D1A8445491FC55@GSjpTKYDCembx31.service.hitachi.net>

On Thu 30-07-15 11:55:52, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > From: Michal Hocko [mailto:mhocko@kernel.org]
[...]
> > Could you point me to the code which does that, please? Maybe we are
> > missing that in our 3.0 kernel. I was quite surprised to see this
> > behavior as well.
> 
> Please see the snippet below.
> 
> void setup_local_APIC(void)
> {
> ...
>         /*
>          * only the BP should see the LINT1 NMI signal, obviously.
>          */
>         if (!cpu)
>                 value = APIC_DM_NMI;
>         else
>                 value = APIC_DM_NMI | APIC_LVT_MASKED;
>         if (!lapic_is_integrated())             /* 82489DX */
>                 value |= APIC_LVT_LEVEL_TRIGGER;
>         apic_write(APIC_LVT1, value);
> 
> 
> LINT1 pins of CPUs other than CPU 0 are masked here.
> However, at least on some Hitachi servers, the NMI caused by the NMI
> button doesn't seem to be delivered through LINT1.  So, my term
> `external NMI' may not be correct.

I am not familiar with the details here, but I can tell you that this
particular code snippet is the same in our 3.0-based kernel, so it seems
that the HW is indeed doing something different.
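
For readers following along, the effect of the quoted snippet on the LVT1
register can be worked through in a small userspace sketch. The bit values
below are written from memory of arch/x86/include/asm/apicdef.h and should
be treated as assumptions; lint1_lvt_value() and lint1_nmi_masked() are
hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit definitions (believed to match apicdef.h). */
#define APIC_DM_NMI             0x00400u   /* delivery mode: NMI */
#define APIC_LVT_LEVEL_TRIGGER  (1u << 15) /* level-triggered input */
#define APIC_LVT_MASKED         (1u << 16) /* LVT entry is masked */

/*
 * Hypothetical helper: compute the value the quoted code would write
 * to APIC_LVT1 for a given CPU number.
 */
static inline uint32_t lint1_lvt_value(int cpu, bool integrated_apic)
{
	uint32_t value = APIC_DM_NMI;

	if (cpu != 0)                 /* only the boot processor stays unmasked */
		value |= APIC_LVT_MASKED;
	if (!integrated_apic)         /* 82489DX */
		value |= APIC_LVT_LEVEL_TRIGGER;
	return value;
}

/* Hypothetical helper: true if LINT1 NMIs are masked for this value. */
static inline bool lint1_nmi_masked(uint32_t value)
{
	return (value & APIC_LVT_MASKED) != 0;
}
```

With an integrated APIC this gives 0x00400 for CPU 0 and 0x10400 for every
other CPU, so the mask bit is set everywhere except on the boot processor.
An NMI that shows up on many CPUs at once is therefore unlikely to have
arrived through LINT1.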

> > You might still get a panic on hardlockup which will happen on all CPUs
> > from the NMI context so we have to be able to handle panic in NMI on
> > many CPUs.
> 
> Are you talking about the case of a kernel panic while other CPUs lock up
> in NMI context?  In that case, there is no way to do the things needed by
> the kdump procedure, including saving registers...

I am saying that watchdog_overflow_callback might trigger on more CPUs
and panic from NMI context as well. So this is not limited to the case
where the NMI button sends an NMI to several CPUs.

Why can't the panic() context save all the registers if we are going to
loop in NMI context? IMO this would be preferable to returning from
panic.
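
One way to picture "only one CPU proceeds while the rest save their state
and loop" is a compare-and-swap on a shared owner variable. The following
is a minimal userspace sketch using C11 atomics, not the patch under
discussion; panic_owner, PANIC_OWNER_NONE and panic_enter() are
hypothetical names invented for illustration.

```c
#include <stdatomic.h>

#define PANIC_OWNER_NONE (-1)  /* hypothetical: nobody has panicked yet */

/* Hypothetical shared state: the CPU that owns the panic path. */
static atomic_int panic_owner = PANIC_OWNER_NONE;

enum panic_action {
	PANIC_PROCEED,            /* this CPU won and runs the panic path */
	PANIC_SAVE_REGS_AND_SPIN  /* another CPU owns it: save state, loop */
};

/*
 * Hypothetical entry point, callable from any context including NMI.
 * Exactly one caller sees PANIC_PROCEED; every later caller is told to
 * save its registers and spin instead of returning to normal execution.
 */
static inline enum panic_action panic_enter(int this_cpu)
{
	int expected = PANIC_OWNER_NONE;

	if (atomic_compare_exchange_strong(&panic_owner, &expected, this_cpu))
		return PANIC_PROCEED;
	return PANIC_SAVE_REGS_AND_SPIN;
}
```

The compare-and-swap needs no lock, so it is usable from NMI context, and
the losing CPUs learn that they lost without having to return from the
panic path.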

[...]
> > I can provide the full log but it is quite mangled. I guess the
> > CPU130 was the only one allowed to proceed with the panic while others
> > returned from the unknown NMI handling. It took a lot of time until
> > CPU130 managed to boot the crash kernel with soft lockup and RCU stall
> > reports. CPU0 is most probably locked up waiting for CPU130 to
> > acknowledge the IPI which will not happen apparently.
> 
> There is a timeout of 1000ms in nmi_shootdown_cpus(), so I don't know
> why CPU 130 waits so long.  I'll think about it for a while.

Yes, I do not understand the timing here either, and the fact that the
log is a complete mess in the important parts doesn't help at all.
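
A bounded wait of the kind mentioned for nmi_shootdown_cpus() usually has
the shape below. This is a userspace sketch under assumptions, not a copy
of the kernel code: cpus_pending, delay_1ms() and wait_for_acks() are
hypothetical names, and the delay is a no-op placeholder, so the loop
counts iterations rather than real milliseconds.

```c
#include <stdatomic.h>

/* Hypothetical: number of CPUs that have not yet acknowledged the NMI. */
static atomic_int cpus_pending;

/* Placeholder for a 1 ms busy-wait; a kernel would use a real delay. */
static inline void delay_1ms(void)
{
}

/*
 * Hypothetical helper: wait until every CPU has acknowledged or the
 * budget runs out. Returns the remaining budget; 0 means timed out.
 */
static inline unsigned int wait_for_acks(unsigned int budget_ms)
{
	while (atomic_load(&cpus_pending) > 0 && budget_ms > 0) {
		delay_1ms();
		budget_ms--;
	}
	return budget_ms;
}
```

Because the loop gives up once the budget is used up, a wait like this
should add at most about one second by itself, so a much longer stall
would have to come from somewhere else.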
 
[...]

-- 
Michal Hocko
SUSE Labs
