Re: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI

public inbox for linux-kernel@vger.kernel.org

From: Michal Hocko <mhocko@kernel.org>
To: "河合英宏 / KAWAI，HIDEHIRO" <hidehiro.kawai.ez@hitachi.com>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Vivek Goyal" <vgoyal@redhat.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"平松雅巳 / HIRAMATU，MASAMI" <masami.hiramatsu.pt@hitachi.com>
Subject: Re: Re: [V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI
Date: Wed, 29 Jul 2015 11:21:58 +0200
Message-ID: <20150729092157.GC15801@dhcp22.suse.cz>
In-Reply-To: <04EAB7311EE43145B2D3536183D1A8445491DB5E@GSjpTKYDCembx31.service.hitachi.net>

On Wed 29-07-15 09:09:18, 河合英宏 / KAWAI，HIDEHIRO wrote:
> > From: Michal Hocko [mailto:mhocko@kernel.org]
> > On Wed 29-07-15 05:48:47, 河合英宏 / KAWAI，HIDEHIRO wrote:
> > > Hi,
> > >
> > > > From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Hidehiro Kawai
> > > > (2015/07/27 23:34), Michal Hocko wrote:
> > > > > On Mon 27-07-15 10:58:50, Hidehiro Kawai wrote:
> > > [...]
> > > > > The check could be also relaxed a bit and nmi_panic would
> > > > > return only if the ongoing panic is the current cpu when we really have
> > > > > to return and allow the preempted panic to finish.
> > > >
> > > > It's reasonable.  I'll do that in the next version.
> > >
> > > I noticed atomic_read() is insufficient.  Please consider the following
> > > scenario.
> > >
> > > CPU 1: call panic() in the normal context
> > > CPU 0: call nmi_panic(), check the value of panic_cpu, then call panic()
> > > CPU 1: set 1 to panic_cpu
> > > CPU 0: fail to set 0 to panic_cpu, then do an infinite loop
> > > CPU 1: call crash_kexec(), then call kdump_nmi_shootdown_cpus()
> > >
> > > At this point, since CPU 0 loops in NMI context, it never executes
> > > the NMI handler registered by kdump_nmi_shootdown_cpus().  This means
> > > that no register states are saved and no cleanups for VMX/SVM are
> > > performed.
> > 
> > Yes this is true but it is no different from the current state, isn't
> > it? So if you want to handle that then it deserves a separate patch.
> > It is certainly not harmful wrt. panic behavior.
> > 
> > > So, we should still use atomic_cmpxchg() in nmi_panic() to
> > > prevent other cpus from running panic routines.
> > 
> > Not sure what you mean by that.
> 
> I mean that we should use the same logic as my V2 patch like this:
> 
> #define nmi_panic(fmt, ...)                                            \
>        do {                                                            \
>                if (atomic_cmpxchg(&panic_cpu, -1, raw_smp_processor_id()) \
>                    == -1)                                              \
>                        panic(fmt, ##__VA_ARGS__);                      \
>        } while (0)

This would allow CPUs to return from NMI too eagerly. When I was testing my
previous approach (on a 3.0-based kernel) I had basically the same thing:
one NMI processed the panic and the others returned. This led to strange
behavior when the NMI button triggered an NMI on all (hundreds of) CPUs. The
crash kernel booted eventually, but the log contained lockups where a
CPU waited for an IPI to the CPU which was handling the NMI panic.

Anyway, I do not think this is really necessary to solve the panic
reentrancy issue. If the missing saved state is a real problem then it
should be handled separately - maybe it can be achieved without an IPI
and directly from the panic context if we are in NMI.
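
To make the difference between the two checks discussed above concrete,
here is a rough userspace sketch, not kernel code. It assumes panic_cpu
can be modeled with a C11 atomic_int, with -1 meaning "no panic in
progress". The helper names v2_nmi_enters_panic() and
relaxed_nmi_enters_panic() are made up for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define PANIC_CPU_INVALID (-1)

/* Userspace stand-in for the kernel's panic_cpu (assumption). */
static atomic_int panic_cpu = PANIC_CPU_INVALID;

/*
 * V2-style gate: only the CPU that wins the cmpxchg goes on to panic().
 * Every other CPU returns from its NMI handler immediately.
 */
static bool v2_nmi_enters_panic(int this_cpu)
{
	int expected = PANIC_CPU_INVALID;

	return atomic_compare_exchange_strong(&panic_cpu, &expected, this_cpu);
}

/*
 * Relaxed gate: return from the NMI only when the ongoing panic belongs
 * to this CPU, i.e. the NMI interrupted our own panic().  Any other CPU
 * still calls panic() and is stopped there.
 */
static bool relaxed_nmi_enters_panic(int this_cpu)
{
	return atomic_load(&panic_cpu) != this_cpu;
}
```

With panic_cpu already owned by CPU 0, the V2-style gate sends CPU 1 back
out of the NMI, while the relaxed gate lets CPU 1 proceed into panic() and
only CPU 0 itself returns.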
-- 
Michal Hocko
SUSE Labs
