All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-doc@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>, Vivek Goyal <vgoyal@redhat.com>
Subject: Re: [PATCH 0/3] x86: Fix panic vs. NMI issues
Date: Thu, 23 Jul 2015 19:11:03 +0900	[thread overview]
Message-ID: <55B0BDB7.2050809@hitachi.com> (raw)
In-Reply-To: <20150723082514.GC9386@dhcp22.suse.cz>

Hi,

Thanks for the feedback.

(2015/07/23 17:25), Michal Hocko wrote:
> Hi,
> 
> On Wed 22-07-15 11:14:21, Hidehiro Kawai wrote:
>> When an HA cluster software or administrator detects non-response
>> of a host, they issue an NMI to the host to completely stop current
>> works and take a crash dump.  If the kernel has already panicked
>> or is capturing a crash dump at that time, further NMI can cause
>> a crash dump failure.
>>
>> To solve this issue, this patch set does two things:
>>
>> - Don't panic on NMI if the kernel has already panicked
>> - Introduce "noextnmi" boot option which masks external NMI at the
>>   boot time (supported only for x86)
> 
> I am currently debugging the same issue for our customer. Curiously
> enough the issue happens on a Hitachi HW.

I found these issues by my white-box testing and source code
reading.  So, they haven't happened on our customers yet, but
possibly happen.

> I haven't posted my patch for an upstream review yet because I still
> do not have a feedback but I believe your solution is unnecessarily
> too complex. Unless I am missing something the following should be enough,
> no?

Your patch solves some cases, but I think it wouldn't cover
all cases where I want to solve.  How about the following cases?

1) panic -> acquire panic_lock -> unknown NMI on this CPU ->
   panic -> failed to acquire panic_lock -> infinite loop
==> no one processes kdump procedure. 

2) crash_kexec w/o entering panic -> acquire kexec_mutex ->
   unknown NMI on this CPU -> panic -> crash_kexec ->
   failed to acquire kexec_mutex -> return to panic -> smp_send_stop

Even if with your patch, case 2) causes infinite loop of
try_crash_kexec and no one processes kdump procedure.

Regards,

> ---
>>From ba6ef85d26113e720a630ea22b08efef5b70210f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 17 Jul 2015 15:17:08 +0200
> Subject: [PATCH] kexec: Never return from crash_kexec when kexex is in
>  progress
> 
> We had a report when kdump kernel hasn't booted after unknown NMI has
> been delivered and unknown_nmi_panic is enabled. The NMI is triggered
> by HW and it is delivered to all CPUs at the same time. The machine has
> hundreds of CPUs and the most plausible theory is that one CPU really
> manages to kick the kexec but it cannot shut down all the CPUs because
> they are processing NMI and so cannot process an IPI. Another CPU then
> manages to call smp_send_stop from a concurrent panic and this stops the
> kexec CPU which has managed to switch to the new kernel and doesn't run
> in the NMI mode anymore.
> 
> Fix this by making crash_kexec to never return if there is a kexec in
> progress. This can be done easily by relying on the fact that
> kexec_mutex will never be released for an ongoing kexec so we just have
> to loop over the try lock. The only tricky part is that
> kexec_crash_image might be not loaded when we have to return. The check
> has to be done under the lock. Extract the trylock and check into
> try_crash_kexec and make it return true only if crash kexec is disabled.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  kernel/kexec.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index a785c1015e25..d61b1478167d 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1470,7 +1470,7 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
>  
>  #endif /* CONFIG_KEXEC_FILE */
>  
> -void crash_kexec(struct pt_regs *regs)
> +static bool try_crash_kexec(struct pt_regs *regs)
>  {
>  	/* Take the kexec_mutex here to prevent sys_kexec_load
>  	 * running on one cpu from replacing the crash kernel
> @@ -1490,7 +1490,20 @@ void crash_kexec(struct pt_regs *regs)
>  			machine_kexec(kexec_crash_image);
>  		}
>  		mutex_unlock(&kexec_mutex);
> +		return true;
>  	}
> +	return false;
> +}
> +
> +void crash_kexec(struct pt_regs *regs)
> +{
> +	/*
> +	 * Never return from this function if a kexec is in progress
> +	 * already because next steps might interfere with it.
> +	 * try_crash_kexec will never succeed in such a case.
> +	 */
> +	while (!try_crash_kexec(regs))
> +		cpu_relax();
>  }
>  
>  size_t crash_get_memory_size(void)
> 


-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

WARNING: multiple messages have this Message-ID (diff)
From: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vivek Goyal <vgoyal@redhat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
	x86@kernel.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCH 0/3] x86: Fix panic vs. NMI issues
Date: Thu, 23 Jul 2015 19:11:03 +0900	[thread overview]
Message-ID: <55B0BDB7.2050809@hitachi.com> (raw)
In-Reply-To: <20150723082514.GC9386@dhcp22.suse.cz>

Hi,

Thanks for the feedback.

(2015/07/23 17:25), Michal Hocko wrote:
> Hi,
> 
> On Wed 22-07-15 11:14:21, Hidehiro Kawai wrote:
>> When an HA cluster software or administrator detects non-response
>> of a host, they issue an NMI to the host to completely stop current
>> works and take a crash dump.  If the kernel has already panicked
>> or is capturing a crash dump at that time, further NMI can cause
>> a crash dump failure.
>>
>> To solve this issue, this patch set does two things:
>>
>> - Don't panic on NMI if the kernel has already panicked
>> - Introduce "noextnmi" boot option which masks external NMI at the
>>   boot time (supported only for x86)
> 
> I am currently debugging the same issue for our customer. Curiously
> enough the issue happens on a Hitachi HW.

I found these issues by my white-box testing and source code
reading.  So, they haven't happened on our customers yet, but
possibly happen.

> I haven't posted my patch for an upstream review yet because I still
> do not have a feedback but I believe your solution is unnecessarily
> too complex. Unless I am missing something the following should be enough,
> no?

Your patch solves some cases, but I think it wouldn't cover
all cases where I want to solve.  How about the following cases?

1) panic -> acquire panic_lock -> unknown NMI on this CPU ->
   panic -> failed to acquire panic_lock -> infinite loop
==> no one processes kdump procedure. 

2) crash_kexec w/o entering panic -> acquire kexec_mutex ->
   unknown NMI on this CPU -> panic -> crash_kexec ->
   failed to acquire kexec_mutex -> return to panic -> smp_send_stop

Even if with your patch, case 2) causes infinite loop of
try_crash_kexec and no one processes kdump procedure.

Regards,

> ---
>>From ba6ef85d26113e720a630ea22b08efef5b70210f Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Fri, 17 Jul 2015 15:17:08 +0200
> Subject: [PATCH] kexec: Never return from crash_kexec when kexex is in
>  progress
> 
> We had a report when kdump kernel hasn't booted after unknown NMI has
> been delivered and unknown_nmi_panic is enabled. The NMI is triggered
> by HW and it is delivered to all CPUs at the same time. The machine has
> hundreds of CPUs and the most plausible theory is that one CPU really
> manages to kick the kexec but it cannot shut down all the CPUs because
> they are processing NMI and so cannot process an IPI. Another CPU then
> manages to call smp_send_stop from a concurrent panic and this stops the
> kexec CPU which has managed to switch to the new kernel and doesn't run
> in the NMI mode anymore.
> 
> Fix this by making crash_kexec to never return if there is a kexec in
> progress. This can be done easily by relying on the fact that
> kexec_mutex will never be released for an ongoing kexec so we just have
> to loop over the try lock. The only tricky part is that
> kexec_crash_image might be not loaded when we have to return. The check
> has to be done under the lock. Extract the trylock and check into
> try_crash_kexec and make it return true only if crash kexec is disabled.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.cz>
> ---
>  kernel/kexec.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index a785c1015e25..d61b1478167d 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1470,7 +1470,7 @@ SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
>  
>  #endif /* CONFIG_KEXEC_FILE */
>  
> -void crash_kexec(struct pt_regs *regs)
> +static bool try_crash_kexec(struct pt_regs *regs)
>  {
>  	/* Take the kexec_mutex here to prevent sys_kexec_load
>  	 * running on one cpu from replacing the crash kernel
> @@ -1490,7 +1490,20 @@ void crash_kexec(struct pt_regs *regs)
>  			machine_kexec(kexec_crash_image);
>  		}
>  		mutex_unlock(&kexec_mutex);
> +		return true;
>  	}
> +	return false;
> +}
> +
> +void crash_kexec(struct pt_regs *regs)
> +{
> +	/*
> +	 * Never return from this function if a kexec is in progress
> +	 * already because next steps might interfere with it.
> +	 * try_crash_kexec will never succeed in such a case.
> +	 */
> +	while (!try_crash_kexec(regs))
> +		cpu_relax();
>  }
>  
>  size_t crash_get_memory_size(void)
> 


-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group



  reply	other threads:[~2015-07-23 10:11 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-22  2:14 [PATCH 0/3] x86: Fix panic vs. NMI issues Hidehiro Kawai
2015-07-22  2:14 ` Hidehiro Kawai
2015-07-22  2:14 ` [PATCH 3/3] x86/apic: Introduce noextnmi boot option Hidehiro Kawai
2015-07-22  2:14   ` Hidehiro Kawai
2015-07-22  2:14 ` [PATCH 2/3] kexec: Fix race between panic() and crash_kexec() directly called Hidehiro Kawai
2015-07-22  2:14   ` Hidehiro Kawai
2015-07-22  2:14 ` [PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI Hidehiro Kawai
2015-07-22  2:14   ` Hidehiro Kawai
2015-07-23  8:15   ` Peter Zijlstra
2015-07-23  8:15     ` Peter Zijlstra
2015-07-23  9:43     ` Hidehiro Kawai
2015-07-23  9:43       ` Hidehiro Kawai
2015-07-23  8:25 ` [PATCH 0/3] x86: Fix panic vs. NMI issues Michal Hocko
2015-07-23  8:25   ` Michal Hocko
2015-07-23 10:11   ` Hidehiro Kawai [this message]
2015-07-23 10:11     ` Hidehiro Kawai
2015-07-23 11:25     ` Michal Hocko
2015-07-23 11:25       ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55B0BDB7.2050809@hitachi.com \
    --to=hidehiro.kawai.ez@hitachi.com \
    --cc=akpm@linux-foundation.org \
    --cc=corbet@lwn.net \
    --cc=ebiederm@xmission.com \
    --cc=hpa@zytor.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mhocko@kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vgoyal@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.