[PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown

* [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
@ 2017-02-23 13:36 Xunlei Pang
  2017-03-03  9:07 ` Xunlei Pang
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Xunlei Pang @ 2017-02-23 13:36 UTC
  To: x86, linux-kernel, kexec
  Cc: Tony Luck, Borislav Petkov, Ingo Molnar, Dave Young,
	Prarit Bhargava, Junichi Nomura, Kiyoshi Ueda, Xunlei Pang,
	Naoya Horiguchi

We hit an issue with kdump: if a broadcasted MCE arrives after the
kdump kernel has booted, the other CPUs still parked in the first
kernel enter the first kernel's old MCE handler, time out waiting
for MCE synchronization and panic, which finally resets the kdump
CPUs.

This patch keeps those CPUs quiet after nmi_shootdown_cpus(): once
kdump has booted, CPUs remaining in the first kernel do nothing
except clear MCG_STATUS. This gives kdump the best chance of
completing the vmcore dump.
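
For readers following along, the bail-out condition this patch adds to
do_machine_check() can be modelled in plain userspace C. This is only a
sketch: should_skip_rendezvous() and NO_CRASHING_CPU are hypothetical
names, not kernel symbols, and the CPU id and offline state are passed
in as parameters instead of coming from smp_processor_id() and
cpu_is_offline(). The MCG_STATUS clearing itself is not modelled.

```c
#include <stdbool.h>

/*
 * Sentinel for "no crash shutdown in progress". It cannot be 0,
 * because 0 is a valid CPU id. (Hypothetical name for illustration.)
 */
#define NO_CRASHING_CPU (-1)

/*
 * Hypothetical helper mirroring the condition in the patch:
 * skip the MCE rendezvous if this CPU is offline, or if some
 * other CPU has been recorded as the crashing CPU.
 */
static bool should_skip_rendezvous(int cpu, bool cpu_offline, int crashing_cpu)
{
	return cpu_offline ||
	       (crashing_cpu != NO_CRASHING_CPU && crashing_cpu != cpu);
}
```

With crashing_cpu left at -1, no online CPU is skipped. Once it is set
to, say, 0, every CPU except CPU 0 bails out. This is also why the
patch initialises the variable to -1: a zero-initialised variable would
be indistinguishable from "CPU 0 is crashing".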

Previous efforts:
https://patchwork.kernel.org/patch/6167631/
https://lists.gt.net/linux/kernel/2146557

Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
---
v1->v2:
- Using crashing_cpu according to Borislav's suggestion.

v2->v3:
- Used crashing_cpu in mce.c explicitly, not skip crashing_cpu.
- Added some comments.

v3->v4:
- Added more code comments according to Tony's feedback.

 arch/x86/include/asm/reboot.h    |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 17 +++++++++++++++--
 arch/x86/kernel/reboot.c         |  5 +++--
 3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc2..fc62ba8 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c..b65505f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include <asm/tlbflush.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
+#include <asm/reboot.h>
 
 #include "mce-internal.h"
 
@@ -1127,9 +1128,21 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * on Intel.
 	 */
 	int lmce = 1;
+	int cpu = smp_processor_id();
 
-	/* If this CPU is offline, just bail out. */
-	if (cpu_is_offline(smp_processor_id())) {
+	/*
+	 * Cases to bail out to avoid rendezvous process timeout:
+	 * 1)If this CPU is offline.
+	 * 2)If crashing_cpu was set, e.g. entering kdump,
+	 *   we need to skip cpus remaining in 1st kernel.
+	 *   Note: there is a small window between kexecing
+	 *   and kdump kernel establishing new mce handler,
+	 *   if some MCE comes within the window, there is
+	 *   no valid mce handler due to pgtable changing,
+	 *   let's just face the fate.
+	 */
+	if (cpu_is_offline(cpu) ||
+	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e244c19..92ecf4b 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This keeps a track of which one is crashing cpu. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
-- 
1.8.3.1


* Re: [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
  2017-02-23 13:36 [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made Xunlei Pang
@ 2017-03-03  9:07 ` Xunlei Pang
  2017-03-06 11:16 ` Borislav Petkov
  2017-03-13  9:50 ` [PATCH] x86/mce: Handle broadcasted MCE gracefully with kexec Borislav Petkov
  2 siblings, 0 replies; 6+ messages in thread
From: Xunlei Pang @ 2017-03-03  9:07 UTC
  To: Xunlei Pang, x86, linux-kernel, kexec, Borislav Petkov
  Cc: Tony Luck, Ingo Molnar, Dave Young, Prarit Bhargava,
	Junichi Nomura, Kiyoshi Ueda, Naoya Horiguchi

Ping Boris



* Re: [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
  2017-02-23 13:36 [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made Xunlei Pang
  2017-03-03  9:07 ` Xunlei Pang
@ 2017-03-06 11:16 ` Borislav Petkov
  2017-03-06 18:27   ` Luck, Tony
  2017-03-13  9:50 ` [PATCH] x86/mce: Handle broadcasted MCE gracefully with kexec Borislav Petkov
  2 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-03-06 11:16 UTC
  To: Xunlei Pang, Tony Luck
  Cc: x86, linux-kernel, kexec, Ingo Molnar, Dave Young,
	Prarit Bhargava, Junichi Nomura, Kiyoshi Ueda, Naoya Horiguchi

On Thu, Feb 23, 2017 at 09:36:52PM +0800, Xunlei Pang wrote:
> We hit an issue with kdump: if a broadcasted MCE arrives after the
> kdump kernel has booted, the other CPUs still parked in the first
> kernel enter the first kernel's old MCE handler, time out waiting
> for MCE synchronization and panic, which finally resets the kdump
> CPUs.
> 
> This patch keeps those CPUs quiet after nmi_shootdown_cpus(): once
> kdump has booted, CPUs remaining in the first kernel do nothing
> except clear MCG_STATUS. This gives kdump the best chance of
> completing the vmcore dump.

Ok, I went and rewrote the text to make it more succinct, to the point
and correct spelling and formatting.

Tony, ACK?

---
From 2d76fdd4044b4659bb8746948b986e3f4eb75e22 Mon Sep 17 00:00:00 2001
From: Xunlei Pang <xlpang@redhat.com>
Date: Thu, 23 Feb 2017 21:36:52 +0800
Subject: [PATCH] x86/mce: Handle broadcasted MCE gracefully with kexec

When we are about to kexec a crash kernel and right then and there
a broadcasted MCE fires while we're still in first kernel and while
the other CPUs remain in a holding pattern, the #MC handler of the
first kernel will timeout and then panic due to never completing MCE
synchronization.

Handle this in a similar way as to when the CPUs are offlined when that
broadcasted MCE happens.
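
To illustrate why the synchronization never completes, here is a toy
userspace model of a rendezvous with a timeout. It is a sketch under
simplifying assumptions, not the kernel's actual #MC handler logic:
rendezvous_times_out() and its parameters are hypothetical, time is
counted in abstract ticks, and at most one CPU checks in per tick.

```c
#include <stdbool.h>

/*
 * Toy model (hypothetical, not kernel code): wait until `expected`
 * CPUs have checked in. Only `will_arrive` CPUs ever do so, one per
 * tick. Returns true if the tick budget runs out first, which is the
 * point where the real handler would give up and panic.
 */
static bool rendezvous_times_out(int expected, int will_arrive, int budget_ticks)
{
	int arrived = 0;

	for (int tick = 0; tick < budget_ticks; tick++) {
		if (arrived < will_arrive)
			arrived++;	/* one more CPU enters the handler */
		if (arrived == expected)
			return false;	/* everyone checked in */
	}
	return true;			/* budget spent, rendezvous incomplete */
}
```

In this model, rendezvous_times_out(4, 4, 10) completes, while
rendezvous_times_out(4, 3, 10) times out: the handler expects four
CPUs, but one of them never checks in, so no budget is large enough.
Letting the parked CPUs bail out early avoids entering this wait at all.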

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: kexec@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1487857012-9059-1-git-send-email-xlpang@redhat.com
[ Boris: rewrote commit message and comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/reboot.h    |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 18 ++++++++++++++++--
 arch/x86/kernel/reboot.c         |  5 +++--
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc253d51..fc62ba8dce93 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c607ea..177472ace838 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include <asm/tlbflush.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
+#include <asm/reboot.h>
 
 #include "mce-internal.h"
 
@@ -1127,9 +1128,22 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * on Intel.
 	 */
 	int lmce = 1;
+	int cpu = smp_processor_id();
 
-	/* If this CPU is offline, just bail out. */
-	if (cpu_is_offline(smp_processor_id())) {
+	/*
+	 * Cases where we avoid rendezvous handler timeout:
+	 * 1) If this CPU is offline.
+	 *
+	 * 2) If crashing_cpu was set, e.g. we're entering kdump and we need to
+	 *  skip those CPUs which remain looping in the 1st kernel - see
+	 *  crash_nmi_callback().
+	 *
+	 * Note: there still is a small window between kexec-ing and the new,
+	 * kdump kernel establishing a new #MC handler where a broadcasted MCE
+	 * might not get handled properly.
+	 */
+	if (cpu_is_offline(cpu) ||
+	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e244c19a2451..d3718cc5edbf 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This is the CPU performing the emergency shutdown work. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
-- 
2.11.0

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v4] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
  2017-03-06 11:16 ` Borislav Petkov
@ 2017-03-06 18:27   ` Luck, Tony
  0 siblings, 0 replies; 6+ messages in thread
From: Luck, Tony @ 2017-03-06 18:27 UTC
  To: Borislav Petkov
  Cc: Xunlei Pang, x86, linux-kernel, kexec, Ingo Molnar, Dave Young,
	Prarit Bhargava, Junichi Nomura, Kiyoshi Ueda, Naoya Horiguchi

On Mon, Mar 06, 2017 at 12:16:54PM +0100, Borislav Petkov wrote:
> On Thu, Feb 23, 2017 at 09:36:52PM +0800, Xunlei Pang wrote:
> > We hit an issue with kdump: if a broadcasted MCE arrives after the
> > kdump kernel has booted, the other CPUs still parked in the first
> > kernel enter the first kernel's old MCE handler, time out waiting
> > for MCE synchronization and panic, which finally resets the kdump
> > CPUs.
> > 
> > This patch keeps those CPUs quiet after nmi_shootdown_cpus(): once
> > kdump has booted, CPUs remaining in the first kernel do nothing
> > except clear MCG_STATUS. This gives kdump the best chance of
> > completing the vmcore dump.
> 
> Ok, I went and rewrote the text to make it more succinct, to the point
> and correct spelling and formatting.
> 
> Tony, ACK?

Yes. Looks good now.

Acked-by: Tony Luck <tony.luck@intel.com>

-Tony


* [PATCH] x86/mce: Handle broadcasted MCE gracefully with kexec
@ 2017-03-13  9:50 ` Borislav Petkov
  2017-03-13 19:21   ` [tip:ras/core] " tip-bot for Xunlei Pang
  0 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2017-03-13  9:50 UTC
  To: X86 ML; +Cc: LKML, Naoya Horiguchi, kexec, linux-edac

From: Xunlei Pang <xlpang@redhat.com>

When we are about to kexec a crash kernel and right then and there a
broadcasted MCE fires while we're still in the first kernel and while
the other CPUs remain in a holding pattern, the #MC handler of the
first kernel will timeout and then panic due to never completing MCE
synchronization.

Handle this in a similar way as to when the CPUs are offlined when that
broadcasted MCE happens.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: kexec@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1487857012-9059-1-git-send-email-xlpang@redhat.com
[ Boris: rewrote commit message and comments. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/reboot.h    |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 18 ++++++++++++++++--
 arch/x86/kernel/reboot.c         |  5 +++--
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc253d51..fc62ba8dce93 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c607ea..177472ace838 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include <asm/tlbflush.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
+#include <asm/reboot.h>
 
 #include "mce-internal.h"
 
@@ -1127,9 +1128,22 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * on Intel.
 	 */
 	int lmce = 1;
+	int cpu = smp_processor_id();
 
-	/* If this CPU is offline, just bail out. */
-	if (cpu_is_offline(smp_processor_id())) {
+	/*
+	 * Cases where we avoid rendezvous handler timeout:
+	 * 1) If this CPU is offline.
+	 *
+	 * 2) If crashing_cpu was set, e.g. we're entering kdump and we need to
+	 *  skip those CPUs which remain looping in the 1st kernel - see
+	 *  crash_nmi_callback().
+	 *
+	 * Note: there still is a small window between kexec-ing and the new,
+	 * kdump kernel establishing a new #MC handler where a broadcasted MCE
+	 * might not get handled properly.
+	 */
+	if (cpu_is_offline(cpu) ||
+	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e244c19a2451..d3718cc5edbf 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This is the CPU performing the emergency shutdown work. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
-- 
2.11.0


* [tip:ras/core] x86/mce: Handle broadcasted MCE gracefully with kexec
  2017-03-13  9:50 ` [PATCH] x86/mce: Handle broadcasted MCE gracefully with kexec Borislav Petkov
@ 2017-03-13 19:21   ` tip-bot for Xunlei Pang
  0 siblings, 0 replies; 6+ messages in thread
From: tip-bot for Xunlei Pang @ 2017-03-13 19:21 UTC
  To: linux-tip-commits
  Cc: bp, hpa, tglx, tony.luck, linux-edac, mingo, bp, linux-kernel,
	n-horiguchi, xlpang

Commit-ID:  5bc329503e8191c91c4c40836f062ef771d8ba83
Gitweb:     http://git.kernel.org/tip/5bc329503e8191c91c4c40836f062ef771d8ba83
Author:     Xunlei Pang <xlpang@redhat.com>
AuthorDate: Mon, 13 Mar 2017 10:50:19 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 13 Mar 2017 20:18:07 +0100

x86/mce: Handle broadcasted MCE gracefully with kexec

When we are about to kexec a crash kernel and right then and there a
broadcasted MCE fires while we're still in the first kernel and while
the other CPUs remain in a holding pattern, the #MC handler of the
first kernel will timeout and then panic due to never completing MCE
synchronization.

Handle this in a similar way as to when the CPUs are offlined when that
broadcasted MCE happens.

[ Boris: rewrote commit message and comments. ]

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: kexec@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1487857012-9059-1-git-send-email-xlpang@redhat.com
Link: http://lkml.kernel.org/r/20170313095019.19351-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/reboot.h    |  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 18 ++++++++++++++++--
 arch/x86/kernel/reboot.c         |  5 +++--
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc2..fc62ba8 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c..177472a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include <asm/tlbflush.h>
 #include <asm/mce.h>
 #include <asm/msr.h>
+#include <asm/reboot.h>
 
 #include "mce-internal.h"
 
@@ -1127,9 +1128,22 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	 * on Intel.
 	 */
 	int lmce = 1;
+	int cpu = smp_processor_id();
 
-	/* If this CPU is offline, just bail out. */
-	if (cpu_is_offline(smp_processor_id())) {
+	/*
+	 * Cases where we avoid rendezvous handler timeout:
+	 * 1) If this CPU is offline.
+	 *
+	 * 2) If crashing_cpu was set, e.g. we're entering kdump and we need to
+	 *  skip those CPUs which remain looping in the 1st kernel - see
+	 *  crash_nmi_callback().
+	 *
+	 * Note: there still is a small window between kexec-ing and the new,
+	 * kdump kernel establishing a new #MC handler where a broadcasted MCE
+	 * might not get handled properly.
+	 */
+	if (cpu_is_offline(cpu) ||
+	    (crashing_cpu != -1 && crashing_cpu != cpu)) {
 		u64 mcgstatus;
 
 		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 067f981..2544700 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -765,10 +765,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This is the CPU performing the emergency shutdown work. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
