* [Patch] Small fix for CMCI/Poll conditions
@ 2009-04-02  8:33 Ke, Liping
  2009-04-02 20:55 ` Frank van der Linden
  0 siblings, 1 reply; 2+ messages in thread
From: Ke, Liping @ 2009-04-02  8:33 UTC
  To: Frank.Vanderlinden@Sun.COM, Keir Fraser; +Cc: xen-devel@lists.xensource.com

[-- Attachment #1: Type: text/plain, Size: 443 bytes --]

Hi Frank & Keir,

This is a small fix for a race condition between CMCI and polling.
We found that when CMCIs arrive in quick succession, the polling and CMCI processing paths can cross, which is not handled gracefully.
So on recent Intel CPUs that support CMCI, we no longer poll an error bank that has CMCI capability (not every bank does,
even if the CPU supports CMCI). Banks without CMCI capability are still polled as before.
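
The bank selection described above can be sketched roughly as follows. This is a simplified illustration, not the patch itself: it models the bank set as a 64-bit mask instead of cpu_banks_t and passes everything as parameters instead of using per-CPU variables. The names bank_mask_t, all_banks, compute_poll_mask and skip_bank0 are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplification: one bit per MCA bank, at most 64 banks.
 * The real code uses cpu_banks_t bitmaps held in per-CPU variables. */
typedef uint64_t bank_mask_t;

/* Return a mask with the low nr_banks bits set. */
static bank_mask_t all_banks(unsigned int nr_banks)
{
    assert(nr_banks <= 64);
    return nr_banks == 64 ? ~(bank_mask_t)0
                          : (((bank_mask_t)1 << nr_banks) - 1);
}

/* Decide which banks the periodic poller should scan on one CPU. */
static bank_mask_t compute_poll_mask(bool cmci_support, bool mce_disabled,
                                     bank_mask_t no_cmci_banks,
                                     unsigned int nr_banks, bool skip_bank0)
{
    bank_mask_t mask;

    if (cmci_support && !mce_disabled) {
        /* CMCI covers the capable banks: poll only the banks without it. */
        mask = no_cmci_banks;
    } else {
        /* No usable CMCI: fall back to polling every bank. */
        mask = all_banks(nr_banks);
        if (skip_bank0)
            mask &= ~(bank_mask_t)1; /* bank 0 is excluded on some CPUs */
    }
    return mask;
}
```

With six banks of which only banks 0 and 2 lack CMCI, compute_poll_mask(true, false, 0x5, 6, true) selects just those two banks, while the same call with cmci_support set to false selects banks 1 to 5.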

Thanks a lot for your help!
Regards,
Criping

[-- Attachment #2: poll_cmci.patch --]
[-- Type: application/octet-stream, Size: 3279 bytes --]

Small fix for polling/CMCI race conditions.

When CMCIs arrive in quick succession, the polling and CMCI processing paths can cross.
On Intel CPUs that support CMCI, if an error bank has CMCI capability,
polling is disabled for that bank.

Signed-off-by: Liping Ke <liping.ke@intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@intel.com>


diff -r f6fd1c2e4da6 xen/arch/x86/cpu/mcheck/mce.c
--- a/xen/arch/x86/cpu/mcheck/mce.c	Tue Mar 31 18:37:58 2009 +0100
+++ b/xen/arch/x86/cpu/mcheck/mce.c	Thu Apr 02 16:19:13 2009 +0800
@@ -577,6 +577,7 @@
 		break;
 	}
 
+    set_poll_bankmask(c);
 	if (!inited)
 		printk(XENLOG_INFO "CPU%i: No machine check initialization\n",
 		    smp_processor_id());
@@ -1230,7 +1231,19 @@
 
 	return ret;
 }
+void set_poll_bankmask(struct cpuinfo_x86 *c)
+{
 
+    if (cmci_support && !mce_disabled) {
+        memcpy(&(__get_cpu_var(poll_bankmask)),
+                &(__get_cpu_var(no_cmci_banks)), sizeof(cpu_banks_t));
+    }
+    else {
+        memcpy(&(get_cpu_var(poll_bankmask)), &mca_allbanks, sizeof(cpu_banks_t));
+        if (mce_firstbank(c))
+            clear_bit(0, get_cpu_var(poll_bankmask));
+    }
+}
 void mc_panic(char *s)
 {
     console_start_sync();
diff -r f6fd1c2e4da6 xen/arch/x86/cpu/mcheck/mce.h
--- a/xen/arch/x86/cpu/mcheck/mce.h	Tue Mar 31 18:37:58 2009 +0100
+++ b/xen/arch/x86/cpu/mcheck/mce.h	Thu Apr 02 16:19:13 2009 +0800
@@ -88,6 +88,10 @@
 };
 
 extern cpu_banks_t mca_allbanks;
+void set_poll_bankmask(struct cpuinfo_x86 *c);
+DECLARE_PER_CPU(cpu_banks_t, poll_bankmask);
+DECLARE_PER_CPU(cpu_banks_t, no_cmci_banks);
+extern int cmci_support;
 
 extern mctelem_cookie_t mcheck_mca_logout(enum mca_source, cpu_banks_t,
     struct mca_summary *);
diff -r f6fd1c2e4da6 xen/arch/x86/cpu/mcheck/mce_intel.c
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c	Tue Mar 31 18:37:58 2009 +0100
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c	Thu Apr 02 16:19:13 2009 +0800
@@ -12,9 +12,10 @@
 #include "x86_mca.h"
 
 DEFINE_PER_CPU(cpu_banks_t, mce_banks_owned);
+DEFINE_PER_CPU(cpu_banks_t, no_cmci_banks);
+int cmci_support = 0;
 
 static int nr_intel_ext_msrs = 0;
-static int cmci_support = 0;
 static int firstbank;
 
 #ifdef CONFIG_X86_MCE_THERMAL
@@ -548,7 +549,6 @@
 }
 
 static DEFINE_SPINLOCK(cmci_discover_lock);
-static DEFINE_PER_CPU(cpu_banks_t, no_cmci_banks);
 
 /*
  * Discover bank sharing using the algorithm recommended in the SDM.
diff -r f6fd1c2e4da6 xen/arch/x86/cpu/mcheck/non-fatal.c
--- a/xen/arch/x86/cpu/mcheck/non-fatal.c	Tue Mar 31 18:37:58 2009 +0100
+++ b/xen/arch/x86/cpu/mcheck/non-fatal.c	Thu Apr 02 16:19:13 2009 +0800
@@ -22,7 +22,7 @@
 
 #include "mce.h"
 
-static cpu_banks_t bankmask;
+DEFINE_PER_CPU(cpu_banks_t, poll_bankmask);
 static struct timer mce_timer;
 
 #define MCE_PERIOD MILLISECS(8000)
@@ -39,7 +39,7 @@
 	struct mca_summary bs;
 	static uint64_t dumpcount = 0;
 
-	mctc = mcheck_mca_logout(MCA_POLLER, bankmask, &bs);
+	mctc = mcheck_mca_logout(MCA_POLLER, __get_cpu_var(poll_bankmask), &bs);
 
 	if (bs.errcnt && mctc != NULL) {
 		adjust++;
@@ -94,10 +94,6 @@
 	if (!mce_available(c))
 		return -ENODEV;
 
-	memcpy(&bankmask, &mca_allbanks, sizeof (cpu_banks_t));
-	if (mce_firstbank(c) == 1)
-		clear_bit(0, bankmask);
-
 	/*
 	 * Check for non-fatal errors every MCE_RATE s
 	 */


* Re: [Patch] Small fix for CMCI/Poll conditions
  2009-04-02  8:33 [Patch] Small fix for CMCI/Poll conditions Ke, Liping
@ 2009-04-02 20:55 ` Frank van der Linden
  0 siblings, 0 replies; 2+ messages in thread
From: Frank van der Linden @ 2009-04-02 20:55 UTC
  To: Ke, Liping; +Cc: xen-devel@lists.xensource.com, Keir Fraser

Ke, Liping wrote:
> Hi Frank & Keir,
> 
> This is a small fix for a race condition between CMCI and polling.
> We found that when CMCIs arrive in quick succession, the polling and CMCI processing paths can cross, which is not handled gracefully.
> So on recent Intel CPUs that support CMCI, we no longer poll an error bank that has CMCI capability (not every bank does,
> even if the CPU supports CMCI). Banks without CMCI capability are still polled as before.
> 
> Thanks a lot for your help!
> Regards,
> Criping

This patch looks good to me.

Thanks,

- Frank
