* [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra
Sorry for the resend, I guess I need more sleep or vacation or both :(
The following series fixes a few interesting bugs (found by review in
the context of the CMCI poll effort), cleans up the timer/hotplug
code, and finishes with a consolidated version of the CMCI poll
implementation. The series is based on
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
which contains the bugfix for the dropped timer interval init.
Thanks,
tglx

* [patch 1/5] x86: mce: Disable preemption when calling raise_local()
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-disable-preemption-when-calling-raise-local.patch --]
[-- Type: text/plain, Size: 923 bytes --]

raise_mce() has a code path which does not disable preemption when
raise_local() is called. The per-cpu variable access in raise_local()
depends on preemption being disabled to be functional. So that code
path was either never tested, or never tested with
CONFIG_DEBUG_PREEMPT enabled.

Add the missing preempt_disable/enable() pair around the call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -194,7 +194,11 @@ static void raise_mce(struct mce *m)
 		put_online_cpus();
 	} else
 #endif
+	{
+		preempt_disable();
 		raise_local();
+		preempt_enable();
+	}
 }
 
 /* Error injection interface */

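For illustration, the hazard this patch closes (reading a CPU id, then
touching that CPU's data while migration is still possible) can be
modeled in user space. A minimal sketch, assuming glibc's
sched_getcpu()/sched_setaffinity(); the names per_cpu_count,
unsafe_update and safe_update are hypothetical, and affinity pinning
merely stands in for preempt_disable(), it is not kernel code:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

#define MAX_CPUS 64
static long per_cpu_count[MAX_CPUS];

static void unsafe_update(void)
{
	int cpu = sched_getcpu();	/* like smp_processor_id() */
	/*
	 * If the thread migrates to another CPU here, it updates a
	 * slot it no longer owns, the same hazard raise_local() has
	 * when preemption is left enabled.
	 */
	per_cpu_count[cpu]++;
}

static void safe_update(void)
{
	cpu_set_t set, old;
	int cpu;

	/* Pin the thread: the user-space analogue of preempt_disable() */
	sched_getaffinity(0, sizeof(old), &old);
	cpu = sched_getcpu();
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);

	per_cpu_count[cpu]++;	/* now guaranteed to run on 'cpu' */

	sched_setaffinity(0, sizeof(old), &old);	/* "preempt_enable()" */
}

int main(void)
{
	unsafe_update();
	safe_update();
	printf("cpu %d count %ld\n", sched_getcpu(),
	       per_cpu_count[sched_getcpu()]);
	return 0;
}
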
* [patch 3/5] x86: mce: Split timer init
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-split-timer-init.patch --]
[-- Type: text/plain, Size: 1904 bytes --]

Split the timer init function into an init part and a start part, so
the start part can replace the open-coded version in CPU_DOWN_FAILED.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1554,23 +1554,28 @@ static void __mcheck_cpu_init_vendor(str
 	}
 }
 
-static void __mcheck_cpu_init_timer(void)
+static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
 	unsigned long iv = check_interval * HZ;
 
-	setup_timer(t, mce_timer_fn, smp_processor_id());
+	__this_cpu_write(mce_next_interval, iv);
 
-	if (mce_ignore_ce)
+	if (mce_ignore_ce || !iv)
 		return;
 
-	__this_cpu_write(mce_next_interval, iv);
-	if (!iv)
-		return;
 	t->expires = round_jiffies(jiffies + iv);
 	add_timer_on(t, smp_processor_id());
 }
 
+static void __mcheck_cpu_init_timer(void)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned int cpu = smp_processor_id();
+
+	setup_timer(t, mce_timer_fn, cpu);
+	mce_start_timer(cpu, t);
+}
+
 /* Handle unconfigured int18 (should never happen) */
 static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 {
@@ -2275,12 +2280,8 @@ mce_cpu_callback(struct notifier_block *
 		break;
 	case CPU_DOWN_FAILED:
 	case CPU_DOWN_FAILED_FROZEN:
-		if (!mce_ignore_ce && check_interval) {
-			t->expires = round_jiffies(jiffies +
-					per_cpu(mce_next_interval, cpu));
-			add_timer_on(t, cpu);
-		}
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
+		mce_start_timer(cpu, t);
 		break;
 	case CPU_POST_DEAD:
 		/* intentionally ignoring frozen here */

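The shape of the split can be reduced to a few lines of plain C. A
minimal sketch with hypothetical names (poll_timer, timer_init,
timer_start); the point is that one-time setup lives in init while
arming lives in start, so a recovery path such as CPU_DOWN_FAILED can
re-arm without re-initializing:

#include <stdbool.h>
#include <stdio.h>

struct poll_timer {
	void (*fn)(int cpu);
	int cpu;
	long expires;
	bool armed;
};

static long interval = 5;	/* models check_interval * HZ */

static void timer_start(struct poll_timer *t, long now)
{
	if (!interval)		/* polling disabled, like !iv */
		return;
	t->expires = now + interval;
	t->armed = true;
}

static void timer_init(struct poll_timer *t, void (*fn)(int), int cpu,
		       long now)
{
	t->fn = fn;		/* one-time setup, like setup_timer() */
	t->cpu = cpu;
	timer_start(t, now);	/* then the shared arming path */
}

static void poll_fn(int cpu) { printf("poll on cpu %d\n", cpu); }

int main(void)
{
	struct poll_timer t;

	timer_init(&t, poll_fn, 0, 100);	/* boot/online path */
	t.armed = false;			/* model a failed offline */
	timer_start(&t, 200);			/* DOWN_FAILED re-arms only */
	t.fn(t.cpu);
	printf("armed=%d expires=%ld\n", t.armed, t.expires);
	return 0;
}
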
* Re: [patch 3/5] x86: mce: Split timer init
From: Borislav Petkov @ 2012-06-07 15:18 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:23PM +0000, Thomas Gleixner wrote:
> Split the timer init function into an init part and a start part, so
> the start part can replace the open-coded version in CPU_DOWN_FAILED.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Ok, it looks good to me and it has been lightly tested. In light of
last time's review debacle, however, I'd only very tentatively say:

Acked-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [patch 3/5] x86: mce: Split timer init
From: Hidetoshi Seto @ 2012-06-20 3:35 UTC
To: Thomas Gleixner
Cc: LKML, Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

(2012/06/07 6:53), Thomas Gleixner wrote:
> --- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
> +++ tip/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1554,23 +1554,28 @@ static void __mcheck_cpu_init_vendor(str
>  	}
>  }
>  
> -static void __mcheck_cpu_init_timer(void)
> +static void mce_start_timer(unsigned int cpu, struct timer_list *t)
>  {
> -	struct timer_list *t = &__get_cpu_var(mce_timer);
>  	unsigned long iv = check_interval * HZ;
>  
> -	setup_timer(t, mce_timer_fn, smp_processor_id());
> +	__this_cpu_write(mce_next_interval, iv);
>  
> -	if (mce_ignore_ce)
> +	if (mce_ignore_ce || !iv)
>  		return;
>  
> -	__this_cpu_write(mce_next_interval, iv);
> -	if (!iv)
> -		return;
>  	t->expires = round_jiffies(jiffies + iv);
>  	add_timer_on(t, smp_processor_id());

add_timer_on(t, cpu) ?

If so, using __this_cpu_write() here is wrong too.

>  }
>  

Thanks,
H.Seto

* [patch 2/5] x86: mce: Serialize mce injection
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-serialize-mce-injection.patch --]
[-- Type: text/plain, Size: 1006 bytes --]

raise_mce() fiddles with global state, but lacks any kind of
serialization.

Add a mutex around the raise_mce() call, so concurrent writers do not
stomp on each other's toes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -78,6 +78,7 @@ static void raise_exception(struct mce *
 }
 
 static cpumask_var_t mce_inject_cpumask;
+static DEFINE_MUTEX(mce_inject_mutex);
 
 static int mce_raise_notify(unsigned int cmd, struct pt_regs *regs)
 {
@@ -229,7 +230,10 @@ static ssize_t mce_write(struct file *fi
 	 * so do it a jiffie or two later everywhere.
 	 */
 	schedule_timeout(2);
+
+	mutex_lock(&mce_inject_mutex);
 	raise_mce(&m);
+	mutex_unlock(&mce_inject_mutex);
 
 	return usize;
 }

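The serialization here is the classic static-mutex-around-global-state
idiom. A minimal user-space sketch using pthreads; inject() and
injected_errors are illustrative stand-ins for raise_mce() and its
global state, not kernel names:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t inject_mutex = PTHREAD_MUTEX_INITIALIZER;
static int injected_errors;	/* stands in for raise_mce()'s global state */

static void *inject(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&inject_mutex);
	injected_errors++;	/* writers can no longer interleave here */
	pthread_mutex_unlock(&inject_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, inject, NULL);
	pthread_create(&b, NULL, inject, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("%d injections\n", injected_errors);
	return 0;
}
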
* [patch 5/5] x86: mce: Add cmci poll mode
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-cmci-poll-mode.patch --]
[-- Type: text/plain, Size: 7321 bytes --]

Still waits for explanation :)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   10 +++
 arch/x86/kernel/cpu/mcheck/mce.c          |   46 +++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   88 +++++++++++++++++++++++++++++-
 3 files changed, 137 insertions(+), 7 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-internal.h
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ tip/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,16 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long d
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long d
 	 * polling interval, otherwise increase the polling interval.
 	 */
 	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
 
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(str
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(str
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2272,8 +2306,8 @@ mce_cpu_callback(struct notifier_block *
 		mce_device_remove(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
Index: tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_b
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_TRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,73 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_TRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +145,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }

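The storm detector above is a simple rate check: more than
CMCI_STORM_TRESHOLD events within one CMCI_STORM_INTERVAL window trips
poll mode. A minimal user-space simulation of that check, with integer
ticks standing in for jiffies and all names illustrative:

#include <stdbool.h>
#include <stdio.h>

#define STORM_INTERVAL	100	/* "jiffies" per window, models 1 * HZ */
#define STORM_THRESHOLD	5

static unsigned int storm_cnt;
static long time_stamp;
static bool storm_active;

static bool storm_detect(long now)
{
	if (now <= time_stamp + STORM_INTERVAL) {
		storm_cnt++;
	} else {
		storm_cnt = 1;		/* window expired, restart count */
		time_stamp = now;
	}
	if (storm_cnt <= STORM_THRESHOLD)
		return false;
	storm_active = true;		/* would cmci_clear() + switch to poll */
	return true;
}

int main(void)
{
	long t;

	/* six interrupts inside one window: the sixth trips the detector */
	for (t = 0; t < 6; t++)
		printf("t=%ld storm=%d\n", t, storm_detect(t));
	printf("storm_active=%d\n", storm_active);
	return 0;
}
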
* Re: [patch 5/5] x86: mce: Add cmci poll mode
From: Borislav Petkov @ 2012-06-07 18:14 UTC
To: Thomas Gleixner
Cc: LKML, Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:24PM +0000, Thomas Gleixner wrote:

[ … ]

> Index: tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
> ===================================================================
> --- tip.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -15,6 +15,8 @@
>  #include <asm/msr.h>
>  #include <asm/mce.h>
>  
> +#include "mce-internal.h"
> +
>  /*
>   * Support for Intel Correct Machine Check Interrupts. This allows
>   * the CPU to raise an interrupt when a corrected machine check happened.
> @@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_b
>   */
>  static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>  
> -#define CMCI_THRESHOLD 1
> +#define CMCI_THRESHOLD		1
> +#define CMCI_POLL_INTERVAL	(30 * HZ)
> +#define CMCI_STORM_INTERVAL	(1 * HZ)
> +#define CMCI_STORM_TRESHOLD	5

Just a spelling correction:

	CMCI_STORM_THRESHOLD

> +
> +static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
> +static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
> +static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> +
> +enum {
> +	CMCI_STORM_NONE,
> +	CMCI_STORM_ACTIVE,
> +	CMCI_STORM_SUBSIDED,
> +};
> +
> +static atomic_t cmci_storm_on_cpus;
>  
>  static int cmci_supported(int *banks)
>  {
> @@ -53,6 +70,73 @@ static int cmci_supported(int *banks)
>  	return !!(cap & MCG_CMCI_P);
>  }
>  
> +void mce_intel_cmci_poll(void)
> +{
> +	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
> +		return;
> +	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
> +}
> +
> +unsigned long mce_intel_adjust_timer(unsigned long interval)
> +{
> +	if (interval < CMCI_POLL_INTERVAL)
> +		return interval;
> +
> +	switch (__this_cpu_read(cmci_storm_state)) {
> +	case CMCI_STORM_ACTIVE:
> +		/*
> +		 * We switch back to interrupt mode once the poll timer has
> +		 * silenced itself. That means no events recorded and the
> +		 * timer interval is back to our poll interval.
> +		 */
> +		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> +		atomic_dec(&cmci_storm_on_cpus);
> +
> +	case CMCI_STORM_SUBSIDED:
> +		/*
> +		 * We wait for all cpus to go back to SUBSIDED
> +		 * state. When that happens we switch back to
> +		 * interrupt mode.
> +		 */
> +		if (!atomic_read(&cmci_storm_on_cpus)) {
> +			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
> +			cmci_reenable();
> +			cmci_recheck();
> +		}
> +		return CMCI_POLL_INTERVAL;
> +	default:
> +		/*
> +		 * We have shiny weather, let the poll do whatever it
> +		 * thinks.
> +		 */
> +		return interval;
> +	}
> +}
> +
> +static bool cmci_storm_detect(void)
> +{
> +	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
> +	unsigned long ts = __this_cpu_read(cmci_time_stamp);
> +	unsigned long now = jiffies;
> +
> +	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
> +		cnt++;
> +	} else {
> +		cnt = 1;
> +		__this_cpu_write(cmci_time_stamp, now);
> +	}
> +	__this_cpu_write(cmci_storm_cnt, cnt);
> +
> +	if (cnt <= CMCI_STORM_TRESHOLD)

and here too.

> +		return false;
> +
> +	cmci_clear();
> +	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
> +	atomic_inc(&cmci_storm_on_cpus);
> +	mce_timer_kick(CMCI_POLL_INTERVAL);
> +	return true;
> +}
> +
>  /*
>   * The interrupt handler. This is called on every event.
>   * Just call the poller directly to log any events.
> @@ -61,6 +145,8 @@ static int cmci_supported(int *banks)
>   */
>  static void intel_threshold_interrupt(void)
>  {
> +	if (cmci_storm_detect())
> +		return;
>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>  	mce_notify_irq();
>  }

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* [patch 4/5] x86: mce: Remove the frozen cases in the hotplug code
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-remove-frozen-cases.patch --]
[-- Type: text/plain, Size: 1440 bytes --]

No point in having double cases if we can simply mask the FROZEN bit
out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2260,34 +2260,32 @@ mce_cpu_callback(struct notifier_block *
 	unsigned int cpu = (unsigned long)hcpu;
 	struct timer_list *t = &per_cpu(mce_timer, cpu);
 
-	switch (action) {
+	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
 		mce_device_create(cpu);
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		break;
 	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-	case CPU_DOWN_PREPARE_FROZEN:
 		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
 		break;
 	case CPU_DOWN_FAILED:
-	case CPU_DOWN_FAILED_FROZEN:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
 		mce_start_timer(cpu, t);
 		break;
-	case CPU_POST_DEAD:
+	}
+
+	if (action == CPU_POST_DEAD) {
 		/* intentionally ignoring frozen here */
 		cmci_rediscover(cpu);
-		break;
 	}
+
 	return NOTIFY_OK;
 }

* Re: [patch 4/5] x86: mce: Remove the frozen cases in the hotplug code
From: Borislav Petkov @ 2012-06-07 17:49 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:24PM +0000, Thomas Gleixner wrote:
> No point in having double cases if we can simply mask the FROZEN bit
> out.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Chen Gong @ 2012-06-07 10:08 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On 2012/6/7 5:53, Thomas Gleixner wrote:
> Sorry for the resend, I guess I need more sleep or vacation or both :(
>
> The following series fixes a few interesting bugs (found by review in
> the context of the CMCI poll effort), cleans up the timer/hotplug
> code, and finishes with a consolidated version of the CMCI poll
> implementation. The series is based on
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
>
> which contains the bugfix for the dropped timer interval init.
>
> Thanks,
>
> 	tglx

I tested the latest patch series based on your tip tree. The basic
logic is correct, as we expected :-).

But during the CPU online/offline test I found an issue: after *STORM*
mode is entered, the machine can't come back from *STORM* mode to
normal interrupt mode. At least the following issue exists: when
*STORM* mode is entered and one CPU is taken offline during this
period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
stuck on the offlined CPU. So we should detect this situation and
decrement *cmci_storm_on_cpus* at the proper time.

BTW, even when I online the CPU again in the above situation, normal
CMCI still doesn't come back. Strange.

I still have another question. When we handle the following case:

  mce_cpu_callback(struct notifier_block *
  		mce_device_remove(cpu);
  		break;
  	case CPU_DOWN_PREPARE:
  -		del_timer_sync(t);
  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
  +		del_timer_sync(t);
  		break;

where do we add this timer back? I can't find it in "case CPU_ONLINE".

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Borislav Petkov @ 2012-06-07 13:35 UTC
To: Chen Gong
Cc: Thomas Gleixner, LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On Thu, Jun 07, 2012 at 06:08:15PM +0800, Chen Gong wrote:
> But during the CPU online/offline test I found an issue: after *STORM*
> mode is entered, the machine can't come back from *STORM* mode to
> normal interrupt mode. At least the following issue exists: when
> *STORM* mode is entered and one CPU is taken offline during this
> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
> stuck on the offlined CPU.

Stupid question: do you even want to allow offlining of CPUs while the
storm is raging? IOW, wouldn't it be better to disallow hotplug when
the storm happens so that all online CPUs, which is all CPUs on the
system, can work out the storm conditions faster so that the system
gets back to normal faster...

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* RE: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Luck, Tony @ 2012-06-07 16:22 UTC
To: Borislav Petkov, Chen Gong
Cc: Thomas Gleixner, LKML, x86@kernel.org, Peter Zijlstra

> Stupid question: do you even want to allow offlining of CPUs while the
> storm is raging? IOW, wouldn't it be better to disallow hotplug when
> the storm happens so that all online CPUs, which is all CPUs on the
> system, can work out the storm conditions faster so that the system
> gets back to normal faster...

What if a processor is the source of the storm (we can get CMCI from
corrected errors in the cache)? Then our action to abate the storm
would be to take the cpu offline.

-Tony

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Thomas Gleixner @ 2012-06-08 7:49 UTC
To: Chen Gong; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On Thu, 7 Jun 2012, Chen Gong wrote:
> But during the CPU online/offline test I found an issue: after *STORM*
> mode is entered, the machine can't come back from *STORM* mode to
> normal interrupt mode. At least the following issue exists: when
> *STORM* mode is entered and one CPU is taken offline during this
> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
> stuck on the offlined CPU. So we should detect this situation and
> decrement *cmci_storm_on_cpus* at the proper time.

Yes, we need to reset the storm state as well I think.

> BTW, even when I online the CPU again in the above situation, normal
> CMCI still doesn't come back. Strange.

That's weird.

> I still have another question. When we handle the following case:
>
>   mce_cpu_callback(struct notifier_block *
>   		mce_device_remove(cpu);
>   		break;
>   	case CPU_DOWN_PREPARE:
>   -		del_timer_sync(t);
>   		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
>   +		del_timer_sync(t);
>   		break;
>
> where do we add this timer back? I can't find it in "case CPU_ONLINE".

The timer gets added back via mcheck_cpu_init(), which is called on
the newly onlined cpu from smp_callin().

Thanks,

	tglx

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Chen Gong @ 2012-06-11 5:46 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On 2012/6/8 15:49, Thomas Gleixner wrote:
> On Thu, 7 Jun 2012, Chen Gong wrote:
>> But during the CPU online/offline test I found an issue: after *STORM*
>> mode is entered, the machine can't come back from *STORM* mode to
>> normal interrupt mode. At least the following issue exists: when
>> *STORM* mode is entered and one CPU is taken offline during this
>> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
>> stuck on the offlined CPU. So we should detect this situation and
>> decrement *cmci_storm_on_cpus* at the proper time.
>
> Yes, we need to reset the storm state as well I think.
>
>> BTW, even when I online the CPU again in the above situation, normal
>> CMCI still doesn't come back. Strange.
>
> That's weird.

Here is the reason: when a CPU goes offline and then online again,
CMCI is re-enabled, which opens a backdoor that lets *CMCI storm
detect* trigger again. As a result the counter cmci_storm_cnt is
increased again before it has decreased to zero, so the counter never
gets back to zero, regardless of the CPU coming online. Obviously, we
must keep CMCI from being enabled again before the CMCI storm ends,
even across CPU online/offline. Besides this, we still need to remove
the count when a CPU is offlined. See the tmp patch below (no
guarantee it is OK :-)):

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..0051b2d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,11 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +83,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		*status = CMCI_STORM_HCPU_NONE;
+		atomic_dec(&cmci_storm_on_cpus);
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -132,6 +149,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;

>> I still have another question. When we handle the following case:
>> [...]
>> where do we add this timer back? I can't find it in "case CPU_ONLINE".
>
> The timer gets added back via mcheck_cpu_init(), which is called on
> the newly onlined cpu from smp_callin().

Stupid me! I must have been too busy that day and lost my head :-(.
Thanks, Thomas, for your explanation!

* [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-14 13:49 UTC
To: tglx; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel, Chen Gong

This patch is based on the tip tree and the previous 5 patches.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 +
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   49 ++++++++++++++++++++++++++++-
 3 files changed, 51 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..ef687df 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,12 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+	CMCI_STORM_HCPU_SUBSIDED,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +84,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+		*status = CMCI_STORM_HCPU_SUBSIDED;
+		atomic_dec(&cmci_storm_on_cpus);
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -90,6 +108,7 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * timer interval is back to our poll interval.
 		 */
 		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
 		atomic_dec(&cmci_storm_on_cpus);
 
 	case CMCI_STORM_SUBSIDED:
@@ -109,6 +128,21 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * We have shiny weather, let the poll do whatever it
 		 * thinks.
 		 */
+
+		/*
+		 * If one CPU is offlined and onlined during CMCI storm, it
+		 * has no chance to enable CMCI again. Here is the portal.
+		 */
+		if ((!atomic_read(&cmci_storm_on_cpus)) &&
+		    (CMCI_STORM_HCPU_SUBSIDED ==
+		     __this_cpu_read(cmci_storm_hcpu_status))) {
+			__this_cpu_write(cmci_storm_hcpu_status,
+					 CMCI_STORM_HCPU_NONE);
+			cmci_reenable();
+			apic_write(APIC_LVTCMCI,
+				   THRESHOLD_APIC_VECTOR|APIC_DM_FIXED);
+			cmci_recheck();
+		}
 		return interval;
 	}
 }
@@ -132,6 +166,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;
@@ -259,7 +294,9 @@ void cmci_rediscover(int dying)
 	int cpu;
 	cpumask_var_t old;
 
-	if (!cmci_supported(&banks))
+	if (!cmci_supported(&banks) ||
+	    /* if still in CMCI storm, don't enable it */
+	    (0 != atomic_read(&cmci_storm_on_cpus)))
 		return;
 	if (!alloc_cpumask_var(&old, GFP_KERNEL))
 		return;
@@ -297,6 +334,16 @@ static void intel_init_cmci(void)
 		return;
 
 	mce_threshold_vector = intel_threshold_interrupt;
+	/* if still in CMCI storm, don't enable it */
+	if (0 != atomic_read(&cmci_storm_on_cpus))
+		return;
+	/*
+	 * If one CPU is offlined during CMCI storm and onlined after
+	 * CMCI storm, this *hCPU* status must be updated to avoid
+	 * reenabling CMCI twice.
+	 */
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
+
 	cmci_discover(banks, 1);
 	/*
 	 * For CPU #0 this runs with still disabled APIC, but that's
-- 
1.7.1

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Thomas Gleixner @ 2012-06-14 14:07 UTC
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On Thu, 14 Jun 2012, Chen Gong wrote:
> This patch is based on the tip tree and the previous 5 patches.

You really don't need all this complexity to handle that.

The main thing is that you clear the storm state and adjust the storm
counter when the cpu goes offline (in case the state is ACTIVE).

When it comes online again then you can simply let it restart cmci. If
the storm on this cpu (or node) still exists then it will notice and
everything falls in place.

Keep it simple, really!

Thanks,

	tglx

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-15 6:51 UTC
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/14 22:07, Thomas Gleixner wrote:
> On Thu, 14 Jun 2012, Chen Gong wrote:
>> This patch is based on the tip tree and the previous 5 patches.
>
> You really don't need all this complexity to handle that. The main
> thing is that you clear the storm state and adjust the storm counter
> when the cpu goes offline (in case the state is ACTIVE).
>
> When it comes online again then you can simply let it restart cmci. If
> the storm on this cpu (or node) still exists then it will notice and
> everything falls in place.

I have tested several different scenarios. If the storm on this CPU
still exists, it triggers a CMCI and broadcasts it to the sibling
CPUs, which means the counter *cmci_storm_on_cpus* increases beyond
its upper limit. E.g. on a 2-socket SandyBridge-EP system (one socket
has 8 cores and 16 threads), inject one error on one socket and you
can watch *cmci_storm_on_cpus* = 16 because of the CMCI broadcast.
During this time, offline and online one CPU on this socket: first
*cmci_storm_on_cpus* = 15 because of the offline in ACTIVE status, and
then *cmci_storm_on_cpus* = 31 because CMCI is activated again on
online. That's why I have to disable CMCI during the whole
online/offline sequence until the CMCI storm has subsided. Frankly,
the logic is a bit complex, which is why I wrote so many comments, to
avoid forgetting it after some time :-)

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Thomas Gleixner @ 2012-06-15 9:55 UTC
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2778 bytes --]

On Fri, 15 Jun 2012, Chen Gong wrote:
> On 2012/6/14 22:07, Thomas Gleixner wrote:
>> On Thu, 14 Jun 2012, Chen Gong wrote:
>>> This patch is based on the tip tree and the previous 5 patches.
>>
>> You really don't need all this complexity to handle that. The main
>> thing is that you clear the storm state and adjust the storm counter
>> when the cpu goes offline (in case the state is ACTIVE).
>>
>> When it comes online again then you can simply let it restart cmci. If
>> the storm on this cpu (or node) still exists then it will notice and
>> everything falls in place.
>
> I have tested several different scenarios. If the storm on this CPU
> still exists, it triggers a CMCI and broadcasts it to the sibling
> CPUs, which means the counter *cmci_storm_on_cpus* increases beyond
> its upper limit. E.g. on a 2-socket SandyBridge-EP system (one socket
> has 8 cores and 16 threads), inject one error on one socket and you
> can watch *cmci_storm_on_cpus* = 16 because of the CMCI broadcast.
> During this time, offline and online one CPU on this socket: first
> *cmci_storm_on_cpus* = 15 because of the offline in ACTIVE status, and
> then *cmci_storm_on_cpus* = 31 because CMCI is activated again on
> online. That's why I have to disable CMCI during the whole
> online/offline sequence until the CMCI storm has subsided. Frankly,
> the logic is a bit complex, which is why I wrote so many comments, to
> avoid forgetting it after some time :-)

This does not make any sense at all. What you are saying is that even
if CPU0 ran cmci_clear(), a CMCI raised on CPU1 will cause the CMCI
vector to be triggered on CPU0.

So how does the whole storm machinery work in the following case:

    CPU0                    CPU1

    cmci incoming           cmci incoming
    storm detected          no storm detected yet
    cmci_clear()
    switch to poll
                            cmci raised

So according to your explanation that would cause the cmci vector to
be broadcasted to CPU0 as well. Now that would cause the counter to
get a bogus increment, right ?

So instead of hacking insane crap into the code, we have simply to do
the obvious Right Thing:

Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -119,6 +119,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {

That will prevent damage under all circumstances, cpu hotplug
included.

But that's too simple and comprehensible I fear.

Thanks,

	tglx

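The guard works because entering storm mode becomes idempotent per
CPU: a second interrupt on a CPU already in ACTIVE state returns early
instead of incrementing the global counter again. A minimal
single-threaded model with illustrative names, not the kernel code:

#include <stdio.h>

enum { STORM_NONE, STORM_ACTIVE };

static int state;		/* per-CPU in the kernel; one CPU modeled */
static int storm_on_cpus;	/* models the global atomic_t */

static int storm_enter(void)
{
	if (state != STORM_NONE)	/* the guard Thomas adds */
		return 1;		/* already counted, do not recount */
	state = STORM_ACTIVE;
	storm_on_cpus++;		/* exactly one increment per CPU */
	return 1;
}

int main(void)
{
	storm_enter();
	storm_enter();	/* e.g. a CMCI re-raised after cmci_clear() */
	printf("storm_on_cpus = %d\n", storm_on_cpus);	/* 1, not 2 */
	return 0;
}
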
* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-18 6:42 UTC
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[...]
> So according to your explanation that would cause the cmci vector to
> be broadcasted to CPU0 as well. Now that would cause the counter to
> get a bogus increment, right ?
>
> So instead of hacking insane crap into the code, we have simply to do
> the obvious Right Thing:
>
> @@ -119,6 +119,9 @@ static bool cmci_storm_detect(void)
>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>  	unsigned long now = jiffies;
>  
> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
> +		return true;
> +
>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>  		cnt++;
>  	} else {
>
> That will prevent damage under all circumstances, cpu hotplug
> included.
>
> But that's too simple and comprehensible I fear.

Oh, yes, your suggestion simplifies my logic, but not all of it.
Obviously, when hotplug happens it can be considered as quitting the
CMCI storm in another way, so the corresponding counter and status
should be decreased on that path as well. In my original patch I
defined three statuses:

enum {
	CMCI_STORM_HCPU_NONE,
	CMCI_STORM_HCPU_ACTIVE,
	CMCI_STORM_HCPU_SUBSIDED,
};

I used CMCI_STORM_HCPU_SUBSIDED to describe a CPU stalled in hotplug
progress, to keep CMCI disabled during the whole hotplug sequence, but
according to your suggestion CMCI_STORM_HCPU_SUBSIDED can now be
removed (FIXME: because CMCI can be enabled, if a CPU goes offline and
then online again during a CMCI storm it will enter CMCI storm status
right away. This simplifies the logic at the cost of a little
performance degradation).

I will send the 2nd patch based on the previous 5 patches in a
separate mail.

* [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-18 6:45 UTC
To: tglx; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel, Chen Gong

This patch is based on the previous 5 patches and tuned based on
Thomas' suggestion.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   22 ++++++++++++++++++++++
 3 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..0493525 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,11 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +83,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+		*status = CMCI_STORM_HCPU_NONE;
+		atomic_dec(&cmci_storm_on_cpus);
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -90,6 +107,7 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * timer interval is back to our poll interval.
 		 */
 		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
 		atomic_dec(&cmci_storm_on_cpus);
 
 	case CMCI_STORM_SUBSIDED:
@@ -119,6 +137,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {
@@ -132,6 +153,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;
-- 
1.7.1

* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18  6:45 ` [PATCH V2] " Chen Gong
@ 2012-06-18  8:00 ` Thomas Gleixner
  2012-06-18 10:13 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2012-06-18 8:00 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On Mon, 18 Jun 2012, Chen Gong wrote:
> index 92d8b5c..0493525 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);

Why do you insist on having another status variable, which actually
does nothing but obfuscate the code?

Look at the usage sites:

> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);

> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);

So it's a shadow variable of cmci_storm_state which adds no value. And
all you do with it is:

> +void mce_intel_hcpu_update(unsigned long cpu)
> +{
> +	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
> +
> +	if (*status == CMCI_STORM_HCPU_ACTIVE) {

This can be checked with the existing variable as well. And your check
leaves CMCI_STORM_SUBSIDED around as a stale value.

This simply wants to check

	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
		atomic_dec(&cmci_storm_on_cpus);

and unconditionally clear the state

	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;

Right?

^ permalink raw reply	[flat|nested] 33+ messages in thread
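Spelled out, those two fragments combine into a single function, which
is what the V3 patch further down adopts. The invariant is that
cmci_storm_on_cpus counts exactly the cpus currently in ACTIVE state:
an offlined cpu drops its vote only if it was ACTIVE (a SUBSIDED cpu
already dropped it), while the per-cpu state is wiped unconditionally
so that no stale SUBSIDED value survives an offline/online cycle:

void mce_intel_hcpu_update(unsigned long cpu)
{
	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
		atomic_dec(&cmci_storm_on_cpus);

	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
}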
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18  8:00 ` Thomas Gleixner
@ 2012-06-18 10:13 ` Chen Gong
  2012-06-18 12:23 ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-06-18 10:13 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/18 16:00, Thomas Gleixner wrote:
> On Mon, 18 Jun 2012, Chen Gong wrote:
>> index 92d8b5c..0493525 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
>> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
>
> Why do you insist on having another status variable, which actually
> does nothing but obfuscate the code?
>
> Look at the usage sites:
>
>> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
>> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);

Because here I can't substitute CMCI_STORM_HCPU_NONE with
CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 10:13 ` Chen Gong
@ 2012-06-18 12:23 ` Thomas Gleixner
  2012-06-19  6:05 ` Chen Gong
  2012-06-19  6:09 ` [PATCH V3] " Chen Gong
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2012-06-18 12:23 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1098 bytes --]

On Mon, 18 Jun 2012, Chen Gong wrote:
> On 2012/6/18 16:00, Thomas Gleixner wrote:
> > On Mon, 18 Jun 2012, Chen Gong wrote:
> > > index 92d8b5c..0493525 100644
> > > --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> > > +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> > > @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
> > >  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
> > >  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
> > >  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> > > +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
> >
> > Why do you insist on having another status variable, which actually
> > does nothing but obfuscate the code?
> >
> > Look at the usage sites:
> >
> > > 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> > > +	__this_cpu_write(cmci_storm_hcpu_status,
> > > 	CMCI_STORM_HCPU_NONE);
>
> Because here I can't substitute CMCI_STORM_HCPU_NONE with
> CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.

Have you actually read what I wrote later? You do not need that extra
state, period.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 12:23 ` Thomas Gleixner
@ 2012-06-19  6:05 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-06-19 6:05 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/18 20:23, Thomas Gleixner wrote:
> On Mon, 18 Jun 2012, Chen Gong wrote:
>> On 2012/6/18 16:00, Thomas Gleixner wrote:
>>> On Mon, 18 Jun 2012, Chen Gong wrote:
>>>> index 92d8b5c..0493525 100644
>>>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>>>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>>>> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>>>>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>>>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>>>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
>>>> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
>>>
>>> Why do you insist on having another status variable, which actually
>>> does nothing but obfuscate the code?
>>>
>>> Look at the usage sites:
>>>
>>>> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
>>>> +	__this_cpu_write(cmci_storm_hcpu_status,
>>>> 	CMCI_STORM_HCPU_NONE);
>>
>> Because here I can't substitute CMCI_STORM_HCPU_NONE with
>> CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.
>
> Have you actually read what I wrote later? You do not need that extra
> state, period.
>

Oh, yes, you are right. I didn't take enough time over your reply; I
apologize for that. Unconditionally clearing the state avoids a stale
CMCI_STORM_SUBSIDED value. Your logic is fine and simplifies my
previous, unnecessarily complicated version. Thanks a lot. I will send
the V3 tmp patch, based on your previous 5 patches, right now. If it is
OK, you can merge it into your 5th patch, and I can write the changelog
for it if you like.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 12:23 ` Thomas Gleixner
  2012-06-19  6:05 ` Chen Gong
@ 2012-06-19  6:09 ` Chen Gong
  2012-07-04  8:12 ` Chen Gong
  1 sibling, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-06-19 6:09 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel, Chen Gong

v1 -> v3:
Thanks very much to Thomas for the suggestion to simplify the whole
logic.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
 3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread
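A quick way to convince oneself that this keeps the bookkeeping
balanced is a stand-alone harness like the one below; plain ints stand
in for the per-cpu and atomic primitives, and the offline scenario is
made up for the test:

#include <assert.h>

enum { CMCI_STORM_NONE, CMCI_STORM_ACTIVE, CMCI_STORM_SUBSIDED };

#define NR_CPUS 2
static int cmci_storm_state[NR_CPUS];	/* per-cpu in the real code  */
static int cmci_storm_on_cpus;		/* atomic_t in the real code */

static void mce_intel_hcpu_update(unsigned long cpu)
{
	if (cmci_storm_state[cpu] == CMCI_STORM_ACTIVE)
		cmci_storm_on_cpus--;

	/* Unconditional: also wipes a stale SUBSIDED */
	cmci_storm_state[cpu] = CMCI_STORM_NONE;
}

int main(void)
{
	/* cpu0 is mid-storm, cpu1 stormed and already subsided */
	cmci_storm_state[0] = CMCI_STORM_ACTIVE;
	cmci_storm_on_cpus++;
	cmci_storm_state[1] = CMCI_STORM_SUBSIDED;

	mce_intel_hcpu_update(0);	/* offline during the storm  */
	mce_intel_hcpu_update(1);	/* offline after it subsided */

	assert(cmci_storm_on_cpus == 0);
	assert(cmci_storm_state[0] == CMCI_STORM_NONE);
	assert(cmci_storm_state[1] == CMCI_STORM_NONE);
	return 0;
}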
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-06-19  6:09 ` [PATCH V3] " Chen Gong
@ 2012-07-04  8:12 ` Chen Gong
  2012-07-16  3:16 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-04 8:12 UTC (permalink / raw)
To: Chen Gong; +Cc: tglx, tony.luck, borislav.petkov, peterz, x86, linux-kernel

On 2012/6/19 14:09, Chen Gong wrote:
> v1 -> v3:
> Thanks very much to Thomas for the suggestion to simplify the whole
> logic.
>
> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
>  arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
>  arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> index 2cd73ce..6a05c1d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
> +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> @@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
>  #ifdef CONFIG_X86_MCE_INTEL
>  unsigned long mce_intel_adjust_timer(unsigned long interval);
>  void mce_intel_cmci_poll(void);
> +void mce_intel_hcpu_update(unsigned long cpu);
>  #else
>  # define mce_intel_adjust_timer mce_adjust_timer_default
>  static inline void mce_intel_cmci_poll(void) { }
> +static inline void mce_intel_hcpu_update(unsigned long cpu) { }
>  #endif
>
>  void mce_timer_kick(unsigned long interval);
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index e3f8b94..5e22d99 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
>  		if (threshold_cpu_callback)
>  			threshold_cpu_callback(action, cpu);
>  		mce_device_remove(cpu);
> +		mce_intel_hcpu_update(cpu);
>  		break;
>  	case CPU_DOWN_PREPARE:
>  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> index 92d8b5c..693bc7d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>  }
>
> +void mce_intel_hcpu_update(unsigned long cpu)
> +{
> +	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
> +		atomic_dec(&cmci_storm_on_cpus);
> +
> +	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
> +}
> +
>  unsigned long mce_intel_adjust_timer(unsigned long interval)
>  {
>  	if (interval < CMCI_POLL_INTERVAL)
> @@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>  	unsigned long now = jiffies;
>
> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
> +		return true;
> +
>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>  		cnt++;
>  	} else {
>

Hi Thomas,

How are this patch and the whole patch series going? I don't know
whether you have updated them or not.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-04  8:12 ` Chen Gong
@ 2012-07-16  3:16 ` Chen Gong
  2012-07-16  8:22 ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-16 3:16 UTC (permalink / raw)
To: Chen Gong; +Cc: tglx, tony.luck, borislav.petkov, peterz, x86, linux-kernel

On 2012/7/4 16:12, Chen Gong wrote:
> On 2012/6/19 14:09, Chen Gong wrote:
>> v1 -> v3:
>> Thanks very much to Thomas for the suggestion to simplify the whole
>> logic.
>>
>> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
>> ---
>>  arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
>>  arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
>>  arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
>>  3 files changed, 14 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> index 2cd73ce..6a05c1d 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> @@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
>>  #ifdef CONFIG_X86_MCE_INTEL
>>  unsigned long mce_intel_adjust_timer(unsigned long interval);
>>  void mce_intel_cmci_poll(void);
>> +void mce_intel_hcpu_update(unsigned long cpu);
>>  #else
>>  # define mce_intel_adjust_timer mce_adjust_timer_default
>>  static inline void mce_intel_cmci_poll(void) { }
>> +static inline void mce_intel_hcpu_update(unsigned long cpu) { }
>>  #endif
>>
>>  void mce_timer_kick(unsigned long interval);
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
>> index e3f8b94..5e22d99 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
>> @@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
>>  		if (threshold_cpu_callback)
>>  			threshold_cpu_callback(action, cpu);
>>  		mce_device_remove(cpu);
>> +		mce_intel_hcpu_update(cpu);
>>  		break;
>>  	case CPU_DOWN_PREPARE:
>>  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> index 92d8b5c..693bc7d 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> @@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
>>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>>  }
>>
>> +void mce_intel_hcpu_update(unsigned long cpu)
>> +{
>> +	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
>> +		atomic_dec(&cmci_storm_on_cpus);
>> +
>> +	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
>> +}
>> +
>>  unsigned long mce_intel_adjust_timer(unsigned long interval)
>>  {
>>  	if (interval < CMCI_POLL_INTERVAL)
>> @@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
>>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>>  	unsigned long now = jiffies;
>>
>> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
>> +		return true;
>> +
>>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>>  		cnt++;
>>  	} else {
>>
>
> Hi Thomas,
>
> How are this patch and the whole patch series going? I don't know
> whether you have updated them or not.

Hi Thomas,

Do you still care about this thread? Any plan to update it? I hope to
get your feedback soon.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-16  3:16 ` Chen Gong
@ 2012-07-16  8:22 ` Thomas Gleixner
  2012-07-17 21:47 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2012-07-16 8:22 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel

On Mon, 16 Jul 2012, Chen Gong wrote:
>
> Do you still care about this thread? Any plan to update it? I hope to
> get your feedback soon.

Can you please collect the latest series and send it to lkml, Tony and
Boris. I think it's ok as is now.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-16  8:22 ` Thomas Gleixner
@ 2012-07-17 21:47 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-17 21:47 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 819 bytes --]

On Mon, Jul 16, 2012 at 10:22:16AM +0200, Thomas Gleixner wrote:
> Date: Mon, 16 Jul 2012 10:22:16 +0200 (CEST)
> From: Thomas Gleixner <tglx@linutronix.de>
> To: Chen Gong <gong.chen@linux.intel.com>
> cc: tony.luck@intel.com, borislav.petkov@amd.com, peterz@infradead.org,
>     x86@kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
> User-Agent: Alpine 2.02 (LFD 1266 2009-07-14)
>
> On Mon, 16 Jul 2012, Chen Gong wrote:
> >
> > Do you still care about this thread? Any plan to update it? I hope
> > to get your feedback soon.
>
> Can you please collect the latest series and send it to lkml, Tony and
> Boris. I think it's ok as is now.
>
> Thanks,
>
> 	tglx

Fine, but which base should I use, the -tip tree or Linus' tree?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version @ 2012-07-18 19:59 Chen Gong 2012-07-18 19:59 ` [PATCH 5/5] x86: mce: Add cmci poll mode Chen Gong 0 siblings, 1 reply; 33+ messages in thread From: Chen Gong @ 2012-07-18 19:59 UTC (permalink / raw) To: tglx; +Cc: tony.luck, bp, x86, linux-kernel [PATCH 1/5] x86: mce: Disable preemption when calling raise_local() [PATCH 2/5] x86: mce: Serialize mce injection [PATCH 3/5] x86: mce: Split timer init [PATCH 4/5] x86: mce: Remove the frozen cases in the hotplug code [PATCH 5/5] x86: mce: Add cmci poll mode The following series fixes a few interesting bugs (found by review in context of the CMCI poll effort) and a cleanup to the timer/hotplug code followed by a consolidated version of the CMCI poll implementation. This series is based on Linus' tree. git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 5/5] x86: mce: Add cmci poll mode
  2012-07-18 19:59 [V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version Chen Gong
@ 2012-07-18 19:59 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-18 19:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel, Chen Gong

When CMCIs arrive too fast to handle, CMCI should be disabled to avoid
hanging the whole system. In the meantime, a CMCI poll timer can be
employed to collect CMCI events periodically. When no more CMCIs occur,
the CMCI handler can be switched from poll mode back to interrupt mode.

For now every CPU core owns one poll timer, but one poll timer per
package (or socket) should in fact be enough, because CMCI is broadcast
to all threads on the same socket. So if one cpu has a problem, all the
cpus on the same socket have a problem.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   12 ++++
 arch/x86/kernel/cpu/mcheck/mce.c          |   47 ++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   99 ++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index ed44c8a..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,18 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index eff73e7..95738db0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long data)
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long data)
 	 * polling interval, otherwise increase the polling interval.
 	 */
 	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
-
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2270,10 +2304,11 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_THRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,84 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_THRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +156,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread
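To see how the pieces of this patch interact, here is a small
user-space model of the timer dynamics. It is a sketch under stated
assumptions: fake HZ, a single cpu, simplified stand-ins for
mce_intel_adjust_timer() and the quiet path of mce_timer_fn(), and it
assumes the storm ends right after detection, so every tick finds no
events. The fall-through from ACTIVE to SUBSIDED mirrors the patch:

#include <stdio.h>

#define HZ			1000UL
#define CHECK_INTERVAL		(5 * 60 * HZ)	/* 5 minutes */
#define CMCI_POLL_INTERVAL	(30 * HZ)

enum { CMCI_STORM_NONE, CMCI_STORM_ACTIVE, CMCI_STORM_SUBSIDED };

static int storm_state = CMCI_STORM_ACTIVE;	/* storm just detected */
static int storm_on_cpus = 1;			/* atomic_t stand-in   */

static unsigned long adjust_timer(unsigned long interval)
{
	if (interval < CMCI_POLL_INTERVAL)
		return interval;

	switch (storm_state) {
	case CMCI_STORM_ACTIVE:
		/* Poll timer has grown quiet: drop this cpu's vote */
		storm_state = CMCI_STORM_SUBSIDED;
		storm_on_cpus--;
		/* fall through, as in the patch */
	case CMCI_STORM_SUBSIDED:
		if (!storm_on_cpus)
			storm_state = CMCI_STORM_NONE;	/* cmci_reenable() */
		return CMCI_POLL_INTERVAL;
	default:
		return interval;	/* shiny weather */
	}
}

/* One quiet mce_timer_fn() tick: double the interval, then let the
 * vendor hook adjust it. */
static unsigned long quiet_tick(unsigned long iv)
{
	iv = iv * 2 < CHECK_INTERVAL ? iv * 2 : CHECK_INTERVAL;
	return adjust_timer(iv);
}

int main(void)
{
	unsigned long iv = CMCI_POLL_INTERVAL;	/* set via mce_timer_kick() */
	int i;

	for (i = 0; i < 4; i++) {
		printf("tick %d: state=%d next interval=%lu\n",
		       i, storm_state, iv);
		iv = quiet_tick(iv);
	}
	return 0;
}

The first quiet tick steps this cpu ACTIVE -> SUBSIDED and, with no
other cpu still storming, straight on to NONE where CMCI gets
re-enabled; from then on the interval simply doubles back toward
check_interval.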
* [RESEND PATCH 0/5 V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version
@ 2012-07-19 17:59 Chen Gong
  2012-07-19 17:59 ` [PATCH 5/5] x86: mce: Add cmci poll mode Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-19 17:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel

[PATCH 1/5] x86: mce: Disable preemption when calling raise_local()
[PATCH 2/5] x86: mce: Serialize mce injection
[PATCH 3/5] x86: mce: Split timer init
[PATCH 4/5] x86: mce: Remove the frozen cases in the hotplug code
[PATCH 5/5] x86: mce: Add cmci poll mode

The following series fixes a few interesting bugs (found by review in
context of the CMCI poll effort) and a cleanup to the timer/hotplug
code followed by a consolidated version of the CMCI poll
implementation. This series is based on Linus' tree.

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

Thanks to Boris for pointing out how to make git commits with the
correct authorship :-).

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [PATCH 5/5] x86: mce: Add cmci poll mode
  2012-07-19 17:59 [RESEND PATCH 0/5 V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version Chen Gong
@ 2012-07-19 17:59 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-19 17:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel, Chen Gong

When CMCIs arrive too fast to handle, CMCI should be disabled to avoid
hanging the whole system. In the meantime, a CMCI poll timer can be
employed to collect CMCI events periodically. When no more CMCIs occur,
the CMCI handler can be switched from poll mode back to interrupt mode.

For now every CPU core owns one poll timer, but one poll timer per
package (or socket) should in fact be enough, because CMCI is broadcast
to all threads on the same socket. So if one cpu has a problem, all the
cpus on the same socket have a problem.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   12 ++++
 arch/x86/kernel/cpu/mcheck/mce.c          |   47 ++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   99 ++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index ed44c8a..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,18 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index eff73e7..95738db0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long data)
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long data)
 	 * polling interval, otherwise increase the polling interval.
 	 */
	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
-
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2270,10 +2304,11 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_THRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,84 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_THRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +156,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread