* [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra
Sorry for the resend, I guess I need more sleep or vacation or both :(
The following series fixes a few interesting bugs (found by review in
the context of the CMCI poll effort), cleans up the timer/hotplug
code, and finishes with a consolidated version of the CMCI poll
implementation. The series is based on
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
which contains the bugfix for the dropped timer interval init.
Thanks,
tglx

* [patch 1/5] x86: mce: Disable preemption when calling raise_local()
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-disable-preemption-when-calling-raise-local.patch --]
[-- Type: text/plain, Size: 923 bytes --]

raise_mce() has a code path which does not disable preemption when
raise_local() is called. The per-cpu variable access in raise_local()
depends on preemption being disabled to be functional. So that code
path was either never tested, or never tested with
CONFIG_DEBUG_PREEMPT enabled.

Add the missing preempt_disable/enable() pair around the call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -194,7 +194,11 @@ static void raise_mce(struct mce *m)
 		put_online_cpus();
 	} else
 #endif
+	{
+		preempt_disable();
 		raise_local();
+		preempt_enable();
+	}
 }
 
 /* Error injection interface */

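For illustration, the hazard this patch closes (reading a CPU id, then
touching that CPU's data while migration is still possible) can be
modeled in user space. A minimal sketch, assuming glibc's
sched_getcpu()/sched_setaffinity(); the names per_cpu_count,
unsafe_update and safe_update are hypothetical, and affinity pinning
merely stands in for preempt_disable(), it is not kernel code:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

#define MAX_CPUS 64
static long per_cpu_count[MAX_CPUS];

static void unsafe_update(void)
{
	int cpu = sched_getcpu();	/* like smp_processor_id() */
	/*
	 * If the thread migrates to another CPU here, it updates a
	 * slot it no longer owns, the same hazard raise_local() has
	 * when preemption is left enabled.
	 */
	per_cpu_count[cpu]++;
}

static void safe_update(void)
{
	cpu_set_t set, old;
	int cpu;

	/* Pin the thread: the user-space analogue of preempt_disable() */
	sched_getaffinity(0, sizeof(old), &old);
	cpu = sched_getcpu();
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	sched_setaffinity(0, sizeof(set), &set);

	per_cpu_count[cpu]++;	/* now guaranteed to run on 'cpu' */

	sched_setaffinity(0, sizeof(old), &old);	/* "preempt_enable()" */
}

int main(void)
{
	unsafe_update();
	safe_update();
	printf("cpu %d count %ld\n", sched_getcpu(),
	       per_cpu_count[sched_getcpu()]);
	return 0;
}
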
* [patch 3/5] x86: mce: Split timer init
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-split-timer-init.patch --]
[-- Type: text/plain, Size: 1904 bytes --]

Split the timer init function into an init part and a start part, so
the start part can replace the open-coded version in CPU_DOWN_FAILED.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1554,23 +1554,28 @@ static void __mcheck_cpu_init_vendor(str
 	}
 }
 
-static void __mcheck_cpu_init_timer(void)
+static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	struct timer_list *t = &__get_cpu_var(mce_timer);
 	unsigned long iv = check_interval * HZ;
 
-	setup_timer(t, mce_timer_fn, smp_processor_id());
+	__this_cpu_write(mce_next_interval, iv);
 
-	if (mce_ignore_ce)
+	if (mce_ignore_ce || !iv)
 		return;
 
-	__this_cpu_write(mce_next_interval, iv);
-	if (!iv)
-		return;
 	t->expires = round_jiffies(jiffies + iv);
 	add_timer_on(t, smp_processor_id());
 }
 
+static void __mcheck_cpu_init_timer(void)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned int cpu = smp_processor_id();
+
+	setup_timer(t, mce_timer_fn, cpu);
+	mce_start_timer(cpu, t);
+}
+
 /* Handle unconfigured int18 (should never happen) */
 static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 {
@@ -2275,12 +2280,8 @@ mce_cpu_callback(struct notifier_block *
 		break;
 	case CPU_DOWN_FAILED:
 	case CPU_DOWN_FAILED_FROZEN:
-		if (!mce_ignore_ce && check_interval) {
-			t->expires = round_jiffies(jiffies +
-					per_cpu(mce_next_interval, cpu));
-			add_timer_on(t, cpu);
-		}
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
+		mce_start_timer(cpu, t);
 		break;
 	case CPU_POST_DEAD:
 		/* intentionally ignoring frozen here */

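The shape of the split can be reduced to a few lines of plain C. A
minimal sketch with hypothetical names (poll_timer, timer_init,
timer_start); the point is that one-time setup lives in init while
arming lives in start, so a recovery path such as CPU_DOWN_FAILED can
re-arm without re-initializing:

#include <stdbool.h>
#include <stdio.h>

struct poll_timer {
	void (*fn)(int cpu);
	int cpu;
	long expires;
	bool armed;
};

static long interval = 5;	/* models check_interval * HZ */

static void timer_start(struct poll_timer *t, long now)
{
	if (!interval)		/* polling disabled, like !iv */
		return;
	t->expires = now + interval;
	t->armed = true;
}

static void timer_init(struct poll_timer *t, void (*fn)(int), int cpu,
		       long now)
{
	t->fn = fn;		/* one-time setup, like setup_timer() */
	t->cpu = cpu;
	timer_start(t, now);	/* then the shared arming path */
}

static void poll_fn(int cpu) { printf("poll on cpu %d\n", cpu); }

int main(void)
{
	struct poll_timer t;

	timer_init(&t, poll_fn, 0, 100);	/* boot/online path */
	t.armed = false;			/* model a failed offline */
	timer_start(&t, 200);			/* DOWN_FAILED re-arms only */
	t.fn(t.cpu);
	printf("armed=%d expires=%ld\n", t.armed, t.expires);
	return 0;
}
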
* Re: [patch 3/5] x86: mce: Split timer init
From: Borislav Petkov @ 2012-06-07 15:18 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:23PM +0000, Thomas Gleixner wrote:
> Split the timer init function into an init part and a start part, so
> the start part can replace the open-coded version in CPU_DOWN_FAILED.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Ok, it looks good to me and it has been lightly tested. In light of
last time's review debacle, however, I'd only very tentatively say:

Acked-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [patch 3/5] x86: mce: Split timer init
From: Hidetoshi Seto @ 2012-06-20 3:35 UTC
To: Thomas Gleixner
Cc: LKML, Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

(2012/06/07 6:53), Thomas Gleixner wrote:
> --- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
> +++ tip/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1554,23 +1554,28 @@ static void __mcheck_cpu_init_vendor(str
>  	}
>  }
>  
> -static void __mcheck_cpu_init_timer(void)
> +static void mce_start_timer(unsigned int cpu, struct timer_list *t)
>  {
> -	struct timer_list *t = &__get_cpu_var(mce_timer);
>  	unsigned long iv = check_interval * HZ;
>  
> -	setup_timer(t, mce_timer_fn, smp_processor_id());
> +	__this_cpu_write(mce_next_interval, iv);
>  
> -	if (mce_ignore_ce)
> +	if (mce_ignore_ce || !iv)
>  		return;
>  
> -	__this_cpu_write(mce_next_interval, iv);
> -	if (!iv)
> -		return;
>  	t->expires = round_jiffies(jiffies + iv);
>  	add_timer_on(t, smp_processor_id());

add_timer_on(t, cpu) ?

If so, using __this_cpu_write() here is wrong too.

>  }
>  

Thanks,
H.Seto

* [patch 2/5] x86: mce: Serialize mce injection
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-serialize-mce-injection.patch --]
[-- Type: text/plain, Size: 1006 bytes --]

raise_mce() fiddles with global state, but lacks any kind of
serialization.

Add a mutex around the raise_mce() call, so concurrent writers do not
stomp on each other's toes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-inject.c |    4 ++++
 1 file changed, 4 insertions(+)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-inject.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce-inject.c
@@ -78,6 +78,7 @@ static void raise_exception(struct mce *
 }
 
 static cpumask_var_t mce_inject_cpumask;
+static DEFINE_MUTEX(mce_inject_mutex);
 
 static int mce_raise_notify(unsigned int cmd, struct pt_regs *regs)
 {
@@ -229,7 +230,10 @@ static ssize_t mce_write(struct file *fi
 	 * so do it a jiffie or two later everywhere.
 	 */
 	schedule_timeout(2);
+
+	mutex_lock(&mce_inject_mutex);
 	raise_mce(&m);
+	mutex_unlock(&mce_inject_mutex);
 
 	return usize;
 }

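The serialization here is the classic static-mutex-around-global-state
idiom. A minimal user-space sketch using pthreads; inject() and
injected_errors are illustrative stand-ins for raise_mce() and its
global state, not kernel names:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t inject_mutex = PTHREAD_MUTEX_INITIALIZER;
static int injected_errors;	/* stands in for raise_mce()'s global state */

static void *inject(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&inject_mutex);
	injected_errors++;	/* writers can no longer interleave here */
	pthread_mutex_unlock(&inject_mutex);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, inject, NULL);
	pthread_create(&b, NULL, inject, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("%d injections\n", injected_errors);
	return 0;
}
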
* [patch 5/5] x86: mce: Add cmci poll mode
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-cmci-poll-mode.patch --]
[-- Type: text/plain, Size: 7321 bytes --]

Still waits for explanation :)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   10 +++
 arch/x86/kernel/cpu/mcheck/mce.c          |   46 +++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   88 +++++++++++++++++++++++++++++-
 3 files changed, 137 insertions(+), 7 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce-internal.h
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ tip/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,16 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long d
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long d
 	 * polling interval, otherwise increase the polling interval.
 	 */
 	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
 
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(str
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(str
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2272,8 +2306,8 @@ mce_cpu_callback(struct notifier_block *
 		mce_device_remove(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
Index: tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_b
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_TRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,73 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_TRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +145,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }

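The storm detector above is a simple rate check: more than
CMCI_STORM_TRESHOLD events within one CMCI_STORM_INTERVAL window trips
poll mode. A minimal user-space simulation of that check, with integer
ticks standing in for jiffies and all names illustrative:

#include <stdbool.h>
#include <stdio.h>

#define STORM_INTERVAL	100	/* "jiffies" per window, models 1 * HZ */
#define STORM_THRESHOLD	5

static unsigned int storm_cnt;
static long time_stamp;
static bool storm_active;

static bool storm_detect(long now)
{
	if (now <= time_stamp + STORM_INTERVAL) {
		storm_cnt++;
	} else {
		storm_cnt = 1;		/* window expired, restart count */
		time_stamp = now;
	}
	if (storm_cnt <= STORM_THRESHOLD)
		return false;
	storm_active = true;		/* would cmci_clear() + switch to poll */
	return true;
}

int main(void)
{
	long t;

	/* six interrupts inside one window: the sixth trips the detector */
	for (t = 0; t < 6; t++)
		printf("t=%ld storm=%d\n", t, storm_detect(t));
	printf("storm_active=%d\n", storm_active);
	return 0;
}
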
* Re: [patch 5/5] x86: mce: Add cmci poll mode
From: Borislav Petkov @ 2012-06-07 18:14 UTC
To: Thomas Gleixner
Cc: LKML, Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:24PM +0000, Thomas Gleixner wrote:

[ … ]

> Index: tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
> ===================================================================
> --- tip.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ tip/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -15,6 +15,8 @@
>  #include <asm/msr.h>
>  #include <asm/mce.h>
>  
> +#include "mce-internal.h"
> +
>  /*
>   * Support for Intel Correct Machine Check Interrupts. This allows
>   * the CPU to raise an interrupt when a corrected machine check happened.
> @@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_b
>   */
>  static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>  
> -#define CMCI_THRESHOLD 1
> +#define CMCI_THRESHOLD		1
> +#define CMCI_POLL_INTERVAL	(30 * HZ)
> +#define CMCI_STORM_INTERVAL	(1 * HZ)
> +#define CMCI_STORM_TRESHOLD	5

Just a spelling correction:

	CMCI_STORM_THRESHOLD

> +
> +static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
> +static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
> +static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> +
> +enum {
> +	CMCI_STORM_NONE,
> +	CMCI_STORM_ACTIVE,
> +	CMCI_STORM_SUBSIDED,
> +};
> +
> +static atomic_t cmci_storm_on_cpus;
>  
>  static int cmci_supported(int *banks)
>  {
> @@ -53,6 +70,73 @@ static int cmci_supported(int *banks)
>  	return !!(cap & MCG_CMCI_P);
>  }
>  
> +void mce_intel_cmci_poll(void)
> +{
> +	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
> +		return;
> +	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
> +}
> +
> +unsigned long mce_intel_adjust_timer(unsigned long interval)
> +{
> +	if (interval < CMCI_POLL_INTERVAL)
> +		return interval;
> +
> +	switch (__this_cpu_read(cmci_storm_state)) {
> +	case CMCI_STORM_ACTIVE:
> +		/*
> +		 * We switch back to interrupt mode once the poll timer has
> +		 * silenced itself. That means no events recorded and the
> +		 * timer interval is back to our poll interval.
> +		 */
> +		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> +		atomic_dec(&cmci_storm_on_cpus);
> +
> +	case CMCI_STORM_SUBSIDED:
> +		/*
> +		 * We wait for all cpus to go back to SUBSIDED
> +		 * state. When that happens we switch back to
> +		 * interrupt mode.
> +		 */
> +		if (!atomic_read(&cmci_storm_on_cpus)) {
> +			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
> +			cmci_reenable();
> +			cmci_recheck();
> +		}
> +		return CMCI_POLL_INTERVAL;
> +	default:
> +		/*
> +		 * We have shiny weather, let the poll do whatever it
> +		 * thinks.
> +		 */
> +		return interval;
> +	}
> +}
> +
> +static bool cmci_storm_detect(void)
> +{
> +	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
> +	unsigned long ts = __this_cpu_read(cmci_time_stamp);
> +	unsigned long now = jiffies;
> +
> +	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
> +		cnt++;
> +	} else {
> +		cnt = 1;
> +		__this_cpu_write(cmci_time_stamp, now);
> +	}
> +	__this_cpu_write(cmci_storm_cnt, cnt);
> +
> +	if (cnt <= CMCI_STORM_TRESHOLD)

and here too.

> +		return false;
> +
> +	cmci_clear();
> +	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
> +	atomic_inc(&cmci_storm_on_cpus);
> +	mce_timer_kick(CMCI_POLL_INTERVAL);
> +	return true;
> +}
> +
>  /*
>   * The interrupt handler. This is called on every event.
>   * Just call the poller directly to log any events.
> @@ -61,6 +145,8 @@ static int cmci_supported(int *banks)
>   */
>  static void intel_threshold_interrupt(void)
>  {
> +	if (cmci_storm_detect())
> +		return;
>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>  	mce_notify_irq();
>  }

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* [patch 4/5] x86: mce: Remove the frozen cases in the hotplug code
From: Thomas Gleixner @ 2012-06-06 21:53 UTC
To: LKML; +Cc: Tony Luck, Borislav Petkov, Chen Gong, x86, Peter Zijlstra

[-- Attachment #1: x86-mce-remove-frozen-cases.patch --]
[-- Type: text/plain, Size: 1440 bytes --]

No point in having double cases if we can simply mask the FROZEN bit
out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/mcheck/mce.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

Index: tip/arch/x86/kernel/cpu/mcheck/mce.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/mcheck/mce.c
+++ tip/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2260,34 +2260,32 @@ mce_cpu_callback(struct notifier_block *
 	unsigned int cpu = (unsigned long)hcpu;
 	struct timer_list *t = &per_cpu(mce_timer, cpu);
 
-	switch (action) {
+	switch (action & ~CPU_TASKS_FROZEN) {
 	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
 		mce_device_create(cpu);
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		break;
 	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-	case CPU_DOWN_PREPARE_FROZEN:
 		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
 		break;
 	case CPU_DOWN_FAILED:
-	case CPU_DOWN_FAILED_FROZEN:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
 		mce_start_timer(cpu, t);
 		break;
-	case CPU_POST_DEAD:
+	}
+
+	if (action == CPU_POST_DEAD) {
 		/* intentionally ignoring frozen here */
 		cmci_rediscover(cpu);
-		break;
 	}
+
 	return NOTIFY_OK;
 }

* Re: [patch 4/5] x86: mce: Remove the frozen cases in the hotplug code
From: Borislav Petkov @ 2012-06-07 17:49 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Chen Gong, x86, Peter Zijlstra

On Wed, Jun 06, 2012 at 09:53:24PM +0000, Thomas Gleixner wrote:
> No point in having double cases if we can simply mask the FROZEN bit
> out.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Chen Gong @ 2012-06-07 10:08 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On 2012/6/7 5:53, Thomas Gleixner wrote:
> Sorry for the resend, I guess I need more sleep or vacation or both :(
>
> The following series fixes a few interesting bugs (found by review in
> the context of the CMCI poll effort), cleans up the timer/hotplug
> code, and finishes with a consolidated version of the CMCI poll
> implementation. The series is based on
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
>
> which contains the bugfix for the dropped timer interval init.
>
> Thanks,
>
> 	tglx

I tested the latest patch series based on your tip tree. The basic
logic is correct, as we expected :-).

But during the CPU online/offline test I found an issue: after *STORM*
mode is entered, the machine can't come back from *STORM* mode to
normal interrupt mode. At least the following issue exists: when
*STORM* mode is entered and one CPU is taken offline during this
period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
stuck on the offlined CPU. So we should detect this situation and
decrement *cmci_storm_on_cpus* at the proper time.

BTW, even when I online the CPU again in the above situation, normal
CMCI still doesn't come back. Strange.

I still have another question. When we handle the following case:

  mce_cpu_callback(struct notifier_block *
  		mce_device_remove(cpu);
  		break;
  	case CPU_DOWN_PREPARE:
  -		del_timer_sync(t);
  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
  +		del_timer_sync(t);
  		break;

where do we add this timer back? I can't find it in "case CPU_ONLINE".

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Borislav Petkov @ 2012-06-07 13:35 UTC
To: Chen Gong
Cc: Thomas Gleixner, LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On Thu, Jun 07, 2012 at 06:08:15PM +0800, Chen Gong wrote:
> But during the CPU online/offline test I found an issue: after *STORM*
> mode is entered, the machine can't come back from *STORM* mode to
> normal interrupt mode. At least the following issue exists: when
> *STORM* mode is entered and one CPU is taken offline during this
> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
> stuck on the offlined CPU.

Stupid question: do you even want to allow offlining of CPUs while the
storm is raging? IOW, wouldn't it be better to disallow hotplug when
the storm happens so that all online CPUs, which is all CPUs on the
system, can work out the storm conditions faster so that the system
gets back to normal faster...

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* RE: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Luck, Tony @ 2012-06-07 16:22 UTC
To: Borislav Petkov, Chen Gong
Cc: Thomas Gleixner, LKML, x86@kernel.org, Peter Zijlstra

> Stupid question: do you even want to allow offlining of CPUs while the
> storm is raging? IOW, wouldn't it be better to disallow hotplug when
> the storm happens so that all online CPUs, which is all CPUs on the
> system, can work out the storm conditions faster so that the system
> gets back to normal faster...

What if a processor is the source of the storm (we can get CMCI from
corrected errors in the cache)? Then our action to abate the storm
would be to take the cpu offline.

-Tony

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Thomas Gleixner @ 2012-06-08 7:49 UTC
To: Chen Gong; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On Thu, 7 Jun 2012, Chen Gong wrote:
> But during the CPU online/offline test I found an issue: after *STORM*
> mode is entered, the machine can't come back from *STORM* mode to
> normal interrupt mode. At least the following issue exists: when
> *STORM* mode is entered and one CPU is taken offline during this
> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
> stuck on the offlined CPU. So we should detect this situation and
> decrement *cmci_storm_on_cpus* at the proper time.

Yes, we need to reset the storm state as well I think.

> BTW, even when I online the CPU again in the above situation, normal
> CMCI still doesn't come back. Strange.

That's weird.

> I still have another question. When we handle the following case:
>
>   mce_cpu_callback(struct notifier_block *
>   		mce_device_remove(cpu);
>   		break;
>   	case CPU_DOWN_PREPARE:
>   -		del_timer_sync(t);
>   		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
>   +		del_timer_sync(t);
>   		break;
>
> where do we add this timer back? I can't find it in "case CPU_ONLINE".

The timer gets added back via mcheck_cpu_init(), which is called on
the newly onlined cpu from smp_callin().

Thanks,

	tglx

* Re: [patch 0/5] x86: mce: Bugfixes, cleanups and a new CMCI poll version
From: Chen Gong @ 2012-06-11 5:46 UTC
To: Thomas Gleixner; +Cc: LKML, Tony Luck, Borislav Petkov, x86, Peter Zijlstra

On 2012/6/8 15:49, Thomas Gleixner wrote:
> On Thu, 7 Jun 2012, Chen Gong wrote:
>> But during the CPU online/offline test I found an issue: after *STORM*
>> mode is entered, the machine can't come back from *STORM* mode to
>> normal interrupt mode. At least the following issue exists: when
>> *STORM* mode is entered and one CPU is taken offline during this
>> period, *cmci_storm_on_cpus* can't decrease to 0 because one count is
>> stuck on the offlined CPU. So we should detect this situation and
>> decrement *cmci_storm_on_cpus* at the proper time.
>
> Yes, we need to reset the storm state as well I think.
>
>> BTW, even when I online the CPU again in the above situation, normal
>> CMCI still doesn't come back. Strange.
>
> That's weird.

Here is the reason: when a CPU goes offline and then online again,
CMCI is re-enabled, which opens a backdoor that lets *CMCI storm
detect* trigger again. As a result the counter cmci_storm_cnt is
increased again before it has decreased to zero, so the counter never
gets back to zero, regardless of the CPU coming online. Obviously, we
must keep CMCI from being enabled again before the CMCI storm ends,
even across CPU online/offline. Besides this, we still need to remove
the count when a CPU is offlined. See the tmp patch below (no
guarantee it is OK :-)):

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..0051b2d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,11 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +83,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		*status = CMCI_STORM_HCPU_NONE;
+		atomic_dec(&cmci_storm_on_cpus);
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -132,6 +149,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;

>> I still have another question. When we handle the following case:
>> [...]
>> where do we add this timer back? I can't find it in "case CPU_ONLINE".
>
> The timer gets added back via mcheck_cpu_init(), which is called on
> the newly onlined cpu from smp_callin().

Stupid me! I must have been too busy that day and lost my head :-(.
Thanks, Thomas, for your explanation!

* [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-14 13:49 UTC
To: tglx; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel, Chen Gong

This patch is based on the tip tree and the previous 5 patches.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 +
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   49 ++++++++++++++++++++++++++++-
 3 files changed, 51 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..ef687df 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,12 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+	CMCI_STORM_HCPU_SUBSIDED,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +84,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+		*status = CMCI_STORM_HCPU_SUBSIDED;
+		atomic_dec(&cmci_storm_on_cpus);
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -90,6 +108,7 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * timer interval is back to our poll interval.
 		 */
 		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
 		atomic_dec(&cmci_storm_on_cpus);
 
 	case CMCI_STORM_SUBSIDED:
@@ -109,6 +128,21 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * We have shiny weather, let the poll do whatever it
 		 * thinks.
 		 */
+
+		/*
+		 * If one CPU is offlined and onlined during CMCI storm, it
+		 * has no chance to enable CMCI again. Here is the portal.
+		 */
+		if ((!atomic_read(&cmci_storm_on_cpus)) &&
+		    (CMCI_STORM_HCPU_SUBSIDED ==
+		     __this_cpu_read(cmci_storm_hcpu_status))) {
+			__this_cpu_write(cmci_storm_hcpu_status,
+					 CMCI_STORM_HCPU_NONE);
+			cmci_reenable();
+			apic_write(APIC_LVTCMCI,
+				   THRESHOLD_APIC_VECTOR|APIC_DM_FIXED);
+			cmci_recheck();
+		}
 		return interval;
 	}
 }
@@ -132,6 +166,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;
@@ -259,7 +294,9 @@ void cmci_rediscover(int dying)
 	int cpu;
 	cpumask_var_t old;
 
-	if (!cmci_supported(&banks))
+	if (!cmci_supported(&banks) ||
+	    /* if still in CMCI storm, don't enable it */
+	    (0 != atomic_read(&cmci_storm_on_cpus)))
 		return;
 	if (!alloc_cpumask_var(&old, GFP_KERNEL))
 		return;
@@ -297,6 +334,16 @@ static void intel_init_cmci(void)
 		return;
 
 	mce_threshold_vector = intel_threshold_interrupt;
+	/* if still in CMCI storm, don't enable it */
+	if (0 != atomic_read(&cmci_storm_on_cpus))
+		return;
+	/*
+	 * If one CPU is offlined during CMCI storm and onlined after
+	 * CMCI storm, this *hCPU* status must be updated to avoid
+	 * reenabling CMCI twice.
+	 */
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
+
 	cmci_discover(banks, 1);
 	/*
 	 * For CPU #0 this runs with still disabled APIC, but that's
-- 
1.7.1

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Thomas Gleixner @ 2012-06-14 14:07 UTC
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On Thu, 14 Jun 2012, Chen Gong wrote:
> This patch is based on the tip tree and the previous 5 patches.

You really don't need all this complexity to handle that.

The main thing is that you clear the storm state and adjust the storm
counter when the cpu goes offline (in case the state is ACTIVE).

When it comes online again then you can simply let it restart cmci. If
the storm on this cpu (or node) still exists then it will notice and
everything falls in place.

Keep it simple, really!

Thanks,

	tglx

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-15 6:51 UTC
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/14 22:07, Thomas Gleixner wrote:
> On Thu, 14 Jun 2012, Chen Gong wrote:
>> This patch is based on the tip tree and the previous 5 patches.
>
> You really don't need all this complexity to handle that. The main
> thing is that you clear the storm state and adjust the storm counter
> when the cpu goes offline (in case the state is ACTIVE).
>
> When it comes online again then you can simply let it restart cmci. If
> the storm on this cpu (or node) still exists then it will notice and
> everything falls in place.

I have tested several different scenarios. If the storm on this CPU
still exists, it triggers a CMCI and broadcasts it to the sibling
CPUs, which means the counter *cmci_storm_on_cpus* increases beyond
its upper limit. E.g. on a 2-socket SandyBridge-EP system (one socket
has 8 cores and 16 threads), inject one error on one socket and you
can watch *cmci_storm_on_cpus* = 16 because of the CMCI broadcast.
During this time, offline and online one CPU on this socket: first
*cmci_storm_on_cpus* = 15 because of the offline in ACTIVE status, and
then *cmci_storm_on_cpus* = 31 because CMCI is activated again on
online. That's why I have to disable CMCI during the whole
online/offline sequence until the CMCI storm has subsided. Frankly,
the logic is a bit complex, which is why I wrote so many comments, to
avoid forgetting it after some time :-)

* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Thomas Gleixner @ 2012-06-15 9:55 UTC
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2778 bytes --]

On Fri, 15 Jun 2012, Chen Gong wrote:
> On 2012/6/14 22:07, Thomas Gleixner wrote:
>> On Thu, 14 Jun 2012, Chen Gong wrote:
>>> This patch is based on the tip tree and the previous 5 patches.
>>
>> You really don't need all this complexity to handle that. The main
>> thing is that you clear the storm state and adjust the storm counter
>> when the cpu goes offline (in case the state is ACTIVE).
>>
>> When it comes online again then you can simply let it restart cmci. If
>> the storm on this cpu (or node) still exists then it will notice and
>> everything falls in place.
>
> I have tested several different scenarios. If the storm on this CPU
> still exists, it triggers a CMCI and broadcasts it to the sibling
> CPUs, which means the counter *cmci_storm_on_cpus* increases beyond
> its upper limit. E.g. on a 2-socket SandyBridge-EP system (one socket
> has 8 cores and 16 threads), inject one error on one socket and you
> can watch *cmci_storm_on_cpus* = 16 because of the CMCI broadcast.
> During this time, offline and online one CPU on this socket: first
> *cmci_storm_on_cpus* = 15 because of the offline in ACTIVE status, and
> then *cmci_storm_on_cpus* = 31 because CMCI is activated again on
> online. That's why I have to disable CMCI during the whole
> online/offline sequence until the CMCI storm has subsided. Frankly,
> the logic is a bit complex, which is why I wrote so many comments, to
> avoid forgetting it after some time :-)

This does not make any sense at all. What you are saying is that even
if CPU0 ran cmci_clear(), a CMCI raised on CPU1 will cause the CMCI
vector to be triggered on CPU0.

So how does the whole storm machinery work in the following case:

    CPU0                    CPU1

    cmci incoming           cmci incoming
    storm detected          no storm detected yet
    cmci_clear()
    switch to poll
                            cmci raised

So according to your explanation that would cause the cmci vector to
be broadcasted to CPU0 as well. Now that would cause the counter to
get a bogus increment, right ?

So instead of hacking insane crap into the code, we have simply to do
the obvious Right Thing:

Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce_intel.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -119,6 +119,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {

That will prevent damage under all circumstances, cpu hotplug
included.

But that's too simple and comprehensible I fear.

Thanks,

	tglx

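The guard works because entering storm mode becomes idempotent per
CPU: a second interrupt on a CPU already in ACTIVE state returns early
instead of incrementing the global counter again. A minimal
single-threaded model with illustrative names, not the kernel code:

#include <stdio.h>

enum { STORM_NONE, STORM_ACTIVE };

static int state;		/* per-CPU in the kernel; one CPU modeled */
static int storm_on_cpus;	/* models the global atomic_t */

static int storm_enter(void)
{
	if (state != STORM_NONE)	/* the guard Thomas adds */
		return 1;		/* already counted, do not recount */
	state = STORM_ACTIVE;
	storm_on_cpus++;		/* exactly one increment per CPU */
	return 1;
}

int main(void)
{
	storm_enter();
	storm_enter();	/* e.g. a CMCI re-raised after cmci_clear() */
	printf("storm_on_cpus = %d\n", storm_on_cpus);	/* 1, not 2 */
	return 0;
}
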
* Re: [PATCH] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-18 6:42 UTC
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[...]
> So according to your explanation that would cause the cmci vector to
> be broadcasted to CPU0 as well. Now that would cause the counter to
> get a bogus increment, right ?
>
> So instead of hacking insane crap into the code, we have simply to do
> the obvious Right Thing:
>
> @@ -119,6 +119,9 @@ static bool cmci_storm_detect(void)
>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>  	unsigned long now = jiffies;
>  
> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
> +		return true;
> +
>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>  		cnt++;
>  	} else {
>
> That will prevent damage under all circumstances, cpu hotplug
> included.
>
> But that's too simple and comprehensible I fear.

Oh, yes, your suggestion simplifies my logic, but not all of it.
Obviously, when hotplug happens it can be considered as quitting the
CMCI storm in another way, so the corresponding counter and status
should be decreased on that path as well. In my original patch I
defined three statuses:

enum {
	CMCI_STORM_HCPU_NONE,
	CMCI_STORM_HCPU_ACTIVE,
	CMCI_STORM_HCPU_SUBSIDED,
};

I used CMCI_STORM_HCPU_SUBSIDED to describe a CPU stalled in hotplug
progress, to keep CMCI disabled during the whole hotplug sequence, but
according to your suggestion CMCI_STORM_HCPU_SUBSIDED can now be
removed (FIXME: because CMCI can be enabled, if a CPU goes offline and
then online again during a CMCI storm it will enter CMCI storm status
right away. This simplifies the logic at the cost of a little
performance degradation).

I will send the 2nd patch based on the previous 5 patches in a
separate mail.

* [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
From: Chen Gong @ 2012-06-18 6:45 UTC
To: tglx; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel, Chen Gong

This patch is based on the previous 5 patches and tuned based on
Thomas' suggestion.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   22 ++++++++++++++++++++++
 3 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..0493525 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
 static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
 
 enum {
 	CMCI_STORM_NONE,
@@ -47,6 +48,11 @@ enum {
 	CMCI_STORM_SUBSIDED,
 };
 
+enum {
+	CMCI_STORM_HCPU_NONE,
+	CMCI_STORM_HCPU_ACTIVE,
+};
+
 static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
@@ -77,6 +83,17 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
+
+	if (*status == CMCI_STORM_HCPU_ACTIVE) {
+		per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+		*status = CMCI_STORM_HCPU_NONE;
+		atomic_dec(&cmci_storm_on_cpus);
+	}
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -90,6 +107,7 @@ unsigned long mce_intel_adjust_timer(unsigned long interval)
 		 * timer interval is back to our poll interval.
 		 */
 		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);
 		atomic_dec(&cmci_storm_on_cpus);
 
 	case CMCI_STORM_SUBSIDED:
@@ -119,6 +137,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {
@@ -132,6 +153,7 @@ static bool cmci_storm_detect(void)
 
 	cmci_clear();
 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);
 	atomic_inc(&cmci_storm_on_cpus);
 	mce_timer_kick(CMCI_POLL_INTERVAL);
 	return true;
-- 
1.7.1

* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18  6:45 ` [PATCH V2] " Chen Gong
@ 2012-06-18  8:00 ` Thomas Gleixner
  2012-06-18 10:13 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2012-06-18 8:00 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On Mon, 18 Jun 2012, Chen Gong wrote:
> index 92d8b5c..0493525 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);

Why do you insist on having another status variable, which actually
does nothing but obfuscate the code?

Look at the usage sites:

> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);

> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_ACTIVE);

So it's a shadow variable of cmci_storm_state which adds no value. And
all you do with it is:

> +void mce_intel_hcpu_update(unsigned long cpu)
> +{
> +	unsigned long *status = &per_cpu(cmci_storm_hcpu_status, cpu);
> +
> +	if (*status == CMCI_STORM_HCPU_ACTIVE) {

This can be checked with the existing variable as well. And your check
leaves CMCI_STORM_SUBSIDED around as a stale value.

This simply wants to check

	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
		atomic_dec(&cmci_storm_on_cpus);

and unconditionally clear the state

	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;

Right?

^ permalink raw reply	[flat|nested] 33+ messages in thread
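Spelled out, those two fragments combine into a single function, which
is what the V3 patch further down adopts. The invariant is that
cmci_storm_on_cpus counts exactly the cpus currently in ACTIVE state:
an offlined cpu drops its vote only if it was ACTIVE (a SUBSIDED cpu
already dropped it), while the per-cpu state is wiped unconditionally
so that no stale SUBSIDED value survives an offline/online cycle:

void mce_intel_hcpu_update(unsigned long cpu)
{
	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
		atomic_dec(&cmci_storm_on_cpus);

	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
}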
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18  8:00 ` Thomas Gleixner
@ 2012-06-18 10:13 ` Chen Gong
  2012-06-18 12:23 ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-06-18 10:13 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/18 16:00, Thomas Gleixner wrote:
> On Mon, 18 Jun 2012, Chen Gong wrote:
>> index 92d8b5c..0493525 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
>> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
>
> Why do you insist on having another status variable, which actually
> does nothing but obfuscate the code?
>
> Look at the usage sites:
>
>> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
>> +	__this_cpu_write(cmci_storm_hcpu_status, CMCI_STORM_HCPU_NONE);

Because here I can't substitute CMCI_STORM_HCPU_NONE with
CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 10:13 ` Chen Gong
@ 2012-06-18 12:23 ` Thomas Gleixner
  2012-06-19  6:05 ` Chen Gong
  2012-06-19  6:09 ` [PATCH V3] " Chen Gong
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2012-06-18 12:23 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1098 bytes --]

On Mon, 18 Jun 2012, Chen Gong wrote:
> On 2012/6/18 16:00, Thomas Gleixner wrote:
> > On Mon, 18 Jun 2012, Chen Gong wrote:
> > > index 92d8b5c..0493525 100644
> > > --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> > > +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> > > @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
> > >  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
> > >  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
> > >  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
> > > +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
> >
> > Why do you insist on having another status variable, which actually
> > does nothing but obfuscate the code?
> >
> > Look at the usage sites:
> >
> > > 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
> > > +	__this_cpu_write(cmci_storm_hcpu_status,
> > > 	CMCI_STORM_HCPU_NONE);
>
> Because here I can't substitute CMCI_STORM_HCPU_NONE with
> CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.

Have you actually read what I wrote later? You do not need that extra
state, period.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V2] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 12:23 ` Thomas Gleixner
@ 2012-06-19  6:05 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-06-19 6:05 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, x86, peterz, linux-kernel

On 2012/6/18 20:23, Thomas Gleixner wrote:
> On Mon, 18 Jun 2012, Chen Gong wrote:
>> On 2012/6/18 16:00, Thomas Gleixner wrote:
>>> On Mon, 18 Jun 2012, Chen Gong wrote:
>>>> index 92d8b5c..0493525 100644
>>>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>>>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>>>> @@ -40,6 +40,7 @@ static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
>>>>  static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
>>>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
>>>>  static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
>>>> +static DEFINE_PER_CPU(unsigned long, cmci_storm_hcpu_status);
>>>
>>> Why do you insist on having another status variable, which actually
>>> does nothing but obfuscate the code?
>>>
>>> Look at the usage sites:
>>>
>>>> 	__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
>>>> +	__this_cpu_write(cmci_storm_hcpu_status,
>>>> 	CMCI_STORM_HCPU_NONE);
>>
>> Because here I can't substitute CMCI_STORM_HCPU_NONE with
>> CMCI_STORM_SUBSIDED. If I did, the state machine would end up in chaos.
>
> Have you actually read what I wrote later? You do not need that extra
> state, period.
>

Oh, yes, you are right. I didn't take enough time over your reply; I
apologize for that. Unconditionally clearing the state avoids a stale
CMCI_STORM_SUBSIDED value. Your logic is fine and simplifies my
previous, unnecessarily complicated version. Thanks a lot. I will send
the V3 tmp patch, based on your previous 5 patches, right now. If it is
OK, you can merge it into your 5th patch, and I can write the changelog
for it if you like.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-06-18 12:23 ` Thomas Gleixner
  2012-06-19  6:05 ` Chen Gong
@ 2012-06-19  6:09 ` Chen Gong
  2012-07-04  8:12 ` Chen Gong
  1 sibling, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-06-19 6:09 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel, Chen Gong

v1 -> v3:
Thanks very much to Thomas for the suggestion to simplify the whole
logic.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
 arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
 3 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 2cd73ce..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
 #ifdef CONFIG_X86_MCE_INTEL
 unsigned long mce_intel_adjust_timer(unsigned long interval);
 void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
 #else
 # define mce_intel_adjust_timer mce_adjust_timer_default
 static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
 #endif
 
 void mce_timer_kick(unsigned long interval);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index e3f8b94..5e22d99 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 92d8b5c..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 }
 
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
 unsigned long mce_intel_adjust_timer(unsigned long interval)
 {
 	if (interval < CMCI_POLL_INTERVAL)
@@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
 	unsigned long ts = __this_cpu_read(cmci_time_stamp);
 	unsigned long now = jiffies;
 
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
 	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
 		cnt++;
 	} else {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread
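A quick way to convince oneself that this keeps the bookkeeping
balanced is a stand-alone harness like the one below; plain ints stand
in for the per-cpu and atomic primitives, and the offline scenario is
made up for the test:

#include <assert.h>

enum { CMCI_STORM_NONE, CMCI_STORM_ACTIVE, CMCI_STORM_SUBSIDED };

#define NR_CPUS 2
static int cmci_storm_state[NR_CPUS];	/* per-cpu in the real code  */
static int cmci_storm_on_cpus;		/* atomic_t in the real code */

static void mce_intel_hcpu_update(unsigned long cpu)
{
	if (cmci_storm_state[cpu] == CMCI_STORM_ACTIVE)
		cmci_storm_on_cpus--;

	/* Unconditional: also wipes a stale SUBSIDED */
	cmci_storm_state[cpu] = CMCI_STORM_NONE;
}

int main(void)
{
	/* cpu0 is mid-storm, cpu1 stormed and already subsided */
	cmci_storm_state[0] = CMCI_STORM_ACTIVE;
	cmci_storm_on_cpus++;
	cmci_storm_state[1] = CMCI_STORM_SUBSIDED;

	mce_intel_hcpu_update(0);	/* offline during the storm  */
	mce_intel_hcpu_update(1);	/* offline after it subsided */

	assert(cmci_storm_on_cpus == 0);
	assert(cmci_storm_state[0] == CMCI_STORM_NONE);
	assert(cmci_storm_state[1] == CMCI_STORM_NONE);
	return 0;
}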
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-06-19  6:09 ` [PATCH V3] " Chen Gong
@ 2012-07-04  8:12 ` Chen Gong
  2012-07-16  3:16 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-04 8:12 UTC (permalink / raw)
To: Chen Gong; +Cc: tglx, tony.luck, borislav.petkov, peterz, x86, linux-kernel

On 2012/6/19 14:09, Chen Gong wrote:
> v1 -> v3:
> Thanks very much to Thomas for the suggestion to simplify the whole
> logic.
>
> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
>  arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
>  arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
>  3 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> index 2cd73ce..6a05c1d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
> +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
> @@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
>  #ifdef CONFIG_X86_MCE_INTEL
>  unsigned long mce_intel_adjust_timer(unsigned long interval);
>  void mce_intel_cmci_poll(void);
> +void mce_intel_hcpu_update(unsigned long cpu);
>  #else
>  # define mce_intel_adjust_timer mce_adjust_timer_default
>  static inline void mce_intel_cmci_poll(void) { }
> +static inline void mce_intel_hcpu_update(unsigned long cpu) { }
>  #endif
>
>  void mce_timer_kick(unsigned long interval);
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index e3f8b94..5e22d99 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
>  		if (threshold_cpu_callback)
>  			threshold_cpu_callback(action, cpu);
>  		mce_device_remove(cpu);
> +		mce_intel_hcpu_update(cpu);
>  		break;
>  	case CPU_DOWN_PREPARE:
>  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> index 92d8b5c..693bc7d 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>  }
>
> +void mce_intel_hcpu_update(unsigned long cpu)
> +{
> +	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
> +		atomic_dec(&cmci_storm_on_cpus);
> +
> +	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
> +}
> +
>  unsigned long mce_intel_adjust_timer(unsigned long interval)
>  {
>  	if (interval < CMCI_POLL_INTERVAL)
> @@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>  	unsigned long now = jiffies;
>
> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
> +		return true;
> +
>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>  		cnt++;
>  	} else {
>

Hi Thomas,

How are this patch and the whole patch series going? I don't know
whether you have updated them or not.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-04  8:12 ` Chen Gong
@ 2012-07-16  3:16 ` Chen Gong
  2012-07-16  8:22 ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-16 3:16 UTC (permalink / raw)
To: Chen Gong; +Cc: tglx, tony.luck, borislav.petkov, peterz, x86, linux-kernel

On 2012/7/4 16:12, Chen Gong wrote:
> On 2012/6/19 14:09, Chen Gong wrote:
>> v1 -> v3:
>> Thanks very much to Thomas for the suggestion to simplify the whole
>> logic.
>>
>> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
>> ---
>>  arch/x86/kernel/cpu/mcheck/mce-internal.h |    2 ++
>>  arch/x86/kernel/cpu/mcheck/mce.c          |    1 +
>>  arch/x86/kernel/cpu/mcheck/mce_intel.c    |   11 +++++++++++
>>  3 files changed, 14 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> index 2cd73ce..6a05c1d 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> +++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
>> @@ -31,9 +31,11 @@ extern struct mce_bank *mce_banks;
>>  #ifdef CONFIG_X86_MCE_INTEL
>>  unsigned long mce_intel_adjust_timer(unsigned long interval);
>>  void mce_intel_cmci_poll(void);
>> +void mce_intel_hcpu_update(unsigned long cpu);
>>  #else
>>  # define mce_intel_adjust_timer mce_adjust_timer_default
>>  static inline void mce_intel_cmci_poll(void) { }
>> +static inline void mce_intel_hcpu_update(unsigned long cpu) { }
>>  #endif
>>
>>  void mce_timer_kick(unsigned long interval);
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
>> index e3f8b94..5e22d99 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
>> @@ -2306,6 +2306,7 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
>>  		if (threshold_cpu_callback)
>>  			threshold_cpu_callback(action, cpu);
>>  		mce_device_remove(cpu);
>> +		mce_intel_hcpu_update(cpu);
>>  		break;
>>  	case CPU_DOWN_PREPARE:
>>  		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> index 92d8b5c..693bc7d 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
>> @@ -77,6 +77,14 @@ void mce_intel_cmci_poll(void)
>>  	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
>>  }
>>
>> +void mce_intel_hcpu_update(unsigned long cpu)
>> +{
>> +	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
>> +		atomic_dec(&cmci_storm_on_cpus);
>> +
>> +	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
>> +}
>> +
>>  unsigned long mce_intel_adjust_timer(unsigned long interval)
>>  {
>>  	if (interval < CMCI_POLL_INTERVAL)
>> @@ -119,6 +127,9 @@ static bool cmci_storm_detect(void)
>>  	unsigned long ts = __this_cpu_read(cmci_time_stamp);
>>  	unsigned long now = jiffies;
>>
>> +	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
>> +		return true;
>> +
>>  	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
>>  		cnt++;
>>  	} else {
>>
>
> Hi Thomas,
>
> How are this patch and the whole patch series going? I don't know
> whether you have updated them or not.

Hi Thomas,

Do you still care about this thread? Any plan to update it? I hope to
get your feedback soon.

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-16  3:16 ` Chen Gong
@ 2012-07-16  8:22 ` Thomas Gleixner
  2012-07-17 21:47 ` Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2012-07-16 8:22 UTC (permalink / raw)
To: Chen Gong; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel

On Mon, 16 Jul 2012, Chen Gong wrote:
>
> Do you still care about this thread? Any plan to update it? I hope to
> get your feedback soon.

Can you please collect the latest series and send it to lkml, Tony and
Boris. I think it's ok as is now.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread
* Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
  2012-07-16  8:22 ` Thomas Gleixner
@ 2012-07-17 21:47 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-17 21:47 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: tony.luck, borislav.petkov, peterz, x86, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 819 bytes --]

On Mon, Jul 16, 2012 at 10:22:16AM +0200, Thomas Gleixner wrote:
> Date: Mon, 16 Jul 2012 10:22:16 +0200 (CEST)
> From: Thomas Gleixner <tglx@linutronix.de>
> To: Chen Gong <gong.chen@linux.intel.com>
> cc: tony.luck@intel.com, borislav.petkov@amd.com, peterz@infradead.org,
>     x86@kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH V3] tmp patch to fix hotplug issue in CMCI storm
> User-Agent: Alpine 2.02 (LFD 1266 2009-07-14)
>
> On Mon, 16 Jul 2012, Chen Gong wrote:
> >
> > Do you still care about this thread? Any plan to update it? I hope
> > to get your feedback soon.
>
> Can you please collect the latest series and send it to lkml, Tony and
> Boris. I think it's ok as is now.
>
> Thanks,
>
> 	tglx

Fine, but which base should I use, the -tip tree or Linus' tree?

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version @ 2012-07-18 19:59 Chen Gong 2012-07-18 19:59 ` [PATCH 5/5] x86: mce: Add cmci poll mode Chen Gong 0 siblings, 1 reply; 33+ messages in thread From: Chen Gong @ 2012-07-18 19:59 UTC (permalink / raw) To: tglx; +Cc: tony.luck, bp, x86, linux-kernel [PATCH 1/5] x86: mce: Disable preemption when calling raise_local() [PATCH 2/5] x86: mce: Serialize mce injection [PATCH 3/5] x86: mce: Split timer init [PATCH 4/5] x86: mce: Remove the frozen cases in the hotplug code [PATCH 5/5] x86: mce: Add cmci poll mode The following series fixes a few interesting bugs (found by review in context of the CMCI poll effort) and a cleanup to the timer/hotplug code followed by a consolidated version of the CMCI poll implementation. This series is based on Linus' tree. git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git ^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 5/5] x86: mce: Add cmci poll mode
  2012-07-18 19:59 [V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version Chen Gong
@ 2012-07-18 19:59 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-18 19:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel, Chen Gong

When CMCIs arrive too fast to handle, CMCI should be disabled to avoid
hanging the whole system. In the meantime, a CMCI poll timer can be
employed to collect CMCI events periodically. When no more CMCIs occur,
the CMCI handler can be switched from poll mode back to interrupt mode.

For now every CPU core owns one poll timer, but one poll timer per
package (or socket) should in fact be enough, because CMCI is broadcast
to all threads on the same socket. So if one cpu has a problem, all the
cpus on the same socket have a problem.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   12 ++++
 arch/x86/kernel/cpu/mcheck/mce.c          |   47 ++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   99 ++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index ed44c8a..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,18 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index eff73e7..95738db0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long data)
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long data)
 	 * polling interval, otherwise increase the polling interval.
 	 */
 	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
-
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2270,10 +2304,11 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_THRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,84 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_THRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +156,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread
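To see how the pieces of this patch interact, here is a small
user-space model of the timer dynamics. It is a sketch under stated
assumptions: fake HZ, a single cpu, simplified stand-ins for
mce_intel_adjust_timer() and the quiet path of mce_timer_fn(), and it
assumes the storm ends right after detection, so every tick finds no
events. The fall-through from ACTIVE to SUBSIDED mirrors the patch:

#include <stdio.h>

#define HZ			1000UL
#define CHECK_INTERVAL		(5 * 60 * HZ)	/* 5 minutes */
#define CMCI_POLL_INTERVAL	(30 * HZ)

enum { CMCI_STORM_NONE, CMCI_STORM_ACTIVE, CMCI_STORM_SUBSIDED };

static int storm_state = CMCI_STORM_ACTIVE;	/* storm just detected */
static int storm_on_cpus = 1;			/* atomic_t stand-in   */

static unsigned long adjust_timer(unsigned long interval)
{
	if (interval < CMCI_POLL_INTERVAL)
		return interval;

	switch (storm_state) {
	case CMCI_STORM_ACTIVE:
		/* Poll timer has grown quiet: drop this cpu's vote */
		storm_state = CMCI_STORM_SUBSIDED;
		storm_on_cpus--;
		/* fall through, as in the patch */
	case CMCI_STORM_SUBSIDED:
		if (!storm_on_cpus)
			storm_state = CMCI_STORM_NONE;	/* cmci_reenable() */
		return CMCI_POLL_INTERVAL;
	default:
		return interval;	/* shiny weather */
	}
}

/* One quiet mce_timer_fn() tick: double the interval, then let the
 * vendor hook adjust it. */
static unsigned long quiet_tick(unsigned long iv)
{
	iv = iv * 2 < CHECK_INTERVAL ? iv * 2 : CHECK_INTERVAL;
	return adjust_timer(iv);
}

int main(void)
{
	unsigned long iv = CMCI_POLL_INTERVAL;	/* set via mce_timer_kick() */
	int i;

	for (i = 0; i < 4; i++) {
		printf("tick %d: state=%d next interval=%lu\n",
		       i, storm_state, iv);
		iv = quiet_tick(iv);
	}
	return 0;
}

The first quiet tick steps this cpu ACTIVE -> SUBSIDED and, with no
other cpu still storming, straight on to NONE where CMCI gets
re-enabled; from then on the interval simply doubles back toward
check_interval.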
* [RESEND PATCH 0/5 V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version
@ 2012-07-19 17:59 Chen Gong
  2012-07-19 17:59 ` [PATCH 5/5] x86: mce: Add cmci poll mode Chen Gong
  0 siblings, 1 reply; 33+ messages in thread
From: Chen Gong @ 2012-07-19 17:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel

[PATCH 1/5] x86: mce: Disable preemption when calling raise_local()
[PATCH 2/5] x86: mce: Serialize mce injection
[PATCH 3/5] x86: mce: Split timer init
[PATCH 4/5] x86: mce: Remove the frozen cases in the hotplug code
[PATCH 5/5] x86: mce: Add cmci poll mode

The following series fixes a few interesting bugs (found by review in
context of the CMCI poll effort) and a cleanup to the timer/hotplug
code followed by a consolidated version of the CMCI poll
implementation. This series is based on Linus' tree.

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git

Thanks to Boris for pointing out how to make git commits with the
correct authorship :-).

^ permalink raw reply	[flat|nested] 33+ messages in thread
* [PATCH 5/5] x86: mce: Add cmci poll mode
  2012-07-19 17:59 [RESEND PATCH 0/5 V2] x86: mce: Bugfixes, cleanups and a new CMCI poll version Chen Gong
@ 2012-07-19 17:59 ` Chen Gong
  0 siblings, 0 replies; 33+ messages in thread
From: Chen Gong @ 2012-07-19 17:59 UTC (permalink / raw)
To: tglx; +Cc: tony.luck, bp, x86, linux-kernel, Chen Gong

When CMCIs arrive too fast to handle, CMCI should be disabled to avoid
hanging the whole system. In the meantime, a CMCI poll timer can be
employed to collect CMCI events periodically. When no more CMCIs occur,
the CMCI handler can be switched from poll mode back to interrupt mode.

For now every CPU core owns one poll timer, but one poll timer per
package (or socket) should in fact be enough, because CMCI is broadcast
to all threads on the same socket. So if one cpu has a problem, all the
cpus on the same socket have a problem.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-internal.h |   12 ++++
 arch/x86/kernel/cpu/mcheck/mce.c          |   47 ++++++++++++--
 arch/x86/kernel/cpu/mcheck/mce_intel.c    |   99 ++++++++++++++++++++++++++++-
 3 files changed, 151 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index ed44c8a..6a05c1d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -28,6 +28,18 @@ extern int mce_ser;
 
 extern struct mce_bank *mce_banks;
 
+#ifdef CONFIG_X86_MCE_INTEL
+unsigned long mce_intel_adjust_timer(unsigned long interval);
+void mce_intel_cmci_poll(void);
+void mce_intel_hcpu_update(unsigned long cpu);
+#else
+# define mce_intel_adjust_timer mce_adjust_timer_default
+static inline void mce_intel_cmci_poll(void) { }
+static inline void mce_intel_hcpu_update(unsigned long cpu) { }
+#endif
+
+void mce_timer_kick(unsigned long interval);
+
 #ifdef CONFIG_ACPI_APEI
 int apei_write_mce(struct mce *m);
 ssize_t apei_read_mce(struct mce *m, u64 *record_id);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index eff73e7..95738db0 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -1256,6 +1256,14 @@ static unsigned long check_interval = 5 * 60; /* 5 minutes */
 static DEFINE_PER_CPU(unsigned long, mce_next_interval); /* in jiffies */
 static DEFINE_PER_CPU(struct timer_list, mce_timer);
 
+static unsigned long mce_adjust_timer_default(unsigned long interval)
+{
+	return interval;
+}
+
+static unsigned long (*mce_adjust_timer)(unsigned long interval) =
+	mce_adjust_timer_default;
+
 static void mce_timer_fn(unsigned long data)
 {
 	struct timer_list *t = &__get_cpu_var(mce_timer);
@@ -1266,6 +1274,7 @@ static void mce_timer_fn(unsigned long data)
 	if (mce_available(__this_cpu_ptr(&cpu_info))) {
 		machine_check_poll(MCP_TIMESTAMP,
 				&__get_cpu_var(mce_poll_banks));
+		mce_intel_cmci_poll();
 	}
 
 	/*
@@ -1273,14 +1282,38 @@ static void mce_timer_fn(unsigned long data)
 	 * polling interval, otherwise increase the polling interval.
 	 */
	iv = __this_cpu_read(mce_next_interval);
-	if (mce_notify_irq())
+	if (mce_notify_irq()) {
 		iv = max(iv / 2, (unsigned long) HZ/100);
-	else
+	} else {
 		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
+		iv = mce_adjust_timer(iv);
+	}
 	__this_cpu_write(mce_next_interval, iv);
+	/* Might have become 0 after CMCI storm subsided */
+	if (iv) {
+		t->expires = jiffies + iv;
+		add_timer_on(t, smp_processor_id());
+	}
+}
-
-	t->expires = jiffies + iv;
-	add_timer_on(t, smp_processor_id());
+/*
+ * Ensure that the timer is firing in @interval from now.
+ */
+void mce_timer_kick(unsigned long interval)
+{
+	struct timer_list *t = &__get_cpu_var(mce_timer);
+	unsigned long when = jiffies + interval;
+	unsigned long iv = __this_cpu_read(mce_next_interval);
+
+	if (timer_pending(t)) {
+		if (time_before(when, t->expires))
+			mod_timer_pinned(t, when);
+	} else {
+		t->expires = round_jiffies(when);
+		add_timer_on(t, smp_processor_id());
+	}
+	if (interval < iv)
+		__this_cpu_write(mce_next_interval, interval);
 }
 
 /* Must not be called in IRQ context where del_timer_sync() can deadlock */
@@ -1545,6 +1578,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 	switch (c->x86_vendor) {
 	case X86_VENDOR_INTEL:
 		mce_intel_feature_init(c);
+		mce_adjust_timer = mce_intel_adjust_timer;
 		break;
 	case X86_VENDOR_AMD:
 		mce_amd_feature_init(c);
@@ -1556,7 +1590,7 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
 
 static void mce_start_timer(unsigned int cpu, struct timer_list *t)
 {
-	unsigned long iv = check_interval * HZ;
+	unsigned long iv = mce_adjust_timer(check_interval * HZ);
 
 	__this_cpu_write(mce_next_interval, iv);
 
@@ -2270,10 +2304,11 @@ mce_cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu)
 		if (threshold_cpu_callback)
 			threshold_cpu_callback(action, cpu);
 		mce_device_remove(cpu);
+		mce_intel_hcpu_update(cpu);
 		break;
 	case CPU_DOWN_PREPARE:
-		del_timer_sync(t);
 		smp_call_function_single(cpu, mce_disable_cpu, &action, 1);
+		del_timer_sync(t);
 		break;
 	case CPU_DOWN_FAILED:
 		smp_call_function_single(cpu, mce_reenable_cpu, &action, 1);
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index 38e49bc..693bc7d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -15,6 +15,8 @@
 #include <asm/msr.h>
 #include <asm/mce.h>
 
+#include "mce-internal.h"
+
 /*
  * Support for Intel Correct Machine Check Interrupts. This allows
  * the CPU to raise an interrupt when a corrected machine check happened.
@@ -30,7 +32,22 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
  */
 static DEFINE_RAW_SPINLOCK(cmci_discover_lock);
 
-#define CMCI_THRESHOLD 1
+#define CMCI_THRESHOLD		1
+#define CMCI_POLL_INTERVAL	(30 * HZ)
+#define CMCI_STORM_INTERVAL	(1 * HZ)
+#define CMCI_STORM_THRESHOLD	5
+
+static DEFINE_PER_CPU(unsigned long, cmci_time_stamp);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_cnt);
+static DEFINE_PER_CPU(unsigned int, cmci_storm_state);
+
+enum {
+	CMCI_STORM_NONE,
+	CMCI_STORM_ACTIVE,
+	CMCI_STORM_SUBSIDED,
+};
+
+static atomic_t cmci_storm_on_cpus;
 
 static int cmci_supported(int *banks)
 {
@@ -53,6 +70,84 @@ static int cmci_supported(int *banks)
 	return !!(cap & MCG_CMCI_P);
 }
 
+void mce_intel_cmci_poll(void)
+{
+	if (__this_cpu_read(cmci_storm_state) == CMCI_STORM_NONE)
+		return;
+	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
+}
+
+void mce_intel_hcpu_update(unsigned long cpu)
+{
+	if (per_cpu(cmci_storm_state, cpu) == CMCI_STORM_ACTIVE)
+		atomic_dec(&cmci_storm_on_cpus);
+
+	per_cpu(cmci_storm_state, cpu) = CMCI_STORM_NONE;
+}
+
+unsigned long mce_intel_adjust_timer(unsigned long interval)
+{
+	if (interval < CMCI_POLL_INTERVAL)
+		return interval;
+
+	switch (__this_cpu_read(cmci_storm_state)) {
+	case CMCI_STORM_ACTIVE:
+		/*
+		 * We switch back to interrupt mode once the poll timer has
+		 * silenced itself. That means no events recorded and the
+		 * timer interval is back to our poll interval.
+		 */
+		__this_cpu_write(cmci_storm_state, CMCI_STORM_SUBSIDED);
+		atomic_dec(&cmci_storm_on_cpus);
+
+	case CMCI_STORM_SUBSIDED:
+		/*
+		 * We wait for all cpus to go back to SUBSIDED
+		 * state. When that happens we switch back to
+		 * interrupt mode.
+		 */
+		if (!atomic_read(&cmci_storm_on_cpus)) {
+			__this_cpu_write(cmci_storm_state, CMCI_STORM_NONE);
+			cmci_reenable();
+			cmci_recheck();
+		}
+		return CMCI_POLL_INTERVAL;
+	default:
+		/*
+		 * We have shiny weather, let the poll do whatever it
+		 * thinks.
+		 */
+		return interval;
+	}
+}
+
+static bool cmci_storm_detect(void)
+{
+	unsigned int cnt = __this_cpu_read(cmci_storm_cnt);
+	unsigned long ts = __this_cpu_read(cmci_time_stamp);
+	unsigned long now = jiffies;
+
+	if (__this_cpu_read(cmci_storm_state) != CMCI_STORM_NONE)
+		return true;
+
+	if (time_before_eq(now, ts + CMCI_STORM_INTERVAL)) {
+		cnt++;
+	} else {
+		cnt = 1;
+		__this_cpu_write(cmci_time_stamp, now);
+	}
+	__this_cpu_write(cmci_storm_cnt, cnt);
+
+	if (cnt <= CMCI_STORM_THRESHOLD)
+		return false;
+
+	cmci_clear();
+	__this_cpu_write(cmci_storm_state, CMCI_STORM_ACTIVE);
+	atomic_inc(&cmci_storm_on_cpus);
+	mce_timer_kick(CMCI_POLL_INTERVAL);
+	return true;
+}
+
 /*
  * The interrupt handler. This is called on every event.
  * Just call the poller directly to log any events.
@@ -61,6 +156,8 @@ static int cmci_supported(int *banks)
  */
 static void intel_threshold_interrupt(void)
 {
+	if (cmci_storm_detect())
+		return;
 	machine_check_poll(MCP_TIMESTAMP, &__get_cpu_var(mce_banks_owned));
 	mce_notify_irq();
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread