From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Egger, Christoph"
Subject: Re: [PATCH] xen/mce: Don't spam the console with "CPUx: Temperature z" (v2)
Date: Mon, 16 Jun 2014 10:31:08 +0200
Message-ID: <539EAB4C.9090800@amazon.de>
References: <1402682970-24970-1-git-send-email-konrad.wilk@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
 by lists.xen.org with esmtp (Exim 4.72) id 1WwSQs-00057L-QV
 for xen-devel@lists.xenproject.org; Mon, 16 Jun 2014 08:38:27 +0000
In-Reply-To: <1402682970-24970-1-git-send-email-konrad.wilk@oracle.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk, xen-devel@lists.xenproject.org,
 jinsong.liu@alibaba-inc.com, keir@xen.org, jbeulich@suse.com
List-Id: xen-devel@lists.xenproject.org

On 13.06.14 20:09, Konrad Rzeszutek Wilk wrote:
> If the machine has been quite busy it ends up with these
> messages printed on the hypervisor console:
>
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU0: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU0: Temperature above threshold
> (XEN) CPU0: Running in modulated clock mode
> (XEN) CPU1: Temperature/speed normal
> (XEN) CPU2: Temperature/speed normal
> (XEN) CPU3: Temperature/speed normal
>
> While the state changes are important, the non-altered
> state information is not needed. As such add a latch
> mechanism to only print the information if it has
> changed since the last update.
>
> This was observed on Intel DQ67SW,
> BIOS SWQ6710H.86A.0066.2012.1105.1504 11/05/2012
>
> CC: Jan Beulich
> CC: Keir Fraser
> Signed-off-by: Konrad Rzeszutek Wilk

Acked-by: Christoph Egger

> ---
> [v2: Redo per Daniel and Boris's review]
> [v3: Use per_cpu instead of __get_cpu_var per Andrew's review]
> ---
>  xen/arch/x86/cpu/mcheck/mce_intel.c | 19 ++++++++++++++-----
>  1 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
> index ad06efc..bb4ce47 100644
> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> @@ -49,11 +49,15 @@ static int __read_mostly nr_intel_ext_msrs;
>  #define INTEL_SRAR_INSTR_FETCH 0x150
>
>  #ifdef CONFIG_X86_MCE_THERMAL
> +#define MCE_RING 0x1
> +static DEFINE_PER_CPU(int, last_state);
> +
>  static void intel_thermal_interrupt(struct cpu_user_regs *regs)
>  {
>      uint64_t msr_content;
>      unsigned int cpu = smp_processor_id();
>      static DEFINE_PER_CPU(s_time_t, next);
> +    int *this_last_state;
>
>      ack_APIC_irq();
>
> @@ -62,13 +66,17 @@ static void intel_thermal_interrupt(struct cpu_user_regs *regs)
>
>      per_cpu(next, cpu) = NOW() + MILLISECS(5000);
>      rdmsrl(MSR_IA32_THERM_STATUS, msr_content);
> -    if (msr_content & 0x1) {
> -        printk(KERN_EMERG "CPU%d: Temperature above threshold\n", cpu);
> -        printk(KERN_EMERG "CPU%d: Running in modulated clock mode\n",
> -                cpu);
> +    this_last_state = &per_cpu(last_state, cpu);
> +    if ( *this_last_state == (msr_content & MCE_RING) )
> +        return;
> +    *this_last_state = msr_content & MCE_RING;
> +    if ( msr_content & MCE_RING )
> +    {
> +        printk(KERN_EMERG "CPU%u: Temperature above threshold\n", cpu);
> +        printk(KERN_EMERG "CPU%u: Running in modulated clock mode\n", cpu);
>          add_taint(TAINT_MACHINE_CHECK);
>      } else {
> -        printk(KERN_INFO "CPU%d: Temperature/speed normal\n", cpu);
> +        printk(KERN_INFO "CPU%u: Temperature/speed normal\n", cpu);
>      }
>  }
>
> @@ -802,6 +810,7 @@ static int cpu_mcabank_alloc(unsigned int cpu)
>
>      per_cpu(no_cmci_banks, cpu) = cmci;
>      per_cpu(mce_banks_owned, cpu) = owned;
> +    per_cpu(last_state, cpu) = -1;
>
>      return 0;
>  out:
>
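
For illustration only, and not part of the patch: the latch described in the commit message can be sketched outside the hypervisor as plain C. This is a minimal sketch under stated assumptions. The names demo_last_state, demo_report_thermal, DEMO_NR_CPUS and DEMO_THERM_BIT are hypothetical, a plain array stands in for Xen's per-CPU variables, and printf stands in for printk. As in the patch, the latch starts at -1 so the first reading on each CPU is always reported.

```c
#include <stdint.h>
#include <stdio.h>

#define DEMO_NR_CPUS   4      /* hypothetical CPU count for this sketch */
#define DEMO_THERM_BIT 0x1u   /* bit 0 of the status word: above threshold */

/* One latch per CPU; -1 means "nothing reported yet" (assumed stand-in
 * for the per-CPU last_state variable in the patch). */
static int demo_last_state[DEMO_NR_CPUS] = { -1, -1, -1, -1 };

/*
 * Print the thermal state of a CPU only when it differs from the state
 * that was last reported for that CPU.
 * Returns 1 if a message was printed, 0 if it was suppressed.
 */
static int demo_report_thermal(unsigned int cpu, uint64_t status)
{
    int state = (int)(status & DEMO_THERM_BIT);

    if (cpu >= DEMO_NR_CPUS)
        return 0;               /* out of range: nothing to latch */

    if (demo_last_state[cpu] == state)
        return 0;               /* unchanged since last report: stay quiet */

    demo_last_state[cpu] = state;   /* latch the new state */

    if (state)
        printf("CPU%u: Temperature above threshold\n", cpu);
    else
        printf("CPU%u: Temperature/speed normal\n", cpu);

    return 1;
}
```

Calling demo_report_thermal(0, 0) twice in a row prints the "normal" message once and suppresses the repeat; a following call with bit 0 set prints the "above threshold" message, because the latched state changed.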