From: Ingo Molnar <mingo@kernel.org>
To: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] perf: Check all MSRs before passing hw check
Date: Mon, 18 Mar 2013 11:53:20 +0100
Message-ID: <20130318105320.GA28486@gmail.com>
In-Reply-To: <5146EF30.2040306@eu.citrix.com>


* George Dunlap <george.dunlap@eu.citrix.com> wrote:

> On 18/03/13 08:42, Ingo Molnar wrote:
> >* George Dunlap <george.dunlap@eu.citrix.com> wrote:
> >
> >>check_hw_exists has a number of checks which go to two exit paths:
> >>msr_fail and bios_fail.  Checks classified as msr_fail will cause
> >>check_hw_exists() to return false, causing the PMU not to be used;
> >>bios_fail checks will only cause a warning to be printed, but will
> >>return true.
> >>
> >>The problem is that if there are both msr failures and bios failures,
> >>and the routine hits a bios_fail check first, it will exit early and
> >>return true, not finishing the rest of the msr checks.  If those msrs
> >>are in fact broken, it will cause them to be used erroneously.
> >>
> >>This changeset causes check_hw_exists() to go through all of the msr
> >>checks, failing and returning false if any of them fail.
> >>
> >>This problem affects kernels as far back as 3.2, and should thus be
> >>considered for backport.
> >>
> >>Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
> >>CC: Konrad Wilk <konrad.wilk@oracle.com>
> >>CC: Thomas Gleixner <tglx@linutronix.de>
> >>CC: "H. Peter Anvin" <hpa@zytor.com>
> >>CC: x86@kernel.org
> >>---
> >>  arch/x86/kernel/cpu/perf_event.c |   20 ++++++++++----------
> >>  1 file changed, 10 insertions(+), 10 deletions(-)
> >What is missing is a description of what specific platform this gets
> >triggered on and exactly why. Is some hw feature emulation missing that
> >causes the check to fail?
> 
> Remember, there are two checks failing: the second one is supposed
> to fail and disable the PMU entirely, but it's not getting there
> because when the first one fails, it skips the rest but returns
> "success" anyway.
> 
> The warning on the first check is as follows:
> 
> [    8.131985] Performance Events: Broken BIOS detected, complain to your hardware vendor.
> [    8.139997] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010000 is 530076)
> 
> c0010000 is the AMD MSR_K7_EVNTSEL0; the check it's failing is:
>   if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> 
> So it discovers that one of the performance counters is already
> enabled -- worth a warning, but by itself not worth disabling the
> PMU.  This is most likely to be exactly what the warning message
> says: a buggy BIOS that leaves perfcounters enabled for some
> reason.
> 
> The second check is supposed to detect that the PMU is actually not
> usable -- in my case because it's running virtualized (under Xen).
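
The enable-bit test quoted above can be checked by hand against the value
in the log. A minimal user-space sketch, assuming only that
ARCH_PERFMON_EVENTSEL_ENABLE is bit 22 of the event-select register, as in
the kernel headers; the macro name EVENTSEL_ENABLE and the helper
eventsel_enabled() are made up for illustration and are not kernel symbols:

```c
#include <stdint.h>

/* Assumption: mirrors ARCH_PERFMON_EVENTSEL_ENABLE (bit 22) from the
 * kernel headers. The name below is local to this sketch. */
#define EVENTSEL_ENABLE (1ULL << 22)

/* Hypothetical helper: returns 1 if the event-select value has the
 * counter-enable bit set, 0 otherwise. */
static int eventsel_enabled(uint64_t val)
{
	return (val & EVENTSEL_ENABLE) != 0;
}
```

Called with 0x530076, the value reported for MSR c0010000, the helper
returns 1, which is consistent with the "Broken BIOS" warning.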

I got the logic from your original description - what I wanted was for the 
specific messages to be included in the patch changelog, plus a 
description of what misbehaved before the patch and what behaves better 
after the patch - on your specific system.

In other words, please use the customary changelog style we use in the 
kernel:

  " Current code does (A), this has a problem when (B).
    We can improve this doing (C), because (D)."

Thanks,

        Ingo
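
The control-flow problem described in the quoted patch description can be
sketched in plain C. This is a simplified model under stated assumptions,
not the code in arch/x86/kernel/cpu/perf_event.c: struct probe and both
function names are hypothetical, and each array entry stands in for one of
the checks that check_hw_exists() performs in sequence.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of one check; not a kernel structure. */
struct probe {
	bool bios_enabled;	/* counter already enabled at boot: warn only */
	bool msr_broken;	/* MSR unusable: the PMU must not be used */
};

/* Models the behaviour before the patch: the first BIOS problem ends
 * the scan and reports success, so later MSR failures are never seen. */
static bool check_early_exit(const struct probe *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (p[i].msr_broken)
			return false;
		if (p[i].bios_enabled)
			return true;	/* warn, skip the remaining checks */
	}
	return true;
}

/* Models the behaviour after the patch: a BIOS problem is remembered
 * for the warning, but every remaining check still runs. */
static bool check_all(const struct probe *p, size_t n, bool *bios_warn)
{
	*bios_warn = false;
	for (size_t i = 0; i < n; i++) {
		if (p[i].msr_broken)
			return false;
		if (p[i].bios_enabled)
			*bios_warn = true;
	}
	return true;
}
```

With a BIOS problem in the first entry and a broken MSR in the second,
which is the situation George describes under Xen, check_early_exit()
returns true while check_all() returns false.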

Thread overview: 8+ messages
2013-03-15 12:20 [PATCH] perf: Check all MSRs before passing hw check George Dunlap
2013-03-15 12:50 ` Jan Beulich
2013-03-15 14:43   ` George Dunlap
2013-03-15 15:25     ` Jan Beulich
2013-03-18  8:42 ` Ingo Molnar
2013-03-18 10:40   ` George Dunlap
2013-03-18 10:53     ` Ingo Molnar [this message]
2013-03-18 10:55       ` George Dunlap
