From: George Dunlap <george.dunlap@eu.citrix.com>
To: Don Zickus <dzickus@redhat.com>, Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
    <konrad.wilk@oracle.com>
Subject: Re: [PATCH] x86, perf: Tweak broken BIOS rules during check_hw_exists
Date: Tue, 2 Jun 2015 16:15:21 +0100
Message-ID: <556DC889.2030403@eu.citrix.com>
In-Reply-To: <555E1C6D.6040508@eu.citrix.com>

On 05/21/2015 06:57 PM, George Dunlap wrote:
> On 05/18/2015 08:16 PM, Don Zickus wrote:
>> I stumbled upon an AMD box that had the BIOS using a hardware counter. Instead
>> of printing out a warning and continuing, it failed and blocked further perf
>> counter usage.
>>
>> Looking through the history, I found that commit a5ebe0ba3dff had tweaked
>> the rules for a Xen guest on an almost identical box, which changed the
>> behaviour.
>>
>> Unfortunately the rules were tweaked incorrectly and will always lead to msr
>> failures even though the msrs are completely fine.
>>
>> What happens now is in arch/x86/kernel/cpu/perf_event.c::check_hw_exists:
>>
>> <snip>
>> 	for (i = 0; i < x86_pmu.num_counters; i++) {
>> 		reg = x86_pmu_config_addr(i);
>> 		ret = rdmsrl_safe(reg, &val);
>> 		if (ret)
>> 			goto msr_fail;
>> 		if (val & ARCH_PERFMON_EVENTSEL_ENABLE) {
>> 			bios_fail = 1;
>> 			val_fail = val;
>> 			reg_fail = reg;
>> 		}
>> 	}
>>
>> <snip>
>> 	/*
>> 	 * Read the current value, change it and read it back to see if it
>> 	 * matches, this is needed to detect certain hardware emulators
>> 	 * (qemu/kvm) that don't trap on the MSR access and always return 0s.
>> 	 */
>> 	reg = x86_pmu_event_addr(0);
>> 	                        ^^^^
>>
>> If the first perf counter is enabled, then this routine will always fail
>> because the counter is running and its value changes between the write
>> and the read-back. :-(
>>
>> 	if (rdmsrl_safe(reg, &val))
>> 		goto msr_fail;
>> 	val ^= 0xffffUL;
>> 	ret = wrmsrl_safe(reg, val);
>> 	ret |= rdmsrl_safe(reg, &val_new);
>> 	if (ret || val != val_new)
>> 		goto msr_fail;
>>
>> The above bios_fail used to be a 'goto' which is why it worked in the past.
>>
>> Further, most vendors have migrated to using fixed counters to hide their
>> evilness, hence this problem rarely shows up nowadays except on a few old
>> boxes.
>>
>> I fixed my problem and kept the spirit of the original Xen fix by recording
>> a non-enabled register that can be used safely for the read/write check.
>> Because it is not enabled, this passes on bare metal boxes (like mine), but
>> should continue to throw an msr_fail on Xen guests because the register isn't
>> emulated yet.
>>
>> Now I get a proper bios_fail error message and Xen should still see their
>> msr_fail message (untested).
>>
>> Signed-off-by: Don Zickus <dzickus@redhat.com>
>
> Right -- so what was actually broken was the "does this register work"
> check, which needs a non-enabled register.
>
> Would it make sense to add a comment somewhere in the code saying that
> you need a disabled event counter for the MSR check to work properly?
> It's sort of implied but it's not explicit.
>
> Other than that, this looks good to me. I'm not positive I have access
> to the box I needed this for anymore -- I'll take a look for it next week.
>
> In the meantime:
>
> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>

I managed to track down the machine that had the problem and verify that
things still work for me after this patch. So now you can add:

Tested-by: George Dunlap <george.dunlap@eu.citrix.com>

Thanks,
-George
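
For anyone reading this thread later, the idea discussed above (run the
read/write-back check on a counter whose event select is not enabled) can be
sketched in user space roughly as follows. This is only a model under stated
assumptions, not the patch itself: struct fake_pmu, fake_read_counter(),
fake_write_counter(), find_safe_counter() and probe_counter() are hypothetical
names invented for illustration, and an enabled counter is modelled as simply
advancing on every read. Only the enable bit (bit 22, the kernel's
ARCH_PERFMON_EVENTSEL_ENABLE) is taken from the real hardware definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Enable bit of an event-select MSR (bit 22). */
#define EVENTSEL_ENABLE (1ULL << 22)
#define MAX_COUNTERS 8

/* Hypothetical stand-in for the PMU: one event select and one counter per slot. */
struct fake_pmu {
	size_t num_counters;
	uint64_t eventsel[MAX_COUNTERS];
	uint64_t counter[MAX_COUNTERS];
};

/* Assumption: an enabled counter keeps ticking, so each read sees a new value. */
static uint64_t fake_read_counter(struct fake_pmu *pmu, size_t i)
{
	if (pmu->eventsel[i] & EVENTSEL_ENABLE)
		pmu->counter[i] += 1000;
	return pmu->counter[i];
}

static void fake_write_counter(struct fake_pmu *pmu, size_t i, uint64_t val)
{
	pmu->counter[i] = val;
}

/*
 * Scan all event selects. Note whether any counter is already enabled
 * (the "BIOS is using a counter" case) and return the index of the first
 * counter that is not enabled, or -1 if there is none.
 */
static int find_safe_counter(const struct fake_pmu *pmu, int *bios_claimed)
{
	int safe = -1;

	assert(pmu->num_counters <= MAX_COUNTERS);
	*bios_claimed = 0;
	for (size_t i = 0; i < pmu->num_counters; i++) {
		if (pmu->eventsel[i] & EVENTSEL_ENABLE)
			*bios_claimed = 1;
		else if (safe < 0)
			safe = (int)i;
	}
	return safe;
}

/* Read, flip the low 16 bits, write, read back. Returns 1 if the write stuck. */
static int probe_counter(struct fake_pmu *pmu, size_t i)
{
	uint64_t val = fake_read_counter(pmu, i) ^ 0xffffULL;

	fake_write_counter(pmu, i, val);
	return fake_read_counter(pmu, i) == val;
}
```

In this model, probing an enabled counter fails because the value has moved
by the time it is read back, while probing the counter returned by
find_safe_counter() succeeds, so a BIOS-claimed counter can be reported
separately without tripping the MSR check.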