From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>,
Stephane Eranian <eranian@google.com>,
Arjan van de Ven <arjan@infradead.org>,
"H. Peter Anvin" <hpa@zytor.com>,
andi.kleen@intel.com, shaohua.li@intel.com,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: perf failed with kernel 2.6.35-rc
Date: Fri, 06 Aug 2010 13:39:08 +0800 [thread overview]
Message-ID: <1281073148.2125.63.camel@ymzhang.sh.intel.com> (raw)
In-Reply-To: <1281004361.1923.1750.camel@laptop>
On Thu, 2010-08-05 at 12:32 +0200, Peter Zijlstra wrote:
> On Thu, 2010-08-05 at 14:40 +0800, Zhang, Yanmin wrote:
>
> > Here is the new patch.
I did more tracing. It seems only PMC0 overflows unexpectedly with the Errata.
> >
> > ---
> >
> > --- linux-2.6.35/arch/x86/kernel/cpu/perf_event_intel.c 2010-08-02 06:11:14.000000000 +0800
> > +++ linux-2.6.35_perferrata/arch/x86/kernel/cpu/perf_event_intel.c 2010-08-06 13:43:34.000000000 +0800
> > @@ -499,21 +499,40 @@ static void intel_pmu_nhm_enable_all(int
> > {
> > if (added) {
> > struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
> > + struct perf_event *event;
> > + int num_counters;
> > int i;
> >
> > - wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 0, 0x4300D2);
> > - wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
> > - wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
> > + /* We always operate 4 pairs of PERF Counters */
> > + num_counters = 4;
>
> Either hardcode 4 (its a model specific errata after all), or use
> x86_pmu.num_counters.
Ok.
>
> Also, please update the comment describing the new work-around.
I will add detailed steps.
>
> > + for (i = 0; i < num_counters; i++) {
> > + event = cpuc->events[i];
> > + if (!event)
> > + continue;
> > + x86_perf_event_update(event);
> > + }
> >
> > - wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> > + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 0, 0x4300B5);
> > + wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + 0, 0x0);
> > + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300D2);
> > + wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + 1, 0x0);
> > + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B1);
> > + wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + 2, 0x0);
> > + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 3, 0x4300B1);
> > + wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + 3, 0x0);
> > +
>
> Maybe write that as:
>
> static const unsigned long magic[4] = { 0x4300B5, 0x4300D2, 0x4300B1, 0x4300B1 };
>
> for (i = 0; i < 4; i++) {
> wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, magic[i]);
> wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + i, 0);
> }
>
> > + wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0xf);
> > wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
> >
> > - for (i = 0; i < 3; i++) {
> > - struct perf_event *event = cpuc->events[i];
> > + for (i = 0; i < num_counters; i++) {
> > + event = cpuc->events[i];
> >
> > - if (!event)
> > + if (!event) {
> > + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, 0x0);
>
> Indeed, if they are counting we need to at least stop them, and clearing
> them is indeed the simplest way.
>
> > continue;
> > + }
> >
> > + x86_perf_event_set_period(event);
> > __x86_pmu_enable_event(&event->hw,
> > ARCH_PERFMON_EVENTSEL_ENABLE);
> > }
>
> Please send the updated patch for inclusion, Thanks!
Below is the patch against 2.6.35.
Fix Errata AAK100/AAP53/BD53.
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
---
--- linux-2.6.35/arch/x86/kernel/cpu/perf_event_intel.c 2010-08-02 06:11:14.000000000 +0800
+++ linux-2.6.35_perferrata/arch/x86/kernel/cpu/perf_event_intel.c 2010-08-07 10:06:03.000000000 +0800
@@ -492,32 +492,73 @@ static void intel_pmu_enable_all(int add
* Intel Errata BD53 (model 44)
*
* These chips need to be 'reset' when adding counters by programming
- * the magic three (non counting) events 0x4300D2, 0x4300B1 and 0x4300B5
+ * the magic three (non counting) events 0x4300B5, 0x4300D2, and 0x4300B1
* either in sequence on the same PMC or on different PMCs.
*/
-static void intel_pmu_nhm_enable_all(int added)
+static void intel_pmu_nhm_workaround(void)
{
- if (added) {
- struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
- int i;
-
- wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 0, 0x4300D2);
- wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
- wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ static const unsigned long nhm_magic[4] = {
+ 0x4300B5,
+ 0x4300D2,
+ 0x4300B1,
+ 0x4300B1
+ };
+ struct perf_event *event;
+ int i;
+
+ /*
+ * The Errata requires below steps:
+ * 1) Clear MSR_IA32_PEBS_ENABLE and MSR_CORE_PERF_GLOBAL_CTRL;
+ * 2) Configure 4 PERFEVTSELx with the magic number and clear
+ * the corresponding PMCx;
+ * 3) set bit0~bit3 of MSR_CORE_PERF_GLOBAL_CTRL;
+ * 4) Clear MSR_CORE_PERF_GLOBAL_CTRL;
+ * 5) Clear 4 pairs of ERFEVTSELx and PMCx;
+ */
- wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
- wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
+ /*
+ * The real steps we choose are a little different from above.
+ * A) To reduce MSR operations, we don't run step 1) as they
+ * are already cleared before this function is called;
+ * B) Call x86_perf_event_update to save PMCx before configuring
+ * PERFEVTSELx with magic number;
+ * C) With step 5), we do clear only when the PERFEVTSELx is
+ * not used currently.
+ * D) Call x86_perf_event_set_period to restore PMCx;
+ */
- for (i = 0; i < 3; i++) {
- struct perf_event *event = cpuc->events[i];
+ /* We always operate 4 pairs of PERF Counters */
+ for (i = 0; i < 4; i++) {
+ event = cpuc->events[i];
+ if (event)
+ x86_perf_event_update(event);
+ }
- if (!event)
- continue;
+ for (i = 0; i < 4; i++) {
+ wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, nhm_magic[i]);
+ wrmsrl(MSR_ARCH_PERFMON_PERFCTR0 + i, 0x0);
+ }
+ wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0xf);
+ wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
+
+ for (i = 0; i < 4; i++) {
+ event = cpuc->events[i];
+
+ if (event) {
+ x86_perf_event_set_period(event);
__x86_pmu_enable_event(&event->hw,
- ARCH_PERFMON_EVENTSEL_ENABLE);
- }
+ ARCH_PERFMON_EVENTSEL_ENABLE);
+ } else
+ wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + i, 0x0);
}
+}
+
+static void intel_pmu_nhm_enable_all(int added)
+{
+ if (added)
+ intel_pmu_nhm_workaround();
intel_pmu_enable_all(added);
}
next prev parent reply other threads:[~2010-08-06 5:36 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-13 8:14 perf failed with kernel 2.6.35-rc Zhang, Yanmin
2010-07-13 8:50 ` Ingo Molnar
2010-07-14 0:49 ` Zhang, Yanmin
2010-07-14 8:15 ` Zhang, Yanmin
2010-07-13 15:16 ` Stephane Eranian
2010-07-14 0:13 ` Zhang, Yanmin
2010-07-14 0:36 ` Stephane Eranian
2010-07-14 2:22 ` Zhang, Yanmin
2010-08-03 12:20 ` Peter Zijlstra
2010-08-04 15:48 ` Stephane Eranian
[not found] ` <1280886349.2125.32.camel@ymzhang.sh.intel.com>
[not found] ` <1280905701.1923.717.camel@laptop>
[not found] ` <1280990413.2125.50.camel@ymzhang.sh.intel.com>
[not found] ` <1281004361.1923.1750.camel@laptop>
2010-08-06 5:39 ` Zhang, Yanmin [this message]
2010-08-18 10:27 ` [tip:perf/urgent] perf, x86: Fix Intel-nhm PMU programming errata workaround tip-bot for Zhang, Yanmin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1281073148.2125.63.camel@ymzhang.sh.intel.com \
--to=yanmin_zhang@linux.intel.com \
--cc=andi.kleen@intel.com \
--cc=arjan@infradead.org \
--cc=eranian@google.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=shaohua.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.