public inbox for linux-kernel@vger.kernel.org
* [PATCH] perf, x86: Make cycles:p working on SNB
@ 2012-05-24  3:02 Namhyung Kim
  2012-05-24  7:27 ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-05-24  3:02 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Paul Mackerras, LKML

The Intel SDM (Vol. 3B 18-55, Table 18-21) says that PEBS
only works for INST_RETIRED.PREC_DIST (0x01C0), and this is
already applied in intel_snb_pebs_event_constraints.
Thus the current alt_config, which has umask set to 0, cannot
work on SNB, and the output looks like this:

 namhyung@sejong:perf$ ./perf stat -e cycles:p noploop 1

 Performance counter stats for 'noploop 1':

   <not supported> cycles

       1.001116732 seconds time elapsed

After applying this patch, the event is counted properly.

Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 166546ec6aef..db8948171e22 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1329,6 +1329,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
 		 */
 		u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
 
+		/*
+		 * SNB introduced INST_RETIRED.PREC_DIST for this purpose.
+		 */
+		if (x86_pmu.pebs_constraints == intel_snb_pebs_event_constraints)
+			alt_config = X86_CONFIG(.event=0xc0, .umask=0x01,
+						.inv=1, .cmask=16);
 
 		alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
 		event->hw.config = alt_config;
-- 
1.7.10.1
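
The last two context lines of the hunk swap the alias into hw.config while keeping every bit outside the raw-event fields. A minimal standalone sketch of that step, assuming X86_RAW_EVENT_MASK covers the event, umask, edge, inv and cmask fields (0xff84ffff); RAW_EVENT_MASK, is_cycles_event(), apply_alias() and check_apply_alias() are hypothetical names for illustration, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: event (0-7) | umask (8-15) | edge (18) | inv (23) | cmask (24-31). */
#define RAW_EVENT_MASK 0xff84ffffULL

/* True when the raw-event fields select CPU_CLK_UNHALTED.THREAD_P (0x003c). */
static int is_cycles_event(uint64_t hw_config)
{
	return (hw_config & RAW_EVENT_MASK) == 0x003cULL;
}

/* Replace the raw-event fields with alt_config, keep all other bits. */
static uint64_t apply_alias(uint64_t hw_config, uint64_t alt_config)
{
	return alt_config | (hw_config & ~RAW_EVENT_MASK);
}

/* 0x13003c = cycles with the USR (16), OS (17) and INT (20) bits set. */
void check_apply_alias(void)
{
	uint64_t cfg = 0x13003cULL;

	assert(is_cycles_event(cfg));
	/* 0x108001c0 is the alias proposed above: event 0xc0, umask 0x01, inv, cmask 16. */
	assert(apply_alias(cfg, 0x108001c0ULL) == 0x109301c0ULL);
}
```

The USR/OS/INT bits survive the substitution, which is why the alias can be applied regardless of the privilege-level modifiers the user asked for.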



* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  3:02 [PATCH] perf, x86: Make cycles:p working on SNB Namhyung Kim
@ 2012-05-24  7:27 ` Peter Zijlstra
  2012-05-24  7:41   ` Stephane Eranian
  2012-05-24  8:53   ` Namhyung Kim
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2012-05-24  7:27 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML, Stephane Eranian

On Thu, 2012-05-24 at 12:02 +0900, Namhyung Kim wrote:

> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1329,6 +1329,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>  		 */
>  		u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
>  
> +		/*
> +		 * SNB introduced INST_RETIRED.PREC_DIST for this purpose.
> +		 */
> +		if (x86_pmu.pebs_constraints == intel_snb_pebs_event_constraints)
> +			alt_config = X86_CONFIG(.event=0xc0, .umask=0x01,
> +						.inv=1, .cmask=16);
>  
>  		alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
>  		event->hw.config = alt_config;

That's rather ugly.. but that's okay, I've actually got the patch for
this still laying around, it needs a bit of an update though.

Also I'm thinking you're using SNB-EP (you didn't say) since regular SNB
has PEBS disabled as per (6a600a8b).

Stephane, you could never trigger the badness on EP, but ISTR you saying
it was in fact affected by whatever Intel found? So should we mark that
as bad as well?

Also, do you happen to know if/when a u-code update would appear?

---
Subject: perf, x86: Fix cycles:pp for SandyBridge
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri, 15 Jul 2011 21:17:34 +0200

Intel SNB doesn't support INST_RETIRED as a PEBS event, so implement
the CPU_CLK_UNHALTED alias using UOPS_RETIRED in much the same fashion.

The UOPS_RETIRED thing would work for NHM,WSM,SNB, but Core2 and Atom
really need the old one, so for now only use the new one for SNB.

Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/kernel/cpu/perf_event.c       |    1 
 arch/x86/kernel/cpu/perf_event_intel.c |   68 +++++++++++++++++++++++++--------
 2 files changed, 53 insertions(+), 16 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event.c
@@ -316,6 +316,7 @@ struct x86_pmu {
 	int		pebs_record_size;
 	void		(*drain_pebs)(struct pt_regs *regs);
 	struct event_constraint *pebs_constraints;
+	void		(*pebs_aliases)(struct perf_event *event);
 
 	/*
 	 * Intel LBR
Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1241,8 +1241,30 @@ static int intel_pmu_hw_config(struct pe
 	if (ret)
 		return ret;
 
-	if (event->attr.precise_ip &&
-	    (event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
+	if (event->attr.precise_ip && x86_pmu.pebs_aliases)
+		x86_pmu.pebs_aliases(event);
+
+
+	if (event->attr.type != PERF_TYPE_RAW)
+		return 0;
+
+	if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
+		return 0;
+
+	if (x86_pmu.version < 3)
+		return -EINVAL;
+
+	if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
+
+	return 0;
+}
+
+static void intel_pebs_aliases_core2(struct perf_event *event)
+{
+	if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
 		/*
 		 * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
 		 * (0x003c) so that we can use it with PEBS.
@@ -1266,22 +1288,34 @@ static int intel_pmu_hw_config(struct pe
 		alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
 		event->hw.config = alt_config;
 	}
+}
 
-	if (event->attr.type != PERF_TYPE_RAW)
-		return 0;
-
-	if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
-		return 0;
-
-	if (x86_pmu.version < 3)
-		return -EINVAL;
-
-	if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
-		return -EACCES;
-
-	event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
+static void intel_pebs_aliases_snb(struct perf_event *event)
+{
+	if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
+		/*
+		 * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
+		 * (0x003c) so that we can use it with PEBS.
+		 *
+		 * The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
+		 * PEBS capable. However we can use UOPS_RETIRED.ALL
+		 * (0x01c2), which is a PEBS capable event, to get the same
+		 * count.
+		 *
+		 * UOPS_RETIRED.ALL counts the number of cycles that retires
+		 * CNTMASK uops. By setting CNTMASK to a value (16)
+		 * larger than the maximum number of uops that can be
+		 * retired per cycle (4) and then inverting the condition, we
+		 * count all cycles that retire 16 or less uops, which
+		 * is every cycle.
+		 *
+		 * Thereby we gain a PEBS capable cycle counter.
+		 */
+		u64 alt_config = 0x108001c2; /* UOPS_RETIRED.TOTAL_CYCLES */
 
-	return 0;
+		alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
+		event->hw.config = alt_config;
+	}
 }
 
 static __initconst const struct x86_pmu core_pmu = {
@@ -1409,6 +1443,7 @@ static __initconst const struct x86_pmu
 	.max_period		= (1ULL << 31) - 1,
 	.get_event_constraints	= intel_get_event_constraints,
 	.put_event_constraints	= intel_put_event_constraints,
+	.pebs_aliases		= intel_pebs_aliases_core2,
 
 	.cpu_prepare		= intel_pmu_cpu_prepare,
 	.cpu_starting		= intel_pmu_cpu_starting,
@@ -1597,6 +1632,7 @@ static __init int intel_pmu_init(void)
 
 		x86_pmu.event_constraints = intel_snb_event_constraints;
 		x86_pmu.pebs_constraints = intel_snb_pebs_events;
+		x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
 		x86_pmu.extra_regs = intel_snb_extra_regs;
 		/* all extra regs are per-cpu when HT is on */
 		x86_pmu.er_flags |= ERF_HAS_RSP_1;
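
The magic constant 0x108001c2 in intel_pebs_aliases_snb() can be decoded field by field. A minimal standalone sketch, assuming the usual IA32_PERFEVTSELx layout (event in bits 0-7, umask in bits 8-15, inv at bit 23, cmask in bits 24-31); the EVTSEL_* macros, make_config() and check_aliases() are hypothetical helpers for illustration, not the kernel's X86_CONFIG():

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions in the event-select register. */
#define EVTSEL_EVENT(x)	((uint64_t)(x) & 0xffULL)
#define EVTSEL_UMASK(x)	(((uint64_t)(x) & 0xffULL) << 8)
#define EVTSEL_INV	(1ULL << 23)
#define EVTSEL_CMASK(x)	(((uint64_t)(x) & 0xffULL) << 24)

/* Build a raw config from its fields. */
static uint64_t make_config(unsigned int event, unsigned int umask,
			    int inv, unsigned int cmask)
{
	uint64_t cfg = EVTSEL_EVENT(event) | EVTSEL_UMASK(umask) |
		       EVTSEL_CMASK(cmask);

	if (inv)
		cfg |= EVTSEL_INV;
	return cfg;
}

void check_aliases(void)
{
	/* SNB alias from the patch: UOPS_RETIRED.ALL (0x01c2), inv, cmask 16. */
	assert(make_config(0xc2, 0x01, 1, 16) == 0x108001c2ULL);
	/* Core2-style alias: INST_RETIRED.ANY_P (0x00c0), inv, cmask 16. */
	assert(make_config(0xc0, 0x00, 1, 16) == 0x108000c0ULL);
}
```

With inv set and cmask at 16, the counter increments on every cycle that retires fewer than 16 uops, which on a core that retires at most 4 per cycle is every cycle.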




* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:27 ` Peter Zijlstra
@ 2012-05-24  7:41   ` Stephane Eranian
  2012-05-24  7:52     ` Peter Zijlstra
  2012-05-24  8:59     ` Namhyung Kim
  2012-05-24  8:53   ` Namhyung Kim
  1 sibling, 2 replies; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  7:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 9:27 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2012-05-24 at 12:02 +0900, Namhyung Kim wrote:
>
>> --- a/arch/x86/kernel/cpu/perf_event_intel.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
>> @@ -1329,6 +1329,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>                */
>>               u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
>>
>> +             /*
>> +              * SNB introduced INST_RETIRED.PREC_DIST for this purpose.
>> +              */
>> +             if (x86_pmu.pebs_constraints == intel_snb_pebs_event_constraints)
>> +                     alt_config = X86_CONFIG(.event=0xc0, .umask=0x01,
>> +                                             .inv=1, .cmask=16);
>>
>>               alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
>>               event->hw.config = alt_config;
>
> That's rather ugly.. but that's okay, I've actually got the patch for
> this still laying around, it needs a bit of an update though.
>
You cannot simply use PREC_DIST. This umask has a severe
restriction: when you measure it, NO other event on the entire PMU
can be measured at the same time. It needs exclusive mode on SNB.

I don't buy cycles:p in general, but if you really want that what's the
problem with using uops_retired instead?


> Also I'm thinking you're using SNB-EP (you didn't say) since regular SNB
> has PEBS disabled as per (6a600a8b).
>
> Stephane, you could never trigger the badness on EP, but ISTR you saying
> it was in fact affected by whatever Intel found? So should we mark that
> as bad as well?
>
I never could but was told it was there too.

> Also, do you happen to know if/when a u-code update would appear?
>
I am hoping Intel will release the ucode update very soon now. I will
post a patch to re-enable PEBS on model 42 when that happens.

> ---
> Subject: perf, x86: Fix cycles:pp for SandyBridge
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Fri, 15 Jul 2011 21:17:34 +0200
>
> Intel SNB doesn't support INST_RETIRED as a PEBS event, so implement
> the CPU_CLK_UNHALTED alias using UOPS_RETIRED in much the same fashion.
>
> The UOPS_RETIRED thing would work for NHM,WSM,SNB, but Core2 and Atom
> really need the old one, so for now only use the new one for SNB.
>
> Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  arch/x86/kernel/cpu/perf_event.c       |    1
>  arch/x86/kernel/cpu/perf_event_intel.c |   68 +++++++++++++++++++++++++--------
>  2 files changed, 53 insertions(+), 16 deletions(-)
>
> Index: linux-2.6/arch/x86/kernel/cpu/perf_event.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event.c
> @@ -316,6 +316,7 @@ struct x86_pmu {
>        int             pebs_record_size;
>        void            (*drain_pebs)(struct pt_regs *regs);
>        struct event_constraint *pebs_constraints;
> +       void            (*pebs_aliases)(struct perf_event *event);
>
>        /*
>         * Intel LBR
> Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c
> +++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1241,8 +1241,30 @@ static int intel_pmu_hw_config(struct pe
>        if (ret)
>                return ret;
>
> -       if (event->attr.precise_ip &&
> -           (event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
> +       if (event->attr.precise_ip && x86_pmu.pebs_aliases)
> +               x86_pmu.pebs_aliases(event);
> +
> +
> +       if (event->attr.type != PERF_TYPE_RAW)
> +               return 0;
> +
> +       if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
> +               return 0;
> +
> +       if (x86_pmu.version < 3)
> +               return -EINVAL;
> +
> +       if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
> +               return -EACCES;
> +
> +       event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
> +
> +       return 0;
> +}
> +
> +static void intel_pebs_aliases_core2(struct perf_event *event)
> +{
> +       if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
>                /*
>                 * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
>                 * (0x003c) so that we can use it with PEBS.
> @@ -1266,22 +1288,34 @@ static int intel_pmu_hw_config(struct pe
>                alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
>                event->hw.config = alt_config;
>        }
> +}
>
> -       if (event->attr.type != PERF_TYPE_RAW)
> -               return 0;
> -
> -       if (!(event->attr.config & ARCH_PERFMON_EVENTSEL_ANY))
> -               return 0;
> -
> -       if (x86_pmu.version < 3)
> -               return -EINVAL;
> -
> -       if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
> -               return -EACCES;
> -
> -       event->hw.config |= ARCH_PERFMON_EVENTSEL_ANY;
> +static void intel_pebs_aliases_snb(struct perf_event *event)
> +{
> +       if ((event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
> +               /*
> +                * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
> +                * (0x003c) so that we can use it with PEBS.
> +                *
> +                * The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
> +                * PEBS capable. However we can use UOPS_RETIRED.ALL
> +                * (0x01c2), which is a PEBS capable event, to get the same
> +                * count.
> +                *
> +                * UOPS_RETIRED.ALL counts the number of cycles that retires
> +                * CNTMASK uops. By setting CNTMASK to a value (16)
> +                * larger than the maximum number of uops that can be
> +                * retired per cycle (4) and then inverting the condition, we
> +                * count all cycles that retire 16 or less uops, which
> +                * is every cycle.
> +                *
> +                * Thereby we gain a PEBS capable cycle counter.
> +                */
> +               u64 alt_config = 0x108001c2; /* UOPS_RETIRED.TOTAL_CYCLES */
>
> -       return 0;
> +               alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
> +               event->hw.config = alt_config;
> +       }
>  }
>
>  static __initconst const struct x86_pmu core_pmu = {
> @@ -1409,6 +1443,7 @@ static __initconst const struct x86_pmu
>        .max_period             = (1ULL << 31) - 1,
>        .get_event_constraints  = intel_get_event_constraints,
>        .put_event_constraints  = intel_put_event_constraints,
> +       .pebs_aliases           = intel_pebs_aliases_core2,
>
>        .cpu_prepare            = intel_pmu_cpu_prepare,
>        .cpu_starting           = intel_pmu_cpu_starting,
> @@ -1597,6 +1632,7 @@ static __init int intel_pmu_init(void)
>
>                x86_pmu.event_constraints = intel_snb_event_constraints;
>                x86_pmu.pebs_constraints = intel_snb_pebs_events;
> +               x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
>                x86_pmu.extra_regs = intel_snb_extra_regs;
>                /* all extra regs are per-cpu when HT is on */
>                x86_pmu.er_flags |= ERF_HAS_RSP_1;
>
>


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:41   ` Stephane Eranian
@ 2012-05-24  7:52     ` Peter Zijlstra
  2012-05-24  7:57       ` Peter Zijlstra
  2012-05-24  7:58       ` Stephane Eranian
  2012-05-24  8:59     ` Namhyung Kim
  1 sibling, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2012-05-24  7:52 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, 2012-05-24 at 09:41 +0200, Stephane Eranian wrote:
> So should we mark that
> > as bad as well?
> >
> I never could but was told it was there too.
> 
> > Also, do you happen to know if/when a u-code update would appear?
> >
> I am hoping Intel will release the ucode update very soon now. I will
> post a patch to re-enable PEBS on model 42 when that happens.

OK, so how about I queue the below, and if the u-code update gets
released before this hits Linus' tree we'll make it all go away ;-)

---
Subject: perf,x86: Mark PEBS as broken on SNB-EP as well

As per information from Intel.

Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 166546e..6c8e400 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1840,8 +1840,8 @@ __init int intel_pmu_init(void)
 		break;
 
 	case 42: /* SandyBridge */
-		x86_add_quirk(intel_sandybridge_quirk);
 	case 45: /* SandyBridge, "Romely-EP" */
+		x86_add_quirk(intel_sandybridge_quirk);
 		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
 		       sizeof(hw_cache_event_ids));
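
The effect of moving the quirk call below the `case 45` label comes from switch fall-through. A minimal standalone sketch of the control flow only, with a hypothetical quirk_applies() helper standing in for the real x86_add_quirk() call and a `patched` flag selecting between the two orderings:

```c
#include <assert.h>

/* Returns 1 when the quirk would be registered for the given CPU model. */
static int quirk_applies(int model, int patched)
{
	int quirk = 0;

	switch (model) {
	case 42: /* SandyBridge */
		if (!patched)
			quirk = 1;	/* old placement: model 42 only */
		/* fall through */
	case 45: /* SandyBridge-EP */
		if (patched)
			quirk = 1;	/* new placement: reached by 42 and 45 */
		break;
	default:
		break;
	}
	return quirk;
}

void check_quirk(void)
{
	assert(quirk_applies(42, 0) == 1);
	assert(quirk_applies(45, 0) == 0);
	assert(quirk_applies(42, 1) == 1);
	assert(quirk_applies(45, 1) == 1);
}
```

Before the change only model 42 picks up the quirk; afterwards model 42 falls through into the model 45 body, so both do.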
 




* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:52     ` Peter Zijlstra
@ 2012-05-24  7:57       ` Peter Zijlstra
  2012-05-24  8:00         ` Stephane Eranian
  2012-05-24  7:58       ` Stephane Eranian
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2012-05-24  7:57 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, 2012-05-24 at 09:52 +0200, Peter Zijlstra wrote:
> 
> OK, so how about I queue the below and if the u-code updates gets
> released before this hits Linus' tree we'll make it all go away ;-)
> 
Also, I suspect the u-code update might make INST_RETIRED.ANY_P
available on SNB again, voiding the need to use a different encoding. But
we'll cross that bridge when we get to it.



* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:52     ` Peter Zijlstra
  2012-05-24  7:57       ` Peter Zijlstra
@ 2012-05-24  7:58       ` Stephane Eranian
  2012-05-30 12:16         ` Peter Zijlstra
  1 sibling, 1 reply; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  7:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 9:52 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2012-05-24 at 09:41 +0200, Stephane Eranian wrote:
>> So should we mark that
>> > as bad as well?
>> >
>> I never could but was told it was there too.
>>
>> > Also, do you happen to know if/when a u-code update would appear?
>> >
>> I am hoping Intel will release the ucode update very soon now. I will
>> post a patch to re-enable PEBS on model 42 when that happens.
>
> OK, so how about I queue the below and if the u-code updates gets
> released before this hits Linus' tree we'll make it all go away ;-)
>
Let me get confirmation from Intel today before we do this.

> ---
> Subject: perf,x86: Mark PEBS as broken on SNB-EP as well
>
> As per information from Intel.
>
> Cc: Stephane Eranian <eranian@google.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 166546e..6c8e400 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1840,8 +1840,8 @@ __init int intel_pmu_init(void)
>                break;
>
>        case 42: /* SandyBridge */
> -               x86_add_quirk(intel_sandybridge_quirk);
>        case 45: /* SandyBridge, "Romely-EP" */
> +               x86_add_quirk(intel_sandybridge_quirk);
>                memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
>                       sizeof(hw_cache_event_ids));
>
>
>


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:57       ` Peter Zijlstra
@ 2012-05-24  8:00         ` Stephane Eranian
  2012-05-24  8:07           ` Peter Zijlstra
  0 siblings, 1 reply; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  8:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 9:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2012-05-24 at 09:52 +0200, Peter Zijlstra wrote:
>>
>> OK, so how about I queue the below and if the u-code updates gets
>> released before this hits Linus' tree we'll make it all go away ;-)
>>
> Also, I suspect the u-code update might make INST_RETIRED.ANY_P
> available on SNB again voiding the need to use a different encoding. But
> we'll cross that bridge when we'll get to it.

The PREC_DIST is by design, it is not a work-around of any sort.
It is an attempt at getting a better distribution of samples when sampling
with inst_retired.


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  8:00         ` Stephane Eranian
@ 2012-05-24  8:07           ` Peter Zijlstra
  2012-05-24  8:59             ` Stephane Eranian
  2012-05-24  9:01             ` Namhyung Kim
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2012-05-24  8:07 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, 2012-05-24 at 10:00 +0200, Stephane Eranian wrote:
> On Thu, May 24, 2012 at 9:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Thu, 2012-05-24 at 09:52 +0200, Peter Zijlstra wrote:
> >>
> >> OK, so how about I queue the below and if the u-code updates gets
> >> released before this hits Linus' tree we'll make it all go away ;-)
> >>
> > Also, I suspect the u-code update might make INST_RETIRED.ANY_P
> > available on SNB again voiding the need to use a different encoding. But
> > we'll cross that bridge when we'll get to it.
> 
> The PREC_DIST is by design, it is not a work-around of any sort.
> It is an attempt at getting a better distribution of samples when sampling
> with inst_retired.

Oh completely agreed, note that my patch used UOPS_RETIRED.ALL. Only
Namhyung used PREC_DIST, mostly I suspect because he's not stared at all
these manuals for long yet ;-)



* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:27 ` Peter Zijlstra
  2012-05-24  7:41   ` Stephane Eranian
@ 2012-05-24  8:53   ` Namhyung Kim
  2012-05-24  8:59     ` Stephane Eranian
  1 sibling, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-05-24  8:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras, LKML,
	Stephane Eranian

Hi, Peter

On Thu, 24 May 2012 09:27:51 +0200, Peter Zijlstra wrote:
> Also I'm thinking you're using SNB-EP (you didn't say) since regular SNB
> has PEBS disabled as per (6a600a8b).
>

I don't actually have an idea whether it's a SNB-EP or regular
one. ;-p Is Model 45 a SNB-EP?

namhyung@sejong:perf$ head /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 45
model name	: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
stepping	: 7
microcode	: 0x704
cpu MHz		: 1200.000
cache size	: 12288 KB
physical id	: 0


Thanks,
Namhyung


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  8:07           ` Peter Zijlstra
@ 2012-05-24  8:59             ` Stephane Eranian
  2012-05-24  9:01             ` Namhyung Kim
  1 sibling, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  8:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 10:07 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2012-05-24 at 10:00 +0200, Stephane Eranian wrote:
>> On Thu, May 24, 2012 at 9:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Thu, 2012-05-24 at 09:52 +0200, Peter Zijlstra wrote:
>> >>
>> >> OK, so how about I queue the below and if the u-code updates gets
>> >> released before this hits Linus' tree we'll make it all go away ;-)
>> >>
>> > Also, I suspect the u-code update might make INST_RETIRED.ANY_P
>> > available on SNB again voiding the need to use a different encoding. But
>> > we'll cross that bridge when we'll get to it.
>>
>> The PREC_DIST is by design, it is not a work-around of any sort.
>> It is an attempt at getting a better distribution of samples when sampling
>> with inst_retired.
>
> Oh completely agreed, note that my patch used UOPS_RETIRED.ALL. Only
> Namhyung used PREC_DIST, mostly I suspect because he's not stared at all
> these manuals for long yet ;-)
>
PREC_DIST is only meaningful when used with PEBS. This is where it improves
the sample distribution (compensating for some of the PEBS shadow effect).
Otherwise, you still have INST_RETIRED:ANY_P.


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:41   ` Stephane Eranian
  2012-05-24  7:52     ` Peter Zijlstra
@ 2012-05-24  8:59     ` Namhyung Kim
  2012-05-24  9:06       ` Stephane Eranian
  1 sibling, 1 reply; 16+ messages in thread
From: Namhyung Kim @ 2012-05-24  8:59 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

Hi, Stephane

On Thu, 24 May 2012 09:41:45 +0200, Stephane Eranian wrote:
> On Thu, May 24, 2012 at 9:27 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>> On Thu, 2012-05-24 at 12:02 +0900, Namhyung Kim wrote:
>>
>>> --- a/arch/x86/kernel/cpu/perf_event_intel.c
>>> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
>>> @@ -1329,6 +1329,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>>                */
>>>               u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
>>>
>>> +             /*
>>> +              * SNB introduced INST_RETIRED.PREC_DIST for this purpose.
>>> +              */
>>> +             if (x86_pmu.pebs_constraints == intel_snb_pebs_event_constraints)
>>> +                     alt_config = X86_CONFIG(.event=0xc0, .umask=0x01,
>>> +                                             .inv=1, .cmask=16);
>>>
>>>               alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
>>>               event->hw.config = alt_config;
>>
>> That's rather ugly.. but that's okay, I've actually got the patch for
>> this still laying around, it needs a bit of an update though.
>>
> You cannot simply use PREC_DIST. This umask has some severe
> restriction. When you measure it, NO other event on the entire PMU
> can be measured at the same time. It needs exclusive mode on SNB.
>

Yeah, I read something like that in the SDM, but I got confused by
this:

$ ./perf stat -e cycles:p,instructions,cache-references,cache-misses noploop 1

 Performance counter stats for 'noploop 1':

     3,741,658,837 cycles                    #    0.000 GHz                    
     3,618,983,116 instructions              #    0.97  insns per cycle        
            51,126 cache-references                                            
             7,357 cache-misses              #   14.390 % of all cache refs    

       1.000692634 seconds time elapsed


Thanks,
Namhyung


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  8:53   ` Namhyung Kim
@ 2012-05-24  8:59     ` Stephane Eranian
  0 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  8:59 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 10:53 AM, Namhyung Kim <namhyung.kim@lge.com> wrote:
> Hi, Peter
>
> On Thu, 24 May 2012 09:27:51 +0200, Peter Zijlstra wrote:
>> Also I'm thinking you're using SNB-EP (you didn't say) since regular SNB
>> has PEBS disabled as per (6a600a8b).
>>
>
> I don't actually have an idea whether it's a SNB-EP or regular
> one. ;-p Is Model 45 a SNB-EP?
>
Yes.

> namhyung@sejong:perf$ head /proc/cpuinfo
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 45
> model name      : Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
> stepping        : 7
> microcode       : 0x704
> cpu MHz         : 1200.000
> cache size      : 12288 KB
> physical id     : 0
>
>
> Thanks,
> Namhyung


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  8:07           ` Peter Zijlstra
  2012-05-24  8:59             ` Stephane Eranian
@ 2012-05-24  9:01             ` Namhyung Kim
  1 sibling, 0 replies; 16+ messages in thread
From: Namhyung Kim @ 2012-05-24  9:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Stephane Eranian, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, 24 May 2012 10:07:53 +0200, Peter Zijlstra wrote:
> On Thu, 2012-05-24 at 10:00 +0200, Stephane Eranian wrote:
>> The PREC_DIST is by design, it is not a work-around of any sort.
>> It is an attempt at getting a better distribution of samples when sampling
>> with inst_retired.
>
> Oh completely agreed, note that my patch used UOPS_RETIRED.ALL. Only
> Namhyung used PREC_DIST, mostly I suspect because he's not stared at all
> these manuals for long yet ;-)

True. :)

Thanks,
Namhyung


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  8:59     ` Namhyung Kim
@ 2012-05-24  9:06       ` Stephane Eranian
  0 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2012-05-24  9:06 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, May 24, 2012 at 10:59 AM, Namhyung Kim <namhyung.kim@lge.com> wrote:
> Hi, Stephane
>
> On Thu, 24 May 2012 09:41:45 +0200, Stephane Eranian wrote:
>> On Thu, May 24, 2012 at 9:27 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>>> On Thu, 2012-05-24 at 12:02 +0900, Namhyung Kim wrote:
>>>
>>>> --- a/arch/x86/kernel/cpu/perf_event_intel.c
>>>> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
>>>> @@ -1329,6 +1329,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>>>>                */
>>>>               u64 alt_config = X86_CONFIG(.event=0xc0, .inv=1, .cmask=16);
>>>>
>>>> +             /*
>>>> +              * SNB introduced INST_RETIRED.PREC_DIST for this purpose.
>>>> +              */
>>>> +             if (x86_pmu.pebs_constraints == intel_snb_pebs_event_constraints)
>>>> +                     alt_config = X86_CONFIG(.event=0xc0, .umask=0x01,
>>>> +                                             .inv=1, .cmask=16);
>>>>
>>>>               alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
>>>>               event->hw.config = alt_config;
>>>
>>> That's rather ugly.. but that's okay, I've actually got the patch for
>>> this still laying around, it needs a bit of an update though.
>>>
>> You cannot simply use PREC_DIST. This umask has some severe
>> restriction. When you measure it, NO other event on the entire PMU
>> can be measured at the same time. It needs exclusive mode on SNB.
>>
>
> Yeah, I read something like above on the SDM. But just got confused with
> this:
>
> $ ./perf stat -e cycles:p,instructions,cache-references,cache-misses noploop 1
>
Passing :p in counting mode is useless; it does not do anything.
The :p suffix is only meaningful in sampling mode, where it enables PEBS.

>  Performance counter stats for 'noploop 1':
>
>     3,741,658,837 cycles                    #    0.000 GHz
>     3,618,983,116 instructions              #    0.97  insns per cycle
>            51,126 cache-references
>             7,357 cache-misses              #   14.390 % of all cache refs
>
>       1.000692634 seconds time elapsed
>
>
> Thanks,
> Namhyung


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-24  7:58       ` Stephane Eranian
@ 2012-05-30 12:16         ` Peter Zijlstra
  2012-06-01  7:44           ` Stephane Eranian
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2012-05-30 12:16 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Thu, 2012-05-24 at 09:58 +0200, Stephane Eranian wrote:
> > OK, so how about I queue the below and if the u-code updates gets
> > released before this hits Linus' tree we'll make it all go away ;-)
> >
> Let me get confirmation from Intel today before we do this. 

Any word from them?


* Re: [PATCH] perf, x86: Make cycles:p working on SNB
  2012-05-30 12:16         ` Peter Zijlstra
@ 2012-06-01  7:44           ` Stephane Eranian
  0 siblings, 0 replies; 16+ messages in thread
From: Stephane Eranian @ 2012-06-01  7:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Namhyung Kim, Ingo Molnar, Namhyung Kim, Arnaldo Carvalho de Melo,
	Paul Mackerras, LKML

On Wed, May 30, 2012 at 2:16 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Thu, 2012-05-24 at 09:58 +0200, Stephane Eranian wrote:
>> > OK, so how about I queue the below and if the u-code updates gets
>> > released before this hits Linus' tree we'll make it all go away ;-)
>> >
>> Let me get confirmation from Intel today before we do this.
>
> Any word from them?

They are double-checking on this.

