* [PATCH] perf, x86: Use extended offcore mask on Haswell
@ 2014-08-12 0:11 Andi Kleen
2014-08-12 5:45 ` Stephane Eranian
2014-08-12 5:56 ` Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Andi Kleen @ 2014-08-12 0:11 UTC (permalink / raw)
To: peterz; +Cc: linux-kernel, eranian, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
HSW-EP has a larger offcore mask than the client Haswell CPUs.
It is the same mask as on Sandy/IvyBridge-EP. All of
Haswell was using the client mask, so some bits were missing.
On the client parts some bits were also missing compared
to Sandy/IvyBridge, in particular the bits to match on an L4
cache hit.
The Haswell core in both client and server incarnations
accepts the same bits (but some are nops), so we can use
the same mask.
So use the snbep extended mask, which is a superset of the
client and the server, for all of Haswell.
This allows specifying a number of extra offcore events, for example
on HSW-EP:
% perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true
Such events were reported as <not supported> before.
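Editor's sketch (not part of the patch): the effect of widening the mask can be illustrated with a few lines of standalone C. The kernel rejects an extra-register value when it has bits set outside the table's valid mask; the check below mirrors that idea. The two mask constants are assumptions, quoted from memory of the intel_snb_extra_regs and intel_snbep_extra_regs tables of that era, and should be verified against the actual tree. All sketch_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ASSUMPTION: valid-bit masks for MSR_OFFCORE_RSP_x as recalled from the
 * 2014-era intel_snb_extra_regs (client) and intel_snbep_extra_regs (EP)
 * tables. Illustrative values only; check them against your kernel tree.
 */
#define SKETCH_SNB_OFFCORE_MASK   0x3f807f8fffULL
#define SKETCH_SNBEP_OFFCORE_MASK 0x3fffff8fffULL

/*
 * Raw config for cpu/event=E,umask=U/ on x86: the event select occupies
 * bits 0-7 and the unit mask bits 8-15.
 */
static inline uint64_t sketch_raw_config(uint8_t event, uint8_t umask)
{
	return (uint64_t)event | ((uint64_t)umask << 8);
}

/*
 * An offcore_rsp value is acceptable only if it sets no bit outside the
 * valid mask (hypothetical helper modelled on the kernel's mask check).
 */
static inline bool sketch_offcore_rsp_ok(uint64_t offcore_rsp,
					 uint64_t valid_mask)
{
	return (offcore_rsp & ~valid_mask) == 0;
}

/* True if every bit allowed by 'narrow' is also allowed by 'wide'. */
static inline bool sketch_mask_is_superset(uint64_t wide, uint64_t narrow)
{
	return (narrow & ~wide) == 0;
}
```

Under the assumed masks, sketch_offcore_rsp_ok(0x3fffc00100ULL, SKETCH_SNB_OFFCORE_MASK) is false while the same value passes against SKETCH_SNBEP_OFFCORE_MASK, which matches the <not supported> behaviour described above. sketch_raw_config(0xb7, 0x1) yields 0x01b7, the event/umask pair that selects the OFFCORE_RSP_0 extra register.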
v2: Post correct patch.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/perf_event_intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 2502d0d..4648a1b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2552,7 +2552,7 @@ __init int intel_pmu_init(void)
 
 		x86_pmu.event_constraints = intel_hsw_event_constraints;
 		x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
-		x86_pmu.extra_regs = intel_snb_extra_regs;
+		x86_pmu.extra_regs = intel_snbep_extra_regs;
 		x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
 		/* all extra regs are per-cpu when HT is on */
 		x86_pmu.er_flags |= ERF_HAS_RSP_1;
--
1.9.3
* Re: [PATCH] perf, x86: Use extended offcore mask on Haswell
2014-08-12 0:11 [PATCH] perf, x86: Use extended offcore mask on Haswell Andi Kleen
@ 2014-08-12 5:45 ` Stephane Eranian
2014-08-12 5:56 ` Peter Zijlstra
1 sibling, 0 replies; 5+ messages in thread
From: Stephane Eranian @ 2014-08-12 5:45 UTC (permalink / raw)
To: Andi Kleen; +Cc: Peter Zijlstra, LKML, Andi Kleen
On Tue, Aug 12, 2014 at 2:11 AM, Andi Kleen <andi@firstfloor.org> wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> HSW-EP has a larger offcore mask than the client Haswell CPUs.
> It is the same mask as on Sandy/IvyBridge-EP. All of
> Haswell was using the client mask, so some bits were missing.
>
> On the client parts some bits were also missing compared
> to Sandy/IvyBridge, in particular the bits to match on a L4
> cache hit.
>
> The Haswell core in both client and server incarnations
> accepts the same bits (but some are nops), so we can use
> the same mask.
>
> So use the snbep extended mask, which is a superset of the
> client and the server, for all of Haswell.
>
> This allows specifying a number of extra offcore events, like
> for example for HSW-EP.
>
> % perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true
>
> which were <not supported> before.
>
I tested this on my HSW desktop, including the bits only defined for
servers, and everything is fine.
It does simplify the code, though desktop parts always come first, so
there will always be patching.
Also, if this works on HSW/HSX, I wonder whether we could simplify the
same code for IVB/IVT and SNB/SNBEP.
Reviewed-by: Stephane Eranian <eranian@google.com>
> v2: Post correct patch.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 2502d0d..4648a1b 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -2552,7 +2552,7 @@ __init int intel_pmu_init(void)
>
>  		x86_pmu.event_constraints = intel_hsw_event_constraints;
>  		x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
> -		x86_pmu.extra_regs = intel_snb_extra_regs;
> +		x86_pmu.extra_regs = intel_snbep_extra_regs;
>  		x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
>  		/* all extra regs are per-cpu when HT is on */
>  		x86_pmu.er_flags |= ERF_HAS_RSP_1;
> --
> 1.9.3
>
* Re: [PATCH] perf, x86: Use extended offcore mask on Haswell
2014-08-12 0:11 [PATCH] perf, x86: Use extended offcore mask on Haswell Andi Kleen
2014-08-12 5:45 ` Stephane Eranian
@ 2014-08-12 5:56 ` Peter Zijlstra
2014-08-12 6:03 ` Stephane Eranian
1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2014-08-12 5:56 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, eranian, Andi Kleen
On Mon, Aug 11, 2014 at 05:11:30PM -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> HSW-EP has a larger offcore mask than the client Haswell CPUs.
> It is the same mask as on Sandy/IvyBridge-EP. All of
> Haswell was using the client mask, so some bits were missing.
>
> On the client parts some bits were also missing compared
> to Sandy/IvyBridge, in particular the bits to match on a L4
> cache hit.
>
> The Haswell core in both client and server incarnations
> accepts the same bits (but some are nops), so we can use
> the same mask.
>
> So use the snbep extended mask, which is a superset of the
> client and the server, for all of Haswell.
>
> This allows specifying a number of extra offcore events, like
> for example for HSW-EP.
>
> % perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true
>
> which were <not supported> before.
>
> v2: Post correct patch.
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
I seem to have this patch; so that means this is a repost right?
* Re: [PATCH] perf, x86: Use extended offcore mask on Haswell
2014-08-12 5:56 ` Peter Zijlstra
@ 2014-08-12 6:03 ` Stephane Eranian
0 siblings, 0 replies; 5+ messages in thread
From: Stephane Eranian @ 2014-08-12 6:03 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Andi Kleen, LKML, Andi Kleen
On Tue, Aug 12, 2014 at 7:56 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Aug 11, 2014 at 05:11:30PM -0700, Andi Kleen wrote:
>> From: Andi Kleen <ak@linux.intel.com>
>>
>> HSW-EP has a larger offcore mask than the client Haswell CPUs.
>> It is the same mask as on Sandy/IvyBridge-EP. All of
>> Haswell was using the client mask, so some bits were missing.
>>
>> On the client parts some bits were also missing compared
>> to Sandy/IvyBridge, in particular the bits to match on a L4
>> cache hit.
>>
>> The Haswell core in both client and server incarnations
>> accepts the same bits (but some are nops), so we can use
>> the same mask.
>>
>> So use the snbep extended mask, which is a superset of the
>> client and the server, for all of Haswell.
>>
>> This allows specifying a number of extra offcore events, like
>> for example for HSW-EP.
>>
>> % perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true
>>
>> which were <not supported> before.
>>
>> v2: Post correct patch.
>> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> I seem to have this patch; so that means this is a repost right?
>
I think Andi said it was a rebase. I just found time today to test it.
* [PATCH] perf, x86: Use extended offcore mask on Haswell
@ 2014-07-31 21:05 Andi Kleen
0 siblings, 0 replies; 5+ messages in thread
From: Andi Kleen @ 2014-07-31 21:05 UTC (permalink / raw)
To: peterz; +Cc: eranian, linux-kernel, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
HSW-EP has a larger offcore mask than the client Haswell CPUs.
It is the same mask as on Sandy/IvyBridge-EP. All of
Haswell was using the client mask, so some bits were missing.
On the client parts some bits were also missing compared
to Sandy/IvyBridge, in particular the bits to match on an L4
cache hit.
The Haswell core in both client and server incarnations
accepts the same bits (but some are nops), so we can use
the same mask.
So use the snbep extended mask, which is a superset of the
client and the server, for all of Haswell.
This allows specifying a number of extra offcore events, for example
on HSW-EP:
% perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x3fffc00100,name=offcore_response_pf_l3_rfo_l3_miss_any_response/ true
Such events were reported as <not supported> before.
v2: Post correct patch.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/kernel/cpu/perf_event_intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 2502d0d..4648a1b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2552,7 +2552,7 @@ __init int intel_pmu_init(void)
 
 		x86_pmu.event_constraints = intel_hsw_event_constraints;
 		x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
-		x86_pmu.extra_regs = intel_snb_extra_regs;
+		x86_pmu.extra_regs = intel_snbep_extra_regs;
 		x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
 		/* all extra regs are per-cpu when HT is on */
 		x86_pmu.er_flags |= ERF_HAS_RSP_1;
--
1.9.3