public inbox for linux-kernel@vger.kernel.org
* [PATCH v2] perf: honoring the architectural performance monitoring version
@ 2015-06-05 15:28 Imre Palik
  2015-06-05 15:42 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Imre Palik @ 2015-06-05 15:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo,
	Thomas Gleixner, H. Peter Anvin, x86, linux-kernel, Palik, Imre,
	Anthony Liguori

From: "Palik, Imre" <imrep@amazon.de>

Architectural performance monitoring version 1 doesn't support fixed
counters.  Currently, even if a hypervisor advertises support for
architectural performance monitoring version 1, perf may still try to use
the fixed counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance
monitoring version returned by CPUID, and it only uses the fixed counters
for version two and above.

Some of the ideas in this patch come from Peter Zijlstra.

Signed-off-by: Imre Palik <imrep@amazon.de>
Cc: Anthony Liguori <aliguori@amazon.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 3998131..bde66aa 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1870,7 +1870,7 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 		for_each_event_constraint(c, x86_pmu.event_constraints) {
 			if ((event->hw.config & c->cmask) == c->code) {
 				event->hw.flags |= c->flags;
-				return c;
+				return  c->idxmsk64 ? c : NULL;
 			}
 		}
 	}
@@ -3341,9 +3341,12 @@ __init int intel_pmu_init(void)
 		for_each_event_constraint(c, x86_pmu.event_constraints) {
 			if (c->cmask != FIXED_EVENT_FLAGS
 			    || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
+				c->idxmsk64 &=
+					~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
 				continue;
 			}
-
+			c->idxmsk64 &=
+				~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
 			c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
 			c->weight += x86_pmu.num_counters;
 		}
-- 
1.7.9.5
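
For readers who want the background to the commit message: CPUID leaf 0AH reports the architectural performance monitoring version in EAX[7:0], the number of general-purpose counters in EAX[15:8], and the number of fixed-function counters in EDX[4:0]. The following is a minimal sketch of decoding those fields and applying the "fixed counters only from version two" rule. It is not kernel code: the struct and function names are hypothetical, and the register values are passed in as arguments rather than read from hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the CPUID leaf 0AH fields used here. */
struct arch_perfmon_info {
	unsigned int version;            /* EAX[7:0]  */
	unsigned int num_counters;       /* EAX[15:8] */
	unsigned int num_counters_fixed; /* EDX[4:0], forced to 0 below version 2 */
};

/* Decode raw EAX/EDX values of CPUID leaf 0AH (values supplied by the caller). */
static struct arch_perfmon_info decode_cpuid_0a(uint32_t eax, uint32_t edx)
{
	struct arch_perfmon_info info;

	info.version = eax & 0xff;
	info.num_counters = (eax >> 8) & 0xff;
	/* Fixed counters are only defined from version 2 onwards. */
	info.num_counters_fixed = info.version >= 2 ? (edx & 0x1f) : 0;

	return info;
}

/* Returns nonzero if fixed counter number idx may be used. */
static int fixed_counter_usable(const struct arch_perfmon_info *info,
				unsigned int idx)
{
	assert(info != NULL);
	return idx < info->num_counters_fixed;
}
```

With this decoding, a guest that sees version 1 ends up with zero usable fixed counters no matter what the CPU model would otherwise suggest.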



* Re: [PATCH v2] perf: honoring the architectural performance monitoring version
  2015-06-05 15:28 [PATCH v2] perf: honoring the architectural performance monitoring version Imre Palik
@ 2015-06-05 15:42 ` Peter Zijlstra
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2015-06-05 15:42 UTC (permalink / raw)
  To: Imre Palik
  Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo,
	Thomas Gleixner, H. Peter Anvin, x86, linux-kernel, Palik, Imre,
	Anthony Liguori

On Fri, Jun 05, 2015 at 05:28:45PM +0200, Imre Palik wrote:
> From: "Palik, Imre" <imrep@amazon.de>
> 
> Architectural performance monitoring version 1 doesn't support fixed
> counters.  Currently, even if a hypervisor advertises support for
> architectural performance monitoring version 1, perf may still try to use
> the fixed counters, as the constraints are set up based on the CPU model.
> 
> This patch ensures that perf honors the architectural performance
> monitoring version returned by CPUID, and it only uses the fixed counters
> for version two and above.
> 
> Some of the ideas in this patch are coming from Peter Zijlstra.
> 
> Signed-off-by: Imre Palik <imrep@amazon.de>
> Cc: Anthony Liguori <aliguori@amazon.com>
> ---
>  arch/x86/kernel/cpu/perf_event_intel.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 3998131..bde66aa 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1870,7 +1870,7 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>  		for_each_event_constraint(c, x86_pmu.event_constraints) {
>  			if ((event->hw.config & c->cmask) == c->code) {
>  				event->hw.flags |= c->flags;
> -				return c;
> +				return  c->idxmsk64 ? c : NULL;

One space too many there :-) Returning c as found, even with an empty
idxmsk, is fine.

Also, I think this is broken: we hard-assume that
x86_get_event_constraints() returns a valid constraint. See, for example:

	x86_schedule_events():

		c = x86_pmu.get_event_constraints()
			= intel_get_event_constraints()
				 = __intel_get_event_constraints()
					 = x86_get_event_constraints();

		cpuc->event_constraint[i] = c;

		...

		c = cpuc->event_constraint[i];

		if (!test_bit(hwc->idx, c->idxmsk)) <-- *boom*
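
To make the point about the call chain concrete, here is a minimal user-space sketch of the two behaviours: a lookup that returns the matching constraint as found (even when its mask is empty), and a caller that tests a bit in that mask without a NULL check. The struct is a simplified stand-in for the kernel's struct event_constraint, and find_constraint(), counter_allowed(), and the explicit fallback parameter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct event_constraint. */
struct constraint {
	uint64_t idxmsk64; /* bit n set: counter n may host the event */
	uint64_t code;
	uint64_t cmask;
};

/*
 * Return the first constraint matching config, as found, even if its
 * mask is empty. Falls back to a caller-supplied constraint, so the
 * result is never NULL.
 */
static const struct constraint *
find_constraint(const struct constraint *table, size_t n,
		const struct constraint *fallback, uint64_t config)
{
	assert(fallback != NULL);

	for (size_t i = 0; i < n; i++) {
		if ((config & table[i].cmask) == table[i].code)
			return &table[i];
	}
	return fallback;
}

/* Mirrors the test_bit() check; relies on c being a valid pointer. */
static int counter_allowed(const struct constraint *c, unsigned int idx)
{
	assert(c != NULL && idx < 64);
	return (int)((c->idxmsk64 >> idx) & 1);
}
```

An empty mask simply makes counter_allowed() report 0 for every index, so the event cannot be placed on any counter, whereas a NULL return would be dereferenced by a caller written like the one above.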


> @@ -3341,9 +3341,12 @@ __init int intel_pmu_init(void)
>  		for_each_event_constraint(c, x86_pmu.event_constraints) {
>  			if (c->cmask != FIXED_EVENT_FLAGS
>  			    || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
> +				c->idxmsk64 &=
> +					~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));

If you change idxmsk64 you also need to update weight.

>  				continue;
>  			}
> -
> +			c->idxmsk64 &=
> +				~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
>  			c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
>  			c->weight += x86_pmu.num_counters;

And since we're no longer unconditionally adding num_counters bits, that
weight update is broken.

For both sites, something like:

		c->weight = hweight64(c->idxmsk64);

will recompute the weight.

Thanks!
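
The weight problem can be reproduced in a few lines of user-space C. The sketch below is an illustration under stated assumptions, not kernel code: popcount64() stands in for hweight64(), the two-field struct is a cut-down constraint, and clip_incremental() and clip_recomputed() are hypothetical helpers. The keep argument is the mask of counters that remain usable.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down constraint: only the fields needed for the weight bookkeeping. */
struct constraint {
	uint64_t idxmsk64;
	int weight;
};

/* Portable population count, standing in for the kernel's hweight64(). */
static int popcount64(uint64_t v)
{
	int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Bookkeeping as in the reviewed hunk: clip, extend, then add to weight. */
static void clip_incremental(struct constraint *c, uint64_t keep,
			     unsigned int num_counters)
{
	assert(num_counters < 64);
	c->idxmsk64 &= keep;
	c->idxmsk64 |= (UINT64_C(1) << num_counters) - 1;
	c->weight += (int)num_counters; /* stale if clipping removed bits */
}

/* Same mask update, but the weight is recomputed from the final mask. */
static void clip_recomputed(struct constraint *c, uint64_t keep,
			    unsigned int num_counters)
{
	assert(num_counters < 64);
	c->idxmsk64 &= keep;
	c->idxmsk64 |= (UINT64_C(1) << num_counters) - 1;
	c->weight = popcount64(c->idxmsk64);
}
```

Starting from a constraint that names only fixed counter 0 (bit 32, weight 1), with four generic counters and no fixed counters kept, both helpers produce the mask 0xf, but the incremental version reports weight 5 while the recomputed one reports 4.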


* [PATCH v2] perf: honoring the architectural performance monitoring version
@ 2015-06-08 12:46 Imre Palik
  0 siblings, 0 replies; 3+ messages in thread
From: Imre Palik @ 2015-06-08 12:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul Mackerras, Ingo Molnar, Arnaldo Carvalho de Melo,
	Thomas Gleixner, H. Peter Anvin, x86, linux-kernel, Palik, Imre,
	Anthony Liguori

From: "Palik, Imre" <imrep@amazon.de>

Architectural performance monitoring version 1 doesn't support fixed counters.
Currently, even if a hypervisor advertises support for architectural
performance monitoring version 1, perf may still try to use the fixed
counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance monitoring
version returned by CPUID, and it only uses the fixed counters for version two
and above.

Some of the ideas in this patch come from Peter Zijlstra.

Signed-off-by: Imre Palik <imrep@amazon.de>
Cc: Anthony Liguori <aliguori@amazon.com>
---
 arch/x86/kernel/cpu/perf_event_intel.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index d4c0a0e..2917248 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2604,13 +2604,13 @@ __init int intel_pmu_init(void)
 		 * counter, so do not extend mask to generic counters
 		 */
 		for_each_event_constraint(c, x86_pmu.event_constraints) {
-			if (c->cmask != FIXED_EVENT_FLAGS
-			    || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
-				continue;
+			if (c->cmask == FIXED_EVENT_FLAGS
+			    && c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES) {
+				c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
 			}
-
-			c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
-			c->weight += x86_pmu.num_counters;
+			c->idxmsk64 &=
+				~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
+			c->weight = hweight64(c->idxmsk64);
 		}
 	}
 
-- 
1.7.9.5
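
As a way to check the reworked loop by hand, the following user-space sketch mirrors its three steps on a small constraint table. Several things here are assumptions made for illustration: the struct is a simplified stand-in for struct event_constraint, FIXED_FLAGS is a placeholder and not the kernel's FIXED_EVENT_FLAGS value, popcount64() stands in for hweight64(), and fixup_constraints() is a hypothetical name. The sketch also builds the clipping mask from a 64-bit constant so that a shift by 32 or more is well-defined where unsigned long is 32 bits wide; that is a choice of this sketch, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDX_FIXED      32 /* same value as INTEL_PMC_IDX_FIXED */
#define MSK_REF_CYCLES (UINT64_C(1) << (IDX_FIXED + 2)) /* fixed counter 2 */
#define FIXED_FLAGS    UINT64_C(0xffff) /* placeholder, not the kernel value */

/* Simplified stand-in for the kernel's struct event_constraint. */
struct constraint {
	uint64_t idxmsk64;
	uint64_t cmask;
	int weight;
};

/* Portable population count, standing in for hweight64(). */
static int popcount64(uint64_t v)
{
	int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

static void fixup_constraints(struct constraint *c, size_t n,
			      unsigned int num_counters,
			      unsigned int num_counters_fixed)
{
	assert(num_counters < 64 && IDX_FIXED + num_counters_fixed < 64);

	for (size_t i = 0; i < n; i++) {
		/* Fixed events other than ref-cycles may also use generic counters. */
		if (c[i].cmask == FIXED_FLAGS && c[i].idxmsk64 != MSK_REF_CYCLES)
			c[i].idxmsk64 |= (UINT64_C(1) << num_counters) - 1;

		/* Drop fixed counters that are not advertised. */
		c[i].idxmsk64 &= ~(~UINT64_C(0) << (IDX_FIXED + num_counters_fixed));

		/* Always derive the weight from the final mask. */
		c[i].weight = popcount64(c[i].idxmsk64);
	}
}
```

With four generic counters and num_counters_fixed set to 0 (the version 1 case), a constraint on fixed counter 0 ends up with mask 0xf and weight 4, and the ref-cycles constraint ends up with an empty mask and weight 0. With three fixed counters both constraints keep their fixed bit.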

