linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
@ 2011-04-15 18:12 Eric B Munson
  2011-04-27  7:40 ` David Laight
  0 siblings, 1 reply; 11+ messages in thread
From: Eric B Munson @ 2011-04-15 18:12 UTC (permalink / raw)
  To: benh
  Cc: a.p.zijlstra, linux-kernel, David.Laight, paulus, anton, acme,
	mingo, linuxppc-dev, stable, Eric B Munson

Because of speculative event roll back, it is possible for some event counters
to decrease between reads on POWER7.  This causes a problem with the way that
counters are updated.  Deltas are calculated in a 64 bit value and the
top 32 bits are masked.  If the register value has decreased, this leaves us
with a very large positive value added to the kernel counters.  This patch
protects against this by skipping the update if the delta would be negative.
This can lead to a small loss of precision in the counter values, but in my
testing the error was typically fewer than 10 samples at a time.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Cc: stable@kernel.org
---
Changes from V3:
 Fix delta checking so that only roll backs are discarded
Changes from V2:
 Create a helper that should handle counter roll back as well as registers that
might be allowed to roll over
Changes from V1:
 Updated patch leader
 Added stable CC
 Use an s32 to hold delta values and discard any values that are less than 0

 arch/powerpc/kernel/perf_event.c |   37 ++++++++++++++++++++++++++++++-------
 1 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/perf_event.c b/arch/powerpc/kernel/perf_event.c
index c4063b7..822f630 100644
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -398,6 +398,25 @@ static int check_excludes(struct perf_event **ctrs, unsigned int cflags[],
 	return 0;
 }
 
+static u64 check_and_compute_delta(u64 prev, u64 val)
+{
+	u64 delta = (val - prev) & 0xfffffffful;
+
+	/*
+	 * POWER7 can roll back counter values; if the new value is smaller
+	 * than the previous value it will cause the delta and the counter to
+	 * have bogus values unless we rolled a counter over.  If a counter is
+	 * rolled back, it will be smaller, but within 256, which is the maximum
+	 * number of events to roll back at once.  If we detect a rollback,
+	 * return 0.  This can lead to a small loss of precision in the
+	 * counters.
+	 */
+	if (prev > val && (prev - val) < 256)
+		delta = 0;
+
+	return delta;
+}
+
 static void power_pmu_read(struct perf_event *event)
 {
 	s64 val, delta, prev;
@@ -416,10 +435,11 @@ static void power_pmu_read(struct perf_event *event)
 		prev = local64_read(&event->hw.prev_count);
 		barrier();
 		val = read_pmc(event->hw.idx);
+		delta = check_and_compute_delta(prev, val);
+		if (!delta)
+			return;
 	} while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);
 
-	/* The counters are only 32 bits wide */
-	delta = (val - prev) & 0xfffffffful;
 	local64_add(delta, &event->count);
 	local64_sub(delta, &event->hw.period_left);
 }
@@ -449,8 +469,9 @@ static void freeze_limited_counters(struct cpu_hw_events *cpuhw,
 		val = (event->hw.idx == 5) ? pmc5 : pmc6;
 		prev = local64_read(&event->hw.prev_count);
 		event->hw.idx = 0;
-		delta = (val - prev) & 0xfffffffful;
-		local64_add(delta, &event->count);
+		delta = check_and_compute_delta(prev, val);
+		if (delta)
+			local64_add(delta, &event->count);
 	}
 }
 
@@ -458,14 +479,16 @@ static void thaw_limited_counters(struct cpu_hw_events *cpuhw,
 				  unsigned long pmc5, unsigned long pmc6)
 {
 	struct perf_event *event;
-	u64 val;
+	u64 val, prev;
 	int i;
 
 	for (i = 0; i < cpuhw->n_limited; ++i) {
 		event = cpuhw->limited_counter[i];
 		event->hw.idx = cpuhw->limited_hwidx[i];
 		val = (event->hw.idx == 5) ? pmc5 : pmc6;
-		local64_set(&event->hw.prev_count, val);
+		prev = local64_read(&event->hw.prev_count);
+		if (check_and_compute_delta(prev, val))
+			local64_set(&event->hw.prev_count, val);
 		perf_event_update_userpage(event);
 	}
 }
@@ -1197,7 +1220,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 
 	/* we don't have to worry about interrupts here */
 	prev = local64_read(&event->hw.prev_count);
-	delta = (val - prev) & 0xfffffffful;
+	delta = check_and_compute_delta(prev, val);
 	local64_add(delta, &event->count);
 
 	/*
-- 
1.7.1
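
The arithmetic described in the commit message is easy to reproduce outside
the kernel.  The standalone C sketch below (illustrative values, with the
guard copied from the patch; not itself part of the patch) shows the
unguarded delta blowing up on a 5-event rollback and the guarded delta
discarding it:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t prev = 1000;	/* last value read from the 32-bit PMC */
	uint64_t val  = 995;	/* PMC speculatively rolled back 5 events */

	/* Unguarded computation: mask to 32 bits and add unconditionally. */
	uint64_t delta = (val - prev) & 0xfffffffful;
	printf("unguarded delta: 0x%llx\n",
	       (unsigned long long)delta);		/* 0xfffffffb */

	/* Guard from the patch: a small backwards step is discarded. */
	if (prev > val && (prev - val) < 256)
		delta = 0;
	printf("guarded delta:   %llu\n",
	       (unsigned long long)delta);		/* 0 */

	return 0;
}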


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-15 18:12 [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks Eric B Munson
@ 2011-04-27  7:40 ` David Laight
  2011-04-27 12:19   ` Eric B Munson
  2011-04-27 13:26   ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 11+ messages in thread
From: David Laight @ 2011-04-27  7:40 UTC (permalink / raw)
  To: Eric B Munson, benh
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable

I keep telling Eric that the code below is incorrect
modulo arithmetic...

> +static u64 check_and_compute_delta(u64 prev, u64 val)
> +{
> +	u64 delta = (val - prev) & 0xfffffffful;
> +
> +	/*
> +	 * POWER7 can roll back counter values; if the new value is smaller
> +	 * than the previous value it will cause the delta and the counter to
> +	 * have bogus values unless we rolled a counter over.  If a counter is
> +	 * rolled back, it will be smaller, but within 256, which is the maximum
> +	 * number of events to roll back at once.  If we detect a rollback,
> +	 * return 0.  This can lead to a small loss of precision in the
> +	 * counters.
> +	 */
> +	if (prev > val && (prev - val) < 256)
> +		delta = 0;
> +
> +	return delta;

The code should detect rollback by looking at the value of 'delta',
otherwise there are horrid end effects near 2^32-1.

For instance:
	u32 delta = val - prev;
	return delta & 0x80000000 ? 0 : delta;


   David
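
A standalone sketch of the suggestion above (the helper name and the test
values are mine, chosen for illustration), showing how the sign test
classifies a rollback that straddles the 32-bit boundary as well as an
ordinary wrap:

#include <stdio.h>
#include <stdint.h>

/* Sign-bit variant: compute the delta in 32-bit arithmetic and treat
 * anything with the top bit set as a rollback. */
static uint64_t delta_sign_test(uint64_t prev, uint64_t val)
{
	uint32_t delta = (uint32_t)val - (uint32_t)prev;

	return (delta & 0x80000000u) ? 0 : delta;
}

int main(void)
{
	/* Rollback across the wrap point: discarded. */
	printf("0x%llx\n",
	       (unsigned long long)delta_sign_test(0x00000001, 0xffffffff));	/* 0 */

	/* Ordinary wrap, 258 events counted: kept. */
	printf("0x%llx\n",
	       (unsigned long long)delta_sign_test(0xffffffff, 0x00000101));	/* 0x102 */

	return 0;
}

The trade-off is that any legitimate increment of 2^31 or more between reads
would also be discarded, whereas the test in the patch tolerates increments
of up to 2^32 - 256.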


* Re: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27  7:40 ` David Laight
@ 2011-04-27 12:19   ` Eric B Munson
  2011-04-27 12:34     ` David Laight
  2011-04-27 12:42     ` David Laight
  2011-04-27 13:26   ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 11+ messages in thread
From: Eric B Munson @ 2011-04-27 12:19 UTC (permalink / raw)
  To: David Laight
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable


On Wed, 27 Apr 2011, David Laight wrote:

> I keep telling Eric that the code below is incorrect
> modulo arithmetic...

But it isn't, and it doesn't have trouble with 2^32 - 1.  Here is an example
done by hand:

The counter is at 0xffffffff and rolls over to 0x101 (258 counted items, so
the rollback test is not triggered).

 0x0000000000000101  (val; remember counters are 32 bit, but we store them in 64)
-0x00000000ffffffff  (prev)
=0xffffffff00000102

After the mask we have 0x0000000000000102, the actual difference between the
counters.
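
The same case can be checked mechanically with a quick standalone sketch
(the values match the example above; the program itself is only
illustrative):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Counter wrapped: 0xffffffff -> 0x101, i.e. 258 events counted. */
	uint64_t prev = 0xffffffffull;
	uint64_t val  = 0x00000101ull;

	/* prev > val, but the gap is far larger than 256, so the rollback
	 * test does not fire and the 32-bit mask handles the wrap. */
	uint64_t delta = (val - prev) & 0xfffffffful;

	printf("delta = 0x%llx\n", (unsigned long long)delta);	/* 0x102 */
	return 0;
}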

> 
> > +static u64 check_and_compute_delta(u64 prev, u64 val)
> > +{
> > +	u64 delta = (val - prev) & 0xfffffffful;
> > +
> > +	/*
> > +	 * POWER7 can roll back counter values; if the new value is smaller
> > +	 * than the previous value it will cause the delta and the counter to
> > +	 * have bogus values unless we rolled a counter over.  If a counter is
> > +	 * rolled back, it will be smaller, but within 256, which is the maximum
> > +	 * number of events to roll back at once.  If we detect a rollback,
> > +	 * return 0.  This can lead to a small loss of precision in the
> > +	 * counters.
> > +	 */
> > +	if (prev > val && (prev - val) < 256)
> > +		delta = 0;
> > +
> > +	return delta;
> 
> The code should detect rollback by looking at the value of 'delta',
> otherwise there are horrid end effects near 2^32-1.
> 
> For instance:
> 	u32 delta = val - prev;
> 	return delta & 0x80000000 ? 0 : delta;
> 
> 
>    David
> 
> 
> 
> 


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 12:19   ` Eric B Munson
@ 2011-04-27 12:34     ` David Laight
  2011-04-27 12:59       ` Eric B Munson
  2011-04-27 12:42     ` David Laight
  1 sibling, 1 reply; 11+ messages in thread
From: David Laight @ 2011-04-27 12:34 UTC (permalink / raw)
  To: Eric B Munson
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable

> But it isn't, and it doesn't have trouble with 2^32 - 1.

what about:
prev = 0x00000001
val  = 0xffffffff

	David


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 12:19   ` Eric B Munson
  2011-04-27 12:34     ` David Laight
@ 2011-04-27 12:42     ` David Laight
  2011-04-27 13:08       ` Eric B Munson
  1 sibling, 1 reply; 11+ messages in thread
From: David Laight @ 2011-04-27 12:42 UTC (permalink / raw)
  To: Eric B Munson
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable

> +	if (prev > val && (prev - val) < 256)
> +		delta = 0;
> +
> +	return delta;

Also, 'prev' is a true 64bit value, but 'val' is only ever 32bits.
So once the 64bit 'prev' exceeds 2^32+256, 'prev > val' is true and
'(prev - val) < 256' is false regardless of the value of 'val'.
This will lead to jumps in the value ....

	David


* Re: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 12:34     ` David Laight
@ 2011-04-27 12:59       ` Eric B Munson
  2011-04-27 13:04         ` David Laight
  0 siblings, 1 reply; 11+ messages in thread
From: Eric B Munson @ 2011-04-27 12:59 UTC (permalink / raw)
  To: David Laight
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable


On Wed, 27 Apr 2011, David Laight wrote:

> > But it isn't, and it doesn't have trouble with 2^32 - 1.  
> 
> what about:
> prev = 0x00000001
> val  = 0xffffffff

Result is 0xfffffffe and we are fine.
> 
> 	David
> 
> 


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 12:59       ` Eric B Munson
@ 2011-04-27 13:04         ` David Laight
  0 siblings, 0 replies; 11+ messages in thread
From: David Laight @ 2011-04-27 13:04 UTC (permalink / raw)
  To: Eric B Munson
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable

> On Wed, 27 Apr 2011, David Laight wrote:
>
> > > But it isn't, and it doesn't have trouble with 2^32 - 1.
> >
> > what about:
> > prev = 0x00000001
> > val  = 0xffffffff
>
> Result is 0xfffffffe and we are fine.

'delta' will be 0xfffffffe, but you need the function to
return zero - since the underlying counter has decremented
by 2. But 'prev > val' is false so the counter will
be increased by 2^32-2.

	David
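
A standalone sketch of this corner case (the register values are the ones
quoted above; the program itself is only illustrative):

#include <stdio.h>
#include <stdint.h>

/* Same computation as check_and_compute_delta() in the patch. */
static uint64_t guarded_delta(uint64_t prev, uint64_t val)
{
	uint64_t delta = (val - prev) & 0xfffffffful;

	if (prev > val && (prev - val) < 256)
		delta = 0;
	return delta;
}

int main(void)
{
	/* The PMC sits just above zero and rolls back two events across the
	 * 32-bit boundary, so 'prev > val' is false and the rollback test
	 * cannot fire. */
	uint64_t prev = 0x00000001;
	uint64_t val  = 0xffffffff;

	printf("delta = 0x%llx\n",
	       (unsigned long long)guarded_delta(prev, val));	/* 0xfffffffe */
	return 0;
}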


* Re: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 12:42     ` David Laight
@ 2011-04-27 13:08       ` Eric B Munson
  2011-04-27 13:13         ` David Laight
  0 siblings, 1 reply; 11+ messages in thread
From: Eric B Munson @ 2011-04-27 13:08 UTC (permalink / raw)
  To: David Laight
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable


On Wed, 27 Apr 2011, David Laight wrote:

>  
> > +	if (prev > val && (prev - val) < 256)
> > +		delta = 0;
> > +
> > +	return delta;
> 
> Also, 'prev' is a true 64bit value, but 'val' is only ever 32bits.
> So once the 64bit 'prev' exceeds 2^32+256 both 'prev > val'
> and 'prev - val' are true regardless of the value of 'val'.
> This will lead to jumps in the value ....

prev and val are both 64 bit variables holding 32 bit numbers, we do not
accumulate in either, they are both replaced by values directly from the
registers.  So prev > val will not always be true.

Eric


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 13:08       ` Eric B Munson
@ 2011-04-27 13:13         ` David Laight
  2011-04-27 13:20           ` Eric B Munson
  0 siblings, 1 reply; 11+ messages in thread
From: David Laight @ 2011-04-27 13:13 UTC (permalink / raw)
  To: Eric B Munson
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable

> prev and val are both 64 bit variables holding 32 bit numbers, we do not
> accumulate in either, they are both replaced by values directly from the
> registers.
> So prev > val will not always be true.

The code seems to be:
    prev = local64_read(&event->hw.prev_count);
    val = read_pmc(event->hw.idx);
    delta = check_and_compute_delta(prev, val);
    local64_add(delta, &event->count);
Which looks very much like 'prev' being a 64bit counter generated
from the 32bit pmc register.

	David


* Re: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27 13:13         ` David Laight
@ 2011-04-27 13:20           ` Eric B Munson
  0 siblings, 0 replies; 11+ messages in thread
From: Eric B Munson @ 2011-04-27 13:20 UTC (permalink / raw)
  To: David Laight
  Cc: a.p.zijlstra, linux-kernel, paulus, anton, acme, mingo,
	linuxppc-dev, stable


On Wed, 27 Apr 2011, David Laight wrote:

> > prev and val are both 64 bit variables holding 32 bit numbers, we do not
> > accumulate in either, they are both replaced by values directly from the
> > registers.
> > So prev > val will not always be true.
> 
> The code seems to be:
>     prev = local64_read(&event->hw.prev_count);
>     val = read_pmc(event->hw.idx);
>     delta = check_and_compute_delta(prev, val);
>     local64_add(delta, &event->count);
> Which looks very much like 'prev' being a 64bit counter generated
> from the 32bit pmc register.
> 

Which implies that it will only ever be 32 bits wide, just stored in 64.


* RE: [PATCH V4] POWER: perf_event: Skip updating kernel counters if register value shrinks
  2011-04-27  7:40 ` David Laight
  2011-04-27 12:19   ` Eric B Munson
@ 2011-04-27 13:26   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 11+ messages in thread
From: Benjamin Herrenschmidt @ 2011-04-27 13:26 UTC (permalink / raw)
  To: David Laight
  Cc: a.p.zijlstra, linux-kernel, paulus, Eric B Munson, acme, mingo,
	linuxppc-dev, stable, anton

On Wed, 2011-04-27 at 08:40 +0100, David Laight wrote:
> I keep telling Eric that the code below is incorrect
> modulo arithmetic...

His previous versions were wrong, yes.  This one should be limping along
afaik.  But I tend to agree, testing delta is the way to go.

Eric's idea was to not test the sign bit, to allow for bigger increments of
the counter, since the rollback is limited to 255...  But I agree it's
non-obvious.

I've put the patch in for now, we can always followup with something
better.

Cheers,
Ben.

> > +static u64 check_and_compute_delta(u64 prev, u64 val)
> > +{
> > +	u64 delta = (val - prev) & 0xfffffffful;
> > +
> > +	/*
> > +	 * POWER7 can roll back counter values; if the new value is smaller
> > +	 * than the previous value it will cause the delta and the counter to
> > +	 * have bogus values unless we rolled a counter over.  If a counter is
> > +	 * rolled back, it will be smaller, but within 256, which is the maximum
> > +	 * number of events to roll back at once.  If we detect a rollback,
> > +	 * return 0.  This can lead to a small loss of precision in the
> > +	 * counters.
> > +	 */
> > +	if (prev > val && (prev - val) < 256)
> > +		delta = 0;
> > +
> > +	return delta;
> 
> The code should detect rollback by looking at the value of 'delta',
> otherwise there are horrid end effects near 2^32-1.
> 
> For instance:
> 	u32 delta = val - prev;
> 	return delta & 0x80000000 ? 0 : delta;
> 
> 
>    David