linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] ARM: perf: ensure overflows aren't missed due to IRQ latency
@ 2011-03-23 15:52 Will Deacon
  2011-03-23 18:13 ` Jamie Iles
  0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2011-03-23 15:52 UTC (permalink / raw)
  To: linux-arm-kernel

If a counter overflows during a perf stat profiling run it may overtake
the last known value of the counter:

    0        prev     new                0xffffffff
    |----------|-------|----------------------|

In this case, the number of events that have occurred is
(0xffffffff - prev) + new. Unfortunately, the event update code will
not realise an overflow has occurred and will instead report the event
delta as (new - prev) which may be considerably smaller than the real
count.

This patch adds an extra argument to armpmu_event_update which indicates
whether or not an overflow has occurred. If an overflow has occurred and
the new count value is greater than the previous one, then we add on
the remaining event period (which must have expired for the overflow to
occur) and the previous count value.

Cc: Jamie Iles <jamie@jamieiles.com>
Reported-by: Ashwin Chaugule <ashbertslists@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---

Jamie, I've changed quite a bit of the delta calculation code in
armpmu_event_update so that we can detect the problem above.

Please let me know if I've missed/broken anything!

Will

 arch/arm/kernel/perf_event.c        |   19 +++++++++++--------
 arch/arm/kernel/perf_event_v6.c     |    2 +-
 arch/arm/kernel/perf_event_v7.c     |    2 +-
 arch/arm/kernel/perf_event_xscale.c |    4 ++--
 4 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index d150ad1..53f6068 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -204,11 +204,9 @@ armpmu_event_set_period(struct perf_event *event,
 static u64
 armpmu_event_update(struct perf_event *event,
 		    struct hw_perf_event *hwc,
-		    int idx)
+		    int idx, int overflow)
 {
-	int shift = 64 - 32;
-	s64 prev_raw_count, new_raw_count;
-	u64 delta;
+	u64 delta, prev_raw_count, new_raw_count;
 
 again:
 	prev_raw_count = local64_read(&hwc->prev_count);
@@ -218,8 +216,13 @@ again:
 			     new_raw_count) != prev_raw_count)
 		goto again;
 
-	delta = (new_raw_count << shift) - (prev_raw_count << shift);
-	delta >>= shift;
+	new_raw_count &= armpmu->max_period;
+	prev_raw_count &= armpmu->max_period;
+
+	delta = (new_raw_count - prev_raw_count);
+
+	if (overflow && new_raw_count > prev_raw_count)
+		delta += local64_read(&hwc->period_left) + prev_raw_count;
 
 	local64_add(delta, &event->count);
 	local64_sub(delta, &hwc->period_left);
@@ -236,7 +239,7 @@ armpmu_read(struct perf_event *event)
 	if (hwc->idx < 0)
 		return;
 
-	armpmu_event_update(event, hwc, hwc->idx);
+	armpmu_event_update(event, hwc, hwc->idx, 0);
 }
 
 static void
@@ -254,7 +257,7 @@ armpmu_stop(struct perf_event *event, int flags)
 	if (!(hwc->state & PERF_HES_STOPPED)) {
 		armpmu->disable(hwc, hwc->idx);
 		barrier(); /* why? */
-		armpmu_event_update(event, hwc, hwc->idx);
+		armpmu_event_update(event, hwc, hwc->idx, 0);
 		hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
 	}
 }
diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
index c058bfc..66ce900 100644
--- a/arch/arm/kernel/perf_event_v6.c
+++ b/arch/arm/kernel/perf_event_v6.c
@@ -474,7 +474,7 @@ armv6pmu_handle_irq(int irq_num,
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx);
+		armpmu_event_update(event, hwc, idx, 1);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 2e14025..e144b8c 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -780,7 +780,7 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx);
+		armpmu_event_update(event, hwc, idx, 1);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
index 28cd3b0..39affbe 100644
--- a/arch/arm/kernel/perf_event_xscale.c
+++ b/arch/arm/kernel/perf_event_xscale.c
@@ -246,7 +246,7 @@ xscale1pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx);
+		armpmu_event_update(event, hwc, idx, 1);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
@@ -578,7 +578,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
 			continue;
 
 		hwc = &event->hw;
-		armpmu_event_update(event, hwc, idx);
+		armpmu_event_update(event, hwc, idx, 1);
 		data.period = event->hw.last_period;
 		if (!armpmu_event_set_period(event, hwc, idx))
 			continue;
-- 
1.7.0.4


* [PATCH] ARM: perf: ensure overflows aren't missed due to IRQ latency
  2011-03-23 15:52 [PATCH] ARM: perf: ensure overflows aren't missed due to IRQ latency Will Deacon
@ 2011-03-23 18:13 ` Jamie Iles
       [not found]   ` <AANLkTimOKgtuAwYOrnx7anW1__wzQg-rPhqqRmC=qJcY@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Jamie Iles @ 2011-03-23 18:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On Wed, Mar 23, 2011 at 03:52:05PM +0000, Will Deacon wrote:
> [...]
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index d150ad1..53f6068 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -204,11 +204,9 @@ armpmu_event_set_period(struct perf_event *event,
>  static u64
>  armpmu_event_update(struct perf_event *event,
>  		    struct hw_perf_event *hwc,
> -		    int idx)
> +		    int idx, int overflow)
>  {
> -	int shift = 64 - 32;
> -	s64 prev_raw_count, new_raw_count;
> -	u64 delta;
> +	u64 delta, prev_raw_count, new_raw_count;
>  
>  again:
>  	prev_raw_count = local64_read(&hwc->prev_count);
> @@ -218,8 +216,13 @@ again:
>  			     new_raw_count) != prev_raw_count)
>  		goto again;
>  
> -	delta = (new_raw_count << shift) - (prev_raw_count << shift);
> -	delta >>= shift;
> +	new_raw_count &= armpmu->max_period;
> +	prev_raw_count &= armpmu->max_period;
> +
> +	delta = (new_raw_count - prev_raw_count);
> +
> +	if (overflow && new_raw_count > prev_raw_count)
> +		delta += local64_read(&hwc->period_left) + prev_raw_count;
>  
>  	local64_add(delta, &event->count);
>  	local64_sub(delta, &hwc->period_left);

Hmm, I'm not really sure I follow that and see how it works for an 
overflow when new < prev.  Why doesn't the naive:

	new_raw_count &= armpmu->max_period;
	prev_raw_count &= armpmu->max_period;
	if (overflow)
		delta = (armpmu->max_period - prev_raw_count) +
			new_raw_count;
	else
		delta = new_raw_count - prev_raw_count;

work?  I'm sure I'm missing something here!

Jamie


* [PATCH] ARM: perf: ensure overflows aren't missed due to IRQ latency
       [not found]   ` <AANLkTimOKgtuAwYOrnx7anW1__wzQg-rPhqqRmC=qJcY@mail.gmail.com>
@ 2011-03-23 18:35     ` Ashwin Chaugule
  2011-03-23 18:44       ` Will Deacon
  0 siblings, 1 reply; 4+ messages in thread
From: Ashwin Chaugule @ 2011-03-23 18:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/23/2011 02:28 PM, Jamie Iles wrote:
> Hmm, I'm not really sure I follow that and see how it works for an
> overflow when new < prev.  Why doesn't the naive:
> 
>        new_raw_count &= armpmu->max_period;
>        prev_raw_count &= armpmu->max_period;
>        if (overflow)
>                delta = (armpmu->max_period - prev_raw_count) +
>                        new_raw_count;
>        else
>                delta = new_raw_count - prev_raw_count;
> 
> work?  I'm sure I'm missing something here!
> 
> Jamie

When new < prev, delta will be greater than 2^32. So we'll have accounted for the overflow condition anyway.

The naive solution you mention above is what we discussed earlier and works too.
I guess the additional:

>        new_raw_count &= armpmu->max_period;
>        prev_raw_count &= armpmu->max_period;

just makes it cleaner than shifting to the upper 32 bits, subtracting, and then shifting back again.

Cheers,
Ashwin
-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* [PATCH] ARM: perf: ensure overflows aren't missed due to IRQ latency
  2011-03-23 18:35     ` Ashwin Chaugule
@ 2011-03-23 18:44       ` Will Deacon
  0 siblings, 0 replies; 4+ messages in thread
From: Will Deacon @ 2011-03-23 18:44 UTC (permalink / raw)
  To: linux-arm-kernel

> On 03/23/2011 02:28 PM, Jamie Iles wrote:
> > [...]
> 
> When new < prev, delta will be greater than 2^32. So we'll have accounted for the overflow condition anyway.
> 
> The naive solution you mention above is what we discussed earlier and works too.

Yup, they both work but I think Jamie's version is easier to read so
I'll roll another patch...

> I guess the additional:
> 
> >        new_raw_count &= armpmu->max_period;
> >        prev_raw_count &= armpmu->max_period;
> 
> just makes it cleaner than the shifting to upper 32bits, subtracting and then shifting back again.

... but I'm having second thoughts about using max_period as the
mask. If the period was anything other than 0xffffffff then it
wouldn't work. I'll add another field to the PMU structure in
case somebody in the future has counters with strange period
constraints.

Will

