* [PATCH] ARM: perf: limit sample_period to half max_period in non-sampling mode
@ 2012-02-20 12:07 Will Deacon
2012-02-20 12:39 ` Ming Lei
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2012-02-20 12:07 UTC (permalink / raw)
To: linux-arm-kernel
On ARM, the PMU does not stop counting after an overflow and therefore
IRQ latency affects the new counter value read by the kernel. This is
significant for non-sampling runs where it is possible for the new value
to overtake the previous one, causing the delta to be out by up to
max_period events.
Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
to IRQ latency") attempted to fix this problem by allowing interrupt
handlers to pass an overflow flag to the event update function, causing
the overflow calculation to assume that the counter passed through zero
when going from prev to new. Unfortunately, this doesn't work when
overflow occurs on the perf_task_tick path because we have the flag
cleared and end up computing a large negative delta.
This patch removes the overflow flag from armpmu_event_update and
instead limits the sample_period to half of the max_period for
non-sampling profiling runs.
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/include/asm/pmu.h          |    2 +-
 arch/arm/kernel/perf_event.c        |   22 +++++++++++-----------
 arch/arm/kernel/perf_event_v6.c     |    2 +-
 arch/arm/kernel/perf_event_v7.c     |    2 +-
 arch/arm/kernel/perf_event_xscale.c |    4 ++--
 5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index b5a5be2..90114fa 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -134,7 +134,7 @@ int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type);

 u64 armpmu_event_update(struct perf_event *event,
                         struct hw_perf_event *hwc,
-                        int idx, int overflow);
+                        int idx);

 int armpmu_event_set_period(struct perf_event *event,
                             struct hw_perf_event *hwc,
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 5bb91bf..56173ae 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -180,7 +180,7 @@ armpmu_event_set_period(struct perf_event *event,
 u64
 armpmu_event_update(struct perf_event *event,
                     struct hw_perf_event *hwc,
-                    int idx, int overflow)
+                    int idx)
 {
        struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
        u64 delta, prev_raw_count, new_raw_count;
@@ -193,13 +193,7 @@ again:
                             new_raw_count) != prev_raw_count)
                goto again;

-       new_raw_count &= armpmu->max_period;
-       prev_raw_count &= armpmu->max_period;
-
-       if (overflow)
-               delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
-       else
-               delta = new_raw_count - prev_raw_count;
+       delta = (new_raw_count - prev_raw_count) & armpmu->max_period;

        local64_add(delta, &event->count);
        local64_sub(delta, &hwc->period_left);
@@ -216,7 +210,7 @@ armpmu_read(struct perf_event *event)
        if (hwc->idx < 0)
                return;

-       armpmu_event_update(event, hwc, hwc->idx, 0);
+       armpmu_event_update(event, hwc, hwc->idx);
 }

 static void
@@ -232,7 +226,7 @@ armpmu_stop(struct perf_event *event, int flags)
        if (!(hwc->state & PERF_HES_STOPPED)) {
                armpmu->disable(hwc, hwc->idx);
                barrier(); /* why? */
-               armpmu_event_update(event, hwc, hwc->idx, 0);
+               armpmu_event_update(event, hwc, hwc->idx);
                hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
        }
 }
@@ -518,7 +512,13 @@ __hw_perf_event_init(struct perf_event *event)
        hwc->config_base |= (unsigned long)mapping;

        if (!hwc->sample_period) {
-               hwc->sample_period  = armpmu->max_period;
+               /*
+                * For non-sampling runs, limit the sample_period to half
+                * of the counter width. That way, the new counter value
+                * is far less likely to overtake the previous one unless
+                * you have some serious IRQ latency issues.
+                */
+               hwc->sample_period  = armpmu->max_period >> 1;
                hwc->last_period    = hwc->sample_period;
                local64_set(&hwc->period_left, hwc->sample_period);
        }
diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c
index 533be99..88bf152 100644
--- a/arch/arm/kernel/perf_event_v6.c
+++ b/arch/arm/kernel/perf_event_v6.c
@@ -524,7 +524,7 @@ armv6pmu_handle_irq(int irq_num,
                        continue;

                hwc = &event->hw;
-               armpmu_event_update(event, hwc, idx, 1);
+               armpmu_event_update(event, hwc, idx);
                data.period = event->hw.last_period;
                if (!armpmu_event_set_period(event, hwc, idx))
                        continue;
diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c
index 6933244..6f48861 100644
--- a/arch/arm/kernel/perf_event_v7.c
+++ b/arch/arm/kernel/perf_event_v7.c
@@ -963,7 +963,7 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev)
                        continue;

                hwc = &event->hw;
-               armpmu_event_update(event, hwc, idx, 1);
+               armpmu_event_update(event, hwc, idx);
                data.period = event->hw.last_period;
                if (!armpmu_event_set_period(event, hwc, idx))
                        continue;
diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c
index 3b99d82..831e019 100644
--- a/arch/arm/kernel/perf_event_xscale.c
+++ b/arch/arm/kernel/perf_event_xscale.c
@@ -259,7 +259,7 @@ xscale1pmu_handle_irq(int irq_num, void *dev)
                        continue;

                hwc = &event->hw;
-               armpmu_event_update(event, hwc, idx, 1);
+               armpmu_event_update(event, hwc, idx);
                data.period = event->hw.last_period;
                if (!armpmu_event_set_period(event, hwc, idx))
                        continue;
@@ -596,7 +596,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev)
                        continue;

                hwc = &event->hw;
-               armpmu_event_update(event, hwc, idx, 1);
+               armpmu_event_update(event, hwc, idx);
                data.period = event->hw.last_period;
                if (!armpmu_event_set_period(event, hwc, idx))
                        continue;
--
1.7.4.1
* [PATCH] ARM: perf: limit sample_period to half max_period in non-sampling mode
2012-02-20 12:07 [PATCH] ARM: perf: limit sample_period to half max_period in non-sampling mode Will Deacon
@ 2012-02-20 12:39 ` Ming Lei
2012-02-20 13:34 ` Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Ming Lei @ 2012-02-20 12:39 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Mon, Feb 20, 2012 at 8:07 PM, Will Deacon <will.deacon@arm.com> wrote:
> On ARM, the PMU does not stop counting after an overflow and therefore
> IRQ latency affects the new counter value read by the kernel. This is
> significant for non-sampling runs where it is possible for the new value
> to overtake the previous one, causing the delta to be out by up to
> max_period events.
>
> Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
> to IRQ latency") attempted to fix this problem by allowing interrupt
> handlers to pass an overflow flag to the event update function, causing
> the overflow calculation to assume that the counter passed through zero
> when going from prev to new. Unfortunately, this doesn't work when
> overflow occurs on the perf_task_tick path because we have the flag
> cleared and end up computing a large negative delta.
>
> This patch removes the overflow flag from armpmu_event_update and
> instead limits the sample_period to half of the max_period for
> non-sampling profiling runs.
>
> Cc: Ming Lei <ming.lei@canonical.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/include/asm/pmu.h          |    2 +-
>  arch/arm/kernel/perf_event.c        |   22 +++++++++++-----------
>  arch/arm/kernel/perf_event_v6.c     |    2 +-
>  arch/arm/kernel/perf_event_v7.c     |    2 +-
>  arch/arm/kernel/perf_event_xscale.c |    4 ++--
>  5 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
> index b5a5be2..90114fa 100644
> --- a/arch/arm/include/asm/pmu.h
> +++ b/arch/arm/include/asm/pmu.h
> @@ -134,7 +134,7 @@ int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type);
>
>  u64 armpmu_event_update(struct perf_event *event,
>                          struct hw_perf_event *hwc,
> -                        int idx, int overflow);
> +                        int idx);
>
>  int armpmu_event_set_period(struct perf_event *event,
>                              struct hw_perf_event *hwc,
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index 5bb91bf..56173ae 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -180,7 +180,7 @@ armpmu_event_set_period(struct perf_event *event,
>  u64
>  armpmu_event_update(struct perf_event *event,
>                      struct hw_perf_event *hwc,
> -                    int idx, int overflow)
> +                    int idx)
>  {
>         struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
>         u64 delta, prev_raw_count, new_raw_count;
> @@ -193,13 +193,7 @@ again:
>                              new_raw_count) != prev_raw_count)
>                 goto again;
>
> -       new_raw_count &= armpmu->max_period;
> -       prev_raw_count &= armpmu->max_period;
> -
> -       if (overflow)
> -               delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
> -       else
> -               delta = new_raw_count - prev_raw_count;
> +       delta = (new_raw_count - prev_raw_count) & armpmu->max_period;
Please add
Signed-off-by: Ming Lei <ming.lei@canonical.com>
since I found the problem caused by the incorrect 'delta', and the above
change is based on my original patch[1].
[1], http://marc.info/?l=linux-kernel&m=132938793724425&w=2
thanks,
--
Ming Lei
* [PATCH] ARM: perf: limit sample_period to half max_period in non-sampling mode
2012-02-20 12:39 ` Ming Lei
@ 2012-02-20 13:34 ` Will Deacon
2012-02-20 13:51 ` Ming Lei
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2012-02-20 13:34 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Feb 20, 2012 at 12:39:11PM +0000, Ming Lei wrote:
> On Mon, Feb 20, 2012 at 8:07 PM, Will Deacon <will.deacon@arm.com> wrote:
> > diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> > index 5bb91bf..56173ae 100644
> > --- a/arch/arm/kernel/perf_event.c
> > +++ b/arch/arm/kernel/perf_event.c
> > @@ -180,7 +180,7 @@ armpmu_event_set_period(struct perf_event *event,
> >  u64
> >  armpmu_event_update(struct perf_event *event,
> >                      struct hw_perf_event *hwc,
> > -                    int idx, int overflow)
> > +                    int idx)
> >  {
> >         struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
> >         u64 delta, prev_raw_count, new_raw_count;
> > @@ -193,13 +193,7 @@ again:
> >                              new_raw_count) != prev_raw_count)
> >                 goto again;
> >
> > -       new_raw_count &= armpmu->max_period;
> > -       prev_raw_count &= armpmu->max_period;
> > -
> > -       if (overflow)
> > -               delta = armpmu->max_period - prev_raw_count + new_raw_count + 1;
> > -       else
> > -               delta = new_raw_count - prev_raw_count;
> > +       delta = (new_raw_count - prev_raw_count) & armpmu->max_period;
>
> Please add
>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
Sure, didn't want to add anything more than a CC until I was sure you were
happy with it.
On the plus side, the warnings from Stephane's IRQ throttling code have been
fixed in -rc4 by f39d47ff ("perf: Fix double start/stop in
x86_pmu_start()").
So the ball should be back in the OMAP court now as far as enabling perf on
Panda.
Will
* [PATCH] ARM: perf: limit sample_period to half max_period in non-sampling mode
2012-02-20 13:34 ` Will Deacon
@ 2012-02-20 13:51 ` Ming Lei
0 siblings, 0 replies; 4+ messages in thread
From: Ming Lei @ 2012-02-20 13:51 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Mon, Feb 20, 2012 at 9:34 PM, Will Deacon <will.deacon@arm.com> wrote:
> Sure, didn't want to add anything more than a CC until I was sure you were
> happy with it.
Sure, I am happy with it.
> On the plus side, the warnings from Stephane's IRQ throttling code have been
> fixed in -rc4 by f39d47ff ("perf: Fix double start/stop in
> x86_pmu_start()").
Good news.
> So the ball should be back in the OMAP court now as far as enabling perf on
> Panda.
Yes, several patches are still pending for omap4 perf support.
Also I have one patch fixing a possible oops in the ARM PMU code, but it
is a bit late today, so I will send it out tomorrow.
thanks,
--
Ming Lei