* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops @ 2012-02-21 4:37 Ming Lei 2012-02-21 13:32 ` Will Deacon 0 siblings, 1 reply; 9+ messages in thread From: Ming Lei @ 2012-02-21 4:37 UTC (permalink / raw) To: linux-arm-kernel The patch adds one check in irq handler to skip handling deleted counter for avoiding oops. When one counter is deleted, event reference(hw_events->events[idx]) will be cleared but the hw overflow status may be kept, so the hw event may be handled and oops will be triggered in irq handler if pmu irq from other events come. The another candidate fix is to clear the overflow status for the event in arm_pmu->disable(), but looks it is a bit more complicated and has no obvious advantage than the fix in this patch. --- arch/arm/kernel/perf_event_v7.c | 3 ++- arch/arm/kernel/perf_event_xscale.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index 54964dc..05762e4 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -1082,8 +1082,9 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) /* * We have a single interrupt for all counters. Check that * each counter has overflowed before we process it. + * Do not handle deleted counter. */ - if (!armv7_pmnc_counter_has_overflowed(pmnc, idx)) + if (!armv7_pmnc_counter_has_overflowed(pmnc, idx) || !event) continue; hwc = &event->hw; diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c index 831e019..1e07aec 100644 --- a/arch/arm/kernel/perf_event_xscale.c +++ b/arch/arm/kernel/perf_event_xscale.c @@ -592,7 +592,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev) struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; - if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx)) + if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx) || !event) continue; hwc = &event->hw; -- 1.7.9 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 4:37 [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops Ming Lei @ 2012-02-21 13:32 ` Will Deacon 2012-02-21 14:02 ` Ming Lei 0 siblings, 1 reply; 9+ messages in thread From: Will Deacon @ 2012-02-21 13:32 UTC (permalink / raw) To: linux-arm-kernel Hi Ming Lei, Thanks for the patch. I looked at this in the past since there was a buggy FPGA flying about that could fire spurious overflows but I didn't get round to doing anything with the patches. On Tue, Feb 21, 2012 at 04:37:38AM +0000, Ming Lei wrote: > The patch adds one check in irq handler to skip handling deleted > counter for avoiding oops. > > When one counter is deleted, event reference(hw_events->events[idx]) > will be cleared but the hw overflow status may be kept, so the > hw event may be handled and oops will be triggered in irq handler if > pmu irq from other events come. Interesting, I'd not considered the case where the counter overflows while we are in the process of disabling it. That needs fixing. > The another candidate fix is to clear the overflow status for the > event in arm_pmu->disable(), but looks it is a bit more complicated > and has no obvious advantage than the fix in this patch. It's not much more complicated and it does have the advantage that we avoid taking an extra interrupt and also avoid the unlikely case where we take an interrupt in the context of a task using perf different from the one in which the counter overflowed, leading to false accounting. > --- > arch/arm/kernel/perf_event_v7.c | 3 ++- > arch/arm/kernel/perf_event_xscale.c | 2 +- > 2 files changed, 3 insertions(+), 2 deletions(-) I have a couple of patches that solve this by (a) clearing the overflow flag when disabling the counter for ARMv7 and (b) adding the event checks to the interrupt handlers in case the hardware is misbehaving. I'll post them shortly (you can add your signed-off-by to the second one if you like). Cheers, Will ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 13:32 ` Will Deacon @ 2012-02-21 14:02 ` Ming Lei 2012-02-21 15:19 ` Will Deacon 0 siblings, 1 reply; 9+ messages in thread From: Ming Lei @ 2012-02-21 14:02 UTC (permalink / raw) To: linux-arm-kernel Hi Will, On Tue, Feb 21, 2012 at 9:32 PM, Will Deacon <will.deacon@arm.com> wrote: > Hi Ming Lei, > > Thanks for the patch. I looked at this in the past since there was a buggy > FPGA flying about that could fire spurious overflows but I didn't get round > to doing anything with the patches. > > On Tue, Feb 21, 2012 at 04:37:38AM +0000, Ming Lei wrote: >> The patch adds one check in irq handler to skip handling deleted >> counter for avoiding oops. >> >> When one counter is deleted, event reference(hw_events->events[idx]) >> will be cleared but the hw overflow status may be kept, so the >> hw event may be handled and oops will be triggered in irq handler if >> pmu irq from other events come. > > Interesting, I'd not considered the case where the counter overflows while > we are in the process of disabling it. That needs fixing. > >> The another candidate fix is to clear the overflow status for the >> event in arm_pmu->disable(), but looks it is a bit more complicated >> and has no obvious advantage than the fix in this patch. > > It's not much more complicated and it does have the advantage that we avoid > taking an extra interrupt and also avoid the unlikely case where we take an The extra interrupt may not be triggered since it has been disabled in ->disable(). Even though it is routed to gic before disabling, the check added can handle the case well. > interrupt in the context of a task using perf different from the one in which > the counter overflowed, leading to false accounting. I have thought about the case. Even the first irq is triggered after the new counter is added on a new task context, there is still very few count accounted since the count read from hw counter is sure to be very similar with count written to hw counter. That is why I mentioned clearing overflow flag in ->disable has no obvious advantage. But you can do it if you like, :-) In fact, I have tried to clear overflow in armpmu->disable() on OMAP4 and found it is surely capable of fixing the oops. > >> --- >> ?arch/arm/kernel/perf_event_v7.c ? ? | ? ?3 ++- >> ?arch/arm/kernel/perf_event_xscale.c | ? ?2 +- >> ?2 files changed, 3 insertions(+), 2 deletions(-) > > I have a couple of patches that solve this by (a) clearing the overflow flag > when disabling the counter for ARMv7 and (b) adding the event checks to the Maybe ARMv6 and xscale need to be fixed too if the overflow flag will be kept even after the counter is disabled. > interrupt handlers in case the hardware is misbehaving. thanks, -- Ming Lei ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 14:02 ` Ming Lei @ 2012-02-21 15:19 ` Will Deacon 2012-02-21 15:34 ` Ming Lei 0 siblings, 1 reply; 9+ messages in thread From: Will Deacon @ 2012-02-21 15:19 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 02:02:32PM +0000, Ming Lei wrote: > On Tue, Feb 21, 2012 at 9:32 PM, Will Deacon <will.deacon@arm.com> wrote: > > I have a couple of patches that solve this by (a) clearing the overflow flag > > when disabling the counter for ARMv7 and (b) adding the event checks to the > > Maybe ARMv6 and xscale need to be fixed too if the overflow flag will be kept > even after the counter is disabled. I don't think so - seems like we clear the overflow flag already for those PMUs in the disable path (note that we write ~mask). Will ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 15:19 ` Will Deacon @ 2012-02-21 15:34 ` Ming Lei 2012-02-21 15:40 ` Will Deacon 0 siblings, 1 reply; 9+ messages in thread From: Ming Lei @ 2012-02-21 15:34 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 11:19 PM, Will Deacon <will.deacon@arm.com> wrote: > On Tue, Feb 21, 2012 at 02:02:32PM +0000, Ming Lei wrote: >> On Tue, Feb 21, 2012 at 9:32 PM, Will Deacon <will.deacon@arm.com> wrote: >> > I have a couple of patches that solve this by (a) clearing the overflow flag >> > when disabling the counter for ARMv7 and (b) adding the event checks to the >> >> Maybe ARMv6 and xscale need to be fixed too if the overflow flag will be kept >> even after the counter is disabled. > > I don't think so - seems like we clear the overflow flag already for those > PMUs in the disable path (note that we write ~mask). mask is only the counters enable bits, both v6 and xscale don't clear their overflow flag in disable callback. thanks, -- Ming Lei ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 15:34 ` Ming Lei @ 2012-02-21 15:40 ` Will Deacon 2012-02-21 15:49 ` Ming Lei 0 siblings, 1 reply; 9+ messages in thread From: Will Deacon @ 2012-02-21 15:40 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 03:34:58PM +0000, Ming Lei wrote: > On Tue, Feb 21, 2012 at 11:19 PM, Will Deacon <will.deacon@arm.com> wrote: > > On Tue, Feb 21, 2012 at 02:02:32PM +0000, Ming Lei wrote: > >> On Tue, Feb 21, 2012 at 9:32 PM, Will Deacon <will.deacon@arm.com> wrote: > >> > I have a couple of patches that solve this by (a) clearing the overflow flag > >> > when disabling the counter for ARMv7 and (b) adding the event checks to the > >> > >> Maybe ARMv6 and xscale need to be fixed too if the overflow flag will be kept > >> even after the counter is disabled. > > > > I don't think so - seems like we clear the overflow flag already for those > > PMUs in the disable path (note that we write ~mask). > > mask is only the counters enable bits, both v6 and xscale don't clear their > overflow flag in disable callback. Sure they do. When we read-modify-write the pmcr we &= ~mask, so we end up writing the overflow flags back if they are set, which is sufficient to clear them. Will ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 15:40 ` Will Deacon @ 2012-02-21 15:49 ` Ming Lei 2012-02-21 15:57 ` Will Deacon 0 siblings, 1 reply; 9+ messages in thread From: Ming Lei @ 2012-02-21 15:49 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 11:40 PM, Will Deacon <will.deacon@arm.com> wrote: > On Tue, Feb 21, 2012 at 03:34:58PM +0000, Ming Lei wrote: >> On Tue, Feb 21, 2012 at 11:19 PM, Will Deacon <will.deacon@arm.com> wrote: >> > On Tue, Feb 21, 2012 at 02:02:32PM +0000, Ming Lei wrote: >> >> On Tue, Feb 21, 2012 at 9:32 PM, Will Deacon <will.deacon@arm.com> wrote: >> >> > I have a couple of patches that solve this by (a) clearing the overflow flag >> >> > when disabling the counter for ARMv7 and (b) adding the event checks to the >> >> >> >> Maybe ARMv6 and xscale need to be fixed too if the overflow flag will be kept >> >> even after the counter is disabled. >> > >> > I don't think so - seems like we clear the overflow flag already for those >> > PMUs in the disable path (note that we write ~mask). >> >> mask is only the counters enable bits, both v6 and xscale don't clear their >> overflow flag in disable callback. > > Sure they do. When we read-modify-write the pmcr we &= ~mask, so we end up > writing the overflow flags back if they are set, which is sufficient to > clear them. Sorry for missing it, but looks xscale uses standalone overflow register and does not clear its overflow flag. thanks, -- Ming Lei ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 15:49 ` Ming Lei @ 2012-02-21 15:57 ` Will Deacon 2012-02-21 17:36 ` Will Deacon 0 siblings, 1 reply; 9+ messages in thread From: Will Deacon @ 2012-02-21 15:57 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 03:49:22PM +0000, Ming Lei wrote: > On Tue, Feb 21, 2012 at 11:40 PM, Will Deacon <will.deacon@arm.com> wrote: > > Sure they do. When we read-modify-write the pmcr we &= ~mask, so we end up > > writing the overflow flags back if they are set, which is sufficient to > > clear them. > > Sorry for missing it, but looks xscale uses standalone overflow > register and does not > clear its overflow flag. Hmmm, I don't know very much about the xscale PMUs (the perf code is based on the old OProfile code). The xscale2 overflow handler is certainly doing something with some overflow flags though, but then it checks the pmnc again for each counter... I'd rather not touch that without knowing *exactly* what I'm doing. It may be that the current code deals with the flag but, even if it doesn't, the spurious interrupt isn't the end of the world. I have a board at home that advertises its PMU as xscale2, so I could test that this patch doesn't cause breakage. Will ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops 2012-02-21 15:57 ` Will Deacon @ 2012-02-21 17:36 ` Will Deacon 0 siblings, 0 replies; 9+ messages in thread From: Will Deacon @ 2012-02-21 17:36 UTC (permalink / raw) To: linux-arm-kernel On Tue, Feb 21, 2012 at 03:57:07PM +0000, Will Deacon wrote: > On Tue, Feb 21, 2012 at 03:49:22PM +0000, Ming Lei wrote: > > > > Sorry for missing it, but looks xscale uses standalone overflow > > register and does not > > clear its overflow flag. > > Hmmm, I don't know very much about the xscale PMUs (the perf code is based > on the old OProfile code). The xscale2 overflow handler is certainly doing > something with some overflow flags though, but then it checks the pmnc again > for each counter... Bah, I got curious and took the plunge into the excitingly titled `Intel XScale Core Developer's Manual'. It looks like the interrupt handler currently has a bug, so this should sort it out (I'll try and test it on my xscale board, but that has > 200 days of uptime so I'm reluctant to reboot it). Will >From b3dc51a39347c000745939000faa1d7d50514fa2 Mon Sep 17 00:00:00 2001 From: Will Deacon <will.deacon@arm.com> Date: Tue, 21 Feb 2012 17:14:34 +0000 Subject: [PATCH] ARM: perf: fix overflow handling for xscale2 PMUs xscale2 PMUs indicate overflow not via the PMU control register, but by a separate overflow FLAG register instead. This patch fixes the xscale2 PMU code to use this register to detect to overflow and ensures that we clear any pending overflow when disabling a counter. Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/kernel/perf_event_xscale.c | 10 ++++++++-- 1 files changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c index a5bbd36..71a21e6 100644 --- a/arch/arm/kernel/perf_event_xscale.c +++ b/arch/arm/kernel/perf_event_xscale.c @@ -598,7 +598,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev) if (!event) continue; - if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx)) + if (!xscale2_pmnc_counter_has_overflowed(of_flags, idx)) continue; hwc = &event->hw; @@ -669,7 +669,7 @@ xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) static void xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) { - unsigned long flags, ien, evtsel; + unsigned long flags, ien, evtsel, of_flags; struct pmu_hw_events *events = cpu_pmu->get_hw_events(); ien = xscale2pmu_read_int_enable(); @@ -678,26 +678,31 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) switch (idx) { case XSCALE_CYCLE_COUNTER: ien &= ~XSCALE2_CCOUNT_INT_EN; + of_flags = XSCALE2_CCOUNT_OVERFLOW; break; case XSCALE_COUNTER0: ien &= ~XSCALE2_COUNT0_INT_EN; evtsel &= ~XSCALE2_COUNT0_EVT_MASK; evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT0_EVT_SHFT; + of_flags = XSCALE2_COUNT0_OVERFLOW; break; case XSCALE_COUNTER1: ien &= ~XSCALE2_COUNT1_INT_EN; evtsel &= ~XSCALE2_COUNT1_EVT_MASK; evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT1_EVT_SHFT; + of_flags = XSCALE2_COUNT1_OVERFLOW; break; case XSCALE_COUNTER2: ien &= ~XSCALE2_COUNT2_INT_EN; evtsel &= ~XSCALE2_COUNT2_EVT_MASK; evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT2_EVT_SHFT; + of_flags = XSCALE2_COUNT2_OVERFLOW; break; case XSCALE_COUNTER3: ien &= ~XSCALE2_COUNT3_INT_EN; evtsel &= ~XSCALE2_COUNT3_EVT_MASK; evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT3_EVT_SHFT; + of_flags = XSCALE2_COUNT3_OVERFLOW; break; default: WARN_ONCE(1, "invalid counter number (%d)\n", idx); @@ -707,6 +712,7 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) raw_spin_lock_irqsave(&events->pmu_lock, flags); xscale2pmu_write_event_select(evtsel); xscale2pmu_write_int_enable(ien); + xscale2pmu_write_overflow_flags(of_flags); raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2012-02-21 17:36 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-02-21 4:37 [PATCH] ARM: perf: do not handle deleted counter in irq handler to avoid oops Ming Lei 2012-02-21 13:32 ` Will Deacon 2012-02-21 14:02 ` Ming Lei 2012-02-21 15:19 ` Will Deacon 2012-02-21 15:34 ` Ming Lei 2012-02-21 15:40 ` Will Deacon 2012-02-21 15:49 ` Ming Lei 2012-02-21 15:57 ` Will Deacon 2012-02-21 17:36 ` Will Deacon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).