* [PATCH 4/6] ARM: oprofile: use perf-events framework as backend
@ 2010-05-12 4:35 Deng-Cheng Zhu
2010-05-12 9:32 ` Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Deng-Cheng Zhu @ 2010-05-12 4:35 UTC (permalink / raw)
To: linux-arm-kernel
Hi, Will
This is a great idea.
Here's one comment on irq handling (Maybe you or somebody else has already
addressed in other threads - I'm new to this list):
In original Oprofile code, the irq handler mainly does 3 things - clearing
the flag, resetting the counter, and calling oprofile_add_sample(). By
using perf events as the backend of Oprofile, they are now scattered into
a relatively long code path in armpmu->handle_irq(). Could this be
something improvable?
In addition, the counterpart of resetting the counter in perf events is
the armpmu->write_counter() in armpmu_event_set_period(). How could you
make sure the value written into the counter is the same between the
original Oprofile and the "new" Oprofile?
Deng-Cheng
On 2010-04-19 16:42, Will Deacon wrote:
> There are currently two hardware performance monitoring subsystems in the
> kernel for ARM: OProfile and perf-events. This creates the following
problems:
>
> 1.) Duplicate PMU accessor code. Inevitable code drift may lead to
bugs in one
> framework that are fixed in the other.
>
> 2.) Locking issues. OProfile doesn't reprogram hardware counters between
> profiling runs if the events to be monitored have not been changed.
This means
> that other profiling frameworks cannot use the counters if OProfile
is in use.
>
> 3.) Due to differences in the two frameworks, it may not be possible to
> compare the results obtained by OProfile with those obtained by perf.
>
> This patch removes the OProfile PMU driver code and replaces it with
calls
> to perf, therefore solving the issues mentioned above.
>
> +/*
> + * Overflow callback for oprofile.
> + */
> +static void op_overflow_handler(struct perf_event *event, int unused,
> + struct perf_sample_data *data, struct pt_regs *regs)
> +{
> + int id;
> + u32 cpu = smp_processor_id();
> +
> + for (id = 0; id < perf_num_counters; ++id)
> + if (perf_events[cpu][id] == event)
> + break;
> +
> + if (id != perf_num_counters)
> + oprofile_add_sample(regs, id);
> + else
> + pr_warning("oprofile: ignoring spurious overflow "
> + "on cpu %u\n", cpu);
> +}
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH 4/6] ARM: oprofile: use perf-events framework as backend 2010-05-12 4:35 [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Deng-Cheng Zhu @ 2010-05-12 9:32 ` Will Deacon 0 siblings, 0 replies; 4+ messages in thread From: Will Deacon @ 2010-05-12 9:32 UTC (permalink / raw) To: linux-arm-kernel On Wed, 2010-05-12 at 05:35 +0100, Deng-Cheng Zhu wrote: > Hi, Will > Hi Deng-Cheng, > > This is a great idea. > I'm pleased to hear that :) > Here's one comment on irq handling (Maybe you or somebody else has already > addressed in other threads - I'm new to this list): > This code is pretty new, so comments and concerns are more than welcome. > In original Oprofile code, the irq handler mainly does 3 things - clearing > the flag, resetting the counter, and calling oprofile_add_sample(). By > using perf events as the backend of Oprofile, they are now scattered into > a relatively long code path in armpmu->handle_irq(). Could this be > something improvable? > The IRQ handling path is indeed longer, but this shouldn't affect the profiling data you obtain provided that you aren't sampling with an unrealistic period [and essentially profiling the profiling framework]. The PMU interrupt handler for ARM runs with interrupts disabled and passes the IRQ registers to the overflow handler, so although the execution path is longer than before, you're still dealing with the same data. I guess the benefit of reducing this code path is that you minimise the overhead of profiling, but unless you're seeing big problems with the new framework, I don't think this is an issue. > In addition, the counterpart of resetting the counter in perf events is > the armpmu->write_counter() in armpmu_event_set_period(). How could you > make sure the value written into the counter is the same between the > original Oprofile and the "new" Oprofile? > For Oprofile, we only deal with the sample_period attribute field and leave the frequency stuff alone. This means that the only time that a different value is written to the hardware is when we are compensating for changes to the counter that have occurred during the handling path [or more importantly, during a section of code running with interrupts disabled]. I don't believe OProfile on ARM used to do this, but it really should have. One thing that might be worth communicating to OProfile is if an event is throttled by perf. In this case, the sample period would need adjusting by userspace. Currently I don't think we can detect if this happens. I hope that helps, Will ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v4]
@ 2010-04-19 16:42 Will Deacon
2010-04-19 16:42 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2010-04-19 16:42 UTC (permalink / raw)
To: linux-arm-kernel
This is version 4 [hopefully the final iteration!] of the patchset originally
posted here:
V1: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-February/010476.html
V2: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/011126.html
V3: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-April/012871.html
The only change since V3 is that the previous changes to the generic
perf-events code have been moved into the ARM backend. This places the whole
patchset under arch/arm/.
I've not received any feedback for version 3, so unless I receive any objections
to this version, I'll submit it to the patch system this week.
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Jean Pihet <jpihet@mvista.com>
Will Deacon (6):
ARM: perf-events: use numeric ID to identify PMU
ARM: perf-events: add support for xscale PMUs
ARM: perf-events: allow modules to query the number of hardware
counters
ARM: oprofile: use perf-events framework as backend
ARM: oprofile: remove old files and update KConfig
ARM: oprofile: convert from sysdev to platform device
arch/arm/Kconfig | 26 +-
arch/arm/include/asm/perf_event.h | 17 +
arch/arm/kernel/perf_event.c | 876 ++++++++++++++++++++++++++++++-
arch/arm/oprofile/Makefile | 7 +-
arch/arm/oprofile/backtrace.c | 83 ---
arch/arm/oprofile/common.c | 375 +++++++++++---
arch/arm/oprofile/op_arm_model.h | 35 --
arch/arm/oprofile/op_counter.h | 27 -
arch/arm/oprofile/op_model_arm11_core.c | 162 ------
arch/arm/oprofile/op_model_arm11_core.h | 45 --
arch/arm/oprofile/op_model_mpcore.c | 306 -----------
arch/arm/oprofile/op_model_mpcore.h | 61 ---
arch/arm/oprofile/op_model_v6.c | 78 ---
arch/arm/oprofile/op_model_v7.c | 415 ---------------
arch/arm/oprofile/op_model_v7.h | 103 ----
arch/arm/oprofile/op_model_xscale.c | 444 ----------------
16 files changed, 1190 insertions(+), 1870 deletions(-)
delete mode 100644 arch/arm/oprofile/backtrace.c
delete mode 100644 arch/arm/oprofile/op_arm_model.h
delete mode 100644 arch/arm/oprofile/op_counter.h
delete mode 100644 arch/arm/oprofile/op_model_arm11_core.c
delete mode 100644 arch/arm/oprofile/op_model_arm11_core.h
delete mode 100644 arch/arm/oprofile/op_model_mpcore.c
delete mode 100644 arch/arm/oprofile/op_model_mpcore.h
delete mode 100644 arch/arm/oprofile/op_model_v6.c
delete mode 100644 arch/arm/oprofile/op_model_v7.c
delete mode 100644 arch/arm/oprofile/op_model_v7.h
delete mode 100644 arch/arm/oprofile/op_model_xscale.c
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU 2010-04-19 16:42 [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v4] Will Deacon @ 2010-04-19 16:42 ` Will Deacon 2010-04-19 16:42 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-19 16:42 UTC (permalink / raw) To: linux-arm-kernel The ARM perf-events framework provides support for a number of different PMUs using struct arm_pmu. The char *name field of this struct can be used to identify the PMU, but this is cumbersome if used outside of perf. This patch replaces the name string for a PMU with an enum, which holds a unique ID for the PMU being represented. This ID can be used to index an array of names within perf, so no functionality is lost. The presence of the ID field, allows other kernel subsystems [currently oprofile] to use their own mappings for the PMU name. Cc: Jean Pihet <jpihet@mvista.com> Acked-by: Jamie Iles <jamie.iles@picochip.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/include/asm/perf_event.h | 14 +++++++++++++ arch/arm/kernel/perf_event.c | 39 +++++++++++++++++++++++++++--------- 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h index 49e3049..fa4b326 100644 --- a/arch/arm/include/asm/perf_event.h +++ b/arch/arm/include/asm/perf_event.h @@ -28,4 +28,18 @@ set_perf_event_pending(void) * same indexes here for consistency. */ #define PERF_EVENT_INDEX_OFFSET 1 +/* ARM perf PMU IDs for use by internal perf clients. */ +enum arm_perf_pmu_ids { + ARM_PERF_PMU_ID_XSCALE1 = 0, + ARM_PERF_PMU_ID_XSCALE2, + ARM_PERF_PMU_ID_V6, + ARM_PERF_PMU_ID_V6MP, + ARM_PERF_PMU_ID_CA8, + ARM_PERF_PMU_ID_CA9, + ARM_NUM_PMU_IDS, +}; + +extern enum arm_perf_pmu_ids +armpmu_get_pmu_id(void); + #endif /* __ARM_PERF_EVENT_H__ */ diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 89a77fc..10a0bcd 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -16,6 +16,7 @@ #include <linux/interrupt.h> #include <linux/kernel.h> +#include <linux/module.h> #include <linux/perf_event.h> #include <linux/platform_device.h> #include <linux/spinlock.h> @@ -68,8 +69,18 @@ struct cpu_hw_events { }; DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events); +/* PMU names. */ +static const char *arm_pmu_names[] = { + [ARM_PERF_PMU_ID_XSCALE1] = "xscale1", + [ARM_PERF_PMU_ID_XSCALE2] = "xscale2", + [ARM_PERF_PMU_ID_V6] = "v6", + [ARM_PERF_PMU_ID_V6MP] = "v6mpcore", + [ARM_PERF_PMU_ID_CA8] = "ARMv7 Cortex-A8", + [ARM_PERF_PMU_ID_CA9] = "ARMv7 Cortex-A9", +}; + struct arm_pmu { - char *name; + enum arm_perf_pmu_ids id; irqreturn_t (*handle_irq)(int irq_num, void *dev); void (*enable)(struct hw_perf_event *evt, int idx); void (*disable)(struct hw_perf_event *evt, int idx); @@ -88,6 +99,18 @@ struct arm_pmu { /* Set at runtime when we know what CPU type we are. */ static const struct arm_pmu *armpmu; +enum arm_perf_pmu_ids +armpmu_get_pmu_id(void) +{ + int id = -ENODEV; + + if (armpmu != NULL) + id = armpmu->id; + + return id; +} +EXPORT_SYMBOL_GPL(armpmu_get_pmu_id); + #define HW_OP_UNSUPPORTED 0xFFFF #define C(_x) \ @@ -1154,7 +1177,7 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, } static const struct arm_pmu armv6pmu = { - .name = "v6", + .id = ARM_PERF_PMU_ID_V6, .handle_irq = armv6pmu_handle_irq, .enable = armv6pmu_enable_event, .disable = armv6pmu_disable_event, @@ -1177,7 +1200,7 @@ static const struct arm_pmu armv6pmu = { * reset the period and enable the interrupt reporting. */ static const struct arm_pmu armv6mpcore_pmu = { - .name = "v6mpcore", + .id = ARM_PERF_PMU_ID_V6MP, .handle_irq = armv6pmu_handle_irq, .enable = armv6pmu_enable_event, .disable = armv6mpcore_pmu_disable_event, @@ -1207,10 +1230,6 @@ static const struct arm_pmu armv6mpcore_pmu = { * counter and all 4 performance counters together can be reset separately. */ -#define ARMV7_PMU_CORTEX_A8_NAME "ARMv7 Cortex-A8" - -#define ARMV7_PMU_CORTEX_A9_NAME "ARMv7 Cortex-A9" - /* Common ARMv7 event types */ enum armv7_perf_types { ARMV7_PERFCTR_PMNC_SW_INCR = 0x00, @@ -2115,7 +2134,7 @@ init_hw_perf_events(void) perf_max_events = armv6mpcore_pmu.num_events; break; case 0xC080: /* Cortex-A8 */ - armv7pmu.name = ARMV7_PMU_CORTEX_A8_NAME; + armv7pmu.id = ARM_PERF_PMU_ID_CA8; memcpy(armpmu_perf_cache_map, armv7_a8_perf_cache_map, sizeof(armv7_a8_perf_cache_map)); armv7pmu.event_map = armv7_a8_pmu_event_map; @@ -2127,7 +2146,7 @@ init_hw_perf_events(void) perf_max_events = armv7pmu.num_events; break; case 0xC090: /* Cortex-A9 */ - armv7pmu.name = ARMV7_PMU_CORTEX_A9_NAME; + armv7pmu.id = ARM_PERF_PMU_ID_CA9; memcpy(armpmu_perf_cache_map, armv7_a9_perf_cache_map, sizeof(armv7_a9_perf_cache_map)); armv7pmu.event_map = armv7_a9_pmu_event_map; @@ -2146,7 +2165,7 @@ init_hw_perf_events(void) if (armpmu) pr_info("enabled with %s PMU driver, %d counters available\n", - armpmu->name, armpmu->num_events); + arm_pmu_names[armpmu->id], armpmu->num_events); return 0; } -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/6] ARM: perf-events: add support for xscale PMUs 2010-04-19 16:42 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon @ 2010-04-19 16:42 ` Will Deacon 2010-04-19 16:42 ` [PATCH 3/6] ARM: perf-events: allow modules to query the number of hardware counters Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-19 16:42 UTC (permalink / raw) To: linux-arm-kernel The perf-events framework for ARM only supports v6 and v7 cores. This patch adds support for xscale v1 and v2 PMUs to perf, based on the OProfile drivers in arch/arm/oprofile/op_model_xscale.c Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/kernel/perf_event.c | 827 +++++++++++++++++++++++++++++++++++++++++- 1 files changed, 821 insertions(+), 6 deletions(-) diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 10a0bcd..381f121 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -2108,6 +2108,803 @@ static u32 __init armv7_reset_read_pmnc(void) return nb_cnt + 1; } +/* + * ARMv5 [xscale] Performance counter handling code. + * + * Based on xscale OProfile code. + * + * There are two variants of the xscale PMU that we support: + * - xscale1pmu: 2 event counters and a cycle counter + * - xscale2pmu: 4 event counters and a cycle counter + * The two variants share event definitions, but have different + * PMU structures. + */ + +enum xscale_perf_types { + XSCALE_PERFCTR_ICACHE_MISS = 0x00, + XSCALE_PERFCTR_ICACHE_NO_DELIVER = 0x01, + XSCALE_PERFCTR_DATA_STALL = 0x02, + XSCALE_PERFCTR_ITLB_MISS = 0x03, + XSCALE_PERFCTR_DTLB_MISS = 0x04, + XSCALE_PERFCTR_BRANCH = 0x05, + XSCALE_PERFCTR_BRANCH_MISS = 0x06, + XSCALE_PERFCTR_INSTRUCTION = 0x07, + XSCALE_PERFCTR_DCACHE_FULL_STALL = 0x08, + XSCALE_PERFCTR_DCACHE_FULL_STALL_CONTIG = 0x09, + XSCALE_PERFCTR_DCACHE_ACCESS = 0x0A, + XSCALE_PERFCTR_DCACHE_MISS = 0x0B, + XSCALE_PERFCTR_DCACHE_WRITE_BACK = 0x0C, + XSCALE_PERFCTR_PC_CHANGED = 0x0D, + XSCALE_PERFCTR_BCU_REQUEST = 0x10, + XSCALE_PERFCTR_BCU_FULL = 0x11, + XSCALE_PERFCTR_BCU_DRAIN = 0x12, + XSCALE_PERFCTR_BCU_ECC_NO_ELOG = 0x14, + XSCALE_PERFCTR_BCU_1_BIT_ERR = 0x15, + XSCALE_PERFCTR_RMW = 0x16, + /* XSCALE_PERFCTR_CCNT is not hardware defined */ + XSCALE_PERFCTR_CCNT = 0xFE, + XSCALE_PERFCTR_UNUSED = 0xFF, +}; + +enum xscale_counters { + XSCALE_CYCLE_COUNTER = 1, + XSCALE_COUNTER0, + XSCALE_COUNTER1, + XSCALE_COUNTER2, + XSCALE_COUNTER3, +}; + +static const unsigned xscale_perf_map[PERF_COUNT_HW_MAX] = { + [PERF_COUNT_HW_CPU_CYCLES] = XSCALE_PERFCTR_CCNT, + [PERF_COUNT_HW_INSTRUCTIONS] = XSCALE_PERFCTR_INSTRUCTION, + [PERF_COUNT_HW_CACHE_REFERENCES] = HW_OP_UNSUPPORTED, + [PERF_COUNT_HW_CACHE_MISSES] = HW_OP_UNSUPPORTED, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = XSCALE_PERFCTR_BRANCH, + [PERF_COUNT_HW_BRANCH_MISSES] = XSCALE_PERFCTR_BRANCH_MISS, + [PERF_COUNT_HW_BUS_CYCLES] = HW_OP_UNSUPPORTED, +}; + +static const unsigned xscale_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { + [C(L1D)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = XSCALE_PERFCTR_DCACHE_ACCESS, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DCACHE_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = XSCALE_PERFCTR_DCACHE_ACCESS, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DCACHE_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(L1I)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ICACHE_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ICACHE_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(DTLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DTLB_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DTLB_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(ITLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ITLB_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ITLB_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(BPU)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, +}; + +#define XSCALE_PMU_ENABLE 0x001 +#define XSCALE_PMN_RESET 0x002 +#define XSCALE_CCNT_RESET 0x004 +#define XSCALE_PMU_RESET (CCNT_RESET | PMN_RESET) +#define XSCALE_PMU_CNT64 0x008 + +static inline int +xscalepmu_event_map(int config) +{ + int mapping = xscale_perf_map[config]; + if (HW_OP_UNSUPPORTED == mapping) + mapping = -EOPNOTSUPP; + return mapping; +} + +static u64 +xscalepmu_raw_event(u64 config) +{ + return config & 0xff; +} + +#define XSCALE1_OVERFLOWED_MASK 0x700 +#define XSCALE1_CCOUNT_OVERFLOW 0x400 +#define XSCALE1_COUNT0_OVERFLOW 0x100 +#define XSCALE1_COUNT1_OVERFLOW 0x200 +#define XSCALE1_CCOUNT_INT_EN 0x040 +#define XSCALE1_COUNT0_INT_EN 0x010 +#define XSCALE1_COUNT1_INT_EN 0x020 +#define XSCALE1_COUNT0_EVT_SHFT 12 +#define XSCALE1_COUNT0_EVT_MASK (0xff << XSCALE1_COUNT0_EVT_SHFT) +#define XSCALE1_COUNT1_EVT_SHFT 20 +#define XSCALE1_COUNT1_EVT_MASK (0xff << XSCALE1_COUNT1_EVT_SHFT) + +static inline u32 +xscale1pmu_read_pmnc(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c0, c0, 0" : "=r" (val)); + return val; +} + +static inline void +xscale1pmu_write_pmnc(u32 val) +{ + /* upper 4bits and 7, 11 are write-as-0 */ + val &= 0xffff77f; + asm volatile("mcr p14, 0, %0, c0, c0, 0" : : "r" (val)); +} + +static inline int +xscale1_pmnc_counter_has_overflowed(unsigned long pmnc, + enum xscale_counters counter) +{ + int ret = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + ret = pmnc & XSCALE1_CCOUNT_OVERFLOW; + break; + case XSCALE_COUNTER0: + ret = pmnc & XSCALE1_COUNT0_OVERFLOW; + break; + case XSCALE_COUNTER1: + ret = pmnc & XSCALE1_COUNT1_OVERFLOW; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", counter); + } + + return ret; +} + +static irqreturn_t +xscale1pmu_handle_irq(int irq_num, void *dev) +{ + unsigned long pmnc; + struct perf_sample_data data; + struct cpu_hw_events *cpuc; + struct pt_regs *regs; + int idx; + + /* + * NOTE: there's an A stepping erratum that states if an overflow + * bit already exists and another occurs, the previous + * Overflow bit gets cleared. There's no workaround. + * Fixed in B stepping or later. + */ + pmnc = xscale1pmu_read_pmnc(); + + /* + * Write the value back to clear the overflow flags. Overflow + * flags remain in pmnc for use below. We also disable the PMU + * while we process the interrupt. + */ + xscale1pmu_write_pmnc(pmnc & ~XSCALE_PMU_ENABLE); + + if (!(pmnc & XSCALE1_OVERFLOWED_MASK)) + return IRQ_NONE; + + regs = get_irq_regs(); + + perf_sample_data_init(&data, 0); + + cpuc = &__get_cpu_var(cpu_hw_events); + for (idx = 0; idx <= armpmu->num_events; ++idx) { + struct perf_event *event = cpuc->events[idx]; + struct hw_perf_event *hwc; + + if (!test_bit(idx, cpuc->active_mask)) + continue; + + if (!xscale1_pmnc_counter_has_overflowed(pmnc, idx)) + continue; + + hwc = &event->hw; + armpmu_event_update(event, hwc, idx); + data.period = event->hw.last_period; + if (!armpmu_event_set_period(event, hwc, idx)) + continue; + + if (perf_event_overflow(event, 0, &data, regs)) + armpmu->disable(hwc, idx); + } + + perf_event_do_pending(); + + /* + * Re-enable the PMU. + */ + pmnc = xscale1pmu_read_pmnc() | XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(pmnc); + + return IRQ_HANDLED; +} + +static void +xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long val, mask, evt, flags; + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + mask = 0; + evt = XSCALE1_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + mask = XSCALE1_COUNT0_EVT_MASK; + evt = (hwc->config_base << XSCALE1_COUNT0_EVT_SHFT) | + XSCALE1_COUNT0_INT_EN; + break; + case XSCALE_COUNTER1: + mask = XSCALE1_COUNT1_EVT_MASK; + evt = (hwc->config_base << XSCALE1_COUNT1_EVT_SHFT) | + XSCALE1_COUNT1_INT_EN; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~mask; + val |= evt; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long val, mask, evt, flags; + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + mask = XSCALE1_CCOUNT_INT_EN; + evt = 0; + break; + case XSCALE_COUNTER0: + mask = XSCALE1_COUNT0_INT_EN | XSCALE1_COUNT0_EVT_MASK; + evt = XSCALE_PERFCTR_UNUSED << XSCALE1_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + mask = XSCALE1_COUNT1_INT_EN | XSCALE1_COUNT1_EVT_MASK; + evt = XSCALE_PERFCTR_UNUSED << XSCALE1_COUNT1_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~mask; + val |= evt; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static int +xscale1pmu_get_event_idx(struct cpu_hw_events *cpuc, + struct hw_perf_event *event) +{ + if (XSCALE_PERFCTR_CCNT == event->config_base) { + if (test_and_set_bit(XSCALE_CYCLE_COUNTER, cpuc->used_mask)) + return -EAGAIN; + + return XSCALE_CYCLE_COUNTER; + } else { + if (!test_and_set_bit(XSCALE_COUNTER1, cpuc->used_mask)) { + return XSCALE_COUNTER1; + } + + if (!test_and_set_bit(XSCALE_COUNTER0, cpuc->used_mask)) { + return XSCALE_COUNTER0; + } + + return -EAGAIN; + } +} + +static void +xscale1pmu_start(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val |= XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale1pmu_stop(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static inline u32 +xscale1pmu_read_counter(int counter) +{ + u32 val = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mrc p14, 0, %0, c1, c0, 0" : "=r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mrc p14, 0, %0, c2, c0, 0" : "=r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mrc p14, 0, %0, c3, c0, 0" : "=r" (val)); + break; + } + + return val; +} + +static inline void +xscale1pmu_write_counter(int counter, u32 val) +{ + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mcr p14, 0, %0, c1, c0, 0" : : "r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mcr p14, 0, %0, c2, c0, 0" : : "r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mcr p14, 0, %0, c3, c0, 0" : : "r" (val)); + break; + } +} + +static const struct arm_pmu xscale1pmu = { + .id = ARM_PERF_PMU_ID_XSCALE1, + .handle_irq = xscale1pmu_handle_irq, + .enable = xscale1pmu_enable_event, + .disable = xscale1pmu_disable_event, + .event_map = xscalepmu_event_map, + .raw_event = xscalepmu_raw_event, + .read_counter = xscale1pmu_read_counter, + .write_counter = xscale1pmu_write_counter, + .get_event_idx = xscale1pmu_get_event_idx, + .start = xscale1pmu_start, + .stop = xscale1pmu_stop, + .num_events = 3, + .max_period = (1LLU << 32) - 1, +}; + +#define XSCALE2_OVERFLOWED_MASK 0x01f +#define XSCALE2_CCOUNT_OVERFLOW 0x001 +#define XSCALE2_COUNT0_OVERFLOW 0x002 +#define XSCALE2_COUNT1_OVERFLOW 0x004 +#define XSCALE2_COUNT2_OVERFLOW 0x008 +#define XSCALE2_COUNT3_OVERFLOW 0x010 +#define XSCALE2_CCOUNT_INT_EN 0x001 +#define XSCALE2_COUNT0_INT_EN 0x002 +#define XSCALE2_COUNT1_INT_EN 0x004 +#define XSCALE2_COUNT2_INT_EN 0x008 +#define XSCALE2_COUNT3_INT_EN 0x010 +#define XSCALE2_COUNT0_EVT_SHFT 0 +#define XSCALE2_COUNT0_EVT_MASK (0xff << XSCALE2_COUNT0_EVT_SHFT) +#define XSCALE2_COUNT1_EVT_SHFT 8 +#define XSCALE2_COUNT1_EVT_MASK (0xff << XSCALE2_COUNT1_EVT_SHFT) +#define XSCALE2_COUNT2_EVT_SHFT 16 +#define XSCALE2_COUNT2_EVT_MASK (0xff << XSCALE2_COUNT2_EVT_SHFT) +#define XSCALE2_COUNT3_EVT_SHFT 24 +#define XSCALE2_COUNT3_EVT_MASK (0xff << XSCALE2_COUNT3_EVT_SHFT) + +static inline u32 +xscale2pmu_read_pmnc(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c0, c1, 0" : "=r" (val)); + /* bits 1-2 and 4-23 are read-unpredictable */ + return val & 0xff000009; +} + +static inline void +xscale2pmu_write_pmnc(u32 val) +{ + /* bits 4-23 are write-as-0, 24-31 are write ignored */ + val &= 0xf; + asm volatile("mcr p14, 0, %0, c0, c1, 0" : : "r" (val)); +} + +static inline u32 +xscale2pmu_read_overflow_flags(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c5, c1, 0" : "=r" (val)); + return val; +} + +static inline void +xscale2pmu_write_overflow_flags(u32 val) +{ + asm volatile("mcr p14, 0, %0, c5, c1, 0" : : "r" (val)); +} + +static inline u32 +xscale2pmu_read_event_select(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c8, c1, 0" : "=r" (val)); + return val; +} + +static inline void +xscale2pmu_write_event_select(u32 val) +{ + asm volatile("mcr p14, 0, %0, c8, c1, 0" : : "r"(val)); +} + +static inline u32 +xscale2pmu_read_int_enable(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c4, c1, 0" : "=r" (val)); + return val; +} + +static void +xscale2pmu_write_int_enable(u32 val) +{ + asm volatile("mcr p14, 0, %0, c4, c1, 0" : : "r" (val)); +} + +static inline int +xscale2_pmnc_counter_has_overflowed(unsigned long of_flags, + enum xscale_counters counter) +{ + int ret = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + ret = of_flags & XSCALE2_CCOUNT_OVERFLOW; + break; + case XSCALE_COUNTER0: + ret = of_flags & XSCALE2_COUNT0_OVERFLOW; + break; + case XSCALE_COUNTER1: + ret = of_flags & XSCALE2_COUNT1_OVERFLOW; + break; + case XSCALE_COUNTER2: + ret = of_flags & XSCALE2_COUNT2_OVERFLOW; + break; + case XSCALE_COUNTER3: + ret = of_flags & XSCALE2_COUNT3_OVERFLOW; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", counter); + } + + return ret; +} + +static irqreturn_t +xscale2pmu_handle_irq(int irq_num, void *dev) +{ + unsigned long pmnc, of_flags; + struct perf_sample_data data; + struct cpu_hw_events *cpuc; + struct pt_regs *regs; + int idx; + + /* Disable the PMU. */ + pmnc = xscale2pmu_read_pmnc(); + xscale2pmu_write_pmnc(pmnc & ~XSCALE_PMU_ENABLE); + + /* Check the overflow flag register. */ + of_flags = xscale2pmu_read_overflow_flags(); + if (!(of_flags & XSCALE2_OVERFLOWED_MASK)) + return IRQ_NONE; + + /* Clear the overflow bits. */ + xscale2pmu_write_overflow_flags(of_flags); + + regs = get_irq_regs(); + + perf_sample_data_init(&data, 0); + + cpuc = &__get_cpu_var(cpu_hw_events); + for (idx = 0; idx <= armpmu->num_events; ++idx) { + struct perf_event *event = cpuc->events[idx]; + struct hw_perf_event *hwc; + + if (!test_bit(idx, cpuc->active_mask)) + continue; + + if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx)) + continue; + + hwc = &event->hw; + armpmu_event_update(event, hwc, idx); + data.period = event->hw.last_period; + if (!armpmu_event_set_period(event, hwc, idx)) + continue; + + if (perf_event_overflow(event, 0, &data, regs)) + armpmu->disable(hwc, idx); + } + + perf_event_do_pending(); + + /* + * Re-enable the PMU. + */ + pmnc = xscale2pmu_read_pmnc() | XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(pmnc); + + return IRQ_HANDLED; +} + +static void +xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long flags, ien, evtsel; + + ien = xscale2pmu_read_int_enable(); + evtsel = xscale2pmu_read_event_select(); + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + ien |= XSCALE2_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + ien |= XSCALE2_COUNT0_INT_EN; + evtsel &= ~XSCALE2_COUNT0_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + ien |= XSCALE2_COUNT1_INT_EN; + evtsel &= ~XSCALE2_COUNT1_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT1_EVT_SHFT; + break; + case XSCALE_COUNTER2: + ien |= XSCALE2_COUNT2_INT_EN; + evtsel &= ~XSCALE2_COUNT2_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT2_EVT_SHFT; + break; + case XSCALE_COUNTER3: + ien |= XSCALE2_COUNT3_INT_EN; + evtsel &= ~XSCALE2_COUNT3_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT3_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + xscale2pmu_write_event_select(evtsel); + xscale2pmu_write_int_enable(ien); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long flags, ien, evtsel; + + ien = xscale2pmu_read_int_enable(); + evtsel = xscale2pmu_read_event_select(); + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + ien &= ~XSCALE2_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + ien &= ~XSCALE2_COUNT0_INT_EN; + evtsel &= ~XSCALE2_COUNT0_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + ien &= ~XSCALE2_COUNT1_INT_EN; + evtsel &= ~XSCALE2_COUNT1_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT1_EVT_SHFT; + break; + case XSCALE_COUNTER2: + ien &= ~XSCALE2_COUNT2_INT_EN; + evtsel &= ~XSCALE2_COUNT2_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT2_EVT_SHFT; + break; + case XSCALE_COUNTER3: + ien &= ~XSCALE2_COUNT3_INT_EN; + evtsel &= ~XSCALE2_COUNT3_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT3_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + xscale2pmu_write_event_select(evtsel); + xscale2pmu_write_int_enable(ien); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static int +xscale2pmu_get_event_idx(struct cpu_hw_events *cpuc, + struct hw_perf_event *event) +{ + int idx = xscale1pmu_get_event_idx(cpuc, event); + if (idx >= 0) + goto out; + + if (!test_and_set_bit(XSCALE_COUNTER3, cpuc->used_mask)) + idx = XSCALE_COUNTER3; + else if (!test_and_set_bit(XSCALE_COUNTER2, cpuc->used_mask)) + idx = XSCALE_COUNTER2; +out: + return idx; +} + +static void +xscale2pmu_start(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; + val |= XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale2pmu_stop(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale2pmu_read_pmnc(); + val &= ~XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static inline u32 +xscale2pmu_read_counter(int counter) +{ + u32 val = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mrc p14, 0, %0, c1, c1, 0" : "=r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mrc p14, 0, %0, c0, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mrc p14, 0, %0, c1, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER2: + asm volatile("mrc p14, 0, %0, c2, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER3: + asm volatile("mrc p14, 0, %0, c3, c2, 0" : "=r" (val)); + break; + } + + return val; +} + +static inline void +xscale2pmu_write_counter(int counter, u32 val) +{ + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mcr p14, 0, %0, c1, c1, 0" : : "r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mcr p14, 0, %0, c0, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mcr p14, 0, %0, c1, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER2: + asm volatile("mcr p14, 0, %0, c2, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER3: + asm volatile("mcr p14, 0, %0, c3, c2, 0" : : "r" (val)); + break; + } +} + +static const struct arm_pmu xscale2pmu = { + .id = ARM_PERF_PMU_ID_XSCALE2, + .handle_irq = xscale2pmu_handle_irq, + .enable = xscale2pmu_enable_event, + .disable = xscale2pmu_disable_event, + .event_map = xscalepmu_event_map, + .raw_event = xscalepmu_raw_event, + .read_counter = xscale2pmu_read_counter, + .write_counter = xscale2pmu_write_counter, + .get_event_idx = xscale2pmu_get_event_idx, + .start = xscale2pmu_start, + .stop = xscale2pmu_stop, + .num_events = 5, + .max_period = (1LLU << 32) - 1, +}; + static int __init init_hw_perf_events(void) { @@ -2115,7 +2912,7 @@ init_hw_perf_events(void) unsigned long implementor = (cpuid & 0xFF000000) >> 24; unsigned long part_number = (cpuid & 0xFFF0); - /* We only support ARM CPUs implemented by ARM at the moment. */ + /* ARM Ltd CPUs. */ if (0x41 == implementor) { switch (part_number) { case 0xB360: /* ARM1136 */ @@ -2157,15 +2954,33 @@ init_hw_perf_events(void) armv7pmu.num_events = armv7_reset_read_pmnc(); perf_max_events = armv7pmu.num_events; break; - default: - pr_info("no hardware support available\n"); - perf_max_events = -1; + } + /* Intel CPUs [xscale]. */ + } else if (0x69 == implementor) { + part_number = (cpuid >> 13) & 0x7; + switch (part_number) { + case 1: + armpmu = &xscale1pmu; + memcpy(armpmu_perf_cache_map, xscale_perf_cache_map, + sizeof(xscale_perf_cache_map)); + perf_max_events = xscale1pmu.num_events; + break; + case 2: + armpmu = &xscale2pmu; + memcpy(armpmu_perf_cache_map, xscale_perf_cache_map, + sizeof(xscale_perf_cache_map)); + perf_max_events = xscale2pmu.num_events; + break; } } - if (armpmu) + if (armpmu) { pr_info("enabled with %s PMU driver, %d counters available\n", - arm_pmu_names[armpmu->id], armpmu->num_events); + arm_pmu_names[armpmu->id], armpmu->num_events); + } else { + pr_info("no hardware support available\n"); + perf_max_events = -1; + } return 0; } -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/6] ARM: perf-events: allow modules to query the number of hardware counters 2010-04-19 16:42 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon @ 2010-04-19 16:42 ` Will Deacon 2010-04-19 16:42 ` [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-19 16:42 UTC (permalink / raw) To: linux-arm-kernel For OProfile to initialise oprofilefs correctly, it needs to know the number of counters it can represent. This patch adds a function to the ARM perf-events backend to return the number of hardware counters available for the current PMU. Cc: Jamie Iles <jamie.iles@picochip.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/include/asm/perf_event.h | 3 +++ arch/arm/kernel/perf_event.c | 12 ++++++++++++ 2 files changed, 15 insertions(+), 0 deletions(-) diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h index fa4b326..48837e6 100644 --- a/arch/arm/include/asm/perf_event.h +++ b/arch/arm/include/asm/perf_event.h @@ -42,4 +42,7 @@ enum arm_perf_pmu_ids { extern enum arm_perf_pmu_ids armpmu_get_pmu_id(void); +extern int +armpmu_get_max_events(void); + #endif /* __ARM_PERF_EVENT_H__ */ diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 381f121..c457686 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -111,6 +111,18 @@ armpmu_get_pmu_id(void) } EXPORT_SYMBOL_GPL(armpmu_get_pmu_id); +int +armpmu_get_max_events(void) +{ + int max_events = 0; + + if (armpmu != NULL) + max_events = armpmu->num_events; + + return max_events; +} +EXPORT_SYMBOL_GPL(armpmu_get_max_events); + #define HW_OP_UNSUPPORTED 0xFFFF #define C(_x) \ -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 4/6] ARM: oprofile: use perf-events framework as backend 2010-04-19 16:42 ` [PATCH 3/6] ARM: perf-events: allow modules to query the number of hardware counters Will Deacon @ 2010-04-19 16:42 ` Will Deacon 0 siblings, 0 replies; 4+ messages in thread From: Will Deacon @ 2010-04-19 16:42 UTC (permalink / raw) To: linux-arm-kernel There are currently two hardware performance monitoring subsystems in the kernel for ARM: OProfile and perf-events. This creates the following problems: 1.) Duplicate PMU accessor code. Inevitable code drift may lead to bugs in one framework that are fixed in the other. 2.) Locking issues. OProfile doesn't reprogram hardware counters between profiling runs if the events to be monitored have not been changed. This means that other profiling frameworks cannot use the counters if OProfile is in use. 3.) Due to differences in the two frameworks, it may not be possible to compare the results obtained by OProfile with those obtained by perf. This patch removes the OProfile PMU driver code and replaces it with calls to perf, therefore solving the issues mentioned above. The only userspace-visible change is the lack of SCU counter support for 11MPCore. This is currently unsupported by OProfile userspace tools anyway and therefore shouldn't cause any problems. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Jean Pihet <jpihet@mvista.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/oprofile/common.c | 333 +++++++++++++++++++++++++++++++++++++------- 1 files changed, 281 insertions(+), 52 deletions(-) diff --git a/arch/arm/oprofile/common.c b/arch/arm/oprofile/common.c index 3fcd752..aad83df 100644 --- a/arch/arm/oprofile/common.c +++ b/arch/arm/oprofile/common.c @@ -2,32 +2,183 @@ * @file common.c * * @remark Copyright 2004 Oprofile Authors + * @remark Copyright 2010 ARM Ltd. * @remark Read the file COPYING * * @author Zwane Mwaikambo + * @author Will Deacon [move to perf] */ +#include <linux/cpumask.h> +#include <linux/errno.h> #include <linux/init.h> +#include <linux/mutex.h> #include <linux/oprofile.h> -#include <linux/errno.h> +#include <linux/perf_event.h> #include <linux/slab.h> #include <linux/sysdev.h> -#include <linux/mutex.h> +#include <asm/stacktrace.h> +#include <linux/uaccess.h> + +#include <asm/perf_event.h> +#include <asm/ptrace.h> -#include "op_counter.h" -#include "op_arm_model.h" +#ifdef CONFIG_HW_PERF_EVENTS +/* + * Per performance monitor configuration as set via oprofilefs. + */ +struct op_counter_config { + unsigned long count; + unsigned long enabled; + unsigned long event; + unsigned long unit_mask; + unsigned long kernel; + unsigned long user; + struct perf_event_attr attr; +}; -static struct op_arm_model_spec *op_arm_model; static int op_arm_enabled; static DEFINE_MUTEX(op_arm_mutex); -struct op_counter_config *counter_config; +static struct op_counter_config *counter_config; +static struct perf_event **perf_events[nr_cpumask_bits]; +static int perf_num_counters; + +/* + * Overflow callback for oprofile. + */ +static void op_overflow_handler(struct perf_event *event, int unused, + struct perf_sample_data *data, struct pt_regs *regs) +{ + int id; + u32 cpu = smp_processor_id(); + + for (id = 0; id < perf_num_counters; ++id) + if (perf_events[cpu][id] == event) + break; + + if (id != perf_num_counters) + oprofile_add_sample(regs, id); + else + pr_warning("oprofile: ignoring spurious overflow " + "on cpu %u\n", cpu); +} + +/* + * Called by op_arm_setup to create perf attributes to mirror the oprofile + * settings in counter_config. Attributes are created as `pinned' events and + * so are permanently scheduled on the PMU. + */ +static void op_perf_setup(void) +{ + int i; + u32 size = sizeof(struct perf_event_attr); + struct perf_event_attr *attr; + + for (i = 0; i < perf_num_counters; ++i) { + attr = &counter_config[i].attr; + memset(attr, 0, size); + attr->type = PERF_TYPE_RAW; + attr->size = size; + attr->config = counter_config[i].event; + attr->sample_period = counter_config[i].count; + attr->pinned = 1; + } +} + +static int op_create_counter(int cpu, int event) +{ + int ret = 0; + struct perf_event *pevent; + + if (!counter_config[event].enabled || (perf_events[cpu][event] != NULL)) + return ret; + + pevent = perf_event_create_kernel_counter(&counter_config[event].attr, + cpu, -1, + op_overflow_handler); + + if (IS_ERR(pevent)) { + ret = PTR_ERR(pevent); + } else if (pevent->state != PERF_EVENT_STATE_ACTIVE) { + pr_warning("oprofile: failed to enable event %d " + "on CPU %d\n", event, cpu); + ret = -EBUSY; + } else { + perf_events[cpu][event] = pevent; + } + + return ret; +} + +static void op_destroy_counter(int cpu, int event) +{ + struct perf_event *pevent = perf_events[cpu][event]; + + if (pevent) { + perf_event_release_kernel(pevent); + perf_events[cpu][event] = NULL; + } +} + +/* + * Called by op_arm_start to create active perf events based on the + * perviously configured attributes. + */ +static int op_perf_start(void) +{ + int cpu, event, ret = 0; + + for_each_online_cpu(cpu) { + for (event = 0; event < perf_num_counters; ++event) { + ret = op_create_counter(cpu, event); + if (ret) + goto out; + } + } + +out: + return ret; +} + +/* + * Called by op_arm_stop@the end of a profiling run. + */ +static void op_perf_stop(void) +{ + int cpu, event; + + for_each_online_cpu(cpu) + for (event = 0; event < perf_num_counters; ++event) + op_destroy_counter(cpu, event); +} + + +static char *op_name_from_perf_id(enum arm_perf_pmu_ids id) +{ + switch (id) { + case ARM_PERF_PMU_ID_XSCALE1: + return "arm/xscale1"; + case ARM_PERF_PMU_ID_XSCALE2: + return "arm/xscale2"; + case ARM_PERF_PMU_ID_V6: + return "arm/armv6"; + case ARM_PERF_PMU_ID_V6MP: + return "arm/mpcore"; + case ARM_PERF_PMU_ID_CA8: + return "arm/armv7"; + case ARM_PERF_PMU_ID_CA9: + return "arm/armv7-ca9"; + default: + return NULL; + } +} static int op_arm_create_files(struct super_block *sb, struct dentry *root) { unsigned int i; - for (i = 0; i < op_arm_model->num_counters; i++) { + for (i = 0; i < perf_num_counters; i++) { struct dentry *dir; char buf[4]; @@ -46,12 +197,10 @@ static int op_arm_create_files(struct super_block *sb, struct dentry *root) static int op_arm_setup(void) { - int ret; - spin_lock(&oprofilefs_lock); - ret = op_arm_model->setup_ctrs(); + op_perf_setup(); spin_unlock(&oprofilefs_lock); - return ret; + return 0; } static int op_arm_start(void) @@ -60,8 +209,9 @@ static int op_arm_start(void) mutex_lock(&op_arm_mutex); if (!op_arm_enabled) { - ret = op_arm_model->start(); - op_arm_enabled = !ret; + ret = 0; + op_perf_start(); + op_arm_enabled = 1; } mutex_unlock(&op_arm_mutex); return ret; @@ -71,7 +221,7 @@ static void op_arm_stop(void) { mutex_lock(&op_arm_mutex); if (op_arm_enabled) - op_arm_model->stop(); + op_perf_stop(); op_arm_enabled = 0; mutex_unlock(&op_arm_mutex); } @@ -81,7 +231,7 @@ static int op_arm_suspend(struct sys_device *dev, pm_message_t state) { mutex_lock(&op_arm_mutex); if (op_arm_enabled) - op_arm_model->stop(); + op_perf_stop(); mutex_unlock(&op_arm_mutex); return 0; } @@ -89,7 +239,7 @@ static int op_arm_suspend(struct sys_device *dev, pm_message_t state) static int op_arm_resume(struct sys_device *dev) { mutex_lock(&op_arm_mutex); - if (op_arm_enabled && op_arm_model->start()) + if (op_arm_enabled && op_perf_start()) op_arm_enabled = 0; mutex_unlock(&op_arm_mutex); return 0; @@ -126,58 +276,137 @@ static void exit_driverfs(void) #define exit_driverfs() do { } while (0) #endif /* CONFIG_PM */ -int __init oprofile_arch_init(struct oprofile_operations *ops) +static int report_trace(struct stackframe *frame, void *d) { - struct op_arm_model_spec *spec = NULL; - int ret = -ENODEV; + unsigned int *depth = d; - ops->backtrace = arm_backtrace; + if (*depth) { + oprofile_add_trace(frame->pc); + (*depth)--; + } -#ifdef CONFIG_CPU_XSCALE - spec = &op_xscale_spec; -#endif + return *depth == 0; +} -#ifdef CONFIG_OPROFILE_ARMV6 - spec = &op_armv6_spec; -#endif +/* + * The registers we're interested in are at the end of the variable + * length saved register structure. The fp points at the end of this + * structure so the address of this struct is: + * (struct frame_tail *)(xxx->fp)-1 + */ +struct frame_tail { + struct frame_tail *fp; + unsigned long sp; + unsigned long lr; +} __attribute__((packed)); -#ifdef CONFIG_OPROFILE_MPCORE - spec = &op_mpcore_spec; -#endif +static struct frame_tail* user_backtrace(struct frame_tail *tail) +{ + struct frame_tail buftail[2]; -#ifdef CONFIG_OPROFILE_ARMV7 - spec = &op_armv7_spec; -#endif + /* Also check accessibility of one struct frame_tail beyond */ + if (!access_ok(VERIFY_READ, tail, sizeof(buftail))) + return NULL; + if (__copy_from_user_inatomic(buftail, tail, sizeof(buftail))) + return NULL; - if (spec) { - ret = spec->init(); - if (ret < 0) - return ret; + oprofile_add_trace(buftail[0].lr); - counter_config = kcalloc(spec->num_counters, sizeof(struct op_counter_config), - GFP_KERNEL); - if (!counter_config) - return -ENOMEM; + /* frame pointers should strictly progress back up the stack + * (towards higher addresses) */ + if (tail >= buftail[0].fp) + return NULL; + + return buftail[0].fp-1; +} + +static void arm_backtrace(struct pt_regs * const regs, unsigned int depth) +{ + struct frame_tail *tail = ((struct frame_tail *) regs->ARM_fp) - 1; + + if (!user_mode(regs)) { + struct stackframe frame; + frame.fp = regs->ARM_fp; + frame.sp = regs->ARM_sp; + frame.lr = regs->ARM_lr; + frame.pc = regs->ARM_pc; + walk_stackframe(&frame, report_trace, &depth); + return; + } + + while (depth-- && tail && !((unsigned long) tail & 3)) + tail = user_backtrace(tail); +} + +int __init oprofile_arch_init(struct oprofile_operations *ops) +{ + int cpu, ret = 0; + + perf_num_counters = armpmu_get_max_events(); + + counter_config = kcalloc(perf_num_counters, + sizeof(struct op_counter_config), GFP_KERNEL); - op_arm_model = spec; - init_driverfs(); - ops->create_files = op_arm_create_files; - ops->setup = op_arm_setup; - ops->shutdown = op_arm_stop; - ops->start = op_arm_start; - ops->stop = op_arm_stop; - ops->cpu_type = op_arm_model->name; - printk(KERN_INFO "oprofile: using %s\n", spec->name); + if (!counter_config) { + pr_info("oprofile: failed to allocate %d " + "counters\n", perf_num_counters); + return -ENOMEM; } + for_each_possible_cpu(cpu) { + perf_events[cpu] = kcalloc(perf_num_counters, + sizeof(struct perf_event *), GFP_KERNEL); + if (!perf_events[cpu]) { + pr_info("oprofile: failed to allocate %d perf events " + "for cpu %d\n", perf_num_counters, cpu); + while (--cpu >= 0) + kfree(perf_events[cpu]); + return -ENOMEM; + } + } + + init_driverfs(); + ops->backtrace = arm_backtrace; + ops->create_files = op_arm_create_files; + ops->setup = op_arm_setup; + ops->start = op_arm_start; + ops->stop = op_arm_stop; + ops->shutdown = op_arm_stop; + ops->cpu_type = op_name_from_perf_id(armpmu_get_pmu_id()); + + if (!ops->cpu_type) + ret = -ENODEV; + else + pr_info("oprofile: using %s\n", ops->cpu_type); + return ret; } void oprofile_arch_exit(void) { - if (op_arm_model) { + int cpu, id; + struct perf_event *event; + + if (*perf_events) { exit_driverfs(); - op_arm_model = NULL; + for_each_possible_cpu(cpu) { + for (id = 0; id < perf_num_counters; ++id) { + event = perf_events[cpu][id]; + if (event != NULL) + perf_event_release_kernel(event); + } + kfree(perf_events[cpu]); + } } - kfree(counter_config); + + if (counter_config) + kfree(counter_config); +} +#else +int __init oprofile_arch_init(struct oprofile_operations *ops) +{ + pr_info("oprofile: hardware counters not available\n"); + return -ENODEV; } +void oprofile_arch_exit(void) {} +#endif /* CONFIG_HW_PERF_EVENTS */ -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v3]
@ 2010-04-12 15:04 Will Deacon
2010-04-12 15:04 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2010-04-12 15:04 UTC (permalink / raw)
To: linux-arm-kernel
This is version 3 of the patchset originally posted here:
V1: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-February/010476.html
V2: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-March/011126.html
Changes since V2:
- Rebased onto 2.6.34-rc3
- No longer export perf_event_{enable,disable} functions as this
stopped working with 2.6.34-rc1. Instead create/destroy counters
between profiling runs. This also prevents OProfile from hogging
the PMU lock.
- Merged in patch from Aaro Koskinen to use platform_device
instead of sys_device for OProfile suspend.
- Fixed XScale PMU interrupt handlers to call perf_sample_data_init.
I've tested this on a quad-core Realview PB11MPCore board and compared the
OProfile results with perf-events. The only scope for confusion is that, by
default, OProfile organises samples by symbol name and so counts may appear
significantly less then those reported by perf stat [which combines the
per-symbol counts into a single overall count].
This patch is intended to be applied on top of 604{4-9}/1 in RMK's incoming
patch queue.
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Jean Pihet <jpihet@mvista.com>
Will Deacon (6):
ARM: perf-events: use numeric ID to identify PMU
ARM: perf-events: add support for xscale PMUs
perf-events: allow modules to query the maximum number of perf events
ARM: oprofile: use perf-events framework as backend
ARM: oprofile: remove old files and update KConfig
ARM: oprofile: convert from sysdev to platform device
arch/arm/Kconfig | 26 +-
arch/arm/include/asm/perf_event.h | 14 +
arch/arm/kernel/perf_event.c | 864 ++++++++++++++++++++++++++++++-
arch/arm/oprofile/Makefile | 7 +-
arch/arm/oprofile/backtrace.c | 83 ---
arch/arm/oprofile/common.c | 375 +++++++++++---
arch/arm/oprofile/op_arm_model.h | 35 --
arch/arm/oprofile/op_counter.h | 27 -
arch/arm/oprofile/op_model_arm11_core.c | 162 ------
arch/arm/oprofile/op_model_arm11_core.h | 45 --
arch/arm/oprofile/op_model_mpcore.c | 306 -----------
arch/arm/oprofile/op_model_mpcore.h | 61 ---
arch/arm/oprofile/op_model_v6.c | 78 ---
arch/arm/oprofile/op_model_v7.c | 415 ---------------
arch/arm/oprofile/op_model_v7.h | 103 ----
arch/arm/oprofile/op_model_xscale.c | 444 ----------------
include/linux/perf_event.h | 1 +
kernel/perf_event.c | 6 +
18 files changed, 1182 insertions(+), 1870 deletions(-)
delete mode 100644 arch/arm/oprofile/backtrace.c
delete mode 100644 arch/arm/oprofile/op_arm_model.h
delete mode 100644 arch/arm/oprofile/op_counter.h
delete mode 100644 arch/arm/oprofile/op_model_arm11_core.c
delete mode 100644 arch/arm/oprofile/op_model_arm11_core.h
delete mode 100644 arch/arm/oprofile/op_model_mpcore.c
delete mode 100644 arch/arm/oprofile/op_model_mpcore.h
delete mode 100644 arch/arm/oprofile/op_model_v6.c
delete mode 100644 arch/arm/oprofile/op_model_v7.c
delete mode 100644 arch/arm/oprofile/op_model_v7.h
delete mode 100644 arch/arm/oprofile/op_model_xscale.c
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU 2010-04-12 15:04 [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v3] Will Deacon @ 2010-04-12 15:04 ` Will Deacon 2010-04-12 15:04 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-12 15:04 UTC (permalink / raw) To: linux-arm-kernel The ARM perf-events framework provides support for a number of different PMUs using struct arm_pmu. The char *name field of this struct can be used to identify the PMU, but this is cumbersome if used outside of perf. This patch replaces the name string for a PMU with an enum, which holds a unique ID for the PMU being represented. This ID can be used to index an array of names within perf, so no functionality is lost. The presence of the ID field, allows other kernel subsystems [currently oprofile] to use their own mappings for the PMU name. Cc: Jean Pihet <jpihet@mvista.com> Acked-by: Jamie Iles <jamie.iles@picochip.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/include/asm/perf_event.h | 14 +++++++++++++ arch/arm/kernel/perf_event.c | 39 +++++++++++++++++++++++++++--------- 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/arch/arm/include/asm/perf_event.h b/arch/arm/include/asm/perf_event.h index 49e3049..fa4b326 100644 --- a/arch/arm/include/asm/perf_event.h +++ b/arch/arm/include/asm/perf_event.h @@ -28,4 +28,18 @@ set_perf_event_pending(void) * same indexes here for consistency. */ #define PERF_EVENT_INDEX_OFFSET 1 +/* ARM perf PMU IDs for use by internal perf clients. */ +enum arm_perf_pmu_ids { + ARM_PERF_PMU_ID_XSCALE1 = 0, + ARM_PERF_PMU_ID_XSCALE2, + ARM_PERF_PMU_ID_V6, + ARM_PERF_PMU_ID_V6MP, + ARM_PERF_PMU_ID_CA8, + ARM_PERF_PMU_ID_CA9, + ARM_NUM_PMU_IDS, +}; + +extern enum arm_perf_pmu_ids +armpmu_get_pmu_id(void); + #endif /* __ARM_PERF_EVENT_H__ */ diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 89a77fc..10a0bcd 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -16,6 +16,7 @@ #include <linux/interrupt.h> #include <linux/kernel.h> +#include <linux/module.h> #include <linux/perf_event.h> #include <linux/platform_device.h> #include <linux/spinlock.h> @@ -68,8 +69,18 @@ struct cpu_hw_events { }; DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events); +/* PMU names. */ +static const char *arm_pmu_names[] = { + [ARM_PERF_PMU_ID_XSCALE1] = "xscale1", + [ARM_PERF_PMU_ID_XSCALE2] = "xscale2", + [ARM_PERF_PMU_ID_V6] = "v6", + [ARM_PERF_PMU_ID_V6MP] = "v6mpcore", + [ARM_PERF_PMU_ID_CA8] = "ARMv7 Cortex-A8", + [ARM_PERF_PMU_ID_CA9] = "ARMv7 Cortex-A9", +}; + struct arm_pmu { - char *name; + enum arm_perf_pmu_ids id; irqreturn_t (*handle_irq)(int irq_num, void *dev); void (*enable)(struct hw_perf_event *evt, int idx); void (*disable)(struct hw_perf_event *evt, int idx); @@ -88,6 +99,18 @@ struct arm_pmu { /* Set at runtime when we know what CPU type we are. */ static const struct arm_pmu *armpmu; +enum arm_perf_pmu_ids +armpmu_get_pmu_id(void) +{ + int id = -ENODEV; + + if (armpmu != NULL) + id = armpmu->id; + + return id; +} +EXPORT_SYMBOL_GPL(armpmu_get_pmu_id); + #define HW_OP_UNSUPPORTED 0xFFFF #define C(_x) \ @@ -1154,7 +1177,7 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, } static const struct arm_pmu armv6pmu = { - .name = "v6", + .id = ARM_PERF_PMU_ID_V6, .handle_irq = armv6pmu_handle_irq, .enable = armv6pmu_enable_event, .disable = armv6pmu_disable_event, @@ -1177,7 +1200,7 @@ static const struct arm_pmu armv6pmu = { * reset the period and enable the interrupt reporting. */ static const struct arm_pmu armv6mpcore_pmu = { - .name = "v6mpcore", + .id = ARM_PERF_PMU_ID_V6MP, .handle_irq = armv6pmu_handle_irq, .enable = armv6pmu_enable_event, .disable = armv6mpcore_pmu_disable_event, @@ -1207,10 +1230,6 @@ static const struct arm_pmu armv6mpcore_pmu = { * counter and all 4 performance counters together can be reset separately. */ -#define ARMV7_PMU_CORTEX_A8_NAME "ARMv7 Cortex-A8" - -#define ARMV7_PMU_CORTEX_A9_NAME "ARMv7 Cortex-A9" - /* Common ARMv7 event types */ enum armv7_perf_types { ARMV7_PERFCTR_PMNC_SW_INCR = 0x00, @@ -2115,7 +2134,7 @@ init_hw_perf_events(void) perf_max_events = armv6mpcore_pmu.num_events; break; case 0xC080: /* Cortex-A8 */ - armv7pmu.name = ARMV7_PMU_CORTEX_A8_NAME; + armv7pmu.id = ARM_PERF_PMU_ID_CA8; memcpy(armpmu_perf_cache_map, armv7_a8_perf_cache_map, sizeof(armv7_a8_perf_cache_map)); armv7pmu.event_map = armv7_a8_pmu_event_map; @@ -2127,7 +2146,7 @@ init_hw_perf_events(void) perf_max_events = armv7pmu.num_events; break; case 0xC090: /* Cortex-A9 */ - armv7pmu.name = ARMV7_PMU_CORTEX_A9_NAME; + armv7pmu.id = ARM_PERF_PMU_ID_CA9; memcpy(armpmu_perf_cache_map, armv7_a9_perf_cache_map, sizeof(armv7_a9_perf_cache_map)); armv7pmu.event_map = armv7_a9_pmu_event_map; @@ -2146,7 +2165,7 @@ init_hw_perf_events(void) if (armpmu) pr_info("enabled with %s PMU driver, %d counters available\n", - armpmu->name, armpmu->num_events); + arm_pmu_names[armpmu->id], armpmu->num_events); return 0; } -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/6] ARM: perf-events: add support for xscale PMUs 2010-04-12 15:04 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon @ 2010-04-12 15:04 ` Will Deacon 2010-04-12 15:04 ` [PATCH 3/6] perf-events: allow modules to query the maximum number of perf events Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-12 15:04 UTC (permalink / raw) To: linux-arm-kernel The perf-events framework for ARM only supports v6 and v7 cores. This patch adds support for xscale v1 and v2 PMUs to perf, based on the OProfile drivers in arch/arm/oprofile/op_model_xscale.c Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/kernel/perf_event.c | 827 +++++++++++++++++++++++++++++++++++++++++- 1 files changed, 821 insertions(+), 6 deletions(-) diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 10a0bcd..381f121 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -2108,6 +2108,803 @@ static u32 __init armv7_reset_read_pmnc(void) return nb_cnt + 1; } +/* + * ARMv5 [xscale] Performance counter handling code. + * + * Based on xscale OProfile code. + * + * There are two variants of the xscale PMU that we support: + * - xscale1pmu: 2 event counters and a cycle counter + * - xscale2pmu: 4 event counters and a cycle counter + * The two variants share event definitions, but have different + * PMU structures. + */ + +enum xscale_perf_types { + XSCALE_PERFCTR_ICACHE_MISS = 0x00, + XSCALE_PERFCTR_ICACHE_NO_DELIVER = 0x01, + XSCALE_PERFCTR_DATA_STALL = 0x02, + XSCALE_PERFCTR_ITLB_MISS = 0x03, + XSCALE_PERFCTR_DTLB_MISS = 0x04, + XSCALE_PERFCTR_BRANCH = 0x05, + XSCALE_PERFCTR_BRANCH_MISS = 0x06, + XSCALE_PERFCTR_INSTRUCTION = 0x07, + XSCALE_PERFCTR_DCACHE_FULL_STALL = 0x08, + XSCALE_PERFCTR_DCACHE_FULL_STALL_CONTIG = 0x09, + XSCALE_PERFCTR_DCACHE_ACCESS = 0x0A, + XSCALE_PERFCTR_DCACHE_MISS = 0x0B, + XSCALE_PERFCTR_DCACHE_WRITE_BACK = 0x0C, + XSCALE_PERFCTR_PC_CHANGED = 0x0D, + XSCALE_PERFCTR_BCU_REQUEST = 0x10, + XSCALE_PERFCTR_BCU_FULL = 0x11, + XSCALE_PERFCTR_BCU_DRAIN = 0x12, + XSCALE_PERFCTR_BCU_ECC_NO_ELOG = 0x14, + XSCALE_PERFCTR_BCU_1_BIT_ERR = 0x15, + XSCALE_PERFCTR_RMW = 0x16, + /* XSCALE_PERFCTR_CCNT is not hardware defined */ + XSCALE_PERFCTR_CCNT = 0xFE, + XSCALE_PERFCTR_UNUSED = 0xFF, +}; + +enum xscale_counters { + XSCALE_CYCLE_COUNTER = 1, + XSCALE_COUNTER0, + XSCALE_COUNTER1, + XSCALE_COUNTER2, + XSCALE_COUNTER3, +}; + +static const unsigned xscale_perf_map[PERF_COUNT_HW_MAX] = { + [PERF_COUNT_HW_CPU_CYCLES] = XSCALE_PERFCTR_CCNT, + [PERF_COUNT_HW_INSTRUCTIONS] = XSCALE_PERFCTR_INSTRUCTION, + [PERF_COUNT_HW_CACHE_REFERENCES] = HW_OP_UNSUPPORTED, + [PERF_COUNT_HW_CACHE_MISSES] = HW_OP_UNSUPPORTED, + [PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = XSCALE_PERFCTR_BRANCH, + [PERF_COUNT_HW_BRANCH_MISSES] = XSCALE_PERFCTR_BRANCH_MISS, + [PERF_COUNT_HW_BUS_CYCLES] = HW_OP_UNSUPPORTED, +}; + +static const unsigned xscale_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { + [C(L1D)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = XSCALE_PERFCTR_DCACHE_ACCESS, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DCACHE_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = XSCALE_PERFCTR_DCACHE_ACCESS, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DCACHE_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(L1I)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ICACHE_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ICACHE_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(LL)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(DTLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DTLB_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_DTLB_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(ITLB)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ITLB_MISS, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = XSCALE_PERFCTR_ITLB_MISS, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, + [C(BPU)] = { + [C(OP_READ)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_WRITE)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + [C(OP_PREFETCH)] = { + [C(RESULT_ACCESS)] = CACHE_OP_UNSUPPORTED, + [C(RESULT_MISS)] = CACHE_OP_UNSUPPORTED, + }, + }, +}; + +#define XSCALE_PMU_ENABLE 0x001 +#define XSCALE_PMN_RESET 0x002 +#define XSCALE_CCNT_RESET 0x004 +#define XSCALE_PMU_RESET (CCNT_RESET | PMN_RESET) +#define XSCALE_PMU_CNT64 0x008 + +static inline int +xscalepmu_event_map(int config) +{ + int mapping = xscale_perf_map[config]; + if (HW_OP_UNSUPPORTED == mapping) + mapping = -EOPNOTSUPP; + return mapping; +} + +static u64 +xscalepmu_raw_event(u64 config) +{ + return config & 0xff; +} + +#define XSCALE1_OVERFLOWED_MASK 0x700 +#define XSCALE1_CCOUNT_OVERFLOW 0x400 +#define XSCALE1_COUNT0_OVERFLOW 0x100 +#define XSCALE1_COUNT1_OVERFLOW 0x200 +#define XSCALE1_CCOUNT_INT_EN 0x040 +#define XSCALE1_COUNT0_INT_EN 0x010 +#define XSCALE1_COUNT1_INT_EN 0x020 +#define XSCALE1_COUNT0_EVT_SHFT 12 +#define XSCALE1_COUNT0_EVT_MASK (0xff << XSCALE1_COUNT0_EVT_SHFT) +#define XSCALE1_COUNT1_EVT_SHFT 20 +#define XSCALE1_COUNT1_EVT_MASK (0xff << XSCALE1_COUNT1_EVT_SHFT) + +static inline u32 +xscale1pmu_read_pmnc(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c0, c0, 0" : "=r" (val)); + return val; +} + +static inline void +xscale1pmu_write_pmnc(u32 val) +{ + /* upper 4bits and 7, 11 are write-as-0 */ + val &= 0xffff77f; + asm volatile("mcr p14, 0, %0, c0, c0, 0" : : "r" (val)); +} + +static inline int +xscale1_pmnc_counter_has_overflowed(unsigned long pmnc, + enum xscale_counters counter) +{ + int ret = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + ret = pmnc & XSCALE1_CCOUNT_OVERFLOW; + break; + case XSCALE_COUNTER0: + ret = pmnc & XSCALE1_COUNT0_OVERFLOW; + break; + case XSCALE_COUNTER1: + ret = pmnc & XSCALE1_COUNT1_OVERFLOW; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", counter); + } + + return ret; +} + +static irqreturn_t +xscale1pmu_handle_irq(int irq_num, void *dev) +{ + unsigned long pmnc; + struct perf_sample_data data; + struct cpu_hw_events *cpuc; + struct pt_regs *regs; + int idx; + + /* + * NOTE: there's an A stepping erratum that states if an overflow + * bit already exists and another occurs, the previous + * Overflow bit gets cleared. There's no workaround. + * Fixed in B stepping or later. + */ + pmnc = xscale1pmu_read_pmnc(); + + /* + * Write the value back to clear the overflow flags. Overflow + * flags remain in pmnc for use below. We also disable the PMU + * while we process the interrupt. + */ + xscale1pmu_write_pmnc(pmnc & ~XSCALE_PMU_ENABLE); + + if (!(pmnc & XSCALE1_OVERFLOWED_MASK)) + return IRQ_NONE; + + regs = get_irq_regs(); + + perf_sample_data_init(&data, 0); + + cpuc = &__get_cpu_var(cpu_hw_events); + for (idx = 0; idx <= armpmu->num_events; ++idx) { + struct perf_event *event = cpuc->events[idx]; + struct hw_perf_event *hwc; + + if (!test_bit(idx, cpuc->active_mask)) + continue; + + if (!xscale1_pmnc_counter_has_overflowed(pmnc, idx)) + continue; + + hwc = &event->hw; + armpmu_event_update(event, hwc, idx); + data.period = event->hw.last_period; + if (!armpmu_event_set_period(event, hwc, idx)) + continue; + + if (perf_event_overflow(event, 0, &data, regs)) + armpmu->disable(hwc, idx); + } + + perf_event_do_pending(); + + /* + * Re-enable the PMU. + */ + pmnc = xscale1pmu_read_pmnc() | XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(pmnc); + + return IRQ_HANDLED; +} + +static void +xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long val, mask, evt, flags; + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + mask = 0; + evt = XSCALE1_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + mask = XSCALE1_COUNT0_EVT_MASK; + evt = (hwc->config_base << XSCALE1_COUNT0_EVT_SHFT) | + XSCALE1_COUNT0_INT_EN; + break; + case XSCALE_COUNTER1: + mask = XSCALE1_COUNT1_EVT_MASK; + evt = (hwc->config_base << XSCALE1_COUNT1_EVT_SHFT) | + XSCALE1_COUNT1_INT_EN; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~mask; + val |= evt; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long val, mask, evt, flags; + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + mask = XSCALE1_CCOUNT_INT_EN; + evt = 0; + break; + case XSCALE_COUNTER0: + mask = XSCALE1_COUNT0_INT_EN | XSCALE1_COUNT0_EVT_MASK; + evt = XSCALE_PERFCTR_UNUSED << XSCALE1_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + mask = XSCALE1_COUNT1_INT_EN | XSCALE1_COUNT1_EVT_MASK; + evt = XSCALE_PERFCTR_UNUSED << XSCALE1_COUNT1_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~mask; + val |= evt; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static int +xscale1pmu_get_event_idx(struct cpu_hw_events *cpuc, + struct hw_perf_event *event) +{ + if (XSCALE_PERFCTR_CCNT == event->config_base) { + if (test_and_set_bit(XSCALE_CYCLE_COUNTER, cpuc->used_mask)) + return -EAGAIN; + + return XSCALE_CYCLE_COUNTER; + } else { + if (!test_and_set_bit(XSCALE_COUNTER1, cpuc->used_mask)) { + return XSCALE_COUNTER1; + } + + if (!test_and_set_bit(XSCALE_COUNTER0, cpuc->used_mask)) { + return XSCALE_COUNTER0; + } + + return -EAGAIN; + } +} + +static void +xscale1pmu_start(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val |= XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale1pmu_stop(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale1pmu_read_pmnc(); + val &= ~XSCALE_PMU_ENABLE; + xscale1pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static inline u32 +xscale1pmu_read_counter(int counter) +{ + u32 val = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mrc p14, 0, %0, c1, c0, 0" : "=r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mrc p14, 0, %0, c2, c0, 0" : "=r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mrc p14, 0, %0, c3, c0, 0" : "=r" (val)); + break; + } + + return val; +} + +static inline void +xscale1pmu_write_counter(int counter, u32 val) +{ + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mcr p14, 0, %0, c1, c0, 0" : : "r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mcr p14, 0, %0, c2, c0, 0" : : "r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mcr p14, 0, %0, c3, c0, 0" : : "r" (val)); + break; + } +} + +static const struct arm_pmu xscale1pmu = { + .id = ARM_PERF_PMU_ID_XSCALE1, + .handle_irq = xscale1pmu_handle_irq, + .enable = xscale1pmu_enable_event, + .disable = xscale1pmu_disable_event, + .event_map = xscalepmu_event_map, + .raw_event = xscalepmu_raw_event, + .read_counter = xscale1pmu_read_counter, + .write_counter = xscale1pmu_write_counter, + .get_event_idx = xscale1pmu_get_event_idx, + .start = xscale1pmu_start, + .stop = xscale1pmu_stop, + .num_events = 3, + .max_period = (1LLU << 32) - 1, +}; + +#define XSCALE2_OVERFLOWED_MASK 0x01f +#define XSCALE2_CCOUNT_OVERFLOW 0x001 +#define XSCALE2_COUNT0_OVERFLOW 0x002 +#define XSCALE2_COUNT1_OVERFLOW 0x004 +#define XSCALE2_COUNT2_OVERFLOW 0x008 +#define XSCALE2_COUNT3_OVERFLOW 0x010 +#define XSCALE2_CCOUNT_INT_EN 0x001 +#define XSCALE2_COUNT0_INT_EN 0x002 +#define XSCALE2_COUNT1_INT_EN 0x004 +#define XSCALE2_COUNT2_INT_EN 0x008 +#define XSCALE2_COUNT3_INT_EN 0x010 +#define XSCALE2_COUNT0_EVT_SHFT 0 +#define XSCALE2_COUNT0_EVT_MASK (0xff << XSCALE2_COUNT0_EVT_SHFT) +#define XSCALE2_COUNT1_EVT_SHFT 8 +#define XSCALE2_COUNT1_EVT_MASK (0xff << XSCALE2_COUNT1_EVT_SHFT) +#define XSCALE2_COUNT2_EVT_SHFT 16 +#define XSCALE2_COUNT2_EVT_MASK (0xff << XSCALE2_COUNT2_EVT_SHFT) +#define XSCALE2_COUNT3_EVT_SHFT 24 +#define XSCALE2_COUNT3_EVT_MASK (0xff << XSCALE2_COUNT3_EVT_SHFT) + +static inline u32 +xscale2pmu_read_pmnc(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c0, c1, 0" : "=r" (val)); + /* bits 1-2 and 4-23 are read-unpredictable */ + return val & 0xff000009; +} + +static inline void +xscale2pmu_write_pmnc(u32 val) +{ + /* bits 4-23 are write-as-0, 24-31 are write ignored */ + val &= 0xf; + asm volatile("mcr p14, 0, %0, c0, c1, 0" : : "r" (val)); +} + +static inline u32 +xscale2pmu_read_overflow_flags(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c5, c1, 0" : "=r" (val)); + return val; +} + +static inline void +xscale2pmu_write_overflow_flags(u32 val) +{ + asm volatile("mcr p14, 0, %0, c5, c1, 0" : : "r" (val)); +} + +static inline u32 +xscale2pmu_read_event_select(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c8, c1, 0" : "=r" (val)); + return val; +} + +static inline void +xscale2pmu_write_event_select(u32 val) +{ + asm volatile("mcr p14, 0, %0, c8, c1, 0" : : "r"(val)); +} + +static inline u32 +xscale2pmu_read_int_enable(void) +{ + u32 val; + asm volatile("mrc p14, 0, %0, c4, c1, 0" : "=r" (val)); + return val; +} + +static void +xscale2pmu_write_int_enable(u32 val) +{ + asm volatile("mcr p14, 0, %0, c4, c1, 0" : : "r" (val)); +} + +static inline int +xscale2_pmnc_counter_has_overflowed(unsigned long of_flags, + enum xscale_counters counter) +{ + int ret = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + ret = of_flags & XSCALE2_CCOUNT_OVERFLOW; + break; + case XSCALE_COUNTER0: + ret = of_flags & XSCALE2_COUNT0_OVERFLOW; + break; + case XSCALE_COUNTER1: + ret = of_flags & XSCALE2_COUNT1_OVERFLOW; + break; + case XSCALE_COUNTER2: + ret = of_flags & XSCALE2_COUNT2_OVERFLOW; + break; + case XSCALE_COUNTER3: + ret = of_flags & XSCALE2_COUNT3_OVERFLOW; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", counter); + } + + return ret; +} + +static irqreturn_t +xscale2pmu_handle_irq(int irq_num, void *dev) +{ + unsigned long pmnc, of_flags; + struct perf_sample_data data; + struct cpu_hw_events *cpuc; + struct pt_regs *regs; + int idx; + + /* Disable the PMU. */ + pmnc = xscale2pmu_read_pmnc(); + xscale2pmu_write_pmnc(pmnc & ~XSCALE_PMU_ENABLE); + + /* Check the overflow flag register. */ + of_flags = xscale2pmu_read_overflow_flags(); + if (!(of_flags & XSCALE2_OVERFLOWED_MASK)) + return IRQ_NONE; + + /* Clear the overflow bits. */ + xscale2pmu_write_overflow_flags(of_flags); + + regs = get_irq_regs(); + + perf_sample_data_init(&data, 0); + + cpuc = &__get_cpu_var(cpu_hw_events); + for (idx = 0; idx <= armpmu->num_events; ++idx) { + struct perf_event *event = cpuc->events[idx]; + struct hw_perf_event *hwc; + + if (!test_bit(idx, cpuc->active_mask)) + continue; + + if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx)) + continue; + + hwc = &event->hw; + armpmu_event_update(event, hwc, idx); + data.period = event->hw.last_period; + if (!armpmu_event_set_period(event, hwc, idx)) + continue; + + if (perf_event_overflow(event, 0, &data, regs)) + armpmu->disable(hwc, idx); + } + + perf_event_do_pending(); + + /* + * Re-enable the PMU. + */ + pmnc = xscale2pmu_read_pmnc() | XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(pmnc); + + return IRQ_HANDLED; +} + +static void +xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long flags, ien, evtsel; + + ien = xscale2pmu_read_int_enable(); + evtsel = xscale2pmu_read_event_select(); + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + ien |= XSCALE2_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + ien |= XSCALE2_COUNT0_INT_EN; + evtsel &= ~XSCALE2_COUNT0_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + ien |= XSCALE2_COUNT1_INT_EN; + evtsel &= ~XSCALE2_COUNT1_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT1_EVT_SHFT; + break; + case XSCALE_COUNTER2: + ien |= XSCALE2_COUNT2_INT_EN; + evtsel &= ~XSCALE2_COUNT2_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT2_EVT_SHFT; + break; + case XSCALE_COUNTER3: + ien |= XSCALE2_COUNT3_INT_EN; + evtsel &= ~XSCALE2_COUNT3_EVT_MASK; + evtsel |= hwc->config_base << XSCALE2_COUNT3_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + xscale2pmu_write_event_select(evtsel); + xscale2pmu_write_int_enable(ien); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) +{ + unsigned long flags, ien, evtsel; + + ien = xscale2pmu_read_int_enable(); + evtsel = xscale2pmu_read_event_select(); + + switch (idx) { + case XSCALE_CYCLE_COUNTER: + ien &= ~XSCALE2_CCOUNT_INT_EN; + break; + case XSCALE_COUNTER0: + ien &= ~XSCALE2_COUNT0_INT_EN; + evtsel &= ~XSCALE2_COUNT0_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT0_EVT_SHFT; + break; + case XSCALE_COUNTER1: + ien &= ~XSCALE2_COUNT1_INT_EN; + evtsel &= ~XSCALE2_COUNT1_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT1_EVT_SHFT; + break; + case XSCALE_COUNTER2: + ien &= ~XSCALE2_COUNT2_INT_EN; + evtsel &= ~XSCALE2_COUNT2_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT2_EVT_SHFT; + break; + case XSCALE_COUNTER3: + ien &= ~XSCALE2_COUNT3_INT_EN; + evtsel &= ~XSCALE2_COUNT3_EVT_MASK; + evtsel |= XSCALE_PERFCTR_UNUSED << XSCALE2_COUNT3_EVT_SHFT; + break; + default: + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return; + } + + spin_lock_irqsave(&pmu_lock, flags); + xscale2pmu_write_event_select(evtsel); + xscale2pmu_write_int_enable(ien); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static int +xscale2pmu_get_event_idx(struct cpu_hw_events *cpuc, + struct hw_perf_event *event) +{ + int idx = xscale1pmu_get_event_idx(cpuc, event); + if (idx >= 0) + goto out; + + if (!test_and_set_bit(XSCALE_COUNTER3, cpuc->used_mask)) + idx = XSCALE_COUNTER3; + else if (!test_and_set_bit(XSCALE_COUNTER2, cpuc->used_mask)) + idx = XSCALE_COUNTER2; +out: + return idx; +} + +static void +xscale2pmu_start(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; + val |= XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static void +xscale2pmu_stop(void) +{ + unsigned long flags, val; + + spin_lock_irqsave(&pmu_lock, flags); + val = xscale2pmu_read_pmnc(); + val &= ~XSCALE_PMU_ENABLE; + xscale2pmu_write_pmnc(val); + spin_unlock_irqrestore(&pmu_lock, flags); +} + +static inline u32 +xscale2pmu_read_counter(int counter) +{ + u32 val = 0; + + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mrc p14, 0, %0, c1, c1, 0" : "=r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mrc p14, 0, %0, c0, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mrc p14, 0, %0, c1, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER2: + asm volatile("mrc p14, 0, %0, c2, c2, 0" : "=r" (val)); + break; + case XSCALE_COUNTER3: + asm volatile("mrc p14, 0, %0, c3, c2, 0" : "=r" (val)); + break; + } + + return val; +} + +static inline void +xscale2pmu_write_counter(int counter, u32 val) +{ + switch (counter) { + case XSCALE_CYCLE_COUNTER: + asm volatile("mcr p14, 0, %0, c1, c1, 0" : : "r" (val)); + break; + case XSCALE_COUNTER0: + asm volatile("mcr p14, 0, %0, c0, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER1: + asm volatile("mcr p14, 0, %0, c1, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER2: + asm volatile("mcr p14, 0, %0, c2, c2, 0" : : "r" (val)); + break; + case XSCALE_COUNTER3: + asm volatile("mcr p14, 0, %0, c3, c2, 0" : : "r" (val)); + break; + } +} + +static const struct arm_pmu xscale2pmu = { + .id = ARM_PERF_PMU_ID_XSCALE2, + .handle_irq = xscale2pmu_handle_irq, + .enable = xscale2pmu_enable_event, + .disable = xscale2pmu_disable_event, + .event_map = xscalepmu_event_map, + .raw_event = xscalepmu_raw_event, + .read_counter = xscale2pmu_read_counter, + .write_counter = xscale2pmu_write_counter, + .get_event_idx = xscale2pmu_get_event_idx, + .start = xscale2pmu_start, + .stop = xscale2pmu_stop, + .num_events = 5, + .max_period = (1LLU << 32) - 1, +}; + static int __init init_hw_perf_events(void) { @@ -2115,7 +2912,7 @@ init_hw_perf_events(void) unsigned long implementor = (cpuid & 0xFF000000) >> 24; unsigned long part_number = (cpuid & 0xFFF0); - /* We only support ARM CPUs implemented by ARM at the moment. */ + /* ARM Ltd CPUs. */ if (0x41 == implementor) { switch (part_number) { case 0xB360: /* ARM1136 */ @@ -2157,15 +2954,33 @@ init_hw_perf_events(void) armv7pmu.num_events = armv7_reset_read_pmnc(); perf_max_events = armv7pmu.num_events; break; - default: - pr_info("no hardware support available\n"); - perf_max_events = -1; + } + /* Intel CPUs [xscale]. */ + } else if (0x69 == implementor) { + part_number = (cpuid >> 13) & 0x7; + switch (part_number) { + case 1: + armpmu = &xscale1pmu; + memcpy(armpmu_perf_cache_map, xscale_perf_cache_map, + sizeof(xscale_perf_cache_map)); + perf_max_events = xscale1pmu.num_events; + break; + case 2: + armpmu = &xscale2pmu; + memcpy(armpmu_perf_cache_map, xscale_perf_cache_map, + sizeof(xscale_perf_cache_map)); + perf_max_events = xscale2pmu.num_events; + break; } } - if (armpmu) + if (armpmu) { pr_info("enabled with %s PMU driver, %d counters available\n", - arm_pmu_names[armpmu->id], armpmu->num_events); + arm_pmu_names[armpmu->id], armpmu->num_events); + } else { + pr_info("no hardware support available\n"); + perf_max_events = -1; + } return 0; } -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 3/6] perf-events: allow modules to query the maximum number of perf events 2010-04-12 15:04 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon @ 2010-04-12 15:04 ` Will Deacon 2010-04-12 15:04 ` [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Will Deacon 0 siblings, 1 reply; 4+ messages in thread From: Will Deacon @ 2010-04-12 15:04 UTC (permalink / raw) To: linux-arm-kernel An upper bound on the number of perf events is stored in the static perf_max_events variable. Modules, such as OProfile, can use this information to allocate and traverse data structures representing the performance counters for a system. This patch adds a function to return the value of perf_max_events and exports this function to modules. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> --- include/linux/perf_event.h | 1 + kernel/perf_event.c | 6 ++++++ 2 files changed, 7 insertions(+), 0 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 9547703..a197bc5 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -751,6 +751,7 @@ struct perf_output_handle { * Set by architecture code: */ extern int perf_max_events; +extern int perf_get_max_events(void); extern const struct pmu *hw_perf_event_init(struct perf_event *event); diff --git a/kernel/perf_event.c b/kernel/perf_event.c index 574ee58..464bda6 100644 --- a/kernel/perf_event.c +++ b/kernel/perf_event.c @@ -39,6 +39,12 @@ static DEFINE_PER_CPU(struct perf_cpu_context, perf_cpu_context); int perf_max_events __read_mostly = 1; +int perf_get_max_events(void) +{ + return perf_max_events; +} +EXPORT_SYMBOL_GPL(perf_get_max_events); + static int perf_reserved_percpu __read_mostly; static int perf_overcommit __read_mostly = 1; -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 4/6] ARM: oprofile: use perf-events framework as backend 2010-04-12 15:04 ` [PATCH 3/6] perf-events: allow modules to query the maximum number of perf events Will Deacon @ 2010-04-12 15:04 ` Will Deacon 0 siblings, 0 replies; 4+ messages in thread From: Will Deacon @ 2010-04-12 15:04 UTC (permalink / raw) To: linux-arm-kernel There are currently two hardware performance monitoring subsystems in the kernel for ARM: OProfile and perf-events. This creates the following problems: 1.) Duplicate PMU accessor code. Inevitable code drift may lead to bugs in one framework that are fixed in the other. 2.) Locking issues. OProfile doesn't reprogram hardware counters between profiling runs if the events to be monitored have not been changed. This means that other profiling frameworks cannot use the counters if OProfile is in use. 3.) Due to differences in the two frameworks, it may not be possible to compare the results obtained by OProfile with those obtained by perf. This patch removes the OProfile PMU driver code and replaces it with calls to perf, therefore solving the issues mentioned above. The only userspace-visible change is the lack of SCU counter support for 11MPCore. This is currently unsupported by OProfile userspace tools anyway and therefore shouldn't cause any problems. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Jamie Iles <jamie.iles@picochip.com> Cc: Jean Pihet <jpihet@mvista.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- arch/arm/oprofile/common.c | 333 +++++++++++++++++++++++++++++++++++++------- 1 files changed, 281 insertions(+), 52 deletions(-) diff --git a/arch/arm/oprofile/common.c b/arch/arm/oprofile/common.c index 3fcd752..aa36307 100644 --- a/arch/arm/oprofile/common.c +++ b/arch/arm/oprofile/common.c @@ -2,32 +2,183 @@ * @file common.c * * @remark Copyright 2004 Oprofile Authors + * @remark Copyright 2010 ARM Ltd. * @remark Read the file COPYING * * @author Zwane Mwaikambo + * @author Will Deacon [move to perf] */ +#include <linux/cpumask.h> +#include <linux/errno.h> #include <linux/init.h> +#include <linux/mutex.h> #include <linux/oprofile.h> -#include <linux/errno.h> +#include <linux/perf_event.h> #include <linux/slab.h> #include <linux/sysdev.h> -#include <linux/mutex.h> +#include <asm/stacktrace.h> +#include <linux/uaccess.h> + +#include <asm/perf_event.h> +#include <asm/ptrace.h> -#include "op_counter.h" -#include "op_arm_model.h" +#ifdef CONFIG_HW_PERF_EVENTS +/* + * Per performance monitor configuration as set via oprofilefs. + */ +struct op_counter_config { + unsigned long count; + unsigned long enabled; + unsigned long event; + unsigned long unit_mask; + unsigned long kernel; + unsigned long user; + struct perf_event_attr attr; +}; -static struct op_arm_model_spec *op_arm_model; static int op_arm_enabled; static DEFINE_MUTEX(op_arm_mutex); -struct op_counter_config *counter_config; +static struct op_counter_config *counter_config; +static struct perf_event **perf_events[nr_cpumask_bits]; +static int perf_num_counters; + +/* + * Overflow callback for oprofile. + */ +static void op_overflow_handler(struct perf_event *event, int unused, + struct perf_sample_data *data, struct pt_regs *regs) +{ + int id; + u32 cpu = smp_processor_id(); + + for (id = 0; id < perf_num_counters; ++id) + if (perf_events[cpu][id] == event) + break; + + if (id != perf_num_counters) + oprofile_add_sample(regs, id); + else + pr_warning("oprofile: ignoring spurious overflow " + "on cpu %u\n", cpu); +} + +/* + * Called by op_arm_setup to create perf attributes to mirror the oprofile + * settings in counter_config. Attributes are created as `pinned' events and + * so are permanently scheduled on the PMU. + */ +static void op_perf_setup(void) +{ + int i; + u32 size = sizeof(struct perf_event_attr); + struct perf_event_attr *attr; + + for (i = 0; i < perf_num_counters; ++i) { + attr = &counter_config[i].attr; + memset(attr, 0, size); + attr->type = PERF_TYPE_RAW; + attr->size = size; + attr->config = counter_config[i].event; + attr->sample_period = counter_config[i].count; + attr->pinned = 1; + } +} + +static int op_create_counter(int cpu, int event) +{ + int ret = 0; + struct perf_event *pevent; + + if (!counter_config[event].enabled || (perf_events[cpu][event] != NULL)) + return ret; + + pevent = perf_event_create_kernel_counter(&counter_config[event].attr, + cpu, -1, + op_overflow_handler); + + if (IS_ERR(pevent)) { + ret = PTR_ERR(pevent); + } else if (pevent->state != PERF_EVENT_STATE_ACTIVE) { + pr_warning("oprofile: failed to enable event %d " + "on CPU %d\n", event, cpu); + ret = -EBUSY; + } else { + perf_events[cpu][event] = pevent; + } + + return ret; +} + +static void op_destroy_counter(int cpu, int event) +{ + struct perf_event *pevent = perf_events[cpu][event]; + + if (pevent) { + perf_event_release_kernel(pevent); + perf_events[cpu][event] = NULL; + } +} + +/* + * Called by op_arm_start to create active perf events based on the + * perviously configured attributes. + */ +static int op_perf_start(void) +{ + int cpu, event, ret = 0; + + for_each_online_cpu(cpu) { + for (event = 0; event < perf_num_counters; ++event) { + ret = op_create_counter(cpu, event); + if (ret) + goto out; + } + } + +out: + return ret; +} + +/* + * Called by op_arm_stop@the end of a profiling run. + */ +static void op_perf_stop(void) +{ + int cpu, event; + + for_each_online_cpu(cpu) + for (event = 0; event < perf_num_counters; ++event) + op_destroy_counter(cpu, event); +} + + +static char *op_name_from_perf_id(enum arm_perf_pmu_ids id) +{ + switch (id) { + case ARM_PERF_PMU_ID_XSCALE1: + return "arm/xscale1"; + case ARM_PERF_PMU_ID_XSCALE2: + return "arm/xscale2"; + case ARM_PERF_PMU_ID_V6: + return "arm/armv6"; + case ARM_PERF_PMU_ID_V6MP: + return "arm/mpcore"; + case ARM_PERF_PMU_ID_CA8: + return "arm/armv7"; + case ARM_PERF_PMU_ID_CA9: + return "arm/armv7-ca9"; + default: + return NULL; + } +} static int op_arm_create_files(struct super_block *sb, struct dentry *root) { unsigned int i; - for (i = 0; i < op_arm_model->num_counters; i++) { + for (i = 0; i < perf_num_counters; i++) { struct dentry *dir; char buf[4]; @@ -46,12 +197,10 @@ static int op_arm_create_files(struct super_block *sb, struct dentry *root) static int op_arm_setup(void) { - int ret; - spin_lock(&oprofilefs_lock); - ret = op_arm_model->setup_ctrs(); + op_perf_setup(); spin_unlock(&oprofilefs_lock); - return ret; + return 0; } static int op_arm_start(void) @@ -60,8 +209,9 @@ static int op_arm_start(void) mutex_lock(&op_arm_mutex); if (!op_arm_enabled) { - ret = op_arm_model->start(); - op_arm_enabled = !ret; + ret = 0; + op_perf_start(); + op_arm_enabled = 1; } mutex_unlock(&op_arm_mutex); return ret; @@ -71,7 +221,7 @@ static void op_arm_stop(void) { mutex_lock(&op_arm_mutex); if (op_arm_enabled) - op_arm_model->stop(); + op_perf_stop(); op_arm_enabled = 0; mutex_unlock(&op_arm_mutex); } @@ -81,7 +231,7 @@ static int op_arm_suspend(struct sys_device *dev, pm_message_t state) { mutex_lock(&op_arm_mutex); if (op_arm_enabled) - op_arm_model->stop(); + op_perf_stop(); mutex_unlock(&op_arm_mutex); return 0; } @@ -89,7 +239,7 @@ static int op_arm_suspend(struct sys_device *dev, pm_message_t state) static int op_arm_resume(struct sys_device *dev) { mutex_lock(&op_arm_mutex); - if (op_arm_enabled && op_arm_model->start()) + if (op_arm_enabled && op_perf_start()) op_arm_enabled = 0; mutex_unlock(&op_arm_mutex); return 0; @@ -126,58 +276,137 @@ static void exit_driverfs(void) #define exit_driverfs() do { } while (0) #endif /* CONFIG_PM */ -int __init oprofile_arch_init(struct oprofile_operations *ops) +static int report_trace(struct stackframe *frame, void *d) { - struct op_arm_model_spec *spec = NULL; - int ret = -ENODEV; + unsigned int *depth = d; - ops->backtrace = arm_backtrace; + if (*depth) { + oprofile_add_trace(frame->pc); + (*depth)--; + } -#ifdef CONFIG_CPU_XSCALE - spec = &op_xscale_spec; -#endif + return *depth == 0; +} -#ifdef CONFIG_OPROFILE_ARMV6 - spec = &op_armv6_spec; -#endif +/* + * The registers we're interested in are at the end of the variable + * length saved register structure. The fp points at the end of this + * structure so the address of this struct is: + * (struct frame_tail *)(xxx->fp)-1 + */ +struct frame_tail { + struct frame_tail *fp; + unsigned long sp; + unsigned long lr; +} __attribute__((packed)); -#ifdef CONFIG_OPROFILE_MPCORE - spec = &op_mpcore_spec; -#endif +static struct frame_tail* user_backtrace(struct frame_tail *tail) +{ + struct frame_tail buftail[2]; -#ifdef CONFIG_OPROFILE_ARMV7 - spec = &op_armv7_spec; -#endif + /* Also check accessibility of one struct frame_tail beyond */ + if (!access_ok(VERIFY_READ, tail, sizeof(buftail))) + return NULL; + if (__copy_from_user_inatomic(buftail, tail, sizeof(buftail))) + return NULL; - if (spec) { - ret = spec->init(); - if (ret < 0) - return ret; + oprofile_add_trace(buftail[0].lr); - counter_config = kcalloc(spec->num_counters, sizeof(struct op_counter_config), - GFP_KERNEL); - if (!counter_config) - return -ENOMEM; + /* frame pointers should strictly progress back up the stack + * (towards higher addresses) */ + if (tail >= buftail[0].fp) + return NULL; + + return buftail[0].fp-1; +} + +static void arm_backtrace(struct pt_regs * const regs, unsigned int depth) +{ + struct frame_tail *tail = ((struct frame_tail *) regs->ARM_fp) - 1; + + if (!user_mode(regs)) { + struct stackframe frame; + frame.fp = regs->ARM_fp; + frame.sp = regs->ARM_sp; + frame.lr = regs->ARM_lr; + frame.pc = regs->ARM_pc; + walk_stackframe(&frame, report_trace, &depth); + return; + } + + while (depth-- && tail && !((unsigned long) tail & 3)) + tail = user_backtrace(tail); +} + +int __init oprofile_arch_init(struct oprofile_operations *ops) +{ + int cpu, ret = 0; + + perf_num_counters = perf_get_max_events(); + + counter_config = kcalloc(perf_num_counters, + sizeof(struct op_counter_config), GFP_KERNEL); - op_arm_model = spec; - init_driverfs(); - ops->create_files = op_arm_create_files; - ops->setup = op_arm_setup; - ops->shutdown = op_arm_stop; - ops->start = op_arm_start; - ops->stop = op_arm_stop; - ops->cpu_type = op_arm_model->name; - printk(KERN_INFO "oprofile: using %s\n", spec->name); + if (!counter_config) { + pr_info("oprofile: failed to allocate %d " + "counters\n", perf_num_counters); + return -ENOMEM; } + for_each_possible_cpu(cpu) { + perf_events[cpu] = kcalloc(perf_num_counters, + sizeof(struct perf_event *), GFP_KERNEL); + if (!perf_events[cpu]) { + pr_info("oprofile: failed to allocate %d perf events " + "for cpu %d\n", perf_num_counters, cpu); + while (--cpu >= 0) + kfree(perf_events[cpu]); + return -ENOMEM; + } + } + + init_driverfs(); + ops->backtrace = arm_backtrace; + ops->create_files = op_arm_create_files; + ops->setup = op_arm_setup; + ops->start = op_arm_start; + ops->stop = op_arm_stop; + ops->shutdown = op_arm_stop; + ops->cpu_type = op_name_from_perf_id(armpmu_get_pmu_id()); + + if (!ops->cpu_type) + ret = -ENODEV; + else + pr_info("oprofile: using %s\n", ops->cpu_type); + return ret; } void oprofile_arch_exit(void) { - if (op_arm_model) { + int cpu, id; + struct perf_event *event; + + if (*perf_events) { exit_driverfs(); - op_arm_model = NULL; + for_each_possible_cpu(cpu) { + for (id = 0; id < perf_num_counters; ++id) { + event = perf_events[cpu][id]; + if (event != NULL) + perf_event_release_kernel(event); + } + kfree(perf_events[cpu]); + } } - kfree(counter_config); + + if (counter_config) + kfree(counter_config); +} +#else +int __init oprofile_arch_init(struct oprofile_operations *ops) +{ + pr_info("oprofile: hardware counters not available\n"); + return -ENODEV; } +void oprofile_arch_exit(void) {} +#endif /* CONFIG_HW_PERF_EVENTS */ -- 1.6.3.3 ^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-05-12 9:32 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-05-12 4:35 [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Deng-Cheng Zhu 2010-05-12 9:32 ` Will Deacon -- strict thread matches above, loose matches on Subject: below -- 2010-04-19 16:42 [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v4] Will Deacon 2010-04-19 16:42 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon 2010-04-19 16:42 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon 2010-04-19 16:42 ` [PATCH 3/6] ARM: perf-events: allow modules to query the number of hardware counters Will Deacon 2010-04-19 16:42 ` [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Will Deacon 2010-04-12 15:04 [PATCH 0/6] ARM: oprofile: use perf-events framework as backend [v3] Will Deacon 2010-04-12 15:04 ` [PATCH 1/6] ARM: perf-events: use numeric ID to identify PMU Will Deacon 2010-04-12 15:04 ` [PATCH 2/6] ARM: perf-events: add support for xscale PMUs Will Deacon 2010-04-12 15:04 ` [PATCH 3/6] perf-events: allow modules to query the maximum number of perf events Will Deacon 2010-04-12 15:04 ` [PATCH 4/6] ARM: oprofile: use perf-events framework as backend Will Deacon
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.