From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: Perf Event support for ARMv7 (was: Re: [PATCH 5/5] arm/perfevents: implement perf event support for ARMv6)
Date: Mon, 21 Dec 2009 11:04:55 -0000
Message-ID: <000701ca822d$6de761a0$49b624e0$@deacon@arm.com>
In-Reply-To: <1261155929.6094.11.camel@def-laptop>
Hi Jean,
I've provided some comments inline. Hopefully they're useful.
* Jean Pihet wrote:
> Hello,
>
> Here is a patch that adds the support for ARMv7 processors, using the
> PMNC HW unit.
>
> The code is for review, it has been compiled and boot tested only, the
> complete testing is in progress. Please let me know if the patch is
> wrapped or garbled I will send it attached (20KB in size).
>
> Feedback is welcome.
>
<snip>
> diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
> index abb5267..79e92ce 100644
> --- a/arch/arm/kernel/perf_event.c
> +++ b/arch/arm/kernel/perf_event.c
> @@ -4,6 +4,7 @@
> * ARM performance counter support.
> *
> * Copyright (C) 2009 picoChip Designs, Ltd., Jamie Iles
> + * ARMv7 support: Jean Pihet <jpihet@mvista.com>
> *
> * This code is based on the sparc64 perf event code, which is in turn based
> * on the x86 code. Callchain code is based on the ARM OProfile backtrace
> @@ -35,8 +36,11 @@ DEFINE_SPINLOCK(pmu_lock);
> * ARMv6 supports a maximum of 3 events, starting from index 1. If we add
> * another platform that supports more, we need to increase this to be the
> * largest of all platforms.
> + *
> + * ARMv7 supports up to 5 events:
> + * cycle counter CCNT + 4 events counters CNT0..3
> */
> -#define ARMPMU_MAX_HWEVENTS 4
> +#define ARMPMU_MAX_HWEVENTS 5
The maximum number of event counters on ARMv7 is currently 6 [Cortex-A9],
plus a cycle counter. Additionally, the number of event counters actually
available is implementation defined (only the cycle counter is mandatory).
You can find out the number of event counters by reading the N field of the
PMCR: (PMCR >> 11) & 0x1f.
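Something along these lines would do for extracting the count. This is an
untested sketch: the helper and macro names are made up for illustration, and
it works on a PMCR value that has already been read (in the kernel that would
be a CP15 read of c9, c12, 0).

```c
#include <stdint.h>

/* PMCR.N lives in bits [15:11]: number of event counters implemented. */
#define PMCR_N_SHIFT	11
#define PMCR_N_MASK	0x1fu

/* Hypothetical helper: event counters only, the cycle counter is not included. */
static inline unsigned int pmcr_num_event_counters(uint32_t pmcr)
{
	return (pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK;
}

/* Hypothetical helper: event counters plus the mandatory cycle counter. */
static inline unsigned int pmcr_num_counters(uint32_t pmcr)
{
	return pmcr_num_event_counters(pmcr) + 1;
}
```

With a value like this, num_events could be filled in at init time instead of
being hard-coded, and ARMPMU_MAX_HWEVENTS would only need to cover the largest
implementation.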
>
> /* The events for a given CPU. */
> struct cpu_hw_events {
> @@ -965,6 +969,701 @@ static struct arm_pmu armv6pmu = {
> .max_period = (1LLU << 32) - 1,
> };
>
> +/*
> + * ARMv7 Performance counter handling code.
> + *
> + * Copied from ARMv6 code, with the low level code inspired
> + * by the ARMv7 Oprofile code.
> + *
> + * ARMv7 has 4 configurable performance counters and a single cycle counter.
> + * All counters can be enabled/disabled and IRQ masked separately. The cycle
> + * counter and all 4 performance counters together can be reset separately.
> + */
> +
> +enum armv7_perf_types {
> + ARMV7_PERFCTR_PMNC_SW_INCR = 0x00,
> + ARMV7_PERFCTR_IFETCH_MISS = 0x01,
> + ARMV7_PERFCTR_ITLB_MISS = 0x02,
> + ARMV7_PERFCTR_DCACHE_REFILL = 0x03,
> + ARMV7_PERFCTR_DCACHE_ACCESS = 0x04,
> + ARMV7_PERFCTR_DTLB_REFILL = 0x05,
> + ARMV7_PERFCTR_DREAD = 0x06,
> + ARMV7_PERFCTR_DWRITE = 0x07,
> + ARMV7_PERFCTR_INSTR_EXECUTED = 0x08,
> + ARMV7_PERFCTR_EXC_TAKEN = 0x09,
> + ARMV7_PERFCTR_EXC_EXECUTED = 0x0A,
> + ARMV7_PERFCTR_CID_WRITE = 0x0B,
> + ARMV7_PERFCTR_PC_WRITE = 0x0C,
> + ARMV7_PERFCTR_PC_IMM_BRANCH = 0x0D,
> + ARMV7_PERFCTR_PC_PROC_RETURN = 0x0E,
> + ARMV7_PERFCTR_UNALIGNED_ACCESS = 0x0F,
> + ARMV7_PERFCTR_PC_BRANCH_MIS_PRED = 0x10,
> +
> + ARMV7_PERFCTR_PC_BRANCH_MIS_USED = 0x12,
Ok - the events so far are defined by the v7 architecture.
Note that this doesn't necessarily mean they are all supported by
the core.
> + ARMV7_PERFCTR_WRITE_BUFFER_FULL = 0x40,
> + ARMV7_PERFCTR_L2_STORE_MERGED = 0x41,
> + ARMV7_PERFCTR_L2_STORE_BUFF = 0x42,
> + ARMV7_PERFCTR_L2_ACCESS = 0x43,
> + ARMV7_PERFCTR_L2_CACH_MISS = 0x44,
> + ARMV7_PERFCTR_AXI_READ_CYCLES = 0x45,
> + ARMV7_PERFCTR_AXI_WRITE_CYCLES = 0x46,
> + ARMV7_PERFCTR_MEMORY_REPLAY = 0x47,
> + ARMV7_PERFCTR_UNALIGNED_ACCESS_REPLAY = 0x48,
> + ARMV7_PERFCTR_L1_DATA_MISS = 0x49,
> + ARMV7_PERFCTR_L1_INST_MISS = 0x4A,
> + ARMV7_PERFCTR_L1_DATA_COLORING = 0x4B,
> + ARMV7_PERFCTR_L1_NEON_DATA = 0x4C,
> + ARMV7_PERFCTR_L1_NEON_CACH_DATA = 0x4D,
> + ARMV7_PERFCTR_L2_NEON = 0x4E,
> + ARMV7_PERFCTR_L2_NEON_HIT = 0x4F,
> + ARMV7_PERFCTR_L1_INST = 0x50,
> + ARMV7_PERFCTR_PC_RETURN_MIS_PRED = 0x51,
> + ARMV7_PERFCTR_PC_BRANCH_FAILED = 0x52,
> + ARMV7_PERFCTR_PC_BRANCH_TAKEN = 0x53,
> + ARMV7_PERFCTR_PC_BRANCH_EXECUTED = 0x54,
> + ARMV7_PERFCTR_OP_EXECUTED = 0x55,
> + ARMV7_PERFCTR_CYCLES_INST_STALL = 0x56,
> + ARMV7_PERFCTR_CYCLES_INST = 0x57,
> + ARMV7_PERFCTR_CYCLES_NEON_DATA_STALL = 0x58,
> + ARMV7_PERFCTR_CYCLES_NEON_INST_STALL = 0x59,
> + ARMV7_PERFCTR_NEON_CYCLES = 0x5A,
> +
> + ARMV7_PERFCTR_PMU0_EVENTS = 0x70,
> + ARMV7_PERFCTR_PMU1_EVENTS = 0x71,
> + ARMV7_PERFCTR_PMU_EVENTS = 0x72,
> +
> + ARMV7_PERFCTR_CPU_CYCLES = 0xFF
> +};
These events are specific to the Cortex-A8.
Unfortunately, these numbers clash with events specific
to the Cortex-A9 [and potentially future v7 cores].
For example, 0x40 on the A8 is WRITE_BUFFER_FULL but on the
A9 it is JAVA_BYTECODE_EXEC. This means that you'll need to
take an approach similar to the one taken for ARM11MP vs ARM11*,
i.e. separate event definitions per core.
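To make the clash concrete, here is a rough sketch of per-core event tables.
All of the type, table and function names are invented for the example; only
the two 0x40 entries come from the comparison above, and real tables would
carry the full event lists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical core identifiers, for illustration only. */
enum armv7_core { ARMV7_CORE_A8, ARMV7_CORE_A9 };

struct armv7_event_desc {
	uint32_t number;	/* value written to the event select register */
	const char *name;
};

/* The same event number means different things on different cores. */
static const struct armv7_event_desc armv7_a8_events[] = {
	{ 0x40, "WRITE_BUFFER_FULL" },
};

static const struct armv7_event_desc armv7_a9_events[] = {
	{ 0x40, "JAVA_BYTECODE_EXEC" },
};

/* Look up an event in the table for the given core; NULL if not listed. */
static const char *armv7_event_name(enum armv7_core core, uint32_t number)
{
	const struct armv7_event_desc *tbl;
	size_t n, i;

	switch (core) {
	case ARMV7_CORE_A8:
		tbl = armv7_a8_events;
		n = sizeof(armv7_a8_events) / sizeof(armv7_a8_events[0]);
		break;
	case ARMV7_CORE_A9:
		tbl = armv7_a9_events;
		n = sizeof(armv7_a9_events) / sizeof(armv7_a9_events[0]);
		break;
	default:
		return NULL;
	}

	for (i = 0; i < n; i++)
		if (tbl[i].number == number)
			return tbl[i].name;

	return NULL;
}
```

A single shared enum cannot express this, because both entries need the value
0x40 under different names.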
<snip>
> +/*
> + * Available counters
> + */
> +#define ARMV7_CCNT 0
> +#define ARMV7_CNT0 1
> +#define ARMV7_CNT1 2
> +#define ARMV7_CNT2 3
> +#define ARMV7_CNT3 4
> +#define ARMV7_CNTMAX 5
> +#define ARMV7_COUNTER_TO_CCNT (ARMV7_CYCLE_COUNTER - ARMV7_CCNT)
> +
> +#define ARMV7_CPU_COUNTER(cpu, counter) ((cpu) * CNTMAX + (counter))
You don't use this macro. I imagine there are others which are no longer used too.
<snip>
> +static inline int armv7_pmnc_select_counter(unsigned int cnt)
> +{
> + u32 val;
> +
> + cnt -= ARMV7_COUNTER_TO_CCNT;
> +
> + if ((cnt == ARMV7_CCNT) || (cnt >= ARMV7_CNTMAX)) {
> + printk(KERN_ERR "oprofile: CPU%u selecting wrong PMNC counter"
> + " %d\n", smp_processor_id(), cnt);
> + return -1;
> + }
Nice error message :)
<snip>
> static int __init
> init_hw_perf_events(void)
> {
> @@ -977,6 +1676,13 @@ init_hw_perf_events(void)
> memcpy(armpmu_perf_cache_map, armv6_perf_cache_map,
> sizeof(armv6_perf_cache_map));
> perf_max_events = armv6pmu.num_events;
> + } else if (cpu_architecture() == CPU_ARCH_ARMv7) {
> + armpmu = &armv7pmu;
> + memcpy(armpmu_perf_cache_map, armv7_perf_cache_map,
> + sizeof(armv7_perf_cache_map));
> + perf_max_events = armv7pmu.num_events;
> + /* Initialize & Reset PMNC: C bit and P bit */
> + armv7_pmnc_write(ARMV7_PMNC_P | ARMV7_PMNC_C);
> } else {
> pr_info("no hardware support available\n");
> perf_max_events = -1;
You'll need to switch on the cpuid to select the correct event mappings.
I've implemented this for oprofile and will post it as an RFC after
Christmas, as I won't be able to respond in the meantime.
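To illustrate the idea (a rough sketch only, not the oprofile implementation
mentioned above): the main ID register carries the implementer in bits
[31:24] and the primary part number in bits [15:4], so the selection could
look roughly like this. The macro, enum and function names are assumptions
made for the example.

```c
#include <stdint.h>

/* Field extraction from the main ID register (cpuid) value. */
#define MIDR_IMPLEMENTER(midr)	(((midr) >> 24) & 0xffu)
#define MIDR_PART_NUMBER(midr)	(((midr) >> 4) & 0xfffu)

#define ARM_IMPLEMENTER_ARM	0x41u
#define ARM_PART_CORTEX_A8	0xc08u
#define ARM_PART_CORTEX_A9	0xc09u

/* Hypothetical result type: which set of event mappings to use. */
enum armv7_pmu_variant {
	ARMV7_PMU_UNKNOWN,
	ARMV7_PMU_CORTEX_A8,
	ARMV7_PMU_CORTEX_A9,
};

/* Pick the event mapping set from an already-read cpuid value. */
static enum armv7_pmu_variant armv7_pmu_variant_from_cpuid(uint32_t midr)
{
	if (MIDR_IMPLEMENTER(midr) != ARM_IMPLEMENTER_ARM)
		return ARMV7_PMU_UNKNOWN;

	switch (MIDR_PART_NUMBER(midr)) {
	case ARM_PART_CORTEX_A8:
		return ARMV7_PMU_CORTEX_A8;
	case ARM_PART_CORTEX_A9:
		return ARMV7_PMU_CORTEX_A9;
	default:
		return ARMV7_PMU_UNKNOWN;
	}
}
```

init_hw_perf_events() could then copy the matching cache map for the detected
variant, and fall back to "no hardware support available" for unknown parts.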
Cheers,
Will