From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: Perf Event support for ARMv7 (was: Re: [PATCH 5/5] arm/perfevents: implement perf event support for ARMv6)
Date: Wed, 20 Jan 2010 13:40:08 -0000
Message-ID: <000d01ca99d6$14edcba0$3ec962e0$@deacon@arm.com>
In-Reply-To: <201001151630.07874.jpihet@mvista.com>

Hi Jean,

Sorry for the delay in getting back to you; I've had a few technical
problems with my machine. Anyway, here we go:

* Jean Pihet wrote:
<snip>
> > 0x0c is HW_BRANCH_INSTRUCTIONS and 0x10 is HW_BRANCH_MISSES.
> > 0x12 is the number of predictable branch instructions executed, so the
> > mispredict rate is 0x10/0x12. These events are defined for v7, so A8 should
> > take these definitions too.
> From the spec I read 0x0c is 'SW write of the PC', is that equivalent to
> HW_BRANCH_INSTRUCTIONS?

This event counts:
	- All branch instructions
	- Instructions that explicitly write the PC
	- Exception generating instructions

I think this is suitable for HW_BRANCH_INSTRUCTIONS, but if anybody feels
differently then maybe we should reconsider.

> For A8 I am using:
> - ARMV7_PERFCTR_PC_BRANCH_TAKEN (0x53),
> - ARMV7_PERFCTR_PC_BRANCH_FAILED (0x52)
> 
> For A9 it is unsupported for now.
> 
> Do you think I should use 0x0c and 0x10 for both A8 and A9? How to get the
> accesses and misses count directly?

I think we should define the `standard' set (i.e. those events that perf
supports by name) using the v7 events, so in this case we should use 0x0c and
0x10 for both A8 and A9. The core-specific definitions can then always be
accessed as raw events. As I mentioned, I think this is important if people
decide to compare the counts between two cores.
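
A minimal sketch of that mapping in C. The enum, macro and function names are
hypothetical and are not the ones used in the patch; only the event numbers
0x0c, 0x10 and 0x12 come from this thread. One table serves both A8 and A9, and
the helper computes the 0x10/0x12 mispredict ratio, scaled to per-mille so it
stays in integer arithmetic.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the generic perf hardware event indices. */
enum generic_hw_event {
	GEN_HW_BRANCH_INSTRUCTIONS,
	GEN_HW_BRANCH_MISSES,
	GEN_HW_EVENT_MAX,
};

/* v7 event numbers as quoted in the thread; the macro names are made up. */
#define V7_EVT_PC_WRITE			0x0c	/* branches, PC writes, exceptions */
#define V7_EVT_BRANCH_MISPRED		0x10	/* mispredicted branches */
#define V7_EVT_BRANCH_PREDICTABLE	0x12	/* predictable branches executed */

/* One `standard' table shared by A8 and A9. */
static const uint8_t v7_hw_event_map[GEN_HW_EVENT_MAX] = {
	[GEN_HW_BRANCH_INSTRUCTIONS]	= V7_EVT_PC_WRITE,
	[GEN_HW_BRANCH_MISSES]		= V7_EVT_BRANCH_MISPRED,
};

/* Mispredict rate in per-mille: count(0x10) * 1000 / count(0x12). */
static inline uint64_t mispredict_permille(uint64_t mispredicted,
					   uint64_t predictable)
{
	if (predictable == 0)
		return 0;	/* avoid dividing by zero on an empty sample */
	return (mispredicted * 1000) / predictable;
}
```

Core-specific events such as 0x52 and 0x53 on A8 stay out of this table and
would be requested as raw events instead.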

> > We could use 0x01 for icache miss, 0x03 for dcache miss and 0x04 for dcache
> > access.
> Ok changed to the following. Is that correct?
> Note that A8 uses specific events for I cache in order to make them comparable
> to each other. I cache miss could use 0x01 also. Cf. remark below for more.
> 
> Cortex-A8:
> - D cache access: ARMV7_PERFCTR_DCACHE_ACCESS (0x04),
> - D cache miss: ARMV7_PERFCTR_DCACHE_REFILL (0x03) instead of
> ARMV7_PERFCTR_L1_DATA_MISS (0x49),
> - I cache access: ARMV7_PERFCTR_L1_DATA_MISS (0x50),
> - I cache miss: ARMV7_PERFCTR_L1_INST_MISS (0x4a).
> 
> Cortex-A9:
> - D cache access: ARMV7_PERFCTR_DCACHE_ACCESS (0x04),
> - D cache miss: ARMV7_PERFCTR_DCACHE_REFILL (0x03),
> - I cache access: Not supported,
> - I cache miss: ARMV7_PERFCTR_IFETCH_MISS (0x01).

Hmm, this is an interesting one. I suppose comparison between events on a given
core (e.g. A8) is preferable, so I agree with you here. Because the A9 has no
I-cache access event, there's nothing we can do to get a fair cross-core comparison.
[minor note: You've called the I-cache access event ARMV7_PERFCTR_L1_DATA_MISS!]
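
For reference, the two per-core L1 tables agreed above could be sketched as
follows. This is only an illustration: the array layout, the index enums and the
0xffff value of CACHE_OP_UNSUPPORTED are assumptions, while the event names and
numbers are the ones quoted in this thread (with 0x50 named
ARMV7_PERFCTR_L1_INST, as in the patch).

```c
#include <stdint.h>

/* Assumed sentinel value; the thread only gives the name. */
#define CACHE_OP_UNSUPPORTED		0xffff

/* Event numbers as quoted in the thread. */
#define ARMV7_PERFCTR_IFETCH_MISS	0x01
#define ARMV7_PERFCTR_DCACHE_REFILL	0x03
#define ARMV7_PERFCTR_DCACHE_ACCESS	0x04
#define ARMV7_PERFCTR_L1_INST_MISS	0x4a	/* Cortex-A8 specific */
#define ARMV7_PERFCTR_L1_INST		0x50	/* Cortex-A8 specific */

/* Hypothetical indices, simplified from the C(...) scheme in the patch. */
enum { L1D, L1I, CACHE_MAX };
enum { RESULT_ACCESS, RESULT_MISS, RESULT_MAX };

/* A8: both I-cache events are core-specific, so they compare with each other. */
static const uint16_t a8_l1_map[CACHE_MAX][RESULT_MAX] = {
	[L1D] = {
		[RESULT_ACCESS]	= ARMV7_PERFCTR_DCACHE_ACCESS,
		[RESULT_MISS]	= ARMV7_PERFCTR_DCACHE_REFILL,
	},
	[L1I] = {
		[RESULT_ACCESS]	= ARMV7_PERFCTR_L1_INST,
		[RESULT_MISS]	= ARMV7_PERFCTR_L1_INST_MISS,
	},
};

/* A9: no I-cache access event, so only the miss count is available. */
static const uint16_t a9_l1_map[CACHE_MAX][RESULT_MAX] = {
	[L1D] = {
		[RESULT_ACCESS]	= ARMV7_PERFCTR_DCACHE_ACCESS,
		[RESULT_MISS]	= ARMV7_PERFCTR_DCACHE_REFILL,
	},
	[L1I] = {
		[RESULT_ACCESS]	= CACHE_OP_UNSUPPORTED,
		[RESULT_MISS]	= ARMV7_PERFCTR_IFETCH_MISS,
	},
};

/* Returns non-zero if the table entry names a countable event. */
static inline int cache_event_supported(uint16_t event)
{
	return event != CACHE_OP_UNSUPPORTED;
}
```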
 
> > > +	[C(L1I)] = {
> > > +		[C(OP_READ)] = {
> > > +			[C(RESULT_ACCESS)]	= ARMV7_PERFCTR_L1_INST,
> > > +			[C(RESULT_MISS)]	= ARMV7_PERFCTR_L1_INST_MISS,
> > > +		},
> > > +		[C(OP_WRITE)] = {
> > > +			[C(RESULT_ACCESS)]	= ARMV7_PERFCTR_L1_INST,
> > > +			[C(RESULT_MISS)]	= ARMV7_PERFCTR_L1_INST_MISS,
> > > +		},
> > > +		[C(OP_PREFETCH)] = {
> > > +			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
> > > +			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
> > > +		},
> > > +	},
> >
> > Same thing here. I'd suggest using 0x01 instead of 0x4a.
> Ok is it preferred to keep the ARMV7_PERFCTR_L1_ events for both accesses and
> misses in order to make the events counts comparable to each other? On the
> other end using 0x01 allows the comparison between A8 and A9.
> I am OK to change it, just let me know.

After thinking about this above, I agree with you; let's use the
ARMV7_PERFCTR_L1_ events to allow for event comparisons on the A8. Comparing with
an A9 is a non-starter because the I-cache accesses can't be counted there.

> > > +/*
> > > + * Available counters
> > > + */
> > > +#define ARMV7_CNT0 		0	/* First event counter */
> > > +#define ARMV7_CCNT 		31	/* Cycle counter */
> > > +
> > > +#define ARMV7_A8_CNTMAX		5	/* Cortex-A8: up to 4 counters + CCNT */
> > > +#define ARMV7_A9_CNTMAX		32	/* Cortex-A9: up to 31 counters + CCNT*/
> >
> > Actually, A9 has a maximum number of 6 event counters + CCNT.
> Cf. remark above. The code is generic enough and supports up to the 1+31
> events as defined in the A8 and A9 TRMs. The number of counters is
> dynamically read from the PMNC registers. Should that be compared against the
> given maximum (1+4 for A8, 1+6 for A9)? That looks like overkill.

Sure, I was just referring to ARMV7_A9_CNTMAX being artificially high.
You'll never see more than 6 event counters on an A9.
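
To illustrate why no per-core maximum is needed, here is a rough sketch of
deriving the counter count from a PMNC value. It assumes the ARMv7 layout in
which the N field (number of event counters) occupies bits [15:11]; the macro
and function names are hypothetical, and the register value is passed in as a
plain integer rather than read from the coprocessor.

```c
#include <stdint.h>

/* Assumed ARMv7 PMNC layout: N field in bits [15:11]. */
#define PMNC_N_SHIFT	11
#define PMNC_N_MASK	0x1f

/* Number of event counters implemented, as reported by the hardware. */
static inline unsigned int pmnc_num_event_counters(uint32_t pmnc)
{
	return (pmnc >> PMNC_N_SHIFT) & PMNC_N_MASK;
}

/* Total usable counters: the event counters plus the cycle counter (CCNT). */
static inline unsigned int pmnc_num_counters(uint32_t pmnc)
{
	return pmnc_num_event_counters(pmnc) + 1;
}
```

The 5-bit field caps the result at 31 event counters, which is where the 1+31
figure comes from; a core with 4 or 6 event counters simply reports a smaller N.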

> > It might also be
> > worth adding a cpu_architecture() check to the v6 test just in case a
> > v7 core conflicts with the mask.
> Jamie, what do you think?

I forgot that cpu_architecture() looked at the MMU. Oh well, the ordering will
have to matter.

Cheers,

Will
