From: Corey Ashford <cjashfor@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: eranian@gmail.com, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Eric Dumazet <dada1@cosmosbay.com>,
	Robert Richter <robert.richter@amd.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Peter Anvin <hpa@zytor.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Paul Mackerras <paulus@samba.org>,
	"David S. Miller" <davem@davemloft.net>,
	Mike Galbraith <efault@gmx.de>,
	"perfmon2-devel@lists.sourceforge.net" 
	<perfmon2-devel@lists.sourceforge.net>,
	Papi <ptools-perfapi@cs.utk.edu>
Subject: Re: [announce] Performance Counters for Linux, v6
Date: Mon, 26 Jan 2009 15:41:08 -0800
Message-ID: <497E4A14.1090605@linux.vnet.ibm.com>
In-Reply-To: <20090126221553.GB7440@elte.hu>

Ingo Molnar wrote:
> * Corey Ashford <cjashfor@linux.vnet.ibm.com> wrote:
> 
>> Ingo Molnar wrote:
>>> * stephane eranian <eranian@googlemail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Corey brings up an interesting problem which I wanted to comment on.
>>>>
>>>> The current proposal hinges on the idea that by interpreting a single 
>>>> value the kernel can understand what the user wants to measure. For  
>>>> instance, if I pass type=0, then the kernel understands I want to  
>>>> measure CPU_CYCLES. Given that the number of events and their unit 
>>>> mask combinations can be large, the proposal also provides a "raw" 
>>>> mode, where the content of the type field is interpreted as the raw 
>>>> value to put into a register.
>>>>
>>>> This is where there is an issue because with several PMU models,
>>>> including on X86, using the raw bit + 64-bit value is not enough to
>>>> figure out what the user wants to measure. This happens when the PMU
>>>> has more than just counters. Thus, interpreting each raw value as the
>>>> event code may be wrong. To remain on familiar territory, the Nehalem
>>>> uncore PMU has an opcode matcher register, that uses a 64-bit value. 
>>>> On AMD64 Family 10h, you have IBS. But I could give examples on 
>>>> Itanium with opcode matchers, range restrictions. Corey provided 
>>>> other examples for Power. The API has to provide a way to express 
>>>> what the raw value is meant for: counter, matcher, filter...
>>> this can be done in a number of ways (in order of increasing levels of  
>>> abstraction):
>>>
>>> - the raw type is kept wide enough. Paul already requested the raw type
>>>   to be widened to 128 bits to express certain PowerPC features.
>>>
>>> - or the PMU capability is expressed as a special counter type (if it's
>>>   useful enough) - and then either the write() method or ioctl is extended
>>>   to express attributes we want to set/change while a counter is running.
>>>
>>> - or the highest level counter / hw event data type is extended with new
>>>   attribute field(s).
>>>
>>> My feeling is that we generally want such hw features to start small -  
>>> i.e. at the raw type level initially. Then we can allow them to climb 
>>> the ladder, if they prove their utility in practice. We've got space 
>>> reserved in the ABI to allow for growth like this.
>>>
>>> 	Ingo
>>
>> Hi Ingo and Stephane,
>>
>> Thanks for the replies.
>>
>> I think any one of those solutions would work for Power's Instruction 
>> Matching Register.  If more than one register needs to be programmed, or 
>> the values don't fit into the 128-bit raw event types, we could use the 
>> "special counter" approach, I think.
>>
>> I will have another look at the Power PMU description and see if there 
>> are other constraints that might cause us to want to go one way or the 
>> other, or perhaps a different way.
> 
> thanks, that's really appreciated!
> 
> One useful approach would be to come up with a bitcount that you think 
> would fit considering even (currently) fringe/odd features - and we'd make 
> sure there's enough space for that in the ABI - should there be a 
> need/desire to expose that in the future.
> 
> 	Ingo

Looking at the Instruction Matching CAM on Power6, it consists of two
64-bit values, but there are quite a few reserved bits, and bits that
must be programmed in a fixed way.  If we were to squeeze the reserved
and fixed bits out of the ABI, that leaves 74 real bits of data that a
user would like to be able to set.

In addition to that, there is an instruction marking mechanism that 
requires 2 bits to set the sampling mode.

Lastly, there is a thresholding mechanism that has 6 bits of count, two 
3-bit start/end event fields, and a 2-bit granularity field.

In total, that's 90 bits (74 + 2 + 6 + 3 + 3 + 2) in addition to the
event code (9 bits?), so roughly 99 bits.  There may be a few stragglers
that I have missed, and some room should be left for future processors.
128 could be a bit tight for future processor generations.
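
The tally above can be checked mechanically. The following is only a
sketch: the field names are illustrative labels rather than names from
the Power6 PMU manual, the 9-bit event code is the tentative figure
quoted above, and the packing order (lowest bits first) is an arbitrary
assumption, not a proposed ABI layout.

```python
# Field widths in bits, as quoted in the message above. The field names are
# illustrative labels (not register names from the PMU manual), and the
# 9-bit event code is the message's own tentative figure.
FIELDS = [
    ("event_code", 9),
    ("imc_match", 74),
    ("mark_sampling_mode", 2),
    ("thresh_count", 6),
    ("thresh_start_event", 3),
    ("thresh_end_event", 3),
    ("thresh_granularity", 2),
]

RAW_BITS = 128  # width of the raw event type under discussion


def layout(fields):
    """Assign each field a (shift, width) pair, lowest bits first."""
    offsets, pos = {}, 0
    for name, width in fields:
        offsets[name] = (pos, width)
        pos += width
    return offsets, pos


def pack(values, fields=FIELDS):
    """Pack a {name: value} dict into one integer, rejecting overflow."""
    offsets, _ = layout(fields)
    raw = 0
    for name, value in values.items():
        shift, width = offsets[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        raw |= value << shift
    return raw


offsets, used = layout(FIELDS)
print(f"bits used: {used}, headroom in {RAW_BITS}: {RAW_BITS - used}")
```

With these widths the layout uses 99 bits, which leaves 29 bits of
headroom in a 128-bit raw type for stragglers and future processors.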

While reading the Power6 PMU manual, I also had a look at the Power5+
PMU manual, and it has five more accessible instruction matching
registers (32 bits each).  These five are somewhat more special-purpose
(they match fewer bits in the instruction), and they probably could be
left out, but it would be nice if the ABI had the room for them.
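
For scale, a rough sketch of what those five registers would cost on
their own. The 64-bit word size used for rounding is an assumption made
for illustration only; nothing in this thread fixes the unit in which
the raw field would grow.

```python
MATCH_REGS = 5   # extra instruction matching registers mentioned above
REG_BITS = 32    # width of each of those registers
RAW_BITS = 128   # raw event width under discussion
WORD_BITS = 64   # hypothetical unit for growing the raw field


def words_needed(bits, word_bits=WORD_BITS):
    """Round a bit count up to whole words (ceiling division)."""
    return -(-bits // word_bits)


extra_bits = MATCH_REGS * REG_BITS
print(f"{extra_bits} bits -> {words_needed(extra_bits)} words of {WORD_BITS} bits")
print("fits in raw type:", extra_bits <= RAW_BITS)
```

The five registers alone come to 160 bits, so they would not fit in a
128-bit raw type even before any other field is counted.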

Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
cjashfor@us.ibm.com

