public inbox for linux-kernel@vger.kernel.org
From: Corey Ashford <cjashfor@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>,
	Paul Mackerras <paulus@samba.org>,
	Stephane Eranian <eranian@googlemail.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>,
	Dan Terpstra <terpstra@eecs.utk.edu>,
	Philip Mucci <mucci@eecs.utk.edu>,
	Maynard Johnson <mpjohn@us.ibm.com>, Carl Love <cel@us.ibm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Masami Hiramatsu <mhiramat@redhat.com>
Subject: Re: [RFC] perf_events: support for uncore a.k.a. nest units
Date: Thu, 21 Jan 2010 11:28:55 -0800
Message-ID: <4B58AAF7.60507@linux.vnet.ibm.com>
In-Reply-To: <4B58A750.2060607@linux.vnet.ibm.com>

On 1/21/2010 11:13 AM, Corey Ashford wrote:
>
>
> On 1/20/2010 11:21 PM, Ingo Molnar wrote:
>>
>> * Corey Ashford<cjashfor@linux.vnet.ibm.com> wrote:
>>
>>> I really think we need some sort of data structure which is passed from the
>>> kernel to user space to represent the topology of the system, and give
>>> useful information to be able to identify each PMU node. Whether this is
>>> done with a sysfs-style tree, a table in a file, XML, etc... it doesn't
>>> really matter much, but it needs to be something that can be parsed
>>> relatively easily and *contains just enough information* for the user to be
>>> able to correctly choose PMUs, and for the kernel to be able to relate that
>>> back to actual PMU hardware.
>>
>> The right way would be to extend the current event description under
>> /debug/tracing/events with hardware descriptors and (maybe) to formalise this
>> into a separate /proc/events/ or into a separate filesystem.
>>
>> The advantage of this is that in the grand scheme of things we _really_ don't
>> want to limit performance events to 'hardware' hierarchies, or to
>> devices/sysfs, some existing /proc scheme, or any other arbitrary (and
>> fundamentally limiting) object enumeration.
>>
>> We want a unified, logical enumeration of all events and objects that we care
>> about from a performance monitoring and analysis point of view, shaped for the
>> purpose of and parsed by perf user-space. And since the current event
>> descriptors are already rather rich as they enumerate all sorts of things:
>>
>> - tracepoints
>> - hw-breakpoints
>> - dynamic probes
>>
>> etc., and are well used by tooling we should expand those with real hardware
>> structure.
>
> This is an intriguing idea; I like the idea of generalizing all of this
> info into one structure.
>
> So you think that this structure should contain event info as well? If
> these structures are created by the kernel, I think that would
> necessitate placing large event tables into the kernel, which is
> something I think we'd prefer to avoid because of the amount of memory
> it would take. Keep in mind that we need not only event names, but event
> descriptions, encodings, attributes (e.g. unit masks), attribute
> descriptions, etc. I suppose the kernel could read a file from the file
> system, and then add this info to the tree, but that just seems bad. Are
> there existing places in the kernel where it reads a user space file to
> create a user space pseudo filesystem?
>
> I think keeping event naming in user space, and PMU naming in kernel
> space might be a better idea: the kernel exposes the available PMUs to
> user space via some structure, and a user space library tries to
> recognize the exposed PMUs and provide event lists and other needed
> info. The perf tool would use this library to be able to list available
> events to users.
>
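
To make the split described above concrete, here is a minimal sketch of the
user space side, assuming the kernel hands back nothing more than a PMU
identifier string. Every name here (lib_event, lib_pmu_entry,
lib_recognize_pmu, "fake_core", "fake_nest") is hypothetical and invented for
illustration; none of it is an existing kernel or library interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user space view of one event: name plus description. */
struct lib_event {
    const char *name;
    const char *desc;
};

/* Hypothetical library entry: one PMU id the library knows how to describe. */
struct lib_pmu_entry {
    const char             *pmu_id;     /* id string exposed by the kernel (assumed) */
    const struct lib_event *events;     /* event list kept entirely in user space */
    size_t                  nr_events;
};

/* Made-up event tables; real tables would come from vendor documentation. */
static const struct lib_event fake_core_events[] = {
    { "cycles",       "Made-up core cycle counter" },
    { "instructions", "Made-up retired instruction counter" },
};

static const struct lib_event fake_nest_events[] = {
    { "mem_reads", "Made-up nest memory read counter" },
};

static const struct lib_pmu_entry known_pmus[] = {
    { "fake_core", fake_core_events,
      sizeof(fake_core_events) / sizeof(fake_core_events[0]) },
    { "fake_nest", fake_nest_events,
      sizeof(fake_nest_events) / sizeof(fake_nest_events[0]) },
};

/*
 * Try to recognize a PMU exposed by the kernel.  Returns the matching
 * entry, or NULL if the library has no event list for that PMU.
 */
const struct lib_pmu_entry *lib_recognize_pmu(const char *pmu_id)
{
    if (pmu_id == NULL)
        return NULL;

    for (size_t i = 0; i < sizeof(known_pmus) / sizeof(known_pmus[0]); i++)
        if (strcmp(known_pmus[i].pmu_id, pmu_id) == 0)
            return &known_pmus[i];

    return NULL;
}
```

In this sketch a tool would call lib_recognize_pmu() once per PMU the kernel
reports and list the events of every entry that comes back non-NULL, so the
event names and descriptions never have to live in kernel memory.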

Perhaps another way of handling this would be to have the kernel dynamically
load a specific "PMU kernel module" once it has detected that it has a
particular PMU in the hardware.  The module would consist only of a data
structure and a simple API to access the event data.  This way, only the PMUs
that actually exist in the hardware would need to be loaded into memory, and
perhaps then only temporarily (just long enough to create the pseudo fs nodes).
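
As a rough illustration of what "a data structure and a simple API" could
look like, here is a sketch written as plain user space C rather than as a
real kernel module. The struct layout, the function names (pmu_nr_events,
pmu_event_at, pmu_find_event) and the example events and encodings are all
assumptions made up for this sketch, not an existing interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a single event; fields are illustrative. */
struct pmu_event_desc {
    const char *name;
    const char *desc;
    uint64_t    encoding;
};

/* Hypothetical per-PMU table: the "data structure" half of the module. */
struct pmu_event_table {
    const char                  *pmu_name;
    const struct pmu_event_desc *events;
    size_t                       nr_events;
};

/* Made-up events and encodings for a made-up PMU. */
static const struct pmu_event_desc example_events[] = {
    { "example_reads",  "Made-up read counter",  0x01 },
    { "example_writes", "Made-up write counter", 0x02 },
};

static const struct pmu_event_table example_pmu = {
    .pmu_name  = "example_uncore",
    .events    = example_events,
    .nr_events = sizeof(example_events) / sizeof(example_events[0]),
};

/* The "simple API" half: count, access by index, lookup by name. */
size_t pmu_nr_events(const struct pmu_event_table *pmu)
{
    assert(pmu != NULL);
    return pmu->nr_events;
}

const struct pmu_event_desc *pmu_event_at(const struct pmu_event_table *pmu,
                                          size_t idx)
{
    assert(pmu != NULL);
    return idx < pmu->nr_events ? &pmu->events[idx] : NULL;
}

const struct pmu_event_desc *pmu_find_event(const struct pmu_event_table *pmu,
                                            const char *name)
{
    assert(pmu != NULL && name != NULL);
    for (size_t i = 0; i < pmu->nr_events; i++)
        if (strcmp(pmu->events[i].name, name) == 0)
            return &pmu->events[i];
    return NULL;
}
```

Code that builds the pseudo fs nodes would only need pmu_nr_events() and
pmu_event_at() to walk the table once, after which the table itself could in
principle be dropped.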

Still, though, since it's a pseudo fs, all of that event data would be taking up 
kernel memory.

Another model, perhaps, would be to actually write this data out to a real file 
system upon every boot up, so that it wouldn't need to be held in memory.  That 
seems rather ugly and time consuming, though.

-- 
Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
cjashfor@us.ibm.com


