From: Robert Richter <robert.richter@amd.com>
To: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, mingo@elte.hu, paulus@samba.org,
davem@davemloft.net, fweisbec@gmail.com,
perfmon2-devel@lists.sf.net, eranian@gmail.com
Subject: Re: [RFC] perf_events: how to add Intel LBR support
Date: Wed, 10 Feb 2010 16:46:38 +0100
Message-ID: <20100210154638.GJ24679@erda.amd.com>
In-Reply-To: <bd4cb8901002100331id369b65lc944886f35067fb5@mail.gmail.com>
Stephane,
On 10.02.10 12:31:16, Stephane Eranian wrote:
> I started looking into how to add LBR support to perf_events. We have LBR
> support in perfmon and it has proven very useful for some measurements.
>
> The usage model is that you always couple LBR with sampling on an event.
> You want the LBR state dumped into the sample on overflow. When you resume,
> after an overflow, you clear LBR and you restart it.
>
> One obvious implementation would be to add a new sample type such as
> PERF_SAMPLE_TAKEN_BRANCHES. That would generate a sample with
> a body containing an array of 4x2 up to 16x2 u64 addresses. Internally, the
> hw_perf_event structure would have to store the LBR state so it could be
> saved and restored on context switch in per-thread mode.
>
> There is one problem with this approach. On Nehalem, the LBR can be configured
> to capture only certain types of branches + priv levels. That is about
> 8 config bits + priv levels. Where do we pass those config options?
I have a solution for IBS in mind and am trying to implement it. The
problem is that perf development is currently moving so fast, and the
changes are so intrusive, that I cannot publish a working version
because of merge conflicts. So I need a bit of time to rework my
existing implementation and review your changes.

The basic idea for IBS is to define special pmu events that behave
differently from standard events (on x86 these are performance
counters). The 64-bit configuration value of such an event is simply
marked as special. The pmu detects the type of the model-specific
event and passes its value to the hardware. This way you can pass any
kind of configuration data to a given pmu.

The sample data you get in this case could be either packed into the
standard perf_event sampling format, or if this does not fit, the pmu
may return raw samples in a special format the userland knows about.
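
As a rough illustration of the raw-sample case, here is a minimal
userland sketch. It assumes a size-prefixed blob (a u32 byte count
followed by the payload, similar in spirit to what PERF_SAMPLE_RAW
delivers) whose payload is a plain array of u64 values. The function
name sketch_raw_decode() and the payload layout are hypothetical and
are not taken from any existing implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical decoder for a pmu-specific raw sample.
 * Assumed layout: u32 size (bytes of payload), then size/8 u64 values.
 * Returns the number of values copied to out[], or -1 if the blob is
 * truncated, not a multiple of 8 bytes, or larger than max_out.
 */
static int sketch_raw_decode(const uint8_t *buf, size_t len,
                             uint64_t *out, size_t max_out)
{
	uint32_t size;
	size_t n, i;

	if (len < sizeof(size))
		return -1;
	memcpy(&size, buf, sizeof(size));	/* avoid unaligned access */

	if (size > len - sizeof(size) || size % sizeof(uint64_t) != 0)
		return -1;

	n = size / sizeof(uint64_t);
	if (n > max_out)
		return -1;

	for (i = 0; i < n; i++)
		memcpy(&out[i], buf + sizeof(size) + i * sizeof(uint64_t),
		       sizeof(uint64_t));

	return (int)n;
}
```

The kernel would not need to interpret such a payload. Only the pmu
code that fills it and the userland tool that reads it would have to
agree on its meaning.
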
The interface extension adopts the perfmon2 model-specific pmu setup,
where you can pass config values to the pmu and get performance data
back from it. The implementation is architecture independent and
compatible with the current interface. The only change to the API is
an additional bit in perf_event_attr to mark the raw config value as
model specific.
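
To make the idea concrete, here is a minimal sketch, not the actual
patch. The struct below is a hypothetical stand-in for
perf_event_attr, and the model_spec bit, struct sketch_hw and
sketch_pmu_enable() are names invented for illustration. It only
shows the dispatch: a config value marked as model specific is handed
to the hardware unchanged instead of being programmed as a
performance counter.

```c
#include <stdint.h>

/* Hypothetical stand-in for perf_event_attr, reduced to two fields. */
struct sketch_event_attr {
	uint64_t config;		/* raw 64-bit configuration value */
	unsigned int model_spec : 1;	/* assumed new bit: config is model specific */
};

/* Hypothetical hardware state, kept in memory so it can be inspected. */
struct sketch_hw {
	uint64_t counter_ctl;		/* standard performance counter control */
	uint64_t model_spec_ctl;	/* model-specific control (e.g. IBS) */
};

/* The pmu checks the marker bit and routes the config value accordingly. */
static void sketch_pmu_enable(struct sketch_hw *hw,
			      const struct sketch_event_attr *attr)
{
	if (attr->model_spec)
		hw->model_spec_ctl = attr->config;	/* passed through as is */
	else
		hw->counter_ctl = attr->config;		/* standard event path */
}
```

Since existing users never set the new bit, they would keep taking the
standard path, which is what keeps such a change compatible with the
current interface.
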
> One solution would have to provide as many PERF_SAMPLE bits as the hardware
> OR provide some config field for it in perf_event_attr. All of this
> would have to remain very generic.
>
> An alternative approach is to define a new type of (pseudo)-event, e.g.,
> PERF_TYPE_HW_BRANCH and provide variations very much like this is
> done for the generic cache events. That event would be associated with a
> new fixed-purpose counter (similar to BTS). It would go through scheduling
> via a specific constraint (similar to BTS). The hw_perf_event structure
> would provide the storage area for dumping LBR state.
>
> To sample on LBR with the event approach, the LBR event would have to
> be in the same event group. The sampling event would then simply add
> sample_type = PERF_SAMPLE_GROUP.
>
> The second approach looks more extensible, flexible than the first one. But
> it runs into a major problem with the current perf_event API/ABI and
> implementation. The current assumption is that all events never return more
> than 64-bit worth of data. In the case of LBR, we would need to return way
> more than this.
My implementation just needs one 64-bit config value, but it could be
extended to use more than one config value too.

I will try to send working sample code soon, but I need a reasonably
stable perf tree for this. It would also help if you published patch
sets with many small patches instead of one big change. This reduces
merge and rebase effort.

-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com