From: Borislav Petkov <bp@alien8.de>
To: Robert Richter <rric@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Ingo Molnar <mingo@kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Borislav Petkov <bp@suse.de>
Subject: Re: [RFC PATCH 1/3] perf: Add persistent events
Date: Sun, 7 Apr 2013 12:31:56 +0200 [thread overview]
Message-ID: <20130407103156.GC31299@pd.tnic> (raw)
In-Reply-To: <20130406155317.GE11652@rric.localhost>

On Sat, Apr 06, 2013 at 05:53:17PM +0200, Robert Richter wrote:
> Boris,
>
> On 15.03.13 14:06:27, Borislav Petkov wrote:
> > Add the needed pieces for persistent events which make them
> > process-agnostic. Also, make their buffers read-only when mmapping
> > them from userspace.
>
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index 9fa9c622a7f4..4a4ae56195e1 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -270,8 +270,9 @@ struct perf_event_attr {
> >
> > exclude_callchain_kernel : 1, /* exclude kernel callchains */
> > exclude_callchain_user : 1, /* exclude user callchains */
> > + persistent : 1, /* always-on event */
>
> Based on the discussion we had a couple of days ago, I got the
> idea that we do not necessarily need the persistent flag. The purpose
> of the persistent flag is to indicate that an existing event buffer
> that is already enabled should be used. This means the buffer is
> shared by multiple consumers. An enabled event that meets the
> following conditions allows this too:
>
> * buffer mappings are readonly,
> * event is systemwide,
> * struct perf_event_attr is identical.
>
> This means that an event with read-only buffers (PROT_WRITE unset,
> over-write unread data) and the same struct perf_event_attr can
> simply be reused. So, we don't need the flag: when setting up a
> systemwide and read-only event, we just check whether such an event
> is already running. If so, we increment the use count of that event
> and share its buffer; otherwise we create a new event.
>
> This also has the advantage that the syscall i/f does not change. We
> simply setup the event in the same way as we do already. Maybe the
> only thing we need is a sysfs extension to report existing systemwide
> events to userland.

I'm not entirely sure why we'd want to do that. Just for informational
purposes? I mean, if we enable the events on the kernel command line, we
already know which are the persistent ones.

> Now, to have persistent events in the sense that the event has been
> running since system startup, we simply enable an in-kernel counter.
> If userland later sets up the same event, the kernel is aware of the
> already running counter and redirects the new event to the existing
> buffer, and all (persistent) samples in the current buffer become
> available to userland.
>
> The above also has the advantage of sharing systemwide events, which
> saves system resources, e.g. hardware counters. Suppose two processes
> are collecting cpu-clocks: currently two hw counters and buffers are
> used for this, but a single counter would be sufficient.

I'm all up for efficiency. :)

> I currently see the following problem with the approach. If we map to
> an already running counter, there might be samples in the buffer that
> were collected before the syscall was set up. But the profiling
> tool might be interested only in newer samples. So userland must be
> aware of this and be able to drop those samples. This should also be
> backward compatible with the existing syscall i/f.

Hmm, so the perf tool should actually know about persistent events
after all. If we open the persistent event from userspace, we get *all*
samples. If we simply do a normal perf session, we get only the new
samples.

> Another problem is differing buffer sizes. I don't know if it is
> possible to simply resize mmap'ed regions to the biggest buffer
> requested. Though this problem already exists with your code.

Hmm, that needs more code staring I guess.

> Maybe it's worth looking at this approach. I need to review more perf
> code for this.

Yeah, me too.

Thanks.

--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
Thread overview:
2013-03-15 13:06 [RFC PATCH 0/3] Perf persistent events Borislav Petkov
2013-03-15 13:06 ` [RFC PATCH 1/3] perf: Add " Borislav Petkov
2013-04-06 15:53 ` Robert Richter
2013-04-07 10:31 ` Borislav Petkov [this message]
2013-03-15 13:06 ` [RFC PATCH 2/3] perf: Add persistent event facilities Borislav Petkov
2013-03-18 9:13 ` Namhyung Kim
2013-03-18 12:11 ` Borislav Petkov
2013-03-28 18:15 ` Robert Richter
2013-04-03 17:27 ` Borislav Petkov
2013-04-06 16:29 ` Robert Richter
2013-03-15 13:06 ` [RFC PATCH 3/3] MCE: Enable persistent event Borislav Petkov
2013-03-18 8:38 ` [RFC PATCH 0/3] Perf persistent events Namhyung Kim
2013-03-18 8:46 ` Ingo Molnar
2013-03-28 15:52 ` Robert Richter
2013-03-29 14:25 ` Borislav Petkov
2013-03-18 8:40 ` Ingo Molnar
2013-03-18 15:16 ` Borislav Petkov