From: "K.Prasad" <prasad@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
Alan Stern <stern@rowland.harvard.edu>,
Peter Zijlstra <peterz@infradead.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Jan Kiszka <jan.kiszka@web.de>, Jiri Slaby <jirislaby@gmail.com>,
Li Zefan <lizf@cn.fujitsu.com>, Avi Kivity <avi@redhat.com>,
Paul Mackerras <paulus@samba.org>, Mike Galbraith <efault@gmx.de>,
Masami Hiramatsu <mhiramat@redhat.com>,
Paul Mundt <lethal@linux-sh.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events
Date: Mon, 2 Nov 2009 11:55:50 +0530 [thread overview]
Message-ID: <20091102062550.GA3146@in.ibm.com> (raw)
In-Reply-To: <c62985530910291207n22b83357s51d9ba88f5f09944@mail.gmail.com>
On Thu, Oct 29, 2009 at 08:07:15PM +0100, Frederic Weisbecker wrote:
> 2009/10/26 K.Prasad <prasad@linux.vnet.ibm.com>:
> > Outside the specific comments about the implementation here, I think
> > the patchset begets a larger question about hw-breakpoint layer's
> > integration with perf-events.
> >
> > Upon being a witness to the proposed changes and after some exploration
> > of perf_events' functionality, I'm afraid that hw-breakpoint integration
> > with perf doesn't benefit the former as much as originally wished to be
> > (http://lkml.org/lkml/2009/8/26/149).
> >
> > Some of the prevalent concerns (which have been raised in different
> > threads earlier) are:
> >
> > - While kernel-space breakpoints need to reside on every processor
> > (irrespective of the process in user-space), perf-events' notion of a
> > counter is always linked to a process context (although there could be
> > workarounds by making it 'pinned', etc).
>
>
> No. A counter (let's talk about an event profiling instance now) is not
> always attached to a single process.
> It is attached to a context. Such contexts are defined by perf as gathering
> a group of tasks or it can be a whole cpu.
>
Okay.
> The breakpoint API only supports two kind of contexts: one task, or every
> cpus (or per cpu after your last patchset).
>
Yes, and please see the replies to your concerns below.
> That said, perf events can be enhanced to support the context of a wide counter.
>
>
> >
> > - HW Breakpoints register allocation mechanism is 'greedy', which in my
> > opinion is more suitable for allocating a finite and contended
> > resource such as debug register while that of perf-events can give
> > rise to roll-backs (with side-effects such as stray exceptions and
> > race conditions).
>
>
> I don't get your point. The only possible rollback is when we allocate
> a wide breakpoint (then one per cpu).
> If you worry about such races, we can register these breakpoints as
> being disabled
> and enable them once we know the allocation succeeded for every cpu.
>
>
Not just stray exceptions, as explained before here:
http://lkml.org/lkml/2009/10/1/76
- Races between the requests (also leading to temporary failure of
all CPU requests) presenting an unclear picture about free debug
registers (making it difficult to predict the need for a retry).
> >
> > - Given that the notion of a per-process context for counters is
> > well-ingrained into the design of perf-events (even system-wide
> > counters are sometimes implemented through individual syscalls over
> > nr_cpus as in builtin-stat.c), it requires huge re-design and
> > user-space changes.
>
>
> It doesn't require a huge redesign to support wide perf events.
>
>
I contest that :-)...and the sheer amount of code movement, re-design
(including core data structures) in the patchset here:
http://lkml.org/lkml/2009/10/24/53.
And all this with a loss of a well-layered, modular code and a
loss of true system-wide support for bkpt counters!
> > Trying to scoop out the hw-breakpoint layer off its book-keeping/register
> > allocation features only to replace with that of perf-events leads to a
> > poor retrofit. On the other hand, an implementation to enable perf to use
> > hw-breakpoint layer (and its APIs) to profile memory accesses over
> > kernel-space variables (in the context of a process) is very elegant,
> > modular and fits cleanly within the frame-work of the perf-events as a
> > new perf-type (refer http://lkml.org/lkml/2009/10/26/467). A working
> > patchset (under development and containing bugs) is posted for RFC here:
> > http://lkml.org/lkml/2009/10/26/461
>
>
> The non-perf based api is fine for ptrace, kgdb and ftrace uses.
> But it is too limited for perf use.
>
> - It has an ad-hoc context binding (register scheduling) abstraction.
> Perf is able to manage
> that already: binding to defined group of processes, cpu, etc...
>
I don't see what's ad-hoc in the scheduling behaviour of the hw-bkpt
layer. Hw-breakpoint layer does the following with respect to register
scheduling:
- User-space breakpoints are always tied to a thread
(thread_info/task_struct) and are hence
active when the corresponding thread is scheduled.
- Kernel-space addresses (requests from in-kernel sources) should be
always active and aren't affected by process context-switches/schedule
operations. Some of the sophisticated mechanisms for scheduling
kernel vs user-space breakpoints (such as trapping syscalls to restore
register context) were pre-empted by the community (as seen here:
http://lkml.org/lkml/2009/3/11/145).
Any further abstraction required by the end-user (as in the case of
perf) can be well-implemented through the powerful breakpoint
interfaces. For instance - perf-events with its unique requirement
wherein a kernel-space breakpoint need to be active only when a given
process is active. Hardware breakpoint layer handles them quite well
as seen here: http://lkml.org/lkml/2009/10/29/300.
> - It doesn't allow non-pinned events, when a breakpoint is disabled
> (due to context schedule out), it is
> only virtually disabled, it's slot is not freed.
>
The <enable><disable>_hw_breakpoint() are designed such. If a user want
the slot to be freed (which is ill-advised for a requirement here) it
can invoke (un)register_kernel_hw_breakpoint() instead (would have very
little overhead for the 1-CPU case without IPIs).
> Basically, the breakpoints are performance monitoring and debug
> events. Something
> that perf can already handle.
>
> The current breakpoint API does all that in an ad-hoc way
> (debug register scheduling when cpu get up/down, when we context
> switch, etc...).
> It is also not powerful enough to support non-pinned events.
>
> The only downside I can see in perf events: it does not support wide
> system contexts.
> I don't think it requires a huge redesign. But instead of continuing
> this ad-hoc context-handling
> to cover this hole in perf, why not enhance perf so that it can cover that?
The advantages of having perf-events to use hw-breakpoint layer is
explained here and in many of my previous emails. It entails no loss of
functionality for either perf-events of hw-breakpoints, while allowing
users to harness the power of both.
Thanks,
K.Prasad
next prev parent reply other threads:[~2009-11-02 6:26 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-24 14:16 [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 1/6] perf/core: Provide a kernel-internal interface to get to performance counters Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 2/6] x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread() Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 3/6] perf/core: Add a callback to perf events Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 4/6] hw-breakpoints: Rewrite the hw-breakpoints layer on top of " Frederic Weisbecker
2009-10-24 16:19 ` Jan Kiszka
2009-10-25 23:31 ` Frederic Weisbecker
2009-10-26 8:17 ` Jan Kiszka
2009-11-01 21:09 ` [GIT PULL v3] hw-breakpoints: Rewrite " Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 1/6] perf/core: Provide a kernel-internal interface to get to performance counters Frederic Weisbecker
2009-11-02 3:46 ` Paul Mackerras
2009-11-02 5:38 ` Arjan van de Ven
2009-11-02 10:47 ` Paul Mackerras
2009-11-02 13:00 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 2/6] x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread() Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 3/6] perf/core: Add a callback to perf events Frederic Weisbecker
2009-11-02 3:49 ` Paul Mackerras
2009-11-02 13:01 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 4/6] hw-breakpoints: Rewrite the hw-breakpoints layer on top of " Frederic Weisbecker
2009-11-01 22:09 ` Jan Kiszka
2009-11-01 22:49 ` Frederic Weisbecker
2009-11-01 23:37 ` Frederic Weisbecker
2009-11-02 7:45 ` Jan Kiszka
2009-11-02 13:04 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 5/6] hw-breakpoints: Arbitrate access to pmu following registers constraints Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 6/6] ksym_tracer: Remove KSYM_SELFTEST_ENTRY Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 5/6] hw-breakpoints: Arbitrate access to pmu following registers constraints Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 6/6] ksym_tracer: Remove KSYM_SELFTEST_ENTRY Frederic Weisbecker
2009-10-24 14:19 ` [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events Frederic Weisbecker
2009-10-26 21:31 ` K.Prasad
2009-10-29 19:07 ` Frederic Weisbecker
2009-11-02 6:25 ` K.Prasad [this message]
2009-11-02 14:07 ` Frederic Weisbecker
2009-11-04 14:14 ` K.Prasad
2009-11-05 11:02 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091102062550.GA3146@in.ibm.com \
--to=prasad@linux.vnet.ibm.com \
--cc=acme@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=avi@redhat.com \
--cc=efault@gmx.de \
--cc=fweisbec@gmail.com \
--cc=jan.kiszka@web.de \
--cc=jirislaby@gmail.com \
--cc=lethal@linux-sh.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=mhiramat@redhat.com \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.