From: Frederic Weisbecker <fweisbec@gmail.com>
To: "K.Prasad" <prasad@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
Alan Stern <stern@rowland.harvard.edu>,
Peter Zijlstra <peterz@infradead.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Jan Kiszka <jan.kiszka@web.de>, Jiri Slaby <jirislaby@gmail.com>,
Li Zefan <lizf@cn.fujitsu.com>, Avi Kivity <avi@redhat.com>,
Paul Mackerras <paulus@samba.org>, Mike Galbraith <efault@gmx.de>,
Masami Hiramatsu <mhiramat@redhat.com>,
Paul Mundt <lethal@linux-sh.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events
Date: Mon, 2 Nov 2009 15:07:14 +0100 [thread overview]
Message-ID: <20091102140710.GD4878@nowhere> (raw)
In-Reply-To: <20091102062550.GA3146@in.ibm.com>
On Mon, Nov 02, 2009 at 11:55:50AM +0530, K.Prasad wrote:
> > I don't get your point. The only possible rollback is when we allocate
> > a wide breakpoint (then one per cpu).
> > If you worry about such races, we can register these breakpoints as
> > being disabled
> > and enable them once we know the allocation succeeded for every cpu.
> >
> >
>
> Not just stray exceptions, as explained before here:
> http://lkml.org/lkml/2009/10/1/76
> - Races between the requests (also leading to temporary failure of
> all CPU requests) presenting an unclear picture about free debug
> registers (making it difficult to predict the need for a retry).
Ok. But say we have to set a wide breakpoint.
We can create a disabled set of per cpu breakpoint and then enable
them once we are sure every cpus can host it (we have already reserved a slot
for each of these while registering).
Then this race is not there anymore.
> > >
> > > - Given that the notion of a per-process context for counters is
> > > well-ingrained into the design of perf-events (even system-wide
> > > counters are sometimes implemented through individual syscalls over
> > > nr_cpus as in builtin-stat.c), it requires huge re-design and
> > > user-space changes.
> >
> >
> > It doesn't require a huge redesign to support wide perf events.
> >
> >
>
> I contest that :-)...and the sheer amount of code movement, re-design
> (including core data structures) in the patchset here:
> http://lkml.org/lkml/2009/10/24/53.
This is about rebasing the hw-breakpoints on top of another profiling
infrastructure.
So the fact we had to do a lot of changes looks fair.
> And all this with a loss of a well-layered, modular code and a
> loss of true system-wide support for bkpt counters!
I don't get your point about the loss of a well-layered and modular
code.
We are reusing an existing profiling infrastructure that looks pretty well
layered and modular to me:
The scheduling of per task registers is centralized in the core and not in the
arch like it was done with the hardware breakpoint api. The remaining arch bits
are only a concern of writing these registers.
It is not sane to hook on arch switch_to(), cpu hotplug helpers, kexec, etc...
to do an ad-hoc scheduling of perf task register whereas we already have
a centralized profiling area that can take these decisions.
So, yes indeed it doesn't support the wide contexts yet.
Let's compare that to the tracing area. What if we hadn't the tracepoints
and every tracers put their own tracing callbacks in the area they want to trace.
We would have a proliferation of ad-hoc tracing functions calls.
But we have the tracepoints: a centralized feature that only requires just one
callback somewhere in the kernel where we want to hook up and in which
every tracers can subscribe.
That's more modular and well-layered.
The problem with the lack of a wide context support with perf events is pretty
the same.
The hw-breakpoint api can implement its ad-hoc one. But it means every
other profiling/debug features will lack it and need to implement their
own.
So why not improving a centralized profiling subsystem instead of implementing
an ad-hoc one for every profiling classes that need it?
> > The non-perf based api is fine for ptrace, kgdb and ftrace uses.
> > But it is too limited for perf use.
> >
> > - It has an ad-hoc context binding (register scheduling) abstraction.
> > Perf is able to manage
> > that already: binding to defined group of processes, cpu, etc...
> >
>
> I don't see what's ad-hoc in the scheduling behaviour of the hw-bkpt
> layer. Hw-breakpoint layer does the following with respect to register
> scheduling:
>
> - User-space breakpoints are always tied to a thread
> (thread_info/task_struct) and are hence
> active when the corresponding thread is scheduled.
This is what is ad-hoc. You need to hook on switch_to, cpu hotplug
and kexec to update the breakpoints registers. And this is something
that would need to be done in every archs. That looks insane considering
the fact we have a core layer that can handle these decisions already.
> - Kernel-space addresses (requests from in-kernel sources) should be
> always active and aren't affected by process context-switches/schedule
> operations. Some of the sophisticated mechanisms for scheduling
> kernel vs user-space breakpoints (such as trapping syscalls to restore
> register context) were pre-empted by the community (as seen here:
> http://lkml.org/lkml/2009/3/11/145).
Sure. And things have evolved since then. We have a centralized
profiling/event susbsystem now.
> Any further abstraction required by the end-user (as in the case of
> perf) can be well-implemented through the powerful breakpoint
> interfaces. For instance - perf-events with its unique requirement
> wherein a kernel-space breakpoint need to be active only when a given
> process is active. Hardware breakpoint layer handles them quite well
> as seen here: http://lkml.org/lkml/2009/10/29/300.
It logically disables/enables the breakpoints but not physically.
Which means a disabled breakpoint still keeps its slot, making
it unavailable for another event, it i required for non-pinned
events.
> > - It doesn't allow non-pinned events, when a breakpoint is disabled
> > (due to context schedule out), it is
> > only virtually disabled, it's slot is not freed.
> >
>
> The <enable><disable>_hw_breakpoint() are designed such. If a user want
> the slot to be freed (which is ill-advised for a requirement here) it
> can invoke (un)register_kernel_hw_breakpoint() instead (would have very
> little overhead for the 1-CPU case without IPIs).
This adds unnecessary overhead. All we want is to update arch registers
when we schedule in/out an event.
We need to be able to free a slot for non-pinned counter because
an undefined number of events must be able to be time shared
in a single slot (in the worst case).
Calling unregister_kernel_breakpoint() each time we schedule out
a non-pinned counter adds unnecessary overhead. And calling
register_kernel_breakpoint() while enabling one is yet
another unnecessary overhead.
You'd need to check the free slots constraints every time for them.
next prev parent reply other threads:[~2009-11-02 14:07 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-24 14:16 [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 1/6] perf/core: Provide a kernel-internal interface to get to performance counters Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 2/6] x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread() Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 3/6] perf/core: Add a callback to perf events Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 4/6] hw-breakpoints: Rewrite the hw-breakpoints layer on top of " Frederic Weisbecker
2009-10-24 16:19 ` Jan Kiszka
2009-10-25 23:31 ` Frederic Weisbecker
2009-10-26 8:17 ` Jan Kiszka
2009-11-01 21:09 ` [GIT PULL v3] hw-breakpoints: Rewrite " Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 1/6] perf/core: Provide a kernel-internal interface to get to performance counters Frederic Weisbecker
2009-11-02 3:46 ` Paul Mackerras
2009-11-02 5:38 ` Arjan van de Ven
2009-11-02 10:47 ` Paul Mackerras
2009-11-02 13:00 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 2/6] x86/hw-breakpoints: Actually flush thread breakpoints in flush_thread() Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 3/6] perf/core: Add a callback to perf events Frederic Weisbecker
2009-11-02 3:49 ` Paul Mackerras
2009-11-02 13:01 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 4/6] hw-breakpoints: Rewrite the hw-breakpoints layer on top of " Frederic Weisbecker
2009-11-01 22:09 ` Jan Kiszka
2009-11-01 22:49 ` Frederic Weisbecker
2009-11-01 23:37 ` Frederic Weisbecker
2009-11-02 7:45 ` Jan Kiszka
2009-11-02 13:04 ` Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 5/6] hw-breakpoints: Arbitrate access to pmu following registers constraints Frederic Weisbecker
2009-11-01 21:09 ` [PATCH 6/6] ksym_tracer: Remove KSYM_SELFTEST_ENTRY Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 5/6] hw-breakpoints: Arbitrate access to pmu following registers constraints Frederic Weisbecker
2009-10-24 14:16 ` [PATCH 6/6] ksym_tracer: Remove KSYM_SELFTEST_ENTRY Frederic Weisbecker
2009-10-24 14:19 ` [GIT PULL v2] hw-breakpoints: Rewrite on top of perf events Frederic Weisbecker
2009-10-26 21:31 ` K.Prasad
2009-10-29 19:07 ` Frederic Weisbecker
2009-11-02 6:25 ` K.Prasad
2009-11-02 14:07 ` Frederic Weisbecker [this message]
2009-11-04 14:14 ` K.Prasad
2009-11-05 11:02 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091102140710.GD4878@nowhere \
--to=fweisbec@gmail.com \
--cc=acme@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=avi@redhat.com \
--cc=efault@gmx.de \
--cc=jan.kiszka@web.de \
--cc=jirislaby@gmail.com \
--cc=lethal@linux-sh.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=mhiramat@redhat.com \
--cc=mingo@elte.hu \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
--cc=prasad@linux.vnet.ibm.com \
--cc=rostedt@goodmis.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox