From: Jim Keniston <jkenisto@us.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tom Tromey <tromey@redhat.com>,
Kyle Moffett <kyle@moffetthome.net>,
"Frank Ch. Eigler" <fche@redhat.com>,
Oleg Nesterov <oleg@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Fr??d??ric Weisbecker <fweisbec@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
linux-next@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
utrace-devel@redhat.com, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: linux-next: add utrace tree
Date: Thu, 28 Jan 2010 16:59:28 -0800 [thread overview]
Message-ID: <1264726768.4933.50.camel@localhost.localdomain> (raw)
In-Reply-To: <20100128085502.GA7713@elte.hu>
On Thu, 2010-01-28 at 09:55 +0100, Ingo Molnar wrote:
> * Jim Keniston <jkenisto@us.ibm.com> wrote:
>
> > On Wed, 2010-01-27 at 09:54 +0100, Ingo Molnar wrote:
> > ...
> >
> > Yes, emulating "push %ebp" would buy us a lot of coverage for a lot of apps
> > on x86 (but see below**). [...]
>
...
>
> > [...] Even there, though, we'd have to address the page fault we'd
> > occasionally get when extending the stack vma.
>
> Nope, in the simplest model not even page fault emulation is needed,
> get_user()/put_user() would resolve it automatically. If you either get the
> value with the pagefault resolved, or you get a -EFAULT.
get_user()/put_user() have to be done in a context where you can sleep,
right? Uprobes currently operates in such contexts, but there's some
talk of moving it all to a DIE_INT3 notifier context, where it can't
sleep.
...
>
> > > We could get quite good coverage (and very fast
> > > emulation) for the common case in not too much code - and much of that code
> > > we already have available. No re-trapping,
> >
> > As previously discussed, boosting would also get rid of the single-step trap
> > for most instructions.
>
> Boosting is not in the uprobes patch-set you submitted. Even with it present
> it wont get rid of the initial INT3. So basically _best-case_ (with boosting)
> XOL-uprobes could roughly break even with a pure emulator approach ...
>
> That's a big and fundamental difference.
To be fair, wrt uprobes, emulation and boosting are both in the same
state: pretty well understood, but not yet implemented.
...
> > >
> > > - It's as transparent as it gets - no user-space trampoline or other visible
> > > state that modifies behavior or can be stomped upon by user-space bugs.
> >
> > The XOL vma isn't writable from user space, so I can't think of how it could
> > be clobbered merely by a stray memory reference. [...]
>
> Well there must be some purpose to the instrumentation, there must be some way
> to save data, right? If yes and it's in user-space, that data is clobberable.
One or two others have advocated an approach (which eliminates the
breakpoint trap) where trace data is stored in the uprobe vma, but I
haven't. (In such a case, "XOL vma" would be a misnomer.) I agree that
in such a scenario, the uprobe vma would of necessity be writable by the
app.
>
> If it's in kernel-space then we have to enter the kernel anyway (with similar
> cost patterns to an INT3 entry) - so we just delayed the kernel entry.
This seems to presume that you have to extract trace data from the
kernel every time a probe is hit. In actual practice, you're often just
checking for unusual arg values, incrementing a counter, or some such.
>
...
> > Even if we add emulation, it seems sensible to keep the XOL approach as a
> > backup to handle instructions that aren't yet emulated (and architectures
> > that don't yet have emulators). That way, if you don't probe any unemulated
> > instructions, the XOL vma is never created.
>
> To turn the argument around: an in-kernel emulator is an all-around facility
> to make sure we probe safely and securely, _and_ it is also more portable
> because it's simpler (because more gradual) to implement on a new architecture
> as you dont actually have to copy around instructions (and make sure they work
> in that new place), but have to emulate a limited subset of the instruction
> space, on purely local state.
I understand the desire to start small and simple and grow gradually
from there. We thought we were doing that. Single-stepping out of line
has been in use for close to a decade, maybe more; and boosting (in
kprobes) has been around for a few years as well. To the *probes folks,
it feels pretty solid.
>
...
>
> With an emulator (assuming the emulator is correct) we can execute the precise
> semantics of that instruction in that place - without any side-effects from
> trampolining/replacement.
And of course, our view has been that the best way to achieve the effect
of the instruction, including all desired side-effects, is to execute
the instruction on the CPU.
...
> >
> > **In practice, we've had to probe all sorts of instructions, including FP
> > instructions -- especially where you want to exploit the debug info to get
> > the names, types, and locations of variables and args. For some compilers
> > and architectures, the debug info isn't reliable until the end of the
> > function prologue, at which point you could find any old instruction. Ditto
> > if you want to probe statements within a function.
>
> For those cases, frankly, the right approach is to fix the debug info (or
> introduce a new one) and forget the old crap.
>
> You treat debuginfo as some god-given property, while it's one of the suckiest
> aspects of all of Linux. But we've had that discussion months (and years) ago.
> It has improved in gcc 4.5 so there's some hope.
Yes, there seems to be considerable movement toward better debug info --
which could make statement probing (and not just function-boundary
probing) more and more feasible.
>
...
> Ingo
Thanks.
Jim
next prev parent reply other threads:[~2010-01-29 0:59 UTC|newest]
Thread overview: 125+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20100119211646.GF16096@redhat.com>
2010-01-20 0:12 ` linux-next: add utrace tree Stephen Rothwell
2010-01-20 5:49 ` Ingo Molnar
2010-01-20 6:15 ` Ananth N Mavinakayanahalli
2010-01-20 6:28 ` Ingo Molnar
2010-01-20 6:40 ` Ananth N Mavinakayanahalli
2010-01-20 10:43 ` Frederic Weisbecker
2010-01-20 6:59 ` Stephen Rothwell
2010-01-20 13:24 ` Frank Ch. Eigler
2010-01-20 7:29 ` Ingo Molnar
2010-01-20 14:38 ` Stephen Rothwell
2010-01-21 1:22 ` Roland McGrath
2010-01-22 0:17 ` Stephen Rothwell
2010-01-22 0:30 ` Andrew Morton
2010-01-22 0:31 ` Andrew Morton
2010-01-22 0:51 ` Frank Ch. Eigler
2010-01-22 1:05 ` Andrew Morton
2010-01-22 1:25 ` Frank Ch. Eigler
2010-01-22 1:32 ` Linus Torvalds
2010-01-22 2:22 ` Frank Ch. Eigler
2010-01-22 2:35 ` Linus Torvalds
2010-01-22 20:51 ` Oleg Nesterov
2010-01-23 6:04 ` Ingo Molnar
2010-01-23 12:03 ` Frank Ch. Eigler
2010-01-24 16:36 ` Thomas Gleixner
2010-01-22 1:28 ` Linus Torvalds
2010-01-22 5:21 ` Ananth N Mavinakayanahalli
2010-01-22 13:43 ` Valdis.Kletnieks
2010-01-22 19:39 ` Oleg Nesterov
2010-01-26 13:58 ` Pavel Machek
2010-01-22 18:28 ` Oleg Nesterov
2010-01-22 20:01 ` Frank Ch. Eigler
2010-01-22 20:16 ` Peter Zijlstra
2010-01-22 21:44 ` Frank Ch. Eigler
2010-01-22 21:59 ` Linus Torvalds
2010-01-22 22:13 ` Frank Ch. Eigler
2010-01-23 0:11 ` Linus Torvalds
2010-01-23 0:22 ` Linus Torvalds
2010-01-23 6:20 ` Kyle Moffett
2010-01-23 11:01 ` Alan Cox
2010-01-23 11:51 ` Frank Ch. Eigler
2010-01-23 15:57 ` Arnaldo Carvalho de Melo
2010-01-23 11:23 ` Ingo Molnar
2010-01-23 11:47 ` Frank Ch. Eigler
2010-01-23 19:48 ` tytso
2010-01-24 18:01 ` Frank Ch. Eigler
2010-01-25 1:42 ` Kyle Moffett
2010-01-25 4:55 ` tytso
2010-01-25 16:52 ` Linus Torvalds
2010-01-25 17:02 ` Frank Ch. Eigler
2010-01-25 17:36 ` Linus Torvalds
2010-01-25 17:45 ` Linus Torvalds
2010-01-25 17:54 ` Steven Rostedt
2010-01-25 18:03 ` Alan Cox
2010-01-25 18:12 ` Linus Torvalds
2010-01-25 18:30 ` Steven Rostedt
2010-01-25 18:45 ` Thomas Gleixner
2010-01-25 20:34 ` Ingo Molnar
2010-01-25 20:30 ` Mark Wielaard
2010-01-25 20:42 ` Linus Torvalds
2010-01-26 0:02 ` Renzo Davoli
2010-01-26 0:07 ` Linus Torvalds
2010-01-26 16:08 ` Johannes Stezenbach
2010-01-26 16:28 ` Linus Torvalds
2010-01-26 16:34 ` Christoph Hellwig
2010-01-28 23:53 ` Benjamin Herrenschmidt
2010-01-29 0:21 ` Linus Torvalds
2010-01-25 4:59 ` Ananth N Mavinakayanahalli
2010-01-25 10:13 ` Peter Zijlstra
2010-01-24 5:04 ` Linus Torvalds
2010-01-24 10:25 ` tytso
2010-01-24 13:20 ` Frank Ch. Eigler
2010-01-25 21:05 ` Tom Tromey
2010-01-25 21:41 ` Linus Torvalds
2010-01-26 14:21 ` Ananth N Mavinakayanahalli
2010-01-26 23:20 ` Tom Tromey
2010-01-26 23:37 ` Linus Torvalds
2010-01-27 6:52 ` Peter Zijlstra
2010-01-27 8:54 ` Ingo Molnar
2010-01-28 1:52 ` Jim Keniston
2010-01-28 8:55 ` Ingo Molnar
2010-01-29 0:59 ` Jim Keniston [this message]
2010-01-29 7:39 ` Ingo Molnar
2010-01-29 7:52 ` Ananth N Mavinakayanahalli
2010-01-29 7:55 ` Ananth N Mavinakayanahalli
2010-01-29 9:16 ` Ingo Molnar
2010-01-29 9:11 ` Ingo Molnar
2010-01-29 9:31 ` Ananth N Mavinakayanahalli
2010-01-29 9:51 ` Ingo Molnar
2010-01-29 18:13 ` Frank Ch. Eigler
2010-01-29 4:55 ` Ananth N Mavinakayanahalli
2010-01-29 7:42 ` Ingo Molnar
2010-01-30 17:49 ` Steven Rostedt
2010-01-30 17:59 ` Linus Torvalds
2010-02-02 6:47 ` Masami Hiramatsu
2010-01-27 10:43 ` Linus Torvalds
2010-01-27 10:55 ` Peter Zijlstra
2010-01-27 10:58 ` Peter Zijlstra
2010-01-27 11:04 ` Linus Torvalds
2010-01-27 16:01 ` Frederic Weisbecker
2010-01-27 11:05 ` Ananth N Mavinakayanahalli
2010-01-27 11:08 ` Peter Zijlstra
2010-01-27 11:20 ` Ananth N Mavinakayanahalli
2010-02-08 10:09 ` Avi Kivity
2010-01-27 11:07 ` Srikar Dronamraju
2010-01-27 13:59 ` Steven Rostedt
2010-01-27 17:42 ` H. Peter Anvin
2010-01-27 18:53 ` Steven Rostedt
2010-02-08 6:54 ` Pavel Machek
2010-02-08 9:30 ` H. Peter Anvin
2010-02-08 9:53 ` Arjan van de Ven
2010-01-27 19:18 ` H. Peter Anvin
2010-01-27 0:38 ` Frank Ch. Eigler
2010-01-26 15:00 ` Frank Ch. Eigler
2010-01-26 17:33 ` Andi Kleen
2010-01-26 18:46 ` Linus Torvalds
2010-01-26 21:02 ` Andi Kleen
2010-01-26 21:53 ` Oleg Nesterov
2010-01-26 22:03 ` Andi Kleen
2010-01-26 23:32 ` Oleg Nesterov
2010-01-26 21:30 ` Oleg Nesterov
2010-01-26 23:27 ` Tom Tromey
2010-01-23 8:05 ` Alexey Dobriyan
2010-01-22 17:45 ` Oleg Nesterov
2010-01-20 8:52 ` Peter Zijlstra
2010-01-20 13:01 ` Frank Ch. Eigler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1264726768.4933.50.camel@localhost.localdomain \
--to=jkenisto@us.ibm.com \
--cc=acme@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=fche@redhat.com \
--cc=fweisbec@gmail.com \
--cc=hpa@zytor.com \
--cc=kyle@moffetthome.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-next@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sfr@canb.auug.org.au \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=tromey@redhat.com \
--cc=utrace-devel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).