linux-next.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@elte.hu>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Tom Tromey <tromey@redhat.com>,
	Kyle Moffett <kyle@moffetthome.net>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Frédéric Weisbecker <fweisbec@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	linux-next@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	utrace-devel@redhat.com, Thomas Gleixner <tglx@linutronix.de>,
	Jim Keniston <jkenisto@us.ibm.com>
Subject: Re: linux-next: add utrace tree
Date: Wed, 27 Jan 2010 09:54:42 +0100
Message-ID: <20100127085442.GA28422@elte.hu>
In-Reply-To: <1264575134.4283.1983.camel@laptop>


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, 2010-01-26 at 15:37 -0800, Linus Torvalds wrote:
> > 
> > On Tue, 26 Jan 2010, Tom Tromey wrote:
> > > 
> > > In non-stop mode (where you can stop one thread but leave the others
> > > running), gdb wants to have the breakpoints always inserted.  So,
> > > something must emulate the displaced instruction.
> > 
> > I'm almost totally uninterested in breakpoints that actually re-write 
> > instructions. It's impossible to do that efficiently and well, especially 
> > in threaded environments.
> > 
> > So if you do instruction rewriting, I can only say "that's your problem".
> 
> Right, so you're going to love uprobes, which does exactly that. The current 
> proposal is overwriting the target instruction with an INT3 and injecting an 
> extra vma into the target process's address space containing the original 
> instruction(s) and possible jumps back to the old code stream.
> 
> I'm all in favor of not doing that extra vma and instead use stack or TLS 
> space, but then people complain about having to make that executable (which 
> is something I don't really mind, x86 had executable everything for very 
> long, and also, its only so when debugging the thing anyway).

I think the best solution for user probes (by far) is to use a simplified 
in-kernel instruction emulator for the few commonly probed instructions. 
(Kprobes already partially decodes x86 instructions to make it safe to apply 
accelerated probes, and there's other decoding logic in the kernel too.)

The design and practical advantages are numerous:

 - People want to probe their function prologues most of the time ...
   a single INT3 there will in most cases just hit the initial stack 
   allocation and that's it. We could get quite good coverage (and very fast 
   emulation) for the common case in not too much code - and much of that code 
   we already have available. No re-trapping, no extra instruction patching 
   and complex maintenance of trampolines.

 - It's as transparent as it gets - no user-space trampoline or other visible
   state that modifies behavior or can be stomped upon by user-space bugs.

 - Lightweight and simple probe insertion: no weird setup sequence needing the 
   stopping of all tasks to install the trampoline. We just add the INT3 and 
   off you go.

 - Emulation is evidently thread-safe, SMP-safe, etc. as it only acts on 
   task local state.

 - The points we can probe are never truly limited as it's all freely
   upscalable: if you cannot probe an instruction you want to probe today,
   extend the emulator. Deny the rest. _All_ versions of uprobes code i've
   seen so far already restrict the probe-compatible instruction set:
   RIP-relative instructions are excluded on 64-bit for example.

 - Emulation has the _least_ semantic side effects as we really execute
   'that' instruction - not some other instruction put elsewhere into a
   special vma or into the process/thread stack, or some special in-kernel
   trampoline, etc.

 - Emulation can be very fast for the common case as well. Nobody will probe
   weird, complex instructions. They will use 'perf probe' to insert probes
   into their functions 90% of the time ...

 - FPU, complex ops and pagefault emulation are not really what i'd expect
   to be necessary for simple probing - but it _can_ be added by people who
   care about it, if they so wish.

Such a scheme would also be _far_ preferable from a maintenance POV, as the 
initial code will be small and we can extend it gradually. All the other 
proposals are complex 'all or nothing' schemes that cannot start simple and 
grow in complexity only where it is needed.
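
To make the idea concrete, here is a minimal user-space sketch in C of what 
such an emulator could look like for three typical function-prologue 
instructions. It is an illustration under stated assumptions, not code from 
any uprobes patch: struct toy_regs, struct toy_stack, emulate_one() and the 
EMU_* codes are all hypothetical names, the task's stack is modelled as a flat 
buffer, the saved rip is assumed to already point at the probed instruction, 
and RFLAGS updates are left out. Anything the emulator does not recognise is 
denied.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the task's saved user registers. */
struct toy_regs {
	uint64_t rip, rsp, rbp;
};

/* Hypothetical flat model of the probed task's stack memory. */
struct toy_stack {
	uint64_t base;		/* lowest address covered by buf */
	uint8_t  buf[256];
};

enum emu_result { EMU_OK = 0, EMU_UNSUPPORTED = -1, EMU_FAULT = -2 };

/* Store 8 bytes at a modelled user address; fail if out of range. */
static int stack_write64(struct toy_stack *s, uint64_t addr, uint64_t val)
{
	if (addr < s->base || addr - s->base > sizeof(s->buf) - 8)
		return -1;
	memcpy(&s->buf[addr - s->base], &val, 8);
	return 0;
}

/*
 * Emulate the original instruction that the INT3 replaced. 'insn' holds
 * the saved original bytes. Only task-local state (regs, stack) is touched.
 */
static enum emu_result emulate_one(const uint8_t *insn, size_t len,
				   struct toy_regs *r, struct toy_stack *s)
{
	/* 55: push %rbp */
	if (len >= 1 && insn[0] == 0x55) {
		uint64_t sp = r->rsp - 8;

		if (stack_write64(s, sp, r->rbp))
			return EMU_FAULT;
		r->rsp = sp;
		r->rip += 1;
		return EMU_OK;
	}

	/* 48 89 e5: mov %rsp,%rbp */
	if (len >= 3 && insn[0] == 0x48 && insn[1] == 0x89 && insn[2] == 0xe5) {
		r->rbp = r->rsp;
		r->rip += 3;
		return EMU_OK;
	}

	/* 48 83 ec ib: sub $imm8,%rsp (flags are NOT updated in this sketch) */
	if (len >= 4 && insn[0] == 0x48 && insn[1] == 0x83 && insn[2] == 0xec) {
		r->rsp -= (uint64_t)(int64_t)(int8_t)insn[3];
		r->rip += 4;
		return EMU_OK;
	}

	/* Everything else is denied; the probe would be rejected. */
	return EMU_UNSUPPORTED;
}
```

In this model the probe is accepted only if emulate_one() can handle the 
saved bytes, which is also where the instruction set would be widened later: 
add another case, keep denying the rest.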

Thanks,

	Ingo
