From: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Kyle Moffett <kyle@moffetthome.net>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Fr??d??ric Weisbecker <fweisbec@gmail.com>,
Oleg Nesterov <oleg@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
LKML <linux-kernel@vger.kernel.org>,
Tom Tromey <tromey@redhat.com>,
"Frank Ch. Eigler" <fche@redhat.com>,
linux-next@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
utrace-devel@redhat.com, Thomas Gleixner <tglx@linutronix.de>,
avi@redhat.com
Subject: Re: linux-next: add utrace tree
Date: Wed, 27 Jan 2010 16:35:55 +0530 [thread overview]
Message-ID: <20100127110555.GB1842@in.ibm.com> (raw)
In-Reply-To: <1264589716.4283.2006.camel@laptop>
On Wed, Jan 27, 2010 at 11:55:16AM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-27 at 02:43 -0800, Linus Torvalds wrote:
> >
> > On Wed, 27 Jan 2010, Peter Zijlstra wrote:
> > >
> > > Right, so you're going to love uprobes, which does exactly that. The
> > > current proposal is overwriting the target instruction with an INT3 and
> > > injecting an extra vma into the target process's address space
> > > containing the original instruction(s) and possible jumps back to the
> > > old code stream.
> >
> > Just out of interest, how does it handle the threading issue?
> >
> > Last I saw, at least some CPU people were _very_ nervous about overwriting
> > instructions if another CPU might be just about to execute them.
> >
> > Even the "overwrite only the first byte with 'int3'" made them go "umm, I
> > need to talk to some core CPU people to see if that's ok". They mumble
> > about possible CPU errata, I$ coherency, instruction retry etc.
> >
> > I realize kprobes does this very thing, but kprobes is esoteric stuff and
> > doesn't have much choice. In user space, you _could_ do the modification
> > on a different physical page and then just switch the page table entry
> > instead, and not get into the whole D$/I$ coherency thing at all.
>
> Right, so there's two aspects:
>
> 1) concurrency when inserting the probe
> 2) concurrency when hitting the probe
>
> 1) used to be dealt with by using utrace to stop all threads in the
> process and then writing the instruction. I suggested to CoW the page,
> modify the instruction, set the pagetable and flush tlbs at full speed
> -- the very thing you suggest here.
>
> 2) so traditionally (and the intel arch manual describes this) is to
> replace the instruction, single step it, and write the probe back. This
> is racy for multi-threading. The current uprobes stuff solves this by
> doing single-step-out-of-line (XOL).
>
> XOL injects a new vma into the target process and puts the old
> instruction there, then it single steps on the new location, leaving the
> original site with INT3.
>
> This doesn't work for things like RIP relative instructions, so uprobes
> considers them un-probable.
Probing RIP-relative instructions work just fine; there are fixups that
take care of it.
> Also, I myself really object to inserting a vma in a running process,
> its like a land-lord, sure he has the key but he won't come in an poke
> through your things.
>
> The alternative is to place the instruction in TLS or stack space, since
> each thread can only have a single trap at a time, you only need space
> for 1 instruction (plus a possible jump out to the original site). There
> is the 'problem' of marking the TLS/stack executable when being probed.
>
> Then there is the whole emulation angle, the uprobes people basically
> say its too much effort to write a x86 emulator.
We don't need to write one. I don't know how easy it is to make the kvm
emulator less kvm-centric (vcpus, kvm_context, etc). Avi?
Ananth
next prev parent reply other threads:[~2010-01-27 11:06 UTC|newest]
Thread overview: 125+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20100119211646.GF16096@redhat.com>
2010-01-20 0:12 ` linux-next: add utrace tree Stephen Rothwell
2010-01-20 5:49 ` Ingo Molnar
2010-01-20 6:15 ` Ananth N Mavinakayanahalli
2010-01-20 6:28 ` Ingo Molnar
2010-01-20 6:40 ` Ananth N Mavinakayanahalli
2010-01-20 10:43 ` Frederic Weisbecker
2010-01-20 6:59 ` Stephen Rothwell
2010-01-20 13:24 ` Frank Ch. Eigler
2010-01-20 7:29 ` Ingo Molnar
2010-01-20 14:38 ` Stephen Rothwell
2010-01-21 1:22 ` Roland McGrath
2010-01-22 0:17 ` Stephen Rothwell
2010-01-22 0:30 ` Andrew Morton
2010-01-22 0:31 ` Andrew Morton
2010-01-22 0:51 ` Frank Ch. Eigler
2010-01-22 1:05 ` Andrew Morton
2010-01-22 1:25 ` Frank Ch. Eigler
2010-01-22 1:32 ` Linus Torvalds
2010-01-22 2:22 ` Frank Ch. Eigler
2010-01-22 2:35 ` Linus Torvalds
2010-01-22 20:51 ` Oleg Nesterov
2010-01-23 6:04 ` Ingo Molnar
2010-01-23 12:03 ` Frank Ch. Eigler
2010-01-24 16:36 ` Thomas Gleixner
2010-01-22 1:28 ` Linus Torvalds
2010-01-22 5:21 ` Ananth N Mavinakayanahalli
2010-01-22 13:43 ` Valdis.Kletnieks
2010-01-22 19:39 ` Oleg Nesterov
2010-01-26 13:58 ` Pavel Machek
2010-01-22 18:28 ` Oleg Nesterov
2010-01-22 20:01 ` Frank Ch. Eigler
2010-01-22 20:16 ` Peter Zijlstra
2010-01-22 21:44 ` Frank Ch. Eigler
2010-01-22 21:59 ` Linus Torvalds
2010-01-22 22:13 ` Frank Ch. Eigler
2010-01-23 0:11 ` Linus Torvalds
2010-01-23 0:22 ` Linus Torvalds
2010-01-23 6:20 ` Kyle Moffett
2010-01-23 11:01 ` Alan Cox
2010-01-23 11:51 ` Frank Ch. Eigler
2010-01-23 15:57 ` Arnaldo Carvalho de Melo
2010-01-23 11:23 ` Ingo Molnar
2010-01-23 11:47 ` Frank Ch. Eigler
2010-01-23 19:48 ` tytso
2010-01-24 18:01 ` Frank Ch. Eigler
2010-01-25 1:42 ` Kyle Moffett
2010-01-25 4:55 ` tytso
2010-01-25 16:52 ` Linus Torvalds
2010-01-25 17:02 ` Frank Ch. Eigler
2010-01-25 17:36 ` Linus Torvalds
2010-01-25 17:45 ` Linus Torvalds
2010-01-25 17:54 ` Steven Rostedt
2010-01-25 18:03 ` Alan Cox
2010-01-25 18:12 ` Linus Torvalds
2010-01-25 18:30 ` Steven Rostedt
2010-01-25 18:45 ` Thomas Gleixner
2010-01-25 20:34 ` Ingo Molnar
2010-01-25 20:30 ` Mark Wielaard
2010-01-25 20:42 ` Linus Torvalds
2010-01-26 0:02 ` Renzo Davoli
2010-01-26 0:07 ` Linus Torvalds
2010-01-26 16:08 ` Johannes Stezenbach
2010-01-26 16:28 ` Linus Torvalds
2010-01-26 16:34 ` Christoph Hellwig
2010-01-28 23:53 ` Benjamin Herrenschmidt
2010-01-29 0:21 ` Linus Torvalds
2010-01-25 4:59 ` Ananth N Mavinakayanahalli
2010-01-25 10:13 ` Peter Zijlstra
2010-01-24 5:04 ` Linus Torvalds
2010-01-24 10:25 ` tytso
2010-01-24 13:20 ` Frank Ch. Eigler
2010-01-25 21:05 ` Tom Tromey
2010-01-25 21:41 ` Linus Torvalds
2010-01-26 14:21 ` Ananth N Mavinakayanahalli
2010-01-26 23:20 ` Tom Tromey
2010-01-26 23:37 ` Linus Torvalds
2010-01-27 6:52 ` Peter Zijlstra
2010-01-27 8:54 ` Ingo Molnar
2010-01-28 1:52 ` Jim Keniston
2010-01-28 8:55 ` Ingo Molnar
2010-01-29 0:59 ` Jim Keniston
2010-01-29 7:39 ` Ingo Molnar
2010-01-29 7:52 ` Ananth N Mavinakayanahalli
2010-01-29 7:55 ` Ananth N Mavinakayanahalli
2010-01-29 9:16 ` Ingo Molnar
2010-01-29 9:11 ` Ingo Molnar
2010-01-29 9:31 ` Ananth N Mavinakayanahalli
2010-01-29 9:51 ` Ingo Molnar
2010-01-29 18:13 ` Frank Ch. Eigler
2010-01-29 4:55 ` Ananth N Mavinakayanahalli
2010-01-29 7:42 ` Ingo Molnar
2010-01-30 17:49 ` Steven Rostedt
2010-01-30 17:59 ` Linus Torvalds
2010-02-02 6:47 ` Masami Hiramatsu
2010-01-27 10:43 ` Linus Torvalds
2010-01-27 10:55 ` Peter Zijlstra
2010-01-27 10:58 ` Peter Zijlstra
2010-01-27 11:04 ` Linus Torvalds
2010-01-27 16:01 ` Frederic Weisbecker
2010-01-27 11:05 ` Ananth N Mavinakayanahalli [this message]
2010-01-27 11:08 ` Peter Zijlstra
2010-01-27 11:20 ` Ananth N Mavinakayanahalli
2010-02-08 10:09 ` Avi Kivity
2010-01-27 11:07 ` Srikar Dronamraju
2010-01-27 13:59 ` Steven Rostedt
2010-01-27 17:42 ` H. Peter Anvin
2010-01-27 18:53 ` Steven Rostedt
2010-02-08 6:54 ` Pavel Machek
2010-02-08 9:30 ` H. Peter Anvin
2010-02-08 9:53 ` Arjan van de Ven
2010-01-27 19:18 ` H. Peter Anvin
2010-01-27 0:38 ` Frank Ch. Eigler
2010-01-26 15:00 ` Frank Ch. Eigler
2010-01-26 17:33 ` Andi Kleen
2010-01-26 18:46 ` Linus Torvalds
2010-01-26 21:02 ` Andi Kleen
2010-01-26 21:53 ` Oleg Nesterov
2010-01-26 22:03 ` Andi Kleen
2010-01-26 23:32 ` Oleg Nesterov
2010-01-26 21:30 ` Oleg Nesterov
2010-01-26 23:27 ` Tom Tromey
2010-01-23 8:05 ` Alexey Dobriyan
2010-01-22 17:45 ` Oleg Nesterov
2010-01-20 8:52 ` Peter Zijlstra
2010-01-20 13:01 ` Frank Ch. Eigler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100127110555.GB1842@in.ibm.com \
--to=ananth@in.ibm.com \
--cc=acme@redhat.com \
--cc=avi@redhat.com \
--cc=fche@redhat.com \
--cc=fweisbec@gmail.com \
--cc=hpa@zytor.com \
--cc=kyle@moffetthome.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-next@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sfr@canb.auug.org.au \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=tromey@redhat.com \
--cc=utrace-devel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).