Re: [PATCH -tip v5 00/10] kprobes: Kprobes jump optimization support

From: Frederic Weisbecker <fweisbec@gmail.com>
To: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	lkml <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Jim Keniston <jkenisto@us.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Christoph Hellwig <hch@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Anders Kaseorg <andersk@ksplice.com>,
	Tim Abbott <tabbott@ksplice.com>,
	Andi Kleen <andi@firstfloor.org>, Jason Baron <jbaron@redhat.com>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	systemtap <systemtap@sources.redhat.com>,
	DLE <dle-develop@lists.sourceforge.net>
Subject: Re: [PATCH -tip v5 00/10] kprobes: Kprobes jump optimization support
Date: Tue, 24 Nov 2009 03:03:19 +0100
Message-ID: <20091124020315.GA6221@nowhere>
In-Reply-To: <20091123232115.22071.71558.stgit@dhcp-100-2-132.bos.redhat.com>

On Mon, Nov 23, 2009 at 06:21:16PM -0500, Masami Hiramatsu wrote:
> Hi,
> 
> Here is the kprobes jump optimization patchset v5
> (a.k.a. Djprobe). Since it is not yet ensured that int3-bypassing
> cross-modifying code is safe on all processors, I introduced a
> stop_machine() version of XMC. Using stop_machine() prevents
> us from probing NMI code, but kprobes itself can't probe
> that code anyway, so it's not a problem. This version also includes
> get/put_online_cpus() around optimization to avoid a deadlock
> on text_mutex.
> 
> These patches can be applied on the latest -tip.
> 
> Changes in v5:
> - Use stop_machine() to replace a breakpoint with a jump.
> - get/put_online_cpus() around optimization.
> - Make generic jump patching interface RFC.
> 
> And the kprobe stress test didn't find any regressions from kprobes
> under kvm/x86.
> 
> Jump Optimized Kprobes
> ======================
> o Concept
>  Kprobes uses the int3 breakpoint instruction on x86 for instrumenting
> probes into running kernel. Jump optimization allows kprobes to replace
> breakpoint with a jump instruction for reducing probing overhead drastically.
> 
> o Performance
>  An optimized kprobe is 5 times faster than a kprobe.
> 
>  Optimizing probes improves their performance. Usually, a kprobe hit takes
> 0.5 to 1.0 microseconds to process. On the other hand, a jump optimized
> probe hit takes less than 0.1 microseconds (actual number depends on the
> processor). Here are some sample overheads.
> 
> Intel(R) Xeon(R) CPU E5410  @ 2.33GHz (without debugging options)
> 
>                      x86-32  x86-64
> kprobe:              0.68us  0.91us
> kprobe+booster:      0.27us  0.40us
> kprobe+optimized:    0.06us  0.06us
> 
> kretprobe:           0.95us  1.21us
> kretprobe+booster:   0.53us  0.71us
> kretprobe+optimized: 0.30us  0.35us
> 
> (booster skips single-stepping)
> 
>  Note that jump optimization also consumes more memory, but not much.
> It uses only ~200 bytes per probe, so even if you use ~10,000 probes, it
> consumes just a few MB.
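
To put the quoted figures side by side, here is a small arithmetic sketch. It assumes the ~200 bytes is a per-probe cost (the text does not say so explicitly), and the helper names speedup() and detour_bytes() are made up for illustration.

```c
#include <assert.h>

/*
 * Ratio of two per-hit latencies taken from the table above,
 * e.g. 0.68us (plain kprobe, x86-32) against 0.06us (optimized).
 */
static double speedup(double baseline_us, double optimized_us)
{
    assert(optimized_us > 0.0);
    return baseline_us / optimized_us;
}

/*
 * Extra memory for nr_probes optimized probes, assuming a fixed
 * per-probe cost (the "~200 bytes" figure, taken as per probe).
 */
static unsigned long detour_bytes(unsigned long nr_probes,
                                  unsigned long bytes_per_probe)
{
    return nr_probes * bytes_per_probe;
}
```

With the table's x86-32 numbers, speedup(0.68, 0.06) is a bit over 11 against a plain kprobe and speedup(0.27, 0.06) is 4.5 against the boosted one, and detour_bytes(10000, 200) gives 2,000,000 bytes, which is in line with "a few MB".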


Nice results.

But I have trouble figuring out the difference between the booster version
and the optimized version.


> o Optimization
>   Before preparing optimization, Kprobes inserts original(user-defined)
>  kprobe on the specified address. So, even if the kprobe is not
>  possible to be optimized, it just uses a normal kprobe.
> 
>  - Safety check
>   First, Kprobes gets the address of probed function and checks whether the
>  optimized region, which will be replaced by a jump instruction, does NOT
>  straddle the function boundary, because if the optimized region reaches the
>  next function, its caller causes unexpected results.
>   Next, Kprobes decodes whole body of probed function and checks there is
>  NO indirect jump, NO instruction which will cause exception by checking
>  exception_tables (this will jump to fixup code and fixup code jumps into
>  same function body) and NO near jump which jumps into the optimized region
>  (except the 1st byte of jump), because if some jump instruction jumps
>  into the middle of another instruction, it causes unexpected results too.
>   Kprobes also measures the length of instructions which will be replaced
>  by a jump instruction. Because a jump instruction is longer than 1 byte,
>  it may replace multiple instructions, so Kprobes checks whether those
>  instructions can be executed out-of-line.
> 
>  - Preparing detour code
>   Then, Kprobes prepares "detour" buffer, which contains exception emulating
>  code (push/pop registers, call handler), copied instructions(Kprobes copies
>  instructions which will be replaced by a jump, to the detour buffer), and
>  a jump which jumps back to the original execution path.
> 
>  - Pre-optimization
>   After preparing detour code, Kprobes enqueues the kprobe to the optimizing
>  list and kicks the kprobe-optimizer workqueue to optimize it. To wait for
>  other optimized probes, kprobe-optimizer delays its work.
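
For reference, the two address-range rules in the safety check quoted above can be sketched as plain range tests. This is only an illustration of the rules as described, not the code from the patches; the function names are hypothetical and the instruction decoding that finds the jump targets is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Rule 1: the optimized region [probe_addr, probe_addr + region_len)
 * must not straddle the end of the probed function.
 */
static bool region_within_function(uintptr_t func_start, size_t func_size,
                                   uintptr_t probe_addr, size_t region_len)
{
    if (probe_addr < func_start || region_len > func_size)
        return false;
    /* Written this way to avoid overflow in probe_addr + region_len. */
    return probe_addr - func_start <= func_size - region_len;
}

/*
 * Rule 2: a near jump elsewhere in the function may target the first
 * byte of the region, but not any later byte inside it.
 */
static bool jump_target_is_safe(uintptr_t target, uintptr_t probe_addr,
                                size_t region_len)
{
    assert(region_len > 0);
    if (target > probe_addr && target - probe_addr < region_len)
        return false;
    return true;
}
```

For example, with a function at 0x1000 of size 0x40 and a 5-byte region at 0x1010, a jump to 0x1010 or 0x1015 passes while a jump to 0x1012 is rejected.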


Hmm, so it actually waits for non-optimized probes to finish, right?
The site for which you have built up a detour buffer has an int3 in place
that could have kprobes in progress, and you are waiting for them
to complete before patching with the jump?


>   When the optimized-kprobe is hit before optimization, its handler
>  changes IP(instruction pointer) to copied code and exits. So, the
>  instructions which were copied to detour buffer are executed on the detour
>  buffer.



Hm, why is it playing such a hybrid game there?
If I understand correctly, we have executed int 3, executed the
handler, and then we jump back to the detour buffer?



>  - Optimization
>   Kprobe-optimizer doesn't start instruction-replacing soon, it waits
>  synchronize_sched for safety, because some processors are possible to be
>  interrupted on the instructions which will be replaced by a jump instruction.
>  As you know, synchronize_sched() can ensure that all interruptions which were
>  executed when synchronize_sched() was called are done, only if
>  CONFIG_PREEMPT=n. So, this version supports only the kernel with
>  CONFIG_PREEMPT=n.(*)
>   After that, kprobe-optimizer replaces the 4 bytes right after int3 breakpoint
>  with relative-jump destination, and synchronize caches on all processors. Next,
>  it replaces int3 with relative-jump opcode, and synchronize caches again.


You said you now use stop_machine() to patch the jumps, which looks like the
only safe way to do that. Maybe the above explanation is out of date?
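
To make sure I read the quoted sequence right, here it is as a user-space sketch on a plain byte buffer. The opcode values are the standard x86 ones (0xcc for int3, 0xe9 for a near relative jump); the function names are mine, and the cache synchronization between the two steps is only marked by comments, since that is exactly the part that needs the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define INT3_OPCODE         0xcc
#define RELATIVEJUMP_OPCODE 0xe9
#define RELATIVEJUMP_SIZE   5   /* 1 opcode byte + 4 displacement bytes */

/* The rel32 displacement is measured from the end of the 5-byte jump. */
static int32_t jump_rel32(uint64_t from, uint64_t to)
{
    int64_t diff = (int64_t)(to - (from + RELATIVEJUMP_SIZE));

    assert(diff >= INT32_MIN && diff <= INT32_MAX);
    return (int32_t)diff;
}

/*
 * Step 1: write the displacement into the 4 bytes after the int3,
 *         while the int3 still guards the first byte.
 * Step 2: replace the int3 with the jump opcode.
 */
static void patch_int3_to_jump(uint8_t *site, uint64_t site_addr,
                               uint64_t detour_addr)
{
    uint32_t rel = (uint32_t)jump_rel32(site_addr, detour_addr);
    int i;

    assert(site[0] == INT3_OPCODE);
    for (i = 0; i < 4; i++)     /* x86 stores rel32 little-endian */
        site[1 + i] = (uint8_t)(rel >> (8 * i));
    /* the quoted text synchronizes caches on all processors here */
    site[0] = RELATIVEJUMP_OPCODE;
    /* ...and once more here */
}
```

Patching a site at 0x1000 to jump to a detour buffer at 0x2000 leaves the bytes e9 fb 0f 00 00 in place, since 0x2000 - 0x1005 = 0xffb.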


>  - Unoptimization
>   When unregistering, disabling kprobe or being blocked by other kprobe,
>  an optimized-kprobe will be unoptimized. Before kprobe-optimizer runs,
>  the kprobe is just dequeued from the optimized list. When the optimization
>  has been done, it replaces a jump with int3 breakpoint and original code.
>   First it puts int3 at the first byte of the jump, synchronize caches
>  on all processors, and replaces the 4 bytes right after int3 with the
>  original code.
> 
> (*) This optimization-safety check may be replaced with the stop-machine
>  method that ksplice uses, to support CONFIG_PREEMPT=y kernels.


And now that you use get_cpu()/put_cpu(), I guess this config
option is not required anymore.

I don't understand why the int 3 is still required in the sequence.

- Registration: You first patch the site with int 3, then try the jump
  and use the int 3 as a gate to protect your patching.

- Unregistration: Same in reverse
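
The reverse direction, as I understand it from the description, looks like the sketch below: the int3 goes back first, and only then are the four bytes behind it restored. Again this is a user-space illustration on a byte buffer with a hypothetical function name, and the synchronization step is only a comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE         0xcc
#define RELATIVEJUMP_OPCODE 0xe9

/*
 * Step 1: put int3 back on the first byte of the jump.
 * Step 2: restore the 4 original bytes that the displacement overwrote.
 * The site ends up as a normal int3 probe in front of the original code.
 */
static void unpatch_jump_to_int3(uint8_t *site, const uint8_t orig_tail[4])
{
    assert(site[0] == RELATIVEJUMP_OPCODE);
    site[0] = INT3_OPCODE;
    /* the quoted text synchronizes caches on all processors here */
    memcpy(site + 1, orig_tail, 4);
}
```

In both directions the first byte is always either int3 or the jump opcode while the other four bytes change, which is the "gate" I mean above.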


You are doing live patching while the code might be running concurrently,
which requires very tricky surgery, based on an int 3 gate and RCU as you
describe above.
But do we need to play such a dangerous (and complicated) game?
I mean, it's like training to be a tightrope walker while we have a
bridge just beside us :)
Why not run stop_machine(), first try the jump directly, patch
it in if it's considered safe, and otherwise patch with int 3?

But you said you are using stop_machine() in the v5 changelog,
so I should probably look at the patches first :)

Thanks.
