From: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
To: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>,
	Ananth N Mavinakayanahalli <ananth@in.ibm.com>,
	Sandeepa Prabhu <sandeepa.prabhu@linaro.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	x86@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	fche@redhat.com, mingo@redhat.com, systemtap@sourceware.org,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH -tip v8 00/26] kprobes: introduce NOKPROBE_SYMBOL, bugfixes and scalbility efforts
Date: Fri, 14 Mar 2014 22:11:19 +0900
Message-ID: <5322FFF7.2080606@hitachi.com>
In-Reply-To: <20140305115843.22766.8355.stgit@ltc230.yrl.intra.hitachi.co.jp>

Ping? :)


(2014/03/05 20:58), Masami Hiramatsu wrote:
> Hi,
> Here is the version 8 of NOKPROBE_SYMBOL series.
> 
> This version only changes the enlarged kprobe_table hash size,
> which is now 512 entries instead of 4096. It also decouples the
> kprobes hash size from the kretprobes one, since the two use
> different hash bases.
> 
> 
> Changes
> =======
> From this series, I updated 1 patch:
>  - Enlarge kprobes hash table size to 512 instead of
>    4096.
> I also evaluated the improvement again with the new hash size.
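
For a rough sense of what the table size means for lookups, the sketch below divides a probe count by a bucket count to get the average number of probes per hash bucket. It is only an estimate under stated assumptions: probes are assumed to hash uniformly, `avg_chain_len` is a hypothetical helper name, and the 64-bucket figure for the table before this series is recalled from KPROBE_HASH_BITS being 6 and should be checked against the tree. The 37222 comes from the Result section of this mail.

```shell
#!/bin/sh
# Sketch: average probes per hash bucket, assuming a uniform hash.
# avg_chain_len is a hypothetical helper, not part of the patch series.

# avg_chain_len NPROBES NBUCKETS
avg_chain_len() {
    # awk is used because POSIX sh arithmetic is integer-only.
    awk -v n="$1" -v b="$2" 'BEGIN { printf "%.1f\n", n / b }'
}

# 37222 probes (from the Result section) over different table sizes.
# 64 is an assumption about the pre-series size; 512 and 4096 are the
# sizes discussed in this cover letter.
for buckets in 64 512 4096; do
    printf '%5d buckets: %s probes per bucket\n' \
        "$buckets" "$(avg_chain_len 37222 "$buckets")"
done
```

Under these assumptions the average chain is several hundred entries at 64 buckets and drops to a few dozen at 512, which is the kind of difference the hash enlargement targets.
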
> 
> Blacklist improvements
> ======================
> Currently, kprobes uses the __kprobes annotation and an internal
> symbol-name-based blacklist to prohibit probing on some functions,
> because probing those functions may cause an infinite recursive loop
> through int3/debug exceptions.
> However, the current mechanisms have some problems, especially from
> the viewpoint of code maintenance:
>  - __kprobes is easily mistaken to mean that the function is
>    used by kprobes, although it just means "no kprobe
>    on it".
>  - __kprobes moves functions to a different section,
>    which is not good for cache optimization.
>  - The symbol-name-based solution is not good at all,
>    since a symbol name can easily be changed without
>    anyone noticing.
>  - It doesn't support functions in modules at all.
> 
> Thus, I decided to introduce a new NOKPROBE_SYMBOL macro for building
> an integrated kprobe blacklist.
> 
> The new macro stores the addresses of the given symbols in the
> _kprobe_blacklist section, and the blacklist is initialized from that
> address list at boot time.
> This also applies to modules. When loading a module, kprobes
> automatically finds the blacklisted symbols in the module's
> _kprobe_blacklist section.
> This series also replaces all __kprobes on x86 and generic code with
> NOKPROBE_SYMBOL().
> 
> However, the new blacklist still supports the old-style __kprobes by
> decoding .kprobes.text if it exists, because __kprobes is still used in
> arch-dependent code other than x86.
> 
> Scalability effort
> ==================
> This series fixes not only the "qualitative" bugs that can crash the
> kernel but also the "quantitative" issues with a massive number of
> kprobes. Thus we can now do a stress test, putting kprobes on all
> (non-blacklisted) kernel functions and enabling all of them.
> To set kprobes on all kernel functions, run the script below.
>   ----
>   #!/bin/sh
>   TRACE_DIR=/sys/kernel/debug/tracing/
>   echo > $TRACE_DIR/kprobe_events
>   grep -iw t /proc/kallsyms | tr -d . | \
>     awk 'BEGIN{i=0};{print("p:"$3"_"i, "0x"$1); i++}' | \
>     while read l; do echo $l >> $TRACE_DIR/kprobe_events ; done
>   ----
> Since it doesn't check the blacklist at all, you'll see many write
> errors, but that is no problem :).
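
If the write errors are a nuisance, the probe definitions could be filtered before they are written. The following is a minimal sketch, not part of the series: `gen_probe_defs` is a hypothetical helper, and the skip list is assumed to be a plain-text file with one symbol name per line (how that file is produced, for example from the debugfs blacklist added in patch 19, is left open here).

```shell
#!/bin/sh
# Sketch: generate kprobe_events definitions from kallsyms-style input,
# skipping symbols listed in a skip file.
# Assumption: the skip file is a hypothetical plain-text list,
# one symbol name per line.

# gen_probe_defs SKIP_FILE < kallsyms-style input
# Prints "p:<event> 0x<address>" for each text symbol not in SKIP_FILE.
gen_probe_defs() {
    awk -v skip="$1" '
        BEGIN {
            # Load the skip list into a lookup table.
            while ((getline name < skip) > 0)
                bad[name] = 1
            i = 0
        }
        # Keep text symbols only (type "t" or "T"), as grep -iw t does.
        $2 == "t" || $2 == "T" {
            if ($3 in bad)
                next
            ev = $3
            gsub(/\./, "", ev)   # strip dots, like tr -d . in the original
            print "p:" ev "_" i, "0x" $1
            i++
        }
    '
}
```

It would be used in place of the grep/tr/awk stage of the script above, e.g. `gen_probe_defs skip.txt < /proc/kallsyms`, with the output fed to the same `while read` loop.
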
> 
> Note that a performance issue remains in the kprobe tracer
> if you trace all functions. Since a few ftrace functions are called
> inside the kprobe tracer even if tracing is shut off (tracing_on
> = 0), enabling kprobe events on those functions has a bad
> performance impact (it is safe, but you'll see the system slow down
> and no events recorded, because they are just ignored).
> To find those functions, you can use the third column of
> (debugfs)/tracing/kprobe_profile as below, which tells you the number
> of missed (ignored) hits for each event. Events that have a small
> number in the 2nd column and a large number in the 3rd column
> may cause the slowdown.
>   ----
>   # sort -rnk 3 (debugfs)/tracing/kprobe_profile | head
>   ftrace_cmp_recs_4907                               264950231     33648874543
>   ring_buffer_lock_reserve_5087                              0      4802719935
>   trace_buffer_lock_reserve_5199                             0      4385319303
>   trace_event_buffer_lock_reserve_5200                       0      4379968153
>   ftrace_location_range_4918                          18944015      2407616669
>   bsearch_17098                                       18979815      2407579741
>   ftrace_location_4972                                18927061      2406723128
>   ftrace_int3_handler_1211                            18926980      2406303531
>   poke_int3_handler_199                               18448012      1403516611
>   inat_get_opcode_attribute_16941                            0        12715314
>   ----
> 
> I'd recommend enabling events on such functions after all other
> events are enabled. Then their performance impact is minimized.
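
The "small 2nd column, large 3rd column" check on kprobe_profile can also be done mechanically. This is a sketch only: `find_missed_events` is a hypothetical helper, and the default thresholds (at least 1000000 misses, and more than 10 times as many misses as hits) are arbitrary assumptions chosen for illustration, not values from the series.

```shell
#!/bin/sh
# Sketch: list events from kprobe_profile whose miss count (3rd column)
# is large compared with their hit count (2nd column).
# Thresholds are illustrative assumptions, not from the patch series.

# find_missed_events [MIN_MISSES] [RATIO] < kprobe_profile
find_missed_events() {
    awk -v min_miss="${1:-1000000}" -v ratio="${2:-10}" '
        # Column 1: event name, column 2: hits, column 3: misses.
        $3 + 0 >= min_miss + 0 && $3 + 0 > ($2 + 0) * ratio { print $1 }
    '
}
```

For example, `find_missed_events < /sys/kernel/debug/tracing/kprobe_profile` prints candidate event names, which could then be enabled last as recommended above.
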
> 
> To enable kprobes on all kernel functions, run the script below.
>   ----
>   #!/bin/sh
>   TRACE_DIR=/sys/kernel/debug/tracing
>   echo "Disable tracing to remove tracing overhead"
>   echo 0 > $TRACE_DIR/tracing_on
> 
>   BADS="ftrace_cmp_recs ring_buffer_lock_reserve trace_buffer_lock_reserve trace_event_buffer_lock_reserve ftrace_location_range bsearch ftrace_location ftrace_int3_handler poke_int3_handler inat_get_opcode_attribute"
>   HIDES=
>   for i in $BADS; do HIDES=$HIDES" --hide=$i*"; done
> 
>   SDATE=`date +%s`
>   echo "Enabling trace events: start at $SDATE"
> 
>   cd $TRACE_DIR/events/kprobes/
>   for i in `ls $HIDES` ; do echo 1 > $i/enable; done
>   for j in $BADS; do for i in `ls -d $j*`;do echo 1 > $i/enable; done; done
> 
>   EDATE=`date +%s`
>   TIME=`expr $EDATE - $SDATE`
>   echo "Elapsed time: $TIME"
>   ----
> Note: With systemtap you probably don't need to consider the above bad
> symbols, since it has its own logic to avoid probing itself.
> 
> Result
> ======
> The events on the bad symbols above were likewise enabled after all
> other events were enabled.
> It took 2254 sec (without any intervals) to enable 37222 probes.
> At that point, perf top showed the result below:
>   ----
>   Samples: 10K of event 'cycles', Event count (approx.): 270565996
>   +  16.39%  [kernel]                       [k] native_load_idt
>   +  11.17%  [kernel]                       [k] int3
>   -   7.91%  [kernel]                       [k] 0x00007fffa018e8e0
>    - 0xffffffffa018d8e0
>         59.09% trace_event_buffer_lock_reserve
>            kprobe_trace_func
>            kprobe_dispatcher
>       + 40.45% trace_event_buffer_lock_reserve
>   ----
> 0x00007fffa018e8e0 may be the trampoline buffer of an optimized
> probe on trace_event_buffer_lock_reserve. native_load_idt and int3
> are also called from normal kprobes.
> This means that, at least in my environment, kprobes now passes the
> stress test, and even if we put probes on all available functions
> the system slows down by only about 50%.
> 
> Changes from v7:
>  - [24/26] Enlarge hash table to 512 instead of 4096.
>  - Re-evaluate the performance improvements.
> 
> Changes from v6:
>  - Updated patches on the latest -tip.
>  - [1/26] Add patch: Fix page-fault handling logic on x86 kprobes
>  - [2/26] Add patch: Allow to handle reentered kprobe on singlestepping
>  - [9/26] Add new patch: Call exception_enter after kprobes handled
>  - [12/26] Allow probing fetch functions in trace_uprobe.c.
>  - [24/26] Add new patch: Enlarge kprobes hash table size
>  - [25/26] Add new patch: Kprobe cache for frequently accessed kprobes
>  - [26/26] Add new patch: Skip Ftrace hlist check with ftrace-based kprobe
> 
> Changes from v5:
>  - [2/22] Introduce nokprobe_inline macro
>  - [6/22] Prohibit probing on memset/memcpy
>  - [11/22] Allow probing on text_poke/hw_breakpoint
>  - [12/22] Use nokprobe_inline macro instead of __always_inline
>  - [14/22] Ditto.
>  - [21/22] Remove preempt disable/enable from kprobes/x86
>  - [22/22] Add emergency int3 recovery code
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (26):
>       [BUGFIX]kprobes/x86: Fix page-fault handling logic
>       kprobes/x86: Allow to handle reentered kprobe on singlestepping
>       kprobes: Prohibit probing on .entry.text code
>       kprobes: Introduce NOKPROBE_SYMBOL() macro for blacklist
>       [BUGFIX] kprobes/x86: Prohibit probing on debug_stack_*
>       [BUGFIX] x86: Prohibit probing on native_set_debugreg/load_idt
>       [BUGFIX] x86: Prohibit probing on thunk functions and restore
>       kprobes/x86: Call exception handlers directly from do_int3/do_debug
>       x86: Call exception_enter after kprobes handled
>       kprobes/x86: Allow probe on some kprobe preparation functions
>       kprobes: Allow probe on some kprobe functions
>       ftrace/*probes: Allow probing on some functions
>       x86: Allow kprobes on text_poke/hw_breakpoint
>       x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation
>       kprobes: Use NOKPROBE_SYMBOL macro instead of __kprobes
>       ftrace/kprobes: Use NOKPROBE_SYMBOL macro in ftrace
>       notifier: Use NOKPROBE_SYMBOL macro in notifier
>       sched: Use NOKPROBE_SYMBOL macro in sched
>       kprobes: Show blacklist entries via debugfs
>       kprobes: Support blacklist functions in module
>       kprobes: Use NOKPROBE_SYMBOL() in sample modules
>       kprobes/x86: Use kprobe_blacklist for .kprobes.text and .entry.text
>       kprobes/x86: Remove unneeded preempt_disable/enable in interrupt handlers
>       kprobes: Enlarge hash table to 512 entries
>       kprobes: Introduce kprobe cache to reduce cache misshits
>       ftrace: Introduce FTRACE_OPS_FL_SELF_FILTER for ftrace-kprobe
> 
> 
>  Documentation/kprobes.txt                |   24 +
>  arch/Kconfig                             |   10 
>  arch/x86/include/asm/asm.h               |    7 
>  arch/x86/include/asm/kprobes.h           |    2 
>  arch/x86/include/asm/traps.h             |    2 
>  arch/x86/kernel/alternative.c            |    3 
>  arch/x86/kernel/apic/hw_nmi.c            |    3 
>  arch/x86/kernel/cpu/common.c             |    4 
>  arch/x86/kernel/cpu/perf_event.c         |    3 
>  arch/x86/kernel/cpu/perf_event_amd_ibs.c |    3 
>  arch/x86/kernel/dumpstack.c              |    9 
>  arch/x86/kernel/entry_32.S               |   33 --
>  arch/x86/kernel/entry_64.S               |   20 -
>  arch/x86/kernel/hw_breakpoint.c          |    5 
>  arch/x86/kernel/kprobes/core.c           |  162 ++++----
>  arch/x86/kernel/kprobes/ftrace.c         |   19 +
>  arch/x86/kernel/kprobes/opt.c            |   32 +-
>  arch/x86/kernel/kvm.c                    |    4 
>  arch/x86/kernel/nmi.c                    |   18 +
>  arch/x86/kernel/paravirt.c               |    6 
>  arch/x86/kernel/traps.c                  |   35 +-
>  arch/x86/lib/thunk_32.S                  |    3 
>  arch/x86/lib/thunk_64.S                  |    3 
>  arch/x86/mm/fault.c                      |   28 +
>  include/asm-generic/vmlinux.lds.h        |    9 
>  include/linux/compiler.h                 |    2 
>  include/linux/ftrace.h                   |    3 
>  include/linux/kprobes.h                  |   23 +
>  include/linux/module.h                   |    5 
>  kernel/kprobes.c                         |  607 +++++++++++++++++++++---------
>  kernel/module.c                          |    6 
>  kernel/notifier.c                        |   22 +
>  kernel/sched/core.c                      |    7 
>  kernel/trace/ftrace.c                    |    3 
>  kernel/trace/trace_event_perf.c          |    5 
>  kernel/trace/trace_kprobe.c              |   66 ++-
>  kernel/trace/trace_probe.c               |   65 ++-
>  kernel/trace/trace_probe.h               |   15 -
>  kernel/trace/trace_uprobe.c              |   20 -
>  samples/kprobes/jprobe_example.c         |    1 
>  samples/kprobes/kprobe_example.c         |    3 
>  samples/kprobes/kretprobe_example.c      |    2 
>  42 files changed, 824 insertions(+), 478 deletions(-)
> 
> --
> Masami HIRAMATSU
> IT Management Research Dept. Linux Technology Center
> Hitachi, Ltd., Yokohama Research Laboratory
> E-mail: masami.hiramatsu.pt@hitachi.com
> 
> 


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
