All of lore.kernel.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
To: Jiri Kosina <jkosina@suse.cz>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>,
	linux-kernel@vger.kernel.org,
	Fengguang Wu <fengguang.wu@intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	x86@kernel.org
Subject: Re: [PATCH -tip/x86/jumplabel] x86: call out into int3 handler directly instead of using notifier
Date: Mon, 22 Jul 2013 20:24:38 +0900	[thread overview]
Message-ID: <51ED1676.4040604@hitachi.com> (raw)
In-Reply-To: <alpine.LNX.2.00.1307221307280.23128@pobox.suse.cz>

(2013/07/22 20:14), Jiri Kosina wrote:
> In fd4363fff3d96 ("x86: Introduce int3 (breakpoint)-based instruction 
> patching"), the mechanism that was introduced for notifying alternatives 
> code from int3 exception handler that and exception occured was 
> die_notifier.
> 
> This is however problematic, as early code might be using jump labels even 
> before the notifier registration has been performed, which will then lead 
> to an oops due to unhandled exception. One of such occurences has been 
> encountered by Fengguang:

Ah, right!

> 
>  int3: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
>  Modules linked in:
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc1-01429-g04bf576 #8
>  task: ffff88000da1b040 ti: ffff88000da1c000 task.ti: ffff88000da1c000
>  RIP: 0010:[<ffffffff811098cc>]  [<ffffffff811098cc>] ttwu_do_wakeup+0x28/0x225
>  RSP: 0000:ffff88000dd03f10  EFLAGS: 00000006
>  RAX: 0000000000000000 RBX: ffff88000dd12940 RCX: ffffffff81769c40
>  RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000001
>  RBP: ffff88000dd03f28 R08: ffffffff8176a8c0 R09: 0000000000000002
>  R10: ffffffff810ff484 R11: ffff88000dd129e8 R12: ffff88000dbc90c0
>  R13: ffff88000dbc90c0 R14: ffff88000da1dfd8 R15: ffff88000da1dfd8
>  FS:  0000000000000000(0000) GS:ffff88000dd00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>  CR2: 00000000ffffffff CR3: 0000000001c88000 CR4: 00000000000006e0
>  Stack:
>   ffff88000dd12940 ffff88000dbc90c0 ffff88000da1dfd8 ffff88000dd03f48
>   ffffffff81109e2b ffff88000dd12940 0000000000000000 ffff88000dd03f68
>   ffffffff81109e9e 0000000000000000 0000000000012940 ffff88000dd03f98
>  Call Trace:
>   <IRQ>
>   [<ffffffff81109e2b>] ttwu_do_activate.constprop.56+0x6d/0x79
>   [<ffffffff81109e9e>] sched_ttwu_pending+0x67/0x84
>   [<ffffffff8110c845>] scheduler_ipi+0x15a/0x2b0
>   [<ffffffff8104dfb4>] smp_reschedule_interrupt+0x38/0x41
>   [<ffffffff8173bf5d>] reschedule_interrupt+0x6d/0x80
>   <EOI>
>   [<ffffffff810ff484>] ? __atomic_notifier_call_chain+0x5/0xc1
>   [<ffffffff8105cc30>] ? native_safe_halt+0xd/0x16
>   [<ffffffff81015f10>] default_idle+0x147/0x282
>   [<ffffffff81017026>] arch_cpu_idle+0x3d/0x5d
>   [<ffffffff81127d6a>] cpu_idle_loop+0x46d/0x5db
>   [<ffffffff81127f5c>] cpu_startup_entry+0x84/0x84
>   [<ffffffff8104f4f8>] start_secondary+0x3c8/0x3d5
>  Code: 5c 5d c3 e8 d7 0f 63 00 55 48 ff 05 1f 5d 1e 01 48 89 e5 41 55 49 89 f5 41 54 53 48 89 fb e8 8d fe ff ff 48
>  01 cc <1f> 44 00 00 31 c0 eb 0c 48 ff 05 05 5d 1e 01 b8 01 00 00 00 48
>  RIP  [<ffffffff811098cc>] ttwu_do_wakeup+0x28/0x225
>   RSP <ffff88000dd03f10>
>  ---[ end trace 0d3288a047152a17 ]---
> 
> Fix this by directly calling poke_int3_handler() from the int3 exception 
> handler (analogically to what ftrace has been doing already), instead of 
> relying on notifier, registration of which might not have yet been 
> finalized by the time of the first trap.

This seems OK for me.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thanks!

> 
> Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> ---
>  arch/x86/include/asm/alternative.h |    2 ++
>  arch/x86/kernel/alternative.c      |   31 ++++++++-----------------------
>  arch/x86/kernel/traps.c            |    4 ++++
>  kernel/kprobes.c                   |    2 +-
>  4 files changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index 3abf8dd..4df44c2 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -5,6 +5,7 @@
>  #include <linux/stddef.h>
>  #include <linux/stringify.h>
>  #include <asm/asm.h>
> +#include <asm/ptrace.h>
>  
>  /*
>   * Alternative inline assembly for SMP.
> @@ -232,6 +233,7 @@ struct text_poke_param {
>  	size_t len;
>  };
>  
> +extern int poke_int3_handler(struct pt_regs *regs);
>  extern void *text_poke(void *addr, const void *opcode, size_t len);
>  extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
>  extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 0ab4936..7f6351f 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -605,26 +605,24 @@ static void do_sync_core(void *info)
>  static bool bp_patching_in_progress;
>  static void *bp_int3_handler, *bp_int3_addr;
>  
> -static int int3_notify(struct notifier_block *self, unsigned long val, void *data)
> +int poke_int3_handler(struct pt_regs *regs)
>  {
> -	struct die_args *args = data;
> -
>  	/* bp_patching_in_progress */
>  	smp_rmb();
>  
>  	if (likely(!bp_patching_in_progress))
> -		return NOTIFY_DONE;
> +		return 0;
>  
> -	/* we are not interested in non-int3 faults and ring > 0 faults */
> -	if (val != DIE_INT3 || !args->regs || user_mode_vm(args->regs)
> -			    || args->regs->ip != (unsigned long)bp_int3_addr)
> -		return NOTIFY_DONE;
> +	if (user_mode_vm(regs) || regs->ip != (unsigned long)bp_int3_addr)
> +		return 0;
>  
>  	/* set up the specified breakpoint handler */
> -	args->regs->ip = (unsigned long) bp_int3_handler;
> +	regs->ip = (unsigned long) bp_int3_handler;
> +
> +	return 1;
>  
> -	return NOTIFY_STOP;
>  }
> +
>  /**
>   * text_poke_bp() -- update instructions on live kernel on SMP
>   * @addr:	address to patch
> @@ -689,19 +687,6 @@ void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler)
>  	return addr;
>  }
>  
> -/* this one needs to run before anything else handles it as a
> - * regular exception */
> -static struct notifier_block int3_nb = {
> -	.priority = 0x7fffffff,
> -	.notifier_call = int3_notify
> -};
> -
> -static int __init int3_init(void)
> -{
> -	return register_die_notifier(&int3_nb);
> -}
> -
> -arch_initcall(int3_init);
>  /*
>   * Cross-modifying kernel text with stop_machine().
>   * This code originally comes from immediate value.
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 772e2a8..2694486 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -58,6 +58,7 @@
>  #include <asm/mce.h>
>  #include <asm/fixmap.h>
>  #include <asm/mach_traps.h>
> +#include <asm/alternative.h>
>  
>  #ifdef CONFIG_X86_64
>  #include <asm/x86_init.h>
> @@ -324,6 +325,9 @@ dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_co
>  	    ftrace_int3_handler(regs))
>  		return;
>  #endif
> +	if (poke_int3_handler(regs))
> +		return;
> +
>  	prev_state = exception_enter();
>  #ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
>  	if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index d6db7bd..bddf3b2 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1709,7 +1709,7 @@ EXPORT_SYMBOL_GPL(unregister_kprobes);
>  
>  static struct notifier_block kprobe_exceptions_nb = {
>  	.notifier_call = kprobe_exceptions_notify,
> -	.priority = 0x7ffffff0 /* High priority, but not first.  */
> +	.priority = 0x7fffffff /* we need to be notified first */
>  };
>  
>  unsigned long __weak arch_deref_entry_point(void *entry)
> 


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



  reply	other threads:[~2013-07-22 11:24 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-20 13:12 [x86] Kernel panic - not syncing: Fatal exception in interrupt Fengguang Wu
2013-07-21  0:23 ` H. Peter Anvin
2013-07-21  8:30   ` Jiri Kosina
2013-07-21 16:21     ` Fengguang Wu
2013-07-21 17:21     ` H. Peter Anvin
2013-07-22 11:18       ` Jiri Kosina
2013-07-22  1:07     ` Fengguang Wu
2013-07-22 11:14 ` [PATCH -tip/x86/jumplabel] x86: call out into int3 handler directly instead of using notifier Jiri Kosina
2013-07-22 11:24   ` Masami Hiramatsu [this message]
2013-07-22 20:53   ` H. Peter Anvin
2013-07-22 21:00     ` Jiri Kosina
2013-07-23  1:24       ` Masami Hiramatsu
2013-07-23  7:45         ` Ingo Molnar
2013-07-23  8:09           ` Jiri Kosina
2013-07-24  3:55             ` [tip:perf/core] kprobes/x86: Call out into INT3 " tip-bot for Jiri Kosina
2013-07-24 14:33               ` H. Peter Anvin
2013-07-29  9:06                 ` Jiri Kosina
2013-07-29  9:14                   ` H. Peter Anvin
2013-07-29  9:18                     ` Jiri Kosina
2013-07-29  9:23                       ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51ED1676.4040604@hitachi.com \
    --to=masami.hiramatsu.pt@hitachi.com \
    --cc=fengguang.wu@intel.com \
    --cc=hpa@linux.intel.com \
    --cc=jkosina@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.