From: Masami Hiramatsu <mhiramat@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
rostedt@goodmis.org, mhiramat@kernel.org, bristot@redhat.com,
jbaron@akamai.com, torvalds@linux-foundation.org,
tglx@linutronix.de, mingo@kernel.org, namit@vmware.com,
hpa@zytor.com, luto@kernel.org, ard.biesheuvel@linaro.org,
jpoimboe@redhat.com, jeyu@kernel.org,
alexei.starovoitov@gmail.com
Subject: Re: [PATCH -v5 00/17] Rewrite x86/ftrace to use text_poke (and more)
Date: Mon, 25 Nov 2019 12:55:34 +0900 [thread overview]
Message-ID: <20191125125534.2aaaccf00c38a9a25dee623a@kernel.org> (raw)
In-Reply-To: <20191111131252.921588318@infradead.org>
Hi Peter,
I got a BUG in poke_int3_handler() on 8 threads qemu x86-64
while testing multi-probe testcase.
I couldn't reproduce it because maybe it is a timing bug.
(Sorry, since the terminal line-wrap is disabled, the last chars are overwritten)
BUG: kernel NULL pointer dereference, add8
[ 1475.147390] #PF: supervisor read access in kernel mode
[ 1475.148380] #PF: error_code(0x0000) - not-present page
[ 1475.149362] PGD 0 P4D 0
[ 1475.149970] Oops: 0000 [#1] PREEMPT SMP PTI
[ 1475.150824] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.4.0-rc6+ #19
[ 1475.151996] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a4
[ 1475.154078] RIP: 0010:poke_int3_handler+0x33/0x70
[ 1475.155008] Code: d2 75 04 31 c0 5b c3 f6 87 88 00 00 00 03 75 f3 48 8b 87 80 00 00 00 48 89 fb b
[ 1475.158308] RSP: 0018:ffffc900001a8c10 EFLAGS: 00010046
[ 1475.159345] RAX: 0000000000000000 RBX: ffffc900001a8c38 RCX: ffffffff81800b57
[ 1475.160653] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff81026ec5
[ 1475.162032] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1475.163448] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1475.164828] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1475.166180] FS: 0000000000000000(0000) GS:ffff88807d980000(0000) knlGS:0000000000000000
[ 1475.167876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1475.169066] CR2: 0000000000000008 CR3: 000000000201e000 CR4: 00000000000006a0
[ 1475.170431] Call Trace:
[ 1475.171064] <IRQ>
[ 1475.171609] do_int3+0xd/0xf0
[ 1475.172302] int3+0x42/0x50
[ 1475.173014] RIP: 0010:sched_clock+0x6/0x10
[ 1475.174061] Code: d3 ea f6 c1 40 48 0f 45 c2 4c 01 c0 65 ff 0d 99 ee fe 7e 74 02 5d c3 e8 f8 ad f
[ 1475.178461] RSP: 0018:ffffc900001a8d10 EFLAGS: 00000016
[ 1475.179475] RAX: 000001577e58ce4d RBX: ffff88807d261c00 RCX: 0000000000000000
[ 1475.180685] RDX: 0000000000000014 RSI: 0000000000000014 RDI: ffffffff82556180
[ 1475.181907] RBP: ffff88807d018600 R08: 000002ac502de60d R09: 0000000000000002
[ 1475.183213] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88807d261ce0
[ 1475.184365] R13: ffff88807d261d20 R14: 00000000000003e8 R15: 0000000000000002
[ 1475.185515] ? sched_clock+0x6/0x10
[ 1475.186173] trace_clock_local+0xc/0x20
[ 1475.186939] ring_buffer_lock_reserve+0x10d/0x410
[ 1475.187797] trace_event_buffer_lock_reserve+0x4a/0xf0
[ 1475.188740] kprobe_trace_func+0x10e/0x370
[ 1475.189489] ? sched_clock+0x6/0x10
[ 1475.190141] kprobe_dispatcher+0x39/0x60
[ 1475.190911] aggr_pre_handler+0x4c/0x90
[ 1475.191602] ? sched_clock+0x5/0x10
[ 1475.192317] kprobe_int3_handler+0x101/0x150
[ 1475.193133] do_int3+0x36/0xf0
[ 1475.193697] int3+0x42/0x50
[ 1475.194241] RIP: 0010:sched_clock+0x6/0x10
[ 1475.194947] Code: d3 ea f6 c1 40 48 0f 45 c2 4c 01 c0 65 ff 0d 99 ee fe 7e 74 02 5d c3 e8 f8 ad f
[ 1475.197612] RSP: 0018:ffffc900001a8fc8 EFLAGS: 00000012
[ 1475.198466] RAX: 000001577e58cc7f RBX: 0000000000000000 RCX: 0000000000000000
[ 1475.199511] RDX: 0000000000000014 RSI: 0000000000000014 RDI: ffffffff82556180
[ 1475.200552] RBP: 0000000000000000 R08: 000002ac502de274 R09: 0000000000000001
[ 1475.201641] R10: 0000000000000000 R11: ffff88807d1a9e58 R12: 0000000000000000
[ 1475.202804] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1475.203932] ? sched_clock+0x6/0x10
[ 1475.204557] ? sched_clock+0x5/0x10
[ 1475.205176] sched_clock_cpu+0xe/0xd0
[ 1475.205830] irq_exit+0xb3/0xc0
[ 1475.206512] call_function_interrupt+0xf/0x20
[ 1475.207252] </IRQ>
[ 1475.207714] RIP: 0010:default_idle+0x23/0x180
[ 1475.208449] Code: ff 90 90 90 90 90 90 41 55 41 54 55 53 e8 35 29 c7 ff 0f 1f 44 00 00 e8 8b af 5
[ 1475.211721] RSP: 0018:ffffc9000008feb8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff03
[ 1475.213312] RAX: ffff88807d1a95c0 RBX: 0000000000000006 RCX: 0000000000000000
[ 1475.214687] RDX: 0000000000000046 RSI: 0000000000000006 RDI: ffff88807d1a95c0
[ 1475.216027] RBP: ffffffff8212da80 R08: 0000000000000001 R09: 0000000000000000
[ 1475.217399] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1475.218867] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1475.220144] ? default_idle+0x15/0x180
[ 1475.220929] default_idle_call+0x26/0x30
[ 1475.221809] do_idle+0x1fe/0x220
[ 1475.222695] ? trace_hardirqs_on+0x2c/0xf0
[ 1475.223550] cpu_startup_entry+0x14/0x20
[ 1475.224390] start_secondary+0x152/0x180
[ 1475.225157] secondary_startup_64+0xa4/0xb0
[ 1475.226011] Modules linked in: [last unloaded: trace_printk]
[ 1475.227028] CR2: 0000000000000008
[ 1475.227636] ---[ end trace 5d3d01238a02c9e8 ]---
[ 1475.228410] RIP: 0010:poke_int3_handler+0x33/0x70
[ 1475.229187] Code: d2 75 04 31 c0 5b c3 f6 87 88 00 00 00 03 75 f3 48 8b 87 80 00 00 00 48 89 fb b
[ 1475.232231] RSP: 0018:ffffc900001a8c10 EFLAGS: 00010046
[ 1475.233180] RAX: 0000000000000000 RBX: ffffc900001a8c38 RCX: ffffffff81800b57
[ 1475.234374] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff81026ec5
[ 1475.236003] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1475.237818] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 1475.240007] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1475.242493] FS: 0000000000000000(0000) GS:ffff88807d980000(0000) knlGS:0000000000000000
[ 1475.245272] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1475.247136] CR2: 0000000000000008 CR3: 000000000201e000 CR4: 00000000000006a0
[ 1475.249053] Kernel panic - not syncing: Fatal exception
[ 1475.250593] Kernel Offset: disabled
[ 1475.251459] ---[ end Kernel panic - not syncing: Fatal exception ]---
$ eu-addr2line -e ./vmlinux poke_int3_handler+0x33
arch/x86/kernel/alternative.c:996:6
-----
int poke_int3_handler(struct pt_regs *regs)
{
struct text_poke_loc *tp;
unsigned char int3 = 0xcc;
void *ip;
/*
* Having observed our INT3 instruction, we now must observe
* bp_patching.nr_entries.
*
* nr_entries != 0 INT3
* WMB RMB
* write INT3 if (nr_entries)
*
* Idem for other elements in bp_patching.
*/
smp_rmb();
if (likely(!bp_patching.nr_entries))
return 0;
if (user_mode(regs))
return 0;
/*
* Discount the sizeof(int3). See text_poke_bp_batch().
*/
ip = (void *) regs->ip - sizeof(int3);
/*
* Skip the binary search if there is a single member in the vector.
*/
if (unlikely(bp_patching.nr_entries > 1)) {
tp = bsearch(ip, bp_patching.vec, bp_patching.nr_entries,
sizeof(struct text_poke_loc),
patch_cmp);
if (!tp)
return 0;
} else {
tp = bp_patching.vec;
if (tp->addr != ip) /* !!!! <-- HERE !!!!! */
return 0;
}
/* set up the specified breakpoint detour */
regs->ip = (unsigned long) tp->detour;
return 1;
}
-----
This seems like caused by bp_patching.vec is NULL but
bp_patching.nr_entries != 0. I think there is a small chance to do
this, because we have no smp_wmb after cleaning bp_patching.nr_entries.
-----
void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
...
on_each_cpu(do_sync_core, NULL, 1);
/*
* sync_core() implies an smp_mb() and orders this store against
* the writing of the new instruction.
*/
bp_patching.vec = NULL;
bp_patching.nr_entries = 0;
}
-----
I think the "on_each_cpu(do_sync_core, NULL, 1);" can sync the pipeline
but doesn't ensure all ongoing int3 handling is done. Thus, we may need a
bigger wait in between bp_patching.nr_entries = 0 and bp_patching.vec = NULL;
e.g.
bp_patching.nr_entries = 0;
synchronize_rcu();
bp_patching.vec = NULL;
What would you think about ?
Thank you,
On Mon, 11 Nov 2019 14:12:52 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
> Ftrace is one of the last W^X violators (after this only KLP is left). These
> patches move it over to the generic text_poke() interface and thereby get rid
> of this oddity.
>
> The first 14 patches are the same as in the -v4 posting. The last 3 patches are
> new.
>
> Will, patch 13, arm/ftrace, is unchanged. This is because this way it preserves
> behaviour, but if you can provide me a tested-by for the simpler variant I can
> drop that in.
>
> Patch 15 reworks ftrace's event_create_dir(), which ran module code before the
> module was finished loading (before we even applied jump_labels and all that).
>
> Patch 16 and 17 address minor review feedback.
>
> Ingo, Alexei wants patch #1 for some BPF stuff, can he get that in a topic branch?
>
--
Masami Hiramatsu <mhiramat@kernel.org>
next prev parent reply other threads:[~2019-11-25 3:55 UTC|newest]
Thread overview: 106+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-11 13:12 [PATCH -v5 00/17] Rewrite x86/ftrace to use text_poke (and more) Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 01/17] x86/alternatives: Teach text_poke_bp() to emulate instructions Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 02/17] x86/alternatives: Update int3_emulate_push() comment Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 03/17] x86/alternatives,jump_label: Provide better text_poke() batching interface Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] x86/alternatives, jump_label: " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 04/17] x86/alternatives: Add and use text_gen_insn() helper Peter Zijlstra
2019-11-12 17:10 ` Steven Rostedt
2019-11-12 22:25 ` Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 05/17] x86/ftrace: Use text_poke() Peter Zijlstra
2019-11-12 18:25 ` Steven Rostedt
2019-11-12 22:24 ` Peter Zijlstra
2019-11-12 22:48 ` Steven Rostedt
2019-11-13 9:01 ` Peter Zijlstra
2019-11-13 14:27 ` Steven Rostedt
2019-11-14 13:18 ` Peter Zijlstra
2019-11-14 13:56 ` Steven Rostedt
2019-11-14 14:05 ` Peter Zijlstra
2019-11-13 8:53 ` Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-11-16 20:46 ` Borislav Petkov
2019-11-18 17:35 ` [PATCH] x86/ftrace: Mark ftrace_modify_code_direct() __ref Borislav Petkov
2019-11-18 17:52 ` Steven Rostedt
2019-11-19 9:55 ` [tip: core/kprobes] " tip-bot2 for Borislav Petkov
2019-12-04 8:33 ` tip-bot2 for Borislav Petkov
2019-12-04 8:33 ` [tip: core/kprobes] x86/ftrace: Use text_poke() tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 06/17] x86/mm: Remove set_kernel_text_r[ow]() Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:12 ` [PATCH -v5 07/17] x86/alternative: Add text_opcode_size() Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 08/17] x86/ftrace: Use text_gen_insn() Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 09/17] x86/alternative: Remove text_poke_loc::len Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 10/17] x86/alternative: Shrink text_poke_loc Peter Zijlstra
2019-11-15 9:43 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 11/17] x86/kprobes: Convert to text-patching.h Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 12/17] x86/kprobes: Fix ordering Peter Zijlstra
2019-11-13 14:31 ` Paul E. McKenney
2019-11-13 15:42 ` Mathieu Desnoyers
2019-11-14 13:53 ` Peter Zijlstra
2019-11-14 15:06 ` Mathieu Desnoyers
2019-11-14 15:13 ` Paul E. McKenney
2019-11-14 15:22 ` Mathieu Desnoyers
2019-11-14 15:28 ` Peter Zijlstra
2019-11-14 15:30 ` Mathieu Desnoyers
2019-11-14 15:42 ` Peter Zijlstra
2019-11-14 15:58 ` Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] x86/kprobes: Fix ordering while text-patching tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 13/17] arm/ftrace: Use __patch_text_real() Peter Zijlstra
2019-11-11 16:47 ` Will Deacon
2019-11-11 17:19 ` Peter Zijlstra
2019-11-11 17:25 ` Peter Zijlstra
2019-11-12 11:29 ` Will Deacon
2019-11-13 9:26 ` [PATCH -v5mkII 13/17] arm/ftrace: Use __patch_text() Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2020-01-22 21:26 ` Dmitry Osipenko
2020-02-07 10:17 ` Peter Zijlstra
2020-02-07 10:26 ` Peter Zijlstra
2020-02-07 11:27 ` Peter Zijlstra
2020-02-07 16:47 ` Dmitry Osipenko
2020-01-08 12:22 ` [PATCH -v5mkII 13/17] " Arnd Bergmann
2020-01-08 14:16 ` Steven Rostedt
2020-01-08 14:22 ` Arnd Bergmann
2019-11-11 13:13 ` [PATCH -v5 14/17] module: Remove set_all_modules_text_*() Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 15/17] ftrace: Rework event_create_dir() Peter Zijlstra
2019-11-14 14:07 ` Steven Rostedt
2019-11-19 16:56 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 16/17] x86/kprobe: Add comments to arch_{,un}optimize_kprobes() Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 13:13 ` [PATCH -v5 17/17] x86/alternative: Use INT3_INSN_SIZE Peter Zijlstra
2019-11-19 16:56 ` [tip: core/kprobes] x86/alternatives: " tip-bot2 for Peter Zijlstra
2019-12-04 8:33 ` tip-bot2 for Peter Zijlstra
2019-11-11 19:47 ` [PATCH -v5 00/17] Rewrite x86/ftrace to use text_poke (and more) Alexei Starovoitov
2019-11-11 20:39 ` Peter Zijlstra
2019-11-11 20:42 ` Peter Zijlstra
2019-11-11 20:56 ` Alexei Starovoitov
2019-11-12 18:26 ` Steven Rostedt
2019-11-25 3:55 ` Masami Hiramatsu [this message]
2019-11-25 6:47 ` Masami Hiramatsu
2019-11-25 17:32 ` Steven Rostedt
2019-11-26 0:11 ` Masami Hiramatsu
2019-11-26 8:58 ` Masami Hiramatsu
2019-11-26 9:58 ` Masami Hiramatsu
2019-11-26 23:48 ` Masami Hiramatsu
[not found] ` <CAADnVQK4twuXzFhD-qLHmCVK0n1h-GDENQLu+4PVV3Hp++R6kQ@mail.gmail.com>
2019-11-27 4:32 ` Alexei Starovoitov
2019-11-27 5:01 ` Alexei Starovoitov
2019-11-27 6:41 ` Masami Hiramatsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191125125534.2aaaccf00c38a9a25dee623a@kernel.org \
--to=mhiramat@kernel.org \
--cc=alexei.starovoitov@gmail.com \
--cc=ard.biesheuvel@linaro.org \
--cc=bristot@redhat.com \
--cc=hpa@zytor.com \
--cc=jbaron@akamai.com \
--cc=jeyu@kernel.org \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@kernel.org \
--cc=namit@vmware.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox