From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Sasha Levin <sasha.levin@oracle.com>
Cc: josh@joshtriplett.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: rcu: NULL ptr deref on boot
Date: Fri, 27 Jun 2014 10:13:59 -0700
Message-ID: <20140627171359.GG4603@linux.vnet.ibm.com>
In-Reply-To: <53ADA04B.3010707@oracle.com>

On Fri, Jun 27, 2014 at 12:48:11PM -0400, Sasha Levin wrote:
> Hi Paul,
> 
> I've noticed the following on boot with the latest -next kernel:
> 
> [    0.000000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives+0x1e/0x20()
> [    0.000000] You're using static_cpu_has before alternatives have run!
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc2-next-20140627-sasha-00023-g5d10814-dirty #734
> [    0.000000]  0000000000000009 ffffffff9d003c48 ffffffff9b525423 0000000000000002
> [    0.000000]  ffffffff9d003c98 ffffffff9d003c88 ffffffff98168aec ffffffff9d003d58
> [    0.000000]  0000000000000000 ffffffff9d003e78 0000000000000000 0000000000000002
> [    0.000000] Call Trace:
> [    0.000000] dump_stack (lib/dump_stack.c:52)
> [    0.000000] warn_slowpath_common (kernel/panic.c:431)
> [    0.000000] warn_slowpath_fmt (kernel/panic.c:446)
> [    0.000000] ? irq_return (arch/x86/kernel/entry_64.S:842)
> [    0.000000] warn_pre_alternatives (arch/x86/kernel/cpu/common.c:1440)
> [    0.000000] __do_page_fault (./arch/x86/include/asm/cpufeature.h:423 arch/x86/mm/fault.c:1022 arch/x86/mm/fault.c:1112)
> [    0.000000] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> [    0.000000] ? trace_hardirqs_off (kernel/locking/lockdep.c:2645)
> [    0.000000] ? __slab_alloc (mm/slub.c:2364 (discriminator 1))
> [    0.000000] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [    0.000000] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> [    0.000000] ? error_sti (arch/x86/kernel/entry_64.S:1419)
> [    0.000000] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
> [    0.000000] do_async_page_fault (arch/x86/kernel/kvm.c:264)
> [    0.000000] async_page_fault (arch/x86/kernel/entry_64.S:1322)
> [    0.000000] ? tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] ? tick_nohz_init (include/linux/bitmap.h:164 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] start_kernel (init/main.c:581)
> [    0.000000] ? set_init_arg (init/main.c:281)
> [    0.000000] ? early_idt_handlers (arch/x86/kernel/head_64.S:340)
> [    0.000000] x86_64_start_reservations (arch/x86/kernel/head64.c:194)
> [    0.000000] x86_64_start_kernel (arch/x86/kernel/head64.c:183)
> [    0.000000] ---[ end trace 4d5ff9f2f68c4233 ]---
> [    0.000000] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [    0.000000] IP: tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] PGD 0
> [    0.000000] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W      3.16.0-rc2-next-20140627-sasha-00023-g5d10814-dirty #734
> [    0.000000] task: ffffffff9d0354c0 ti: ffffffff9d000000 task.ti: ffffffff9d000000
> [    0.000000] RIP: tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] RSP: 0000:ffffffff9d003f28  EFLAGS: 00010002
> [    0.000000] RAX: 0000000000000000 RBX: ffff88003684d480 RCX: 0000000000000008
> [    0.000000] RDX: 0000000000000014 RSI: ffff88003684d480 RDI: 0000000000000000
> [    0.000000] RBP: ffffffff9d003f38 R08: ffff88003684d480 R09: ffff88003684d480
> [    0.000000] R10: ffff88003684d480 R11: 0000000000000001 R12: ffffffff9e5fd020
> [    0.000000] R13: ffff88070282ca00 R14: ffffffff9e607ae0 R15: 00000000000146f0
> [    0.000000] FS:  0000000000000000(0000) GS:ffff880036e00000(0000) knlGS:0000000000000000
> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    0.000000] CR2: 0000000000000000 CR3: 000000001d02e000 CR4: 00000000000006b0
> [    0.000000] Stack:
> [    0.000000]  ffffffffffffffff ffffffff9e5fd020 ffffffff9d003f88 ffffffff9e4c9f09
> [    0.000000]  ffffffff9e4c98fd 00000000000146f0 ffffffff9d003f78 ffffffff9e607ae0
> [    0.000000]  0000000000000020 ffffffff9e4c9117 00000000ffffffff 0000ffffffff9e4c
> [    0.000000] Call Trace:
> [    0.000000] start_kernel (init/main.c:581)
> [    0.000000] ? set_init_arg (init/main.c:281)
> [    0.000000] ? early_idt_handlers (arch/x86/kernel/head_64.S:340)
> [    0.000000] x86_64_start_reservations (arch/x86/kernel/head64.c:194)
> [    0.000000] x86_64_start_kernel (arch/x86/kernel/head64.c:183)
> [ 0.000000] Code: e8 d0 84 66 fa 89 c7 48 89 de e8 b0 46 02 fd 48 63 0d 4f 91 dd ff 31 c0 48 8b 3d 5e 1f 1c 01 48 83 c1 3f 48 c1 f9 03 48 83 e1 f8 <f3> aa 48 8b 1d 49 1f 1c 01 e8 9c 84 66 fa 89 c7 e8 25 3c d1 f9
> All code
> ========
>    0:	e8 d0 84 66 fa       	callq  0xfffffffffa6684d5
>    5:	89 c7                	mov    %eax,%edi
>    7:	48 89 de             	mov    %rbx,%rsi
>    a:	e8 b0 46 02 fd       	callq  0xfffffffffd0246bf
>    f:	48 63 0d 4f 91 dd ff 	movslq -0x226eb1(%rip),%rcx        # 0xffffffffffdd9165
>   16:	31 c0                	xor    %eax,%eax
>   18:	48 8b 3d 5e 1f 1c 01 	mov    0x11c1f5e(%rip),%rdi        # 0x11c1f7d
>   1f:	48 83 c1 3f          	add    $0x3f,%rcx
>   23:	48 c1 f9 03          	sar    $0x3,%rcx
>   27:	48 83 e1 f8          	and    $0xfffffffffffffff8,%rcx
>   2b:	f3 aa                	rep stos %al,%es:*(%rdi)		<-- trapping instruction
>   2d:	48 8b 1d 49 1f 1c 01 	mov    0x11c1f49(%rip),%rbx        # 0x11c1f7d
>   34:	e8 9c 84 66 fa       	callq  0xfffffffffa6684d5
>   39:	89 c7                	mov    %eax,%edi
>   3b:	e8 25 3c d1 f9       	callq  0xfffffffff9d13c65
> 	...
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	f3 aa                	rep stos %al,%es:(%rdi)
>    2:	48 8b 1d 49 1f 1c 01 	mov    0x11c1f49(%rip),%rbx        # 0x11c1f52
>    9:	e8 9c 84 66 fa       	callq  0xfffffffffa6684aa
>    e:	89 c7                	mov    %eax,%edi
>   10:	e8 25 3c d1 f9       	callq  0xfffffffff9d13c3a
> 	...
> [    0.000000] RIP tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000]  RSP <ffffffff9d003f28>
> [    0.000000] CR2: 0000000000000000
> 
> Bisection pointed me to "rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs".

Yikes!  tick_nohz_full_mask is allocated not in one place, but two!

Does the following patch help?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 07ae1cc39063..e023134d63a1 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -336,6 +336,10 @@ static int tick_nohz_init_all(void)
 		pr_err("NO_HZ: Can't allocate full dynticks cpumask\n");
 		return err;
 	}
+	if (!alloc_cpumask_var(&tick_nohz_not_full_mask, GFP_KERNEL)) {
+		pr_err("NO_HZ: Can't allocate not-full dynticks cpumask\n");
+		return err;
+	}
 	err = 0;
 	cpumask_setall(tick_nohz_full_mask);
 	cpumask_clear_cpu(smp_processor_id(), tick_nohz_full_mask);

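For readers following along, the failure mode behind the oops can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: the two setup functions, the `with_fix` flag, and the byte-array "masks" are hypothetical stand-ins, and the premise that the other allocation site already allocates both masks is inferred from the remark above that tick_nohz_full_mask is allocated in two places.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MASK_BYTES 8 /* hypothetical stand-in for the cpumask size */

static unsigned char *full_mask;     /* stand-in for tick_nohz_full_mask */
static unsigned char *not_full_mask; /* stand-in for tick_nohz_not_full_mask */

/* Release both masks so each scenario starts from a clean state. */
static void reset_masks(void)
{
	free(full_mask);
	free(not_full_mask);
	full_mask = NULL;
	not_full_mask = NULL;
}

/* Hypothetical first allocation site: allocates both masks (assumed). */
static int setup_from_param(void)
{
	full_mask = calloc(1, MASK_BYTES);
	not_full_mask = calloc(1, MASK_BYTES);
	return (full_mask && not_full_mask) ? 0 : -1;
}

/*
 * Hypothetical second allocation site, modeled on tick_nohz_init_all().
 * Without the fix it allocates only full_mask; with the fix it also
 * allocates not_full_mask, which is what the patch above adds.
 */
static int setup_init_all(bool with_fix)
{
	full_mask = malloc(MASK_BYTES);
	if (!full_mask)
		return -1;
	if (with_fix) {
		not_full_mask = malloc(MASK_BYTES);
		if (!not_full_mask)
			return -1; /* like the patch, no unwinding here */
	}
	memset(full_mask, 0xff, MASK_BYTES); /* analogous to cpumask_setall() */
	return 0;
}

/*
 * Later consumer that zeroes the second mask. The kernel performed the
 * equivalent store unguarded (the "rep stos" with RDI == 0 in the oops);
 * the NULL check exists only so this sketch reports instead of crashing.
 */
static int clear_not_full_mask(void)
{
	if (!not_full_mask)
		return -1;
	memset(not_full_mask, 0, MASK_BYTES);
	return 0;
}
```

Calling setup_init_all(false) and then clear_not_full_mask() returns -1, which corresponds to the NULL store in the report. With setup_init_all(true), or with setup_from_param(), the clear succeeds.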

Thread overview: 4+ messages
2014-06-27 16:48 rcu: NULL ptr deref on boot Sasha Levin
2014-06-27 17:13 ` Paul E. McKenney [this message]
2014-06-30 12:28   ` Sasha Levin
2014-06-30 17:11     ` Paul E. McKenney
