From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Sasha Levin <sasha.levin@oracle.com>
Cc: josh@joshtriplett.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: rcu: NULL ptr deref on boot
Date: Fri, 27 Jun 2014 10:13:59 -0700
Message-ID: <20140627171359.GG4603@linux.vnet.ibm.com>
In-Reply-To: <53ADA04B.3010707@oracle.com>

On Fri, Jun 27, 2014 at 12:48:11PM -0400, Sasha Levin wrote:
> Hi Paul,
> 
> I've noticed the following on boot with the latest -next kernel:
> 
> [    0.000000] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives+0x1e/0x20()
> [    0.000000] You're using static_cpu_has before alternatives have run!
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc2-next-20140627-sasha-00023-g5d10814-dirty #734
> [    0.000000]  0000000000000009 ffffffff9d003c48 ffffffff9b525423 0000000000000002
> [    0.000000]  ffffffff9d003c98 ffffffff9d003c88 ffffffff98168aec ffffffff9d003d58
> [    0.000000]  0000000000000000 ffffffff9d003e78 0000000000000000 0000000000000002
> [    0.000000] Call Trace:
> [    0.000000] dump_stack (lib/dump_stack.c:52)
> [    0.000000] warn_slowpath_common (kernel/panic.c:431)
> [    0.000000] warn_slowpath_fmt (kernel/panic.c:446)
> [    0.000000] ? irq_return (arch/x86/kernel/entry_64.S:842)
> [    0.000000] warn_pre_alternatives (arch/x86/kernel/cpu/common.c:1440)
> [    0.000000] __do_page_fault (./arch/x86/include/asm/cpufeature.h:423 arch/x86/mm/fault.c:1022 arch/x86/mm/fault.c:1112)
> [    0.000000] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> [    0.000000] ? trace_hardirqs_off (kernel/locking/lockdep.c:2645)
> [    0.000000] ? __slab_alloc (mm/slub.c:2364 (discriminator 1))
> [    0.000000] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [    0.000000] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
> [    0.000000] ? error_sti (arch/x86/kernel/entry_64.S:1419)
> [    0.000000] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
> [    0.000000] do_async_page_fault (arch/x86/kernel/kvm.c:264)
> [    0.000000] async_page_fault (arch/x86/kernel/entry_64.S:1322)
> [    0.000000] ? tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] ? tick_nohz_init (include/linux/bitmap.h:164 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] start_kernel (init/main.c:581)
> [    0.000000] ? set_init_arg (init/main.c:281)
> [    0.000000] ? early_idt_handlers (arch/x86/kernel/head_64.S:340)
> [    0.000000] x86_64_start_reservations (arch/x86/kernel/head64.c:194)
> [    0.000000] x86_64_start_kernel (arch/x86/kernel/head64.c:183)
> [    0.000000] ---[ end trace 4d5ff9f2f68c4233 ]---
> [    0.000000] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [    0.000000] IP: tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] PGD 0
> [    0.000000] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W      3.16.0-rc2-next-20140627-sasha-00023-g5d10814-dirty #734
> [    0.000000] task: ffffffff9d0354c0 ti: ffffffff9d000000 task.ti: ffffffff9d000000
> [    0.000000] RIP: tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000] RSP: 0000:ffffffff9d003f28  EFLAGS: 00010002
> [    0.000000] RAX: 0000000000000000 RBX: ffff88003684d480 RCX: 0000000000000008
> [    0.000000] RDX: 0000000000000014 RSI: ffff88003684d480 RDI: 0000000000000000
> [    0.000000] RBP: ffffffff9d003f38 R08: ffff88003684d480 R09: ffff88003684d480
> [    0.000000] R10: ffff88003684d480 R11: 0000000000000001 R12: ffffffff9e5fd020
> [    0.000000] R13: ffff88070282ca00 R14: ffffffff9e607ae0 R15: 00000000000146f0
> [    0.000000] FS:  0000000000000000(0000) GS:ffff880036e00000(0000) knlGS:0000000000000000
> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    0.000000] CR2: 0000000000000000 CR3: 000000001d02e000 CR4: 00000000000006b0
> [    0.000000] Stack:
> [    0.000000]  ffffffffffffffff ffffffff9e5fd020 ffffffff9d003f88 ffffffff9e4c9f09
> [    0.000000]  ffffffff9e4c98fd 00000000000146f0 ffffffff9d003f78 ffffffff9e607ae0
> [    0.000000]  0000000000000020 ffffffff9e4c9117 00000000ffffffff 0000ffffffff9e4c
> [    0.000000] Call Trace:
> [    0.000000] start_kernel (init/main.c:581)
> [    0.000000] ? set_init_arg (init/main.c:281)
> [    0.000000] ? early_idt_handlers (arch/x86/kernel/head_64.S:340)
> [    0.000000] x86_64_start_reservations (arch/x86/kernel/head64.c:194)
> [    0.000000] x86_64_start_kernel (arch/x86/kernel/head64.c:183)
> [ 0.000000] Code: e8 d0 84 66 fa 89 c7 48 89 de e8 b0 46 02 fd 48 63 0d 4f 91 dd ff 31 c0 48 8b 3d 5e 1f 1c 01 48 83 c1 3f 48 c1 f9 03 48 83 e1 f8 <f3> aa 48 8b 1d 49 1f 1c 01 e8 9c 84 66 fa 89 c7 e8 25 3c d1 f9
> All code
> ========
>    0:	e8 d0 84 66 fa       	callq  0xfffffffffa6684d5
>    5:	89 c7                	mov    %eax,%edi
>    7:	48 89 de             	mov    %rbx,%rsi
>    a:	e8 b0 46 02 fd       	callq  0xfffffffffd0246bf
>    f:	48 63 0d 4f 91 dd ff 	movslq -0x226eb1(%rip),%rcx        # 0xffffffffffdd9165
>   16:	31 c0                	xor    %eax,%eax
>   18:	48 8b 3d 5e 1f 1c 01 	mov    0x11c1f5e(%rip),%rdi        # 0x11c1f7d
>   1f:	48 83 c1 3f          	add    $0x3f,%rcx
>   23:	48 c1 f9 03          	sar    $0x3,%rcx
>   27:	48 83 e1 f8          	and    $0xfffffffffffffff8,%rcx
>   2b:	f3 aa                	rep stos %al,%es:*(%rdi)		<-- trapping instruction
>   2d:	48 8b 1d 49 1f 1c 01 	mov    0x11c1f49(%rip),%rbx        # 0x11c1f7d
>   34:	e8 9c 84 66 fa       	callq  0xfffffffffa6684d5
>   39:	89 c7                	mov    %eax,%edi
>   3b:	e8 25 3c d1 f9       	callq  0xfffffffff9d13c65
> 	...
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	f3 aa                	rep stos %al,%es:(%rdi)
>    2:	48 8b 1d 49 1f 1c 01 	mov    0x11c1f49(%rip),%rbx        # 0x11c1f52
>    9:	e8 9c 84 66 fa       	callq  0xfffffffffa6684aa
>    e:	89 c7                	mov    %eax,%edi
>   10:	e8 25 3c d1 f9       	callq  0xfffffffff9d13c3a
> 	...
> [    0.000000] RIP tick_nohz_init (include/linux/bitmap.h:165 include/linux/cpumask.h:333 kernel/time/tick-sched.c:344 kernel/time/tick-sched.c:356)
> [    0.000000]  RSP <ffffffff9d003f28>
> [    0.000000] CR2: 0000000000000000
> 
> Bisection pointed me to "rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs".

Yikes!  tick_nohz_full_mask is allocated not in one place, but two!

Does the following patch help?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 07ae1cc39063..e023134d63a1 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -336,6 +336,10 @@ static int tick_nohz_init_all(void)
 		pr_err("NO_HZ: Can't allocate full dynticks cpumask\n");
 		return err;
 	}
+	if (!alloc_cpumask_var(&tick_nohz_not_full_mask, GFP_KERNEL)) {
+		pr_err("NO_HZ: Can't allocate not-full dynticks cpumask\n");
+		return err;
+	}
 	err = 0;
 	cpumask_setall(tick_nohz_full_mask);
 	cpumask_clear_cpu(smp_processor_id(), tick_nohz_full_mask);
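
The failure mode behind the oops (a rep stos with RDI equal to 0) can be modelled in userspace. What follows is a minimal sketch, not kernel code: malloc() stands in for alloc_cpumask_var(), and the names MASK_BYTES, init_from_param(), init_all(), clear_not_full_mask(), and reset_masks() are hypothetical. It assumes two independent init paths, only one of which was taught about the second mask. The NULL check in clear_not_full_mask() exists only so that the sketch reports the problem instead of faulting.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for two off-stack cpumasks (not the kernel's types). */
#define MASK_BYTES 8

unsigned char *full_mask;
unsigned char *not_full_mask;

/* Release both masks so that each scenario starts from a clean state. */
void reset_masks(void)
{
	free(full_mask);
	free(not_full_mask);
	full_mask = NULL;
	not_full_mask = NULL;
}

/* Init path 1 (think: boot-parameter setup): allocates both masks. */
int init_from_param(void)
{
	full_mask = malloc(MASK_BYTES);
	not_full_mask = malloc(MASK_BYTES);
	if (!full_mask || !not_full_mask) {
		reset_masks();
		return -1;
	}
	return 0;
}

/*
 * Init path 2 (think: "all CPUs" build option). With fixed == 0 it models
 * the bug: the second mask is never allocated on this path.
 */
int init_all(int fixed)
{
	full_mask = malloc(MASK_BYTES);
	if (!full_mask)
		return -1;
	if (fixed) {
		not_full_mask = malloc(MASK_BYTES);
		if (!not_full_mask) {
			reset_masks();
			return -1;
		}
	}
	return 0;
}

/*
 * Common code that runs after either init path. Without the NULL check,
 * the memset() below would write through a NULL pointer on the buggy path.
 */
int clear_not_full_mask(void)
{
	if (!not_full_mask)
		return -1;
	memset(not_full_mask, 0, MASK_BYTES);
	return 0;
}
```

Calling init_all(0) and then clear_not_full_mask() returns -1, which corresponds to the crash reported above. With init_all(1) or init_from_param() the clear succeeds.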


Thread overview: 4+ messages
2014-06-27 16:48 rcu: NULL ptr deref on boot Sasha Levin
2014-06-27 17:13 ` Paul E. McKenney [this message]
2014-06-30 12:28   ` Sasha Levin
2014-06-30 17:11     ` Paul E. McKenney