Re: panic at boot time with kernel >= 4.9.98 - uninitialized system_wq in early interrupt


From: Greg KH <greg@kroah.com>
To: Max Asbock <amax@lenovo.com>
Cc: "stable@vger.kernel.org" <stable@vger.kernel.org>,
	"tytso@mit.edu" <tytso@mit.edu>, Chris McDermott <lcm@lenovo.com>
Subject: Re: panic at boot time with kernel >= 4.9.98 - uninitialized system_wq in early interrupt
Date: Thu, 7 Jun 2018 10:37:14 +0200
Message-ID: <20180607083714.GA17489@kroah.com>
In-Reply-To: <B7F68D8965785B478837F01E8A57F6CE7707EF83@APMAILMBX06.lenovo.com>

On Wed, Jun 06, 2018 at 11:26:46PM +0000, Max Asbock wrote:
> 
> We have been seeing kernel panics on certain systems with the 4.9.x stable kernel, where x >= 98.
> We captured the following panic message:
> [    6.252000] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
> [    6.261222] IP: [<ffffffff810a8ea2>] __queue_work+0x32/0x430
> [    6.267826] PGD 0 [    6.269951]
> [    6.271887] Oops: 0000 [#1] SMP
> [    6.275576] Modules linked in:
> [    6.279264] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.100 #1
> [    6.296760] task: ffffffff81c114c0 task.stack: ffffffff81c00000
> [    6.303564] RIP: 0010:[<ffffffff810a8ea2>]  [<ffffffff810a8ea2>] __queue_work+0x32/0x430
> [    6.312997] RSP: 0000:ffff897a40403d98  EFLAGS: 00010046
> [    6.319120] RAX: 0000000000000082 RBX: 0000000000000046 RCX: 0000000000000000
> [    6.327282] RDX: ffffffff81d02300 RSI: 0000000000000000 RDI: 0000000000002000
> [    6.335443] RBP: ffff897a40403dd0 R08: 00000000d431dd4b R09: 0000000000000000
> [    6.343606] R10: ffff897a40403e08 R11: ffffffff8227615c R12: ffffffff81d02300
> [    6.351766] R13: 0000000000002000 R14: 0000000000000000 R15: ffffffff81a88ffd
> [    6.359930] FS:  0000000000000000(0000) GS:ffff897a40400000(0000) knlGS:0000000000000000
> [    6.369274] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    6.375883] CR2: 0000000000000102 CR3: 0000000001c08000 CR4: 0000000000040630
> [    6.384045] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    6.392206] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    6.400366] Stack:
> [    6.402801]  0000001000000000 0000200040403e08 0000000000000046 0000000000000002
> [    6.411723]  ffffffff82275f40 ffff897a40403e08 ffffffff81a88ffd ffff897a40403de8
> [    6.420638]  ffffffff810a9677 ffffffff82275f88 ffff897a40403e78 ffffffff8148dff4
> [    6.429554] Call Trace:
> [    6.432473]  <IRQ> [    6.434698]  [<ffffffff810a9677>] queue_work_on+0x27/0x40
> [    6.441121]  [<ffffffff8148dff4>] crng_reseed+0x154/0x260
> [    6.447343]  [<ffffffff8148e334>] credit_entropy_bits+0x234/0x2a0
> [    6.454343]  [<ffffffff8148e77b>] ? add_interrupt_randomness+0x1bb/0x220
> [    6.462020]  [<ffffffff8148e77b>] add_interrupt_randomness+0x1bb/0x220
> [    6.469506]  [<ffffffff810ef8f0>] handle_irq_event_percpu+0x40/0x80
> [    6.476698]  [<ffffffff810ef96b>] handle_irq_event+0x3b/0x60
> [    6.483211]  [<ffffffff810f2e7f>] handle_level_irq+0x8f/0x110
> [    6.489823]  [<ffffffff81030fd5>] handle_irq+0xb5/0x140
> [    6.495854]  [<ffffffff81096651>] ? _local_bh_enable+0x21/0x50
> [    6.502562]  [<ffffffff81038ab5>] ? __exit_idle+0x5/0x30
> [    6.508689]  [<ffffffff817250cd>] do_IRQ+0x4d/0xe0
> [    6.514232]  [<ffffffff81722ba0>] common_interrupt+0xa0/0xa0
> [    6.520744]  <EOI> [    6.522968]  [<ffffffff81ddeb08>] ? vfs_caches_init+0xe1/0xe3
> [    6.529779]  [<ffffffff81da212e>] start_kernel+0x451/0x4c8
> [    6.536096]  [<ffffffff81da1acb>] ? set_init_arg+0x55/0x55
> [    6.542415]  [<ffffffff81da1120>] ? early_idt_handler_array+0x120/0x120
> [    6.549995]  [<ffffffff81da15d6>] x86_64_start_reservations+0x2a/0x2c
> [    6.557381]  [<ffffffff81da1714>] x86_64_start_kernel+0x13c/0x15f
> [    6.564378] Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 ff 14 25 80 86 c3 81 f6 c4 02 0f 85 1f 03 00 00 <41> f6 86 02 01 00 00 01 0f 85 b2 02 00 00 49 c7 c7 78 69 01 00
> [    6.591500] RIP  [<ffffffff810a8ea2>] __queue_work+0x32/0x430
> [    6.598197]  RSP <ffff897a40403d98>
> [    6.602281] CR2: 0000000000000102
> 
> The NULL pointer dereference happens because the second argument to __queue_work is 0. The second argument is a struct workqueue_struct, specifically system_wq. This shows that an interrupt happened before the system_wq was initialized. We believe the problem was introduced with this commit:
> 
> $ git show c3ff2da5cef05676d490fa9057b2dceb5e48cdb9
> commit c3ff2da5cef05676d490fa9057b2dceb5e48cdb9
> Author: Theodore Ts'o <tytso@mit.edu>
> Date:   Mon Apr 23 18:51:28 2018 -0400
> 
>     random: fix possible sleeping allocation from irq context
>     
>     commit 6c1e851c4edc13a43adb3ea4044e3fc8f43ccf7d upstream.
>     
>     We can do a sleeping allocation from an irq context when CONFIG_NUMA
>     is enabled.  Fix this by initializing the NUMA crng instances in a
>     workqueue.
>     
>     Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>     Reported-by: syzbot+9de458f6a5e713ee8c1a@syzkaller.appspotmail.com
>     Fixes: 8ef35c866f8862df ("random: set up the NUMA crng instances...")
>     Cc: stable@vger.kernel.org
>     Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index b979173..dbfb3e69 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -820,7 +820,7 @@ static int crng_fast_load(const char *cp, size_t len)
>  }
>  
>  #ifdef CONFIG_NUMA
> -static void numa_crng_init(void)
> +static void do_numa_crng_init(struct work_struct *work)
>  {
>         int i;
>         struct crng_state *crng;
> @@ -841,6 +841,13 @@ static void numa_crng_init(void)
>                 kfree(pool);
>         }
>  }
> +
> +static DECLARE_WORK(numa_crng_init_work, do_numa_crng_init);
> +
> +static void numa_crng_init(void)
> +{
> +       schedule_work(&numa_crng_init_work);
> +}
>  #else
>  static void numa_crng_init(void) {}
>  #endif
> 
> Apparently we can't count on system_wq being initialized when schedule_work is called in numa_crng_init from an early interrupt.
> 
> I don't understand the underlying code enough to propose a fix right away. 

Ick :(

I'm guessing you also see these problems on 4.17?  Can you test there to
be sure of that?

thanks,

greg k-h

