Re: [PATCH] x86/mce: Initialize workqueues only once (alternate proposal)

From: Borislav Petkov <bp@suse.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Wang, Rui Y" <rui.y.wang@intel.com>,
	"Chen, Gong" <gong.chen@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86/mce: Initialize workqueues only once (alternate proposal)
Date: Fri, 19 Jun 2015 21:02:00 +0200
Message-ID: <20150619190200.GB20546@pd.tnic>
In-Reply-To: <20150619173620.GA9622@agluck-desk.sc.intel.com>

On Fri, Jun 19, 2015 at 10:36:20AM -0700, Luck, Tony wrote:
> 96d98bfd0366 ("x86/mce: Don't use percpu workqueues") dropped the
> per-CPU workqueues in the MCE code but left the initialization per-CPU.
> This led to early boot time splats (below) in the workqueues code
> because we were overwriting the workqueue during INIT_WORK() on each new
> CPU which would appear.
> 
> Move initialization to mcheck_init() so it happens only once.
> 
>   mce: [Hardware Error]: Machine check events logged
>   BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>   IP: [<ffffffff810980a1>] process_one_work+0x31/0x420
>    PGD 0
>   Oops: 0000 [#1] SMP
>   Modules linked in:
>   CPU: 36 PID: 263 Comm: kworker/36:0 Not tainted 4.1.0-rc8 #1
>   Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRHSXSD1.86B.0065.R01.1505011640 05/01/2015
>   task: ffff88181c284470 ti: ffff88181bd94000 task.ti: ffff88181bd94000
>   RIP: 0010:[<ffffffff810980a1>] process_one_work+0x31/0x420
>   RSP: 0000:ffff88181bd97e08  EFLAGS: 00010046
>   RAX: 0000000fffffffe0 RBX: ffffffff81d0fa20 RCX: 0000000000000000
>   RDX: 0000000fffffff00 RSI: ffffffff81d0fa20 RDI: ffff88181c2660c0
>   RBP: ffff88181bd97e48 R08: ffff88181f416ec0 R09: ffff88181c284470
>   R10: 0000000000000002 R11: ffffffff8109e5ac R12: ffff88181c2660c0
>   R13: ffff88181f416ec0 R14: 0000000000000000 R15: ffff88181c2660f0
>                              ^^^^^^^^^^^^^^^^^
> 
>   27:   4c 0f 45 f2             cmovne %rdx,%r14
>   2b:*  49 8b 46 08             mov    0x8(%r14),%rax           <-- trapping instruction
>   2f:   44 8b b8 00 01 00 00    mov    0x100(%rax),%r15d
> 
>   ...
> 
>   Call Trace:
>    worker_thread
>    ? rescuer_thread
>    kthread
>    ? kthread_create_on_node
>    ret_from_fork
>    ? kthread_create_on_node
>   Code: 48 89 e5 41 57 41 56 45 31 f6 41 55 41 54 49 89 fc 53 48 89 f3 48 83 ec 18 48 8b 06 4c 8b 6f 48 48 89 c2 30 d2 a8 04 4c 0f 45 f2 <49> 8b 46 08 44 8b b8 00 01 00 00 41 c1 ef 05 44 89 f8 83 e0 01
>   RIP  [<ffffffff810980a1>] process_one_work
>    RSP <ffff88181bd97e08>
>   CR2: 0000000000000008
>   ---[ end trace 8229a011b97532a0 ]---
>   Kernel panic - not syncing: Fatal exception
>   ---[ end Kernel panic - not syncing: Fatal exception
> 
> Reported-by: Rui Wang <rui.y.wang@intel.com>
> Debugged-by: Borislav Petkov <bp@suse.de>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 478f81a6d824..158d9e7db974 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -1665,9 +1665,6 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
>  		return;
>  	}
>  
> -	INIT_WORK(&mce_work, mce_process_work);
> -	init_irq_work(&mce_irq_work, mce_irq_work_cb);
> -
>  	machine_check_vector = do_machine_check;
>  
>  	__mcheck_cpu_init_generic();
> @@ -1994,6 +1991,9 @@ int __init mcheck_init(void)
>  	mce_register_decode_chain(&mce_srao_nb);
>  	mcheck_vendor_init_severity();
>  
> +	INIT_WORK(&mce_work, mce_process_work);
> +	init_irq_work(&mce_irq_work, mce_irq_work_cb);
> +
>  	return 0;

Hmm, and I was under the impression that mcheck_init() runs much
later... Not really.

Anyway, your version is better, I've replaced mine with it.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
