Linux-NVDIMM Archive on lore.kernel.org
From: "Verma, Vishal L" <vishal.l.verma-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: "Luck, Tony" <tony.luck-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"bp-l3A5Bk7waGM@public.gmane.org"
	<bp-l3A5Bk7waGM@public.gmane.org>
Cc: "linux-nvdimm-y27Ovi1pjclAfugRpC6u6w@public.gmane.org"
	<linux-nvdimm-y27Ovi1pjclAfugRpC6u6w@public.gmane.org>,
	"x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org"
	<x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org"
	<tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Subject: Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'
Date: Fri, 21 Apr 2017 21:39:45 +0000
Message-ID: <1492810703.2738.27.camel@intel.com>
In-Reply-To: <20170413113159.rc32ebiswn64nzrr-fF5Pk5pvG8Y@public.gmane.org>

On Thu, 2017-04-13 at 13:31 +0200, Borislav Petkov wrote:
> On Thu, Apr 13, 2017 at 12:29:25AM +0200, Borislav Petkov wrote:
> > On Wed, Apr 12, 2017 at 03:26:19PM -0700, Luck, Tony wrote:
> > > We can futz with that and have them specify which chain (or both)
> > > that they want to be added to.
> > 
> > Well, I didn't want the atomic chain to be a notifier because we can
> > keep it simple and non-blocking. Only the process context one will
> > be.
> > 
> > So the question is, do we even have a use case for outside consumers
> > hanging on the atomic chain? Because if not, we're good to go.
> 
> Ok, new day, new patch.
> 
> Below is what we could do: we don't call the notifier at all on the
> atomic path but only print the MCEs. We do log them and if the machine
> survives, we process them accordingly. This is only a fix for upstream
> so that the current issue at hand is addressed.
> 
> For later, we'd need to split the paths into:
> 
> critical_print_mce()
> 
> or somesuch which immediately dumps the MCE to dmesg, and
> 
> mce_log()
> 
> which does the slow path of logging MCEs and calling the blocking
> notifier.
> 
> Now, I'd want to have decoding of the MCE on the critical path too so
> I have to think about how to do that nicely. Maybe move the decoding
> bits which are the same between Intel and AMD in mce.c and have some
> vendor-specific, fast calls. We'll see. Btw, this is something Ingo has
> been mentioning for a while.
> 
> Anyway, here's just the urgent fix for now.
> 
> Thanks.
> 
> ---
> From: Vishal Verma <vishal.l.verma@intel.com>
> Date: Tue, 11 Apr 2017 16:44:57 -0600
> Subject: [PATCH] x86/mce: Make the MCE notifier a blocking one
> 
> The NFIT MCE handler callback (for handling media errors on NVDIMMs)
> takes a mutex to add the location of a memory error to a list. But since
> the notifier call chain for machine checks (x86_mce_decoder_chain) is
> atomic, we get a lockdep splat like:
> 
>   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
>   in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
>   [..]
>   Call Trace:
>    dump_stack
>    ___might_sleep
>    __might_sleep
>    mutex_lock_nested
>    ? __lock_acquire
>    nfit_handle_mce
>    notifier_call_chain
>    atomic_notifier_call_chain
>    ? atomic_notifier_call_chain
>    mce_gen_pool_process
> 
> Convert the notifier to a blocking one which gets to run only in process
> context.
> 
> Boris: remove the notifier call in atomic context in print_mce(). For
> now, let's print the MCE on the atomic path so that we can make sure it
> goes out. We still log it for process context later.
> 
> Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-edac <linux-edac@vger.kernel.org>
> Cc: x86-ml <x86@kernel.org>
> Cc: <stable@vger.kernel.org>
> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
> Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-genpool.c  |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce-internal.h |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce.c          | 18 ++++--------------
>  3 files changed, 6 insertions(+), 16 deletions(-)
> 
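For anyone skimming the thread, here is a rough userspace model of the idea
in the quoted patch, not the kernel code itself. It assumes a single-threaded
toy setup: the atomic path only prints and queues the record, and a later
process-context pass drains the queue and calls a chain whose callbacks are
allowed to take a mutex. Every name in it (model_atomic_report,
model_process_pool, model_nfit_handler, and so on) is made up for
illustration, and a pthread mutex stands in for whatever the real blocking
chain uses internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mce_record { unsigned long long addr; };
typedef int (*notifier_fn)(const struct mce_record *rec);

#define MAX_NOTIFIERS 4
#define POOL_SIZE 8

static notifier_fn chain[MAX_NOTIFIERS];
static size_t chain_len;
static pthread_mutex_t chain_lock = PTHREAD_MUTEX_INITIALIZER;

static struct mce_record pool[POOL_SIZE]; /* stands in for the gen pool */
static size_t pool_len;
static int in_atomic_ctx;                 /* stands in for in_atomic() */

/* Models might_sleep(): sleeping is only legal outside atomic context. */
static void model_might_sleep(void) { assert(!in_atomic_ctx); }

static pthread_mutex_t badrange_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long long bad_addrs[POOL_SIZE];
static size_t bad_count;

/* A callback that takes a mutex to add the error location to a list. */
static int model_nfit_handler(const struct mce_record *rec)
{
    model_might_sleep();
    pthread_mutex_lock(&badrange_lock);
    if (bad_count < POOL_SIZE)
        bad_addrs[bad_count++] = rec->addr;
    pthread_mutex_unlock(&badrange_lock);
    return 0;
}

/* Add a callback to the chain; returns 0 on success, -1 if it is full. */
static int model_register(notifier_fn fn)
{
    int ret = -1;
    pthread_mutex_lock(&chain_lock);
    if (chain_len < MAX_NOTIFIERS) {
        chain[chain_len++] = fn;
        ret = 0;
    }
    pthread_mutex_unlock(&chain_lock);
    return ret;
}

/* Atomic path: print right away and queue; never call the chain here. */
static int model_atomic_report(unsigned long long addr)
{
    int queued = 0;
    in_atomic_ctx = 1;
    fprintf(stderr, "MCE: addr 0x%llx\n", addr);
    if (pool_len < POOL_SIZE) {
        pool[pool_len++].addr = addr;
        queued = 1;
    }
    in_atomic_ctx = 0;
    return queued;
}

/* Process-context path: drain the queue and call the blocking chain. */
static size_t model_process_pool(void)
{
    size_t handled = 0;
    pthread_mutex_lock(&chain_lock);
    for (size_t i = 0; i < pool_len; i++) {
        for (size_t j = 0; j < chain_len; j++)
            chain[j](&pool[i]);
        handled++;
    }
    pool_len = 0;
    pthread_mutex_unlock(&chain_lock);
    return handled;
}
```

In this model, registering model_nfit_handler, calling
model_atomic_report() once and then model_process_pool() leaves one entry in
bad_addrs. Calling the chain from inside model_atomic_report() instead would
trip the assert in model_might_sleep(), which is the toy equivalent of the
splat in the changelog.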

I noticed this patch was picked up in tip, in ras/urgent, but didn't see
a pull request for 4.11 - was this the intention? Or will it just be
added for 4.12?

	-Vishal
