All of lore.kernel.org
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Borislav Petkov <bp@amd64.org>
Cc: "Luck, Tony" <tony.luck@intel.com>, Ingo Molnar <mingo@elte.hu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>,
	Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Subject: Re: [PATCH 07/10] MCE: replace mce.c use of TIF_MCE_NOTIFY with user_return_notifier
Date: Sun, 12 Jun 2011 13:30:07 +0300	[thread overview]
Message-ID: <4DF4952F.5030808@redhat.com> (raw)
In-Reply-To: <20110612102443.GA19060@aftab>

On 06/12/2011 01:24 PM, Borislav Petkov wrote:
> On Sun, Jun 12, 2011 at 04:29:41AM -0400, Avi Kivity wrote:
> >  On 06/10/2011 12:35 AM, Luck, Tony wrote:
> >  >  From: Tony Luck<tony.luck@intel.com>
> >  >
> >  >  Ingo wrote:
> >  >  >   We already have a generic facility to do such things at
> >  >  >   return-to-userspace: _TIF_USER_RETURN_NOTIFY.
> >  >
> >  >  This just a proof of concept patch ... before this can become
> >  >  real the user-return-notifier code would have to be made NMI
> >  >  safe (currently it uses hlist_add_head/hlist_del, which would
> >  >  need to be changed to Ying's NMI-safe single threaded lists).
> >
> >  You could use irq_work_queue() to push this into an irq context, which
> >  is user-return-notifier safe.
>
> Maybe I'm missing something but it looks like irq_work_queue() queues
> work which is run in irq_work_run() with IRQs disabled. However, user
> return notifiers are run after IRQs get enabled in entry_64.S. And we
> want to run memory_failure() with IRQs enabled.
>
> More importantly, we want to be able to do the following:
>
> * run #MC handler which queues work
>
> * when returning to userspace, preempt and schedule that previously
> queued work _before_ the process that caused the MCE gets to execute.

Yes.

> Imagine this scenario:
>
> Your userspace process causes a data cache read error due to either
> alpha particles or maybe because the DRAM device containing the process
> page is faulty and generates ECC errors which the ECC code cannot
> correct, i.e. an uncorrectable error we definitely want to handle; IOW
> Action Required MCE.
>
> Now, if you get lucky and this page is mapped only by the process that
> caused the MCE, you could unmap it, mark it PageReserved and cause the
> process to refault. But in order to do that, you want to execute the
> memory_failure() handler _before_ you schedule the process again.
>
> In the instruction cache read error case, you don't have processor
> context to return to (or you're being too conservative and don't want to
> risk it) so you kill the process, which is pretty easy to do.
>
> Does that make a bit more sense? Tony?
>

You're missing the flow.  The MCE handler calls irq_work_queue(), which 
schedules a user return notifier, which does any needed processing in 
task context.


-- 
error compiling committee.c: too many arguments to function


  reply	other threads:[~2011-06-12 10:30 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-09 21:25 [RFC] reworked machine check recovery patches Luck, Tony
2011-06-09 21:29 ` [PATCH 01/10] MCE: fixes for mce severity table Luck, Tony
2011-06-09 21:30 ` [PATCH 02/10] MCE: save most severe error information Luck, Tony
2011-06-10  8:06   ` Hidetoshi Seto
2011-06-10 18:08     ` Tony Luck
2011-06-09 21:31 ` [PATCH 03/10] MCE: introduce mce_gather_info() Luck, Tony
2011-06-09 21:32 ` [PATCH 04/10] MCE: Move ADDR/MISC reading code into common function Luck, Tony
2011-06-10  9:33   ` Borislav Petkov
2011-06-10 18:17     ` Tony Luck
2011-06-09 21:33 ` [PATCH 05/10] MCE: Mask out address mask bits below address granuality Luck, Tony
2011-06-10  8:07   ` Hidetoshi Seto
2011-06-10  9:46     ` Borislav Petkov
2011-06-10 19:06     ` Tony Luck
2011-06-11  0:12       ` Andi Kleen
2011-06-10  9:42   ` Borislav Petkov
2011-06-10 19:09     ` Tony Luck
2011-06-09 21:34 ` [PATCH 06/10] HWPOISON: Handle hwpoison in current process Luck, Tony
2011-06-10  8:07   ` Hidetoshi Seto
2011-06-10 20:36     ` Tony Luck
2011-06-09 21:35 ` [PATCH 07/10] MCE: replace mce.c use of TIF_MCE_NOTIFY with user_return_notifier Luck, Tony
2011-06-10  8:08   ` Hidetoshi Seto
2011-06-10 20:42     ` Tony Luck
2011-06-11 10:24       ` Borislav Petkov
2011-06-12  8:31       ` Avi Kivity
2011-06-12  8:29   ` Avi Kivity
2011-06-12 10:24     ` Borislav Petkov
2011-06-12 10:30       ` Avi Kivity [this message]
2011-06-12 13:53         ` Borislav Petkov
2011-06-09 21:36 ` [PATCH 08/10] NOTIFIER: Take over TIF_MCE_NOTIFY and implement task return notifier Luck, Tony
2011-06-12 22:38   ` Borislav Petkov
2011-06-13  5:31     ` Tony Luck
2011-06-13  7:59       ` Avi Kivity
2011-06-13  9:55         ` Borislav Petkov
2011-06-13 11:40           ` Avi Kivity
2011-06-13 12:40             ` Borislav Petkov
2011-06-13 12:47               ` Avi Kivity
2011-06-13 15:12                 ` Borislav Petkov
2011-06-13 16:31                   ` Avi Kivity
2011-06-13 17:13                     ` Tony Luck
2011-06-14  2:50                       ` Hidetoshi Seto
2011-06-14  2:51                         ` [PATCH 1/2] x86, mce: introduce mce_memory_failure_process Hidetoshi Seto
2011-06-14  2:53                         ` [PATCH 2/2] x86, mce: rework use of TIF_MCE_NOTIFY Hidetoshi Seto
2011-06-14 18:02                           ` Tony Luck
2011-06-14 18:28                             ` Tony Luck
2011-06-15  1:29                               ` Hidetoshi Seto
2011-06-15  2:10                                 ` Tony Luck
2011-06-15  3:17                                   ` Hidetoshi Seto
2011-06-14  3:09                         ` [PATCH 08/10] NOTIFIER: Take over TIF_MCE_NOTIFY and implement task return notifier Tony Luck
2011-06-14 11:40                       ` Avi Kivity
2011-06-14 13:33                         ` Borislav Petkov
2011-06-14 13:43                           ` Avi Kivity
2011-06-14 17:13                             ` Luck, Tony
2011-06-15  8:51                               ` Avi Kivity
2011-06-14 16:59                         ` Luck, Tony
2011-06-15  8:52                           ` Avi Kivity
2011-06-13 16:43               ` Tony Luck
2011-06-09 21:37 ` [PATCH 09/10] MCE: run through processors with more severe problems first Luck, Tony
2011-06-10  8:09   ` Hidetoshi Seto
2011-06-10 20:49     ` Tony Luck
2011-06-13 22:03       ` Tony Luck
2011-06-14  1:27         ` Hidetoshi Seto
2011-06-14  3:04           ` Tony Luck
2011-06-09 21:38 ` [PATCH 10/10] MCE: Add Action-Required support Luck, Tony
2011-06-10  8:06   ` Hidetoshi Seto
2011-06-10 21:00     ` Tony Luck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DF4952F.5030808@redhat.com \
    --to=avi@redhat.com \
    --cc=bp@amd64.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=seto.hidetoshi@jp.fujitsu.com \
    --cc=tony.luck@intel.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.