From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@amd64.org>, Ingo Molnar <mingo@elte.hu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>,
	Andi Kleen <andi@firstfloor.org>, Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mauro Carvalho Chehab <mchehab@redhat.com>
Subject: Re: [RFC 0/9] mce recovery for Sandy Bridge server
Date: Tue, 24 May 2011 23:24:34 +0200
Message-ID: <1306272274.2497.73.camel@laptop>
In-Reply-To: <BANLkTinSOFAioAe2v5c6PRB9EKjJJNMg9w@mail.gmail.com>

On Tue, 2011-05-24 at 10:56 -0700, Tony Luck wrote:
> Dragging PeterZ to this thread, since we are now talking about scheduler.
> 
> On Tue, May 24, 2011 at 10:33 AM, Borislav Petkov <bp@amd64.org> wrote:
> > On Tue, May 24, 2011 at 09:57:46AM -0700, Luck, Tony wrote:
> >> So can we talk about this part for a while before returning to the
> >> "how to report this" discussion?
> >>
> >> So here's the situation - we are in the NMI handler when we find from
> >> looking at the machine check bank registers that we have a recoverable
> >> error. We know the physical address, and we know the task (which might
> >> have been in user or kernel context). I can package that information
> >> into a perf/event ... but then how can I mark the current task as
> >> not-fit-for-execution?
> >
> > Maybe something like
> >
> > set_current_state(TASK_UNINTERRUPTIBLE);
> >
> > finish work in NMI context
> >
> > do remaining work in process context like sending appropriate signals
> > etc; finally:
> >
> > set_task_state(tsk, TASK_RUNNING)
> 
> That looks pretty easy - are there any weird side effects that I should
> be worried about?  My perf/event can't really include the "task" pointer
> (that sounds way too internal) - but I can provide the process id, so
> the "RAS daemon" that sees this event can look up the task to do that
> final set_task_state(tsk, TASK_RUNNING).
> 
> Does this work in the threaded case? In the case where the task was in
> kernel context (but in a CONFIG_PREEMPT=y kernel at some point
> where preemption is allowed)?


Right, so you can't do things like that from NMI context, but what perf
can do is raise a self-IPI and continue from IRQ context (question for
the HW folks: can the context that was running before the NMI hit
execute any cycles between the NMI iret and the IRQ being asserted?)

From IRQ context we can wake threads, set TIF_flags, etc. You can
basically do what SIGSTOP does and put the task in TASK_STOPPED state,
wake your handler thread and set TIF_NEED_RESCHED. The handler thread
will then be scheduled according to its sched policy.
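
To make the split concrete, here is a rough userspace model of the two
stages, not kernel code. Every name in it (model_task, mce_deferred,
model_nmi_handler, model_irq_handler) is made up for illustration, and
the "pending" flag merely stands in for the self-IPI. In a real kernel
the deferral would presumably go through something like irq_work, but
that is an assumption on top of what is said above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; these are NOT kernel definitions. */
enum model_state { MODEL_RUNNING, MODEL_STOPPED };

struct model_task {
	int pid;
	enum model_state state;
	bool need_resched;	/* stands in for TIF_NEED_RESCHED */
};

/* What the "NMI" stage leaves behind for the "IRQ" stage. */
struct mce_deferred {
	bool pending;		/* stands in for a raised self-IPI */
	int pid;
	unsigned long long paddr;
};

static struct mce_deferred deferred;
static bool handler_thread_awake;

/*
 * "NMI" stage: only record what we know and flag the deferred work.
 * No task state changes and no wakeups happen here.
 */
static void model_nmi_handler(const struct model_task *cur,
			      unsigned long long paddr)
{
	deferred.pid = cur->pid;
	deferred.paddr = paddr;
	deferred.pending = true;
}

/*
 * "IRQ" stage: stop the affected task, wake the handler thread and
 * request a reschedule. Returns true if there was work to do.
 */
static bool model_irq_handler(struct model_task *cur)
{
	assert(cur);

	if (!deferred.pending)
		return false;
	deferred.pending = false;

	if (cur->pid == deferred.pid) {
		cur->state = MODEL_STOPPED;	/* like SIGSTOP would */
		handler_thread_awake = true;	/* wake the handler thread */
		cur->need_resched = true;	/* let the scheduler pick it */
	}
	return true;
}
```

Calling model_nmi_handler() on a running task leaves its state alone
and only sets deferred.pending; the following model_irq_handler() call
is what actually stops the task and wakes the handler. The window
between the two calls is exactly the window the HW question above is
about.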




