From: Chen Yucong <slaoub@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: tony.luck@intel.com, ak@linux.intel.com, ying.huang@intel.com,
	seto.hidetoshi@jp.fujitsu.com, linux-kernel@vger.kernel.org,
	linux-edac@vger.kernel.org
Subject: Re: [PATCH v2] x86/mce: Distirbute the clear operation of mces_seen to Per-CPU rather than only monarch CPU
Date: Wed, 21 May 2014 08:48:24 +0800
Message-ID: <1400633304.14703.2.camel@debian>
In-Reply-To: <20140520173308.GE16428@pd.tnic>

On Tue, 2014-05-20 at 19:33 +0200, Borislav Petkov wrote:
> On Tue, May 20, 2014 at 10:11:25AM +0800, Chen Yucong wrote:
> > mces_seen is a per-CPU variable which should, as far as possible, be
> > accessed only by its own CPU. So the clear operation of mces_seen should
> > also be local to each CPU rather than done by the monarch CPU.
> >
> > Meanwhile, there is also a potential risk that mces_seen will not
> > be cleared if a timeout occurs in mce_end for the monarch CPU. As a
> > result, the stale value of mces_seen will reappear on the next MCE.
> 
> I don't know how many times I have to tell you this already: if we reach
> the timeout, we have a much bigger friggin' problem!

Even if we do not take the timeout into account, we should distribute the
clear operation of mces_seen to each CPU rather than leave it to the
monarch CPU. mce_reign, which is only called by the monarch CPU, can be
used to panic the system as quickly as possible if there is a truly
data-corrupting error. But the monarch CPU doesn't have to help all the
other CPUs clean mces_seen. One advantage of per-CPU data is that it
isolates error propagation. That being so, why don't we clear mces_seen
on each CPU?
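
To make the difference concrete, here is a minimal user-space sketch of the
two clearing schemes. It is not the kernel code: the array, the struct and
all function names below are hypothetical stand-ins for the per-CPU
mces_seen pointer, and it assumes each CPU clears its own slot only after
the monarch has finished reading it.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* illustrative CPU count, not a kernel constant */

/* Stand-in for the record a CPU publishes; not the kernel's struct mce. */
struct mce_record { int severity; };

/* Models the per-CPU mces_seen pointer as one slot per CPU. */
static struct mce_record *mces_seen[NR_CPUS];

/* Each CPU publishes the worst error it saw while scanning its banks. */
static void cpu_record(int cpu, struct mce_record *m)
{
	mces_seen[cpu] = m;
}

/* Monarch-side clearing: every slot is reset, but only if the monarch
 * actually reaches the cleanup. A timeout is modelled as an early return. */
static void monarch_clear_all(bool monarch_timed_out)
{
	if (monarch_timed_out)
		return;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		mces_seen[cpu] = NULL;
}

/* Per-CPU clearing: each CPU resets only its own slot on its way out. */
static void cpu_clear_own(int cpu)
{
	mces_seen[cpu] = NULL;
}

/* Number of slots that would leak into the next machine check. */
static int stale_slots(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mces_seen[cpu] != NULL)
			n++;
	return n;
}
```

In this model, calling monarch_clear_all(true) after every CPU has recorded
an error leaves all NR_CPUS slots stale, while having each CPU call
cpu_clear_own() leaves none, whether or not the monarch timed out.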

You say, "you need to do the cleaning in mce_reign because the monarch
cpu has to run last after all other cpus have scanned their mce banks."
But this is not an adequate explanation.

thx!
cyc

> 
> What you could do instead is make the machine panic in the tolerant==1,
> i.e., the default case, in mce_timed_out().
> 
> Basically, in the case any core is stuck and we reach a timeout, we want
> to panic the whole box immediately. There's very little chance we can
> recover so panic is the only sane thing left to do.
> 
> Ok?
> 
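
For reference, the suggestion quoted above could be modelled roughly as
follows. This is only a sketch of the decision, not the kernel's
mce_timed_out(): the enum, the function name and the panic_threshold
parameter are hypothetical. It assumes that panicking only for tolerant < 1
corresponds to a threshold of 0, and that the suggested change corresponds
to a threshold of 1.

```c
/* Possible outcomes of one spin iteration while waiting for other CPUs. */
enum timeout_action { KEEP_WAITING, GIVE_UP, PANIC };

/*
 * remaining_ns:    time budget left for the rendezvous, decremented here
 * spin_ns:         cost of one spin iteration
 * tolerant:        the configured tolerance level
 * panic_threshold: highest tolerance level that still panics on timeout
 *                  (hypothetical knob, used only to compare both policies)
 */
static enum timeout_action on_spin(long long *remaining_ns, long long spin_ns,
				   int tolerant, int panic_threshold)
{
	if (*remaining_ns < spin_ns) {
		/* Budget exhausted: some CPU never showed up. */
		if (tolerant <= panic_threshold)
			return PANIC;
		return GIVE_UP;
	}

	*remaining_ns -= spin_ns;
	return KEEP_WAITING;
}
```

With tolerant == 1, a threshold of 0 yields GIVE_UP once the budget runs
out, whereas a threshold of 1 yields PANIC, which is the behaviour the
quoted suggestion asks for in the default case.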


