linux-kernel.vger.kernel.org archive mirror
From: Chen Yucong <slaoub@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-edac <linux-edac@vger.kernel.org>, X86 ML <x86@kernel.org>,
	Tony Luck <tony.luck@intel.com>
Subject: Re: [RFC PATCH 0/3] RAS: Correctable Errors Collector thing
Date: Wed, 28 May 2014 10:49:21 +0800
Message-ID: <1401245361.5049.6.camel@cyc>
In-Reply-To: <1401197235-13440-1-git-send-email-bp@alien8.de>


> From: Borislav Petkov <bp@suse.de>
> 
> Hi all,
> 
> this is something Tony and I have been working on behind the curtains
> recently. Here it is in a RFC form, it passes quick testing in kvm. Let
> me send it out before I start hammering on it on a real machine.
> 
> More indepth info about what it is and what it does is in patch 1/3.
> 
> As always, comments and suggestions are most welcome.
> 
> Thanks.

What's the point of this patch set?
My understanding is that once a specific page frame has seen some number
(COUNT_MASK) of corrected DRAM ECC errors, we can conclude that the page
frame is so ill that it should be isolated as soon as possible.
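
To make that reading concrete, here is a minimal userspace sketch of
per-page-frame error counting with a threshold. It is only an illustration
of the idea as I understand it, not the code from the patch set: the names
ce_entry, ce_table, ce_record, CE_SLOTS and CE_THRESHOLD are all
hypothetical, and CE_THRESHOLD merely stands in for whatever limit
COUNT_MASK implies.

```c
/* Illustrative userspace sketch only; NOT the code from the patch set. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CE_SLOTS      64  /* hypothetical table size */
#define CE_THRESHOLD  4   /* hypothetical stand-in for the COUNT_MASK limit */

struct ce_entry {
	uint64_t pfn;       /* page frame number that reported errors */
	unsigned int count; /* corrected errors seen so far on this frame */
	bool used;          /* slot currently tracks a frame */
};

static struct ce_entry ce_table[CE_SLOTS];

/*
 * Record one corrected error for @pfn.
 * Returns true when the frame has reached CE_THRESHOLD errors, i.e. when
 * the caller should try to isolate it.
 */
static bool ce_record(uint64_t pfn)
{
	struct ce_entry *free_slot = NULL;

	for (size_t i = 0; i < CE_SLOTS; i++) {
		struct ce_entry *e = &ce_table[i];

		if (e->used && e->pfn == pfn) {
			e->count++;
			if (e->count >= CE_THRESHOLD) {
				e->used = false; /* stop tracking once handed off */
				return true;
			}
			return false;
		}
		if (!e->used && !free_slot)
			free_slot = e; /* remember the first free slot */
	}

	if (free_slot) {
		free_slot->pfn = pfn;
		free_slot->count = 1;
		free_slot->used = true;
	}
	/* If the table is full, this sketch simply drops the event. */
	return false;
}
```

With these assumed values, the fourth call to ce_record() for the same pfn
returns true. What the caller should then do with the frame is exactly the
question below.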

The problem is that memory_failure() cannot be used to isolate a page
frame that is in use by the kernel, because it just poisons the page and
the result is IGNORED. memory_failure() is mostly used for handling
AR/AO type errors on page frames that userspace tasks are currently
using.

Although the page frame in question is very ill, it is not dead and can
still work. However, memory_failure() may kill the userspace tasks using
it, especially when the page frame holds dynamic data rather than
file-backed (file/swap) data.

So I do not think it is a good idea to use memory_failure() directly
in this patch set.

thx!
cyc



Thread overview: 10+ messages
2014-05-27 13:27 [RFC PATCH 0/3] RAS: Correctable Errors Collector thing Borislav Petkov
2014-05-27 13:27 ` [RFC PATCH 1/3] MCE, CE: Corrected errors collecting thing Borislav Petkov
2014-05-27 13:27 ` [RFC PATCH 2/3] MCE, CE: Wire in the CE collector Borislav Petkov
2014-05-27 17:48   ` Tony Luck
2014-05-27 18:48     ` Borislav Petkov
2014-05-27 13:27 ` [RFC PATCH 3/3] MCE, CE: Add debugging glue Borislav Petkov
2014-05-28  2:49 ` Chen Yucong [this message]
2014-05-28 16:53   ` [RFC PATCH 0/3] RAS: Correctable Errors Collector thing Max Asbock
2014-05-28 17:21     ` Luck, Tony
2014-05-28 17:23     ` Borislav Petkov
