From: "Chen, Gong" <gong.chen@linux.intel.com>
To: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
tony.luck@intel.com, bp@alien8.de, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v2 1/2] ACPI, APEI, GHES: Remove strict check for memory error handling
Date: Sat, 14 Dec 2013 08:42:56 -0500
Message-ID: <20131214134256.GC2823@gchen.bj.intel.com>
In-Reply-To: <20131126093136.GA27271@gchen.bj.intel.com>
On Tue, Nov 26, 2013 at 04:31:36AM -0500, Chen, Gong wrote:
> Date: Tue, 26 Nov 2013 04:31:36 -0500
> From: "Chen, Gong" <gong.chen@linux.intel.com>
> To: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> Cc: tony.luck@intel.com, bp@alien8.de, linux-acpi@vger.kernel.org
> Subject: Re: [PATCH v2 1/2] ACPI, APEI, GHES: Remove strict check for
> memory error handling
> User-Agent: Mutt/1.5.21 (2010-09-15)
>
> On Tue, Nov 26, 2013 at 02:32:53PM +0530, Naveen N. Rao wrote:
> > Date: Tue, 26 Nov 2013 14:32:53 +0530
> > From: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> > To: "Chen, Gong" <gong.chen@linux.intel.com>, tony.luck@intel.com,
> > bp@alien8.de
> > CC: linux-acpi@vger.kernel.org
> > Subject: Re: [PATCH v2 1/2] ACPI, APEI, GHES: Remove strict check for
> > memory error handling
> > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101
> > Thunderbird/24.1.0
> >
> > On 11/25/2013 12:45 PM, Chen, Gong wrote:
> > >Usually SCI is employed to handle corrected errors, especially
> > >corrected memory errors, but in fact SCI can also be used to
> > >handle any error, such as an uncorrected memory error or even a
> > >fatal error, if the BIOS enables it. Such errors should be
> > >logged, too.
> > >
> > >v1 -> v2: make the event record more precise
> > >
> > >Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> > >---
> > > arch/x86/kernel/cpu/mcheck/mce-apei.c | 10 +++++++---
> > > drivers/acpi/apei/ghes.c | 3 +--
> > > 2 files changed, 8 insertions(+), 5 deletions(-)
> > >
> > >diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > >index de8b60a..d137ab8 100644
> > >--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > >+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > >@@ -33,6 +33,7 @@
> > > #include <linux/acpi.h>
> > > #include <linux/cper.h>
> > > #include <acpi/apei.h>
> > >+#include <acpi/ghes.h>
> > > #include <asm/mce.h>
> > >
> > > #include "mce-internal.h"
> > >@@ -41,14 +42,17 @@ void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> > > {
> > > 	struct mce m;
> > >
> > >-	/* Only corrected MC is reported */
> > >-	if (!corrected || !(mem_err->validation_bits & CPER_MEM_VALID_PA))
> > >+	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> > > 		return;
> > >
> > > 	mce_setup(&m);
> > > 	m.bank = 1;
> > >-	/* Fake a memory read corrected error with unknown channel */
> > >+	/* Fake a memory read error with unknown channel */
> > > 	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
> > >+	if (corrected >= GHES_SEV_RECOVERABLE)
> > >+		m.status |= MCI_STATUS_UC;
> > >+	if (corrected >= GHES_SEV_PANIC)
> > >+		m.status |= MCI_STATUS_PCC;
> >
> > Hmm... so you only fill in the most basic information from the CPER
> > record. In the absence of the 'S' and 'AR' bits, I am not sure how
> > useful this is, except for logging the error through /dev/mcelog
> > for legacy users. If that is the intent, you have my
> >
> > Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> >
> >
> > - Naveen
> >
>
> Thanks for your ACK. We would like to record more information, but
> UEFI/CPER is essentially unrelated to MCE, so we can't derive all
> the information needed to construct an MCE record. IOW, we can only
> fill in the most valuable information, such as the physical address,
> and fake the other fields. From this point of view, this kind of H/W
> error event reporting is still not perfect.
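
To make the "fake other fields" point concrete, here is a minimal
user-space sketch of how the patch derives the status word from the
GHES severity. It is an illustration only, not the kernel code: the
helper name fake_mem_status() is hypothetical, the parameter is called
severity here although the patch still names it corrected, and the
constant values are copied from the kernel headers as I recall them,
so please treat them as assumptions.

```c
#include <stdint.h>

/* Bit positions of IA32_MCi_STATUS flags (assumed to match asm/mce.h). */
#define MCI_STATUS_VAL   (1ULL << 63)  /* record is valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* address field is valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

/* GHES severities (assumed to match acpi/ghes.h). */
enum {
	GHES_SEV_NO = 0,
	GHES_SEV_CORRECTED = 1,
	GHES_SEV_RECOVERABLE = 2,
	GHES_SEV_PANIC = 3,
};

/* CPER memory section: physical address field is valid (assumed value). */
#define CPER_MEM_VALID_PA 0x0002ULL

/*
 * Hypothetical helper: build the faked status word the same way the
 * patch does. Returns 0 when the CPER record carries no physical
 * address, which corresponds to the early return in the patch.
 */
static uint64_t fake_mem_status(int severity, uint64_t validation_bits)
{
	uint64_t status;

	if (!(validation_bits & CPER_MEM_VALID_PA))
		return 0;

	/* Memory read error with unknown channel (error code 0x9f). */
	status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;

	if (severity >= GHES_SEV_RECOVERABLE)
		status |= MCI_STATUS_UC;
	if (severity >= GHES_SEV_PANIC)
		status |= MCI_STATUS_PCC;

	return status;
}
```

Only the severity and the validity of the physical address feed into
the result. Nothing in the CPER record maps onto the 'S' or 'AR' bits,
which is why they stay clear in the faked record.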
Hi Boris,

Will you pick up this patch in your RAS pull request?