linux-acpi.vger.kernel.org archive mirror
From: "Chen, Gong" <gong.chen@linux.intel.com>
To: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: tony.luck@intel.com, bp@alien8.de, linux-acpi@vger.kernel.org
Subject: Re: [PATCH v2 1/2] ACPI, APEI, GHES: Remove strict check for memory error handling
Date: Tue, 26 Nov 2013 04:31:36 -0500
Message-ID: <20131126093136.GA27271@gchen.bj.intel.com>
In-Reply-To: <529463BD.3070305@linux.vnet.ibm.com>

On Tue, Nov 26, 2013 at 02:32:53PM +0530, Naveen N. Rao wrote:
> 
> On 11/25/2013 12:45 PM, Chen, Gong wrote:
> >Usually SCI is employed to handle corrected errors, especially
> >corrected memory errors, but in fact SCI can still be used
> >to handle any error, such as an uncorrected memory error or even
> >a fatal error, if the BIOS enables it. Errors reported this way
> >should be logged, too.
> >
> >v1 -> v2: make the event record more precise
> >
> >Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> >---
> >  arch/x86/kernel/cpu/mcheck/mce-apei.c | 10 +++++++---
> >  drivers/acpi/apei/ghes.c              |  3 +--
> >  2 files changed, 8 insertions(+), 5 deletions(-)
> >
> >diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> >index de8b60a..d137ab8 100644
> >--- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
> >+++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> >@@ -33,6 +33,7 @@
> >  #include <linux/acpi.h>
> >  #include <linux/cper.h>
> >  #include <acpi/apei.h>
> >+#include <acpi/ghes.h>
> >  #include <asm/mce.h>
> >
> >  #include "mce-internal.h"
> >@@ -41,14 +42,17 @@ void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> >  {
> >  	struct mce m;
> >
> >-	/* Only corrected MC is reported */
> >-	if (!corrected || !(mem_err->validation_bits & CPER_MEM_VALID_PA))
> >+	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
> >  		return;
> >
> >  	mce_setup(&m);
> >  	m.bank = 1;
> >-	/* Fake a memory read corrected error with unknown channel */
> >+	/* Fake a memory read error with unknown channel */
> >  	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
> >+	if (corrected >= GHES_SEV_RECOVERABLE)
> >+		m.status |= MCI_STATUS_UC;
> >+	if (corrected >= GHES_SEV_PANIC)
> >+		m.status |= MCI_STATUS_PCC;
> 
> Hmm... so you only fill in the most basic information from the CPER
> record. In the absence of the 'S' and 'AR' bits, I am not sure how
> useful this is, except for logging the error through /dev/mcelog for
> legacy users. If that is the intent, you have my
> 
> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> 
> 
> - Naveen
> 

Thanks for your ACK. We would like to record more information, but
UEFI/CPER is essentially unrelated to MCE, so we can't derive
everything needed to construct a full MCE record. IOW, we can only
fill in the most valuable information, such as the physical address,
and fake the other fields. From this point of view, this method of
reporting H/W error events is still not perfect.
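
To make the mapping concrete, here is a minimal user-space sketch of how
the fake status word is assembled after this patch. It is an illustration
only, not the kernel code: fake_mci_status() is a hypothetical helper, and
the MCI_STATUS_*, GHES_SEV_* and CPER_MEM_VALID_PA values are copied in by
hand on the assumption that they match the kernel headers of that time.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's bit definitions; redefined here only so
 * the sketch builds outside the kernel tree. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* record is valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* address field is valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */

#define GHES_SEV_NO          0
#define GHES_SEV_CORRECTED   1
#define GHES_SEV_RECOVERABLE 2
#define GHES_SEV_PANIC       3

#define CPER_MEM_VALID_PA 0x0002       /* physical address is present */

/*
 * Hypothetical helper (not part of the patch): build the fake MCi_STATUS
 * value from a GHES severity and the CPER validation bits. Returns 0 when
 * the record has no valid physical address, which is the case where the
 * patched function returns without logging anything.
 */
static uint64_t fake_mci_status(int sev, uint64_t validation_bits)
{
	uint64_t status;

	if (!(validation_bits & CPER_MEM_VALID_PA))
		return 0;

	/* 0x9f: memory read error with unknown channel */
	status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;

	if (sev >= GHES_SEV_RECOVERABLE)
		status |= MCI_STATUS_UC;
	if (sev >= GHES_SEV_PANIC)
		status |= MCI_STATUS_PCC;

	return status;
}
```

With GHES_SEV_PANIC and a valid physical address, the result carries VAL,
EN, ADDRV, UC and PCC. Nothing in this path sets the 'S' or 'AR' bits,
since the CPER record gives no basis for them.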
