From: Chen Gong <gong.chen@linux.intel.com>
To: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: tony.luck@intel.com, bp@alien8.de, joe@perches.com,
naveen.n.rao@linux.vnet.ibm.com, arozansk@redhat.com,
linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 6/9] ACPI, APEI, CPER: Add UEFI 2.4 support for memory error
Date: Thu, 17 Oct 2013 08:16:49 -0400 [thread overview]
Message-ID: <20131017121649.GA8701@gchen.bj.intel.com> (raw)
In-Reply-To: <20131017072306.5839d500@samsung.com>
[-- Attachment #1: Type: text/plain, Size: 7609 bytes --]
On Thu, Oct 17, 2013 at 07:23:06AM -0300, Mauro Carvalho Chehab wrote:
> Date: Thu, 17 Oct 2013 07:23:06 -0300
> From: Mauro Carvalho Chehab <m.chehab@samsung.com>
> To: "Chen, Gong" <gong.chen@linux.intel.com>
> Cc: tony.luck@intel.com, bp@alien8.de, joe@perches.com,
> naveen.n.rao@linux.vnet.ibm.com, arozansk@redhat.com,
> linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2 6/9] ACPI, APEI, CPER: Add UEFI 2.4 support for
> memory error
> X-Mailer: Claws Mail 3.9.2 (GTK+ 2.24.19; x86_64-redhat-linux-gnu)
>
> Em Wed, 16 Oct 2013 10:56:03 -0400
> "Chen, Gong" <gong.chen@linux.intel.com> escreveu:
>
> > In latest UEFI spec(by now it is 2.4) memory error definition
> > for CPER (UEFI 2.4 Appendix N Common Platform Error Record)
> > adds some new fields. These fields help people to locate
> > memory error on actual DIMM location.
> >
> > Original-author: Tony Luck <tony.luck@intel.com>
> > Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> > Reviewed-by: Borislav Petkov <bp@suse.de>
> > ---
> > arch/x86/kernel/cpu/mcheck/mce-apei.c | 3 +--
> > drivers/acpi/apei/cper.c | 7 ++++---
> > drivers/acpi/apei/ghes.c | 4 ++--
> > drivers/edac/ghes_edac.c | 5 ++---
> > include/linux/cper.h | 11 +++++++++--
> > 5 files changed, 18 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/kernel/cpu/mcheck/mce-apei.c b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > index cd8b166..de8b60a 100644
> > --- a/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > +++ b/arch/x86/kernel/cpu/mcheck/mce-apei.c
> > @@ -42,8 +42,7 @@ void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
> > struct mce m;
> >
> > /* Only corrected MC is reported */
> > - if (!corrected || !(mem_err->validation_bits &
> > - CPER_MEM_VALID_PHYSICAL_ADDRESS))
> > + if (!corrected || !(mem_err->validation_bits & CPER_MEM_VALID_PA))
> > return;
> >
> > mce_setup(&m);
> > diff --git a/drivers/acpi/apei/cper.c b/drivers/acpi/apei/cper.c
> > index eb5f6d6..946ef52 100644
> > --- a/drivers/acpi/apei/cper.c
> > +++ b/drivers/acpi/apei/cper.c
> > @@ -8,7 +8,7 @@
> > * various tables, such as ERST, BERT and HEST etc.
> > *
> > * For more information about CPER, please refer to Appendix N of UEFI
> > - * Specification version 2.3.
> > + * Specification version 2.4.
> > *
> > * This program is free software; you can redistribute it and/or
> > * modify it under the terms of the GNU General Public License version
> > @@ -191,16 +191,17 @@ static const char *cper_mem_err_type_strs[] = {
> > "memory sparing",
> > "scrub corrected error",
> > "scrub uncorrected error",
> > + "physical memory map-out event",
> > };
> >
> > static void cper_print_mem(const char *pfx, const struct cper_sec_mem_err *mem)
> > {
> > if (mem->validation_bits & CPER_MEM_VALID_ERROR_STATUS)
> > printk("%s""error_status: 0x%016llx\n", pfx, mem->error_status);
> > - if (mem->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS)
> > + if (mem->validation_bits & CPER_MEM_VALID_PA)
> > printk("%s""physical_address: 0x%016llx\n",
> > pfx, mem->physical_addr);
> > - if (mem->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS_MASK)
> > + if (mem->validation_bits & CPER_MEM_VALID_PA_MASK)
> > printk("%s""physical_address_mask: 0x%016llx\n",
> > pfx, mem->physical_addr_mask);
> > if (mem->validation_bits & CPER_MEM_VALID_NODE)
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > index 0db6e4f..a30bc31 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -419,7 +419,7 @@ static void ghes_handle_memory_failure(struct acpi_generic_data *gdata, int sev)
> >
> > if (sec_sev == GHES_SEV_CORRECTED &&
> > (gdata->flags & CPER_SEC_ERROR_THRESHOLD_EXCEEDED) &&
> > - (mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS)) {
> > + (mem_err->validation_bits & CPER_MEM_VALID_PA)) {
> > pfn = mem_err->physical_addr >> PAGE_SHIFT;
> > if (pfn_valid(pfn))
> > memory_failure_queue(pfn, 0, MF_SOFT_OFFLINE);
> > @@ -430,7 +430,7 @@ static void ghes_handle_memory_failure(struct acpi_generic_data *gdata, int sev)
> > }
> > if (sev == GHES_SEV_RECOVERABLE &&
> > sec_sev == GHES_SEV_RECOVERABLE &&
> > - mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS) {
> > + mem_err->validation_bits & CPER_MEM_VALID_PA) {
> > pfn = mem_err->physical_addr >> PAGE_SHIFT;
> > memory_failure_queue(pfn, 0, 0);
> > }
> > diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
> > index bb53467..0ad797b 100644
> > --- a/drivers/edac/ghes_edac.c
> > +++ b/drivers/edac/ghes_edac.c
> > @@ -297,15 +297,14 @@ void ghes_edac_report_mem_error(struct ghes *ghes, int sev,
> > }
> >
> > /* Error address */
> > - if (mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS) {
> > + if (mem_err->validation_bits & CPER_MEM_VALID_PA) {
> > e->page_frame_number = mem_err->physical_addr >> PAGE_SHIFT;
> > e->offset_in_page = mem_err->physical_addr & ~PAGE_MASK;
> > }
> >
> > /* Error grain */
> > - if (mem_err->validation_bits & CPER_MEM_VALID_PHYSICAL_ADDRESS_MASK) {
> > + if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK)
> > e->grain = ~(mem_err->physical_addr_mask & ~PAGE_MASK);
> > - }
> >
> > /* Memory error location, mapped on e->location */
> > p = e->location;
> > diff --git a/include/linux/cper.h b/include/linux/cper.h
> > index 09ebe21..2fc0ec3 100644
> > --- a/include/linux/cper.h
> > +++ b/include/linux/cper.h
> > @@ -218,8 +218,8 @@ enum {
> > #define CPER_PROC_VALID_IP 0x1000
> >
> > #define CPER_MEM_VALID_ERROR_STATUS 0x0001
> > -#define CPER_MEM_VALID_PHYSICAL_ADDRESS 0x0002
> > -#define CPER_MEM_VALID_PHYSICAL_ADDRESS_MASK 0x0004
> > +#define CPER_MEM_VALID_PA 0x0002
> > +#define CPER_MEM_VALID_PA_MASK 0x0004
> > #define CPER_MEM_VALID_NODE 0x0008
> > #define CPER_MEM_VALID_CARD 0x0010
> > #define CPER_MEM_VALID_MODULE 0x0020
> > @@ -232,6 +232,9 @@ enum {
> > #define CPER_MEM_VALID_RESPONDER_ID 0x1000
> > #define CPER_MEM_VALID_TARGET_ID 0x2000
> > #define CPER_MEM_VALID_ERROR_TYPE 0x4000
> > +#define CPER_MEM_VALID_RANK_NUMBER 0x8000
> > +#define CPER_MEM_VALID_CARD_HANDLE 0x10000
> > +#define CPER_MEM_VALID_MODULE_HANDLE 0x20000
> >
> > #define CPER_PCIE_VALID_PORT_TYPE 0x0001
> > #define CPER_PCIE_VALID_VERSION 0x0002
> > @@ -347,6 +350,10 @@ struct cper_sec_mem_err {
> > __u64 responder_id;
> > __u64 target_id;
> > __u8 error_type;
> > + __u8 reserved;
> > + __u16 rank;
> > + __u16 mem_array_handle; /* card handle in UEFI 2.4 */
> > + __u16 mem_dev_handle; /* module handle in UEFI 2.4 */
>
> Hmm... you're adding 3 new types here and the corresponding space inside the
> structure (rank, card_handle and module_handle), but the code that parses and
> prints it is missing, at apei_mce_report_mem_error(), cper_print_mem(),
> ghes_handle_memory_failure() and ghes_edac_report_mem_error().
>
>
1. This patch is just for definition update.
2. apei_mce_report_mem_error/cper_print_mem/ghes_handle_memory_failure
finally point to apei/cper. So patch [8/9] can cover it. As for
EDAC part (ghes_edac_report_mem_error), I can add a new separate
patch to fix missed part.
> > };
> >
> > struct cper_sec_pcie {
>
>
> --
>
> Cheers,
> Mauro
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
next prev parent reply other threads:[~2013-10-17 12:31 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-16 14:55 [PATCH v2 0/9] Extended H/W error log driver Chen, Gong
2013-10-16 14:55 ` [PATCH v2 1/9] ACPI, APEI, CPER: Fix status check during error printing Chen, Gong
2013-10-16 16:53 ` Mauro Carvalho Chehab
2013-10-16 14:55 ` [PATCH v2 2/9] ACPI, CPER: Update cper info Chen, Gong
2013-10-16 16:28 ` Borislav Petkov
2013-10-16 16:52 ` Mauro Carvalho Chehab
2013-10-16 14:56 ` [PATCH v2 3/9] bitops: Introduce a more generic BITMASK macro Chen, Gong
2013-10-16 16:41 ` Borislav Petkov
2013-10-16 17:02 ` Mauro Carvalho Chehab
2013-10-17 2:31 ` Chen Gong
2013-10-17 2:59 ` Joe Perches
2013-10-17 6:30 ` Chen Gong
2013-10-17 6:58 ` Joe Perches
2013-10-17 7:38 ` Chen Gong
2013-10-17 8:32 ` Joe Perches
2013-10-17 8:40 ` Borislav Petkov
2013-10-17 8:55 ` Joe Perches
2013-10-17 16:10 ` Tony Luck
2013-10-17 18:13 ` Joe Perches
2013-10-16 14:56 ` [PATCH v2 4/9] ACPI, x86: Extended error log driver for x86 platform Chen, Gong
2013-10-16 17:02 ` Borislav Petkov
2013-10-16 14:56 ` [PATCH v2 5/9] DMI: Parse memory device (type 17) in SMBIOS Chen, Gong
2013-10-16 17:05 ` Borislav Petkov
2013-10-17 10:14 ` Mauro Carvalho Chehab
2013-10-16 14:56 ` [PATCH v2 6/9] ACPI, APEI, CPER: Add UEFI 2.4 support for memory error Chen, Gong
2013-10-16 16:43 ` Mauro Carvalho Chehab
2013-10-17 10:23 ` Mauro Carvalho Chehab
2013-10-17 12:16 ` Chen Gong [this message]
2013-10-17 12:23 ` Naveen N. Rao
2013-10-16 14:56 ` [PATCH v2 7/9] ACPI, APEI, CPER: Enhance memory reporting capability Chen, Gong
2013-10-16 17:11 ` Borislav Petkov
2013-10-17 10:24 ` Mauro Carvalho Chehab
2013-10-16 14:56 ` [PATCH v2 8/9] ACPI, APEI, CPER: Cleanup CPER memory error output format Chen, Gong
2013-10-16 17:24 ` Borislav Petkov
2013-10-17 10:27 ` Mauro Carvalho Chehab
2013-10-16 14:56 ` [PATCH v2 9/9] ACPI / trace: Add trace interface for eMCA driver Chen, Gong
2013-10-16 15:50 ` Mauro Carvalho Chehab
2013-10-16 17:29 ` Borislav Petkov
2013-10-16 15:06 ` [PATCH v2 0/9] Extended H/W error log driver Chen Gong
2013-10-16 16:05 ` Borislav Petkov
2013-10-16 16:49 ` Joe Perches
2013-10-16 16:56 ` Steven Rostedt
2013-10-16 18:00 ` Borislav Petkov
2013-10-16 18:11 ` Borislav Petkov
2013-10-17 14:33 ` Chen Gong
2013-10-17 15:25 ` Steven Rostedt
2013-10-17 15:35 ` Borislav Petkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131017121649.GA8701@gchen.bj.intel.com \
--to=gong.chen@linux.intel.com \
--cc=arozansk@redhat.com \
--cc=bp@alien8.de \
--cc=joe@perches.com \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=m.chehab@samsung.com \
--cc=naveen.n.rao@linux.vnet.ibm.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox