linux-acpi.vger.kernel.org archive mirror
* [PATCH V2 0/2] process unrecognized CPER error section
@ 2015-09-08 21:29 Jonathan (Zhixiong) Zhang
       [not found] ` <1441747761-12012-1-git-send-email-zjzhang-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2015-09-08 21:29 ` [PATCH V2 2/2] ras: acpi / apei: generate trace event for " Jonathan (Zhixiong) Zhang
  0 siblings, 2 replies; 6+ messages in thread
From: Jonathan (Zhixiong) Zhang @ 2015-09-08 21:29 UTC (permalink / raw)
  To: Matt Fleming, tony.luck, fu.wei, al.stone, rjw, mchehab, mingo,
	bp, gong.chen
  Cc: Jonathan (Zhixiong) Zhang, linux-efi, linux-kernel, linaro-acpi,
	vgandhi, linux-acpi, timur

From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org>

Currently the kernel ignores CPER records whose section type it does
not recognize. However, the UEFI spec allows non-standard (e.g. vendor
proprietary) error section types in a CPER (Common Platform Error
Record), as defined in section N.2.3 of UEFI version 2.5. As a result,
the user cannot see hardware error data carried in a non-standard
section.

If the Section Type field of a Generic Error Data Entry is
unrecognized, print the raw data to the dmesg buffer and also add a
tracepoint for reporting such hardware errors.

V2:
1. Handle all unrecognized CPER records instead of matching with
section type that is known to be vendor proprietary. (Borislav)

Jonathan (Zhixiong) Zhang (2):
  efi: print unrecognized CPER section
  ras: acpi/apei: generate trace event for unrecognized CPER section

 drivers/acpi/apei/ghes.c    | 23 +++++++++++++++++++++--
 drivers/firmware/efi/cper.c | 39 +++++++++++++++++++++++++++++++--------
 drivers/ras/ras.c           |  1 +
 include/ras/ras_event.h     | 45 +++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 98 insertions(+), 10 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

* [PATCH V2 1/2] efi: print unrecognized CPER section
       [not found] ` <1441747761-12012-1-git-send-email-zjzhang-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2015-09-08 21:29   ` Jonathan (Zhixiong) Zhang
  2015-09-08 21:56     ` Timur Tabi
  2015-09-10 18:27     ` Borislav Petkov
  0 siblings, 2 replies; 6+ messages in thread
From: Jonathan (Zhixiong) Zhang @ 2015-09-08 21:29 UTC (permalink / raw)
  To: Matt Fleming, tony.luck, fu.wei, al.stone, rjw, mchehab, mingo,
	bp, gong.chen
  Cc: Jonathan (Zhixiong) Zhang, linux-efi, linux-kernel, linaro-acpi,
	vgandhi, linux-acpi, timur

From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org>

UEFI spec allows for non-standard section in Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently if the CPER section's type (UUID) does not match with
one of the section types that the kernel knows how to parse, the
section is skipped. Therefore, user is not able to see
such CPER data, for instace, error record of non-standard section.

For above mentioned case, this change prints out the raw data in
hex in dmesg buffer. Data length is taken from Error Data length
field of Generic Error Data Entry.

Following is a sample output from dmesg:
[  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
[  115.787456] {1}[Hardware Error]: event severity: corrected
[  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
[  115.798415] {1}[Hardware Error]:  fru_id: 00000000-0000-0000-0000-000000000000
[  115.805596] {1}[Hardware Error]:  fru_text:
[  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
[  115.823880] {1}[Hardware Error]:  section length: 88
[  115.828779] {1}[Hardware Error]:   00000000: 01000001 00000002 5f434345 525f4543
[  115.836153] {1}[Hardware Error]:   00000010: 0000574d 00000000 00000000 00000000
[  115.843531] {1}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
[  115.850908] {1}[Hardware Error]:   00000030: 00000000 00000000 00000000 00000000
[  115.858288] {1}[Hardware Error]:   00000040: fe800000 00000000 00000004 5f434345
[  115.865665] {1}[Hardware Error]:   00000050: 525f4543 0000574d
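
The row layout in the sample above (an offset, then 32-bit groups, 16
bytes per row) can be imitated in userspace. What follows is only a
minimal sketch and not the kernel's print_hex_dump(): the names
format_hex_row, dump_section, ROW_BYTES and GROUP_BYTES are hypothetical,
groups are printed in host byte order, and the length is assumed to be a
multiple of 4, as the 88-byte section above is.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ROW_BYTES   16	/* bytes per output row (hypothetical name) */
#define GROUP_BYTES 4	/* bytes per printed group (hypothetical name) */

/*
 * Format one row of a hex dump into out: an 8-digit hex offset, a colon,
 * then one 32-bit group per GROUP_BYTES of input, in host byte order.
 * row_len must be a non-zero multiple of GROUP_BYTES, at most ROW_BYTES.
 * Returns the number of characters written (without the NUL), or -1 on
 * bad input or when out is too small.
 */
static int format_hex_row(const unsigned char *buf, size_t offset,
			  size_t row_len, char *out, size_t out_size)
{
	size_t used, i;
	int n;

	if (row_len == 0 || row_len > ROW_BYTES || row_len % GROUP_BYTES)
		return -1;

	n = snprintf(out, out_size, "%08zx:", offset);
	if (n < 0 || (size_t)n >= out_size)
		return -1;
	used = (size_t)n;

	for (i = 0; i < row_len; i += GROUP_BYTES) {
		uint32_t group;

		/* memcpy avoids unaligned access on the raw byte buffer */
		memcpy(&group, buf + offset + i, sizeof(group));
		n = snprintf(out + used, out_size - used, " %08x",
			     (unsigned int)group);
		if (n < 0 || (size_t)n >= out_size - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}

/*
 * Print len bytes of buf to stdout, one row per line, each line prefixed
 * with pfx. Returns the number of rows printed, or -1 on error.
 */
static int dump_section(const char *pfx, const unsigned char *buf, size_t len)
{
	char row[64];
	size_t offset;
	int rows = 0;

	for (offset = 0; offset < len; offset += ROW_BYTES) {
		size_t left = len - offset;
		size_t row_len = left < ROW_BYTES ? left : ROW_BYTES;

		if (format_hex_row(buf, offset, row_len, row, sizeof(row)) < 0)
			return -1;
		printf("%s%s\n", pfx, row);
		rows++;
	}
	return rows;
}
```

Calling dump_section() on an 88-byte buffer yields five full rows and one
row with two groups, which matches the shape of the dmesg output above.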

Change-Id: I663a6e3ae6dcf68e4e389f76d555e9106ffee165
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
---
 drivers/firmware/efi/cper.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index d42537425438..8a58b2927408 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -32,12 +32,31 @@
 #include <linux/acpi.h>
 #include <linux/pci.h>
 #include <linux/aer.h>
+#include <linux/printk.h>
 
 #define INDENT_SP	" "
 
+#define ROW_SIZE 16
+#define GROUP_SIZE 4
+
 static char rcd_decode_str[CPER_REC_LEN];
 
 /*
+ * cper_print_hex - print hex from a CPER data buffer
+ * @pfx: prefix for each line, including log level and prefix string
+ * @buf: buffer pointer
+ * @len: size of buffer
+ *
+ * print_hex_dump() expects log level and prefix string to be passed
+ * in two different paramters. Internally it concatenates them. In
+ * our case, those two are already concatenated in pfx.
+ */
+#define cper_print_hex(pfx, buf, len)				\
+	print_hex_dump(pfx, "",					\
+		DUMP_PREFIX_OFFSET, ROW_SIZE, GROUP_SIZE,	\
+		buf, len, 0)
+
+/*
  * CPER record ID need to be unique even after reboot, because record
  * ID is used as index for ERST storage, while CPER records from
  * multiple boot may co-exist in ERST.
@@ -392,7 +411,9 @@ static void cper_estatus_print_section(
 	uuid_le *sec_type = (uuid_le *)gdata->section_type;
 	__u16 severity;
 	char newpfx[64];
+	u32 len;
 
+	len = gdata->error_data_length;
 	severity = gdata->error_severity;
 	printk("%s""Error %d, type: %s\n", pfx, sec_no,
 	       cper_severity_str(severity));
@@ -405,28 +426,30 @@ static void cper_estatus_print_section(
 	if (!uuid_le_cmp(*sec_type, CPER_SEC_PROC_GENERIC)) {
 		struct cper_sec_proc_generic *proc_err = (void *)(gdata + 1);
 		printk("%s""section_type: general processor error\n", newpfx);
-		if (gdata->error_data_length >= sizeof(*proc_err))
+		if (len >= sizeof(*proc_err))
 			cper_print_proc_generic(newpfx, proc_err);
 		else
 			goto err_section_too_small;
 	} else if (!uuid_le_cmp(*sec_type, CPER_SEC_PLATFORM_MEM)) {
 		struct cper_sec_mem_err *mem_err = (void *)(gdata + 1);
 		printk("%s""section_type: memory error\n", newpfx);
-		if (gdata->error_data_length >=
-		    sizeof(struct cper_sec_mem_err_old))
-			cper_print_mem(newpfx, mem_err,
-				       gdata->error_data_length);
+		if (len >= sizeof(struct cper_sec_mem_err_old))
+			cper_print_mem(newpfx, mem_err, len);
 		else
 			goto err_section_too_small;
 	} else if (!uuid_le_cmp(*sec_type, CPER_SEC_PCIE)) {
 		struct cper_sec_pcie *pcie = (void *)(gdata + 1);
 		printk("%s""section_type: PCIe error\n", newpfx);
-		if (gdata->error_data_length >= sizeof(*pcie))
+		if (len >= sizeof(*pcie))
 			cper_print_pcie(newpfx, pcie, gdata);
 		else
 			goto err_section_too_small;
-	} else
-		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
+	} else {
+		const void *raw_err = gdata + 1;
+		printk("%ssection type: %pUl\n", pfx, sec_type);
+		printk("%ssection length: %d\n", pfx, len);
+		cper_print_hex(newpfx, raw_err, len);
+	}
 
 	return;
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

* [PATCH V2 2/2] ras: acpi / apei: generate trace event for unrecognized CPER section
  2015-09-08 21:29 [PATCH V2 0/2] process unrecognized CPER error section Jonathan (Zhixiong) Zhang
       [not found] ` <1441747761-12012-1-git-send-email-zjzhang-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2015-09-08 21:29 ` Jonathan (Zhixiong) Zhang
  2015-09-10 18:41   ` Borislav Petkov
  1 sibling, 1 reply; 6+ messages in thread
From: Jonathan (Zhixiong) Zhang @ 2015-09-08 21:29 UTC (permalink / raw)
  To: Matt Fleming, tony.luck, fu.wei, al.stone, rjw, mchehab, mingo,
	bp, gong.chen
  Cc: Jonathan (Zhixiong) Zhang, linux-efi, linux-kernel, linaro-acpi,
	vgandhi, linux-acpi, timur

From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org>

UEFI spec allows for non-standard section in Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently, if the CPER section's type (UUID) does not match any
section type that the kernel knows how to parse, no trace event is
generated for that section. The user therefore has no way to learn
that such a hardware error occurred, including errors reported in
non-standard sections.

This commit generates a trace event that contains the raw error data
of an unrecognized CPER section.
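
The fallback described above amounts to comparing the 16-byte section
type against the known types and routing everything else to a raw
handler. A minimal userspace sketch of that dispatch, assuming
hypothetical names (pick_handler, enum sec_handler) and placeholder
identifiers that are deliberately NOT the real CPER section GUIDs:

```c
#include <string.h>

#define SEC_TYPE_LEN 16	/* a section type is a 16-byte UUID */

enum sec_handler {
	SEC_HANDLER_MEM,	/* parsed as a memory error */
	SEC_HANDLER_PCIE,	/* parsed as a PCIe error */
	SEC_HANDLER_RAW		/* unrecognized: report raw bytes */
};

/* Placeholder values for illustration only, not the real CPER GUIDs. */
static const unsigned char fake_mem_type[SEC_TYPE_LEN]  = { 0x01 };
static const unsigned char fake_pcie_type[SEC_TYPE_LEN] = { 0x02 };

/* Return the handler for a section type; unknown types fall back to raw. */
static enum sec_handler pick_handler(const unsigned char *sec_type)
{
	if (!memcmp(sec_type, fake_mem_type, SEC_TYPE_LEN))
		return SEC_HANDLER_MEM;
	if (!memcmp(sec_type, fake_pcie_type, SEC_TYPE_LEN))
		return SEC_HANDLER_PCIE;
	return SEC_HANDLER_RAW;
}
```

The point of the final return is that no section is silently dropped:
anything the table does not know still produces a report.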

Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
---
 drivers/acpi/apei/ghes.c | 23 +++++++++++++++++++++--
 drivers/ras/ras.c        |  1 +
 include/ras/ras_event.h  | 45 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 23981ac1c6c2..a3aa3b046a37 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -48,6 +48,7 @@
 #include <acpi/ghes.h>
 #include <acpi/apei.h>
 #include <asm/tlbflush.h>
+#include <ras/ras_event.h>
 
 #include "apei-internal.h"
 
@@ -421,11 +422,23 @@ static void ghes_do_proc(struct ghes *ghes,
 {
 	int sev, sec_sev;
 	struct acpi_hest_generic_data *gdata;
+	uuid_le *sec_type;
+	uuid_le *fru_id;
+	char *fru_text = "";
+	void *raw_err;
 
 	sev = ghes_severity(estatus->error_severity);
 	apei_estatus_for_each_section(estatus, gdata) {
 		sec_sev = ghes_severity(gdata->error_severity);
-		if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
+		sec_type = (uuid_le *)gdata->section_type;
+		if (gdata->validation_bits & CPER_SEC_VALID_FRU_ID)
+			fru_id = (uuid_le *)gdata->fru_id;
+		else
+			fru_id = &NULL_UUID_LE;
+		if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT)
+			fru_text = gdata->fru_text;
+
+		if (!uuid_le_cmp(*sec_type,
 				 CPER_SEC_PLATFORM_MEM)) {
 			struct cper_sec_mem_err *mem_err;
 			mem_err = (struct cper_sec_mem_err *)(gdata+1);
@@ -435,7 +448,7 @@ static void ghes_do_proc(struct ghes *ghes,
 			ghes_handle_memory_failure(gdata, sev);
 		}
 #ifdef CONFIG_ACPI_APEI_PCIEAER
-		else if (!uuid_le_cmp(*(uuid_le *)gdata->section_type,
+		else if (!uuid_le_cmp(*sec_type,
 				      CPER_SEC_PCIE)) {
 			struct cper_sec_pcie *pcie_err;
 			pcie_err = (struct cper_sec_pcie *)(gdata+1);
@@ -467,6 +480,12 @@ static void ghes_do_proc(struct ghes *ghes,
 
 		}
 #endif
+		else {
+			raw_err = gdata + 1;
+			trace_raw_event(sec_type,
+					fru_id, fru_text, sec_sev,
+					raw_err, gdata->error_data_length);
+		}
 	}
 }
 
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index b67dd362b7b6..6623ae366df9 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -27,3 +27,4 @@ subsys_initcall(ras_init);
 EXPORT_TRACEPOINT_SYMBOL_GPL(extlog_mem_event);
 #endif
 EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
+EXPORT_TRACEPOINT_SYMBOL_GPL(raw_event);
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index 1443d79e4fe6..fd357e9815f5 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -162,6 +162,51 @@ TRACE_EVENT(mc_event,
 );
 
 /*
+ * Raw Events Report
+ *
+ * This event is generated when hardware detected a hardware
+ * error event, which may be of non-standard section as defined
+ * in UEFI spec appendix "Common Platform Error Record", or may
+ * be of sections for which TRACE_EVENT is not defined.
+ *
+ */
+TRACE_EVENT(raw_event,
+
+	TP_PROTO(const uuid_le *sec_type,
+		 const uuid_le *fru_id,
+		 const char *fru_text,
+		 u8 sev,
+		 const u8 *err,
+		 const u32 len),
+
+	TP_ARGS(sec_type, fru_id, fru_text, sev, err, len),
+
+	TP_STRUCT__entry(
+		__array(char, sec_type, 16)
+		__array(char, fru_id, 16)
+		__string(fru_text, fru_text)
+		__field(u8, sev)
+		__field(u32, len)
+		__dynamic_array(u8, buf, len)
+	),
+
+	TP_fast_assign(
+		memcpy(__entry->sec_type, sec_type, sizeof(uuid_le));
+		memcpy(__entry->fru_id, fru_id, sizeof(uuid_le));
+		__assign_str(fru_text, fru_text);
+		__entry->sev = sev;
+		__entry->len = len;
+		memcpy(__get_dynamic_array(buf), err, len);
+	),
+
+	TP_printk("severity: %d; sec type:%pU; FRU: %pU %s; data len:%d; raw data:%s",
+		  __entry->sev, __entry->sec_type,
+		  __entry->fru_id, __get_str(fru_text),
+		  __entry->len,
+		  __print_hex(__get_dynamic_array(buf), __entry->len))
+);
+
+/*
  * PCIe AER Trace event
  *
  * These events are generated when hardware detects a corrected or
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [PATCH V2 1/2] efi: print unrecognized CPER section
  2015-09-08 21:29   ` [PATCH V2 1/2] efi: print unrecognized CPER section Jonathan (Zhixiong) Zhang
@ 2015-09-08 21:56     ` Timur Tabi
  2015-09-10 18:27     ` Borislav Petkov
  1 sibling, 0 replies; 6+ messages in thread
From: Timur Tabi @ 2015-09-08 21:56 UTC (permalink / raw)
  To: Jonathan (Zhixiong) Zhang, Matt Fleming, tony.luck, fu.wei,
	al.stone, rjw, mchehab, mingo, bp, gong.chen
  Cc: linux-efi, linux-kernel, linaro-acpi, vgandhi, linux-acpi

On 09/08/2015 04:29 PM, Jonathan (Zhixiong) Zhang wrote:

> Change-Id: I663a6e3ae6dcf68e4e389f76d555e9106ffee165

You need to strip out the Change-Id's before posting the patch.

> +#define cper_print_hex(pfx, buf, len)				\
> +	print_hex_dump(pfx, "",					\
> +		DUMP_PREFIX_OFFSET, ROW_SIZE, GROUP_SIZE,	\
> +		buf, len, 0)

		(buf), (len), 0)

is safer
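
To illustrate why parenthesized macro parameters are the safer habit:
when a parameter is passed straight through as a function argument, as
in the macro quoted above, the difference rarely shows, but it does as
soon as the parameter takes part in an expression. A small sketch with
hypothetical names (dump_len, DUMP_HALF_BARE, DUMP_HALF_PAREN), none of
which come from the patch:

```c
#include <stddef.h>

/* Hypothetical stand-in for a dump helper: reports the length it was given. */
static size_t dump_len(const void *buf, size_t len)
{
	(void)buf;
	return len;
}

/* Parameter used bare inside an expression: precedence can bite. */
#define DUMP_HALF_BARE(buf, len)	dump_len(buf, len / 2)

/* Parameter parenthesized: the caller's expression stays intact. */
#define DUMP_HALF_PAREN(buf, len)	dump_len((buf), (len) / 2)
```

With an argument such as 8 + 8, the bare version expands to 8 + 8 / 2
and passes 12, while the parenthesized version passes 8 as intended.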

> +	} else {
> +		const void *raw_err = gdata + 1;
> +		printk("%ssection type: %pUl\n", pfx, sec_type);
> +		printk("%ssection length: %d\n", pfx, len);
> +		cper_print_hex(newpfx, raw_err, len);

You don't need raw_err.  This should work fine:

		cper_print_hex(newpfx, gdata + 1, len);

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

* Re: [PATCH V2 1/2] efi: print unrecognized CPER section
  2015-09-08 21:29   ` [PATCH V2 1/2] efi: print unrecognized CPER section Jonathan (Zhixiong) Zhang
  2015-09-08 21:56     ` Timur Tabi
@ 2015-09-10 18:27     ` Borislav Petkov
  1 sibling, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2015-09-10 18:27 UTC (permalink / raw)
  To: Jonathan (Zhixiong) Zhang
  Cc: Matt Fleming, tony.luck, fu.wei, al.stone, rjw, mchehab, mingo,
	gong.chen, linux-efi, linux-kernel, linaro-acpi, vgandhi,
	linux-acpi, timur

On Tue, Sep 08, 2015 at 02:29:20PM -0700, Jonathan (Zhixiong) Zhang wrote:
> From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org>
> 
> UEFI spec allows for non-standard section in Common Platform Error
> Record. This is defined in section N.2.3 of UEFI version 2.5.
> 
> Currently if the CPER section's type (UUID) does not match with
> one of the section types that the kernel knows how to parse, the
> section is skipped. Therefore, user is not able to see
> such CPER data, for instace, error record of non-standard section.

instace?

Introduce a spellchecker into your workflow, pls.

> For above mentioned case, this change prints out the raw data in
> hex in dmesg buffer. Data length is taken from Error Data length
> field of Generic Error Data Entry.
> 
> Following is a sample output from dmesg:
> [  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
> [  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
> [  115.787456] {1}[Hardware Error]: event severity: corrected
> [  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
> [  115.798415] {1}[Hardware Error]:  fru_id: 00000000-0000-0000-0000-000000000000
> [  115.805596] {1}[Hardware Error]:  fru_text:
> [  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
> [  115.823880] {1}[Hardware Error]:  section length: 88
> [  115.828779] {1}[Hardware Error]:   00000000: 01000001 00000002 5f434345 525f4543
> [  115.836153] {1}[Hardware Error]:   00000010: 0000574d 00000000 00000000 00000000
> [  115.843531] {1}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
> [  115.850908] {1}[Hardware Error]:   00000030: 00000000 00000000 00000000 00000000
> [  115.858288] {1}[Hardware Error]:   00000040: fe800000 00000000 00000004 5f434345
> [  115.865665] {1}[Hardware Error]:   00000050: 525f4543 0000574d
> 
> Change-Id: I663a6e3ae6dcf68e4e389f76d555e9106ffee165

As already noted, no internal cset IDs or whatever other markup.

> Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
> ---
>  drivers/firmware/efi/cper.c | 39 +++++++++++++++++++++++++++++++--------
>  1 file changed, 31 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index d42537425438..8a58b2927408 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -32,12 +32,31 @@
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include <linux/aer.h>
> +#include <linux/printk.h>
>  
>  #define INDENT_SP	" "
>  
> +#define ROW_SIZE 16
> +#define GROUP_SIZE 4
> +
>  static char rcd_decode_str[CPER_REC_LEN];
>  
>  /*
> + * cper_print_hex - print hex from a CPER data buffer
> + * @pfx: prefix for each line, including log level and prefix string

Why?

First argument of print_hex_dump() is @level and second is @prefix_str.
But you're calling print_hex_dump() with "" as a second arg...

> + * @buf: buffer pointer
> + * @len: size of buffer
> + *
> + * print_hex_dump() expects log level and prefix string to be passed
> + * in two different paramters. Internally it concatenates them. In
> + * our case, those two are already concatenated in pfx.

This doesn't make any sense, why?

And WTH are you defining a macro for, to use exactly *once*?! Why can't
you simply use print_hex_dump() like normal kids would do? Same with
those ROW_SIZE and GROUP_SIZE defines... Kill them. Kill it all.

> + */
> +#define cper_print_hex(pfx, buf, len)				\
> +	print_hex_dump(pfx, "",					\
> +		DUMP_PREFIX_OFFSET, ROW_SIZE, GROUP_SIZE,	\
> +		buf, len, 0)
> +
> +/*
>   * CPER record ID need to be unique even after reboot, because record
>   * ID is used as index for ERST storage, while CPER records from
>   * multiple boot may co-exist in ERST.
> @@ -392,7 +411,9 @@ static void cper_estatus_print_section(
>  	uuid_le *sec_type = (uuid_le *)gdata->section_type;
>  	__u16 severity;
>  	char newpfx[64];
> +	u32 len;
>  
> +	len = gdata->error_data_length;

This and the changes it brings with it are unrelated to this patch -
needs to be a separate patch.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

* Re: [PATCH V2 2/2] ras: acpi / apei: generate trace event for unrecognized CPER section
  2015-09-08 21:29 ` [PATCH V2 2/2] ras: acpi / apei: generate trace event for " Jonathan (Zhixiong) Zhang
@ 2015-09-10 18:41   ` Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2015-09-10 18:41 UTC (permalink / raw)
  To: Jonathan (Zhixiong) Zhang
  Cc: Matt Fleming, tony.luck, fu.wei, al.stone, rjw, mchehab, mingo,
	gong.chen, linux-efi, linux-kernel, linaro-acpi, vgandhi,
	linux-acpi, timur

On Tue, Sep 08, 2015 at 02:29:21PM -0700, Jonathan (Zhixiong) Zhang wrote:
>  /*
> + * Raw Events Report
> + *
> + * This event is generated when hardware detected a hardware
> + * error event, which may be of non-standard section as defined
> + * in UEFI spec appendix "Common Platform Error Record", or may
> + * be of sections for which TRACE_EVENT is not defined.
> + *
> + */
> +TRACE_EVENT(raw_event,
> +
> +	TP_PROTO(const uuid_le *sec_type,
> +		 const uuid_le *fru_id,
> +		 const char *fru_text,
> +		 u8 sev,
> +		 const u8 *err,
> +		 const u32 len),

This is not a raw event - this is an event which has a section type, FRU
ID, text, etc, etc.

A raw event is one which takes exactly two arguments: bytes and count.
What it does is, it dumps the bytes of length count in a block or other
amicably formatted output, most likely hex, similar to hexdump or other
tools; *without* any attempt to interpret it whatsoever.

Its *consumers* do the interpretation. So that that raw_event tracepoint
can be used as a fallback in all cases where the error information is of
unknown structure to the kernel.

Btw, @count should be sanity-checked before calling the tracepoint with
insane values.
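
A rough userspace sketch of that shape, to make the two-argument idea
concrete: the function takes only bytes and a count, rejects implausible
counts up front, and renders plain hex without interpreting anything.
The name raw_event_format and the RAW_EVENT_MAX_BYTES cap are
hypothetical; the thread does not say what limit would be appropriate.

```c
#include <stddef.h>
#include <stdio.h>

#define RAW_EVENT_MAX_BYTES 4096	/* hypothetical cap, not from the thread */

/*
 * Render count bytes as space-separated hex into out, uninterpreted.
 * Returns the number of bytes rendered, or -1 if the input is missing,
 * count is zero or above the cap, or out cannot hold the result.
 */
static int raw_event_format(const unsigned char *bytes, size_t count,
			    char *out, size_t out_size)
{
	size_t i;

	/* Sanity-check count before touching the buffer. */
	if (!bytes || !out || count == 0 || count > RAW_EVENT_MAX_BYTES)
		return -1;

	/* Each byte takes "xx " (3 chars); one more for snprintf's NUL. */
	if (out_size < count * 3 + 1)
		return -1;

	for (i = 0; i < count; i++)
		snprintf(out + i * 3, out_size - i * 3, "%02x ", bytes[i]);

	/* Drop the trailing space. */
	out[count * 3 - 1] = '\0';
	return (int)count;
}
```

Interpretation of the bytes is left entirely to whoever consumes the
output, which is what lets one such event serve every section type the
kernel does not know.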

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
