linux-efi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata
@ 2024-06-24  9:19 Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 1/4] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-24  9:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Borislav Petkov, Tony Luck, James Morse,
	Jonathan Cameron, Shiju Jose, linux-efi, linux-kernel, linux-edac,
	Ard Biesheuvel, Jonathan Corbet, Len Brown, linux-acpi, linux-doc

The UEFI 2.9A errata makes clear how ARM processor type encoding should
be done: it is meant to be equal to Generic processor, using a bitmask.

The current code assumes, for both generic and ARM processor types
that this is an integer, which is an incorrect assumption.

Fix it. While here, also fix a compilation issue when using W=1.

After the change, Kernel will properly decode receiving two errors at the same
message, as defined at UEFI spec:

[   75.282430] Memory failure: 0x5cdfd: recovery action for free buddy page: Recovered
[   94.973081] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[   94.973770] {2}[Hardware Error]: event severity: recoverable
[   94.974334] {2}[Hardware Error]:  Error 0, type: recoverable
[   94.974962] {2}[Hardware Error]:   section_type: ARM processor error
[   94.975586] {2}[Hardware Error]:   MIDR: 0x000000000000cd24
[   94.976202] {2}[Hardware Error]:   Multiprocessor Affinity Register (MPIDR): 0x000000000000ab12
[   94.977011] {2}[Hardware Error]:   error affinity level: 2
[   94.977593] {2}[Hardware Error]:   running state: 0x1
[   94.978135] {2}[Hardware Error]:   Power State Coordination Interface state: 4660
[   94.978884] {2}[Hardware Error]:   Error info structure 0:
[   94.979463] {2}[Hardware Error]:   num errors: 3
[   94.979971] {2}[Hardware Error]:    first error captured
[   94.980523] {2}[Hardware Error]:    propagated error captured
[   94.981110] {2}[Hardware Error]:    overflow occurred, error info is incomplete
[   94.981893] {2}[Hardware Error]:    error_type: 0x0006: cache error|TLB error
[   94.982606] {2}[Hardware Error]:    error_info: 0x000000000091000f
[   94.983249] {2}[Hardware Error]:     transaction type: Data Access
[   94.983891] {2}[Hardware Error]:     cache error, operation type: Data write
[   94.984559] {2}[Hardware Error]:     TLB error, operation type: Data write
[   94.985215] {2}[Hardware Error]:     cache level: 2
[   94.985749] {2}[Hardware Error]:     TLB level: 2
[   94.986277] {2}[Hardware Error]:     processor context not corrupted

And the error code is properly decoded according with table N.17 from UEFI 2.10
spec:

	[   94.981893] {2}[Hardware Error]:    error_type: 0x0006: cache error|TLB error

The error injection logic was checked via QEMU using this patch:
https://lore.kernel.org/all/20240621165115.336-1-shiju.jose@huawei.com/

v5:
- Do some cleanups and minor fixes as suggested by Jonathan and Tony:
  - check errors at strscpy();
  - simplify cper_bits_to_str() function;
  - use FIELD_GET() and for_each_set_bit();
  - use ARRAY_SIZE() on infofx to let it clear that it should be size of newpfx + 1;
  - fix kernel-doc warning with W=1;
  - use kernel-doc for two exported functions at cper.c.

v4:
- The print function had some bugs on it, which was discovered with
  the help of an error injection tool I'm now using.

v3:
- It adds a helper function to produce a buffer describing the
  error bits at cper's printk and ghes pr_warn_bitrated. It also
  fixes a W=1 error while building cper.

v2:
- It fixes the way printks are handled on both cper_arm and ghes
  drivers.

v1: 
- (tagged as RFC) was mostly to give a heads up that the current 
  implementation is not following the spec. It also touches
  only cper code.




Mauro Carvalho Chehab (4):
  efi/cper: Adjust infopfx size to accept an extra space
  efi/cper: Add a new helper function to print bitmasks
  efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
  docs: efi: add CPER functions to driver-api

 .../driver-api/firmware/efi/index.rst         | 11 ++--
 drivers/acpi/apei/ghes.c                      | 15 +++---
 drivers/firmware/efi/cper-arm.c               | 52 +++++++++----------
 drivers/firmware/efi/cper.c                   | 41 ++++++++++++++-
 include/linux/cper.h                          | 12 +++--
 5 files changed, 89 insertions(+), 42 deletions(-)

-- 
2.45.2



^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v5 1/4] efi/cper: Adjust infopfx size to accept an extra space
  2024-06-24  9:19 [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
@ 2024-06-24  9:19 ` Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-24  9:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
	Jonathan Cameron, Shiju Jose, Tony Luck, Ard Biesheuvel,
	linux-edac, linux-efi, linux-kernel

Compiling with W=1 with werror enabled produces an error:

drivers/firmware/efi/cper-arm.c: In function ‘cper_print_proc_arm’:
drivers/firmware/efi/cper-arm.c:298:64: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
  298 |                         snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
      |                                                                ^
drivers/firmware/efi/cper-arm.c:298:25: note: ‘snprintf’ output between 2 and 65 bytes into a destination of size 64
  298 |                         snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As the logic there adds an space at the end of infopx buffer.
Add an extra space to avoid such warning.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/firmware/efi/cper-arm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
index fa9c1c3bf168..eb7ee6af55f2 100644
--- a/drivers/firmware/efi/cper-arm.c
+++ b/drivers/firmware/efi/cper-arm.c
@@ -240,7 +240,7 @@ void cper_print_proc_arm(const char *pfx,
 	int i, len, max_ctx_type;
 	struct cper_arm_err_info *err_info;
 	struct cper_arm_ctx_info *ctx_info;
-	char newpfx[64], infopfx[64];
+	char newpfx[64], infopfx[ARRAY_SIZE(newpfx) + 1];
 
 	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks
  2024-06-24  9:19 [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 1/4] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
@ 2024-06-24  9:19 ` Mauro Carvalho Chehab
  2024-07-01 15:07   ` Jonathan Cameron
  2024-06-24  9:19 ` [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 4/4] docs: efi: add CPER functions to driver-api Mauro Carvalho Chehab
  3 siblings, 1 reply; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-24  9:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
	Jonathan Cameron, Shiju Jose, Tony Luck, Alison Schofield,
	Ard Biesheuvel, Dave Jiang, Ira Weiny, linux-edac, linux-efi,
	linux-kernel

Sometimes it is desired to produce a single log line for errors.
Add a new helper function for such purpose.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 drivers/firmware/efi/cper.c | 41 +++++++++++++++++++++++++++++++++++++
 include/linux/cper.h        |  2 ++
 2 files changed, 43 insertions(+)

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 7d2cdd9e2227..4cf56657afde 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -106,6 +106,47 @@ void cper_print_bits(const char *pfx, unsigned int bits,
 		printk("%s\n", buf);
 }
 
+/*
+ * cper_bits_to_str - return a string for set bits
+ * @buf: buffer to store the output string
+ * @buf_size: size of the output string buffer
+ * @bits: bit mask
+ * @strs: string array, indexed by bit position
+ * @strs_size: size of the string array: @strs
+ * @mask: a continuous bitmask used to detect the first valid bit of the
+ *        bitmap.
+ *
+ * Add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits
+ * mask, add the corresponding string describing the bit in @strs to @buf.
+ */
+char *cper_bits_to_str(char *buf, int buf_size, unsigned long bits,
+		       const char * const strs[], unsigned int strs_size)
+{
+	int len = buf_size;
+	char *str = buf;
+	int i, size;
+
+	for_each_set_bit(i, &bits, strs_size) {
+		if (!(bits & (1U << (i))))
+			continue;
+
+		if (*buf && len > 0) {
+			*str = '|';
+			len--;
+			str++;
+		}
+
+		size = strscpy(str, strs[i], len);
+		if (size < 0)
+			break;
+
+		len -= size;
+		str += size;
+	}
+	return buf;
+}
+EXPORT_SYMBOL_GPL(cper_bits_to_str);
+
 static const char * const proc_type_strs[] = {
 	"IA32/X64",
 	"IA64",
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 265b0f8fc0b3..c2f14b916bfb 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -584,6 +584,8 @@ const char *cper_mem_err_type_str(unsigned int);
 const char *cper_mem_err_status_str(u64 status);
 void cper_print_bits(const char *prefix, unsigned int bits,
 		     const char * const strs[], unsigned int strs_size);
+char *cper_bits_to_str(char *buf, int buf_size, unsigned long bits,
+		       const char * const strs[], unsigned int strs_size);
 void cper_mem_err_pack(const struct cper_sec_mem_err *,
 		       struct cper_mem_err_compact *);
 const char *cper_mem_err_unpack(struct trace_seq *,
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
  2024-06-24  9:19 [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 1/4] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
  2024-06-24  9:19 ` [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
@ 2024-06-24  9:19 ` Mauro Carvalho Chehab
  2024-06-27  8:59   ` kernel test robot
  2024-06-28 20:28   ` kernel test robot
  2024-06-24  9:19 ` [PATCH v5 4/4] docs: efi: add CPER functions to driver-api Mauro Carvalho Chehab
  3 siblings, 2 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-24  9:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
	Jonathan Cameron, Rafael J. Wysocki, Shiju Jose, Tony Luck,
	Alison Schofield, Ard Biesheuvel, Dan Williams, Dave Jiang,
	Ira Weiny, Len Brown, Shuai Xue, linux-acpi, linux-edac,
	linux-efi, linux-kernel

Up to UEFI spec, the type byte of CPER struct for ARM processor was
defined simply as:

Type at byte offset 4:

	- Cache error
	- TLB Error
	- Bus Error
	- Micro-architectural Error
	All other values are reserved

Yet, there was no information about how this would be encoded.

Spec 2.9A errata corrected it by defining:

	- Bit 1 - Cache Error
	- Bit 2 - TLB Error
	- Bit 3 - Bus Error
	- Bit 4 - Micro-architectural Error
	All other values are reserved

That actually aligns with the values already defined on older
versions at N.2.4.1. Generic Processor Error Section.

Spec 2.10 also preserve the same encoding as 2.9A

Adjust CPER and GHES handling code for both generic and ARM
processors to properly handle UEFI 2.9A and 2.10 encoding.

Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 drivers/acpi/apei/ghes.c        | 15 ++++++----
 drivers/firmware/efi/cper-arm.c | 50 ++++++++++++++++-----------------
 include/linux/cper.h            | 10 +++----
 3 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 623cc0cb4a65..de79cc0a0f1d 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -533,6 +533,7 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
 {
 	struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
 	int flags = sync ? MF_ACTION_REQUIRED : 0;
+	char error_type[120];
 	bool queued = false;
 	int sec_sev, i;
 	char *p;
@@ -546,9 +547,8 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
 	p = (char *)(err + 1);
 	for (i = 0; i < err->err_info_num; i++) {
 		struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p;
-		bool is_cache = (err_info->type == CPER_ARM_CACHE_ERROR);
+		bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
 		bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
-		const char *error_type = "unknown error";
 
 		/*
 		 * The field (err_info->error_info & BIT(26)) is fixed to set to
@@ -562,12 +562,15 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
 			continue;
 		}
 
-		if (err_info->type < ARRAY_SIZE(cper_proc_error_type_strs))
-			error_type = cper_proc_error_type_strs[err_info->type];
+		cper_bits_to_str(error_type, sizeof(error_type),
+				 FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
+				 cper_proc_error_type_strs,
+				 ARRAY_SIZE(cper_proc_error_type_strs));
 
 		pr_warn_ratelimited(FW_WARN GHES_PFX
-				    "Unhandled processor error type: %s\n",
-				    error_type);
+				    "Unhandled processor error type 0x%02x: %s%s\n",
+				    err_info->type, error_type,
+				    (err_info->type & ~CPER_ARM_ERR_TYPE_MASK) ? " with reserved bit(s)" : "");
 		p += err_info->length;
 	}
 
diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
index eb7ee6af55f2..52d18490b59e 100644
--- a/drivers/firmware/efi/cper-arm.c
+++ b/drivers/firmware/efi/cper-arm.c
@@ -93,15 +93,11 @@ static void cper_print_arm_err_info(const char *pfx, u32 type,
 	bool proc_context_corrupt, corrected, precise_pc, restartable_pc;
 	bool time_out, access_mode;
 
-	/* If the type is unknown, bail. */
-	if (type > CPER_ARM_MAX_TYPE)
-		return;
-
 	/*
 	 * Vendor type errors have error information values that are vendor
 	 * specific.
 	 */
-	if (type == CPER_ARM_VENDOR_ERROR)
+	if (type & CPER_ARM_VENDOR_ERROR)
 		return;
 
 	if (error_info & CPER_ARM_ERR_VALID_TRANSACTION_TYPE) {
@@ -116,43 +112,38 @@ static void cper_print_arm_err_info(const char *pfx, u32 type,
 	if (error_info & CPER_ARM_ERR_VALID_OPERATION_TYPE) {
 		op_type = ((error_info >> CPER_ARM_ERR_OPERATION_SHIFT)
 			   & CPER_ARM_ERR_OPERATION_MASK);
-		switch (type) {
-		case CPER_ARM_CACHE_ERROR:
+		if (type & CPER_ARM_CACHE_ERROR) {
 			if (op_type < ARRAY_SIZE(arm_cache_err_op_strs)) {
-				printk("%soperation type: %s\n", pfx,
+				printk("%scache error, operation type: %s\n", pfx,
 				       arm_cache_err_op_strs[op_type]);
 			}
-			break;
-		case CPER_ARM_TLB_ERROR:
+		}
+		if (type & CPER_ARM_TLB_ERROR) {
 			if (op_type < ARRAY_SIZE(arm_tlb_err_op_strs)) {
-				printk("%soperation type: %s\n", pfx,
+				printk("%sTLB error, operation type: %s\n", pfx,
 				       arm_tlb_err_op_strs[op_type]);
 			}
-			break;
-		case CPER_ARM_BUS_ERROR:
+		}
+		if (type & CPER_ARM_BUS_ERROR) {
 			if (op_type < ARRAY_SIZE(arm_bus_err_op_strs)) {
-				printk("%soperation type: %s\n", pfx,
+				printk("%sbus error, operation type: %s\n", pfx,
 				       arm_bus_err_op_strs[op_type]);
 			}
-			break;
 		}
 	}
 
 	if (error_info & CPER_ARM_ERR_VALID_LEVEL) {
 		level = ((error_info >> CPER_ARM_ERR_LEVEL_SHIFT)
 			 & CPER_ARM_ERR_LEVEL_MASK);
-		switch (type) {
-		case CPER_ARM_CACHE_ERROR:
+		if (type & CPER_ARM_CACHE_ERROR)
 			printk("%scache level: %d\n", pfx, level);
-			break;
-		case CPER_ARM_TLB_ERROR:
+
+		if (type & CPER_ARM_TLB_ERROR)
 			printk("%sTLB level: %d\n", pfx, level);
-			break;
-		case CPER_ARM_BUS_ERROR:
+
+		if (type & CPER_ARM_BUS_ERROR)
 			printk("%saffinity level at which the bus error occurred: %d\n",
 			       pfx, level);
-			break;
-		}
 	}
 
 	if (error_info & CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT) {
@@ -241,6 +232,7 @@ void cper_print_proc_arm(const char *pfx,
 	struct cper_arm_err_info *err_info;
 	struct cper_arm_ctx_info *ctx_info;
 	char newpfx[64], infopfx[ARRAY_SIZE(newpfx) + 1];
+	char error_type[120];
 
 	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
 
@@ -289,9 +281,15 @@ void cper_print_proc_arm(const char *pfx,
 				       newpfx);
 		}
 
-		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
-			err_info->type < ARRAY_SIZE(cper_proc_error_type_strs) ?
-			cper_proc_error_type_strs[err_info->type] : "unknown");
+		cper_bits_to_str(error_type, sizeof(error_type),
+				 FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
+				 cper_proc_error_type_strs,
+				 ARRAY_SIZE(cper_proc_error_type_strs));
+
+		printk("%serror_type: 0x%02x: %s%s\n", newpfx, err_info->type,
+		       error_type,
+		       (err_info->type & ~CPER_ARM_ERR_TYPE_MASK) ? " with reserved bit(s)" : "");
+
 		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) {
 			printk("%serror_info: 0x%016llx\n", newpfx,
 			       err_info->error_info);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index c2f14b916bfb..fc62a80575e8 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -293,11 +293,11 @@ enum {
 #define CPER_ARM_INFO_FLAGS_PROPAGATED		BIT(2)
 #define CPER_ARM_INFO_FLAGS_OVERFLOW		BIT(3)
 
-#define CPER_ARM_CACHE_ERROR			0
-#define CPER_ARM_TLB_ERROR			1
-#define CPER_ARM_BUS_ERROR			2
-#define CPER_ARM_VENDOR_ERROR			3
-#define CPER_ARM_MAX_TYPE			CPER_ARM_VENDOR_ERROR
+#define CPER_ARM_ERR_TYPE_MASK			GENMASK(4,1)
+#define CPER_ARM_CACHE_ERROR			BIT(1)
+#define CPER_ARM_TLB_ERROR			BIT(2)
+#define CPER_ARM_BUS_ERROR			BIT(3)
+#define CPER_ARM_VENDOR_ERROR			BIT(4)
 
 #define CPER_ARM_ERR_VALID_TRANSACTION_TYPE	BIT(0)
 #define CPER_ARM_ERR_VALID_OPERATION_TYPE	BIT(1)
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v5 4/4] docs: efi: add CPER functions to driver-api
  2024-06-24  9:19 [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2024-06-24  9:19 ` [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
@ 2024-06-24  9:19 ` Mauro Carvalho Chehab
  3 siblings, 0 replies; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-24  9:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
	Jonathan Cameron, Shiju Jose, Tony Luck, Ard Biesheuvel,
	Jonathan Corbet, linux-doc, linux-edac, linux-efi, linux-kernel

There are two kernel-doc like descriptions at cper, which is used
by other parts of cper and on ghes driver. They both have kernel-doc
like descriptions.

Change the tags for them to be actual kernel-doc tags and add them
to the driver-api documentaion at the UEFI section.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 Documentation/driver-api/firmware/efi/index.rst | 11 ++++++++---
 drivers/firmware/efi/cper.c                     | 10 ++++------
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/Documentation/driver-api/firmware/efi/index.rst b/Documentation/driver-api/firmware/efi/index.rst
index 4fe8abba9fc6..5a6b6229592c 100644
--- a/Documentation/driver-api/firmware/efi/index.rst
+++ b/Documentation/driver-api/firmware/efi/index.rst
@@ -1,11 +1,16 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-============
-UEFI Support
-============
+====================================================
+Unified Extensible Firmware Interface (UEFI) Support
+====================================================
 
 UEFI stub library functions
 ===========================
 
 .. kernel-doc:: drivers/firmware/efi/libstub/mem.c
    :internal:
+
+UEFI Common Platform Error Record (CPER) functions
+==================================================
+
+.. kernel-doc:: drivers/firmware/efi/cper.c
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 4cf56657afde..c8a432c5cb5d 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -69,7 +69,7 @@ const char *cper_severity_str(unsigned int severity)
 }
 EXPORT_SYMBOL_GPL(cper_severity_str);
 
-/*
+/**
  * cper_print_bits - print strings for set bits
  * @pfx: prefix for each line, including log level and prefix string
  * @bits: bit mask
@@ -106,18 +106,16 @@ void cper_print_bits(const char *pfx, unsigned int bits,
 		printk("%s\n", buf);
 }
 
-/*
+/**
  * cper_bits_to_str - return a string for set bits
  * @buf: buffer to store the output string
  * @buf_size: size of the output string buffer
  * @bits: bit mask
  * @strs: string array, indexed by bit position
  * @strs_size: size of the string array: @strs
- * @mask: a continuous bitmask used to detect the first valid bit of the
- *        bitmap.
  *
- * Add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits
- * mask, add the corresponding string describing the bit in @strs to @buf.
+ * Add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits,
+ * add the corresponding string describing the bit in @strs to @buf.
  */
 char *cper_bits_to_str(char *buf, int buf_size, unsigned long bits,
 		       const char * const strs[], unsigned int strs_size)
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
  2024-06-24  9:19 ` [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
@ 2024-06-27  8:59   ` kernel test robot
  2024-06-28 20:28   ` kernel test robot
  1 sibling, 0 replies; 8+ messages in thread
From: kernel test robot @ 2024-06-27  8:59 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: llvm, oe-kbuild-all, linux-media, Mauro Carvalho Chehab,
	Borislav Petkov, James Morse, Jonathan Cameron, Rafael J. Wysocki,
	Shiju Jose, Tony Luck, Alison Schofield, Ard Biesheuvel,
	Dan Williams, Dave Jiang, Ira Weiny, Len Brown, Shuai Xue,
	linux-acpi, linux-edac, linux-efi, linux-kernel

Hi Mauro,

kernel test robot noticed the following build errors:

[auto build test ERROR on efi/next]
[also build test ERROR on rafael-pm/linux-next rafael-pm/bleeding-edge linus/master v6.10-rc5 next-20240626]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mauro-Carvalho-Chehab/efi-cper-Adjust-infopfx-size-to-accept-an-extra-space/20240625-203952
base:   https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git next
patch link:    https://lore.kernel.org/r/b9354882f45a0c600e65df4bacee2f1080c4ba89.1719219886.git.mchehab%2Bhuawei%40kernel.org
patch subject: [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
config: i386-randconfig-004-20240627 (https://download.01.org/0day-ci/archive/20240627/202406271626.72sHSPJJ-lkp@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240627/202406271626.72sHSPJJ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406271626.72sHSPJJ-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/acpi/apei/ghes.c:566:6: error: call to undeclared function 'FIELD_GET'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     566 |                                  FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
         |                                  ^
   1 error generated.


vim +/FIELD_GET +566 drivers/acpi/apei/ghes.c

   530	
   531	static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
   532					       int sev, bool sync)
   533	{
   534		struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
   535		int flags = sync ? MF_ACTION_REQUIRED : 0;
   536		char error_type[120];
   537		bool queued = false;
   538		int sec_sev, i;
   539		char *p;
   540	
   541		log_arm_hw_error(err);
   542	
   543		sec_sev = ghes_severity(gdata->error_severity);
   544		if (sev != GHES_SEV_RECOVERABLE || sec_sev != GHES_SEV_RECOVERABLE)
   545			return false;
   546	
   547		p = (char *)(err + 1);
   548		for (i = 0; i < err->err_info_num; i++) {
   549			struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p;
   550			bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
   551			bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
   552	
   553			/*
   554			 * The field (err_info->error_info & BIT(26)) is fixed to set to
   555			 * 1 in some old firmware of HiSilicon Kunpeng920. We assume that
   556			 * firmware won't mix corrected errors in an uncorrected section,
   557			 * and don't filter out 'corrected' error here.
   558			 */
   559			if (is_cache && has_pa) {
   560				queued = ghes_do_memory_failure(err_info->physical_fault_addr, flags);
   561				p += err_info->length;
   562				continue;
   563			}
   564	
   565			cper_bits_to_str(error_type, sizeof(error_type),
 > 566					 FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
   567					 cper_proc_error_type_strs,
   568					 ARRAY_SIZE(cper_proc_error_type_strs));
   569	
   570			pr_warn_ratelimited(FW_WARN GHES_PFX
   571					    "Unhandled processor error type 0x%02x: %s%s\n",
   572					    err_info->type, error_type,
   573					    (err_info->type & ~CPER_ARM_ERR_TYPE_MASK) ? " with reserved bit(s)" : "");
   574			p += err_info->length;
   575		}
   576	
   577		return queued;
   578	}
   579	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
  2024-06-24  9:19 ` [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
  2024-06-27  8:59   ` kernel test robot
@ 2024-06-28 20:28   ` kernel test robot
  1 sibling, 0 replies; 8+ messages in thread
From: kernel test robot @ 2024-06-28 20:28 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: oe-kbuild-all, linux-media, Mauro Carvalho Chehab,
	Borislav Petkov, James Morse, Jonathan Cameron, Rafael J. Wysocki,
	Shiju Jose, Tony Luck, Alison Schofield, Ard Biesheuvel,
	Dan Williams, Dave Jiang, Ira Weiny, Len Brown, Shuai Xue,
	linux-acpi, linux-edac, linux-efi, linux-kernel

Hi Mauro,

kernel test robot noticed the following build errors:

[auto build test ERROR on efi/next]
[also build test ERROR on rafael-pm/linux-next rafael-pm/bleeding-edge linus/master v6.10-rc5 next-20240627]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mauro-Carvalho-Chehab/efi-cper-Adjust-infopfx-size-to-accept-an-extra-space/20240625-203952
base:   https://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git next
patch link:    https://lore.kernel.org/r/b9354882f45a0c600e65df4bacee2f1080c4ba89.1719219886.git.mchehab%2Bhuawei%40kernel.org
patch subject: [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
config: x86_64-buildonly-randconfig-003-20240628 (https://download.01.org/0day-ci/archive/20240629/202406290457.3HKFUkuV-lkp@intel.com/config)
compiler: gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240629/202406290457.3HKFUkuV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406290457.3HKFUkuV-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/acpi/apei/ghes.c: In function 'ghes_handle_arm_hw_error':
>> drivers/acpi/apei/ghes.c:566:34: error: implicit declaration of function 'FIELD_GET' [-Werror=implicit-function-declaration]
     566 |                                  FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
         |                                  ^~~~~~~~~
   cc1: some warnings being treated as errors


vim +/FIELD_GET +566 drivers/acpi/apei/ghes.c

   530	
   531	static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
   532					       int sev, bool sync)
   533	{
   534		struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
   535		int flags = sync ? MF_ACTION_REQUIRED : 0;
   536		char error_type[120];
   537		bool queued = false;
   538		int sec_sev, i;
   539		char *p;
   540	
   541		log_arm_hw_error(err);
   542	
   543		sec_sev = ghes_severity(gdata->error_severity);
   544		if (sev != GHES_SEV_RECOVERABLE || sec_sev != GHES_SEV_RECOVERABLE)
   545			return false;
   546	
   547		p = (char *)(err + 1);
   548		for (i = 0; i < err->err_info_num; i++) {
   549			struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p;
   550			bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
   551			bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
   552	
   553			/*
   554			 * The field (err_info->error_info & BIT(26)) is fixed to set to
   555			 * 1 in some old firmware of HiSilicon Kunpeng920. We assume that
   556			 * firmware won't mix corrected errors in an uncorrected section,
   557			 * and don't filter out 'corrected' error here.
   558			 */
   559			if (is_cache && has_pa) {
   560				queued = ghes_do_memory_failure(err_info->physical_fault_addr, flags);
   561				p += err_info->length;
   562				continue;
   563			}
   564	
   565			cper_bits_to_str(error_type, sizeof(error_type),
 > 566					 FIELD_GET(CPER_ARM_ERR_TYPE_MASK, err_info->type),
   567					 cper_proc_error_type_strs,
   568					 ARRAY_SIZE(cper_proc_error_type_strs));
   569	
   570			pr_warn_ratelimited(FW_WARN GHES_PFX
   571					    "Unhandled processor error type 0x%02x: %s%s\n",
   572					    err_info->type, error_type,
   573					    (err_info->type & ~CPER_ARM_ERR_TYPE_MASK) ? " with reserved bit(s)" : "");
   574			p += err_info->length;
   575		}
   576	
   577		return queued;
   578	}
   579	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks
  2024-06-24  9:19 ` [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
@ 2024-07-01 15:07   ` Jonathan Cameron
  0 siblings, 0 replies; 8+ messages in thread
From: Jonathan Cameron @ 2024-07-01 15:07 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, James Morse, Shiju Jose, Tony Luck,
	Alison Schofield, Ard Biesheuvel, Dave Jiang, Ira Weiny,
	linux-edac, linux-efi, linux-kernel

On Mon, 24 Jun 2024 11:19:19 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:

> Sometimes it is desired to produce a single log line for errors.

This description is a bit cryptic.  Maybe an example would make
it easier for anyone coming across this in the future?

> Add a new helper function for such purpose.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Minor comment inline.

> ---
>  drivers/firmware/efi/cper.c | 41 +++++++++++++++++++++++++++++++++++++
>  include/linux/cper.h        |  2 ++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 7d2cdd9e2227..4cf56657afde 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -106,6 +106,47 @@ void cper_print_bits(const char *pfx, unsigned int bits,
>  		printk("%s\n", buf);
>  }
>  
> +/*
> + * cper_bits_to_str - return a string for set bits
> + * @buf: buffer to store the output string
> + * @buf_size: size of the output string buffer
> + * @bits: bit mask
> + * @strs: string array, indexed by bit position
> + * @strs_size: size of the string array: @strs
> + * @mask: a continuous bitmask used to detect the first valid bit of the
> + *        bitmap.
> + *
> + * Add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits
> + * mask, add the corresponding string describing the bit in @strs to @buf.
> + */
> +char *cper_bits_to_str(char *buf, int buf_size, unsigned long bits,
> +		       const char * const strs[], unsigned int strs_size)
> +{
> +	int len = buf_size;
> +	char *str = buf;
> +	int i, size;
> +
> +	for_each_set_bit(i, &bits, strs_size) {
> +		if (!(bits & (1U << (i))))
> +			continue;
> +
> +		if (*buf && len > 0) {

That first check on *buf is a bit esoteric.
It's more than possible we get a non clear
buffer passed in.  I'd just use a
	bool first = true;
...
		if (!first && len > 0) {
			*str = '|';
			len--;
			str++;
		}
		first = false;


> +			*str = '|';
> +			len--;
> +			str++;
> +		}
> +
> +		size = strscpy(str, strs[i], len);
> +		if (size < 0)
> +			break;
> +
> +		len -= size;
> +		str += size;
> +	}
> +	return buf;

Why return buf?  You never use it and it's not entirely obvious
that is the right thing to return (as opposed to length written for example).

> +}
> +EXPORT_SYMBOL_GPL(cper_bits_to_str);
> +
>  static const char * const proc_type_strs[] = {
>  	"IA32/X64",
>  	"IA64",
> diff --git a/include/linux/cper.h b/include/linux/cper.h
> index 265b0f8fc0b3..c2f14b916bfb 100644
> --- a/include/linux/cper.h
> +++ b/include/linux/cper.h
> @@ -584,6 +584,8 @@ const char *cper_mem_err_type_str(unsigned int);
>  const char *cper_mem_err_status_str(u64 status);
>  void cper_print_bits(const char *prefix, unsigned int bits,
>  		     const char * const strs[], unsigned int strs_size);
> +char *cper_bits_to_str(char *buf, int buf_size, unsigned long bits,
> +		       const char * const strs[], unsigned int strs_size);
>  void cper_mem_err_pack(const struct cper_sec_mem_err *,
>  		       struct cper_mem_err_compact *);
>  const char *cper_mem_err_unpack(struct trace_seq *,


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-07-01 15:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-24  9:19 [PATCH v5 0/4] fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
2024-06-24  9:19 ` [PATCH v5 1/4] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
2024-06-24  9:19 ` [PATCH v5 2/4] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
2024-07-01 15:07   ` Jonathan Cameron
2024-06-24  9:19 ` [PATCH v5 3/4] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
2024-06-27  8:59   ` kernel test robot
2024-06-28 20:28   ` kernel test robot
2024-06-24  9:19 ` [PATCH v5 4/4] docs: efi: add CPER functions to driver-api Mauro Carvalho Chehab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).