* [PATCH v4 1/3] efi/cper: Adjust infopfx size to accept an extra space
2024-06-20 18:01 [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
@ 2024-06-20 18:01 ` Mauro Carvalho Chehab
2024-06-21 9:04 ` Jonathan Cameron
2024-06-20 18:01 ` [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-20 18:01 UTC (permalink / raw)
Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
Jonathan Cameron, Shiju Jose, Tony Luck, Ard Biesheuvel,
linux-edac, linux-efi, linux-kernel
Compiling with W=1 with werror enabled produces an error:
drivers/firmware/efi/cper-arm.c: In function ‘cper_print_proc_arm’:
drivers/firmware/efi/cper-arm.c:298:64: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
298 | snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
| ^
drivers/firmware/efi/cper-arm.c:298:25: note: ‘snprintf’ output between 2 and 65 bytes into a destination of size 64
298 | snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As the logic there adds an space at the end of infopx buffer.
Add an extra space to avoid such warning.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
drivers/firmware/efi/cper-arm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
index fa9c1c3bf168..d9bbcea0adf4 100644
--- a/drivers/firmware/efi/cper-arm.c
+++ b/drivers/firmware/efi/cper-arm.c
@@ -240,7 +240,7 @@ void cper_print_proc_arm(const char *pfx,
int i, len, max_ctx_type;
struct cper_arm_err_info *err_info;
struct cper_arm_ctx_info *ctx_info;
- char newpfx[64], infopfx[64];
+ char newpfx[64], infopfx[65];
printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH v4 1/3] efi/cper: Adjust infopfx size to accept an extra space
2024-06-20 18:01 ` [PATCH v4 1/3] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
@ 2024-06-21 9:04 ` Jonathan Cameron
0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2024-06-21 9:04 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, James Morse, Shiju Jose, Tony Luck,
Ard Biesheuvel, linux-edac, linux-efi, linux-kernel
On Thu, 20 Jun 2024 20:01:44 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Compiling with W=1 with werror enabled produces an error:
>
> drivers/firmware/efi/cper-arm.c: In function ‘cper_print_proc_arm’:
> drivers/firmware/efi/cper-arm.c:298:64: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
> 298 | snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
> | ^
> drivers/firmware/efi/cper-arm.c:298:25: note: ‘snprintf’ output between 2 and 65 bytes into a destination of size 64
> 298 | snprintf(infopfx, sizeof(infopfx), "%s ", newpfx);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> As the logic there adds an space at the end of infopx buffer.
> Add an extra space to avoid such warning.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Trivial suggestion inline. Either way LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/firmware/efi/cper-arm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
> index fa9c1c3bf168..d9bbcea0adf4 100644
> --- a/drivers/firmware/efi/cper-arm.c
> +++ b/drivers/firmware/efi/cper-arm.c
> @@ -240,7 +240,7 @@ void cper_print_proc_arm(const char *pfx,
> int i, len, max_ctx_type;
> struct cper_arm_err_info *err_info;
> struct cper_arm_ctx_info *ctx_info;
> - char newpfx[64], infopfx[64];
> + char newpfx[64], infopfx[65];
Maybe make it explicit so we don't wonder if it was
a typo in future. Something like?
char newpfx[64];
char infofx[ARRAY_SIZE(newpfx) + 1];
>
> printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks
2024-06-20 18:01 [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
2024-06-20 18:01 ` [PATCH v4 1/3] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
@ 2024-06-20 18:01 ` Mauro Carvalho Chehab
2024-06-20 18:08 ` Luck, Tony
2024-06-21 9:20 ` Jonathan Cameron
2024-06-20 18:01 ` [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
2024-06-21 7:45 ` [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Ard Biesheuvel
3 siblings, 2 replies; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-20 18:01 UTC (permalink / raw)
Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
Jonathan Cameron, Shiju Jose, Tony Luck, Ard Biesheuvel,
Dave Jiang, Ira Weiny, linux-edac, linux-efi, linux-kernel
Sometimes it is desired to produce a single log line for errors.
Add a new helper function for such purpose.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
drivers/firmware/efi/cper.c | 59 +++++++++++++++++++++++++++++++++++++
include/linux/cper.h | 3 ++
2 files changed, 62 insertions(+)
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 7d2cdd9e2227..9bf27af3e870 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -106,6 +106,65 @@ void cper_print_bits(const char *pfx, unsigned int bits,
printk("%s\n", buf);
}
+/*
+ * cper_bits_to_str - return a string for set bits
+ * @buf: buffer to store the output string
+ * @buf_size: size of the output string buffer
+ * @bits: bit mask
+ * @strs: string array, indexed by bit position
+ * @strs_size: size of the string array: @strs
+ *
+ * add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits,
+ * add the corresponding string in @strs to @buf.
+ */
+char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
+ const char * const strs[], unsigned int strs_size,
+ unsigned int mask)
+{
+ int i, size, first_bit;
+ int len = buf_size;
+ const char *start;
+ char *str = buf;
+
+ if (strs_size < 16)
+ size = snprintf(str, len, "0x%02x: ", bits);
+ if (strs_size < 32)
+ size = snprintf(str, len, "0x%04x: ", bits);
+
+ len -= size;
+ str += size;
+
+ start = str;
+
+ if (mask) {
+ first_bit = ffs(mask) - 1;
+ if (bits & ~mask) {
+ size = strscpy(str, "reserved bit(s)", len);
+ len -= size;
+ str += size;
+ }
+ } else {
+ first_bit = 0;
+ }
+
+ for (i = 0; i < strs_size; i++) {
+ if (!(bits & (1U << (i + first_bit))))
+ continue;
+
+ if (*start && len > 0) {
+ *str = '|';
+ len--;
+ str++;
+ }
+
+ size = strscpy(str, strs[i], len);
+ len -= size;
+ str += size;
+ }
+ return buf;
+}
+EXPORT_SYMBOL_GPL(cper_bits_to_str);
+
static const char * const proc_type_strs[] = {
"IA32/X64",
"IA64",
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 265b0f8fc0b3..856e8f00a7fb 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -584,6 +584,9 @@ const char *cper_mem_err_type_str(unsigned int);
const char *cper_mem_err_status_str(u64 status);
void cper_print_bits(const char *prefix, unsigned int bits,
const char * const strs[], unsigned int strs_size);
+char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
+ const char * const strs[], unsigned int strs_size,
+ unsigned int mask);
void cper_mem_err_pack(const struct cper_sec_mem_err *,
struct cper_mem_err_compact *);
const char *cper_mem_err_unpack(struct trace_seq *,
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* RE: [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks
2024-06-20 18:01 ` [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
@ 2024-06-20 18:08 ` Luck, Tony
2024-06-21 9:20 ` Jonathan Cameron
1 sibling, 0 replies; 13+ messages in thread
From: Luck, Tony @ 2024-06-20 18:08 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, James Morse, Jonathan Cameron, Shiju Jose,
Ard Biesheuvel, Jiang, Dave, Weiny, Ira,
linux-edac@vger.kernel.org, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org
+
+ size = strscpy(str, strs[i], len);
Check for error return from strscpy()
+ len -= size;
+ str += size;
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks
2024-06-20 18:01 ` [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
2024-06-20 18:08 ` Luck, Tony
@ 2024-06-21 9:20 ` Jonathan Cameron
2024-06-21 9:26 ` Jonathan Cameron
2024-06-21 9:39 ` Mauro Carvalho Chehab
1 sibling, 2 replies; 13+ messages in thread
From: Jonathan Cameron @ 2024-06-21 9:20 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, James Morse, Shiju Jose, Tony Luck,
Ard Biesheuvel, Dave Jiang, Ira Weiny, linux-edac, linux-efi,
linux-kernel
On Thu, 20 Jun 2024 20:01:45 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Sometimes it is desired to produce a single log line for errors.
> Add a new helper function for such purpose.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> ---
> drivers/firmware/efi/cper.c | 59 +++++++++++++++++++++++++++++++++++++
> include/linux/cper.h | 3 ++
> 2 files changed, 62 insertions(+)
>
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 7d2cdd9e2227..9bf27af3e870 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -106,6 +106,65 @@ void cper_print_bits(const char *pfx, unsigned int bits,
> printk("%s\n", buf);
> }
>
> +/*
It's exported and in a header used by other code, so why not make
this kernel-doc? /**
> + * cper_bits_to_str - return a string for set bits
> + * @buf: buffer to store the output string
> + * @buf_size: size of the output string buffer
> + * @bits: bit mask
> + * @strs: string array, indexed by bit position
> + * @strs_size: size of the string array: @strs
If it had been kernel doc, W=1 would have told you mask is
missing.
Passing a 0 for mask seems probably not worth while.
If all bits of the unsigned int are set then people can pass ~0
Or make this cper_bits_to_str_masked() and have
cper_bits_to_str() that doesn't take a mask.
If you do that, some simplifications can be easily made.
> + *
> + * add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits,
> + * add the corresponding string in @strs to @buf.
> + */
> +char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
Perhaps make bits an unsigned long as then you can use the
for_each_set_bit() etc.
> + const char * const strs[], unsigned int strs_size,
> + unsigned int mask)
> +{
> + int i, size, first_bit;
> + int len = buf_size;
> + const char *start;
> + char *str = buf;
> +
> + if (strs_size < 16)
> + size = snprintf(str, len, "0x%02x: ", bits);
> + if (strs_size < 32)
> + size = snprintf(str, len, "0x%04x: ", bits);
> +
> + len -= size;
> + str += size;
> +
> + start = str;
> +
> + if (mask) {
> + first_bit = ffs(mask) - 1;
> + if (bits & ~mask) {
> + size = strscpy(str, "reserved bit(s)", len);
> + len -= size;
> + str += size;
> + }
> + } else {
> + first_bit = 0;
> + }
Might be worth
bits = bits & mask;
Obviously setting bits that aren't in the mask is
odd though so maybe a warning print if that happens?
> +
for_each_bit_set(i, &bits, strs_size) {
...
}
> + for (i = 0; i < strs_size; i++) {
> + if (!(bits & (1U << (i + first_bit))))
> + continue;
> +
> + if (*start && len > 0) {
> + *str = '|';
> + len--;
> + str++;
> + }
> +
> + size = strscpy(str, strs[i], len);
> + len -= size;
> + str += size;
> + }
> + return buf;
> +}
> +EXPORT_SYMBOL_GPL(cper_bits_to_str);
> +
> static const char * const proc_type_strs[] = {
> "IA32/X64",
> "IA64",
> diff --git a/include/linux/cper.h b/include/linux/cper.h
> index 265b0f8fc0b3..856e8f00a7fb 100644
> --- a/include/linux/cper.h
> +++ b/include/linux/cper.h
> @@ -584,6 +584,9 @@ const char *cper_mem_err_type_str(unsigned int);
> const char *cper_mem_err_status_str(u64 status);
> void cper_print_bits(const char *prefix, unsigned int bits,
> const char * const strs[], unsigned int strs_size);
> +char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
> + const char * const strs[], unsigned int strs_size,
> + unsigned int mask);
> void cper_mem_err_pack(const struct cper_sec_mem_err *,
> struct cper_mem_err_compact *);
> const char *cper_mem_err_unpack(struct trace_seq *,
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks
2024-06-21 9:20 ` Jonathan Cameron
@ 2024-06-21 9:26 ` Jonathan Cameron
2024-06-21 9:39 ` Mauro Carvalho Chehab
1 sibling, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2024-06-21 9:26 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, James Morse, Shiju Jose, Tony Luck,
Ard Biesheuvel, Dave Jiang, Ira Weiny, linux-edac, linux-efi,
linux-kernel
On Fri, 21 Jun 2024 10:20:36 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> On Thu, 20 Jun 2024 20:01:45 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Sometimes it is desired to produce a single log line for errors.
> > Add a new helper function for such purpose.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > ---
> > drivers/firmware/efi/cper.c | 59 +++++++++++++++++++++++++++++++++++++
> > include/linux/cper.h | 3 ++
> > 2 files changed, 62 insertions(+)
> >
> > diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> > index 7d2cdd9e2227..9bf27af3e870 100644
> > --- a/drivers/firmware/efi/cper.c
> > +++ b/drivers/firmware/efi/cper.c
> > @@ -106,6 +106,65 @@ void cper_print_bits(const char *pfx, unsigned int bits,
> > printk("%s\n", buf);
> > }
> >
> > +/*
>
> It's exported and in a header used by other code, so why not make
> this kernel-doc? /**
>
> > + * cper_bits_to_str - return a string for set bits
> > + * @buf: buffer to store the output string
> > + * @buf_size: size of the output string buffer
> > + * @bits: bit mask
> > + * @strs: string array, indexed by bit position
> > + * @strs_size: size of the string array: @strs
>
> If it had been kernel doc, W=1 would have told you mask is
> missing.
>
> Passing a 0 for mask seems probably not worth while.
> If all bits of the unsigned int are set then people can pass ~0
>
> Or make this cper_bits_to_str_masked() and have
> cper_bits_to_str() that doesn't take a mask.
>
Mask definitely needs docs as I misunderstood it :(
Also needs to be contiguous I think which is also a bit unusual.
> If you do that, some simplifications can be easily made.
>
>
>
> > + *
> > + * add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits,
> > + * add the corresponding string in @strs to @buf.
> > + */
> > +char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
>
> Perhaps make bits an unsigned long as then you can use the
> for_each_set_bit() etc.
>
> > + const char * const strs[], unsigned int strs_size,
> > + unsigned int mask)
> > +{
> > + int i, size, first_bit;
> > + int len = buf_size;
> > + const char *start;
> > + char *str = buf;
> > +
> > + if (strs_size < 16)
> > + size = snprintf(str, len, "0x%02x: ", bits);
> > + if (strs_size < 32)
> > + size = snprintf(str, len, "0x%04x: ", bits);
> > +
> > + len -= size;
> > + str += size;
> > +
> > + start = str;
> > +
> > + if (mask) {
> > + first_bit = ffs(mask) - 1;
> > + if (bits & ~mask) {
> > + size = strscpy(str, "reserved bit(s)", len);
> > + len -= size;
> > + str += size;
> > + }
> > + } else {
> > + first_bit = 0;
> > + }
> Might be worth
>
> bits = bits & mask;
>
> Obviously setting bits that aren't in the mask is
> odd though so maybe a warning print if that happens?
> > +
>
>
> for_each_bit_set(i, &bits, strs_size) {
Ah. I'd missed the offset and gotten function name wrong.
i = first_bit
for_each_set_bit_from(i, &bits, strs_size + first_bit)
and look up based on i - first_bit
> ...
>
> }
>
> > + for (i = 0; i < strs_size; i++) {
> > + if (!(bits & (1U << (i + first_bit))))
> > + continue;
> > +
> > + if (*start && len > 0) {
> > + *str = '|';
> > + len--;
> > + str++;
> > + }
> > +
> > + size = strscpy(str, strs[i], len);
> > + len -= size;
> > + str += size;
> > + }
> > + return buf;
> > +}
> > +EXPORT_SYMBOL_GPL(cper_bits_to_str);
> > +
> > static const char * const proc_type_strs[] = {
> > "IA32/X64",
> > "IA64",
> > diff --git a/include/linux/cper.h b/include/linux/cper.h
> > index 265b0f8fc0b3..856e8f00a7fb 100644
> > --- a/include/linux/cper.h
> > +++ b/include/linux/cper.h
> > @@ -584,6 +584,9 @@ const char *cper_mem_err_type_str(unsigned int);
> > const char *cper_mem_err_status_str(u64 status);
> > void cper_print_bits(const char *prefix, unsigned int bits,
> > const char * const strs[], unsigned int strs_size);
> > +char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
> > + const char * const strs[], unsigned int strs_size,
> > + unsigned int mask);
> > void cper_mem_err_pack(const struct cper_sec_mem_err *,
> > struct cper_mem_err_compact *);
> > const char *cper_mem_err_unpack(struct trace_seq *,
>
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks
2024-06-21 9:20 ` Jonathan Cameron
2024-06-21 9:26 ` Jonathan Cameron
@ 2024-06-21 9:39 ` Mauro Carvalho Chehab
1 sibling, 0 replies; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-21 9:39 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Borislav Petkov, James Morse, Shiju Jose, Tony Luck,
Ard Biesheuvel, Dave Jiang, Ira Weiny, linux-edac, linux-efi,
linux-kernel
Em Fri, 21 Jun 2024 10:20:36 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> escreveu:
> On Thu, 20 Jun 2024 20:01:45 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Sometimes it is desired to produce a single log line for errors.
> > Add a new helper function for such purpose.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > ---
> > drivers/firmware/efi/cper.c | 59 +++++++++++++++++++++++++++++++++++++
> > include/linux/cper.h | 3 ++
> > 2 files changed, 62 insertions(+)
> >
> > diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> > index 7d2cdd9e2227..9bf27af3e870 100644
> > --- a/drivers/firmware/efi/cper.c
> > +++ b/drivers/firmware/efi/cper.c
> > @@ -106,6 +106,65 @@ void cper_print_bits(const char *pfx, unsigned int bits,
> > printk("%s\n", buf);
> > }
> >
> > +/*
>
> It's exported and in a header used by other code, so why not make
> this kernel-doc? /**
I tried to preserve the original non-kernel-doc way, as I'm not sure
why other comments on this file are not marked as kernel-doc stuff.
The code there at cper.c also has other coding style issues - for
instance it uses printk() without an error level.
Anyway, I intend to submit later on a separate patch series converting
the existing function documentation stuff to kernel-doc (and adding to
Documentation if not there already), and maybe addressing some other
coding style issues.
Yet, I would prefer to have such changes out of this fix patch series.
> > + * cper_bits_to_str - return a string for set bits
> > + * @buf: buffer to store the output string
> > + * @buf_size: size of the output string buffer
> > + * @bits: bit mask
> > + * @strs: string array, indexed by bit position
> > + * @strs_size: size of the string array: @strs
>
> If it had been kernel doc, W=1 would have told you mask is
> missing.
Yeah, I saw that just after hitting send :-) I'll add mask at
v5.
> Passing a 0 for mask seems probably not worth while.
> If all bits of the unsigned int are set then people can pass ~0
Makes sense.
>
> Or make this cper_bits_to_str_masked() and have
> cper_bits_to_str() that doesn't take a mask.
>
> If you do that, some simplifications can be easily made.
>
>
>
> > + *
> > + * add to @buf the bitmask in hexadecimal. Then, for each set bit in @bits,
> > + * add the corresponding string in @strs to @buf.
> > + */
> > +char *cper_bits_to_str(char *buf, int buf_size, unsigned int bits,
>
> Perhaps make bits an unsigned long as then you can use the
> for_each_set_bit() etc.
Ok.
>
> > + const char * const strs[], unsigned int strs_size,
> > + unsigned int mask)
> > +{
> > + int i, size, first_bit;
> > + int len = buf_size;
> > + const char *start;
> > + char *str = buf;
> > +
> > + if (strs_size < 16)
> > + size = snprintf(str, len, "0x%02x: ", bits);
> > + if (strs_size < 32)
> > + size = snprintf(str, len, "0x%04x: ", bits);
> > +
> > + len -= size;
> > + str += size;
> > +
> > + start = str;
> > +
> > + if (mask) {
> > + first_bit = ffs(mask) - 1;
> > + if (bits & ~mask) {
> > + size = strscpy(str, "reserved bit(s)", len);
> > + len -= size;
> > + str += size;
> > + }
> > + } else {
> > + first_bit = 0;
> > + }
> Might be worth
>
> bits = bits & mask;
No need to to that if we use for_each_set_bit().
>
> Obviously setting bits that aren't in the mask is
> odd though so maybe a warning print if that happens?
The code already warns about that printing:
"reserved bit(s)"
at the output buffer.
Regards,
Mauro
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
2024-06-20 18:01 [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
2024-06-20 18:01 ` [PATCH v4 1/3] efi/cper: Adjust infopfx size to accept an extra space Mauro Carvalho Chehab
2024-06-20 18:01 ` [PATCH v4 2/3] efi/cper: Add a new helper function to print bitmasks Mauro Carvalho Chehab
@ 2024-06-20 18:01 ` Mauro Carvalho Chehab
2024-06-21 9:30 ` Jonathan Cameron
2024-06-21 7:45 ` [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Ard Biesheuvel
3 siblings, 1 reply; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-20 18:01 UTC (permalink / raw)
Cc: Mauro Carvalho Chehab, Borislav Petkov, James Morse,
Jonathan Cameron, Rafael J. Wysocki, Shiju Jose, Tony Luck,
Uwe Kleine-König, Alison Schofield, Ard Biesheuvel,
Dan Williams, Dave Jiang, Ira Weiny, Len Brown, Shuai Xue,
linux-acpi, linux-edac, linux-efi, linux-kernel
Up to UEFI spec, the type byte of CPER struct for ARM processor was
defined simply as:
Type at byte offset 4:
- Cache error
- TLB Error
- Bus Error
- Micro-architectural Error
All other values are reserved
Yet, there was no information about how this would be encoded.
Spec 2.9A errata corrected it by defining:
- Bit 1 - Cache Error
- Bit 2 - TLB Error
- Bit 3 - Bus Error
- Bit 4 - Micro-architectural Error
All other values are reserved
That actually aligns with the values already defined on older
versions at N.2.4.1. Generic Processor Error Section.
Spec 2.10 also preserve the same encoding as 2.9A
See: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
Adjust CPER and GHES handling code for both generic and ARM
processors to properly handle UEFI 2.9A and 2.10 encoding.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
drivers/acpi/apei/ghes.c | 10 ++++---
drivers/firmware/efi/cper-arm.c | 46 ++++++++++++++-------------------
include/linux/cper.h | 10 +++----
3 files changed, 31 insertions(+), 35 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 623cc0cb4a65..093a2d0e49e7 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -533,6 +533,7 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
{
struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
int flags = sync ? MF_ACTION_REQUIRED : 0;
+ char error_type[120];
bool queued = false;
int sec_sev, i;
char *p;
@@ -546,9 +547,8 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
p = (char *)(err + 1);
for (i = 0; i < err->err_info_num; i++) {
struct cper_arm_err_info *err_info = (struct cper_arm_err_info *)p;
- bool is_cache = (err_info->type == CPER_ARM_CACHE_ERROR);
+ bool is_cache = err_info->type & CPER_ARM_CACHE_ERROR;
bool has_pa = (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR);
- const char *error_type = "unknown error";
/*
* The field (err_info->error_info & BIT(26)) is fixed to set to
@@ -562,8 +562,10 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
continue;
}
- if (err_info->type < ARRAY_SIZE(cper_proc_error_type_strs))
- error_type = cper_proc_error_type_strs[err_info->type];
+ cper_bits_to_str(error_type, sizeof(error_type), err_info->type,
+ cper_proc_error_type_strs,
+ ARRAY_SIZE(cper_proc_error_type_strs),
+ CPER_ARM_ERR_TYPE_MASK);
pr_warn_ratelimited(FW_WARN GHES_PFX
"Unhandled processor error type: %s\n",
diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
index d9bbcea0adf4..4c101a09fd80 100644
--- a/drivers/firmware/efi/cper-arm.c
+++ b/drivers/firmware/efi/cper-arm.c
@@ -93,15 +93,11 @@ static void cper_print_arm_err_info(const char *pfx, u32 type,
bool proc_context_corrupt, corrected, precise_pc, restartable_pc;
bool time_out, access_mode;
- /* If the type is unknown, bail. */
- if (type > CPER_ARM_MAX_TYPE)
- return;
-
/*
* Vendor type errors have error information values that are vendor
* specific.
*/
- if (type == CPER_ARM_VENDOR_ERROR)
+ if (type & CPER_ARM_VENDOR_ERROR)
return;
if (error_info & CPER_ARM_ERR_VALID_TRANSACTION_TYPE) {
@@ -116,43 +112,38 @@ static void cper_print_arm_err_info(const char *pfx, u32 type,
if (error_info & CPER_ARM_ERR_VALID_OPERATION_TYPE) {
op_type = ((error_info >> CPER_ARM_ERR_OPERATION_SHIFT)
& CPER_ARM_ERR_OPERATION_MASK);
- switch (type) {
- case CPER_ARM_CACHE_ERROR:
+ if (type & CPER_ARM_CACHE_ERROR) {
if (op_type < ARRAY_SIZE(arm_cache_err_op_strs)) {
- printk("%soperation type: %s\n", pfx,
+ printk("%scache error, operation type: %s\n", pfx,
arm_cache_err_op_strs[op_type]);
}
- break;
- case CPER_ARM_TLB_ERROR:
+ }
+ if (type & CPER_ARM_TLB_ERROR) {
if (op_type < ARRAY_SIZE(arm_tlb_err_op_strs)) {
- printk("%soperation type: %s\n", pfx,
+ printk("%sTLB error, operation type: %s\n", pfx,
arm_tlb_err_op_strs[op_type]);
}
- break;
- case CPER_ARM_BUS_ERROR:
+ }
+ if (type & CPER_ARM_BUS_ERROR) {
if (op_type < ARRAY_SIZE(arm_bus_err_op_strs)) {
- printk("%soperation type: %s\n", pfx,
+ printk("%sbus error, operation type: %s\n", pfx,
arm_bus_err_op_strs[op_type]);
}
- break;
}
}
if (error_info & CPER_ARM_ERR_VALID_LEVEL) {
level = ((error_info >> CPER_ARM_ERR_LEVEL_SHIFT)
& CPER_ARM_ERR_LEVEL_MASK);
- switch (type) {
- case CPER_ARM_CACHE_ERROR:
+ if (type & CPER_ARM_CACHE_ERROR)
printk("%scache level: %d\n", pfx, level);
- break;
- case CPER_ARM_TLB_ERROR:
+
+ if (type & CPER_ARM_TLB_ERROR)
printk("%sTLB level: %d\n", pfx, level);
- break;
- case CPER_ARM_BUS_ERROR:
+
+ if (type & CPER_ARM_BUS_ERROR)
printk("%saffinity level at which the bus error occurred: %d\n",
pfx, level);
- break;
- }
}
if (error_info & CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT) {
@@ -241,6 +232,7 @@ void cper_print_proc_arm(const char *pfx,
struct cper_arm_err_info *err_info;
struct cper_arm_ctx_info *ctx_info;
char newpfx[64], infopfx[65];
+ char error_type[120];
printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
@@ -289,9 +281,11 @@ void cper_print_proc_arm(const char *pfx,
newpfx);
}
- printk("%serror_type: %d, %s\n", newpfx, err_info->type,
- err_info->type < ARRAY_SIZE(cper_proc_error_type_strs) ?
- cper_proc_error_type_strs[err_info->type] : "unknown");
+ cper_bits_to_str(error_type, sizeof(error_type), err_info->type,
+ cper_proc_error_type_strs,
+ ARRAY_SIZE(cper_proc_error_type_strs),
+ CPER_ARM_ERR_TYPE_MASK);
+ printk("%serror_type: %s\n", newpfx, error_type);
if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) {
printk("%serror_info: 0x%016llx\n", newpfx,
err_info->error_info);
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 856e8f00a7fb..4fef462944d6 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -293,11 +293,11 @@ enum {
#define CPER_ARM_INFO_FLAGS_PROPAGATED BIT(2)
#define CPER_ARM_INFO_FLAGS_OVERFLOW BIT(3)
-#define CPER_ARM_CACHE_ERROR 0
-#define CPER_ARM_TLB_ERROR 1
-#define CPER_ARM_BUS_ERROR 2
-#define CPER_ARM_VENDOR_ERROR 3
-#define CPER_ARM_MAX_TYPE CPER_ARM_VENDOR_ERROR
+#define CPER_ARM_ERR_TYPE_MASK GENMASK(4,1)
+#define CPER_ARM_CACHE_ERROR BIT(1)
+#define CPER_ARM_TLB_ERROR BIT(2)
+#define CPER_ARM_BUS_ERROR BIT(3)
+#define CPER_ARM_VENDOR_ERROR BIT(4)
#define CPER_ARM_ERR_VALID_TRANSACTION_TYPE BIT(0)
#define CPER_ARM_ERR_VALID_OPERATION_TYPE BIT(1)
--
2.45.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
2024-06-20 18:01 ` [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
@ 2024-06-21 9:30 ` Jonathan Cameron
2024-06-21 9:47 ` Mauro Carvalho Chehab
0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2024-06-21 9:30 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, James Morse, Rafael J. Wysocki, Shiju Jose,
Tony Luck, Uwe Kleine-König, Alison Schofield,
Ard Biesheuvel, Dan Williams, Dave Jiang, Ira Weiny, Len Brown,
Shuai Xue, linux-acpi, linux-edac, linux-efi, linux-kernel
On Thu, 20 Jun 2024 20:01:46 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> Up to UEFI spec, the type byte of CPER struct for ARM processor was
> defined simply as:
>
> Type at byte offset 4:
>
> - Cache error
> - TLB Error
> - Bus Error
> - Micro-architectural Error
> All other values are reserved
>
> Yet, there was no information about how this would be encoded.
>
> Spec 2.9A errata corrected it by defining:
>
> - Bit 1 - Cache Error
> - Bit 2 - TLB Error
> - Bit 3 - Bus Error
> - Bit 4 - Micro-architectural Error
> All other values are reserved
>
> That actually aligns with the values already defined on older
> versions at N.2.4.1. Generic Processor Error Section.
>
> Spec 2.10 also preserve the same encoding as 2.9A
>
> See: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
>
> Adjust CPER and GHES handling code for both generic and ARM
> processors to properly handle UEFI 2.9A and 2.10 encoding.
>
> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
I think you can avoid complexity of your masking solution.
Cost is we don't have that function print that there were reserved bits
set, but that could be easily handled at the caller including notifying
on bits above the defined range which might be helpful.
> diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
> index d9bbcea0adf4..4c101a09fd80 100644
> --- a/drivers/firmware/efi/cper-arm.c
> +++ b/drivers/firmware/efi/cper-arm.c
...
> if (error_info & CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT) {
> @@ -241,6 +232,7 @@ void cper_print_proc_arm(const char *pfx,
> struct cper_arm_err_info *err_info;
> struct cper_arm_ctx_info *ctx_info;
> char newpfx[64], infopfx[65];
> + char error_type[120];
>
> printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
>
> @@ -289,9 +281,11 @@ void cper_print_proc_arm(const char *pfx,
> newpfx);
> }
>
> - printk("%serror_type: %d, %s\n", newpfx, err_info->type,
> - err_info->type < ARRAY_SIZE(cper_proc_error_type_strs) ?
> - cper_proc_error_type_strs[err_info->type] : "unknown");
> + cper_bits_to_str(error_type, sizeof(error_type), err_info->type,
> + cper_proc_error_type_strs,
> + ARRAY_SIZE(cper_proc_error_type_strs),
> + CPER_ARM_ERR_TYPE_MASK);
Maybe drop this mask complexity and just use
FIELD_GET() to extract the relevant field with no shift from 0.
> + printk("%serror_type: %s\n", newpfx, error_type);
> if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) {
> printk("%serror_info: 0x%016llx\n", newpfx,
> err_info->error_info);
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
2024-06-21 9:30 ` Jonathan Cameron
@ 2024-06-21 9:47 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-21 9:47 UTC (permalink / raw)
To: Jonathan Cameron
Cc: Borislav Petkov, James Morse, Rafael J. Wysocki, Shiju Jose,
Tony Luck, Uwe Kleine-König, Alison Schofield,
Ard Biesheuvel, Dan Williams, Dave Jiang, Ira Weiny, Len Brown,
Shuai Xue, linux-acpi, linux-edac, linux-efi, linux-kernel
Em Fri, 21 Jun 2024 10:30:50 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> escreveu:
> On Thu, 20 Jun 2024 20:01:46 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
>
> > Up to UEFI spec, the type byte of CPER struct for ARM processor was
> > defined simply as:
> >
> > Type at byte offset 4:
> >
> > - Cache error
> > - TLB Error
> > - Bus Error
> > - Micro-architectural Error
> > All other values are reserved
> >
> > Yet, there was no information about how this would be encoded.
> >
> > Spec 2.9A errata corrected it by defining:
> >
> > - Bit 1 - Cache Error
> > - Bit 2 - TLB Error
> > - Bit 3 - Bus Error
> > - Bit 4 - Micro-architectural Error
> > All other values are reserved
> >
> > That actually aligns with the values already defined on older
> > versions at N.2.4.1. Generic Processor Error Section.
> >
> > Spec 2.10 also preserve the same encoding as 2.9A
> >
> > See: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-information
> >
> > Adjust CPER and GHES handling code for both generic and ARM
> > processors to properly handle UEFI 2.9A and 2.10 encoding.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>
> I think you can avoid complexity of your masking solution.
> Cost is we don't have that function print that there were reserved bits
> set, but that could be easily handled at the caller including notifying
> on bits above the defined range which might be helpful.
>
> > diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
> > index d9bbcea0adf4..4c101a09fd80 100644
> > --- a/drivers/firmware/efi/cper-arm.c
> > +++ b/drivers/firmware/efi/cper-arm.c
> ...
>
> > if (error_info & CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT) {
> > @@ -241,6 +232,7 @@ void cper_print_proc_arm(const char *pfx,
> > struct cper_arm_err_info *err_info;
> > struct cper_arm_ctx_info *ctx_info;
> > char newpfx[64], infopfx[65];
> > + char error_type[120];
> >
> > printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
> >
> > @@ -289,9 +281,11 @@ void cper_print_proc_arm(const char *pfx,
> > newpfx);
> > }
> >
> > - printk("%serror_type: %d, %s\n", newpfx, err_info->type,
> > - err_info->type < ARRAY_SIZE(cper_proc_error_type_strs) ?
> > - cper_proc_error_type_strs[err_info->type] : "unknown");
> > + cper_bits_to_str(error_type, sizeof(error_type), err_info->type,
> > + cper_proc_error_type_strs,
> > + ARRAY_SIZE(cper_proc_error_type_strs),
> > + CPER_ARM_ERR_TYPE_MASK);
>
> Maybe drop this mask complexity and just use
> FIELD_GET() to extract the relevant field with no shift from 0.
IMO not using the function will make the code here more complex, as the
same code needs to be duplicated on two places: here and at ghes, where
the error bits are printed using pr_warn_ratelimited():
cper_bits_to_str(error_type, sizeof(error_type), err_info->type,
cper_proc_error_type_strs,
ARRAY_SIZE(cper_proc_error_type_strs),
CPER_ARM_ERR_TYPE_MASK);
pr_warn_ratelimited(FW_WARN GHES_PFX
"Unhandled processor error type: %s\n",
Also, other parts of CPER uses cper_bits_print() for the same reason:
to have the common print code handled inside a function instead of
repeating the same print pattern everywhere.
> > + printk("%serror_type: %s\n", newpfx, error_type);
> > if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) {
> > printk("%serror_info: 0x%016llx\n", newpfx,
> > err_info->error_info);
>
>
Regards,
Mauro
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata
2024-06-20 18:01 [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Mauro Carvalho Chehab
` (2 preceding siblings ...)
2024-06-20 18:01 ` [PATCH v4 3/3] efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs Mauro Carvalho Chehab
@ 2024-06-21 7:45 ` Ard Biesheuvel
2024-06-21 15:26 ` Mauro Carvalho Chehab
3 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2024-06-21 7:45 UTC (permalink / raw)
To: Mauro Carvalho Chehab
Cc: Borislav Petkov, Tony Luck, James Morse, Jonathan Cameron,
Shiju Jose, linux-efi, linux-kernel, linux-edac, Len Brown,
linux-acpi
On Thu, 20 Jun 2024 at 20:01, Mauro Carvalho Chehab
<mchehab+huawei@kernel.org> wrote:
>
> The UEFI 2.9A errata makes clear how ARM processor type encoding should
> be done: it is meant to be equal to Generic processor, using a bitmask.
>
> The current code assumes, for both generic and ARM processor types
> that this is an integer, which is an incorrect assumption.
>
> Fix it. While here, also fix a compilation issue when using W=1.
>
> After the change, Kernel will properly decode receiving two errors at the same
> message, as defined at UEFI spec:
>
> [ 75.282430] Memory failure: 0x5cdfd: recovery action for free buddy page: Recovered
> [ 94.973081] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> [ 94.973770] {2}[Hardware Error]: event severity: recoverable
> [ 94.974334] {2}[Hardware Error]: Error 0, type: recoverable
> [ 94.974962] {2}[Hardware Error]: section_type: ARM processor error
> [ 94.975586] {2}[Hardware Error]: MIDR: 0x000000000000cd24
> [ 94.976202] {2}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x000000000000ab12
> [ 94.977011] {2}[Hardware Error]: error affinity level: 2
> [ 94.977593] {2}[Hardware Error]: running state: 0x1
> [ 94.978135] {2}[Hardware Error]: Power State Coordination Interface state: 4660
> [ 94.978884] {2}[Hardware Error]: Error info structure 0:
> [ 94.979463] {2}[Hardware Error]: num errors: 3
> [ 94.979971] {2}[Hardware Error]: first error captured
> [ 94.980523] {2}[Hardware Error]: propagated error captured
> [ 94.981110] {2}[Hardware Error]: overflow occurred, error info is incomplete
> [ 94.981893] {2}[Hardware Error]: error_type: 0x0006: cache error|TLB error
> [ 94.982606] {2}[Hardware Error]: error_info: 0x000000000091000f
> [ 94.983249] {2}[Hardware Error]: transaction type: Data Access
> [ 94.983891] {2}[Hardware Error]: cache error, operation type: Data write
> [ 94.984559] {2}[Hardware Error]: TLB error, operation type: Data write
> [ 94.985215] {2}[Hardware Error]: cache level: 2
> [ 94.985749] {2}[Hardware Error]: TLB level: 2
> [ 94.986277] {2}[Hardware Error]: processor context not corrupted
>
> And the error code is properly decoded according with table N.17 from UEFI 2.10
> spec:
>
> [ 94.981893] {2}[Hardware Error]: error_type: 0x0006: cache error|TLB error
>
> Mauro Carvalho Chehab (3):
> efi/cper: Adjust infopfx size to accept an extra space
> efi/cper: Add a new helper function to print bitmasks
> efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
>
Hello Mauro,
How this is v4 different from the preceding 3 revisions that you sent
over the past 2 days?
I would expect an experienced maintainer like yourself to be familiar
with the common practice here: please leave some time between sending
revisions so people can take a look. And if there is a pressing need
to deviate from this rule, at least put an explanation in the commit
log of how the series differs from the preceding one.
Thanks,
Ard.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata
2024-06-21 7:45 ` [PATCH v4 0/3] Fix CPER issues related to UEFI 2.9A Errata Ard Biesheuvel
@ 2024-06-21 15:26 ` Mauro Carvalho Chehab
0 siblings, 0 replies; 13+ messages in thread
From: Mauro Carvalho Chehab @ 2024-06-21 15:26 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Borislav Petkov, Tony Luck, James Morse, Jonathan Cameron,
Shiju Jose, linux-efi, linux-kernel, linux-edac, Len Brown,
linux-acpi
Hi Ard,
Em Fri, 21 Jun 2024 09:45:16 +0200
Ard Biesheuvel <ardb@kernel.org> escreveu:
> On Thu, 20 Jun 2024 at 20:01, Mauro Carvalho Chehab
> <mchehab+huawei@kernel.org> wrote:
> >
> > The UEFI 2.9A errata makes clear how ARM processor type encoding should
> > be done: it is meant to be equal to Generic processor, using a bitmask.
> >
> > The current code assumes, for both generic and ARM processor types
> > that this is an integer, which is an incorrect assumption.
> >
> > Fix it. While here, also fix a compilation issue when using W=1.
> >
> > After the change, Kernel will properly decode receiving two errors at the same
> > message, as defined at UEFI spec:
> >
> > [ 75.282430] Memory failure: 0x5cdfd: recovery action for free buddy page: Recovered
> > [ 94.973081] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
> > [ 94.973770] {2}[Hardware Error]: event severity: recoverable
> > [ 94.974334] {2}[Hardware Error]: Error 0, type: recoverable
> > [ 94.974962] {2}[Hardware Error]: section_type: ARM processor error
> > [ 94.975586] {2}[Hardware Error]: MIDR: 0x000000000000cd24
> > [ 94.976202] {2}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x000000000000ab12
> > [ 94.977011] {2}[Hardware Error]: error affinity level: 2
> > [ 94.977593] {2}[Hardware Error]: running state: 0x1
> > [ 94.978135] {2}[Hardware Error]: Power State Coordination Interface state: 4660
> > [ 94.978884] {2}[Hardware Error]: Error info structure 0:
> > [ 94.979463] {2}[Hardware Error]: num errors: 3
> > [ 94.979971] {2}[Hardware Error]: first error captured
> > [ 94.980523] {2}[Hardware Error]: propagated error captured
> > [ 94.981110] {2}[Hardware Error]: overflow occurred, error info is incomplete
> > [ 94.981893] {2}[Hardware Error]: error_type: 0x0006: cache error|TLB error
> > [ 94.982606] {2}[Hardware Error]: error_info: 0x000000000091000f
> > [ 94.983249] {2}[Hardware Error]: transaction type: Data Access
> > [ 94.983891] {2}[Hardware Error]: cache error, operation type: Data write
> > [ 94.984559] {2}[Hardware Error]: TLB error, operation type: Data write
> > [ 94.985215] {2}[Hardware Error]: cache level: 2
> > [ 94.985749] {2}[Hardware Error]: TLB level: 2
> > [ 94.986277] {2}[Hardware Error]: processor context not corrupted
> >
> > And the error code is properly decoded according with table N.17 from UEFI 2.10
> > spec:
> >
> > [ 94.981893] {2}[Hardware Error]: error_type: 0x0006: cache error|TLB error
> >
> > Mauro Carvalho Chehab (3):
> > efi/cper: Adjust infopfx size to accept an extra space
> > efi/cper: Add a new helper function to print bitmasks
> > efi/cper: align ARM CPER type with UEFI 2.9A/2.10 specs
> >
>
> Hello Mauro,
>
> How this is v4 different from the preceding 3 revisions that you sent
> over the past 2 days?
>
> I would expect an experienced maintainer like yourself to be familiar
> with the common practice here: please leave some time between sending
> revisions so people can take a look. And if there is a pressing need
> to deviate from this rule, at least put an explanation in the commit
> log of how the series differs from the preceding one.
Sorry, I'll add a version review on that. Basically I was missing a
test environment to do error injection. When I got it enabled, and fixed
to cope with UEFI 2.9A/2.10 expected behavior, I was able to discover
some issues and to do some code improvements.
v1:
- (tagged as RFC) was mostly to give a heads up that the current
implementation is not following the spec. It also touches
only cper code.
v2:
- It fixes the way printks are handled on both cper_arm and ghes
drivers;
v3:
- It adds a helper function to produce a buffer describing the
error bits at cper's printk and ghes pr_warn_bitrated. It also
fixes a W=1 error while building cper;
v4:
- The print function had some bugs on it, which was discovered with
the help of an error injection tool I'm now using.
I have already another version ready to send. It does some code
cleanup and address the issues pointed by Tony and Jonathan. If you
prefer, I can hold it until Monday to give you some time to look
at it.
Thanks,
Mauro
^ permalink raw reply [flat|nested] 13+ messages in thread