* [PATCH 0/3] efi/memattr: Improve the efi_memattr_init function.
@ 2025-01-06 19:02 Breno Leitao
2025-01-06 19:02 ` [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void Breno Leitao
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Breno Leitao @ 2025-01-06 19:02 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, Breno Leitao,
kernel-team
After 8fbe4c49c0ccac ("efi/memattr: Ignore table if the size is clearly
bogus") landed, we are seeing a lot of firmware bugs in the Meta fleet.
There is an indication that this might be related to kexec, but, more on
this later.
While debugging the current issue, I modified the efi_memattr_init()
function in a way that I think it might be useful upstream.
The changes will help to detect EFI problems at scale, by making it more
verbose and easy to categorize early FW bugs.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (3):
efi/memattr: Convert efi_memattr_init() return type to void
efi/memattr: Add FW_BUG prefix to firmware error messages
efi/memattr: Include EFI memmap size in corruption warnings
drivers/firmware/efi/memattr.c | 20 ++++++++++----------
include/linux/efi.h | 2 +-
2 files changed, 11 insertions(+), 11 deletions(-)
---
base-commit: c8e17339185951589d109ba81e8854513ab3d26f
change-id: 20250106-efi_fw_bug-2b62e9a41899
Best regards,
--
Breno Leitao <leitao@debian.org>
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void 2025-01-06 19:02 [PATCH 0/3] efi/memattr: Improve the efi_memattr_init function Breno Leitao @ 2025-01-06 19:02 ` Breno Leitao 2025-01-06 19:24 ` Ard Biesheuvel 2025-01-06 19:02 ` [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages Breno Leitao 2025-01-06 19:02 ` [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings Breno Leitao 2 siblings, 1 reply; 11+ messages in thread From: Breno Leitao @ 2025-01-06 19:02 UTC (permalink / raw) To: Ard Biesheuvel Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, Breno Leitao, kernel-team The efi_memattr_init() function's return values (0 and -ENOMEM) are never checked by callers. Convert the function to return void since the return status is unused. Signed-off-by: Breno Leitao <leitao@debian.org> --- drivers/firmware/efi/memattr.c | 7 +++---- include/linux/efi.h | 2 +- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c index c38b1a335590d4f6b75012414a936e66f22dcd9a..e727cc5909cb676c47d787ab0d7754b6fdcb493d 100644 --- a/drivers/firmware/efi/memattr.c +++ b/drivers/firmware/efi/memattr.c @@ -19,19 +19,19 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; * Reserve the memory associated with the Memory Attributes configuration * table, if it exists. */ -int __init efi_memattr_init(void) +void __init efi_memattr_init(void) { efi_memory_attributes_table_t *tbl; unsigned long size; if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) - return 0; + return; tbl = early_memremap(efi_mem_attr_table, sizeof(*tbl)); if (!tbl) { pr_err("Failed to map EFI Memory Attributes table @ 0x%lx\n", efi_mem_attr_table); - return -ENOMEM; + return; } if (tbl->version > 2) { @@ -61,7 +61,6 @@ int __init efi_memattr_init(void) unmap: early_memunmap(tbl, sizeof(*tbl)); - return 0; } /* diff --git a/include/linux/efi.h b/include/linux/efi.h index e5815867aba971a64091bf0ea017d95f21115124..5404939df81bdc870f6ee5d0fb7cbb53394a2304 100644 --- a/include/linux/efi.h +++ b/include/linux/efi.h @@ -771,7 +771,7 @@ extern unsigned long efi_mem_attr_table; */ typedef int (*efi_memattr_perm_setter)(struct mm_struct *, efi_memory_desc_t *, bool); -extern int efi_memattr_init(void); +extern void efi_memattr_init(void); extern int efi_memattr_apply_permissions(struct mm_struct *mm, efi_memattr_perm_setter fn); -- 2.43.5 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void 2025-01-06 19:02 ` [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void Breno Leitao @ 2025-01-06 19:24 ` Ard Biesheuvel 0 siblings, 0 replies; 11+ messages in thread From: Ard Biesheuvel @ 2025-01-06 19:24 UTC (permalink / raw) To: Breno Leitao Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team Hello Breno, On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: > > The efi_memattr_init() function's return values (0 and -ENOMEM) are never > checked by callers. Convert the function to return void since the return > status is unused. > > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > drivers/firmware/efi/memattr.c | 7 +++---- > include/linux/efi.h | 2 +- > 2 files changed, 4 insertions(+), 5 deletions(-) > Acked-by: Ard Biesheuvel <ardb@kernel.org> > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > index c38b1a335590d4f6b75012414a936e66f22dcd9a..e727cc5909cb676c47d787ab0d7754b6fdcb493d 100644 > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -19,19 +19,19 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; > * Reserve the memory associated with the Memory Attributes configuration > * table, if it exists. > */ > -int __init efi_memattr_init(void) > +void __init efi_memattr_init(void) > { > efi_memory_attributes_table_t *tbl; > unsigned long size; > > if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) > - return 0; > + return; > > tbl = early_memremap(efi_mem_attr_table, sizeof(*tbl)); > if (!tbl) { > pr_err("Failed to map EFI Memory Attributes table @ 0x%lx\n", > efi_mem_attr_table); > - return -ENOMEM; > + return; > } > > if (tbl->version > 2) { > @@ -61,7 +61,6 @@ int __init efi_memattr_init(void) > > unmap: > early_memunmap(tbl, sizeof(*tbl)); > - return 0; > } > > /* > diff --git a/include/linux/efi.h b/include/linux/efi.h > index e5815867aba971a64091bf0ea017d95f21115124..5404939df81bdc870f6ee5d0fb7cbb53394a2304 100644 > --- a/include/linux/efi.h > +++ b/include/linux/efi.h > @@ -771,7 +771,7 @@ extern unsigned long efi_mem_attr_table; > */ > typedef int (*efi_memattr_perm_setter)(struct mm_struct *, efi_memory_desc_t *, bool); > > -extern int efi_memattr_init(void); > +extern void efi_memattr_init(void); > extern int efi_memattr_apply_permissions(struct mm_struct *mm, > efi_memattr_perm_setter fn); > > > -- > 2.43.5 > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages 2025-01-06 19:02 [PATCH 0/3] efi/memattr: Improve the efi_memattr_init function Breno Leitao 2025-01-06 19:02 ` [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void Breno Leitao @ 2025-01-06 19:02 ` Breno Leitao 2025-01-06 19:25 ` Ard Biesheuvel 2025-01-06 19:02 ` [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings Breno Leitao 2 siblings, 1 reply; 11+ messages in thread From: Breno Leitao @ 2025-01-06 19:02 UTC (permalink / raw) To: Ard Biesheuvel Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, Breno Leitao, kernel-team Tag firmware-related error messages with FW_BUG in efi_memattr_init() to make EFI firmware issues more discoverable and consistent with kernel error reporting conventions. Signed-off-by: Breno Leitao <leitao@debian.org> --- drivers/firmware/efi/memattr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c index e727cc5909cb676c47d787ab0d7754b6fdcb493d..5f83cdea88b05cb325e9f90c14a0048131e53cfa 100644 --- a/drivers/firmware/efi/memattr.c +++ b/drivers/firmware/efi/memattr.c @@ -29,13 +29,13 @@ void __init efi_memattr_init(void) tbl = early_memremap(efi_mem_attr_table, sizeof(*tbl)); if (!tbl) { - pr_err("Failed to map EFI Memory Attributes table @ 0x%lx\n", + pr_err(FW_BUG "Failed to map EFI Memory Attributes table @ 0x%lx\n", efi_mem_attr_table); return; } if (tbl->version > 2) { - pr_warn("Unexpected EFI Memory Attributes table version %d\n", + pr_warn(FW_BUG "Unexpected EFI Memory Attributes table version %d\n", tbl->version); goto unmap; } -- 2.43.5 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages 2025-01-06 19:02 ` [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages Breno Leitao @ 2025-01-06 19:25 ` Ard Biesheuvel 0 siblings, 0 replies; 11+ messages in thread From: Ard Biesheuvel @ 2025-01-06 19:25 UTC (permalink / raw) To: Breno Leitao Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: > > Tag firmware-related error messages with FW_BUG in efi_memattr_init() to > make EFI firmware issues more discoverable and consistent with kernel > error reporting conventions. > > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > drivers/firmware/efi/memattr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > index e727cc5909cb676c47d787ab0d7754b6fdcb493d..5f83cdea88b05cb325e9f90c14a0048131e53cfa 100644 > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -29,13 +29,13 @@ void __init efi_memattr_init(void) > > tbl = early_memremap(efi_mem_attr_table, sizeof(*tbl)); > if (!tbl) { > - pr_err("Failed to map EFI Memory Attributes table @ 0x%lx\n", > + pr_err(FW_BUG "Failed to map EFI Memory Attributes table @ 0x%lx\n", > efi_mem_attr_table); > return; > } > > if (tbl->version > 2) { > - pr_warn("Unexpected EFI Memory Attributes table version %d\n", > + pr_warn(FW_BUG "Unexpected EFI Memory Attributes table version %d\n", > tbl->version); > goto unmap; > } > Neither of these are firmware bugs, so we shouldn't report them as such. (A future version of the table could be > 2, and we wouldn't know how to deal with that. That doesn't mean there is anything wrong with the firmware, it only means the kernel is too old) ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-06 19:02 [PATCH 0/3] efi/memattr: Improve the efi_memattr_init function Breno Leitao 2025-01-06 19:02 ` [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void Breno Leitao 2025-01-06 19:02 ` [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages Breno Leitao @ 2025-01-06 19:02 ` Breno Leitao 2025-01-07 11:24 ` Ard Biesheuvel 2 siblings, 1 reply; 11+ messages in thread From: Breno Leitao @ 2025-01-06 19:02 UTC (permalink / raw) To: Ard Biesheuvel Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, Breno Leitao, kernel-team Add EFI memory map size to warning messages when a corrupted Memory Attributes Table is detected, making it easier to diagnose firmware issues. Signed-off-by: Breno Leitao <leitao@debian.org> --- drivers/firmware/efi/memattr.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644 --- a/drivers/firmware/efi/memattr.c +++ b/drivers/firmware/efi/memattr.c @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; void __init efi_memattr_init(void) { efi_memory_attributes_table_t *tbl; - unsigned long size; + unsigned long size, efi_map_size; if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) return; @@ -49,9 +49,10 @@ void __init efi_memattr_init(void) * just be ignored altogether. */ size = tbl->num_entries * tbl->desc_size; - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", - tbl->version, tbl->desc_size, tbl->num_entries); + efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size; + if (size > 3 * efi_map_size) { + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n", + tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size); goto unmap; } -- 2.43.5 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-06 19:02 ` [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings Breno Leitao @ 2025-01-07 11:24 ` Ard Biesheuvel 2025-01-07 12:05 ` Breno Leitao 0 siblings, 1 reply; 11+ messages in thread From: Ard Biesheuvel @ 2025-01-07 11:24 UTC (permalink / raw) To: Breno Leitao Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: > > Add EFI memory map size to warning messages when a corrupted Memory > Attributes Table is detected, making it easier to diagnose firmware issues. > > Signed-off-by: Breno Leitao <leitao@debian.org> > --- > drivers/firmware/efi/memattr.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644 > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; > void __init efi_memattr_init(void) > { > efi_memory_attributes_table_t *tbl; > - unsigned long size; > + unsigned long size, efi_map_size; > > if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) > return; > @@ -49,9 +49,10 @@ void __init efi_memattr_init(void) > * just be ignored altogether. > */ > size = tbl->num_entries * tbl->desc_size; > - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { > - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", > - tbl->version, tbl->desc_size, tbl->num_entries); > + efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size; > + if (size > 3 * efi_map_size) { > + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n", > + tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size); > goto unmap; > } > > Hello Breno, I don't mind the patch per se, but I don't think it is terribly useful either. Could you explain how this helps? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-07 11:24 ` Ard Biesheuvel @ 2025-01-07 12:05 ` Breno Leitao 2025-01-09 14:48 ` Ard Biesheuvel 0 siblings, 1 reply; 11+ messages in thread From: Breno Leitao @ 2025-01-07 12:05 UTC (permalink / raw) To: Ard Biesheuvel Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team Hello Ard, On Tue, Jan 07, 2025 at 12:24:03PM +0100, Ard Biesheuvel wrote: > On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: > > > > Add EFI memory map size to warning messages when a corrupted Memory > > Attributes Table is detected, making it easier to diagnose firmware issues. > > > > Signed-off-by: Breno Leitao <leitao@debian.org> > > --- > > drivers/firmware/efi/memattr.c | 9 +++++---- > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > > index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644 > > --- a/drivers/firmware/efi/memattr.c > > +++ b/drivers/firmware/efi/memattr.c > > @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; > > void __init efi_memattr_init(void) > > { > > efi_memory_attributes_table_t *tbl; > > - unsigned long size; > > + unsigned long size, efi_map_size; > > > > if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) > > return; > > @@ -49,9 +49,10 @@ void __init efi_memattr_init(void) > > * just be ignored altogether. > > */ > > size = tbl->num_entries * tbl->desc_size; > > - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { > > - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", > > - tbl->version, tbl->desc_size, tbl->num_entries); > > + efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size; > > + if (size > 3 * efi_map_size) { > > + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n", > > + tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size); > > goto unmap; > > } > > > > > > Hello Breno, > > I don't mind the patch per se, but I don't think it is terribly useful either. > > Could you explain how this helps? We are seeing a bunch of `Corrupted EFI Memory Attributes Table detected!` in the Meta fleet, and this is something we are investigating. We highly think this is related to some kexec overwrites, and when we get here, the EFI table is completely garbage. I haven't seen this problem on cold boot. Here are sof the instances I see: efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 18058, num_entries == 33554432) efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 1, desc_size == 2072184435, num_entries == 3248688968) efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 83886080, num_entries == 304) efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 2, desc_size == 48, num_entries == 40) Anyway, back to you question, this patch helped us to narrow down and find where the problem was, by printing all variables taken in consideration to get the conclusion that the firmware is buggy. Regarding the problem, Usama and I are suspecting that it might be related to some 77d48d39e99170 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption"), but at this time with memattr table, where it might not preserved during kexec(?). Thanks, --breno ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-07 12:05 ` Breno Leitao @ 2025-01-09 14:48 ` Ard Biesheuvel 2025-01-09 15:09 ` Usama Arif 2025-01-09 17:45 ` Breno Leitao 0 siblings, 2 replies; 11+ messages in thread From: Ard Biesheuvel @ 2025-01-09 14:48 UTC (permalink / raw) To: Breno Leitao Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team On Tue, 7 Jan 2025 at 13:05, Breno Leitao <leitao@debian.org> wrote: > > Hello Ard, > > On Tue, Jan 07, 2025 at 12:24:03PM +0100, Ard Biesheuvel wrote: > > On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: > > > > > > Add EFI memory map size to warning messages when a corrupted Memory > > > Attributes Table is detected, making it easier to diagnose firmware issues. > > > > > > Signed-off-by: Breno Leitao <leitao@debian.org> > > > --- > > > drivers/firmware/efi/memattr.c | 9 +++++---- > > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c > > > index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644 > > > --- a/drivers/firmware/efi/memattr.c > > > +++ b/drivers/firmware/efi/memattr.c > > > @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; > > > void __init efi_memattr_init(void) > > > { > > > efi_memory_attributes_table_t *tbl; > > > - unsigned long size; > > > + unsigned long size, efi_map_size; > > > > > > if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) > > > return; > > > @@ -49,9 +49,10 @@ void __init efi_memattr_init(void) > > > * just be ignored altogether. > > > */ > > > size = tbl->num_entries * tbl->desc_size; > > > - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { > > > - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", > > > - tbl->version, tbl->desc_size, tbl->num_entries); > > > + efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size; > > > + if (size > 3 * efi_map_size) { > > > + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n", > > > + tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size); > > > goto unmap; > > > } > > > > > > > > > > Hello Breno, > > > > I don't mind the patch per se, but I don't think it is terribly useful either. > > > > Could you explain how this helps? > > We are seeing a bunch of `Corrupted EFI Memory Attributes Table > detected!` in the Meta fleet, and this is something we are > investigating. > > We highly think this is related to some kexec overwrites, and when we > get here, the EFI table is completely garbage. I haven't seen this > problem on cold boot. > It likely means the memory is not reserved correctly. Could you check whether this --- a/drivers/firmware/efi/memattr.c +++ b/drivers/firmware/efi/memattr.c @@ -56,7 +56,7 @@ int __init efi_memattr_init(void) } tbl_size = sizeof(*tbl) + size; - memblock_reserve(efi_mem_attr_table, tbl_size); + efi_mem_reserve(efi_mem_attr_table, tbl_size); set_bit(EFI_MEM_ATTR, &efi.flags); unmap: makes any difference? > Here are sof the instances I see: > > efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 18058, num_entries == 33554432) > efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 1, desc_size == 2072184435, num_entries == 3248688968) > efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 83886080, num_entries == 304) > efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 2, desc_size == 48, num_entries == 40) > The last one looks like a false positive: each of those values seems perfectly reasonable. Any chance you could dump the memory map and this table (boot using efi=debug) on this system? > Anyway, back to you question, this patch helped us to narrow down and > find where the problem was, by printing all variables taken in > consideration to get the conclusion that the firmware is buggy. > Fair enough. > Regarding the problem, Usama and I are suspecting that it might be > related to some 77d48d39e99170 ("efistub/tpm: Use ACPI reclaim memory for > event log to avoid corruption"), but at this time with memattr table, where it > might not preserved during kexec(?). > Please see the suggestion above. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-09 14:48 ` Ard Biesheuvel @ 2025-01-09 15:09 ` Usama Arif 2025-01-09 17:45 ` Breno Leitao 1 sibling, 0 replies; 11+ messages in thread From: Usama Arif @ 2025-01-09 15:09 UTC (permalink / raw) To: Ard Biesheuvel, Breno Leitao Cc: Gregory Price, linux-efi, linux-kernel, kernel-team On 09/01/2025 14:48, Ard Biesheuvel wrote: > On Tue, 7 Jan 2025 at 13:05, Breno Leitao <leitao@debian.org> wrote: >> >> Hello Ard, >> >> On Tue, Jan 07, 2025 at 12:24:03PM +0100, Ard Biesheuvel wrote: >>> On Mon, 6 Jan 2025 at 20:03, Breno Leitao <leitao@debian.org> wrote: >>>> >>>> Add EFI memory map size to warning messages when a corrupted Memory >>>> Attributes Table is detected, making it easier to diagnose firmware issues. >>>> >>>> Signed-off-by: Breno Leitao <leitao@debian.org> >>>> --- >>>> drivers/firmware/efi/memattr.c | 9 +++++---- >>>> 1 file changed, 5 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c >>>> index 5f83cdea88b05cb325e9f90c14a0048131e53cfa..2c276bcc0df48352bec6cd96b69edf67a16f6069 100644 >>>> --- a/drivers/firmware/efi/memattr.c >>>> +++ b/drivers/firmware/efi/memattr.c >>>> @@ -22,7 +22,7 @@ unsigned long __ro_after_init efi_mem_attr_table = EFI_INVALID_TABLE_ADDR; >>>> void __init efi_memattr_init(void) >>>> { >>>> efi_memory_attributes_table_t *tbl; >>>> - unsigned long size; >>>> + unsigned long size, efi_map_size; >>>> >>>> if (efi_mem_attr_table == EFI_INVALID_TABLE_ADDR) >>>> return; >>>> @@ -49,9 +49,10 @@ void __init efi_memattr_init(void) >>>> * just be ignored altogether. >>>> */ >>>> size = tbl->num_entries * tbl->desc_size; >>>> - if (size > 3 * efi.memmap.nr_map * efi.memmap.desc_size) { >>>> - pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u)\n", >>>> - tbl->version, tbl->desc_size, tbl->num_entries); >>>> + efi_map_size = efi.memmap.nr_map * efi.memmap.desc_size; >>>> + if (size > 3 * efi_map_size) { >>>> + pr_warn(FW_BUG "Corrupted EFI Memory Attributes Table detected! (version == %u, desc_size == %u, num_entries == %u, efi_map_size == %lu)\n", >>>> + tbl->version, tbl->desc_size, tbl->num_entries, efi_map_size); >>>> goto unmap; >>>> } >>>> >>>> >>> >>> Hello Breno, >>> >>> I don't mind the patch per se, but I don't think it is terribly useful either. >>> >>> Could you explain how this helps? >> >> We are seeing a bunch of `Corrupted EFI Memory Attributes Table >> detected!` in the Meta fleet, and this is something we are >> investigating. >> >> We highly think this is related to some kexec overwrites, and when we >> get here, the EFI table is completely garbage. I haven't seen this >> problem on cold boot. >> > > It likely means the memory is not reserved correctly. > > Could you check whether this > > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -56,7 +56,7 @@ int __init efi_memattr_init(void) > } > > tbl_size = sizeof(*tbl) + size; > - memblock_reserve(efi_mem_attr_table, tbl_size); > + efi_mem_reserve(efi_mem_attr_table, tbl_size); > set_bit(EFI_MEM_ATTR, &efi.flags); > > unmap: > > > makes any difference? > Hi Ard, Thanks for the reply! I have further explained the problems and possible solutions in https://lore.kernel.org/all/20250108215957.3437660-1-usamaarif642@gmail.com/. I am assuming the above diff is to solve problem 2 that I have described in the patches. I haven't tested it, because its a bit difficult to reproduce problem 2, but I think the above diff might not make a difference? efi_mem_reserve changes e820_table, while /sys/firmware/memmap uses e820_table_firmware. An alternate solution might be to change /sys/firmware/memmap to e820_table. I didnt go down that route because, you would be changing what the kernel exposes to userspace, which might not be the right thing. Thanks, Usama > >> Here are sof the instances I see: >> >> efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 18058, num_entries == 33554432) >> efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 1, desc_size == 2072184435, num_entries == 3248688968) >> efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 0, desc_size == 83886080, num_entries == 304) >> efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 2, desc_size == 48, num_entries == 40) >> > > The last one looks like a false positive: each of those values seems > perfectly reasonable. > > Any chance you could dump the memory map and this table (boot using > efi=debug) on this system? > > >> Anyway, back to you question, this patch helped us to narrow down and >> find where the problem was, by printing all variables taken in >> consideration to get the conclusion that the firmware is buggy. >> > > Fair enough. > >> Regarding the problem, Usama and I are suspecting that it might be >> related to some 77d48d39e99170 ("efistub/tpm: Use ACPI reclaim memory for >> event log to avoid corruption"), but at this time with memattr table, where it >> might not preserved during kexec(?). >> > > Please see the suggestion above. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings 2025-01-09 14:48 ` Ard Biesheuvel 2025-01-09 15:09 ` Usama Arif @ 2025-01-09 17:45 ` Breno Leitao 1 sibling, 0 replies; 11+ messages in thread From: Breno Leitao @ 2025-01-09 17:45 UTC (permalink / raw) To: Ard Biesheuvel Cc: Gregory Price, Usama Arif, linux-efi, linux-kernel, kernel-team On Thu, Jan 09, 2025 at 03:48:56PM +0100, Ard Biesheuvel wrote: > On Tue, 7 Jan 2025 at 13:05, Breno Leitao <leitao@debian.org> wrote: > > On Tue, Jan 07, 2025 at 12:24:03PM +0100, Ard Biesheuvel wrote: > > We are seeing a bunch of `Corrupted EFI Memory Attributes Table > > detected!` in the Meta fleet, and this is something we are > > investigating. > > > > We highly think this is related to some kexec overwrites, and when we > > get here, the EFI table is completely garbage. I haven't seen this > > problem on cold boot. > > > > It likely means the memory is not reserved correctly. > > Could you check whether this > > --- a/drivers/firmware/efi/memattr.c > +++ b/drivers/firmware/efi/memattr.c > @@ -56,7 +56,7 @@ int __init efi_memattr_init(void) > } > > tbl_size = sizeof(*tbl) + size; > - memblock_reserve(efi_mem_attr_table, tbl_size); > + efi_mem_reserve(efi_mem_attr_table, tbl_size); > set_bit(EFI_MEM_ATTR, &efi.flags); > > unmap: > > > makes any difference? It doesn't seem so. This is how I tested it. I've cherry-picked it on top of Linus' master eea6e4b4dfb8 ("Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi"), and booted a system with this kernel. From cold boot, we never see any firmware bug (as usual). Then I kexec the same kernel, and then I got the problem twice (I tested it 2 times and same results): [ 0.000000] APIC: Static calls initialized [ 0.000000] e820: update [mem 0x107fff9400-0x107fff940f] usable ==> usable [ 0.000000] e820: update [mem 0x107fff93e0-0x107fff93ff] usable ==> usable [ 0.000000] e820: update [mem 0x107fff9370-0x107fff93df] usable ==> usable [ 0.000000] extended physical RAM map: [ 0.000000] reserve setup_data: [mem 0x0000000000000000-0x000000000009ffff] usable [ 0.000000] reserve setup_data: [mem 0x00000000000a0000-0x00000000000fffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000000000100000-0x0000000069cd5fff] usable [ 0.000000] reserve setup_data: [mem 0x0000000069cd6000-0x000000006bdd5fff] reserved [ 0.000000] reserve setup_data: [mem 0x000000006bdd6000-0x000000006be90fff] ACPI data [ 0.000000] reserve setup_data: [mem 0x000000006be91000-0x000000006c9eafff] ACPI NVS [ 0.000000] reserve setup_data: [mem 0x000000006c9eb000-0x000000006ebedfff] reserved [ 0.000000] reserve setup_data: [mem 0x000000006ebee000-0x000000006fffffff] usable [ 0.000000] reserve setup_data: [mem 0x0000000070000000-0x000000008fffffff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fd000000-0x00000000fe7fffff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000fed20000-0x00000000fed44fff] reserved [ 0.000000] reserve setup_data: [mem 0x00000000ff000000-0x00000000ffffffff] reserved [ 0.000000] reserve setup_data: [mem 0x0000000100000000-0x000000107fff936f] usable [ 0.000000] reserve setup_data: [mem 0x000000107fff9370-0x000000107fff940f] usable [ 0.000000] reserve setup_data: [mem 0x000000107fff9410-0x000000107fffffff] usable [ 0.000000] efi: EFI v2.6 by American Megatrends [ 0.000000] efi: ACPI 2.0=0x6c61e000 ACPI=0x6c61e000 TPMFinalLog=0x6c9b9000 SMBIOS=0x6e69d000 SMBIOS 3.0=0x6e69c000 MEMATTR=0x64ef1018 ESRT=0x67dc9918 TPMEventLog=0x6be8e018 [ 0.000000] efi: memattr: [Firmware Bug]: Corrupted EFI Memory Attributes Table detected! (version == 1, desc_size == 48, num_entries == 50) [ 0.000000] efi: Remove mem00: MMIO range=[0xff000000-0xffffffff] (16MB) from e820 map [ 0.000000] e820: remove [mem 0xff000000-0xffffffff] reserved [ 0.000000] efi: Not removing mem01: MMIO range=[0xfed20000-0xfed44fff] (148KB) from e820 map [ 0.000000] efi: Remove mem02: MMIO range=[0xfd000000-0xfe7fffff] (24MB) from e820 map [ 0.000000] e820: remove [mem 0xfd000000-0xfe7fffff] reserved [ 0.000000] efi: Remove mem03: MMIO range=[0x80000000-0x8fffffff] (256MB) from e820 map [ 0.000000] e820: remove [mem 0x80000000-0x8fffffff] reserved [ 0.000000] SMBIOS 3.1.1 present. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-01-09 17:45 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-01-06 19:02 [PATCH 0/3] efi/memattr: Improve the efi_memattr_init function Breno Leitao 2025-01-06 19:02 ` [PATCH 1/3] efi/memattr: Convert efi_memattr_init() return type to void Breno Leitao 2025-01-06 19:24 ` Ard Biesheuvel 2025-01-06 19:02 ` [PATCH 2/3] efi/memattr: Add FW_BUG prefix to firmware error messages Breno Leitao 2025-01-06 19:25 ` Ard Biesheuvel 2025-01-06 19:02 ` [PATCH 3/3] efi/memattr: Include EFI memmap size in corruption warnings Breno Leitao 2025-01-07 11:24 ` Ard Biesheuvel 2025-01-07 12:05 ` Breno Leitao 2025-01-09 14:48 ` Ard Biesheuvel 2025-01-09 15:09 ` Usama Arif 2025-01-09 17:45 ` Breno Leitao
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox