* [PATCH 0/2] Fix boot hang issue on Ampere Emag server
@ 2023-01-31 4:03 Jia He
2023-01-31 4:03 ` [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services Jia He
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Jia He @ 2023-01-31 4:03 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei, Jia He
I met a hung task warning and then kernel was hung forever with latest
kernel on an Ampere Emag server.
The root cause is kernel was hung when invoking an efi rts call to set
the RandomSeed variable during the booting stage. The arch_efi_call_virt
call (set_variable) was never returned and then caused the hung task error.
On the Emag server, efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE)
is returned with "true"
Fix it by introducing the efi_get_supported_rt_services() and then determine
to set or clear the runtime services bit of efi.flags.
Jia He (2):
efi: libstub: Fix the retriving of supported rutime services
efi: Introduce efi_get_supported_rt_services() to get the runtime
services mask
drivers/firmware/efi/arm-runtime.c | 5 ++++-
drivers/firmware/efi/efi.c | 28 +++++++++++++++++--------
drivers/firmware/efi/libstub/efi-stub.c | 2 ++
include/linux/efi.h | 1 +
4 files changed, 26 insertions(+), 10 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services
2023-01-31 4:03 [PATCH 0/2] Fix boot hang issue on Ampere Emag server Jia He
@ 2023-01-31 4:03 ` Jia He
2023-01-31 7:04 ` Ard Biesheuvel
2023-01-31 4:03 ` [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask Jia He
2023-01-31 7:18 ` [PATCH 0/2] Fix boot hang issue on Ampere Emag server Ard Biesheuvel
2 siblings, 1 reply; 20+ messages in thread
From: Jia He @ 2023-01-31 4:03 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei, Jia He
If retrieving UEFI configuration table is failed, the supported runtime
services mask should be regarded as 0 instead of EFI_RT_SUPPORTED_ALL.
Otherwise efi_novamap might be incorrectly assigned to "false" on the
Ampere Emag server.
Signed-off-by: Jia He <justin.he@arm.com>
---
drivers/firmware/efi/libstub/efi-stub.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/firmware/efi/libstub/efi-stub.c b/drivers/firmware/efi/libstub/efi-stub.c
index 2955c1ac6a36..f24b5436729c 100644
--- a/drivers/firmware/efi/libstub/efi-stub.c
+++ b/drivers/firmware/efi/libstub/efi-stub.c
@@ -111,6 +111,8 @@ static u32 get_supported_rt_services(void)
rt_prop_table = get_efi_config_table(EFI_RT_PROPERTIES_TABLE_GUID);
if (rt_prop_table)
supported &= rt_prop_table->runtime_services_supported;
+ else
+ supported = 0;
return supported;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask
2023-01-31 4:03 [PATCH 0/2] Fix boot hang issue on Ampere Emag server Jia He
2023-01-31 4:03 ` [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services Jia He
@ 2023-01-31 4:03 ` Jia He
2023-01-31 7:06 ` Ard Biesheuvel
2023-01-31 7:18 ` [PATCH 0/2] Fix boot hang issue on Ampere Emag server Ard Biesheuvel
2 siblings, 1 reply; 20+ messages in thread
From: Jia He @ 2023-01-31 4:03 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei, Jia He
Previously, efi.runtime_supported_mask will always be the initial value
EFI_RT_SUPPORTED_ALL and can't be retrieved in efi_config_parse_tables()
if rt_prop is EFI_INVALID_TABLE_ADDR. Thus this can cause the wrong
return value of efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE)
on the Ampere Emag server.
Besides, abstract the runtime services retrieving into a new exported
helper efi_get_supported_rt_services() to set or clear the runtime service
bit of efi.flags.
Signed-off-by: Jia He <justin.he@arm.com>
---
drivers/firmware/efi/arm-runtime.c | 5 ++++-
drivers/firmware/efi/efi.c | 28 +++++++++++++++++++---------
include/linux/efi.h | 1 +
3 files changed, 24 insertions(+), 10 deletions(-)
diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
index 83f5bb57fa4c..ce93133f9abc 100644
--- a/drivers/firmware/efi/arm-runtime.c
+++ b/drivers/firmware/efi/arm-runtime.c
@@ -146,7 +146,10 @@ static int __init arm_enable_runtime_services(void)
/* Set up runtime services function pointers */
efi_native_runtime_setup();
- set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+ if (efi_get_supported_rt_services())
+ set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
+ else
+ clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
return 0;
}
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index a2b0cbc8741c..96475cb8088e 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -595,6 +595,24 @@ static __init int match_config_table(const efi_guid_t *guid,
return 0;
}
+unsigned int __init efi_get_supported_rt_services(void)
+{
+ unsigned int runtime_supported_mask = EFI_RT_SUPPORTED_ALL;
+
+ if (rt_prop != EFI_INVALID_TABLE_ADDR) {
+ efi_rt_properties_table_t *tbl;
+
+ tbl = early_memremap(rt_prop, sizeof(*tbl));
+ if (tbl) {
+ runtime_supported_mask &= tbl->runtime_services_supported;
+ early_memunmap(tbl, sizeof(*tbl));
+ }
+ } else
+ runtime_supported_mask = 0;
+
+ return runtime_supported_mask;
+}
+
int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
int count,
const efi_config_table_type_t *arch_tables)
@@ -695,15 +713,7 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
}
}
- if (rt_prop != EFI_INVALID_TABLE_ADDR) {
- efi_rt_properties_table_t *tbl;
-
- tbl = early_memremap(rt_prop, sizeof(*tbl));
- if (tbl) {
- efi.runtime_supported_mask &= tbl->runtime_services_supported;
- early_memunmap(tbl, sizeof(*tbl));
- }
- }
+ efi.runtime_supported_mask &= efi_get_supported_rt_services();
if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) &&
initrd != EFI_INVALID_TABLE_ADDR && phys_initrd_size == 0) {
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 4b27519143f5..fcaf0d7fc07e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -717,6 +717,7 @@ extern void __init efi_esrt_init(void);
#else
static inline void efi_esrt_init(void) { }
#endif
+extern unsigned int efi_get_supported_rt_services(void);
extern int efi_config_parse_tables(const efi_config_table_t *config_tables,
int count,
const efi_config_table_type_t *arch_tables);
--
2.25.1
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services
2023-01-31 4:03 ` [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services Jia He
@ 2023-01-31 7:04 ` Ard Biesheuvel
0 siblings, 0 replies; 20+ messages in thread
From: Ard Biesheuvel @ 2023-01-31 7:04 UTC (permalink / raw)
To: Jia He; +Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei
On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
>
> If retrieving UEFI configuration table is failed, the supported runtime
> services mask should be regarded as 0 instead of EFI_RT_SUPPORTED_ALL.
> Otherwise efi_novamap might be incorrectly assigned to "false" on the
> Ampere Emag server.
>
> Signed-off-by: Jia He <justin.he@arm.com>
> ---
> drivers/firmware/efi/libstub/efi-stub.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/firmware/efi/libstub/efi-stub.c b/drivers/firmware/efi/libstub/efi-stub.c
> index 2955c1ac6a36..f24b5436729c 100644
> --- a/drivers/firmware/efi/libstub/efi-stub.c
> +++ b/drivers/firmware/efi/libstub/efi-stub.c
> @@ -111,6 +111,8 @@ static u32 get_supported_rt_services(void)
> rt_prop_table = get_efi_config_table(EFI_RT_PROPERTIES_TABLE_GUID);
> if (rt_prop_table)
> supported &= rt_prop_table->runtime_services_supported;
> + else
> + supported = 0;
>
> return supported;
> }
Hello Justin,
This is not how things are supposed to work. On systems that do not
implement the RT properties table, all runtime services are assumed to
be implemented.
Note that this table is informational only - the runtime services
themselves must be callable and return EFI_UNSUPPORTED if they are
marked as unavailable in the RT properties table.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask
2023-01-31 4:03 ` [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask Jia He
@ 2023-01-31 7:06 ` Ard Biesheuvel
0 siblings, 0 replies; 20+ messages in thread
From: Ard Biesheuvel @ 2023-01-31 7:06 UTC (permalink / raw)
To: Jia He; +Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei
On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
>
> Previously, efi.runtime_supported_mask will always be the initial value
> EFI_RT_SUPPORTED_ALL and can't be retrieved in efi_config_parse_tables()
> if rt_prop is EFI_INVALID_TABLE_ADDR. Thus this can cause the wrong
> return value of efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE)
> on the Ampere Emag server.
>
> Besides, abstract the runtime services retrieving into a new exported
> helper efi_get_supported_rt_services() to set or clear the runtime service
> bit of efi.flags.
>
> Signed-off-by: Jia He <justin.he@arm.com>
> ---
> drivers/firmware/efi/arm-runtime.c | 5 ++++-
> drivers/firmware/efi/efi.c | 28 +++++++++++++++++++---------
> include/linux/efi.h | 1 +
> 3 files changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/firmware/efi/arm-runtime.c b/drivers/firmware/efi/arm-runtime.c
> index 83f5bb57fa4c..ce93133f9abc 100644
> --- a/drivers/firmware/efi/arm-runtime.c
> +++ b/drivers/firmware/efi/arm-runtime.c
> @@ -146,7 +146,10 @@ static int __init arm_enable_runtime_services(void)
>
> /* Set up runtime services function pointers */
> efi_native_runtime_setup();
> - set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
> + if (efi_get_supported_rt_services())
> + set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
> + else
> + clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
>
This is not right. There are now other users of the EFI runtime
service infrastructure (ACPI PRM), so even if all EFI runtime services
are marked as unsupported, we should still set the
EFI_RUNTIME_SERVICES bit if the EFI runtime memory map is provided.
> return 0;
> }
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index a2b0cbc8741c..96475cb8088e 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -595,6 +595,24 @@ static __init int match_config_table(const efi_guid_t *guid,
> return 0;
> }
>
> +unsigned int __init efi_get_supported_rt_services(void)
> +{
> + unsigned int runtime_supported_mask = EFI_RT_SUPPORTED_ALL;
> +
> + if (rt_prop != EFI_INVALID_TABLE_ADDR) {
> + efi_rt_properties_table_t *tbl;
> +
> + tbl = early_memremap(rt_prop, sizeof(*tbl));
> + if (tbl) {
> + runtime_supported_mask &= tbl->runtime_services_supported;
> + early_memunmap(tbl, sizeof(*tbl));
> + }
> + } else
> + runtime_supported_mask = 0;
> +
> + return runtime_supported_mask;
> +}
> +
> int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
> int count,
> const efi_config_table_type_t *arch_tables)
> @@ -695,15 +713,7 @@ int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
> }
> }
>
> - if (rt_prop != EFI_INVALID_TABLE_ADDR) {
> - efi_rt_properties_table_t *tbl;
> -
> - tbl = early_memremap(rt_prop, sizeof(*tbl));
> - if (tbl) {
> - efi.runtime_supported_mask &= tbl->runtime_services_supported;
> - early_memunmap(tbl, sizeof(*tbl));
> - }
> - }
> + efi.runtime_supported_mask &= efi_get_supported_rt_services();
>
> if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) &&
> initrd != EFI_INVALID_TABLE_ADDR && phys_initrd_size == 0) {
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index 4b27519143f5..fcaf0d7fc07e 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -717,6 +717,7 @@ extern void __init efi_esrt_init(void);
> #else
> static inline void efi_esrt_init(void) { }
> #endif
> +extern unsigned int efi_get_supported_rt_services(void);
> extern int efi_config_parse_tables(const efi_config_table_t *config_tables,
> int count,
> const efi_config_table_type_t *arch_tables);
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-01-31 4:03 [PATCH 0/2] Fix boot hang issue on Ampere Emag server Jia He
2023-01-31 4:03 ` [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services Jia He
2023-01-31 4:03 ` [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask Jia He
@ 2023-01-31 7:18 ` Ard Biesheuvel
[not found] ` <DBBPR08MB45383B479656BA18FEFFB7D5F7D09@DBBPR08MB4538.eurprd08.prod.outlook.com>
2023-01-31 18:03 ` Jason A. Donenfeld
2 siblings, 2 replies; 20+ messages in thread
From: Ard Biesheuvel @ 2023-01-31 7:18 UTC (permalink / raw)
To: Jia He, Jason A. Donenfeld
Cc: Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei
(cc Jason for awareness)
On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
>
> I met a hung task warning and then kernel was hung forever with latest
> kernel on an Ampere Emag server.
>
> The root cause is kernel was hung when invoking an efi rts call to set
> the RandomSeed variable during the booting stage. The arch_efi_call_virt
> call (set_variable) was never returned and then caused the hung task error.
>
Given that EFI variables work on this platform (as far as I know), the
problem may be that we are calling SetVariable() too early.
Could you double check whether setting variables works as expected?
You can use efibootmgr -t 10 as root (for example) to set the boot
timeout, and check whether the new value is retained after a reboot
(efibootmgr will print the current value for you)
Could you also please share the kernel log up until the point where it hangs?
> On the Emag server, efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE)
> is returned with "true"
>
This is as expected: if the firmware does not expose the RT properties
table, all runtime services are assumed to be available.
> Fix it by introducing the efi_get_supported_rt_services() and then determine
> to set or clear the runtime services bit of efi.flags.
>
> Jia He (2):
> efi: libstub: Fix the retriving of supported rutime services
> efi: Introduce efi_get_supported_rt_services() to get the runtime
> services mask
>
> drivers/firmware/efi/arm-runtime.c | 5 ++++-
> drivers/firmware/efi/efi.c | 28 +++++++++++++++++--------
> drivers/firmware/efi/libstub/efi-stub.c | 2 ++
> include/linux/efi.h | 1 +
> 4 files changed, 26 insertions(+), 10 deletions(-)
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
[not found] ` <DBBPR08MB45383B479656BA18FEFFB7D5F7D09@DBBPR08MB4538.eurprd08.prod.outlook.com>
@ 2023-01-31 15:31 ` Justin He
2023-01-31 18:23 ` Jason A. Donenfeld
1 sibling, 0 replies; 20+ messages in thread
From: Justin He @ 2023-01-31 15:31 UTC (permalink / raw)
To: Ard Biesheuvel, Jason A. Donenfeld
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei
> -----Original Message-----
> From: Justin He
> Sent: Tuesday, January 31, 2023 11:22 PM
> To: Ard Biesheuvel <ardb@kernel.org>; Jason A. Donenfeld <Jason@zx2c4.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>
> Subject: RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
>
> Hi Ard,
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Tuesday, January 31, 2023 3:19 PM
> > To: Justin He <Justin.He@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>
> > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>
> > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> >
> > (cc Jason for awareness)
> >
> > On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
> > >
> > > I met a hung task warning and then kernel was hung forever with latest
> > > kernel on an Ampere Emag server.
> > >
> > > The root cause is kernel was hung when invoking an efi rts call to
> > > set the RandomSeed variable during the booting stage. The
> > > arch_efi_call_virt call (set_variable) was never returned and then caused
> the
> > hung task error.
> > >
> >
> > Given that EFI variables work on this platform (as far as I know), the problem
> > may be that we are calling SetVariable() too early.
> >
> > Could you double check whether setting variables works as expected?
> > You can use efibootmgr -t 10 as root (for example) to set the boot timeout,
> and
> > check whether the new value is retained after a reboot (efibootmgr will print
> > the current value for you)
> >
> > Could you also please share the kernel log up until the point where it hangs?
> >
> The set_variable seems to be ok in 5.19+:
> root@:~# efibootmgr -t 10
> BootCurrent: 0000
> Timeout: 10 seconds
> BootOrder: 0000,0004,0003,0002,0001
> Boot0000* ubuntu
> Boot0001* grub
> Boot0002* UEFI: Built-in EFI Shell
> Boot0003* UEFI: PXE boot on MAC: 4C:38:D5:08:F6:02
> Boot0004* ubuntu
> root@:~# efibootmgr -t 9
> BootCurrent: 0000
> Timeout: 9 seconds
> BootOrder: 0000,0004,0003,0002,0001
> Boot0000* ubuntu
> Boot0001* grub
> Boot0002* UEFI: Built-in EFI Shell
> Boot0003* UEFI: PXE boot on MAC: 4C:38:D5:08:F6:02
> Boot0004* ubuntu
> root@:~# uname -a
> Linux $machine_name 5.19.0+ #228 SMP Wed Aug 17 13:22:29 UTC 2022
> aarch64 aarch64 aarch64 GNU/Linux
>
> The dmesg log until the hung(some information was removed, eg, hostname):
>
Sorry, I noticed my previous mail might be bounced back if the text is too long.
I pasted it to online Pastebin.com
https://pastebin.com/3zQyryp1
Hope it helps
Cheers,
Justin
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-01-31 7:18 ` [PATCH 0/2] Fix boot hang issue on Ampere Emag server Ard Biesheuvel
[not found] ` <DBBPR08MB45383B479656BA18FEFFB7D5F7D09@DBBPR08MB4538.eurprd08.prod.outlook.com>
@ 2023-01-31 18:03 ` Jason A. Donenfeld
2023-01-31 18:13 ` Jason A. Donenfeld
1 sibling, 1 reply; 20+ messages in thread
From: Jason A. Donenfeld @ 2023-01-31 18:03 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Jia He, Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei
On Tue, Jan 31, 2023 at 08:18:49AM +0100, Ard Biesheuvel wrote:
> (cc Jason for awareness)
>
> On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
> >
> > I met a hung task warning and then kernel was hung forever with latest
> > kernel on an Ampere Emag server.
> >
> > The root cause is kernel was hung when invoking an efi rts call to set
> > the RandomSeed variable during the booting stage. The arch_efi_call_virt
> > call (set_variable) was never returned and then caused the hung task error.
> >
>
> Given that EFI variables work on this platform (as far as I know), the
> problem may be that we are calling SetVariable() too early.
>
On my phone and with very limited connectivity for another 10 days, but
I wonder if there's a later place we could move this block:
if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
Is there any additional initialization that happens after
efisubsys_init() that we're maybe missing out on there?
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-01-31 18:03 ` Jason A. Donenfeld
@ 2023-01-31 18:13 ` Jason A. Donenfeld
0 siblings, 0 replies; 20+ messages in thread
From: Jason A. Donenfeld @ 2023-01-31 18:13 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Jia He, Huacai Chen, linux-efi, linux-kernel, Alexandru Elisei
Actually...
On Tue, Jan 31, 2023 at 07:03:12PM +0100, Jason A. Donenfeld wrote:
> On Tue, Jan 31, 2023 at 08:18:49AM +0100, Ard Biesheuvel wrote:
> > (cc Jason for awareness)
> >
> > On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
> > >
> > > I met a hung task warning and then kernel was hung forever with latest
> > > kernel on an Ampere Emag server.
> > >
> > > The root cause is kernel was hung when invoking an efi rts call to set
> > > the RandomSeed variable during the booting stage. The arch_efi_call_virt
> > > call (set_variable) was never returned and then caused the hung task error.
> > >
> >
> > Given that EFI variables work on this platform (as far as I know), the
> > problem may be that we are calling SetVariable() too early.
> >
>
> On my phone and with very limited connectivity for another 10 days, but
> I wonder if there's a later place we could move this block:
>
> if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
> execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
>
> Is there any additional initialization that happens after
> efisubsys_init() that we're maybe missing out on there?
From Jia's dmesg:
[ 14.209389][ C2] random: crng init done
So SetVariable isn't even being called until 14 seconds after boot. That
suggests there's something else wrong here. I wonder how it is that
efibootmgr works, but this does not... Hm?
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
[not found] ` <DBBPR08MB45383B479656BA18FEFFB7D5F7D09@DBBPR08MB4538.eurprd08.prod.outlook.com>
2023-01-31 15:31 ` Justin He
@ 2023-01-31 18:23 ` Jason A. Donenfeld
2023-02-02 10:50 ` Justin He
1 sibling, 1 reply; 20+ messages in thread
From: Jason A. Donenfeld @ 2023-01-31 18:23 UTC (permalink / raw)
To: Justin He
Cc: Ard Biesheuvel, Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei
On Tue, Jan 31, 2023 at 03:21:39PM +0000, Justin He wrote:
> Hi Ard,
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Tuesday, January 31, 2023 3:19 PM
> > To: Justin He <Justin.He@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>
> > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>
> > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> >
> > (cc Jason for awareness)
> >
> > On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
> > >
> > > I met a hung task warning and then kernel was hung forever with latest
> > > kernel on an Ampere Emag server.
> > >
> > > The root cause is kernel was hung when invoking an efi rts call to
> > > set the RandomSeed variable during the booting stage. The
> > > arch_efi_call_virt call (set_variable) was never returned and then caused the
> > hung task error.
> > >
> >
> > Given that EFI variables work on this platform (as far as I know), the problem
> > may be that we are calling SetVariable() too early.
> >
> > Could you double check whether setting variables works as expected?
> > You can use efibootmgr -t 10 as root (for example) to set the boot timeout, and
> > check whether the new value is retained after a reboot (efibootmgr will print
> > the current value for you)
> >
> > Could you also please share the kernel log up until the point where it hangs?
> >
> The set_variable seems to be ok in 5.19+:
> root@:~# efibootmgr -t 10
> BootCurrent: 0000
> Timeout: 10 seconds
I think what we want to learn is whether efibootmgr -t 10 works in the
latest RC. If not, it would suggest the issue isn't with the seed
setting, but with some other unrelated change.
Can you run efibootmgr -t 10 (or whatever) again on a kernel where
you've commented out these lines in efi.c inside of efisubsys_init():
if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
-->
// if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
// execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
Or something like that.
Jason
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-01-31 18:23 ` Jason A. Donenfeld
@ 2023-02-02 10:50 ` Justin He
2023-02-07 3:21 ` Justin He
0 siblings, 1 reply; 20+ messages in thread
From: Justin He @ 2023-02-02 10:50 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Ard Biesheuvel, Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei
Hi Jason
> -----Original Message-----
> From: Jason A. Donenfeld <Jason@zx2c4.com>
> Sent: Wednesday, February 1, 2023 2:23 AM
> To: Justin He <Justin.He@arm.com>
> Cc: Ard Biesheuvel <ardb@kernel.org>; Huacai Chen <chenhuacai@kernel.org>;
> linux-efi@vger.kernel.org; linux-kernel@vger.kernel.org; Alexandru Elisei
> <Alexandru.Elisei@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
>
> On Tue, Jan 31, 2023 at 03:21:39PM +0000, Justin He wrote:
> > Hi Ard,
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > Sent: Tuesday, January 31, 2023 3:19 PM
> > > To: Justin He <Justin.He@arm.com>; Jason A. Donenfeld
> > > <Jason@zx2c4.com>
> > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > <Alexandru.Elisei@arm.com>
> > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > >
> > > (cc Jason for awareness)
> > >
> > > On Tue, 31 Jan 2023 at 05:04, Jia He <justin.he@arm.com> wrote:
> > > >
> > > > I met a hung task warning and then kernel was hung forever with
> > > > latest kernel on an Ampere Emag server.
> > > >
> > > > The root cause is kernel was hung when invoking an efi rts call
> > > > to set the RandomSeed variable during the booting stage. The
> > > > arch_efi_call_virt call (set_variable) was never returned and then
> > > > caused the
> > > hung task error.
> > > >
> > >
> > > Given that EFI variables work on this platform (as far as I know),
> > > the problem may be that we are calling SetVariable() too early.
> > >
> > > Could you double check whether setting variables works as expected?
> > > You can use efibootmgr -t 10 as root (for example) to set the boot
> > > timeout, and check whether the new value is retained after a reboot
> > > (efibootmgr will print the current value for you)
> > >
> > > Could you also please share the kernel log up until the point where it
> hangs?
> > >
> > The set_variable seems to be ok in 5.19+:
> > root@:~# efibootmgr -t 10
> > BootCurrent: 0000
> > Timeout: 10 seconds
>
> I think what we want to learn is whether efibootmgr -t 10 works in the latest
> RC. If not, it would suggest the issue isn't with the seed setting, but with some
> other unrelated change.
>
> Can you run efibootmgr -t 10 (or whatever) again on a kernel where you've
> commented out these lines in efi.c inside of efisubsys_init():
>
> if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
> execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
>
> -->
>
> // if (efi_rt_services_supported(EFI_RT_SUPPORTED_SET_VARIABLE))
> // execute_with_initialized_rng(&refresh_nv_rng_seed_nb);
>
As your suggested (comment above execute_with_initialized_rng in latest kernel):
The efibootmgr -t X will be hung. Looks like one certain commit before your patch
broke the set_variable efi call. I will dig into the further debug and tell you the result.
---
Cheers,
Justin.
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-02 10:50 ` Justin He
@ 2023-02-07 3:21 ` Justin He
2023-02-07 8:44 ` Ard Biesheuvel
0 siblings, 1 reply; 20+ messages in thread
From: Justin He @ 2023-02-07 3:21 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
Hi Ard
> -----Original Message-----
[...]
> As your suggested (comment above execute_with_initialized_rng in latest
> kernel):
> The efibootmgr -t X will be hung. Looks like one certain commit before your
> patch broke the set_variable efi call. I will dig into the further debug and tell
> you the result.
The root cause of the hung IMO might be similar to
commit 550b33cfd445296868a478e8413ffb2e963eed32
Author: Ard Biesheuvel <ardb@kernel.org>
Date: Thu Nov 10 10:36:20 2022 +0100
arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
Do you agree with the idea if I add Ampere ”eMAG” machine into the list of
Using SetVirtualAddressMap() forcibly?
Please note that even in previous kernel patch, the efibootmgr -t 10 will make
kernel hung if I passed "efi=novamap" to the boot parameter.
--
Cheers,
Justin (Jia He)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 3:21 ` Justin He
@ 2023-02-07 8:44 ` Ard Biesheuvel
2023-02-07 8:49 ` Justin He
0 siblings, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2023-02-07 8:44 UTC (permalink / raw)
To: Justin He
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
On Tue, 7 Feb 2023 at 04:21, Justin He <Justin.He@arm.com> wrote:
>
> Hi Ard
>
> > -----Original Message-----
> [...]
> > As your suggested (comment above execute_with_initialized_rng in latest
> > kernel):
> > The efibootmgr -t X will be hung. Looks like one certain commit before your
> > patch broke the set_variable efi call. I will dig into the further debug and tell
> > you the result.
>
> The root cause of the hung IMO might be similar to
> commit 550b33cfd445296868a478e8413ffb2e963eed32
> Author: Ard Biesheuvel <ardb@kernel.org>
> Date: Thu Nov 10 10:36:20 2022 +0100
>
> arm64: efi: Force the use of SetVirtualAddressMap() on Altra machines
>
> Do you agree with the idea if I add Ampere ”eMAG” machine into the list of
> Using SetVirtualAddressMap() forcibly?
>
> Please note that even in previous kernel patch, the efibootmgr -t 10 will make
> kernel hung if I passed "efi=novamap" to the boot parameter.
>
Interesting. What does dmidecode return for the family in the type 1 record?
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 8:44 ` Ard Biesheuvel
@ 2023-02-07 8:49 ` Justin He
2023-02-07 8:54 ` Ard Biesheuvel
0 siblings, 1 reply; 20+ messages in thread
From: Justin He @ 2023-02-07 8:49 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
> -----Original Message-----
[..]
> > The root cause of the hung IMO might be similar to commit
> > 550b33cfd445296868a478e8413ffb2e963eed32
> > Author: Ard Biesheuvel <ardb@kernel.org>
> > Date: Thu Nov 10 10:36:20 2022 +0100
> >
> > arm64: efi: Force the use of SetVirtualAddressMap() on Altra
> > machines
> >
> > Do you agree with the idea if I add Ampere ”eMAG” machine into the
> > list of Using SetVirtualAddressMap() forcibly?
> >
> > Please note that even in previous kernel patch, the efibootmgr -t 10
> > will make kernel hung if I passed "efi=novamap" to the boot parameter.
> >
>
> Interesting. What does dmidecode return for the family in the type 1 record?
# dmidecode |grep -i family
Family: eMAG
Family: ARMv8
The full dmidecode log is at https://pastebin.com/M3MAJtUG
--
Cheers,
Justin (Jia He)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 8:49 ` Justin He
@ 2023-02-07 8:54 ` Ard Biesheuvel
2023-02-07 9:02 ` Justin He
0 siblings, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2023-02-07 8:54 UTC (permalink / raw)
To: Justin He
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
>
>
>
> > -----Original Message-----
> [..]
> > > The root cause of the hung IMO might be similar to commit
> > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > Date: Thu Nov 10 10:36:20 2022 +0100
> > >
> > > arm64: efi: Force the use of SetVirtualAddressMap() on Altra
> > > machines
> > >
> > > Do you agree with the idea if I add Ampere ”eMAG” machine into the
> > > list of Using SetVirtualAddressMap() forcibly?
> > >
> > > Please note that even in previous kernel patch, the efibootmgr -t 10
> > > will make kernel hung if I passed "efi=novamap" to the boot parameter.
> > >
> >
> > Interesting. What does dmidecode return for the family in the type 1 record?
>
> # dmidecode |grep -i family
> Family: eMAG
> Family: ARMv8
>
> The full dmidecode log is at https://pastebin.com/M3MAJtUG
>
OK please try this:
diff --git a/drivers/firmware/efi/libstub/arm64.c
b/drivers/firmware/efi/libstub/arm64.c
index ff2d18c42ee74979..fae930dec82be7c6 100644
--- a/drivers/firmware/efi/libstub/arm64.c
+++ b/drivers/firmware/efi/libstub/arm64.c
@@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
* Ampere Altra machines crash in SetTime() if SetVirtualAddressMap()
* has not been called prior.
*/
- if (!type1_family || strcmp(type1_family, "Altra"))
+ if (!type1_family ||
+ (strcmp(type1_family, "Altra") && strcmp(type1_family, "eMAG")))
return false;
efi_warn("Working around broken SetVirtualAddressMap()\n");
^ permalink raw reply related [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 8:54 ` Ard Biesheuvel
@ 2023-02-07 9:02 ` Justin He
2023-02-07 9:03 ` Ard Biesheuvel
0 siblings, 1 reply; 20+ messages in thread
From: Justin He @ 2023-02-07 9:02 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
Hi Ard
> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Tuesday, February 7, 2023 4:54 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
>
> On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > [..]
> > > > The root cause of the hung IMO might be similar to commit
> > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > Date: Thu Nov 10 10:36:20 2022 +0100
> > > >
> > > > arm64: efi: Force the use of SetVirtualAddressMap() on Altra
> > > > machines
> > > >
> > > > Do you agree with the idea if I add Ampere ”eMAG” machine into the
> > > > list of Using SetVirtualAddressMap() forcibly?
> > > >
> > > > Please note that even in previous kernel patch, the efibootmgr -t
> > > > 10 will make kernel hung if I passed "efi=novamap" to the boot
> parameter.
> > > >
> > >
> > > Interesting. What does dmidecode return for the family in the type 1
> record?
> >
> > # dmidecode |grep -i family
> > Family: eMAG
> > Family: ARMv8
> >
> > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> >
>
> OK please try this:
>
> diff --git a/drivers/firmware/efi/libstub/arm64.c
> b/drivers/firmware/efi/libstub/arm64.c
> index ff2d18c42ee74979..fae930dec82be7c6 100644
> --- a/drivers/firmware/efi/libstub/arm64.c
> +++ b/drivers/firmware/efi/libstub/arm64.c
> @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> * Ampere Altra machines crash in SetTime() if
> SetVirtualAddressMap()
> * has not been called prior.
> */
> - if (!type1_family || strcmp(type1_family, "Altra"))
> + if (!type1_family ||
> + (strcmp(type1_family, "Altra") && strcmp(type1_family,
> + "eMAG")))
> return false;
>
> efi_warn("Working around broken SetVirtualAddressMap()\n");
Yes, it works on my eMAG server: the kernel boots.
Other than efibootmgr failure. But I noticed this efibootmgr failure even before
Commit d3549a938b7 ("avoid SetVirtualAddressMap() when possible ")
root@:~/linux# efibootmgr -t 9; efibootmgr -t 5;
Could not set Timeout: Input/output error
Could not set Timeout: Input/output error
--
Cheers,
Justin (Jia He)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 9:02 ` Justin He
@ 2023-02-07 9:03 ` Ard Biesheuvel
2023-02-07 9:08 ` Justin He
0 siblings, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2023-02-07 9:03 UTC (permalink / raw)
To: Justin He
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
>
> Hi Ard
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Tuesday, February 7, 2023 4:54 PM
> > To: Justin He <Justin.He@arm.com>
> > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> > Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> >
> > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > >
> > >
> > >
> > > > -----Original Message-----
> > > [..]
> > > > > The root cause of the hung IMO might be similar to commit
> > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > Date: Thu Nov 10 10:36:20 2022 +0100
> > > > >
> > > > > arm64: efi: Force the use of SetVirtualAddressMap() on Altra
> > > > > machines
> > > > >
> > > > > Do you agree with the idea if I add Ampere ”eMAG” machine into the
> > > > > list of Using SetVirtualAddressMap() forcibly?
> > > > >
> > > > > Please note that even in previous kernel patch, the efibootmgr -t
> > > > > 10 will make kernel hung if I passed "efi=novamap" to the boot
> > parameter.
> > > > >
> > > >
> > > > Interesting. What does dmidecode return for the family in the type 1
> > record?
> > >
> > > # dmidecode |grep -i family
> > > Family: eMAG
> > > Family: ARMv8
> > >
> > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > >
> >
> > OK please try this:
> >
> > diff --git a/drivers/firmware/efi/libstub/arm64.c
> > b/drivers/firmware/efi/libstub/arm64.c
> > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > --- a/drivers/firmware/efi/libstub/arm64.c
> > +++ b/drivers/firmware/efi/libstub/arm64.c
> > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > * Ampere Altra machines crash in SetTime() if
> > SetVirtualAddressMap()
> > * has not been called prior.
> > */
> > - if (!type1_family || strcmp(type1_family, "Altra"))
> > + if (!type1_family ||
> > + (strcmp(type1_family, "Altra") && strcmp(type1_family,
> > + "eMAG")))
> > return false;
> >
> > efi_warn("Working around broken SetVirtualAddressMap()\n");
>
> Yes, it works on my eMAG server: the kernel boots.
> Other than efibootmgr failure. But I noticed this efibootmgr failure even before
> Commit d3549a938b7 ("avoid SetVirtualAddressMap() when possible ")
>
> root@:~/linux# efibootmgr -t 9; efibootmgr -t 5;
> Could not set Timeout: Input/output error
> Could not set Timeout: Input/output error
>
Do you get any [Firmware Bug] lines in the kernel log?
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 9:03 ` Ard Biesheuvel
@ 2023-02-07 9:08 ` Justin He
2023-02-07 9:09 ` Ard Biesheuvel
0 siblings, 1 reply; 20+ messages in thread
From: Justin He @ 2023-02-07 9:08 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Tuesday, February 7, 2023 5:04 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
>
> On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> >
> > Hi Ard
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > To: Justin He <Justin.He@arm.com>
> > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > <nd@arm.com>
> > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > >
> > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > [..]
> > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > Date: Thu Nov 10 10:36:20 2022 +0100
> > > > > >
> > > > > > arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > Altra machines
> > > > > >
> > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine into
> > > > > > the list of Using SetVirtualAddressMap() forcibly?
> > > > > >
> > > > > > Please note that even in previous kernel patch, the efibootmgr
> > > > > > -t
> > > > > > 10 will make kernel hung if I passed "efi=novamap" to the boot
> > > parameter.
> > > > > >
> > > > >
> > > > > Interesting. What does dmidecode return for the family in the
> > > > > type 1
> > > record?
> > > >
> > > > # dmidecode |grep -i family
> > > > Family: eMAG
> > > > Family: ARMv8
> > > >
> > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > >
> > >
> > > OK please try this:
> > >
> > > diff --git a/drivers/firmware/efi/libstub/arm64.c
> > > b/drivers/firmware/efi/libstub/arm64.c
> > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > * Ampere Altra machines crash in SetTime() if
> > > SetVirtualAddressMap()
> > > * has not been called prior.
> > > */
> > > - if (!type1_family || strcmp(type1_family, "Altra"))
> > > + if (!type1_family ||
> > > + (strcmp(type1_family, "Altra") && strcmp(type1_family,
> > > + "eMAG")))
> > > return false;
> > >
> > > efi_warn("Working around broken SetVirtualAddressMap()\n");
> >
> > Yes, it works on my eMAG server: the kernel boots.
> > Other than efibootmgr failure. But I noticed this efibootmgr failure
> > even before Commit d3549a938b7 ("avoid SetVirtualAddressMap() when
> > possible ")
> >
> > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5; Could not set
> > Timeout: Input/output error Could not set Timeout: Input/output error
> >
>
> Do you get any [Firmware Bug] lines in the kernel log?
No,
I built the kernel based on:
commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master, origin/master, origin/HEAD)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Sun Feb 5 17:17:10 2023 -0800
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Are you worried about your sync exception fixup patch? I think it has been included.
--
Cheers,
Justin (Jia He)
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 9:08 ` Justin He
@ 2023-02-07 9:09 ` Ard Biesheuvel
2023-02-08 3:10 ` Justin He
0 siblings, 1 reply; 20+ messages in thread
From: Ard Biesheuvel @ 2023-02-07 9:09 UTC (permalink / raw)
To: Justin He
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
On Tue, 7 Feb 2023 at 10:08, Justin He <Justin.He@arm.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Tuesday, February 7, 2023 5:04 PM
> > To: Justin He <Justin.He@arm.com>
> > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> > Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> >
> > On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> > >
> > > Hi Ard
> > >
> > > > -----Original Message-----
> > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > > To: Justin He <Justin.He@arm.com>
> > > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > > <nd@arm.com>
> > > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > > >
> > > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > [..]
> > > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > Date: Thu Nov 10 10:36:20 2022 +0100
> > > > > > >
> > > > > > > arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > > Altra machines
> > > > > > >
> > > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine into
> > > > > > > the list of Using SetVirtualAddressMap() forcibly?
> > > > > > >
> > > > > > > Please note that even in previous kernel patch, the efibootmgr
> > > > > > > -t
> > > > > > > 10 will make kernel hung if I passed "efi=novamap" to the boot
> > > > parameter.
> > > > > > >
> > > > > >
> > > > > > Interesting. What does dmidecode return for the family in the
> > > > > > type 1
> > > > record?
> > > > >
> > > > > # dmidecode |grep -i family
> > > > > Family: eMAG
> > > > > Family: ARMv8
> > > > >
> > > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > > >
> > > >
> > > > OK please try this:
> > > >
> > > > diff --git a/drivers/firmware/efi/libstub/arm64.c
> > > > b/drivers/firmware/efi/libstub/arm64.c
> > > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > > * Ampere Altra machines crash in SetTime() if
> > > > SetVirtualAddressMap()
> > > > * has not been called prior.
> > > > */
> > > > - if (!type1_family || strcmp(type1_family, "Altra"))
> > > > + if (!type1_family ||
> > > > + (strcmp(type1_family, "Altra") && strcmp(type1_family,
> > > > + "eMAG")))
> > > > return false;
> > > >
> > > > efi_warn("Working around broken SetVirtualAddressMap()\n");
> > >
> > > Yes, it works on my eMAG server: the kernel boots.
> > > Other than efibootmgr failure. But I noticed this efibootmgr failure
> > > even before Commit d3549a938b7 ("avoid SetVirtualAddressMap() when
> > > possible ")
> > >
> > > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5; Could not set
> > > Timeout: Input/output error Could not set Timeout: Input/output error
> > >
> >
> > Do you get any [Firmware Bug] lines in the kernel log?
>
> No,
> I built the kernel based on:
> commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master, origin/master, origin/HEAD)
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Sun Feb 5 17:17:10 2023 -0800
>
> Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
>
> Are you worried about your sync exception fixup patch? I think it has been included.
>
I would just like to understand why setvariable is still broken for you.
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
2023-02-07 9:09 ` Ard Biesheuvel
@ 2023-02-08 3:10 ` Justin He
0 siblings, 0 replies; 20+ messages in thread
From: Justin He @ 2023-02-08 3:10 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Huacai Chen, linux-efi@vger.kernel.org,
linux-kernel@vger.kernel.org, Alexandru Elisei,
Jason A. Donenfeld, nd
Hi Ard
> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Tuesday, February 7, 2023 5:09 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
>
> On Tue, 7 Feb 2023 at 10:08, Justin He <Justin.He@arm.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > Sent: Tuesday, February 7, 2023 5:04 PM
> > > To: Justin He <Justin.He@arm.com>
> > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > <nd@arm.com>
> > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > >
> > > On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> > > >
> > > > Hi Ard
> > > >
> > > > > -----Original Message-----
> > > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > > > To: Justin He <Justin.He@arm.com>
> > > > > Cc: Huacai Chen <chenhuacai@kernel.org>;
> > > > > linux-efi@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > > Alexandru Elisei <Alexandru.Elisei@arm.com>; Jason A. Donenfeld
> > > > > <Jason@zx2c4.com>; nd <nd@arm.com>
> > > > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag
> > > > > server
> > > > >
> > > > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > [..]
> > > > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > > Date: Thu Nov 10 10:36:20 2022 +0100
> > > > > > > >
> > > > > > > > arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > > > Altra machines
> > > > > > > >
> > > > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine
> > > > > > > > into the list of Using SetVirtualAddressMap() forcibly?
> > > > > > > >
> > > > > > > > Please note that even in previous kernel patch, the
> > > > > > > > efibootmgr -t
> > > > > > > > 10 will make kernel hung if I passed "efi=novamap" to the
> > > > > > > > boot
> > > > > parameter.
> > > > > > > >
> > > > > > >
> > > > > > > Interesting. What does dmidecode return for the family in
> > > > > > > the type 1
> > > > > record?
> > > > > >
> > > > > > # dmidecode |grep -i family
> > > > > > Family: eMAG
> > > > > > Family: ARMv8
> > > > > >
> > > > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > > > >
> > > > >
> > > > > OK please try this:
> > > > >
> > > > > diff --git a/drivers/firmware/efi/libstub/arm64.c
> > > > > b/drivers/firmware/efi/libstub/arm64.c
> > > > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > > > * Ampere Altra machines crash in SetTime() if
> > > > > SetVirtualAddressMap()
> > > > > * has not been called prior.
> > > > > */
> > > > > - if (!type1_family || strcmp(type1_family, "Altra"))
> > > > > + if (!type1_family ||
> > > > > + (strcmp(type1_family, "Altra") &&
> > > > > + strcmp(type1_family,
> > > > > + "eMAG")))
> > > > > return false;
> > > > >
> > > > > efi_warn("Working around broken
> > > > > SetVirtualAddressMap()\n");
> > > >
> > > > Yes, it works on my eMAG server: the kernel boots.
> > > > Other than efibootmgr failure. But I noticed this efibootmgr
> > > > failure even before Commit d3549a938b7 ("avoid
> > > > SetVirtualAddressMap() when possible ")
> > > >
> > > > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5; Could not set
> > > > Timeout: Input/output error Could not set Timeout: Input/output
> > > > error
> > > >
> > >
> > > Do you get any [Firmware Bug] lines in the kernel log?
> >
> > No,
> > I built the kernel based on:
> > commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master,
> > origin/master, origin/HEAD)
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date: Sun Feb 5 17:17:10 2023 -0800
> >
> > Merge branch 'fixes' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
> >
> > Are you worried about your sync exception fixup patch? I think it has been
> included.
> >
>
>
> I would just like to understand why setvariable is still broken for you.
On an Ampere *Altra* server, the family name seems to not follow your purpose of invoking efi_get_smbios_string(1, family).
dmidecode |grep -i family -C 10
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: GIGABYTE
Product Name: R272-P30-00
Version: 0100
Serial Number: TS2035812A0022
UUID: 00000000-0000-4000-8000-B42E99AFEF62
Wake-up Type: Power Switch
SKU Number: 01234567890123456789AB
Family: Server
The full dmidecode info of Altra is at
https://pastebin.com/HQLE1yYv
--
Cheers,
Justin (Jia He)
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2023-02-08 3:10 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-31 4:03 [PATCH 0/2] Fix boot hang issue on Ampere Emag server Jia He
2023-01-31 4:03 ` [PATCH 1/2] efi: libstub: Fix the retriving of supported rutime services Jia He
2023-01-31 7:04 ` Ard Biesheuvel
2023-01-31 4:03 ` [PATCH 2/2] efi: Introduce efi_get_supported_rt_services() to get the runtime services mask Jia He
2023-01-31 7:06 ` Ard Biesheuvel
2023-01-31 7:18 ` [PATCH 0/2] Fix boot hang issue on Ampere Emag server Ard Biesheuvel
[not found] ` <DBBPR08MB45383B479656BA18FEFFB7D5F7D09@DBBPR08MB4538.eurprd08.prod.outlook.com>
2023-01-31 15:31 ` Justin He
2023-01-31 18:23 ` Jason A. Donenfeld
2023-02-02 10:50 ` Justin He
2023-02-07 3:21 ` Justin He
2023-02-07 8:44 ` Ard Biesheuvel
2023-02-07 8:49 ` Justin He
2023-02-07 8:54 ` Ard Biesheuvel
2023-02-07 9:02 ` Justin He
2023-02-07 9:03 ` Ard Biesheuvel
2023-02-07 9:08 ` Justin He
2023-02-07 9:09 ` Ard Biesheuvel
2023-02-08 3:10 ` Justin He
2023-01-31 18:03 ` Jason A. Donenfeld
2023-01-31 18:13 ` Jason A. Donenfeld
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox