* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-15 7:43 [PATCH v2] x86/shutdown: change default reboot method preference Roger Pau Monne
@ 2023-09-18 12:26 ` Jan Beulich
2023-09-18 15:09 ` Roger Pau Monné
2023-09-27 8:21 ` Jan Beulich
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-09-18 12:26 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel
On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to chose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
>
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
>
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> CPU: 0
> RIP: e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> [...]
> Xen call trace:
> [<0000000000000017>] R 0000000000000017
> [<ffff83207eff7b50>] S ffff83207eff7b50
> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
>
> Which in most cases does lead to a reboot, however that's unreliable.
>
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
>
> This is in line to what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.
I certainly appreciate this as a goal. However, ...
> Add a special case for one Acer model that does require being rebooted using
> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
... this is precisely what I'd like to avoid: Needing workarounds on spec-
conforming systems.
> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.
I'm also puzzled by this statement: That Acer aspect is a clear indication
of there being an issue. Plus it's quite easy to see that hooks may be put
in place by various firmware components that would then be used to make
certain adjustments to the platform, ahead of an orderly reboot / shutdown.
> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>
> if ( xen_guest )
> reboot_type = BOOT_XEN;
> + else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> + reboot_type = BOOT_ACPI;
> else if ( efi_enabled(EFI_RS) )
> reboot_type = BOOT_EFI;
> - else if ( acpi_disabled )
> - reboot_type = BOOT_KBD;
> else
> - reboot_type = BOOT_ACPI;
> + reboot_type = BOOT_KBD;
> }
>
> static int __init cf_check override_reboot(const struct dmi_system_id *d)
> {
> enum reboot_type type = (long)d->driver_data;
>
> - if ( type == BOOT_ACPI && acpi_disabled )
> + if ( (type == BOOT_ACPI && acpi_disabled) ||
> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> type = BOOT_KBD;
I guess I don't follow this adjustment: Why would we fall back to KBD
first thing? Wouldn't it make sense to try ACPI first if EFI cannot
be used? And go further to KBD only if ACPI then also turns out
disabled (a mode that Xen quite likely won't correctly operate in
anymore anyway, due to bitrot)?
As an aside, KBD likely is unusable on hw-reduced systems, for there
simply not being a legacy keyboard controller. Instead we may need to
fall back to CF9 in such a case.
Jan
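
For readers following the hunk, the ordering change in default_reboot_type() can be modelled outside the hypervisor. The sketch below is a standalone C model and not Xen source: struct platform and the helpers default_before() / default_after() are hypothetical names that stand in for xen_guest, acpi_disabled, acpi_gbl_reduced_hardware and efi_enabled(EFI_RS) as they appear in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Standalone model of the quoted hunk; none of this is Xen source. */
enum reboot_type { BOOT_XEN, BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* Hypothetical stand-ins for the globals the real function consults. */
struct platform {
    bool xen_guest;       /* Xen itself runs as a guest */
    bool acpi_disabled;   /* ACPI is not in use */
    bool hw_reduced;      /* models acpi_gbl_reduced_hardware */
    bool efi_rs;          /* models efi_enabled(EFI_RS) */
};

/* Order before the patch: EFI runtime services win whenever available. */
enum reboot_type default_before(const struct platform *p)
{
    if ( p->xen_guest )
        return BOOT_XEN;
    if ( p->efi_rs )
        return BOOT_EFI;
    if ( p->acpi_disabled )
        return BOOT_KBD;
    return BOOT_ACPI;
}

/* Order after the patch: ACPI first, unless disabled or hardware-reduced. */
enum reboot_type default_after(const struct platform *p)
{
    if ( p->xen_guest )
        return BOOT_XEN;
    if ( !p->acpi_disabled && !p->hw_reduced )
        return BOOT_ACPI;
    if ( p->efi_rs )
        return BOOT_EFI;
    return BOOT_KBD;
}

/* A UEFI-booted box with full ACPI is the case whose default changes. */
void check_uefi_box(void)
{
    const struct platform uefi_box = { .efi_rs = true };

    assert(default_before(&uefi_box) == BOOT_EFI);
    assert(default_after(&uefi_box) == BOOT_ACPI);
}
```

In this model a hardware-reduced system without EFI runtime services moves from BOOT_ACPI to BOOT_KBD, which is the situation the CF9 remark above is concerned with.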
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-18 12:26 ` Jan Beulich
@ 2023-09-18 15:09 ` Roger Pau Monné
2023-09-18 15:44 ` Jan Beulich
0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2023-09-18 15:09 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel
On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> On 15.09.2023 09:43, Roger Pau Monne wrote:
> > The current logic to chose the preferred reboot method is based on the mode Xen
> > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > method will be to use the ResetSystem() run time service call.
> >
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> >
> > Hardware Dom0 shutdown: rebooting machine
> > ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> > CPU: 0
> > RIP: e008:[<0000000000000017>] 0000000000000017
> > RFLAGS: 0000000000010202 CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> > [<0000000000000017>] R 0000000000000017
> > [<ffff83207eff7b50>] S ffff83207eff7b50
> > [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >
> > ****************************************
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > ****************************************
> >
> > Which in most cases does lead to a reboot, however that's unreliable.
> >
> > Change the default reboot preference to prefer ACPI over UEFI if available and
> > not in reduced hardware mode.
> >
> > This is in line to what Linux does, so it's unlikely to cause issues on current
> > and future hardware, since there's a much higher chance of vendors testing
> > hardware with Linux rather than Xen.
>
> I certainly appreciate this as a goal. However, ...
>
> > Add a special case for one Acer model that does require being rebooted using
> > ResetSystem(). See Linux commit 0082517fa4bce for rationale.
>
> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> conforming systems.
I wouldn't call that platform spec-conforming when ACPI reboot doesn't
work reliably on it either. I haven't been able to find any wording in
the UEFI specification that mandates using ResetSystem() in order to
reset the platform. I've only found this wording:
"... then the UEFI OS Loader has taken control of the platform, and
EFI will not regain control of the system until the platform is reset.
One method of resetting the platform is through the EFI Runtime
Service ResetSystem()."
And this reads to me as a mere indication that one option is to use
ResetSystem(), but that there are likely other platform specific reset
methods that are suitable to be used for OSes and still be compliant
with the UEFI spec.
>
> > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > properly implemented ResetSystem() methods.
>
> I'm also puzzled by this statement: That Acer aspect is a clear indication
> of there being an issue.
Hm yes, I had that sentence from v1, before realizing the Acer quirk.
So there's one known issue with using ACPI as the default reboot
method vs many issues when using the UEFI one.
> Plus it's quite easy to see that hooks may be put
> in place by various firmware components that would then be used to make
> certain adjustments to the platform, ahead of an orderly reboot / shutdown.
Well, I very much doubt any vendor would rely on this, seeing as
Linux and Windows both default to ACPI reboot, and the UEFI spec doesn't
mandate the use of ResetSystem() anyway.
> > --- a/xen/arch/x86/shutdown.c
> > +++ b/xen/arch/x86/shutdown.c
> > @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >
> > if ( xen_guest )
> > reboot_type = BOOT_XEN;
> > + else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> > + reboot_type = BOOT_ACPI;
> > else if ( efi_enabled(EFI_RS) )
> > reboot_type = BOOT_EFI;
> > - else if ( acpi_disabled )
> > - reboot_type = BOOT_KBD;
> > else
> > - reboot_type = BOOT_ACPI;
> > + reboot_type = BOOT_KBD;
> > }
> >
> > static int __init cf_check override_reboot(const struct dmi_system_id *d)
> > {
> > enum reboot_type type = (long)d->driver_data;
> >
> > - if ( type == BOOT_ACPI && acpi_disabled )
> > + if ( (type == BOOT_ACPI && acpi_disabled) ||
> > + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> > type = BOOT_KBD;
>
> I guess I don't follow this adjustment: Why would we fall back to KBD
> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
> be used?
This is IMO a weird corner case: we have an explicit request to use one
reboot method, but we cannot do so because the component is disabled.
I've assumed that falling back to KBD was the safest option.
For example if we have to explicitly reboot using UEFI it's likely
because ACPI (the proposed default method) is not suitable, and hence
falling back to ACPI here won't help.
> And go further to KBD only if ACPI then also turns out
> disabled (a mode that Xen quite likely won't correctly operate in
> anymore anyway, due to bitrot)?
>
> As an aside, KBD likely is unusable on hw-reduced systems, for there
> simply not being a legacy keyboard controller. Instead we may need to
> fall back to CF9 in such a case.
Hm, I can send a followup patch for that, but not as part of this
change.
Thanks, Roger.
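
The two fallback policies under discussion for override_reboot() can be put side by side in a small standalone model. This is a sketch under assumptions, not Xen code: struct platform and the functions override_as_posted() and override_stepwise() are hypothetical names. The first mirrors the condition in the quoted hunk; the second encodes the alternative raised in review (try ACPI when EFI cannot be used, and KBD only if ACPI is disabled too).

```c
#include <assert.h>
#include <stdbool.h>

/* Standalone model; the enum values only mirror names used in the patch. */
enum reboot_type { BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* Hypothetical stand-ins for acpi_disabled and efi_enabled(EFI_RS). */
struct platform {
    bool acpi_disabled;
    bool efi_rs;
};

/* As in the quoted hunk: any unusable request drops straight to KBD. */
enum reboot_type override_as_posted(enum reboot_type type,
                                    const struct platform *p)
{
    if ( (type == BOOT_ACPI && p->acpi_disabled) ||
         (type == BOOT_EFI && !p->efi_rs) )
        type = BOOT_KBD;

    return type;
}

/* Alternative from the review: EFI falls back to ACPI, ACPI to KBD. */
enum reboot_type override_stepwise(enum reboot_type type,
                                   const struct platform *p)
{
    if ( type == BOOT_EFI && !p->efi_rs )
        type = BOOT_ACPI;
    if ( type == BOOT_ACPI && p->acpi_disabled )
        type = BOOT_KBD;

    return type;
}

/* The policies differ only when EFI is requested but unusable and ACPI is up. */
void check_override_difference(void)
{
    const struct platform no_efi_rs = { .acpi_disabled = false, .efi_rs = false };

    assert(override_as_posted(BOOT_EFI, &no_efi_rs) == BOOT_KBD);
    assert(override_stepwise(BOOT_EFI, &no_efi_rs) == BOOT_ACPI);
}
```

Both functions agree whenever the requested method is usable, and both end at BOOT_KBD when neither EFI runtime services nor ACPI are available.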
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-18 15:09 ` Roger Pau Monné
@ 2023-09-18 15:44 ` Jan Beulich
2023-09-18 16:00 ` Roger Pau Monné
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-09-18 15:44 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel
On 18.09.2023 17:09, Roger Pau Monné wrote:
> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
>> On 15.09.2023 09:43, Roger Pau Monne wrote:
>>> The current logic to chose the preferred reboot method is based on the mode Xen
>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
>>> method will be to use the ResetSystem() run time service call.
>>>
>>> However, that method seems to be widely untested, and quite often leads to a
>>> result similar to:
>>>
>>> Hardware Dom0 shutdown: rebooting machine
>>> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
>>> CPU: 0
>>> RIP: e008:[<0000000000000017>] 0000000000000017
>>> RFLAGS: 0000000000010202 CONTEXT: hypervisor
>>> [...]
>>> Xen call trace:
>>> [<0000000000000017>] R 0000000000000017
>>> [<ffff83207eff7b50>] S ffff83207eff7b50
>>> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>>> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>>> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>>> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>>> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>>> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>>> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>>> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>>> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>>>
>>> ****************************************
>>> Panic on CPU 0:
>>> FATAL TRAP: vector = 6 (invalid opcode)
>>> ****************************************
>>>
>>> Which in most cases does lead to a reboot, however that's unreliable.
>>>
>>> Change the default reboot preference to prefer ACPI over UEFI if available and
>>> not in reduced hardware mode.
>>>
>>> This is in line to what Linux does, so it's unlikely to cause issues on current
>>> and future hardware, since there's a much higher chance of vendors testing
>>> hardware with Linux rather than Xen.
>>
>> I certainly appreciate this as a goal. However, ...
>>
>>> Add a special case for one Acer model that does require being rebooted using
>>> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
>>
>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
>> conforming systems.
>
> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> work reliably on it either. I haven't been able to find a wording on
> the UEFI specification that mandates using ResetSystem() in order to
> reset the platform. I've only found this wording:
>
> "... then the UEFI OS Loader has taken control of the platform, and
> EFI will not regain control of the system until the platform is reset.
> One method of resetting the platform is through the EFI Runtime
> Service ResetSystem()."
>
> And this reads to me as a mere indication that one option is to use
> ResetSystem(), but that there are likely other platform specific reset
> methods that are suitable to be used for OSes and still be compliant
> with the UEFI spec.
See my reference to ia64. With ACPI_FADT_RESET_REGISTER not set, I don't
think there would have been any other non-custom reboot method there. So
while perhaps not mandated, it's still the designated abstraction layer.
>>> --- a/xen/arch/x86/shutdown.c
>>> +++ b/xen/arch/x86/shutdown.c
>>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>>>
>>> if ( xen_guest )
>>> reboot_type = BOOT_XEN;
>>> + else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
>>> + reboot_type = BOOT_ACPI;
>>> else if ( efi_enabled(EFI_RS) )
>>> reboot_type = BOOT_EFI;
>>> - else if ( acpi_disabled )
>>> - reboot_type = BOOT_KBD;
>>> else
>>> - reboot_type = BOOT_ACPI;
>>> + reboot_type = BOOT_KBD;
>>> }
>>>
>>> static int __init cf_check override_reboot(const struct dmi_system_id *d)
>>> {
>>> enum reboot_type type = (long)d->driver_data;
>>>
>>> - if ( type == BOOT_ACPI && acpi_disabled )
>>> + if ( (type == BOOT_ACPI && acpi_disabled) ||
>>> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
>>> type = BOOT_KBD;
>>
>> I guess I don't follow this adjustment: Why would we fall back to KBD
>> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
>> be used?
>
> This is IMO a weird corner case, we have a explicit request to use one
> reboot method, but we cannot do so because the component is disabled.
> I've assumed that falling back to KBD was the safest option.
>
> For example if we have to explicitly reboot using UEFI it's likely
> because ACPI (the proposed default method) is not suitable, and hence
> falling back to ACPI here won't help.
Perhaps, but falling back to KBD isn't necessarily going to work either.
And it might well be that on said Acer no reboot method would actually
yield consistent behavior, except for ResetSystem(). The fallback logic
here as well as that in machine_restart() is all based on guesswork
anyway.
Jan
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-18 15:44 ` Jan Beulich
@ 2023-09-18 16:00 ` Roger Pau Monné
2023-09-19 9:31 ` Jan Beulich
0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2023-09-18 16:00 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel
On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
> On 18.09.2023 17:09, Roger Pau Monné wrote:
> > On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> >> On 15.09.2023 09:43, Roger Pau Monne wrote:
> >>> The current logic to chose the preferred reboot method is based on the mode Xen
> >>> has been booted into, so if the box is booted from UEFI, the preferred reboot
> >>> method will be to use the ResetSystem() run time service call.
> >>>
> >>> However, that method seems to be widely untested, and quite often leads to a
> >>> result similar to:
> >>>
> >>> Hardware Dom0 shutdown: rebooting machine
> >>> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> >>> CPU: 0
> >>> RIP: e008:[<0000000000000017>] 0000000000000017
> >>> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> >>> [...]
> >>> Xen call trace:
> >>> [<0000000000000017>] R 0000000000000017
> >>> [<ffff83207eff7b50>] S ffff83207eff7b50
> >>> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >>> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >>> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >>> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >>> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >>> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >>> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >>> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >>> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >>>
> >>> ****************************************
> >>> Panic on CPU 0:
> >>> FATAL TRAP: vector = 6 (invalid opcode)
> >>> ****************************************
> >>>
> >>> Which in most cases does lead to a reboot, however that's unreliable.
> >>>
> >>> Change the default reboot preference to prefer ACPI over UEFI if available and
> >>> not in reduced hardware mode.
> >>>
> >>> This is in line to what Linux does, so it's unlikely to cause issues on current
> >>> and future hardware, since there's a much higher chance of vendors testing
> >>> hardware with Linux rather than Xen.
> >>
> >> I certainly appreciate this as a goal. However, ...
> >>
> >>> Add a special case for one Acer model that does require being rebooted using
> >>> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
> >>
> >> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> >> conforming systems.
> >
> > I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> > work reliably on it either. I haven't been able to find a wording on
> > the UEFI specification that mandates using ResetSystem() in order to
> > reset the platform. I've only found this wording:
> >
> > "... then the UEFI OS Loader has taken control of the platform, and
> > EFI will not regain control of the system until the platform is reset.
> > One method of resetting the platform is through the EFI Runtime
> > Service ResetSystem()."
> >
> > And this reads to me as a mere indication that one option is to use
> > ResetSystem(), but that there are likely other platform specific reset
> > methods that are suitable to be used for OSes and still be compliant
> > with the UEFI spec.
>
> See my reference to ia64.
Right, I understand that on ia64 things might have been different, due
to the platform lacking any other reboot method, but I don't see how
this applies to x86 where there are other reboot methods.
> With ACPI_FADT_RESET_REGISTER not set, I don't
> think there would have been any other non-custom reboot method there. So
> while perhaps not mandated, it's still the designated abstraction layer.
Again the spec doesn't mention that ResetSystem() must be used, so
while it would make sense if it was reliable, it clearly isn't. In
which case resorting to the more reliable method should always be
preferred, especially if the spec is so lax as to call ResetSystem()
"One method of resetting the platform".
We should also take into account that vendors are much more likely to
test new hardware with Linux rather than Xen, and hence it's unlikely
that the default Linux reboot method doesn't work on a platform,
because that would hurt the vendor.
> >>> --- a/xen/arch/x86/shutdown.c
> >>> +++ b/xen/arch/x86/shutdown.c
> >>> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
> >>>
> >>> if ( xen_guest )
> >>> reboot_type = BOOT_XEN;
> >>> + else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> >>> + reboot_type = BOOT_ACPI;
> >>> else if ( efi_enabled(EFI_RS) )
> >>> reboot_type = BOOT_EFI;
> >>> - else if ( acpi_disabled )
> >>> - reboot_type = BOOT_KBD;
> >>> else
> >>> - reboot_type = BOOT_ACPI;
> >>> + reboot_type = BOOT_KBD;
> >>> }
> >>>
> >>> static int __init cf_check override_reboot(const struct dmi_system_id *d)
> >>> {
> >>> enum reboot_type type = (long)d->driver_data;
> >>>
> >>> - if ( type == BOOT_ACPI && acpi_disabled )
> >>> + if ( (type == BOOT_ACPI && acpi_disabled) ||
> >>> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> >>> type = BOOT_KBD;
> >>
> >> I guess I don't follow this adjustment: Why would we fall back to KBD
> >> first thing? Wouldn't it make sense to try ACPI first if EFI cannot
> >> be used?
> >
> > This is IMO a weird corner case, we have a explicit request to use one
> > reboot method, but we cannot do so because the component is disabled.
> > I've assumed that falling back to KBD was the safest option.
> >
> > For example if we have to explicitly reboot using UEFI it's likely
> > because ACPI (the proposed default method) is not suitable, and hence
> > falling back to ACPI here won't help.
>
> Perhaps, but falling back to KBD isn't necessarily going to work either.
> And it might well be that on said Acer no reboot method would actually
> yield consistent behavior, except for ResetSystem(). The fallback logic
> here as well as that in machine_restart() is all based on guesswork
> anyway.
Indeed, hence it seemed a suitable and less risky option to fall back
to KBD in both cases.
Thanks, Roger.
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-18 16:00 ` Roger Pau Monné
@ 2023-09-19 9:31 ` Jan Beulich
2023-09-19 10:29 ` Roger Pau Monné
0 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-09-19 9:31 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, Wei Liu, xen-devel
On 18.09.2023 18:00, Roger Pau Monné wrote:
> On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
>> On 18.09.2023 17:09, Roger Pau Monné wrote:
>>> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
>>>> On 15.09.2023 09:43, Roger Pau Monne wrote:
>>>>> The current logic to chose the preferred reboot method is based on the mode Xen
>>>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
>>>>> method will be to use the ResetSystem() run time service call.
>>>>>
>>>>> However, that method seems to be widely untested, and quite often leads to a
>>>>> result similar to:
>>>>>
>>>>> Hardware Dom0 shutdown: rebooting machine
>>>>> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
>>>>> CPU: 0
>>>>> RIP: e008:[<0000000000000017>] 0000000000000017
>>>>> RFLAGS: 0000000000010202 CONTEXT: hypervisor
>>>>> [...]
>>>>> Xen call trace:
>>>>> [<0000000000000017>] R 0000000000000017
>>>>> [<ffff83207eff7b50>] S ffff83207eff7b50
>>>>> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
>>>>> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
>>>>> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
>>>>> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
>>>>> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
>>>>> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
>>>>> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
>>>>> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
>>>>> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>>>>>
>>>>> ****************************************
>>>>> Panic on CPU 0:
>>>>> FATAL TRAP: vector = 6 (invalid opcode)
>>>>> ****************************************
>>>>>
>>>>> Which in most cases does lead to a reboot, however that's unreliable.
>>>>>
>>>>> Change the default reboot preference to prefer ACPI over UEFI if available and
>>>>> not in reduced hardware mode.
>>>>>
>>>>> This is in line to what Linux does, so it's unlikely to cause issues on current
>>>>> and future hardware, since there's a much higher chance of vendors testing
>>>>> hardware with Linux rather than Xen.
>>>>
>>>> I certainly appreciate this as a goal. However, ...
>>>>
>>>>> Add a special case for one Acer model that does require being rebooted using
>>>>> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
>>>>
>>>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
>>>> conforming systems.
>>>
>>> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
>>> work reliably on it either. I haven't been able to find a wording on
>>> the UEFI specification that mandates using ResetSystem() in order to
>>> reset the platform. I've only found this wording:
>>>
>>> "... then the UEFI OS Loader has taken control of the platform, and
>>> EFI will not regain control of the system until the platform is reset.
>>> One method of resetting the platform is through the EFI Runtime
>>> Service ResetSystem()."
>>>
>>> And this reads to me as a mere indication that one option is to use
>>> ResetSystem(), but that there are likely other platform specific reset
>>> methods that are suitable to be used for OSes and still be compliant
>>> with the UEFI spec.
>>
>> See my reference to ia64.
>
> Right, I understand that on ia64 things might have been different, due
> to the platform lacking any other reboot method, but I don't see how
> this applies to x86 where there are other reboot methods.
>
>> With ACPI_FADT_RESET_REGISTER not set, I don't
>> think there would have been any other non-custom reboot method there. So
>> while perhaps not mandated, it's still the designated abstraction layer.
>
> Again the spec doesn't mention that ResetSystem() must be used, so
> while it would make sense if it was reliable, it clearly isn't. In
> which case resorting to the more reliable method should always be
> preferred, specially if the spec is so lax as to call ResetSystem()
> "One method of resetting the platform".
That wording wasn't there in 1.02, but I can see it all the way back to
at least 2.1. So yes, you have a point. Yet - adding onto an earlier
remark of mine - EFI_RESET_NOTIFICATION_PROTOCOL would be pretty useless if
use of ResetSystem() were optional.
Jan
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-19 9:31 ` Jan Beulich
@ 2023-09-19 10:29 ` Roger Pau Monné
0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2023-09-19 10:29 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel
On Tue, Sep 19, 2023 at 11:31:07AM +0200, Jan Beulich wrote:
> On 18.09.2023 18:00, Roger Pau Monné wrote:
> > On Mon, Sep 18, 2023 at 05:44:47PM +0200, Jan Beulich wrote:
> >> On 18.09.2023 17:09, Roger Pau Monné wrote:
> >>> On Mon, Sep 18, 2023 at 02:26:51PM +0200, Jan Beulich wrote:
> >>>> On 15.09.2023 09:43, Roger Pau Monne wrote:
> >>>>> The current logic to chose the preferred reboot method is based on the mode Xen
> >>>>> has been booted into, so if the box is booted from UEFI, the preferred reboot
> >>>>> method will be to use the ResetSystem() run time service call.
> >>>>>
> >>>>> However, that method seems to be widely untested, and quite often leads to a
> >>>>> result similar to:
> >>>>>
> >>>>> Hardware Dom0 shutdown: rebooting machine
> >>>>> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> >>>>> CPU: 0
> >>>>> RIP: e008:[<0000000000000017>] 0000000000000017
> >>>>> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> >>>>> [...]
> >>>>> Xen call trace:
> >>>>> [<0000000000000017>] R 0000000000000017
> >>>>> [<ffff83207eff7b50>] S ffff83207eff7b50
> >>>>> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> >>>>> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> >>>>> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> >>>>> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> >>>>> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> >>>>> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> >>>>> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> >>>>> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> >>>>> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >>>>>
> >>>>> ****************************************
> >>>>> Panic on CPU 0:
> >>>>> FATAL TRAP: vector = 6 (invalid opcode)
> >>>>> ****************************************
> >>>>>
> >>>>> Which in most cases does lead to a reboot, however that's unreliable.
> >>>>>
> >>>>> Change the default reboot preference to prefer ACPI over UEFI if available and
> >>>>> not in reduced hardware mode.
> >>>>>
> >>>>> This is in line to what Linux does, so it's unlikely to cause issues on current
> >>>>> and future hardware, since there's a much higher chance of vendors testing
> >>>>> hardware with Linux rather than Xen.
> >>>>
> >>>> I certainly appreciate this as a goal. However, ...
> >>>>
> >>>>> Add a special case for one Acer model that does require being rebooted using
> >>>>> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
> >>>>
> >>>> ... this is precisely what I'd like to avoid: Needing workarounds on spec-
> >>>> conforming systems.
> >>>
> >>> I wouldn't call that platform spec-conforming when ACPI reboot doesn't
> >>> work reliably on it either. I haven't been able to find a wording on
> >>> the UEFI specification that mandates using ResetSystem() in order to
> >>> reset the platform. I've only found this wording:
> >>>
> >>> "... then the UEFI OS Loader has taken control of the platform, and
> >>> EFI will not regain control of the system until the platform is reset.
> >>> One method of resetting the platform is through the EFI Runtime
> >>> Service ResetSystem()."
> >>>
> >>> And this reads to me as a mere indication that one option is to use
> >>> ResetSystem(), but that there are likely other platform specific reset
> >>> methods that are suitable to be used for OSes and still be compliant
> >>> with the UEFI spec.
> >>
> >> See my reference to ia64.
> >
> > Right, I understand that on ia64 things might have been different, due
> > to the platform lacking any other reboot method, but I don't see how
> > this applies to x86 where there are other reboot methods.
> >
> >> With ACPI_FADT_RESET_REGISTER not set, I don't
> >> think there would have been any other non-custom reboot method there. So
> >> while perhaps not mandated, it's still the designated abstraction layer.
> >
> > Again the spec doesn't mention that ResetSystem() must be used, so
> > while it would make sense if it was reliable, it clearly isn't. In
> > which case resorting to the more reliable method should always be
> > preferred, specially if the spec is so lax as to call ResetSystem()
> > "One method of resetting the platform".
>
> That wording wasn't there in 1.02, but I can see it all the way back to
> at least 2.1. So yes, you have a point. Yet - adding onto an earlier
> remark of mine - EFI_RESET_NOTIFICATION_PROTOCOL is pretty useless if
> use of ResetSystem() was optional.
See the note in
EFI_RESET_NOTIFICATION_PROTOCOL.RegisterResetNotify():
"The list of registered reset notification functions are processed if
ResetSystem() is called before ExitBootServices(). The list of
registered reset notification functions is ignored if ResetSystem() is
called after ExitBootServices()."
Those handlers are only called before ExitBootServices(), so for our
use-case it doesn't make a difference, as we call ResetSystem() after
having exited boot services.
Thanks, Roger.
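
The rule in the quoted note can be illustrated with a toy model. This is a sketch only: struct firmware_model, register_reset_notify(), exit_boot_services() and reset_system() are hypothetical stand-ins, not the UEFI interfaces. They encode just one thing, namely that registered handlers are processed only if the reset is requested before ExitBootServices(). A real reset does not return; the model returns a count so the behaviour can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NOTIFY 8

typedef void (*reset_notify_fn)(void);

/* Toy firmware state; not a real UEFI data structure. */
struct firmware_model {
    reset_notify_fn handlers[MAX_NOTIFY];
    size_t nr_handlers;
    bool boot_services_exited;
};

/* Counts invocations of the sample handler below. */
unsigned int handler_calls;

/* Sample notification handler used to observe whether the list is run. */
void count_call(void)
{
    handler_calls++;
}

/* Append a handler to the list; fails once the toy table is full. */
bool register_reset_notify(struct firmware_model *fw, reset_notify_fn fn)
{
    if ( fw->nr_handlers >= MAX_NOTIFY )
        return false;

    fw->handlers[fw->nr_handlers++] = fn;
    return true;
}

void exit_boot_services(struct firmware_model *fw)
{
    fw->boot_services_exited = true;
}

/* Returns how many handlers were run for this reset request. */
size_t reset_system(struct firmware_model *fw)
{
    size_t ran = 0;

    assert(fw->nr_handlers <= MAX_NOTIFY);

    /* After ExitBootServices() the registered list is ignored. */
    if ( fw->boot_services_exited )
        return 0;

    for ( size_t i = 0; i < fw->nr_handlers; i++ )
    {
        fw->handlers[i]();
        ran++;
    }

    return ran;
}
```

A hypervisor that only requests the reset after exiting boot services corresponds to the second case, where the model runs no handlers.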
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-15 7:43 [PATCH v2] x86/shutdown: change default reboot method preference Roger Pau Monne
2023-09-18 12:26 ` Jan Beulich
@ 2023-09-27 8:21 ` Jan Beulich
2023-10-03 11:35 ` Roger Pau Monné
2024-07-29 22:08 ` Marek Marczykowski-Górecki
2026-02-13 0:39 ` Marek Marczykowski-Górecki
3 siblings, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2023-09-27 8:21 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel
On 15.09.2023 09:43, Roger Pau Monne wrote:
> The current logic to choose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
>
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
>
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> CPU: 0
> RIP: e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> [...]
> Xen call trace:
> [<0000000000000017>] R 0000000000000017
> [<ffff83207eff7b50>] S ffff83207eff7b50
> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
>
> Which in most cases does lead to a reboot, however that's unreliable.
>
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
>
> This is in line with what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.
>
> Add a special case for one Acer model that does require being rebooted using
> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
>
> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.
A data point from a new system I'm still in the process of setting up: The
ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
The EFI method, otoh, properly distinguishes "reboot=warm" from our default
of explicitly requesting cold reboot. (Without taking the EFI path, I
assume our write to the relevant BDA location simply has no effect, for
this being a legacy BIOS thing, and the system apparently defaults to warm
reboot when using the ACPI method.)
Clearly, as a secondary effect, this system adds to my personal experience
of so far EFI reboot consistently working on all x86 hardware I have (had)
direct access to. (That said, this is the first non-Intel system, which
likely biases my overall experience.)
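The two ways of conveying the warm/cold choice can be sketched in standalone C
as follows. This is an illustration under assumptions, not the Xen code: it
assumes the "relevant BDA location" is the conventional reset-flag word at
physical address 0x472 (offset 0x72 into the BDA at 0x400), where 0x1234
requests a warm boot, and it writes into a caller-supplied buffer instead of
real memory. The function names are hypothetical; the EFI_RESET_TYPE values
are the ones the UEFI specification defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFI_RESET_TYPE values as defined by the UEFI specification. */
enum efi_reset_type {
    EfiResetCold = 0,
    EfiResetWarm = 1,
    EfiResetShutdown = 2
};

/* Legacy BIOS convention (assumed here): reset flag word in the BDA. */
#define BDA_RESET_FLAG_OFFSET 0x72      /* 0x472 - 0x400 */
#define BDA_WARM_BOOT         0x1234
#define BDA_COLD_BOOT         0x0000

/* EFI path: the request type is an explicit ResetSystem() argument. */
static enum efi_reset_type efi_reset_type_for(bool warm)
{
    return warm ? EfiResetWarm : EfiResetCold;
}

/*
 * Legacy path: store the flag little-endian in a buffer standing in for
 * the BDA. UEFI firmware is free to ignore this word entirely.
 */
static void bda_set_reset_flag(uint8_t *bda, size_t len, bool warm)
{
    uint16_t flag = warm ? BDA_WARM_BOOT : BDA_COLD_BOOT;

    assert(bda != NULL && len > BDA_RESET_FLAG_OFFSET + 1);
    bda[BDA_RESET_FLAG_OFFSET] = (uint8_t)(flag & 0xff);
    bda[BDA_RESET_FLAG_OFFSET + 1] = (uint8_t)(flag >> 8);
}
```

The difference is that the EFI call carries the request as a parameter the
firmware has to look at, whereas the BDA word is only a hint that legacy BIOS
code consults.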
Jan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-27 8:21 ` Jan Beulich
@ 2023-10-03 11:35 ` Roger Pau Monné
2023-10-23 11:02 ` Roger Pau Monné
0 siblings, 1 reply; 13+ messages in thread
From: Roger Pau Monné @ 2023-10-03 11:35 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel
On Wed, Sep 27, 2023 at 10:21:44AM +0200, Jan Beulich wrote:
> On 15.09.2023 09:43, Roger Pau Monne wrote:
> > The current logic to choose the preferred reboot method is based on the mode Xen
> > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > method will be to use the ResetSystem() run time service call.
> >
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> >
> > Hardware Dom0 shutdown: rebooting machine
> > ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> > CPU: 0
> > RIP: e008:[<0000000000000017>] 0000000000000017
> > RFLAGS: 0000000000010202 CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> > [<0000000000000017>] R 0000000000000017
> > [<ffff83207eff7b50>] S ffff83207eff7b50
> > [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >
> > ****************************************
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > ****************************************
> >
> > Which in most cases does lead to a reboot, however that's unreliable.
> >
> > Change the default reboot preference to prefer ACPI over UEFI if available and
> > not in reduced hardware mode.
> >
> > This is in line with what Linux does, so it's unlikely to cause issues on current
> > and future hardware, since there's a much higher chance of vendors testing
> > hardware with Linux rather than Xen.
> >
> > Add a special case for one Acer model that does require being rebooted using
> > ResetSystem(). See Linux commit 0082517fa4bce for rationale.
> >
> > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > properly implemented ResetSystem() methods.
>
> A data point from a new system I'm still in the process of setting up: The
> ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
> The EFI method, otoh, properly distinguishes "reboot=warm" from our default
> of explicitly requesting cold reboot. (Without taking the EFI path, I
> assume our write to the relevant BDA location simply has no effect, for
> this being a legacy BIOS thing, and the system apparently defaults to warm
> reboot when using the ACPI method.)
This is unfortunate, but IMO not as bad as getting a #UD or any
other fault while attempting a reboot. We can always force this
system to use UEFI reboot, if that does work better than ACPI.
> Clearly, as a secondary effect, this system adds to my personal experience
> of so far EFI reboot consistently working on all x86 hardware I have (had)
> direct access to. (That said, this is the first non-Intel system, which
> likely biases my overall experience.)
I can try to gather some data. I can at least tell you that the Intel
NUC11TNHi7 TGL also hits a fault when attempting UEFI reboot.
The above crash was from a Dell PowerEdge R6625. I do recall seeing
this with other boxes in the Citrix lab, but don't know the exact
models. I'm quite sure other downstreams can provide similar
feedback.
I think it's clear now that using ResetSystem() when booted from UEFI
is not mandated by the UEFI specification, so I still stand by this
patch and think we should select the default reboot method that has
the highest chance of succeeding.
Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-10-03 11:35 ` Roger Pau Monné
@ 2023-10-23 11:02 ` Roger Pau Monné
0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2023-10-23 11:02 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel
On Tue, Oct 03, 2023 at 01:35:25PM +0200, Roger Pau Monné wrote:
> On Wed, Sep 27, 2023 at 10:21:44AM +0200, Jan Beulich wrote:
> > On 15.09.2023 09:43, Roger Pau Monne wrote:
> > > The current logic to choose the preferred reboot method is based on the mode Xen
> > > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > > method will be to use the ResetSystem() run time service call.
> > >
> > > However, that method seems to be widely untested, and quite often leads to a
> > > result similar to:
> > >
> > > Hardware Dom0 shutdown: rebooting machine
> > > ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> > > CPU: 0
> > > RIP: e008:[<0000000000000017>] 0000000000000017
> > > RFLAGS: 0000000000010202 CONTEXT: hypervisor
> > > [...]
> > > Xen call trace:
> > > [<0000000000000017>] R 0000000000000017
> > > [<ffff83207eff7b50>] S ffff83207eff7b50
> > > [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > > [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > > [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > > [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > > [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > > [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > > [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > > [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > > [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> > >
> > > ****************************************
> > > Panic on CPU 0:
> > > FATAL TRAP: vector = 6 (invalid opcode)
> > > ****************************************
> > >
> > > Which in most cases does lead to a reboot, however that's unreliable.
> > >
> > > Change the default reboot preference to prefer ACPI over UEFI if available and
> > > not in reduced hardware mode.
> > >
> > > This is in line with what Linux does, so it's unlikely to cause issues on current
> > > and future hardware, since there's a much higher chance of vendors testing
> > > hardware with Linux rather than Xen.
> > >
> > > Add a special case for one Acer model that does require being rebooted using
> > > ResetSystem(). See Linux commit 0082517fa4bce for rationale.
> > >
> > > I'm not aware of using ACPI reboot causing issues on boxes that do have
> > > properly implemented ResetSystem() methods.
> >
> > A data point from a new system I'm still in the process of setting up: The
> > ACPI reboot method, as used by Linux, unconditionally means a warm reboot.
> > The EFI method, otoh, properly distinguishes "reboot=warm" from our default
> > of explicitly requesting cold reboot. (Without taking the EFI path, I
> > assume our write to the relevant BDA location simply has no effect, for
> > this being a legacy BIOS thing, and the system apparently defaults to warm
> > reboot when using the ACPI method.)
>
> This is unfortunate, but IMO not as bad as getting a #UD or any
> other fault while attempting a reboot. We can always force this
> system to use UEFI reboot, if that does work better than ACPI.
>
> > Clearly, as a secondary effect, this system adds to my personal experience
> > of so far EFI reboot consistently working on all x86 hardware I have (had)
> > direct access to. (That said, this is the first non-Intel system, which
> > likely biases my overall experience.)
>
> I can try to gather some data. I can at least tell you that the Intel
> NUC11TNHi7 TGL also hits a fault when attempting UEFI reboot.
> The above crash was from a Dell PowerEdge R6625. I do recall seeing
> this with other boxes in the Citrix lab, but don't know the exact
> models. I'm quite sure other downstreams can provide similar
> feedback.
As a further data point, Dasharo [0], a coreboot downstream, was also
shipping firmware with a broken ResetSystem() method, and they
didn't notice until someone reported errors on Xen reboot:
https://github.com/Dasharo/edk2/pull/99/commits/dee75be10ac9387168bd3a8cad0f1ec6e372129a
It's quite clear no one is testing ResetSystem(), the UEFI spec
doesn't mandate using it, and we are just hurting ourselves by forcing
its usage.
Regards, Roger.
[0] https://github.com/Dasharo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-15 7:43 [PATCH v2] x86/shutdown: change default reboot method preference Roger Pau Monne
2023-09-18 12:26 ` Jan Beulich
2023-09-27 8:21 ` Jan Beulich
@ 2024-07-29 22:08 ` Marek Marczykowski-Górecki
2026-02-13 0:39 ` Marek Marczykowski-Górecki
3 siblings, 0 replies; 13+ messages in thread
From: Marek Marczykowski-Górecki @ 2024-07-29 22:08 UTC (permalink / raw)
To: Roger Pau Monne
Cc: xen-devel, Jan Beulich, Andrew Cooper, Wei Liu, Daniel P. Smith
On Fri, Sep 15, 2023 at 09:43:47AM +0200, Roger Pau Monne wrote:
> The current logic to choose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
>
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
>
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> CPU: 0
> RIP: e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> [...]
> Xen call trace:
> [<0000000000000017>] R 0000000000000017
> [<ffff83207eff7b50>] S ffff83207eff7b50
> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
>
> Which in most cases does lead to a reboot, however that's unreliable.
>
> Change the default reboot preference to prefer ACPI over UEFI if available and
> not in reduced hardware mode.
>
> This is in line with what Linux does, so it's unlikely to cause issues on current
> and future hardware, since there's a much higher chance of vendors testing
> hardware with Linux rather than Xen.
>
> Add a special case for one Acer model that does require being rebooted using
> ResetSystem(). See Linux commit 0082517fa4bce for rationale.
>
> I'm not aware of using ACPI reboot causing issues on boxes that do have
> properly implemented ResetSystem() methods.
With the Acer quirk, and the info Jan posted in the thread, this
sentence is technically not true. I don't think it warrants any code
change in this patch (it's clearly a less common and less problematic
issue than a crash during ResetSystem(), and can still be worked around
with a cmdline option). But it might warrant adjusting the commit message.
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
The other points still stand, and I think this is generally an improvement,
so, preferably with an adjusted commit message:
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> ---
> Changes since v1:
> - Add special case for Acer model to use UEFI reboot.
> - Adjust commit message.
> ---
> xen/arch/x86/shutdown.c | 19 +++++++++++++++----
> 1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c
> index 7619544d14da..3816ede1afe5 100644
> --- a/xen/arch/x86/shutdown.c
> +++ b/xen/arch/x86/shutdown.c
> @@ -150,19 +150,20 @@ static void default_reboot_type(void)
>
> if ( xen_guest )
> reboot_type = BOOT_XEN;
> + else if ( !acpi_disabled && !acpi_gbl_reduced_hardware )
> + reboot_type = BOOT_ACPI;
> else if ( efi_enabled(EFI_RS) )
> reboot_type = BOOT_EFI;
> - else if ( acpi_disabled )
> - reboot_type = BOOT_KBD;
> else
> - reboot_type = BOOT_ACPI;
> + reboot_type = BOOT_KBD;
> }
>
> static int __init cf_check override_reboot(const struct dmi_system_id *d)
> {
> enum reboot_type type = (long)d->driver_data;
>
> - if ( type == BOOT_ACPI && acpi_disabled )
> + if ( (type == BOOT_ACPI && acpi_disabled) ||
> + (type == BOOT_EFI && !efi_enabled(EFI_RS)) )
> type = BOOT_KBD;
>
> if ( reboot_type != type )
> @@ -172,6 +173,7 @@ static int __init cf_check override_reboot(const struct dmi_system_id *d)
> [BOOT_KBD] = "keyboard controller",
> [BOOT_ACPI] = "ACPI",
> [BOOT_CF9] = "PCI",
> + [BOOT_EFI] = "UEFI",
> };
>
> reboot_type = type;
> @@ -530,6 +532,15 @@ static const struct dmi_system_id __initconstrel reboot_dmi_table[] = {
> DMI_MATCH(DMI_PRODUCT_NAME, "PowerEdge R740"),
> },
> },
> + { /* Handle problems with rebooting on Acer TravelMate X514-51T. */
> + .callback = override_reboot,
> + .driver_data = (void *)(long)BOOT_EFI,
> + .ident = "Acer TravelMate X514-51T",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "Acer"),
> + DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate X514-51T"),
> + },
> + },
> { }
> };
>
> --
> 2.42.0
>
>
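For readers following the first hunk above, the post-patch preference order
can be sketched as a standalone C model. This is an illustration only, not
the Xen code: struct platform and the functions pick_default() and
sanitize_override() are hypothetical names, and the real code reads globals
such as acpi_disabled and calls efi_enabled(EFI_RS) instead of taking a
parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reboot methods; names mirror the enumerators visible in the patch. */
enum reboot_type { BOOT_XEN, BOOT_ACPI, BOOT_EFI, BOOT_KBD };

/* Hypothetical snapshot of the state the real code reads from globals. */
struct platform {
    bool xen_guest;              /* Xen itself runs as a Xen guest */
    bool acpi_disabled;
    bool acpi_reduced_hardware;  /* stands in for acpi_gbl_reduced_hardware */
    bool efi_rs;                 /* stands in for efi_enabled(EFI_RS) */
};

/* Post-patch order: Xen, then ACPI, then EFI runtime services, then KBD. */
static enum reboot_type pick_default(const struct platform *p)
{
    assert(p != NULL);
    if (p->xen_guest)
        return BOOT_XEN;
    if (!p->acpi_disabled && !p->acpi_reduced_hardware)
        return BOOT_ACPI;
    if (p->efi_rs)
        return BOOT_EFI;
    return BOOT_KBD;
}

/*
 * DMI quirk handling as in the second hunk: a requested method that the
 * platform cannot provide falls back to the keyboard controller.
 */
static enum reboot_type sanitize_override(const struct platform *p,
                                          enum reboot_type wanted)
{
    assert(p != NULL);
    if ((wanted == BOOT_ACPI && p->acpi_disabled) ||
        (wanted == BOOT_EFI && !p->efi_rs))
        return BOOT_KBD;
    return wanted;
}
```

With this ordering a UEFI-booted box with full ACPI now picks BOOT_ACPI,
while a reduced-hardware ACPI platform with runtime services still ends up
on BOOT_EFI.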
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2023-09-15 7:43 [PATCH v2] x86/shutdown: change default reboot method preference Roger Pau Monne
` (2 preceding siblings ...)
2024-07-29 22:08 ` Marek Marczykowski-Górecki
@ 2026-02-13 0:39 ` Marek Marczykowski-Górecki
2026-02-13 7:54 ` Roger Pau Monné
3 siblings, 1 reply; 13+ messages in thread
From: Marek Marczykowski-Górecki @ 2026-02-13 0:39 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: xen-devel, Jan Beulich, Andrew Cooper, Wei Liu
On Fri, Sep 15, 2023 at 09:43:47AM +0200, Roger Pau Monne wrote:
> The current logic to choose the preferred reboot method is based on the mode Xen
> has been booted into, so if the box is booted from UEFI, the preferred reboot
> method will be to use the ResetSystem() run time service call.
>
> However, that method seems to be widely untested, and quite often leads to a
> result similar to:
>
> Hardware Dom0 shutdown: rebooting machine
> ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> CPU: 0
> RIP: e008:[<0000000000000017>] 0000000000000017
> RFLAGS: 0000000000010202 CONTEXT: hypervisor
> [...]
> Xen call trace:
> [<0000000000000017>] R 0000000000000017
> [<ffff83207eff7b50>] S ffff83207eff7b50
> [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
>
> ****************************************
> Panic on CPU 0:
> FATAL TRAP: vector = 6 (invalid opcode)
> ****************************************
>
> Which in most cases does lead to a reboot, however that's unreliable.
It's not relevant anymore, but posting just for posterity: I
just found yet another system where EFI ResetSystem() crashes. What's
interesting is that it's a rather new system: a NUC 14 with Lunar Lake.
It crashes as follows:
(XEN) ----[ Xen-4.17.6 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<0000000063907504>] 0000000063907504
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: 000000006ff4da98 rbx: 000000006ff4dad0 rcx: 0000000000000001
(XEN) rdx: 000000000311100a rsi: 0000000000000000 rdi: 000000006ffb5080
(XEN) rbp: 0000000000000001 rsp: ffff82d0403ef958 r8: 0000000000000000
(XEN) r9: 000000006ffb5080 r10: 0000000000000836 r11: 0000000000000835
(XEN) r12: 0000000000000000 r13: 000000000311100a r14: 000000006ffb5080
(XEN) r15: 000000000000001f cr0: 0000000080050033 cr4: 0000000000d526e0
(XEN) cr3: 000000046d4e5000 cr2: 0000000063907504
(XEN) fsb: 0000000000000000 gsb: 0000000000000000 gss: 0000000000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen code around <0000000063907504> (0000000063907504):
(XEN) 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
(XEN) Xen stack trace from rsp=ffff82d0403ef958:
(XEN) 000000006fff5b56 ffff82d04059c400 ffff83046f90eed0 0000000000000000
(XEN) 0000000000000014 0000000000000000 0000000000000002 0000000000000000
(XEN) 0000000000000086 0000000000000000 0000000000000001 0000000000000000
(XEN) ffff82d0403efb00 000000006ffb5080 000000006fff5bde ffff82d000000001
(XEN) 0000000000000000 000000000311100a 0000000000000000 0000000000000000
(XEN) ffff83046d500770 ffff83046d44d1f8 0000000000000000 0000000000000000
(XEN) 000000006ffb4844 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 000000006ffb4498 0000000000000001 0000000000000001 0000000000000046
(XEN) ffff83046f90ef84 0000000000000000 0000000000000000 0000000000000000
(XEN) 000000006ffad650 0000000000000000 0000000000000000 0000000000000000
(XEN) ffff82d0403efb00 0000000000000000 ffff82d0402884c9 ffff830000000000
(XEN) ffff82d0402888ac 0000000000000000 ffff82d0403efb40 0000000000000004
(XEN) 000000046d4e5000 000000005484f000 0000000000000004 ffff82d040444b20
(XEN) 0000000000000046 ffff82d040317b76 ffff82d040317c85 0000000000000000
(XEN) 0000000000000065 ffff83046d44d000 ffff82d0403172e7 00001388403efb58
(XEN) 000082d0403efb60 0000000000000000 0000000000000003 ffff83046d44d000
(XEN) 0000000000000003 ffff83046d44d1f8 0000000000000000 ffff82d0402277d5
(XEN) ffff82d040227851 ffff82d040206c27 ffff83046d44d000 0000000000054894
(XEN) 0000000000000000 0000000000054894 ffff82d0402d0a58 000000000046bb48
(XEN) Xen call trace:
(XEN) [<0000000063907504>] R 0000000063907504
(XEN) [<000000006fff5b56>] S 000000006fff5b56
(XEN) [<ffff82d0402884c9>] S runtime.c#efi_rs_enter.part.0+0xc9/0x120
(XEN) [<ffff82d0402888ac>] S efi_reset_system+0x4c/0x90
(XEN) [<ffff82d040317b76>] S __stop_this_cpu+0x16/0x40
(XEN) [<ffff82d040317c85>] S smp_send_stop+0xc5/0xe0
(XEN) [<ffff82d0403172e7>] S machine_restart+0x247/0x330
(XEN) [<ffff82d0402277d5>] S shutdown.c#maybe_reboot+0x35/0x40
(XEN) [<ffff82d040227851>] S hwdom_shutdown+0x71/0xc0
(XEN) [<ffff82d040206c27>] S domain_shutdown+0x47/0x100
(XEN) [<ffff82d0402d0a58>] S p2m_add_page+0x4f8/0x7d0
(XEN) [<ffff82d0403cb1a4>] S dom0_construct_pvh+0x3b4/0x1300
(XEN) [<ffff82d040250e00>] S xhci-dbc.c#dbc_uart_flush+0x50/0x60
(XEN) [<ffff82d04022974f>] S timer.c#add_entry+0x4f/0xc0
(XEN) [<ffff82d04031af7b>] S time.c#read_counter+0x1b/0x40
(XEN) [<ffff82d04031b10c>] S time.c#platform_time_calibration+0x1c/0x90
(XEN) [<ffff82d0403e5b23>] S construct_dom0+0x63/0xe0
(XEN) [<ffff82d0403dbd87>] S __start_xen+0x21a7/0x264a
(XEN) [<ffff82d040277284>] S __high_start+0x94/0xa0
(XEN)
(XEN) Pagetable walk from 0000000063907504:
(XEN) L4[0x000] = 000000046d4e4063 ffffffffffffffff
(XEN) L3[0x001] = 0000000054848063 ffffffffffffffff
(XEN) L2[0x11c] = 80000000638001e3 ffffffffffffffff (PSE)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0011]
(XEN) Faulting linear address: 0000000063907504
(XEN) ****************************************
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] x86/shutdown: change default reboot method preference
2026-02-13 0:39 ` Marek Marczykowski-Górecki
@ 2026-02-13 7:54 ` Roger Pau Monné
0 siblings, 0 replies; 13+ messages in thread
From: Roger Pau Monné @ 2026-02-13 7:54 UTC (permalink / raw)
To: Marek Marczykowski-Górecki
Cc: xen-devel, Jan Beulich, Andrew Cooper, Wei Liu
On Fri, Feb 13, 2026 at 01:39:56AM +0100, Marek Marczykowski-Górecki wrote:
> On Fri, Sep 15, 2023 at 09:43:47AM +0200, Roger Pau Monne wrote:
> > The current logic to choose the preferred reboot method is based on the mode Xen
> > has been booted into, so if the box is booted from UEFI, the preferred reboot
> > method will be to use the ResetSystem() run time service call.
> >
> > However, that method seems to be widely untested, and quite often leads to a
> > result similar to:
> >
> > Hardware Dom0 shutdown: rebooting machine
> > ----[ Xen-4.18-unstable x86_64 debug=y Tainted: C ]----
> > CPU: 0
> > RIP: e008:[<0000000000000017>] 0000000000000017
> > RFLAGS: 0000000000010202 CONTEXT: hypervisor
> > [...]
> > Xen call trace:
> > [<0000000000000017>] R 0000000000000017
> > [<ffff83207eff7b50>] S ffff83207eff7b50
> > [<ffff82d0403525aa>] F machine_restart+0x1da/0x261
> > [<ffff82d04035263c>] F apic_wait_icr_idle+0/0x37
> > [<ffff82d040233689>] F smp_call_function_interrupt+0xc7/0xcb
> > [<ffff82d040352f05>] F call_function_interrupt+0x20/0x34
> > [<ffff82d04033b0d5>] F do_IRQ+0x150/0x6f3
> > [<ffff82d0402018c2>] F common_interrupt+0x132/0x140
> > [<ffff82d040283d33>] F arch/x86/acpi/cpu_idle.c#acpi_idle_do_entry+0x113/0x129
> > [<ffff82d04028436c>] F arch/x86/acpi/cpu_idle.c#acpi_processor_idle+0x3eb/0x5f7
> > [<ffff82d04032a549>] F arch/x86/domain.c#idle_loop+0xec/0xee
> >
> > ****************************************
> > Panic on CPU 0:
> > FATAL TRAP: vector = 6 (invalid opcode)
> > ****************************************
> >
> > Which in most cases does lead to a reboot, however that's unreliable.
>
> It's not relevant anymore, but posting just for posterity: I
> just found yet another system where EFI ResetSystem() crashes. What's
> interesting is that it's a rather new system: a NUC 14 with Lunar Lake.
> It crashes as follows:
Interesting, all the NUC systems I have owned had what seemed like proper
UEFI implementations. However, those were the Intel-made ones. The Lunar
Lake NUC is made by ASUS.
Thanks, Roger.
^ permalink raw reply [flat|nested] 13+ messages in thread