* [Qemu-devel] Possible bug in target-i386/helper.c:do_cpu_init()?
@ 2015-09-24 23:26 Bill Paul
2015-09-30 16:54 ` Bill Paul
2015-09-30 17:05 ` Paolo Bonzini
0 siblings, 2 replies; 3+ messages in thread
From: Bill Paul @ 2015-09-24 23:26 UTC
To: qemu-devel; +Cc: Paolo Bonzini, Eduardo Habkost, Richard Henderson
Consider the following circumstances:
- An x86-64 multicore system is running with all cores set for long mode
(EFER.LME and EFER.LMA set)
- The OS decides to re-launch one of the AP CPUs using an INIT IPI
According to the Intel architecture manual, an INIT IPI should reset the CPU
state (with a few small exceptions):
[...]
10.4.7.3 Local APIC State After an INIT Reset ("Wait-for-SIPI" State)
An INIT reset of the processor can be initiated in either of two ways:
· By asserting the processor's INIT# pin.
· By sending the processor an INIT IPI (an IPI with the delivery mode set
to INIT).
Upon receiving an INIT through either of these mechanisms, the processor
responds by beginning the initialization process of the processor core and the
local APIC. The state of the local APIC following an INIT reset is the same as
it is after a power-up or hardware reset, except that the APIC ID and
arbitration ID registers are not affected. This state is also referred to at
the "wait-for-SIPI" state (see also: Section 8.4.2, "MP Initialization
Protocol Requirements and Restrictions").
[...]
Note however that do_cpu_init() does this:
1225 void do_cpu_init(X86CPU *cpu)
1226 {
1227     CPUState *cs = CPU(cpu);
1228     CPUX86State *env = &cpu->env;
1229     CPUX86State *save = g_new(CPUX86State, 1);
1230     int sipi = cs->interrupt_request & CPU_INTERRUPT_SIPI;
1231
1232     *save = *env;
1233
1234     cpu_reset(cs);
1235     cs->interrupt_request = sipi;
1236     memcpy(&env->start_init_save, &save->start_init_save,
1237            offsetof(CPUX86State, end_init_save) -
1238            offsetof(CPUX86State, start_init_save));
1239     g_free(save);
1240
1241     if (kvm_enabled()) {
1242         kvm_arch_do_init_vcpu(cpu);
1243     }
1244     apic_init_reset(cpu->apic_state);
1245 }
The CPU environment is saved before calling cpu_reset(), and part of it is
copied back afterwards. The x86_cpu_reset() function does clear the CPU
environment, but do_cpu_init() then restores everything between the
start_init_save and end_init_save markers, which in this case includes EFER.
The result of this is that if the CPU was in long mode and you do an INIT IPI,
the CPU still has the EFER.LMA and EFER.LME bits set, even though it's not
actually running in long mode anymore. It doesn't seem possible for the guest
to get the CPU out of this state, and one nasty side-effect is that trying to
set CR0.PG to enable paging never succeeds.
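The save/restore pattern described above can be reproduced outside QEMU. The following is only a minimal sketch, not QEMU code: ToyEnv, toy_reset() and toy_do_cpu_init() are hypothetical names, the struct is a drastically reduced stand-in for CPUX86State, and the markers are one-byte dummy fields rather than QEMU's empty structs. It is meant to show why a field placed between the two markers survives the reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_EFER_LME (1u << 8)   /* EFER bit 8: long mode enable */
#define TOY_EFER_LMA (1u << 10)  /* EFER bit 10: long mode active */

/* Hypothetical, heavily reduced stand-in for CPUX86State. */
typedef struct ToyEnv {
    uint64_t cr0;           /* outside the preserved region: cleared by INIT */
    char start_init_save;   /* marker: preserved region begins here */
    uint64_t efer;          /* inside the preserved region (the reported bug) */
    uint64_t tsc;           /* example of state meant to survive INIT */
    char end_init_save;     /* marker: preserved region ends here */
} ToyEnv;

/* Stand-in for cpu_reset(): wipes the whole environment. */
static void toy_reset(ToyEnv *env)
{
    memset(env, 0, sizeof(*env));
}

/* Mirrors the shape of do_cpu_init(): save, reset, copy the region back. */
static void toy_do_cpu_init(ToyEnv *env)
{
    const size_t start = offsetof(ToyEnv, start_init_save);
    const size_t end = offsetof(ToyEnv, end_init_save);
    ToyEnv save;

    assert(env != NULL);
    save = *env;
    toy_reset(env);
    /* Everything between the two markers is restored, EFER included. */
    memcpy((char *)env + start, (const char *)&save + start, end - start);
}
```

Calling toy_do_cpu_init() on a ToyEnv whose efer has TOY_EFER_LME and TOY_EFER_LMA set leaves both bits set afterwards while cr0 is cleared, which is the inconsistent state described in this message.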
I added the following code at line 1240 above as a workaround:
#ifdef TARGET_X86_64
    /*
     * The initial state of the CPU is not 64-bit mode. This being
     * the case, don't leave the EFER.LMA or EFER.LME bits set.
     */
    cpu_load_efer(env, 0);
#endif
This seemed to fix the problem I was having; however, I'm not certain this is
the correct fix.
As background, I ran across this problem testing VxWorks with QEMU 2.3.0 and
OVMF firmware. The VxWorks BOOTX64.EFI loader is able to load and run 32-bit
VxWorks images on 64-bit hardware by forcing the CPU back to 32-bit mode
before handing control to the OS. However it only does this for the BSP (CPU
0). It turns out that the UEFI firmware puts the AP cores into long mode too.
(This may be new in recent UEFI/OVMF versions, because I'm pretty sure I tested
this path before and didn't see a problem.) Everything works ok with
uniprocessor images, but with SMP images, launching the first AP CPU fails due
to the above condition (the CPU starts up, but is unable to enable paging and
dies screaming in short order).
Booting with the 32-bit OVMF build and the VxWorks BOOTIA32.EFI loader works
ok. The same VxWorks loader and kernel code also seem to run ok on real
hardware.
I'm using QEMU 2.3.0 on FreeBSD/amd64 9.2-RELEASE. I'm not using KVM. It looks
like the code is still the same in the git repo. Am I correct that
do_cpu_init() should be clearing the EFER contents?
-Bill
--
=============================================================================
-Bill Paul (510) 749-2329 | Senior Member of Technical Staff,
wpaul@windriver.com | Master of Unix-Fu - Wind River Systems
=============================================================================
"I put a dollar in a change machine. Nothing changed." - George Carlin
=============================================================================
* Re: [Qemu-devel] Possible bug in target-i386/helper.c:do_cpu_init()?
From: Bill Paul @ 2015-09-30 16:54 UTC
To: qemu-devel; +Cc: Paolo Bonzini, Eduardo Habkost, Richard Henderson
Ping?
* Re: [Qemu-devel] Possible bug in target-i386/helper.c:do_cpu_init()?
From: Paolo Bonzini @ 2015-09-30 17:05 UTC
To: Bill Paul, qemu-devel; +Cc: Eduardo Habkost, Richard Henderson
On 25/09/2015 01:26, Bill Paul wrote:
> The result of this is that if the CPU was in long mode and you do an INIT IPI,
> the CPU still has the EFER.LMA and EFER.LME bits set, even though it's not
> actually running in long mode anymore. It doesn't seem possible for the guest
> to get the CPU out of this state, and one nasty side-effect is that trying to
> set the CR0 to enable paging never succeeds.
>
> I added the following code at line 1240 above as a workaround:
>
> #ifdef TARGET_X86_64
> /*
> * The initial state of the CPU is not 64-bit mode. This being
> * the case, don't leave the EFER.LMA or EFER.LME bits set.
> */
>
> cpu_load_efer(env, 0);
> #endif
>
> This seemed to fix the problem I was having, however I'm not certain this is
> the correct fix.
I think a better fix is to move the "uint64_t efer;" field to some place
before the dummy "struct {} start_init_save;" marker in
target-i386/cpu.h. Can you test it and send a patch if it works?
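As a rough illustration of that suggestion (a sketch under assumptions, not the actual cpu.h change): FixedEnv and fixed_do_cpu_init() are hypothetical names, the struct is a toy stand-in for CPUX86State, and one-byte dummy fields replace the empty-struct markers. With efer declared before the start marker, the copy-back no longer touches it, so the value written by the reset is what the guest sees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy layout with efer moved ahead of the start marker. */
typedef struct FixedEnv {
    uint64_t cr0;           /* cleared by INIT */
    uint64_t efer;          /* now outside the preserved region: cleared too */
    char start_init_save;   /* marker: preserved region begins here */
    uint64_t tsc;           /* example of state meant to survive INIT */
    char end_init_save;     /* marker: preserved region ends here */
} FixedEnv;

/* Same save / reset / copy-back shape as do_cpu_init(), on the toy struct. */
static void fixed_do_cpu_init(FixedEnv *env)
{
    const size_t start = offsetof(FixedEnv, start_init_save);
    const size_t end = offsetof(FixedEnv, end_init_save);
    FixedEnv save;

    assert(env != NULL);
    save = *env;
    memset(env, 0, sizeof(*env));   /* stand-in for cpu_reset() */
    /* Only the bytes between the markers come back; efer stays zero. */
    memcpy((char *)env + start, (const char *)&save + start, end - start);
}
```

Compared with clearing EFER by hand after the copy, this keeps the decision about what survives INIT in one place, the field order of the struct.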
Thanks,
Paolo