From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49585) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZfFp7-0002KL-II for qemu-devel@nongnu.org; Thu, 24 Sep 2015 19:21:10 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZfFp3-0002cr-IO for qemu-devel@nongnu.org; Thu, 24 Sep 2015 19:21:09 -0400
Received: from mail.windriver.com ([147.11.1.11]:47435) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZfFp3-0002aR-AQ for qemu-devel@nongnu.org; Thu, 24 Sep 2015 19:21:05 -0400
From: Bill Paul
Date: Thu, 24 Sep 2015 16:26:33 -0700
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Message-Id: <201509241626.33410.wpaul@windriver.com>
Subject: [Qemu-devel] Possible bug in target-i386/helper.c:do_cpu_init()?
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini , Eduardo Habkost , Richard Henderson

Consider the following circumstances:

- An x86-64 multicore system is running with all cores set for long mode
  (EFER.LME and EFER.LMA set)
- The OS decides to re-launch one of the AP CPUs using an INIT IPI

According to the Intel architecture manual, an INIT IPI should reset the
CPU state (with a few small exceptions):

[...]
10.4.7.3 Local APIC State After an INIT Reset ("Wait-for-SIPI" State)

An INIT reset of the processor can be initiated in either of two ways:
· By asserting the processor's INIT# pin.
· By sending the processor an INIT IPI (an IPI with the delivery mode set
  to INIT).

Upon receiving an INIT through either of these mechanisms, the processor
responds by beginning the initialization process of the processor core and
the local APIC.
The state of the local APIC following an INIT reset is the same as it is
after a power-up or hardware reset, except that the APIC ID and
arbitration ID registers are not affected. This state is also referred to
at the "wait-for-SIPI" state (see also: Section 8.4.2, "MP Initialization
Protocol Requirements and Restrictions").
[...]

Note however that do_cpu_init() does this:

1225 void do_cpu_init(X86CPU *cpu)
1226 {
1227     CPUState *cs = CPU(cpu);
1228     CPUX86State *env = &cpu->env;
1229     CPUX86State *save = g_new(CPUX86State, 1);
1230     int sipi = cs->interrupt_request & CPU_INTERRUPT_SIPI;
1231
1232     *save = *env;
1233
1234     cpu_reset(cs);
1235     cs->interrupt_request = sipi;
1236     memcpy(&env->start_init_save, &save->start_init_save,
1237            offsetof(CPUX86State, end_init_save) -
1238            offsetof(CPUX86State, start_init_save));
1239     g_free(save);
1240
1241     if (kvm_enabled()) {
1242         kvm_arch_do_init_vcpu(cpu);
1243     }
1244     apic_init_reset(cpu->apic_state);
1245 }

The CPU environment, which in this case includes the EFER state, is saved
and restored when calling cpu_reset(). The x86_cpu_reset() function
actually does clear all of the CPU environment, but do_cpu_init() then
puts it all back.

The result of this is that if the CPU was in long mode and you do an INIT
IPI, the CPU still has the EFER.LMA and EFER.LME bits set, even though
it's not actually running in long mode anymore. It doesn't seem possible
for the guest to get the CPU out of this state, and one nasty side-effect
is that trying to set CR0 to enable paging never succeeds.

I added the following code at line 1240 above as a workaround:

#ifdef TARGET_X86_64
    /*
     * The initial state of the CPU is not 64-bit mode. This being
     * the case, don't leave the EFER.LMA or EFER.LME bits set.
     */

    cpu_load_efer(env, 0);
#endif

This seemed to fix the problem I was having; however, I'm not certain this
is the correct fix.
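
The save/reset/restore sequence described above can be modelled in a few
lines of stand-alone C. This is only a sketch, not QEMU code: ToyCPUState,
toy_reset() and toy_do_cpu_init() are hypothetical names, and placing efer
between the start_init_save and end_init_save markers is an assumption
made to mirror the behaviour reported here. Whether the real CPUX86State
layout matches should be checked against cpu.h. The EFER bit positions
(LME = bit 8, LMA = bit 10) are the architectural ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Architectural EFER bits. */
#define EFER_LME (1ULL << 8)    /* long mode enable */
#define EFER_LMA (1ULL << 10)   /* long mode active */

/*
 * Hypothetical, heavily reduced stand-in for CPUX86State.
 * ASSUMPTION: efer sits inside the region preserved across INIT.
 */
typedef struct ToyCPUState {
    uint32_t cr0;            /* outside the region: cleared by reset */
    int start_init_save;     /* marker: start of preserved region */
    uint64_t efer;           /* assumed to be preserved */
    uint64_t tsc;            /* another preserved field */
    int end_init_save;       /* marker: end of preserved region */
} ToyCPUState;

/* Model of a full reset: everything goes back to zero. */
static void toy_reset(ToyCPUState *env)
{
    memset(env, 0, sizeof(*env));
}

/*
 * Model of do_cpu_init(): save, reset, then copy the preserved region
 * back. If clear_efer is non-zero, EFER is zeroed afterwards, which
 * plays the role of the cpu_load_efer(env, 0) workaround.
 */
static void toy_do_cpu_init(ToyCPUState *env, int clear_efer)
{
    const size_t start = offsetof(ToyCPUState, start_init_save);
    const size_t end = offsetof(ToyCPUState, end_init_save);
    ToyCPUState save = *env;

    assert(end > start);
    toy_reset(env);
    memcpy((char *)env + start, (const char *)&save + start, end - start);

    if (clear_efer) {
        env->efer = 0;
    }
}
```

Calling toy_do_cpu_init(&env, 0) on a state with EFER_LME | EFER_LMA set
leaves both bits set while cr0 is cleared, which is the inconsistent
combination described above; calling it with clear_efer = 1 leaves efer
at zero while the other preserved field (tsc) survives.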

As background, I ran across this problem testing VxWorks with QEMU 2.3.0
and OVMF firmware. The VxWorks BOOTX64.EFI loader is able to load and run
32-bit VxWorks images on 64-bit hardware by forcing the CPU back to 32-bit
mode before handing control to the OS. However, it only does this for the
BSP (CPU 0). It turns out that the UEFI firmware puts the AP cores into
long mode too. (This may be new in recent UEFI/OVMF versions, because I'm
pretty sure I tested this path before and didn't see a problem.)
Everything works ok with uniprocessor images, but with SMP images,
launching the first AP CPU fails due to the above condition (the CPU
starts up, but is unable to enable paging and dies screaming in short
order).

Booting with the 32-bit OVMF build and the VxWorks BOOTIA32.EFI loader
works ok. The same VxWorks loader and kernel code also seems to run ok on
real hardware.

I'm using QEMU 2.3.0 on FreeBSD/amd64 9.2-RELEASE. I'm not using KVM. It
looks like the code is still the same in the git repo. Am I correct that
do_cpu_init() should be clearing the EFER contents?

-Bill

-- 
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Member of Technical Staff,
                 wpaul@windriver.com | Master of Unix-Fu - Wind River Systems
=============================================================================
   "I put a dollar in a change machine. Nothing changed."
   - George Carlin
=============================================================================