* EFI mixed mode + perf = rampant triple faults
From: Andy Lutomirski @ 2014-12-17 16:51 UTC
To: linux-efi, LKML

I figured I should send this email before I forget about this issue:

If you run perf record across any EFI mixed mode call or otherwise
receive an NMI or MCE, the machine triple-faults. The cause is
straightforward: there is no valid IDT when we have long mode disabled
for the duration of the EFI call.

As far as I know, the only way to have continuously functional interrupt
handling across a long mode transition is to install an interrupt vector
table and hope that CPUs actually do something intelligent when
receiving an interrupt with LME=1, LMA=1, and PG=0. Yuck.

Could we get away with issuing 32-bit EFI calls in compat mode, i.e.
with a 32-bit CPL0 CS but while still in long mode? I think that
delivery of an IST interrupt (which includes both NMI and MCE) will
correctly switch to a fully valid 64-bit state and would correctly
switch back when we execute IRET at the end. (Am I missing some reason
that switching bitness without a privilege level change doesn't work
well? I haven't thought of anything, other than the lack of SS/SP controls
on intra-ring interrupts, but that shouldn't be an issue here.)

As an added benefit, this would considerably simplify the code.

--Andy

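A note on the mode distinction behind the compat-mode proposal above, as a minimal C sketch. It is not code from this thread or from the kernel: the function name `operating_mode` and its parameters are hypothetical, and it assumes protected mode is already enabled (CR0.PE=1). The point it illustrates is that with EFER.LMA=1 the L bit of the CS descriptor selects between 64-bit mode and compatibility mode, so 32-bit code can run under a 32-bit CS without leaving long mode.

```c
#include <stdbool.h>

/*
 * Illustrative model only (hypothetical helper, not kernel code).
 * Assumes CR0.PE=1. Inputs:
 *   efer_lma - EFER.LMA, long mode active
 *   cs_l     - L bit of the current CS descriptor (64-bit code segment)
 *   cs_d     - D bit of the current CS descriptor (32-bit default size)
 */
static const char *operating_mode(bool efer_lma, bool cs_l, bool cs_d)
{
    if (!efer_lma) {
        /* Long mode is off: the L bit is not consulted in this model. */
        return cs_d ? "legacy protected mode, 32-bit default"
                    : "legacy protected mode, 16-bit default";
    }

    /* Long mode is active: CS.L picks the sub-mode. */
    if (cs_l)
        return "64-bit mode";

    /* Still in long mode, but running code with legacy operand sizes. */
    return cs_d ? "compatibility mode, 32-bit default"
                : "compatibility mode, 16-bit default";
}
```

In this model the existing mixed mode path corresponds to the `efer_lma == false` branch during the firmware call, while the proposal keeps `efer_lma == true` and only changes CS.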
* Re: EFI mixed mode + perf = rampant triple faults
From: Andy Lutomirski @ 2014-12-17 16:54 UTC
To: LKML, linux-efi@vger.kernel.org
Cc: Borislav Petkov

[trying again with .org spelled correctly. also cc: bpetkov]

On Wed, Dec 17, 2014 at 8:51 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> I figured I should send this email before I forget about this issue:
>
> If you run perf record across any EFI mixed mode call or otherwise
> receive an NMI or MCE, the machine triple-faults. The cause is
> straightforward: there is no valid IDT when we have long mode disabled
> for the duration of the EFI call.
>
> As far as I know, the only way to have continuously functional interrupt
> handling across a long mode transition is to install an interrupt vector
> table and hope that CPUs actually do something intelligent when
> receiving an interrupt with LME=1, LMA=1, and PG=0. Yuck.
>
> Could we get away with issuing 32-bit EFI calls in compat mode, i.e.
> with a 32-bit CPL0 CS but while still in long mode? I think that
> delivery of an IST interrupt (which includes both NMI and MCE) will
> correctly switch to a fully valid 64-bit state and would correctly
> switch back when we execute IRET at the end. (Am I missing some reason
> that switching bitness without a privilege level change doesn't work
> well? I haven't thought of anything, other than the lack of SS/SP controls
> on intra-ring interrupts, but that shouldn't be an issue here.)
>
> As an added benefit, this would considerably simplify the code.
>
> --Andy

* Re: EFI mixed mode + perf = rampant triple faults
From: Matt Fleming @ 2014-12-31 18:37 UTC
To: Andy Lutomirski
Cc: LKML, linux-efi@vger.kernel.org, Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, 17 Dec, at 08:54:56AM, Andy Lutomirski wrote:
> [trying again with .org spelled correctly. also cc: bpetkov]
>
> On Wed, Dec 17, 2014 at 8:51 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> > I figured I should send this email before I forget about this issue:
> >
> > If you run perf record across any EFI mixed mode call or otherwise
> > receive an NMI or MCE, the machine triple-faults. The cause is
> > straightforward: there is no valid IDT when we have long mode disabled
> > for the duration of the EFI call.

Right, the lack of IDT is intentional since we disable interrupts while
making the EFI call and so far I have side-stepped (ignored) the NMI/MCE
issue.

Perf is an interesting use case. I've admittedly never used it with EFI
mixed mode, but yes, we should definitely get that working (if NMI/MCE
handling wasn't justification enough).

> > As far as I know, the only way to have continuously functional interrupt
> > handling across a long mode transition is to install an interrupt vector
> > table and hope that CPUs actually do something intelligent when
> > receiving an interrupt with LME=1, LMA=1, and PG=0. Yuck.
> >
> > Could we get away with issuing 32-bit EFI calls in compat mode, i.e.
> > with a 32-bit CPL0 CS but while still in long mode? I think that
> > delivery of an IST interrupt (which includes both NMI and MCE) will
> > correctly switch to a fully valid 64-bit state and would correctly
> > switch back when we execute IRET at the end. (Am I missing some reason
> > that switching bitness without a privilege level change doesn't work
> > well? I haven't thought of anything, other than the lack of SS/SP controls
> > on intra-ring interrupts, but that shouldn't be an issue here.)
> >
> > As an added benefit, this would considerably simplify the code.

I can't immediately think of a reason that this wouldn't work, but I've
Cc'd more x86 folks for additional insight.

I will schedule some time to look into this issue in the new year.
Thanks Andy.

-- 
Matt Fleming, Intel Open Source Technology Center

* Re: EFI mixed mode + perf = rampant triple faults
From: Matt Fleming @ 2015-01-14 16:51 UTC
To: Andy Lutomirski
Cc: LKML, linux-efi@vger.kernel.org, Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, 31 Dec, at 06:37:39PM, Matt Fleming wrote:
> On Wed, 17 Dec, at 08:54:56AM, Andy Lutomirski wrote:
> > > As far as I know, the only way to have continuously functional interrupt
> > > handling across a long mode transition is to install an interrupt vector
> > > table and hope that CPUs actually do something intelligent when
> > > receiving an interrupt with LME=1, LMA=1, and PG=0. Yuck.
> > >
> > > Could we get away with issuing 32-bit EFI calls in compat mode, i.e.
> > > with a 32-bit CPL0 CS but while still in long mode? I think that
> > > delivery of an IST interrupt (which includes both NMI and MCE) will
> > > correctly switch to a fully valid 64-bit state and would correctly
> > > switch back when we execute IRET at the end. (Am I missing some reason
> > > that switching bitness without a privilege level change doesn't work
> > > well? I haven't thought of anything, other than the lack of SS/SP controls
> > > on intra-ring interrupts, but that shouldn't be an issue here.)
> > >
> > > As an added benefit, this would considerably simplify the code.
>
> I can't immediately think of a reason that this wouldn't work, but I've
> Cc'd more x86 folks for additional insight.
>
> I will schedule some time to look into this issue in the new year.
> Thanks Andy.

I finally got some time to look into this, and running with
__KERNEL32_CS seems to work fine at runtime both with Qemu + 32-bit OVMF
and on my ASUS T100. Manually triggering an MCE exception immediately
before invoking the firmware service recovers gracefully.

Where this won't work so well is at boot time before we jump to the
kernel proper. There, we still need to restore the firmware's GDT so
that interrupts are serviced correctly before ExitBootServices() (in
particular, ia32 Tianocore assumes __KERNEL_CS is a 32-bit CS).

Which means the code to handle mixed mode calls at boot time and runtime
has now diverged. Fixing that is probably just a SMOP to maximise code
reuse though.

I'll post a patch after some more testing.

-- 
Matt Fleming, Intel Open Source Technology Center

* Re: EFI mixed mode + perf = rampant triple faults
From: Andy Lutomirski @ 2015-01-14 18:27 UTC
To: Matt Fleming
Cc: LKML, linux-efi@vger.kernel.org, Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, Jan 14, 2015 at 8:51 AM, Matt Fleming <matt@console-pimps.org> wrote:
> On Wed, 31 Dec, at 06:37:39PM, Matt Fleming wrote:
>> On Wed, 17 Dec, at 08:54:56AM, Andy Lutomirski wrote:
>> > > As far as I know, the only way to have continuously functional interrupt
>> > > handling across a long mode transition is to install an interrupt vector
>> > > table and hope that CPUs actually do something intelligent when
>> > > receiving an interrupt with LME=1, LMA=1, and PG=0. Yuck.
>> > >
>> > > Could we get away with issuing 32-bit EFI calls in compat mode, i.e.
>> > > with a 32-bit CPL0 CS but while still in long mode? I think that
>> > > delivery of an IST interrupt (which includes both NMI and MCE) will
>> > > correctly switch to a fully valid 64-bit state and would correctly
>> > > switch back when we execute IRET at the end. (Am I missing some reason
>> > > that switching bitness without a privilege level change doesn't work
>> > > well? I haven't thought of anything, other than the lack of SS/SP controls
>> > > on intra-ring interrupts, but that shouldn't be an issue here.)
>> > >
>> > > As an added benefit, this would considerably simplify the code.
>>
>> I can't immediately think of a reason that this wouldn't work, but I've
>> Cc'd more x86 folks for additional insight.
>>
>> I will schedule some time to look into this issue in the new year.
>> Thanks Andy.
>
> I finally got some time to look into this, and running with
> __KERNEL32_CS seems to work fine at runtime both with Qemu + 32-bit OVMF
> and on my ASUS T100. Manually triggering an MCE exception immediately
> before invoking the firmware service recovers gracefully.

How are you manually triggering an MCE? I've been playing with some
MCE stuff recently, but the only reasonably reliable way I know of to
trigger an MCE is using WHEA, and I don't have a box with WHEA, and I
assume your ASUS T100 doesn't either.

>
> Where this won't work so well is at boot time before we jump to the
> kernel proper. There, we still need to restore the firmware's GDT so
> that interrupts are serviced correctly before ExitBootServices() (in
> particular, ia32 Tianocore assumes __KERNEL_CS is a 32-bit CS).

Tianocore makes assumptions about the kernel's GDT layout? Yuck.

--Andy

* Re: EFI mixed mode + perf = rampant triple faults
From: Borislav Petkov @ 2015-01-14 18:35 UTC
To: Andy Lutomirski
Cc: Matt Fleming, LKML, linux-efi@vger.kernel.org, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, Jan 14, 2015 at 10:27:47AM -0800, Andy Lutomirski wrote:
> How are you manually triggering an MCE? I've been playing with some
> MCE stuff recently, but the only reasonably reliable way I know of to
> trigger an MCE is using WHEA, and I don't have a box with WHEA, and I
> assume your ASUS T100 doesn't either.

	asm volatile("int $18");

> Tianocore makes assumptions about the kernel's GDT layout? Yuck.

Yuck indeed.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: EFI mixed mode + perf = rampant triple faults
From: Andy Lutomirski @ 2015-01-14 18:38 UTC
To: Borislav Petkov
Cc: Matt Fleming, LKML, linux-efi@vger.kernel.org, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, Jan 14, 2015 at 10:35 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jan 14, 2015 at 10:27:47AM -0800, Andy Lutomirski wrote:
>> How are you manually triggering an MCE? I've been playing with some
>> MCE stuff recently, but the only reasonably reliable way I know of to
>> trigger an MCE is using WHEA, and I don't have a box with WHEA, and I
>> assume your ASUS T100 doesn't either.
>
> asm volatile("int $18");

That's not a real MCE, though -- it happens synchronously instead of
at MCE priority with all the associated messiness. Or is the idea to
just stick that in after switching to the 32-bit mode being tested?

--Andy

>
>> Tianocore makes assumptions about the kernel's GDT layout? Yuck.
>
> Yuck indeed.
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --

-- 
Andy Lutomirski
AMA Capital Management, LLC

* Re: EFI mixed mode + perf = rampant triple faults
From: Borislav Petkov @ 2015-01-14 18:47 UTC
To: Andy Lutomirski
Cc: Matt Fleming, LKML, linux-efi@vger.kernel.org, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, Jan 14, 2015 at 10:38:25AM -0800, Andy Lutomirski wrote:
> That's not a real MCE, though -- it happens synchronously instead of

MCE can be synchronous in a sense too, as a result of executing an insn,
for example, i.e., EIPV bit set.

> at MCE priority with all the associated messiness. Or is the idea to
> just stick that in after switching to the 32-bit mode being tested?

Yes, the idea was to simply trigger a real exception after switching to
32-bit while still with 64-bit IDT handlers.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

* Re: EFI mixed mode + perf = rampant triple faults
From: Andy Lutomirski @ 2015-01-14 18:49 UTC
To: Borislav Petkov
Cc: Matt Fleming, LKML, linux-efi@vger.kernel.org, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, Jan 14, 2015 at 10:47 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Jan 14, 2015 at 10:38:25AM -0800, Andy Lutomirski wrote:
>> That's not a real MCE, though -- it happens synchronously instead of
>
> MCE can be synchronous in a sense too, as a result of executing an insn,
> for example, i.e., EIPV bit set.
>
>> at MCE priority with all the associated messiness. Or is the idea to
>> just stick that in after switching to the 32-bit mode being tested?
>
> Yes, the idea was to simply trigger a real exception after switching to
> 32-bit while still with 64-bit IDT handlers.
>

Seems like a reasonable test to me.

--Andy

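For readers unfamiliar with the EIPV bit mentioned in the exchange above, here is a minimal C sketch of how the low bits of the machine-check global status register (IA32_MCG_STATUS) are laid out. The bit positions follow the architectural definition; the helper name `mce_tied_to_instruction` is hypothetical and the sketch only decodes a value passed in, it does not read the MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Low bits of IA32_MCG_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0) /* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1) /* error IP valid */
#define MCG_STATUS_MCIP (1ULL << 2) /* machine check in progress */

/*
 * Hypothetical helper: true when the saved instruction pointer is
 * directly associated with the error, which is the "synchronous in a
 * sense" case referred to in the thread.
 */
static bool mce_tied_to_instruction(uint64_t mcg_status)
{
    return (mcg_status & MCG_STATUS_EIPV) != 0;
}
```

A software `int $18` sets none of these bits by itself; as the thread notes, it only exercises the vector 18 handler path.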
* Re: EFI mixed mode + perf = rampant triple faults
From: Matt Fleming @ 2015-01-15 19:41 UTC
To: Andy Lutomirski
Cc: LKML, linux-efi@vger.kernel.org, Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Wed, 14 Jan, at 10:27:47AM, Andy Lutomirski wrote:
>
> How are you manually triggering an MCE? I've been playing with some
> MCE stuff recently, but the only reasonably reliable way I know of to
> trigger an MCE is using WHEA, and I don't have a box with WHEA, and I
> assume your ASUS T100 doesn't either.

As Borislav mentions, I used 'int $18', solely to trigger the 64-bit
exception handler code paths in the middle of the EFI mixed mode code.

> > Where this won't work so well is at boot time before we jump to the
> > kernel proper. There, we still need to restore the firmware's GDT so
> > that interrupts are serviced correctly before ExitBootServices() (in
> > particular, ia32 Tianocore assumes __KERNEL_CS is a 32-bit CS).
>
> Tianocore makes assumptions about the kernel's GDT layout? Yuck.

No, but 32-bit Tianocore does rely on the second GDT entry being a
32-bit CS.

It has no knowledge of Linux's GDT layout.

-- 
Matt Fleming, Intel Open Source Technology Center

* Re: EFI mixed mode + perf = rampant triple faults
From: H. Peter Anvin @ 2015-01-15 19:59 UTC
To: Matt Fleming, Andy Lutomirski
Cc: LKML, linux-efi@vger.kernel.org, Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On 01/15/2015 11:41 AM, Matt Fleming wrote:
>>
>> Tianocore makes assumptions about the kernel's GDT layout? Yuck.
>
> No, but 32-bit Tianocore does rely on the second GDT entry being a
> 32-bit CS.
>
> It has no knowledge of Linux's GDT layout.
>

If it assumes that descriptor 16 is a 32-bit CS (and what about data?
24?) that *is* making assumptions on the kernel.

	-hpa

* Re: EFI mixed mode + perf = rampant triple faults
From: Matt Fleming @ 2015-01-15 22:21 UTC
To: H. Peter Anvin
Cc: Andy Lutomirski, LKML, linux-efi@vger.kernel.org, Borislav Petkov, Thomas Gleixner, Ingo Molnar, Peter Zijlstra

On Thu, 15 Jan, at 11:59:42AM, H. Peter Anvin wrote:
> On 01/15/2015 11:41 AM, Matt Fleming wrote:
> >>
> >>Tianocore makes assumptions about the kernel's GDT layout? Yuck.
> >
> >No, but 32-bit Tianocore does rely on the second GDT entry being a
> >32-bit CS.
> >
> >It has no knowledge of Linux's GDT layout.
> >
>
> If it assumes that descriptor 16 is a 32-bit CS (and what about
> data? 24?) that *is* making assumptions on the kernel.

Bear in mind that this is before ExitBootServices() is invoked, so where
the firmware still thinks it (not the OS) "owns" the platform.

None of this comes into play at Runtime.

-- 
Matt Fleming, Intel Open Source Technology Center
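The "descriptor 16" and "second GDT entry" wording in the last few messages can be related with a small C sketch. It is illustrative only and not taken from the thread: the helper names are hypothetical, and the two descriptor values used in the usage note are typical flat code-segment encodings chosen as examples. A selector's upper 13 bits index the GDT, so selector 16 is index 2 (presumably the "second" entry when the null descriptor at index 0 is not counted), and the L and D bits of the descriptor at that index decide whether code loaded through it runs as 64-bit or 32-bit.

```c
#include <stdint.h>

enum cs_kind { CS_16BIT, CS_32BIT, CS_64BIT };

/* A selector is index << 3 | table indicator << 2 | RPL. */
static unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}

/*
 * Classify a code-segment descriptor by its L (bit 53) and D (bit 54)
 * flags. Hypothetical helper; assumes the descriptor really is a
 * present code segment and is being interpreted with long mode active.
 */
static enum cs_kind code_segment_kind(uint64_t descriptor)
{
    int l = (int)((descriptor >> 53) & 1);
    int d = (int)((descriptor >> 54) & 1);

    if (l)
        return CS_64BIT;
    return d ? CS_32BIT : CS_16BIT;
}
```

With these helpers, `selector_index(16)` is 2 and `selector_index(24)` is 3, while the example encodings `0x00cf9b000000ffff` and `0x00af9b000000ffff` classify as a 32-bit and a 64-bit code segment respectively. That is the mismatch described above: before ExitBootServices() the firmware expects a 32-bit code segment at the index where the kernel's GDT holds a 64-bit one.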