public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Laszlo Ersek <lersek@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: edk2-devel@lists.sourceforge.net,
	KVM devel mailing list <kvm@vger.kernel.org>
Subject: Re: [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline
Date: Sun, 08 Dec 2013 23:15:36 +0100	[thread overview]
Message-ID: <52A4EF88.9030005@redhat.com> (raw)
In-Reply-To: <52A4AFBE.4080407@redhat.com>

On 12/08/13 18:43, Laszlo Ersek wrote:
> On 12/06/13 13:03, Paolo Bonzini wrote:

>> Thanks to your binary I now reproduced the issue and it looks like the
>> 64-bit->16-bit switch works:
> 
> Thank you for spending (apparently more than a little) time on this!
> 
>>
>>  qemu-system-x86-4081  [001] 62650.335040: kvm_exit:             reason CR_ACCESS rip 0x3cf7ae45 info 0 0
>>  qemu-system-x86-4081  [001] 62650.335041: kvm_cr:               cr_write 0 = 0x32
>>  qemu-system-x86-4081  [001] 62650.335046: kvm_entry:            vcpu 0
>>
>> 	This is the "mov %rax, %cr0". PE and PG are turned off.
> 
> I'm surprised by this result. The instruction you refer to is below
> "_AsmTransferControl_al_0000" (in the original, unpatched code).
> 
> I had earlier added an infinite loop right below that label (a different
> loop than my xxxx debug loop), and it was *never* reached in my test.
> That is, from the lret that I reported as problematic, to the
> instruction you refer to, the CPU would have had to cross (and finish)
> the infinite loop that I had added earlier. And that never happened in
> my test.
> 
> I had added that loop at "_AsmTransferControl_al_0000" immediately
> precisely because I wanted to see if the label is reached and the
> problem is with something below that label, or with the first lret. I
> sent my email to the KVM list after I had isolated the problem to the
> first LRET:
> 
> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5297/focus=5325
> 
> On 12/04/13 19:05, Laszlo Ersek wrote:
>> I tested if the (intended) target location of the LRET is reached, and
>> it is not. (It's easy to test by adding a small infinite loop, moving
>> it around, and seeing if the VM is spinning with or without producing
>> a bunch of output on the debug port.) It's *really* that
>> internally-targeted LRET that causes a reboot. [...]

First, you're right and I'm wrong. I can see the triple fault in my kvm
trace as well.

Second... This is incredible. Guess what the following patch does.

 ASM_GLOBAL ASM_PFX(AsmTransferControl)
 ASM_PFX(AsmTransferControl):
     # rcx S3WakingVector    :DWORD
     # rdx AcpiLowMemoryBase :DWORD
     lea   _AsmTransferControl_al_0000(%rip), %eax
     movq  $0x2800000000, %r8
     orq   %r8, %rax
     pushq %rax
     shrd  $20, %ecx, %ebx
     andl  $0x0f, %ecx
     movw  %cx, %bx
     movl  %ebx, jmp_addr(%rip)
     lret
 _AsmTransferControl_al_0000:
     .byte    0x0b8, 0x30, 0      # mov ax, 30h as selector
     movl  %eax, %ds
     movl  %eax, %es
     movl  %eax, %fs
     movl  %eax, %gs
     movl  %eax, %ss
+.code16
+    movw $0x402, %dx
+    movb $0x58, %al
+qqq:
+    outb %al, %dx
+    jmp qqq
+    movb $0x59, %al
+    outb %al, %dx
+.code64
     movq  %cr0, %rax
     movq  %cr4, %rbx

The infinite loop that I used to "pin down" the first instruction that
broke around lret -- it was useless.

The above patch should produce an infinite string of "X" characters on
the qemu debug port. "After" the inifnite string, it should produce a
"Y" character.

In practice, I'm seeing *one* "X" character, and then a triple fault.

The "jmp qqq" instruction *itself* causes a triple fault (a guest
reboot), masking the problem that I was looking for.

When I had such an empty infinite loop (at that time, still without the
outb) right before the lret, it worked. As soon as I moved it below
_AsmTransferControl_al_0000 (ie. under the target of the lret), the jmp
*itself* caused a triple fault (guest reboot), causing me to think that
the *lret* caused the reboot. Because, the tight jmp loop would "always"
work, and if I'm seeing a reboot loop instead of tight spinning right
after the lret, that means that the lret caused the reboot, right? Right?

This is the binary:

0000000000000037 <qqq>:
  37:   ee                      out    %al,(%dx)
  38:   eb fd                   jmp    37 <qqq>

Opcode  Instruction Op/ 64-Bit Compat/   Description
                    En  Mode   Leg Mode
EB cb   JMP rel8    A   Valid  Valid     Jump short, RIP = RIP + 8-bit
                                         displacement sign extended
                                         to 64-bits

My belief in the sanity of x86 has been shattered to its core. I don't
feel stupid. I feel cheated.

Thank you for your help.
Laszlo


------------------------------------------------------------------------------
Sponsored by Intel(R) XDK 
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk

  reply	other threads:[~2013-12-08 22:15 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-05 16:12 [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline Laszlo Ersek
2013-12-05 16:50 ` Laszlo Ersek
2013-12-05 17:42 ` Paolo Bonzini
2013-12-05 18:29   ` Laszlo Ersek
2013-12-06 12:03     ` Paolo Bonzini
2013-12-06 13:31       ` Paolo Bonzini
2013-12-06 13:46         ` Yao, Jiewen
2013-12-06 14:29           ` Paolo Bonzini
2013-12-06 14:47             ` Yao, Jiewen
2013-12-06 14:51               ` Paolo Bonzini
2013-12-06 13:31       ` Yao, Jiewen
2013-12-08 17:43       ` Laszlo Ersek
2013-12-08 22:15         ` Laszlo Ersek [this message]
2013-12-05 22:38   ` Laszlo Ersek
2013-12-05 22:53     ` Andrew Fish
2013-12-07 16:25     ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52A4EF88.9030005@redhat.com \
    --to=lersek@redhat.com \
    --cc=edk2-devel@lists.sourceforge.net \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox