All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laszlo Ersek <lersek@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: edk2-devel@lists.sourceforge.net,
	KVM devel mailing list <kvm@vger.kernel.org>
Subject: Re: [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline
Date: Sun, 08 Dec 2013 23:15:36 +0100	[thread overview]
Message-ID: <52A4EF88.9030005@redhat.com> (raw)
In-Reply-To: <52A4AFBE.4080407@redhat.com>

On 12/08/13 18:43, Laszlo Ersek wrote:
> On 12/06/13 13:03, Paolo Bonzini wrote:

>> Thanks to your binary I now reproduced the issue and it looks like the
>> 64-bit->16-bit switch works:
> 
> Thank you for spending (apparently more than a little) time on this!
> 
>>
>>  qemu-system-x86-4081  [001] 62650.335040: kvm_exit:             reason CR_ACCESS rip 0x3cf7ae45 info 0 0
>>  qemu-system-x86-4081  [001] 62650.335041: kvm_cr:               cr_write 0 = 0x32
>>  qemu-system-x86-4081  [001] 62650.335046: kvm_entry:            vcpu 0
>>
>> 	This is the "mov %rax, %cr0". PE and PG are turned off.
> 
> I'm surprised by this result. The instruction you refer to is below
> "_AsmTransferControl_al_0000" (in the original, unpatched code).
> 
> I had earlier added an infinite loop right below that label (a different
> loop than my xxxx debug loop), and it was *never* reached in my test.
> That is, from the lret that I reported as problematic, to the
> instruction you refer to, the CPU would have had to cross (and finish)
> the infinite loop that I had added earlier. And that never happened in
> my test.
> 
> I had added that loop at "_AsmTransferControl_al_0000" immediately
> precisely because I wanted to see if the label is reached and the
> problem is with something below that label, or with the first lret. I
> sent my email to the KVM list after I had isolated the problem to the
> first LRET:
> 
> http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5297/focus=5325
> 
> On 12/04/13 19:05, Laszlo Ersek wrote:
>> I tested if the (intended) target location of the LRET is reached, and
>> it is not. (It's easy to test by adding a small infinite loop, moving
>> it around, and seeing if the VM is spinning with or without producing
>> a bunch of output on the debug port.) It's *really* that
>> internally-targeted LRET that causes a reboot. [...]

First, you're right and I'm wrong. I can see the triple fault in my kvm
trace as well.

Second... This is incredible. Guess what the following patch does.

 ASM_GLOBAL ASM_PFX(AsmTransferControl)
 ASM_PFX(AsmTransferControl):
     # rcx S3WakingVector    :DWORD
     # rdx AcpiLowMemoryBase :DWORD
     lea   _AsmTransferControl_al_0000(%rip), %eax
     movq  $0x2800000000, %r8
     orq   %r8, %rax
     pushq %rax
     shrd  $20, %ecx, %ebx
     andl  $0x0f, %ecx
     movw  %cx, %bx
     movl  %ebx, jmp_addr(%rip)
     lret
 _AsmTransferControl_al_0000:
     .byte    0x0b8, 0x30, 0      # mov ax, 30h as selector
     movl  %eax, %ds
     movl  %eax, %es
     movl  %eax, %fs
     movl  %eax, %gs
     movl  %eax, %ss
+.code16
+    movw $0x402, %dx
+    movb $0x58, %al
+qqq:
+    outb %al, %dx
+    jmp qqq
+    movb $0x59, %al
+    outb %al, %dx
+.code64
     movq  %cr0, %rax
     movq  %cr4, %rbx

The infinite loop that I used to "pin down" the first instruction that
broke around lret -- it was useless.

The above patch should produce an infinite string of "X" characters on
the qemu debug port. "After" the inifnite string, it should produce a
"Y" character.

In practice, I'm seeing *one* "X" character, and then a triple fault.

The "jmp qqq" instruction *itself* causes a triple fault (a guest
reboot), masking the problem that I was looking for.

When I had such an empty infinite loop (at that time, still without the
outb) right before the lret, it worked. As soon as I moved it below
_AsmTransferControl_al_0000 (ie. under the target of the lret), the jmp
*itself* caused a triple fault (guest reboot), causing me to think that
the *lret* caused the reboot. Because, the tight jmp loop would "always"
work, and if I'm seeing a reboot loop instead of tight spinning right
after the lret, that means that the lret caused the reboot, right? Right?

This is the binary:

0000000000000037 <qqq>:
  37:   ee                      out    %al,(%dx)
  38:   eb fd                   jmp    37 <qqq>

Opcode  Instruction Op/ 64-Bit Compat/   Description
                    En  Mode   Leg Mode
EB cb   JMP rel8    A   Valid  Valid     Jump short, RIP = RIP + 8-bit
                                         displacement sign extended
                                         to 64-bits

My belief in the sanity of x86 has been shattered to its core. I don't
feel stupid. I feel cheated.

Thank you for your help.
Laszlo


------------------------------------------------------------------------------
Sponsored by Intel(R) XDK 
Develop, test and display web and hybrid apps with a single code base.
Download it for free now!
http://pubads.g.doubleclick.net/gampad/clk?id=111408631&iu=/4140/ostg.clktrk

  reply	other threads:[~2013-12-08 22:15 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-05 16:12 [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline Laszlo Ersek
2013-12-05 16:50 ` Laszlo Ersek
2013-12-05 17:42 ` Paolo Bonzini
2013-12-05 18:29   ` Laszlo Ersek
2013-12-06 12:03     ` Paolo Bonzini
2013-12-06 13:31       ` Paolo Bonzini
2013-12-06 13:46         ` Yao, Jiewen
2013-12-06 14:29           ` Paolo Bonzini
2013-12-06 14:47             ` Yao, Jiewen
2013-12-06 14:51               ` Paolo Bonzini
2013-12-06 13:31       ` Yao, Jiewen
2013-12-08 17:43       ` Laszlo Ersek
2013-12-08 22:15         ` Laszlo Ersek [this message]
2013-12-05 22:38   ` Laszlo Ersek
2013-12-05 22:53     ` Andrew Fish
2013-12-07 16:25     ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52A4EF88.9030005@redhat.com \
    --to=lersek@redhat.com \
    --cc=edk2-devel@lists.sourceforge.net \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.