From: Laszlo Ersek <lersek@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: edk2-devel@lists.sourceforge.net,
	KVM devel mailing list <kvm@vger.kernel.org>
Subject: Re: [edk2] apparent KVM problem with LRET in TianoCore S3 resume trampoline
Date: Sun, 08 Dec 2013 18:43:26 +0100
Message-ID: <52A4AFBE.4080407@redhat.com>
In-Reply-To: <52A1BCFE.4020100@redhat.com>

On 12/06/13 13:03, Paolo Bonzini wrote:
> On 05/12/2013 19:29, Laszlo Ersek wrote:
>> On 12/05/13 18:42, Paolo Bonzini wrote:
>>> On 05/12/2013 17:12, Laszlo Ersek wrote:
>>>> Hi,
>>>>
>>>> I'm working on S3 suspend/resume in OVMF. The problem is that I'm getting an
>>>> unexpected guest reboot for code (LRET) that works on physical hardware. I
>>>> tried to trace the problem with ftrace, but I didn't get any mentions of
>>>> em_ret_far(). (Maybe I was looking in the wrong place.)
>>>
>>> What does ftrace say anyway?
>>
>> (pls. see in the next msg I sent)
> 
> Actually I meant the ftrace without any patches.
> 
> Thanks to your binary I now reproduced the issue and it looks like the
> 64-bit->16-bit switch works:

Thank you for spending (apparently more than a little) time on this!

> 
>  qemu-system-x86-4081  [001] 62650.335040: kvm_exit:             reason CR_ACCESS rip 0x3cf7ae45 info 0 0
>  qemu-system-x86-4081  [001] 62650.335041: kvm_cr:               cr_write 0 = 0x32
>  qemu-system-x86-4081  [001] 62650.335046: kvm_entry:            vcpu 0
> 
> 	This is the "mov %rax, %cr0". PE and PG are turned off.

I'm surprised by this result. The instruction you refer to is below
"_AsmTransferControl_al_0000" (in the original, unpatched code).

I had earlier added an infinite loop right below that label (a different
loop than my xxxx debug loop), and it was *never* reached in my test.
That is, from the lret that I reported as problematic, to the
instruction you refer to, the CPU would have had to cross (and finish)
the infinite loop that I had added earlier. And that never happened in
my test.

I had added that loop immediately at "_AsmTransferControl_al_0000"
precisely because I wanted to see whether the label is reached and the
problem lies with something below it, or with the first lret. I sent my
email to the KVM list only after I had isolated the problem to the
first LRET:

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5297/focus=5325

On 12/04/13 19:05, Laszlo Ersek wrote:
> I tested if the (intended) target location of the LRET is reached, and
> it is not. (It's easy to test by adding a small infinite loop, moving
> it around, and seeing if the VM is spinning with or without producing
> a bunch of output on the debug port.) It's *really* that
> internally-targeted LRET that causes a reboot. [...]

I have absolutely no clue why this code executes for you and doesn't for
me :) What guest RAM size did you test with?

>  qemu-system-x86-4081  [001] 62650.335047: kvm_exit:             reason MSR_READ rip 0x3cf7ae4e info 0 0
>  qemu-system-x86-4081  [001] 62650.335048: kvm_msr:              msr_read c0000080 = 0x100
>  qemu-system-x86-4081  [001] 62650.335048: kvm_entry:            vcpu 0
>  qemu-system-x86-4081  [001] 62650.335048: kvm_exit:             reason MSR_WRITE rip 0x3cf7ae53 info 0 0
>  qemu-system-x86-4081  [001] 62650.335049: kvm_msr:              msr_write c0000080 = 0x0
>  qemu-system-x86-4081  [001] 62650.335050: kvm_entry:            vcpu 0
> 
> 	LME is turned off.
> 
>  qemu-system-x86-4081  [001] 62650.335050: kvm_exit:             reason CR_ACCESS rip 0x3cf7ae55 info 304 0
>  qemu-system-x86-4081  [001] 62650.335050: kvm_cr:               cr_write 4 = 0x640
>  qemu-system-x86-4081  [001] 62650.335053: kvm_entry:            vcpu 0
> 
> 	PAE is turned off.
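
The three register values in the trace so far can be decoded bit by bit
to cross-check the quoted remarks. This is only a minimal sketch: the
bit tables follow the architectural CR0 / CR4 / EFER layouts, while the
helper name decode() is made up for illustration and is not part of any
KVM or ftrace tooling.

```python
# Architectural bit positions for the registers seen in the trace.
CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}
CR4_BITS = {0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
            6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT"}
EFER_BITS = {0: "SCE", 8: "LME", 10: "LMA", 11: "NXE"}


def decode(value, bits):
    """Return the names of the bits that are set in value (hypothetical helper)."""
    return [name for bit, name in sorted(bits.items()) if (value >> bit) & 1]


# Values copied from the kvm_cr / kvm_msr trace lines.
cr0_flags = decode(0x32, CR0_BITS)         # cr_write 0 = 0x32
efer_before = decode(0x100, EFER_BITS)     # msr_read  c0000080 = 0x100
efer_after = decode(0x0, EFER_BITS)        # msr_write c0000080 = 0x0
cr4_flags = decode(0x640, CR4_BITS)        # cr_write 4 = 0x640

print("CR0 :", cr0_flags)
print("EFER:", efer_before, "->", efer_after)
print("CR4 :", cr4_flags)
```

Neither PE nor PG shows up in the CR0 list, LME disappears from EFER,
and PAE is absent from CR4, which matches the three comments in the
quoted trace.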
> 
>  qemu-system-x86-4081  [001] 62650.335054: kvm_exit:             reason CR_ACCESS rip 0x11e6 info 0 0
>  qemu-system-x86-4081  [001] 62650.335054: kvm_cr:               cr_write 0 = 0x33
>  qemu-system-x86-4081  [001] 62650.335054: kvm_entry:            vcpu 0
> 
> 	Here we're already in real mode.  The weird RIP is explained by
> 	the first few bytes after the FACS resume vector:

From this point on you were debugging the Linux wakeup code, in
"arch/x86/realmode/rm/wakeup_asm.S". I think.

> 
> 		0x9a1d:0000:  cli    
> 		0x9a1d:0001:  cld    
> 		0x9a1d:0002:  ljmp   $9900,$11d7

ENTRY(wakeup_start)
        cli
        cld

        LJMPW_RM(3f)
3:
        /* Apparently some dimwit BIOS programmers don't know how to
           program a PM to RM transition, and we might end up here with
           junk in the data segment descriptor registers.  The only way
           to repair that is to go into PM and fix it ourselves... */
[...]

From Linux kernel commit 4b4f7280.
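
The "weird RIP" in the trace can be reconciled with the disassembly
using plain real-mode arithmetic (linear = segment * 16 + offset). The
sketch below assumes that the numbers in the disassembly are
hexadecimal and that the far jump uses the 5-byte 16-bit encoding
(opcode, 16-bit offset, 16-bit segment); the function name linear() is
only illustrative.

```python
def linear(segment, offset):
    """Real-mode linear address for segment:offset (illustrative helper)."""
    return (segment << 4) + offset


FAR_JMP_LEN = 5  # assumed: 0xEA opcode + 16-bit offset + 16-bit segment

# Address of the ljmp as shown in the disassembly: 0x9a1d:0002.
ljmp_addr = linear(0x9A1D, 0x0002)

# Instruction that follows the ljmp, i.e. the "3:" label in wakeup_asm.S.
fallthrough = ljmp_addr + FAR_JMP_LEN

# Target of "ljmp $9900,$11d7".
target = linear(0x9900, 0x11D7)

# RIP 0x11e6 from the kvm_exit line, interpreted with CS = 0x9900.
trapped = linear(0x9900, 0x11E6)

print(hex(fallthrough), hex(target), hex(trapped))
```

Under those assumptions the jump target and the fall-through address
are the same byte, so the ljmp only renormalizes CS from 0x9a1d to
0x9900, and the trapped RIP 0x11e6 lies 0x16 bytes past the resume
vector.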

> The page tables are, ahem, crap:
> 
> 000c000: 6750 fe01 0000 0000 0000 0000 0000 0000  gP..............
> 000c010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000c0f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 
> This is 0x9c000.  Strikes any bell?

We're wildly corrupting OS memory during OVMF S3 resume. That's a known
problem and the next stage for me to figure out (with Jordan's help
hopefully):

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5297/focus=5321
http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5297/focus=5325
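
For reference, the only non-zero entry in the quoted dump can be
decoded as follows. This is a sketch under two assumptions: that the
dump is in xxd byte order (so "6750 fe01" means the bytes 67 50 fe 01),
and that the first eight bytes form one little-endian 64-bit paging
entry. The flag names follow the generic x86-64 entry layout; some of
these bits are ignored or mean something else at the upper levels, and
decode_entry() is a made-up helper.

```python
import struct

# First 8 bytes of the dump at 0x9c000, in memory order.
raw = bytes.fromhex("6750fe0100000000")
(entry,) = struct.unpack("<Q", raw)

# Low flag bits of a 64-bit paging entry (generic layout).
FLAGS = {0: "P", 1: "RW", 2: "US", 3: "PWT", 4: "PCD", 5: "A", 6: "D", 7: "PS"}
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12..51 hold the physical address


def decode_entry(value):
    """Split a paging entry into (physical address, flag names)."""
    flags = [name for bit, name in sorted(FLAGS.items()) if (value >> bit) & 1]
    return value & ADDR_MASK, flags


address, flags = decode_entry(entry)
print(hex(entry), hex(address), flags)
```

Every other entry in the quoted 256 bytes is zero, i.e. not present.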

So, your tracing reached / debugged code that I had never ever reached.
And my report was precisely about not reaching it. Once we reach it,
it's expected to blow up, but first I wanted to get there.

Again, the 64-bit->16-bit switch (in the original, unpatched edk2/OVMF
code) never worked for me.

I think I did find the reason for that though, please see

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/5343/focus=5365

especially the last patch attached to it.

The likely reason for the failure I was seeing is that the 16-bit code
had been relocated to way above 1MB and could not be addressed with the
16-bit CS:IP notation at all.
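
The limit can be made concrete with a small sketch. With 16-bit CS and
IP, the highest address that can be formed is 0xFFFF:0xFFFF, which is
0x10FFEF (and that only with A20 enabled). The helper names and the
sample high address below are hypothetical and not taken from the edk2
code.

```python
REAL_MODE_MAX = (0xFFFF << 4) + 0xFFFF  # 0x10FFEF, highest CS:IP target


def reachable(linear):
    """True if some 16-bit CS:IP pair can name this linear address."""
    return 0 <= linear <= REAL_MODE_MAX


def to_cs_ip(linear):
    """Return one (CS, IP) pair for linear, or raise if none exists."""
    if not reachable(linear):
        raise ValueError(f"{linear:#x} is beyond 16-bit CS:IP reach")
    segment = min(linear >> 4, 0xFFFF)
    return segment, linear - (segment << 4)


low = 0x9A1D7             # an address below 1MB, taken from the trace above
high = 0x3CF70000         # hypothetical relocation target far above 1MB

print(hex(REAL_MODE_MAX), to_cs_ip(low), reachable(high))
```

Any trampoline relocated to an address like the hypothetical one above
has no valid CS:IP encoding, so a far return into it in 16-bit mode
cannot land where intended.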

Thanks!
Laszlo

