public inbox for kvm@vger.kernel.org
* Guest memory backed by PCI BAR (x86)
@ 2015-03-25 15:56 Nate Case
  2015-03-26 14:02 ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Nate Case @ 2015-03-25 15:56 UTC (permalink / raw)
  To: kvm

Hello,

I have an unusual goal of presenting SDRAM located on a real PCIe device
(exposed via BAR) to a guest as normal memory.  Eventually I'd like to
split up guest memory between PCI memory and host memory as different
NUMA nodes to optimize performance, but for now I'm focusing on just
getting the guest to run entirely out of memory behind PCIe, ignoring
the awful performance.

My first attempt was to modify qemu's "-mem-path" parameter to also
support mmap()able files in addition to hugetlbfs paths.  This was
fairly straightforward, and using a tmpfs file (backed by host SDRAM)
as guest RAM appears to work fine.

I was hoping I could then use a PCI sysfs resource file instead of a
tmpfs file (i.e., /sys/bus/pci/devices/dddd:bb:ss.f/resourceN) to
achieve the desired effect.  But I haven't been able to get Linux or
memtest86+ to boot with this arrangement.  It only boots when KVM
acceleration is disabled.
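
For readers following along, the mapping step itself can be sketched as below.  This is a minimal illustration in Python, not the actual QEMU change: map_backing_file() and demo() are hypothetical names, and a temporary file stands in for the tmpfs file.  A PCI sysfs resourceN file would be opened the same way, though its size is fixed by the BAR, so the truncate step would not apply.

```python
import mmap
import os
import tempfile


def map_backing_file(path, size):
    """Map `size` bytes of `path` read/write and shared.

    Hypothetical helper: a shared mapping is what lets writes reach the
    backing object instead of a private copy.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        # Python's mmap duplicates the descriptor, so closing ours is safe.
        os.close(fd)


def demo():
    """Write a pattern through the mapping and read it back from the file."""
    size = 4 * mmap.PAGESIZE
    with tempfile.NamedTemporaryFile() as f:
        f.truncate(size)          # give the stand-in file a fixed size
        f.flush()
        mem = map_backing_file(f.name, size)
        mem[0:4] = b"\x55\xaa\x55\xaa"
        mem.flush()               # push the dirty page to the file
        with open(f.name, "rb") as check:
            data = check.read(4)
        mem.close()
        return data


if __name__ == "__main__":
    print(demo().hex())
```

The sketch only shows that the mapping is usable from the host side; it says nothing about how the guest will see the same pages under KVM.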

When KVM acceleration is enabled, SeaBIOS seems to function fine
running out of PCI memory space, but the guest resets while booting
the OS.
Specifically, the following happens (I'll stick with the memtest86+
5.01 test case for simplicity):

setup.S of memtest:

---[snip]---
  /*
   * Note that the short jump isn't strictly needed, although there are
   * reasons why it might be a good idea. It won't hurt in any case.
   */
          movw    $0x0001, %ax    # protected mode (PE) bit
          lmsw    %ax             # This is it#
          jmp     flush_instr
  flush_instr:
          movw    $KERNEL_DS, %ax
          movw    %ax, %ds
          movw    %ax, %es
          movw    %ax, %ss    <---- broken here
          movw    %ax, %fs
          movw    %ax, %gs
---[snip]---

In this portion, which switches to protected mode, executing the
"movw %ax, %ss" instruction resets the guest.  gdb shows that after
stepping through this instruction, CS:IP points to F000:E05B
(entry_post within SeaBIOS).

I'm new to KVM, so I'm hoping someone could provide some guidance on
what might be wrong, what you'd recommend to get this working, or how
I might debug this better.

I'm using qemu 2.2.0 and an EL6 kernel (2.6.32-358.23.2.el6.x86_64).

Thanks,

Nate


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-25 15:56 Guest memory backed by PCI BAR (x86) Nate Case
@ 2015-03-26 14:02 ` Paolo Bonzini
  2015-03-26 16:01   ` Nate Case
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-03-26 14:02 UTC (permalink / raw)
  To: Nate Case, kvm



On 25/03/2015 16:56, Nate Case wrote:
> I was hoping I could then use a PCI sysfs resource file instead of a
> tmpfs file (i.e., /sys/bus/pci/devices/dddd:bb:ss.f/resourceN) to
> achieve the desired effect.  But I haven't been able to get Linux or
> memtest86+ to boot with this arrangement.  It only boots when KVM
> acceleration is disabled.
> 
> When KVM acceleration is enabled, SeaBIOS seems to function fine
> running out of PCI memory space, but booting the OS resets.
> Specifically, the following happens (I'll stick with the memtest86+
> 5.01 test case for simplicity):

Hi,

please include a trace file of the failure, obtained using "trace-cmd
record -e kvm/* -e kvmmmu/*".

Thanks,

Paolo


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 14:02 ` Paolo Bonzini
@ 2015-03-26 16:01   ` Nate Case
  2015-03-26 16:07     ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Nate Case @ 2015-03-26 16:01 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

> > When KVM acceleration is enabled, SeaBIOS seems to function fine
> > running out of PCI memory space, but booting the OS resets.
> > Specifically, the following happens (I'll stick with the memtest86+
> > 5.01 test case for simplicity):
> 
> 
> please include a trace file of the failure, obtained using "trace-cmd
> record -e kvm/* -e kvmmmu/*".

Paolo,

The trace file is available here:

     http://oss.xes-inc.com/xtmp/trace-pcimem-memtest86-reset.dat.gz

Thanks,

Nate


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 16:01   ` Nate Case
@ 2015-03-26 16:07     ` Paolo Bonzini
  2015-03-26 16:34       ` Nate Case
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-03-26 16:07 UTC (permalink / raw)
  To: Nate Case; +Cc: kvm



On 26/03/2015 17:01, Nate Case wrote:
>>> When KVM acceleration is enabled, SeaBIOS seems to function fine
>>> running out of PCI memory space, but booting the OS resets.
>>> Specifically, the following happens (I'll stick with the memtest86+
>>> 5.01 test case for simplicity):
>>
>>
>> please include a trace file of the failure, obtained using "trace-cmd
>> record -e kvm/* -e kvmmmu/*".
> 
> Paolo,
> 
> The trace file is available here:
> 
>      http://oss.xes-inc.com/xtmp/trace-pcimem-memtest86-reset.dat.gz

Run QEMU with "-no-reboot -no-shutdown -monitor stdio".  When it
crashes, run "info registers" and then "x/70i 0", and email the output.

Paolo


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 16:07     ` Paolo Bonzini
@ 2015-03-26 16:34       ` Nate Case
  2015-03-26 16:40         ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Nate Case @ 2015-03-26 16:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

> > 
> > The trace file is available here:
> > 
> >      http://oss.xes-inc.com/xtmp/trace-pcimem-memtest86-reset.dat.gz
> 
> Run QEMU with "-no-reboot -no-shutdown -monitor stdio".  When it
> crashes, run "info registers" and then "x/70i 0", and email the output.

QEMU output:

---[snip]---
$ qemu-system-x86_64 -enable-kvm -name testVM6 -machine \
q35,accel=kvm,usb=off -cpu Haswell -m 256 -realtime mlock=off -smp \
1,sockets=1,cores=1,threads=1 -boot order=d image.memtest -vga std \
-display vnc=${LAN_IP}:0 -mem-path \
/sys/bus/pci/devices/0000\:01:00.0/resource2_wc --mem-prealloc -cdrom \
memtest86+-5.01.iso -s -S -d cpu_reset,unimp,guest_errors,int,pcall \
-no-reboot -no-shutdown -monitor stdio
QEMU 2.2.0 monitor - type 'help' for more information
(qemu) CPU Reset (CPU 0)

[[ trimmed initial reset with all zeroed registers ]]

CPU Reset (CPU 0)
EAX=00000000 EBX=00000000 ECX=00000000 EDX=000306c1
ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 0000ffff 00009300
CS =f000 ffff0000 0000ffff 00009b00
SS =0000 00000000 0000ffff 00009300
DS =0000 00000000 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     00000000 0000ffff
IDT=     00000000 0000ffff
CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000000 CCD=00000000 CCO=DYNAMIC
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
---[snip]---

gdb output:

---[snip]---
real-mode-gdb$ info registers
eax            0x18     24
ecx            0x2000   8192
edx            0x92     146
ebx            0x0      0
esp            0x800    0x800
ebp            0x1d0    0x1d0
esi            0x5a00   23040
edi            0x3ff4   16372
eip            0x58     0x58
eflags         0x10046  [ PF ZF RF ]
cs             0x9020   36896
ss             0x9000   36864
ds             0x18     24
es             0x18     24
fs             0x9000   36864
gs             0x9000   36864
real-mode-gdb$ x/70i 0
   0x0: push   bx
   0x1: inc    WORD PTR [bx+si]
   0x3: lock push bx
   0x5: inc    WORD PTR [bx+si]
   0x7: lock ret 
   0x9: loop   0xb
   0xb: lock push bx
   0xd: inc    WORD PTR [bx+si]
   0xf: lock push bx
   0x11:        inc    WORD PTR [bx+si]
   0x13:        lock push bx
   0x15:        inc    WORD PTR [bx+si]
   0x17:        lock push bx
   0x19:        inc    WORD PTR [bx+si]
   0x1b:        lock push bx
   0x1d:        inc    WORD PTR [bx+si]
   0x1f:        lock movs WORD PTR es:[di],WORD PTR ds:[si]
   0x21:        inc    BYTE PTR [bx+si]
   0x23:        lock xchg cx,bp
   0x26:        add    al,dh
   0x28:        jmp    0xf9
   0x2b:        lock jmp 0xfd
   0x2f:        lock jmp 0x101
   0x33:        lock jmp 0x105
   0x37:        lock jmp 0x109
   0x3b:        lock jmp 0x10d
   0x3f:        lock mov dl,BYTE PTR [bx+si+0x0]
   0x43:        ror    BYTE PTR [di-0x8],0x0
   0x47:        lock inc cx
   0x49:        clc    
   0x4a:        add    al,dh
   0x4c:        (bad)  
   0x4d:        jcxz   0x4f
   0x4f:        lock cmp di,sp
   0x52:        add    al,dh
   0x54:        pop    cx
   0x55:        clc    
   0x56:        add    al,dh
=> 0x58:        cs
   0x59:        call   0xf05c
   0x5c:        shr    bh,cl
   0x5e:        add    al,dh
   0x60:        add    ax,0xcf
   0x63:        lock repnz out 0x0,al
   0x67:        lock outs dx,BYTE PTR ds:[si]
   0x69:        inc    BYTE PTR [bx+si]
   0x6b:        lock push bx
   0x6d:        inc    WORD PTR [bx+si]
   0x6f:        lock push bx
   0x71:        inc    WORD PTR [bx+si]
   0x73:        lock push bx
   0x75:        inc    WORD PTR [bx+si]
   0x77:        lock hlt 
   0x79:        aas    
   0x7a:        add    BYTE PTR [bx+si-0x7a78],dl
   0x7e:        add    al,al
   0x80:        push   bx
   0x81:        inc    WORD PTR [bx+si]
   0x83:        lock push bx
   0x85:        inc    WORD PTR [bx+si]
   0x87:        lock push bx
   0x89:        inc    WORD PTR [bx+si]
   0x8b:        lock push bx
   0x8d:        inc    WORD PTR [bx+si]
   0x8f:        lock push bx
   0x91:        inc    WORD PTR [bx+si]
   0x93:        lock push bx
   0x95:        inc    WORD PTR [bx+si]
   0x97:        lock push bx
   0x99:        inc    WORD PTR [bx+si]
---[snip]---

Thanks,

Nate


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 16:34       ` Nate Case
@ 2015-03-26 16:40         ` Paolo Bonzini
  2015-03-26 16:52           ` Nate Case
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2015-03-26 16:40 UTC (permalink / raw)
  To: Nate Case; +Cc: kvm



On 26/03/2015 17:34, Nate Case wrote:
>    0x52:        add    al,dh
>    0x54:        pop    cx
>    0x55:        clc    
>    0x56:        add    al,dh
> => 0x58:        cs
>    0x59:        call   0xf05c
>    0x5c:        shr    bh,cl
>    0x5e:        add    al,dh
>    0x60:        add    ax,0xcf
>    0x63:        lock repnz out 0x0,al

This code makes no sense; it looks like the processor has gone into the
weeds. :(

Based on this:

cs             0x9020   36896

I could guess, based on your use of resource2_wc, that the host is
bypassing the processor cache but the guest is not.  This use is not
supported on x86 KVM, sorry.

Paolo


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 16:40         ` Paolo Bonzini
@ 2015-03-26 16:52           ` Nate Case
  2015-03-26 17:04             ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Nate Case @ 2015-03-26 16:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

----- Original Message -----
> 
> 
> On 26/03/2015 17:34, Nate Case wrote:
> >    0x52:        add    al,dh
> >    0x54:        pop    cx
> >    0x55:        clc
> >    0x56:        add    al,dh
> > => 0x58:        cs
> >    0x59:        call   0xf05c
> >    0x5c:        shr    bh,cl
> >    0x5e:        add    al,dh
> >    0x60:        add    ax,0xcf
> >    0x63:        lock repnz out 0x0,al
> 
> This code makes no sense, it looks like the processor has gone into the
> weeds. :(
> 
> Based on this:
> 
> cs             0x9020   36896
> 
> I could guess, based on your use of resource2_wc, that the host is
> bypassing the processor cache but the guest is not.  This use is not
> supported on x86 KVM, sorry.

I don't think the "x/70i 0" output reflected where the CPU was actually
executing.  Based on the CS:IP of 9020:0058 (linear address 0x90258),
shouldn't I be dumping from around 0x90200 instead?  gdb gets easily
confused here.

real-mode-gdb$ x/70i 0x90200
   0x90200:     cli    
   0x90201:     mov    al,0x80
   0x90203:     out    0x70,al
   0x90205:     mov    ax,0x9000
   0x90208:     mov    ds,ax
   0x9020a:     mov    es,ax
   0x9020c:     mov    fs,ax
   0x9020e:     mov    ss,ax
   0x90210:     mov    sp,dx
   0x90212:     push   cs
   0x90213:     pop    ds
   0x90214:     lidtw  ds:0xa2
   0x90219:     lgdtw  ds:0xa8
   0x9021e:     mov    dx,0x92
   0x90221:     in     al,dx
   0x90222:     cmp    al,0xff
   0x90224:     je     0x90238
   0x90226:     mov    ah,BYTE PTR [esp+0x4]
   0x9022b:     test   ah,ah
   0x9022d:     je     0x90233
   0x9022f:     or     al,0x2
   0x90231:     jmp    0x90235
   0x90233:     and    al,0xfd
   0x90235:     and    al,0xfe
   0x90237:     out    dx,al
   0x90238:     call   0x90266
   0x9023b:     mov    al,0xd1
   0x9023d:     out    0x64,al
   0x9023f:     call   0x90266
   0x90242:     mov    al,0xdf
   0x90244:     out    0x60,al
   0x90246:     call   0x90266
   0x90249:     mov    ax,0x1
   0x9024c:     lmsw   ax
   0x9024f:     jmp    0x90251
   0x90251:     mov    ax,0x18
   0x90254:     mov    ds,ax
   0x90256:     mov    es,ax
   0x90258:     mov    ss,ax      <-- the "real" IP
   0x9025a:     mov    fs,ax
   0x9025c:     mov    gs,ax
   0x9025e:     jmp    0x10:0x10000
   0x90266:     call   0x9027f
   0x90269:     in     al,0x64
   0x9026b:     cmp    al,0xff
   0x9026d:     je     0x9027e
   0x9026f:     test   al,0x1
   0x90271:     je     0x9027a
   0x90273:     call   0x9027f
   0x90276:     in     al,0x60
   0x90278:     jmp    0x90266
   0x9027a:     test   al,0x2
   0x9027c:     jne    0x90266
   0x9027e:     ret    
   0x9027f:     jmp    0x90281
   0x90281:     ret    
   0x90282:     add    BYTE PTR [bx+si],al
   0x90284:     add    BYTE PTR [bx+si],al
   0x90286:     add    BYTE PTR [bx+si],al
   0x90288:     add    BYTE PTR [bx+si],al
   0x9028a:     add    BYTE PTR [bx+si],al
   0x9028c:     add    BYTE PTR [bx+si],al
   0x9028e:     add    BYTE PTR [bx+si],al
   0x90290:     add    BYTE PTR [bx+si],al
   0x90292:     (bad)  
   0x90293:     jg     0x90295
   0x90295:     add    BYTE PTR [bx+si],al
   0x90297:     call   0xffff:0xc0
   0x9029c:     (bad)  
   0x9029d:     (bad)  
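
The address arithmetic used above (CS:IP of 9020:0058 giving 0x90258) is the usual real-mode rule of segment times 16 plus offset.  A small sketch, with real_mode_linear() as a hypothetical helper name:

```python
def real_mode_linear(segment, offset):
    """Linear address of segment:offset in real mode: segment * 16 + offset."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


if __name__ == "__main__":
    # CS:IP reported by gdb at the failing instruction
    print(hex(real_mode_linear(0x9020, 0x0058)))   # 0x90258
    # CS:IP seen after the reset (entry_post within SeaBIOS)
    print(hex(real_mode_linear(0xF000, 0xE05B)))   # 0xfe05b
```

This is why a dump starting at absolute address 0 shows unrelated bytes: the offset 0x58 only becomes meaningful once the segment base is added.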

Thanks,

Nate


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 16:52           ` Nate Case
@ 2015-03-26 17:04             ` Paolo Bonzini
  2015-03-26 17:14               ` Nate Case
  2015-03-27 15:27               ` Nate Case
  0 siblings, 2 replies; 10+ messages in thread
From: Paolo Bonzini @ 2015-03-26 17:04 UTC (permalink / raw)
  To: Nate Case; +Cc: kvm



On 26/03/2015 17:52, Nate Case wrote:
> I don't think the "x/70i 0" output reflected where the CPU was actually
> executing?  Based on the CS:IP of 9020:0058 (0x90258), shouldn't I be
> dumping from around 0x90200 instead?  gdb gets easily confused here

Ah, this was gdb (QEMU has its own monitor, and it adds the CS base if
you use $pc, but not if you write an absolute address).

>    0x90249:     mov    ax,0x1
>    0x9024c:     lmsw   ax
>    0x9024f:     jmp    0x90251
>    0x90251:     mov    ax,0x18
>    0x90254:     mov    ds,ax
>    0x90256:     mov    es,ax
>    0x90258:     mov    ss,ax      <-- the "real" IP
>    0x9025a:     mov    fs,ax
>    0x9025c:     mov    gs,ax
>    0x9025e:     jmp    0x10:0x10000

This makes more sense.  The processor is executing this code at least
up to 0x9024c, because of this in the trace:

 qemu-system-x86-3937  [002] 1474032.001887: kvm_exit:             reason CR_ACCESS rip 0x4c
 qemu-system-x86-3937  [002] 1474032.001887: kvm_cr:               cr_write 0 = 0x11

(bit 4 is always 1 so you see 0x11).
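
To spell out the arithmetic behind the 0x11 value, here is a rough model in Python.  It assumes CR0 held 0x10 just before the lmsw (an assumption, the thread does not show that value), and lmsw() and decode_cr0() are hypothetical helper names, not KVM or QEMU functions.

```python
# Architectural CR0 flag positions (subset that is commonly listed).
CR0_BITS = {
    0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
    16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG",
}


def decode_cr0(value):
    """Return the names of the CR0 flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(CR0_BITS.items()) if value & (1 << bit)]


def lmsw(cr0, operand):
    """Model LMSW: load bits 0-3 of CR0 from the operand.

    PE can be set by LMSW but not cleared, so an already-set PE is kept.
    """
    low = operand & 0xF
    pe_sticky = cr0 & 0x1
    return (cr0 & ~0xF) | low | pe_sticky


if __name__ == "__main__":
    before = 0x10                      # assumed: only ET set before the switch
    after = lmsw(before, 0x0001)       # "movw $0x0001, %ax; lmsw %ax"
    print(hex(after), decode_cr0(after))
```

Under that assumption the result is 0x11, that is PE plus the always-set ET bit, matching the kvm_cr line in the trace.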

However, the trace then shows a crash (triple fault) at 0x63, not 0x58.

Please run "info registers" from QEMU instead, so that it's possible to
see the hidden part of the segment registers.

Paolo




* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 17:04             ` Paolo Bonzini
@ 2015-03-26 17:14               ` Nate Case
  2015-03-27 15:27               ` Nate Case
  1 sibling, 0 replies; 10+ messages in thread
From: Nate Case @ 2015-03-26 17:14 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

> Ah, this was gdb (QEMU has its own monitor and it sums the CS base if
> you use $pc, but not if you write an absolute address).

Thanks, that's useful to know!  I didn't realize QEMU supported this.

> However, the trace then shows a crash (triple fault) at 0x63, not 0x58.
> 
> Please run "info registers" from QEMU instead, so that it's possible to
> see the hidden part of the segment registers.

Here is the register dump from QEMU:

(qemu) info registers
EAX=00000018 EBX=00000000 ECX=00002000 EDX=00000092
ESI=00005a00 EDI=00003ff4 EBP=000001d0 ESP=00000800
EIP=00000058 EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 ffffffff ffffffff 00f0ff00 DPL=3 CS64 [CRA]
CS =9020 00090200 ffffffff 00809b00 DPL=0 CS16 [-RA]
SS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
DS =0018 ffffffff ffffffff 00f0ff00 DPL=3 CS64 [CRA]
FS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00090282 00000800
IDT=     00000000 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
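
For anyone reading the segment lines in this dump, the sketch below shows how a selector maps to a descriptor address and how the eight descriptor bytes turn into the base and limit that QEMU prints.  It is plain Python with hypothetical helper names; the GDT base comes from the "GDT=" line above, and the sample descriptor bytes are illustrative, not read from the guest.

```python
GDT_BASE = 0x90282  # from the "GDT=" line of the dump


def split_selector(selector):
    """Split a segment selector into table index, table indicator and RPL."""
    return {"index": selector >> 3, "ti": (selector >> 2) & 1, "rpl": selector & 3}


def descriptor_address(gdt_base, selector):
    """Address of the 8-byte descriptor a GDT selector refers to."""
    return gdt_base + (selector & ~0x7)


def decode_descriptor(raw):
    """Decode an 8-byte legacy segment descriptor into base, limit, DPL, P."""
    if len(raw) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    if raw[6] & 0x80:                  # G bit: limit counts 4 KiB pages
        limit = (limit << 12) | 0xFFF
    return {
        "base": base,
        "limit": limit,
        "dpl": (raw[5] >> 5) & 3,
        "present": bool(raw[5] & 0x80),
    }


if __name__ == "__main__":
    print(split_selector(0x18))
    print(hex(descriptor_address(GDT_BASE, 0x18)))
    # Illustrative flat 4 GiB data descriptor (not taken from the guest).
    print(decode_descriptor(bytes.fromhex("ffff00000092cf00")))
    # Eight 0xFF bytes, for comparison with the ES/DS lines.
    print(decode_descriptor(b"\xff" * 8))
```

Selector 0x18 is index 3 in the GDT, so its descriptor sits at 0x90282 + 0x18 = 0x9029a.  Eight 0xFF bytes decode to base ffffffff, limit ffffffff and DPL 3, which is the shape of the ES and DS lines above; that is only arithmetic on the posted values, not a diagnosis.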

Thanks,

Nate


* Re: Guest memory backed by PCI BAR (x86)
  2015-03-26 17:04             ` Paolo Bonzini
  2015-03-26 17:14               ` Nate Case
@ 2015-03-27 15:27               ` Nate Case
  1 sibling, 0 replies; 10+ messages in thread
From: Nate Case @ 2015-03-27 15:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm

> >    0x90249:     mov    ax,0x1
> >    0x9024c:     lmsw   ax
> >    0x9024f:     jmp    0x90251
> >    0x90251:     mov    ax,0x18
> >    0x90254:     mov    ds,ax
> >    0x90256:     mov    es,ax
> >    0x90258:     mov    ss,ax      <-- the "real" IP
> >    0x9025a:     mov    fs,ax
> >    0x9025c:     mov    gs,ax
> >    0x9025e:     jmp    0x10:0x10000
> 
> This makes more sense.  The processor is looking at this code at least
> until 0x9024c, because of this in the trace:
> 
>  qemu-system-x86-3937  [002] 1474032.001887: kvm_exit:             reason
>  CR_ACCESS rip 0x4c
>  qemu-system-x86-3937  [002] 1474032.001887: kvm_cr:               cr_write 0
>  = 0x11
> 
> (bit 4 is always 1 so you see 0x11).
> 
> However, the trace then shows a crash (triple fault) at 0x63, not 0x58.

I was curious about the crash at 0x63 instead of 0x58, and I realized that
the first trace I uploaded had some debug code in the memtest86 setup.S
which would have moved the instruction addresses around.  So the trace
addresses wouldn't have matched the assembly dump exactly.

I uploaded a cleaner trace here:

  http://oss.xes-inc.com/xtmp/trace-pcimem-memtest86-stock-reset.dat.gz

This was captured with the stock memtest86+ code and with "-no-reboot",
so you don't see the subsequent boot in the trace.

In this trace, the last guest_rip I see is now 0x58:

 kvm_exit:             [FAILED TO PARSE] exit_reason=30 guest_rip=0x69
 kvm_pio:              pio_read at 0x64 size 1 count 1
 kvm_entry:            vcpu 0
 kvm_exit:             [FAILED TO PARSE] exit_reason=28 guest_rip=0x4c
 kvm_cr:               cr_write 0 = 0x11
 kvm_mmu_get_page:     [FAILED TO PARSE] gfn=0 role=983104 root_count=0 unsync=0 created=0
 kvm_entry:            vcpu 0
 kvm_exit:             [FAILED TO PARSE] exit_reason=2 guest_rip=0x58

QEMU register dump after the failure looks the same as my last post:

(qemu) info registers
EAX=00000018 EBX=00000000 ECX=00002000 EDX=00000092
ESI=00005a00 EDI=00003ff4 EBP=000001d0 ESP=00000800
EIP=00000058 EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 ffffffff ffffffff 00f0ff00 DPL=3 CS64 [CRA]
CS =9020 00090200 ffffffff 00809b00 DPL=0 CS16 [-RA]
SS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
DS =0018 ffffffff ffffffff 00f0ff00 DPL=3 CS64 [CRA]
FS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =9000 00090000 ffffffff 00809300 DPL=0 DS16 [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00090282 00000800
IDT=     00000000 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000

Instruction dump (matches setup.S code from memtest86+):

(qemu) x/60i 0x90200
0x0000000000090200:  cli    
0x0000000000090201:  mov    $0x80,%al
0x0000000000090203:  out    %al,$0x70
0x0000000000090205:  mov    $0x9000,%ax
0x0000000000090208:  mov    %ax,%ds
0x000000000009020a:  mov    %ax,%es
0x000000000009020c:  mov    %ax,%fs
0x000000000009020e:  mov    %ax,%ss
0x0000000000090210:  mov    %dx,%sp
0x0000000000090212:  push   %cs
0x0000000000090213:  pop    %ds
0x0000000000090214:  lidtw  0xa2
0x0000000000090219:  lgdtw  0xa8
0x000000000009021e:  mov    $0x92,%dx
0x0000000000090221:  in     (%dx),%al
0x0000000000090222:  cmp    $0xff,%al
0x0000000000090224:  je     0x90238
0x0000000000090226:  addr32 mov 0x4(%esp),%ah
0x000000000009022b:  test   %ah,%ah
0x000000000009022d:  je     0x90233
0x000000000009022f:  or     $0x2,%al
0x0000000000090231:  jmp    0x90235
0x0000000000090233:  and    $0xfd,%al
0x0000000000090235:  and    $0xfe,%al
0x0000000000090237:  out    %al,(%dx)
0x0000000000090238:  call   0x90266
0x000000000009023b:  mov    $0xd1,%al
0x000000000009023d:  out    %al,$0x64
0x000000000009023f:  call   0x90266
0x0000000000090242:  mov    $0xdf,%al
0x0000000000090244:  out    %al,$0x60
0x0000000000090246:  call   0x90266
0x0000000000090249:  mov    $0x1,%ax
0x000000000009024c:  lmsw   %ax
0x000000000009024f:  jmp    0x90251
0x0000000000090251:  mov    $0x18,%ax
0x0000000000090254:  mov    %ax,%ds
0x0000000000090256:  mov    %ax,%es
0x0000000000090258:  mov    %ax,%ss      <- pc
0x000000000009025a:  mov    %ax,%fs
0x000000000009025c:  mov    %ax,%gs
0x000000000009025e:  ljmpl  $0x10,$0x10000
0x0000000000090266:  call   0x9027f
0x0000000000090269:  in     $0x64,%al
0x000000000009026b:  cmp    $0xff,%al
0x000000000009026d:  je     0x9027e
0x000000000009026f:  test   $0x1,%al
0x0000000000090271:  je     0x9027a
0x0000000000090273:  call   0x9027f
0x0000000000090276:  in     $0x60,%al
0x0000000000090278:  jmp    0x90266
0x000000000009027a:  test   $0x2,%al
0x000000000009027c:  jne    0x90266
0x000000000009027e:  ret    
0x000000000009027f:  jmp    0x90281
0x0000000000090281:  ret    
0x0000000000090282:  add    %al,(%bx,%si)
0x0000000000090284:  add    %al,(%bx,%si)
0x0000000000090286:  add    %al,(%bx,%si)
0x0000000000090288:  add    %al,(%bx,%si)

I also switched to using sysfs file "resource2" for now instead of
"resource2_wc", but the behavior appears to be the same in both cases.

Thanks,

Nate

