From: Marcelo Tosatti <mtosatti@redhat.com>
To: Saso Slavicic <saso.linux@astim.si>
Cc: kvm@vger.kernel.org
Subject: Re: XP machine freeze
Date: Wed, 18 Mar 2015 21:51:53 -0300
Message-ID: <20150319005153.GB16412@amt.cnet>
In-Reply-To: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si>

On Mon, Mar 16, 2015 at 04:10:40PM +0100, Saso Slavicic wrote:
> Hi,
>
> I'm fairly experienced with KVM (Centos 5/6), running about a dozen servers
> with 20-30 different (Linux & MS platform) systems.
> I have one Windows XP machine that acts very strangely - it freezes. I get
> ping timeout for the VM from my monitoring and the machine spins 2 or 3
> cores using all the cpu. Now the interesting thing that happens is that once
> you open the console, it suddenly starts working again. You can see the
> clock catching up as it was frozen in time and everything works normally
> once the timer catches up. It usually happens probably about once a month,
> although it happened yesterday and today again.
>
> This machine is on Centos 6, qemu-kvm-0.12.1.2-2.448.el6_6, kernel
> 2.6.32-504.3.3.el6.x86_64.
> I was able to do some debugging when the machine was frozen, so I got some
> things to work with:
>
> # virsh qemu-monitor-command --hmp DBserver 'info cpus'
> * CPU #0: pc=0x0000000080501fdd thread_id=32595
> CPU #1: pc=0x00000000806e7a9b thread_id=32596
> CPU #2: pc=0x00000000ba2da162 (halted) thread_id=32597
> CPU #3: pc=0x00000000ba2da162 (halted) thread_id=32598
>
> Now, in both yesterday's and today's event the CPU0 was stopped at
> 0x0000000080501fdd. I've disassembled the function and got this:
>
> 0x0000000080501fb5: int3
> 0x0000000080501fb6: mov %edi,%edi
> 0x0000000080501fb8: push %ebp
> 0x0000000080501fb9: mov %esp,%ebp
> 0x0000000080501fbb: push %esi
> 0x0000000080501fbc: mov %fs:0x20,%eax
> 0x0000000080501fc2: mov 0x8(%ebp),%ecx
> 0x0000000080501fc5: lea -0x1(%ecx),%esi
> 0x0000000080501fc8: test %esi,%ecx
> 0x0000000080501fca: lea 0x7ec(%eax),%edx
> 0x0000000080501fd0: pop %esi
> 0x0000000080501fd1: je 0x80501fdd
> 0x0000000080501fd3: lea 0x7a0(%eax),%edx
> 0x0000000080501fd9: jmp 0x80501fdd
> *0x0000000080501fdb: pause
> 0x0000000080501fdd: cmpl $0x0,(%edx)
> 0x0000000080501fe0: jne 0x80501fdb
> 0x0000000080501fe2: pop %ebp
> 0x0000000080501fe3: ret $0x4
> 0x0000000080501fe6: int3
>
> Mov %edi,%edi is clearly the start of some function. From what I've been
> able to understand, the code fetches _KPRCB structure (%fs:0x20) and then
> does a spinlock between fdb and fe0 checking for PacketBarrier (?) in EDX
> (0xffdff8c0). Now, $pc always shows fdd address, shouldn't it jump between
> fdb and fe0, it seems as if it was stuck at fdd?
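
As a cross-check of the disassembly and register dump quoted here, the
address being polled can be recomputed from EAX and ECX. This is only a
sketch of the lea/test/je sequence; the function and parameter names are
made up for illustration, and it says nothing about what the two KPRCB
fields actually are.

```python
MASK32 = 0xFFFFFFFF

def polled_address(prcb_base: int, arg: int) -> int:
    """Mirror the address selection at 0x80501fbc..0x80501fd9 (hypothetical helper)."""
    # lea -0x1(%ecx),%esi ; test %esi,%ecx
    # The result is zero when ecx has at most one bit set.
    if arg & ((arg - 1) & MASK32) == 0:
        # je taken: edx keeps the value from lea 0x7ec(%eax),%edx
        return (prcb_base + 0x7EC) & MASK32
    # je not taken: lea 0x7a0(%eax),%edx
    return (prcb_base + 0x7A0) & MASK32

# Values taken from the 'info registers' output in this thread.
EAX = 0xFFDFF120
ECX = 0x0000000E

print(hex(polled_address(EAX, ECX)))  # 0xffdff8c0, same as EDX in the dump
```

With ECX=0xe more than one bit is set, so the 0x7a0 offset is used and the
result matches the EDX value in the dump.
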
>
> # virsh qemu-monitor-command --hmp DBserver 'info registers'
> EAX=ffdff120 EBX=c06ddf58 ECX=0000000e EDX=ffdff8c0
> ESI=be6e3921 EDI=c06ddf60 EBP=ba4ff708 ESP=ba4ff708
> EIP=80501fdd EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
> FS =0030 ffdff000 00001fff 00c09300 DPL=0 DS [-WA]
> GS =0000 00000000 000fffff 00000000
> LDT=0000 00000000 000fffff 00000000
> TR =0028 80042000 000020ab 00008b00 DPL=0 TSS32-busy
> GDT= 8003f000 000003ff
> IDT= 8003f400 000007ff
> CR0=8001003b CR2=dbbec000 CR3=0b3c0020 CR4=000006f8
> DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
> DR6=ffff0ff0 DR7=00000400
> FCW=027f FSW=0020 [ST=0] FTW=00 MXCSR=00001fa0
> FPR0=8053632b003c1658 c048 FPR1=e1e0c048bf80f6ab 76f8
> FPR2=e1e0000000000000 0023 FPR3=0b017c30003c1658 0000
> FPR4=0000003bba1a7604 1e64 FPR5=0007268c00000000 003b
> FPR6=000002020000001b 2684 FPR7=e3e0a9b4e1b50de4 ca0b
> XMM00=0000000000a1fc95000000000020027f
> XMM01=0000ffff00001fa000001c4c00000001
> XMM02=000000000000c0488053632b003c1658
> XMM03=00000000000076f8e1e0c048bf80f6ab
> XMM04=0000000000000023e1e0000000000000
> XMM05=00000000000000000b017c30003c1658
> XMM06=0000000000001e640000003bba1a7604
> XMM07=000000000000003b0007268c00000000
>
> Clearly, the address in EDX is not 0:
>
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x0e
>
> [root@linux ~]# virt-manager
>
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x00
>
> However as soon as the VM console is opened and machine starts, the address
> in EDX is set to 0 and the loop is broken.
> Does anybody recognize what function that is? What could possibly happen
> that opening the console and moving the mouse a little, unfreezes the
> machine?
> VM has .81 virtio drivers from Fedora repo at the moment.

Can you generate a Windows memory dump while the guest is frozen?

https://support.microsoft.com/en-us/kb/254649
https://support.microsoft.com/en-us/kb/972110

See "Step 7: Generate a complete crash dump file or a kernel crash dump
file by using an NMI on a Windows-based system" (you can inject NMIs via
the QEMU monitor).
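
A minimal sketch of how the NMI could be sent from the host, assuming the
guest has already been set up for NMI-triggered dumps as the KB articles
describe. The helper name is hypothetical. It only builds the command
lines: "virsh inject-nmi" needs a libvirt new enough to provide it, and
the HMP "nmi" command takes a CPU index on older QEMU monitors, so check
"help nmi" in the monitor first.

```python
import shlex
from typing import List

def nmi_commands(domain: str, cpu_index: int = 0) -> List[List[str]]:
    """Return two alternative argv lists for injecting an NMI (hypothetical helper)."""
    return [
        # libvirt's own subcommand, if the installed virsh supports it
        ["virsh", "inject-nmi", domain],
        # HMP passthrough, same style as the 'info cpus' call quoted in this mail;
        # the CPU index argument is an assumption about the monitor version
        ["virsh", "qemu-monitor-command", "--hmp", domain, "nmi %d" % cpu_index],
    ]

if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is sent by accident.
    for argv in nmi_commands("DBserver"):
        print(" ".join(shlex.quote(a) for a in argv))
```

Run either printed command by hand the next time the guest is frozen.
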
>
> The configuration of the machine is pretty standard:
>
> <!--
> WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
> OVERWRITTEN AND LOST. Changes to this xml configuration should be made
> using:
> virsh edit DBserver
> or other application using the libvirt API.
> -->
>
> <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
> <name>DBserver</name>
> <uuid>e42b4cf2-7264-515f-4d24-6267eaa24be8</uuid>
> <memory unit='KiB'>3145728</memory>
> <currentMemory unit='KiB'>3145728</currentMemory>
> <vcpu placement='static'>4</vcpu>
> <os>
> <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
> <boot dev='hd'/>
> </os>
> <features>
> <acpi/>
> <apic/>
> <pae/>
> </features>
> <cpu>
> <topology sockets='1' cores='4' threads='4'/>
> </cpu>
> <clock offset='localtime'>
> <timer name='rtc' tickpolicy='catchup'/>
> </clock>
> <on_poweroff>destroy</on_poweroff>
> <on_reboot>restart</on_reboot>
> <on_crash>restart</on_crash>
> <devices>
> <emulator>/usr/libexec/qemu-kvm</emulator>
> <disk type='block' device='disk'>
> <driver name='qemu' type='raw' cache='none' io='native'/>
> <source dev='/dev/drbd1'/>
> <target dev='vda' bus='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
> function='0x0'/>
> </disk>
> <disk type='block' device='disk'>
> <driver name='qemu' type='raw' cache='none' io='native'/>
> <source
> dev='/dev/disk/by-id/usb-WD_Ext_HDD_1021_574D415A4138353838383731-0:0'/>
> <target dev='vdb' bus='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
> function='0x0'/>
> </disk>
> <disk type='file' device='cdrom'>
> <driver name='qemu' type='raw'/>
> <target dev='hdc' bus='ide'/>
> <readonly/>
> <address type='drive' controller='0' bus='1' target='0' unit='0'/>
> </disk>
> <controller type='usb' index='0' model='ich9-ehci1'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x7'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci1'>
> <master startport='0'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x0' multifunction='on'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci2'>
> <master startport='2'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x1'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci3'>
> <master startport='4'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x2'/>
> </controller>
> <controller type='ide' index='0'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x01'
> function='0x1'/>
> </controller>
> <interface type='bridge'>
> <mac address='52:54:00:a6:92:ca'/>
> <source bridge='br0'/>
> <model type='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
> function='0x0'/>
> </interface>
> <serial type='pty'>
> <target port='0'/>
> </serial>
> <console type='pty'>
> <target type='serial' port='0'/>
> </console>
> <input type='mouse' bus='ps2'/>
> <graphics type='vnc' port='-1' autoport='yes'/>
> <video>
> <model type='vga' vram='9216' heads='1'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
> function='0x0'/>
> </video>
> <memballoon model='virtio'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x07'
> function='0x0'/>
> </memballoon>
> </devices>
> <qemu:commandline>
> <qemu:arg value='-set'/>
> <qemu:arg value='device.virtio-disk0.x-data-plane=on'/>
> </qemu:commandline>
> </domain>
>
> The above config is already changed as I've first experimented with removing
> usb tablet (and installing vmware mouse drivers), turning 'x-data-plane on'
> and so on, hoping to solve the problem...Is there anything else I can check
> the next time the machine freezes?
>
> Regards,
> Saso Slavicic
>
>