From: Marcelo Tosatti <mtosatti@redhat.com>
To: Saso Slavicic <saso.linux@astim.si>
Cc: kvm@vger.kernel.org
Subject: Re: XP machine freeze
Date: Wed, 18 Mar 2015 21:51:53 -0300
Message-ID: <20150319005153.GB16412@amt.cnet>
In-Reply-To: <009701d05ffb$5e37a740$1aa6f5c0$@astim.si>

On Mon, Mar 16, 2015 at 04:10:40PM +0100, Saso Slavicic wrote:
> Hi,
>
> I'm fairly experienced with KVM (CentOS 5/6), running about a dozen servers
> with 20-30 different (Linux & MS platform) systems.
> I have one Windows XP machine that acts very strangely - it freezes. I get
> ping timeouts for the VM from my monitoring and the machine spins 2 or 3
> cores using all the CPU. Now the interesting thing that happens is that once
> you open the console, it suddenly starts working again. You can see the
> clock catching up as if it had been frozen in time, and everything works
> normally once the timer catches up. It usually happens about once a month,
> although it happened yesterday and again today.
>
> This machine is on CentOS 6, qemu-kvm-0.12.1.2-2.448.el6_6, kernel
> 2.6.32-504.3.3.el6.x86_64.
> I was able to do some debugging when the machine was frozen, so I got some
> things to work with:
>
> # virsh qemu-monitor-command --hmp DBserver 'info cpus'
> * CPU #0: pc=0x0000000080501fdd thread_id=32595
> CPU #1: pc=0x00000000806e7a9b thread_id=32596
> CPU #2: pc=0x00000000ba2da162 (halted) thread_id=32597
> CPU #3: pc=0x00000000ba2da162 (halted) thread_id=32598
>
> Now, in both yesterday's and today's event the CPU0 was stopped at
> 0x0000000080501fdd. I've disassembled the function and got this:
>
> 0x0000000080501fb5: int3
> 0x0000000080501fb6: mov %edi,%edi
> 0x0000000080501fb8: push %ebp
> 0x0000000080501fb9: mov %esp,%ebp
> 0x0000000080501fbb: push %esi
> 0x0000000080501fbc: mov %fs:0x20,%eax
> 0x0000000080501fc2: mov 0x8(%ebp),%ecx
> 0x0000000080501fc5: lea -0x1(%ecx),%esi
> 0x0000000080501fc8: test %esi,%ecx
> 0x0000000080501fca: lea 0x7ec(%eax),%edx
> 0x0000000080501fd0: pop %esi
> 0x0000000080501fd1: je 0x80501fdd
> 0x0000000080501fd3: lea 0x7a0(%eax),%edx
> 0x0000000080501fd9: jmp 0x80501fdd
> *0x0000000080501fdb: pause
> 0x0000000080501fdd: cmpl $0x0,(%edx)
> 0x0000000080501fe0: jne 0x80501fdb
> 0x0000000080501fe2: pop %ebp
> 0x0000000080501fe3: ret $0x4
> 0x0000000080501fe6: int3
>
> "mov %edi,%edi" is clearly the start of some function. From what I've been
> able to understand, the code fetches the _KPRCB structure (%fs:0x20) and
> then spins between fdb and fe0, checking PacketBarrier (?) at the address
> in EDX (0xffdff8c0). Now, $pc always shows the fdd address. Shouldn't it
> jump between fdb and fe0? It seems as if it were stuck at fdd.
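
The address arithmetic in the paragraph above can be checked with a few
lines of shell. This is only a sketch built from the values in the
disassembly and the 'info registers' output. The function name
wait_offset is made up for illustration, and nothing here identifies
which _KPRCB field actually lives at either offset.

```shell
#!/bin/sh
# Values copied from the register dump in this thread.
prcb=0xffdff120   # EAX, loaded from %fs:0x20
mask=0x0000000e   # ECX, the argument read from 0x8(%ebp)

# Mirrors "lea -0x1(%ecx),%esi; test %esi,%ecx; je ...":
# the AND is zero only when the argument has at most one bit set.
wait_offset() {
    if [ $(( $1 & ($1 - 1) )) -eq 0 ]; then
        echo 0x7ec    # je taken: EDX stays at PRCB + 0x7ec
    else
        echo 0x7a0    # fall through: EDX = PRCB + 0x7a0
    fi
}

off=$(wait_offset "$mask")
# Address polled by the "pause / cmpl $0x0,(%edx) / jne" loop.
printf 'spin on 0x%x\n' $(( $prcb + $off ))
```

With ECX=0xe more than one bit is set, so the 0x7a0 branch is the one
that produces EDX=ffdff8c0, which matches the dump. Note also that cmpl
compares a 32-bit value, so 'x/1xw' shows exactly what the loop tests.
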
>
> # virsh qemu-monitor-command --hmp DBserver 'info registers'
> EAX=ffdff120 EBX=c06ddf58 ECX=0000000e EDX=ffdff8c0
> ESI=be6e3921 EDI=c06ddf60 EBP=ba4ff708 ESP=ba4ff708
> EIP=80501fdd EFL=00000202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0023 00000000 ffffffff 00c0f300 DPL=3 DS [-WA]
> FS =0030 ffdff000 00001fff 00c09300 DPL=0 DS [-WA]
> GS =0000 00000000 000fffff 00000000
> LDT=0000 00000000 000fffff 00000000
> TR =0028 80042000 000020ab 00008b00 DPL=0 TSS32-busy
> GDT= 8003f000 000003ff
> IDT= 8003f400 000007ff
> CR0=8001003b CR2=dbbec000 CR3=0b3c0020 CR4=000006f8
> DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000
> DR6=ffff0ff0 DR7=00000400
> FCW=027f FSW=0020 [ST=0] FTW=00 MXCSR=00001fa0
> FPR0=8053632b003c1658 c048 FPR1=e1e0c048bf80f6ab 76f8
> FPR2=e1e0000000000000 0023 FPR3=0b017c30003c1658 0000
> FPR4=0000003bba1a7604 1e64 FPR5=0007268c00000000 003b
> FPR6=000002020000001b 2684 FPR7=e3e0a9b4e1b50de4 ca0b
> XMM00=0000000000a1fc95000000000020027f
> XMM01=0000ffff00001fa000001c4c00000001
> XMM02=000000000000c0488053632b003c1658
> XMM03=00000000000076f8e1e0c048bf80f6ab
> XMM04=0000000000000023e1e0000000000000
> XMM05=00000000000000000b017c30003c1658
> XMM06=0000000000001e640000003bba1a7604
> XMM07=000000000000003b0007268c00000000
>
> Clearly, the value at the address in EDX is not 0:
>
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x0e
>
> [root@linux ~]# virt-manager
>
> [root@linux ~]# virsh qemu-monitor-command --hmp DBserver 'x/1xb 0xFFDFF8C0'
> 00000000ffdff8c0: 0x00
>
> However, as soon as the VM console is opened and the machine starts, the
> value at the address in EDX is set to 0 and the loop is broken.
> Does anybody recognize what function that is? What could possibly happen
> that opening the console and moving the mouse a little unfreezes the
> machine?
> The VM has the .81 virtio drivers from the Fedora repo at the moment.

Can you generate a Windows dump while the guest is frozen?

https://support.microsoft.com/en-us/kb/254649
https://support.microsoft.com/en-us/kb/972110

Step 7: Generate a complete crash dump file or a kernel crash dump file
by using an NMI on a Windows-based system
(you can inject NMIs via the QEMU monitor).

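As a rough sketch of the NMI injection, reusing the same
'virsh qemu-monitor-command --hmp' path used for 'info cpus' in this
thread: the helper name inject_nmi and the DRY_RUN switch are made up
for illustration, and the "nmi <cpu-index>" form is an assumption about
the old qemu-kvm monitor (newer QEMU takes plain "nmi"). Check
'help nmi' in the monitor first.

```shell
#!/bin/sh
# Hypothetical helper: send an NMI to one vCPU of a libvirt guest via HMP.
# Assumes the HMP syntax "nmi <cpu-index>"; verify with 'help nmi'.
inject_nmi() {
    domain=$1
    cpu=${2:-0}       # default to vCPU 0
    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} virsh qemu-monitor-command --hmp "$domain" "nmi $cpu"
}

# Example (not run here): inject_nmi DBserver 0
```

The guest has to be set up to write a dump on NMI beforehand, as
described in the KB articles above. Depending on the libvirt and QEMU
versions, 'virsh inject-nmi DBserver' may work as well.
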
>
> The configuration of the machine is pretty standard:
>
> <!--
> WARNING: THIS IS AN AUTO-GENERATED FILE. CHANGES TO IT ARE LIKELY TO BE
> OVERWRITTEN AND LOST. Changes to this xml configuration should be made
> using:
> virsh edit DBserver
> or other application using the libvirt API.
> -->
>
> <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
> <name>DBserver</name>
> <uuid>e42b4cf2-7264-515f-4d24-6267eaa24be8</uuid>
> <memory unit='KiB'>3145728</memory>
> <currentMemory unit='KiB'>3145728</currentMemory>
> <vcpu placement='static'>4</vcpu>
> <os>
> <type arch='x86_64' machine='rhel6.6.0'>hvm</type>
> <boot dev='hd'/>
> </os>
> <features>
> <acpi/>
> <apic/>
> <pae/>
> </features>
> <cpu>
> <topology sockets='1' cores='4' threads='4'/>
> </cpu>
> <clock offset='localtime'>
> <timer name='rtc' tickpolicy='catchup'/>
> </clock>
> <on_poweroff>destroy</on_poweroff>
> <on_reboot>restart</on_reboot>
> <on_crash>restart</on_crash>
> <devices>
> <emulator>/usr/libexec/qemu-kvm</emulator>
> <disk type='block' device='disk'>
> <driver name='qemu' type='raw' cache='none' io='native'/>
> <source dev='/dev/drbd1'/>
> <target dev='vda' bus='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x03'
> function='0x0'/>
> </disk>
> <disk type='block' device='disk'>
> <driver name='qemu' type='raw' cache='none' io='native'/>
> <source
> dev='/dev/disk/by-id/usb-WD_Ext_HDD_1021_574D415A4138353838383731-0:0'/>
> <target dev='vdb' bus='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
> function='0x0'/>
> </disk>
> <disk type='file' device='cdrom'>
> <driver name='qemu' type='raw'/>
> <target dev='hdc' bus='ide'/>
> <readonly/>
> <address type='drive' controller='0' bus='1' target='0' unit='0'/>
> </disk>
> <controller type='usb' index='0' model='ich9-ehci1'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x7'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci1'>
> <master startport='0'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x0' multifunction='on'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci2'>
> <master startport='2'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x1'/>
> </controller>
> <controller type='usb' index='0' model='ich9-uhci3'>
> <master startport='4'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x05'
> function='0x2'/>
> </controller>
> <controller type='ide' index='0'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x01'
> function='0x1'/>
> </controller>
> <interface type='bridge'>
> <mac address='52:54:00:a6:92:ca'/>
> <source bridge='br0'/>
> <model type='virtio'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x06'
> function='0x0'/>
> </interface>
> <serial type='pty'>
> <target port='0'/>
> </serial>
> <console type='pty'>
> <target type='serial' port='0'/>
> </console>
> <input type='mouse' bus='ps2'/>
> <graphics type='vnc' port='-1' autoport='yes'/>
> <video>
> <model type='vga' vram='9216' heads='1'/>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x02'
> function='0x0'/>
> </video>
> <memballoon model='virtio'>
> <address type='pci' domain='0x0000' bus='0x00' slot='0x07'
> function='0x0'/>
> </memballoon>
> </devices>
> <qemu:commandline>
> <qemu:arg value='-set'/>
> <qemu:arg value='device.virtio-disk0.x-data-plane=on'/>
> </qemu:commandline>
> </domain>
>
> The above config has already been changed, as I first experimented with
> removing the usb tablet (and installing vmware mouse drivers), turning
> 'x-data-plane on' and so on, hoping to solve the problem... Is there
> anything else I can check the next time the machine freezes?
>
> Regards,
> Saso Slavicic