* Re: [Qemu-devel] troubleshooting live migration
From: Marcin Gibuła @ 2014-01-16 22:33 UTC
  To: qemu-devel

 > I tried -no-hpet, was still able to replicate the 'lapic' issue. I
 > find it interesting that I can only trigger it if the vm has been
 > running awhile.

Hi,

I've seen identical crashes with live migration in our environment. The
pattern is the same: the VM has to be idle for some time, and after
migration the CPU is at 100% and the VM is dead. All migrations happen
between identical hardware.

I don't think I've ever had a Windows guest crash like this, so I suspect
this is somehow related to kvmclock. I've tried to debug the qemu guest
process and, from what I can tell, its kernel is busy-looping in some
time-management-related functions. Could you try to reproduce this issue
with -no-kvmclock? Our testing environment is currently offline, so I
can't test it myself.
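
(A rough sketch of the usual ways to turn kvmclock off, none of which
I've verified on this particular setup; the CPU model below is just a
placeholder:

  # on the guest kernel command line:
  no-kvmclock

  # or on the qemu command line, masking the paravirt feature:
  -cpu qemu64,-kvmclock

  # or in the libvirt <clock> element:
  <timer name='kvmclock' present='no'/>
)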

We also use a 3.10 kernel (though 3.8 wasn't working either) and have
struggled with this issue on qemu 1.4, 1.5 and 1.6; I didn't test 1.7.
We're also using AMD CPUs, so it seems to be platform independent.

-- 
mg

* [Qemu-devel] troubleshooting live migration
From: Marcus Sorensen @ 2014-01-14 15:31 UTC
  To: qemu-devel

Does anyone have tips on troubleshooting live migration? I've got
several E5-2650 servers running in a test environment, kernel 3.10.26
and qemu 1.7.0. If I start a VM guest (say ubuntu, debian, or centos),
I can migrate it around from host to host just fine, but if I wait a
while (say an hour) and then migrate, the migration succeeds but the
guest is hosed: it no longer pings and the CPU is thrashing. I've
straced it and don't see anything that other, working guests aren't
doing, and I've tried gdb, but I'm not entirely sure what I'm doing. I
also tried downgrading to qemu 1.6.1. I've found dozens of reports of
similar behavior, but they're all due to other things (migrating
between different host CPUs, someone thinking it's virtio or the
memballoon only to later find a fix like changing the machine type,
etc.). I'm at a loss. This seems to work just fine with stock CentOS
builds.
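
(In case specifics help, this is the sort of thing I've been trying;
treat it as a sketch, since I'm still guessing at the right
incantations. The guest name and binary path come from the XML below,
and the gdb line assumes only one qemu-kvm process on the host:

  # migration and vCPU state via the HMP monitor:
  virsh qemu-monitor-command VM12 --hmp 'info migrate'
  virsh qemu-monitor-command VM12 --hmp 'info registers'

  # per-thread backtraces of the qemu process:
  gdb -p $(pidof qemu-kvm) -batch -ex 'thread apply all bt'
)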

I'd be happy to try to capture a core if someone is willing to look at it.
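
For example, something along these lines (paths are just placeholders):

  # memory-only core of the guest via libvirt:
  virsh dump --memory-only VM12 /var/tmp/VM12.core

  # or a full core of the qemu process itself with gdb's gcore:
  gcore -o /var/tmp/VM12-qemu $(pidof qemu-kvm)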

Here's an example xml:

<domain type='kvm'>
  <name>VM12</name>
  <uuid>dd25acfc-e24d-4de6-814c-72ac465bc208</uuid>
  <description></description>
  <memory unit='KiB'>4194304</memory>
  <currentMemory unit='KiB'>4194304</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <cputune>
    <shares>2000</shares>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.7'>hvm</type>
    <boot dev='cdrom'/>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
  </cpu>
  <clock offset='utc'>
    <timer name='kvmclock' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/sdc'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <interface type='bridge'>
      <mac address='02:00:09:66:00:18'/>
      <source bridge='br1000192'/>
      <model type='virtio'/>
      <bandwidth>
        <inbound average='128000' peak='128000'/>
        <outbound average='128000' peak='128000'/>
      </bandwidth>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/VM12.agent'/>
      <target type='virtio' name='VM12.vport'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
</domain>
