qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Live migration results in non-working virtio-net device (sometimes)
@ 2014-01-30 18:23 Neil Skrypuch
  2014-02-28 20:14 ` Neil Skrypuch
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Neil Skrypuch @ 2014-01-30 18:23 UTC
  To: qemu-devel

First, let me briefly outline the way we use live migration, as it is probably 
not typical. We use live migration (with block migration) to make backups of 
VMs with zero downtime. The basic process goes like this:

1) migrate src VM -> dest VM
2) migration completes
3) cont src VM
4) gracefully shut down dest VM
5) dest VM's disk image is now a valid backup

In general, this works very well.

Up until now we have been using qemu-kvm 1.1.2 and have not had any issues 
with the above process. I am now attempting to upgrade us to a newer version 
of qemu, but all newer versions I've tried occasionally result in the 
virtio-net device ceasing to function on the src VM after step 3.

I am able to reproduce this reliably, given enough iterations: it happens in 
roughly 2% of all migrations.
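
To put "enough iterations" into numbers, here is a rough sketch. It assumes that each migration fails independently with the same probability, which the report does not establish, and the function name iterations_needed is made up for illustration. It computes how many migrations are needed to see at least one failure with a given confidence.

```shell
#!/bin/sh
# Sketch under an ASSUMPTION: every migration fails independently with
# probability p. Prints the smallest n with 1 - (1 - p)^n >= c.
# iterations_needed is a hypothetical helper name, not part of qemu.
iterations_needed() {
    awk -v p="$1" -v c="$2" 'BEGIN {
        n = log(1 - c) / log(1 - p)   # real-valued solution
        k = int(n)
        if (k < n) k++                # round up to a whole migration
        print k
    }'
}

# About 2% failure rate, 95% chance of hitting it at least once.
iterations_needed 0.02 0.95
```

Under that assumption, roughly 150 migrations give a 95% chance of reproducing the failure at least once.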

Here is the complete qemu command line for the src VM:

/usr/bin/qemu-system-x86_64 -machine accel=kvm \
    -drive file=/var/lib/kvm/testbackup.polldev.com.img,if=virtio \
    -m 2048 -smp 4,cores=4,sockets=1,threads=1 \
    -net nic,macaddr=52:54:98:00:00:00,model=virtio \
    -net tap,script=/etc/qemu-ifup-br2,downscript=no \
    -curses \
    -name "testbackup.polldev.com",process=testbackup.polldev.com \
    -monitor unix:/var/lib/kvm/monitor/testbackup,server,nowait

The dest VM:

/usr/bin/qemu-system-x86_64 -machine accel=kvm \
    -drive file=/backup/testbackup.polldev.com.img.bak20140129,if=virtio \
    -m 2048 -smp 4,cores=4,sockets=1,threads=1 \
    -net nic,macaddr=52:54:98:00:00:00,model=virtio \
    -net tap,script=no,downscript=no \
    -curses \
    -name "testbackup.polldev.com",process=testbackup.polldev.com \
    -monitor unix:/var/lib/kvm/monitor/testbackup.bak,server,nowait \
    -incoming tcp:0:4444

The migration is performed like so:

echo "migrate -b tcp:localhost:4444" | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
echo "migrate_set_speed 1G" | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
#wait
echo cont | socat STDIO UNIX-CONNECT:/var/lib/kvm/monitor/testbackup
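
The "#wait" step above could be scripted by polling the monitor's "info migrate" command until it reports completion. The following is only a sketch, not the script actually used here: the helper names (hmp, migration_status, wait_for_migration) and the POLL_INTERVAL variable are made up for illustration, and it assumes the monitor prints a line beginning with "Migration status:".

```shell
#!/bin/sh
# Sketch only: poll the source monitor until the migration finishes.
# The socket path is the one from the commands above; everything else
# (function names, POLL_INTERVAL) is a hypothetical illustration.
MONITOR=${MONITOR:-/var/lib/kvm/monitor/testbackup}

# Send one HMP command to the monitor socket and print the reply.
hmp() {
    echo "$1" | socat STDIO "UNIX-CONNECT:$MONITOR"
}

# Extract the state from "info migrate" output read on stdin.
# Carriage returns are stripped in case the monitor emits CRLF.
migration_status() {
    tr -d '\r' | sed -n 's/^Migration status: //p'
}

# Return 0 once the status is "completed", 1 on "failed" or "cancelled";
# any other state (e.g. "active") means keep waiting.
wait_for_migration() {
    while :; do
        case "$(hmp 'info migrate' | migration_status)" in
            completed) return 0 ;;
            failed|cancelled) return 1 ;;
        esac
        sleep "${POLL_INTERVAL:-5}"
    done
}
```

With these definitions, the last two steps become: wait_for_migration && hmp cont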

The guest in question is a minimal install of CentOS 6.5.

I have observed this issue across the following qemu versions:

qemu 1.4.2
qemu 1.6.0
qemu 1.6.1
qemu 1.7.0

I also attempted to test qemu 1.5.3, but live migration flat out crashed there 
(totally different issue).

I have also tested a number of other scenarios with qemu 1.6.0, all of which 
exhibit the same failure mode:

qemu 1.6.0 + host kernel 3.1.0
qemu 1.6.0 + host kernel 3.10.7
qemu 1.6.0 + host kernel 3.10.17
qemu 1.6.0 + virtio with -netdev/-device syntax
qemu 1.6.0 + accel=tcg

The one case I have found that works properly is the following:

qemu 1.6.0 + e1000

It is worth noting that when the virtio-net device ceases to function in the 
guest, removing and reinserting the virtio-net kernel module makes the device 
work again (except in 1.4.2, where this had no effect).
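
For reference, the module reload could look like the sketch below, run as root inside the guest. It assumes the module is named virtio_net, as in mainline Linux; the RUN variable is a made-up dry-run hook, not something the guest provides.

```shell
#!/bin/sh
# Sketch only: remove and reinsert the virtio-net driver inside the guest.
# ASSUMPTION: the module is called virtio_net. RUN is a hypothetical
# hook; set RUN=echo to print the commands instead of running them.
reload_virtio_net() {
    ${RUN:-} modprobe -r virtio_net && ${RUN:-} modprobe virtio_net
}
```

Calling reload_virtio_net with RUN unset performs the reload; since this takes the network device away briefly, it is best done from the console rather than over ssh.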

As mentioned above, I can reproduce this with minimal effort, and I am willing 
to test any patches or provide further details as necessary.

- Neil

Thread overview: 7+ messages
2014-01-30 18:23 [Qemu-devel] Live migration results in non-working virtio-net device (sometimes) Neil Skrypuch
2014-02-28 20:14 ` Neil Skrypuch
2014-03-01  2:34   ` 陈梁
2014-03-03 20:15     ` Neil Skrypuch
2014-03-05 15:59 ` Andreas Färber
2014-03-05 18:32   ` Neil Skrypuch
2014-03-08 15:02 ` Stefan Hajnoczi
