Date: Thu, 28 Apr 2016 09:22:03 +0100
From: "Daniel P. Berrange"
Message-ID: <20160428082203.GB1797@redhat.com>
Reply-To: "Daniel P. Berrange"
References: <20160427142023.GC17937@redhat.com>
Subject: Re: [Qemu-devel] Hang with migration multi-thread compression under high load
To: "Li, Liang Z"
Cc: "qemu-devel@nongnu.org", "Dr. David Alan Gilbert"

On Thu, Apr 28, 2016 at 03:27:39AM +0000, Li, Liang Z wrote:
> > I've been testing various features of migration and have hit a problem with
> > the multi-thread compression.
> > It works fine when I have 2 or more threads,
> > but if I tell it to only use a single thread, then it almost always hangs
> >
> > I'm doing a migration between 2 guests on the same machine over a tcp
> > localhost socket, using this command line to launch them:
> >
> > /home/berrange/src/virt/qemu/x86_64-softmmu/qemu-system-x86_64
> > -chardev socket,id=mon,path=/var/tmp/qemu-src-4644-monitor.sock
> > -mon chardev=mon,mode=control
> > -display none
> > -vga none
> > -machine accel=kvm
> > -kernel /boot/vmlinuz-4.4.7-300.fc23.x86_64
> > -initrd /home/berrange/src/virt/qemu/tests/migration/initrd-stress.img
> > -append "noapic edd=off printk.time=1 noreplace-smp
> > cgroup_disable=memory pci=noearly console=ttyS0 debug ramsize=1"
> > -chardev stdio,id=cdev0
> > -device isa-serial,chardev=cdev0
> > -m 1536
> > -smp 1
> >
> > The target VM also gets
> >
> > -incoming tcp:localhost:9000
> >
> >
> > When the VM hangs, the source QEMU shows this stack trace:
> >
>
> What's the mean of "VM hangs", the VM has no response?
> or just the live migration process can't not complete.

The live migration process stops transferring any data, and the
monitor on the target host stops responding to input, because the
main thread is stuck in the decompress_data_with_multi_threads
method.

> I do the test in my environment, it works for me.

NB, to make it more likely to happen you need to have a highly
loaded host - if the host is mostly idle it seems to work fine.

> Could you try to exec 'info migrate' in qemu monitor on the source side
> to check if the live migration process is ongoing, if the 'transferred ram'
> keeps unchanged, it shows dad lock happen.

The migration status is "active" and the transferred RAM is stuck
at approx 3-4 MB, not making any progress. As mentioned in the
description, the source QEMU is stuck in a blocking sendmsg() as the
TCP recv buffer is full on the target.
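
For anyone trying to reproduce this, the "status is active but transferred
RAM is not moving" check can be scripted against the QMP monitor socket
rather than typed by hand. The following is only a rough sketch, not
something taken from the QEMU tree: the helper names (qmp_command,
is_stalled, poll_migration), the sampling interval and the "three identical
samples" threshold are all assumptions made for illustration. It relies on
the monitor being in control (QMP) mode, as in the command line above, and
on the "status" and "ram.transferred" fields of query-migrate.

```python
import json
import socket
import time


def qmp_command(sock_file, name):
    """Send one QMP command and return its 'return' payload.

    Asynchronous event messages that arrive in between are skipped.
    """
    sock_file.write(json.dumps({"execute": name}).encode() + b"\r\n")
    sock_file.flush()
    while True:
        line = sock_file.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        msg = json.loads(line)
        if "return" in msg:
            return msg["return"]
        if "error" in msg:
            raise RuntimeError(msg["error"])


def is_stalled(samples, min_samples=3):
    """Return True if the newest samples are all 'active' with no progress.

    samples is a list of (status, transferred_bytes) tuples, oldest first.
    The min_samples threshold is an arbitrary choice for this sketch.
    """
    recent = samples[-min_samples:]
    if len(recent) < min_samples:
        return False
    statuses = {status for status, _ in recent}
    transferred = {nbytes for _, nbytes in recent}
    return statuses == {"active"} and len(transferred) == 1


def poll_migration(path, interval=1.0, count=5):
    """Collect (status, transferred) samples from a QMP unix socket."""
    samples = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock_file = sock.makefile("rwb")
        sock_file.readline()  # discard the QMP greeting banner
        qmp_command(sock_file, "qmp_capabilities")
        for _ in range(count):
            info = qmp_command(sock_file, "query-migrate")
            ram = info.get("ram", {})
            samples.append((info.get("status"), ram.get("transferred", 0)))
            time.sleep(interval)
    return samples
```

With the source QEMU launched as above, something like
is_stalled(poll_migration("/var/tmp/qemu-src-4644-monitor.sock")) would be
expected to return True once the hang has set in, assuming no other client
is already attached to that monitor socket.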

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|