From: Gary Hook
Date: Thu, 30 Oct 2014 16:49:31 +0000
Subject: Re: [Qemu-devel] Bug in recent postcopy patch
To: "qemu-devel@nongnu.org"
Cc: "Dr. David Alan Gilbert"
References: <20141030100344.GE2376@work-vm>
In-Reply-To: <20141030100344.GE2376@work-vm>

On 10/30/14, 5:03 AM, "Dr. David Alan Gilbert" wrote:

>* Gary Hook (gary.hook@nimboxx.com) wrote:
>> *Knock* *knock* *knock* Is this thing on?
>
>Yes - but only by luck did I notice this; it's normally better
>to reply to the thread that posted a patch and cc the authors!

Well, that depends upon the developers, I think. I was gently admonished
on another list for addressing a developer (inadvertently) directly. But I
appreciate your openness, and would not want to abuse your attention.

>> I applied the 47 pieces of the recent postcopy patch to 2.1.2 and am
>> poking around. An attempt to migrate results in a NULL pointer
>> dereference in savevm.c.
>> Here is info from gdb:
>
>I've not tried migrating with block migration; so can you
>show the command line you used on qemu and the sequence of commands
>you used to trigger the migration?

Yessir. We invoke the emulator from libvirt. While the problem we are
dealing with applies to any VM, the one I am working with is invoked
thusly (edited for readability):

qemu-system-x86_64 -enable-kvm -name 88dbaf46-4692-4935-bd9d-8d8fac7725a9 \
    -S -machine pc-0.14,accel=kvm,usb=off -m 1024 -realtime mlock=off \
    -smp 1,sockets=1,cores=1,threads=1 \
    -uuid 88dbaf46-4692-4935-bd9d-8d8fac7725a9 -no-user-config -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/88dbaf46-4692-4935-bd9d-8d8fac7725a9.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime \
    -no-shutdown -boot strict=on \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -drive file=/mnt/store01/virt/88dbaf46-4692-4935-bd9d-8d8fac7725a9.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=writeback \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 \
    -netdev tap,fd=29,id=hostnet0 \
    -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:07:19:5e,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=serial0 \
    -vnc 127.0.0.1:0,password -device VGA,id=video0,bus=pci.0,addr=0x2 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 \
    -msg timestamp=on

I posted another thread asking about migration failure due to a copy
taking too long, but got no traction. In the case where the problem raises
its head we have turned tunneling on. A tiny VM (<2GB in size) migrates
fine using the same procedure.
Again, no shared storage.

>>Q: why is max_size == 0? Does this seem correct?
>
>Yes, I think that's normal for the 1st time through the loop; (see
>migration_thread near the start max_size is initialised to 0).

Thank you; will do.

>> The patches appear to have been fully applied, but it would seem that
>> the savevm_block_handlers structure needs to be updated to populate this
>> field? Which implies that a new function will have to be written?
>>
>> Or, if I have missed the obvious, I would appreciate enlightenment.
>
>Simple bug on my part; the line:
>
>    if (se->ops->can_postcopy(se->opaque)) {
>
>needs to become:
>    if (se->ops->can_postcopy &&
>        se->ops->can_postcopy(se->opaque)) {

I wondered if that were not the case. I will make that change and see what
happens.

>Thanks for the report.

Thank you for your time and ownership.

Gary
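
For readers following the fix quoted above, here is a minimal standalone sketch of the pattern: an optional callback in a handler table has to be checked for NULL before it is called. Everything named Demo* or demo_* below is a hypothetical stand-in invented for illustration and is not QEMU's actual definition; only the field name can_postcopy and the shape of the guarded condition come from the thread. The assumption is that the block handlers simply leave the field unset, which is what the crash report suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified handler table (not QEMU's real struct). */
typedef struct DemoHandlers {
    bool (*can_postcopy)(void *opaque);   /* optional: may be NULL */
} DemoHandlers;

/* Hypothetical, simplified state entry holding a handler table. */
typedef struct DemoStateEntry {
    const DemoHandlers *ops;
    void *opaque;
} DemoStateEntry;

/* A handler that opts in to postcopy. */
static bool demo_ram_can_postcopy(void *opaque)
{
    (void)opaque;
    return true;
}

/* One table fills the field in, the other leaves it NULL. */
static const DemoHandlers demo_ram_handlers = {
    .can_postcopy = demo_ram_can_postcopy,
};
static const DemoHandlers demo_block_handlers = {
    .can_postcopy = NULL,
};

/*
 * Guarded call: the && short-circuits, so the function pointer is
 * only invoked when it is non-NULL. An unset callback is treated
 * as "cannot postcopy".
 */
static bool demo_entry_can_postcopy(const DemoStateEntry *se)
{
    assert(se != NULL && se->ops != NULL);
    return se->ops->can_postcopy &&
           se->ops->can_postcopy(se->opaque);
}
```

With this guard, an entry built on demo_block_handlers reports false instead of jumping through a NULL pointer, while an entry built on demo_ram_handlers still reaches its callback. Treating a missing callback as "no" is a design choice in this sketch; whether that is the right default for every handler is for the patch author to decide.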