* [Qemu-devel] HEAD is failing virt-test on migration tests
From: Juan Quintela @ 2015-02-12 22:12 UTC
To: Developers qemu-devel, amit.shah, Dave Gilbert

Hi

while testing my changes I noticed that virt-test was failing. I
checked out master, and the failures are there too.

This is one extract of the log after the 1st failure. Notice that it
fails randomly, not every time.

I have to go to bed right now, so if anybody beats me to a fix, I
would be happy when I wake up.

Thanks, Juan.

22:54:07 DEBUG| (monitor hmp1) Response to 'info migrate'
22:54:07 DEBUG| (monitor hmp1) capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off
22:54:07 DEBUG| (monitor hmp1) Migration status: active
22:54:07 DEBUG| (monitor hmp1) total time: 2003 milliseconds
22:54:07 DEBUG| (monitor hmp1) expected downtime: 300 milliseconds
22:54:07 DEBUG| (monitor hmp1) setup: 3 milliseconds
22:54:07 DEBUG| (monitor hmp1) transferred ram: 67619 kbytes
22:54:07 DEBUG| (monitor hmp1) throughput: 268.61 mbps
22:54:07 DEBUG| (monitor hmp1) remaining ram: 103056 kbytes
22:54:07 DEBUG| (monitor hmp1) total ram: 1065796 kbytes
22:54:07 DEBUG| (monitor hmp1) duplicate: 224304 pages
22:54:07 DEBUG| (monitor hmp1) skipped: 0 pages
22:54:07 DEBUG| (monitor hmp1) normal: 16380 pages
22:54:07 DEBUG| (monitor hmp1) normal bytes: 65520 kbytes
22:54:07 DEBUG| (monitor hmp1) dirty sync count: 0
22:54:09 DEBUG| Waiting for migration to complete (4.006475 secs)
22:54:09 DEBUG| (monitor hmp1) Sending command 'info migrate'
22:54:09 DEBUG| Send command: info migrate
22:54:09 DEBUG| (monitor hmp1) Response to 'info migrate'
22:54:09 DEBUG| (monitor hmp1) capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off
22:54:09 DEBUG| (monitor hmp1) Migration status: active
22:54:09 DEBUG| (monitor hmp1) total time: 4008 milliseconds
22:54:09 DEBUG| (monitor hmp1) expected downtime: 300 milliseconds
22:54:09 DEBUG| (monitor hmp1) setup: 3 milliseconds
22:54:09 DEBUG| (monitor hmp1) transferred ram: 131397 kbytes
22:54:09 DEBUG| (monitor hmp1) throughput: 268.57 mbps
22:54:09 DEBUG| (monitor hmp1) remaining ram: 31392 kbytes
22:54:09 DEBUG| (monitor hmp1) total ram: 1065796 kbytes
22:54:09 DEBUG| (monitor hmp1) duplicate: 226311 pages
22:54:09 DEBUG| (monitor hmp1) skipped: 0 pages
22:54:09 DEBUG| (monitor hmp1) normal: 32289 pages
22:54:09 DEBUG| (monitor hmp1) normal bytes: 129156 kbytes
22:54:09 DEBUG| (monitor hmp1) dirty sync count: 0
22:54:11 DEBUG| Waiting for migration to complete (6.011556 secs)
22:54:11 DEBUG| (monitor hmp1) Sending command 'info migrate'
22:54:11 DEBUG| Send command: info migrate
22:54:32 WARNI| virt-tests-vm1 is not alive. Can not query the register status
22:58:11 DEBUG| Destroying VM virt-tests-vm1 (PID 10880)
22:58:11 DEBUG| Ending VM virt-tests-vm1 process (monitor)
22:58:11 INFO | [qemu output] (Process terminated with status 0)
22:58:11 DEBUG| VM virt-tests-vm1 down (monitor)
22:58:11 DEBUG| Host does not support OpenVSwitch: Missing command: ovs-vswitchd
22:58:11 DEBUG| Destroying VM virt-tests-vm1 (PID 10763)
22:58:11 DEBUG| Shutting down VM virt-tests-vm1 (shell)
22:58:11 DEBUG| Login command: 'ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o PreferredAuthentications=password -p 5000 root@192.168.10.200'
22:58:11 DEBUG| virt-tests-vm1 alive now. Used to failed to get register info from guest 9 times
22:58:13 INFO | [qemu output] (Process terminated with status 0)
22:58:13 DEBUG| VM virt-tests-vm1 down (shell)
22:58:14 DEBUG| Host does not support OpenVSwitch: Missing command: ovs-vswitchd
22:58:14 DEBUG| Checking image file /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2
22:58:14 DEBUG| Running '/bin/qemu-img info /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2'
22:58:14 DEBUG| Running '/bin/qemu-img check /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2'
22:58:14 ERROR| [stdout]
22:58:14 ERROR| [stdout] 1 errors were found on the image.
22:58:14 ERROR| [stdout] Data may be corrupted, or further writes to the image may corrupt it.
22:58:14 ERROR| [stdout] 13495/163840 = 8.24% allocated, 0.03% fragmented, 0.00% compressed clusters
22:58:14 ERROR| [stdout] Image end offset: 885129216
22:58:14 ERROR| [stderr] ERROR cluster 13505 refcount=1 reference=2
22:58:14 ERROR| Errors found on image: '/mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2'
22:58:14 WARNI| virt-tests-vm1 is not alive. Can not query the register status
22:58:14 DEBUG| Thread quit. Used to failed to get register info from guest 20150212-225320-Mb1E4VV7 for 1 times.
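For anyone who wants to pull the numbers out of a log like the one above, here is a minimal sketch. It assumes the virt-test debug format seen in this extract (`HH:MM:SS LEVEL| (monitor hmp1) key: value`). The names `parse_info_migrate`, `remaining_kbytes` and `SAMPLE` are made up for illustration and are not part of virt-test or QEMU.

```python
import re

# Matches monitor lines such as:
#   22:54:07 DEBUG| (monitor hmp1) remaining ram: 103056 kbytes
FIELD_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}) \w+\s*\| \(monitor \w+\) "
    r"(?P<key>[^:]+): (?P<value>.*)$"
)


def parse_info_migrate(log_text):
    """Group 'key: value' monitor fields by the timestamp of each poll.

    Assumption: polls are at least one second apart, so the timestamp
    is enough to tell them apart (true for the 2 s polling seen here).
    """
    polls = {}
    for line in log_text.splitlines():
        m = FIELD_RE.match(line.strip())
        # The capabilities line holds several "name: value" pairs; skip it.
        if m is None or m.group("key") == "capabilities":
            continue
        polls.setdefault(m.group("ts"), {})[m.group("key")] = m.group("value")
    return polls


def remaining_kbytes(polls):
    """Return [(timestamp, remaining ram in kbytes), ...] in time order."""
    result = []
    for ts in sorted(polls):  # string sort is fine within a single day
        value = polls[ts].get("remaining ram")
        if value is not None:
            result.append((ts, int(value.split()[0])))
    return result


# Lines copied from the extract above.
SAMPLE = """\
22:54:07 DEBUG| (monitor hmp1) Migration status: active
22:54:07 DEBUG| (monitor hmp1) remaining ram: 103056 kbytes
22:54:09 DEBUG| Send command: info migrate
22:54:09 DEBUG| (monitor hmp1) Migration status: active
22:54:09 DEBUG| (monitor hmp1) remaining ram: 31392 kbytes
"""

for ts, kbytes in remaining_kbytes(parse_info_migrate(SAMPLE)):
    print(ts, kbytes)
```

On the two polls in the extract this shows the remaining RAM dropping from 103056 to 31392 kbytes, so the migration was still making progress right before the monitor went quiet.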
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Lucas Meneghel Rodrigues @ 2015-02-12 22:19 UTC
To: quintela; +Cc: amit.shah, Developers qemu-devel, Dave Gilbert

From what the log says, after a round of migrations 'info migrate' does
not respond after 4 minutes, timing out. Virt Test then shuts down the VM.
When it tries to check the qcow2 image, it is corrupted. I'm checking out
the latest master to see how reproducible this problem is.

-- 
Lucas
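The stall described above shows up in the log as a jump between consecutive timestamps. A minimal sketch for spotting such jumps follows. It assumes every log line starts with an `HH:MM:SS` timestamp and that the log does not cross midnight. The name `find_gaps` and the 10 second threshold are illustrative choices, not anything virt-test provides.

```python
import re
from datetime import datetime

TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) ")


def find_gaps(log_lines, threshold_s=10):
    """Return (seconds, previous_line, next_line) for every quiet period.

    Assumption: timestamps never wrap past midnight within the log.
    """
    gaps = []
    prev_ts = prev_line = None
    for line in log_lines:
        m = TS_RE.match(line)
        if m is None:
            continue  # continuation line without a timestamp
        ts = datetime.strptime(m.group(1), "%H:%M:%S")
        if prev_ts is not None:
            delta = int((ts - prev_ts).total_seconds())
            if delta >= threshold_s:
                gaps.append((delta, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


# Lines copied from the log extract in this thread.
SAMPLE_LINES = [
    "22:54:09 DEBUG| (monitor hmp1) dirty sync count: 0",
    "22:54:11 DEBUG| Waiting for migration to complete (6.011556 secs)",
    "22:54:11 DEBUG| Send command: info migrate",
    "22:54:32 WARNI| virt-tests-vm1 is not alive. Can not query the register status",
    "22:58:11 DEBUG| Destroying VM virt-tests-vm1 (PID 10880)",
]

for seconds, before, after in find_gaps(SAMPLE_LINES):
    print(seconds, "s of silence after:", before)
```

On these lines it reports two gaps, 21 s and 219 s, which together make up the 240 s between the last 'info migrate' being sent and the VM being destroyed.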
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Lucas Meneghel Rodrigues @ 2015-02-12 22:56 UTC
To: quintela; +Cc: amit.shah, Developers qemu-devel, Dave Gilbert

OK, indeed I can reproduce the problem. It's specific to the file
descriptor (fd) migration. An easy way to reproduce it is by doing:

git clone https://github.com/autotest/virt-test.git
cd virt-test
./run -t qemu --bootstrap
./run -t qemu --tests type_specific.io-github-autotest-qemu.migrate.default.fd

That's it. I will see if I can bisect this quickly to pinpoint the QEMU
commit that brought the regression.

The qemu master commit I just tested is:

commit 449008f86418583a1f0fb946cf91ee7b4797317d
Merge: 5c697ae bc5baff
Author: Peter Maydell <peter.maydell@linaro.org>
Date:   Wed Feb 11 05:14:41 2015 +0000

    Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20150210.0' into staging

    RCU fixes and cleanup (Paolo Bonzini)
    Switch to v2 IOMMU interface (Alex Williamson)
    DEBUG build fix (Alexey Kardashevskiy)

    # gpg: Signature made Tue 10 Feb 2015 17:37:06 GMT using RSA key ID 3BB08B22
    # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
    # gpg:                 aka "Alex Williamson <alex@shazbot.org>"
    # gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
    # gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

    * remotes/awilliam/tags/vfio-update-20150210.0:
      vfio: Fix debug message compile error
      vfio: Use vfio type1 v2 IOMMU interface
      vfio: unmap and free BAR data in instance_finalize
      vfio: free dynamically-allocated data in instance_finalize
      vfio: cleanup vfio_get_device error path, remove vfio_populate_device callback
      memory: unregister AddressSpace MemoryListener within BQL

    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

-- 
Lucas
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Lucas Meneghel Rodrigues @ 2015-02-12 23:03 UTC
To: quintela; +Cc: amit.shah, Developers qemu-devel, Dave Gilbert

On Thu, Feb 12, 2015 at 8:56 PM, Lucas Meneghel Rodrigues <lookkas@gmail.com> wrote:

> OK, indeed I can reproduce the problem. It's specific to the
> filedescriptor migration. An easy way to reproduce it is by doing:
>
> git clone https://github.com/autotest/virt-test.git
>
> cd virt-test
> ./run -t qemu --bootstrap
> ./run -t qemu
> --tests type_specific.io-github-autotest-qemu.migrate.default.fd

A little correction here, it should've been:

./run -t qemu --tests type_specific.io-github-autotest-qemu.migrate.default.fd --qemu-bin /path/to/qemu-built-from-master

-- 
Lucas
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-12 22:56 ` Lucas Meneghel Rodrigues 2015-02-12 23:03 ` Lucas Meneghel Rodrigues @ 2015-02-13 0:29 ` Lucas Meneghel Rodrigues 2015-02-13 0:36 ` Alexander Graf 1 sibling, 1 reply; 17+ messages in thread From: Lucas Meneghel Rodrigues @ 2015-02-13 0:29 UTC (permalink / raw) To: quintela; +Cc: amit.shah, Alexander Graf, Developers qemu-devel, Dave Gilbert [-- Attachment #1: Type: text/plain, Size: 9297 bytes --] Copying Alex. OK, after bisecting, this is what I've got: 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit commit 8118f0950fc77cce7873002a5021172dd6e040b5 Author: Alexander Graf <agraf@suse.de> Date: Thu Jan 22 15:01:39 2015 +0100 migration: Append JSON description of migration stream One of the annoyances of the current migration format is the fact that it's not self-describing. In fact, it's not properly describing at all. Some code randomly scattered throughout QEMU elaborates roughly how to read and write a stream of bytes. We discussed an idea during KVM Forum 2013 to add a JSON description of the migration protocol itself to the migration stream. This patch adds a section after the VM_END migration end marker that contains description data on what the device sections of the stream are composed of. This approach is backwards compatible with any QEMU version reading the stream, because QEMU just stops reading after the VM_END marker and ignores any data following it. With an additional external program this allows us to decipher the contents of any migration stream and hopefully make migration bugs easier to track down. 
Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 61df81f831bc86b29f65883523ea95abb36f1ec5 M hw :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 7092a6b6515a3d0077f68ff2d80dbd74597a244f M include :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 c2b1dcda197d96657458d699c185e39ae45f3c6c M migration :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d 80407662ad3ed860d33a9d35f5c44b1d19c4612b M savevm.c :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa 7aaf3d10ef7f73413b228e854fe6f04317151e46 M tests So there you go. I'm going to sleep, if you need any extra help let me know. Cheers, Lucas On Thu, Feb 12, 2015 at 8:56 PM, Lucas Meneghel Rodrigues <lookkas@gmail.com > wrote: > OK, indeed I can reproduce the problem. It's specific to the > filedescriptor migration. An easy way to reproduce it is by doing: > > git clone https://github.com/autotest/virt-test.git > > cd virt-test > ./run -t qemu --bootstrap > ./run -t qemu > --tests type_specific.io-github-autotest-qemu.migrate.default.fd > > That's it. I will see if I can bisect this quickly to pinpoint the QEMU > commit that brought the regression. 
> > The qemu master commit I just tested is: > > commit 449008f86418583a1f0fb946cf91ee7b4797317d > Merge: 5c697ae bc5baff > Author: Peter Maydell <peter.maydell@linaro.org> > Date: Wed Feb 11 05:14:41 2015 +0000 > > Merge remote-tracking branch > 'remotes/awilliam/tags/vfio-update-20150210.0' into staging > > RCU fixes and cleanup (Paolo Bonzini) > Switch to v2 IOMMU interface (Alex Williamson) > DEBUG build fix (Alexey Kardashevskiy) > > # gpg: Signature made Tue 10 Feb 2015 17:37:06 GMT using RSA key ID > 3BB08B22 > # gpg: Good signature from "Alex Williamson < > alex.williamson@redhat.com>" > # gpg: aka "Alex Williamson <alex@shazbot.org>" > # gpg: aka "Alex Williamson <alwillia@redhat.com>" > # gpg: aka "Alex Williamson < > alex.l.williamson@gmail.com>" > > * remotes/awilliam/tags/vfio-update-20150210.0: > vfio: Fix debug message compile error > vfio: Use vfio type1 v2 IOMMU interface > vfio: unmap and free BAR data in instance_finalize > vfio: free dynamically-allocated data in instance_finalize > vfio: cleanup vfio_get_device error path, remove > vfio_populate_device callback > memory: unregister AddressSpace MemoryListener within BQL > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > > > On Thu, Feb 12, 2015 at 8:19 PM, Lucas Meneghel Rodrigues < > lookkas@gmail.com> wrote: > >> From what the log says, after a round of migrations 'info migrate' does >> not respond after 4 minutes, timing out. Virt Test then shuts down the VM. >> When it tries to check the qcow2 image, it is corrupted. I'm checking out >> the latest master to see how reproducible this problem is. >> >> On Thu, Feb 12, 2015 at 8:12 PM, Juan Quintela <quintela@redhat.com> >> wrote: >> >>> >>> Hi >>> >>> while testing my changes I noticed that virt-test was failing. I >>> check-out master, and failures are there. >>> >>> This is one extract of the log after the 1st failure. Notice that it >>> fails randomly, not every time. 
>>> >>> I have to go to bed right now, so if anybody beats me with a fix, I >>> would be happy when I wakeup. >>> >>> Thanks, Juan. >>> >>> >>> 22:54:07 DEBUG| (monitor hmp1) Response to 'info migrate' >>> 22:54:07 DEBUG| (monitor hmp1) capabilities: xbzrle: off >>> rdma-pin-all: off auto-converge: off zero-blocks: off >>> 22:54:07 DEBUG| (monitor hmp1) Migration status: active >>> 22:54:07 DEBUG| (monitor hmp1) total time: 2003 milliseconds >>> 22:54:07 DEBUG| (monitor hmp1) expected downtime: 300 milliseconds >>> 22:54:07 DEBUG| (monitor hmp1) setup: 3 milliseconds >>> 22:54:07 DEBUG| (monitor hmp1) transferred ram: 67619 kbytes >>> 22:54:07 DEBUG| (monitor hmp1) throughput: 268.61 mbps >>> 22:54:07 DEBUG| (monitor hmp1) remaining ram: 103056 kbytes >>> 22:54:07 DEBUG| (monitor hmp1) total ram: 1065796 kbytes >>> 22:54:07 DEBUG| (monitor hmp1) duplicate: 224304 pages >>> 22:54:07 DEBUG| (monitor hmp1) skipped: 0 pages >>> 22:54:07 DEBUG| (monitor hmp1) normal: 16380 pages >>> 22:54:07 DEBUG| (monitor hmp1) normal bytes: 65520 kbytes >>> 22:54:07 DEBUG| (monitor hmp1) dirty sync count: 0 >>> 22:54:09 DEBUG| Waiting for migration to complete (4.006475 secs) >>> 22:54:09 DEBUG| (monitor hmp1) Sending command 'info migrate' >>> 22:54:09 DEBUG| Send command: info migrate >>> 22:54:09 DEBUG| (monitor hmp1) Response to 'info migrate' >>> 22:54:09 DEBUG| (monitor hmp1) capabilities: xbzrle: off >>> rdma-pin-all: off auto-converge: off zero-blocks: off >>> 22:54:09 DEBUG| (monitor hmp1) Migration status: active >>> 22:54:09 DEBUG| (monitor hmp1) total time: 4008 milliseconds >>> 22:54:09 DEBUG| (monitor hmp1) expected downtime: 300 milliseconds >>> 22:54:09 DEBUG| (monitor hmp1) setup: 3 milliseconds >>> 22:54:09 DEBUG| (monitor hmp1) transferred ram: 131397 kbytes >>> 22:54:09 DEBUG| (monitor hmp1) throughput: 268.57 mbps >>> 22:54:09 DEBUG| (monitor hmp1) remaining ram: 31392 kbytes >>> 22:54:09 DEBUG| (monitor hmp1) total ram: 1065796 kbytes >>> 22:54:09 DEBUG| 
(monitor hmp1) duplicate: 226311 pages >>> 22:54:09 DEBUG| (monitor hmp1) skipped: 0 pages >>> 22:54:09 DEBUG| (monitor hmp1) normal: 32289 pages >>> 22:54:09 DEBUG| (monitor hmp1) normal bytes: 129156 kbytes >>> 22:54:09 DEBUG| (monitor hmp1) dirty sync count: 0 >>> 22:54:11 DEBUG| Waiting for migration to complete (6.011556 secs) >>> 22:54:11 DEBUG| (monitor hmp1) Sending command 'info migrate' >>> 22:54:11 DEBUG| Send command: info migrate >>> 22:54:32 WARNI| virt-tests-vm1 is not alive. Can not query the register >>> status >>> 22:58:11 DEBUG| Destroying VM virt-tests-vm1 (PID 10880) >>> 22:58:11 DEBUG| Ending VM virt-tests-vm1 process (monitor) >>> 22:58:11 INFO | [qemu output] (Process terminated with status 0) >>> 22:58:11 DEBUG| VM virt-tests-vm1 down (monitor) >>> 22:58:11 DEBUG| Host does not support OpenVSwitch: Missing command: >>> ovs-vswitchd >>> 22:58:11 DEBUG| Destroying VM virt-tests-vm1 (PID 10763) >>> 22:58:11 DEBUG| Shutting down VM virt-tests-vm1 (shell) >>> 22:58:11 DEBUG| Login command: 'ssh -o UserKnownHostsFile=/dev/null -o >>> StrictHostKeyChecking=no -o PreferredAuthentications=password -p 5000 >>> root@192.168.10.200' >>> 22:58:11 DEBUG| virt-tests-vm1 alive now. Used to failed to get register >>> info from guest 9 times >>> 22:58:13 INFO | [qemu output] (Process terminated with status 0) >>> 22:58:13 DEBUG| VM virt-tests-vm1 down (shell) >>> 22:58:14 DEBUG| Host does not support OpenVSwitch: Missing command: >>> ovs-vswitchd >>> 22:58:14 DEBUG| Checking image file >>> /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2 >>> 22:58:14 DEBUG| Running '/bin/qemu-img info >>> /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2' >>> 22:58:14 DEBUG| Running '/bin/qemu-img check >>> /mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2' >>> 22:58:14 ERROR| [stdout] >>> 22:58:14 ERROR| [stdout] 1 errors were found on the image. 
>>> 22:58:14 ERROR| [stdout] Data may be corrupted, or further writes to the image may corrupt it.
>>> 22:58:14 ERROR| [stdout] 13495/163840 = 8.24% allocated, 0.03% fragmented, 0.00% compressed clusters
>>> 22:58:14 ERROR| [stdout] Image end offset: 885129216
>>> 22:58:14 ERROR| [stderr] ERROR cluster 13505 refcount=1 reference=2
>>> 22:58:14 ERROR| Errors found on image: '/mnt/kvm/src/virt-test/shared/data/images/jeos-20-64.qcow2'
>>> 22:58:14 WARNI| virt-tests-vm1 is not alive. Can not query the register status
>>> 22:58:14 DEBUG| Thread quit. Used to failed to get register info from guest 20150212-225320-Mb1E4VV7 for 1 times.
>>>
>>
>> --
>> Lucas
>
> --
> Lucas

--
Lucas

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 0:29 ` Lucas Meneghel Rodrigues @ 2015-02-13 0:36 ` Alexander Graf 2015-02-13 9:04 ` Dr. David Alan Gilbert 2015-02-13 11:09 ` Lucas Meneghel Rodrigues 0 siblings, 2 replies; 17+ messages in thread From: Alexander Graf @ 2015-02-13 0:36 UTC (permalink / raw) To: Lucas Meneghel Rodrigues, quintela Cc: amit.shah, Developers qemu-devel, Dave Gilbert On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: > Copying Alex. > > OK, after bisecting, this is what I've got: > > 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit > commit 8118f0950fc77cce7873002a5021172dd6e040b5 > Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > Date: Thu Jan 22 15:01:39 2015 +0100 > > migration: Append JSON description of migration stream > > One of the annoyances of the current migration format is the fact that > it's not self-describing. In fact, it's not properly describing at all. > Some code randomly scattered throughout QEMU elaborates roughly how to > read and write a stream of bytes. > > We discussed an idea during KVM Forum 2013 to add a JSON description of > the migration protocol itself to the migration stream. This patch > adds a section after the VM_END migration end marker that contains > description data on what the device sections of the stream are > composed of. > > This approach is backwards compatible with any QEMU version reading the > stream, because QEMU just stops reading after the VM_END marker and > ignores > any data following it. > > With an additional external program this allows us to decipher the > contents of any migration stream and hopefully make migration bugs > easier > to track down. 
> > Signed-off-by: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > Signed-off-by: Amit Shah <amit.shah@redhat.com > <mailto:amit.shah@redhat.com>> > Signed-off-by: Juan Quintela <quintela@redhat.com > <mailto:quintela@redhat.com>> > > :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 > 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw > :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 > 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude > :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 > c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration > :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d > 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c > :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa > 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests > > So there you go. I'm going to sleep, if you need any extra help let me know. So the major difference with this patch applied is that the sender could send more data than the receive wants to read. I can't see the actual migrate command you used down there. I haven't seen this actually being a problem so far, as the receiver just close()s its file descriptor once it hits VM_EOF. This should only break senders if they expect they can send more. That said, I think I only tested offline migration (via exec:), so maybe QEMU is behaving badly and actually wants to send all data and just fails the migration without? Alex ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 0:36 ` Alexander Graf @ 2015-02-13 9:04 ` Dr. David Alan Gilbert 2015-02-13 11:18 ` Alexander Graf 2015-02-13 11:09 ` Lucas Meneghel Rodrigues 1 sibling, 1 reply; 17+ messages in thread From: Dr. David Alan Gilbert @ 2015-02-13 9:04 UTC (permalink / raw) To: Alexander Graf Cc: amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela * Alexander Graf (agraf@suse.de) wrote: > > > On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: > > Copying Alex. > > > > OK, after bisecting, this is what I've got: > > > > 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit > > commit 8118f0950fc77cce7873002a5021172dd6e040b5 > > Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > > Date: Thu Jan 22 15:01:39 2015 +0100 > > > > migration: Append JSON description of migration stream > > > > One of the annoyances of the current migration format is the fact that > > it's not self-describing. In fact, it's not properly describing at all. > > Some code randomly scattered throughout QEMU elaborates roughly how to > > read and write a stream of bytes. > > > > We discussed an idea during KVM Forum 2013 to add a JSON description of > > the migration protocol itself to the migration stream. This patch > > adds a section after the VM_END migration end marker that contains > > description data on what the device sections of the stream are > > composed of. > > > > This approach is backwards compatible with any QEMU version reading the > > stream, because QEMU just stops reading after the VM_END marker and > > ignores > > any data following it. > > > > With an additional external program this allows us to decipher the > > contents of any migration stream and hopefully make migration bugs > > easier > > to track down. 
> > > > Signed-off-by: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > > Signed-off-by: Amit Shah <amit.shah@redhat.com > > <mailto:amit.shah@redhat.com>> > > Signed-off-by: Juan Quintela <quintela@redhat.com > > <mailto:quintela@redhat.com>> > > > > :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 > > 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw > > :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 > > 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude > > :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 > > c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration > > :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d > > 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c > > :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa > > 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests > > > > So there you go. I'm going to sleep, if you need any extra help let me know. > > So the major difference with this patch applied is that the sender could > send more data than the receive wants to read. I can't see the actual > migrate command you used down there. > > I haven't seen this actually being a problem so far, as the receiver > just close()s its file descriptor once it hits VM_EOF. This should only > break senders if they expect they can send more. That said, I think I > only tested offline migration (via exec:), so maybe QEMU is behaving > badly and actually wants to send all data and just fails the migration > without? Hmm, for such an odd change to the migration stream it's a surprise you didn't test it live. The only obvious thing to me of what could go wrong would be that if the destination closed it's migration fd when it received what it thought was a terminator then the source could get upset at it's failure to send the last few kB with the JSON in it. Dave > > > Alex -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 9:04 ` Dr. David Alan Gilbert @ 2015-02-13 11:18 ` Alexander Graf 2015-02-13 11:23 ` Dr. David Alan Gilbert 2015-02-13 11:23 ` Lucas Meneghel Rodrigues 0 siblings, 2 replies; 17+ messages in thread From: Alexander Graf @ 2015-02-13 11:18 UTC (permalink / raw) To: Dr. David Alan Gilbert Cc: amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela On 13.02.15 10:04, Dr. David Alan Gilbert wrote: > * Alexander Graf (agraf@suse.de) wrote: >> >> >> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: >>> Copying Alex. >>> >>> OK, after bisecting, this is what I've got: >>> >>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit >>> commit 8118f0950fc77cce7873002a5021172dd6e040b5 >>> Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> >>> Date: Thu Jan 22 15:01:39 2015 +0100 >>> >>> migration: Append JSON description of migration stream >>> >>> One of the annoyances of the current migration format is the fact that >>> it's not self-describing. In fact, it's not properly describing at all. >>> Some code randomly scattered throughout QEMU elaborates roughly how to >>> read and write a stream of bytes. >>> >>> We discussed an idea during KVM Forum 2013 to add a JSON description of >>> the migration protocol itself to the migration stream. This patch >>> adds a section after the VM_END migration end marker that contains >>> description data on what the device sections of the stream are >>> composed of. >>> >>> This approach is backwards compatible with any QEMU version reading the >>> stream, because QEMU just stops reading after the VM_END marker and >>> ignores >>> any data following it. >>> >>> With an additional external program this allows us to decipher the >>> contents of any migration stream and hopefully make migration bugs >>> easier >>> to track down. 
>>> >>> Signed-off-by: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> >>> Signed-off-by: Amit Shah <amit.shah@redhat.com >>> <mailto:amit.shah@redhat.com>> >>> Signed-off-by: Juan Quintela <quintela@redhat.com >>> <mailto:quintela@redhat.com>> >>> >>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 >>> 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw >>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 >>> 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude >>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 >>> c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration >>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d >>> 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c >>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa >>> 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests >>> >>> So there you go. I'm going to sleep, if you need any extra help let me know. >> >> So the major difference with this patch applied is that the sender could >> send more data than the receive wants to read. I can't see the actual >> migrate command you used down there. >> >> I haven't seen this actually being a problem so far, as the receiver >> just close()s its file descriptor once it hits VM_EOF. This should only >> break senders if they expect they can send more. That said, I think I >> only tested offline migration (via exec:), so maybe QEMU is behaving >> badly and actually wants to send all data and just fails the migration >> without? > > Hmm, for such an odd change to the migration stream it's a surprise you > didn't test it live. Well, let's say I don't remember explicitly testing it live - I probably did at one point. I just verified that migrating with tcp:... works fine in master. Alex ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 11:18 ` Alexander Graf @ 2015-02-13 11:23 ` Dr. David Alan Gilbert 2015-02-13 11:23 ` Lucas Meneghel Rodrigues 1 sibling, 0 replies; 17+ messages in thread From: Dr. David Alan Gilbert @ 2015-02-13 11:23 UTC (permalink / raw) To: Alexander Graf Cc: amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela * Alexander Graf (agraf@suse.de) wrote: > > > On 13.02.15 10:04, Dr. David Alan Gilbert wrote: > > * Alexander Graf (agraf@suse.de) wrote: > >> > >> > >> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: > >>> Copying Alex. > >>> > >>> OK, after bisecting, this is what I've got: > >>> > >>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit > >>> commit 8118f0950fc77cce7873002a5021172dd6e040b5 > >>> Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > >>> Date: Thu Jan 22 15:01:39 2015 +0100 > >>> > >>> migration: Append JSON description of migration stream > >>> > >>> One of the annoyances of the current migration format is the fact that > >>> it's not self-describing. In fact, it's not properly describing at all. > >>> Some code randomly scattered throughout QEMU elaborates roughly how to > >>> read and write a stream of bytes. > >>> > >>> We discussed an idea during KVM Forum 2013 to add a JSON description of > >>> the migration protocol itself to the migration stream. This patch > >>> adds a section after the VM_END migration end marker that contains > >>> description data on what the device sections of the stream are > >>> composed of. > >>> > >>> This approach is backwards compatible with any QEMU version reading the > >>> stream, because QEMU just stops reading after the VM_END marker and > >>> ignores > >>> any data following it. > >>> > >>> With an additional external program this allows us to decipher the > >>> contents of any migration stream and hopefully make migration bugs > >>> easier > >>> to track down. 
> >>> > >>> Signed-off-by: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> > >>> Signed-off-by: Amit Shah <amit.shah@redhat.com > >>> <mailto:amit.shah@redhat.com>> > >>> Signed-off-by: Juan Quintela <quintela@redhat.com > >>> <mailto:quintela@redhat.com>> > >>> > >>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 > >>> 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw > >>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 > >>> 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude > >>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 > >>> c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration > >>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d > >>> 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c > >>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa > >>> 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests > >>> > >>> So there you go. I'm going to sleep, if you need any extra help let me know. > >> > >> So the major difference with this patch applied is that the sender could > >> send more data than the receive wants to read. I can't see the actual > >> migrate command you used down there. > >> > >> I haven't seen this actually being a problem so far, as the receiver > >> just close()s its file descriptor once it hits VM_EOF. This should only > >> break senders if they expect they can send more. That said, I think I > >> only tested offline migration (via exec:), so maybe QEMU is behaving > >> badly and actually wants to send all data and just fails the migration > >> without? > > > > Hmm, for such an odd change to the migration stream it's a surprise you > > didn't test it live. > > Well, let's say I don't remember explicitly testing it live - I probably > did at one point. > > I just verified that migrating with tcp:... works fine in master. Yes, that's fair. 
My suspicion (for which I have no proof) is that it might depend on the
amount of buffering in the connection. If there's enough buffer to hold your
JSON description it'll work, because you'll have sent the JSON before the
destination has spotted the terminator. If you've not got much buffering
(e.g. on a local fd), then the source might get stuck trying to write the
JSON, or error out because the destination has closed the fd.

Dave

> Alex
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
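The hypothesis above can be illustrated outside QEMU with a small sketch. This is not QEMU code and does not prove what virt-test hit; it only shows the two outcomes described, using a local AF_UNIX socket pair as a stand-in for the migration fd. The helper name `send_trailer`, the one-byte "end marker" and the 64-byte "trailer" are all hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of QEMU. The "source" end sends an end
 * marker, the "destination" end reads it, and then the source tries to
 * send a trailer (standing in for the JSON description).
 *
 * Returns 0 if the trailer write was accepted, otherwise an errno value.
 */
static int send_trailer(int dest_closes_first)
{
    int sv[2];
    char marker = 0;     /* stand-in for the end-of-stream marker */
    char seen;
    char trailer[64];    /* stand-in for the trailing description */
    int result = 0;

    /* Report a write to a closed peer as EPIPE instead of killing us. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return errno;
    }
    memset(trailer, 'J', sizeof(trailer));

    if (write(sv[0], &marker, 1) != 1 || read(sv[1], &seen, 1) != 1) {
        result = EIO;
    } else {
        if (dest_closes_first) {
            /* Destination saw the marker and closed before the trailer. */
            close(sv[1]);
            sv[1] = -1;
        }
        if (write(sv[0], trailer, sizeof(trailer)) < 0) {
            result = errno;
        }
    }

    close(sv[0]);
    if (sv[1] >= 0) {
        close(sv[1]);
    }
    return result;
}
```

With the destination still open, the small trailer fits in the socket buffer and the write succeeds even though nobody reads it. With the destination already closed, the write fails with EPIPE, which is the kind of sender-side error the source would then have to handle.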
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 11:18 ` Alexander Graf 2015-02-13 11:23 ` Dr. David Alan Gilbert @ 2015-02-13 11:23 ` Lucas Meneghel Rodrigues 2015-02-13 23:33 ` Alexander Graf 1 sibling, 1 reply; 17+ messages in thread From: Lucas Meneghel Rodrigues @ 2015-02-13 11:23 UTC (permalink / raw) To: Alexander Graf Cc: amit.shah, quintela, Dr. David Alan Gilbert, Lucas Meneghel Rodrigues, Developers qemu-devel On Fri, Feb 13, 2015 at 9:18 AM, Alexander Graf <agraf@suse.de> wrote: > > > On 13.02.15 10:04, Dr. David Alan Gilbert wrote: >> * Alexander Graf (agraf@suse.de) wrote: >>> >>> >>> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: >>>> Copying Alex. >>>> >>>> OK, after bisecting, this is what I've got: >>>> >>>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit >>>> commit 8118f0950fc77cce7873002a5021172dd6e040b5 >>>> Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> >>>> Date: Thu Jan 22 15:01:39 2015 +0100 >>>> >>>> migration: Append JSON description of migration stream >>>> >>>> One of the annoyances of the current migration format is the >>>> fact that >>>> it's not self-describing. In fact, it's not properly >>>> describing at all. >>>> Some code randomly scattered throughout QEMU elaborates >>>> roughly how to >>>> read and write a stream of bytes. >>>> >>>> We discussed an idea during KVM Forum 2013 to add a JSON >>>> description of >>>> the migration protocol itself to the migration stream. This >>>> patch >>>> adds a section after the VM_END migration end marker that >>>> contains >>>> description data on what the device sections of the stream are >>>> composed of. >>>> >>>> This approach is backwards compatible with any QEMU version >>>> reading the >>>> stream, because QEMU just stops reading after the VM_END >>>> marker and >>>> ignores >>>> any data following it. 
>>>> >>>> With an additional external program this allows us to >>>> decipher the >>>> contents of any migration stream and hopefully make migration >>>> bugs >>>> easier >>>> to track down. >>>> >>>> Signed-off-by: Alexander Graf <agraf@suse.de >>>> <mailto:agraf@suse.de>> >>>> Signed-off-by: Amit Shah <amit.shah@redhat.com >>>> <mailto:amit.shah@redhat.com>> >>>> Signed-off-by: Juan Quintela <quintela@redhat.com >>>> <mailto:quintela@redhat.com>> >>>> >>>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 >>>> 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw >>>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 >>>> 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude >>>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 >>>> c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration >>>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d >>>> 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c >>>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa >>>> 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests >>>> >>>> So there you go. I'm going to sleep, if you need any extra help >>>> let me know. >>> >>> So the major difference with this patch applied is that the sender >>> could >>> send more data than the receive wants to read. I can't see the >>> actual >>> migrate command you used down there. >>> >>> I haven't seen this actually being a problem so far, as the >>> receiver >>> just close()s its file descriptor once it hits VM_EOF. This should >>> only >>> break senders if they expect they can send more. That said, I >>> think I >>> only tested offline migration (via exec:), so maybe QEMU is >>> behaving >>> badly and actually wants to send all data and just fails the >>> migration >>> without? >> >> Hmm, for such an odd change to the migration stream it's a surprise >> you >> didn't test it live. > > Well, let's say I don't remember explicitly testing it live - I > probably > did at one point. > > I just verified that migrating with tcp:... 
works fine in master.

It is indeed working fine with tcp migration in master. The thing is,
virt-test tests a bunch of variants, among them fd. fd is the only one
failing from the list of things we test (which also happens to be the
virt-test default test set).

> Alex
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 11:23 ` Lucas Meneghel Rodrigues @ 2015-02-13 23:33 ` Alexander Graf 2015-02-16 18:57 ` Dr. David Alan Gilbert 0 siblings, 1 reply; 17+ messages in thread From: Alexander Graf @ 2015-02-13 23:33 UTC (permalink / raw) To: Lucas Meneghel Rodrigues Cc: amit.shah, quintela, Dr. David Alan Gilbert, Lucas Meneghel Rodrigues, Developers qemu-devel On 13.02.15 12:23, Lucas Meneghel Rodrigues wrote: > > > On Fri, Feb 13, 2015 at 9:18 AM, Alexander Graf <agraf@suse.de> wrote: >> >> >> On 13.02.15 10:04, Dr. David Alan Gilbert wrote: >>> * Alexander Graf (agraf@suse.de) wrote: >>>> >>>> >>>> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote: >>>>> Copying Alex. >>>>> >>>>> OK, after bisecting, this is what I've got: >>>>> >>>>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit >>>>> commit 8118f0950fc77cce7873002a5021172dd6e040b5 >>>>> Author: Alexander Graf <agraf@suse.de <mailto:agraf@suse.de>> >>>>> Date: Thu Jan 22 15:01:39 2015 +0100 >>>>> >>>>> migration: Append JSON description of migration stream >>>>> >>>>> One of the annoyances of the current migration format is the >>>>> fact that >>>>> it's not self-describing. In fact, it's not properly >>>>> describing at all. >>>>> Some code randomly scattered throughout QEMU elaborates >>>>> roughly how to >>>>> read and write a stream of bytes. >>>>> >>>>> We discussed an idea during KVM Forum 2013 to add a JSON >>>>> description of >>>>> the migration protocol itself to the migration stream. This patch >>>>> adds a section after the VM_END migration end marker that >>>>> contains >>>>> description data on what the device sections of the stream are >>>>> composed of. >>>>> >>>>> This approach is backwards compatible with any QEMU version >>>>> reading the >>>>> stream, because QEMU just stops reading after the VM_END >>>>> marker and >>>>> ignores >>>>> any data following it. 
>>>>> >>>>> With an additional external program this allows us to decipher >>>>> the >>>>> contents of any migration stream and hopefully make migration >>>>> bugs >>>>> easier >>>>> to track down. >>>>> >>>>> Signed-off-by: Alexander Graf <agraf@suse.de >>>>> <mailto:agraf@suse.de>> >>>>> Signed-off-by: Amit Shah <amit.shah@redhat.com >>>>> <mailto:amit.shah@redhat.com>> >>>>> Signed-off-by: Juan Quintela <quintela@redhat.com >>>>> <mailto:quintela@redhat.com>> >>>>> >>>>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 >>>>> 61df81f831bc86b29f65883523ea95abb36f1ec5 Mhw >>>>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 >>>>> 7092a6b6515a3d0077f68ff2d80dbd74597a244f Minclude >>>>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 >>>>> c2b1dcda197d96657458d699c185e39ae45f3c6c Mmigration >>>>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d >>>>> 80407662ad3ed860d33a9d35f5c44b1d19c4612b Msavevm.c >>>>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa >>>>> 7aaf3d10ef7f73413b228e854fe6f04317151e46 Mtests >>>>> >>>>> So there you go. I'm going to sleep, if you need any extra help >>>>> let me know. >>>> >>>> So the major difference with this patch applied is that the sender >>>> could >>>> send more data than the receive wants to read. I can't see the actual >>>> migrate command you used down there. >>>> >>>> I haven't seen this actually being a problem so far, as the receiver >>>> just close()s its file descriptor once it hits VM_EOF. This should >>>> only >>>> break senders if they expect they can send more. That said, I think I >>>> only tested offline migration (via exec:), so maybe QEMU is behaving >>>> badly and actually wants to send all data and just fails the migration >>>> without? >>> >>> Hmm, for such an odd change to the migration stream it's a surprise you >>> didn't test it live. >> >> Well, let's say I don't remember explicitly testing it live - I probably >> did at one point. 
>> >> I just verified that migrating with tcp:... works fine in master. > > It is working fine with tcp migration in master indeed. The thing is, > virt-test tests a bunch of variants, among them fd. fd is the only one > failing from the list of things we do test (which also happen to be the > virt-test default test set). Can you please test whether the patch below makes things work for you again? Alex >From ef6fde21007e62529799264f57a65c6bb3d0d414 Mon Sep 17 00:00:00 2001 From: Alexander Graf <agraf@suse.de> Date: Sat, 14 Feb 2015 00:21:01 +0100 Subject: [PATCH] migration: Read JSON VM description on incoming migration One of the really nice things about the VM description format is that it goes over the wire when live migration is happening. Unfortunately QEMU today closes any socket once it sees VM_EOF coming, so we never give the VMDESC the chance to actually land on the wire. This patch makes QEMU read the description as well. This way we ensure that anything wire tapping us in between will get the chance to also interpret the stream. Along the way we also fix virt tests that assume that number_bytes_sent on the sender side is equal to number_bytes_read which was true before the VMDESC patches and is true again with this patch. Signed-off-by: Alexander Graf <agraf@suse.de> diff --git a/savevm.c b/savevm.c index 8040766..ff4bead 100644 --- a/savevm.c +++ b/savevm.c @@ -929,6 +929,7 @@ int qemu_loadvm_state(QEMUFile *f) uint8_t section_type; unsigned int v; int ret; + int file_error_after_eof = -1; if (qemu_savevm_state_blocked(&local_err)) { error_report("%s", error_get_pretty(local_err)); @@ -1034,6 +1035,22 @@ int qemu_loadvm_state(QEMUFile *f) } } + file_error_after_eof = qemu_file_get_error(f); + + /* + * Try to read in the VMDESC section as well, so that dumping tools that + * intercept our migration stream have the chance to see it. 
+ */ + if (qemu_get_byte(f) == QEMU_VM_VMDESCRIPTION) { + uint32_t size = qemu_get_be32(f); + uint8_t *buf = g_malloc(size); + + if (buf) { + qemu_get_buffer(f, buf, size); + g_free(buf); + } + } + cpu_synchronize_all_post_init(); ret = 0; @@ -1045,7 +1062,8 @@ out: } if (ret == 0) { - ret = qemu_file_get_error(f); + /* We may not have a VMDESC section, so ignore relative errors */ + ret = file_error_after_eof; } return ret; ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-13 23:33 ` Alexander Graf @ 2015-02-16 18:57 ` Dr. David Alan Gilbert 2015-02-16 20:24 ` Alexander Graf 0 siblings, 1 reply; 17+ messages in thread From: Dr. David Alan Gilbert @ 2015-02-16 18:57 UTC (permalink / raw) To: Alexander Graf Cc: Lucas Meneghel Rodrigues, amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela * Alexander Graf (agraf@suse.de) wrote: <snip> > Can you please test whether the patch below makes things work for you again? The patch below fixes RDMA migration (same host); however, see comments. > Alex > > From ef6fde21007e62529799264f57a65c6bb3d0d414 Mon Sep 17 00:00:00 2001 > From: Alexander Graf <agraf@suse.de> > Date: Sat, 14 Feb 2015 00:21:01 +0100 > Subject: [PATCH] migration: Read JSON VM description on incoming migration > > One of the really nice things about the VM description format is that it > goes > over the wire when live migration is happening. Unfortunately QEMU today > closes > any socket once it sees VM_EOF coming, so we never give the VMDESC the > chance to > actually land on the wire. > > This patch makes QEMU read the description as well. This way we ensure that > anything wire tapping us in between will get the chance to also > interpret the > stream. > > Along the way we also fix virt tests that assume that number_bytes_sent > on the > sender side is equal to number_bytes_read which was true before the VMDESC > patches and is true again with this patch. 
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>
> diff --git a/savevm.c b/savevm.c
> index 8040766..ff4bead 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -929,6 +929,7 @@ int qemu_loadvm_state(QEMUFile *f)
>      uint8_t section_type;
>      unsigned int v;
>      int ret;
> +    int file_error_after_eof = -1;
>
>      if (qemu_savevm_state_blocked(&local_err)) {
>          error_report("%s", error_get_pretty(local_err));
> @@ -1034,6 +1035,22 @@ int qemu_loadvm_state(QEMUFile *f)
>          }
>      }
>
> +    file_error_after_eof = qemu_file_get_error(f);
> +
> +    /*
> +     * Try to read in the VMDESC section as well, so that dumping tools that
> +     * intercept our migration stream have the chance to see it.
> +     */
> +    if (qemu_get_byte(f) == QEMU_VM_VMDESCRIPTION) {

You could use qemu_peek_byte for that?

> +        uint32_t size = qemu_get_be32(f);
> +        uint8_t *buf = g_malloc(size);
> +
> +        if (buf) {
> +            qemu_get_buffer(f, buf, size);
> +            g_free(buf);
> +        }

This is slightly dangerous; a malformed file could send you a huge
value and get you to allocate lots of memory for no good reason.

You could do something clever; but personally I'd just loop around a
nice small buffer until it's gone.

As mentioned on IRC; I'm still worried though that this is only
a fix for loading on newer versions; migration to an older QEMU
with the same machine type would fail.
(Yes I know mythically that no one cares about this; but I do).

Dave

> +    }
> +
>      cpu_synchronize_all_post_init();
>
>      ret = 0;
> @@ -1045,7 +1062,8 @@ out:
>      }
>
>      if (ret == 0) {
> -        ret = qemu_file_get_error(f);
> +        /* We may not have a VMDESC section, so ignore relative errors */
> +        ret = file_error_after_eof;
>      }
>
>      return ret;
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
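The "loop around a nice small buffer" suggestion in the review above can be sketched as follows. This is a standalone illustration using stdio, not the QEMUFile API and not the follow-up patch; the function names `read_be32`, `skip_section` and `discard_description` and the 256-byte chunk size are assumptions made for the example. The point is that the length field taken from the stream never controls an allocation.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read a 32-bit big-endian length from the stream. */
static int read_be32(FILE *in, uint32_t *out)
{
    uint8_t b[4];

    if (fread(b, 1, sizeof(b), in) != sizeof(b)) {
        return -1;
    }
    *out = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    return 0;
}

/*
 * Hypothetical helper: discard `size` bytes through a small fixed buffer,
 * so an untrusted length cannot force a large allocation.
 * Returns 0 on success, -1 if the stream ends before `size` bytes.
 */
static int skip_section(FILE *in, uint32_t size)
{
    uint8_t buf[256];

    while (size > 0) {
        size_t want = size < sizeof(buf) ? size : sizeof(buf);
        size_t got = fread(buf, 1, want, in);

        if (got == 0) {
            return -1;   /* truncated or malformed stream */
        }
        size -= (uint32_t)got;
    }
    return 0;
}

/* Read the length prefix, then throw the section body away. */
static int discard_description(FILE *in)
{
    uint32_t size;

    if (read_be32(in, &size) < 0) {
        return -1;
    }
    return skip_section(in, size);
}
```

A stream that claims a 4 GiB section but carries only a few bytes simply makes `discard_description` return -1 after one short read, with memory use bounded by the chunk size.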
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests 2015-02-16 18:57 ` Dr. David Alan Gilbert @ 2015-02-16 20:24 ` Alexander Graf 2015-02-16 21:06 ` Paolo Bonzini 0 siblings, 1 reply; 17+ messages in thread From: Alexander Graf @ 2015-02-16 20:24 UTC (permalink / raw) To: Dr. David Alan Gilbert Cc: Lucas Meneghel Rodrigues, amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela On 16.02.15 19:57, Dr. David Alan Gilbert wrote: > * Alexander Graf (agraf@suse.de) wrote: > > <snip> > >> Can you please test whether the patch below makes things work for you again? > > The patch below fixes RDMA migration (same host); however, see comments. > >> Alex >> >> From ef6fde21007e62529799264f57a65c6bb3d0d414 Mon Sep 17 00:00:00 2001 >> From: Alexander Graf <agraf@suse.de> >> Date: Sat, 14 Feb 2015 00:21:01 +0100 >> Subject: [PATCH] migration: Read JSON VM description on incoming migration >> >> One of the really nice things about the VM description format is that it >> goes >> over the wire when live migration is happening. Unfortunately QEMU today >> closes >> any socket once it sees VM_EOF coming, so we never give the VMDESC the >> chance to >> actually land on the wire. >> >> This patch makes QEMU read the description as well. This way we ensure that >> anything wire tapping us in between will get the chance to also >> interpret the >> stream. >> >> Along the way we also fix virt tests that assume that number_bytes_sent >> on the >> sender side is equal to number_bytes_read which was true before the VMDESC >> patches and is true again with this patch. 
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>
>> diff --git a/savevm.c b/savevm.c
>> index 8040766..ff4bead 100644
>> --- a/savevm.c
>> +++ b/savevm.c
>> @@ -929,6 +929,7 @@ int qemu_loadvm_state(QEMUFile *f)
>>      uint8_t section_type;
>>      unsigned int v;
>>      int ret;
>> +    int file_error_after_eof = -1;
>>
>>      if (qemu_savevm_state_blocked(&local_err)) {
>>          error_report("%s", error_get_pretty(local_err));
>> @@ -1034,6 +1035,22 @@ int qemu_loadvm_state(QEMUFile *f)
>>          }
>>      }
>>
>> +    file_error_after_eof = qemu_file_get_error(f);
>> +
>> +    /*
>> +     * Try to read in the VMDESC section as well, so that dumping tools that
>> +     * intercept our migration stream have the chance to see it.
>> +     */
>> +    if (qemu_get_byte(f) == QEMU_VM_VMDESCRIPTION) {
>
> You could use qemu_peek_byte for that?

It's what I had originally, but qemu_peek_byte() at the end of the day is
the exact same as qemu_get_byte(), except that it doesn't increment the
internal buffer counter. So any error conditions that occur because the
read failed still happen with peek_byte and are a lot less intuitive.

>
>> +        uint32_t size = qemu_get_be32(f);
>> +        uint8_t *buf = g_malloc(size);
>> +
>> +        if (buf) {
>> +            qemu_get_buffer(f, buf, size);
>> +            g_free(buf);
>> +        }
>
> This is slightly dangerous; a malformed file could send you a huge
> value and get you to allocate lots of memory for no good reason.
>
> You could do something clever; but personally I'd just loop around a
> nice small buffer until it's gone.

Good idea. Will change.

> As mentioned on IRC; I'm still worried though that this is only
> a fix for loading on newer versions; migration to an older QEMU
> with the same machine type would fail.
> (Yes I know mythically that no one cares about this; but I do).

Yeah, I guess I'll follow up with a fix to disable VMDESC submission on
older versions, just to be on the safe side.

Alex
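[Editorial sketch] The "loop around a nice small buffer" suggestion could look roughly like the following. This is not the follow-up patch from the thread; it is a minimal standalone illustration that uses a plain stdio FILE * in place of QEMUFile. The function name drain_section and the constant DRAIN_CHUNK are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical chunk size for this sketch; any small constant works. */
#define DRAIN_CHUNK 4096

/*
 * Read and discard `size` bytes from `f` through a fixed stack buffer,
 * so memory use stays bounded no matter what length the stream claims.
 * Returns 0 if all bytes were consumed, -1 if the stream ended early.
 */
static int drain_section(FILE *f, uint32_t size)
{
    uint8_t buf[DRAIN_CHUNK];

    assert(f != NULL);
    while (size > 0) {
        size_t want = size < sizeof(buf) ? size : sizeof(buf);
        size_t got = fread(buf, 1, want, f);

        if (got != want) {
            return -1; /* short read: announced length was not honoured */
        }
        size -= (uint32_t)got;
    }
    return 0;
}
```

With this shape, a malformed stream announcing a 4 GiB section costs at most DRAIN_CHUNK bytes of stack and fails as soon as the data runs out, instead of triggering one huge allocation up front.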
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Paolo Bonzini @ 2015-02-16 21:06 UTC
To: Alexander Graf, Dr. David Alan Gilbert
Cc: Lucas Meneghel Rodrigues, amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela

On 16/02/2015 21:24, Alexander Graf wrote:
>> As mentioned on IRC; I'm still worried though that this is only
>> a fix for loading on newer versions; migration to an older QEMU
>> with the same machine type would fail.
>> (Yes I know mythically that no one cares about this; but I do).
> Yeah, I guess I'll follow up with a fix to disable VMDESC submission on
> older versions, just to be on the safe side.

Can you make it a capability?

Paolo
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Alexander Graf @ 2015-02-16 21:08 UTC
To: Paolo Bonzini, Dr. David Alan Gilbert
Cc: Lucas Meneghel Rodrigues, amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela

On 16.02.15 22:06, Paolo Bonzini wrote:
>
>
> On 16/02/2015 21:24, Alexander Graf wrote:
>>> As mentioned on IRC; I'm still worried though that this is only
>>> a fix for loading on newer versions; migration to an older QEMU
>>> with the same machine type would fail.
>>> (Yes I know mythically that no one cares about this; but I do).
>> Yeah, I guess I'll follow up with a fix to disable VMDESC submission on
>> older versions, just to be on the safe side.
>
> Can you make it a capability?

When did live migration start to have capability negotiation? :)

Alex
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Paolo Bonzini @ 2015-02-16 21:38 UTC
To: Alexander Graf, Dr. David Alan Gilbert
Cc: Lucas Meneghel Rodrigues, amit.shah, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela

On 16/02/2015 22:08, Alexander Graf wrote:
> > Can you make it a capability?
> When did live migration start to have capability negotiation? :)

Only capability without negotiation. :)

Negotiation is done above QEMU.

Paolo
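[Editorial sketch] "Capability without negotiation" can be pictured as a plain sender-side switch: whatever manages the migration decides whether the destination can cope with the trailing description and sets the flag accordingly. The standalone sketch below is not QEMU code. The function write_trailer, the send_vmdesc flag and the two marker values are assumptions made for illustration; only the section layout (marker byte, big-endian 32-bit length, payload) mirrors what the reader in the quoted patch expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Marker values are assumptions for this sketch, not taken from the thread. */
#define SKETCH_VM_EOF           0x00
#define SKETCH_VM_VMDESCRIPTION 0x06

/*
 * Write the end-of-stream marker into `out`, followed by the description
 * section (marker byte, big-endian 32-bit length, JSON payload) only when
 * `send_vmdesc` is set. Returns the number of bytes written, or 0 if the
 * output buffer is too small or the payload length does not fit in 32 bits.
 */
static size_t write_trailer(uint8_t *out, size_t cap,
                            bool send_vmdesc, const char *json)
{
    size_t len = send_vmdesc ? strlen(json) : 0;
    size_t need = 1 + (send_vmdesc ? 1 + 4 + len : 0);
    size_t n = 0;

    assert(out != NULL);
    if (cap < need || len > UINT32_MAX) {
        return 0;
    }

    out[n++] = SKETCH_VM_EOF;
    if (send_vmdesc) {
        out[n++] = SKETCH_VM_VMDESCRIPTION;
        out[n++] = (uint8_t)(len >> 24);
        out[n++] = (uint8_t)(len >> 16);
        out[n++] = (uint8_t)(len >> 8);
        out[n++] = (uint8_t)len;
        memcpy(out + n, json, len);
        n += len;
    }
    return n;
}
```

With the flag off, the stream ends at the end marker exactly as it did before the VMDESC commit, which is the behaviour an older receiver expects.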
* Re: [Qemu-devel] HEAD is failing virt-test on migration tests
From: Lucas Meneghel Rodrigues @ 2015-02-13 11:09 UTC
To: Alexander Graf
Cc: amit.shah, Dave Gilbert, Developers qemu-devel, Lucas Meneghel Rodrigues, quintela

Alex, Dave:

Virt-Test fd migration starts by sending a fd to the source VM:

22:20:40 DEBUG| Send file descriptor migfd_28_1423786840 to source VM.
22:20:40 DEBUG| (monitor hmp1) Sending command 'getfd migfd_28_1423786840'

later on...

22:20:42 INFO | Migrating to fd:migfd_28_1423786840
22:20:42 DEBUG| (monitor hmp1) Sending command 'migrate -d fd:migfd_28_1423786840'
22:20:42 DEBUG| Send command: migrate -d fd:migfd_28_1423786840

Attached to this message you can find a .tar.bz2 file (~36Kb) with
virt-test results. It contains extra information, such as a record of
VM registers taken periodically during the testing process.

Cheers,

Lucas

On Thu, Feb 12, 2015 at 10:36 PM, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 13.02.15 01:29, Lucas Meneghel Rodrigues wrote:
>> Copying Alex.
>>
>> OK, after bisecting, this is what I've got:
>>
>> 8118f0950fc77cce7873002a5021172dd6e040b5 is the first bad commit
>> commit 8118f0950fc77cce7873002a5021172dd6e040b5
>> Author: Alexander Graf <agraf@suse.de>
>> Date:   Thu Jan 22 15:01:39 2015 +0100
>>
>>     migration: Append JSON description of migration stream
>>
>>     One of the annoyances of the current migration format is the fact that
>>     it's not self-describing. In fact, it's not properly describing at all.
>>     Some code randomly scattered throughout QEMU elaborates roughly how to
>>     read and write a stream of bytes.
>>
>>     We discussed an idea during KVM Forum 2013 to add a JSON description of
>>     the migration protocol itself to the migration stream. This patch
>>     adds a section after the VM_END migration end marker that contains
>>     description data on what the device sections of the stream are
>>     composed of.
>>
>>     This approach is backwards compatible with any QEMU version reading the
>>     stream, because QEMU just stops reading after the VM_END marker and
>>     ignores any data following it.
>>
>>     With an additional external program this allows us to decipher the
>>     contents of any migration stream and hopefully make migration bugs
>>     easier to track down.
>>
>>     Signed-off-by: Alexander Graf <agraf@suse.de>
>>     Signed-off-by: Amit Shah <amit.shah@redhat.com>
>>     Signed-off-by: Juan Quintela <quintela@redhat.com>
>>
>> :040000 040000 e9a8888ac242a61fbd05bbb0daa3e8877970e738 61df81f831bc86b29f65883523ea95abb36f1ec5 M hw
>> :040000 040000 fe0659bed17d86c43657c26622d64fd44a1af037 7092a6b6515a3d0077f68ff2d80dbd74597a244f M include
>> :040000 040000 d90d6f1fe839abf21a45eaba5829d5a6a22abeb1 c2b1dcda197d96657458d699c185e39ae45f3c6c M migration
>> :100644 100644 98895fee81edfbc659fc42d467e930d06b1afa7d 80407662ad3ed860d33a9d35f5c44b1d19c4612b M savevm.c
>> :040000 040000 cf218bc2b841cd51ebe3972635be2cfbb1de9dfa 7aaf3d10ef7f73413b228e854fe6f04317151e46 M tests
>>
>> So there you go. I'm going to sleep, if you need any extra help let
>> me know.
>
> So the major difference with this patch applied is that the sender could
> send more data than the receiver wants to read. I can't see the actual
> migrate command you used down there.
>
> I haven't seen this actually being a problem so far, as the receiver
> just close()s its file descriptor once it hits VM_EOF. This should only
> break senders if they expect they can send more.
> That said, I think I only tested offline migration (via exec:), so maybe
> QEMU is behaving badly and actually wants to send all data and just fails
> the migration without?
>
>
> Alex
>

[-- Attachment #2: run-2015-02-12-22.20.21.tar.bz2 --]
[-- Type: application/x-bzip-compressed-tar, Size: 36108 bytes --]