From: Jason Wang
Subject: Re: [Qemu-devel] Logging dirty pages from vhost-net in-kernel with vIOMMU
Date: Fri, 7 Dec 2018 20:37:39 +0800
Message-ID: <1c2e9828-467f-1120-f200-6a1d18f93c89@redhat.com>
In-Reply-To: <356e7f1d-e95a-a090-4a79-c1cf5a37d50f@redhat.com>
References: <915953bd-cc9c-9456-b619-297138f68ae6@redhat.com> <7cc73aeb-5eee-8b4f-c005-aee64dfb1bf8@redhat.com> <356e7f1d-e95a-a090-4a79-c1cf5a37d50f@redhat.com>
To: Jintack Lim
Cc: QEMU Devel Mailing List, "Michael S. Tsirkin"

On 2018/12/6 8:44 PM, Jason Wang wrote:
>
> On 2018/12/6 8:11 PM, Jintack Lim wrote:
>> On Thu, Dec 6, 2018 at 2:33 AM Jason Wang wrote:
>>>
>>> On 2018/12/5 10:47 PM, Jintack Lim wrote:
>>>> On Tue, Dec 4, 2018 at 8:30 PM Jason Wang wrote:
>>>>> On 2018/12/5 2:37 AM, Jintack Lim wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm wondering how the current implementation logs dirty pages
>>>>>> during migration from vhost-net (in kernel) when vIOMMU is used.
>>>>>>
>>>>>> I understand how vhost-net logs GPAs when not using vIOMMU. But when
>>>>>> we use vhost with vIOMMU, shouldn't vhost-net log the translated
>>>>>> address (GPA) instead of the address written in the descriptor
>>>>>> (IOVA)? The current implementation looks like vhost-net just logs
>>>>>> the IOVA without translation in vhost_get_vq_desc() in
>>>>>> drivers/vhost/net.c, and QEMU doesn't seem to do any further
>>>>>> translation of the dirty log when syncing.
>>>>>>
>>>>>> I might be missing something. Could somebody shed some light on this?
>>>>> Good catch. It looks like a bug to me. Want to post a patch for this?
>>>> Thanks for the confirmation.
>>>>
>>>> What would be a good setup to catch this kind of migration bug? I
>>>> tried to observe it in the VM, expecting to see network applications
>>>> not receiving data correctly on the destination, but I had no success
>>>> (i.e. the VM on the destination just worked fine). I didn't even see
>>>> anything go wrong when I disabled vhost logging completely without
>>>> using vIOMMU.
>>>>
>>>> What I did was run multiple network benchmarks (e.g. a netperf TCP
>>>> stream and my own tool that checks the correctness of received data)
>>>> in a VM without vhost dirty page logging, and the benchmarks ran fine
>>>> on the destination. I checked the used ring at the time the VM was
>>>> stopped on the source for migration, and it had multiple descriptors
>>>> that were (probably) not yet processed in the VM. Do you have any
>>>> insight into how it could just work, and what would be a good setup
>>>> to catch this?
>>>
>>> According to past experience, it could be reproduced by doing scp from
>>> host to guest during migration.
>>>
>> Thanks. I actually tried that, but didn't see any problem either - I
>> copied a large file from host to guest during migration (the copy
>> continued on the destination) and compared md5 hashes with md5sum, but
>> the copied file had the same checksum as the one on the host.
>>
>> Do you recall what kind of symptom you observed when the dirty pages
>> were not migrated correctly with scp?
>
>
> Yes, the point is to make the migration converge before scp finishes
> (e.g. set the migration speed to a very big value). If scp ends before
> the migration does, we won't catch the bug. And it's better to do
> several rounds of migration during the scp.
>
> Anyway, let me try to reproduce it tomorrow.
>

Looks like I can reproduce this; scp gives me the following error:

scp /home/file root@192.168.100.4:/home
file                                            63% 1301MB  58.1MB/s   00:12 ETA
Received disconnect from 192.168.100.4: 2: Packet corrupt
lost connection

FYI, I use the following command line:

numactl --cpunodebind 0 --membind 0 $qemu_path $img_path \
           -netdev tap,id=hn0,vhost=on \
           -device ioh3420,id=root.1,chassis=1 \
           -device virtio-net-pci,bus=root.1,netdev=hn0,ats=on,disable-legacy=on,disable-modern=off,iommu_platform=on \
           -device intel-iommu,device-iotlb=on \
           -M q35 -m 4G -enable-kvm -cpu host -smp 2 $@

Thanks


> Thanks
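
For readers following the thread, below is a minimal, self-contained sketch of the translation step under discussion. It is a toy model, not the actual vhost code: the names (toy_iotlb_entry, log_dirty_iova, the static table) are invented for illustration, and the real kernel instead walks its per-device IOTLB and writes the log bitmap into guest memory. The only point it shows is that, with a vIOMMU, the IOVA taken from the descriptor must be translated to a GPA before the dirty bit is set; logging the IOVA itself marks unrelated pages and the destination misses the memory that actually changed.

/*
 * Toy model of the IOVA -> GPA translation step discussed above.
 * NOT the real vhost code: all names here are invented for illustration.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

/* One cached vIOMMU mapping: a contiguous IOVA range backed by GPAs. */
struct toy_iotlb_entry {
    uint64_t iova;
    uint64_t gpa;
    uint64_t size;
};

/* Tiny static "IOTLB" standing in for the per-device translation cache. */
static const struct toy_iotlb_entry iotlb[] = {
    { .iova = 0x10000, .gpa = 0x7f000, .size = 0x4000 },
    { .iova = 0x20000, .gpa = 0x40000, .size = 0x2000 },
};

/* Dirty log: one bit per GPA page, which is what migration expects. */
static uint8_t dirty_log[64];

static void log_dirty_gpa(uint64_t gpa)
{
    uint64_t pfn = gpa >> PAGE_SHIFT;
    dirty_log[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/*
 * Log a write of len bytes at iova: translate to GPAs first, then mark
 * every GPA page the write touches.  Logging the untranslated IOVA here
 * is exactly the bug discussed in this thread.
 */
static int log_dirty_iova(uint64_t iova, uint64_t len)
{
    while (len) {
        const struct toy_iotlb_entry *e = NULL;

        for (size_t i = 0; i < sizeof(iotlb) / sizeof(iotlb[0]); i++) {
            if (iova >= iotlb[i].iova &&
                iova < iotlb[i].iova + iotlb[i].size) {
                e = &iotlb[i];
                break;
            }
        }
        if (!e)
            return -1;   /* miss: real code would ask for a translation */

        uint64_t off = iova - e->iova;
        uint64_t chunk = e->size - off;
        if (chunk > len)
            chunk = len;

        uint64_t start = e->gpa + off;
        uint64_t end = start + chunk - 1;
        for (uint64_t pfn = start >> PAGE_SHIFT; pfn <= end >> PAGE_SHIFT; pfn++)
            log_dirty_gpa(pfn << PAGE_SHIFT);

        iova += chunk;
        len -= chunk;
    }
    return 0;
}

int main(void)
{
    /* A 5000-byte write at IOVA 0x10000 lands in GPA pages 0x7f and 0x80. */
    if (log_dirty_iova(0x10000, 5000))
        return 1;
    for (size_t i = 0; i < sizeof(dirty_log); i++)
        if (dirty_log[i])
            printf("dirty_log byte %zu = 0x%02x\n", i, dirty_log[i]);
    return 0;
}

Built with a plain "gcc -std=c11", the demo marks GPA pages 0x7f and 0x80 for the 5000-byte write at IOVA 0x10000 (dirty_log byte 15 = 0x80, byte 16 = 0x01); logging the raw IOVA would instead have set bits for pages 0x10 and 0x11, which is how stale data ends up on the destination.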