Message-ID: <53F59771.2050602@redhat.com>
Date: Thu, 21 Aug 2014 14:53:37 +0800
From: Jason Wang
References: <1408424189-10510-1-git-send-email-jasowang@redhat.com>
 <53F46909.2040101@huawei.com> <53F575AF.2010100@redhat.com>
 <53F59174.7010104@huawei.com>
In-Reply-To: <53F59174.7010104@huawei.com>
Subject: Re: [Qemu-devel] [PATCH V3] vhost_net: start/stop guest notifiers properly
To: "Zhangjie (HZ)", qemu-devel@nongnu.org, mst@redhat.com
Cc: William Dauchy

On 08/21/2014 02:28 PM, Zhangjie (HZ) wrote:
> On 2014/8/21 12:29, Jason Wang wrote:
>> On 08/20/2014 05:23 PM, Zhangjie (HZ) wrote:
>>> On 2014/8/19 12:56, Jason Wang wrote:
>>>> commit a9f98bb5ebe6fb1869321dcc58e72041ae626ad8 vhost: multiqueue
>>> call it before setting
>>>> Zhang Jie, please test this patch to see if it fixes the issue.
>>>> +static void vhost_net_set_vq_index(struct vhost_net *net, int vq_index)
>>>> +{
>>>> +    net->dev.vq_index = vq_index;
>>>> +}
>>> int vq_index)
>>>> ...
>>> Because of vhost_net_set_vq_index, the VM can be started successfully.
>>> But after about 80 migrations in my environment, the virtual nic became unreachable again.
>>> When I use jprobe to notify tap, the virtual nic becomes reachable again.
>>> This shows that missing interrupts cause the problem.
>> Thanks for the testing. A question is: can you reproduce this when vhost
>> is disabled?
> After migration, vhost is not disabled; the virtual nic became unreachable because vhost is not awakened.
> By the logic of EVENT_IDX, virtio-net will not kick vhost again if the used idx is not updated.
> So, if one interrupt is lost during migration, virtio_net will not kick vhost again.
> Then, no skb from virtio-net can be sent to tap.

Yes, and I meant testing with vhost=off to see whether this is a vhost issue.

>
> Jason's patch reduced the probability of occurrence, from about 1/20 to 1/80. It is really effective. I think the patch should be acked.
> Maybe we can try to solve the problem from another perspective. Do you have some method to sense the migration?
> We can make up a signal from virtio-net after the migration.

You can make a patch like this to debug. If the problem disappears, it means the interrupt was really lost anyway.

>
>> Anyway, I will try to reproduce it by myself.
>>
> The test environment is really terrible. I built an environment myself, but the problem did not occur.
> The environment I use now is from a colleague responsible for test work.
> Two hosts, every host has about 20 VMs; they send packets (ipv4 and ipv6) between each other.
> The VM to be migrated also sends packets itself, and there is a ping (-i 0.001) from another host to it.
> The physical nic is 1GE, connected through an internal network.

Just want to confirm: when you say the problem did not occur, do you mean with my patch on top?
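
For reference, the EVENT_IDX argument quoted above (one lost notification and virtio-net never kicks vhost again) can be illustrated with a small standalone C sketch. It is only a model, not code from the patch: need_event() repeats the comparison done by vring_need_event() in the Linux virtio ring header, while count_kicks() is a hypothetical helper that assumes the driver adds one buffer at a time and that vhost never advances its event index because it was never woken up.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same comparison as vring_need_event(): a notification is needed only
 * if event_idx lies in [old_idx, new_idx - 1], computed modulo 2^16.
 */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/*
 * Hypothetical helper (not part of QEMU or the kernel): count how many
 * kicks the driver would send while adding n buffers one at a time,
 * assuming the other side is stuck and its event index stays at
 * event_idx the whole time.
 */
static unsigned count_kicks(uint16_t event_idx, uint16_t start_idx, unsigned n)
{
    unsigned kicks = 0;
    uint16_t old_idx = start_idx;

    for (unsigned i = 0; i < n; i++) {
        uint16_t new_idx = (uint16_t)(old_idx + 1);

        if (need_event(event_idx, new_idx, old_idx))
            kicks++;            /* the index just passed event_idx */
        old_idx = new_idx;
    }
    return kicks;
}
```

With the event index at 5 and the ring index starting at 5, count_kicks(5, 5, 100) reports a single kick for the first buffer. If that one notification is lost, count_kicks(5, 6, 100) reports none for the next 100 buffers, which matches the stall described in the thread and explains why notifying tap by hand gets traffic moving again.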