Date: Thu, 26 Dec 2013 12:49:10 +0200
From: "Michael S. Tsirkin"
To: Alexey Kardashevskiy
Cc: "qemu-devel@nongnu.org"
Subject: Re: [Qemu-devel] vhost-net issue: does not survive reboot on ppc64
Message-ID: <20131226104910.GA22584@redhat.com>
In-Reply-To: <52BC014B.1090909@ozlabs.ru>
References: <52B6FB41.4050806@ozlabs.ru> <52B6FEB9.8010407@ozlabs.ru>
 <20131223162426.GA1491@redhat.com> <52B8FAD3.4010606@ozlabs.ru>
 <20131224094047.GB2848@redhat.com> <52B99701.2050709@ozlabs.ru>
 <20131224154331.GA14229@redhat.com> <52BA368C.8050205@ozlabs.ru>
 <20131225095243.GA17685@redhat.com> <52BC014B.1090909@ozlabs.ru>

On Thu, Dec 26, 2013 at 09:13:31PM +1100, Alexey Kardashevskiy wrote:
> On 12/25/2013 08:52 PM, Michael S. Tsirkin wrote:
> > On Wed, Dec 25, 2013 at 12:36:12PM +1100, Alexey Kardashevskiy wrote:
> >> On 12/25/2013 02:43 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Dec 25, 2013 at 01:15:29AM +1100, Alexey Kardashevskiy wrote:
> >>>> On 12/24/2013 08:40 PM, Michael S. Tsirkin wrote:
> >>>>> On Tue, Dec 24, 2013 at 02:09:07PM +1100, Alexey Kardashevskiy wrote:
> >>>>>> On 12/24/2013 03:24 AM, Michael S. Tsirkin wrote:
> >>>>>>> On Mon, Dec 23, 2013 at 02:01:13AM +1100, Alexey Kardashevskiy wrote:
> >>>>>>>> On 12/23/2013 01:46 AM, Alexey Kardashevskiy wrote:
> >>>>>>>>> On 12/22/2013 09:56 PM, Michael S. Tsirkin wrote:
> >>>>>>>>>> On Sun, Dec 22, 2013 at 02:01:23AM +1100, Alexey Kardashevskiy wrote:
> >>>>>>>>>>> Hi!
> >>>>>>>>>>>
> >>>>>>>>>>> I am having a problem with virtio-net + vhost on POWER7 machine - it does
> >>>>>>>>>>> not survive reboot of the guest.
> >>>>>>>>>>>
> >>>>>>>>>>> Steps to reproduce:
> >>>>>>>>>>> 1. boot the guest
> >>>>>>>>>>> 2. configure eth0 and do ping - everything works
> >>>>>>>>>>> 3. reboot the guest (i.e. type "reboot")
> >>>>>>>>>>> 4. when it is booted, eth0 can be configured but will not work at all.
> >>>>>>>>>>>
> >>>>>>>>>>> The test is:
> >>>>>>>>>>> ifconfig eth0 172.20.1.2 up
> >>>>>>>>>>> ping 172.20.1.23
> >>>>>>>>>>>
> >>>>>>>>>>> If to run tcpdump on the host's "tap-id3" interface, it shows no trafic
> >>>>>>>>>>> coming from the guest. If to compare how it works before and after reboot,
> >>>>>>>>>>> I can see the guest doing an ARP request for 172.20.1.23 and receives the
> >>>>>>>>>>> response and it does the same after reboot but the answer does not come.
> >>>>>>>>>>
> >>>>>>>>>> So you see the arp packet in guest but not in host?
> >>>>>>>>>
> >>>>>>>>> Yes.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> One thing to try is to boot debug kernel - where pr_debug is
> >>>>>>>>>> enabled - then you might see some errors in the kernel log.
> >>>>>>>>>
> >>>>>>>>> Tried and added lot more debug printk myself, not clear at all what is
> >>>>>>>>> happening there.
> >>>>>>>>>
> >>>>>>>>> One more hint - if I boot the guest and the guest does not bring eth0 up
> >>>>>>>>> AND wait more than 200 seconds (and less than 210 seconds), then eth0 will
> >>>>>>>>> not work at all. I.e. this script produces not-working-eth0:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ifconfig eth0 172.20.1.2 down
> >>>>>>>>> sleep 210
> >>>>>>>>> ifconfig eth0 172.20.1.2 up
> >>>>>>>>> ping 172.20.1.23
> >>>>>>>>>
> >>>>>>>>> s/210/200/ - and it starts working. No reboot is required to reproduce.
> >>>>>>>>>
> >>>>>>>>> No "vhost" == always works. The only difference I can see here is vhost's
> >>>>>>>>> thread which may get suspended if not used for a while after the start and
> >>>>>>>>> does not wake up but this is almost a blind guess.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yet another clue - this host kernel patch seems to help with the guest
> >>>>>>>> reboot but does not help with the initial 210 seconds delay:
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> >>>>>>>> index 69068e0..5e67650 100644
> >>>>>>>> --- a/drivers/vhost/vhost.c
> >>>>>>>> +++ b/drivers/vhost/vhost.c
> >>>>>>>> @@ -162,10 +162,10 @@ void vhost_work_queue(struct vhost_dev *dev, struct
> >>>>>>>> vhost_work *work)
> >>>>>>>>          list_add_tail(&work->node, &dev->work_list);
> >>>>>>>>          work->queue_seq++;
> >>>>>>>>          spin_unlock_irqrestore(&dev->work_lock, flags);
> >>>>>>>> -        wake_up_process(dev->worker);
> >>>>>>>>      } else {
> >>>>>>>>          spin_unlock_irqrestore(&dev->work_lock, flags);
> >>>>>>>>      }
> >>>>>>>> +    wake_up_process(dev->worker);
> >>>>>>>>  }
> >>>>>>>>  EXPORT_SYMBOL_GPL(vhost_work_queue);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> Interesting. Some kind of race? A missing memory barrier somewhere?
> >>>>>>
> >>>>>> I do not see how. I boot the guest and just wait 210 seconds, nothing
> >>>>>> happens to cause races.
> >>>>>>
> >>>>>>
> >>>>>>> Since it's all around startup,
> >>>>>>> you can try kicking the host eventfd in
> >>>>>>> vhost_net_start.
> >>>>>>
> >>>>>>
> >>>>>> How exactly? This did not help. Thanks.
> >>>>>>
> >>>>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>>>>> index 006576d..407ecf2 100644
> >>>>>> --- a/hw/net/vhost_net.c
> >>>>>> +++ b/hw/net/vhost_net.c
> >>>>>> @@ -229,6 +229,17 @@ int vhost_net_start(VirtIODevice *dev, NetClientState
> >>>>>> *ncs,
> >>>>>>          if (r < 0) {
> >>>>>>              goto err;
> >>>>>>          }
> >>>>>> +
> >>>>>> +        VHostNetState *vn = tap_get_vhost_net(ncs[i].peer);
> >>>>>> +        struct vhost_vring_file file = {
> >>>>>> +            .index = i
> >>>>>> +        };
> >>>>>> +        file.fd =
> >>>>>> event_notifier_get_fd(virtio_queue_get_host_notifier(dev->vq));
> >>>>>> +        r = ioctl(vn->dev.control, VHOST_SET_VRING_KICK, &file);
> >>>>>
> >>>>> No, this sets the notifier, it does not kick.
> >>>>> To kick you write 1 there:
> >>>>> uint6_t v = 1;
> >>>>> write(fd, &v, sizeof v);
> >>>>
> >>>>
> >>>> Please, be precise. How/where do I get that @fd? Is what I do correct?
> >>>
> >>> Yes.
> >>>
> >>>> What
> >>>> is uint6_t - uint8_t or uint16_t (neither works)?
> >>>
> >>> Sorry, should have been uint64_t.
> >>
> >>
> >> Oh, that I missed :-) Anyway, this does not make any difference. Is there
> >> any cheap&dirty way to make vhost-net kernel thread always awake? Sending
> >> it signals from the user space does not work...
> >
> > You can run a timer in qemu and signal the eventfd from there
> > periodically.
> >
> > Just to restate, tcpdump in guest shows that guest sends arp packet,
> > but tcpdump in host on tun device does not show any packets?
>
>
> Ok. Figured it out about disabling interfaces in Fedora19. I was wrong,
> something is happening on the host's TAP - the guest sends ARP request, the
> response is visible on the TAP interface but not in the guest.

Okay. So the problem is on the host-to-guest path then.
Things to try:

1. trace handle_rx [vhost_net]
2. trace tun_put_user [tun]
3. I suspect some host bug in one of the features.
   Let's try to disable some flags with device properties;
   you can get the list by doing:
   ./x86_64-softmmu/qemu-system-x86_64 -device virtio-net-pci,?|grep on/off
   Things I would try turning off are the guest offloads (the ones that
   start with guest_), event_idx, any_layout and mq.
   Turn them all off; if that helps, try to find the one that helped.

-- 
MST
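
For reference, the eventfd kick discussed in the quoted exchange (write a 64-bit 1 to the fd) could look roughly like the sketch below. It is only an illustration, not QEMU code: kick_eventfd() and drain_eventfd() are hypothetical helper names, and the fd is assumed to be an eventfd such as the one returned by event_notifier_get_fd() in the quoted patch. An eventfd rejects writes shorter than 8 bytes with EINVAL, which is consistent with uint8_t and uint16_t not working.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: kick an eventfd by adding 1 to its counter.
 * The value written must be a full 8 bytes, hence uint64_t.
 * Returns 0 on success, -1 on failure (errno is left as set by write()).
 */
int kick_eventfd(int fd)
{
    uint64_t v = 1;
    ssize_t n;

    /* Retry if the write is interrupted by a signal. */
    do {
        n = write(fd, &v, sizeof v);
    } while (n < 0 && errno == EINTR);

    return n == (ssize_t)sizeof v ? 0 : -1;
}

/*
 * Hypothetical helper: read and reset the counter, as a consumer would.
 * Returns the number of kicks accumulated since the last read, or 0 if
 * the read fails (for example EAGAIN on a non-blocking fd with no kicks).
 */
uint64_t drain_eventfd(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof count) != (ssize_t)sizeof count) {
        return 0;
    }
    return count;
}
```

To try it outside QEMU, create a test fd with eventfd(0, EFD_NONBLOCK), call kick_eventfd() a few times and check that drain_eventfd() returns the number of kicks. The periodic-timer idea from the thread would amount to calling kick_eventfd() on the host notifier's fd from a timer callback.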