From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] vhost-net issue: does not survive reboot on ppc64
Date: Tue, 24 Dec 2013 14:09:07 +1100
Message-ID: <52B8FAD3.4010606@ozlabs.ru>
In-Reply-To: <20131223162426.GA1491@redhat.com>

On 12/24/2013 03:24 AM, Michael S. Tsirkin wrote:
> On Mon, Dec 23, 2013 at 02:01:13AM +1100, Alexey Kardashevskiy wrote:
>> On 12/23/2013 01:46 AM, Alexey Kardashevskiy wrote:
>>> On 12/22/2013 09:56 PM, Michael S. Tsirkin wrote:
>>>> On Sun, Dec 22, 2013 at 02:01:23AM +1100, Alexey Kardashevskiy wrote:
>>>>> Hi!
>>>>>
>>>>> I am having a problem with virtio-net + vhost on a POWER7 machine - it does
>>>>> not survive a reboot of the guest.
>>>>>
>>>>> Steps to reproduce:
>>>>> 1. boot the guest
>>>>> 2. configure eth0 and do ping - everything works
>>>>> 3. reboot the guest (i.e. type "reboot")
>>>>> 4. when it is booted, eth0 can be configured but will not work at all.
>>>>>
>>>>> The test is:
>>>>> ifconfig eth0 172.20.1.2 up
>>>>> ping 172.20.1.23
>>>>>
>>>>> If I run tcpdump on the host's "tap-id3" interface, it shows no traffic
>>>>> coming from the guest. Comparing how it works before and after reboot:
>>>>> before reboot I can see the guest send an ARP request for 172.20.1.23 and
>>>>> receive the response; after reboot it sends the same request but the
>>>>> answer never comes.
>>>>
>>>> So you see the arp packet in guest but not in host?
>>>
>>> Yes.
>>>
>>>
>>>> One thing to try is to boot debug kernel - where pr_debug is
>>>> enabled - then you might see some errors in the kernel log.
>>>
>>> Tried that and added a lot more debug printk calls myself; it is not clear
>>> at all what is happening there.
>>>
>>> One more hint - if I boot the guest, leave eth0 down and wait long enough
>>> (the threshold is somewhere between 200 and 210 seconds), then eth0 will
>>> not work at all. I.e. this script produces a non-working eth0:
>>>
>>>
>>> ifconfig eth0 172.20.1.2 down
>>> sleep 210
>>> ifconfig eth0 172.20.1.2 up
>>> ping 172.20.1.23
>>>
>>> s/210/200/ - and it starts working. No reboot is required to reproduce.
>>>
>>> No "vhost" == always works. The only difference I can see here is vhost's
>>> worker thread, which may get suspended if it is not used for a while after
>>> the start and then never wakes up, but this is almost a blind guess.
>>
>>
>> Yet another clue - this host kernel patch seems to help with the guest
>> reboot but does not help with the initial 210 seconds delay:
>>
>> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
>> index 69068e0..5e67650 100644
>> --- a/drivers/vhost/vhost.c
>> +++ b/drivers/vhost/vhost.c
>> @@ -162,10 +162,10 @@ void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
>>                 list_add_tail(&work->node, &dev->work_list);
>>                 work->queue_seq++;
>>                 spin_unlock_irqrestore(&dev->work_lock, flags);
>> -               wake_up_process(dev->worker);
>>         } else {
>>                 spin_unlock_irqrestore(&dev->work_lock, flags);
>>         }
>> +       wake_up_process(dev->worker);
>>  }
>>  EXPORT_SYMBOL_GPL(vhost_work_queue);
>>
>>
> 
> Interesting. Some kind of race? A missing memory barrier somewhere?

I do not see how. I boot the guest and just wait 210 seconds, nothing
happens to cause races.


> Since it's all around startup,
> you can try kicking the host eventfd in
> vhost_net_start.


How exactly? This did not help. Thanks.

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 006576d..407ecf2 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -229,6 +229,17 @@ int vhost_net_start(VirtIODevice *dev, NetClientState *ncs,
         if (r < 0) {
             goto err;
         }
+
+        VHostNetState *vn = tap_get_vhost_net(ncs[i].peer);
+        struct vhost_vring_file file = {
+            .index = i
+        };
+        file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(dev->vq));
+        r = ioctl(vn->dev.control, VHOST_SET_VRING_KICK, &file);
+        if (r) {
+            error_report("Error notifying host notifier: %d", -r);
+            goto err;
+        }
     }
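
For reference, a "kick" in the eventfd sense is a write of an 8-byte counter
increment to the file descriptor, whereas VHOST_SET_VRING_KICK, as used in the
diff above, only tells vhost which descriptor to watch. Below is a minimal
standalone sketch of that signalling on Linux, assuming a plain eventfd(2)
descriptor. The helper names kick_eventfd() and drain_eventfd() are
hypothetical and are not QEMU or vhost APIs; inside QEMU the equivalent
operation would presumably go through event_notifier_set() on the host
notifier.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: signal ("kick") an eventfd by adding 1 to its
 * counter. Returns 0 on success, -1 on error (errno is set by write). */
static int kick_eventfd(int fd)
{
    uint64_t one = 1;
    ssize_t n = write(fd, &one, sizeof(one));

    return n == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: read and reset the counter of a non-blocking
 * eventfd. Returns the accumulated counter value, or 0 if nothing was
 * pending (or the read failed). */
static uint64_t drain_eventfd(int fd)
{
    uint64_t count = 0;

    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

With a descriptor created by eventfd(0, EFD_NONBLOCK), two calls to
kick_eventfd() followed by one drain_eventfd() would report a counter of 2,
which is how a poller on the other side learns that it was signalled.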



> 
>>
>>
>>>>> If I remove vhost=on, it is all good. If I try Fedora19
>>>>> (v3.10-something), it is all good again - works before and after reboot.
>>>>>
>>>>>
>>>>> And there are 2 questions:
>>>>>
>>>>> 1. does anybody have any clue what might go wrong after reboot?
>>>>>
>>>>> 2. Is there any good material to read about what exactly and how vhost
>>>>> accelerates?
>>>>>
>>>>> My understanding is that packets from the guest to the real network
>>>>> travel as follows:
>>>>> 1. guest's virtio-pci-net does ioport(VIRTIO_PCI_QUEUE_NOTIFY)
>>>>> 2. QEMU's net/virtio-net.c calls qemu_net_queue_deliver()
>>>>> 3. QEMU's net/tap.c calls tap_write_packet() and this is how the host knows
>>>>> that there is a new packet.
>>>
>>>
>>> What about the documentation? :) or the idea?
>>>
>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> This is how I run QEMU:
>>>>> ./qemu-system-ppc64 \
>>>>> -enable-kvm \
>>>>> -m 2048 \
>>>>> -machine pseries \
>>>>> -initrd 1.cpio \
>>>>> -kernel vml312_virtio_net_dbg \
>>>>> -nographic \
>>>>> -vga none \
>>>>> -netdev tap,id=id3,ifname=tap-id3,script=ifup.sh,downscript=ifdown.sh,vhost=on \
>>>>> -device virtio-net-pci,id=id4,netdev=id3,mac=C0:41:49:4b:00:00
>>>>>
>>>>>
>>>>> This is the bridge config:
>>>>> [aik@dyn232 ~]$ brctl show
>>>>> bridge name	bridge id		STP enabled	interfaces
>>>>> brtest		8000.00145e992e88	no	pin	eth4
>>>>>
>>>>>
>>>>> The ifup.sh script:
>>>>> ifconfig $1 hw ether ee:01:02:03:04:05
>>>>> /sbin/ifconfig $1 up
>>>>> /usr/sbin/brctl addif brtest $1
>>>
>>>
>>
>>
>> -- 
>> Alexey


-- 
Alexey
