From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bernd Naumann
Subject: vhost_net: VM looses network when using vhost over time
Date: Wed, 20 Sep 2017 14:44:54 +0000 (UTC)
Message-ID: <872691802.5840849.1505918694826.JavaMail.zimbra@spreadshirt.net>
Cc: Linux Kernel Network Developers
To: qemu-discuss@nongnu.org
Sender: netdev-owner@vger.kernel.org

Hi @all,

We have run into a bug that is more or less reproducible, but we do not know exactly how to trigger it or how to debug it in the first place.

# Background

In our setup we have a Ganeti cluster (KVM) with currently ~60 nodes running ~500 VMs. We use tap interfaces on L2 bridges, L3 routed tap interfaces, and tap interfaces on a bridge with a VTEP attached to it. (For the VXLAN setup we have a home-grown daemon that maintains the FDB.)

# The issue

On some VMs we lose network connectivity under unknown circumstances.
"Losing" means that the VM is not reachable and therefore cannot reach any other host in the network.

However, with `tcpdump` on the host (physical NIC + bridge) we can see the traffic going in; with `tcpdump` on the VM we only see ARP coming in, but nothing going out. Manually setting the ARP entry, or toggling ARP with `ip link set $DEV arp off; ip link set $DEV arp on`, does not help at all, or only for a moment. The only way we found to "fix" it is rebooting the VM or running `modprobe -r virtio_net; modprobe virtio_net`, but this does not seem to be a good workaround either, and the issue can come back within a short time. It is also difficult to determine when the issue kicks in. Counting 'FAILED' neighbors is an indicator, but nothing to rely on.
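Since the number of 'FAILED' neighbors is the only indicator mentioned so far, a minimal sketch for tracking it per interface could look like the following. The function name `count_failed_neighbors` is made up for illustration, and the sketch assumes the usual `ip neigh show` output format, where the interface name follows the keyword `dev` and the neighbor state is the last field.

```shell
#!/bin/sh
# Hypothetical helper: count neighbor entries in state FAILED per interface.
# Reads lines in `ip neigh show` format on stdin, for example:
#   10.0.0.1 dev eth0  FAILED
#   10.0.0.2 dev eth0 lladdr 52:54:00:12:34:56 REACHABLE
# Prints one "<interface> <count>" line per interface, sorted by name.
count_failed_neighbors() {
    awk '$NF == "FAILED" {
             # The interface name is the field right after "dev".
             for (i = 1; i < NF; i++)
                 if ($i == "dev") n[$(i + 1)]++
         }
         END { for (d in n) print d, n[d] }' | sort
}
```

Inside an affected VM it could be fed with `ip neigh show | count_failed_neighbors`, for example from cron, so that the counts can be compared over time.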
The frequency of the issue ranges from once every few days, to multiple times per day, or even a few minutes after boot. We see the most impact on VMs with higher network traffic, like our gateway VMs (multiple NICs in different networks, IPsec, iptables, ...) and HAProxy VMs (similar to our gateways), but also (with reduced frequency) on /normal/ application VMs.

From what we have found so far, it looks similar to:

* https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978 -- Bug #997978 “KVM images lose connectivity with bridged network” : Bugs : qemu-kvm package : Ubuntu
* https://bugs.centos.org/view.php?id=5526 -- 0005526: KVM Guest with virtio network loses network connectivity - CentOS Bug Tracker

Via `rtmon` we can observe that it starts with some "FAILED" neighbor entries and that they increase over time. We know that this is only a consequence of ARP replies not being sent to the requester, or of ARP requests going unanswered (because the packet is not leaving the VM), so the increasing count of 'FAILED' neighbors is /normal/. BUT: this can start on any interface: bridged tap interface for WAN, bridged tap in VXLAN, routed tap. It does not matter; it is not directly linked to the "kind" of interface.
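To see on which interface the failures start, the neighbor events can be reduced to the first 'FAILED' entry per interface. The following is only a sketch: it uses `ip -ts monitor neigh` rather than `rtmon`, the function name `first_failed_per_dev` is hypothetical, and it assumes that each event is printed on one line, with the state as the last field and a `Deleted` prefix on removals.

```shell
#!/bin/sh
# Hypothetical helper: print the first FAILED neighbor event per interface.
# Reads neighbor event lines on stdin, for example from `ip -ts monitor neigh`:
#   [2017-09-20T14:00:01.000000] 10.0.0.1 dev eth0  FAILED
# Lines are passed through unchanged, so any timestamp prefix is preserved.
first_failed_per_dev() {
    awk '$NF == "FAILED" && $0 !~ /Deleted/ {
             for (i = 1; i < NF; i++) {
                 if ($i == "dev") {
                     d = $(i + 1)
                     # Report each interface only once, at its first failure.
                     if (!(d in seen)) { seen[d] = 1; print }
                 }
             }
         }'
}
```

A possible invocation inside the VM is `ip -ts monitor neigh | first_failed_per_dev`. The timestamps then show which interface failed first and when, which can be matched against traffic or log events on the host.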
# General overview of the setup

* Ganeti cluster with ~60 nodes
* each node has 2 x 50G (mlnx5 dual-port) connected to 2 x MLNX SN2700 switches
* each node runs `bird` with OSPF and ECMP (and OSPF with ECMP on the SN2700 too)
* each VM has one or more vNICs in a bridged or routed network
* networks: bridged tap in WAN; bridged tap with attached VTEP; routed tap
* host OS: Ubuntu 16.04.3 with Ubuntu kernel 4.12.13; first tested with qemu-kvm 1:2.5+dfsg-5ubuntu10.15, and later upgraded to qemu-kvm 2.10~rc3+dfsg-0ubuntu1, same issue; guest OS: Ubuntu 14.04, Ubuntu 16.04, and Ubuntu 16.04 with the latest Ubuntu mainline kernel PPA

# So far we can "verify" it is 'vhost'

Without "vhost=on" for the kvm process we cannot observe this issue. With "vhost=on", an affected VM can be "fixed" by `rmmod` and `insmod virtio_net`, but a reboot seems to provide a "fix" for a "longer" period. (But as you may know, virtio without vhost does not have the performance we expect.)

So we have some questions:

* How can we debug the main issue to provide a meaningful bug report? We could set debug flags on the kernel, but where would we attach gdb? Sadly we are no kernel hackers :/, but we can compile our own kernel and qemu-kvm to test release candidates and/or put patches in place.
* Has anyone else seen this too? Can anyone provide a better workaround, a patch, or anything?
* Where should we file/reopen this issue? qemu, netdev?
* Is qemu-kvm even the right place to look for answers?

We are happy to provide more information or collect debug information if someone wants to investigate.

Thanks for your time!
Best,
Bernd Naumann

Spreadshirt

Bernd Naumann
Systems Engineer, Networking & Operations
bernd.naumann@spreadshirt.net
http://www.spreadshirt.com

sprd.net AG
Gießerstraße 27
D-04229 Leipzig
Fon: +49 341 594 00 - 5900
Fax: +49 341 594 00 - 5149
Vorstand / executive board: Philip Rooke (CEO/Vorsitzender) · Tobias Schaugg
Aufsichtsratsvorsitzender / chairman of the supervisory board: Lukasz Gadowski
Handelsregister / trade register: Amtsgericht Leipzig, HRB 22478
Umsatzsteuer-IdentNummer / VAT-ID: DE 8138 7149 4