From mboxrd@z Thu Jan 1 00:00:00 1970
From: Emmanuel Lacour
Subject: Re: virtio_net hang
Date: Fri, 14 Nov 2008 10:23:39 +0100
Message-ID: <20081114092339.GC11961@easter-eggs.com>
References: <20081113122709.GB14254@easter-eggs.com> <1226589153.19068.7.camel@blaa> <20081113152452.GI14254@easter-eggs.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
To: kvm@vger.kernel.org
Return-path:
Received: from roxane.home-dn.net ([88.191.11.98]:60681 "EHLO roxane.home-dn.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751176AbYKNJXn (ORCPT ); Fri, 14 Nov 2008 04:23:43 -0500
Received: from localhost (localhost.localdomain [127.0.0.1]) by roxane.home-dn.net (Postfix) with ESMTP id 66E962C056 for ; Fri, 14 Nov 2008 10:23:41 +0100 (CET)
Received: from datura.easter-eggs.fr (unknown [IPv6:2001:7a8:115a:1:214:22ff:feb4:f4ea]) by roxane.home-dn.net (Postfix) with ESMTP id 11C7A2C055 for ; Fri, 14 Nov 2008 10:23:39 +0100 (CET)
Content-Disposition: inline
In-Reply-To: <20081113152452.GI14254@easter-eggs.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Nov 13, 2008 at 04:24:52PM +0100, Emmanuel Lacour wrote:
> On Thu, Nov 13, 2008 at 03:12:33PM +0000, Mark McLoughlin wrote:
> > The fact that re-loading the virtio_net driver fixes things up makes me
> > suspect you've found a bug in the virtio_net driver, rather than e.g. a
> > bug in the kvm-userspace side.
> >
> > To try and narrow down what's happening, when the interface has hung,
> > try:
> >
> >   - tcpdump on both eth0 in the guest and the tap device on the host
> >     (tap5 in your example)
> >
>
> On eth0 I see echo requests, but _no_ echo replies
> On tap5 I see echo requests _and_ echo replies
>
> >   - look for anything unusual in the stats for both those interfaces,
> >     e.g. /proc/net/dev, netstat -s etc.
> >
>
> Comparing with other guests that have no problems, the only difference is
> that this tap (and only this one) reports "overruns":
>
> tap5      Link encap:Ethernet  HWaddr 00:FF:AD:53:76:25
>           inet6 addr: fe80::2ff:adff:fe53:7625/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:717737621 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:636626720 errors:0 dropped:0 overruns:317 carrier:0
>           collisions:0 txqueuelen:500
>           RX bytes:368973099756 (343.6 GiB)  TX bytes:217917073227 (202.9 GiB)
>
> Overruns seem to happen just when there is a "hang"; the counter doesn't
> seem to increase when the network is working properly.
>
> >   - strace the /usr/bin/kvm process
> >
>
> Unfortunately I was unable to do this because I can't reproduce the problem
> on a test VM, and I can't leave this VM with a non-working network for
> analysis because it is in production, so I have a script which pings and
> restarts the module/interface when needed.
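
For reference, a minimal sketch of how a watchdog script like the one mentioned above could record the tap device's TX overruns counter each time it detects a hang, so that hangs can be matched against counter changes. This is only an illustration under stated assumptions, not the script actually in use: the function names are hypothetical, and it assumes that the TX "overruns" value printed by ifconfig corresponds to the transmit "fifo" column of /proc/net/dev.

```python
# Sketch only: parse the text of /proc/net/dev and pull out one counter.
# Column names follow the two header lines of /proc/net/dev on Linux.
RX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed"]


def parse_proc_net_dev(text):
    """Return {iface: {"rx": {...}, "tx": {...}}} from /proc/net/dev content."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        if len(numbers) < 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:16])),
        }
    return stats


def tx_overruns(text, iface):
    """TX fifo counter for iface (assumed to be ifconfig's TX 'overruns')."""
    return parse_proc_net_dev(text)[iface]["tx"]["fifo"]


def read_tx_overruns(iface="tap5", path="/proc/net/dev"):
    """Read the live counter on the host; tap5 is the device from this thread."""
    with open(path) as f:
        return tx_overruns(f.read(), iface)
```

The watchdog could call read_tx_overruns() on the host just before it restarts the module/interface and log the value with a timestamp; if the counter only ever moves at those moments, that would support the observation that overruns coincide with the hang.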