From: Tom Lendacky
Subject: Re: network shutdown under heavy load
Date: Wed, 13 Jan 2010 13:13:56 -0600
Message-ID: <201001131313.56223.tahm@linux.vnet.ibm.com>
References: <4B265E84.3070008@binaryfreedom.info> <20100110123536.GB8085@gondor.apana.org.au> <4B49CA5E.5090001@redhat.com>
In-Reply-To: <4B49CA5E.5090001@redhat.com>
To: Avi Kivity
Cc: Herbert Xu, rek2, kvm@vger.kernel.org

On Sunday 10 January 2010 06:38:54 am Avi Kivity wrote:
> On 01/10/2010 02:35 PM, Herbert Xu wrote:
> > On Sun, Jan 10, 2010 at 02:30:12PM +0200, Avi Kivity wrote:
> >> This isn't in 2.6.27.y. Herbert, can you send it there?
> >
> > It appears that now that TX is fixed we have a similar problem
> > with RX. Once I figure that one out I'll send them together.
>

I've been experiencing the network shutdown issue also. I've been running netperf tests across 10GbE adapters with Qemu 0.12.1.2, RHEL5.4 guests and 2.6.32 kernel (from kvm.git) guests. I instrumented Qemu to print out some network statistics.
It appears that at some point in the netperf test the receiving guest ends up having the 10GbE device "receive_disabled" variable in its VLANClientState structure stuck at 1. From looking at the code it appears that the virtio-net driver in the guest should cause qemu_flush_queued_packets in net.c to eventually run and clear the "receive_disabled" variable but it's not happening. I don't seem to have these issues when I have a lot of debug settings active in the guest kernel which results in very low/poor network performance - maybe some kind of race condition?

Tom

> Thanks.
>
> > Who is maintaining that BTW, stable@kernel.org?
>
> Yes.
>
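
To make the suspected failure mode concrete, here is a minimal standalone C sketch of a "pause delivery until flushed" receive path. It is a toy model, not QEMU's code: struct client_state, deliver_packet, flush_queued_packets, guest_refill and run_scenario are hypothetical names, and the buffer counts are made up. It assumes only what the message describes, namely that a flag like "receive_disabled" is set when the receiver cannot take a packet and is cleared when a flush runs. If the guest refills its buffers but the flush is never triggered, the flag stays at 1 even though buffers are available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the receiving client; NOT QEMU's VLANClientState. */
struct client_state {
    int receive_disabled;   /* 1 while delivery to the guest is paused */
    size_t rx_buffers;      /* free guest receive buffers (modelled) */
    size_t queued;          /* packets held back on the host side */
    size_t delivered;       /* packets handed to the guest */
};

/* Try to hand one packet to the guest; queue it if delivery is paused
 * or the guest has no free buffers. Returns true if delivered. */
static bool deliver_packet(struct client_state *c)
{
    if (c->receive_disabled || c->rx_buffers == 0) {
        c->receive_disabled = 1;
        c->queued++;
        return false;
    }
    c->rx_buffers--;
    c->delivered++;
    return true;
}

/* Clear the pause flag and drain the queue while buffers remain. */
static void flush_queued_packets(struct client_state *c)
{
    c->receive_disabled = 0;
    while (c->queued > 0 && c->rx_buffers > 0) {
        c->queued--;
        c->rx_buffers--;
        c->delivered++;
    }
    if (c->queued > 0)
        c->receive_disabled = 1;   /* ran dry again, stay paused */
}

/* Guest posts n fresh buffers; 'notify' models whether the host side
 * actually gets to run the flush afterwards. */
static void guest_refill(struct client_state *c, size_t n, bool notify)
{
    c->rx_buffers += n;
    if (notify)
        flush_queued_packets(c);
}

/* Burst of traffic, then a refill, then one more packet.
 * Returns the final value of receive_disabled. */
int run_scenario(bool notify)
{
    struct client_state c = { .receive_disabled = 0, .rx_buffers = 2 };

    for (int i = 0; i < 5; i++)      /* 5 packets, only 2 buffers */
        deliver_packet(&c);
    assert(c.receive_disabled == 1); /* receiver ran dry, delivery paused */

    guest_refill(&c, 8, notify);     /* guest posts 8 fresh buffers */
    deliver_packet(&c);              /* traffic keeps arriving */
    return c.receive_disabled;
}
```

In this model run_scenario(true) returns 0 (the flush ran and traffic resumes), while run_scenario(false) returns 1: the flag is stuck and every later packet is queued, which resembles the stall described above. Whether the real problem is a lost or mistimed flush is only a guess at this point.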