From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tom Lendacky <tahm@linux.vnet.ibm.com>
Subject: Network shutdown under load
Date: Fri, 29 Jan 2010 14:06:41 -0600
Message-ID: <201001291406.41559.tahm@linux.vnet.ibm.com>
Mime-Version: 1.0
Content-Type: Text/Plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: kvm@vger.kernel.org, qemu-devel@nongnu.org, chrisw@redhat.com,
	avi@redhat.com, herbert@gondor.apana.org.au,
	rek2@binaryfreedom.info, markmc@redhat.com, aliguori@us.ibm.com
Return-path: <kvm-owner@vger.kernel.org>
Received: from e39.co.us.ibm.com ([32.97.110.160]:49820 "EHLO
	e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751247Ab0A2UHC (ORCPT <rfc822;kvm@vger.kernel.org>);
	Fri, 29 Jan 2010 15:07:02 -0500
Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106])
	by e39.co.us.ibm.com (8.14.3/8.13.1) with ESMTP id o0TJxi7c015955
	for <kvm@vger.kernel.org>; Fri, 29 Jan 2010 12:59:44 -0700
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168])
	by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o0TK6rjp059506
	for <kvm@vger.kernel.org>; Fri, 29 Jan 2010 13:06:54 -0700
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1])
	by d03av02.boulder.ibm.com (8.14.3/8.13.1/NCO v10.0 AVout) with ESMTP id o0TK6gn8004129
	for <kvm@vger.kernel.org>; Fri, 29 Jan 2010 13:06:44 -0700
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>


There's been some discussion of this already on the kvm list, but I want to
summarize what I've found and also include the qemu-devel list in an effort to
find a solution to this problem.

Running a netperf test between two kvm guests results in a guest's network
interface shutting down. I originally found this using kvm guests on two
different machines connected via a 10GbE link.  However, the problem can be
easily reproduced using two guests on the same machine.

I am running the 2.6.32 level of the kvm.git tree and the 0.12.1.2 level of 
the qemu-kvm.git tree.

The setup includes two bridges, br0 and br1.

The commands used to start the guests are as follows:
  usr/local/bin/qemu-system-x86_64 -name cape-vm001 -m 1024 \
    -drive file=/autobench/var/tmp/cape-vm001-raw.img,if=virtio,index=0,media=disk,boot=on \
    -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:51,netdev=cape-vm001-eth0 \
    -netdev tap,id=cape-vm001-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 \
    -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:D1,netdev=cape-vm001-eth1 \
    -netdev tap,id=cape-vm001-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 \
    -vnc :1 -monitor telnet::5701,server,nowait -snapshot -daemonize

  usr/local/bin/qemu-system-x86_64 -name cape-vm002 -m 1024 \
    -drive file=/autobench/var/tmp/cape-vm002-raw.img,if=virtio,index=0,media=disk,boot=on \
    -net nic,model=virtio,vlan=0,macaddr=00:16:3E:00:62:61,netdev=cape-vm002-eth0 \
    -netdev tap,id=cape-vm002-eth0,script=/autobench/var/tmp/ifup-kvm-br0,downscript=/autobench/var/tmp/ifdown-kvm-br0 \
    -net nic,model=virtio,vlan=1,macaddr=00:16:3E:00:62:E1,netdev=cape-vm002-eth1 \
    -netdev tap,id=cape-vm002-eth1,script=/autobench/var/tmp/ifup-kvm-br1,downscript=/autobench/var/tmp/ifdown-kvm-br1 \
    -vnc :2 -monitor telnet::5702,server,nowait -snapshot -daemonize

The ifup-kvm-br0 script takes the (first) qemu-created tap device, brings it
up and adds it to bridge br0.  The ifup-kvm-br1 script takes the (second)
qemu-created tap device, brings it up and adds it to bridge br1.
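
For anyone trying to recreate the setup, here is a rough sketch of what an
ifup script of this kind might look like. It is not the actual ifup-kvm-br0
script; the function name tap_ifup and the RUN dry-run switch are made up for
illustration, and it assumes the bridge already exists and that qemu passes
the tap device name as the first argument.

```shell
#!/bin/sh
# Hypothetical sketch of an ifup-kvm-br0 style script (not the real one).
# Assumes bridge br0 already exists on the host.
BRIDGE=${BRIDGE:-br0}
RUN=${RUN:-}    # set RUN=echo to print the commands instead of running them

# Bring the tap device up, then attach it to the bridge.
tap_ifup() {
    tap=$1
    $RUN ip link set "$tap" up || return 1
    $RUN brctl addif "$BRIDGE" "$tap"
}
```

To use something like this as the script= argument, the file would end with a
call such as tap_ifup "$1"; the br1 variant would only differ in BRIDGE.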

Each ethernet device within a guest is on its own subnet.  For example:
  guest 1 eth0 has addr 192.168.100.32 and eth1 has addr 192.168.101.32
  guest 2 eth0 has addr 192.168.100.64 and eth1 has addr 192.168.101.64

On one of the guests, run netserver:
  netserver -L 192.168.101.32 -p 12000

On the other guest, run netperf:
  netperf -L 192.168.101.64 -H 192.168.101.32 -p 12000 -t TCP_STREAM -l 60 \
    -c -C -- -m 16K -M 16K

It may take more than one netperf run (my second run almost always triggers
it), but eventually the network on the eth1 links stops working.
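
Since it can take a few runs, a small loop on the netperf guest may make this
easier to recreate. The following is only a sketch: the repro function, the
MAX_RUNS limit, the RUN dry-run switch and the ping check between runs are my
own additions for illustration, while the addresses and netperf options are
the ones given above.

```shell
#!/bin/sh
# Sketch: repeat the netperf run until the eth1 link stops responding.
LOCAL=192.168.101.64     # eth1 address of the netperf guest
REMOTE=192.168.101.32    # eth1 address of the netserver guest
PORT=12000
MAX_RUNS=${MAX_RUNS:-5}  # hypothetical upper bound on attempts
RUN=${RUN:-}             # set RUN=echo to print the commands instead of running them

repro() {
    i=1
    while [ "$i" -le "$MAX_RUNS" ]; do
        $RUN netperf -L "$LOCAL" -H "$REMOTE" -p "$PORT" -t TCP_STREAM -l 60 \
            -c -C -- -m 16K -M 16K
        # After each run, check whether the eth1 link still passes traffic.
        if ! $RUN ping -c 3 -I "$LOCAL" "$REMOTE" >/dev/null; then
            echo "eth1 link stopped responding after run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "link still up after $MAX_RUNS runs"
}
```

Calling repro on the netperf guest (with netserver already started on the
other guest) should stop at the first run after which ping no longer gets
through.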

I did some debugging and found that in qemu on the guest running netserver:
 - the receive_disabled variable is set and never gets reset
 - the read_poll event handler for the eth1 tap device is disabled and never
   re-enabled
These conditions result in no packets being read from the tap device and sent 
to the guest - effectively shutting down the network.  Network connectivity 
can be restored by shutting down the guest interfaces, unloading the 
virtio_net module, re-loading the virtio_net module and re-starting the guest 
interfaces.
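
The recovery steps above could be scripted roughly as follows, run from the
guest console (for example over VNC, since the network goes away). This is a
sketch under assumptions: the ifdown/ifup commands and the eth0/eth1 names
depend on the guest's distro and configuration, and reset_virtio_net and the
RUN dry-run switch are names invented for illustration.

```shell
#!/bin/sh
# Sketch: restore guest networking by reloading the virtio_net module.
IFACES=${IFACES:-"eth0 eth1"}   # assumed guest interface names
RUN=${RUN:-}                    # set RUN=echo to print the commands instead of running them

reset_virtio_net() {
    # Shut down the guest interfaces.
    for ifc in $IFACES; do $RUN ifdown "$ifc"; done
    # Unload and reload the virtio_net module.
    $RUN rmmod virtio_net || return 1
    $RUN modprobe virtio_net || return 1
    # Restart the guest interfaces.
    for ifc in $IFACES; do $RUN ifup "$ifc"; done
}
```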

I'm continuing to work on debugging this, but would appreciate it if some
folks with more qemu network experience could try to recreate and debug this.

If my kernel config matters, I can provide that.

Thanks,
Tom