From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: Network throughput limits for local VM <-> VM communication
Date: Wed, 17 Jun 2009 15:22:34 +0300
Message-ID: <4A38E00A.5090402@redhat.com>
References: <0199E0D51A61344794750DC57738F58E67D2398F9E@GVW1118EXC.americas.hpqcorp.net>
 <200906101529.47103.arnd@arndb.de>
 <0199E0D51A61344794750DC57738F58E67D2399710@GVW1118EXC.americas.hpqcorp.net>
 <200906110949.16197.arnd@arndb.de>
 <0199E0D51A61344794750DC57738F58E67D23998F5@GVW1118EXC.americas.hpqcorp.net>
 <4A30BD98.6040302@redhat.com>
 <0199E0D51A61344794750DC57738F58E67D2399972@GVW1118EXC.americas.hpqcorp.net>
 <4A30C566.7040109@redhat.com>
 <0199E0D51A61344794750DC57738F58E67D2DCF44F@GVW1118EXC.americas.hpqcorp.net>
 <4A38A03B.20904@redhat.com>
 <0199E0D51A61344794750DC57738F58E67D2DCF4BF@GVW1118EXC.americas.hpqcorp.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Arnd Bergmann , Mark McLoughlin , "kvm@vger.kernel.org"
To: "Fischer, Anna"
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:40603 "EHLO mx2.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752989AbZFQMWi (ORCPT ); Wed, 17 Jun 2009 08:22:38 -0400
In-Reply-To: <0199E0D51A61344794750DC57738F58E67D2DCF4BF@GVW1118EXC.americas.hpqcorp.net>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 06/17/2009 11:12 AM, Fischer, Anna wrote:
>
> For the tests I run now (with vlan= enabled) I am actually using both TCP and UDP, and I see the problem with virtio_net for both protocols. What I am wondering about though is that I do not seem to have any problems if I communicate directly between the two guests (if I plug them into the same bridge and put them onto the same networks), so why do I only see the problem of stalling network communication when there is a routing VM in the network path? Is this just because the system is even more overloaded in that case? Or could this be an issue related to a dual NIC configuration or the fact that I run multiple bridges on the same physical machine?
>

My guess is that somewhere there's a queue that's shorter than the virtio queue, or its usable size fluctuates (because it is shared with something else). So TCP flow control doesn't work, and UDP doesn't have a chance.

> When you say "We are working on fixing this." - which code parts are you working on? Is this in the QEMU network I/O processing code or is this virtio_net related?
>

tap. virtio, qemu, maybe more. It's a difficult problem.

> Retry with "the fixed configuration"? You mean setting the vlan= parameter? I have already used the vlan= parameter for the latest tests, and so the CPU utilization issues I am talking about are happening with that configuration.
>

Yeah.

Can you compare total data sent and received as seen by the guests? That would confirm that packets being dropped causes the slowdown.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
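
One possible way to do the sent/received comparison suggested above, as a minimal sketch rather than anything used in this thread: read the kernel's per-interface counters from /proc/net/dev in each guest before and after a test run, then compare the sender's transmit delta with the receiver's receive delta. The interface name eth0 and the helper names (Counters, parse_proc_net_dev, delta, missing_packets) are assumptions made for illustration.

```python
from typing import Dict, NamedTuple


class Counters(NamedTuple):
    rx_bytes: int
    rx_packets: int
    rx_dropped: int
    tx_bytes: int
    tx_packets: int
    tx_dropped: int


def parse_proc_net_dev(text: str) -> Dict[str, Counters]:
    """Parse the contents of /proc/net/dev into per-interface counters."""
    stats: Dict[str, Counters] = {}
    # The first two lines are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        # Some kernels print "eth0:12345" with no space after the colon.
        name, _, rest = line.partition(":")
        f = [int(x) for x in rest.split()]
        if len(f) < 16:
            continue
        # Fields 0-7 are receive counters, 8-15 are transmit counters.
        stats[name.strip()] = Counters(f[0], f[1], f[3], f[8], f[9], f[11])
    return stats


def delta(before: Counters, after: Counters) -> Counters:
    """Counters accumulated between two snapshots of the same interface."""
    return Counters(*(a - b for a, b in zip(after, before)))


def missing_packets(sender: Counters, receiver: Counters) -> int:
    """Packets the sender transmitted that the receiver never counted."""
    return sender.tx_packets - receiver.rx_packets


if __name__ == "__main__":
    import sys

    # "eth0" is an assumed default; pass the guest's real interface name.
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth0"
    with open("/proc/net/dev") as fh:
        print(iface, parse_proc_net_dev(fh.read())[iface])
```

Run it in the sending and the receiving guest before and after the test, and feed the two deltas to missing_packets(). A clearly positive result would point at packets being dropped somewhere on the path. Offloads and TCP retransmissions can skew guest-visible counts, so a UDP run is probably easier to interpret.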