From: "Michael S. Tsirkin"
Subject: Re: Network performance with small packets
Date: Wed, 2 Feb 2011 20:38:47 +0200
Message-ID: <20110202183847.GA14829@redhat.com>
To: Steve Dobbelstein
Cc: kvm@vger.kernel.org

On Tue, Jan 25, 2011 at 03:09:34PM -0600, Steve Dobbelstein wrote:
>
> I am working on a KVM network performance issue found in our lab running
> the DayTrader benchmark. The benchmark throughput takes a significant hit
> when running the application server in a KVM guest versus on bare metal.
> We have dug into the problem and found that DayTrader's use of small
> packets exposes KVM's overhead of handling network packets. I have been
> able to reproduce the performance hit with a simpler setup using the
> netperf benchmark with the TCP_RR test and the request and response sizes
> set to 256 bytes. I run the benchmark between two physical systems, each
> using a 1GB link. In order to get the maximum throughput for the system I
> have to run 100 instances of netperf. When I run the netserver processes
> in a guest, I see a maximum throughput that is 51% of what I get if I run
> the netserver processes directly on the host. The CPU utilization in the
> guest is only 85% at maximum throughput, whereas it is 100% on bare metal.

You are stressing the scheduler pretty hard with this test :)
Is your real benchmark also using a huge number of threads?
If it's not, you might be seeing a different issue. IOW, the netperf
degradation might not be network-related at all, but rather have to do
with the speed of context switching in the guest. Thoughts?

> The KVM host has 16 CPUs. The KVM guest is configured with 2 VCPUs. When
> I run netperf on the host I boot the host with maxcpus=2 on the kernel
> command line. The host is running the current KVM upstream kernel along
> with the current upstream qemu. Here is the qemu command used to launch
> the guest:
> /build/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 -name glasgow-RH60 -m 32768 -drive file=/build/guest-data/glasgow-RH60.img,if=virtio,index=0,boot=on
> -drive file=/dev/virt/WAS,if=virtio,index=1 -net nic,model=virtio,vlan=3,macaddr=00:1A:64:E5:00:63,netdev=nic0 -netdev tap,id=nic0,vhost=on -smp 2
> -vnc :1 -monitor telnet::4499,server,nowait -serial telnet::8899,server,nowait --mem-path /libhugetlbfs -daemonize
>
> We have tried various proposed fixes, each with varying amounts of success.
> One such fix was to add code to the vhost thread such that when it found
> the work queue empty it wouldn't just exit the thread but rather would
> delay for 50 microseconds and then recheck the queue. If there was work on
> the queue it would loop back and process it, else it would exit the thread.
> The change got us a 13% improvement in the DayTrader throughput.
>
> Running the same netperf configuration on the same hardware but using a
> different hypervisor gets us significantly better throughput numbers. The
> guest on that hypervisor runs at 100% CPU utilization. The various fixes
> we have tried have not gotten us close to the throughput seen on the other
> hypervisor.
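On the 50 microsecond delay idea above: just to make sure we mean the same
thing, below is a rough sketch of that poll-before-sleep pattern. It is NOT
the actual vhost code or your patch -- it's a made-up userspace toy (all the
names in it are invented for the example) that compiles with plain pthreads,
showing a worker that spins briefly on an empty queue before going to sleep
so that a burst of closely spaced small packets doesn't pay a wakeup each.

/*
 * Illustration only -- not the vhost patch.
 * Build with:  gcc -O2 -pthread poll_worker.c -o poll_worker
 */
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define POLL_US 50              /* spin this long before giving up */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t more = PTHREAD_COND_INITIALIZER;
static int pending;             /* queued work items ("packets") */
static bool stop;

static uint64_t now_us(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000ull + ts.tv_nsec / 1000;
}

static void *worker(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&lock);
        while (!stop) {
                if (pending > 0) {
                        pending--;
                        pthread_mutex_unlock(&lock);
                        /* ... handle one item (process one packet) ... */
                        pthread_mutex_lock(&lock);
                        continue;
                }
                /* Queue empty: poll briefly instead of sleeping right away. */
                uint64_t deadline = now_us() + POLL_US;
                while (pending == 0 && !stop && now_us() < deadline) {
                        pthread_mutex_unlock(&lock);
                        sched_yield();          /* stand-in for cpu_relax() */
                        pthread_mutex_lock(&lock);
                }
                if (pending == 0 && !stop)
                        pthread_cond_wait(&more, &lock);  /* now sleep */
        }
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);

        for (int i = 0; i < 1000; i++) {        /* producer: a small-packet burst */
                pthread_mutex_lock(&lock);
                pending++;
                pthread_cond_signal(&more);
                pthread_mutex_unlock(&lock);
                usleep(10);
        }

        pthread_mutex_lock(&lock);
        stop = true;
        pthread_cond_signal(&more);
        pthread_mutex_unlock(&lock);
        pthread_join(tid, NULL);
        return 0;
}

The obvious trade-off is CPU burned spinning while the queue stays empty,
which would fit with the win showing up only under a steady small-packet load.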
> I'm looking for ideas/input from the KVM experts on how to
> make KVM perform better when handling small packets.
>
> Thanks,
> Steve