From: "Michael S. Tsirkin"
Subject: Re: Network performance with small packets
Date: Wed, 2 Feb 2011 20:38:47 +0200
Message-ID: <20110202183847.GA14829@redhat.com>
To: Steve Dobbelstein
Cc: kvm@vger.kernel.org

On Tue, Jan 25, 2011 at 03:09:34PM -0600, Steve Dobbelstein wrote:
>
> I am working on a KVM network performance issue found in our lab running
> the DayTrader benchmark. The benchmark throughput takes a significant hit
> when running the application server in a KVM guest versus on bare metal.
> We have dug into the problem and found that DayTrader's use of small
> packets exposes KVM's overhead of handling network packets. I have been
> able to reproduce the performance hit with a simpler setup using the
> netperf benchmark with the TCP_RR test and the request and response sizes
> set to 256 bytes. I run the benchmark between two physical systems, each
> using a 1GB link. In order to get the maximum throughput for the system I
> have to run 100 instances of netperf. When I run the netserver processes
> in a guest, I see a maximum throughput that is 51% of what I get if I run
> the netserver processes directly on the host. The CPU utilization in the
> guest is only 85% at maximum throughput, whereas it is 100% on bare metal.

You are stressing the scheduler pretty hard with this test :)
Is your real benchmark also using a huge number of threads?
If it's not, you might be seeing a different issue. IOW, the netperf
degradation might not be network-related at all, but rather have to do
with the speed of context switching in the guest. Thoughts?

> The KVM host has 16 CPUs. The KVM guest is configured with 2 VCPUs. When
> I run netperf on the host I boot the host with maxcpus=2 on the kernel
> command line. The host is running the current KVM upstream kernel along
> with the current upstream qemu. Here is the qemu command used to launch
> the guest:
> /build/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 -name glasgow-RH60 -m 32768 -drive file=/build/guest-data/glasgow-RH60.img,if=virtio,index=0,boot=on
> -drive file=/dev/virt/WAS,if=virtio,index=1 -net nic,model=virtio,vlan=3,macaddr=00:1A:64:E5:00:63,netdev=nic0 -netdev tap,id=nic0,vhost=on -smp 2
> -vnc :1 -monitor telnet::4499,server,nowait -serial telnet::8899,server,nowait --mem-path /libhugetlbfs -daemonize
>
> We have tried various proposed fixes, each with varying amounts of success.
> One such fix was to add code to the vhost thread such that when it found
> the work queue empty it wouldn't just exit the thread but rather would
> delay for 50 microseconds and then recheck the queue. If there was work on
> the queue it would loop back and process it, else it would exit the thread.
> The change got us a 13% improvement in the DayTrader throughput.
>
> Running the same netperf configuration on the same hardware but using a
> different hypervisor gets us significantly better throughput numbers. The
> guest on that hypervisor runs at 100% CPU utilization. The various fixes
> we have tried have not gotten us close to the throughput seen on the other
> hypervisor.
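On the 50 microsecond delay idea above: just to make sure we mean the same
thing, below is a rough sketch of that poll-before-sleep pattern. It is NOT
the actual vhost code or your patch -- it's a made-up userspace toy (all the
names in it are invented for the example) that compiles with plain pthreads,
showing a worker that spins briefly on an empty queue before going to sleep
so that a burst of closely spaced small packets doesn't pay a wakeup each.

/*
 * Illustration only -- not the vhost patch.
 * Build with:  gcc -O2 -pthread poll_worker.c -o poll_worker
 */
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

#define POLL_US 50              /* spin this long before giving up */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t more = PTHREAD_COND_INITIALIZER;
static int pending;             /* queued work items ("packets") */
static bool stop;

static uint64_t now_us(void)
{
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000ull + ts.tv_nsec / 1000;
}

static void *worker(void *arg)
{
        (void)arg;
        pthread_mutex_lock(&lock);
        while (!stop) {
                if (pending > 0) {
                        pending--;
                        pthread_mutex_unlock(&lock);
                        /* ... handle one item (process one packet) ... */
                        pthread_mutex_lock(&lock);
                        continue;
                }
                /* Queue empty: poll briefly instead of sleeping right away. */
                uint64_t deadline = now_us() + POLL_US;
                while (pending == 0 && !stop && now_us() < deadline) {
                        pthread_mutex_unlock(&lock);
                        sched_yield();          /* stand-in for cpu_relax() */
                        pthread_mutex_lock(&lock);
                }
                if (pending == 0 && !stop)
                        pthread_cond_wait(&more, &lock);  /* now sleep */
        }
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);

        for (int i = 0; i < 1000; i++) {        /* producer: a small-packet burst */
                pthread_mutex_lock(&lock);
                pending++;
                pthread_cond_signal(&more);
                pthread_mutex_unlock(&lock);
                usleep(10);
        }

        pthread_mutex_lock(&lock);
        stop = true;
        pthread_cond_signal(&more);
        pthread_mutex_unlock(&lock);
        pthread_join(tid, NULL);
        return 0;
}

The obvious trade-off is CPU burned spinning while the queue stays empty,
which would fit with the win showing up only under a steady small-packet load.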
> I'm looking for ideas/input from the KVM experts on how to
> make KVM perform better when handling small packets.
>
> Thanks,
> Steve