From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1MStcW-0002kJ-Ni for qemu-devel@nongnu.org; Mon, 20 Jul 2009 10:13:36 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1MStcS-0002k7-N7 for qemu-devel@nongnu.org; Mon, 20 Jul 2009 10:13:36 -0400
Received: from [199.232.76.173] (port=39006 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1MStcS-0002k4-L7 for qemu-devel@nongnu.org; Mon, 20 Jul 2009 10:13:32 -0400
Received: from fwil.voltaire.com ([193.47.165.2]:42058 helo=exil.voltaire.com) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1MStcS-0006IF-8K for qemu-devel@nongnu.org; Mon, 20 Jul 2009 10:13:32 -0400
Message-ID: <4A647B72.5090404@Voltaire.com>
Date: Mon, 20 Jul 2009 17:13:06 +0300
From: Or Gerlitz
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH] net: add raw backend - some performance measurements
References: <20090701162115.GA4555@shareable.org> <4A4CA747.1050509@Voltaire.com> <20090703023911.GD938@shareable.org> <4A534EC4.5030209@voltaire.com> <20090707145739.GB14392@shareable.org> <4A54B0F1.3070201@voltaire.com> <20090715203806.GF3056@shareable.org>
In-Reply-To: <20090715203806.GF3056@shareable.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Jamie Lokier
Cc: Mark McLoughlin , "Michael S. Tsirkin" , Herbert Xu , Dor Laor , qemu-devel@nongnu.org, Jan Kiszka

Jamie Lokier wrote:
> Or Gerlitz wrote:
>> the performance (packets per second and cpu utilization) one can get
>> with bridge+tap is much lower vs what you get with the raw mode approach.
> Have you measured it?

Yes, here's some data. Using 2.6.29.1 in the guest and 2.6.30 in the host, with 1GbE connectivity (Intel 82575EB) between the two nodes, I see the following results:

with -net raw (packet socket)

            pps     cs      us    sys
vm->phys    240k    200     7     8
phys->vm    160k    100     5     7

with -net tap (tap + bridge)

            pps     cs      us    sys
vm->phys    170k    600     5     10
phys->vm    150k    14k     5     20

where "pps" stands for packets per second, and "cs", "us" and "sys" are taken from vmstat output: context switches per second, and user and system time in percent.

The benchmark I use is netperf 2.4.4 / UDP_STREAM with a 22-byte payload, so that there are 64 (=14+20+8+22) bytes on the wire. On this setup (UDP, 64-byte frames), in a phys->phys test netperf sends/receives 450K pps and pktgen sends 900K pps. All tests were done without any interrupt moderation tuning.

You can see that raw mode has a much better packets-per-second rate for the VM TX flow (240k vs 170k). On the VM RX side the pps rate is only a bit better (160k vs 150k), but CPU utilization is much lower (sys 7% vs 20%) and there are far fewer context switches (100 vs 14k per second).

Or.

All this is on top of mainline qemu whose head is commit 8676188b751ca28ab7c42baf20ea64391625b44d "Work around Solaris gas problem"
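
For anyone who wants to re-check the arithmetic in the figures above, here is a minimal Python sketch. It is not part of the measurement setup. The function and variable names are made up for illustration, and the header sizes assume plain Ethernet + IPv4 + UDP with no VLAN tag and no IP options. It recomputes the frame size from the 22-byte payload and the relative pps gain of raw over tap, using only the numbers reported in this message.

```python
# Header sizes in bytes (assumption: untagged Ethernet, IPv4 without
# options, UDP); this matches the 14+20+8 breakdown in the message.
ETH_HDR, IPV4_HDR, UDP_HDR = 14, 20, 8


def frame_bytes(payload: int) -> int:
    """Bytes per frame for a UDP payload of the given size."""
    return ETH_HDR + IPV4_HDR + UDP_HDR + payload


# Packets per second exactly as reported in the two tables.
PPS = {
    "raw": {"vm->phys": 240_000, "phys->vm": 160_000},
    "tap": {"vm->phys": 170_000, "phys->vm": 150_000},
}


def raw_gain_percent(direction: str) -> float:
    """Relative pps improvement of -net raw over -net tap, in percent."""
    raw = PPS["raw"][direction]
    tap = PPS["tap"][direction]
    return 100.0 * (raw - tap) / tap


if __name__ == "__main__":
    print(frame_bytes(22))  # 64
    for direction in ("vm->phys", "phys->vm"):
        print(f"{direction}: raw is {raw_gain_percent(direction):.0f}% above tap")
```

With the reported numbers this gives roughly 41% more pps for vm->phys and roughly 7% more for phys->vm, which matches the "much better" / "a bit better" reading of the tables.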