From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46482)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WKlm4-0006pJ-1d for Qemu-devel@nongnu.org;
	Tue, 04 Mar 2014 04:36:40 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1WKllv-0000zb-Ld for Qemu-devel@nongnu.org;
	Tue, 04 Mar 2014 04:36:31 -0500
Received: from mail-ee0-x232.google.com ([2a00:1450:4013:c00::232]:58785)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WKllv-0000zO-Ds for Qemu-devel@nongnu.org;
	Tue, 04 Mar 2014 04:36:23 -0500
Received: by mail-ee0-f50.google.com with SMTP id c13so2732793eek.9
	for ; Tue, 04 Mar 2014 01:36:22 -0800 (PST)
Date: Tue, 4 Mar 2014 10:36:14 +0100
From: Stefan Hajnoczi
Message-ID: <20140304093614.GG25676@stefanha-thinkpad.redhat.com>
References: <5310489A.4060501@cisco.com>
	<20140303132746.GE21055@stefanha-thinkpad.redhat.com>
	<53148B1A.3070008@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <53148B1A.3070008@cisco.com>
Subject: Re: [Qemu-devel] Contribution - L2TPv3 transport
To: "Anton Ivanov (antivano)"
Cc: Luigi Rizzo, "Qemu-devel@nongnu.org", Vincenzo Maffione

On Mon, Mar 03, 2014 at 02:01:00PM +0000, Anton Ivanov (antivano) wrote:
> On 03/03/14 13:27, Stefan Hajnoczi wrote:
> > On Fri, Feb 28, 2014 at 08:28:11AM +0000, Anton Ivanov (antivano) wrote:
> >> 3. Qemu to communicate with the local host, remote vms, network devices,
> >> etc at speeds which for a number of use cases exceed the speed of the
> >> legacy tap driver.
> > This surprises me. It's odd that tap performs significantly worse.
> >
>
> Multipacket RX can go a very long way and it does not work on tap's
> emulation of a raw socket.
> At least in 3.2 :)

Luigi and Vincenzo had ideas on making QEMU's net layer support
multipacket tx using something like TCP_CORK. This would map to
sendmmsg(2).

Basically the net client gets multiple .receive() calls but is told to
hold off on submitting the packets. Then, when it finally gets
uncorked, it can sendmmsg(2). The only issue is we need to hold on to
the tx buffers longer than normal.

> > Now about the tap userspace ABI, is the performance bottleneck that the
> > read(2) system call only receives one packet at a time? The tap file
> > descriptor is not a socket so recvmmsg(2) cannot be used on it directly.
>
> If I read the kernel source correctly the tap fd can emulate a socket
> for some calls. However, when I try recvmmsg I get an ENOTSOCK.

The fd is not a real socket. Confusingly, inside the kernel the tun.c
driver has a "socket" which is used for zero-copy tx by vhost_net.
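
To make the two points in this mail concrete, here is a rough sketch,
not QEMU code. The names tx_batch, tx_batch_queue() and
tx_batch_uncork() are made up for illustration, and the sketch copies
each packet into its own buffer as one simple way of holding on to the
tx data until the uncork. The last helper calls recvmmsg(2) on a pipe,
which stands in for the tap fd here only because both are non-socket
file descriptors; it does not open /dev/net/tun.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define TX_BATCH_MAX 8
#define TX_PKT_MAX   2048

/* Hypothetical corked tx queue; none of these names exist in QEMU. */
struct tx_batch {
    unsigned char buf[TX_BATCH_MAX][TX_PKT_MAX]; /* copies held until uncork */
    struct iovec iov[TX_BATCH_MAX];
    struct mmsghdr msgs[TX_BATCH_MAX];
    unsigned int count;
};

static void tx_batch_init(struct tx_batch *b)
{
    memset(b, 0, sizeof(*b));
}

/* Called once per .receive() while corked: stash the packet, send nothing. */
static int tx_batch_queue(struct tx_batch *b, const void *pkt, size_t len)
{
    if (b->count == TX_BATCH_MAX || len > TX_PKT_MAX) {
        return -1; /* caller has to uncork (or drop) first */
    }
    unsigned int i = b->count++;
    memcpy(b->buf[i], pkt, len);
    b->iov[i].iov_base = b->buf[i];
    b->iov[i].iov_len = len;
    memset(&b->msgs[i], 0, sizeof(b->msgs[i]));
    b->msgs[i].msg_hdr.msg_iov = &b->iov[i];
    b->msgs[i].msg_hdr.msg_iovlen = 1;
    return 0;
}

/* Uncork: hand the whole batch to the kernel with one sendmmsg(2) call.
 * Returns the number of packets sent, or -1 with errno set. */
static int tx_batch_uncork(struct tx_batch *b, int sockfd)
{
    int sent = sendmmsg(sockfd, b->msgs, b->count, 0);
    b->count = 0; /* sketch only: unsent packets are dropped, not retried */
    return sent;
}

/* Try recvmmsg(2) on a non-socket fd and return the resulting errno
 * (0 if the call unexpectedly succeeds, -1 if the pipe cannot be made). */
static int recvmmsg_errno_on_pipe(void)
{
    int p[2];
    unsigned char buf[64];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    struct mmsghdr m;

    if (pipe(p) < 0) {
        return -1;
    }
    memset(&m, 0, sizeof(m));
    m.msg_hdr.msg_iov = &iov;
    m.msg_hdr.msg_iovlen = 1;

    int rc = recvmmsg(p[0], &m, 1, MSG_DONTWAIT, NULL);
    int err = (rc < 0) ? errno : 0;

    close(p[0]);
    close(p[1]);
    return err;
}
```

A caller would invoke tx_batch_queue() for each packet while corked and
tx_batch_uncork() once at the end, on a datagram socket. A real
implementation would also need to retry packets that sendmmsg(2) did
not accept, and would probably want to avoid the extra copy.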