From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 4/5] kvm: qemu: Use vringfd to eliminate copies
Date: Mon, 16 Jun 2008 07:58:09 -0700
Message-ID: <48567F81.4070801@qumranet.com>
References: <1213365481-23460-1-git-send-email-markmc@redhat.com> <1213365481-23460-5-git-send-email-markmc@redhat.com> <48545422.1040109@us.ibm.com> <200806161210.57926.rusty@rustcorp.com.au> <4856728F.4020501@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Rusty Russell, Mark McLoughlin, kvm@vger.kernel.org
To: Anthony Liguori
Return-path:
Received: from il.qumranet.com ([212.179.150.194]:35735 "EHLO il.qumranet.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752778AbYFPO6M (ORCPT ); Mon, 16 Jun 2008 10:58:12 -0400
In-Reply-To: <4856728F.4020501@us.ibm.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Anthony Liguori wrote:
>>
>> In theory vringfd will get us zero copy from guest sendfile out to
>> external machines. For anything else we're doing a copy anyway, so
>> avoiding copying has no great benefit.
>>
>
> There's nothing that prevents zero-copy to be implemented for tun
> without vringfd. In fact, I seem to recall that your earlier patches
> implemented zero-copy :-)
>
> I like the vringfd model and I think it's a good way to move forward.
> My concern is that it introduces an extra syscall in the TX path.
> Right now, we do a single write call whereas with vringfd we need to
> insert the TX packet into the queue, do a notify, and then wait for
> indication that the TX has succeeded.
>
> I know we'll win with TSO but we don't need vringfd for TSO. The
> jury's still out IMHO as to whether we should do vringfd or just try
> to merge TSO tun patches.

tun+tso still doesn't give you zerocopy (unless you change it to use aio,
which re-introduces the syscall).
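
For reference, a toy count of the syscalls in the two TX paths described in the quoted text: one write per packet for tun, versus queueing packets in the ring and then doing one notify plus one wait per batch for vringfd. This is only a sketch under those assumptions; the function names and the batch_size parameter are made up for illustration and are not the tun or vringfd interface.

```python
# Toy model only: it counts syscalls and never touches tun or vringfd.
# All names below are hypothetical, not a real kernel or qemu API.


def tun_tx_syscalls(num_packets):
    """tun path: one write() per packet."""
    return num_packets


def vringfd_tx_syscalls(num_packets, batch_size):
    """vringfd path: packets are placed in the ring without a syscall;
    each batch then costs one notify plus one wait for completion."""
    if num_packets == 0:
        return 0
    batches = -(-num_packets // batch_size)  # ceiling division
    return 2 * batches


if __name__ == "__main__":
    n = 1024
    for batch in (1, 2, 16, 64):
        print(f"batch={batch:3d}  tun={tun_tx_syscalls(n)}  "
              f"vringfd={vringfd_tx_syscalls(n, batch)}")
```

In this model vringfd costs twice as many syscalls as tun when every packet is its own batch, breaks even at two packets per batch, and is cheaper beyond that.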

btw, the two vringfd syscalls are amortized over a potentially large
number of packets, whereas the single tun syscall is per-packet.

(note: we can get rid of the two syscalls as well by having each side
opportunistically pick up ring entries, like Xen does)

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.