From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [PATCH 4/5] kvm: qemu: Use vringfd to eliminate copies
Date: Tue, 17 Jun 2008 09:54:30 -0500
Message-ID: <4857D026.1080304@us.ibm.com>
References: <1213365481-23460-1-git-send-email-markmc@redhat.com>
 <1213365481-23460-2-git-send-email-markmc@redhat.com>
 <1213365481-23460-3-git-send-email-markmc@redhat.com>
 <1213365481-23460-4-git-send-email-markmc@redhat.com>
 <1213365481-23460-5-git-send-email-markmc@redhat.com>
 <48545422.1040109@us.ibm.com>
 <1213711726.31834.24.camel@muff>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity , Rusty Russell , kvm@vger.kernel.org
To: Mark McLoughlin
Return-path:
Received: from e1.ny.us.ibm.com ([32.97.182.141]:45628 "EHLO e1.ny.us.ibm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1756052AbYFQOyq (ORCPT ); Tue, 17 Jun 2008 10:54:46 -0400
Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234])
 by e1.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m5HEsk3l017321
 for ; Tue, 17 Jun 2008 10:54:46 -0400
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])
 by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v9.0) with ESMTP id m5HEskKQ216900
 for ; Tue, 17 Jun 2008 10:54:46 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1])
 by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m5HEsjlU015351
 for ; Tue, 17 Jun 2008 10:54:46 -0400
In-Reply-To: <1213711726.31834.24.camel@muff>
Sender: kvm-owner@vger.kernel.org
List-ID:

Mark McLoughlin wrote:
> Hi Anthony,
>
> On Sat, 2008-06-14 at 18:28 -0500, Anthony Liguori wrote:
>
>> This patch set is useful for testing (I have one too in my patch
>> queue).
>>
>
> Ah, didn't know of your queue ... Is it (http://hg.codemonkey.ws/) down
> atm?
>

It's up now.

>> We need to make some more pervasive changes to QEMU though to
>> take advantage of vringfd upstream.
>>
>> Specifically, we need to introduce a RX/TX buffer adding/polling API for
>> VLANClientState. We can then use this within a vringfd VLAN client to
>> push the indexes to vringfd.
>>
>
> I don't think I'm following you fully on this.
>
> The TX side is fine - guest adds buffer to ring, virtio VLANClient calls
> ->add_tx_buffer() on every other VLANClient, waits until all are
> finished sending and notifies the guest that we're done.
>
> But the RX side? The guest allocates the buffers, so does the virtio
> VLANClient divide those buffers between every other VLANClient?

This is where things get tricky. Internally, it will have to copy the
TX buffer into each client's RX buffers. We need to special-case the
circumstance where the only other VLANClientState is a vringfd client
so that we can pass the RX buffer directly to it. I haven't come up
with a perfect API just yet, but that's what we need to do.

> Or does
> it make all buffers available to all clients and have a way of locking a
> buffer just before using it? The former would be a waste, and we don't
> have any way of doing the latter right now with vringfd.
>
> Also, since a client could be supplied with RX buffers from multiple
> other clients, tun/tap would need to support multiple RX rings.
>
> It really makes one wonder whether QEMU's VLAN feature is really worth
> all this bother.
>

It's necessary for upstream acceptance, and solving the problem right
will give us vringfd support for the e1000 for free.

> Oh yes, there's also GSO feature negotiation; you'd need to have a way
> of figuring out what clients support what GSO features, which is
> fine ... except for what to do in the case of hotplug.
>

Right, we need a feature API.

Regards,

Anthony Liguori

>> We can't use the base/limit stuff in QEMU so we have to do
>> translation. Not a big deal really.
>>
>
> Yeah, that's not a problem.
>
>
>> Have you benchmarked the driver?
>> I wasn't seeing great performance
>> myself although I think that was due to some bugs in the vringfd code.
>>
>
> Nope, I haven't done any real benchmarking with it yet.
>
> Cheers,
> Mark.
>
>
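
For readers following the thread, the copy-versus-pass-through distinction and the "feature API" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct client, vlan_send(), vlan_common_features() and the 2048-byte buffer size are hypothetical names and values invented for this sketch. They are not QEMU's VLANClientState API and not the vringfd interface, and the thread itself says no final API had been settled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not QEMU's real types. */
enum client_kind { CLIENT_COPY, CLIENT_VRINGFD };

struct client {
    enum client_kind kind;
    uint32_t features;        /* bitmask of offloads (e.g. GSO types) accepted */
    uint8_t rx_buf[2048];     /* private RX buffer used on the copy path */
    size_t rx_len;
    const uint8_t *direct;    /* buffer handed over without a copy, if any */
    size_t direct_len;
};

/*
 * Deliver a TX buffer from 'src' to every other client on the VLAN.
 * If the only peer is a vringfd-style client, hand the buffer over
 * directly; otherwise copy it into each peer's RX buffer.
 * Returns the number of copies made (0 means pass-through or no peers).
 */
static int vlan_send(struct client **clients, size_t n,
                     const struct client *src,
                     const uint8_t *buf, size_t len)
{
    size_t peers = 0;
    struct client *only = NULL;

    for (size_t i = 0; i < n; i++) {
        if (clients[i] != src) {
            peers++;
            only = clients[i];
        }
    }

    /* Special case: a single vringfd peer gets the buffer without a copy. */
    if (peers == 1 && only->kind == CLIENT_VRINGFD) {
        only->direct = buf;
        only->direct_len = len;
        return 0;
    }

    /* General case: one copy per peer. */
    int copies = 0;
    for (size_t i = 0; i < n; i++) {
        struct client *c = clients[i];
        if (c == src || len > sizeof(c->rx_buf))
            continue;             /* skip the sender and oversized packets */
        memcpy(c->rx_buf, buf, len);
        c->rx_len = len;
        c->direct = NULL;
        copies++;
    }
    return copies;
}

/*
 * One possible shape for a feature query: only offloads that every
 * client accepts can be used. It would have to be recomputed whenever
 * a client is hot-plugged or removed.
 */
static uint32_t vlan_common_features(struct client **clients, size_t n)
{
    if (n == 0)
        return 0;
    uint32_t common = ~(uint32_t)0;
    for (size_t i = 0; i < n; i++)
        common &= clients[i]->features;
    return common;
}
```

With two clients where the peer is CLIENT_VRINGFD, vlan_send() makes no copies; adding a third client forces the copy path for every peer, which is the cost the thread is trying to avoid in the common case. The sketch leaves the hotplug question open, just as the thread does.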