Date: Wed, 29 May 2013 17:12:39 +0300
From: "Michael S. Tsirkin"
Message-ID: <20130529141239.GB10347@redhat.com>
References: <20130523085034.GA16142@redhat.com> <519F35B7.6010408@redhat.com> <20130524113542.GA7046@redhat.com> <8738tctrox.fsf@codemonkey.ws> <20130524140024.GA12024@redhat.com> <87li6yodgq.fsf@rustcorp.com.au> <87k3miq6sw.fsf@codemonkey.ws>
In-Reply-To: <87k3miq6sw.fsf@codemonkey.ws>
Subject: Re: [Qemu-devel] updated: kvm networking todo wiki
To: Anthony Liguori
Cc: Krishna Kumar2, lmr@redhat.com, "Xin, Xiaohui", Shirley Ma, akong@redhat.com, kvm@vger.kernel.org, sriram.narasimhan@hp.com, netdev@vger.kernel.org, Jason Wang, Rusty Russell, virtualization@lists.linux-foundation.org, David Stevens, qemu-devel@nongnu.org, vyasevic@redhat.com, herbert@gondor.hengli.com.au, jdike@linux.intel.com, sri@linux.vnet.ibm.com

On Wed, May 29, 2013 at 08:01:03AM -0500, Anthony Liguori wrote:
> Rusty Russell writes:
>
> > "Michael S. Tsirkin" writes:
> >> On Fri, May 24, 2013 at 08:47:58AM -0500, Anthony Liguori wrote:
> >>> "Michael S. Tsirkin" writes:
> >>>
> >>> > On Fri, May 24, 2013 at 05:41:11PM +0800, Jason Wang wrote:
> >>> >> On 05/23/2013 04:50 PM, Michael S. Tsirkin wrote:
> >>> >> > Hey guys,
> >>> >> > I've updated the kvm networking todo wiki with current projects.
> >>> >> > Will try to keep it up to date more often.
> >>> >> > Original announcement below.
> >>> >>
> >>> >> Thanks a lot. I've added the tasks I'm currently working on to the wiki.
> >>> >>
> >>> >> btw. I notice the virtio-net data plane was missing from the wiki. Is the
> >>> >> project still being considered?
> >>> >
> >>> > It might have been interesting several years ago, but now that Linux has
> >>> > vhost-net in the kernel, the only point seems to be to
> >>> > speed up networking on non-Linux hosts.
> >>>
> >>> Data plane just means having a dedicated thread for virtqueue processing
> >>> that doesn't hold qemu_mutex.
> >>>
> >>> Of course we're going to do this in QEMU. It's a no-brainer. But not
> >>> as a separate device, just as an improvement to the existing userspace
> >>> virtio-net.
> >>>
> >>> > Since non-Linux does not have kvm, I doubt virtio is a bottleneck.
> >>>
> >>> FWIW, I think what's more interesting is using vhost-net as a networking
> >>> backend with virtio-net in QEMU being what's guest facing.
> >>>
> >>> In theory, this gives you the best of both worlds: QEMU acts as a first
> >>> line of defense against a malicious guest while still getting the
> >>> performance advantages of vhost-net (zero-copy).
> >>
> >> Great idea, that sounds very interesting.
> >>
> >> I'll add it to the wiki.
> >>
> >> In fact, a bit of complexity in vhost was put there in the vague hope of
> >> supporting something like this: virtio rings are not translated through
> >> regular memory tables; instead, vhost gets a pointer to the ring address.
> >>
> >> This allows qemu to act as a man in the middle,
> >> verifying the descriptors but not touching the
> >>
> >> Anyone interested in working on such a project?
> >
> > It would be an interesting idea if we didn't already have the vhost
> > model where we don't need the userspace bounce.
> The model is very interesting for QEMU because then we can use vhost as
> a backend for other types of network adapters (like vmxnet3 or even
> e1000).
>
> It also helps for things like fault tolerance where we need to be able
> to control packet flow within QEMU.
>
> Regards,
>
> Anthony Liguori

It was also floated as an alternative way to do live migration.

> > We already have two
> > sets of host side ring code in the kernel (vhost and vringh, though
> > they're being unified).
> >
> > All an accelerator can offer on the tx side is zero copy and direct
> > update of the used ring. On rx, userspace could register the buffers and
> > the accelerator could fill them and update the used ring. It still
> > needs to deal with merged buffers, for example.
> >
> > You avoid the address translation in the kernel, but I'm not convinced
> > that's a key problem.
> >
> > Cheers,
> > Rusty.