Date: Wed, 15 Jan 2014 11:07:44 +0200
From: "Michael S. Tsirkin"
Message-ID: <20140115090744.GB1719@redhat.com>
References: <1389623119-15863-1-git-send-email-a.motakis@virtualopensystems.com> <20140114113327.GF27922@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v6 0/8] Vhost and vhost-net support for userspace based backends
To: Antonios Motakis
Cc: Luke Gorrie, snabb-devel@googlegroups.com, Nikolay Nikolaev, qemu-devel, VirtualOpenSystems Technical Team

On Tue, Jan 14, 2014 at 07:13:43PM +0100, Antonios Motakis wrote:
>
> On Tue, Jan 14, 2014 at 12:33 PM, Michael S. Tsirkin wrote:
>
> On Mon, Jan 13, 2014 at 03:25:11PM +0100, Antonios Motakis wrote:
> > In this patch series we would like to introduce our approach for
> > putting a virtio-net backend in an external userspace process. Our
> > eventual target is to run the network backend in the Snabbswitch
> > ethernet switch, while receiving traffic from a guest inside QEMU/KVM
> > which runs an unmodified virtio-net implementation.
> >
> > For this, we are working into extending vhost to allow equivalent
> > functionality for userspace.
> > Vhost already passes control of the data plane of virtio-net to
> > the host kernel; we want to realize a similar model, but for userspace.
> >
> > In this patch series the concept of a vhost-backend is introduced.
> >
> > We define two vhost backend types - vhost-kernel and vhost-user. The
> > former is the interface to the current kernel module implementation.
> > Its control plane is ioctl based. The data plane is the kernel directly
> > accessing the QEMU allocated, guest memory.
> >
> > In the new vhost-user backend, the control plane is based on
> > communication between QEMU and another userspace process using a unix
> > domain socket. This allows to implement a virtio backend for a guest
> > running in QEMU, inside the other userspace process.
> >
> > We change -mem-path to QemuOpts and add prealloc, share and unlink as
> > properties to it. HugeTLBFS requirements of -mem-path are relaxed, so
> > any valid path can be used now. The new properties allow more fine
> > grained control over the guest RAM backing store.
> >
> > The data path is realized by directly accessing the vrings and the
> > buffer data off the guest's memory.
> >
> > The current user of vhost-user is only vhost-net. We add new netdev
> > backend that is intended to initialize vhost-net with vhost-user backend.
>
> Some meta comments.
>
> Something that makes this patch harder to review is how it's
> split up. Generally IMHO it's not a good idea to repeatedly
> edit same part of file adding stuff in patch after patch,
> it's only making things harder to read if you add stubs, then fill them up.
> (we do this sometimes when we are changing existing code, but
> it is generally not needed when adding new code)
>
> Instead, split it like this:
>
> 1. general refactoring, split out linux specific and generic parts
>    and add the ops indirection
> 2. add new files for vhost-user with complete implementation.
>    without command line to support it, there will be no way to use it,
>    but should build fine.
> 3. tie it all up with option parsing
>
>
> Generic vhost and vhost net files should be kept separate.
> Don't let vhost net stuff seep back into generic files,
> we have vhost-scsi too.
> I would also prefer that userspace vhost has its own files.
>
>
> Ok, we'll keep this into account.
>
>
> We need a small test server qemu can talk to, to verify things
> actually work.
>
>
> We have implemented such a test app: https://github.com/virtualopensystems/vapp
>
> We use it for testing, and also as a reference implementation. A client
> is also included.
>

Sounds good. Can we include this in qemu and tie
it into the qtest framework?
From a brief look, it merely needs to be tweaked for portability, unless

>
> Already commented on: reuse the chardev syntax and preferably code.
> We already support a bunch of options there for
> domain sockets that will be useful here, they should
> work here as well.
>
>
> We adapted the syntax for this to be consistent with chardev. What we
> didn't use, it is not obvious at all to us on how they should be used;
> a lot of the chardev options just don't apply to us.
>

Well server option should work at least.
nowait can work too?

Also, if reconnect is useful it should be for chardevs too, so if we don't
share code, need to code it in two places to stay consistent.

Overall sharing some code might be better ...

> In particular you shouldn't require filesystem access by qemu,
> passing fd for domain socket should work.
>
>
> We can add an option to pass an fd for the domain socket if needed.
> However as far as we understand, chardev doesn't do that either (at
> least from looking at the man page). Maybe we misunderstand what you mean.

Sorry. I got confused with e.g. tap which has this.
This might be useful but does not have to block this patch.
>
>
> > Example usage:
> >
> > qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
> >      -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
> >      -device virtio-net-pci,netdev=net0
>
> It's not clear which parts of -mem-path are required for vhost-user.
> It should be documented somewhere, made clear in -help
> and should fail gracefully if misconfigured.
>
>
> Ok.
>
>
> >
> > Changes from v5:
> >  - Split -mem-path unlink option to a separate patch
> >  - Fds are passed only in the ancillary data
> >  - Stricter message size checks on receive/send
> >  - Netdev vhost-user now includes path and poll_time options
> >  - The connection probing interval is configurable
> >
> > Changes from v4:
> >  - Use error_report for errors
> >  - VhostUserMsg has new field `size` indicating the following payload
> >    length. Field `flags` now has version and reply bits. The structure
> >    is packed.
> >  - Send data is of variable length (`size` field in message)
> >  - Receive in 2 steps, header and payload
> >  - Add new message type VHOST_USER_ECHO, to check connection status
> >
> > Changes from v3:
> >  - Convert -mem-path to QemuOpts with prealloc, share and unlink
> >    properties
> >  - Set 1 sec timeout when read/write to the unix domain socket
> >  - Fix file descriptor leak
> >
> > Changes from v2:
> >  - Reconnect when the backend disappears
> >
> > Changes from v1:
> >  - Implementation of vhost-user netdev backend
> >  - Code improvements
> >
> > Antonios Motakis (8):
> >   Convert -mem-path to QemuOpts and add prealloc and share properties
> >   New -mem-path option - unlink.
> >   Decouple vhost from kernel interface
> >   Add vhost-user skeleton
> >   Add domain socket communication for vhost-user backend
> >   Add vhost-user calls implementation
> >   Add new vhost-user netdev backend
> >   Add vhost-user reconnection
> >
> >  exec.c                            |  57 +++-
> >  hmp-commands.hx                   |   4 +-
> >  hw/net/vhost_net.c                | 144 +++++++---
> >  hw/net/virtio-net.c               |  42 ++-
> >  hw/scsi/vhost-scsi.c              |  13 +-
> >  hw/virtio/Makefile.objs           |   2 +-
> >  hw/virtio/vhost-backend.c         | 556 ++++++++++++++++++++++++++++++++++++++
> >  hw/virtio/vhost.c                 |  46 ++--
> >  include/exec/cpu-all.h            |   3 -
> >  include/hw/virtio/vhost-backend.h |  40 +++
> >  include/hw/virtio/vhost.h         |   4 +-
> >  include/net/vhost-user.h          |  17 ++
> >  include/net/vhost_net.h           |  15 +-
> >  net/Makefile.objs                 |   2 +-
> >  net/clients.h                     |   3 +
> >  net/hub.c                         |   1 +
> >  net/net.c                         |   2 +
> >  net/tap.c                         |  16 +-
> >  net/vhost-user.c                  | 177 ++++++++++++
> >  qapi-schema.json                  |  21 +-
> >  qemu-options.hx                   |  24 +-
> >  vl.c                              |  41 ++-
> >  22 files changed, 1106 insertions(+), 124 deletions(-)
> >  create mode 100644 hw/virtio/vhost-backend.c
> >  create mode 100644 include/hw/virtio/vhost-backend.h
> >  create mode 100644 include/net/vhost-user.h
> >  create mode 100644 net/vhost-user.c
> >
> > --
> > 1.8.3.2
> >
>
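
The v4 and v5 changelog entries quoted above mention a packed message with a
`size` field, a two-step receive (header, then payload) and stricter size
checks. A minimal C sketch of that receive pattern follows. The layout here
(three 32-bit fields and a 256-byte payload limit) and all Demo*/demo_* names
are assumptions for illustration only; the actual VhostUserMsg is defined in
the patches and may well differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define DEMO_MAX_PAYLOAD 256u   /* assumed limit, not from the patches */

/* Hypothetical packed header; `size` is the length of what follows. */
typedef struct {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
} __attribute__((packed)) DemoMsgHeader;

typedef struct {
    DemoMsgHeader hdr;
    uint8_t payload[DEMO_MAX_PAYLOAD];
} __attribute__((packed)) DemoMsg;

static_assert(sizeof(DemoMsgHeader) == 12, "header must have no padding");

/* Receive exactly len bytes; -1 on error or if the peer closed early. */
static int read_full(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0) {
            return -1;
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send exactly len bytes; -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Variable-length send: the header plus `size` payload bytes only. */
int demo_send(int fd, uint32_t request, const void *payload, uint32_t size)
{
    DemoMsg msg;
    if (size > DEMO_MAX_PAYLOAD) {
        return -1;
    }
    memset(&msg, 0, sizeof(msg));
    msg.hdr.request = request;
    msg.hdr.size = size;
    if (size > 0) {
        memcpy(msg.payload, payload, size);
    }
    return write_full(fd, &msg, sizeof(msg.hdr) + size);
}

/* Two-step receive: header first, size check, then the payload. */
int demo_recv(int fd, DemoMsg *msg)
{
    if (read_full(fd, &msg->hdr, sizeof(msg->hdr)) != 0) {
        return -1;
    }
    if (msg->hdr.size > DEMO_MAX_PAYLOAD) {
        return -1;   /* reject before reading into the buffer */
    }
    return read_full(fd, msg->payload, msg->hdr.size);
}
```

Checking `size` between the two reads is what keeps a misbehaving peer from
making the receiver read past its buffer; both ends can be exercised in a
single process over a socketpair().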