From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NZV4l-0007QQ-VY for qemu-devel@nongnu.org; Mon, 25 Jan 2010 14:58:20 -0500 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1NZV4h-0007JJ-Ag for qemu-devel@nongnu.org; Mon, 25 Jan 2010 14:58:19 -0500 Received: from [199.232.76.173] (port=44355 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NZV4h-0007J6-52 for qemu-devel@nongnu.org; Mon, 25 Jan 2010 14:58:15 -0500 Received: from mail-fx0-f222.google.com ([209.85.220.222]:48603) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1NZV4g-00048Z-Hw for qemu-devel@nongnu.org; Mon, 25 Jan 2010 14:58:14 -0500 Received: by fxm22 with SMTP id 22so8284955fxm.2 for ; Mon, 25 Jan 2010 11:58:13 -0800 (PST) Message-ID: <4B5DF7D0.7050609@codemonkey.ws> Date: Mon, 25 Jan 2010 13:58:08 -0600 From: Anthony Liguori MIME-Version: 1.0 References: <20100111172326.GB12084@redhat.com> In-Reply-To: <20100111172326.GB12084@redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: [Qemu-devel] Re: [PATCH-RFC 13/13] virtio-net: connect to vhost net backend List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org On 01/11/2010 11:23 AM, Michael S. Tsirkin wrote: > start/stop backend on driver start/stop > > Signed-off-by: Michael S. Tsirkin > --- > hw/virtio-net.c | 40 ++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 40 insertions(+), 0 deletions(-) > > diff --git a/hw/virtio-net.c b/hw/virtio-net.c > index c2a389f..99169e1 100644 > --- a/hw/virtio-net.c > +++ b/hw/virtio-net.c > @@ -17,6 +17,7 @@ > #include "net/tap.h" > #include "qemu-timer.h" > #include "virtio-net.h" > +#include "vhost_net.h" > > #define VIRTIO_NET_VM_VERSION 11 > > @@ -47,6 +48,7 @@ typedef struct VirtIONet > uint8_t nomulti; > uint8_t nouni; > uint8_t nobcast; > + uint8_t vhost_started; > struct { > int in_use; > int first_multi; > @@ -114,6 +116,10 @@ static void virtio_net_reset(VirtIODevice *vdev) > n->nomulti = 0; > n->nouni = 0; > n->nobcast = 0; > + if (n->vhost_started) { > + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); > + n->vhost_started = 0; > + } > > /* Flush any MAC and VLAN filter table state */ > n->mac_table.in_use = 0; > @@ -820,6 +826,36 @@ static NetClientInfo net_virtio_info = { > .link_status_changed = virtio_net_set_link_status, > }; > > +static void virtio_net_set_status(struct VirtIODevice *vdev) > +{ > + VirtIONet *n = to_virtio_net(vdev); > + if (!n->nic->nc.peer) { > + return; > + } > + if (n->nic->nc.peer->info->type != NET_CLIENT_TYPE_TAP) { > + return; > + } > + > + if (!tap_get_vhost_net(n->nic->nc.peer)) { > + return; > + } > + if (!!n->vhost_started == !!(vdev->status& VIRTIO_CONFIG_S_DRIVER_OK)) { > + return; > + } > + if (vdev->status& VIRTIO_CONFIG_S_DRIVER_OK) { > + int r = vhost_net_start(tap_get_vhost_net(n->nic->nc.peer), vdev); > + if (r< 0) { > + fprintf(stderr, "unable to start vhost net: " > + "falling back on userspace virtio\n"); > + } else { > + n->vhost_started = 1; > + } > + } else { > + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); > + n->vhost_started = 0; > + } > +} > + > This function does a number of bad things. It makes virtio-net have specific knowledge of backends (like tap) and then has virtio-net pass device specific state (vdev) to a network backend. Ultimately, the following things need to happen: 1) when a driver is ready to begin operating: a) virtio-net needs to tell vhost the location of the ring in physical memory b) virtio-net needs to tell vhost about any notification it receives (allowing kvm to shortcut userspace) c) virtio-net needs to tell vhost about which irq is being used (allowing kvm to shortcut userspace) What this suggests is that we need an API for the network backends to do the following: 1) probe whether ring passthrough is supported 2) set the *virtual* address of the ring elements 3) hand the backend some sort of notification object for sending and receiving notifications 4) stop ring passthrough vhost should not need any direct knowledge of the device. This interface should be enough to communicating the required data. I think the only bit that is a little fuzzy and perhaps non-obvious for the current patches is the notification object. I would expect it look something like: typedef struct IOEvent { int type; void (*notify)(IOEvent *); void (*on_notify)(IOEvent *, void (*cb)(IOEvent *, void *)); } IOEvent; And then we would have typedef struct KVMIOEvent { IOEvent event = {.type = KVM_IO_EVENT}; int fd; } KVMIOEvent; if (kvm_enabled()) in virtio-net, we would allocate a KVMIOEvent for the PIO notification and hand that to vhost. vhost would check event.type and if it's KVM_IOEVENT, downcast and use that to get at the file descriptor. The KVMIOEvent should be created, not in the set status callback, but in the PCI map callback. And in the PCI map callback, cpu_single_env should be passed to a kvm specific function to create this KVMIOEvent object that contains the created eventfd() that's handed to kvm via ioctl. It doesn't have to be exactly like this, but the point of all of this is that these KVM specific mechanism (which are really implementation details) should not be pervasive throughout the QEMU interfaces. There should also be strong separation between the vhost-net code and the virtio-net device. Regards, Anthony Liguori > VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) > { > VirtIONet *n; > @@ -835,6 +871,7 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) > n->vdev.set_features = virtio_net_set_features; > n->vdev.bad_features = virtio_net_bad_features; > n->vdev.reset = virtio_net_reset; > + n->vdev.set_status = virtio_net_set_status; > n->rx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_rx); > n->tx_vq = virtio_add_queue(&n->vdev, 256, virtio_net_handle_tx); > n->ctrl_vq = virtio_add_queue(&n->vdev, 64, virtio_net_handle_ctrl); > @@ -864,6 +901,9 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) > void virtio_net_exit(VirtIODevice *vdev) > { > VirtIONet *n = DO_UPCAST(VirtIONet, vdev, vdev); > + if (n->vhost_started) { > + vhost_net_stop(tap_get_vhost_net(n->nic->nc.peer), vdev); > + } > > qemu_purge_queued_packets(&n->nic->nc); > >