qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends
@ 2013-11-29 19:52 Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 1/5] Decouple vhost from kernel interface Antonios Motakis
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev

In this patch series we would like to introduce our approach for putting a
virtio-net backend in an external userspace process. Our eventual target is to
run the network backend in the Snabbswitch ethernet switch, while receiving
traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
implementation.

For this, we are working on extending vhost to allow equivalent functionality
for userspace. Vhost already passes control of the data plane of virtio-net to
the host kernel; we want to realize a similar model, but for userspace.

In this patch series the concept of a vhost-backend is introduced.

We define two vhost backend types - vhost-kernel and vhost-user. The former is
the interface to the current kernel module implementation. Its control plane is
ioctl based. The data plane is the kernel directly accessing the QEMU-allocated
guest memory.

In the new vhost-user backend, the control plane is based on communication
between QEMU and another userspace process using a unix domain socket. This
allows implementing a virtio backend for a guest running in QEMU inside the
other userspace process.

The guest memory needs to be allocated using the '-mem-path' and '-mem-prealloc'
command line options. This also implies the use of hugetlbfs, and allows the
backend userspace process to access the guest's memory: the preallocated RAM
file descriptor is shared with the vhost-user backend process.

The data path is realized by the backend directly accessing the vrings and the
buffer data in the guest's memory.
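
To make this a bit more concrete (this is outside the scope of the series, and
all names below are hypothetical), the backend process would translate guest
physical addresses found in the vring descriptors into pointers inside its own
mapping of the shared memory, roughly along these lines:

#include <stdint.h>
#include <stddef.h>

/* One entry per memory region announced by QEMU. */
struct backend_region {
    uint64_t guest_phys_addr;   /* guest physical base of the region */
    uint64_t memory_size;
    void    *mmap_addr;         /* local mapping of the shared fd */
};

/* Translate a guest physical address from a vring descriptor into a
 * pointer valid in the backend process, or NULL if no region covers it. */
static void *gpa_to_va(struct backend_region *regs, int nregs, uint64_t gpa)
{
    int i;

    for (i = 0; i < nregs; i++) {
        if (gpa >= regs[i].guest_phys_addr &&
            gpa - regs[i].guest_phys_addr < regs[i].memory_size) {
            return (uint8_t *)regs[i].mmap_addr +
                   (gpa - regs[i].guest_phys_addr);
        }
    }
    return NULL;
}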

Currently the only user of vhost-user is vhost-net, for which we add a new
'tap' network backend option - 'vhostsock'. This parameter specifies the name
of a unix domain socket on which the backend vhost-user process waits for a
connection. This is temporary scaffolding, as in the future we expect
vhost-user to be independent of the tap backend.

Depending on whether the parameter is set, the 'tap'-initialised vhost-net
will switch between vhost-kernel and vhost-user.

However, since we use another process as the network backend, QEMU should now
be agnostic of the network backend used. In future versions of this series, we
intend to introduce a new QEMU network backend that is specific to vhost-user.

Current issues to be fixed:
 - No migration
 - (Probably) no ram hotplug
 - Will not start if the socket is not available
 - No reconnect when the backend disappears
 - Decouple vhost-net from the tap net backend when used with vhost-user

Antonios Motakis (5):
  Decouple vhost from kernel interface
  Add vhost-kernel and the vhost-user skeleton
  Add vhostsock option
  Add domain socket communication for vhost-user backend
  Add vhost-user calls implementation

 hw/net/vhost_net.c                |  29 ++--
 hw/scsi/vhost-scsi.c              |   9 +-
 hw/virtio/Makefile.objs           |   2 +-
 hw/virtio/vhost-backend.c         | 322 ++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 |  44 +++---
 include/hw/virtio/vhost-backend.h |  28 ++++
 include/hw/virtio/vhost.h         |   4 +-
 include/net/vhost_net.h           |   5 +-
 net/tap.c                         |   4 +-
 qapi-schema.json                  |   3 +
 qemu-options.hx                   |   3 +-
 11 files changed, 413 insertions(+), 40 deletions(-)
 create mode 100644 hw/virtio/vhost-backend.c
 create mode 100644 include/hw/virtio/vhost-backend.h

-- 
1.8.3.2

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/5] Decouple vhost from kernel interface
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
@ 2013-11-29 19:52 ` Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton Antonios Motakis
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Michael S. Tsirkin, n.nikolaev, Paolo Bonzini,
	lukego, Antonios Motakis, tech, KONRAD Frederic

We introduce the concept of vhost-backend, which can be either vhost-kernel
or vhost-user. The existing vhost interface to the kernel is abstracted
behind the vhost-kernel backend.

We replace all direct ioctls to the kernel with a vhost_call to the backend.
The vhost_dev->control descriptor is now referenced only in vhost-backend
(ioctl, open, close).

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/net/vhost_net.c                |  8 +++----
 hw/scsi/vhost-scsi.c              |  7 +++---
 hw/virtio/Makefile.objs           |  2 +-
 hw/virtio/vhost-backend.c         | 48 +++++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c                 | 40 ++++++++++++++++----------------
 include/hw/virtio/vhost-backend.h | 21 +++++++++++++++++
 6 files changed, 97 insertions(+), 29 deletions(-)
 create mode 100644 hw/virtio/vhost-backend.c
 create mode 100644 include/hw/virtio/vhost-backend.h

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 006576d..0d1943f 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -27,7 +27,6 @@
 #include <sys/socket.h>
 #include <linux/kvm.h>
 #include <fcntl.h>
-#include <sys/ioctl.h>
 #include <linux/virtio_ring.h>
 #include <netpacket/packet.h>
 #include <net/ethernet.h>
@@ -37,6 +36,7 @@
 #include <stdio.h>
 
 #include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio-bus.h"
 
 struct vhost_net {
@@ -170,7 +170,7 @@ static int vhost_net_start_one(struct vhost_net *net,
     qemu_set_fd_handler(net->backend, NULL, NULL, NULL);
     file.fd = net->backend;
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        r = vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         if (r < 0) {
             r = -errno;
             goto fail;
@@ -180,7 +180,7 @@ static int vhost_net_start_one(struct vhost_net *net,
 fail:
     file.fd = -1;
     while (file.index-- > 0) {
-        int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        int r = vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
     net->nc->info->poll(net->nc, true);
@@ -201,7 +201,7 @@ static void vhost_net_stop_one(struct vhost_net *net,
     }
 
     for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
-        int r = ioctl(net->dev.control, VHOST_NET_SET_BACKEND, &file);
+        int r = vhost_call(&net->dev, VHOST_NET_SET_BACKEND, &file);
         assert(r >= 0);
     }
     net->nc->info->poll(net->nc, true);
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 9e770fb..14d5030 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -21,6 +21,7 @@
 #include "migration/migration.h"
 #include "hw/virtio/vhost-scsi.h"
 #include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-bus.h"
 
@@ -32,7 +33,7 @@ static int vhost_scsi_set_endpoint(VHostSCSI *s)
 
     memset(&backend, 0, sizeof(backend));
     pstrcpy(backend.vhost_wwpn, sizeof(backend.vhost_wwpn), vs->conf.wwpn);
-    ret = ioctl(s->dev.control, VHOST_SCSI_SET_ENDPOINT, &backend);
+    ret = vhost_call(&s->dev, VHOST_SCSI_SET_ENDPOINT, &backend);
     if (ret < 0) {
         return -errno;
     }
@@ -46,7 +47,7 @@ static void vhost_scsi_clear_endpoint(VHostSCSI *s)
 
     memset(&backend, 0, sizeof(backend));
     pstrcpy(backend.vhost_wwpn, sizeof(backend.vhost_wwpn), vs->conf.wwpn);
-    ioctl(s->dev.control, VHOST_SCSI_CLEAR_ENDPOINT, &backend);
+    vhost_call(&s->dev, VHOST_SCSI_CLEAR_ENDPOINT, &backend);
 }
 
 static int vhost_scsi_start(VHostSCSI *s)
@@ -61,7 +62,7 @@ static int vhost_scsi_start(VHostSCSI *s)
         return -ENOSYS;
     }
 
-    ret = ioctl(s->dev.control, VHOST_SCSI_GET_ABI_VERSION, &abi_version);
+    ret = vhost_call(&s->dev, VHOST_SCSI_GET_ABI_VERSION, &abi_version);
     if (ret < 0) {
         return -errno;
     }
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index 1ba53d9..51e5bdb 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -5,4 +5,4 @@ common-obj-y += virtio-mmio.o
 common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/
 
 obj-y += virtio.o virtio-balloon.o 
-obj-$(CONFIG_LINUX) += vhost.o
+obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
new file mode 100644
index 0000000..05de174
--- /dev/null
+++ b/hw/virtio/vhost-backend.c
@@ -0,0 +1,48 @@
+/*
+ * vhost-backend
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ * Written by Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+
+static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
+        void *arg)
+{
+    int fd = dev->control;
+    return ioctl(fd, request, arg);
+}
+
+int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
+{
+    int result;
+
+    result = vhost_kernel_call(dev, request, arg);
+
+    return result;
+}
+
+int vhost_backend_init(struct vhost_dev *dev, const char *devpath)
+{
+    int fd = -1;
+
+    fd = open(devpath, O_RDWR);
+    dev->control = fd;
+
+    return fd;
+}
+
+int vhost_backend_cleanup(struct vhost_dev *dev)
+{
+    return close(dev->control);
+}
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 9e336ad..42f4d5f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -13,8 +13,8 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
 
-#include <sys/ioctl.h>
 #include "hw/virtio/vhost.h"
+#include "hw/virtio/vhost-backend.h"
 #include "hw/hw.h"
 #include "qemu/atomic.h"
 #include "qemu/range.h"
@@ -291,7 +291,7 @@ static inline void vhost_dev_log_resize(struct vhost_dev* dev, uint64_t size)
 
     log = g_malloc0(size * sizeof *log);
     log_base = (uint64_t)(unsigned long)log;
-    r = ioctl(dev->control, VHOST_SET_LOG_BASE, &log_base);
+    r = vhost_call(dev, VHOST_SET_LOG_BASE, &log_base);
     assert(r >= 0);
     /* Sync only the range covered by the old log */
     if (dev->log_size) {
@@ -460,7 +460,7 @@ static void vhost_commit(MemoryListener *listener)
     }
 
     if (!dev->log_enabled) {
-        r = ioctl(dev->control, VHOST_SET_MEM_TABLE, dev->mem);
+        r = vhost_call(dev, VHOST_SET_MEM_TABLE, dev->mem);
         assert(r >= 0);
         dev->memory_changed = false;
         return;
@@ -473,7 +473,7 @@ static void vhost_commit(MemoryListener *listener)
     if (dev->log_size < log_size) {
         vhost_dev_log_resize(dev, log_size + VHOST_LOG_BUFFER);
     }
-    r = ioctl(dev->control, VHOST_SET_MEM_TABLE, dev->mem);
+    r = vhost_call(dev, VHOST_SET_MEM_TABLE, dev->mem);
     assert(r >= 0);
     /* To log less, can only decrease log size after table update. */
     if (dev->log_size > log_size + VHOST_LOG_BUFFER) {
@@ -541,7 +541,7 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
         .log_guest_addr = vq->used_phys,
         .flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0,
     };
-    int r = ioctl(dev->control, VHOST_SET_VRING_ADDR, &addr);
+    int r = vhost_call(dev, VHOST_SET_VRING_ADDR, &addr);
     if (r < 0) {
         return -errno;
     }
@@ -555,7 +555,7 @@ static int vhost_dev_set_features(struct vhost_dev *dev, bool enable_log)
     if (enable_log) {
         features |= 0x1 << VHOST_F_LOG_ALL;
     }
-    r = ioctl(dev->control, VHOST_SET_FEATURES, &features);
+    r = vhost_call(dev, VHOST_SET_FEATURES, &features);
     return r < 0 ? -errno : 0;
 }
 
@@ -670,13 +670,13 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
 
     vq->num = state.num = virtio_queue_get_num(vdev, idx);
-    r = ioctl(dev->control, VHOST_SET_VRING_NUM, &state);
+    r = vhost_call(dev, VHOST_SET_VRING_NUM, &state);
     if (r) {
         return -errno;
     }
 
     state.num = virtio_queue_get_last_avail_idx(vdev, idx);
-    r = ioctl(dev->control, VHOST_SET_VRING_BASE, &state);
+    r = vhost_call(dev, VHOST_SET_VRING_BASE, &state);
     if (r) {
         return -errno;
     }
@@ -718,7 +718,7 @@ static int vhost_virtqueue_start(struct vhost_dev *dev,
     }
 
     file.fd = event_notifier_get_fd(virtio_queue_get_host_notifier(vvq));
-    r = ioctl(dev->control, VHOST_SET_VRING_KICK, &file);
+    r = vhost_call(dev, VHOST_SET_VRING_KICK, &file);
     if (r) {
         r = -errno;
         goto fail_kick;
@@ -756,7 +756,7 @@ static void vhost_virtqueue_stop(struct vhost_dev *dev,
     };
     int r;
     assert(idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs);
-    r = ioctl(dev->control, VHOST_GET_VRING_BASE, &state);
+    r = vhost_call(dev, VHOST_GET_VRING_BASE, &state);
     if (r < 0) {
         fprintf(stderr, "vhost VQ %d ring restore failed: %d\n", idx, r);
         fflush(stderr);
@@ -798,7 +798,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
     }
 
     file.fd = event_notifier_get_fd(&vq->masked_notifier);
-    r = ioctl(dev->control, VHOST_SET_VRING_CALL, &file);
+    r = vhost_call(dev, VHOST_SET_VRING_CALL, &file);
     if (r) {
         r = -errno;
         goto fail_call;
@@ -822,17 +822,16 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
     if (devfd >= 0) {
         hdev->control = devfd;
     } else {
-        hdev->control = open(devpath, O_RDWR);
-        if (hdev->control < 0) {
+        if (vhost_backend_init(hdev, devpath) < 0) {
             return -errno;
         }
     }
-    r = ioctl(hdev->control, VHOST_SET_OWNER, NULL);
+    r = vhost_call(hdev, VHOST_SET_OWNER, NULL);
     if (r < 0) {
         goto fail;
     }
 
-    r = ioctl(hdev->control, VHOST_GET_FEATURES, &features);
+    r = vhost_call(hdev, VHOST_GET_FEATURES, &features);
     if (r < 0) {
         goto fail;
     }
@@ -877,7 +876,7 @@ fail_vq:
     }
 fail:
     r = -errno;
-    close(hdev->control);
+    vhost_backend_cleanup(hdev);
     return r;
 }
 
@@ -890,7 +889,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
     memory_listener_unregister(&hdev->memory_listener);
     g_free(hdev->mem);
     g_free(hdev->mem_sections);
-    close(hdev->control);
+    vhost_backend_cleanup(hdev);
 }
 
 bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev)
@@ -992,7 +991,7 @@ void vhost_virtqueue_mask(struct vhost_dev *hdev, VirtIODevice *vdev, int n,
     } else {
         file.fd = event_notifier_get_fd(virtio_queue_get_guest_notifier(vvq));
     }
-    r = ioctl(hdev->control, VHOST_SET_VRING_CALL, &file);
+    r = vhost_call(hdev, VHOST_SET_VRING_CALL, &file);
     assert(r >= 0);
 }
 
@@ -1007,7 +1006,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
     if (r < 0) {
         goto fail_features;
     }
-    r = ioctl(hdev->control, VHOST_SET_MEM_TABLE, hdev->mem);
+    r = vhost_call(hdev, VHOST_SET_MEM_TABLE, hdev->mem);
     if (r < 0) {
         r = -errno;
         goto fail_mem;
@@ -1026,8 +1025,7 @@ int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev)
         hdev->log_size = vhost_get_log_size(hdev);
         hdev->log = hdev->log_size ?
             g_malloc0(hdev->log_size * sizeof *hdev->log) : NULL;
-        r = ioctl(hdev->control, VHOST_SET_LOG_BASE,
-                  (uint64_t)(unsigned long)hdev->log);
+        r = vhost_call(hdev, VHOST_SET_LOG_BASE, hdev->log);
         if (r < 0) {
             r = -errno;
             goto fail_log;
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
new file mode 100644
index 0000000..fc51b72
--- /dev/null
+++ b/include/hw/virtio/vhost-backend.h
@@ -0,0 +1,21 @@
+/*
+ * vhost-backend
+ *
+ * Copyright (c) 2013 Virtual Open Systems Sarl.
+ * Written by Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef VHOST_BACKEND_H_
+#define VHOST_BACKEND_H_
+
+struct vhost_dev;
+int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg);
+
+int vhost_backend_init(struct vhost_dev *dev, const char *devpath);
+int vhost_backend_cleanup(struct vhost_dev *dev);
+
+#endif /* VHOST_BACKEND_H_ */
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 1/5] Decouple vhost from kernel interface Antonios Motakis
@ 2013-11-29 19:52 ` Antonios Motakis
  2013-12-04 13:47   ` Stefan Hajnoczi
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 3/5] Add vhostsock option Antonios Motakis
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S. Tsirkin, n.nikolaev, Anthony Liguori, Paolo Bonzini,
	lukego, Antonios Motakis, tech

Introduce the backend types - vhost-kernel and vhost-user.
Add basic ioctl, open and close multiplexing depending on the selected backend.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/net/vhost_net.c                |  3 +--
 hw/scsi/vhost-scsi.c              |  4 ++--
 hw/virtio/vhost-backend.c         | 25 ++++++++++++++++++++++---
 hw/virtio/vhost.c                 |  6 ++++--
 include/hw/virtio/vhost-backend.h |  7 +++++++
 include/hw/virtio/vhost.h         |  4 +++-
 6 files changed, 39 insertions(+), 10 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 0d1943f..58a1880 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -36,7 +36,6 @@
 #include <stdio.h>
 
 #include "hw/virtio/vhost.h"
-#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio-bus.h"
 
 struct vhost_net {
@@ -113,7 +112,7 @@ struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
     net->dev.nvqs = 2;
     net->dev.vqs = net->vqs;
 
-    r = vhost_dev_init(&net->dev, devfd, "/dev/vhost-net", force);
+    r = vhost_dev_init(&net->dev, devfd, "/dev/vhost-net", VHOST_BACKEND_TYPE_KERNEL, force);
     if (r < 0) {
         goto fail;
     }
diff --git a/hw/scsi/vhost-scsi.c b/hw/scsi/vhost-scsi.c
index 14d5030..fdc5d44 100644
--- a/hw/scsi/vhost-scsi.c
+++ b/hw/scsi/vhost-scsi.c
@@ -21,7 +21,6 @@
 #include "migration/migration.h"
 #include "hw/virtio/vhost-scsi.h"
 #include "hw/virtio/vhost.h"
-#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio-scsi.h"
 #include "hw/virtio/virtio-bus.h"
 
@@ -226,7 +225,8 @@ static int vhost_scsi_init(VirtIODevice *vdev)
     s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs);
     s->dev.vq_index = 0;
 
-    ret = vhost_dev_init(&s->dev, vhostfd, "/dev/vhost-scsi", true);
+    ret = vhost_dev_init(&s->dev, vhostfd, "/dev/vhost-scsi",
+                         VHOST_BACKEND_TYPE_KERNEL, true);
     if (ret < 0) {
         error_report("vhost-scsi: vhost initialization failed: %s\n",
                 strerror(-ret));
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 05de174..80defe4 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -25,9 +25,18 @@ static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
 
 int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
 {
-    int result;
+    int result = -1;
 
-    result = vhost_kernel_call(dev, request, arg);
+    switch (dev->backend_type) {
+    case VHOST_BACKEND_TYPE_KERNEL:
+        result = vhost_kernel_call(dev, request, arg);
+        break;
+    case VHOST_BACKEND_TYPE_USER:
+        fprintf(stderr, "vhost-user not implemented\n");
+        break;
+    default:
+        fprintf(stderr, "Unknown vhost backend type\n");
+    }
 
     return result;
 }
@@ -36,7 +45,17 @@ int vhost_backend_init(struct vhost_dev *dev, const char *devpath)
 {
     int fd = -1;
 
-    fd = open(devpath, O_RDWR);
+    switch (dev->backend_type) {
+    case VHOST_BACKEND_TYPE_KERNEL:
+        fd = open(devpath, O_RDWR);
+        break;
+    case VHOST_BACKEND_TYPE_USER:
+        fprintf(stderr, "vhost-user not implemented\n");
+        break;
+    default:
+        fprintf(stderr, "Unknown vhost backend type\n");
+    }
+
     dev->control = fd;
 
     return fd;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 42f4d5f..35eeb5f 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -14,7 +14,6 @@
  */
 
 #include "hw/virtio/vhost.h"
-#include "hw/virtio/vhost-backend.h"
 #include "hw/hw.h"
 #include "qemu/atomic.h"
 #include "qemu/range.h"
@@ -815,10 +814,13 @@ static void vhost_virtqueue_cleanup(struct vhost_virtqueue *vq)
 }
 
 int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
-                   bool force)
+                   VhostBackendType backend_type, bool force)
 {
     uint64_t features;
     int i, r;
+
+    hdev->backend_type = backend_type;
+
     if (devfd >= 0) {
         hdev->control = devfd;
     } else {
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index fc51b72..970f033 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -12,6 +12,13 @@
 #ifndef VHOST_BACKEND_H_
 #define VHOST_BACKEND_H_
 
+typedef enum VhostBackendType {
+    VHOST_BACKEND_TYPE_NONE = 0,
+    VHOST_BACKEND_TYPE_KERNEL = 1,
+    VHOST_BACKEND_TYPE_USER = 2,
+    VHOST_BACKEND_TYPE_MAX = 3,
+} VhostBackendType;
+
 struct vhost_dev;
 int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg);
 
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index de24746..c08ae46 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -2,6 +2,7 @@
 #define VHOST_H
 
 #include "hw/hw.h"
+#include "hw/virtio/vhost-backend.h"
 #include "hw/virtio/virtio.h"
 #include "exec/memory.h"
 
@@ -48,10 +49,11 @@ struct vhost_dev {
     bool memory_changed;
     hwaddr mem_changed_start_addr;
     hwaddr mem_changed_end_addr;
+    VhostBackendType backend_type;
 };
 
 int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
-                   bool force);
+                   VhostBackendType backend_type, bool force);
 void vhost_dev_cleanup(struct vhost_dev *hdev);
 bool vhost_dev_query(struct vhost_dev *hdev, VirtIODevice *vdev);
 int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev);
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 3/5] Add vhostsock option
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 1/5] Decouple vhost from kernel interface Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton Antonios Motakis
@ 2013-11-29 19:52 ` Antonios Motakis
  2013-12-04 13:42   ` Stefan Hajnoczi
  2013-12-04 14:28   ` Eric Blake
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 4/5] Add domain socket communication for vhost-user backend Antonios Motakis
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Stefan Hajnoczi, Michael S. Tsirkin, Michael Tokarev,
	Markus Armbruster, n.nikolaev, Luiz Capitulino, Anthony Liguori,
	Paolo Bonzini, lukego, Antonios Motakis, tech

Adding a new tap network backend option - vhostsock. It points
to a named unix domain socket that will be used to communicate
with the vhost-user backend. This is a temporary workaround; in
future versions of this series vhost-user will be made
independent of the tap network backend.
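
For illustration only, assuming the option is combined with the existing
vhost=on flag and that the guest RAM is shared via -mem-path/-mem-prealloc
(the socket path below is made up), usage would look something like:

    qemu-system-x86_64 ... -mem-path /hugepages -mem-prealloc \
        -net nic,model=virtio \
        -net tap,vhost=on,vhostsock=/path/to/vhost-user.sock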

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/net/vhost_net.c      | 22 +++++++++++++++++-----
 include/net/vhost_net.h |  5 ++++-
 net/tap.c               |  4 +++-
 qapi-schema.json        |  3 +++
 qemu-options.hx         |  3 ++-
 5 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 58a1880..8c9d425 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -91,15 +91,27 @@ static int vhost_net_get_fd(NetClientState *backend)
     }
 }
 
-struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
-                                 bool force)
+struct vhost_net *vhost_net_init(NetClientState *backend, char *vhostsock,
+                                 int devfd, bool force)
 {
     int r;
     struct vhost_net *net = g_malloc(sizeof *net);
+    const char *backend_sock = 0;
+    VhostBackendType backend_type = VHOST_BACKEND_TYPE_NONE;
+
     if (!backend) {
         fprintf(stderr, "vhost-net requires backend to be setup\n");
         goto fail;
     }
+
+    if (vhostsock && strcmp(vhostsock, VHOST_NET_DEFAULT_SOCK) != 0) {
+        backend_type = VHOST_BACKEND_TYPE_USER;
+        backend_sock = vhostsock;
+    } else {
+        backend_type = VHOST_BACKEND_TYPE_KERNEL;
+        backend_sock = VHOST_NET_DEFAULT_SOCK;
+    }
+
     r = vhost_net_get_fd(backend);
     if (r < 0) {
         goto fail;
@@ -112,7 +124,7 @@ struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
     net->dev.nvqs = 2;
     net->dev.vqs = net->vqs;
 
-    r = vhost_dev_init(&net->dev, devfd, "/dev/vhost-net", VHOST_BACKEND_TYPE_KERNEL, force);
+    r = vhost_dev_init(&net->dev, devfd, backend_sock, backend_type, force);
     if (r < 0) {
         goto fail;
     }
@@ -282,8 +294,8 @@ void vhost_net_virtqueue_mask(VHostNetState *net, VirtIODevice *dev,
     vhost_virtqueue_mask(&net->dev, dev, idx, mask);
 }
 #else
-struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
-                                 bool force)
+struct vhost_net *vhost_net_init(NetClientState *backend, char *vhostsock,
+                                 int devfd, bool force)
 {
     error_report("vhost-net support is not compiled in");
     return NULL;
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 2d936bb..7bb6435 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -3,10 +3,13 @@
 
 #include "net/net.h"
 
+#define VHOST_NET_DEFAULT_SOCK  "/dev/vhost-net"
+
 struct vhost_net;
 typedef struct vhost_net VHostNetState;
 
-VHostNetState *vhost_net_init(NetClientState *backend, int devfd, bool force);
+VHostNetState *vhost_net_init(NetClientState *backend, char *vhostsock,
+                              int devfd, bool force);
 
 bool vhost_net_query(VHostNetState *net, VirtIODevice *dev);
 int vhost_net_start(VirtIODevice *dev, NetClientState *ncs, int total_queues);
diff --git a/net/tap.c b/net/tap.c
index 39c1cda..c4eba01 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -632,7 +632,9 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer,
             vhostfd = -1;
         }
 
-        s->vhost_net = vhost_net_init(&s->nc, vhostfd,
+        s->vhost_net = vhost_net_init(&s->nc,
+                                      tap->has_vhostsock ? tap->vhostsock : 0,
+                                      vhostfd,
                                       tap->has_vhostforce && tap->vhostforce);
         if (!s->vhost_net) {
             error_report("vhost-net requested but could not be initialized");
diff --git a/qapi-schema.json b/qapi-schema.json
index 83fa485..dc35929 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2891,6 +2891,8 @@
 #
 # @vhostforce: #optional vhost on for non-MSIX virtio guests
 #
+# @vhostsock: #optional vhost backend socket
+#
 # @queues: #optional number of queues to be created for multiqueue capable tap
 #
 # Since 1.2
@@ -2909,6 +2911,7 @@
     '*vhostfd':    'str',
     '*vhostfds':   'str',
     '*vhostforce': 'bool',
+    '*vhostsock':  'str',
     '*queues':     'uint32'} }
 
 ##
diff --git a/qemu-options.hx b/qemu-options.hx
index 8b94264..9f80720 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1370,7 +1370,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "-net tap[,vlan=n][,name=str],ifname=name\n"
     "                connect the host TAP network interface to VLAN 'n'\n"
 #else
-    "-net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n]\n"
+    "-net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n][,vhostsock=file]\n"
     "                connect the host TAP network interface to VLAN 'n'\n"
     "                use network scripts 'file' (default=" DEFAULT_NETWORK_SCRIPT ")\n"
     "                to configure it and 'dfile' (default=" DEFAULT_NETWORK_DOWN_SCRIPT ")\n"
@@ -1390,6 +1390,7 @@ DEF("net", HAS_ARG, QEMU_OPTION_net,
     "                use 'vhostfd=h' to connect to an already opened vhost net device\n"
     "                use 'vhostfds=x:y:...:z to connect to multiple already opened vhost net devices\n"
     "                use 'queues=n' to specify the number of queues to be created for multiqueue TAP\n"
+    "                use 'vhostsock=file' vhost-user backend socket\n"
     "-net bridge[,vlan=n][,name=str][,br=bridge][,helper=helper]\n"
     "                connects a host TAP network interface to a host bridge device 'br'\n"
     "                (default=" DEFAULT_BRIDGE_INTERFACE ") using the program 'helper'\n"
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 4/5] Add domain socket communication for vhost-user backend
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (2 preceding siblings ...)
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 3/5] Add vhostsock option Antonios Motakis
@ 2013-11-29 19:52 ` Antonios Motakis
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation Antonios Motakis
  2013-12-04 13:56 ` [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Stefan Hajnoczi
  5 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev, Michael S. Tsirkin

Add structures for passing vhost-user messages over a unix domain socket.
This is the equivalent of the existing vhost-kernel ioctls.

Connect to the named unix domain socket. The sendmsg system call
is used for communication. To be able to pass file descriptors
between processes, we use the SCM_RIGHTS type in the message control header.

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/virtio/vhost-backend.c | 164 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 162 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 80defe4..264a0a1 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -15,6 +15,115 @@
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <linux/vhost.h>
+
+#define VHOST_MEMORY_MAX_NREGIONS    8
+
+typedef enum VhostUserRequest {
+    VHOST_USER_NONE = 0,
+    VHOST_USER_GET_FEATURES = 1,
+    VHOST_USER_SET_FEATURES = 2,
+    VHOST_USER_SET_OWNER = 3,
+    VHOST_USER_RESET_OWNER = 4,
+    VHOST_USER_SET_MEM_TABLE = 5,
+    VHOST_USER_SET_LOG_BASE = 6,
+    VHOST_USER_SET_LOG_FD = 7,
+    VHOST_USER_SET_VRING_NUM = 8,
+    VHOST_USER_SET_VRING_ADDR = 9,
+    VHOST_USER_SET_VRING_BASE = 10,
+    VHOST_USER_GET_VRING_BASE = 11,
+    VHOST_USER_SET_VRING_KICK = 12,
+    VHOST_USER_SET_VRING_CALL = 13,
+    VHOST_USER_SET_VRING_ERR = 14,
+    VHOST_USER_NET_SET_BACKEND = 15,
+    VHOST_USER_MAX
+} VhostUserRequest;
+
+typedef struct VhostUserMemoryRegion {
+    __u64 guest_phys_addr;
+    __u64 memory_size;
+    __u64 userspace_addr;
+} VhostUserMemoryRegion;
+
+typedef struct VhostUserMemory {
+    __u32 nregions;
+    VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS];
+} VhostUserMemory;
+
+typedef struct VhostUserMsg {
+    VhostUserRequest request;
+
+    int flags;
+    union {
+        uint64_t    u64;
+        int         fd;
+        struct vhost_vring_state state;
+        struct vhost_vring_addr addr;
+        struct vhost_vring_file file;
+
+        VhostUserMemory memory;
+    };
+} VhostUserMsg;
+
+static int vhost_user_recv(int fd, VhostUserMsg *msg)
+{
+    ssize_t r = read(fd, msg, sizeof(VhostUserMsg));
+
+    return (r == sizeof(VhostUserMsg)) ? 0 : -1;
+}
+
+static int vhost_user_send_fds(int fd, const VhostUserMsg *msg, int *fds,
+        size_t fd_num)
+{
+    int ret;
+
+    struct msghdr msgh;
+    struct iovec iov[1];
+
+    size_t fd_size = fd_num * sizeof(int);
+    char control[CMSG_SPACE(fd_size)];
+    struct cmsghdr *cmsg;
+
+    memset(&msgh, 0, sizeof(msgh));
+    memset(control, 0, sizeof(control));
+
+    /* set the payload */
+    iov[0].iov_base = (void *) msg;
+    iov[0].iov_len = sizeof(VhostUserMsg);
+
+    msgh.msg_iov = iov;
+    msgh.msg_iovlen = 1;
+
+    if (fd_num) {
+        msgh.msg_control = control;
+        msgh.msg_controllen = sizeof(control);
+
+        cmsg = CMSG_FIRSTHDR(&msgh);
+
+        cmsg->cmsg_len = CMSG_LEN(fd_size);
+        cmsg->cmsg_level = SOL_SOCKET;
+        cmsg->cmsg_type = SCM_RIGHTS;
+        memcpy(CMSG_DATA(cmsg), fds, fd_size);
+    } else {
+        msgh.msg_control = 0;
+        msgh.msg_controllen = 0;
+    }
+
+    do {
+        ret = sendmsg(fd, &msgh, 0);
+    } while (ret < 0 && errno == EINTR);
+
+    if (ret < 0) {
+        fprintf(stderr, "Failed to send msg(%d), reason: %s\n",
+                msg->request, strerror(errno));
+    } else {
+        ret = 0;
+    }
+
+    return ret;
+}
 
 static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
         void *arg)
@@ -23,6 +132,39 @@ static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
     return ioctl(fd, request, arg);
 }
 
+static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
+        void *arg)
+{
+    int fd = dev->control;
+    VhostUserMsg msg;
+    int result = 0;
+    int fds[VHOST_MEMORY_MAX_NREGIONS];
+    size_t fd_num = 0;
+
+    memset(&msg, 0, sizeof(VhostUserMsg));
+
+    switch (request) {
+    default:
+        fprintf(stderr, "vhost-user trying to send unhandled ioctl\n");
+        return -1;
+        break;
+    }
+
+    result = vhost_user_send_fds(fd, &msg, fds, fd_num);
+
+    if (!result) {
+        result = vhost_user_recv(fd, &msg);
+        if (!result) {
+            switch (request) {
+            default:
+                fprintf(stderr, "vhost-user received unhandled message\n");
+            }
+        }
+    }
+
+    return result;
+}
+
 int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
 {
     int result = -1;
@@ -32,7 +174,7 @@ int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
         result = vhost_kernel_call(dev, request, arg);
         break;
     case VHOST_BACKEND_TYPE_USER:
-        fprintf(stderr, "vhost-user not implemented\n");
+        result = vhost_user_call(dev, request, arg);
         break;
     default:
         fprintf(stderr, "Unknown vhost backend type\n");
@@ -44,13 +186,31 @@ int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
 int vhost_backend_init(struct vhost_dev *dev, const char *devpath)
 {
     int fd = -1;
+    struct sockaddr_un un;
+    size_t len;
 
     switch (dev->backend_type) {
     case VHOST_BACKEND_TYPE_KERNEL:
         fd = open(devpath, O_RDWR);
         break;
     case VHOST_BACKEND_TYPE_USER:
-        fprintf(stderr, "vhost-user not implemented\n");
+        /* Create the socket */
+        fd = socket(AF_UNIX, SOCK_STREAM, 0);
+        if (fd == -1) {
+            perror("socket");
+            return -1;
+        }
+
+        un.sun_family = AF_UNIX;
+        strcpy(un.sun_path, devpath);
+
+        len = sizeof(un.sun_family) + strlen(devpath);
+
+        /* Connect */
+        if (connect(fd, (struct sockaddr *) &un, len) == -1) {
+            perror("connect");
+            return -1;
+        }
         break;
     default:
         fprintf(stderr, "Unknown vhost backend type\n");
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (3 preceding siblings ...)
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 4/5] Add domain socket communication for vhost-user backend Antonios Motakis
@ 2013-11-29 19:52 ` Antonios Motakis
  2013-12-04 20:00   ` Michael S. Tsirkin
  2013-12-04 13:56 ` [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Stefan Hajnoczi
  5 siblings, 1 reply; 15+ messages in thread
From: Antonios Motakis @ 2013-11-29 19:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: lukego, Antonios Motakis, tech, n.nikolaev, Michael S. Tsirkin

Each ioctl request of vhost-kernel has a vhost-user message equivalent,
which is sent over the control socket.

The general approach is to copy the data from the supplied argument
pointer to a designated field in the message. If a file descriptor is
to be passed, it should also be placed in the fds array for inclusion in
the sendmsg control header.

VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans
the global ram_list for ram blocks with a valid fd field set. This would
be set when the -mem-path and -mem-prealloc command line options are used.
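
On the receiving side (outside the scope of this patch, names hypothetical),
the resulting VHOST_USER_SET_MEM_TABLE message would typically be handled by
mmap()ing every region from the descriptor passed along with it. With
-mem-path each ram block is backed by its own file, so mapping from offset 0
is assumed here:

/* Map each region announced in a VHOST_USER_SET_MEM_TABLE message;
 * one fd per region is passed via SCM_RIGHTS (needs <sys/mman.h>). */
static int map_mem_table(VhostUserMemory *mem, int *fds, size_t fd_num,
                         void **mapped)
{
    uint32_t i;

    if (mem->nregions > fd_num) {
        return -1;
    }

    for (i = 0; i < mem->nregions; i++) {
        mapped[i] = mmap(NULL, mem->regions[i].memory_size,
                         PROT_READ | PROT_WRITE, MAP_SHARED, fds[i], 0);
        if (mapped[i] == MAP_FAILED) {
            return -1;
        }
    }

    return 0;
}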

Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
---
 hw/virtio/vhost-backend.c | 105 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 100 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 264a0a1..1bc2928 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -67,6 +67,38 @@ typedef struct VhostUserMsg {
     };
 } VhostUserMsg;
 
+static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
+    -1, /* VHOST_USER_NONE */
+    VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */
+    VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */
+    VHOST_SET_OWNER, /* VHOST_USER_SET_OWNER */
+    VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */
+    VHOST_SET_MEM_TABLE, /* VHOST_USER_SET_MEM_TABLE */
+    VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */
+    VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */
+    VHOST_SET_VRING_NUM, /* VHOST_USER_SET_VRING_NUM */
+    VHOST_SET_VRING_ADDR, /* VHOST_USER_SET_VRING_ADDR */
+    VHOST_SET_VRING_BASE, /* VHOST_USER_SET_VRING_BASE */
+    VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */
+    VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */
+    VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */
+    VHOST_SET_VRING_ERR, /* VHOST_USER_SET_VRING_ERR */
+    VHOST_NET_SET_BACKEND /* VHOST_USER_NET_SET_BACKEND */
+};
+
+static VhostUserRequest vhost_user_request_translate(unsigned long int request)
+{
+    VhostUserRequest idx;
+
+    for (idx = 0; idx < VHOST_USER_MAX; idx++) {
+        if (ioctl_to_vhost_user_request[idx] == request) {
+            break;
+        }
+    }
+
+    return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx;
+}
+
 static int vhost_user_recv(int fd, VhostUserMsg *msg)
 {
     ssize_t r = read(fd, msg, sizeof(VhostUserMsg));
@@ -137,13 +169,72 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
 {
     int fd = dev->control;
     VhostUserMsg msg;
-    int result = 0;
+    RAMBlock *block = 0;
+    int result = 0, need_reply = 0;
     int fds[VHOST_MEMORY_MAX_NREGIONS];
     size_t fd_num = 0;
 
-    memset(&msg, 0, sizeof(VhostUserMsg));
+    msg.request = vhost_user_request_translate(request);
+    msg.flags = 0;
 
     switch (request) {
+    case VHOST_GET_FEATURES:
+    case VHOST_GET_VRING_BASE:
+        need_reply = 1;
+        break;
+
+    case VHOST_SET_FEATURES:
+    case VHOST_SET_LOG_BASE:
+        msg.u64 = *((uint64_t *) arg);
+        break;
+
+    case VHOST_SET_OWNER:
+    case VHOST_RESET_OWNER:
+        break;
+
+    case VHOST_SET_MEM_TABLE:
+        QTAILQ_FOREACH(block, &ram_list.blocks, next)
+        {
+            if (block->fd > 0) {
+                msg.memory.regions[fd_num].userspace_addr = (__u64) block->host;
+                msg.memory.regions[fd_num].memory_size = block->length;
+                msg.memory.regions[fd_num].guest_phys_addr = block->offset;
+                fds[fd_num++] = block->fd;
+            }
+        }
+
+        msg.memory.nregions = fd_num;
+
+        if (!fd_num) {
+            fprintf(stderr, "Failed initializing vhost-user memory map\n"
+                    "consider -mem-path and -mem-prealloc options\n");
+            return -1;
+        }
+        break;
+
+    case VHOST_SET_LOG_FD:
+        msg.fd = *((int *) arg);
+        break;
+
+    case VHOST_SET_VRING_NUM:
+    case VHOST_SET_VRING_BASE:
+        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
+        break;
+
+    case VHOST_SET_VRING_ADDR:
+        memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
+        break;
+
+    case VHOST_SET_VRING_KICK:
+    case VHOST_SET_VRING_CALL:
+    case VHOST_SET_VRING_ERR:
+    case VHOST_NET_SET_BACKEND:
+        memcpy(&msg.file, arg, sizeof(struct vhost_vring_file));
+        if (msg.file.fd > 0) {
+            fds[0] = msg.file.fd;
+            fd_num = 1;
+        }
+        break;
     default:
         fprintf(stderr, "vhost-user trying to send unhandled ioctl\n");
         return -1;
@@ -152,12 +243,16 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
 
     result = vhost_user_send_fds(fd, &msg, fds, fd_num);
 
-    if (!result) {
+    if (!result && need_reply) {
         result = vhost_user_recv(fd, &msg);
         if (!result) {
             switch (request) {
-            default:
-                fprintf(stderr, "vhost-user received unhandled message\n");
+            case VHOST_GET_FEATURES:
+                *((uint64_t *) arg) = msg.u64;
+                break;
+            case VHOST_GET_VRING_BASE:
+                memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
+                break;
             }
         }
     }
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] Add vhostsock option
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 3/5] Add vhostsock option Antonios Motakis
@ 2013-12-04 13:42   ` Stefan Hajnoczi
  2013-12-04 15:21     ` Antonios Motakis
  2013-12-04 14:28   ` Eric Blake
  1 sibling, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2013-12-04 13:42 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: tech, Anthony Liguori, Michael S. Tsirkin, Michael Tokarev,
	Markus Armbruster, n.nikolaev, qemu-devel, Stefan Hajnoczi,
	lukego, Paolo Bonzini, Luiz Capitulino

On Fri, Nov 29, 2013 at 08:52:24PM +0100, Antonios Motakis wrote:
> @@ -91,15 +91,27 @@ static int vhost_net_get_fd(NetClientState *backend)
>      }
>  }
>  
> -struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
> -                                 bool force)
> +struct vhost_net *vhost_net_init(NetClientState *backend, char *vhostsock,
> +                                 int devfd, bool force)
>  {
>      int r;
>      struct vhost_net *net = g_malloc(sizeof *net);
> +    const char *backend_sock = 0;
> +    VhostBackendType backend_type = VHOST_BACKEND_TYPE_NONE;
> +
>      if (!backend) {
>          fprintf(stderr, "vhost-net requires backend to be setup\n");
>          goto fail;
>      }
> +
> +    if (vhostsock && strcmp(vhostsock, VHOST_NET_DEFAULT_SOCK) != 0) {

This is a weird hack.  Why check for VHOST_NET_DEFAULT_SOCK at all?

If the option is not present then kernel vhost is used; if the option is
present then userspace vhost is used.  I don't understand why a magic
hardcoded path is useful.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton Antonios Motakis
@ 2013-12-04 13:47   ` Stefan Hajnoczi
  2013-12-04 15:23     ` Antonios Motakis
  0 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2013-12-04 13:47 UTC (permalink / raw)
  To: Antonios Motakis
  Cc: Michael S. Tsirkin, qemu-devel, n.nikolaev, Anthony Liguori,
	lukego, Paolo Bonzini, tech

On Fri, Nov 29, 2013 at 08:52:23PM +0100, Antonios Motakis wrote:
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 05de174..80defe4 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -25,9 +25,18 @@ static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
>  
>  int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
>  {
> -    int result;
> +    int result = -1;
>  
> -    result = vhost_kernel_call(dev, request, arg);
> +    switch (dev->backend_type) {
> +    case VHOST_BACKEND_TYPE_KERNEL:
> +        result = vhost_kernel_call(dev, request, arg);
> +        break;
> +    case VHOST_BACKEND_TYPE_USER:
> +        fprintf(stderr, "vhost-user not implemented\n");
> +        break;
> +    default:
> +        fprintf(stderr, "Unknown vhost backend type\n");
> +    }

The switch statement approach gets messy fast when local variables are
needed inside some case labels.  It also makes it hard to conditionally
compile features without using #ifdef.

Perhaps instead a VhostOps struct could be used:

/* Vhost backends implement this interface */
typedef struct {
    int vhost_call(struct vhost_dev *dev,
                   unsigned long int request,
                   void *arg);
    ...
} VhostOps;

const VhostOps vhost_kernel_ops = {
    ...
};

const VhostOps vhost_user_ops = {
    ...
};

ret = dev->vhost_ops->vhost_call(dev, request, arg);

Something along those lines.  It keeps the different backend
implementations separate (they can live in separate files and be
conditional in Makefile.objs).
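
Spelled out a little more fully with explicit function pointers (only
vhost_kernel_call exists in the patch as posted; the struct and the other
member names are just illustrative):

/* Vhost backends implement this interface. */
typedef struct VhostOps {
    int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
                      void *arg);
    int (*vhost_backend_init)(struct vhost_dev *dev, const char *devpath);
    int (*vhost_backend_cleanup)(struct vhost_dev *dev);
} VhostOps;

static const VhostOps vhost_kernel_ops = {
    .vhost_call = vhost_kernel_call,
    /* .vhost_backend_init/.vhost_backend_cleanup would wrap the
     * open()/close() paths of this patch */
};

/* struct vhost_dev would then carry a "const VhostOps *vhost_ops" pointer,
 * selected once at init time, and callers would dispatch through it:
 *
 *     ret = dev->vhost_ops->vhost_call(dev, request, arg);
 */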

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends
  2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
                   ` (4 preceding siblings ...)
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation Antonios Motakis
@ 2013-12-04 13:56 ` Stefan Hajnoczi
  2013-12-04 15:23   ` Antonios Motakis
  5 siblings, 1 reply; 15+ messages in thread
From: Stefan Hajnoczi @ 2013-12-04 13:56 UTC (permalink / raw)
  To: Antonios Motakis; +Cc: lukego, tech, qemu-devel, n.nikolaev, Michael S. Tsirkin

On Fri, Nov 29, 2013 at 08:52:21PM +0100, Antonios Motakis wrote:
> In this patch series we would like to introduce our approach for putting a
> virtio-net backend in an external userspace process. Our eventual target is to
> run the network backend in the Snabbswitch ethernet switch, while receiving
> traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
> implementation.
> 
> For this, we are working on extending vhost to allow equivalent functionality
> for userspace. Vhost already passes control of the data plane of virtio-net to
> the host kernel; we want to realize a similar model, but for userspace.
> 
> In this patch series the concept of a vhost-backend is introduced.
> 
> We define two vhost backend types - vhost-kernel and vhost-user. The former is
> the interface to the current kernel module implementation. Its control plane is
> ioctl based. The data plane is the kernel directly accessing the QEMU-allocated
> guest memory.
> 
> In the new vhost-user backend, the control plane is based on communication
> between QEMU and another userspace process using a unix domain socket. This
> allows implementing a virtio backend for a guest running in QEMU inside the
> other userspace process.

One thing that came to mind when reading the patches is that you are
implementing the vhost interface pretty much exactly as-is.  Did you
look at FUSE's character devices in userspace (CUSE)?  IIRC even ioctl
is supported so you might be able to skip the userspace backend entirely
if you mimic vhost_net.ko's ioctl interface.

Then all that's needed is some configuration/startup code to use shared
memory and pass the eventfds.

Stefan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] Add vhostsock option
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 3/5] Add vhostsock option Antonios Motakis
  2013-12-04 13:42   ` Stefan Hajnoczi
@ 2013-12-04 14:28   ` Eric Blake
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Blake @ 2013-12-04 14:28 UTC (permalink / raw)
  To: Antonios Motakis, qemu-devel
  Cc: Anthony Liguori, Michael S. Tsirkin, Michael Tokarev,
	Markus Armbruster, n.nikolaev, Luiz Capitulino, Stefan Hajnoczi,
	lukego, Paolo Bonzini, tech

On 11/29/2013 12:52 PM, Antonios Motakis wrote:
> Adding a new tap network backend option - vhostsock. It points
> to a named unix domain socket that will be used to communicate
> with the vhost-user backend. This is a temporary workaround; in future
> versions of this series vhost-user will be made independent of
> the tap network backend.
> 

> +++ b/qapi-schema.json
> @@ -2891,6 +2891,8 @@
>  #
>  # @vhostforce: #optional vhost on for non-MSIX virtio guests
>  #
> +# @vhostsock: #optional vhost backend socket
> +#

Needs a "(since 2.0)" designation.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] Add vhostsock option
  2013-12-04 13:42   ` Stefan Hajnoczi
@ 2013-12-04 15:21     ` Antonios Motakis
  0 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-12-04 15:21 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: VirtualOpenSystems Technical Team, Anthony Liguori,
	Michael S. Tsirkin, Michael Tokarev, Markus Armbruster,
	Nikolay Nikolaev, qemu-devel qemu-devel, Stefan Hajnoczi,
	Luke Gorrie, Paolo Bonzini, Luiz Capitulino

On Wed, Dec 4, 2013 at 2:42 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Fri, Nov 29, 2013 at 08:52:24PM +0100, Antonios Motakis wrote:
> > @@ -91,15 +91,27 @@ static int vhost_net_get_fd(NetClientState *backend)
> >      }
> >  }
> >
> > -struct vhost_net *vhost_net_init(NetClientState *backend, int devfd,
> > -                                 bool force)
> > +struct vhost_net *vhost_net_init(NetClientState *backend, char
> *vhostsock,
> > +                                 int devfd, bool force)
> >  {
> >      int r;
> >      struct vhost_net *net = g_malloc(sizeof *net);
> > +    const char *backend_sock = 0;
> > +    VhostBackendType backend_type = VHOST_BACKEND_TYPE_NONE;
> > +
> >      if (!backend) {
> >          fprintf(stderr, "vhost-net requires backend to be setup\n");
> >          goto fail;
> >      }
> > +
> > +    if (vhostsock && strcmp(vhostsock, VHOST_NET_DEFAULT_SOCK) != 0) {
>
> This is a weird hack.  Why check for VHOST_NET_DEFAULT_SOCK at all?
>
> If the option is not present then kernel vhost is used, if the option is
> present then userspace vhost is used.  I don't understand why a magic
> hardcoded path is useful.
>

This code will be reworked for the next version of the series, so this
shouldn't be a problem then.

Antonios

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends
  2013-12-04 13:56 ` [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Stefan Hajnoczi
@ 2013-12-04 15:23   ` Antonios Motakis
  0 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-12-04 15:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Luke Gorrie, VirtualOpenSystems Technical Team,
	qemu-devel qemu-devel, Nikolay Nikolaev, Michael S. Tsirkin

On Wed, Dec 4, 2013 at 2:56 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Fri, Nov 29, 2013 at 08:52:21PM +0100, Antonios Motakis wrote:
> > In this patch series we would like to introduce our approach for putting
> a
> > virtio-net backend in an external userspace process. Our eventual target
> is to
> > run the network backend in the Snabbswitch ethernet switch, while
> receiving
> > traffic from a guest inside QEMU/KVM which runs an unmodified virtio-net
> > implementation.
> >
> > For this, we are working on extending vhost to allow equivalent
> functionality
> > for userspace. Vhost already passes control of the data plane of
> virtio-net to
> > the host kernel; we want to realize a similar model, but for userspace.
> >
> > In this patch series the concept of a vhost-backend is introduced.
> >
> > We define two vhost backend types - vhost-kernel and vhost-user. The
> former is
> > the interface to the current kernel module implementation. Its control
> plane is
> > ioctl based. The data plane is the kernel directly accessing the QEMU
> allocated
> > guest memory.
> >
> > In the new vhost-user backend, the control plane is based on
> communication
> > between QEMU and another userspace process using a unix domain socket.
> This
> > allows implementing a virtio backend for a guest running in QEMU inside
> the
> > other userspace process.
>
> One thing that came to mind when reading the patches is that you are
> implementing the vhost interface pretty much exactly as-is.  Did you
> look at FUSE's character devices in userspace (CUSE)?  IIRC even ioctl
> is supported so you might be able to skip the userspace backend entirely
> if you mimic vhost_net.ko's ioctl interface.
>
> Then all that's needed is some configuration/startup code to use shared
> memory and pass the eventfds.
>
> Stefan
>

Using UNIX domain sockets we can easily exchange the file descriptors we
need; I don't know if this is the case with CUSE. Even though we
reimplement the ioctls via the UNIX domain socket, they are simple enough
that I am not sure CUSE would actually make things simpler.

Even so, the vhost server is not really a device to be represented as a
character device, so I am not sure that would be the right approach.

Antonios

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton
  2013-12-04 13:47   ` Stefan Hajnoczi
@ 2013-12-04 15:23     ` Antonios Motakis
  0 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-12-04 15:23 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Michael S. Tsirkin, qemu-devel qemu-devel, Nikolay Nikolaev,
	Anthony Liguori, Luke Gorrie, Paolo Bonzini,
	VirtualOpenSystems Technical Team

On Wed, Dec 4, 2013 at 2:47 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:

> On Fri, Nov 29, 2013 at 08:52:23PM +0100, Antonios Motakis wrote:
> > diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> > index 05de174..80defe4 100644
> > --- a/hw/virtio/vhost-backend.c
> > +++ b/hw/virtio/vhost-backend.c
> > @@ -25,9 +25,18 @@ static int vhost_kernel_call(struct vhost_dev *dev,
> unsigned long int request,
> >
> >  int vhost_call(struct vhost_dev *dev, unsigned long int request, void
> *arg)
> >  {
> > -    int result;
> > +    int result = -1;
> >
> > -    result = vhost_kernel_call(dev, request, arg);
> > +    switch (dev->backend_type) {
> > +    case VHOST_BACKEND_TYPE_KERNEL:
> > +        result = vhost_kernel_call(dev, request, arg);
> > +        break;
> > +    case VHOST_BACKEND_TYPE_USER:
> > +        fprintf(stderr, "vhost-user not implemented\n");
> > +        break;
> > +    default:
> > +        fprintf(stderr, "Unknown vhost backend type\n");
> > +    }
>
> The switch statement approach gets messy fast when local variables are
> needed inside some case labels.  It also makes it hard to conditionally
> compile features without using #ifdef.
>
> Perhaps instead a VhostOps struct could be used:
>
> /* Vhost backends implement this interface */
> typedef struct {
>     int (*vhost_call)(struct vhost_dev *dev,
>                    unsigned long int request,
>                    void *arg);
>     ...
> } VhostOps;
>
> const VhostOps vhost_kernel_ops = {
>     ...
> };
>
> const VhostOps vhost_user_ops = {
>     ...
> };
>
> ret = dev->vhost_ops->vhost_call(dev, request, arg);
>
> Something along those lines.  It keeps the different backend
> implementations separate (they can live in separate files and be
> conditional in Makefile.objs).
>

We will take this into account as well, thanks.

Antonios
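
For reference, a compilable sketch of the ops-table dispatch outlined above;
the struct layout, stub bodies, and helper names are placeholders rather than
the interface that eventually goes into QEMU.

typedef struct VhostOps VhostOps;

struct vhost_dev {
    const VhostOps *vhost_ops;    /* selected when the backend is set up */
    /* ... remaining vhost_dev fields elided ... */
};

struct VhostOps {
    int (*vhost_call)(struct vhost_dev *dev, unsigned long int request,
                      void *arg);
};

static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request,
                             void *arg)
{
    /* an ioctl() on the /dev/vhost-net fd would go here */
    return 0;
}

static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
                           void *arg)
{
    /* translating the request and writing it to the unix socket would go here */
    return 0;
}

const VhostOps vhost_kernel_ops = { .vhost_call = vhost_kernel_call };
const VhostOps vhost_user_ops   = { .vhost_call = vhost_user_call };

/* Callers dispatch without switching on a backend type enum: */
int vhost_call(struct vhost_dev *dev, unsigned long int request, void *arg)
{
    return dev->vhost_ops->vhost_call(dev, request, arg);
}

Each backend can then live in its own file and be picked in Makefile.objs
without #ifdefs in the shared code, which is the point of the suggestion.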


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation
  2013-11-29 19:52 ` [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation Antonios Motakis
@ 2013-12-04 20:00   ` Michael S. Tsirkin
  2013-12-10 12:05     ` Antonios Motakis
  0 siblings, 1 reply; 15+ messages in thread
From: Michael S. Tsirkin @ 2013-12-04 20:00 UTC (permalink / raw)
  To: Antonios Motakis; +Cc: lukego, tech, qemu-devel, n.nikolaev

On Fri, Nov 29, 2013 at 08:52:26PM +0100, Antonios Motakis wrote:
> Each ioctl request of vhost-kernel has a vhost-user message equivalent,
> which is sent over the control socket.
> 
> The general approach is to copy the data from the supplied argument
> pointer to a designated field in the message. If a file descriptor is
> to be passed, it should also be placed in the fds array for inclusion in
> the sendmsg control header.

What does this code talk to? What's on the other side of the domain socket?

> VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans
> the global ram_list for ram blocks with a valid fd field set. The fd is
> set when the -mem-path and -mem-prealloc command line options are used.

I don't get this.
AFAIK -mem-path is used for hugetlbfs.
So vhost-user requires hugetlbfs then?

> Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>


> ---
>  hw/virtio/vhost-backend.c | 105 +++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 100 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 264a0a1..1bc2928 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -67,6 +67,38 @@ typedef struct VhostUserMsg {
>      };
>  } VhostUserMsg;
>  
> +static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
> +    -1, /* VHOST_USER_NONE */
> +    VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */
> +    VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */
> +    VHOST_SET_OWNER, /* VHOST_USER_SET_OWNER */
> +    VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */
> +    VHOST_SET_MEM_TABLE, /* VHOST_USER_SET_MEM_TABLE */
> +    VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */
> +    VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */
> +    VHOST_SET_VRING_NUM, /* VHOST_USER_SET_VRING_NUM */
> +    VHOST_SET_VRING_ADDR, /* VHOST_USER_SET_VRING_ADDR */
> +    VHOST_SET_VRING_BASE, /* VHOST_USER_SET_VRING_BASE */
> +    VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */
> +    VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */
> +    VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */
> +    VHOST_SET_VRING_ERR, /* VHOST_USER_SET_VRING_ERR */
> +    VHOST_NET_SET_BACKEND /* VHOST_USER_NET_SET_BACKEND */
> +};
> +
> +static VhostUserRequest vhost_user_request_translate(unsigned long int request)
> +{
> +    VhostUserRequest idx;
> +
> +    for (idx = 0; idx < VHOST_USER_MAX; idx++) {
> +        if (ioctl_to_vhost_user_request[idx] == request) {
> +            break;
> +        }
> +    }
> +
> +    return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx;
> +}
> +
>  static int vhost_user_recv(int fd, VhostUserMsg *msg)
>  {
>      ssize_t r = read(fd, msg, sizeof(VhostUserMsg));
> @@ -137,13 +169,72 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>  {
>      int fd = dev->control;
>      VhostUserMsg msg;
> -    int result = 0;
> +    RAMBlock *block = 0;
> +    int result = 0, need_reply = 0;
>      int fds[VHOST_MEMORY_MAX_NREGIONS];
>      size_t fd_num = 0;
>  
> -    memset(&msg, 0, sizeof(VhostUserMsg));
> +    msg.request = vhost_user_request_translate(request);
> +    msg.flags = 0;
>  
>      switch (request) {
> +    case VHOST_GET_FEATURES:
> +    case VHOST_GET_VRING_BASE:
> +        need_reply = 1;
> +        break;
> +
> +    case VHOST_SET_FEATURES:
> +    case VHOST_SET_LOG_BASE:
> +        msg.u64 = *((uint64_t *) arg);
> +        break;
> +
> +    case VHOST_SET_OWNER:
> +    case VHOST_RESET_OWNER:
> +        break;
> +
> +    case VHOST_SET_MEM_TABLE:
> +        QTAILQ_FOREACH(block, &ram_list.blocks, next)
> +        {
> +            if (block->fd > 0) {
> +                msg.memory.regions[fd_num].userspace_addr = (__u64) block->host;
> +                msg.memory.regions[fd_num].memory_size = block->length;
> +                msg.memory.regions[fd_num].guest_phys_addr = block->offset;
> +                fds[fd_num++] = block->fd;
> +            }
> +        }
> +
> +        msg.memory.nregions = fd_num;
> +
> +        if (!fd_num) {
> +            fprintf(stderr, "Failed initializing vhost-user memory map\n"
> +                    "consider -mem-path and -mem-prealloc options\n");
> +            return -1;
> +        }
> +        break;
> +
> +    case VHOST_SET_LOG_FD:
> +        msg.fd = *((int *) arg);
> +        break;
> +
> +    case VHOST_SET_VRING_NUM:
> +    case VHOST_SET_VRING_BASE:
> +        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> +        break;
> +
> +    case VHOST_SET_VRING_ADDR:
> +        memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> +        break;
> +
> +    case VHOST_SET_VRING_KICK:
> +    case VHOST_SET_VRING_CALL:
> +    case VHOST_SET_VRING_ERR:
> +    case VHOST_NET_SET_BACKEND:
> +        memcpy(&msg.file, arg, sizeof(struct vhost_vring_file));
> +        if (msg.file.fd > 0) {
> +            fds[0] = msg.file.fd;
> +            fd_num = 1;
> +        }
> +        break;
>      default:
>          fprintf(stderr, "vhost-user trying to send unhandled ioctl\n");
>          return -1;
> @@ -152,12 +243,16 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>  
>      result = vhost_user_send_fds(fd, &msg, fds, fd_num);
>  
> -    if (!result) {
> +    if (!result && need_reply) {
>          result = vhost_user_recv(fd, &msg);
>          if (!result) {
>              switch (request) {
> -            default:
> -                fprintf(stderr, "vhost-user received unhandled message\n");
> +            case VHOST_GET_FEATURES:
> +                *((uint64_t *) arg) = msg.u64;
> +                break;
> +            case VHOST_GET_VRING_BASE:
> +                memcpy(arg, &msg.state, sizeof(struct vhost_vring_state));
> +                break;
>              }
>          }
>      }
> -- 
> 1.8.3.2
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation
  2013-12-04 20:00   ` Michael S. Tsirkin
@ 2013-12-10 12:05     ` Antonios Motakis
  0 siblings, 0 replies; 15+ messages in thread
From: Antonios Motakis @ 2013-12-10 12:05 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Luke Gorrie, VirtualOpenSystems Technical Team,
	qemu-devel, Nikolay Nikolaev


Hello,

On Wed, Dec 4, 2013 at 9:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:

> On Fri, Nov 29, 2013 at 08:52:26PM +0100, Antonios Motakis wrote:
> > Each ioctl request of vhost-kernel has a vhost-user message equivalent,
> > which is sent over the control socket.
> >
> > The general approach is to copy the data from the supplied argument
> > pointer to a designated field in the message. If a file descriptor is
> > to be passed, it should also be placed in the fds array for inclusion in
> > the sendmsg control header.
>
> What does this code talk to? What's on the other side of the domain socket?
>

Our intention is to have the Snabbswitch userspace ethernet switch on the
other side of the socket. Of course, the interface we are aiming at is not
limited to Snabbswitch, so other use cases could be considered as well.


>
> > VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans
> > the global ram_list for ram blocks with a valid fd field set. The fd is
> > set when the -mem-path and -mem-prealloc command line options are used.
>
> I don't get this.
> AFAIK -mem-path is used for hugetlbfs.
> So vhost-user requires hugetlbfs then?
>

We need a mechanism to share the guest memory with the remote process.
Hugetlbfs is one way to do it; however, we have posted a new version of the
patch series that introduces another mechanism (-mem-share), so hugetlbfs is
no longer mandatory. I think this should address the concern. For more
details, please refer to the new vhost-user series we posted today.

Best regards,
Antonios
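
To make the sharing mechanism concrete, here is a minimal sketch of what a
backend process can do with a region file descriptor it receives over the
socket, assuming the region's size and guest physical base travel in the
accompanying message; all names below are hypothetical.

/*
 * Minimal sketch of the backend side of the memory sharing described above:
 * map a received region file descriptor and translate a guest physical
 * address to a local pointer.  Names and message layout are assumptions.
 */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

struct shared_region {
    uint64_t guest_phys_addr;   /* guest physical base of the region */
    uint64_t memory_size;       /* size of the region in bytes */
    void    *mmap_base;         /* where this process mapped it */
};

static int map_region(struct shared_region *r, int fd)
{
    /* MAP_SHARED so this process sees the same pages the guest writes */
    r->mmap_base = mmap(NULL, (size_t)r->memory_size,
                        PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return r->mmap_base == MAP_FAILED ? -1 : 0;
}

static void *gpa_to_ptr(const struct shared_region *r, uint64_t gpa)
{
    if (gpa < r->guest_phys_addr ||
        gpa - r->guest_phys_addr >= r->memory_size) {
        return NULL;            /* address not covered by this region */
    }
    return (uint8_t *)r->mmap_base + (gpa - r->guest_phys_addr);
}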


> > Signed-off-by: Antonios Motakis <a.motakis@virtualopensystems.com>
> > Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
>
>
> > ---
> >  hw/virtio/vhost-backend.c | 105
> +++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 100 insertions(+), 5 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> > index 264a0a1..1bc2928 100644
> > --- a/hw/virtio/vhost-backend.c
> > +++ b/hw/virtio/vhost-backend.c
> > @@ -67,6 +67,38 @@ typedef struct VhostUserMsg {
> >      };
> >  } VhostUserMsg;
> >
> > +static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
> > +    -1, /* VHOST_USER_NONE */
> > +    VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */
> > +    VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */
> > +    VHOST_SET_OWNER, /* VHOST_USER_SET_OWNER */
> > +    VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */
> > +    VHOST_SET_MEM_TABLE, /* VHOST_USER_SET_MEM_TABLE */
> > +    VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */
> > +    VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */
> > +    VHOST_SET_VRING_NUM, /* VHOST_USER_SET_VRING_NUM */
> > +    VHOST_SET_VRING_ADDR, /* VHOST_USER_SET_VRING_ADDR */
> > +    VHOST_SET_VRING_BASE, /* VHOST_USER_SET_VRING_BASE */
> > +    VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */
> > +    VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */
> > +    VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */
> > +    VHOST_SET_VRING_ERR, /* VHOST_USER_SET_VRING_ERR */
> > +    VHOST_NET_SET_BACKEND /* VHOST_USER_NET_SET_BACKEND */
> > +};
> > +
> > +static VhostUserRequest vhost_user_request_translate(unsigned long int
> request)
> > +{
> > +    VhostUserRequest idx;
> > +
> > +    for (idx = 0; idx < VHOST_USER_MAX; idx++) {
> > +        if (ioctl_to_vhost_user_request[idx] == request) {
> > +            break;
> > +        }
> > +    }
> > +
> > +    return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx;
> > +}
> > +
> >  static int vhost_user_recv(int fd, VhostUserMsg *msg)
> >  {
> >      ssize_t r = read(fd, msg, sizeof(VhostUserMsg));
> > @@ -137,13 +169,72 @@ static int vhost_user_call(struct vhost_dev *dev,
> unsigned long int request,
> >  {
> >      int fd = dev->control;
> >      VhostUserMsg msg;
> > -    int result = 0;
> > +    RAMBlock *block = 0;
> > +    int result = 0, need_reply = 0;
> >      int fds[VHOST_MEMORY_MAX_NREGIONS];
> >      size_t fd_num = 0;
> >
> > -    memset(&msg, 0, sizeof(VhostUserMsg));
> > +    msg.request = vhost_user_request_translate(request);
> > +    msg.flags = 0;
> >
> >      switch (request) {
> > +    case VHOST_GET_FEATURES:
> > +    case VHOST_GET_VRING_BASE:
> > +        need_reply = 1;
> > +        break;
> > +
> > +    case VHOST_SET_FEATURES:
> > +    case VHOST_SET_LOG_BASE:
> > +        msg.u64 = *((uint64_t *) arg);
> > +        break;
> > +
> > +    case VHOST_SET_OWNER:
> > +    case VHOST_RESET_OWNER:
> > +        break;
> > +
> > +    case VHOST_SET_MEM_TABLE:
> > +        QTAILQ_FOREACH(block, &ram_list.blocks, next)
> > +        {
> > +            if (block->fd > 0) {
> > +                msg.memory.regions[fd_num].userspace_addr = (__u64)
> block->host;
> > +                msg.memory.regions[fd_num].memory_size = block->length;
> > +                msg.memory.regions[fd_num].guest_phys_addr =
> block->offset;
> > +                fds[fd_num++] = block->fd;
> > +            }
> > +        }
> > +
> > +        msg.memory.nregions = fd_num;
> > +
> > +        if (!fd_num) {
> > +            fprintf(stderr, "Failed initializing vhost-user memory
> map\n"
> > +                    "consider -mem-path and -mem-prealloc options\n");
> > +            return -1;
> > +        }
> > +        break;
> > +
> > +    case VHOST_SET_LOG_FD:
> > +        msg.fd = *((int *) arg);
> > +        break;
> > +
> > +    case VHOST_SET_VRING_NUM:
> > +    case VHOST_SET_VRING_BASE:
> > +        memcpy(&msg.state, arg, sizeof(struct vhost_vring_state));
> > +        break;
> > +
> > +    case VHOST_SET_VRING_ADDR:
> > +        memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr));
> > +        break;
> > +
> > +    case VHOST_SET_VRING_KICK:
> > +    case VHOST_SET_VRING_CALL:
> > +    case VHOST_SET_VRING_ERR:
> > +    case VHOST_NET_SET_BACKEND:
> > +        memcpy(&msg.file, arg, sizeof(struct vhost_vring_file));
> > +        if (msg.file.fd > 0) {
> > +            fds[0] = msg.file.fd;
> > +            fd_num = 1;
> > +        }
> > +        break;
> >      default:
> >          fprintf(stderr, "vhost-user trying to send unhandled ioctl\n");
> >          return -1;
> > @@ -152,12 +243,16 @@ static int vhost_user_call(struct vhost_dev *dev,
> unsigned long int request,
> >
> >      result = vhost_user_send_fds(fd, &msg, fds, fd_num);
> >
> > -    if (!result) {
> > +    if (!result && need_reply) {
> >          result = vhost_user_recv(fd, &msg);
> >          if (!result) {
> >              switch (request) {
> > -            default:
> > -                fprintf(stderr, "vhost-user received unhandled
> message\n");
> > +            case VHOST_GET_FEATURES:
> > +                *((uint64_t *) arg) = msg.u64;
> > +                break;
> > +            case VHOST_GET_VRING_BASE:
> > +                memcpy(arg, &msg.state, sizeof(struct
> vhost_vring_state));
> > +                break;
> >              }
> >          }
> >      }
> > --
> > 1.8.3.2
> >
>


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-12-10 12:06 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
2013-11-29 19:52 [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Antonios Motakis
2013-11-29 19:52 ` [Qemu-devel] [PATCH 1/5] Decouple vhost from kernel interface Antonios Motakis
2013-11-29 19:52 ` [Qemu-devel] [PATCH 2/5] Add vhost-kernel and the vhost-user skeleton Antonios Motakis
2013-12-04 13:47   ` Stefan Hajnoczi
2013-12-04 15:23     ` Antonios Motakis
2013-11-29 19:52 ` [Qemu-devel] [PATCH 3/5] Add vhostsock option Antonios Motakis
2013-12-04 13:42   ` Stefan Hajnoczi
2013-12-04 15:21     ` Antonios Motakis
2013-12-04 14:28   ` Eric Blake
2013-11-29 19:52 ` [Qemu-devel] [PATCH 4/5] Add domain socket communication for vhost-user backend Antonios Motakis
2013-11-29 19:52 ` [Qemu-devel] [PATCH 5/5] Add vhost-user calls implementation Antonios Motakis
2013-12-04 20:00   ` Michael S. Tsirkin
2013-12-10 12:05     ` Antonios Motakis
2013-12-04 13:56 ` [Qemu-devel] [RFC 0/5] Vhost and vhost-net support for userspace based backends Stefan Hajnoczi
2013-12-04 15:23   ` Antonios Motakis
