* [Qemu-devel] [PATCH] Add vhost-user test application (Vubr)
@ 2015-10-25 17:42 Victor Kaplansky
From: Victor Kaplansky @ 2015-10-25 17:42 UTC
To: qemu-devel; +Cc: Eduardo Habkost, Michael S. Tsirkin
QEMU is missing a good test for the vhost-user feature,
so I've created a sample vhost-user application
called Vubr (mst coined the name, but better
suggestions will be appreciated). Vubr may later serve
the QEMU community as an internal vhost-user test.
Essentially, Vubr is a very basic vhost-user backend for QEMU.
It runs as a separate user-level process. For packet
processing, Vubr uses an additional QEMU instance with a backend
configured by "-net socket" as a shared VLAN. This way another
QEMU virtual machine can effectively form a bus by means of
UDP communication.
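To illustrate the UDP "bus" idea, here is a minimal sketch of one
endpoint: a socket bound to a local address that sends each frame as a
single datagram to its peer. The helper names udp_endpoint_open() and
udp_endpoint_send() are hypothetical and are not taken from the patch;
Vubr's own UDP setup may differ.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: open a UDP socket bound to
 * ip:port (port 0 lets the kernel pick one). Returns the fd, or -1. */
static int udp_endpoint_open(const char *ip, uint16_t port)
{
    struct sockaddr_in local;
    int sock;

    assert(ip != NULL);
    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;

    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &local.sin_addr) != 1 ||
        bind(sock, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: put one frame on the "bus" by sending it as a
 * single datagram to the peer endpoint. */
static ssize_t udp_endpoint_send(int sock, const struct sockaddr_in *peer,
                                 const void *frame, size_t len)
{
    assert(peer != NULL);
    return sendto(sock, frame, len, 0,
                  (const struct sockaddr *)peer, sizeof(*peer));
}
```

With two such endpoints on localhost, each one sends to the port the
other is bound to, which is the same pairing "-net socket,udp=...,
localaddr=..." expresses on the QEMU side.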
For a simpler setup, the other QEMU instance running
the SLiRP backend can be the same QEMU instance that runs the
vhost-user client.
The Vubr implementation is very preliminary. It is missing many
features. I have been studying vhost-user protocol internals,
so I wrote Vubr bit by bit as I progressed through the
protocol. Most probably the internal architecture will change
significantly.
To run the Vubr application:
Build vubr with:
$ cd qemu/tests/vubr; make
Ensure the machine has hugepages enabled in the kernel, with a command
line like: default_hugepagesz=2M hugepagesz=2M hugepages=2048
Run it with:
$ ./vubr
The above runs a vhost-user server that listens for connections
on the UNIX domain socket /tmp/vubr.sock, and tries to connect
by UDP to the VLAN bridge at localhost:5555 while listening on
localhost:4444.
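For readers unfamiliar with the server side, the following is a minimal
sketch of how a listening UNIX domain socket can be created. The helper
name unix_listen() is hypothetical and not part of the patch, and
unlinking a stale socket file before bind() is an assumption of this
sketch, not necessarily what Vubr does.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: create a listening UNIX
 * domain stream socket at 'path'. Returns the fd, or -1 on error. */
static int unix_listen(const char *path)
{
    struct sockaddr_un un;
    int sock;

    assert(path != NULL);
    if (strlen(path) >= sizeof(un.sun_path))
        return -1;              /* path does not fit in sun_path */

    sock = socket(AF_UNIX, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    strcpy(un.sun_path, path);

    /* Assumption: remove a stale socket file left by a previous run. */
    unlink(path);

    if (bind(sock, (struct sockaddr *)&un, sizeof(un)) < 0 ||
        listen(sock, 1) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

A caller would then accept() on the returned descriptor; with the path
/tmp/vubr.sock this corresponds to the socket that QEMU's
"-chardev socket,path=..." option connects to.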
Run qemu with a virtio-net backed by vhost-user:
$ qemu \
-enable-kvm -m 512 -smp 2 \
-object memory-backend-file,id=mem,size=512M,mem-path=/dev/hugepages,share=on \
-numa node,memdev=mem -mem-prealloc \
-chardev socket,id=char0,path=/tmp/vubr.sock \
-netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=mynet1 \
-net none \
-net socket,vlan=0,udp=localhost:4444,localaddr=localhost:5555 \
-net user,vlan=0 \
disk.img
Vubr has been tested very lightly: it is able to bring up Linux on a
client VM with the virtio-net driver, and to transmit to and receive
from the internet. I tested with "wget redhat.com" and "dig redhat.com".
PS. I consulted DPDK's vhost-user code while implementing Vubr.
Signed-off-by: Victor Kaplansky <victork@redhat.com>
---
tests/vubr/dispatcher.h | 26 ++
tests/vubr/vhost.h | 77 +++++
tests/vubr/vhost_user.h | 70 +++++
tests/vubr/virtio_net.h | 38 +++
tests/vubr/virtio_ring.h | 103 +++++++
tests/vubr/virtqueue.h | 17 ++
tests/vubr/vubr_config.h | 7 +
tests/vubr/vubr_device.h | 41 +++
tests/vubr/dispatcher.c | 77 +++++
tests/vubr/main.c | 18 ++
tests/vubr/vhost_user.c | 83 +++++
tests/vubr/vubr_device.c | 773 +++++++++++++++++++++++++++++++++++++++++++++++
tests/vubr/Makefile | 15 +
13 files changed, 1345 insertions(+)
create mode 100644 tests/vubr/dispatcher.h
create mode 100644 tests/vubr/vhost.h
create mode 100644 tests/vubr/vhost_user.h
create mode 100644 tests/vubr/virtio_net.h
create mode 100644 tests/vubr/virtio_ring.h
create mode 100644 tests/vubr/virtqueue.h
create mode 100644 tests/vubr/vubr_config.h
create mode 100644 tests/vubr/vubr_device.h
create mode 100644 tests/vubr/dispatcher.c
create mode 100644 tests/vubr/main.c
create mode 100644 tests/vubr/vhost_user.c
create mode 100644 tests/vubr/vubr_device.c
create mode 100644 tests/vubr/Makefile
diff --git a/tests/vubr/dispatcher.h b/tests/vubr/dispatcher.h
new file mode 100644
index 0000000..cd02f07
--- /dev/null
+++ b/tests/vubr/dispatcher.h
@@ -0,0 +1,26 @@
+#ifndef __DISPATCHER__
+#define __DISPATCHER__
+
+#include <stddef.h>
+#include <stdint.h>
+#include <sys/select.h>
+
+typedef void (*callback_func)(int sock, void *ctx);
+
+struct event {
+ void *ctx;
+ callback_func callback;
+};
+
+struct dispatcher {
+ int max_sock;
+ fd_set fdset;
+ struct event events[FD_SETSIZE];
+};
+
+int dispatcher_init(struct dispatcher *d);
+int dispatcher_add(struct dispatcher *d, int sock, void *ctx, callback_func cb);
+int dispatcher_remove(struct dispatcher *d, int sock);
+int dispatcher_wait(struct dispatcher *d, uint32_t timeout);
+
+#endif /* __DISPATCHER__ */
diff --git a/tests/vubr/vhost.h b/tests/vubr/vhost.h
new file mode 100644
index 0000000..3960cc2
--- /dev/null
+++ b/tests/vubr/vhost.h
@@ -0,0 +1,77 @@
+#ifndef __VHOST_H__
+#define __VHOST_H__
+
+#include <inttypes.h>
+
+/* Mostly imported from qemu/linux-headers/linux/vhost.h
+ *
+ * Userspace interface for virtio structures. */
+
+struct vhost_vring_state {
+ unsigned int index;
+ unsigned int num;
+};
+
+struct vhost_vring_file {
+ unsigned int index;
+ int fd; /* Pass -1 to unbind from file. */
+};
+
+struct vhost_vring_addr {
+ unsigned int index;
+ /* Option flags. */
+ unsigned int flags;
+ /* Flag values: */
+ /* Whether log address is valid. If set enables logging. */
+#define VHOST_VRING_F_LOG 0
+
+ /* Start of array of descriptors (virtually contiguous) */
+ uint64_t desc_user_addr;
+ /* Used structure address. Must be 32 bit aligned */
+ uint64_t used_user_addr;
+ /* Available structure address. Must be 16 bit aligned */
+ uint64_t avail_user_addr;
+ /* Logging support. */
+ /* Log writes to used structure, at offset calculated from specified
+ * address. Address must be 32 bit aligned. */
+ uint64_t log_guest_addr;
+};
+
+#define VHOST_MEMORY_MAX_NREGIONS (8)
+
+struct vhost_memory_region {
+ uint64_t guest_phys_addr;
+ uint64_t memory_size; /* bytes */
+ uint64_t userspace_addr;
+ uint64_t mmap_offset;
+};
+
+struct vhost_memory {
+ uint32_t nregions;
+ uint32_t padding;
+ struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
+};
+
+/* Feature bits */
+/* Log all write descriptors. Can be changed while device is active. */
+#define VHOST_F_LOG_ALL 26
+/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
+#define VHOST_NET_F_VIRTIO_NET_HDR 27
+
+struct virtio_net_hdr {
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
+ uint8_t flags;
+#define VIRTIO_NET_HDR_GSO_NONE 0
+#define VIRTIO_NET_HDR_GSO_TCPV4 1
+#define VIRTIO_NET_HDR_GSO_UDP 3
+#define VIRTIO_NET_HDR_GSO_TCPV6 4
+#define VIRTIO_NET_HDR_GSO_ECN 0x80
+ uint8_t gso_type;
+ uint16_t hdr_len;
+ uint16_t gso_size;
+ uint16_t csum_start;
+ uint16_t csum_offset;
+ uint16_t num_buffers;
+};
+
+#endif /* __VHOST_H__ */
diff --git a/tests/vubr/vhost_user.h b/tests/vubr/vhost_user.h
new file mode 100644
index 0000000..44d82d0
--- /dev/null
+++ b/tests/vubr/vhost_user.h
@@ -0,0 +1,70 @@
+#ifndef __VHOST_USER_H__
+#define __VHOST_USER_H__
+
+/* Based on qemu/hw/virtio/vhost-user.c */
+
+#include <stdint.h>
+#include <stddef.h>
+#include "vhost.h"
+
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL
+
+#define VHOST_USER_PROTOCOL_F_MQ 0
+
+#define VHOST_USER_REQUEST_LIST \
+ INFO(NONE, 0) \
+ INFO(GET_FEATURES, 1) \
+ INFO(SET_FEATURES, 2) \
+ INFO(SET_OWNER, 3) \
+ INFO(RESET_DEVICE, 4) \
+ INFO(SET_MEM_TABLE, 5) \
+ INFO(SET_LOG_BASE, 6) \
+ INFO(SET_LOG_FD, 7) \
+ INFO(SET_VRING_NUM, 8) \
+ INFO(SET_VRING_ADDR, 9) \
+ INFO(SET_VRING_BASE, 10) \
+ INFO(GET_VRING_BASE, 11) \
+ INFO(SET_VRING_KICK, 12) \
+ INFO(SET_VRING_CALL, 13) \
+ INFO(SET_VRING_ERR, 14) \
+ INFO(GET_PROTOCOL_FEATURES, 15) \
+ INFO(SET_PROTOCOL_FEATURES, 16) \
+ INFO(GET_QUEUE_NUM, 17) \
+ INFO(SET_VRING_ENABLE, 18)
+
+enum vhost_user_request {
+#define INFO(a, b) VHOST_USER_ ## a = b,
+ VHOST_USER_REQUEST_LIST
+#undef INFO
+ VHOST_USER_MAX
+};
+
+struct vhost_user_message {
+ enum vhost_user_request request;
+
+#define VHOST_USER_VERSION_MASK (0x3)
+#define VHOST_USER_REPLY_MASK (0x1<<2)
+ uint32_t flags;
+ uint32_t size; /* the following payload size */
+ union {
+#define VHOST_USER_VRING_IDX_MASK (0xff)
+#define VHOST_USER_VRING_NOFD_MASK (0x1<<8)
+ uint64_t u64;
+ struct vhost_vring_state state;
+ struct vhost_vring_addr addr;
+ struct vhost_memory memory;
+ } payload;
+ int fds[VHOST_MEMORY_MAX_NREGIONS];
+ int fd_num;
+} __attribute__((packed));
+
+#define VHOST_USER_HDR_SIZE offsetof(struct vhost_user_message, payload.u64)
+
+/* The version of the protocol we support */
+#define VHOST_USER_VERSION (0x1)
+
+void vhost_user_message_read(int conn_fd, struct vhost_user_message *vmsg);
+void vhost_user_message_write(int conn_fd, struct vhost_user_message *vmsg);
+
+#endif /* __VHOST_USER_H__ */
diff --git a/tests/vubr/virtio_net.h b/tests/vubr/virtio_net.h
new file mode 100644
index 0000000..f6f87b1
--- /dev/null
+++ b/tests/vubr/virtio_net.h
@@ -0,0 +1,38 @@
+#ifndef __VIRTIO_NET_H__
+#define __VIRTIO_NET_H__
+
+/* From qemu/include/standard-headers/linux/virtio_net.h */
+
+/* The feature bitmap for virtio net */
+#define VIRTIO_NET_F_CSUM 0 /* Host handles pkts w/ partial csum */
+#define VIRTIO_NET_F_GUEST_CSUM 1 /* Guest handles pkts w/ partial csum */
+#define VIRTIO_NET_F_CTRL_GUEST_OFFLOADS 2 /* Dynamic offload configuration. */
+#define VIRTIO_NET_F_MAC 5 /* Host has given MAC address. */
+#define VIRTIO_NET_F_GUEST_TSO4 7 /* Guest can handle TSOv4 in. */
+#define VIRTIO_NET_F_GUEST_TSO6 8 /* Guest can handle TSOv6 in. */
+#define VIRTIO_NET_F_GUEST_ECN 9 /* Guest can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_GUEST_UFO 10 /* Guest can handle UFO in. */
+#define VIRTIO_NET_F_HOST_TSO4 11 /* Host can handle TSOv4 in. */
+#define VIRTIO_NET_F_HOST_TSO6 12 /* Host can handle TSOv6 in. */
+#define VIRTIO_NET_F_HOST_ECN 13 /* Host can handle TSO[6] w/ ECN in. */
+#define VIRTIO_NET_F_HOST_UFO 14 /* Host can handle UFO in. */
+#define VIRTIO_NET_F_MRG_RXBUF 15 /* Host can merge receive buffers. */
+#define VIRTIO_NET_F_STATUS 16 /* virtio_net_config.status available */
+#define VIRTIO_NET_F_CTRL_VQ 17 /* Control channel available */
+#define VIRTIO_NET_F_CTRL_RX 18 /* Control channel RX mode support */
+#define VIRTIO_NET_F_CTRL_VLAN 19 /* Control channel VLAN filtering */
+#define VIRTIO_NET_F_CTRL_RX_EXTRA 20 /* Extra RX mode control support */
+#define VIRTIO_NET_F_GUEST_ANNOUNCE 21 /* Guest can announce device on the
+ * network */
+#define VIRTIO_NET_F_MQ 22 /* Device supports Receive Flow
+ * Steering */
+#define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
+
+#ifndef VIRTIO_NET_NO_LEGACY
+#define VIRTIO_NET_F_GSO 6 /* Host handles pkts w/ any GSO type */
+#endif /* VIRTIO_NET_NO_LEGACY */
+
+#define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
+#define VIRTIO_NET_S_ANNOUNCE 2 /* Announcement is needed */
+
+#endif /* __VIRTIO_NET_H__ */
diff --git a/tests/vubr/virtio_ring.h b/tests/vubr/virtio_ring.h
new file mode 100644
index 0000000..e2d0adb
--- /dev/null
+++ b/tests/vubr/virtio_ring.h
@@ -0,0 +1,103 @@
+#ifndef VIRTQUEUE_H
+#define VIRTQUEUE_H
+/*
+ *
+ * Virtual I/O Device (VIRTIO) Version 1.0
+ * Committee Specification 03
+ * 02 August 2015
+ * Copyright (c) OASIS Open 2015. All Rights Reserved.
+ * Source: http://docs.oasis-open.org/virtio/virtio/v1.0/cs03/listings/
+ *
+ */
+#include <stdint.h>
+
+typedef uint64_t le64;
+typedef uint32_t le32;
+typedef uint16_t le16;
+
+/* This marks a buffer as continuing via the next field. */
+#define VIRTQ_DESC_F_NEXT 1
+/* This marks a buffer as write-only (otherwise read-only). */
+#define VIRTQ_DESC_F_WRITE 2
+/* This means the buffer contains a list of buffer descriptors. */
+#define VIRTQ_DESC_F_INDIRECT 4
+
+/* The device uses this in used->flags to advise the driver: don't kick me
+ * when you add a buffer. It's unreliable, so it's simply an
+ * optimization. */
+#define VIRTQ_USED_F_NO_NOTIFY 1
+/* The driver uses this in avail->flags to advise the device: don't
+ * interrupt me when you consume a buffer. It's unreliable, so it's
+ * simply an optimization. */
+#define VIRTQ_AVAIL_F_NO_INTERRUPT 1
+
+/* Support for indirect descriptors */
+#define VIRTIO_F_INDIRECT_DESC 28
+
+/* Support for avail_event and used_event fields */
+#define VIRTIO_F_EVENT_IDX 29
+
+/* Arbitrary descriptor layouts. */
+#define VIRTIO_F_ANY_LAYOUT 27
+
+/* Virtqueue descriptors: 16 bytes.
+ * These can chain together via "next". */
+struct virtq_desc {
+ /* Address (guest-physical). */
+ le64 addr;
+ /* Length. */
+ le32 len;
+ /* The flags as indicated above. */
+ le16 flags;
+ /* We chain unused descriptors via this, too */
+ le16 next;
+};
+
+struct virtq_avail {
+ le16 flags;
+ le16 idx;
+ le16 ring[];
+ /* Only if VIRTIO_F_EVENT_IDX: le16 used_event; */
+};
+
+/* le32 is used here for ids for padding reasons. */
+struct virtq_used_elem {
+ /* Index of start of used descriptor chain. */
+ le32 id;
+ /* Total length of the descriptor chain which was written to. */
+ le32 len;
+};
+
+struct virtq_used {
+ le16 flags;
+ le16 idx;
+ struct virtq_used_elem ring[];
+ /* Only if VIRTIO_F_EVENT_IDX: le16 avail_event; */
+};
+
+struct virtq {
+ unsigned int num;
+
+ struct virtq_desc *desc;
+ struct virtq_avail *avail;
+ struct virtq_used *used;
+};
+
+static inline int virtq_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
+{
+ return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
+}
+
+/* Get location of event indices (only with VIRTIO_F_EVENT_IDX) */
+static inline le16 *virtq_used_event(struct virtq *vq)
+{
+ /* For backwards compat, used event index is at *end* of avail ring. */
+ return &vq->avail->ring[vq->num];
+}
+
+static inline le16 *virtq_avail_event(struct virtq *vq)
+{
+ /* For backwards compat, avail event index is at *end* of used ring. */
+ return (le16 *)&vq->used->ring[vq->num];
+}
+#endif /* VIRTQUEUE_H */
diff --git a/tests/vubr/virtqueue.h b/tests/vubr/virtqueue.h
new file mode 100644
index 0000000..b018cca
--- /dev/null
+++ b/tests/vubr/virtqueue.h
@@ -0,0 +1,17 @@
+#ifndef __VIRTQUEUE__
+#define __VIRTQUEUE__
+
+#include "virtio_ring.h"
+
+struct virtqueue {
+ int call_fd;
+ int kick_fd;
+ uint32_t size;
+ uint16_t last_avail_index;
+ uint16_t last_used_index;
+ struct virtq_desc* desc;
+ struct virtq_avail* avail;
+ struct virtq_used* used;
+};
+
+#endif /* __VIRTQUEUE__ */
diff --git a/tests/vubr/vubr_config.h b/tests/vubr/vubr_config.h
new file mode 100644
index 0000000..19681d0
--- /dev/null
+++ b/tests/vubr/vubr_config.h
@@ -0,0 +1,7 @@
+#ifndef __VHU_CONFIG__
+#define __VHU_CONFIG__
+
+#define VHOST_USER_SHOW_MGMT_TRAFFIC
+#define VHOST_USER_SHOW_NET_TRAFFIC
+
+#endif /* __VHU_CONFIG__ */
diff --git a/tests/vubr/vubr_device.h b/tests/vubr/vubr_device.h
new file mode 100644
index 0000000..04a0ecb
--- /dev/null
+++ b/tests/vubr/vubr_device.h
@@ -0,0 +1,41 @@
+#ifndef __VHU_DEVICE__
+#define __VHU_DEVICE__
+
+#include <arpa/inet.h>
+#include <sys/socket.h>
+
+#include "vhost.h"
+#include "virtqueue.h"
+#include "dispatcher.h"
+
+#define MAX_NR_VIRTQUEUE (8)
+
+struct vubr_device_region {
+ /* Guest physical address. */
+ uint64_t gpa;
+ /* Memory region size. */
+ uint64_t size;
+ /* QEMU virtual address (userspace). */
+ uint64_t qva;
+ /* Starting offset in our mmaped space. */
+ uint64_t mmap_offset;
+ /* Start address of mmapped space. */
+ uint64_t mmap_addr;
+};
+
+struct vubr_device {
+ int sock;
+ struct dispatcher dispatcher;
+ uint32_t nregions;
+ struct vubr_device_region regions[VHOST_MEMORY_MAX_NREGIONS];
+ struct virtqueue virtqueue[MAX_NR_VIRTQUEUE];
+ int backend_udp_sock;
+ struct sockaddr_in backend_udp_dest;
+};
+
+struct vubr_device *vubr_device_new(char *path);
+void vubr_device_run(struct vubr_device * dev);
+void vubr_device_backend_udp_setup(struct vubr_device *dev, char *local_host,
+ uint16_t local_port, char *dest_host, uint16_t dest_port);
+
+#endif /* __VHU_DEVICE__ */
diff --git a/tests/vubr/dispatcher.c b/tests/vubr/dispatcher.c
new file mode 100644
index 0000000..62d386a
--- /dev/null
+++ b/tests/vubr/dispatcher.c
@@ -0,0 +1,77 @@
+#include <stdio.h>
+#include <sys/select.h>
+
+#include "vubr_config.h"
+#include "dispatcher.h"
+
+int
+dispatcher_init(struct dispatcher *d)
+{
+ FD_ZERO(&d->fdset);
+ d->max_sock = -1;
+ return 0;
+}
+
+int
+dispatcher_add(struct dispatcher *d, int sock, void *ctx, callback_func cb)
+{
+ if (sock >= FD_SETSIZE) {
+ fprintf(stderr, "Error: Failed to add new event. sock %d should be less than %d\n",
+ sock, FD_SETSIZE);
+ return -1;
+ }
+
+ d->events[sock].ctx = ctx;
+ d->events[sock].callback = cb;
+
+ FD_SET(sock, &d->fdset);
+ if (sock > d->max_sock)
+ d->max_sock = sock;
+ printf("DEBUG: Added sock %d for watching. max_sock: %d\n", sock, d->max_sock);
+ return 0;
+}
+
+int
+dispatcher_remove(struct dispatcher *d, int sock)
+{
+ if (sock >= FD_SETSIZE) {
+ fprintf(stderr, "Error: Failed to remove event. sock %d should be less than %d\n",
+ sock, FD_SETSIZE);
+ return -1;
+ }
+
+ FD_CLR(sock, &d->fdset);
+ return 0;
+}
+
+/* timeout in us */
+int
+dispatcher_wait(struct dispatcher *d, uint32_t timeout)
+{
+ struct timeval tv;
+ tv.tv_sec = timeout / 1000000;
+ tv.tv_usec = timeout % 1000000;
+
+ fd_set fdset = d->fdset;
+
+ /* wait until some of sockets become readable. */
+ int rc = select(d->max_sock + 1, &fdset, 0, 0, &tv);
+
+ if (rc == -1)
+ perror("select");
+
+ /* Timeout */
+ if (rc == 0)
+ return 0;
+
+ /* Now call callback for every ready socket. */
+
+ int sock;
+ for (sock = 0; sock < d->max_sock + 1; sock++)
+ if (FD_ISSET(sock, &fdset)) {
+ struct event *e = &d->events[sock];
+ e->callback(sock, e->ctx);
+ }
+
+ return 0;
+}
diff --git a/tests/vubr/main.c b/tests/vubr/main.c
new file mode 100644
index 0000000..a7a3e9d
--- /dev/null
+++ b/tests/vubr/main.c
@@ -0,0 +1,18 @@
+#include "vubr_config.h"
+#include "vubr_device.h"
+
+int main(int argc, char* argv[])
+{
+ struct vubr_device *dev;
+
+ if((dev = vubr_device_new("/tmp/vubr.sock"))) {
+ vubr_device_backend_udp_setup(dev,
+ "127.0.0.1", 4444,
+ "127.0.0.1", 5555);
+
+ vubr_device_run(dev);
+ return 0;
+ }
+ else
+ return 1;
+}
diff --git a/tests/vubr/vhost_user.c b/tests/vubr/vhost_user.c
new file mode 100644
index 0000000..91ab09e
--- /dev/null
+++ b/tests/vubr/vhost_user.c
@@ -0,0 +1,83 @@
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <assert.h>
+#include <unistd.h>
+#include <errno.h>
+
+#include "vubr_config.h"
+#include "vhost_user.h"
+
+void
+vhost_user_message_read(int conn_fd, struct vhost_user_message *vmsg)
+{
+ int rc;
+ struct msghdr msg = {};
+ struct iovec iov;
+ size_t fd_size = VHOST_MEMORY_MAX_NREGIONS * sizeof(int);
+ char control[CMSG_SPACE(fd_size)];
+ memset(control, 0, sizeof(control));
+
+ iov.iov_base = (char *)vmsg;
+ iov.iov_len = VHOST_USER_HDR_SIZE;
+
+ msg.msg_iov = &iov;
+ msg.msg_iovlen = 1;
+ msg.msg_control = control;
+ msg.msg_controllen = sizeof(control);
+
+ rc = recvmsg(conn_fd, &msg, 0);
+
+ if (rc <= 0) {
+ perror("recvmsg");
+ exit(1);
+ }
+
+ vmsg->fd_num = 0;
+ struct cmsghdr *cmsg;
+ for (cmsg = CMSG_FIRSTHDR(&msg);
+ cmsg != NULL;
+ cmsg = CMSG_NXTHDR(&msg, cmsg))
+ {
+ if ((cmsg->cmsg_level == SOL_SOCKET) &&
+ (cmsg->cmsg_type == SCM_RIGHTS))
+ {
+ fd_size = cmsg->cmsg_len - CMSG_LEN(0);
+ vmsg->fd_num = fd_size / sizeof(int);
+ memcpy(vmsg->fds, CMSG_DATA(cmsg), fd_size);
+ break;
+ }
+ }
+
+ if (vmsg->size > sizeof(vmsg->payload)) {
+ fprintf(stderr, "Error: too big message request: %d, size: vmsg->size: %u, while sizeof(vmsg->payload) = %zu\n",
+ vmsg->request, vmsg->size, sizeof(vmsg->payload));
+ exit(1);
+ }
+
+ if (vmsg->size) {
+ rc = read(conn_fd, &vmsg->payload, vmsg->size);
+ if (rc <= 0) {
+ perror("read");
+ exit(1);
+ }
+
+ assert(rc == vmsg->size);
+ }
+}
+
+void
+vhost_user_message_write(int conn_fd, struct vhost_user_message *vmsg)
+{
+ int rc;
+ do {
+ rc = write(conn_fd, vmsg, VHOST_USER_HDR_SIZE + vmsg->size);
+ } while (rc < 0 && errno == EINTR);
+
+ if (rc < 0) {
+ perror("write");
+ exit(1);
+ }
+}
diff --git a/tests/vubr/vubr_device.c b/tests/vubr/vubr_device.c
new file mode 100644
index 0000000..a296aef
--- /dev/null
+++ b/tests/vubr/vubr_device.c
@@ -0,0 +1,773 @@
+#include <assert.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <inttypes.h>
+#include <string.h>
+#include <unistd.h>
+#include <errno.h>
+#include <sys/types.h>
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <sys/unistd.h>
+#include <sys/mman.h>
+#include <sys/eventfd.h>
+
+#include "vubr_config.h"
+#include "vhost_user.h"
+#include "virtio_net.h"
+#include "vubr_device.h"
+#include "virtqueue.h"
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+static char *vhost_user_request_str[] = {
+#define INFO(name,num) \
+ [num] = #name,
+VHOST_USER_REQUEST_LIST
+#undef INFO
+};
+#endif /* VHOST_USER_SHOW_MGMT_TRAFFIC */
+
+void
+die(char *s)
+{
+ perror(s);
+ exit(1);
+}
+
+static void
+print_buffer(uint8_t* buf, size_t len)
+{
+ int i;
+ printf("raw buffer:\n");
+ for(i = 0; i < len; i++) {
+ if ((i % 16) == 0)
+ printf("\n");
+ if ((i % 4) == 0)
+ printf(" ");
+ printf("%02x ", buf[i]);
+ }
+ printf("\n............................................................\n");
+}
+
+/* Translate guest physical address to our virtual address. */
+static uint64_t __attribute__((unused))
+gpa_to_va(struct vubr_device *dev, uint64_t guest_addr)
+{
+ int i;
+ /* Find matching memory region. */
+
+ for (i = 0; i < dev->nregions; i++) {
+ struct vubr_device_region *r = &dev->regions[i];
+
+ if ((guest_addr >= r->gpa) && (guest_addr < (r->gpa + r->size)))
+ return (guest_addr - r->gpa + r->mmap_addr + r->mmap_offset);
+ }
+
+ assert(!"address not found in regions");
+ return 0;
+}
+
+/* Translate qemu virtual address to our virtual address. */
+static uint64_t
+qva_to_va(struct vubr_device *dev, uint64_t qemu_addr)
+{
+ int i;
+ /* Find matching memory region. */
+
+ for (i = 0; i < dev->nregions; i++) {
+ struct vubr_device_region *r = &dev->regions[i];
+
+ if ((qemu_addr >= r->qva) && (qemu_addr < (r->qva + r->size)))
+ return (qemu_addr - r->qva + r->mmap_addr + r->mmap_offset);
+ }
+
+ assert(!"address not found in regions");
+ return 0;
+}
+
+static void vubr_device_backend_udp_sendbuf(struct vubr_device *dev, uint8_t *buf, size_t len);
+
+static void
+_consume_raw_packet(struct vubr_device *dev, uint8_t *buf, uint32_t len)
+{
+ int hdrlen = sizeof(struct virtio_net_hdr);
+
+#ifdef VHOST_USER_SHOW_NET_TRAFFIC
+ print_buffer(buf, len);
+#endif
+ vubr_device_backend_udp_sendbuf(dev, buf + hdrlen, len - hdrlen);
+}
+
+/* Kick the guest if necessary. */
+static void
+virtqueue_kick(struct virtqueue *vq)
+{
+ if (!(vq->avail->flags & VIRTQ_AVAIL_F_NO_INTERRUPT)) {
+ printf("Kicking the guest...\n");
+ eventfd_write(vq->call_fd, 1);
+ }
+}
+
+static void
+_post_buffer(struct vubr_device *dev, struct virtqueue *vq, uint8_t *buf, int32_t len)
+{
+ struct virtq_desc* desc = vq->desc;
+ struct virtq_avail* avail = vq->avail;
+ struct virtq_used* used = vq->used;
+
+ unsigned int size = vq->size;
+
+ uint16_t a_index = vq->last_avail_index % size;
+ uint16_t u_index = vq->last_used_index % size;
+ uint16_t d_index = avail->ring[a_index];
+
+ int i = d_index;
+
+#ifdef VHOST_USER_SHOW_NET_TRAFFIC
+ printf("Posting the packet to guest on vq:\n");
+ printf(" size = %d\n", vq->size);
+ printf(" last_avail_index = %d\n", vq->last_avail_index);
+ printf(" last_used_index = %d\n", vq->last_used_index);
+ printf(" a_index = %d\n", a_index);
+ printf(" u_index = %d\n", u_index);
+ printf(" d_index = %d\n", d_index);
+ printf(" desc[%d].addr = 0x%016"PRIx64"\n", i, desc[i].addr);
+ printf(" desc[%d].len = %d\n", i, desc[i].len);
+ printf(" desc[%d].flags = %d\n", i, desc[i].flags);
+ printf(" avail->idx = %d\n", avail->idx);
+ printf(" used->idx = %d\n", used->idx);
+#endif
+
+ if (!(desc[i].flags & VIRTQ_DESC_F_WRITE)) {
+ // FIXME: we should find writable descriptor
+ fprintf(stderr, "descriptor is not writable. exiting.\n");
+ exit(1);
+ }
+
+ void *chunk_start = (void *)gpa_to_va(dev, desc[i].addr);
+ uint32_t chunk_len = desc[i].len;
+
+ if (len <= chunk_len) {
+ memcpy(chunk_start, buf, len);
+ } else {
+ fprintf(stderr, "received too long packet from the backend. dropping...\n");
+ return;
+ }
+
+ /* Add descriptor to the used ring. */
+ used->ring[u_index].id = d_index;
+ used->ring[u_index].len = len;
+
+ vq->last_avail_index++;
+ vq->last_used_index++;
+
+ used->idx = vq->last_used_index;
+
+ /* Kick the guest if necessary. */
+ virtqueue_kick(vq);
+}
+
+static int
+_process_desc(struct vubr_device *dev, struct virtqueue *vq)
+{
+ struct virtq_desc* desc = vq->desc;
+ struct virtq_avail* avail = vq->avail;
+ struct virtq_used* used = vq->used;
+
+ unsigned int size = vq->size;
+
+ uint16_t a_index = vq->last_avail_index % size;
+ uint16_t u_index = vq->last_used_index % size;
+ uint16_t d_index = avail->ring[a_index];
+
+ uint32_t i, len = 0;
+ size_t buf_size = 4096;
+ uint8_t buf[4096];
+
+#ifdef VHOST_USER_SHOW_NET_TRAFFIC
+ printf("chunks: ");
+#endif
+
+ i = d_index;
+ do {
+ void *chunk_start = (void *)gpa_to_va(dev, desc[i].addr);
+ uint32_t chunk_len = desc[i].len;
+
+ if (len + chunk_len < buf_size) {
+ memcpy(buf + len, chunk_start, chunk_len);
+#ifdef VHOST_USER_SHOW_NET_TRAFFIC
+ printf("%d ", chunk_len);
+#endif
+ } else {
+ fprintf(stderr, "too long packet. dropping...\n");
+ break;
+ }
+
+ len += chunk_len;
+
+ if (!(desc[i].flags & VIRTQ_DESC_F_NEXT))
+ break;
+
+ i = desc[i].next;
+ } while(1);
+
+ if (!len)
+ return -1;
+
+ /* Add descriptor to the used ring. */
+ used->ring[u_index].id = d_index;
+ used->ring[u_index].len = len;
+
+#ifdef VHOST_USER_SHOW_NET_TRAFFIC
+ printf("\n");
+#endif
+
+ _consume_raw_packet(dev, buf, len);
+
+ return 0;
+}
+
+static void
+_process_avail(struct vubr_device *dev, struct virtqueue *vq)
+{
+ struct virtq_avail *avail = vq->avail;
+ struct virtq_used* used = vq->used;
+
+ while(vq->last_avail_index != avail->idx) {
+ _process_desc(dev, vq);
+ vq->last_avail_index++;
+ vq->last_used_index++;
+ }
+
+ used->idx = vq->last_used_index;
+}
+
+static int vubr_device_backend_udp_recvbuf(struct vubr_device *dev, uint8_t *buf, size_t buflen);
+
+static void
+_backend_recv_cb(int sock, void *ctx)
+{
+ printf("\n\n *** IN UDP RECEIVE CALLBACK ***\n\n");
+ struct vubr_device *dev = (struct vubr_device *) ctx;
+ struct virtqueue *rx_vq = &dev->virtqueue[0];
+#define BUFLEN 4096
+ uint8_t buf[BUFLEN];
+ int len;
+ struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)buf;
+ int hdrlen = sizeof(struct virtio_net_hdr);
+
+ *hdr = (struct virtio_net_hdr) {};
+ hdr->num_buffers = 1;
+
+ len = vubr_device_backend_udp_recvbuf(dev, buf + hdrlen, BUFLEN - hdrlen);
+ _post_buffer(dev, rx_vq, buf, len + hdrlen);
+#undef BUFLEN
+}
+
+static void
+_kick_cb(int sock, void *ctx)
+{
+ struct vubr_device *dev = (struct vubr_device *) ctx;
+ eventfd_t kick_data;
+ ssize_t rc;
+
+ rc = eventfd_read(sock, &kick_data);
+
+ if (rc == -1) {
+ perror("read kick");
+ exit(1);
+ } else {
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("Got kick_data: %016"PRIx64"\n", kick_data);
+#endif
+ _process_avail(dev, &dev->virtqueue[1]);
+ }
+}
+
+static int
+_execute_NONE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_GET_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ vmsg->payload.u64 =
+ ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+ (1ULL << VIRTIO_NET_F_CTRL_VQ) |
+ (1ULL << VIRTIO_NET_F_CTRL_RX) |
+ (1ULL << VHOST_F_LOG_ALL));
+ vmsg->size = sizeof(vmsg->payload.u64);
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("returning u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+
+ /* reply */
+ return 1;
+}
+
+static int
+_execute_SET_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+ return 0;
+}
+
+static int
+_execute_SET_OWNER(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_RESET_DEVICE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_SET_MEM_TABLE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("Nregions: %d\n", vmsg->payload.memory.nregions);
+
+ struct vhost_memory *memory = &vmsg->payload.memory;
+ dev->nregions = memory->nregions;
+ int i;
+ for (i = 0; i < dev->nregions; i++) {
+ struct vhost_memory_region *msg_region = &memory->regions[i];
+ struct vubr_device_region *dev_region = &dev->regions[i];
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("Region %d\n", i);
+ printf(" guest_phys_addr: 0x%016"PRIx64"\n", msg_region->guest_phys_addr);
+ printf(" memory_size: 0x%016"PRIx64"\n", msg_region->memory_size);
+ printf(" userspace_addr 0x%016"PRIx64"\n", msg_region->userspace_addr);
+ printf(" mmap_offset 0x%016"PRIx64"\n", msg_region->mmap_offset);
+#endif
+
+ dev_region->gpa = msg_region->guest_phys_addr;
+ dev_region->size = msg_region->memory_size;
+ dev_region->qva = msg_region->userspace_addr;
+ dev_region->mmap_offset = msg_region->mmap_offset;
+
+ void *mmap_addr;
+
+ /* We don't use offset argument of mmap() since the
+ * mapped address has to be page aligned, and we use huge
+ * pages. */
+ mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
+ PROT_READ | PROT_WRITE, MAP_SHARED,
+ vmsg->fds[i], 0);
+
+ if (mmap_addr == MAP_FAILED) {
+ perror("mmap");
+ exit (1);
+ }
+
+ dev_region->mmap_addr = (uint64_t) mmap_addr;
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf(" mmap_addr: 0x%016"PRIx64"\n", dev_region->mmap_addr);
+#endif
+ }
+
+ return 0;
+}
+
+static int
+_execute_SET_LOG_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_SET_LOG_FD(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_SET_VRING_NUM(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ unsigned int index = vmsg->payload.state.index;
+ unsigned int num = vmsg->payload.state.num;
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("state.index: %d\n", index);
+ printf("state.num: %d\n", num);
+#endif
+ dev->virtqueue[index].size = num;
+ return 0;
+}
+
+static int
+_execute_SET_VRING_ADDR(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ struct vhost_vring_addr *vra = &vmsg->payload.addr;
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("vhost_vring_addr:\n");
+ printf(" index: %d\n", vra->index);
+ printf(" flags: %d\n", vra->flags);
+ printf(" desc_user_addr: 0x%016"PRIx64"\n", vra->desc_user_addr);
+ printf(" used_user_addr: 0x%016"PRIx64"\n", vra->used_user_addr);
+ printf(" avail_user_addr: 0x%016"PRIx64"\n", vra->avail_user_addr);
+ printf(" log_guest_addr: 0x%016"PRIx64"\n", vra->log_guest_addr);
+#endif
+
+ unsigned int index = vra->index;
+ struct virtqueue *vq = &dev->virtqueue[index];
+
+ vq->desc = (struct virtq_desc *)qva_to_va(dev, vra->desc_user_addr);
+ vq->used = (struct virtq_used *)qva_to_va(dev, vra->used_user_addr);
+ vq->avail = (struct virtq_avail *)qva_to_va(dev, vra->avail_user_addr);
+
+ printf("Setting virtq addresses:\n");
+ printf(" virtq_desc at %p\n", vq->desc);
+ printf(" virtq_used at %p\n", vq->used);
+ printf(" virtq_avail at %p\n", vq->avail);
+
+ vq->last_used_index = vq->used->idx;
+ return 0;
+}
+
+static int
+_execute_SET_VRING_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ unsigned int index = vmsg->payload.state.index;
+ unsigned int num = vmsg->payload.state.num;
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("state.index: %d\n", index);
+ printf("state.num: %d\n", num);
+#endif
+ dev->virtqueue[index].last_avail_index = num;
+
+ return 0;
+}
+
+static int
+_execute_GET_VRING_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_SET_VRING_KICK(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+
+ uint64_t u64_arg = vmsg->payload.u64;
+ int index = u64_arg & VHOST_USER_VRING_IDX_MASK;
+
+ assert((u64_arg & VHOST_USER_VRING_NOFD_MASK) == 0);
+ assert(vmsg->fd_num == 1);
+
+ dev->virtqueue[index].kick_fd = vmsg->fds[0];
+ printf("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
+
+ if (index % 2 == 1) {
+ /* TX queue. */
+ dispatcher_add(&dev->dispatcher, dev->virtqueue[index].kick_fd, dev, _kick_cb);
+
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("Waiting for kicks on fd: %d for vq: %d\n",
+ dev->virtqueue[index].kick_fd, index);
+#endif
+ }
+ return 0;
+}
+
+static int
+_execute_SET_VRING_CALL(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+
+ uint64_t u64_arg = vmsg->payload.u64;
+ int index = u64_arg & VHOST_USER_VRING_IDX_MASK;
+
+ assert((u64_arg & VHOST_USER_VRING_NOFD_MASK) == 0);
+ assert(vmsg->fd_num == 1);
+
+ dev->virtqueue[index].call_fd = vmsg->fds[0];
+ printf("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
+
+ return 0;
+}
+
+static int
+_execute_SET_VRING_ERR(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+ return 0;
+}
+
+static int
+_execute_GET_PROTOCOL_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ /* FIXME: unimplemented */
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+ return 0;
+}
+
+static int
+_execute_SET_PROTOCOL_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ /* FIXME: unimplemented */
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
+#endif
+ return 0;
+}
+
+static int
+_execute_GET_QUEUE_NUM(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+_execute_SET_VRING_ENABLE(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+ printf("function %s() not implemented yet.\n", __FUNCTION__);
+ return 0;
+}
+
+static int
+vubr_device_execute_request(struct vubr_device *dev, struct vhost_user_message *vmsg)
+{
+#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
+ /* Print out generic part of the request. */
+ printf("======================= Vhost user message from QEMU =======================\n");
+ printf("Request: %s (%d)\n", vhost_user_request_str[vmsg->request], vmsg->request);
+ printf("Flags: 0x%x\n", vmsg->flags);
+ printf("Size: %d\n", vmsg->size);
+
+ if (vmsg->fd_num) {
+ int i;
+ printf("Fds:");
+ for (i = 0; i < vmsg->fd_num; i++)
+ printf(" %d", vmsg->fds[i]);
+ printf("\n");
+ }
+#endif /* VHOST_USER_SHOW_MGMT_TRAFFIC */
+
+ switch (vmsg->request) {
+ case VHOST_USER_NONE:
+ return _execute_NONE(dev, vmsg);
+ case VHOST_USER_GET_FEATURES:
+ return _execute_GET_FEATURES(dev, vmsg);
+ case VHOST_USER_SET_FEATURES:
+ return _execute_SET_FEATURES(dev, vmsg);
+ case VHOST_USER_SET_OWNER:
+ return _execute_SET_OWNER(dev, vmsg);
+ case VHOST_USER_RESET_DEVICE:
+ return _execute_RESET_DEVICE(dev, vmsg);
+ case VHOST_USER_SET_MEM_TABLE:
+ return _execute_SET_MEM_TABLE(dev, vmsg);
+ case VHOST_USER_SET_LOG_BASE:
+ return _execute_SET_LOG_BASE(dev, vmsg);
+ case VHOST_USER_SET_LOG_FD:
+ return _execute_SET_LOG_FD(dev, vmsg);
+ case VHOST_USER_SET_VRING_NUM:
+ return _execute_SET_VRING_NUM(dev, vmsg);
+ case VHOST_USER_SET_VRING_ADDR:
+ return _execute_SET_VRING_ADDR(dev, vmsg);
+ case VHOST_USER_SET_VRING_BASE:
+ return _execute_SET_VRING_BASE(dev, vmsg);
+ case VHOST_USER_GET_VRING_BASE:
+ return _execute_GET_VRING_BASE(dev, vmsg);
+ case VHOST_USER_SET_VRING_KICK:
+ return _execute_SET_VRING_KICK(dev, vmsg);
+ case VHOST_USER_SET_VRING_CALL:
+ return _execute_SET_VRING_CALL(dev, vmsg);
+ case VHOST_USER_SET_VRING_ERR:
+ return _execute_SET_VRING_ERR(dev, vmsg);
+ case VHOST_USER_GET_PROTOCOL_FEATURES:
+ return _execute_GET_PROTOCOL_FEATURES(dev, vmsg);
+ case VHOST_USER_SET_PROTOCOL_FEATURES:
+ return _execute_SET_PROTOCOL_FEATURES(dev, vmsg);
+ case VHOST_USER_GET_QUEUE_NUM:
+ return _execute_GET_QUEUE_NUM(dev, vmsg);
+ case VHOST_USER_SET_VRING_ENABLE:
+ return _execute_SET_VRING_ENABLE(dev, vmsg);
+ case VHOST_USER_MAX:
+ assert(vmsg->request != VHOST_USER_MAX);
+ }
+ return 0;
+}
+
+static void
+vubr_device_receive_cb(int sock, void *ctx)
+{
+ struct vubr_device *dev = (struct vubr_device *) ctx;
+ struct vhost_user_message vmsg;
+
+ vhost_user_message_read(sock, &vmsg);
+
+ int reply_requested = vubr_device_execute_request(dev, &vmsg);
+
+ if (reply_requested) {
+ /* Set the version in the flags when sending the reply */
+ vmsg.flags &= ~VHOST_USER_VERSION_MASK;
+ vmsg.flags |= VHOST_USER_VERSION;
+ vmsg.flags |= VHOST_USER_REPLY_MASK;
+ vhost_user_message_write(sock, &vmsg);
+ }
+}
+
+static void
+vubr_device_accept_cb(int sock, void *ctx)
+{
+ struct vubr_device *dev = (struct vubr_device *)ctx;
+ int conn_fd;
+ struct sockaddr_un un;
+ socklen_t len = sizeof(un);
+
+ if ((conn_fd = accept(sock, (struct sockaddr *) &un, &len)) == -1) {
+ perror("accept");
+ exit(1);
+ }
+
+ printf("DEBUG: Got connection from remote peer on sock %d\n", conn_fd);
+ dispatcher_add(&dev->dispatcher, conn_fd, ctx, vubr_device_receive_cb);
+}
+
+struct vubr_device *
+vubr_device_new(char *path)
+{
+ struct vubr_device *dev =
+ (struct vubr_device *) calloc(1, sizeof(struct vubr_device));
+
+ dev->nregions = 0;
+
+ int i;
+ for (i = 0; i < MAX_NR_VIRTQUEUE; i++)
+ dev->virtqueue[i] = (struct virtqueue) {
+ .call_fd = -1, .kick_fd = -1,
+ .size = 0,
+ .last_avail_index = 0, .last_used_index = 0,
+ .desc = 0, .avail = 0, .used = 0,
+ };
+
+ /* Get a UNIX socket. */
+ if ((dev->sock = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
+ perror("socket");
+ exit(1);
+ }
+
+ struct sockaddr_un un;
+ un.sun_family = AF_UNIX;
+ strcpy(un.sun_path, path);
+
+ size_t len = sizeof(un.sun_family) + strlen(path);
+
+ unlink(path);
+
+ if (bind(dev->sock, (struct sockaddr *) &un, len) == -1) {
+ perror("bind");
+ exit(1);
+ }
+
+ if (listen(dev->sock, 1) == -1) {
+ perror("listen");
+ exit(1);
+ }
+
+ dispatcher_init(&dev->dispatcher);
+ dispatcher_add(&dev->dispatcher, dev->sock, (void*) dev, vubr_device_accept_cb);
+
+ printf("Waiting for connections on UNIX socket %s ...\n", path);
+ return dev;
+}
+
+void
+vubr_device_backend_udp_setup(struct vubr_device *dev,
+ char *local_host,
+ uint16_t local_port,
+ char *dest_host,
+ uint16_t dest_port)
+{
+
+ struct sockaddr_in si_local;
+ int sock;
+
+ if ((sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1)
+ die("socket");
+
+ memset((char *) &si_local, 0, sizeof(struct sockaddr_in));
+ si_local.sin_family = AF_INET;
+ si_local.sin_port = htons(local_port);
+ if(inet_aton(local_host, &si_local.sin_addr) == 0) {
+ fprintf(stderr, "inet_aton() failed.\n");
+ exit(1);
+ }
+
+ if( bind(sock, (struct sockaddr*)&si_local, sizeof(si_local) ) == -1)
+ die("bind");
+
+ /* setup destination for sends */
+ struct sockaddr_in *si_remote = &dev->backend_udp_dest;
+ memset((char *) si_remote, 0, sizeof(struct sockaddr_in));
+ si_remote->sin_family = AF_INET;
+ si_remote->sin_port = htons(dest_port);
+ if(inet_aton(dest_host, &si_remote->sin_addr) == 0) {
+ fprintf(stderr, "inet_aton() failed.\n");
+ exit(1);
+ }
+
+ dev->backend_udp_sock = sock;
+ dispatcher_add(&dev->dispatcher, sock, dev, _backend_recv_cb);
+ printf("Waiting for data from udp backend on %s:%d...\n", local_host, local_port);
+}
+
+static void
+vubr_device_backend_udp_sendbuf(struct vubr_device *dev, uint8_t *buf, size_t len)
+{
+ int slen = sizeof(struct sockaddr_in);
+
+ if (sendto(dev->backend_udp_sock, buf, len, 0, (struct sockaddr *) &dev->backend_udp_dest, slen) == -1)
+ die("sendto()");
+}
+
+static int
+vubr_device_backend_udp_recvbuf(struct vubr_device *dev, uint8_t *buf, size_t buflen)
+{
+ int slen = sizeof(struct sockaddr_in);
+ int rc;
+
+ if ((rc = recvfrom(dev->backend_udp_sock, buf, buflen, 0,
+ (struct sockaddr *) &dev->backend_udp_dest,
+ (socklen_t *)&slen)) == -1)
+ die("recvfrom()");
+
+ return rc;
+}
+
+void
+vubr_device_run(struct vubr_device * dev)
+{
+ while (1) {
+ /* timeout 200ms */
+ dispatcher_wait(&dev->dispatcher, 200000);
+ /* Here one can try polling strategy. */
+ }
+}
diff --git a/tests/vubr/Makefile b/tests/vubr/Makefile
new file mode 100644
index 0000000..c3400fb
--- /dev/null
+++ b/tests/vubr/Makefile
@@ -0,0 +1,15 @@
+SRCS=dispatcher.c vhost_user.c vubr_device.c main.c
+INCLUDES+=vhost.h virtio_ring.h virtio_net.h
+INCLUDES+=vubr_config.h vhost_user.h virtqueue.h
+INCLUDES+=dispatcher.h vubr_device.h
+
+EXE=vubr
+CFLAGS += -m64 -Wall -Werror -g
+
+all: $(EXE)
+
+$(EXE): $(SRCS) $(INCLUDES)
+ $(CC) $(CFLAGS) $(SRCS) -o $@
+
+clean:
+ rm -f $(EXE)
--
--Victor
* Re: [Qemu-devel] [PATCH] Add vhost-user test application (Vubr)
2015-10-25 17:42 [Qemu-devel] [PATCH] Add vhost-user test application (Vubr) Victor Kaplansky
@ 2015-10-25 19:02 ` Michael S. Tsirkin
2015-10-25 19:52 ` Michael S. Tsirkin
1 sibling, 0 replies; 3+ messages in thread
From: Michael S. Tsirkin @ 2015-10-25 19:02 UTC (permalink / raw)
To: Victor Kaplansky; +Cc: qemu-devel, Eduardo Habkost
On Sun, Oct 25, 2015 at 07:42:00PM +0200, Victor Kaplansky wrote:
> QEMU is missing a good test for vhost-user feature,
> so I've created a sample vhost-user application, which
> called Vubr (mst coined the name, but better
> suggestions will be appreciated).
Short for Vhost-User Bridge. Not very pretty, hopefully
someone can come up with a better name.
Maybe just vhost-user-bridge, avoiding acronyms.
--
MST
* Re: [Qemu-devel] [PATCH] Add vhost-user test application (Vubr)
2015-10-25 17:42 [Qemu-devel] [PATCH] Add vhost-user test application (Vubr) Victor Kaplansky
2015-10-25 19:02 ` Michael S. Tsirkin
@ 2015-10-25 19:52 ` Michael S. Tsirkin
1 sibling, 0 replies; 3+ messages in thread
From: Michael S. Tsirkin @ 2015-10-25 19:52 UTC (permalink / raw)
To: Victor Kaplansky; +Cc: qemu-devel, Eduardo Habkost
On Sun, Oct 25, 2015 at 07:42:00PM +0200, Victor Kaplansky wrote:
> QEMU is missing a good test for vhost-user feature,
The existing test is good actually.
It does not, however, allow actual traffic,
so at best it tests the management protocol.
> so I've created a sample vhost-user application, which
> called Vubr (mst coined the name, but better
> suggestions will be appreciated). Vubr may later serve
> the QEMU community as vhost-user QEMU internal test.
> Essentially Vubr is a very basic vhost-user backend for QEMU,
> It runs as a separate user-level process. For packet
> processing Vubr uses an additional QEMU instance with a backend
> configured by "-net socket" as a shared VLAN. This way another
> QEMU virtual machine can effectively make a bus by means of
> UDP communication.
>
> For a more simple setup, the another QEMU instance running
> the SLiRP backed can be the same QEMU instance running vhost-user
> client.
>
> The Vubr implementation is very preliminary. It is missing many
> features. I has been studying vhost-user protocol internals,
> so I've wrote Vubr bit by bit as I progressed through the
> protocol. Most probably internal architecture will change
> significantly.
>
> To run Vubr application:
>
> Build vubr with:
>
> $ cd qemu/tests/vubr; make
>
> Ensure the machine has hugepages enabled in kernel with command line
> like: default_hugepagesz=2M hugepagesz=2M hugepages=2048
>
> Run it with:
>
> $ ./vubr
>
> The above will run vhost-user server listening for connections
> on UNIX domain socket /tmp/vubr.sock, and will try to connect
> by UDP to VLAN bridge to localhost:5555, while listening on
> localhost:4444
>
> Run qemu with a virtio-net backed by vhost-user:
>
> $ qemu \
> -enable-kvm -m 512 -smp 2 \
> -object memory-backend-file,id=mem,size=512M,mem-path=/dev/hugepages,share=on \
> -numa node,memdev=mem -mem-prealloc \
> -chardev socket,id=char0,path=/tmp/vubr.sock \
> -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
> -device virtio-net-pci,netdev=mynet1 \
> -net none \
> -net socket,vlan=0,udp=localhost:4444,localaddr=localhost:5555 \
> -net user,vlan=0 \
> disk.img
>
> Vubr tested very lightly: it's able to bringup a linux on client VM
> with virtio-net driver, and execute transmits and receives to the
> internet. I tested with "wget redhat.com", "dig redhat.com".
>
> PS. I've consulted DPDK's code for vhost-user during Vubr
> implementation.
> Signed-off-by: Victor Kaplansky <victork@redhat.com>
Thanks for working on this.
This needs a bit more work, especially from a coding
style perspective. Please read CODING_STYLE and
follow the rules. I pointed out some violations
but by no means all of them.
checkpatch.pl might catch some violations, too.
> ---
> tests/vubr/dispatcher.h | 26 ++
> tests/vubr/vhost.h | 77 +++++
> tests/vubr/vhost_user.h | 70 +++++
> tests/vubr/virtio_net.h | 38 +++
> tests/vubr/virtio_ring.h | 103 +++++++
> tests/vubr/virtqueue.h | 17 ++
> tests/vubr/vubr_config.h | 7 +
> tests/vubr/vubr_device.h | 41 +++
> tests/vubr/dispatcher.c | 77 +++++
> tests/vubr/main.c | 18 ++
> tests/vubr/vhost_user.c | 83 +++++
> tests/vubr/vubr_device.c | 773 +++++++++++++++++++++++++++++++++++++++++++++++
> tests/vubr/Makefile | 15 +
> 13 files changed, 1345 insertions(+)
> create mode 100644 tests/vubr/dispatcher.h
> create mode 100644 tests/vubr/vhost.h
> create mode 100644 tests/vubr/vhost_user.h
> create mode 100644 tests/vubr/virtio_net.h
> create mode 100644 tests/vubr/virtio_ring.h
> create mode 100644 tests/vubr/virtqueue.h
> create mode 100644 tests/vubr/vubr_config.h
> create mode 100644 tests/vubr/vubr_device.h
> create mode 100644 tests/vubr/dispatcher.c
> create mode 100644 tests/vubr/main.c
> create mode 100644 tests/vubr/vhost_user.c
> create mode 100644 tests/vubr/vubr_device.c
> create mode 100644 tests/vubr/Makefile
>
> diff --git a/tests/vubr/dispatcher.h b/tests/vubr/dispatcher.h
> new file mode 100644
> index 0000000..cd02f07
> --- /dev/null
> +++ b/tests/vubr/dispatcher.h
> @@ -0,0 +1,26 @@
> +#ifndef __DISPATCHER__
> +#define __DISPATCHER__
> +
> +#include <stddef.h>
> +#include <stdint.h>
> +#include <sys/select.h>
> +
> +typedef void (*callback_func)(int sock, void *ctx);
> +
> +struct event {
> + void *ctx;
> + callback_func callback;
> +};
> +
> +struct dispatcher {
> + int max_sock;
> + fd_set fdset;
> + struct event events[FD_SETSIZE];
> +};
> +
> +int dispatcher_init(struct dispatcher *d);
> +int dispatcher_add(struct dispatcher *d, int sock, void *ctx, callback_func cb);
> +int dispatcher_remove(struct dispatcher *d, int sock);
> +int dispatcher_wait(struct dispatcher *d, uint32_t timeout);
> +
> +#endif /* __DISPATCHER__ */
> diff --git a/tests/vubr/vhost.h b/tests/vubr/vhost.h
> new file mode 100644
> index 0000000..3960cc2
> --- /dev/null
> +++ b/tests/vubr/vhost.h
> @@ -0,0 +1,77 @@
> +#ifndef __VHOST_H__
> +#define __VHOST_H__
> +
> +#include <inttypes.h>
> +
> +/* Most imported form qemu/linux-headers/linux/vhost.h
> + *
> + * Userspace interface for virtio structures. */
> +
> +struct vhost_vring_state {
> + unsigned int index;
> + unsigned int num;
> +};
> +
> +struct vhost_vring_file {
> + unsigned int index;
> + int fd; /* Pass -1 to unbind from file. */
> +};
> +
> +struct vhost_vring_addr {
> + unsigned int index;
> + /* Option flags. */
> + unsigned int flags;
> + /* Flag values: */
> + /* Whether log address is valid. If set enables logging. */
> +#define VHOST_VRING_F_LOG 0
> +
> + /* Start of array of descriptors (virtually contiguous) */
> + uint64_t desc_user_addr;
> + /* Used structure address. Must be 32 bit aligned */
> + uint64_t used_user_addr;
> + /* Available structure address. Must be 16 bit aligned */
> + uint64_t avail_user_addr;
> + /* Logging support. */
> + /* Log writes to used structure, at offset calculated from specified
> + * address. Address must be 32 bit aligned. */
> + uint64_t log_guest_addr;
> +};
> +
> +#define VHOST_MEMORY_MAX_NREGIONS (8)
> +
> +struct vhost_memory_region {
> + uint64_t guest_phys_addr;
> + uint64_t memory_size; /* bytes */
> + uint64_t userspace_addr;
> + uint64_t mmap_offset;
> +};
> +
> +struct vhost_memory {
> + uint32_t nregions;
> + uint32_t padding;
> + struct vhost_memory_region regions[VHOST_MEMORY_MAX_NREGIONS];
> +};
> +
> +/* Feature bits */
> +/* Log all write descriptors. Can be changed while device is active. */
> +#define VHOST_F_LOG_ALL 26
> +/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
> +#define VHOST_NET_F_VIRTIO_NET_HDR 27
> +
> +struct virtio_net_hdr {
> +#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
> + uint8_t flags;
> +#define VIRTIO_NET_HDR_GSO_NONE 0
> +#define VIRTIO_NET_HDR_GSO_TCPV4 1
> +#define VIRTIO_NET_HDR_GSO_UDP 3
> +#define VIRTIO_NET_HDR_GSO_TCPV6 4
> +#define VIRTIO_NET_HDR_GSO_ECN 0x80
> + uint8_t gso_type;
> + uint16_t hdr_len;
> + uint16_t gso_size;
> + uint16_t csum_start;
> + uint16_t csum_offset;
> + uint16_t num_buffers;
> +};
Pls use this one from standard-headers.
> +
> +#endif /* __VHOST_H__ */
> diff --git a/tests/vubr/vhost_user.h b/tests/vubr/vhost_user.h
> new file mode 100644
> index 0000000..44d82d0
> --- /dev/null
> +++ b/tests/vubr/vhost_user.h
> @@ -0,0 +1,70 @@
> +#ifndef __VHOST_USER_H__
> +#define __VHOST_USER_H__
> +
> +/* Based on qemu/hw/virtio/vhost-user.c */
> +
> +#include <stdint.h>
> +#include <stddef.h>
> +#include "vhost.h"
> +
> +#define VHOST_USER_F_PROTOCOL_FEATURES 30
> +#define VHOST_USER_PROTOCOL_FEATURE_MASK 0x1ULL
> +
> +#define VHOST_USER_PROTOCOL_F_MQ 0
> +
> +#define VHOST_USER_REQUEST_LIST \
> + INFO(NONE, 0) \
> + INFO(GET_FEATURES, 1) \
> + INFO(SET_FEATURES, 2) \
> + INFO(SET_OWNER, 3) \
> + INFO(RESET_DEVICE, 4) \
> + INFO(SET_MEM_TABLE, 5) \
> + INFO(SET_LOG_BASE, 6) \
> + INFO(SET_LOG_FD, 7) \
> + INFO(SET_VRING_NUM, 8) \
> + INFO(SET_VRING_ADDR, 9) \
> + INFO(SET_VRING_BASE, 10) \
> + INFO(GET_VRING_BASE, 11) \
> + INFO(SET_VRING_KICK, 12) \
> + INFO(SET_VRING_CALL, 13) \
> + INFO(SET_VRING_ERR, 14) \
> + INFO(GET_PROTOCOL_FEATURES, 15) \
> + INFO(SET_PROTOCOL_FEATURES, 16) \
> + INFO(GET_QUEUE_NUM, 17) \
> + INFO(SET_VRING_ENABLE, 18)
> +
> +enum vhost_user_request {
> +#define INFO(a, b) VHOST_USER_ ## a = b,
I don't think these macro tricks are really justified,
and they break tag searches for most editors.
Let's keep it simple.
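For illustration only, a plain enum with an explicit name lookup could
look roughly like the sketch below. It lists just a few of the requests
(values copied from the list in the patch), and vubr_request_name() is a
made-up helper, not an existing QEMU function.

```c
#include <stddef.h>

/* Request codes written out explicitly, so tags and grep can find them.
 * Only a subset is shown; the values match the list in the patch. */
enum vhost_user_request {
    VHOST_USER_NONE = 0,
    VHOST_USER_GET_FEATURES = 1,
    VHOST_USER_SET_FEATURES = 2,
    VHOST_USER_SET_VRING_KICK = 12,
    VHOST_USER_SET_VRING_CALL = 13
};

/* Hypothetical helper: map a request code to a printable name.
 * Unknown codes get a fixed string instead of an out-of-bounds read. */
const char *vubr_request_name(enum vhost_user_request req)
{
    switch (req) {
    case VHOST_USER_NONE:
        return "NONE";
    case VHOST_USER_GET_FEATURES:
        return "GET_FEATURES";
    case VHOST_USER_SET_FEATURES:
        return "SET_FEATURES";
    case VHOST_USER_SET_VRING_KICK:
        return "SET_VRING_KICK";
    case VHOST_USER_SET_VRING_CALL:
        return "SET_VRING_CALL";
    default:
        return "UNKNOWN";
    }
}
```

A side effect of the switch is that a bogus request number from the
socket prints as "UNKNOWN" rather than indexing past a string table.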
> + VHOST_USER_REQUEST_LIST
> +#undef INFO
> + VHOST_USER_MAX
> +};
> +
> +struct vhost_user_message {
Pls fix names to adhere to QEMU coding style,
here and elsewhere.
> + enum vhost_user_request request;
> +
> +#define VHOST_USER_VERSION_MASK (0x3)
> +#define VHOST_USER_REPLY_MASK (0x1<<2)
> + uint32_t flags;
> + uint32_t size; /* the following payload size */
> + union {
> +#define VHOST_USER_VRING_IDX_MASK (0xff)
> +#define VHOST_USER_VRING_NOFD_MASK (0x1<<8)
> + uint64_t u64;
> + struct vhost_vring_state state;
> + struct vhost_vring_addr addr;
> + struct vhost_memory memory;
> + } payload;
> + int fds[VHOST_MEMORY_MAX_NREGIONS];
> + int fd_num;
> +} __attribute__((packed));
> +
> +#define VHOST_USER_HDR_SIZE offsetof(struct vhost_user_message, payload.u64)
> +
> +/* The version of the protocol we support */
> +#define VHOST_USER_VERSION (0x1)
> +
> +void vhost_user_message_read(int conn_fd, struct vhost_user_message *vmsg);
> +void vhost_user_message_write(int conn_fd, struct vhost_user_message *vmsg);
> +
> +#endif /* __VHOST_USER_H__ */
> diff --git a/tests/vubr/virtio_net.h b/tests/vubr/virtio_net.h
> new file mode 100644
> index 0000000..f6f87b1
> --- /dev/null
> +++ b/tests/vubr/virtio_net.h
> @@ -0,0 +1,38 @@
> +#ifndef __VIRTIO_NET_H__
> +#define __VIRTIO_NET_H__
> +
> +/* Form qemu/include/standard-headers/linux/virtio_net.h */
From?
Pls just include that header.
> +
> +/* The feature bitmap for virtio net */
> +#define VIRTIO_NET_F_CSUM 0 /* Host handles pkts w/ partial csum */
> +#define VIRTIO_NET_F_GUEST_CSUM 1 /* Guest handles pkts w/ partial csum */
> +#define VIRTIO_NET_F_CTRL_GUEST_OFFLOADS 2 /* Dynamic offload configuration. */
> +#define VIRTIO_NET_F_MAC 5 /* Host has given MAC address. */
> +#define VIRTIO_NET_F_GUEST_TSO4 7 /* Guest can handle TSOv4 in. */
> +#define VIRTIO_NET_F_GUEST_TSO6 8 /* Guest can handle TSOv6 in. */
> +#define VIRTIO_NET_F_GUEST_ECN 9 /* Guest can handle TSO[6] w/ ECN in. */
> +#define VIRTIO_NET_F_GUEST_UFO 10 /* Guest can handle UFO in. */
> +#define VIRTIO_NET_F_HOST_TSO4 11 /* Host can handle TSOv4 in. */
> +#define VIRTIO_NET_F_HOST_TSO6 12 /* Host can handle TSOv6 in. */
> +#define VIRTIO_NET_F_HOST_ECN 13 /* Host can handle TSO[6] w/ ECN in. */
> +#define VIRTIO_NET_F_HOST_UFO 14 /* Host can handle UFO in. */
> +#define VIRTIO_NET_F_MRG_RXBUF 15 /* Host can merge receive buffers. */
> +#define VIRTIO_NET_F_STATUS 16 /* virtio_net_config.status available */
> +#define VIRTIO_NET_F_CTRL_VQ 17 /* Control channel available */
> +#define VIRTIO_NET_F_CTRL_RX 18 /* Control channel RX mode support */
> +#define VIRTIO_NET_F_CTRL_VLAN 19 /* Control channel VLAN filtering */
> +#define VIRTIO_NET_F_CTRL_RX_EXTRA 20 /* Extra RX mode control support */
> +#define VIRTIO_NET_F_GUEST_ANNOUNCE 21 /* Guest can announce device on the
> + * network */
> +#define VIRTIO_NET_F_MQ 22 /* Device supports Receive Flow
> + * Steering */
> +#define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
> +
> +#ifndef VIRTIO_NET_NO_LEGACY
> +#define VIRTIO_NET_F_GSO 6 /* Host handles pkts w/ any GSO type */
> +#endif /* VIRTIO_NET_NO_LEGACY */
> +
> +#define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
> +#define VIRTIO_NET_S_ANNOUNCE 2 /* Announcement is needed */
> +
> +#endif /* __VIRTIO_NET_H__ */
> diff --git a/tests/vubr/virtio_ring.h b/tests/vubr/virtio_ring.h
> new file mode 100644
> index 0000000..e2d0adb
> --- /dev/null
> +++ b/tests/vubr/virtio_ring.h
Pls pick the header we have under standard-headers
> @@ -0,0 +1,103 @@
> +#ifndef VIRTQUEUE_H
> +#define VIRTQUEUE_H
> +/*
> + *
> + * Virtual I/O Device (VIRTIO) Version 1.0
> + * Committee Specification 03
> + * 02 August 2015
> + * Copyright (c) OASIS Open 2015. All Rights Reserved.
> + * Source: http://docs.oasis-open.org/virtio/virtio/v1.0/cs03/listings/
> + *
> + */
> +#include <stdint.h>
> +
> +typedef uint64_t le64;
> +typedef uint32_t le32;
> +typedef uint16_t le16;
> +
> +/* This marks a buffer as continuing via the next field. */
> +#define VIRTQ_DESC_F_NEXT 1
> +/* This marks a buffer as write-only (otherwise read-only). */
> +#define VIRTQ_DESC_F_WRITE 2
> +/* This means the buffer contains a list of buffer descriptors. */
> +#define VIRTQ_DESC_F_INDIRECT 4
> +
> +/* The device uses this in used->flags to advise the driver: don't kick me
> + * when you add a buffer. It's unreliable, so it's simply an
> + * optimization. */
> +#define VIRTQ_USED_F_NO_NOTIFY 1
> +/* The driver uses this in avail->flags to advise the device: don't
> + * interrupt me when you consume a buffer. It's unreliable, so it's
> + * simply an optimization. */
> +#define VIRTQ_AVAIL_F_NO_INTERRUPT 1
> +
> +/* Support for indirect descriptors */
> +#define VIRTIO_F_INDIRECT_DESC 28
> +
> +/* Support for avail_event and used_event fields */
> +#define VIRTIO_F_EVENT_IDX 29
> +
> +/* Arbitrary descriptor layouts. */
> +#define VIRTIO_F_ANY_LAYOUT 27
> +
> +/* Virtqueue descriptors: 16 bytes.
> + * These can chain together via "next". */
> +struct virtq_desc {
> + /* Address (guest-physical). */
> + le64 addr;
> + /* Length. */
> + le32 len;
> + /* The flags as indicated above. */
> + le16 flags;
> + /* We chain unused descriptors via this, too */
> + le16 next;
> +};
> +
> +struct virtq_avail {
> + le16 flags;
> + le16 idx;
> + le16 ring[];
> + /* Only if VIRTIO_F_EVENT_IDX: le16 used_event; */
> +};
> +
> +/* le32 is used here for ids for padding reasons. */
> +struct virtq_used_elem {
> + /* Index of start of used descriptor chain. */
> + le32 id;
> + /* Total length of the descriptor chain which was written to. */
> + le32 len;
> +};
> +
> +struct virtq_used {
> + le16 flags;
> + le16 idx;
> + struct virtq_used_elem ring[];
> + /* Only if VIRTIO_F_EVENT_IDX: le16 avail_event; */
> +};
> +
> +struct virtq {
> + unsigned int num;
> +
> + struct virtq_desc *desc;
> + struct virtq_avail *avail;
> + struct virtq_used *used;
> +};
> +
> +static inline int virtq_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
> +{
> + return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
> +}
> +
> +/* Get location of event indices (only with VIRTIO_F_EVENT_IDX) */
> +static inline le16 *virtq_used_event(struct virtq *vq)
> +{
> + /* For backwards compat, used event index is at *end* of avail ring. */
> + return &vq->avail->ring[vq->num];
> +}
> +
> +static inline le16 *virtq_avail_event(struct virtq *vq)
> +{
> + /* For backwards compat, avail event index is at *end* of used ring. */
> + return (le16 *)&vq->used->ring[vq->num];
> +}
> +#endif /* VIRTQUEUE_H */
> diff --git a/tests/vubr/virtqueue.h b/tests/vubr/virtqueue.h
> new file mode 100644
> index 0000000..b018cca
> --- /dev/null
> +++ b/tests/vubr/virtqueue.h
> @@ -0,0 +1,17 @@
> +#ifndef __VIRTQUEUE__
> +#define __VIRTQUEUE__
> +
> +#include "virtio_ring.h"
> +
> +struct virtqueue {
> + int call_fd;
> + int kick_fd;
> + uint32_t size;
> + uint16_t last_avail_index;
> + uint16_t last_used_index;
> + struct virtq_desc* desc;
> + struct virtq_avail* avail;
> + struct virtq_used* used;
> +};
> +
> +#endif /* __VIRTQUEUE__ */
> diff --git a/tests/vubr/vubr_config.h b/tests/vubr/vubr_config.h
> new file mode 100644
> index 0000000..19681d0
> --- /dev/null
> +++ b/tests/vubr/vubr_config.h
> @@ -0,0 +1,7 @@
> +#ifndef __VHU_CONFIG__
> +#define __VHU_CONFIG__
> +
> +#define VHOST_USER_SHOW_MGMT_TRAFFIC
> +#define VHOST_USER_SHOW_NET_TRAFFIC
I don't see why this needs a header for itself.
Also, these ifdefs all over the code look inelegant.
I'd just do
#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
#define DEBUG_VHOST_USER_BRIDGE_MGMT(...) printf(__VA_ARGS__)
#else
#define DEBUG_VHOST_USER_BRIDGE_MGMT(...) do {} while (0)
#endif
and then code has no ifdefs.
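As a rough sketch of that idea, here is one possible variation that keeps
the printf visible to the compiler, so format strings are still checked
when the output is switched off. The VUBR_MGMT_DEBUG name and the
vubr_show_vring_state() call site are made up for illustration; only
VHOST_USER_SHOW_MGMT_TRAFFIC comes from the patch.

```c
#include <stdio.h>

/* Define VHOST_USER_SHOW_MGMT_TRAFFIC (e.g. with -D) to enable output. */
#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
#define VUBR_MGMT_DEBUG 1
#else
#define VUBR_MGMT_DEBUG 0
#endif

/* The printf call is always compiled, so format/argument mismatches are
 * caught even when the output is disabled; the constant condition lets
 * the compiler drop the call in that case. */
#define DEBUG_VHOST_USER_BRIDGE_MGMT(...) \
    do { \
        if (VUBR_MGMT_DEBUG) { \
            printf(__VA_ARGS__); \
        } \
    } while (0)

/* Hypothetical call site, modelled on the SET_VRING_NUM handler:
 * no #ifdef is needed around the debug output. */
int vubr_show_vring_state(unsigned int index, unsigned int num)
{
    DEBUG_VHOST_USER_BRIDGE_MGMT("state.index: %u\n", index);
    DEBUG_VHOST_USER_BRIDGE_MGMT("state.num: %u\n", num);
    return 0;
}
```

With something like this the definition can live at the top of the .c
file that uses it, and the separate config header goes away.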
> +
> +#endif /* __VHU_CONFIG__ */
> diff --git a/tests/vubr/vubr_device.h b/tests/vubr/vubr_device.h
> new file mode 100644
> index 0000000..04a0ecb
> --- /dev/null
> +++ b/tests/vubr/vubr_device.h
> @@ -0,0 +1,41 @@
> +#ifndef __VHU_DEVICE__
> +#define __VHU_DEVICE__
> +
> +#include <arpa/inet.h>
> +#include <sys/socket.h>
> +
> +#include "vhost.h"
> +#include "virtqueue.h"
> +#include "dispatcher.h"
> +
> +#define MAX_NR_VIRTQUEUE (8)
> +
> +struct vubr_device_region {
> + /* Guest Phhysical address. */
> + uint64_t gpa;
You need to add a header to use these.
> + /* Memory region size. */
> + uint64_t size;
> + /* QEMU virtual address (userspace). */
> + uint64_t qva;
> + /* Starting offset in our mmaped space. */
> + uint64_t mmap_offset;
> + /* Start addrtess of mmaped space. */
> + uint64_t mmap_addr;
> +};
> +
> +struct vubr_device {
> + int sock;
> + struct dispatcher dispatcher;
> + uint32_t nregions;
> + struct vubr_device_region regions[VHOST_MEMORY_MAX_NREGIONS];
> + struct virtqueue virtqueue[MAX_NR_VIRTQUEUE];
> + int backend_udp_sock;
> + struct sockaddr_in backend_udp_dest;
> +};
> +
> +struct vubr_device *vubr_device_new(char *path);
> +void vubr_device_run(struct vubr_device * dev);
> +void vubr_device_backend_udp_setup(struct vubr_device *dev, char *local_host,
> + uint16_t local_port, char *dest_host, uint16_t dest_port);
> +
> +#endif /* __VHU_DEVICE__ */
> diff --git a/tests/vubr/dispatcher.c b/tests/vubr/dispatcher.c
> new file mode 100644
> index 0000000..62d386a
> --- /dev/null
> +++ b/tests/vubr/dispatcher.c
Pls add copyright info. GPLv2+ is preferred.
> @@ -0,0 +1,77 @@
> +#include <stdio.h>
> +#include <sys/select.h>
> +
> +#include "vubr_config.h"
> +#include "dispatcher.h"
> +
> +int
> +dispatcher_init(struct dispatcher *d)
> +{
> + FD_ZERO(&d->fdset);
> + d->max_sock = -1;
> + return 0;
> +}
> +
> +int
> +dispatcher_add(struct dispatcher *d, int sock, void *ctx, callback_func cb)
> +{
> + if (sock >= FD_SETSIZE) {
> + fprintf(stderr, "Error: Failed to add new event. sock %d should be less than %d\n",
> + sock, FD_SETSIZE);
> + return -1;
> + }
> +
> + d->events[sock].ctx = ctx;
> + d->events[sock].callback = cb;
> +
> + FD_SET(sock, &d->fdset);
> + if (sock > d->max_sock)
> + d->max_sock = sock;
> + printf("DEBUG: Added sock %d for watching. max_sock: %d\n", sock, d->max_sock);
> + return 0;
> +}
> +
> +int
> +dispatcher_remove(struct dispatcher *d, int sock)
> +{
> + if (sock >= FD_SETSIZE) {
> + fprintf(stderr, "Error: Failed to remove event. sock %d should be less than %d\n",
> + sock, FD_SETSIZE);
> + return -1;
> + }
> +
> + FD_CLR(sock, &d->fdset);
> + return 0;
> +}
> +
> +/* timeout in us */
> +int
> +dispatcher_wait(struct dispatcher *d, uint32_t timeout)
> +{
> + struct timeval tv;
> + tv.tv_sec = timeout / 1000000;
> + tv.tv_usec = timeout % 1000000;
> +
> + fd_set fdset = d->fdset;
> +
> + /* wait until some of sockets become readable. */
> + int rc = select(d->max_sock + 1, &fdset, 0, 0, &tv);
> +
> + if (rc == -1)
> + perror("select");
> +
> + /* Timeout */
> + if (rc == 0)
> + return 0;
> +
> + /* Now call callback for every ready socket. */
> +
> + int sock;
> + for (sock = 0; sock < d->max_sock + 1; sock++)
> + if (FD_ISSET(sock, &fdset)) {
> + struct event *e = &d->events[sock];
> + e->callback(sock, e->ctx);
> + }
> +
> + return 0;
> +}
> diff --git a/tests/vubr/main.c b/tests/vubr/main.c
> new file mode 100644
> index 0000000..a7a3e9d
> --- /dev/null
> +++ b/tests/vubr/main.c
> @@ -0,0 +1,18 @@
> +#include "vubr_config.h"
> +#include "vubr_device.h"
> +
> +int main(int argc, char* argv[])
> +{
> + struct vubr_device *dev;
> +
> + if((dev = vubr_device_new("/tmp/vubr.sock"))) {
> + vubr_device_backend_udp_setup(dev,
> + "127.0.0.1", 4444,
> + "127.0.0.1", 5555);
> +
> + vubr_device_run(dev);
> + return 0;
> + }
> + else
> + return 1;
> +}
> diff --git a/tests/vubr/vhost_user.c b/tests/vubr/vhost_user.c
> new file mode 100644
> index 0000000..91ab09e
> --- /dev/null
> +++ b/tests/vubr/vhost_user.c
> @@ -0,0 +1,83 @@
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <string.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <assert.h>
> +#include <unistd.h>
> +#include <errno.h>
> +
> +#include "vubr_config.h"
> +#include "vhost_user.h"
> +
> +void
> +vhost_user_message_read(int conn_fd, struct vhost_user_message *vmsg)
> +{
> + int rc;
> + struct msghdr msg = {};
> + struct iovec iov;
> + size_t fd_size = VHOST_MEMORY_MAX_NREGIONS * sizeof(int);
> + char control[CMSG_SPACE(fd_size)];
empty line won't hurt here, after all declarations.
> + memset(control, 0, sizeof(control));
> +
> + iov.iov_base = (char *)vmsg;
> + iov.iov_len = VHOST_USER_HDR_SIZE;
> +
> + msg.msg_iov = &iov;
> + msg.msg_iovlen = 1;
> + msg.msg_control = control;
> + msg.msg_controllen = sizeof(control);
> +
> + rc = recvmsg(conn_fd, &msg, 0);
> +
> + if (rc <= 0) {
> + perror("recvmsg");
> + exit(1);
> + }
> +
> + vmsg->fd_num = 0;
> + struct cmsghdr *cmsg;
> + for (cmsg = CMSG_FIRSTHDR(&msg);
> + cmsg != NULL;
> + cmsg = CMSG_NXTHDR(&msg, cmsg))
> + {
> + if ((cmsg->cmsg_level == SOL_SOCKET) &&
Don't use tabs for indents, and you don't need so many ()
around simple math.
> + (cmsg->cmsg_type == SCM_RIGHTS))
> + {
> + fd_size = cmsg->cmsg_len - CMSG_LEN(0);
> + vmsg->fd_num = fd_size / sizeof(int);
> + memcpy(vmsg->fds, CMSG_DATA(cmsg), fd_size);
> + break;
> + }
> + }
> +
> + if (vmsg->size > sizeof(vmsg->payload)) {
> + fprintf(stderr, "Error: too big message request: %d, size: vmsg->size: %u, while sizeof(vmsg->payload) = %lu\n",
> + vmsg->request, vmsg->size, sizeof(vmsg->payload));
> + exit(1);
> + }
> +
> + if (vmsg->size) {
> + rc = read(conn_fd, &vmsg->payload, vmsg->size);
> + if (rc <= 0) {
> + perror("recvmsg");
> + exit(1);
> + }
> +
> + assert(rc == vmsg->size);
> + }
> +}
> +
> +void
> +vhost_user_message_write(int conn_fd, struct vhost_user_message *vmsg)
> +{
> + int rc;
> + do {
> + rc = write(conn_fd, vmsg, VHOST_USER_HDR_SIZE + vmsg->size);
> + } while (rc < 0 && errno == EINTR);
> +
> + if (rc < 0) {
> + perror("write");
> + exit(1);
> + }
> +}
> diff --git a/tests/vubr/vubr_device.c b/tests/vubr/vubr_device.c
> new file mode 100644
> index 0000000..a296aef
> --- /dev/null
> +++ b/tests/vubr/vubr_device.c
> @@ -0,0 +1,773 @@
> +#include <assert.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <inttypes.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <errno.h>
> +#include <sys/types.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> +#include <sys/unistd.h>
> +#include <sys/mman.h>
> +#include <sys/eventfd.h>
> +
> +#include "vubr_config.h"
> +#include "vhost_user.h"
> +#include "virtio_net.h"
> +#include "vubr_device.h"
> +#include "virtqueue.h"
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> +static char *vhost_user_request_str[] = {
> +#define INFO(name,num) \
> + [num] = #name,
> +VHOST_USER_REQUEST_LIST
> +#undef INFO
> +};
> +#endif /* VHOST_USER_SHOW_MGMT_TRAFFIC */
> +
> +void
> +die(char *s)
Please do prefix functions with "vubr", and please don't
prefix them with _.
> +{
> + perror(s);
> + exit(1);
> +}
> +
> +static void
> +print_buffer(uint8_t* buf, size_t len)
> +{
> + int i;
> + printf("raw buffer:\n");
> + for(i = 0; i < len; i++) {
> + if ((i % 16) == 0)
> + printf("\n");
> + if ((i % 4) == 0)
> + printf(" ");
> + printf("%02x ", buf[i]);
> + }
> + printf("\n............................................................\n");
> +}
> +
> +/* Translate guest physical address to our virtual address. */
> +static uint64_t __attribute__((unused))
> +gpa_to_va(struct vubr_device *dev, uint64_t guest_addr)
> +{
> + int i;
An empty line wouldn't hurt after the declaration.
> + /* Find matching memory region. */
> +
> + for (i = 0; i < dev->nregions; i++) {
> + struct vubr_device_region *r = &dev->regions[i];
> +
> + if ((guest_addr >= r->gpa) && (guest_addr < (r->gpa + r->size)))
> + return (guest_addr - r->gpa + r->mmap_addr + r->mmap_offset);
> + }
> +
> + assert(!"address not found in regions");
> + return 0;
> +}
> +
> +/* Translate qemu virtual address to our virtual address. */
> +static uint64_t
> +qva_to_va(struct vubr_device *dev, uint64_t qemu_addr)
> +{
> + int i;
> + /* Find matching memory region. */
> +
> + for (i = 0; i < dev->nregions; i++) {
> + struct vubr_device_region *r = &dev->regions[i];
> +
> + if ((qemu_addr >= r->qva) && (qemu_addr < (r->qva + r->size)))
> + return (qemu_addr - r->qva + r->mmap_addr + r->mmap_offset);
> + }
> +
> + assert(!"address not found in regions");
> + return 0;
> +}
> +
> +static void vubr_device_backend_udp_sendbuf(struct vubr_device *dev, uint8_t *buf, size_t len);
This line is way too long.
And please don't forward-declare static functions. Just sort them by use.
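To illustrate what I mean by sorting by use, here is a rough sketch. The names vubr_demo_sendbuf and vubr_demo_consume_raw_packet are made up for the example and only stand in for the functions in the patch; the point is that the callee is defined first, so the prototype goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for vubr_device_backend_udp_sendbuf(): it only
 * reports how many bytes it was asked to send. */
static size_t
vubr_demo_sendbuf(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    return len;
}

/* Defined after its callee, so no forward declaration is needed. */
static size_t
vubr_demo_consume_raw_packet(const uint8_t *buf, size_t len, size_t hdrlen)
{
    if (len < hdrlen) {
        return 0;   /* too short to even hold the header */
    }
    return vubr_demo_sendbuf(buf + hdrlen, len - hdrlen);
}
```

The same ordering also gets rid of the second prototype further down, for the UDP receive helper.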
> +
> +static void
> +_consume_raw_packet(struct vubr_device *dev, uint8_t *buf, uint32_t len)
> +{
> + int hdrlen = sizeof(struct virtio_net_hdr);
> +
> +#ifdef VHOST_USER_SHOW_NET_TRAFFIC
> + print_buffer(buf, len);
> +#endif
> + vubr_device_backend_udp_sendbuf(dev, buf + hdrlen, len - hdrlen);
> +}
> +
> +/* Kick the guest if necessary. */
> +static void
> +virtqueue_kick(struct virtqueue *vq)
> +{
> + if (!(vq->avail->flags & VIRTQ_AVAIL_F_NO_INTERRUPT)) {
> + printf("Kicking the guest...\n");
> + eventfd_write(vq->call_fd, 1);
> + }
> +}
> +
> +static void
> +_post_buffer(struct vubr_device *dev, struct virtqueue *vq, uint8_t *buf, int32_t len)
> +{
> + struct virtq_desc* desc = vq->desc;
> + struct virtq_avail* avail = vq->avail;
> + struct virtq_used* used = vq->used;
> +
> + unsigned int size = vq->size;
> +
> + uint16_t a_index = vq->last_avail_index % size;
> + uint16_t u_index = vq->last_used_index % size;
> + uint16_t d_index = avail->ring[a_index];
> +
> + int i = d_index;
> +
> +#ifdef VHOST_USER_SHOW_NET_TRAFFIC
> + printf("Posting the packet to guest on vq:\n");
> + printf(" size = %d\n", vq->size);
> + printf(" last_avail_index = %d\n", vq->last_avail_index);
> + printf(" last_used_index = %d\n", vq->last_used_index);
> + printf(" a_index = %d\n", a_index);
> + printf(" u_index = %d\n", u_index);
> + printf(" d_index = %d\n", d_index);
> + printf(" desc[%d].addr = 0x%016"PRIx64"\n", i, desc[i].addr);
> + printf(" desc[%d].len = %d\n", i, desc[i].len);
> + printf(" desc[%d].flags = %d\n", i, desc[i].flags);
> + printf(" avail->idx = %d\n", avail->idx);
> + printf(" used->idx = %d\n", used->idx);
> +#endif
> +
> + if (!(desc[i].flags & VIRTQ_DESC_F_WRITE)) {
> + // FIXME: we should find writable descriptor
> + fprintf(stderr, "descriptor is not writable. exiting.\n");
> + exit(1);
> + }
> +
> + void *chunk_start = (void *)gpa_to_va(dev, desc[i].addr);
> + uint32_t chunk_len = desc[i].len;
> +
> + if (len <= chunk_len) {
> + memcpy(chunk_start, buf, len);
> + } else {
> + fprintf(stderr, "received too long packet from the backend. dropping...\n");
> + return;
> + }
> +
> + /* Add descriptor to the used ring. */
> + used->ring[u_index].id = d_index;
> + used->ring[u_index].len = len;
> +
> + vq->last_avail_index++;
> + vq->last_used_index++;
> +
> + used->idx = vq->last_used_index;
> +
> + /* Kick the guest if necessary. */
> + virtqueue_kick(vq);
> +}
> +
> +static int
> +_process_desc(struct vubr_device *dev, struct virtqueue *vq)
> +{
> + struct virtq_desc* desc = vq->desc;
> + struct virtq_avail* avail = vq->avail;
> + struct virtq_used* used = vq->used;
> +
> + unsigned int size = vq->size;
> +
> + uint16_t a_index = vq->last_avail_index % size;
> + uint16_t u_index = vq->last_used_index % size;
> + uint16_t d_index = avail->ring[a_index];
> +
> + uint32_t i, len = 0;
> + size_t buf_size = 4096;
> + uint8_t buf[4096];
> +
> +#ifdef VHOST_USER_SHOW_NET_TRAFFIC
> + printf("chunks: ");
> +#endif
> +
> + i = d_index;
> + do {
> + void *chunk_start = (void *)gpa_to_va(dev, desc[i].addr);
> + uint32_t chunk_len = desc[i].len;
> +
> + if (len + chunk_len < buf_size) {
> + memcpy(buf + len, chunk_start, chunk_len);
> +#ifdef VHOST_USER_SHOW_NET_TRAFFIC
> + printf("%d ", chunk_len);
> +#endif
> + } else {
> + fprintf(stderr, "too long packet. dropping...\n");
> + break;
> + }
> +
> + len += chunk_len;
> +
> + if (!(desc[i].flags & VIRTQ_DESC_F_NEXT))
> + break;
> +
> + i = desc[i].next;
> + } while(1);
> +
> + if (!len)
> + return -1;
> +
> + /* Add descriptor to the used ring. */
> + used->ring[u_index].id = d_index;
> + used->ring[u_index].len = len;
> +
> +#ifdef VHOST_USER_SHOW_NET_TRAFFIC
> + printf("\n");
> +#endif
> +
> + _consume_raw_packet(dev, buf, len);
> +
> + return 0;
> +}
> +
> +static void
> +_process_avail(struct vubr_device *dev, struct virtqueue *vq)
> +{
> + struct virtq_avail *avail = vq->avail;
> + struct virtq_used* used = vq->used;
> +
> + while(vq->last_avail_index != avail->idx) {
> + _process_desc(dev, vq);
There are no memory barriers anywhere; this is almost sure
to be racy.
> + vq->last_avail_index++;
> + vq->last_used_index++;
> + }
> +
> + used->idx = vq->last_used_index;
> +}
> +
> +static int vubr_device_backend_udp_recvbuf(struct vubr_device *dev, uint8_t *buf, size_t buflen);
> +
> +static void
> +_backend_recv_cb(int sock, void *ctx)
> +{
> + printf("\n\n *** IN UDP RECEIVE CALLBACK ***\n\n");
> + struct vubr_device *dev = (struct vubr_device *) ctx;
> + struct virtqueue *rx_vq = &dev->virtqueue[0];
> +#define BUFLEN 4096
> + uint8_t buf[BUFLEN];
> + int len;
> + struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)buf;
> + int hdrlen = sizeof(struct virtio_net_hdr);
> +
> + *hdr = (struct virtio_net_hdr) {};
> + hdr->num_buffers = 1;
> +
> + len = vubr_device_backend_udp_recvbuf(dev, buf + hdrlen, BUFLEN - hdrlen);
> + _post_buffer(dev, rx_vq, buf, len + hdrlen);
> +#undef BUFLEN
Please don't play such preprocessor tricks.
> +}
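Instead of the #define/#undef pair inside the function, a file-scope constant plus sizeof() would do. A minimal sketch follows; VUBR_BUFLEN, struct demo_net_hdr, demo_fill and demo_recv_cb are all hypothetical names for illustration, and the header struct is not the real virtio_net_hdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* File-scope constant; no #define/#undef inside a function body. */
enum { VUBR_BUFLEN = 4096 };

/* Hypothetical stand-in for struct virtio_net_hdr. */
struct demo_net_hdr {
    uint8_t flags;
    uint16_t num_buffers;
};

/* Writes header + payload into buf; returns the total length,
 * or 0 if the payload does not fit. */
static size_t
demo_fill(uint8_t *buf, size_t buflen, const uint8_t *payload, size_t plen)
{
    struct demo_net_hdr hdr = { .num_buffers = 1 };

    assert(buf != NULL);
    if (buflen < sizeof(hdr) || plen > buflen - sizeof(hdr)) {
        return 0;
    }
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), payload, plen);
    return sizeof(hdr) + plen;
}

/* The callback sizes its buffer from the constant and passes sizeof(buf). */
static size_t
demo_recv_cb(const uint8_t *payload, size_t plen)
{
    uint8_t buf[VUBR_BUFLEN];

    return demo_fill(buf, sizeof(buf), payload, plen);
}
```

Passing sizeof(buf) down also means the length check lives in one place.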
> +
> +static void
> +_kick_cb(int sock, void *ctx)
> +{
> + struct vubr_device *dev = (struct vubr_device *) ctx;
> + eventfd_t kick_data;
> + ssize_t rc;
> +
> + rc = eventfd_read(sock, &kick_data);
> +
> + if (rc == -1) {
> + perror("read kick");
> + exit(1);
> + } else {
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("Got kick_data: %016"PRIx64"\n", kick_data);
> +#endif
> + _process_avail(dev, &dev->virtqueue[1]);
> + }
> +}
> +
> +static int
> +_execute_NONE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +_execute_GET_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + vmsg->payload.u64 =
> + ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
> + (1ULL << VIRTIO_NET_F_CTRL_VQ) |
> + (1ULL << VIRTIO_NET_F_CTRL_RX) |
> + (1ULL << VHOST_F_LOG_ALL));
> + vmsg->size = sizeof(vmsg->payload.u64);
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("returing u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> +
> + /* reply */
> + return 1;
> +}
> +
> +static int
> +_execute_SET_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> + return 0;
> +}
> +
> +static int
> +_execute_SET_OWNER(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
You don't have to do anything if you don't want to.
> + return 0;
> +}
> +
> +static int
> +_execute_RESET_DEVICE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
Basically just stop processing rings until KICK.
> + return 0;
> +}
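For RESET_DEVICE, something along these lines would be enough. This is a sketch under my own assumptions: the started flag and the demo_* names do not exist in the patch, and I am reading "until KICK" as "until the next SET_VRING_KICK for that queue".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-queue state; only the fields needed here. */
struct demo_vq {
    bool started;     /* set on SET_VRING_KICK, cleared on reset */
    int processed;    /* number of times the ring was processed */
};

/* RESET_DEVICE: stop processing all rings. */
static void
demo_reset_device(struct demo_vq *vqs, size_t n)
{
    size_t i;

    assert(vqs != NULL);
    for (i = 0; i < n; i++) {
        vqs[i].started = false;
    }
}

/* SET_VRING_KICK: the ring may be processed again. */
static void
demo_set_vring_kick(struct demo_vq *vq)
{
    vq->started = true;
}

/* Returns 1 if the ring was processed, 0 if it is stopped. */
static int
demo_process(struct demo_vq *vq)
{
    if (!vq->started) {
        return 0;
    }
    vq->processed++;
    return 1;
}
```

The kick callback would then check the flag before walking the avail ring.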
> +
> +static int
> +_execute_SET_MEM_TABLE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("Nregions: %d\n", vmsg->payload.memory.nregions);
> +
> + struct vhost_memory *memory = &vmsg->payload.memory;
> + dev->nregions = memory->nregions;
> + int i;
> + for (i = 0; i < dev->nregions; i++) {
> + struct vhost_memory_region *msg_region = &memory->regions[i];
> + struct vubr_device_region *dev_region = &dev->regions[i];
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("Region %d\n", i);
> + printf(" guest_phys_addr: 0x%016"PRIx64"\n", msg_region->guest_phys_addr);
> + printf(" memory_size: 0x%016"PRIx64"\n", msg_region->memory_size);
> + printf(" userspace_addr 0x%016"PRIx64"\n", msg_region->userspace_addr);
> + printf(" mmap_offset 0x%016"PRIx64"\n", msg_region->mmap_offset);
> +#endif
> +
> + dev_region->gpa = msg_region->guest_phys_addr;
> + dev_region->size = msg_region->memory_size;
> + dev_region->qva = msg_region->userspace_addr;
> + dev_region->mmap_offset = msg_region->mmap_offset;
> +
> + void *mmap_addr;
> +
> + /* We don't use offset argument of mmap() since the
> + * mapped address has to be page aligned, and we use huge
> + * pages. */
> + mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
> + PROT_READ | PROT_WRITE, MAP_SHARED,
> + vmsg->fds[i], 0);
> +
> + if (mmap_addr == MAP_FAILED) {
> + perror("mmap");
> + exit (1);
> + }
> +
> + dev_region->mmap_addr = (uint64_t) mmap_addr;
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf(" mmap_addr: 0x%016"PRIx64"\n", dev_region->mmap_addr);
> +#endif
> + }
> +
> + return 0;
> +}
> +
> +static int
> +_execute_SET_LOG_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +_execute_SET_LOG_FD(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_NUM(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + unsigned int index = vmsg->payload.state.index;
> + unsigned int num = vmsg->payload.state.num;
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("state.index: %d\n", index);
> + printf("state.num: %d\n", num);
> +#endif
> + dev->virtqueue[index].size = num;
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_ADDR(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + struct vhost_vring_addr *vra = &vmsg->payload.addr;
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("vhost_vring_addr:\n");
> + printf(" index: %d\n", vra->index);
> + printf(" flags: %d\n", vra->flags);
> + printf(" desc_user_addr: 0x%016"PRIx64"\n", vra->desc_user_addr);
> + printf(" used_user_addr: 0x%016"PRIx64"\n", vra->used_user_addr);
> + printf(" avail_user_addr: 0x%016"PRIx64"\n", vra->avail_user_addr);
> + printf(" log_guest_addr: 0x%016"PRIx64"\n", vra->log_guest_addr);
> +#endif
> +
> + unsigned int index = vra->index;
> + struct virtqueue *vq = &dev->virtqueue[index];
> +
> + vq->desc = (struct virtq_desc *)qva_to_va(dev, vra->desc_user_addr);
> + vq->used = (struct virtq_used *)qva_to_va(dev, vra->used_user_addr);
> + vq->avail = (struct virtq_avail *)qva_to_va(dev, vra->avail_user_addr);
> +
> + printf("Setting virtq addresses:\n");
> + printf(" virtq_desc at %p\n", vq->desc);
> + printf(" virtq_used at %p\n", vq->used);
> + printf(" virtq_avail at %p\n", vq->avail);
> +
> + vq->last_used_index = vq->used->idx;
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + unsigned int index = vmsg->payload.state.index;
> + unsigned int num = vmsg->payload.state.num;
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("state.index: %d\n", index);
> + printf("state.num: %d\n", num);
> +#endif
> + dev->virtqueue[index].last_avail_index = num;
> +
> + return 0;
> +}
> +
> +static int
> +_execute_GET_VRING_BASE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_KICK(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> +
> + uint64_t u64_arg = vmsg->payload.u64;
> + int index = u64_arg & VHOST_USER_VRING_IDX_MASK;
> +
> + assert((u64_arg & VHOST_USER_VRING_NOFD_MASK) == 0);
> + assert(vmsg->fd_num == 1);
> +
> + dev->virtqueue[index].kick_fd = vmsg->fds[0];
> + printf("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
> +
> + if ((index % 2 == 1)) {
> + /* TX queue. */
> + dispatcher_add(&dev->dispatcher, dev->virtqueue[index].kick_fd, dev, _kick_cb);
> +
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("Waiting for kicks on fd: %d for vq: %d\n",
> + dev->virtqueue[index].kick_fd, index);
> +#endif
> + }
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_CALL(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> +
> + uint64_t u64_arg = vmsg->payload.u64;
> + int index = u64_arg & VHOST_USER_VRING_IDX_MASK;
> +
> + assert((u64_arg & VHOST_USER_VRING_NOFD_MASK) == 0);
> + assert(vmsg->fd_num == 1);
> +
> + dev->virtqueue[index].call_fd = vmsg->fds[0];
> + printf("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
> +
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_ERR(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> + return 0;
> +}
> +
> +static int
> +_execute_GET_PROTOCOL_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + /* FIXME: unimplented */
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> + return 0;
> +}
> +
> +static int
> +_execute_SET_PROTOCOL_FEATURES(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + /* FIXME: unimplented */
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + printf("u64: 0x%016"PRIx64"\n", vmsg->payload.u64);
> +#endif
> + return 0;
> +}
> +
> +static int
> +_execute_GET_QUEUE_NUM(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +_execute_SET_VRING_ENABLE(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> + printf("function %s() not implemented yet.\n", __FUNCTION__);
> + return 0;
> +}
> +
> +static int
> +vubr_device_execute_request(struct vubr_device *dev, struct vhost_user_message *vmsg)
> +{
> +#ifdef VHOST_USER_SHOW_MGMT_TRAFFIC
> + /* Print out generic part of the request. */
> + printf("======================= Vhost user message from QEMU =======================\n");
> + printf("Request: %s (%d)\n", vhost_user_request_str[vmsg->request], vmsg->request);
> + printf("Flags: 0x%x\n", vmsg->flags);
> + printf("Size: %d\n", vmsg->size);
> +
> + if (vmsg->fd_num) {
> + int i;
> + printf("Fds:");
> + for (i = 0; i < vmsg->fd_num; i++)
> + printf(" %d", vmsg->fds[i]);
> + printf("\n");
> + }
> +#endif /* VHOST_USER_SHOW_MGMT_TRAFFIC */
> +
> + switch (vmsg->request) {
> + case VHOST_USER_NONE:
> + return _execute_NONE(dev, vmsg);
> + case VHOST_USER_GET_FEATURES:
> + return _execute_GET_FEATURES(dev, vmsg);
> + case VHOST_USER_SET_FEATURES:
> + return _execute_SET_FEATURES(dev, vmsg);
> + case VHOST_USER_SET_OWNER:
> + return _execute_SET_OWNER(dev, vmsg);
> + case VHOST_USER_RESET_DEVICE:
> + return _execute_RESET_DEVICE(dev, vmsg);
> + case VHOST_USER_SET_MEM_TABLE:
> + return _execute_SET_MEM_TABLE(dev, vmsg);
> + case VHOST_USER_SET_LOG_BASE:
> + return _execute_SET_LOG_BASE(dev, vmsg);
> + case VHOST_USER_SET_LOG_FD:
> + return _execute_SET_LOG_FD(dev, vmsg);
> + case VHOST_USER_SET_VRING_NUM:
> + return _execute_SET_VRING_NUM(dev, vmsg);
> + case VHOST_USER_SET_VRING_ADDR:
> + return _execute_SET_VRING_ADDR(dev, vmsg);
> + case VHOST_USER_SET_VRING_BASE:
> + return _execute_SET_VRING_BASE(dev, vmsg);
> + case VHOST_USER_GET_VRING_BASE:
> + return _execute_GET_VRING_BASE(dev, vmsg);
> + case VHOST_USER_SET_VRING_KICK:
> + return _execute_SET_VRING_KICK(dev, vmsg);
> + case VHOST_USER_SET_VRING_CALL:
> + return _execute_SET_VRING_CALL(dev, vmsg);
> + case VHOST_USER_SET_VRING_ERR:
> + return _execute_SET_VRING_ERR(dev, vmsg);
> + case VHOST_USER_GET_PROTOCOL_FEATURES:
> + return _execute_GET_PROTOCOL_FEATURES(dev, vmsg);
> + case VHOST_USER_SET_PROTOCOL_FEATURES:
> + return _execute_SET_PROTOCOL_FEATURES(dev, vmsg);
> + case VHOST_USER_GET_QUEUE_NUM:
> + return _execute_GET_QUEUE_NUM(dev, vmsg);
> + case VHOST_USER_SET_VRING_ENABLE:
> + return _execute_SET_VRING_ENABLE(dev, vmsg);
> + case VHOST_USER_MAX:
> + assert(vmsg->request != VHOST_USER_MAX);
> + }
> + return 0;
> +}
> +
> +static void
> +vubr_device_receive_cb(int sock, void *ctx)
> +{
> + struct vubr_device *dev = (struct vubr_device *) ctx;
> + struct vhost_user_message vmsg;
> +
> + vhost_user_message_read(sock, &vmsg);
> +
> + int reply_requested = vubr_device_execute_request(dev, &vmsg);
> +
> + if (reply_requested) {
> + /* Set the version in the flags when sending the reply */
> + vmsg.flags &= ~VHOST_USER_VERSION_MASK;
> + vmsg.flags |= VHOST_USER_VERSION;
> + vmsg.flags |= VHOST_USER_REPLY_MASK;
> + vhost_user_message_write(sock, &vmsg);
> + }
> +}
> +
> +static void
> +vubr_device_accept_cb(int sock, void *ctx)
> +{
> + struct vubr_device *dev = (struct vubr_device *)ctx;
> + int conn_fd;
> + struct sockaddr_un un;
> + socklen_t len = sizeof(un);
> +
> + if ((conn_fd = accept(sock, (struct sockaddr *) &un, &len)) == -1) {
> + perror("accept");
> + exit(1);
> + }
> +
> + printf("DEBUG: Got connection from remote peer on sock %d\n", conn_fd);
Should the above be within the ifdef as well?
> + dispatcher_add(&dev->dispatcher, conn_fd, ctx, vubr_device_receive_cb);
> +}
> +
> +struct vubr_device *
> +vubr_device_new(char *path)
> +{
> + struct vubr_device *dev =
> + (struct vubr_device *) calloc(1, sizeof(struct vubr_device));
> +
> + dev->nregions = 0;
> +
> + int i;
> + for (i = 0; i < MAX_NR_VIRTQUEUE; i++)
> + dev->virtqueue[i] = (struct virtqueue) {
> + .call_fd = -1, .kick_fd = -1,
> + .size = 0,
> + .last_avail_index = 0, .last_used_index = 0,
> + .desc = 0, .avail = 0, .used = 0,
> + };
> +
> + /* Get a UNIX socket. */
> + if ((dev->sock = socket(AF_UNIX, SOCK_STREAM, 0)) == -1) {
> + perror("socket");
> + exit(1);
> + }
> +
> + struct sockaddr_un un;
> + un.sun_family = AF_UNIX;
> + strcpy(un.sun_path, path);
> +
> + size_t len = sizeof(un.sun_family) + strlen(path);
> +
> + unlink(path);
> +
> + if (bind(dev->sock, (struct sockaddr *) &un, len) == -1) {
> + perror("bind");
> + exit(1);
> + }
> +
> + if (listen(dev->sock, 1) == -1) {
> + perror("listen");
> + exit(1);
> + }
> +
> + dispatcher_init(&dev->dispatcher);
> + dispatcher_add(&dev->dispatcher, dev->sock, (void*) dev, vubr_device_accept_cb);
> +
> + printf("Waiting for connections on UNIX socket %s ...\n", path);
> + return dev;
> +}
> +
> +void
> +vubr_device_backend_udp_setup(struct vubr_device *dev,
> + char *local_host,
> + uint16_t local_port,
> + char *dest_host,
> + uint16_t dest_port)
> +{
> +
> + struct sockaddr_in si_local;
> + int sock;
> +
> + if ((sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) == -1)
> + die("socket");
> +
> + memset((char *) &si_local, 0, sizeof(struct sockaddr_in));
> + si_local.sin_family = AF_INET;
> + si_local.sin_port = htons(local_port);
> + if(inet_aton(local_host, &si_local.sin_addr) == 0) {
> + fprintf(stderr, "inet_aton() failed.\n");
> + exit(1);
> + }
> +
> + if( bind(sock, (struct sockaddr*)&si_local, sizeof(si_local) ) == -1)
> + die("bind");
> +
> + /* setup destination for sends */
> + struct sockaddr_in *si_remote = &dev->backend_udp_dest;
> + memset((char *) si_remote, 0, sizeof(struct sockaddr_in));
> + si_remote->sin_family = AF_INET;
> + si_remote->sin_port = htons(dest_port);
> + if(inet_aton(dest_host, &si_remote->sin_addr) == 0) {
> + fprintf(stderr, "inet_aton() failed.\n");
> + exit(1);
> + }
> +
> + dev->backend_udp_sock = sock;
> + dispatcher_add(&dev->dispatcher, sock, dev, _backend_recv_cb);
> + printf("Waiting for data from udp backend on %s:%d...\n", local_host, local_port);
> +}
> +
> +static void
> +vubr_device_backend_udp_sendbuf(struct vubr_device *dev, uint8_t *buf, size_t len)
> +{
> + int slen = sizeof(struct sockaddr_in);
> +
> + if (sendto(dev->backend_udp_sock, buf, len, 0, (struct sockaddr *) &dev->backend_udp_dest, slen) == -1)
> + die("sendto()");
> +}
> +
> +static int
> +vubr_device_backend_udp_recvbuf(struct vubr_device *dev, uint8_t *buf, size_t buflen)
> +{
> + int slen = sizeof(struct sockaddr_in);
> + int rc;
> +
> + if ((rc = recvfrom(dev->backend_udp_sock, buf, buflen, 0,
> + (struct sockaddr *) &dev->backend_udp_dest,
> + (socklen_t *)&slen)) == -1)
> + die("recvfrom()");
> +
> + return rc;
> +}
> +
> +void
> +vubr_device_run(struct vubr_device * dev)
> +{
> + while (1) {
> + /* timeout 200ms */
> + dispatcher_wait(&dev->dispatcher, 200000);
> + /* Here one can try polling strategy. */
> + }
> +}
> diff --git a/tests/vubr/Makefile b/tests/vubr/Makefile
> new file mode 100644
> index 0000000..c3400fb
> --- /dev/null
> +++ b/tests/vubr/Makefile
> @@ -0,0 +1,15 @@
> +SRCS=dispatcher.c vhost_user.c vubr_device.c main.c
> +INCLUDES+=vhost.h virtio_ring.h virtio_net.h
> +INCLUDES+=vubr_config.h vhost_user.h virtqueue.h
> +INCLUDES+=dispatcher.h vubr_device.h
> +
> +EXE=vubr
> +CFLAGS += -m64 -Wall -Werror -g
> +
> +all: $(EXE)
> +
> +$(EXE): $(SRCS) $(INCLUDES)
> + $(CC) $(CFLAGS) $(SRCS) -o $@
> +
> +clean:
> + rm -f $(EXE)
> --
Probably just add this as part of tests/Makefile.
> --Victor