public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] vhost_net: support for cross endian guests
@ 2014-10-29  8:38 Cédric Le Goater
  2014-10-29  8:38 ` [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag Cédric Le Goater
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Cédric Le Goater @ 2014-10-29  8:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik, Cédric Le Goater

This patchset adds a VHOST_VRING_F_BYTESWAP flag to inform the host 
to byteswap data of the vring when the guest and the host have a 
different endian order. The flag is stored at initialization in an 
attribute of the virtio queues. It is then used to byteswap, or not, 
the vring indexes and descriptors shared with the guest OS.

The last patch adds the byteswapping of the virtio_net header as it 
is done in qemu.

The patches apply on linux-3.18-rc2 and the tests were done on PowerPC 
using the following hosts :
 
   	 fedora21/ppc64, utopic/ppc64le  

with various guests : 

   	 trusty/ppc64le, utopic/ppc64le, debian/ppc64le, 
   	 rhel6.5/ppc64, fedora21/ppc64, debian/ppc64

Regressions tests for x86_64 were done a debian host using rhel6.6, 
fedora20 and debian guests.
	

Cédric Le Goater (4):
  vhost: add VHOST_VRING_F_BYTESWAP flag
  vhost: add byteswap routines
  vhost: byteswap virtqueue attributes
  vhost_net: byteswap virtio_net header

 drivers/vhost/net.c        |   39 +++++++++---
 drivers/vhost/vhost.c      |  150 ++++++++++++++++++++++++++++++++++++++++----
 drivers/vhost/vhost.h      |    1 +
 include/uapi/linux/vhost.h |    3 +
 4 files changed, 171 insertions(+), 22 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag
  2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
@ 2014-10-29  8:38 ` Cédric Le Goater
  2014-11-03 16:28   ` Michael S. Tsirkin
  2014-10-29  8:38 ` [RFC PATCH 2/4] vhost: add byteswap routines Cédric Le Goater
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Cédric Le Goater @ 2014-10-29  8:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik, Cédric Le Goater

The VHOST_VRING_F_BYTESWAP flag will be used by the host to byteswap
the vring data when the guest and the host have a different endian
order.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
---
 drivers/vhost/vhost.c      |    5 ++++-
 drivers/vhost/vhost.h      |    1 +
 include/uapi/linux/vhost.h |    3 +++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c90f4374442a..72c21b790ba3 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -199,6 +199,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 	vq->call = NULL;
 	vq->log_ctx = NULL;
 	vq->memory = NULL;
+	vq->byteswap = 0;
 }
 
 static int vhost_worker(void *data)
@@ -701,7 +702,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
 			r = -EFAULT;
 			break;
 		}
-		if (a.flags & ~(0x1 << VHOST_VRING_F_LOG)) {
+		if (a.flags & ~(0x1 << VHOST_VRING_F_LOG |
+				0x1 << VHOST_VRING_F_BYTESWAP)) {
 			r = -EOPNOTSUPP;
 			break;
 		}
@@ -747,6 +749,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
 		vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
 		vq->log_addr = a.log_guest_addr;
 		vq->used = (void __user *)(unsigned long)a.used_user_addr;
+		vq->byteswap = !!(a.flags & (0x1 << VHOST_VRING_F_BYTESWAP));
 		break;
 	case VHOST_SET_VRING_KICK:
 		if (copy_from_user(&f, argp, sizeof f)) {
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 3eda654b8f5a..ab25b7d0720d 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -110,6 +110,7 @@ struct vhost_virtqueue {
 	/* Log write descriptors */
 	void __user *log_base;
 	struct vhost_log *log;
+	bool byteswap;
 };
 
 struct vhost_dev {
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index bb6a5b4cb3c5..6a8c2b325c44 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -34,6 +34,9 @@ struct vhost_vring_addr {
 	/* Flag values: */
 	/* Whether log address is valid. If set enables logging. */
 #define VHOST_VRING_F_LOG 0
+	/* Whether vring memory accesses should be byte-swapped.
+	 * required when the guest has a different endianness */
+#define VHOST_VRING_F_BYTESWAP 1
 
 	/* Start of array of descriptors (virtually contiguous) */
 	__u64 desc_user_addr;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/4] vhost: add byteswap routines
  2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
  2014-10-29  8:38 ` [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag Cédric Le Goater
@ 2014-10-29  8:38 ` Cédric Le Goater
  2014-11-03 15:38   ` Cornelia Huck
  2014-10-29  8:38 ` [RFC PATCH 3/4] vhost: byteswap virtqueue attributes Cédric Le Goater
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Cédric Le Goater @ 2014-10-29  8:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik, Cédric Le Goater

This patch adds a few helper routines around get_user and put_user
to ease byteswapping.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
---

I am not sure these routines belong to this file. There is room for 
improvement to remove the ugly  switch (sizeof(*(ptr))). Please 
comment !

 drivers/vhost/vhost.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 72c21b790ba3..afcb3368370c 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -25,6 +25,7 @@
 #include <linux/kthread.h>
 #include <linux/cgroup.h>
 #include <linux/module.h>
+#include <linux/swab.h>
 
 #include "vhost.h"
 
@@ -1001,6 +1002,98 @@ int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 }
 EXPORT_SYMBOL_GPL(vhost_log_write);
 
+#define vq_get_user(vq, x, ptr)					\
+({								\
+	int ret = get_user(x, ptr);				\
+	if (vq->byteswap) {					\
+		switch (sizeof(*(ptr))) {			\
+		case 2:						\
+			x = swab16(x);				\
+			break;					\
+		case 4:						\
+			x = swab32(x);				\
+			break;					\
+		case 8:						\
+			x = swab64(x);				\
+			break;					\
+		default:					\
+			break;					\
+		}						\
+	}							\
+	ret;							\
+})
+
+#define vq_put_user(vq, x, ptr)					\
+({								\
+	__typeof__(*(ptr)) y = (x);				\
+	if (vq->byteswap) {					\
+		switch (sizeof(*(ptr))) {			\
+		case 2:						\
+			y = swab16(x);				\
+			break;					\
+		case 4:						\
+			y = swab32(x);				\
+			break;					\
+		case 8:						\
+			y = swab64(x);				\
+			break;					\
+		default:					\
+			break;					\
+		}						\
+	}							\
+	put_user(y, ptr);					\
+})
+
+#define __vq_get_user(vq, x, ptr)					\
+({								\
+	int ret = __get_user(x, ptr);				\
+	if (vq->byteswap) {					\
+		switch (sizeof(*(ptr))) {			\
+		case 2:						\
+			x = swab16(x);				\
+			break;					\
+		case 4:						\
+			x = swab32(x);				\
+			break;					\
+		case 8:						\
+			x = swab64(x);				\
+			break;					\
+		default:					\
+			break;					\
+		}						\
+	}							\
+	ret;							\
+})
+
+#define __vq_put_user(vq, x, ptr)					\
+({								\
+	__typeof__(*(ptr)) y = (x);				\
+	if (vq->byteswap) {					\
+		switch (sizeof(*(ptr))) {			\
+		case 2:						\
+			y = swab16(x);				\
+			break;					\
+		case 4:						\
+			y = swab32(x);				\
+			break;					\
+		case 8:						\
+			y = swab64(x);				\
+			break;					\
+		default:					\
+			break;					\
+		}						\
+	}							\
+	__put_user(y, ptr);					\
+})
+
+static void vring_desc_swap(struct vring_desc *desc)
+{
+	desc->addr  = swab64(desc->addr);
+	desc->len   = swab32(desc->len);
+	desc->flags = swab16(desc->flags);
+	desc->next  = swab16(desc->next);
+}
+
 static int vhost_update_used_flags(struct vhost_virtqueue *vq)
 {
 	void __user *used;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 3/4] vhost: byteswap virtqueue attributes
  2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
  2014-10-29  8:38 ` [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag Cédric Le Goater
  2014-10-29  8:38 ` [RFC PATCH 2/4] vhost: add byteswap routines Cédric Le Goater
@ 2014-10-29  8:38 ` Cédric Le Goater
  2014-11-03 16:02   ` Cornelia Huck
  2014-10-29  8:38 ` [RFC PATCH 4/4] vhost_net: byteswap virtio_net header Cédric Le Goater
  2014-11-03 16:23 ` [RFC PATCH 0/4] vhost_net: support for cross endian guests Michael S. Tsirkin
  4 siblings, 1 reply; 11+ messages in thread
From: Cédric Le Goater @ 2014-10-29  8:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik, Cédric Le Goater

The virtqueue structure shares a few attributes with the guest OS
which need to be byteswapped when the endian order of the host is
different.

This patch uses the vq->byteswap attribute to decide whether to
byteswap or not data being accessed in the guest memory.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
---
 drivers/vhost/vhost.c |   52 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 12 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index afcb3368370c..9a538f4d2ff1 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1097,7 +1097,7 @@ static void vring_desc_swap(struct vring_desc *desc)
 static int vhost_update_used_flags(struct vhost_virtqueue *vq)
 {
 	void __user *used;
-	if (__put_user(vq->used_flags, &vq->used->flags) < 0)
+	if (__vq_put_user(vq, vq->used_flags, &vq->used->flags) < 0)
 		return -EFAULT;
 	if (unlikely(vq->log_used)) {
 		/* Make sure the flag is seen before log. */
@@ -1115,7 +1115,7 @@ static int vhost_update_used_flags(struct vhost_virtqueue *vq)
 
 static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event)
 {
-	if (__put_user(vq->avail_idx, vhost_avail_event(vq)))
+	if (__vq_put_user(vq, vq->avail_idx, vhost_avail_event(vq)))
 		return -EFAULT;
 	if (unlikely(vq->log_used)) {
 		void __user *used;
@@ -1142,7 +1142,7 @@ int vhost_init_used(struct vhost_virtqueue *vq)
 	if (r)
 		return r;
 	vq->signalled_used_valid = false;
-	return get_user(vq->last_used_idx, &vq->used->idx);
+	return vq_get_user(vq, vq->last_used_idx, &vq->used->idx);
 }
 EXPORT_SYMBOL_GPL(vhost_init_used);
 
@@ -1254,6 +1254,10 @@ static int get_indirect(struct vhost_virtqueue *vq,
 			       i, (size_t)indirect->addr + i * sizeof desc);
 			return -EINVAL;
 		}
+
+		if (vq->byteswap)
+			vring_desc_swap(&desc);
+
 		if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
 			vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n",
 			       i, (size_t)indirect->addr + i * sizeof desc);
@@ -1309,7 +1313,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
 	/* Check it isn't doing very strange things with descriptor numbers. */
 	last_avail_idx = vq->last_avail_idx;
-	if (unlikely(__get_user(vq->avail_idx, &vq->avail->idx))) {
+	if (unlikely(__vq_get_user(vq, vq->avail_idx, &vq->avail->idx))) {
 		vq_err(vq, "Failed to access avail idx at %p\n",
 		       &vq->avail->idx);
 		return -EFAULT;
@@ -1330,7 +1334,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
 	/* Grab the next descriptor number they're advertising, and increment
 	 * the index we've seen. */
-	if (unlikely(__get_user(head,
+	if (unlikely(__vq_get_user(vq, head,
 				&vq->avail->ring[last_avail_idx % vq->num]))) {
 		vq_err(vq, "Failed to read head: idx %d address %p\n",
 		       last_avail_idx,
@@ -1370,6 +1374,10 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 			       i, vq->desc + i);
 			return -EFAULT;
 		}
+
+		if (vq->byteswap)
+			vring_desc_swap(&desc);
+
 		if (desc.flags & VRING_DESC_F_INDIRECT) {
 			ret = get_indirect(vq, iov, iov_size,
 					   out_num, in_num,
@@ -1437,6 +1445,26 @@ int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
 }
 EXPORT_SYMBOL_GPL(vhost_add_used);
 
+static int __copyhead_to_user(struct vhost_virtqueue *vq,
+			struct vring_used_elem *heads,
+			struct vring_used_elem __user *used,
+			unsigned count)
+{
+	int i;
+
+	for (i = 0; i < count; i++) {
+		if (__vq_put_user(vq, heads[i].id, &used[i].id)) {
+			vq_err(vq, "Failed to write used id");
+			return -EFAULT;
+		}
+		if (__vq_put_user(vq, heads[i].len, &used[i].len)) {
+			vq_err(vq, "Failed to write used len");
+			return -EFAULT;
+		}
+	}
+	return 0;
+}
+
 static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 			    struct vring_used_elem *heads,
 			    unsigned count)
@@ -1448,15 +1476,15 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 	start = vq->last_used_idx % vq->num;
 	used = vq->used->ring + start;
 	if (count == 1) {
-		if (__put_user(heads[0].id, &used->id)) {
+		if (__vq_put_user(vq, heads[0].id, &used->id)) {
 			vq_err(vq, "Failed to write used id");
 			return -EFAULT;
 		}
-		if (__put_user(heads[0].len, &used->len)) {
+		if (__vq_put_user(vq, heads[0].len, &used->len)) {
 			vq_err(vq, "Failed to write used len");
 			return -EFAULT;
 		}
-	} else if (__copy_to_user(used, heads, count * sizeof *used)) {
+	} else if (__copyhead_to_user(vq, heads, used, count)) {
 		vq_err(vq, "Failed to write used");
 		return -EFAULT;
 	}
@@ -1500,7 +1528,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
 
 	/* Make sure buffer is written before we update index. */
 	smp_wmb();
-	if (put_user(vq->last_used_idx, &vq->used->idx)) {
+	if (vq_put_user(vq, vq->last_used_idx, &vq->used->idx)) {
 		vq_err(vq, "Failed to increment used idx");
 		return -EFAULT;
 	}
@@ -1531,7 +1559,7 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 
 	if (!vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) {
 		__u16 flags;
-		if (__get_user(flags, &vq->avail->flags)) {
+		if (__vq_get_user(vq, flags, &vq->avail->flags)) {
 			vq_err(vq, "Failed to get flags");
 			return true;
 		}
@@ -1545,7 +1573,7 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	if (unlikely(!v))
 		return true;
 
-	if (get_user(event, vhost_used_event(vq))) {
+	if (vq_get_user(vq, event, vhost_used_event(vq))) {
 		vq_err(vq, "Failed to get used event idx");
 		return true;
 	}
@@ -1608,7 +1636,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	/* They could have slipped one in as we were doing that: make
 	 * sure it's written, then check again. */
 	smp_mb();
-	r = __get_user(avail_idx, &vq->avail->idx);
+	r = __vq_get_user(vq, avail_idx, &vq->avail->idx);
 	if (r) {
 		vq_err(vq, "Failed to check avail idx at %p: %d\n",
 		       &vq->avail->idx, r);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 4/4] vhost_net: byteswap virtio_net header
  2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
                   ` (2 preceding siblings ...)
  2014-10-29  8:38 ` [RFC PATCH 3/4] vhost: byteswap virtqueue attributes Cédric Le Goater
@ 2014-10-29  8:38 ` Cédric Le Goater
  2014-11-03 16:25   ` Michael S. Tsirkin
  2014-11-03 16:23 ` [RFC PATCH 0/4] vhost_net: support for cross endian guests Michael S. Tsirkin
  4 siblings, 1 reply; 11+ messages in thread
From: Cédric Le Goater @ 2014-10-29  8:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik, Cédric Le Goater

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
---
 drivers/vhost/net.c |   39 ++++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8dae2f724a35..f2d5f585dae9 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -18,6 +18,7 @@
 #include <linux/file.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/swab.h>
 
 #include <linux/net.h>
 #include <linux/if_packet.h>
@@ -329,6 +330,14 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
 	rcu_read_unlock_bh();
 }
 
+static void virtio_net_hdr_swap(struct virtio_net_hdr *hdr)
+{
+	hdr->hdr_len     = swab16(hdr->hdr_len);
+	hdr->gso_size    = swab16(hdr->gso_size);
+	hdr->csum_start  = swab16(hdr->csum_start);
+	hdr->csum_offset = swab16(hdr->csum_offset);
+}
+
 /* Expects to be always run from workqueue - which acts as
  * read-size critical section for our kind of RCU. */
 static void handle_tx(struct vhost_net *net)
@@ -346,7 +355,7 @@ static void handle_tx(struct vhost_net *net)
 		.msg_flags = MSG_DONTWAIT,
 	};
 	size_t len, total_len = 0;
-	int err;
+	int err, has_vnet_hdr;
 	size_t hdr_size;
 	struct socket *sock;
 	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
@@ -359,6 +368,7 @@ static void handle_tx(struct vhost_net *net)
 
 	vhost_disable_notify(&net->dev, vq);
 
+	has_vnet_hdr = vhost_has_feature(vq, VHOST_NET_F_VIRTIO_NET_HDR);
 	hdr_size = nvq->vhost_hlen;
 	zcopy = nvq->ubufs;
 
@@ -406,6 +416,8 @@ static void handle_tx(struct vhost_net *net)
 			break;
 		}
 
+		if (!has_vnet_hdr && vq->byteswap)
+			virtio_net_hdr_swap((void *) vq->iov[0].iov_base);
 		zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
 				   && (nvq->upend_idx + 1) % UIO_MAXIOV !=
 				      nvq->done_idx
@@ -570,7 +582,7 @@ static void handle_rx(struct vhost_net *net)
 		.hdr.gso_type = VIRTIO_NET_HDR_GSO_NONE
 	};
 	size_t total_len = 0;
-	int err, mergeable;
+	int err, mergeable, has_vnet_hdr;
 	s16 headcount;
 	size_t vhost_hlen, sock_hlen;
 	size_t vhost_len, sock_len;
@@ -588,6 +600,7 @@ static void handle_rx(struct vhost_net *net)
 	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
 		vq->log : NULL;
 	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
+	has_vnet_hdr = vhost_has_feature(vq, VHOST_NET_F_VIRTIO_NET_HDR);
 
 	while ((sock_len = peek_head_len(sock->sk))) {
 		sock_len += sock_hlen;
@@ -638,6 +651,9 @@ static void handle_rx(struct vhost_net *net)
 			vhost_discard_vq_desc(vq, headcount);
 			continue;
 		}
+
+		if (!has_vnet_hdr && vq->byteswap)
+			virtio_net_hdr_swap((void *) vq->iov[0].iov_base);
 		if (unlikely(vhost_hlen) &&
 		    memcpy_toiovecend(nvq->hdr, (unsigned char *)&hdr, 0,
 				      vhost_hlen)) {
@@ -646,13 +662,18 @@ static void handle_rx(struct vhost_net *net)
 			break;
 		}
 		/* TODO: Should check and handle checksum. */
-		if (likely(mergeable) &&
-		    memcpy_toiovecend(nvq->hdr, (unsigned char *)&headcount,
-				      offsetof(typeof(hdr), num_buffers),
-				      sizeof hdr.num_buffers)) {
-			vq_err(vq, "Failed num_buffers write");
-			vhost_discard_vq_desc(vq, headcount);
-			break;
+		if (likely(mergeable)) {
+			__u16 tmp = headcount;
+
+			if (vq->byteswap)
+				tmp = swab16(headcount);
+			if (memcpy_toiovecend(nvq->hdr, (unsigned char *)&tmp,
+					offsetof(typeof(hdr), num_buffers),
+					      sizeof(hdr.num_buffers))) {
+				vq_err(vq, "Failed num_buffers write");
+				vhost_discard_vq_desc(vq, headcount);
+				break;
+			}
 		}
 		vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
 					    headcount);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/4] vhost: add byteswap routines
  2014-10-29  8:38 ` [RFC PATCH 2/4] vhost: add byteswap routines Cédric Le Goater
@ 2014-11-03 15:38   ` Cornelia Huck
  0 siblings, 0 replies; 11+ messages in thread
From: Cornelia Huck @ 2014-11-03 15:38 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Michael S. Tsirkin, kvm, kvm-ppc, agraf, paulus, gkurz, aik

On Wed, 29 Oct 2014 09:38:43 +0100
Cédric Le Goater <clg@fr.ibm.com> wrote:

> This patch adds a few helper routines around get_user and put_user
> to ease byteswapping.
> 
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
> ---
> 
> I am not sure these routines belong to this file. There is room for 
> improvement to remove the ugly  switch (sizeof(*(ptr))). Please 
> comment !
> 
>  drivers/vhost/vhost.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 93 insertions(+)
> 

I'm wondering whether it would be a good idea to introduce generic
{get,put}_user_reversed() routines.

That way, you would push the switch (sizeof(*(ptr)) into uaccess code,
where it needs to be done anyway. It also makes it possible to add
optimizations there.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 3/4] vhost: byteswap virtqueue attributes
  2014-10-29  8:38 ` [RFC PATCH 3/4] vhost: byteswap virtqueue attributes Cédric Le Goater
@ 2014-11-03 16:02   ` Cornelia Huck
  0 siblings, 0 replies; 11+ messages in thread
From: Cornelia Huck @ 2014-11-03 16:02 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Michael S. Tsirkin, kvm, kvm-ppc, agraf, paulus, gkurz, aik

On Wed, 29 Oct 2014 09:38:44 +0100
Cédric Le Goater <clg@fr.ibm.com> wrote:

> The virtqueue structure shares a few attributes with the guest OS
> which need to be byteswapped when the endian order of the host is
> different.
> 
> This patch uses the vq->byteswap attribute to decide whether to
> byteswap or not data being accessed in the guest memory.
> 
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
> ---
>  drivers/vhost/vhost.c |   52 +++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 40 insertions(+), 12 deletions(-)
> 

> +static int __copyhead_to_user(struct vhost_virtqueue *vq,
> +			struct vring_used_elem *heads,
> +			struct vring_used_elem __user *used,
> +			unsigned count)

__copy_used_elems_to_user() ?

> +{
> +	int i;
> +
> +	for (i = 0; i < count; i++) {
> +		if (__vq_put_user(vq, heads[i].id, &used[i].id)) {
> +			vq_err(vq, "Failed to write used id");
> +			return -EFAULT;
> +		}
> +		if (__vq_put_user(vq, heads[i].len, &used[i].len)) {
> +			vq_err(vq, "Failed to write used len");
> +			return -EFAULT;
> +		}
> +	}

Is there a number of elements where it would be more efficient to
byteswap the used elements first and then do __copy_to_user() in one
go? Depends on the backend, I guess.

> +	return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/4] vhost_net: support for cross endian guests
  2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
                   ` (3 preceding siblings ...)
  2014-10-29  8:38 ` [RFC PATCH 4/4] vhost_net: byteswap virtio_net header Cédric Le Goater
@ 2014-11-03 16:23 ` Michael S. Tsirkin
  2014-11-04  8:07   ` Cedric Le Goater
  4 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2014-11-03 16:23 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik

On Wed, Oct 29, 2014 at 09:38:41AM +0100, Cédric Le Goater wrote:
> This patchset adds a VHOST_VRING_F_BYTESWAP flag to inform the host 
> to byteswap data of the vring when the guest and the host have a 
> different endian order. The flag is stored at initialization in an 
> attribute of the virtio queues. It is then used to byteswap, or not, 
> the vring indexes and descriptors shared with the guest OS.
> 
> The last patch adds the byteswapping of the virtio_net header as it 
> is done in qemu.

Hi Cedric,

Thanks for submitting this.
One general problem with this approach, is that
it adds overhead e.g. for x86 on x86 unconditionally.

I will in a couple of days post a patch adding virtio 1.0
support for vhost.

This will serve as a better basis for cross-endian support in
vhost, on top.

I'll try to remember to Cc you.

> The patches apply on linux-3.18-rc2 and the tests were done on PowerPC 
> using the following hosts :
>  
>    	 fedora21/ppc64, utopic/ppc64le  
> 
> with various guests : 
> 
>    	 trusty/ppc64le, utopic/ppc64le, debian/ppc64le, 
>    	 rhel6.5/ppc64, fedora21/ppc64, debian/ppc64
> 
> Regressions tests for x86_64 were done a debian host using rhel6.6, 
> fedora20 and debian guests.
> 	
> 
> Cédric Le Goater (4):
>   vhost: add VHOST_VRING_F_BYTESWAP flag
>   vhost: add byteswap routines
>   vhost: byteswap virtqueue attributes
>   vhost_net: byteswap virtio_net header
> 
>  drivers/vhost/net.c        |   39 +++++++++---
>  drivers/vhost/vhost.c      |  150 ++++++++++++++++++++++++++++++++++++++++----
>  drivers/vhost/vhost.h      |    1 +
>  include/uapi/linux/vhost.h |    3 +
>  4 files changed, 171 insertions(+), 22 deletions(-)
> 
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 4/4] vhost_net: byteswap virtio_net header
  2014-10-29  8:38 ` [RFC PATCH 4/4] vhost_net: byteswap virtio_net header Cédric Le Goater
@ 2014-11-03 16:25   ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2014-11-03 16:25 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik

On Wed, Oct 29, 2014 at 09:38:45AM +0100, Cédric Le Goater wrote:
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>

This patch casts userspace pointers to void *,
this is generally unsafe. In particular sparse will warn.

> ---
>  drivers/vhost/net.c |   39 ++++++++++++++++++++++++++++++---------
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 8dae2f724a35..f2d5f585dae9 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -18,6 +18,7 @@
>  #include <linux/file.h>
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> +#include <linux/swab.h>
>  
>  #include <linux/net.h>
>  #include <linux/if_packet.h>
> @@ -329,6 +330,14 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>  	rcu_read_unlock_bh();
>  }
>  
> +static void virtio_net_hdr_swap(struct virtio_net_hdr *hdr)
> +{
> +	hdr->hdr_len     = swab16(hdr->hdr_len);
> +	hdr->gso_size    = swab16(hdr->gso_size);
> +	hdr->csum_start  = swab16(hdr->csum_start);
> +	hdr->csum_offset = swab16(hdr->csum_offset);
> +}
> +
>  /* Expects to be always run from workqueue - which acts as
>   * read-size critical section for our kind of RCU. */
>  static void handle_tx(struct vhost_net *net)
> @@ -346,7 +355,7 @@ static void handle_tx(struct vhost_net *net)
>  		.msg_flags = MSG_DONTWAIT,
>  	};
>  	size_t len, total_len = 0;
> -	int err;
> +	int err, has_vnet_hdr;
>  	size_t hdr_size;
>  	struct socket *sock;
>  	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
> @@ -359,6 +368,7 @@ static void handle_tx(struct vhost_net *net)
>  
>  	vhost_disable_notify(&net->dev, vq);
>  
> +	has_vnet_hdr = vhost_has_feature(vq, VHOST_NET_F_VIRTIO_NET_HDR);
>  	hdr_size = nvq->vhost_hlen;
>  	zcopy = nvq->ubufs;
>  
> @@ -406,6 +416,8 @@ static void handle_tx(struct vhost_net *net)
>  			break;
>  		}
>  
> +		if (!has_vnet_hdr && vq->byteswap)
> +			virtio_net_hdr_swap((void *) vq->iov[0].iov_base);
>  		zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
>  				   && (nvq->upend_idx + 1) % UIO_MAXIOV !=
>  				      nvq->done_idx
> @@ -570,7 +582,7 @@ static void handle_rx(struct vhost_net *net)
>  		.hdr.gso_type = VIRTIO_NET_HDR_GSO_NONE
>  	};
>  	size_t total_len = 0;
> -	int err, mergeable;
> +	int err, mergeable, has_vnet_hdr;
>  	s16 headcount;
>  	size_t vhost_hlen, sock_hlen;
>  	size_t vhost_len, sock_len;
> @@ -588,6 +600,7 @@ static void handle_rx(struct vhost_net *net)
>  	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
>  		vq->log : NULL;
>  	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
> +	has_vnet_hdr = vhost_has_feature(vq, VHOST_NET_F_VIRTIO_NET_HDR);
>  
>  	while ((sock_len = peek_head_len(sock->sk))) {
>  		sock_len += sock_hlen;
> @@ -638,6 +651,9 @@ static void handle_rx(struct vhost_net *net)
>  			vhost_discard_vq_desc(vq, headcount);
>  			continue;
>  		}
> +
> +		if (!has_vnet_hdr && vq->byteswap)
> +			virtio_net_hdr_swap((void *) vq->iov[0].iov_base);
>  		if (unlikely(vhost_hlen) &&
>  		    memcpy_toiovecend(nvq->hdr, (unsigned char *)&hdr, 0,
>  				      vhost_hlen)) {
> @@ -646,13 +662,18 @@ static void handle_rx(struct vhost_net *net)
>  			break;
>  		}
>  		/* TODO: Should check and handle checksum. */
> -		if (likely(mergeable) &&
> -		    memcpy_toiovecend(nvq->hdr, (unsigned char *)&headcount,
> -				      offsetof(typeof(hdr), num_buffers),
> -				      sizeof hdr.num_buffers)) {
> -			vq_err(vq, "Failed num_buffers write");
> -			vhost_discard_vq_desc(vq, headcount);
> -			break;
> +		if (likely(mergeable)) {
> +			__u16 tmp = headcount;
> +
> +			if (vq->byteswap)
> +				tmp = swab16(headcount);
> +			if (memcpy_toiovecend(nvq->hdr, (unsigned char *)&tmp,
> +					offsetof(typeof(hdr), num_buffers),
> +					      sizeof(hdr.num_buffers))) {
> +				vq_err(vq, "Failed num_buffers write");
> +				vhost_discard_vq_desc(vq, headcount);
> +				break;
> +			}
>  		}
>  		vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
>  					    headcount);
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag
  2014-10-29  8:38 ` [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag Cédric Le Goater
@ 2014-11-03 16:28   ` Michael S. Tsirkin
  0 siblings, 0 replies; 11+ messages in thread
From: Michael S. Tsirkin @ 2014-11-03 16:28 UTC (permalink / raw)
  To: Cédric Le Goater; +Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik

On Wed, Oct 29, 2014 at 09:38:42AM +0100, Cédric Le Goater wrote:
> The VHOST_VRING_F_BYTESWAP flag will be used by the host to byteswap
> the vring data when the guest and the host have a different endian
> order.
> 
> Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>

I don't think it's a good API.
You should ask for specific header format, not for a swap.

> ---
>  drivers/vhost/vhost.c      |    5 ++++-
>  drivers/vhost/vhost.h      |    1 +
>  include/uapi/linux/vhost.h |    3 +++
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index c90f4374442a..72c21b790ba3 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -199,6 +199,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
>  	vq->call = NULL;
>  	vq->log_ctx = NULL;
>  	vq->memory = NULL;
> +	vq->byteswap = 0;
>  }
>  
>  static int vhost_worker(void *data)
> @@ -701,7 +702,8 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
>  			r = -EFAULT;
>  			break;
>  		}
> -		if (a.flags & ~(0x1 << VHOST_VRING_F_LOG)) {
> +		if (a.flags & ~(0x1 << VHOST_VRING_F_LOG |
> +				0x1 << VHOST_VRING_F_BYTESWAP)) {
>  			r = -EOPNOTSUPP;
>  			break;
>  		}
> @@ -747,6 +749,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp)
>  		vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
>  		vq->log_addr = a.log_guest_addr;
>  		vq->used = (void __user *)(unsigned long)a.used_user_addr;
> +		vq->byteswap = !!(a.flags & (0x1 << VHOST_VRING_F_BYTESWAP));
>  		break;
>  	case VHOST_SET_VRING_KICK:
>  		if (copy_from_user(&f, argp, sizeof f)) {
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index 3eda654b8f5a..ab25b7d0720d 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -110,6 +110,7 @@ struct vhost_virtqueue {
>  	/* Log write descriptors */
>  	void __user *log_base;
>  	struct vhost_log *log;
> +	bool byteswap;
>  };
>  
>  struct vhost_dev {
> diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> index bb6a5b4cb3c5..6a8c2b325c44 100644
> --- a/include/uapi/linux/vhost.h
> +++ b/include/uapi/linux/vhost.h
> @@ -34,6 +34,9 @@ struct vhost_vring_addr {
>  	/* Flag values: */
>  	/* Whether log address is valid. If set enables logging. */
>  #define VHOST_VRING_F_LOG 0
> +	/* Whether vring memory accesses should be byte-swapped.
> +	 * required when the guest has a different endianness */
> +#define VHOST_VRING_F_BYTESWAP 1
>  
>  	/* Start of array of descriptors (virtually contiguous) */
>  	__u64 desc_user_addr;
> -- 
> 1.7.10.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 0/4] vhost_net: support for cross endian guests
  2014-11-03 16:23 ` [RFC PATCH 0/4] vhost_net: support for cross endian guests Michael S. Tsirkin
@ 2014-11-04  8:07   ` Cedric Le Goater
  0 siblings, 0 replies; 11+ messages in thread
From: Cedric Le Goater @ 2014-11-04  8:07 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, kvm-ppc, agraf, paulus, gkurz, aik

On 11/03/2014 05:23 PM, Michael S. Tsirkin wrote:
> On Wed, Oct 29, 2014 at 09:38:41AM +0100, Cédric Le Goater wrote:
>> This patchset adds a VHOST_VRING_F_BYTESWAP flag to inform the host 
>> to byteswap data of the vring when the guest and the host have a 
>> different endian order. The flag is stored at initialization in an 
>> attribute of the virtio queues. It is then used to byteswap, or not, 
>> the vring indexes and descriptors shared with the guest OS.
>>
>> The last patch adds the byteswapping of the virtio_net header as it 
>> is done in qemu.
> 
> Hi Cedric,
> 
> Thanks for submitting this.
> One general problem with this approach, is that
> it adds overhead e.g. for x86 on x86 unconditionally.

Yes but it should be possible to #ifdef most of the routines to
make them transparent for x86. 

> I will in a couple of days post a patch adding virtio 1.0
> support for vhost.
> 
> This will serve as a better basis for cross-endian support in
> vhost, on top.

ok. I will rework the patchset on top of it then.

> I'll try to remember to Cc you.

Thanks.

C.


 
>> The patches apply on linux-3.18-rc2 and the tests were done on PowerPC 
>> using the following hosts :
>>  
>>    	 fedora21/ppc64, utopic/ppc64le  
>>
>> with various guests : 
>>
>>    	 trusty/ppc64le, utopic/ppc64le, debian/ppc64le, 
>>    	 rhel6.5/ppc64, fedora21/ppc64, debian/ppc64
>>
>> Regressions tests for x86_64 were done a debian host using rhel6.6, 
>> fedora20 and debian guests.
>> 	
>>
>> Cédric Le Goater (4):
>>   vhost: add VHOST_VRING_F_BYTESWAP flag
>>   vhost: add byteswap routines
>>   vhost: byteswap virtqueue attributes
>>   vhost_net: byteswap virtio_net header
>>
>>  drivers/vhost/net.c        |   39 +++++++++---
>>  drivers/vhost/vhost.c      |  150 ++++++++++++++++++++++++++++++++++++++++----
>>  drivers/vhost/vhost.h      |    1 +
>>  include/uapi/linux/vhost.h |    3 +
>>  4 files changed, 171 insertions(+), 22 deletions(-)
>>
>> -- 
>> 1.7.10.4
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-11-04  8:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-29  8:38 [RFC PATCH 0/4] vhost_net: support for cross endian guests Cédric Le Goater
2014-10-29  8:38 ` [RFC PATCH 1/4] vhost: add VHOST_VRING_F_BYTESWAP flag Cédric Le Goater
2014-11-03 16:28   ` Michael S. Tsirkin
2014-10-29  8:38 ` [RFC PATCH 2/4] vhost: add byteswap routines Cédric Le Goater
2014-11-03 15:38   ` Cornelia Huck
2014-10-29  8:38 ` [RFC PATCH 3/4] vhost: byteswap virtqueue attributes Cédric Le Goater
2014-11-03 16:02   ` Cornelia Huck
2014-10-29  8:38 ` [RFC PATCH 4/4] vhost_net: byteswap virtio_net header Cédric Le Goater
2014-11-03 16:25   ` Michael S. Tsirkin
2014-11-03 16:23 ` [RFC PATCH 0/4] vhost_net: support for cross endian guests Michael S. Tsirkin
2014-11-04  8:07   ` Cedric Le Goater

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox