kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will.deacon@arm.com>
To: Sasha Levin <sasha.levin@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	syzkaller@googlegroups.com, Sasha Levin <levinsasha928@gmail.com>,
	Pekka Enberg <penberg@kernel.org>,
	Asias He <asias.hejun@gmail.com>,
	penberg@cs.helsinki.fi, Cyrill Gorcunov <gorcunov@gmail.com>,
	matt@ozlabs.org, Michael Ellerman <michael@ellerman.id.au>,
	Prasad Joshi <prasadjoshi124@gmail.com>,
	marc.zyngier@arm.com,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	mingo@elte.hu, gorcunov@openvz.org, kvm@vger.kernel.org,
	Kostya Serebryany <kcc@google.com>,
	Evgenii Stepanov <eugenis@google.com>,
	Alexey Samsonov <samsonov@google.com>,
	Alexander Potapenko <glider@google.com>
Subject: Re: Network hangs when communicating with host
Date: Tue, 27 Oct 2015 09:31:07 +0000	[thread overview]
Message-ID: <20151027093106.GA1689@arm.com> (raw)
In-Reply-To: <5626489B.1020309@oracle.com>

[apologies for the delay -- I've been off for a week and am catching up
 on email]

On Tue, Oct 20, 2015 at 09:58:51AM -0400, Sasha Levin wrote:
> On 10/20/2015 09:42 AM, Dmitry Vyukov wrote:
> > I now have another issue. My binary fails to mmap a file within lkvm
> > sandbox. The same binary works fine on host and in qemu. I've added
> > strace into sandbox script, and here is the output:
> > 
> > [pid   837] openat(AT_FDCWD, "syzkaller-shm048878722", O_RDWR|O_CLOEXEC) = 5
> > [pid   837] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 5,
> > 0) = -1 EINVAL (Invalid argument)
> > 
> > I don't see anything that can potentially cause EINVAL here. Is it
> > possible that lkvm somehow affects kernel behavior here?
> > 
> > I run lkvm as:
> > 
> > $ taskset 1 /kvmtool/lkvm sandbox --disk syz-0 --mem=2048 --cpus=2
> > --kernel /arch/x86/boot/bzImage --network mode=user --sandbox
> > /workdir/kvm/syz-0.sh
> 
> It's possible that something in the virtio-9p layer is broken. I'll give
> it a look in the evening.

I ended up with the patch below, but it's really ugly and I didn't get
round to posting it.

Will

--->8

>From 7cbcdfef1b9f094db4bf75676f22339f3164e103 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Fri, 17 Apr 2015 17:31:36 +0100
Subject: [PATCH] kvmtool: virtio-net: fix VIRTIO_NET_F_MRG_RXBUF usage in rx
 thread

When merging virtio-net buffers using the VIRTIO_NET_F_MRG_RXBUF feature,
the first buffer added to the used ring should indicate the total number
of buffers used to hold the packet. Unfortunately, kvmtool has a number
of issues when constructing these merged buffers:

  - Commit 5131332e3f1a ("kvmtool: convert net backend to support
    bi-endianness") introduced a strange loop counter, which resulted in
    hdr->num_buffers being set redundantly the first time round

  - When adding the buffers to the ring, we actually add them one-by-one,
    allowing the guest to see the header before we've inserted the rest
    of the data buffers...

  - ... which is made worse because we non-atomically increment the
    num_buffers count in the header each time we insert a new data buffer

Consequently, the guest quickly becomes confused in its net rx code and
the whole thing grinds to a halt. This is easily exemplified by trying
to boot a root filesystem over NFS, which seldom succeeds.

This patch resolves the issues by allowing us to insert items into the
used ring without updating the index. Once the full payload has been
added and num_buffers corresponds to the total size, we *then* publish
the buffers to the guest.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 include/kvm/virtio.h |  2 ++
 virtio/core.c        | 32 +++++++++++++++++++++++++-------
 virtio/net.c         | 31 +++++++++++++++----------------
 3 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 768ee9668d44..8324ba7d38be 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -112,6 +112,8 @@ static inline bool virt_queue__available(struct virt_queue *vq)
 	return virtio_guest_to_host_u16(vq, vq->vring.avail->idx) != vq->last_avail_idx;
 }
 
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump);
+struct vring_used_elem * virt_queue__set_used_elem_no_update(struct virt_queue *queue, u32 head, u32 len, u16 offset);
 struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32 head, u32 len);
 
 bool virtio_queue__should_signal(struct virt_queue *vq);
diff --git a/virtio/core.c b/virtio/core.c
index 3b6e4d7cd045..d6ac289d450e 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -21,22 +21,17 @@ const char* virtio_trans_name(enum virtio_trans trans)
 	return "unknown";
 }
 
-struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32 head, u32 len)
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump)
 {
-	struct vring_used_elem *used_elem;
 	u16 idx = virtio_guest_to_host_u16(queue, queue->vring.used->idx);
 
-	used_elem	= &queue->vring.used->ring[idx % queue->vring.num];
-	used_elem->id	= virtio_host_to_guest_u32(queue, head);
-	used_elem->len	= virtio_host_to_guest_u32(queue, len);
-
 	/*
 	 * Use wmb to assure that used elem was updated with head and len.
 	 * We need a wmb here since we can't advance idx unless we're ready
 	 * to pass the used element to the guest.
 	 */
 	wmb();
-	idx++;
+	idx += jump;
 	queue->vring.used->idx = virtio_host_to_guest_u16(queue, idx);
 
 	/*
@@ -45,6 +40,29 @@ struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32
 	 * an updated idx.
 	 */
 	wmb();
+}
+
+struct vring_used_elem *
+virt_queue__set_used_elem_no_update(struct virt_queue *queue, u32 head,
+				    u32 len, u16 offset)
+{
+	struct vring_used_elem *used_elem;
+	u16 idx = virtio_guest_to_host_u16(queue, queue->vring.used->idx);
+
+	idx += offset;
+	used_elem	= &queue->vring.used->ring[idx % queue->vring.num];
+	used_elem->id	= virtio_host_to_guest_u32(queue, head);
+	used_elem->len	= virtio_host_to_guest_u32(queue, len);
+
+	return used_elem;
+}
+
+struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, u32 head, u32 len)
+{
+	struct vring_used_elem *used_elem;
+
+	used_elem = virt_queue__set_used_elem_no_update(queue, head, len, 0);
+	virt_queue__used_idx_advance(queue, 1);
 
 	return used_elem;
 }
diff --git a/virtio/net.c b/virtio/net.c
index 9784520336b1..afee75333edb 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -80,14 +80,12 @@ static void virtio_net_fix_tx_hdr(struct virtio_net_hdr *hdr, struct net_dev *nd
 	hdr->csum_offset	= virtio_guest_to_host_u16(&ndev->vdev, hdr->csum_offset);
 }
 
-static void virtio_net_fix_rx_hdr(struct virtio_net_hdr_mrg_rxbuf *hdr, struct net_dev *ndev)
+static void virtio_net_fix_rx_hdr(struct virtio_net_hdr *hdr, struct net_dev *ndev)
 {
-	hdr->hdr.hdr_len	= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr.hdr_len);
-	hdr->hdr.gso_size	= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr.gso_size);
-	hdr->hdr.csum_start	= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr.csum_start);
-	hdr->hdr.csum_offset	= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr.csum_offset);
-	if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF))
-		hdr->num_buffers	= virtio_host_to_guest_u16(&ndev->vdev, hdr->num_buffers);
+	hdr->hdr_len		= virtio_host_to_guest_u16(&ndev->vdev, hdr->hdr_len);
+	hdr->gso_size		= virtio_host_to_guest_u16(&ndev->vdev, hdr->gso_size);
+	hdr->csum_start		= virtio_host_to_guest_u16(&ndev->vdev, hdr->csum_start);
+	hdr->csum_offset	= virtio_host_to_guest_u16(&ndev->vdev, hdr->csum_offset);
 }
 
 static void *virtio_net_rx_thread(void *p)
@@ -123,7 +121,7 @@ static void *virtio_net_rx_thread(void *p)
 				.iov_len  = sizeof(buffer),
 			};
 			struct virtio_net_hdr_mrg_rxbuf *hdr;
-			int i;
+			u16 num_buffers;
 
 			len = ndev->ops->rx(&dummy_iov, 1, ndev);
 			if (len < 0) {
@@ -132,7 +130,7 @@ static void *virtio_net_rx_thread(void *p)
 				goto out_err;
 			}
 
-			copied = i = 0;
+			copied = num_buffers = 0;
 			head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
 			hdr = iov[0].iov_base;
 			while (copied < len) {
@@ -140,19 +138,20 @@ static void *virtio_net_rx_thread(void *p)
 
 				memcpy_toiovec(iov, buffer + copied, iovsize);
 				copied += iovsize;
-				if (i++ == 0)
-					virtio_net_fix_rx_hdr(hdr, ndev);
-				if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF)) {
-					u16 num_buffers = virtio_guest_to_host_u16(vq, hdr->num_buffers);
-					hdr->num_buffers = virtio_host_to_guest_u16(vq, num_buffers + 1);
-				}
-				virt_queue__set_used_elem(vq, head, iovsize);
+				virt_queue__set_used_elem_no_update(vq, head, iovsize, num_buffers++);
 				if (copied == len)
 					break;
 				while (!virt_queue__available(vq))
 					sleep(0);
 				head = virt_queue__get_iov(vq, iov, &out, &in, kvm);
 			}
+
+			virtio_net_fix_rx_hdr(&hdr->hdr, ndev);
+			if (has_virtio_feature(ndev, VIRTIO_NET_F_MRG_RXBUF))
+				hdr->num_buffers = virtio_host_to_guest_u16(vq, num_buffers);
+
+			virt_queue__used_idx_advance(vq, num_buffers);
+
 			/* We should interrupt guest right now, otherwise latency is huge. */
 			if (virtio_queue__should_signal(vq))
 				ndev->vdev.ops->signal_vq(kvm, &ndev->vdev, id);
-- 
2.1.4


      reply	other threads:[~2015-10-27  9:31 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-15 20:20 Network hangs when communicating with host Dmitry Vyukov
2015-10-16 17:25 ` Sasha Levin
     [not found]   ` <CACT4Y+bG3gZv7eBUg5hv=5CEasdGUHwYEe6Bae6OVMK3bZe1Rw@mail.gmail.com>
2015-10-19  9:22     ` Andre Przywara
2015-10-19  9:28       ` Dmitry Vyukov
2015-10-19 14:20         ` Sasha Levin
2015-10-20 13:42           ` Dmitry Vyukov
2015-10-20 13:58             ` Sasha Levin
2015-10-27  9:31               ` Will Deacon [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151027093106.GA1689@arm.com \
    --to=will.deacon@arm.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=asias.hejun@gmail.com \
    --cc=dvyukov@google.com \
    --cc=eugenis@google.com \
    --cc=glider@google.com \
    --cc=gorcunov@gmail.com \
    --cc=gorcunov@openvz.org \
    --cc=kcc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=levinsasha928@gmail.com \
    --cc=marc.zyngier@arm.com \
    --cc=matt@ozlabs.org \
    --cc=michael@ellerman.id.au \
    --cc=mingo@elte.hu \
    --cc=penberg@cs.helsinki.fi \
    --cc=penberg@kernel.org \
    --cc=prasadjoshi124@gmail.com \
    --cc=samsonov@google.com \
    --cc=sasha.levin@oracle.com \
    --cc=syzkaller@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).