public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
@ 2011-12-23  5:16 Rusty Russell
  2011-12-23  9:31 ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Rusty Russell @ 2011-12-23  5:16 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel, virtualization

This one as a signed tag on github, in case the inline patch was
the reason you dropped this.

virtio-mmio in new 3.2, and they found a corruption bug.  Please apply.

 * [new tag]         rusty@rustcorp.com.au -> rusty@rustcorp.com.au
The following changes since commit b3b1b70e62a603f473619dbebc3b3d23f535e6f8:

  Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb (2011-12-22 12:59:47 -0800)

are available in the git repository at:

  git://github.com/rustyrussell/linux.git master

Rusty Russell (1):
      virtio: harsher barriers for virtio-mmio.

 drivers/lguest/lguest_device.c |    8 +++++---
 drivers/s390/kvm/kvm_virtio.c  |    2 +-
 drivers/virtio/virtio_mmio.c   |    7 ++++---
 drivers/virtio/virtio_pci.c    |    4 ++--
 drivers/virtio/virtio_ring.c   |   34 +++++++++++++++++++++-------------
 include/linux/virtio_ring.h    |    1 +
 tools/virtio/linux/virtio.h    |    1 +
 tools/virtio/virtio_test.c     |    3 ++-
 8 files changed, 37 insertions(+), 23 deletions(-)

commit ef3a731beb9a030e552945a734dc898b5525e2f7
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Fri Dec 23 15:07:56 2011 +1030

    virtio: harsher barriers for virtio-mmio.
    
    We were cheating with our barriers; using the smp ones rather than the
    real device ones.  That was fine, until virtio-mmio came along, which
    could be talking to a real device (a non-SMP CPU).
    
    Unfortunately, just putting back the real barriers (reverting
    d57ed95d) causes a performance regression on virtio-pci.  In
    particular, Amos reports netbench's TCP_RR over virtio_net CPU
    utilization increased up to 35% while throughput went down by up to
    14%.
    
    By comparison, this branch is in the noise.
    
    Reference: https://lkml.org/lkml/2011/12/11/22
    
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

 drivers/lguest/lguest_device.c |    8 +++++---
 drivers/s390/kvm/kvm_virtio.c  |    2 +-
 drivers/virtio/virtio_mmio.c   |    7 ++++---
 drivers/virtio/virtio_pci.c    |    4 ++--
 drivers/virtio/virtio_ring.c   |   34 +++++++++++++++++++++-------------
 include/linux/virtio_ring.h    |    1 +
 tools/virtio/linux/virtio.h    |    1 +
 tools/virtio/virtio_test.c     |    3 ++-
 8 files changed, 37 insertions(+), 23 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-23  5:16 [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio Rusty Russell
@ 2011-12-23  9:31 ` Linus Torvalds
  2011-12-23  9:35   ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2011-12-23  9:31 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, virtualization

On Thu, Dec 22, 2011 at 9:16 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> This one as a signed tag on github, in case the inline patch was
> the reason you dropped this.

No, and please why do you make your tag-names be your email address?
That's just odd. It seems to be related to some broken SCM system that
thinks that tag-names are global and somehow different from
branch-names.

No, the reason I didn't pull was that the games with the barriers just
threw me off. They seem hacky and disgusting beyond words. And afaik
there are no actual *devices* that implement virtio-mmu, and the
commit that adds the mmu thing talks about qemu interfaces to. So it's
still just virtual, which makes the whole thing just masturbatory,
afaik. There's nothing out-of-order in virtual mmio.

So the whole thing looks confused. There's never any reason to
actually use the expensive sfence/rfences at all. Afaik you still just
want smp_*mb() for all cases.

                  Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-23  9:31 ` Linus Torvalds
@ 2011-12-23  9:35   ` Linus Torvalds
  2011-12-23 11:35     ` Ohad Ben-Cohen
  0 siblings, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2011-12-23  9:35 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, virtualization

On Fri, Dec 23, 2011 at 1:31 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So the whole thing looks confused. There's never any reason to
> actually use the expensive sfence/rfences at all. Afaik you still just
> want smp_*mb() for all cases.

But note that I haven't thought deeply about it, I just looked at the
patch and went "Hmm, that can't be right", and then it kind of got
dropped because I forgot about it.

On x86, there really is never any reason to use the heavy memory
barriers unless you are talking to a real device. And last I saw,
"virtio" was still about virtual IO.

Now, I could in theory imagine that you migth want to unconditionally
have the "smp_*mb()" barriers in the case where you are running UP
guest in an SMP host, but that's not what the patch actually did. It
did that odd crazy dynamic "IO barrier or SMP barrier" thing, which
just is confused and doesn't make sense to me in any situation I can
think of.

                    Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-23  9:35   ` Linus Torvalds
@ 2011-12-23 11:35     ` Ohad Ben-Cohen
  2011-12-24  3:01       ` Rusty Russell
  2011-12-29  4:31       ` Rusty Russell
  0 siblings, 2 replies; 7+ messages in thread
From: Ohad Ben-Cohen @ 2011-12-23 11:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rusty Russell, linux-kernel, virtualization, linux-arm

On Fri, Dec 23, 2011 at 11:35 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On x86, there really is never any reason to use the heavy memory
> barriers unless you are talking to a real device. And last I saw,
> "virtio" was still about virtual IO.

I reported this originally, so maybe I should describe our use case a
bit here (it's not virtio-mmio). It's a bit long, so apologizes in
advance.

Almost every SoC today have several additional cores (DSP or whatnot)
which usually employ some hardware multimedia accelerators and are
used to offload cpu-intensive tasks from the main application
processor. These other cores are used in an asymmetric multiprocessing
configurations, i.e. they run their own instance of operating system
(which can be some flavor of RTOS, or Linux, or whatnot. anything
goes).

Virtually every SoC vendor have this (lots of ARM vendors, but it's
definitely not limited to ARM), and they all have their own way of
controlling, and communicating with, those remote cores. And it's
usually rather big (tens of thousands loc), out-of-tree and very
vendor-specific code.

So we're trying to fix this by introducing some generic code that'd
control those remote cores and let drivers talk to them, which all
vendors could then use. I'll be sending you a 3.3 pull request for
this, but you can already take a look in linux-next at drivers/rpmsg
(inter-processor communication bus) and drivers/remoteproc (framework
for booting a remote core).

And rpmsg is using virtio to avoid implementing another shared memory
"wire" protocol. And of course to be able to reuse all the existing
virtio drivers (e.g. net, block, console) with a remote core backend.

Which leads me to the specific issue we have. On OMAP4, the virtio
kick is implemented using a memory-mapped mailbox device. After
updating a vring (which is mapped using ARM's Normal memory) and
before kicking the remote core (using the mailbox device which is
mapped using ARM's Device memory) we must use a "heavy" memory barrier
(i.e. ARM's DSB). Otherwise, if only an smp memory barrier is used
(i.e. DMB on ARM), the kick might jump ahead before the remote core
has observed the updates to the vrings. And then bad things happen.

We didn't want to inflict the performance degradation on the
virtualization use cases (which can run concurrently with the remote
core scenarios), hence the dynamic "IO or SMP barrier" thingy.

(Btw we don't have enough information about other
setups/configurations as other vendors just begin using virtio for
these kind of scenarios, but we guess this is probably isn't limited
to OMAP. Fixing it at the transport layer sounds reasonable, although
there are other ways to do this too).

Hope this makes sense,

Thanks,
Ohad.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-23 11:35     ` Ohad Ben-Cohen
@ 2011-12-24  3:01       ` Rusty Russell
  2011-12-29  4:31       ` Rusty Russell
  1 sibling, 0 replies; 7+ messages in thread
From: Rusty Russell @ 2011-12-24  3:01 UTC (permalink / raw)
  To: Ohad Ben-Cohen, Linus Torvalds; +Cc: linux-kernel, virtualization, linux-arm

On Fri, 23 Dec 2011 13:35:26 +0200, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Fri, Dec 23, 2011 at 11:35 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On x86, there really is never any reason to use the heavy memory
> > barriers unless you are talking to a real device. And last I saw,
> > "virtio" was still about virtual IO.

Not any more, as the commit message describes.

But I missed that they're using rpmsg, which isn't merged yet.

Sorry for the noise,
Rusty.
PS. Switching barriers by bus type is too crude anyway, let's come up
    with something better...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-23 11:35     ` Ohad Ben-Cohen
  2011-12-24  3:01       ` Rusty Russell
@ 2011-12-29  4:31       ` Rusty Russell
  2011-12-29  6:53         ` Ohad Ben-Cohen
  1 sibling, 1 reply; 7+ messages in thread
From: Rusty Russell @ 2011-12-29  4:31 UTC (permalink / raw)
  To: Ohad Ben-Cohen; +Cc: linux-kernel, virtualization, linux-arm

On Fri, 23 Dec 2011 13:35:26 +0200, Ohad Ben-Cohen <ohad@wizery.com> wrote:
> On Fri, Dec 23, 2011 at 11:35 AM, Linus Torvalds
> And rpmsg is using virtio to avoid implementing another shared memory
> "wire" protocol. And of course to be able to reuse all the existing
> virtio drivers (e.g. net, block, console) with a remote core backend.

OK, I've applied this patch; you might want to carry it in your tree
too.  You'll need to fix up your call to vring_new_virtqueue()....

From: Rusty Russell <rusty@rustcorp.com.au>
Subject: virtio: harsher barriers for rpmsg.

We were cheating with our barriers; using the smp ones rather than the
real device ones.  That was fine, until rpmsg came along, which is
used to talk to a real device (a non-SMP CPU).

Unfortunately, just putting back the real barriers (reverting
d57ed95d) causes a performance regression on virtio-pci.  In
particular, Amos reports netbench's TCP_RR over virtio_net CPU
utilization increased up to 35% while throughput went down by up to
14%.

By comparison, this branch is in the noise.

Reference: https://lkml.org/lkml/2011/12/11/22

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/lguest/lguest_device.c |    8 +++++---
 drivers/s390/kvm/kvm_virtio.c  |    2 +-
 drivers/virtio/virtio_mmio.c   |    4 ++--
 drivers/virtio/virtio_pci.c    |    4 ++--
 drivers/virtio/virtio_ring.c   |   34 +++++++++++++++++++++-------------
 include/linux/virtio_ring.h    |    1 +
 tools/virtio/linux/virtio.h    |    1 +
 tools/virtio/virtio_test.c     |    3 ++-
 8 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -292,10 +292,12 @@ static struct virtqueue *lg_find_vq(stru
 
 	/*
 	 * OK, tell virtio_ring.c to set up a virtqueue now we know its size
-	 * and we've got a pointer to its pages.
+	 * and we've got a pointer to its pages.  Note that we set weak_barriers
+	 * to 'true': the host just a(nother) SMP CPU, so we only need inter-cpu
+	 * barriers.
 	 */
-	vq = vring_new_virtqueue(lvq->config.num, LGUEST_VRING_ALIGN,
-				 vdev, lvq->pages, lg_notify, callback, name);
+	vq = vring_new_virtqueue(lvq->config.num, LGUEST_VRING_ALIGN, vdev,
+				 true, lvq->pages, lg_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
 		goto unmap;
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -198,7 +198,7 @@ static struct virtqueue *kvm_find_vq(str
 		goto out;
 
 	vq = vring_new_virtqueue(config->num, KVM_S390_VIRTIO_RING_ALIGN,
-				 vdev, (void *) config->address,
+				 vdev, true, (void *) config->address,
 				 kvm_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -310,8 +310,8 @@ static struct virtqueue *vm_setup_vq(str
 			vm_dev->base + VIRTIO_MMIO_QUEUE_PFN);
 
 	/* Create the vring */
-	vq = vring_new_virtqueue(info->num, VIRTIO_MMIO_VRING_ALIGN,
-				 vdev, info->queue, vm_notify, callback, name);
+	vq = vring_new_virtqueue(info->num, VIRTIO_MMIO_VRING_ALIGN, vdev,
+				 true, info->queue, vm_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
 		goto error_new_virtqueue;
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -414,8 +414,8 @@ static struct virtqueue *setup_vq(struct
 		  vp_dev->ioaddr + VIRTIO_PCI_QUEUE_PFN);
 
 	/* create the vring */
-	vq = vring_new_virtqueue(info->num, VIRTIO_PCI_VRING_ALIGN,
-				 vdev, info->queue, vp_notify, callback, name);
+	vq = vring_new_virtqueue(info->num, VIRTIO_PCI_VRING_ALIGN, vdev,
+				 true, info->queue, vp_notify, callback, name);
 	if (!vq) {
 		err = -ENOMEM;
 		goto out_activate_queue;
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -28,17 +28,20 @@
 #ifdef CONFIG_SMP
 /* Where possible, use SMP barriers which are more lightweight than mandatory
  * barriers, because mandatory barriers control MMIO effects on accesses
- * through relaxed memory I/O windows (which virtio does not use). */
-#define virtio_mb() smp_mb()
-#define virtio_rmb() smp_rmb()
-#define virtio_wmb() smp_wmb()
+ * through relaxed memory I/O windows (which virtio-pci does not use). */
+#define virtio_mb(vq) \
+	do { if ((vq)->weak_barriers) smp_mb(); else mb(); } while(0)
+#define virtio_rmb(vq) \
+	do { if ((vq)->weak_barriers) smp_rmb(); else rmb(); } while(0)
+#define virtio_wmb(vq) \
+	do { if ((vq)->weak_barriers) smp_rmb(); else rmb(); } while(0)
 #else
 /* We must force memory ordering even if guest is UP since host could be
  * running on another CPU, but SMP barriers are defined to barrier() in that
  * configuration. So fall back to mandatory barriers instead. */
-#define virtio_mb() mb()
-#define virtio_rmb() rmb()
-#define virtio_wmb() wmb()
+#define virtio_mb(vq) mb()
+#define virtio_rmb(vq) rmb()
+#define virtio_wmb(vq) wmb()
 #endif
 
 #ifdef DEBUG
@@ -77,6 +80,9 @@ struct vring_virtqueue
 	/* Actual memory layout for this queue */
 	struct vring vring;
 
+	/* Can we use weak barriers? */
+	bool weak_barriers;
+
 	/* Other side has made a mess, don't try any more. */
 	bool broken;
 
@@ -245,14 +251,14 @@ void virtqueue_kick(struct virtqueue *_v
 	START_USE(vq);
 	/* Descriptors and available array need to be set before we expose the
 	 * new available array entries. */
-	virtio_wmb();
+	virtio_wmb(vq);
 
 	old = vq->vring.avail->idx;
 	new = vq->vring.avail->idx = old + vq->num_added;
 	vq->num_added = 0;
 
 	/* Need to update avail index before checking if we should notify */
-	virtio_mb();
+	virtio_mb(vq);
 
 	if (vq->event ?
 	    vring_need_event(vring_avail_event(&vq->vring), new, old) :
@@ -314,7 +320,7 @@ void *virtqueue_get_buf(struct virtqueue
 	}
 
 	/* Only get used array entries after they have been exposed by host. */
-	virtio_rmb();
+	virtio_rmb(vq);
 
 	i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id;
 	*len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len;
@@ -337,7 +343,7 @@ void *virtqueue_get_buf(struct virtqueue
 	 * the read in the next get_buf call. */
 	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
 		vring_used_event(&vq->vring) = vq->last_used_idx;
-		virtio_mb();
+		virtio_mb(vq);
 	}
 
 	END_USE(vq);
@@ -366,7 +372,7 @@ bool virtqueue_enable_cb(struct virtqueu
 	 * entry. Always do both to keep code simple. */
 	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
 	vring_used_event(&vq->vring) = vq->last_used_idx;
-	virtio_mb();
+	virtio_mb(vq);
 	if (unlikely(more_used(vq))) {
 		END_USE(vq);
 		return false;
@@ -393,7 +399,7 @@ bool virtqueue_enable_cb_delayed(struct 
 	/* TODO: tune this threshold */
 	bufs = (u16)(vq->vring.avail->idx - vq->last_used_idx) * 3 / 4;
 	vring_used_event(&vq->vring) = vq->last_used_idx + bufs;
-	virtio_mb();
+	virtio_mb(vq);
 	if (unlikely((u16)(vq->vring.used->idx - vq->last_used_idx) > bufs)) {
 		END_USE(vq);
 		return false;
@@ -453,6 +459,7 @@ EXPORT_SYMBOL_GPL(vring_interrupt);
 struct virtqueue *vring_new_virtqueue(unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
+				      bool weak_barriers,
 				      void *pages,
 				      void (*notify)(struct virtqueue *),
 				      void (*callback)(struct virtqueue *),
@@ -476,6 +483,7 @@ struct virtqueue *vring_new_virtqueue(un
 	vq->vq.vdev = vdev;
 	vq->vq.name = name;
 	vq->notify = notify;
+	vq->weak_barriers = weak_barriers;
 	vq->broken = false;
 	vq->last_used_idx = 0;
 	vq->num_added = 0;
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -168,6 +168,7 @@ struct virtqueue;
 struct virtqueue *vring_new_virtqueue(unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
+				      bool weak_barriers,
 				      void *pages,
 				      void (*notify)(struct virtqueue *vq),
 				      void (*callback)(struct virtqueue *vq),
diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h
--- a/tools/virtio/linux/virtio.h
+++ b/tools/virtio/linux/virtio.h
@@ -214,6 +214,7 @@ void *virtqueue_detach_unused_buf(struct
 struct virtqueue *vring_new_virtqueue(unsigned int num,
 				      unsigned int vring_align,
 				      struct virtio_device *vdev,
+				      bool weak_barriers,
 				      void *pages,
 				      void (*notify)(struct virtqueue *vq),
 				      void (*callback)(struct virtqueue *vq),
diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -92,7 +92,8 @@ static void vq_info_add(struct vdev_info
 	assert(r >= 0);
 	memset(info->ring, 0, vring_size(num, 4096));
 	vring_init(&info->vring, num, info->ring, 4096);
-	info->vq = vring_new_virtqueue(info->vring.num, 4096, &dev->vdev, info->ring,
+	info->vq = vring_new_virtqueue(info->vring.num, 4096, &dev->vdev,
+				       true, info->ring,
 				       vq_notify, vq_callback, "test");
 	assert(info->vq);
 	info->vq->priv = info;


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio
  2011-12-29  4:31       ` Rusty Russell
@ 2011-12-29  6:53         ` Ohad Ben-Cohen
  0 siblings, 0 replies; 7+ messages in thread
From: Ohad Ben-Cohen @ 2011-12-29  6:53 UTC (permalink / raw)
  To: Rusty Russell; +Cc: linux-kernel, virtualization, linux-arm

On Thu, Dec 29, 2011 at 6:31 AM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> OK, I've applied this patch; you might want to carry it in your tree
> too.  You'll need to fix up your call to vring_new_virtqueue()....

Will do. Thanks !

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-12-29  6:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-23  5:16 [RESENDx2] [PULL] virtio: fix barriers for virtio-mmio Rusty Russell
2011-12-23  9:31 ` Linus Torvalds
2011-12-23  9:35   ` Linus Torvalds
2011-12-23 11:35     ` Ohad Ben-Cohen
2011-12-24  3:01       ` Rusty Russell
2011-12-29  4:31       ` Rusty Russell
2011-12-29  6:53         ` Ohad Ben-Cohen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox