linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: mst@redhat.com (Michael S. Tsirkin)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC] virtio: use mandatory barriers for remote processor vdevs
Date: Tue, 29 Nov 2011 15:11:10 +0200	[thread overview]
Message-ID: <20111129131110.GC19157@redhat.com> (raw)
In-Reply-To: <1322569886-13055-1-git-send-email-ohad@wizery.com>

On Tue, Nov 29, 2011 at 02:31:26PM +0200, Ohad Ben-Cohen wrote:
> Virtio is using memory barriers to control the ordering of
> references to the vrings on SMP systems. When the guest is compiled
> with SMP support, virtio is only using SMP barriers in order to
> avoid incurring the overhead involved with mandatory barriers.
> 
> Lately, though, virtio is being increasingly used with inter-processor
> communication scenarios too, which involve running two (separate)
> instances of operating systems on two (separate) processors, each of
> which might either be UP or SMP.

Is that using virtio-mmio? If yes, would the extra serialization belongs
at that layer?

> To control the ordering of memory references when the vrings are shared
> between two external processors, we must always use mandatory barriers.

Sorry, could you pls explain what are 'two external processors'?
I think I know that if two x86 CPUs in an SMP system run kernels built
in an SMP configuration, smp_*mb barriers are enough.

Documentation/memory-barriers.txt says:
	Mandatory barriers should not be used to control SMP effects ...
	They may, however, be used to control MMIO effects on accesses through
	relaxed memory I/O windows.

We don't control MMIO/relaxed memory I/O windows here, so what exactly
is the issue?

Could you please give an example of a setup that is currently broken?

> 
> A trivial, albeit sub-optimal, solution would be to simply revert
> commit d57ed95 "virtio: use smp_XX barriers on SMP". Obviously, though,
> that's going to have a negative impact on performance of SMP-based
> virtualization use cases.
> 
> A different approach, as demonstrated by this patch, would pick the type
> of memory barriers, in run time, according to the requirements of the
> virtio device. This way, both SMP virtualization scenarios and inter-
> processor communication use cases would run correctly, without making
> any performance compromises (except for those incurred by an additional
> branch or level of indirection).

Is an extra branch faster or slower than reverting d57ed95?

> 
> This patch introduces VIRTIO_RING_F_REMOTEPROC, a new virtio transport
> feature, which should be used by virtio devices that run on remote
> processors. The CONFIG_SMP variant of virtio_{mb, rmb, wmb} is then changed
> to use SMP barriers only if VIRTIO_RING_F_REMOTEPROC was absent.

One wonders how the remote side knows enough to set this flag?

> 
> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
> ---
> Alternatively, we can also introduce some kind of virtio_mb_ops and set it
> according to the nature of the vdev with handlers that just do the right
> thing, instead of introducting that branch.
> 
> Though I also wonder how big really is the perforamnce gain of d57ed95 ?

Want to check and tell us?

>  drivers/virtio/virtio_ring.c |   78 +++++++++++++++++++++++++++++-------------
>  include/linux/virtio_ring.h  |    6 +++
>  2 files changed, 60 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index c7a2c20..cf66a2d 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -23,24 +23,6 @@
>  #include <linux/slab.h>
>  #include <linux/module.h>
>  
> -/* virtio guest is communicating with a virtual "device" that actually runs on
> - * a host processor.  Memory barriers are used to control SMP effects. */
> -#ifdef CONFIG_SMP
> -/* Where possible, use SMP barriers which are more lightweight than mandatory
> - * barriers, because mandatory barriers control MMIO effects on accesses
> - * through relaxed memory I/O windows (which virtio does not use). */
> -#define virtio_mb() smp_mb()
> -#define virtio_rmb() smp_rmb()
> -#define virtio_wmb() smp_wmb()
> -#else
> -/* We must force memory ordering even if guest is UP since host could be
> - * running on another CPU, but SMP barriers are defined to barrier() in that
> - * configuration. So fall back to mandatory barriers instead. */
> -#define virtio_mb() mb()
> -#define virtio_rmb() rmb()
> -#define virtio_wmb() wmb()
> -#endif
> -
>  #ifdef DEBUG
>  /* For development, we want to crash whenever the ring is screwed. */
>  #define BAD_RING(_vq, fmt, args...)				\
> @@ -86,6 +68,9 @@ struct vring_virtqueue
>  	/* Host publishes avail event idx */
>  	bool event;
>  
> +	/* Host runs on a remote processor */
> +	bool rproc;
> +
>  	/* Number of free buffers */
>  	unsigned int num_free;
>  	/* Head of free buffer list. */
> @@ -108,6 +93,48 @@ struct vring_virtqueue
>  	void *data[];
>  };
>  
> +/*
> + * virtio guest is communicating with a virtual "device" that may either run
> + * on the host processor, or on an external processor. The former requires
> + * memory barriers in order to control SMP effects, but the latter must
> + * use mandatory barriers.
> + */
> +#ifdef CONFIG_SMP
> +/* Where possible, use SMP barriers which are more lightweight than mandatory
> + * barriers, because mandatory barriers control MMIO effects on accesses
> + * through relaxed memory I/O windows. */
> +static inline void virtio_mb(struct vring_virtqueue *vq)
> +{
> +	if (vq->rproc)
> +		mb();
> +	else
> +		smp_mb();
> +}
> +
> +static inline void virtio_rmb(struct vring_virtqueue *vq)
> +{
> +	if (vq->rproc)
> +		rmb();
> +	else
> +		smp_rmb();
> +}
> +
> +static inline void virtio_wmb(struct vring_virtqueue *vq)
> +{
> +	if (vq->rproc)
> +		wmb();
> +	else
> +		smp_wmb();
> +}
> +#else
> +/* We must force memory ordering even if guest is UP since host could be
> + * running on another CPU, but SMP barriers are defined to barrier() in that
> + * configuration. So fall back to mandatory barriers instead. */
> +static inline void virtio_mb(struct vring_virtqueue *vq) { mb(); }
> +static inline void virtio_rmb(struct vring_virtqueue *vq) { rmb(); }
> +static inline void virtio_wmb(struct vring_virtqueue *vq) { wmb(); }
> +#endif
> +
>  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
>  
>  /* Set up an indirect table of descriptors and add it to the queue. */
> @@ -245,14 +272,14 @@ void virtqueue_kick(struct virtqueue *_vq)
>  	START_USE(vq);
>  	/* Descriptors and available array need to be set before we expose the
>  	 * new available array entries. */
> -	virtio_wmb();
> +	virtio_wmb(vq);
>  
>  	old = vq->vring.avail->idx;
>  	new = vq->vring.avail->idx = old + vq->num_added;
>  	vq->num_added = 0;
>  
>  	/* Need to update avail index before checking if we should notify */
> -	virtio_mb();
> +	virtio_mb(vq);
>  
>  	if (vq->event ?
>  	    vring_need_event(vring_avail_event(&vq->vring), new, old) :
> @@ -314,7 +341,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
>  	}
>  
>  	/* Only get used array entries after they have been exposed by host. */
> -	virtio_rmb();
> +	virtio_rmb(vq);
>  
>  	i = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].id;
>  	*len = vq->vring.used->ring[vq->last_used_idx%vq->vring.num].len;
> @@ -337,7 +364,7 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
>  	 * the read in the next get_buf call. */
>  	if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
>  		vring_used_event(&vq->vring) = vq->last_used_idx;
> -		virtio_mb();
> +		virtio_mb(vq);
>  	}
>  
>  	END_USE(vq);
> @@ -366,7 +393,7 @@ bool virtqueue_enable_cb(struct virtqueue *_vq)
>  	 * entry. Always do both to keep code simple. */
>  	vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
>  	vring_used_event(&vq->vring) = vq->last_used_idx;
> -	virtio_mb();
> +	virtio_mb(vq);
>  	if (unlikely(more_used(vq))) {
>  		END_USE(vq);
>  		return false;
> @@ -393,7 +420,7 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
>  	/* TODO: tune this threshold */
>  	bufs = (u16)(vq->vring.avail->idx - vq->last_used_idx) * 3 / 4;
>  	vring_used_event(&vq->vring) = vq->last_used_idx + bufs;
> -	virtio_mb();
> +	virtio_mb(vq);
>  	if (unlikely((u16)(vq->vring.used->idx - vq->last_used_idx) > bufs)) {
>  		END_USE(vq);
>  		return false;
> @@ -486,6 +513,7 @@ struct virtqueue *vring_new_virtqueue(unsigned int num,
>  
>  	vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC);
>  	vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
> +	vq->rproc = virtio_has_feature(vdev, VIRTIO_RING_F_REMOTEPROC);
>  
>  	/* No callback?  Tell other side not to bother us. */
>  	if (!callback)
> @@ -522,6 +550,8 @@ void vring_transport_features(struct virtio_device *vdev)
>  			break;
>  		case VIRTIO_RING_F_EVENT_IDX:
>  			break;
> +		case VIRTIO_RING_F_REMOTEPROC:
> +			break;
>  		default:
>  			/* We don't understand this bit. */
>  			clear_bit(i, vdev->features);
> diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
> index 36be0f6..9839593 100644
> --- a/include/linux/virtio_ring.h
> +++ b/include/linux/virtio_ring.h
> @@ -58,6 +58,12 @@
>   * at the end of the used ring. Guest should ignore the used->flags field. */
>  #define VIRTIO_RING_F_EVENT_IDX		29
>  
> +/*
> + * The device we're talking to resides on a remote processor, so we must always
> + * use mandatory memory barriers.
> + */
> +#define VIRTIO_RING_F_REMOTEPROC	30
> +
>  /* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
>  struct vring_desc {
>  	/* Address (guest-physical). */
> -- 
> 1.7.5.4

  reply	other threads:[~2011-11-29 13:11 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-29 12:31 [RFC] virtio: use mandatory barriers for remote processor vdevs Ohad Ben-Cohen
2011-11-29 13:11 ` Michael S. Tsirkin [this message]
2011-11-29 13:57   ` Ohad Ben-Cohen
2011-11-29 15:16     ` Michael S. Tsirkin
2011-11-30 11:45       ` Ohad Ben-Cohen
2011-11-30 14:59         ` Michael S. Tsirkin
2011-11-30 16:04           ` Ohad Ben-Cohen
2011-11-30 16:15             ` Michael S. Tsirkin
2011-11-30 16:24               ` Ohad Ben-Cohen
2011-11-30 23:27                 ` Ohad Ben-Cohen
2011-11-30 23:43                   ` Michael S. Tsirkin
2011-12-01  6:20                     ` Ohad Ben-Cohen
2011-11-29 15:19     ` Michael S. Tsirkin
2011-11-30 11:55       ` Ohad Ben-Cohen
2011-11-30 14:50         ` Michael S. Tsirkin
2011-11-30 22:43           ` Ohad Ben-Cohen
2011-11-30 23:13             ` Michael S. Tsirkin
2011-12-01  2:28               ` Rusty Russell
2011-12-01  7:15                 ` Ohad Ben-Cohen
2011-12-01  8:12                 ` Michael S. Tsirkin
2011-12-02  0:26                   ` Rusty Russell
2011-12-01  6:14               ` Ohad Ben-Cohen
2011-12-01  9:09                 ` Michael S. Tsirkin
2011-12-02 23:09 ` Benjamin Herrenschmidt
2011-12-03  5:14   ` Rusty Russell
2011-12-11 12:25     ` Michael S. Tsirkin
2011-12-11 22:27       ` Benjamin Herrenschmidt
2011-12-12  3:06         ` Amos Kong
2011-12-12  5:12           ` Rusty Russell
2011-12-12 23:56             ` Amos Kong
2011-12-19  2:35               ` Rusty Russell
2011-12-19  2:19             ` Amos Kong
2011-12-19  2:41               ` Benjamin Herrenschmidt
2011-12-19  7:21                 ` Amos Kong
2011-12-19  2:50               ` Amos Kong
2011-12-19  8:37                 ` Rusty Russell
2011-12-03  6:01   ` Ohad Ben-Cohen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111129131110.GC19157@redhat.com \
    --to=mst@redhat.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).