From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	Wanlong Gao <gaowanlong@cn.fujitsu.com>,
	asias@redhat.com, Rusty Russell <rusty@rustcorp.com.au>,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 1/9] virtio: add functions for piecewise addition of buffers
Date: Tue, 12 Feb 2013 16:56:20 +0200
Message-ID: <20130212145620.GA3392@redhat.com>
In-Reply-To: <1360671815-2135-2-git-send-email-pbonzini@redhat.com>

On Tue, Feb 12, 2013 at 01:23:27PM +0100, Paolo Bonzini wrote:
> virtio device drivers translate requests from higher layers in two steps:
> a device-specific step in the device driver, and generic preparation
> of virtio direct or indirect buffers in virtqueue_add_buf.  Because
> virtqueue_add_buf also accepts the outcome of the first step as a single
> struct scatterlist, drivers may need to put additional items at the
> front or back of the data scatterlists before handing it to virtqueue_add_buf.
> Because of this, virtio-scsi has to copy each request into a scatterlist
> internal to the driver.  It cannot just use the one that was prepared
> by the upper SCSI layers. 
> 
> On top of this, virtqueue_add_buf also has the limitation of not
> supporting chained scatterlists: the buffers must be provided as an
> array of struct scatterlist.  Chained scatterlists, though not supported
> on all architectures, would help for virtio-scsi where all additional
> items are placed at the front.
> 
> This patch adds a different set of APIs for adding a buffer to a virtqueue.
> The new API lets you pass the buffers piecewise, wrapping multiple calls
> to virtqueue_add_sg between virtqueue_start_buf and virtqueue_end_buf.
> virtio-scsi can then call virtqueue_add_sg 3/4 times: for the request
> header, for the write buffer (if present), for the response header, and
> finally for the read buffer (again if present).  It saves the copying
> and the related locking.
> 
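To make the intended call sequence concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: the toy_* names, the fixed ring size and the buffer lengths are all made up for illustration, and the descriptors are a plain array rather than a real vring. It only shows the shape of the API: one start, several adds with a direction each, one end that terminates the chain.

```c
/* Userspace toy model of the start/add/end sequence; not kernel code. */
#include <assert.h>
#include <stddef.h>

#define TOY_F_NEXT  1u	/* stands in for VRING_DESC_F_NEXT */
#define TOY_F_WRITE 2u	/* stands in for VRING_DESC_F_WRITE */
#define TOY_RING    8	/* arbitrary capacity for the sketch */

struct toy_desc { unsigned int len, flags; };

struct toy_vq {
	struct toy_desc desc[TOY_RING];
	unsigned int used;	/* descriptors filled so far */
	unsigned int limit;	/* total promised to toy_start_buf */
	struct toy_desc *tail;	/* last descriptor written, NULL until an add */
	int building;		/* nonzero between start and end */
};

static int toy_start_buf(struct toy_vq *vq, unsigned int nents)
{
	if (nents == 0 || nents > TOY_RING)
		return -1;	/* the real function reports -ENOSPC */
	vq->used = 0;
	vq->limit = nents;
	vq->tail = NULL;
	vq->building = 1;
	return 0;
}

/* Append n pieces; device_writes selects the "in" direction. */
static void toy_add_sg(struct toy_vq *vq, const unsigned int *lens,
		       unsigned int n, int device_writes)
{
	unsigned int i;

	assert(vq->building && n > 0 && vq->used + n <= vq->limit);
	for (i = 0; i < n; i++) {
		vq->tail = &vq->desc[vq->used++];
		vq->tail->len = lens[i];
		vq->tail->flags = TOY_F_NEXT |
				  (device_writes ? TOY_F_WRITE : 0u);
	}
}

static void toy_end_buf(struct toy_vq *vq)
{
	assert(vq->building && vq->tail != NULL);
	vq->tail->flags &= ~TOY_F_NEXT;	/* last piece ends the chain */
	vq->building = 0;
}

/* virtio-scsi-like request: header, write data, response, read data. */
static unsigned int toy_scsi_request(struct toy_vq *vq)
{
	const unsigned int req_hdr[] = { 64 }, wr[] = { 4096, 4096 };
	const unsigned int resp_hdr[] = { 128 }, rd[] = { 512 };

	if (toy_start_buf(vq, 5) < 0)
		return 0;
	toy_add_sg(vq, req_hdr, 1, 0);
	toy_add_sg(vq, wr, 2, 0);
	toy_add_sg(vq, resp_hdr, 1, 1);
	toy_add_sg(vq, rd, 1, 1);
	toy_end_buf(vq);
	return vq->used;
}
```

Note that the caller has to know the total number of pieces (5 here) before the first add, which is the part of the interface discussed further down.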
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  drivers/virtio/virtio_ring.c |  211 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/virtio.h       |   14 +++
>  2 files changed, 225 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index ffd7e7d..64184e5 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -101,6 +101,10 @@ struct vring_virtqueue
>  	/* Last used index we've seen. */
>  	u16 last_used_idx;
>  
> +	/* State between virtqueue_start_buf and virtqueue_end_buf.  */
> +	int head;
> +	struct vring_desc *indirect_base, *tail;
> +
>  	/* How to notify other side. FIXME: commonalize hcalls! */
>  	void (*notify)(struct virtqueue *vq);
>  
> @@ -394,6 +398,213 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
>  	vq->vq.num_free++;
>  }
>  
> +/**
> + * virtqueue_start_buf - start building buffer for the other end
> + * @_vq: the struct virtqueue we're talking about.
> + * @data: the token identifying the buffer.
> + * @nents: the number of buffers that will be added

This function starts building a single buffer, so describing @nents
as a number of buffers is a bit weird here.

> + * @nsg: the number of sg lists that will be added

Does this mean the number of calls to add_sg? Not sure why this matters.
How about we pass in_num/out_num instead, that is, the total # of sg
entries, same as add_buf?
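
Roughly what passing in_num/out_num could look like, as a userspace sketch rather than a proposal for the actual code. The toy_* names are invented, and the "go indirect when there is more than one entry and at least one free slot" rule is an assumption modelled on what add_buf does today; the threshold could well be different.

```c
/* Userspace sketch: choose a layout from out_num/in_num totals. */
#include <assert.h>
#include <stdbool.h>

enum toy_layout { TOY_NO_SPACE, TOY_DIRECT, TOY_INDIRECT };

static enum toy_layout toy_pick_layout(bool host_indirect,
				       unsigned int num_free,
				       unsigned int out_num,
				       unsigned int in_num)
{
	unsigned int total = out_num + in_num;

	assert(total > 0);
	/* An indirect table costs a single ring slot, whatever the total. */
	if (host_indirect && total > 1 && num_free > 0)
		return TOY_INDIRECT;
	/* Direct descriptors need one ring slot per sg entry. */
	if (num_free < total)
		return TOY_NO_SPACE;
	return TOY_DIRECT;
}
```

With this shape the caller never has to say how many add_sg calls will follow, only how many entries there are in each direction.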


> + * @gfp: how to do memory allocations (if necessary).
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted), and that a successful call is
> + * followed by one or more calls to virtqueue_add_sg, and finally a call
> + * to virtqueue_end_buf.
> + *
> + * Returns zero or a negative error (e.g. -ENOSPC).
> + */
> +int virtqueue_start_buf(struct virtqueue *_vq,
> +			void *data,
> +			unsigned int nents,
> +			unsigned int nsg,
> +			gfp_t gfp)
> +{
> +	struct vring_virtqueue *vq = to_vvq(_vq);
> +	struct vring_desc *desc = NULL;
> +	int head;
> +	int ret = -ENOMEM;
> +
> +	START_USE(vq);
> +
> +	BUG_ON(data == NULL);
> +
> +#ifdef DEBUG
> +	{
> +		ktime_t now = ktime_get();
> +
> +		/* No kick or get, with .1 second between?  Warn. */
> +		if (vq->last_add_time_valid)
> +			WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
> +					    > 100);
> +		vq->last_add_time = now;
> +		vq->last_add_time_valid = true;
> +	}
> +#endif
> +
> +	BUG_ON(nents < nsg);
> +	BUG_ON(nsg == 0);
> +
> +	/*
> +	 * If the host supports indirect descriptor tables, and there is
> +	 * no space for direct buffers or there are multi-item scatterlists,
> +	 * go indirect.
> +	 */
> +	head = vq->free_head;
> +	if (vq->indirect && (nents > nsg || vq->vq.num_free < nents)) {
> +		if (vq->vq.num_free == 0)
> +			goto no_space;
> +
> +		desc = kmalloc(nents * sizeof(struct vring_desc), gfp);
> +		if (!desc)
> +			goto error;
> +
> +		/* We're about to use a buffer */
> +		vq->vq.num_free--;
> +
> +		/* Use a single buffer which doesn't continue */
> +		vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
> +		vq->vring.desc[head].addr = virt_to_phys(desc);
> +		vq->vring.desc[head].len = nents * sizeof(struct vring_desc);
> +
> +		/* Update free pointer */
> +		vq->free_head = vq->vring.desc[head].next;
> +	}
> +
> +	/* Set token. */
> +	vq->data[head] = data;
> +
> +	pr_debug("Started buffer head %i for %p\n", head, vq);
> +
> +	vq->indirect_base = desc;
> +	vq->tail = NULL;
> +	vq->head = head;
> +	return 0;
> +
> +no_space:
> +	ret = -ENOSPC;
> +error:
> +	pr_debug("Can't add buf (%d) - nents = %i, avail = %i\n",
> +		 ret, nents, vq->vq.num_free);
> +	END_USE(vq);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_start_buf);
> +
> +/**
> + * virtqueue_add_sg - add sglist to buffer being built
> + * @_vq: the virtqueue for which the buffer is being built
> + * @sgl: the description of the buffer(s).
> + * @nents: the number of items to process in sgl
> + * @dir: whether the sgl is read or written (DMA_TO_DEVICE/DMA_FROM_DEVICE only)
> + *
> + * Note that, unlike virtqueue_add_buf, this function follows chained
> + * scatterlists, and stops before the @nents-th item if a scatterlist item
> + * has a marker.
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted).

Hmm, so if you want to add both in and out, you need separate calls?
Wouldn't in_num/out_num be nicer?
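
As an illustration of the in_num/out_num alternative, a single call could derive the per-descriptor flags from the two counts instead of taking a direction argument. The following is a userspace sketch with made-up names (toy_fill_flags and the TOY_F_* constants are hypothetical stand-ins, not the real definitions):

```c
/* Userspace sketch: flags for out_num readable then in_num writable entries. */
#include <assert.h>

#define TOY_F_NEXT  1u	/* stands in for VRING_DESC_F_NEXT */
#define TOY_F_WRITE 2u	/* stands in for VRING_DESC_F_WRITE */

/* Fills flags[0..out_num+in_num-1] and returns the number of entries. */
static unsigned int toy_fill_flags(unsigned int *flags,
				   unsigned int out_num,
				   unsigned int in_num)
{
	unsigned int i, total = out_num + in_num;

	assert(total > 0);
	for (i = 0; i < total; i++) {
		flags[i] = TOY_F_NEXT;
		if (i >= out_num)	/* device-writable ("in") part */
			flags[i] |= TOY_F_WRITE;
	}
	flags[total - 1] &= ~TOY_F_NEXT;	/* terminate the chain */
	return total;
}
```

The trade-off is that this assumes all out entries come before all in entries in one list, which is what add_buf requires anyway.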


> + */
> +void virtqueue_add_sg(struct virtqueue *_vq,
> +		      struct scatterlist sgl[],
> +		      unsigned int nents,
> +		      enum dma_data_direction dir)
> +{
> +	struct vring_virtqueue *vq = to_vvq(_vq);
> +	unsigned int i, n;
> +	struct scatterlist *sg;
> +	struct vring_desc *tail;
> +	u32 flags;
> +
> +#ifdef DEBUG
> +	BUG_ON(!vq->in_use);
> +#endif
> +
> +	BUG_ON(dir != DMA_FROM_DEVICE && dir != DMA_TO_DEVICE);
> +	BUG_ON(nents == 0);
> +
> +	flags = dir == DMA_FROM_DEVICE ? VRING_DESC_F_WRITE : 0;
> +	flags |= VRING_DESC_F_NEXT;
> +
> +	/*
> +	 * If using indirect descriptor tables, fill in the buffers
> +	 * at vq->indirect_base.
> +	 */
> +	if (vq->indirect_base) {
> +		i = 0;
> +		if (likely(vq->tail))
> +			i = vq->tail - vq->indirect_base + 1;
> +
> +		for_each_sg(sgl, sg, nents, n) {
> +			tail = &vq->indirect_base[i];
> +			tail->flags = flags;
> +			tail->addr = sg_phys(sg);
> +			tail->len = sg->length;
> +			tail->next = ++i;
> +		}
> +	} else {
> +		BUG_ON(vq->vq.num_free < nents);
> +
> +		i = vq->free_head;
> +		for_each_sg(sgl, sg, nents, n) {
> +			tail = &vq->vring.desc[i];
> +			tail->flags = flags;
> +			tail->addr = sg_phys(sg);
> +			tail->len = sg->length;
> +			i = tail->next;
> +			vq->vq.num_free--;
> +		}
> +
> +		vq->free_head = i;
> +	}
> +	vq->tail = tail;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_add_sg);
> +
> +/**
> + * virtqueue_end_buf - expose buffer to other end
> + * @_vq: the virtqueue for which the buffer was built
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted).
> + */
> +void virtqueue_end_buf(struct virtqueue *_vq)
> +{
> +	struct vring_virtqueue *vq = to_vvq(_vq);
> +	unsigned int avail;
> +	int head = vq->head;
> +	struct vring_desc *tail = vq->tail;
> +
> +#ifdef DEBUG
> +	BUG_ON(!vq->in_use);
> +#endif
> +	BUG_ON(tail == NULL);
> +
> +	/* The last one does not have the next flag set.  */
> +	tail->flags &= ~VRING_DESC_F_NEXT;
> +
> +	/*
> +	 * Put entry in available array.  Descriptors and available array
> +	 * need to be set before we expose the new available array entries.
> +	 */
> +	avail = vq->vring.avail->idx & (vq->vring.num-1);
> +	vq->vring.avail->ring[avail] = head;
> +	virtio_wmb(vq);
> +
> +	vq->vring.avail->idx++;
> +	vq->num_added++;
> +
> +	/*
> +	 * This is very unlikely, but theoretically possible.  Kick
> +	 * just in case.
> +	 */
> +	if (unlikely(vq->num_added == (1 << 16) - 1))
> +		virtqueue_kick(&vq->vq);
> +
> +	pr_debug("Added buffer head %i to %p\n", head, vq);
> +	END_USE(vq);
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_end_buf);
> +
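
For readers following the avail ring arithmetic in virtqueue_end_buf: the index is a free-running 16-bit counter and the slot is obtained by masking, which relies on the ring size being a power of two. A small userspace sketch of just that arithmetic (the toy_* helpers are hypothetical, written only to illustrate the two expressions in the patch):

```c
/* Userspace sketch of the avail slot and kick-threshold arithmetic. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Slot for a free-running 16-bit index; num must be a power of two. */
static uint16_t toy_avail_slot(uint16_t idx, uint16_t num)
{
	assert(num != 0 && (num & (num - 1)) == 0);
	return idx & (uint16_t)(num - 1);
}

/* Mirrors the "num_added == (1 << 16) - 1" check before the forced kick. */
static bool toy_must_kick(unsigned int num_added)
{
	return num_added == (1u << 16) - 1;
}
```

The forced kick keeps the count of buffers added since the last kick from reaching 2^16, so it stays distinguishable in 16-bit index arithmetic.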
>  static inline bool more_used(const struct vring_virtqueue *vq)
>  {
>  	return vq->last_used_idx != vq->vring.used->idx;
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index cf8adb1..43d6bc3 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -7,6 +7,7 @@
>  #include <linux/spinlock.h>
>  #include <linux/device.h>
>  #include <linux/mod_devicetable.h>
> +#include <linux/dma-direction.h>
>  #include <linux/gfp.h>
>  
>  /**
> @@ -40,6 +41,19 @@ int virtqueue_add_buf(struct virtqueue *vq,
>  		      void *data,
>  		      gfp_t gfp);
>  
> +int virtqueue_start_buf(struct virtqueue *_vq,
> +			void *data,
> +			unsigned int nents,
> +			unsigned int nsg,
> +			gfp_t gfp);
> +
> +void virtqueue_add_sg(struct virtqueue *_vq,
> +		      struct scatterlist sgl[],
> +		      unsigned int nents,
> +		      enum dma_data_direction dir);
> +
> +void virtqueue_end_buf(struct virtqueue *_vq);
> +
>  void virtqueue_kick(struct virtqueue *vq);
>  
>  bool virtqueue_kick_prepare(struct virtqueue *vq);
> -- 
> 1.7.1
> 

Thread overview: 35+ messages
2013-02-12 12:23 [PATCH 0/9] virtio: new API for addition of buffers, scatterlist changes Paolo Bonzini
2013-02-12 12:23 ` [PATCH 1/9] virtio: add functions for piecewise addition of buffers Paolo Bonzini
2013-02-12 14:56   ` Michael S. Tsirkin [this message]
2013-02-12 15:32     ` Paolo Bonzini
2013-02-12 15:43       ` Michael S. Tsirkin
2013-02-12 15:48         ` Paolo Bonzini
2013-02-12 16:13           ` Michael S. Tsirkin
2013-02-12 16:17             ` Paolo Bonzini
2013-02-12 16:35               ` Michael S. Tsirkin
2013-02-12 16:57                 ` Paolo Bonzini
2013-02-12 17:34                   ` Michael S. Tsirkin
2013-02-12 18:04                     ` Paolo Bonzini
2013-02-12 18:23                       ` Michael S. Tsirkin
2013-02-12 20:08                         ` Paolo Bonzini
2013-02-12 20:49                           ` Michael S. Tsirkin
2013-02-13  8:06                             ` Paolo Bonzini
2013-02-13 10:33                               ` Michael S. Tsirkin
2013-02-12 18:03   ` [PATCH v2 " Paolo Bonzini
2013-02-12 12:23 ` [PATCH 2/9] virtio-blk: reorganize virtblk_add_req Paolo Bonzini
2013-02-17  6:38   ` Asias He
2013-02-12 12:23 ` [PATCH 3/9] virtio-blk: use virtqueue_start_buf on bio path Paolo Bonzini
2013-02-17  6:39   ` Asias He
2013-02-12 12:23 ` [PATCH 4/9] virtio-blk: use virtqueue_start_buf on req path Paolo Bonzini
2013-02-17  6:37   ` Asias He
2013-02-18  9:05     ` Paolo Bonzini
2013-02-12 12:23 ` [PATCH 5/9] scatterlist: introduce sg_unmark_end Paolo Bonzini
2013-02-12 12:23 ` [PATCH 6/9] virtio-net: unmark scatterlist ending after virtqueue_add_buf Paolo Bonzini
2013-02-12 12:23 ` [PATCH 7/9] virtio-scsi: use virtqueue_start_buf Paolo Bonzini
2013-02-12 12:23 ` [PATCH 8/9] virtio: introduce and use virtqueue_add_buf_single Paolo Bonzini
2013-02-12 12:23 ` [PATCH 9/9] virtio: reimplement virtqueue_add_buf using new functions Paolo Bonzini
2013-02-14  6:00 ` [PATCH 0/9] virtio: new API for addition of buffers, scatterlist changes Rusty Russell
2013-02-14  9:23   ` Paolo Bonzini
2013-02-15 18:04     ` Paolo Bonzini
2013-02-19  7:49     ` Rusty Russell
2013-02-19  9:11       ` Paolo Bonzini
