From: "Michael S. Tsirkin" <mst@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org,
Wanlong Gao <gaowanlong@cn.fujitsu.com>,
asias@redhat.com, Rusty Russell <rusty@rustcorp.com.au>,
kvm@vger.kernel.org, virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 1/9] virtio: add functions for piecewise addition of buffers
Date: Tue, 12 Feb 2013 16:56:20 +0200
Message-ID: <20130212145620.GA3392@redhat.com>
In-Reply-To: <1360671815-2135-2-git-send-email-pbonzini@redhat.com>

On Tue, Feb 12, 2013 at 01:23:27PM +0100, Paolo Bonzini wrote:
> virtio device drivers translate requests from higher layer in two steps:
> a device-specific step in the device driver, and generic preparation
> of virtio direct or indirect buffers in virtqueue_add_buf. Because
> virtqueue_add_buf also accepts the outcome of the first step as a single
> struct scatterlist, drivers may need to put additional items at the
> front or back of the data scatterlists before handing it to virtqueue_add_buf.
> Because of this, virtio-scsi has to copy each request into a scatterlist
> internal to the driver. It cannot just use the one that was prepared
> by the upper SCSI layers.
>
> On top of this, virtqueue_add_buf also has the limitation of not
> supporting chained scatterlists: the buffers must be provided as an
> array of struct scatterlist. Chained scatterlist, though not supported
> on all architectures, would help for virtio-scsi where all additional
> items are placed at the front.
>
> This patch adds a different set of APIs for adding a buffer to a virtqueue.
> The new API lets you pass the buffers piecewise, wrapping multiple calls
> to virtqueue_add_sg between virtqueue_start_buf and virtqueue_end_buf.
> virtio-scsi can then call virtqueue_add_sg 3/4 times: for the request
> header, for the write buffer (if present), for the response header, and
> finally for the read buffer (again if present). It saves the copying
> and the related locking.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> drivers/virtio/virtio_ring.c | 211 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/virtio.h | 14 +++
> 2 files changed, 225 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index ffd7e7d..64184e5 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -101,6 +101,10 @@ struct vring_virtqueue
> /* Last used index we've seen. */
> u16 last_used_idx;
>
> + /* State between virtqueue_start_buf and virtqueue_end_buf. */
> + int head;
> + struct vring_desc *indirect_base, *tail;
> +
> /* How to notify other side. FIXME: commonalize hcalls! */
> void (*notify)(struct virtqueue *vq);
>
> @@ -394,6 +398,213 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
> vq->vq.num_free++;
> }
>
> +/**
> + * virtqueue_start_buf - start building buffer for the other end
> + * @vq: the struct virtqueue we're talking about.
> + * @data: the token identifying the buffer.
> + * @nents: the number of buffers that will be added
This function starts building one buffer, so "the number of buffers"
is a bit weird here.
> + * @nsg: the number of sg lists that will be added
Does this mean the number of calls to virtqueue_add_sg? I'm not sure why
that matters. How about passing in in_num/out_num instead, i.e. the total
number of sg entries, same as virtqueue_add_buf?
> + * @gfp: how to do memory allocations (if necessary).
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted), and that a successful call is
> + * followed by one or more calls to virtqueue_add_sg, and finally a call
> + * to virtqueue_end_buf.
> + *
> + * Returns zero or a negative error (ie. ENOSPC).
> + */
> +int virtqueue_start_buf(struct virtqueue *_vq,
> + void *data,
> + unsigned int nents,
> + unsigned int nsg,
> + gfp_t gfp)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + struct vring_desc *desc = NULL;
> + int head;
> + int ret = -ENOMEM;
> +
> + START_USE(vq);
> +
> + BUG_ON(data == NULL);
> +
> +#ifdef DEBUG
> + {
> + ktime_t now = ktime_get();
> +
> + /* No kick or get, with .1 second between? Warn. */
> + if (vq->last_add_time_valid)
> + WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
> + > 100);
> + vq->last_add_time = now;
> + vq->last_add_time_valid = true;
> + }
> +#endif
> +
> + BUG_ON(nents < nsg);
> + BUG_ON(nsg == 0);
> +
> + /*
> + * If the host supports indirect descriptor tables, and there is
> + * no space for direct buffers or there are multi-item scatterlists,
> + * go indirect.
> + */
> + head = vq->free_head;
> + if (vq->indirect && (nents > nsg || vq->vq.num_free < nents)) {
> + if (vq->vq.num_free == 0)
> + goto no_space;
> +
> + desc = kmalloc(nents * sizeof(struct vring_desc), gfp);
> + if (!desc)
> + goto error;
> +
> + /* We're about to use a buffer */
> + vq->vq.num_free--;
> +
> + /* Use a single buffer which doesn't continue */
> + vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
> + vq->vring.desc[head].addr = virt_to_phys(desc);
> + vq->vring.desc[head].len = nents * sizeof(struct vring_desc);
> +
> + /* Update free pointer */
> + vq->free_head = vq->vring.desc[head].next;
> + }
> +
> + /* Set token. */
> + vq->data[head] = data;
> +
> + pr_debug("Started buffer head %i for %p\n", head, vq);
> +
> + vq->indirect_base = desc;
> + vq->tail = NULL;
> + vq->head = head;
> + return 0;
> +
> +no_space:
> + ret = -ENOSPC;
> +error:
> + pr_debug("Can't add buf (%d) - nents = %i, avail = %i\n",
> + ret, nents, vq->vq.num_free);
> + END_USE(vq);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_start_buf);
> +
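Coming back to @nsg: as far as I can tell from the function above, it is
only used in the two BUG_ONs and in the test that decides whether to go
indirect. A toy userspace restatement of that test, just to spell out my
reading (would_go_indirect and its parameter names are made up here, and
the num_free == 0 bail-out that follows the test is not modelled):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy restatement of the indirect test in virtqueue_start_buf; not kernel
 * code. have_indirect stands in for vq->indirect, num_free for
 * vq->vq.num_free.
 */
static bool would_go_indirect(bool have_indirect, unsigned int num_free,
			      unsigned int nents, unsigned int nsg)
{
	/* Same preconditions as the two BUG_ONs in the patch. */
	assert(nsg != 0 && nents >= nsg);

	/*
	 * nents > nsg: at least one of the nsg lists has more than one item.
	 * num_free < nents: the entries would not fit as direct descriptors.
	 */
	return have_indirect && (nents > nsg || num_free < nents);
}
```

So with indirect support and enough free descriptors, nsg is what
distinguishes "every list has a single item" (stay direct) from "some list
has several items" (go indirect).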
> +/**
> + * virtqueue_add_sg - add sglist to buffer being built
> + * @_vq: the virtqueue for which the buffer is being built
> + * @sgl: the description of the buffer(s).
> + * @nents: the number of items to process in sgl
> + * @dir: whether the sgl is read or written (DMA_TO_DEVICE/DMA_FROM_DEVICE only)
> + *
> + * Note that, unlike virtqueue_add_buf, this function follows chained
> + * scatterlists, and stops before the @nents-th item if a scatterlist item
> + * has a marker.
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted).
Hmm, so if you want to add both in and out buffers, you need separate calls?
Wouldn't in_num/out_num be nicer?
> + */
> +void virtqueue_add_sg(struct virtqueue *_vq,
> + struct scatterlist sgl[],
> + unsigned int nents,
> + enum dma_data_direction dir)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + unsigned int i, n;
> + struct scatterlist *sg;
> + struct vring_desc *tail;
> + u32 flags;
> +
> +#ifdef DEBUG
> + BUG_ON(!vq->in_use);
> +#endif
> +
> + BUG_ON(dir != DMA_FROM_DEVICE && dir != DMA_TO_DEVICE);
> + BUG_ON(nents == 0);
> +
> + flags = dir == DMA_FROM_DEVICE ? VRING_DESC_F_WRITE : 0;
> + flags |= VRING_DESC_F_NEXT;
> +
> + /*
> + * If using indirect descriptor tables, fill in the buffers
> + * at vq->indirect_base.
> + */
> + if (vq->indirect_base) {
> + i = 0;
> + if (likely(vq->tail))
> + i = vq->tail - vq->indirect_base + 1;
> +
> + for_each_sg(sgl, sg, nents, n) {
> + tail = &vq->indirect_base[i];
> + tail->flags = flags;
> + tail->addr = sg_phys(sg);
> + tail->len = sg->length;
> + tail->next = ++i;
> + }
> + } else {
> + BUG_ON(vq->vq.num_free < nents);
> +
> + i = vq->free_head;
> + for_each_sg(sgl, sg, nents, n) {
> + tail = &vq->vring.desc[i];
> + tail->flags = flags;
> + tail->addr = sg_phys(sg);
> + tail->len = sg->length;
> + i = tail->next;
> + vq->vq.num_free--;
> + }
> +
> + vq->free_head = i;
> + }
> + vq->tail = tail;
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_add_sg);
> +
> +/**
> + * virtqueue_end_buf - expose buffer to other end
> + * @_vq: the virtqueue for which the buffer was built
> + *
> + * Caller must ensure we don't call this with other virtqueue operations
> + * at the same time (except where noted).
> + */
> +void virtqueue_end_buf(struct virtqueue *_vq)
> +{
> + struct vring_virtqueue *vq = to_vvq(_vq);
> + unsigned int avail;
> + int head = vq->head;
> + struct vring_desc *tail = vq->tail;
> +
> +#ifdef DEBUG
> + BUG_ON(!vq->in_use);
> +#endif
> + BUG_ON(tail == NULL);
> +
> + /* The last one does not have the next flag set. */
> + tail->flags &= ~VRING_DESC_F_NEXT;
> +
> + /*
> + * Put entry in available array. Descriptors and available array
> + * need to be set before we expose the new available array entries.
> + */
> + avail = vq->vring.avail->idx & (vq->vring.num-1);
> + vq->vring.avail->ring[avail] = head;
> + virtio_wmb(vq);
> +
> + vq->vring.avail->idx++;
> + vq->num_added++;
> +
> + /*
> + * This is very unlikely, but theoretically possible. Kick
> + * just in case.
> + */
> + if (unlikely(vq->num_added == (1 << 16) - 1))
> + virtqueue_kick(&vq->vq);
> +
> + pr_debug("Added buffer head %i to %p\n", head, vq);
> + END_USE(vq);
> +}
> +EXPORT_SYMBOL_GPL(virtqueue_end_buf);
> +
> static inline bool more_used(const struct vring_virtqueue *vq)
> {
> return vq->last_used_idx != vq->vring.used->idx;
> diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> index cf8adb1..43d6bc3 100644
> --- a/include/linux/virtio.h
> +++ b/include/linux/virtio.h
> @@ -7,6 +7,7 @@
> #include <linux/spinlock.h>
> #include <linux/device.h>
> #include <linux/mod_devicetable.h>
> +#include <linux/dma-direction.h>
> #include <linux/gfp.h>
>
> /**
> @@ -40,6 +41,19 @@ int virtqueue_add_buf(struct virtqueue *vq,
> void *data,
> gfp_t gfp);
>
> +int virtqueue_start_buf(struct virtqueue *_vq,
> + void *data,
> + unsigned int nents,
> + unsigned int nsg,
> + gfp_t gfp);
> +
> +void virtqueue_add_sg(struct virtqueue *_vq,
> + struct scatterlist sgl[],
> + unsigned int nents,
> + enum dma_data_direction dir);
> +
> +void virtqueue_end_buf(struct virtqueue *_vq);
> +
> void virtqueue_kick(struct virtqueue *vq);
>
> bool virtqueue_kick_prepare(struct virtqueue *vq);
> --
> 1.7.1
>
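To check that I'm reading the intended flow correctly, here is a userspace
toy model of the start/add/end sequence, following the indirect-table path
of virtqueue_add_sg. Everything named toy_* or TOY_*, the fixed-size table
and the lengths are made up for illustration; none of it is the kernel API,
and there is no ring, token or locking here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_F_NEXT   1u	/* plays the role of VRING_DESC_F_NEXT */
#define TOY_F_WRITE  2u	/* plays the role of VRING_DESC_F_WRITE */
#define TOY_MAX_DESC 8u

enum toy_dir { TOY_TO_DEVICE, TOY_FROM_DEVICE };

struct toy_desc {
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct toy_buf {
	struct toy_desc desc[TOY_MAX_DESC];
	unsigned int used;	/* entries filled in so far */
	struct toy_desc *tail;	/* last entry written, NULL if none */
};

static void toy_start_buf(struct toy_buf *b)
{
	b->used = 0;
	b->tail = NULL;
}

/* Append nents entries; every entry gets the NEXT flag for now. */
static void toy_add_sg(struct toy_buf *b, const uint32_t lens[],
		       unsigned int nents, enum toy_dir dir)
{
	uint16_t flags = TOY_F_NEXT;
	unsigned int n;

	if (dir == TOY_FROM_DEVICE)
		flags |= TOY_F_WRITE;

	assert(nents != 0 && b->used + nents <= TOY_MAX_DESC);
	for (n = 0; n < nents; n++) {
		b->tail = &b->desc[b->used];
		b->tail->len = lens[n];
		b->tail->flags = flags;
		b->tail->next = (uint16_t)++b->used;
	}
}

/* Terminate the chain: the last entry must not have NEXT set. */
static void toy_end_buf(struct toy_buf *b)
{
	assert(b->tail != NULL);
	b->tail->flags &= (uint16_t)~TOY_F_NEXT;
}

/* A read request as in the changelog, with arbitrary lengths. */
static unsigned int toy_read_request(struct toy_buf *b)
{
	static const uint32_t req_hdr[] = { 64 };
	static const uint32_t resp_hdr[] = { 128 };
	static const uint32_t read_buf[] = { 4096, 4096 };

	toy_start_buf(b);
	toy_add_sg(b, req_hdr, 1, TOY_TO_DEVICE);
	toy_add_sg(b, resp_hdr, 1, TOY_FROM_DEVICE);
	toy_add_sg(b, read_buf, 2, TOY_FROM_DEVICE);
	toy_end_buf(b);
	return b->used;
}
```

In this model a read request takes three add_sg calls (no write buffer) and
ends up as four entries, with only the last one lacking the NEXT flag. Each
call carries a single direction, which is why in and out need separate calls.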