* [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem
@ 2013-10-10 15:07 Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 1/4] vring: create a common function to parse descriptors Paolo Bonzini
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-10 15:07 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha
Now that the memory API is thread-safe, we can use it in
virtio-blk-dataplane and replace hostmem.[ch]. This series does this,
and also changes the vring API to use VirtQueueElement (with an eye
towards migration). With this change, virtio-blk-dataplane is also safe
against memory hot-unplug.
The next step would be to replace memory_region_find with
address_space_{map,unmap}, which handle dirtying of memory correctly.
However these APIs are not thread-safe yet, and neither is the handling
of dirty memory (Juan's patches may be a start here).
Also, the usage of iov_discard_{front,back} may cause some complication
when we use address_space_{map,unmap}. We may have to change a bit the
logic in virtio-blk-dataplane to switch to address_space_{map,unmap}.
If we do not want to do this intermediate step, the first three patches
can be applied separately from the fourth.
Paolo Bonzini (4):
vring: create a common function to parse descriptors
vring: factor common code for error exits
dataplane: change vring API to use VirtQueueElement
dataplane: replace hostmem with memory_region_find
hw/block/dataplane/virtio-blk.c | 86 +++++-------
hw/virtio/dataplane/Makefile.objs | 2 +-
hw/virtio/dataplane/hostmem.c | 183 -------------------------
hw/virtio/dataplane/vring.c | 244 +++++++++++++++++++++-------------
include/hw/virtio/dataplane/hostmem.h | 58 --------
include/hw/virtio/dataplane/vring.h | 9 +-
6 files changed, 193 insertions(+), 389 deletions(-)
delete mode 100644 hw/virtio/dataplane/hostmem.c
delete mode 100644 include/hw/virtio/dataplane/hostmem.h
--
1.8.3.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 1/4] vring: create a common function to parse descriptors
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
@ 2013-10-10 15:07 ` Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits Paolo Bonzini
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-10 15:07 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/virtio/dataplane/vring.c | 113 ++++++++++++++++++++------------------------
1 file changed, 51 insertions(+), 62 deletions(-)
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 351a343..8294f36 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -110,6 +110,47 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
return vring_need_event(vring_used_event(&vring->vr), new, old);
}
+
+static int get_desc(Vring *vring,
+ struct iovec iov[], struct iovec *iov_end,
+ unsigned int *out_num, unsigned int *in_num,
+ struct vring_desc *desc)
+{
+ unsigned *num;
+
+ if (desc->flags & VRING_DESC_F_WRITE) {
+ num = in_num;
+ } else {
+ num = out_num;
+
+ /* If it's an output descriptor, they're all supposed
+ * to come before any input descriptors. */
+ if (unlikely(*in_num)) {
+ error_report("Descriptor has out after in");
+ return -EFAULT;
+ }
+ }
+
+ /* Stop for now if there are not enough iovecs available. */
+ iov += *in_num + *out_num;
+ if (iov >= iov_end) {
+ return -ENOBUFS;
+ }
+
+ /* TODO handle non-contiguous memory across region boundaries */
+ iov->iov_base = hostmem_lookup(&vring->hostmem, desc->addr, desc->len,
+ desc->flags & VRING_DESC_F_WRITE);
+ if (!iov->iov_base) {
+ error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
+ (uint64_t)desc->addr, desc->len);
+ return -EFAULT;
+ }
+
+ iov->iov_len = desc->len;
+ *num += 1;
+ return 0;
+}
+
/* This is stolen from linux/drivers/vhost/vhost.c. */
static int get_indirect(Vring *vring,
struct iovec iov[], struct iovec *iov_end,
@@ -118,6 +159,7 @@ static int get_indirect(Vring *vring,
{
struct vring_desc desc;
unsigned int i = 0, count, found = 0;
+ int ret;
/* Sanity check */
if (unlikely(indirect->len % sizeof(desc))) {
@@ -170,36 +212,10 @@ static int get_indirect(Vring *vring,
return -EFAULT;
}
- /* Stop for now if there are not enough iovecs available. */
- if (iov >= iov_end) {
- return -ENOBUFS;
- }
-
- iov->iov_base = hostmem_lookup(&vring->hostmem, desc.addr, desc.len,
- desc.flags & VRING_DESC_F_WRITE);
- if (!iov->iov_base) {
- error_report("Failed to map indirect descriptor"
- "addr %#" PRIx64 " len %u",
- (uint64_t)desc.addr, desc.len);
- vring->broken = true;
- return -EFAULT;
- }
- iov->iov_len = desc.len;
- iov++;
-
- /* If this is an input descriptor, increment that count. */
- if (desc.flags & VRING_DESC_F_WRITE) {
- *in_num += 1;
- } else {
- /* If it's an output descriptor, they're all supposed
- * to come before any input descriptors. */
- if (unlikely(*in_num)) {
- error_report("Indirect descriptor "
- "has out after in: idx %u", i);
- vring->broken = true;
- return -EFAULT;
- }
- *out_num += 1;
+ ret = get_desc(vring, iov, iov_end, out_num, in_num, &desc);
+ if (ret < 0) {
+ vring->broken |= (ret == -EFAULT);
+ return ret;
}
i = desc.next;
} while (desc.flags & VRING_DESC_F_NEXT);
@@ -224,6 +240,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
struct vring_desc desc;
unsigned int i, head, found = 0, num = vring->vr.num;
uint16_t avail_idx, last_avail_idx;
+ int ret;
/* If there was a fatal error then refuse operation */
if (vring->broken) {
@@ -294,40 +311,12 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
continue;
}
- /* If there are not enough iovecs left, stop for now. The caller
- * should check if there are more descs available once they have dealt
- * with the current set.
- */
- if (iov >= iov_end) {
- return -ENOBUFS;
+ ret = get_desc(vring, iov, iov_end, out_num, in_num, &desc);
+ if (ret < 0) {
+ vring->broken |= (ret == -EFAULT);
+ return ret;
}
- /* TODO handle non-contiguous memory across region boundaries */
- iov->iov_base = hostmem_lookup(&vring->hostmem, desc.addr, desc.len,
- desc.flags & VRING_DESC_F_WRITE);
- if (!iov->iov_base) {
- error_report("Failed to map vring desc addr %#" PRIx64 " len %u",
- (uint64_t)desc.addr, desc.len);
- vring->broken = true;
- return -EFAULT;
- }
- iov->iov_len = desc.len;
- iov++;
-
- if (desc.flags & VRING_DESC_F_WRITE) {
- /* If this is an input descriptor,
- * increment that count. */
- *in_num += 1;
- } else {
- /* If it's an output descriptor, they're all supposed
- * to come before any input descriptors. */
- if (unlikely(*in_num)) {
- error_report("Descriptor has out after in: idx %d", i);
- vring->broken = true;
- return -EFAULT;
- }
- *out_num += 1;
- }
i = desc.next;
} while (desc.flags & VRING_DESC_F_NEXT);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 1/4] vring: create a common function to parse descriptors Paolo Bonzini
@ 2013-10-10 15:07 ` Paolo Bonzini
2013-10-10 20:53 ` Richard Henderson
2013-10-10 15:07 ` [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement Paolo Bonzini
` (2 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-10 15:07 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/block/dataplane/virtio-blk.c | 1 +
hw/virtio/dataplane/vring.c | 34 +++++++++++++++++++++-------------
2 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index f2d7350..319a234 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -308,6 +308,7 @@ static void handle_notify(EventNotifier *e)
if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
vring_set_broken(&s->vring);
+ ret = -EFAULT;
break;
}
iov += out_num + in_num;
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 8294f36..d81b653 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -244,7 +244,8 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
/* If there was a fatal error then refuse operation */
if (vring->broken) {
- return -EFAULT;
+ ret = -EFAULT;
+ goto out;
}
/* Check it isn't doing very strange things with descriptor numbers. */
@@ -255,13 +256,14 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
if (unlikely((uint16_t)(avail_idx - last_avail_idx) > num)) {
error_report("Guest moved used index from %u to %u",
last_avail_idx, avail_idx);
- vring->broken = true;
- return -EFAULT;
+ ret = -EFAULT;
+ goto out;
}
/* If there's nothing new since last we looked. */
if (avail_idx == last_avail_idx) {
- return -EAGAIN;
+ ret = -EAGAIN;
+ goto out;
}
/* Only get avail ring entries after they have been exposed by guest. */
@@ -274,8 +276,8 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
/* If their number is silly, that's an error. */
if (unlikely(head >= num)) {
error_report("Guest says index %u > %u is available", head, num);
- vring->broken = true;
- return -EFAULT;
+ ret = -EFAULT;
+ goto out;
}
if (vdev->guest_features & (1 << VIRTIO_RING_F_EVENT_IDX)) {
@@ -289,14 +291,14 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
do {
if (unlikely(i >= num)) {
error_report("Desc index is %u > %u, head = %u", i, num, head);
- vring->broken = true;
- return -EFAULT;
+ ret = -EFAULT;
+ goto out;
}
if (unlikely(++found > num)) {
error_report("Loop detected: last one at %u vq size %u head %u",
i, num, head);
- vring->broken = true;
- return -EFAULT;
+ ret = -EFAULT;
+ goto out;
}
desc = vring->vr.desc[i];
@@ -306,15 +308,14 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
if (desc.flags & VRING_DESC_F_INDIRECT) {
int ret = get_indirect(vring, iov, iov_end, out_num, in_num, &desc);
if (ret < 0) {
- return ret;
+ goto out;
}
continue;
}
ret = get_desc(vring, iov, iov_end, out_num, in_num, &desc);
if (ret < 0) {
- vring->broken |= (ret == -EFAULT);
- return ret;
+ goto out;
}
i = desc.next;
@@ -323,6 +324,13 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
/* On success, increment avail index. */
vring->last_avail_idx++;
return head;
+
+out:
+ assert(ret < 0);
+ if (ret == -EFAULT) {
+ vring->broken = true;
+ }
+ return ret;
}
/* After we've used one of their buffers, we tell them about it.
--
1.8.3.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 1/4] vring: create a common function to parse descriptors Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits Paolo Bonzini
@ 2013-10-10 15:07 ` Paolo Bonzini
2013-12-04 14:06 ` Stefan Hajnoczi
2013-10-10 15:07 ` [Qemu-devel] [PATCH 4/4] dataplane: replace hostmem with memory_region_find Paolo Bonzini
2013-12-04 14:12 ` [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Stefan Hajnoczi
4 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-10 15:07 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/block/dataplane/virtio-blk.c | 85 ++++++++++++++-----------------------
hw/virtio/dataplane/vring.c | 51 +++++++++++++---------
include/hw/virtio/dataplane/vring.h | 6 +--
3 files changed, 66 insertions(+), 76 deletions(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 319a234..e700c0b 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -35,7 +35,7 @@ enum {
typedef struct {
struct iocb iocb; /* Linux AIO control block */
QEMUIOVector *inhdr; /* iovecs for virtio_blk_inhdr */
- unsigned int head; /* vring descriptor index */
+ VirtQueueElement *elem; /* saved data from the virtqueue */
struct iovec *bounce_iov; /* used if guest buffers are unaligned */
QEMUIOVector *read_qiov; /* for read completion /w bounce buffer */
} VirtIOBlockRequest;
@@ -96,7 +96,7 @@ static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque)
len = 0;
}
- trace_virtio_blk_data_plane_complete_request(s, req->head, ret);
+ trace_virtio_blk_data_plane_complete_request(s, req->elem->index, ret);
if (req->read_qiov) {
assert(req->bounce_iov);
@@ -118,12 +118,12 @@ static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque)
* written to, but for virtio-blk it seems to be the number of bytes
* transferred plus the status bytes.
*/
- vring_push(&s->vring, req->head, len + sizeof(hdr));
-
+ vring_push(&s->vring, req->elem, len + sizeof(hdr));
+ req->elem = NULL;
s->num_reqs--;
}
-static void complete_request_early(VirtIOBlockDataPlane *s, unsigned int head,
+static void complete_request_early(VirtIOBlockDataPlane *s, VirtQueueElement *elem,
QEMUIOVector *inhdr, unsigned char status)
{
struct virtio_blk_inhdr hdr = {
@@ -134,26 +134,26 @@ static void complete_request_early(VirtIOBlockDataPlane *s, unsigned int head,
qemu_iovec_destroy(inhdr);
g_slice_free(QEMUIOVector, inhdr);
- vring_push(&s->vring, head, sizeof(hdr));
+ vring_push(&s->vring, elem, sizeof(hdr));
notify_guest(s);
}
/* Get disk serial number */
static void do_get_id_cmd(VirtIOBlockDataPlane *s,
struct iovec *iov, unsigned int iov_cnt,
- unsigned int head, QEMUIOVector *inhdr)
+ VirtQueueElement *elem, QEMUIOVector *inhdr)
{
char id[VIRTIO_BLK_ID_BYTES];
/* Serial number not NUL-terminated when shorter than buffer */
strncpy(id, s->blk->serial ? s->blk->serial : "", sizeof(id));
iov_from_buf(iov, iov_cnt, 0, id, sizeof(id));
- complete_request_early(s, head, inhdr, VIRTIO_BLK_S_OK);
+ complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_OK);
}
static int do_rdwr_cmd(VirtIOBlockDataPlane *s, bool read,
- struct iovec *iov, unsigned int iov_cnt,
- long long offset, unsigned int head,
+ struct iovec *iov, unsigned iov_cnt,
+ long long offset, VirtQueueElement *elem,
QEMUIOVector *inhdr)
{
struct iocb *iocb;
@@ -186,19 +186,20 @@ static int do_rdwr_cmd(VirtIOBlockDataPlane *s, bool read,
/* Fill in virtio block metadata needed for completion */
VirtIOBlockRequest *req = container_of(iocb, VirtIOBlockRequest, iocb);
- req->head = head;
+ req->elem = elem;
req->inhdr = inhdr;
req->bounce_iov = bounce_iov;
req->read_qiov = read_qiov;
return 0;
}
-static int process_request(IOQueue *ioq, struct iovec iov[],
- unsigned int out_num, unsigned int in_num,
- unsigned int head)
+static int process_request(IOQueue *ioq, VirtQueueElement *elem)
{
VirtIOBlockDataPlane *s = container_of(ioq, VirtIOBlockDataPlane, ioqueue);
- struct iovec *in_iov = &iov[out_num];
+ struct iovec *iov = elem->out_sg;
+ struct iovec *in_iov = elem->in_sg;
+ unsigned out_num = elem->out_num;
+ unsigned in_num = elem->in_num;
struct virtio_blk_outhdr outhdr;
QEMUIOVector *inhdr;
size_t in_size;
@@ -229,29 +230,29 @@ static int process_request(IOQueue *ioq, struct iovec iov[],
switch (outhdr.type) {
case VIRTIO_BLK_T_IN:
- do_rdwr_cmd(s, true, in_iov, in_num, outhdr.sector * 512, head, inhdr);
+ do_rdwr_cmd(s, true, in_iov, in_num, outhdr.sector * 512, elem, inhdr);
return 0;
case VIRTIO_BLK_T_OUT:
- do_rdwr_cmd(s, false, iov, out_num, outhdr.sector * 512, head, inhdr);
+ do_rdwr_cmd(s, false, iov, out_num, outhdr.sector * 512, elem, inhdr);
return 0;
case VIRTIO_BLK_T_SCSI_CMD:
/* TODO support SCSI commands */
- complete_request_early(s, head, inhdr, VIRTIO_BLK_S_UNSUPP);
+ complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_UNSUPP);
return 0;
case VIRTIO_BLK_T_FLUSH:
/* TODO fdsync not supported by Linux AIO, do it synchronously here! */
if (qemu_fdatasync(s->fd) < 0) {
- complete_request_early(s, head, inhdr, VIRTIO_BLK_S_IOERR);
+ complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_IOERR);
} else {
- complete_request_early(s, head, inhdr, VIRTIO_BLK_S_OK);
+ complete_request_early(s, elem, inhdr, VIRTIO_BLK_S_OK);
}
return 0;
case VIRTIO_BLK_T_GET_ID:
- do_get_id_cmd(s, in_iov, in_num, head, inhdr);
+ do_get_id_cmd(s, in_iov, in_num, elem, inhdr);
return 0;
default:
@@ -267,29 +268,8 @@ static void handle_notify(EventNotifier *e)
VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
host_notifier);
- /* There is one array of iovecs into which all new requests are extracted
- * from the vring. Requests are read from the vring and the translated
- * descriptors are written to the iovecs array. The iovecs do not have to
- * persist across handle_notify() calls because the kernel copies the
- * iovecs on io_submit().
- *
- * Handling io_submit() EAGAIN may require storing the requests across
- * handle_notify() calls until the kernel has sufficient resources to
- * accept more I/O. This is not implemented yet.
- */
- struct iovec iovec[VRING_MAX];
- struct iovec *end = &iovec[VRING_MAX];
- struct iovec *iov = iovec;
-
- /* When a request is read from the vring, the index of the first descriptor
- * (aka head) is returned so that the completed request can be pushed onto
- * the vring later.
- *
- * The number of hypervisor read-only iovecs is out_num. The number of
- * hypervisor write-only iovecs is in_num.
- */
- int head;
- unsigned int out_num = 0, in_num = 0;
+ VirtQueueElement *elem;
+ int ret;
unsigned int num_queued;
event_notifier_test_and_clear(&s->host_notifier);
@@ -298,30 +278,31 @@ static void handle_notify(EventNotifier *e)
vring_disable_notification(s->vdev, &s->vring);
for (;;) {
- head = vring_pop(s->vdev, &s->vring, iov, end, &out_num, &in_num);
- if (head < 0) {
+ ret = vring_pop(s->vdev, &s->vring, &elem);
+ if (ret < 0) {
+ assert(elem == NULL);
break; /* no more requests */
}
- trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
- head);
+ trace_virtio_blk_data_plane_process_request(s, elem->out_num,
+ elem->in_num, elem->index);
- if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
+ if (process_request(&s->ioqueue, elem) < 0) {
vring_set_broken(&s->vring);
+ vring_push(&s->vring, elem, 0);
ret = -EFAULT;
break;
}
- iov += out_num + in_num;
}
- if (likely(head == -EAGAIN)) { /* vring emptied */
+ if (likely(ret == -EAGAIN)) { /* vring emptied */
/* Re-enable guest->host notifies and stop processing the vring.
* But if the guest has snuck in more descriptors, keep processing.
*/
if (vring_enable_notification(s->vdev, &s->vring)) {
break;
}
- } else { /* head == -ENOBUFS or fatal error, iovecs[] is depleted */
+ } else { /* ret == -ENOBUFS or fatal error, iovecs[] is depleted */
/* Since there are no iovecs[] left, stop processing for now. Do
* not re-enable guest->host notifies since the I/O completion
* handler knows to check for more vring descriptors anyway.
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index d81b653..26d6825 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -111,29 +111,32 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
}
-static int get_desc(Vring *vring,
- struct iovec iov[], struct iovec *iov_end,
- unsigned int *out_num, unsigned int *in_num,
+static int get_desc(Vring *vring, VirtQueueElement *elem,
struct vring_desc *desc)
{
unsigned *num;
+ struct iovec *iov;
+ hwaddr *addr;
if (desc->flags & VRING_DESC_F_WRITE) {
- num = in_num;
+ num = &elem->in_num;
+ iov = &elem->in_sg[*num];
+ addr = &elem->in_addr[*num];
} else {
- num = out_num;
+ num = &elem->out_num;
+ iov = &elem->out_sg[*num];
+ addr = &elem->out_addr[*num];
/* If it's an output descriptor, they're all supposed
* to come before any input descriptors. */
- if (unlikely(*in_num)) {
+ if (unlikely(elem->in_num)) {
error_report("Descriptor has out after in");
return -EFAULT;
}
}
/* Stop for now if there are not enough iovecs available. */
- iov += *in_num + *out_num;
- if (iov >= iov_end) {
+ if (*num >= VIRTQUEUE_MAX_SIZE) {
return -ENOBUFS;
}
@@ -147,14 +150,13 @@ static int get_desc(Vring *vring,
}
iov->iov_len = desc->len;
+ *addr = desc->addr;
*num += 1;
return 0;
}
/* This is stolen from linux/drivers/vhost/vhost.c. */
-static int get_indirect(Vring *vring,
- struct iovec iov[], struct iovec *iov_end,
- unsigned int *out_num, unsigned int *in_num,
+static int get_indirect(Vring *vring, VirtQueueElement *elem,
struct vring_desc *indirect)
{
struct vring_desc desc;
@@ -212,7 +214,7 @@ static int get_indirect(Vring *vring,
return -EFAULT;
}
- ret = get_desc(vring, iov, iov_end, out_num, in_num, &desc);
+ ret = get_desc(vring, elem, &desc);
if (ret < 0) {
vring->broken |= (ret == -EFAULT);
return ret;
@@ -234,12 +236,12 @@ static int get_indirect(Vring *vring,
* Stolen from linux/drivers/vhost/vhost.c.
*/
int vring_pop(VirtIODevice *vdev, Vring *vring,
- struct iovec iov[], struct iovec *iov_end,
- unsigned int *out_num, unsigned int *in_num)
+ VirtQueueElement **p_elem)
{
struct vring_desc desc;
unsigned int i, head, found = 0, num = vring->vr.num;
uint16_t avail_idx, last_avail_idx;
+ VirtQueueElement *elem = NULL;
int ret;
/* If there was a fatal error then refuse operation */
@@ -273,6 +275,10 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
* the index we've seen. */
head = vring->vr.avail->ring[last_avail_idx % num];
+ elem = g_slice_new(VirtQueueElement);
+ elem->index = head;
+ elem->in_num = elem->out_num = 0;
+
/* If their number is silly, that's an error. */
if (unlikely(head >= num)) {
error_report("Guest says index %u > %u is available", head, num);
@@ -284,9 +290,6 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
vring_avail_event(&vring->vr) = vring->vr.avail->idx;
}
- /* When we start there are none of either input nor output. */
- *out_num = *in_num = 0;
-
i = head;
do {
if (unlikely(i >= num)) {
@@ -306,14 +309,14 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
barrier();
if (desc.flags & VRING_DESC_F_INDIRECT) {
- int ret = get_indirect(vring, iov, iov_end, out_num, in_num, &desc);
+ int ret = get_indirect(vring, elem, &desc);
if (ret < 0) {
goto out;
}
continue;
}
- ret = get_desc(vring, iov, iov_end, out_num, in_num, &desc);
+ ret = get_desc(vring, elem, &desc);
if (ret < 0) {
goto out;
}
@@ -323,6 +326,7 @@ int vring_pop(VirtIODevice *vdev, Vring *vring,
/* On success, increment avail index. */
vring->last_avail_idx++;
+ *p_elem = elem;
return head;
out:
@@ -330,6 +334,10 @@ out:
if (ret == -EFAULT) {
vring->broken = true;
}
+ if (elem) {
+ g_slice_free(VirtQueueElement, elem);
+ }
+ *p_elem = NULL;
return ret;
}
@@ -337,11 +345,14 @@ out:
*
* Stolen from linux/drivers/vhost/vhost.c.
*/
-void vring_push(Vring *vring, unsigned int head, int len)
+void vring_push(Vring *vring, VirtQueueElement *elem, int len)
{
struct vring_used_elem *used;
+ unsigned int head = elem->index;
uint16_t new;
+ g_slice_free(VirtQueueElement, elem);
+
/* Don't touch vring if a fatal error occurred */
if (vring->broken) {
return;
diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index c0b69ff..b51cfc9 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -54,9 +54,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int n);
void vring_disable_notification(VirtIODevice *vdev, Vring *vring);
bool vring_enable_notification(VirtIODevice *vdev, Vring *vring);
bool vring_should_notify(VirtIODevice *vdev, Vring *vring);
-int vring_pop(VirtIODevice *vdev, Vring *vring,
- struct iovec iov[], struct iovec *iov_end,
- unsigned int *out_num, unsigned int *in_num);
-void vring_push(Vring *vring, unsigned int head, int len);
+int vring_pop(VirtIODevice *vdev, Vring *vring, VirtQueueElement **elem);
+void vring_push(Vring *vring, VirtQueueElement *elem, int len);
#endif /* VRING_H */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 4/4] dataplane: replace hostmem with memory_region_find
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
` (2 preceding siblings ...)
2013-10-10 15:07 ` [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement Paolo Bonzini
@ 2013-10-10 15:07 ` Paolo Bonzini
2013-12-04 14:12 ` [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Stefan Hajnoczi
4 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-10 15:07 UTC (permalink / raw)
To: qemu-devel; +Cc: stefanha
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/virtio/dataplane/Makefile.objs | 2 +-
hw/virtio/dataplane/hostmem.c | 183 ----------------------------------
hw/virtio/dataplane/vring.c | 74 ++++++++++++--
include/hw/virtio/dataplane/hostmem.h | 58 -----------
include/hw/virtio/dataplane/vring.h | 3 +-
5 files changed, 68 insertions(+), 252 deletions(-)
delete mode 100644 hw/virtio/dataplane/hostmem.c
delete mode 100644 include/hw/virtio/dataplane/hostmem.h
diff --git a/hw/virtio/dataplane/Makefile.objs b/hw/virtio/dataplane/Makefile.objs
index a91bf33..9a8cfc0 100644
--- a/hw/virtio/dataplane/Makefile.objs
+++ b/hw/virtio/dataplane/Makefile.objs
@@ -1 +1 @@
-common-obj-y += hostmem.o vring.o
+common-obj-y += vring.o
diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
deleted file mode 100644
index 901d98b..0000000
--- a/hw/virtio/dataplane/hostmem.c
+++ /dev/null
@@ -1,183 +0,0 @@
-/*
- * Thread-safe guest to host memory mapping
- *
- * Copyright 2012 Red Hat, Inc. and/or its affiliates
- *
- * Authors:
- * Stefan Hajnoczi <stefanha@redhat.com>
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#include "exec/address-spaces.h"
-#include "hw/virtio/dataplane/hostmem.h"
-
-static int hostmem_lookup_cmp(const void *phys_, const void *region_)
-{
- hwaddr phys = *(const hwaddr *)phys_;
- const HostMemRegion *region = region_;
-
- if (phys < region->guest_addr) {
- return -1;
- } else if (phys >= region->guest_addr + region->size) {
- return 1;
- } else {
- return 0;
- }
-}
-
-/**
- * Map guest physical address to host pointer
- */
-void *hostmem_lookup(HostMem *hostmem, hwaddr phys, hwaddr len, bool is_write)
-{
- HostMemRegion *region;
- void *host_addr = NULL;
- hwaddr offset_within_region;
-
- qemu_mutex_lock(&hostmem->current_regions_lock);
- region = bsearch(&phys, hostmem->current_regions,
- hostmem->num_current_regions,
- sizeof(hostmem->current_regions[0]),
- hostmem_lookup_cmp);
- if (!region) {
- goto out;
- }
- if (is_write && region->readonly) {
- goto out;
- }
- offset_within_region = phys - region->guest_addr;
- if (len <= region->size - offset_within_region) {
- host_addr = region->host_addr + offset_within_region;
- }
-out:
- qemu_mutex_unlock(&hostmem->current_regions_lock);
-
- return host_addr;
-}
-
-/**
- * Install new regions list
- */
-static void hostmem_listener_commit(MemoryListener *listener)
-{
- HostMem *hostmem = container_of(listener, HostMem, listener);
- int i;
-
- qemu_mutex_lock(&hostmem->current_regions_lock);
- for (i = 0; i < hostmem->num_current_regions; i++) {
- memory_region_unref(hostmem->current_regions[i].mr);
- }
- g_free(hostmem->current_regions);
- hostmem->current_regions = hostmem->new_regions;
- hostmem->num_current_regions = hostmem->num_new_regions;
- qemu_mutex_unlock(&hostmem->current_regions_lock);
-
- /* Reset new regions list */
- hostmem->new_regions = NULL;
- hostmem->num_new_regions = 0;
-}
-
-/**
- * Add a MemoryRegionSection to the new regions list
- */
-static void hostmem_append_new_region(HostMem *hostmem,
- MemoryRegionSection *section)
-{
- void *ram_ptr = memory_region_get_ram_ptr(section->mr);
- size_t num = hostmem->num_new_regions;
- size_t new_size = (num + 1) * sizeof(hostmem->new_regions[0]);
-
- hostmem->new_regions = g_realloc(hostmem->new_regions, new_size);
- hostmem->new_regions[num] = (HostMemRegion){
- .host_addr = ram_ptr + section->offset_within_region,
- .guest_addr = section->offset_within_address_space,
- .size = int128_get64(section->size),
- .readonly = section->readonly,
- .mr = section->mr,
- };
- hostmem->num_new_regions++;
-
- memory_region_ref(section->mr);
-}
-
-static void hostmem_listener_append_region(MemoryListener *listener,
- MemoryRegionSection *section)
-{
- HostMem *hostmem = container_of(listener, HostMem, listener);
-
- /* Ignore non-RAM regions, we may not be able to map them */
- if (!memory_region_is_ram(section->mr)) {
- return;
- }
-
- /* Ignore regions with dirty logging, we cannot mark them dirty */
- if (memory_region_is_logging(section->mr)) {
- return;
- }
-
- hostmem_append_new_region(hostmem, section);
-}
-
-/* We don't implement most MemoryListener callbacks, use these nop stubs */
-static void hostmem_listener_dummy(MemoryListener *listener)
-{
-}
-
-static void hostmem_listener_section_dummy(MemoryListener *listener,
- MemoryRegionSection *section)
-{
-}
-
-static void hostmem_listener_eventfd_dummy(MemoryListener *listener,
- MemoryRegionSection *section,
- bool match_data, uint64_t data,
- EventNotifier *e)
-{
-}
-
-static void hostmem_listener_coalesced_mmio_dummy(MemoryListener *listener,
- MemoryRegionSection *section,
- hwaddr addr, hwaddr len)
-{
-}
-
-void hostmem_init(HostMem *hostmem)
-{
- memset(hostmem, 0, sizeof(*hostmem));
-
- qemu_mutex_init(&hostmem->current_regions_lock);
-
- hostmem->listener = (MemoryListener){
- .begin = hostmem_listener_dummy,
- .commit = hostmem_listener_commit,
- .region_add = hostmem_listener_append_region,
- .region_del = hostmem_listener_section_dummy,
- .region_nop = hostmem_listener_append_region,
- .log_start = hostmem_listener_section_dummy,
- .log_stop = hostmem_listener_section_dummy,
- .log_sync = hostmem_listener_section_dummy,
- .log_global_start = hostmem_listener_dummy,
- .log_global_stop = hostmem_listener_dummy,
- .eventfd_add = hostmem_listener_eventfd_dummy,
- .eventfd_del = hostmem_listener_eventfd_dummy,
- .coalesced_mmio_add = hostmem_listener_coalesced_mmio_dummy,
- .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy,
- .priority = 10,
- };
-
- memory_listener_register(&hostmem->listener, &address_space_memory);
- if (hostmem->num_new_regions > 0) {
- hostmem_listener_commit(&hostmem->listener);
- }
-}
-
-void hostmem_finalize(HostMem *hostmem)
-{
- memory_listener_unregister(&hostmem->listener);
- g_free(hostmem->new_regions);
- g_free(hostmem->current_regions);
- qemu_mutex_destroy(&hostmem->current_regions_lock);
-}
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index 26d6825..5ca1ea7 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -15,9 +15,50 @@
*/
#include "trace.h"
+#include "hw/hw.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
#include "hw/virtio/dataplane/vring.h"
#include "qemu/error-report.h"
+static void *vring_map(MemoryRegion **mr, hwaddr phys, hwaddr len,
+ bool is_write)
+{
+ MemoryRegionSection section = memory_region_find(get_system_memory(), phys, len);
+
+ if (!section.mr || int128_get64(section.size) < len) {
+ goto out;
+ }
+ if (is_write && section.readonly) {
+ goto out;
+ }
+ if (!memory_region_is_ram(section.mr)) {
+ goto out;
+ }
+
+ /* Ignore regions with dirty logging, we cannot mark them dirty */
+ if (memory_region_is_logging(section.mr)) {
+ goto out;
+ }
+
+ *mr = section.mr;
+ return memory_region_get_ram_ptr(section.mr) + section.offset_within_region;
+
+out:
+ memory_region_unref(section.mr);
+ *mr = NULL;
+ return NULL;
+}
+
+static void vring_unmap(void *buffer, bool is_write)
+{
+ ram_addr_t addr;
+ MemoryRegion *mr;
+
+ mr = qemu_ram_addr_from_host(buffer, &addr);
+ memory_region_unref(mr);
+}
+
/* Map the guest's vring to host memory */
bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
{
@@ -27,8 +68,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
vring->broken = false;
- hostmem_init(&vring->hostmem);
- vring_ptr = hostmem_lookup(&vring->hostmem, vring_addr, vring_size, true);
+ vring_ptr = vring_map(&vring->mr, vring_addr, vring_size, true);
if (!vring_ptr) {
error_report("Failed to map vring "
"addr %#" HWADDR_PRIx " size %" HWADDR_PRIu,
@@ -54,7 +94,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int n)
virtio_queue_set_last_avail_idx(vdev, n, vring->last_avail_idx);
virtio_queue_invalidate_signalled_used(vdev, n);
- hostmem_finalize(&vring->hostmem);
+ memory_region_unref(vring->mr);
}
/* Disable guest->host notifies */
@@ -117,6 +157,7 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
unsigned *num;
struct iovec *iov;
hwaddr *addr;
+ MemoryRegion *mr;
if (desc->flags & VRING_DESC_F_WRITE) {
num = &elem->in_num;
@@ -141,14 +182,16 @@ static int get_desc(Vring *vring, VirtQueueElement *elem,
}
/* TODO handle non-contiguous memory across region boundaries */
- iov->iov_base = hostmem_lookup(&vring->hostmem, desc->addr, desc->len,
- desc->flags & VRING_DESC_F_WRITE);
+ iov->iov_base = vring_map(&mr, desc->addr, desc->len,
+ desc->flags & VRING_DESC_F_WRITE);
if (!iov->iov_base) {
error_report("Failed to map descriptor addr %#" PRIx64 " len %u",
(uint64_t)desc->addr, desc->len);
return -EFAULT;
}
+ /* The MemoryRegion is looked up again and unref'ed later, leave the
+ * ref in place. */
iov->iov_len = desc->len;
*addr = desc->addr;
*num += 1;
@@ -183,11 +226,12 @@ static int get_indirect(Vring *vring, VirtQueueElement *elem,
do {
struct vring_desc *desc_ptr;
+ MemoryRegion *mr;
/* Translate indirect descriptor */
- desc_ptr = hostmem_lookup(&vring->hostmem,
- indirect->addr + found * sizeof(desc),
- sizeof(desc), false);
+ desc_ptr = vring_map(&mr,
+ indirect->addr + found * sizeof(desc),
+ sizeof(desc), false);
if (!desc_ptr) {
error_report("Failed to map indirect descriptor "
"addr %#" PRIx64 " len %zu",
@@ -197,6 +241,7 @@ static int get_indirect(Vring *vring, VirtQueueElement *elem,
return -EFAULT;
}
desc = *desc_ptr;
+ memory_region_unref(mr);
/* Ensure descriptor has been loaded before accessing fields */
barrier(); /* read_barrier_depends(); */
@@ -350,6 +395,19 @@ void vring_push(Vring *vring, VirtQueueElement *elem, int len)
struct vring_used_elem *used;
unsigned int head = elem->index;
uint16_t new;
+ int i;
+
+ /* This assumes that the iovecs, if changed, are never moved past
+ * the end of the valid area. This is true if iovec manipulations
+ * are done with iov_discard_front and iov_discard_back.
+ */
+ for (i = 0; i < elem->out_num; i++) {
+ vring_unmap(elem->out_sg[i].iov_base, false);
+ }
+
+ for (i = 0; i < elem->in_num; i++) {
+ vring_unmap(elem->in_sg[i].iov_base, true);
+ }
g_slice_free(VirtQueueElement, elem);
diff --git a/include/hw/virtio/dataplane/hostmem.h b/include/hw/virtio/dataplane/hostmem.h
deleted file mode 100644
index 2810f4b..0000000
--- a/include/hw/virtio/dataplane/hostmem.h
+++ /dev/null
@@ -1,58 +0,0 @@
-/*
- * Thread-safe guest to host memory mapping
- *
- * Copyright 2012 Red Hat, Inc. and/or its affiliates
- *
- * Authors:
- * Stefan Hajnoczi <stefanha@redhat.com>
- *
- * This work is licensed under the terms of the GNU GPL, version 2 or later.
- * See the COPYING file in the top-level directory.
- *
- */
-
-#ifndef HOSTMEM_H
-#define HOSTMEM_H
-
-#include "exec/memory.h"
-#include "qemu/thread.h"
-
-typedef struct {
- MemoryRegion *mr;
- void *host_addr;
- hwaddr guest_addr;
- uint64_t size;
- bool readonly;
-} HostMemRegion;
-
-typedef struct {
- /* The listener is invoked when regions change and a new list of regions is
- * built up completely before they are installed.
- */
- MemoryListener listener;
- HostMemRegion *new_regions;
- size_t num_new_regions;
-
- /* Current regions are accessed from multiple threads either to lookup
- * addresses or to install a new list of regions. The lock protects the
- * pointer and the regions.
- */
- QemuMutex current_regions_lock;
- HostMemRegion *current_regions;
- size_t num_current_regions;
-} HostMem;
-
-void hostmem_init(HostMem *hostmem);
-void hostmem_finalize(HostMem *hostmem);
-
-/**
- * Map a guest physical address to a pointer
- *
- * Note that there is map/unmap mechanism here. The caller must ensure that
- * mapped memory is no longer used across events like hot memory unplug. This
- * can be done with other mechanisms like bdrv_drain_all() that quiesce
- * in-flight I/O.
- */
-void *hostmem_lookup(HostMem *hostmem, hwaddr phys, hwaddr len, bool is_write);
-
-#endif /* HOSTMEM_H */
diff --git a/include/hw/virtio/dataplane/vring.h b/include/hw/virtio/dataplane/vring.h
index b51cfc9..b23edd2 100644
--- a/include/hw/virtio/dataplane/vring.h
+++ b/include/hw/virtio/dataplane/vring.h
@@ -19,11 +19,10 @@
#include <linux/virtio_ring.h>
#include "qemu-common.h"
-#include "hostmem.h"
#include "hw/virtio/virtio.h"
typedef struct {
- HostMem hostmem; /* guest memory mapper */
+ MemoryRegion *mr; /* memory region containing the vring */
struct vring vr; /* virtqueue vring mapped to host memory */
uint16_t last_avail_idx; /* last processed avail ring index */
uint16_t last_used_idx; /* last processed used ring index */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits
2013-10-10 15:07 ` [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits Paolo Bonzini
@ 2013-10-10 20:53 ` Richard Henderson
2013-10-11 9:19 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Richard Henderson @ 2013-10-10 20:53 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: stefanha
On 10/10/2013 08:07 AM, Paolo Bonzini wrote:
> return head;
> +
> +out:
> + assert(ret < 0);
> + if (ret == -EFAULT) {
> + vring->broken = true;
> + }
> + return ret;
If this is only the error path, can we call the
label something other than "out"?
r~
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits
2013-10-10 20:53 ` Richard Henderson
@ 2013-10-11 9:19 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2013-10-11 9:19 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, stefanha
Il 10/10/2013 22:53, Richard Henderson ha scritto:
> On 10/10/2013 08:07 AM, Paolo Bonzini wrote:
>> return head;
>> +
>> +out:
>> + assert(ret < 0);
>> + if (ret == -EFAULT) {
>> + vring->broken = true;
>> + }
>> + return ret;
>
> If this is only the error path, can we call the
> label something other than "out"?
Yes. Though I think it does not matter if the function returns a zero
or positive result after the next patch, so it can become a general exit
label too.
Paolo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-10-10 15:07 ` [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement Paolo Bonzini
@ 2013-12-04 14:06 ` Stefan Hajnoczi
2013-12-04 17:40 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2013-12-04 14:06 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha
On Thu, Oct 10, 2013 at 05:07:18PM +0200, Paolo Bonzini wrote:
> @@ -298,30 +278,31 @@ static void handle_notify(EventNotifier *e)
> vring_disable_notification(s->vdev, &s->vring);
>
> for (;;) {
> - head = vring_pop(s->vdev, &s->vring, iov, end, &out_num, &in_num);
> - if (head < 0) {
> + ret = vring_pop(s->vdev, &s->vring, &elem);
> + if (ret < 0) {
> + assert(elem == NULL);
> break; /* no more requests */
> }
>
> - trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
> - head);
> + trace_virtio_blk_data_plane_process_request(s, elem->out_num,
> + elem->in_num, elem->index);
>
> - if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
> + if (process_request(&s->ioqueue, elem) < 0) {
> vring_set_broken(&s->vring);
> + vring_push(&s->vring, elem, 0);
If we give up on the vring I don't think we should push the element
back. It may cause the guest to panic.
I guess what we really need here is to unmap scatter-gather buffers and
delete elem.
Stefan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
` (3 preceding siblings ...)
2013-10-10 15:07 ` [Qemu-devel] [PATCH 4/4] dataplane: replace hostmem with memory_region_find Paolo Bonzini
@ 2013-12-04 14:12 ` Stefan Hajnoczi
4 siblings, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2013-12-04 14:12 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, stefanha
On Thu, Oct 10, 2013 at 05:07:15PM +0200, Paolo Bonzini wrote:
> Now that the memory API is thread-safe, we can use it in
> virtio-blk-dataplane and replace hostmem.[ch]. This series does this,
> and also changes the vring API to use VirtQueueElement (with an eye
> towards migration). With this change, virtio-blk-dataplane is also safe
> against memory hot-unplug.
>
> The next step would be to replace memory_region_find with
> address_space_{map,unmap}, which handle dirtying of memory correctly.
> However these APIs are not thread-safe yet, and neither is the handling
> of dirty memory (Juan's patches may be a start here).
>
> Also, the usage of iov_discard_{front,back} may cause some complication
> when we use address_space_{map,unmap}. We may have to change a bit the
> logic in virtio-blk-dataplane to switch to address_space_{map,unmap}.
>
> If we do not want to do this intermediate step, the first three patches
> can be applied separately from the fourth.
>
> Paolo Bonzini (4):
> vring: create a common function to parse descriptors
> vring: factor common code for error exits
> dataplane: change vring API to use VirtQueueElement
> dataplane: replace hostmem with memory_region_find
>
> hw/block/dataplane/virtio-blk.c | 86 +++++-------
> hw/virtio/dataplane/Makefile.objs | 2 +-
> hw/virtio/dataplane/hostmem.c | 183 -------------------------
> hw/virtio/dataplane/vring.c | 244 +++++++++++++++++++++-------------
> include/hw/virtio/dataplane/hostmem.h | 58 --------
> include/hw/virtio/dataplane/vring.h | 9 +-
> 6 files changed, 193 insertions(+), 389 deletions(-)
> delete mode 100644 hw/virtio/dataplane/hostmem.c
> delete mode 100644 include/hw/virtio/dataplane/hostmem.h
Finally looked into this series. I think we should merge all patches so
we start exercising the thread-safe memory API as soon as possible in
the QEMU 2.0 release cycle.
Before merging I wanted to discuss the vring->broken case which I've
left a comment on.
Stefan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-12-04 14:06 ` Stefan Hajnoczi
@ 2013-12-04 17:40 ` Paolo Bonzini
2013-12-05 9:24 ` Stefan Hajnoczi
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2013-12-04 17:40 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: qemu-devel, stefanha
Il 04/12/2013 15:06, Stefan Hajnoczi ha scritto:
> On Thu, Oct 10, 2013 at 05:07:18PM +0200, Paolo Bonzini wrote:
>> @@ -298,30 +278,31 @@ static void handle_notify(EventNotifier *e)
>> vring_disable_notification(s->vdev, &s->vring);
>>
>> for (;;) {
>> - head = vring_pop(s->vdev, &s->vring, iov, end, &out_num, &in_num);
>> - if (head < 0) {
>> + ret = vring_pop(s->vdev, &s->vring, &elem);
>> + if (ret < 0) {
>> + assert(elem == NULL);
>> break; /* no more requests */
>> }
>>
>> - trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
>> - head);
>> + trace_virtio_blk_data_plane_process_request(s, elem->out_num,
>> + elem->in_num, elem->index);
>>
>> - if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
>> + if (process_request(&s->ioqueue, elem) < 0) {
>> vring_set_broken(&s->vring);
>> + vring_push(&s->vring, elem, 0);
>
> If we give up on the vring I don't think we should push the element
> back. It may cause the guest to panic.
>
> I guess what we really need here is to unmap scatter-gather buffers and
> delete elem.
That's what already happens actually. vring_push has
+ g_slice_free(VirtQueueElement, elem);
+
/* Don't touch vring if a fatal error occurred */
if (vring->broken) {
return;
in this patch and
+ for (i = 0; i < elem->out_num; i++) {
+ vring_unmap(elem->out_sg[i].iov_base, false);
+ }
+
+ for (i = 0; i < elem->in_num; i++) {
+ vring_unmap(elem->in_sg[i].iov_base, true);
+ }
g_slice_free(VirtQueueElement, elem);
in the next one.
Though I admit vring_push isn't such a great name and API. I can add
instead a vring_free_element function. Do you think vring_push should
call it, or should the caller do that?
Paolo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-12-04 17:40 ` Paolo Bonzini
@ 2013-12-05 9:24 ` Stefan Hajnoczi
2013-12-05 10:34 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2013-12-05 9:24 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: Stefan Hajnoczi, qemu-devel
On Wed, Dec 04, 2013 at 06:40:30PM +0100, Paolo Bonzini wrote:
> Il 04/12/2013 15:06, Stefan Hajnoczi ha scritto:
> > On Thu, Oct 10, 2013 at 05:07:18PM +0200, Paolo Bonzini wrote:
> >> @@ -298,30 +278,31 @@ static void handle_notify(EventNotifier *e)
> >> vring_disable_notification(s->vdev, &s->vring);
> >>
> >> for (;;) {
> >> - head = vring_pop(s->vdev, &s->vring, iov, end, &out_num, &in_num);
> >> - if (head < 0) {
> >> + ret = vring_pop(s->vdev, &s->vring, &elem);
> >> + if (ret < 0) {
> >> + assert(elem == NULL);
> >> break; /* no more requests */
> >> }
> >>
> >> - trace_virtio_blk_data_plane_process_request(s, out_num, in_num,
> >> - head);
> >> + trace_virtio_blk_data_plane_process_request(s, elem->out_num,
> >> + elem->in_num, elem->index);
> >>
> >> - if (process_request(&s->ioqueue, iov, out_num, in_num, head) < 0) {
> >> + if (process_request(&s->ioqueue, elem) < 0) {
> >> vring_set_broken(&s->vring);
> >> + vring_push(&s->vring, elem, 0);
> >
> > If we give up on the vring I don't think we should push the element
> > back. It may cause the guest to panic.
> >
> > I guess what we really need here is to unmap scatter-gather buffers and
> > delete elem.
>
> That's what already happens actually. vring_push has
>
>
> + g_slice_free(VirtQueueElement, elem);
> +
> /* Don't touch vring if a fatal error occurred */
> if (vring->broken) {
> return;
>
> in this patch and
>
> + for (i = 0; i < elem->out_num; i++) {
> + vring_unmap(elem->out_sg[i].iov_base, false);
> + }
> +
> + for (i = 0; i < elem->in_num; i++) {
> + vring_unmap(elem->in_sg[i].iov_base, true);
> + }
>
> g_slice_free(VirtQueueElement, elem);
>
> in the next one.
>
> Though I admit vring_push isn't such a great name and API. I can add
> instead a vring_free_element function. Do you think vring_push should
> call it, or should the caller do that?
I think vring_push() should free the VirtQueueElement.
We just need to expose vring_free_element() so that handle_notify() can
call it without pushing bogus buffers back to the guest.
Stefan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-12-05 9:24 ` Stefan Hajnoczi
@ 2013-12-05 10:34 ` Paolo Bonzini
2013-12-06 9:02 ` Stefan Hajnoczi
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2013-12-05 10:34 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel
Il 05/12/2013 10:24, Stefan Hajnoczi ha scritto:
>> >
>> > That's what already happens actually. vring_push has
>> >
>> >
>> > + g_slice_free(VirtQueueElement, elem);
>> > +
>> > /* Don't touch vring if a fatal error occurred */
>> > if (vring->broken) {
>> > return;
>> >
>> > in this patch and
>> >
>> > + for (i = 0; i < elem->out_num; i++) {
>> > + vring_unmap(elem->out_sg[i].iov_base, false);
>> > + }
>> > +
>> > + for (i = 0; i < elem->in_num; i++) {
>> > + vring_unmap(elem->in_sg[i].iov_base, true);
>> > + }
>> >
>> > g_slice_free(VirtQueueElement, elem);
>> >
>> > in the next one.
>> >
>> > Though I admit vring_push isn't such a great name and API. I can add
>> > instead a vring_free_element function. Do you think vring_push should
>> > call it, or should the caller do that?
> I think vring_push() should free the VirtQueueElement.
>
> We just need to expose vring_free_element() so that handle_notify() can
> call it without pushing bogus buffers back to the guest.
It's not pushing back bogus buffer, see the "if (vring->broken)" above.
But if you prefer handle_notify() to call vring_free_element(), I can
of course do that.
Paolo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement
2013-12-05 10:34 ` Paolo Bonzini
@ 2013-12-06 9:02 ` Stefan Hajnoczi
0 siblings, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2013-12-06 9:02 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: qemu-devel, Stefan Hajnoczi
On Thu, Dec 05, 2013 at 11:34:28AM +0100, Paolo Bonzini wrote:
> Il 05/12/2013 10:24, Stefan Hajnoczi ha scritto:
> >> >
> >> > That's what already happens actually. vring_push has
> >> >
> >> >
> >> > + g_slice_free(VirtQueueElement, elem);
> >> > +
> >> > /* Don't touch vring if a fatal error occurred */
> >> > if (vring->broken) {
> >> > return;
> >> >
> >> > in this patch and
> >> >
> >> > + for (i = 0; i < elem->out_num; i++) {
> >> > + vring_unmap(elem->out_sg[i].iov_base, false);
> >> > + }
> >> > +
> >> > + for (i = 0; i < elem->in_num; i++) {
> >> > + vring_unmap(elem->in_sg[i].iov_base, true);
> >> > + }
> >> >
> >> > g_slice_free(VirtQueueElement, elem);
> >> >
> >> > in the next one.
> >> >
> >> > Though I admit vring_push isn't such a great name and API. I can add
> >> > instead a vring_free_element function. Do you think vring_push should
> >> > call it, or should the caller do that?
> > I think vring_push() should free the VirtQueueElement.
> >
> > We just need to expose vring_free_element() so that handle_notify() can
> > call it without pushing bogus buffers back to the guest.
>
> It's not pushing back bogus buffer, see the "if (vring->broken)" above.
> But if you prefer handle_notify() to call vring_free_element(), I can
> of course do that.
Ah, I missed that :). It would be clearer to call vring_free_element()
explicitly.
Stefan
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-12-06 9:03 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-10 15:07 [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 1/4] vring: create a common function to parse descriptors Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 2/4] vring: factor common code for error exits Paolo Bonzini
2013-10-10 20:53 ` Richard Henderson
2013-10-11 9:19 ` Paolo Bonzini
2013-10-10 15:07 ` [Qemu-devel] [PATCH 3/4] dataplane: change vring API to use VirtQueueElement Paolo Bonzini
2013-12-04 14:06 ` Stefan Hajnoczi
2013-12-04 17:40 ` Paolo Bonzini
2013-12-05 9:24 ` Stefan Hajnoczi
2013-12-05 10:34 ` Paolo Bonzini
2013-12-06 9:02 ` Stefan Hajnoczi
2013-10-10 15:07 ` [Qemu-devel] [PATCH 4/4] dataplane: replace hostmem with memory_region_find Paolo Bonzini
2013-12-04 14:12 ` [Qemu-devel] [PATCH 0/4] dataplane: use more of the generic virtio data structures, drop hostmem Stefan Hajnoczi
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.