* [Qemu-devel] [PATCH 0/3] virtio-scsi: Fix unsafe bdrv_set_aio_context calls
@ 2015-02-12 5:20 Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL Fam Zheng
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Fam Zheng @ 2015-02-12 5:20 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Paolo Bonzini, stefanha
It is not safe to call bdrv_set_aio_context outside the BQL (see patch 1), so
move the call to the main thread (patch 3). As a bonus, the two cmd queue
handling loops in virtio-scsi.c and virtio-scsi-dataplane.c are converged
(patch 2).
Fam Zheng (3):
block: Forbid bdrv_set_aio_context outside BQL
virtio-scsi: Deduplicate cmd queue handling code of dataplane
virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
hw/scsi/virtio-scsi-dataplane.c | 26 +++----
hw/scsi/virtio-scsi.c | 153 ++++++++++++++++++++++++++++++++++------
include/block/block.h | 3 +-
include/hw/virtio/virtio-scsi.h | 11 ++-
4 files changed, 156 insertions(+), 37 deletions(-)
--
1.9.3
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL
2015-02-12 5:20 [Qemu-devel] [PATCH 0/3] virtio-scsi: Fix unsafe bdrv_set_aio_context calls Fam Zheng
@ 2015-02-12 5:21 ` Fam Zheng
2015-02-13 13:10 ` Paolo Bonzini
2015-03-10 13:34 ` Stefan Hajnoczi
2015-02-12 5:21 ` [Qemu-devel] [PATCH 2/3] virtio-scsi: Deduplicate cmd queue handling code of dataplane Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context Fam Zheng
2 siblings, 2 replies; 13+ messages in thread
From: Fam Zheng @ 2015-02-12 5:21 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Paolo Bonzini, stefanha
Even if the caller holds the old #AioContext, there can be a deadlock due
to the bdrv_drain_all() call at the start of bdrv_set_aio_context:
Suppose there are three io threads (a, b, c) with each owning a BDS
(bds_a, bds_b, bds_c), and a and b want to move their own BDS to c at
the same time:
iothread a iothread b
--------------------------------------------------------------------------
bdrv_set_aio_context(bds_a, c) bdrv_set_aio_context(bds_b, c)
-> bdrv_drain_all() -> bdrv_drain_all()
-> acquire a (OK, already has) -> acquire a (blocked)
-> acquire b (blocked) -> acquire b
-> acquire c -> acquire c
The current caller of bdrv_set_aio_context outside the BQL is
virtio-scsi-dataplane, which will be fixed in the next patches.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
include/block/block.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/block/block.h b/include/block/block.h
index 321295e..4fce25d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -546,8 +546,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
* Changes the #AioContext used for fd handlers, timers, and BHs by this
* BlockDriverState and all its children.
*
- * This function must be called from the old #AioContext or with a lock held so
- * the old #AioContext is not executing.
+ * This function must be called with iothread lock held.
*/
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
--
1.9.3
* [Qemu-devel] [PATCH 2/3] virtio-scsi: Deduplicate cmd queue handling code of dataplane
2015-02-12 5:20 [Qemu-devel] [PATCH 0/3] virtio-scsi: Fix unsafe bdrv_set_aio_context calls Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL Fam Zheng
@ 2015-02-12 5:21 ` Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context Fam Zheng
2 siblings, 0 replies; 13+ messages in thread
From: Fam Zheng @ 2015-02-12 5:21 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Paolo Bonzini, stefanha
The common logic of cmd handling will grow; right now it is duplicated in
two places, so converge it into one.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
hw/scsi/virtio-scsi-dataplane.c | 15 +++------------
hw/scsi/virtio-scsi.c | 27 ++++++++++++++++++---------
include/hw/virtio/virtio-scsi.h | 8 ++++++--
3 files changed, 27 insertions(+), 23 deletions(-)
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 03a1e8c..4a9b9bd 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -137,20 +137,11 @@ static void virtio_scsi_iothread_handle_cmd(EventNotifier *notifier)
{
VirtIOSCSIVring *vring = container_of(notifier,
VirtIOSCSIVring, host_notifier);
- VirtIOSCSI *s = (VirtIOSCSI *)vring->parent;
- VirtIOSCSIReq *req, *next;
- QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
event_notifier_test_and_clear(notifier);
- while ((req = virtio_scsi_pop_req_vring(s, vring))) {
- if (virtio_scsi_handle_cmd_req_prepare(s, req)) {
- QTAILQ_INSERT_TAIL(&reqs, req, next);
- }
- }
-
- QTAILQ_FOREACH_SAFE(req, &reqs, next, next) {
- virtio_scsi_handle_cmd_req_submit(s, req);
- }
+ virtio_scsi_handle_cmd_do(vring->parent,
+ (VirtIOSCSIPopFunc)virtio_scsi_pop_req_vring,
+ vring);
}
/* assumes s->ctx held */
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 9e2c718..ae8b79a 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -517,7 +517,8 @@ static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
virtio_scsi_complete_cmd_req(req);
}
-bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
+static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s,
+ VirtIOSCSIReq *req)
{
VirtIOSCSICommon *vs = &s->parent_obj;
SCSIDevice *d;
@@ -561,7 +562,7 @@ bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
return true;
}
-void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req)
+static void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req)
{
SCSIRequest *sreq = req->sreq;
if (scsi_req_enqueue(sreq)) {
@@ -571,18 +572,15 @@ void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req)
scsi_req_unref(sreq);
}
-static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
+void virtio_scsi_handle_cmd_do(VirtIOSCSI *s,
+ VirtIOSCSIPopFunc pop_req,
+ void *opaque)
{
/* use non-QOM casts in the data path */
- VirtIOSCSI *s = (VirtIOSCSI *)vdev;
VirtIOSCSIReq *req, *next;
QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
- if (s->ctx && !s->dataplane_disabled) {
- virtio_scsi_dataplane_start(s);
- return;
- }
- while ((req = virtio_scsi_pop_req(s, vq))) {
+ while ((req = pop_req(s, opaque))) {
if (virtio_scsi_handle_cmd_req_prepare(s, req)) {
QTAILQ_INSERT_TAIL(&reqs, req, next);
}
@@ -593,6 +591,17 @@ static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
}
}
+static void virtio_scsi_handle_cmd(VirtIODevice *vdev, VirtQueue *vq)
+{
+ VirtIOSCSI *s = (VirtIOSCSI *)vdev;
+
+ if (s->ctx && !s->dataplane_disabled) {
+ virtio_scsi_dataplane_start(s);
+ return;
+ }
+ virtio_scsi_handle_cmd_do(s, (VirtIOSCSIPopFunc)virtio_scsi_pop_req, vq);
+}
+
static void virtio_scsi_get_config(VirtIODevice *vdev,
uint8_t *config)
{
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index bf17cc9..cba82ea 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -268,8 +268,12 @@ void virtio_scsi_common_realize(DeviceState *dev, Error **errp,
void virtio_scsi_common_unrealize(DeviceState *dev, Error **errp);
void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req);
-bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req);
-void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req);
+
+typedef VirtIOSCSIReq *(*VirtIOSCSIPopFunc)(VirtIOSCSI *, void *);
+void virtio_scsi_handle_cmd_do(VirtIOSCSI *s,
+ VirtIOSCSIPopFunc pop_req,
+ void *opaque);
+
VirtIOSCSIReq *virtio_scsi_init_req(VirtIOSCSI *s, VirtQueue *vq);
void virtio_scsi_free_req(VirtIOSCSIReq *req);
void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
--
1.9.3
* [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-12 5:20 [Qemu-devel] [PATCH 0/3] virtio-scsi: Fix unsafe bdrv_set_aio_context calls Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL Fam Zheng
2015-02-12 5:21 ` [Qemu-devel] [PATCH 2/3] virtio-scsi: Deduplicate cmd queue handling code of dataplane Fam Zheng
@ 2015-02-12 5:21 ` Fam Zheng
2015-02-12 14:29 ` Paolo Bonzini
2 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2015-02-12 5:21 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, Paolo Bonzini, stefanha
Before processing a request, virtio-scsi dataplane checks whether the
backend runs in the same AioContext as it does. If not, the backend has to
be moved there with bdrv_set_aio_context.
However, this function is unsafe to call from an IOThread outside the BQL,
because it calls bdrv_drain_all(), which acquires and drains all existing
BDSes; this can deadlock.
Fix it by offloading the bdrv_set_aio_context call to the main loop thread
through a BH (#1). This main loop BH sets the context, then notifies the
calling thread with another BH (#2), in which we continue the virtio-scsi
request processing.
A queue is added to VirtIOSCSI to track the pending requests; in
virtio_scsi_dataplane_stop we have to drain it for migration.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
hw/scsi/virtio-scsi-dataplane.c | 11 ++++
hw/scsi/virtio-scsi.c | 132 +++++++++++++++++++++++++++++++++++-----
include/hw/virtio/virtio-scsi.h | 3 +-
3 files changed, 131 insertions(+), 15 deletions(-)
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 4a9b9bd..3bc7e82 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -283,6 +283,17 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
}
+ while (true) {
+ if (QTAILQ_EMPTY(&s->pending_cmd_reqs)) {
+ break;
+ }
+ aio_context_release(s->ctx);
+ if (!aio_poll(qemu_get_aio_context(), false)) {
+ usleep(500000);
+ }
+ aio_context_acquire(s->ctx);
+ }
+
blk_drain_all(); /* ensure there are no in-flight requests */
aio_context_release(s->ctx);
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index ae8b79a..a842862 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -243,6 +243,54 @@ static void virtio_scsi_cancel_notify(Notifier *notifier, void *data)
g_slice_free(VirtIOSCSICancelNotifier, n);
}
+typedef struct {
+ QEMUBH *bh;
+ BlockBackend *blk;
+ AioContext *new_context;
+ QEMUBHFunc *cb;
+ void *data;
+} PullAioContextInfo;
+
+static void set_aio_context_cb(void *opaque)
+{
+ PullAioContextInfo *info = opaque;
+
+ aio_context_acquire(info->new_context);
+ blk_set_aio_context(info->blk, info->new_context);
+ aio_context_release(info->new_context);
+ qemu_bh_delete(info->bh);
+ info->bh = aio_bh_new(info->new_context, info->cb, info);
+ qemu_bh_schedule(info->bh);
+}
+
+/* Schedule a BH on main thread, and let it handle the context movement. Once
+ * that is done, we are notified by @cb with a second BH on current thread.
+ */
+static void pull_aio_context(BlockBackend *blk, AioContext *ctx,
+ void (*cb)(void *), void *data)
+{
+ PullAioContextInfo *info = g_new(PullAioContextInfo, 1);
+
+ info->blk = blk;
+ info->new_context = ctx;
+ info->cb = cb;
+ info->data = data;
+ info->bh = aio_bh_new(qemu_get_aio_context(), set_aio_context_cb, info);
+ qemu_bh_schedule(info->bh);
+}
+
+static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req);
+static void virtio_scsi_tmf_continue(void *opaque)
+{
+ PullAioContextInfo *info = opaque;
+ VirtIOSCSIReq *req = info->data;
+ VirtIOSCSI *s = req->dev;
+
+ qemu_bh_delete(info->bh);
+ g_free(info);
+ virtio_scsi_do_tmf(s, req);
+}
+
/* Return 0 if the request is ready to be completed and return to guest;
* -EINPROGRESS if the request is submitted and will be completed later, in the
* case of async cancellation. */
@@ -255,9 +303,8 @@ static int virtio_scsi_do_tmf(VirtIOSCSI *s, VirtIOSCSIReq *req)
int ret = 0;
if (s->dataplane_started && blk_get_aio_context(d->conf.blk) != s->ctx) {
- aio_context_acquire(s->ctx);
- blk_set_aio_context(d->conf.blk, s->ctx);
- aio_context_release(s->ctx);
+ pull_aio_context(d->conf.blk, s->ctx, virtio_scsi_tmf_continue, req);
+ return -EINPROGRESS;
}
/* Here VIRTIO_SCSI_S_OK means "FUNCTION COMPLETE". */
req->resp.tmf.response = VIRTIO_SCSI_S_OK;
@@ -517,8 +564,38 @@ static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
virtio_scsi_complete_cmd_req(req);
}
-static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s,
- VirtIOSCSIReq *req)
+static void virtio_scsi_cmd_continue(void *opaque)
+{
+ PullAioContextInfo *info = opaque;
+ VirtIOSCSIReq *req = info->data;
+ VirtIOSCSI *s = req->dev;
+
+ qemu_bh_delete(info->bh);
+ g_free(info);
+ if (QTAILQ_FIRST(&s->pending_cmd_reqs)) {
+ if (req->vring) {
+ virtio_scsi_handle_cmd_do(s,
+ (VirtIOSCSIPopFunc)virtio_scsi_pop_req_vring,
+ req->vring);
+ } else {
+ virtio_scsi_handle_cmd_do(s,
+ (VirtIOSCSIPopFunc)virtio_scsi_pop_req,
+ req->vq);
+ }
+ }
+}
+
+/*
+ * virtio_scsi_handle_cmd_req_prepare:
+ *
+ * Prepare the command request before enqueueing it, or complete it if nothing
+ * to do.
+ *
+ * Returns: 0 if the command is completed.
+ * -EINPROGRESS if the command should be enqueued to scsi bus.
+ * -EAGAIN if the command is not ready to be processed.
+ */
+static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
{
VirtIOSCSICommon *vs = &s->parent_obj;
SCSIDevice *d;
@@ -532,19 +609,18 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s,
} else {
virtio_scsi_bad_req();
}
- return false;
+ return 0;
}
d = virtio_scsi_device_find(s, req->req.cmd.lun);
if (!d) {
req->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
virtio_scsi_complete_cmd_req(req);
- return false;
+ return 0;
}
if (s->dataplane_started && blk_get_aio_context(d->conf.blk) != s->ctx) {
- aio_context_acquire(s->ctx);
- blk_set_aio_context(d->conf.blk, s->ctx);
- aio_context_release(s->ctx);
+ pull_aio_context(d->conf.blk, s->ctx, virtio_scsi_cmd_continue, req);
+ return -EAGAIN;
}
req->sreq = scsi_req_new(d, req->req.cmd.tag,
virtio_scsi_get_lun(req->req.cmd.lun),
@@ -555,11 +631,11 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s,
req->sreq->cmd.xfer > req->qsgl.size)) {
req->resp.cmd.response = VIRTIO_SCSI_S_OVERRUN;
virtio_scsi_complete_cmd_req(req);
- return false;
+ return 0;
}
scsi_req_ref(req->sreq);
blk_io_plug(d->conf.blk);
- return true;
+ return -EINPROGRESS;
}
static void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req)
@@ -580,12 +656,38 @@ void virtio_scsi_handle_cmd_do(VirtIOSCSI *s,
VirtIOSCSIReq *req, *next;
QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
- while ((req = pop_req(s, opaque))) {
- if (virtio_scsi_handle_cmd_req_prepare(s, req)) {
+ while (true) {
+ int r;
+
+ req = QTAILQ_FIRST(&s->pending_cmd_reqs);
+ if (req) {
+ QTAILQ_REMOVE(&s->pending_cmd_reqs, req, next);
+ } else {
+ req = pop_req(s, opaque);
+ }
+ if (!req) {
+ break;
+ }
+ r = virtio_scsi_handle_cmd_req_prepare(s, req);
+ switch (r) {
+ case 0:
+ break;
+ case -EINPROGRESS:
QTAILQ_INSERT_TAIL(&reqs, req, next);
+ break;
+ case -EAGAIN:
+ /* Only used for aio context switching, impossible for
+ * non-dataplane code */
+ assert(req->vring);
+ QTAILQ_INSERT_TAIL(&s->pending_cmd_reqs, req, next);
+ goto out;
+ break;
+ default:
+ abort();
}
}
+out:
QTAILQ_FOREACH_SAFE(req, &reqs, next, next) {
virtio_scsi_handle_cmd_req_submit(s, req);
}
@@ -918,7 +1020,9 @@ static void virtio_scsi_device_realize(DeviceState *dev, Error **errp)
static void virtio_scsi_instance_init(Object *obj)
{
VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(obj);
+ VirtIOSCSI *s = VIRTIO_SCSI(obj);
+ QTAILQ_INIT(&s->pending_cmd_reqs);
object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
(Object **)&vs->conf.iothread,
qdev_prop_allow_set_link_before_realize,
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index cba82ea..bc0fead 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -185,6 +185,7 @@ typedef struct VirtIOSCSI {
/* Fields for dataplane below */
AioContext *ctx; /* one iothread per virtio-scsi-pci for now */
+ QTAILQ_HEAD(, VirtIOSCSIReq) pending_cmd_reqs;
/* Vring is used instead of vq in dataplane code, because of the underlying
* memory layer thread safety */
@@ -218,7 +219,7 @@ typedef struct VirtIOSCSIReq {
VirtIOSCSIVring *vring;
union {
- /* Used for two-stage request submission */
+ /* Used for two-stage request submission or suspension */
QTAILQ_ENTRY(VirtIOSCSIReq) next;
/* Used for cancellation of request during TMFs */
--
1.9.3
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-12 5:21 ` [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context Fam Zheng
@ 2015-02-12 14:29 ` Paolo Bonzini
2015-02-13 1:21 ` Fam Zheng
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-02-12 14:29 UTC (permalink / raw)
To: Fam Zheng, qemu-devel; +Cc: kwolf, stefanha
On 12/02/2015 06:21, Fam Zheng wrote:
> Before processing a request, virtio-scsi dataplane checks whether the
> backend runs in the same AioContext as it does. If not, the backend has to
> be moved there with bdrv_set_aio_context.
>
> However, this function is unsafe to call from an IOThread outside the BQL,
> because it calls bdrv_drain_all(), which acquires and drains all existing
> BDSes; this can deadlock.
>
> Fix it by offloading the bdrv_set_aio_context call to the main loop thread
> through a BH (#1). This main loop BH sets the context, then notifies the
> calling thread with another BH (#2), in which we continue the virtio-scsi
> request processing.
>
> A queue is added to VirtIOSCSI to track the pending requests; in
> virtio_scsi_dataplane_stop we have to drain it for migration.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
Could you just do set_aio_context for all devices when starting
dataplane? For example with a new scsi_device_set_aio_context function
(and a new method in SCSIDeviceClass).
Maybe I'm missing the obvious. :)
Paolo
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-12 14:29 ` Paolo Bonzini
@ 2015-02-13 1:21 ` Fam Zheng
2015-02-13 9:38 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2015-02-13 1:21 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha
On Thu, 02/12 15:29, Paolo Bonzini wrote:
>
>
> On 12/02/2015 06:21, Fam Zheng wrote:
> > Before processing a request, virtio-scsi dataplane checks whether the
> > backend runs in the same AioContext as it does. If not, the backend has to
> > be moved there with bdrv_set_aio_context.
> >
> > However, this function is unsafe to call from an IOThread outside the BQL,
> > because it calls bdrv_drain_all(), which acquires and drains all existing
> > BDSes; this can deadlock.
> >
> > Fix it by offloading the bdrv_set_aio_context call to the main loop thread
> > through a BH (#1). This main loop BH sets the context, then notifies the
> > calling thread with another BH (#2), in which we continue the virtio-scsi
> > request processing.
> >
> > A queue is added to VirtIOSCSI to track the pending requests; in
> > virtio_scsi_dataplane_stop we have to drain it for migration.
> >
> > Signed-off-by: Fam Zheng <famz@redhat.com>
>
> Could you just do set_aio_context for all devices when starting
> dataplane? For example with a new scsi_device_set_aio_context function
> (and a new method in SCSIDeviceClass).
>
> Maybe I'm missing the obvious. :)
>
Per-VQ IOThread still needs it, in case the guest switches the VQ of a
disk on the fly.
Fam
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-13 1:21 ` Fam Zheng
@ 2015-02-13 9:38 ` Paolo Bonzini
2015-02-13 10:29 ` Fam Zheng
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-02-13 9:38 UTC (permalink / raw)
To: Fam Zheng; +Cc: kwolf, qemu-devel, stefanha
On 13/02/2015 02:21, Fam Zheng wrote:
>> Could you just do set_aio_context for all devices when starting
>> dataplane? For example with a new scsi_device_set_aio_context function
>> (and a new method in SCSIDeviceClass).
>>
>> Maybe I'm missing the obvious. :)
>
> Per-VQ IOThread still needs it, in case the guest switches the VQ of a
> disk on the fly.
Per VQ iothread is far away though. Multiqueue aims at parallelizing
accesses *to the same disk from different CPUs*, not at parallelizing
accesses to different disks. As long as bdrv_set_aio_context does
bdrv_drain_all, something has to change for multiqueue dataplane: either
stop using bdrv_set_aio_context, or stop doing bdrv_drain_all.
In either case, the changes are large enough that we shouldn't code for
a case that doesn't exist yet.
In addition, making the code more similar for virtio-blk and virtio-scsi
dataplane is good, because the same changes can apply to both in the future.
Paolo
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-13 9:38 ` Paolo Bonzini
@ 2015-02-13 10:29 ` Fam Zheng
2015-02-13 10:38 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2015-02-13 10:29 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha
On Fri, 02/13 10:38, Paolo Bonzini wrote:
>
>
> On 13/02/2015 02:21, Fam Zheng wrote:
> >> Could you just do set_aio_context for all devices when starting
> >> dataplane? For example with a new scsi_device_set_aio_context function
> >> (and a new method in SCSIDeviceClass).
> >>
> >> Maybe I'm missing the obvious. :)
> >
> > Per-VQ IOThread still needs it, in case the guest switches the VQ of a
> > disk on the fly.
>
> Per VQ iothread is far away though. Multiqueue aims at parallelizing
> accesses *to the same disk from different CPUs*, not at parallelizing
> accesses to different disks. As long as bdrv_set_aio_context does
> bdrv_drain_all, something has to change for multiqueue dataplane: either
> stop using bdrv_set_aio_context, or stop doing bdrv_drain_all.
>
> In either case, the changes are large enough that we shouldn't code for
> a case that doesn't exist yet.
>
> In addition, making the code more similar for virtio-blk and virtio-scsi
> dataplane is good, because the same changes can apply to both in the future.
I think we should avoid duplicating everything between virtio-blk and
virtio-scsi, so they will have differences.
Why do you think per-VQ iothread is far away? Limiting the whole SCSI bus to
one thread doesn't sound like the ultimate solution to me. I think it's not
harder than the MMIO safety work we have, and it is also somewhat independent
of it.
But yes, not using bdrv_set_aio_context would be the other way to get it
right, just harder to do.
Fam
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-13 10:29 ` Fam Zheng
@ 2015-02-13 10:38 ` Paolo Bonzini
2015-02-13 12:42 ` Fam Zheng
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2015-02-13 10:38 UTC (permalink / raw)
To: Fam Zheng; +Cc: kwolf, qemu-devel, stefanha
On 13/02/2015 11:29, Fam Zheng wrote:
> I think we should avoid duplicating everything between virtio-blk and
> virtio-scsi, so they will have differences.
True, but there are also similarities: virtio-blk can also do per-VQ
iothreads.
> Why do you think per-VQ iothread is far away?
Because per-VQ iothread needs to use either fine-grained locks or,
especially in the format and protocol driver, no locks at all. No locks
at all applies especially to the raw case, where we can more easily
leverage kernel-side locks and thread-local storage.
Right now the lock is per-AioContext, but even if you made it
BlockBackend-grained lock, the iothreads will just contend on it and
each device won't get better performance than a single iothread. Making
a single backend faster is unfortunately an extremely important case; if
you have multiple backends you can already move them to separate
virtio-scsi controllers or virtio-blk devices.
We haven't even started thinking about what the design should look like,
so I think it's far away.
> Limiting the whole SCSI bus to
> one thread doesn't sound like the ultimate solution to me. I think it's not
> harder than the MMIO safety work we have, and it is also somewhat independent
> of it.
I'm not sure it's independent. While the MMIO safety work does not
imply using fine-grained locks, it probably(*) implies using
fine-grained critical sections. Fine-grained critical sections are
probably(**) a subset of the work needed for fine-grained locks or also
for lockless operation.
(*) probably = couldn't think of a better way
(**) probably = I haven't even thought about it
Paolo
> But yes, not using bdrv_set_aio_context would be the other way to get it
> right, just harder to do.
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-13 10:38 ` Paolo Bonzini
@ 2015-02-13 12:42 ` Fam Zheng
2015-02-13 13:12 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Fam Zheng @ 2015-02-13 12:42 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, qemu-devel, stefanha
On Fri, 02/13 11:38, Paolo Bonzini wrote:
>
>
> On 13/02/2015 11:29, Fam Zheng wrote:
> > I think we should avoid duplicating everything between virtio-blk and
> > virtio-scsi, so they will have differences.
>
> True, but there are also similarities: virtio-blk can also do per-VQ
> iothreads.
>
> > Why do you think per-VQ iothread is far away?
>
> Because per-VQ iothread needs to use either fine-grained locks or,
> especially in the format and protocol driver, no locks at all. No locks
> at all applies especially to the raw case, where we can more easily
> leverage kernel-side locks and thread-local storage.
>
> Right now the lock is per-AioContext, but even if you made it
> BlockBackend-grained lock, the iothreads will just contend on it and
> each device won't get better performance than a single iothread. Making
> a single backend faster is unfortunately an extremely important case; if
> you have multiple backends you can already move them to separate
> virtio-scsi controllers or virtio-blk devices.
>
> We haven't even started thinking about what the design should look like,
> so I think it's far away.
>
> > Limiting the whole SCSI bus to
> > one thread doesn't sound like the ultimate solution to me. I think it's not
> > harder than the MMIO safety work we have, and it is also somewhat independent
> > of it.
>
> I'm not sure it's independent. While the MMIO safety work does not
> imply using fine-grained locks, it probably(*) implies using
> fine-grained critical sections. Fine-grained critical sections are
> probably(**) a subset of the work needed for fine-grained locks or also
> for lockless operation.
>
> (*) probably = couldn't think of a better way
>
> (**) probably = I haven't even thought about it
>
> Paolo
>
OK, thanks for elaborating. For the sake of the single-IOThread support we
have already started, I think the best option now is to go as you suggested:
move bdrv_set_aio_context to virtio_scsi_dataplane_start and the hotplug
callbacks.
Could you review patch 1? (And do we want patch 2?)
Thanks,
Fam
* Re: [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL
2015-02-12 5:21 ` [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL Fam Zheng
@ 2015-02-13 13:10 ` Paolo Bonzini
2015-03-10 13:34 ` Stefan Hajnoczi
1 sibling, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-02-13 13:10 UTC (permalink / raw)
To: Fam Zheng, qemu-devel; +Cc: kwolf, stefanha
On 12/02/2015 06:21, Fam Zheng wrote:
> Even if the caller holds the old #AioContext, there can be a deadlock due
> to the bdrv_drain_all() call at the start of bdrv_set_aio_context:
>
> Suppose there are three io threads (a, b, c) with each owning a BDS
> (bds_a, bds_b, bds_c), and a and b want to move their own BDS to c at
> the same time:
>
> iothread a iothread b
> --------------------------------------------------------------------------
> bdrv_set_aio_context(bds_a, c) bdrv_set_aio_context(bds_b, c)
> -> bdrv_drain_all() -> bdrv_drain_all()
> -> acquire a (OK, already has) -> acquire a (blocked)
> -> acquire b (blocked) -> acquire b
> -> acquire c -> acquire c
>
> The current caller of bdrv_set_aio_context outside the BQL is
> virtio-scsi-dataplane, which will be fixed in the next patches.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> include/block/block.h | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/include/block/block.h b/include/block/block.h
> index 321295e..4fce25d 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -546,8 +546,7 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs);
> * Changes the #AioContext used for fd handlers, timers, and BHs by this
> * BlockDriverState and all its children.
> *
> - * This function must be called from the old #AioContext or with a lock held so
> - * the old #AioContext is not executing.
> + * This function must be called with iothread lock held.
> */
> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
>
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [Qemu-devel] [PATCH 3/3] virtio-scsi-dataplane: Use main thread BH to set BDS' aio context
2015-02-13 12:42 ` Fam Zheng
@ 2015-02-13 13:12 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2015-02-13 13:12 UTC (permalink / raw)
To: Fam Zheng; +Cc: kwolf, qemu-devel, stefanha
On 13/02/2015 13:42, Fam Zheng wrote:
> OK, thanks for elaborating. For the sake of the single-IOThread support we
> have already started, I think the best option now is to go as you suggested:
> move bdrv_set_aio_context to virtio_scsi_dataplane_start and the hotplug
> callbacks.
Yes, no doubt the bug has to be fixed.
> Could you review patch 1? (And do we want patch 2?)
I don't know. Theoretically yes; in practice the aim is to get rid of
vring, so the benefit is limited in time...
Patch 1 is okay of course. I prefer to document things only once the
code actually does what the documentation says, but Kevin and Stefan are
free to pick it up if they want to.
Paolo
* Re: [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL
2015-02-12 5:21 ` [Qemu-devel] [PATCH 1/3] block: Forbid bdrv_set_aio_context outside BQL Fam Zheng
2015-02-13 13:10 ` Paolo Bonzini
@ 2015-03-10 13:34 ` Stefan Hajnoczi
1 sibling, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2015-03-10 13:34 UTC (permalink / raw)
To: Fam Zheng; +Cc: kwolf, Paolo Bonzini, qemu-devel
On Thu, Feb 12, 2015 at 01:21:00PM +0800, Fam Zheng wrote:
> Even if the caller holds the old #AioContext, there can be a deadlock due
> to the bdrv_drain_all() call at the start of bdrv_set_aio_context:
>
> Suppose there are three io threads (a, b, c) with each owning a BDS
> (bds_a, bds_b, bds_c), and a and b want to move their own BDS to c at
> the same time:
>
> iothread a iothread b
> --------------------------------------------------------------------------
> bdrv_set_aio_context(bds_a, c) bdrv_set_aio_context(bds_b, c)
> -> bdrv_drain_all() -> bdrv_drain_all()
> -> acquire a (OK, already has) -> acquire a (blocked)
> -> acquire b (blocked) -> acquire b
> -> acquire c -> acquire c
>
> The current caller of bdrv_set_aio_context outside the BQL is
> virtio-scsi-dataplane, which will be fixed in the next patches.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> include/block/block.h | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>