* [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 11:42 ` Paolo Bonzini
2015-06-25 12:19 ` [Qemu-devel] [PATCH for-2.4 " Paolo Bonzini
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 02/12] block: Introduce bdrv_lock and bdrv_unlock API Fam Zheng
` (11 subsequent siblings)
12 siblings, 2 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
These callers work on a single BlockDriverState subtree, where using
bdrv_drain() is more accurate.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block.c | 6 +++---
block/snapshot.c | 2 +-
migration/block.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/block.c b/block.c
index f42d70e..e9f31b7 100644
--- a/block.c
+++ b/block.c
@@ -1719,9 +1719,9 @@ void bdrv_close(BlockDriverState *bs)
if (bs->job) {
block_job_cancel_sync(bs->job);
}
- bdrv_drain_all(); /* complete I/O */
+ bdrv_drain(bs); /* complete I/O */
bdrv_flush(bs);
- bdrv_drain_all(); /* in case flush left pending I/O */
+ bdrv_drain(bs); /* in case flush left pending I/O */
notifier_list_notify(&bs->close_notifiers, bs);
if (bs->drv) {
@@ -3726,7 +3726,7 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
{
- bdrv_drain_all(); /* ensure there are no in-flight requests */
+ bdrv_drain(bs); /* ensure there are no in-flight requests */
bdrv_detach_aio_context(bs);
diff --git a/block/snapshot.c b/block/snapshot.c
index 50ae610..d56ec14 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -238,7 +238,7 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
}
/* drain all pending i/o before deleting snapshot */
- bdrv_drain_all();
+ bdrv_drain(bs);
if (drv->bdrv_snapshot_delete) {
return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
diff --git a/migration/block.c b/migration/block.c
index ddb59cc..ed865ed 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -457,7 +457,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
blk_mig_lock();
if (bmds_aio_inflight(bmds, sector)) {
blk_mig_unlock();
- bdrv_drain_all();
+ bdrv_drain(bmds->bs);
} else {
blk_mig_unlock();
}
--
2.4.2
* Re: [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all Fam Zheng
@ 2015-05-29 11:42 ` Paolo Bonzini
2015-06-25 12:19 ` [Qemu-devel] [PATCH for-2.4 " Paolo Bonzini
1 sibling, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-29 11:42 UTC (permalink / raw)
To: Fam Zheng, qemu-devel
Cc: Kevin Wolf, fam, Jeff Cody, qemu-block, Stefan Hajnoczi
On 29/05/2015 12:53, Fam Zheng wrote:
> These callers work on a single BlockDriverState subtree, where using
> bdrv_drain() is more accurate.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> block.c | 6 +++---
> block/snapshot.c | 2 +-
> migration/block.c | 2 +-
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/block.c b/block.c
> index f42d70e..e9f31b7 100644
> --- a/block.c
> +++ b/block.c
> @@ -1719,9 +1719,9 @@ void bdrv_close(BlockDriverState *bs)
> if (bs->job) {
> block_job_cancel_sync(bs->job);
> }
> - bdrv_drain_all(); /* complete I/O */
> + bdrv_drain(bs); /* complete I/O */
> bdrv_flush(bs);
> - bdrv_drain_all(); /* in case flush left pending I/O */
> + bdrv_drain(bs); /* in case flush left pending I/O */
> notifier_list_notify(&bs->close_notifiers, bs);
>
> if (bs->drv) {
> @@ -3726,7 +3726,7 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
>
> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
> {
> - bdrv_drain_all(); /* ensure there are no in-flight requests */
> + bdrv_drain(bs); /* ensure there are no in-flight requests */
>
> bdrv_detach_aio_context(bs);
>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 50ae610..d56ec14 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -238,7 +238,7 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
> }
>
> /* drain all pending i/o before deleting snapshot */
> - bdrv_drain_all();
> + bdrv_drain(bs);
>
> if (drv->bdrv_snapshot_delete) {
> return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
> diff --git a/migration/block.c b/migration/block.c
> index ddb59cc..ed865ed 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -457,7 +457,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
> blk_mig_lock();
> if (bmds_aio_inflight(bmds, sector)) {
> blk_mig_unlock();
> - bdrv_drain_all();
> + bdrv_drain(bmds->bs);
> } else {
> blk_mig_unlock();
> }
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
* Re: [Qemu-devel] [PATCH for-2.4 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all Fam Zheng
2015-05-29 11:42 ` Paolo Bonzini
@ 2015-06-25 12:19 ` Paolo Bonzini
1 sibling, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-06-25 12:19 UTC (permalink / raw)
To: Fam Zheng, qemu-devel; +Cc: fam, Kevin Wolf, Stefan Hajnoczi, qemu-block
Kevin, can you queue this patch for 2.4?
Paolo
On 29/05/2015 12:53, Fam Zheng wrote:
> These callers work on a single BlockDriverState subtree, where using
> bdrv_drain() is more accurate.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> block.c | 6 +++---
> block/snapshot.c | 2 +-
> migration/block.c | 2 +-
> 3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/block.c b/block.c
> index f42d70e..e9f31b7 100644
> --- a/block.c
> +++ b/block.c
> @@ -1719,9 +1719,9 @@ void bdrv_close(BlockDriverState *bs)
> if (bs->job) {
> block_job_cancel_sync(bs->job);
> }
> - bdrv_drain_all(); /* complete I/O */
> + bdrv_drain(bs); /* complete I/O */
> bdrv_flush(bs);
> - bdrv_drain_all(); /* in case flush left pending I/O */
> + bdrv_drain(bs); /* in case flush left pending I/O */
> notifier_list_notify(&bs->close_notifiers, bs);
>
> if (bs->drv) {
> @@ -3726,7 +3726,7 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
>
> void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
> {
> - bdrv_drain_all(); /* ensure there are no in-flight requests */
> + bdrv_drain(bs); /* ensure there are no in-flight requests */
>
> bdrv_detach_aio_context(bs);
>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 50ae610..d56ec14 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -238,7 +238,7 @@ int bdrv_snapshot_delete(BlockDriverState *bs,
> }
>
> /* drain all pending i/o before deleting snapshot */
> - bdrv_drain_all();
> + bdrv_drain(bs);
>
> if (drv->bdrv_snapshot_delete) {
> return drv->bdrv_snapshot_delete(bs, snapshot_id, name, errp);
> diff --git a/migration/block.c b/migration/block.c
> index ddb59cc..ed865ed 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -457,7 +457,7 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
> blk_mig_lock();
> if (bmds_aio_inflight(bmds, sector)) {
> blk_mig_unlock();
> - bdrv_drain_all();
> + bdrv_drain(bmds->bs);
> } else {
> blk_mig_unlock();
> }
>
* [Qemu-devel] [RFC PATCH 02/12] block: Introduce bdrv_lock and bdrv_unlock API
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 16:57 ` Eric Blake
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 03/12] blockdev: Lock BDS during internal snapshot transaction Fam Zheng
` (10 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
For various purposes, BDS users call bdrv_drain or bdrv_drain_all to make sure
there are no pending requests duringA a series of operations on the BDS. But in
the middle of operations, the caller may 1) yield from a coroutine (mirror_run);
2) defer the next part of work to a BH (mirror_run); 3) call nested aio_poll
(qmp_transaction); etc..
This lock/unlock API is introduced to help ensure that the above complications
won't spoil the purpose of the bdrv_drain(): bdrv_lock should help quiesce other
readers and writers at the beginning of such operations, and bdrv_unlock should
resume the blocked requests.
A notifier list is added to allow devices to cooperate with the lock and pause
themselves, for example, by not processing more requests on the NBD export.
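As a rough sketch of the intended usage (the handler and the operation in the
middle are only illustrative, not part of this series):
    /* device side: get notified when the BDS is locked or unlocked */
    static Notifier pause_notifier;
    static void device_pause_handler(Notifier *n, void *data)
    {
        BdrvLockEvent *event = data;
        /* pause request processing while event->locking is true,
         * resume it when event->locking is false */
    }
    ...
    pause_notifier.notify = device_pause_handler;
    bdrv_add_lock_unlock_notifier(bs, &pause_notifier);
    /* caller side: an exclusive section that may yield, defer work to a
     * BH or call nested aio_poll() */
    bdrv_lock(bs);              /* notify, then bdrv_drain(bs) */
    do_snapshot_or_backup(bs);  /* hypothetical multi-step operation */
    bdrv_unlock(bs);            /* wake the next waiter, or notify unlock */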
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block.c | 10 +++++++
block/io.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++
include/block/block.h | 39 +++++++++++++++++++++++++++
include/block/block_int.h | 5 ++++
4 files changed, 123 insertions(+)
diff --git a/block.c b/block.c
index e9f31b7..abda2f7 100644
--- a/block.c
+++ b/block.c
@@ -252,8 +252,10 @@ BlockDriverState *bdrv_new(void)
bdrv_iostatus_disable(bs);
notifier_list_init(&bs->close_notifiers);
notifier_with_return_list_init(&bs->before_write_notifiers);
+ notifier_list_init(&bs->lock_notifiers);
qemu_co_queue_init(&bs->throttled_reqs[0]);
qemu_co_queue_init(&bs->throttled_reqs[1]);
+ qemu_co_queue_init(&bs->lock_queue);
bs->refcnt = 1;
bs->aio_context = qemu_get_aio_context();
@@ -1716,6 +1718,7 @@ void bdrv_close(BlockDriverState *bs)
{
BdrvAioNotifier *ban, *ban_next;
+ assert(!bdrv_is_locked(bs));
if (bs->job) {
block_job_cancel_sync(bs->job);
}
@@ -1846,12 +1849,19 @@ static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
/* job */
bs_dest->job = bs_src->job;
+ /* lock */
+ bs_dest->lock_owner = bs_src->lock_owner;
+ bs_dest->lock_level = bs_src->lock_level;
+ bs_dest->lock_queue = bs_src->lock_queue;
+ bs_dest->lock_notifiers = bs_src->lock_notifiers;
+
/* keep the same entry in bdrv_states */
bs_dest->device_list = bs_src->device_list;
bs_dest->blk = bs_src->blk;
memcpy(bs_dest->op_blockers, bs_src->op_blockers,
sizeof(bs_dest->op_blockers));
+
}
/*
diff --git a/block/io.c b/block/io.c
index e394d92..9aa4b71 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2601,3 +2601,72 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
bdrv_flush_io_queue(bs->file);
}
}
+
+static void bdrv_lock_notify(BlockDriverState *bs, bool locking)
+{
+ BdrvLockEvent event = (BdrvLockEvent) {
+ .bs = bs,
+ .locking = locking,
+ };
+ notifier_list_notify(&bs->lock_notifiers, &event);
+}
+
+void bdrv_lock(BlockDriverState *bs)
+{
+ Coroutine *self = qemu_coroutine_self();
+ bool notify = true;
+
+ /*
+ * XXX: eventually we only allow coroutine callers. For now, let's allow
+ * the exceptional non-coroutine callers to serialize by themselves, e.g.
+ * with BQL.
+ */
+ assert(qemu_in_coroutine()
+ || self == bs->lock_owner || bs->lock_level == 0);
+
+ if (bs->lock_level) {
+ if (self == bs->lock_owner) {
+ bs->lock_level++;
+ return;
+ } else {
+ qemu_co_queue_wait(&bs->lock_queue);
+ notify = false;
+ }
+ }
+ assert(bs->lock_level == 0);
+
+ if (notify) {
+ bdrv_lock_notify(bs, true);
+ }
+ bs->lock_level++;
+ bs->lock_owner = self;
+
+ bdrv_drain(bs);
+}
+
+void bdrv_unlock(BlockDriverState *bs)
+{
+ assert(bs->lock_level > 0);
+ if (!--bs->lock_level) {
+ if (!qemu_co_queue_empty(&bs->lock_queue)) {
+ /*
+ * XXX: do we need a BH to run lock_queue?
+ * If so, be careful of bdrv_set_aio_context().
+ **/
+ qemu_co_queue_next(&bs->lock_queue);
+ } else {
+ bdrv_lock_notify(bs, false);
+ }
+ }
+}
+
+bool bdrv_is_locked(BlockDriverState *bs)
+{
+ assert((bs->lock_level == 0) == qemu_co_queue_empty(&bs->lock_queue));
+ return !!bs->lock_level;
+}
+
+void bdrv_add_lock_unlock_notifier(BlockDriverState *bs, Notifier *notifier)
+{
+ notifier_list_add(&bs->lock_notifiers, notifier);
+}
diff --git a/include/block/block.h b/include/block/block.h
index c1c963e..068b01e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -591,6 +591,45 @@ void bdrv_io_plug(BlockDriverState *bs);
void bdrv_io_unplug(BlockDriverState *bs);
void bdrv_flush_io_queue(BlockDriverState *bs);
+/**
+ * bdrv_lock:
+ *
+ * Begin temporary exclusive access by locking the BDS.
+ *
+ * This lock is recursive: if bs is unlocked, the caller context will acquire
+ * the lock; otherwise if the caller is in the same coroutine context that
+ * already holds the lock, it will only add a recursion level; otherwise, this
+ * function will block until the lock is released by the other owner.
+ */
+void bdrv_lock(BlockDriverState *bs);
+
+/**
+ * bdrv_unlock:
+ *
+ * Reduce the recursion level, or if it's the outermost unlock, release the lock.
+ */
+void bdrv_unlock(BlockDriverState *bs);
+
+/**
+ * bdrv_is_locked:
+ *
+ * Return whether the bs is locked.
+ */
+bool bdrv_is_locked(BlockDriverState *bs);
+
+typedef struct {
+ BlockDriverState *bs;
+ bool locking;
+} BdrvLockEvent;
+
+/**
+ * bdrv_add_lock_unlock_notifier:
+ *
+ * Add a notifier that will get notified when bs is locked or unlocked, with a
+ * BdrvLockEvent data.
+ */
+void bdrv_add_lock_unlock_notifier(BlockDriverState *bs, Notifier *notifier);
+
BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
#endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f004378..a742fea 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -433,6 +433,11 @@ struct BlockDriverState {
/* threshold limit for writes, in bytes. "High water mark". */
uint64_t write_threshold_offset;
NotifierWithReturn write_threshold_notifier;
+
+ Coroutine *lock_owner;
+ int lock_level;
+ CoQueue lock_queue;
+ NotifierList lock_notifiers;
};
--
2.4.2
* Re: [Qemu-devel] [RFC PATCH 02/12] block: Introduce bdrv_lock and bdrv_unlock API
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 02/12] block: Introduce bdrv_lock and bdrv_unlock API Fam Zheng
@ 2015-05-29 16:57 ` Eric Blake
0 siblings, 0 replies; 18+ messages in thread
From: Eric Blake @ 2015-05-29 16:57 UTC (permalink / raw)
To: Fam Zheng, qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
On 05/29/2015 04:53 AM, Fam Zheng wrote:
> For various purposes, BDS users call bdrv_drain or bdrv_drain_all to make sure
> there are no pending requests duringA a series of operations on the BDS. But in
s/duringA/during/
> the middle of operations, the caller may 1) yield from a coroutine (mirror_run);
> 2) defer the next part of work to a BH (mirror_run); 3) call nested aio_poll
> (qmp_transaction); etc..
>
> This lock/unlock API is introduced to help ensure that the above complications
> won't spoil the purpose of the bdrv_drain(): bdrv_lock should help quiesce other
> readers and writers at the beginning of such operations, and bdrv_unlock should
> resume the blocked requests.
>
> A notifier list is added to allow devices to cooperate with the lock and pause
> themselves, for example, by not processing more requests on the NBD export.
>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> block.c | 10 +++++++
> block/io.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++
> include/block/block.h | 39 +++++++++++++++++++++++++++
> include/block/block_int.h | 5 ++++
> 4 files changed, 123 insertions(+)
>
I'll leave the technical review on this series to those more familiar
with coroutines.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
* [Qemu-devel] [RFC PATCH 03/12] blockdev: Lock BDS during internal snapshot transaction
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 01/12] block: Use bdrv_drain to replace unnecessary bdrv_drain_all Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 02/12] block: Introduce bdrv_lock and bdrv_unlock API Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 11:39 ` Paolo Bonzini
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 04/12] blockdev: Lock BDS during external " Fam Zheng
` (9 subsequent siblings)
12 siblings, 1 reply; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
Signed-off-by: Fam Zheng <famz@redhat.com>
---
blockdev.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..46f8d60 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1262,6 +1262,7 @@ typedef struct InternalSnapshotState {
BlockDriverState *bs;
AioContext *aio_context;
QEMUSnapshotInfo sn;
+ bool needs_unlock;
} InternalSnapshotState;
static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1356,6 +1357,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
}
/* 4. succeed, mark a snapshot is created */
+ bdrv_lock(bs);
+ state->needs_unlock = true;
state->bs = bs;
}
@@ -1387,6 +1390,9 @@ static void internal_snapshot_clean(BlkTransactionState *common)
InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
common, common);
+ if (state->needs_unlock) {
+ bdrv_unlock(state->bs);
+ }
if (state->aio_context) {
aio_context_release(state->aio_context);
}
--
2.4.2
* Re: [Qemu-devel] [RFC PATCH 03/12] blockdev: Lock BDS during internal snapshot transaction
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 03/12] blockdev: Lock BDS during internal snapshot transaction Fam Zheng
@ 2015-05-29 11:39 ` Paolo Bonzini
0 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-29 11:39 UTC (permalink / raw)
To: Fam Zheng, qemu-devel
Cc: Kevin Wolf, fam, Jeff Cody, qemu-block, Stefan Hajnoczi
On 29/05/2015 12:53, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
> blockdev.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/blockdev.c b/blockdev.c
> index 5eaf77e..46f8d60 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1262,6 +1262,7 @@ typedef struct InternalSnapshotState {
> BlockDriverState *bs;
> AioContext *aio_context;
> QEMUSnapshotInfo sn;
> + bool needs_unlock;
No need for needs_unlock...
> } InternalSnapshotState;
>
> static void internal_snapshot_prepare(BlkTransactionState *common,
> @@ -1356,6 +1357,8 @@ static void internal_snapshot_prepare(BlkTransactionState *common,
> }
>
> /* 4. succeed, mark a snapshot is created */
> + bdrv_lock(bs);
> + state->needs_unlock = true;
> state->bs = bs;
> }
>
> @@ -1387,6 +1390,9 @@ static void internal_snapshot_clean(BlkTransactionState *common)
> InternalSnapshotState *state = DO_UPCAST(InternalSnapshotState,
> common, common);
>
> + if (state->needs_unlock) {
> + bdrv_unlock(state->bs);
> + }
... you can use state->bs != NULL here.
Paolo
> if (state->aio_context) {
> aio_context_release(state->aio_context);
> }
>
* [Qemu-devel] [RFC PATCH 04/12] blockdev: Lock BDS during external snapshot transaction
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (2 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 03/12] blockdev: Lock BDS during internal snapshot transaction Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 05/12] blockdev: Lock BDS during drive-backup transaction Fam Zheng
` (8 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
Lock immediately follows aio_context_acquire, so unlock right before
the corresponding aio_context_release.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
blockdev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/blockdev.c b/blockdev.c
index 46f8d60..1c3946a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1473,6 +1473,7 @@ static void external_snapshot_prepare(BlkTransactionState *common,
/* Acquire AioContext now so any threads operating on old_bs stop */
state->aio_context = bdrv_get_aio_context(state->old_bs);
aio_context_acquire(state->aio_context);
+ bdrv_lock(state->old_bs);
if (!bdrv_is_inserted(state->old_bs)) {
error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1542,6 +1543,7 @@ static void external_snapshot_commit(BlkTransactionState *common)
bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
NULL);
+ bdrv_unlock(state->old_bs);
aio_context_release(state->aio_context);
}
@@ -1553,6 +1555,7 @@ static void external_snapshot_abort(BlkTransactionState *common)
bdrv_unref(state->new_bs);
}
if (state->aio_context) {
+ bdrv_unlock(state->old_bs);
aio_context_release(state->aio_context);
}
}
--
2.4.2
* [Qemu-devel] [RFC PATCH 05/12] blockdev: Lock BDS during drive-backup transaction
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (3 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 04/12] blockdev: Lock BDS during external " Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 06/12] blockdev: Lock BDS during blockdev-backup transaction Fam Zheng
` (7 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
To save the bs pointer for drive_backup_clean, pull the assignment to
state->bs up. It will not be a problem for drive_backup_abort because
state->job will still be NULL.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
blockdev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/blockdev.c b/blockdev.c
index 1c3946a..517def4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1588,6 +1588,8 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
/* AioContext is released in .clean() */
state->aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(state->aio_context);
+ bdrv_lock(bs);
+ state->bs = bs;
qmp_drive_backup(backup->device, backup->target,
backup->has_format, backup->format,
@@ -1603,7 +1605,6 @@ static void drive_backup_prepare(BlkTransactionState *common, Error **errp)
return;
}
- state->bs = bs;
state->job = state->bs->job;
}
@@ -1623,6 +1624,7 @@ static void drive_backup_clean(BlkTransactionState *common)
DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
if (state->aio_context) {
+ bdrv_unlock(state->bs);
aio_context_release(state->aio_context);
}
}
--
2.4.2
* [Qemu-devel] [RFC PATCH 06/12] blockdev: Lock BDS during blockdev-backup transaction
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (4 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 05/12] blockdev: Lock BDS during drive-backup transaction Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 07/12] block-backend: Add blk_add_lock_unlock_notifier Fam Zheng
` (6 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
The change is similar to drive-backup in the last patch.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
blockdev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/blockdev.c b/blockdev.c
index 517def4..80cf7d5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1669,6 +1669,8 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
return;
}
aio_context_acquire(state->aio_context);
+ state->bs = bs;
+ bdrv_lock(bs);
qmp_blockdev_backup(backup->device, backup->target,
backup->sync,
@@ -1681,7 +1683,6 @@ static void blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
return;
}
- state->bs = bs;
state->job = state->bs->job;
}
@@ -1701,6 +1702,7 @@ static void blockdev_backup_clean(BlkTransactionState *common)
BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, common);
if (state->aio_context) {
+ bdrv_unlock(state->bs);
aio_context_release(state->aio_context);
}
}
--
2.4.2
* [Qemu-devel] [RFC PATCH 07/12] block-backend: Add blk_add_lock_unlock_notifier
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (5 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 06/12] blockdev: Lock BDS during blockdev-backup transaction Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 08/12] virtio-blk: Move complete_request to 'ops' structure Fam Zheng
` (5 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
Forward the call to bdrv_add_lock_unlock_notifier.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/block-backend.c | 6 ++++++
include/sysemu/block-backend.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/block/block-backend.c b/block/block-backend.c
index 93e46f3..79fdf75 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -913,3 +913,9 @@ int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo)
{
return bdrv_probe_geometry(blk->bs, geo);
}
+
+void blk_add_lock_unlock_notifier(BlockBackend *blk, Notifier *notifier)
+{
+ bdrv_add_lock_unlock_notifier(blk->bs, notifier);
+}
+
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index b4a4d5e..8706a62 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -168,5 +168,6 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz);
int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo);
+void blk_add_lock_unlock_notifier(BlockBackend *blk, Notifier *notifier);
#endif
--
2.4.2
* [Qemu-devel] [RFC PATCH 08/12] virtio-blk: Move complete_request to 'ops' structure
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (6 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 07/12] block-backend: Add blk_add_lock_unlock_notifier Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 09/12] virtio-blk: Don't handle output when backend is locked Fam Zheng
` (4 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
Should more ops be added to differentiate code between dataplane and
non-dataplane, the new saved_ops approach will be cleaner than messing
with N pointers.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
hw/block/dataplane/virtio-blk.c | 13 ++++++++-----
hw/block/virtio-blk.c | 8 ++++++--
include/hw/virtio/virtio-blk.h | 9 +++++++--
3 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 3db139b..ec0c8f4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -51,8 +51,7 @@ struct VirtIOBlockDataPlane {
/* Operation blocker on BDS */
Error *blocker;
- void (*saved_complete_request)(struct VirtIOBlockReq *req,
- unsigned char status);
+ const VirtIOBlockOps *saved_ops;
};
/* Raise an interrupt to signal guest, if necessary */
@@ -88,6 +87,10 @@ static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
qemu_bh_schedule(s->bh);
}
+static const VirtIOBlockOps virtio_blk_data_plane_ops = {
+ .complete_request = complete_request_vring,
+};
+
static void handle_notify(EventNotifier *e)
{
VirtIOBlockDataPlane *s = container_of(e, VirtIOBlockDataPlane,
@@ -269,8 +272,8 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
}
s->host_notifier = *virtio_queue_get_host_notifier(vq);
- s->saved_complete_request = vblk->complete_request;
- vblk->complete_request = complete_request_vring;
+ s->saved_ops = vblk->ops;
+ vblk->ops = &virtio_blk_data_plane_ops;
s->starting = false;
s->started = true;
@@ -313,7 +316,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
return;
}
s->stopping = true;
- vblk->complete_request = s->saved_complete_request;
+ vblk->ops = s->saved_ops;
trace_virtio_blk_data_plane_stop(s);
aio_context_acquire(s->ctx);
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e6afe97..f4a9d19 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,9 +59,13 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
virtio_notify(vdev, s->vq);
}
+static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
+ .complete_request = virtio_blk_complete_request,
+};
+
static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
{
- req->dev->complete_request(req, status);
+ req->dev->ops->complete_request(req, status);
}
static int virtio_blk_handle_rw_error(VirtIOBlockReq *req, int error,
@@ -905,7 +909,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
s->sector_mask = (s->conf.conf.logical_block_size / BDRV_SECTOR_SIZE) - 1;
s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
- s->complete_request = virtio_blk_complete_request;
+ s->ops = &virtio_blk_ops;
virtio_blk_data_plane_create(vdev, conf, &s->dataplane, &err);
if (err != NULL) {
error_propagate(errp, err);
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 6bf5905..28b3436 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -44,6 +44,12 @@ struct VirtIOBlkConf
struct VirtIOBlockDataPlane;
struct VirtIOBlockReq;
+
+typedef struct {
+ /* Function to push to vq and notify guest */
+ void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+} VirtIOBlockOps;
+
typedef struct VirtIOBlock {
VirtIODevice parent_obj;
BlockBackend *blk;
@@ -54,8 +60,7 @@ typedef struct VirtIOBlock {
unsigned short sector_mask;
bool original_wce;
VMChangeStateEntry *change;
- /* Function to push to vq and notify guest */
- void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+ const VirtIOBlockOps *ops;
Notifier migration_state_notifier;
struct VirtIOBlockDataPlane *dataplane;
} VirtIOBlock;
--
2.4.2
* [Qemu-devel] [RFC PATCH 09/12] virtio-blk: Don't handle output when backend is locked
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (7 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 08/12] virtio-blk: Move complete_request to 'ops' structure Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 10/12] virtio-scsi-dataplane: Add backend lock listener Fam Zheng
` (3 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
virtio-blk now listens to locking and unlocking of the associated block
backend.
Upon locking:
non-dataplane:
1) Set VirtIOBlock.paused
2) In virtio_blk_handle_output, do nothing if VirtIOBlock.paused
dataplane:
1) Clear the host event notifier
2) In handle_notify, do nothing if VirtIOBlock.paused
Upon unlocking:
non-dataplane:
1) Clear VirtIOBlock.paused
2) Schedule a BH on the AioContext of the backend, which calls
virtio_blk_handle_output, so that previously unhandled kicks can
make progress
dataplane:
1) Set the host event notifier
2) Notify the host event notifier so that unhandled events are
processed
Signed-off-by: Fam Zheng <famz@redhat.com>
---
hw/block/dataplane/virtio-blk.c | 25 +++++++++++++++++-
hw/block/virtio-blk.c | 57 +++++++++++++++++++++++++++++++++++++++--
include/hw/virtio/virtio-blk.h | 8 +++++-
3 files changed, 86 insertions(+), 4 deletions(-)
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index ec0c8f4..d6c943c 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -87,8 +87,28 @@ static void complete_request_vring(VirtIOBlockReq *req, unsigned char status)
qemu_bh_schedule(s->bh);
}
+static void virtio_blk_data_plane_pause(VirtIOBlock *vblk)
+{
+ VirtIOBlockDataPlane *s = vblk->dataplane;
+
+ event_notifier_test_and_clear(&s->host_notifier);
+ aio_set_event_notifier(s->ctx, &s->host_notifier, NULL);
+}
+
+static void handle_notify(EventNotifier *e);
+static void virtio_blk_data_plane_resume(VirtIOBlock *vblk)
+{
+ VirtIOBlockDataPlane *s = vblk->dataplane;
+
+ aio_set_event_notifier(s->ctx, &s->host_notifier, handle_notify);
+
+ event_notifier_set(&s->host_notifier);
+}
+
static const VirtIOBlockOps virtio_blk_data_plane_ops = {
- .complete_request = complete_request_vring,
+ .complete_request = complete_request_vring,
+ .pause = virtio_blk_data_plane_pause,
+ .resume = virtio_blk_data_plane_resume,
};
static void handle_notify(EventNotifier *e)
@@ -98,6 +118,9 @@ static void handle_notify(EventNotifier *e)
VirtIOBlock *vblk = VIRTIO_BLK(s->vdev);
event_notifier_test_and_clear(&s->host_notifier);
+ if (vblk->paused) {
+ return;
+ }
blk_io_plug(s->conf->conf.blk);
for (;;) {
MultiReqBuffer mrb = {};
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index f4a9d19..7f3d62e 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -59,8 +59,38 @@ static void virtio_blk_complete_request(VirtIOBlockReq *req,
virtio_notify(vdev, s->vq);
}
+typedef struct {
+ QEMUBH *bh;
+ VirtIOBlock *s;
+} VirtIOBlockResumeData;
+
+static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq);
+static void virtio_blk_resume_bh_cb(void *opaque)
+{
+ VirtIOBlockResumeData *data = opaque;
+ qemu_bh_delete(data->bh);
+ virtio_blk_handle_output(VIRTIO_DEVICE(data->s), data->s->vq);
+}
+
+static void virtio_blk_pause(VirtIOBlock *vblk)
+{
+ /* TODO: stop ioeventfd */
+}
+
+static void virtio_blk_resume(VirtIOBlock *vblk)
+{
+ VirtIOBlockResumeData *data = g_new(VirtIOBlockResumeData, 1);
+ data->bh = aio_bh_new(blk_get_aio_context(vblk->blk),
+ virtio_blk_resume_bh_cb, data);
+ data->s = vblk;
+ data->s->paused = false;
+ qemu_bh_schedule(data->bh);
+}
+
static const VirtIOBlockOps virtio_blk_ops = (VirtIOBlockOps) {
- .complete_request = virtio_blk_complete_request,
+ .complete_request = virtio_blk_complete_request,
+ .pause = virtio_blk_pause,
+ .resume = virtio_blk_resume,
};
static void virtio_blk_req_complete(VirtIOBlockReq *req, unsigned char status)
@@ -597,6 +627,9 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
VirtIOBlockReq *req;
MultiReqBuffer mrb = {};
+ if (s->paused) {
+ return;
+ }
/* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
* dataplane here instead of waiting for .set_status().
*/
@@ -787,7 +820,7 @@ static void virtio_blk_save(QEMUFile *f, void *opaque)
virtio_save(vdev, f);
}
-
+
static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
{
VirtIOBlock *s = VIRTIO_BLK(vdev);
@@ -875,6 +908,22 @@ static void virtio_blk_migration_state_changed(Notifier *notifier, void *data)
}
}
+static void virtio_blk_pause_handler(Notifier *notifier, void *opaque)
+{
+ BdrvLockEvent *event = opaque;
+ VirtIOBlock *s = container_of(notifier, VirtIOBlock,
+ pause_notifier);
+
+ if (event->locking == s->paused) {
+ return;
+ }
+ if (event->locking) {
+ s->ops->pause(s);
+ } else {
+ s->ops->resume(s);
+ }
+}
+
static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
{
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -926,6 +975,9 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
blk_set_guest_block_size(s->blk, s->conf.conf.logical_block_size);
blk_iostatus_enable(s->blk);
+
+ s->pause_notifier.notify = virtio_blk_pause_handler;
+ blk_add_lock_unlock_notifier(s->blk, &s->pause_notifier);
}
static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
@@ -933,6 +985,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev, Error **errp)
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
VirtIOBlock *s = VIRTIO_BLK(dev);
+ notifier_remove(&s->pause_notifier);
remove_migration_state_change_notifier(&s->migration_state_notifier);
virtio_blk_data_plane_destroy(s->dataplane);
s->dataplane = NULL;
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 28b3436..4b6246f 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -42,12 +42,16 @@ struct VirtIOBlkConf
};
struct VirtIOBlockDataPlane;
-
+struct VirtIOBlock;
struct VirtIOBlockReq;
typedef struct {
/* Function to push to vq and notify guest */
void (*complete_request)(struct VirtIOBlockReq *req, unsigned char status);
+
+ /* Functions to pause/resume request handling */
+ void (*pause)(struct VirtIOBlock *vblk);
+ void (*resume)(struct VirtIOBlock *vblk);
} VirtIOBlockOps;
typedef struct VirtIOBlock {
@@ -62,6 +66,8 @@ typedef struct VirtIOBlock {
VMChangeStateEntry *change;
const VirtIOBlockOps *ops;
Notifier migration_state_notifier;
+ Notifier pause_notifier;
+ bool paused;
struct VirtIOBlockDataPlane *dataplane;
} VirtIOBlock;
--
2.4.2
* [Qemu-devel] [RFC PATCH 10/12] virtio-scsi-dataplane: Add backend lock listener
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (8 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 09/12] virtio-blk: Don't handle output when backend is locked Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 11/12] nbd-server: Clear "can_read" when backend is locked Fam Zheng
` (2 subsequent siblings)
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
When a disk is attached to scsi-bus, virtio_scsi_hotplug will take care
of protecting the block device with op blockers. Currently we haven't
enabled block jobs (as is done in virtio_blk_data_plane_create), but it
is better to disable ioeventfd when the backend is locked. This makes
sure that guest I/O requests are paused during qmp transactions (such
as multi-disk snapshot or backup).
A counter is added to the virtio-scsi device, which keeps track of
currently blocked disks. If it goes from 0 to 1, the ioeventfds are
disabled; when it goes back to 0, they are re-enabled.
Also, in device initialization, push the enabling of ioeventfds to just
before the return, so virtio_scsi_clear_aio is not needed there. Rename
it, pair it with an enabling variant, fix one coding style issue, then
use them at the device pause points.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
hw/scsi/virtio-scsi-dataplane.c | 78 ++++++++++++++++++++++++++++++-----------
hw/scsi/virtio-scsi.c | 3 ++
include/hw/virtio/virtio-scsi.h | 3 ++
3 files changed, 64 insertions(+), 20 deletions(-)
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 5575648..28706a2 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -40,7 +40,6 @@ void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread *iothread)
static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
VirtQueue *vq,
- EventNotifierHandler *handler,
int n)
{
BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(s)));
@@ -60,7 +59,6 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
r = g_slice_new(VirtIOSCSIVring);
r->host_notifier = *virtio_queue_get_host_notifier(vq);
r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
- aio_set_event_notifier(s->ctx, &r->host_notifier, handler);
r->parent = s;
@@ -71,7 +69,6 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
return r;
fail_vring:
- aio_set_event_notifier(s->ctx, &r->host_notifier, NULL);
k->set_host_notifier(qbus->parent, n, false);
g_slice_free(VirtIOSCSIVring, r);
return NULL;
@@ -104,6 +101,9 @@ void virtio_scsi_vring_push_notify(VirtIOSCSIReq *req)
}
}
+static void virtio_scsi_start_ioeventfd(VirtIOSCSI *s);
+static void virtio_scsi_stop_ioeventfd(VirtIOSCSI *s);
+
static void virtio_scsi_iothread_handle_ctrl(EventNotifier *notifier)
{
VirtIOSCSIVring *vring = container_of(notifier,
@@ -111,6 +111,7 @@ static void virtio_scsi_iothread_handle_ctrl(EventNotifier *notifier)
VirtIOSCSI *s = VIRTIO_SCSI(vring->parent);
VirtIOSCSIReq *req;
+ assert(!s->pause_counter);
event_notifier_test_and_clear(notifier);
while ((req = virtio_scsi_pop_req_vring(s, vring))) {
virtio_scsi_handle_ctrl_req(s, req);
@@ -124,6 +125,7 @@ static void virtio_scsi_iothread_handle_event(EventNotifier *notifier)
VirtIOSCSI *s = vring->parent;
VirtIODevice *vdev = VIRTIO_DEVICE(s);
+ assert(!s->pause_counter);
event_notifier_test_and_clear(notifier);
if (!(vdev->status & VIRTIO_CONFIG_S_DRIVER_OK)) {
@@ -143,6 +145,7 @@ static void virtio_scsi_iothread_handle_cmd(EventNotifier *notifier)
VirtIOSCSIReq *req, *next;
QTAILQ_HEAD(, VirtIOSCSIReq) reqs = QTAILQ_HEAD_INITIALIZER(reqs);
+ assert(!s->pause_counter);
event_notifier_test_and_clear(notifier);
while ((req = virtio_scsi_pop_req_vring(s, vring))) {
if (virtio_scsi_handle_cmd_req_prepare(s, req)) {
@@ -155,8 +158,52 @@ static void virtio_scsi_iothread_handle_cmd(EventNotifier *notifier)
}
}
+void virtio_scsi_dataplane_pause_handler(Notifier *notifier, void *data)
+{
+ VirtIOSCSI *s = container_of(notifier, VirtIOSCSI, pause_notifier);
+ BdrvLockEvent *event = data;
+
+ if (event->locking) {
+ s->pause_counter++;
+ if (s->pause_counter == 1) {
+ virtio_scsi_stop_ioeventfd(s);
+ }
+ } else {
+ s->pause_counter--;
+ if (s->pause_counter == 0) {
+ virtio_scsi_start_ioeventfd(s);
+ }
+ }
+ assert(s->pause_counter >= 0);
+}
+
/* assumes s->ctx held */
-static void virtio_scsi_clear_aio(VirtIOSCSI *s)
+static void virtio_scsi_start_ioeventfd(VirtIOSCSI *s)
+{
+ VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
+ int i;
+
+ if (!s->dataplane_started || s->dataplane_stopping) {
+ return;
+ }
+ if (s->ctrl_vring) {
+ aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier,
+ virtio_scsi_iothread_handle_ctrl);
+ }
+ if (s->event_vring) {
+ aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier,
+ virtio_scsi_iothread_handle_event);
+ }
+ if (s->cmd_vrings) {
+ for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
+ aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+ virtio_scsi_iothread_handle_cmd);
+ }
+ }
+}
+
+/* assumes s->ctx held */
+static void virtio_scsi_stop_ioeventfd(VirtIOSCSI *s)
{
VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(s);
int i;
@@ -169,7 +216,8 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
}
if (s->cmd_vrings) {
for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
- aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
+ aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier,
+ NULL);
}
}
}
@@ -229,24 +277,18 @@ void virtio_scsi_dataplane_start(VirtIOSCSI *s)
}
aio_context_acquire(s->ctx);
- s->ctrl_vring = virtio_scsi_vring_init(s, vs->ctrl_vq,
- virtio_scsi_iothread_handle_ctrl,
- 0);
+ s->ctrl_vring = virtio_scsi_vring_init(s, vs->ctrl_vq, 0);
if (!s->ctrl_vring) {
goto fail_vrings;
}
- s->event_vring = virtio_scsi_vring_init(s, vs->event_vq,
- virtio_scsi_iothread_handle_event,
- 1);
+ s->event_vring = virtio_scsi_vring_init(s, vs->event_vq, 1);
if (!s->event_vring) {
goto fail_vrings;
}
s->cmd_vrings = g_new(VirtIOSCSIVring *, vs->conf.num_queues);
for (i = 0; i < vs->conf.num_queues; i++) {
s->cmd_vrings[i] =
- virtio_scsi_vring_init(s, vs->cmd_vqs[i],
- virtio_scsi_iothread_handle_cmd,
- i + 2);
+ virtio_scsi_vring_init(s, vs->cmd_vqs[i], i + 2);
if (!s->cmd_vrings[i]) {
goto fail_vrings;
}
@@ -254,11 +296,11 @@ void virtio_scsi_dataplane_start(VirtIOSCSI *s)
s->dataplane_starting = false;
s->dataplane_started = true;
+ virtio_scsi_start_ioeventfd(s);
aio_context_release(s->ctx);
return;
fail_vrings:
- virtio_scsi_clear_aio(s);
aio_context_release(s->ctx);
virtio_scsi_vring_teardown(s);
for (i = 0; i < vs->conf.num_queues + 2; i++) {
@@ -290,11 +332,7 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
aio_context_acquire(s->ctx);
- aio_set_event_notifier(s->ctx, &s->ctrl_vring->host_notifier, NULL);
- aio_set_event_notifier(s->ctx, &s->event_vring->host_notifier, NULL);
- for (i = 0; i < vs->conf.num_queues; i++) {
- aio_set_event_notifier(s->ctx, &s->cmd_vrings[i]->host_notifier, NULL);
- }
+ virtio_scsi_stop_ioeventfd(s);
blk_drain_all(); /* ensure there are no in-flight requests */
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index e242fef..be2542e 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -774,6 +774,8 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev,
blk_op_block_all(sd->conf.blk, s->blocker);
aio_context_acquire(s->ctx);
blk_set_aio_context(sd->conf.blk, s->ctx);
+ s->pause_notifier.notify = virtio_scsi_dataplane_pause_handler;
+ blk_add_lock_unlock_notifier(sd->conf.blk, &s->pause_notifier);
aio_context_release(s->ctx);
}
@@ -798,6 +800,7 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev,
}
if (s->ctx) {
+ notifier_remove(&s->pause_notifier);
blk_op_unblock_all(sd->conf.blk, s->blocker);
}
qdev_simple_device_unplug_cb(hotplug_dev, dev, errp);
diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h
index b42e7f1..928b87e 100644
--- a/include/hw/virtio/virtio-scsi.h
+++ b/include/hw/virtio/virtio-scsi.h
@@ -97,6 +97,8 @@ typedef struct VirtIOSCSI {
bool dataplane_disabled;
bool dataplane_fenced;
Error *blocker;
+ Notifier pause_notifier;
+ int pause_counter;
Notifier migration_state_notifier;
uint32_t host_features;
} VirtIOSCSI;
@@ -170,6 +172,7 @@ void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
uint32_t event, uint32_t reason);
void virtio_scsi_set_iothread(VirtIOSCSI *s, IOThread *iothread);
+void virtio_scsi_dataplane_pause_handler(Notifier *notifier, void *data);
void virtio_scsi_dataplane_start(VirtIOSCSI *s);
void virtio_scsi_dataplane_stop(VirtIOSCSI *s);
void virtio_scsi_vring_push_notify(VirtIOSCSIReq *req);
--
2.4.2
* [Qemu-devel] [RFC PATCH 11/12] nbd-server: Clear "can_read" when backend is locked
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (9 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 10/12] virtio-scsi-dataplane: Add backend lock listener Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 12/12] mirror: Protect source between bdrv_drain and bdrv_swap Fam Zheng
2015-05-29 12:01 ` [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Paolo Bonzini
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
So that the NBD export will not process more requests.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
nbd.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/nbd.c b/nbd.c
index 06b501b..854d6a5 100644
--- a/nbd.c
+++ b/nbd.c
@@ -160,6 +160,8 @@ struct NBDExport {
uint32_t nbdflags;
QTAILQ_HEAD(, NBDClient) clients;
QTAILQ_ENTRY(NBDExport) next;
+ Notifier lock_notify;
+ bool io_blocked;
AioContext *ctx;
};
@@ -1053,6 +1055,19 @@ static void blk_aio_detach(void *opaque)
exp->ctx = NULL;
}
+static void nbd_pause_handler(Notifier *notifier, void *data)
+{
+ BdrvLockEvent *event = data;
+ NBDClient *client;
+ NBDExport *exp = container_of(notifier, NBDExport, lock_notify);
+
+ exp->io_blocked = event->locking;
+
+ QTAILQ_FOREACH(client, &exp->clients, next) {
+ nbd_update_can_read(client);
+ }
+}
+
NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
uint32_t nbdflags, void (*close)(NBDExport *),
Error **errp)
@@ -1081,6 +1096,8 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
* access since the export could be available before migration handover.
*/
blk_invalidate_cache(blk, NULL);
+ exp->lock_notify.notify = nbd_pause_handler;
+ blk_add_lock_unlock_notifier(blk, &exp->lock_notify);
return exp;
fail:
@@ -1132,6 +1149,7 @@ void nbd_export_close(NBDExport *exp)
nbd_export_set_name(exp, NULL);
nbd_export_put(exp);
if (exp->blk) {
+ notifier_remove(&exp->lock_notify);
blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
blk_aio_detach, exp);
blk_unref(exp->blk);
@@ -1455,6 +1473,9 @@ static void nbd_update_can_read(NBDClient *client)
bool can_read = client->recv_coroutine ||
client->nb_requests < MAX_NBD_REQUESTS;
+ if (client->exp && client->exp->io_blocked) {
+ can_read = false;
+ }
if (can_read != client->can_read) {
client->can_read = can_read;
nbd_set_handlers(client);
--
2.4.2
* [Qemu-devel] [RFC PATCH 12/12] mirror: Protect source between bdrv_drain and bdrv_swap
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (10 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 11/12] nbd-server: Clear "can_read" when backend is locked Fam Zheng
@ 2015-05-29 10:53 ` Fam Zheng
2015-05-29 12:01 ` [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Paolo Bonzini
12 siblings, 0 replies; 18+ messages in thread
From: Fam Zheng @ 2015-05-29 10:53 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, fam, qemu-block, Jeff Cody, Stefan Hajnoczi,
Paolo Bonzini
Source and target are in sync when we leave the mirror_run loop, and they
should remain so until bdrv_swap. Before block_job_defer_to_main_loop was
introduced, that was easy to prove. Now, tricky things can happen after
mirror_run returns and before mirror_exit runs, for example ioeventfd
handlers being called, or a device model timer callback submitting more I/O.
So, skip block_job_defer_to_main_loop if we're already in the main
context. This is a necessary special case until BlockBackend really
honors bdrv_lock.
If we're not in the main context, we rely on the notifying mechanism to
do the right thing.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
block/mirror.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..24cf687 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -57,6 +57,8 @@ typedef struct MirrorBlockJob {
int in_flight;
int sectors_in_flight;
int ret;
+ /* True if the source is locked by us */
+ bool need_unlock;
} MirrorBlockJob;
typedef struct MirrorOp {
@@ -358,6 +360,9 @@ static void mirror_exit(BlockJob *job, void *opaque)
if (replace_aio_context) {
aio_context_release(replace_aio_context);
}
+ if (s->need_unlock) {
+ bdrv_unlock(s->common.bs);
+ }
g_free(s->replaces);
bdrv_unref(s->target);
block_job_completed(&s->common, data->ret);
@@ -521,7 +526,8 @@ static void coroutine_fn mirror_run(void *opaque)
* mirror_populate runs.
*/
trace_mirror_before_drain(s, cnt);
- bdrv_drain(bs);
+ bdrv_lock(bs);
+ s->need_unlock = true;
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
}
@@ -543,6 +549,10 @@ static void coroutine_fn mirror_run(void *opaque)
s->common.cancelled = false;
break;
}
+ if (s->need_unlock) {
+ bdrv_unlock(bs);
+ s->need_unlock = false;
+ }
last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
@@ -565,7 +575,11 @@ immediate_exit:
data = g_malloc(sizeof(*data));
data->ret = ret;
- block_job_defer_to_main_loop(&s->common, mirror_exit, data);
+ if (bs->aio_context == qemu_get_aio_context()) {
+ mirror_exit(&s->common, data);
+ } else {
+ block_job_defer_to_main_loop(&s->common, mirror_exit, data);
+ }
}
static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
--
2.4.2
* Re: [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API
2015-05-29 10:53 [Qemu-devel] [RFC PATCH 00/12] block: Protect block jobs with lock / unlock API Fam Zheng
` (11 preceding siblings ...)
2015-05-29 10:53 ` [Qemu-devel] [RFC PATCH 12/12] mirror: Protect source between bdrv_drain and bdrv_swap Fam Zheng
@ 2015-05-29 12:01 ` Paolo Bonzini
12 siblings, 0 replies; 18+ messages in thread
From: Paolo Bonzini @ 2015-05-29 12:01 UTC (permalink / raw)
To: Fam Zheng, qemu-devel
Cc: Kevin Wolf, fam, Jeff Cody, qemu-block, Stefan Hajnoczi
On 29/05/2015 12:53, Fam Zheng wrote:
> This is the partial work to introduce bdrv_lock / bdrv_unlock and use them in
> block jobs where exclusive access to a BDS is necessary. It addresses the same
> category of problems as [1] with a different API, as the idea proposed by Paolo
> and Kevin.
>
> What's implemented in this series is also very close to [1], i.e. pausing
> ioeventfd and NBD server, with a notifier list.
I haven't yet fully digested patch 2 but, in the meantime, I can point
out how I would like the patch series to be structured.
Patch 1 is okay.
Patch 2 should be a very simple change that introduces the API simply as
bdrv_lock(bs) { bdrv_drain(bs); }
bdrv_unlock(bs) {}
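Spelled out, that first step would look roughly like this (only a sketch of
the stub described above, not code from this series):
    void bdrv_lock(BlockDriverState *bs)
    {
        bdrv_drain(bs);
    }
    void bdrv_unlock(BlockDriverState *bs)
    {
    }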
Then you can introduce bdrv_lock/unlock (patches 3-4-5-6-12) which
remove most calls to bdrv_drain. BTW you can also remove the
bdrv_drain_all in qmp_transaction. These patches introduce the new API
but have no semantic change. If we agree that the API makes sense, they
can also be committed right now.
Only then do you introduce full-blown locking and notifiers, and patches
7-8-9-10-11.
Thanks for your work!
Paolo
> The important missing pieces are converting bdrv_lock/unlock callers to
> coroutine, so that they can wait in the bs->lock_queue. In other words, this
> API is not complete without that, because we can't handle the case where two
> non-coroutine callers have contention, which is now asserted in bdrv_lock as
> practically impossible.
>
> [1]: https://lists.gnu.org/archive/html/qemu-block/2015-05/msg00800.html
>
> Thanks,
>
> Fam
>
>
> Fam Zheng (12):
> block: Use bdrv_drain to replace unnecessary bdrv_drain_all
> block: Introduce bdrv_lock and bdrv_unlock API
> blockdev: Lock BDS during internal snapshot transaction
> blockdev: Lock BDS during external snapshot transaction
> blockdev: Lock BDS during drive-backup transaction
> blockdev: Lock BDS during blockdev-backup transaction
> block-backend: Add blk_add_lock_unlock_notifier
> virtio-blk: Move complete_request to 'ops' structure
> virtio-blk: Don't handle output when backend is locked
> virtio-scsi-dataplane: Add backend lock listener
> nbd-server: Clear "can_read" when backend is locked
> mirror: Protect source between bdrv_drain and bdrv_swap
>
> block.c | 16 +++++++--
> block/block-backend.c | 6 ++++
> block/io.c | 69 ++++++++++++++++++++++++++++++++++++
> block/mirror.c | 18 ++++++++--
> block/snapshot.c | 2 +-
> blockdev.c | 17 +++++++--
> hw/block/dataplane/virtio-blk.c | 36 ++++++++++++++++---
> hw/block/virtio-blk.c | 63 +++++++++++++++++++++++++++++++--
> hw/scsi/virtio-scsi-dataplane.c | 78 ++++++++++++++++++++++++++++++-----------
> hw/scsi/virtio-scsi.c | 3 ++
> include/block/block.h | 39 +++++++++++++++++++++
> include/block/block_int.h | 5 +++
> include/hw/virtio/virtio-blk.h | 17 +++++++--
> include/hw/virtio/virtio-scsi.h | 3 ++
> include/sysemu/block-backend.h | 1 +
> migration/block.c | 2 +-
> nbd.c | 21 +++++++++++
> 17 files changed, 356 insertions(+), 40 deletions(-)
>