* [Qemu-devel] [PATCH 01/10] migration: add missed aio_context_acquire for state writing/reading
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 02/10] block: add missed aio_context_acquire around bdrv_set_aio_context Denis V. Lunev
` (8 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Amit Shah, Denis V. Lunev, qemu-devel, Stefan Hajnoczi,
qemu-stable
The AioContext should be locked in the same way as is done during QMP
snapshot creation; otherwise a lot of trouble is possible if native
AIO mode is enabled for a disk.
qemu_fopen_bdrv and bdrv_fclose are used in real snapshot operations only,
together with block drivers, so this change should affect only HMP snapshot
operations.
The AioContext lock is recursive, thus nested locking should not be a problem.
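For illustration, the resulting lock pairing on the HMP path looks roughly
like this (a sketch only, not part of the patch; the exact savevm.c callers
and signatures may differ slightly):

    /* qemu_fopen_bdrv() now takes the AioContext of bs; bdrv_fclose(),
     * reached via qemu_fclose(), releases it.  Because the lock is
     * recursive, callers that already hold it keep working. */
    QEMUFile *f = qemu_fopen_bdrv(bs, 1);   /* aio_context_acquire(ctx) */
    if (f != NULL) {
        qemu_savevm_state(f, &local_err);   /* may re-acquire ctx (nested) */
        qemu_fclose(f);                     /* bdrv_fclose -> aio_context_release(ctx) */
    }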
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
migration/savevm.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..1653f56 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -153,7 +153,11 @@ static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
static int bdrv_fclose(void *opaque)
{
- return bdrv_flush(opaque);
+ BlockDriverState *bs = (BlockDriverState *)opaque;
+ int ret = bdrv_flush(bs);
+
+ aio_context_release(bdrv_get_aio_context(bs));
+ return ret;
}
static const QEMUFileOps bdrv_read_ops = {
@@ -169,10 +173,18 @@ static const QEMUFileOps bdrv_write_ops = {
static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int is_writable)
{
+ QEMUFile *file;
+
if (is_writable) {
- return qemu_fopen_ops(bs, &bdrv_write_ops);
+ file = qemu_fopen_ops(bs, &bdrv_write_ops);
+ } else {
+ file = qemu_fopen_ops(bs, &bdrv_read_ops);
+ }
+
+ if (file != NULL) {
+ aio_context_acquire(bdrv_get_aio_context(bs));
}
- return qemu_fopen_ops(bs, &bdrv_read_ops);
+ return file;
}
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 02/10] block: add missed aio_context_acquire around bdrv_set_aio_context
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 01/10] migration: add missed aio_context_acquire for state writing/reading Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 03/10] migration: added missed aio_context_acquire around bdrv_snapshot_delete Denis V. Lunev
` (7 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, qemu-stable
It is required for bdrv_drain.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
block/block-backend.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/block/block-backend.c b/block/block-backend.c
index 19fdaae..07fcfc7 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1065,7 +1065,10 @@ static AioContext *blk_aiocb_get_aio_context(BlockAIOCB *acb)
void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
{
if (blk->bs) {
+ AioContext *ctx = blk_get_aio_context(blk);
+ aio_context_acquire(ctx);
bdrv_set_aio_context(blk->bs, new_context);
+ aio_context_release(ctx);
}
}
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 03/10] migration: added missed aio_context_acquire around bdrv_snapshot_delete
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 01/10] migration: add missed aio_context_acquire for state writing/reading Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 02/10] block: add missed aio_context_acquire around bdrv_set_aio_context Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:51 ` Juan Quintela
2015-11-03 14:12 ` [Qemu-devel] [PATCH 04/10] blockdev: acquire AioContext in hmp_commit() Denis V. Lunev
` (6 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Juan Quintela, qemu-devel, qemu-stable, Stefan Hajnoczi,
Amit Shah, Denis V. Lunev
Necessary for bdrv_drain to run properly.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
migration/savevm.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/migration/savevm.c b/migration/savevm.c
index 1653f56..f45ff63 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1273,7 +1273,12 @@ static int del_existing_snapshots(Monitor *mon, const char *name)
while ((bs = bdrv_next(bs))) {
if (bdrv_can_snapshot(bs) &&
bdrv_snapshot_find(bs, snapshot, name) >= 0) {
+ AioContext *ctx = bdrv_get_aio_context(bs);
+
+ aio_context_acquire(ctx);
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
+ aio_context_release(ctx);
+
if (err) {
monitor_printf(mon,
"Error while deleting snapshot on device '%s':"
@@ -1518,8 +1523,13 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
bs = NULL;
while ((bs = bdrv_next(bs))) {
if (bdrv_can_snapshot(bs)) {
+ AioContext *ctx = bdrv_get_aio_context(bs);
+
err = NULL;
+ aio_context_acquire(ctx);
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
+ aio_context_release(ctx);
+
if (err) {
monitor_printf(mon,
"Error while deleting snapshot on device '%s':"
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH 03/10] migration: added missed aio_context_acquire around bdrv_snapshot_delete
2015-11-03 14:12 ` [Qemu-devel] [PATCH 03/10] migration: added missed aio_context_acquire around bdrv_snapshot_delete Denis V. Lunev
@ 2015-11-03 14:51 ` Juan Quintela
2015-11-04 7:32 ` [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking Denis V. Lunev
0 siblings, 1 reply; 24+ messages in thread
From: Juan Quintela @ 2015-11-03 14:51 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: Amit Shah, qemu-devel, Stefan Hajnoczi, qemu-stable
"Denis V. Lunev" <den@openvz.org> wrote:
> Necessary for bdrv_drain to run properly.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>
> ---
See comments on previous thread just posted.
Stefan's last suggestion was to move this code to snapshot.c, and then you
don't need to convince migration folks of anything O:-)
> migration/savevm.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 1653f56..f45ff63 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1273,7 +1273,12 @@ static int del_existing_snapshots(Monitor *mon, const char *name)
> while ((bs = bdrv_next(bs))) {
> if (bdrv_can_snapshot(bs) &&
> bdrv_snapshot_find(bs, snapshot, name) >= 0) {
> + AioContext *ctx = bdrv_get_aio_context(bs);
> +
> + aio_context_acquire(ctx);
> bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
> + aio_context_release(ctx);
> +
> if (err) {
> monitor_printf(mon,
> "Error while deleting snapshot on device '%s':"
> @@ -1518,8 +1523,13 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
> bs = NULL;
> while ((bs = bdrv_next(bs))) {
> if (bdrv_can_snapshot(bs)) {
> + AioContext *ctx = bdrv_get_aio_context(bs);
> +
> err = NULL;
> + aio_context_acquire(ctx);
> bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
> + aio_context_release(ctx);
> +
> if (err) {
> monitor_printf(mon,
> "Error while deleting snapshot on device '%s':"
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking
2015-11-03 14:51 ` Juan Quintela
@ 2015-11-04 7:32 ` Denis V. Lunev
2015-11-04 9:49 ` Juan Quintela
0 siblings, 1 reply; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-04 7:32 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, Juan Quintela
What about this? Is it simple enough for you, keeping the lock around
qemu_fopen_bdrv/qemu_fclose as suggested in patch 1?
This is not tested at all, just sent as an idea for a discussion.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
block.c | 17 +++++++++++++++++
include/block/block.h | 2 ++
migration/savevm.c | 23 +++++++++++++++--------
monitor.c | 2 +-
4 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/block.c b/block.c
index 044897e..d376ec2 100644
--- a/block.c
+++ b/block.c
@@ -2741,6 +2741,23 @@ BlockDriverState *bdrv_next(BlockDriverState *bs)
return QTAILQ_NEXT(bs, device_list);
}
+BlockDriverState *bdrv_next_lock(BlockDriverState *bs)
+{
+ if (bs != NULL) {
+ aio_context_release(bdrv_get_aio_context(bs));
+ }
+ bs = bdrv_next(bs);
+ if (bs != NULL) {
+ aio_context_acquire(bdrv_get_aio_context(bs));
+ }
+ return bs;
+}
+
+void bdrv_unlock(BlockDriverState *bs)
+{
+ aio_context_release(bdrv_get_aio_context(bs));
+}
+
const char *bdrv_get_node_name(const BlockDriverState *bs)
{
return bs->node_name;
diff --git a/include/block/block.h b/include/block/block.h
index 610db92..b29dd5b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -401,6 +401,8 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base);
BlockDriverState *bdrv_next_node(BlockDriverState *bs);
BlockDriverState *bdrv_next(BlockDriverState *bs);
+BlockDriverState *bdrv_next_lock(BlockDriverState *bs);
+void bdrv_unlock(BlockDriverState *bs);
int bdrv_is_encrypted(BlockDriverState *bs);
int bdrv_key_required(BlockDriverState *bs);
int bdrv_set_key(BlockDriverState *bs, const char *key);
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..cf06a10 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1240,8 +1240,9 @@ out:
static BlockDriverState *find_vmstate_bs(void)
{
BlockDriverState *bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (bdrv_can_snapshot(bs)) {
+ bdrv_unlock(bs);
return bs;
}
}
@@ -1258,11 +1259,12 @@ static int del_existing_snapshots(Monitor *mon, const char *name)
Error *err = NULL;
bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (bdrv_can_snapshot(bs) &&
bdrv_snapshot_find(bs, snapshot, name) >= 0) {
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
if (err) {
+ bdrv_unlock(bs);
monitor_printf(mon,
"Error while deleting snapshot on device '%s':"
" %s\n",
@@ -1292,13 +1294,14 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
/* Verify if there is a device that doesn't support snapshots and is writable */
bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
continue;
}
if (!bdrv_can_snapshot(bs)) {
+ bdrv_unlock(bs);
monitor_printf(mon, "Device '%s' is writable but does not support snapshots.\n",
bdrv_get_device_name(bs));
return;
@@ -1365,7 +1368,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
/* create the snapshots */
bs1 = NULL;
- while ((bs1 = bdrv_next(bs1))) {
+ while ((bs1 = bdrv_next_lock(bs1))) {
if (bdrv_can_snapshot(bs1)) {
/* Write VM state size only to the image that contains the state */
sn->vm_state_size = (bs == bs1 ? vm_state_size : 0);
@@ -1436,13 +1439,14 @@ int load_vmstate(const char *name)
/* Verify if there is any device that doesn't support snapshots and is
writable and check if the requested snapshot is available too. */
bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
continue;
}
if (!bdrv_can_snapshot(bs)) {
+ bdrv_unlock(bs);
error_report("Device '%s' is writable but does not support snapshots.",
bdrv_get_device_name(bs));
return -ENOTSUP;
@@ -1450,6 +1454,7 @@ int load_vmstate(const char *name)
ret = bdrv_snapshot_find(bs, &sn, name);
if (ret < 0) {
+ bdrv_unlock(bs);
error_report("Device '%s' does not have the requested snapshot '%s'",
bdrv_get_device_name(bs), name);
return ret;
@@ -1460,10 +1465,11 @@ int load_vmstate(const char *name)
bdrv_drain_all();
bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (bdrv_can_snapshot(bs)) {
ret = bdrv_snapshot_goto(bs, name);
if (ret < 0) {
+ bdrv_unlock(bs);
error_report("Error %d while activating snapshot '%s' on '%s'",
ret, name, bdrv_get_device_name(bs));
return ret;
@@ -1504,7 +1510,7 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
}
bs = NULL;
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
if (bdrv_can_snapshot(bs)) {
err = NULL;
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
@@ -1552,10 +1558,11 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
available = 1;
bs1 = NULL;
- while ((bs1 = bdrv_next(bs1))) {
+ while ((bs1 = bdrv_next_lock(bs1))) {
if (bdrv_can_snapshot(bs1) && bs1 != bs) {
ret = bdrv_snapshot_find(bs1, sn_info, sn->id_str);
if (ret < 0) {
+ bdrv_unlock(bs);
available = 0;
break;
}
diff --git a/monitor.c b/monitor.c
index 301a143..ea1a917 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3374,7 +3374,7 @@ static void vm_completion(ReadLineState *rs, const char *str)
len = strlen(str);
readline_set_completion_index(rs, len);
- while ((bs = bdrv_next(bs))) {
+ while ((bs = bdrv_next_lock(bs))) {
SnapshotInfoList *snapshots, *snapshot;
if (!bdrv_can_snapshot(bs)) {
--
2.1.4
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking
2015-11-04 7:32 ` [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking Denis V. Lunev
@ 2015-11-04 9:49 ` Juan Quintela
2015-11-04 11:12 ` Denis V. Lunev
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Denis V. Lunev
0 siblings, 2 replies; 24+ messages in thread
From: Juan Quintela @ 2015-11-04 9:49 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi
"Denis V. Lunev" <den@openvz.org> wrote:
> What about this? Is it simple enough for you, keeping the lock around
> qemu_fopen_bdrv/qemu_fclose as suggested in patch 1?
>
> This is not tested at all, just sent as an idea for a discussion.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
> block.c | 17 +++++++++++++++++
> include/block/block.h | 2 ++
> migration/savevm.c | 23 +++++++++++++++--------
> monitor.c | 2 +-
> 4 files changed, 35 insertions(+), 9 deletions(-)
>
> diff --git a/block.c b/block.c
> index 044897e..d376ec2 100644
> --- a/block.c
> +++ b/block.c
> @@ -2741,6 +2741,23 @@ BlockDriverState *bdrv_next(BlockDriverState *bs)
> return QTAILQ_NEXT(bs, device_list);
> }
>
> +BlockDriverState *bdrv_next_lock(BlockDriverState *bs)
> +{
> + if (bs != NULL) {
> + aio_context_release(bdrv_get_aio_context(bs));
> + }
> + bs = bdrv_next(bs);
> + if (bs != NULL) {
> + aio_context_acquire(bdrv_get_aio_context(bs));
> + }
> + return bs;
> +}
> +
> +void bdrv_unlock(BlockDriverState *bs)
> +{
> + aio_context_release(bdrv_get_aio_context(bs));
> +}
I think I prefer bdrv_ref/unref.
And once there, having bdrv_next_lock() only removes the need to do a
bdrv_ref (or lock if you prefer).
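To make that concrete, an untested sketch of the loop I have in mind
(the per-device body is just a placeholder):

    BlockDriverState *bs = NULL;

    while ((bs = bdrv_next(bs))) {
        AioContext *ctx = bdrv_get_aio_context(bs);

        bdrv_ref(bs);                 /* pin bs so it cannot go away */
        aio_context_acquire(ctx);
        /* ... per-device snapshot work ... */
        aio_context_release(ctx);
        bdrv_unref(bs);
    }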
> diff --git a/include/block/block.h b/include/block/block.h
> index 610db92..b29dd5b 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -401,6 +401,8 @@ BlockDriverState *bdrv_lookup_bs(const char *device,
> bool bdrv_chain_contains(BlockDriverState *top, BlockDriverState *base);
> BlockDriverState *bdrv_next_node(BlockDriverState *bs);
> BlockDriverState *bdrv_next(BlockDriverState *bs);
> +BlockDriverState *bdrv_next_lock(BlockDriverState *bs);
> +void bdrv_unlock(BlockDriverState *bs);
> int bdrv_is_encrypted(BlockDriverState *bs);
> int bdrv_key_required(BlockDriverState *bs);
> int bdrv_set_key(BlockDriverState *bs, const char *key);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dbcc39a..cf06a10 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1240,8 +1240,9 @@ out:
> static BlockDriverState *find_vmstate_bs(void)
> {
> BlockDriverState *bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
> if (bdrv_can_snapshot(bs)) {
> + bdrv_unlock(bs);
Once here, why don't we need to return it locked?
> return bs;
> }
Looking at it from a thousand-foot view, I think that it is just easier to
export that function from block.c:
BlockDriverState *bdrv_find_snapshot_bs(void)
{
BlockDriverState *bs = NULL;
while ((bs = bdrv_next(bs))) {
if (bdrv_can_snapshot(bs)) {
return bs;
}
}
return NULL;
}
or something like that?
export something like:
char *name bdrv_remove_snapshots(const char *name, Error *err)
{
BlockDriverState *bs;
QEMUSnapshotInfo sn1, *snapshot = &sn1;
bs = NULL;
while ((bs = bdrv_next(bs))) {
if (bdrv_can_snapshot(bs) &&
bdrv_snapshot_find(bs, snapshot, name) >= 0) {
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
if (err) {
return bdrv_get_device_name(bs);
}
}
}
return NULL;
}
And use like that:
static int del_existing_snapshots(Monitor *mon, const char *name)
{
Error *err = NULL;
char *name;
name = bdrv_remove_snapshots(name, &err);
if (err) {
monitor_printf(mon,
"Error while deleting snapshot on device '%s': %s\n",=
name, error_get_pretty(err));
return -1;
}
return 0;
}
Yes, we go through pains to just not teach block.c about the monitor.
void hmp_delvm(Monitor *mon, const QDict *qdict)
{
const char *name = qdict_get_str(qdict, "name");
if (!bdrv_find_snapshot_bs()) {
monitor_printf(mon, "No block device supports snapshots\n");
return;
}
del_existing_snapshots(mon, name);
}
Yes, we have changed the semantics "slightly". The previous version of
hmp_delvm() will try to remove all the snapshots from any device with
that name. This one would remove them until it finds one error. I
think that the code reuse and the consistence trumps the change in
semantics (really the change is only on error cases).
> @@ -1292,13 +1294,14 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
>
> /* Verify if there is a device that doesn't support snapshots and is writable */
> bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
>
> if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> continue;
> }
>
> if (!bdrv_can_snapshot(bs)) {
> + bdrv_unlock(bs);
> monitor_printf(mon, "Device '%s' is writable but does not support snapshots.\n",
> bdrv_get_device_name(bs));
> return;
Export this bit of code as:
bdrv_snapshot_supported() or somesuch?
Migration code only needs a true/false value.
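Something like this untested sketch (the name is only a suggestion, and
hmp_savevm() would still need the device name for its error message, so
the final helper may want an out-parameter):

    bool bdrv_snapshot_supported(void)
    {
        BlockDriverState *bs = NULL;

        while ((bs = bdrv_next(bs))) {
            if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
                continue;
            }
            if (!bdrv_can_snapshot(bs)) {
                return false;
            }
        }
        return true;
    }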
> @@ -1365,7 +1368,7 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
> /* create the snapshots */
>
> bs1 = NULL;
> - while ((bs1 = bdrv_next(bs1))) {
> + while ((bs1 = bdrv_next_lock(bs1))) {
> if (bdrv_can_snapshot(bs1)) {
> /* Write VM state size only to the image that contains the state */
> sn->vm_state_size = (bs == bs1 ? vm_state_size : 0);
> @@ -1436,13 +1439,14 @@ int load_vmstate(const char *name)
> /* Verify if there is any device that doesn't support snapshots and is
> writable and check if the requested snapshot is available too. */
> bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
>
> if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> continue;
> }
>
> if (!bdrv_can_snapshot(bs)) {
> + bdrv_unlock(bs);
> error_report("Device '%s' is writable but does not support snapshots.",
> bdrv_get_device_name(bs));
> return -ENOTSUP;
> @@ -1450,6 +1454,7 @@ int load_vmstate(const char *name)
>
> ret = bdrv_snapshot_find(bs, &sn, name);
> if (ret < 0) {
> + bdrv_unlock(bs);
> error_report("Device '%s' does not have the requested snapshot '%s'",
> bdrv_get_device_name(bs), name);
> return ret;
The rest of the code up to here, calling bdrv_* functions, is basically the migration
layer asking the block layer:
Pretty, pretty please, give me a device where I can do a snapshot,
or an error.
> @@ -1460,10 +1465,11 @@ int load_vmstate(const char *name)
> bdrv_drain_all();
>
> bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
> if (bdrv_can_snapshot(bs)) {
> ret = bdrv_snapshot_goto(bs, name);
> if (ret < 0) {
> + bdrv_unlock(bs);
> error_report("Error %d while activating snapshot '%s' on '%s'",
> ret, name, bdrv_get_device_name(bs));
> return ret;
> @@ -1504,7 +1510,7 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
> }
>
> bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
> if (bdrv_can_snapshot(bs)) {
> err = NULL;
> bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
> @@ -1552,10 +1558,11 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
> available = 1;
> bs1 = NULL;
>
> - while ((bs1 = bdrv_next(bs1))) {
> + while ((bs1 = bdrv_next_lock(bs1))) {
> if (bdrv_can_snapshot(bs1) && bs1 != bs) {
> ret = bdrv_snapshot_find(bs1, sn_info, sn->id_str);
> if (ret < 0) {
> + bdrv_unlock(bs);
> available = 0;
> break;
> }
I will claim that this command belongs to the block layer: you already
have to work with the monitor, and there is nothing migration related
in it.
Notice that the reason that I ask to move things to the block layer is
that then you can export less internal things. I think that
bdrv_can_snapshot() is a function that shouldn't be exported. What
users want is: give_me_a_device_to_do_one_snapshot(), they don't care
about the details.
> diff --git a/monitor.c b/monitor.c
> index 301a143..ea1a917 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -3374,7 +3374,7 @@ static void vm_completion(ReadLineState *rs, const char *str)
>
> len = strlen(str);
> readline_set_completion_index(rs, len);
> - while ((bs = bdrv_next(bs))) {
> + while ((bs = bdrv_next_lock(bs))) {
> SnapshotInfoList *snapshots, *snapshot;
>
> if (!bdrv_can_snapshot(bs)) {
We don't need to unlock device here?
Yes, I know, that makes things much, much more difficult :-( Instead of
a trivial patch, we got this.
Later, Juan.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking
2015-11-04 9:49 ` Juan Quintela
@ 2015-11-04 11:12 ` Denis V. Lunev
2015-11-04 12:03 ` Juan Quintela
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Denis V. Lunev
1 sibling, 1 reply; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-04 11:12 UTC (permalink / raw)
To: quintela, Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi
On 11/04/2015 12:49 PM, Juan Quintela wrote:
> void hmp_delvm(Monitor *mon, const QDict *qdict)
> {
> const char *name = qdict_get_str(qdict, "name");
>
> if (!bdrv_find_snapshot_bs()) {
> monitor_printf(mon, "No block device supports snapshots\n");
> return;
> }
>
> del_existing_snapshots(mon, name);
> }
>
> Yes, we have changed the semantics "slightly". The previous version of
> hmp_delvm() will try to remove all the snapshots from any device with
> that name. This one would remove them until it finds one error. I
> think that the code reuse and the consistence trumps the change in
> semantics (really the change is only on error cases).
I think you are wrong here. You cannot abort the operation if one
disk does not have a snapshot, given the following situation:
- VM has one disk
- snapshot XXX is made
- 2nd disk is added
- remove XXX snapshot
Your position is understood. I'll send yet another proof of concept
in an hour.
Den
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking
2015-11-04 11:12 ` Denis V. Lunev
@ 2015-11-04 12:03 ` Juan Quintela
2015-11-04 12:07 ` Denis V. Lunev
0 siblings, 1 reply; 24+ messages in thread
From: Juan Quintela @ 2015-11-04 12:03 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi
"Denis V. Lunev" <den-lists@parallels.com> wrote:
> On 11/04/2015 12:49 PM, Juan Quintela wrote:
>> void hmp_delvm(Monitor *mon, const QDict *qdict)
>> {
>> const char *name = qdict_get_str(qdict, "name");
>>
>> if (!bdrv_find_snapshot_bs()) {
>> monitor_printf(mon, "No block device supports snapshots\n");
>> return;
>> }
>>
>> del_existing_snapshots(mon, name);
>> }
>>
>> Yes, we have changed the semantics "slightly". The previous version of
>> hmp_delvm() will try to remove all the snapshots from any device with
>> that name. This one would remove them until it finds one error. I
>> think that the code reuse and the consistence trumps the change in
>> semantics (really the change is only on error cases).
>
> I think you are wrong here. You cannot abort the operation if one
> disk does not have a snapshot, given the following situation:
> - VM has one disk
> - snapshot XXX is made
> - 2nd disk is added
> - remove XXX snapshot
I think that my *completely* untested suggestion handled that well.
char *name bdrv_remove_snapshots(const char *name, Error *err)
{
BlockDriverState *bs;
QEMUSnapshotInfo sn1, *snapshot = &sn1;
bs = NULL;
while ((bs = bdrv_next(bs))) {
if (bdrv_can_snapshot(bs) &&
bdrv_snapshot_find(bs, snapshot, name) >= 0) {
bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
if (err) {
return bdrv_get_device_name(bs);
}
}
}
return NULL;
}
It only stops without removing a snapshot if there is an error
deleting one snapshot. Current code just tells that there is one error
and continues in the rest of the disks.
Notice that we are going to have problems on this operation, we have
found a disk with one snapshot with the name that we want to remove and
we have failed.
>
> Your position is understood. I'll send yet another proof of concept
> in an hour.
Thanks, Juan.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [RFC PATCH 1/1] dataplane: alternative approach to locking
2015-11-04 12:03 ` Juan Quintela
@ 2015-11-04 12:07 ` Denis V. Lunev
0 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-04 12:07 UTC (permalink / raw)
To: quintela; +Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi
On 11/04/2015 03:03 PM, Juan Quintela wrote:
> "Denis V. Lunev" <den-lists@parallels.com> wrote:
>> On 11/04/2015 12:49 PM, Juan Quintela wrote:
>>> void hmp_delvm(Monitor *mon, const QDict *qdict)
>>> {
>>> const char *name = qdict_get_str(qdict, "name");
>>>
>>> if (!bdrv_find_snapshot_bs()) {
>>> monitor_printf(mon, "No block device supports snapshots\n");
>>> return;
>>> }
>>>
>>> del_existing_snapshots(mon, name);
>>> }
>>>
>>> Yes, we have changed the semantics "slightly". The previous version of
>>> hmp_delvm() will try to remove all the snapshots from any device with
>>> that name. This one would remove them until it finds one error. I
>>> think that the code reuse and the consistence trumps the change in
>>> semantics (really the change is only on error cases).
>> I think you are wrong here. You cannot abort the operation if one
>> disk does not have a snapshot, given the following situation:
>> - VM has one disk
>> - snapshot XXX is made
>> - 2nd disk is added
>> - remove XXX snapshot
> I think that my *completely* untested suggestion handled that well.
>
> char *name bdrv_remove_snapshots(const char *name, Error *err)
> {
> BlockDriverState *bs;
> QEMUSnapshotInfo sn1, *snapshot = &sn1;
>
> bs = NULL;
> while ((bs = bdrv_next(bs))) {
> if (bdrv_can_snapshot(bs) &&
> bdrv_snapshot_find(bs, snapshot, name) >= 0) {
> bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
> if (err) {
> return bdrv_get_device_name(bs);
> }
> }
> }
> return NULL;
> }
>
> It only stops without removing a snapshot if there is an error
> deleting one snapshot. Current code just tells that there is one error
> and continues in the rest of the disks.
>
> Notice that we are going to have problems on this operation, we have
> found a disk with one snapshot with the name that we want to remove and
> we have failed.
>
>
>> Your position is understood. I'll send yet another proof of concept
>> in an hour.
> Thanks, Juan.
Yes, we should follow this way in both branches.
I like this and have done a similar thing in my RFC :)
Den
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots
2015-11-04 9:49 ` Juan Quintela
2015-11-04 11:12 ` Denis V. Lunev
@ 2015-11-04 11:31 ` Denis V. Lunev
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper Denis V. Lunev
` (2 more replies)
1 sibling, 3 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-04 11:31 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, Juan Quintela
The patch enforces proper locking for this operation.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
Patches are compile-tested only. Sent to check the approach, naming and
function placement. Functions return the bad BlockDriverState via a
parameter, to make a clear distinction for the case when I'll have to return a BS to write to.
block/snapshot.c | 35 +++++++++++++++++++++++++++++++++++
include/block/snapshot.h | 9 +++++++++
migration/savevm.c | 17 ++++-------------
3 files changed, 48 insertions(+), 13 deletions(-)
diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..6b5ce4e 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -25,6 +25,7 @@
#include "block/snapshot.h"
#include "block/block_int.h"
#include "qapi/qmp/qerror.h"
+#include "monitor/monitor.h"
QemuOptsList internal_snapshot_opts = {
.name = "snapshot",
@@ -356,3 +357,37 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
return ret;
}
+
+
+/* Group operations. All block drivers are involved.
+ * These functions will properly handle dataplane (take aio_context_acquire
+ * when appropriate for the appropriate block drivers).
+ *
+ * The returned block driver will always be locked.
+ */
+
+bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs)
+{
+    BlockDriverState *bs = NULL;
+
+ while ((bs = bdrv_next(bs))) {
+ bool ok;
+ AioContext *ctx = bdrv_get_aio_context(bs);
+
+ if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
+ continue;
+ }
+
+ aio_context_acquire(ctx);
+ ok = bdrv_can_snapshot(bs);
+ aio_context_release(ctx);
+
+ if (!ok) {
+ *first_bad_bs = bs;
+ return false;
+ }
+ }
+
+ *first_bad_bs = NULL;
+ return true;
+}
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index 770d9bb..61b4b5d 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -75,4 +75,13 @@ int bdrv_snapshot_load_tmp(BlockDriverState *bs,
int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
const char *id_or_name,
Error **errp);
+
+
+/* Group operations. All block drivers are involved.
+ * These functions will properly handle dataplane (take aio_context_acquire
+ * when appropriate for the appropriate block drivers).
+ *
+ */
+bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs);
+
#endif
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..91ba0bf 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1290,19 +1290,10 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
const char *name = qdict_get_try_str(qdict, "name");
Error *local_err = NULL;
- /* Verify if there is a device that doesn't support snapshots and is writable */
- bs = NULL;
- while ((bs = bdrv_next(bs))) {
-
- if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
- continue;
- }
-
- if (!bdrv_can_snapshot(bs)) {
- monitor_printf(mon, "Device '%s' is writable but does not support snapshots.\n",
- bdrv_get_device_name(bs));
- return;
- }
+    if (!bdrv_snapshot_all_can_snapshot(&bs)) {
+ monitor_printf(mon, "Device '%s' is writable but does not "
+ "support snapshots.\n", bdrv_get_device_name(bs));
+ return;
}
bs = find_vmstate_bs();
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Denis V. Lunev
@ 2015-11-04 11:31 ` Denis V. Lunev
2015-11-04 12:10 ` Juan Quintela
2015-11-04 12:07 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Juan Quintela
2015-11-04 13:52 ` Stefan Hajnoczi
2 siblings, 1 reply; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-04 11:31 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, Juan Quintela
to delete snapshots from all loaded block drivers.
The patch also ensures proper locking.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
---
block/snapshot.c | 27 ++++++++++++++++++++++++
include/block/snapshot.h | 2 ++
migration/savevm.c | 54 +++++++++---------------------------------------
3 files changed, 39 insertions(+), 44 deletions(-)
diff --git a/block/snapshot.c b/block/snapshot.c
index 6b5ce4e..9d1aa9b 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -391,3 +391,30 @@ bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs)
*first_bad_bs = NULL;
return true;
}
+
+int bdrv_snapshot_all_del_snapshot(const char *name,
+ BlockDriverState **first_bad_bs, Error **err)
+{
+ BlockDriverState *bs;
+ AioContext *ctx;
+ QEMUSnapshotInfo sn1, *snapshot = &sn1;
+
+ bs = NULL;
+ while ((bs = bdrv_next(bs))) {
+ ctx = bdrv_get_aio_context(bs);
+
+ aio_context_acquire(ctx);
+ if (bdrv_can_snapshot(bs) &&
+ bdrv_snapshot_find(bs, snapshot, name) >= 0) {
+ bdrv_snapshot_delete_by_id_or_name(bs, name, err);
+ }
+ aio_context_release(ctx);
+
+ if (*err) {
+ *first_bad_bs = bs;
+ return -1;
+ }
+ }
+
+ return 0;
+}
diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index 61b4b5d..4b883e5 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -84,4 +84,6 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
*/
bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs);
+int bdrv_snapshot_all_del_snapshot(const char *name,
+                                   BlockDriverState **first_bad_bs, Error **err);
#endif
diff --git a/migration/savevm.c b/migration/savevm.c
index 91ba0bf..4608811 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1248,35 +1248,6 @@ static BlockDriverState *find_vmstate_bs(void)
return NULL;
}
-/*
- * Deletes snapshots of a given name in all opened images.
- */
-static int del_existing_snapshots(Monitor *mon, const char *name)
-{
- BlockDriverState *bs;
- QEMUSnapshotInfo sn1, *snapshot = &sn1;
- Error *err = NULL;
-
- bs = NULL;
- while ((bs = bdrv_next(bs))) {
- if (bdrv_can_snapshot(bs) &&
- bdrv_snapshot_find(bs, snapshot, name) >= 0) {
- bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
- if (err) {
- monitor_printf(mon,
- "Error while deleting snapshot on device '%s':"
- " %s\n",
- bdrv_get_device_name(bs),
- error_get_pretty(err));
- error_free(err);
- return -1;
- }
- }
- }
-
- return 0;
-}
-
void hmp_savevm(Monitor *mon, const QDict *qdict)
{
BlockDriverState *bs, *bs1;
@@ -1334,7 +1305,11 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
}
/* Delete old snapshots of the same name */
- if (name && del_existing_snapshots(mon, name) < 0) {
+ if (name && bdrv_snapshot_all_del_snapshot(name, &bs1, &local_err) < 0) {
+ monitor_printf(mon,
+ "Error while deleting snapshot on device '%s': %s\n",
+ bdrv_get_device_name(bs1), error_get_pretty(local_err));
+ error_free(local_err);
goto the_end;
}
@@ -1494,20 +1469,11 @@ void hmp_delvm(Monitor *mon, const QDict *qdict)
return;
}
- bs = NULL;
- while ((bs = bdrv_next(bs))) {
- if (bdrv_can_snapshot(bs)) {
- err = NULL;
- bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
- if (err) {
- monitor_printf(mon,
- "Error while deleting snapshot on device '%s':"
- " %s\n",
- bdrv_get_device_name(bs),
- error_get_pretty(err));
- error_free(err);
- }
- }
+ if (bdrv_snapshot_all_del_snapshot(name, &bs, &err) < 0) {
+ monitor_printf(mon,
+ "Error while deleting snapshot on device '%s': %s\n",
+ bdrv_get_device_name(bs), error_get_pretty(err));
+ error_free(err);
}
}
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper Denis V. Lunev
@ 2015-11-04 12:10 ` Juan Quintela
0 siblings, 0 replies; 24+ messages in thread
From: Juan Quintela @ 2015-11-04 12:10 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi
"Denis V. Lunev" <den@openvz.org> wrote:
> to delete snapshots from all loaded block drivers.
>
> The patch also ensures proper locking.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
> block/snapshot.c | 27 ++++++++++++++++++++++++
> include/block/snapshot.h | 2 ++
> migration/savevm.c | 54 +++++++++---------------------------------------
> 3 files changed, 39 insertions(+), 44 deletions(-)
>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 6b5ce4e..9d1aa9b 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -391,3 +391,30 @@ bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs)
> *first_bad_bs = NULL;
> return true;
> }
> +
> +int bdrv_snapshot_all_del_snapshot(const char *name,
bdrv_snapshot_delete_all?
bdrv_snapshot_delete_all_snapshots?
Agreed with the patches.
Reviewed-by: Juan Quintela <quintela@redhat.com>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Denis V. Lunev
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper Denis V. Lunev
@ 2015-11-04 12:07 ` Juan Quintela
2015-11-04 13:50 ` Stefan Hajnoczi
2015-11-04 13:52 ` Stefan Hajnoczi
2 siblings, 1 reply; 24+ messages in thread
From: Juan Quintela @ 2015-11-04 12:07 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi
"Denis V. Lunev" <den@openvz.org> wrote:
> The patch enforces proper locking for this operation.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
> Patches are compile-tested only. Sent to check the approach, naming and
> functions placement. Functions are returning bad BlockDriver via
> parameter to make clear distinction when I'll have to return BS to write.
>
> block/snapshot.c | 35 +++++++++++++++++++++++++++++++++++
> include/block/snapshot.h | 9 +++++++++
> migration/savevm.c | 17 ++++-------------
> 3 files changed, 48 insertions(+), 13 deletions(-)
>
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 89500f2..6b5ce4e 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -25,6 +25,7 @@
> #include "block/snapshot.h"
> #include "block/block_int.h"
> #include "qapi/qmp/qerror.h"
> +#include "monitor/monitor.h"
You don't need the monitor here O:-)
>
> QemuOptsList internal_snapshot_opts = {
> .name = "snapshot",
> @@ -356,3 +357,37 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
>
> return ret;
> }
> +
> +
> +/* Group operations. All block drivers are involved.
> + * These functions will properly handle dataplane (take aio_context_acquire
> + * when appropriate for the appropriate block drivers).
> + *
> + * The returned block driver will always be locked.
> + */
> +
> +bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs)
bdrv_snapshot_is_possible???
> +{
> +    BlockDriverState *bs = NULL;
> +
> + while ((bs = bdrv_next(bs))) {
> + bool ok;
> + AioContext *ctx = bdrv_get_aio_context(bs);
> +
> + if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> + continue;
> + }
> +
> + aio_context_acquire(ctx);
I think that you should get the lock before the bdrv_is_inserted, but
who am I to know for sure O:-)
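I.e. roughly (untested sketch; whether bdrv_is_inserted()/bdrv_is_read_only()
really need the AioContext lock is exactly the open question):

    aio_context_acquire(ctx);
    if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
        ok = true;                /* such devices are skipped anyway */
    } else {
        ok = bdrv_can_snapshot(bs);
    }
    aio_context_release(ctx);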
> + ok = bdrv_can_snapshot(bs);
> + aio_context_release(ctx);
> +
> + if (!ok) {
> + *first_bad_bs = bs;
> + return false;
> + }
> + }
> +
> + *first_bad_bs = NULL;
> + return true;
> +}
> diff --git a/include/block/snapshot.h b/include/block/snapshot.h
> index 770d9bb..61b4b5d 100644
> --- a/include/block/snapshot.h
> +++ b/include/block/snapshot.h
> @@ -75,4 +75,13 @@ int bdrv_snapshot_load_tmp(BlockDriverState *bs,
> int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
> const char *id_or_name,
> Error **errp);
> +
> +
> +/* Group operations. All block drivers are involved.
> + * These functions will properly handle dataplane (take aio_context_acquire
> + * when appropriate for the appropriate block drivers).
> + *
> + */
> +bool bdrv_snapshot_all_can_snapshot(BlockDriverState **first_bad_bs);
> +
> #endif
> diff --git a/migration/savevm.c b/migration/savevm.c
> index dbcc39a..91ba0bf 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1290,19 +1290,10 @@ void hmp_savevm(Monitor *mon, const QDict *qdict)
> const char *name = qdict_get_try_str(qdict, "name");
> Error *local_err = NULL;
>
> - /* Verify if there is a device that doesn't support snapshots and is writable */
> - bs = NULL;
> - while ((bs = bdrv_next(bs))) {
> -
> - if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) {
> - continue;
> - }
> -
> - if (!bdrv_can_snapshot(bs)) {
> - monitor_printf(mon, "Device '%s' is writable but does not support snapshots.\n",
> - bdrv_get_device_name(bs));
> - return;
> - }
> +    if (!bdrv_snapshot_all_can_snapshot(&bs)) {
> + monitor_printf(mon, "Device '%s' is writable but does not "
> + "support snapshots.\n", bdrv_get_device_name(bs));
> + return;
> }
>
> bs = find_vmstate_bs();
ok with the savevm.c changes.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Denis V. Lunev
2015-11-04 11:31 ` [Qemu-devel] [PATCH RFC 2/2] snapshot: create bdrv_snapshot_all_del_snapshot helper Denis V. Lunev
2015-11-04 12:07 ` [Qemu-devel] [PATCH RFC 1/2] snapshot: create helper to test that block drivers supports snapshots Juan Quintela
@ 2015-11-04 13:52 ` Stefan Hajnoczi
2 siblings, 0 replies; 24+ messages in thread
From: Stefan Hajnoczi @ 2015-11-04 13:52 UTC (permalink / raw)
To: Denis V. Lunev; +Cc: qemu-devel, Stefan Hajnoczi, Juan Quintela
On Wed, Nov 04, 2015 at 02:31:55PM +0300, Denis V. Lunev wrote:
> The patch enforces proper locking for this operation.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> ---
> Patches are compile-tested only. Sent to check the approach, naming and
> functions placement. Functions are returning bad BlockDriver via
> parameter to make clear distinction when I'll have to return BS to write.
>
> block/snapshot.c | 35 +++++++++++++++++++++++++++++++++++
> include/block/snapshot.h | 9 +++++++++
> migration/savevm.c | 17 ++++-------------
> 3 files changed, 48 insertions(+), 13 deletions(-)
Kevin is the maintainer of block/snapshot.c. Please include him in
future revisions.
Looks fine to me, besides the comments that Juan already raised.
Stefan
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 04/10] blockdev: acquire AioContext in hmp_commit()
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (2 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 03/10] migration: added missed aio_context_acquire around bdrv_snapshot_delete Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:38 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 05/10] block: guard bdrv_drain in bdrv_close with aio_context_acquire Denis V. Lunev
` (5 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, qemu-stable
From: Stefan Hajnoczi <stefanha@redhat.com>
This one slipped through. Although we acquire AioContext when
committing all devices we don't for just a single device.
AioContext must be acquired before calling bdrv_*() functions to
synchronize access with other threads that may be using the AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
---
blockdev.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/blockdev.c b/blockdev.c
index 18712d2..d611779 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1120,6 +1120,9 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
if (!strcmp(device, "all")) {
ret = bdrv_commit_all();
} else {
+ BlockDriverState *bs;
+ AioContext *aio_context;
+
blk = blk_by_name(device);
if (!blk) {
monitor_printf(mon, "Device '%s' not found\n", device);
@@ -1129,7 +1132,14 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "Device '%s' has no medium\n", device);
return;
}
- ret = bdrv_commit(blk_bs(blk));
+
+ bs = blk_bs(blk);
+ aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
+
+ ret = bdrv_commit(bs);
+
+ aio_context_release(aio_context);
}
if (ret < 0) {
monitor_printf(mon, "'commit' error for '%s': %s\n", device,
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [Qemu-devel] [PATCH 04/10] blockdev: acquire AioContext in hmp_commit()
2015-11-03 14:12 ` [Qemu-devel] [PATCH 04/10] blockdev: acquire AioContext in hmp_commit() Denis V. Lunev
@ 2015-11-03 14:38 ` Denis V. Lunev
0 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:38 UTC (permalink / raw)
Cc: Jeff Cody, qemu-devel, Stefan Hajnoczi, qemu-stable
On 11/03/2015 05:12 PM, Denis V. Lunev wrote:
> From: Stefan Hajnoczi <stefanha@redhat.com>
>
> This one slipped through. Although we acquire AioContext when
> committing all devices we don't for just a single device.
>
> AioContext must be acquired before calling bdrv_*() functions to
> synchronize access with other threads that may be using the AioContext.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
this was Reviewed-by: Jeff Cody <jcody@redhat.com> in
the original submission.
Lost that accidentally.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 05/10] block: guard bdrv_drain in bdrv_close with aio_context_acquire
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (3 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 04/10] blockdev: acquire AioContext in hmp_commit() Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 06/10] io: guard aio_poll " Denis V. Lunev
` (4 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, qemu-stable
bdrv_close is called in too many places to track properly at the moment.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
block.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block.c b/block.c
index e9f40dc..98b0b66 100644
--- a/block.c
+++ b/block.c
@@ -1895,6 +1895,7 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
void bdrv_close(BlockDriverState *bs)
{
BdrvAioNotifier *ban, *ban_next;
+ AioContext *ctx;
if (bs->job) {
block_job_cancel_sync(bs->job);
@@ -1905,9 +1906,13 @@ void bdrv_close(BlockDriverState *bs)
bdrv_io_limits_disable(bs);
}
+ ctx = bdrv_get_aio_context(bs);
+ aio_context_acquire(ctx);
bdrv_drain(bs); /* complete I/O */
bdrv_flush(bs);
bdrv_drain(bs); /* in case flush left pending I/O */
+ aio_context_release(ctx);
+
notifier_list_notify(&bs->close_notifiers, bs);
if (bs->blk) {
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 06/10] io: guard aio_poll with aio_context_acquire
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (4 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 05/10] block: guard bdrv_drain in bdrv_close with aio_context_acquire Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 07/10] block: call aio_context_acquire in qemu_img/nbd/io Denis V. Lunev
` (3 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Stefan Hajnoczi,
qemu-stable
There is no problem if this is called from the iothread, where the AioContext is
properly acquired. Unfortunately, this code is also called from the HMP thread,
and this leads to a disaster.
HMP thread                              IO thread (in aio_poll)
    |                                       |
qemu_coroutine_enter                        |
while (rwco.ret == NOT_DONE)                |
    aio_poll                                |
        aio_context_acquire                 |
    |                                   ret from qemu_poll_ns
    |                                   aio_context_acquire (nested = 2)
    |                                   process bdrv_rw_co_entry, set rwco.ret
    |                                   aio_context_release (nested = 1)
    |                                   reenters aio_poll, clear events
    |                                   aio_context_release
        aio_context_release
        qemu_poll_ns
In this case the HMP thread will never be woken up. Alas.
This means that all such patterns MUST be guarded with aio_context_is_owner
checks, but this is terrible; whereas if we find all such places, we can fix
them with ease.
Another approach would be to take the lock at the very top (at the beginning
of the operation) but this is much more difficult and leads to spreading
of aio_context_acquire to a lot of unrelated pieces.
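The hunks below therefore take the AioContext around the whole
coroutine-enter + aio_poll() loop; the lock is recursive, so the nested
acquire inside the handlers is fine.  Schematically (this mirrors the
bdrv_prwv_co() hunk below):

    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);
    co = qemu_coroutine_create(bdrv_rw_co_entry);
    qemu_coroutine_enter(co, &rwco);
    while (rwco.ret == NOT_DONE) {
        aio_poll(ctx, true);
    }
    aio_context_release(ctx);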
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
---
block.c | 5 ++++-
block/curl.c | 3 +++
block/io.c | 11 +++++++++++
block/iscsi.c | 2 ++
block/nfs.c | 5 +++++
block/qed-table.c | 20 ++++++++++++++++----
block/sheepdog.c | 2 ++
blockjob.c | 6 ++++++
qemu-io-cmds.c | 6 +++++-
9 files changed, 54 insertions(+), 6 deletions(-)
diff --git a/block.c b/block.c
index 98b0b66..cf858a7 100644
--- a/block.c
+++ b/block.c
@@ -359,11 +359,14 @@ int bdrv_create(BlockDriver *drv, const char* filename,
/* Fast-path if already in coroutine context */
bdrv_create_co_entry(&cco);
} else {
+ AioContext *ctx = qemu_get_aio_context();
co = qemu_coroutine_create(bdrv_create_co_entry);
+ aio_context_acquire(ctx);
qemu_coroutine_enter(co, &cco);
while (cco.ret == NOT_DONE) {
- aio_poll(qemu_get_aio_context(), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
}
ret = cco.ret;
diff --git a/block/curl.c b/block/curl.c
index 8994182..33c024d 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -378,6 +378,7 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
{
CURLState *state = NULL;
int i, j;
+ AioContext *ctx = bdrv_get_aio_context(bs);
do {
for (i=0; i<CURL_NUM_STATES; i++) {
@@ -392,7 +393,9 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
break;
}
if (!state) {
+ aio_context_acquire(ctx);
aio_poll(bdrv_get_aio_context(bs), true);
+ aio_context_release(ctx);
}
} while(!state);
diff --git a/block/io.c b/block/io.c
index 8dcad3b..05aa32e 100644
--- a/block/io.c
+++ b/block/io.c
@@ -560,11 +560,13 @@ static int bdrv_prwv_co(BlockDriverState *bs, int64_t offset,
} else {
AioContext *aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
co = qemu_coroutine_create(bdrv_rw_co_entry);
qemu_coroutine_enter(co, &rwco);
while (rwco.ret == NOT_DONE) {
aio_poll(aio_context, true);
}
+ aio_context_release(aio_context);
}
return rwco.ret;
}
@@ -1606,12 +1608,15 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
bdrv_get_block_status_above_co_entry(&data);
} else {
AioContext *aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
co = qemu_coroutine_create(bdrv_get_block_status_above_co_entry);
qemu_coroutine_enter(co, &data);
+
while (!data.done) {
aio_poll(aio_context, true);
}
+ aio_context_release(aio_context);
}
return data.ret;
}
@@ -2391,12 +2396,15 @@ int bdrv_flush(BlockDriverState *bs)
bdrv_flush_co_entry(&rwco);
} else {
AioContext *aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
co = qemu_coroutine_create(bdrv_flush_co_entry);
qemu_coroutine_enter(co, &rwco);
+
while (rwco.ret == NOT_DONE) {
aio_poll(aio_context, true);
}
+ aio_context_release(aio_context);
}
return rwco.ret;
@@ -2504,12 +2512,15 @@ int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
bdrv_discard_co_entry(&rwco);
} else {
AioContext *aio_context = bdrv_get_aio_context(bs);
+ aio_context_acquire(aio_context);
co = qemu_coroutine_create(bdrv_discard_co_entry);
qemu_coroutine_enter(co, &rwco);
+
while (rwco.ret == NOT_DONE) {
aio_poll(aio_context, true);
}
+ aio_context_release(aio_context);
}
return rwco.ret;
diff --git a/block/iscsi.c b/block/iscsi.c
index 9a628b7..1d6200d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -829,11 +829,13 @@ static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
break;
case SG_IO:
status = -EINPROGRESS;
+ aio_context_acquire(iscsilun->aio_context);
iscsi_aio_ioctl(bs, req, buf, ioctl_cb, &status);
while (status == -EINPROGRESS) {
aio_poll(iscsilun->aio_context, true);
}
+ aio_context_release(iscsilun->aio_context);
return 0;
default:
diff --git a/block/nfs.c b/block/nfs.c
index fd79f89..36ec1e1 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -462,6 +462,7 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
NFSClient *client = bs->opaque;
NFSRPC task = {0};
struct stat st;
+ AioContext *ctx;
if (bdrv_is_read_only(bs) &&
!(bs->open_flags & BDRV_O_NOCACHE)) {
@@ -469,8 +470,11 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
}
task.st = &st;
+ ctx = bdrv_get_aio_context(bs);
+ aio_context_acquire(ctx);
if (nfs_fstat_async(client->context, client->fh, nfs_co_generic_cb,
&task) != 0) {
+ aio_context_release(ctx);
return -ENOMEM;
}
@@ -478,6 +482,7 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
nfs_set_events(client);
aio_poll(client->aio_context, true);
}
+ aio_context_release(ctx);
return (task.ret < 0 ? task.ret : st.st_blocks * 512);
}
diff --git a/block/qed-table.c b/block/qed-table.c
index f4219b8..fa13aba 100644
--- a/block/qed-table.c
+++ b/block/qed-table.c
@@ -169,12 +169,15 @@ static void qed_sync_cb(void *opaque, int ret)
int qed_read_l1_table_sync(BDRVQEDState *s)
{
int ret = -EINPROGRESS;
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
qed_read_table(s, s->header.l1_table_offset,
s->l1_table, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
- aio_poll(bdrv_get_aio_context(s->bs), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
return ret;
}
@@ -191,11 +194,14 @@ int qed_write_l1_table_sync(BDRVQEDState *s, unsigned int index,
unsigned int n)
{
int ret = -EINPROGRESS;
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
qed_write_l1_table(s, index, n, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
- aio_poll(bdrv_get_aio_context(s->bs), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
return ret;
}
@@ -264,11 +270,14 @@ void qed_read_l2_table(BDRVQEDState *s, QEDRequest *request, uint64_t offset,
int qed_read_l2_table_sync(BDRVQEDState *s, QEDRequest *request, uint64_t offset)
{
int ret = -EINPROGRESS;
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
qed_read_l2_table(s, request, offset, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
- aio_poll(bdrv_get_aio_context(s->bs), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
return ret;
}
@@ -286,11 +295,14 @@ int qed_write_l2_table_sync(BDRVQEDState *s, QEDRequest *request,
unsigned int index, unsigned int n, bool flush)
{
int ret = -EINPROGRESS;
+ AioContext *ctx = bdrv_get_aio_context(s->bs);
+ aio_context_acquire(ctx);
qed_write_l2_table(s, request, index, n, flush, qed_sync_cb, &ret);
while (ret == -EINPROGRESS) {
- aio_poll(bdrv_get_aio_context(s->bs), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
return ret;
}
diff --git a/block/sheepdog.c b/block/sheepdog.c
index d80e4ed..038a385 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -715,11 +715,13 @@ static int do_req(int sockfd, AioContext *aio_context, SheepdogReq *hdr,
if (qemu_in_coroutine()) {
do_co_req(&srco);
} else {
+ aio_context_acquire(aio_context);
co = qemu_coroutine_create(do_co_req);
qemu_coroutine_enter(co, &srco);
while (!srco.finished) {
aio_poll(aio_context, true);
}
+ aio_context_release(aio_context);
}
return srco.ret;
diff --git a/blockjob.c b/blockjob.c
index c02fe59..9ddb958 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -194,6 +194,7 @@ static int block_job_finish_sync(BlockJob *job,
struct BlockFinishData data;
BlockDriverState *bs = job->bs;
Error *local_err = NULL;
+ AioContext *ctx;
assert(bs->job == job);
@@ -206,14 +207,19 @@ static int block_job_finish_sync(BlockJob *job,
data.ret = -EINPROGRESS;
job->cb = block_job_finish_cb;
job->opaque = &data;
+
+ ctx = bdrv_get_aio_context(bs);
+ aio_context_acquire(ctx);
finish(job, &local_err);
if (local_err) {
+ aio_context_release(ctx);
error_propagate(errp, local_err);
return -EBUSY;
}
while (data.ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(bs), true);
}
+ aio_context_release(ctx);
return (data.cancelled && data.ret == 0) ? -ECANCELED : data.ret;
}
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 6e5d1e4..45299cd 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -474,12 +474,16 @@ static int do_co_write_zeroes(BlockBackend *blk, int64_t offset, int count,
.total = total,
.done = false,
};
+ AioContext *ctx = blk_get_aio_context(blk);
+ aio_context_acquire(ctx);
co = qemu_coroutine_create(co_write_zeroes_entry);
qemu_coroutine_enter(co, &data);
while (!data.done) {
- aio_poll(blk_get_aio_context(blk), true);
+ aio_poll(ctx, true);
}
+ aio_context_release(ctx);
+
if (data.ret < 0) {
return data.ret;
} else {
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 07/10] block: call aio_context_acquire in qemu_img/nbd/io
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (5 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 06/10] io: guard aio_poll " Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 08/10] fifolock: create rfifolock_is_owner helper Denis V. Lunev
` (2 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, qemu-devel, Stefan Hajnoczi, qemu-stable
This is harmless now and will become mandatory in a couple of patches.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
---
qemu-img.c | 2 ++
qemu-io.c | 1 +
qemu-nbd.c | 1 +
3 files changed, 4 insertions(+)
diff --git a/qemu-img.c b/qemu-img.c
index 3025776..a59dd87 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3083,6 +3083,8 @@ int main(int argc, char **argv)
}
cmdname = argv[1];
+ aio_context_acquire(qemu_get_aio_context());
+
/* find the command */
for (cmd = img_cmds; cmd->name != NULL; cmd++) {
if (!strcmp(cmdname, cmd->name)) {
diff --git a/qemu-io.c b/qemu-io.c
index 269f17c..96f381b 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -465,6 +465,7 @@ int main(int argc, char **argv)
error_report_err(local_error);
exit(1);
}
+ aio_context_acquire(qemu_get_aio_context());
/* initialize commands */
qemuio_add_command(&quit_cmd);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index 422a607..dd81d0b 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -670,6 +670,7 @@ int main(int argc, char **argv)
exit(EXIT_FAILURE);
}
bdrv_init();
+ aio_context_acquire(qemu_get_aio_context());
atexit(bdrv_close_all);
if (fmt) {
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 08/10] fifolock: create rfifolock_is_owner helper
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (6 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 07/10] block: call aio_context_acquire in qemu_img/nbd/io Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 09/10] aio_context: create aio_context_is_owner helper Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 10/10] aio: change aio_poll constraints Denis V. Lunev
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi,
qemu-stable
This helper is needed to enforce locking constraints.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
include/qemu/rfifolock.h | 1 +
util/rfifolock.c | 12 +++++++++---
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/include/qemu/rfifolock.h b/include/qemu/rfifolock.h
index b23ab53..1148cb0 100644
--- a/include/qemu/rfifolock.h
+++ b/include/qemu/rfifolock.h
@@ -50,5 +50,6 @@ void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque);
void rfifolock_destroy(RFifoLock *r);
void rfifolock_lock(RFifoLock *r);
void rfifolock_unlock(RFifoLock *r);
+bool rfifolock_is_owner(RFifoLock *r);
#endif /* QEMU_RFIFOLOCK_H */
diff --git a/util/rfifolock.c b/util/rfifolock.c
index afbf748..4533617 100644
--- a/util/rfifolock.c
+++ b/util/rfifolock.c
@@ -12,6 +12,7 @@
*/
#include <assert.h>
+#include <string.h>
#include "qemu/rfifolock.h"
void rfifolock_init(RFifoLock *r, void (*cb)(void *), void *opaque)
@@ -48,7 +49,7 @@ void rfifolock_lock(RFifoLock *r)
/* Take a ticket */
unsigned int ticket = r->tail++;
- if (r->nesting > 0 && qemu_thread_is_self(&r->owner_thread)) {
+ if (rfifolock_is_owner(r)) {
r->tail--; /* put ticket back, we're nesting */
} else {
while (ticket != r->head) {
@@ -68,11 +69,16 @@ void rfifolock_lock(RFifoLock *r)
void rfifolock_unlock(RFifoLock *r)
{
qemu_mutex_lock(&r->lock);
- assert(r->nesting > 0);
- assert(qemu_thread_is_self(&r->owner_thread));
+ assert(rfifolock_is_owner(r));
if (--r->nesting == 0) {
+ memset(&r->owner_thread, 0, sizeof(r->owner_thread));
r->head++;
qemu_cond_broadcast(&r->cond);
}
qemu_mutex_unlock(&r->lock);
}
+
+bool rfifolock_is_owner(RFifoLock *r)
+{
+ return r->nesting > 0 && qemu_thread_is_self(&r->owner_thread);
+}
--
2.5.0
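A minimal usage sketch of the new helper, purely illustrative and not part of
the patch (it assumes a build inside the QEMU tree, where "qemu/rfifolock.h"
and the thread primitives it relies on are available):

    #include <assert.h>
    #include <stdio.h>
    #include "qemu/rfifolock.h"

    static RFifoLock lock;

    /* Insist on being called with the lock held, in the same spirit as the
     * assert(rfifolock_is_owner(r)) now used in rfifolock_unlock(). */
    static void do_locked_work(void)
    {
        assert(rfifolock_is_owner(&lock));
        printf("working under the lock\n");
    }

    int main(void)
    {
        rfifolock_init(&lock, NULL, NULL);

        rfifolock_lock(&lock);
        rfifolock_lock(&lock);              /* recursive locking is allowed */
        do_locked_work();
        rfifolock_unlock(&lock);
        rfifolock_unlock(&lock);

        assert(!rfifolock_is_owner(&lock)); /* fully released again */
        rfifolock_destroy(&lock);
        return 0;
    }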
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 09/10] aio_context: create aio_context_is_owner helper
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (7 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 08/10] fifolock: create rfifolock_is_owner helper Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
2015-11-03 14:12 ` [Qemu-devel] [PATCH 10/10] aio: change aio_poll constraints Denis V. Lunev
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Denis V. Lunev, Paolo Bonzini, qemu-devel, Stefan Hajnoczi,
qemu-stable
This helper is needed to enforce locking constraints.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
---
async.c | 5 +++++
include/block/aio.h | 3 +++
2 files changed, 8 insertions(+)
diff --git a/async.c b/async.c
index bdc64a3..1d18d98 100644
--- a/async.c
+++ b/async.c
@@ -361,3 +361,8 @@ void aio_context_release(AioContext *ctx)
{
rfifolock_unlock(&ctx->lock);
}
+
+bool aio_context_is_owner(AioContext *ctx)
+{
+ return rfifolock_is_owner(&ctx->lock);
+}
diff --git a/include/block/aio.h b/include/block/aio.h
index bcc7d43..d8cd41a 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -166,6 +166,9 @@ void aio_context_acquire(AioContext *ctx);
/* Relinquish ownership of the AioContext. */
void aio_context_release(AioContext *ctx);
+/* Check that AioContext is owned by the current thread. */
+bool aio_context_is_owner(AioContext *ctx);
+
/**
* aio_bh_new: Allocate a new bottom half structure.
*
--
2.5.0
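Again purely as an illustration (not part of the patch), a synchronous helper
could use the new check to document its locking requirement; sync_poll_until()
and its done flag are made-up names, and the fragment builds only inside the
QEMU tree:

    #include <assert.h>
    #include <stdbool.h>
    #include "block/aio.h"

    /* Poll ctx until *done becomes true.  The caller must already own ctx,
     * which is the constraint the last patch of this series turns into an
     * assertion inside aio_poll() itself. */
    static void sync_poll_until(AioContext *ctx, const bool *done)
    {
        assert(aio_context_is_owner(ctx));
        while (!*done) {
            aio_poll(ctx, true);
        }
    }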
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [Qemu-devel] [PATCH 10/10] aio: change aio_poll constraints
2015-11-03 14:12 [Qemu-devel] [PATCH QEMU 2.5 v4 0/10] dataplane snapshot fixes + aio_poll fixes Denis V. Lunev
` (8 preceding siblings ...)
2015-11-03 14:12 ` [Qemu-devel] [PATCH 09/10] aio_context: create aio_context_is_owner helper Denis V. Lunev
@ 2015-11-03 14:12 ` Denis V. Lunev
9 siblings, 0 replies; 24+ messages in thread
From: Denis V. Lunev @ 2015-11-03 14:12 UTC (permalink / raw)
Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Stefan Hajnoczi,
qemu-stable
There are two versions of aio_poll: blocking and non-blocking.
The non-blocking version is currently called from three places:
- iothread_run
- bdrv_drain
- bdrv_drain_all
iothread_run and bdrv_drain_all properly acquire the AioContext on their own.
bdrv_drain (according to its description) MUST be called with the context
already acquired. This is perfect.
The blocking version of aio_poll is mostly called using the following pattern:
    AioContext *aio_context = bdrv_get_aio_context(bs);
    co = qemu_coroutine_create(bdrv_rw_co_entry);
    qemu_coroutine_enter(co, &rwco);
    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }
There is no problem if this is called from the iothread, where the AioContext
is properly acquired. Unfortunately, this code is also called from the HMP
thread, and that leads to a disaster.
HMP thread                               IO thread (in aio_poll)
    |                                        |
qemu_coroutine_enter                         |
while (rwco.ret == NOT_DONE)                 |
    aio_poll                                 |
        aio_context_acquire                  |
    |                                    ret from qemu_poll_ns
    |                                    aio_context_acquire (nested = 2)
    |                                    process bdrv_rw_co_entry, set rwco.ret
    |                                    aio_context_release (nested = 1)
    |                                    reenters aio_poll, clear events
    |                                    aio_context_release
        aio_context_release
    qemu_poll_ns
In this case the HMP thread will never be woken up. Alas.
This means that all such patterns would have to be guarded with
aio_context_is_owner checks, which is terrible; besides, once we have found
all such places we can just as easily fix them properly.
This patch proposes a different solution: aio_poll MUST be called with the
AioContext acquired. Non-blocking callers are already fine; blocking callers
MUST be guarded anyway to avoid the deadlock above.
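Concretely, under the new rule the pattern quoted above has to be written
roughly as follows (a sketch only, reusing the bdrv_rw_co_entry/rwco/NOT_DONE
names from the snippet above rather than introducing new ones):

    AioContext *aio_context = bdrv_get_aio_context(bs);
    Coroutine *co;

    aio_context_acquire(aio_context);       /* own the context up front */
    co = qemu_coroutine_create(bdrv_rw_co_entry);
    qemu_coroutine_enter(co, &rwco);
    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);        /* satisfies the new assertion */
    }
    aio_context_release(aio_context);

This is the same shape the earlier patches in this series give to the
existing blocking callers.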
Another approach would be to take the lock at the very top (at the beginning
of the operation), but this is much more difficult and would spread
aio_context_acquire across a lot of unrelated pieces.
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
---
aio-posix.c | 11 +----------
aio-win32.c | 9 +--------
include/block/aio.h | 2 ++
tests/test-aio.c | 11 +++++++++++
tests/test-thread-pool.c | 15 +++++++++++++++
5 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/aio-posix.c b/aio-posix.c
index 0467f23..735d272 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -241,7 +241,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
bool progress;
int64_t timeout;
- aio_context_acquire(ctx);
+ assert(aio_context_is_owner(ctx));
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
@@ -269,17 +269,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
timeout = blocking ? aio_compute_timeout(ctx) : 0;
- /* wait until next event */
- if (timeout) {
- aio_context_release(ctx);
- }
ret = qemu_poll_ns((GPollFD *)pollfds, npfd, timeout);
if (blocking) {
atomic_sub(&ctx->notify_me, 2);
}
- if (timeout) {
- aio_context_acquire(ctx);
- }
aio_notify_accept(ctx);
@@ -298,7 +291,5 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress = true;
}
- aio_context_release(ctx);
-
return progress;
}
diff --git a/aio-win32.c b/aio-win32.c
index 43c4c79..ce45b98 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -288,7 +288,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
int count;
int timeout;
- aio_context_acquire(ctx);
+ assert(aio_context_is_owner(ctx));
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
@@ -331,17 +331,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
timeout = blocking && !have_select_revents
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
- if (timeout) {
- aio_context_release(ctx);
- }
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
if (blocking) {
assert(first);
atomic_sub(&ctx->notify_me, 2);
}
- if (timeout) {
- aio_context_acquire(ctx);
- }
if (first) {
aio_notify_accept(ctx);
@@ -366,6 +360,5 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress |= timerlistgroup_run_timers(&ctx->tlg);
- aio_context_release(ctx);
return progress;
}
diff --git a/include/block/aio.h b/include/block/aio.h
index d8cd41a..c8dc7ea 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -292,6 +292,8 @@ bool aio_dispatch(AioContext *ctx);
* blocking. If @blocking is true, this function will wait until one
* or more AIO events have completed, to ensure something has moved
* before returning.
+ *
+ * The caller must own the AioContext.
*/
bool aio_poll(AioContext *ctx, bool blocking);
diff --git a/tests/test-aio.c b/tests/test-aio.c
index 1623803..87b2dfd 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -16,6 +16,17 @@
#include "qemu/sockets.h"
#include "qemu/error-report.h"
+static int aio_poll_debug(AioContext *ctx, bool blocking)
+{
+ int ret;
+ aio_context_acquire(ctx);
+ ret = aio_poll(ctx, blocking);
+ aio_context_release(ctx);
+
+ return ret;
+}
+#define aio_poll(ctx, blocking) aio_poll_debug(ctx, blocking)
+
static AioContext *ctx;
typedef struct {
diff --git a/tests/test-thread-pool.c b/tests/test-thread-pool.c
index 6a0b981..3180335 100644
--- a/tests/test-thread-pool.c
+++ b/tests/test-thread-pool.c
@@ -6,6 +6,17 @@
#include "qemu/timer.h"
#include "qemu/error-report.h"
+static int aio_poll_debug(AioContext *ctx, bool blocking)
+{
+ int ret;
+ aio_context_acquire(ctx);
+ ret = aio_poll(ctx, blocking);
+ aio_context_release(ctx);
+
+ return ret;
+}
+#define aio_poll(ctx, blocking) aio_poll_debug(ctx, blocking)
+
static AioContext *ctx;
static ThreadPool *pool;
static int active;
@@ -172,7 +183,9 @@ static void do_test_cancel(bool sync)
if (atomic_cmpxchg(&data[i].n, 0, 3) == 0) {
data[i].ret = -ECANCELED;
if (sync) {
+ aio_context_acquire(ctx);
bdrv_aio_cancel(data[i].aiocb);
+ aio_context_release(ctx);
} else {
bdrv_aio_cancel_async(data[i].aiocb);
}
@@ -186,7 +199,9 @@ static void do_test_cancel(bool sync)
if (data[i].aiocb && data[i].n != 3) {
if (sync) {
/* Canceling the others will be a blocking operation. */
+ aio_context_acquire(ctx);
bdrv_aio_cancel(data[i].aiocb);
+ aio_context_release(ctx);
} else {
bdrv_aio_cancel_async(data[i].aiocb);
}
--
2.5.0
^ permalink raw reply related [flat|nested] 24+ messages in thread