From: "Denis V. Lunev" <den@openvz.org>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
qemu-stable@nongnu.org, qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@redhat.com>,
Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
Date: Tue, 27 Oct 2015 21:23:09 +0300
Message-ID: <562FC10D.7040404@openvz.org>
In-Reply-To: <562FBE8F.7040309@redhat.com>
On 10/27/2015 09:12 PM, Paolo Bonzini wrote:
>
> On 27/10/2015 15:09, Denis V. Lunev wrote:
>> aio_context should be locked in the same way as is done in QMP
>> snapshot creation; otherwise there are a lot of possible
>> problems if native AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>> actually complete)
>> io_submit
>> ioq_submit
>> laio_submit
>> raw_aio_submit
>> raw_aio_readv
>> bdrv_co_io_em
>> bdrv_co_readv_em
>> bdrv_aligned_preadv
>> bdrv_co_do_preadv
>> bdrv_co_do_readv
>> bdrv_co_readv
>> qcow2_co_readv
>> bdrv_aligned_preadv
>> bdrv_co_do_pwritev
>> bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>> __GI_abort
>> qemu_coroutine_enter
>> bdrv_co_io_em_complete
>> qemu_laio_process_completion
>> qemu_laio_completion_bh
>> aio_bh_poll
>> aio_dispatch
>> aio_poll
>> iothread_run
>>
>> AioContext lock is recursive. Thus nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Amit Shah <amit.shah@redhat.com>
>> ---
>> block/snapshot.c | 5 +++++
>> migration/savevm.c | 7 +++++++
>> 2 files changed, 12 insertions(+)
>>
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index 89500f2..f6fa17a 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>> {
>> int ret;
>> Error *local_err = NULL;
>> + AioContext *aio_context = bdrv_get_aio_context(bs);
>> +
>> + aio_context_acquire(aio_context);
>>
>> ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>> if (ret == -ENOENT || ret == -EINVAL) {
>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>> ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>> }
>>
>> + aio_context_release(aio_context);
> Why here and not in hmp_delvm, for consistency?
>
> The call from hmp_savevm is already protected.
>
> Thanks for fixing the bug!
>
> Paolo
The situation is more complicated. There are several disks in the VM.
One disk is used for saving the VM state (protected in savevm),
and several other disks are touched via
    static int del_existing_snapshots(Monitor *mon, const char *name)
    {
        ...
        while ((bs = bdrv_next(bs))) {
            if (bdrv_can_snapshot(bs) &&
                bdrv_snapshot_find(bs, snapshot, name) >= 0) {
                bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
            }
        }
        ...
    }
in savevm, and via similar-looking code in delvm, where a similar loop
is implemented differently.
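
To illustrate why the lock is taken inside the per-disk helper, here is a
minimal standalone sketch. It is not QEMU code: ModelContext, ModelDisk and
all model_* functions are hypothetical stand-ins, and a POSIX recursive
mutex is assumed as a rough model of the recursive AioContext lock. The
caller holds only the context of the state disk, the helper acquires each
disk's own context, and the nested acquire on the state disk is harmless
because the lock is recursive.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an AioContext: a recursive lock plus
 * counters that make the nesting depth observable. */
typedef struct ModelContext {
    pthread_mutex_t lock;
    int depth;      /* nesting depth currently held by the owner */
    int max_depth;  /* deepest nesting seen so far */
} ModelContext;

/* Hypothetical stand-in for a disk (BlockDriverState in the thread). */
typedef struct ModelDisk {
    ModelContext *ctx;
    int has_snapshot;
} ModelDisk;

/* Initialise the context with a recursive mutex; returns 0 on success. */
static int model_context_init(ModelContext *ctx)
{
    pthread_mutexattr_t attr;
    int ret;

    ctx->depth = 0;
    ctx->max_depth = 0;
    ret = pthread_mutexattr_init(&attr);
    if (ret != 0) {
        return ret;
    }
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (ret == 0) {
        ret = pthread_mutex_init(&ctx->lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return ret;
}

static void model_context_acquire(ModelContext *ctx)
{
    pthread_mutex_lock(&ctx->lock);
    ctx->depth++;
    if (ctx->depth > ctx->max_depth) {
        ctx->max_depth = ctx->depth;
    }
}

static void model_context_release(ModelContext *ctx)
{
    assert(ctx->depth > 0);
    ctx->depth--;
    pthread_mutex_unlock(&ctx->lock);
}

/* Models the patched helper: it takes the disk's own lock, so it is
 * safe whether or not the caller already holds that lock. */
static void model_snapshot_delete(ModelDisk *disk)
{
    model_context_acquire(disk->ctx);
    disk->has_snapshot = 0;
    model_context_release(disk->ctx);
}

/* Models the loop over all disks; returns the number of deletions. */
static int model_del_existing_snapshots(ModelDisk *disks, size_t n)
{
    int deleted = 0;

    for (size_t i = 0; i < n; i++) {
        if (disks[i].has_snapshot) {
            model_snapshot_delete(&disks[i]);
            deleted++;
        }
    }
    return deleted;
}
```

If the caller acquires the state disk's context and then runs the loop over
two disks, the state disk's lock is taken twice (nested) and the other
disk's lock once. Had the lock been taken only in the caller, the second
disk would have been touched with no lock held at all.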
This patchset looks to me like the minimal kludge that is good enough
for this situation. The true fix would be to drop this code in favour
of blockdev transactions; at least that is my opinion. However, I cannot
do this at this stage, as it would take a lot of time.
Den