From: Juan Quintela <quintela@redhat.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, fam@euphon.net,
stefanha@redhat.com, jsnow@redhat.com,
vsementsov@yandex-team.ru, eblake@redhat.com,
leobras@redhat.com, farosas@suse.de, peterx@redhat.com,
pbonzini@redhat.com, t.lamprecht@proxmox.com
Subject: Re: [PATCH v4] migration: hold the BQL during setup
Date: Fri, 13 Oct 2023 10:57:53 +0200 [thread overview]
Message-ID: <87pm1i6gry.fsf@secure.mitica> (raw)
In-Reply-To: <20231012103307.371092-1-f.ebner@proxmox.com> (Fiona Ebner's message of "Thu, 12 Oct 2023 12:33:07 +0200")
Fiona Ebner <f.ebner@proxmox.com> wrote:
> This is intended to be a semantic revert of commit 9b09503752
> ("migration: run setup callbacks out of big lock"). There have been so
> many changes since that commit (e.g. a new setup callback
> dirty_bitmap_save_setup() that also needs to be adapted now), it's
> easier to do the revert manually.
>
> For snapshots, the bdrv_writev_vmstate() function is used during setup
> (in QIOChannelBlock backing the QEMUFile), but not holding the BQL
> while calling it could lead to an assertion failure. To understand
> how, first note the following:
>
> 1. Generated coroutine wrappers for block layer functions spawn the
> coroutine and use AIO_WAIT_WHILE()/aio_poll() to wait for it.
> 2. If the host OS switches threads at an inconvenient time, it can
> happen that a bottom half scheduled for the main thread's AioContext
> is executed as part of a vCPU thread's aio_poll().
>
> An example leading to the assertion failure is as follows:
>
> main thread:
> 1. A snapshot-save QMP command gets issued.
> 2. snapshot_save_job_bh() is scheduled.
>
> vCPU thread:
> 3. aio_poll() for the main thread's AioContext is called (e.g. when
> the guest writes to a pflash device, as part of blk_pwrite which is a
> generated coroutine wrapper).
> 4. snapshot_save_job_bh() is executed as part of aio_poll().
> 3. qemu_savevm_state() is called.
> 4. qemu_mutex_unlock_iothread() is called. Now
> qemu_get_current_aio_context() returns 0x0.
> 5. bdrv_writev_vmstate() is executed during the usual savevm setup
> via qemu_fflush(). But this function is a generated coroutine wrapper,
> so it uses AIO_WAIT_WHILE. There, the assertion
> assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> will fail.
>
> To fix it, ensure that the BQL is held during setup. While it would
> only be needed for snapshots, adapting migration too avoids additional
> logic for conditional locking/unlocking in the setup callbacks.
> Writing the header could (in theory) also trigger qemu_fflush() and
> thus bdrv_writev_vmstate(), so the locked section also covers the
> qemu_savevm_state_header() call, even for migration for consistentcy.
>
> The section around multifd_send_sync_main() needs to be unlocked to
> avoid a deadlock. In particular, the function calls
> socket_send_channel_create() using multifd_new_send_channel_async() as
> a callback and then waits for the callback to signal via the
> channels_ready semaphore. The connection happens via
> qio_task_run_in_thread(), but the callback is only executed via
> qio_task_thread_result() which is scheduled for the main event loop.
> Without unlocking the section, the main thread would never get to
> process the task result and the callback meaning there would be no
> signal via the channels_ready semaphore.
>
> The comment in ram_init_bitmaps() was introduced by 4987783400
> ("migration: fix incorrect memory_global_dirty_log_start outside BQL")
> and is removed, because it referred to the qemu_mutex_lock_iothread()
> call.
>
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
I don't have a better solution for this problem, if nobody complains in
the next couple of days, I will pull this change.
Thanks for the effort.
prev parent reply other threads:[~2023-10-13 8:59 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-12 10:33 [PATCH v4] migration: hold the BQL during setup Fiona Ebner
2023-10-12 20:40 ` Fabiano Rosas
2023-10-13 10:59 ` Fiona Ebner
2023-10-13 8:57 ` Juan Quintela [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87pm1i6gry.fsf@secure.mitica \
--to=quintela@redhat.com \
--cc=eblake@redhat.com \
--cc=f.ebner@proxmox.com \
--cc=fam@euphon.net \
--cc=farosas@suse.de \
--cc=jsnow@redhat.com \
--cc=leobras@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=t.lamprecht@proxmox.com \
--cc=vsementsov@yandex-team.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.