From: Stefan Hajnoczi <stefanha@gmail.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Kevin Wolf <kwolf@redhat.com>,
    Paolo Bonzini <pbonzini@redhat.com>,
    qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] block/migration: Disable cache invalidate for incoming migration
Date: Thu, 2 Oct 2014 15:52:51 +0100
Message-ID: <20141002145251.GD6250@stefanha-thinkpad.redhat.com>
In-Reply-To: <1412239972-23493-1-git-send-email-aik@ozlabs.ru>
On Thu, Oct 02, 2014 at 06:52:52PM +1000, Alexey Kardashevskiy wrote:
> When migrated using libvirt with "--copy-storage-all", at the end of
> migration there is race between NBD mirroring task trying to do flush
> and migration completion, both end up invalidating cache. Since qcow2
> driver does not handle this situation very well, random crashes happen.
>
> This disables the BDRV_O_INCOMING flag for the block device being migrated
> and restores it when NBD task is done.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>
>
> The commit log is not full and most likely incorrect as well
> as the patch :) Please, help. Thanks!
>
> The patch seems to fix the initial problem though.
>
>
> btw is there any easy way to migrate one QEMU to another
> using NBD (i.e. not using "migrate -b") and not using libvirt?
> What would the command line be? Debugging with libvirt is real
> pain :(
>
>
> ---
> block.c | 17 ++++-------------
> migration.c | 1 -
> nbd.c | 11 +++++++++++
> 3 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/block.c b/block.c
> index c5a251c..ed72e0a 100644
> --- a/block.c
> +++ b/block.c
> @@ -5073,6 +5073,10 @@ void bdrv_invalidate_cache_all(Error **errp)
> QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> AioContext *aio_context = bdrv_get_aio_context(bs);
>
> + if (!(bs->open_flags & BDRV_O_INCOMING)) {
> + continue;
> + }
> +
We shouldn't touch bs before acquiring the AioContext; acquiring the
AioContext is effectively the "lock" for the BDS.
The BDRV_O_INCOMING check needs to be moved...
> aio_context_acquire(aio_context);
...in here.
> bdrv_invalidate_cache(bs, &local_err);
> aio_context_release(aio_context);
> @@ -5083,19 +5087,6 @@ void bdrv_invalidate_cache_all(Error **errp)
> }
> }
>
> -void bdrv_clear_incoming_migration_all(void)
> -{
> - BlockDriverState *bs;
> -
> - QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> - AioContext *aio_context = bdrv_get_aio_context(bs);
> -
> - aio_context_acquire(aio_context);
> - bs->open_flags = bs->open_flags & ~(BDRV_O_INCOMING);
> - aio_context_release(aio_context);
> - }
> -}
> -
> int bdrv_flush(BlockDriverState *bs)
> {
> Coroutine *co;
> diff --git a/migration.c b/migration.c
> index 8d675b3..c49a05a 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -103,7 +103,6 @@ static void process_incoming_migration_co(void *opaque)
> }
> qemu_announce_self();
>
> - bdrv_clear_incoming_migration_all();
> /* Make sure all file formats flush their mutable metadata */
> bdrv_invalidate_cache_all(&local_err);
BDRV_O_INCOMING needs to be cleared, otherwise the block drivers will
think that incoming migration is still taking place and treat the file
as effectively read-only during open.
On IRC I suggested dropping the bdrv_invalidate_cache_all() name, since
the function no longer calls bdrv_invalidate_cache() on all BDSes.
Instead, combine both steps into one function: clear BDRV_O_INCOMING
and, if it was previously set, call bdrv_invalidate_cache(). You could
reuse bdrv_clear_incoming_migration_all() for that.
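To make the suggestion concrete, here is a rough, self-contained sketch of what such a combined function might look like. It is not QEMU code: every Mock* type, every mock_* function and the MOCK_O_INCOMING value are hypothetical stand-ins for BlockDriverState, AioContext, bdrv_invalidate_cache() and BDRV_O_INCOMING, and error handling is left out. It only illustrates the shape: test the flag under the AioContext, clear it, then invalidate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU types; NOT the real definitions. */
#define MOCK_O_INCOMING 0x0800  /* arbitrary bit chosen for this sketch */

typedef struct MockAioContext {
    int held;                   /* nesting depth of acquire calls */
} MockAioContext;

typedef struct MockBDS {
    int open_flags;
    int invalidate_calls;       /* counts mock_invalidate_cache() calls */
    MockAioContext *ctx;
    struct MockBDS *next;
} MockBDS;

static void mock_ctx_acquire(MockAioContext *ctx)
{
    ctx->held++;
}

static void mock_ctx_release(MockAioContext *ctx)
{
    assert(ctx->held > 0);
    ctx->held--;
}

/* Stand-in for bdrv_invalidate_cache(): checks the two rules from the
 * review (context held, flag already cleared) and counts the call. */
static void mock_invalidate_cache(MockBDS *bs)
{
    assert(bs->ctx->held > 0);
    assert(!(bs->open_flags & MOCK_O_INCOMING));
    bs->invalidate_calls++;
}

/* Combined helper: for each BDS that still has the incoming flag set,
 * clear the flag and invalidate its cache, all while holding the
 * context. Returns how many BDSes were processed. */
int mock_clear_incoming_migration_all(MockBDS *head)
{
    int processed = 0;

    for (MockBDS *bs = head; bs != NULL; bs = bs->next) {
        MockAioContext *ctx = bs->ctx;

        mock_ctx_acquire(ctx);
        /* open_flags is only read and written inside the "lock" */
        if (bs->open_flags & MOCK_O_INCOMING) {
            bs->open_flags &= ~MOCK_O_INCOMING;
            mock_invalidate_cache(bs);
            processed++;
        }
        mock_ctx_release(ctx);
    }
    return processed;
}
```

A second call over the same list is a no-op in this sketch, because the flag doubles as the "still needs invalidating" marker.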
> if (local_err) {
> diff --git a/nbd.c b/nbd.c
> index e9b539b..7b479c0 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -106,6 +106,7 @@ struct NBDExport {
> off_t dev_offset;
> off_t size;
> uint32_t nbdflags;
> + bool restore_incoming;
Not needed. It does not make sense to restore BDRV_O_INCOMING: once
we've written to a file, it cannot be in use by another host at the
same time.
> QTAILQ_HEAD(, NBDClient) clients;
> QTAILQ_ENTRY(NBDExport) next;
>
> @@ -972,6 +973,13 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset,
> exp->ctx = bdrv_get_aio_context(bs);
> bdrv_ref(bs);
> bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp);
> +
> + if (bs->open_flags & BDRV_O_INCOMING) {
I think the flag has to be cleared before calling
bdrv_invalidate_cache() because the .bdrv_open() function looks at the
flag.
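A toy model of why the order matters, as a sketch only. It assumes that invalidating the cache re-runs the driver's open logic with the current flags; that is an assumption made for illustration, and MockImage, mock_driver_open() and the other mock_* names are hypothetical, not QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_O_INCOMING 0x0800  /* arbitrary bit chosen for this sketch */

typedef struct MockImage {
    int open_flags;
    bool metadata_writable;     /* decided by the mock driver at open time */
} MockImage;

/* Mock driver open: consults the flag, as .bdrv_open() is said to do,
 * and treats the image as effectively read-only while it is set. */
static void mock_driver_open(MockImage *img)
{
    img->metadata_writable = !(img->open_flags & MOCK_O_INCOMING);
}

/* Mock invalidate: ASSUMED to re-run the open logic with current flags. */
static void mock_invalidate_cache(MockImage *img)
{
    mock_driver_open(img);
}

/* Order suggested in the review: clear the flag, then invalidate. */
bool mock_finish_clear_first(MockImage *img)
{
    img->open_flags &= ~MOCK_O_INCOMING;
    mock_invalidate_cache(img);
    return img->metadata_writable;
}

/* Reversed order: the mock driver still sees the flag during reopen. */
bool mock_finish_invalidate_first(MockImage *img)
{
    mock_invalidate_cache(img);
    img->open_flags &= ~MOCK_O_INCOMING;
    return img->metadata_writable;
}
```

In this model both orders end with the flag cleared, but only the clear-first order leaves the mock driver in its writable state.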