From: Juan Quintela <quintela@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, qemu-block@nongnu.org,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [RFC PATCH] migration: Handle block device inactivation failures better
Date: Fri, 14 Apr 2023 14:15:45 +0200 [thread overview]
Message-ID: <87zg7aek26.fsf@secure.mitica> (raw)
In-Reply-To: <20230411183654.1229293-1-eblake@redhat.com> (Eric Blake's message of "Tue, 11 Apr 2023 13:36:54 -0500")
Eric Blake <eblake@redhat.com> wrote:
> Consider what happens when performing a migration between two host
> machines connected to an NFS server serving multiple block devices to
> the guest, when the NFS server becomes unavailable. The migration
> attempts to inactivate all block devices on the source (a necessary
> step before the destination can take over); but if the NFS server is
> non-responsive, the attempt to inactivate can itself fail. When that
> happens, the destination fails to get the migrated guest (good,
> because the source wasn't able to flush everything properly):
>
> (qemu) qemu-kvm: load of migration failed: Input/output error
>
> at which point, our only hope for the guest is for the source to take
> back control. With the current code base, the host outputs a message,
> but then appears to resume:
>
> (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
>
> (src qemu) info status
> VM status: running
>
> but a second migration attempt now asserts:
>
> (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
>
> Whether the guest is recoverable on the source after the first failure
> is debatable, but what we do not want is to have qemu itself fail due
> to an assertion. It looks like the problem is as follows:
>
> In migration.c:migration_completion(), the source sets 'inactivate' to
> true (since COLO is not enabled), then tries
> savevm.c:qemu_savevm_state_complete_precopy() with a request to
> inactivate block devices. In turn, this calls
> block.c:bdrv_inactivate_all(), which fails when flushing runs up
> against the non-responsive NFS server. With savevm failing, we are
> now left in a state where some, but not all, of the block devices have
> been inactivated; the 'fail_invalidate' label of
> migration_completion() then wants to reclaim those disks by calling
> bdrv_activate_all() - but this too can fail, yet nothing takes note of
> that failure.
>
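To make the failing sequence easier to follow, here is a tiny
self-contained model of it (every name and body below is a stand-in
invented for illustration; this is not the actual QEMU code, just the
shape of the bug as described above):

#include <stdbool.h>
#include <stdio.h>

static bool nfs_responsive = false;   /* assumption: the NFS server is hung */
static bool disks_inactive = false;   /* what the block layer really did    */
static bool block_inactive = false;   /* what migration remembers (s->...)  */

static int model_bdrv_inactivate_all(void)
{
    disks_inactive = true;            /* some disks were inactivated...      */
    return nfs_responsive ? 0 : -1;   /* ...before the flush got stuck       */
}

static int model_bdrv_activate_all(void)
{
    if (!nfs_responsive) {
        return -1;                    /* reclaiming the disks fails too      */
    }
    disks_inactive = false;
    return 0;
}

static void model_migration_completion(void)
{
    if (model_bdrv_inactivate_all() < 0) {
        /* fail_invalidate: try to hand the disks back to the source */
        if (model_bdrv_activate_all() < 0) {
            /* Nothing records that disks are still inactive; the patch
             * under discussion sets block_inactive = true right here.  */
        } else {
            block_inactive = false;
        }
    }
}

int main(void)
{
    model_migration_completion();
    printf("disks still inactive: %d, migration thinks inactive: %d\n",
           disks_inactive, block_inactive);
    /* Prints 1 and 0: the source resumes, and a later migration attempt
     * trips the !(bs->open_flags & BDRV_O_INACTIVE) assertion for real. */
    return 0;
}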
> Thus, we have reached a state where the migration engine has forgotten
> all state about whether a block device is inactive, because we did not
> set s->block_inactive; so migration allows the source to reach
> vm_start() and resume execution, violating the block layer invariant
> that the guest CPUs should not be restarted while a device is
> inactive. Note that the code in migration.c:migrate_fd_cancel() will
> also try to reactivate all block devices if s->block_inactive was set,
> but because we failed to set that flag after the first failure, the
> source assumes it has reclaimed all devices, even though it still has
> remaining inactivated devices and does not try again. Normally,
> qmp_cont() will also try to reactivate all disks (or correctly fail if
> the disks are not reclaimable because NFS is not yet back up), but the
> auto-resumption of the source after a migration failure does not go
> through qmp_cont(). And because we have left the block layer in an
> inconsistent state with devices still inactivated, the later migration
> attempt is hitting the assertion failure.
>
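(For reference, the reclaim logic in migrate_fd_cancel() mentioned above
is gated roughly like this; paraphrased from the description, not copied
from the QEMU source:

    if (s->block_inactive) {
        /* reactivate the disks before the source is allowed to run again */
        bdrv_activate_all(&local_err);
        if (!local_err) {
            s->block_inactive = false;
        }
    }

so with s->block_inactive left false after the failure above, this block
is skipped entirely.)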
> Since it is important to not resume the source with inactive disks,
> this patch tries harder at tracking whether migration attempted to
> inactivate any devices, in order to prevent any vm_start() until it
> has successfully reactivated all devices.
>
> See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
Wow
Such a big comment for such a small patch.
And then people think that there is no "nuance" anymore :-)
> @@ -3518,6 +3519,7 @@ fail_invalidate:
> bdrv_activate_all(&local_err);
> if (local_err) {
> error_report_err(local_err);
> + s->block_inactive = inactivate;
Why not just put this here:
s->block_inactive = true;
And call it a day?
If bdrv_activate_all() fails, we can't continue anyway.
Or am I missing something?
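In other words, the branch would then read roughly like this (a sketch
of the hunk with that one substitution, otherwise unchanged):

    bdrv_activate_all(&local_err);
    if (local_err) {
        error_report_err(local_err);
        s->block_inactive = true;    /* disks may still be inactive */
    } else {
        s->block_inactive = false;
    }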
Later, Juan.
> } else {
> s->block_inactive = false;
> }
>
> base-commit: f1426881a827a6d3f31b65616c4a8db1e9e7c45e