From: Peter Xu <peterx@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, jdenemar@redhat.com,
wangjie88@huawei.com, quintela@redhat.com, mreitz@redhat.com,
berrange@redhat.com, eblake@redhat.com, fuweiwei2@huawei.com
Subject: Re: [Qemu-devel] [PATCH 3/7] migration: Wait for semaphore before completing migration
Date: Wed, 18 Oct 2017 11:35:19 +0800 [thread overview]
Message-ID: <20171018033519.GM4166@pxdev.xzpeter.org> (raw)
In-Reply-To: <20171011191317.24157-4-dgilbert@redhat.com>
On Wed, Oct 11, 2017 at 08:13:13PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Wait for a semaphore before completing the migration,
> if the previously added capability was enabled.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/migration.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> migration/migration.h | 3 +++
> 2 files changed, 50 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index e1a87c3d23..b411a7bb63 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1967,6 +1967,46 @@ fail:
> }
>
> /**
> + * migration_maybe_pause: Pause if required to by migrate_pause_before_device
> + * called with the iothread locked
> + * Returns: 0 on success
> + */
> +static int migration_maybe_pause(MigrationState *s, int *current_active_state)
> +{
> + int ret;
> + if (!migrate_pause_before_device()) {
> + return 0;
> + }
> + ret = bdrv_inactivate_all();
My understanding is that the crash was caused by mirrored block device
IO triggered after the inactivation, then... should we do this after
waiting for the semaphore (possibly at [1] below) to make sure the
block jobs are completed? Or did I miss anything?
> + if (ret) {
> + error_report("%s: bdrv_inactivate_all() failed (%d)",
> + __func__, ret);
> + return ret;
> + }
> +
> + s->block_inactive = true;
> +
> + /* Since leaving this state is not atomic with posting the semaphore
> + * it's possible that someone could have issued multiple migrate_continue
> + * and the semaphore is incorrectly positive at this point;
> + * the docs say it's undefined to reinit a semaphore that's already
> + * init'd, so use timedwait to eat up any existing posts.
> + */
> + while (qemu_sem_timedwait(&s->pause_sem, 1) == 0);
> +
> + qemu_mutex_unlock_iothread();
> + migrate_set_state(&s->state, *current_active_state,
> + MIGRATION_STATUS_PAUSE_BEFORE_DEVICE);
> + qemu_sem_wait(&s->pause_sem);
[1]
> + migrate_set_state(&s->state, MIGRATION_STATUS_PAUSE_BEFORE_DEVICE,
> + MIGRATION_STATUS_DEVICE);
> + *current_active_state = MIGRATION_STATUS_DEVICE;
> + qemu_mutex_lock_iothread();
> +
> + return s->state == MIGRATION_STATUS_DEVICE ? 0 : -EINVAL;
> +}
> +
> +/**
> * migration_completion: Used by migration_thread when there's not much left.
> * The caller 'breaks' the loop when this returns.
> *
> @@ -1992,6 +2032,11 @@ static void migration_completion(MigrationState *s, int current_active_state,
> bool inactivate = !migrate_colo_enabled();
> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> if (ret >= 0) {
> + ret = migration_maybe_pause(s, ¤t_active_state);
> + /* If this worked it will already have inactivated */
> + inactivate &= !migrate_pause_before_device();
> + }
> + if (ret >= 0) {
> qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
> ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
> inactivate);
> @@ -2372,6 +2417,7 @@ static void migration_instance_finalize(Object *obj)
>
> g_free(params->tls_hostname);
> g_free(params->tls_creds);
> + qemu_sem_destroy(&ms->pause_sem);
> }
>
> static void migration_instance_init(Object *obj)
> @@ -2382,6 +2428,7 @@ static void migration_instance_init(Object *obj)
> ms->state = MIGRATION_STATUS_NONE;
> ms->xbzrle_cache_size = DEFAULT_MIGRATE_CACHE_SIZE;
> ms->mbps = -1;
> + qemu_sem_init(&ms->pause_sem, 0);
>
> params->tls_hostname = g_strdup("");
> params->tls_creds = g_strdup("");
> diff --git a/migration/migration.h b/migration/migration.h
> index 37feea5453..447e8b3f79 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -121,6 +121,9 @@ struct MigrationState
> /* Flag set once the migration thread called bdrv_inactivate_all */
> bool block_inactive;
>
> + /* Migration is paused due to pause-before-device */
> + QemuSemaphore pause_sem;
> +
> /* The semaphore is used to notify COLO thread that failover is finished */
> QemuSemaphore colo_exit_sem;
>
> --
> 2.13.6
>
--
Peter Xu
next prev parent reply other threads:[~2017-10-18 3:35 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-11 19:13 [Qemu-devel] [PATCH 0/7] migration: pause-before-device Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 1/7] migration: Add 'pause-before-device' capability Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 2/7] migration: Add 'pause-before-device' and 'device' statuses Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 3/7] migration: Wait for semaphore before completing migration Dr. David Alan Gilbert (git)
2017-10-18 3:35 ` Peter Xu [this message]
2017-10-18 8:59 ` Dr. David Alan Gilbert
2017-10-11 19:13 ` [Qemu-devel] [PATCH 4/7] migration: migrate-continue Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 5/7] migrate: HMP migate_continue Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 6/7] migration: allow cancel to unpause Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 7/7] migration: pause-before-device for postcopy Dr. David Alan Gilbert (git)
2017-10-11 20:03 ` [Qemu-devel] [PATCH 0/7] migration: pause-before-device no-reply
2017-10-12 8:21 ` Daniel P. Berrange
2017-10-12 9:18 ` Kevin Wolf
2017-10-12 9:27 ` Daniel P. Berrange
2017-10-12 9:52 ` Kevin Wolf
2017-10-12 9:55 ` Daniel P. Berrange
2017-10-12 10:02 ` Daniel P. Berrange
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171018033519.GM4166@pxdev.xzpeter.org \
--to=peterx@redhat.com \
--cc=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=eblake@redhat.com \
--cc=fuweiwei2@huawei.com \
--cc=jdenemar@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=wangjie88@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).