From: Peter Xu <peterx@redhat.com>
To: "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, jdenemar@redhat.com,
wangjie88@huawei.com, quintela@redhat.com, mreitz@redhat.com,
berrange@redhat.com, eblake@redhat.com, fuweiwei2@huawei.com
Subject: Re: [Qemu-devel] [PATCH 3/7] migration: Wait for semaphore before completing migration
Date: Wed, 18 Oct 2017 11:35:19 +0800 [thread overview]
Message-ID: <20171018033519.GM4166@pxdev.xzpeter.org> (raw)
In-Reply-To: <20171011191317.24157-4-dgilbert@redhat.com>
On Wed, Oct 11, 2017 at 08:13:13PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Wait for a semaphore before completing the migration,
> if the previously added capability was enabled.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/migration.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
> migration/migration.h | 3 +++
> 2 files changed, 50 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index e1a87c3d23..b411a7bb63 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1967,6 +1967,46 @@ fail:
> }
>
> /**
> + * migration_maybe_pause: Pause if required to by migrate_pause_before_device
> + * called with the iothread locked
> + * Returns: 0 on success
> + */
> +static int migration_maybe_pause(MigrationState *s, int *current_active_state)
> +{
> + int ret;
> + if (!migrate_pause_before_device()) {
> + return 0;
> + }
> + ret = bdrv_inactivate_all();
My understanding is that the crash was caused by mirrored block device
IO triggered after the inactivation, then... should we do this after
waiting for the semaphore (possibly at [1] below) to make sure the
block jobs are completed? Or did I miss anything?
> + if (ret) {
> + error_report("%s: bdrv_inactivate_all() failed (%d)",
> + __func__, ret);
> + return ret;
> + }
> +
> + s->block_inactive = true;
> +
> + /* Since leaving this state is not atomic with posting the semaphore
> + * it's possible that someone could have issued multiple migrate_continue
> + * and the semaphore is incorrectly positive at this point;
> + * the docs say it's undefined to reinit a semaphore that's already
> + * init'd, so use timedwait to eat up any existing posts.
> + */
> + while (qemu_sem_timedwait(&s->pause_sem, 1) == 0);
> +
> + qemu_mutex_unlock_iothread();
> + migrate_set_state(&s->state, *current_active_state,
> + MIGRATION_STATUS_PAUSE_BEFORE_DEVICE);
> + qemu_sem_wait(&s->pause_sem);
[1]
> + migrate_set_state(&s->state, MIGRATION_STATUS_PAUSE_BEFORE_DEVICE,
> + MIGRATION_STATUS_DEVICE);
> + *current_active_state = MIGRATION_STATUS_DEVICE;
> + qemu_mutex_lock_iothread();
> +
> + return s->state == MIGRATION_STATUS_DEVICE ? 0 : -EINVAL;
> +}
> +
> +/**
> * migration_completion: Used by migration_thread when there's not much left.
> * The caller 'breaks' the loop when this returns.
> *
> @@ -1992,6 +2032,11 @@ static void migration_completion(MigrationState *s, int current_active_state,
> bool inactivate = !migrate_colo_enabled();
> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> if (ret >= 0) {
> + ret = migration_maybe_pause(s, ¤t_active_state);
> + /* If this worked it will already have inactivated */
> + inactivate &= !migrate_pause_before_device();
> + }
> + if (ret >= 0) {
> qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
> ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
> inactivate);
> @@ -2372,6 +2417,7 @@ static void migration_instance_finalize(Object *obj)
>
> g_free(params->tls_hostname);
> g_free(params->tls_creds);
> + qemu_sem_destroy(&ms->pause_sem);
> }
>
> static void migration_instance_init(Object *obj)
> @@ -2382,6 +2428,7 @@ static void migration_instance_init(Object *obj)
> ms->state = MIGRATION_STATUS_NONE;
> ms->xbzrle_cache_size = DEFAULT_MIGRATE_CACHE_SIZE;
> ms->mbps = -1;
> + qemu_sem_init(&ms->pause_sem, 0);
>
> params->tls_hostname = g_strdup("");
> params->tls_creds = g_strdup("");
> diff --git a/migration/migration.h b/migration/migration.h
> index 37feea5453..447e8b3f79 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -121,6 +121,9 @@ struct MigrationState
> /* Flag set once the migration thread called bdrv_inactivate_all */
> bool block_inactive;
>
> + /* Migration is paused due to pause-before-device */
> + QemuSemaphore pause_sem;
> +
> /* The semaphore is used to notify COLO thread that failover is finished */
> QemuSemaphore colo_exit_sem;
>
> --
> 2.13.6
>
--
Peter Xu
next prev parent reply other threads:[~2017-10-18 3:35 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-11 19:13 [Qemu-devel] [PATCH 0/7] migration: pause-before-device Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 1/7] migration: Add 'pause-before-device' capability Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 2/7] migration: Add 'pause-before-device' and 'device' statuses Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 3/7] migration: Wait for semaphore before completing migration Dr. David Alan Gilbert (git)
2017-10-18 3:35 ` Peter Xu [this message]
2017-10-18 8:59 ` Dr. David Alan Gilbert
2017-10-11 19:13 ` [Qemu-devel] [PATCH 4/7] migration: migrate-continue Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 5/7] migrate: HMP migate_continue Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 6/7] migration: allow cancel to unpause Dr. David Alan Gilbert (git)
2017-10-11 19:13 ` [Qemu-devel] [PATCH 7/7] migration: pause-before-device for postcopy Dr. David Alan Gilbert (git)
2017-10-11 20:03 ` [Qemu-devel] [PATCH 0/7] migration: pause-before-device no-reply
2017-10-12 8:21 ` Daniel P. Berrange
2017-10-12 9:18 ` Kevin Wolf
2017-10-12 9:27 ` Daniel P. Berrange
2017-10-12 9:52 ` Kevin Wolf
2017-10-12 9:55 ` Daniel P. Berrange
2017-10-12 10:02 ` Daniel P. Berrange
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171018033519.GM4166@pxdev.xzpeter.org \
--to=peterx@redhat.com \
--cc=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=eblake@redhat.com \
--cc=fuweiwei2@huawei.com \
--cc=jdenemar@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=wangjie88@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.