qemu-devel.nongnu.org archive mirror
From: "Daniel P. Berrange" <berrange@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com,
	Peter Maydell <peter.maydell@linaro.org>,
	zhanghailiang <zhang.zhanghailiang@huawei.com>,
	Juan Quintela <quintela@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	peterx@redhat.com, mreitz@redhat.com, stefanha@redhat.com,
	jsnow@redhat.com
Subject: Re: [Qemu-devel] [PATCH] migration: Fix race of image locking between src and dst
Date: Mon, 19 Jun 2017 15:49:32 +0100
Message-ID: <20170619144932.GI2640@redhat.com>
In-Reply-To: <20170616160658.32290-1-famz@redhat.com>

On Sat, Jun 17, 2017 at 12:06:58AM +0800, Fam Zheng wrote:
> Previously, the dst side would immediately try to lock the write byte
> upon receiving QEMU_VM_EOF, but on the src side, bdrv_inactivate_all()
> was only done after sending it. If the src host is under load, dst may
> fail to acquire the lock due to racing with the src unlocking it.
> 
> Fix this by hoisting the bdrv_inactivate_all() operation before
> QEMU_VM_EOF.
> 
> N.B. A further improvement could possibly be made to cleanly hand over
> locks between src and dst, so that there is no window where a third QEMU
> could steal the locks and prevent src and dst from running.
> 
> Reported-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  migration/colo.c      |  2 +-
>  migration/migration.c | 19 +++++++------------
>  migration/savevm.c    | 19 +++++++++++++++----
>  migration/savevm.h    |  3 ++-
>  4 files changed, 25 insertions(+), 18 deletions(-)
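
The ordering problem described in the commit message can be illustrated
with a small single-threaded model. This is only a sketch under stated
assumptions, not QEMU code: ImageLock, src_inactivate(), dst_try_lock()
and the two handover_*() helpers are hypothetical stand-ins, and the
"dst reacts first" flag is a deterministic substitute for the real
timing race on a loaded source host.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one image lock that only one side may hold. */
typedef struct {
    bool held_by_src;
    bool held_by_dst;
} ImageLock;

/* Stand-in for bdrv_inactivate_all() on the source: drop the lock. */
static void src_inactivate(ImageLock *lock)
{
    lock->held_by_src = false;
}

/* Stand-in for the destination reacting to QEMU_VM_EOF: try to lock. */
static bool dst_try_lock(ImageLock *lock)
{
    if (lock->held_by_src) {
        return false;           /* source has not released it yet */
    }
    lock->held_by_dst = true;
    return true;
}

/*
 * Old ordering: EOF is sent before the source inactivates, so the
 * destination may attempt the lock on either side of the unlock.
 * dst_reacts_first == true models a source host under load.
 */
static bool handover_old_order(bool dst_reacts_first)
{
    ImageLock lock = { .held_by_src = true, .held_by_dst = false };
    bool ok;

    /* (EOF is sent here) */
    if (dst_reacts_first) {
        ok = dst_try_lock(&lock);
        src_inactivate(&lock);
    } else {
        src_inactivate(&lock);
        ok = dst_try_lock(&lock);
    }
    return ok;
}

/*
 * New ordering: the source inactivates first and only then sends EOF,
 * so the destination cannot observe the lock still held by the source.
 */
static bool handover_new_order(void)
{
    ImageLock lock = { .held_by_src = true, .held_by_dst = false };

    src_inactivate(&lock);
    /* (EOF is sent here; the destination reacts only after this point) */
    assert(!lock.held_by_src);
    return dst_try_lock(&lock);
}
```

In this model handover_old_order(true) fails while handover_new_order()
always succeeds. It does not cover the third-QEMU window mentioned in the
N.B. above, since the lock is still free between the unlock and the
destination's attempt.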

[snip]

> @@ -1695,20 +1695,15 @@ static void migration_completion(MigrationState *s, int current_active_state,
>          ret = global_state_store();
>  
>          if (!ret) {
> +            bool inactivate = !migrate_colo_enabled();
>              ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>              if (ret >= 0) {
>                  qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
> -                qemu_savevm_state_complete_precopy(s->to_dst_file, false);
> +                ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
> +                                                         inactivate);
>              }
> -            /*
> -             * Don't mark the image with BDRV_O_INACTIVE flag if
> -             * we will go into COLO stage later.
> -             */
> -            if (ret >= 0 && !migrate_colo_enabled()) {
> -                ret = bdrv_inactivate_all();
> -                if (ret >= 0) {
> -                    s->block_inactive = true;
> -                }
> +            if (inactivate && ret >= 0) {
> +                s->block_inactive = true;
>              }
>          }
>          qemu_mutex_unlock_iothread();

[snip]

> @@ -1173,6 +1174,15 @@ void qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only)
>          json_end_object(vmdesc);
>      }
>  
> +    if (inactivate_disks) {
> +        /* Inactivate before sending QEMU_VM_EOF so that the
> +         * bdrv_invalidate_cache_all() on the other end won't fail. */
> +        ret = bdrv_inactivate_all();
> +        if (ret) {
> +            qemu_file_set_error(f, ret);
> +            return ret;
> +        }
> +    }

IIUC, as well as fixing the race condition, you're also improving
error reporting by calling qemu_file_set_error(), which was not done
previously. It would be nice to mention that in the commit message
too if you respin for any other reason, but that's just a nit-pick,
so

  Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
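
For readers unfamiliar with the pattern, the error-reporting change can
be sketched roughly as follows. This is a mock, not the QEMU
implementation: MockFile and every mock_*() function are hypothetical
stand-ins, and keeping only the first recorded error is an assumption of
this sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMUFile: it only carries an error code. */
typedef struct {
    int last_error;
} MockFile;

/* Record an error on the stream; this sketch keeps the first one only. */
static void mock_file_set_error(MockFile *f, int ret)
{
    assert(f != NULL);
    if (f->last_error == 0) {
        f->last_error = ret;
    }
}

static int mock_file_get_error(const MockFile *f)
{
    return f->last_error;
}

/* Stand-in for bdrv_inactivate_all(); returns the simulated result. */
static int mock_inactivate_all(int simulated_result)
{
    return simulated_result;
}

/*
 * Mirrors the shape of the quoted hunk: inactivate before the EOF
 * marker would be written, and on failure both record the error on
 * the file and return it to the caller.
 */
static int mock_complete_precopy(MockFile *f, bool inactivate_disks,
                                 int simulated_result)
{
    if (inactivate_disks) {
        int ret = mock_inactivate_all(simulated_result);
        if (ret) {
            mock_file_set_error(f, ret);
            return ret;
        }
    }
    /* (the EOF marker would be written here) */
    return 0;
}
```

With this shape, a caller that only inspects the file's error state
still sees an inactivation failure, in addition to the caller that
checks the return value.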

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

