qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Greg Kurz <groug@kaod.org>
Cc: Juan Quintela <quintela@redhat.com>,
	qemu-devel@nongnu.org, lvivier@redhat.com,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	peterx@redhat.com
Subject: Re: [Qemu-devel] [PULL 15/18] migration: split common postcopy out of ram postcopy
Date: Mon, 5 Feb 2018 13:12:31 +0000	[thread overview]
Message-ID: <20180205131231.GB2316@work-vm> (raw)
In-Reply-To: <20180205104944.26fac450@bahia.lan>

* Greg Kurz (groug@kaod.org) wrote:
> On Fri, 22 Sep 2017 14:25:02 +0200
> Juan Quintela <quintela@redhat.com> wrote:
> 
> > From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > 

<snip>

> >  /* Sent prior to starting the destination running in postcopy, discard pages
> > @@ -1354,6 +1376,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> >          return -1;
> >      }
> >  
> > +    if (!migrate_postcopy_ram()) {
> > +        return 0;
> > +    }
> > +
> 
> If postcopy-ram was set on the source but not on the destination, the source
> sends an advise with ram_pagesize_summary() and qemu_target_page_size() but
> this return path on the destination doesn't dispose of the two values. This
> results in a corrupted stream and confuses qemu_loadvm_state():
> 
> qemu-system-ppc64: Expected vmdescription section, but got 0
> 
> Migration doesn't happen, and worse, the destination may starts execution,
> ie, we have two running instances...

Thanks for debugging this; I'd noticed it but not got around to digging
down.

> It looks wrong that the parsing of the advise depends on a migration
> capability being set by the user. The destination should process the
> postcopy-ram advise sent by the source in any case.
> 
> Now that you're about to introduce a new postcopy variant, I guess it
> is time to improve the advise format to reflect this, as you already
> suggest in a comment above. The format could be something like:
> - uin8_t: number of enabled postcopy variants
> - for each variant:
>   uint8_t: type of the postcopy variant
>   per variant arguments
> 
> The destination could then process the advise according to what the source
> actually sent.
> 
> In the meantime, I'd suggest to partly revert 58110f0acb1a. At least, the part
> that changes the advise format, since it isn't strictly needed right now.

I guess it's difficult to change now; but it needs to be robust.
As I said in my review of the patch (about a year ago!):

   Libvirt does set it on the destination, and it's already useful for checking
   the destination host has the appropriate kernel userfault support;
   so I'm fine with requiring it.
   However it's good where possible to fail nicely if someone doesn't set it.

So we've missed that last bit;  lets make the advise code check the
length match what it's expecting.

Dave

> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -98,23 +98,6 @@ static struct mig_cmd_args {
>      [MIG_CMD_MAX]              = { .len = -1, .name = "MAX" },
>  };
>  
> -/* Note for MIG_CMD_POSTCOPY_ADVISE:
> - * The format of arguments is depending on postcopy mode:
> - * - postcopy RAM only
> - *   uint64_t host page size
> - *   uint64_t taget page size
> - *
> - * - postcopy RAM and postcopy dirty bitmaps
> - *   format is the same as for postcopy RAM only
> - *
> - * - postcopy dirty bitmaps only
> - *   Nothing. Command length field is 0.
> - *
> - * Be careful: adding a new postcopy entity with some other parameters should
> - * not break format self-description ability. Good way is to introduce some
> - * generic extendable format with an exception for two old entities.
> - */
> -
>  static int announce_self_create(uint8_t *buf,
>                                  uint8_t *mac_addr)
>  {
> @@ -888,8 +871,6 @@ void qemu_savevm_send_postcopy_advise(QEMUFile *f)
>          trace_qemu_savevm_send_postcopy_advise();
>          qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
>                                   16, (uint8_t *)tmp);
> -    } else {
> -        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
>      }
>  }
>  
> @@ -1387,10 +1368,6 @@ static int loadvm_postcopy_handle_advise(MigrationIncomin
>          return -1;
>      }
>  
> -    if (!migrate_postcopy_ram()) {
> -        return 0;
> -    }
> -
>      if (!postcopy_ram_supported_by_host(mis)) {
>          postcopy_state_set(POSTCOPY_INCOMING_NONE);
>          return -1;
> 
> 
> 
> Cheers,
> 
> --
> Greg
> 
> >      if (!postcopy_ram_supported_by_host()) {
> >          postcopy_state_set(POSTCOPY_INCOMING_NONE);
> >          return -1;
> > @@ -1564,7 +1590,9 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> >           * A rare case, we entered listen without having to do any discards,
> >           * so do the setup that's normally done at the time of the 1st discard.
> >           */
> > -        postcopy_ram_prepare_discard(mis);
> > +        if (migrate_postcopy_ram()) {
> > +            postcopy_ram_prepare_discard(mis);
> > +        }
> >      }
> >  
> >      /*
> > @@ -1572,8 +1600,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> >       * However, at this point the CPU shouldn't be running, and the IO
> >       * shouldn't be doing anything yet so don't actually expect requests
> >       */
> > -    if (postcopy_ram_enable_notify(mis)) {
> > -        return -1;
> > +    if (migrate_postcopy_ram()) {
> > +        if (postcopy_ram_enable_notify(mis)) {
> > +            return -1;
> > +        }
> >      }
> >  
> >      if (mis->have_listen_thread) {
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  parent reply	other threads:[~2018-02-05 13:12 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-22 12:24 [Qemu-devel] [PULL 00/18] Migration PULL request (3rd try) Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 01/18] migration: Create migration_ioc_process_incoming() Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 02/18] migration: Teach it about G_SOURCE_REMOVE Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 03/18] migration: Add comments to channel functions Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 04/18] migration: Create migration_has_all_channels Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 05/18] migration: Add multifd capability Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 06/18] migration: Create x-multifd-channels parameter Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 07/18] migration: Create x-multifd-page-count parameter Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 08/18] migration: Create multifd migration threads Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 09/18] migration: Split migration_fd_process_incoming Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 10/18] bitmap: remove BITOP_WORD() Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 11/18] bitmap: introduce bitmap_count_one() Juan Quintela
2017-09-22 12:24 ` [Qemu-devel] [PULL 12/18] bitmap: provide to_le/from_le helpers Juan Quintela
2017-09-22 12:25 ` [Qemu-devel] [PULL 13/18] migration: add has_postcopy savevm handler Juan Quintela
2017-09-22 12:25 ` [Qemu-devel] [PULL 14/18] migration: fix ram_save_pending Juan Quintela
2017-09-22 12:25 ` [Qemu-devel] [PULL 15/18] migration: split common postcopy out of ram postcopy Juan Quintela
2018-02-05  9:49   ` Greg Kurz
2018-02-05 12:11     ` Vladimir Sementsov-Ogievskiy
2018-02-05 13:19       ` Greg Kurz
2018-02-05 13:12     ` Dr. David Alan Gilbert [this message]
2017-09-22 12:25 ` [Qemu-devel] [PULL 16/18] migration: pass MigrationIncomingState* into migration check functions Juan Quintela
2017-09-22 12:25 ` [Qemu-devel] [PULL 17/18] migration: fix hardcoded function name in error report Juan Quintela
2017-09-22 12:25 ` [Qemu-devel] [PULL 18/18] migration: split ufd_version_check onto receive/request features part Juan Quintela
2017-09-22 14:20 ` [Qemu-devel] [PULL 00/18] Migration PULL request (3rd try) Peter Maydell
  -- strict thread matches above, loose matches on Subject: below --
2017-09-21 23:07 [Qemu-devel] [PULL 00/18] Migration PULL request Juan Quintela
2017-09-21 23:08 ` [Qemu-devel] [PULL 15/18] migration: split common postcopy out of ram postcopy Juan Quintela

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180205131231.GB2316@work-vm \
    --to=dgilbert@redhat.com \
    --cc=groug@kaod.org \
    --cc=lvivier@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).