kvm.vger.kernel.org archive mirror
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Juan Quintela <quintela@redhat.com>
Cc: kvm@vger.kernel.org, qemu-devel <qemu-devel@nongnu.org>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>
Subject: Re: [QEMU PATCH] kvmclock: advance clock by time window between vm_stop and pre_save
Date: Fri, 4 Nov 2016 12:00:38 -0200
Message-ID: <20161104140035.GA14339@amt.cnet>
In-Reply-To: <20161104123539.GA3132@amt.cnet>

On Fri, Nov 04, 2016 at 10:35:39AM -0200, Marcelo Tosatti wrote:
> On Fri, Nov 04, 2016 at 01:28:48PM +0100, Juan Quintela wrote:
> > Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > > This patch, relative to pre-copy migration codepath,
> > > measures the time between vm_stop() and pre_save(), 
> > > which includes copying the remaining RAM to destination,
> > > and advances the clock by that amount.
> > >
> > > In a VM with 5 seconds downtime, this reduces the guest 
> > > clock difference on destination from 5s to 0.2s.
> > >
> > > Please do not apply this yet as some codepaths still need
> > > checking, submitting early for comments.
> > >
> > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > You can use an optional section, and then you don't need to increase the
> > version number.
> 
> Optional section is more appropriate, thanks.
> 
> > I believe you that the clock manipulation is right, only talking about
> > the migration bits.
> > 
> > > +static uint64_t clock_delta(struct timespec *before, struct timespec *after)
> > > +{
> > > +    if (before->tv_sec > after->tv_sec ||
> > > +        (before->tv_sec == after->tv_sec &&
> > > +         before->tv_nsec > after->tv_nsec)) {
> > > +        fprintf(stderr, "clock_delta failed: before=(%ld sec, %ld nsec),"
> > > +                        "after=(%ld sec, %ld nsec)\n", before->tv_sec,
> > > +                        before->tv_nsec, after->tv_sec, after->tv_nsec);
> > > +        abort();
> > > +    }
> > > +
> > > +    return (after->tv_sec - before->tv_sec) * 1000000000ULL +
> > > +            after->tv_nsec - before->tv_nsec;
> > > +}
> > 
> > I can't believe that we don't have a helper function already to
> > calculate this....
> 
> Couldn't find any...
> 
> > > +
> > > +static void kvmclock_pre_save(void *opaque)
> > > +{
> > > +    KVMClockState *s = opaque;
> > > +    struct timespec now;
> > > +    uint64_t ns;
> > > +
> > > +    if (s->t_aftervmstop.tv_sec == 0) {
> > > +        return;
> > > +    }
> > 
> > You have your test here.
> > 
> > > +
> > > +    clock_gettime(CLOCK_MONOTONIC, &now);
> > > +
> > > +    ns = clock_delta(&s->t_aftervmstop, &now);
> > > +
> > > +    /*
> > > +     * Linux guests can overflow if time jumps
> > > +     * forward in large increments.
> > > +     * Cap maximum adjustment to 10 minutes.
> > > +     */
> > > +    ns = MIN(ns, 600000000000ULL);
> > > +
> > > +    if (s->clock + ns > s->clock) {
> > > +        s->ns = ns;
> > 
> > Would it be a good idea to print an error message here?  If it has been more
> > than 10 minutes since we did the vmstop, something went wrong here.
> 
> Not sure... is it not possible for the user to stop migration in some
> way?
> 
> What if the network is very slow and maxdowntime is very high?
> 
> > > +    }
> > > +}
> > > +
> > > +static int kvmclock_post_load(void *opaque, int version_id)
> > > +{
> > > +    KVMClockState *s = opaque;
> > > +
> > > +    /* save the value from incoming migration */
> > > +    s->advance_clock = s->ns;
> > > +
> > > +    return 0;
> > > +}
> > > +
> > >  static const VMStateDescription kvmclock_vmsd = {
> > >      .name = "kvmclock",
> > > -    .version_id = 1,
> > > +    .version_id = 2,
> > >      .minimum_version_id = 1,
> > > +    .pre_save = kvmclock_pre_save,
> > > +    .post_load = kvmclock_post_load,
> > >      .fields = (VMStateField[]) {
> > >          VMSTATE_UINT64(clock, KVMClockState),
> > > +        VMSTATE_UINT64_V(ns, KVMClockState, 2),
> > >          VMSTATE_END_OF_LIST()
> > >      }
> > >  };
> > 
> > 
> > If you need help with the subsection stuff, just ask.
> > 
> > Later, Juan.
> 
> OK, I'll try to cook up an optional section and let's see what happens.
> 
> Thanks Juan.

OK, so by "optional section" I meant a section that, when sent
to the destination, could be ignored and migration would still succeed.

The alternative (what this patch has now) is to increase the migration
version so that:

    1. older machine types remain compatible. 
    2. newer machine types fail to migrate.

Because the data being sent, ns, is not really optional: if kvmclock or
Hyper-V time is enabled (which should be 100% of the cases) we always
want to send that data.

That is, there is no difference between:

* Writing a subsection with needed=1 always (except when
using older machine types).
* Using old/new machine types with particular versions.

I think I missed the patch to switch current machine
types to kvmclock v1, BTW.
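
For reference, the arithmetic in the quoted clock_delta() and
kvmclock_pre_save() hunks can be sketched standalone roughly as below.
This is only an illustration under stated assumptions, not the patch
itself: the names elapsed_ns(), clock_advance_ns() and MAX_ADVANCE_NS
are made up here, and the sketch returns false instead of calling
abort() when the timestamps are out of order.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC   1000000000ULL
#define MAX_ADVANCE_NS (600ULL * NSEC_PER_SEC)   /* 10 minute cap, as in the patch */

/*
 * Hypothetical helper: nanoseconds elapsed from *before to *after.
 * Returns false (and leaves *out untouched) if after precedes before.
 */
static bool elapsed_ns(const struct timespec *before,
                       const struct timespec *after, uint64_t *out)
{
    if (before->tv_sec > after->tv_sec ||
        (before->tv_sec == after->tv_sec &&
         before->tv_nsec > after->tv_nsec)) {
        return false;
    }

    int64_t sec = (int64_t)after->tv_sec - (int64_t)before->tv_sec;
    int64_t nsec = (int64_t)after->tv_nsec - (int64_t)before->tv_nsec;

    /* Borrow one second when the nanosecond part went backwards. */
    if (nsec < 0) {
        sec -= 1;
        nsec += (int64_t)NSEC_PER_SEC;
    }

    *out = (uint64_t)sec * NSEC_PER_SEC + (uint64_t)nsec;
    return true;
}

/*
 * Hypothetical helper: how far to advance a guest clock value.
 * The elapsed time is capped at MAX_ADVANCE_NS, and 0 is returned
 * if adding the capped value to clock would wrap around uint64_t.
 */
static uint64_t clock_advance_ns(uint64_t clock, uint64_t elapsed)
{
    uint64_t ns = elapsed < MAX_ADVANCE_NS ? elapsed : MAX_ADVANCE_NS;

    if (ns > UINT64_MAX - clock) {
        return 0;
    }
    return ns;
}
```

In the patch the two timestamps would come from
clock_gettime(CLOCK_MONOTONIC, ...) taken after vm_stop() and in
pre_save(); the sketch takes them as parameters so it can be exercised
without a running VM.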

