From: Marcelo Tosatti <mtosatti@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: kvm list <kvm@vger.kernel.org>, Radim Krcmar <rkrcmar@redhat.com>,
stable <stable@vger.kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
Date: Thu, 26 Mar 2015 08:29:07 -0300
Message-ID: <20150326112907.GA15098@amt.cnet>
In-Reply-To: <CALCETrUZspcCyVLTMvTU3VXuUfAAvty7GVPQTECrmP_F0zNJMA@mail.gmail.com>
On Wed, Mar 25, 2015 at 04:22:03PM -0700, Andy Lutomirski wrote:
> On Wed, Mar 25, 2015 at 4:13 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> > On Wed, Mar 25, 2015 at 03:48:02PM -0700, Andy Lutomirski wrote:
> >> On Wed, Mar 25, 2015 at 3:41 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >> > On Wed, Mar 25, 2015 at 03:33:10PM -0700, Andy Lutomirski wrote:
> >> >> On Mar 25, 2015 2:29 PM, "Marcelo Tosatti" <mtosatti@redhat.com> wrote:
> >> >> >
> >> >> > On Wed, Mar 25, 2015 at 01:52:15PM +0100, Radim Krčmář wrote:
> >> >> > > 2015-03-25 12:08+0100, Radim Krčmář:
> >> >> > > > Reverting the patch protects us from any migration, but I don't think we
> >> >> > > > need to care about changing VCPUs as long as we read consistent data
> >> >> > > > from kvmclock. (VCPU can change outside of this loop too, so it doesn't
> >> >> > > > matter if we return a value not fit for this VCPU.)
> >> >> > > >
> >> >> > > > I think we could drop the second __getcpu if our kvmclock was being
> >> >> > > > handled better; maybe with a patch like the one below:
> >> >> > >
> >> >> > > The second __getcpu is not necessary, but I forgot about rdtsc.
> >> >> > > We need to either use rdtscp, know the host has synchronized tsc, or
> >> >> > > monitor VCPU migrations. Only the last one works everywhere.
> >> >> >
> >> >> > The vdso code is only used if host has synchronized tsc.
> >> >> >
> >> >> > But you have to handle the case where host goes from synchronized tsc to
> >> >> > unsynchronized tsc (see the clocksource notifier in the host side).
> >> >> >
> >> >>
> >> >> Can't we change the host to freeze all vcpus and clear the stable bit
> >> >> on all of them if this happens? This would simplify and speed up
> >> >> vclock_gettime.
> >> >>
> >> >> --Andy
> >> >
> >> > Seems interesting to do on 512-vcpus, but sure, could be done.
> >> >
> >>
> >> If you have a 512-vcpu system that switches between stable and
> >> unstable more than once per migration, then I expect that you have
> >> serious problems and this is the least of your worries.
> >>
> >> Personally, I'd *much* rather we just made vcpu 0's pvti authoritative
> >> if we're stable. If nothing else, I'm not even remotely convinced
> >> that the current scheme gives monotonic timing due to skew between
> >> when the updates happen on different vcpus.
> >
> > Can you write down the problem ?
> >
>
> I can try.
>
> Suppose we start out with all vcpus agreeing on their pvti and perfect
> invariant TSCs. Now the host updates its frequency (due to NTP or
> whatever). KVM updates vcpu 0's pvti. Before KVM updates vcpu 1's
> pvti, guest code on vcpus 0 and 1 see synced TSCs but different pvti.
> They'll disagree on the time, and one of them will be ahead until vcpu
> 1's pvti gets updated.

The masterclock scheme enforces the same system_timestamp/tsc_timestamp pairs
to be visible at one time, for all vcpus.

 * That is, when timespec0 != timespec1, M < N. Unfortunately that is not
 * always the case (the difference between two distinct xtime instances
 * might be smaller than the difference between corresponding TSC reads,
 * when updating guest vcpus pvclock areas).
*
* To avoid that problem, do not allow visibility of distinct
* system_timestamp/tsc_timestamp values simultaneously: use a master
* copy of host monotonic time values. Update that master copy
* in lockstep.