Re: Bug in KVM clock backwards compensation

kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Zachary Amsden <zamsden@redhat.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: "Roedel, Joerg" <Joerg.Roedel@amd.com>, kvm <kvm@vger.kernel.org>,
	Jan Kiszka <jan.kiszka@web.de>
Subject: Re: Bug in KVM clock backwards compensation
Date: Fri, 29 Apr 2011 11:17:18 -0700	[thread overview]
Message-ID: <4DBB00AE.6000602@redhat.com> (raw)
In-Reply-To: <20110429084054.GG13402@8bytes.org>

On 04/29/2011 01:40 AM, Joerg Roedel wrote:
> On Thu, Apr 28, 2011 at 08:00:57PM -0700, Zachary Amsden wrote:
>    
>> On 04/28/2011 01:20 PM, Joerg Roedel wrote:
>>      
>    
>>> This code checks how many guest tsc cycles have passed since this vCPU
>>> was de-scheduled last time (and before it is running again). So since
>>> the vCPU hasn't run in the meantime it had no chance to change its TSC.
>>> Further, the other parameters like the TSC offset and the scaling
>>> multiplier havn't changed too, so the only variable in the guest-tsc
>>> calculation is the host-tsc.
>>> So this calculation using the guest-tsc can detect backwards going
>>> host-tsc as good as the old one. The benefit here is that we can feed
>>> consistent values into adjust_tsc_offset().
>>>
>>>        
>> While true, this is more complex than the original code.  The original
>> code here doesn't try to actually compensate for the guest TSC
>> difference - instead what it does is NULL any discovered host TSC delta:
>>      
> Why should KVM care about the host-tsc going backwards when the
> guest-tsc moves forward? I think the bottom-line of this code is to make
> sure the guest-tsc does not jump around (or even jumps backwards).
>
> I also don't agree that this is additional complexity. With these
> changes we were able to get rid of the last_host_tsc variable which is
> actually a simplification.
>
> As I see it, there are two situations here:
>
> 1) On hosts without tsc-scaling the value of tsc_delta calculated from
>     the guest-tsc values is the same as when calculated with host-tsc
>     values, because the guest-tsc only differs by an offset from the
>     host-tsc.
>
> 2) On hosts with tsc-scaling these two tsc_delta values may be
>     different. If the guest has a different tsc-freq as the host, we
>     can't simply adjust the tsc by an offset calculated from the host-tsc
>     values. So tsc_delta has to be calculated using the guest-tsc.
>    
>>          if (tsc_delta<  0)
>>              mark_tsc_unstable("KVM discovered backwards TSC");
>>          if (check_tsc_unstable()) {
>>              kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
>>              vcpu->arch.tsc_catchup = 1;
>>          }
>>          kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>>
>> Erasing that delta also erases elapsed time since the VCPU has last been
>> run, which isn't desirable, so it then sets tsc_catchup mode, which will
>> restore the proper TSC.  The request here triggers code which later
>> updates the TSC offset again.
>>      
> I just looked again at the tsc_catchup thing. Since the clock-update is
> forced in the code above, and the clock-update-code adjusts the tsc
> itself if necessary, is it really necessary to do this here at all?
>    

Unfortunately, yes, it is.  The code in the second or catchup phase can 
only correct an under adjusted clock, or we risk setting the guest clock 
backwards from what has been observed.  So to guarantee there is not a 
huge jump due to over-adjustment, we must eliminate the TSC delta when 
switching CPUs, and that needs to happen in the preempt notifier, not 
the clock update handler.

I agree it is simpler to do everything in terms of guest clock rate, but 
there is one variable which is thrown in to complicate things here - the 
host clock rate may have changed during this whole process.

Let me look at the code again and see if it is possible to get rid of 
the first, host based adjustment entirely.  Ideally we would just track 
things in terms of the guest clock and do all the computation inside the 
clock update request handler instead of having one adjustment in the 
preempt notifier and a separate one later when the clock update happens.

I had originally tried something like that but ran into issues where the 
rate computation got exceedingly difficult.  Maybe the code has been 
restructured enough now that it will work (the clock update didn't used 
to happen unless you had KVM clock...)

Zach

     prev parent reply	other threads:[~2011-04-29 18:17 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-04-28  6:59 Bug in KVM clock backwards compensation Zachary Amsden
2011-04-28  7:06 ` Jan Kiszka
2011-04-28  7:22   ` Roedel, Joerg
2011-04-28 19:06     ` Zachary Amsden
2011-04-28 22:38       ` Jan Kiszka
2011-04-28 17:48   ` Zachary Amsden
2011-04-28  7:13 ` Roedel, Joerg
2011-04-28 18:34   ` Zachary Amsden
2011-04-28 20:20     ` Joerg Roedel
2011-04-29  3:00       ` Zachary Amsden
2011-04-29  8:40         ` Joerg Roedel
2011-04-29 18:17           ` Zachary Amsden [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4DBB00AE.6000602@redhat.com \
    --to=zamsden@redhat.com \
    --cc=Joerg.Roedel@amd.com \
    --cc=jan.kiszka@web.de \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).