From: Zachary Amsden <zamsden@redhat.com>
To: Joerg Roedel <joro@8bytes.org>
Cc: "Roedel, Joerg" <Joerg.Roedel@amd.com>, kvm <kvm@vger.kernel.org>,
	Jan Kiszka <jan.kiszka@web.de>
Subject: Re: Bug in KVM clock backwards compensation
Date: Thu, 28 Apr 2011 20:00:57 -0700
Message-ID: <4DBA29E9.9000905@redhat.com>
In-Reply-To: <20110428202002.GF13402@8bytes.org>

On 04/28/2011 01:20 PM, Joerg Roedel wrote:
> On Thu, Apr 28, 2011 at 11:34:44AM -0700, Zachary Amsden wrote:
>    
>> On 04/28/2011 12:13 AM, Roedel, Joerg wrote:
>>      
>    
>>> I see it differently. This code wants to check if the _guest_ tsc moves
>>> forward (or at least not backwards). So it is fully legitimate to just
>>> do this by reading the guest-tsc and compare it to the last one the
>>> guest had.
>>>        
>> That wasn't the intention when I wrote that code.  It's simply there to
>> detect backwards motion of the host TSC.  The guest TSC can legally go
>> backwards whenever the guest decides to change it, so checking the guest
>> TSC doesn't make sense here.
>>      
> This code checks how many guest tsc cycles have passed since this vCPU
> was de-scheduled last time (and before it is running again). So since
> the vCPU hasn't run in the meantime it had no chance to change its TSC.
> Further, the other parameters like the TSC offset and the scaling
> multiplier haven't changed either, so the only variable in the guest-tsc
> calculation is the host-tsc.
> So this calculation using the guest-tsc can detect a backwards-going
> host-tsc as well as the old one. The benefit here is that we can feed
> consistent values into adjust_tsc_offset().
>    

While true, this is more complex than the original code.  The original 
code here doesn't try to actually compensate for the guest TSC 
difference - instead, it cancels out any discovered host TSC delta:

         if (tsc_delta < 0)
             mark_tsc_unstable("KVM discovered backwards TSC");
         if (check_tsc_unstable()) {
             kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
             vcpu->arch.tsc_catchup = 1;
         }
         kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);

Erasing that delta also erases the time elapsed since the VCPU was last 
run, which isn't desirable, so the code then sets tsc_catchup mode, which 
will restore the proper TSC.  The request here triggers code which later 
updates the TSC offset again.
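
To make the effect of that snippet concrete, here is a small userspace 
model of it.  This is only a sketch: struct vcpu_model, guest_tsc() and 
vcpu_load_model() are made-up names, the guest TSC is assumed to be simply 
host TSC + offset (no scaling), and the global flag stands in for 
mark_tsc_unstable()/check_tsc_unstable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VCPU state, not the kernel's struct. */
struct vcpu_model {
	uint64_t tsc_offset;      /* guest TSC = host TSC + tsc_offset (wrapping) */
	uint64_t last_host_tsc;   /* host TSC sampled when the VCPU was descheduled */
	bool tsc_catchup;         /* models vcpu->arch.tsc_catchup */
	bool clock_update_req;    /* models KVM_REQ_CLOCK_UPDATE being raised */
};

/* Models the global "TSC is unstable" state (assumption for this sketch). */
static bool tsc_unstable;

static uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_tsc)
{
	return host_tsc + v->tsc_offset;
}

/* Models the quoted snippet, working purely in host TSC units. */
static void vcpu_load_model(struct vcpu_model *v, uint64_t host_tsc_now)
{
	int64_t tsc_delta = (int64_t)(host_tsc_now - v->last_host_tsc);

	if (tsc_delta < 0)
		tsc_unstable = true;          /* "KVM discovered backwards TSC" */
	if (tsc_unstable) {
		/* Subtract the whole delta: the guest sees no TSC motion at all. */
		v->tsc_offset -= (uint64_t)tsc_delta;
		v->tsc_catchup = true;        /* elapsed time is restored later */
	}
	v->clock_update_req = true;
}
```

In this model, once the TSC is considered unstable, the guest-visible TSC 
right after the call equals the value it had at deschedule time, whether 
the host TSC went backwards or forwards in between.  That is the "erased 
elapsed time" which catchup mode then has to put back.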

To avoid complexity, I think it's simplest to do the first computation 
in terms of the host TSC.

>> Yes, with tsc-scaling, the machines already have stable TSCs - the above
>> test is for older hardware which could have problems, and can be
>> reverted back to the original code without worrying about switching
>> units.
>>      
> This is the case practically. But architecturally the tsc-scaling feature
> does not depend on a constant tsc, so we can make no such assumptions.
> Additionally, it may happen that Linux mis-detects an unstable tsc for
> some reason (broken BIOS, bug in the code, ...).  Therefore I think it
> is dangerous to assume that this code will never run on tsc-scaling
> capable hosts. And if it does and we don't manage the tsc-offset units
> right, we may see very weird behavior.
>    

I agree, it is best to handle this case - hardware can and will change - 
but the TSC adjustment in terms of guest rate should be done under the 
atomic protection right before entering hardware virtualized mode.  I left 
compute_guest_tsc in place there to recompute time in guest units, even if 
the underlying hardware rate changes:

         /*
          * We may have to catch up the TSC to match elapsed wall clock
          * time for two reasons, even if kvmclock is used.
          *   1) CPU could have been running below the maximum TSC rate
          *   2) Broken TSC compensation resets the base at each VCPU
          *      entry to avoid unknown leaps of TSC even when running
          *      again on the same CPU.  This may cause apparent elapsed
          *      time to disappear, and the guest to stand still or run
          *      very slowly.
          */
         if (vcpu->tsc_catchup) {
                 u64 tsc = compute_guest_tsc(v, kernel_ns);
                 if (tsc > tsc_timestamp) {
                         kvm_x86_ops->adjust_tsc_offset(v, tsc - tsc_timestamp);
                         tsc_timestamp = tsc;
                 }
         }
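
As a rough illustration of what that catchup step does numerically, here 
is a userspace sketch.  The struct and function names are hypothetical, 
and the cycle computation is a plain khz * ns / 10^6 rather than whatever 
scaling the real compute_guest_tsc uses; it also ignores multiplication 
overflow for very long intervals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the state the catchup path needs. */
struct catchup_model {
	uint64_t tsc_offset;      /* offset applied to the guest TSC, guest units */
	uint64_t last_tsc_nsec;   /* kernel_ns when the guest TSC was last written */
	uint64_t last_tsc_write;  /* guest TSC value at that moment */
	uint32_t guest_khz;       /* guest TSC rate in kHz */
};

/* Expected guest TSC at kernel_ns, derived from a stable nanosecond clock. */
static uint64_t model_compute_guest_tsc(const struct catchup_model *v,
					uint64_t kernel_ns)
{
	assert(kernel_ns >= v->last_tsc_nsec);
	uint64_t elapsed_ns = kernel_ns - v->last_tsc_nsec;

	/* kHz * ns / 10^6 = cycles; simplified scaling, may overflow. */
	return v->last_tsc_write + elapsed_ns * v->guest_khz / 1000000ULL;
}

/*
 * Models the tsc_catchup branch: only ever move the guest TSC forward.
 * Returns the (possibly bumped) tsc_timestamp.
 */
static uint64_t model_catchup(struct catchup_model *v, uint64_t tsc_timestamp,
			      uint64_t kernel_ns)
{
	uint64_t tsc = model_compute_guest_tsc(v, kernel_ns);

	if (tsc > tsc_timestamp) {
		v->tsc_offset += tsc - tsc_timestamp;  /* adjustment in guest units */
		tsc_timestamp = tsc;
	}
	return tsc_timestamp;
}
```

For example, with a 2 GHz guest rate and 1 ms elapsed since the last 
write, the expected guest TSC is 2,000,000 cycles past last_tsc_write; if 
the observed timestamp is behind that, the offset is bumped by the 
difference, and if it is already ahead nothing changes.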

So yeah, the code is getting pretty complex, and we'd like to avoid that 
as much as possible - so I would prefer to keep the hardware backwards 
compensation separate from the guest rate computation by doing this:

step 1) remove any backwards hardware TSC delta (in hardware units)
step 2) recompute guest TSC from a stable clock (gotten from kernel_ns)
        and apply adjustment (in guest units)

So it appears you can just share most of the logic of guest TSC catchup 
mode.

What do you think?

Zach
