public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Zachary Amsden <zamsden@redhat.com>
Cc: kvm@vger.kernel.org, Avi Kivity <avi@redhat.com>,
	Glauber Costa <glommer@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [KVM Clock Synchronization 2/4] Keep TSC synchronized across host suspend
Date: Wed, 5 Jan 2011 09:44:09 -0200	[thread overview]
Message-ID: <20110105114409.GA32733@amt.cnet> (raw)
In-Reply-To: <4D23F6DE.7090505@redhat.com>

On Tue, Jan 04, 2011 at 06:43:10PM -1000, Zachary Amsden wrote:
> On 01/04/2011 05:36 AM, Marcelo Tosatti wrote:
> >On Tue, Dec 28, 2010 at 07:38:18PM -1000, Zachary Amsden wrote:
> >>During a host suspend, TSC may go backwards, which KVM interprets
> >>as an unstable TSC.  Technically, KVM should not be marking the
> >>TSC unstable, which causes the TSC clocksource to go bad, but
> >>should be adjusting the TSC offsets in such a case.
> >>
> >>Signed-off-by: Zachary Amsden<zamsden@redhat.com>
> >>---
> >>  arch/x86/include/asm/kvm_host.h |    1 +
> >>  arch/x86/kvm/x86.c              |   66 +++++++++++++++++++++++++++++++++++---
> >>  2 files changed, 61 insertions(+), 6 deletions(-)
> >>
> >>diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >>index 9e6fe39..63a82b0 100644
> >>--- a/arch/x86/include/asm/kvm_host.h
> >>+++ b/arch/x86/include/asm/kvm_host.h
> >>@@ -386,6 +386,7 @@ struct kvm_vcpu_arch {
> >>  	u64 last_kernel_ns;
> >>  	u64 last_tsc_nsec;
> >>  	u64 last_tsc_write;
> >>+	u64 tsc_offset_adjustment;
> >>  	bool tsc_catchup;
> >>
> >>  	bool nmi_pending;
> >>diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >>index b9118f4..b509c01 100644
> >>--- a/arch/x86/kvm/x86.c
> >>+++ b/arch/x86/kvm/x86.c
> >>@@ -2042,12 +2042,20 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >>  	}
> >>
> >>  	kvm_x86_ops->vcpu_load(vcpu, cpu);
> >>+
> >>+	/* Apply any externally detected TSC adjustments (due to suspend) */
> >>+	if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
> >>+		kvm_x86_ops->adjust_tsc_offset(vcpu,
> >>+			vcpu->arch.tsc_offset_adjustment);
> >>+		vcpu->arch.tsc_offset_adjustment = 0;
> >>+		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> >>+	}
> >>  	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
> >>  		/* Make sure TSC doesn't go backwards */
> >>  		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
> >>  				native_read_tsc() - vcpu->arch.last_host_tsc;
> >>  		if (tsc_delta<  0)
> >>-			mark_tsc_unstable("KVM discovered backwards TSC");
> >>+			WARN_ON(!check_tsc_unstable());
> >>  		if (check_tsc_unstable()) {
> >>  			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
> >>  			vcpu->arch.tsc_catchup = 1;
> >>@@ -5778,14 +5786,60 @@ int kvm_arch_hardware_enable(void *garbage)
> >>  {
> >>  	struct kvm *kvm;
> >>  	struct kvm_vcpu *vcpu;
> >>-	int i;
> >>+	int i, ret;
> >>+	u64 local_tsc;
> >>+	u64 max_tsc = 0;
> >>+	bool stable, backwards_tsc = false;
> >>
> >>  	kvm_shared_msr_cpu_online();
> >>-	list_for_each_entry(kvm,&vm_list, vm_list)
> >>-		kvm_for_each_vcpu(i, vcpu, kvm)
> >>-			if (vcpu->cpu == smp_processor_id())
> >>+	local_tsc = native_read_tsc();
> >>+	stable = !check_tsc_unstable();
> >>+	ret = kvm_x86_ops->hardware_enable(garbage);
> >>+	if (ret)
> >>+		return ret;
> >>+
> >>+	list_for_each_entry(kvm,&vm_list, vm_list) {
> >>+		kvm_for_each_vcpu(i, vcpu, kvm) {
> >>+			if (!stable&&  vcpu->cpu == smp_processor_id())
> >>  				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> >>-	return kvm_x86_ops->hardware_enable(garbage);
> >>+			if (stable&&  vcpu->arch.last_host_tsc>  local_tsc) {
> >>+				backwards_tsc = true;
> >>+				if (vcpu->arch.last_host_tsc>  max_tsc)
> >>+					max_tsc = vcpu->arch.last_host_tsc;
> >>+			}
> >>+		}
> >>+	}
> >>+
> >>+	/*
> >>+	 * Sometimes, reliable TSCs go backwards.  This happens
> >>+	 * on platforms that reset TSC during suspend or hibernate
> >>+	 * actions, but maintain synchronization.  We must compensate.
> >>+	 * Unfortunately, we can't bring the TSCs fully up to date
> >>+	 * with real time.  The reason is that we aren't yet far
> >>+	 * enough into CPU bringup that we know how much real time
> >>+	 * has actually elapsed; our helper function, get_kernel_ns()
> >>+	 * will be using boot variables that haven't been updated yet.
> >>+	 * We simply find the maximum observed TSC above, then record
> >>+	 * the adjustment to TSC in each VCPU.  When the VCPU later
> >>+	 * gets loaded, the adjustment will be applied.  Note that we
> >>+	 * accumulate adjustments, in case multiple suspend cycles
> >>+	 * happen before the VCPU gets a chance to run again.
> >>+	 *
> >>+	 * Note that unreliable TSCs will be compensated already by
> >>+	 * the logic in vcpu_load, which sets the TSC to catchup mode.
> >>+	 * This will catchup all VCPUs to real time, but cannot
> >>+	 * guarantee synchronization.
> >>+	 */
> >>+	if (backwards_tsc) {
> >>+		u64 delta_cyc = max_tsc - local_tsc;
> >>+		list_for_each_entry(kvm,&vm_list, vm_list)
> >>+			kvm_for_each_vcpu(i, vcpu, kvm) {
> >>+				vcpu->arch.tsc_offset_adjustment += delta_cyc;
> >>+				vcpu->arch.last_host_tsc = 0;
> >>+			}
> >>+	}
> >>+
> >>+	return 0;
> >>  }
> >Wouldnt it be simpler to use cyc2ns_offset (or something equivalent)? In
> >any case, you forgot to compare smp_processor_id.
> 
> I don't think so, as here, we're already dealing in units of cycles,
> and we don't want to just fix the kvmclock, we want to make sure the
> TSC also does not go backwards.

I meant calculate the backwards delta similarly to cyc2ns_offset, and
use that to adjust offset on vcpu_load, instead of calculating manually
via last_host_tsc.

> And we deliberately do not check smp_processor_id.  The reasoning is
> - if the TSC has gone backwards, but we have a stable TSC, it means
> all TSCs have gone backwards together, and so should all be adjusted
> equally.  Note that backwards_tsc is only set if TSC is marked as
> stable.

OK. 

  reply	other threads:[~2011-01-05 11:44 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-29  5:38 KVM clock synchronization Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 1/4] Make cyc_to_nsec conversions more reliable Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 2/4] Keep TSC synchronized across host suspend Zachary Amsden
2011-01-04 15:36   ` Marcelo Tosatti
2011-01-05  4:43     ` Zachary Amsden
2011-01-05 11:44       ` Marcelo Tosatti [this message]
2010-12-29  5:38 ` [KVM Clock Synchronization 3/4] Refactor KVM clock update code Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 4/4] Add master clock for KVM clock Zachary Amsden
2011-01-04 18:20   ` Marcelo Tosatti
2011-01-04 21:50     ` Zachary Amsden
2011-01-05 12:17 ` KVM clock synchronization Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110105114409.GA32733@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=glommer@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=zamsden@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox