public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Zachary Amsden <zamsden@redhat.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org, Avi Kivity <avi@redhat.com>,
	Glauber Costa <glommer@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [KVM Clock Synchronization 2/4] Keep TSC synchronized across host suspend
Date: Tue, 04 Jan 2011 18:43:10 -1000	[thread overview]
Message-ID: <4D23F6DE.7090505@redhat.com> (raw)
In-Reply-To: <20110104153652.GB31340@amt.cnet>

On 01/04/2011 05:36 AM, Marcelo Tosatti wrote:
> On Tue, Dec 28, 2010 at 07:38:18PM -1000, Zachary Amsden wrote:
>    
>> During a host suspend, TSC may go backwards, which KVM interprets
>> as an unstable TSC.  Technically, KVM should not be marking the
>> TSC unstable, which causes the TSC clocksource to go bad, but
>> should be adjusting the TSC offsets in such a case.
>>
>> Signed-off-by: Zachary Amsden<zamsden@redhat.com>
>> ---
>>   arch/x86/include/asm/kvm_host.h |    1 +
>>   arch/x86/kvm/x86.c              |   66 +++++++++++++++++++++++++++++++++++---
>>   2 files changed, 61 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 9e6fe39..63a82b0 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -386,6 +386,7 @@ struct kvm_vcpu_arch {
>>   	u64 last_kernel_ns;
>>   	u64 last_tsc_nsec;
>>   	u64 last_tsc_write;
>> +	u64 tsc_offset_adjustment;
>>   	bool tsc_catchup;
>>
>>   	bool nmi_pending;
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index b9118f4..b509c01 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2042,12 +2042,20 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>   	}
>>
>>   	kvm_x86_ops->vcpu_load(vcpu, cpu);
>> +
>> +	/* Apply any externally detected TSC adjustments (due to suspend) */
>> +	if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
>> +		kvm_x86_ops->adjust_tsc_offset(vcpu,
>> +			vcpu->arch.tsc_offset_adjustment);
>> +		vcpu->arch.tsc_offset_adjustment = 0;
>> +		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>> +	}
>>   	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
>>   		/* Make sure TSC doesn't go backwards */
>>   		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
>>   				native_read_tsc() - vcpu->arch.last_host_tsc;
>>   		if (tsc_delta<  0)
>> -			mark_tsc_unstable("KVM discovered backwards TSC");
>> +			WARN_ON(!check_tsc_unstable());
>>   		if (check_tsc_unstable()) {
>>   			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
>>   			vcpu->arch.tsc_catchup = 1;
>> @@ -5778,14 +5786,60 @@ int kvm_arch_hardware_enable(void *garbage)
>>   {
>>   	struct kvm *kvm;
>>   	struct kvm_vcpu *vcpu;
>> -	int i;
>> +	int i, ret;
>> +	u64 local_tsc;
>> +	u64 max_tsc = 0;
>> +	bool stable, backwards_tsc = false;
>>
>>   	kvm_shared_msr_cpu_online();
>> -	list_for_each_entry(kvm,&vm_list, vm_list)
>> -		kvm_for_each_vcpu(i, vcpu, kvm)
>> -			if (vcpu->cpu == smp_processor_id())
>> +	local_tsc = native_read_tsc();
>> +	stable = !check_tsc_unstable();
>> +	ret = kvm_x86_ops->hardware_enable(garbage);
>> +	if (ret)
>> +		return ret;
>> +
>> +	list_for_each_entry(kvm,&vm_list, vm_list) {
>> +		kvm_for_each_vcpu(i, vcpu, kvm) {
>> +			if (!stable&&  vcpu->cpu == smp_processor_id())
>>   				kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>> -	return kvm_x86_ops->hardware_enable(garbage);
>> +			if (stable&&  vcpu->arch.last_host_tsc>  local_tsc) {
>> +				backwards_tsc = true;
>> +				if (vcpu->arch.last_host_tsc>  max_tsc)
>> +					max_tsc = vcpu->arch.last_host_tsc;
>> +			}
>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Sometimes, reliable TSCs go backwards.  This happens
>> +	 * on platforms that reset TSC during suspend or hibernate
>> +	 * actions, but maintain synchronization.  We must compensate.
>> +	 * Unfortunately, we can't bring the TSCs fully up to date
>> +	 * with real time.  The reason is that we aren't yet far
>> +	 * enough into CPU bringup that we know how much real time
>> +	 * has actually elapsed; our helper function, get_kernel_ns()
>> +	 * will be using boot variables that haven't been updated yet.
>> +	 * We simply find the maximum observed TSC above, then record
>> +	 * the adjustment to TSC in each VCPU.  When the VCPU later
>> +	 * gets loaded, the adjustment will be applied.  Note that we
>> +	 * accumulate adjustments, in case multiple suspend cycles
>> +	 * happen before the VCPU gets a chance to run again.
>> +	 *
>> +	 * Note that unreliable TSCs will be compensated already by
>> +	 * the logic in vcpu_load, which sets the TSC to catchup mode.
>> +	 * This will catchup all VCPUs to real time, but cannot
>> +	 * guarantee synchronization.
>> +	 */
>> +	if (backwards_tsc) {
>> +		u64 delta_cyc = max_tsc - local_tsc;
>> +		list_for_each_entry(kvm,&vm_list, vm_list)
>> +			kvm_for_each_vcpu(i, vcpu, kvm) {
>> +				vcpu->arch.tsc_offset_adjustment += delta_cyc;
>> +				vcpu->arch.last_host_tsc = 0;
>> +			}
>> +	}
>> +
>> +	return 0;
>>   }
>>      
> Wouldnt it be simpler to use cyc2ns_offset (or something equivalent)? In
> any case, you forgot to compare smp_processor_id.
>    

I don't think so, as here, we're already dealing in units of cycles, and 
we don't want to just fix the kvmclock, we want to make sure the TSC 
also does not go backwards.

And we deliberately do not check smp_processor_id.  The reasoning is - 
if the TSC has gone backwards, but we have a stable TSC, it means all 
TSCs have gone backwards together, and so should all be adjusted 
equally.  Note that backwards_tsc is only set if TSC is marked as stable.

Zach

  reply	other threads:[~2011-01-05  4:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-29  5:38 KVM clock synchronization Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 1/4] Make cyc_to_nsec conversions more reliable Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 2/4] Keep TSC synchronized across host suspend Zachary Amsden
2011-01-04 15:36   ` Marcelo Tosatti
2011-01-05  4:43     ` Zachary Amsden [this message]
2011-01-05 11:44       ` Marcelo Tosatti
2010-12-29  5:38 ` [KVM Clock Synchronization 3/4] Refactor KVM clock update code Zachary Amsden
2010-12-29  5:38 ` [KVM Clock Synchronization 4/4] Add master clock for KVM clock Zachary Amsden
2011-01-04 18:20   ` Marcelo Tosatti
2011-01-04 21:50     ` Zachary Amsden
2011-01-05 12:17 ` KVM clock synchronization Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D23F6DE.7090505@redhat.com \
    --to=zamsden@redhat.com \
    --cc=avi@redhat.com \
    --cc=glommer@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox