* KVM: x86: limit difference between kvmclock updates
@ 2013-05-09 23:21 Marcelo Tosatti
From: Marcelo Tosatti @ 2013-05-09 23:21 UTC (permalink / raw)
To: kvm-devel; +Cc: Glauber Costa
kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu
migration, should not allow system_timestamp from the rest of the vcpus
to remain static. Otherwise ntp frequency correction applies to one
vcpu's system_timestamp but not the others.
So in those cases, request a kvmclock update for all vcpus. The worst
case for a remote vcpu to update its kvmclock is then bounded by maximum
nohz sleep latency.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 94f35d2..a37cadc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1569,6 +1569,30 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
return 0;
}
+/*
+ * kvmclock updates which are isolated to a given vcpu, such as
+ * vcpu->cpu migration, should not allow system_timestamp from
+ * the rest of the vcpus to remain static. Otherwise ntp frequency
+ * correction applies to one vcpu's system_timestamp but not
+ * the others.
+ *
+ * So in those cases, request a kvmclock update for all vcpus.
+ * The worst case for a remote vcpu to update its kvmclock
+ * is then bounded by maximum nohz sleep latency.
+ */
+
+static void kvm_gen_kvmclock_update(struct kvm_vcpu *v)
+{
+ int i;
+ struct kvm *kvm = v->kvm;
+ struct kvm_vcpu *vcpu;
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ set_bit(KVM_REQ_CLOCK_UPDATE, &vcpu->requests);
+ kvm_vcpu_kick(vcpu);
+ }
+}
+
static bool msr_mtrr_valid(unsigned msr)
{
switch (msr) {
@@ -1965,7 +1989,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
kvmclock_reset(vcpu);
vcpu->arch.time = data;
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+ kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
/* we verify if the enable bit is set... */
if (!(data & 1))
@@ -2684,7 +2708,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
* kvmclock on vcpu->cpu migration
*/
if (!vcpu->kvm->arch.use_master_clock || vcpu->cpu == -1)
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+ kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
if (vcpu->cpu != cpu)
kvm_migrate_timers(vcpu);
vcpu->cpu = cpu;
@@ -5704,6 +5728,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
__kvm_migrate_timers(vcpu);
if (kvm_check_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu))
kvm_gen_update_masterclock(vcpu->kvm);
+ if (kvm_check_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu))
+ kvm_gen_kvmclock_update(vcpu);
if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu)) {
r = kvm_guest_time_update(vcpu);
if (unlikely(r))
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7823b63..044b0b9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -126,6 +126,7 @@ static inline bool is_error_page(struct page *page)
#define KVM_REQ_MCLOCK_INPROGRESS 19
#define KVM_REQ_EPR_EXIT 20
#define KVM_REQ_SCAN_IOAPIC 21
+#define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
#define KVM_USERSPACE_IRQ_SOURCE_ID 0
#define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
* Re: KVM: x86: limit difference between kvmclock updates
From: Gleb Natapov @ 2013-05-14 9:05 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Glauber Costa
On Thu, May 09, 2013 at 08:21:41PM -0300, Marcelo Tosatti wrote:
>
> kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu
> migration, should not allow system_timestamp from the rest of the vcpus
> to remain static. Otherwise ntp frequency correction applies to one
> vcpu's system_timestamp but not the others.
>
> So in those cases, request a kvmclock update for all vcpus. The worst
> case for a remote vcpu to update its kvmclock is then bounded by maximum
> nohz sleep latency.
>
Does this mean that when one vcpu is migrated all others are kicked out
of guest mode?
--
Gleb.
* Re: KVM: x86: limit difference between kvmclock updates
From: Marcelo Tosatti @ 2013-05-14 13:12 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm-devel, Glauber Costa
On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> Does this mean that when one vcpu is migrated all others are kicked out
> from a guest mode?
Yes, those which are in guest mode. For guests with a large number of
vcpus this is a problem, but I can't see a simpler way to fix the bug
for now.

Yes, this aspect must be improved (however, the bug causes timers in
the guest to be off by tens of milliseconds with vcpu->pcpu pinning,
which can be unacceptable).
* Re: KVM: x86: limit difference between kvmclock updates
From: Gleb Natapov @ 2013-05-15 17:41 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Glauber Costa
On Tue, May 14, 2013 at 10:12:57AM -0300, Marcelo Tosatti wrote:
> On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> > Does this mean that when one vcpu is migrated all others are kicked out
> > from a guest mode?
>
> Yes, those which are in guest mode. For guests with large number of
> vcpus this is a problem, but i can't see a simpler method to fix the bug
> for now.
>
> Yes, this aspect must be improved (however, the bug incurs on timers in
> the guest taking tens of milliseconds with vcpu->pcpu pinning, which can
> be unacceptable).
Not sure I understand. With vcpu->pcpu pinning there will be no
migration. Do you mean "without" here?
If vcpu->kvm->arch.use_master_clock is false we kick vcpus on each
vcpu_load. When is it false?
I applied the patch since it fixes the real problem, but we need to
evaluate how it affects scalability.
--
Gleb.
* Re: KVM: x86: limit difference between kvmclock updates
2013-05-15 17:41 ` Gleb Natapov
@ 2013-05-15 19:01 ` Marcelo Tosatti
0 siblings, 0 replies; 5+ messages in thread
From: Marcelo Tosatti @ 2013-05-15 19:01 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm-devel, Glauber Costa
On Wed, May 15, 2013 at 08:41:54PM +0300, Gleb Natapov wrote:
> On Tue, May 14, 2013 at 10:12:57AM -0300, Marcelo Tosatti wrote:
> > On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> > > Does this mean that when one vcpu is migrated all others are kicked out
> > > from a guest mode?
> >
> > Yes, those which are in guest mode. For guests with large number of
> > vcpus this is a problem, but i can't see a simpler method to fix the bug
> > for now.
> >
> > Yes, this aspect must be improved (however, the bug incurs on timers in
> > the guest taking tens of milliseconds with vcpu->pcpu pinning, which can
> > be unacceptable).
> Not sure I understand. With vcpu->pcpu pinning there will be no
> migration. Do you mean "without" here?
With vcpu->pcpu pinning there is no guarantee of a kvm_arch_vcpu_load,
and therefore no KVM_REQ_CLOCK_UPDATE. This is the problem.
> If vcpu->kvm->arch.use_master_clock is false we kick vcpus on each
> vcpu_load. When is it false?
When:
- the host does not use the TSC clocksource, or
- the vcpu TSCs are out of sync.
> I applied the patch since it fixes the real problem, but we need to
> evaluate how it affects scalability.
I'll look into ways to reduce the IPIs.