* KVM: x86: limit difference between kvmclock updates
@ 2013-05-09 23:21 Marcelo Tosatti
From: Marcelo Tosatti @ 2013-05-09 23:21 UTC (permalink / raw)
To: kvm-devel; +Cc: Glauber Costa
kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu
migration, should not allow system_timestamp from the rest of the vcpus
to remain static. Otherwise ntp frequency correction applies to one
vcpu's system_timestamp but not the others.
So in those cases, request a kvmclock update for all vcpus. The worst
case for a remote vcpu to update its kvmclock is then bounded by maximum
nohz sleep latency.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 94f35d2..a37cadc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1569,6 +1569,30 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
return 0;
}
+/*
+ * kvmclock updates which are isolated to a given vcpu, such as
+ * vcpu->cpu migration, should not allow system_timestamp from
+ * the rest of the vcpus to remain static. Otherwise ntp frequency
+ * correction applies to one vcpu's system_timestamp but not
+ * the others.
+ *
+ * So in those cases, request a kvmclock update for all vcpus.
+ * The worst case for a remote vcpu to update its kvmclock
+ * is then bounded by maximum nohz sleep latency.
+ */
+
+static void kvm_gen_kvmclock_update(struct kvm_vcpu *v)
+{
+ int i;
+ struct kvm *kvm = v->kvm;
+ struct kvm_vcpu *vcpu;
+
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ set_bit(KVM_REQ_CLOCK_UPDATE, &vcpu->requests);
+ kvm_vcpu_kick(vcpu);
+ }
+}
+
static bool msr_mtrr_valid(unsigned msr)
{
switch (msr) {
@@ -1965,7 +1989,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
kvmclock_reset(vcpu);
vcpu->arch.time = data;
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+ kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
/* we verify if the enable bit is set... */
if (!(data & 1))
@@ -2684,7 +2708,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
* kvmclock on vcpu->cpu migration
*/
if (!vcpu->kvm->arch.use_master_clock || vcpu->cpu == -1)
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
+ kvm_make_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu);
if (vcpu->cpu != cpu)
kvm_migrate_timers(vcpu);
vcpu->cpu = cpu;
@@ -5704,6 +5728,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
__kvm_migrate_timers(vcpu);
if (kvm_check_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu))
kvm_gen_update_masterclock(vcpu->kvm);
+ if (kvm_check_request(KVM_REQ_GLOBAL_CLOCK_UPDATE, vcpu))
+ kvm_gen_kvmclock_update(vcpu);
if (kvm_check_request(KVM_REQ_CLOCK_UPDATE, vcpu)) {
r = kvm_guest_time_update(vcpu);
if (unlikely(r))
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7823b63..044b0b9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -126,6 +126,7 @@ static inline bool is_error_page(struct page *page)
#define KVM_REQ_MCLOCK_INPROGRESS 19
#define KVM_REQ_EPR_EXIT 20
#define KVM_REQ_SCAN_IOAPIC 21
+#define KVM_REQ_GLOBAL_CLOCK_UPDATE 22
#define KVM_USERSPACE_IRQ_SOURCE_ID 0
#define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
* Re: KVM: x86: limit difference between kvmclock updates
From: Gleb Natapov @ 2013-05-14 9:05 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Glauber Costa
On Thu, May 09, 2013 at 08:21:41PM -0300, Marcelo Tosatti wrote:
>
> kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu
> migration, should not allow system_timestamp from the rest of the vcpus
> to remain static. Otherwise ntp frequency correction applies to one
> vcpu's system_timestamp but not the others.
>
> So in those cases, request a kvmclock update for all vcpus. The worst
> case for a remote vcpu to update its kvmclock is then bounded by maximum
> nohz sleep latency.
>
Does this mean that when one vcpu is migrated all others are kicked out
of guest mode?
--
Gleb.
* Re: KVM: x86: limit difference between kvmclock updates
From: Marcelo Tosatti @ 2013-05-14 13:12 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm-devel, Glauber Costa
On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> Does this mean that when one vcpu is migrated all others are kicked out
> from a guest mode?
Yes, those which are in guest mode. For guests with a large number of
vcpus this is a problem, but I can't see a simpler way to fix the bug
for now.

Yes, this aspect must be improved (however, the bug causes timers in
the guest to be off by tens of milliseconds with vcpu->pcpu pinning,
which can be unacceptable).
* Re: KVM: x86: limit difference between kvmclock updates
From: Gleb Natapov @ 2013-05-15 17:41 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm-devel, Glauber Costa
On Tue, May 14, 2013 at 10:12:57AM -0300, Marcelo Tosatti wrote:
> On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> > Does this mean that when one vcpu is migrated all others are kicked out
> > from a guest mode?
>
> Yes, those which are in guest mode. For guests with large number of
> vcpus this is a problem, but i can't see a simpler method to fix the bug
> for now.
>
> Yes, this aspect must be improved (however, the bug incurs on timers in
> the guest taking tens of milliseconds with vcpu->pcpu pinning, which can
> be unacceptable).
Not sure I understand. With vcpu->pcpu pinning there will be no
migration. Do you mean "without" here?
If vcpu->kvm->arch.use_master_clock is false we kick vcpus on each
vcpu_load. When is it false?
I applied the patch since it fixes the real problem, but we need to
evaluate how it affects scalability.
--
Gleb.
* Re: KVM: x86: limit difference between kvmclock updates
2013-05-15 17:41 ` Gleb Natapov
@ 2013-05-15 19:01 ` Marcelo Tosatti
0 siblings, 0 replies; 5+ messages in thread
From: Marcelo Tosatti @ 2013-05-15 19:01 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm-devel, Glauber Costa
On Wed, May 15, 2013 at 08:41:54PM +0300, Gleb Natapov wrote:
> On Tue, May 14, 2013 at 10:12:57AM -0300, Marcelo Tosatti wrote:
> > On Tue, May 14, 2013 at 12:05:13PM +0300, Gleb Natapov wrote:
> > > Does this mean that when one vcpu is migrated all others are kicked out
> > > from a guest mode?
> >
> > Yes, those which are in guest mode. For guests with large number of
> > vcpus this is a problem, but i can't see a simpler method to fix the bug
> > for now.
> >
> > Yes, this aspect must be improved (however, the bug incurs on timers in
> > the guest taking tens of milliseconds with vcpu->pcpu pinning, which can
> > be unacceptable).
> Not sure I understand. With vcpu->pcpu pinning there will be no
> migration. Do you mean "without" here?
With vcpu->pcpu pinning there is no guarantee of a kvm_arch_vcpu_load,
and therefore no KVM_REQ_CLOCK_UPDATE. This is the problem.
> If vcpu->kvm->arch.use_master_clock is false we kick vcpus on each
> vcpu_load. When is it false?
When:
- the host does not use the TSC clocksource, or
- the vcpu TSCs are out of sync.
> I applied the patch since it fixes the real problem, but we need to
> evaluate how it affects scalability.
I'll look into ways to reduce the IPIs.