* [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic
@ 2025-08-19 15:20 Lei Chen
2025-08-19 15:20 ` [PATCH v1 1/3] Revert "x86: kvm: introduce periodic global clock updates" Lei Chen
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Lei Chen @ 2025-08-19 15:20 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin
Cc: kvm, linux-kernel
This patch series simplifies kvmclock updating logic by reverting
related commits.
Now we have three requests about time updating:
1. KVM_REQ_CLOCK_UPDATE:
The function kvm_guest_time_update gathers info from master clock
or host.rdtsc() and update vcpu->arch.hvclock, and then kvmclock or hyperv
reference counter.
2. KVM_REQ_MASTERCLOCK_UPDATE:
The function kvm_update_masterclock updates kvm->arch from
pvclock_gtod_data(a global var updated by timekeeping subsystem), and
then make KVM_REQ_CLOCK_UPDATE request for each vcpu.
3. KVM_REQ_GLOBAL_CLOCK_UPDATE:
The function kvm_gen_kvmclock_update makes KVM_REQ_CLOCK_UPDATE
request for each vcpu.
In the early implementation, functions mentioned above were
synchronous. But things got complicated since the following commits.
1. Commit 7e44e4495a39 ("x86: kvm: rate-limit global clock updates")
intends to use kvmclock_update_work to sync ntp corretion
across all vcpus kvmclock, which is based on commit 0061d53daf26f
("KVM: x86: limit difference between kvmclock updates")
2. Commit 332967a3eac0 ("x86: kvm: introduce periodic global clock
updates") introduced a 300s-interval work to periodically sync
ntp corrections across all vcpus.
I think those commits could be reverted because:
1. Since commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
monotonic raw clock"), kvmclock switched to mono raw clock,
Those two commits could be reverted.
2. the periodic work introduced from commit 332967a3eac0 ("x86:
kvm: introduce periodic global clock updates") always does
nothing for normal scenarios. If some exceptions happen,
the corresponding logic makes right CLOCK_UPDATE request for right vcpus.
The following shows what exceptions might happen and how they are
handled.
(1). cpu_tsc_khz changed
__kvmclock_cpufreq_notifier makes KVM_REQ_CLOCK_UPDATE request
(2). use/unuse master clock
kvm_track_tsc_matching makes KVM_REQ_MASTERCLOCK_UPDATE, which means
KVM_REQ_CLOCK_UPDATE for each vcpu.
(3). guest writes MSR_IA32_TSC
kvm_synchronize_tsc will handle it and finally call
kvm_track_tsc_matching to make everything well.
(4). enable/disable tsc_catchup
kvm_arch_vcpu_load and bottom half of vcpu_enter_guest makes
KVM_REQ_CLOCK_UPDATE request
Really happy for your comments, thanks.
Related links:
https://lkml.indiana.edu/hypermail/linux/kernel/2310.0/04217.html
https://patchew.org/linux/20240522001817.619072-1-dwmw2@infradead.org/20240522001817.619072-20-dwmw2@infradead.org/
Lei Chen (3):
Revert "x86: kvm: introduce periodic global clock updates"
Revert "x86: kvm: rate-limit global clock updates"
KVM: x86: remove comment about ntp correction sync for
arch/x86/include/asm/kvm_host.h | 2 --
arch/x86/kvm/x86.c | 58 +++------------------------------
2 files changed, 5 insertions(+), 55 deletions(-)
--
2.44.0
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH v1 1/3] Revert "x86: kvm: introduce periodic global clock updates"
2025-08-19 15:20 [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic Lei Chen
@ 2025-08-19 15:20 ` Lei Chen
2025-08-19 15:20 ` [PATCH v1 2/3] Revert "x86: kvm: rate-limit " Lei Chen
2025-08-19 15:20 ` [PATCH v1 3/3] KVM: x86: remove comment about ntp correction sync for Lei Chen
2 siblings, 0 replies; 4+ messages in thread
From: Lei Chen @ 2025-08-19 15:20 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin
Cc: kvm, linux-kernel
This reverts commit 332967a3eac06f6379283cf155c84fe7cd0537c2.
Commit 332967a3eac0 ("x86: kvm: introduce periodic global clock
updates") introduced a 300s interval work to sync ntp corrections
across all vcpus.
Since commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
monotonic raw clock"), kvmclock switched to mono raw clock,
we can no longer take ntp into consideration.
Signed-off-by: Lei Chen <lei.chen@smartx.com>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/x86.c | 25 -------------------------
2 files changed, 26 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f19a76d3ca0e..e41e4fe91f5e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1436,7 +1436,6 @@ struct kvm_arch {
u64 master_kernel_ns;
u64 master_cycle_now;
struct delayed_work kvmclock_update_work;
- struct delayed_work kvmclock_sync_work;
#ifdef CONFIG_KVM_HYPERV
struct kvm_hv hyperv;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1c49bc681c4..399045a384d4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -157,9 +157,6 @@ EXPORT_SYMBOL_GPL(report_ignored_msrs);
unsigned int min_timer_period_us = 200;
module_param(min_timer_period_us, uint, 0644);
-static bool __read_mostly kvmclock_periodic_sync = true;
-module_param(kvmclock_periodic_sync, bool, 0444);
-
/* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
static u32 __read_mostly tsc_tolerance_ppm = 250;
module_param(tsc_tolerance_ppm, uint, 0644);
@@ -3439,20 +3436,6 @@ static void kvm_gen_kvmclock_update(struct kvm_vcpu *v)
KVMCLOCK_UPDATE_DELAY);
}
-#define KVMCLOCK_SYNC_PERIOD (300 * HZ)
-
-static void kvmclock_sync_fn(struct work_struct *work)
-{
- struct delayed_work *dwork = to_delayed_work(work);
- struct kvm_arch *ka = container_of(dwork, struct kvm_arch,
- kvmclock_sync_work);
- struct kvm *kvm = container_of(ka, struct kvm, arch);
-
- schedule_delayed_work(&kvm->arch.kvmclock_update_work, 0);
- schedule_delayed_work(&kvm->arch.kvmclock_sync_work,
- KVMCLOCK_SYNC_PERIOD);
-}
-
/* These helpers are safe iff @msr is known to be an MCx bank MSR. */
static bool is_mci_control_msr(u32 msr)
{
@@ -12327,8 +12310,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
{
- struct kvm *kvm = vcpu->kvm;
-
if (mutex_lock_killable(&vcpu->mutex))
return;
vcpu_load(vcpu);
@@ -12339,10 +12320,6 @@ void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
vcpu->arch.msr_kvm_poll_control = 1;
mutex_unlock(&vcpu->mutex);
-
- if (kvmclock_periodic_sync && vcpu->vcpu_idx == 0)
- schedule_delayed_work(&kvm->arch.kvmclock_sync_work,
- KVMCLOCK_SYNC_PERIOD);
}
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
@@ -12722,7 +12699,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
#endif
INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn);
- INIT_DELAYED_WORK(&kvm->arch.kvmclock_sync_work, kvmclock_sync_fn);
kvm_apicv_init(kvm);
kvm_hv_init_vm(kvm);
@@ -12830,7 +12806,6 @@ void kvm_arch_pre_destroy_vm(struct kvm *kvm)
* is unsafe, i.e. will lead to use-after-free. The PIT also needs to
* be stopped before IRQ routing is freed.
*/
- cancel_delayed_work_sync(&kvm->arch.kvmclock_sync_work);
cancel_delayed_work_sync(&kvm->arch.kvmclock_update_work);
#ifdef CONFIG_KVM_IOAPIC
--
2.44.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v1 2/3] Revert "x86: kvm: rate-limit global clock updates"
2025-08-19 15:20 [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic Lei Chen
2025-08-19 15:20 ` [PATCH v1 1/3] Revert "x86: kvm: introduce periodic global clock updates" Lei Chen
@ 2025-08-19 15:20 ` Lei Chen
2025-08-19 15:20 ` [PATCH v1 3/3] KVM: x86: remove comment about ntp correction sync for Lei Chen
2 siblings, 0 replies; 4+ messages in thread
From: Lei Chen @ 2025-08-19 15:20 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin
Cc: kvm, linux-kernel
This reverts commit 7e44e4495a398eb553ce561f29f9148f40a3448f.
Commit 7e44e4495a39 ("x86: kvm: rate-limit global clock updates")
intends to use a kvmclock_update_work to sync ntp corretion
across all vcpus kvmclock, which is based on commit 0061d53daf26f
("KVM: x86: limit difference between kvmclock updates")
Since kvmclock has been switched to mono raw, this commit can be
reverted.
Signed-off-by: Lei Chen <lei.chen@smartx.com>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/x86.c | 29 ++++-------------------------
2 files changed, 4 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e41e4fe91f5e..0a1165f40ff1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1435,7 +1435,6 @@ struct kvm_arch {
bool use_master_clock;
u64 master_kernel_ns;
u64 master_cycle_now;
- struct delayed_work kvmclock_update_work;
#ifdef CONFIG_KVM_HYPERV
struct kvm_hv hyperv;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 399045a384d4..d526e9e285f1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3404,22 +3404,14 @@ uint64_t kvm_get_wall_clock_epoch(struct kvm *kvm)
* the others.
*
* So in those cases, request a kvmclock update for all vcpus.
- * We need to rate-limit these requests though, as they can
- * considerably slow guests that have a large number of vcpus.
- * The time for a remote vcpu to update its kvmclock is bound
- * by the delay we use to rate-limit the updates.
+ * The worst case for a remote vcpu to update its kvmclock
+ * is then bounded by maximum nohz sleep latency.
*/
-
-#define KVMCLOCK_UPDATE_DELAY msecs_to_jiffies(100)
-
-static void kvmclock_update_fn(struct work_struct *work)
+static void kvm_gen_kvmclock_update(struct kvm_vcpu *v)
{
unsigned long i;
- struct delayed_work *dwork = to_delayed_work(work);
- struct kvm_arch *ka = container_of(dwork, struct kvm_arch,
- kvmclock_update_work);
- struct kvm *kvm = container_of(ka, struct kvm, arch);
struct kvm_vcpu *vcpu;
+ struct kvm *kvm = v->kvm;
kvm_for_each_vcpu(i, vcpu, kvm) {
kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
@@ -3427,15 +3419,6 @@ static void kvmclock_update_fn(struct work_struct *work)
}
}
-static void kvm_gen_kvmclock_update(struct kvm_vcpu *v)
-{
- struct kvm *kvm = v->kvm;
-
- kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
- schedule_delayed_work(&kvm->arch.kvmclock_update_work,
- KVMCLOCK_UPDATE_DELAY);
-}
-
/* These helpers are safe iff @msr is known to be an MCx bank MSR. */
static bool is_mci_control_msr(u32 msr)
{
@@ -12698,8 +12681,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.hv_root_tdp = INVALID_PAGE;
#endif
- INIT_DELAYED_WORK(&kvm->arch.kvmclock_update_work, kvmclock_update_fn);
-
kvm_apicv_init(kvm);
kvm_hv_init_vm(kvm);
kvm_xen_init_vm(kvm);
@@ -12806,8 +12787,6 @@ void kvm_arch_pre_destroy_vm(struct kvm *kvm)
* is unsafe, i.e. will lead to use-after-free. The PIT also needs to
* be stopped before IRQ routing is freed.
*/
- cancel_delayed_work_sync(&kvm->arch.kvmclock_update_work);
-
#ifdef CONFIG_KVM_IOAPIC
kvm_free_pit(kvm);
#endif
--
2.44.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH v1 3/3] KVM: x86: remove comment about ntp correction sync for
2025-08-19 15:20 [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic Lei Chen
2025-08-19 15:20 ` [PATCH v1 1/3] Revert "x86: kvm: introduce periodic global clock updates" Lei Chen
2025-08-19 15:20 ` [PATCH v1 2/3] Revert "x86: kvm: rate-limit " Lei Chen
@ 2025-08-19 15:20 ` Lei Chen
2 siblings, 0 replies; 4+ messages in thread
From: Lei Chen @ 2025-08-19 15:20 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin
Cc: kvm, linux-kernel
Since vcpu local clock is no longer affected by ntp,
remove comment about ntp correction sync for function
kvm_gen_kvmclock_update.
Signed-off-by: Lei Chen <lei.chen@smartx.com>
---
arch/x86/kvm/x86.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d526e9e285f1..f85611f59218 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3399,9 +3399,7 @@ uint64_t kvm_get_wall_clock_epoch(struct kvm *kvm)
/*
* kvmclock updates which are isolated to a given vcpu, such as
* vcpu->cpu migration, should not allow system_timestamp from
- * the rest of the vcpus to remain static. Otherwise ntp frequency
- * correction applies to one vcpu's system_timestamp but not
- * the others.
+ * the rest of the vcpus to remain static.
*
* So in those cases, request a kvmclock update for all vcpus.
* The worst case for a remote vcpu to update its kvmclock
--
2.44.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-19 15:20 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-19 15:20 [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic Lei Chen
2025-08-19 15:20 ` [PATCH v1 1/3] Revert "x86: kvm: introduce periodic global clock updates" Lei Chen
2025-08-19 15:20 ` [PATCH v1 2/3] Revert "x86: kvm: rate-limit " Lei Chen
2025-08-19 15:20 ` [PATCH v1 3/3] KVM: x86: remove comment about ntp correction sync for Lei Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).