Linux Documentation
 help / color / mirror / Atom feed
From: David Woodhouse <dwmw2@infradead.org>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Sean Christopherson <seanjc@google.com>,
	Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Juergen Gross <jgross@suse.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	David Woodhouse <dwmw2@infradead.org>,
	Paul Durrant <paul@xen.org>, Jonathan Cameron <jic23@kernel.org>,
	Sascha Bischoff <Sascha.Bischoff@arm.com>,
	Marc Zyngier <maz@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
	Jack Allister <jalliste@amazon.com>,
	Dongli Zhang <dongli.zhang@oracle.com>,
	joe.jin@oracle.com, kvm@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	xen-devel@lists.xenproject.org, linux-kselftest@vger.kernel.org
Subject: [PATCH v5 15/34] KVM: x86: Restructure kvm_guest_time_update() for TSC upscaling
Date: Mon,  8 Jun 2026 15:47:56 +0100	[thread overview]
Message-ID: <20260608145455.89187-16-dwmw2@infradead.org> (raw)
In-Reply-To: <20260608145455.89187-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

Restructure kvm_guest_time_update() so that kernel_ns/host_tsc are
always "now" when doing TSC catchup, then swap in the master clock
reference values afterward for the hv_clock.

This makes the TSC upscaling code considerably simpler: the catchup
adjustment is computed as the delta between what the guest TSC *should*
be at "now" and what it actually is, rather than mixing "now" and
"master clock reference" timestamps.

The seqcount loop now also contains the kvm_get_time_and_clockread()
call (matching get_kvmclock's pattern), with the same WARN for
unexpected failure.

Based on a suggestion by Sean Christopherson.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kvm/x86.c | 74 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 53 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8aae22401046..92e32d720523 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3363,46 +3363,63 @@ static void kvm_setup_guest_pvclock(struct pvclock_vcpu_time_info *ref_hv_clock,
 int kvm_guest_time_update(struct kvm_vcpu *v)
 {
 	struct pvclock_vcpu_time_info hv_clock = {};
-	unsigned long flags;
 	u64 tgt_tsc_hz;
 	unsigned seq;
 	struct kvm_vcpu_arch *vcpu = &v->arch;
 	struct kvm_arch *ka = &v->kvm->arch;
 	s64 kernel_ns;
 	u64 tsc_timestamp, host_tsc;
+	u64 master_host_tsc = 0;
+	s64 master_kernel_ns = 0;
 	bool use_master_clock;
 
-	kernel_ns = 0;
-	host_tsc = 0;
-
 	/*
 	 * If the host uses TSC clock, then passthrough TSC as stable
 	 * to the guest.
 	 */
 	do {
 		seq = read_seqcount_begin(&ka->pvclock_sc);
+
 		use_master_clock = ka->use_master_clock;
-		if (use_master_clock) {
-			host_tsc = ka->master_cycle_now;
-			kernel_ns = ka->master_kernel_ns;
-		}
+
+		/*
+		 * The TSC read and the call to get_cpu_tsc_khz() must happen
+		 * on the same CPU.
+		 */
+		get_cpu();
+
+		tgt_tsc_hz = (u64)get_cpu_tsc_khz() * 1000;
+
+#ifdef CONFIG_X86_64
+		if (use_master_clock &&
+		    !kvm_get_time_and_clockread(&kernel_ns, &host_tsc) &&
+		    !read_seqcount_retry(&ka->pvclock_sc, seq))
+			use_master_clock = false;
+#endif
+
+		put_cpu();
+
+		if (!use_master_clock)
+			break;
+
+		master_host_tsc = ka->master_cycle_now;
+		master_kernel_ns = ka->master_kernel_ns;
 	} while (read_seqcount_retry(&ka->pvclock_sc, seq));
 
-	/* Keep irq disabled to prevent changes to the clock */
-	local_irq_save(flags);
-	tgt_tsc_hz = (u64)get_cpu_tsc_khz() * 1000;
 	if (unlikely(tgt_tsc_hz == 0)) {
-		local_irq_restore(flags);
 		kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
 		return 1;
 	}
+
 	if (!use_master_clock) {
+		unsigned long flags;
+
+		local_irq_save(flags);
 		host_tsc = rdtsc();
 		kernel_ns = get_kvmclock_base_ns();
+		local_irq_restore(flags);
 	}
 
-	tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
-
 	/*
 	 * We may have to catch up the TSC to match elapsed wall clock
 	 * time for two reasons, even if kvmclock is used.
@@ -3411,17 +3428,32 @@ int kvm_guest_time_update(struct kvm_vcpu *v)
 	 *      entry to avoid unknown leaps of TSC even when running
 	 *      again on the same CPU.  This may cause apparent elapsed
 	 *      time to disappear, and the guest to stand still or run
-	 *	very slowly.
+	 *      very slowly.
 	 */
 	if (vcpu->tsc_catchup) {
-		u64 tsc = compute_guest_tsc(v, kernel_ns);
-		if (tsc > tsc_timestamp) {
-			adjust_tsc_offset_guest(v, tsc - tsc_timestamp);
-			tsc_timestamp = tsc;
-		}
+		s64 adjustment;
+
+		/*
+		 * Calculate the delta between what the guest TSC *should* be
+		 * and what it actually is according to kvm_read_l1_tsc().
+		 */
+		adjustment = compute_guest_tsc(v, kernel_ns) -
+			     kvm_read_l1_tsc(v, host_tsc);
+		if (adjustment > 0)
+			adjust_tsc_offset_guest(v, adjustment);
 	}
 
-	local_irq_restore(flags);
+	/*
+	 * Now that TSC upscaling is out of the way, the remaining calculations
+	 * are all relative to the reference time that's placed in hv_clock.
+	 * If the master clock is NOT in use, the reference time is "now".  If
+	 * master clock is in use, the reference time comes from there.
+	 */
+	if (use_master_clock) {
+		host_tsc = master_host_tsc;
+		kernel_ns = master_kernel_ns;
+	}
+	tsc_timestamp = kvm_read_l1_tsc(v, host_tsc);
 
 	/* With all the info we got, fill in the values */
 
-- 
2.54.0


  parent reply	other threads:[~2026-06-08 14:55 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08 14:47 [PATCH v5 00/34] Cleaning up the KVM clock mess David Woodhouse
2026-06-08 14:47 ` [PATCH v5 01/34] KVM: x86/xen: Do not corrupt KVM clock in kvm_xen_shared_info_init() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 02/34] KVM: x86: Improve accuracy of KVM clock when TSC scaling is in force David Woodhouse
2026-06-08 14:47 ` [PATCH v5 03/34] UAPI: x86: Move pvclock-abi to UAPI for x86 platforms David Woodhouse
2026-06-08 14:47 ` [PATCH v5 04/34] KVM: x86: Add KVM_[GS]ET_CLOCK_GUEST for accurate KVM clock migration David Woodhouse
2026-06-16  6:47   ` Dongli Zhang
2026-06-16 11:13     ` David Woodhouse
2026-06-23  8:50       ` David Woodhouse
2026-06-08 14:47 ` [PATCH v5 05/34] KVM: selftests: Add KVM/PV clock selftest to prove timer correction David Woodhouse
2026-06-08 14:47 ` [PATCH v5 06/34] KVM: x86: Explicitly disable TSC scaling without CONSTANT_TSC David Woodhouse
2026-07-01 21:38   ` Sean Christopherson
2026-07-02  8:58     ` David Woodhouse
2026-06-08 14:47 ` [PATCH v5 07/34] KVM: x86: Activate master clock immediately on vCPU creation David Woodhouse
2026-06-08 14:47 ` [PATCH v5 08/34] KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration David Woodhouse
2026-06-09 23:26   ` Randy Dunlap
2026-07-01 21:47   ` Sean Christopherson
2026-07-02  8:09     ` David Woodhouse
2026-06-08 14:47 ` [PATCH v5 09/34] KVM: x86: Avoid NTP frequency skew for KVM clock on 32-bit host David Woodhouse
2026-06-08 14:47 ` [PATCH v5 10/34] KVM: x86: Fold __get_kvmclock() into get_kvmclock() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 11/34] KVM: x86: Restructure get_kvmclock() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 12/34] KVM: x86: Fix KVM clock precision in get_kvmclock() with TSC scaling David Woodhouse
2026-06-08 14:47 ` [PATCH v5 13/34] KVM: x86: Use get_kvmclock() in kvm_get_wall_clock_epoch() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 14/34] KVM: x86: Fix compute_guest_tsc() to handle negative time deltas David Woodhouse
2026-06-08 14:47 ` David Woodhouse [this message]
2026-06-08 14:47 ` [PATCH v5 16/34] KVM: x86: Simplify and comment kvm_get_time_scale() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 17/34] KVM: x86: Remove implicit rdtsc() from kvm_compute_l1_tsc_offset() David Woodhouse
2026-06-08 14:47 ` [PATCH v5 18/34] KVM: x86: Improve synchronization in kvm_synchronize_tsc() David Woodhouse
2026-06-08 14:48 ` [PATCH v5 19/34] KVM: x86: Kill last_tsc_{nsec,write,offset} fields David Woodhouse
2026-06-08 14:48 ` [PATCH v5 20/34] KVM: x86: Replace nr_vcpus_matched_tsc count with all_vcpus_matched_tsc bool David Woodhouse
2026-06-08 14:48 ` [PATCH v5 21/34] KVM: x86: Allow KVM master clock mode when TSCs are offset from each other David Woodhouse
2026-06-08 14:48 ` [PATCH v5 22/34] KVM: selftests: Add master clock offset test David Woodhouse
2026-06-08 14:48 ` [PATCH v5 23/34] KVM: x86: Factor out kvm_use_master_clock() David Woodhouse
2026-06-08 14:48 ` [PATCH v5 24/34] KVM: x86: Avoid gratuitous global clock updates David Woodhouse
2026-06-08 14:48 ` [PATCH v5 25/34] KVM: x86/xen: Prevent runstate times from becoming negative David Woodhouse
2026-06-08 14:48 ` [PATCH v5 26/34] KVM: x86: Avoid redundant masterclock updates from multiple vCPUs David Woodhouse
2026-06-08 14:48 ` [PATCH v5 27/34] KVM: x86: Remove runtime Xen TSC frequency CPUID update David Woodhouse
2026-06-08 14:48 ` [PATCH v5 28/34] KVM: selftests: Add Xen/generic CPUID timing leaf test David Woodhouse
2026-06-08 14:48 ` [PATCH v5 29/34] KVM: x86: Re-synchronize TSC after KVM_SET_TSC_KHZ David Woodhouse
2026-06-08 14:48 ` [PATCH v5 30/34] KVM: selftests: Add Xen runstate migration test David Woodhouse
2026-06-08 14:48 ` [PATCH v5 31/34] KVM: x86: Use ktime_get_snapshot_id() for master clock David Woodhouse
2026-06-08 14:48 ` [PATCH v5 32/34] KVM: x86: Compute kvmclock base without pvclock_gtod_data David Woodhouse
2026-06-08 14:48 ` [PATCH v5 33/34] KVM: x86: Replace pvclock_gtod_data vclock_mode with boolean David Woodhouse
2026-06-08 14:48 ` [PATCH v5 34/34] KVM: x86: Remove pvclock_gtod_data and private timekeeping code David Woodhouse
2026-06-09 18:50 ` [PATCH v5 00/34] Cleaning up the KVM clock mess David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260608145455.89187-16-dwmw2@infradead.org \
    --to=dwmw2@infradead.org \
    --cc=Sascha.Bischoff@arm.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongli.zhang@oracle.com \
    --cc=hpa@zytor.com \
    --cc=jalliste@amazon.com \
    --cc=jgross@suse.com \
    --cc=jic23@kernel.org \
    --cc=joe.jin@oracle.com \
    --cc=joey.gouly@arm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=skhan@linuxfoundation.org \
    --cc=tglx@kernel.org \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox