From: David Woodhouse <dwmw2@infradead.org>
To: Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Sean Christopherson <seanjc@google.com>,
Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Juergen Gross <jgross@suse.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
David Woodhouse <dwmw2@infradead.org>,
Paul Durrant <paul@xen.org>, Jonathan Cameron <jic23@kernel.org>,
Sascha Bischoff <Sascha.Bischoff@arm.com>,
Marc Zyngier <maz@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
Jack Allister <jalliste@amazon.com>,
Dongli Zhang <dongli.zhang@oracle.com>,
joe.jin@oracle.com, kvm@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
xen-devel@lists.xenproject.org, linux-kselftest@vger.kernel.org
Subject: [PATCH v4 07/30] KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the documentation on TSC migration
Date: Sat, 9 May 2026 23:46:33 +0100
Message-ID: <20260509224824.3264567-8-dwmw2@infradead.org>
In-Reply-To: <20260509224824.3264567-1-dwmw2@infradead.org>

From: David Woodhouse <dwmw@amazon.co.uk>

The documentation on TSC migration using KVM_VCPU_TSC_OFFSET is woefully
inadequate. It ignores TSC scaling, and ignores the fact that the host
TSC may differ from one host to the next (and in fact because of the way
the kernel calibrates it, it generally differs from one boot to the next
even on the same hardware).

Add KVM_VCPU_TSC_SCALE to extract the actual scale ratio and frac_bits,
and attempt to document the process that userspace needs to follow to
preserve the TSC across migration.

Only enumerate KVM_VCPU_TSC_SCALE when kvm_caps.has_tsc_control is true,
since the scaling ratio is only meaningful when hardware TSC scaling is
supported.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
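For illustration only (not part of the patch): a minimal userspace sketch
of the fixed-point conversion documented for KVM_VCPU_TSC_SCALE. The
structure and function names here (tsc_scale_example, scale_host_tsc,
guest_tsc_from_host) are hypothetical; the structure merely mirrors the
layout of struct kvm_vcpu_tsc_scale so that the sketch builds without
updated UAPI headers. It assumes a compiler that provides unsigned
__int128 (GCC and Clang on 64-bit targets), because the product of two
64-bit values can exceed 64 bits before the shift.

```c
#include <stdint.h>

/* Hypothetical mirror of struct kvm_vcpu_tsc_scale from this patch. */
struct tsc_scale_example {
	uint64_t tsc_ratio;     /* fixed-point scaling ratio */
	uint64_t tsc_frac_bits; /* number of fractional bits in tsc_ratio */
};

/*
 * Compute (host_tsc * tsc_ratio) >> tsc_frac_bits.
 * A 128-bit intermediate is used so the multiplication cannot overflow.
 * Assumes tsc_frac_bits < 128, which holds for any realistic value.
 */
static inline uint64_t scale_host_tsc(uint64_t host_tsc,
				      const struct tsc_scale_example *s)
{
	unsigned __int128 product = (unsigned __int128)host_tsc * s->tsc_ratio;

	return (uint64_t)(product >> s->tsc_frac_bits);
}

/*
 * guest_tsc = ((host_tsc * tsc_ratio) >> tsc_frac_bits) + offset
 *
 * The offset is carried as an unsigned 64-bit value; a "negative" offset
 * is its two's complement, so the addition wraps modulo 2^64 as intended.
 */
static inline uint64_t guest_tsc_from_host(uint64_t host_tsc,
					   const struct tsc_scale_example *s,
					   uint64_t tsc_offset)
{
	return scale_host_tsc(host_tsc, s) + tsc_offset;
}
```

As a worked case, with tsc_frac_bits = 48 and tsc_ratio = 3 << 47 (a ratio
of 1.5), a host TSC of 1000 scales to 1500 before the offset is added.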
Documentation/virt/kvm/devices/vcpu.rst | 36 ++++++++++++++++++++++++-
arch/x86/include/uapi/asm/kvm.h | 6 +++++
arch/x86/kvm/x86.c | 22 +++++++++++++++
3 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 5e3805820010..56562b932280 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -243,7 +243,10 @@ Returns:
Specifies the guest's TSC offset relative to the host's TSC. The guest's
TSC is then derived by the following equation:
- guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET
+ guest_tsc = ((host_tsc * tsc_ratio) >> tsc_frac_bits) + KVM_VCPU_TSC_OFFSET
+
+The values of tsc_ratio and tsc_frac_bits can be obtained using
+the KVM_VCPU_TSC_SCALE attribute.
This attribute is useful to adjust the guest's TSC on live migration,
so that the TSC counts the time during which the VM was paused. The
@@ -292,3 +295,34 @@ From the destination VMM process:
7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
respective value derived in the previous step.
+
+4.2 ATTRIBUTE: KVM_VCPU_TSC_SCALE
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:Parameters: struct kvm_vcpu_tsc_scale
+
+Returns:
+
+ ======= ======================================
+ -EFAULT Error reading the provided parameter
+ address.
+ -ENXIO Attribute not supported (no TSC scaling)
+ -EINVAL Invalid request to write the attribute
+ ======= ======================================
+
+This read-only attribute reports the guest's TSC scaling factor, in the form
+of a fixed-point number represented by the following structure::
+
+ struct kvm_vcpu_tsc_scale {
+ __u64 tsc_ratio;
+ __u64 tsc_frac_bits;
+ };
+
+The tsc_frac_bits field indicates the location of the fixed point, such that
+host TSC values are converted to guest TSC using the formula:
+
+ guest_tsc = ((host_tsc * tsc_ratio) >> tsc_frac_bits) + offset
+
+Userspace can use this to precisely calculate the guest TSC from the host
+TSC at any given moment. This is needed for accurate migration of guests,
+as described in the documentation for the KVM_VCPU_TSC_OFFSET attribute.
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5f2b30d0405c..384be9a53395 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -961,6 +961,12 @@ struct kvm_hyperv_eventfd {
/* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
#define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
#define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
+#define KVM_VCPU_TSC_SCALE 1 /* attribute for TSC scaling factor */
+
+struct kvm_vcpu_tsc_scale {
+ __u64 tsc_ratio;
+ __u64 tsc_frac_bits;
+};
/* x86-specific KVM_EXIT_HYPERCALL flags. */
#define KVM_EXIT_HYPERCALL_LONG_MODE _BITULL(0)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d1327d5fba3f..2179ea2da8e0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5930,6 +5930,9 @@ static int kvm_arch_tsc_has_attr(struct kvm_vcpu *vcpu,
case KVM_VCPU_TSC_OFFSET:
r = 0;
break;
+ case KVM_VCPU_TSC_SCALE:
+ r = kvm_caps.has_tsc_control ? 0 : -ENXIO;
+ break;
default:
r = -ENXIO;
}
@@ -5950,6 +5953,22 @@ static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
break;
r = 0;
break;
+ case KVM_VCPU_TSC_SCALE: {
+ struct kvm_vcpu_tsc_scale scale;
+
+ if (!kvm_caps.has_tsc_control) {
+ r = -ENXIO;
+ break;
+ }
+
+ scale.tsc_ratio = vcpu->arch.l1_tsc_scaling_ratio;
+ scale.tsc_frac_bits = kvm_caps.tsc_scaling_ratio_frac_bits;
+ r = -EFAULT;
+ if (copy_to_user(uaddr, &scale, sizeof(scale)))
+ break;
+ r = 0;
+ break;
+ }
default:
r = -ENXIO;
}
@@ -5989,6 +6008,9 @@ static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
r = 0;
break;
}
+ case KVM_VCPU_TSC_SCALE:
+ r = -EINVAL; /* Read only */
+ break;
default:
r = -ENXIO;
}
--
2.51.0