From: David Woodhouse
To: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Sean Christopherson,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
    x86@kernel.org, "H. Peter Anvin", Vitaly Kuznetsov, Juergen Gross,
    Boris Ostrovsky, David Woodhouse, Paul Durrant, Jonathan Cameron,
    Sascha Bischoff, Marc Zyngier, Joey Gouly, Jack Allister,
    Dongli Zhang, joe.jin@oracle.com, kvm@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    xen-devel@lists.xenproject.org, linux-kselftest@vger.kernel.org
Subject: [PATCH v4 07/30] KVM: x86: Add KVM_VCPU_TSC_SCALE and fix the
    documentation on TSC migration
Date: Sat, 9 May 2026 23:46:33 +0100
Message-ID: <20260509224824.3264567-8-dwmw2@infradead.org>
In-Reply-To: <20260509224824.3264567-1-dwmw2@infradead.org>
References: <20260509224824.3264567-1-dwmw2@infradead.org>
X-Mailing-List: linux-doc@vger.kernel.org

From: David Woodhouse

The documentation on TSC migration using KVM_VCPU_TSC_OFFSET is woefully
inadequate. It ignores TSC scaling, and ignores the fact that the host
TSC may differ from one host to the next (and in fact because of the way
the kernel calibrates it, it generally differs from one boot to the next
even on the same hardware).

Add KVM_VCPU_TSC_SCALE to extract the actual scale ratio and frac_bits,
and attempt to document the process that userspace needs to follow to
preserve the TSC across migration.

Only enumerate KVM_VCPU_TSC_SCALE when kvm_caps.has_tsc_control is true,
since the scaling ratio is only meaningful when hardware TSC scaling is
supported.

Signed-off-by: David Woodhouse
Reviewed-by: Paul Durrant
---
 Documentation/virt/kvm/devices/vcpu.rst | 36 ++++++++++++++++++++++++-
 arch/x86/include/uapi/asm/kvm.h         |  6 +++++
 arch/x86/kvm/x86.c                      | 22 +++++++++++++++
 3 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
index 5e3805820010..56562b932280 100644
--- a/Documentation/virt/kvm/devices/vcpu.rst
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -243,7 +243,10 @@ Returns:
 Specifies the guest's TSC offset relative to the host's TSC. The guest's
 TSC is then derived by the following equation:
 
-  guest_tsc = host_tsc + KVM_VCPU_TSC_OFFSET
+  guest_tsc = ((host_tsc * tsc_ratio) >> tsc_frac_bits) + KVM_VCPU_TSC_OFFSET
+
+The values of tsc_ratio and tsc_frac_bits can be obtained using
+the KVM_VCPU_TSC_SCALE attribute.
 
 This attribute is useful to adjust the guest's TSC on live migration,
 so that the TSC counts the time during which the VM was paused. The
@@ -292,3 +295,34 @@ From the destination VMM process:
 
 7. Write the KVM_VCPU_TSC_OFFSET attribute for every vCPU with the
    respective value derived in the previous step.
+
+4.2 ATTRIBUTE: KVM_VCPU_TSC_SCALE
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:Parameters: struct kvm_vcpu_tsc_scale
+
+Returns:
+
+	 ======= ======================================
+	 -EFAULT Error writing to the provided parameter
+		 address.
+	 -ENXIO  Attribute not supported (no TSC scaling)
+	 -EINVAL Invalid request to write the attribute
+	 ======= ======================================
+
+This read-only attribute reports the guest's TSC scaling factor, in the form
+of a fixed-point number represented by the following structure::
+
+    struct kvm_vcpu_tsc_scale {
+        __u64 tsc_ratio;
+        __u64 tsc_frac_bits;
+    };
+
+The tsc_frac_bits field indicates the location of the fixed point, such that
+host TSC values are converted to guest TSC using the formula:
+
+    guest_tsc = ((host_tsc * tsc_ratio) >> tsc_frac_bits) + offset
+
+Userspace can use this to precisely calculate the guest TSC from the host
+TSC at any given moment. This is needed for accurate migration of guests,
+as described in the documentation for the KVM_VCPU_TSC_OFFSET attribute.
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5f2b30d0405c..384be9a53395 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -961,6 +961,12 @@ struct kvm_hyperv_eventfd {
 /* for KVM_{GET,SET,HAS}_DEVICE_ATTR */
 #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
 #define   KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
+#define   KVM_VCPU_TSC_SCALE  1 /* attribute for TSC scaling factor */
+
+struct kvm_vcpu_tsc_scale {
+	__u64 tsc_ratio;
+	__u64 tsc_frac_bits;
+};
 
 /* x86-specific KVM_EXIT_HYPERCALL flags. */
 #define KVM_EXIT_HYPERCALL_LONG_MODE	_BITULL(0)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d1327d5fba3f..2179ea2da8e0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5930,6 +5930,9 @@ static int kvm_arch_tsc_has_attr(struct kvm_vcpu *vcpu,
 	case KVM_VCPU_TSC_OFFSET:
 		r = 0;
 		break;
+	case KVM_VCPU_TSC_SCALE:
+		r = kvm_caps.has_tsc_control ? 0 : -ENXIO;
+		break;
 	default:
 		r = -ENXIO;
 	}
@@ -5950,6 +5953,22 @@ static int kvm_arch_tsc_get_attr(struct kvm_vcpu *vcpu,
 			break;
 		r = 0;
 		break;
+	case KVM_VCPU_TSC_SCALE: {
+		struct kvm_vcpu_tsc_scale scale;
+
+		if (!kvm_caps.has_tsc_control) {
+			r = -ENXIO;
+			break;
+		}
+
+		scale.tsc_ratio = vcpu->arch.l1_tsc_scaling_ratio;
+		scale.tsc_frac_bits = kvm_caps.tsc_scaling_ratio_frac_bits;
+		r = -EFAULT;
+		if (copy_to_user(uaddr, &scale, sizeof(scale)))
+			break;
+		r = 0;
+		break;
+	}
 	default:
 		r = -ENXIO;
 	}
@@ -5989,6 +6008,9 @@ static int kvm_arch_tsc_set_attr(struct kvm_vcpu *vcpu,
 		r = 0;
 		break;
 	}
+	case KVM_VCPU_TSC_SCALE:
+		r = -EINVAL; /* Read only */
+		break;
 	default:
 		r = -ENXIO;
 	}
-- 
2.51.0
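
For illustration only, and not part of the patch: a minimal userspace
sketch of the conversion formula documented for KVM_VCPU_TSC_SCALE. The
helper names scale_host_tsc() and guest_tsc_from_host() are hypothetical,
struct tsc_scale is a local mirror of struct kvm_vcpu_tsc_scale using
<stdint.h> types, and the 128-bit intermediate (a GCC/Clang extension) is
an assumption made here so that host_tsc * tsc_ratio cannot overflow
before the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct kvm_vcpu_tsc_scale from the patch (assumption). */
struct tsc_scale {
	uint64_t tsc_ratio;
	uint64_t tsc_frac_bits;
};

/* Hypothetical helper: (host_tsc * tsc_ratio) >> tsc_frac_bits. */
static uint64_t scale_host_tsc(uint64_t host_tsc, const struct tsc_scale *s)
{
	/* Shifting a 128-bit value by 128 or more is undefined in C. */
	assert(s->tsc_frac_bits < 128);

	/* unsigned __int128 is a GCC/Clang extension, assumed available. */
	unsigned __int128 product = (unsigned __int128)host_tsc * s->tsc_ratio;

	/* Truncate to 64 bits, matching the width of the guest TSC. */
	return (uint64_t)(product >> s->tsc_frac_bits);
}

/*
 * Hypothetical helper: scaled host TSC plus the offset. The offset is
 * carried as uint64_t so that a negative offset wraps modulo 2^64.
 */
static uint64_t guest_tsc_from_host(uint64_t host_tsc,
				    const struct tsc_scale *s,
				    uint64_t offset)
{
	return scale_host_tsc(host_tsc, s) + offset;
}
```

With tsc_frac_bits = 48 and tsc_ratio = 3ULL << 47 (a ratio of 1.5), a
host TSC of 1000 scales to 1500 before the offset is added.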