[PATCH 3/5] KVM: x86: account KVM_SET_CLOCK downtime in steal time

Linux Kernel Selftest development

From: Dongli Zhang <dongli.zhang@oracle.com>
To: kvm@vger.kernel.org, x86@kernel.org, linux-kselftest@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, vkuznets@redhat.com,
	tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, shuah@kernel.org, hpa@zytor.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, kprateek.nayak@amd.com, jgross@suse.com,
	dwmw2@infradead.org, joe.jin@oracle.com
Subject: [PATCH 3/5] KVM: x86: account KVM_SET_CLOCK downtime in steal time
Date: Mon,  4 May 2026 17:30:16 -0700
Message-ID: <20260505003044.78693-4-dongli.zhang@oracle.com>
In-Reply-To: <20260505003044.78693-1-dongli.zhang@oracle.com>

KVM_CLOCK_REALTIME was introduced to help track the downtime of a live
migration. KVM uses that realtime value to advance the guest clock, but
the same blackout is not reflected in KVM steal time.

Account that same delta in steal time directly in kvm_vm_ioctl_set_clock(),
but only when KVM_CLOCK_REALTIME is used. This keeps the KVM-only solution
self-contained and avoids adding a new KVM ioctl or requiring additional
userspace changes (e.g. in QEMU).
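
To make the delta concrete, here is a minimal userspace sketch of the
arithmetic. It is not the kernel code: struct model_clock_data and
apply_realtime_downtime() are hypothetical names used only for
illustration. Only the comparison and the subtraction mirror what the
KVM_CLOCK_REALTIME branch of kvm_vm_ioctl_set_clock() does in this patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two kvm_clock_data fields used here. */
struct model_clock_data {
	uint64_t clock;    /* guest kvmclock value saved by userspace, in ns */
	uint64_t realtime; /* host realtime captured together with it, in ns */
};

/*
 * Advance the clock by the blackout and return the same delta, so the
 * caller can also account it as steal time. Returns 0 when the saved
 * realtime is not in the past, so the clock never steps backwards.
 */
static uint64_t apply_realtime_downtime(struct model_clock_data *data,
					uint64_t now_real_ns)
{
	uint64_t downtime_ns = 0;

	if (now_real_ns > data->realtime) {
		downtime_ns = now_real_ns - data->realtime;
		data->clock += downtime_ns;
	}

	return downtime_ns;
}
```

With clock = 1000 and realtime = 5000, calling the helper with
now_real_ns = 5300 advances the clock to 1300 and returns 300, which is
the amount that would also be added to the per-VM steal counter.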

Record the per-VM downtime delta when KVM_SET_CLOCK receives
KVM_CLOCK_REALTIME, and fold it into the existing x86 steal accounting
path. Initialize each vCPU's local cursor
(vcpu->arch.st.last_downtime_steal) when the guest enables
MSR_KVM_STEAL_TIME, so that blackout accumulated before that point is
not charged to the vCPU.

Note that this means a vCPU may observe additional steal time after a
blackout even if the host-side contribution from current->sched_info
did not increase during that interval.
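
The counter-and-cursor scheme described above can be modelled in plain
C11 as follows. This is a simplified sketch under stated assumptions, not
the kernel implementation: all model_* names are hypothetical, C11
atomics stand in for atomic64_t, and the guest-visible structure, the
version protocol and KVM_REQ_STEAL_UPDATE are left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the per-VM state (kvm->arch.downtime_steal). */
struct model_vm {
	_Atomic uint64_t downtime_steal; /* total blackout recorded, in ns */
};

/* Hypothetical model of the per-vCPU state (vcpu->arch.st). */
struct model_vcpu {
	struct model_vm *vm;
	uint64_t last_steal;          /* cursor into run_delay */
	uint64_t last_downtime_steal; /* cursor into vm->downtime_steal */
	uint64_t steal;               /* value the guest would read */
};

/* KVM_SET_CLOCK with KVM_CLOCK_REALTIME: record one blackout for the VM. */
static void model_set_clock_downtime(struct model_vm *vm, uint64_t downtime_ns)
{
	atomic_fetch_add(&vm->downtime_steal, downtime_ns);
}

/* Guest enables MSR_KVM_STEAL_TIME: skip blackout accumulated so far. */
static void model_enable_steal_time(struct model_vcpu *vcpu)
{
	vcpu->last_downtime_steal = atomic_load(&vcpu->vm->downtime_steal);
}

/* Steal update: fold in both deltas, then advance both cursors. */
static void model_record_steal_time(struct model_vcpu *vcpu, uint64_t run_delay)
{
	uint64_t downtime_steal = atomic_load(&vcpu->vm->downtime_steal);

	vcpu->steal += run_delay - vcpu->last_steal;
	vcpu->last_steal = run_delay;

	vcpu->steal += downtime_steal - vcpu->last_downtime_steal;
	vcpu->last_downtime_steal = downtime_steal;
}
```

In this model, a blackout recorded before the guest enables steal time is
skipped by the cursor, while a later blackout raises the vCPU's steal
value on its next update even when run_delay is unchanged. Because each
vCPU keeps its own cursor, every vCPU is charged the full blackout once.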

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/x86.c              | 25 +++++++++++++++++++++++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1f1f29128c5d..920441b1abf0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -959,6 +959,7 @@ struct kvm_vcpu_arch {
 		u8 preempted;
 		u64 msr_val;
 		u64 last_steal;
+		u64 last_downtime_steal;
 		struct gfn_to_hva_cache cache;
 		bool need_reset;
 	} st;
@@ -1506,6 +1507,8 @@ struct kvm_arch {
 	u64 master_kernel_ns;
 	u64 master_cycle_now;
 
+	atomic64_t downtime_steal;
+
 #ifdef CONFIG_KVM_HYPERV
 	struct kvm_hv hyperv;
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eec578894ad5..452293fc0505 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3751,6 +3751,7 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	struct kvm_steal_time __user *st;
 	struct kvm_memslots *slots;
 	gpa_t gpa = vcpu->arch.st.msr_val & KVM_STEAL_VALID_BITS;
+	u64 downtime_steal;
 	u64 steal;
 	u32 version;
 
@@ -3838,6 +3839,11 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 	steal += current->sched_info.run_delay -
 		vcpu->arch.st.last_steal;
 	vcpu->arch.st.last_steal = current->sched_info.run_delay;
+
+	downtime_steal = atomic64_read(&vcpu->kvm->arch.downtime_steal);
+	steal += downtime_steal - vcpu->arch.st.last_downtime_steal;
+	vcpu->arch.st.last_downtime_steal = downtime_steal;
+
 	unsafe_put_user(steal, &st->steal, out);
 
 	version += 1;
@@ -4185,6 +4191,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			break;
 
 		vcpu->arch.st.need_reset = true;
+		vcpu->arch.st.last_downtime_steal =
+			atomic64_read(&vcpu->kvm->arch.downtime_steal);
+
 		kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
 
 		break;
@@ -7250,8 +7259,18 @@ static int kvm_vm_ioctl_set_clock(struct kvm *kvm, void __user *argp)
 		/*
 		 * Avoid stepping the kvmclock backwards.
 		 */
-		if (now_real_ns > data.realtime)
-			data.clock += now_real_ns - data.realtime;
+		if (now_real_ns > data.realtime) {
+			u64 downtime_ns = now_real_ns - data.realtime;
+
+			data.clock += downtime_ns;
+
+			if (sched_info_on()) {
+				atomic64_add(downtime_ns,
+					     &kvm->arch.downtime_steal);
+				kvm_make_all_cpus_request(kvm,
+							  KVM_REQ_STEAL_UPDATE);
+			}
+		}
 	}
 
 	if (ka->use_master_clock)
@@ -13389,6 +13408,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	kvm->arch.hv_root_tdp = INVALID_PAGE;
 #endif
 
+	atomic64_set(&kvm->arch.downtime_steal, 0);
+
 	kvm_apicv_init(kvm);
 	kvm_hv_init_vm(kvm);
 	kvm_xen_init_vm(kvm);
-- 
2.39.3

Thread overview: 12+ messages
2026-05-05  0:30 [PATCH 0/5] Fix and enhance KVM steal accounting for both guest and host Dongli Zhang
2026-05-05  0:30 ` [PATCH 1/5] x86/kvm: Reset prev_steal_time and prev_steal_time_rq when enabling steal time Dongli Zhang
2026-05-05  0:30 ` [PATCH 2/5] KVM: x86: Reset vcpu->arch.st.last_steal " Dongli Zhang
2026-05-08 22:40   ` Sean Christopherson
2026-05-10 17:09     ` David Woodhouse
2026-05-10 18:40       ` David Woodhouse
2026-05-05  0:30 ` Dongli Zhang [this message]
2026-05-10 18:54   ` [PATCH 3/5] KVM: x86: account KVM_SET_CLOCK downtime in " David Woodhouse
2026-05-10 19:11     ` H. Peter Anvin
2026-05-10 20:13       ` David Woodhouse
2026-05-05  0:30 ` [PATCH 4/5] KVM: selftests: Test steal time when re-adding a vCPU on a new thread Dongli Zhang
2026-05-05  0:30 ` [PATCH 5/5] KVM: selftests: Test KVM_SET_CLOCK downtime in steal time Dongli Zhang
