public inbox for linux-kernel@vger.kernel.org
From: Alexander Graf <graf@amazon.com>
To: <kvm@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <hpa@zytor.com>, <x86@kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	<nh-open-source@amazon.com>, <gurugubs@amazon.com>,
	<jalliste@amazon.co.uk>, Michael Kelley <mhklinux@outlook.com>,
	John Starks <jostarks@microsoft.com>
Subject: [PATCH] kvm: hyper-v: Delay firing of expired stimers
Date: Thu, 15 Jan 2026 14:15:20 +0000
Message-ID: <20260115141520.24176-1-graf@amazon.com>

During Windows Server 2025 hibernation, I have seen Windows' calculation
of the interrupt target time drift away from the hypervisor's view of it.
This can cause Windows to program timers with a target time in the past
for events that are not yet due according to the real time source. This
then leads to interrupt storms in the guest which slow down execution to
the point where watchdogs trigger. Those manifest as bugchecks 0x9f and
0xa0 during hibernation, typically in the resume path.

To work around this problem, we can delay timers that get created with a
target time in the past by a tiny bit (10µs, i.e. 100 units of the 100ns
reference time) to give the guest CPU time to process real work and make
forward progress, hopefully recovering its interrupt logic in the
process. While this small delay can marginally reduce the accuracy of
guest timers, 10µs is only a few times the VM entry/exit overhead
(~1-2µs), so I do not expect to see real-world impact.

To still provide some level of visibility when this happens, add a trace
point that clearly shows the discrepancy between the target time and the
current time.

Signed-off-by: Alexander Graf <graf@amazon.com>
---
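Not part of the patch: a minimal user-space sketch of the arithmetic, for
readers checking the units. It assumes the stimer count and time_now are
both in 100ns reference-time units, which is why the literal 100 in the
diff corresponds to 10µs. The macro and function names are hypothetical
and do not exist in the kernel, and the nanosecond conversion only
illustrates the relative delay a host timer could be armed with.

```c
#include <assert.h>
#include <stdint.h>

/* Hyper-V reference time counts in 100 ns units. */
#define HV_TICK_NS 100ULL
/* Hypothetical name for the patch's literal 100: 100 ticks * 100 ns = 10 us. */
#define STIMER_MIN_DELAY_TICKS 100ULL

/*
 * Return the count a one-shot timer would effectively expire at: unchanged
 * if it is still in the future, otherwise pushed to time_now + 10 us.
 */
static uint64_t stimer_effective_count(uint64_t time_now, uint64_t count)
{
	if (time_now >= count)
		return time_now + STIMER_MIN_DELAY_TICKS;
	return count;
}

/* Relative delay until expiry, converted from 100 ns ticks to nanoseconds. */
static uint64_t stimer_delay_ns(uint64_t time_now, uint64_t count)
{
	uint64_t effective = stimer_effective_count(time_now, count);

	/* Holds on both branches above (ignoring u64 wrap-around). */
	assert(effective > time_now);
	return HV_TICK_NS * (effective - time_now);
}
```

With these helpers, a count that is 100 ticks in the past at time_now = 1000
yields an effective count of 1100 and a delay of 10000ns, while a count in
the future is left untouched.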
 arch/x86/kvm/hyperv.c | 22 ++++++++++++++++++----
 arch/x86/kvm/trace.h  | 26 ++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 72b19a88a776..c41061acbcbc 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -666,13 +666,27 @@ static int stimer_start(struct kvm_vcpu_hv_stimer *stimer)
 	stimer->exp_time = stimer->count;
 	if (time_now >= stimer->count) {
 		/*
-		 * Expire timer according to Hypervisor Top-Level Functional
-		 * specification v4(15.3.1):
+		 * Hypervisor Top-Level Functional specification v4(15.3.1):
 		 * "If a one shot is enabled and the specified count is in
 		 * the past, it will expire immediately."
+		 *
+		 * However, there are cases during hibernation when Windows's
+		 * interrupt count calculation can go out of sync with KVM's
+		 * view of it, causing Windows to emit timer events in the past
+		 * for events that do not fire yet according to the real time
+		 * source. This then leads to interrupt storms in the guest
+		 * which slow down execution to a point where watchdogs trigger.
+		 *
+		 * Instead of taking TLFS literally on what "immediately" means,
+		 * give the guest at least 10µs to process work. While this can
+		 * marginally reduce accuracy of guest timers, 10µs are within
+		 * the noise of VM entry/exit overhead (~1-2 µs).
 		 */
-		stimer_mark_pending(stimer, false);
-		return 0;
+		trace_kvm_hv_stimer_start_expired(
+					hv_stimer_to_vcpu(stimer)->vcpu_id,
+					stimer->index,
+					time_now, stimer->count);
+		stimer->count = time_now + 100;
 	}
 
 	trace_kvm_hv_stimer_start_one_shot(hv_stimer_to_vcpu(stimer)->vcpu_id,
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 57d79fd31df0..f9e69c4d9e9b 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1401,6 +1401,32 @@ TRACE_EVENT(kvm_hv_stimer_start_one_shot,
 		  __entry->count)
 );
 
+/*
+ * Tracepoint for stimer_start(one-shot timer already expired).
+ */
+TRACE_EVENT(kvm_hv_stimer_start_expired,
+	TP_PROTO(int vcpu_id, int timer_index, u64 time_now, u64 count),
+	TP_ARGS(vcpu_id, timer_index, time_now, count),
+
+	TP_STRUCT__entry(
+		__field(int, vcpu_id)
+		__field(int, timer_index)
+		__field(u64, time_now)
+		__field(u64, count)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_id = vcpu_id;
+		__entry->timer_index = timer_index;
+		__entry->time_now = time_now;
+		__entry->count = count;
+	),
+
+	TP_printk("vcpu_id %d timer %d time_now %llu count %llu (expired)",
+		  __entry->vcpu_id, __entry->timer_index, __entry->time_now,
+		  __entry->count)
+);
+
 /*
  * Tracepoint for stimer_timer_callback.
  */
-- 
2.47.1
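
Not part of the patch: a rough sketch of how a user-space tool might turn
the new tracepoint's output into a lag figure. It assumes the input is only
the text produced by the TP_printk format above (without the task, CPU and
timestamp prefix that the tracing infrastructure prepends), and that both
values are in 100ns units. The function name is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse a line such as
 *   "vcpu_id 0 timer 1 time_now 2000 count 1000 (expired)"
 * and report how far in the past the requested count was, in microseconds.
 * Returns 0 on success, -1 if the line does not match.
 */
static int stimer_expired_lag_us(const char *line, uint64_t *lag_us)
{
	int vcpu_id, timer_index;
	uint64_t time_now, count;

	assert(line && lag_us);

	if (sscanf(line,
		   "vcpu_id %d timer %d time_now %" SCNu64 " count %" SCNu64,
		   &vcpu_id, &timer_index, &time_now, &count) != 4)
		return -1;
	if (time_now < count)
		return -1;	/* not an expired-at-start event */

	/* Counts are in 100 ns units, so 10 units make one microsecond. */
	*lag_us = (time_now - count) / 10;
	return 0;
}
```

A consistently large lag across many events would point at the skew
described in the commit message rather than at ordinary scheduling jitter.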




Thread overview: 5+ messages
2026-01-15 14:15 Alexander Graf [this message]
2026-01-23 18:21 ` [PATCH] kvm: hyper-v: Delay firing of expired stimers Sean Christopherson
2026-01-24 21:26   ` Alexander Graf
2026-01-26  9:41     ` Vitaly Kuznetsov
2026-01-24  0:37 ` David Woodhouse
