public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH 0/2] x86/apic: fix false test_apic_change_mode failures on stalled vCPUs
@ 2026-04-28 13:35 Igor Mammedov
  2026-04-28 13:35 ` [kvm-unit-tests PATCH 1/2] x86/apic: separate reporting from actual measurements Igor Mammedov
  2026-04-28 13:35 ` [kvm-unit-tests PATCH 2/2] x86/apic: add retry logic to test_apic_change_mode Igor Mammedov
  0 siblings, 2 replies; 3+ messages in thread
From: Igor Mammedov @ 2026-04-28 13:35 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini

test_apic_change_mode sporadically fails in CI on both Intel and AMD
hosts with errors like:
  "FAIL: TMCCT should have a non-zero value"
  "FAIL: TMCCT should be reset to the initial-count"
  "FAIL: TMCCT should not be reset to TMICT value"

The root cause is that the APIC timer runs at wall clock time under KVM.
With the default tmict=0x999999 (~10ms period at 1ns bus cycle).

A vCPU stall for sufficiently large portion of TMICT leads to false positives
(reasons could be: host preemption, cross-socket migration, heavy CPU
contention). It's basically not possible to reliably sample timer values
while it's running.

This series adds retry logic with increasing timer periods (10ms, 60ms,
700ms) so that transient vCPU stalls don't cause false failures, while
real bugs still get caught. (most of false failures are handled by 60ms
timer, and 700ms is one pathological case observed in a week of testing)

Reproducer (requires 2+ NUMA nodes):

  stress-ng --cpu 128 --timer 32 --hrtimers 32 --quiet &
  sleep 2
  while true; do
      /usr/libexec/qemu-kvm --no-reboot -nodefaults \
          -global kvm-pit.lost_tick_policy=discard \
          -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 \
          -display none -serial stdio -device pci-testdev \
          -machine q35 -kernel x86/apic.flat \
          -smp 1 -cpu qemu64,+x2apic,+tsc-deadline \
          >> apic_race.log 2>&1 &
      QEMU_PID=$!
      while kill -0 $QEMU_PID 2>/dev/null; do
          taskset -p -c 0 $QEMU_PID 2>/dev/null
          sleep 0.001
          taskset -p -c 1 $QEMU_PID 2>/dev/null
          sleep 0.001
      done
      wait $QEMU_PID 2>/dev/null
  done

patches reduce ~4% failure rate (8 FAILs / 216 PASSes in 2 minutes).
to 0 FAILs over thousands of runs.

Igor Mammedov (2):
  x86/apic: separate reporting from actual measurements
  x86/apic: add retry logic to test_apic_change_mode

 x86/apic.c | 119 ++++++++++++++++++++++++++++++++---------------------
 1 file changed, 72 insertions(+), 47 deletions(-)

-- 
2.47.3


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [kvm-unit-tests PATCH 1/2] x86/apic: separate reporting from actual measurements
  2026-04-28 13:35 [kvm-unit-tests PATCH 0/2] x86/apic: fix false test_apic_change_mode failures on stalled vCPUs Igor Mammedov
@ 2026-04-28 13:35 ` Igor Mammedov
  2026-04-28 13:35 ` [kvm-unit-tests PATCH 2/2] x86/apic: add retry logic to test_apic_change_mode Igor Mammedov
  1 sibling, 0 replies; 3+ messages in thread
From: Igor Mammedov @ 2026-04-28 13:35 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini

Move report() calls to the end of the function so that serial I/O
does not interfere with timer measurements.

While at it, wrap the test body in a for loop as preparation
for adding retry logic to deal with failures due to large
enough vCPU stalls.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 x86/apic.c | 109 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 63 insertions(+), 46 deletions(-)

diff --git a/x86/apic.c b/x86/apic.c
index 0a52e9a4..d4eb8e11 100644
--- a/x86/apic.c
+++ b/x86/apic.c
@@ -573,62 +573,79 @@ static inline void apic_change_mode(unsigned long new_mode)
 
 static void test_apic_change_mode(void)
 {
-	uint32_t tmict = 0x999999;
+	const uint32_t tmict_values[] = {
+		0x999999,    /* ~10ms */
+	};
+	int retry, max_retries = ARRAY_SIZE(tmict_values);
+	uint32_t tmict;
+	bool tmict_reset = false, o_nonzero = false, o_reached_zero = false;
+	bool p_nonzero = false, p_not_reset = false, p_after_wrap = false;
+	bool p2o_not_reset = false, p2o_reached_zero = false, p2o_stay_zero = false;
 
 	printf("starting apic change mode\n");
 
-	apic_write(APIC_TMICT, tmict);
+	for (retry = 0; retry < max_retries; retry++) {
+		tmict = tmict_values[retry];
+		apic_write(APIC_TMICT, tmict);
+		apic_change_mode(APIC_LVT_TIMER_PERIODIC);
+		tmict_reset = apic_read(APIC_TMICT) == tmict;
 
-	apic_change_mode(APIC_LVT_TIMER_PERIODIC);
+		/* Testing one-shot */
+		apic_change_mode(APIC_LVT_TIMER_ONESHOT);
+		apic_write(APIC_TMICT, tmict);
+		o_nonzero = apic_read(APIC_TMCCT);
 
-	report(apic_read(APIC_TMICT) == tmict, "TMICT value reset");
+		wait_until_tmcct_is_zero(tmict, false);
+		o_reached_zero = !apic_read(APIC_TMCCT);
 
-	/* Testing one-shot */
-	apic_change_mode(APIC_LVT_TIMER_ONESHOT);
-	apic_write(APIC_TMICT, tmict);
-	report(apic_read(APIC_TMCCT), "TMCCT should have a non-zero value");
+		/*
+		 * Write TMICT before changing mode from one-shot to periodic
+		 * TMCCT should be reset to TMICT periodicly
+		 */
+		apic_write(APIC_TMICT, tmict);
+		wait_until_tmcct_is_zero(tmict, true);
+		apic_change_mode(APIC_LVT_TIMER_PERIODIC);
+		p_nonzero = apic_read(APIC_TMCCT);
 
-	wait_until_tmcct_is_zero(tmict, false);
-	report(!apic_read(APIC_TMCCT), "TMCCT should have reached 0");
+		/*
+		 * After the change of mode, the counter should not be reset
+		 * and continue counting down from where it was
+		 */
+		p_not_reset = apic_read(APIC_TMCCT) < (tmict / 2);
+		/*
+		 * Specifically wait for timer wrap around and skip 0.
+		 * Under KVM lapic there is a possibility that a small amount
+		 * of consecutive TMCCR reads return 0 while hrtimer is reset
+		 * in an async callback
+		 */
+		wait_until_tmcct_wrap_around(tmict, false);
+		p_after_wrap = apic_read(APIC_TMCCT) > (tmict / 2);
 
-	/*
-	 * Write TMICT before changing mode from one-shot to periodic TMCCT should
-	 * be reset to TMICT periodicly
-	 */
-	apic_write(APIC_TMICT, tmict);
-	wait_until_tmcct_is_zero(tmict, true);
-	apic_change_mode(APIC_LVT_TIMER_PERIODIC);
-	report(apic_read(APIC_TMCCT), "TMCCT should have a non-zero value");
+		wait_until_tmcct_is_zero(tmict, true);
+		/*
+		 * Keep the same TMICT and change timer mode to one-shot
+		 * TMCCT should be > 0 and count-down to 0
+		 */
+		apic_change_mode(APIC_LVT_TIMER_ONESHOT);
+		p2o_not_reset = apic_read(APIC_TMCCT) < (tmict / 2);
 
-	/*
-	 * After the change of mode, the counter should not be reset and continue
-	 * counting down from where it was
-	 */
-	report(apic_read(APIC_TMCCT) < (tmict / 2),
-	       "TMCCT should not be reset to TMICT value");
-	/*
-	 * Specifically wait for timer wrap around and skip 0.
-	 * Under KVM lapic there is a possibility that a small amount of consecutive
-	 * TMCCR reads return 0 while hrtimer is reset in an async callback
-	 */
-	wait_until_tmcct_wrap_around(tmict, false);
-	report(apic_read(APIC_TMCCT) > (tmict / 2),
-	       "TMCCT should be reset to the initial-count");
+		wait_until_tmcct_is_zero(tmict, false);
+		p2o_reached_zero = !apic_read(APIC_TMCCT);
 
-	wait_until_tmcct_is_zero(tmict, true);
-	/*
-	 * Keep the same TMICT and change timer mode to one-shot
-	 * TMCCT should be > 0 and count-down to 0
-	 */
-	apic_change_mode(APIC_LVT_TIMER_ONESHOT);
-	report(apic_read(APIC_TMCCT) < (tmict / 2),
-	       "TMCCT should not be reset to init");
-	wait_until_tmcct_is_zero(tmict, false);
-	report(!apic_read(APIC_TMCCT), "TMCCT should have reach zero");
-
-	/* now tmcct == 0 and tmict != 0 */
-	apic_change_mode(APIC_LVT_TIMER_PERIODIC);
-	report(!apic_read(APIC_TMCCT), "TMCCT should stay at zero");
+		/* now tmcct == 0 and tmict != 0 */
+		apic_change_mode(APIC_LVT_TIMER_PERIODIC);
+		p2o_stay_zero = !apic_read(APIC_TMCCT);
+	}
+
+	report(tmict_reset, "TMICT value reset");
+	report(o_nonzero, "one-shot: TMCCT should have a non-zero value");
+	report(o_reached_zero, "one-shot: TMCCT should have reached 0");
+	report(p_nonzero, "periodic: TMCCT should have a non-zero value");
+	report(p_not_reset, "TMCCT should not be reset to TMICT value");
+	report(p_after_wrap, "TMCCT should be reset to the initial-count");
+	report(p2o_not_reset, "TMCCT should not be reset to init");
+	report(p2o_reached_zero, "TMCCT should have reach zero");
+	report(p2o_stay_zero, "TMCCT should stay at zero");
 }
 
 #define KVM_HC_SEND_IPI 10
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [kvm-unit-tests PATCH 2/2] x86/apic: add retry logic to test_apic_change_mode
  2026-04-28 13:35 [kvm-unit-tests PATCH 0/2] x86/apic: fix false test_apic_change_mode failures on stalled vCPUs Igor Mammedov
  2026-04-28 13:35 ` [kvm-unit-tests PATCH 1/2] x86/apic: separate reporting from actual measurements Igor Mammedov
@ 2026-04-28 13:35 ` Igor Mammedov
  1 sibling, 0 replies; 3+ messages in thread
From: Igor Mammedov @ 2026-04-28 13:35 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini

The APIC timer runs at wall clock time under KVM. If vCPU is stalled for
long enough, timer can expire before the guest reads TMCCT or tick well
past expected values, causing various false test failures [1].

Add retry attempts with increasing timer period (10ms, 60ms, 700ms)
if any test fails, to handle spurious failures due to vCPU stalls.

1) Failures we sometimes observe in CI are:
  "FAIL: TMCCT should have a non-zero value"
  "FAIL: TMCCT should be reset to the initial-count"
  "FAIL: TMCCT should not be reset to TMICT value"
Seen on both Intel and AMD hosts.

PS:
on most test runs, test completes fine on the 1st iteration.
The patch would affect only failure path which will be slowed down
due to retries but still fail if there is a bug with benefit of
getting rid of false positives caused by vCPU stalls.

PS2:
Number of tries and tmict delta comes from analyzing vcpu stalls
on heavily overcommited Haswell host with test being bounced between
2 sockets.
Typically 2nd iteration (60ms) is enough to get rid of
false positives.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 x86/apic.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/x86/apic.c b/x86/apic.c
index d4eb8e11..27597323 100644
--- a/x86/apic.c
+++ b/x86/apic.c
@@ -575,9 +575,12 @@ static void test_apic_change_mode(void)
 {
 	const uint32_t tmict_values[] = {
 		0x999999,    /* ~10ms */
+		0x3938700,   /* ~60ms */
+		0x29b92700,  /* ~700ms */
 	};
 	int retry, max_retries = ARRAY_SIZE(tmict_values);
 	uint32_t tmict;
+	bool fail;
 	bool tmict_reset = false, o_nonzero = false, o_reached_zero = false;
 	bool p_nonzero = false, p_not_reset = false, p_after_wrap = false;
 	bool p2o_not_reset = false, p2o_reached_zero = false, p2o_stay_zero = false;
@@ -635,6 +638,11 @@ static void test_apic_change_mode(void)
 		/* now tmcct == 0 and tmict != 0 */
 		apic_change_mode(APIC_LVT_TIMER_PERIODIC);
 		p2o_stay_zero = !apic_read(APIC_TMCCT);
+		fail = !(tmict_reset && o_nonzero && o_reached_zero &&
+			 p_nonzero && p_not_reset && p_after_wrap &&
+			 p2o_not_reset && p2o_reached_zero && p2o_stay_zero);
+		if (!fail)
+			break;
 	}
 
 	report(tmict_reset, "TMICT value reset");
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-04-28 13:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 13:35 [kvm-unit-tests PATCH 0/2] x86/apic: fix false test_apic_change_mode failures on stalled vCPUs Igor Mammedov
2026-04-28 13:35 ` [kvm-unit-tests PATCH 1/2] x86/apic: separate reporting from actual measurements Igor Mammedov
2026-04-28 13:35 ` [kvm-unit-tests PATCH 2/2] x86/apic: add retry logic to test_apic_change_mode Igor Mammedov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox