Re: [PATCH v3 08/10] KVM: selftests: Trigger save+restore randomly in the #PF stress test


From: Yosry Ahmed <yosry@kernel.org>
To: sashiko-reviews@lists.linux.dev
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v3 08/10] KVM: selftests: Trigger save+restore randomly in the #PF stress test
Date: Mon, 29 Jun 2026 20:29:05 +0000
Message-ID: <akLU4Zdfal21geBW@google.com>
In-Reply-To: <20260629184837.AC6A81F000E9@smtp.kernel.org>

On Mon, Jun 29, 2026 at 06:48:37PM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [Medium] The aggressive 1ms signal interval can starve the guest and cause the test to deterministically fail on slower or nested environments.
> --
> 
> commit 4750524a9d34a97e2957bdf96a22b57c5404cb65
> Author: Yosry Ahmed <yosry@kernel.org>
> 
> KVM: selftests: Trigger save+restore randomly in the #PF stress test
> 
> This patch introduces a background thread in the page fault stress test that
> sends SIGUSR1 to the vCPU thread to trigger exits to userspace. This allows
> exercising race conditions between page fault handling in the guest and
> save/restore paths in the host.
> 
> > diff --git a/tools/testing/selftests/kvm/x86/stress_save_restore_pf_test.c b/tools/testing/selftests/kvm/x86/stress_save_restore_pf_test.c
> > index 1b6f64bbcf937..bbbb5bb2a2ee1 100644
> > --- a/tools/testing/selftests/kvm/x86/stress_save_restore_pf_test.c
> > +++ b/tools/testing/selftests/kvm/x86/stress_save_restore_pf_test.c
> 
> [ ... ]
> 
> > @@ -94,15 +96,41 @@ static void guest_access_memory(void *arg)
> > +static void *sigusr_thread_fn(void *arg)
> > +{
> > +	pthread_t vcpu_thread = (pthread_t)arg;
> > +
> > +	for (;;) {
> > +		pthread_testcancel();
> > +		pthread_kill(vcpu_thread, SIGUSR1);
> > +		usleep(msecs_to_usecs(1));
> > +	}
> > +	return NULL;
> > +}
> 
> [ ... ]
> 
> > @@ -151,18 +179,30 @@ int main(int argc, char *argv[])
> >  	while (count++ < NR_ITERATIONS) {
> > +		/*
> > +		 * Only handle SIGUSR while the vCPU is running, otherwise
> > +		 * ignore it to avoid interrupting other ioctls/syscalls.
> > +		 */
> > +		vcpu_sigusr_listen();
> >  		r = __vcpu_run(vcpu);
> > -		TEST_ASSERT(!r, "vcpu_run failed");
> > -		TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO);
> > -
> > -		get_ucall(vcpu, &uc);
> > -		if (uc.cmd == UCALL_ABORT) {
> > +		if (r == -1)
> > +			TEST_ASSERT_EQ(errno, EINTR);
> > +		vcpu_sigusr_ignore();
> 
> [Severity: Medium]
> Can the aggressive 1ms signal interval from sigusr_thread_fn starve the guest
> and cause deterministic test failures in slower environments?
> 
> The first time __vcpu_run executes for a newly created VM, it performs
> expensive kernel-side initializations like setting up the shadow MMU. If this
> setup takes longer than 1ms (which is probable in nested or heavily loaded CI
> environments), the background thread will send SIGUSR1 before the guest
> actually executes.
> 
> When KVM detects the pending signal, it will abort the entry and return
> -EINTR. Because main catches -EINTR and immediately proceeds to save the
> unchanged state and recreate the VM for the next iteration, the test can get
> stuck in a cycle where the guest never makes forward progress.
> 
> After exhausting all iterations doing nothing but VM recreations, the test
> would then fail the guest_faults > 0 assertion at the end. Could we verify
> that the guest actually ran and made forward progress before destroying the
> VM, or perhaps increase the signal interval?

The signal interval was initially 100us, and that did cause problems on
slower kernels (built with DEBUG options or ASAN). With 1ms, we have not
observed any problems internally with DEBUG or ASAN builds on a variety
of platforms, so I think we are good with 1ms.

Increasing the interval much further would probably make the test less
effective, as the signal would be less likely to land in the interesting
race windows.
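
For readers following along, the mechanism being discussed (a background
thread that periodically signals another thread so that its blocking
call returns with EINTR) can be sketched outside of KVM. This is only a
rough standalone illustration, not the selftest code: nanosleep() stands
in for KVM_RUN, an atomic stop flag replaces the pthread_testcancel()
loop from the patch, and struct kicker, kicker_fn() and
count_interrupted_sleeps() are hypothetical names made up for this
example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper state for this sketch; not part of the selftest. */
struct kicker {
	pthread_t target;		/* thread to signal */
	unsigned int interval_us;	/* delay between signals */
	atomic_bool stop;		/* set by the target to end the loop */
};

/* Empty handler: its only job is to make blocking calls return EINTR. */
static void sigusr1_handler(int sig)
{
	(void)sig;
}

/* Periodically send SIGUSR1 to the target thread until told to stop. */
static void *kicker_fn(void *arg)
{
	struct kicker *k = arg;

	while (!atomic_load(&k->stop)) {
		pthread_kill(k->target, SIGUSR1);
		usleep(k->interval_us);
	}
	return NULL;
}

/*
 * Run nr_sleeps blocking sleeps of sleep_ms each on the calling thread
 * while a kicker thread signals it every interval_us. Returns how many
 * sleeps were cut short with EINTR, or -1 if setup failed. The SIGUSR1
 * handler is left installed on return.
 */
static int count_interrupted_sleeps(unsigned int interval_us, int nr_sleeps,
				    long sleep_ms)
{
	struct sigaction sa = { .sa_handler = sigusr1_handler };
	struct timespec ts = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (sleep_ms % 1000) * 1000000L,
	};
	struct kicker k = {
		.target = pthread_self(),
		.interval_us = interval_us,
	};
	pthread_t thread;
	int interrupted = 0;
	int i;

	/* usleep() is only portable for intervals below one second. */
	assert(interval_us > 0 && interval_us < 1000000);

	/* No SA_RESTART: the interrupted call must fail with EINTR. */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;

	atomic_init(&k.stop, false);
	if (pthread_create(&thread, NULL, kicker_fn, &k))
		return -1;

	for (i = 0; i < nr_sleeps; i++) {
		if (nanosleep(&ts, NULL) == -1 && errno == EINTR)
			interrupted++;
	}

	atomic_store(&k.stop, true);
	pthread_join(thread, NULL);
	return interrupted;
}
```

Calling count_interrupted_sleeps(1000, 5, 20) mirrors the 1ms interval
from the patch against 20ms of blocking work per iteration, so most or
all of the sleeps should be interrupted. Raising the interval well above
the duration of the blocking call makes interruptions rarer, which is
the trade-off described above.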


Thread overview: 15+ messages
2026-06-29 18:37 [PATCH v3 00/10] KVM: selftests: Stress save+restore and #PF (ft. nested) Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 01/10] KVM: selftests: Move STR() and XSTR() definitions to test_util.h Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 02/10] KVM: selftests: Fix RAX and RFLAGS VMCB offsets when running L2 Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 03/10] KVM: selftests: Use an array for guest_regs (and fix offsets) Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 04/10] KVM: selftests: Move GPR load/save definitions outside of nSVM code Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 05/10] KVM: selftests: Reuse GPR switching logic for nVMX Yosry Ahmed
2026-06-29 18:49   ` sashiko-bot
2026-06-29 20:26     ` Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 06/10] KVM: selftests: Drop HORRIFIC_L2_UCALL_CLOBBER_HACK Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 07/10] KVM: selftests: Add basic stress test for save+restore and #PF handling Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 08/10] KVM: selftests: Trigger save+restore randomly in the #PF stress test Yosry Ahmed
2026-06-29 18:48   ` sashiko-bot
2026-06-29 20:29     ` Yosry Ahmed [this message]
2026-06-29 18:37 ` [PATCH v3 09/10] KVM: selftests: Support running stress save+restore and #PF test in L2 Yosry Ahmed
2026-06-29 18:37 ` [PATCH v3 10/10] KVM: selftests: Trigger L2->L1 exits stress save+restore and #PF test Yosry Ahmed
