Re: [PATCH] KVM: selftests: Wait mprotect_ro_done before write to RO in mmu_stress_test

public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: pbonzini@redhat.com, rick.p.edgecombe@intel.com,
	 linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: selftests: Wait mprotect_ro_done before write to RO in mmu_stress_test
Date: Tue, 25 Feb 2025 17:48:39 -0800	[thread overview]
Message-ID: <Z75y90KM_fE6H1cJ@google.com> (raw)
In-Reply-To: <20250208105318.16861-1-yan.y.zhao@intel.com>

On Sat, Feb 08, 2025, Yan Zhao wrote:
> In the read-only mprotect() phase of mmu_stress_test, ensure that
> mprotect(PROT_READ) has completed before the guest starts writing to the
> read-only mprotect() memory.
> 
> Without waiting for mprotect_ro_done before the guest starts writing in
> stage 3 (the stage for read-only mprotect()), the host's assertion of stage
> 3 could fail if mprotect_ro_done is set to true in the window between the
> guest finishing writes to all GPAs and executing GUEST_SYNC(3).
> 
> This scenario is easy to occur especially when there are hundred of vCPUs.
> 
> CPU 0                  CPU 1 guest     CPU 1 host
>                                        enter stage 3's 1st loop
>                        //in stage 3
>                        write all GPAs
>                        @rip 0x4025f0
> 
> mprotect(PROT_READ)
> mprotect_ro_done=true
>                        GUEST_SYNC(3)
>                                        r=0, continue stage 3's 1st loop
> 
>                        //in stage 4
>                        write GPA
>                        @rip 0x402635
> 
>                                        -EFAULT, jump out stage 3's 1st loop
>                                        enter stage 3's 2nd loop
>                        write GPA
>                        @rip 0x402635
>                                        -EFAULT, continue stage 3's 2nd loop
>                                        guest rip += 3
> 
> The test then fails and reports "Unhandled exception '0xe' at guest RIP
> '0x402638'", since the next valid guest rip address is 0x402639, i.e. the
> "(mem) = val" in vcpu_arch_put_guest() is compiled into a mov instruction
> of length 4.

This shouldn't happen.  On x86, stage 3 is a hand-coded "mov %rax, (%rax)", not
vcpu_arch_put_guest().  Either something else is going on, or __x86_64__ isn't
defined?

	do {
		for (gpa = start_gpa; gpa < end_gpa; gpa += stride)
#ifdef __x86_64__
			asm volatile(".byte 0x48,0x89,0x00" :: "a"(gpa) : "memory"); /* mov %rax, (%rax) */
#elif defined(__aarch64__)
			asm volatile("str %0, [%0]" :: "r" (gpa) : "memory");
#else
			vcpu_arch_put_guest(*((volatile uint64_t *)gpa), gpa);
#endif
	} while (!READ_ONCE(mprotect_ro_done));

	/*
	 * Only architectures that write the entire range can explicitly sync,
	 * as other architectures will be stuck on the write fault.
	 */
#if defined(__x86_64__) || defined(__aarch64__)
	GUEST_SYNC(3);
#endif

	for (gpa = start_gpa; gpa < end_gpa; gpa += stride)
		vcpu_arch_put_guest(*((volatile uint64_t *)gpa), gpa);
	GUEST_SYNC(4);



> Even if it could be compiled into a mov instruction of length 3, the
> following execution of GUEST_SYNC(4) in guest could still cause the host
> failure of the assertion of stage 3.

Sorry, I don't follow.  The guest doesn't get "released" from GUEST_SYNC(3) until
the host runs the vCPU again, and that happens after asserting on stage 3 and
synchronizing with the main thread.

	assert_sync_stage(vcpu, 3);
#endif /* __x86_64__ || __aarch64__ */
	rendezvous_with_boss();

	/*
	 * Stage 4.  Run to completion, waiting for mprotect(PROT_WRITE) to
	 * make the memory writable again.
	 */
	do {
		r = _vcpu_run(vcpu);
	} while (r && errno == EFAULT);

next prev parent reply	other threads:[~2025-02-26  1:48 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-08 10:53 [PATCH] KVM: selftests: Wait mprotect_ro_done before write to RO in mmu_stress_test Yan Zhao
2025-02-11  1:03 ` Edgecombe, Rick P
2025-02-11  1:42   ` Yan Zhao
2025-02-11 18:29     ` Edgecombe, Rick P
2025-02-12  6:59       ` Yan Zhao
2025-02-26  1:53       ` Sean Christopherson
2025-02-26  2:07         ` Yan Zhao
2025-02-26  2:20           ` Sean Christopherson
2025-02-26  3:09             ` Yan Zhao
2025-02-26  1:48 ` Sean Christopherson [this message]
2025-02-26  3:08   ` Yan Zhao
2025-02-26 19:30     ` Sean Christopherson
2025-02-27  5:45       ` Yan Zhao
2025-02-27 22:18         ` Sean Christopherson
2025-02-28 10:55           ` Yan Zhao
2025-02-28 14:00             ` Sean Christopherson
2025-03-03  3:14               ` Yan Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z75y90KM_fE6H1cJ@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox