public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: SVM: Fix SNP AP destroy race with VMRUN
@ 2025-03-17 17:20 Tom Lendacky
  2025-03-17 17:23 ` Tom Lendacky
  0 siblings, 1 reply; 12+ messages in thread
From: Tom Lendacky @ 2025-03-17 17:20 UTC (permalink / raw)
  To: kvm, linux-kernel, x86
  Cc: Paolo Bonzini, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Ingo Molnar, Thomas Gleixner, Michael Roth

An AP destroy request for a target vCPU is typically followed by an
RMPADJUST to remove the VMSA attribute from the page currently being
used as the VMSA for the target vCPU. This can result in a vCPU that
is about to VMRUN to exit with #VMEXIT_INVALID.

This usually does not happen as APs are typically sitting in HLT when
being destroyed and therefore the vCPU thread is not running at the time.
However, if HLT is allowed inside the VM, then the vCPU could be about to
VMRUN when the VMSA attribute is removed from the VMSA page, resulting in
a #VMEXIT_INVALID when the vCPU actually issues the VMRUN and causing the
guest to crash. An RMPADJUST against an in-use (already running) VMSA
results in a #NPF for the vCPU issuing the RMPADJUST, so the VMSA
attribute cannot be changed until the VMRUN for target vCPU exits. The
Qemu command line option '-overcommit cpu-pm=on' is an example of allowing
HLT inside the guest.

Use kvm_test_request() to ensure that the target vCPU sees the AP destroy
request before returning to the initiating vCPU.

Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kvm/svm/sev.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0d898d6b697f..a040f29bb07b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4071,6 +4071,16 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
 	if (kick) {
 		kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
 		kvm_vcpu_kick(target_vcpu);
+
+		if (request == SVM_VMGEXIT_AP_DESTROY) {
+			/*
+			 * A destroy is likely to be followed by an RMPADJUST
+			 * that will remove the VMSA flag, so be sure the vCPU
+			 * got the request in case it is on the way to a VMRUN.
+			 */
+			while (kvm_test_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu))
+				cond_resched();
+		}
 	}
 
 	mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);

base-commit: f8d892c137f7448d7b49f5e3ad7aa7b5a48a64ed
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread
* Re: [PATCH] KVM: SVM: Fix SNP AP destroy race with VMRUN
@ 2025-03-21 16:20 Tom Lendacky
  0 siblings, 0 replies; 12+ messages in thread
From: Tom Lendacky @ 2025-03-21 16:20 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-kernel, x86, Paolo Bonzini, Borislav Petkov,
	Dave Hansen, Ingo Molnar, Thomas Gleixner, Michael Roth

On 3/18/25 08:47, Tom Lendacky wrote:
> On 3/18/25 07:43, Tom Lendacky wrote:
>> On 3/17/25 12:36, Tom Lendacky wrote:
>>> On 3/17/25 12:28, Sean Christopherson wrote:
>>>> On Mon, Mar 17, 2025, Tom Lendacky wrote:
>>>>> On 3/17/25 12:20, Tom Lendacky wrote:
>>>>>> An AP destroy request for a target vCPU is typically followed by an
>>>>>> RMPADJUST to remove the VMSA attribute from the page currently being
>>>>>> used as the VMSA for the target vCPU. This can result in a vCPU that
>>>>>> is about to VMRUN to exit with #VMEXIT_INVALID.
>>>>>>
>>>>>> This usually does not happen as APs are typically sitting in HLT when
>>>>>> being destroyed and therefore the vCPU thread is not running at the time.
>>>>>> However, if HLT is allowed inside the VM, then the vCPU could be about to
>>>>>> VMRUN when the VMSA attribute is removed from the VMSA page, resulting in
>>>>>> a #VMEXIT_INVALID when the vCPU actually issues the VMRUN and causing the
>>>>>> guest to crash. An RMPADJUST against an in-use (already running) VMSA
>>>>>> results in a #NPF for the vCPU issuing the RMPADJUST, so the VMSA
>>>>>> attribute cannot be changed until the VMRUN for target vCPU exits. The
>>>>>> Qemu command line option '-overcommit cpu-pm=on' is an example of allowing
>>>>>> HLT inside the guest.
>>>>>>
>>>>>> Use kvm_test_request() to ensure that the target vCPU sees the AP destroy
>>>>>> request before returning to the initiating vCPU.
>>>>>>
>>>>>> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
>>>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

>>>>
>>>> Very off-the-cuff, but I assume KVM_REQ_UPDATE_PROTECTED_GUEST_STATE just needs
>>>> to be annotated with KVM_REQUEST_WAIT.
>>>
>>> Ok, nice. I wasn't sure if KVM_REQUEST_WAIT would be appropriate here.
>>> This is much simpler. Let me test it out and resend if everything goes ok.
>>
>> So that doesn't work. I can still get an occasional #VMEXIT_INVALID. Let
>> me try to track down what is happening with this approach...
> 
> Looks like I need to use kvm_make_vcpus_request_mask() instead of just a
> plain kvm_make_request() followed by a kvm_vcpu_kick().
> 
> Let me try that and see how this works.

This appears to be working ok. The kvm_make_vcpus_request_mask() function
would need to be EXPORT_SYMBOL_GPL, though, any objections to that?

I could also simplify this a bit by creating a new function that takes a
target vCPU and then calls kvm_make_vcpus_request_mask() from there.
Thoughts?

This is what the patch currently looks like:


diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 32ae3aa50c7e..51aa63591b0a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -123,7 +123,8 @@
 	KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_HV_TLB_FLUSH \
 	KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
-#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE	KVM_ARCH_REQ(34)
+#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE \
+	KVM_ARCH_REQ_FLAGS(34, KVM_REQUEST_WAIT)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6e3f5042d9ce..0c45cc0c0571 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4038,8 +4038,13 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
 
 out:
 	if (kick) {
-		kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
-		kvm_vcpu_kick(target_vcpu);
+		DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
+
+		bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS);
+		bitmap_set(vcpu_bitmap, target_vcpu->vcpu_idx, 1);
+		kvm_make_vcpus_request_mask(vcpu->kvm,
+					    KVM_REQ_UPDATE_PROTECTED_GUEST_STATE,
+					    vcpu_bitmap);
 	}
 
 	mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ba0327e2d0d3..08c135f3d31f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -268,6 +268,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 
 	return called;
 }
+EXPORT_SYMBOL_GPL(kvm_make_vcpus_request_mask);
 
 bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
 {


Thanks,
Tom

> 
> Thanks,
> Tom
> 
>>
>> Thanks,
>> Tom
>>
>>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>>>> index 04e6c5604bc3..67abfe97c600 100644
>>>> --- a/arch/x86/include/asm/kvm_host.h
>>>> +++ b/arch/x86/include/asm/kvm_host.h
>>>> @@ -124,7 +124,8 @@
>>>>         KVM_ARCH_REQ_FLAGS(31, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>>>  #define KVM_REQ_HV_TLB_FLUSH \
>>>>         KVM_ARCH_REQ_FLAGS(32, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>>>> -#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE   KVM_ARCH_REQ(34)
>>>> +#define KVM_REQ_UPDATE_PROTECTED_GUEST_STATE \
>>>> +       KVM_ARCH_REQ_FLAGS(34, KVM_REQUEST_WAIT)
>>>>  
>>>>  #define CR0_RESERVED_BITS                                               \
>>>>         (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
>>>>
>>>>

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2025-03-26 17:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-17 17:20 [PATCH] KVM: SVM: Fix SNP AP destroy race with VMRUN Tom Lendacky
2025-03-17 17:23 ` Tom Lendacky
2025-03-17 17:28   ` Sean Christopherson
2025-03-17 17:36     ` Tom Lendacky
2025-03-18 12:43       ` Tom Lendacky
2025-03-18 13:47         ` Tom Lendacky
2025-03-21 16:52           ` Tom Lendacky
2025-03-21 23:17             ` Sean Christopherson
2025-03-25 17:49               ` Tom Lendacky
2025-03-26 15:34                 ` Tom Lendacky
2025-03-26 17:17                   ` Sean Christopherson
  -- strict thread matches above, loose matches on Subject: below --
2025-03-21 16:20 Tom Lendacky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox