* [PATCH 01/10] KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
@ 2025-02-19 1:26 ` Sean Christopherson
2025-02-24 19:38 ` Tom Lendacky
2025-02-25 2:22 ` Kim Phillips
2025-02-19 1:26 ` [PATCH 02/10] KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3 Sean Christopherson
` (9 subsequent siblings)
10 siblings, 2 replies; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:26 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
When running SEV-SNP guests on a CPU that supports DebugSwap, always save
the host's DR0..DR3 mask MSR values irrespective of whether or not
DebugSwap is enabled, to ensure the host values aren't clobbered by the
CPU.
SVM_VMGEXIT_AP_CREATE is deeply flawed in that it allows the *guest* to
create a VMSA with guest-controlled SEV_FEATURES. A well-behaved guest
can inform the hypervisor, i.e. KVM, of its "requested" features, but on
CPUs without ALLOWED_SEV_FEATURES support, nothing prevents the guest from
lying about which SEV features are being enabled (or not!).
If a misbehaving guest enables DebugSwap in a secondary vCPU's VMSA, the
CPU will load the DR0..DR3 mask MSRs on #VMEXIT, i.e. will clobber the
MSRs with '0' if KVM doesn't save its desired value.
Note, DR0..DR3 themselves are "ok", as DR7 is reset on #VMEXIT, and KVM
restores all DRs in common x86 code as needed via hw_breakpoint_restore().
I.e. there is no risk of host DR0..DR3 being clobbered (when it matters).
However, there is a flaw in the opposite direction; because the guest can
lie about enabling DebugSwap, i.e. can *disable* DebugSwap without KVM's
knowledge, KVM must not rely on the CPU to restore DRs. Defer fixing
that wart, as it's more of a documentation issue than a bug in the code.
Note, KVM added support for DebugSwap in commit d1f85fbe836e ("KVM: SEV:
Enable data breakpoints in SEV-ES"), but that is not an appropriate Fixes tag,
as the underlying flaw exists in hardware, not in KVM. I.e. all kernels
that support SEV-SNP need to be patched, not just kernels with KVM's full
support for DebugSwap (ignoring that DebugSwap support landed first).
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Cc: stable@vger.kernel.org
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 74525651770a..e3606d072735 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4568,6 +4568,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
{
+ struct kvm *kvm = svm->vcpu.kvm;
+
/*
* All host state for SEV-ES guests is categorized into three swap types
* based on how it is handled by hardware during a world switch:
@@ -4592,9 +4594,14 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
/*
* If DebugSwap is enabled, debug registers are loaded but NOT saved by
* the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
- * saves and loads debug registers (Type-A).
+ * saves and loads debug registers (Type-A). Sadly, on CPUs without
+ * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
+ * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
+ * and so KVM must save DRs if DebugSwap is supported to prevent DRs
+ * from being clobbered by a misbehaving guest.
*/
- if (sev_vcpu_has_debug_swap(svm)) {
+ if (sev_vcpu_has_debug_swap(svm) ||
+ (sev_snp_guest(kvm) && cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
hostsa->dr0 = native_get_debugreg(0);
hostsa->dr1 = native_get_debugreg(1);
hostsa->dr2 = native_get_debugreg(2);
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 01/10] KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
From: Tom Lendacky @ 2025-02-24 19:38 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:26, Sean Christopherson wrote:
> When running SEV-SNP guests on a CPU that supports DebugSwap, always save
> the host's DR0..DR3 mask MSR values irrespective of whether or not
> DebugSwap is enabled, to ensure the host values aren't clobbered by the
> CPU.
>
> SVM_VMGEXIT_AP_CREATE is deeply flawed in that it allows the *guest* to
> create a VMSA with guest-controlled SEV_FEATURES. A well behaved guest
> can inform the hypervisor, i.e. KVM, of its "requested" features, but on
> CPUs without ALLOWED_SEV_FEATURES support, nothing prevents the guest from
> lying about which SEV features are being enabled (or not!).
>
> If a misbehaving guest enables DebugSwap in a secondary vCPU's VMSA, the
> CPU will load the DR0..DR3 mask MSRs on #VMEXIT, i.e. will clobber the
> MSRs with '0' if KVM doesn't save its desired value.
>
> Note, DR0..DR3 themselves are "ok", as DR7 is reset on #VMEXIT, and KVM
> restores all DRs in common x86 code as needed via hw_breakpoint_restore().
> I.e. there is no risk of host DR0..DR3 being clobbered (when it matters).
> However, there is a flaw in the opposite direction; because the guest can
> lie about enabling DebugSwap, i.e. can *disable* DebugSwap without KVM's
> knowledge, KVM must not rely on the CPU to restore DRs. Defer fixing
> that wart, as it's more of a documentation issue than a bug in the code.
>
> Note, KVM added support for DebugSwap on commit d1f85fbe836e ("KVM: SEV:
> Enable data breakpoints in SEV-ES"), but that is not an appropriate Fixes,
> as the underlying flaw exists in hardware, not in KVM. I.e. all kernels
> that support SEV-SNP need to be patched, not just kernels with KVM's full
> support for DebugSwap (ignoring that DebugSwap support landed first).
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Cc: stable@vger.kernel.org
> Cc: Naveen N Rao <naveen@kernel.org>
> Cc: Kim Phillips <kim.phillips@amd.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Alexey Kardashevskiy <aik@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 74525651770a..e3606d072735 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4568,6 +4568,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
>
> void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
> {
> + struct kvm *kvm = svm->vcpu.kvm;
> +
> /*
> * All host state for SEV-ES guests is categorized into three swap types
> * based on how it is handled by hardware during a world switch:
> @@ -4592,9 +4594,14 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
> /*
> * If DebugSwap is enabled, debug registers are loaded but NOT saved by
> * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
> - * saves and loads debug registers (Type-A).
> + * saves and loads debug registers (Type-A). Sadly, on CPUs without
> + * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
> + * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
> + * and so KVM must save DRs if DebugSwap is supported to prevent DRs
> + * from being clobbered by a misbehaving guest.
> */
> - if (sev_vcpu_has_debug_swap(svm)) {
> + if (sev_vcpu_has_debug_swap(svm) ||
> + (sev_snp_guest(kvm) && cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
> hostsa->dr0 = native_get_debugreg(0);
> hostsa->dr1 = native_get_debugreg(1);
> hostsa->dr2 = native_get_debugreg(2);
* Re: [PATCH 01/10] KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
From: Kim Phillips @ 2025-02-25 2:22 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Tom Lendacky,
Alexey Kardashevskiy
On 2/18/25 7:26 PM, Sean Christopherson wrote:
> When running SEV-SNP guests on a CPU that supports DebugSwap, always save
> the host's DR0..DR3 mask MSR values irrespective of whether or not
> DebugSwap is enabled, to ensure the host values aren't clobbered by the
> CPU.
>
> SVM_VMGEXIT_AP_CREATE is deeply flawed in that it allows the *guest* to
> create a VMSA with guest-controlled SEV_FEATURES. A well behaved guest
> can inform the hypervisor, i.e. KVM, of its "requested" features, but on
> CPUs without ALLOWED_SEV_FEATURES support, nothing prevents the guest from
> lying about which SEV features are being enabled (or not!).
>
> If a misbehaving guest enables DebugSwap in a secondary vCPU's VMSA, the
> CPU will load the DR0..DR3 mask MSRs on #VMEXIT, i.e. will clobber the
> MSRs with '0' if KVM doesn't save its desired value.
>
> Note, DR0..DR3 themselves are "ok", as DR7 is reset on #VMEXIT, and KVM
> restores all DRs in common x86 code as needed via hw_breakpoint_restore().
> I.e. there is no risk of host DR0..DR3 being clobbered (when it matters).
> However, there is a flaw in the opposite direction; because the guest can
> lie about enabling DebugSwap, i.e. can *disable* DebugSwap without KVM's
> knowledge, KVM must not rely on the CPU to restore DRs. Defer fixing
> that wart, as it's more of a documentation issue than a bug in the code.
>
> Note, KVM added support for DebugSwap on commit d1f85fbe836e ("KVM: SEV:
> Enable data breakpoints in SEV-ES"), but that is not an appropriate Fixes,
> as the underlying flaw exists in hardware, not in KVM. I.e. all kernels
> that support SEV-SNP need to be patched, not just kernels with KVM's full
> support for DebugSwap (ignoring that DebugSwap support landed first).
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Cc: stable@vger.kernel.org
> Cc: Naveen N Rao <naveen@kernel.org>
> Cc: Kim Phillips <kim.phillips@amd.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Alexey Kardashevskiy <aik@amd.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 74525651770a..e3606d072735 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4568,6 +4568,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
>
> void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
> {
> + struct kvm *kvm = svm->vcpu.kvm;
> +
> /*
> * All host state for SEV-ES guests is categorized into three swap types
> * based on how it is handled by hardware during a world switch:
> @@ -4592,9 +4594,14 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
> /*
> * If DebugSwap is enabled, debug registers are loaded but NOT saved by
> * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
> - * saves and loads debug registers (Type-A).
> + * saves and loads debug registers (Type-A). Sadly, on CPUs without
> + * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
> + * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
> + * and so KVM must save DRs if DebugSwap is supported to prevent DRs
> + * from being clobbered by a misbehaving guest.
> */
> - if (sev_vcpu_has_debug_swap(svm)) {
> + if (sev_vcpu_has_debug_swap(svm) ||
> + (sev_snp_guest(kvm) && cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
Both ALLOWED_SEV_FEATURES and DEBUG_SWAP are also SEV-ES (not only SNP)
features, so s/sev_snp_guest/sev_es_guest/?
Thanks,
Kim
* Re: [PATCH 01/10] KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
From: Tom Lendacky @ 2025-02-25 14:12 UTC (permalink / raw)
To: Kim Phillips, Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Alexey Kardashevskiy
On 2/24/25 20:22, Kim Phillips wrote:
> On 2/18/25 7:26 PM, Sean Christopherson wrote:
>> When running SEV-SNP guests on a CPU that supports DebugSwap, always save
>> the host's DR0..DR3 mask MSR values irrespective of whether or not
>> DebugSwap is enabled, to ensure the host values aren't clobbered by the
>> CPU.
>>
>> SVM_VMGEXIT_AP_CREATE is deeply flawed in that it allows the *guest* to
>> create a VMSA with guest-controlled SEV_FEATURES. A well behaved guest
>> can inform the hypervisor, i.e. KVM, of its "requested" features, but on
>> CPUs without ALLOWED_SEV_FEATURES support, nothing prevents the guest
>> from
>> lying about which SEV features are being enabled (or not!).
>>
>> If a misbehaving guest enables DebugSwap in a secondary vCPU's VMSA, the
>> CPU will load the DR0..DR3 mask MSRs on #VMEXIT, i.e. will clobber the
>> MSRs with '0' if KVM doesn't save its desired value.
>>
>> Note, DR0..DR3 themselves are "ok", as DR7 is reset on #VMEXIT, and KVM
>> restores all DRs in common x86 code as needed via
>> hw_breakpoint_restore().
>> I.e. there is no risk of host DR0..DR3 being clobbered (when it matters).
>> However, there is a flaw in the opposite direction; because the guest can
>> lie about enabling DebugSwap, i.e. can *disable* DebugSwap without KVM's
>> knowledge, KVM must not rely on the CPU to restore DRs. Defer fixing
>> that wart, as it's more of a documentation issue than a bug in the code.
>>
>> Note, KVM added support for DebugSwap on commit d1f85fbe836e ("KVM: SEV:
>> Enable data breakpoints in SEV-ES"), but that is not an appropriate
>> Fixes,
>> as the underlying flaw exists in hardware, not in KVM. I.e. all kernels
>> that support SEV-SNP need to be patched, not just kernels with KVM's full
>> support for DebugSwap (ignoring that DebugSwap support landed first).
>>
>> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
>> Cc: stable@vger.kernel.org
>> Cc: Naveen N Rao <naveen@kernel.org>
>> Cc: Kim Phillips <kim.phillips@amd.com>
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: Alexey Kardashevskiy <aik@amd.com>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> ---
>> arch/x86/kvm/svm/sev.c | 11 +++++++++--
>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index 74525651770a..e3606d072735 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -4568,6 +4568,8 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
>> void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct
>> sev_es_save_area *hostsa)
>> {
>> + struct kvm *kvm = svm->vcpu.kvm;
>> +
>> /*
>> * All host state for SEV-ES guests is categorized into three
>> swap types
>> * based on how it is handled by hardware during a world switch:
>> @@ -4592,9 +4594,14 @@ void sev_es_prepare_switch_to_guest(struct
>> vcpu_svm *svm, struct sev_es_save_are
>> /*
>> * If DebugSwap is enabled, debug registers are loaded but NOT
>> saved by
>> * the CPU (Type-B). If DebugSwap is disabled/unsupported, the
>> CPU both
>> - * saves and loads debug registers (Type-A).
>> + * saves and loads debug registers (Type-A). Sadly, on CPUs without
>> + * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
>> + * DebugSwap on secondary vCPUs without KVM's knowledge via "AP
>> Create",
>> + * and so KVM must save DRs if DebugSwap is supported to prevent DRs
>> + * from being clobbered by a misbehaving guest.
>> */
>> - if (sev_vcpu_has_debug_swap(svm)) {
>> + if (sev_vcpu_has_debug_swap(svm) ||
>> + (sev_snp_guest(kvm) &&
>> cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
>
> Both ALLOWED_SEV_FEATURES and DEBUG_SWAP are also SEV-ES (not only SNP)
> features, so s/sev_snp_guest/sev_es_guest/?
Only SNP can supply a VMSA that may have a different SEV_FEATURES. For
SEV-ES, SEV_FEATURES will have been set by KVM and only KVM.
Thanks,
Tom
>
> Thanks,
>
> Kim
* [PATCH 02/10] KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
From: Sean Christopherson @ 2025-02-19 1:26 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Never rely on the CPU to restore/load host DR0..DR3 values, even if the
CPU supports DebugSwap, as there are no guarantees that SNP guests will
actually enable DebugSwap on APs. E.g. if KVM were to rely on the CPU to
load DR0..DR3 and skipped them during hw_breakpoint_restore(), KVM would
run with clobbered-to-zero DRs if an SNP guest created APs without
DebugSwap enabled.
Update the comment to explain the dangers, and hopefully prevent breaking
KVM in the future.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e3606d072735..6c6d45e13858 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4594,18 +4594,21 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
/*
* If DebugSwap is enabled, debug registers are loaded but NOT saved by
* the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
- * saves and loads debug registers (Type-A). Sadly, on CPUs without
- * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
- * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
- * and so KVM must save DRs if DebugSwap is supported to prevent DRs
- * from being clobbered by a misbehaving guest.
+ * saves and loads debug registers (Type-A). Sadly, KVM can't prevent
+ * SNP guests from lying about DebugSwap on secondary vCPUs, i.e. the
+ * SEV_FEATURES provided at "AP Create" isn't guaranteed to match what
+ * the guest has actually enabled (or not!) in the VMSA.
+ *
+ * If DebugSwap is *possible*, save the masks so that they're restored
+ * if the guest enables DebugSwap. But for the DRs themselves, do NOT
+ * rely on the CPU to restore the host values; KVM will restore them as
+ * needed in common code, via hw_breakpoint_restore(). Note, KVM does
+ * NOT support virtualizing Breakpoint Extensions, i.e. the mask MSRs
+ * don't need to be restored per se, KVM just needs to ensure they are
+ * loaded with the correct values *if* the CPU writes the MSRs.
*/
if (sev_vcpu_has_debug_swap(svm) ||
(sev_snp_guest(kvm) && cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
- hostsa->dr0 = native_get_debugreg(0);
- hostsa->dr1 = native_get_debugreg(1);
- hostsa->dr2 = native_get_debugreg(2);
- hostsa->dr3 = native_get_debugreg(3);
hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0);
hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1);
hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 02/10] KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
From: Tom Lendacky @ 2025-02-24 20:32 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:26, Sean Christopherson wrote:
> Never rely on the CPU to restore/load host DR0..DR3 values, even if the
> CPU supports DebugSwap, as there are no guarantees that SNP guests will
> actually enable DebugSwap on APs. E.g. if KVM were to rely on the CPU to
> load DR0..DR3 and skipped them during hw_breakpoint_restore(), KVM would
> run with clobbered-to-zero DRs if an SNP guest created APs without
> DebugSwap enabled.
>
> Update the comment to explain the dangers, and hopefully prevent breaking
> KVM in the future.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
See comment below about the Type-A vs Type-B thing, but functionally:
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 21 ++++++++++++---------
> 1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index e3606d072735..6c6d45e13858 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4594,18 +4594,21 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
> /*
> * If DebugSwap is enabled, debug registers are loaded but NOT saved by
> * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
> - * saves and loads debug registers (Type-A). Sadly, on CPUs without
> - * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
> - * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
> - * and so KVM must save DRs if DebugSwap is supported to prevent DRs
> - * from being clobbered by a misbehaving guest.
> + * saves and loads debug registers (Type-A). Sadly, KVM can't prevent
This mention of Type-A was bothering me, so I did some investigation on
this. If DebugSwap (DebugVirtualization in the latest APM) is
disabled/unsupported, DR0-3 and the DR0-3 Mask registers are left alone and
the guest sees the host values; they are not fully restored and fully
saved. When DebugVirtualization is enabled, at that point the registers
become Type-B.
I'm not sure whether it is best to update the comment here or in the
first patch.
Thanks,
Tom
> + * SNP guests from lying about DebugSwap on secondary vCPUs, i.e. the
> + * SEV_FEATURES provided at "AP Create" isn't guaranteed to match what
> + * the guest has actually enabled (or not!) in the VMSA.
> + *
> + * If DebugSwap is *possible*, save the masks so that they're restored
> + * if the guest enables DebugSwap. But for the DRs themselves, do NOT
> + * rely on the CPU to restore the host values; KVM will restore them as
> + * needed in common code, via hw_breakpoint_restore(). Note, KVM does
> + * NOT support virtualizing Breakpoint Extensions, i.e. the mask MSRs
> + * don't need to be restored per se, KVM just needs to ensure they are
> + * loaded with the correct values *if* the CPU writes the MSRs.
> */
> if (sev_vcpu_has_debug_swap(svm) ||
> (sev_snp_guest(kvm) && cpu_feature_enabled(X86_FEATURE_DEBUG_SWAP))) {
> - hostsa->dr0 = native_get_debugreg(0);
> - hostsa->dr1 = native_get_debugreg(1);
> - hostsa->dr2 = native_get_debugreg(2);
> - hostsa->dr3 = native_get_debugreg(3);
> hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0);
> hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1);
> hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
* Re: [PATCH 02/10] KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
From: Sean Christopherson @ 2025-02-24 22:32 UTC (permalink / raw)
To: Tom Lendacky
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On Mon, Feb 24, 2025, Tom Lendacky wrote:
> On 2/18/25 19:26, Sean Christopherson wrote:
> > Never rely on the CPU to restore/load host DR0..DR3 values, even if the
> > CPU supports DebugSwap, as there are no guarantees that SNP guests will
> > actually enable DebugSwap on APs. E.g. if KVM were to rely on the CPU to
> > load DR0..DR3 and skipped them during hw_breakpoint_restore(), KVM would
> > run with clobbered-to-zero DRs if an SNP guest created APs without
> > DebugSwap enabled.
> >
> > Update the comment to explain the dangers, and hopefully prevent breaking
> > KVM in the future.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> See comment below about the Type-A vs Type-B thing, but functionally:
>
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> > ---
> > arch/x86/kvm/svm/sev.c | 21 ++++++++++++---------
> > 1 file changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index e3606d072735..6c6d45e13858 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -4594,18 +4594,21 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
> > /*
> > * If DebugSwap is enabled, debug registers are loaded but NOT saved by
> > * the CPU (Type-B). If DebugSwap is disabled/unsupported, the CPU both
> > - * saves and loads debug registers (Type-A). Sadly, on CPUs without
> > - * ALLOWED_SEV_FEATURES, KVM can't prevent SNP guests from enabling
> > - * DebugSwap on secondary vCPUs without KVM's knowledge via "AP Create",
> > - * and so KVM must save DRs if DebugSwap is supported to prevent DRs
> > - * from being clobbered by a misbehaving guest.
> > + * saves and loads debug registers (Type-A). Sadly, KVM can't prevent
>
> This mention of Type-A was bothering me, so I did some investigation on
> this. If DebugSwap (DebugVirtualization in the latest APM) is
> disabled/unsupported, DR0-3 and DR0-3 Mask registers are left alone and
> the guest sees the host values, they are not fully restored and fully
> saved. When DebugVirtualization is enabled, at that point the registers
> become Type-B.
Good catch. I completely glossed over that; I'm pretty sure my subconscious simply
rejected that statement as wrong.
> I'm not sure whether it is best to update the comment here or in the
> first patch.
Probably first patch.
* [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
From: Sean Christopherson @ 2025-02-19 1:26 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Terminate the VM if userspace attempts to run an SEV-SNP (or -ES) vCPU
that has an invalid VMSA. With SNP's AP Create/Destroy hypercalls, it's
possible for an SNP vCPU to end up with an invalid VMSA, e.g. through a
deliberate Destroy or a failed Create event. KVM marks the vCPU HALTED so
that *KVM* doesn't run the vCPU, but nothing prevents a misbehaving VMM
from manually making the vCPU RUNNABLE via KVM_SET_MP_STATE.
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 18 +++++++++++++++---
arch/x86/kvm/svm/svm.c | 7 +++++--
arch/x86/kvm/svm/svm.h | 2 +-
3 files changed, 21 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6c6d45e13858..e14a37dbc6ea 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3452,10 +3452,21 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
svm->sev_es.ghcb = NULL;
}
-void pre_sev_run(struct vcpu_svm *svm, int cpu)
+int pre_sev_run(struct vcpu_svm *svm, int cpu)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
- unsigned int asid = sev_get_asid(svm->vcpu.kvm);
+ struct kvm *kvm = svm->vcpu.kvm;
+ unsigned int asid = sev_get_asid(kvm);
+
+ /*
+ * Terminate the VM if userspace attempts to run the vCPU with an
+ * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
+ * an SNP AP Destroy event.
+ */
+ if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
+ kvm_vm_dead(kvm);
+ return -EIO;
+ }
/* Assign the asid allocated with this SEV guest */
svm->asid = asid;
@@ -3468,11 +3479,12 @@ void pre_sev_run(struct vcpu_svm *svm, int cpu)
*/
if (sd->sev_vmcbs[asid] == svm->vmcb &&
svm->vcpu.arch.last_vmentry_cpu == cpu)
- return;
+ return 0;
sd->sev_vmcbs[asid] = svm->vmcb;
svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ASID;
vmcb_mark_dirty(svm->vmcb, VMCB_ASID);
+ return 0;
}
#define GHCB_SCRATCH_AREA_LIMIT (16ULL * PAGE_SIZE)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b8aa0f36850f..46e0b65a9fec 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3587,7 +3587,7 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
return svm_invoke_exit_handler(vcpu, exit_code);
}
-static void pre_svm_run(struct kvm_vcpu *vcpu)
+static int pre_svm_run(struct kvm_vcpu *vcpu)
{
struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, vcpu->cpu);
struct vcpu_svm *svm = to_svm(vcpu);
@@ -3609,6 +3609,8 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
/* FIXME: handle wraparound of asid_generation */
if (svm->current_vmcb->asid_generation != sd->asid_generation)
new_asid(svm, sd);
+
+ return 0;
}
static void svm_inject_nmi(struct kvm_vcpu *vcpu)
@@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
if (force_immediate_exit)
smp_send_reschedule(vcpu->cpu);
- pre_svm_run(vcpu);
+ if (pre_svm_run(vcpu))
+ return EXIT_FASTPATH_EXIT_USERSPACE;
sync_lapic_to_cr8(vcpu);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5b159f017055..e51852977b70 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -713,7 +713,7 @@ void avic_refresh_virtual_apic_mode(struct kvm_vcpu *vcpu);
/* sev.c */
-void pre_sev_run(struct vcpu_svm *svm, int cpu);
+int pre_sev_run(struct vcpu_svm *svm, int cpu);
void sev_init_vmcb(struct vcpu_svm *svm);
void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
From: Tom Lendacky @ 2025-02-24 21:03 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:26, Sean Christopherson wrote:
> Terminate the VM if userspace attempts to run an SEV-SNP (or -ES) vCPU
> that has an invalid VMSA. With SNP's AP Create/Destroy hypercalls, it's
> possible for an SNP vCPU to end up with an invalid VMSA, e.g. through a
> deliberate Destroy or a failed Create event. KVM marks the vCPU HALTED so
> that *KVM* doesn't run the vCPU, but nothing prevents a misbehaving VMM
> from manually making the vCPU RUNNABLE via KVM_SET_MP_STATE.
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 18 +++++++++++++++---
> arch/x86/kvm/svm/svm.c | 7 +++++--
> arch/x86/kvm/svm/svm.h | 2 +-
> 3 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 6c6d45e13858..e14a37dbc6ea 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3452,10 +3452,21 @@ void sev_es_unmap_ghcb(struct vcpu_svm *svm)
> svm->sev_es.ghcb = NULL;
> }
>
> -void pre_sev_run(struct vcpu_svm *svm, int cpu)
> +int pre_sev_run(struct vcpu_svm *svm, int cpu)
> {
> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
> - unsigned int asid = sev_get_asid(svm->vcpu.kvm);
> + struct kvm *kvm = svm->vcpu.kvm;
> + unsigned int asid = sev_get_asid(kvm);
> +
> + /*
> + * Terminate the VM if userspace attempts to run the vCPU with an
> + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
> + * an SNP AP Destroy event.
> + */
> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
> + kvm_vm_dead(kvm);
> + return -EIO;
> + }
If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the
VMRUN will fail and KVM will dump the VMCB and exit back to userspace
with KVM_EXIT_INTERNAL_ERROR.
Is doing this preferable to that? If so, should a vcpu_unimpl() message
be issued, too, to better identify the reason for marking the VM dead?
>
> /* Assign the asid allocated with this SEV guest */
> svm->asid = asid;
> @@ -3468,11 +3479,12 @@ void pre_sev_run(struct vcpu_svm *svm, int cpu)
> */
> if (sd->sev_vmcbs[asid] == svm->vmcb &&
> svm->vcpu.arch.last_vmentry_cpu == cpu)
> - return;
> + return 0;
>
> sd->sev_vmcbs[asid] = svm->vmcb;
> svm->vmcb->control.tlb_ctl = TLB_CONTROL_FLUSH_ASID;
> vmcb_mark_dirty(svm->vmcb, VMCB_ASID);
> + return 0;
> }
>
> #define GHCB_SCRATCH_AREA_LIMIT (16ULL * PAGE_SIZE)
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b8aa0f36850f..46e0b65a9fec 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3587,7 +3587,7 @@ static int svm_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
> return svm_invoke_exit_handler(vcpu, exit_code);
> }
>
> -static void pre_svm_run(struct kvm_vcpu *vcpu)
> +static int pre_svm_run(struct kvm_vcpu *vcpu)
> {
> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, vcpu->cpu);
> struct vcpu_svm *svm = to_svm(vcpu);
> @@ -3609,6 +3609,8 @@ static void pre_svm_run(struct kvm_vcpu *vcpu)
> /* FIXME: handle wraparound of asid_generation */
> if (svm->current_vmcb->asid_generation != sd->asid_generation)
> new_asid(svm, sd);
> +
> + return 0;
> }
>
> static void svm_inject_nmi(struct kvm_vcpu *vcpu)
> @@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
> if (force_immediate_exit)
> smp_send_reschedule(vcpu->cpu);
>
> - pre_svm_run(vcpu);
> + if (pre_svm_run(vcpu))
> + return EXIT_FASTPATH_EXIT_USERSPACE;
Since the return code from pre_svm_run() is never used, should it just
be a bool function, then?
Thanks,
Tom
>
> sync_lapic_to_cr8(vcpu);
>
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 5b159f017055..e51852977b70 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -713,7 +713,7 @@ void avic_refresh_virtual_apic_mode(struct kvm_vcpu *vcpu);
>
> /* sev.c */
>
> -void pre_sev_run(struct vcpu_svm *svm, int cpu);
> +int pre_sev_run(struct vcpu_svm *svm, int cpu);
> void sev_init_vmcb(struct vcpu_svm *svm);
> void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
> int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
2025-02-24 21:03 ` Tom Lendacky
@ 2025-02-24 22:55 ` Sean Christopherson
2025-02-24 23:55 ` Tom Lendacky
0 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-24 22:55 UTC (permalink / raw)
To: Tom Lendacky
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On Mon, Feb 24, 2025, Tom Lendacky wrote:
> On 2/18/25 19:26, Sean Christopherson wrote:
> > -void pre_sev_run(struct vcpu_svm *svm, int cpu)
> > +int pre_sev_run(struct vcpu_svm *svm, int cpu)
> > {
> > struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
> > - unsigned int asid = sev_get_asid(svm->vcpu.kvm);
> > + struct kvm *kvm = svm->vcpu.kvm;
> > + unsigned int asid = sev_get_asid(kvm);
> > +
> > + /*
> > + * Terminate the VM if userspace attempts to run the vCPU with an
> > + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
> > + * an SNP AP Destroy event.
> > + */
> > + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
> > + kvm_vm_dead(kvm);
> > + return -EIO;
> > + }
>
> If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the
> VMRUN will fail and KVM will dump the VMCB and exit back to userspace
I haven't tested, but based on what the APM says, I'm pretty sure this would crash
the host due to a #GP on VMRUN, i.e. due to the resulting kvm_spurious_fault().
IF (rAX contains an unsupported physical address)
EXCEPTION [#GP]
> with KVM_EXIT_INTERNAL_ERROR.
>
> Is doing this preferable to that?
Even if AMD guaranteed that the absolute worst case scenario is a failed VMRUN
with zero side effects, doing VMRUN with a bad address should be treated as a
KVM bug.
> If so, should a vcpu_unimpl() message be issued, too, to better identify the
> reason for marking the VM dead?
My vote is no. At some point we need to assume userspace possesses a reasonable
level of competency and sanity.
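(For reference, the "misbehaving VMM" scenario under discussion needs nothing beyond the stock KVM API. A minimal, illustrative sketch — vcpu_fd is assumed to be an already-open KVM vCPU file descriptor; this is not code from the patch:)

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Illustrative sketch only: force a vCPU back to RUNNABLE, e.g. after an
 * SNP AP Destroy event has invalidated its VMSA.  vcpu_fd is assumed to
 * be an open KVM vCPU file descriptor.  Returns 0 on success, -1 (with
 * errno set) on failure.
 */
static int force_runnable(int vcpu_fd)
{
	struct kvm_mp_state mp_state = { .mp_state = KVM_MP_STATE_RUNNABLE };

	return ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp_state);
}
```

Nothing in the ioctl path checks whether the vCPU actually has a valid VMSA, which is exactly why pre_sev_run() has to.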
> > static void svm_inject_nmi(struct kvm_vcpu *vcpu)
> > @@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
> > if (force_immediate_exit)
> > smp_send_reschedule(vcpu->cpu);
> >
> > - pre_svm_run(vcpu);
> > + if (pre_svm_run(vcpu))
> > + return EXIT_FASTPATH_EXIT_USERSPACE;
>
> Since the return code from pre_svm_run() is never used, should it just
> be a bool function, then?
Hard no. I strongly dislike boolean returns for functions that aren't obviously
predicates, because it's impossible to determine the polarity of the return value
based solely on the prototype. This leads to bugs that are easier to detect with
0/-errno return, e.g. returning -EINVAL in a happy path stands out more than
returning the wrong false/true value.
Case in point (because I just responded to another email about this function),
what's the polarity of this helper? :-)
static bool sanity_check_entries(struct kvm_cpuid_entry2 __user *entries,
__u32 num_entries, unsigned int ioctl_type)
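(A contrived, self-contained illustration of the convention being argued for — the names and limits below are made up for the example, not taken from KVM:)

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Contrived example.  From the prototype alone it's ambiguous whether
 * 'true' means "the entries are sane" or "a sanity problem was found" --
 * the caller has to read the body to know which way to branch.
 */
static bool sanity_check_entries(int num_entries)
{
	return num_entries > 0 && num_entries <= 256;
}

/*
 * With a 0/-errno return, the polarity is unambiguous at every call site:
 * nonzero (negative) is failure, and returning -EINVAL on the happy path
 * would stick out immediately in review.
 */
static int check_entries(int num_entries)
{
	if (num_entries <= 0 || num_entries > 256)
		return -EINVAL;
	return 0;
}
```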
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
2025-02-24 22:55 ` Sean Christopherson
@ 2025-02-24 23:55 ` Tom Lendacky
2025-02-25 0:54 ` Sean Christopherson
0 siblings, 1 reply; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 23:55 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/24/25 16:55, Sean Christopherson wrote:
> On Mon, Feb 24, 2025, Tom Lendacky wrote:
>> On 2/18/25 19:26, Sean Christopherson wrote:
>>> -void pre_sev_run(struct vcpu_svm *svm, int cpu)
>>> +int pre_sev_run(struct vcpu_svm *svm, int cpu)
>>> {
>>> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
>>> - unsigned int asid = sev_get_asid(svm->vcpu.kvm);
>>> + struct kvm *kvm = svm->vcpu.kvm;
>>> + unsigned int asid = sev_get_asid(kvm);
>>> +
>>> + /*
>>> + * Terminate the VM if userspace attempts to run the vCPU with an
>>> + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
>>> + * an SNP AP Destroy event.
>>> + */
>>> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
>>> + kvm_vm_dead(kvm);
>>> + return -EIO;
>>> + }
>>
>> If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the
>> VMRUN will fail and KVM will dump the VMCB and exit back to userspace
>
> I haven't tested, but based on what the APM says, I'm pretty sure this would crash
> the host due to a #GP on VMRUN, i.e. due to the resulting kvm_spurious_fault().
>
> IF (rAX contains an unsupported physical address)
> EXCEPTION [#GP]
Well, that's for the VMCB; the VMSA is pointed to by the VMCB and results
in a VMEXIT code of -1 if you don't supply a proper page-aligned
physical address.
>
>> with KVM_EXIT_INTERNAL_ERROR.
>>
>> Is doing this preferrable to that?
>
> Even if AMD guaranteed that the absolute worst case scenario is a failed VMRUN
> with zero side effects, doing VMRUN with a bad address should be treated as a
> KVM bug.
Fair.
>
>> If so, should a vcpu_unimpl() message be issued, too, to better identify the
>> reason for marking the VM dead?
>
> My vote is no. At some point we need to assume userspace possesses a reasonable
> level of competency and sanity.
>
>>> static void svm_inject_nmi(struct kvm_vcpu *vcpu)
>>> @@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
>>> if (force_immediate_exit)
>>> smp_send_reschedule(vcpu->cpu);
>>>
>>> - pre_svm_run(vcpu);
>>> + if (pre_svm_run(vcpu))
>>> + return EXIT_FASTPATH_EXIT_USERSPACE;
In testing this out, I think userspace continues on because I eventually
get:
KVM_GET_PIT2 failed: Input/output error
/tmp/cmdline.98112: line 1: 98163 Aborted (core dumped) ...
Haven't looked too close, but maybe an exit_reason needs to be set to
get qemu to quit sooner?
Thanks,
Tom
>>
>> Since the return code from pre_svm_run() is never used, should it just
>> be a bool function, then?
>
> Hard no. I strongly dislike boolean returns for functions that aren't obviously
> predicates, because it's impossible to determine the polarity of the return value
> based solely on the prototype. This leads to bugs that are easier to detect with
> 0/-errno return, e.g. returning -EINVAL in a happy path stands out more than
> returning the wrong false/true value.
>
> Case in point (because I just responded to another email about this function),
> what's the polarity of this helper? :-)
>
> static bool sanity_check_entries(struct kvm_cpuid_entry2 __user *entries,
> __u32 num_entries, unsigned int ioctl_type)
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
2025-02-24 23:55 ` Tom Lendacky
@ 2025-02-25 0:54 ` Sean Christopherson
2025-02-25 1:20 ` Sean Christopherson
2025-02-25 14:42 ` Tom Lendacky
0 siblings, 2 replies; 33+ messages in thread
From: Sean Christopherson @ 2025-02-25 0:54 UTC (permalink / raw)
To: Tom Lendacky
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On Mon, Feb 24, 2025, Tom Lendacky wrote:
> On 2/24/25 16:55, Sean Christopherson wrote:
> > On Mon, Feb 24, 2025, Tom Lendacky wrote:
> >> On 2/18/25 19:26, Sean Christopherson wrote:
> >>> -void pre_sev_run(struct vcpu_svm *svm, int cpu)
> >>> +int pre_sev_run(struct vcpu_svm *svm, int cpu)
> >>> {
> >>> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
> >>> - unsigned int asid = sev_get_asid(svm->vcpu.kvm);
> >>> + struct kvm *kvm = svm->vcpu.kvm;
> >>> + unsigned int asid = sev_get_asid(kvm);
> >>> +
> >>> + /*
> >>> + * Terminate the VM if userspace attempts to run the vCPU with an
> >>> + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
> >>> + * an SNP AP Destroy event.
> >>> + */
> >>> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
> >>> + kvm_vm_dead(kvm);
> >>> + return -EIO;
> >>> + }
> >>
> >> If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the
> >> VMRUN will fail and KVM will dump the VMCB and exit back to userspace
> >
> > I haven't tested, but based on what the APM says, I'm pretty sure this would crash
> > the host due to a #GP on VMRUN, i.e. due to the resulting kvm_spurious_fault().
> >
> > IF (rAX contains an unsupported physical address)
> > EXCEPTION [#GP]
>
> Well, that's for the VMCB; the VMSA is pointed to by the VMCB and results
> in a VMEXIT code of -1 if you don't supply a proper page-aligned
> physical address.
Ah, good to know (and somewhat of a relief :-) ).
> >>> static void svm_inject_nmi(struct kvm_vcpu *vcpu)
> >>> @@ -4231,7 +4233,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
> >>> if (force_immediate_exit)
> >>> smp_send_reschedule(vcpu->cpu);
> >>>
> >>> - pre_svm_run(vcpu);
> >>> + if (pre_svm_run(vcpu))
> >>> + return EXIT_FASTPATH_EXIT_USERSPACE;
>
> In testing this out, I think userspace continues on because I eventually
> get:
>
> KVM_GET_PIT2 failed: Input/output error
> /tmp/cmdline.98112: line 1: 98163 Aborted (core dumped) ...
>
> Haven't looked too close, but maybe an exit_reason needs to be set to
> get qemu to quit sooner?
Oh, the irony. In trying to do the right thing (exit to userspace), I managed to
do the wrong thing.
If KVM tried to re-enter the guest, vcpu_enter_guest() would have encountered
the KVM_REQ_DEAD and exited with -EIO.
if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu)) {
r = -EIO;
goto out;
}
By returning EXIT_FASTPATH_EXIT_USERSPACE, KVM exited to userspace more directly
and returned '0' instead of -EIO.
Getting KVM to return -EIO is easy, but doing so feels wrong, especially if we
take the quick-and-dirty route like so:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 454fd6b8f3db..9c8b400e04f2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11102,7 +11102,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
kvm_lapic_sync_from_vapic(vcpu);
if (unlikely(exit_fastpath == EXIT_FASTPATH_EXIT_USERSPACE))
- return 0;
+ return kvm_test_request(KVM_REQ_VM_DEAD, vcpu) ? -EIO : 0;
r = kvm_x86_call(handle_exit)(vcpu, exit_fastpath);
return r;
Given that, IIUC, KVM would eventually return KVM_EXIT_FAIL_ENTRY, I like your
idea of returning meaningful information. And unless I'm missing something, that
would obviate any need to terminate the VM, which would address your earlier point
of whether terminating the VM is truly better than returning a familiar
error code.
So this? (completely untested)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 7345cac6f93a..71b340cbe561 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3463,10 +3463,8 @@ int pre_sev_run(struct vcpu_svm *svm, int cpu)
* invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
* an SNP AP Destroy event.
*/
- if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
- kvm_vm_dead(kvm);
- return -EIO;
- }
+ if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa))
+ return -EINVAL;
/* Assign the asid allocated with this SEV guest */
svm->asid = asid;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 46e0b65a9fec..f72bcf2e590e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4233,8 +4233,12 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
if (force_immediate_exit)
smp_send_reschedule(vcpu->cpu);
- if (pre_svm_run(vcpu))
+ if (pre_svm_run(vcpu)) {
+ vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+ vcpu->run->fail_entry.hardware_entry_failure_reason = SVM_EXIT_ERR;
+ vcpu->run->fail_entry.cpu = vcpu->cpu;
return EXIT_FASTPATH_EXIT_USERSPACE;
+ }
sync_lapic_to_cr8(vcpu);
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
2025-02-25 0:54 ` Sean Christopherson
@ 2025-02-25 1:20 ` Sean Christopherson
2025-02-25 14:42 ` Tom Lendacky
1 sibling, 0 replies; 33+ messages in thread
From: Sean Christopherson @ 2025-02-25 1:20 UTC (permalink / raw)
To: Tom Lendacky
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On Mon, Feb 24, 2025, Sean Christopherson wrote:
> On Mon, Feb 24, 2025, Tom Lendacky wrote:
> > On 2/24/25 16:55, Sean Christopherson wrote:
> > > On Mon, Feb 24, 2025, Tom Lendacky wrote:
> > >> On 2/18/25 19:26, Sean Christopherson wrote:
> > >>> -void pre_sev_run(struct vcpu_svm *svm, int cpu)
> > >>> +int pre_sev_run(struct vcpu_svm *svm, int cpu)
> > >>> {
> > >>> struct svm_cpu_data *sd = per_cpu_ptr(&svm_data, cpu);
> > >>> - unsigned int asid = sev_get_asid(svm->vcpu.kvm);
> > >>> + struct kvm *kvm = svm->vcpu.kvm;
> > >>> + unsigned int asid = sev_get_asid(kvm);
> > >>> +
> > >>> + /*
> > >>> + * Terminate the VM if userspace attempts to run the vCPU with an
> > >>> + * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
> > >>> + * an SNP AP Destroy event.
> > >>> + */
> > >>> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
> > >>> + kvm_vm_dead(kvm);
> > >>> + return -EIO;
> > >>> + }
> > >>
> > >> If a VMRUN is performed with the vmsa_pa value set to INVALID_PAGE, the
> > >> VMRUN will fail and KVM will dump the VMCB and exit back to userspace
> > >
> > > I haven't tested, but based on what the APM says, I'm pretty sure this would crash
> > > the host due to a #GP on VMRUN, i.e. due to the resulting kvm_spurious_fault().
> > >
> > > IF (rAX contains an unsupported physical address)
> > > EXCEPTION [#GP]
> >
> > Well that's for the VMCB, the VMSA is pointed to by the VMCB and results
> > in a VMEXIT code of -1 if you don't supply a proper page-aligned,
> > physical address.
>
> Ah, good to know (and somewhat of a relief :-) ).
If anyone else was wondering, the behavior is described in the "VMRUN Page Checks"
table, which is "Table 15-4" in the March 2024 version of the APM.
FWIW, knowing that VMRUN is supposed to handle this scenario does make me somewhat
tempted to skip this patch entirely. But I still don't like the idea of doing
VMRUN with a known bad address.
* Re: [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA
2025-02-25 0:54 ` Sean Christopherson
2025-02-25 1:20 ` Sean Christopherson
@ 2025-02-25 14:42 ` Tom Lendacky
1 sibling, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-25 14:42 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/24/25 18:54, Sean Christopherson wrote:
> On Mon, Feb 24, 2025, Tom Lendacky wrote:
>> On 2/24/25 16:55, Sean Christopherson wrote:
>>> On Mon, Feb 24, 2025, Tom Lendacky wrote:
>>>> On 2/18/25 19:26, Sean Christopherson wrote:
>
> Given that, IIUC, KVM would eventually return KVM_EXIT_FAIL_ENTRY, I like your
> idea of returning meaningful information. And unless I'm missing something, that
> would obviate any need to terminate the VM, which would address your earlier point
> of whether terminating the VM is truly better than returning than returning a
> familiar error code.
>
> So this? (completely untested)
This works nicely and qemu terminates quickly with:
KVM: entry failed, hardware error 0xffffffffffffffff
Thanks,
Tom
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 7345cac6f93a..71b340cbe561 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3463,10 +3463,8 @@ int pre_sev_run(struct vcpu_svm *svm, int cpu)
> * invalid VMSA, e.g. if userspace forces the vCPU to be RUNNABLE after
> * an SNP AP Destroy event.
> */
> - if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa)) {
> - kvm_vm_dead(kvm);
> - return -EIO;
> - }
> + if (sev_es_guest(kvm) && !VALID_PAGE(svm->vmcb->control.vmsa_pa))
> + return -EINVAL;
>
> /* Assign the asid allocated with this SEV guest */
> svm->asid = asid;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 46e0b65a9fec..f72bcf2e590e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4233,8 +4233,12 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
> if (force_immediate_exit)
> smp_send_reschedule(vcpu->cpu);
>
> - if (pre_svm_run(vcpu))
> + if (pre_svm_run(vcpu)) {
> + vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> + vcpu->run->fail_entry.hardware_entry_failure_reason = SVM_EXIT_ERR;
> + vcpu->run->fail_entry.cpu = vcpu->cpu;
> return EXIT_FASTPATH_EXIT_USERSPACE;
> + }
>
> sync_lapic_to_cr8(vcpu);
* [PATCH 04/10] KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (2 preceding siblings ...)
2025-02-19 1:26 ` [PATCH 03/10] KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid VMSA Sean Christopherson
@ 2025-02-19 1:26 ` Sean Christopherson
2025-02-24 21:31 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 05/10] KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view Sean Christopherson
` (6 subsequent siblings)
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:26 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
If KVM rejects an AP Creation event, leave the target vCPU state as-is.
Nothing in the GHCB suggests the hypervisor is *allowed* to muck with vCPU
state on failure, let alone required to do so. Furthermore, kicking only
in the !ON_INIT case leads to divergent behavior, and even the "kick" case
is non-deterministic.
E.g. if an ON_INIT request fails, the guest can successfully retry if the
fixed AP Creation request is made prior to sending INIT. And if a !ON_INIT
fails, the guest can successfully retry if the fixed AP Creation request is
handled before the target vCPU processes KVM's
KVM_REQ_UPDATE_PROTECTED_GUEST_STATE.
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index e14a37dbc6ea..07125b2cf0a6 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3959,16 +3959,12 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
/*
* The target vCPU is valid, so the vCPU will be kicked unless the
- * request is for CREATE_ON_INIT. For any errors at this stage, the
- * kick will place the vCPU in an non-runnable state.
+ * request is for CREATE_ON_INIT.
*/
kick = true;
mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
- target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
- target_svm->sev_es.snp_ap_waiting_for_reset = true;
-
/* Interrupt injection mode shouldn't change for AP creation */
if (request < SVM_VMGEXIT_AP_DESTROY) {
u64 sev_features;
@@ -4014,20 +4010,23 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
break;
case SVM_VMGEXIT_AP_DESTROY:
+ target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
break;
default:
vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
request);
ret = -EINVAL;
- break;
+ goto out;
}
-out:
+ target_svm->sev_es.snp_ap_waiting_for_reset = true;
+
if (kick) {
kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
kvm_vcpu_kick(target_vcpu);
}
+out:
mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
return ret;
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 04/10] KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error
2025-02-19 1:26 ` [PATCH 04/10] KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error Sean Christopherson
@ 2025-02-24 21:31 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 21:31 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:26, Sean Christopherson wrote:
> If KVM rejects an AP Creation event, leave the target vCPU state as-is.
> Nothing in the GHCB suggests the hypervisor is *allowed* to muck with vCPU
> state on failure, let alone required to do so. Furthermore, kicking only
> in the !ON_INIT case leads to divergent behavior, and even the "kick" case
> is non-deterministic.
>
> E.g. if an ON_INIT request fails, the guest can successfully retry if the
> fixed AP Creation request is made prior to sending INIT. And if a !ON_INIT
> fails, the guest can successfully retry if the fixed AP Creation request is
> handled before the target vCPU processes KVM's
> KVM_REQ_UPDATE_PROTECTED_GUEST_STATE.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 13 ++++++-------
> 1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index e14a37dbc6ea..07125b2cf0a6 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3959,16 +3959,12 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> /*
> * The target vCPU is valid, so the vCPU will be kicked unless the
> - * request is for CREATE_ON_INIT. For any errors at this stage, the
> - * kick will place the vCPU in an non-runnable state.
> + * request is for CREATE_ON_INIT.
> */
> kick = true;
>
> mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
>
> - target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
> - target_svm->sev_es.snp_ap_waiting_for_reset = true;
> -
> /* Interrupt injection mode shouldn't change for AP creation */
> if (request < SVM_VMGEXIT_AP_DESTROY) {
> u64 sev_features;
> @@ -4014,20 +4010,23 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
> break;
> case SVM_VMGEXIT_AP_DESTROY:
> + target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
> break;
> default:
> vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
> request);
> ret = -EINVAL;
> - break;
> + goto out;
> }
>
> -out:
> + target_svm->sev_es.snp_ap_waiting_for_reset = true;
> +
> if (kick) {
> kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
> kvm_vcpu_kick(target_vcpu);
> }
>
> +out:
> mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
>
> return ret;
* [PATCH 05/10] KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (3 preceding siblings ...)
2025-02-19 1:26 ` [PATCH 04/10] KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-24 21:46 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling Sean Christopherson
` (5 subsequent siblings)
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
When handling an "AP Create" event, return an error if the "requested" SEV
features for the vCPU don't exactly match KVM's view of the VM-scoped
features. There is no known use case for heterogeneous SEV features across
vCPUs, and while KVM can't actually enforce an exact match since the value
in RAX isn't guaranteed to match what the guest shoved into the VMSA, KVM
can at least avoid knowingly letting the guest run in an unsupported state.
E.g. if a VM is created with DebugSwap disabled, KVM will intercept #DBs
and DRs for all vCPUs, even if an AP is "created" with DebugSwap enabled in
its VMSA.
Note, the GHCB spec only "requires" that "AP use the same interrupt
injection mechanism as the BSP", but given the disaster that is DebugSwap
and SEV_FEATURES in general, it's safe to say that AMD didn't consider all
possible complications with mismatching features between the BSP and APs.
Oppurtunistically fold the check into the relevant request flavors; the
"request < AP_DESTROY" check is just a bizarre way of implementing the
AP_CREATE_ON_INIT => AP_CREATE fallthrough.
Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 23 ++++++++---------------
1 file changed, 8 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 07125b2cf0a6..8425198c5204 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3934,6 +3934,7 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
static int sev_snp_ap_creation(struct vcpu_svm *svm)
{
+ struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);
struct kvm_vcpu *vcpu = &svm->vcpu;
struct kvm_vcpu *target_vcpu;
struct vcpu_svm *target_svm;
@@ -3965,26 +3966,18 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
- /* Interrupt injection mode shouldn't change for AP creation */
- if (request < SVM_VMGEXIT_AP_DESTROY) {
- u64 sev_features;
-
- sev_features = vcpu->arch.regs[VCPU_REGS_RAX];
- sev_features ^= to_kvm_sev_info(svm->vcpu.kvm)->vmsa_features;
-
- if (sev_features & SVM_SEV_FEAT_INT_INJ_MODES) {
- vcpu_unimpl(vcpu, "vmgexit: invalid AP injection mode [%#lx] from guest\n",
- vcpu->arch.regs[VCPU_REGS_RAX]);
- ret = -EINVAL;
- goto out;
- }
- }
-
switch (request) {
case SVM_VMGEXIT_AP_CREATE_ON_INIT:
kick = false;
fallthrough;
case SVM_VMGEXIT_AP_CREATE:
+ if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
+ vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
+ vcpu->arch.regs[VCPU_REGS_RAX], sev->vmsa_features);
+ ret = -EINVAL;
+ goto out;
+ }
+
if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
svm->vmcb->control.exit_info_2);
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 05/10] KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view
2025-02-19 1:27 ` [PATCH 05/10] KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view Sean Christopherson
@ 2025-02-24 21:46 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 21:46 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> When handling an "AP Create" event, return an error if the "requested" SEV
> features for the vCPU don't exactly match KVM's view of the VM-scoped
> features. There is no known use case for heterogeneous SEV features across
> vCPUs, and while KVM can't actually enforce an exact match since the value
> in RAX isn't guaranteed to match what the guest shoved into the VMSA, KVM
> can at least avoid knowingly letting the guest run in an unsupported state.
>
> E.g. if a VM is created with DebugSwap disabled, KVM will intercept #DBs
> and DRs for all vCPUs, even if an AP is "created" with DebugSwap enabled in
> its VMSA.
>
> Note, the GHCB spec only "requires" that "AP use the same interrupt
> injection mechanism as the BSP", but given the disaster that is DebugSwap
> and SEV_FEATURES in general, it's safe to say that AMD didn't consider all
> possible complications with mismatching features between the BSP and APs.
>
> Oppurtunistically fold the check into the relevant request flavors; the
> "request < AP_DESTROY" check is just a bizarre way of implementing the
> AP_CREATE_ON_INIT => AP_CREATE fallthrough.
s/Oppurtunistically/Opportunistically/
Yes, with the change in patch #4 to not kick vCPUs on AP creation
failure, that check can now be moved to the switch statement.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Fixes: e366f92ea99e ("KVM: SEV: Support SEV-SNP AP Creation NAE event")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 23 ++++++++---------------
> 1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 07125b2cf0a6..8425198c5204 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3934,6 +3934,7 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
>
> static int sev_snp_ap_creation(struct vcpu_svm *svm)
> {
> + struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);
> struct kvm_vcpu *vcpu = &svm->vcpu;
> struct kvm_vcpu *target_vcpu;
> struct vcpu_svm *target_svm;
> @@ -3965,26 +3966,18 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
>
> - /* Interrupt injection mode shouldn't change for AP creation */
> - if (request < SVM_VMGEXIT_AP_DESTROY) {
> - u64 sev_features;
> -
> - sev_features = vcpu->arch.regs[VCPU_REGS_RAX];
> - sev_features ^= to_kvm_sev_info(svm->vcpu.kvm)->vmsa_features;
> -
> - if (sev_features & SVM_SEV_FEAT_INT_INJ_MODES) {
> - vcpu_unimpl(vcpu, "vmgexit: invalid AP injection mode [%#lx] from guest\n",
> - vcpu->arch.regs[VCPU_REGS_RAX]);
> - ret = -EINVAL;
> - goto out;
> - }
> - }
> -
> switch (request) {
> case SVM_VMGEXIT_AP_CREATE_ON_INIT:
> kick = false;
> fallthrough;
> case SVM_VMGEXIT_AP_CREATE:
> + if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
> + vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
> + vcpu->arch.regs[VCPU_REGS_RAX], sev->vmsa_features);
> + ret = -EINVAL;
> + goto out;
> + }
> +
> if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
> vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
> svm->vmcb->control.exit_info_2);
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (4 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 05/10] KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-19 6:19 ` Gupta, Pankaj
2025-02-24 21:48 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 07/10] KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling Sean Christopherson
` (4 subsequent siblings)
10 siblings, 2 replies; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Drop the local "kick" variable and the unnecessary "fallthrough" logic
from sev_snp_ap_creation(), and simply pivot on the request when deciding
whether or not to immediately force a state update on the target vCPU.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 8425198c5204..7f6c8fedb235 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3940,7 +3940,6 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
struct vcpu_svm *target_svm;
unsigned int request;
unsigned int apic_id;
- bool kick;
int ret;
request = lower_32_bits(svm->vmcb->control.exit_info_1);
@@ -3958,18 +3957,10 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
target_svm = to_svm(target_vcpu);
- /*
- * The target vCPU is valid, so the vCPU will be kicked unless the
- * request is for CREATE_ON_INIT.
- */
- kick = true;
-
mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
switch (request) {
case SVM_VMGEXIT_AP_CREATE_ON_INIT:
- kick = false;
- fallthrough;
case SVM_VMGEXIT_AP_CREATE:
if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
@@ -4014,7 +4005,11 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
target_svm->sev_es.snp_ap_waiting_for_reset = true;
- if (kick) {
+ /*
+ * Unless Creation is deferred until INIT, signal the vCPU to update
+ * its state.
+ */
+ if (request != SVM_VMGEXIT_AP_CREATE_ON_INIT) {
kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
kvm_vcpu_kick(target_vcpu);
}
--
2.48.1.601.g30ceb7b040-goog
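The shape of the refactor above, replacing the local kick flag and fallthrough with a direct test on the request, can be sketched in standalone form. The enum values below are hypothetical stand-ins for the SVM_VMGEXIT_AP_* request codes:

```c
#include <assert.h>

/* Hypothetical stand-ins for the SVM_VMGEXIT_AP_* request codes. */
enum { AP_CREATE_ON_INIT, AP_CREATE, AP_DESTROY };

/* Before: a local flag cleared via fallthrough in the switch. */
static int should_kick_before(int request)
{
	int kick = 1;

	switch (request) {
	case AP_CREATE_ON_INIT:
		kick = 0;
		/* fallthrough */
	case AP_CREATE:
	case AP_DESTROY:
		break;
	}
	return kick;
}

/* After: pivot directly on the request; no flag, no fallthrough. */
static int should_kick_after(int request)
{
	return request != AP_CREATE_ON_INIT;
}
```

The two forms are equivalent for every request flavor, which is why the patch can claim "no functional change intended".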
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling
2025-02-19 1:27 ` [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling Sean Christopherson
@ 2025-02-19 6:19 ` Gupta, Pankaj
2025-02-24 21:48 ` Tom Lendacky
1 sibling, 0 replies; 33+ messages in thread
From: Gupta, Pankaj @ 2025-02-19 6:19 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
On 2/19/2025 2:27 AM, Sean Christopherson wrote:
> Drop the local "kick" variable and the unnecessary "fallthrough" logic
> from sev_snp_ap_creation(), and simply pivot on the request when deciding
> whether or not to immediately force a state update on the target vCPU.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Straightforward cleanup.
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 8425198c5204..7f6c8fedb235 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3940,7 +3940,6 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> struct vcpu_svm *target_svm;
> unsigned int request;
> unsigned int apic_id;
> - bool kick;
> int ret;
>
> request = lower_32_bits(svm->vmcb->control.exit_info_1);
> @@ -3958,18 +3957,10 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> target_svm = to_svm(target_vcpu);
>
> - /*
> - * The target vCPU is valid, so the vCPU will be kicked unless the
> - * request is for CREATE_ON_INIT.
> - */
> - kick = true;
> -
> mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
>
> switch (request) {
> case SVM_VMGEXIT_AP_CREATE_ON_INIT:
> - kick = false;
> - fallthrough;
> case SVM_VMGEXIT_AP_CREATE:
> if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
> vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
> @@ -4014,7 +4005,11 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> target_svm->sev_es.snp_ap_waiting_for_reset = true;
>
> - if (kick) {
> + /*
> + * Unless Creation is deferred until INIT, signal the vCPU to update
> + * its state.
> + */
> + if (request != SVM_VMGEXIT_AP_CREATE_ON_INIT) {
> kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
> kvm_vcpu_kick(target_vcpu);
> }
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling
2025-02-19 1:27 ` [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling Sean Christopherson
2025-02-19 6:19 ` Gupta, Pankaj
@ 2025-02-24 21:48 ` Tom Lendacky
1 sibling, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 21:48 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> Drop the local "kick" variable and the unnecessary "fallthrough" logic
> from sev_snp_ap_creation(), and simply pivot on the request when deciding
> whether or not to immediately force a state update on the target vCPU.
>
> No functional change intended.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 8425198c5204..7f6c8fedb235 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3940,7 +3940,6 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> struct vcpu_svm *target_svm;
> unsigned int request;
> unsigned int apic_id;
> - bool kick;
> int ret;
>
> request = lower_32_bits(svm->vmcb->control.exit_info_1);
> @@ -3958,18 +3957,10 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> target_svm = to_svm(target_vcpu);
>
> - /*
> - * The target vCPU is valid, so the vCPU will be kicked unless the
> - * request is for CREATE_ON_INIT.
> - */
> - kick = true;
> -
> mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
>
> switch (request) {
> case SVM_VMGEXIT_AP_CREATE_ON_INIT:
> - kick = false;
> - fallthrough;
> case SVM_VMGEXIT_AP_CREATE:
> if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
> vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
> @@ -4014,7 +4005,11 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
>
> target_svm->sev_es.snp_ap_waiting_for_reset = true;
>
> - if (kick) {
> + /*
> + * Unless Creation is deferred until INIT, signal the vCPU to update
> + * its state.
> + */
> + if (request != SVM_VMGEXIT_AP_CREATE_ON_INIT) {
> kvm_make_request(KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, target_vcpu);
> kvm_vcpu_kick(target_vcpu);
> }
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 07/10] KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (5 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 06/10] KVM: SVM: Simplify request+kick logic in SNP AP Creation handling Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-24 21:49 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 08/10] KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa Sean Christopherson
` (3 subsequent siblings)
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Use guard(mutex) in sev_snp_ap_creation() and modify the error paths to
return directly instead of jumping to a common exit point.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 7f6c8fedb235..241cf7769508 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3940,7 +3940,6 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
struct vcpu_svm *target_svm;
unsigned int request;
unsigned int apic_id;
- int ret;
request = lower_32_bits(svm->vmcb->control.exit_info_1);
apic_id = upper_32_bits(svm->vmcb->control.exit_info_1);
@@ -3953,11 +3952,9 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
return -EINVAL;
}
- ret = 0;
-
target_svm = to_svm(target_vcpu);
- mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
+ guard(mutex)(&target_svm->sev_es.snp_vmsa_mutex);
switch (request) {
case SVM_VMGEXIT_AP_CREATE_ON_INIT:
@@ -3965,15 +3962,13 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
vcpu->arch.regs[VCPU_REGS_RAX], sev->vmsa_features);
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
svm->vmcb->control.exit_info_2);
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
/*
@@ -3987,8 +3982,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
vcpu_unimpl(vcpu,
"vmgexit: AP VMSA address [%llx] from guest is unsafe as it is 2M aligned\n",
svm->vmcb->control.exit_info_2);
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
@@ -3999,8 +3993,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
default:
vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
request);
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
target_svm->sev_es.snp_ap_waiting_for_reset = true;
@@ -4014,10 +4007,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
kvm_vcpu_kick(target_vcpu);
}
-out:
- mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
-
- return ret;
+ return 0;
}
static int snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp_gpa)
--
2.48.1.601.g30ceb7b040-goog
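For readers unfamiliar with guard(mutex), here is a rough userspace analog built on the same compiler cleanup attribute (a GCC/Clang extension) that underpins the kernel's cleanup.h machinery. This is an illustrative sketch, not kernel code; the names and the -22 stand-in for -EINVAL are made up:

```c
#include <assert.h>
#include <pthread.h>

/* Run pthread_mutex_unlock() automatically when the guard leaves scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

#define scoped_guard_mutex(m)						\
	pthread_mutex_t *__guard					\
		__attribute__((cleanup(unlock_cleanup))) = (m);		\
	pthread_mutex_lock(__guard)

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * With the guard in place, error paths can "return -EINVAL" directly
 * instead of jumping to a common "out:" label that drops the lock.
 */
static int validate(int feats_ok, int addr_ok)
{
	scoped_guard_mutex(&lock);

	if (!feats_ok)
		return -22;	/* stand-in for -EINVAL; lock released */
	if (!addr_ok)
		return -22;

	return 0;		/* lock released here too */
}
```

Every return path releases the lock, which is exactly what lets the patch delete the "out:" trampoline and the local ret variable.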
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 07/10] KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling
2025-02-19 1:27 ` [PATCH 07/10] KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling Sean Christopherson
@ 2025-02-24 21:49 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 21:49 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> Use guard(mutex) in sev_snp_ap_creation() and modify the error paths to
> return directly instead of jumping to a common exit point.
>
> No functional change intended.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 22 ++++++----------------
> 1 file changed, 6 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 7f6c8fedb235..241cf7769508 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3940,7 +3940,6 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> struct vcpu_svm *target_svm;
> unsigned int request;
> unsigned int apic_id;
> - int ret;
>
> request = lower_32_bits(svm->vmcb->control.exit_info_1);
> apic_id = upper_32_bits(svm->vmcb->control.exit_info_1);
> @@ -3953,11 +3952,9 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> return -EINVAL;
> }
>
> - ret = 0;
> -
> target_svm = to_svm(target_vcpu);
>
> - mutex_lock(&target_svm->sev_es.snp_vmsa_mutex);
> + guard(mutex)(&target_svm->sev_es.snp_vmsa_mutex);
>
> switch (request) {
> case SVM_VMGEXIT_AP_CREATE_ON_INIT:
> @@ -3965,15 +3962,13 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> if (vcpu->arch.regs[VCPU_REGS_RAX] != sev->vmsa_features) {
> vcpu_unimpl(vcpu, "vmgexit: mismatched AP sev_features [%#lx] != [%#llx] from guest\n",
> vcpu->arch.regs[VCPU_REGS_RAX], sev->vmsa_features);
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> if (!page_address_valid(vcpu, svm->vmcb->control.exit_info_2)) {
> vcpu_unimpl(vcpu, "vmgexit: invalid AP VMSA address [%#llx] from guest\n",
> svm->vmcb->control.exit_info_2);
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> /*
> @@ -3987,8 +3982,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> vcpu_unimpl(vcpu,
> "vmgexit: AP VMSA address [%llx] from guest is unsafe as it is 2M aligned\n",
> svm->vmcb->control.exit_info_2);
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> target_svm->sev_es.snp_vmsa_gpa = svm->vmcb->control.exit_info_2;
> @@ -3999,8 +3993,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> default:
> vcpu_unimpl(vcpu, "vmgexit: invalid AP creation request [%#x] from guest\n",
> request);
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> target_svm->sev_es.snp_ap_waiting_for_reset = true;
> @@ -4014,10 +4007,7 @@ static int sev_snp_ap_creation(struct vcpu_svm *svm)
> kvm_vcpu_kick(target_vcpu);
> }
>
> -out:
> - mutex_unlock(&target_svm->sev_es.snp_vmsa_mutex);
> -
> - return ret;
> + return 0;
> }
>
> static int snp_handle_guest_req(struct vcpu_svm *svm, gpa_t req_gpa, gpa_t resp_gpa)
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 08/10] KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (6 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 07/10] KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-24 21:58 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 09/10] KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates Sean Christopherson
` (2 subsequent siblings)
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Mark the VMCB dirty, i.e. zero control.clean, prior to handling the new
VMSA. Nothing in the VALID_PAGE() case touches control.clean, and
isolating the VALID_PAGE() code will allow simplifying the overall logic.
Note, the VMCB probably doesn't need to be marked dirty when the VMSA is
invalid, as KVM will disallow running the vCPU in such a state. But it
also doesn't hurt anything.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 241cf7769508..3a531232c3a1 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3852,6 +3852,12 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
/* Clear use of the VMSA */
svm->vmcb->control.vmsa_pa = INVALID_PAGE;
+ /*
+ * When replacing the VMSA during SEV-SNP AP creation,
+ * mark the VMCB dirty so that full state is always reloaded.
+ */
+ vmcb_mark_all_dirty(svm->vmcb);
+
if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
struct kvm_memory_slot *slot;
@@ -3897,12 +3903,6 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
kvm_release_page_clean(page);
}
- /*
- * When replacing the VMSA during SEV-SNP AP creation,
- * mark the VMCB dirty so that full state is always reloaded.
- */
- vmcb_mark_all_dirty(svm->vmcb);
-
return 0;
}
--
2.48.1.601.g30ceb7b040-goog
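The behavioral nuance the commit message hedges on ("probably doesn't need to be marked dirty when the VMSA is invalid") can be seen in a minimal sketch: before the patch, early error returns inside the VALID_PAGE() block skipped the dirty marking; after it, every path marks the VMCB dirty. The names below are hypothetical stand-ins for the real KVM structures:

```c
#include <assert.h>

static int dirty;	/* stand-in for zeroing vmcb->control.clean */

/* Old shape: early -EINVAL returns skip the dirty marking. */
static int update_old(int gpa_valid, int slot_ok)
{
	dirty = 0;
	if (gpa_valid) {
		if (!slot_ok)
			return -22;	/* returns without marking dirty */
		/* ... install the new VMSA ... */
	}
	dirty = 1;			/* vmcb_mark_all_dirty() */
	return 0;
}

/* New shape: mark dirty up front, so every path, including the
 * (harmless) error paths, sees a fully-dirty VMCB. */
static int update_new(int gpa_valid, int slot_ok)
{
	dirty = 0;
	dirty = 1;			/* vmcb_mark_all_dirty() */
	if (gpa_valid) {
		if (!slot_ok)
			return -22;
		/* ... install the new VMSA ... */
	}
	return 0;
}
```

Nothing between the two placements reads the dirty state, so the success paths are unchanged; only the error paths gain the (benign) marking.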
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 08/10] KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa
2025-02-19 1:27 ` [PATCH 08/10] KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa Sean Christopherson
@ 2025-02-24 21:58 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 21:58 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> Mark the VMCB dirty, i.e. zero control.clean, prior to handling the new
> VMSA. Nothing in the VALID_PAGE() case touches control.clean, and
> isolating the VALID_PAGE() code will allow simplifying the overall logic.
>
> Note, the VMCB probably doesn't need to be marked dirty when the VMSA is
> invalid, as KVM will disallow running the vCPU in such a state. But it
> also doesn't hurt anything.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 241cf7769508..3a531232c3a1 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3852,6 +3852,12 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
> /* Clear use of the VMSA */
> svm->vmcb->control.vmsa_pa = INVALID_PAGE;
>
> + /*
> + * When replacing the VMSA during SEV-SNP AP creation,
> + * mark the VMCB dirty so that full state is always reloaded.
> + */
> + vmcb_mark_all_dirty(svm->vmcb);
> +
> if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
> gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
> struct kvm_memory_slot *slot;
> @@ -3897,12 +3903,6 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
> kvm_release_page_clean(page);
> }
>
> - /*
> - * When replacing the VMSA during SEV-SNP AP creation,
> - * mark the VMCB dirty so that full state is always reloaded.
> - */
> - vmcb_mark_all_dirty(svm->vmcb);
> -
> return 0;
> }
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 09/10] KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (7 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 08/10] KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-24 22:57 ` Tom Lendacky
2025-02-19 1:27 ` [PATCH 10/10] KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure Sean Christopherson
2025-02-20 22:51 ` [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Tom Lendacky
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
Use guard(mutex) in sev_snp_init_protected_guest_state() and pull in its
lock-protected inner helper. Without an unlock trampoline (and even with
one), there is no real need for an inner helper. Eliminating the helper
also avoids having to fixup the open coded "lockdep" WARN_ON().
Opportunistically drop the error message if KVM can't obtain the pfn for
the new target VMSA. The error message provides zero information that
can't be gleaned from the fact that the vCPU is stuck.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 122 ++++++++++++++++++-----------------------
1 file changed, 53 insertions(+), 69 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 3a531232c3a1..15c324b61b24 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3839,11 +3839,26 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
BUG();
}
-static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
+/*
+ * Invoked as part of svm_vcpu_reset() processing of an init event.
+ */
+void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ struct kvm_memory_slot *slot;
+ struct page *page;
+ kvm_pfn_t pfn;
+ gfn_t gfn;
- WARN_ON(!mutex_is_locked(&svm->sev_es.snp_vmsa_mutex));
+ if (!sev_snp_guest(vcpu->kvm))
+ return;
+
+ guard(mutex)(&svm->sev_es.snp_vmsa_mutex);
+
+ if (!svm->sev_es.snp_ap_waiting_for_reset)
+ return;
+
+ svm->sev_es.snp_ap_waiting_for_reset = false;
/* Mark the vCPU as offline and not runnable */
vcpu->arch.pv.pv_unhalted = false;
@@ -3858,78 +3873,47 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
*/
vmcb_mark_all_dirty(svm->vmcb);
- if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
- gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
- struct kvm_memory_slot *slot;
- struct page *page;
- kvm_pfn_t pfn;
-
- slot = gfn_to_memslot(vcpu->kvm, gfn);
- if (!slot)
- return -EINVAL;
-
- /*
- * The new VMSA will be private memory guest memory, so
- * retrieve the PFN from the gmem backend.
- */
- if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, &page, NULL))
- return -EINVAL;
-
- /*
- * From this point forward, the VMSA will always be a
- * guest-mapped page rather than the initial one allocated
- * by KVM in svm->sev_es.vmsa. In theory, svm->sev_es.vmsa
- * could be free'd and cleaned up here, but that involves
- * cleanups like wbinvd_on_all_cpus() which would ideally
- * be handled during teardown rather than guest boot.
- * Deferring that also allows the existing logic for SEV-ES
- * VMSAs to be re-used with minimal SNP-specific changes.
- */
- svm->sev_es.snp_has_guest_vmsa = true;
-
- /* Use the new VMSA */
- svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
-
- /* Mark the vCPU as runnable */
- kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
-
- svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
-
- /*
- * gmem pages aren't currently migratable, but if this ever
- * changes then care should be taken to ensure
- * svm->sev_es.vmsa is pinned through some other means.
- */
- kvm_release_page_clean(page);
- }
-
- return 0;
-}
-
-/*
- * Invoked as part of svm_vcpu_reset() processing of an init event.
- */
-void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
- int ret;
-
- if (!sev_snp_guest(vcpu->kvm))
+ if (!VALID_PAGE(svm->sev_es.snp_vmsa_gpa))
return;
- mutex_lock(&svm->sev_es.snp_vmsa_mutex);
+ gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
- if (!svm->sev_es.snp_ap_waiting_for_reset)
- goto unlock;
-
- svm->sev_es.snp_ap_waiting_for_reset = false;
+ slot = gfn_to_memslot(vcpu->kvm, gfn);
+ if (!slot)
+ return;
- ret = __sev_snp_update_protected_guest_state(vcpu);
- if (ret)
- vcpu_unimpl(vcpu, "snp: AP state update on init failed\n");
+ /*
+ * The new VMSA will be private memory guest memory, so retrieve the
+ * PFN from the gmem backend.
+ */
+ if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, &page, NULL))
+ return;
-unlock:
- mutex_unlock(&svm->sev_es.snp_vmsa_mutex);
+ /*
+ * From this point forward, the VMSA will always be a guest-mapped page
+ * rather than the initial one allocated by KVM in svm->sev_es.vmsa. In
+ * theory, svm->sev_es.vmsa could be free'd and cleaned up here, but
+ * that involves cleanups like wbinvd_on_all_cpus() which would ideally
+ * be handled during teardown rather than guest boot. Deferring that
+ * also allows the existing logic for SEV-ES VMSAs to be re-used with
+ * minimal SNP-specific changes.
+ */
+ svm->sev_es.snp_has_guest_vmsa = true;
+
+ /* Use the new VMSA */
+ svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
+
+ /* Mark the vCPU as runnable */
+ kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
+
+ svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+
+ /*
+ * gmem pages aren't currently migratable, but if this ever changes
+ * then care should be taken to ensure svm->sev_es.vmsa is pinned
+ * through some other means.
+ */
+ kvm_release_page_clean(page);
}
static int sev_snp_ap_creation(struct vcpu_svm *svm)
--
2.48.1.601.g30ceb7b040-goog
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCH 09/10] KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates
2025-02-19 1:27 ` [PATCH 09/10] KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates Sean Christopherson
@ 2025-02-24 22:57 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-24 22:57 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> Use guard(mutex) in sev_snp_init_protected_guest_state() and pull in its
> lock-protected inner helper. Without an unlock trampoline (and even with
> one), there is no real need for an inner helper. Eliminating the helper
> also avoids having to fixup the open coded "lockdep" WARN_ON().
>
> Opportunistically drop the error message if KVM can't obtain the pfn for
> the new target VMSA. The error message provides zero information that
> can't be gleaned from the fact that the vCPU is stuck.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/svm/sev.c | 122 ++++++++++++++++++-----------------------
> 1 file changed, 53 insertions(+), 69 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 3a531232c3a1..15c324b61b24 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3839,11 +3839,26 @@ static int snp_begin_psc(struct vcpu_svm *svm, struct psc_buffer *psc)
> BUG();
> }
>
> -static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
> +/*
> + * Invoked as part of svm_vcpu_reset() processing of an init event.
> + */
> +void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> + struct kvm_memory_slot *slot;
> + struct page *page;
> + kvm_pfn_t pfn;
> + gfn_t gfn;
>
> - WARN_ON(!mutex_is_locked(&svm->sev_es.snp_vmsa_mutex));
> + if (!sev_snp_guest(vcpu->kvm))
> + return;
> +
> + guard(mutex)(&svm->sev_es.snp_vmsa_mutex);
> +
> + if (!svm->sev_es.snp_ap_waiting_for_reset)
> + return;
> +
> + svm->sev_es.snp_ap_waiting_for_reset = false;
>
> /* Mark the vCPU as offline and not runnable */
> vcpu->arch.pv.pv_unhalted = false;
> @@ -3858,78 +3873,47 @@ static int __sev_snp_update_protected_guest_state(struct kvm_vcpu *vcpu)
> */
> vmcb_mark_all_dirty(svm->vmcb);
>
> - if (VALID_PAGE(svm->sev_es.snp_vmsa_gpa)) {
> - gfn_t gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
> - struct kvm_memory_slot *slot;
> - struct page *page;
> - kvm_pfn_t pfn;
> -
> - slot = gfn_to_memslot(vcpu->kvm, gfn);
> - if (!slot)
> - return -EINVAL;
> -
> - /*
> - * The new VMSA will be private memory guest memory, so
> - * retrieve the PFN from the gmem backend.
> - */
> - if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, &page, NULL))
> - return -EINVAL;
> -
> - /*
> - * From this point forward, the VMSA will always be a
> - * guest-mapped page rather than the initial one allocated
> - * by KVM in svm->sev_es.vmsa. In theory, svm->sev_es.vmsa
> - * could be free'd and cleaned up here, but that involves
> - * cleanups like wbinvd_on_all_cpus() which would ideally
> - * be handled during teardown rather than guest boot.
> - * Deferring that also allows the existing logic for SEV-ES
> - * VMSAs to be re-used with minimal SNP-specific changes.
> - */
> - svm->sev_es.snp_has_guest_vmsa = true;
> -
> - /* Use the new VMSA */
> - svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
> -
> - /* Mark the vCPU as runnable */
> - kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
> -
> - svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
> -
> - /*
> - * gmem pages aren't currently migratable, but if this ever
> - * changes then care should be taken to ensure
> - * svm->sev_es.vmsa is pinned through some other means.
> - */
> - kvm_release_page_clean(page);
> - }
> -
> - return 0;
> -}
> -
> -/*
> - * Invoked as part of svm_vcpu_reset() processing of an init event.
> - */
> -void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
> -{
> - struct vcpu_svm *svm = to_svm(vcpu);
> - int ret;
> -
> - if (!sev_snp_guest(vcpu->kvm))
> + if (!VALID_PAGE(svm->sev_es.snp_vmsa_gpa))
> return;
>
> - mutex_lock(&svm->sev_es.snp_vmsa_mutex);
> + gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
>
> - if (!svm->sev_es.snp_ap_waiting_for_reset)
> - goto unlock;
> -
> - svm->sev_es.snp_ap_waiting_for_reset = false;
> + slot = gfn_to_memslot(vcpu->kvm, gfn);
> + if (!slot)
> + return;
>
> - ret = __sev_snp_update_protected_guest_state(vcpu);
> - if (ret)
> - vcpu_unimpl(vcpu, "snp: AP state update on init failed\n");
> + /*
> + * The new VMSA will be private memory guest memory, so retrieve the
> + * PFN from the gmem backend.
> + */
> + if (kvm_gmem_get_pfn(vcpu->kvm, slot, gfn, &pfn, &page, NULL))
> + return;
>
> -unlock:
> - mutex_unlock(&svm->sev_es.snp_vmsa_mutex);
> + /*
> + * From this point forward, the VMSA will always be a guest-mapped page
> + * rather than the initial one allocated by KVM in svm->sev_es.vmsa. In
> + * theory, svm->sev_es.vmsa could be free'd and cleaned up here, but
> + * that involves cleanups like wbinvd_on_all_cpus() which would ideally
> + * be handled during teardown rather than guest boot. Deferring that
> + * also allows the existing logic for SEV-ES VMSAs to be re-used with
> + * minimal SNP-specific changes.
> + */
> + svm->sev_es.snp_has_guest_vmsa = true;
> +
> + /* Use the new VMSA */
> + svm->vmcb->control.vmsa_pa = pfn_to_hpa(pfn);
> +
> + /* Mark the vCPU as runnable */
> + kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
> +
> + svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
> +
> + /*
> + * gmem pages aren't currently migratable, but if this ever changes
> + * then care should be taken to ensure svm->sev_es.vmsa is pinned
> + * through some other means.
> + */
> + kvm_release_page_clean(page);
> }
>
> static int sev_snp_ap_creation(struct vcpu_svm *svm)
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCH 10/10] KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (8 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 09/10] KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates Sean Christopherson
@ 2025-02-19 1:27 ` Sean Christopherson
2025-02-25 0:00 ` Tom Lendacky
2025-02-20 22:51 ` [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Tom Lendacky
10 siblings, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2025-02-19 1:27 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips, Tom Lendacky,
Alexey Kardashevskiy
When processing an SNP AP Creation event, invalidate the "next" VMSA GPA
even if acquiring the page/pfn for the new VMSA fails. In practice, the
next GPA will never be used regardless of whether or not it's invalidated,
as the entire flow is guarded by snp_ap_waiting_for_reset, and said guard
and snp_vmsa_gpa are always written as a pair. But that's really hard to
see in the code.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/sev.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 15c324b61b24..7345cac6f93a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3877,6 +3877,7 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
return;
gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
+ svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
slot = gfn_to_memslot(vcpu->kvm, gfn);
if (!slot)
@@ -3906,8 +3907,6 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
/* Mark the vCPU as runnable */
kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
- svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
-
/*
* gmem pages aren't currently migratable, but if this ever changes
* then care should be taken to ensure svm->sev_es.vmsa is pinned
--
2.48.1.601.g30ceb7b040-goog
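The consume-then-invalidate ordering the patch moves to can be sketched in isolation as follows. This is a hypothetical standalone pattern, not the actual KVM code; the struct and function names are illustrative stand-ins for `snp_vmsa_gpa` handling:

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_PAGE ((uint64_t)-1)

struct vcpu_state {
	uint64_t next_vmsa_gpa;	/* stand-in for svm->sev_es.snp_vmsa_gpa */
};

/*
 * Consume the pending GPA and invalidate it immediately, *before* any
 * work that can fail.  Every early-exit path after this point then
 * trivially leaves no stale "next" value behind, which is the property
 * the patch makes obvious in the code.
 */
static bool take_next_vmsa_gpa(struct vcpu_state *v, uint64_t *gpa)
{
	if (v->next_vmsa_gpa == INVALID_PAGE)
		return false;

	*gpa = v->next_vmsa_gpa;
	v->next_vmsa_gpa = INVALID_PAGE;
	return true;
}
```

With this shape, correctness no longer depends on the caller's guard (the `snp_ap_waiting_for_reset` pairing described above) to keep a stale GPA from being reused.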
* Re: [PATCH 10/10] KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure
2025-02-19 1:27 ` [PATCH 10/10] KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure Sean Christopherson
@ 2025-02-25 0:00 ` Tom Lendacky
0 siblings, 0 replies; 33+ messages in thread
From: Tom Lendacky @ 2025-02-25 0:00 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:27, Sean Christopherson wrote:
> When processing an SNP AP Creation event, invalidate the "next" VMSA GPA
> even if acquiring the page/pfn for the new VMSA fails. In practice, the
> next GPA will never be used regardless of whether or not it's invalidated,
> as the entire flow is guarded by snp_ap_waiting_for_reset, and said guard
> and snp_vmsa_gpa are always written as a pair. But that's really hard to
> see in the code.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 15c324b61b24..7345cac6f93a 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3877,6 +3877,7 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
> return;
>
> gfn = gpa_to_gfn(svm->sev_es.snp_vmsa_gpa);
> + svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
>
> slot = gfn_to_memslot(vcpu->kvm, gfn);
> if (!slot)
> @@ -3906,8 +3907,6 @@ void sev_snp_init_protected_guest_state(struct kvm_vcpu *vcpu)
> /* Mark the vCPU as runnable */
> kvm_set_mp_state(vcpu, KVM_MP_STATE_RUNNABLE);
>
> - svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
> -
> /*
> * gmem pages aren't currently migratable, but if this ever changes
> * then care should be taken to ensure svm->sev_es.vmsa is pinned
* Re: [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES
2025-02-19 1:26 [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Sean Christopherson
` (9 preceding siblings ...)
2025-02-19 1:27 ` [PATCH 10/10] KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure Sean Christopherson
@ 2025-02-20 22:51 ` Tom Lendacky
2025-02-25 0:02 ` Tom Lendacky
10 siblings, 1 reply; 33+ messages in thread
From: Tom Lendacky @ 2025-02-20 22:51 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/18/25 19:26, Sean Christopherson wrote:
> This is a hastily thrown together series, barely above RFC, to try and
> address the worst of the issues that arise with guest controlled SEV
> features (thanks AP creation)[1].
>
> In addition to the initial flaws with DebugSwap, I came across a variety
> of issues when trying to figure out how best to handle SEV features in
> general. E.g. AFAICT, KVM doesn't guard against userspace manually making
> a vCPU RUNNABLE after it has been DESTROYED (or after a failed CREATE).
>
> This is essentially compile-tested only, as I don't have easy access to a
> system with SNP enabled. I ran the SEV-ES selftests, but that's not much
> in the way of test coverage.
>
> AMD folks, I would greatly appreciate reviews, testing, and most importantly,
> confirmation that all of this actually works the way I think it does.
A quick test of a 64 vCPU SNP guest booted successfully, so that's a
good start. I'll take a closer look at these patches over the next few days.
Thanks,
Tom
>
> [1] https://lore.kernel.org/all/Z7TSef290IQxQhT2@google.com
>
> Sean Christopherson (10):
> KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
> KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
> KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid
> VMSA
> KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error
> KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view
> KVM: SVM: Simplify request+kick logic in SNP AP Creation handling
> KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling
> KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa
> KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates
> KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure
>
> arch/x86/kvm/svm/sev.c | 218 +++++++++++++++++++----------------------
> arch/x86/kvm/svm/svm.c | 7 +-
> arch/x86/kvm/svm/svm.h | 2 +-
> 3 files changed, 106 insertions(+), 121 deletions(-)
>
>
> base-commit: fed48e2967f402f561d80075a20c5c9e16866e53
* Re: [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES
2025-02-20 22:51 ` [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES Tom Lendacky
@ 2025-02-25 0:02 ` Tom Lendacky
2025-02-25 2:21 ` Kim Phillips
0 siblings, 1 reply; 33+ messages in thread
From: Tom Lendacky @ 2025-02-25 0:02 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Kim Phillips,
Alexey Kardashevskiy
On 2/20/25 16:51, Tom Lendacky wrote:
> On 2/18/25 19:26, Sean Christopherson wrote:
>> This is a hastily thrown together series, barely above RFC, to try and
>> address the worst of the issues that arise with guest controlled SEV
>> features (thanks AP creation)[1].
>>
>> In addition to the initial flaws with DebugSwap, I came across a variety
>> of issues when trying to figure out how best to handle SEV features in
>> general. E.g. AFAICT, KVM doesn't guard against userspace manually making
>> a vCPU RUNNABLE after it has been DESTROYED (or after a failed CREATE).
>>
>> This is essentially compile-tested only, as I don't have easy access to a
>> system with SNP enabled. I ran the SEV-ES selftests, but that's not much
>> in the way of test coverage.
>>
>> AMD folks, I would greatly appreciate reviews, testing, and most importantly,
>> confirmation that all of this actually works the way I think it does.
>
> A quick test of a 64 vCPU SNP guest booted successfully, so that's a
> good start. I'll take a closer look at these patches over the next few days.
Everything looks good. I'm going to try messing around with the
DebugSwap feature bit just to try some of those odd cases and make sure
everything does what it is supposed to. Should have results in a day or two.
Thanks,
Tom
>
> Thanks,
> Tom
>
>>
>> [1] https://lore.kernel.org/all/Z7TSef290IQxQhT2@google.com
>>
>> Sean Christopherson (10):
>> KVM: SVM: Save host DR masks but NOT DRs on CPUs with DebugSwap
>> KVM: SVM: Don't rely on DebugSwap to restore host DR0..DR3
>> KVM: SVM: Terminate the VM if a SEV-ES+ guest is run with an invalid
>> VMSA
>> KVM: SVM: Don't change target vCPU state on AP Creation VMGEXIT error
>> KVM: SVM: Require AP's "requested" SEV_FEATURES to match KVM's view
>> KVM: SVM: Simplify request+kick logic in SNP AP Creation handling
>> KVM: SVM: Use guard(mutex) to simplify SNP AP Creation error handling
>> KVM: SVM: Mark VMCB dirty before processing incoming snp_vmsa_gpa
>> KVM: SVM: Use guard(mutex) to simplify SNP vCPU state updates
>> KVM: SVM: Invalidate "next" SNP VMSA GPA even on failure
>>
>> arch/x86/kvm/svm/sev.c | 218 +++++++++++++++++++----------------------
>> arch/x86/kvm/svm/svm.c | 7 +-
>> arch/x86/kvm/svm/svm.h | 2 +-
>> 3 files changed, 106 insertions(+), 121 deletions(-)
>>
>>
>> base-commit: fed48e2967f402f561d80075a20c5c9e16866e53
* Re: [PATCH 00/10] KVM: SVM: Attempt to cleanup SEV_FEATURES
2025-02-25 0:02 ` Tom Lendacky
@ 2025-02-25 2:21 ` Kim Phillips
0 siblings, 0 replies; 33+ messages in thread
From: Kim Phillips @ 2025-02-25 2:21 UTC (permalink / raw)
To: Tom Lendacky, Sean Christopherson, Paolo Bonzini
Cc: kvm, linux-kernel, Naveen N Rao, Alexey Kardashevskiy
On 2/24/25 6:02 PM, Tom Lendacky wrote:
> On 2/20/25 16:51, Tom Lendacky wrote:
>> On 2/18/25 19:26, Sean Christopherson wrote:
>>> This is a hastily thrown together series, barely above RFC, to try and
>>> address the worst of the issues that arise with guest controlled SEV
>>> features (thanks AP creation)[1].
>>>
>>> In addition to the initial flaws with DebugSwap, I came across a variety
>>> of issues when trying to figure out how best to handle SEV features in
>>> general. E.g. AFAICT, KVM doesn't guard against userspace manually making
>>> a vCPU RUNNABLE after it has been DESTROYED (or after a failed CREATE).
>>>
>>> This is essentially compile-tested only, as I don't have easy access to a
>>> system with SNP enabled. I ran the SEV-ES selftests, but that's not much
>>> in the way of test coverage.
>>>
>>> AMD folks, I would greatly appreciate reviews, testing, and most importantly,
>>> confirmation that all of this actually works the way I think it does.
>>
>> A quick test of a 64 vCPU SNP guest booted successfully, so that's a
>> good start. I'll take a closer look at these patches over the next few days.
>
> Everything looks good. I'm going to try messing around with the
> DebugSwap feature bit just to try some of those odd cases and make sure
> everything does what it is supposed to. Should have results in a day or two.
My host and guest kernels are based on kvm-x86/next and, following the
instructions under "Tested with:" [1], I don't see gdb stopping on the
watchpoint in the guest gdb session:
...
Reading symbols from a.out...
Hardware watchpoint 1: x
Starting program: /home/ubuntu/a.out
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Inferior 1 (process 957) exited normally]
It happens regardless of the kvm_amd debug_swap= setting, and
regardless of whether this cleanup series is applied or not.
Doing a break on main and running interactively makes gdb stop at main(),
as it should.
Am I doing something wrong? Does anyone know whether
DebugSwap under SEV-ES (not just SNP) was tested?
Thanks,
Kim
[1] https://lore.kernel.org/kvm/20230411125718.2297768-6-aik@amd.com/
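The watchpoint test described above boils down to a program with a watched global. A minimal sketch of such a target follows; the variable name `x` matches the gdb session log ("Hardware watchpoint 1: x"), but everything else is an illustrative assumption, not taken from the referenced instructions:

```c
#include <stdio.h>

/* Global watched by gdb via "watch x" before "run". */
int x;

/*
 * With working data breakpoints (DebugSwap saving/restoring the debug
 * registers across #VMEXIT), gdb should stop at this store.  The failure
 * mode reported above is that the inferior instead exits normally.
 */
void write_watched(void)
{
	x = 42;
	printf("x = %d\n", x);
}
```

Compiled with `-g` and run under gdb in the guest with `watch x` set, the write in `write_watched()` is where the hardware watchpoint is expected to fire.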