* [RFC PATCH v2 01/17] KVM: x86/lapic: Differentiate protected APIC interrupt mechanisms
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 02/17] x86/cpufeatures: Add Secure AVIC CPU feature Neeraj Upadhyay
` (16 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The existing guest_apic_protected boolean flag is insufficient for
handling different protected guest technologies. While both Intel TDX
and AMD SNP (with Secure AVIC) protect the virtual APIC, they use
fundamentally different interrupt delivery mechanisms.
TDX relies on hardware-managed Posted Interrupts, whereas Secure AVIC
requires KVM to perform explicit software-based interrupt injection.
The current flag cannot distinguish between these two models.
To address this, introduce a new flag, prot_apic_intr_inject. This flag
is true for protected guests that require KVM to inject interrupts and
false for those that use a hardware-managed delivery mechanism.
This preparatory change allows subsequent commits to implement the correct
interrupt handling logic for Secure AVIC.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/lapic.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 72de14527698..f48218fd4638 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -70,7 +70,10 @@ struct kvm_lapic {
bool irr_pending;
bool lvt0_in_nmi_mode;
/* Select registers in the vAPIC cannot be read/written. */
- bool guest_apic_protected;
+ struct {
+ bool guest_apic_protected;
+ bool prot_apic_intr_inject;
+ };
/* Number of bits set in ISR. */
s16 isr_count;
/* The highest vector set in ISR; if -1 - invalid, must scan ISR. */
--
2.34.1
* [RFC PATCH v2 02/17] x86/cpufeatures: Add Secure AVIC CPU feature
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 01/17] KVM: x86/lapic: Differentiate protected APIC interrupt mechanisms Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 03/17] KVM: SVM: Add support for Secure AVIC capability in KVM Neeraj Upadhyay
` (15 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Add CPU feature detection for Secure AVIC. The Secure AVIC feature
provides hardware acceleration for performance-sensitive APIC accesses
and support for managing guest-owned APIC state for SEV-SNP guests.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 06fc0479a23f..d855825b1b9e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -449,6 +449,7 @@
#define X86_FEATURE_DEBUG_SWAP (19*32+14) /* "debug_swap" SEV-ES full debug state swap support */
#define X86_FEATURE_RMPREAD (19*32+21) /* RMPREAD instruction */
#define X86_FEATURE_SEGMENTED_RMP (19*32+23) /* Segmented RMP support */
+#define X86_FEATURE_SECURE_AVIC (19*32+26) /* Secure AVIC */
#define X86_FEATURE_ALLOWED_SEV_FEATURES (19*32+27) /* Allowed SEV Features */
#define X86_FEATURE_SVSM (19*32+28) /* "svsm" SVSM present */
#define X86_FEATURE_HV_INUSE_WR_ALLOWED (19*32+30) /* Allow Write to in-use hypervisor-owned pages */
--
2.34.1
* [RFC PATCH v2 03/17] KVM: SVM: Add support for Secure AVIC capability in KVM
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 01/17] KVM: x86/lapic: Differentiate protected APIC interrupt mechanisms Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 02/17] x86/cpufeatures: Add Secure AVIC CPU feature Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 04/17] KVM: SVM: Set guest APIC protection flags for Secure AVIC Neeraj Upadhyay
` (14 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Add support to KVM for determining whether a system is capable of
supporting the Secure AVIC feature.
Secure AVIC support is reported only when all of the following hold:
- the secure_avic module parameter is set,
- the X86_FEATURE_SECURE_AVIC CPU feature bit is set, and
- SNP support is available.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/svm.h | 1 +
arch/x86/kvm/svm/sev.c | 9 +++++++++
2 files changed, 10 insertions(+)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index ffc27f676243..ab3d55654c77 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -299,6 +299,7 @@ static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_
#define SVM_SEV_FEAT_RESTRICTED_INJECTION BIT(3)
#define SVM_SEV_FEAT_ALTERNATE_INJECTION BIT(4)
#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5)
+#define SVM_SEV_FEAT_SECURE_AVIC BIT(16)
#define VMCB_ALLOWED_SEV_FEATURES_VALID BIT_ULL(63)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5bac4d20aec0..b2eae102681c 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -59,6 +59,10 @@ static bool sev_es_debug_swap_enabled = true;
module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0444);
static u64 sev_supported_vmsa_features;
+/* enable/disable SEV-SNP Secure AVIC support */
+bool sev_snp_savic_enabled = true;
+module_param_named(secure_avic, sev_snp_savic_enabled, bool, 0444);
+
#define AP_RESET_HOLD_NONE 0
#define AP_RESET_HOLD_NAE_EVENT 1
#define AP_RESET_HOLD_MSR_PROTO 2
@@ -2911,6 +2915,8 @@ void __init sev_set_cpu_caps(void)
kvm_cpu_cap_set(X86_FEATURE_SEV_SNP);
kvm_caps.supported_vm_types |= BIT(KVM_X86_SNP_VM);
}
+ if (sev_snp_savic_enabled)
+ kvm_cpu_cap_set(X86_FEATURE_SECURE_AVIC);
}
static bool is_sev_snp_initialized(void)
@@ -3075,6 +3081,9 @@ void __init sev_hardware_setup(void)
!cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
sev_es_debug_swap_enabled = false;
+ if (!sev_snp_supported || !cpu_feature_enabled(X86_FEATURE_SECURE_AVIC))
+ sev_snp_savic_enabled = false;
+
sev_supported_vmsa_features = 0;
if (sev_es_debug_swap_enabled)
sev_supported_vmsa_features |= SVM_SEV_FEAT_DEBUG_SWAP;
--
2.34.1
* [RFC PATCH v2 04/17] KVM: SVM: Set guest APIC protection flags for Secure AVIC
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (2 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 03/17] KVM: SVM: Add support for Secure AVIC capability in KVM Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests Neeraj Upadhyay
` (13 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Secure AVIC provides a hardware-backed, protected virtual APIC for
SNP guests. When this feature is active, KVM cannot directly access
the virtual APIC state and must use software-based interrupt injection
to deliver interrupts to the guest.
Introduce a helper, sev_savic_active(), to detect when a VM has Secure AVIC
enabled based on its VMSA features.
At vCPU creation time, use this helper to set the appropriate APIC flags:
- guest_apic_protected is set to true, as the APIC state is not visible
to KVM.
- prot_apic_intr_inject is set to true to signal that the software
injection path must be used for interrupt delivery.
This ensures that the core APIC code can correctly identify and handle
Secure AVIC guests.
This is only an initialization commit; actual support for creating
Secure AVIC enabled guests and for injecting interrupts will be added
in later commits.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/svm.c | 5 +++++
arch/x86/kvm/svm/svm.h | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 8a66e2e985a4..064ec98d7e67 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1300,6 +1300,11 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
if (err)
goto error_free_vmsa_page;
+ if (sev_savic_active(vcpu->kvm)) {
+ vcpu->arch.apic->guest_apic_protected = true;
+ vcpu->arch.apic->prot_apic_intr_inject = true;
+ }
+
svm->msrpm = svm_vcpu_alloc_msrpm();
if (!svm->msrpm) {
err = -ENOMEM;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 70df7c6413cf..1090a48adeda 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -869,6 +869,10 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end);
int sev_gmem_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn, bool is_private);
struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu);
void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa);
+static inline bool sev_savic_active(struct kvm *kvm)
+{
+ return to_kvm_sev_info(kvm)->vmsa_features & SVM_SEV_FEAT_SECURE_AVIC;
+}
#else
static inline struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
{
@@ -899,6 +903,7 @@ static inline int sev_gmem_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn, boo
{
return 0;
}
+static inline bool sev_savic_active(struct kvm *kvm) { return false; }
static inline struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu)
{
--
2.34.1
* [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (3 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 04/17] KVM: SVM: Set guest APIC protection flags for Secure AVIC Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 13:55 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC Neeraj Upadhyay
` (12 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
guest APIC backing page and bitfields to control enablement of Secure
AVIC and whether the guest allows NMIs to be injected by the hypervisor.
This MSR is populated by the guest and can be read by the guest to get
the GPA of the APIC backing page. The MSR can only be accessed in Secure
AVIC mode; accessing it when not in Secure AVIC mode results in #GP. So,
KVM should not intercept it.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/svm/sev.c | 6 +++++-
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b65c3ba5fa14..9f16030dd849 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -707,6 +707,7 @@
#define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
#define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
#define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
+#define MSR_AMD64_SAVIC_CONTROL 0xc0010138
#define MSR_SVSM_CAA 0xc001f000
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index b2eae102681c..afe4127a1918 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4487,7 +4487,8 @@ void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
static void sev_es_init_vmcb(struct vcpu_svm *svm)
{
- struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);
+ struct kvm_vcpu *vcpu = &svm->vcpu;
+ struct kvm_sev_info *sev = to_kvm_sev_info(vcpu->kvm);
struct vmcb *vmcb = svm->vmcb01.ptr;
svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE;
@@ -4546,6 +4547,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
/* Can't intercept XSETBV, HV can't modify XCR0 directly */
svm_clr_intercept(svm, INTERCEPT_XSETBV);
+
+ if (sev_savic_active(vcpu->kvm))
+ svm_set_intercept_for_msr(vcpu, MSR_AMD64_SAVIC_CONTROL, MSR_TYPE_RW, false);
}
void sev_init_vmcb(struct vcpu_svm *svm)
--
2.34.1
* Re: [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests Neeraj Upadhyay
@ 2025-09-23 13:55 ` Tom Lendacky
2025-09-25 5:16 ` Upadhyay, Neeraj
0 siblings, 1 reply; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 13:55 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
> enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
> guest APIC backing page and bitfields to control enablement of Secure
> AVIC and whether the guest allows NMIs to be injected by the hypervisor.
> This MSR is populated by the guest and can be read by the guest to get
> the GPA of the APIC backing page. The MSR can only be accessed in Secure
> AVIC mode; accessing it when not in Secure AVIC mode results in #GP. So,
> KVM should not intercept it.
The reason KVM should not intercept the MSR access is that the guest
would not be able to actually set the MSR if it is intercepted.
Thanks,
Tom
>
> Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/kvm/svm/sev.c | 6 +++++-
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index b65c3ba5fa14..9f16030dd849 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -707,6 +707,7 @@
> #define MSR_AMD64_SEG_RMP_ENABLED_BIT 0
> #define MSR_AMD64_SEG_RMP_ENABLED BIT_ULL(MSR_AMD64_SEG_RMP_ENABLED_BIT)
> #define MSR_AMD64_RMP_SEGMENT_SHIFT(x) (((x) & GENMASK_ULL(13, 8)) >> 8)
> +#define MSR_AMD64_SAVIC_CONTROL 0xc0010138
>
> #define MSR_SVSM_CAA 0xc001f000
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index b2eae102681c..afe4127a1918 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4487,7 +4487,8 @@ void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
>
> static void sev_es_init_vmcb(struct vcpu_svm *svm)
> {
> - struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);
> + struct kvm_vcpu *vcpu = &svm->vcpu;
> + struct kvm_sev_info *sev = to_kvm_sev_info(vcpu->kvm);
> struct vmcb *vmcb = svm->vmcb01.ptr;
>
> svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE;
> @@ -4546,6 +4547,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
>
> /* Can't intercept XSETBV, HV can't modify XCR0 directly */
> svm_clr_intercept(svm, INTERCEPT_XSETBV);
> +
> + if (sev_savic_active(vcpu->kvm))
> + svm_set_intercept_for_msr(vcpu, MSR_AMD64_SAVIC_CONTROL, MSR_TYPE_RW, false);
> }
>
> void sev_init_vmcb(struct vcpu_svm *svm)
* Re: [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
2025-09-23 13:55 ` Tom Lendacky
@ 2025-09-25 5:16 ` Upadhyay, Neeraj
2025-09-25 13:54 ` Tom Lendacky
0 siblings, 1 reply; 32+ messages in thread
From: Upadhyay, Neeraj @ 2025-09-25 5:16 UTC (permalink / raw)
To: Tom Lendacky, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/2025 7:25 PM, Tom Lendacky wrote:
> On 9/23/25 00:03, Neeraj Upadhyay wrote:
>> Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
>> enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
>> guest APIC backing page and bitfields to control enablement of Secure
>> AVIC and whether the guest allows NMIs to be injected by the hypervisor.
>> This MSR is populated by the guest and can be read by the guest to get
>> the GPA of the APIC backing page. The MSR can only be accessed in Secure
>> AVIC mode; accessing it when not in Secure AVIC mode results in #GP. So,
>> KVM should not intercept it.
>
> The reason KVM should not intercept the MSR access is that the guest
> would not be able to actually set the MSR if it is intercepted.
>
Yes, something like below looks ok?
Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
guest APIC backing page and bitfields to control enablement of Secure
AVIC and whether the guest allows NMIs to be injected by the hypervisor.
This MSR is populated by the guest and can be read by the guest to get
the GPA of the APIC backing page. This MSR is only accessible by the
guest when the Secure AVIC feature is active; any other access attempt
will result in a #GP fault. So, KVM should not intercept access to this
MSR, as doing so prevents the guest from successfully reading/writing
its configuration and enabling the feature.
- Neeraj
* Re: [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
2025-09-25 5:16 ` Upadhyay, Neeraj
@ 2025-09-25 13:54 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-25 13:54 UTC (permalink / raw)
To: Upadhyay, Neeraj, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/25/25 00:16, Upadhyay, Neeraj wrote:
>
>
> On 9/23/2025 7:25 PM, Tom Lendacky wrote:
>> On 9/23/25 00:03, Neeraj Upadhyay wrote:
>>> Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
>>> enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
>>> guest APIC backing page and bitfields to control enablement of Secure
>>> AVIC and whether the guest allows NMIs to be injected by the hypervisor.
>>> This MSR is populated by the guest and can be read by the guest to get
>>> the GPA of the APIC backing page. The MSR can only be accessed in Secure
>>> AVIC mode; accessing it when not in Secure AVIC mode results in #GP. So,
>>> KVM should not intercept it.
>>
>> The reason KVM should not intercept the MSR access is that the guest
>> would not be able to actually set the MSR if it is intercepted.
>>
>
> Yes, something like below looks ok?
>
> Disable interception for SECURE_AVIC_CONTROL MSR for Secure AVIC
> enabled guests. The SECURE_AVIC_CONTROL MSR holds the GPA of the
> guest APIC backing page and bitfields to control enablement of Secure
> AVIC and whether the guest allows NMIs to be injected by the hypervisor.
> This MSR is populated by the guest and can be read by the guest to get
> the GPA of the APIC backing page. This MSR is only accessible by the
> guest when the Secure AVIC feature is active; any other access attempt
> will result in a #GP fault. So, KVM should not intercept access to this
> MSR, as doing so prevents the guest from successfully reading/writing its
> configuration and enabling the feature.
It's probably more info than is really needed. Just saying something like
the following should be enough (feel free to improve on this):
Disable interception of the SECURE_AVIC_CONTROL MSR for Secure AVIC
enabled guests. The SECURE_AVIC_CONTROL MSR is used by the guest to
configure and enable Secure AVIC. In order for the guest to be able to
successfully do this, the MSR access must not be intercepted.
Thanks,
Tom
>
>
>
> - Neeraj
>
* [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (4 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 14:47 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 07/17] KVM: SVM: Add IPI Delivery Support " Neeraj Upadhyay
` (11 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
For AMD SEV-SNP guests with Secure AVIC, the virtual APIC state is
not visible to KVM and is managed by the hardware. This renders the
traditional interrupt injection mechanism, which directly modifies
guest state, unusable. Instead, interrupt delivery must be mediated
through a new interface in the VMCB. Implement support for this
mechanism.
First, new VMCB control fields, requested_irr and update_irr, are
defined to allow KVM to communicate pending interrupts to the hardware
before VMRUN.
Hook the core interrupt injection path, svm_inject_irq(). Instead of
injecting directly, transfer pending interrupts from KVM's software
IRR to the new requested_irr VMCB field and delegate final delivery
to the hardware.
Since the hardware is now responsible for the timing and delivery of
interrupts to the guest (including managing the guest's RFLAGS.IF and
vAPIC state), bypass the standard KVM interrupt window checks in
svm_interrupt_allowed() and svm_enable_irq_window(). Similarly, interrupt
re-injection is handled by the hardware and requires no explicit KVM
involvement.
Finally, update the logic for detecting pending interrupts. Add the
vendor op, protected_apic_has_interrupt(), to check only KVM's software
vAPIC IRR state.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/asm/svm.h | 8 +++++--
arch/x86/kvm/lapic.c | 17 ++++++++++++---
arch/x86/kvm/svm/sev.c | 44 ++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.c | 13 +++++++++++
arch/x86/kvm/svm/svm.h | 4 ++++
arch/x86/kvm/x86.c | 15 ++++++++++++-
6 files changed, 95 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index ab3d55654c77..0faf262f9f9f 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -162,10 +162,14 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
u64 vmsa_pa; /* Used for an SEV-ES guest */
u8 reserved_8[16];
u16 bus_lock_counter; /* Offset 0x120 */
- u8 reserved_9[22];
+ u8 reserved_9[18];
+ u8 update_irr; /* Offset 0x134 */
+ u8 reserved_10[3];
u64 allowed_sev_features; /* Offset 0x138 */
u64 guest_sev_features; /* Offset 0x140 */
- u8 reserved_10[664];
+ u8 reserved_11[8];
+ u32 requested_irr[8]; /* Offset 0x150 */
+ u8 reserved_12[624];
/*
* Offset 0x3e0, 32 bytes reserved
* for use by hypervisor/software.
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 5fc437341e03..3199c7c6db05 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2938,11 +2938,22 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
if (!kvm_apic_present(vcpu))
return -1;
- if (apic->guest_apic_protected)
+ if (!apic->guest_apic_protected) {
+ __apic_update_ppr(apic, &ppr);
+ return apic_has_interrupt_for_ppr(apic, ppr);
+ }
+
+ if (!apic->prot_apic_intr_inject)
return -1;
- __apic_update_ppr(apic, &ppr);
- return apic_has_interrupt_for_ppr(apic, ppr);
+ /*
+ * For guest-protected virtual APIC, hardware manages the virtual
+ * PPR and interrupt delivery to the guest. So, checking the KVM
+ * managed virtual APIC's APIC_IRR state for any pending vectors
+ * is the only thing required here.
+ */
+ return apic_search_irr(apic);
+
}
EXPORT_SYMBOL_GPL(kvm_apic_has_interrupt);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index afe4127a1918..78cefc14a2ee 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -28,6 +28,7 @@
#include <asm/debugreg.h>
#include <asm/msr.h>
#include <asm/sev.h>
+#include <asm/apic.h>
#include "mmu.h"
#include "x86.h"
@@ -35,6 +36,7 @@
#include "svm_ops.h"
#include "cpuid.h"
#include "trace.h"
+#include "lapic.h"
#define GHCB_VERSION_MAX 2ULL
#define GHCB_VERSION_DEFAULT 2ULL
@@ -5064,3 +5066,45 @@ void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa)
free_page((unsigned long)vmsa);
}
+
+void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected)
+{
+ unsigned int i, vec, vec_pos, vec_start;
+ struct kvm_lapic *apic;
+ bool has_interrupts;
+ u32 val;
+
+ /* Secure AVIC HW takes care of re-injection */
+ if (reinjected)
+ return;
+
+ apic = svm->vcpu.arch.apic;
+ has_interrupts = false;
+
+ for (i = 0; i < ARRAY_SIZE(svm->vmcb->control.requested_irr); i++) {
+ val = apic_get_reg(apic->regs, APIC_IRR + i * 0x10);
+ if (!val)
+ continue;
+ has_interrupts = true;
+ svm->vmcb->control.requested_irr[i] |= val;
+ vec_start = i * 32;
+ /*
+ * Clear each vector one by one to avoid race with concurrent
+ * APIC_IRR updates from the deliver_interrupt() path.
+ */
+ do {
+ vec_pos = __ffs(val);
+ vec = vec_start + vec_pos;
+ apic_clear_vector(vec, apic->regs + APIC_IRR);
+ val = val & ~BIT(vec_pos);
+ } while (val);
+ }
+
+ if (has_interrupts)
+ svm->vmcb->control.update_irr |= BIT(0);
+}
+
+bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu)
+{
+ return kvm_apic_has_interrupt(vcpu) != -1;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 064ec98d7e67..7811a87bc111 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -52,6 +52,8 @@
#include "svm.h"
#include "svm_ops.h"
+#include "lapic.h"
+
#include "kvm_onhyperv.h"
#include "svm_onhyperv.h"
@@ -3689,6 +3691,9 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
struct vcpu_svm *svm = to_svm(vcpu);
u32 type;
+ if (sev_savic_active(vcpu->kvm))
+ return sev_savic_set_requested_irr(svm, reinjected);
+
if (vcpu->arch.interrupt.soft) {
if (svm_update_soft_interrupt_rip(vcpu))
return;
@@ -3870,6 +3875,9 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ if (sev_savic_active(vcpu->kvm))
+ return 1;
+
if (svm->nested.nested_run_pending)
return -EBUSY;
@@ -3890,6 +3898,9 @@ static void svm_enable_irq_window(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ if (sev_savic_active(vcpu->kvm))
+ return;
+
/*
* In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
* 1, because that's a separate STGI/VMRUN intercept. The next time we
@@ -5132,6 +5143,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.apicv_post_state_restore = avic_apicv_post_state_restore,
.required_apicv_inhibits = AVIC_REQUIRED_APICV_INHIBITS,
+ .protected_apic_has_interrupt = sev_savic_has_pending_interrupt,
+
.get_exit_info = svm_get_exit_info,
.get_entry_info = svm_get_entry_info,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1090a48adeda..60dc424d62c4 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -873,6 +873,8 @@ static inline bool sev_savic_active(struct kvm *kvm)
{
return to_kvm_sev_info(kvm)->vmsa_features & SVM_SEV_FEAT_SECURE_AVIC;
}
+void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected);
+bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu);
#else
static inline struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
{
@@ -910,6 +912,8 @@ static inline struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu)
return NULL;
}
static inline void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa) {}
+static inline void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected) {}
+static inline bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu) { return false; }
#endif
/* vmenter.S */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 33fba801b205..65ebdc6deb92 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10369,7 +10369,20 @@ static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu,
if (r < 0)
goto out;
if (r) {
- int irq = kvm_cpu_get_interrupt(vcpu);
+ int irq;
+
+ /*
+ * Do not ack the interrupt here for guest-protected VAPIC
+ * which requires interrupt injection to the guest.
+ *
+ * ->inject_irq reads the KVM's VAPIC's APIC_IRR state and
+ * clears it.
+ */
+ if (vcpu->arch.apic->guest_apic_protected &&
+ vcpu->arch.apic->prot_apic_intr_inject)
+ irq = kvm_apic_has_interrupt(vcpu);
+ else
+ irq = kvm_cpu_get_interrupt(vcpu);
if (!WARN_ON_ONCE(irq == -1)) {
kvm_queue_interrupt(vcpu, irq, false);
--
2.34.1
* Re: [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC
2025-09-23 5:03 ` [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC Neeraj Upadhyay
@ 2025-09-23 14:47 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 14:47 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> For AMD SEV-SNP guests with Secure AVIC, the virtual APIC state is
> not visible to KVM and managed by the hardware. This renders the
> traditional interrupt injection mechanism, which directly modifies
> guest state, unusable. Instead, interrupt delivery must be mediated
> through a new interface in the VMCB. Implement support for this
> mechanism.
>
> First, new VMCB control fields, requested_irr and update_irr, are
> defined to allow KVM to communicate pending interrupts to the hardware
> before VMRUN.
>
> Hook the core interrupt injection path, svm_inject_irq(). Instead of
> injecting directly, transfer pending interrupts from KVM's software
> IRR to the new requested_irr VMCB field and delegate final delivery
> to the hardware.
>
> Since the hardware is now responsible for the timing and delivery of
> interrupts to the guest (including managing the guest's RFLAGS.IF and
> vAPIC state), bypass the standard KVM interrupt window checks in
> svm_interrupt_allowed() and svm_enable_irq_window(). Similarly, interrupt
> re-injection is handled by the hardware and requires no explicit KVM
> involvement.
>
> Finally, update the logic for detecting pending interrupts. Add the
> vendor op, protected_apic_has_interrupt(), to check only KVM's software
> vAPIC IRR state.
>
> Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/include/asm/svm.h | 8 +++++--
> arch/x86/kvm/lapic.c | 17 ++++++++++++---
> arch/x86/kvm/svm/sev.c | 44 ++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.c | 13 +++++++++++
> arch/x86/kvm/svm/svm.h | 4 ++++
> arch/x86/kvm/x86.c | 15 ++++++++++++-
> 6 files changed, 95 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index ab3d55654c77..0faf262f9f9f 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -162,10 +162,14 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> u64 vmsa_pa; /* Used for an SEV-ES guest */
> u8 reserved_8[16];
> u16 bus_lock_counter; /* Offset 0x120 */
> - u8 reserved_9[22];
> + u8 reserved_9[18];
> + u8 update_irr; /* Offset 0x134 */
The APM has this as a 4 byte field.
> + u8 reserved_10[3];
> u64 allowed_sev_features; /* Offset 0x138 */
> u64 guest_sev_features; /* Offset 0x140 */
> - u8 reserved_10[664];
> + u8 reserved_11[8];
> + u32 requested_irr[8]; /* Offset 0x150 */
> + u8 reserved_12[624];
> /*
> * Offset 0x3e0, 32 bytes reserved
> * for use by hypervisor/software.
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 5fc437341e03..3199c7c6db05 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2938,11 +2938,22 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
> if (!kvm_apic_present(vcpu))
> return -1;
>
> - if (apic->guest_apic_protected)
> + if (!apic->guest_apic_protected) {
> + __apic_update_ppr(apic, &ppr);
> + return apic_has_interrupt_for_ppr(apic, ppr);
> + }
> +
> + if (!apic->prot_apic_intr_inject)
> return -1;
>
> - __apic_update_ppr(apic, &ppr);
> - return apic_has_interrupt_for_ppr(apic, ppr);
> + /*
> + * For guest-protected virtual APIC, hardware manages the virtual
> + * PPR and interrupt delivery to the guest. So, checking the KVM
> + * managed virtual APIC's APIC_IRR state for any pending vectors
> + * is the only thing required here.
> + */
> + return apic_search_irr(apic);
Just a thought, but I wonder if this would look cleaner by doing:
if (apic->guest_apic_protected) {
if (!apic->prot_apic_intr_inject)
return -1;
/*
* For guest-protected ...
*/
return apic_search_irr(apic);
}
__apic_update_ppr(apic, &ppr);
return apic_has_interrupt_for_ppr(apic, ppr);
> +
> }
> EXPORT_SYMBOL_GPL(kvm_apic_has_interrupt);
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index afe4127a1918..78cefc14a2ee 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -28,6 +28,7 @@
> #include <asm/debugreg.h>
> #include <asm/msr.h>
> #include <asm/sev.h>
> +#include <asm/apic.h>
>
> #include "mmu.h"
> #include "x86.h"
> @@ -35,6 +36,7 @@
> #include "svm_ops.h"
> #include "cpuid.h"
> #include "trace.h"
> +#include "lapic.h"
>
> #define GHCB_VERSION_MAX 2ULL
> #define GHCB_VERSION_DEFAULT 2ULL
> @@ -5064,3 +5066,45 @@ void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa)
>
> free_page((unsigned long)vmsa);
> }
> +
> +void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected)
> +{
> + unsigned int i, vec, vec_pos, vec_start;
> + struct kvm_lapic *apic;
> + bool has_interrupts;
> + u32 val;
> +
> + /* Secure AVIC HW takes care of re-injection */
> + if (reinjected)
> + return;
> +
> + apic = svm->vcpu.arch.apic;
> + has_interrupts = false;
> +
> + for (i = 0; i < ARRAY_SIZE(svm->vmcb->control.requested_irr); i++) {
> + val = apic_get_reg(apic->regs, APIC_IRR + i * 0x10);
> + if (!val)
> + continue;
Add a blank line here.
> + has_interrupts = true;
> + svm->vmcb->control.requested_irr[i] |= val;
Add a blank line here.
> + vec_start = i * 32;
Move this line to just below the comment.
> + /*
> + * Clear each vector one by one to avoid race with concurrent
> + * APIC_IRR updates from the deliver_interrupt() path.
> + */
> + do {
> + vec_pos = __ffs(val);
> + vec = vec_start + vec_pos;
> + apic_clear_vector(vec, apic->regs + APIC_IRR);
> + val = val & ~BIT(vec_pos);
> + } while (val);
Would the following be cleaner?
for_each_set_bit(vec_pos, &val, 32)
apic_clear_vector(vec_start + vec_pos, apic->regs + APIC_IRR);
Might have to make "val" an unsigned long, though, and not sure how that
affects OR'ing it into requested_irr.
> + }
> +
> + if (has_interrupts)
> + svm->vmcb->control.update_irr |= BIT(0);
> +}
> +
> +bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu)
> +{
> + return kvm_apic_has_interrupt(vcpu) != -1;
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 064ec98d7e67..7811a87bc111 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -52,6 +52,8 @@
> #include "svm.h"
> #include "svm_ops.h"
>
> +#include "lapic.h"
Is this include really needed?
> +
> #include "kvm_onhyperv.h"
> #include "svm_onhyperv.h"
>
> @@ -3689,6 +3691,9 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
> struct vcpu_svm *svm = to_svm(vcpu);
> u32 type;
>
> + if (sev_savic_active(vcpu->kvm))
> + return sev_savic_set_requested_irr(svm, reinjected);
> +
> if (vcpu->arch.interrupt.soft) {
> if (svm_update_soft_interrupt_rip(vcpu))
> return;
> @@ -3870,6 +3875,9 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> + if (sev_savic_active(vcpu->kvm))
> + return 1;
Maybe just add a comment above this about why you always return 1 for
Secure AVIC.
> +
> if (svm->nested.nested_run_pending)
> return -EBUSY;
>
> @@ -3890,6 +3898,9 @@ static void svm_enable_irq_window(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
>
> + if (sev_savic_active(vcpu->kvm))
> + return;
Ditto here on the comment.
> +
> /*
> * In case GIF=0 we can't rely on the CPU to tell us when GIF becomes
> * 1, because that's a separate STGI/VMRUN intercept. The next time we
> @@ -5132,6 +5143,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> .apicv_post_state_restore = avic_apicv_post_state_restore,
> .required_apicv_inhibits = AVIC_REQUIRED_APICV_INHIBITS,
>
> + .protected_apic_has_interrupt = sev_savic_has_pending_interrupt,
> +
> .get_exit_info = svm_get_exit_info,
> .get_entry_info = svm_get_entry_info,
>
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 1090a48adeda..60dc424d62c4 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -873,6 +873,8 @@ static inline bool sev_savic_active(struct kvm *kvm)
> {
> return to_kvm_sev_info(kvm)->vmsa_features & SVM_SEV_FEAT_SECURE_AVIC;
> }
> +void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected);
> +bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu);
> #else
> static inline struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
> {
> @@ -910,6 +912,8 @@ static inline struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu)
> return NULL;
> }
> static inline void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa) {}
> +static inline void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected) {}
> +static inline bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu) { return false; }
> #endif
>
> /* vmenter.S */
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 33fba801b205..65ebdc6deb92 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10369,7 +10369,20 @@ static int kvm_check_and_inject_events(struct kvm_vcpu *vcpu,
> if (r < 0)
> goto out;
> if (r) {
> - int irq = kvm_cpu_get_interrupt(vcpu);
> + int irq;
> +
> + /*
> + * Do not ack the interrupt here for guest-protected VAPIC
> + * which requires interrupt injection to the guest.
Maybe a bit more detail about why you don't want to do the ACK?
Thanks,
Tom
> + *
> + * ->inject_irq reads the KVM's VAPIC's APIC_IRR state and
> + * clears it.
> + */
> + if (vcpu->arch.apic->guest_apic_protected &&
> + vcpu->arch.apic->prot_apic_intr_inject)
> + irq = kvm_apic_has_interrupt(vcpu);
> + else
> + irq = kvm_cpu_get_interrupt(vcpu);
>
> if (!WARN_ON_ONCE(irq == -1)) {
> kvm_queue_interrupt(vcpu, irq, false);
* [RFC PATCH v2 07/17] KVM: SVM: Add IPI Delivery Support for Secure AVIC
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (5 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception " Neeraj Upadhyay
` (10 subsequent siblings)
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Secure AVIC hardware only accelerates Self-IPI, i.e. on WRMSR to
APIC_SELF_IPI and APIC_ICR (with destination shorthand equal to "self")
registers, hardware takes care of updating the APIC_IRR in the guest-owned
APIC backing page of the vCPU. For other IPI types (cross-vCPU, broadcast
IPIs), software needs to take care of updating the APIC_IRR state in the
target vCPUs' APIC backing page and to ensure that the target vCPU notices
the new pending interrupt.
To ensure that the remote vCPU notices the new pending interrupt, the guest
sends an APIC_ICR MSR-write GHCB protocol event to the hypervisor.
Handle the APIC_ICR write MSR exits for Secure AVIC guests by either
sending an AVIC doorbell (if the target vCPU is running) or by waking up
the non-running target vCPU thread.
To ensure that the target vCPU observes the new IPI request, introduce a
new per-vcpu flag, sev_savic_has_pending_ipi. This flag acts as a reliable
"sticky bit" that signals a pending IPI, ensuring the event is not lost
even if the primary wakeup mechanism is missed. Update
sev_savic_has_pending_interrupt() to return true if
sev_savic_has_pending_ipi is set. This ensures that when a vCPU is about
to block (in kvm_vcpu_block()), it correctly recognizes that it has work
to do and will not go to sleep.
Clear the sev_savic_has_pending_ipi flag in pre_sev_run() just before the
next VM-entry. This resets the one-shot signal, as the pending interrupt
is now about to be processed by the hardware upon VMRUN.
During APIC_ICR write GHCB request handling, unconditionally set
sev_savic_has_pending_ipi for the target vCPU, irrespective of whether the
target vCPU is in guest mode or not. If the target vCPU does not take any
other VMEXIT before the next hlt exit, vCPU blocking fails because
sev_savic_has_pending_ipi remains set. sev_savic_has_pending_ipi is
cleared before the next VMRUN, so on a subsequent hlt exit the vCPU thread
will block.
The following race conditions can occur between the target vCPU executing
hlt and the source vCPU's IPI request handling.
a. VMEXIT before HLT when RFLAGS.IF = 0 or Interrupt shadow is active.
#Source-vCPU #Target-VCPU
1. sev_savic_has_pending_ipi = true
2. smp_mb();
3. Disable interrupts
4. Target vCPU is in guest mode
5. Raise AVIC doorbell to target
vCPU's physical APIC_ID
6. VMEXIT
7. sev_savic_has_pending_ipi =
false
8. VMRUN
9. HLT
10. VMEXIT
11. kvm_arch_vcpu_runnable()
returns false
12. vCPU thread blocks
In this scenario, the idle HLT intercept ensures that the target vCPU does
not take a hlt exit, as V_INTR is set (the AVIC doorbell from the source
vCPU triggers evaluation of the target vCPU's Secure AVIC backing page
and sets V_INTR).
b. Target vCPU takes HLT VMEXIT but hasn't cleared IN_GUEST_MODE at the
time when doorbell write is issued by source CPU.
#Source-vCPU #Target-VCPU
1. sev_savic_has_pending_ipi = true
2. smp_mb();
3. Target vCPU is in guest mode
4. HLT
5. VMEXIT
6. Raise AVIC doorbell to the target
physical CPU.
7. vcpu->mode =
OUTSIDE_GUEST_MODE
8. kvm_cpu_has_interrupt()
protected_..._interrupt()
smp_mb()
sev_savic_has_pending_ipi is
true
In this case, the smp_mb() barriers at steps 2 and 8 guarantee that the
target vCPU's thread observes that sev_savic_has_pending_ipi is set and
returns to guest mode without blocking.
c. For the other cases, where the source vCPU thread observes the target
vCPU to be outside of guest mode, the memory barriers in rcuwait_wake_up()
(source vCPU thread) and set_current_state() (target vCPU thread)
provide the required ordering and ensure that the read of
sev_savic_has_pending_ipi in kvm_vcpu_check_block() observes the write
by the source vCPU.
#Source-vCPU #Target-VCPU
rcuwait_wake_up()
smp_mb()
task = rcu_dereference(w->task);
if (task)
wake_up_process()
prepare_to_rcuwait()
w->task = current
set_current_state(
TASK_INTERRUPTIBLE)
smp_mb()
kvm_vcpu_check_block()
kvm_cpu_has_interrupt()
<Read sev_savic_has_..._ipi>
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 218 ++++++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/svm/svm.h | 2 +
2 files changed, 219 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 78cefc14a2ee..a64fcc7637c7 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3511,6 +3511,89 @@ int pre_sev_run(struct vcpu_svm *svm, int cpu)
if (!cpumask_test_cpu(cpu, to_kvm_sev_info(kvm)->have_run_cpus))
cpumask_set_cpu(cpu, to_kvm_sev_info(kvm)->have_run_cpus);
+ /*
+ * It should be safe to clear sev_savic_has_pending_ipi here.
+ *
+ * Following are the scenarios possible:
+ *
+ * Scenario 1: sev_savic_has_pending_ipi is set before hlt exit of the
+ * target vCPU.
+ *
+ * Source vCPU Target vCPU
+ *
+ * 1. Set APIC_IRR of target
+ * vCPU.
+ *
+ * 2. VMGEXIT
+ *
+ * 3. Set ...has_pending_ipi
+ *
+ * savic_handle_icr_write()
+ * ..._has_pending_ipi = true
+ *
+ * 4. avic_ring_doorbell()
+ * - VS -
+ *
+ * 4. VMEXIT
+ *
+ * 5. ..._has_pending_ipi = false
+ *
+ * 6. VM entry
+ *
+ * 7. hlt exit
+ *
+ * In this case, any VM exit taken by target vCPU before hlt exit
+ * clears sev_savic_has_pending_ipi. On hlt exit, idle halt intercept
+ * would find the V_INTR set and skip hlt exit.
+ *
+ * Scenario 2: sev_savic_has_pending_ipi is set when target vCPU
+ * has taken hlt exit.
+ *
+ * Source vCPU Target vCPU
+ *
+ * 1. hlt exit
+ *
+ * 2. Set ...has_pending_ipi
+ * 3. kvm_vcpu_has_events() returns true
+ * and VM is reentered.
+ *
+ * vcpu_block()
+ * kvm_arch_vcpu_runnable()
+ * kvm_vcpu_has_events()
+ * <return true as ..._has_pending_ipi
+ * is set>
+ *
+ * 4. On VM entry, APIC_IRR state is re-evaluated
+ * and V_INTR is set and interrupt is delivered
+ * to vCPU.
+ *
+ *
+ * Scenario 3: sev_savic_has_pending_ipi is set while halt exit is happening:
+ *
+ *
+ * Source vCPU Target vCPU
+ *
+ * 1. hlt
+ * Hardware checks V_INTR to determine
+ * if a hlt exit needs to be taken. No other
+ * exit such as intr exit can be taken
+ * while this sequence is being executed.
+ *
+ * 2. Set APIC_IRR of target vCPU.
+ *
+ * 3. Set ...has_pending_ipi
+ * 4. hlt exit taken.
+ *
+ * 5. ...has_pending_ipi being set is observed
+ * by target vCPU and the vCPU is resumed.
+ *
+ * In this scenario, hardware ensures that target vCPU does not take any exit
+ * between checking V_INTR state and halt exit. So, sev_savic_has_pending_ipi
+ * remains set when vCPU takes hlt exit.
+ */
+ if (READ_ONCE(svm->sev_savic_has_pending_ipi))
+ WRITE_ONCE(svm->sev_savic_has_pending_ipi, false);
+
/* Assign the asid allocated with this SEV guest */
svm->asid = asid;
@@ -4281,6 +4364,129 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
return 0;
}
+static void savic_handle_icr_write(struct kvm_vcpu *kvm_vcpu, u64 icr)
+{
+ struct kvm *kvm = kvm_vcpu->kvm;
+ struct kvm_vcpu *vcpu;
+ u32 icr_low, icr_high;
+ bool in_guest_mode;
+ unsigned long i;
+
+ icr_low = lower_32_bits(icr);
+ icr_high = upper_32_bits(icr);
+
+ /*
+ * TODO: Instead of scanning all the vCPUS, get fastpath working which should
+ * look similar to avic_kick_target_vcpus_fast().
+ */
+ kvm_for_each_vcpu(i, vcpu, kvm) {
+ if (!kvm_apic_match_dest(vcpu, kvm_vcpu->arch.apic, icr_low & APIC_SHORT_MASK,
+ icr_high, icr_low & APIC_DEST_MASK))
+ continue;
+
+ /*
+ * Setting sev_savic_has_pending_ipi could result in a spurious
+ * return from hlt (as kvm_cpu_has_interrupt() would return true)
+ * if destination CPU is in guest mode and the guest takes a hlt
+ * exit after handling the IPI. sev_savic_has_pending_ipi gets cleared
+ * on VM entry, so there can be at most one spurious return per IPI.
+ * For vcpu->mode == IN_GUEST_MODE, sev_savic_has_pending_ipi needs
+ * to be set to handle the case where the destination vCPU has taken
+ * hlt exit and the source CPU has not observed (target)vcpu->mode !=
+ * IN_GUEST_MODE.
+ */
+ WRITE_ONCE(to_svm(vcpu)->sev_savic_has_pending_ipi, true);
+ /* Order sev_savic_has_pending_ipi write and vcpu->mode read. */
+ smp_mb();
+ /* Pairs with smp_store_release in vcpu_enter_guest. */
+ in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE);
+ if (in_guest_mode) {
+ /*
+ * Signal the doorbell to tell hardware to inject the IRQ.
+ *
+ * If the vCPU exits the guest before the doorbell chimes,
+ * below memory ordering guarantees that the destination vCPU
+ * observes sev_savic_has_pending_ipi == true before
+ * blocking.
+ *
+ * Src-CPU Dest-CPU
+ *
+ * savic_handle_icr_write()
+ * sev_savic_has_pending_ipi = true
+ * smp_mb()
+ * smp_load_acquire(&vcpu->mode)
+ *
+ * - VS -
+ * vcpu->mode = OUTSIDE_GUEST_MODE
+ * __kvm_emulate_halt()
+ * kvm_cpu_has_interrupt()
+ * smp_mb()
+ * if (sev_savic_has_pending_ipi)
+ * return true;
+ *
+ * [S1]
+ * sev_savic_has_pending_ipi = true
+ *
+ * SMP_MB
+ *
+ * [L1]
+ * vcpu->mode
+ * [S2]
+ * vcpu->mode = OUTSIDE_GUEST_MODE
+ *
+ *
+ * SMP_MB
+ *
+ * [L2] sev_savic_has_pending_ipi == true
+ *
+ * exists (L1=IN_GUEST_MODE /\ L2=false)
+ *
+ * The above condition does not exist. So, if the source CPU observes
+ * vcpu->mode = IN_GUEST_MODE (L1), sev_savic_has_pending_ipi load by
+ * the destination CPU (L2) should observe the store (S1) from the
+ * source CPU.
+ */
+ avic_ring_doorbell(vcpu);
+ } else {
+ /*
+ * Wakeup the vCPU if it was blocking.
+ *
+ * Memory ordering is provided by smp_mb() in rcuwait_wake_up() on the
+ * source CPU and smp_mb() in set_current_state() inside kvm_vcpu_block()
+ * on the destination CPU.
+ */
+ kvm_vcpu_kick(vcpu);
+ }
+ }
+}
+
+static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
+{
+ u32 msr, reg;
+
+ msr = kvm_rcx_read(vcpu);
+ reg = (msr - APIC_BASE_MSR) << 4;
+
+ switch (reg) {
+ case APIC_ICR:
+ /*
+ * Only APIC_ICR WRMSR requires special handling for Secure AVIC
+ * guests to wake up destination vCPUs.
+ */
+ if (to_svm(vcpu)->vmcb->control.exit_info_1) {
+ u64 data = kvm_read_edx_eax(vcpu);
+
+ savic_handle_icr_write(vcpu, data);
+ return true;
+ }
+ break;
+ default:
+ break;
+ }
+
+ return false;
+}
+
int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4419,6 +4625,11 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
control->exit_info_1, control->exit_info_2);
ret = -EINVAL;
break;
+ case SVM_EXIT_MSR:
+ if (sev_savic_active(vcpu->kvm) && savic_handle_msr_exit(vcpu))
+ return 1;
+
+ fallthrough;
default:
ret = svm_invoke_exit_handler(vcpu, exit_code);
}
@@ -5106,5 +5317,10 @@ void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected)
bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu)
{
- return kvm_apic_has_interrupt(vcpu) != -1;
+ /*
+ * See memory ordering description in savic_handle_icr_write().
+ */
+ smp_mb();
+ return READ_ONCE(to_svm(vcpu)->sev_savic_has_pending_ipi) ||
+ kvm_apic_has_interrupt(vcpu) != -1;
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 60dc424d62c4..a3edb6e720cd 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -335,6 +335,8 @@ struct vcpu_svm {
/* Guest GIF value, used when vGIF is not enabled */
bool guest_gif;
+
+ bool sev_savic_has_pending_ipi;
};
struct svm_cpu_data {
--
2.34.1
* [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception for Secure AVIC
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (6 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 07/17] KVM: SVM: Add IPI Delivery Support " Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 15:00 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests Neeraj Upadhyay
` (9 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Secure AVIC does not support injecting exceptions from the hypervisor.
Take an early return from svm_inject_exception() for Secure AVIC enabled
guests.
Hardware takes care of delivering exceptions initiated by the guest, as
well as re-injecting them in case an intercept is taken before the
exception is delivered.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/svm.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7811a87bc111..fdd612c975ae 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -374,6 +374,9 @@ static void svm_inject_exception(struct kvm_vcpu *vcpu)
struct kvm_queued_exception *ex = &vcpu->arch.exception;
struct vcpu_svm *svm = to_svm(vcpu);
+ if (sev_savic_active(vcpu->kvm))
+ return;
+
kvm_deliver_exception_payload(vcpu, ex);
if (kvm_exception_is_soft(ex->vector) &&
--
2.34.1
* Re: [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception for Secure AVIC
2025-09-23 5:03 ` [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception " Neeraj Upadhyay
@ 2025-09-23 15:00 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 15:00 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>
> Secure AVIC does not support injecting exception from the hypervisor.
> Take an early return from svm_inject_exception() for Secure AVIC enabled
> guests.
>
> Hardware takes care of delivering exceptions initiated by the guest as
> well as re-injecting exceptions initiated by the guest (in case there's
> an intercept before delivering the exception).
>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/svm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7811a87bc111..fdd612c975ae 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -374,6 +374,9 @@ static void svm_inject_exception(struct kvm_vcpu *vcpu)
> struct kvm_queued_exception *ex = &vcpu->arch.exception;
> struct vcpu_svm *svm = to_svm(vcpu);
>
> + if (sev_savic_active(vcpu->kvm))
> + return;
A comment above this would be good to have.
Thanks,
Tom
> +
> kvm_deliver_exception_payload(vcpu, ex);
>
> if (kvm_exception_is_soft(ex->vector) &&
* [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (7 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception " Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 15:15 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area " Neeraj Upadhyay
` (8 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
Exceptions cannot be explicitly injected from the hypervisor into
Secure AVIC enabled guests. If KVM were to intercept an exception (e.g.,
#PF or #GP), it would be unable to deliver it back to the guest,
effectively dropping the event and leading to guest misbehavior or hangs.
Therefore, clear the exception intercepts so that all exceptions are
handled directly by the guest without KVM intervention.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a64fcc7637c7..837ab55a3330 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4761,8 +4761,17 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
/* Can't intercept XSETBV, HV can't modify XCR0 directly */
svm_clr_intercept(svm, INTERCEPT_XSETBV);
- if (sev_savic_active(vcpu->kvm))
+ if (sev_savic_active(vcpu->kvm)) {
svm_set_intercept_for_msr(vcpu, MSR_AMD64_SAVIC_CONTROL, MSR_TYPE_RW, false);
+
+ /* Clear all exception intercepts. */
+ clr_exception_intercept(svm, PF_VECTOR);
+ clr_exception_intercept(svm, UD_VECTOR);
+ clr_exception_intercept(svm, MC_VECTOR);
+ clr_exception_intercept(svm, AC_VECTOR);
+ clr_exception_intercept(svm, DB_VECTOR);
+ clr_exception_intercept(svm, GP_VECTOR);
+ }
}
void sev_init_vmcb(struct vcpu_svm *svm)
--
2.34.1
* Re: [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 15:15 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 15:15 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> Exceptions cannot be explicitly injected from the hypervisor to
> Secure AVIC enabled guests. So, KVM cannot inject exceptions into
> a Secure AVIC guest. If KVM were to intercept an exception (e.g., #PF
> or #GP), it would be unable to deliver it back to the guest, effectively
> dropping the event and leading to guest misbehavior or hangs. So,
> clear exception intercepts so that all exceptions are handled directly by
> the guest without KVM intervention.
>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index a64fcc7637c7..837ab55a3330 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4761,8 +4761,17 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
> /* Can't intercept XSETBV, HV can't modify XCR0 directly */
> svm_clr_intercept(svm, INTERCEPT_XSETBV);
>
> - if (sev_savic_active(vcpu->kvm))
> + if (sev_savic_active(vcpu->kvm)) {
> svm_set_intercept_for_msr(vcpu, MSR_AMD64_SAVIC_CONTROL, MSR_TYPE_RW, false);
> +
> + /* Clear all exception intercepts. */
> + clr_exception_intercept(svm, PF_VECTOR);
> + clr_exception_intercept(svm, UD_VECTOR);
> + clr_exception_intercept(svm, MC_VECTOR);
> + clr_exception_intercept(svm, AC_VECTOR);
> + clr_exception_intercept(svm, DB_VECTOR);
> + clr_exception_intercept(svm, GP_VECTOR);
Some of these are cleared no matter what prior to here. For example,
PF_VECTOR is cleared if npt_enabled is true (which is required for SEV),
UD_VECTOR and GP_VECTOR are cleared in sev_init_vmcb().
For the MC_VECTOR interception, the SVM code just ignores it today by
returning 1 immediately, so clearing the interception looks like a NOP,
but I might be missing something.
Thanks,
Tom
> + }
> }
>
> void sev_init_vmcb(struct vcpu_svm *svm)
* [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area for Secure AVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (8 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 15:16 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support " Neeraj Upadhyay
` (7 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
From: Kishon Vijay Abraham I <kvijayab@amd.com>
Unlike standard SVM which uses the V_GIF (Virtual Global Interrupt Flag)
bit in the VMCB, Secure AVIC ignores this field.
Instead, the hardware requires an equivalent V_GIF bit to be set within
the vintr_ctrl field of the VMSA (Virtual Machine Save Area). Failure
to set this bit will cause the hardware to block all interrupt delivery,
rendering the guest non-functional.
To enable interrupts for Secure AVIC guests, modify sev_es_sync_vmsa()
to unconditionally set the V_GIF_MASK in the VMSA's vintr_ctrl field
whenever Secure AVIC is active. This ensures the hardware correctly
identifies the guest as interruptible.
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 837ab55a3330..2dee210efb37 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -884,6 +884,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
save->sev_features = sev->vmsa_features;
+ if (sev_savic_active(vcpu->kvm))
+ save->vintr_ctrl |= V_GIF_MASK;
+
/*
* Skip FPU and AVX setup with KVM_SEV_ES_INIT to avoid
* breaking older measurements.
--
2.34.1
* Re: [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area for Secure AVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area " Neeraj Upadhyay
@ 2025-09-23 15:16 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 15:16 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> From: Kishon Vijay Abraham I <kvijayab@amd.com>
>
> Unlike standard SVM which uses the V_GIF (Virtual Global Interrupt Flag)
> bit in the VMCB, Secure AVIC ignores this field.
>
> Instead, the hardware requires an equivalent V_GIF bit to be set within
> the vintr_ctrl field of the VMSA (Virtual Machine Save Area). Failure
> to set this bit will cause the hardware to block all interrupt delivery,
> rendering the guest non-functional.
>
> To enable interrupts for Secure AVIC guests, modify sev_es_sync_vmsa()
> to unconditionally set the V_GIF_MASK in the VMSA's vintr_ctrl field
> whenever Secure AVIC is active. This ensures the hardware correctly
> identifies the guest as interruptible.
>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 837ab55a3330..2dee210efb37 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -884,6 +884,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>
> save->sev_features = sev->vmsa_features;
>
> + if (sev_savic_active(vcpu->kvm))
> + save->vintr_ctrl |= V_GIF_MASK;
A comment above this would be good.
Thanks,
Tom
> +
> /*
> * Skip FPU and AVX setup with KVM_SEV_ES_INIT to avoid
> * breaking older measurements.
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support for Secure AVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (9 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area " Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 15:25 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page Neeraj Upadhyay
` (6 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The Secure AVIC hardware introduces a new model for handling Non-Maskable
Interrupts (NMIs). This model differs significantly from standard SVM, as
guest NMI state is managed by the hardware and is not visible to KVM.
Consequently, KVM can no longer use the generic EVENT_INJ mechanism and
must not track NMI masking state in software. Instead, it must adopt the
vNMI (Virtual NMI) flow, which is the only mechanism supported by
Secure AVIC.
Enable NMI support by making three key changes:
1. Enable NMI in VMSA: Set the V_NMI_ENABLE_MASK bit in the VMSA's
vintr_ctrl field. This is a hardware prerequisite to enable the
vNMI feature for the guest.
2. Use vNMI for Injection: Modify svm_inject_nmi() to use the vNMI
flow for Secure AVIC guests. When an NMI is requested, set the
V_NMI_PENDING_MASK in the VMCB instead of using EVENT_INJ.
3. Update NMI Windowing: Modify svm_nmi_allowed() to reflect that
hardware now manages NMI blocking. KVM's only responsibility is to
avoid queuing a new vNMI if one is already pending. The check is
now simplified to whether V_NMI_PENDING_MASK is already set.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 2 +-
arch/x86/kvm/svm/svm.c | 56 ++++++++++++++++++++++++++----------------
2 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 2dee210efb37..7c66aefe428a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -885,7 +885,7 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
save->sev_features = sev->vmsa_features;
if (sev_savic_active(vcpu->kvm))
- save->vintr_ctrl |= V_GIF_MASK;
+ save->vintr_ctrl |= V_GIF_MASK | V_NMI_ENABLE_MASK;
/*
* Skip FPU and AVX setup with KVM_SEV_ES_INIT to avoid
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index fdd612c975ae..a945bc094c1a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3635,27 +3635,6 @@ static int pre_svm_run(struct kvm_vcpu *vcpu)
return 0;
}
-static void svm_inject_nmi(struct kvm_vcpu *vcpu)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
-
- svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
-
- if (svm->nmi_l1_to_l2)
- return;
-
- /*
- * No need to manually track NMI masking when vNMI is enabled, hardware
- * automatically sets V_NMI_BLOCKING_MASK as appropriate, including the
- * case where software directly injects an NMI.
- */
- if (!is_vnmi_enabled(svm)) {
- svm->nmi_masked = true;
- svm_set_iret_intercept(svm);
- }
- ++vcpu->stat.nmi_injections;
-}
-
static bool svm_is_vnmi_pending(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -3689,6 +3668,33 @@ static bool svm_set_vnmi_pending(struct kvm_vcpu *vcpu)
return true;
}
+static void svm_inject_nmi(struct kvm_vcpu *vcpu)
+{
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (sev_savic_active(vcpu->kvm)) {
+ svm_set_vnmi_pending(vcpu);
+ ++vcpu->stat.nmi_injections;
+ return;
+ }
+
+ svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
+
+ if (svm->nmi_l1_to_l2)
+ return;
+
+ /*
+ * No need to manually track NMI masking when vNMI is enabled, hardware
+ * automatically sets V_NMI_BLOCKING_MASK as appropriate, including the
+ * case where software directly injects an NMI.
+ */
+ if (!is_vnmi_enabled(svm)) {
+ svm->nmi_masked = true;
+ svm_set_iret_intercept(svm);
+ }
+ ++vcpu->stat.nmi_injections;
+}
+
static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -3836,6 +3842,14 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
{
struct vcpu_svm *svm = to_svm(vcpu);
+
+ /* Secure AVIC only supports V_NMI-based NMI injection. */
+ if (sev_savic_active(vcpu->kvm)) {
+ if (svm->vmcb->control.int_ctl & V_NMI_PENDING_MASK)
+ return 0;
+ return 1;
+ }
+
if (svm->nested.nested_run_pending)
return -EBUSY;
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support for Secure AVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support " Neeraj Upadhyay
@ 2025-09-23 15:25 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 15:25 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> The Secure AVIC hardware introduces a new model for handling Non-Maskable
> Interrupts (NMIs). This model differs significantly from standard SVM, as
> guest NMI state is managed by the hardware and is not visible to KVM.
>
> Consequently, KVM can no longer use the generic EVENT_INJ mechanism and
> must not track NMI masking state in software. Instead, it must adopt the
> vNMI (Virtual NMI) flow, which is the only mechanism supported by
> Secure AVIC.
>
> Enable NMI support by making three key changes:
>
> 1. Enable NMI in VMSA: Set the V_NMI_ENABLE_MASK bit in the VMSA's
> vintr_ctrl field. This is a hardware prerequisite to enable the
> vNMI feature for the guest.
>
> 2. Use vNMI for Injection: Modify svm_inject_nmi() to use the vNMI
> flow for Secure AVIC guests. When an NMI is requested, set the
> V_NMI_PENDING_MASK in the VMCB instead of using EVENT_INJ.
>
> 3. Update NMI Windowing: Modify svm_nmi_allowed() to reflect that
> hardware now manages NMI blocking. KVM's only responsibility is to
> avoid queuing a new vNMI if one is already pending. The check is
> now simplified to whether V_NMI_PENDING_MASK is already set.
>
> Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 2 +-
> arch/x86/kvm/svm/svm.c | 56 ++++++++++++++++++++++++++----------------
> 2 files changed, 36 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 2dee210efb37..7c66aefe428a 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -885,7 +885,7 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
> save->sev_features = sev->vmsa_features;
>
> if (sev_savic_active(vcpu->kvm))
> - save->vintr_ctrl |= V_GIF_MASK;
> + save->vintr_ctrl |= V_GIF_MASK | V_NMI_ENABLE_MASK;
>
> /*
> * Skip FPU and AVX setup with KVM_SEV_ES_INIT to avoid
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index fdd612c975ae..a945bc094c1a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3635,27 +3635,6 @@ static int pre_svm_run(struct kvm_vcpu *vcpu)
> return 0;
> }
>
> -static void svm_inject_nmi(struct kvm_vcpu *vcpu)
> -{
> - struct vcpu_svm *svm = to_svm(vcpu);
> -
> - svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
> -
> - if (svm->nmi_l1_to_l2)
> - return;
> -
> - /*
> - * No need to manually track NMI masking when vNMI is enabled, hardware
> - * automatically sets V_NMI_BLOCKING_MASK as appropriate, including the
> - * case where software directly injects an NMI.
> - */
> - if (!is_vnmi_enabled(svm)) {
> - svm->nmi_masked = true;
> - svm_set_iret_intercept(svm);
> - }
> - ++vcpu->stat.nmi_injections;
> -}
A pre-patch that moves this function would make the changes you make to
it in this patch more obvious.
Thanks,
Tom
> -
> static bool svm_is_vnmi_pending(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> @@ -3689,6 +3668,33 @@ static bool svm_set_vnmi_pending(struct kvm_vcpu *vcpu)
> return true;
> }
>
> +static void svm_inject_nmi(struct kvm_vcpu *vcpu)
> +{
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (sev_savic_active(vcpu->kvm)) {
> + svm_set_vnmi_pending(vcpu);
> + ++vcpu->stat.nmi_injections;
> + return;
> + }
> +
> + svm->vmcb->control.event_inj = SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_NMI;
> +
> + if (svm->nmi_l1_to_l2)
> + return;
> +
> + /*
> + * No need to manually track NMI masking when vNMI is enabled, hardware
> + * automatically sets V_NMI_BLOCKING_MASK as appropriate, including the
> + * case where software directly injects an NMI.
> + */
> + if (!is_vnmi_enabled(svm)) {
> + svm->nmi_masked = true;
> + svm_set_iret_intercept(svm);
> + }
> + ++vcpu->stat.nmi_injections;
> +}
> +
> static void svm_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> @@ -3836,6 +3842,14 @@ bool svm_nmi_blocked(struct kvm_vcpu *vcpu)
> static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> +
> + /* Secure AVIC only supports V_NMI-based NMI injection. */
> + if (sev_savic_active(vcpu->kvm)) {
> + if (svm->vmcb->control.int_ctl & V_NMI_PENDING_MASK)
> + return 0;
> + return 1;
> + }
> +
> if (svm->nested.nested_run_pending)
> return -EBUSY;
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (10 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support " Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 16:02 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests Neeraj Upadhyay
` (5 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The Secure AVIC hardware requires uninterrupted access to the guest's
APIC backing page. If this page is not present in the Nested Page Table
(NPT) during a hardware access, a non-recoverable nested page fault
occurs. This sets a BUSY flag in the VMSA and causes subsequent
VMRUNs to fail with an unrecoverable VMEXIT_BUSY, effectively
killing the vCPU.
This situation can arise if the backing page resides within a 2MB large
page in the NPT. If other parts of that large page are modified (e.g.,
memory state changes), KVM would split the 2MB NPT entry into 4KB
entries. This process can temporarily zap the PTE for the backing page,
creating a window for the fatal hardware access.
Introduce a new GHCB VMGEXIT protocol, SVM_VMGEXIT_SECURE_AVIC, to
allow the guest to explicitly inform KVM of the APIC backing page's
location, thereby enabling KVM to guarantee its presence in the NPT.
Implement two actions for this protocol:
- SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE:
On this request, KVM receives the GPA of the backing page. To prevent
the 2MB page-split issue, immediately perform a PSMASH on the GPA by
calling sev_handle_rmp_fault(). This proactively breaks any
containing 2MB NPT entry into 4KB pages, isolating the backing page's
PTE and guaranteeing its presence. Store the GPA for future reference.
- SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE:
On this request, clear the stored GPA, releasing KVM from its
obligation to maintain the NPT entry. Return the previously
registered GPA to the guest.
This mechanism ensures the stability of the APIC backing page mapping,
which is critical for the correct operation of Secure AVIC.
Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/include/uapi/asm/svm.h | 3 ++
arch/x86/kvm/svm/sev.c | 59 +++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 1 +
3 files changed, 63 insertions(+)
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index 9c640a521a67..f1ef52e0fab1 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -118,6 +118,9 @@
#define SVM_VMGEXIT_AP_CREATE 1
#define SVM_VMGEXIT_AP_DESTROY 2
#define SVM_VMGEXIT_SNP_RUN_VMPL 0x80000018
+#define SVM_VMGEXIT_SECURE_AVIC 0x8000001a
+#define SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE 0
+#define SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE 1
#define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
#define SVM_VMGEXIT_TERM_REQUEST 0x8000fffe
#define SVM_VMGEXIT_TERM_REASON(reason_set, reason_code) \
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 7c66aefe428a..3e9cc50f2705 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3399,6 +3399,15 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
!kvm_ghcb_rcx_is_valid(svm))
goto vmgexit_err;
break;
+ case SVM_VMGEXIT_SECURE_AVIC:
+ if (!sev_savic_active(vcpu->kvm))
+ goto vmgexit_err;
+ if (!kvm_ghcb_rax_is_valid(svm))
+ goto vmgexit_err;
+ if (svm->vmcb->control.exit_info_1 == SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE)
+ if (!kvm_ghcb_rbx_is_valid(svm))
+ goto vmgexit_err;
+ break;
case SVM_VMGEXIT_MMIO_READ:
case SVM_VMGEXIT_MMIO_WRITE:
if (!kvm_ghcb_sw_scratch_is_valid(svm))
@@ -4490,6 +4499,53 @@ static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
return false;
}
+static int sev_handle_savic_vmgexit(struct vcpu_svm *svm)
+{
+ struct kvm_vcpu *vcpu = NULL;
+ u64 apic_id;
+
+ apic_id = kvm_rax_read(&svm->vcpu);
+
+ if (apic_id == -1ULL) {
+ vcpu = &svm->vcpu;
+ } else {
+ vcpu = kvm_get_vcpu_by_id(svm->vcpu.kvm, apic_id);
+ if (!vcpu)
+ goto savic_request_invalid;
+ }
+
+ switch (svm->vmcb->control.exit_info_1) {
+ case SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE:
+ gpa_t gpa;
+
+ gpa = kvm_rbx_read(&svm->vcpu);
+ if (!PAGE_ALIGNED(gpa))
+ goto savic_request_invalid;
+
+ /*
+ * Invoking sev_handle_rmp_fault() PSMASHes any 2MB NPT entry
+ * covering the backing page, so its 4KB PTE cannot later be
+ * zapped by a large-page split while the hardware needs it.
+ */
+ sev_handle_rmp_fault(vcpu, gpa, 0);
+ to_svm(vcpu)->sev_savic_gpa = gpa;
+ break;
+ case SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE:
+ kvm_rbx_write(&svm->vcpu, to_svm(vcpu)->sev_savic_gpa);
+ to_svm(vcpu)->sev_savic_gpa = 0;
+ break;
+ default:
+ goto savic_request_invalid;
+ }
+
+ return 1;
+
+savic_request_invalid:
+ ghcb_set_sw_exit_info_1(svm->sev_es.ghcb, 2);
+ ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_ERR_INVALID_INPUT);
+
+ return 1;
+}
+
int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -4628,6 +4684,9 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
control->exit_info_1, control->exit_info_2);
ret = -EINVAL;
break;
+ case SVM_VMGEXIT_SECURE_AVIC:
+ ret = sev_handle_savic_vmgexit(svm);
+ break;
case SVM_EXIT_MSR:
if (sev_savic_active(vcpu->kvm) && savic_handle_msr_exit(vcpu))
return 1;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a3edb6e720cd..8043833a1a8c 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -337,6 +337,7 @@ struct vcpu_svm {
bool guest_gif;
bool sev_savic_has_pending_ipi;
+ gpa_t sev_savic_gpa;
};
struct svm_cpu_data {
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page
2025-09-23 5:03 ` [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page Neeraj Upadhyay
@ 2025-09-23 16:02 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 16:02 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> The Secure AVIC hardware requires uninterrupted access to the guest's
> APIC backing page. If this page is not present in the Nested Page Table
> (NPT) during a hardware access, a non-recoverable nested page fault
> occurs. This sets a BUSY flag in the VMSA and causes subsequent
> VMRUNs to fail with an unrecoverable VMEXIT_BUSY, effectively
> killing the vCPU.
>
> This situation can arise if the backing page resides within a 2MB large
> page in the NPT. If other parts of that large page are modified (e.g.,
> memory state changes), KVM would split the 2MB NPT entry into 4KB
> entries. This process can temporarily zap the PTE for the backing page,
> creating a window for the fatal hardware access.
>
> Introduce a new GHCB VMGEXIT protocol, SVM_VMGEXIT_SECURE_AVIC, to
> allow the guest to explicitly inform KVM of the APIC backing page's
> location, thereby enabling KVM to guarantee its presence in the NPT.
>
> Implement two actions for this protocol:
>
> - SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE:
> On this request, KVM receives the GPA of the backing page. To prevent
> the 2MB page-split issue, immediately perform a PSMASH on the GPA by
> calling sev_handle_rmp_fault(). This proactively breaks any
> containing 2MB NPT entry into 4KB pages, isolating the backing page's
> PTE and guaranteeing its presence. Store the GPA for future reference.
>
> - SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE:
> On this request, clear the stored GPA, releasing KVM from its
> obligation to maintain the NPT entry. Return the previously
> registered GPA to the guest.
>
> This mechanism ensures the stability of the APIC backing page mapping,
> which is critical for the correct operation of Secure AVIC.
>
> Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/include/uapi/asm/svm.h | 3 ++
> arch/x86/kvm/svm/sev.c | 59 +++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.h | 1 +
> 3 files changed, 63 insertions(+)
>
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index 9c640a521a67..f1ef52e0fab1 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -118,6 +118,9 @@
> #define SVM_VMGEXIT_AP_CREATE 1
> #define SVM_VMGEXIT_AP_DESTROY 2
> #define SVM_VMGEXIT_SNP_RUN_VMPL 0x80000018
> +#define SVM_VMGEXIT_SECURE_AVIC 0x8000001a
> +#define SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE 0
> +#define SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE 1
> #define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
> #define SVM_VMGEXIT_TERM_REQUEST 0x8000fffe
> #define SVM_VMGEXIT_TERM_REASON(reason_set, reason_code) \
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 7c66aefe428a..3e9cc50f2705 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -3399,6 +3399,15 @@ static int sev_es_validate_vmgexit(struct vcpu_svm *svm)
> !kvm_ghcb_rcx_is_valid(svm))
> goto vmgexit_err;
> break;
> + case SVM_VMGEXIT_SECURE_AVIC:
> + if (!sev_savic_active(vcpu->kvm))
> + goto vmgexit_err;
> + if (!kvm_ghcb_rax_is_valid(svm))
> + goto vmgexit_err;
> + if (svm->vmcb->control.exit_info_1 == SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE)
> + if (!kvm_ghcb_rbx_is_valid(svm))
> + goto vmgexit_err;
> + break;
> case SVM_VMGEXIT_MMIO_READ:
> case SVM_VMGEXIT_MMIO_WRITE:
> if (!kvm_ghcb_sw_scratch_is_valid(svm))
> @@ -4490,6 +4499,53 @@ static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
> return false;
> }
>
> +static int sev_handle_savic_vmgexit(struct vcpu_svm *svm)
> +{
> + struct kvm_vcpu *vcpu = NULL;
This gets confusing below, how about calling this target_vcpu. Also, it
shouldn't need initializing, right?
> + u64 apic_id;
> +
> + apic_id = kvm_rax_read(&svm->vcpu);
> +
> + if (apic_id == -1ULL) {
> + vcpu = &svm->vcpu;
> + } else {
> + vcpu = kvm_get_vcpu_by_id(svm->vcpu.kvm, apic_id);
> + if (!vcpu)
> + goto savic_request_invalid;
> + }
> +
> + switch (svm->vmcb->control.exit_info_1) {
> + case SVM_VMGEXIT_SAVIC_REGISTER_BACKING_PAGE:
> + gpa_t gpa;
> +
> + gpa = kvm_rbx_read(&svm->vcpu);
> + if (!PAGE_ALIGNED(gpa))
> + goto savic_request_invalid;
> +
> + /*
> + * sev_handle_rmp_fault() invocation would result in PSMASH if
> + * NPTE size is 2M.
> + */
Why you're invoking sev_handle_rmp_fault() would be more appropriate in
the comment.
Thanks,
Tom
> + sev_handle_rmp_fault(vcpu, gpa, 0);
> + to_svm(vcpu)->sev_savic_gpa = gpa;
> + break;
> + case SVM_VMGEXIT_SAVIC_UNREGISTER_BACKING_PAGE:
> + kvm_rbx_write(&svm->vcpu, to_svm(vcpu)->sev_savic_gpa);
> + to_svm(vcpu)->sev_savic_gpa = 0;
> + break;
> + default:
> + goto savic_request_invalid;
> + }
> +
> + return 1;
> +
> +savic_request_invalid:
> + ghcb_set_sw_exit_info_1(svm->sev_es.ghcb, 2);
> + ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, GHCB_ERR_INVALID_INPUT);
> +
> + return 1;
> +}
> +
> int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> @@ -4628,6 +4684,9 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu)
> control->exit_info_1, control->exit_info_2);
> ret = -EINVAL;
> break;
> + case SVM_VMGEXIT_SECURE_AVIC:
> + ret = sev_handle_savic_vmgexit(svm);
> + break;
> case SVM_EXIT_MSR:
> if (sev_savic_active(vcpu->kvm) && savic_handle_msr_exit(vcpu))
> return 1;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index a3edb6e720cd..8043833a1a8c 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -337,6 +337,7 @@ struct vcpu_svm {
> bool guest_gif;
>
> bool sev_savic_has_pending_ipi;
> + gpa_t sev_savic_gpa;
> };
>
> struct svm_cpu_data {
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (11 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 16:15 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests Neeraj Upadhyay
` (4 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
While Secure AVIC hardware accelerates End-of-Interrupt (EOI) processing
for edge-triggered interrupts, it requires hypervisor assistance for
level-triggered interrupts originating from the IOAPIC. For these
interrupts, a guest write to the EOI MSR triggers a VM-Exit.
The primary challenge in handling this exit is that the guest's real
In-Service Register (ISR) is not visible to KVM. When KVM receives an EOI,
it has no direct way of knowing which interrupt vector is being
acknowledged.
To solve this, use KVM's software vAPIC state as a shadow tracking
mechanism for active, level-triggered interrupts.
The implementation follows this flow:
1. On interrupt injection (sev_savic_set_requested_irr), check KVM's
software vAPIC Trigger Mode Register (TMR) to identify if the
interrupt is level-triggered.
2. If it is, set the corresponding vector in KVM's software shadow ISR.
This marks the interrupt as "in-service" from KVM's perspective.
3. When the guest later issues an EOI, the APIC_EOI MSR write exit
handler finds the highest vector set in this shadow ISR.
4. The handler then clears the vector from the shadow ISR and calls
kvm_apic_set_eoi_accelerated() to propagate the EOI to the virtual
IOAPIC, allowing it to de-assert the interrupt line.
This enables correct EOI handling for level-triggered interrupts in
Secure AVIC guests, despite the hardware-enforced opacity of the guest's
APIC state.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 3e9cc50f2705..5be2956fb812 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4474,7 +4474,9 @@ static void savic_handle_icr_write(struct kvm_vcpu *kvm_vcpu, u64 icr)
static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
{
+ struct kvm_lapic *apic;
u32 msr, reg;
+ int vec;
msr = kvm_rcx_read(vcpu);
reg = (msr - APIC_BASE_MSR) << 4;
@@ -4492,6 +4494,12 @@ static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
return true;
}
break;
+ case APIC_EOI:
+ apic = vcpu->arch.apic;
+ vec = apic_find_highest_vector(apic->regs + APIC_ISR);
+ apic_clear_vector(vec, apic->regs + APIC_ISR);
+ kvm_apic_set_eoi_accelerated(vcpu, vec);
+ return true;
default:
break;
}
@@ -5379,6 +5387,8 @@ void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected)
vec = vec_start + vec_pos;
apic_clear_vector(vec, apic->regs + APIC_IRR);
val = val & ~BIT(vec_pos);
+ if (apic_test_vector(vec, apic->regs + APIC_TMR))
+ apic_set_vector(vec, apic->regs + APIC_ISR);
} while (val);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 16:15 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 16:15 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> While Secure AVIC hardware accelerates End-of-Interrupt (EOI) processing
> for edge-triggered interrupts, it requires hypervisor assistance for
> level-triggered interrupts originating from the IOAPIC. For these
> interrupts, a guest write to the EOI MSR triggers a VM-Exit.
>
> The primary challenge in handling this exit is that the guest's real
> In-Service Register (ISR) is not visible to KVM. When KVM receives an EOI,
> it has no direct way of knowing which interrupt vector is being
> acknowledged.
>
> To solve this, use KVM's software vAPIC state as a shadow tracking
> mechanism for active, level-triggered interrupts.
>
> The implementation follows this flow:
>
> 1. On interrupt injection (sev_savic_set_requested_irr), check KVM's
> software vAPIC Trigger Mode Register (TMR) to identify if the
> interrupt is level-triggered.
>
> 2. If it is, set the corresponding vector in KVM's software shadow ISR.
> This marks the interrupt as "in-service" from KVM's perspective.
>
> 3. When the guest later issues an EOI, the APIC_EOI MSR write exit
> handler finds the highest vector set in this shadow ISR.
>
> 4. The handler then clears the vector from the shadow ISR and calls
> kvm_apic_set_eoi_accelerated() to propagate the EOI to the virtual
> IOAPIC, allowing it to de-assert the interrupt line.
>
> This enables correct EOI handling for level-triggered interrupts in
> Secure AVIC guests, despite the hardware-enforced opacity of the guest's
> APIC state.
>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 3e9cc50f2705..5be2956fb812 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4474,7 +4474,9 @@ static void savic_handle_icr_write(struct kvm_vcpu *kvm_vcpu, u64 icr)
>
> static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
> {
> + struct kvm_lapic *apic;
> u32 msr, reg;
> + int vec;
>
> msr = kvm_rcx_read(vcpu);
> reg = (msr - APIC_BASE_MSR) << 4;
> @@ -4492,6 +4494,12 @@ static bool savic_handle_msr_exit(struct kvm_vcpu *vcpu)
> return true;
> }
> break;
> + case APIC_EOI:
> + apic = vcpu->arch.apic;
> + vec = apic_find_highest_vector(apic->regs + APIC_ISR);
> + apic_clear_vector(vec, apic->regs + APIC_ISR);
> + kvm_apic_set_eoi_accelerated(vcpu, vec);
> + return true;
Do you need to ensure that this is truly a WRMSR being done vs a RDMSR?
Or are you guaranteed that it is a WRMSR at this point?
Thanks,
Tom
> default:
> break;
> }
> @@ -5379,6 +5387,8 @@ void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected)
> vec = vec_start + vec_pos;
> apic_clear_vector(vec, apic->regs + APIC_IRR);
> val = val & ~BIT(vec_pos);
> + if (apic_test_vector(vec, apic->regs + APIC_TMR))
> + apic_set_vector(vec, apic->regs + APIC_ISR);
> } while (val);
> }
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (12 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 16:23 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests Neeraj Upadhyay
` (3 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
KVM tracks End-of-Interrupts (EOIs) for the legacy RTC interrupt (GSI 8)
to detect and report coalesced interrupts to userspace. This mechanism
fundamentally relies on KVM having visibility into the guest's interrupt
acknowledgment state.
This assumption is invalid for guests with a protected APIC (e.g., Secure
AVIC) for two main reasons:
a. The guest's true In-Service Register (ISR) is not visible to KVM,
making it impossible to know if the previous interrupt is still
active, so lazy pending-EOI checks cannot be performed.
b. The RTC interrupt is edge-triggered, and its EOI is accelerated by the
hardware without a VM-Exit. KVM never sees the EOI event.
Since KVM can observe neither the interrupt's service status nor its EOI,
the tracking logic is invalid. So, disable this feature for all protected
APIC guests. This change means that userspace will no longer be able to
detect coalesced RTC interrupts for these specific guest types.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/ioapic.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 2b5d389bca5f..308778ba4f58 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -113,6 +113,9 @@ static void __rtc_irq_eoi_tracking_restore_one(struct kvm_vcpu *vcpu)
struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
union kvm_ioapic_redirect_entry *e;
+ if (vcpu->arch.apic->guest_apic_protected)
+ return;
+
e = &ioapic->redirtbl[RTC_GSI];
if (!kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
e->fields.dest_id,
@@ -476,6 +479,7 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status)
{
union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
struct kvm_lapic_irq irqe;
+ struct kvm_vcpu *vcpu;
int ret;
if (entry->fields.mask ||
@@ -505,7 +509,9 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status)
BUG_ON(ioapic->rtc_status.pending_eoi != 0);
ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
&ioapic->rtc_status.dest_map);
- ioapic->rtc_status.pending_eoi = (ret < 0 ? 0 : ret);
+ vcpu = kvm_get_vcpu(ioapic->kvm, 0);
+ if (!vcpu->arch.apic->guest_apic_protected)
+ ioapic->rtc_status.pending_eoi = (ret < 0 ? 0 : ret);
} else
ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests
2025-09-23 5:03 ` [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests Neeraj Upadhyay
@ 2025-09-23 16:23 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 16:23 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> KVM tracks End-of-Interrupts (EOIs) for the legacy RTC interrupt (GSI 8)
> to detect and report coalesced interrupts to userspace. This mechanism
> fundamentally relies on KVM having visibility into the guest's interrupt
> acknowledgment state.
>
> This assumption is invalid for guests with a protected APIC (e.g., Secure
> AVIC) for two main reasons:
>
> a. The guest's true In-Service Register (ISR) is not visible to KVM,
> making it impossible to know if the previous interrupt is still active.
> So, lazy pending EOI checks cannot be done.
>
> b. The RTC interrupt is edge-triggered, and its EOI is accelerated by the
> hardware without a VM-Exit. KVM never sees the EOI event.
>
> Since KVM can observe neither the interrupt's service status nor its EOI,
> the tracking logic is invalid. So, disable this feature for all protected
> APIC guests. This change means that userspace will no longer be able to
> detect coalesced RTC interrupts for these specific guest types.
>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/ioapic.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 2b5d389bca5f..308778ba4f58 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -113,6 +113,9 @@ static void __rtc_irq_eoi_tracking_restore_one(struct kvm_vcpu *vcpu)
> struct dest_map *dest_map = &ioapic->rtc_status.dest_map;
> union kvm_ioapic_redirect_entry *e;
>
> + if (vcpu->arch.apic->guest_apic_protected)
> + return;
A comment above this code would be good.
> +
> e = &ioapic->redirtbl[RTC_GSI];
> if (!kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
> e->fields.dest_id,
> @@ -476,6 +479,7 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status)
> {
> union kvm_ioapic_redirect_entry *entry = &ioapic->redirtbl[irq];
> struct kvm_lapic_irq irqe;
> + struct kvm_vcpu *vcpu;
> int ret;
>
> if (entry->fields.mask ||
> @@ -505,7 +509,9 @@ static int ioapic_service(struct kvm_ioapic *ioapic, int irq, bool line_status)
> BUG_ON(ioapic->rtc_status.pending_eoi != 0);
> ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe,
> &ioapic->rtc_status.dest_map);
> - ioapic->rtc_status.pending_eoi = (ret < 0 ? 0 : ret);
> + vcpu = kvm_get_vcpu(ioapic->kvm, 0);
> + if (!vcpu->arch.apic->guest_apic_protected)
> + ioapic->rtc_status.pending_eoi = (ret < 0 ? 0 : ret);
And a comment about this, too.
Thanks,
Tom
> } else
> ret = kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe, NULL);
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (13 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 16:32 ` Tom Lendacky
2025-09-23 5:03 ` [RFC PATCH v2 16/17] KVM: x86/cpuid: Disable paravirt APIC features for protected APIC Neeraj Upadhyay
` (2 subsequent siblings)
17 siblings, 1 reply; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The kvm_wait_lapic_expire() function is a pre-VMRUN optimization that
allows a vCPU to wait for an imminent LAPIC timer interrupt. However,
this function is not fully compatible with protected APIC models like
Secure AVIC because it relies on inspecting KVM's software vAPIC state.
For Secure AVIC, the true timer state is hardware-managed and opaque
to KVM. For this reason, kvm_wait_lapic_expire() does not check whether
a timer interrupt has been injected for guests with protected APIC
state.
For such guests, the injected-timer check must instead be done by the
callers of kvm_wait_lapic_expire(). So, for Secure AVIC guests, check
the to-be-injected vectors in the requested_IRR for the timer interrupt
vector before calling kvm_wait_lapic_expire().
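The requested_IRR lookup in the patch below can be sketched in isolation: the 8-bit timer vector from the LVTT register indexes a 256-bit bitmap stored as eight 32-bit words. This is a standalone illustration (the helper name and register layout mirror the patch, but the code is not kernel code):

```c
/* Standalone sketch of the requested_IRR bit test (illustrative only). */
#include <assert.h>
#include <stdint.h>

#define APIC_VECTOR_MASK 0xff

/* irr: 256-bit to-be-injected vector bitmap as eight 32-bit words.
 * lvtt_reg: raw LVTT register value; only the low 8 vector bits matter here. */
static int vector_in_irr(const uint32_t irr[8], uint32_t lvtt_reg)
{
	unsigned int vec = lvtt_reg & APIC_VECTOR_MASK;

	return !!(irr[vec / 32] & (1u << (vec % 32)));
}
```

For example, vector 0xef (239) lands in word 239 / 32 = 7, bit 239 % 32 = 15, and any mode bits above bit 7 of the LVTT value are masked off before indexing.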
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 8 ++++++++
arch/x86/kvm/svm/svm.c | 3 ++-
arch/x86/kvm/svm/svm.h | 2 ++
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5be2956fb812..3f6cf8d5068a 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -5405,3 +5405,11 @@ bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu)
return READ_ONCE(to_svm(vcpu)->sev_savic_has_pending_ipi) ||
kvm_apic_has_interrupt(vcpu) != -1;
}
+
+bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu)
+{
+ u32 reg = kvm_lapic_get_reg(vcpu->arch.apic, APIC_LVTT);
+ int vec = reg & APIC_VECTOR_MASK;
+
+ return to_svm(vcpu)->vmcb->control.requested_irr[vec / 32] & BIT(vec % 32);
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a945bc094c1a..d0d972731ea7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4335,7 +4335,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
vcpu->arch.host_debugctl != svm->vmcb->save.dbgctl)
update_debugctlmsr(svm->vmcb->save.dbgctl);
- kvm_wait_lapic_expire(vcpu);
+ if (!sev_savic_active(vcpu->kvm) || sev_savic_timer_int_injected(vcpu))
+ kvm_wait_lapic_expire(vcpu);
/*
* If this vCPU has touched SPEC_CTRL, restore the guest's value if
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8043833a1a8c..ecc4ea11822d 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -878,6 +878,7 @@ static inline bool sev_savic_active(struct kvm *kvm)
}
void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected);
bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu);
+bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu);
#else
static inline struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
{
@@ -917,6 +918,7 @@ static inline struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu)
static inline void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa) {}
static inline void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected) {}
static inline bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu) { return false; }
+static inline bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu) { return true; }
#endif
/* vmenter.S */
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests
2025-09-23 5:03 ` [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 16:32 ` Tom Lendacky
0 siblings, 0 replies; 32+ messages in thread
From: Tom Lendacky @ 2025-09-23 16:32 UTC (permalink / raw)
To: Neeraj Upadhyay, kvm, seanjc, pbonzini
Cc: linux-kernel, nikunj, Santosh.Shukla, Vasant.Hegde,
Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang, naveen.rao,
tiala
On 9/23/25 00:03, Neeraj Upadhyay wrote:
> The kvm_wait_lapic_expire() function is a pre-VMRUN optimization that
> allows a vCPU to wait for an imminent LAPIC timer interrupt. However,
> this function is not fully compatible with protected APIC models like
> Secure AVIC because it relies on inspecting KVM's software vAPIC state.
> For Secure AVIC, the true timer state is hardware-managed and opaque
> to KVM. For this reason, kvm_wait_lapic_expire() does not check whether
> timer interrupt is injected for the guests which have protected APIC
> state.
>
> For the protected APIC guests, the check for injected timer need to be
> done by the callers of kvm_wait_lapic_expire(). So, for Secure AVIC
> guests, check to be injected vectors in the requested_IRR for injected
> timer interrupt before doing a kvm_wait_lapic_expire().
>
> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
> ---
> arch/x86/kvm/svm/sev.c | 8 ++++++++
> arch/x86/kvm/svm/svm.c | 3 ++-
> arch/x86/kvm/svm/svm.h | 2 ++
> 3 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 5be2956fb812..3f6cf8d5068a 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -5405,3 +5405,11 @@ bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu)
> return READ_ONCE(to_svm(vcpu)->sev_savic_has_pending_ipi) ||
> kvm_apic_has_interrupt(vcpu) != -1;
> }
> +
> +bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu)
> +{
> + u32 reg = kvm_lapic_get_reg(vcpu->arch.apic, APIC_LVTT);
Extra space before the "="
> + int vec = reg & APIC_VECTOR_MASK;
> +
> + return to_svm(vcpu)->vmcb->control.requested_irr[vec / 32] & BIT(vec % 32);
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index a945bc094c1a..d0d972731ea7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4335,7 +4335,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
> vcpu->arch.host_debugctl != svm->vmcb->save.dbgctl)
> update_debugctlmsr(svm->vmcb->save.dbgctl);
>
> - kvm_wait_lapic_expire(vcpu);
> + if (!sev_savic_active(vcpu->kvm) || sev_savic_timer_int_injected(vcpu))
> + kvm_wait_lapic_expire(vcpu);
>
> /*
> * If this vCPU has touched SPEC_CTRL, restore the guest's value if
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 8043833a1a8c..ecc4ea11822d 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -878,6 +878,7 @@ static inline bool sev_savic_active(struct kvm *kvm)
> }
> void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected);
> bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu);
> +bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu);
> #else
> static inline struct page *snp_safe_alloc_page_node(int node, gfp_t gfp)
> {
> @@ -917,6 +918,7 @@ static inline struct vmcb_save_area *sev_decrypt_vmsa(struct kvm_vcpu *vcpu)
> static inline void sev_free_decrypted_vmsa(struct kvm_vcpu *vcpu, struct vmcb_save_area *vmsa) {}
> static inline void sev_savic_set_requested_irr(struct vcpu_svm *svm, bool reinjected) {}
> static inline bool sev_savic_has_pending_interrupt(struct kvm_vcpu *vcpu) { return false; }
> +static inline bool sev_savic_timer_int_injected(struct kvm_vcpu *vcpu) { return true; }
Shouldn't this return false? If CONFIG_KVM_AMD_SEV isn't defined, then
sev_savic_active() will always be false and this won't be called anyway.
Thanks,
Tom
> #endif
>
> /* vmenter.S */
^ permalink raw reply [flat|nested] 32+ messages in thread
* [RFC PATCH v2 16/17] KVM: x86/cpuid: Disable paravirt APIC features for protected APIC
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (14 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 5:03 ` [RFC PATCH v2 17/17] KVM: SVM: Advertise Secure AVIC support for SNP guests Neeraj Upadhyay
2025-09-23 10:02 ` [syzbot ci] Re: AMD: Add Secure AVIC KVM Support syzbot ci
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The paravirtualized APIC features, PV_EOI and PV_SEND_IPI, are
predicated on KVM having full visibility and control over the guest's
vAPIC state. This assumption is invalid for guests with a protected APIC
(e.g., AMD SEV-SNP with Secure AVIC, Intel TDX), where the APIC state is
opaque to the hypervisor and managed by the hardware.
- PV_EOI: KVM cannot service a PV_EOI MSR write because it has no
access to the guest's true In-Service Register (ISR). For these
guests, EOIs are either accelerated by hardware or virtualized via
a different, technology-specific VM-Exit, not the PV MSR.
- PV_SEND_IPI: Protected guest models have their own specific IPI
virtualization flows (e.g., VMGEXIT on ICR write for Secure AVIC).
Exposing the generic PV_SEND_IPI hypercall would provide a
conflicting, incorrect path that bypasses the required secure flow.
To prevent the guest from using these incompatible interfaces, clear
the KVM_FEATURE_PV_EOI and KVM_FEATURE_PV_SEND_IPI feature CPUID bits
for guests with a protected APIC.
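The bit clearing itself is a simple mask on CPUID leaf 0x40000001 EAX. A standalone sketch (the bit positions match the upstream kvm_para.h ABI, PV_EOI = 6 and PV_SEND_IPI = 11, but verify against the headers in use):

```c
/* Standalone sketch of the PV feature quirk for protected APIC guests. */
#include <assert.h>
#include <stdint.h>

#define KVM_FEATURE_PV_EOI	6
#define KVM_FEATURE_PV_SEND_IPI	11

static uint32_t apply_protected_apic_quirk(uint32_t eax, int apic_protected)
{
	if (apic_protected)
		eax &= ~((1u << KVM_FEATURE_PV_EOI) |
			 (1u << KVM_FEATURE_PV_SEND_IPI));
	return eax;
}
```

Unprotected guests see the feature word unchanged; protected-APIC guests lose exactly the two PV bits and keep everything else advertised in the leaf.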
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/cpuid.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e2836a255b16..01b3c4e88282 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -245,6 +245,10 @@ static u32 kvm_apply_cpuid_pv_features_quirk(struct kvm_vcpu *vcpu)
if (kvm_hlt_in_guest(vcpu->kvm))
best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
+ if (vcpu->arch.apic->guest_apic_protected)
+ best->eax &= ~((1 << KVM_FEATURE_PV_EOI) |
+ (1 << KVM_FEATURE_PV_SEND_IPI));
+
return best->eax;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [RFC PATCH v2 17/17] KVM: SVM: Advertise Secure AVIC support for SNP guests
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (15 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 16/17] KVM: x86/cpuid: Disable paravirt APIC features for protected APIC Neeraj Upadhyay
@ 2025-09-23 5:03 ` Neeraj Upadhyay
2025-09-23 10:02 ` [syzbot ci] Re: AMD: Add Secure AVIC KVM Support syzbot ci
17 siblings, 0 replies; 32+ messages in thread
From: Neeraj Upadhyay @ 2025-09-23 5:03 UTC (permalink / raw)
To: kvm, seanjc, pbonzini
Cc: linux-kernel, Thomas.Lendacky, nikunj, Santosh.Shukla,
Vasant.Hegde, Suravee.Suthikulpanit, bp, David.Kaplan, huibo.wang,
naveen.rao, tiala
The preceding patches have implemented all the necessary KVM
infrastructure to support the Secure AVIC feature for SEV-SNP guests,
including interrupt/NMI injection, IPI virtualization, and EOI handling.
Despite the backend support being complete, KVM does not yet advertise
this capability. As a result, userspace tools cannot create VMs that
utilize this feature.
To enable the feature, add the SVM_SEV_FEAT_SECURE_AVIC flag to the
sev_supported_vmsa_features bitmask. This bitmask communicates
KVM's supported VMSA features to userspace.
This is the final enabling patch in the series, allowing the creation
of Secure AVIC-enabled virtual machines.
Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
---
arch/x86/kvm/svm/sev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 3f6cf8d5068a..fe3d65c50afd 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3092,6 +3092,9 @@ void __init sev_hardware_setup(void)
sev_supported_vmsa_features = 0;
if (sev_es_debug_swap_enabled)
sev_supported_vmsa_features |= SVM_SEV_FEAT_DEBUG_SWAP;
+
+ if (sev_snp_savic_enabled)
+ sev_supported_vmsa_features |= SVM_SEV_FEAT_SECURE_AVIC;
}
void sev_hardware_unsetup(void)
--
2.34.1
^ permalink raw reply related [flat|nested] 32+ messages in thread
* [syzbot ci] Re: AMD: Add Secure AVIC KVM Support
2025-09-23 5:03 [RFC PATCH v2 00/17] AMD: Add Secure AVIC KVM Support Neeraj Upadhyay
` (16 preceding siblings ...)
2025-09-23 5:03 ` [RFC PATCH v2 17/17] KVM: SVM: Advertise Secure AVIC support for SNP guests Neeraj Upadhyay
@ 2025-09-23 10:02 ` syzbot ci
2025-09-23 10:17 ` Upadhyay, Neeraj
17 siblings, 1 reply; 32+ messages in thread
From: syzbot ci @ 2025-09-23 10:02 UTC (permalink / raw)
To: bp, david.kaplan, huibo.wang, kvm, linux-kernel, naveen.rao,
neeraj.upadhyay, nikunj, pbonzini, santosh.shukla, seanjc,
suravee.suthikulpanit, thomas.lendacky, tiala, vasant.hegde
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v2] AMD: Add Secure AVIC KVM Support
https://lore.kernel.org/all/20250923050317.205482-1-Neeraj.Upadhyay@amd.com
* [RFC PATCH v2 01/17] KVM: x86/lapic: Differentiate protected APIC interrupt mechanisms
* [RFC PATCH v2 02/17] x86/cpufeatures: Add Secure AVIC CPU feature
* [RFC PATCH v2 03/17] KVM: SVM: Add support for Secure AVIC capability in KVM
* [RFC PATCH v2 04/17] KVM: SVM: Set guest APIC protection flags for Secure AVIC
* [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
* [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC
* [RFC PATCH v2 07/17] KVM: SVM: Add IPI Delivery Support for Secure AVIC
* [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception for Secure AVIC
* [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests
* [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area for Secure AVIC guests
* [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support for Secure AVIC guests
* [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page
* [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests
* [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests
* [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests
* [RFC PATCH v2 16/17] KVM: x86/cpuid: Disable paravirt APIC features for protected APIC
* [RFC PATCH v2 17/17] KVM: SVM: Advertise Secure AVIC support for SNP guests
and found the following issue:
general protection fault in kvm_apply_cpuid_pv_features_quirk
Full report is available here:
https://ci.syzbot.org/series/887b895e-0315-498c-99e5-966704f16fb5
***
general protection fault in kvm_apply_cpuid_pv_features_quirk
tree: kvm-next
URL: https://kernel.googlesource.com/pub/scm/virt/kvm/kvm/
base: a6ad54137af92535cfe32e19e5f3bc1bb7dbd383
arch: amd64
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config: https://ci.syzbot.org/builds/a65d3de7-36d8-4181-8566-80e0f0719955/config
C repro: https://ci.syzbot.org/findings/939a8c5a-41b2-4e9b-9129-80dff6d039c4/c_repro
syz repro: https://ci.syzbot.org/findings/939a8c5a-41b2-4e9b-9129-80dff6d039c4/syz_repro
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000013: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000098-0x000000000000009f]
CPU: 0 UID: 0 PID: 5992 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:kvm_apply_cpuid_pv_features_quirk+0x38c/0x4f0 arch/x86/kvm/cpuid.c:248
Code: c1 e8 03 80 3c 10 00 74 12 4c 89 ff e8 9d d8 d4 00 48 ba 00 00 00 00 00 fc ff df bb 9c 00 00 00 49 03 1f 48 89 d8 48 c1 e8 03 <0f> b6 04 10 84 c0 0f 85 c2 00 00 00 80 3b 00 74 2e e8 4e 6a 71 00
RSP: 0018:ffffc90004f871a0 EFLAGS: 00010203
RAX: 0000000000000013 RBX: 000000000000009c RCX: ffff888107562440
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90004f87250 R08: 0000000000000005 R09: 000000008b838003
R10: ffffc90004f872e0 R11: fffff520009f0e61 R12: ffff888034f30970
R13: 1ffff110069e612e R14: ffff888020170528 R15: ffff888034f302f8
FS: 000055556af3f500(0000) GS:ffff8880b861b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffff40f56c8 CR3: 0000000020cc0000 CR4: 0000000000352ef0
Call Trace:
<TASK>
kvm_vcpu_after_set_cpuid+0xc75/0x18a0 arch/x86/kvm/cpuid.c:432
kvm_set_cpuid+0xea4/0x1110 arch/x86/kvm/cpuid.c:551
kvm_vcpu_ioctl_set_cpuid2+0xbe/0x130 arch/x86/kvm/cpuid.c:626
kvm_arch_vcpu_ioctl+0x13c5/0x2a80 arch/x86/kvm/x86.c:5975
kvm_vcpu_ioctl+0x74d/0xe90 virt/kvm/kvm_main.c:4637
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:598 [inline]
__se_sys_ioctl+0xf9/0x170 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f14f278e82b
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007ffff40f55f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffff40f5d40 RCX: 00007f14f278e82b
RDX: 00007ffff40f5d40 RSI: 000000004008ae90 RDI: 0000000000000005
RBP: 00002000008fc000 R08: 0000000000000000 R09: 0000000000000006
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000063 R14: 00002000008fb000 R15: 00002000008fc800
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:kvm_apply_cpuid_pv_features_quirk+0x38c/0x4f0 arch/x86/kvm/cpuid.c:248
Code: c1 e8 03 80 3c 10 00 74 12 4c 89 ff e8 9d d8 d4 00 48 ba 00 00 00 00 00 fc ff df bb 9c 00 00 00 49 03 1f 48 89 d8 48 c1 e8 03 <0f> b6 04 10 84 c0 0f 85 c2 00 00 00 80 3b 00 74 2e e8 4e 6a 71 00
RSP: 0018:ffffc90004f871a0 EFLAGS: 00010203
RAX: 0000000000000013 RBX: 000000000000009c RCX: ffff888107562440
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90004f87250 R08: 0000000000000005 R09: 000000008b838003
R10: ffffc90004f872e0 R11: fffff520009f0e61 R12: ffff888034f30970
R13: 1ffff110069e612e R14: ffff888020170528 R15: ffff888034f302f8
FS: 000055556af3f500(0000) GS:ffff8881a3c1b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055df3be04900 CR3: 0000000020cc0000 CR4: 0000000000352ef0
----------------
Code disassembly (best guess):
0: c1 e8 03 shr $0x3,%eax
3: 80 3c 10 00 cmpb $0x0,(%rax,%rdx,1)
7: 74 12 je 0x1b
9: 4c 89 ff mov %r15,%rdi
c: e8 9d d8 d4 00 call 0xd4d8ae
11: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx
18: fc ff df
1b: bb 9c 00 00 00 mov $0x9c,%ebx
20: 49 03 1f add (%r15),%rbx
23: 48 89 d8 mov %rbx,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 0f b6 04 10 movzbl (%rax,%rdx,1),%eax <-- trapping instruction
2e: 84 c0 test %al,%al
30: 0f 85 c2 00 00 00 jne 0xf8
36: 80 3b 00 cmpb $0x0,(%rbx)
39: 74 2e je 0x69
3b: e8 4e 6a 71 00 call 0x716a8e
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [syzbot ci] Re: AMD: Add Secure AVIC KVM Support
2025-09-23 10:02 ` [syzbot ci] Re: AMD: Add Secure AVIC KVM Support syzbot ci
@ 2025-09-23 10:17 ` Upadhyay, Neeraj
0 siblings, 0 replies; 32+ messages in thread
From: Upadhyay, Neeraj @ 2025-09-23 10:17 UTC (permalink / raw)
To: syzbot ci, bp, david.kaplan, huibo.wang, kvm, linux-kernel,
naveen.rao, nikunj, pbonzini, santosh.shukla, seanjc,
suravee.suthikulpanit, thomas.lendacky, tiala, vasant.hegde
Cc: syzbot, syzkaller-bugs
On 9/23/2025 3:32 PM, syzbot ci wrote:
> syzbot ci has tested the following series
>
> [v2] AMD: Add Secure AVIC KVM Support
> https://lore.kernel.org/all/20250923050317.205482-1-Neeraj.Upadhyay@amd.com
> * [RFC PATCH v2 01/17] KVM: x86/lapic: Differentiate protected APIC interrupt mechanisms
> * [RFC PATCH v2 02/17] x86/cpufeatures: Add Secure AVIC CPU feature
> * [RFC PATCH v2 03/17] KVM: SVM: Add support for Secure AVIC capability in KVM
> * [RFC PATCH v2 04/17] KVM: SVM: Set guest APIC protection flags for Secure AVIC
> * [RFC PATCH v2 05/17] KVM: SVM: Do not intercept SECURE_AVIC_CONTROL MSR for SAVIC guests
> * [RFC PATCH v2 06/17] KVM: SVM: Implement interrupt injection for Secure AVIC
> * [RFC PATCH v2 07/17] KVM: SVM: Add IPI Delivery Support for Secure AVIC
> * [RFC PATCH v2 08/17] KVM: SVM: Do not inject exception for Secure AVIC
> * [RFC PATCH v2 09/17] KVM: SVM: Do not intercept exceptions for Secure AVIC guests
> * [RFC PATCH v2 10/17] KVM: SVM: Set VGIF in VMSA area for Secure AVIC guests
> * [RFC PATCH v2 11/17] KVM: SVM: Enable NMI support for Secure AVIC guests
> * [RFC PATCH v2 12/17] KVM: SVM: Add VMGEXIT handler for Secure AVIC backing page
> * [RFC PATCH v2 13/17] KVM: SVM: Add IOAPIC EOI support for Secure AVIC guests
> * [RFC PATCH v2 14/17] KVM: x86/ioapic: Disable RTC EOI tracking for protected APIC guests
> * [RFC PATCH v2 15/17] KVM: SVM: Check injected timers for Secure AVIC guests
> * [RFC PATCH v2 16/17] KVM: x86/cpuid: Disable paravirt APIC features for protected APIC
> * [RFC PATCH v2 17/17] KVM: SVM: Advertise Secure AVIC support for SNP guests
>
> and found the following issue:
> general protection fault in kvm_apply_cpuid_pv_features_quirk
>
> Full report is available here:
> https://ci.syzbot.org/series/887b895e-0315-498c-99e5-966704f16fb5
>
> ***
>
> general protection fault in kvm_apply_cpuid_pv_features_quirk
>
Thanks for the report. I will update the check to below:
if (lapic_in_kernel(vcpu) && vcpu->arch.apic->guest_apic_protected)
best->eax &= ~((1 << KVM_FEATURE_PV_EOI) |
(1 << KVM_FEATURE_PV_SEND_IPI));
- Neeraj
^ permalink raw reply [flat|nested] 32+ messages in thread