* [PATCH v2 0/4] AVIC bugfixes and workarounds
@ 2023-09-28 17:33 Maxim Levitsky
2023-09-28 17:33 ` [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception Maxim Levitsky
` (5 more replies)
0 siblings, 6 replies; 13+ messages in thread
From: Maxim Levitsky @ 2023-09-28 17:33 UTC (permalink / raw)
To: kvm
Cc: iommu, H. Peter Anvin, Sean Christopherson, Maxim Levitsky,
Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Joerg Roedel,
x86, Suravee Suthikulpanit, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy
This patch series includes several fixes to AVIC I found while working
on a new version of nested AVIC code.
Also while developing it I realized that a very simple workaround for
AVIC's errata #1235 exists and included it in this patch series as well.
changes since v2:
- added 'fixes' tags
- reworked workaround for avic errata #1235
- dropped iommu patch as it is no longer needed.
Best regards,
Maxim Levitsky
Maxim Levitsky (4):
x86: KVM: SVM: always update the x2avic msr interception
x86: KVM: SVM: add support for Invalid IPI Vector interception
x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
x86: KVM: SVM: workaround for AVIC's errata #1235
arch/x86/include/asm/svm.h | 1 +
arch/x86/kvm/svm/avic.c | 68 +++++++++++++++++++++++++++-----------
arch/x86/kvm/svm/nested.c | 3 ++
arch/x86/kvm/svm/svm.c | 3 +-
arch/x86/kvm/svm/svm.h | 1 +
5 files changed, 55 insertions(+), 21 deletions(-)
--
2.26.3
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
@ 2023-09-28 17:33 ` Maxim Levitsky
2023-09-29 0:24 ` Sean Christopherson
2023-09-28 17:33 ` [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception Maxim Levitsky
` (4 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2023-09-28 17:33 UTC (permalink / raw)
To: kvm
Cc: iommu, H. Peter Anvin, Sean Christopherson, Maxim Levitsky,
Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Joerg Roedel,
x86, Suravee Suthikulpanit, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy, stable
The following problem exists since x2avic was enabled in the KVM:
svm_set_x2apic_msr_interception is called to enable the interception of
the x2apic msrs.
In particular it is called at the moment the guest resets its apic.
Assuming that the guest's apic was in x2apic mode, the reset will bring
it back to the xapic mode.
The svm_set_x2apic_msr_interception however has an erroneous check for
'!apic_x2apic_mode()' which prevents it from doing anything in this case.
As a result of this, all x2apic msrs are left unintercepted, and that
exposes the bare metal x2apic (if enabled) to the guest.
Oops.
Remove the erroneous '!apic_x2apic_mode()' check to fix that.
This fixes CVE-2023-5090
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
arch/x86/kvm/svm/svm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9507df93f410a63..acdd0b89e4715a3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -913,8 +913,7 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
if (intercept == svm->x2avic_msrs_intercepted)
return;
- if (!x2avic_enabled ||
- !apic_x2apic_mode(svm->vcpu.arch.apic))
+ if (!x2avic_enabled)
return;
for (i = 0; i < MAX_DIRECT_ACCESS_MSRS; i++) {
--
2.26.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
2023-09-28 17:33 ` [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception Maxim Levitsky
@ 2023-09-28 17:33 ` Maxim Levitsky
2023-09-29 0:42 ` Sean Christopherson
2023-09-28 17:33 ` [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested() Maxim Levitsky
` (3 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2023-09-28 17:33 UTC (permalink / raw)
To: kvm
Cc: iommu, H. Peter Anvin, Sean Christopherson, Maxim Levitsky,
Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Joerg Roedel,
x86, Suravee Suthikulpanit, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy, stable
In later revisions of AMD's APM, there is a new 'incomplete IPI' exit code:
"Invalid IPI Vector - The vector for the specified IPI was set to an
illegal value (VEC < 16)"
Note that tests on Zen2 machine show that this VM exit doesn't happen and
instead AVIC just does nothing.
Add support for this exit code by doing nothing, instead of filling
the kernel log with errors.
Also replace an unthrottled 'pr_err()' if another unknown incomplete
IPI exit happens with vcpu_unimpl()
(e.g in case AMD adds yet another 'Invalid IPI' exit reason)
Cc: <stable@vger.kernel.org>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
arch/x86/include/asm/svm.h | 1 +
arch/x86/kvm/svm/avic.c | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 19bf955b67e0da0..3ac0ffc4f3e202b 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -268,6 +268,7 @@ enum avic_ipi_failure_cause {
AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
AVIC_IPI_FAILURE_INVALID_TARGET,
AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
+ AVIC_IPI_FAILURE_INVALID_IPI_VECTOR,
};
#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(8, 0)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 2092db892d7d052..4b74ea91f4e6bb6 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -529,8 +529,11 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
case AVIC_IPI_FAILURE_INVALID_BACKING_PAGE:
WARN_ONCE(1, "Invalid backing page\n");
break;
+ case AVIC_IPI_FAILURE_INVALID_IPI_VECTOR:
+ /* Invalid IPI with vector < 16 */
+ break;
default:
- pr_err("Unknown IPI interception\n");
+ vcpu_unimpl(vcpu, "Unknown avic incomplete IPI interception\n");
}
return 1;
--
2.26.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
2023-09-28 17:33 ` [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception Maxim Levitsky
2023-09-28 17:33 ` [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception Maxim Levitsky
@ 2023-09-28 17:33 ` Maxim Levitsky
2023-09-29 0:42 ` Sean Christopherson
2023-09-28 17:33 ` [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235 Maxim Levitsky
` (2 subsequent siblings)
5 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2023-09-28 17:33 UTC (permalink / raw)
To: kvm
Cc: iommu, H. Peter Anvin, Sean Christopherson, Maxim Levitsky,
Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Joerg Roedel,
x86, Suravee Suthikulpanit, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy, stable
svm_leave_nested() similar to a nested VM exit, get the vCPU out of nested
mode and thus should end the local inhibition of AVIC on this vCPU.
Failure to do so, can lead to hangs on guest reboot.
Raise the KVM_REQ_APICV_UPDATE request to refresh the AVIC state of the
current vCPU in this case.
Fixes: f44509f849fe ("KVM: x86: SVM: allow AVIC to co-exist with a nested guest running")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
arch/x86/kvm/svm/nested.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index dd496c9e5f91f28..3fea8c47679e689 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1253,6 +1253,9 @@ void svm_leave_nested(struct kvm_vcpu *vcpu)
nested_svm_uninit_mmu_context(vcpu);
vmcb_mark_all_dirty(svm->vmcb);
+
+ if (kvm_apicv_activated(vcpu->kvm))
+ kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
}
kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
--
2.26.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
` (2 preceding siblings ...)
2023-09-28 17:33 ` [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested() Maxim Levitsky
@ 2023-09-28 17:33 ` Maxim Levitsky
2023-09-29 2:06 ` Sean Christopherson
2023-09-29 2:09 ` [PATCH v2 0/4] AVIC bugfixes and workarounds Sean Christopherson
2023-10-12 14:46 ` Paolo Bonzini
5 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2023-09-28 17:33 UTC (permalink / raw)
To: kvm
Cc: iommu, H. Peter Anvin, Sean Christopherson, Maxim Levitsky,
Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Joerg Roedel,
x86, Suravee Suthikulpanit, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy
On Zen2 (and likely on Zen1 as well), AVIC doesn't reliably detect a change
in the 'is_running' bit during ICR write emulation and might skip a
VM exit, if that bit was recently cleared.
The absence of the VM exit, leads to the KVM not waking up / triggering
nested vm exit on the target(s) of the IPI which can, in some cases,
lead to an unbounded delays in the guest execution.
As I recently discovered, a reasonable workaround exists: make the KVM
never set the is_running bit.
This workaround ensures that (*) all ICR writes always cause a VM exit
and therefore correctly emulated, in expense of never enjoying VM exit-less
ICR emulation.
This workaround does carry a performance penalty but according to my
benchmarks is still much better than not using AVIC at all,
because AVIC is still used for the receiving end of the IPIs, and for the
posted interrupts.
If the user is aware of the errata and it doesn't affect his workload,
the user can disable the workaround with 'avic_zen2_errata_workaround=0'
(*) More correctly all ICR writes except when 'Self' shorthand is used:
In this case AVIC skips reading physid table and just sets bits in IRR
of local APIC. Thankfully in this case, the errata is not possible,
therefore an extra workaround for this case is not needed.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
arch/x86/kvm/svm/avic.c | 63 +++++++++++++++++++++++++++++------------
arch/x86/kvm/svm/svm.h | 1 +
2 files changed, 46 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 4b74ea91f4e6bb6..28bb0e6b321660d 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -62,6 +62,9 @@ static_assert(__AVIC_GATAG(AVIC_VM_ID_MASK, AVIC_VCPU_ID_MASK) == -1u);
static bool force_avic;
module_param_unsafe(force_avic, bool, 0444);
+static int avic_zen2_errata_workaround = -1;
+module_param(avic_zen2_errata_workaround, int, 0444);
+
/* Note:
* This hash table is used to map VM_ID to a struct kvm_svm,
* when handling AMD IOMMU GALOG notification to schedule in
@@ -276,7 +279,7 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
static int avic_init_backing_page(struct kvm_vcpu *vcpu)
{
- u64 *entry, new_entry;
+ u64 *entry;
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);
@@ -308,10 +311,10 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
if (!entry)
return -EINVAL;
- new_entry = __sme_set((page_to_phys(svm->avic_backing_page) &
- AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK) |
- AVIC_PHYSICAL_ID_ENTRY_VALID_MASK);
- WRITE_ONCE(*entry, new_entry);
+ svm->avic_physical_id_entry = __sme_set((page_to_phys(svm->avic_backing_page) &
+ AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK) |
+ AVIC_PHYSICAL_ID_ENTRY_VALID_MASK);
+ WRITE_ONCE(*entry, svm->avic_physical_id_entry);
svm->avic_physical_id_cache = entry;
@@ -835,7 +838,7 @@ static int svm_ir_list_add(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
* will update the pCPU info when the vCPU awkened and/or scheduled in.
* See also avic_vcpu_load().
*/
- entry = READ_ONCE(*(svm->avic_physical_id_cache));
+ entry = READ_ONCE(svm->avic_physical_id_entry);
if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
amd_iommu_update_ga(entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK,
true, pi->ir_data);
@@ -1027,7 +1030,6 @@ avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu, bool r)
void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
- u64 entry;
int h_physical_id = kvm_cpu_get_apicid(cpu);
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long flags;
@@ -1056,14 +1058,23 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
*/
spin_lock_irqsave(&svm->ir_list_lock, flags);
- entry = READ_ONCE(*(svm->avic_physical_id_cache));
- WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
- entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
- entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
- entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
+ WARN_ON_ONCE(svm->avic_physical_id_entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
+
+ svm->avic_physical_id_entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
+ svm->avic_physical_id_entry |=
+ (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
+
+ svm->avic_physical_id_entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
+
+ /*
+ * Do not update the actual physical id table entry if workaround
+ * for #1235 - the physical ID entry is_running is never set when
+ * the workaround is activated
+ */
+ if (!avic_zen2_errata_workaround)
+ WRITE_ONCE(*(svm->avic_physical_id_cache), svm->avic_physical_id_entry);
- WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true);
spin_unlock_irqrestore(&svm->ir_list_lock, flags);
@@ -1071,7 +1082,6 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
void avic_vcpu_put(struct kvm_vcpu *vcpu)
{
- u64 entry;
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long flags;
@@ -1084,10 +1094,9 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
* can't be scheduled out and thus avic_vcpu_{put,load}() can't run
* recursively.
*/
- entry = READ_ONCE(*(svm->avic_physical_id_cache));
/* Nothing to do if IsRunning == '0' due to vCPU blocking. */
- if (!(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK))
+ if (!(svm->avic_physical_id_entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK))
return;
/*
@@ -1102,8 +1111,14 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
avic_update_iommu_vcpu_affinity(vcpu, -1, 0);
- entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
- WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+ svm->avic_physical_id_entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
+
+ /*
+ * Do not update the actual physical id table entry
+ * See explanation in avic_vcpu_load
+ */
+ if (!avic_zen2_errata_workaround)
+ WRITE_ONCE(*(svm->avic_physical_id_cache), svm->avic_physical_id_entry);
spin_unlock_irqrestore(&svm->ir_list_lock, flags);
@@ -1217,5 +1232,17 @@ bool avic_hardware_setup(void)
amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
+ if (avic_zen2_errata_workaround == -1) {
+
+ /* Assume that Zen1 and Zen2 have errata #1235 */
+ if (boot_cpu_data.x86 == 0x17)
+ avic_zen2_errata_workaround = 1;
+ else
+ avic_zen2_errata_workaround = 0;
+ }
+
+ if (avic_zen2_errata_workaround)
+ pr_info("Workaround for AVIC errata #1235 is enabled\n");
+
return true;
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index be67ab7fdd104e3..98dc45b9c194d2e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -265,6 +265,7 @@ struct vcpu_svm {
u32 ldr_reg;
u32 dfr_reg;
struct page *avic_backing_page;
+ u64 avic_physical_id_entry;
u64 *avic_physical_id_cache;
/*
--
2.26.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception
2023-09-28 17:33 ` [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception Maxim Levitsky
@ 2023-09-29 0:24 ` Sean Christopherson
2023-10-03 3:17 ` Suthikulpanit, Suravee
0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2023-09-29 0:24 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy,
stable
On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> The following problem exists since x2avic was enabled in the KVM:
>
> svm_set_x2apic_msr_interception is called to enable the interception of
Nit, svm_set_x2apic_msr_interception().
Definitely not worth another version though.
> the x2apic msrs.
>
> In particular it is called at the moment the guest resets its apic.
>
> Assuming that the guest's apic was in x2apic mode, the reset will bring
> it back to the xapic mode.
>
> The svm_set_x2apic_msr_interception however has an erroneous check for
> '!apic_x2apic_mode()' which prevents it from doing anything in this case.
>
> As a result of this, all x2apic msrs are left unintercepted, and that
> exposes the bare metal x2apic (if enabled) to the guest.
> Oops.
>
> Remove the erroneous '!apic_x2apic_mode()' check to fix that.
>
> This fixes CVE-2023-5090
>
> Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
> Cc: stable@vger.kernel.org
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
Reviewed-by: Sean Christopherson <seanjc@google.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception
2023-09-28 17:33 ` [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception Maxim Levitsky
@ 2023-09-29 0:42 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2023-09-29 0:42 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy,
stable
On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> In later revisions of AMD's APM, there is a new 'incomplete IPI' exit code:
>
> "Invalid IPI Vector - The vector for the specified IPI was set to an
> illegal value (VEC < 16)"
>
> Note that tests on Zen2 machine show that this VM exit doesn't happen and
> instead AVIC just does nothing.
>
> Add support for this exit code by doing nothing, instead of filling
> the kernel log with errors.
>
> Also replace an unthrottled 'pr_err()' if another unknown incomplete
> IPI exit happens with vcpu_unimpl()
>
> (e.g in case AMD adds yet another 'Invalid IPI' exit reason)
>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
Reviewed-by: Sean Christopherson <seanjc@google.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
2023-09-28 17:33 ` [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested() Maxim Levitsky
@ 2023-09-29 0:42 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2023-09-29 0:42 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy,
stable
On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> svm_leave_nested() similar to a nested VM exit, get the vCPU out of nested
> mode and thus should end the local inhibition of AVIC on this vCPU.
>
> Failure to do so, can lead to hangs on guest reboot.
>
> Raise the KVM_REQ_APICV_UPDATE request to refresh the AVIC state of the
> current vCPU in this case.
>
> Fixes: f44509f849fe ("KVM: x86: SVM: allow AVIC to co-exist with a nested guest running")
> Cc: stable@vger.kernel.org
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
Reviewed-by: Sean Christopherson <seanjc@google.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235
2023-09-28 17:33 ` [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235 Maxim Levitsky
@ 2023-09-29 2:06 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2023-09-29 2:06 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy
[-- Attachment #1: Type: text/plain, Size: 2482 bytes --]
On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 4b74ea91f4e6bb6..28bb0e6b321660d 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -62,6 +62,9 @@ static_assert(__AVIC_GATAG(AVIC_VM_ID_MASK, AVIC_VCPU_ID_MASK) == -1u);
> static bool force_avic;
> module_param_unsafe(force_avic, bool, 0444);
>
> +static int avic_zen2_errata_workaround = -1;
> +module_param(avic_zen2_errata_workaround, int, 0444);
> +
> /* Note:
> * This hash table is used to map VM_ID to a struct kvm_svm,
> * when handling AMD IOMMU GALOG notification to schedule in
> @@ -276,7 +279,7 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
>
> static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> {
> - u64 *entry, new_entry;
> + u64 *entry;
> int id = vcpu->vcpu_id;
> struct vcpu_svm *svm = to_svm(vcpu);
>
> @@ -308,10 +311,10 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> if (!entry)
> return -EINVAL;
>
> - new_entry = __sme_set((page_to_phys(svm->avic_backing_page) &
> - AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK) |
> - AVIC_PHYSICAL_ID_ENTRY_VALID_MASK);
> - WRITE_ONCE(*entry, new_entry);
> + svm->avic_physical_id_entry = __sme_set((page_to_phys(svm->avic_backing_page) &
> + AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK) |
> + AVIC_PHYSICAL_ID_ENTRY_VALID_MASK);
> + WRITE_ONCE(*entry, svm->avic_physical_id_entry);
Aha! Rather than deal with the dummy entry at runtime, simply point the pointer
at the dummy entry during setup.
And instead of adding a dedicated erratum param, let's piggyback VMX's enable_ipiv.
It's not a true disable, but IMO it's close enough. That will make the param
much more self-documenting, and won't feel so awkward if someone wants to disable
IPI virtualization for other reasons.
Then we can do this in three steps:
1. Move enable_ipiv to common code
2. Let userspace disable enable_ipiv for SVM+AVIC
3. Disable enable_ipiv for affected CPUs
The biggest downside to using enable_ipiv is that a the "auto" behavior for the
erratum will be a bit ugly, but that's a solvable problem.
If you've no objection to the above approach, I'll post the attached patches along
with a massaged version of this patch.
The attached patches apply on top of an AVIC clean[*], which (shameless plug)
could use a review ;-)
[*] https://lore.kernel.org/all/20230815213533.548732-1-seanjc@google.com
[-- Attachment #2: 0001-KVM-VMX-Move-enable_ipiv-knob-to-common-x86.patch --]
[-- Type: text/x-diff, Size: 2651 bytes --]
From 4990d0e56b1e9bb8bf97502d525779b2a43d26d4 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 28 Sep 2023 17:22:52 -0700
Subject: [PATCH 1/2] KVM: VMX: Move enable_ipiv knob to common x86
Move enable_ipiv to common x86 so that it can be reused by SVM to control
IPI virtualization when AVIC is enabled. SVM doesn't actually provide a
way to truly disable IPI virtualization, but KVM can get close enough by
skipping the necessary table programming.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx/capabilities.h | 1 -
arch/x86/kvm/vmx/vmx.c | 2 --
arch/x86/kvm/x86.c | 3 +++
4 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e9e69009789e..7239155213c7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1806,6 +1806,7 @@ extern u32 __read_mostly kvm_nr_uret_msrs;
extern u64 __read_mostly host_efer;
extern bool __read_mostly allow_smaller_maxphyaddr;
extern bool __read_mostly enable_apicv;
+extern bool __read_mostly enable_ipiv;
extern struct kvm_x86_ops kvm_x86_ops;
#define KVM_X86_OP(func) \
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 41a4533f9989..8cbfef64ea75 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -15,7 +15,6 @@ extern bool __read_mostly enable_ept;
extern bool __read_mostly enable_unrestricted_guest;
extern bool __read_mostly enable_ept_ad_bits;
extern bool __read_mostly enable_pml;
-extern bool __read_mostly enable_ipiv;
extern int __read_mostly pt_mode;
#define PT_MODE_SYSTEM 0
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 72e3943f3693..f51dac6b21ae 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -104,8 +104,6 @@ static bool __read_mostly fasteoi = 1;
module_param(fasteoi, bool, S_IRUGO);
module_param(enable_apicv, bool, S_IRUGO);
-
-bool __read_mostly enable_ipiv = true;
module_param(enable_ipiv, bool, 0444);
/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6573c89c35a9..ccf5aa4fbe73 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -235,6 +235,9 @@ EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr);
bool __read_mostly enable_apicv = true;
EXPORT_SYMBOL_GPL(enable_apicv);
+bool __read_mostly enable_ipiv = true;
+EXPORT_SYMBOL_GPL(enable_ipiv);
+
u64 __read_mostly host_xss;
EXPORT_SYMBOL_GPL(host_xss);
base-commit: ca3beed3b49348748201a2a35888b49858ce5d73
--
2.42.0.582.g8ccd20d70d-goog
[-- Attachment #3: 0002-KVM-SVM-Add-enable_ipiv-param-skip-physical-ID-progr.patch --]
[-- Type: text/x-diff, Size: 3225 bytes --]
From fb86a56d11eac07626ffd9defeff39b88dbf6406 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 28 Sep 2023 17:25:48 -0700
Subject: [PATCH 2/2] KVM: SVM: Add enable_ipiv param, skip physical ID
programming if disabled
Let userspace "disable" IPI virtualization via an enable_ipiv module param
by programming a dummy entry instead of the vCPU's actual backing entry in
the physical ID table. SVM doesn't provide a way to actually disable IPI
virtualization in hardware, but by leaving all entries blank, every IPI in
the guest (except for self-IPIs) will generate a VM-Exit.
Providing a way to effectively disable IPI virtualization will allow KVM
to safely enable AVIC on hardware that is suseptible to erratum #1235,
which causes hardware to sometimes fail to detect that the IsRunning bit
has been cleared by software.
All credit goes to Maxim for the idea!
Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/svm/avic.c | 15 ++++++++++++++-
arch/x86/kvm/svm/svm.c | 3 +++
arch/x86/kvm/svm/svm.h | 1 +
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index fa87b6853f1d..fc804bb84394 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -310,7 +310,20 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
AVIC_PHYSICAL_ID_ENTRY_VALID_MASK;
WRITE_ONCE(table[id], new_entry);
- svm->avic_physical_id_entry = &table[id];
+ /*
+ * IPI virtualization is bundled with AVIC, but effectively can be
+ * disabled simply by never marking vCPUs as running in the physical ID
+ * table. Use a dummy entry to avoid conditionals in the runtime code,
+ * and to keep the IOMMU coordination logic as simple as possible. The
+ * entry in the table also needs to be valid (see above), otherwise KVM
+ * will ignore IPIs due to thinking the target doesn't exist.
+ */
+ if (enable_ipiv) {
+ svm->avic_physical_id_entry = &table[id];
+ } else {
+ svm->ipiv_disabled_backing_entry = table[id];
+ svm->avic_physical_id_entry = &svm->ipiv_disabled_backing_entry;
+ }
return 0;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index acdd0b89e471..bc40ffb5c47c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -227,6 +227,8 @@ module_param(tsc_scaling, int, 0444);
static bool avic;
module_param(avic, bool, 0444);
+module_param(enable_ipiv, bool, 0444);
+
bool __read_mostly dump_invalid_vmcb;
module_param(dump_invalid_vmcb, bool, 0644);
@@ -5252,6 +5254,7 @@ static __init int svm_hardware_setup(void)
enable_apicv = avic = avic && avic_hardware_setup();
if (!enable_apicv) {
+ enable_ipiv = false;
svm_x86_ops.vcpu_blocking = NULL;
svm_x86_ops.vcpu_unblocking = NULL;
svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 147516617f88..7a1fc9325d74 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -264,6 +264,7 @@ struct vcpu_svm {
u32 ldr_reg;
u32 dfr_reg;
+ u64 ipiv_disabled_backing_entry;
u64 *avic_physical_id_entry;
/*
--
2.42.0.582.g8ccd20d70d-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 0/4] AVIC bugfixes and workarounds
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
` (3 preceding siblings ...)
2023-09-28 17:33 ` [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235 Maxim Levitsky
@ 2023-09-29 2:09 ` Sean Christopherson
2023-09-29 17:42 ` Paolo Bonzini
2023-10-12 14:46 ` Paolo Bonzini
5 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2023-09-29 2:09 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy
On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> Maxim Levitsky (4):
> x86: KVM: SVM: always update the x2avic msr interception
> x86: KVM: SVM: add support for Invalid IPI Vector interception
> x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
Paolo, I assume you'll take the first three directly for 6.6?
> x86: KVM: SVM: workaround for AVIC's errata #1235
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 0/4] AVIC bugfixes and workarounds
2023-09-29 2:09 ` [PATCH v2 0/4] AVIC bugfixes and workarounds Sean Christopherson
@ 2023-09-29 17:42 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2023-09-29 17:42 UTC (permalink / raw)
To: Sean Christopherson
Cc: Maxim Levitsky, kvm, iommu, H. Peter Anvin, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy
On Fri, Sep 29, 2023 at 4:09 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> > Maxim Levitsky (4):
> > x86: KVM: SVM: always update the x2avic msr interception
> > x86: KVM: SVM: add support for Invalid IPI Vector interception
> > x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
>
> Paolo, I assume you'll take the first three directly for 6.6?
Yes.
Paolo
> > x86: KVM: SVM: workaround for AVIC's errata #1235
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception
2023-09-29 0:24 ` Sean Christopherson
@ 2023-10-03 3:17 ` Suthikulpanit, Suravee
0 siblings, 0 replies; 13+ messages in thread
From: Suthikulpanit, Suravee @ 2023-10-03 3:17 UTC (permalink / raw)
To: Sean Christopherson, Maxim Levitsky
Cc: kvm, iommu, H. Peter Anvin, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, linux-kernel, Dave Hansen,
Will Deacon, Ingo Molnar, Robin Murphy, stable
Maxim,
Thanks for finding and fixing this.
On 9/29/2023 7:24 AM, Sean Christopherson wrote:
> On Thu, Sep 28, 2023, Maxim Levitsky wrote:
>> The following problem exists since x2avic was enabled in the KVM:
>>
>> svm_set_x2apic_msr_interception is called to enable the interception of
>
> Nit, svm_set_x2apic_msr_interception().
>
> Definitely not worth another version though.
>
>> the x2apic msrs.
>>
>> In particular it is called at the moment the guest resets its apic.
>>
>> Assuming that the guest's apic was in x2apic mode, the reset will bring
>> it back to the xapic mode.
>>
>> The svm_set_x2apic_msr_interception however has an erroneous check for
>> '!apic_x2apic_mode()' which prevents it from doing anything in this case.
>>
>> As a result of this, all x2apic msrs are left unintercepted, and that
>> exposes the bare metal x2apic (if enabled) to the guest.
>> Oops.
>>
>> Remove the erroneous '!apic_x2apic_mode()' check to fix that.
>>
>> This fixes CVE-2023-5090
>>
>> Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>> ---
>
> Reviewed-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 0/4] AVIC bugfixes and workarounds
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
` (4 preceding siblings ...)
2023-09-29 2:09 ` [PATCH v2 0/4] AVIC bugfixes and workarounds Sean Christopherson
@ 2023-10-12 14:46 ` Paolo Bonzini
5 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2023-10-12 14:46 UTC (permalink / raw)
To: Maxim Levitsky
Cc: kvm, iommu, H . Peter Anvin, Sean Christopherson, Thomas Gleixner,
Borislav Petkov, Joerg Roedel, x86, Suravee Suthikulpanit,
linux-kernel, Dave Hansen, Will Deacon, Ingo Molnar, Robin Murphy
Queued patches 1-3, thanks.
Paolo
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-10-12 14:47 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-28 17:33 [PATCH v2 0/4] AVIC bugfixes and workarounds Maxim Levitsky
2023-09-28 17:33 ` [PATCH v2 1/4] x86: KVM: SVM: always update the x2avic msr interception Maxim Levitsky
2023-09-29 0:24 ` Sean Christopherson
2023-10-03 3:17 ` Suthikulpanit, Suravee
2023-09-28 17:33 ` [PATCH v2 2/4] x86: KVM: SVM: add support for Invalid IPI Vector interception Maxim Levitsky
2023-09-29 0:42 ` Sean Christopherson
2023-09-28 17:33 ` [PATCH v2 3/4] x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested() Maxim Levitsky
2023-09-29 0:42 ` Sean Christopherson
2023-09-28 17:33 ` [PATCH v2 4/4] x86: KVM: SVM: workaround for AVIC's errata #1235 Maxim Levitsky
2023-09-29 2:06 ` Sean Christopherson
2023-09-29 2:09 ` [PATCH v2 0/4] AVIC bugfixes and workarounds Sean Christopherson
2023-09-29 17:42 ` Paolo Bonzini
2023-10-12 14:46 ` Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).