* [PATCH 0/3] Unify IBRS virtualization
@ 2025-02-21 16:33 Yosry Ahmed
2025-02-21 16:33 ` [PATCH 1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE Yosry Ahmed
From: Yosry Ahmed @ 2025-02-21 16:33 UTC (permalink / raw)
To: x86, Sean Christopherson
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, Kaplan, David, kvm,
linux-kernel, Yosry Ahmed
To properly virtualize IBRS on Intel, an IBPB is executed on emulated
VM-exits to provide separate predictor modes for L1 and L2.
Similar handling is theoretically needed for AMD, unless IbrsSameMode is
enumerated by the CPU (which should be the case for most/all CPUs
anyway). For correctness and clarity, this series generalizes the
handling to apply for both Intel and AMD as needed.
I am not sure whether this series should land through the kvm-x86 tree
or the tip/x86 tree.
Yosry Ahmed (3):
x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE
KVM: x86: Propagate AMD's IbrsSameMode to the guest
KVM: x86: Generalize IBRS virtualization on emulated VM-exit
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kvm/cpuid.c | 1 +
arch/x86/kvm/svm/nested.c | 2 ++
arch/x86/kvm/vmx/nested.c | 11 +----------
arch/x86/kvm/x86.h | 18 ++++++++++++++++++
tools/arch/x86/include/asm/cpufeatures.h | 1 +
6 files changed, 24 insertions(+), 10 deletions(-)
--
2.48.1.601.g30ceb7b040-goog
* [PATCH 1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
@ 2025-02-21 16:33 ` Yosry Ahmed
2025-02-21 16:33 ` [PATCH 2/3] KVM: x86: Propagate AMD's IbrsSameMode to the guest Yosry Ahmed
From: Yosry Ahmed @ 2025-02-21 16:33 UTC (permalink / raw)
To: x86, Sean Christopherson
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, Kaplan, David, kvm,
linux-kernel, Yosry Ahmed
Per the APM [1]:
Some processors, identified by CPUID Fn8000_0008_EBX[IbrsSameMode]
(bit 19) = 1, provide additional speculation limits. For these
processors, when IBRS is set, indirect branch predictions are not
influenced by any prior indirect branches, regardless of mode (CPL
and guest/host) and regardless of whether the prior indirect branches
occurred before or after the setting of IBRS. This is referred to as
Same Mode IBRS.
Define this feature bit, which will be used by KVM to determine if an
IBPB is required on nested VM-exits in SVM.
[1] AMD64 Architecture Programmer's Manual Pub. 40332, Rev 4.08 - April
2024, Volume 2, 3.2.9 Speculation Control MSRs
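For illustration, the enumeration described above can also be read from
user space via CPUID leaf 0x8000_0008; the sketch below is not part of
the patch, and the helper names are hypothetical:

```c
#include <stdbool.h>
#include <cpuid.h>	/* GCC/Clang __get_cpuid_count() */

/*
 * CPUID Fn8000_0008_EBX[IbrsSameMode] is bit 19, per the APM section
 * cited above.
 */
static bool ibrs_same_mode(unsigned int ebx)
{
	return (ebx >> 19) & 1;
}

/* Query the CPU; returns false if the extended leaf is unavailable. */
static bool cpu_has_ibrs_same_mode(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(0x80000008, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return ibrs_same_mode(ebx);
}
```

The kernel patch below does the equivalent by defining the feature bit
in word 13 (CPUID 0x8000_0008 EBX), so the common CPUID code picks it up
automatically.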
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
arch/x86/include/asm/cpufeatures.h | 1 +
tools/arch/x86/include/asm/cpufeatures.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 645aa360628da..46af88357ac89 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
#define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */
#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */
#define X86_FEATURE_AMD_STIBP_ALWAYS_ON (13*32+17) /* Single Thread Indirect Branch Predictors always-on preferred */
+#define X86_FEATURE_AMD_IBRS_SAME_MODE (13*32+19) /* Indirect Branch Restricted Speculation same mode protection */
#define X86_FEATURE_AMD_PPIN (13*32+23) /* "amd_ppin" Protected Processor Inventory Number */
#define X86_FEATURE_AMD_SSBD (13*32+24) /* Speculative Store Bypass Disable */
#define X86_FEATURE_VIRT_SSBD (13*32+25) /* "virt_ssbd" Virtualized Speculative Store Bypass Disable */
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 17b6590748c00..c99d626357c1e 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
#define X86_FEATURE_AMD_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */
#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */
#define X86_FEATURE_AMD_STIBP_ALWAYS_ON (13*32+17) /* Single Thread Indirect Branch Predictors always-on preferred */
+#define X86_FEATURE_AMD_IBRS_SAME_MODE (13*32+19) /* Indirect Branch Restricted Speculation same mode protection */
#define X86_FEATURE_AMD_PPIN (13*32+23) /* "amd_ppin" Protected Processor Inventory Number */
#define X86_FEATURE_AMD_SSBD (13*32+24) /* Speculative Store Bypass Disable */
#define X86_FEATURE_VIRT_SSBD (13*32+25) /* "virt_ssbd" Virtualized Speculative Store Bypass Disable */
--
2.48.1.601.g30ceb7b040-goog
* [PATCH 2/3] KVM: x86: Propagate AMD's IbrsSameMode to the guest
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
2025-02-21 16:33 ` [PATCH 1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE Yosry Ahmed
@ 2025-02-21 16:33 ` Yosry Ahmed
2025-02-21 16:33 ` [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit Yosry Ahmed
From: Yosry Ahmed @ 2025-02-21 16:33 UTC (permalink / raw)
To: x86, Sean Christopherson
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, Kaplan, David, kvm,
linux-kernel, Yosry Ahmed
If IBRS provides same mode (kernel/user or host/guest) protection on the
host, then by definition it also provides same mode protection in the
guest. In fact, all of the guest's different modes are a single mode from
the host's perspective anyway.
Propagate IbrsSameMode to the guests.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
arch/x86/kvm/cpuid.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index edef30359c198..05d7bbfbb8885 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -1103,6 +1103,7 @@ void kvm_set_cpu_caps(void)
F(AMD_SSB_NO),
F(AMD_STIBP),
F(AMD_STIBP_ALWAYS_ON),
+ F(AMD_IBRS_SAME_MODE),
F(AMD_PSFD),
F(AMD_IBPB_RET),
);
--
2.48.1.601.g30ceb7b040-goog
* [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
2025-02-21 16:33 ` [PATCH 1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE Yosry Ahmed
2025-02-21 16:33 ` [PATCH 2/3] KVM: x86: Propagate AMD's IbrsSameMode to the guest Yosry Ahmed
@ 2025-02-21 16:33 ` Yosry Ahmed
2025-02-21 17:59 ` Jim Mattson
2025-03-13 21:23 ` [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
2025-04-25 22:37 ` Sean Christopherson
From: Yosry Ahmed @ 2025-02-21 16:33 UTC (permalink / raw)
To: x86, Sean Christopherson
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, Kaplan, David, kvm,
linux-kernel, Yosry Ahmed
Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
properly virtualize IBRS by providing separate predictor modes for L1
and L2.
AMD requires similar handling, except when IbrsSameMode is enumerated by
the host CPU (which is the case on most/all AMD CPUs). With
IbrsSameMode, hardware IBRS is sufficient and no extra handling is
needed from KVM.
Generalize the handling in nested_vmx_vmexit() by moving it into a
generic function, add the AMD handling, and use it in
nested_svm_vmexit() too. The main reason for using a generic function is
to have a single place to park the huge comment about virtualizing IBRS.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
arch/x86/kvm/svm/nested.c | 2 ++
arch/x86/kvm/vmx/nested.c | 11 +----------
arch/x86/kvm/x86.h | 18 ++++++++++++++++++
3 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index d77b094d9a4d6..61b73ff30807e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
+ kvm_nested_vmexit_handle_spec_ctrl(vcpu);
+
svm_switch_vmcb(svm, &svm->vmcb01);
/*
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 8a7af02d466e9..453d52a6e836a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
vmx_switch_vmcs(vcpu, &vmx->vmcs01);
- /*
- * If IBRS is advertised to the vCPU, KVM must flush the indirect
- * branch predictors when transitioning from L2 to L1, as L1 expects
- * hardware (KVM in this case) to provide separate predictor modes.
- * Bare metal isolates VMX root (host) from VMX non-root (guest), but
- * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
- * separate modes for L2 vs L1.
- */
- if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
- indirect_branch_prediction_barrier();
+ kvm_nested_vmexit_handle_spec_ctrl(vcpu);
/* Update any VMCS fields that might have changed while L2 ran */
vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 7a87c5fc57f1b..008c8d381c253 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
kvm_x86_ops.nested_ops->leave_nested(vcpu);
}
+/*
+ * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
+ * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
+ * this case) to provide separate predictor modes. Bare metal isolates the host
+ * from the guest, but doesn't isolate different guests from one another (in
+ * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
+ * which offers protection within the same mode, and hence protects L1 from L2.
+ */
+static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
+{
+ if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
+ return;
+
+ if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
+ guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
+ indirect_branch_prediction_barrier();
+}
+
static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
{
return vcpu->arch.last_vmentry_cpu != -1;
--
2.48.1.601.g30ceb7b040-goog
* Re: [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 16:33 ` [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit Yosry Ahmed
@ 2025-02-21 17:59 ` Jim Mattson
2025-02-21 18:39 ` Yosry Ahmed
From: Jim Mattson @ 2025-02-21 17:59 UTC (permalink / raw)
To: Yosry Ahmed
Cc: x86, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Paolo Bonzini,
Kaplan, David, kvm, linux-kernel
On Fri, Feb 21, 2025 at 8:34 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
> guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
> properly virtualize IBRS by providing separate predictor modes for L1
> and L2.
>
> AMD requires similar handling, except when IbrsSameMode is enumerated by
> the host CPU (which is the case on most/all AMD CPUs). With
> IbrsSameMode, hardware IBRS is sufficient and no extra handling is
> needed from KVM.
>
> Generalize the handling in nested_vmx_vmexit() by moving it into a
> generic function, add the AMD handling, and use it in
> nested_svm_vmexit() too. The main reason for using a generic function is
> to have a single place to park the huge comment about virtualizing IBRS.
>
> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> ---
> arch/x86/kvm/svm/nested.c | 2 ++
> arch/x86/kvm/vmx/nested.c | 11 +----------
> arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> 3 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index d77b094d9a4d6..61b73ff30807e 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>
> nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
>
> + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> +
> svm_switch_vmcb(svm, &svm->vmcb01);
>
> /*
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 8a7af02d466e9..453d52a6e836a 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
>
> vmx_switch_vmcs(vcpu, &vmx->vmcs01);
>
> - /*
> - * If IBRS is advertised to the vCPU, KVM must flush the indirect
> - * branch predictors when transitioning from L2 to L1, as L1 expects
> - * hardware (KVM in this case) to provide separate predictor modes.
> - * Bare metal isolates VMX root (host) from VMX non-root (guest), but
> - * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
> - * separate modes for L2 vs L1.
> - */
> - if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
> - indirect_branch_prediction_barrier();
> + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
>
> /* Update any VMCS fields that might have changed while L2 ran */
> vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 7a87c5fc57f1b..008c8d381c253 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
> kvm_x86_ops.nested_ops->leave_nested(vcpu);
> }
>
> +/*
> + * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
> + * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
> + * this case) to provide separate predictor modes. Bare metal isolates the host
> + * from the guest, but doesn't isolate different guests from one another (in
> + * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
> + * which offers protection within the same mode, and hence protects L1 from L2.
> + */
> +static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
Maybe just kvm_nested_vmexit_handle_ibrs?
> +{
> + if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
> + return;
> +
> + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> + guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
This is a bit conservative, but I don't think there's any ROI in being
more pedantic.
For the series,
Reviewed-by: Jim Mattson <jmattson@google.com>
> + indirect_branch_prediction_barrier();
> +}
> +
> static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
> {
> return vcpu->arch.last_vmentry_cpu != -1;
> --
> 2.48.1.601.g30ceb7b040-goog
>
* Re: [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 17:59 ` Jim Mattson
@ 2025-02-21 18:39 ` Yosry Ahmed
2025-02-21 19:02 ` Jim Mattson
From: Yosry Ahmed @ 2025-02-21 18:39 UTC (permalink / raw)
To: Jim Mattson
Cc: x86, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Paolo Bonzini,
Kaplan, David, kvm, linux-kernel
On Fri, Feb 21, 2025 at 09:59:04AM -0800, Jim Mattson wrote:
> On Fri, Feb 21, 2025 at 8:34 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> >
> > Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
> > guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
> > properly virtualize IBRS by providing separate predictor modes for L1
> > and L2.
> >
> > AMD requires similar handling, except when IbrsSameMode is enumerated by
> > the host CPU (which is the case on most/all AMD CPUs). With
> > IbrsSameMode, hardware IBRS is sufficient and no extra handling is
> > needed from KVM.
> >
> > Generalize the handling in nested_vmx_vmexit() by moving it into a
> > generic function, add the AMD handling, and use it in
> > nested_svm_vmexit() too. The main reason for using a generic function is
> > to have a single place to park the huge comment about virtualizing IBRS.
> >
> > Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> > ---
> > arch/x86/kvm/svm/nested.c | 2 ++
> > arch/x86/kvm/vmx/nested.c | 11 +----------
> > arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> > 3 files changed, 21 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index d77b094d9a4d6..61b73ff30807e 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> >
> > nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
> >
> > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > +
> > svm_switch_vmcb(svm, &svm->vmcb01);
> >
> > /*
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 8a7af02d466e9..453d52a6e836a 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> >
> > vmx_switch_vmcs(vcpu, &vmx->vmcs01);
> >
> > - /*
> > - * If IBRS is advertised to the vCPU, KVM must flush the indirect
> > - * branch predictors when transitioning from L2 to L1, as L1 expects
> > - * hardware (KVM in this case) to provide separate predictor modes.
> > - * Bare metal isolates VMX root (host) from VMX non-root (guest), but
> > - * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
> > - * separate modes for L2 vs L1.
> > - */
> > - if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > - indirect_branch_prediction_barrier();
> > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> >
> > /* Update any VMCS fields that might have changed while L2 ran */
> > vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > index 7a87c5fc57f1b..008c8d381c253 100644
> > --- a/arch/x86/kvm/x86.h
> > +++ b/arch/x86/kvm/x86.h
> > @@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
> > kvm_x86_ops.nested_ops->leave_nested(vcpu);
> > }
> >
> > +/*
> > + * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
> > + * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
> > + * this case) to provide separate predictor modes. Bare metal isolates the host
> > + * from the guest, but doesn't isolate different guests from one another (in
> > + * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
> > + * which offers protection within the same mode, and hence protects L1 from L2.
> > + */
> > +static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
>
> Maybe just kvm_nested_vmexit_handle_ibrs?
I was trying to use a generic name to accommodate any future handling
needed for non-IBRS speculation control virtualization. But I could just
be overthinking it. Happy to take whatever name is agreed upon during
review.
>
> > +{
> > + if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
> > + return;
> > +
> > + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> > + guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
>
> This is a bit conservative, but I don't think there's any ROI in being
> more pedantic.
Could you elaborate on this?
Is this about doing the IBPB even if L1 does not actually execute an
IBRS? I thought about this for a bit, but otherwise we'd have to
intercept the MSR write IIUC, and I am not sure if that's better. Also,
that's what we are already doing so I just kept it as-is.
Or maybe about whether we need this on AMD only with AUTOIBRS? The APM
is a bit unclear to me in this regard, but I believe it may be needed even
for 'normal' IBRS.
>
> For the series,
>
> Reviewed-by: Jim Mattson <jmattson@google.com>
Thanks!
>
> > + indirect_branch_prediction_barrier();
> > +}
> > +
> > static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
> > {
> > return vcpu->arch.last_vmentry_cpu != -1;
> > --
> > 2.48.1.601.g30ceb7b040-goog
> >
* Re: [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 18:39 ` Yosry Ahmed
@ 2025-02-21 19:02 ` Jim Mattson
2025-02-21 19:25 ` Yosry Ahmed
From: Jim Mattson @ 2025-02-21 19:02 UTC (permalink / raw)
To: Yosry Ahmed
Cc: x86, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Paolo Bonzini,
Kaplan, David, kvm, linux-kernel
On Fri, Feb 21, 2025 at 10:39 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> On Fri, Feb 21, 2025 at 09:59:04AM -0800, Jim Mattson wrote:
> > On Fri, Feb 21, 2025 at 8:34 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > >
> > > Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
> > > guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
> > > properly virtualize IBRS by providing separate predictor modes for L1
> > > and L2.
> > >
> > > AMD requires similar handling, except when IbrsSameMode is enumerated by
> > > the host CPU (which is the case on most/all AMD CPUs). With
> > > IbrsSameMode, hardware IBRS is sufficient and no extra handling is
> > > needed from KVM.
> > >
> > > Generalize the handling in nested_vmx_vmexit() by moving it into a
> > > generic function, add the AMD handling, and use it in
> > > nested_svm_vmexit() too. The main reason for using a generic function is
> > > to have a single place to park the huge comment about virtualizing IBRS.
> > >
> > > Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> > > ---
> > > arch/x86/kvm/svm/nested.c | 2 ++
> > > arch/x86/kvm/vmx/nested.c | 11 +----------
> > > arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> > > 3 files changed, 21 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > index d77b094d9a4d6..61b73ff30807e 100644
> > > --- a/arch/x86/kvm/svm/nested.c
> > > +++ b/arch/x86/kvm/svm/nested.c
> > > @@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> > >
> > > nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
> > >
> > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > > +
> > > svm_switch_vmcb(svm, &svm->vmcb01);
> > >
> > > /*
> > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > index 8a7af02d466e9..453d52a6e836a 100644
> > > --- a/arch/x86/kvm/vmx/nested.c
> > > +++ b/arch/x86/kvm/vmx/nested.c
> > > @@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > >
> > > vmx_switch_vmcs(vcpu, &vmx->vmcs01);
> > >
> > > - /*
> > > - * If IBRS is advertised to the vCPU, KVM must flush the indirect
> > > - * branch predictors when transitioning from L2 to L1, as L1 expects
> > > - * hardware (KVM in this case) to provide separate predictor modes.
> > > - * Bare metal isolates VMX root (host) from VMX non-root (guest), but
> > > - * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
> > > - * separate modes for L2 vs L1.
> > > - */
> > > - if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > > - indirect_branch_prediction_barrier();
> > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > >
> > > /* Update any VMCS fields that might have changed while L2 ran */
> > > vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> > > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > > index 7a87c5fc57f1b..008c8d381c253 100644
> > > --- a/arch/x86/kvm/x86.h
> > > +++ b/arch/x86/kvm/x86.h
> > > @@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
> > > kvm_x86_ops.nested_ops->leave_nested(vcpu);
> > > }
> > >
> > > +/*
> > > + * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
> > > + * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
> > > + * this case) to provide separate predictor modes. Bare metal isolates the host
> > > + * from the guest, but doesn't isolate different guests from one another (in
> > > + * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
> > > + * which offers protection within the same mode, and hence protects L1 from L2.
> > > + */
> > > +static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
> >
> > Maybe just kvm_nested_vmexit_handle_ibrs?
>
> I was trying to use a generic name to accommodate any future handling
> needed for non-IBRS speculation control virtualization. But I could just
> be overthinking it. Happy to take whatever name is agreed upon during
> review.
>
> >
> > > +{
> > > + if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
> > > + return;
> > > +
> > > + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> > > + guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
> >
> > This is a bit conservative, but I don't think there's any ROI in being
> > more pedantic.
>
> Could you elaborate on this?
>
> Is this about doing the IBPB even if L1 does not actually execute an
> IBRS? I thought about this for a bit, but otherwise we'd have to
> intercept the MSR write IIUC, and I am not sure if that's better. Also,
> that's what we are already doing so I just kept it as-is.
>
> Or maybe about whether we need this on AMD only with AUTOIBRS? The APM
> is a bit unclear to me in this regard, but I believe it may be needed even
> for 'normal' IBRS.
If IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit, then this IBPB is
unnecessary.
However, since the host (L1) is running in a de-privileged prediction
domain, simply setting IA32_SPEC_CTRL.IBRS in the future won't protect
it from the guest (L2) that just exited. If we don't eagerly perform
an IBPB now, then L0 would have to intercept WRMSR(IA32_SPEC_CTRL)
from L1 so that we can issue an IBPB in the future, if L1 ever sets
IA32_SPEC_CTRL.IBRS.
Eagerly performing an IBPB now seems like the better option.
> >
> > For the series,
> >
> > Reviewed-by: Jim Mattson <jmattson@google.com>
>
> Thanks!
>
> >
> > > + indirect_branch_prediction_barrier();
> > > +}
> > > +
> > > static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
> > > {
> > > return vcpu->arch.last_vmentry_cpu != -1;
> > > --
> > > 2.48.1.601.g30ceb7b040-goog
> > >
* Re: [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 19:02 ` Jim Mattson
@ 2025-02-21 19:25 ` Yosry Ahmed
2025-02-21 19:52 ` Jim Mattson
From: Yosry Ahmed @ 2025-02-21 19:25 UTC (permalink / raw)
To: Jim Mattson
Cc: x86, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Paolo Bonzini,
Kaplan, David, kvm, linux-kernel
On Fri, Feb 21, 2025 at 11:02:16AM -0800, Jim Mattson wrote:
> On Fri, Feb 21, 2025 at 10:39 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> >
> > On Fri, Feb 21, 2025 at 09:59:04AM -0800, Jim Mattson wrote:
> > > On Fri, Feb 21, 2025 at 8:34 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > > >
> > > > Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
> > > > guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
> > > > properly virtualize IBRS by providing separate predictor modes for L1
> > > > and L2.
> > > >
> > > > AMD requires similar handling, except when IbrsSameMode is enumerated by
> > > > the host CPU (which is the case on most/all AMD CPUs). With
> > > > IbrsSameMode, hardware IBRS is sufficient and no extra handling is
> > > > needed from KVM.
> > > >
> > > > Generalize the handling in nested_vmx_vmexit() by moving it into a
> > > > generic function, add the AMD handling, and use it in
> > > > nested_svm_vmexit() too. The main reason for using a generic function is
> > > > to have a single place to park the huge comment about virtualizing IBRS.
> > > >
> > > > Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> > > > ---
> > > > arch/x86/kvm/svm/nested.c | 2 ++
> > > > arch/x86/kvm/vmx/nested.c | 11 +----------
> > > > arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> > > > 3 files changed, 21 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > index d77b094d9a4d6..61b73ff30807e 100644
> > > > --- a/arch/x86/kvm/svm/nested.c
> > > > +++ b/arch/x86/kvm/svm/nested.c
> > > > @@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> > > >
> > > > nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
> > > >
> > > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > > > +
> > > > svm_switch_vmcb(svm, &svm->vmcb01);
> > > >
> > > > /*
> > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > > index 8a7af02d466e9..453d52a6e836a 100644
> > > > --- a/arch/x86/kvm/vmx/nested.c
> > > > +++ b/arch/x86/kvm/vmx/nested.c
> > > > @@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > > >
> > > > vmx_switch_vmcs(vcpu, &vmx->vmcs01);
> > > >
> > > > - /*
> > > > - * If IBRS is advertised to the vCPU, KVM must flush the indirect
> > > > - * branch predictors when transitioning from L2 to L1, as L1 expects
> > > > - * hardware (KVM in this case) to provide separate predictor modes.
> > > > - * Bare metal isolates VMX root (host) from VMX non-root (guest), but
> > > > - * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
> > > > - * separate modes for L2 vs L1.
> > > > - */
> > > > - if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > > > - indirect_branch_prediction_barrier();
> > > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > > >
> > > > /* Update any VMCS fields that might have changed while L2 ran */
> > > > vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> > > > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > > > index 7a87c5fc57f1b..008c8d381c253 100644
> > > > --- a/arch/x86/kvm/x86.h
> > > > +++ b/arch/x86/kvm/x86.h
> > > > @@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
> > > > kvm_x86_ops.nested_ops->leave_nested(vcpu);
> > > > }
> > > >
> > > > +/*
> > > > + * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
> > > > + * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
> > > > + * this case) to provide separate predictor modes. Bare metal isolates the host
> > > > + * from the guest, but doesn't isolate different guests from one another (in
> > > > + * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
> > > > + * which offers protection within the same mode, and hence protects L1 from L2.
> > > > + */
> > > > +static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
> > >
> > > Maybe just kvm_nested_vmexit_handle_ibrs?
> >
> > I was trying to use a generic name to accommodate any future handling
> > needed for non-IBRS speculation control virtualization. But I could just
> > be overthinking it. Happy to take whatever name is agreed upon during
> > review.
> >
> > >
> > > > +{
> > > > + if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
> > > > + return;
> > > > +
> > > > + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> > > > + guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
> > >
> > > This is a bit conservative, but I don't think there's any ROI in being
> > > more pedantic.
> >
> > Could you elaborate on this?
> >
> > Is this about doing the IBPB even if L1 does not actually enable
> > IBRS? I thought about this for a bit, but otherwise we'd have to
> > intercept the MSR write IIUC, and I am not sure if that's better. Also,
> > that's what we are already doing so I just kept it as-is.
> >
> > Or maybe about whether we need this on AMD only with AUTOIBRS? The APM
> > is a bit unclear to me in this regard, but I believe it may be needed
> > even for 'normal' IBRS.
>
> If IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit, then this IBPB is
> unnecessary.
>
> However, since the host (L1) is running in a de-privileged prediction
> domain, simply setting IA32_SPEC_CTRL.IBRS in the future won't protect
> it from the guest (L2) that just exited. If we don't eagerly perform
> an IBPB now, then L0 would have to intercept WRMSR(IA32_SPEC_CTRL)
> from L1 so that we can issue an IBPB in the future, if L1 ever sets
> IA32_SPEC_CTRL.IBRS.
Right, that's what I meant by "we'd have to intercept the MSR write"
above, but I didn't put it as eloquently as you just did :)
We'd also need to have different handling for eIBRS/AUTOIBRS. It would
basically be:
if eIBRS/AUTOIBRS is enabled by L1:
- Do not intercept IBRS MSR writes
- Always IBPB on emulated VM-exits (unless IbrsSameMode).
else if IBRS is advertised to L1:
- Intercept IBRS MSR writes and do an IBPB.
- Do not IBPB on emulated VM-exits.
We'd basically have two modes of IBRS virtualization and we'd need to
switch between them at runtime according to L1's setting of
eIBRS/AUTOIBRS.
We can simplify it if we always intercept IBRS MSR writes assuming L1
won't do them with eIBRS/AUTOIBRS anyway, so this becomes:
- On emulated VM-exits, IBPB if eIBRS/AUTOIBRS is enabled (unless
IbrsSameMode).
- On IBRS MSR writes, do an IBPB.
Simpler, but not sure if it buys us much.
>
> Eagerly performing an IBPB now seems like the better option.
So yeah I definitely agree, unless we get regression reports caused by
the IBPB on emulated VM-exits, and the MSR write interception turns out
to be an improvement.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
2025-02-21 19:25 ` Yosry Ahmed
@ 2025-02-21 19:52 ` Jim Mattson
0 siblings, 0 replies; 13+ messages in thread
From: Jim Mattson @ 2025-02-21 19:52 UTC (permalink / raw)
To: Yosry Ahmed
Cc: x86, Sean Christopherson, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Paolo Bonzini,
Kaplan, David, kvm, linux-kernel
On Fri, Feb 21, 2025 at 11:26 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
>
> On Fri, Feb 21, 2025 at 11:02:16AM -0800, Jim Mattson wrote:
> > On Fri, Feb 21, 2025 at 10:39 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > >
> > > On Fri, Feb 21, 2025 at 09:59:04AM -0800, Jim Mattson wrote:
> > > > On Fri, Feb 21, 2025 at 8:34 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > > > >
> > > > > Commit 2e7eab81425a ("KVM: VMX: Execute IBPB on emulated VM-exit when
> > > > > guest has IBRS") added an IBPB in the emulated VM-exit path on Intel to
> > > > > properly virtualize IBRS by providing separate predictor modes for L1
> > > > > and L2.
> > > > >
> > > > > AMD requires similar handling, except when IbrsSameMode is enumerated by
> > > > > the host CPU (which is the case on most/all AMD CPUs). With
> > > > > IbrsSameMode, hardware IBRS is sufficient and no extra handling is
> > > > > needed from KVM.
> > > > >
> > > > > Generalize the handling in nested_vmx_vmexit() by moving it into a
> > > > > generic function, add the AMD handling, and use it in
> > > > > nested_svm_vmexit() too. The main reason for using a generic function is
> > > > > to have a single place to park the huge comment about virtualizing IBRS.
> > > > >
> > > > > Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
> > > > > ---
> > > > > arch/x86/kvm/svm/nested.c | 2 ++
> > > > > arch/x86/kvm/vmx/nested.c | 11 +----------
> > > > > arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> > > > > 3 files changed, 21 insertions(+), 10 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > > > > index d77b094d9a4d6..61b73ff30807e 100644
> > > > > --- a/arch/x86/kvm/svm/nested.c
> > > > > +++ b/arch/x86/kvm/svm/nested.c
> > > > > @@ -1041,6 +1041,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> > > > >
> > > > > nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
> > > > >
> > > > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > > > > +
> > > > > svm_switch_vmcb(svm, &svm->vmcb01);
> > > > >
> > > > > /*
> > > > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > > > > index 8a7af02d466e9..453d52a6e836a 100644
> > > > > --- a/arch/x86/kvm/vmx/nested.c
> > > > > +++ b/arch/x86/kvm/vmx/nested.c
> > > > > @@ -5018,16 +5018,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
> > > > >
> > > > > vmx_switch_vmcs(vcpu, &vmx->vmcs01);
> > > > >
> > > > > - /*
> > > > > - * If IBRS is advertised to the vCPU, KVM must flush the indirect
> > > > > - * branch predictors when transitioning from L2 to L1, as L1 expects
> > > > > - * hardware (KVM in this case) to provide separate predictor modes.
> > > > > - * Bare metal isolates VMX root (host) from VMX non-root (guest), but
> > > > > - * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
> > > > > - * separate modes for L2 vs L1.
> > > > > - */
> > > > > - if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > > > > - indirect_branch_prediction_barrier();
> > > > > + kvm_nested_vmexit_handle_spec_ctrl(vcpu);
> > > > >
> > > > > /* Update any VMCS fields that might have changed while L2 ran */
> > > > > vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
> > > > > diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> > > > > index 7a87c5fc57f1b..008c8d381c253 100644
> > > > > --- a/arch/x86/kvm/x86.h
> > > > > +++ b/arch/x86/kvm/x86.h
> > > > > @@ -116,6 +116,24 @@ static inline void kvm_leave_nested(struct kvm_vcpu *vcpu)
> > > > > kvm_x86_ops.nested_ops->leave_nested(vcpu);
> > > > > }
> > > > >
> > > > > +/*
> > > > > + * If IBRS is advertised to the vCPU, KVM must flush the indirect branch
> > > > > + * predictors when transitioning from L2 to L1, as L1 expects hardware (KVM in
> > > > > + * this case) to provide separate predictor modes. Bare metal isolates the host
> > > > > + * from the guest, but doesn't isolate different guests from one another (in
> > > > > + * this case L1 and L2). The exception is if bare metal supports same mode IBRS,
> > > > > + * which offers protection within the same mode, and hence protects L1 from L2.
> > > > > + */
> > > > > +static inline void kvm_nested_vmexit_handle_spec_ctrl(struct kvm_vcpu *vcpu)
> > > >
> > > > Maybe just kvm_nested_vmexit_handle_ibrs?
> > >
> > > I was trying to use a generic name to accommodate any future handling
> > > needed for non-IBRS speculation control virtualization. But I could just
> > > be overthinking. Happy to take whatever name is agreed upon during
> > > reviews.
> > >
> > > >
> > > > > +{
> > > > > + if (cpu_feature_enabled(X86_FEATURE_AMD_IBRS_SAME_MODE))
> > > > > + return;
> > > > > +
> > > > > + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SPEC_CTRL) ||
> > > > > + guest_cpu_cap_has(vcpu, X86_FEATURE_AMD_IBRS))
> > > >
> > > > This is a bit conservative, but I don't think there's any ROI in being
> > > > more pedantic.
> > >
> > > Could you elaborate on this?
> > >
> > > Is this about doing the IBPB even if L1 does not actually enable
> > > IBRS? I thought about this for a bit, but otherwise we'd have to
> > > intercept the MSR write IIUC, and I am not sure if that's better. Also,
> > > that's what we are already doing so I just kept it as-is.
> > >
> > > Or maybe about whether we need this on AMD only with AUTOIBRS? The APM
> > > is a bit unclear to me in this regard, but I believe it may be needed
> > > even for 'normal' IBRS.
> >
> > If IA32_SPEC_CTRL.IBRS is clear at emulated VM-exit, then this IBPB is
> > unnecessary.
> >
> > However, since the host (L1) is running in a de-privileged prediction
> > domain, simply setting IA32_SPEC_CTRL.IBRS in the future won't protect
> > it from the guest (L2) that just exited. If we don't eagerly perform
> > an IBPB now, then L0 would have to intercept WRMSR(IA32_SPEC_CTRL)
> > from L1 so that we can issue an IBPB in the future, if L1 ever sets
> > IA32_SPEC_CTRL.IBRS.
>
> Right, that's what I meant by "we'd have to intercept the MSR write"
> above, but I didn't put it as eloquently as you just did :)
>
> We'd also need to have different handling for eIBRS/AUTOIBRS. It would
> basically be:
>
> if eIBRS/AUTOIBRS is enabled by L1:
> - Do not intercept IBRS MSR writes
> - Always IBPB on emulated VM-exits (unless IbrsSameMode).
> else if IBRS is advertised to L1:
> - Intercept IBRS MSR writes and do an IBPB.
> - Do not IBPB on emulated VM-exits.
>
> We'd basically have two modes of IBRS virtualization and we'd need to
> switch between them at runtime according to L1's setting of
> eIBRS/AUTOIBRS.
>
> We can simplify it if we always intercept IBRS MSR writes assuming L1
> won't do them with eIBRS/AUTOIBRS anyway, so this becomes:
>
> - On emulated VM-exits, IBPB if eIBRS/AUTOIBRS is enabled (unless
> IbrsSameMode).
> - On IBRS MSR writes, do an IBPB.
>
> Simpler, but not sure if it buys us much.
>
> >
> > Eagerly performing an IBPB now seems like the better option.
>
> So yeah I definitely agree, unless we get regression reports caused by
> the IBPB on emulated VM-exits, and the MSR write interception turns out
> to be an improvement.
For "normal" configurations, I would expect L1 to set eIBRS or
autoIBRS on modern CPUs. In that case, the IBPB is required. However,
if L1 is a Linux OS booted with mitigations=off, the IBPB is probably
gratuitous.
On the Intel side, KVM tries to do some optimization of the case where
the guest doesn't use IA32_SPEC_CTRL. Since writes to IA32_SPEC_CTRL
are intercepted until the first non-zero value is written, I suppose
you could skip the IBPB when writes to IA32_SPEC_CTRL are intercepted,
and then just blindly perform an IBPB when disabling the intercept in
vmx_set_msr() if nested is enabled. Again, though, I don't think this
is worth it.
* Re: [PATCH 0/3] Unify IBRS virtualization
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
` (2 preceding siblings ...)
2025-02-21 16:33 ` [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit Yosry Ahmed
@ 2025-03-13 21:23 ` Yosry Ahmed
2025-03-26 19:48 ` Sean Christopherson
2025-04-25 22:37 ` Sean Christopherson
4 siblings, 1 reply; 13+ messages in thread
From: Yosry Ahmed @ 2025-03-13 21:23 UTC (permalink / raw)
To: x86, Sean Christopherson
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, Kaplan, David, kvm,
linux-kernel
On Fri, Feb 21, 2025 at 04:33:49PM +0000, Yosry Ahmed wrote:
> To properly virtualize IBRS on Intel, an IBPB is executed on emulated
> VM-exits to provide separate predictor modes for L1 and L2.
>
> Similar handling is theoretically needed for AMD, unless IbrsSameMode is
> enumerated by the CPU (which should be the case for most/all CPUs
> anyway). For correctness and clarity, this series generalizes the
> handling to apply for both Intel and AMD as needed.
>
> I am not sure if this series would land through the kvm-x86 tree or the
> tip/x86 tree.
Sean, any thoughts about this (or general feedback about this series)?
>
> Yosry Ahmed (3):
> x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE
> KVM: x86: Propagate AMD's IbrsSameMode to the guest
> KVM: x86: Generalize IBRS virtualization on emulated VM-exit
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kvm/cpuid.c | 1 +
> arch/x86/kvm/svm/nested.c | 2 ++
> arch/x86/kvm/vmx/nested.c | 11 +----------
> arch/x86/kvm/x86.h | 18 ++++++++++++++++++
> tools/arch/x86/include/asm/cpufeatures.h | 1 +
> 6 files changed, 24 insertions(+), 10 deletions(-)
>
> --
> 2.48.1.601.g30ceb7b040-goog
>
* Re: [PATCH 0/3] Unify IBRS virtualization
2025-03-13 21:23 ` [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
@ 2025-03-26 19:48 ` Sean Christopherson
2025-03-26 19:52 ` Yosry Ahmed
0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2025-03-26 19:48 UTC (permalink / raw)
To: Yosry Ahmed
Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, David Kaplan, kvm,
linux-kernel
On Thu, Mar 13, 2025, Yosry Ahmed wrote:
> On Fri, Feb 21, 2025 at 04:33:49PM +0000, Yosry Ahmed wrote:
> > To properly virtualize IBRS on Intel, an IBPB is executed on emulated
> > VM-exits to provide separate predictor modes for L1 and L2.
> >
> > Similar handling is theoretically needed for AMD, unless IbrsSameMode is
> > enumerated by the CPU (which should be the case for most/all CPUs
> > anyway). For correctness and clarity, this series generalizes the
> > handling to apply for both Intel and AMD as needed.
> >
> > I am not sure if this series would land through the kvm-x86 tree or the
> > tip/x86 tree.
>
> Sean, any thoughts about this (or general feedback about this series)?
No feedback, I just trust you and Jim to get mitigation stuff right far more
than I trust myself :-)
I'm planning on grabbing this for 6.16.
* Re: [PATCH 0/3] Unify IBRS virtualization
2025-03-26 19:48 ` Sean Christopherson
@ 2025-03-26 19:52 ` Yosry Ahmed
0 siblings, 0 replies; 13+ messages in thread
From: Yosry Ahmed @ 2025-03-26 19:52 UTC (permalink / raw)
To: Sean Christopherson
Cc: x86, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, David Kaplan, kvm,
linux-kernel
On Wed, Mar 26, 2025 at 12:48:53PM -0700, Sean Christopherson wrote:
> On Thu, Mar 13, 2025, Yosry Ahmed wrote:
> > On Fri, Feb 21, 2025 at 04:33:49PM +0000, Yosry Ahmed wrote:
> > > To properly virtualize IBRS on Intel, an IBPB is executed on emulated
> > > VM-exits to provide separate predictor modes for L1 and L2.
> > >
> > > Similar handling is theoretically needed for AMD, unless IbrsSameMode is
> > > enumerated by the CPU (which should be the case for most/all CPUs
> > > anyway). For correctness and clarity, this series generalizes the
> > > handling to apply for both Intel and AMD as needed.
> > >
> > > I am not sure if this series would land through the kvm-x86 tree or the
> > > tip/x86 tree.
> >
> > Sean, any thoughts about this (or general feedback about this series)?
>
> No feedback, I just you and Jim to get mitigation stuff right far more than I
> trust myself :-)
>
> I'm planning on grabbing this for 6.16.
Awesome, thanks!
* Re: [PATCH 0/3] Unify IBRS virtualization
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
` (3 preceding siblings ...)
2025-03-13 21:23 ` [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
@ 2025-04-25 22:37 ` Sean Christopherson
4 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2025-04-25 22:37 UTC (permalink / raw)
To: Sean Christopherson, x86, Yosry Ahmed
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin, Paolo Bonzini, Jim Mattson, David Kaplan, kvm,
linux-kernel
On Fri, 21 Feb 2025 16:33:49 +0000, Yosry Ahmed wrote:
> To properly virtualize IBRS on Intel, an IBPB is executed on emulated
> VM-exits to provide separate predictor modes for L1 and L2.
>
> Similar handling is theoretically needed for AMD, unless IbrsSameMode is
> enumerated by the CPU (which should be the case for most/all CPUs
> anyway). For correctness and clarity, this series generalizes the
> handling to apply for both Intel and AMD as needed.
>
> [...]
Applied to kvm-x86 misc, thanks!
[1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE
commit: 9a7cb00a8ff7380a09fa75287a3f2642c472d562
[2/3] KVM: x86: Propagate AMD's IbrsSameMode to the guest
commit: 65ca2872015c232d6743b497e3c08ff96596b917
[3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit
commit: 656d9624bd21d35499eaa5ee97fda6def62901c8
--
https://github.com/kvm-x86/linux/tree/next
end of thread
Thread overview: 13+ messages
2025-02-21 16:33 [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
2025-02-21 16:33 ` [PATCH 1/3] x86/cpufeatures: Define X86_FEATURE_AMD_IBRS_SAME_MODE Yosry Ahmed
2025-02-21 16:33 ` [PATCH 2/3] KVM: x86: Propagate AMD's IbrsSameMode to the guest Yosry Ahmed
2025-02-21 16:33 ` [PATCH 3/3] KVM: x86: Generalize IBRS virtualization on emulated VM-exit Yosry Ahmed
2025-02-21 17:59 ` Jim Mattson
2025-02-21 18:39 ` Yosry Ahmed
2025-02-21 19:02 ` Jim Mattson
2025-02-21 19:25 ` Yosry Ahmed
2025-02-21 19:52 ` Jim Mattson
2025-03-13 21:23 ` [PATCH 0/3] Unify IBRS virtualization Yosry Ahmed
2025-03-26 19:48 ` Sean Christopherson
2025-03-26 19:52 ` Yosry Ahmed
2025-04-25 22:37 ` Sean Christopherson