From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Sean Christopherson <seanjc@google.com>,
Jim Mattson <jmattson@google.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 09/18] KVM: VMX: Execute IBPB on emulated VM-exit when guest has IBRS
Date: Thu, 23 Feb 2023 14:06:54 +0100 [thread overview]
Message-ID: <20230223130426.040985126@linuxfoundation.org> (raw)
In-Reply-To: <20230223130425.680784802@linuxfoundation.org>
From: Jim Mattson <jmattson@google.com>
[ Upstream commit 2e7eab81425ad6c875f2ed47c0ce01e78afc38a5 ]
According to Intel's document on Indirect Branch Restricted
Speculation, "Enabling IBRS does not prevent software from controlling
the predicted targets of indirect branches of unrelated software
executed later at the same predictor mode (for example, between two
different user applications, or two different virtual machines). Such
isolation can be ensured through use of the Indirect Branch Predictor
Barrier (IBPB) command." This applies to both basic and enhanced IBRS.
Since L1 and L2 VMs share hardware predictor modes (guest-user and
guest-kernel), hardware IBRS is not sufficient to virtualize
IBRS. (The way that basic IBRS is implemented on pre-eIBRS parts,
hardware IBRS is actually sufficient in practice, even though it isn't
sufficient architecturally.)
For virtual CPUs that support IBRS, add an indirect branch prediction
barrier on emulated VM-exit, to ensure that the predicted targets of
indirect branches executed in L1 cannot be controlled by software that
was executed in L2.
Since we typically don't intercept guest writes to IA32_SPEC_CTRL,
perform the IBPB at emulated VM-exit regardless of the current
IA32_SPEC_CTRL.IBRS value, even though the IBPB could technically be
deferred until L1 sets IA32_SPEC_CTRL.IBRS, if IA32_SPEC_CTRL.IBRS is
clear at emulated VM-exit.
This is CVE-2022-2196.
Fixes: 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02")
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221019213620.1953281-3-jmattson@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kvm/vmx/nested.c | 11 +++++++++++
arch/x86/kvm/vmx/vmx.c | 6 ++++--
2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 00f3336194a96..d3a8ee0ef988a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4118,6 +4118,17 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
vmx_switch_vmcs(vcpu, &vmx->vmcs01);
+ /*
+ * If IBRS is advertised to the vCPU, KVM must flush the indirect
+ * branch predictors when transitioning from L2 to L1, as L1 expects
+ * hardware (KVM in this case) to provide separate predictor modes.
+ * Bare metal isolates VMX root (host) from VMX non-root (guest), but
+ * doesn't isolate different VMCSs, i.e. in this case, doesn't provide
+ * separate modes for L2 vs L1.
+ */
+ if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+ indirect_branch_prediction_barrier();
+
/* Update any VMCS fields that might have changed while L2 ran */
vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a8c8073654cf1..e6dd6a7e86893 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1397,8 +1397,10 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
/*
* No indirect branch prediction barrier needed when switching
- * the active VMCS within a guest, e.g. on nested VM-Enter.
- * The L1 VMM can protect itself with retpolines, IBPB or IBRS.
+ * the active VMCS within a vCPU, unless IBRS is advertised to
+ * the vCPU. To minimize the number of IBPBs executed, KVM
+ * performs IBPB on nested VM-Exit (a single nested transition
+ * may switch the active VMCS multiple times).
*/
if (!buddy || WARN_ON_ONCE(buddy->vmcs != prev))
indirect_branch_prediction_barrier();
--
2.39.0
next prev parent reply other threads:[~2023-02-23 13:10 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-23 13:06 [PATCH 5.4 00/18] 5.4.233-rc1 review Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 01/18] dma-mapping: add generic helpers for mapping sgtable objects Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 02/18] scatterlist: add generic wrappers for iterating over " Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 03/18] drm: etnaviv: fix common struct sg_table related issues Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 04/18] drm/etnaviv: dont truncate physical page address Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 05/18] wifi: rtl8xxxu: gen2: Turn on the rate control Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 06/18] powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 07/18] random: always mix cycle counter in add_latent_entropy() Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 08/18] KVM: x86: Fail emulation during EMULTYPE_SKIP on any exception Greg Kroah-Hartman
2023-02-23 13:06 ` Greg Kroah-Hartman [this message]
2023-02-23 13:06 ` [PATCH 5.4 10/18] can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 11/18] powerpc: dts: t208x: Disable 10G on MAC1 and MAC2 Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 12/18] alarmtimer: Prevent starvation by small intervals and SIG_IGN Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 13/18] drm/i915/gvt: fix double free bug in split_2MB_gtt_entry Greg Kroah-Hartman
2023-02-23 13:06 ` [PATCH 5.4 14/18] mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh Greg Kroah-Hartman
2023-02-23 13:07 ` [PATCH 5.4 15/18] uaccess: Add speculation barrier to copy_from_user() Greg Kroah-Hartman
2023-02-23 13:07 ` [PATCH 5.4 16/18] wifi: mwifiex: Add missing compatible string for SD8787 Greg Kroah-Hartman
2023-02-23 13:07 ` [PATCH 5.4 17/18] ext4: Fix function prototype mismatch for ext4_feat_ktype Greg Kroah-Hartman
2023-02-23 13:07 ` [PATCH 5.4 18/18] Revert "net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs" Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230223130426.040985126@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=jmattson@google.com \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=seanjc@google.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).