From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Like Xu <like.xu.linux@gmail.com>,
Chao Gao <chao.gao@intel.com>,
Sean Christopherson <seanjc@google.com>
Subject: [PATCH 6.1 22/73] KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled
Date: Wed, 20 Nov 2024 13:58:08 +0100 [thread overview]
Message-ID: <20241120125810.150300726@linuxfoundation.org> (raw)
In-Reply-To: <20241120125809.623237564@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson <seanjc@google.com>
commit 2657b82a78f18528bef56dc1b017158490970873 upstream.
When getting the current VPID, e.g. to emulate a guest TLB flush, return
vpid01 if L2 is running but with VPID disabled, i.e. if VPID is disabled
in vmcs12. Architecturally, if VPID is disabled, then the guest and host
effectively share VPID=0. KVM emulates this behavior by using vpid01 when
running an L2 with VPID disabled (see prepare_vmcs02_early_rare()), and so
KVM must also treat vpid01 as the current VPID while L2 is active.
Unconditionally treating vpid02 as the current VPID when L2 is active
causes KVM to flush TLB entries for vpid02 instead of vpid01, which
results in TLB entries from L1 being incorrectly preserved across nested
VM-Enter to L2 (L2=>L1 isn't problematic, because the TLB flush after
nested VM-Exit flushes vpid01).
The bug manifests as failures in the vmx_apicv_test KVM-Unit-Test, as KVM
incorrectly retains TLB entries for the APIC-access page across a nested
VM-Enter.
Opportunisticaly add comments at various touchpoints to explain the
architectural requirements, and also why KVM uses vpid01 instead of vpid02.
All credit goes to Chao, who root caused the issue and identified the fix.
Link: https://lore.kernel.org/all/ZwzczkIlYGX+QXJz@intel.com
Fixes: 2b4a5a5d5688 ("KVM: nVMX: Flush current VPID (L1 vs. L2) for KVM_REQ_TLB_FLUSH_GUEST")
Cc: stable@vger.kernel.org
Cc: Like Xu <like.xu.linux@gmail.com>
Debugged-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/r/20241031202011.1580522-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kvm/vmx/nested.c | 30 +++++++++++++++++++++++++-----
arch/x86/kvm/vmx/vmx.c | 2 +-
2 files changed, 26 insertions(+), 6 deletions(-)
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1126,11 +1126,14 @@ static void nested_vmx_transition_tlb_fl
struct vcpu_vmx *vmx = to_vmx(vcpu);
/*
- * If vmcs12 doesn't use VPID, L1 expects linear and combined mappings
- * for *all* contexts to be flushed on VM-Enter/VM-Exit, i.e. it's a
- * full TLB flush from the guest's perspective. This is required even
- * if VPID is disabled in the host as KVM may need to synchronize the
- * MMU in response to the guest TLB flush.
+ * If VPID is disabled, then guest TLB accesses use VPID=0, i.e. the
+ * same VPID as the host, and so architecturally, linear and combined
+ * mappings for VPID=0 must be flushed at VM-Enter and VM-Exit. KVM
+ * emulates L2 sharing L1's VPID=0 by using vpid01 while running L2,
+ * and so KVM must also emulate TLB flush of VPID=0, i.e. vpid01. This
+ * is required if VPID is disabled in KVM, as a TLB flush (there are no
+ * VPIDs) still occurs from L1's perspective, and KVM may need to
+ * synchronize the MMU in response to the guest TLB flush.
*
* Note, using TLB_FLUSH_GUEST is correct even if nested EPT is in use.
* EPT is a special snowflake, as guest-physical mappings aren't
@@ -2196,6 +2199,17 @@ static void prepare_vmcs02_early_rare(st
vmcs_write64(VMCS_LINK_POINTER, INVALID_GPA);
+ /*
+ * If VPID is disabled, then guest TLB accesses use VPID=0, i.e. the
+ * same VPID as the host. Emulate this behavior by using vpid01 for L2
+ * if VPID is disabled in vmcs12. Note, if VPID is disabled, VM-Enter
+ * and VM-Exit are architecturally required to flush VPID=0, but *only*
+ * VPID=0. I.e. using vpid02 would be ok (so long as KVM emulates the
+ * required flushes), but doing so would cause KVM to over-flush. E.g.
+ * if L1 runs L2 X with VPID12=1, then runs L2 Y with VPID12 disabled,
+ * and then runs L2 X again, then KVM can and should retain TLB entries
+ * for VPID12=1.
+ */
if (enable_vpid) {
if (nested_cpu_has_vpid(vmcs12) && vmx->nested.vpid02)
vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->nested.vpid02);
@@ -5758,6 +5772,12 @@ static int handle_invvpid(struct kvm_vcp
return nested_vmx_fail(vcpu,
VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
+ /*
+ * Always flush the effective vpid02, i.e. never flush the current VPID
+ * and never explicitly flush vpid01. INVVPID targets a VPID, not a
+ * VMCS, and so whether or not the current vmcs12 has VPID enabled is
+ * irrelevant (and there may not be a loaded vmcs12).
+ */
vpid02 = nested_get_vpid02(vcpu);
switch (type) {
case VMX_VPID_EXTENT_INDIVIDUAL_ADDR:
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3098,7 +3098,7 @@ static void vmx_flush_tlb_all(struct kvm
static inline int vmx_get_current_vpid(struct kvm_vcpu *vcpu)
{
- if (is_guest_mode(vcpu))
+ if (is_guest_mode(vcpu) && nested_cpu_has_vpid(get_vmcs12(vcpu)))
return nested_get_vpid02(vcpu);
return to_vmx(vcpu)->vpid;
}
next prev parent reply other threads:[~2024-11-20 13:01 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-20 12:57 [PATCH 6.1 00/73] 6.1.119-rc1 review Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 01/73] netlink: terminate outstanding dump on socket close Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 02/73] net: vertexcom: mse102x: Fix tx_bytes calculation Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 03/73] drm/rockchip: vop: Fix a dereferenced before check warning Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 04/73] mptcp: error out earlier on disconnect Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 05/73] net/mlx5: fs, lock FTE when checking if active Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 06/73] net/mlx5e: kTLS, Fix incorrect page refcounting Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 07/73] net/mlx5e: CT: Fix null-ptr-deref in add rule err flow Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 08/73] virtio/vsock: Fix accept_queue memory leak Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 09/73] Bluetooth: hci_event: Remove code to removed CONFIG_BT_HS Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 10/73] Bluetooth: hci_core: Fix calling mgmt_device_connected Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 11/73] net/sched: cls_u32: replace int refcounts with proper refcounts Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 12/73] net: sched: cls_u32: Fix u32s systematic failure to free IDR entries for hnodes Greg Kroah-Hartman
2024-11-20 12:57 ` [PATCH 6.1 13/73] samples: pktgen: correct dev to DEV Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 14/73] bonding: add ns target multicast address to slave device Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 15/73] ARM: 9419/1: mm: Fix kernel memory mapping for xip kernels Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 16/73] x86/mm: Fix a kdump kernel failure on SME system when CONFIG_IMA_KEXEC=y Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 17/73] mm: fix NULL pointer dereference in alloc_pages_bulk_noprof Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 18/73] ocfs2: uncache inode which has failed entering the group Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 19/73] vdpa/mlx5: Fix PA offset with unaligned starting iotlb map Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 20/73] vp_vdpa: fix id_table array not null terminated error Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 21/73] ima: fix buffer overrun in ima_eventdigest_init_common Greg Kroah-Hartman
2024-11-20 12:58 ` Greg Kroah-Hartman [this message]
2024-11-20 12:58 ` [PATCH 6.1 23/73] KVM: x86: Unconditionally set irr_pending when updating APICv state Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 24/73] KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 25/73] nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 26/73] ALSA: hda/realtek - Fixed Clevo platform headset Mic issue Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 27/73] ALSA: hda/realtek: fix mute/micmute LEDs for a HP EliteBook 645 G10 Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 28/73] ocfs2: fix UBSAN warning in ocfs2_verify_volume() Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 29/73] nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 30/73] Revert "mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K" Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 31/73] mmc: sunxi-mmc: Fix A100 compatible description Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 32/73] drm/bridge: tc358768: Fix DSI command tx Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 33/73] drm/amd: Fix initialization mistake for NBIO 7.7.0 Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 34/73] staging: vchiq_arm: Get the rid off struct vchiq_2835_state Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 35/73] staging: vchiq_arm: Use devm_kzalloc() for vchiq_arm_state allocation Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 36/73] fs/ntfs3: Additional check in ntfs_file_release Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 37/73] Bluetooth: ISO: Fix not validating setsockopt user input Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 38/73] lib/buildid: Fix build ID parsing logic Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 39/73] cxl/pci: fix error code in __cxl_hdm_decode_init() Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 40/73] media: dvbdev: fix the logic when DVB_DYNAMIC_MINORS is not set Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 41/73] NFSD: initialize copy->cp_clp early in nfsd4_copy for use by trace point Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 42/73] NFSD: Async COPY result needs to return a write verifier Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 43/73] NFSD: Limit the number of concurrent async COPY operations Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 44/73] NFSD: Initialize struct nfsd4_copy earlier Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 45/73] NFSD: Never decrement pending_async_copies on error Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 46/73] mptcp: cope racing subflow creation in mptcp_rcv_space_adjust Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 47/73] mptcp: define more local variables sk Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 48/73] mptcp: add userspace_pm_lookup_addr_by_id helper Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 49/73] mptcp: update local address flags when setting it Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 50/73] mptcp: hold pm lock when deleting entry Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 51/73] mptcp: drop lookup_by_id in lookup_addr Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 52/73] mptcp: pm: use _rcu variant under rcu_read_lock Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 53/73] ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 54/73] ksmbd: fix potencial out-of-bounds when buffer offset is invalid Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 55/73] net: add copy_safe_from_sockptr() helper Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 56/73] nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 57/73] fs/9p: fix uninitialized values during inode evict Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 58/73] ipvs: properly dereference pe in ip_vs_add_service Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 59/73] net/sched: taprio: extend minimum interval restriction to entire cycle too Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 60/73] net: fec: remove .ndo_poll_controller to avoid deadlocks Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 61/73] mm: revert "mm: shmem: fix data-race in shmem_getattr()" Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 62/73] mm: avoid unsafe VMA hook invocation when error arises on mmap hook Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 63/73] mm: unconditionally close VMAs on error Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 64/73] mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 65/73] mm: resolve faulty mmap_region() error path behaviour Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 66/73] drm/amd: check num of link levels when update pcie param Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 67/73] char: xillybus: Prevent use-after-free due to race condition Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 68/73] null_blk: Remove usage of the deprecated ida_simple_xx() API Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 69/73] null_blk: fix null-ptr-dereference while configuring power and submit_queues Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 70/73] null_blk: Fix return value of nullb_device_power_store() Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 71/73] parisc: fix a possible DMA corruption Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 72/73] char: xillybus: Fix trivial bug with mutex Greg Kroah-Hartman
2024-11-20 12:58 ` [PATCH 6.1 73/73] net: Make copy_safe_from_sockptr() match documentation Greg Kroah-Hartman
2024-11-20 16:45 ` [PATCH 6.1 00/73] 6.1.119-rc1 review Mark Brown
2024-11-20 17:01 ` SeongJae Park
2024-11-20 18:31 ` Florian Fainelli
2024-11-20 23:22 ` Shuah Khan
2024-11-21 4:26 ` Ron Economos
2024-11-21 8:32 ` Naresh Kamboju
2024-11-21 9:02 ` Pavel Machek
2024-11-21 16:50 ` Hardik Garg
2024-11-21 19:39 ` Jon Hunter
2024-11-22 6:59 ` Muhammad Usama Anjum
2024-11-22 13:55 ` Yann Sionneau
2024-11-23 7:25 ` Pavel Machek
2024-11-23 16:11 ` Chuck Lever III
2024-11-23 17:47 ` Pavel Machek
2024-11-23 15:47 ` Guenter Roeck
2024-12-02 13:02 ` Greg Kroah-Hartman
2024-11-28 17:54 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241120125810.150300726@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=chao.gao@intel.com \
--cc=like.xu.linux@gmail.com \
--cc=patches@lists.linux.dev \
--cc=seanjc@google.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.