[PATCH v6 36/44] KVM: nVMX: Don't update msr_autostore count when saving TSC for vmcs12

linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Marc Zyngier <maz@kernel.org>, Oliver Upton <oupton@kernel.org>,
	 Tianrui Zhao <zhaotianrui@loongson.cn>,
	Bibo Mao <maobibo@loongson.cn>,
	 Huacai Chen <chenhuacai@kernel.org>,
	Anup Patel <anup@brainfault.org>,  Paul Walmsley <pjw@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,  Xin Li <xin@zytor.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	 Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	 Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	 kvm@vger.kernel.org, loongarch@lists.linux.dev,
	kvm-riscv@lists.infradead.org,  linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org,  linux-perf-users@vger.kernel.org,
	Mingwei Zhang <mizhang@google.com>,
	 Xudong Hao <xudong.hao@intel.com>,
	Sandipan Das <sandipan.das@amd.com>,
	 Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Xiong Zhang <xiong.y.zhang@linux.intel.com>,
	 Manali Shukla <manali.shukla@amd.com>,
	Jim Mattson <jmattson@google.com>
Subject: [PATCH v6 36/44] KVM: nVMX: Don't update msr_autostore count when saving TSC for vmcs12
Date: Fri,  5 Dec 2025 16:17:12 -0800	[thread overview]
Message-ID: <20251206001720.468579-37-seanjc@google.com> (raw)
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>

Rework nVMX's use of the MSR auto-store list to snapshot TSC to sneak
MSR_IA32_TSC into the list _without_ updating KVM's software tracking,
and drop the generic functionality so that future usage of the store list
for nested specific logic needs to consider the implications of modifying
the list.  Updating the list only for vmcs02 and only on nested VM-Enter
is a disaster waiting to happen, as it means vmcs01 is stale relative to
the software tracking, and KVM could unintentionally leave an MSR in the
store list in perpetuity while running L1, e.g. if KVM addressed the first
issue and updated vmcs01 on nested VM-Exit without removing TSC from the
list.

Furthermore, mixing KVM's desire to save an MSR with L1's desire to save
an MSR result KVM clobbering/ignoring the needs of vmcs01 or vmcs02.
E.g. if KVM added MSR_IA32_TSC to the store list for its own purposes, and
then _removed_ MSR_IA32_TSC from the list after emulating nested VM-Enter,
then KVM would remove MSR_IA32_TSC from the list even though saving TSC on
VM-Exit from L2 is still desirable (to provide L1 with an accurate TSC).

Similarly, removing an MSR from the list based on vmcs12's settings could
drop an MSR that KVM wants to save for its own purposes.

In practice, the issues are currently benign, because KVM doesn't use the
store list for vmcs01.  But that will change with upcoming mediated PMU
support.

Alternatively, a "full" solution would be to track MSR list entries for
vmcs12 separately from KVM's standard lists, but MSR_IA32_TSC is likely
the only MSR that KVM would ever want to save on _every_ VM-Exit purely
based on vmcs12.  I.e. the added complexity isn't remotely justified at
this time.

Opportunistically escalate from a pr_warn_ratelimited() to a full WARN as
KVM reserves eight entries in each MSR list, and as above KVM uses at most
one entry.

Opportunistically make vmx_find_loadstore_msr_slot() local to vmx.c as
using it directly from nested code is unsafe due to the potential for
mixing vmcs01 and vmcs02 state (see above).

Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 71 ++++++++++++---------------------------
 arch/x86/kvm/vmx/vmx.c    |  2 +-
 arch/x86/kvm/vmx/vmx.h    |  2 +-
 3 files changed, 24 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 486789dac515..614b789ecf16 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1075,16 +1075,12 @@ static bool nested_vmx_get_vmexit_msr_value(struct kvm_vcpu *vcpu,
 	 * does not include the time taken for emulation of the L2->L1
 	 * VM-exit in L0, use the more accurate value.
 	 */
-	if (msr_index == MSR_IA32_TSC) {
-		int i = vmx_find_loadstore_msr_slot(&vmx->msr_autostore,
-						    MSR_IA32_TSC);
+	if (msr_index == MSR_IA32_TSC && vmx->nested.tsc_autostore_slot >= 0) {
+		int slot = vmx->nested.tsc_autostore_slot;
+		u64 host_tsc = vmx->msr_autostore.val[slot].value;
 
-		if (i >= 0) {
-			u64 val = vmx->msr_autostore.val[i].value;
-
-			*data = kvm_read_l1_tsc(vcpu, val);
-			return true;
-		}
+		*data = kvm_read_l1_tsc(vcpu, host_tsc);
+		return true;
 	}
 
 	if (kvm_emulate_msr_read(vcpu, msr_index, data)) {
@@ -1163,42 +1159,6 @@ static bool nested_msr_store_list_has_msr(struct kvm_vcpu *vcpu, u32 msr_index)
 	return false;
 }
 
-static void prepare_vmx_msr_autostore_list(struct kvm_vcpu *vcpu,
-					   u32 msr_index)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-	struct vmx_msrs *autostore = &vmx->msr_autostore;
-	bool in_vmcs12_store_list;
-	int msr_autostore_slot;
-	bool in_autostore_list;
-	int last;
-
-	msr_autostore_slot = vmx_find_loadstore_msr_slot(autostore, msr_index);
-	in_autostore_list = msr_autostore_slot >= 0;
-	in_vmcs12_store_list = nested_msr_store_list_has_msr(vcpu, msr_index);
-
-	if (in_vmcs12_store_list && !in_autostore_list) {
-		if (autostore->nr == MAX_NR_LOADSTORE_MSRS) {
-			/*
-			 * Emulated VMEntry does not fail here.  Instead a less
-			 * accurate value will be returned by
-			 * nested_vmx_get_vmexit_msr_value() by reading KVM's
-			 * internal MSR state instead of reading the value from
-			 * the vmcs02 VMExit MSR-store area.
-			 */
-			pr_warn_ratelimited(
-				"Not enough msr entries in msr_autostore.  Can't add msr %x\n",
-				msr_index);
-			return;
-		}
-		last = autostore->nr++;
-		autostore->val[last].index = msr_index;
-	} else if (!in_vmcs12_store_list && in_autostore_list) {
-		last = --autostore->nr;
-		autostore->val[msr_autostore_slot] = autostore->val[last];
-	}
-}
-
 /*
  * Load guest's/host's cr3 at nested entry/exit.  @nested_ept is true if we are
  * emulating VM-Entry into a guest with EPT enabled.  On failure, the expected
@@ -2699,12 +2659,25 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 	}
 
 	/*
-	 * Make sure the msr_autostore list is up to date before we set the
-	 * count in the vmcs02.
+	 * If vmcs12 is configured to save TSC on exit via the auto-store list,
+	 * append the MSR to vmcs02's auto-store list so that KVM effectively
+	 * reads TSC at the time of VM-Exit from L2.  The saved value will be
+	 * propagated to vmcs12's list on nested VM-Exit.
+	 *
+	 * Don't increment the number of MSRs in the vCPU structure, as saving
+	 * TSC is specific to this particular incarnation of vmcb02, i.e. must
+	 * not bleed into vmcs01.
 	 */
-	prepare_vmx_msr_autostore_list(&vmx->vcpu, MSR_IA32_TSC);
+	if (nested_msr_store_list_has_msr(&vmx->vcpu, MSR_IA32_TSC) &&
+	    !WARN_ON_ONCE(vmx->msr_autostore.nr >= ARRAY_SIZE(vmx->msr_autostore.val))) {
+		vmx->nested.tsc_autostore_slot = vmx->msr_autostore.nr;
+		vmx->msr_autostore.val[vmx->msr_autostore.nr].index = MSR_IA32_TSC;
 
-	vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
+		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr + 1);
+	} else {
+		vmx->nested.tsc_autostore_slot = -1;
+		vmcs_write32(VM_EXIT_MSR_STORE_COUNT, vmx->msr_autostore.nr);
+	}
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 23c92c41fd83..52bcb817cc15 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1029,7 +1029,7 @@ static __always_inline void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx
 	vm_exit_controls_clearbit(vmx, exit);
 }
 
-int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr)
+static int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr)
 {
 	unsigned int i;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 4ce653d729ca..3175fedb5a4d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -191,6 +191,7 @@ struct nested_vmx {
 	u16 vpid02;
 	u16 last_vpid;
 
+	int tsc_autostore_slot;
 	struct nested_vmx_msrs msrs;
 
 	/* SMM related state */
@@ -383,7 +384,6 @@ void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
 unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
 bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
 		    unsigned int flags);
-int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);
 
 void vmx_set_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type, bool set);
-- 
2.52.0.223.gf5cc29aaa4-goog

next prev parent reply	other threads:[~2025-12-06  0:18 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-06  0:16 [PATCH v6 00/44] KVM: x86: Add support for mediated vPMUs Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 01/44] perf: Skip pmu_ctx based on event_type Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 02/44] perf: Add generic exclude_guest support Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 03/44] perf: Move security_perf_event_free() call to __free_event() Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 04/44] perf: Add APIs to create/release mediated guest vPMUs Sean Christopherson
2025-12-08 11:51   ` Peter Zijlstra
2025-12-08 18:07     ` Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 05/44] perf: Clean up perf ctx time Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 06/44] perf: Add a EVENT_GUEST flag Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 07/44] perf: Add APIs to load/put guest mediated PMU context Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 08/44] perf/x86/core: Register a new vector for handling mediated guest PMIs Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 09/44] perf/x86/core: Add APIs to switch to/from mediated PMI vector (for KVM) Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 10/44] perf/x86/core: Do not set bit width for unavailable counters Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 11/44] perf/x86/core: Plumb mediated PMU capability from x86_pmu to x86_pmu_cap Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 12/44] perf/x86/intel: Support PERF_PMU_CAP_MEDIATED_VPMU Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 13/44] perf/x86/amd: Support PERF_PMU_CAP_MEDIATED_VPMU for AMD host Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 14/44] KVM: Add a simplified wrapper for registering perf callbacks Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 15/44] KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 16/44] KVM: x86/pmu: Start stubbing in mediated PMU support Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 17/44] KVM: x86/pmu: Implement Intel mediated PMU requirements and constraints Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 18/44] KVM: x86/pmu: Implement AMD mediated PMU requirements Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 19/44] KVM: x86/pmu: Register PMI handler for mediated vPMU Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 20/44] KVM: x86/pmu: Disable RDPMC interception for compatible " Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 21/44] KVM: x86/pmu: Load/save GLOBAL_CTRL via entry/exit fields for mediated PMU Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 22/44] KVM: x86/pmu: Disable interception of select PMU MSRs for mediated vPMUs Sean Christopherson
2025-12-06  0:16 ` [PATCH v6 23/44] KVM: x86/pmu: Bypass perf checks when emulating mediated PMU counter accesses Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 24/44] KVM: x86/pmu: Introduce eventsel_hw to prepare for pmu event filtering Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 25/44] KVM: x86/pmu: Reprogram mediated PMU event selectors on event filter updates Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 26/44] KVM: x86/pmu: Always stuff GuestOnly=1,HostOnly=0 for mediated PMCs on AMD Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 27/44] KVM: x86/pmu: Load/put mediated PMU context when entering/exiting guest Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 28/44] KVM: x86/pmu: Disallow emulation in the fastpath if mediated PMCs are active Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 29/44] KVM: x86/pmu: Handle emulated instruction for mediated vPMU Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 30/44] KVM: nVMX: Add macros to simplify nested MSR interception setting Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 31/44] KVM: nVMX: Disable PMU MSR interception as appropriate while running L2 Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 32/44] KVM: nSVM: " Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 33/44] KVM: x86/pmu: Expose enable_mediated_pmu parameter to user space Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 34/44] KVM: x86/pmu: Elide WRMSRs when loading guest PMCs if values already match Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 35/44] KVM: VMX: Drop intermediate "guest" field from msr_autostore Sean Christopherson
2025-12-08  9:14   ` Mi, Dapeng
2025-12-06  0:17 ` Sean Christopherson [this message]
2025-12-06  0:17 ` [PATCH v6 37/44] KVM: VMX: Dedup code for removing MSR from VMCS's auto-load list Sean Christopherson
2025-12-08  9:29   ` Mi, Dapeng
2025-12-09 17:37     ` Sean Christopherson
2025-12-10  1:08       ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 38/44] KVM: VMX: Drop unused @entry_only param from add_atomic_switch_msr() Sean Christopherson
2025-12-08  9:32   ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 39/44] KVM: VMX: Bug the VM if either MSR auto-load list is full Sean Christopherson
2025-12-08  9:32   ` Mi, Dapeng
2025-12-08  9:34   ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 40/44] KVM: VMX: Set MSR index auto-load entry if and only if entry is "new" Sean Christopherson
2025-12-08  9:35   ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 41/44] KVM: VMX: Compartmentalize adding MSRs to host vs. guest auto-load list Sean Christopherson
2025-12-08  9:36   ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 42/44] KVM: VMX: Dedup code for adding MSR to VMCS's auto list Sean Christopherson
2025-12-08  9:37   ` Mi, Dapeng
2025-12-06  0:17 ` [PATCH v6 43/44] KVM: VMX: Initialize vmcs01.VM_EXIT_MSR_STORE_ADDR with list address Sean Christopherson
2025-12-06  0:17 ` [PATCH v6 44/44] KVM: VMX: Add mediated PMU support for CPUs without "save perf global ctrl" Sean Christopherson
2025-12-08  9:39   ` Mi, Dapeng
2025-12-09  6:31     ` Mi, Dapeng
2025-12-08 15:37 ` [PATCH v6 00/44] KVM: x86: Add support for mediated vPMUs Peter Zijlstra

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:486789dac51 dfblob:614b789ecf1 dfblob:23c92c41fd8
dfblob:52bcb817cc1 dfblob:4ce653d729c dfblob:3175fedb5a4 )
 OR (
bs:"[PATCH v6 36/44] KVM: nVMX: Don't update msr_autostore count when saving TSC for vmcs12" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251206001720.468579-37-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=acme@kernel.org \
    --cc=anup@brainfault.org \
    --cc=aou@eecs.berkeley.edu \
    --cc=chenhuacai@kernel.org \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=jmattson@google.com \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=loongarch@lists.linux.dev \
    --cc=luto@kernel.org \
    --cc=manali.shukla@amd.com \
    --cc=maobibo@loongson.cn \
    --cc=maz@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mizhang@google.com \
    --cc=namhyung@kernel.org \
    --cc=oupton@kernel.org \
    --cc=palmer@dabbelt.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjw@kernel.org \
    --cc=sandipan.das@amd.com \
    --cc=xin@zytor.com \
    --cc=xiong.y.zhang@linux.intel.com \
    --cc=xudong.hao@intel.com \
    --cc=zhaotianrui@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).