* [PATCH v2 0/4] KVM: VMX: Handle the immediate form of MSR instructions
From: Xin Li (Intel)
Date: 2025-08-02 0:15 UTC
To: linux-kernel, kvm
Cc: pbonzini, seanjc, tglx, mingo, bp, dave.hansen, x86, hpa, xin,
chao.gao
This patch set handles two newly introduced VM exit reasons associated
with the immediate form of MSR instructions to ensure proper
virtualization of these instructions.
The immediate form of MSR access instructions is primarily motivated
by performance, not code size: because the MSR number is encoded as an
immediate, it is available *much* earlier in the pipeline, which gives
the hardware much more leeway in how a particular MSR is handled.
To properly virtualize the immediate form of MSR instructions, the
Intel VMX architecture introduces the following changes:
1) The immediate form of RDMSR uses VM exit reason 84.
2) The immediate form of WRMSRNS uses VM exit reason 85.
3) For both VM exit reasons 84 and 85, the exit qualification is set
to the MSR address causing the VM exit.
4) Bits 6:3 of the VM-exit instruction information field encode the
operand register used by the immediate form of the MSR instruction.
5) The VM-exit instruction length field records the size of the
immediate form of the MSR instruction.
Note: The VMX specification for the immediate form of MSR instructions
was inadvertently omitted from the most recently published ISE (Intel
Architecture Instruction Set Extensions and Future Features Programming
Reference), but it will be included in the upcoming edition.
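For illustration only (not part of this series), a VMM would decode the
new exit information roughly as follows, using KVM's existing VMCS
accessors and the field layout listed above:

	/* Sketch: decode an EXIT_REASON_MSR_{READ,WRITE}_IMM (84/85) exit. */
	u32 msr = vmcs_readl(EXIT_QUALIFICATION);	/* MSR being accessed */
	int reg = (vmcs_read32(VMX_INSTRUCTION_INFO) >> 3) & 0xf; /* bits 6:3 */
	u32 len = vmcs_read32(VM_EXIT_INSTRUCTION_LEN);	/* to skip the insn */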
Linux bare metal support of the immediate form of MSR instructions is
still under development; however, the KVM support effort is proceeding
independently of the bare metal implementation.
Link to v1:
https://lore.kernel.org/lkml/20250730174605.1614792-1-xin@zytor.com/
Changes in v2:
*) Added nested MSR bitmap check for the two new MSR-related VM exit
reasons (Chao).
*) Shortened function names to forms that still convey enough
information (Chao & Sean).
*) Removed VCPU_EXREG_EDX_EAX as it unnecessarily exposes details of a
specific flow across KVM (Sean).
*) Implemented a separate userspace completion callback for the
immediate form RDMSR (Sean).
*) Passed MSR data directly to __kvm_emulate_wrmsr() instead of the
encoded general-purpose register containing it (Sean).
*) Merged modifications to x86.c and vmx.c within the same patch to
facilitate easier code review (Sean).
*) Moved fastpath support into a separate patch, i.e., patch 3 (Sean).
*) Cleared the immediate form MSR capability in SVM in patch 4 (Sean).
Xin Li (Intel) (4):
x86/cpufeatures: Add a CPU feature bit for MSR immediate form
instructions
KVM: VMX: Handle the immediate form of MSR instructions
KVM: VMX: Support the immediate form WRMSRNS in fastpath
KVM: x86: Advertise support for the immediate form of MSR instructions
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 4 ++
arch/x86/include/uapi/asm/vmx.h | 6 +-
arch/x86/kernel/cpu/scattered.c | 1 +
arch/x86/kvm/cpuid.c | 6 +-
arch/x86/kvm/reverse_cpuid.h | 5 ++
arch/x86/kvm/svm/svm.c | 8 ++-
arch/x86/kvm/vmx/nested.c | 13 ++++-
arch/x86/kvm/vmx/vmx.c | 26 ++++++++-
arch/x86/kvm/vmx/vmx.h | 5 ++
arch/x86/kvm/x86.c | 92 ++++++++++++++++++++++--------
arch/x86/kvm/x86.h | 3 +-
12 files changed, 139 insertions(+), 31 deletions(-)
base-commit: 33f843444e28920d6e624c6c24637b4bb5d3c8de
--
2.50.1
* [PATCH v2 1/4] x86/cpufeatures: Add a CPU feature bit for MSR immediate form instructions
From: Xin Li (Intel)
Date: 2025-08-02 0:15 UTC
To: linux-kernel, kvm
Cc: pbonzini, seanjc, tglx, mingo, bp, dave.hansen, x86, hpa, xin,
chao.gao
The immediate form of MSR access instructions is primarily motivated
by performance, not code size: because the MSR number is encoded as an
immediate, it is available *much* earlier in the pipeline, which gives
the hardware much more leeway in how a particular MSR is handled.
Use a scattered CPU feature bit for MSR immediate form instructions.
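Kernel code can then gate on the new bit in the usual way; a minimal
sketch (setup_msr_imm() is a hypothetical consumer, the real KVM user
arrives later in this series):

	if (cpu_feature_enabled(X86_FEATURE_MSR_IMM))
		setup_msr_imm();	/* hypothetical user of the feature */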
Suggested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 286d509f9363..75b43bbe2a6d 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -491,6 +491,7 @@
#define X86_FEATURE_TSA_SQ_NO (21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
#define X86_FEATURE_TSA_L1_NO (21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
#define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
+#define X86_FEATURE_MSR_IMM (21*32+14) /* MSR immediate form instructions */
/*
* BUG word(s)
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index b4a1f6732a3a..5fe19bbe538e 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -27,6 +27,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_INTEL_PPIN, CPUID_EBX, 0, 0x00000007, 1 },
+ { X86_FEATURE_MSR_IMM, CPUID_ECX, 5, 0x00000007, 1 },
{ X86_FEATURE_APX, CPUID_EDX, 21, 0x00000007, 1 },
{ X86_FEATURE_RRSBA_CTRL, CPUID_EDX, 2, 0x00000007, 2 },
{ X86_FEATURE_BHI_CTRL, CPUID_EDX, 4, 0x00000007, 2 },
--
2.50.1
* [PATCH v2 2/4] KVM: VMX: Handle the immediate form of MSR instructions
From: Xin Li (Intel)
Date: 2025-08-02 0:15 UTC
To: linux-kernel, kvm
Cc: pbonzini, seanjc, tglx, mingo, bp, dave.hansen, x86, hpa, xin,
chao.gao
Handle two newly introduced VM exit reasons associated with the
immediate form of MSR instructions.
To properly virtualize the immediate form of MSR instructions, the
Intel VMX architecture introduces the following changes:
1) The immediate form of RDMSR uses VM exit reason 84.
2) The immediate form of WRMSRNS uses VM exit reason 85.
3) For both VM exit reasons 84 and 85, the exit qualification is set
to the MSR address causing the VM exit.
4) Bits 6:3 of the VM-exit instruction information field encode the
operand register used by the immediate form of the MSR instruction.
5) The VM-exit instruction length field records the size of the
immediate form of the MSR instruction.
Add code to properly virtualize the immediate form of MSR instructions.
While at it, add helpers that centralize guest MSR read/write
emulation, consolidating the emulation logic and making it easier to
support the new MSR-related VM exit reasons introduced with the
immediate form of MSR instructions.
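As a worked example with hypothetical values: a guest immediate-form
RDMSR of MSR_EFER (0xc0000080) into R8 surfaces to KVM as exit reason
84 with exit qualification 0xc0000080 and instruction-information bits
6:3 equal to 8, so the new VMX handler effectively does:

	u32 msr = vmx_get_exit_qual(vcpu);	/* 0xc0000080 */
	int reg = vmx_get_msr_imm_reg();	/* 8 == VCPU_REGS_R8 */
	kvm_emulate_rdmsr_imm(vcpu, msr, reg);	/* all 64 bits land in R8 */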
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
---
Changes in v2:
*) Added nested MSR bitmap check for the two new MSR-related VM exit
reasons (Chao).
*) Shortened function names that still convey enough information
(Chao & Sean).
*) Removed VCPU_EXREG_EDX_EAX as it unnecessarily exposes details of a
specific flow across KVM (Sean).
*) Implemented a separate userspace completion callback for the
immediate form RDMSR (Sean).
*) Passed MSR data directly to __kvm_emulate_wrmsr() instead of the
encoded general-purpose register containing it (Sean).
*) Merged modifications to x86.c and vmx.c within the same patch to
facilitate easier code review (Sean).
*) Moved fastpath support into a separate follow-up patch (Sean).
---
arch/x86/include/asm/kvm_host.h | 3 ++
arch/x86/include/uapi/asm/vmx.h | 6 ++-
arch/x86/kvm/vmx/nested.c | 13 +++++-
arch/x86/kvm/vmx/vmx.c | 21 ++++++++++
arch/x86/kvm/vmx/vmx.h | 5 +++
arch/x86/kvm/x86.c | 73 +++++++++++++++++++++++++--------
6 files changed, 101 insertions(+), 20 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f19a76d3ca0e..c5d0082cf0a5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -978,6 +978,7 @@ struct kvm_vcpu_arch {
unsigned long guest_debug_dr7;
u64 msr_platform_info;
u64 msr_misc_features_enables;
+ u32 cui_rdmsr_imm_reg;
u64 mcg_cap;
u64 mcg_status;
@@ -2155,7 +2156,9 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiat
int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data);
int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data);
int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu);
+int kvm_emulate_rdmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg);
int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu);
+int kvm_emulate_wrmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg);
int kvm_emulate_as_nop(struct kvm_vcpu *vcpu);
int kvm_emulate_invd(struct kvm_vcpu *vcpu);
int kvm_emulate_mwait(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index f0f4a4cf84a7..9792e329343e 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -94,6 +94,8 @@
#define EXIT_REASON_BUS_LOCK 74
#define EXIT_REASON_NOTIFY 75
#define EXIT_REASON_TDCALL 77
+#define EXIT_REASON_MSR_READ_IMM 84
+#define EXIT_REASON_MSR_WRITE_IMM 85
#define VMX_EXIT_REASONS \
{ EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \
@@ -158,7 +160,9 @@
{ EXIT_REASON_TPAUSE, "TPAUSE" }, \
{ EXIT_REASON_BUS_LOCK, "BUS_LOCK" }, \
{ EXIT_REASON_NOTIFY, "NOTIFY" }, \
- { EXIT_REASON_TDCALL, "TDCALL" }
+ { EXIT_REASON_TDCALL, "TDCALL" }, \
+ { EXIT_REASON_MSR_READ_IMM, "MSR_READ_IMM" }, \
+ { EXIT_REASON_MSR_WRITE_IMM, "MSR_WRITE_IMM" }
#define VMX_EXIT_REASON_FLAGS \
{ VMX_EXIT_REASONS_FAILED_VMENTRY, "FAILED_VMENTRY" }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b8ea1969113d..4e6352ef9520 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6216,19 +6216,26 @@ static bool nested_vmx_exit_handled_msr(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12,
union vmx_exit_reason exit_reason)
{
- u32 msr_index = kvm_rcx_read(vcpu);
+ u32 msr_index;
gpa_t bitmap;
if (!nested_cpu_has(vmcs12, CPU_BASED_USE_MSR_BITMAPS))
return true;
+ if (exit_reason.basic == EXIT_REASON_MSR_READ_IMM ||
+ exit_reason.basic == EXIT_REASON_MSR_WRITE_IMM)
+ msr_index = vmx_get_exit_qual(vcpu);
+ else
+ msr_index = kvm_rcx_read(vcpu);
+
/*
* The MSR_BITMAP page is divided into four 1024-byte bitmaps,
* for the four combinations of read/write and low/high MSR numbers.
* First we need to figure out which of the four to use:
*/
bitmap = vmcs12->msr_bitmap;
- if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
+ if (exit_reason.basic == EXIT_REASON_MSR_WRITE ||
+ exit_reason.basic == EXIT_REASON_MSR_WRITE_IMM)
bitmap += 2048;
if (msr_index >= 0xc0000000) {
msr_index -= 0xc0000000;
@@ -6527,6 +6534,8 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_DESC);
case EXIT_REASON_MSR_READ:
case EXIT_REASON_MSR_WRITE:
+ case EXIT_REASON_MSR_READ_IMM:
+ case EXIT_REASON_MSR_WRITE_IMM:
return nested_vmx_exit_handled_msr(vcpu, vmcs12, exit_reason);
case EXIT_REASON_INVALID_STATE:
return true;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aa157fe5b7b3..c112595dfff9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6003,6 +6003,23 @@ static int handle_notify(struct kvm_vcpu *vcpu)
return 1;
}
+static int vmx_get_msr_imm_reg(void)
+{
+ return vmx_get_instr_info_reg(vmcs_read32(VMX_INSTRUCTION_INFO));
+}
+
+static int handle_rdmsr_imm(struct kvm_vcpu *vcpu)
+{
+ return kvm_emulate_rdmsr_imm(vcpu, vmx_get_exit_qual(vcpu),
+ vmx_get_msr_imm_reg());
+}
+
+static int handle_wrmsr_imm(struct kvm_vcpu *vcpu)
+{
+ return kvm_emulate_wrmsr_imm(vcpu, vmx_get_exit_qual(vcpu),
+ vmx_get_msr_imm_reg());
+}
+
/*
* The exit handlers return 1 if the exit was handled fully and guest execution
* may resume. Otherwise they set the kvm_run parameter to indicate what needs
@@ -6061,6 +6078,8 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
[EXIT_REASON_ENCLS] = handle_encls,
[EXIT_REASON_BUS_LOCK] = handle_bus_lock_vmexit,
[EXIT_REASON_NOTIFY] = handle_notify,
+ [EXIT_REASON_MSR_READ_IMM] = handle_rdmsr_imm,
+ [EXIT_REASON_MSR_WRITE_IMM] = handle_wrmsr_imm,
};
static const int kvm_vmx_max_exit_handlers =
@@ -6495,6 +6514,8 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
#ifdef CONFIG_MITIGATION_RETPOLINE
if (exit_reason.basic == EXIT_REASON_MSR_WRITE)
return kvm_emulate_wrmsr(vcpu);
+ else if (exit_reason.basic == EXIT_REASON_MSR_WRITE_IMM)
+ return handle_wrmsr_imm(vcpu);
else if (exit_reason.basic == EXIT_REASON_PREEMPTION_TIMER)
return handle_preemption_timer(vcpu);
else if (exit_reason.basic == EXIT_REASON_INTERRUPT_WINDOW)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index d3389baf3ab3..24d65dac5e89 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -706,6 +706,11 @@ static inline bool vmx_guest_state_valid(struct kvm_vcpu *vcpu)
void dump_vmcs(struct kvm_vcpu *vcpu);
+static inline int vmx_get_instr_info_reg(u32 vmx_instr_info)
+{
+ return (vmx_instr_info >> 3) & 0xf;
+}
+
static inline int vmx_get_instr_info_reg2(u32 vmx_instr_info)
{
return (vmx_instr_info >> 28) & 0xf;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1c49bc681c4..fe12aae7089c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1968,6 +1968,13 @@ static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu)
}
}
+static void complete_userspace_rdmsr_imm(struct kvm_vcpu *vcpu)
+{
+ if (!vcpu->run->msr.error)
+ kvm_register_write(vcpu, vcpu->arch.cui_rdmsr_imm_reg,
+ vcpu->run->msr.data);
+}
+
static int complete_emulated_msr_access(struct kvm_vcpu *vcpu)
{
return complete_emulated_insn_gp(vcpu, vcpu->run->msr.error);
@@ -1990,6 +1997,12 @@ static int complete_fast_rdmsr(struct kvm_vcpu *vcpu)
return complete_fast_msr_access(vcpu);
}
+static int complete_fast_rdmsr_imm(struct kvm_vcpu *vcpu)
+{
+ complete_userspace_rdmsr_imm(vcpu);
+ return complete_fast_msr_access(vcpu);
+}
+
static u64 kvm_msr_reason(int r)
{
switch (r) {
@@ -2024,56 +2037,82 @@ static int kvm_msr_user_space(struct kvm_vcpu *vcpu, u32 index,
return 1;
}
-int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
+static int __kvm_emulate_rdmsr(struct kvm_vcpu *vcpu, u32 msr, int reg,
+ int (*complete_rdmsr)(struct kvm_vcpu *))
{
- u32 ecx = kvm_rcx_read(vcpu);
u64 data;
int r;
- r = kvm_get_msr_with_filter(vcpu, ecx, &data);
-
+ r = kvm_get_msr_with_filter(vcpu, msr, &data);
if (!r) {
- trace_kvm_msr_read(ecx, data);
+ trace_kvm_msr_read(msr, data);
- kvm_rax_write(vcpu, data & -1u);
- kvm_rdx_write(vcpu, (data >> 32) & -1u);
+ if (reg < 0) {
+ kvm_rax_write(vcpu, data & -1u);
+ kvm_rdx_write(vcpu, (data >> 32) & -1u);
+ } else {
+ kvm_register_write(vcpu, reg, data);
+ }
} else {
/* MSR read failed? See if we should ask user space */
- if (kvm_msr_user_space(vcpu, ecx, KVM_EXIT_X86_RDMSR, 0,
- complete_fast_rdmsr, r))
+ if (kvm_msr_user_space(vcpu, msr, KVM_EXIT_X86_RDMSR, 0,
+ complete_rdmsr, r))
return 0;
- trace_kvm_msr_read_ex(ecx);
+ trace_kvm_msr_read_ex(msr);
}
return kvm_x86_call(complete_emulated_msr)(vcpu, r);
}
+
+int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu)
+{
+ return __kvm_emulate_rdmsr(vcpu, kvm_rcx_read(vcpu), -1,
+ complete_fast_rdmsr);
+}
EXPORT_SYMBOL_GPL(kvm_emulate_rdmsr);
-int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
+int kvm_emulate_rdmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg)
+{
+ vcpu->arch.cui_rdmsr_imm_reg = reg;
+
+ return __kvm_emulate_rdmsr(vcpu, msr, reg, complete_fast_rdmsr_imm);
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_rdmsr_imm);
+
+static int __kvm_emulate_wrmsr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
- u32 ecx = kvm_rcx_read(vcpu);
- u64 data = kvm_read_edx_eax(vcpu);
int r;
- r = kvm_set_msr_with_filter(vcpu, ecx, data);
+ r = kvm_set_msr_with_filter(vcpu, msr, data);
if (!r) {
- trace_kvm_msr_write(ecx, data);
+ trace_kvm_msr_write(msr, data);
} else {
/* MSR write failed? See if we should ask user space */
- if (kvm_msr_user_space(vcpu, ecx, KVM_EXIT_X86_WRMSR, data,
+ if (kvm_msr_user_space(vcpu, msr, KVM_EXIT_X86_WRMSR, data,
complete_fast_msr_access, r))
return 0;
/* Signal all other negative errors to userspace */
if (r < 0)
return r;
- trace_kvm_msr_write_ex(ecx, data);
+ trace_kvm_msr_write_ex(msr, data);
}
return kvm_x86_call(complete_emulated_msr)(vcpu, r);
}
+
+int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu)
+{
+ return __kvm_emulate_wrmsr(vcpu, kvm_rcx_read(vcpu), kvm_read_edx_eax(vcpu));
+}
EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
+int kvm_emulate_wrmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg)
+{
+ return __kvm_emulate_wrmsr(vcpu, msr, kvm_register_read(vcpu, reg));
+}
+EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr_imm);
+
int kvm_emulate_as_nop(struct kvm_vcpu *vcpu)
{
return kvm_skip_emulated_instruction(vcpu);
--
2.50.1
* [PATCH v2 3/4] KVM: VMX: Support the immediate form WRMSRNS in fastpath
From: Xin Li (Intel)
Date: 2025-08-02 0:15 UTC
To: linux-kernel, kvm
Cc: pbonzini, seanjc, tglx, mingo, bp, dave.hansen, x86, hpa, xin,
chao.gao
Refactor handle_fastpath_set_msr_irqoff() to accept the MSR index and
data via input arguments and rename it to __handle_fastpath_wrmsr(), so
that both the legacy and the immediate form of WRMSRNS can be handled
through thin wrappers: handle_fastpath_wrmsr() for the legacy form (per
Sean's suggested naming) and handle_fastpath_wrmsr_imm() for the
immediate form.
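The resulting structure, sketched from the diff below: both entry
points feed the common helper and differ only in how they source the
MSR index and data:

	/* Legacy form: index in ECX, data in EDX:EAX. */
	__handle_fastpath_wrmsr(vcpu, kvm_rcx_read(vcpu), kvm_read_edx_eax(vcpu));
	/* Immediate form: index from the exit qualification, data in a GPR. */
	__handle_fastpath_wrmsr(vcpu, msr, kvm_register_read(vcpu, reg));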
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
---
Change in v2:
*) Moved fastpath support in a separate patch (Sean).
---
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 5 ++++-
arch/x86/kvm/x86.c | 19 +++++++++++++------
arch/x86/kvm/x86.h | 3 ++-
4 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d9931c6c4bc6..4abc34b7c2c7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4189,7 +4189,7 @@ static fastpath_t svm_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
case SVM_EXIT_MSR:
if (!svm->vmcb->control.exit_info_1)
break;
- return handle_fastpath_set_msr_irqoff(vcpu);
+ return handle_fastpath_wrmsr(vcpu);
case SVM_EXIT_HLT:
return handle_fastpath_hlt(vcpu);
default:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c112595dfff9..2cd865e117a8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7191,7 +7191,10 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu,
switch (vmx_get_exit_reason(vcpu).basic) {
case EXIT_REASON_MSR_WRITE:
- return handle_fastpath_set_msr_irqoff(vcpu);
+ return handle_fastpath_wrmsr(vcpu);
+ case EXIT_REASON_MSR_WRITE_IMM:
+ return handle_fastpath_wrmsr_imm(vcpu, vmx_get_exit_qual(vcpu),
+ vmx_get_msr_imm_reg());
case EXIT_REASON_PREEMPTION_TIMER:
return handle_fastpath_preemption_timer(vcpu, force_immediate_exit);
case EXIT_REASON_HLT:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe12aae7089c..9aede349b6ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2202,10 +2202,8 @@ static int handle_fastpath_set_tscdeadline(struct kvm_vcpu *vcpu, u64 data)
return 0;
}
-fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu)
+static fastpath_t __handle_fastpath_wrmsr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
- u32 msr = kvm_rcx_read(vcpu);
- u64 data;
fastpath_t ret;
bool handled;
@@ -2213,11 +2211,9 @@ fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu)
switch (msr) {
case APIC_BASE_MSR + (APIC_ICR >> 4):
- data = kvm_read_edx_eax(vcpu);
handled = !handle_fastpath_set_x2apic_icr_irqoff(vcpu, data);
break;
case MSR_IA32_TSC_DEADLINE:
- data = kvm_read_edx_eax(vcpu);
handled = !handle_fastpath_set_tscdeadline(vcpu, data);
break;
default:
@@ -2239,7 +2235,18 @@ fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu)
return ret;
}
-EXPORT_SYMBOL_GPL(handle_fastpath_set_msr_irqoff);
+
+fastpath_t handle_fastpath_wrmsr(struct kvm_vcpu *vcpu)
+{
+ return __handle_fastpath_wrmsr(vcpu, kvm_rcx_read(vcpu), kvm_read_edx_eax(vcpu));
+}
+EXPORT_SYMBOL_GPL(handle_fastpath_wrmsr);
+
+fastpath_t handle_fastpath_wrmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg)
+{
+ return __handle_fastpath_wrmsr(vcpu, msr, kvm_register_read(vcpu, reg));
+}
+EXPORT_SYMBOL_GPL(handle_fastpath_wrmsr_imm);
/*
* Adapt set_msr() to msr_io()'s calling convention
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index bcfd9b719ada..de22c19b47fe 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -437,7 +437,8 @@ int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
void *insn, int insn_len);
int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
int emulation_type, void *insn, int insn_len);
-fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu);
+fastpath_t handle_fastpath_wrmsr(struct kvm_vcpu *vcpu);
+fastpath_t handle_fastpath_wrmsr_imm(struct kvm_vcpu *vcpu, u32 msr, int reg);
fastpath_t handle_fastpath_hlt(struct kvm_vcpu *vcpu);
extern struct kvm_caps kvm_caps;
--
2.50.1
* [PATCH v2 4/4] KVM: x86: Advertise support for the immediate form of MSR instructions
From: Xin Li (Intel)
Date: 2025-08-02 0:15 UTC
To: linux-kernel, kvm
Cc: pbonzini, seanjc, tglx, mingo, bp, dave.hansen, x86, hpa, xin,
chao.gao
Advertise support for the immediate form of MSR instructions to userspace
if the instructions are supported by the underlying CPU.
The immediate form of MSR access instructions is primarily motivated
by performance, not code size: because the MSR number is encoded as an
immediate, it is available *much* earlier in the pipeline, which gives
the hardware much more leeway in how a particular MSR is handled.
Explicitly clear the capability in SVM, as handling for these
instructions is only added for VMX.
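For illustration, a userspace VMM can detect the new capability via
KVM_GET_SUPPORTED_CPUID (CPUID.(EAX=7,ECX=1):ECX[5]); a minimal sketch
with error handling trimmed:

	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <sys/ioctl.h>
	#include <linux/kvm.h>

	int main(void)
	{
		int kvm = open("/dev/kvm", O_RDWR), i, n = 128;
		struct kvm_cpuid2 *c = calloc(1, sizeof(*c) + n * sizeof(c->entries[0]));

		c->nent = n;
		if (kvm < 0 || ioctl(kvm, KVM_GET_SUPPORTED_CPUID, c) < 0)
			return 1;
		for (i = 0; i < c->nent; i++)
			if (c->entries[i].function == 7 && c->entries[i].index == 1)
				printf("MSR_IMM: %s\n",
				       c->entries[i].ecx & (1u << 5) ? "yes" : "no");
		return 0;
	}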
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
---
Change in v2:
*) Cleared the capability in SVM (Sean).
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 6 +++++-
arch/x86/kvm/reverse_cpuid.h | 5 +++++
arch/x86/kvm/svm/svm.c | 6 +++++-
4 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c5d0082cf0a5..2a7d0dcc1d70 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -771,6 +771,7 @@ enum kvm_only_cpuid_leafs {
CPUID_7_2_EDX,
CPUID_24_0_EBX,
CPUID_8000_0021_ECX,
+ CPUID_7_1_ECX,
NR_KVM_CPU_CAPS,
NKVMCAPINTS = NR_KVM_CPU_CAPS - NCAPINTS,
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e2836a255b16..eaaa9203d4d9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -985,6 +985,10 @@ void kvm_set_cpu_caps(void)
F(LAM),
);
+ kvm_cpu_cap_init(CPUID_7_1_ECX,
+ SCATTERED_F(MSR_IMM),
+ );
+
kvm_cpu_cap_init(CPUID_7_1_EDX,
F(AVX_VNNI_INT8),
F(AVX_NE_CONVERT),
@@ -1411,9 +1415,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
goto out;
cpuid_entry_override(entry, CPUID_7_1_EAX);
+ cpuid_entry_override(entry, CPUID_7_1_ECX);
cpuid_entry_override(entry, CPUID_7_1_EDX);
entry->ebx = 0;
- entry->ecx = 0;
}
if (max_idx >= 2) {
entry = do_host_cpuid(array, function, 2);
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index c53b92379e6e..743ab25ba787 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -25,6 +25,9 @@
#define KVM_X86_FEATURE_SGX2 KVM_X86_FEATURE(CPUID_12_EAX, 1)
#define KVM_X86_FEATURE_SGX_EDECCSSA KVM_X86_FEATURE(CPUID_12_EAX, 11)
+/* Intel-defined sub-features, CPUID level 0x00000007:1 (ECX) */
+#define KVM_X86_FEATURE_MSR_IMM KVM_X86_FEATURE(CPUID_7_1_ECX, 5)
+
/* Intel-defined sub-features, CPUID level 0x00000007:1 (EDX) */
#define X86_FEATURE_AVX_VNNI_INT8 KVM_X86_FEATURE(CPUID_7_1_EDX, 4)
#define X86_FEATURE_AVX_NE_CONVERT KVM_X86_FEATURE(CPUID_7_1_EDX, 5)
@@ -87,6 +90,7 @@ static const struct cpuid_reg reverse_cpuid[] = {
[CPUID_7_2_EDX] = { 7, 2, CPUID_EDX},
[CPUID_24_0_EBX] = { 0x24, 0, CPUID_EBX},
[CPUID_8000_0021_ECX] = {0x80000021, 0, CPUID_ECX},
+ [CPUID_7_1_ECX] = { 7, 1, CPUID_ECX},
};
/*
@@ -128,6 +132,7 @@ static __always_inline u32 __feature_translate(int x86_feature)
KVM_X86_TRANSLATE_FEATURE(BHI_CTRL);
KVM_X86_TRANSLATE_FEATURE(TSA_SQ_NO);
KVM_X86_TRANSLATE_FEATURE(TSA_L1_NO);
+ KVM_X86_TRANSLATE_FEATURE(MSR_IMM);
default:
return x86_feature;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4abc34b7c2c7..57bcd92125a3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5301,8 +5301,12 @@ static __init void svm_set_cpu_caps(void)
/* CPUID 0x8000001F (SME/SEV features) */
sev_set_cpu_caps();
- /* Don't advertise Bus Lock Detect to guest if SVM support is absent */
+ /*
+ * Clear capabilities that are automatically configured by common code,
+ * but that require explicit SVM support (that isn't yet implemented).
+ */
kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
+ kvm_cpu_cap_clear(X86_FEATURE_MSR_IMM);
}
static __init int svm_hardware_setup(void)
--
2.50.1
* Re: [PATCH v2 2/4] KVM: VMX: Handle the immediate form of MSR instructions
From: Sean Christopherson
Date: 2025-08-05 20:03 UTC
To: Xin Li (Intel)
Cc: linux-kernel, kvm, pbonzini, tglx, mingo, bp, dave.hansen, x86,
hpa, chao.gao
On Fri, Aug 01, 2025, Xin Li (Intel) wrote:
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f19a76d3ca0e..c5d0082cf0a5 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -978,6 +978,7 @@ struct kvm_vcpu_arch {
> unsigned long guest_debug_dr7;
> u64 msr_platform_info;
> u64 msr_misc_features_enables;
> + u32 cui_rdmsr_imm_reg;
This should be an "int", mostly because that's how KVM tracks it throughout the
various accessors, but also because it'd let us use "-1" for an "invalid" value,
e.g. if we ever want to add sanity checks to the completion callback (I don't
think that's worth doing).
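E.g., a sketch of the suggested change:

	int cui_rdmsr_imm_reg;	/* -1 => no RDMSR-imm completion pending */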
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index aa157fe5b7b3..c112595dfff9 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6003,6 +6003,23 @@ static int handle_notify(struct kvm_vcpu *vcpu)
> return 1;
> }
>
> +static int vmx_get_msr_imm_reg(void)
It's a bit silly, but I think it's worth passing in the @vcpu here. E.g. if we
ever want to support caching the vmcs.VMX_INSTRUCTION_INFO. And because it costs
literally nothing (barring a truly stupid compiler).
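E.g. (sketch; @vcpu unused for now):

	static int vmx_get_msr_imm_reg(struct kvm_vcpu *vcpu)
	{
		return vmx_get_instr_info_reg(vmcs_read32(VMX_INSTRUCTION_INFO));
	}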
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a1c49bc681c4..fe12aae7089c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1968,6 +1968,13 @@ static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu)
> }
> }
>
> +static void complete_userspace_rdmsr_imm(struct kvm_vcpu *vcpu)
No need for this helper, the few lines can be open coded in complete_fast_rdmsr_imm().
> +{
> + if (!vcpu->run->msr.error)
> + kvm_register_write(vcpu, vcpu->arch.cui_rdmsr_imm_reg,
> + vcpu->run->msr.data);
> +}
> +
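I.e., fold it into the completion callback (sketch):

	static int complete_fast_rdmsr_imm(struct kvm_vcpu *vcpu)
	{
		if (!vcpu->run->msr.error)
			kvm_register_write(vcpu, vcpu->arch.cui_rdmsr_imm_reg,
					   vcpu->run->msr.data);
		return complete_fast_msr_access(vcpu);
	}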
* Re: [PATCH v2 2/4] KVM: VMX: Handle the immediate form of MSR instructions
From: Xin Li
Date: 2025-08-06 15:59 UTC
To: Sean Christopherson
Cc: linux-kernel, kvm, pbonzini, tglx, mingo, bp, dave.hansen, x86,
hpa, chao.gao
On 8/5/2025 1:03 PM, Sean Christopherson wrote:
> On Fri, Aug 01, 2025, Xin Li (Intel) wrote:
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index f19a76d3ca0e..c5d0082cf0a5 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -978,6 +978,7 @@ struct kvm_vcpu_arch {
>> unsigned long guest_debug_dr7;
>> u64 msr_platform_info;
>> u64 msr_misc_features_enables;
>> + u32 cui_rdmsr_imm_reg;
>
> This should be an "int", mostly because that's how KVM tracks it throughout the
> various accessors, but also because it'd let us use "-1" for an "invalid" value,
> e.g. if we ever want to add sanity checks to the completion callback (I don't
> think that's worth doing).
Sigh, using u32 was a dumb move on my part.
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index a1c49bc681c4..fe12aae7089c 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1968,6 +1968,13 @@ static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu)
>> }
>> }
>>
>> +static void complete_userspace_rdmsr_imm(struct kvm_vcpu *vcpu)
>
> No need for this helper, the few lines can be open coded in complete_fast_rdmsr_imm().
Yes, the change in v3 is simpler.
>
>> +{
>> + if (!vcpu->run->msr.error)
>> + kvm_register_write(vcpu, vcpu->arch.cui_rdmsr_imm_reg,
>> + vcpu->run->msr.data);
>> +}
>> +
>