public inbox for kvm@vger.kernel.org
* [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions
@ 2025-09-30  7:03 Juergen Gross
  2025-09-30  7:03 ` [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function Juergen Gross
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Juergen Gross @ 2025-09-30  7:03 UTC
  To: linux-kernel, x86, linux-coco, kvm, linux-hyperv, virtualization,
	llvm
  Cc: xin, Juergen Gross, Kirill A. Shutemov, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Sean Christopherson, Paolo Bonzini, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Vitaly Kuznetsov,
	Boris Ostrovsky, xen-devel, Ajay Kaher, Alexey Makhalov,
	Broadcom internal kernel review list, Andy Lutomirski,
	Peter Zijlstra, Nathan Chancellor, Nick Desaulniers,
	Bill Wendling, Justin Stitt

When building a kernel with CONFIG_PARAVIRT_XXL the paravirt
infrastructure will always use functions for reading or writing MSRs,
even when running on bare metal.

Switch to inline RDMSR/WRMSR instructions in this case, reducing the
paravirt overhead.

In order to make this less intrusive, some further reorganization of
the MSR access helpers is done in the first 5 patches.

The next 5 patches are converting the non-paravirt case to use direct
inlining of the MSR access instructions, including the WRMSRNS
instruction and the immediate variants of RDMSR and WRMSR if possible.

Patch 11 removes the PV hooks for MSR accesses and implements the
Xen PV cases via calls depending on X86_FEATURE_XENPV, which results
in runtime patching those calls away for the non-XenPV case.
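
The difference between the two dispatch shapes described above can be
sketched in plain userspace C. This is only a model under stated
assumptions: fake_msrs[], the model_*() names, read_via_ops() and
read_via_feature() are hypothetical and do not appear in the series, the
"MSR" is an array element rather than a real RDMSR, and the feature test
is an ordinary boolean, whereas the kernel would patch the call site at
boot via the alternatives mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: a tiny fake MSR file instead of real RDMSR. */
#define FAKE_MSR_COUNT 4
static uint64_t fake_msrs[FAKE_MSR_COUNT] = { 0x10, 0x20, 0x30, 0x40 };

/* Stands in for an inlined RDMSR instruction. */
static inline uint64_t model_native_read(uint32_t msr)
{
	return fake_msrs[msr % FAKE_MSR_COUNT];
}

/* Stands in for a Xen PV handler; the marker bit is purely illustrative. */
static uint64_t model_xen_read(uint32_t msr)
{
	return model_native_read(msr) | (1ULL << 63);
}

/* Old shape: every read is an indirect call, even on bare metal. */
struct model_pv_ops {
	uint64_t (*read_msr)(uint32_t msr);
};

static uint64_t model_native_read_fn(uint32_t msr)
{
	return model_native_read(msr);
}

static struct model_pv_ops model_ops = { .read_msr = model_native_read_fn };

static uint64_t read_via_ops(uint32_t msr)
{
	return model_ops.read_msr(msr);
}

/* New shape: only the Xen PV case is a call, the default path is inline. */
static bool model_feature_xenpv;

static inline uint64_t read_via_feature(uint32_t msr)
{
	if (model_feature_xenpv)
		return model_xen_read(msr);
	return model_native_read(msr);
}
```

With model_feature_xenpv left false, read_via_feature() reduces to the
inline access, which is the bare-metal case the cover letter wants to
make cheap.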

Patch 12 is a final little cleanup patch.

This series has been tested to work with Xen PV and on bare metal.

This series is inspired by Xin Li, who used a similar approach, but
(in my opinion) with some flaws. Originally I thought it should be
possible to use the paravirt infrastructure, but this turned out to be
rather complicated, especially for the Xen PV case in the *_safe()
variants of the MSR access functions.

Changes since V1:
- Use Xin Li's approach for inlining
- Several new patches

Juergen Gross (9):
  coco/tdx: Rename MSR access helpers
  x86/sev: replace call of native_wrmsr() with native_wrmsrq()
  x86/kvm: Remove the KVM private read_msr() function
  x86/msr: Minimize usage of native_*() msr access functions
  x86/msr: Move MSR trace calls one function level up
  x86/msr: Use the alternatives mechanism for WRMSR
  x86/msr: Use the alternatives mechanism for RDMSR
  x86/paravirt: Don't use pv_ops vector for MSR access functions
  x86/msr: Reduce number of low level MSR access helpers

Xin Li (Intel) (3):
  x86/cpufeatures: Add a CPU feature bit for MSR immediate form
    instructions
  x86/opcode: Add immediate form MSR instructions
  x86/extable: Add support for immediate form MSR instructions

 arch/x86/coco/tdx/tdx.c               |   8 +-
 arch/x86/hyperv/ivm.c                 |   2 +-
 arch/x86/include/asm/cpufeatures.h    |   1 +
 arch/x86/include/asm/fred.h           |   2 +-
 arch/x86/include/asm/kvm_host.h       |  10 -
 arch/x86/include/asm/msr.h            | 409 +++++++++++++++++++-------
 arch/x86/include/asm/paravirt.h       |  67 -----
 arch/x86/include/asm/paravirt_types.h |  13 -
 arch/x86/include/asm/sev-internal.h   |   7 +-
 arch/x86/kernel/cpu/scattered.c       |   1 +
 arch/x86/kernel/kvmclock.c            |   2 +-
 arch/x86/kernel/paravirt.c            |   5 -
 arch/x86/kvm/svm/svm.c                |  16 +-
 arch/x86/kvm/vmx/vmx.c                |   4 +-
 arch/x86/lib/x86-opcode-map.txt       |   5 +-
 arch/x86/mm/extable.c                 |  39 ++-
 arch/x86/xen/enlighten_pv.c           |  24 +-
 arch/x86/xen/pmu.c                    |   5 +-
 tools/arch/x86/lib/x86-opcode-map.txt |   5 +-
 19 files changed, 383 insertions(+), 242 deletions(-)

-- 
2.51.0



* [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function
  2025-09-30  7:03 [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions Juergen Gross
@ 2025-09-30  7:03 ` Juergen Gross
  2025-09-30 16:04   ` Sean Christopherson
  2025-09-30  7:03 ` [PATCH v2 04/12] x86/msr: Minimize usage of native_*() msr access functions Juergen Gross
  2025-09-30 19:19 ` [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions H. Peter Anvin
  2 siblings, 1 reply; 6+ messages in thread
From: Juergen Gross @ 2025-09-30  7:03 UTC
  To: linux-kernel, x86, kvm
  Cc: xin, Juergen Gross, Sean Christopherson, Paolo Bonzini,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin

Instead of having a KVM private read_msr() function, just use rdmsrq().

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- remove the helper and use rdmsrq() directly (Sean Christopherson)
---
 arch/x86/include/asm/kvm_host.h | 10 ----------
 arch/x86/kvm/vmx/vmx.c          |  4 ++--
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f19a76d3ca0e..aed754dda1a3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2296,16 +2296,6 @@ static inline void kvm_load_ldt(u16 sel)
 	asm("lldt %0" : : "rm"(sel));
 }
 
-#ifdef CONFIG_X86_64
-static inline unsigned long read_msr(unsigned long msr)
-{
-	u64 value;
-
-	rdmsrq(msr, value);
-	return value;
-}
-#endif
-
 static inline void kvm_inject_gp(struct kvm_vcpu *vcpu, u32 error_code)
 {
 	kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aa157fe5b7b3..12bb1769e3ae 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1288,8 +1288,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	} else {
 		savesegment(fs, fs_sel);
 		savesegment(gs, gs_sel);
-		fs_base = read_msr(MSR_FS_BASE);
-		vt->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
+		rdmsrq(MSR_FS_BASE, fs_base);
+		rdmsrq(MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base);
 	}
 
 	wrmsrq(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
-- 
2.51.0
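
The call-site change in the vmx.c hunk above, from "x = read_msr(M)" to
"rdmsrq(M, x)", follows from rdmsrq() storing into its second argument
instead of returning a value. A minimal userspace sketch of that
calling convention, assuming a fake MSR table: model_rdmsrq(),
model_native_rdmsrq(), model_read_msr(), struct model_vt and the
MODEL_MSR_* indices are hypothetical names for illustration, not kernel
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical indices into a fake MSR table, not real MSR numbers. */
#define MODEL_MSR_FS_BASE        0u
#define MODEL_MSR_KERNEL_GS_BASE 1u

static uint64_t fake_msrs[2] = { 0x7f0000001000ULL, 0xffff888000000000ULL };

static uint64_t model_native_rdmsrq(uint32_t msr)
{
	return fake_msrs[msr];
}

/* Macro form: assigns the value read into the lvalue passed as 'val'. */
#define model_rdmsrq(msr, val) ((val) = model_native_rdmsrq(msr))

/* Shape of the removed helper: wraps the macro to return the value. */
static uint64_t model_read_msr(uint32_t msr)
{
	uint64_t value;

	model_rdmsrq(msr, value);
	return value;
}

struct model_vt {
	uint64_t msr_host_kernel_gs_base;
};

/* Shape of the new call sites: the destination is passed directly. */
static void model_save_host_state(struct model_vt *vt, uint64_t *fs_base)
{
	model_rdmsrq(MODEL_MSR_FS_BASE, *fs_base);
	model_rdmsrq(MODEL_MSR_KERNEL_GS_BASE, vt->msr_host_kernel_gs_base);
}
```

Both forms read the same values; the wrapper only adds a temporary and
a return, which is why it can be dropped without changing behavior.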



* [PATCH v2 04/12] x86/msr: Minimize usage of native_*() msr access functions
  2025-09-30  7:03 [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions Juergen Gross
  2025-09-30  7:03 ` [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function Juergen Gross
@ 2025-09-30  7:03 ` Juergen Gross
  2025-09-30 19:19 ` [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions H. Peter Anvin
  2 siblings, 0 replies; 6+ messages in thread
From: Juergen Gross @ 2025-09-30  7:03 UTC
  To: linux-kernel, x86, linux-hyperv, kvm
  Cc: xin, Juergen Gross, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Paolo Bonzini, Vitaly Kuznetsov,
	Sean Christopherson, Boris Ostrovsky, xen-devel

In order to prepare for some MSR access function reorg work, switch
most users of native_{read|write}_msr[_safe]() to the more generic
rdmsr*()/wrmsr*() variants.

For now this will have some temporary performance impact with
paravirtualization configured when running on bare metal, but this
is a prerequisite for the planned direct inlining of the rdmsr/wrmsr
instructions with this configuration.

The main reason for this switch is the planned move of the MSR trace
function invocation from the native_*() functions to the generic
rdmsr*()/wrmsr*() variants. Without this switch the users of the
native_*() functions would lose the related tracing entries.
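
The layering argument in the paragraph above can be illustrated with a
small userspace model. It assumes the state after the planned move of
the trace call; fake_msr_value, trace_events and the model_*() functions
are hypothetical stand-ins, and the "trace" is just a counter.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t fake_msr_value = 0xd01;	/* pretend MSR contents */
static unsigned int trace_events;	/* number of trace records emitted */

/* Stand-in for the MSR read tracepoint: only counts invocations. */
static void model_trace_read_msr(uint32_t msr, uint64_t val)
{
	(void)msr;
	(void)val;
	trace_events++;
}

/* Lowest level after the move: bare access, no tracing. */
static uint64_t model_native_read_msr(uint32_t msr)
{
	(void)msr;
	return fake_msr_value;
}

/* Generic accessor: the trace call now lives here, one level up. */
static uint64_t model_rdmsrq(uint32_t msr)
{
	uint64_t val = model_native_read_msr(msr);

	model_trace_read_msr(msr, val);
	return val;
}
```

A caller that stays on model_native_read_msr() gets the value but leaves
trace_events untouched, which is the loss of trace entries the commit
message wants to avoid by converting callers first.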

Note that the Xen related MSR access functions will not be switched,
as these will be handled after the move of the trace hooks.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/hyperv/ivm.c      |  2 +-
 arch/x86/kernel/kvmclock.c |  2 +-
 arch/x86/kvm/svm/svm.c     | 16 ++++++++--------
 arch/x86/xen/pmu.c         |  4 ++--
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index ade6c665c97e..202ed01dc151 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -327,7 +327,7 @@ int hv_snp_boot_ap(u32 apic_id, unsigned long start_ip, unsigned int cpu)
 	asm volatile("movl %%ds, %%eax;" : "=a" (vmsa->ds.selector));
 	hv_populate_vmcb_seg(vmsa->ds, vmsa->gdtr.base);
 
-	vmsa->efer = native_read_msr(MSR_EFER);
+	rdmsrq(MSR_EFER, vmsa->efer);
 
 	vmsa->cr4 = native_read_cr4();
 	vmsa->cr3 = __native_read_cr3();
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index ca0a49eeac4a..b6cd45cce5fe 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -196,7 +196,7 @@ static void kvm_setup_secondary_clock(void)
 void kvmclock_disable(void)
 {
 	if (msr_kvm_system_time)
-		native_write_msr(msr_kvm_system_time, 0);
+		wrmsrq(msr_kvm_system_time, 0);
 }
 
 static void __init kvmclock_init_mem(void)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1bfebe40854f..105d5c2aae46 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -393,12 +393,12 @@ static void svm_init_erratum_383(void)
 		return;
 
 	/* Use _safe variants to not break nested virtualization */
-	if (native_read_msr_safe(MSR_AMD64_DC_CFG, &val))
+	if (rdmsrq_safe(MSR_AMD64_DC_CFG, &val))
 		return;
 
 	val |= (1ULL << 47);
 
-	native_write_msr_safe(MSR_AMD64_DC_CFG, val);
+	wrmsrq_safe(MSR_AMD64_DC_CFG, val);
 
 	erratum_383_found = true;
 }
@@ -558,9 +558,9 @@ static int svm_enable_virtualization_cpu(void)
 		u64 len, status = 0;
 		int err;
 
-		err = native_read_msr_safe(MSR_AMD64_OSVW_ID_LENGTH, &len);
+		err = rdmsrq_safe(MSR_AMD64_OSVW_ID_LENGTH, &len);
 		if (!err)
-			err = native_read_msr_safe(MSR_AMD64_OSVW_STATUS, &status);
+			err = rdmsrq_safe(MSR_AMD64_OSVW_STATUS, &status);
 
 		if (err)
 			osvw_status = osvw_len = 0;
@@ -2032,7 +2032,7 @@ static bool is_erratum_383(void)
 	if (!erratum_383_found)
 		return false;
 
-	if (native_read_msr_safe(MSR_IA32_MC0_STATUS, &value))
+	if (rdmsrq_safe(MSR_IA32_MC0_STATUS, &value))
 		return false;
 
 	/* Bit 62 may or may not be set for this mce */
@@ -2043,11 +2043,11 @@ static bool is_erratum_383(void)
 
 	/* Clear MCi_STATUS registers */
 	for (i = 0; i < 6; ++i)
-		native_write_msr_safe(MSR_IA32_MCx_STATUS(i), 0);
+		wrmsrq_safe(MSR_IA32_MCx_STATUS(i), 0);
 
-	if (!native_read_msr_safe(MSR_IA32_MCG_STATUS, &value)) {
+	if (!rdmsrq_safe(MSR_IA32_MCG_STATUS, &value)) {
 		value &= ~(1ULL << 2);
-		native_write_msr_safe(MSR_IA32_MCG_STATUS, value);
+		wrmsrq_safe(MSR_IA32_MCG_STATUS, value);
 	}
 
 	/* Flush tlb to evict multi-match entries */
diff --git a/arch/x86/xen/pmu.c b/arch/x86/xen/pmu.c
index 8f89ce0b67e3..d49a3bdc448b 100644
--- a/arch/x86/xen/pmu.c
+++ b/arch/x86/xen/pmu.c
@@ -323,7 +323,7 @@ static u64 xen_amd_read_pmc(int counter)
 		u64 val;
 
 		msr = amd_counters_base + (counter * amd_msr_step);
-		native_read_msr_safe(msr, &val);
+		rdmsrq_safe(msr, &val);
 		return val;
 	}
 
@@ -349,7 +349,7 @@ static u64 xen_intel_read_pmc(int counter)
 		else
 			msr = MSR_IA32_PERFCTR0 + counter;
 
-		native_read_msr_safe(msr, &val);
+		rdmsrq_safe(msr, &val);
 		return val;
 	}
 
-- 
2.51.0
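
The *_safe() call sites converted in the svm.c hunks above share one
pattern: the accessor returns an error code and delivers the value
through a pointer, so reads can be chained and the results discarded as
a group on failure. A hedged userspace sketch of that pattern follows.
model_rdmsrq_safe(), model_read_osvw() and the MODEL_MSR_* indices are
hypothetical, a "fault" is modelled as an out-of-range index, and
zeroing *val on a fault is a choice made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical indices into a fake MSR table, not real MSR numbers. */
#define MODEL_MSR_OSVW_ID_LENGTH 0u
#define MODEL_MSR_OSVW_STATUS    1u
#define MODEL_MSR_MISSING        2u	/* models an MSR whose access faults */

static uint64_t fake_msrs[2] = { 5, 0x3 };

/* Returns 0 on success or -EIO on a modelled fault; zeroes *val on fault. */
static int model_rdmsrq_safe(uint32_t msr, uint64_t *val)
{
	if (msr >= 2) {
		*val = 0;
		return -EIO;
	}
	*val = fake_msrs[msr];
	return 0;
}

/* Chained reads: the second read is skipped if the first one failed. */
static int model_read_osvw(uint32_t len_msr, uint32_t status_msr,
			   uint64_t *len, uint64_t *status)
{
	int err;

	*status = 0;
	err = model_rdmsrq_safe(len_msr, len);
	if (!err)
		err = model_rdmsrq_safe(status_msr, status);

	/* On any failure, report neither value. */
	if (err)
		*len = *status = 0;
	return err;
}
```

Call sites that ignore the return value, as the xen/pmu.c hunks do, rely
on the accessor leaving a defined value in the output on failure.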



* Re: [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function
  2025-09-30  7:03 ` [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function Juergen Gross
@ 2025-09-30 16:04   ` Sean Christopherson
  2025-10-01  9:14     ` Jürgen Groß
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2025-09-30 16:04 UTC
  To: Juergen Gross
  Cc: linux-kernel, x86, kvm, xin, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin

For the scope:

  KVM: x86:

because x86/kvm is specifically used for guest-side code.

On Tue, Sep 30, 2025, Juergen Gross wrote:
> Instead of having a KVM private read_msr() function, just use rdmsrq().
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2:
> - remove the helper and use rdmsrq() directly (Sean Christopherson)
> ---
>  arch/x86/include/asm/kvm_host.h | 10 ----------
>  arch/x86/kvm/vmx/vmx.c          |  4 ++--
>  2 files changed, 2 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index f19a76d3ca0e..aed754dda1a3 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2296,16 +2296,6 @@ static inline void kvm_load_ldt(u16 sel)
>  	asm("lldt %0" : : "rm"(sel));
>  }
>  
> -#ifdef CONFIG_X86_64
> -static inline unsigned long read_msr(unsigned long msr)
> -{
> -	u64 value;
> -
> -	rdmsrq(msr, value);
> -	return value;
> -}
> -#endif

Gah, the same commit[*] that added a wrmsrns() use also added a read_msr().  Sorry :-(

[*] 65391feb042b ("KVM: VMX: Add host MSR read/write helpers to consolidate preemption handling")


* Re: [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions
  2025-09-30  7:03 [PATCH v2 00/12] x86/msr: Inline rdmsr/wrmsr instructions Juergen Gross
  2025-09-30  7:03 ` [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function Juergen Gross
  2025-09-30  7:03 ` [PATCH v2 04/12] x86/msr: Minimize usage of native_*() msr access functions Juergen Gross
@ 2025-09-30 19:19 ` H. Peter Anvin
  2 siblings, 0 replies; 6+ messages in thread
From: H. Peter Anvin @ 2025-09-30 19:19 UTC
  To: Juergen Gross, linux-kernel, x86, linux-coco, kvm, linux-hyperv,
	virtualization, llvm
  Cc: xin, Kirill A. Shutemov, Dave Hansen, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Sean Christopherson, Paolo Bonzini,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Vitaly Kuznetsov, Boris Ostrovsky, xen-devel, Ajay Kaher,
	Alexey Makhalov, Broadcom internal kernel review list,
	Andy Lutomirski, Peter Zijlstra, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt

On 2025-09-30 00:03, Juergen Gross wrote:
> When building a kernel with CONFIG_PARAVIRT_XXL the paravirt
> infrastructure will always use functions for reading or writing MSRs,
> even when running on bare metal.
> 
> Switch to inline RDMSR/WRMSR instructions in this case, reducing the
> paravirt overhead.
> 
> In order to make this less intrusive, some further reorganization of
> the MSR access helpers is done in the first 5 patches.
> 
> The next 5 patches are converting the non-paravirt case to use direct
> inlining of the MSR access instructions, including the WRMSRNS
> instruction and the immediate variants of RDMSR and WRMSR if possible.
> 
> Patch 11 removes the PV hooks for MSR accesses and implements the
> Xen PV cases via calls depending on X86_FEATURE_XENPV, which results
> in runtime patching those calls away for the non-XenPV case.
> 
> Patch 12 is a final little cleanup patch.
> 
> This series has been tested to work with Xen PV and on bare metal.
> 
> This series is inspired by Xin Li, who used a similar approach, but
> (in my opinion) with some flaws. Originally I thought it should be
> possible to use the paravirt infrastructure, but this turned out to be
> rather complicated, especially for the Xen PV case in the *_safe()
> variants of the MSR access functions.
> 

Looks good to me.

(I'm not at all surprised that paravirt_ops didn't do the job. Both I and Xin
had come to the same conclusion.)


Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>


* Re: [PATCH v2 03/12] x86/kvm: Remove the KVM private read_msr() function
  2025-09-30 16:04   ` Sean Christopherson
@ 2025-10-01  9:14     ` Jürgen Groß
  0 siblings, 0 replies; 6+ messages in thread
From: Jürgen Groß @ 2025-10-01  9:14 UTC
  To: Sean Christopherson
  Cc: linux-kernel, x86, kvm, xin, Paolo Bonzini, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin


On 30.09.25 18:04, Sean Christopherson wrote:
> For the scope:
> 
>    KVM: x86:
> 
> because x86/kvm is specifically used for guest-side code.

Okay, will change that.

> 
> On Tue, Sep 30, 2025, Juergen Gross wrote:
>> Instead of having a KVM private read_msr() function, just use rdmsrq().
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V2:
>> - remove the helper and use rdmsrq() directly (Sean Christopherson)
>> ---
>>   arch/x86/include/asm/kvm_host.h | 10 ----------
>>   arch/x86/kvm/vmx/vmx.c          |  4 ++--
>>   2 files changed, 2 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index f19a76d3ca0e..aed754dda1a3 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -2296,16 +2296,6 @@ static inline void kvm_load_ldt(u16 sel)
>>   	asm("lldt %0" : : "rm"(sel));
>>   }
>>   
>> -#ifdef CONFIG_X86_64
>> -static inline unsigned long read_msr(unsigned long msr)
>> -{
>> -	u64 value;
>> -
>> -	rdmsrq(msr, value);
>> -	return value;
>> -}
>> -#endif
> 
> Gah, the same commit[*] that added a wrmsrns() use also added a read_msr().  Sorry :-(
> 
> [*] 65391feb042b ("KVM: VMX: Add host MSR read/write helpers to consolidate preemption handling")

Again, thanks for the heads up.


Juergen


