* [PATCH v2] x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM
@ 2020-08-21 10:52 Paolo Bonzini
  2020-08-21 14:15 ` Brian Gerst
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Paolo Bonzini @ 2020-08-21 10:52 UTC
  To: linux-kernel, kvm
  Cc: x86, Sean Christopherson, Dave Hansen, Chang Seok Bae,
	Peter Zijlstra, Sasha Levin, Tom Lendacky, Andy Lutomirski

From: Sean Christopherson <sean.j.christopherson@intel.com>

Don't use RDPID in the paranoid entry flow, as it can consume a KVM
guest's MSR_TSC_AUX value if an NMI arrives during KVM's run loop.

In general, the kernel does not need TSC_AUX because it can just use
__this_cpu_read(cpu_number) to read the current processor id.  It can
also just block preemption and thread migration at will, therefore
it has no need for the atomic rdtsc+vgetcpu provided by RDTSCP.  For this
reason, as a performance optimization, KVM loads the guest's TSC_AUX when
a CPU first enters its run loop.  On AMD's SVM, it doesn't restore the
host's value until the CPU exits the run loop; VMX is even more aggressive
and defers restoring the host's value until the CPU returns to userspace.

This optimization obviously relies on the kernel not consuming TSC_AUX,
which falls apart if an NMI arrives during the run loop and uses RDPID.
Removing the optimization would be painful, as both SVM and VMX would
need to context switch the MSR on every VM-Enter (for a cost of 2x
WRMSR), whereas using LSL instead of RDPID is a minor blip.

Both SAVE_AND_SET_GSBASE and GET_PERCPU_BASE are only used in paranoid entry,
therefore the patch can just remove the RDPID alternative.

Fixes: eaad981291ee3 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chang Seok Bae <chang.seok.bae@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Debugged-by: Tom Lendacky <thomas.lendacky@amd.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/entry/calling.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 98e4d8886f11..ae9b0d4615b3 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -374,12 +374,14 @@ For 32-bit we have the following conventions - kernel is built with
  * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
  * We normally use %gs for accessing per-CPU data, but we are setting up
  * %gs here and obviously can not use %gs itself to access per-CPU data.
+ *
+ * Do not use RDPID, because KVM loads guest's TSC_AUX on vm-entry and
+ * may not restore the host's value until the CPU returns to userspace.
+ * Thus the kernel would consume a guest's TSC_AUX if an NMI arrives
+ * while running KVM's run loop.
  */
 .macro GET_PERCPU_BASE reg:req
-	ALTERNATIVE \
-		"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
-		"RDPID	\reg", \
-		X86_FEATURE_RDPID
+	LOAD_CPU_AND_NODE_SEG_LIMIT \reg
 	andq	$VDSO_CPUNODE_MASK, \reg
 	movq	__per_cpu_offset(, \reg, 8), \reg
 .endm
-- 
2.26.2


Thread overview: 7+ messages
2020-08-21 10:52 [PATCH v2] x86/entry/64: Do not use RDPID in paranoid entry to accomodate KVM Paolo Bonzini
2020-08-21 14:15 ` Brian Gerst
2020-08-21 14:21 ` [tip: x86/urgent] " tip-bot2 for Sean Christopherson
2020-08-21 14:21 ` [PATCH v2] " Sean Christopherson
2020-08-21 15:35   ` Brian Gerst
2020-08-25 10:44     ` Thomas Gleixner
2020-08-25 12:11       ` Brian Gerst
