Kernel KVM virtualization development
 help / color / mirror / Atom feed
* [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
@ 2020-08-21  2:50 Sean Christopherson
  2020-08-21  7:24 ` peterz
  2020-08-21  7:47 ` Borislav Petkov
  0 siblings, 2 replies; 17+ messages in thread
From: Sean Christopherson @ 2020-08-21  2:50 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86
  Cc: H. Peter Anvin, linux-kernel, Dave Hansen, Chang Seok Bae,
	Peter Zijlstra, Sasha Levin, Paolo Bonzini, kvm, Tom Lendacky,
	Sean Christopherson

Don't use RDPID in the paranoid entry flow if KVM is enabled as doing so
can consume a KVM guest's MSR_TSC_AUX value if an NMI arrives in KVM's
run loop.

As a performance optimization, KVM loads the guest's TSC_AUX when a CPU
first enters its run loop, and on AMD's SVM doesn't restore the host's
value until the CPU exits the run loop.  VMX is even more aggressive and
defers restoring the host's value until the CPU returns to userspace.
This optimization obviously relies on the kernel not consuming TSC_AUX,
which falls apart if an NMI arrives in the run loop.

Removing KVM's optimizaton would be painful, as both SVM and VMX would
need to context switch the MSR on every VM-Enter (2x WRMSR + 1x RDMSR),
whereas using LSL instead RDPID is a minor blip.

Fixes: eaad981291ee3 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chang Seok Bae <chang.seok.bae@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reported-by: Tom Lendacky <thomas.lendacky@amd.com>
Debugged-by: Tom Lendacky <thomas.lendacky@amd.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---

Andy, I know you said "unconditionally", but it felt weird adding a
comment way down in GET_PERCPU_BASE without plumbing a param in to help
provide context.  But, paranoid_entry is the only user so adding a param
that is unconditional also felt weird.  That being said, I definitely
don't have a strong opinion one way or the other.

 arch/x86/entry/calling.h  | 10 +++++++---
 arch/x86/entry/entry_64.S |  7 ++++++-
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 98e4d8886f11c..a925c0cf89c1a 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -342,9 +342,9 @@ For 32-bit we have the following conventions - kernel is built with
 #endif
 .endm
 
-.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req
+.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req no_rdpid=0
 	rdgsbase \save_reg
-	GET_PERCPU_BASE \scratch_reg
+	GET_PERCPU_BASE \scratch_reg \no_rdpid
 	wrgsbase \scratch_reg
 .endm
 
@@ -375,11 +375,15 @@ For 32-bit we have the following conventions - kernel is built with
  * We normally use %gs for accessing per-CPU data, but we are setting up
  * %gs here and obviously can not use %gs itself to access per-CPU data.
  */
-.macro GET_PERCPU_BASE reg:req
+.macro GET_PERCPU_BASE reg:req no_rdpid=0
+	.if \no_rdpid
+	LOAD_CPU_AND_NODE_SEG_LIMIT \reg
+	.else
 	ALTERNATIVE \
 		"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
 		"RDPID	\reg", \
 		X86_FEATURE_RDPID
+	.endif
 	andq	$VDSO_CPUNODE_MASK, \reg
 	movq	__per_cpu_offset(, \reg, 8), \reg
 .endm
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 70dea93378162..fd915c46297c5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -842,8 +842,13 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	 *
 	 * The MSR write ensures that no subsequent load is based on a
 	 * mispredicted GSBASE. No extra FENCE required.
+	 *
+	 * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
+	 * if an NMI arrives in KVM's run loop.  KVM loads guest's TSC_AUX on
+	 * VM-Enter and may not restore the host's value until the CPU returns
+	 * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
 	 */
-	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
+	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx no_rdpid=IS_ENABLED(CONFIG_KVM)
 	ret
 
 .Lparanoid_entry_checkgs:
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  2:50 [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled Sean Christopherson
@ 2020-08-21  7:24 ` peterz
  2020-08-21  7:44   ` Borislav Petkov
  2020-08-21  7:47 ` Borislav Petkov
  1 sibling, 1 reply; 17+ messages in thread
From: peterz @ 2020-08-21  7:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, linux-kernel, Dave Hansen, Chang Seok Bae,
	Sasha Levin, Paolo Bonzini, kvm, Tom Lendacky

On Thu, Aug 20, 2020 at 07:50:50PM -0700, Sean Christopherson wrote:
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 70dea93378162..fd915c46297c5 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -842,8 +842,13 @@ SYM_CODE_START_LOCAL(paranoid_entry)
>  	 *
>  	 * The MSR write ensures that no subsequent load is based on a
>  	 * mispredicted GSBASE. No extra FENCE required.
> +	 *
> +	 * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
> +	 * if an NMI arrives in KVM's run loop.  KVM loads guest's TSC_AUX on
> +	 * VM-Enter and may not restore the host's value until the CPU returns
> +	 * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
>  	 */
> -	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
> +	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx no_rdpid=IS_ENABLED(CONFIG_KVM)
>  	ret

With distro configs that's going to be a guaranteed no_rdpid. Also with
a grand total of 0 performance numbers that RDPID is even worth it, I'd
suggest to just unconditionally remove that thing. Simpler code
all-around.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  7:24 ` peterz
@ 2020-08-21  7:44   ` Borislav Petkov
  0 siblings, 0 replies; 17+ messages in thread
From: Borislav Petkov @ 2020-08-21  7:44 UTC (permalink / raw)
  To: peterz
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Sasha Levin, Paolo Bonzini, kvm, Tom Lendacky

On Fri, Aug 21, 2020 at 09:24:14AM +0200, peterz@infradead.org wrote:
> With distro configs that's going to be a guaranteed no_rdpid. Also with
> a grand total of 0 performance numbers that RDPID is even worth it, I'd
> suggest to just unconditionally remove that thing. Simpler code
> all-around.

I was just about to say the same thing - all this dancing around just to
keep RDPID better be backed by numbers, otherwise just remove the damn
thing in that path and use LSL and be done with it.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  2:50 [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled Sean Christopherson
  2020-08-21  7:24 ` peterz
@ 2020-08-21  7:47 ` Borislav Petkov
  2020-08-21  8:09   ` Paolo Bonzini
  1 sibling, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2020-08-21  7:47 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, linux-kernel, Dave Hansen, Chang Seok Bae,
	Peter Zijlstra, Sasha Levin, Paolo Bonzini, kvm, Tom Lendacky

On Thu, Aug 20, 2020 at 07:50:50PM -0700, Sean Christopherson wrote:
> +	 * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
> +	 * if an NMI arrives in KVM's run loop.  KVM loads guest's TSC_AUX on
> +	 * VM-Enter and may not restore the host's value until the CPU returns
> +	 * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
>  	 */

And frankly, this is really unfair. The kernel should be able to use any
MSR. IOW, KVM needs to be fixed here. I'm sure it context-switches other
MSRs so one more MSR is not a big deal.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  7:47 ` Borislav Petkov
@ 2020-08-21  8:09   ` Paolo Bonzini
  2020-08-21  8:16     ` Borislav Petkov
  2020-08-21  9:28     ` Thomas Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2020-08-21  8:09 UTC (permalink / raw)
  To: Borislav Petkov, Sean Christopherson
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, linux-kernel, Dave Hansen, Chang Seok Bae,
	Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On 21/08/20 09:47, Borislav Petkov wrote:
> On Thu, Aug 20, 2020 at 07:50:50PM -0700, Sean Christopherson wrote:
>> +	 * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
>> +	 * if an NMI arrives in KVM's run loop.  KVM loads guest's TSC_AUX on
>> +	 * VM-Enter and may not restore the host's value until the CPU returns
>> +	 * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
>>  	 */
> And frankly, this is really unfair. The kernel should be able to use any
> MSR. IOW, KVM needs to be fixed here. I'm sure it context-switches other
> MSRs so one more MSR is not a big deal.

The only MSR that KVM needs to context-switch manually are XSS and
SPEC_CTRL.  They tend to be the same on host and guest in which case
they can be optimized away.

All the other MSRs (EFER and PAT are those that come to mind) are
handled by the microcode and thus they don't have the slowness of
RDMSR/WRMSR

One more MSR *is* a big deal: KVM's vmentry+vmexit cost is around 1000
cycles, adding 100 clock cycles for 2 WRMSRs is a 10% increase.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  8:09   ` Paolo Bonzini
@ 2020-08-21  8:16     ` Borislav Petkov
  2020-08-21  9:05       ` Paolo Bonzini
  2020-08-21  9:28     ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2020-08-21  8:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On Fri, Aug 21, 2020 at 10:09:01AM +0200, Paolo Bonzini wrote:
> One more MSR *is* a big deal: KVM's vmentry+vmexit cost is around 1000
> cycles, adding 100 clock cycles for 2 WRMSRs is a 10% increase.

The kernel uses TSC_AUX so we can't reserve it to KVM either.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  8:16     ` Borislav Petkov
@ 2020-08-21  9:05       ` Paolo Bonzini
  2020-08-21  9:22         ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2020-08-21  9:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On 21/08/20 10:16, Borislav Petkov wrote:
> On Fri, Aug 21, 2020 at 10:09:01AM +0200, Paolo Bonzini wrote:
>> One more MSR *is* a big deal: KVM's vmentry+vmexit cost is around 1000
>> cycles, adding 100 clock cycles for 2 WRMSRs is a 10% increase.
> 
> The kernel uses TSC_AUX so we can't reserve it to KVM either.

KVM only uses TSC_AUX while in kernel space, because the kernel hadn't
used it until now.  That's for a good reason:

* if possible, __this_cpu_read(cpu_number) is always faster.

* The kernel can just block preemption at its will and has no need for
the atomic rdtsc+vgetcpu provided by RDTSCP.

So far, the kernel had always used LSL instead of RDPID when
__this_cpu_read was not available.  In one place, RDTSCP is used as an
ordered rdtsc but it discards the TSC_AUX value.  RDPID is also used in
the vDSO but it isn't kernel space.

Hence the assumption that KVM makes (and has made ever since TSC_AUX was
introduced.  What is the difference in speed between LSL and RDPID?  I
don't have a machine that has RDPID to test it, unfortunately.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:05       ` Paolo Bonzini
@ 2020-08-21  9:22         ` Borislav Petkov
  2020-08-21  9:44           ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2020-08-21  9:22 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On Fri, Aug 21, 2020 at 11:05:24AM +0200, Paolo Bonzini wrote:
> ... RDTSCP is used as an ordered rdtsc but it discards the TSC_AUX
> value.

... and now because KVM is using it, the kernel can forget using
TSC_AUX. Are you kidding me?!

> Hence the assumption that KVM makes (and has made ever since TSC_AUX was
> introduced.

And *this* is the problem. KVM can't just be grabbing MSRs and claiming
them because it started using them first. You guys need to stop this. If
it is a shared resource, there better be an agreement about sharing it.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  8:09   ` Paolo Bonzini
  2020-08-21  8:16     ` Borislav Petkov
@ 2020-08-21  9:28     ` Thomas Gleixner
  2020-08-21  9:37       ` Paolo Bonzini
  2020-08-21 19:55       ` hpa
  1 sibling, 2 replies; 17+ messages in thread
From: Thomas Gleixner @ 2020-08-21  9:28 UTC (permalink / raw)
  To: Paolo Bonzini, Borislav Petkov, Sean Christopherson
  Cc: Andy Lutomirski, Ingo Molnar, x86, H. Peter Anvin, linux-kernel,
	Dave Hansen, Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm,
	Tom Lendacky

On Fri, Aug 21 2020 at 10:09, Paolo Bonzini wrote:
> On 21/08/20 09:47, Borislav Petkov wrote:
>> On Thu, Aug 20, 2020 at 07:50:50PM -0700, Sean Christopherson wrote:
>>> +	 * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
>>> +	 * if an NMI arrives in KVM's run loop.  KVM loads guest's TSC_AUX on
>>> +	 * VM-Enter and may not restore the host's value until the CPU returns
>>> +	 * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
>>>  	 */
>> And frankly, this is really unfair. The kernel should be able to use any
>> MSR. IOW, KVM needs to be fixed here. I'm sure it context-switches other
>> MSRs so one more MSR is not a big deal.
>
> The only MSR that KVM needs to context-switch manually are XSS and
> SPEC_CTRL.  They tend to be the same on host and guest in which case
> they can be optimized away.
>
> All the other MSRs (EFER and PAT are those that come to mind) are
> handled by the microcode and thus they don't have the slowness of
> RDMSR/WRMSR
>
> One more MSR *is* a big deal: KVM's vmentry+vmexit cost is around 1000
> cycles, adding 100 clock cycles for 2 WRMSRs is a 10% increase.

We all know that MSRs are slow, but as a general rule I have to make it
entirely clear that the kernel has precedence over KVM.

If the kernel wants to use an MSR for it's own purposes then KVM has to
deal with that and not the other way round. Preventing the kernel from
using a facility freely is not an option ever.

The insanities of KVM performance optimizations have bitten us more than
once.

For this particular case at hand I don't care much and we should just
rip the whole RDPID thing out unconditionally. We still have zero
numbers about the performance difference vs. LSL.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:28     ` Thomas Gleixner
@ 2020-08-21  9:37       ` Paolo Bonzini
  2020-08-21 19:55       ` hpa
  1 sibling, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2020-08-21  9:37 UTC (permalink / raw)
  To: Thomas Gleixner, Borislav Petkov, Sean Christopherson
  Cc: Andy Lutomirski, Ingo Molnar, x86, H. Peter Anvin, linux-kernel,
	Dave Hansen, Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm,
	Tom Lendacky

On 21/08/20 11:28, Thomas Gleixner wrote:
> We all know that MSRs are slow, but as a general rule I have to make it
> entirely clear that the kernel has precedence over KVM.

I totally agree.  I just don't think that it matters _in this case_,
because the kernel hardly has any reason to use TSC_AUX while in ring0.

Paolo

> If the kernel wants to use an MSR for it's own purposes then KVM has to
> deal with that and not the other way round. Preventing the kernel from
> using a facility freely is not an option ever.
> 
> The insanities of KVM performance optimizations have bitten us more than
> once.
> 
> For this particular case at hand I don't care much and we should just
> rip the whole RDPID thing out unconditionally. We still have zero
> numbers about the performance difference vs. LSL.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:22         ` Borislav Petkov
@ 2020-08-21  9:44           ` Paolo Bonzini
  2020-08-21  9:48             ` Borislav Petkov
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2020-08-21  9:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On 21/08/20 11:22, Borislav Petkov wrote:
>> Hence the assumption that KVM makes (and has made ever since TSC_AUX was
>> introduced.
> And *this* is the problem. KVM can't just be grabbing MSRs and claiming
> them because it started using them first. You guys need to stop this. If
> it is a shared resource, there better be an agreement about sharing it.

It's not like we grab MSRs every day.  The user-return notifier restores
6 MSRs (7 on very old processors).  The last two that were added were
MSR_TSC_AUX itself in 2009 (!) and MSR_IA32_TSX_CTRL last year.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:44           ` Paolo Bonzini
@ 2020-08-21  9:48             ` Borislav Petkov
  2020-08-21 10:07               ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Borislav Petkov @ 2020-08-21  9:48 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On Fri, Aug 21, 2020 at 11:44:33AM +0200, Paolo Bonzini wrote:
> It's not like we grab MSRs every day.  The user-return notifier restores
> 6 MSRs (7 on very old processors).  The last two that were added were
> MSR_TSC_AUX itself in 2009 (!) and MSR_IA32_TSX_CTRL last year.

What about "If it is a shared resource, there better be an agreement
about sharing it." is not clear?

It doesn't matter how many or which resources - there needs to be a
contract for shared use so that shared use is possible. It is that
simple.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:48             ` Borislav Petkov
@ 2020-08-21 10:07               ` Paolo Bonzini
  2020-08-22 16:42                 ` Andy Lutomirski
  0 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2020-08-21 10:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sean Christopherson, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On 21/08/20 11:48, Borislav Petkov wrote:
>> It's not like we grab MSRs every day.  The user-return notifier restores
>> 6 MSRs (7 on very old processors).  The last two that were added were
>> MSR_TSC_AUX itself in 2009 (!) and MSR_IA32_TSX_CTRL last year.
> What about "If it is a shared resource, there better be an agreement
> about sharing it." is not clear?
> 
> It doesn't matter how many or which resources - there needs to be a
> contract for shared use so that shared use is possible. It is that
> simple.

Sure, and I'll make sure to have that discussion the next time we add a
shared MSR in 2029.

In the meanwhile:

* for the syscall MSRs, patches to share them were reviewed by hpa and
peterz so I guess there's no problem.

* for MSR_TSC_AUX, which is the one that is causing problems, everybody
seems to agree with just using LSL (in the lack specific numbers on
performance improvements from RDPID).

* for MSR_IA32_TSX_CTRL.RTM_DISABLE, which is the only one that was
added in the last 10 years, I'm pretty sure there are no plans for using
the Trusty Sidechannel eXtension in the kernel.  So that should be fine
too.  (The CPUID_CLEAR bit of the MSR is not shared).

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21  9:28     ` Thomas Gleixner
  2020-08-21  9:37       ` Paolo Bonzini
@ 2020-08-21 19:55       ` hpa
  2020-08-21 20:02         ` Peter Zijlstra
  1 sibling, 1 reply; 17+ messages in thread
From: hpa @ 2020-08-21 19:55 UTC (permalink / raw)
  To: Thomas Gleixner, Paolo Bonzini, Borislav Petkov,
	Sean Christopherson
  Cc: Andy Lutomirski, Ingo Molnar, x86, linux-kernel, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm, Tom Lendacky

On August 21, 2020 2:28:32 AM PDT, Thomas Gleixner <tglx@linutronix.de> wrote:
>On Fri, Aug 21 2020 at 10:09, Paolo Bonzini wrote:
>> On 21/08/20 09:47, Borislav Petkov wrote:
>>> On Thu, Aug 20, 2020 at 07:50:50PM -0700, Sean Christopherson wrote:
>>>> +	 * Disallow RDPID if KVM is enabled as it may consume a guest's
>TSC_AUX
>>>> +	 * if an NMI arrives in KVM's run loop.  KVM loads guest's
>TSC_AUX on
>>>> +	 * VM-Enter and may not restore the host's value until the CPU
>returns
>>>> +	 * to userspace, i.e. KVM depends on the kernel not using
>TSC_AUX.
>>>>  	 */
>>> And frankly, this is really unfair. The kernel should be able to use
>any
>>> MSR. IOW, KVM needs to be fixed here. I'm sure it context-switches
>other
>>> MSRs so one more MSR is not a big deal.
>>
>> The only MSR that KVM needs to context-switch manually are XSS and
>> SPEC_CTRL.  They tend to be the same on host and guest in which case
>> they can be optimized away.
>>
>> All the other MSRs (EFER and PAT are those that come to mind) are
>> handled by the microcode and thus they don't have the slowness of
>> RDMSR/WRMSR
>>
>> One more MSR *is* a big deal: KVM's vmentry+vmexit cost is around
>1000
>> cycles, adding 100 clock cycles for 2 WRMSRs is a 10% increase.
>
>We all know that MSRs are slow, but as a general rule I have to make it
>entirely clear that the kernel has precedence over KVM.
>
>If the kernel wants to use an MSR for it's own purposes then KVM has to
>deal with that and not the other way round. Preventing the kernel from
>using a facility freely is not an option ever.
>
>The insanities of KVM performance optimizations have bitten us more
>than
>once.
>
>For this particular case at hand I don't care much and we should just
>rip the whole RDPID thing out unconditionally. We still have zero
>numbers about the performance difference vs. LSL.
>
>Thanks,
>
>        tglx

It is hardly going to be a performance difference for paranoid entry, which is hopefully rare enough that it falls into the noise.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21 19:55       ` hpa
@ 2020-08-21 20:02         ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2020-08-21 20:02 UTC (permalink / raw)
  To: hpa
  Cc: Thomas Gleixner, Paolo Bonzini, Borislav Petkov,
	Sean Christopherson, Andy Lutomirski, Ingo Molnar, x86,
	linux-kernel, Dave Hansen, Chang Seok Bae, Sasha Levin, kvm,
	Tom Lendacky

On Fri, Aug 21, 2020 at 12:55:53PM -0700, hpa@zytor.com wrote:

> It is hardly going to be a performance difference for paranoid entry,
> which is hopefully rare enough that it falls into the noise.

Try perf some day ;-)

But yeah, given the utter trainwreck that NMIs are anyway, this is all
not going to matter much.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-21 10:07               ` Paolo Bonzini
@ 2020-08-22 16:42                 ` Andy Lutomirski
  2020-09-16 16:54                   ` Paolo Bonzini
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Lutomirski @ 2020-08-22 16:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Borislav Petkov, Sean Christopherson, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, X86 ML, H. Peter Anvin, LKML,
	Dave Hansen, Chang Seok Bae, Peter Zijlstra, Sasha Levin,
	kvm list, Tom Lendacky

On Fri, Aug 21, 2020 at 3:07 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 21/08/20 11:48, Borislav Petkov wrote:
> >> It's not like we grab MSRs every day.  The user-return notifier restores
> >> 6 MSRs (7 on very old processors).  The last two that were added were
> >> MSR_TSC_AUX itself in 2009 (!) and MSR_IA32_TSX_CTRL last year.
> > What about "If it is a shared resource, there better be an agreement
> > about sharing it." is not clear?
> >
> > It doesn't matter how many or which resources - there needs to be a
> > contract for shared use so that shared use is possible. It is that
> > simple.
>
> Sure, and I'll make sure to have that discussion the next time we add a
> shared MSR in 2029.
>
> In the meanwhile:
>
> * for the syscall MSRs, patches to share them were reviewed by hpa and
> peterz so I guess there's no problem.
>
> * for MSR_TSC_AUX, which is the one that is causing problems, everybody
> seems to agree with just using LSL (in the lack specific numbers on
> performance improvements from RDPID).
>
> * for MSR_IA32_TSX_CTRL.RTM_DISABLE, which is the only one that was
> added in the last 10 years, I'm pretty sure there are no plans for using
> the Trusty Sidechannel eXtension in the kernel.  So that should be fine
> too.  (The CPUID_CLEAR bit of the MSR is not shared).
>

Regardless of how anyone feels about who owns what in arch/x86 vs
arch/x86/kvm, the x86 architecture is a mess that comes from Intel and
AMD, and we have to deal with it.  On VMX, when a VM exits, the VM's
value of MSR_TSC_AUX is live, and we can take an NMI, MCE, or
abominable new #SX, #VE, #VC, etc on the next instruction boundary.
And unless we use the atomic MSR switch mechanism, the result is that
we're going through the entry path with guest-controlled MSRs.

So I think we can chalk this one up to obnoxious hardware.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled
  2020-08-22 16:42                 ` Andy Lutomirski
@ 2020-09-16 16:54                   ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2020-09-16 16:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, X86 ML, H. Peter Anvin, LKML, Dave Hansen,
	Chang Seok Bae, Peter Zijlstra, Sasha Levin, kvm list,
	Tom Lendacky

On 22/08/20 18:42, Andy Lutomirski wrote:
> On VMX, when a VM exits, the VM's
> value of MSR_TSC_AUX is live, and we can take an NMI, MCE, or
> abominable new #SX, #VE, #VC, etc on the next instruction boundary.
> And unless we use the atomic MSR switch mechanism, the result is that
> we're going through the entry path with guest-controlled MSRs.

If anything of that is a problem, we can and will use the atomic MSR
switching; it's not worth doing complicated stuff if you're going to pay
the price of rdmsr/wrmsr anyway.

The remaining cases are MSRs that are really meant for usermode (such as
the syscall MSRs) and especially the edge cases of these two MSRs that
the kernel doesn't mind too much about.  But they are really really
rare, I don't expect any new one coming soon and if they are ever needed
(by SGX perhaps?!?) I'll certainly loop you guys in.

Paolo


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-09-16 20:51 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-08-21  2:50 [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled Sean Christopherson
2020-08-21  7:24 ` peterz
2020-08-21  7:44   ` Borislav Petkov
2020-08-21  7:47 ` Borislav Petkov
2020-08-21  8:09   ` Paolo Bonzini
2020-08-21  8:16     ` Borislav Petkov
2020-08-21  9:05       ` Paolo Bonzini
2020-08-21  9:22         ` Borislav Petkov
2020-08-21  9:44           ` Paolo Bonzini
2020-08-21  9:48             ` Borislav Petkov
2020-08-21 10:07               ` Paolo Bonzini
2020-08-22 16:42                 ` Andy Lutomirski
2020-09-16 16:54                   ` Paolo Bonzini
2020-08-21  9:28     ` Thomas Gleixner
2020-08-21  9:37       ` Paolo Bonzini
2020-08-21 19:55       ` hpa
2020-08-21 20:02         ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox