public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/retpoline: Also fill return buffer after idle
@ 2018-01-08 23:51 Andi Kleen
  2018-01-09  0:00 ` David Woodhouse
  2018-01-09  9:37 ` Peter Zijlstra
  0 siblings, 2 replies; 7+ messages in thread
From: Andi Kleen @ 2018-01-08 23:51 UTC (permalink / raw)
  To: dwmw2
  Cc: pjt, linux-kernel, torvalds, gregkh, tim.c.chen, dave.hansen,
	tglx, peterz, luto, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

This is an extension of the earlier patch to fill the return buffer
on context switch. It uses the assembler macros added earlier.

When we go into deeper idle states the return buffer could be cleared
in MWAIT, but then another thread which wakes up earlier might
be poisoning the indirect branch predictor. Then when the return
buffer underflows there might an uncontrolled indirect branch.

To guard against this always fill the return buffer when exiting idle.

Needed on Skylake and some Broadwells.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/entry/entry_32.S    |  8 ++++++++
 arch/x86/entry/entry_64.S    |  8 ++++++++
 arch/x86/include/asm/mwait.h | 11 ++++++++++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 7dee84a3cf83..2687cce8a02e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1092,3 +1092,11 @@ ENTRY(rewind_stack_do_exit)
 	call	do_exit
 1:	jmp 1b
 END(rewind_stack_do_exit)
+
+ENTRY(fill_return_buffer)
+#ifdef CONFIG_RETPOLINE
+	ALTERNATIVE "ret", "", X86_FEATURE_RETPOLINE
+        FILL_RETURN_BUFFER
+#endif
+        ret
+END(fill_return_buffer)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a33033e2bfe0..92fbec1b0eb5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1831,3 +1831,11 @@ ENTRY(rewind_stack_do_exit)
 
 	call	do_exit
 END(rewind_stack_do_exit)
+
+ENTRY(fill_return_buffer)
+#ifdef CONFIG_RETPOLINE
+	ALTERNATIVE "ret", "", X86_FEATURE_RETPOLINE
+	FILL_RETURN_BUFFER
+#endif
+	ret
+END(fill_return_buffer)
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 39a2fb29378a..1d9f9269b5e7 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -87,6 +87,8 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
 		     :: "a" (eax), "c" (ecx));
 }
 
+extern __visible void fill_return_buffer(void);
+
 /*
  * This uses new MONITOR/MWAIT instructions on P4 processors with PNI,
  * which can obviate IPI to trigger checking of need_resched.
@@ -107,8 +109,15 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
-		if (!need_resched())
+		if (!need_resched()) {
 			__mwait(eax, ecx);
+			/*
+			 * idle could have cleared the return buffer,
+			 * so fill it to prevent uncontrolled
+			 * speculation.
+			 */
+			fill_return_buffer();
+		}
 	}
 	current_clr_polling();
 }
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-08 23:51 [PATCH] x86/retpoline: Also fill return buffer after idle Andi Kleen
@ 2018-01-09  0:00 ` David Woodhouse
  2018-01-09  0:24   ` Andi Kleen
  2018-01-09  9:37 ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2018-01-09  0:00 UTC (permalink / raw)
  To: Andi Kleen
  Cc: pjt, linux-kernel, torvalds, gregkh, tim.c.chen, dave.hansen,
	tglx, peterz, luto, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 3188 bytes --]

On Mon, 2018-01-08 at 15:51 -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> This is an extension of the earlier patch to fill the return buffer
> on context switch. It uses the assembler macros added earlier.
> 
> When we go into deeper idle states the return buffer could be cleared
> in MWAIT, but then another thread which wakes up earlier might
> be poisoning the indirect branch predictor. Then when the return
> buffer underflows there might an uncontrolled indirect branch.
> 
> To guard against this always fill the return buffer when exiting idle.
> 
> Needed on Skylake and some Broadwells.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  arch/x86/entry/entry_32.S    |  8 ++++++++
>  arch/x86/entry/entry_64.S    |  8 ++++++++
>  arch/x86/include/asm/mwait.h | 11 ++++++++++-
>  3 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 7dee84a3cf83..2687cce8a02e 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -1092,3 +1092,11 @@ ENTRY(rewind_stack_do_exit)
>  	call	do_exit
>  1:	jmp 1b
>  END(rewind_stack_do_exit)
> +
> +ENTRY(fill_return_buffer)
> +#ifdef CONFIG_RETPOLINE
> +	ALTERNATIVE "ret", "", X86_FEATURE_RETPOLINE
> +        FILL_RETURN_BUFFER
> +#endif
> +        ret
> +END(fill_return_buffer)
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index a33033e2bfe0..92fbec1b0eb5 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1831,3 +1831,11 @@ ENTRY(rewind_stack_do_exit)
>  
>  	call	do_exit
>  END(rewind_stack_do_exit)
> +
> +ENTRY(fill_return_buffer)
> +#ifdef CONFIG_RETPOLINE
> +	ALTERNATIVE "ret", "", X86_FEATURE_RETPOLINE
> +	FILL_RETURN_BUFFER
> +#endif
> +	ret
> +END(fill_return_buffer)
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 39a2fb29378a..1d9f9269b5e7 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -87,6 +87,8 @@ static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
>  		     :: "a" (eax), "c" (ecx));
>  }
>  
> +extern __visible void fill_return_buffer(void);
> +
>  /*
>   * This uses new MONITOR/MWAIT instructions on P4 processors with PNI,
>   * which can obviate IPI to trigger checking of need_resched.
> @@ -107,8 +109,15 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
>  		}
>  
>  		__monitor((void *)¤t_thread_info()->flags, 0, 0);
> -		if (!need_resched())
> +		if (!need_resched()) {
>  			__mwait(eax, ecx);
> +			/*
> +			 * idle could have cleared the return buffer,
> +			 * so fill it to prevent uncontrolled
> +			 * speculation.
> +			 */
> +			fill_return_buffer();
> +		}
>  	}
>  	current_clr_polling();
>  }

Probably doesn't matter right there but it's going to end up being used
elsewhere with IBRS/IBPB, and the compiler is going to think it needs
to save all the call-clobbered registers for that. Do we want to make
it use inline asm instead?

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-09  0:00 ` David Woodhouse
@ 2018-01-09  0:24   ` Andi Kleen
  2018-01-09  0:28     ` David Woodhouse
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2018-01-09  0:24 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Andi Kleen, pjt, linux-kernel, torvalds, gregkh, tim.c.chen,
	dave.hansen, tglx, peterz, luto

> Probably doesn't matter right there but it's going to end up being used
> elsewhere with IBRS/IBPB, and the compiler is going to think it needs
> to save all the call-clobbered registers for that. Do we want to make
> it use inline asm instead?

You mean KVM?

All the other places have lots of other calls too, so the registers are likely 
already clobbered.

-Andi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-09  0:24   ` Andi Kleen
@ 2018-01-09  0:28     ` David Woodhouse
  0 siblings, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2018-01-09  0:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Andi Kleen, pjt, linux-kernel, torvalds, gregkh, tim.c.chen,
	dave.hansen, tglx, peterz, luto

[-- Attachment #1: Type: text/plain, Size: 458 bytes --]

On Mon, 2018-01-08 at 16:24 -0800, Andi Kleen wrote:
> > Probably doesn't matter right there but it's going to end up being used
> > elsewhere with IBRS/IBPB, and the compiler is going to think it needs
> > to save all the call-clobbered registers for that. Do we want to make
> > it use inline asm instead?
> 
> You mean KVM?
> 
> All the other places have lots of other calls too, so the registers are likely 
> already clobbered.

Fair enough.

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-08 23:51 [PATCH] x86/retpoline: Also fill return buffer after idle Andi Kleen
  2018-01-09  0:00 ` David Woodhouse
@ 2018-01-09  9:37 ` Peter Zijlstra
  2018-01-09 13:58   ` David Woodhouse
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2018-01-09  9:37 UTC (permalink / raw)
  To: Andi Kleen
  Cc: dwmw2, pjt, linux-kernel, torvalds, gregkh, tim.c.chen,
	dave.hansen, tglx, luto, Andi Kleen

On Mon, Jan 08, 2018 at 03:51:26PM -0800, Andi Kleen wrote:

> @@ -107,8 +109,15 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
>  		}
>  
>  		__monitor((void *)&current_thread_info()->flags, 0, 0);
> -		if (!need_resched())
> +		if (!need_resched()) {
>  			__mwait(eax, ecx);
> +			/*
> +			 * idle could have cleared the return buffer,
> +			 * so fill it to prevent uncontrolled
> +			 * speculation.
> +			 */
> +			fill_return_buffer();

wouldn't something like:

			if (static_cpu_has(X86_FEATURE_RETPOLINE))
				fill_return_buffer();

be much saner? Then we avoid the entire call when not needed and you
don't have to muck with the asm either.

Also, you forgot mwait_idle() in process.c

> +		}
>  	}
>  	current_clr_polling();
>  }
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-09  9:37 ` Peter Zijlstra
@ 2018-01-09 13:58   ` David Woodhouse
  2018-01-09 14:12     ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2018-01-09 13:58 UTC (permalink / raw)
  To: Peter Zijlstra, Andi Kleen
  Cc: pjt, linux-kernel, torvalds, gregkh, tim.c.chen, dave.hansen,
	tglx, luto, Andi Kleen

[-- Attachment #1: Type: text/plain, Size: 2136 bytes --]

On Tue, 2018-01-09 at 10:37 +0100, Peter Zijlstra wrote:
> On Mon, Jan 08, 2018 at 03:51:26PM -0800, Andi Kleen wrote:
> 
> > 
> > @@ -107,8 +109,15 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
> >  		}
> >  
> >  		__monitor((void *)¤t_thread_info()->flags, 0, 0);
> > -		if (!need_resched())
> > +		if (!need_resched()) {
> >  			__mwait(eax, ecx);
> > +			/*
> > +			 * idle could have cleared the return buffer,
> > +			 * so fill it to prevent uncontrolled
> > +			 * speculation.
> > +			 */
> > +			fill_return_buffer();
> wouldn't something like:
> 
> 			if (static_cpu_has(X86_FEATURE_RETPOLINE))
> 				fill_return_buffer();
> 
> be much saner? Then we avoid the entire call when not needed and you
> don't have to muck with the asm either.

Hm...

The background, of course, that that we need to be careful when doing
things like this. If you end up with a conditional branch there, then
processor can speculate right past it. There's a reason a lot of the
IBRS-setting code has, effectively, an 'else lfence' in the cases where
it isn't being done with ALTERNATIVEs.

We had a *beautiful* case of that in the early IBRS patch set, on the
syscall path, where the conditional branch opened up a path for
speculative execution all the way to the jmp *sys_call_table(…).

Now, as discussed on IRC, we can see that the current implementation of
static_cpu_has using asm goto *is* generally doing the right thing and
turning it into a straight unconditional jump over the
fill_return_buffer() code. Clever GCC, have biscuit.

However, you are suggesting that we turn the static_cpu_has() trick
from a "nice to have" optimisation which is all very well when it pans
out, to something we *rely* on for secure operation of the system.

It never ends well when we rely on all versions of GCC optimising
things precisely how we want.

If you can build in a sanity check to ensure that the build will *fail*
when GCC doesn't do what we want, I suppose we could live with that.
But we don't have such a sanity check at the moment, do we?

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5213 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] x86/retpoline: Also fill return buffer after idle
  2018-01-09 13:58   ` David Woodhouse
@ 2018-01-09 14:12     ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2018-01-09 14:12 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Andi Kleen, pjt, linux-kernel, torvalds, gregkh, tim.c.chen,
	dave.hansen, tglx, luto, Andi Kleen

On Tue, Jan 09, 2018 at 01:58:38PM +0000, David Woodhouse wrote:
> Clever GCC, have biscuit.

Well, we requested this feature exactly because of this. It had better
work.

> However, you are suggesting that we turn the static_cpu_has() trick
> from a "nice to have" optimisation which is all very well when it pans
> out, to something we *rely* on for secure operation of the system.

It must work, we 'rely' on it already. GCC doing something stupid there
is a GCC bug. Any GCC bug is a royal pain, they happen, life goes on.

> It never ends well when we rely on all versions of GCC optimising
> things precisely how we want.
>
> If you can build in a sanity check to ensure that the build will *fail*
> when GCC doesn't do what we want, I suppose we could live with that.
> But we don't have such a sanity check at the moment, do we?

We have STATIC_KEYS_SELFTEST, which might or might not qualify.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-01-09 14:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-08 23:51 [PATCH] x86/retpoline: Also fill return buffer after idle Andi Kleen
2018-01-09  0:00 ` David Woodhouse
2018-01-09  0:24   ` Andi Kleen
2018-01-09  0:28     ` David Woodhouse
2018-01-09  9:37 ` Peter Zijlstra
2018-01-09 13:58   ` David Woodhouse
2018-01-09 14:12     ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox