* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
@ 2018-01-05 12:27 Dr. David Alan Gilbert
2018-01-05 18:00 ` Dave Hansen
0 siblings, 1 reply; 13+ messages in thread
From: Dr. David Alan Gilbert @ 2018-01-05 12:27 UTC (permalink / raw)
To: Dave Hansen
Cc: Peter Zijlstra, Tim Chen, Thomas Gleixner, Linus Torvalds,
Greg KH, Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, LKML,
Andy Lutomirski
> Dave Hansen <dave.hansen@intel.com> wrote:
>> On 01/04/2018 08:51 PM, Andy Lutomirski wrote:
>> > Do we need an arch_prctl() to enable IBRS for user mode?
>>
>> Eventually, once the dust settles. I think there's a spectrum of
>> paranoia here, that is roughly (with increasing paranoia):
>>
>> 1. do nothing
>> 2. do retpoline
>> 3. do IBRS in kernel
>> 4. do IBRS always
>>
>> I think you're asking for ~3.5.
>>
>> Patches for 1-3 are out there and 4 is pretty straightforward. Doing a
>> arch_prctl() is still straightforward, but will be a much more niche
>> thing than any of the other choices. Plus, with a user interface, we
>> have to argue over the ABI for at least a month or two. ;)
I was chatting to Andrea about this, and we came to the conclusion one
use might be for qemu; I was worried about (theoretically) whether
userspace in a guest could read privileged data from the guest kernel by
attacking the qemu process rather than by attacking the kernels.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-05 12:27 [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Dr. David Alan Gilbert
@ 2018-01-05 18:00 ` Dave Hansen
0 siblings, 0 replies; 13+ messages in thread
From: Dave Hansen @ 2018-01-05 18:00 UTC (permalink / raw)
To: 84a6f2f2-d5fe-6b42-0590-33723c1b4960
Cc: Peter Zijlstra, Tim Chen, Thomas Gleixner, Linus Torvalds,
Greg KH, Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, LKML,
Andy Lutomirski
On 01/05/2018 04:27 AM, Dr. David Alan Gilbert wrote:
>>> Patches for 1-3 are out there and 4 is pretty straightforward. Doing a
>>> arch_prctl() is still straightforward, but will be a much more niche
>>> thing than any of the other choices. Plus, with a user interface, we
>>> have to argue over the ABI for at least a month or two. ;)
> I was chatting to Andrea about this, and we came to the conclusion one
> use might be for qemu; I was worried about (theoretically) whether
> userspace in a guest could read privileged data from the guest kernel by
> attacking the qemu process rather than by attacking the kernels.

Theoretically, I believe it's possible. The SMEP-based mitigations are
effective when crossing rings, but do not help with
guest-ring0->host-ring0 or presumably guest-ring3->host-ring3.

For the same-ring things, we have the indirect branch predictor flush
operation MSR (IBPB). Expect those to be posted once we have the IBRS
and retpoline approaches settled.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 0/7] IBRS patch series
@ 2018-01-04 17:56 Tim Chen
2018-01-04 17:56 ` [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Tim Chen
0 siblings, 1 reply; 13+ messages in thread
From: Tim Chen @ 2018-01-04 17:56 UTC (permalink / raw)
To: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Greg KH
Cc: Tim Chen, Dave Hansen, Andrea Arcangeli, Andi Kleen,
Arjan Van De Ven, linux-kernel
This patch series enables basic detection and usage of the x86 indirect
branch speculation features. It enables indirect branch restricted
speculation (IBRS) on kernel entry and disables it on exit.
It also enumerates the indirect branch prediction barrier (IBPB).
The x86 IBRS feature requires corresponding microcode support.
It mitigates the variant 2 vulnerability described in
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
If IBRS is set, near returns and near indirect jumps/calls will not
allow their predicted target address to be controlled by code that
executed in a less privileged prediction mode before the IBRS mode was
last written with a value of 1 or on another logical processor so long
as all RSB entries from the previous less privileged prediction mode
are overwritten.
Setting of IBPB ensures that earlier code's behavior does not control later
indirect branch predictions. It is used when context switching to a new
untrusted address space. Unlike IBRS, IBPB is a command MSR
and does not retain its state.
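The difference between a stateful control MSR and a command MSR can be modeled in a few lines of C. This is a userspace toy model, not kernel code from this series. The MSR indices (0x48 for IA32_SPEC_CTRL, 0x49 for IA32_PRED_CMD) and bit 0 for IBRS/IBPB follow Intel's published definitions; `struct cpu_model`, `model_wrmsr()` and `model_ibrs_enabled()` are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_SPEC_CTRL 0x48u        /* stateful: bit 0 = IBRS */
#define MSR_IA32_PRED_CMD  0x49u        /* command:  bit 0 = IBPB */
#define SPEC_CTRL_IBRS     (1ull << 0)
#define PRED_CMD_IBPB      (1ull << 0)

/* Hypothetical toy model of one logical CPU (illustration only). */
struct cpu_model {
    uint64_t spec_ctrl;     /* retained until the next write */
    unsigned int barriers;  /* count of IBPB commands issued */
};

/* Model a WRMSR to one of the two speculation-control MSRs. */
static void model_wrmsr(struct cpu_model *cpu, uint32_t msr, uint64_t val)
{
    assert(msr == MSR_IA32_SPEC_CTRL || msr == MSR_IA32_PRED_CMD);

    if (msr == MSR_IA32_SPEC_CTRL)
        cpu->spec_ctrl = val;   /* the value sticks */
    else if (val & PRED_CMD_IBPB)
        cpu->barriers++;        /* one-shot barrier, nothing is stored */
}

/* IBRS is "on" for as long as bit 0 of SPEC_CTRL stays set. */
static bool model_ibrs_enabled(const struct cpu_model *cpu)
{
    return (cpu->spec_ctrl & SPEC_CTRL_IBRS) != 0;
}
```

In this model, writing IBRS changes a mode that persists across later code, while writing IBPB only triggers an action at that moment, which is why the series toggles IBRS on entry/exit but issues IBPB at context switch.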
Speculation on Skylake and later requires these patches ("dynamic IBRS")
be used instead of retpoline[1]. If you are very paranoid or you run on
a CPU where IBRS=1 is cheaper, you may also want to run in "IBRS always"
mode.
See: https://docs.google.com/document/d/e/2PACX-1vSMrwkaoSUBAFc6Fjd19F18c1O9pudkfAY-7lGYGOTN8mc9ul-J6pWadcAaBJZcVA7W_3jlLKRtKRbd/pub
A more detailed description of IBRS is given in the first patch.
It is applied on top of the page table isolation changes.
Run-time and boot-time controls for the IBRS feature are provided.
There are 2 ways to control IBRS:
1. At boot time
noibrs kernel boot parameter will disable IBRS usage.
Otherwise, if the above parameter is not specified, the system
will enable IBRS and IBPB usage if the CPU supports it.
2. At run time
echo 0 > /sys/kernel/debug/ibrs_enabled will turn off IBRS
echo 1 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in kernel
echo 2 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in both userspace and kernel (IBRS always)
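As a rough illustration of what the three knob values mean, the following C sketch maps a mode to whether IBRS should be set in a given context and to how many SPEC_CTRL writes a user -> kernel -> user round trip would incur. The values 0/1/2 come from the cover letter; the enum and function names are hypothetical, and the write counts are an assumption derived from the "set on entry, clear on exit" description, not measurements.

```c
#include <assert.h>
#include <stdbool.h>

/* Values written to ibrs_enabled; the identifiers are hypothetical. */
enum ibrs_mode {
    IBRS_OFF    = 0,    /* IBRS never set */
    IBRS_KERNEL = 1,    /* set on kernel entry, cleared on exit */
    IBRS_ALWAYS = 2,    /* left set in both kernel and userspace */
};

/* Should the IBRS bit be set while running in the given context? */
static bool ibrs_wanted(int mode, bool in_kernel)
{
    assert(mode >= IBRS_OFF && mode <= IBRS_ALWAYS);

    if (mode == IBRS_ALWAYS)
        return true;
    return mode == IBRS_KERNEL && in_kernel;
}

/* SPEC_CTRL writes for one user -> kernel -> user round trip. */
static unsigned int spec_ctrl_writes_per_round_trip(int mode)
{
    assert(mode >= IBRS_OFF && mode <= IBRS_ALWAYS);

    /* Only the dynamic mode toggles the bit on entry and on exit. */
    return mode == IBRS_KERNEL ? 2 : 0;
}
```

Under these assumptions only mode 1 pays for MSR writes on the entry/exit path; mode 2 trades that cost for running userspace with IBRS set.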
[1] https://lkml.org/lkml/2018/1/4/174
Tim Chen (7):
x86/feature: Detect the x86 feature to control Speculation
x86/enter: MACROS to set/clear IBRS
x86/enter: Use IBRS on syscall and interrupts
x86/idle: Disable IBRS entering idle and enable it on wakeup
x86: Use IBRS for firmware update path
x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
x86/microcode: Recheck IBRS features on microcode reload
Documentation/admin-guide/kernel-parameters.txt | 4 +
arch/x86/entry/entry_64.S | 24 +++
arch/x86/entry/entry_64_compat.S | 9 +
arch/x86/include/asm/apm.h | 6 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/efi.h | 16 +-
arch/x86/include/asm/msr-index.h | 7 +
arch/x86/include/asm/mwait.h | 19 ++
arch/x86/include/asm/spec_ctrl.h | 253 ++++++++++++++++++++++++
arch/x86/kernel/cpu/Makefile | 1 +
arch/x86/kernel/cpu/microcode/core.c | 6 +
arch/x86/kernel/cpu/scattered.c | 11 ++
arch/x86/kernel/cpu/spec_ctrl.c | 124 ++++++++++++
arch/x86/kernel/process.c | 9 +-
14 files changed, 486 insertions(+), 4 deletions(-)
create mode 100644 arch/x86/include/asm/spec_ctrl.h
create mode 100644 arch/x86/kernel/cpu/spec_ctrl.c
--
2.9.4
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 17:56 [PATCH 0/7] IBRS patch series Tim Chen
@ 2018-01-04 17:56 ` Tim Chen
2018-01-04 20:00 ` Greg KH
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Tim Chen @ 2018-01-04 17:56 UTC (permalink / raw)
To: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Greg KH
Cc: Tim Chen, Dave Hansen, Andrea Arcangeli, Andi Kleen,
Arjan Van De Ven, linux-kernel
Set IBRS upon kernel entrance via syscall and interrupts. Clear it
upon exit.

If NMI runs when exiting kernel between IBRS_DISABLE and
SWAPGS, the NMI would have turned on IBRS bit 0 and then it would have
left enabled when exiting the NMI. IBRS bit 0 would then be left
enabled in userland until the next enter kernel.

That is a minor inefficiency only, but we can eliminate it by saving
the MSR when entering the NMI in save_paranoid and restoring it when
exiting the NMI.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 arch/x86/entry/entry_64.S | 24 ++++++++++++++++++++++++
 arch/x86/entry/entry_64_compat.S | 9 +++++++++
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3f72f5c..0c4d542 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -37,6 +37,7 @@
 #include <asm/pgtable_types.h>
 #include <asm/export.h>
 #include <asm/frame.h>
+#include <asm/spec_ctrl.h>
 #include <linux/err.h>
 #include "calling.h"
@@ -170,6 +171,8 @@ ENTRY(entry_SYSCALL_64_trampoline)
 	/* Load the top of the task stack into RSP */
 	movq CPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 	/* Start building the simulated IRET frame. */
 	pushq $__USER_DS /* pt_regs->ss */
@@ -213,6 +216,8 @@ ENTRY(entry_SYSCALL_64)
 	 * is not required to switch CR3.
 	 */
 	movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 	TRACE_IRQS_OFF
@@ -407,6 +412,7 @@ syscall_return_via_sysret:
 	 * We are on the trampoline stack. All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	DISABLE_IBRS
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 	popq %rdi
@@ -745,6 +751,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 	 * We can do future final exit work right here.
 	 */
+	DISABLE_IBRS
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 	/* Restore RDI. */
@@ -832,6 +839,14 @@ native_irq_return_ldt:
 	SWAPGS /* to kernel GS */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi /* to kernel CR3 */
+	/*
+	 * Normally we enable IBRS when we switch to kernel's CR3.
+	 * But we are going to switch back to user CR3 immediately
+	 * in this routine after fixing ESPFIX stack. There is
+	 * no vulnerable code branching for IBRS to protect.
+	 * We don't toggle IBRS to avoid the cost of two MSR writes.
+	 */
+
 	movq PER_CPU_VAR(espfix_waddr), %rdi
 	movq %rax, (0*8)(%rdi) /* user RAX */
 	movq (1*8)(%rsp), %rax /* user RIP */
@@ -965,6 +980,8 @@ ENTRY(switch_to_thread_stack)
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
 	movq %rsp, %rdi
 	movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	/* Stack is usable, use the non-clobbering IBRS enable: */
+	ENABLE_IBRS
 	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
 	pushq 7*8(%rdi) /* regs->ss */
@@ -1265,6 +1282,7 @@ ENTRY(paranoid_entry)
 1:
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+	ENABLE_IBRS_SAVE_AND_CLOBBER save_reg=%r13d
 	ret
 END(paranoid_entry)
@@ -1288,6 +1306,7 @@ ENTRY(paranoid_exit)
 	testl %ebx, %ebx /* swapgs needed? */
 	jnz .Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
+	RESTORE_IBRS_CLOBBER save_reg=%r13d
 	RESTORE_CR3 scratch_reg=%rbx save_reg=%r14
 	SWAPGS_UNSAFE_STACK
 	jmp .Lparanoid_exit_restore
@@ -1318,6 +1337,7 @@ ENTRY(error_entry)
 	SWAPGS
 	/* We have user CR3. Change to kernel CR3. */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+	ENABLE_IBRS_CLOBBER
 .Lerror_entry_from_usermode_after_swapgs:
 	/* Put us onto the real thread stack. */
@@ -1365,6 +1385,7 @@ ENTRY(error_entry)
 	 */
 	SWAPGS
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+	ENABLE_IBRS_CLOBBER
 	jmp .Lerror_entry_done
 .Lbstep_iret:
@@ -1379,6 +1400,7 @@ ENTRY(error_entry)
 	 */
 	SWAPGS
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+	ENABLE_IBRS
 	/*
 	 * Pretend that the exception came from user mode: set up pt_regs
@@ -1480,6 +1502,7 @@ ENTRY(nmi)
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
 	movq %rsp, %rdx
 	movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	ENABLE_IBRS
 	UNWIND_HINT_IRET_REGS base=%rdx offset=8
 	pushq 5*8(%rdx) /* pt_regs->ss */
 	pushq 4*8(%rdx) /* pt_regs->rsp */
@@ -1730,6 +1753,7 @@ end_repeat_nmi:
 	movq $-1, %rsi
 	call do_nmi
+	RESTORE_IBRS_CLOBBER save_reg=%r13d
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
 	testl %ebx, %ebx /* swapgs needed? */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 40f1700..88ee1c0 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -14,6 +14,7 @@
 #include <asm/irqflags.h>
 #include <asm/asm.h>
 #include <asm/smap.h>
+#include <asm/spec_ctrl.h>
 #include <linux/linkage.h>
 #include <linux/err.h>
@@ -54,6 +55,7 @@ ENTRY(entry_SYSENTER_compat)
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
 	movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	ENABLE_IBRS
 	/*
 	 * User tracing code (ptrace or signal handlers) might assume that
@@ -224,6 +226,7 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
 	 * preserved during the C calls inside TRACE_IRQS_OFF anyway.
 	 */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
+	ENABLE_IBRS_CLOBBER /* clobbers %rax, %rcx, %rdx */
 	/*
 	 * User mode is traced as though IRQs are on, and SYSENTER
@@ -240,6 +243,12 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
 	TRACE_IRQS_ON /* User mode traces as IRQs on. */
+	/*
+	 * Clobber of %rax, %rcx, %rdx is OK before register restoring.
+	 * This is safe to do here because we have no indirect branches
+	 * between here and the return to userspace (sysretl).
+	 */
+	DISABLE_IBRS_CLOBBER
 	movq RBX(%rsp), %rbx /* pt_regs->rbx */
 	movq RBP(%rsp), %rbp /* pt_regs->rbp */
 	movq EFLAGS(%rsp), %r11 /* pt_regs->flags (in r11) */
--
2.9.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 17:56 ` [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Tim Chen
@ 2018-01-04 20:00 ` Greg KH
2018-01-04 20:26 ` Tim Chen
2018-01-04 20:45 ` Dave Hansen
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2018-01-04 20:00 UTC (permalink / raw)
To: Tim Chen
Cc: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Dave Hansen,
Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, linux-kernel
On Thu, Jan 04, 2018 at 09:56:44AM -0800, Tim Chen wrote:
>
> That is a minor inefficiency only, but we can eliminate it by saving
> the MSR when entering the NMI in save_paranoid and restoring it when
> exiting the NMI.

Any hints as to what exactly "minor" means in cycles here? :)

thanks,
greg k-h
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 20:00 ` Greg KH
@ 2018-01-04 20:26 ` Tim Chen
0 siblings, 0 replies; 13+ messages in thread
From: Tim Chen @ 2018-01-04 20:26 UTC (permalink / raw)
To: Greg KH
Cc: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Dave Hansen,
Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, linux-kernel
On 01/04/2018 12:00 PM, Greg KH wrote:
> On Thu, Jan 04, 2018 at 09:56:44AM -0800, Tim Chen wrote:
>>
>> That is a minor inefficiency only, but we can eliminate it by saving
>> the MSR when entering the NMI in save_paranoid and restoring it when
>> exiting the NMI.
>
> Any hints as to what exactly "minor" means in cycles here? :)
>

The current implementation does not have this inefficiency. The comment
is there to explain why we need to save the IBRS state in save_paranoid.

The issue is that if we don't save the IBRS state for NMI, then for
nested interrupts it is hard to figure out, when we are returning from
the NMI, whether we are returning to user space or kernel space. And if
we do the safe thing by leaving IBRS on, there is a possibility that we
may return to user space with IBRS enabled, which will affect
performance. The possibility of hitting this is minor, but still we want
to eliminate it.

Thanks.

Tim
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 17:56 ` [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Tim Chen
2018-01-04 20:00 ` Greg KH
@ 2018-01-04 20:45 ` Dave Hansen
2018-01-04 22:33 ` Peter Zijlstra
2018-01-05 13:35 ` Thomas Gleixner
3 siblings, 0 replies; 13+ messages in thread
From: Dave Hansen @ 2018-01-04 20:45 UTC (permalink / raw)
To: Tim Chen, Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Greg KH
Cc: Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, linux-kernel
On 01/04/2018 09:56 AM, Tim Chen wrote:
> If NMI runs when exiting kernel between IBRS_DISABLE and
> SWAPGS, the NMI would have turned on IBRS bit 0 and then it would have
> left enabled when exiting the NMI. IBRS bit 0 would then be left
> enabled in userland until the next enter kernel.
>
> That is a minor inefficiency only, but we can eliminate it by saving
> the MSR when entering the NMI in save_paranoid and restoring it when
> exiting the NMI.

Can I suggest an alternate description for the NMI case? This is
long-winded, but it should keep me from having to think through it yet
again. :)

"
The normal interrupt code uses the 'error_entry' path which uses the
Code Segment (CS) of the instruction that was interrupted to tell
whether it interrupted the kernel or userspace and thus has to switch
IBRS, or leave it alone.

The NMI code is different. It uses 'paranoid_entry' because it can
interrupt the kernel while it is running with a userspace IBRS (and %GS
and CR3) value, but has a kernel CS. If we used the same approach as
the normal interrupt code, we might do the following;

	SYSENTER_entry
<-------------- NMI HERE
	IBRS=1
		do_something()
	IBRS=0
	SYSRET

The NMI code might notice that we are running in the kernel and decide
that it is OK to skip the IBRS=1. This would leave it running
unprotected with IBRS=0, which is bad.

However, if we unconditionally set IBRS=1, in the NMI, we might get the
following case:

	SYSENTER_entry
	IBRS=1
		do_something()
	IBRS=0
<-------------- NMI HERE (set IBRS=1)
	SYSRET

and we would return to userspace with IBRS=1. Userspace would run
slowly until we entered and exited the kernel again. (This is the case
Tim is alluding to in the patch description).

Instead of those two approaches, we chose a third one where we simply
save the IBRS value in a scratch register (%r13) and then restore that
value, verbatim. This is what PTI does with CR3 and it works
beautifully.
"
^ permalink raw reply [flat|nested] 13+ messages in thread
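The three NMI strategies compared in the message above can be walked through with a small userspace simulation in C. This is a sketch of the reasoning only, not the kernel's entry code: the enum, struct and function names are hypothetical, and a single boolean stands in for bit 0 of the SPEC_CTRL MSR. The interesting input is an NMI that arrives in kernel code (kernel CS) while IBRS is still 0, which covers both diagrams.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three approaches discussed above. */
enum nmi_strategy {
    NMI_TRUST_KERNEL_CS,    /* kernel CS seen: assume IBRS is already set */
    NMI_ALWAYS_SET,         /* set IBRS=1 on entry, never undo it */
    NMI_SAVE_RESTORE,       /* save old value, set IBRS=1, restore on exit */
};

struct nmi_outcome {
    bool ibrs_in_handler;   /* is the NMI handler protected? */
    bool ibrs_after_return; /* value left behind for the interrupted code */
};

/* Simulate one NMI that hits while IBRS currently equals ibrs_at_nmi. */
static struct nmi_outcome simulate_nmi(enum nmi_strategy s, bool ibrs_at_nmi)
{
    struct nmi_outcome out;
    bool ibrs = ibrs_at_nmi;
    bool saved = ibrs_at_nmi;   /* only consulted by NMI_SAVE_RESTORE */

    assert((int)s >= NMI_TRUST_KERNEL_CS && (int)s <= NMI_SAVE_RESTORE);

    if (s != NMI_TRUST_KERNEL_CS)
        ibrs = true;            /* entry: turn protection on */
    out.ibrs_in_handler = ibrs;

    if (s == NMI_SAVE_RESTORE)
        ibrs = saved;           /* exit: put back exactly what was there */
    out.ibrs_after_return = ibrs;

    return out;
}
```

With `ibrs_at_nmi == false`, the first strategy runs the handler unprotected, the second leaves IBRS set on the way back to userspace, and only save/restore is both protected in the handler and transparent to the interrupted code.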
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 17:56 ` [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Tim Chen
2018-01-04 20:00 ` Greg KH
2018-01-04 20:45 ` Dave Hansen
@ 2018-01-04 22:33 ` Peter Zijlstra
2018-01-04 23:12 ` Andrea Arcangeli
2018-01-05 0:08 ` Dave Hansen
2018-01-05 13:35 ` Thomas Gleixner
3 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2018-01-04 22:33 UTC (permalink / raw)
To: Tim Chen
Cc: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Greg KH,
Dave Hansen, Andrea Arcangeli, Andi Kleen, Arjan Van De Ven,
linux-kernel
On Thu, Jan 04, 2018 at 09:56:44AM -0800, Tim Chen wrote:
> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
> upon exit.

So not only did we add a CR3 write, we're now adding an MSR write to the
entry/exit paths. Please tell me that these are 'fast' MSRs? Given
people are already reporting stupid numbers with just the existing
PTI/CR3, what kind of pain are we going to get from adding this?

> If NMI runs when exiting kernel between IBRS_DISABLE and
> SWAPGS, the NMI would have turned on IBRS bit 0 and then it would have
> left enabled when exiting the NMI. IBRS bit 0 would then be left
> enabled in userland until the next enter kernel.
>
> That is a minor inefficiency only, but we can eliminate it by saving
> the MSR when entering the NMI in save_paranoid and restoring it when
> exiting the NMI.
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>

Invalid SoB chain, either you lost a From: Andrea or you need something
else.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 22:33 ` Peter Zijlstra
@ 2018-01-04 23:12 ` Andrea Arcangeli
2018-01-05 0:08 ` Dave Hansen
1 sibling, 0 replies; 13+ messages in thread
From: Andrea Arcangeli @ 2018-01-04 23:12 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Tim Chen, Thomas Gleixner, Andy Lutomirski, Linus Torvalds,
Greg KH, Dave Hansen, Andi Kleen, Arjan Van De Ven, linux-kernel
On Thu, Jan 04, 2018 at 11:33:21PM +0100, Peter Zijlstra wrote:
> So not only did we add a CR3 write, we're now adding an MSR write to the
> entry/exit paths. Please tell me that these are 'fast' MSRs? Given
> people are already reporting stupid numbers with just the existing
> PTI/CR3, what kind of pain are we going to get from adding this?

On SkyLake it costs roughly the same as cr3 write with bit 63 set, but
SkyLake then runs faster with IBRS enabled too. On earlier CPUs
enabling IBRS slows down CPU quite a bit, so the primary concern is
for older CPUs and the MSR write is the last worry there.

ibrs 2 will set IBRS all the time (only guest mode will alter it and
it'll always be restored to IBRS set during vmexit) so there will be
no cost on kernel enter/exit (also no cost in vmenter vmexit if guest
leaves it always set).

Future silicon will like to run in ibrs 2 mode always, but current one
runs faster at ibrs 1 despite the MSR write for most workloads (kernel
builds etc..).

Thanks,
Andrea
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 22:33 ` Peter Zijlstra
2018-01-04 23:12 ` Andrea Arcangeli
@ 2018-01-05 0:08 ` Dave Hansen
2018-01-05 4:51 ` Andy Lutomirski
1 sibling, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2018-01-05 0:08 UTC (permalink / raw)
To: Peter Zijlstra, Tim Chen
Cc: Thomas Gleixner, Andy Lutomirski, Linus Torvalds, Greg KH,
Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, linux-kernel
On 01/04/2018 02:33 PM, Peter Zijlstra wrote:
> On Thu, Jan 04, 2018 at 09:56:44AM -0800, Tim Chen wrote:
>> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
>> upon exit.
>
> So not only did we add a CR3 write, we're now adding an MSR write to the
> entry/exit paths. Please tell me that these are 'fast' MSRs? Given
> people are already reporting stupid numbers with just the existing
> PTI/CR3, what kind of pain are we going to get from adding this?

This "dynamic IBRS" that does runtime switching will not be on by
default and will be patched around by alternatives unless someone
explicitly opts in.

If you decide you want the additional protection that it provides, you
can take the performance hit. How much is that? We've been saying that
these new MSRs are roughly as expensive as the CR3 writes. How
expensive are those? Don't take my word for it, a few folks were
talking about it today:

Google says[1]: "We see negligible impact on performance."
Amazon says[2]: "We don’t expect meaningful performance impact."

I chopped a few qualifiers out of there, but I think that roughly
captures the sentiment.

1. https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html
2. http://www.businessinsider.com/google-amazon-performance-hit-meltdown-spectre-fixes-overblown-2018-1
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-05 0:08 ` Dave Hansen
@ 2018-01-05 4:51 ` Andy Lutomirski
2018-01-05 5:11 ` Dave Hansen
0 siblings, 1 reply; 13+ messages in thread
From: Andy Lutomirski @ 2018-01-05 4:51 UTC (permalink / raw)
To: Dave Hansen
Cc: Peter Zijlstra, Tim Chen, Thomas Gleixner, Andy Lutomirski,
Linus Torvalds, Greg KH, Andrea Arcangeli, Andi Kleen,
Arjan Van De Ven, LKML
On Thu, Jan 4, 2018 at 4:08 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 01/04/2018 02:33 PM, Peter Zijlstra wrote:
>> On Thu, Jan 04, 2018 at 09:56:44AM -0800, Tim Chen wrote:
>>> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
>>> upon exit.
>>
>> So not only did we add a CR3 write, we're now adding an MSR write to the
>> entry/exit paths. Please tell me that these are 'fast' MSRs? Given
>> people are already reporting stupid numbers with just the existing
>> PTI/CR3, what kind of pain are we going to get from adding this?
>
> This "dynamic IBRS" that does runtime switching will not be on by
> default and will be patched around by alternatives unless someone
> explicitly opts in.
>
> If you decide you want the additional protection that it provides, you
> can take the performance hit. How much is that? We've been saying that
> these new MSRs are roughly as expensive as the CR3 writes. How
> expensive are those? Don't take my word for it, a few folks were
> talking about it today:
>
> Google says[1]: "We see negligible impact on performance."
> Amazon says[2]: "We don’t expect meaningful performance impact."
>
> I chopped a few qualifiers out of there, but I think that roughly
> captures the sentiment.
>
> 1.
> https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html
> 2.
> http://www.businessinsider.com/google-amazon-performance-hit-meltdown-spectre-fixes-overblown-2018-1

Do we need an arch_prctl() to enable IBRS for user mode?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-05 4:51 ` Andy Lutomirski
@ 2018-01-05 5:11 ` Dave Hansen
2018-01-05 12:01 ` Alan Cox
0 siblings, 1 reply; 13+ messages in thread
From: Dave Hansen @ 2018-01-05 5:11 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Peter Zijlstra, Tim Chen, Thomas Gleixner, Linus Torvalds,
Greg KH, Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, LKML
On 01/04/2018 08:51 PM, Andy Lutomirski wrote:
> Do we need an arch_prctl() to enable IBRS for user mode?

Eventually, once the dust settles. I think there's a spectrum of
paranoia here, that is roughly (with increasing paranoia):

1. do nothing
2. do retpoline
3. do IBRS in kernel
4. do IBRS always

I think you're asking for ~3.5.

Patches for 1-3 are out there and 4 is pretty straightforward. Doing a
arch_prctl() is still straightforward, but will be a much more niche
thing than any of the other choices. Plus, with a user interface, we
have to argue over the ABI for at least a month or two. ;)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-05 5:11 ` Dave Hansen
@ 2018-01-05 12:01 ` Alan Cox
0 siblings, 0 replies; 13+ messages in thread
From: Alan Cox @ 2018-01-05 12:01 UTC (permalink / raw)
To: Dave Hansen
Cc: Andy Lutomirski, Peter Zijlstra, Tim Chen, Thomas Gleixner,
Linus Torvalds, Greg KH, Andrea Arcangeli, Andi Kleen,
Arjan Van De Ven, LKML
On Thu, 4 Jan 2018 21:11:23 -0800 Dave Hansen <dave.hansen@intel.com> wrote:
> On 01/04/2018 08:51 PM, Andy Lutomirski wrote:
> > Do we need an arch_prctl() to enable IBRS for user mode?
>
> Eventually, once the dust settles. I think there's a spectrum of
> paranoia here, that is roughly (with increasing paranoia):
>
> 1. do nothing
> 2. do retpoline
> 3. do IBRS in kernel
> 4. do IBRS always
>
> I think you're asking for ~3.5.

And we'll actually end up with cgroups needing to handle this and a
prctl because the answer is simply not a systemwide single constant.

To start with if my code has CAP_SYS_RAWIO who gives a **** about IBRS
protecting it. Likewise on many real world systems I trust my base OS
(or I might as well turn off the power) I sort of trust my apps, and I
deeply distrust my web browser which itself probably wants to turn some
of the protections on for crap like javascript and webassembly.

If I'm running containers well my desktop is probably #2 and my
container #3 or #4

There's no point getting hung up about a single magic default number,
because that's not how it's going to end up.

Alan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts
2018-01-04 17:56 ` [PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts Tim Chen
` (2 preceding siblings ...)
2018-01-04 22:33 ` Peter Zijlstra
@ 2018-01-05 13:35 ` Thomas Gleixner
3 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2018-01-05 13:35 UTC (permalink / raw)
To: Tim Chen
Cc: Andy Lutomirski, Linus Torvalds, Greg KH, Dave Hansen,
Andrea Arcangeli, Andi Kleen, Arjan Van De Ven, linux-kernel
On Thu, 4 Jan 2018, Tim Chen wrote:
> Set IBRS upon kernel entrance via syscall and interrupts. Clear it
> upon exit.

I have no idea on which kernel this is supposed to apply. It fails on
Linus tree and on tip x86/pti. Can you please finally ditch the broken
and outdated version of PTI on which this is based?

I really wonder how this stuff is developed and tested when it's missing
essential fixes.

Please post patches against tip x86/pti where the latest fixes and
patches concerning this mess are staged for both Linus and 4.14.stable

Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread