* [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) [not found] ` <20080417165944.GB25198@Krystal> @ 2008-04-17 20:14 ` Mathieu Desnoyers 2008-04-17 20:29 ` Andrew Morton ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-17 20:14 UTC (permalink / raw) To: mingo Cc: akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

(hopefully finally CCing LKML) :)

Implements an alternative iret with popf and return so trap and exception handlers can return to the NMI handler without issuing iret. iret would cause NMIs to be reenabled prematurely. x86_32 uses popf and a far return. x86_64 has to copy the return instruction pointer to the top of the previous stack, issue a popf, load the previous esp and issue a near return (ret).

It allows placing immediate values (and therefore optimized trace_marks) in NMI code, since returning from a breakpoint would be valid. Accessing vmalloc'd memory, which allows executing module code or accessing vmapped or vmalloc'd areas from NMI context, would also be valid. This is very useful to tracers like LTTng.

This patch makes all faults, traps and exceptions safe to be called from NMI context, *except* single-stepping, which requires iret to restore the TF (trap flag) and jump to the return address in a single instruction. Sorry, no kprobes support in NMI handlers because of this limitation. We cannot single-step an NMI handler, because iret must set the TF flag and return back to the instruction to single-step in a single instruction. This cannot be emulated with popf/lret, because lret would itself be single-stepped. It does not apply to immediate values because they do not use single-stepping. This code detects if the TF flag is set and uses the iret path for single-stepping, even if it reactivates NMIs prematurely.

alpha and avr32 use active count bit 30 (0x40000000) for PREEMPT_ACTIVE, which this patch reserves for HARDNMI_MASK. This patch moves them to bit 28 (0x10000000).
TODO : test alpha and avr32 active count modification
TODO : add paravirt support for the iret alternative. Currently, paravirt kernels running on bare metal still use iret in traps nested over NMI handlers.

tested on x86_32 (tests implemented in a separate patch) :
- instrumented the return path to export the EIP, CS and EFLAGS values when taken, so we know the return path code has been executed.
- trace_mark, using immediate values, with 10ms delay with the breakpoint activated. Runs well through the return path.
- tested vmalloc faults in the NMI handler by placing a non-optimized marker in the NMI handler (so no breakpoint is executed) and connecting a probe which touches every page of a 20MB vmalloc'd buffer. It executes through the return path without problem.
- Tested with and without preemption

tested on x86_64 :
- instrumented the return path to export the EIP, CS and EFLAGS values when taken, so we know the return path code has been executed.
- trace_mark, using immediate values, with 10ms delay with the breakpoint activated. Runs well through the return path.

To test on x86_64 :
- Test without preemption
- Test vmalloc faults
- Test on Intel 64 bits CPUs.

Changelog since v1 :
- x86_64 fixes.
Changelog since v2 :
- fix paravirt build
Changelog since v3 :
- Include modifications suggested by Jeremy
Changelog since v4 :
- including hardirq.h in entry_32/64.S is a bad idea (non ifndef'd C code), define HARDNMI_MASK in the .S files directly.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: akpm@osdl.org
CC: mingo@elte.hu
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Jeremy Fitzhardinge <jeremy@goop.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: "Frank Ch.
Eigler" <fche@redhat.com> --- arch/x86/kernel/entry_32.S | 27 ++++++++++++++++++- arch/x86/kernel/entry_64.S | 26 ++++++++++++++++++ include/asm-alpha/thread_info.h | 2 - include/asm-avr32/thread_info.h | 2 - include/asm-x86/irqflags.h | 55 ++++++++++++++++++++++++++++++++++++++++ include/asm-x86/paravirt.h | 2 + include/linux/hardirq.h | 24 ++++++++++++++++- 7 files changed, 133 insertions(+), 5 deletions(-) Index: linux-2.6-lttng/include/linux/hardirq.h =================================================================== --- linux-2.6-lttng.orig/include/linux/hardirq.h 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/include/linux/hardirq.h 2008-04-16 11:29:30.000000000 -0400 @@ -22,10 +22,13 @@ * PREEMPT_MASK: 0x000000ff * SOFTIRQ_MASK: 0x0000ff00 * HARDIRQ_MASK: 0x0fff0000 + * HARDNMI_MASK: 0x40000000 */ #define PREEMPT_BITS 8 #define SOFTIRQ_BITS 8 +#define HARDNMI_BITS 1 + #ifndef HARDIRQ_BITS #define HARDIRQ_BITS 12 @@ -45,16 +48,19 @@ #define PREEMPT_SHIFT 0 #define SOFTIRQ_SHIFT (PREEMPT_SHIFT + PREEMPT_BITS) #define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS) +#define HARDNMI_SHIFT (30) #define __IRQ_MASK(x) ((1UL << (x))-1) #define PREEMPT_MASK (__IRQ_MASK(PREEMPT_BITS) << PREEMPT_SHIFT) #define SOFTIRQ_MASK (__IRQ_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT) #define HARDIRQ_MASK (__IRQ_MASK(HARDIRQ_BITS) << HARDIRQ_SHIFT) +#define HARDNMI_MASK (__IRQ_MASK(HARDNMI_BITS) << HARDNMI_SHIFT) #define PREEMPT_OFFSET (1UL << PREEMPT_SHIFT) #define SOFTIRQ_OFFSET (1UL << SOFTIRQ_SHIFT) #define HARDIRQ_OFFSET (1UL << HARDIRQ_SHIFT) +#define HARDNMI_OFFSET (1UL << HARDNMI_SHIFT) #if PREEMPT_ACTIVE < (1 << (HARDIRQ_SHIFT + HARDIRQ_BITS)) #error PREEMPT_ACTIVE is too low! 
@@ -63,6 +69,7 @@ #define hardirq_count() (preempt_count() & HARDIRQ_MASK) #define softirq_count() (preempt_count() & SOFTIRQ_MASK) #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK)) +#define hardnmi_count() (preempt_count() & HARDNMI_MASK) /* * Are we doing bottom half or hardware interrupt processing? @@ -71,6 +78,7 @@ #define in_irq() (hardirq_count()) #define in_softirq() (softirq_count()) #define in_interrupt() (irq_count()) +#define in_nmi() (hardnmi_count()) /* * Are we running in atomic context? WARNING: this macro cannot @@ -159,7 +167,19 @@ extern void irq_enter(void); */ extern void irq_exit(void); -#define nmi_enter() do { lockdep_off(); __irq_enter(); } while (0) -#define nmi_exit() do { __irq_exit(); lockdep_on(); } while (0) +#define nmi_enter() \ + do { \ + lockdep_off(); \ + BUG_ON(hardnmi_count()); \ + add_preempt_count(HARDNMI_OFFSET); \ + __irq_enter(); \ + } while (0) + +#define nmi_exit() \ + do { \ + __irq_exit(); \ + sub_preempt_count(HARDNMI_OFFSET); \ + lockdep_on(); \ + } while (0) #endif /* LINUX_HARDIRQ_H */ Index: linux-2.6-lttng/arch/x86/kernel/entry_32.S =================================================================== --- linux-2.6-lttng.orig/arch/x86/kernel/entry_32.S 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/arch/x86/kernel/entry_32.S 2008-04-17 12:55:10.000000000 -0400 @@ -75,11 +75,12 @@ DF_MASK = 0x00000400 NT_MASK = 0x00004000 VM_MASK = 0x00020000 +#define HARDNMI_MASK 0x40000000 + #ifdef CONFIG_PREEMPT #define preempt_stop(clobbers) DISABLE_INTERRUPTS(clobbers); TRACE_IRQS_OFF #else #define preempt_stop(clobbers) -#define resume_kernel restore_nocheck #endif .macro TRACE_IRQS_IRET @@ -265,6 +266,8 @@ END(ret_from_exception) #ifdef CONFIG_PREEMPT ENTRY(resume_kernel) DISABLE_INTERRUPTS(CLBR_ANY) + testl $HARDNMI_MASK,TI_preempt_count(%ebp) # nested over NMI ? + jnz return_to_nmi cmpl $0,TI_preempt_count(%ebp) # non-zero preempt_count ? 
jnz restore_nocheck need_resched: @@ -276,6 +279,12 @@ need_resched: call preempt_schedule_irq jmp need_resched END(resume_kernel) +#else +ENTRY(resume_kernel) + testl $HARDNMI_MASK,TI_preempt_count(%ebp) # nested over NMI ? + jnz return_to_nmi + jmp restore_nocheck +END(resume_kernel) #endif CFI_ENDPROC @@ -411,6 +420,22 @@ restore_nocheck_notrace: CFI_ADJUST_CFA_OFFSET -4 irq_return: INTERRUPT_RETURN +return_to_nmi: + testl $X86_EFLAGS_TF, PT_EFLAGS(%esp) + jnz restore_nocheck /* + * If single-stepping an NMI handler, + * use the normal iret path instead of + * the popf/lret because lret would be + * single-stepped. It should not + * happen : it will reactivate NMIs + * prematurely. + */ + TRACE_IRQS_IRET + RESTORE_REGS + addl $4, %esp # skip orig_eax/error_code + CFI_ADJUST_CFA_OFFSET -4 + INTERRUPT_RETURN_NMI_SAFE + .section .fixup,"ax" iret_exc: pushl $0 # no error code Index: linux-2.6-lttng/arch/x86/kernel/entry_64.S =================================================================== --- linux-2.6-lttng.orig/arch/x86/kernel/entry_64.S 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/arch/x86/kernel/entry_64.S 2008-04-17 12:53:54.000000000 -0400 @@ -54,6 +54,8 @@ .code64 +#define HARDNMI_MASK 0x40000000 + #ifndef CONFIG_PREEMPT #define retint_kernel retint_restore_args #endif @@ -581,12 +583,27 @@ retint_restore_args: /* return to kernel * The iretq could re-enable interrupts: */ TRACE_IRQS_IRETQ + testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx) + jnz return_to_nmi /* Nested over NMI ? */ restore_args: RESTORE_ARGS 0,8,0 irq_return: INTERRUPT_RETURN +return_to_nmi: /* + * If single-stepping an NMI handler, + * use the normal iret path instead of + * the popf/lret because lret would be + * single-stepped. It should not + * happen : it will reactivate NMIs + * prematurely. + */ + testw $X86_EFLAGS_TF,EFLAGS-ARGOFFSET(%rsp) /* trap flag? 
*/ + jnz restore_args + RESTORE_ARGS 0,8,0 + INTERRUPT_RETURN_NMI_SAFE + .section __ex_table, "a" .quad irq_return, bad_iret .previous @@ -802,6 +819,10 @@ END(spurious_interrupt) .macro paranoidexit trace=1 /* ebx: no swapgs flag */ paranoid_exit\trace: + GET_THREAD_INFO(%rcx) + testl $HARDNMI_MASK,threadinfo_preempt_count(%rcx) + jnz paranoid_return_to_nmi\trace /* Nested over NMI ? */ +paranoid_exit_no_nmi\trace: testl %ebx,%ebx /* swapgs needed? */ jnz paranoid_restore\trace testl $3,CS(%rsp) @@ -814,6 +835,11 @@ paranoid_swapgs\trace: paranoid_restore\trace: RESTORE_ALL 8 jmp irq_return +paranoid_return_to_nmi\trace: + testw $X86_EFLAGS_TF,EFLAGS-0(%rsp) /* trap flag? */ + jnz paranoid_exit_no_nmi\trace + RESTORE_ALL 8 + INTERRUPT_RETURN_NMI_SAFE paranoid_userspace\trace: GET_THREAD_INFO(%rcx) movl threadinfo_flags(%rcx),%ebx Index: linux-2.6-lttng/include/asm-x86/irqflags.h =================================================================== --- linux-2.6-lttng.orig/include/asm-x86/irqflags.h 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/include/asm-x86/irqflags.h 2008-04-17 12:28:23.000000000 -0400 @@ -138,12 +138,67 @@ static inline unsigned long __raw_local_ #ifdef CONFIG_X86_64 #define INTERRUPT_RETURN iretq + +/* + * Only returns from a trap or exception to a NMI context (intra-privilege + * level near return) to the same SS and CS segments. Should be used + * upon trap or exception return when nested over a NMI context so no iret is + * issued. It takes care of modifying the eflags, rsp and returning to the + * previous function. 
+ * + * The stack, at that point, looks like : + * + * 0(rsp) RIP + * 8(rsp) CS + * 16(rsp) EFLAGS + * 24(rsp) RSP + * 32(rsp) SS + * + * Upon execution : + * Copy EIP to the top of the return stack + * Update top of return stack address + * Pop eflags into the eflags register + * Make the return stack current + * Near return (popping the return address from the return stack) + */ +#define INTERRUPT_RETURN_NMI_SAFE pushq %rax; \ + mov %rsp, %rax; \ + mov 24+8(%rax), %rsp; \ + pushq 0+8(%rax); \ + pushq 16+8(%rax); \ + movq (%rax), %rax; \ + popfq; \ + ret; + #define ENABLE_INTERRUPTS_SYSCALL_RET \ movq %gs:pda_oldrsp, %rsp; \ swapgs; \ sysretq; #else #define INTERRUPT_RETURN iret + +/* + * Protected mode only, no V8086. Implies that protected mode must + * be entered before NMIs or MCEs are enabled. Only returns from a trap or + * exception to a NMI context (intra-privilege level far return). Should be used + * upon trap or exception return when nested over a NMI context so no iret is + * issued. + * + * The stack, at that point, looks like : + * + * 0(esp) EIP + * 4(esp) CS + * 8(esp) EFLAGS + * + * Upon execution : + * Copy the stack eflags to top of stack + * Pop eflags into the eflags register + * Far return: pop EIP and CS into their register, and additionally pop EFLAGS. 
+ */ +#define INTERRUPT_RETURN_NMI_SAFE pushl 8(%esp); \ + popfl; \ + lret $4; + #define ENABLE_INTERRUPTS_SYSCALL_RET sti; sysexit #define GET_CR0_INTO_EAX movl %cr0, %eax #endif Index: linux-2.6-lttng/include/asm-alpha/thread_info.h =================================================================== --- linux-2.6-lttng.orig/include/asm-alpha/thread_info.h 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/include/asm-alpha/thread_info.h 2008-04-17 12:53:55.000000000 -0400 @@ -57,7 +57,7 @@ register struct thread_info *__current_t #endif /* __ASSEMBLY__ */ -#define PREEMPT_ACTIVE 0x40000000 +#define PREEMPT_ACTIVE 0x10000000 /* * Thread information flags: Index: linux-2.6-lttng/include/asm-avr32/thread_info.h =================================================================== --- linux-2.6-lttng.orig/include/asm-avr32/thread_info.h 2008-04-16 11:25:18.000000000 -0400 +++ linux-2.6-lttng/include/asm-avr32/thread_info.h 2008-04-17 12:53:55.000000000 -0400 @@ -70,7 +70,7 @@ static inline struct thread_info *curren #endif /* !__ASSEMBLY__ */ -#define PREEMPT_ACTIVE 0x40000000 +#define PREEMPT_ACTIVE 0x10000000 /* * Thread information flags Index: linux-2.6-lttng/include/asm-x86/paravirt.h =================================================================== --- linux-2.6-lttng.orig/include/asm-x86/paravirt.h 2008-04-16 12:23:44.000000000 -0400 +++ linux-2.6-lttng/include/asm-x86/paravirt.h 2008-04-16 12:24:36.000000000 -0400 @@ -1358,6 +1358,8 @@ static inline unsigned long __raw_local_ PARA_SITE(PARA_PATCH(pv_cpu_ops, PV_CPU_iret), CLBR_NONE, \ jmp *%cs:pv_cpu_ops+PV_CPU_iret) +#define INTERRUPT_RETURN_NMI_SAFE INTERRUPT_RETURN + #define DISABLE_INTERRUPTS(clobbers) \ PARA_SITE(PARA_PATCH(pv_irq_ops, PV_IRQ_irq_disable), clobbers, \ PV_SAVE_REGS; \ -- Mathieu Desnoyers Computer Engineering Ph.D. 
Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
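The x86_64 INTERRUPT_RETURN_NMI_SAFE sequence in the patch is compact but subtle. As a way to reason about it, here is a hedged C simulation of the stack manipulation; names such as `nmi_safe_return` and `trap_frame` are invented for illustration, the real code being the assembly macro in irqflags.h above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the x86_64 trap frame consumed by the macro:
 * 0(rsp) RIP, 8(rsp) CS, 16(rsp) EFLAGS, 24(rsp) RSP, 32(rsp) SS. */
struct trap_frame {
	uint64_t rip;
	uint64_t cs;
	uint64_t eflags;
	uint64_t *rsp;	/* saved stack pointer of the interrupted (NMI) context */
	uint64_t ss;
};

/* Simulated previous (NMI) stack; pushes grow downward as on x86. */
static uint64_t stack[16];

/* Mirrors the assembly step by step: switch to the previous stack,
 * push the return RIP, push the saved EFLAGS, then popfq and ret. */
static void nmi_safe_return(const struct trap_frame *tf, uint64_t **sp,
			    uint64_t *flags, uint64_t *pc)
{
	uint64_t *prev = tf->rsp;	/* mov 24+8(%rax), %rsp */
	*--prev = tf->rip;		/* pushq 0+8(%rax)      */
	*--prev = tf->eflags;		/* pushq 16+8(%rax)     */
	*flags = *prev++;		/* popfq: EFLAGS restored without iret */
	*pc = *prev++;			/* ret: jump back to the interrupted RIP */
	*sp = prev;			/* previous stack is current again */
}
```

The key property is visible in the simulation: EFLAGS and RIP are restored without ever executing iret, so NMIs are not re-enabled before the nested NMI handler has finished.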
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 20:14 ` [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) Mathieu Desnoyers @ 2008-04-17 20:29 ` Andrew Morton 2008-04-17 21:16 ` Mathieu Desnoyers 2008-04-17 22:01 ` Andi Kleen 2008-04-21 14:00 ` Pavel Machek 2 siblings, 1 reply; 23+ messages in thread From: Andrew Morton @ 2008-04-17 20:29 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: mingo, hpa, jeremy, rostedt, fche, linux-kernel On Thu, 17 Apr 2008 16:14:10 -0400 Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote: > +#define nmi_enter() \ > + do { \ > + lockdep_off(); \ > + BUG_ON(hardnmi_count()); \ > + add_preempt_count(HARDNMI_OFFSET); \ > + __irq_enter(); \ > + } while (0) <did it _have_ to be a macro?> Doing BUG() inside an NMI should be OK most of the time. But the BUG-handling code does want to know if we're in interrupt context - at least for the "fatal exception in interrupt" stuff, and probably other things. But afacit the failure to include HARDNMI_MASK in #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK)) will prevent that. So. Should we or should we not make in_interrupt() return true in NMI? "should", I expect. If not, we'd need to do something else to communicate the current processing state down to the BUG-handling code. ^ permalink raw reply [flat|nested] 23+ messages in thread
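Andrew's observation can be checked numerically. The following is a standalone sketch (not kernel code) with the mask constants copied from the posted hardirq.h; `irq_count()` is reproduced exactly as posted, without HARDNMI_MASK:

```c
#include <assert.h>

/* Constants copied from the posted hardirq.h changes. */
#define SOFTIRQ_MASK   0x0000ff00UL
#define HARDIRQ_MASK   0x0fff0000UL
#define HARDNMI_MASK   0x40000000UL
#define HARDIRQ_OFFSET 0x00010000UL

/* irq_count() exactly as posted: HARDNMI_MASK is not included. */
static unsigned long irq_count(unsigned long preempt_count)
{
	return preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK);
}

/* hardnmi_count() as added by the patch. */
static unsigned long hardnmi_count(unsigned long preempt_count)
{
	return preempt_count & HARDNMI_MASK;
}
```

With these helpers one can see both sides of the discussion: a preempt_count as left by nmi_enter() (HARDNMI bit plus the hardirq increment from __irq_enter()) still reads as "in interrupt" through the hardirq bits, while the bare HARDNMI bit on its own is invisible to irq_count().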
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 20:29 ` Andrew Morton @ 2008-04-17 21:16 ` Mathieu Desnoyers 2008-04-17 21:26 ` Andrew Morton 0 siblings, 1 reply; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-17 21:16 UTC (permalink / raw) To: Andrew Morton; +Cc: mingo, hpa, jeremy, rostedt, fche, linux-kernel * Andrew Morton (akpm@linux-foundation.org) wrote: > On Thu, 17 Apr 2008 16:14:10 -0400 > Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote: > > > +#define nmi_enter() \ > > + do { \ > > + lockdep_off(); \ > > + BUG_ON(hardnmi_count()); \ > > + add_preempt_count(HARDNMI_OFFSET); \ > > + __irq_enter(); \ > > + } while (0) > > <did it _have_ to be a macro?> > isn't this real macro art work ? ;) I kept the same coding style that was already there, which mimics the irq_enter/irq_exit macros. Changing all of them at once could be done in a separate patch. > Doing BUG() inside an NMI should be OK most of the time. But the > BUG-handling code does want to know if we're in interrupt context - at > least for the "fatal exception in interrupt" stuff, and probably other > things. > > But afacit the failure to include HARDNMI_MASK in > > #define irq_count() (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK)) > > will prevent that. > > So. > > Should we or should we not make in_interrupt() return true in NMI? > "should", I expect. > > If not, we'd need to do something else to communicate the current > processing state down to the BUG-handling code. > You bring an interesting question. In practice, since this BUG_ON could only happen if we have an NMI nested over another NMI or an nmi which fails to decrement its HARDNMI_MASK. Given that the HARDIRQ_MASK is incremented right after the HARDNMI_MASK increment (the reverse is also true), really bad things (TM) must have happened for the BUG_ON to be triggered outside of the __irq_enter()/__irq_exit() scope of the NMI below the buggy one. 
But since this code is there to extract as much information as possible when things go wrong, I would say it's safer to, at least, add HARDNMI_MASK to irq_count(). Instead, though, I think we could add : if (in_nmi()) panic("Fatal exception in non-maskable interrupt"); to die(). That would be clearer. I just added it to x86_32, but can't find where x86_64 reports the "fatal exception in interrupt" and friends message. Any idea ? By dealing with this case specifically, I think we don't really have to add HARDNMI_MASK to irq_count(), considering it's normally an HARDIRQ too. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 21:16 ` Mathieu Desnoyers @ 2008-04-17 21:26 ` Andrew Morton 0 siblings, 0 replies; 23+ messages in thread From: Andrew Morton @ 2008-04-17 21:26 UTC (permalink / raw) To: Mathieu Desnoyers; +Cc: mingo, hpa, jeremy, rostedt, fche, linux-kernel On Thu, 17 Apr 2008 17:16:25 -0400 Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote: > > Should we or should we not make in_interrupt() return true in NMI? > > "should", I expect. > > > > If not, we'd need to do something else to communicate the current > > processing state down to the BUG-handling code. > > > > You bring an interesting question. In practice, since this BUG_ON could > only happen if we have an NMI nested over another NMI or an nmi which > fails to decrement its HARDNMI_MASK. Given that the HARDIRQ_MASK is > incremented right after the HARDNMI_MASK increment (the reverse is also > true), really bad things (TM) must have happened for the BUG_ON to be > triggered outside of the __irq_enter()/__irq_exit() scope of the NMI > below the buggy one. > > But since this code is there to extract as much information as possible > when things go wrong, I would say it's safer to, at least, add > HARDNMI_MASK to irq_count(). > > Instead, though, I think we could add : > > if (in_nmi()) > panic("Fatal exception in non-maskable interrupt"); > > to die(). But that's just one site. There might be (now, or in the future) other code under BUG() which tests in_interrupt(). And most of the places where we test for in_interrupt() and in_irq() probably want that to return true is we're in NMI too. After all, it's an interrupt. > That would be clearer. I just added it to x86_32, but can't > find where x86_64 reports the "fatal exception in interrupt" and friends > message. Any idea ? Dunno - maybe it just doesn't have it. Maybe it was never the right thing to do. 
> By dealing with this case specifically, I think we don't really have to > add HARDNMI_MASK to irq_count(), considering it's normally an HARDIRQ > too. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 20:14 ` [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) Mathieu Desnoyers 2008-04-17 20:29 ` Andrew Morton @ 2008-04-17 22:01 ` Andi Kleen 2008-04-18 0:06 ` Mathieu Desnoyers 2008-04-21 14:00 ` Pavel Machek 2 siblings, 1 reply; 23+ messages in thread From: Andi Kleen @ 2008-04-17 22:01 UTC (permalink / raw) To: Mathieu Desnoyers Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: > > It allows placing immediate values (and therefore optimized trace_marks) in NMI > code Only if all your trace_mark infrastructure is lock less. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 22:01 ` Andi Kleen @ 2008-04-18 0:06 ` Mathieu Desnoyers 2008-04-18 8:07 ` Andi Kleen 2008-04-18 11:30 ` Andi Kleen 0 siblings, 2 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-18 0:06 UTC (permalink / raw) To: Andi Kleen Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel * Andi Kleen (andi@firstfloor.org) wrote: > Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: > > > > It allows placing immediate values (and therefore optimized trace_marks) in NMI > > code > > Only if all your trace_mark infrastructure is lock less. > > -Andi > It uses RCU-style updates and has been designed to be lockless from the ground up. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-18 0:06 ` Mathieu Desnoyers @ 2008-04-18 8:07 ` Andi Kleen 2008-04-19 21:00 ` Mathieu Desnoyers 2008-04-18 11:30 ` Andi Kleen 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2008-04-18 8:07 UTC (permalink / raw) To: Mathieu Desnoyers Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel Mathieu Desnoyers wrote: > * Andi Kleen (andi@firstfloor.org) wrote: >> Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: >>> It allows placing immediate values (and therefore optimized trace_marks) in NMI >>> code >> Only if all your trace_mark infrastructure is lock less. >> >> -Andi >> > > It uses RCU-style updates and has been designed to be lockless from the > ground up. Wrong. If it causes vmalloc faults it is not lockless. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-18 8:07 ` Andi Kleen @ 2008-04-19 21:00 ` Mathieu Desnoyers 0 siblings, 0 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-19 21:00 UTC (permalink / raw) To: Andi Kleen Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel * Andi Kleen (andi@firstfloor.org) wrote: > Mathieu Desnoyers wrote: > > * Andi Kleen (andi@firstfloor.org) wrote: > >> Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: > >>> It allows placing immediate values (and therefore optimized trace_marks) in NMI > >>> code > >> Only if all your trace_mark infrastructure is lock less. > >> > >> -Andi > >> > > > > It uses RCU-style updates and has been designed to be lockless from the > > ground up. > > Wrong. If it causes vmalloc faults it is not lockless. > > -Andi > Could you point me where vmalloc_fault accesses a data structure for which updates are protected by disabling interrupts ? I am curious. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-18 0:06 ` Mathieu Desnoyers 2008-04-18 8:07 ` Andi Kleen @ 2008-04-18 11:30 ` Andi Kleen 2008-04-19 21:23 ` Mathieu Desnoyers 1 sibling, 1 reply; 23+ messages in thread From: Andi Kleen @ 2008-04-18 11:30 UTC (permalink / raw) To: Mathieu Desnoyers Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: > > It uses RCU-style updates and has been designed to be lockless from the > ground up. RCU is not necessarily NMI safe. In most cases RCU needs writer locks which you cannot do with NMIs. -Andi ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-18 11:30 ` Andi Kleen @ 2008-04-19 21:23 ` Mathieu Desnoyers 0 siblings, 0 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-19 21:23 UTC (permalink / raw) To: Andi Kleen Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel * Andi Kleen (andi@firstfloor.org) wrote: > Mathieu Desnoyers <compudj@krystal.dyndns.org> writes: > > > > It uses RCU-style updates and has been designed to be lockless from the > > ground up. > > RCU is not necessarily NMI safe. In most cases RCU needs writer locks > which you cannot do with NMIs. > > -Andi > RCU-style updates are done outside of NMIs, in sleepable context. That's just required when the probes connected on markers must be registered/unregistered. The NMI context is the RCU read side. It only have to get the probe function pointers to call along with the private data pointers. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
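The read-side scheme Mathieu describes — probe registration done in sleepable context, the NMI-side read being a single pointer dereference — can be sketched with C11 atomics. Names here are hypothetical; the in-kernel markers use RCU primitives (rcu_assign_pointer/rcu_dereference) rather than stdatomic.h:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A probe callback plus its private data, published as one unit. */
struct probe {
	void (*func)(void *data, const char *fmt);
	void *data;
};

/* Written only from sleepable (writer) context. */
static _Atomic(struct probe *) active_probe;

/* Writer side: publish the probe. Release ordering makes the probe's
 * fields visible before the pointer itself, like rcu_assign_pointer(). */
static void register_probe(struct probe *p)
{
	atomic_store_explicit(&active_probe, p, memory_order_release);
}

/* Read side, safe from NMI context: one load, no locks taken, so it can
 * never deadlock against the writer. */
static void trace_mark_hit(const char *fmt)
{
	struct probe *p = atomic_load_explicit(&active_probe,
					       memory_order_acquire);
	if (p)
		p->func(p->data, fmt);
}

/* Demo probe used below: counts invocations through its private data. */
static void demo_probe(void *data, const char *fmt)
{
	(void)fmt;
	++*(int *)data;
}
```

Unregistration would additionally need a grace period before the probe's data may be freed, which is exactly the part that must run outside NMI context.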
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-17 20:14 ` [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) Mathieu Desnoyers 2008-04-17 20:29 ` Andrew Morton 2008-04-17 22:01 ` Andi Kleen @ 2008-04-21 14:00 ` Pavel Machek 2008-04-21 14:22 ` H. Peter Anvin 2 siblings, 1 reply; 23+ messages in thread From: Pavel Machek @ 2008-04-21 14:00 UTC (permalink / raw) To: Mathieu Desnoyers Cc: mingo, akpm, H. Peter Anvin, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel On Thu 2008-04-17 16:14:10, Mathieu Desnoyers wrote: > (hopefully finally CCing LKML) :) > > Implements an alternative iret with popf and return so trap and exception > handlers can return to the NMI handler without issuing iret. iret would cause > NMIs to be reenabled prematurely. x86_32 uses popf and far return. x86_64 has to > copy the return instruction pointer to the top of the previous stack, issue a > popf, loads the previous esp and issue a near return (ret). sounds expensive. Does it slow down normal loads? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-21 14:00 ` Pavel Machek @ 2008-04-21 14:22 ` H. Peter Anvin 2008-04-21 14:51 ` Mathieu Desnoyers 2008-04-21 15:08 ` Mathieu Desnoyers 0 siblings, 2 replies; 23+ messages in thread From: H. Peter Anvin @ 2008-04-21 14:22 UTC (permalink / raw) To: Pavel Machek Cc: Mathieu Desnoyers, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel Pavel Machek wrote: > On Thu 2008-04-17 16:14:10, Mathieu Desnoyers wrote: >> (hopefully finally CCing LKML) :) >> >> Implements an alternative iret with popf and return so trap and exception >> handlers can return to the NMI handler without issuing iret. iret would cause >> NMIs to be reenabled prematurely. x86_32 uses popf and far return. x86_64 has to >> copy the return instruction pointer to the top of the previous stack, issue a >> popf, loads the previous esp and issue a near return (ret). > > sounds expensive. Does it slow down normal loads? > It should *only* be used to return from NMI, #MC or INT3 (breakpoint), which should never happen in normal operation, and even then only when interrupting another NMI or #MC handler. -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-21 14:22 ` H. Peter Anvin @ 2008-04-21 14:51 ` Mathieu Desnoyers 2008-04-21 15:08 ` Mathieu Desnoyers 1 sibling, 0 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-21 14:51 UTC (permalink / raw) To: H. Peter Anvin Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel * H. Peter Anvin (hpa@zytor.com) wrote: > Pavel Machek wrote: >> On Thu 2008-04-17 16:14:10, Mathieu Desnoyers wrote: >>> (hopefully finally CCing LKML) :) >>> >>> Implements an alternative iret with popf and return so trap and exception >>> handlers can return to the NMI handler without issuing iret. iret would >>> cause >>> NMIs to be reenabled prematurely. x86_32 uses popf and far return. x86_64 >>> has to >>> copy the return instruction pointer to the top of the previous stack, >>> issue a >>> popf, loads the previous esp and issue a near return (ret). >> sounds expensive. Does it slow down normal loads? > > It should *only* be used to return from NMI, #MC or INT3 (breakpoint), > which should never happen in normal operation, and even then only when > interrupting another NMI or #MC handler. > > -hpa > Sorry Pavel, for some reason you message did not reach my inbox. hpa is right : this code path is only taken to return to the NMI handler from a trap or exception or, possibly, machine check exception. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-21 14:22 ` H. Peter Anvin 2008-04-21 14:51 ` Mathieu Desnoyers @ 2008-04-21 15:08 ` Mathieu Desnoyers 2008-04-21 15:08 ` H. Peter Anvin 2008-04-21 15:11 ` Mathieu Desnoyers 1 sibling, 2 replies; 23+ messages in thread From: Mathieu Desnoyers @ 2008-04-21 15:08 UTC (permalink / raw) To: H. Peter Anvin Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel * H. Peter Anvin (hpa@zytor.com) wrote: > Pavel Machek wrote: >> On Thu 2008-04-17 16:14:10, Mathieu Desnoyers wrote: >>> (hopefully finally CCing LKML) :) >>> >>> Implements an alternative iret with popf and return so trap and exception >>> handlers can return to the NMI handler without issuing iret. iret would >>> cause >>> NMIs to be reenabled prematurely. x86_32 uses popf and far return. x86_64 >>> has to >>> copy the return instruction pointer to the top of the previous stack, >>> issue a >>> popf, loads the previous esp and issue a near return (ret). >> sounds expensive. Does it slow down normal loads? > > It should *only* be used to return from NMI, #MC or INT3 (breakpoint), > which should never happen in normal operation, and even then only when > interrupting another NMI or #MC handler. > > -hpa > Just to be clear : the added cost on normal interrupt return is to add a supplementary test of the thread flags already loaded in registers and a conditional branch. This is used to detect if we are nested over an NMI handler. I doubt anyone ever notice an impact caused by this added test/branch. Mathieu -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) 2008-04-21 15:08 ` Mathieu Desnoyers @ 2008-04-21 15:08 ` H. Peter Anvin 2008-04-21 15:21 ` Mathieu Desnoyers 2008-04-21 15:47 ` Mathieu Desnoyers 2008-04-21 15:11 ` Mathieu Desnoyers 1 sibling, 2 replies; 23+ messages in thread From: H. Peter Anvin @ 2008-04-21 15:08 UTC (permalink / raw) To: Mathieu Desnoyers Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel Mathieu Desnoyers wrote: > > Just to be clear : the added cost on normal interrupt return is to add a > supplementary test of the thread flags already loaded in registers and > a conditional branch. This is used to detect if we are nested over an > NMI handler. I doubt anyone ever notice an impact caused by this added > test/branch. > Why the **** would you do this except in the handful of places where you actually *could* be nested over an NMI handler (basically #MC, #DB and INT3)? -hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 15:08 ` H. Peter Anvin
@ 2008-04-21 15:21 ` Mathieu Desnoyers
  1 sibling, 0 replies; 23+ messages in thread
From: Mathieu Desnoyers @ 2008-04-21 15:21 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

* H. Peter Anvin (hpa@zytor.com) wrote:
> Mathieu Desnoyers wrote:
>> Just to be clear : the added cost on normal interrupt return is to add a
>> supplementary test of the thread flags already loaded in registers and
>> a conditional branch. This is used to detect if we are nested over an
>> NMI handler. I doubt anyone ever notice an impact caused by this added
>> test/branch.
>
> Why the **** would you do this except in the handful of places where you
> actually *could* be nested over an NMI handler (basically #MC, #DB and
> INT3)?
>
> 	-hpa
>

Because I would have to do a more invasive code modification, since they
currently share their return path with normal interrupts. I agree that
the next step is to tune the patchset to only target traps and
exceptions which may happen on top of an NMI. I'll change it in my next
patchset version.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 15:08 ` H. Peter Anvin
  2008-04-21 15:21 ` Mathieu Desnoyers
@ 2008-04-21 15:47 ` Mathieu Desnoyers
  2008-04-21 17:23 ` Pavel Machek
  1 sibling, 1 reply; 23+ messages in thread
From: Mathieu Desnoyers @ 2008-04-21 15:47 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

* H. Peter Anvin (hpa@zytor.com) wrote:
> Mathieu Desnoyers wrote:
>> Just to be clear : the added cost on normal interrupt return is to add a
>> supplementary test of the thread flags already loaded in registers and
>> a conditional branch. This is used to detect if we are nested over an
>> NMI handler. I doubt anyone ever notice an impact caused by this added
>> test/branch.
>
> Why the **** would you do this except in the handful of places where you
> actually *could* be nested over an NMI handler (basically #MC, #DB and
> INT3)?
>
> 	-hpa
>

There is also the page fault case. I think putting this test in
ret_from_exception would be both safe (it is executed for any exception
return) and fast (exceptions are rare).

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 15:47 ` Mathieu Desnoyers
@ 2008-04-21 17:23 ` Pavel Machek
  2008-04-21 17:28 ` H. Peter Anvin
  0 siblings, 1 reply; 23+ messages in thread
From: Pavel Machek @ 2008-04-21 17:23 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: H. Peter Anvin, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

On Mon 2008-04-21 11:47:56, Mathieu Desnoyers wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
> > Mathieu Desnoyers wrote:
> >> Just to be clear : the added cost on normal interrupt return is to add a
> >> supplementary test of the thread flags already loaded in registers and
> >> a conditional branch. This is used to detect if we are nested over an
> >> NMI handler. I doubt anyone ever notice an impact caused by this added
> >> test/branch.
> >
> > Why the **** would you do this except in the handful of places where you
> > actually *could* be nested over an NMI handler (basically #MC, #DB and
> > INT3)?
>
> There is also the page fault case. I think putting this test in
> ret_from_exception would be both safe (it is executed for any
> exception return) and fast (exceptions are rare).

Eh? I thought that page fault is one of the hottest paths in kernel
(along with syscall and packet receive/send)...

							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 17:23 ` Pavel Machek
@ 2008-04-21 17:28 ` H. Peter Anvin
  2008-04-21 17:42 ` Mathieu Desnoyers
  0 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2008-04-21 17:28 UTC (permalink / raw)
To: Pavel Machek
Cc: Mathieu Desnoyers, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

Pavel Machek wrote:
>
>> There is also the page fault case. I think putting this test in
>> ret_from_exception would be both safe (it is executed for any
>> exception return) and fast (exceptions are rare).
>
> Eh? I thought that page fault is one of the hottest paths in kernel
> (along with syscall and packet receive/send)...
> 							Pavel

Yeah, and the concept of handling page faults inside an NMI handler is
pure fantasy.

	-hpa
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 17:28 ` H. Peter Anvin
@ 2008-04-21 17:42 ` Mathieu Desnoyers
  2008-04-21 17:59 ` H. Peter Anvin
  0 siblings, 1 reply; 23+ messages in thread
From: Mathieu Desnoyers @ 2008-04-21 17:42 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

* H. Peter Anvin (hpa@zytor.com) wrote:
> Pavel Machek wrote:
>>> There is also the page fault case. I think putting this test in
>>> ret_from_exception would be both safe (it is executed for any
>>> exception return) and fast (exceptions are rare).
>>
>> Eh? I thought that page fault is one of the hottest paths in kernel
>> (along with syscall and packet receive/send)...
>> 							Pavel
>

On x86_64, we can pinpoint only the page faults returning to the kernel,
which are rare and only caused by vmalloc accesses. Ideally we could do
the same on x86_32.

> Yeah, and the concept of handling page faults inside an NMI handler is
> pure fantasy.
>
> 	-hpa
>

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
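For context, the reason a kernel-mode fault can occur on a vmalloc access at all is that the kernel's top-level page-table entries are synchronized lazily into each task's page tables: a fresh vmalloc mapping may be absent from the current task's tables until the first access faults and the handler copies the entry from the master table. A toy C model of that mechanism (all names, sizes, and return conventions here are hypothetical; the real fixup is the kernel's vmalloc_fault() handler):

```c
/* Toy model of lazy vmalloc page-table synchronization.  A new mapping
 * exists in the master kernel table but not in the task's copy, so the
 * first access "faults" and is repaired from the master. */
#define PGD_ENTRIES 8

static int master_pgd[PGD_ENTRIES]; /* master kernel mappings */
static int task_pgd[PGD_ENTRIES];   /* current task's lazy copy */

static void model_vmalloc_map(int idx)
{
	master_pgd[idx] = 1; /* mapping created in the master table only */
}

/* Returns 1 if the access faulted and was repaired, 0 on a plain hit,
 * -1 for a genuinely unmapped address (a real bad access). */
static int model_access(int idx)
{
	if (task_pgd[idx])
		return 0; /* entry already synced: no fault taken */
	/* vmalloc_fault()-style fixup: copy the entry from the master */
	task_pgd[idx] = master_pgd[idx];
	return task_pgd[idx] ? 1 : -1;
}
```

This is why such faults are rare (one per task per new top-level entry) and why they return to the kernel rather than to user space, which is what makes them distinguishable on the return path.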
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 17:42 ` Mathieu Desnoyers
@ 2008-04-21 17:59 ` H. Peter Anvin
  2008-04-22 13:12 ` Mathieu Desnoyers
  0 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2008-04-21 17:59 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

Mathieu Desnoyers wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
>> Pavel Machek wrote:
>>>> There is also the page fault case. I think putting this test in
>>>> ret_from_exception would be both safe (it is executed for any
>>>> exception return) and fast (exceptions are rare).
>>>
>>> Eh? I thought that page fault is one of the hottest paths in kernel
>>> (along with syscall and packet receive/send)...
>>> 							Pavel
>
> On x86_64, we can pinpoint only the page faults returning to the kernel,
> which are rare and only caused by vmalloc accesses. Ideally we could do
> the same on x86_32.
>

Pinpoint, how? Ultimately you need a runtime test, and you had better be
showing that people are going to die unless you do this before you add a
cycle to the page fault path. I'm only slightly exaggerating that.

	-hpa
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 17:59 ` H. Peter Anvin
@ 2008-04-22 13:12 ` Mathieu Desnoyers
  0 siblings, 0 replies; 23+ messages in thread
From: Mathieu Desnoyers @ 2008-04-22 13:12 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

* H. Peter Anvin (hpa@zytor.com) wrote:
> Mathieu Desnoyers wrote:
>> * H. Peter Anvin (hpa@zytor.com) wrote:
>>> Pavel Machek wrote:
>>>>> There is also the page fault case. I think putting this test in
>>>>> ret_from_exception would be both safe (it is executed for any
>>>>> exception return) and fast (exceptions are rare).
>>>> Eh? I thought that page fault is one of the hottest paths in kernel
>>>> (along with syscall and packet receive/send)...
>>>> 							Pavel
>> On x86_64, we can pinpoint only the page faults returning to the kernel,
>> which are rare and only caused by vmalloc accesses. Ideally we could do
>> the same on x86_32.
>
> Pinpoint, how? Ultimately you need a runtime test, and you better be
> showing that people are going to die unless before you add a cycle to the
> page fault path. I'm only slightly exaggerating that.
>

On x86_32, ret_from_exception identifies the return path taken to return
from an exception. By duplicating the check_userspace code both in
ret_from_intr and in ret_from_exception (that's only 4 instructions), we
can know whether we are in the specific condition of returning to the
kernel from an exception without any supplementary test. Therefore, we
can do the NMI nesting test only in the specific
return-to-kernel-from-exception case, without slowing down any critical
code. Something similar is done on x86_64. That will appear in my next
version.

Thanks,

Mathieu

> 	-hpa
>

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
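The routing Mathieu describes — running the nesting test only when an exception returns to kernel mode — can be sketched as a small decision function. This is a model with hypothetical names; in the patch the equivalent logic is arranged by duplicating a few check_userspace instructions in the assembly return paths:

```c
#include <stdbool.h>

enum ret_path { RET_IRET, RET_POPF_RET };

/* Sketch: only an exception returning to kernel mode ever evaluates the
 * nested-over-NMI condition; interrupt returns and returns to user
 * space take the plain iret path unconditionally, so the hot paths pay
 * no extra runtime test. */
static enum ret_path pick_return_path(bool from_exception,
				      bool returning_to_kernel,
				      bool nested_over_nmi)
{
	if (from_exception && returning_to_kernel && nested_over_nmi)
		return RET_POPF_RET; /* avoid iret: keep NMIs masked */
	return RET_IRET;
}
```

The design point is that the branch on `from_exception && returning_to_kernel` is resolved structurally (by which return-path label was reached), so the only runtime cost added is the nesting test on the already-rare exception-to-kernel path.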
* Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5)
  2008-04-21 15:08 ` Mathieu Desnoyers
  2008-04-21 15:08 ` H. Peter Anvin
@ 2008-04-21 15:11 ` Mathieu Desnoyers
  1 sibling, 0 replies; 23+ messages in thread
From: Mathieu Desnoyers @ 2008-04-21 15:11 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Pavel Machek, mingo, akpm, Jeremy Fitzhardinge, Steven Rostedt, Frank Ch. Eigler, linux-kernel

* Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> * H. Peter Anvin (hpa@zytor.com) wrote:
> > Pavel Machek wrote:
> >> On Thu 2008-04-17 16:14:10, Mathieu Desnoyers wrote:
> >>> (hopefully finally CCing LKML) :)
> >>>
> >>> Implements an alternative iret with popf and return so trap and
> >>> exception handlers can return to the NMI handler without issuing
> >>> iret. iret would cause NMIs to be reenabled prematurely. x86_32
> >>> uses popf and far return. x86_64 has to copy the return instruction
> >>> pointer to the top of the previous stack, issue a popf, loads the
> >>> previous esp and issue a near return (ret).
> >>
> >> sounds expensive. Does it slow down normal loads?
> >
> > It should *only* be used to return from NMI, #MC or INT3 (breakpoint),
> > which should never happen in normal operation, and even then only when
> > interrupting another NMI or #MC handler.
> >
> > 	-hpa
> >
>
> Just to be clear : the added cost on normal interrupt return is to add a
> supplementary test of the thread flags already loaded in registers and

err, by thread flags, I meant the thread preempt count. And it's not in
registers, so it has to be read from the data cache (it's clearly
already there).

> a conditional branch. This is used to detect if we are nested over an
> NMI handler. I doubt anyone ever notice an impact caused by this added
> test/branch.
>
> Mathieu
>
> -- 
> Mathieu Desnoyers
> Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
> OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
Thread overview: 23+ messages
[not found] <20080417165839.GA25198@Krystal>
[not found] ` <20080417165944.GB25198@Krystal>
2008-04-17 20:14 ` [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v5) Mathieu Desnoyers
2008-04-17 20:29 ` Andrew Morton
2008-04-17 21:16 ` Mathieu Desnoyers
2008-04-17 21:26 ` Andrew Morton
2008-04-17 22:01 ` Andi Kleen
2008-04-18 0:06 ` Mathieu Desnoyers
2008-04-18 8:07 ` Andi Kleen
2008-04-19 21:00 ` Mathieu Desnoyers
2008-04-18 11:30 ` Andi Kleen
2008-04-19 21:23 ` Mathieu Desnoyers
2008-04-21 14:00 ` Pavel Machek
2008-04-21 14:22 ` H. Peter Anvin
2008-04-21 14:51 ` Mathieu Desnoyers
2008-04-21 15:08 ` Mathieu Desnoyers
2008-04-21 15:08 ` H. Peter Anvin
2008-04-21 15:21 ` Mathieu Desnoyers
2008-04-21 15:47 ` Mathieu Desnoyers
2008-04-21 17:23 ` Pavel Machek
2008-04-21 17:28 ` H. Peter Anvin
2008-04-21 17:42 ` Mathieu Desnoyers
2008-04-21 17:59 ` H. Peter Anvin
2008-04-22 13:12 ` Mathieu Desnoyers
2008-04-21 15:11 ` Mathieu Desnoyers