* [RFC GIT PULL] softirq: Consolidation and stack overrun fix
@ 2013-09-19 19:51 Frederic Weisbecker
2013-09-19 19:51 ` [PATCH 1/3] irq: Consolidate do_softirq() arch overridden implementations Frederic Weisbecker
` (3 more replies)
0 siblings, 4 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-19 19:51 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, Linus Torvalds,
James Hogan, James E.J. Bottomley, Helge Deller,
Martin Schwidefsky, Heiko Carstens, David S. Miller,
Andrew Morton
Thomas,
Please consider this patchset for pulling from:
git://github.com/fweisbec/linux-dynticks.git
irq/core-v2
HEAD: 539b9cde35b473483c722de110133cd757015947
It fixes stack overruns reported by Benjamin Herrenschmidt:
http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
And Paul Mackerras gave feedback here:
http://lkml.kernel.org/r/20130918065101.GA22060@drongo
Of course the fix probably comes at the expense of a performance
hit (stack switches, cache misses, etc.) when softirqs are processed
at the end of interrupts, although I haven't tried to measure that.
Thanks.
---
Frederic Weisbecker (3):
irq: Consolidate do_softirq() arch overridden implementations
irq: Execute softirq on its own stack on irq exit
irq: Comment on the use of inline stack for ksoftirqd
arch/metag/kernel/irq.c | 56 ++++++++++++++++++-------------------------
arch/parisc/kernel/irq.c | 17 ++-----------
arch/powerpc/kernel/irq.c | 17 +------------
arch/s390/kernel/irq.c | 52 ++++++++++++++++------------------------
arch/sh/kernel/irq.c | 60 +++++++++++++++++++---------------------------
arch/sparc/kernel/irq_64.c | 31 ++++++++----------------
arch/x86/kernel/irq_32.c | 34 ++++++++++----------------
arch/x86/kernel/irq_64.c | 18 +++-----------
include/linux/interrupt.h | 11 +++++++++
kernel/softirq.c | 13 +++++-----
10 files changed, 115 insertions(+), 194 deletions(-)
--
1.8.3.1
^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/3] irq: Consolidate do_softirq() arch overridden implementations
  2013-09-19 19:51 [RFC GIT PULL] softirq: Consolidation and stack overrun fix Frederic Weisbecker
@ 2013-09-19 19:51 ` Frederic Weisbecker
  2013-09-19 19:51 ` [PATCH 2/3] irq: Execute softirq on its own stack on irq exit Frederic Weisbecker
  ` (2 subsequent siblings)
  3 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-19 19:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, Linus Torvalds,
      James Hogan, James E.J. Bottomley, Helge Deller,
      Martin Schwidefsky, Heiko Carstens, David S. Miller,
      Andrew Morton

All arch overridden implementations of do_softirq() share the
following common code: disable irqs, check if there are softirqs
pending, then execute __do_softirq() on a specific stack.

Consolidate the common parts such that archs only worry about the
stack switch.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 arch/metag/kernel/irq.c    | 56 ++++++++++++++++++-------------------------
 arch/parisc/kernel/irq.c   | 17 ++-----------
 arch/powerpc/kernel/irq.c  | 17 +------------
 arch/s390/kernel/irq.c     | 52 ++++++++++++++++------------------------
 arch/sh/kernel/irq.c       | 60 +++++++++++++++++++---------------------------
 arch/sparc/kernel/irq_64.c | 31 ++++++++----------------
 arch/x86/kernel/irq_32.c   | 34 ++++++++++----------------
 arch/x86/kernel/irq_64.c   | 18 +++-----------
 include/linux/interrupt.h  | 11 +++++++++
 kernel/softirq.c           |  7 ++----
 10 files changed, 110 insertions(+), 193 deletions(-)

diff --git a/arch/metag/kernel/irq.c b/arch/metag/kernel/irq.c
index 2a2c9d5..12a12b4 100644
--- a/arch/metag/kernel/irq.c
+++ b/arch/metag/kernel/irq.c
@@ -159,44 +159,34 @@ void irq_ctx_exit(int cpu)
 
 extern asmlinkage void __do_softirq(void);
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        unsigned long flags;
         struct thread_info *curctx;
         union irq_ctx *irqctx;
         u32 *isp;
 
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-
-        if (local_softirq_pending()) {
-                curctx = current_thread_info();
-                irqctx = softirq_ctx[smp_processor_id()];
-                irqctx->tinfo.task = curctx->task;
-
-                /* build the stack frame on the softirq stack */
-                isp = (u32 *) ((char *)irqctx + sizeof(struct thread_info));
-
-                asm volatile (
-                        "MOV   D0.5,%0\n"
-                        "SWAP  A0StP,D0.5\n"
-                        "CALLR D1RtP,___do_softirq\n"
-                        "MOV   A0StP,D0.5\n"
-                        :
-                        : "r" (isp)
-                        : "memory", "cc", "D1Ar1", "D0Ar2", "D1Ar3", "D0Ar4",
-                          "D1Ar5", "D0Ar6", "D0Re0", "D1Re0", "D0.4", "D1RtP",
-                          "D0.5"
-                        );
-                /*
-                 * Shouldn't happen, we returned above if in_interrupt():
-                 */
-                WARN_ON_ONCE(softirq_count());
-        }
-
-        local_irq_restore(flags);
+        curctx = current_thread_info();
+        irqctx = softirq_ctx[smp_processor_id()];
+        irqctx->tinfo.task = curctx->task;
+
+        /* build the stack frame on the softirq stack */
+        isp = (u32 *) ((char *)irqctx + sizeof(struct thread_info));
+
+        asm volatile (
+                "MOV   D0.5,%0\n"
+                "SWAP  A0StP,D0.5\n"
+                "CALLR D1RtP,___do_softirq\n"
+                "MOV   A0StP,D0.5\n"
+                :
+                : "r" (isp)
+                : "memory", "cc", "D1Ar1", "D0Ar2", "D1Ar3", "D0Ar4",
+                  "D1Ar5", "D0Ar6", "D0Re0", "D1Re0", "D0.4", "D1RtP",
+                  "D0.5"
+                );
+        /*
+         * Shouldn't happen, we returned above if in_interrupt():
+         */
+        WARN_ON_ONCE(softirq_count());
 }
 
 #endif
diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c
index 2e6443b..ef59276 100644
--- a/arch/parisc/kernel/irq.c
+++ b/arch/parisc/kernel/irq.c
@@ -499,22 +499,9 @@ static void execute_on_irq_stack(void *func, unsigned long param1)
         *irq_stack_in_use = 1;
 }
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        __u32 pending;
-        unsigned long flags;
-
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-
-        pending = local_softirq_pending();
-
-        if (pending)
-                execute_on_irq_stack(__do_softirq, 0);
-
-        local_irq_restore(flags);
+        execute_on_irq_stack(__do_softirq, 0);
 }
 #endif /* CONFIG_IRQSTACKS */
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index c69440c..7d0da88 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -601,7 +601,7 @@ void irq_ctx_init(void)
         }
 }
 
-static inline void do_softirq_onstack(void)
+void do_softirq_own_stack(void)
 {
         struct thread_info *curtp, *irqtp;
         unsigned long saved_sp_limit = current->thread.ksp_limit;
@@ -623,21 +623,6 @@ static inline void do_softirq_onstack(void)
         set_bits(irqtp->flags, &curtp->flags);
 }
 
-void do_softirq(void)
-{
-        unsigned long flags;
-
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-
-        if (local_softirq_pending())
-                do_softirq_onstack();
-
-        local_irq_restore(flags);
-}
-
 irq_hw_number_t virq_to_hw(unsigned int virq)
 {
         struct irq_data *irq_data = irq_get_irq_data(virq);
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index 8ac2097..bb27a26 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -157,39 +157,29 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 /*
  * Switch to the asynchronous interrupt stack for softirq execution.
  */
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        unsigned long flags, old, new;
-
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-
-        if (local_softirq_pending()) {
-                /* Get current stack pointer. */
-                asm volatile("la %0,0(15)" : "=a" (old));
-                /* Check against async. stack address range. */
-                new = S390_lowcore.async_stack;
-                if (((new - old) >> (PAGE_SHIFT + THREAD_ORDER)) != 0) {
-                        /* Need to switch to the async. stack. */
-                        new -= STACK_FRAME_OVERHEAD;
-                        ((struct stack_frame *) new)->back_chain = old;
-
-                        asm volatile("   la    15,0(%0)\n"
-                                     "   basr  14,%2\n"
-                                     "   la    15,0(%1)\n"
-                                     : : "a" (new), "a" (old),
-                                         "a" (__do_softirq)
-                                     : "0", "1", "2", "3", "4", "5", "14",
-                                       "cc", "memory" );
-                } else {
-                        /* We are already on the async stack. */
-                        __do_softirq();
-                }
+        unsigned long old, new;
+
+        /* Get current stack pointer. */
+        asm volatile("la %0,0(15)" : "=a" (old));
+        /* Check against async. stack address range. */
+        new = S390_lowcore.async_stack;
+        if (((new - old) >> (PAGE_SHIFT + THREAD_ORDER)) != 0) {
+                /* Need to switch to the async. stack. */
+                new -= STACK_FRAME_OVERHEAD;
+                ((struct stack_frame *) new)->back_chain = old;
+                asm volatile("   la    15,0(%0)\n"
+                             "   basr  14,%2\n"
+                             "   la    15,0(%1)\n"
+                             : : "a" (new), "a" (old),
+                                 "a" (__do_softirq)
+                             : "0", "1", "2", "3", "4", "5", "14",
+                               "cc", "memory" );
+        } else {
+                /* We are already on the async stack. */
+                __do_softirq();
         }
-
-        local_irq_restore(flags);
 }
 
 /*
diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c
index 063af10..50ecd48 100644
--- a/arch/sh/kernel/irq.c
+++ b/arch/sh/kernel/irq.c
@@ -149,47 +149,37 @@ void irq_ctx_exit(int cpu)
         hardirq_ctx[cpu] = NULL;
 }
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        unsigned long flags;
         struct thread_info *curctx;
         union irq_ctx *irqctx;
         u32 *isp;
 
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
+        curctx = current_thread_info();
+        irqctx = softirq_ctx[smp_processor_id()];
+        irqctx->tinfo.task = curctx->task;
+        irqctx->tinfo.previous_sp = current_stack_pointer;
+
+        /* build the stack frame on the softirq stack */
+        isp = (u32 *)((char *)irqctx + sizeof(*irqctx));
+
+        __asm__ __volatile__ (
+                "mov    r15, r9         \n"
+                "jsr    @%0             \n"
+                /* switch to the softirq stack */
+                " mov   %1, r15         \n"
+                /* restore the thread stack */
+                "mov    r9, r15         \n"
+                : /* no outputs */
+                : "r" (__do_softirq), "r" (isp)
+                : "memory", "r0", "r1", "r2", "r3", "r4",
+                  "r5", "r6", "r7", "r8", "r9", "r15", "t", "pr"
+        );
 
-        if (local_softirq_pending()) {
-                curctx = current_thread_info();
-                irqctx = softirq_ctx[smp_processor_id()];
-                irqctx->tinfo.task = curctx->task;
-                irqctx->tinfo.previous_sp = current_stack_pointer;
-
-                /* build the stack frame on the softirq stack */
-                isp = (u32 *)((char *)irqctx + sizeof(*irqctx));
-
-                __asm__ __volatile__ (
-                        "mov    r15, r9         \n"
-                        "jsr    @%0             \n"
-                        /* switch to the softirq stack */
-                        " mov   %1, r15         \n"
-                        /* restore the thread stack */
-                        "mov    r9, r15         \n"
-                        : /* no outputs */
-                        : "r" (__do_softirq), "r" (isp)
-                        : "memory", "r0", "r1", "r2", "r3", "r4",
-                          "r5", "r6", "r7", "r8", "r9", "r15", "t", "pr"
-                );
-
-                /*
-                 * Shouldn't happen, we returned above if in_interrupt():
-                 */
-                WARN_ON_ONCE(softirq_count());
-        }
-
-        local_irq_restore(flags);
+        /*
+         * Shouldn't happen, we returned above if in_interrupt():
+         */
+        WARN_ON_ONCE(softirq_count());
 }
 #else
 static inline void handle_one_irq(unsigned int irq)
diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c
index d4840ce..666193f 100644
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -698,30 +698,19 @@ void __irq_entry handler_irq(int pil, struct pt_regs *regs)
         set_irq_regs(old_regs);
 }
 
-void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        unsigned long flags;
-
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
+        void *orig_sp, *sp = softirq_stack[smp_processor_id()];
 
-        if (local_softirq_pending()) {
-                void *orig_sp, *sp = softirq_stack[smp_processor_id()];
-
-                sp += THREAD_SIZE - 192 - STACK_BIAS;
-
-                __asm__ __volatile__("mov %%sp, %0\n\t"
-                                     "mov %1, %%sp"
-                                     : "=&r" (orig_sp)
-                                     : "r" (sp));
-                __do_softirq();
-                __asm__ __volatile__("mov %0, %%sp"
-                                     : : "r" (orig_sp));
-        }
+        sp += THREAD_SIZE - 192 - STACK_BIAS;
 
-        local_irq_restore(flags);
+        __asm__ __volatile__("mov %%sp, %0\n\t"
+                             "mov %1, %%sp"
+                             : "=&r" (orig_sp)
+                             : "r" (sp));
+        __do_softirq();
+        __asm__ __volatile__("mov %0, %%sp"
+                             : : "r" (orig_sp));
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index 4186755..1036f03 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -149,35 +149,25 @@ void irq_ctx_init(int cpu)
                cpu, per_cpu(hardirq_ctx, cpu), per_cpu(softirq_ctx, cpu));
 }
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        unsigned long flags;
         struct thread_info *curctx;
         union irq_ctx *irqctx;
         u32 *isp;
 
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-
-        if (local_softirq_pending()) {
-                curctx = current_thread_info();
-                irqctx = __this_cpu_read(softirq_ctx);
-                irqctx->tinfo.task = curctx->task;
-                irqctx->tinfo.previous_esp = current_stack_pointer;
-
-                /* build the stack frame on the softirq stack */
-                isp = (u32 *) ((char *)irqctx + sizeof(*irqctx));
+        curctx = current_thread_info();
+        irqctx = __this_cpu_read(softirq_ctx);
+        irqctx->tinfo.task = curctx->task;
+        irqctx->tinfo.previous_esp = current_stack_pointer;
 
-                call_on_stack(__do_softirq, isp);
-                /*
-                 * Shouldn't happen, we returned above if in_interrupt():
-                 */
-                WARN_ON_ONCE(softirq_count());
-        }
+        /* build the stack frame on the softirq stack */
+        isp = (u32 *) ((char *)irqctx + sizeof(*irqctx));
 
-        local_irq_restore(flags);
+        call_on_stack(__do_softirq, isp);
+        /*
+         * Shouldn't happen, we returned above if in_interrupt():
+         */
+        WARN_ON_ONCE(softirq_count());
 }
 
 bool handle_irq(unsigned irq, struct pt_regs *regs)
diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c
index d04d3ec..8dd8c96 100644
--- a/arch/x86/kernel/irq_64.c
+++ b/arch/x86/kernel/irq_64.c
@@ -91,20 +91,8 @@ bool handle_irq(unsigned irq, struct pt_regs *regs)
 
 extern void call_softirq(void);
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-        __u32 pending;
-        unsigned long flags;
-
-        if (in_interrupt())
-                return;
-
-        local_irq_save(flags);
-        pending = local_softirq_pending();
-        /* Switch to interrupt stack */
-        if (pending) {
-                call_softirq();
-                WARN_ON_ONCE(softirq_count());
-        }
-        local_irq_restore(flags);
+        call_softirq();
+        WARN_ON_ONCE(softirq_count());
 }
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5e865b5..c9e831d 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -19,6 +19,7 @@
 #include <linux/atomic.h>
 
 #include <asm/ptrace.h>
+#include <asm/irq.h>
 
 /*
  * These correspond to the IORESOURCE_IRQ_* defines in
@@ -374,6 +375,16 @@ struct softirq_action
 
 asmlinkage void do_softirq(void);
 asmlinkage void __do_softirq(void);
+
+#ifdef __ARCH_HAS_DO_SOFTIRQ
+void do_softirq_own_stack(void);
+#else
+static inline void do_softirq_own_stack(void)
+{
+        __do_softirq();
+}
+#endif
+
 extern void open_softirq(int nr, void (*action)(struct softirq_action *));
 extern void softirq_init(void);
 extern void __raise_softirq_irqoff(unsigned int nr);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 53cc09c..36112cf 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -29,7 +29,6 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/irq.h>
 
-#include <asm/irq.h>
 /*
    - No shared variables, all the data are CPU local.
    - If a softirq needs serialization, let it serialize itself
@@ -283,7 +282,7 @@ restart:
         tsk_restore_flags(current, old_flags, PF_MEMALLOC);
 }
 
-#ifndef __ARCH_HAS_DO_SOFTIRQ
+
 asmlinkage void do_softirq(void)
 {
@@ -298,13 +297,11 @@ asmlinkage void do_softirq(void)
         pending = local_softirq_pending();
 
         if (pending)
-                __do_softirq();
+                do_softirq_own_stack();
 
         local_irq_restore(flags);
 }
 
-#endif
-
 /*
  * Enter an interrupt context.
  */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread
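[For readers skimming the series: the contract this consolidation leaves behind is small. Below is a minimal sketch of how an architecture opts in after this patch; the arch name and the two stack-switch helpers are illustrative assumptions, not code from the series.]

	/* arch/foo/include/asm/irq.h -- hypothetical arch */
	#define __ARCH_HAS_DO_SOFTIRQ

	/* arch/foo/kernel/irq.c */
	void do_softirq_own_stack(void)
	{
		/*
		 * The generic do_softirq() has already disabled irqs and
		 * checked local_softirq_pending(); only the stack switch
		 * is left to the architecture.
		 */
		void *old_sp = arch_switch_to_irq_stack();	/* assumed helper */

		__do_softirq();
		arch_restore_stack(old_sp);			/* assumed helper */
	}

Architectures without a dedicated softirq stack need nothing: the static inline fallback in <linux/interrupt.h> calls __do_softirq() directly.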
* [PATCH 2/3] irq: Execute softirq on its own stack on irq exit
  2013-09-19 19:51 [RFC GIT PULL] softirq: Consolidation and stack overrun fix Frederic Weisbecker
  2013-09-19 19:51 ` [PATCH 1/3] irq: Consolidate do_softirq() arch overridden implementations Frederic Weisbecker
@ 2013-09-19 19:51 ` Frederic Weisbecker
  2013-09-19 19:51 ` [PATCH 3/3] irq: Comment on the use of inline stack for ksoftirqd Frederic Weisbecker
  2013-09-20  0:02 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Linus Torvalds
  3 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-19 19:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, Linus Torvalds,
      James Hogan, James E.J. Bottomley, Helge Deller,
      Martin Schwidefsky, Heiko Carstens, David S. Miller,
      Andrew Morton

When a softirq executes in irq_exit(), it can contribute to randomly
complicated and large stack scenarios involving task calls, hw
interrupt calls and softirq handler calls, with further irqs
interrupting the softirq and digging deeper through their own handlers.

Softirqs executing on the inline hw interrupt stack may favour stack
overflows in such circumstances, as has been reported on powerpc where
task -> irq -> softirq -> irq can end up forming a huge calltrace on
the single kernel stack.

So if there are softirqs pending on hardirq exit, let's execute them
on the softirq stack to minimize this.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/softirq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 36112cf..8c8f08b 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -326,7 +326,7 @@ void irq_enter(void)
 static inline void invoke_softirq(void)
 {
         if (!force_irqthreads)
-                __do_softirq();
+                do_softirq_own_stack();
         else
                 wakeup_softirqd();
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread
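[For context, this one-liner changes where the common irq_exit() path lands. A simplified sketch of the surrounding flow, reconstructed from memory of the 3.11-era kernel/softirq.c, so treat the helper names and ordering as approximate:]

	void irq_exit(void)
	{
		account_irq_exit_time(current);
		sub_preempt_count(HARDIRQ_OFFSET);	/* leave hardirq context */
		if (!in_interrupt() && local_softirq_pending())
			invoke_softirq();	/* now reaches do_softirq_own_stack() */
		/* tick_irq_exit() and rcu_irq_exit() elided */
	}

With force_irqthreads set, invoke_softirq() still defers to ksoftirqd instead of running softirqs here.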
* [PATCH 3/3] irq: Comment on the use of inline stack for ksoftirqd
  2013-09-19 19:51 [RFC GIT PULL] softirq: Consolidation and stack overrun fix Frederic Weisbecker
  2013-09-19 19:51 ` [PATCH 1/3] irq: Consolidate do_softirq() arch overridden implementations Frederic Weisbecker
  2013-09-19 19:51 ` [PATCH 2/3] irq: Execute softirq on its own stack on irq exit Frederic Weisbecker
@ 2013-09-19 19:51 ` Frederic Weisbecker
  2013-09-20  0:02 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Linus Torvalds
  3 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-19 19:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, Linus Torvalds,
      James Hogan, James E.J. Bottomley, Helge Deller,
      Martin Schwidefsky, Heiko Carstens, David S. Miller,
      Andrew Morton

Ksoftirqd shouldn't need the softirq stack since it executes in a
kernel thread whose call stack is only just beginning at this stage.
Let's comment on that for clarity.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 kernel/softirq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 8c8f08b..1b0f48e 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -759,6 +759,10 @@ static void run_ksoftirqd(unsigned int cpu)
 {
         local_irq_disable();
         if (local_softirq_pending()) {
+                /*
+                 * Execute softirq on inline stack, as we are not deep in
+                 * the task stack here.
+                 */
                 __do_softirq();
                 rcu_note_context_switch(cpu);
                 local_irq_enable();
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-19 19:51 [RFC GIT PULL] softirq: Consolidation and stack overrun fix Frederic Weisbecker
  ` (2 preceding siblings ...)
  2013-09-19 19:51 ` [PATCH 3/3] irq: Comment on the use of inline stack for ksoftirqd Frederic Weisbecker
@ 2013-09-20  0:02 ` Linus Torvalds
  2013-09-20  1:53   ` Benjamin Herrenschmidt
  2013-09-20 11:03   ` Thomas Gleixner
  3 siblings, 2 replies; 52+ messages in thread
From: Linus Torvalds @ 2013-09-20  0:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> It fixes stack overruns reported by Benjamin Herrenschmidt:
> http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop

So I don't really dislike this patch-series, but isn't "irq_exit()"
(which calls the new softirq_on_stack()) already running in the
context of the irq stack? And it's run at the very end of the irq
processing, so the irq stack should be empty too at that point.

So switching to *another* empty stack sounds really sad. No? Taking
more cache misses etc, instead of using the already empty - but
cache-hot - stack that we already have.

I'm assuming that the problem is that since we're already on the irq
stack, if *another* irq comes in, now that *other* irq doesn't get yet
another irq stack page. And I'm wondering whether we shouldn't just
fix that (hopefully unlikely) case instead? So instead of having a
softirq stack, we'd have just an extra irq stack for the case where
the original irq stack is already in use.

Hmm?

              Linus

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20  0:02 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Linus Torvalds
@ 2013-09-20  1:53   ` Benjamin Herrenschmidt
  2013-09-20 11:03   ` Thomas Gleixner
  1 sibling, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-20  1:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Thu, 2013-09-19 at 19:02 -0500, Linus Torvalds wrote:
> On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > It fixes stack overruns reported by Benjamin Herrenschmidt:
> > http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
>
> So I don't really dislike this patch-series, but isn't "irq_exit()"
> (which calls the new softirq_on_stack()) already running in the
> context of the irq stack?

Not on powerpc, and afaik not on i386 from my quick look at
handle_irq() in irq_32.c ... maybe x86_64 calls do_IRQ already on the
irq stack?

Also irq and softirq are (somewhat on purpose) different stacks.

> And it's run at the very end of the irq
> processing, so the irq stack should be empty too at that point.
> So switching to *another* empty stack sounds really sad. No? Taking
> more cache misses etc, instead of using the already empty - but
> cache-hot - stack that we already have.
>
> I'm assuming that the problem is that since we're already on the irq
> stack, if *another* irq comes in, now that *other* irq doesn't get yet
> another irq stack page. And I'm wondering whether we shouldn't just
> fix that (hopefully unlikely) case instead? So instead of having a
> softirq stack, we'd have just an extra irq stack for the case where
> the original irq stack is already in use.

Well, actually in the crash we observed we aren't already on the irq
stack. We could try to change powerpc to switch stacks before calling
do_IRQ but that would be fairly invasive for various reasons (a
significant change of our assembly entry code) unless we do it as a
kind of wrapper around do_IRQ (and thus keep the actual interrupt
frame on the main kernel stack).

I'll look into hacking something up along those lines, it might be the
best approach for a RHEL7 fix anyway.

Ben.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20  0:02 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Linus Torvalds
  2013-09-20  1:53   ` Benjamin Herrenschmidt
@ 2013-09-20 11:03   ` Thomas Gleixner
  2013-09-20 11:11     ` Peter Zijlstra
  2013-09-20 16:26     ` Frederic Weisbecker
  1 sibling, 2 replies; 52+ messages in thread
From: Thomas Gleixner @ 2013-09-20 11:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frederic Weisbecker, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Thu, 19 Sep 2013, Linus Torvalds wrote:

> On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > It fixes stack overruns reported by Benjamin Herrenschmidt:
> > http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
>
> So I don't really dislike this patch-series, but isn't "irq_exit()"
> (which calls the new softirq_on_stack()) already running in the
> context of the irq stack? And it's run at the very end of the irq
> processing, so the irq stack should be empty too at that point.

Right, but most of the implementations are braindamaged.

	irq_enter();
	handle_irq_on_hardirq_stack();
	irq_exit();

instead of doing:

	switch_stack()
	irq_enter()
	handle_irq()
	irq_exit()
	restore_stack()

So in the case of softirq processing (the likely case) we end up doing:

	switch_to_hardirq_stack()
	...
	restore_original_stack()
	switch_to_softirq_stack()
	...
	restore_original_stack()

Two avoidable stack switch operations for no gain.

> I'm assuming that the problem is that since we're already on the irq
> stack, if *another* irq comes in, now that *other* irq doesn't get yet
> another irq stack page. And I'm wondering whether we shouldn't just
> fix that (hopefully unlikely) case instead? So instead of having a
> softirq stack, we'd have just an extra irq stack for the case where
> the original irq stack is already in use.

Why not have a single irq_stack large enough to accommodate interrupt
handling during softirq processing? We have no interrupt nesting so
the maximum stack depth necessary is

	max(softirq_stack_usage) + max(irq_stack_usage)

Today we allocate THREAD_SIZE_ORDER for the hard and the soft context,
so allocating 2 * THREAD_SIZE_ORDER should be sufficient.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread
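[To make Thomas's sizing concrete, a minimal sketch of what allocating such a combined per-CPU irq stack could look like; the names (COMBINED_IRQ_STACK_ORDER, combined_irq_stack, the init hook) are illustrative assumptions, not code from this thread:]

	#include <linux/gfp.h>
	#include <linux/percpu.h>

	/*
	 * One irq stack per CPU, twice the usual order, so a softirq
	 * running on it still leaves max(irq_stack_usage) of headroom
	 * for one nesting hardirq. THREAD_SIZE_ORDER comes from the
	 * arch's thread_info definitions.
	 */
	#define COMBINED_IRQ_STACK_ORDER	(THREAD_SIZE_ORDER + 1)

	static DEFINE_PER_CPU(void *, combined_irq_stack);

	static int __init alloc_combined_irq_stacks(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			void *stack = (void *)__get_free_pages(GFP_KERNEL,
						COMBINED_IRQ_STACK_ORDER);
			if (!stack)
				return -ENOMEM;
			per_cpu(combined_irq_stack, cpu) = stack;
		}
		return 0;
	}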
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 11:03   ` Thomas Gleixner
@ 2013-09-20 11:11     ` Peter Zijlstra
  2013-09-21  0:55       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2013-09-20 11:11 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Frederic Weisbecker, LKML,
      Benjamin Herrenschmidt, Paul Mackerras, Ingo Molnar,
      H. Peter Anvin, James Hogan, James E.J. Bottomley, Helge Deller,
      Martin Schwidefsky, Heiko Carstens, David S. Miller,
      Andrew Morton

On Fri, Sep 20, 2013 at 01:03:17PM +0200, Thomas Gleixner wrote:
> On Thu, 19 Sep 2013, Linus Torvalds wrote:
>
> > On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > >
> > > It fixes stack overruns reported by Benjamin Herrenschmidt:
> > > http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
> >
> > So I don't really dislike this patch-series, but isn't "irq_exit()"
> > (which calls the new softirq_on_stack()) already running in the
> > context of the irq stack? And it's run at the very end of the irq
> > processing, so the irq stack should be empty too at that point.
>
> Right, but most of the implementations are braindamaged.
>
> 	irq_enter();
> 	handle_irq_on_hardirq_stack();
> 	irq_exit();

I was only just staring at i386 and found it did exactly that. It had
to jump through preempt_count hoops to make that work, and obviously I
hadn't test-built the preempt patches on i386.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 11:11     ` Peter Zijlstra
@ 2013-09-21  0:55       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-21  0:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Linus Torvalds, Frederic Weisbecker, LKML,
      Paul Mackerras, Ingo Molnar, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Fri, 2013-09-20 at 13:11 +0200, Peter Zijlstra wrote:
> On Fri, Sep 20, 2013 at 01:03:17PM +0200, Thomas Gleixner wrote:
> > On Thu, 19 Sep 2013, Linus Torvalds wrote:
> >
> > > On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > > >
> > > > It fixes stack overruns reported by Benjamin Herrenschmidt:
> > > > http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
> > >
> > > So I don't really dislike this patch-series, but isn't "irq_exit()"
> > > (which calls the new softirq_on_stack()) already running in the
> > > context of the irq stack? And it's run at the very end of the irq
> > > processing, so the irq stack should be empty too at that point.
> >
> > Right, but most of the implementations are braindamaged.
> >
> > 	irq_enter();
> > 	handle_irq_on_hardirq_stack();
> > 	irq_exit();
>
> I was only just staring at i386 and found it did exactly that. It had
> to jump through preempt_count hoops to make that work, and obviously I
> hadn't test-built the preempt patches on i386.

Right, and powerpc does the switch even later, when calling the
individual handlers.

Ben.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 11:03   ` Thomas Gleixner
  2013-09-20 11:11     ` Peter Zijlstra
@ 2013-09-20 16:26     ` Frederic Weisbecker
  2013-09-20 17:30       ` Thomas Gleixner
  ` (2 more replies)
  1 sibling, 3 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-20 16:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Fri, Sep 20, 2013 at 01:03:17PM +0200, Thomas Gleixner wrote:
> On Thu, 19 Sep 2013, Linus Torvalds wrote:
>
> > On Thu, Sep 19, 2013 at 2:51 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > >
> > > It fixes stack overruns reported by Benjamin Herrenschmidt:
> > > http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
> >
> > So I don't really dislike this patch-series, but isn't "irq_exit()"
> > (which calls the new softirq_on_stack()) already running in the
> > context of the irq stack? And it's run at the very end of the irq
> > processing, so the irq stack should be empty too at that point.
>
> Right, but most of the implementations are braindamaged.
>
> 	irq_enter();
> 	handle_irq_on_hardirq_stack();
> 	irq_exit();
>
> instead of doing:
>
> 	switch_stack()
> 	irq_enter()
> 	handle_irq()
> 	irq_exit()
> 	restore_stack()
>
> So in the case of softirq processing (the likely case) we end up doing:
>
> 	switch_to_hardirq_stack()
> 	...
> 	restore_original_stack()
> 	switch_to_softirq_stack()
> 	...
> 	restore_original_stack()
>
> Two avoidable stack switch operations for no gain.
>
> > I'm assuming that the problem is that since we're already on the irq
> > stack, if *another* irq comes in, now that *other* irq doesn't get yet
> > another irq stack page. And I'm wondering whether we shouldn't just
> > fix that (hopefully unlikely) case instead? So instead of having a
> > softirq stack, we'd have just an extra irq stack for the case where
> > the original irq stack is already in use.
>
> Why not have a single irq_stack large enough to accommodate interrupt
> handling during softirq processing? We have no interrupt nesting so
> the maximum stack depth necessary is
>
> 	max(softirq_stack_usage) + max(irq_stack_usage)
>
> Today we allocate THREAD_SIZE_ORDER for the hard and the soft context,
> so allocating 2 * THREAD_SIZE_ORDER should be sufficient.

Looks good to me.

Now just for clarity, what do we then do with inline softirq
executions: on local_bh_enable() for example, or explicit calls to
do_softirq() other than irq exit? Should we keep the current switch
to a different softirq stack?

If we have a generic irq stack (used for both hard and soft) that is
big enough, perhaps we can also switch to this generic irq stack for
inline softirq executions? After all there is not much point in
keeping a separate stack for that: it results in cache misses if the
inline softirq is interrupted by a hardirq, and inlined softirqs can't
happen in hardirq, so there should be not much risk of overruns.

Or am I missing other things?

>
> Thanks,
>
> 	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 16:26     ` Frederic Weisbecker
@ 2013-09-20 17:30       ` Thomas Gleixner
  2013-09-20 18:37         ` Frederic Weisbecker
  0 siblings, 1 reply; 52+ messages in thread
From: Thomas Gleixner @ 2013-09-20 17:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Linus Torvalds, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Fri, 20 Sep 2013, Frederic Weisbecker wrote:

> Now just for clarity, what do we then do with inline softirq
> executions: on local_bh_enable() for example, or explicit calls to
> do_softirq() other than irq exit? Should we keep the current switch
> to a different softirq stack? If we have a generic irq stack (used
> for both hard and soft) that is big enough, perhaps we can also
> switch to this generic irq stack for inline softirq executions?
> After all there is not much point in keeping a separate stack for
> that: it results in cache misses if the inline softirq is
> interrupted by a hardirq, and inlined softirqs can't happen in
> hardirq, so there should be not much risk of overruns.

We can use the same irqstack for this because from the irqstack point
of view, that's the same as if softirqs get executed from irq_exit().

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 17:30       ` Thomas Gleixner
@ 2013-09-20 18:37         ` Frederic Weisbecker
  0 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-20 18:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Fri, Sep 20, 2013 at 07:30:14PM +0200, Thomas Gleixner wrote:
> On Fri, 20 Sep 2013, Frederic Weisbecker wrote:
>
> > Now just for clarity, what do we then do with inline softirq
> > executions: on local_bh_enable() for example, or explicit calls to
> > do_softirq() other than irq exit? Should we keep the current switch
> > to a different softirq stack? If we have a generic irq stack (used
> > for both hard and soft) that is big enough, perhaps we can also
> > switch to this generic irq stack for inline softirq executions?
> > After all there is not much point in keeping a separate stack for
> > that: it results in cache misses if the inline softirq is
> > interrupted by a hardirq, and inlined softirqs can't happen in
> > hardirq, so there should be not much risk of overruns.
>
> We can use the same irqstack for this because from the irqstack point
> of view, that's the same as if softirqs get executed from
> irq_exit().

Ok, so I see that's what x86-64 is doing. But x86-32 seems to be using
different stacks for hard and soft irqs for no strong reason (except
maybe to avoid overruns if a hardirq nests on the softirq). And x86-32
only switches to the hardirq stack for the handler. So probably we can
use the same stack for the whole thing and extend it to before
irq_enter() and after irq_exit(), like you suggested.

BTW, we still want the 1st patch of my series I think, as it simply
consolidates existing code. Using the same stack for hard and soft
irqs is independent from that. If you're ok with it, would you agree
to apply it?

Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 16:26     ` Frederic Weisbecker
  2013-09-20 17:30       ` Thomas Gleixner
@ 2013-09-20 22:14       ` Linus Torvalds
  2013-09-21  7:47         ` Ingo Molnar
  2013-09-21 18:58         ` Frederic Weisbecker
  1 sibling, 2 replies; 52+ messages in thread
From: Linus Torvalds @ 2013-09-20 22:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Fri, Sep 20, 2013 at 9:26 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
> Now just for clarity, what do we then do with inline softirq executions: on local_bh_enable()
> for example, or explicit calls to do_softirq() other than irq exit?

If we do a softirq because it was pending and we did a
"local_bh_enable()" in normal code, we need a new stack. The
"local_bh_enable()" may be pretty deep in the callchain on a normal
process stack, so I think it would be safest to switch to a separate
stack for softirq handling.

So you have a few different cases:

 - irq_exit(). The irq stack is by definition empty (assuming
irq_exit() is done on the irq stack), so doing softirq in that context
should be fine. However, that assumes that if we get *another*
interrupt, then we'll switch stacks again, so this does mean that we
need two irq stacks. No, irq's don't nest, but if we run softirq on
the first irq stack, the other irq *can* nest that softirq.

 - process context doing local_bh_enable, and a bh became pending
while it was disabled. See above: this needs a stack switch. Which
stack to use is open, again assuming that a hardirq coming in will
switch to yet another stack.

Hmm?

              Linus

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 22:14       ` Linus Torvalds
@ 2013-09-21  7:47         ` Ingo Molnar
  0 siblings, 0 replies; 52+ messages in thread
From: Ingo Molnar @ 2013-09-21  7:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Frederic Weisbecker, Thomas Gleixner, LKML,
      Benjamin Herrenschmidt, Paul Mackerras, Peter Zijlstra,
      H. Peter Anvin, James Hogan, James E.J. Bottomley, Helge Deller,
      Martin Schwidefsky, Heiko Carstens, David S. Miller,
      Andrew Morton

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Sep 20, 2013 at 9:26 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > Now just for clarity, what do we then do with inline softirq
> > executions: on local_bh_enable() for example, or explicit calls to
> > do_softirq() other than irq exit?
>
> If we do a softirq because it was pending and we did a
> "local_bh_enable()" in normal code, we need a new stack. The
> "local_bh_enable()" may be pretty deep in the callchain on a normal
> process stack, so I think it would be safest to switch to a separate
> stack for softirq handling.
>
> So you have a few different cases:
>
>  - irq_exit(). The irq stack is by definition empty (assuming irq_exit()
>    is done on the irq stack), so doing softirq in that context should be
>    fine. However, that assumes that if we get *another* interrupt, then
>    we'll switch stacks again, so this does mean that we need two irq
>    stacks. No, irq's don't nest, but if we run softirq on the first irq
>    stack, the other irq *can* nest that softirq.
>
>  - process context doing local_bh_enable, and a bh became pending while
>    it was disabled. See above: this needs a stack switch. Which stack to
>    use is open, again assuming that a hardirq coming in will switch to yet
>    another stack.
>
> Hmm?

I'd definitely argue in favor of never letting unknown-size stacks nest
(i.e. to always switch if we start a new context on top of a non-trivial
stack).

Known (small) size stack nesting is not real stack nesting, it's just a
somewhat unusual (and faster) way of stack switching.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-20 22:14       ` Linus Torvalds
  2013-09-21  7:47         ` Ingo Molnar
@ 2013-09-21 18:58         ` Frederic Weisbecker
  2013-09-21 21:45           ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-21 18:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, LKML, Benjamin Herrenschmidt, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

2013/9/20 Linus Torvalds <torvalds@linux-foundation.org>:
> On Fri, Sep 20, 2013 at 9:26 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> >
> > Now just for clarity, what do we then do with inline softirq executions: on local_bh_enable()
> > for example, or explicit calls to do_softirq() other than irq exit?
>
> If we do a softirq because it was pending and we did a
> "local_bh_enable()" in normal code, we need a new stack. The
> "local_bh_enable()" may be pretty deep in the callchain on a normal
> process stack, so I think it would be safest to switch to a separate
> stack for softirq handling.

Right.

>
> So you have a few different cases:
>
>  - irq_exit(). The irq stack is by definition empty (assuming
> irq_exit() is done on the irq stack), so doing softirq in that context
> should be fine. However, that assumes that if we get *another*
> interrupt, then we'll switch stacks again, so this does mean that we
> need two irq stacks. No, irq's don't nest, but if we run softirq on
> the first irq stack, the other irq *can* nest that softirq.

Well, most archs don't define __ARCH_IRQ_EXIT_IRQS_DISABLED. It
doesn't even mean that the majority of them actually run irq_exit()
with irqs enabled in practice. But theoretically there may be some
where hardirqs can nest even without the help of softirqs. So it's
quite possible to run softirqs on a hardirq stack that is not empty.

Now certainly what needs to be fixed then is archs that don't have
__ARCH_IRQ_EXIT_IRQS_DISABLED, or archs that have any other
significant opportunity to nest interrupts.

>
>  - process context doing local_bh_enable, and a bh became pending
> while it was disabled. See above: this needs a stack switch. Which
> stack to use is open, again assuming that a hardirq coming in will
> switch to yet another stack.

Right. Now if we do like Thomas suggested, we can have a common irq
stack that is big enough for hard and softirqs. After all there should
never be more than two or three nesting irq contexts:
hardirq->softirq->hardirq, softirq->hardirq, ...

At least if we put aside the insane archs that can nest irqs somehow.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-21 18:58         ` Frederic Weisbecker
@ 2013-09-21 21:45           ` Benjamin Herrenschmidt
  2013-09-21 23:27             ` Frederic Weisbecker
  ` (2 more replies)
  0 siblings, 3 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-21 21:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Sat, 2013-09-21 at 13:58 -0500, Frederic Weisbecker wrote:

> Now certainly what needs to be fixed then is archs that don't have
> __ARCH_IRQ_EXIT_IRQS_DISABLED, or archs that have any other
> significant opportunity to nest interrupts.

Interesting. I notice we don't define it on powerpc, but we don't
enable IRQs in do_IRQ either... our path is very similar to x86 in
this regard, the only thing that can cause them to become enabled
would be if a driver interrupt handler did local_irq_enable().

It used to be fairly common for drivers to do spin_unlock_irq() which
would unconditionally re-enable. Did we add WARNs or lockdep logic to
catch these nowadays?

> >  - process context doing local_bh_enable, and a bh became pending
> > while it was disabled. See above: this needs a stack switch. Which
> > stack to use is open, again assuming that a hardirq coming in will
> > switch to yet another stack.
>
> Right. Now if we do like Thomas suggested, we can have a common irq
> stack that is big enough for hard and softirqs. After all there should
> never be more than two or three nesting irq contexts:
> hardirq->softirq->hardirq, softirq->hardirq, ...
>
> At least if we put aside the insane archs that can nest irqs somehow.

I really don't like the "larger" irq stack ... probably because I
can't make it work easily :-) See my previous comment about how we
get to thread_info on ppc.

What I *can* do that would help, I suppose, would be to switch to the
irq stack before irq_enter/exit, which would at least mean that
softirqs would run from the top of the irq stack, which is better than
the current situation. In fact I'll whip up a quick fix, see if that
might be enough of a band aid for RHEL7.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-21 21:45           ` Benjamin Herrenschmidt
@ 2013-09-21 23:27             ` Frederic Weisbecker
  0 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-21 23:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
      Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Sun, Sep 22, 2013 at 07:45:01AM +1000, Benjamin Herrenschmidt wrote:
> On Sat, 2013-09-21 at 13:58 -0500, Frederic Weisbecker wrote:
>
> > Now certainly what needs to be fixed then is archs that don't have
> > __ARCH_IRQ_EXIT_IRQS_DISABLED, or archs that have any other
> > significant opportunity to nest interrupts.
>
> Interesting. I notice we don't define it on powerpc

Yeah, x86 doesn't define it either. In fact few archs do.

> but we don't enable
> IRQs in do_IRQ either... our path is very similar to x86 in this regard,
> the only thing that can cause them to become enabled would be if a
> driver interrupt handler did local_irq_enable().
>
> It used to be fairly common for drivers to do spin_unlock_irq() which
> would unconditionally re-enable. Did we add WARNs or lockdep logic to
> catch these nowadays?

Right, there is a check in handle_irq_event_percpu() that warns if the
handler exits with irqs enabled. And irq_exit() also warns when
(__ARCH_IRQ_EXIT_IRQS_DISABLED && !irqs_disabled()).

> > >  - process context doing local_bh_enable, and a bh became pending
> > > while it was disabled. See above: this needs a stack switch. Which
> > > stack to use is open, again assuming that a hardirq coming in will
> > > switch to yet another stack.
> >
> > Right. Now if we do like Thomas suggested, we can have a common irq
> > stack that is big enough for hard and softirqs. After all there should
> > never be more than two or three nesting irq contexts:
> > hardirq->softirq->hardirq, softirq->hardirq, ...
> >
> > At least if we put aside the insane archs that can nest irqs somehow.
>
> I really don't like the "larger" irq stack ... probably because I can't
> make it work easily :-) See my previous comment about how we get to
> thread_info on ppc.
>
> What I *can* do that would help, I suppose, would be to switch to the
> irq stack before irq_enter/exit, which would at least mean that softirqs
> would run from the top of the irq stack, which is better than the
> current situation.

Yeah, I think that doing this should solve the biggest part of the
problem on ppc. You'll at least ensure that you have split-up stacks
for tasks and softirq/irq stacks.

>
> I'm fact I'll whip up a quick fix see if that might be enough of a band
> aid for RHEL7.
>
> Cheers,
> Ben.
>

^ permalink raw reply	[flat|nested] 52+ messages in thread
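[For reference, the check Frederic mentions looks roughly like this in that era's kernel/irq/handle.c — a from-memory sketch, so treat the exact message and recovery step as approximate:]

	/* in handle_irq_event_percpu(), after running action->handler() */
	if (WARN_ONCE(!irqs_disabled(),
		      "irq %u handler %pF enabled interrupts\n",
		      irq, action->handler))
		local_irq_disable();	/* repair the damage and warn once */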
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-21 21:45           ` Benjamin Herrenschmidt
  2013-09-21 23:27             ` Frederic Weisbecker
@ 2013-09-22  2:01             ` H. Peter Anvin
  2013-09-22  4:39               ` Benjamin Herrenschmidt
  2013-09-23  4:40             ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 52+ messages in thread
From: H. Peter Anvin @ 2013-09-22  2:01 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Frederic Weisbecker, Linus Torvalds, Thomas Gleixner, LKML,
      Paul Mackerras, Ingo Molnar, Peter Zijlstra, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On 09/21/2013 02:45 PM, Benjamin Herrenschmidt wrote:
>
> I really don't like the "larger" irq stack ... probably because I can't
> make it work easily :-) See my previous comment about how we get to
> thread_info on ppc.
>

For the record, I intend to remove thread_info from the stack on x86
and instead merge it with task_struct as a single structure pointed to
with a percpu variable.

	-hpa

^ permalink raw reply	[flat|nested] 52+ messages in thread
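[A rough sketch of the layout hpa describes; every name here (the embedded field, the per-CPU variable) is an assumption about a future design, not existing code:]

	/* thread_info folded into task_struct, off the kernel stack */
	struct task_struct {
		struct thread_info	thread_info;	/* keep first */
		/* ... the rest of task_struct ... */
	};

	DECLARE_PER_CPU(struct task_struct *, current_task);

	#define current_thread_info() \
		(&this_cpu_read(current_task)->thread_info)

One consequence is that current_thread_info() would no longer depend on masking the stack pointer, which is what makes odd-sized or shared irq stacks easier to contemplate.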
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-22  2:01             ` H. Peter Anvin
@ 2013-09-22  4:39               ` Benjamin Herrenschmidt
  2013-09-22  4:41                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-22  4:39 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Frederic Weisbecker, Linus Torvalds, Thomas Gleixner, LKML,
      Paul Mackerras, Ingo Molnar, Peter Zijlstra, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Sat, 2013-09-21 at 19:01 -0700, H. Peter Anvin wrote:
> On 09/21/2013 02:45 PM, Benjamin Herrenschmidt wrote:
> >
> > I really don't like the "larger" irq stack ... probably because I can't
> > make it work easily :-) See my previous comment about how we get to
> > thread_info on ppc.
> >
>
> For the record, I intend to remove thread_info from the stack on x86
> and instead merge it with task_struct as a single structure pointed to
> with a percpu variable.

Last I looked, our per-cpu codegen was pretty poor... but then we have
this "PACA" (somewhat arch-specific per-cpu blob that is separate from
the rest of per-cpu because of a mix of historical reasons and the
fact that it has to be allocated in a specific part of memory at boot
time) which we point to directly via a GPR, so we could point to it
via the PACA.

How do you do your per-cpu on x86? On powerpc we struggle because we
try to dedicate a register (r13) to this PACA (the per-cpu offset
hangs off it), but we constantly run into issues where gcc copies r13
to another register and then indexes off that, even across
preempt_enable/disable sections, or worse, such as saving/restoring it
from the stack. We can't seem to get the compiler to treat it
appropriately as volatile.

Ben.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-22  4:39               ` Benjamin Herrenschmidt
@ 2013-09-22  4:41                 ` Benjamin Herrenschmidt
  2013-09-22 16:24                   ` Peter Zijlstra
  0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-22  4:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Frederic Weisbecker, Linus Torvalds, Thomas Gleixner, LKML,
      Paul Mackerras, Ingo Molnar, Peter Zijlstra, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Sun, 2013-09-22 at 14:39 +1000, Benjamin Herrenschmidt wrote:
> How do you do your per-cpu on x86? On powerpc we struggle because we
> try to dedicate a register (r13) to this PACA (the per-cpu offset
> hangs off it), but we constantly run into issues where gcc copies r13
> to another register and then indexes off that, even across
> preempt_enable/disable sections, or worse, such as saving/restoring it
> from the stack. We can't seem to get the compiler to treat it
> appropriately as volatile.

Also, do you have a half-decent way of getting to per-cpu from asm?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-22  4:41                 ` Benjamin Herrenschmidt
@ 2013-09-22 16:24                   ` Peter Zijlstra
  2013-09-22 17:47                     ` H. Peter Anvin
  2013-09-22 21:56                     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 52+ messages in thread
From: Peter Zijlstra @ 2013-09-22 16:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: H. Peter Anvin, Frederic Weisbecker, Linus Torvalds,
      Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On Sun, Sep 22, 2013 at 02:41:01PM +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2013-09-22 at 14:39 +1000, Benjamin Herrenschmidt wrote:
> > How do you do your per-cpu on x86 ?

We use a segment offset. Something like:

	inc %gs:var;

would be a per-cpu increment. The actual memory location used for the
memop is the variable address + GS offset. And our GS offset is per
cpu and points to the base of the per cpu segment for that cpu.

> Also, do you have a half-decent way of getting to per-cpu from asm ?

Yes, see above :-)

Assuming we repurpose r13 as per-cpu base, you could do the whole
this_cpu_* stuff which is locally atomic -- ie. safe against IRQs and
preemption as:

	loop:
		lwarx	rt, var, r13
		inc	rt
		stwcx	rt, var, r13
		bne-	loop

Except, I think your ll/sc pair is actually slower than doing:

	local_irq_save(flags)
	var++;
	local_irq_restore(flags)

Esp. with the lazy irq disable you have.

And I'm fairly sure using them as generic per cpu accessors isn't sane,
but I'm not sure PPC64 has other memops with implicit addition like
that.

As to the problem of GCC moving r13 about, some archs have some
exceptions in the register allocator and leave some registers alone.
IIRC MIPS has this and uses one of those (istr there's 2) for the
per cpu base address.

^ permalink raw reply	[flat|nested] 52+ messages in thread
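[Peter's irq-disable alternative, rendered as a self-contained C sketch; the counter and function names are invented for illustration:]

	#include <linux/irqflags.h>
	#include <linux/percpu.h>

	static DEFINE_PER_CPU(unsigned long, demo_counter);

	/*
	 * Locally-atomic per-CPU increment without a lwarx/stwcx. pair:
	 * with IRQs off, nothing on this CPU can interleave with the
	 * plain load/increment/store below.
	 */
	static inline void demo_counter_inc(void)
	{
		unsigned long flags;

		local_irq_save(flags);
		(*this_cpu_ptr(&demo_counter))++;
		local_irq_restore(flags);
	}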
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
  2013-09-22 16:24                   ` Peter Zijlstra
@ 2013-09-22 17:47                     ` H. Peter Anvin
  2013-09-22 22:00                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 52+ messages in thread
From: H. Peter Anvin @ 2013-09-22 17:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Benjamin Herrenschmidt, Frederic Weisbecker, Linus Torvalds,
      Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
      James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
      Heiko Carstens, David S. Miller, Andrew Morton

On 09/22/2013 09:24 AM, Peter Zijlstra wrote:
>
> As to the problem of GCC moving r13 about, some archs have some
> exceptions in the register allocator and leave some registers alone.
> IIRC MIPS has this and uses one of those (istr there's 2) for the
> per cpu base address.
>

You can force gcc to leave a register alone with the command line
option -ffixed-r13.

	-hpa

^ permalink raw reply	[flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-22 17:47 ` H. Peter Anvin
@ 2013-09-22 22:00 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-22 22:00 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Peter Zijlstra, Frederic Weisbecker, Linus Torvalds,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Sun, 2013-09-22 at 10:47 -0700, H. Peter Anvin wrote:
> On 09/22/2013 09:24 AM, Peter Zijlstra wrote:
> >
> > As to the problem of GCC moving r13 about, some archs have some
> > exceptions in the register allocator and leave some registers alone.
> > IIRC MIPS has this and uses one of those (istr there's 2) for the
> > per cpu base address.
> >
> You can force gcc to leave a register alone with the command line option
> -ffixed-r13.
Haven't tried that in a while, but iirc, from the discussions I had with
the gcc folks, it didn't work the way we wanted. Basically, it still
assumes that the register is not going to change at random points. I
can't have something like:

register volatile unsigned long per_cpu_offset asm("r13");

It will barf on the "volatile", and if I don't have it, it will make
assumptions that r13 doesn't change, and thus might copy its value into
another register across preempt_enable/disable.
I think I need another session with the gcc folks on that one.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
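Putting the two halves of this exchange side by side: assuming a kbuild fragment along the lines of what hpa names (the exact Makefile line is an assumption), the situation Ben describes is roughly the following sketch:

/*
 * Assumed build flag, e.g. in arch/powerpc/Makefile:
 *     KBUILD_CFLAGS += -ffixed-r13
 * so gcc never allocates r13 for its own purposes. Then:
 */
register unsigned long per_cpu_offset asm("r13");	/* accepted by gcc */

/*
 * register volatile unsigned long per_cpu_offset asm("r13");
 *
 * ^ rejected: gcc has no notion of a volatile global register
 * variable, which is exactly the gap Ben describes. Without the
 * volatile, gcc may still cache r13's value in another register
 * across preempt_enable()/preempt_disable().
 */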
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-22 16:24 ` Peter Zijlstra
2013-09-22 17:47 ` H. Peter Anvin
@ 2013-09-22 21:56 ` Benjamin Herrenschmidt
2013-09-22 22:22 ` Linus Torvalds
1 sibling, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-22 21:56 UTC (permalink / raw)
To: Peter Zijlstra
Cc: H. Peter Anvin, Frederic Weisbecker, Linus Torvalds,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Sun, 2013-09-22 at 18:24 +0200, Peter Zijlstra wrote:
> On Sun, Sep 22, 2013 at 02:41:01PM +1000, Benjamin Herrenschmidt wrote:
> > On Sun, 2013-09-22 at 14:39 +1000, Benjamin Herrenschmidt wrote:
> > > How do you do your per-cpu on x86 ?
>
> We use a segment offset. Something like:
>
> inc %gs:var;
>
> would be a per-cpu increment. The actual memory location used for the
> memop is the variable address + GS offset.
>
> And our GS offset is per cpu and points to the base of the per cpu
> segment for that cpu.
>
> > Also, do you have a half-decent way of getting to per-cpu from asm ?
>
> Yes, see above :-)
And gcc makes no stupid assumptions that this gs doesn't change? That's
the main problem we have with using r13 for the PACA.
> Assuming we repurpose r13 as per-cpu base, you could do the whole
> this_cpu_* stuff which is locally atomic -- ie. safe against IRQs and
> preemption as:
>
> loop:
> lwarx rt, var, r13
> inc rt
> stwcx rt, var, r13
> bne- loop
>
> Except, I think your ll/sc pair is actually slower than doing:
>
> local_irq_save(flags)
> var++;
> local_irq_restore(flags)
>
> Esp. with the lazy irq disable you have.
Right, for local atomics we really don't want to use reservations: they
have to go to the L2 (and flush the corresponding L1 line along the way).
> And I'm fairly sure using them as generic per cpu accessors isn't sane,
> but I'm not sure PPC64 has other memops with implicit addition like
> that.
We have no memop with implicit addition today, so for generic counters
we do need reservations. For other per-cpu data such as the thread info
etc., the problem is more how quickly we can get at the per-cpu area.
> As to the problem of GCC moving r13 about, some archs have some
> exceptions in the register allocator and leave some registers alone.
> IIRC MIPS has this and uses one of those (istr there's 2) for the
> per cpu base address.
Right, not sure if there's anything we can do short of getting gcc
modified and relying on that new change (which would be problematic).
I've been trying to get straight answers from the gcc folks about what
we can or cannot rely upon and never got any.
r13 is actually the TLS, but we can't use it as such in the kernel, as
gcc *will* assume it doesn't change (though we could use it for current
and have our per-cpu offset hanging off that; one level of indirection
is better than nothing, I suppose).
The problem boils down to gcc not having a concept that a global
register variable can be volatile.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
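For reference, the reservation-free local increment Peter sketched earlier would, written out in plain C, be something like the following; this is a sketch of the technique, not the actual powerpc implementation:

#include <linux/irqflags.h>

/*
 * CPU-local increment without lwarx/stwcx.: briefly mask interrupts
 * (cheap on powerpc thanks to lazy/soft interrupt disabling) and do a
 * plain read-modify-write. Safe against IRQs and preemption on this
 * CPU; not safe against other CPUs, which local counters don't need.
 */
static inline void local_inc_sketch(unsigned long *v)
{
	unsigned long flags;

	local_irq_save(flags);
	(*v)++;
	local_irq_restore(flags);
}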
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix 2013-09-22 21:56 ` Benjamin Herrenschmidt @ 2013-09-22 22:22 ` Linus Torvalds 2013-09-22 22:38 ` Benjamin Herrenschmidt ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Linus Torvalds @ 2013-09-22 22:22 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton On Sun, Sep 22, 2013 at 2:56 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > On Sun, 2013-09-22 at 18:24 +0200, Peter Zijlstra wrote: >> >> We use a segment offset. Something like: >> >> inc %gs:var; >> > > And gcc makes no stupid assumptions that this gs doesn't change ? That's > the main problem we have with using r13 for PACA. Since gcc doesn't really know about segment registers at all (modulo %fs as TLS on x86), we do everything like that using inline asm. It's not *too* painful if you have a number of macro helpers to build up all the different versions. And r13 isn't volatile if you are preempt-safe, so I'm wondering if you could just make the preempt disable code mark %r13 as modified ("+r"). Then gcc won't ever cache r13 across one of those. And if you don't have preemption disabled, then you cannot do multiple ops using %r13 anyway, since on a load-store architecture it might change even between the load and store, so a per-cpu "add" operation *has* to cache the %r13 value in *another* register anyway, because using memory ops with just an offset off %r13 would be buggy. So I don't think this is a gcc issue. gcc can't fix those kinds of problems. Personally, I'd suggest something like: - the paca stuff is just insane. Try to get rid of it. - use %r13 for the per-thread thread-info pointer instead. A per-thread pointer is *not* volatile like the per-cpu base is. - Now you can make the per-cpu offset be loaded off the per-thread pointer (update it at context switch). gcc knows to not cache it across function calls, since it's a memory access. Use ACCESS_ONCE() or something to make sure it's only loaded once for the cpu offset ops. Alternatively, make %r13 point to the percpu side, but make sure that you always use an asm accessor to fetch the value. In particular, I think you need to make __my_cpu_offset be an inline asm that fetches %r13 into some other register. Otherwise you can never get it right. Linus ^ permalink raw reply [flat|nested] 52+ messages in thread
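Linus's last suggestion, spelled out as a hedged sketch (the function name is illustrative; the point is that C code only ever sees a copy of r13 in an ordinary register, so gcc has nothing to cache across preemption points):

/*
 * Sketch of an asm accessor for the per-cpu base, assuming r13 carries
 * it. The volatile keeps gcc from CSEing the read across code where
 * r13 may have changed; every use pays one register copy, but gcc
 * never manipulates r13 itself.
 */
static inline unsigned long __my_cpu_offset_sketch(void)
{
	unsigned long off;

	asm volatile("mr %0,13" : "=r" (off));	/* copy r13 out */
	return off;
}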
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-22 22:22 ` Linus Torvalds
@ 2013-09-22 22:38 ` Benjamin Herrenschmidt
2013-09-23 4:35 ` [PATCH] powerpc/irq: Run softirqs off the top of the irq stack Benjamin Herrenschmidt
2013-09-24 5:42 ` [PATCH v2] " Benjamin Herrenschmidt
2013-09-23 17:59 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Chris Metcalf
2013-09-24 0:10 ` Benjamin Herrenschmidt
2 siblings, 2 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-22 22:38 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Sun, 2013-09-22 at 15:22 -0700, Linus Torvalds wrote:
> On Sun, Sep 22, 2013 at 2:56 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Sun, 2013-09-22 at 18:24 +0200, Peter Zijlstra wrote:
> >>
> >> We use a segment offset. Something like:
> >>
> >> inc %gs:var;
> >>
> >
> > And gcc makes no stupid assumptions that this gs doesn't change ? That's
> > the main problem we have with using r13 for PACA.
>
> Since gcc doesn't really know about segment registers at all (modulo
> %fs as TLS on x86), we do everything like that using inline asm.
>
> It's not *too* painful if you have a number of macro helpers to build
> up all the different versions.
>
> And r13 isn't volatile if you are preempt-safe, so I'm wondering if
> you could just make the preempt disable code mark %r13 as modified
> ("+r"). Then gcc won't ever cache r13 across one of those. And if you
> don't have preemption disabled, then you cannot do multiple ops using
> %r13 anyway, since on a load-store architecture it might change even
> between the load and store, so a per-cpu "add" operation *has* to
> cache the %r13 value in *another* register anyway, because using
> memory ops with just an offset off %r13 would be buggy.
The only other one is our soft-disable of irqs, which is a byte off
r13, so technically a simple:

li r0,0
stb r0,PACASOFTIRQEN(r13)

is a perfectly valid implementation of local_irq_disable() ;-) But
those few cases I think we can handle individually. Right, using an
r13 clobber in a dummy asm in things like preempt_disable and
local_irq_disable is so far the most interesting option left on our
table.
> So I don't think this is a gcc issue. gcc can't fix those kinds of problems.
Well, it would help if we had a way to tell gcc that a global register
variable is volatile. But it doesn't understand that concept.
> Personally, I'd suggest something like:
>
> - the paca stuff is just insane. Try to get rid of it.
I wish I could... it digs deep. The problem is fundamentally that with
the MMU off (aka "real" mode), under a hypervisor, we only have access
to a portion of the partition memory (it's a fake real mode if you
want; on older systems it's implemented as a physically contiguous
chunk of memory reserved by the hypervisor for the partition, and as
such cannot be arbitrarily large; newer CPUs support something smarter,
but still with some limitations).

Combine that with the fact that all interrupts are delivered in real
mode, and you get the picture. KVM "userspace" (the variant that
emulates the guest as a user process) makes it worse as it needs to
maintain much more state in a place that is "real mode accessible".
We *might* be able to repurpose r13 as our per-cpu offset and move a
chunk of stuff from the paca to per-cpu variables, but we will still
need "something" akin to the PACA for the guts of the exception
entry/exit and KVM.
> - use %r13 for the per-thread thread-info pointer instead. A
> per-thread pointer is *not* volatile like the per-cpu base is.
Yes, use it as a TLS, that would be an option I've been considering. It
could also be the task struct, but thread_info might be slightly
hotter. That's what I had in mind when I mentioned using it as a TLS
for "current".
> - Now you can make the per-cpu offset be loaded off the per-thread
> pointer (update it at context switch). gcc knows to not cache it
> across function calls, since it's a memory access. Use ACCESS_ONCE()
> or something to make sure it's only loaded once for the cpu offset
> ops.
I have seen gcc completely disregard ACCESS_ONCE in the past though, at
least in the context of PACA relative accesses...
> Alternatively, make %r13 point to the percpu side, but make sure that
> you always use an asm accessor to fetch the value. In particular, I
> think you need to make __my_cpu_offset be an inline asm that fetches
> %r13 into some other register. Otherwise you can never get it right.
Yes, that's my plan B, at least for the PACA, but it would work for
per-cpu as well: use a get_paca/put_paca style construct which fetches
it into another register (and does the preempt enable/disable while at
it), and for the handful of cases where we do want direct and faster
access, such as manipulating the soft-irq flag, use an inline asm load
or store, which is what we do today.
Also with LE, we still have a bit of potential for tweaking the ABI and
the compiler, so that's something I want to put on the table.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
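Ben's "plan B" accessor pair might look like the following sketch; the struct and function names are assumptions, and the pattern is simply: pin preemption, then materialize the r13 value in an ordinary register for as long as the pointer is held.

#include <linux/preempt.h>

struct paca_struct;	/* opaque here; the real layout is arch-private */

static inline struct paca_struct *get_paca_sketch(void)
{
	struct paca_struct *paca;

	preempt_disable();			/* stay on this CPU */
	asm volatile("mr %0,13" : "=r" (paca));	/* copy r13 into a GPR */
	return paca;
}

static inline void put_paca_sketch(void)
{
	preempt_enable();			/* pointer no longer valid */
}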
* [PATCH] powerpc/irq: Run softirqs off the top of the irq stack
2013-09-22 22:38 ` Benjamin Herrenschmidt
@ 2013-09-23 4:35 ` Benjamin Herrenschmidt
2013-09-23 7:56 ` Stephen Rothwell
2013-09-23 16:47 ` Linus Torvalds
1 sibling, 2 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-23 4:35 UTC (permalink / raw)
To: linuxppc-dev
Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Linus Torvalds
Nowadays, irq_exit() calls __do_softirq() pretty much directly
instead of calling do_softirq() which switches to the dedicated
softirq stack.
This has led to observed stack overflows on powerpc since we call
irq_enter() and irq_exit() outside of the scope that switches to
the irq stack.
This fixes it by moving the stack switching up a level, making
irq_enter() and irq_exit() run off the irq stack.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
This is the "band aid" discussed so far for the stack overflow
problem for powerpc. I assume sparc and i386 at least need something
similar (I had a quick look at ARM and it doesn't seem to have irq
stacks at all).
Unless there are objections in the next day or so, I intend to send
that to Linus and to -stable ASAP.
arch/powerpc/include/asm/irq.h | 4 +-
arch/powerpc/kernel/irq.c | 99 ++++++++++++++++++++++-------------------
arch/powerpc/kernel/misc_32.S | 9 ++--
arch/powerpc/kernel/misc_64.S | 10 ++---
4 files changed, 62 insertions(+), 60 deletions(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 0e40843..41f13ce 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -69,9 +69,9 @@ extern struct thread_info *softirq_ctx[NR_CPUS];
 extern void irq_ctx_init(void);
 extern void call_do_softirq(struct thread_info *tp);
-extern int call_handle_irq(int irq, void *p1,
-			   struct thread_info *tp, void *func);
+extern void call_do_irq(struct pt_regs *regs, struct thread_info *tp);
 extern void do_IRQ(struct pt_regs *regs);
+extern void __do_irq(struct pt_regs *regs);
 
 int irq_choose_cpu(const struct cpumask *mask);
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index c69440c..0c9646f 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -443,46 +443,7 @@ void migrate_irqs(void)
 
 static inline void handle_one_irq(unsigned int irq)
 {
-	struct thread_info *curtp, *irqtp;
-	unsigned long saved_sp_limit;
-	struct irq_desc *desc;
-
-	desc = irq_to_desc(irq);
-	if (!desc)
-		return;
-
-	/* Switch to the irq stack to handle this */
-	curtp = current_thread_info();
-	irqtp = hardirq_ctx[smp_processor_id()];
-
-	if (curtp == irqtp) {
-		/* We're already on the irq stack, just handle it */
-		desc->handle_irq(irq, desc);
-		return;
-	}
-
-	saved_sp_limit = current->thread.ksp_limit;
-
-	irqtp->task = curtp->task;
-	irqtp->flags = 0;
-
-	/* Copy the softirq bits in preempt_count so that the
-	 * softirq checks work in the hardirq context.
*/ - irqtp->preempt_count = (irqtp->preempt_count & ~SOFTIRQ_MASK) | - (curtp->preempt_count & SOFTIRQ_MASK); - current->thread.ksp_limit = (unsigned long)irqtp + - _ALIGN_UP(sizeof(struct thread_info), 16); - - call_handle_irq(irq, desc, irqtp, desc->handle_irq); - current->thread.ksp_limit = saved_sp_limit; - irqtp->task = NULL; - - /* Set any flag that may have been set on the - * alternate stack - */ - if (irqtp->flags) - set_bits(irqtp->flags, &curtp->flags); } static inline void check_stack_overflow(void) @@ -501,9 +462,9 @@ static inline void check_stack_overflow(void) #endif } -void do_IRQ(struct pt_regs *regs) +void __do_irq(struct pt_regs *regs) { - struct pt_regs *old_regs = set_irq_regs(regs); + struct irq_desc *desc; unsigned int irq; irq_enter(); @@ -519,18 +480,64 @@ void do_IRQ(struct pt_regs *regs) */ irq = ppc_md.get_irq(); - /* We can hard enable interrupts now */ + /* We can hard enable interrupts now to allow perf interrupts */ may_hard_irq_enable(); /* And finally process it */ - if (irq != NO_IRQ) - handle_one_irq(irq); - else + if (unlikely(irq == NO_IRQ)) __get_cpu_var(irq_stat).spurious_irqs++; + else { + desc = irq_to_desc(irq); + if (likely(desc)) + desc->handle_irq(irq, desc); + } trace_irq_exit(regs); irq_exit(); +} + +void do_IRQ(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + struct thread_info *curtp, *irqtp; + unsigned long saved_sp_limit; + + /* Switch to the irq stack to handle this */ + curtp = current_thread_info(); + irqtp = hardirq_ctx[raw_smp_processor_id()]; + + /* Already there ? */ + if (unlikely(curtp == irqtp)) { + __do_irq(regs); + set_irq_regs(old_regs); + return; + } + + /* Adjust the stack limit */ + saved_sp_limit = current->thread.ksp_limit; + current->thread.ksp_limit = (unsigned long)irqtp + + _ALIGN_UP(sizeof(struct thread_info), 16); + + + /* Prepare the thread_info in the irq stack */ + irqtp->task = curtp->task; + irqtp->flags = 0; + + /* Copy the preempt_count so that the [soft]irq checks work. 
*/ + irqtp->preempt_count = curtp->preempt_count; + + /* Switch stack and call */ + call_do_irq(regs, irqtp); + + /* Restore stack limit */ + current->thread.ksp_limit = saved_sp_limit; + irqtp->task = NULL; + + /* Copy back updates to the thread_info */ + if (irqtp->flags) + set_bits(irqtp->flags, &curtp->flags); + set_irq_regs(old_regs); } @@ -592,12 +599,10 @@ void irq_ctx_init(void) memset((void *)softirq_ctx[i], 0, THREAD_SIZE); tp = softirq_ctx[i]; tp->cpu = i; - tp->preempt_count = 0; memset((void *)hardirq_ctx[i], 0, THREAD_SIZE); tp = hardirq_ctx[i]; tp->cpu = i; - tp->preempt_count = HARDIRQ_OFFSET; } } diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 777d999..7da3882 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -47,13 +47,12 @@ _GLOBAL(call_do_softirq) mtlr r0 blr -_GLOBAL(call_handle_irq) +_GLOBAL(call_do_irq) mflr r0 stw r0,4(r1) - mtctr r6 - stwu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5) - mr r1,r5 - bctrl + stwu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4) + mr r1,r4 + bl __do_irq lwz r1,0(r1) lwz r0,4(r1) mtlr r0 diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S index 971d7e7..e59caf8 100644 --- a/arch/powerpc/kernel/misc_64.S +++ b/arch/powerpc/kernel/misc_64.S @@ -40,14 +40,12 @@ _GLOBAL(call_do_softirq) mtlr r0 blr -_GLOBAL(call_handle_irq) - ld r8,0(r6) +_GLOBAL(call_do_irq) mflr r0 std r0,16(r1) - mtctr r8 - stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5) - mr r1,r5 - bctrl + stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4) + mr r1,r4 + bl .__do_irq ld r1,0(r1) ld r0,16(r1) mtlr r0 ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH] powerpc/irq: Run softirqs off the top of the irq stack
2013-09-23 4:35 ` [PATCH] powerpc/irq: Run softirqs off the top of the irq stack Benjamin Herrenschmidt
@ 2013-09-23 7:56 ` Stephen Rothwell
2013-09-23 10:13 ` Benjamin Herrenschmidt
2013-09-23 16:47 ` Linus Torvalds
1 sibling, 1 reply; 52+ messages in thread
From: Stephen Rothwell @ 2013-09-23 7:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, James Hogan, Andrew Morton, David S. Miller,
Peter Zijlstra, Frederic Weisbecker, Helge Deller, Heiko Carstens,
LKML, Ingo Molnar, James E.J. Bottomley, H. Peter Anvin,
Martin Schwidefsky, Thomas Gleixner, Linus Torvalds, Paul Mackerras
Hi Ben,
On Mon, 23 Sep 2013 14:35:58 +1000 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index c69440c..0c9646f 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -443,46 +443,7 @@ void migrate_irqs(void)
>
> static inline void handle_one_irq(unsigned int irq)
> {
> -	struct thread_info *curtp, *irqtp;
> -	unsigned long saved_sp_limit;
> -	struct irq_desc *desc;
> -
> -	desc = irq_to_desc(irq);
> -	if (!desc)
> -		return;
> -
> -	/* Switch to the irq stack to handle this */
> -	curtp = current_thread_info();
> -	irqtp = hardirq_ctx[smp_processor_id()];
> -
> -	if (curtp == irqtp) {
> -		/* We're already on the irq stack, just handle it */
> -		desc->handle_irq(irq, desc);
> -		return;
> -	}
> -
> -	saved_sp_limit = current->thread.ksp_limit;
> -
> -	irqtp->task = curtp->task;
> -	irqtp->flags = 0;
> -
> -	/* Copy the softirq bits in preempt_count so that the
> -	 * softirq checks work in the hardirq context. */
> -	irqtp->preempt_count = (irqtp->preempt_count & ~SOFTIRQ_MASK) |
> -			       (curtp->preempt_count & SOFTIRQ_MASK);
>
> -	current->thread.ksp_limit = (unsigned long)irqtp +
> -		_ALIGN_UP(sizeof(struct thread_info), 16);
> -
> -	call_handle_irq(irq, desc, irqtp, desc->handle_irq);
> -	current->thread.ksp_limit = saved_sp_limit;
> -	irqtp->task = NULL;
> -
> -	/* Set any flag that may have been set on the
> -	 * alternate stack
> -	 */
> -	if (irqtp->flags)
> -		set_bits(irqtp->flags, &curtp->flags);
> }
This function ends up as a single blank line ...
> @@ -519,18 +480,64 @@ void do_IRQ(struct pt_regs *regs)
> 	 */
> 	irq = ppc_md.get_irq();
>
> -	/* We can hard enable interrupts now */
> +	/* We can hard enable interrupts now to allow perf interrupts */
> 	may_hard_irq_enable();
>
> 	/* And finally process it */
> -	if (irq != NO_IRQ)
> -		handle_one_irq(irq);
then you remove the only call, so why not just remove the function
completely?
--
Cheers,
Stephen Rothwell
sfr@canb.auug.org.au
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] powerpc/irq: Run softirqs off the top of the irq stack 2013-09-23 7:56 ` Stephen Rothwell @ 2013-09-23 10:13 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 52+ messages in thread From: Benjamin Herrenschmidt @ 2013-09-23 10:13 UTC (permalink / raw) To: Stephen Rothwell Cc: linuxppc-dev, James Hogan, Andrew Morton, David S. Miller, Peter Zijlstra, Frederic Weisbecker, Helge Deller, Heiko Carstens, LKML, Ingo Molnar, James E.J. Bottomley, H. Peter Anvin, Martin Schwidefsky, Thomas Gleixner, Linus Torvalds, Paul Mackerras On Mon, 2013-09-23 at 17:56 +1000, Stephen Rothwell wrote: > Hi Ben, > > On Mon, 23 Sep 2013 14:35:58 +1000 Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > > > > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c > > index c69440c..0c9646f 100644 > > --- a/arch/powerpc/kernel/irq.c > > +++ b/arch/powerpc/kernel/irq.c > > @@ -443,46 +443,7 @@ void migrate_irqs(void) > > > > static inline void handle_one_irq(unsigned int irq) > > { > > - struct thread_info *curtp, *irqtp; > > - unsigned long saved_sp_limit; > > - struct irq_desc *desc; > > - > > - desc = irq_to_desc(irq); > > - if (!desc) > > - return; > > - > > - /* Switch to the irq stack to handle this */ > > - curtp = current_thread_info(); > > - irqtp = hardirq_ctx[smp_processor_id()]; > > - > > - if (curtp == irqtp) { > > - /* We're already on the irq stack, just handle it */ > > - desc->handle_irq(irq, desc); > > - return; > > - } > > - > > - saved_sp_limit = current->thread.ksp_limit; > > - > > - irqtp->task = curtp->task; > > - irqtp->flags = 0; > > - > > - /* Copy the softirq bits in preempt_count so that the > > - * softirq checks work in the hardirq context. */ > > - irqtp->preempt_count = (irqtp->preempt_count & ~SOFTIRQ_MASK) | > > - (curtp->preempt_count & SOFTIRQ_MASK); > > > > - current->thread.ksp_limit = (unsigned long)irqtp + > > - _ALIGN_UP(sizeof(struct thread_info), 16); > > - > > - call_handle_irq(irq, desc, irqtp, desc->handle_irq); > > - current->thread.ksp_limit = saved_sp_limit; > > - irqtp->task = NULL; > > - > > - /* Set any flag that may have been set on the > > - * alternate stack > > - */ > > - if (irqtp->flags) > > - set_bits(irqtp->flags, &curtp->flags); > > } > > This function ends up as a single blank line ... > > > @@ -519,18 +480,64 @@ void do_IRQ(struct pt_regs *regs) > > */ > > irq = ppc_md.get_irq(); > > > > - /* We can hard enable interrupts now */ > > + /* We can hard enable interrupts now to allow perf interrupts */ > > may_hard_irq_enable(); > > > > /* And finally process it */ > > - if (irq != NO_IRQ) > > - handle_one_irq(irq); > > then you remove the only call, so why not just remove the function > completely? Because I'm an idiot ? :-) I moved bits and pieces to do_IRQ and forgot to remove the remainder (and gcc didn't warn :-) Cheers, Ben. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] powerpc/irq: Run softirqs off the top of the irq stack 2013-09-23 4:35 ` [PATCH] powerpc/irq: Run softirqs off the top of the irq stack Benjamin Herrenschmidt 2013-09-23 7:56 ` Stephen Rothwell @ 2013-09-23 16:47 ` Linus Torvalds 2013-09-23 20:51 ` Benjamin Herrenschmidt 1 sibling, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2013-09-23 16:47 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev, Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton On Sun, Sep 22, 2013 at 9:35 PM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > > This is the "band aid" discussed so far for the stack overflow > problem for powerpc. I don't think it's a "band-aid" in any way, except perhaps in the sense that there are certainly other things we can also do in this series (ie I liked Frederic's cleanups etc). Linus ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] powerpc/irq: Run softirqs off the top of the irq stack
2013-09-23 16:47 ` Linus Torvalds
@ 2013-09-23 20:51 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-23 20:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: linuxppc-dev, Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Mon, 2013-09-23 at 09:47 -0700, Linus Torvalds wrote:
> On Sun, Sep 22, 2013 at 9:35 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> >
> > This is the "band aid" discussed so far for the stack overflow
> > problem for powerpc.
>
> I don't think it's a "band-aid" in any way, except perhaps in the
> sense that there are certainly other things we can also do in this
> series (ie I liked Frederic's cleanups etc).
Ah yes, I thought of it as a band-aid in the sense that a better
approach would be to switch to the irq stack earlier like x86_64 does,
but that would be a lot more invasive. Definitely something I would
look into if I were to tackle changing how we do per-cpu and the PACA,
though.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH v2] powerpc/irq: Run softirqs off the top of the irq stack
2013-09-22 22:38 ` Benjamin Herrenschmidt
2013-09-23 4:35 ` [PATCH] powerpc/irq: Run softirqs off the top of the irq stack Benjamin Herrenschmidt
@ 2013-09-24 5:42 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 5:42 UTC (permalink / raw)
To: linuxppc-dev
Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Linus Torvalds
Nowadays, irq_exit() calls __do_softirq() pretty much directly
instead of calling do_softirq() which switches to the dedicated
softirq stack.
This has led to observed stack overflows on powerpc since we call
irq_enter() and irq_exit() outside of the scope that switches to
the irq stack.
This fixes it by moving the stack switching up a level, making
irq_enter() and irq_exit() run off the irq stack.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
v2: Garbage collect now empty handle_one_irq()
---
arch/powerpc/include/asm/irq.h | 4 +-
arch/powerpc/kernel/irq.c | 104 ++++++++++++++++++++---------------------
arch/powerpc/kernel/misc_32.S | 9 ++--
arch/powerpc/kernel/misc_64.S | 10 ++--
4 files changed, 62 insertions(+), 65 deletions(-)
diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 0e40843..41f13ce 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -69,9 +69,9 @@ extern struct thread_info *softirq_ctx[NR_CPUS];
 extern void irq_ctx_init(void);
 extern void call_do_softirq(struct thread_info *tp);
-extern int call_handle_irq(int irq, void *p1,
-			   struct thread_info *tp, void *func);
+extern void call_do_irq(struct pt_regs *regs, struct thread_info *tp);
 extern void do_IRQ(struct pt_regs *regs);
+extern void __do_irq(struct pt_regs *regs);
 
 int irq_choose_cpu(const struct cpumask *mask);
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index c69440c..2234a12 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -441,50 +441,6 @@ void migrate_irqs(void)
 }
 #endif
 
-static inline void handle_one_irq(unsigned int irq)
-{
-	struct thread_info *curtp, *irqtp;
-	unsigned long saved_sp_limit;
-	struct irq_desc *desc;
-
-	desc = irq_to_desc(irq);
-	if (!desc)
-		return;
-
-	/* Switch to the irq stack to handle this */
-	curtp = current_thread_info();
-	irqtp = hardirq_ctx[smp_processor_id()];
-
-	if (curtp == irqtp) {
-		/* We're already on the irq stack, just handle it */
-		desc->handle_irq(irq, desc);
-		return;
-	}
-
-	saved_sp_limit = current->thread.ksp_limit;
-
-	irqtp->task = curtp->task;
-	irqtp->flags = 0;
-
-	/* Copy the softirq bits in preempt_count so that the
-	 * softirq checks work in the hardirq context.
*/ - irqtp->preempt_count = (irqtp->preempt_count & ~SOFTIRQ_MASK) | - (curtp->preempt_count & SOFTIRQ_MASK); - - current->thread.ksp_limit = (unsigned long)irqtp + - _ALIGN_UP(sizeof(struct thread_info), 16); - - call_handle_irq(irq, desc, irqtp, desc->handle_irq); - current->thread.ksp_limit = saved_sp_limit; - irqtp->task = NULL; - - /* Set any flag that may have been set on the - * alternate stack - */ - if (irqtp->flags) - set_bits(irqtp->flags, &curtp->flags); -} - static inline void check_stack_overflow(void) { #ifdef CONFIG_DEBUG_STACKOVERFLOW @@ -501,9 +457,9 @@ static inline void check_stack_overflow(void) #endif } -void do_IRQ(struct pt_regs *regs) +void __do_irq(struct pt_regs *regs) { - struct pt_regs *old_regs = set_irq_regs(regs); + struct irq_desc *desc; unsigned int irq; irq_enter(); @@ -519,18 +475,64 @@ void do_IRQ(struct pt_regs *regs) */ irq = ppc_md.get_irq(); - /* We can hard enable interrupts now */ + /* We can hard enable interrupts now to allow perf interrupts */ may_hard_irq_enable(); /* And finally process it */ - if (irq != NO_IRQ) - handle_one_irq(irq); - else + if (unlikely(irq == NO_IRQ)) __get_cpu_var(irq_stat).spurious_irqs++; + else { + desc = irq_to_desc(irq); + if (likely(desc)) + desc->handle_irq(irq, desc); + } trace_irq_exit(regs); irq_exit(); +} + +void do_IRQ(struct pt_regs *regs) +{ + struct pt_regs *old_regs = set_irq_regs(regs); + struct thread_info *curtp, *irqtp; + unsigned long saved_sp_limit; + + /* Switch to the irq stack to handle this */ + curtp = current_thread_info(); + irqtp = hardirq_ctx[raw_smp_processor_id()]; + + /* Already there ? */ + if (unlikely(curtp == irqtp)) { + __do_irq(regs); + set_irq_regs(old_regs); + return; + } + + /* Adjust the stack limit */ + saved_sp_limit = current->thread.ksp_limit; + current->thread.ksp_limit = (unsigned long)irqtp + + _ALIGN_UP(sizeof(struct thread_info), 16); + + + /* Prepare the thread_info in the irq stack */ + irqtp->task = curtp->task; + irqtp->flags = 0; + + /* Copy the preempt_count so that the [soft]irq checks work. 
*/ + irqtp->preempt_count = curtp->preempt_count; + + /* Switch stack and call */ + call_do_irq(regs, irqtp); + + /* Restore stack limit */ + current->thread.ksp_limit = saved_sp_limit; + irqtp->task = NULL; + + /* Copy back updates to the thread_info */ + if (irqtp->flags) + set_bits(irqtp->flags, &curtp->flags); + set_irq_regs(old_regs); } @@ -592,12 +594,10 @@ void irq_ctx_init(void) memset((void *)softirq_ctx[i], 0, THREAD_SIZE); tp = softirq_ctx[i]; tp->cpu = i; - tp->preempt_count = 0; memset((void *)hardirq_ctx[i], 0, THREAD_SIZE); tp = hardirq_ctx[i]; tp->cpu = i; - tp->preempt_count = HARDIRQ_OFFSET; } } diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 777d999..7da3882 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -47,13 +47,12 @@ _GLOBAL(call_do_softirq) mtlr r0 blr -_GLOBAL(call_handle_irq) +_GLOBAL(call_do_irq) mflr r0 stw r0,4(r1) - mtctr r6 - stwu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5) - mr r1,r5 - bctrl + stwu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4) + mr r1,r4 + bl __do_irq lwz r1,0(r1) lwz r0,4(r1) mtlr r0 diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S index 971d7e7..e59caf8 100644 --- a/arch/powerpc/kernel/misc_64.S +++ b/arch/powerpc/kernel/misc_64.S @@ -40,14 +40,12 @@ _GLOBAL(call_do_softirq) mtlr r0 blr -_GLOBAL(call_handle_irq) - ld r8,0(r6) +_GLOBAL(call_do_irq) mflr r0 std r0,16(r1) - mtctr r8 - stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r5) - mr r1,r5 - bctrl + stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4) + mr r1,r4 + bl .__do_irq ld r1,0(r1) ld r0,16(r1) mtlr r0 ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix 2013-09-22 22:22 ` Linus Torvalds 2013-09-22 22:38 ` Benjamin Herrenschmidt @ 2013-09-23 17:59 ` Chris Metcalf 2013-09-23 20:57 ` Benjamin Herrenschmidt 2013-09-24 0:10 ` Benjamin Herrenschmidt 2 siblings, 1 reply; 52+ messages in thread From: Chris Metcalf @ 2013-09-23 17:59 UTC (permalink / raw) To: Linus Torvalds Cc: Benjamin Herrenschmidt, Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton On 9/22/2013 6:23 PM, Linus Torvalds wrote: > Alternatively, make %r13 point to the percpu side, but make sure that > you always use an asm accessor to fetch the value. In particular, I > think you need to make __my_cpu_offset be an inline asm that fetches > %r13 into some other register. Otherwise you can never get it right. We just came up against this on tilegx with a customer bug report in a PREEMPT environment. On tile, %tp is a GPR that points to the percpu area. The following seems to be the right abstraction -- though I'd also argue that letting barrier() clobber not just memory, but %tp, might be a better solution, but it's not clear what the best way is to do per-architecture overrides of per-compiler definitions like barrier(). See also the ARM v7 code, which has to do something similar, though their percpu pointer is not a GPR, which changes the tradeoffs somewhat. register unsigned long my_cpu_offset_reg asm("tp"); #ifdef CONFIG_PREEMPT /* * For full preemption, we can't just use the register variable * directly, since we need barrier() to hazard against it, causing the * compiler to reload anything computed from a previous "tp" value. * But we also don't want to use volatile asm, since we'd like the * compiler to be able to cache the value across multiple percpu reads. * So we use a fake stack read as a hazard against barrier(). */ static inline unsigned long __my_cpu_offset(void) { unsigned long tp; register unsigned long *sp asm("sp"); asm("move %0, tp" : "=r" (tp) : "m" (*sp)); return tp; } #define __my_cpu_offset __my_cpu_offset() #else /* * We don't need to hazard against barrier() since "tp" doesn't ever * change with PREEMPT_NONE, and with PREEMPT_VOLUNTARY it only * changes at function call points, at which we are already re-reading * the value of "tp" due to "my_cpu_offset_reg" being a global variable. */ #define __my_cpu_offset my_cpu_offset_reg #endif #define set_my_cpu_offset(tp) (my_cpu_offset_reg = (tp)) -- Chris Metcalf, Tilera Corp. http://www.tilera.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix 2013-09-23 17:59 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Chris Metcalf @ 2013-09-23 20:57 ` Benjamin Herrenschmidt 2013-09-24 19:27 ` Chris Metcalf 0 siblings, 1 reply; 52+ messages in thread From: Benjamin Herrenschmidt @ 2013-09-23 20:57 UTC (permalink / raw) To: Chris Metcalf Cc: Linus Torvalds, Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton On Mon, 2013-09-23 at 13:59 -0400, Chris Metcalf wrote: > We just came up against this on tilegx with a customer bug report in a > PREEMPT environment. On tile, %tp is a GPR that points to the percpu area. > The following seems to be the right abstraction -- though I'd also argue > that letting barrier() clobber not just memory, but %tp, might be a better > solution, but it's not clear what the best way is to do per-architecture > overrides of per-compiler definitions like barrier(). See also the ARM v7 > code, which has to do something similar, though their percpu pointer is > not a GPR, which changes the tradeoffs somewhat. Hrm, if I read correctly what you did is that you read "tp" into another register *and* also mark that action as clobbering the top int on the stack ? I don't quite get what the stack clobber brings you here and how it works around the fact that gcc might still cache that copy of tp into another register accross preempt_enable/disable... It's hard to tell with gcc ... the best I've had so far as an option was something that would mark my per-cpu register (r13) *itself* as clobbered... Cheers, Ben. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix 2013-09-23 20:57 ` Benjamin Herrenschmidt @ 2013-09-24 19:27 ` Chris Metcalf 2013-09-24 20:58 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 52+ messages in thread From: Chris Metcalf @ 2013-09-24 19:27 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Linus Torvalds, Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton On 9/23/2013 4:57 PM, Benjamin Herrenschmidt wrote: > On Mon, 2013-09-23 at 13:59 -0400, Chris Metcalf wrote: >> We just came up against this on tilegx with a customer bug report in a >> PREEMPT environment. On tile, %tp is a GPR that points to the percpu area. >> The following seems to be the right abstraction -- though I'd also argue >> that letting barrier() clobber not just memory, but %tp, might be a better >> solution, but it's not clear what the best way is to do per-architecture >> overrides of per-compiler definitions like barrier(). See also the ARM v7 >> code, which has to do something similar, though their percpu pointer is >> not a GPR, which changes the tradeoffs somewhat. > Hrm, if I read correctly what you did is that you read "tp" into another > register *and* also mark that action as clobbering the top int on the stack ? > > I don't quite get what the stack clobber brings you here and how it works > around the fact that gcc might still cache that copy of tp into another > register accross preempt_enable/disable... Correct. The idea is that since gcc knows that the outputs of the asm are dependent on memory, when it sees a memory clobber it knows it has to instantiate the asm again to get a new value of the asm outputs. gcc will know that any copies of tp in other registers are also dependent on memory and will drop all of them across the barrier() as well. > It's hard to tell with gcc ... the best I've had so far as an option was > something that would mark my per-cpu register (r13) *itself* as clobbered... Well, as I said above, that would be better, but it requires providing an alternate definition of barrier() that is per-architecture, not just per-compiler. If there's interest in pursuing a solution like that, it would be technically somewhat better; in particular, with PREEMPT_NONE, you could treat the "tp" register int as locally scoped in an inline, and the compiler wouldn't have to reload it after function calls. Presumably we'd need to pick an asm header that could provide an arch_barrier_clobbers string that would be added to barrier() for gcc if it were defined. -- Chris Metcalf, Tilera Corp. http://www.tilera.com ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 19:27 ` Chris Metcalf
@ 2013-09-24 20:58 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 20:58 UTC (permalink / raw)
To: Chris Metcalf
Cc: Linus Torvalds, Peter Zijlstra, H. Peter Anvin,
Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller,
Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton
On Tue, 2013-09-24 at 15:27 -0400, Chris Metcalf wrote:
> > It's hard to tell with gcc ... the best I've had so far as an option was
> > something that would mark my per-cpu register (r13) *itself* as clobbered...
>
> Well, as I said above, that would be better, but it requires providing an
> alternate definition of barrier() that is per-architecture, not just
> per-compiler.
My compiler people tell me "clobbered" is wrong. It will tell the
compiler that barrier() damages r13 (or whatever other register you
use) and instead make it do exactly the wrong thing, which is to back
it up before the barrier and use the backup :-)
I'm told what we need is an empty asm that marks r13 as an *output*
(and possibly an input as well to be safe). I will experiment.
> If there's interest in pursuing a solution like that, it
> would be technically somewhat better; in particular, with PREEMPT_NONE,
> you could treat the "tp" register int as locally scoped in an inline, and
> the compiler wouldn't have to reload it after function calls. Presumably
> we'd need to pick an asm header that could provide an arch_barrier_clobbers
> string that would be added to barrier() for gcc if it were defined.
My idea was to add a preempt_barrier() and put that in
preempt_enable/disable and the various irq enable/disable.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
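What Ben's compiler people are proposing would, as a sketch with assumed naming, be an empty asm with a tied output operand. Unlike a clobber, this tells gcc the register may hold a new value afterwards, so any cached copies must be dropped rather than the old value saved and restored:

register unsigned long percpu_base asm("r13");	/* assumed naming */

/*
 * Empty asm naming r13 as both output and input ("+r"): it emits no
 * instructions, but after this point gcc must re-read r13 rather than
 * reuse a copy made earlier. A candidate body for Ben's
 * preempt_barrier(), to be placed in preempt_enable/disable and the
 * irq enable/disable paths.
 */
#define preempt_barrier() \
	asm volatile("" : "+r" (percpu_base))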
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-22 22:22 ` Linus Torvalds
2013-09-22 22:38 ` Benjamin Herrenschmidt
2013-09-23 17:59 ` [RFC GIT PULL] softirq: Consolidation and stack overrun fix Chris Metcalf
@ 2013-09-24 0:10 ` Benjamin Herrenschmidt
2013-09-24 1:19 ` Linus Torvalds
2 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 0:10 UTC (permalink / raw)
To: Linus Torvalds
Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Sun, 2013-09-22 at 15:22 -0700, Linus Torvalds wrote:
> - use %r13 for the per-thread thread-info pointer instead. A
> per-thread pointer is *not* volatile like the per-cpu base is.
.../...
> Alternatively, make %r13 point to the percpu side, but make sure that
> you always use an asm accessor to fetch the value. In particular, I
> think you need to make __my_cpu_offset be an inline asm that fetches
> %r13 into some other register. Otherwise you can never get it right.
BTW, that boils down to a choice between using r13 as either a TLS for
current or current_thread_info, or as a per-cpu pointer. Which one is
the most performance-critical?
Now in the first case, it seems to me that using it as "current" rather
than "current_thread_info()" is a better idea since we access current a
LOT more overall in the kernel; from there we can find a way to put
thread_info into the task struct (via the thread struct maybe) to make
it a simple offset from current.
The big pro of that approach is of course that r13 becomes the TLS as
intended, and we can feel a lot more comfortable that we are "safe" vs.
whatever craziness gcc will come up with next.
The flip side is that per-cpu will remain a load away, so getting the
address of a per-cpu variable would typically be a 3-instruction deal
involving a load and a pair of adds to get to the address, then the
actual per-cpu access proper. This is equivalent to what we have today
(we put the per-cpu offset in the PACA). Using r13 as per-cpu allows us
to avoid that first load.
So what's the most worthwhile thing to do here? I'm leaning toward 1,
ie, stick current in r13 and feel a lot safer about it (I won't have to
scrutinize generated code all over the place to convince myself things
aren't crossing the barriers), and if the thread_info is in the task
struct, that makes accessing it really trivial & fast as well.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 0:10 ` Benjamin Herrenschmidt
@ 2013-09-24 1:19 ` Linus Torvalds
2013-09-24 1:52 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 52+ messages in thread
From: Linus Torvalds @ 2013-09-24 1:19 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Mon, Sep 23, 2013 at 5:10 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>
> BTW, that boils down to a choice between using r13 as either a TLS for
> current or current_thread_info, or as a per-cpu pointer. Which one is
> the most performance-critical?
I think you can tune most of the architecture setup to best suit your needs.
For example, on x86, we don't have much choice: the per-cpu accessors
are going to be faster than the alternatives, and there are patches
afoot to tune the preempt and rcu-readside counters to use the percpu
area (and then save/restore things at task switch time). But having
the counters natively in the thread_info struct is fine too and is
what we do now.
Generally, we've put the performance-critical stuff into
"current_thread_info" as opposed to "current", so it's likely that if
the choice is between those two, then you might want to pick %r13
pointing to the thread-info rather than the "struct task_struct" (ie
things like low-level thread flags). But which is better probably
depends on load, and again, some of it you can tweak by just making
per-architecture structure choices and making the macros point at one
or the other.
There are a few things that really depend on per-cpu areas, but I don't
think it's a huge performance issue if you have to indirect off memory
to get that. Most of the performance issues with per-cpu stuff are
about avoiding cachelines ping-ponging back and forth, not so much
about fast direct access. Of course, if some load really uses a *lot*
of percpu accesses, you get both.
The advantage of having %r13 point to thread data (which is "stable"
as far as the compiler is concerned) as opposed to having it be a
per-cpu pointer (which can change randomly due to task switching) is
that from a correctness standpoint I really do think that either
thread-info or current is *much* easier to handle than using it for
the per-cpu base pointer.
Linus
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix 2013-09-24 1:19 ` Linus Torvalds @ 2013-09-24 1:52 ` Benjamin Herrenschmidt 2013-09-24 8:04 ` Peter Zijlstra 0 siblings, 1 reply; 52+ messages in thread From: Benjamin Herrenschmidt @ 2013-09-24 1:52 UTC (permalink / raw) To: Linus Torvalds Cc: Peter Zijlstra, H. Peter Anvin, Frederic Weisbecker, Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan, James E.J. Bottomley, Helge Deller, Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard On Mon, 2013-09-23 at 18:19 -0700, Linus Torvalds wrote: > On Mon, Sep 23, 2013 at 5:10 PM, Benjamin Herrenschmidt > <benh@kernel.crashing.org> wrote: > > > > BTW, that boils down to a choice between using r13 as either a TLS for > > current or current_thread_info, or as a per-cpu pointer, which one is > > the most performance critical ? > > I think you can tune most of the architecture setup to best suit your needs. > > For example, on x86, we don't have much choice: the per-cpu accessors > are going to be faster than the alternatives, and there are patches > afoot to tune the preempt and rcu-readside counters to use the percpu > area (and then save/restore things at task switch time). But having > the counters natively in the thread_info struct is fine too and is > what we do now. Right, as long as the generic code doesn't move toward putting everything in per-cpu without leaving us the option :-) > Generally, we've put the performance-critical stuff into > "current_thread_info" as opposed to "current", so it's likely that if > the choice is between those two, then you might want to pick %r13 > pointing to the thread-info rather than the "struct task_struct" (ie > things like low-level thread flags). But which is better probably > depends on load, and again, some of it you can tweak by just making > per-architecture structure choices and making the macros point at one > or the other. Well, if current_thread_info is basically inside the thread struct, it will be the same, just a different offset from r13... task struct, thread struct, thread info, it all becomes just one big structure pointed to by r13. > There's a few things that really depend on per-cpu areas, but I don't > think it's a huge performance issue if you have to indirect off memory > to get that. Most of the performance issues with per-cpu stuff is > about avoiding cachelines ping-ponging back and forth, not so much > about fast direct access. Of course, if some load really uses a *lot* > of percpu accesses, you get both. > > The advantage of having %r13 point to thread data (which is "stable" > as far as the compiler is concerned) as opposed to having it be a > per-cpu pointer (which can change randomly due to task switching) is > that from a correctness standpoint I really do think that either > thread-info or current is *much* easier to handle than using it for > the per-cpu base pointer. Right. I had a chat with Alan Modra (gcc) and he reckons the "right" way to make the per-cpu (or PACA) stuff work reasonably reliably is to do something along the lines of: register unsigned long per_cpu_offset asm("r13"); And have a barrier in preempt_enable/disable (and irq enable/disable, though arguably we could just make barrier() do it) which marks r13 as an *output* (not a clobber !). 
From there, gcc knows that after any such barrier, r13 can have changed
and we intend to use the new value (if it's marked as a clobber, it
assumes it was *clobbered* and thus needs to be restored to its
*previous* value).
So if that holds, we have a solid way to do per-cpu. On one side, I tend
to think that r13 being task/thread/thread_info is probably a better
overall choice, but I'm worried that going in a different direction than
x86 means generic code will get "tuned" to use per-cpu for performance
critical stuff rather than task/thread/thread_info in inflexible ways.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 1:52 ` Benjamin Herrenschmidt
@ 2013-09-24 8:04 ` Peter Zijlstra
2013-09-24 8:16 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2013-09-24 8:04 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Linus Torvalds, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Tue, Sep 24, 2013 at 11:52:07AM +1000, Benjamin Herrenschmidt wrote:
> So if that holds, we have a solid way to do per-cpu. On one side, I tend
> to think that r13 being task/thread/thread_info is probably a better
> overall choice, but I'm worried that going in a different direction than
> x86 means generic code will get "tuned" to use per-cpu for performance
> critical stuff rather than task/thread/thread_info in inflexible ways.
The plus side of per-cpu over per-task is that one typically has a lot
fewer cpus than tasks. Also, it's far easier/cheaper to iterate cpus
than it is to iterate tasks.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 8:04 ` Peter Zijlstra
@ 2013-09-24 8:16 ` Benjamin Herrenschmidt
2013-09-24 8:21 ` Peter Zijlstra
0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 8:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Linus Torvalds, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Tue, 2013-09-24 at 10:04 +0200, Peter Zijlstra wrote:
> On Tue, Sep 24, 2013 at 11:52:07AM +1000, Benjamin Herrenschmidt wrote:
> > So if that holds, we have a solid way to do per-cpu. On one side, I tend
> > to think that r13 being task/thread/thread_info is probably a better
> > overall choice, but I'm worried that going in a different direction than
> > x86 means generic code will get "tuned" to use per-cpu for performance
> > critical stuff rather than task/thread/thread_info in inflexible ways.
>
> The plus side of per-cpu over per-task is that one typically has a lot
> fewer cpus than tasks. Also, it's far easier/cheaper to iterate cpus
> than it is to iterate tasks.
I don't see how that relates to the above though... ie, how the number
of CPUs vs. tasks and the relative ease of iteration relates to how
fast we access the "current" cpu and the "current" task?
The thing is, if I use r13 as "current" (and put thread_info in the
task struct), virtually *all* my accesses to task, thread and
thread_info structs become a single load or store instruction from r13.
On the other hand, access to "my_cpu_offset" for per-cpu remains (since
that's what we do today already) a load to get the offset and an
addition + access. (ie, I would stash the cpu offset into the thread
info and context switch it).
If I use r13 as "my_cpu_offset", then I might be able to skip a load
for per-cpu accesses to the current cpu, making them a bit faster, but
add an indirection for "current" and thread_info.
It's basically a question of where I put the extra load and what has
the bigger net benefit. I tend to think we access "current" a LOT more
than per-cpu but I might be wrong.
Additionally, using r13 for "current" removes all the problems with gcc
copying the value across preempt_disable/enable etc... while using it
as per-cpu means they remain. We think we have a fix (which will
involve a special preempt_barrier() added to preempt_disable/enable and
irq enable/disable), but it's still not as nice as "the problem just
doesn't exist" :-)
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
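Option 1, sketched with stand-in names and layout (thread_info folded into the task struct, with the per-cpu offset stashed there and updated at context switch, exactly as Ben describes; every accessor becomes one load off r13):

/* Sketch only: stand-in definitions, not the real kernel structures. */
struct thread_info_sketch {
	unsigned long flags;
	int preempt_count;
	/* ... */
};

struct task_struct_sketch {
	struct thread_info_sketch ti;	/* embedded: no extra indirection */
	unsigned long per_cpu_offset;	/* stashed here, context-switched */
	/* ... rest of the task struct ... */
};

register struct task_struct_sketch *current_r13 asm("r13");

#define current_sketch()	(current_r13)
#define thread_info_sketch()	(&current_r13->ti)	/* offset off r13 */
#define my_cpu_offset_sketch()	(current_r13->per_cpu_offset) /* one load */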
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 8:16 ` Benjamin Herrenschmidt
@ 2013-09-24 8:21 ` Peter Zijlstra
2013-09-24 9:31 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 52+ messages in thread
From: Peter Zijlstra @ 2013-09-24 8:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Linus Torvalds, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Tue, Sep 24, 2013 at 06:16:53PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-09-24 at 10:04 +0200, Peter Zijlstra wrote:
> > On Tue, Sep 24, 2013 at 11:52:07AM +1000, Benjamin Herrenschmidt wrote:
> > > So if that holds, we have a solid way to do per-cpu. On one side, I tend
> > > to think that r13 being task/thread/thread_info is probably a better
> > > overall choice; I'm worried that going in a different direction than x86
> > > means generic code will get "tuned" to use per-cpu for performance
> > > critical stuff rather than task/thread/thread_info in inflexible ways.
> >
> > The plus side of per-cpu over per-task is that one typically has a lot
> > fewer cpus than tasks. Also, it's far easier/cheaper to iterate cpus than
> > it is to iterate tasks.
>
> I don't see how that relates to the above though...
It was a comment on the increase of per-cpu usage in generic code.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 8:21 ` Peter Zijlstra
@ 2013-09-24 9:31 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 9:31 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Linus Torvalds, H. Peter Anvin, Frederic Weisbecker,
Thomas Gleixner, LKML, Paul Mackerras, Ingo Molnar, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton, Anton Blanchard
On Tue, 2013-09-24 at 10:21 +0200, Peter Zijlstra wrote:
> > I don't see how that relates to the above though...
>
> It was a comment on the increase of per-cpu usage in generic code.
Oh I see. Ok, but some things are intrinsically per-task, so if you use
per-cpu for those you end up context-switching them; at least that's what
Linus alluded to. It would be nice if we designed such constructs so that
we still access them via the task struct on archs where such accesses are
cheap.
I think if/when we get rid of using r13 for our arch-specific "PACA", it
becomes easy for me to experiment with using it for either per-cpu or
current and do some measurements. It also means I can easily switch "the
other way" if I need to.
I'll dig into that; it will probably take some time before I sanitize our
code enough anyway.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-21 21:45 ` Benjamin Herrenschmidt
2013-09-21 23:27 ` Frederic Weisbecker
2013-09-22 2:01 ` H. Peter Anvin
@ 2013-09-23 4:40 ` Benjamin Herrenschmidt
2013-09-23 5:01 ` David Miller
2013-09-24 2:44 ` Frederic Weisbecker
2 siblings, 2 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-23 4:40 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Sun, 2013-09-22 at 07:45 +1000, Benjamin Herrenschmidt wrote:
> What I *can* do that would help I suppose would be to switch to the irq
> stack before irq_enter/exit, which would at least mean that softirq would
> run from the top of the irq stack, which is better than the current
> situation.
>
> In fact I'll whip up a quick fix and see if that might be enough of a
> band-aid for RHEL7.
OK, I've done that and it seems to work so far. Heads up guys: i386 and sparc
at least seem to need the same treatment. I haven't looked at others except
ARM, which doesn't seem to have irq stacks to begin with.
We can also instead apply Fred's series to put back in the switch to the
softirq stack, since this is actually a regression, but then, arguably,
making sure irq_exit() is called off the irq stack is better and means
we do one instead of two stack switches.
Fred: Maybe revert partially under an arch #define/Kconfig so we can get
the best of both worlds?
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
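In rough shape, the band-aid Ben describes is the following restructure of the arch irq entry path; this is a sketch with assumed helper names such as call_on_irq_stack() and handle_irq(), not the actual powerpc patch:

struct pt_regs;
/* Prototypes assumed from kernel context: */
void irq_enter(void);
void irq_exit(void);
void handle_irq(struct pt_regs *regs);
void call_on_irq_stack(void (*fn)(struct pt_regs *), struct pt_regs *regs);

/* Before: only the handlers run on the irq stack, so irq_exit(), and
 * therefore __do_softirq(), runs back on the possibly already deep
 * task stack. */
void do_IRQ_before(struct pt_regs *regs)
{
	irq_enter();
	call_on_irq_stack(handle_irq, regs);
	irq_exit();			/* softirqs on the task stack */
}

/* After: switch first, so the whole enter/handle/exit sequence,
 * softirq processing included, runs near the top of the irq stack,
 * with a single stack switch. */
static void __do_irq(struct pt_regs *regs)
{
	irq_enter();
	handle_irq(regs);
	irq_exit();			/* softirqs on the irq stack */
}

void do_IRQ_after(struct pt_regs *regs)
{
	call_on_irq_stack(__do_irq, regs);
}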
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-23 4:40 ` Benjamin Herrenschmidt
@ 2013-09-23 5:01 ` David Miller
2013-09-24 2:44 ` Frederic Weisbecker
1 sibling, 0 replies; 52+ messages in thread
From: David Miller @ 2013-09-23 5:01 UTC (permalink / raw)
To: benh
Cc: fweisbec, torvalds, tglx, linux-kernel, paulus, mingo, peterz,
hpa, james.hogan, jejb, deller, schwidefsky, heiko.carstens, akpm
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Mon, 23 Sep 2013 14:40:34 +1000
> Heads up guys: i386 and sparc at least seem to need the same
> treatment.
I'll take a look, thanks for the heads up.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-23 4:40 ` Benjamin Herrenschmidt
2013-09-23 5:01 ` David Miller
@ 2013-09-24 2:44 ` Frederic Weisbecker
2013-09-24 4:42 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-24 2:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Mon, Sep 23, 2013 at 02:40:34PM +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2013-09-22 at 07:45 +1000, Benjamin Herrenschmidt wrote:
> > What I *can* do that would help I suppose would be to switch to the irq
> > stack before irq_enter/exit, which would at least mean that softirq would
> > run from the top of the irq stack, which is better than the current
> > situation.
> >
> > In fact I'll whip up a quick fix and see if that might be enough of a
> > band-aid for RHEL7.
>
> OK, I've done that and it seems to work so far. Heads up guys: i386 and sparc
> at least seem to need the same treatment. I haven't looked at others except
> ARM, which doesn't seem to have irq stacks to begin with.
>
> We can also instead apply Fred's series to put back in the switch to the
> softirq stack, since this is actually a regression, but then, arguably,
> making sure irq_exit() is called off the irq stack is better and means
> we do one instead of two stack switches.
>
> Fred: Maybe revert partially under an arch #define/Kconfig so we can get
> the best of both worlds?
Aye, I did not realize this is indeed a regression, caused by commit
facd8b80c67a3cf64a467c4a2ac5fb31f2e6745b ("irq: Sanitize invoke_softirq"),
which converted do_softirq() to __do_softirq() on irq_exit(). It indeed
looked like the conditional call was only there to deal with whether irqs
are disabled on irq exit. But then these crashes...
So the safest way to fix this is to unconditionally call do_softirq()
from irq_exit(). A performance penalty may come along, but safety comes
first.
We should probably do that and work on longer-term solutions (Kconfig-based
arch switch, etc.) for the next merge window?
I'll respin the series plus the regression fix, unless somebody has a
better solution.
Thanks.
^ permalink raw reply [flat|nested] 52+ messages in thread
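In code, the safe fix is essentially a one-liner in irq_exit()'s softirq invocation; a sketch against the kernel/softirq.c of that era, with the existing force_irqthreads handling kept as-is:

static inline void invoke_softirq(void)
{
	if (!force_irqthreads)
		do_softirq();	/* was __do_softirq(); do_softirq() re-checks
				 * pending softirqs and goes through the arch
				 * stack-switching override */
	else
		wakeup_softirqd();
}

This works because irq_exit() drops the hardirq count before invoking softirqs, so do_softirq()'s in_interrupt() check passes at that point.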
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 2:44 ` Frederic Weisbecker
@ 2013-09-24 4:42 ` Benjamin Herrenschmidt
2013-09-24 13:56 ` Frederic Weisbecker
0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 4:42 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Tue, 2013-09-24 at 04:44 +0200, Frederic Weisbecker wrote:
> So the safest way to fix this is to unconditionally call do_softirq()
> from irq_exit(). A performance penalty may come along, but safety comes
> first.
>
> We should probably do that and work on longer-term solutions (Kconfig-based
> arch switch, etc.) for the next merge window?
As you prefer, though I'm keen on getting the "fast" version in RHEL7 if
RH will take it :-)
From the generic code POV, it's a one-liner #ifdef to select between
do_softirq() and __do_softirq(), right? Then it's up to the arch to
#define I_CAN_DO_FAST!
> I'll respin the series plus the regression fix, unless somebody has a
> better solution.
Cheers,
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 4:42 ` Benjamin Herrenschmidt
@ 2013-09-24 13:56 ` Frederic Weisbecker
2013-09-24 20:55 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-24 13:56 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Tue, Sep 24, 2013 at 02:42:57PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-09-24 at 04:44 +0200, Frederic Weisbecker wrote:
> > So the safest way to fix this is to unconditionally call do_softirq()
> > from irq_exit(). A performance penalty may come along, but safety comes
> > first.
> >
> > We should probably do that and work on longer-term solutions (Kconfig-based
> > arch switch, etc.) for the next merge window?
>
> As you prefer, though I'm keen on getting the "fast" version in RHEL7 if
> RH will take it :-)
So what is the fast version? Converting __do_softirq() to do_softirq()
unconditionally?
RH will accept any fix that goes upstream.
> From the generic code POV, it's a one-liner #ifdef to select between
> do_softirq() and __do_softirq(), right? Then it's up to the arch to
> #define I_CAN_DO_FAST!
I'd rather say #define I_CAN_DO_SAFE :)
But I guess the kind of symbol we want is some ARCH_HAS_IRQ_STACK_LOW_HANDLER.
> > I'll respin the series plus the regression fix, unless somebody has a
> > better solution.
>
> Cheers,
> Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 13:56 ` Frederic Weisbecker
@ 2013-09-24 20:55 ` Benjamin Herrenschmidt
2013-09-25 8:46 ` Frederic Weisbecker
0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-24 20:55 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Tue, 2013-09-24 at 15:56 +0200, Frederic Weisbecker wrote:
> On Tue, Sep 24, 2013 at 02:42:57PM +1000, Benjamin Herrenschmidt wrote:
> > On Tue, 2013-09-24 at 04:44 +0200, Frederic Weisbecker wrote:
> > > So the safest way to fix this is to unconditionally call do_softirq()
> > > from irq_exit(). A performance penalty may come along, but safety comes
> > > first.
> > >
> > > We should probably do that and work on longer-term solutions (Kconfig-based
> > > arch switch, etc.) for the next merge window?
> >
> > As you prefer, though I'm keen on getting the "fast" version in RHEL7 if
> > RH will take it :-)
>
> So what is the fast version? Converting __do_softirq() to do_softirq()
> unconditionally?
>
> RH will accept any fix that goes upstream.
No: me fixing powerpc do_IRQ() so that irq_exit() runs on the irq stack,
and your fix for everybody else, with an ifdef such that x86_64 and powerpc
get to skip the additional stack switch.
> > From the generic code POV, it's a one-liner #ifdef to select between
> > do_softirq() and __do_softirq(), right? Then it's up to the arch to
> > #define I_CAN_DO_FAST!
>
> I'd rather say #define I_CAN_DO_SAFE :)
>
> But I guess the kind of symbol we want is some ARCH_HAS_IRQ_STACK_LOW_HANDLER.
ARCH_IRQ_EXIT_ON_IRQ_STACK
Cheers,
Ben.
> > > I'll respin the series plus the regression fix, unless somebody has a
> > > better solution.
> >
> > Cheers,
> > Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
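Sketched out, the generic-code side of what Ben and Frederic converge on would read roughly as below; the eventual upstream symbol spelling and Kconfig plumbing may differ:

static inline void invoke_softirq(void)
{
	if (force_irqthreads) {
		wakeup_softirqd();
		return;
	}
#ifdef ARCH_IRQ_EXIT_ON_IRQ_STACK
	/*
	 * The arch (e.g. x86_64, or powerpc after Ben's do_IRQ fix)
	 * already runs irq_exit() on the irq stack: run softirqs
	 * inline and skip the second stack switch.
	 */
	__do_softirq();
#else
	/* Otherwise switch to the softirq stack. */
	do_softirq_own_stack();
#endif
}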
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-24 20:55 ` Benjamin Herrenschmidt
@ 2013-09-25 8:46 ` Frederic Weisbecker
0 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-25 8:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Linus Torvalds, Thomas Gleixner, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Wed, Sep 25, 2013 at 06:55:47AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2013-09-24 at 15:56 +0200, Frederic Weisbecker wrote:
> > On Tue, Sep 24, 2013 at 02:42:57PM +1000, Benjamin Herrenschmidt wrote:
> > > On Tue, 2013-09-24 at 04:44 +0200, Frederic Weisbecker wrote:
> > > > So the safest way to fix this is to unconditionally call do_softirq()
> > > > from irq_exit(). A performance penalty may come along, but safety comes
> > > > first.
> > > >
> > > > We should probably do that and work on longer-term solutions (Kconfig-based
> > > > arch switch, etc.) for the next merge window?
> > >
> > > As you prefer, though I'm keen on getting the "fast" version in RHEL7 if
> > > RH will take it :-)
> >
> > So what is the fast version? Converting __do_softirq() to do_softirq()
> > unconditionally?
> >
> > RH will accept any fix that goes upstream.
>
> No: me fixing powerpc do_IRQ() so that irq_exit() runs on the irq stack,
> and your fix for everybody else, with an ifdef such that x86_64 and powerpc
> get to skip the additional stack switch.
>
> > > From the generic code POV, it's a one-liner #ifdef to select between
> > > do_softirq() and __do_softirq(), right? Then it's up to the arch to
> > > #define I_CAN_DO_FAST!
> >
> > I'd rather say #define I_CAN_DO_SAFE :)
> >
> > But I guess the kind of symbol we want is some ARCH_HAS_IRQ_STACK_LOW_HANDLER.
>
> ARCH_IRQ_EXIT_ON_IRQ_STACK
Ok, I'll pick this one. Thanks!
> Cheers,
> Ben.
>
> > > > I'll respin the series plus the regression fix, unless somebody has a
> > > > better solution.
> > >
> > > Cheers,
> > > Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [RFC GIT PULL] softirq: Consolidation and stack overrun fix
2013-09-20 16:26 ` Frederic Weisbecker
2013-09-20 17:30 ` Thomas Gleixner
2013-09-20 22:14 ` Linus Torvalds
@ 2013-09-21 0:52 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-21 0:52 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Thomas Gleixner, Linus Torvalds, LKML, Paul Mackerras,
Ingo Molnar, Peter Zijlstra, H. Peter Anvin, James Hogan,
James E.J. Bottomley, Helge Deller, Martin Schwidefsky,
Heiko Carstens, David S. Miller, Andrew Morton
On Fri, 2013-09-20 at 11:26 -0500, Frederic Weisbecker wrote:
> Looks good to me.
>
> Now just for clarity, what do we then do with inline softirq executions:
> on local_bh_enable() for example, or explicit calls to do_softirq() other
> than on irq exit?
We cannot make the irq stack larger than the normal stacks on ppc due to
how we get to the thread_info (by masking low bits of the stack pointer).
However, our stacks are pretty big too (16k), so it might be ok to run the
softirqs on the irq stack; it's just a matter of us doing the switch before
do_IRQ rather than later when calling the handlers. I think we basically
copied what x86 originally did, but we can definitely change that.
> Should we keep the current switch to a different softirq stack? If we have
> a generic irq stack (used for both hard and soft) that is big enough,
> perhaps we can also switch to this generic irq stack for inline softirq
> executions? After all there is not much point in keeping a separate stack
> for that: this results in cache misses if the inline softirq is interrupted
> by a hardirq; also, inlined softirqs can't happen in hardirq, so there
> should be not much risk of overruns.
>
> Or am I missing other things?
Originally IRQs could nest, so we really wanted separate stacks, especially
since softirq is where network processing happens and that can go fairly
deep. But that's not the case anymore, so I suppose it makes some sense...
Ben.
^ permalink raw reply [flat|nested] 52+ messages in thread
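For context, the ppc constraint Ben mentions comes from the classic way of locating thread_info by masking the stack pointer; a sketch of the idiom follows, with the struct layout elided and THREAD_SIZE set as on the 16k configuration above:

struct thread_info;

#define THREAD_SIZE	(16 * 1024)	/* must be a power of two */

static inline struct thread_info *current_thread_info(void)
{
	register unsigned long sp asm("r1");	/* r1 is the ppc stack pointer */

	/*
	 * Only works if every stack we can run on, irq stacks included,
	 * is THREAD_SIZE-sized and THREAD_SIZE-aligned; hence the irq
	 * stack cannot simply be made larger than the normal stacks.
	 */
	return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
}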
* do_softirq() vs __do_softirq() in irq_exit() and stack overflow
@ 2013-09-04 21:39 Benjamin Herrenschmidt
2013-09-05 15:33 ` [RFC PATCH 0/3] irq: Fix stack overflow due to softirq called on current stack Frederic Weisbecker
0 siblings, 1 reply; 52+ messages in thread
From: Benjamin Herrenschmidt @ 2013-09-04 21:39 UTC (permalink / raw)
To: linux-kernel
Cc: tglx, Linus Torvalds, Frederic Weisbecker, davem, Paul Mackerras
Hi Folks!
It appears that the current version of irq_exit() calls __do_softirq()
directly rather than do_softirq(). That means we are going to call the
softirqs in the current interrupt frame rather than on the separate
softirq stack.
The current frame is also still the normal kernel stack, because do_IRQ()
itself only switches to the interrupt stack for processing the handlers
(it's back to the original stack by the time it calls irq_exit).
That means that we end up stacking the normal stack, the actual HW
interrupt stack frame (which can be pretty big on ppc) + do_IRQ's own,
then the softirq (the network stack can create HUGE stack frames) and ...
we are in softirq, so HW irqs are enabled, and we can thus get another irq
stack frame piled on top of that (or a perf stack).
We are observing actual overflows; here's an example blowing up our 16k
stack on ppc64. You'll notice that it's all on the normal kernel stack:
[ 1002.364577] do_IRQ: stack overflow: 1920
[ 1002.364653] CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
[ 1002.364734] Call Trace:
[ 1002.364770] [c0000000050a8740] [c0000000000157c0] .show_stack+0x130/0x200 (unreliable)
[ 1002.364872] [c0000000050a8810] [c00000000082f6d0] .dump_stack+0x28/0x3c
[ 1002.364955] [c0000000050a8880] [c000000000010b98] .do_IRQ+0x2b8/0x2c0
[ 1002.365039] [c0000000050a8930] [c0000000000026d4] hardware_interrupt_common+0x154/0x180
[ 1002.365148] --- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
[ 1002.365148] LR = .cp_start_xmit+0x390/0x820 [8139cp]
[ 1002.365359] [c0000000050a8d40] [c0000000006d7f14] .dev_hard_start_xmit+0x394/0x640
[ 1002.365433] [c0000000050a8e00] [c0000000007028c0] .sch_direct_xmit+0x110/0x260
[ 1002.365499] [c0000000050a8ea0] [c0000000006d8420] .dev_queue_xmit+0x260/0x630
[ 1002.365571] [c0000000050a8f40] [d0000000027d30d4] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
[ 1002.365641] [c0000000050a8fc0] [d0000000027d01f8] .br_dev_xmit+0x198/0x270 [bridge]
[ 1002.365707] [c0000000050a9070] [c0000000006d7f14] .dev_hard_start_xmit+0x394/0x640
[ 1002.365774] [c0000000050a9130] [c0000000006d85e8] .dev_queue_xmit+0x428/0x630
[ 1002.365834] [c0000000050a91d0] [c000000000729764] .ip_finish_output+0x2a4/0x550
[ 1002.365902] [c0000000050a9290] [c00000000072aaf0] .ip_local_out+0x50/0x70
[ 1002.365960] [c0000000050a9310] [c00000000072aed8] .ip_queue_xmit+0x148/0x420
[ 1002.366018] [c0000000050a93b0] [c000000000749524] .tcp_transmit_skb+0x4e4/0xaf0
[ 1002.366085] [c0000000050a94a0] [c00000000073de9c] .__tcp_ack_snd_check+0x7c/0xf0
[ 1002.366152] [c0000000050a9520] [c0000000007451d8] .tcp_rcv_established+0x1e8/0x930
[ 1002.366217] [c0000000050a95f0] [c00000000075326c] .tcp_v4_do_rcv+0x21c/0x570
[ 1002.366274] [c0000000050a96c0] [c000000000754a44] .tcp_v4_rcv+0x734/0x930
[ 1002.366332] [c0000000050a97a0] [c000000000724144] .ip_local_deliver_finish+0x184/0x360
[ 1002.366398] [c0000000050a9840] [c000000000724468] .ip_rcv_finish+0x148/0x400
[ 1002.366457] [c0000000050a98d0] [c0000000006d3248] .__netif_receive_skb_core+0x4f8/0xb00
[ 1002.366523] [c0000000050a99d0] [c0000000006d5414]
.netif_receive_skb+0x44/0x110 [ 1002.366594] [c0000000050a9a70] [d0000000027d4e2c] .br_handle_frame_finish+0x2bc/0x3f0 [bridge] [ 1002.366674] [c0000000050a9b20] [d0000000027de5ac] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge] [ 1002.366754] [c0000000050a9bd0] [d0000000027df5ec] .br_nf_pre_routing+0x4dc/0x7d0 [bridge] [ 1002.366820] [c0000000050a9c70] [c000000000717aa4] .nf_iterate+0x114/0x130 [ 1002.366877] [c0000000050a9d30] [c000000000717b74] .nf_hook_slow+0xb4/0x1e0 [ 1002.366938] [c0000000050a9e00] [d0000000027d51f0] .br_handle_frame+0x290/0x330 [bridge] [ 1002.367005] [c0000000050a9ea0] [c0000000006d309c] .__netif_receive_skb_core+0x34c/0xb00 [ 1002.367072] [c0000000050a9fa0] [c0000000006d5414] .netif_receive_skb+0x44/0x110 [ 1002.367138] [c0000000050aa040] [c0000000006d6218] .napi_gro_receive+0xe8/0x120 [ 1002.367210] [c0000000050aa0c0] [d00000000208536c] .cp_rx_poll+0x31c/0x590 [8139cp] [ 1002.367276] [c0000000050aa1d0] [c0000000006d59cc] .net_rx_action+0x1dc/0x310 [ 1002.367335] [c0000000050aa2b0] [c0000000000a0568] .__do_softirq+0x158/0x330 [ 1002.367394] [c0000000050aa3b0] [c0000000000a0978] .irq_exit+0xc8/0x110 [ 1002.367452] [c0000000050aa430] [c0000000000109bc] .do_IRQ+0xdc/0x2c0 [ 1002.367510] [c0000000050aa4e0] [c0000000000026d4] hardware_interrupt_common+0x154/0x180 [ 1002.367580] --- Exception: 501 at .bad_range+0x1c/0x110 [ 1002.367580] LR = .get_page_from_freelist+0x908/0xbb0 [ 1002.367658] [c0000000050aa7d0] [c00000000041d758] .list_del+0x18/0x50 (unreliable) [ 1002.367725] [c0000000050aa850] [c0000000001bfa98] .get_page_from_freelist+0x908/0xbb0 [ 1002.367792] [c0000000050aa9e0] [c0000000001bff5c] .__alloc_pages_nodemask+0x21c/0xae0 [ 1002.367860] [c0000000050aaba0] [c0000000002126d0] .alloc_pages_vma+0xd0/0x210 [ 1002.367918] [c0000000050aac60] [c0000000001e93f4] .handle_pte_fault+0x814/0xb70 [ 1002.367985] [c0000000050aad50] [c0000000001eade4] .__get_user_pages+0x1a4/0x640 [ 1002.368052] [c0000000050aae60] [c00000000004606c] .get_user_pages_fast+0xec/0x160 [ 1002.368130] [c0000000050aaf10] [d000000001f73930] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm] [ 1002.368205] [c0000000050aafd0] [d000000001f7e214] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm] [ 1002.368280] [c0000000050ab070] [d000000001f8a824] .kvmppc_mmu_map_page+0x94/0x530 [kvm] [ 1002.368354] [c0000000050ab190] [d000000001f85064] .kvmppc_handle_pagefault+0x174/0x610 [kvm] [ 1002.368429] [c0000000050ab270] [d000000001f85b74] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm] [ 1002.368504] [c0000000050ab320] [d000000001f88ec4] kvm_start_lightweight+0x1ec/0x1fc [kvm] [ 1002.368578] [c0000000050ab4f0] [d000000001f86a58] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm] [ 1002.368652] [c0000000050ab9c0] [d000000001f7f218] .kvmppc_vcpu_run+0xc8/0xf0 [kvm] [ 1002.368725] [c0000000050aba50] [d000000001f7bdac] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm] [ 1002.368797] [c0000000050abae0] [d000000001f74618] .kvm_vcpu_ioctl+0x478/0x730 [kvm] [ 1002.368865] [c0000000050abc90] [c00000000025302c] .do_vfs_ioctl+0x4ec/0x7c0 [ 1002.368923] [c0000000050abd80] [c0000000002533d4] .SyS_ioctl+0xd4/0xf0 [ 1002.368981] [c0000000050abe30] [c000000000009ed4] syscall_exit+0x0/0x98 [ 1002.369117] ------------[ cut here ]------------ Cheers, Ben. ^ permalink raw reply [flat|nested] 52+ messages in thread
* [RFC PATCH 0/3] irq: Fix stack overflow due to softirq called on current stack
2013-09-04 21:39 do_softirq() vs __do_softirq() in irq_exit() and stack overflow Benjamin Herrenschmidt
@ 2013-09-05 15:33 ` Frederic Weisbecker
2013-09-05 15:33 ` [PATCH 1/3] irq: Consolidate do_softirq() arch overriden implementations Frederic Weisbecker
0 siblings, 1 reply; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-05 15:33 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
Ingo Molnar, Thomas Gleixner, Peter Zijlstra, H. Peter Anvin,
Linus Torvalds, James Hogan, James E.J. Bottomley, Helge Deller,
Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton
Hi,
This series is a proposal to fix the crash reported here:
http://lkml.kernel.org/r/1378330796.4321.50.camel%40pasglop
It also has the upside of consolidating the arch-overridden do_softirq()
implementations a bit.
Only tested on x86-64 for now.
Thanks.
Frederic Weisbecker (3):
irq: Consolidate do_softirq() arch overriden implementations
irq: Execute softirq on its own stack on irq exit
irq: Comment on the use of inline stack for ksoftirqd
arch/metag/kernel/irq.c | 56 +++++++++++++++++-------------------
arch/parisc/kernel/irq.c | 17 +----------
arch/powerpc/kernel/irq.c | 17 +----------
arch/s390/kernel/irq.c | 52 +++++++++++++++----------------------
arch/sh/kernel/irq.c | 60 ++++++++++++++++++-------------------------
arch/sparc/kernel/irq_64.c | 31 +++++++---------------
arch/x86/kernel/irq_32.c | 34 +++++++++----------------
arch/x86/kernel/irq_64.c | 18 ++-----------
include/linux/interrupt.h | 11 ++++++++
kernel/softirq.c | 10 +++----
10 files changed, 112 insertions(+), 194 deletions(-)
--
1.7.5.4
^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH 1/3] irq: Consolidate do_softirq() arch overriden implementations
2013-09-05 15:33 ` [RFC PATCH 0/3] irq: Fix stack overflow due to softirq called on current stack Frederic Weisbecker
@ 2013-09-05 15:33 ` Frederic Weisbecker
0 siblings, 0 replies; 52+ messages in thread
From: Frederic Weisbecker @ 2013-09-05 15:33 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
Ingo Molnar, Thomas Gleixner, Peter Zijlstra, H. Peter Anvin,
Linus Torvalds, James Hogan, James E.J. Bottomley, Helge Deller,
Martin Schwidefsky, Heiko Carstens, David S. Miller, Andrew Morton
All archs' overridden implementations of do_softirq() do the same:
disable irqs, check if there are softirqs pending, then execute
__do_softirq() on a specific stack.
Consolidate the common parts such that archs only worry about the
stack switch.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@au1.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
arch/metag/kernel/irq.c | 56 +++++++++++++++++------------------
arch/parisc/kernel/irq.c | 17 +----------
arch/powerpc/kernel/irq.c | 17 +----------
arch/s390/kernel/irq.c | 52 +++++++++++++++----------------------
arch/sh/kernel/irq.c | 60 ++++++++++++++++++-------------------------
arch/sparc/kernel/irq_64.c | 31 +++++++---------------
arch/x86/kernel/irq_32.c | 34 +++++++++----------------
arch/x86/kernel/irq_64.c | 18 ++-----------
include/linux/interrupt.h | 11 ++++++++
kernel/softirq.c | 7 +---
10 files changed, 110 insertions(+), 193 deletions(-)
diff --git a/arch/metag/kernel/irq.c b/arch/metag/kernel/irq.c
index 2a2c9d5..12a12b4 100644
--- a/arch/metag/kernel/irq.c
+++ b/arch/metag/kernel/irq.c
@@ -159,44 +159,34 @@ void irq_ctx_exit(int cpu)
 
 extern asmlinkage void __do_softirq(void);
 
-asmlinkage void do_softirq(void)
+void do_softirq_own_stack(void)
 {
-	unsigned long flags;
 	struct thread_info *curctx;
 	union irq_ctx *irqctx;
 	u32 *isp;
 
-	if (in_interrupt())
-		return;
-
-	local_irq_save(flags);
-
-	if (local_softirq_pending()) {
-		curctx = current_thread_info();
-		irqctx = softirq_ctx[smp_processor_id()];
-		irqctx->tinfo.task = curctx->task;
-
-		/* build the stack frame on the softirq stack */
-		isp = (u32 *) ((char *)irqctx + sizeof(struct thread_info));
-
-		asm volatile (
-			"MOV   D0.5,%0\n"
-			"SWAP  A0StP,D0.5\n"
-			"CALLR D1RtP,___do_softirq\n"
-			"MOV   A0StP,D0.5\n"
-			:
-			: "r" (isp)
-			: "memory", "cc", "D1Ar1", "D0Ar2", "D1Ar3", "D0Ar4",
-			  "D1Ar5", "D0Ar6", "D0Re0", "D1Re0", "D0.4", "D1RtP",
-			  "D0.5"
-			);
-		/*
-		 * Shouldn't happen, we returned above if in_interrupt():
-		 */
-		WARN_ON_ONCE(softirq_count());
-	}
-
-	local_irq_restore(flags);
+	curctx = current_thread_info();
+	irqctx = softirq_ctx[smp_processor_id()];
+	irqctx->tinfo.task = curctx->task;
+
+	/* build the stack frame on the softirq stack */
+	isp = (u32 *) ((char *)irqctx + sizeof(struct thread_info));
+
+	asm
volatile ( + "MOV D0.5,%0\n" + "SWAP A0StP,D0.5\n" + "CALLR D1RtP,___do_softirq\n" + "MOV A0StP,D0.5\n" + : + : "r" (isp) + : "memory", "cc", "D1Ar1", "D0Ar2", "D1Ar3", "D0Ar4", + "D1Ar5", "D0Ar6", "D0Re0", "D1Re0", "D0.4", "D1RtP", + "D0.5" + ); + /* + * Shouldn't happen, we returned above if in_interrupt(): + */ + WARN_ON_ONCE(softirq_count()); } #endif diff --git a/arch/parisc/kernel/irq.c b/arch/parisc/kernel/irq.c index 2e6443b..ef59276 100644 --- a/arch/parisc/kernel/irq.c +++ b/arch/parisc/kernel/irq.c @@ -499,22 +499,9 @@ static void execute_on_irq_stack(void *func, unsigned long param1) *irq_stack_in_use = 1; } -asmlinkage void do_softirq(void) +void do_softirq_own_stack(void) { - __u32 pending; - unsigned long flags; - - if (in_interrupt()) - return; - - local_irq_save(flags); - - pending = local_softirq_pending(); - - if (pending) - execute_on_irq_stack(__do_softirq, 0); - - local_irq_restore(flags); + execute_on_irq_stack(__do_softirq, 0); } #endif /* CONFIG_IRQSTACKS */ diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c index c69440c..7d0da88 100644 --- a/arch/powerpc/kernel/irq.c +++ b/arch/powerpc/kernel/irq.c @@ -601,7 +601,7 @@ void irq_ctx_init(void) } } -static inline void do_softirq_onstack(void) +void do_softirq_own_stack(void) { struct thread_info *curtp, *irqtp; unsigned long saved_sp_limit = current->thread.ksp_limit; @@ -623,21 +623,6 @@ static inline void do_softirq_onstack(void) set_bits(irqtp->flags, &curtp->flags); } -void do_softirq(void) -{ - unsigned long flags; - - if (in_interrupt()) - return; - - local_irq_save(flags); - - if (local_softirq_pending()) - do_softirq_onstack(); - - local_irq_restore(flags); -} - irq_hw_number_t virq_to_hw(unsigned int virq) { struct irq_data *irq_data = irq_get_irq_data(virq); diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c index 54b0995..c289daa 100644 --- a/arch/s390/kernel/irq.c +++ b/arch/s390/kernel/irq.c @@ -124,39 +124,29 @@ skip_arch_irqs: /* * Switch to the asynchronous interrupt stack for softirq execution. */ -asmlinkage void do_softirq(void) +void do_softirq_own_stack(void) { - unsigned long flags, old, new; - - if (in_interrupt()) - return; - - local_irq_save(flags); - - if (local_softirq_pending()) { - /* Get current stack pointer. */ - asm volatile("la %0,0(15)" : "=a" (old)); - /* Check against async. stack address range. */ - new = S390_lowcore.async_stack; - if (((new - old) >> (PAGE_SHIFT + THREAD_ORDER)) != 0) { - /* Need to switch to the async. stack. */ - new -= STACK_FRAME_OVERHEAD; - ((struct stack_frame *) new)->back_chain = old; - - asm volatile(" la 15,0(%0)\n" - " basr 14,%2\n" - " la 15,0(%1)\n" - : : "a" (new), "a" (old), - "a" (__do_softirq) - : "0", "1", "2", "3", "4", "5", "14", - "cc", "memory" ); - } else { - /* We are already on the async stack. */ - __do_softirq(); - } + unsigned long old, new; + + /* Get current stack pointer. */ + asm volatile("la %0,0(15)" : "=a" (old)); + /* Check against async. stack address range. */ + new = S390_lowcore.async_stack; + if (((new - old) >> (PAGE_SHIFT + THREAD_ORDER)) != 0) { + /* Need to switch to the async. stack. */ + new -= STACK_FRAME_OVERHEAD; + ((struct stack_frame *) new)->back_chain = old; + asm volatile(" la 15,0(%0)\n" + " basr 14,%2\n" + " la 15,0(%1)\n" + : : "a" (new), "a" (old), + "a" (__do_softirq) + : "0", "1", "2", "3", "4", "5", "14", + "cc", "memory" ); + } else { + /* We are already on the async stack. 
*/ + __do_softirq(); } - - local_irq_restore(flags); } #ifdef CONFIG_PROC_FS diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c index 063af10..50ecd48 100644 --- a/arch/sh/kernel/irq.c +++ b/arch/sh/kernel/irq.c @@ -149,47 +149,37 @@ void irq_ctx_exit(int cpu) hardirq_ctx[cpu] = NULL; } -asmlinkage void do_softirq(void) +void do_softirq_own_stack(void) { - unsigned long flags; struct thread_info *curctx; union irq_ctx *irqctx; u32 *isp; - if (in_interrupt()) - return; - - local_irq_save(flags); + curctx = current_thread_info(); + irqctx = softirq_ctx[smp_processor_id()]; + irqctx->tinfo.task = curctx->task; + irqctx->tinfo.previous_sp = current_stack_pointer; + + /* build the stack frame on the softirq stack */ + isp = (u32 *)((char *)irqctx + sizeof(*irqctx)); + + __asm__ __volatile__ ( + "mov r15, r9 \n" + "jsr @%0 \n" + /* switch to the softirq stack */ + " mov %1, r15 \n" + /* restore the thread stack */ + "mov r9, r15 \n" + : /* no outputs */ + : "r" (__do_softirq), "r" (isp) + : "memory", "r0", "r1", "r2", "r3", "r4", + "r5", "r6", "r7", "r8", "r9", "r15", "t", "pr" + ); - if (local_softirq_pending()) { - curctx = current_thread_info(); - irqctx = softirq_ctx[smp_processor_id()]; - irqctx->tinfo.task = curctx->task; - irqctx->tinfo.previous_sp = current_stack_pointer; - - /* build the stack frame on the softirq stack */ - isp = (u32 *)((char *)irqctx + sizeof(*irqctx)); - - __asm__ __volatile__ ( - "mov r15, r9 \n" - "jsr @%0 \n" - /* switch to the softirq stack */ - " mov %1, r15 \n" - /* restore the thread stack */ - "mov r9, r15 \n" - : /* no outputs */ - : "r" (__do_softirq), "r" (isp) - : "memory", "r0", "r1", "r2", "r3", "r4", - "r5", "r6", "r7", "r8", "r9", "r15", "t", "pr" - ); - - /* - * Shouldn't happen, we returned above if in_interrupt(): - */ - WARN_ON_ONCE(softirq_count()); - } - - local_irq_restore(flags); + /* + * Shouldn't happen, we returned above if in_interrupt(): + */ + WARN_ON_ONCE(softirq_count()); } #else static inline void handle_one_irq(unsigned int irq) diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c index d4840ce..666193f 100644 --- a/arch/sparc/kernel/irq_64.c +++ b/arch/sparc/kernel/irq_64.c @@ -698,30 +698,19 @@ void __irq_entry handler_irq(int pil, struct pt_regs *regs) set_irq_regs(old_regs); } -void do_softirq(void) +void do_softirq_own_stack(void) { - unsigned long flags; - - if (in_interrupt()) - return; - - local_irq_save(flags); + void *orig_sp, *sp = softirq_stack[smp_processor_id()]; - if (local_softirq_pending()) { - void *orig_sp, *sp = softirq_stack[smp_processor_id()]; - - sp += THREAD_SIZE - 192 - STACK_BIAS; - - __asm__ __volatile__("mov %%sp, %0\n\t" - "mov %1, %%sp" - : "=&r" (orig_sp) - : "r" (sp)); - __do_softirq(); - __asm__ __volatile__("mov %0, %%sp" - : : "r" (orig_sp)); - } + sp += THREAD_SIZE - 192 - STACK_BIAS; - local_irq_restore(flags); + __asm__ __volatile__("mov %%sp, %0\n\t" + "mov %1, %%sp" + : "=&r" (orig_sp) + : "r" (sp)); + __do_softirq(); + __asm__ __volatile__("mov %0, %%sp" + : : "r" (orig_sp)); } #ifdef CONFIG_HOTPLUG_CPU diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c index 4186755..1036f03 100644 --- a/arch/x86/kernel/irq_32.c +++ b/arch/x86/kernel/irq_32.c @@ -149,35 +149,25 @@ void irq_ctx_init(int cpu) cpu, per_cpu(hardirq_ctx, cpu), per_cpu(softirq_ctx, cpu)); } -asmlinkage void do_softirq(void) +void do_softirq_own_stack(void) { - unsigned long flags; struct thread_info *curctx; union irq_ctx *irqctx; u32 *isp; - if (in_interrupt()) - return; - - 
local_irq_save(flags); - - if (local_softirq_pending()) { - curctx = current_thread_info(); - irqctx = __this_cpu_read(softirq_ctx); - irqctx->tinfo.task = curctx->task; - irqctx->tinfo.previous_esp = current_stack_pointer; - - /* build the stack frame on the softirq stack */ - isp = (u32 *) ((char *)irqctx + sizeof(*irqctx)); + curctx = current_thread_info(); + irqctx = __this_cpu_read(softirq_ctx); + irqctx->tinfo.task = curctx->task; + irqctx->tinfo.previous_esp = current_stack_pointer; - call_on_stack(__do_softirq, isp); - /* - * Shouldn't happen, we returned above if in_interrupt(): - */ - WARN_ON_ONCE(softirq_count()); - } + /* build the stack frame on the softirq stack */ + isp = (u32 *) ((char *)irqctx + sizeof(*irqctx)); - local_irq_restore(flags); + call_on_stack(__do_softirq, isp); + /* + * Shouldn't happen, we returned above if in_interrupt(): + */ + WARN_ON_ONCE(softirq_count()); } bool handle_irq(unsigned irq, struct pt_regs *regs) diff --git a/arch/x86/kernel/irq_64.c b/arch/x86/kernel/irq_64.c index d04d3ec..8dd8c96 100644 --- a/arch/x86/kernel/irq_64.c +++ b/arch/x86/kernel/irq_64.c @@ -91,20 +91,8 @@ bool handle_irq(unsigned irq, struct pt_regs *regs) extern void call_softirq(void); -asmlinkage void do_softirq(void) +void do_softirq_own_stack(void) { - __u32 pending; - unsigned long flags; - - if (in_interrupt()) - return; - - local_irq_save(flags); - pending = local_softirq_pending(); - /* Switch to interrupt stack */ - if (pending) { - call_softirq(); - WARN_ON_ONCE(softirq_count()); - } - local_irq_restore(flags); + call_softirq(); + WARN_ON_ONCE(softirq_count()); } diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index 5fa5afe..6554954 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -19,6 +19,7 @@ #include <linux/atomic.h> #include <asm/ptrace.h> +#include <asm/irq.h> /* * These correspond to the IORESOURCE_IRQ_* defines in @@ -443,6 +444,16 @@ struct softirq_action asmlinkage void do_softirq(void); asmlinkage void __do_softirq(void); + +#ifdef __ARCH_HAS_DO_SOFTIRQ +void do_softirq_own_stack(void); +#else +static inline void do_softirq_own_stack(void) +{ + __do_softirq(); +} +#endif + extern void open_softirq(int nr, void (*action)(struct softirq_action *)); extern void softirq_init(void); extern void __raise_softirq_irqoff(unsigned int nr); diff --git a/kernel/softirq.c b/kernel/softirq.c index be3d351..39d27ff 100644 --- a/kernel/softirq.c +++ b/kernel/softirq.c @@ -29,7 +29,6 @@ #define CREATE_TRACE_POINTS #include <trace/events/irq.h> -#include <asm/irq.h> /* - No shared variables, all the data are CPU local. - If a softirq needs serialization, let it serialize itself @@ -283,7 +282,7 @@ restart: tsk_restore_flags(current, old_flags, PF_MEMALLOC); } -#ifndef __ARCH_HAS_DO_SOFTIRQ + asmlinkage void do_softirq(void) { @@ -298,13 +297,11 @@ asmlinkage void do_softirq(void) pending = local_softirq_pending(); if (pending) - __do_softirq(); + do_softirq_own_stack(); local_irq_restore(flags); } -#endif - /* * Enter an interrupt context. */ -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 52+ messages in thread