* Serial Oopsen caused by global IRQ chanes
@ 2002-07-27 18:11 Russell King
2002-07-27 18:19 ` Russell King
2002-07-27 18:43 ` [patch] " Ingo Molnar
0 siblings, 2 replies; 7+ messages in thread
From: Russell King @ 2002-07-27 18:11 UTC (permalink / raw)
To: linux-kernel; +Cc: mingo
Hi,
Two people have now reported to me a couple of oopsen which appear to be
caused by a change in 2.5.29 to synchronize_irq(), which I believe has
made synchronize_irq() useless.
In effect, we no longer guarantee that any IRQ handlers for a particular
IRQ will have finished running by the time free_irq() returns. So, code
which has:
int bar;
int *foo = &bar;
irq_handler()
{
*foo = 0;
}
void module_exit(void)
{
free_irq(irq, NULL);
foo = NULL;
}
is currently broken in two ways:
1. it's possible for irq_handler to dereference foo on another CPU _after_
free_irq has returned.
2. it's possible for the module to be unloaded while the irq_handler is
still running on another CPU.
Would someone else (Ingo?) like to comment on the above please?
The serial code regularly trips up because of this on SMP boxen.
Thanks.
--
Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 18:11 Serial Oopsen caused by global IRQ chanes Russell King @ 2002-07-27 18:19 ` Russell King 2002-07-27 18:21 ` Ingo Molnar 2002-07-27 18:43 ` [patch] " Ingo Molnar 1 sibling, 1 reply; 7+ messages in thread From: Russell King @ 2002-07-27 18:19 UTC (permalink / raw) To: linux-kernel; +Cc: mingo On Sat, Jul 27, 2002 at 07:11:19PM +0100, Russell King wrote: > Two people have now reported to me a couple of oopsen which appear to be > caused by a change in 2.5.29 to synchronize_irq(), which I believe has > made synchronize_irq() useless. Sorry, not useless. Just free_irq extremely buggy. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 18:19 ` Russell King @ 2002-07-27 18:21 ` Ingo Molnar 0 siblings, 0 replies; 7+ messages in thread From: Ingo Molnar @ 2002-07-27 18:21 UTC (permalink / raw) To: Russell King; +Cc: linux-kernel, mingo On Sat, 27 Jul 2002, Russell King wrote: > > Two people have now reported to me a couple of oopsen which appear to be > > caused by a change in 2.5.29 to synchronize_irq(), which I believe has > > made synchronize_irq() useless. > > Sorry, not useless. Just free_irq extremely buggy. i found the bug, will send a patch soon. Ingo ^ permalink raw reply [flat|nested] 7+ messages in thread
* [patch] Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 18:11 Serial Oopsen caused by global IRQ chanes Russell King 2002-07-27 18:19 ` Russell King @ 2002-07-27 18:43 ` Ingo Molnar 2002-07-27 22:36 ` William Lee Irwin III 1 sibling, 1 reply; 7+ messages in thread From: Ingo Molnar @ 2002-07-27 18:43 UTC (permalink / raw) To: Russell King; +Cc: linux-kernel, Linus Torvalds the attached patch fixes a synchronize_irq() bug: if the interrupt is freed while an IRQ handler is running (irq state is IRQ_INPROGRESS) then synchronize_irq() will return early, which is incorrect. there was another do_IRQ() bug that in fact necessiated the bad code that caused the synchronize_irq() bug - we kept the IRQ_INPROGRESS bit set for not active interrupt sources - after they happen for the first time. Now the only effect this has is on i8259A irq handling - we used to keep these irqs disabled after the first 'spurious' interrupt happened. Now what the i8259A code really wants to do IMO is to keep the interrupt disabled if there is no handler defined for that interrupt source. The patch adds exactly this. I dont remember why this was needed in the first place (irq probing? avoidance of interrupt storms?), but with the patch the behavior should be equivalent. Ingo --- linux/arch/i386/kernel/irq.c.orig2 Sat Jul 27 20:28:05 2002 +++ linux/arch/i386/kernel/irq.c Sat Jul 27 20:31:58 2002 @@ -187,10 +187,6 @@ #if CONFIG_SMP inline void synchronize_irq(unsigned int irq) { - /* is there anything to synchronize with? */ - if (!irq_desc[irq].action) - return; - while (irq_desc[irq].status & IRQ_INPROGRESS) cpu_relax(); } @@ -350,7 +346,7 @@ * use the action we have. */ action = NULL; - if (!(status & (IRQ_DISABLED | IRQ_INPROGRESS))) { + if (likely(!(status & (IRQ_DISABLED | IRQ_INPROGRESS)))) { action = desc->action; status &= ~IRQ_PENDING; /* we commit to handling */ status |= IRQ_INPROGRESS; /* we are handling it */ @@ -363,7 +359,7 @@ a different instance of this same irq, the other processor will take care of it. */ - if (!action) + if (unlikely(!action)) goto out; /* @@ -381,12 +377,12 @@ handle_IRQ_event(irq, ®s, action); spin_lock(&desc->lock); - if (!(desc->status & IRQ_PENDING)) + if (likely(!(desc->status & IRQ_PENDING))) break; desc->status &= ~IRQ_PENDING; } - desc->status &= ~IRQ_INPROGRESS; out: + desc->status &= ~IRQ_INPROGRESS; /* * The ->end() handler has to deal with interrupts which got * disabled while the handler was running. --- linux/arch/i386/kernel/i8259.c.orig2 Sat Jul 27 20:40:11 2002 +++ linux/arch/i386/kernel/i8259.c Sat Jul 27 20:42:44 2002 @@ -38,7 +38,8 @@ static void end_8259A_irq (unsigned int irq) { - if (!(irq_desc[irq].status & (IRQ_DISABLED|IRQ_INPROGRESS))) + if (!(irq_desc[irq].status & (IRQ_DISABLED|IRQ_INPROGRESS)) && + irq_desc[irq].action) enable_8259A_irq(irq); } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 18:43 ` [patch] " Ingo Molnar @ 2002-07-27 22:36 ` William Lee Irwin III 2002-07-27 23:01 ` William Lee Irwin III 0 siblings, 1 reply; 7+ messages in thread From: William Lee Irwin III @ 2002-07-27 22:36 UTC (permalink / raw) To: Ingo Molnar; +Cc: Russell King, linux-kernel, Linus Torvalds On Sat, Jul 27, 2002 at 08:43:04PM +0200, Ingo Molnar wrote: > the attached patch fixes a synchronize_irq() bug: if the interrupt is > freed while an IRQ handler is running (irq state is IRQ_INPROGRESS) then > synchronize_irq() will return early, which is incorrect. > there was another do_IRQ() bug that in fact necessiated the bad code that > caused the synchronize_irq() bug - we kept the IRQ_INPROGRESS bit set for > not active interrupt sources - after they happen for the first time. Now > the only effect this has is on i8259A irq handling - we used to keep these > irqs disabled after the first 'spurious' interrupt happened. Now what the > i8259A code really wants to do IMO is to keep the interrupt disabled if > there is no handler defined for that interrupt source. The patch adds > exactly this. I dont remember why this was needed in the first place (irq > probing? avoidance of interrupt storms?), but with the patch the behavior > should be equivalent. I'm having trouble with this one, I seem to get lots of these messages: pu: 12, clocks: 99983, slice: 3029 CPU12<T0:99968,T1:60576,D:15,S:3029,C:99983> CPU 12 IS NOW UP! Bringing up 13 cpu: 13, clocks: 99983, slice: 3029 CPU13<T0:99968,T1:57552,D:10,S:3029,C:99983> CPU 13 IS NOW UP! Bringing up 14 cpu: 14, clocks: 99983, slice: 3029 CPU14<T0:99968,T1:54528,D:5,S:3029,C:99983> CPU 14 IS NOW UP! Bringing up 15 cpu: 15, clocks: 99983, slice: 3029 CPU15<T0:99968,T1:51504,D:0,S:3029,C:99983> CPU 15 IS NOW UP! CPUS done 4294967295 Linux NET4.0 for Linux 2.4 ... and then the kernel deadlocks after free_initmem()'s printk(). Cheers, Bill ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 22:36 ` William Lee Irwin III @ 2002-07-27 23:01 ` William Lee Irwin III 2002-07-27 23:27 ` William Lee Irwin III 0 siblings, 1 reply; 7+ messages in thread From: William Lee Irwin III @ 2002-07-27 23:01 UTC (permalink / raw) To: Ingo Molnar, Russell King, linux-kernel, Linus Torvalds On Sat, Jul 27, 2002 at 08:43:04PM +0200, Ingo Molnar wrote: >> the attached patch fixes a synchronize_irq() bug: if the interrupt is >> freed while an IRQ handler is running (irq state is IRQ_INPROGRESS) then On Sat, Jul 27, 2002 at 03:36:17PM -0700, William Lee Irwin III wrote: > I'm having trouble with this one, I seem to get lots of these messages: > pu: 12, clocks: 99983, slice: 3029 > CPU12<T0:99968,T1:60576,D:15,S:3029,C:99983> > CPU 12 IS NOW UP! Sorry, this is the hotplug stuff which surprised me, but is unrelated to any deadlock (unless it's printk'ing at a bad time, which I doubt). Thought it was an arch complaint about trying to wake already-woken cpus. On Sat, Jul 27, 2002 at 03:36:17PM -0700, William Lee Irwin III wrote: > ... and then the kernel deadlocks after free_initmem()'s printk(). Seems to be timing related, printk's in init/main.c made it go away. ISTR hearing something about hotplug merge troubles, I'll investigate that thread and report results with the fixes posted there. Cheers, Bill ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch] Re: Serial Oopsen caused by global IRQ chanes 2002-07-27 23:01 ` William Lee Irwin III @ 2002-07-27 23:27 ` William Lee Irwin III 0 siblings, 0 replies; 7+ messages in thread From: William Lee Irwin III @ 2002-07-27 23:27 UTC (permalink / raw) To: Ingo Molnar, Russell King, linux-kernel, Linus Torvalds On Sat, Jul 27, 2002 at 08:43:04PM +0200, Ingo Molnar wrote: >>> the attached patch fixes a synchronize_irq() bug: if the interrupt is >>> freed while an IRQ handler is running (irq state is IRQ_INPROGRESS) then In combination with Rusty's hotplug fixes, and your additional diff atop that, this successfully allows my 16-way to boot. Let me know if there's anything else you'd like me to test. Thanks, Bill ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2002-07-27 23:24 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-07-27 18:11 Serial Oopsen caused by global IRQ chanes Russell King 2002-07-27 18:19 ` Russell King 2002-07-27 18:21 ` Ingo Molnar 2002-07-27 18:43 ` [patch] " Ingo Molnar 2002-07-27 22:36 ` William Lee Irwin III 2002-07-27 23:01 ` William Lee Irwin III 2002-07-27 23:27 ` William Lee Irwin III
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox