* [parisc-linux] irq.c patch to fix lockups on recent kernels
@ 2002-05-15 13:47 Randolph Chung
2002-05-16 20:40 ` Grant Grundler
0 siblings, 1 reply; 4+ messages in thread
From: Randolph Chung @ 2002-05-15 13:47 UTC (permalink / raw)
To: parisc-linux
We have been seeing fairly frequent lockups while doing heavy I/O on
recent kernels. Paul Bame isolated this to the 2.4.18-pa16->pa17 patch
to irq.c that was put in place to fix an xtime_lock deadlock.
Here is a patch that partially reverts that patch but still fixes the
xtime_lock problem, as well as avoiding the I/O hangs. After talking to
Grant about this, I'm not convinced this is the right fix. It seems like
do_cpu_irq_mask is already called with eiem masked, so I'm not sure why
masking it again might make a difference.... we might just be masking
(no pun intended) another bug....
I've run a kernel with this patch on an SMP a500 overnight while doing
lots of I/O ... seems to be ok. The previous -pa2[1234] kernels will
lock up in <10 minutes...
Can someone more familiar with this part of the kernel please take a
look?
randolph
--
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/
===================================================================
RCS file: /var/cvs/linux/arch/parisc/kernel/irq.c,v
retrieving revision 1.53
diff -u -p -r1.53 irq.c
--- irq.c 2002/04/13 22:12:27 1.53
+++ irq.c 2002/05/15 13:35:06
@@ -382,6 +382,7 @@ void do_irq(struct irqaction *action, in
void do_cpu_irq_mask(unsigned long mask, struct irq_region *region, struct pt_regs *regs)
{
unsigned long bit;
+ unsigned long orig_eiem;
int irq;
#ifdef DEBUG_IRQ
@@ -401,6 +402,9 @@ void do_cpu_irq_mask(unsigned long mask,
* Keeping PSW_I disabled avoids this.
*/
+ orig_eiem = get_eiem();
+ set_eiem(orig_eiem & ~mask);
+
for (bit = (1L<<MAX_CPU_IRQ), irq = 0; mask && bit; bit>>=1, irq++) {
int irq_num;
if (!(bit&mask))
@@ -410,9 +414,10 @@ void do_cpu_irq_mask(unsigned long mask,
irq_num = region->data.irqbase + irq;
do_irq(&region->action[irq], irq_num, regs);
}
+ set_eiem(orig_eiem);
/* Leave with PSW_I bit set */
- local_irq_enable();
+ local_irq_enable();
}
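[Editor's sketch] The shape of the patched function can be modelled in user-space C to make the save/mask/restore sequence easier to follow. This is only a toy model under stated assumptions: the `model_*` names are hypothetical, EIEM is an ordinary variable rather than a control register, a 32-bit register with IRQ 0 at the most significant bit is assumed, and the `mask &= ~bit` step sits in the part of the loop that the diff context does not show.

```c
#include <stdint.h>

/* Assumption: 32-bit interrupt registers, IRQ 0 is the most significant bit. */
#define MODEL_MAX_CPU_IRQ 31

static uint32_t model_eiem = 0xffffffffu; /* stand-in for the EIEM register */
static uint32_t model_eiem_seen[32];      /* EIEM value each handler observed */
static int      model_handled[32];        /* IRQ numbers in dispatch order */
static int      model_nhandled;

static uint32_t model_get_eiem(void)       { return model_eiem; }
static void     model_set_eiem(uint32_t v) { model_eiem = v; }

/* Hypothetical handler: records which IRQ ran and what EIEM looked like. */
static void model_do_irq(int irq)
{
    if (model_nhandled < 32) {
        model_eiem_seen[model_nhandled] = model_get_eiem();
        model_handled[model_nhandled++] = irq;
    }
}

/* Mirrors the control flow of the patched do_cpu_irq_mask(). */
static void model_do_cpu_irq_mask(uint32_t mask)
{
    uint32_t bit;
    uint32_t orig_eiem;
    int irq;

    /* Disable, in EIEM, exactly the interrupts we are about to service. */
    orig_eiem = model_get_eiem();
    model_set_eiem(orig_eiem & ~mask);

    for (bit = (uint32_t)1 << MODEL_MAX_CPU_IRQ, irq = 0;
         mask && bit; bit >>= 1, irq++) {
        if (!(bit & mask))
            continue;
        mask &= ~bit;      /* assumed: lets the loop exit early */
        model_do_irq(irq);
    }

    /* Re-enable them once every handler has run. */
    model_set_eiem(orig_eiem);
}
```

Calling `model_do_cpu_irq_mask(0x80000001u)` dispatches IRQ 0 and IRQ 31; both handlers see an EIEM with those two bits cleared, and EIEM is back to its original value on return.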
* Re: [parisc-linux] irq.c patch to fix lockups on recent kernels
2002-05-15 13:47 [parisc-linux] irq.c patch to fix lockups on recent kernels Randolph Chung
@ 2002-05-16 20:40 ` Grant Grundler
2002-05-16 21:30 ` Grant Grundler
0 siblings, 1 reply; 4+ messages in thread
From: Grant Grundler @ 2002-05-16 20:40 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
Randolph Chung wrote:
> After talking to Grant about this, I'm not convinced this is the right
> fix. It seems like do_cpu_irq_mask is already called with eiem masked,
> so I'm not sure why masking it again might make a difference....
No. eiem == External Interrupt Enable Mask.
Only the I-bit is disabled when do_cpu_irq_mask() is entered.
Bits in the EIRR are already cleared by assembler code to indicate
we are handling those interrupts.
If any bits in EIRR are set when re-enabling I-bit (in PSW),
we should get another external interrupt.
In -pa17, I removed the mfctl/mtctl calls since I-bit is supposed
to be disabled to block *all* interrupts. But supposing some
driver calls cli() and re-enables interrupts in general, the
EIRR bits (ie interrupt "vectors" in ia64 speak) we are processing
could run into the same problem that I saw with xtime_lock.
Thus, restoring the code I removed in -pa17 that mucks with eiem
would mask issues with sti/cli in the drivers.
Conversely, any interrupt handler sitting on the interrupt stack
for long periods of time will block stuff too...the symptoms I've
heard so far don't match this scenario though.
> Can someone more familiar with this part of the kernel please take a
> look?
The patch looks fine to me. I suspect it's masking
a problem with sti()/cli() usage someplace though.
thanks,
grant
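
[Editor's sketch] The interaction described in this message can be written down as a small predicate: an external interrupt is taken only when the PSW I-bit is set and at least one bit is set in both EIRR and EIEM. The model below is a user-space illustration, not kernel code; the struct and function names are hypothetical. It shows why masking the in-service bit in EIEM hides a handler that re-enables the I-bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the three pieces of state discussed in the thread. */
struct model_cpu {
    uint32_t eirr;   /* pending external interrupt requests */
    uint32_t eiem;   /* per-interrupt enable mask */
    bool     psw_i;  /* PSW I-bit: global external interrupt enable */
};

/* An interrupt is taken only if the I-bit is set and a pending bit is enabled. */
static bool model_irq_deliverable(const struct model_cpu *cpu)
{
    return cpu->psw_i && (cpu->eirr & cpu->eiem) != 0;
}

/*
 * Scenario: we are servicing `bit` (already cleared from EIRR, I-bit off).
 * A driver then re-enables the I-bit (sti) and the device raises the same
 * interrupt again. Returns true if the handler could be re-entered.
 */
static bool model_reentry_possible(uint32_t bit, bool mask_eiem)
{
    struct model_cpu cpu = { .eirr = 0, .eiem = 0xffffffffu, .psw_i = false };

    if (mask_eiem)
        cpu.eiem &= ~bit;   /* what the restored eiem code does */

    cpu.psw_i = true;       /* driver calls sti() inside the handler */
    cpu.eirr |= bit;        /* device asserts the same interrupt again */

    return model_irq_deliverable(&cpu);
}
```

With `mask_eiem` false the function returns true (re-entry is possible); with `mask_eiem` true it returns false, which matches the suspicion that the patch masks an sti()/cli() problem rather than removing it.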
* Re: [parisc-linux] irq.c patch to fix lockups on recent kernels
2002-05-16 20:40 ` Grant Grundler
@ 2002-05-16 21:30 ` Grant Grundler
2002-05-17 6:35 ` Grant Grundler
0 siblings, 1 reply; 4+ messages in thread
From: Grant Grundler @ 2002-05-16 21:30 UTC (permalink / raw)
To: Randolph Chung; +Cc: parisc-linux
Grant Grundler wrote:
> Only the I-bit is disabled when do_cpu_irq_mask() is entered.
> Bits in the EIRR are already cleared by assembler code to indicate
> we are handling those interrupts.
> If any bits in EIRR are set when re-enabling I-bit (in PSW),
> we should get another external interrupt.
I'm thinking we should:
o move the EIRR bit handling into do_cpu_irq_mask()
(do_cpu_irq_mask() could loop until EIRR is zero).
o move I-bit handling into the entry.S assembly.
That way, EIRR and EIEM handling is all in C and I-bit is all in asm.
That seems to make the most sense to me since EIRR/EIEM are closely
related and I-bit is orthogonal to that.
grant
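
[Editor's sketch] One way to picture the "loop until EIRR is zero" idea is the user-space model below. It is not the prototype patch mentioned later in the thread; all `sim_*` names are hypothetical, the registers are plain variables, and a 32-bit register with IRQ 0 at the most significant bit is assumed. Clearing bits in `sim_eirr` stands in for acknowledging requests in the real EIRR.

```c
#include <stdint.h>

static uint32_t sim_eirr;                 /* stand-in for EIRR (pending requests) */
static uint32_t sim_eiem = 0xffffffffu;   /* stand-in for EIEM (enable mask) */
static int      sim_dispatched;           /* number of handlers run */
static int      sim_passes;               /* number of times EIRR was re-read */

/* Assumed layout: IRQ n is bit (31 - n). */
static uint32_t sim_irq_bit(int irq) { return (uint32_t)1 << (31 - irq); }

/* Hypothetical handler. Servicing IRQ 3 makes a new request arrive on
 * IRQ 5, so the outer loop has something to pick up on its second pass. */
static void sim_handle(int irq)
{
    sim_dispatched++;
    if (irq == 3)
        sim_eirr |= sim_irq_bit(5);
}

/* Keep servicing until no enabled request is pending. */
static void sim_drain_eirr(void)
{
    uint32_t pending;

    while ((pending = sim_eirr & sim_eiem) != 0) {
        uint32_t bit;
        int irq;

        sim_eirr &= ~pending;   /* acknowledge the requests we picked up */
        sim_passes++;

        for (bit = sim_irq_bit(0), irq = 0; pending && bit; bit >>= 1, irq++) {
            if (!(bit & pending))
                continue;
            pending &= ~bit;
            sim_handle(irq);
        }
    }
}
```

Starting with only IRQ 3 pending, `sim_drain_eirr()` makes two passes and runs two handlers, because the request that arrives during the first pass is picked up without leaving the function.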
* Re: [parisc-linux] irq.c patch to fix lockups on recent kernels
2002-05-16 21:30 ` Grant Grundler
@ 2002-05-17 6:35 ` Grant Grundler
0 siblings, 0 replies; 4+ messages in thread
From: Grant Grundler @ 2002-05-17 6:35 UTC (permalink / raw)
To: parisc-linux
Grant Grundler wrote:
> I'm thinking we should:
> o move the EIRR bit handling into do_cpu_irq_mask()
> (do_cpu_irq_mask() could loop until EIRR is zero).
> o move I-bit handling into the entry.S assembly.
Prototype code is on ftp.parisc-linux.org/patches/irq_eirr.diff
I didn't clean up some of the entry.S stuff but it should work.
Not tested.
grant