* [Qemu-devel] [PATCH] [SPARC] SlavIO interrupt controller fix
@ 2007-03-15  1:05 Aurelien Jarno
From: Aurelien Jarno @ 2007-03-15  1:05 UTC
  To: qemu-devel

Hi all,

I have finally been able to fix the Linux kernel crash that occurs on
the Sparc target (sun4m) when doing intensive disk I/O (see the dmesg 
log below).

slavio_pic_set_irq() in slavio_intctl.c calls slavio_check_interrupts()
when an interrupt is activated, but also when an interrupt is deactivated.
In very rare conditions this can cause a spurious interrupt that
disturbs the ESP driver and leads to a kernel crash.

From what I have been able to trace, it occurs when an interrupt is
being serviced and an interrupt with a lower level is cleared before
the interrupt routine in the target disables the first interrupt.
To have a bad effect on the ESP driver, this must also happen while a
DMA transfer is scheduled. That explains why the bug is not easy to
reproduce (it usually shows up after half an hour to two hours of
intensive disk I/O, and can take up to 24 hours with very little disk
I/O), though it is very annoying.

Note that all other functions in this file that activate and
deactivate interrupts only call slavio_check_interrupts() in the
interrupt activation case, so they are already correct.
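
As a rough illustration of the intended behaviour, here is a toy model
of the patched logic. It is only a sketch, not the actual
slavio_intctl.c code: ToyIntCtl, toy_set_irq() and
toy_check_interrupts() are names made up for this example, and the
stand-in for slavio_check_interrupts() only counts how often it is
called.

```c
#include <stdint.h>

/* Hypothetical, simplified state; not the real SLAVIO controller state. */
typedef struct {
    uint32_t pending;      /* one bit per pending interrupt line */
    unsigned check_calls;  /* how many times the outputs were re-evaluated */
} ToyIntCtl;

/* Stand-in for slavio_check_interrupts(): it only counts invocations. */
static void toy_check_interrupts(ToyIntCtl *s)
{
    s->check_calls++;
}

/* Patched behaviour: re-evaluate only when a line is raised. */
static void toy_set_irq(ToyIntCtl *s, int irq, int level)
{
    uint32_t mask = UINT32_C(1) << irq;

    if (level) {
        s->pending |= mask;
        toy_check_interrupts(s);
    } else {
        /* Clearing a line updates the state but triggers no re-check. */
        s->pending &= ~mask;
    }
}
```

Raising two lines and then lowering one of them leaves a single bit
pending and results in two re-checks rather than three.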

The patch below fixes the problem. With this patch applied, I have now
been running a Sparc target under intensive disk I/O for 24 hours
without a crash.

Cheers,
Aurelien


esp0: !BSERV after data, probably to msgout
esp0: Aborting command
esp0: dumping state
esp0: dma -- cond_reg<a4000211> addr<f0251000>
esp0: SW [sreg<00> sstep<04> ireg<18>]
esp0: HW reread [sreg<83> sstep<00> ireg<10>]
esp0: current command [tgt<00> lun<00> pphase<DATAOUT> cphase<DATAOUT>]
esp0: disconnected
esp0: Aborting command
esp0: dumping state
esp0: dma -- cond_reg<a4000210> addr<f0251000>
esp0: SW [sreg<00> sstep<04> ireg<18>]
esp0: HW reread [sreg<03> sstep<00> ireg<10>]
esp0: current command [tgt<00> lun<00> pphase<UNISSUED> cphase<UNISSUED>]
esp0: disconnected
esp0: Resetting scsi bus
esp0: SCSI bus reset interrupt
Unable to handle kernel NULL pointer dereference
tsk->{mm,active_mm}->context = 0000000d
tsk->{mm,active_mm}->pgd = fc048800
              \|/ ____ \|/
              "@'/ ,. \`@"
              /_| \__/ |_\
                 \__U_/
apt-get(4250): Oops [#1]
PSR: 04400fc6 PC: fe61f128 NPC: fe61f12c Y: 00000000    Not tainted
PC: <esp_do_data_finale+0x3b4/0x3f8 [esp]>
%G: f2cb4000 ffffffff  00000014 fd0da000  00000000 00000020  f2cb4000 00000001
%O: fe620800 f79d8800  00000010 00000008  f00d8eac f0234000  f2cb5b18 fe61edd0
RPC: <esp_do_data_finale+0x5c/0x3f8 [esp]>
%L: f79f3600 00000000  00000000 f7956500  00000000 ea7afb00  f3004000 00989680
%I: f021529c 00000000  00000000 00000000  00000000 fff00000  f2cb5b80 fe61de10
Caller[fe61de10]: esp_work_bus+0x64/0x6c [esp]
Caller[fe61f7e8]: esp_intr+0x1e0/0x310 [esp]
Caller[f0013160]: handler_irq+0x94/0xd4
Caller[f0010bd8]: patch_handler_irq+0x8/0x24
Caller[f019b744]: here+0x18/0x90
Caller[f019c538]: do_nanosleep+0x44/0x88
Caller[f0046af8]: hrtimer_nanosleep+0x30/0x130
Caller[f0046c74]: sys_nanosleep+0x7c/0x94
Caller[f0011634]: syscall_is_too_hard+0x3c/0x40
Caller[5035a36c]: 0x5035a374
Instruction DUMP: c22420ec  8400a014  c42420e8 <c200a010> c22420e4  c200a00c  c22420e0  c20e203a  82086007
Kernel panic - not syncing: Aiee, killing interrupt handler!
 <0>Press Stop-A (L1-A) to return to the boot prom



--- hw/slavio_intctl.c	2007-02-06 00:01:54.000000000 +0100
+++ hw/slavio_intctl.c	2007-03-14 13:50:18.000000000 +0100
@@ -293,6 +293,7 @@
 	    if (level) {
 		s->intregm_pending |= mask;
 		s->intreg_pending[s->target_cpu] |= 1 << pil;
+		slavio_check_interrupts(s);
 	    }
 	    else {
 		s->intregm_pending &= ~mask;
@@ -300,7 +301,6 @@
 	    }
 	}
     }
-    slavio_check_interrupts(s);
 }
 
 void slavio_pic_set_irq_cpu(void *opaque, int irq, int level, unsigned int cpu)

-- 
  .''`.  Aurelien Jarno	            | GPG: 1024D/F1BCDB73
 : :' :  Debian developer           | Electrical Engineer
 `. `'   aurel32@debian.org         | aurelien@aurel32.net
   `-    people.debian.org/~aurel32 | www.aurel32.net
