* Possible hang inside interrupt handler on sata_promise.
@ 2007-09-12 13:19 Robin Holt
  2007-09-12 15:33 ` Mikael Pettersson
  2007-09-13 15:59 ` Chuck Ebbert
  0 siblings, 2 replies; 5+ messages in thread
From: Robin Holt @ 2007-09-12 13:19 UTC
  To: Mikael Pettersson, Jeff Garzik; +Cc: linux-kernel


I have been experiencing hangs on a newly set up machine.  Unfortunately,
it appears to be hanging inside the interrupt handler, as sysrq and the
caps-lock LED seem to stop working when the event occurs.  I am guessing
it is related to the sata_promise driver, but that is only a guess as
I don't get much output.  I am running the Debian unstable kernel
(2.6.22-1-686), but the problem also occurs with the Debian stable kernel
(2.6.18-4-686).  I do need to boot with the acpi=off option, but am
not sure if that is related.  Unfortunately, I do not know much about
troubleshooting i386 when problems occur inside the interrupt handlers.

What can I do to help troubleshoot this problem?

Thanks,
Robin Holt


PS: A little background information:

# lspci -v
...
00:0b.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)
        Subsystem: Promise Technology, Inc. PDC40718 (SATA 300 TX4)
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 10
        I/O ports at d400 [size=128]
        I/O ports at d800 [size=256]
        Memory at f8042000 (32-bit, non-prefetchable) [size=4K]
        Memory at f8000000 (32-bit, non-prefetchable) [size=128K]
        [virtual] Expansion ROM at 30020000 [disabled] [size=32K]
        Capabilities: [60] Power Management version 2

00:0c.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02)
        Subsystem: Promise Technology, Inc. PDC40718 (SATA 300 TX4)
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 11
        I/O ports at dc00 [size=128]
        I/O ports at e000 [size=256]
        Memory at f8041000 (32-bit, non-prefetchable) [size=4K]
        Memory at f8020000 (32-bit, non-prefetchable) [size=128K]
        [virtual] Expansion ROM at 30028000 [disabled] [size=32K]
        Capabilities: [60] Power Management version 2

# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] 
md2 : active raid6 sda1[0] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
      1953535744 blocks level 6, 64k chunk, algorithm 2 [6/6] [UUUUUU]
...


The md2 device is a single pv in a vg (VG_DATA) and currently has a
single lv which is 40GB and contains an xfs filesystem.


# dmesg
...
sata_promise 0000:00:0b.0: version 2.07
sata_promise 0000:00:0b.0: applying SATAII TX4 port numbering workaround
scsi0 : sata_promise
scsi1 : sata_promise
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
scsi2 : sata_promise
scsi3 : sata_promise
ata1: SATA max UDMA/133 cmd 0xe080e380 ctl 0xe080e3b8 bmdma 0x00000000 irq 10
ata2: SATA max UDMA/133 cmd 0xe080e280 ctl 0xe080e2b8 bmdma 0x00000000 irq 10
ata3: SATA max UDMA/133 cmd 0xe080e200 ctl 0xe080e238 bmdma 0x00000000 irq 10
ata4: SATA max UDMA/133 cmd 0xe080e300 ctl 0xe080e338 bmdma 0x00000000 irq 10
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7: WDC WD5000AAKS-00TMA0, 12.01C01, max UDMA/133
ata3.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata3.00: configured for UDMA/133
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: ATA-7: WDC WD5000KS-00MNB0, 07.02E07, max UDMA/133
ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata4.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access     ATA      WDC WD5000AAKS-0 12.0 PQ: 0 ANSI: 5
scsi 3:0:0:0: Direct-Access     ATA      WDC WD5000KS-00M 07.0 PQ: 0 ANSI: 5
sata_promise 0000:00:0c.0: applying SATAII TX4 port numbering workaround
scsi4 : sata_promise
scsi5 : sata_promise
scsi6 : sata_promise
scsi7 : sata_promise
ata5: SATA max UDMA/133 cmd 0xe081c380 ctl 0xe081c3b8 bmdma 0x00000000 irq 11
ata6: SATA max UDMA/133 cmd 0xe081c280 ctl 0xe081c2b8 bmdma 0x00000000 irq 11
ata7: SATA max UDMA/133 cmd 0xe081c200 ctl 0xe081c238 bmdma 0x00000000 irq 11
ata8: SATA max UDMA/133 cmd 0xe081c300 ctl 0xe081c338 bmdma 0x00000000 irq 11
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATA-7: WDC WD5000KS-00MNB0, 07.02E07, max UDMA/133
ata5.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata5.00: configured for UDMA/133
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata6.00: ATA-7: WDC WD5000KS-00MNB0, 07.02E07, max UDMA/133
ata6.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata6.00: configured for UDMA/133
ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata7.00: ATA-7: WDC WD5000KS-00MNB0, 07.02E07, max UDMA/133
ata7.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata7.00: configured for UDMA/133
ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata8.00: ATA-7: WDC WD5000AAKS-00TMA0, 12.01C01, max UDMA/133
ata8.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 0/32)
ata8.00: configured for UDMA/133
scsi 4:0:0:0: Direct-Access     ATA      WDC WD5000KS-00M 07.0 PQ: 0 ANSI: 5
scsi 5:0:0:0: Direct-Access     ATA      WDC WD5000KS-00M 07.0 PQ: 0 ANSI: 5
scsi 6:0:0:0: Direct-Access     ATA      WDC WD5000KS-00M 07.0 PQ: 0 ANSI: 5
scsi 7:0:0:0: Direct-Access     ATA      WDC WD5000AAKS-0 12.0 PQ: 0 ANSI: 5



* Re: Possible hang inside interrupt handler on sata_promise.
  2007-09-12 13:19 Possible hang inside interrupt handler on sata_promise Robin Holt
@ 2007-09-12 15:33 ` Mikael Pettersson
  2007-09-13 15:59 ` Chuck Ebbert
  1 sibling, 0 replies; 5+ messages in thread
From: Mikael Pettersson @ 2007-09-12 15:33 UTC
  To: Robin Holt; +Cc: Jeff Garzik, linux-kernel

Robin Holt writes:
 > 
 > I have been experiencing hangs on a newly setup machine.  Unfortunately,
 > it appears to be hanging inside the interrupt handler as sysrq and
 > caps-lock led seem to stop working when the event occurs.  I am guessing
 > it is related to the sata_promise driver, but that is only a guess as
 > I don't get much for output.  I am running the debian unstable kernel
 > (2.6.22-1-686), but the problem also occurs with the debian stable kernel
 > (2.6.18-4-686).  I do need to boot with the acpi=off option, but am
 > not sure if that is related.  Unfortunately, I do not know much about
 > troubleshooting i386 when problems occur inside the interrupt handlers.

This is the first I've heard of a problem like this.
Since you boot with acpi=off and sysrq stops working,
I really have to suspect a mainboard interrupt problem.

 > What can I do to help troubleshoot this problem.

Unless the mainboard in question is known to be a totally
lost cause for ACPI, the first step should be to get ACPI
working.

If it's an older mainboard it might not need ACPI,
but then you should build and boot an ACPI-free kernel.
(I wouldn't trust acpi=off to be equivalent to CONFIG_ACPI=n.)

If the mainboard has an I/O-APIC then you should make sure
that the kernel can find and use it.

/Mikael
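
One rough way to see whether the kernel found and is using an I/O-APIC is to look at the controller column of /proc/interrupts. The sketch below is only an illustration: the function name controller_types is made up here, and it assumes the controller column contains the substring "IO-APIC" or "XT-PIC" (the legacy 8259 PIC), as on kernels of this era.

```python
import re


def controller_types(interrupts_text):
    """Return the interrupt controller families named on the numbered
    IRQ lines of /proc/interrupts-style text.

    Assumption: the controller column contains "IO-APIC" or "XT-PIC".
    """
    found = set()
    for line in interrupts_text.splitlines():
        # Only numbered lines (e.g. " 10:") describe device IRQs;
        # NMI:, LOC:, ERR: and the CPU header are skipped.
        if not re.match(r"\s*\d+:", line):
            continue
        if "IO-APIC" in line:
            found.add("IO-APIC")
        if "XT-PIC" in line:
            found.add("XT-PIC")
    return found
```

On a live system the input would be the contents of /proc/interrupts. A result of only {"XT-PIC"} would suggest that the I/O-APIC is not in use, which is worth knowing before trying further boot options.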


* Re: Possible hang inside interrupt handler on sata_promise.
  2007-09-12 13:19 Possible hang inside interrupt handler on sata_promise Robin Holt
  2007-09-12 15:33 ` Mikael Pettersson
@ 2007-09-13 15:59 ` Chuck Ebbert
  2007-09-13 17:13   ` Robin Holt
  1 sibling, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2007-09-13 15:59 UTC
  To: Robin Holt; +Cc: Mikael Pettersson, Jeff Garzik, linux-kernel

On 09/12/2007 09:19 AM, Robin Holt wrote:
> I have been experiencing hangs on a newly setup machine.  Unfortunately,
> it appears to be hanging inside the interrupt handler as sysrq and
> caps-lock led seem to stop working when the event occurs.  I am guessing
> it is related to the sata_promise driver, but that is only a guess as
> I don't get much for output.  I am running the debian unstable kernel
> (2.6.22-1-686), but the problem also occurs with the debian stable kernel
> (2.6.18-4-686).  I do need to boot with the acpi=off option, but am
> not sure if that is related.  Unfortunately, I do not know much about
> troubleshooting i386 when problems occur inside the interrupt handlers.

There is a list of kernel options you can try, including:

  noapic
  nolapic
  pci=nomsi,nommconf
  pci=noacpi



* Re: Possible hang inside interrupt handler on sata_promise.
  2007-09-13 15:59 ` Chuck Ebbert
@ 2007-09-13 17:13   ` Robin Holt
  2007-09-13 17:20     ` Maciej W. Rozycki
  0 siblings, 1 reply; 5+ messages in thread
From: Robin Holt @ 2007-09-13 17:13 UTC
  To: Chuck Ebbert; +Cc: Robin Holt, Mikael Pettersson, Jeff Garzik, linux-kernel

Is there any way on typical motherboards to send an NMI?  On Altix boxes,
we can use the system controller to send an NMI.  I have found some
motherboards appear to have an NMI line.  Is there anything like that
on i386?  Maybe I am missing the issue entirely.  Does deadlock inside
an IRQ handler seem plausible?

On a side note, I have reproduced the hang with a different storage device
(IDE).  At the time, I had deactivated the volume group, stopped the
RAID device, and unloaded the sata_promise module.  I also noticed that
uhci_hcd and eth1 (3com 3c59x module) are sharing IRQ 5, so I had unloaded
uhci_hcd and usbcore before the last test.

On Thu, Sep 13, 2007 at 11:59:25AM -0400, Chuck Ebbert wrote:
> There is a list of kernel options you can try, including:
> 
>   noapic
>   nolapic
>   pci=nomsi,nommconf
>   pci=noacpi

I will try these this evening.

Thanks,
Robin
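
Shared interrupt lines such as the IRQ 5 case mentioned above can be listed from /proc/interrupts. The following is a sketch only. It assumes the usual layout: a header naming the CPUs, then per-CPU counters, a one-word controller name, and a comma-separated device list. The function name shared_irqs is made up for illustration, and newer kernels that print extra columns may need a different parser.

```python
import re


def shared_irqs(interrupts_text):
    """Return {irq_number: [device, ...]} for IRQ lines that list more
    than one device in /proc/interrupts-style text."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        m = re.match(r"\s*(\d+):\s*(.*)", line)
        if not m:
            continue  # NMI:, LOC:, ERR: and similar summary rows
        fields = m.group(2).split(None, ncpus)  # peel off the counters
        if len(fields) <= ncpus:
            continue
        parts = [p.strip() for p in fields[ncpus].split(",")]
        # The first part still carries the controller name; the device
        # is assumed to be its last whitespace-separated word.
        words = parts[0].split()
        if not words:
            continue
        parts[0] = words[-1]
        if len(parts) > 1:
            shared[int(m.group(1))] = parts
    return shared
```

Any IRQ reported here is handled by more than one driver, so a misbehaving device on that line can affect the others sharing it.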


* Re: Possible hang inside interrupt handler on sata_promise.
  2007-09-13 17:13   ` Robin Holt
@ 2007-09-13 17:20     ` Maciej W. Rozycki
  0 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2007-09-13 17:20 UTC
  To: Robin Holt; +Cc: Chuck Ebbert, Mikael Pettersson, Jeff Garzik, linux-kernel

On Thu, 13 Sep 2007, Robin Holt wrote:

> Is there any way on typical motherboards to send an NMI?  On Altix boxes,
> we can use the system controller to send an NMI.  I have found some
> motherboards appear to have an NMI line.  Is there anything like that
> on i386?  Maybe I am missing the issue entirely.  Does deadlock inside
> an IRQ handler seem plausible?

 There are various ways to send an NMI on a typical i386 box these days;
the easiest is probably the NMI watchdog -- see the "nmi_watchdog"
kernel option.

  Maciej
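
After booting with the "nmi_watchdog" option, a common sanity check is that the NMI row of /proc/interrupts keeps increasing; if it stays at zero, the watchdog is probably not running in that mode. A minimal sketch of that comparison follows, with the made-up helper names nmi_counts and watchdog_ticking. It takes two snapshots of the file contents as strings, captured a few seconds apart.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters from /proc/interrupts-style text,
    or an empty list if there is no NMI row."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() == "NMI":
            counts = []
            for tok in rest.split():
                if not tok.isdigit():
                    break  # trailing description text on newer kernels
                counts.append(int(tok))
            return counts
    return []


def watchdog_ticking(before, after):
    """True if every CPU's NMI counter grew between the two snapshots."""
    b, a = nmi_counts(before), nmi_counts(after)
    return bool(b) and len(a) == len(b) and all(y > x for x, y in zip(b, a))
```

If the counters do advance, a lockup with interrupts disabled should eventually produce a watchdog report on the console rather than a silent hang, which is the point of using it here.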

