public inbox for linux-scsi@vger.kernel.org
* Re: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
@ 2003-04-09 15:47 Duncan Gibb
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan Gibb @ 2003-04-09 15:47 UTC (permalink / raw)
  To: Rohit Gupta; +Cc: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1435 bytes --]

A few days ago, you wrote:

RG> I am having system hangs when I place the U320 SCSI card and
RG> the Intel gigabit Server card (copper) on the same bus,
RG> specifically the 100 MHz bus on the Intel 7505 chipset.

I have a similar setup to you in a machine I built last weekend.  It's
based on a SuperMicro X5DA8 board, which has all the bits you mention on
the motherboard.

The system seems stable, but I am experiencing frequent SCSI bus and
apparent subsystem lockups.

Channel A has only the swap disk attached.

Channel B is:

CD-RW(term enabled)----m/b----|----PCMCIAreader----Scanner----terminator

M/B term is disabled (Adaptec SCSIselect seems to do this in a way the
m/b jumpers don't).  | is a backplate where the bus goes from IDC50-int
to 50MD-ext.  All devices worked fine on an older PCI aic7xxx-based card
but this board does not have enough PCI slots to test in situ.

Scanning at 300dpi in greyscale works fine; colour gets part way and
kills the scsi.  Writing a CD kills the scsi.  SCSI death looks like the
attached portion of /var/log/messages.  Death appears to afflict only
one channel.  "dd if=/dev/sda of=/dev/null" still works at any rate.

I am currently using an unadulterated kernel 2.4.21-pre5-ac3, pending
getting something more recent to build (with ac97).  ide-scsi is also
loaded.

If you've found any way around this apparent problem, or anyone has any
ideas, please let me know.

Cheers


Duncan


[-- Attachment #2: var_log_messages__edited --]
[-- Type: text/plain, Size: 11704 bytes --]

Apr  9 14:22:06 localhost kernel: scsi1:0:2:0: Attempting to abort cmd f782de00
Apr  9 14:22:06 localhost kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Apr  9 14:22:06 localhost kernel: scsi1: Dumping Card State at program address 0x1a4 Mode 0x33
Apr  9 14:22:06 localhost kernel: Card was paused
Apr  9 14:22:06 localhost kernel: HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11]
Apr  9 14:22:06 localhost kernel: DFFSTAT[0x31] SCSISIGI[0x4] SCSIPHASE[0x0] SCSIBUS[0x0]
Apr  9 14:22:06 localhost kernel: LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
Apr  9 14:22:06 localhost kernel: SEQINTCTL[0x0] SEQ_FLAGS[0x40] SEQ_FLAGS2[0x0] SSTAT0[0x0]
Apr  9 14:22:06 localhost kernel: SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
Apr  9 14:22:06 localhost kernel: SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
Apr  9 14:22:06 localhost kernel: LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]
Apr  9 14:22:06 localhost kernel:
Apr  9 14:22:06 localhost kernel: SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x2 CURRSCB 0x2 NEXTSCB 0x0
Apr  9 14:22:06 localhost kernel: qinstart = 10969 qinfifonext = 10970
Apr  9 14:22:06 localhost kernel: QINFIFO: 0x3
Apr  9 14:22:06 localhost kernel: WAITING_TID_QUEUES:
Apr  9 14:22:06 localhost kernel: Pending list:
Apr  9 14:22:06 localhost kernel:   3 SCB_CONTROL[0x48] SCB_SCSIID[0x27] SCB_TAG[0x3]
Apr  9 14:22:06 localhost kernel:   2 SCB_CONTROL[0x0] SCB_SCSIID[0x37] SCB_TAG[0x2]
Apr  9 14:22:06 localhost kernel: Total 2
Apr  9 14:22:06 localhost kernel: Kernel Free SCB list: 1 0
Apr  9 14:22:06 localhost kernel: Sequencer Complete DMA-inprog list:
Apr  9 14:22:06 localhost kernel: Sequencer Complete list:
Apr  9 14:22:06 localhost kernel: Sequencer DMA-Up and Complete list:
Apr  9 14:22:06 localhost kernel:
Apr  9 14:22:06 localhost kernel: scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0, LJSCB 0xff00
Apr  9 14:22:06 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:06 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:06 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:06 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:06 localhost kernel: scsi1: FIFO1 Free, LONGJMP == 0x81e8, SCB 0x3, LJSCB 0x2
Apr  9 14:22:06 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:06 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:06 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:06 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:06 localhost kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
Apr  9 14:22:06 localhost kernel: scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
Apr  9 14:22:06 localhost kernel: scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Apr  9 14:22:06 localhost kernel: SIMODE0[0xc]
Apr  9 14:22:06 localhost kernel: CCSCBCTL[0x0]
Apr  9 14:22:06 localhost kernel: scsi1: REG0 == 0x3, SINDEX = 0x133, DINDEX = 0x108
Apr  9 14:22:06 localhost kernel: scsi1: SCBPTR == 0x2, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xfffd
Apr  9 14:22:06 localhost kernel: CDB 3 0 0 0 20 0
Apr  9 14:22:06 localhost kernel: STACK: 0xdd 0x0 0x0 0x0 0x0 0x0 0x0 0x2e
Apr  9 14:22:06 localhost kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Apr  9 14:22:06 localhost kernel: DevQ(0:2:0): 0 waiting
Apr  9 14:22:06 localhost kernel: DevQ(0:3:0): 0 waiting
Apr  9 14:22:06 localhost kernel: DevQ(0:6:0): 0 waiting
Apr  9 14:22:06 localhost kernel: scsi1:0:2:0: Cmd aborted from QINFIFO
Apr  9 14:22:16 localhost kernel: scsi1:0:2:0: Attempting to abort cmd f782de00
Apr  9 14:22:16 localhost kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Apr  9 14:22:16 localhost kernel: scsi1: Dumping Card State at program address 0x1a4 Mode 0x33
Apr  9 14:22:16 localhost kernel: Card was paused
Apr  9 14:22:16 localhost kernel: HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11]
Apr  9 14:22:16 localhost kernel: DFFSTAT[0x31] SCSISIGI[0x4] SCSIPHASE[0x0] SCSIBUS[0x0]
Apr  9 14:22:16 localhost kernel: LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
Apr  9 14:22:16 localhost kernel: SEQINTCTL[0x0] SEQ_FLAGS[0x40] SEQ_FLAGS2[0x0] SSTAT0[0x0]
Apr  9 14:22:16 localhost kernel: SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
Apr  9 14:22:16 localhost kernel: SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
Apr  9 14:22:16 localhost kernel: LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]
Apr  9 14:22:16 localhost kernel:
Apr  9 14:22:16 localhost kernel: SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x2 CURRSCB 0x2 NEXTSCB 0x0
Apr  9 14:22:16 localhost kernel: qinstart = 10969 qinfifonext = 10970
Apr  9 14:22:16 localhost kernel: QINFIFO: 0x3
Apr  9 14:22:16 localhost kernel: WAITING_TID_QUEUES:
Apr  9 14:22:16 localhost kernel: Pending list:
Apr  9 14:22:16 localhost kernel:   3 SCB_CONTROL[0x48] SCB_SCSIID[0x27] SCB_TAG[0x3]
Apr  9 14:22:16 localhost kernel:   2 SCB_CONTROL[0x0] SCB_SCSIID[0x37] SCB_TAG[0x2]
Apr  9 14:22:16 localhost kernel: Total 2
Apr  9 14:22:16 localhost kernel: Kernel Free SCB list: 1 0
Apr  9 14:22:16 localhost kernel: Sequencer Complete DMA-inprog list:
Apr  9 14:22:16 localhost kernel: Sequencer Complete list:
Apr  9 14:22:16 localhost kernel: Sequencer DMA-Up and Complete list:
Apr  9 14:22:16 localhost kernel:
Apr  9 14:22:16 localhost kernel: scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0, LJSCB 0xff00
Apr  9 14:22:16 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:16 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:16 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:16 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:16 localhost kernel: scsi1: FIFO1 Free, LONGJMP == 0x81e8, SCB 0x3, LJSCB 0x2
Apr  9 14:22:16 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:16 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:16 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:16 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:16 localhost kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
Apr  9 14:22:16 localhost kernel: scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
Apr  9 14:22:16 localhost kernel: scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Apr  9 14:22:16 localhost kernel: SIMODE0[0xc]
Apr  9 14:22:16 localhost kernel: CCSCBCTL[0x0]
Apr  9 14:22:16 localhost kernel: scsi1: REG0 == 0x3, SINDEX = 0x133, DINDEX = 0x108
Apr  9 14:22:16 localhost kernel: scsi1: SCBPTR == 0x2, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xfffd
Apr  9 14:22:16 localhost kernel: CDB 3 0 0 0 20 0
Apr  9 14:22:16 localhost kernel: STACK: 0xdd 0x0 0x0 0x0 0x0 0x0 0x0 0x2e
Apr  9 14:22:16 localhost kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Apr  9 14:22:16 localhost kernel: DevQ(0:2:0): 0 waiting
Apr  9 14:22:16 localhost kernel: DevQ(0:3:0): 0 waiting
Apr  9 14:22:16 localhost kernel: DevQ(0:6:0): 0 waiting
Apr  9 14:22:16 localhost kernel: scsi1:0:2:0: Cmd aborted from QINFIFO
Apr  9 14:22:16 localhost kernel: scsi1:0:3:0: Attempting to abort cmd f782da00
Apr  9 14:22:16 localhost kernel: >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
Apr  9 14:22:16 localhost kernel: scsi1: Dumping Card State at program address 0x1a4 Mode 0x33
Apr  9 14:22:16 localhost kernel: Card was paused
Apr  9 14:22:16 localhost kernel: HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11]
Apr  9 14:22:16 localhost kernel: DFFSTAT[0x31] SCSISIGI[0x4] SCSIPHASE[0x0] SCSIBUS[0x0]
Apr  9 14:22:16 localhost kernel: LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]
Apr  9 14:22:16 localhost kernel: SEQINTCTL[0x0] SEQ_FLAGS[0x40] SEQ_FLAGS2[0x0] SSTAT0[0x0]
Apr  9 14:22:16 localhost kernel: SSTAT1[0x0] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0]
Apr  9 14:22:16 localhost kernel: SIMODE1[0xac] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0]
Apr  9 14:22:16 localhost kernel: LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]
Apr  9 14:22:16 localhost kernel:
Apr  9 14:22:16 localhost kernel: SCB Count = 4 CMDS_PENDING = 1 LASTSCB 0x2 CURRSCB 0x2 NEXTSCB 0x0
Apr  9 14:22:16 localhost kernel: qinstart = 10969 qinfifonext = 10969
Apr  9 14:22:16 localhost kernel: QINFIFO:
Apr  9 14:22:16 localhost kernel: WAITING_TID_QUEUES:
Apr  9 14:22:16 localhost kernel: Pending list:
Apr  9 14:22:16 localhost kernel:   2 SCB_CONTROL[0x0] SCB_SCSIID[0x37] SCB_TAG[0x2]
Apr  9 14:22:16 localhost kernel: Total 1
Apr  9 14:22:16 localhost kernel: Kernel Free SCB list: 3 1 0
Apr  9 14:22:16 localhost kernel: Sequencer Complete DMA-inprog list:
Apr  9 14:22:16 localhost kernel: Sequencer Complete list:
Apr  9 14:22:16 localhost kernel: Sequencer DMA-Up and Complete list:
Apr  9 14:22:16 localhost kernel:
Apr  9 14:22:16 localhost kernel: scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x0, LJSCB 0xff00
Apr  9 14:22:16 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:16 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:16 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:16 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:16 localhost kernel: scsi1: FIFO1 Free, LONGJMP == 0x81e8, SCB 0x3, LJSCB 0x2
Apr  9 14:22:16 localhost kernel: SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]
Apr  9 14:22:16 localhost kernel: SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0]
Apr  9 14:22:16 localhost kernel: SOFFCNT[0x0] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0
Apr  9 14:22:16 localhost kernel: HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]
Apr  9 14:22:16 localhost kernel: LQIN: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
Apr  9 14:22:16 localhost kernel: scsi1: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
Apr  9 14:22:16 localhost kernel: scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
Apr  9 14:22:16 localhost kernel: SIMODE0[0xc]
Apr  9 14:22:16 localhost kernel: CCSCBCTL[0x0]
Apr  9 14:22:16 localhost kernel: scsi1: REG0 == 0x3, SINDEX = 0x133, DINDEX = 0x108
Apr  9 14:22:16 localhost kernel: scsi1: SCBPTR == 0x2, SCB_NEXT == 0xff00, SCB_NEXT2 == 0xfffd
Apr  9 14:22:16 localhost kernel: CDB 3 0 0 0 20 0
Apr  9 14:22:16 localhost kernel: STACK: 0xdd 0x0 0x0 0x0 0x0 0x0 0x0 0x2e
Apr  9 14:22:16 localhost kernel: <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Apr  9 14:22:16 localhost kernel: DevQ(0:2:0): 0 waiting
Apr  9 14:22:16 localhost kernel: DevQ(0:3:0): 0 waiting
Apr  9 14:22:16 localhost kernel: DevQ(0:6:0): 0 waiting
Apr  9 14:22:16 localhost kernel: scsi1:0:3:0: Unable to deliver message
Apr  9 14:22:16 localhost kernel: Recovery code sleeping
Apr  9 14:22:21 localhost kernel: Recovery code awake
Apr  9 14:22:21 localhost kernel: Timer Expired
Apr  9 14:22:21 localhost kernel: scsi1: Device reset returning 0x2003
Apr  9 14:22:21 localhost kernel: Recovery code sleeping
Apr  9 14:22:26 localhost kernel: Recovery code awake
Apr  9 14:22:26 localhost kernel: Timer Expired
Apr  9 14:22:26 localhost kernel: scsi1: Device reset returning 0x2003
Apr  9 14:22:26 localhost kernel: Recovery SCB completes
Apr  9 14:22:26 localhost kernel: Recovery SCB completes
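
The register dumps above print most fields as NAME[0xNN].  A minimal
sketch, assuming only that notation, for collecting those fields into a
dictionary so that two dumps can be compared field by field.  The function
name parse_registers is hypothetical and is not part of the driver or any
tool mentioned in this thread.

```python
import re

# Matches fields printed as NAME[0xNN], e.g. "DFFSTAT[0x31]".
FIELD_RE = re.compile(r"([A-Z][A-Z0-9_]*)\[0x([0-9a-fA-F]+)\]")


def parse_registers(lines):
    """Collect NAME[0x..] fields from dump lines into a dict of ints.

    Later occurrences overwrite earlier ones, so feed one dump
    (or one FIFO section) at a time.
    """
    regs = {}
    for line in lines:
        for name, value in FIELD_RE.findall(line):
            regs[name] = int(value, 16)
    return regs


# Two lines copied from the first dump in the attachment.
sample = [
    "Apr  9 14:22:06 localhost kernel: DFFSTAT[0x31] SCSISIGI[0x4] SCSIPHASE[0x0] SCSIBUS[0x0]",
    "Apr  9 14:22:06 localhost kernel: LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10]",
]
regs = parse_registers(sample)
print({name: hex(value) for name, value in regs.items()})
```

Fields printed in other notations (for example "LONGJMP == 0x80ff" or
"qinstart = 10969") are deliberately not handled by this sketch.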


* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
@ 2003-04-09 19:26 Cress, Andrew R
  2003-04-09 20:23 ` Duncan Gibb
  0 siblings, 1 reply; 7+ messages in thread
From: Cress, Andrew R @ 2003-04-09 19:26 UTC (permalink / raw)
  To: 'Duncan Gibb', Rohit Gupta; +Cc: linux-scsi

Duncan & Rohit,

What version of the aic79xx driver is in that kernel?  I hope it is >=
1.3.1?

Andy


* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
  2003-04-09 19:26 Cress, Andrew R
@ 2003-04-09 20:23 ` Duncan Gibb
  2003-04-09 20:27   ` Justin T. Gibbs
  0 siblings, 1 reply; 7+ messages in thread
From: Duncan Gibb @ 2003-04-09 20:23 UTC (permalink / raw)
  To: Cress, Andrew R; +Cc: Rohit Gupta, linux-scsi

On Wed, 2003-04-09 at 20:26, Cress, Andrew R wrote:

AC> What version of the aic79xx driver is in that kernel?
AC> I hope it is >= 1.3.1?

It's 1.3.0.  I'll see if I can find and build a later one.  I take it
there is a known serious bug in <1.3.1, then?


Duncan





* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
  2003-04-09 20:23 ` Duncan Gibb
@ 2003-04-09 20:27   ` Justin T. Gibbs
  2003-04-09 22:01     ` Duncan Gibb
  0 siblings, 1 reply; 7+ messages in thread
From: Justin T. Gibbs @ 2003-04-09 20:27 UTC (permalink / raw)
  To: Duncan Gibb, Cress, Andrew R; +Cc: Rohit Gupta, linux-scsi

> On Wed, 2003-04-09 at 20:26, Cress, Andrew R wrote:
> 
> AC> What version of the aic79xx driver is in that kernel?
> AC> I hope it is >= 1.3.1?
> 
> It's 1.3.0.  I'll see if I can find and build a later one.  I take it
> there is a known serious bug in <1.3.1, then?

The latest driver is 1.3.6:

http://people.FreeBSD.org/~gibbs/linux/SRC/
http://people.FreeBSD.org/~gibbs/linux/DUD/aic79xx/
http://people.FreeBSD.org/~gibbs/linux/RPM/aic79xx/

There have been several bugs fixed since 1.3.0.  See the CHANGELOG file
in the source distributions for details.  Depending on the drives you are
using, you may also need to run with a fairly low tag depth (32 is the
default we use in our binary distributions) to get the drivers stable.

--
Justin



* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
  2003-04-09 20:27   ` Justin T. Gibbs
@ 2003-04-09 22:01     ` Duncan Gibb
  2003-04-10 16:55       ` Justin T. Gibbs
  0 siblings, 1 reply; 7+ messages in thread
From: Duncan Gibb @ 2003-04-09 22:01 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Cress, Andrew R, Rohit Gupta, linux-scsi

On Wed, 2003-04-09 at 21:27, Justin T. Gibbs wrote:

JG> The latest driver is 1.3.6:

I superimposed your 2.4-20030328 driver over my kernel tree and
rebuilt.  It still locked up :-(

JG> Depending on the drives you are using, you may also need to run
JG> with a fairly low tag depth {..} to get the drivers stable.

I tried lowering global tag depth to 4 (which I presume is a low number,
but I don't really know what I'm doing).  And it still locks up.
Moreover, according to /proc/scsi/aic79xx/[01], the driver has
negotiated "Max Tagged Openings 0" with all the devices on this bus.
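
A minimal sketch for pulling the "Max Tagged Openings" values out of the
text of /proc/scsi/aic79xx/[01].  Only the phrase "Max Tagged Openings"
followed by a number comes from this message; the surrounding line layout
in the sample below is an assumption made for illustration, and the
function name tagged_openings is hypothetical.

```python
import re

# The quoted phrase, followed by a decimal number.  An optional ':' or '='
# separator is tolerated because the exact file layout is not shown here.
OPENINGS_RE = re.compile(r"Max Tagged Openings\s*[:=]?\s*(\d+)")


def tagged_openings(text):
    """Return every 'Max Tagged Openings' value found in text, in order."""
    return [int(n) for n in OPENINGS_RE.findall(text)]


# HYPOTHETICAL sample: the "Target N:" prefix is invented for illustration.
sample = (
    "Target 2: Max Tagged Openings 0\n"
    "Target 3: Max Tagged Openings 0\n"
    "Target 6: Max Tagged Openings 0\n"
)
values = tagged_openings(sample)
print(values, "all zero" if values and not any(values) else "")

# On the real machine the text would come from the proc files, e.g.:
#   import glob
#   for path in glob.glob("/proc/scsi/aic79xx/*"):
#       with open(path) as fh:
#           print(path, tagged_openings(fh.read()))
```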

I noticed /proc/scsi/aic79xx/1 correctly refers to the controller as
Channel B, but all the device info says Channel A.  Hope this doesn't
mean it's getting scsi0 and scsi1 mixed up at a lower level.

My other theory is that all the devices on this bus are removable in one
form or another, and hence are being polled for media changes.  The
actions which cause the bus/driver to lock up are things which need a
long period (several seconds) of data transfer - scanning in colour,
writing a CD.  Could the disconnect logic be getting screwed up
somewhere?  How could I test that?


Duncan




* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
  2003-04-09 22:01     ` Duncan Gibb
@ 2003-04-10 16:55       ` Justin T. Gibbs
  2003-04-10 21:00         ` Duncan Gibb
  0 siblings, 1 reply; 7+ messages in thread
From: Justin T. Gibbs @ 2003-04-10 16:55 UTC (permalink / raw)
  To: Duncan Gibb; +Cc: Cress, Andrew R, Rohit Gupta, linux-scsi

> On Wed, 2003-04-09 at 21:27, Justin T. Gibbs wrote:
> 
> JG> The latest driver is 1.3.6:
> 
> I superimposed your 2.4-20030328 driver over my kernel tree and
> rebuilt.  It still locked up :-(

Do you have the nmi_watchdog enabled?  What bus speed are you running
for the aic7902 and the gig-E card?  Are they on the same physical PCI/PCI-X
bus?

> I tried lowering global tag depth to 4 (which I presume is a low number,
> but I don't really know what I'm doing).  And it still locks up.
> Moreover, according to /proc/scsi/aic79xx/[01], the driver has
> negotiated "Max Tagged Openings 0" with all the devices on this bus.

That seems really weird - like you disabled disconnection.

> I noticed /proc/scsi/aic79xx/1 correctly refers to the controller as
> Channel B, but all the device info says Channel A.  Hope this doesn't
> mean it's getting scsi0 and scsi1 mixed up at a lower level.

Yes, that is a bit confusing.  The two channels are actually two independent,
single channel, controllers.  The user doesn't know that, and expects the
names to match those silk-screened on the card.  I'll review the code to
see if I can make it less confusing (perhaps just omit the channel identifier).

> My other theory is that all the devices on this bus are removable in one
> form or another, and hence are being polled for media changes.  The
> actions which cause the bus/driver to lock up are things which need a
> long period (several seconds) of data transfer - scanning in colour,
> writing a CD.  Could the disconnect logic be getting screwed up
> somewhere?  How could I test that?

A good start would be to send me privately (no need to spam the list) the
output of "cat /proc/scsi/aic79xx/*" and "cat /proc/scsi/scsi" as well
as a dmesg from the system.  From the last trace you sent, it did look
like we timed out while a command without the disconnection privilege was
out on the bus, but it's not clear why yet.  If you compile the driver
with debugging enabled and a debug mask of 8, I can also see the content
of the serial eeprom to see if the settings are strange.
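
The "Pending list" entries in the earlier trace can be decoded roughly as
follows.  This is a sketch under stated assumptions: the bit masks below
(target ID in the upper nibble of SCB_SCSIID, initiator ID in the lower
nibble, 0x40 in SCB_CONTROL meaning "disconnect enabled") are recalled
from the driver's register definitions and are not confirmed anywhere in
this thread, so they should be checked against the driver source.  The
name decode_pending is hypothetical.

```python
import re

# ASSUMPTION: masks recalled from the aic79xx register definitions,
# not stated in this thread.  Verify before relying on them.
TID_MASK = 0xF0  # target ID, upper nibble of SCB_SCSIID
OID_MASK = 0x0F  # our (initiator) ID, lower nibble of SCB_SCSIID
DISCENB = 0x40   # "disconnect enabled" bit in SCB_CONTROL

PENDING_RE = re.compile(
    r"(\d+) SCB_CONTROL\[0x([0-9a-f]+)\] "
    r"SCB_SCSIID\[0x([0-9a-f]+)\] SCB_TAG\[0x([0-9a-f]+)\]"
)


def decode_pending(line):
    """Decode one 'Pending list' line; return None if it does not match."""
    m = PENDING_RE.search(line)
    if m is None:
        return None
    control = int(m.group(2), 16)
    scsiid = int(m.group(3), 16)
    return {
        "scb": int(m.group(1)),
        "target": (scsiid & TID_MASK) >> 4,
        "initiator": scsiid & OID_MASK,
        "disconnect_enabled": bool(control & DISCENB),
    }


# The two pending entries copied from the first dump.
lines = [
    "Apr  9 14:22:06 localhost kernel:   3 SCB_CONTROL[0x48] SCB_SCSIID[0x27] SCB_TAG[0x3]",
    "Apr  9 14:22:06 localhost kernel:   2 SCB_CONTROL[0x0] SCB_SCSIID[0x37] SCB_TAG[0x2]",
]
decoded = [decode_pending(line) for line in lines]
for entry in decoded:
    print(entry)
```

Under those assumed masks, the entry for SCB 2 decodes to target 3 with the
disconnect bit clear, which would be consistent with the observation above
about a command without the disconnection privilege.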

--
Justin



* RE: aic79xx U320 + e1000 Intel hangs on Idual Xeon 7505
  2003-04-10 16:55       ` Justin T. Gibbs
@ 2003-04-10 21:00         ` Duncan Gibb
  0 siblings, 0 replies; 7+ messages in thread
From: Duncan Gibb @ 2003-04-10 21:00 UTC (permalink / raw)
  To: Justin T. Gibbs; +Cc: Cress, Andrew R, Rohit Gupta, linux-scsi

On Thu, 2003-04-10 at 17:55, Justin T. Gibbs wrote:

JG> Do you have the nmi_watchdog enabled?

Not normally.  I tried booting with "nmi_watchdog=1" and it made no
difference.  Note that in my case (unlike Rohit Gupta's), it's not a
complete kernel lockup - the system is fine apart from one dead SCSI
channel.  Processes which try to touch devices on ChB hang apparently
indefinitely.

JG> What bus speed are you running for the aic7902 and the gig-E card?

The BIOS says "AUTO"; /proc/pci says 66MHz.

JG> Are they on the same physical PCI/PCI-X bus?

No.  SuperMicro built the board such that PCI-X slots 1 and 2 and the
SCSI controller are on one bus, and PCI-X slot 3 and the Ethernet
interface are on another.  All the slots are empty.


DG> the driver has negotiated "Max Tagged Openings 0" with
DG> all the devices on this bus.

JG> That seems really weird - like you disabled disconnection.

I don't know how I would do that with this hardware.  SCSIselect doesn't
have that as an option (and has everything defaulted except ChB
termination), and none of my hardware can be jumpered that way AFAIK.


DG> /proc/scsi/aic79xx/1 {..} controller as Channel B,
DG> but all the device info says Channel A.

JG> The two channels are actually two independent, single channel,
JG> controllers.  The user doesn't know that

The user has been misled by the marketing info and the motherboard
manual ;-)

JG> I'll review the code to see if I can make it less confusing
JG> (perhaps just omit the channel identifier).

Presumably channel identifiers are useful in some cases, so IMHO the
most logical option would be to make them match what's reported by
/proc/scsi/scsi.  There the channels are all called "00".


JG> send me privately {..} the output of "cat /proc/scsi/aic79xx/*"
JG> and "cat /proc/scsi/scsi" as well as a dmesg from the system.
JG> {..} If you compile the driver with debugging enabled and a
JG> debug mask of 8, I can also see the content of the serial
JG> eeprom to see if the settings are strange.

OK.  I noticed you released new source today, so I will download that
and play 'make' again.


Duncan


