public inbox for linux-kernel@vger.kernel.org
* Fatal problem, possibly related to AIC79xx
@ 2004-07-10 15:21 Antonin Kral
  2004-07-11 20:07 ` Willy Tarreau
  0 siblings, 1 reply; 6+ messages in thread
From: Antonin Kral @ 2004-07-10 15:21 UTC
  To: linux-kernel

Hi all,

I'm trying to install Linux (Debian in particular) on our new servers.
These servers are based on the SuperMicro X5DL8-GG motherboard with an
aic7902 (without RAID), 1GB RAM, and one Xeon 3.06GHz.

I have two really strange problems. First of all, I have noticed that
with SMP support enabled the kernel detects TWO processors, but only one
is physically installed.

The second problem is that I am unable to run almost any program.
E.g. if I try to execute free I get "Illegal instruction"; for mount
I get "Segmentation Fault".

What I've tried:

  Vanilla kernels 2.4.25, 2.4.26, 2.6.6, 2.6.7. And almost all
combinations with/without:
          * SMP
          * APIC
          * Highmem
          * MTRR

All without ACPI, and with aic79xx and e1000 built into the kernel.

  I've also tried Knoppix 3.4. The strange thing was that I was unable to
load the module for aic79xx, because of "no such device".

Does anyone have any idea how to solve my problems?

  Thank you, best regards,

        Antonin Kral



* Re: Fatal problem, possibly related to AIC79xx
       [not found] <A6974D8E5F98D511BB910002A50A6647615FFBBF@hdsmsx403.hd.intel.com>
@ 2004-07-11  2:30 ` Len Brown
  2004-07-11  6:13   ` AIC79xx problem [was; Re: Fatal problem, possibly related to AIC79xx] Antonin Kral
  0 siblings, 1 reply; 6+ messages in thread
From: Len Brown @ 2004-07-11  2:30 UTC
  To: Antonin Kral; +Cc: linux-kernel

On Sat, 2004-07-10 at 11:21, Antonin Kral wrote:

> SuperMicro X5DL8-GG with aic7902
> ... one XEON 3.06GHz
> 
> I have two really strange problems. First of all, I have noticed that
> with SMP support enabled the kernel detects TWO processors, but only one
> is physically installed.

If you'd like to have just 1 processor instead of two, then
enter the BIOS SETUP and disable HyperThreading (HT),
or boot the SMP kernel with maxcpus=1.

I have no insight into your potential AIC79XX problem...

cheers,
-Len
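
A quick way to confirm that the "second processor" is a HyperThreading
sibling rather than a second package is to compare the number of logical
processors against the number of distinct physical packages in
/proc/cpuinfo. The sketch below is only an illustration and is not part of
the advice above: it assumes the kernel exports "physical id" fields (x86
kernels with HT support generally do), and the helper name count_cpus and
the SAMPLE text are made up for the example.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_packages) from /proc/cpuinfo-style text."""
    logical = 0
    packages = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key = key.strip()
        value = value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            packages.add(value)
    # Assumption: if "physical id" is not reported, treat every logical
    # CPU as its own package.
    physical = len(packages) if packages else logical
    return logical, physical


# Hypothetical sample resembling one HT-enabled CPU (not taken from the thread).
SAMPLE = """\
processor\t: 0
physical id\t: 0
siblings\t: 2

processor\t: 1
physical id\t: 0
siblings\t: 2
"""

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    logical, physical = count_cpus(text)
    print(f"{logical} logical CPU(s) on {physical} physical package(s)")
```

Two logical CPUs on one physical package would match the single-Xeon,
HT-enabled configuration described in this thread.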




* AIC79xx problem [was; Re: Fatal problem, possibly related to AIC79xx]
  2004-07-11  2:30 ` Fatal problem, possibly related to AIC79xx Len Brown
@ 2004-07-11  6:13   ` Antonin Kral
  2004-07-11 13:12     ` AIC79xx problem Antonin Kral
  0 siblings, 1 reply; 6+ messages in thread
From: Antonin Kral @ 2004-07-11  6:13 UTC
  To: Len Brown; +Cc: linux-kernel

* Len Brown <len.brown@intel.com> [2004-07-11 04:30] wrote:
> On Sat, 2004-07-10 at 11:21, Antonin Kral wrote:
> If you'd like to have just 1 processor instead of two, then
> enter the BIOS SETUP and disable HyperThreading (HT),
> or boot the SMP kernel with maxcpus=1.

Yes, you are right, and I realized this myself as well. Sorry for making
waves. But at the time, this was all I was able to find.

> I have no insight into your potential AIC79XX problem...

I've checked that the problem is closely related to AIC79xx, because it
arises when I use SCSI. I tried an old IDE disk, and with it everything
works perfectly.

    Antonin


* Re: AIC79xx problem
  2004-07-11  6:13   ` AIC79xx problem [was; Re: Fatal problem, possibly related to AIC79xx] Antonin Kral
@ 2004-07-11 13:12     ` Antonin Kral
  0 siblings, 0 replies; 6+ messages in thread
From: Antonin Kral @ 2004-07-11 13:12 UTC
  To: linux-kernel

Hi all,

> I've checked that the problem is closely related to AIC79xx, because it
> arises when I use SCSI. I tried an old IDE disk, and with it everything
> works perfectly.

I still have problems with the AIC79xx driver, but I've finally got some
kernel output. The problems seem to arise only when writing to the disk.
When I install the system onto the same disk in another computer, I can
run programs without problems, but while writing to the disk I receive a
lot of messages similar to these:

scsi1: PCI error Interrupt
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi1: Dumping Card State at program address 0xe Mode 0x33
Card was paused
HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] 
DFFSTAT[0x1] SCSISIGI[0x24] SCSIPHASE[0x1] SCSIBUS[0x94] 
LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10] 
SEQINTCTL[0x8] SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] SSTAT0[0x0] 
SSTAT1[0x19] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] 
SIMODE1[0xa4] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] 
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x81] 

SCB Count = 30 CMDS_PENDING = 2 LASTSCB 0x6 CURRSCB 0x1d NEXTSCB 0xff00
qinstart = 1103 qinfifonext = 1103
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
 29 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
  6 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
Total 2
Kernel Free SCB list: 21 28 4 17 27 24 20 16 23 22 3 2 1 5 14 13 12 11 0 9 8 7 19 15 18 10 26 25 
Sequencer Complete DMA-inprog list: 
Sequencer Complete list: 
Sequencer DMA-Up and Complete list: 

scsi1: FIFO0 Active, LONGJMP == 0x80ff, SCB 0x6
SEQIMODE[0x3f] SEQINTSRC[0x20] DFCNTRL[0x0] DFSTATUS[0x89] 
SG_CACHE_SHADOW[0x38] SG_STATE[0x0] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0xc] SHADDR = 0x0361cb000, SHCNT = 0x1000 
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x0] 
scsi1: FIFO1 Active, LONGJMP == 0x257, SCB 0x6
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x2c] DFSTATUS[0xc9] 
SG_CACHE_SHADOW[0x38] SG_STATE[0x3] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0x2] SHADDR = 0x0361cc000, SHCNT = 0x0 
HADDR = 0x0361cc000, HCNT = 0x0 CCSGCTL[0x10] 
LQIN: 0x5 0x0 0x0 0x6 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x80 0x0 0x0 0x0 0x2 0x0 
scsi1: LQISTATE = 0x25, LQOSTATE = 0x0, OPTIONMODE = 0x42
scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x3
SIMODE0[0xc] 
CCSCBCTL[0x0] 
scsi1: REG0 == 0x15, SINDEX = 0x133, DINDEX = 0x108
scsi1: SCBPTR == 0x15, SCB_NEXT == 0x6, SCB_NEXT2 == 0xffe4
CDB 0 10 0 80 60 96
STACK: 0x125 0x0 0x0 0x25e 0x257 0x257 0x29 0x1
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:3:0): 0 waiting
DevQ(0:6:0): 0 waiting
scsi1: Split completion read data parity error in DFF1
scsi1: Address or Write Phase Parity Error Detected in DFF1.
scsi1: PCI error Interrupt
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi1: Dumping Card State at program address 0x27 Mode 0x22
Card was paused
HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] 
DFFSTAT[0x11] SCSISIGI[0x24] SCSIPHASE[0x1] SCSIBUS[0x95] 
LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10] 
SEQINTCTL[0x0] SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] SSTAT0[0x0] 
SSTAT1[0x19] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] 
SIMODE1[0xa4] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] 
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x81] 

SCB Count = 30 CMDS_PENDING = 2 LASTSCB 0x6 CURRSCB 0x1d NEXTSCB 0xff00
qinstart = 1103 qinfifonext = 1103
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
 29 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
  6 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
Total 2
Kernel Free SCB list: 21 28 4 17 27 24 20 16 23 22 3 2 1 5 14 13 12 11 0 9 8 7 19 15 18 10 26 25 
Sequencer Complete DMA-inprog list: 
Sequencer Complete list: 
Sequencer DMA-Up and Complete list: 

scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x6
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89] 
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0 
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x0] 
scsi1: FIFO1 Active, LONGJMP == 0x257, SCB 0x6
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x2c] DFSTATUS[0xc9] 
SG_CACHE_SHADOW[0x28] SG_STATE[0x3] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0x2] SHADDR = 0x0361ad000, SHCNT = 0x0 
HADDR = 0x0361ad000, HCNT = 0x0 CCSGCTL[0x10] 
LQIN: 0x5 0x0 0x0 0x6 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x80 0x0 0x0 0x0 0x2 0x0 
scsi1: LQISTATE = 0x25, LQOSTATE = 0x0, OPTIONMODE = 0x42
scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x3
SIMODE0[0xc] 
CCSCBCTL[0x0] 
scsi1: REG0 == 0x1d, SINDEX = 0x122, DINDEX = 0x108
scsi1: SCBPTR == 0x15, SCB_NEXT == 0x6, SCB_NEXT2 == 0xffe4
CDB 0 10 0 80 60 96
STACK: 0x15 0x125 0x0 0x0 0x25e 0x257 0x93 0x29
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:3:0): 0 waiting
DevQ(0:6:0): 0 waiting
scsi1: Split completion read data parity error in DFF1
scsi1: Address or Write Phase Parity Error Detected in DFF1.
scsi1: PCI error Interrupt
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
scsi1: Dumping Card State at program address 0x93 Mode 0x11
Card was paused
HS_MAILBOX[0x0] INTCTL[0x80] SEQINTSTAT[0x0] SAVED_MODE[0x11] 
DFFSTAT[0x11] SCSISIGI[0x24] SCSIPHASE[0x1] SCSIBUS[0x95] 
LASTPHASE[0x1] SCSISEQ0[0x0] SCSISEQ1[0x12] SEQCTL0[0x10] 
SEQINTCTL[0x0] SEQ_FLAGS[0xc0] SEQ_FLAGS2[0x0] SSTAT0[0x0] 
SSTAT1[0x19] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0x0] 
SIMODE1[0xa4] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x0] 
LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x81] 

SCB Count = 30 CMDS_PENDING = 2 LASTSCB 0x6 CURRSCB 0x6 NEXTSCB 0xffc0
qinstart = 1120 qinfifonext = 1120
QINFIFO:
WAITING_TID_QUEUES:
Pending list:
  6 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
 20 FIFO_USE[0x0] SCB_CONTROL[0x60] SCB_SCSIID[0x37] 
Total 2
Kernel Free SCB list: 24 27 17 4 28 21 29 16 23 22 3 2 1 5 14 13 12 11 0 9 8 7 19 15 18 10 26 25 
Sequencer Complete DMA-inprog list: 
Sequencer Complete list: 
Sequencer DMA-Up and Complete list: 

scsi1: FIFO0 Free, LONGJMP == 0x80ff, SCB 0x18
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89] 
SG_CACHE_SHADOW[0x2] SG_STATE[0x0] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0x5] SHADDR = 0x00, SHCNT = 0x0 
HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x0] 
scsi1: FIFO1 Active, LONGJMP == 0x257, SCB 0x14
SEQIMODE[0x3f] SEQINTSRC[0x0] DFCNTRL[0x2c] DFSTATUS[0xc9] 
SG_CACHE_SHADOW[0x28] SG_STATE[0x3] DFFSXFRCTL[0x0] 
SOFFCNT[0x7e] MDFFSTAT[0x2] SHADDR = 0x036461000, SHCNT = 0x0 
HADDR = 0x036461000, HCNT = 0x0 CCSGCTL[0x10] 
LQIN: 0x5 0x0 0x0 0x14 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x80 0x0 0x0 0x0 0x2 0x0 
scsi1: LQISTATE = 0x25, LQOSTATE = 0x0, OPTIONMODE = 0x42
scsi1: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1
SIMODE0[0xc] 
CCSCBCTL[0x0] 
scsi1: REG0 == 0x60, SINDEX = 0x122, DINDEX = 0x108
scsi1: SCBPTR == 0x14, SCB_NEXT == 0xffc0, SCB_NEXT2 == 0xffe4
CDB 2a 0 0 0 28 90
STACK: 0x29 0x15 0x125 0x0 0x0 0x257 0x257 0x257
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
DevQ(0:3:0): 0 waiting
DevQ(0:6:0): 0 waiting
scsi1: Split completion read data parity error in DFF1
scsi1: Address or Write Phase Parity Error Detected in DFF1.
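
When comparing several of these dumps, it can help to pull the NAME[0x..]
register values into a dictionary and diff them between dumps. The
following is a rough sketch under stated assumptions, not a tool from the
driver: the function names parse_registers, diff_registers and
cdb_opcode_name are invented for this example, the CDB bytes are assumed
to be printed in hex, and the opcode table lists only a few standard SCSI
opcodes. It does not interpret individual register bits, which would
require the aic79xx register definitions.

```python
import re

# Matches dump fields such as "DFFSTAT[0x11]".
REG_RE = re.compile(r"([A-Z][A-Z0-9_]*)\[(0x[0-9a-fA-F]+)\]")

# A few standard SCSI opcodes; deliberately incomplete.
SCSI_OPCODES = {
    0x00: "TEST UNIT READY",
    0x08: "READ(6)",
    0x0A: "WRITE(6)",
    0x12: "INQUIRY",
    0x28: "READ(10)",
    0x2A: "WRITE(10)",
}


def parse_registers(dump_text):
    """Map register name -> value; a repeated name keeps its first value."""
    regs = {}
    for name, value in REG_RE.findall(dump_text):
        regs.setdefault(name, int(value, 16))
    return regs


def diff_registers(a, b):
    """Return {name: (value_in_a, value_in_b)} for registers that differ."""
    return {n: (a[n], b[n]) for n in a.keys() & b.keys() if a[n] != b[n]}


def cdb_opcode_name(cdb_line):
    """Name the opcode (first byte) of a line like 'CDB 2a 0 0 0 28 90'."""
    fields = cdb_line.split()
    opcode = int(fields[1], 16)
    return SCSI_OPCODES.get(opcode, f"unknown (0x{opcode:02x})")


if __name__ == "__main__":
    # Excerpts copied from the first and second dumps above.
    first = parse_registers(
        "DFFSTAT[0x1] SCSISIGI[0x24] SCSIPHASE[0x1] SCSIBUS[0x94]")
    second = parse_registers(
        "DFFSTAT[0x11] SCSISIGI[0x24] SCSIPHASE[0x1] SCSIBUS[0x95]")
    for name, (old, new) in sorted(diff_registers(first, second).items()):
        print(f"{name}: 0x{old:x} -> 0x{new:x}")
    print(cdb_opcode_name("CDB 2a 0 0 0 28 90"))
```

Note that names such as FIFO_USE and SEQIMODE occur more than once per
dump, so a per-FIFO comparison would need the text split into sections
first. The CDB in the third dump starts with 0x2a, which is the WRITE(10)
opcode.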

I've found some information suggesting that this could be related to the
firmware version on the disk (
http://lists.freebsd.org/pipermail/aic7xxx/2004-February/004083.html ).
Here is the output of /proc/scsi/scsi:

Attached devices: 
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: SUPER    Model: GEM318           Rev: 0   
  Type:   Processor                        ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 03 Lun: 00
  Vendor: SEAGATE  Model: ST373453LC       Rev: 0006
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi1 Channel: 00 Id: 06 Lun: 00
  Vendor: SUPER    Model: GEM318           Rev: 0   
  Type:   Processor                        ANSI SCSI revision: 02
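
To compare firmware revisions across several boxes, a listing like the one
above can be reduced to one record per device. This is a minimal sketch
that assumes the classic /proc/scsi/scsi layout shown here; the function
name parse_proc_scsi is invented for the example, and SAMPLE is a single
entry copied from the listing.

```python
import re

HOST_RE = re.compile(
    r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
INFO_RE = re.compile(
    r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(\S*)")


def parse_proc_scsi(text):
    """Return a list of dicts, one per 'Host:' entry in the listing."""
    devices = []
    current = None
    for line in text.splitlines():
        m = HOST_RE.search(line)
        if m:
            current = {
                "host": m.group(1),
                "channel": int(m.group(2)),
                "id": int(m.group(3)),
                "lun": int(m.group(4)),
            }
            devices.append(current)
            continue
        m = INFO_RE.search(line)
        if m and current is not None:
            current.update(
                vendor=m.group(1), model=m.group(2), rev=m.group(3))
    return devices


SAMPLE = """\
Attached devices:
Host: scsi1 Channel: 00 Id: 03 Lun: 00
  Vendor: SEAGATE  Model: ST373453LC       Rev: 0006
  Type:   Direct-Access                    ANSI SCSI revision: 03
"""

if __name__ == "__main__":
    for dev in parse_proc_scsi(SAMPLE):
        print(dev["host"], dev["id"], dev["vendor"], dev["model"], dev["rev"])
```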


   Any ideas? I would appreciate any help, thanks.

        Antonin


* Re: Fatal problem, possibly related to AIC79xx
  2004-07-10 15:21 Fatal problem, possibly related to AIC79xx Antonin Kral
@ 2004-07-11 20:07 ` Willy Tarreau
  2004-07-11 20:26   ` Antonin Kral
  0 siblings, 1 reply; 6+ messages in thread
From: Willy Tarreau @ 2004-07-11 20:07 UTC
  To: Antonin Kral; +Cc: linux-kernel

Looks like a hardware problem to me. Perhaps the higher transfer rates
obtained with aic7xxx trigger it faster. You should really run cpuburn
(burnBX) and memtest86 on this box.

Regards,
Willy



* Re: Fatal problem, possibly related to AIC79xx
  2004-07-11 20:07 ` Willy Tarreau
@ 2004-07-11 20:26   ` Antonin Kral
  0 siblings, 0 replies; 6+ messages in thread
From: Antonin Kral @ 2004-07-11 20:26 UTC
  To: Willy Tarreau; +Cc: linux-kernel

I've run memtest for more than 48 hours, and I actually have two boxes
with the same behaviour. I was thinking about a more thorough check of the
SCSI bus termination...

Regards,
    Antonin

* Willy Tarreau <willy@w.ods.org> [2004-07-11 22:23] wrote:
> Looks like a hardware problem to me. Perhaps the higher transfer rates
> obtained with aic7xxx trigger it faster. You should really run cpuburn (burnBX) and
> memtest86 on this box.
> 
> Regards,
> Willy

