public inbox for linux-kernel@vger.kernel.org
* 3ware 9550 using 3w-9xxx driver lockups ?
@ 2006-08-08 14:19 Andy Davidson
  2006-08-08 18:32 ` adam radford
  2006-08-09  5:29 ` Florian Weimer
  0 siblings, 2 replies; 4+ messages in thread
From: Andy Davidson @ 2006-08-08 14:19 UTC (permalink / raw)
  To: linux-kernel



Hi,

One of our servers locked up (no kernel panic, but no response to 
console/network) with the following output on console :

Aug  8 14:10:34 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x4383A7F.
Aug  8 14:10:39 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x438C4D9.
Aug  8 14:10:41 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x438C4DC.
Aug  8 14:10:47 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x438FB34.
Aug  8 14:10:49 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x438FB36.
Aug  8 14:10:52 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x439051D.
Aug  8 14:11:08 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING (0x04:0x0023): Sector repair completed:port=1,LBA=0x4378D8A.

I'd put this down to bad hardware until an identically spec'd machine 
(with fresh disks) did the same thing.


Has anyone else noticed any issues using the newer 3ware 9550SX cards with 
the 3w-9xxx driver?


Other details :
Linux how-mail-1 2.6.16-2-686-smp #1 SMP Sat Jul 15 22:33:00 UTC 2006 i686

from lspci:
03:03.0 RAID bus controller: 3ware Inc 9550SX SATA-RAID


3ware 9000 Storage Controller device driver for Linux v2.26.02.007.
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 24 (level, low) -> IRQ 177
scsi0 : 3ware 9000 Storage Controller
3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xda300000, IRQ: 177.
3w-9xxx: scsi0: Firmware FE9X 3.01.01.028, BIOS BE9X 3.01.00.024, Ports: 4.
  Vendor: AMCC      Model: 9550SX-4LP DISK  Rev: 3.01
  Type:   Direct-Access                      ANSI SCSI revision: 03f


We'd welcome any help.

Thanks,
Andy
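
[Editor's sketch] The AEN warnings in the message above all name the same
port, and their LBAs sit close together. A minimal Python sketch of how such
console output could be summarized per port is shown here. The regular
expression is an assumption based only on the seven lines quoted in this
thread, not on the 3w-9xxx driver's full message format, and the names
AEN_RE and summarize_aens are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not the driver's documented format). \s+ also tolerates the line wrap that
# mail clients insert between "WARNING" and the "(0x04:0x0023)" code.
AEN_RE = re.compile(
    r"AEN:\s+(?P<severity>\w+)\s+"
    r"\((?P<cls>0x[0-9A-Fa-f]+):(?P<code>0x[0-9A-Fa-f]+)\):\s+"
    r"(?P<message>[^:\n]+):port=(?P<port>\d+),LBA=(?P<lba>0x[0-9A-Fa-f]+)"
)


def summarize_aens(log_text):
    """Return (events per port, (lowest LBA, highest LBA) per port)."""
    events = [
        (int(m["port"]), int(m["lba"], 16)) for m in AEN_RE.finditer(log_text)
    ]
    per_port = Counter(port for port, _ in events)
    spans = {}
    for port, lba in events:
        low, high = spans.get(port, (lba, lba))
        spans[port] = (min(low, lba), max(high, lba))
    return per_port, spans


# Two of the entries from the original report, wrapped as they were mailed.
sample = (
    "Aug  8 14:10:34 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING \n"
    "(0x04:0x0023): Sector repair completed:port=1,LBA=0x4383A7F.\n"
    "Aug  8 14:10:39 how-mail-1 kernel: 3w-9xxx: scsi0: AEN: WARNING \n"
    "(0x04:0x0023): Sector repair completed:port=1,LBA=0x438C4D9.\n"
)

if __name__ == "__main__":
    counts, lba_spans = summarize_aens(sample)
    for port in sorted(counts):
        low, high = lba_spans[port]
        print(f"port {port}: {counts[port]} repairs, LBA {low:#x}..{high:#x}")
```

A run of repairs confined to one port and a narrow LBA range points at a
single drive (or its cable) rather than at the controller as a whole, which
is worth knowing before comparing the two machines.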


* Re: 3ware 9550 using 3w-9xxx driver lockups ?
  2006-08-08 14:19 3ware 9550 using 3w-9xxx driver lockups ? Andy Davidson
@ 2006-08-08 18:32 ` adam radford
  2006-08-08 19:10   ` Jeff V. Merkey
  2006-08-09  5:29 ` Florian Weimer
  1 sibling, 1 reply; 4+ messages in thread
From: adam radford @ 2006-08-08 18:32 UTC (permalink / raw)
  To: Andy Davidson; +Cc: linux-kernel

Looks to me like you have a bad drive (or several bad drives, if
other drives on other machines are reallocating sectors as well).

Many reallocated sectors could be a sign that your drive is about to die...
Are you booting off this disk?

Have you run smartmontools on those bad disks?

You fail to mention your raid configuration.

Are you running the latest 3ware firmware for that controller?  There
is a Linux specific userspace firmware update utility.

Also, this topic is better served on the linux-scsi email list.

-Adam



* Re: 3ware 9550 using 3w-9xxx driver lockups ?
  2006-08-08 18:32 ` adam radford
@ 2006-08-08 19:10   ` Jeff V. Merkey
  0 siblings, 0 replies; 4+ messages in thread
From: Jeff V. Merkey @ 2006-08-08 19:10 UTC (permalink / raw)
  To: adam radford; +Cc: Andy Davidson, linux-kernel

adam radford wrote:

> Looks to me like you have a bad drive (or several bad drives, if
> other drives on other machines are reallocating sectors as well).
>
> Many reallocated sectors could be a sign that your drive is about to 
> die...
> Are you booting off this disk?
>
> Have you run smartmontools on those bad disks?
>
> You fail to mention your raid configuration.
>
> Are you running the latest 3ware firmware for that controller?  There
> is a Linux specific userspace firmware update utility.
>
> Also, this topic is better served on the linux-scsi email list.
>
> -Adam
>
Adam,

Have him check the cables as well, unless he's using the multi-lane 
controller.  I see this a lot with loose ATA cables on units, though 
not since we moved over to the multi-lane architecture (great work on 
the multi-lane version of the adapters, BTW).
He should also check whether he's getting -2 bio errors.  I see -2 bio 
errors in parallel with this if it's a bad cable.

Jeff


* Re: 3ware 9550 using 3w-9xxx driver lockups ?
  2006-08-08 14:19 3ware 9550 using 3w-9xxx driver lockups ? Andy Davidson
  2006-08-08 18:32 ` adam radford
@ 2006-08-09  5:29 ` Florian Weimer
  1 sibling, 0 replies; 4+ messages in thread
From: Florian Weimer @ 2006-08-09  5:29 UTC (permalink / raw)
  To: Andy Davidson; +Cc: linux-kernel

* Andy Davidson:

> Anyone else noticed any issues using the newer 3-ware 9550S cards with
> the 3w-9xxx driver ?

We've seen them as well, but tracked it down to a faulty drive.  You
should run smartctl against your drives and look for entries in the
error log.  (smartctl has special support for 3ware controllers; see
its manpage.)
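
[Editor's sketch] The smartctl advice above could be scripted roughly as
follows. This is a minimal sketch under stated assumptions: it uses
smartctl's "-d 3ware,N" device type, the controller node is assumed to be
/dev/twa0 (the usual character device for the 3w-9xxx driver, but check your
system), and the port number reported in the AEN messages is assumed to be
the N that smartctl expects. The function names are hypothetical.

```python
import subprocess


def smartctl_3ware_cmd(port, device="/dev/twa0"):
    """Build the smartctl argv for one drive behind a 3ware controller.

    device is an assumption: /dev/twa0 is the usual node for the first
    3w-9xxx controller, but it may differ on a given system.
    """
    if port < 0:
        raise ValueError("port must be non-negative")
    return ["smartctl", "-a", "-d", f"3ware,{port}", device]


def read_smart_report(port, device="/dev/twa0"):
    """Run smartctl (normally needs root) and return its text output.

    smartctl encodes drive status in its exit code, so a non-zero exit is
    not treated as a failure here.
    """
    result = subprocess.run(
        smartctl_3ware_cmd(port, device),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Port 1 is the port named in the AEN warnings earlier in the thread.
    print(" ".join(smartctl_3ware_cmd(1)))
```

In the report, the reallocated-sector attributes and the error log section
are the parts most relevant to the "Sector repair completed" warnings.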
