* SATA command timeouts
From: Sven Riedel @ 2006-07-08 13:00 UTC
To: linux-ide
[-- Attachment #1: Type: text/plain, Size: 1031 bytes --]
Hi,
I hooked up new SATA disks to a machine yesterday and set up a software RAID.
After copying roughly 25 GB of data onto the array, all attached drives
reported command timeouts and would not recover; only a reboot helped.
I tried the same again, and this time the drives timed out after 60 GB of data
had been written. I exchanged the CPU, but the same thing occurs.
I'm currently testing the drives one at a time by writing 150 GB of
data onto each and checking whether it times out on its own (so far I have had
one success and one complete system hang, where even the CPU fan stopped
spinning). My next test will be write-heavy operations on all four drives
simultaneously, but without gathering them into a RAID array.
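A minimal sketch of what such a simultaneous write test could look like, assuming each drive carries a mounted filesystem and the load is plain sequential writes of zeros. The function names and the paths in the usage note are hypothetical, not the commands actually used here:

```python
import os
import threading


def write_load(path, total_bytes, chunk_size=1024 * 1024):
    """Write total_bytes of zeros to path in chunk_size pieces, then fsync."""
    chunk = bytes(chunk_size)
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            f.write(chunk[:n])
            written += n
        # Push the data out of the page cache so the drive really sees it.
        f.flush()
        os.fsync(f.fileno())
    return written


def write_load_parallel(paths, total_bytes):
    """Run write_load on every path at the same time, one thread per path."""
    results = {}

    def worker(p):
        results[p] = write_load(p, total_bytes)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With one (hypothetical) mount point per drive, a call such as `write_load_parallel(["/mnt/a/load.bin", "/mnt/b/load.bin", "/mnt/c/load.bin", "/mnt/d/load.bin"], 150 * 10**9)` would load all four disks at once. Pointing it at a raw device node instead of a file would overwrite whatever is on that disk.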
Used hardware:
- 4 WD 400GB SATA RE2 drives
- Promise SATA300 TX4 controller
All of this happened first under kernel 2.6.16.14 and again after upgrading to
2.6.17.4.
dmesg output is attached.
Is there anything else I can test to debug this problem?
Regs,
Sven
--
Sven Riedel sr@gimp.org
[-- Attachment #2: dmesg_timeout --]
[-- Type: text/plain, Size: 1130 bytes --]
Jul 7 22:26:05 planck kernel: ata1: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:05 planck kernel: sda: Current: sense key=0x0
Jul 7 22:26:05 planck kernel: ASC=0x0 ASCQ=0x0
Jul 7 22:26:05 planck kernel: ata3: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:05 planck kernel: sdc: Current: sense key=0x0
Jul 7 22:26:05 planck kernel: ASC=0x0 ASCQ=0x0
Jul 7 22:26:05 planck kernel: ata2: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:05 planck kernel: sdb: Current: sense key=0x0
Jul 7 22:26:05 planck kernel: ASC=0x0 ASCQ=0x0
Jul 7 22:26:35 planck kernel: ata1: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:35 planck kernel: sda: Current: sense key=0x0
Jul 7 22:26:35 planck kernel: ASC=0x0 ASCQ=0x0
Jul 7 22:26:35 planck kernel: ata4: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:35 planck kernel: sdd: Current: sense key=0x0
Jul 7 22:26:35 planck kernel: ASC=0x0 ASCQ=0x0
Jul 7 22:26:35 planck kernel: ata3: status=0x50 { DriveReady SeekComplete }
Jul 7 22:26:35 planck kernel: sdc: Current: sense key=0x0
Jul 7 22:26:35 planck kernel: ASC=0x0 ASCQ=0x0
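For reference, the status=0x50 value in the log above decodes to exactly the two flags printed next to it. A minimal sketch, assuming the standard ATA status register bit layout; the label strings mirror the ones in the log, and the function name is made up for illustration:

```python
# Bit masks of the ATA status register (standard ATA bit positions).
# decode_ata_status is a hypothetical helper, not a kernel or library API.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


print(decode_ata_status(0x50))  # ['DriveReady', 'SeekComplete']
```

Neither the Busy bit nor the Error bit is set in 0x50, and the sense key is 0x0, so at the moment of the timeout the drives themselves report no error.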
[-- Attachment #3: dmesg_boot --]
[-- Type: text/plain, Size: 4259 bytes --]
Jul 7 20:40:39 planck kernel: SCSI subsystem initialized
Jul 7 20:40:40 planck kernel: ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 18 (level, low) -> IRQ 201
Jul 7 20:40:40 planck kernel: ata1: SATA max UDMA/133 cmd 0xF8A80200 ctl 0xF8A80238 bmdma 0x0 irq 201
Jul 7 20:40:40 planck kernel: ata2: SATA max UDMA/133 cmd 0xF8A80280 ctl 0xF8A802B8 bmdma 0x0 irq 201
Jul 7 20:40:40 planck kernel: ata3: SATA max UDMA/133 cmd 0xF8A80300 ctl 0xF8A80338 bmdma 0x0 irq 201
Jul 7 20:40:40 planck kernel: ata4: SATA max UDMA/133 cmd 0xF8A80380 ctl 0xF8A803B8 bmdma 0x0 irq 201
Jul 7 20:40:40 planck kernel: ata1: SATA link up 1.5 Gbps (SStatus 113)
Jul 7 20:40:40 planck kernel: ata1: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
Jul 7 20:40:40 planck kernel: ata1: dev 0 configured for UDMA/133
Jul 7 20:40:40 planck kernel: scsi0 : sata_promise
Jul 7 20:40:40 planck kernel: ata2: SATA link up 1.5 Gbps (SStatus 113)
Jul 7 20:40:40 planck kernel: ata2: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
Jul 7 20:40:40 planck kernel: ata2: dev 0 configured for UDMA/133
Jul 7 20:40:40 planck kernel: scsi1 : sata_promise
Jul 7 20:40:41 planck kernel: ata3: SATA link up 1.5 Gbps (SStatus 113)
Jul 7 20:40:41 planck kernel: ata3: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
Jul 7 20:40:41 planck kernel: ata3: dev 0 configured for UDMA/133
Jul 7 20:40:41 planck kernel: scsi2 : sata_promise
Jul 7 20:40:41 planck kernel: ata4: SATA link up 1.5 Gbps (SStatus 113)
Jul 7 20:40:41 planck kernel: ata4: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
Jul 7 20:40:41 planck kernel: ata4: dev 0 configured for UDMA/133
Jul 7 20:40:41 planck kernel: scsi3 : sata_promise
Jul 7 20:40:41 planck kernel: Vendor: ATA Model: WDC WD4000YR-01P Rev: 01.0
Jul 7 20:40:41 planck kernel: Type: Direct-Access ANSI SCSI revision: 05
Jul 7 20:40:41 planck kernel: Vendor: ATA Model: WDC WD4000YR-01P Rev: 01.0
Jul 7 20:40:41 planck kernel: Type: Direct-Access ANSI SCSI revision: 05
Jul 7 20:40:41 planck kernel: Vendor: ATA Model: WDC WD4000YR-01P Rev: 01.0
Jul 7 20:40:41 planck kernel: Type: Direct-Access ANSI SCSI revision: 05
Jul 7 20:40:41 planck kernel: Vendor: ATA Model: WDC WD4000YR-01P Rev: 01.0
Jul 7 20:40:41 planck kernel: Type: Direct-Access ANSI SCSI revision: 05
Jul 7 20:40:42 planck kernel: SCSI device sda: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sda: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sda: drive cache: write back
Jul 7 20:40:42 planck kernel: SCSI device sda: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sda: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sda: drive cache: write back
Jul 7 20:40:42 planck kernel: sda: unknown partition table
Jul 7 20:40:42 planck kernel: sd 0:0:0:0: Attached scsi disk sda
Jul 7 20:40:42 planck kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sdb: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sdb: drive cache: write back
Jul 7 20:40:42 planck kernel: SCSI device sdb: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sdb: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sdb: drive cache: write back
Jul 7 20:40:42 planck kernel: sdb: unknown partition table
Jul 7 20:40:42 planck kernel: sd 1:0:0:0: Attached scsi disk sdb
Jul 7 20:40:42 planck kernel: SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sdc: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sdc: drive cache: write back
Jul 7 20:40:42 planck kernel: SCSI device sdc: 781422768 512-byte hdwr sectors (400088 MB)
Jul 7 20:40:42 planck kernel: sdc: Write Protect is off
Jul 7 20:40:42 planck kernel: SCSI device sdc: drive cache: write back
Jul 7 20:40:42 planck kernel: sdc: unknown partition table
Jul 7 20:40:42 planck kernel: sd 2:0:0:0: Attached scsi disk sdc
Jul 7 20:40:42 planck kernel: SCSI device sdd: 781422768 512-byte hdwr sectors (400088 MB)
* Re: SATA command timeouts
From: Sven Riedel @ 2006-07-10 11:25 UTC
Cc: linux-ide
On Sat, Jul 08, 2006 at 03:00:37PM +0200, Sven Riedel wrote:
> Hi,
> I hooked up new SATA disks to a machine yesterday and set up a software RAID.
> After copying roughly 25 GB of data onto the array, all attached drives
> reported command timeouts and would not recover; only a reboot helped.
> I tried the same again, and this time the drives timed out after 60 GB of data
> had been written. I exchanged the CPU, but the same thing occurs.
Well, it seems the second CPU was defective. After switching back to the first
CPU, the hard system crashes stopped occurring.
> My
> next test will be doing write-heavy operations on all four drives
> simultaneously, but without gathering them into a raid array.
Well, it looks like the problem is reproduced most reliably when all
four drives are under load. It happens even sooner when they are gathered into
a RAID array (I only tested level 5; it could be that the background "rebuild"
after creation adds extra load, which triggers the problem sooner).
Just to rule out an insufficient power supply, I hooked the drives up to a
completely independent PSU, and all of the drives still suddenly hit
command timeouts after a while.
Regs,
Sven
--
Sven Riedel sr@gimp.org