* sata_nv issues with MCP51 SATA controller
@ 2007-09-13 7:46 Jon Ivar Rykkelid
2007-09-13 14:20 ` Jeff Garzik
0 siblings, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 7:46 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2290 bytes --]
Hi, I'm resending (didn't see my first attempt appear on the mailing list):
I'm having serious disk-issues when using the on-board nvidia controller
for my HDDs (My motherboard is a Gigabyte GA-N650SLI-DS4 with nvidia
chipset, cpu is intel Core2Quad)
excerpt from "lspci":
00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
(rev a1)
00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
(rev a1)
I have a normal IDE/P-ATA-disk attached to the "IDE"-controller and that
works fine (/dev/hda)
However, any number of disks (I have tried 2 and 4) connected to the
SATA-controller(s), will eventually fail. - See attached log (excerpt /
anything relevant from /var/log/messages)
At first, disks were REALLY unstable, but then I disabled S.M.A.R.T.
(both in BIOS and Linux), and I updated from the CentOS5 (equivalent of
RHEL5) kernel (2.6.18) to the latest (at that time) official kernel from
kernel.org:
> uname -a
Linux mirakel 2.6.22.5-custom_jir #2 SMP Thu Aug 30 22:06:21 CEST 2007
i686 i686 i386 GNU/Linux
Now it will normally take a day or two before SATA crashes, so things
are better, but still rather useless.
The first error when sata_nv gets into problems is always:
"exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
(as shown in the attached log-file.) - when this happens to one device,
it'll almost instantly happen to the other disk attached to that
controller as well. A couple of minutes (or so) later, the disk(s)
connected to the other controller will start acting up as well (in the
same manner). - I/O freezes, and nothing helps except a reboot...
As I run a rather large (software / md) RAID-5 disk array on this server
(I'm doing a bit of video editing), every crash means a time-consuming
rebuild of the disk-array...
I have given up on the sata_nv / nvidia-controllers for the time being.
I now resort to some old PCI-connected sata-controllers which work fine
(but slow, as they are outdated and "overloaded").
So, if anyone has a good solution / suggestion / improved driver (over
the one supplied with the official 2.6.22.5-kernel) I am eager to give
it a go and see if the situation can be resolved.
I appreciate any sensible suggestions.
BR
Jon Ivar
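
[Editorial sketch] The failure order described above (one port freezes, then its neighbour, then the ports on the second controller) can be read straight out of the attached log. The following is a minimal sketch, assuming syslog lines in the same format as the attachment; the function name and the regular expression are illustrative and not part of any kernel tooling.

```python
import re

# Matches lines such as:
#   "Sep 8 00:05:59 mirakel kernel: ata1.00: exception Emask 0x0 ... frozen"
FROZEN = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<port>ata\d+)\.\d+: exception .* frozen$"
)


def first_frozen_per_port(lines):
    """Return {port: timestamp} for the first 'frozen' exception on each port."""
    first = {}
    for line in lines:
        match = FROZEN.match(line.strip())
        if match and match.group("port") not in first:
            first[match.group("port")] = match.group("stamp")
    return first


# A few lines copied from the attached log.
sample = [
    "Sep 8 00:05:59 mirakel kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "Sep 8 00:06:30 mirakel kernel: ata1.00: qc timeout (cmd 0x27)",
    "Sep 8 00:08:12 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "Sep 8 00:08:42 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
]

for port, stamp in first_frozen_per_port(sample).items():
    print(port, "first froze at", stamp)
```

On the sample lines this lists ata1 and ata2 about a second apart, with ata3 and ata4 following roughly two minutes later, which matches the description in the message.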
[-- Attachment #2: sata_nv-error.log --]
[-- Type: text/plain, Size: 17112 bytes --]
Sep 8 00:05:59 mirakel kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:05:59 mirakel kernel: ata1.00: cmd 35/00:08:47:83:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:05:59 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:05:59 mirakel kernel: ata1: soft resetting port
Sep 8 00:05:59 mirakel kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:06:00 mirakel kernel: ata2.00: cmd c8/00:08:d7:6e:6f/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 in
Sep 8 00:06:00 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:06:00 mirakel kernel: ata2: soft resetting port
Sep 8 00:06:01 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:30 mirakel kernel: ata1.00: qc timeout (cmd 0x27)
Sep 8 00:06:30 mirakel kernel: ata1.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:30 mirakel kernel: ata1.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:30 mirakel kernel: ata1: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:31 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:06:31 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:31 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:31 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:35 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:35 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:35 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:36 mirakel kernel: ata2: hard resetting port
Sep 8 00:06:36 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:45 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:45 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:45 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:55 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:55 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:55 mirakel kernel: ata1: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:07:06 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:06 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:06 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:06 mirakel kernel: ata2.00: limiting speed to UDMA/133:PIO3
Sep 8 00:07:06 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:07:11 mirakel kernel: ata2: hard resetting port
Sep 8 00:07:12 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:07:30 mirakel kernel: ata1: hard resetting port
Sep 8 00:07:30 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:07:30 mirakel kernel: ata1: reset failed, giving up
Sep 8 00:07:30 mirakel kernel: ata1.00: disabled
Sep 8 00:07:30 mirakel kernel: ata1: EH complete
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 488407879
Sep 8 00:07:30 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:30 mirakel kernel: raid5: Disk failure on dm-0, disabling device. Operation continuing on 7 devices
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 141263543
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 4560055
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] READ CAPACITY failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Sense not available.
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Asking for cache data failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Assuming drive cache: write through
Sep 8 00:07:42 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:42 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:42 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:42 mirakel kernel: ata2.00: disabled
Sep 8 00:07:42 mirakel kernel: ata2: EH complete
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141520599
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141671879
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 488407879
Sep 8 00:07:42 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:42 mirakel kernel: raid5: Disk failure on dm-1, disabling device. Operation continuing on 6 devices
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] READ CAPACITY failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Sense not available.
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Write Protect is off
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Asking for cache data failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Assuming drive cache: write through
Sep 8 00:08:12 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:12 mirakel kernel: ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
Sep 8 00:08:12 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:13 mirakel kernel: ata3: soft resetting port
Sep 8 00:08:13 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:42 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:42 mirakel kernel: ata4.00: cmd 35/00:08:bf:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:08:42 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:43 mirakel kernel: ata4: soft resetting port
Sep 8 00:08:43 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:43 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 8 00:08:43 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:08:43 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:08:43 mirakel kernel: ata3: failed to recover some devices, retrying in 5 secs
Sep 8 00:08:48 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:48 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:48 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:08:58 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:58 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:58 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:09:08 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:08 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:08 mirakel kernel: ata3: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:09:13 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:13 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:13 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:13 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:18 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:18 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:09:43 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:43 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:43 mirakel kernel: ata3: reset failed, giving up
Sep 8 00:09:43 mirakel kernel: ata3.00: disabled
Sep 8 00:09:43 mirakel kernel: ata3: EH complete
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: end_request: I/O error, dev sdc, sector 488391871
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] READ CAPACITY failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Sense not available.
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Write Protect is off
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Asking for cache data failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Assuming drive cache: write through
Sep 8 00:09:43 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:09:43 mirakel kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 5 devices
Sep 8 00:09:48 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:48 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:48 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:48 mirakel kernel: ata4.00: limiting speed to UDMA/133:PIO3
Sep 8 00:09:48 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:53 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:54 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:10:24 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:10:24 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:10:24 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:10:24 mirakel kernel: ata4.00: disabled
Sep 8 00:10:25 mirakel kernel: ata4: EH complete
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: end_request: I/O error, dev sdd, sector 488391871
Sep 8 00:10:25 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:10:25 mirakel kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 4 devices
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] READ CAPACITY failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Sense not available.
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Write Protect is off
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Asking for cache data failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Assuming drive cache: write through
Sep 8 00:10:25 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:25 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716576
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:25 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716499
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716500
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716501
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 6175
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Aborting journal on device md0.
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_reserve_inode_write: Journal has aborted
Sep 8 00:10:25 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:25 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:25 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:25 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:25 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:25 mirakel kernel: disk 7, o:0, dev:sdd1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_dirty_inode: Journal has aborted
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_free_blocks_sb: Journal has aborted
Sep 8 00:10:26 mirakel kernel: ext3_abort called.
Sep 8 00:10:26 mirakel kernel: EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Sep 8 00:10:26 mirakel kernel: Remounting filesystem read-only
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123686376
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689709
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689744
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: EXT3-fs error (device md0): ext3_readdir: directory #126337 contains a hole at offset 4096
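
[Editorial sketch] The log reports failures as negative errno values ("errno=-19", "error=-5"). The snippet below is a small sketch for turning those into symbolic names with Python's standard errno module. It assumes the script runs on a system whose errno numbering matches the kernel's for these values, which holds on Linux; the helper name is illustrative.

```python
import errno
import os
import re

# Picks up "errno=-19" and "error=-5" but not "err_mask=0x40".
ERRNO_RE = re.compile(r"(?:errno|error)=-(\d+)")


def decode_errnos(line):
    """Return [(number, symbolic name, description)] for each negative errno in a line."""
    found = []
    for num in ERRNO_RE.findall(line):
        code = int(num)
        name = errno.errorcode.get(code, "UNKNOWN")
        found.append((code, name, os.strerror(code)))
    return found


for line in (
    "Sep 8 00:06:35 mirakel kernel: ata1: SRST failed (errno=-19)",
    "Sep 8 00:07:30 mirakel kernel: md: super_written gets error=-5, uptodate=0",
):
    print(decode_errnos(line))
```

Here 19 maps to ENODEV (the reset no longer finds a device on the port) and 5 maps to EIO (the md superblock write failed with a generic I/O error).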
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 7:46 sata_nv issues with MCP51 SATA controller Jon Ivar Rykkelid
@ 2007-09-13 14:20 ` Jeff Garzik
2007-09-13 15:05 ` Jon Ivar Rykkelid
0 siblings, 1 reply; 25+ messages in thread
From: Jeff Garzik @ 2007-09-13 14:20 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: linux-kernel
Jon Ivar Rykkelid wrote:
>
> Hi, I'm resending (didn't see my first attempt appear on the mailing list):
>
>
>
> I'm having serious disk-issues when using the on-board nvidia controller
> for my HDDs (My motherboard is a Gigabyte GA-N650SLI-DS4 with nvidia
> chipset, cpu is intel Core2Quad)
>
> excerpt from "lspci":
> 00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
> 00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
> (rev a1)
> 00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
> (rev a1)
>
> I have a normal IDE/P-ATA-disk attached to the "IDE"-controller and that
> works fine (/dev/hda)
>
> However, any number of disks (I have tried 2 and 4) connected to the
> SATA-controller(s), will eventually fail. - See attached log (excerpt /
> anything relevant from /var/log/messages)
>
> At first, disks were REALLY unstable, but then I disabled S.M.A.R.T.
> (both in BIOS and Linux), and I updated from the CentOS5 (equivalent of
> RHEL5) kernel (2.6.18) to the latest (at that time) official kernel from
> kernel.org:
>
> > uname -a
> Linux mirakel 2.6.22.5-custom_jir #2 SMP Thu Aug 30 22:06:21 CEST 2007
> i686 i686 i386 GNU/Linux
>
> Now it will normally take a day or two before SATA crashes, so things
> are better, but still rather useless.
>
> The first error when sata_nv gets into problems is always:
> "exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
> (as shown in the attached log-file.) - when this happens to one device,
> it'll almost instantly happen to the other disk attached to that
> controller as well. A couple of minutes (or so) later, the disk(s)
> connected to the other controller will start acting up as well (in the
> same manner). - I/O freezes, and nothing helps except a reboot...
>
> As I run a rather large (software / md) RAID-5 disk array on this server
> (I'm doing a bit of video editing), every crash means a time-consuming
> rebuild of the disk-array...
>
> I have given up on the sata_nv / nvidia-controllers for the time being.
> I now resort to some old PCI-connected sata-controllers which work fine
> (but slow, as they are outdated and "overloaded").
>
> So, if anyone has a good solution / suggestion / improved driver (over
> the one supplied with the official 2.6.22.5-kernel) I am eager to give
> it a go and see if the situation can be resolved.
does adma=0 module option do anything?
Jeff
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 14:20 ` Jeff Garzik
@ 2007-09-13 15:05 ` Jon Ivar Rykkelid
2007-09-13 15:14 ` Tejun Heo
0 siblings, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 15:05 UTC (permalink / raw)
To: Jeff Garzik; +Cc: linux-kernel, Tejun Heo
Jeff Garzik wrote:
> Jon Ivar Rykkelid wrote:
>>
>> Hi, I'm resending (didn't see my first attempt appear on the mailing list):
>>
>>
>>
>> I'm having serious disk-issues when using the on-board nvidia controller
>> for my HDDs (My motherboard is a Gigabyte GA-N650SLI-DS4 with nvidia
>> chipset, cpu is intel Core2Quad)
>>
>> excerpt from "lspci":
>> 00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
>> 00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
>> (rev a1)
>> 00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
>> (rev a1)
>>
>> I have a normal IDE/P-ATA-disk attached to the "IDE"-controller and that
>> works fine (/dev/hda)
>>
>> However, any number of disks (I have tried 2 and 4) connected to the
>> SATA-controller(s), will eventually fail. - See attached log (excerpt /
>> anything relevant from /var/log/messages)
>>
>> At first, disks were REALLY unstable, but then I disabled S.M.A.R.T.
>> (both in BIOS and Linux), and I updated from the CentOS5 (equivalent of
>> RHEL5) kernel (2.6.18) to the latest (at that time) official kernel from
>> kernel.org:
>>
>> > uname -a
>> Linux mirakel 2.6.22.5-custom_jir #2 SMP Thu Aug 30 22:06:21 CEST 2007
>> i686 i686 i386 GNU/Linux
>>
>> Now it will normally take a day or two before SATA crashes, so things
>> are better, but still rather useless.
>>
>> The first error when sata_nv gets into problems is always:
>> "exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
>> (as shown in the attached log-file.) - when this happens to one device,
>> it'll almost instantly happen to the other disk attached to that
>> controller as well. A couple of minutes (or so) later, the disk(s)
>> connected to the other controller will start acting up as well (in the
>> same manner). - I/O freezes, and nothing helps except a reboot...
>>
>> As I run a rather large (software / md) RAID-5 disk array on this server
>> (I'm doing a bit of video editing), every crash means a time-consuming
>> rebuild of the disk-array...
>>
>> I have given up on the sata_nv / nvidia-controllers for the time being.
>> I now resort to some old PCI-connected sata-controllers which work fine
>> (but slow, as they are outdated and "overloaded").
>>
>> So, if anyone has a good solution / suggestion / improved driver (over
>> the one supplied with the official 2.6.22.5-kernel) I am eager to give
>> it a go and see if the situation can be resolved.
>
> does adma=0 module option do anything?
>
> Jeff
Thanks for the suggestion, but sata_nv is not built modular in my
current kernel, so "no can do" at the moment
(However, if some expert REALLY thinks this will fix things, I will
CERTAINLY recompile and give it a go)
As I said before, it all works for some time (a day or two) before it
crashes with the current kernel & no "S.M.A.R.T.". With my current setup
I have always had the time to fully rebuild my disk-array before a new
crash. - In the case of 4 disks attached to the nvidia controllers
(disregarding the disks on other controllers), this means that the
sata_nv-driver / controllers alone have read at least 750GB and written
250GB of data before the crash (with no resets working) - soft reboot
fixes everything. - I'm pretty confident that this is a driver issue.
As Tejun Heo <htejun@gmail.com> writes "the whole controller seems to
have went down at once and it's not even IRQ routing problem - resets
are failing."
The error-messages / crash-symptoms were the same with SMART enabled and
the original CentOS5-kernel, except that with that setup, the crashes
were much more frequent.
Any help?
BR
Jon Ivar
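
[Editorial sketch] The "750GB read, 250GB written" estimate above can be checked against the sector count the drives report in the log (490234752). This is a rough sketch under two assumptions that are not stated in the thread: sectors are 512 bytes, and a rebuild rewrites one of the four nvidia-attached disks in full while the other three are read in full.

```python
SECTOR_BYTES = 512            # assumption: 512-byte logical sectors
SECTORS_PER_DISK = 490234752  # from the "ata_hpa_resize" lines in the log


def rebuild_traffic_gb(disks_on_controller, rebuilt_on_controller=1):
    """Estimate (read, written) traffic in decimal GB for one full rebuild."""
    disk_gb = SECTORS_PER_DISK * SECTOR_BYTES / 1e9
    read_gb = (disks_on_controller - rebuilt_on_controller) * disk_gb
    written_gb = rebuilt_on_controller * disk_gb
    return read_gb, written_gb


read_gb, written_gb = rebuild_traffic_gb(4)
print(f"read ~{read_gb:.0f} GB, written ~{written_gb:.0f} GB")
```

Each disk comes out at about 251 GB, so three disks read plus one written gives roughly 753 GB and 251 GB, in line with the figures quoted in the message.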
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 15:05 ` Jon Ivar Rykkelid
@ 2007-09-13 15:14 ` Tejun Heo
2007-09-13 18:01 ` Jon Ivar Rykkelid
0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2007-09-13 15:14 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: Jeff Garzik, linux-kernel
Jon Ivar Rykkelid wrote:
> Thanks for the suggestion, but sata_nv is not built modular in my
> current kernel, so "no can do" at the moment
> (However, if some expert REALLY thinks this will fix things, I will
> CERTAINLY recompile and give it a go)
Passing "sata_nv.adma=0" as kernel boot parameter will do the trick.
--
tejun
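
[Editorial sketch] A parameter for a built-in driver is passed on the kernel command line in "module.param=value" form, as in the suggestion above. One way to confirm that the running kernel actually received it is to look for the token in /proc/cmdline. The helper below is a minimal sketch; its name is illustrative and the example command line is hypothetical.

```python
def module_params_from_cmdline(cmdline, module):
    """Collect 'module.param=value' tokens from a kernel command line string."""
    params = {}
    prefix = module + "."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


# Hypothetical command line for illustration; on a running system the real
# one can be read with open("/proc/cmdline").read().
example = "ro root=/dev/hda1 sata_nv.adma=0"
print(module_params_from_cmdline(example, "sata_nv"))
```

The sketch only shows what was passed at boot, not whether the driver honoured it.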
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 15:14 ` Tejun Heo
@ 2007-09-13 18:01 ` Jon Ivar Rykkelid
2007-09-13 19:26 ` Jon Ivar Rykkelid
2007-09-14 13:29 ` Prakash Punnoor
0 siblings, 2 replies; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 18:01 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-kernel, Robert Hancock
Resending, as my first attempts contained HTML and were blocked...
Tejun Heo wrote:
> Jon Ivar Rykkelid wrote:
>
>> Thanks for the suggestion, but sata_nv is not built modular in my
>> current kernel, so "no can do" at the moment
>> (However, if some expert REALLY thinks this will fix things, I will
>> CERTAINLY recompile and give it a go)
>>
>
> Passing "sata_nv.adma=0" as kernel boot parameter will do the trick.
>
>
Ahh, silly me... Of course!
Ooops, I just got back, and verified: I actually have sata_nv running as
a module after all on this server... My bad.
I fixed /etc/modprobe.conf to include the following two lines:
"
alias scsi_hostadapter sata_nv
options sata_nv adma=0
...
"
I then ran "mkinitrd" (to ensure that the latest options from
modprobe.conf were included in the initrd-image that I load at boot).
- Do you guys think this is worth a try? Anyway, I have rebooted now, so
I'll test it for a few days and let you know - We'll just have to wait
and see...
Do you think I should re-enable SMART to provoke a failure, or would
that be tempting fate too much? (For now I have not re-enabled SMART)
PS: Is there any way of testing / verifying that sata_nv is now running
with this option? - I am pretty sure I have done it correctly, but I
would still like to confirm that the proper option has been passed if
possible.
Thanks
Jon Ivar
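
[Editorial sketch] On the question of verifying the option: module parameters that a driver declares with non-zero permissions appear under /sys/module/<module>/parameters/. The sketch below assumes sata_nv exposes "adma" there in the kernel in use, which is not confirmed anywhere in this thread; the helper returns None when the file is absent. The helper name and the sysfs_root argument are illustrative.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the parameter value exposed in sysfs, or None if it is not exposed."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        # Module not loaded, parameter not exported, or not readable.
        return None


print(read_module_param("sata_nv", "adma"))
```

Boolean parameters are typically shown as Y or N, so a value of N would indicate that adma=0 took effect.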
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 18:01 ` Jon Ivar Rykkelid
@ 2007-09-13 19:26 ` Jon Ivar Rykkelid
2007-09-13 19:54 ` Jeff Garzik
2007-09-14 13:29 ` Prakash Punnoor
1 sibling, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 19:26 UTC (permalink / raw)
To: Tejun Heo; +Cc: Jeff Garzik, linux-kernel, Robert Hancock
[-- Attachment #1: Type: text/plain, Size: 658 bytes --]
Hi,
I have now tested with the adma=0 option, but if anything I got a crash
quicker than before. The same error messages started coming in, but this
time the system hung before I was able to capture the log (I did see
the error, and it was the same as before, except that this time it was
the ata3-channel that first started acting up). To remind you all
what this is about, I have reattached the log that I originally captured...
Any help / clever suggestions are appreciated.
Jon Ivar Rykkelid wrote:
> I fixed /etc/modprobe.conf to include the following two lines:
> "
> alias scsi_hostadapter sata_nv
> options sata_nv adma=0
> ...
> "
Jon Ivar
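
[Editorial sketch] To double-check that the "options" line quoted above is what the configuration file really contains, the file can be parsed for "options <module> key=value" entries. This is a deliberately simplified sketch: it ignores quoting and line continuations, and the function name is illustrative rather than part of any module tooling.

```python
def parse_module_options(conf_text):
    """Map module name -> {option: value} from 'options <module> k=v ...' lines."""
    options = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "options":
            module_opts = options.setdefault(fields[1], {})
            for item in fields[2:]:
                key, _, value = item.partition("=")
                module_opts[key] = value
    return options


# The two lines quoted in the message.
conf = """alias scsi_hostadapter sata_nv
options sata_nv adma=0
"""
print(parse_module_options(conf))
```

On a real system the text would come from reading /etc/modprobe.conf instead of the inline string.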
[-- Attachment #2: sata_nv-error.log --]
[-- Type: text/plain, Size: 17111 bytes --]
Sep 8 00:05:59 mirakel kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:05:59 mirakel kernel: ata1.00: cmd 35/00:08:47:83:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:05:59 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:05:59 mirakel kernel: ata1: soft resetting port
Sep 8 00:05:59 mirakel kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:06:00 mirakel kernel: ata2.00: cmd c8/00:08:d7:6e:6f/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 in
Sep 8 00:06:00 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:06:00 mirakel kernel: ata2: soft resetting port
Sep 8 00:06:01 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:30 mirakel kernel: ata1.00: qc timeout (cmd 0x27)
Sep 8 00:06:30 mirakel kernel: ata1.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:30 mirakel kernel: ata1.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:30 mirakel kernel: ata1: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:31 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:06:31 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:31 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:31 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:35 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:35 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:35 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:36 mirakel kernel: ata2: hard resetting port
Sep 8 00:06:36 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:45 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:45 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:45 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:55 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:55 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:55 mirakel kernel: ata1: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:07:06 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:06 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:06 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:06 mirakel kernel: ata2.00: limiting speed to UDMA/133:PIO3
Sep 8 00:07:06 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:07:11 mirakel kernel: ata2: hard resetting port
Sep 8 00:07:12 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:07:30 mirakel kernel: ata1: hard resetting port
Sep 8 00:07:30 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:07:30 mirakel kernel: ata1: reset failed, giving up
Sep 8 00:07:30 mirakel kernel: ata1.00: disabled
Sep 8 00:07:30 mirakel kernel: ata1: EH complete
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 488407879
Sep 8 00:07:30 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:30 mirakel kernel: raid5: Disk failure on dm-0, disabling device. Operation continuing on 7 devices
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 141263543
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 4560055
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] READ CAPACITY failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Sense not available.
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Asking for cache data failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Assuming drive cache: write through
Sep 8 00:07:42 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:42 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:42 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:42 mirakel kernel: ata2.00: disabled
Sep 8 00:07:42 mirakel kernel: ata2: EH complete
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141520599
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141671879
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 488407879
Sep 8 00:07:42 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:42 mirakel kernel: raid5: Disk failure on dm-1, disabling device. Operation continuing on 6 devices
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] READ CAPACITY failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Sense not available.
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Write Protect is off
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Asking for cache data failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Assuming drive cache: write through
Sep 8 00:08:12 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:12 mirakel kernel: ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
Sep 8 00:08:12 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:13 mirakel kernel: ata3: soft resetting port
Sep 8 00:08:13 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:42 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:42 mirakel kernel: ata4.00: cmd 35/00:08:bf:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:08:42 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:43 mirakel kernel: ata4: soft resetting port
Sep 8 00:08:43 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:43 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 8 00:08:43 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:08:43 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:08:43 mirakel kernel: ata3: failed to recover some devices, retrying in 5 secs
Sep 8 00:08:48 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:48 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:48 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:08:58 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:58 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:58 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:09:08 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:08 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:08 mirakel kernel: ata3: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:09:13 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:13 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:13 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:13 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:18 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:18 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:09:43 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:43 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:43 mirakel kernel: ata3: reset failed, giving up
Sep 8 00:09:43 mirakel kernel: ata3.00: disabled
Sep 8 00:09:43 mirakel kernel: ata3: EH complete
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: end_request: I/O error, dev sdc, sector 488391871
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] READ CAPACITY failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Sense not available.
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Write Protect is off
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Asking for cache data failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Assuming drive cache: write through
Sep 8 00:09:43 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:09:43 mirakel kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 5 devices
Sep 8 00:09:48 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:48 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:48 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:48 mirakel kernel: ata4.00: limiting speed to UDMA/133:PIO3
Sep 8 00:09:48 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:53 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:54 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:10:24 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:10:24 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:10:24 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:10:24 mirakel kernel: ata4.00: disabled
Sep 8 00:10:25 mirakel kernel: ata4: EH complete
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: end_request: I/O error, dev sdd, sector 488391871
Sep 8 00:10:25 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:10:25 mirakel kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 4 devices
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] READ CAPACITY failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Sense not available.
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Write Protect is off
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Asking for cache data failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Assuming drive cache: write through
Sep 8 00:10:25 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:25 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716576
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:25 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716499
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716500
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716501
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 6175
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Aborting journal on device md0.
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_reserve_inode_write: Journal has aborted
Sep 8 00:10:25 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:25 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:25 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:25 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:25 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:25 mirakel kernel: disk 7, o:0, dev:sdd1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_dirty_inode: Journal has aborted
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_free_blocks_sb: Journal has aborted
Sep 8 00:10:26 mirakel kernel: ext3_abort called.
Sep 8 00:10:26 mirakel kernel: EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Sep 8 00:10:26 mirakel kernel: Remounting filesystem read-only
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123686376
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689709
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689744
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: EXT3-fs error (device md0): ext3_readdir: directory #126337 contains a hole at offset 4096
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 19:26 ` Jon Ivar Rykkelid
@ 2007-09-13 19:54 ` Jeff Garzik
2007-09-13 21:15 ` Jon Ivar Rykkelid
2007-09-14 0:37 ` Robert Hancock
0 siblings, 2 replies; 25+ messages in thread
From: Jeff Garzik @ 2007-09-13 19:54 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: Tejun Heo, linux-kernel, Robert Hancock
Jon Ivar Rykkelid wrote:
> Hi,
>
> I now tested with the adma=0 option, but if anything I got a crash
> quicker than before. Same error message started coming in, but this time
> the system hung before I was able to capture the log as well (but I saw
> the error, and it was the same as before, except that this time it was
> the ata3-channel that first started acting up..) - To remind you all
> what this is about, I have reattached the log that I originally captured...
Sounds like a hardware problem, since disabling ADMA is generally the
cure-all we use -- it appears to stress the hardware less.
Jeff
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 19:54 ` Jeff Garzik
@ 2007-09-13 21:15 ` Jon Ivar Rykkelid
2007-09-14 0:37 ` Robert Hancock
1 sibling, 0 replies; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 21:15 UTC (permalink / raw)
To: Jeff Garzik, Tejun Heo, Robert Hancock; +Cc: linux-kernel
Is this the general opinion? Should I try to get a replacement
motherboard of the same type?
If so, can anyone confirm that the sata_nv driver works with the
Gigabyte GA-N650SLI-DS4 motherboard at all? Has anyone been successful
with this MB? How about the MCP51 SATA controller? Can anyone confirm
that the driver works with this HW? I would feel awkward claiming a
warranty replacement if it turns out that the HW is OK after
all and the problem is with the Linux driver...
BR
Jon Ivar
Jeff Garzik wrote:
> Jon Ivar Rykkelid wrote:
>> Hi,
>>
>> I now tested with the adma=0 option, but if anything I got a crash
>> quicker than before. Same error message started coming in, but this
>> time the system hung before I was able to capture the log as well
>> (but I saw the error, and it was the same as before, except that this
>> time it was the ata3-channel that first started acting up..) - To
>> remind you all what this is about, I have reattached the log that I
>> originally captured...
>
> Sounds like a hardware problem, since disabling ADMA is generally the
> cure-all we use -- it appears to stress the hardware less.
>
> Jeff
--
Jon Ivar Rykkelid Web: http://www.pvv.org/~jonry
Enromvegen 191 Phone: +47 72 56 86 86
N-7026 Trondheim Mob.: +47 906 20 250
Norway Email: jonry@pvv.org
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 19:54 ` Jeff Garzik
2007-09-13 21:15 ` Jon Ivar Rykkelid
@ 2007-09-14 0:37 ` Robert Hancock
2007-09-14 12:10 ` Jon Ivar Rykkelid
1 sibling, 1 reply; 25+ messages in thread
From: Robert Hancock @ 2007-09-14 0:37 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jon Ivar Rykkelid, Tejun Heo, linux-kernel
Jeff Garzik wrote:
> Jon Ivar Rykkelid wrote:
>> Hi,
>>
>> I now tested with the adma=0 option, but if anything I got a crash
>> quicker than before. Same error message started coming in, but this
>> time the system hung before I was able to capture the log as well (but
>> I saw the error, and it was the same as before, except that this time
>> it was the ata3-channel that first started acting up..) - To remind
>> you all what this is about, I have reattached the log that I
>> originally captured...
>
> Sounds like a hardware problem, since disabling ADMA is generally the
> cure-all we use -- it appears to stress the hardware less.
If this is an MCP51 chipset, adma=0 will make no difference since that
chipset does not support ADMA in the first place.
--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 0:37 ` Robert Hancock
@ 2007-09-14 12:10 ` Jon Ivar Rykkelid
0 siblings, 0 replies; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-14 12:10 UTC (permalink / raw)
To: Robert Hancock; +Cc: Jeff Garzik, Tejun Heo, linux-kernel
Hi,
To eliminate the possibility of this being a hardware issue, I have now
acquired another "Gigabyte GA-N650SLI-DS4" motherboard (with the "MCP51"
chipset) for testing. I'll swap parts this evening. Hopefully I'll be
able to tell you in a few hours whether this appears to be working as it
should. The motherboard that I'm going to swap to has actually been
tested (with MS Windows OS+driver) for more than a day with a disk
connected, so if this MB also fails, I think it will be safe to say that
the issue is with the sata_nv driver... So hang on.
(Can you think of anything else that could conflict with the sata_nv
driver after a bit of time, like two of my RAID disks being encrypted,
me running a SW RAID-5 array, some special HW (quad-core CPU), or me
running vmware on this server? To me, all these suggestions seem
rather far-fetched, especially as everything works with another
controller, so I'm arguing that unless there's a HW issue, the issue is
with the driver, but you're the expert(s), so let me know if you disagree.)
I'll keep you posted as to the result of swapping HW.. Give me a few
hours. :-)
BR
Jon Ivar
Robert Hancock wrote:
> Jeff Garzik wrote:
>> Jon Ivar Rykkelid wrote:
>>> Hi,
>>>
>>> I now tested with the adma=0 option, but if anything I got a crash
>>> quicker than before. Same error message started coming in, but this
>>> time the system hung before I was able to capture the log as well
>>> (but I saw the error, and it was the same as before, except that
>>> this time it was the ata3-channel that first started acting up..) -
>>> To remind you all what this is about, I have reattached the log that
>>> I originally captured...
>>
>> Sounds like a hardware problem, since disabling ADMA is generally the
>> cure-all we use -- it appears to stress the hardware less.
>
> If this is an MCP51 chipset, adma=0 will make no difference since that
> chipset does not support ADMA in the first place.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 18:01 ` Jon Ivar Rykkelid
2007-09-13 19:26 ` Jon Ivar Rykkelid
@ 2007-09-14 13:29 ` Prakash Punnoor
2007-09-14 14:17 ` Jon Ivar Rykkelid
1 sibling, 1 reply; 25+ messages in thread
From: Prakash Punnoor @ 2007-09-14 13:29 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: Tejun Heo, Jeff Garzik, linux-kernel, Robert Hancock
[-- Attachment #1: Type: text/plain, Size: 1018 bytes --]
On the day of Thursday 13 September 2007 Jon Ivar Rykkelid hast written:
> Resending, as my first attempts contained HTML and was blocked...
>
> Tejun Heo wrote:
> > Jon Ivar Rykkelid wrote:
> >> Thanks for the suggestion, but sata_nv is not built modular in my
> >> current kernel, so "no can do" at the moment
> >> (However, if some expert REALLY thinks this will fix things, I will
> >> CERTAINLY recompile and give it a go)
> >
> > Passing "sata_nv.adma=0" as kernel boot parameter will do the trick.
>
> Ahh, silly me... Of course!
> Ooops, I just got back, and verified: I actually have sata_nv running as
> a module after all on this server... My bad.
> I fixed /etc/modprobe.conf to include the following two lines:
> "
> alias scsi_hostadapter sata_nv
> options sata_nv adma=0
> ...
> "
I don't think it will matter, as adma doesn't affect MCP51, but only nforce4.
So I'd look for other trouble makers.
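
For anyone wanting to confirm which of the two routes discussed above (boot parameter or modprobe option) actually took effect, here is a minimal sketch in Python. It assumes the standard "module.param=value" form on the kernel command line and the usual /sys/module/<module>/parameters/<param> sysfs layout; whether sata_nv exposes "adma" there depends on how the driver declares the parameter, so treat that as an assumption. The helper names are made up for illustration.

```python
from pathlib import Path


def module_params_from_cmdline(cmdline, module):
    """Collect 'module.param=value' tokens from a kernel command line."""
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


def read_sysfs_param(module, param, root="/sys/module"):
    """Return the live parameter value, or None if the file is absent.

    Assumption: the parameter is exported under
    <root>/<module>/parameters/<param>; if it is not, None is returned.
    """
    path = Path(root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Only meaningful on a Linux box with /proc mounted.
    cmdline = Path("/proc/cmdline").read_text()
    print("boot parameters:", module_params_from_cmdline(cmdline, "sata_nv"))
    print("sysfs adma value:", read_sysfs_param("sata_nv", "adma"))
```

This only reports what was requested and what the driver exposes; it says nothing about whether the MCP51 code path uses ADMA at all.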
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 13:29 ` Prakash Punnoor
@ 2007-09-14 14:17 ` Jon Ivar Rykkelid
2007-09-14 14:25 ` Jeff Garzik
2007-09-14 20:35 ` Jon Ivar Rykkelid
0 siblings, 2 replies; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-14 14:17 UTC (permalink / raw)
To: Prakash Punnoor; +Cc: Tejun Heo, Jeff Garzik, linux-kernel, Robert Hancock
Prakash Punnoor wrote:
> I don't think it will matter, as adma doesn't affect MCP51, but only nforce4.
> So I'd look for other trouble makers.
>
Robert told me. (And you're correct - It didn't help).
I'm going to test another (identical) motherboard this evening to
establish whether it could be a HW-issue.
I'll keep you posted
Jon Ivar
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 14:17 ` Jon Ivar Rykkelid
@ 2007-09-14 14:25 ` Jeff Garzik
2007-09-14 14:39 ` Tejun Heo
2007-09-14 20:35 ` Jon Ivar Rykkelid
1 sibling, 1 reply; 25+ messages in thread
From: Jeff Garzik @ 2007-09-14 14:25 UTC (permalink / raw)
To: Jon Ivar Rykkelid
Cc: Prakash Punnoor, Tejun Heo, linux-kernel, Robert Hancock
Jon Ivar Rykkelid wrote:
> Prakash Punnoor wrote:
>> I don't think it will matter, as adma doesn't affect MCP51, but only
>> nforce4. So I'd look for other trouble makers.
>>
> Robert told me. (And you're correct - It didn't help).
Yes, it was already in slow-and-safe mode.
> I'm going to test another (identical) motherboard this evening to
> establish whether it could be a HW-issue.
Not just motherboard. It is more likely to be a cable, drive or PSU
problem.
Jeff
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 14:25 ` Jeff Garzik
@ 2007-09-14 14:39 ` Tejun Heo
[not found] ` <46EAAA9A.1020903@pvv.org>
0 siblings, 1 reply; 25+ messages in thread
From: Tejun Heo @ 2007-09-14 14:39 UTC (permalink / raw)
To: Jeff Garzik
Cc: Jon Ivar Rykkelid, Prakash Punnoor, linux-kernel, Robert Hancock
Jeff Garzik wrote:
> Jon Ivar Rykkelid wrote:
>> Prakash Punnoor wrote:
>>> I don't think it will matter, as adma doesn't affect MCP51, but only
>>> nforce4. So I'd look for other trouble makers.
>>>
>> Robert told me. (And you're correct - It didn't help).
>
> Yes, it was already in slow-and-safe mode.
>
>
>> I'm going to test another (identical) motherboard this evening to
>> establish whether it could be a HW-issue.
>
> Not just motherboard. It is more likely to be a cable, drive or PSU
> problem.
I don't think it's cable as the problem occurs on multiple ports. My
bet is either the controller or PSU.
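
The point that the failure shows up on multiple ports can be checked against the attached log. A minimal sketch in Python, assuming the syslog line format seen in this thread; the function name summarize_ports is illustrative only, and the sample lines are copied from the Sep 8 log excerpt:

```python
import re
from collections import OrderedDict

# Matches libata lines such as
# "Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 ... frozen"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<port>ata\d+)(?:\.\d+)?:\s+(?P<msg>.*)$"
)

SAMPLE = """\
Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:07:42 mirakel kernel: ata2.00: disabled
Sep 8 00:08:12 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:42 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:09:43 mirakel kernel: ata3.00: disabled
Sep 8 00:10:24 mirakel kernel: ata4.00: disabled
""".splitlines()


def summarize_ports(lines):
    """Map each ATA port to its first exception and its 'disabled' timestamp."""
    ports = OrderedDict()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a per-port libata message
        entry = ports.setdefault(
            m.group("port"), {"first_exception": None, "disabled": None}
        )
        msg = m.group("msg")
        if msg.startswith("exception") and entry["first_exception"] is None:
            entry["first_exception"] = m.group("stamp")
        elif msg == "disabled":
            entry["disabled"] = m.group("stamp")
    return ports


if __name__ == "__main__":
    for port, info in summarize_ports(SAMPLE).items():
        print(port, info["first_exception"], "->", info["disabled"])
```

Feeding it the full /var/log/messages excerpt instead of SAMPLE gives one row per port, which makes it easier to see whether ports on both controllers drop out in the same incident.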
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 14:17 ` Jon Ivar Rykkelid
2007-09-14 14:25 ` Jeff Garzik
@ 2007-09-14 20:35 ` Jon Ivar Rykkelid
2007-09-15 7:12 ` Prakash Punnoor
1 sibling, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-14 20:35 UTC (permalink / raw)
To: Robert Hancock; +Cc: Prakash Punnoor, Tejun Heo, Jeff Garzik, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 739 bytes --]
Hi, I'm getting more and more confident that the driver is the issue.
I have now been able to reproduce the same error on the new motherboard
as well... - (the same MB was tested to work in Windows with
windows-drivers)...
Unless you guys can come up with something clever, I'll see if I can get
my hands on / change to another (non-nvidia) chipset in a day or two, as
the sata_nv with this chipset apparently isn't working.
(Or has anyone EVER been successful with the latest kernel/driver on
this HW?)
Attaching everything relevant from /var/log/messages...
Jon Ivar Rykkelid wrote:
> I'm going to test another (identical) motherboard this evening to
> establish whether it could be a HW-issue.
>
> I'll keep you posted
Jon Ivar
[-- Attachment #2: sata_nv-new.log --]
[-- Type: text/plain, Size: 8614 bytes --]
Sep 14 20:09:15 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 14 20:09:15 mirakel kernel: ata3.00: cmd 35/00:08:bf:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 14 20:09:15 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 14 20:09:15 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 14 20:09:15 mirakel kernel: ata4.00: cmd 35/00:08:bf:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 14 20:09:15 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 14 20:09:16 mirakel kernel: ata3: soft resetting port
Sep 14 20:09:16 mirakel kernel: ata4: soft resetting port
Sep 14 20:09:16 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 14 20:09:16 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 14 20:09:46 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 14 20:09:46 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 14 20:09:46 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 14 20:09:46 mirakel kernel: ata3: failed to recover some devices, retrying in 5 secs
Sep 14 20:09:46 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 14 20:09:46 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 14 20:09:46 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 14 20:09:46 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 14 20:09:51 mirakel kernel: ata3: hard resetting port
Sep 14 20:09:51 mirakel kernel: ata4: hard resetting port
Sep 14 20:09:51 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 14 20:09:51 mirakel kernel: ata4: SRST failed (errno=-19)
Sep 14 20:09:51 mirakel kernel: ata4: reset failed (errno=-19), retrying in 10 secs
Sep 14 20:10:01 mirakel kernel: ata4: hard resetting port
Sep 14 20:10:01 mirakel kernel: ata4: SRST failed (errno=-19)
Sep 14 20:10:01 mirakel kernel: ata4: reset failed (errno=-19), retrying in 10 secs
Sep 14 20:10:11 mirakel kernel: ata4: hard resetting port
Sep 14 20:10:11 mirakel kernel: ata4: SRST failed (errno=-19)
Sep 14 20:10:11 mirakel kernel: ata4: reset failed (errno=-19), retrying in 35 secs
Sep 14 20:10:21 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 14 20:10:21 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 14 20:10:21 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 14 20:10:21 mirakel kernel: ata3.00: limiting speed to UDMA/133:PIO3
Sep 14 20:10:21 mirakel kernel: ata3: failed to recover some devices, retrying in 5 secs
Sep 14 20:10:26 mirakel kernel: ata3: hard resetting port
Sep 14 20:10:27 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 14 20:10:46 mirakel kernel: ata4: hard resetting port
Sep 14 20:10:46 mirakel kernel: ata4: SRST failed (errno=-19)
Sep 14 20:10:46 mirakel kernel: ata4: reset failed, giving up
Sep 14 20:10:46 mirakel kernel: ata4.00: disabled
Sep 14 20:10:46 mirakel kernel: ata4: EH complete
Sep 14 20:10:46 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 14 20:10:46 mirakel kernel: end_request: I/O error, dev sdd, sector 488391871
Sep 14 20:10:46 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 14 20:10:46 mirakel kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 7 devices
Sep 14 20:10:57 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 14 20:10:57 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 14 20:10:57 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 14 20:10:57 mirakel kernel: ata3.00: disabled
Sep 14 20:10:58 mirakel kernel: ata3: EH complete
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 14 20:10:58 mirakel kernel: end_request: I/O error, dev sdc, sector 488391871
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] READ CAPACITY failed
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Sense not available.
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Write Protect is off
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Asking for cache data failed
Sep 14 20:10:58 mirakel kernel: sd 2:0:0:0: [sdc] Assuming drive cache: write through
Sep 14 20:10:58 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 14 20:10:58 mirakel kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 6 devices
Sep 14 20:10:58 mirakel kernel: RAID5 conf printout:
Sep 14 20:10:58 mirakel kernel: Buffer I/O error on device md0, logical block 119194013
Sep 14 20:10:58 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:58 mirakel kernel: Buffer I/O error on device md0, logical block 119194014
Sep 14 20:10:58 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:58 mirakel kernel: Buffer I/O error on device md0, logical block 6660
Sep 14 20:10:58 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:58 mirakel kernel: Aborting journal on device md0.
Sep 14 20:10:58 mirakel kernel: --- rd:8 wd:6
Sep 14 20:10:58 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 14 20:10:58 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 14 20:10:58 mirakel kernel: disk 2, o:1, dev:dm-0
Sep 14 20:10:58 mirakel kernel: disk 3, o:1, dev:hds1
Sep 14 20:10:58 mirakel kernel: disk 4, o:1, dev:dm-1
Sep 14 20:10:58 mirakel kernel: disk 5, o:0, dev:sdd1
Sep 14 20:10:58 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 14 20:10:58 mirakel kernel: disk 7, o:0, dev:sdc1
Sep 14 20:10:58 mirakel kernel: ext3_abort called.
Sep 14 20:10:58 mirakel kernel: EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Sep 14 20:10:58 mirakel kernel: Remounting filesystem read-only
Sep 14 20:10:58 mirakel kernel: RAID5 conf printout:
Sep 14 20:10:58 mirakel kernel: --- rd:8 wd:6
Sep 14 20:10:58 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 14 20:10:58 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 14 20:10:58 mirakel kernel: disk 2, o:1, dev:dm-0
Sep 14 20:10:58 mirakel kernel: disk 3, o:1, dev:hds1
Sep 14 20:10:58 mirakel kernel: disk 4, o:1, dev:dm-1
Sep 14 20:10:58 mirakel kernel: disk 5, o:0, dev:sdd1
Sep 14 20:10:58 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 14 20:10:58 mirakel kernel: RAID5 conf printout:
Sep 14 20:10:58 mirakel kernel: --- rd:8 wd:6
Sep 14 20:10:58 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 14 20:10:58 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 14 20:10:58 mirakel kernel: disk 2, o:1, dev:dm-0
Sep 14 20:10:58 mirakel kernel: disk 3, o:1, dev:hds1
Sep 14 20:10:58 mirakel kernel: disk 4, o:1, dev:dm-1
Sep 14 20:10:58 mirakel kernel: disk 5, o:0, dev:sdd1
Sep 14 20:10:58 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 14 20:10:58 mirakel kernel: RAID5 conf printout:
Sep 14 20:10:58 mirakel kernel: --- rd:8 wd:6
Sep 14 20:10:58 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 14 20:10:58 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 14 20:10:58 mirakel kernel: disk 2, o:1, dev:dm-0
Sep 14 20:10:58 mirakel kernel: disk 3, o:1, dev:hds1
Sep 14 20:10:58 mirakel kernel: disk 4, o:1, dev:dm-1
Sep 14 20:10:58 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 29
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 30
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 119177216
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 119180640
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 119180739
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 119193951
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
Sep 14 20:10:59 mirakel kernel: Buffer I/O error on device md0, logical block 119193953
Sep 14 20:10:59 mirakel kernel: lost page write due to I/O error on md0
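Editorial aside, not part of the original attachment: a minimal sketch,
in Python, of how the md failure events in a log like the one above
could be pulled out to see the order in which array members dropped.
The function name raid5_failures and the sample list are illustrative
assumptions; the regular expression only targets the exact "raid5: Disk
failure on ..." wording that appears in this log.

```python
import re

# Matches the raid5 failure lines exactly as they are worded in this log.
FAILURE_RE = re.compile(
    r"raid5: Disk failure on (?P<dev>\S+), disabling device\. "
    r"Operation continuing on (?P<left>\d+) devices"
)


def raid5_failures(lines):
    """Return (timestamp, device, devices_left) for each raid5 failure line."""
    events = []
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            # The syslog timestamp is the first three whitespace-separated fields.
            stamp = " ".join(line.split()[:3])
            events.append((stamp, match.group("dev"), int(match.group("left"))))
    return events


# Three lines copied from the log above.
sample = [
    "Sep 14 20:10:46 mirakel kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 7 devices",
    "Sep 14 20:10:58 mirakel kernel: ata3.00: disabled",
    "Sep 14 20:10:58 mirakel kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 6 devices",
]

for stamp, dev, left in raid5_failures(sample):
    print(f"{stamp}: {dev} failed, {left} devices left")
```

RAID-5 tolerates the loss of only one member, so the second event (6 of
8 devices left) is the point at which md0 can no longer serve I/O; that
matches the "Buffer I/O error on device md0" lines that follow it.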
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-14 20:35 ` Jon Ivar Rykkelid
@ 2007-09-15 7:12 ` Prakash Punnoor
2007-09-15 10:14 ` Jon Ivar Rykkelid
[not found] ` <46EBA82C.6050000@pvv.org>
0 siblings, 2 replies; 25+ messages in thread
From: Prakash Punnoor @ 2007-09-15 7:12 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: Robert Hancock, Tejun Heo, Jeff Garzik, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 562 bytes --]
On the day of Friday 14 September 2007 Jon Ivar Rykkelid hast written:
> Hi, I'm getting more confident that the driver is the issue.
>
>
> (Or has anyone EVER been successful with the latest kernel/driver on
> this HW?)
I don't have exactly the same hw, but the same chipset and I don't have any
problems - even with the swncq patch applied. Do you have an hpet? If not,
try booting with acpi_use_timer_override. My system won't work with skipping
the override.
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-15 7:12 ` Prakash Punnoor
@ 2007-09-15 10:14 ` Jon Ivar Rykkelid
2007-09-15 14:47 ` John Stoffel
[not found] ` <46EBA82C.6050000@pvv.org>
1 sibling, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-15 10:14 UTC (permalink / raw)
To: linux-kernel; +Cc: Prakash Punnoor, Robert Hancock, Tejun Heo, Jeff Garzik
Prakash Punnoor wrote:
> I don't have exactly the same hw, but the same chipset and I don't have any
> problems - even with the swncq patch applied. Do you have an hpet? If not,
> try booting with acpi_use_timer_override. My system won't work with skipping
> the override.
>
>
Hi, I reconnected and rebooted with the kernel option
"acpi_use_timer_override" (this is the correct spelling, isn't it? The
kernel didn't complain). It didn't help; I got the same error as
before. I'll have to connect all disks back to my PCI-connected SATA
controllers and start rebuilding my RAID yet again.
It seems random which disk is affected first. (So far, I know that it
has happened to ata1, ata3 and ata4, three of my potential disks. I
guess it just happens to the disk that is being used at the moment when
the driver / controller acts up.)
I'm about to give in. I think I'll try to replace both (Gigabyte
GA-N650SLI-DS4) motherboards, as the driver simply isn't working for
the on-board controller of these boards. It could be a combination of
the controllers and some other HW on the motherboards of course, but
everything works when I connect all disks to my non-nvidia controllers.
I guess I'll opt for a motherboard with an Intel chipset after all...
BR
Jon Ivar
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-15 10:14 ` Jon Ivar Rykkelid
@ 2007-09-15 14:47 ` John Stoffel
2007-09-15 19:29 ` Jon Ivar Rykkelid
0 siblings, 1 reply; 25+ messages in thread
From: John Stoffel @ 2007-09-15 14:47 UTC (permalink / raw)
To: Jon Ivar Rykkelid
Cc: linux-kernel, Prakash Punnoor, Robert Hancock, Tejun Heo,
Jeff Garzik
>>>>> "Jon" == Jon Ivar Rykkelid <jonry@pvv.org> writes:
Jon> Prakash Punnoor wrote:
>> I don't have exactly the same hw, but the same chipset and I don't have any
>> problems - even with the swncq patch applied. Do you have an hpet? If not,
>> try booting with acpi_use_timer_override. My system won't work with skipping
>> the override.
Jon> Hi, I reconnected and rebooted with the kernel option
Jon> "acpi_use_timer_override" (this is the correct spelling, isn't
Jon> it? - Kernel didn't complain.). Didn't help, the same error
Jon> received as before. - I'll have to connect all disks back to my
Jon> PCI-connected SATA controllers and start rebuilding my RAID yet
Jon> again.
What happens when you just have ONE disk connected to the motherboard
controller, and the rest connected to PCI controllers? Does it crap
out then? You've got such a nice repeatable problem across
motherboards that it's a shame to waste this debugging time.
I'm wondering if it's a PCI bus issue somehow, and that the load on
the motherboard controller isn't supportable when you have a bunch of
disks on PCI controllers as well. Shot in the dark...
Thanks for all your hard work on this, I know how frustrating it is to
not have a stable system!
John
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-15 14:47 ` John Stoffel
@ 2007-09-15 19:29 ` Jon Ivar Rykkelid
0 siblings, 0 replies; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-15 19:29 UTC (permalink / raw)
To: linux-kernel
Cc: John Stoffel, Prakash Punnoor, Robert Hancock, Tejun Heo,
Jeff Garzik
John Stoffel wrote:
> What happens when you just have ONE disk connected to the motherboard
> controller, and the rest connected to PCI controllers? Does it crap
> out then? You've got such a nice repeatable problem across
> motherboards that it's a shame to waste this debugging time.
>
Sorry, I gave in. I have now abandoned my nvidia trials (both
motherboards have been returned, and I'm now running with an Intel
chipset). My current motherboard is less ideal (in terms of PCI slots
etc.), but on the other hand it works...
> I'm wondering if it's a PCI bus issue somehow, and that the load on
> the motherboard controller isn't supportable when you have a bunch of
> disks on PCI controllers as well. Shot in the dark...
>
That was actually not such a bad idea... Unfortunately it's too late now
(otherwise I would have tested it for sure). I was/am after all running
an 8-disk SATA array (plus a normal IDE disk, not in the RAID). I had 4
disks running through two PCI cards and 4 disks on the motherboard's
controller. When all 8 disks were connected to the two PCI cards, the
speed dropped compared to when the motherboard's controller took some
of the load. (So it could maybe be an issue with bandwidth / load? I
don't know.)
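A rough back-of-the-envelope sketch of that bandwidth guess, in Python.
It assumes the cards sit on one shared conventional PCI bus (32 bits
wide, 33 MHz), which this thread does not actually confirm, and it
ignores protocol overhead; the function name per_disk_share is purely
illustrative.

```python
# Assumed bus parameters: conventional PCI, not confirmed for this system.
PCI_WIDTH_BYTES = 4        # 32-bit data path
PCI_CLOCK_HZ = 33.33e6     # 33 MHz bus clock


def per_disk_share(disks_on_pci):
    """Theoretical peak MB/s available to each disk sharing the PCI bus."""
    peak_mb_per_s = PCI_WIDTH_BYTES * PCI_CLOCK_HZ / 1e6
    return peak_mb_per_s / disks_on_pci


for count in (4, 8):
    print(f"{count} disks on PCI: at most {per_disk_share(count):.1f} MB/s each")
```

Under those assumptions, moving from 4 to 8 disks on the PCI cards
halves the ceiling for each disk, which would at least be consistent
with the slowdown described above.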
> Thanks for all your hard work on this, I know how frustrating it is to
> not have a stable system!
>
Sorry for giving in, but I felt I was banging my head against the wall
(and with too few sensible solutions being suggested). Now I guess I'm
semi-happy that all seems to work OK with the Intel chipset..
Frustrating that the sata_nv-driver / nvidia HW didn't work with my
configuration, though...
Thank you all for your effort as well - hope someone figures this out
sometime in the future.
All the best
Jon Ivar
^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <46EBA82C.6050000@pvv.org>]
* sata_nv issues with MCP51 SATA controller
@ 2007-09-13 7:18 Jon Ivar Rykkelid
2007-09-13 9:16 ` Tejun Heo
0 siblings, 1 reply; 25+ messages in thread
From: Jon Ivar Rykkelid @ 2007-09-13 7:18 UTC (permalink / raw)
To: linux-kernel; +Cc: linux-ide
[-- Attachment #1: Type: text/plain, Size: 2437 bytes --]
Hi, I was told to forward my error report to this address.
I am keen to test again if someone has a good suggestion / updated
driver etc... (Give me a couple of days in that case...)
-----
Hi,
I'm having serious disk-issues when using the on-board nvidia controller
for my HDDs (My motherboard is a Gigabyte GA-N650SLI-DS4 with nvidia
chipset, cpu is intel Core2Quad)
excerpt from "lspci":
00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
(rev a1)
00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
(rev a1)
I have a normal IDE/P-ATA-disk attached to the "IDE"-controller and that
works fine (/dev/hda)
However, any number of disks (I have tried 2 and 4) connected to the
SATA-controller(s), will eventually fail. - See attached log (excerpt /
anything relevant from /var/log/messages)
At first, disks were REALLY unstable, but then I disabled S.M.A.R.T.
(both in BIOS and Linux), and I updated from the CentOS5 (equivalent of
RHEL5) kernel (2.6.18) to the latest (at that time) official kernel from
kernel.org:
> uname -a
Linux mirakel 2.6.22.5-custom_jir #2 SMP Thu Aug 30 22:06:21 CEST 2007
i686 i686 i386 GNU/Linux
Now it will normally take a day or two before SATA crashes, so things
are better, but still rather useless.
The first error when sata_nv gets into problems is always:
"exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
(as shown in the attached log-file.) - when this happens to one device,
it'll almost instantly happen to the other disk attached to that
controller as well. A couple of minutes (or so) later, the disk(s)
connected to the other controller will start acting up as well (in the
same manner). - I/O freezes, and nothing helps except a reboot...
As I run a rather large (software / md) RAID-5 disk array on this server
(I'm doing a bit of video editing), every crash means a time-consuming
rebuild of the disk-array...
I have given up on the sata_nv / nvidia-controllers for the time being.
I now resort to some old PCI-connected sata-controllers which work fine
(but slow, as they are outdated and "overloaded").
So, if anyone has a good solution / suggestion / improved driver (over
the one supplied with the official 2.6.22.5-kernel) I am eager to give
it a go and see if the situation can be resolved.
I appreciate any sensible suggestions.
BR
Jon Ivar
-----
[-- Attachment #2: sata_nv-error.log --]
[-- Type: text/plain, Size: 17111 bytes --]
Sep 8 00:05:59 mirakel kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:05:59 mirakel kernel: ata1.00: cmd 35/00:08:47:83:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:05:59 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:05:59 mirakel kernel: ata1: soft resetting port
Sep 8 00:05:59 mirakel kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:00 mirakel kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:06:00 mirakel kernel: ata2.00: cmd c8/00:08:d7:6e:6f/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 in
Sep 8 00:06:00 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:06:00 mirakel kernel: ata2: soft resetting port
Sep 8 00:06:01 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:30 mirakel kernel: ata1.00: qc timeout (cmd 0x27)
Sep 8 00:06:30 mirakel kernel: ata1.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:30 mirakel kernel: ata1.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:30 mirakel kernel: ata1: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:31 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:06:31 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:06:31 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:06:31 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:06:35 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:35 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:35 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:36 mirakel kernel: ata2: hard resetting port
Sep 8 00:06:36 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:06:45 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:45 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:45 mirakel kernel: ata1: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:06:55 mirakel kernel: ata1: hard resetting port
Sep 8 00:06:55 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:06:55 mirakel kernel: ata1: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:07:06 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:06 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:06 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:06 mirakel kernel: ata2.00: limiting speed to UDMA/133:PIO3
Sep 8 00:07:06 mirakel kernel: ata2: failed to recover some devices, retrying in 5 secs
Sep 8 00:07:11 mirakel kernel: ata2: hard resetting port
Sep 8 00:07:12 mirakel kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:07:30 mirakel kernel: ata1: hard resetting port
Sep 8 00:07:30 mirakel kernel: ata1: SRST failed (errno=-19)
Sep 8 00:07:30 mirakel kernel: ata1: reset failed, giving up
Sep 8 00:07:30 mirakel kernel: ata1.00: disabled
Sep 8 00:07:30 mirakel kernel: ata1: EH complete
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 488407879
Sep 8 00:07:30 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:30 mirakel kernel: raid5: Disk failure on dm-0, disabling device. Operation continuing on 7 devices
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 141263543
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: end_request: I/O error, dev sda, sector 4560055
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] READ CAPACITY failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Sense not available.
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Write Protect is off
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Asking for cache data failed
Sep 8 00:07:30 mirakel kernel: sd 0:0:0:0: [sda] Assuming drive cache: write through
Sep 8 00:07:42 mirakel kernel: ata2.00: qc timeout (cmd 0x27)
Sep 8 00:07:42 mirakel kernel: ata2.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:07:42 mirakel kernel: ata2.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:07:42 mirakel kernel: ata2.00: disabled
Sep 8 00:07:42 mirakel kernel: ata2: EH complete
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141520599
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 141671879
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: end_request: I/O error, dev sdb, sector 488407879
Sep 8 00:07:42 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:07:42 mirakel kernel: raid5: Disk failure on dm-1, disabling device. Operation continuing on 6 devices
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] READ CAPACITY failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Sense not available.
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Write Protect is off
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Asking for cache data failed
Sep 8 00:07:42 mirakel kernel: sd 1:0:0:0: [sdb] Assuming drive cache: write through
Sep 8 00:08:12 mirakel kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:12 mirakel kernel: ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
Sep 8 00:08:12 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:13 mirakel kernel: ata3: soft resetting port
Sep 8 00:08:13 mirakel kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:42 mirakel kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Sep 8 00:08:42 mirakel kernel: ata4.00: cmd 35/00:08:bf:44:1c/00:00:1d:00:00/e0 tag 0 cdb 0x0 data 4096 out
Sep 8 00:08:42 mirakel kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Sep 8 00:08:43 mirakel kernel: ata4: soft resetting port
Sep 8 00:08:43 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:08:43 mirakel kernel: ata3.00: qc timeout (cmd 0x27)
Sep 8 00:08:43 mirakel kernel: ata3.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:08:43 mirakel kernel: ata3.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:08:43 mirakel kernel: ata3: failed to recover some devices, retrying in 5 secs
Sep 8 00:08:48 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:48 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:48 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:08:58 mirakel kernel: ata3: hard resetting port
Sep 8 00:08:58 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:08:58 mirakel kernel: ata3: reset failed (errno=-19), retrying in 10 secs
Sep 8 00:09:08 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:08 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:08 mirakel kernel: ata3: reset failed (errno=-19), retrying in 35 secs
Sep 8 00:09:13 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:13 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:13 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:13 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:18 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:18 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:09:43 mirakel kernel: ata3: hard resetting port
Sep 8 00:09:43 mirakel kernel: ata3: SRST failed (errno=-19)
Sep 8 00:09:43 mirakel kernel: ata3: reset failed, giving up
Sep 8 00:09:43 mirakel kernel: ata3.00: disabled
Sep 8 00:09:43 mirakel kernel: ata3: EH complete
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: end_request: I/O error, dev sdc, sector 488391871
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] READ CAPACITY failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Sense not available.
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Write Protect is off
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Asking for cache data failed
Sep 8 00:09:43 mirakel kernel: sd 2:0:0:0: [sdc] Assuming drive cache: write through
Sep 8 00:09:43 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:09:43 mirakel kernel: raid5: Disk failure on sdc1, disabling device. Operation continuing on 5 devices
Sep 8 00:09:48 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:09:48 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:09:48 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:09:48 mirakel kernel: ata4.00: limiting speed to UDMA/133:PIO3
Sep 8 00:09:48 mirakel kernel: ata4: failed to recover some devices, retrying in 5 secs
Sep 8 00:09:53 mirakel kernel: ata4: hard resetting port
Sep 8 00:09:54 mirakel kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 8 00:10:24 mirakel kernel: ata4.00: qc timeout (cmd 0x27)
Sep 8 00:10:24 mirakel kernel: ata4.00: ata_hpa_resize 1: hpa sectors (0) is smaller than sectors (490234752)
Sep 8 00:10:24 mirakel kernel: ata4.00: failed to set xfermode (err_mask=0x40)
Sep 8 00:10:24 mirakel kernel: ata4.00: disabled
Sep 8 00:10:25 mirakel kernel: ata4: EH complete
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: end_request: I/O error, dev sdd, sector 488391871
Sep 8 00:10:25 mirakel kernel: md: super_written gets error=-5, uptodate=0
Sep 8 00:10:25 mirakel kernel: raid5: Disk failure on sdd1, disabling device. Operation continuing on 4 devices
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] READ CAPACITY failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Sense not available.
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Write Protect is off
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Asking for cache data failed
Sep 8 00:10:25 mirakel kernel: sd 3:0:0:0: [sdd] Assuming drive cache: write through
Sep 8 00:10:25 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:25 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716576
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:25 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716499
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716500
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 123716501
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 6175
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: Aborting journal on device md0.
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_reserve_inode_write: Journal has aborted
Sep 8 00:10:25 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:25 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:25 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:25 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:25 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:25 mirakel kernel: disk 7, o:0, dev:sdd1
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_dirty_inode: Journal has aborted
Sep 8 00:10:25 mirakel kernel: Buffer I/O error on device md0, logical block 0
Sep 8 00:10:25 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:25 mirakel kernel: EXT3-fs error (device md0) in ext3_free_blocks_sb: Journal has aborted
Sep 8 00:10:26 mirakel kernel: ext3_abort called.
Sep 8 00:10:26 mirakel kernel: EXT3-fs error (device md0): ext3_journal_start_sb: Detected aborted journal
Sep 8 00:10:26 mirakel kernel: Remounting filesystem read-only
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123686376
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689709
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: Buffer I/O error on device md0, logical block 123689744
Sep 8 00:10:26 mirakel kernel: lost page write due to I/O error on md0
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:26 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:26 mirakel kernel: disk 5, o:0, dev:sdc1
Sep 8 00:10:26 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:26 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:26 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:26 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:26 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:26 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:26 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 4, o:0, dev:dm-0
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 2, o:0, dev:dm-1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: RAID5 conf printout:
Sep 8 00:10:27 mirakel kernel: --- rd:8 wd:4
Sep 8 00:10:27 mirakel kernel: disk 0, o:1, dev:hdg1
Sep 8 00:10:27 mirakel kernel: disk 1, o:1, dev:hdo1
Sep 8 00:10:27 mirakel kernel: disk 3, o:1, dev:hds1
Sep 8 00:10:27 mirakel kernel: disk 6, o:1, dev:hdk1
Sep 8 00:10:27 mirakel kernel: EXT3-fs error (device md0): ext3_readdir: directory #126337 contains a hole at offset 4096
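Editorial aside, not part of the original attachment: a small Python
sketch for reading some of the numeric fields in the log above. It
assumes the SStatus value is printed in hexadecimal without a 0x prefix
(which is how libata formats it, as far as I know) and uses the field
layout from the SATA specification (DET in bits 3:0, SPD in bits 7:4,
IPM in bits 11:8). The opcode table covers only the four commands that
occur in this log, and the helper name decode_sstatus is illustrative.

```python
import errno

# ATA command opcodes seen in this log, named per the ATA command set.
ATA_OPCODES = {
    0x27: "READ NATIVE MAX ADDRESS EXT",
    0x35: "WRITE DMA EXT",
    0xC8: "READ DMA",
    0xEA: "FLUSH CACHE EXT",
}

# SStatus field values as defined by the SATA specification.
DET = {0: "no device", 1: "device detected, no PHY link",
       3: "device detected, PHY link up", 4: "PHY offline"}
SPD = {0: "none negotiated", 1: "Gen1, 1.5 Gbps", 2: "Gen2, 3.0 Gbps"}
IPM = {0: "not present", 1: "active", 2: "partial", 6: "slumber"}


def decode_sstatus(hex_text):
    """Decode an SStatus value given as printed in the log (hex, no 0x)."""
    value = int(hex_text, 16)
    return {
        "det": DET.get(value & 0xF, "reserved"),
        "spd": SPD.get((value >> 4) & 0xF, "reserved"),
        "ipm": IPM.get((value >> 8) & 0xF, "reserved"),
    }


print(decode_sstatus("113"))                     # the "SStatus 113" lines
print(ATA_OPCODES[0x27])                         # the "qc timeout (cmd 0x27)" lines
print(errno.errorcode[19], errno.errorcode[5])   # errno=-19 and error=-5
```

Read this way, "SStatus 113" says the PHY link is up at 1.5 Gbps and
active, while the commands sent over that link (including the
READ NATIVE MAX ADDRESS EXT issued during revalidation) time out; -19
is ENODEV and -5 is EIO.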
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: sata_nv issues with MCP51 SATA controller
2007-09-13 7:18 Jon Ivar Rykkelid
@ 2007-09-13 9:16 ` Tejun Heo
0 siblings, 0 replies; 25+ messages in thread
From: Tejun Heo @ 2007-09-13 9:16 UTC (permalink / raw)
To: Jon Ivar Rykkelid; +Cc: linux-kernel, linux-ide, Robert Hancock
Jon Ivar Rykkelid wrote:
> I'm having serious disk-issues when using the on-board nvidia controller
> for my HDDs (My motherboard is a Gigabyte GA-N650SLI-DS4 with nvidia
> chipset, cpu is intel Core2Quad)
>
> excerpt from "lspci":
> 00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
> 00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
> (rev a1)
> 00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller
> (rev a1)
>
> I have a normal IDE/P-ATA-disk attached to the "IDE"-controller and that
> works fine (/dev/hda)
>
> However, any number of disks (I have tried 2 and 4) connected to the
> SATA-controller(s), will eventually fail. - See attached log (excerpt /
> anything relevant from /var/log/messages)
>
> At first, disks were REALLY unstable, but then I disabled S.M.A.R.T.
> (both in BIOS and Linux), and I updated from the CentOS5 (equivalent of
> RHEL5) kernel (2.6.18) to the latest (at that time) official kernel from
> kernel.org:
>
>> uname -a
> Linux mirakel 2.6.22.5-custom_jir #2 SMP Thu Aug 30 22:06:21 CEST 2007
> i686 i686 i386 GNU/Linux
>
> Now it will normally take a day or two before SATA crashes, so things
> are better, but still rather useless.
>
> The first error when sata_nv gets into problems is always:
> "exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
> (as shown in the attached log-file.) - when this happens to one device,
> it'll almost instantly happen to the other disk attached to that
> controller as well. A couple of minutes (or so) later, the disk(s)
> connected to the other controller will start acting up as well (in the
> same manner). - I/O freezes, and nothing helps except a reboot...
>
> As I run a rather large (software / md) RAID-5 disk array on this server
> (I'm doing a bit of video editing), every crash means a time-consuming
> rebuild of the disk-array...
>
> I have given up on the sata_nv / nvidia-controllers for the time being.
> I now resort to some old PCI-connected sata-controllers which work fine
> (but slow, as they are outdated and "overloaded").
>
> So, if anyone has a good solution / suggestion / improved driver (over
> the one supplied with the official 2.6.22.5-kernel) I am eager to give
> it a go and see if the situation can be resolved.
>
> I appreciate any sensible suggestions.
Wheeee... the whole controller seems to have gone down at once and it's
not even an IRQ routing problem - resets are failing. This is the first
time I've seen something like this. Sorry, but I don't have any idea
what's going on. cc'ing Robert. Any ideas?
--
tejun
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2007-09-15 19:30 UTC | newest]
Thread overview: 25+ messages
2007-09-13 7:46 sata_nv issues with MCP51 SATA controller Jon Ivar Rykkelid
2007-09-13 14:20 ` Jeff Garzik
2007-09-13 15:05 ` Jon Ivar Rykkelid
2007-09-13 15:14 ` Tejun Heo
2007-09-13 18:01 ` Jon Ivar Rykkelid
2007-09-13 19:26 ` Jon Ivar Rykkelid
2007-09-13 19:54 ` Jeff Garzik
2007-09-13 21:15 ` Jon Ivar Rykkelid
2007-09-14 0:37 ` Robert Hancock
2007-09-14 12:10 ` Jon Ivar Rykkelid
2007-09-14 13:29 ` Prakash Punnoor
2007-09-14 14:17 ` Jon Ivar Rykkelid
2007-09-14 14:25 ` Jeff Garzik
2007-09-14 14:39 ` Tejun Heo
[not found] ` <46EAAA9A.1020903@pvv.org>
2007-09-14 15:58 ` Jeff Garzik
2007-09-14 18:38 ` Jon Ivar Rykkelid
2007-09-14 20:24 ` auxsvr
2007-09-14 20:35 ` Jon Ivar Rykkelid
2007-09-15 7:12 ` Prakash Punnoor
2007-09-15 10:14 ` Jon Ivar Rykkelid
2007-09-15 14:47 ` John Stoffel
2007-09-15 19:29 ` Jon Ivar Rykkelid
[not found] ` <46EBA82C.6050000@pvv.org>
2007-09-15 11:30 ` Prakash Punnoor
-- strict thread matches above, loose matches on Subject: below --
2007-09-13 7:18 Jon Ivar Rykkelid
2007-09-13 9:16 ` Tejun Heo