* Ext3 Errors on Dell RAID
@ 2005-08-23 13:05 Jess Balint
2005-08-23 13:28 ` Matt Domsch
0 siblings, 1 reply; 2+ messages in thread
From: Jess Balint @ 2005-08-23 13:05 UTC (permalink / raw)
To: linux-kernel
Problem:
I get massive ext3 errors once every few days. See "errors on console"
section below. Almost all commands return I/O error. I have to power
cycle the machine to get it running again. Upon reboot, there are
usually 3 orphan inodes deleted and everything is fine. See "messages
on reboot" below.
Configuration:
System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
Discs: 3 SCSI discs in a controller-managed striped configuration
Controller: Dell PERC-2
kernel messages in "kernel boot messages" below
Other:
I had this problem before. I upgrade the card firmware to 2.8/build
6809, but still the same issue. I tried with the 2.4.29 kernel
(aacraid driver v 1.1-3) from the Slackware (10?) distribution and
then I upgraded to 2.4.31. It has the same driver version and same
problem. Running fsck always shows everything is fine (rc=0).
Does anybody have experience with this machine working well? If so,
what combination of kernel and firmware version?
Or does anybody know the root cause of the occasional massive ext3
errors or what I can do to test and/or fix it?
Please cc me jbalint-at-gmail as I am not on the list.
Thanks.
Jess
--------------------------------------------------
--------------------------------------------------
errors on console
--------------------------------------------------
--------------------------------------------------
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read
inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read
inode block - inode=1015869, block=1015811
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_orphan_add: IO failure
EXT3-fs error (device sd(8,2)): ext3_get_inode_loc: unable to read
inode block - inode=1213811, block=1212461
EXT3-fs error (device sd(8,2)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device sd(8,2)) in ext3_new_inode: IO failure
--------------------------------------------------
--------------------------------------------------
messages on reboot
--------------------------------------------------
--------------------------------------------------
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: sd(8,2): orphan cleanup on readonly fs
EXT3-fs: sd(8,2): 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
--------------------------------------------------
--------------------------------------------------
kernel boot messages
--------------------------------------------------
--------------------------------------------------
SCSI subsystem driver Revision: 1.00
Red Hat/Adaptec aacraid driver (1.1-3 Aug 16 2005 17:25:05)
AAC0: kernel 2.8.4 build 6089
AAC0: monitor 2.8.4 build 6089
AAC0: bios 2.8.0 build 6089
AAC0: serial 4c72e2fafaf001
scsi0 : percraid
Vendor: DELL Model: rootvg Rev: V1.0
Type: Direct-Access ANSI SCSI revision: 02
scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7890/91 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi2 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7890/91 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi3 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec aic7860 Ultra SCSI adapter>
aic7860: Ultra Single Channel A, SCSI Id=7, 3/253 SCBs
blk: queue f7aaca18, I/O limit 4095Mb (mask 0xffffffff)
(scsi3:A:5): 20.000MB/s transfers (20.000MHz, offset 15)
Vendor: NEC Model: CD-ROM DRIVE:465 Rev: 1.03
Type: CD-ROM ANSI SCSI revision: 02
blk: queue f7aac818, I/O limit 4095Mb (mask 0xffffffff)
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 213274368 512-byte hdwr sectors (109196 MB)
Partition check:
sda: sda1 sda2
Attached scsi CD-ROM sr0 at scsi3, channel 0, id 5, lun 0
sr0: scsi3-mmc drive: 14x/32x cd/rw xa/form2 cdda tray
^ permalink raw reply [flat|nested] 2+ messages in thread* Re: Ext3 Errors on Dell RAID
2005-08-23 13:05 Ext3 Errors on Dell RAID Jess Balint
@ 2005-08-23 13:28 ` Matt Domsch
0 siblings, 0 replies; 2+ messages in thread
From: Matt Domsch @ 2005-08-23 13:28 UTC (permalink / raw)
To: Jess Balint; +Cc: linux-kernel
On Tue, Aug 23, 2005 at 09:05:27AM -0400, Jess Balint wrote:
> Problem:
> I get massive ext3 errors once every few days. See "errors on console"
> section below. Almost all commands return I/O error. I have to power
> cycle the machine to get it running again. Upon reboot, there are
> usually 3 orphan inodes deleted and everything is fine. See "messages
> on reboot" below.
>
> Configuration:
> System: Dell PowerEdge 6300/500, 4 CPU SMP w/2GB memory
> Discs: 3 SCSI discs in a controller-managed striped configuration
> Controller: Dell PERC-2
> kernel messages in "kernel boot messages" below
This looks very familiar, and given the firmware versions you mention,
is probably a known issue. The controller firmware goes to do a cache
flush, but that doesn't complete in a sane amount of time, and
eventually the SCSI midlayer starts aborting commands and taking the
file system offline.
I don't believe a firmware update was released for your add-in PERC2
quad-channel card. Firmware 6091 was released for the PERC3/Di ROMBs
which addresses this exact case, though other failures have been
reported on linux-poweredge@dell.com (subscribe and read archives at
http://lists.us.dell.com) even with newer firmware.
The workarounds include:
1) disable the read and write cache using afacli.
2) mount file systems using 'noatime'.
3) backup your data, replace the controller with something newer
(disks on the onboard aic7xxx controller combined with Linux Software
RAID works quite well), recreate your RAID array on the new
controller, and restore your data from backups.
Thanks,
Matt
--
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2005-08-23 13:28 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-23 13:05 Ext3 Errors on Dell RAID Jess Balint
2005-08-23 13:28 ` Matt Domsch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox