* ext3 on a RAID1 going read only?
From: Steven Haigh @ 2009-12-25 23:17 UTC (permalink / raw)
To: linux-raid
Hi guys,
Not 100% sure where to go with this one. I've been having an issue with a particular server: after 30 days or so of uptime, the / partition goes read-only after printing the following to the console:
EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
Aborting journal on device md2.
Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
Dec 25 18:17:27 wireless kernel: Aborting journal on device md2.
ext3_abort called.
Dec 25 18:17:27 wireless kernel: ext3_abort called.
EXT3-fs error (device md2): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_journal_start_sb: Detected aborted journal
Dec 25 18:17:27 wireless kernel: Remounting filesystem read-only
EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
Dec 25 18:17:36 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
I'm a bit confused here. From what I understand, if there are bad blocks on a disk, that disk should be kicked from the array. However, ext3 seems to detect a bad block by itself and names /dev/md2 as the culprit.
Can anyone shed some light on what is going on here? I'm not quite as cluey with this stuff as I probably should be ;)
--
Steven Haigh
Email: netwiz@crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
* Re: ext3 on a RAID1 going read only?
From: Steven Haigh @ 2009-12-25 23:35 UTC (permalink / raw)
To: linux-raid
On 26/12/2009, at 10:17 AM, Steven Haigh wrote:
> Hi guys,
>
> Not 100% sure where to go with this one.... I've been having an issue with a particular server where after 30 days or so of uptime the / partition will go readonly after spitting the following to the console:
>
> EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
> Aborting journal on device md2.
> Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
> Dec 25 18:17:27 wireless kernel: Aborting journal on device md2.
> ext3_abort called.
> Dec 25 18:17:27 EXT3-fs error (device md2): ext3_journal_start_sb: wireless kernel:Detected aborted journal ext3_abort called.
> Remounting filesystem read-only
> Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_journal_start_sb: Detected aborted journal
> Dec 25 18:17:27 wireless kernel: Remounting filesystem read-only
> EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
> Dec 25 18:17:36 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
>
> I'm a bit confused here as from what I understand, if there are bad blocks on a disk the disk should be kicked from the array - however ext3 seems to figure out there's a bad block by itself and nominates /dev/md2 as the culprit...
>
> Can anyone shine some light on what is going on here - as I'm not quite as cluey with this stuff as I probably should be ;)
I should also mention that this is using CentOS 5.4 with kernel 2.6.18-164.9.1.el5. A few more details:
# mdadm -Q --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Mon Feb 23 17:15:41 2009
     Raid Level : raid1
     Array Size : 300511808 (286.59 GiB 307.72 GB)
  Used Dev Size : 300511808 (286.59 GiB 307.72 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Sat Dec 26 10:34:23 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : fed99e3d:d08fdcc9:b9593a45:2cc09736
         Events : 0.30586

    Number   Major   Minor   RaidDevice   State
       0        3       3        0        active sync   /dev/hda3
       1       22       3        1        active sync   /dev/hdc3

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      521984 blocks [2/2] [UU]

md1 : active raid1 hdc2[1] hda2[0]
      10482304 blocks [2/2] [UU]

md3 : active raid1 hdc4[1] hda4[0]
      1052160 blocks [2/2] [UU]

md2 : active raid1 hdc3[1] hda3[0]
      300511808 blocks [2/2] [UU]

unused devices: <none>
--
Steven Haigh
Email: netwiz@crc.id.au
Web: http://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
* Re: ext3 on a RAID1 going read only?
From: Goswin von Brederlow @ 2009-12-27 13:01 UTC (permalink / raw)
To: Steven Haigh; +Cc: linux-raid
Steven Haigh <netwiz@crc.id.au> writes:
> On 26/12/2009, at 10:17 AM, Steven Haigh wrote:
>
>> Hi guys,
>>
>> Not 100% sure where to go with this one.... I've been having an issue with a particular server where after 30 days or so of uptime the / partition will go readonly after spitting the following to the console:
>>
>> EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
>> Aborting journal on device md2.
>> Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
>> Dec 25 18:17:27 wireless kernel: Aborting journal on device md2.
>> ext3_abort called.
>> Dec 25 18:17:27 EXT3-fs error (device md2): ext3_journal_start_sb: wireless kernel:Detected aborted journal ext3_abort called.
>> Remounting filesystem read-only
>> Dec 25 18:17:27 wireless kernel: EXT3-fs error (device md2): ext3_journal_start_sb: Detected aborted journal
>> Dec 25 18:17:27 wireless kernel: Remounting filesystem read-only
>> EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
>> Dec 25 18:17:36 wireless kernel: EXT3-fs error (device md2): ext3_xattr_block_list: inode 4932068: bad block 9873979
>>
>> I'm a bit confused here as from what I understand, if there are bad blocks on a disk the disk should be kicked from the array - however ext3 seems to figure out there's a bad block by itself and nominates /dev/md2 as the culprit...
>>
>> Can anyone shine some light on what is going on here - as I'm not quite as cluey with this stuff as I probably should be ;)
>
> I should also mention that this is using CentOS 5.4 with kernel 2.6.18-164.9.1.el5. A few more details:
>
> # mdadm -Q --detail /dev/md2
> /dev/md2:
> Version : 0.90
> Creation Time : Mon Feb 23 17:15:41 2009
> Raid Level : raid1
> Array Size : 300511808 (286.59 GiB 307.72 GB)
> Used Dev Size : 300511808 (286.59 GiB 307.72 GB)
> Raid Devices : 2
> Total Devices : 2
> Preferred Minor : 2
> Persistence : Superblock is persistent
>
> Update Time : Sat Dec 26 10:34:23 2009
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> UUID : fed99e3d:d08fdcc9:b9593a45:2cc09736
> Events : 0.30586
>
> Number Major Minor RaidDevice State
> 0 3 3 0 active sync /dev/hda3
> 1 22 3 1 active sync /dev/hdc3
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md0 : active raid1 hdc1[1] hda1[0]
> 521984 blocks [2/2] [UU]
>
> md1 : active raid1 hdc2[1] hda2[0]
> 10482304 blocks [2/2] [UU]
>
> md3 : active raid1 hdc4[1] hda4[0]
> 1052160 blocks [2/2] [UU]
>
> md2 : active raid1 hdc3[1] hda3[0]
> 300511808 blocks [2/2] [UU]
>
> unused devices: <none>
Sounds like a block with bad data that doesn't give an IO error. The
raid layer can't see that the data is bad but the filesystem
recognises that the data makes no sense.
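
To see which file the filesystem is complaining about, the inode and block numbers can be pulled out of the log lines first. A minimal sketch, assuming the message format quoted above; the function name parse_ext3_error is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reads kernel log lines on stdin
# and prints "device inode block" for every ext3 "bad block" report.
parse_ext3_error() {
    sed -n 's/.*EXT3-fs error (device \([^)]*\)): [^:]*: inode \([0-9][0-9]*\): bad block \([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Example:
#   grep 'EXT3-fs error' /var/log/messages | parse_ext3_error | sort -u
#
# With the numbers in hand, debugfs (which opens the filesystem read-only
# unless given -w) can map the inode back to a path and show the inode itself:
#   debugfs -R "ncheck 4932068" /dev/md2
#   debugfs -R "stat <4932068>" /dev/md2
```

The log path /var/log/messages is an assumption about the syslog setup. Since the error comes from ext3_xattr_block_list, the extended attribute block listed for that inode is the likely place to look.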
First I would run a check on the raid to see if the contents of both
drives differ. Then I would take one drive out of the raid and run
badblocks in read-write mode on it. Then resync and repeat with the
other drive.
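
Roughly, the steps above could look like the sketch below. It only prints the commands rather than running them, since several of them take a disk out of the mirror. Assumptions on my part: the kernel exposes the md sysfs files sync_action and mismatch_cnt, and badblocks is run with -n (non-destructive read-write) rather than the destructive -w. The function name md_check_plan is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): prints the command sequence for
# checking one array and testing one of its members. Nothing is executed.
md_check_plan() {
    md=$1      # array name, e.g. md2
    part=$2    # member partition to test, e.g. hdc3

    cat <<EOF
echo check > /sys/block/$md/md/sync_action
cat /proc/mdstat
cat /sys/block/$md/md/mismatch_cnt
mdadm /dev/$md --fail /dev/$part --remove /dev/$part
badblocks -n -s -v /dev/$part
mdadm /dev/$md --add /dev/$part
cat /proc/mdstat
EOF
}

# Example: review the plan for one half of the mirror, then the other.
#   md_check_plan md2 hdc3
#   md_check_plan md2 hda3
```

Wait for the check to finish in /proc/mdstat before reading mismatch_cnt, and wait for the resync to complete after the --add before touching the second drive. While one member is out, the array has no redundancy.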
If none of those shows an error, I would back up, reformat, restore,
and pray.
MfG
Goswin
PS: Why is your / not read-only?
PPS: run http://mrvn.homeip.net/fstest/