* Recovering a raid5 array with strange event count
From: Chris Allen @ 2007-04-13 10:14 UTC
To: linux-raid
Dear All,
I have an 8-drive raid-5 array running under 2.6.11. This morning it
bombed out, and when I brought it up again, two drives had incorrect
event counts:
sda1: 0.8258715
sdb1: 0.8258715
sdc1: 0.8258715
sdd1: 0.8258715
sde1: 0.8258715
sdf1: 0.8258715
sdg1: 0.8258708
sdh1: 0.8258716
sdg1 is out of date (expected), but sdh1 has received an extra event.
Any attempt to restart with mdadm --assemble --force results in an
unstartable array with an event count of 0.8258715.
Can anybody advise on the correct command to use to get it started again?
I'm assuming I'll need to use mdadm --create --assume-clean, but I'm
not sure which drives should be included or excluded when I do this.
Many thanks!
Chris Allen.
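For reference, the last-resort recreation contemplated above would look
roughly like the sketch below. This is only a sketch under assumptions,
not a recipe: it presumes the original geometry and member order (eight
raid-5 members in the order sda1..sdh1, left-symmetric layout, 64K
chunks, as the --examine dumps later in the thread confirm), and a wrong
order or wrong parameter here silently destroys the data. Listing the
keyword "missing" in place of the stale sdg1 keeps md from treating its
old contents as valid; the drive can be re-added afterwards to rebuild.

mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=8 \
      --layout=left-symmetric --chunk=64 \
      /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 \
      missing /dev/sdh1
mdadm --add /dev/md0 /dev/sdg1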
* Re: Recovering a raid5 array with strange event count
From: Neil Brown @ 2007-04-13 12:13 UTC
To: Chris Allen; +Cc: linux-raid
On Friday April 13, chris@cjx.com wrote:
> Dear All,
>
> I have an 8-drive raid-5 array running under 2.6.11. This morning it
> bombed out, and when I brought it up again, two drives had incorrect
> event counts:
>
>
> sda1: 0.8258715
> sdb1: 0.8258715
> sdc1: 0.8258715
> sdd1: 0.8258715
> sde1: 0.8258715
> sdf1: 0.8258715
> sdg1: 0.8258708
> sdh1: 0.8258716
>
>
> sdg1 is out of date (expected), but sdh1 has received an extra event.
>
> Any attempt to restart with mdadm --assemble --force results in an
> unstartable array with an event count of 0.8258715.
>
> Can anybody advise on the correct command to use to get it started again?
> I'm assuming I'll need to use mdadm --create --assume-clean, but I'm
> not sure which drives should be included or excluded when I do this.
A difference of 1 in event counts is not supposed to cause a problem.
Have you tried simply assembling the array without including sdg1?
e.g.
mdadm -A /dev/md0 /dev/sd[abcdefh]1
NeilBrown
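For context: the per-device event counts quoted at the top of the
thread come from each member's superblock. Assuming a stock mdadm,
something along these lines lists them, printing each device name
followed by its Events line:

mdadm --examine /dev/sd[a-h]1 | grep -E '^/dev/|Events'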
* Re: Recovering a raid5 array with strange event count
From: Chris Allen @ 2007-04-13 13:07 UTC
To: Neil Brown; +Cc: linux-raid
Neil Brown wrote:
> On Friday April 13, chris@cjx.com wrote:
>
>> Dear All,
>>
>> I have an 8-drive raid-5 array running under 2.6.11. This morning it
>> bombed out, and when I brought it up again, two drives had incorrect
>> event counts:
>>
>>
>> sda1: 0.8258715
>> sdb1: 0.8258715
>> sdc1: 0.8258715
>> sdd1: 0.8258715
>> sde1: 0.8258715
>> sdf1: 0.8258715
>> sdg1: 0.8258708
>> sdh1: 0.8258716
>>
>>
>> sdg1 is out of date (expected), but sdh1 has received an extra event.
>>
>> Any attempt to restart with mdadm --assemble --force results in an
>> unstartable array with an event count of 0.8258715.
>>
>> Can anybody advise on the correct command to use to get it started again?
>> I'm assuming I'll need to use mdadm --create --assume-clean, but I'm
>> not sure which drives should be included or excluded when I do this.
>>
>
> A difference of 1 in event counts is not supposed to cause a problem.
> Have you tried simply assembling the array without including sdg1?
> e.g.
> mdadm -A /dev/md0 /dev/sd[abcdefh]1
>
>
>
# mdadm -A /dev/md0 /dev/sd[abcdefh]1
mdadm: /dev/md0 assembled from 7 drives - need all 8 to start it (use --run to insist)
# mdadm -D /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
# mdadm --run /dev/md0
mdadm: failed to run array /dev/md0: invalid argument
I've attached the syslog, the dump for the assembled array, the dump
for each drive, and the contents of /proc/mdstat. Using --force makes
no difference.
Apr 13 13:59:45 snap29 kernel: md: bind<sdb1>
Apr 13 13:59:45 snap29 kernel: md: bind<sdc1>
Apr 13 13:59:45 snap29 kernel: md: bind<sdd1>
Apr 13 13:59:45 snap29 kernel: md: bind<sde1>
Apr 13 13:59:45 snap29 kernel: md: bind<sdf1>
Apr 13 13:59:45 snap29 kernel: md: bind<sdh1>
Apr 13 13:59:45 snap29 kernel: md: bind<sda1>
Apr 13 14:00:01 snap29 kernel: md: md0: raid array is not clean -- starting background reconstruction
Apr 13 14:00:01 snap29 kernel: raid5: device sda1 operational as raid disk 0
Apr 13 14:00:01 snap29 kernel: raid5: device sdh1 operational as raid disk 7
Apr 13 14:00:01 snap29 kernel: raid5: device sdf1 operational as raid disk 5
Apr 13 14:00:01 snap29 kernel: raid5: device sde1 operational as raid disk 4
Apr 13 14:00:01 snap29 kernel: raid5: device sdd1 operational as raid disk 3
Apr 13 14:00:01 snap29 kernel: raid5: device sdc1 operational as raid disk 2
Apr 13 14:00:01 snap29 kernel: raid5: device sdb1 operational as raid disk 1
Apr 13 14:00:01 snap29 kernel: raid5: cannot start dirty degraded array for md0
Apr 13 14:00:01 snap29 kernel: RAID5 conf printout:
Apr 13 14:00:01 snap29 kernel: --- rd:8 wd:7 fd:1
Apr 13 14:00:01 snap29 kernel: disk 0, o:1, dev:sda1
Apr 13 14:00:01 snap29 kernel: disk 1, o:1, dev:sdb1
Apr 13 14:00:01 snap29 kernel: disk 2, o:1, dev:sdc1
Apr 13 14:00:01 snap29 kernel: disk 3, o:1, dev:sdd1
Apr 13 14:00:01 snap29 kernel: disk 4, o:1, dev:sde1
Apr 13 14:00:01 snap29 kernel: disk 5, o:1, dev:sdf1
Apr 13 14:00:01 snap29 kernel: disk 7, o:1, dev:sdh1
Apr 13 14:00:01 snap29 kernel: raid5: failed to run raid set md0
Apr 13 14:00:01 snap29 kernel: md: pers->run() failed ...
/dev/md0:
Version : 00.90.01
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 8
Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Apr 13 10:11:15 2007
State : active, degraded, Not Started
Active Devices : 7
Working Devices : 7
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Events : 0.8258715
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
4 8 65 4 active sync /dev/sde1
5 8 81 5 active sync /dev/sdf1
6 0 0 6 removed
7 8 113 7 active sync /dev/sdh1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : clean
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bd5a - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bd6b - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 17 1 active sync /dev/sdb1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bd7d - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bd8f - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 49 3 active sync /dev/sdd1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sde1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bda1 - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 65 4 active sync /dev/sde1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sdf1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:12 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bdb3 - correct
Events : 0.8258715
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 5 8 81 5 active sync /dev/sdf1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
/dev/sdh1:
Magic : a92b4efc
Version : 00.90.01
UUID : 31b253f9:02049908:aa4bb1ab:753b8fda
Creation Time : Wed Apr 19 06:23:21 2006
Raid Level : raid5
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 3418687552 (3260.31 GiB 3500.74 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Update Time : Fri Apr 13 10:11:15 2007
State : active
Active Devices : 8
Working Devices : 8
Failed Devices : 0
Spare Devices : 0
Checksum : a469bddb - correct
Events : 0.8258716
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 7 8 113 7 active sync /dev/sdh1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
4 4 8 65 4 active sync /dev/sde1
5 5 8 81 5 active sync /dev/sdf1
6 6 8 97 6 active sync /dev/sdg1
7 7 8 113 7 active sync /dev/sdh1
Personalities : [raid5]
md0 : inactive sda1[0] sdh1[7] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
3418687552 blocks
unused devices: <none>
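The decisive line in the syslog above is "raid5: cannot start dirty
degraded array for md0": the array went down mid-write (dirty) while a
member is missing (degraded), so some parity may be inconsistent and
the kernel refuses to start it by default. Later 2.6 kernels (from
around 2.6.14, if memory serves) expose an override; the parameter name
below is taken from the md documentation of that era and should be
treated as an assumption to verify against your kernel:

echo 1 > /sys/module/md_mod/parameters/start_dirty_degraded
mdadm --run /dev/md0

or, for arrays auto-assembled at boot, md-mod.start_dirty_degraded=1 on
the kernel command line.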
* Re: Recovering a raid5 array with strange event count
From: Chris Allen @ 2007-04-16 13:55 UTC
To: Neil Brown; +Cc: Chris Allen, linux-raid
Neil Brown wrote:
> On Friday April 13, chris@cjx.com wrote:
>
>> Dear All,
>>
>> I have an 8-drive raid-5 array running under 2.6.11. This morning it
>> bombed out, and when I brought it up again, two drives had incorrect
>> event counts:
>>
>>
>> sda1: 0.8258715
>> sdb1: 0.8258715
>> sdc1: 0.8258715
>> sdd1: 0.8258715
>> sde1: 0.8258715
>> sdf1: 0.8258715
>> sdg1: 0.8258708
>> sdh1: 0.8258716
>>
>>
>> sdg1 is out of date (expected), but sdh1 has received an extra event.
>>
>> Any attempt to restart with mdadm --assemble --force results in an
>> unstartable array with an event count of 0.8258715.
>>
>> Can anybody advise on the correct command to use to get it started again?
>> I'm assuming I'll need to use mdadm --create --assume-clean, but I'm
>> not sure which drives should be included or excluded when I do this.
>>
>
> A difference of 1 in event counts is not supposed to cause a problem.
> Have you tried simply assembling the array without including sdg1?
> e.g.
> mdadm -A /dev/md0 /dev/sd[abcdefh]1
>
>
>
Further to this, I have tried upgrading the kernel to 2.6.17; I get the
same errors. I don't know if it is any use, but here is the tail of an
strace of the assemble command on both the bad system and a similar
good system:
STRACE FROM ASSEMBLE - BAD ARRAY:
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\1\0\0\0\0\0\0\0\371S\2621I\311"..., 4096) = 4096
close(4) = 0
stat64("/dev/sdi1", {st_mode=S_IFBLK|0640, st_rdev=makedev(8, 129),
...}) = 0
open("/dev/sdb1", O_RDONLY|O_EXCL) = 4
ioctl(4, BLKGETSIZE64, 0xbffdf150) = 0
ioctl(4, BLKFLSBUF, 0) = 0
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\1\0\0\0\0\0\0\0\371S\2621I\311"..., 4096) = 4096
close(4) = 0
ioctl(3, 0x40480923, 0xbffdf2c0) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x40140921, 0xbffdf324) = 0
ioctl(3, 0x400c0930, 0) = -1 EIO (Input/output error)
write(2, "mdadm: failed to RUN_ARRAY /dev/"..., 56mdadm: failed to
RUN_ARRAY /dev/md0: Input/output error
) = 56
exit_group(1) = ?
SAME COMMAND, GOOD ARRAY:
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0\316\360\34;:"..., 4096) = 4096
close(4) = 0
stat64("/dev/sdh1", {st_mode=S_IFBLK|0640, st_rdev=makedev(8, 113),
...}) = 0
open("/dev/sda1", O_RDONLY|O_EXCL) = 4
ioctl(4, BLKGETSIZE64, 0xbfcae6d8) = 0
ioctl(4, BLKFLSBUF, 0) = 0
_llseek(4, 500105150464, [500105150464], SEEK_SET) = 0
read(4, "\374N+\251\0\0\0\0Z\0\0\0\0\0\0\0\0\0\0\0\316\360\34;:"..., 4096) = 4096
close(4) = 0
ioctl(3, 0x40480923, 0xbfcae800) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x40140921, 0xbfcae85c) = 0
ioctl(3, 0x400c0930, 0) = 0
write(2, "mdadm: /dev/md0 has been started"..., 46mdadm: /dev/md0 has
been started with 8 drives) = 46
write(2, ".\n", 2.
) = 2
exit_group(0) = ?
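Decoding the md ioctl numbers makes the difference between the two
traces plain. Assuming the 2.6-era definitions in
include/linux/raid/md_u.h (ioctl type 0x09):

# _IOW(0x09, 0x23, mdu_array_info_t) - describe the array to the kernel
0x40480923 = SET_ARRAY_INFO
# _IOW(0x09, 0x21, mdu_disk_info_t) - one call per member device
0x40140921 = ADD_NEW_DISK
# _IOW(0x09, 0x30, mdu_param_t) - ask the kernel to start the array
0x400c0930 = RUN_ARRAY

Both runs set the array info and add their members (seven ADD_NEW_DISK
calls on the bad system, eight on the good one); the bad system fails
only at RUN_ARRAY, with the kernel returning EIO. In other words, this
is the same in-kernel refusal to start a dirty degraded array that the
syslog showed, not an mdadm-side problem.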