* 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-28  9:05 UTC
  To: linux-raid


I have an Ubuntu 9.04 system with the default mdadm and kernel (2.6.28).

During the night, two drives out of my 6-drive RAID5 were kicked out due to
SATA timeouts. I rebooted the system and tried to assemble the array, but it
would end up with 2 spares and 4 working drives, which is not enough.

Further examination showed that of the two non-working drives, one was
"active" (as opposed to "clean"), and the other had an event count different
from all the other drives.

Doing --assemble --force yielded:

[ 7748.103782] md: bind<sdg>
[ 7748.103989] md: bind<sdb>
[ 7748.104164] md: bind<sdc>
[ 7748.104315] md: bind<sde>
[ 7748.104456] md: bind<sdf>
[ 7748.104631] md: bind<sdd>
[ 7748.104664] md: kicking non-fresh sde from array!
[ 7748.104684] md: unbind<sde>
[ 7748.120532] md: export_rdev(sde)
[ 7748.120554] md: md0: raid array is not clean -- starting background reconstruction
[ 7748.122135] raid5: device sdd operational as raid disk 0
[ 7748.122153] raid5: device sdf operational as raid disk 5
[ 7748.122169] raid5: device sdc operational as raid disk 3
[ 7748.122186] raid5: device sdb operational as raid disk 2
[ 7748.122202] raid5: device sdg operational as raid disk 1
[ 7748.122218] raid5: cannot start dirty degraded array for md0
[ 7748.122234] RAID5 conf printout:
[ 7748.122248]  --- rd:6 wd:5
[ 7748.122261]  disk 0, o:1, dev:sdd
[ 7748.122275]  disk 1, o:1, dev:sdg
[ 7748.122289]  disk 2, o:1, dev:sdb
[ 7748.122303]  disk 3, o:1, dev:sdc
[ 7748.122317]  disk 5, o:1, dev:sdf
[ 7748.122331] raid5: failed to run raid set md0
[ 7748.122346] md: pers->run() failed ...

# mdadm --examine /dev/sd[bcdefg] | grep -i State
           State : clean
    Array State : _uUu_u 5 failed
           State : clean
    Array State : _uuU_u 5 failed
           State : active
    Array State : Uuuu_u 4 failed
           State : clean
    Array State : uuuuUu 3 failed
           State : clean
    Array State : _uuu_U 5 failed
           State : clean
    Array State : _Uuu_u 5 failed

So sde has a different event count, and sdd is "active", which I guess is
what makes the array dirty.
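
(For reference, both of those fields come out of the same --examine output;
something like the following shows the event counter and the state lines side
by side, with the device letters from my setup:)

# mdadm --examine /dev/sd[bcdefg] | grep -E 'Events|State'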

Now, I mucked around a bit with assemble and examine, and then at one time 
the array tried to start with only two drives:

[ 8803.797521] raid5: device sdc operational as raid disk 3
[ 8803.797541] raid5: device sdd operational as raid disk 0
[ 8803.797558] raid5: not enough operational devices for md0 (4/6 failed)
[ 8803.797575] RAID5 conf printout:
[ 8803.797589]  --- rd:6 wd:2
[ 8803.797602]  disk 0, o:1, dev:sdd
[ 8803.797616]  disk 3, o:1, dev:sdc
[ 8803.797630] raid5: failed to run raid set md0
[ 8803.797645] md: pers->run() failed ...

I then stopped and started it a couple more times, and all of a sudden it
assembled correctly and started reconstruction:

[ 8842.040824] md: md0 stopped.
[ 8842.040853] md: unbind<sdc>
[ 8842.056512] md: export_rdev(sdc)
[ 8842.056549] md: unbind<sdd>
[ 8842.068510] md: export_rdev(sdd)
[ 8865.784578] md: md0 stopped.
[ 8867.003573] md: bind<sdg>
[ 8867.003753] md: bind<sdb>
[ 8867.003981] md: bind<sdc>
[ 8867.004148] md: bind<sde>
[ 8867.004291] md: bind<sdf>
[ 8867.004489] md: bind<sdd>
[ 8867.004522] md: kicking non-fresh sde from array!
[ 8867.004541] md: unbind<sde>
[ 8867.020030] md: export_rdev(sde)
[ 8867.020052] md: md0: raid array is not clean -- starting background reconstruction
[ 8867.021633] raid5: device sdd operational as raid disk 0
[ 8867.021651] raid5: device sdf operational as raid disk 5
[ 8867.021667] raid5: device sdc operational as raid disk 3
[ 8867.021683] raid5: device sdb operational as raid disk 2
[ 8867.021699] raid5: device sdg operational as raid disk 1
[ 8867.021715] raid5: cannot start dirty degraded array for md0
[ 8867.021731] RAID5 conf printout:
[ 8867.021745]  --- rd:6 wd:5
[ 8867.021759]  disk 0, o:1, dev:sdd
[ 8867.021772]  disk 1, o:1, dev:sdg
[ 8867.021786]  disk 2, o:1, dev:sdb
[ 8867.021800]  disk 3, o:1, dev:sdc
[ 8867.021814]  disk 5, o:1, dev:sdf
[ 8867.021828] raid5: failed to run raid set md0
[ 8867.021843] md: pers->run() failed ...
[ 9044.443949] md: md0 stopped.
[ 9044.443981] md: unbind<sdd>
[ 9044.452013] md: export_rdev(sdd)
[ 9044.452066] md: unbind<sdf>
[ 9044.464011] md: export_rdev(sdf)
[ 9044.464039] md: unbind<sdc>
[ 9044.476009] md: export_rdev(sdc)
[ 9044.476056] md: unbind<sdb>
[ 9044.492010] md: export_rdev(sdb)
[ 9044.492037] md: unbind<sdg>
[ 9044.504010] md: export_rdev(sdg)
[ 9297.387893] md: bind<sdd>
[ 9301.337867] md: bind<sdc>
[ 9399.256047] md: md0 still in use.
[ 9399.678154] md: array md0 already has disks!
[ 9409.702060] md: md0 stopped.
[ 9409.702087] md: unbind<sdc>
[ 9409.712012] md: export_rdev(sdc)
[ 9409.712062] md: unbind<sdd>
[ 9409.724009] md: export_rdev(sdd)
[ 9411.880427] md: md0 still in use.
[ 9413.518157] md: bind<sdg>
[ 9413.518357] md: bind<sdb>
[ 9413.518527] md: bind<sdc>
[ 9413.518675] md: bind<sde>
[ 9413.518817] md: bind<sdf>
[ 9413.518987] md: bind<sdd>
[ 9413.519019] md: md0: raid array is not clean -- starting background reconstruction
[ 9413.521094] raid5: device sdd operational as raid disk 0
[ 9413.521113] raid5: device sdf operational as raid disk 5
[ 9413.521129] raid5: device sde operational as raid disk 4
[ 9413.521145] raid5: device sdc operational as raid disk 3
[ 9413.521162] raid5: device sdb operational as raid disk 2
[ 9413.521178] raid5: device sdg operational as raid disk 1
[ 9413.521859] raid5: allocated 6396kB for md0
[ 9413.521875] raid5: raid level 5 set md0 active with 6 out of 6 devices, algorithm 2
[ 9413.521901] RAID5 conf printout:
[ 9413.521915]  --- rd:6 wd:6
[ 9413.521928]  disk 0, o:1, dev:sdd
[ 9413.521942]  disk 1, o:1, dev:sdg
[ 9413.521956]  disk 2, o:1, dev:sdb
[ 9413.521970]  disk 3, o:1, dev:sdc
[ 9413.521984]  disk 4, o:1, dev:sde
[ 9413.521998]  disk 5, o:1, dev:sdf
[ 9413.522145] md0: detected capacity change from 0 to 10001993891840

Anyone have any idea what's going on?

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-29  4:17 UTC
  To: linux-raid

On Thu, 28 Jan 2010, Mikael Abrahamsson wrote:

> I have an Ubuntu 9.04 system with the default mdadm and kernel (2.6.28).

I thought this might be a driver issue, so I tried upgrading to 9.10, which
contains kernel 2.6.31 and mdadm 2.6.7.1. It seems the software was unrelated,
because during the night three drives were kicked, so I now have 6 drives:
3 are "State: clean", 3 are "State: active", and 1 of the "active" ones has a
different event count. The array shows similar problems: sometimes it will
assemble with all 6 drives as (S)pares, sometimes it will assemble with 5
drives and show as "inactive" in /proc/mdstat.

After finding 
<http://www.linuxquestions.org/questions/linux-general-1/raid5-with-mdadm-does-not-ron-or-rebuild-505361/> 
I tried this:

root@ub:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdd[0] sdf[7] sdc[4] sdb[2] sdg[6]
       9767572240 blocks super 1.2

unused devices: <none>
root@ub:~# cat /sys/block/md0/md/array_state
inactive
root@ub:~# echo "clean" > /sys/block/md0/md/array_state
-bash: echo: write error: Invalid argument
root@ub:~# cat /sys/block/md0/md/array_state
inactive

Still no go. Can anyone help me figure out what might be going wrong here?
I mean, a drive being stuck in "active" can't be a very unusual state, can it?

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-29  7:06 UTC
  To: linux-raid

On Fri, 29 Jan 2010, Mikael Abrahamsson wrote:

> Still no go. Can anyone help me figure out what might be going wrong here?
> I mean, a drive being stuck in "active" can't be a very unusual state, can
> it?

mdadm --examine on all the drives:

http://pastebin.com/m271096ba

Also:

root@ub:~# mdadm --query --detail /dev/md0
/dev/md0:
         Version : 01.02
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
   Used Dev Size : 1953514432 (1863.02 GiB 2000.40 GB)
    Raid Devices : 6
   Total Devices : 5
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Fri Jan 29 03:08:35 2010
           State : active, degraded, Not Started
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0

          Layout : left-symmetric
      Chunk Size : 64K

            Name : swmike-htpc2:0
            UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
          Events : 2742684

     Number   Major   Minor   RaidDevice State
        0       8       48        0      active sync   /dev/sdd
        6       8       96        1      active sync   /dev/sdg
        2       8       16        2      active sync   /dev/sdb
        4       8       32        3      active sync   /dev/sdc
        4       0        0        4      removed
        7       8       80        5      active sync   /dev/sdf

root@ub:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdd[0] sdf[7] sdc[4] sdb[2] sdg[6]
       9767572240 blocks super 1.2

unused devices: <none>

root@ub:~# echo "clean" > /sys/block/md0/md/array_state
-bash: echo: write error: Invalid argument

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Neil Brown @ 2010-01-29 10:17 UTC
  To: Mikael Abrahamsson; +Cc: linux-raid

On Fri, 29 Jan 2010 05:17:10 +0100 (CET)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Thu, 28 Jan 2010, Mikael Abrahamsson wrote:
> 
> > I have an Ubuntu 9.04 system with the default mdadm and kernel (2.6.28).
> 
> I thought this might be a driver issue, so I tried upgrading to 9.10, which
> contains kernel 2.6.31 and mdadm 2.6.7.1. It seems the software was unrelated,
> because during the night three drives were kicked, so I now have 6 drives:
> 3 are "State: clean", 3 are "State: active", and 1 of the "active" ones has a
> different event count. The array shows similar problems: sometimes it will
> assemble with all 6 drives as (S)pares, sometimes it will assemble with 5
> drives and show as "inactive" in /proc/mdstat.

1/ I think you are hitting an mdadm bug in "--assemble --force" that was
   fixed in 2.6.8 (git commit 4e9a6ff778cdc58dc).
2/ Don't poke things in /sys unless you really know what you are doing (though
   I don't think this has caused you any problems).
3/ You really need to fix your problem with SATA timeouts or the array is
   never going to work.
4/ Please (please please) don't use pastebin.  Just include the output inline
   in the mail message.  It is much easier to get at that way.

NeilBrown

> 
> After finding 
> <http://www.linuxquestions.org/questions/linux-general-1/raid5-with-mdadm-does-not-ron-or-rebuild-505361/> 
> I tried this:
> 
> root@ub:~# cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> md0 : inactive sdd[0] sdf[7] sdc[4] sdb[2] sdg[6]
>        9767572240 blocks super 1.2
> 
> unused devices: <none>
> root@ub:~# cat /sys/block/md0/md/array_state
> inactive
> root@ub:~# echo "clean" > /sys/block/md0/md/array_state
> -bash: echo: write error: Invalid argument
> root@ub:~# cat /sys/block/md0/md/array_state
> inactive
> 
> Still no go. Can anyone help me figure out what might be going wrong here?
> I mean, a drive being stuck in "active" can't be a very unusual state, can
> it?
> 


* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-29 12:09 UTC
  To: Neil Brown; +Cc: linux-raid

On Fri, 29 Jan 2010, Neil Brown wrote:

> 1/ I think you are hitting an mdadm bug in "--assemble --force" that was
>   fixed in 2.6.8 (git commit 4e9a6ff778cdc58dc).

Thanks, I'll try to upgrade mdadm.
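
(For anyone following along, my plan is simply to build a newer mdadm from the
tarball and run it in place rather than replacing the packaged one; roughly
like this -- tarball location from memory, adjust the version and URL as needed:)

root@ub:~# wget http://www.kernel.org/pub/linux/utils/raid/mdadm/mdadm-3.1.1.tar.gz
root@ub:~# tar xzf mdadm-3.1.1.tar.gz && cd mdadm-3.1.1
root@ub:~/mdadm-3.1.1# make                 # needs gcc and make installed
root@ub:~/mdadm-3.1.1# ./mdadm --version    # run the freshly built binary in place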

> 3/ You really need to fix your problem with SATA timeouts or the array is
>   never going to work.

Yeah, I swapped the PSU yesterday (for a Corsair single-rail one) after this
happened the first time; I thought that might be related. Obviously it wasn't,
so I'm going to swap cables as the next step.
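
(While I swap hardware I'm keeping an eye on the link errors and the drives'
own error counters with something like the following -- smartctl comes from
the smartmontools package:)

root@ub:~# dmesg | grep -iE 'ata[0-9]+.*(timeout|reset|frozen|failed)'
root@ub:~# smartctl -a /dev/sdb | grep -iE 'reallocated|pending|udma_crc'   # CRC errors usually point at cabling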

> 4/ Please (please please) don't use pastebin.  Just include the output inline
>   in the mail message.  It is much easier to get at that way.

Check; most of the important information was already in the email, I just
added the pastebin as an extra. I'll include it inline next time.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-29 12:27 UTC
  To: Neil Brown; +Cc: linux-raid

On Fri, 29 Jan 2010, Mikael Abrahamsson wrote:

> On Fri, 29 Jan 2010, Neil Brown wrote:
>
>> 1/ I think you are hitting an mdadm bug in "--assemble --force" that was
>>   fixed in 2.6.8 (git commit 4e9a6ff778cdc58dc).
>
> Thanks, I'll try to upgrade mdadm.

Yes, that solved the problem. Thanks a bunch!

root@ub:~/mdadm-3.1.1# ./mdadm --assemble /dev/md0
mdadm: /dev/md0 assembled from 5 drives - not enough to start the array while not clean - consider --force.
root@ub:~/mdadm-3.1.1# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdd[0](S) sdf[7](S) sde[8](S) sdc[4](S) sdb[2](S) sdg[6](S)
       11721086688 blocks super 1.2

unused devices: <none>
root@ub:~/mdadm-3.1.1# ./mdadm --stop /dev/md0
mdadm: stopped /dev/md0
root@ub:~/mdadm-3.1.1# ./mdadm --assemble --force /dev/md0
mdadm: /dev/md0 has been started with 5 drives (out of 6).
root@ub:~/mdadm-3.1.1# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sde[8] sdd[0] sdf[7] sdc[4] sdb[2] sdg[6]
       9767572160 blocks super 1.2 level 5, 64k chunk, algorithm 2 [6/5] [UUUU_U]
       [>....................]  recovery =  0.2% (5726592/1953514432) finish=468.1min speed=69341K/sec

unused devices: <none>

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-01-30 21:20 UTC
  To: Neil Brown; +Cc: linux-raid

On Fri, 29 Jan 2010, Mikael Abrahamsson wrote:

> Yes, that solved the problem. Thanks a bunch!

Now I have another problem. Last time, another drive was kicked out during
the resync due to UNC read errors. I ddrescued that drive to another drive on
another system, and inserted the drive I copied to. So basically I have 5
drives that contain valid information, of which one has a lower event count,
plus one drive that was being resynced. This state doesn't seem to be OK...
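
(Roughly what the copy looked like on the other box, from memory -- the device
names here are just examples, and the log file is what lets ddrescue resume an
interrupted copy:)

# ddrescue -f -n /dev/sdX /dev/sdY rescue.log    # quick first pass, skip the bad areas
# ddrescue -f -r3 /dev/sdX /dev/sdY rescue.log   # then go back and retry the bad areas a few times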

I guess if I removed the drive being resynced to and assembled with --force,
it would update the event count of sdh (the copy of the drive that previously
had read errors) and all would be fine. The bad part is that I don't really
know which of the drives was being resynced to. Is this indicated by the
"Feature Map" (I guess 0x2 means partially synced)?

(6 hrs later: OK, I physically removed the 0x2 drive, used --assemble --force,
and then added a different drive, and that seemed to work.)

I don't know what the default action should be when there is a partially
resynced drive and a drive with a lower event count, but I lean towards taking
the drive with the lower event count, inserting it, and then resuming the sync
to the 0x2 drive. This might require some new options to mdadm to handle this
behaviour?

root@ub:~/mdadm-3.1.1# ./mdadm --assemble --force /dev/md0
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
root@ub:~/mdadm-3.1.1# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md0 : inactive sde[0] sdd[7] sdc[8] sdf[2] sdb[6]
       9767572240 blocks super 1.2

unused devices: <none>

[27567.806526] md: md0 stopped.
[27567.807713] md: bind<sdb>
[27567.807869] md: bind<sdf>
[27567.807975] md: bind<sdh>
[27567.808093] md: bind<sdc>
[27567.808224] md: bind<sdd>
[27567.808370] md: bind<sde>
[27567.808383] md: kicking non-fresh sdh from array!
[27567.808387] md: unbind<sdh>
[27567.830363] md: export_rdev(sdh)
[27567.831540] raid5: device sde operational as raid disk 0
[27567.831543] raid5: device sdd operational as raid disk 5
[27567.831545] raid5: device sdf operational as raid disk 2
[27567.831547] raid5: device sdb operational as raid disk 1
[27567.832043] raid5: allocated 6384kB for md0
[27567.832067] raid5: not enough operational devices for md0 (2/6 failed)
[27567.832094] RAID5 conf printout:
[27567.832095]  --- rd:6 wd:4
[27567.832097]  disk 0, o:1, dev:sde
[27567.832099]  disk 1, o:1, dev:sdb
[27567.832101]  disk 2, o:1, dev:sdf
[27567.832275]  disk 4, o:1, dev:sdc
[27567.832277]  disk 5, o:1, dev:sdd
[27567.832566] raid5: failed to run raid set md0
[27567.832581] md: pers->run() failed ...
[27567.897468] md0: ADD_NEW_DISK not supported


Linux ub 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 17:01:44 UTC 2009 x86_64 GNU/Linux

root@ub:~/mdadm-3.1.1# ./mdadm --examine /dev/sd[b-h] | grep Event
mdadm: No md superblock detected on /dev/sdg.
          Events : 2742697
          Events : 2742697
          Events : 2742697
          Events : 2742697
          Events : 2742697
          Events : 2742694

root@ub:~/mdadm-3.1.1# ./mdadm --examine /dev/sd[b-h]
/dev/sdb:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : d1dc43a5:cabf2c69:980c1fe8:eab041a0

     Update Time : Fri Jan 29 18:16:02 2010
        Checksum : e94019d4 - correct
          Events : 2742697

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 1
    Array State : AAA.AA ('A' == active, '.' == missing)
/dev/sdc:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x2
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
Recovery Offset : 2371002368 sectors
           State : clean
     Device UUID : 25996f75:94aabd4b:88929fa5:9052e459

     Update Time : Fri Jan 29 18:16:02 2010
        Checksum : a072888c - correct
          Events : 2742697

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 4
    Array State : AAA.AA ('A' == active, '.' == missing)
/dev/sdd:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : b1c39781:50e6c164:76a2f4ab:9c8c9f45

     Update Time : Fri Jan 29 18:16:02 2010
        Checksum : 2a5c98cb - correct
          Events : 2742697

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 5
    Array State : AAA.AA ('A' == active, '.' == missing)
/dev/sde:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 7fcd2b91:f17ca45a:4d3b0e08:a156c70a

     Update Time : Fri Jan 29 18:16:02 2010
        Checksum : 51149c0e - correct
          Events : 2742697

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 0
    Array State : AAA.AA ('A' == active, '.' == missing)
/dev/sdf:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : e5689d67:b72d2697:84792201:492598b3

     Update Time : Fri Jan 29 18:16:02 2010
        Checksum : 5ecf51c - correct
          Events : 2742697

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 2
    Array State : AAA.AA ('A' == active, '.' == missing)
mdadm: No md superblock detected on /dev/sdg.
/dev/sdh:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x0
      Array UUID : 7eda4927:254c1b6e:f3c3144a:9f4159d2
            Name : swmike-htpc2:0
   Creation Time : Thu Mar 19 16:32:38 2009
      Raid Level : raid5
    Raid Devices : 6

  Avail Dev Size : 3907028896 (1863.02 GiB 2000.40 GB)
      Array Size : 19535144320 (9315.08 GiB 10001.99 GB)
   Used Dev Size : 3907028864 (1863.02 GiB 2000.40 GB)
     Data Offset : 272 sectors
    Super Offset : 8 sectors
           State : clean
     Device UUID : 3086b1fd:0d547803:5229a71a:6903df1c

     Update Time : Fri Jan 29 17:54:24 2010
        Checksum : 8b1dc19c - correct
          Events : 2742694

          Layout : left-symmetric
      Chunk Size : 64K

    Device Role : Active device 3
    Array State : AAAAAA ('A' == active, '.' == missing)

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Neil Brown @ 2010-01-31 22:37 UTC
  To: Mikael Abrahamsson; +Cc: linux-raid

On Sat, 30 Jan 2010 22:20:34 +0100 (CET)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Fri, 29 Jan 2010, Mikael Abrahamsson wrote:
> 
> > Yes, that solved the problem. Thanks a bunch!
> 
> Now I have another problem. Last time, another drive was kicked out during
> the resync due to UNC read errors. I ddrescued that drive to another drive on
> another system, and inserted the drive I copied to. So basically I have 5
> drives that contain valid information, of which one has a lower event count,
> plus one drive that was being resynced. This state doesn't seem to be OK...
> 
> I guess if I removed the drive being resynced to and assembled with --force,
> it would update the event count of sdh (the copy of the drive that previously
> had read errors) and all would be fine. The bad part is that I don't really
> know which of the drives was being resynced to. Is this indicated by the
> "Feature Map" (I guess 0x2 means partially synced)?

0x2 means "the 'recovery_offset' fields is valid" which does correlate well
with "is partially sync:ed".

> 
> (6 hrs later: OK, I physically removed the 0x2 drive, used --assemble --force,
> and then added a different drive, and that seemed to work.)
> 
> I don't know what the default action should be when there is a partially
> resynced drive and a drive with a lower event count, but I lean towards taking
> the drive with the lower event count, inserting it, and then resuming the sync
> to the 0x2 drive. This might require some new options to mdadm to handle this
> behaviour?

You might know that nothing has been written to the array since the device
with the lower event count was removed, but md doesn't know that.  Any device
with an old event count could have old data and so cannot be trusted (unless
you assemble with --force, meaning that you are taking responsibility).

My planned way to address this situation is to store a bad-block-list per
device and when we get an unrecoverable failure, record the address in the
bad-block-list and continue as best we can.

NeilBrown



* Re: 2 drives failed, one "active", one with wrong event count
From: Mikael Abrahamsson @ 2010-02-01  7:13 UTC
  To: linux-raid

On Mon, 1 Feb 2010, Neil Brown wrote:

> You might know that nothing has been written to the array since the
> device with the lower event count was removed, but md doesn't know that.
> Any device with an old event count could have old data and so cannot be
> trusted (unless you assemble with --force, meaning that you are taking
> responsibility).

I did use --force, but it seems that in the state "one drive with a lower
event count and another one with 0x2", the event count on the drive isn't
forcibly updated, and since there is a 0x2 drive, the array isn't started.

I had the same situation again this morning (I'm changing the controller
next), but this time I had a write-intent bitmap enabled, so recovery of the
array with --assemble --force took just a few seconds. Really nice.
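
(In case it helps anyone: an internal write-intent bitmap can be added to an
assembled, healthy array with a one-liner along these lines:)

root@ub:~/mdadm-3.1.1# ./mdadm --grow --bitmap=internal /dev/md0
root@ub:~/mdadm-3.1.1# cat /proc/mdstat    # should now show a "bitmap:" line under md0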

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2 drives failed, one "active", one with wrong event count
From: Neil Brown @ 2010-02-04  1:03 UTC
  To: Mikael Abrahamsson; +Cc: linux-raid

On Mon, 1 Feb 2010 08:13:24 +0100 (CET)
Mikael Abrahamsson <swmike@swm.pp.se> wrote:

> On Mon, 1 Feb 2010, Neil Brown wrote:
> 
> > You might know that nothing has been written to the array since the
> > device with the lower event count was removed, but md doesn't know that.
> > Any device with an old event count could have old data and so cannot be
> > trusted (unless you assemble with --force, meaning that you are taking
> > responsibility).
> 
> I did use --force, but it seems that in the state "one drive with a lower
> event count and another one with 0x2", the event count on the drive isn't
> forcibly updated, and since there is a 0x2 drive, the array isn't started.
> 
> I had the same situation again this morning (I'm changing the controller
> next), but this time I had a write-intent bitmap enabled, so recovery of the
> array with --assemble --force took just a few seconds. Really nice.
> 

Right... I understand now.

Fixed with the following patch, which will be in 3.1.2.

Thanks,
NeilBrown

commit 921d9e164fd3f6203d1b0cf2424b793043afd001
Author: NeilBrown <neilb@suse.de>
Date:   Thu Feb 4 12:02:09 2010 +1100

    Assemble: fix --force assembly of v1.x arrays which are recovering.
    
    1.x metadata allows a device to be a member of the array while it
    is still recovering.  So it is a working member, but is not
    completely in-sync.
    
    mdadm/assemble does not understand this distinction and assumes that a
    working member is fully in-sync for the purpose of determining if there
    are enough in-sync devices for the array to be functional.
    
    So collect the 'recovery_start' value from the metadata and use it in
    assemble when determining how useful a given device is.
    
    Reported-by: Mikael Abrahamsson <swmike@swm.pp.se>
    Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/Assemble.c b/Assemble.c
index 7f90048..e4d6181 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -800,7 +800,8 @@ int Assemble(struct supertype *st, char *mddev,
 		if (devices[j].i.events+event_margin >=
 		    devices[most_recent].i.events) {
 			devices[j].uptodate = 1;
-			if (i < content->array.raid_disks) {
+			if (i < content->array.raid_disks &&
+			    devices[j].i.recovery_start == MaxSector) {
 				okcnt++;
 				avail[i]=1;
 			} else
@@ -822,6 +823,7 @@ int Assemble(struct supertype *st, char *mddev,
 			int j = best[i];
 			if (j>=0 &&
 			    !devices[j].uptodate &&
+			    devices[j].i.recovery_start == MaxSector &&
 			    (chosen_drive < 0 ||
 			     devices[j].i.events
 			     > devices[chosen_drive].i.events))
diff --git a/super-ddf.c b/super-ddf.c
index 3e30229..870efd8 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -1369,6 +1369,7 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info)
 	info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
 
 
+	info->recovery_start = MaxSector;
 	info->reshape_active = 0;
 	info->name[0] = 0;
 
@@ -1427,6 +1428,7 @@ static void getinfo_super_ddf_bvd(struct supertype *st, struct mdinfo *info)
 
 	info->container_member = ddf->currentconf->vcnum;
 
+	info->recovery_start = MaxSector;
 	info->resync_start = 0;
 	if (!(ddf->virt->entries[info->container_member].state
 	      & DDF_state_inconsistent)  &&
diff --git a/super-intel.c b/super-intel.c
index 91479a2..bbdcb51 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -1452,6 +1452,7 @@ static void getinfo_super_imsm_volume(struct supertype *st, struct mdinfo *info)
 	info->data_offset	  = __le32_to_cpu(map->pba_of_lba0);
 	info->component_size	  = __le32_to_cpu(map->blocks_per_member);
 	memset(info->uuid, 0, sizeof(info->uuid));
+	info->recovery_start = MaxSector;
 
 	if (map->map_state == IMSM_T_STATE_UNINITIALIZED || dev->vol.dirty) {
 		info->resync_start = 0;
@@ -1559,6 +1560,7 @@ static void getinfo_super_imsm(struct supertype *st, struct mdinfo *info)
 	info->disk.number = -1;
 	info->disk.state = 0;
 	info->name[0] = 0;
+	info->recovery_start = MaxSector;
 
 	if (super->disks) {
 		__u32 reserved = imsm_reserved_sectors(super, super->disks);
diff --git a/super0.c b/super0.c
index 0485a3a..5c6b7d7 100644
--- a/super0.c
+++ b/super0.c
@@ -372,6 +372,7 @@ static void getinfo_super0(struct supertype *st, struct mdinfo *info)
 
 	uuid_from_super0(st, info->uuid);
 
+	info->recovery_start = MaxSector;
 	if (sb->minor_version > 90 && (sb->reshape_position+1) != 0) {
 		info->reshape_active = 1;
 		info->reshape_progress = sb->reshape_position;
diff --git a/super1.c b/super1.c
index 85bb598..40fbb81 100644
--- a/super1.c
+++ b/super1.c
@@ -612,6 +612,11 @@ static void getinfo_super1(struct supertype *st, struct mdinfo *info)
 	strncpy(info->name, sb->set_name, 32);
 	info->name[32] = 0;
 
+	if (sb->feature_map & __le32_to_cpu(MD_FEATURE_RECOVERY_OFFSET))
+		info->recovery_start = __le32_to_cpu(sb->recovery_offset);
+	else
+		info->recovery_start = MaxSector;
+
 	if (sb->feature_map & __le32_to_cpu(MD_FEATURE_RESHAPE_ACTIVE)) {
 		info->reshape_active = 1;
 		info->reshape_progress = __le64_to_cpu(sb->reshape_position);
