* How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
From: David C. Rankin @ 2015-08-28 9:22 UTC (permalink / raw)
To: mdraid
All,
I had a disc-controller failure on a server running several raid1 arrays. The
disks are fine, but I have had the root partition come up in degraded mode. What
is the best way to tell mdraid to resync the disks? Here are the symptoms:
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb7[1]
52396032 blocks super 1.2 [2/1] [_U]
md3 : active raid1 sdb6[1] sda6[0]
1047552 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda8[0] sdb8[1]
922944192 blocks super 1.2 [2/2] [UU]
bitmap: 0/7 pages [0KB], 65536KB chunk
md0 : active raid1 sda5[0] sdb5[1]
204608 blocks super 1.2 [2/2] [UU]
unused devices: <none>
# mdadm --misc --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Array Size : 52396032 (49.97 GiB 53.65 GB)
Used Dev Size : 52396032 (49.97 GiB 53.65 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Update Time : Fri Aug 28 04:12:18 2015
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : archiso:1
UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Events : 100308
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 23 1 active sync /dev/sdb7
Reading, it looks like one approach is to boot the install media and then zero
the superblock on /dev/sda7 and then reboot. Will that force a rebuild, or do I
need to fail and remove the disk first? I was thinking:
# mdadm --zero-superblock /dev/sda7
should set it up for a rebuild without more. Is this a sane approach?
--
David C. Rankin, J.D.,P.E.
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
From: David C. Rankin @ 2015-08-28 9:42 UTC (permalink / raw)
To: mdraid
On 08/28/2015 04:22 AM, David C. Rankin wrote:
> <- snip ->
>
> Reading, it looks like one approach is to boot the install media and then zero
> the superblock on /dev/sda7 and then reboot. Will that force a rebuild, or do I
> need to fail and remove the disk first? I was thinking:
>
> # mdadm --zero-superblock /dev/sda7
>
> should set it up for a rebuild without more. Is this a sane approach?
>
This adds a bit more of the picture. It's like sda7 doesn't even know it was
kicked out. There are no disk errors logged for either of the drives:
# mdadm -E /dev/sd[ab]7
/dev/sda7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x8
Array UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Name : archiso:1
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 104792064 (49.97 GiB 53.65 GB)
Array Size : 52396032 (49.97 GiB 53.65 GB)
Data Offset : 65536 sectors
Super Offset : 8 sectors
Unused Space : before=65448 sectors, after=0 sectors
State : active
Device UUID : f5a48ea1:bce2f6f0:f47f9c0b:bad1d64d
Update Time : Sat Aug 8 17:17:21 2015
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : 2c45bcef - correct
Events : 280
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x8
Array UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Name : archiso:1
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 104792064 (49.97 GiB 53.65 GB)
Array Size : 52396032 (49.97 GiB 53.65 GB)
Data Offset : 65536 sectors
Super Offset : 8 sectors
Unused Space : before=65448 sectors, after=0 sectors
State : clean
Device UUID : 66e069cc:02daa93e:1d4a6eea:e5c21cb7
Update Time : Fri Aug 28 04:35:31 2015
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : ed07de3b - correct
Events : 100584
Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing, 'R' == replacing)
Do I try a --re-add on sda7 or just zero it for a complete rebuild? Any help
appreciated.
--
David C. Rankin, J.D.,P.E.
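The event counters in the two superblocks above (280 on sda7, 100584 on sdb7) show which member is stale. A minimal sketch of that comparison, assuming the "Events : N" layout printed by mdadm -E above; events_of and stale_member are illustrative helper names, not part of mdadm:

```shell
#!/bin/sh
# Print the value of the "Events" line from `mdadm --examine` output on stdin.
# Relies on the "Events : N" layout shown in the output above.
events_of() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Given two event counts, say which one is behind: "first", "second" or "neither".
stale_member() {
    if [ "$1" -lt "$2" ]; then
        echo first
    elif [ "$1" -gt "$2" ]; then
        echo second
    else
        echo neither
    fi
}

# Example with the counts reported above (sda7 = 280, sdb7 = 100584):
#   stale_member 280 100584
# On a live system (as root) the counts could be read with:
#   a=$(mdadm --examine /dev/sda7 | events_of)
#   b=$(mdadm --examine /dev/sdb7 | events_of)
```

The member with the lower count is the one that fell out of the array and has to be brought back up to date.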
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
From: Robin Hill @ 2015-08-28 9:52 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Fri Aug 28, 2015 at 04:22:09am -0500, David C. Rankin wrote:
> All,
>
> I had a disc-controller failure on a server running several raid1
> arrays. The disks are fine, but I have had the root partition come up
> in degraded mode. What is the best way to tell mdraid to resync the
> disks? Here are the symptoms:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 sdb7[1]
> 52396032 blocks super 1.2 [2/1] [_U]
>
> md3 : active raid1 sdb6[1] sda6[0]
> 1047552 blocks super 1.2 [2/2] [UU]
>
> md2 : active raid1 sda8[0] sdb8[1]
> 922944192 blocks super 1.2 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 65536KB chunk
>
> md0 : active raid1 sda5[0] sdb5[1]
> 204608 blocks super 1.2 [2/2] [UU]
>
> unused devices: <none>
>
<- snip ->
> Reading, it looks like one approach is to boot the install media and
> then zero the superblock on /dev/sda7 and then reboot. Will that force
> a rebuild, or do I need to fail and remove the disk first? I was thinking:
>
> # mdadm --zero-superblock /dev/sda7
>
> should set it up for a rebuild without more. Is this a sane approach?
>
No need to over-complicate things. The only issue you have looks to be
that sda7 has not come up as part of md1, so just add it back in:
mdadm /dev/md1 -a /dev/sda7
You probably want to check dmesg, etc. to see why it didn't get added in
at all in the first place (I'd have expected it to be at least in as a
spare).
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |
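To confirm which array is short a member before and after the add, the status field in /proc/mdstat can be checked. A minimal sketch, assuming the [_U]-style field is the last item on the "blocks" line as in the output quoted above; degraded_arrays is a made-up helper name:

```shell
#!/bin/sh
# Print the name of every md array whose member-status field (for example
# [_U]) contains an underscore, i.e. has a missing member.
# Reads /proc/mdstat-style text on stdin.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md1 : active raid1 sdb7[1]"
        /^md[0-9]+ :/ { name = $1 }
        # Status line ends in a field such as [UU] or [_U]
        $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }
    '
}

# On a live system:
#   degraded_arrays < /proc/mdstat
```

For the first mdstat listing in this thread it would print md1 only; once the recovery completes it should print nothing.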
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
From: Mikael Abrahamsson @ 2015-08-28 9:54 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Fri, 28 Aug 2015, David C. Rankin wrote:
> Do I try a --re-add on sda7 or just zero it for a complete rebuild? Any
> help appreciated.
Since sda7 has a much lower event count, it doesn't really matter. You do
not have a bitmap enabled and so since the event counts are off, a
complete resync will need to happen either way.
If --re-add doesn't work, use --add. If that doesn't work, zero superblock
and --add.
--
Mikael Abrahamsson email: swmike@swm.pp.se
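The fallback order described above (--re-add, then --add, then zero the superblock and --add) can be sketched as a small shell function. This is only an illustration under stated assumptions: restore_member and the MDADM override variable are hypothetical names added here so the sequence can be dry-run; the mdadm options themselves are the ones named in this thread.

```shell
#!/bin/sh
# MDADM can be overridden for a dry run; it defaults to the real tool.
MDADM=${MDADM:-mdadm}

# Try the least destructive option first.
# $1 = array (e.g. /dev/md1), $2 = member to bring back (e.g. /dev/sda7)
# Prints which step succeeded; returns non-zero if none did.
restore_member() {
    array=$1
    member=$2
    if "$MDADM" "$array" --re-add "$member"; then
        echo "re-add"
    elif "$MDADM" "$array" --add "$member"; then
        echo "add"
    elif "$MDADM" --zero-superblock "$member" &&
         "$MDADM" "$array" --add "$member"; then
        echo "zero+add"
    else
        echo "failed" >&2
        return 1
    fi
}

# Usage (as root):
#   restore_member /dev/md1 /dev/sda7
```

Zeroing the superblock is deliberately the last step, since it discards the member's md metadata and should only ever be run against the stale device.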
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
From: David C. Rankin @ 2015-08-28 13:16 UTC (permalink / raw)
To: mdraid
On 08/28/2015 04:52 AM, Robin Hill wrote:
> On Fri Aug 28, 2015 at 04:22:09am -0500, David C. Rankin wrote:
>
>> Reading, it looks like one approach is to boot the install media and
>> then zero the superblock on /dev/sda7 and then reboot. Will that force
>> a rebuild, or do I need to fail and remove the disk first? I was thinking:
>>
>> # mdadm --zero-superblock /dev/sda7
>>
>> should set it up for a rebuild without more. Is this a sane approach?
>>
> No need to over-complicate things. The only issue you have looks to be
> that sda7 has not come up as part of md1, so just add it back in:
> mdadm /dev/md1 -a /dev/sda7
>
> You probably want to check dmesg, etc. to see why it didn't get added in
> at all in the first place (I'd have expected it to be at least in as a
> spare).
>
> Cheers,
> Robin
>
Hah!
You guys are great!
[07:57 phoinix:.../david/dev] # mdadm /dev/md1 -a /dev/sda7
mdadm: added /dev/sda7
[07:58 phoinix:.../david/dev] # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda7[2] sdb7[1]
52396032 blocks super 1.2 [2/1] [_U]
[=>...................] recovery = 6.1% (3246848/52396032) finish=9.7min speed=83527K/sec
I have no clue what happened; I have never seen this before. There was never
an attempt to md: bind<sda7> (as if it didn't exist). The only thing different
about this boot (aside from the new HighPoint RAID controller) was that I left
the Arch install CD in the CD/DVD drive. I don't know whether, in the early boot
process before the handoff to the HighPoint controller, it somehow grabbed sda
(the CD drive is still on the onboard ATA controller, while all SATA drives are
attached to the HighPoint controller).
Let me know if you see anything that makes more sense.
And... during the time it took to compose this reply:
Personalities : [raid1]
md1 : active raid1 sda7[2] sdb7[1]
52396032 blocks super 1.2 [2/2] [UU]
md3 : active raid1 sdb6[1] sda6[0]
1047552 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda8[0] sdb8[1]
922944192 blocks super 1.2 [2/2] [UU]
bitmap: 0/7 pages [0KB], 65536KB chunk
md0 : active raid1 sda5[0] sdb5[1]
204608 blocks super 1.2 [2/2] [UU]
unused devices: <none>
Whoop!
I've gathered the relevant dmesg output. Maybe you can help make sense out of
it. It just looks like the system never tried to activate sda7...
[ 3.261932] sd 3:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 3.261948] sd 2:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 3.261988] sd 2:0:0:0: [sda] Write Protect is off
[ 3.261990] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 3.262010] sd 3:0:0:0: [sdb] Write Protect is off
[ 3.262012] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 3.262018] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 3.262034] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 3.298037] sda: sda1 < sda5 sda6 sda7 sda8 >
[ 3.298469] sd 2:0:0:0: [sda] Attached SCSI disk
[ 3.317052] sdb: sdb1 < sdb5 sdb6 sdb7 sdb8 >
[ 3.317456] sd 3:0:0:0: [sdb] Attached SCSI disk
[ 3.420533] md: bind<sdb5>
[ 3.421641] md: bind<sdb8>
[ 3.423385] md: bind<sda8>
[ 3.425987] md: raid1 personality registered for level 1
[ 3.426035] md: bind<sda5>
[ 3.426204] md/raid1:md2: active with 2 out of 2 mirrors
[ 3.426322] created bitmap (7 pages) for device md2
[ 3.426614] md2: bitmap initialized from disk: read 1 pages, set 0 of 14084 bits
[ 3.427474] md/raid1:md0: active with 2 out of 2 mirrors
[ 3.427496] md0: detected capacity change from 0 to 209518592
[ 3.469789] md0: unknown partition table
[ 3.543932] md2: detected capacity change from 0 to 945094852608
[ 3.544373] md: bind<sda6>
[ 3.545918] md: bind<sdb6>
[ 3.546646] md2: unknown partition table
[ 3.547402] md/raid1:md3: active with 2 out of 2 mirrors
[ 3.547428] md3: detected capacity change from 0 to 1072693248
[ 3.547986] md: bind<sdb7>
[ 3.549323] md/raid1:md1: active with 1 out of 2 mirrors
[ 3.549348] md1: detected capacity change from 0 to 53653536768
[ 3.558920] md3: unknown partition table
[ 3.559052] md1: unknown partition table
[ 4.345217] md1: unknown partition table
[ 4.371798] EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: (null)
<snip>
[ 6.020249] EXT4-fs (md1): re-mounted. Opts: stripe=32,data=ordered
[ 6.020965] systemd[1]: Started Remount Root and Kernel File Systems.
<snip>
[ 10.110516] md0: unknown partition table
[ 10.480935] md2: unknown partition table
[ 10.531124] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: stripe=32,data=ordered
[ 10.573132] EXT4-fs (md2): mounted filesystem with ordered data mode. Opts: stripe=32,data=ordered
<snip>
[223796.599863] md: export_rdev(sda7)
[223796.658203] md: bind<sda7>
[223796.719104] disk 0, wo:1, o:1, dev:sda7
[223796.719108] disk 1, wo:0, o:1, dev:sdb7
[223796.719155] md: recovery of RAID array md1
[223796.719156] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[223796.719158] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[223796.719161] md: using 128k window, over a total of 52396032k.
--
David C. Rankin, J.D.,P.E.
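The missing bind line for sda7 in the boot log above can also be checked mechanically. A rough sketch, assuming the "md: bind<dev>" message format shown in this dmesg output (other kernel versions may log this differently or not at all); bound_members and was_bound are hypothetical helper names:

```shell
#!/bin/sh
# List the partitions named in "md: bind<...>" lines of dmesg-style text
# read on stdin, one per line.
bound_members() {
    sed -n 's/.*md: bind<\([^>]*\)>.*/\1/p'
}

# Succeed if the given partition name (e.g. sda7) appears in a bind line.
was_bound() {
    bound_members | grep -qx "$1"
}

# Usage, before re-adding the member (a later manual add logs its own bind
# line, which would match as well):
#   dmesg | was_bound sda7 || echo "sda7 was never bound"
```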