* How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
@ 2015-08-28 9:22 David C. Rankin
2015-08-28 9:42 ` David C. Rankin
2015-08-28 9:52 ` Robin Hill
0 siblings, 2 replies; 5+ messages in thread
From: David C. Rankin @ 2015-08-28 9:22 UTC (permalink / raw)
To: mdraid
All,
I had a disc-controller failure on a server running several raid1 arrays. The
disks are fine, but I have had the root partition come up in degraded mode. What
is the best way to tell mdraid to resync the disks? Here are the symptoms:
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb7[1]
52396032 blocks super 1.2 [2/1] [_U]
md3 : active raid1 sdb6[1] sda6[0]
1047552 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda8[0] sdb8[1]
922944192 blocks super 1.2 [2/2] [UU]
bitmap: 0/7 pages [0KB], 65536KB chunk
md0 : active raid1 sda5[0] sdb5[1]
204608 blocks super 1.2 [2/2] [UU]
unused devices: <none>
# mdadm --misc --detail /dev/md1
/dev/md1:
Version : 1.2
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Array Size : 52396032 (49.97 GiB 53.65 GB)
Used Dev Size : 52396032 (49.97 GiB 53.65 GB)
Raid Devices : 2
Total Devices : 1
Persistence : Superblock is persistent
Update Time : Fri Aug 28 04:12:18 2015
State : clean, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Name : archiso:1
UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Events : 100308
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 23 1 active sync /dev/sdb7
Reading, it looks like one approach is to boot the install media, zero the
superblock on /dev/sda7, and then reboot. Will that force a rebuild, or do I
need to fail and remove the disk first? I was thinking:
# mdadm --zero-superblock /dev/sda7
should set it up for a rebuild without more. Is this a sane approach?
--
David C. Rankin, J.D.,P.E.
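The degraded state reported above shows up in /proc/mdstat as an underscore in the member map ([_U] on md1). A minimal shell sketch for picking such arrays out of that text follows. The function name degraded_arrays is hypothetical, and the sketch assumes the layout shown in this message: an "mdN : ..." line followed by a "blocks" status line that ends with the member map.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): read /proc/mdstat-formatted text on
# stdin and print the name of every array whose member map contains "_".
degraded_arrays() {
    awk '
        # Array header line, e.g. "md1 : active raid1 sdb7[1]"
        $2 == ":" && $1 ~ /^md/ { name = $1 }
        # Status line, e.g. "52396032 blocks super 1.2 [2/1] [_U]"
        /blocks/ && $NF ~ /^\[[U_]+\]$/ {
            if ($NF ~ /_/) print name
        }
    '
}
```

On a live system this would be called as `degraded_arrays < /proc/mdstat`; for the listing in this message it should report md1 only.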
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
2015-08-28 9:22 How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild? David C. Rankin
@ 2015-08-28 9:42 ` David C. Rankin
2015-08-28 9:54 ` Mikael Abrahamsson
2015-08-28 9:52 ` Robin Hill
1 sibling, 1 reply; 5+ messages in thread
From: David C. Rankin @ 2015-08-28 9:42 UTC (permalink / raw)
To: mdraid
On 08/28/2015 04:22 AM, David C. Rankin wrote:
> All,
>
> I had a disc-controller failure on a server running several raid1 arrays. The
> disks are fine, but I have had the root partition come up in degraded mode. What
> is the best way to tell mdraid to resync the disks? Here are the symptoms:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 sdb7[1]
> 52396032 blocks super 1.2 [2/1] [_U]
>
> md3 : active raid1 sdb6[1] sda6[0]
> 1047552 blocks super 1.2 [2/2] [UU]
>
> md2 : active raid1 sda8[0] sdb8[1]
> 922944192 blocks super 1.2 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 65536KB chunk
>
> md0 : active raid1 sda5[0] sdb5[1]
> 204608 blocks super 1.2 [2/2] [UU]
>
> unused devices: <none>
>
> # mdadm --misc --detail /dev/md1
> /dev/md1:
> Version : 1.2
> Creation Time : Wed Nov 27 04:35:49 2013
> Raid Level : raid1
> Array Size : 52396032 (49.97 GiB 53.65 GB)
> Used Dev Size : 52396032 (49.97 GiB 53.65 GB)
> Raid Devices : 2
> Total Devices : 1
> Persistence : Superblock is persistent
>
> Update Time : Fri Aug 28 04:12:18 2015
> State : clean, degraded
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 0
> Spare Devices : 0
>
> Name : archiso:1
> UUID : 320d86f7:22999af5:5eeefee1:35cd8970
> Events : 100308
>
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 1 8 23 1 active sync /dev/sdb7
>
> Reading, it looks like one approach is the boot the install media and then zero
> the superblock on /dev/sda7 and then reboot. Will that force a rebuild, or do I
> need to fail and remove the disk first? I was thinking:
>
> # mdadm --zero-superblock /dev/sda7
>
> should set it up for a rebuild without more. Is this a sane approach?
>

This adds a bit more of the picture. It's like sda7 doesn't even know it was
kicked out. There are no disk errors logged for either of the drives:

# mdadm -E /dev/sd[ab]7
/dev/sda7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x8
Array UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Name : archiso:1
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 104792064 (49.97 GiB 53.65 GB)
Array Size : 52396032 (49.97 GiB 53.65 GB)
Data Offset : 65536 sectors
Super Offset : 8 sectors
Unused Space : before=65448 sectors, after=0 sectors
State : active
Device UUID : f5a48ea1:bce2f6f0:f47f9c0b:bad1d64d
Update Time : Sat Aug 8 17:17:21 2015
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : 2c45bcef - correct
Events : 280
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb7:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x8
Array UUID : 320d86f7:22999af5:5eeefee1:35cd8970
Name : archiso:1
Creation Time : Wed Nov 27 04:35:49 2013
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 104792064 (49.97 GiB 53.65 GB)
Array Size : 52396032 (49.97 GiB 53.65 GB)
Data Offset : 65536 sectors
Super Offset : 8 sectors
Unused Space : before=65448 sectors, after=0 sectors
State : clean
Device UUID : 66e069cc:02daa93e:1d4a6eea:e5c21cb7
Update Time : Fri Aug 28 04:35:31 2015
Bad Block Log : 512 entries available at offset 72 sectors - bad blocks present.
Checksum : ed07de3b - correct
Events : 100584
Device Role : Active device 1
Array State : .A ('A' == active, '.' == missing, 'R' == replacing)

Do I try a --re-add on sda7 or just zero it for a complete rebuild? Any help
appreciated.
--
David C. Rankin, J.D.,P.E.
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
2015-08-28 9:42 ` David C. Rankin
@ 2015-08-28 9:54 ` Mikael Abrahamsson
0 siblings, 0 replies; 5+ messages in thread
From: Mikael Abrahamsson @ 2015-08-28 9:54 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Fri, 28 Aug 2015, David C. Rankin wrote:
> Do I try a --re-add on sda7 or just zero it for a complete rebuild? Any
> help appreciated.

Since sda7 has a much lower event count, it doesn't really matter. You do not
have a bitmap enabled and so since the event counts are off, a complete resync
will need to happen either way.

If --re-add doesn't work, use --add. If that doesn't work, zero superblock and
--add.
--
Mikael Abrahamsson email: swmike@swm.pp.se
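The escalation described in this reply (compare event counts, then try --re-add, then --add, then zero the superblock and --add) can be sketched in shell as below. This is only an illustration under stated assumptions: the helper names events_of and rejoin_member and the MDADM variable are hypothetical, and the default is a dry run that echoes the mdadm commands rather than executing them. The mdadm options themselves (--re-add, --add, --zero-superblock) are the ones named in the thread.

```shell
#!/bin/sh
# events_of (hypothetical helper): read `mdadm -E <device>` output on stdin
# and print the value of the first "Events : N" line.
events_of() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Dry run by default: commands are only echoed. Set MDADM=mdadm to act for
# real. WARNING: --zero-superblock destroys the md metadata on the device.
: "${MDADM:=echo mdadm}"

# rejoin_member ARRAY DEVICE (hypothetical helper): try each step in the
# order given in the reply and stop at the first one that succeeds.
rejoin_member() {
    array=$1
    dev=$2
    $MDADM "$array" --re-add "$dev" ||
    $MDADM "$array" --add "$dev" ||
    { $MDADM --zero-superblock "$dev" && $MDADM "$array" --add "$dev"; }
}
```

For example, `mdadm -E /dev/sda7 | events_of` and `mdadm -E /dev/sdb7 | events_of` would print the two counters to compare (280 and 100584 in the output quoted in this thread), and `rejoin_member /dev/md1 /dev/sda7` in dry-run mode prints only the first command, because echo always succeeds.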
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
2015-08-28 9:22 How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild? David C. Rankin
2015-08-28 9:42 ` David C. Rankin
@ 2015-08-28 9:52 ` Robin Hill
2015-08-28 13:16 ` David C. Rankin
1 sibling, 1 reply; 5+ messages in thread
From: Robin Hill @ 2015-08-28 9:52 UTC (permalink / raw)
To: David C. Rankin; +Cc: mdraid
On Fri Aug 28, 2015 at 04:22:09am -0500, David C. Rankin wrote:
> All,
>
> I had a disc-controller failure on a server running several raid1
> arrays. The disks are fine, but I have had the root partition come up
> in degraded mode. What is the best way to tell mdraid to resync the
> disks? Here are the symptoms:
>
> # cat /proc/mdstat
> Personalities : [raid1]
> md1 : active raid1 sdb7[1]
> 52396032 blocks super 1.2 [2/1] [_U]
>
> md3 : active raid1 sdb6[1] sda6[0]
> 1047552 blocks super 1.2 [2/2] [UU]
>
> md2 : active raid1 sda8[0] sdb8[1]
> 922944192 blocks super 1.2 [2/2] [UU]
> bitmap: 0/7 pages [0KB], 65536KB chunk
>
> md0 : active raid1 sda5[0] sdb5[1]
> 204608 blocks super 1.2 [2/2] [UU]
>
> unused devices: <none>
>
<- snip ->
> Reading, it looks like one approach is the boot the install media and
> then zero the superblock on /dev/sda7 and then reboot. Will that force
> a rebuild, or do I need to fail and remove the disk first? I was thinking:
>
> # mdadm --zero-superblock /dev/sda7
>
> should set it up for a rebuild without more. Is this a sane approach?
>
No need to over-complicate things. The only issue you have looks to be
that sda7 has not come up as part of md1, so just add it back in:
mdadm /dev/md1 -a /dev/sda7

You probably want to check dmesg, etc. to see why it didn't get added in
at all in the first place (I'd have expected it to be at least in as a
spare).

Cheers,
Robin
--
     ___
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
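The suggestion to check dmesg for why sda7 was never added can be partly automated. The sketch below lists the devices md bound at boot and tests whether a given one is among them. It assumes the kernel log uses the "md: bind<dev>" message format seen elsewhere in this thread, which other kernel versions may not print; the helper names bound_members and was_bound are hypothetical.

```shell
#!/bin/sh
# bound_members (hypothetical helper): read kernel log text on stdin and
# print each device named in an "md: bind<dev>" message, in log order.
bound_members() {
    sed -n 's/.*md: bind<\([^>]*\)>.*/\1/p'
}

# was_bound DEV (hypothetical helper): succeed if DEV appears in a bind
# message on stdin, fail otherwise.
was_bound() {
    bound_members | grep -qxF "$1"
}
```

A possible use is `dmesg | was_bound sda7 || echo "sda7 was never bound"`, which only tells you that the bind never happened, not why.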
* Re: How best to re-sync raid1 array? zero superblock on removed disk and let it rebuild?
2015-08-28 9:52 ` Robin Hill
@ 2015-08-28 13:16 ` David C. Rankin
0 siblings, 0 replies; 5+ messages in thread
From: David C. Rankin @ 2015-08-28 13:16 UTC (permalink / raw)
To: mdraid
On 08/28/2015 04:52 AM, Robin Hill wrote:
> On Fri Aug 28, 2015 at 04:22:09am -0500, David C. Rankin wrote:
>
>> Reading, it looks like one approach is the boot the install media and
>> then zero the superblock on /dev/sda7 and then reboot. Will that force
>> a rebuild, or do I need to fail and remove the disk first? I was thinking:
>>
>> # mdadm --zero-superblock /dev/sda7
>>
>> should set it up for a rebuild without more. Is this a sane approach?
>>
> No need to over-complicate things. The only issue you have looks to be
> that sda7 has not come up as part of md1, so just add it back in:
> mdadm /dev/md1 -a /dev/sda7
>
> You probably want to check dmesg, etc. to see why it didn't get added in
> at all in the first place (I'd have expected it to be at least in as a
> spare).
>
> Cheers,
> Robin
>

Hah! You guys are great!

[07:57 phoinix:.../david/dev] # mdadm /dev/md1 -a /dev/sda7
mdadm: added /dev/sda7
[07:58 phoinix:.../david/dev] # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda7[2] sdb7[1]
52396032 blocks super 1.2 [2/1] [_U]
[=>...................] recovery = 6.1% (3246848/52396032) finish=9.7min speed=83527K/sec

I have no clue what happened? I have never seen this before. There was never
an attempt to md: bind<sda7> (just like it didn't exist). The only thing
different about this boot (aside from the new highpoint raid controller) was
the fact I left the Arch install CD in the CD/DVD drive. I don't know if in the
early boot process, prior to the handoff to the highpoint controller, it
somehow may have grabbed sda?? (the CD drive is still on the onboard ATA
controller, while all SATA drives are attached to the highpoint controller).
Let me know if you see anything that makes any more sense?

And... during the time it took to compose this reply:

Personalities : [raid1]
md1 : active raid1 sda7[2] sdb7[1]
52396032 blocks super 1.2 [2/2] [UU]
md3 : active raid1 sdb6[1] sda6[0]
1047552 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda8[0] sdb8[1]
922944192 blocks super 1.2 [2/2] [UU]
bitmap: 0/7 pages [0KB], 65536KB chunk
md0 : active raid1 sda5[0] sdb5[1]
204608 blocks super 1.2 [2/2] [UU]
unused devices: <none>

Whoop! I've gathered the relevant dmesg output. Maybe you can help make sense
out of it. It just looks like the system never tried to activate sda7...

[    3.261932] sd 3:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    3.261948] sd 2:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    3.261988] sd 2:0:0:0: [sda] Write Protect is off
[    3.261990] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    3.262010] sd 3:0:0:0: [sdb] Write Protect is off
[    3.262012] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    3.262018] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    3.262034] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    3.298037] sda: sda1 < sda5 sda6 sda7 sda8 >
[    3.298469] sd 2:0:0:0: [sda] Attached SCSI disk
[    3.317052] sdb: sdb1 < sdb5 sdb6 sdb7 sdb8 >
[    3.317456] sd 3:0:0:0: [sdb] Attached SCSI disk
[    3.420533] md: bind<sdb5>
[    3.421641] md: bind<sdb8>
[    3.423385] md: bind<sda8>
[    3.425987] md: raid1 personality registered for level 1
[    3.426035] md: bind<sda5>
[    3.426204] md/raid1:md2: active with 2 out of 2 mirrors
[    3.426322] created bitmap (7 pages) for device md2
[    3.426614] md2: bitmap initialized from disk: read 1 pages, set 0 of 14084 bits
[    3.427474] md/raid1:md0: active with 2 out of 2 mirrors
[    3.427496] md0: detected capacity change from 0 to 209518592
[    3.469789] md0: unknown partition table
[    3.543932] md2: detected capacity change from 0 to 945094852608
[    3.544373] md: bind<sda6>
[    3.545918] md: bind<sdb6>
[    3.546646] md2: unknown partition table
[    3.547402] md/raid1:md3: active with 2 out of 2 mirrors
[    3.547428] md3: detected capacity change from 0 to 1072693248
[    3.547986] md: bind<sdb7>
[    3.549323] md/raid1:md1: active with 1 out of 2 mirrors
[    3.549348] md1: detected capacity change from 0 to 53653536768
[    3.558920] md3: unknown partition table
[    3.559052] md1: unknown partition table
[    4.345217] md1: unknown partition table
[    4.371798] EXT4-fs (md1): mounted filesystem with ordered data mode. Opts: (null)
<snip>
[    6.020249] EXT4-fs (md1): re-mounted. Opts: stripe=32,data=ordered
[    6.020965] systemd[1]: Started Remount Root and Kernel File Systems.
<snip>
[   10.110516] md0: unknown partition table
[   10.480935] md2: unknown partition table
[   10.531124] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: stripe=32,data=ordered
[   10.573132] EXT4-fs (md2): mounted filesystem with ordered data mode. Opts: stripe=32,data=ordered
<snip>
[223796.599863] md: export_rdev(sda7)
[223796.658203] md: bind<sda7>
[223796.719104] disk 0, wo:1, o:1, dev:sda7
[223796.719108] disk 1, wo:0, o:1, dev:sdb7
[223796.719155] md: recovery of RAID array md1
[223796.719156] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[223796.719158] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[223796.719161] md: using 128k window, over a total of 52396032k.
--
David C. Rankin, J.D.,P.E.
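The rebuild progress quoted in this message ("recovery = 6.1%") can be read out of /proc/mdstat mechanically. A minimal sketch follows, assuming the progress line format shown above, where the percentage is a separate whitespace-delimited field ending in "%". The function name recovery_progress is hypothetical.

```shell
#!/bin/sh
# recovery_progress (hypothetical helper): read /proc/mdstat-formatted text
# on stdin and print "<array> <percent>" for each array that is rebuilding.
recovery_progress() {
    awk '
        # Array header line, e.g. "md1 : active raid1 sda7[2] sdb7[1]"
        $2 == ":" && $1 ~ /^md/ { name = $1 }
        # Progress line, e.g. "[=>...] recovery = 6.1% (3246848/52396032) ..."
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { sub(/%$/, "", $i); print name, $i }
        }
    '
}
```

Run as `recovery_progress < /proc/mdstat`, this prints nothing once every array is back to [UU], which matches the second listing in this message.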