From: NeilBrown <neilb@suse.de>
To: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: Help recovering an interrupted raid0 reshape
Date: Wed, 8 Apr 2015 08:56:35 +1000
Message-ID: <20150408085635.64fa7101@notabene.brown>
In-Reply-To: <CAC_83AGk2MK8=qy2CL-WN-ewfVzgA2D-WRkefpriXrdHSEjU-Q@mail.gmail.com>
On Tue, 7 Apr 2015 15:31:32 -0700 "Jonathan Harker (Jesusaurus)"
<jesusaurus@gentlydownthe.net> wrote:
> On Tue, Apr 7, 2015 at 2:13 PM, NeilBrown <neilb@suse.de> wrote:
> > On Tue, 7 Apr 2015 10:02:13 -0700 "Jonathan Harker (Jesusaurus)"
> > <jesusaurus@gentlydownthe.net> wrote:
> >
> >> On Mon, Apr 6, 2015 at 11:30 PM, NeilBrown <neilb@suse.de> wrote:
> >> >
> >> > Try:
> >> > mdadm -S /dev/md124
> >> > mdadm -A /dev/md124 --update=revert-reshape /dev/md/alpha /dev/md/beta
> >> > mdadm -S /dev/md124
> >> > mdadm -A /dev/md124 -vvv /dev/md/alpha /dev/md/beta /dev/md/gamma
> >> >
> >> > What does that report?
> >> >
> >> > NeilBrown
> >> >
> >>
> >> # mdadm --stop /dev/md124
> >> mdadm: stopped /dev/md124
> >> # mdadm -A /dev/md124 --update=revert-reshape /dev/md/alpha /dev/md/beta
> >> mdadm: /dev/md124 assembled from 2 drives - not enough to start the array.
> >> # cat /proc/mdstat
> >> Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0]
> >> [linear] [multipath]
> >> md124 : inactive md126[0](S) md127[1](S)
> >> 3907022200 blocks super 1.2
> >>
> >> md0 : active raid1 sda5[0] sdb2[1]
> >> 107652416 blocks [2/2] [UU]
> >> bitmap: 1/1 pages [4KB], 65536KB chunk
> >>
> >> md125 : active raid1 sdh1[0] sdg1[1]
> >> 2930134016 blocks super 1.2 [2/2] [UU]
> >> bitmap: 0/22 pages [0KB], 65536KB chunk
> >>
> >> md126 : active raid1 sdc1[0] sdd1[1]
> >> 1953512312 blocks super 1.2 [2/2] [UU]
> >>
> >> md127 : active raid1 sde1[2] sdf1[1]
> >> 1953512312 blocks super 1.2 [2/2] [UU]
> >>
> >> unused devices: <none>
> >> # mdadm --stop /dev/md124
> >> mdadm: stopped /dev/md124
> >> # mdadm -A /dev/md124 -vvv /dev/md/alpha /dev/md/beta /dev/md/gamma
> >> mdadm: looking for devices for /dev/md124
> >> mdadm: UUID differs from /dev/md0.
> >> mdadm: UUID differs from /dev/md/alpha.
> >> mdadm: UUID differs from /dev/md/beta.
> >> mdadm: UUID differs from /dev/md/gamma.
> >> mdadm: UUID differs from /dev/md0.
> >> mdadm: UUID differs from /dev/md/alpha.
> >> mdadm: UUID differs from /dev/md/beta.
> >> mdadm: UUID differs from /dev/md/gamma.
> >> mdadm: UUID differs from /dev/md0.
> >> mdadm: UUID differs from /dev/md/alpha.
> >> mdadm: UUID differs from /dev/md/beta.
> >> mdadm: UUID differs from /dev/md/gamma.
> >> mdadm: /dev/md/alpha is identified as a member of /dev/md124, slot 1.
> >> mdadm: /dev/md/beta is identified as a member of /dev/md124, slot 0.
> >> mdadm: /dev/md/gamma is identified as a member of /dev/md124, slot 2.
> >> mdadm: :/dev/md124 has an active reshape - checking if critical
> >> section needs to be restored
> >> mdadm: added /dev/md/alpha to /dev/md124 as 1
> >> mdadm: added /dev/md/gamma to /dev/md124 as 2 (possibly out of date)
> >> mdadm: no uptodate device for slot 6 of /dev/md124
> >> mdadm: added /dev/md/beta to /dev/md124 as 0
> >> mdadm: /dev/md124 assembled from 2 drives - not enough to start the array.
> >> # cat /proc/mdstat
> >> Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0]
> >> [linear] [multipath]
> >> md124 : inactive md125[3](S) md127[1](S) md126[0](S)
> >> 6837155192 blocks super 1.2
> >>
> >> md0 : active raid1 sda5[0] sdb2[1]
> >> 107652416 blocks [2/2] [UU]
> >> bitmap: 0/1 pages [0KB], 65536KB chunk
> >>
> >> md125 : active raid1 sdh1[0] sdg1[1]
> >> 2930134016 blocks super 1.2 [2/2] [UU]
> >> bitmap: 0/22 pages [0KB], 65536KB chunk
> >>
> >> md126 : active raid1 sdc1[0] sdd1[1]
> >> 1953512312 blocks super 1.2 [2/2] [UU]
> >>
> >> md127 : active raid1 sde1[2] sdf1[1]
> >> 1953512312 blocks super 1.2 [2/2] [UU]
> >>
> >> unused devices: <none>
> >>
> >> # mdadm --examine /dev/md/alpha
> >> /dev/md/alpha:
> >> Magic : a92b4efc
> >> Version : 1.2
> >> Feature Map : 0x4
> >> Array UUID : 1f4979ba:c49a77c0:59e689c2:bcc21c0a
> >> Name : hordern:hordern1 (local to host hordern)
> >> Creation Time : Fri Jan 2 09:59:40 2009
> >> Raid Level : raid4
> >> Raid Devices : 4
> >>
> >> Avail Dev Size : 3907021824 (1863.01 GiB 2000.40 GB)
> >> Array Size : 5860532736 (5589.04 GiB 6001.19 GB)
> >> Data Offset : 2048 sectors
> >> Super Offset : 8 sectors
> >> Unused Space : before=1968 sectors, after=752 sectors
> >> State : active
> >> Device UUID : 63aaa2e4:2a09f495:8372c7f9:eb2f2773
> >>
> >> Reshape pos'n : 129067008 (123.09 GiB 132.16 GB)
> >> Delta Devices : 1 (3->4)
> >>
> >> Update Time : Sun Mar 29 15:11:35 2015
> >> Checksum : 8be5e0e8 - correct
> >> Events : 14013
> >>
> >> Chunk Size : 512K
> >>
> >> Device Role : Active device 1
> >> Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> # mdadm --examine /dev/md/beta
> >> /dev/md/beta:
> >> Magic : a92b4efc
> >> Version : 1.2
> >> Feature Map : 0x4
> >> Array UUID : 1f4979ba:c49a77c0:59e689c2:bcc21c0a
> >> Name : hordern:hordern1 (local to host hordern)
> >> Creation Time : Fri Jan 2 09:59:40 2009
> >> Raid Level : raid4
> >> Raid Devices : 4
> >>
> >> Avail Dev Size : 3907022576 (1863.01 GiB 2000.40 GB)
> >> Array Size : 5860532736 (5589.04 GiB 6001.19 GB)
> >> Used Dev Size : 3907021824 (1863.01 GiB 2000.40 GB)
> >> Data Offset : 2048 sectors
> >> Super Offset : 8 sectors
> >> Unused Space : before=1968 sectors, after=752 sectors
> >> State : clean
> >> Device UUID : 6e6dce14:3ebb2bb5:187aa292:403a55f6
> >>
> >> Reshape pos'n : 129067008 (123.09 GiB 132.16 GB)
> >> Delta Devices : 1 (3->4)
> >>
> >> Update Time : Sun Mar 29 15:11:35 2015
> >> Checksum : f7526adf - correct
> >> Events : 14013
> >>
> >> Chunk Size : 512K
> >>
> >> Device Role : Active device 0
> >> Array State : AA.. ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> # mdadm --examine /dev/md/gamma
> >> /dev/md/gamma:
> >> Magic : a92b4efc
> >> Version : 1.2
> >> Feature Map : 0x6
> >> Array UUID : 1f4979ba:c49a77c0:59e689c2:bcc21c0a
> >> Name : hordern:hordern1 (local to host hordern)
> >> Creation Time : Fri Jan 2 09:59:40 2009
> >> Raid Level : raid4
> >> Raid Devices : 4
> >>
> >> Avail Dev Size : 5860265984 (2794.39 GiB 3000.46 GB)
> >> Array Size : 5860532736 (5589.04 GiB 6001.19 GB)
> >> Used Dev Size : 3907021824 (1863.01 GiB 2000.40 GB)
> >> Data Offset : 2048 sectors
> >> Super Offset : 8 sectors
> >> Recovery Offset : 86403072 sectors
> >> Unused Space : before=1960 sectors, after=1953244160 sectors
> >> State : active
> >> Device UUID : 782873ea:e265ecd4:5cc80ddf:035ba2b4
> >>
> >> Reshape pos'n : 129067008 (123.09 GiB 132.16 GB)
> >> Delta Devices : 1 (3->4)
> >>
> >> Update Time : Sun Mar 29 00:05:29 2015
> >> Bad Block Log : 512 entries available at offset 72 sectors
> >> Checksum : 710dc078 - correct
> >> Events : 673
> >>
> >> Chunk Size : 512K
> >>
> >> Device Role : Active device 2
> >> Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> # mdadm --detail /dev/md124
> >> /dev/md124:
> >> Version : 1.2
> >> Raid Level : raid0
> >> Total Devices : 3
> >> Persistence : Superblock is persistent
> >>
> >> State : inactive
> >>
> >> Delta Devices : 1, (-1->0)
> >> New Level : raid4
> >> New Chunksize : 512K
> >>
> >> Name : hordern:hordern1 (local to host hordern)
> >> UUID : 1f4979ba:c49a77c0:59e689c2:bcc21c0a
> >> Events : 673
> >>
> >> Number Major Minor RaidDevice
> >>
> >> - 9 125 - /dev/md/gamma
> >> - 9 126 - /dev/md/beta
> >> - 9 127 - /dev/md/alpha
> >>
> >> So it looks like all three component devices have consistent
> >> superblocks now, awesome! But the raid0 array is still inactive with
> >> all three components listed as spares. It looks like /dev/md/gamma has
> >> a much lower event count, I'm guessing that is what causes the disk to
> >> be marked as possibly out of date.
> >>
> >> Is an "uptodate device" a specific thing, or does that simply mean
> >> that some component devices are out of date? The lack of spaces makes
> >> me think that uptodate is some keyword I'm not recognizing.
> >>
> >
> > Looks good. Nearly there.
> >
> > The difference in event counts is probably due to you trying lots of things
> > out, and them only affecting two devices.
> >
> > If you
> > # mdadm --stop /dev/md124
> > # mdadm -A --force /dev/md124 -vvv /dev/md/alpha /dev/md/beta /dev/md/gamma
> >
> > i.e. just add --force, it should ignore the difference in event count and
> > assemble the array.
> > For RAID0, the event count isn't really relevant to the data as there is no
> > possibility for inconsistency between data and parity on different devices.
> > As the reshape position is the same on all devices, I don't think there is
> > any risk at all in just using --force.
> > Of course, perform an fsck afterwards just to build confidence.
> >
> > NeilBrown
> >
>
> Unfortunately, adding --force didn't seem to make any difference:
>
> # mdadm --stop /dev/md124
> mdadm: stopped /dev/md124
> # mdadm -A --force /dev/md124 -vvv /dev/md/alpha /dev/md/beta /dev/md/gamma
> mdadm: looking for devices for /dev/md124
> mdadm: UUID differs from /dev/md0.
> mdadm: UUID differs from /dev/md/alpha.
> mdadm: UUID differs from /dev/md/beta.
> mdadm: UUID differs from /dev/md/gamma.
> mdadm: UUID differs from /dev/md0.
> mdadm: UUID differs from /dev/md/alpha.
> mdadm: UUID differs from /dev/md/beta.
> mdadm: UUID differs from /dev/md/gamma.
> mdadm: UUID differs from /dev/md0.
> mdadm: UUID differs from /dev/md/alpha.
> mdadm: UUID differs from /dev/md/beta.
> mdadm: UUID differs from /dev/md/gamma.
> mdadm: /dev/md/alpha is identified as a member of /dev/md124, slot 1.
> mdadm: /dev/md/beta is identified as a member of /dev/md124, slot 0.
> mdadm: /dev/md/gamma is identified as a member of /dev/md124, slot 2.
> mdadm: :/dev/md124 has an active reshape - checking if critical
> section needs to be restored
> mdadm: added /dev/md/alpha to /dev/md124 as 1
> mdadm: added /dev/md/gamma to /dev/md124 as 2 (possibly out of date)
> mdadm: no uptodate device for slot 6 of /dev/md124
> mdadm: added /dev/md/beta to /dev/md124 as 0
> mdadm: /dev/md124 assembled from 2 drives - not enough to start the array.
> # cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4] [raid1] [raid10] [raid0]
> [linear] [multipath]
> md124 : inactive md125[3](S) md127[1](S) md126[0](S)
> 6837155192 blocks super 1.2
>
> md0 : active raid1 sda5[0] sdb2[1]
> 107652416 blocks [2/2] [UU]
> bitmap: 0/1 pages [0KB], 65536KB chunk
>
> md125 : active raid1 sdh1[0] sdg1[1]
> 2930134016 blocks super 1.2 [2/2] [UU]
> bitmap: 0/22 pages [0KB], 65536KB chunk
>
> md126 : active raid1 sdc1[0] sdd1[1]
> 1953512312 blocks super 1.2 [2/2] [UU]
>
> md127 : active raid1 sde1[2] sdf1[1]
> 1953512312 blocks super 1.2 [2/2] [UU]
>
> unused devices: <none>
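
The event-count gap that --force is supposed to ignore can be read straight
out of the --examine output quoted earlier. A minimal sketch, assuming the
"Events : N" line format shown there; events_of is a made-up helper name and
the sample lines are pasted from that output rather than read from the
devices:

```shell
# Print the value of the "Events" field from mdadm --examine text on stdin.
# (events_of is a hypothetical helper, not part of mdadm.)
events_of() {
    awk '$1 == "Events" && $2 == ":" { print $3 }'
}

# Sample data copied from the --examine output for alpha and gamma.
alpha=$(printf 'Update Time : Sun Mar 29 15:11:35 2015\nEvents : 14013\n' | events_of)
gamma=$(printf 'Update Time : Sun Mar 29 00:05:29 2015\nEvents : 673\n' | events_of)

if [ "$alpha" -ne "$gamma" ]; then
    echo "event counts differ: alpha=$alpha gamma=$gamma"
fi
```

On the real system the helper would be fed from the device instead, e.g.
mdadm --examine /dev/md/alpha | events_of.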

Hmm... I think I see the bug.  It should be easy enough to fix, but I'd like
to be able to test it.

Could you please:

   mkdir /tmp/md.metadata
   mdadm --dump /tmp/md.metadata /dev/md/alpha /dev/md/beta /dev/md/gamma
   tar czSf /tmp/md.tgz /tmp/md.metadata

and then send me /tmp/md.tgz, which should be tiny and contain just the
metadata from the array.
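
On why the tarball should come out tiny: as far as I know, --dump writes one
sparse file per device, as large as the device but with only the metadata
blocks allocated, and the S flag lets tar store the holes compactly. A rough
sketch of that effect with a stand-in file (the temporary paths, the 64M size
and the fake superblock text are made up for the demo, and mdadm is not
involved at all):

```shell
# Stand-in for one dumped metadata file: large apparent size, almost no data.
# A real dump file would come from mdadm --dump, not from truncate.
work=$(mktemp -d)
mkdir "$work/md.metadata"
truncate -s 64M "$work/md.metadata/demo-device"

# Write a few bytes 4096 bytes in, i.e. at the 8-sector "Super Offset"
# shown in the --examine output, without truncating the file.
printf 'fake-superblock' |
    dd of="$work/md.metadata/demo-device" bs=4096 seek=1 conv=notrunc 2>/dev/null

# S asks GNU tar to handle sparse files efficiently; z gzips the result.
tar -czSf "$work/md.tgz" -C "$work" md.metadata

apparent=$(stat -c %s "$work/md.metadata/demo-device")
archive=$(stat -c %s "$work/md.tgz")
echo "apparent file size: $apparent bytes"
echo "archive size:       $archive bytes"
```

The archive size should be a very small fraction of the apparent size, which
is what makes the real md.tgz practical to send by mail.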

[[the patch which introduced the problem has a description which starts
  "This is a bit of a hack and ..."
  Never accept hacks!
]]

NeilBrown