From: NeilBrown <neil@brown.name>
To: Nathan Brown <nbrown.us@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Self inflicted reshape catastrophe
Date: Tue, 19 Jan 2021 09:36:51 +1100
Message-ID: <8735yxkh30.fsf@notabene.neil.brown.name>
In-Reply-To: <CAHikZs7EKe2H5OYdxd5dwZ8WCs8fdVp-5BWku0vQ5Bb-yCstCw@mail.gmail.com>
On Sun, Jan 17 2021, Nathan Brown wrote:
> It was this part of the original post
>
>> The raid didn't automatically assemble so I did
>> `mdadm --assemble` but really screwed up and put the 5 new disks in a
>> different array
>
> Basically, I did an `mdadm --assemble /dev/md1 <new disks for md0>`
That command wouldn't have the effect you describe (the effect that is
visible in the --examine output - thanks).
Maybe you mean "--add" ???
> instead of `mdadm --assemble /dev/md0 <new disks for md0>`. Further
> complicated by the fact the md1 was missing a disk, so I let 1 of the
> 5 disks become a full member md1 since I didn't catch my error in time
> and enough recovery on md1 had occurred to wipe out any data transfer
> from the reshape on md0. The other 4 became hot spares. This wiped the
> super block on those 5 new disks, the super blocks no longer contain
> the correct information showing the original reshape attempt on md0.
>
> I have yet to dive into the code but it seems likely that I can
> manually reconstruct the appropriate super blocks for these 4 disks
> that still contain valid data as a result of the reshape with a worst
> case of ~1/5th data loss.
There will be fs-metadata loss as well as data loss, and that is the real
killer.

Yes, the data is probably still on those "spare" devices.  Probably just
the md-metadata is lost.  The data that was on sdo1 is now lost, but
RAID6 protects you from losing one device, so that doesn't matter.
To reconstruct the correct metadata, the easiest approach is probably to
copy the superblock from the best drive in md0 and use a binary-editor
to change the 'Device Role' field to an appropriate number for each
different device. Maybe your kernel logs will have enough info to
confirm which device was in each role.
One approach to copying the metadata is to use "mdadm --dump=/tmp/md0 /dev/md0"
which should create sparse files in /tmp/md0 with the metadata from each
device.
Then binary-edit those files, and rename them.  Then use
   mdadm --restore=/tmp/md0 /dev/md0
to copy the metadata back.  Maybe.
Then use "mdadm --examine --super=1.2" to check that the superblock
looks OK and to find out what the "expected" checksum is. Then edit the
superblock again to set the checksum.
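Setting the checksum by hand means writing a 32-bit little-endian value
into the image file without truncating it. A rough sketch of that, using
only printf and dd, is below. It assumes v1.2 metadata at 4096 bytes and
assumes the sb_csum field is at byte 216 of the superblock (from struct
mdp_superblock_1 in md_p.h; verify this). The helper name sb_write_le32
is hypothetical, and it should only ever be pointed at a copy of a dump
file, never at a real device.

```shell
#!/bin/sh
# Assumption: v1.2 metadata, superblock starts 4096 bytes into the image.
SB_START=4096

# Hypothetical helper: write VALUE as a 32-bit little-endian integer at
# OFFSET within the superblock stored in IMAGE. conv=notrunc keeps the
# rest of the (sparse) file intact.
sb_write_le32() {
    image=$1
    offset=$2
    value=$3
    i=0
    while [ "$i" -lt 4 ]; do
        # Extract byte i, least significant first.
        byte=$(( ($value >> (8 * i)) & 255 ))
        # Emit it as a raw byte via an octal escape.
        printf "\\$(printf '%03o' "$byte")"
        i=$((i + 1))
    done | dd of="$image" bs=1 seek=$((SB_START + offset)) conv=notrunc 2>/dev/null
}
```

With the "expected" value that --examine reports, the call would be
something like `sb_write_le32 copy-of-dump-file 216 0x<expected>`, where
both the file name and the value are placeholders. Running --examine
again afterwards should then show the stored and expected checksums
agreeing.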
Then try assembling the array with
   mdadm --assemble --freeze-reshape --readonly ....
which should minimize the damage that can be done if something isn't
right.
Then try "fsck -n" the filesystem to see if it looks OK.
Good luck
NeilBrown
Thread overview (6+ messages):
2021-01-14 22:57 Self inflicted reshape catastrophe Nathan Brown
2021-01-15 16:21 ` antlists
2021-01-18 0:49 ` NeilBrown
2021-01-18 3:12 ` Nathan Brown
2021-01-18 22:36 ` NeilBrown [this message]
2021-01-19 0:09 ` Nathan Brown