* Missing Superblocks
[not found] <de712291-fa08-b35a-f8fb-6d18b573f3f4@aawcs.co.uk>
@ 2021-10-25 11:01 ` John Atkins
[not found] ` <CAAMCDee8fEHGMg7NBNzMq7+kbFHo-4DM0D2T=rNezpPZgKabeg@mail.gmail.com>
0 siblings, 1 reply; 6+ messages in thread
From: John Atkins @ 2021-10-25 11:01 UTC (permalink / raw)
To: linux-raid
Good-day All
I "have" a RAID6 array of 6 drives plus 1 spare. The OS got corrupted on
reboot. On a fresh install the array is not detected. Investigation shows
that out of the 6 active drives only the first one has a valid superblock;
the rest look to be corrupt. My assumption is that this might be down to
the fact that I built the RAID on raw drives (/dev/sdX) and not on
partitions (/dev/sdX1). Is this recoverable or should I load from a backup?
I assumed trying to create using --assume-clean should work, but
currently the only drive with a superblock shows as busy even though it is
not mounted.
Many Thanks
John Atkins
* Re: Missing Superblocks
[not found] ` <CAAMCDee8fEHGMg7NBNzMq7+kbFHo-4DM0D2T=rNezpPZgKabeg@mail.gmail.com>
@ 2021-10-26 9:45 ` John Atkins
2021-10-27 16:33 ` Wol
0 siblings, 1 reply; 6+ messages in thread
From: John Atkins @ 2021-10-26 9:45 UTC (permalink / raw)
To: Roger Heflin; +Cc: linux-raid
Thanks for the suggestions.
No partition ever on these disks.
I will try the dd method but as there was never a partition on the drive
I don't think that will return results.
The busy drive is not part of an active md array nor mounted so still a
bit bemused by that.
I know the order, after my first few muckups I number them to make sure
if I have to move them it will work. If I use assume clean, if it does
not work I can just try another order I assume. I do have a backup but
14T will take time to replicate.
*Regards John Atkins*
Senior Systems Support, Installation & Service Engineer
AAW Control Systems Ltd
Mobile: 07780480014
Office: 01635 248589
E-Mail: john@aawcs.co.uk
On 26/10/2021 00:16, Roger Heflin wrote:
> Raw devices or not should not matter, unless you changed in the
> middle and did not quite do all of it right.
>
> Usually when I find a missing header on disks it is because it really
> was not where it was thought to be. I have seen people put an array on
> raw disks and later partition them, which misplaces data--sometimes this
> overwrites data depending on where the header lives (a reboot exposes
> it). I have seen them re-create a partition with a different starting
> position and misplace data (the new partition was not used until
> reboot). I have also seen people label a partition and then remove
> the partition, and that also misplaces data. If you are lucky then it
> is a data misplacement from a partition table issue.
>
> If you cannot find the header you might look a bit more carefully.
> Run "dd if=devicename bs=1M count=20 | xxd -a | more" on the good
> disk and on a bad one, and see if the header begins at the same
> location or at a different location. On my disk with a partition table
> I get data in the first four 512-byte blocks, and then no data until
> the start of the actual partition/mdadm component.
>
> I have found LVM headers and other headers and reconstructed the
> partition table using the above trick (assuming there was a partition
> table).
>
> It is also possible that the header got overwritten by something.
>
> To use --assume-clean you will first need to run mdadm --stop /dev/mdXX
> on the device (to clear the busy state).
>
> But also note that you must get the order exactly right with
> --assume-clean or it won't work (wrong order means corrupted data).
> Usually it is suggested to make copies of the disks before attempting it.
* Re: Missing Superblocks
2021-10-26 9:45 ` John Atkins
@ 2021-10-27 16:33 ` Wol
2021-10-28 9:52 ` John Atkins
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Wol @ 2021-10-27 16:33 UTC (permalink / raw)
To: John Atkins, Roger Heflin; +Cc: linux-raid
On 26/10/2021 10:45, John Atkins wrote:
> Thanks for the suggestions.
> No partition ever on these disks.
BAD IDEA ... it *should* be okay, but there are too many rogue
programs/utilities out there that think stomping all over a
partition-free disk is acceptable behaviour ...
It's bad enough when a GPT or MBR gets trashed, which sadly is not
unusual in your scenario, but without partitions you're inviting
disaster... :-(
> I will try the dd method but as there was never a partition on the drive
> I don't think that will return results.
Why not? It may return traces of the array ...
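As a rough illustration of what the dd/xxd comparison is looking for, here
is a minimal Python sketch that scans the start of a device for the md
superblock magic. The magic value a92b4efc is the one mdadm prints as the
expected magic elsewhere in this thread. The function names, the 20 MiB
window (mirroring bs=1M count=20) and the 512-byte alignment filter are
assumptions made for the example, and it assumes v1.x metadata, where the
magic is stored little-endian.

```python
import struct

# md v1.x superblock magic as it appears on disk (little-endian): fc 4e 2b a9
MD_MAGIC = struct.pack("<I", 0xA92B4EFC)


def find_md_magic(data: bytes, align: int = 512) -> list:
    """Return the sector-aligned byte offsets in `data` where the magic occurs."""
    hits = []
    pos = data.find(MD_MAGIC)
    while pos != -1:
        # Ignore matches that are not on a sector boundary; they are
        # most likely coincidental byte patterns inside file data.
        if pos % align == 0:
            hits.append(pos)
        pos = data.find(MD_MAGIC, pos + 1)
    return hits


def scan_device(path: str, mib: int = 20) -> list:
    """Read the first `mib` MiB of a device (read-only) and scan it."""
    with open(path, "rb") as dev:
        return find_md_magic(dev.read(mib * 1024 * 1024))
```

Reading a whole-disk device normally needs root, for example
scan_device("/dev/sdd"). With 1.2 metadata the superblock sits 4 KiB into
the member device, so a hit at offset 4096 on the good drive and none on
the others would match what mdadm reported. A hit at some other offset
would suggest the start of the data has moved.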
> The busy drive is not part of an active md array nor mounted so still a
> bit bemused by that.
When mdadm attempts to start an array (which it does by default at
boot), if the attempt fails it usually leaves a broken inactive array in
an unusable state. You need to "kill" this mess before you can do
anything with it!
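To make the "find the mess, then kill it" step concrete, here is a small
Python sketch that picks inactive arrays out of /proc/mdstat text and
prints the matching mdadm --stop command. The sample text, the md127 name
and the function name are hypothetical. The sketch only prints the
command; it does not run it.

```python
import re


def inactive_arrays(mdstat_text: str) -> dict:
    """Map each inactive md array name to the member devices it holds."""
    found = {}
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\S+)\s*:\s*inactive\s+(.*)$", line)
        if m:
            # Members look like "sdc[0](S)"; strip the role and flags.
            members = [re.sub(r"\[\d+\].*$", "", tok) for tok in m.group(2).split()]
            found[m.group(1)] = members
    return found


# Hypothetical example of what a half-assembled array can look like.
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md127 : inactive sdc[0](S)
      1953383512 blocks super 1.2

unused devices: <none>
"""

for name, members in inactive_arrays(SAMPLE).items():
    print(f"{name} holds {members}; release with: mdadm --stop /dev/{name}")
```

On a live system the text would come from reading /proc/mdstat instead of
SAMPLE. A drive listed under an inactive array would explain a "busy"
device that is neither mounted nor part of a running array.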
> I know the order, after my first few muckups I number them to make sure
> if I have to move them it will work. If I use assume clean, if it does
> not work I can just try another order I assume. I do have a backup but
> 14T will take time to replicate.
If you haven't yet tried to force the array, and possibly corrupted
where the headers should be, you could try a plain force-assemble, which
*might* work (very long shot ...)
Otherwise, read the wiki and try with overlays until something "strikes
gold". Then I'd be inclined to fail each drive in turn, re-adding it as
a partition, to try and avoid a similar screw-up in future. That, or
disconnect all the raid drives before an upgrade, and re-connect them
afterwards - though that's been known to cause grief, too :-(
(Of course, if you've used all available space, partitioning will shrink
the raid and cause more grief elsewhere ...)
Hopefully, you've never resized the array, and the mdadm defaults
haven't changed, so you'll strike gold first attempt. Otherwise it could
be a long hard slog with all the possible options.
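To give a feel for the size of that slog, here is a Python sketch that
enumerates candidate "mdadm --create --assume-clean" command lines over
every device order. The /dev/mapper/*-overlay names are hypothetical
stand-ins for overlay devices set up as the wiki describes, and the level
and metadata values are taken from this thread. The commands are only
generated, never executed.

```python
from itertools import permutations


def candidate_create_commands(devices, md="/dev/md0", level=6, metadata="1.2"):
    """Yield one mdadm --create argument list per possible device order."""
    for order in permutations(devices):
        yield [
            "mdadm", "--create", md, "--assume-clean",
            f"--level={level}",
            f"--raid-devices={len(order)}",
            f"--metadata={metadata}",
            *order,
        ]


# Hypothetical overlay devices; never point this at the real disks.
overlays = [f"/dev/mapper/sd{letter}-overlay" for letter in "cdefgh"]

commands = list(candidate_create_commands(overlays))
print(len(commands), "candidate orders")
print(" ".join(commands[0]))
```

Six members give 720 orders, which is why knowing the order matters so
much. Any other unknown (chunk size, data offset) multiplies that number
again, so this only makes sense against overlays, followed by a read-only
check of the result.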
https://raid.wiki.kernel.org/index.php/Linux_Raid
Cheers,
Wol
* Re: Missing Superblocks
2021-10-27 16:33 ` Wol
@ 2021-10-28 9:52 ` John Atkins
2021-10-28 13:13 ` John Atkins
2021-10-28 17:17 ` Nix
2 siblings, 0 replies; 6+ messages in thread
From: John Atkins @ 2021-10-28 9:52 UTC (permalink / raw)
To: Wol, Roger Heflin; +Cc: linux-raid
On 27/10/2021 17:33, Wol wrote:
> On 26/10/2021 10:45, John Atkins wrote:
>> Thanks for the suggestions.
>> No partition ever on these disks.
>
> BAD IDEA ... it *should* be okay, but there are too many rogue
> programs/utilities out there that think stomping all over a
> partition-free disk is acceptable behaviour ...
>
> It's bad enough when a GPT or MBR gets trashed, which sadly is not
> unusual in your scenario, but without partitions you're inviting
> disaster... :-(
Ah, confirmation of what I read in a 2017 post *sigh*. Naivety on my
part, thinking that without partitions there was less to go wrong.
>
>> I will try the dd method but as there was never a partition on the
>> drive I don't think that will return results.
>
> Why not? it may return traces of the array ...
I thought this was to look for partition headers; I was assuming wrong
again. I will try this.
>
>> The busy drive is not part of an active md array nor mounted so still
>> a bit bemused by that.
>
> When mdadm attempts to start an array (which it does by default at
> boot), if the attempt fails it usually leaves a broken inactive array
> in an unusable state. You need to "kill" this mess before you can do
> anything with it!
Aha, that explains that. I will kill what I can find.
>
>> I know the order, after my first few muckups I number them to make
>> sure if I have to move them it will work. If I use assume clean, if
>> it does not work I can just try another order I assume. I do have a
>> backup but 14T will take time to replicate.
>
> If you haven't yet tried to force the array, and possibly corrupted
> where the headers should be, you could try a plain force-assemble,
> which *might* work (very long shot ...)
I will give this a go. I was under the assumption that, as there is only
one drive with a superblock, it would not work. I will check the wiki
again to explore this and how to specify the drives.
>
> Otherwise, read the wiki and try with overlays until something
> "strikes gold". Then I'd be inclined to fail each drive in turn,
> re-adding it as a partition, to try and avoid a similar screw-up in
> future.
Yes, from now on it is partitions, not raw drives, haha.
> That, or disconnect all the raid drives before an upgrade, and
> re-connect them afterwards - though that's been known to cause grief,
> too :-(
It was just a machine reboot, not even an upgrade, so I am a bit lost as
to what caused the OS to scramble itself and fail to boot.
>
> (Of course, if you've used all available space, partitioning will
> shrink the raid and cause more grief elsewhere ...)
Luckily not yet
>
> Hopefully, you've never resized the array, and the mdadm defaults
> haven't changed, so you'll strike gold first attempt. Otherwise it
> could be a long hard slog with all the possible options.
>
> https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> Cheers,
> Wol
> .
Thank you very much for the advice!
John
* Re: Missing Superblocks
2021-10-27 16:33 ` Wol
2021-10-28 9:52 ` John Atkins
@ 2021-10-28 13:13 ` John Atkins
2021-10-28 17:17 ` Nix
2 siblings, 0 replies; 6+ messages in thread
From: John Atkins @ 2021-10-28 13:13 UTC (permalink / raw)
To: Wol, Roger Heflin; +Cc: linux-raid
Apologies I feel like I am being extraordinarily thick!
I am trying to --assemble --force. I have tried listing the devices in
the correct order after /dev/md0, to which I get:

sudo mdadm --assemble --force /dev/md0 /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
mdadm: /dev/sdd, is an invalid name for an md device - ignored.
mdadm: /dev/sde, is an invalid name for an md device - ignored.
mdadm: /dev/sdf, is an invalid name for an md device - ignored.
mdadm: /dev/sdg, is an invalid name for an md device - ignored.
mdadm: /dev/sdh is an invalid name for an md device - ignored.
mdadm: No super block found on /dev/sdd (Expected magic a92b4efc, got 00000000)
mdadm: no RAID superblock on /dev/sdd
mdadm: /dev/sdd has no superblock - assembly aborted

I have also tried listing the devices in mdadm.conf, again with no luck:

ARRAY /dev/md0 metadata=1.2 level=6 name=nas:0 devices=/dev/sdc, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh

(NB: both with spaces as above and without spaces after the commas.)

To which I get:

sudo mdadm --assemble --force /dev/md0
mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.

What am I missing?
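One possible reading of the "invalid name for an md device" lines, offered
as a sketch and not a diagnosis: mdadm.conf is documented as a sequence of
whitespace-separated words, so a space after each comma in devices= would
split the list into separate words. The Python below mimics that splitting
on the ARRAY line quoted above. The function name is hypothetical, and the
assumption that every leftover word gets treated as an array name is
inferred from the messages, not checked against the mdadm source.

```python
def stray_words(array_line: str) -> list:
    """Return words after 'ARRAY <md device>' that are not key=value pairs."""
    words = array_line.split()  # split on whitespace, as the config format does
    return [w for w in words[2:] if "=" not in w]


SPACED = ("ARRAY /dev/md0 metadata=1.2 level=6 name=nas:0 "
          "devices=/dev/sdc, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh")
COMPACT = ("ARRAY /dev/md0 metadata=1.2 level=6 name=nas:0 "
           "devices=/dev/sdc,/dev/sdd,/dev/sde,/dev/sdf,/dev/sdg,/dev/sdh")

print(stray_words(SPACED))
print(stray_words(COMPACT))
```

The five leftover words from the spaced form line up with the five
"ignored" messages, trailing commas included. Even with the compact form,
--assemble still relies on each member carrying a superblock, which fits
the "assembled from 1 drive" result.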
* Re: Missing Superblocks
2021-10-27 16:33 ` Wol
2021-10-28 9:52 ` John Atkins
2021-10-28 13:13 ` John Atkins
@ 2021-10-28 17:17 ` Nix
2 siblings, 0 replies; 6+ messages in thread
From: Nix @ 2021-10-28 17:17 UTC (permalink / raw)
To: Wol; +Cc: John Atkins, Roger Heflin, linux-raid
On 27 Oct 2021, Wol uttered the following:
> On 26/10/2021 10:45, John Atkins wrote:
>> Thanks for the suggestions.
>> No partition ever on these disks.
>
> BAD IDEA ... it *should* be okay, but there are too many rogue programs/utilities out there that think stomping all over a
> partition-free disk is acceptable behaviour ...
There are even some BIOSes (or, rather, UEFI firmwares) that think this
is just fine. Without notice, of course, and often when you do nothing
more than reboot.
> It's bad enough when a GPT or MBR gets trashed, which sadly is not unusual in your scenario, but without partitions you're inviting
> disaster... :-(
Quite. I moved away from raw disk usage long ago: the cost/benefit
tradeoff is just not worth it.
--
NULL && (void)