Recovering Partial Data From Re-Added Drive

Linux RAID subsystem development

* Recovering Partial Data From Re-Added Drive
@ 2018-01-23 17:16 Liwei
  2018-01-23 22:46 ` Andreas Klauer
  0 siblings, 1 reply; 4+ messages in thread
From: Liwei @ 2018-01-23 17:16 UTC (permalink / raw)
  To: linux-raid

Hi list,
    This is a very odd question and I'm just grasping at straws here...
========
TLDR version:
    1. RAID6 had 1 missing drive, running degraded
    2. 1 more drive dropped out due to glitch (drive seems fine)
    3. A few hours later, 1 more drive had headcrash, destroyed filesystem
    4. If I was lucky (i.e. no important writes occurred during those
hours in-between), it may have been possible to re-assemble raid with
glitched-drive in place of headcrash-drive
    5. However, accidentally re-added glitched-drive instead
    6. How to proceed?
========
    I have a RAID6 running degraded (12 out of 13 drives). Similar to
an email I previously sent to this list, it was in the process of
being migrated to a larger set of disks - thus I decided not to order
a replacement for the drive that died.

    This week, the most unfortunate thing happened: I woke up to find
the server in a boot loop, and upon checking, it appears that the
filesystem is no longer mountable. After a few emails with the btrfs
people, it appears that a very critical section of the FS, the root
tree, is gone, and with it my files (which is also what caused the
boot loop).

    Apparently, two things happened overnight:
    1. a drive glitch caused one drive to drop off the raid, causing
the raid to become unprotected. (based on an email from mdadm telling
me a drive has failed)
    2. a few hours later, a head crash or something similar happened
and I suddenly had 1455 pending sectors. (based on an email from
smartmontools telling me I have Current_Pending_Sector errors)

    I can't say whether any significant write happened between the
first and second events, but on the pretty good chance that there was
none (since it was night time, and only the migration was running),
the sectors in question should still be consistent with the drive that
glitched out.

    Thereafter, I imaged the drive with pending sectors, sans said
sectors, and placed it back in the array, to run the btrfs checks.
When that didn't work out, I absent-mindedly decided to re-add the
drive that glitched out and the raid started to re-sync things. Took
me a few minutes to realise that was a bad idea, so I stopped the
array and pulled all drives out. I think it only managed to sync the
initial few GBs before I stopped it.

    So the question is, how do I proceed from here? I realised what I
should have done was to stop the array and reassemble it sans the bad
drive, and we might already have had our data back. But now
that I have re-added the drive, can I still do something similar,
maybe manually?

Warm regards,
Liwei

* Re: Recovering Partial Data From Re-Added Drive
  2018-01-23 17:16 Recovering Partial Data From Re-Added Drive Liwei
@ 2018-01-23 22:46 ` Andreas Klauer
  2018-01-24  7:34   ` Liwei
  0 siblings, 1 reply; 4+ messages in thread
From: Andreas Klauer @ 2018-01-23 22:46 UTC (permalink / raw)
  To: Liwei; +Cc: linux-raid

On Wed, Jan 24, 2018 at 01:16:43AM +0800, Liwei wrote:
> I have a RAID6 running degraded (12 out of 13 drives).
[...]
> thus I decided not to order a replacement for the drive that died.

A gamble that kicked you straight into Murphy's lawnmower.

> I imaged the drive with pending sectors

Do you have the ddrescue log/map to go with that?
If you did not use ddrescue - what did you use exactly?

If you know what the bad sectors were you can try to fill those gaps 
with data from the other drives if it wasn't synced over.

If you still have the drive and the sectors are still bad, you can 
produce the map belatedly by copying it again... if you wiped it and 
sectors were reallocated, no such luck.
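
A rough sketch of pulling the unread regions out of a ddrescue mapfile, so you know which byte ranges of the image are gaps. It assumes the usual mapfile layout (comment lines starting with '#', one current-position line, then "pos size status" lines where '+' means rescued). The function name bad_regions and the file name sdX.map are made up for illustration.

```shell
# List every region of a ddrescue mapfile that was NOT rescued.
# Prints: <start byte> <length in bytes> <status char>, numbers in decimal.
bad_regions() (
    seen_status_line=0
    while read -r pos size status; do
        case "$pos" in
            ''|'#'*) continue ;;          # skip blank and comment lines
        esac
        if [ "$seen_status_line" -eq 0 ]; then
            seen_status_line=1            # first data line is the current-position line
            continue
        fi
        [ "$status" = "+" ] && continue   # '+' = successfully copied
        printf '%d %d %s\n' "$(($pos))" "$(($size))" "$status"
    done < "$1"
)

# Usage (hypothetical file name):
#   bad_regions sdX.map
```

Each printed range is a candidate for being rebuilt from the other members, provided the resync has not already overwritten that area.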

> When that didn't work out, I absent-mindedly decided to re-add the
> drive that glitched out and the raid started to re-sync things.
[...]
> I think it only managed to sync the initial few GBs before I stopped it.

Do we know where the bad sectors were located, 
and where the metadata btrfs needs is located?

If either is at the start of the device, then it's probably gone.

> I realised what I should have done

Add a drive the moment it was degraded. (not order and wait to ship. 
go out yourself and buy one same day. pilfer one if you must.)

Also replace drives before the array is degraded if SMART shows a bad 
sector. And run regular SMART selftests to be able to detect those.

And once you're in a data recovery situation, stop writing altogether.
That means no assemble, no add, no fsck, no mount, nothing.
Create copies or use snapshots/overlays.

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

As long as you use only the overlays, you can experiment without worry, 
unless there is still faulty hardware that should be replaced first.
Don't use overlays on drives that are about to go bad. ddrescue those.
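
Following the approach on that wiki page, an overlay is a device-mapper snapshot whose copy-on-write store is a sparse file, so all writes land in the file and the real drive stays untouched. A minimal sketch, assuming root, a member partition such as /dev/sdb1 and enough free space for the sparse file. The function names, the 4G size and the overlay-... naming are placeholders, not anything taken from your setup.

```shell
# Build the device-mapper table line for a persistent snapshot of $2,
# with copy-on-write data going to $3; $1 is the origin size in 512-byte sectors.
overlay_table() {
    printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Create /dev/mapper/overlay-<name> on top of one member device (needs root).
make_overlay() (
    dev=$1                                   # e.g. /dev/sdb1
    name=$(basename "$dev")
    cow="overlay-$name.cow"                  # sparse file that absorbs all writes
    truncate -s 4G "$cow"
    loop=$(losetup -f --show "$cow")         # attach the sparse file to a loop device
    sectors=$(blockdev --getsz "$dev")       # device size in 512-byte sectors
    overlay_table "$sectors" "$dev" "$loop" | dmsetup create "overlay-$name"
)

# Usage (hypothetical device): make_overlay /dev/sdb1
# To throw an experiment away: dmsetup remove overlay-sdb1, detach the loop
# device with losetup -d, delete the .cow file, and create the overlay again.
```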

> But now that I have re-added the drive, can I still do something similar,
> maybe manually?

You can try that (with overlays).

Also, it's possible for the device role to have changed when you added it, 
as you had two free slots and adding would make it pick one of them...

If you have old examine info or system logs, it would be good to verify 
that first. If the role changed, you'd have a role conflict within a 
single drive and no matter what you do with it, it won't be right anymore.
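
To compare roles, something like the filter below pulls the relevant lines out of saved or live mdadm --examine output. It is only a sketch: the field labels are the ones mdadm prints for 1.x metadata, and the function name and device glob are invented here.

```shell
# Keep only the --examine fields that matter when checking whether a
# member's slot or state changed. Reads --examine output on stdin.
examine_roles() {
    grep -E '^[[:space:]]*(Update Time|Events|Device Role|Array State|Recovery Offset) :'
}

# Usage (hypothetical device names):
#   for d in /dev/sd[a-m]1; do echo "== $d"; mdadm --examine "$d" | examine_roles; done
```

Run the same filter over any old --examine text you find in mail or logs and compare the Device Role lines drive by drive.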

In the end there is no surefire way to fix this, it's just trial 
and error and it comes down to luck whether you'll be able to make btrfs 
happy again.

Good luck,
Andreas Klauer

* Re: Recovering Partial Data From Re-Added Drive
  2018-01-23 22:46 ` Andreas Klauer
@ 2018-01-24  7:34   ` Liwei
  2018-01-24 10:00     ` Andreas Klauer
  0 siblings, 1 reply; 4+ messages in thread
From: Liwei @ 2018-01-24  7:34 UTC (permalink / raw)
  To: Andreas Klauer; +Cc: linux-raid

Apologies, I sent this with the wrong mail client; resending.

Hi Andreas,
Replies inline...

On 24 January 2018 at 06:46, Andreas Klauer
<Andreas.Klauer@metamorpher.de> wrote:
> On Wed, Jan 24, 2018 at 01:16:43AM +0800, Liwei wrote:
>> I have a RAID6 running degraded (12 out of 13 drives).
> [...]
>> thus I decided not to order a replacement for the drive that died.
>
> A gamble that kicked you straight into Murphy's lawnmower.
>

Indeed and I've learnt my lesson!

>> I imaged the drive with pending sectors
>
> Do you have the ddrescue log/map to go with that?
> If you did not use ddrescue - what did you use exactly?

Yes, ddrescue, and I did use a log.

>
> If you know what the bad sectors were you can try to fill those gaps
> with data from the other drives if it wasn't synced over.

That's what I'm hoping to do.

>
> If you still have the drive and the sectors are still bad, you can
> produce the map belatedly by copying it again... if you wiped it and
> sectors were reallocated, no such luck.
>
>> When that didn't work out, I absent-mindedly decided to re-add the
>> drive that glitched out and the raid started to re-sync things.
> [...]
>> I think it only managed to sync the initial few GBs before I stopped it.
>
> Do we know where the bad sectors were located,
> and where the metadata btrfs needs is located?

Yes, ddrescue produced the bad sector list, and the btrfs superblock
has the starting byte number where it is expecting to read the root
tree. Using both pieces of information, my guess is that I need at
least 27 of those sectors, towards the end of the drive. Pretty sure
they're still there.
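
For reference, translating a byte offset on the md array into an offset on a member is plain chunk arithmetic for RAID5/6. A sketch, assuming the offset on the md device is already known (a btrfs logical address still has to go through the chunk tree first). The chunk size, the data offset and the function name are placeholders, and which member holds the chunk depends on the layout, which is not computed here.

```shell
# member_offset ARRAY_BYTE_OFFSET CHUNK_BYTES DATA_DISKS DATA_OFFSET_BYTES
# Prints the byte offset within a member device at which that array byte
# sits (the same offset applies whichever member holds the chunk).
member_offset() (
    array_off=$1; chunk=$2; data_disks=$3; data_offset=$4
    stripe=$(( array_off / (chunk * data_disks) ))   # full stripes before this one
    in_chunk=$(( array_off % chunk ))                # position inside the chunk
    echo $(( data_offset + stripe * chunk + in_chunk ))
)

# Usage: a 13-drive RAID6 has 11 data chunks per stripe; the 512 KiB chunk
# and 1 MiB data offset are assumed values, read the real ones from
# mdadm --examine.
#   member_offset 123456789012 524288 11 1048576
```

The result can then be checked against the bad ranges in the ddrescue map to see whether a needed sector is one of the gaps.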

>
> If either is at the start of the device, then it's probably gone.
>
>> I realised what I should have done
>
> Add a drive the moment it was degraded. (not order and wait to ship.
> go out yourself and buy one same day. pilfer one if you must.)

Sigh, we were about 2 days away from the migration completing... I
thought. A hard lesson learned.

>
> Also replace drives before the array is degraded if SMART shows a bad
> sector. And run regular SMART selftests to be able to detect those.
>
> And once you're in a data recovery situation, stop writing altogether.
> That means no assemble, no add, no fsck, no mount, nothing.
> Create copies or use snapshots/overlays.
>
> https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

Thanks for the link!

>
> As long as you use only the overlays, you can experiment without worry,
> unless there is still faulty hardware that should be replaced first.
> Don't use overlays on drives that are about to go bad. ddrescue those.
>
>> But now that I have re-added the drive, can I still do something similar,
>> maybe manually?
>
> You can try that (with overlays).
>
> Also, it's possible for the device role to have changed when you added it,
> as you had two free slots and adding would make it pick one of them...
>
> If you have old examine info or system logs, it would be good to verify
> that first. If the role changed, you'd have a role conflict within a
> single drive and no matter what you do with it, it won't be right anymore.

I've checked the history mdadm sends to my email, and it seems like
the re-added drive did not change roles.

However, how do I get mdadm to accept the re-added drive without
trying to sync? Right now, every time I reassemble the raid using the
re-added drive, it refuses to start because there are insufficient
devices (the re-syncing/re-added drive is not counted as active). Do I
have to manually edit the drive metadata? If so, what do I need to be
careful of?

>
> In the end there is no surefire way to fix this, it's just trial
> and error and it comes down to luck whether you'll be able to make btrfs
> happy again.
>
> Good luck,
> Andreas Klauer

* Re: Recovering Partial Data From Re-Added Drive
  2018-01-24  7:34   ` Liwei
@ 2018-01-24 10:00     ` Andreas Klauer
  0 siblings, 0 replies; 4+ messages in thread
From: Andreas Klauer @ 2018-01-24 10:00 UTC (permalink / raw)
  To: Liwei; +Cc: linux-raid

On Wed, Jan 24, 2018 at 03:34:31PM +0800, Liwei wrote:
> However, how do I get mdadm to accept the re-added drive without
> trying to sync?

There is --freeze-reshape as well as --action=frozen, or you could 
also go to /sys/block/mdX/md/ and set the sync speeds to near zero. 
But with overlays, a sync won't cause damage other than filling up 
your overlay backing device. If something goes wrong on top of 
the overlays, you just stop everything, reset the overlays and 
start over with a new experiment.
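
The sysfs route could look roughly like this. It is a sketch only: the helper names are invented, md42 is just an example array name, and the array has to be assembled (on overlays) before these files exist.

```shell
# Both helpers take the array's sysfs directory, e.g. /sys/block/md42/md.

# Ask md to hold off any resync/recovery.
freeze_sync() {
    echo frozen > "$1/sync_action"
}

# Cap the resync speed at $2 KiB/s. The minimum is lowered as well so
# that it does not exceed the cap.
throttle_sync() {
    echo "$2" > "$1/sync_speed_min"
    echo "$2" > "$1/sync_speed_max"
}

# Usage (as root, hypothetical array name):
#   freeze_sync /sys/block/md42/md
#   throttle_sync /sys/block/md42/md 1
```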

On overlays, you can also play with mdadm --create --assume-clean 
(which you should never do on bare drives, as it's easy to get 
the RAID settings and drive order wrong) so you can re-create 
using only the drives you want (use 'missing' otherwise).

Recreating with mdadm usually looks something like:

    mdadm --create /dev/md42 --assume-clean \
    --level=5 --chunk=512 --metadata=1.2 --data-offset=2048 \
    --raid-devices=3 missing /dev/mapper/overlay-sd{y,z}1

i.e. you have to specify everything, in particular level, 
chunksize, offsets, metadata version, layout, drive order, ... 
according to your existing RAID. Compare mdadm --examine 
before / after the re-create to see if properties match 
(creation time, uuid etc. change, chunksize/layout/offset 
/ device role etc. has to stay the same).
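
One way to do that comparison is to save the --examine output of each overlay before and after the re-create and check only the fields that must not change. A minimal sketch, with made-up function and file names, assuming the field labels mdadm prints for 1.x metadata:

```shell
# Extract the fields that must survive a re-create unchanged.
geometry_fields() {
    grep -E '^[[:space:]]*(Raid Level|Raid Devices|Chunk Size|Layout|Data Offset|Device Role) :' "$1" \
        | sed 's/^[[:space:]]*//'
}

# compare_examine BEFORE_FILE AFTER_FILE
# Returns 0 silently when the geometry matches; otherwise prints both
# field sets and returns 1.
compare_examine() (
    before=$(geometry_fields "$1")
    after=$(geometry_fields "$2")
    [ "$before" = "$after" ] && exit 0
    printf '%s\n' "--- before" "$before" "--- after" "$after"
    exit 1
)

# Usage (hypothetical file names):
#   mdadm --examine /dev/mapper/overlay-sdy1 > before-sdy1.txt
#   ... re-create on the overlays ...
#   mdadm --examine /dev/mapper/overlay-sdy1 > after-sdy1.txt
#   compare_examine before-sdy1.txt after-sdy1.txt
```

Creation time, UUID and event counts are deliberately left out of the comparison, since those are expected to differ.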

One thing of note is that a sync action doesn't just affect the 
current progress of the sync, but the entire drive - the sync 
progress can only be at one place at a time, but the RAID has to 
stay in sync as a whole as well, so - if there are any writes 
during a sync, they also immediately go to the resyncing drive.

It's different during a grow reshape - new drive only used in 
the region that has already been reshaped onto it, everything 
else still mapped to the old drives - but a resync affects 
the whole drive immediately, if there are any writes.

Regards
Andreas Klauer
