Re: Kernel crash if both devices in raid1 are failing

linux-btrfs.vger.kernel.org archive mirror

From: Dmitry Katsubo <dmitry.katsubo@gmail.com>
To: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Kernel crash if both devices in raid1 are failing
Date: Tue, 19 Apr 2016 07:45:40 +0200
Message-ID: <5715C604.2070200@gmail.com>
In-Reply-To: <CAJCQCtSDUVDVZ=JfkhOhwn-OTM3iA=cNXc9w=zLcsEatA4xU5Q@mail.gmail.com>

On 2016-04-18 02:19, Chris Murphy wrote:
> With two device failure on raid1 volume, the file system is actually
> broken. There's a big hole in the metadata, not just missing data,
> because there are only two copies of metadata, distributed across
> three drives.

Thanks, I understand that. However, the drive has not completely failed;
it has occasional read/write errors. I still wonder what went wrong
and why the kernel crashed. I think this should not happen, as it
prevents me from working with the data that can still be read.
I am happy to provide more information if it would help.
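
To make the quoted point about "two copies, distributed across three
drives" concrete, here is a minimal sketch. It assumes every chunk is
stored on exactly one pair of devices and that all pairs are used, which
is a simplification and not the real btrfs allocator. The device names
and the lost_chunks helper are hypothetical.

```python
from itertools import combinations


def lost_chunks(devices, failed):
    """Return the device pairs whose chunks have no surviving copy.

    Simplified model: each raid1 chunk has exactly two copies, placed on
    one pair of devices. A chunk is lost only if both devices of its
    pair are in the failed set.
    """
    failed = set(failed)
    return [pair for pair in combinations(devices, 2) if set(pair) <= failed]


devices = ["sda", "sdb", "sdc"]  # hypothetical device names
all_pairs = list(combinations(devices, 2))
lost = lost_chunks(devices, failed=["sda", "sdb"])
print(f"{len(lost)} of {len(all_pairs)} pair placements lose both copies: {lost}")
```

In this model a single failed device never loses a chunk, while two
fully failed devices lose every chunk that was placed on that pair.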

> btrfs restore might be able to scrape off some files, but I don't
> expect it'll get very far. If there were n-way raid1, where every
> drive has a complete copy of 100% of the filesystem metadata, what you
> suggest would be possible.

Actually, btrfs restore has recovered many files. However, I was not
able to run it in fully unattended mode, as it complains about "looping
a lot". Does this mean that the files are corrupted or not correctly
restored?

> OK probably the worst thing you can do if you're trying to recover
> data from a degraded volume where a 2nd device is also having
> problems, is to mount it rw let alone write anything to it. *shrug*
> That's just going to make things much worse and more difficult to
> recover, assuming anything can be recovered at all. The least number
> of changes you make to such a volume, the better.

Another option I have thought about is to shrink the failing volume
down to some small size. This will cause chunks to be moved to another
location. How will btrfs behave if both copies cannot be read?
It would be nice to have a strategy to recover without "btrfs restore"
in such a case. I ask because "btrfs restore" requires pausing
normal system operation to copy data back and forth.
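
The case asked about here, where both copies of a block are unreadable,
can be illustrated with a small sketch. It only classifies blocks by how
many mirror copies are still readable. It does not model what btrfs
actually does during a shrink or balance, and the block numbers and the
classify_blocks helper are made up for illustration.

```python
def classify_blocks(bad_on_a, bad_on_b, total_blocks):
    """Group block numbers by how many of the two mirror copies are readable.

    bad_on_a / bad_on_b: block numbers that fail to read on each device.
    """
    bad_a, bad_b = set(bad_on_a), set(bad_on_b)
    result = {"both_ok": [], "one_ok": [], "none_ok": []}
    for block in range(total_blocks):
        # Count unreadable copies: 0, 1 or 2.
        bad_copies = (block in bad_a) + (block in bad_b)
        key = ("both_ok", "one_ok", "none_ok")[bad_copies]
        result[key].append(block)
    return result


# Hypothetical error pattern on two partially failing devices.
report = classify_blocks(bad_on_a=[1, 4, 7], bad_on_b=[4, 5], total_blocks=8)
print(report)
```

Only the blocks in "none_ok" are the ones the question is about; blocks
in "one_ok" still have a good copy that could in principle be moved.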

-- 
With best regards,
Dmitry
