linux-btrfs.vger.kernel.org archive mirror
From: Chris Murphy <lists@colorremedies.com>
To: Glenn Trigg <ggtrigg@gmail.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: help request for an unmountable raid1 filesystem
Date: Thu, 28 Mar 2019 20:27:34 -0600
Message-ID: <CAJCQCtQ3YFovgt4tFNQ+uLE81BiD0KFqPdvzE7Rxu+3wujWp2A@mail.gmail.com>
In-Reply-To: <CAG3pWAmvekNpnBsQTaMrefKe6pvMSZjPm+SOxnKkMkpGO_1msQ@mail.gmail.com>

On Sat, Mar 9, 2019 at 2:36 PM Glenn Trigg <ggtrigg@gmail.com> wrote:

> I had some random machine freezing events which I suspected was due to
> issues with a raid1 filesystem and kernel module crashes.

Hard to say with the available information. It's more likely hardware
related, and on top of that there's on-disk corruption.


This:

> % mount -r /dev/sda1 /data
> mount: /data: can't read superblock on /dev/sda1.

and this:

> % btrfs rescue super-recover /dev/sda1
> All supers are valid, no need to recover

These seem to be in conflict. I don't really understand how the kernel
can complain about a bad super while the user space tools say they're
all OK. What happens if you try:

# mount -o ro,nologreplay,usebackuproot

If that doesn't work, include the kernel messages again, and also
include the output from:

# btrfs insp dump-s -fa /dev/sda1
# btrfs insp dump-s -fa /dev/sdb1
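
To keep the requested output together, the commands above can be
generated from one place. This is only a sketch: it builds and prints
the command lines and does not run anything. The device and mount
point in the mount line (/dev/sda1, /data) are assumptions taken from
the quoted mount attempt, and diagnostic_commands is a hypothetical
helper name. "btrfs insp dump-s -fa" is written out in its long form,
"btrfs inspect-internal dump-super -f -a".

```python
import shlex


def diagnostic_commands(devices, mountpoint):
    """Build (but do not run) the mount attempt and the superblock dumps."""
    cmds = [
        # Read-only mount, skipping log replay and trying backup tree roots.
        # Using the first device here is an assumption for illustration.
        ["mount", "-o", "ro,nologreplay,usebackuproot", devices[0], mountpoint],
        # Kernel messages produced by the mount attempt.
        ["dmesg"],
    ]
    # One full superblock dump (-f), covering all copies (-a), per device.
    for dev in devices:
        cmds.append(["btrfs", "inspect-internal", "dump-super", "-f", "-a", dev])
    return cmds


if __name__ == "__main__":
    for argv in diagnostic_commands(["/dev/sda1", "/dev/sdb1"], "/data"):
        print("#", shlex.join(argv))
```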



>
> and dmesg says:
>
> [15944.017629] BTRFS info (device sda1): disk space caching is enabled
> [15944.017632] BTRFS info (device sda1): has skinny extents
> [15944.024480] BTRFS info (device sda1): bdev /dev/sda1 errs: wr 0, rd
> 0, flush 0, corrupt 1, gen 0
> [15944.024487] BTRFS info (device sda1): bdev /dev/sdb1 errs: wr 0, rd
> 0, flush 0, corrupt 4, gen 0
> [15944.029292] BTRFS error (device sda1): parent transid verify failed
> on 628168376320 wanted 37601 found 37700
> [15944.029466] BTRFS error (device sda1): parent transid verify failed
> on 628168376320 wanted 37601 found 37700

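That's usually bad: the parent node expects the block at 628168376320
to be from generation 37601, but the block found there is from
generation 37700.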


> Other system information is:
> % uname -a
> Linux izen 4.18.0-16-generic #17-Ubuntu SMP Fri Feb 8 00:06:57 UTC
> 2019 x86_64 x86_64 x86_64 GNU/Linux

It looks like extent tree corruption, so I don't think a newer kernel
will help; but I'd try one anyway in the meantime, until a developer
gets around to responding. Distro-specific kernels tend to be supported
by that distribution, whereas upstream lists tend to support mainline.
So I suggest 5.0.4 or 4.19.32, or you can be brave and download this
and image it to a USB stick (dd if=file of=/dev/ bs=1M oflag=direct),
which of course will erase everything on the stick.

https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20190327.n.0/compose/Everything/x86_64/iso/Fedora-Everything-netinst-x86_64-Rawhide-20190327.n.0.iso

That might have 5.1-rc2 on it, or something in between rc1 and rc2.
You're still going to try to mount it read-only per the command above,
so even if it blows up it's not going to make this worse.


> % btrfs check /dev/sda1
> Checking filesystem on /dev/sda1
> UUID: d5e50511-3e31-4de6-ba37-c5841895be9f
> checking extents
> parent transid verify failed on 628168343552 wanted 28163 found 37700
> parent transid verify failed on 628168343552 wanted 28163 found 37700
> parent transid verify failed on 628168343552 wanted 28163 found 37700
> parent transid verify failed on 628168343552 wanted 28163 found 37700

The transids are really far apart, so something definitely went really
wrong. It could be hardware, or both hardware and a btrfs bug. That it
affected *both* copies is a little weird unless it's memory corruption
related, and then a lot of things can go wrong.
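
To put a number on "far apart", the wanted and found generations can
be pulled out of the quoted messages and subtracted. A minimal sketch:
the regex assumes each message sits on one unwrapped line without the
"> " quote prefix, and transid_gaps is a hypothetical helper name.

```python
import re

# Message format as it appears in the quoted dmesg and btrfs check output.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(log_text):
    """Map each block address to (wanted, found, found - wanted)."""
    gaps = {}
    for match in TRANSID_RE.finditer(log_text):
        bytenr, wanted, found = (int(g) for g in match.groups())
        # Repeated messages for the same block collapse into one entry.
        gaps[bytenr] = (wanted, found, found - wanted)
    return gaps


# Lines copied from the quoted output, joined back onto single lines.
SAMPLE = """\
BTRFS error (device sda1): parent transid verify failed on 628168376320 wanted 37601 found 37700
parent transid verify failed on 628168343552 wanted 28163 found 37700
parent transid verify failed on 628168343552 wanted 28163 found 37700
"""

if __name__ == "__main__":
    for bytenr, (wanted, found, gap) in sorted(transid_gaps(SAMPLE).items()):
        print(f"block {bytenr}: wanted {wanted}, found {found}, gap {gap}")
```

For the two blocks in the quoted output this gives a gap of 99
generations from dmesg and 9537 generations from btrfs check.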


>
> Where do I go from here?

If it can't be mounted, then the only chance is `btrfs-find-root` and
`btrfs restore` to try to scrape out whatever data you need that
isn't already backed up. The priority, before trying to repair it, is
to get anything important off, because a repair attempt has a good
chance of causing permanent data loss. The latest user space tools are
definitely recommended for repair; the kernel version doesn't matter
so much.
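
As a rough sketch of that order of operations, assuming the -D
(dry run), -v (verbose) and -t (tree root bytenr) options of btrfs
restore: list what is recoverable first, then copy it out, and only
think about repair afterwards. Nothing below is executed; it only
prints the command lines. The destination /mnt/rescue and the helper
restore_commands are hypothetical, and any tree root value would have
to come from btrfs-find-root output for the device.

```python
import shlex


def restore_commands(device, dest, tree_root=None):
    """Build (but do not run) a dry-run and a real btrfs restore command."""
    base = ["btrfs", "restore", "-v"]
    if tree_root is not None:
        # Use an alternate tree root, e.g. one reported by btrfs-find-root.
        base += ["-t", str(tree_root)]
    # -D only lists what would be recovered and writes nothing.
    dry_run = base + ["-D", device, dest]
    real = base + [device, dest]
    return [dry_run, real]


if __name__ == "__main__":
    # /mnt/rescue is a placeholder for a directory on a different, healthy disk.
    for argv in restore_commands("/dev/sda1", "/mnt/rescue"):
        print("#", shlex.join(argv))
```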


-- 
Chris Murphy
