From: Matthew Dawson <matthew@mjdsystems.ca>
To: Kai Krakow <hurikhan77+btrfs@gmail.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Help recovering filesystem (if possible)
Date: Wed, 17 Nov 2021 21:57:40 -0500
Message-ID: <3321185.LZWGnKmheA@cwmtaff>
In-Reply-To: <CAMthOuMxvff2d0THhKWCpErQFumrJA9vmNqS6vtBNDwUwf3j-w@mail.gmail.com>
On Monday, November 15, 2021 5:46:43 A.M. EST Kai Krakow wrote:
> On Mon, 15 Nov 2021 at 02:55, Matthew Dawson <matthew@mjdsystems.ca> wrote:
> > I recently upgraded one of my machines to the 5.15.2 kernel. On the first
> > reboot, I had a kernel fault during initialization (I didn't get to
> > capture the printed stack trace, but I'm 99% sure it did not have BTRFS
> > related calls). I then rebooted the machine back to a 5.14 kernel, but the
> > BCache (writeback) cache was corrupted. I then force started the
> > underlying disks, but now my BTRFS filesystem will no longer mount. I
> > realize there may be missing/corrupted data, but ideally I would like to
> > get any data I can off the disks.
>
> I had a similar issue lately where the system didn't reboot cleanly
> (there's some issue in the BIOS or with the SSD firmware where it
> would disconnect the SSD from SATA a few seconds after boot, forcing
> bcache into detaching dirty caches).
>
> Since you are seeing transaction IDs lagging behind expectations, I
> think you've lost dirty writeback data from bcache. To fix this in the
> future, you should use bcache only in writearound or writethrough
> mode.
Considering I started the bcache devices without the cache, I don't doubt I've
lost writeback data, and I expect there will be issues. At this point I'm just
doing data recovery, trying to get what I can.
>
> > This system involves ten 8TB disks; some are doing BCache -> LUKS -> BTRFS,
> > some are doing LUKS -> BTRFS.
>
> Not LUKS here, and all my btrfs pool members are attached to a single
> SSD as caching frontend.
>
> > When I try to mount the filesystem, I get the following in dmesg:
> > [117632.798339] BTRFS info (device dm-0): flagging fs with big metadata feature
> > [117632.798344] BTRFS info (device dm-0): disk space caching is enabled
> > [117632.798346] BTRFS info (device dm-0): has skinny extents
> > [117632.873186] BTRFS error (device dm-0): parent transid verify failed on 132806584614912 wanted 3240123 found 3240119
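The "wanted" and "found" numbers in the error above are transaction IDs, so their difference shows how far the block on disk lags behind what its parent expects. A minimal sketch of pulling that out of a dmesg excerpt, assuming the message format shown here (the helper name `transid_gap` is made up for illustration):

```python
import re

# Matches the "parent transid verify failed" message as it appears above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)

def transid_gap(log_text):
    """Return (bytenr, wanted, found, gap) for the first match, or None."""
    # dmesg output pasted into mail is often re-wrapped, so collapse
    # all whitespace before matching.
    flat = " ".join(log_text.split())
    m = TRANSID_RE.search(flat)
    if m is None:
        return None
    bytenr, wanted, found = (int(g) for g in m.groups())
    return bytenr, wanted, found, wanted - found

line = ("BTRFS error (device dm-0): parent transid verify failed on\n"
        "132806584614912 wanted 3240123 found 3240119")
print(transid_gap(line))  # (132806584614912, 3240123, 3240119, 4)
```

Here the block found on disk is 4 transactions older than the one its parent points to.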
>
> I had luck with the following steps:
>
> * ensure that all members are attached to bcache as they should
> * ensure bcache is running in writearound mode for each member
> * ensure that btrfs did scan for all members
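The second check in the list above can be scripted. This is only a sketch: it assumes the usual bcache sysfs layout, where `/sys/block/<bcacheN>/bcache/cache_mode` lists all modes and marks the active one in square brackets. The function names are illustrative, not part of any tool.

```python
from pathlib import Path

def active_cache_mode(sysfs_text):
    """Return the bracketed (active) mode from a cache_mode string."""
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None

def read_cache_mode(bcache_name):
    # Assumed path layout: /sys/block/<bcacheN>/bcache/cache_mode
    path = Path("/sys/block") / bcache_name / "bcache" / "cache_mode"
    return active_cache_mode(path.read_text())

# Example with a literal string, so it runs without a bcache device:
print(active_cache_mode("writethrough writeback [writearound] none"))
```

On a real system, `read_cache_mode("bcache0")` would be called once per member; `bcache0` is a placeholder device name.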
>
> Next, I ran `btrfs check` on each member disk; eventually one
> contained the needed disk structures and showed only a few errors.
>
> I was then able to mount btrfs through that device node, open ctree
> didn't fail this time. I don't remember if I used "usebackuproot" for
> mount or a similar switch for "btrfs check".
>
> I then ran `btrfs scrub` which fixed the broken metadata. Luckily, I
> had only metadata corruption on the disks which had dirty writeback
> cleared, and metadata runs in RAID-1 mode for me.
>
> "btrfs check" then didn't find any errors. Reboot worked fine.
Thanks for the suggestion. Unfortunately, all my disks report basically the
same errors, so I wasn't able to recover my system this way.
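For anyone repeating the per-member survey described above, a rough sketch follows. It runs `btrfs check --readonly` on each device and sorts by the number of stderr lines. Using stderr line count as a proxy for "fewer errors" is an assumption made here, not something btrfs defines, and the device paths in the usage note are placeholders.

```python
import subprocess

def check_command(device):
    # --readonly makes btrfs check inspect the device without writing to it
    return ["btrfs", "check", "--readonly", device]

def survey(devices, run=subprocess.run):
    """Check each device read-only.

    Returns a list of (device, returncode, stderr_line_count),
    sorted so the device with the fewest stderr lines comes first.
    """
    results = []
    for dev in devices:
        proc = run(check_command(dev), capture_output=True, text=True)
        results.append((dev, proc.returncode, len(proc.stderr.splitlines())))
    # Rough heuristic (assumption): less stderr output means fewer errors.
    return sorted(results, key=lambda r: r[2])
```

Usage would look like `survey(["/dev/mapper/luks-a", "/dev/mapper/luks-b"])`, run as root. If every member reports about the same count, as in this case, the ranking does not help.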
>
> [...]
>
> > Is there any hope in recovering this data? Or should I give up on it at
> > this point and reformat? Most of the data is backed up (or are backups
> > themselves), but I'd like to get what I can.
>
> Well, I'm doing daily backups with borg - to a different technology
> (no btrfs, no bcache, different system). I don't think backing up
> btrfs to btrfs is a brilliant idea, especially not when both are
> mounted to the same system.
I'm not quite that redundant, but the backups of things I really care about
are actually to an off-site system. But accessing data through a backup can be
painful compared to hopefully just getting it out. Also the local backups on
the system would be nice to have, for historical purposes.
>
> You may try my steps above. If you've found a member device which
> shows fewer errors, you COULD try to repair it if mount still fails
> (or try one of the recovery mount options). But you may want to ask
> the experts again here.
I did try, thanks. Unfortunately as noted above it wasn't helpful.
Hopefully someone has a different idea? I am posting here because I suspect
any further progress will require more dangerous options, and those usually
say to ask the mailing list first.
>
> Depending on how much dirty writeback you've lost in bcache, chances
> may be good that one of the members has enough metadata to
> successfully mount or repair the filesystem. Or at least, it's a good
> start for "btrfs restore" then.
>
> What do we learn from this?
>
> * probably do not use bcache in writeback mode if you can avoid it
> * switch bcache to writearound mode before kernel upgrades, wait for
> writeback to finish
> * success mounting btrfs may depend a lot on which member device you
> actually mount
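The second lesson above (leave writeback mode and wait for the flush before a kernel upgrade) can be turned into a pre-reboot check. The sketch below assumes the bcache sysfs files `cache_mode` (active mode in square brackets) and `state` (reporting `clean` once no dirty data remains, or `no cache` when nothing is attached). `safe_to_upgrade` is a hypothetical helper name.

```python
def safe_to_upgrade(cache_mode_text, state_text):
    """True if the device is not in writeback mode and holds no dirty data.

    cache_mode_text: contents of .../bcache/cache_mode
    state_text:      contents of .../bcache/state
    """
    active = None
    for word in cache_mode_text.split():
        if word.startswith("[") and word.endswith("]"):
            active = word[1:-1]
    not_writeback = active in ("writearound", "writethrough", "none")
    # "clean" means writeback has finished; "no cache" means nothing is attached.
    flushed = state_text.strip() in ("clean", "no cache")
    return not_writeback and flushed

print(safe_to_upgrade("writethrough writeback [writearound] none", "dirty\n"))
print(safe_to_upgrade("writethrough writeback [writearound] none", "clean\n"))
```

The first call reports that the device is not yet safe because dirty data is still being written back; the second reports that it is.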
Thanks,
--
Matthew