public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Lu Pi <lp.contact2@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: fails to boot with "BTRFS critical (device sda2): corrupt leaf: ..."
Date: Sat, 8 Aug 2020 19:45:54 +0800
Message-ID: <e194deb8-4126-0ac6-becd-890939c99275@gmx.com>
In-Reply-To: <CAEYyJDyMkOBjhhVFbX_CCG0bnWC1i7OGLvPj8tFhntgxYjkRGg@mail.gmail.com>





On 2020/8/8 5:57 PM, Lu Pi wrote:
> Hi,
> 
> I have a system that fails to boot with a "BTRFS critical (device
> sda2): corrupt leaf: " error and drops to an "initramfs" shell.
> 
> I did backup /home with 'btrfs restore'. There were a few errors,
> though only on cache files (Google Chrome cache files).
> 
> Now considering 'btrfs check --repair'.
> 
> I'm contacting you as recommended here:
>    https://btrfs.wiki.kernel.org/index.php/Tree-checker
>    "Please report to btrfs mail list <linux-btrfs@vger.kernel.org> first."
>    "Please *NOT* use btrfs check --repair until instructed by a developer."
> 
> 
> Can you advise?
> 
> 
> 
> 
> BACKGROUND
> 
> - the system is Linux Mint 17
> 
> - a week ago or so, after a kernel update, the system was remounting
> read-only after about 1 minute after boot. Downgrading the kernel
> solved the issue.
>   - 4.15.0-112-generic brought the issue
>   - 4.15.0-107-generic was OK
> 
> - a few days ago, something else happened, though I'm unsure, as I'm
> not the user of the system. Possibly any of these,
>   - another kernel update (now I can see that 4.15.0-112 is back)

The kernel update is the direct cause: we added a lot of extra
self-checks so that we detect problems before they crash the kernel.

The root cause is some even older kernel, which wrote some
uninitialized data to disk.

>   - maybe the system was shut down by cutting electricity (?)
>   - could it be also that the SSD drive is failing (?)
>   - or?
>   - though as a result the system fails to boot and the drive is not mountable.
> 
> 
> 
> SYSTEM INFORMATION
> ---
> When reporting errors or asking for support always supply the output
> of the following commands:
>   uname -a
>   btrfs --version
>   btrfs fi show
>   btrfs fi df /home # Replace /home with the mount point of your
> btrfs-filesystem
>   dmesg > dmesg.log
> ---
> 
> See below, and dmesg log enclosed
> 
> 
> ---
> (initramfs) uname -a
> Linux (none) 4.15.0-112-generic #113~16.04.1-Ubuntu SMP Fri Jul 10
> 04:37:08 UTC 2020 x86_64 GNU/Linux

This is a little old, considering how many enhancements and bug fixes
are in recent kernel releases.

Thus it's recommended to move to a newer kernel, *after* your problem
has been fixed.

> 
> 
> (initramfs) btrfs --version
> btrfs-progs v4.4

This btrfs-progs is too old to even detect the problem, let alone fix it.
It would simply report the fs as healthy, provided there are no other
problems.

> 
> 
> (initramfs) btrfs fi show
> Label: none  uuid: f813bbe2-0bff-4923-822b-d3f6d6ebbb9e
>     Total devices 1 FS bytes used 55.22GiB
>     devid    1 size 107.98GiB used 85.02GiB path /dev/sda2
> 
> 
> (initramfs) btrfs fi df /home
> ERROR: can't access '/home': No such file or directory
> 
> (initramfs) btrfs fi df /
> ERROR: not a btrfs filesystem: /
> 
> 
> 
> (initramfs) mount -t btrfs /dev/sda2 /mnt/sda/
> [70391.973518] BTRFS critical (device sda2): corrupt leaf:
> block=353828864 slot=148 extent bytenr=242073600 len=16384 invalid
> generation, have 9367487224930631680 expect (0, 458036]

The generation is garbage: it's 0x8200000000000000, just some random
number that was never really initialized.

This is a pretty old bug in older kernels.

It's the recently added extra self-checks that now detect it.

It can be detected by "btrfs check" in btrfs-progs after v5.4.1, but
"--repair" cannot fix it yet. (We hadn't had a real-world report before
this one.)
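
To make the numbers above concrete, here is a minimal Python sketch that
converts the reported generation to hex and applies the same range the
message prints, "expect (0, 458036]". The helper generation_is_valid()
is a hypothetical name used only for illustration; it is not the
kernel's actual code.

```python
# Values copied from the "corrupt leaf" dmesg line quoted above.
have = 9367487224930631680
expect_max = 458036  # upper bound of the "expect (0, 458036]" range


def generation_is_valid(gen: int, max_gen: int) -> bool:
    """Illustrative range check: valid only if 0 < gen <= max_gen."""
    return 0 < gen <= max_gen


print(hex(have))                              # 0x8200000000000000
print(generation_is_valid(have, expect_max))  # False
```

Only the top byte (0x82) is set, which is far outside any plausible
generation for this filesystem.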


There is a way to work around this: locate the offending extent and
delete it manually.

First, you need to mount the fs with an older kernel.

Then run the following command (you may need the latest btrfs-progs):
# btrfs ins logical-resolve 242073600 <mnt>

Here 242073600 is the "extent bytenr" from the dmesg output.
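
If you would rather not copy the number by hand, the following Python
sketch pulls the "extent bytenr" field out of the critical line and
prints the matching command. The function name extent_bytenr() is made
up for this example, and <mnt> is left as a placeholder for the mount
point; the script only prints the command and does not run it.

```python
import re

# The critical line from the dmesg output, joined onto one line.
line = (
    "[70391.973518] BTRFS critical (device sda2): corrupt leaf: "
    "block=353828864 slot=148 extent bytenr=242073600 len=16384 invalid "
    "generation, have 9367487224930631680 expect (0, 458036]"
)


def extent_bytenr(msg: str):
    """Return the 'extent bytenr' value from a corrupt-leaf message, or None."""
    m = re.search(r"extent bytenr=(\d+)", msg)
    return int(m.group(1)) if m else None


bytenr = extent_bytenr(line)
# <mnt> is a placeholder for the mount point, exactly as in the command above.
print(f"btrfs ins logical-resolve {bytenr} <mnt>")
```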

There are two possible output patterns:
- The path of the offending file
  Then just delete it.

- No such file or directory
  This means it's a tree block, which is going to be a little tricky.
  You need to use btrfs ins again:
  # btrfs ins dump-tree -t 402653184 <device>

  Then search for "EXTENT_DATA" items like this:
        item 6 key (257 EXTENT_DATA 0) itemoff 15813 itemsize 53
                generation 3 type 1 (regular)
                extent data disk byte 138424320 nr 1048576
                                      ^^^^^^^^^
  Then pass that "138424320" to the logical-resolve command again, and
  remove all the offending files.
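
For a long dump, picking out the "disk byte" values by eye is error
prone. The following is a rough Python sketch, assuming the dump-tree
output has been saved as text in the format shown above; disk_bytenrs()
is a hypothetical helper name, and the script only prints candidate
commands, leaving <mnt> as a placeholder.

```python
import re

# Sample taken from the dump-tree excerpt above.
sample = """\
        item 6 key (257 EXTENT_DATA 0) itemoff 15813 itemsize 53
                generation 3 type 1 (regular)
                extent data disk byte 138424320 nr 1048576
"""


def disk_bytenrs(dump: str):
    """Collect unique, non-zero 'disk byte' values in order of appearance."""
    found = []
    for m in re.finditer(r"extent data disk byte (\d+) nr \d+", dump):
        value = int(m.group(1))
        # A disk byte of 0 denotes a hole, so there is nothing to resolve.
        if value != 0 and value not in found:
            found.append(value)
    return found


for value in disk_bytenrs(sample):
    # Print the command for review; nothing is executed here.
    print(f"btrfs ins logical-resolve {value} <mnt>")
```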

I'll work on adding this repair ability to btrfs check soon. Until
then, please use the above workaround.

Sorry for the inconvenience and thanks for the first real world report.

Thanks,
Qu

> [70391.975504] BTRFS: error (device sda2) in __btrfs_free_extent:7000:
> errno=-5 IO failure
> [70391.977490] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:3083: errno=-5 IO failure
> [70391.979490] BTRFS: error (device sda2) in btrfs_replay_log:2369:
> errno=-5 IO failure (Failed to recover log tree)
> [70391.980588] BTRFS error (device sda2): pending csums is 475136
> [70392.023935] BTRFS error (device sda2): open_ctree failed
> mount: mounting /dev/sda2 on /mnt/sda/ failed: Input/output error
> 
> 
> 
> (initramfs) dmesg |grep 70391
> [70391.723717] BTRFS info (device sda2): disk space caching is enabled
> [70391.723721] BTRFS info (device sda2): has skinny extents
> [70391.763253] BTRFS info (device sda2): enabling ssd optimizations
> [70391.763256] BTRFS info (device sda2): start tree-log replay
> [70391.973518] BTRFS critical (device sda2): corrupt leaf:
> block=353828864 slot=148 extent bytenr=242073600 len=16384 invalid
> generation, have 9367487224930631680 expect (0, 458036]
> [70391.975504] BTRFS: error (device sda2) in __btrfs_free_extent:7000:
> errno=-5 IO failure
> [70391.977490] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:3083: errno=-5 IO failure
> [70391.979490] BTRFS: error (device sda2) in btrfs_replay_log:2369:
> errno=-5 IO failure (Failed to recover log tree)
> [70391.980588] BTRFS error (device sda2): pending csums is 475136
> 
> 
> full dmesg log enclosed.
> ---
> 



Thread overview: 4+ messages
2020-08-08  9:57 fails to boot with "BTRFS critical (device sda2): corrupt leaf: ..." Lu Pi
2020-08-08 11:45 ` Qu Wenruo [this message]
2020-08-08 12:17   ` Qu Wenruo
2020-08-08 15:57   ` Lu Pi
