From: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
To: Carsten Grommel <c.grommel@profihost.ag>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: How to (attempt to) repair these btrfs errors
Date: Sat, 5 Mar 2022 20:36:56 -0500
Message-ID: <YiQQOFQO7G4NZTKS@hungrycats.org>
In-Reply-To: <AM0PR08MB3265280A4F4EF8151DA289F58E029@AM0PR08MB3265.eurprd08.prod.outlook.com>
On Tue, Mar 01, 2022 at 10:55:50AM +0000, Carsten Grommel wrote:
> Follow-up pastebin with the most recent errors in dmesg:
>
> https://pastebin.com/4yJJdQPJ
This seems to have expired.
> ________________________________________
> From: Carsten Grommel
> Sent: Monday, 28 February 2022 19:41
> To: linux-btrfs@vger.kernel.org
> Subject: How to (attempt to) repair these btrfs errors
>
> Hi,
>
> Short buildup: a btrfs filesystem used for storing Ceph RBD backups within subvolumes got corrupted.
> It sits on top of three RAID 6 arrays, with btrfs mounted as RAID 0 across them for performance (we have to store massive amounts of data).
>
> Linux cloud8-1550 5.10.93+2-ph #1 SMP Fri Jan 21 07:52:51 UTC 2022 x86_64 GNU/Linux
>
> It was running kernel 5.4.121 before.
>
> btrfs --version
> btrfs-progs v4.20.1
>
> btrfs fi show
> Label: none uuid: b634a011-28fa-41d7-8d6e-3f68ccb131d0
> Total devices 3 FS bytes used 56.74TiB
> devid 1 size 25.46TiB used 22.70TiB path /dev/sda1
> devid 2 size 25.46TiB used 22.69TiB path /dev/sdb1
> devid 3 size 25.46TiB used 22.70TiB path /dev/sdd1
>
> btrfs fi df /vmbackup/
> Data, RAID0: total=66.62TiB, used=56.45TiB
> System, RAID1: total=8.00MiB, used=4.36MiB
> Metadata, RAID1: total=750.00GiB, used=294.90GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> Attached is the dmesg.log; a few dmesg messages regarding the different errors follow (some information redacted):
>
> [Mon Feb 28 18:53:57 2022] BTRFS error (device sda1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 69074516, gen 184286
>
> [Mon Feb 28 18:53:57 2022] BTRFS error (device sda1): bdev /dev/sdd1 errs: wr 0, rd 0, flush 0, corrupt 69074517, gen 184286
>
> [Mon Feb 28 18:54:23 2022] BTRFS error (device sda1): unable to fixup (regular) error at logical 776693776384 on dev /dev/sdd1
>
> [Mon Feb 28 18:54:25 2022] scrub_handle_errored_block: 21812 callbacks suppressed
>
> [Mon Feb 28 18:54:31 2022] BTRFS warning (device sda1): checksum error at logical 777752285184 on dev /dev/sdd1, physical 259607957504, root 108747, inode 257, offset 59804737536, length 4096, links 1 (path: cephstorX_vm-XXX-disk-X-base.img_1645337735)
>
> I am able to mount the filesystem read-write, but accessing specific blocks seems to force btrfs to remount read-only.
> I am currently running a scrub over the filesystem.
>
> The system was rebooted and the fs was remounted 2-3 times. In my experience btrfs would usually fix these kinds of errors after a remount, but not this time.
>
> Before I run "btrfs check --repair" I would like some advice on how to tackle these errors.
The corruption and generation event counts indicate sdd1 (or one of its
component devices) was offline for a long time or suffered corruption
on a large scale.
Data is raid0, so data repair is not possible. Delete all the files
that contain corrupt data.
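To find those files, something like this untested sketch should work. It assumes the warnings keep the "(path: ...)" format from your dmesg sample above; note that the "callbacks suppressed" message means the list will be incomplete, so repeat after a full scrub pass.

```shell
# corrupt_paths: read dmesg lines on stdin, print the unique file paths
# named in scrub checksum-error warnings. Listing only -- deletion is
# left to you, prefixed with the mount point (/vmbackup in your case).
corrupt_paths() {
    grep -o '(path: [^)]*)' | sed 's/^(path: //; s/)$//' | sort -u
}

# Example, using the redacted warning line from your report:
printf '%s\n' \
  '[Mon Feb 28 18:54:31 2022] BTRFS warning (device sda1): checksum error at logical 777752285184 on dev /dev/sdd1, physical 259607957504, root 108747, inode 257, offset 59804737536, length 4096, links 1 (path: cephstorX_vm-XXX-disk-X-base.img_1645337735)' \
  | corrupt_paths
# -> cephstorX_vm-XXX-disk-X-base.img_1645337735
```

In practice you would pipe `dmesg` (or the journal) into corrupt_paths instead of printf.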
If you are using space_cache=v1, now is a good time to upgrade to
space_cache=v2. v1 space cache is stored in the data profile, and it has
likely been corrupted. btrfs will usually detect and repair corruption
in space_cache=v1, but there is no need to take any such risk here
when you can easily use v2 instead (or at least clear the v1 cache).
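The switch would look roughly like this (untested sketch; the run() wrapper only echoes each command so nothing happens until you remove it, and /vmbackup is the mount point from your "btrfs fi df" output):

```shell
# Dry-run wrapper: prints the commands instead of executing them.
run() { echo "+ $*"; }

# space_cache cannot be switched on a live remount, so unmount first.
run umount /vmbackup

# Drop the (possibly corrupt) v1 cache while the fs is unmounted.
run btrfs check --clear-space-cache v1 /dev/sda1

# Mount once with space_cache=v2; the free-space tree persists afterwards.
run mount -o space_cache=v2 /dev/sda1 /vmbackup
```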
I don't see any errors in these logs that would indicate a metadata issue,
but huge numbers of messages are suppressed. Perhaps a log closer
to the moment when the filesystem goes read-only will be more useful.
I would expect that if there are no problems on sda1 or sdb1 then it
should be possible to repair the metadata errors on sdd1 by scrubbing
that device.
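Roughly (untested sketch, same echo-only run() wrapper as a dry-run guard; scrub rewrites bad metadata blocks on sdd1 from the good RAID1 copies on sda1/sdb1):

```shell
# Dry-run wrapper: prints the commands instead of executing them.
run() { echo "+ $*"; }

# -B: stay in the foreground; -d: print per-device statistics at the end.
run btrfs scrub start -B -d /dev/sdd1

# Progress can be checked from another shell while it runs:
run btrfs scrub status /dev/sdd1
```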
> Kind regards
> Carsten Grommel
>
Thread overview: 11+ messages
2022-02-28 18:41 How to (attempt to) repair these btrfs errors Carsten Grommel
2022-03-01 10:55 ` AW: " Carsten Grommel
2022-03-06 1:36 ` Zygo Blaxell [this message]
2022-03-07 7:03 ` Carsten Grommel
2022-03-07 7:11 ` Qu Wenruo
2022-03-07 7:25 ` AW: " Carsten Grommel
2022-03-07 7:27 ` Carsten Grommel
2022-03-07 7:39 ` Qu Wenruo
2022-03-07 7:34 ` Qu Wenruo
2022-03-07 7:48 ` AW: " Carsten Grommel
2022-03-07 8:00 ` Qu Wenruo