From: Maxim Mikityanskiy <maxtram95@gmail.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>
Subject: Re: btrfs corruption, extent buffer leak
Date: Tue, 24 Jan 2023 17:51:04 +0200
Message-ID: <Y8/+aOngUIC2ytGB@mail.gmail.com>
In-Reply-To: <CAL3q7H5auixGxxjALT0D3mFcq-Lj=s2yX-HPEgLk=XZbUTTqng@mail.gmail.com>

Thanks for the advice!

On Mon, Jan 23, 2023 at 01:23:25PM +0000, Filipe Manana wrote:
> On Mon, Jan 23, 2023 at 12:03 PM Maxim Mikityanskiy <maxtram95@gmail.com> wrote:
> >
> > >
> > > https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@leemhuis.info/
> >
> > So it seems to be a known issue for 6.1. Is there any known workaround,
> > or should I downgrade the kernel? Is there any risk of running an older
> > kernel (and an older btrfs driver) on a filesystem that was driven by
> > 6.1?
> 
> You can temporarily downgrade to a 6.0 or older kernel if you want to.
> 
> >
> > > > Other than that, I couldn't list files in a directory two levels higher
> > > > than the file that I attempted to create.
> > >
> > > You couldn't list files while the fs was in RO state, or after
> > > rebooting? Or both?
> >
> > Only while it was read-only. After rebooting, I could access that
> > directory again, and the contents seemed to be intact.
> >
> > > What happened exactly when attempting to list files? What error did you get?
> >
> > Sorry, I didn't write down the error code...
> >
> > ls didn't show any entries and just displayed one line with an error,
> > which I didn't save.
> >
> > > >
> > > > After rebooting from a live USB, I ran btrfs scrub (no errors found) and
> > > > btrfs check (some errors found):
> > > >
> > > > Opening filesystem to check...
> > > > Checking filesystem on /dev/mapper/root
> > > > UUID: ********-****-****-****-************
> > > > [1/7] checking root items
> > > > [2/7] checking extents
> > > > [3/7] checking free space tree
> > > > [4/7] checking fs roots
> > > > [5/7] checking only csums items (without verifying data)
> > > > [6/7] checking root refs
> > > > [7/7] checking quota groups
> > > > ERROR: failed to add qgroup relation, member=258 parent=71776119061217538: No such file or directory
> > > > ERROR: loading qgroups from disk: -2
> > > > ERROR: failed to check quota groups
> > >
> > > This is a different issue; it's the first time I've seen it, and it's
> > > not related to the previous one. I'm adding Qu to CC since he knows
> > > qgroups much better than I do, so he may have an idea.
> >
> > More info on this: after I rebooted and continued using the filesystem,
> > I started seeing these messages in dmesg:
> >
> > BTRFS warning (device dm-0): qgroup rescan is already in progress
> > BTRFS warning (device dm-0): qgroup rescan is already in progress
> > ...
> > BTRFS warning (device dm-0): qgroup rescan is already in progress
> > BTRFS info (device dm-0): qgroup scan completed (inconsistency flag cleared)
> >
> > These messages repeated many times: the qgroup rescan was apparently
> > being retriggered over and over, and even after it completed,
> > something kicked it off again.
> >
> > Then I removed a few hundred gigabytes of files, deleted most
> > subvolumes (there were several dozen docker subvolumes), and noticed
> > that quotas had become disabled on this filesystem. I re-enabled
> > quotas, rescanned qgroups, and the quota issue seems to be fixed: I no
> > longer see repeated rescans in dmesg, and btrfs check doesn't show any
> > errors now.
> 
> Disabling and re-enabling qgroups, or just rescanning, sometimes
> solves qgroup related problems.
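
For the record, the sequence I ran to get back to a clean state was
roughly the following (reconstructed from memory, and assuming the
affected filesystem is the one mounted at /, so the exact invocations
may have differed slightly):

  btrfs quota enable /
  btrfs quota rescan -w /    # -w waits for the rescan to complete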

I noticed that after using docker, a lot of stale qgroups appear. They
can easily be cleared with btrfs qgroup clear-stale, but I don't recall
seeing them before:

0/3026           0.00B        0.00B   <stale>
0/3027           0.00B        0.00B   <stale>
0/3028           0.00B        0.00B   <stale>
0/3029           0.00B        0.00B   <stale>
0/3030           0.00B        0.00B   <stale>
0/3031           0.00B        0.00B   <stale>
0/3032           0.00B        0.00B   <stale>
0/3033           0.00B        0.00B   <stale>
0/3034           0.00B        0.00B   <stale>
0/3035           0.00B        0.00B   <stale>
0/3036           0.00B        0.00B   <stale>
0/3037           0.00B        0.00B   <stale>

Is there some garbage-collecting mechanism that will remove them over
time? Is it normal to see them at all?
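
In case it helps, this is roughly how I have been cleaning them up
(again assuming the filesystem is mounted at /; as I understand it,
clear-stale only removes qgroups whose subvolume no longer exists):

  btrfs qgroup show /          # stale entries are marked <stale>
  btrfs qgroup clear-stale /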

> 
> >
> > > > found 1211137126400 bytes used, error(s) found
> > > > total csum bytes: 1170686968
> > > > total tree bytes: 10738614272
> > > > total fs tree bytes: 8738439168
> > > > total extent tree bytes: 557547520
> > > > btree space waste bytes: 1726206798
> > > > file data blocks allocated: 1533753126912
> > > >  referenced 1324118478848
> > > > extent buffer leak: start 931127214080 len 16384
> > > > extent buffer leak: start 103570046976 len 16384
> > > >
> > > > The quota error and especially the extent buffer leak error don't look
> > > > good to me. However, the filesystem seems to mount properly, and so far
> > > > I haven't found any lost files (still looking). I don't know whether the
> > > > amount of free space is shown correctly.
> > > >
> > > > What should be my steps to fix these errors? I didn't try btrfs check
> > > > --repair yet, because of numerous warnings not to use it.
> > > >
> > > > Also, what is the approximate amount of data lost due to this extent
> > > > buffer leak? Is 16384 the number of sectors or the number of bytes?
> > >
> > > Why do you think there's data loss?
> >
> > The error message looked scary; I thought it meant that some extents
> > with real data had been leaked on the filesystem and become
> > unreferenced. The "BTRFS critical: corrupt leaf" message in dmesg,
> > followed by the switch to read-only (a standard fallback when the
> > filesystem is seriously screwed up), also made me fairly sure some
> > data had been lost.
> 
> Only data that was not yet flushed to disk (and not fsynced) could be
> lost, i.e. just like a sudden power failure.
> 
> And for metadata (file names, directories, xattrs, etc.), only changes
> done since the last transaction commit and not fsynced can be lost.
> By default, unless you use the mount option commit=xxx, transaction
> commits happen every 30 seconds, sometimes sooner,
> as some fsyncs may fall back to a transaction commit, or a snapshot was
> created, etc.
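
Good to know about commit=xxx. For reference, I believe that interval
can be tuned at mount time with something like the following (the value
15 and the mount point / are just arbitrary examples, not a
recommendation):

  mount -o remount,commit=15 /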
> 
> >
> > > The extent buffer leak is just a
> > > btrfs-progs thing: it means the code failed to release allocated
> > > memory, but once 'btrfs check' exits, the memory is released. This is
> > > likely happening due to the qgroups error - some error path is not
> > > freeing the memory.
> >
> > That's a relief to hear. I actually noticed that the "start" numbers
> > weren't consistent across multiple runs of btrfs check. And this error
> > disappeared after fixing quotas, so it indeed seems to be related.
> >
> > I appreciate your help, thanks! What's the best thing to do in these
> > circumstances to minimize further damage? Should I recreate the
> > filesystem, or is it fine as it is? Should I downgrade the kernel for
> > now? If the first error repeats, is there any risk for data loss?
> 
> No, no need to recreate the filesystem.
> That was corruption detected during an fsync operation, and emitting
> the error and turning the fs read-only simply prevents that
> corruption from being persisted.

Thanks for the explanation! It's nice to hear it wasn't persisted to
disk - that was exactly what I was worried about.

> Just downgrade to a 6.0 kernel or older for now, until the relevant
> fixes land in a 6.1.x stable release.

Thanks for the advice!

> 
> >
> > >
> > > >
> > > > Thanks,
> > > > Max
