public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Graham Cobb <g.btrfs@cobb.uk.net>, linux-btrfs@vger.kernel.org
Subject: Re: Interrupted and resumed scrubs seem to have caused filesystem to go readonly (EFBIG error)
Date: Thu, 2 Jan 2020 20:34:23 +0800
Message-ID: <e481748b-d31a-e9a5-8532-e3e77188cbe3@gmx.com>
In-Reply-To: <09556f2c-be43-1363-ccbe-065c88f8d5c5@cobb.uk.net>

On 2020/1/2 8:07 PM, Graham Cobb wrote:
> On 02/01/2020 01:26, Qu Wenruo wrote:
>>
>>
>> On 2020/1/2 7:35 AM, Graham Cobb wrote:
>>> I have a problem on one BTRFS filesystem. It is not a critical
>>> filesystem (it is used for backups) and I have not yet tried even
>>> unmounting and remounting, let alone a "btrfs check".
>>>
>>> The problem seems to be that after several iterations of running 'btrfs
>>> scrub' for 30 minutes, then pausing for a while, then resuming the
>>> scrub, I got a transaction aborted with an EFBIG error and a warning in
>>> the kernel log. The fs went readonly, and transid verify errors are now
>>> reported. The original log extract is available at
>>> http://www.cobb.uk.net/kern.log.bug-010120 but I have pasted the key
>>> part below.
>>
>> EFBIG in btrfs is very rare, and can only be caused by too many system
>> chunks.
>>
>> The most common reason is the chunk pre-allocation for scrub, which
>> also matches your situation.
>>
>> There is already a fix for it, and it will land in the v5.5 kernel.
>> It looks like we should backport it.
> 
> Thanks Qu. I will wait for that kernel, and maybe stop my monthly scrubs
> (although my several other btrfs filesystems did not have a problem this
> month fortunately).

The problem normally does not affect the fs, as newly created empty
system chunks are soon cleaned up.

> 
> I am getting transid errors:

This is not good news. In fact, it's normally a deadly problem.

> 
>>> Jan  1 06:51:56 black kernel: [1931271.801468] BTRFS error (device
>>> sdc3): parent transid verify failed on 16216583520256 wanted 301800
>>> found 301756
> 
> I presume 301800 is the transaction which failed and caused the fs to go
> readonly. I don't suppose there is any way I could revert the whole fs to
> the state of the last successful transaction, is there?

This means some tree blocks didn't reach disk.
It can be deadly, or just a side effect of the transaction abort.
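
To make the numbers in that message easier to read: "wanted" is the
generation the parent pointer expects, and "found" is the generation
actually stored in the tree block on disk. A minimal Python sketch for
pulling those fields out of a kernel log line follows. The helper name
parse_transid_error and the regular expression are illustrative
assumptions, not part of any btrfs tool.

```python
import re

# Matches the "parent transid verify failed" message quoted above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_error(message):
    """Return (bytenr, wanted, found) as ints, or None if there is no match.

    Hypothetical helper: whitespace is collapsed first so that a log line
    wrapped across several lines still matches.
    """
    flat = " ".join(message.split())
    match = TRANSID_RE.search(flat)
    if match is None:
        return None
    return int(match["bytenr"]), int(match["wanted"]), int(match["found"])


line = (
    "BTRFS error (device sdc3): parent transid verify failed on "
    "16216583520256 wanted 301800 found 301756"
)
bytenr, wanted, found = parse_transid_error(line)
# Number of generations by which the on-disk block lags the expected one.
gap = wanted - found
```

For the line above, the block at 16216583520256 is 44 generations older
than its parent expects.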

> 
> It is not a big problem: the fs only contains backup snapshots (not my
> only backups!) although it would be nice to recover the historical
> snapshots if I could (I used them to research a bug I reported to debian
> just the other day!).

I'm afraid this depends on where the corruption is.

If it's just caused by that EFBIG error, and btrfs check reports no
errors, then it's just a temporary problem caused by the transaction abort.


If it's in the extent tree, it only affects mounting or certain write
operations, but if you can mount the fs, it should be OK to read out the
whole fs.

If it's in the csum tree, it will affect certain data reads, but the fs
is otherwise mostly OK.

If it's in the subvolume trees, some directories/files can't be accessed.

So, please run btrfs check on the unmounted fs to determine which case applies.
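
As a rough sketch of that step, the check could be scripted as below.
The helper names are hypothetical, and the device path in the usage
comment is only inferred from the "device sdc3" log line, so adjust it
to the real device. The fs must be unmounted first. --readonly is the
default mode of btrfs check and is spelled out only to make clear that
nothing is written.

```python
import subprocess


def build_check_command(device):
    """Build the argument list for a read-only btrfs check (hypothetical helper)."""
    # --readonly is already the default; it is explicit here so that the
    # command cannot be mistaken for a repair run.
    return ["btrfs", "check", "--readonly", device]


def run_check(device):
    """Run the check on an unmounted device; return (exit_code, output)."""
    result = subprocess.run(
        build_check_command(device),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so messages stay in order
        text=True,
        check=False,  # a non-zero exit is reported to the caller, not raised
    )
    return result.returncode, result.stdout


# Usage (device path is an assumption, needs root and an unmounted fs):
# code, output = run_check("/dev/sdc3")
```

A non-zero exit code means the check reported problems, in which case
the output is what to post to the list.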

Thanks,
Qu

> 
> Regards
> Graham
> 
