From: Liu Bo <bo.li.liu@oracle.com>
To: dsterba@suse.cz, Paul Jones <paul@pauljones.id.au>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: kernel BUG at fs/btrfs/extent_io.c:1989
Date: Wed, 20 Sep 2017 12:19:16 -0700 [thread overview]
Message-ID: <20170920191916.GA10216@lim.localdomain> (raw)
In-Reply-To: <20170920125356.GI29043@twin.jikos.cz>
On Wed, Sep 20, 2017 at 02:53:57PM +0200, David Sterba wrote:
> On Tue, Sep 19, 2017 at 10:12:39AM -0600, Liu Bo wrote:
> > On Tue, Sep 19, 2017 at 05:07:25PM +0200, David Sterba wrote:
> > > On Tue, Sep 19, 2017 at 11:32:46AM +0000, Paul Jones wrote:
> > > > > This 'mirror 0' looks fishy, (as mirror comes from btrfs_io_bio->mirror_num,
> > > > > which should be at least 1 if raid1 setup is in use.)
> > > > >
> > > > > Not sure if 4.13.2-gentoo made any changes on btrfs, but can you please
> > > > > verify with the upstream kernel, say, v4.13?
> > > >
> > > > It's basically a vanilla kernel with a handful of unrelated patches.
> > > > The filesystem fell apart overnight, there were a few thousand
> > > > checksum errors and eventually it went read-only. I tried to remount
> > > > it, but got open_ctree failed. Btrfs check segfaulted, lowmem mode
> > > > completed with so many errors I gave up and will restore from the
> > > > backup.
> > > >
> > > > I think I know the problem now - the lvm cache was in writeback mode
> > > > (by accident) so during a defrag there would be gigabytes of unwritten
> > > > data in memory only, which was all lost when the system crashed
> > > > (motherboard failure). No wonder the filesystem didn't quite survive.
> > >
> > > Yeah, the caching layer was my first suspicion, and lack of propagating
> > > of the barriers. Good that you were able to confirm that as the root cause.
> > >
> > > > I must say though, I'm seriously impressed at the data integrity of
> > > > BTRFS - there were near 10,000 checksum errors, 4 which were
> > > > uncorrectable, and from what I could tell nearly all of the data was
> > > > still intact according to rsync checksums.
> > >
> > > Yay!
> >
> > But still don't get why mirror_num is 0, do you have an idea on how
> > does writeback cache make that?
>
> My first idea was that the cached blocks were zeroed, so we'd see the ino
> and mirror as 0. But this is not correct as the blocks would not pass
> the checksum tests, so the blocks must be from some previous generation.
> Ie. the transid verify failure. And all the error reports appear after
> that so I'm slightly suspicious about the way it's actually reported.
>
> btrfs_print_data_csum_error takes mirror from either io_bio or
> compressed_bio structures, so there might be a case when the structures
> are initialized. If the transid check is ok, then the structures are
> updated. If the check fails we'd see the initial mirror number. All of
> that is just a hypothesis, I haven't checked with the code.
>
Thanks a lot for the input, you're right, mirror_num 0 should come
from compressed read where it doesn't record the bbio->mirror_num but
the mirror passing from the upper layer, and it's not metadata as we
don't yet compress metadata, so this all makes sense.
I think it also disables the ability of read-repair from raid1 for
compressed data, and that's what caused the bug where it hits
BUG_ON(mirror_num == 0) in cleanup_io_failure().
The good news is that I can reproduce it, will send a patch and a
testcase.
> I don't have a theoretical explanation for the ino 0. The inode pointer
> that goes to btrfs_print_data_csum_error should be from a properly
> initialized inode and we print the number using btrfs_ino. That will use
> the vfs i_ino value and we should never get 0 out of that.
ino 0 comes from metadata read-repair, some cleanup may be needed to
make it less confusing.
thanks,
-liubo
prev parent reply other threads:[~2017-09-20 19:23 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-18 8:55 kernel BUG at fs/btrfs/extent_io.c:1989 Paul Jones
2017-09-18 17:09 ` Liu Bo
2017-09-18 18:30 ` Holger Hoffstätte
2017-09-18 19:35 ` Kai Krakow
2017-09-19 11:32 ` Paul Jones
2017-09-19 15:07 ` David Sterba
2017-09-19 16:12 ` Liu Bo
2017-09-20 12:53 ` David Sterba
2017-09-20 19:19 ` Liu Bo [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170920191916.GA10216@lim.localdomain \
--to=bo.li.liu@oracle.com \
--cc=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
--cc=paul@pauljones.id.au \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).