linux-btrfs.vger.kernel.org archive mirror
From: David Sterba <dsterba@suse.cz>
To: Liu Bo <bo.li.liu@oracle.com>
Cc: dsterba@suse.cz, Paul Jones <paul@pauljones.id.au>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: kernel BUG at fs/btrfs/extent_io.c:1989
Date: Wed, 20 Sep 2017 14:53:57 +0200
Message-ID: <20170920125356.GI29043@twin.jikos.cz>
In-Reply-To: <20170919161239.GA18597@dhcp-10-211-47-181.usdhcp.oraclecorp.com>

On Tue, Sep 19, 2017 at 10:12:39AM -0600, Liu Bo wrote:
> On Tue, Sep 19, 2017 at 05:07:25PM +0200, David Sterba wrote:
> > On Tue, Sep 19, 2017 at 11:32:46AM +0000, Paul Jones wrote:
> > > > This 'mirror 0' looks fishy, (as mirror comes from btrfs_io_bio->mirror_num,
> > > > which should be at least 1 if raid1 setup is in use.)
> > > > 
> > > > Not sure if 4.13.2-gentoo made any changes on btrfs, but can you please
> > > > verify with the upstream kernel, say, v4.13?
> > > 
> > > It's basically a vanilla kernel with a handful of unrelated patches.
> > > The filesystem fell apart overnight, there were a few thousand
> > > checksum errors and eventually it went read-only. I tried to remount
> > > it, but got open_ctree failed. Btrfs check segfaulted, lowmem mode
> > > completed with so many errors I gave up and will restore from the
> > > backup.
> > > 
> > > I think I know the problem now - the lvm cache was in writeback mode
> > > (by accident) so during a defrag there would be gigabytes of unwritten
> > > data in memory only, which was all lost when the system crashed
> > > (motherboard failure). No wonder the filesystem didn't quite survive.
> > 
> > Yeah, the caching layer was my first suspicion, along with barriers not
> > being propagated. Good that you were able to confirm that as the root cause.
> > 
> > > I must say though, I'm seriously impressed at the data integrity of
> > > BTRFS - there were nearly 10,000 checksum errors, 4 of which were
> > > uncorrectable, and from what I could tell nearly all of the data was
> > > still intact according to rsync checksums.
> > 
> > Yay!
> 
> But I still don't get why mirror_num is 0. Do you have any idea how
> a writeback cache could cause that?

My first idea was that the cached blocks were zeroed, so we'd see the ino
and mirror as 0. But this is not correct, as zeroed blocks would not pass
the checksum tests, so the blocks must be from some previous generation,
i.e. the transid verify failure. All the error reports appear after
that, so I'm slightly suspicious about the way it's actually reported.

btrfs_print_data_csum_error takes mirror from either the io_bio or the
compressed_bio structure, so there might be a case where the structures
have only been initialized. If the transid check is ok, the structures
are updated. If the check fails, we'd see the initial mirror number. All
of that is just a hypothesis, I haven't checked it against the code.
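
To make the hypothesis concrete, here is a minimal userspace sketch of the
suspected flow. It is not kernel code: struct demo_io_bio and all demo_*
functions are hypothetical stand-ins, and the assumption that the mirror
number is only stored after a successful transid check is exactly the part
that has not been verified.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-bio bookkeeping; not the kernel struct. */
struct demo_io_bio {
	int mirror_num;      /* 0 = never assigned; real mirror numbers start at 1 */
	uint64_t generation; /* transid found in the block that was read */
};

/* Freshly initialized structure: the mirror number starts out as 0. */
static void demo_io_bio_init(struct demo_io_bio *bio)
{
	bio->mirror_num = 0;
	bio->generation = 0;
}

/*
 * Assumed (unverified) flow: the mirror number is recorded only when the
 * transid check passes. On a transid mismatch we return early and the
 * structure keeps its initial mirror number.
 */
static bool demo_finish_read(struct demo_io_bio *bio, int mirror,
			     uint64_t found_gen, uint64_t expected_gen)
{
	bio->generation = found_gen;
	if (found_gen != expected_gen)
		return false; /* stale block from a previous generation */
	bio->mirror_num = mirror;
	return true;
}

/* Formats the report the way an error printer would, from the structure only. */
static int demo_format_csum_error(const struct demo_io_bio *bio,
				  char *buf, size_t len)
{
	return snprintf(buf, len, "csum failed mirror %d", bio->mirror_num);
}
```

With this model, a read from mirror 2 that fails the transid check is
reported as "mirror 0", while the same read with a matching generation is
reported as "mirror 2". That would match what the log shows, but again,
only under the assumption stated above.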

I don't have a theoretical explanation for the ino 0. The inode pointer
that goes to btrfs_print_data_csum_error should be from a properly
initialized inode and we print the number using btrfs_ino. That will use
the vfs i_ino value and we should never get 0 out of that.
