linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: Henk Slager <eye1tm@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfsck: backpointer mismatch (and multiple other errors)
Date: Fri, 1 Apr 2016 09:10:44 +0800
Message-ID: <56FDCA94.3030001@cn.fujitsu.com>
In-Reply-To: <CAPmG0jZ6hs4TA_0ozNEP4x6ucn971Nj-6+uempGi1VwTWAdCYQ@mail.gmail.com>



Henk Slager wrote on 2016/04/01 01:27 +0200:
> On Thu, Mar 31, 2016 at 10:44 PM, Kai Krakow <hurikhan77@gmail.com> wrote:
>> Hello!
>>
>> I already reported this in another thread, but it was a bit confusing
>> because multiple volumes were intermixed. So let's start a new thread:
>>
>> Since one of the last kernel upgrades, I'm experiencing one VDI file
>> (containing an NTFS image with Windows 7) getting damaged when running
>> the machine in VirtualBox. I found out about this after hitting a
>> "duplicate object" error, after which btrfs went RO. I fixed it
>> by deleting the VDI and restoring from backup - but now I get csum
>> errors as soon as some VM IO goes into the VDI file.
>>
>> The FS is still usable. One effect is that after reading all files
>> with rsync (to copy to my backup), each call of "du" or "df" hangs;
>> similar calls to "btrfs {sub|fi} ..." show the same effect. I guess one
>> consequence is that the FS does not unmount properly during
>> shutdown.
>>
>> Kernel is 4.5.0 by now (the FS is much much older, dates back to 3.x
>> series, and never had problems), including Gentoo patch-set r1.
>
> One possibility could be that the vbox kernel modules somehow corrupt
> btrfs kernel area since kernel 4.5.
>
> In order to make this reproducible (or an attempt to reproduce) for
> others, you could unload VirtualBox stuff and restore the VDI file
> from backup (or whatever big file) and then make pseudo-random, but
> reproducible writes to the file.
>
> It is not clear to me what 'Gentoo patch-set r1' is and does. So just
> boot a vanilla v4.5 kernel from kernel.org and see if you get csum
> errors in dmesg.
>
> Also, where does 'duplicate object' come from? dmesg? If so, please
> post its surroundings, straight from dmesg.
>
>> The device layout is:
>>
>> $ lsblk -o NAME,MODEL,FSTYPE,LABEL,MOUNTPOINT
>> NAME        MODEL            FSTYPE LABEL      MOUNTPOINT
>> sda         Crucial_CT128MX1
>> ├─sda1                       vfat   ESP        /boot
>> ├─sda2
>> └─sda3                       bcache
>>    ├─bcache0                  btrfs  system
>>    ├─bcache1                  btrfs  system
>>    └─bcache2                  btrfs  system     /usr/src
>> sdb         SAMSUNG HD103SJ
>> ├─sdb1                       swap   swap0      [SWAP]
>> └─sdb2                       bcache
>>    └─bcache2                  btrfs  system     /usr/src
>> sdc         SAMSUNG HD103SJ
>> ├─sdc1                       swap   swap1      [SWAP]
>> └─sdc2                       bcache
>>    └─bcache1                  btrfs  system
>> sdd         SAMSUNG HD103UJ
>> ├─sdd1                       swap   swap2      [SWAP]
>> └─sdd2                       bcache
>>    └─bcache0                  btrfs  system
>>
>> Mount options are:
>>
>> $ mount|fgrep btrfs
>> /dev/bcache2 on / type btrfs (rw,noatime,compress=lzo,nossd,discard,space_cache,autodefrag,subvolid=256,subvol=/gentoo/rootfs)
>>
>> The FS uses mraid=1 and draid=0.
>>
>> Output of btrfsck is:
>> (also available here:
>> https://gist.github.com/kakra/bfcce4af242f6548f4d6b45c8afb46ae)
>>
>> $ btrfsck /dev/disk/by-label/system
>> checking extents
>> ref mismatch on [10443660537856 524288] extent item 1, found 2
> This 10443660537856 number is bigger than the 1832931324360 number
> found for total bytes. AFAIK, this is already wrong.

Nope. That is a btrfs logical address, which can lie beyond the real
on-disk bytenr range.

The easiest way to reproduce such a case is to write something to a 256M
btrfs and balance the fs several times.

After that, all chunks can sit at bytenrs beyond 256M.
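
The reproduction described above could be sketched roughly as follows. This is only an illustration under assumptions, not a tested recipe: the image path, mount point, amount of data written, and number of balance runs are hypothetical choices, and the `btrfs inspect-internal dump-tree` subcommand assumes a reasonably recent btrfs-progs. The script only prints the commands unless RUN=1 is set, since actually running them needs root.

```shell
#!/bin/sh
# Sketch: push btrfs logical addresses past the device size by balancing.
# Dry-run by default: commands are printed, not executed.
# Set RUN=1 (as root) to really execute them.
# IMG and MNT are hypothetical scratch paths, not taken from the thread.
IMG="${IMG:-/tmp/btrfs-256m.img}"
MNT="${MNT:-/mnt/btrfs-test}"

# Print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

reproduce_high_logical() {
    run truncate -s 256M "$IMG"          # 256M backing file
    run mkfs.btrfs -f "$IMG"
    run mkdir -p "$MNT"
    run mount -o loop "$IMG" "$MNT"
    run dd if=/dev/urandom of="$MNT/data" bs=1M count=32   # "write something"
    i=0
    while [ "$i" -lt 5 ]; do             # "balance the fs several times"
        run btrfs balance start "$MNT"
        i=$((i + 1))
    done
    run umount "$MNT"
    # Dump the chunk tree; compare the logical chunk offsets with 256M.
    run btrfs inspect-internal dump-tree -t chunk "$IMG"
}

reproduce_high_logical
```

If the chunk offsets in the final dump exceed 268435456 (256M), the filesystem shows the same effect as the report: a logical address larger than the device itself.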

The real problem is that the extent has a mismatched reference count.
Normally this can be fixed with the --init-extent-tree option, but it
usually points to a bigger problem, especially since it has already caused
a kernel delayed-ref problem.

Not to mention the error "extent item 11271947091968 has multiple extent
items", which makes the problem more serious.


I assume some older kernel already screwed up the extent tree:
delayed-ref is bug-prone, although it has improved in recent years.

But the fs tree seems less damaged, so I assume the extent tree
corruption could be fixed by "--init-extent-tree".

For the only fs tree error (missing csum): if "btrfsck
--init-extent-tree --repair" works without any problem, the simplest
fix would be to just remove the file.
Or you can spend a lot of CPU time and disk IO to rebuild the whole csum
tree with the "--init-csum-tree" option.
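
The order of operations suggested above might look like the sketch below. It is an assumption-laden outline rather than a verified procedure: `btrfs check` is used as the current spelling of `btrfsck`, the read-only first pass and the `inode-resolve` lookup are additions not mentioned in the mail, and SUBVOL is a hypothetical mount point for the subvolume with root id 4336. The device label and inode number come from the btrfsck output quoted in this thread. Commands are only printed unless RUN=1 is set; the check steps need the filesystem unmounted, the lookup needs it mounted again.

```shell
#!/bin/sh
# Sketch of the suggested repair order. Dry-run by default: commands are
# printed, not executed. Set RUN=1 (as root) to really execute them.
DEV="${DEV:-/dev/disk/by-label/system}"   # device from the report
SUBVOL="${SUBVOL:-/mnt/subvol}"           # hypothetical mount of root 4336
INODE="${INODE:-4284125}"                 # inode reported with missing csum

# Print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

repair_plan() {
    # 1. Assumed extra step: look at the damage without writing anything.
    run btrfs check --readonly "$DEV"
    # 2. Rebuild the extent tree, as suggested in the mail.
    run btrfs check --repair --init-extent-tree "$DEV"
    # 3. With the fs mounted again, find the file behind the inode so it
    #    can be removed (and restored from backup if needed).
    run btrfs inspect-internal inode-resolve "$INODE" "$SUBVOL"
    # Alternative to removing the file: rebuild all checksums, which costs
    # a lot of CPU time and disk IO.
    # run btrfs check --repair --init-csum-tree "$DEV"
}

repair_plan
```

Taking a backup before any `--repair` run is prudent, since the extent tree is rewritten in place.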

Thanks,
Qu

>
> [...]
>
>> checking fs roots
>> root 4336 inode 4284125 errors 1000, some csum missing
> What is in this inode?
>
>> Checking filesystem on /dev/disk/by-label/system
>> UUID: d2bb232a-2e8f-4951-8bcc-97e237f1b536
>> found 1832931324360 bytes used err is 1
>> total csum bytes: 1730105656
>> total tree bytes: 6494474240
>> total fs tree bytes: 3789783040
>> total extent tree bytes: 608219136
>> btree space waste bytes: 1221460063
>> file data blocks allocated: 2406059724800
>>   referenced 2040857763840


