From: Marc MERLIN <marc@merlins.org>
To: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Josef Bacik <jbacik@fb.com>, Qu Wenruo <quwenruo@cn.fujitsu.com>,
	David Sterba <dsterba@suse.cz>
Subject: Re: ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
Date: Thu, 6 Jul 2017 22:39:53 -0700
Message-ID: <20170707053953.GB9735@merlins.org>
In-Reply-To: <20170707053718.GA9735@merlins.org>

On Thu, Jul 06, 2017 at 10:37:18PM -0700, Marc MERLIN wrote:
> I'm still trying to fix my filesystem.
> It seems to work well enough since the damage is apparently localized, but
> I'd really want check --repair to actually bring it back to a working
> state, but now it's crashing
> 
> This is btrfs tools from git from a few days ago
> 
> Failed to find [4068943577088, 168, 16384]
> btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
> Failed to find [5905106075648, 168, 16384]
> btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
> Failed to find [21037056, 168, 16384]
> btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
> Failed to find [21053440, 168, 16384]
> btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
> Failed to find [21299200, 168, 16384]
> btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
> Failed to find [5523931971584, 168, 16384]
> btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
> ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
> btrfs(+0x113cf)[0x5651e60443cf]
> btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
> btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
> btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
> btrfs(+0x59184)[0x5651e608c184]
> btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
> btrfs(main+0x85)[0x5651e60442c3]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
> btrfs(_start+0x2a)[0x5651e6043e3a]

Mmmh, never mind, it seems that the software raid suffered yet another
double disk failure due to some undetermined flakiness in the underlying block
device cabling :-/
That would likely explain the failures here.
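
For anyone triaging a similar log: each "checksum verify failed" line in the
quoted output below appears twice, so counting distinct block addresses gives
a clearer picture of how localized the damage is. A minimal sketch, assuming
the line format shown in this log; the function name and the sample list are
hypothetical and not part of btrfs-progs:

```python
import re
from collections import Counter

# Matches lines of the form seen in the quoted log, for example:
#   checksum verify failed on 3037243965440 found 179689AF wanted 82B97043
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]+) wanted ([0-9A-Fa-f]+)"
)


def summarize_csum_failures(lines):
    """Return a Counter mapping block byte number -> number of failure reports."""
    counts = Counter()
    for line in lines:
        match = CSUM_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample lines copied from the quoted log (leading "> " quoting is tolerated,
# since the pattern is searched for anywhere in the line).
sample = [
    "> checksum verify failed on 3037243965440 found 179689AF wanted 82B97043",
    "> checksum verify failed on 3037243965440 found 179689AF wanted 82B97043",
    "> checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F",
    "> checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F",
    "> Csum didn't match",
]

counts = summarize_csum_failures(sample)
for bytenr, reports in sorted(counts.items()):
    print(f"{bytenr}: reported {reports} time(s)")
print(f"distinct blocks with checksum failures: {len(counts)}")
```

To run it against a saved log instead of the sample, pass an open file object
(or any iterable of lines) to summarize_csum_failures().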

 
> Full log:
> enabling repair mode
> Checking filesystem on /dev/mapper/dshelf2
> UUID: 85441c59-ad11-4b25-b1fe-974f9e4acede
> checking extents
> Fixed 0 roots.
> checking free space cache
> cache and super generation don't match, space cache will be invalidated
> checking fs roots
> checksum verify failed on 3037243965440 found 179689AF wanted 82B97043
> checksum verify failed on 3037243965440 found 179689AF wanted 82B97043
> checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F
> checksum verify failed on 3037243998208 found 60EA5C5B wanted 0CF5948F
> checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E
> checksum verify failed on 3037244293120 found 38382803 wanted 39E4F85E
> checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C
> checksum verify failed on 3037244342272 found E84F1D8F wanted 472DA98C
> checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09
> checksum verify failed on 3037244669952 found 2F6E4C0E wanted E00BBF09
> checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA
> checksum verify failed on 3037248913408 found CE2E4AEE wanted EF22F9CA
> checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC
> checksum verify failed on 3037248929792 found C989CB0E wanted E27527BC
> checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598
> checksum verify failed on 3037247569920 found 05848C79 wanted EF3D5598
> checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135
> checksum verify failed on 3037247586304 found 9D1E4E39 wanted F1EC8135
> checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D
> checksum verify failed on 3037247619072 found BFE40520 wanted 627DB20D
> checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC
> checksum verify failed on 3037249208320 found A6B5775F wanted B1E6C0FC
> checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7
> checksum verify failed on 3037252534272 found 207AD7DF wanted DE72BDF7
> checksum verify failed on 3111569391616 found 3C623707 wanted D955D668
> checksum verify failed on 3111569391616 found 3C623707 wanted D955D668
> checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A
> checksum verify failed on 3111569768448 found 0C129F3C wanted C509003A
> checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2
> checksum verify failed on 3111569735680 found E94C9D41 wanted 55836DD2
> checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35
> checksum verify failed on 3037253435392 found 8E124EB5 wanted A3291C35
> checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339
> checksum verify failed on 3037253746688 found 2B6A4DCD wanted 4323B339
> checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43
> checksum verify failed on 3111569702912 found 1048610C wanted 9856BB43
> checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF
> checksum verify failed on 3111569801216 found CD7AAF82 wanted C1DA44DF
> checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE
> checksum verify failed on 3037251878912 found 86FB02F3 wanted 728772CE
> checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0
> checksum verify failed on 3037252861952 found CFD54426 wanted E91774C0
> checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE
> checksum verify failed on 3037255974912 found E3655B7C wanted 8163FDDE
> checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10
> checksum verify failed on 3037252927488 found E7AD88A3 wanted F6BA5B10
> checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81
> checksum verify failed on 3037253500928 found 514A55B2 wanted 3611CD81
> checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B
> checksum verify failed on 3037256105984 found 41ADA274 wanted 8F7F0A0B
> Csum didn't match
> The following tree block(s) is corrupted in tree 3861:
> 	tree block bytenr: 1710573748224, level: 1, node key: (1073956, 12, 959325)
> Try to repair the btree for root 3861
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Csum didn't match
> Failed to find [4068943577088, 168, 16384]
> btrfs unable to find ref byte nr 4068943577088 parent 0 root 4  owner 1 offset 0
> Failed to find [5905106075648, 168, 16384]
> btrfs unable to find ref byte nr 5906282119168 parent 0 root 4  owner 0 offset 1
> Failed to find [21037056, 168, 16384]
> btrfs unable to find ref byte nr 21037056 parent 0 root 3  owner 1 offset 0
> Failed to find [21053440, 168, 16384]
> btrfs unable to find ref byte nr 21053440 parent 0 root 3  owner 0 offset 1
> Failed to find [21299200, 168, 16384]
> btrfs unable to find ref byte nr 21299200 parent 0 root 3  owner 0 offset 1
> Failed to find [5523931971584, 168, 16384]
> btrfs unable to find ref byte nr 5524037566464 parent 0 root 3861  owner 3 offset 0
> ctree.c:197: update_ref_for_cow: BUG_ON `ret` triggered, value -5
> btrfs(+0x113cf)[0x5651e60443cf]
> btrfs(__btrfs_cow_block+0x576)[0x5651e6045848]
> btrfs(btrfs_cow_block+0xea)[0x5651e6045dc6]
> btrfs(btrfs_search_slot+0x11df)[0x5651e604969d]
> btrfs(+0x59184)[0x5651e608c184]
> btrfs(cmd_check+0x2bd4)[0x5651e60987b3]
> btrfs(main+0x85)[0x5651e60442c3]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f34f523d2b1]
> btrfs(_start+0x2a)[0x5651e6043e3a]
> Aborted
> gargamel:~# 
> -- 
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems ....
>                                       .... what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/  

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  
