From: Larkin Lowrey <llowrey@nuclearwinter.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfsck check infinite loop
Date: Wed, 24 Sep 2014 10:29:06 -0500
Message-ID: <5422E342.8060000@nuclearwinter.com>
In-Reply-To: <5422D4E0.8090605@nuclearwinter.com>

I noticed the following:

(gdb) print nrscan
$19 = 1680726970
(gdb) print tree->cache_size
$20 = 1073741824
(gdb) print cache_hard_max
$21 = 1073741824

It appears that cache_size cannot shrink below cache_hard_max, so we
never break out of the loop (nrscan is already past 1.6 billion). The FS
in question is 30TB with ~26TB in use. Perhaps cache_hard_max (1GB) is
too small for an FS of this size? I just bumped it to 2GB and am
re-running to see if that helps.
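
A minimal sketch of the suspected failure mode follows. It is a simplified
model, not the actual extent_io.c code: the struct and function names
(model_buf, model_cache, model_free_some) and the max_scan bound are
hypothetical, and the assumption that only buffers with refs == 1 can be
evicted has not been confirmed against the source. It shows that when every
cached buffer is still referenced, a loop that exits only once cache_size
drops below the hard limit keeps scanning without freeing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cached extent buffer (not the btrfs-progs type). */
struct model_buf {
	int refs;      /* 1 = held only by the cache, >1 = pinned by a caller */
	uint64_t size; /* bytes charged to the cache, 0 once evicted */
};

/* Hypothetical stand-in for the cache bookkeeping seen in gdb. */
struct model_cache {
	struct model_buf *bufs;
	size_t nr_bufs;
	uint64_t cache_size;
	uint64_t hard_max;
};

/*
 * Walk the buffers round-robin and evict unpinned ones until cache_size
 * drops below hard_max. Pinned buffers are skipped, which models putting
 * them back on the list to be visited again. max_scan is an illustrative
 * bound so the model terminates; without it, a cache full of pinned
 * buffers would spin forever. Returns the number of buffers examined.
 */
uint64_t model_free_some(struct model_cache *c, uint64_t max_scan)
{
	uint64_t nrscan = 0;
	size_t i = 0;

	assert(c->nr_bufs > 0);
	while (c->cache_size >= c->hard_max && nrscan < max_scan) {
		struct model_buf *b = &c->bufs[i];

		if (b->refs == 1 && b->size > 0) {
			c->cache_size -= b->size;
			b->size = 0;
		}
		i = (i + 1) % c->nr_bufs;
		nrscan++;
	}
	return nrscan;
}
```

In this model, a cache whose buffers all have refs > 1 exhausts max_scan with
cache_size unchanged, while a single unpinned buffer is enough to let the
loop exit. Raising the hard limit only helps if the pinned working set fits
under the new value.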

--Larkin

On 9/24/2014 9:27 AM, Larkin Lowrey wrote:
> I ran 'btrfs check --repair --init-extent-tree' and appear to be in an
> infinite loop. It performed heavy IO for about 1.5 hours then the IO
> stopped and the CPU stayed at 100%. It's been like that for more than 12
> hours now.
>
> I made a hardware change last week that resulted in unstable RAM so I
> suspect some corrupt data was written to disk. I tried mounting with
> -orecovery,clear_cache,nospace_cache but I would get a panic shortly
> thereafter. I tried 'btrfs check --repair' but also got a panic. I
> finally tried 'btrfs check --repair --init-extent-tree' and hit an
> assertion failed error with btrfs-progs 3.16.
>
> After noticing some promising commits, I built from the integration repo
> (kdave), re-ran (v3.16.1) and got further (2hrs) but then got stuck in
> this infinite loop.
>
> Here's the backtrace of where it is now and has been for hours:
>
> #0  0x0000000000438f01 in free_some_buffers (tree=0xda3078) at
> extent_io.c:553
> #1  __alloc_extent_buffer (blocksize=4096, bytenr=<optimized out>,
> tree=0xda3078) at extent_io.c:592
> #2  alloc_extent_buffer (tree=0xda3078, bytenr=<optimized out>,
> blocksize=4096) at extent_io.c:671
> #3  0x000000000042be29 in btrfs_find_create_tree_block
> (root=root@entry=0xda34a0, bytenr=<optimized out>, blocksize=<optimized
> out>) at disk-io.c:133
> #4  0x000000000042d683 in read_tree_block (root=0xda34a0,
> bytenr=<optimized out>, blocksize=<optimized out>,
> parent_transid=161580) at disk-io.c:260
> #5  0x0000000000427c58 in read_node_slot (root=root@entry=0xda34a0,
> parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
> #6  0x0000000000428558 in push_leaf_right (trans=trans@entry=0xe709b0,
> root=root@entry=0xda34a0, path=path@entry=0xde317a0,
> data_size=data_size@entry=67, empty=empty@entry=0)
>     at ctree.c:1608
> #7  0x0000000000428e4c in split_leaf (trans=trans@entry=0xe709b0,
> root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0,
> path=path@entry=0xde317a0,
>     data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
> #8  0x000000000042aa54 in btrfs_search_slot (trans=0xe709b0,
> root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0,
> p=p@entry=0xde317a0, ins_len=ins_len@entry=67,
>     cow=cow@entry=1) at ctree.c:1120
> #9  0x000000000042af51 in btrfs_insert_empty_items
> (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0,
> path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0,
>     data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
> #10 0x00000000004175f6 in btrfs_insert_empty_item (data_size=42,
> key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at
> ctree.h:2312
> #11 record_extent (flags=0, allocated=<optimized out>, back=0x95cb3d90,
> rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at
> cmds-check.c:4438
> #12 fixup_extent_refs (trans=trans@entry=0xe709b0, info=<optimized out>,
> extent_cache=extent_cache@entry=0x7fff24da2970,
> rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
> #13 0x000000000041ac01 in check_extent_refs
> (extent_cache=0x7fff24da2970, root=<optimized out>, trans=<optimized
> out>) at cmds-check.c:5511
> #14 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
> #15 0x000000000041bdd9 in cmd_check (argc=<optimized out>,
> argv=<optimized out>) at cmds-check.c:6723
> #16 0x0000000000404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247
>
> I checked node, node->next, node->next->next, node->next->prev, etc. and
> saw no obvious loop, at least not in the immediate vicinity of node. The
> value of node is different each time I check it.
>
> I'll periodically see the following backtrace:
>
> #0  __list_del (next=0x1326fe820, prev=0xda3088) at list.h:113
> #1  list_move_tail (head=0xda3088, list=0x1514b40f0) at list.h:183
> #2  free_some_buffers (tree=0xda3078) at extent_io.c:560
> #3  __alloc_extent_buffer (blocksize=4096, bytenr=<optimized out>,
> tree=0xda3078) at extent_io.c:592
> #4  alloc_extent_buffer (tree=0xda3078, bytenr=<optimized out>,
> blocksize=4096) at extent_io.c:671
> #5  0x000000000042be29 in btrfs_find_create_tree_block
> (root=root@entry=0xda34a0, bytenr=<optimized out>, blocksize=<optimized
> out>) at disk-io.c:133
> #6  0x000000000042d683 in read_tree_block (root=0xda34a0,
> bytenr=<optimized out>, blocksize=<optimized out>,
> parent_transid=161580) at disk-io.c:260
> #7  0x0000000000427c58 in read_node_slot (root=root@entry=0xda34a0,
> parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
> #8  0x0000000000428558 in push_leaf_right (trans=trans@entry=0xe709b0,
> root=root@entry=0xda34a0, path=path@entry=0xde317a0,
> data_size=data_size@entry=67, empty=empty@entry=0)
>     at ctree.c:1608
> #9  0x0000000000428e4c in split_leaf (trans=trans@entry=0xe709b0,
> root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0,
> path=path@entry=0xde317a0,
>     data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
> #10 0x000000000042aa54 in btrfs_search_slot (trans=0xe709b0,
> root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0,
> p=p@entry=0xde317a0, ins_len=ins_len@entry=67,
>     cow=cow@entry=1) at ctree.c:1120
> #11 0x000000000042af51 in btrfs_insert_empty_items
> (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0,
> path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0,
>     data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
> #12 0x00000000004175f6 in btrfs_insert_empty_item (data_size=42,
> key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at
> ctree.h:2312
> #13 record_extent (flags=0, allocated=<optimized out>, back=0x95cb3d90,
> rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at
> cmds-check.c:4438
> #14 fixup_extent_refs (trans=trans@entry=0xe709b0, info=<optimized out>,
> extent_cache=extent_cache@entry=0x7fff24da2970,
> rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
> #15 0x000000000041ac01 in check_extent_refs
> (extent_cache=0x7fff24da2970, root=<optimized out>, trans=<optimized
> out>) at cmds-check.c:5511
> #16 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
> #17 0x000000000041bdd9 in cmd_check (argc=<optimized out>,
> argv=<optimized out>) at cmds-check.c:6723
> #18 0x0000000000404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247
>
> If there's interest in debugging I can leave this machine in this
> condition for a few days. It's just a backup server so losing the fs
> won't be the end of the world.
>
> --Larkin
>

