* Btrfs umount hang
From: Angel Shtilianov @ 2017-08-07 14:12 UTC (permalink / raw)
To: linux-btrfs
Hi there,
I'm investigating sporadic hangs during btrfs umount. The FS is
contained in a loop-mounted file.
I have no reproduction scenario and the issue may happen once a day or
once a month. It is rare, but frustrating.
I have a crashdump (the server was manually crashed and a crashdump
collected), so I can look through the data structures.
What happens is that umount gets stuck in D state and the kernel
complains about hung tasks. We are using kernel 4.4.y; the actual
backtrace is from 4.4.70, but this happens with all the 4.4 kernels
I've used (4.4.30 through 4.4.70).
Tasks like:
INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
are blocked waiting in btrfs_tree_read_lock for a lock which, according
to the lock debugging info, is owned by task umount:31696 (itself also
blocked for more than 120 seconds).
umount is hung in "cache_block_group", see the '>' mark:
    while (cache->cached == BTRFS_CACHE_FAST) {
            struct btrfs_caching_control *ctl;

            ctl = cache->caching_ctl;
            atomic_inc(&ctl->count);
            prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
            spin_unlock(&cache->lock);

    >       schedule();

            finish_wait(&ctl->wait, &wait);
            put_caching_control(ctl);
            spin_lock(&cache->lock);
    }
The complete backtraces can be found in the attached log.
Do you have any ideas?
Any help will be greatly appreciated.
Best regards,
Angel Shtilianov
[-- Attachment #2: btrfs_hang.log --]
[-- Type: text/x-log, Size: 20177 bytes --]
Jul 21 22:30:34 smallvault22 kernel: [377419.794212] BTRFS info (device loop0): disk space caching is enabled
Jul 21 22:30:34 smallvault22 kernel: [377419.794449] BTRFS: has skinny extents
Jul 21 22:30:43 smallvault22 kernel: [377429.488153] BTRFS info (device loop2): disk space caching is enabled
Jul 21 22:30:43 smallvault22 kernel: [377429.488389] BTRFS: has skinny extents
Jul 21 22:30:44 smallvault22 kernel: [377429.710041] BTRFS info (device loop2): disk space caching is enabled
Jul 21 22:30:44 smallvault22 kernel: [377429.710279] BTRFS: has skinny extents
Jul 21 22:30:44 smallvault22 kernel: [377429.961130] BTRFS info (device loop0): disk space caching is enabled
Jul 21 22:30:44 smallvault22 kernel: [377429.961367] BTRFS: has skinny extents
Jul 21 22:32:04 smallvault22 kernel: [377510.277526] BTRFS: device fsid 7d53957c-847b-4dd6-8a70-63870a34c7f7 devid 1 transid 1222 /dev/loop0
Jul 21 22:32:04 smallvault22 kernel: [377510.294142] BTRFS info (device loop0): disk space caching is enabled
Jul 21 22:32:04 smallvault22 kernel: [377510.294387] BTRFS: has skinny extents
Jul 21 22:32:08 smallvault22 kernel: [377514.199173] BTRFS info (device loop0): new size for /dev/loop0 is 214748364800
Jul 21 22:32:09 smallvault22 kernel: [377514.722687] BTRFS info (device loop0): disk space caching is enabled
Jul 21 22:32:09 smallvault22 kernel: [377514.722924] BTRFS: has skinny extents
Jul 21 22:32:09 smallvault22 kernel: [377514.833162] INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
Jul 21 22:32:09 smallvault22 kernel: [377514.833543] Tainted: G O 4.4.70-backup2 #93
Jul 21 22:32:09 smallvault22 kernel: [377514.833767] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 21 22:32:09 smallvault22 kernel: [377514.834153] kworker/u32:9 D ffff880046142ec8 0 27574 2 0x00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.834536] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.834825] ffff880046142ec8 ffff8802598b62f0 ffff880046142e90 ffffffff00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.835468] ffff88046bb19a00 ffff88038b469a00 ffff880046144000 ffff8802598b62f0
Jul 21 22:32:09 smallvault22 kernel: [377514.836097] ffff88038b469a00 ffff88038b469a00 ffff880046142f08 ffff880046142ee0
Jul 21 22:32:09 smallvault22 kernel: [377514.836724] Call Trace:
Jul 21 22:32:09 smallvault22 kernel: [377514.836947] [<ffffffff816b41ac>] schedule+0x3c/0x90
Jul 21 22:32:09 smallvault22 kernel: [377514.837206] [<ffffffffa0ab90b5>] btrfs_tree_read_lock+0xe5/0x140 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.837437] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.837669] [<ffffffffa0a4f584>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.838054] [<ffffffffa0a544db>] btrfs_search_slot+0x70b/0x9d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.838289] [<ffffffff810ae74d>] ? trace_hardirqs_on+0xd/0x10
Jul 21 22:32:09 smallvault22 kernel: [377514.838524] [<ffffffffa0a4f16a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.838760] [<ffffffffa0a6e5a7>] btrfs_lookup_file_extent+0x37/0x40 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.838999] [<ffffffffa0a803b5>] btrfs_get_extent+0x155/0xb30 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.839249] [<ffffffffa0a99caa>] ? __set_extent_bit+0x15a/0x580 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.839491] [<ffffffffa0a9d70b>] __do_readpage+0x1ab/0x8a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.839728] [<ffffffffa0a97da3>] ? btrfs_lookup_ordered_range+0xc3/0x170 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.840126] [<ffffffffa0a80260>] ? btrfs_real_readdir+0x5b0/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.840358] [<ffffffff810ae74d>] ? trace_hardirqs_on+0xd/0x10
Jul 21 22:32:09 smallvault22 kernel: [377514.840593] [<ffffffffa0a97da3>] ? btrfs_lookup_ordered_range+0xc3/0x170 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.840982] [<ffffffffa0a9e15e>] __extent_readpages.constprop.40+0x27e/0x2a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.841382] [<ffffffffa0a80260>] ? btrfs_real_readdir+0x5b0/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.841621] [<ffffffffa0a9f1eb>] extent_readpages+0x18b/0x1e0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.841857] [<ffffffffa0a80260>] ? btrfs_real_readdir+0x5b0/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.842099] [<ffffffffa0a7d89f>] btrfs_readpages+0x1f/0x30 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.842327] [<ffffffff8116ab45>] __do_page_cache_readahead+0x275/0x310
Jul 21 22:32:09 smallvault22 kernel: [377514.842571] [<ffffffff8116aa30>] ? __do_page_cache_readahead+0x160/0x310
Jul 21 22:32:09 smallvault22 kernel: [377514.842797] [<ffffffff8116adb0>] ondemand_readahead+0x1d0/0x440
Jul 21 22:32:09 smallvault22 kernel: [377514.843023] [<ffffffff811b6cc2>] ? kmem_cache_alloc_trace+0x202/0x310
Jul 21 22:32:09 smallvault22 kernel: [377514.843259] [<ffffffff8116b1b1>] page_cache_sync_readahead+0x31/0x50
Jul 21 22:32:09 smallvault22 kernel: [377514.843497] [<ffffffffa0ac3fc1>] __load_free_space_cache+0x1c1/0x650 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.843885] [<ffffffffa0ac4540>] load_free_space_cache+0xf0/0x1b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.844130] [<ffffffffa0a597b7>] cache_block_group+0x1b7/0x3d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.844360] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.844591] [<ffffffffa0a68c9f>] find_free_extent+0x7bf/0xf40 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.844825] [<ffffffffa0a69487>] btrfs_reserve_extent+0x67/0x150 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.845059] [<ffffffffa0a69669>] btrfs_alloc_tree_block+0xf9/0x480 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.845300] [<ffffffffa0a4ff0c>] __btrfs_cow_block+0x12c/0x5a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.845536] [<ffffffffa0a50516>] btrfs_cow_block+0x116/0x2a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.845769] [<ffffffffa0a53fa6>] btrfs_search_slot+0x1d6/0x9d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.846003] [<ffffffffa0a5b175>] lookup_inline_extent_backref+0xc5/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.846389] [<ffffffff810812d2>] ? __might_sleep+0x52/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.846622] [<ffffffffa0a4f16a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.846856] [<ffffffffa0a5d701>] __btrfs_free_extent.isra.67+0xf1/0xcb0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.851247] [<ffffffffa0ac985d>] ? btrfs_merge_delayed_refs+0x8d/0x630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.851635] [<ffffffffa0a61982>] ? __btrfs_run_delayed_refs+0x192/0x1630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.852018] [<ffffffffa0a620c0>] __btrfs_run_delayed_refs+0x8d0/0x1630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.852414] [<ffffffffa0a7a2eb>] ? join_transaction.isra.13+0x2b/0x440 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.852820] [<ffffffffa0a7a3d2>] ? join_transaction.isra.13+0x112/0x440 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.853214] [<ffffffffa0a65c12>] btrfs_run_delayed_refs+0x82/0x2c0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.853452] [<ffffffffa0a65ee4>] delayed_ref_async_start+0x94/0xb0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.853688] [<ffffffffa0aae983>] normal_work_helper+0xe3/0x600 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.853924] [<ffffffffa0aaf102>] btrfs_extent_refs_helper+0x12/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.854162] [<ffffffff81073766>] process_one_work+0x1e6/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.854387] [<ffffffff810736dd>] ? process_one_work+0x15d/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.854611] [<ffffffff81073c99>] worker_thread+0x69/0x480
Jul 21 22:32:09 smallvault22 kernel: [377514.854833] [<ffffffff81073c30>] ? process_one_work+0x6b0/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.855058] [<ffffffff8107a37d>] kthread+0x11d/0x140
Jul 21 22:32:09 smallvault22 kernel: [377514.855285] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.855513] [<ffffffff816ba8ff>] ret_from_fork+0x3f/0x70
Jul 21 22:32:09 smallvault22 kernel: [377514.855735] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.855959] INFO: lockdep is turned off.
Jul 21 22:32:09 smallvault22 kernel: [377514.856188] INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
Jul 21 22:32:09 smallvault22 kernel: [377514.856562] Tainted: G O 4.4.70-backup2 #93
Jul 21 22:32:09 smallvault22 kernel: [377514.856784] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 21 22:32:09 smallvault22 kernel: [377514.857169] kworker/u32:12 D ffff88001b903a68 0 27575 2 0x00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.857532] Workqueue: btrfs-freespace-write btrfs_freespace_write_helper [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.857967] ffff88001b903a68 ffff8802598b62f0 ffff88001b903a30 ffffffff00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.858599] ffff88046bb38000 ffff88046b6eb400 ffff88001b904000 ffff8802598b62f0
Jul 21 22:32:09 smallvault22 kernel: [377514.859228] ffff88046b6eb400 ffff88046b6eb400 ffff88001b903aa8 ffff88001b903a80
Jul 21 22:32:09 smallvault22 kernel: [377514.859849] Call Trace:
Jul 21 22:32:09 smallvault22 kernel: [377514.860066] [<ffffffff816b41ac>] schedule+0x3c/0x90
Jul 21 22:32:09 smallvault22 kernel: [377514.860306] [<ffffffffa0ab90b5>] btrfs_tree_read_lock+0xe5/0x140 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.860536] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.860766] [<ffffffffa0a4f584>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.861154] [<ffffffffa0a544db>] btrfs_search_slot+0x70b/0x9d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.861384] [<ffffffff810812d2>] ? __might_sleep+0x52/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.861618] [<ffffffffa0a9190d>] btrfs_mark_extent_written+0x9d/0x9c0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.861993] [<ffffffff816b9ba7>] ? _raw_spin_unlock+0x27/0x40
Jul 21 22:32:09 smallvault22 kernel: [377514.862237] [<ffffffffa0a7a3d2>] ? join_transaction.isra.13+0x112/0x440 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.862640] [<ffffffffa0a7c15a>] ? start_transaction+0x9a/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.862874] [<ffffffffa0a84cf5>] btrfs_finish_ordered_io+0x415/0x630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.863269] [<ffffffffa0aae900>] ? normal_work_helper+0x60/0x600 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.863507] [<ffffffffa0aae900>] ? normal_work_helper+0x60/0x600 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.863742] [<ffffffffa0a851b5>] finish_ordered_fn+0x15/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.863975] [<ffffffffa0aaea5d>] normal_work_helper+0x1bd/0x600 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.864220] [<ffffffffa0aaf082>] btrfs_freespace_write_helper+0x12/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.864597] [<ffffffff81073766>] process_one_work+0x1e6/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.864820] [<ffffffff810736dd>] ? process_one_work+0x15d/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.865043] [<ffffffff81073c99>] worker_thread+0x69/0x480
Jul 21 22:32:09 smallvault22 kernel: [377514.865275] [<ffffffff81073c30>] ? process_one_work+0x6b0/0x6b0
Jul 21 22:32:09 smallvault22 kernel: [377514.865500] [<ffffffff8107a37d>] kthread+0x11d/0x140
Jul 21 22:32:09 smallvault22 kernel: [377514.865722] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.865946] [<ffffffff816ba8ff>] ret_from_fork+0x3f/0x70
Jul 21 22:32:09 smallvault22 kernel: [377514.866180] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.866404] INFO: lockdep is turned off.
Jul 21 22:32:09 smallvault22 kernel: [377514.866625] INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
Jul 21 22:32:09 smallvault22 kernel: [377514.866997] Tainted: G O 4.4.70-backup2 #93
Jul 21 22:32:09 smallvault22 kernel: [377514.867228] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 21 22:32:09 smallvault22 kernel: [377514.867603] btrfs-transacti D ffff880235483958 0 31625 2 0x00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.867951] ffff880235483958 ffff88000460ad90 ffff880235483920 ffffffff00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.868580] ffff88046bb38000 ffff88046a833400 ffff880235484000 ffff88000460ad90
Jul 21 22:32:09 smallvault22 kernel: [377514.869209] ffff88046a833400 ffff88046a833400 ffff880235483998 ffff880235483970
Jul 21 22:32:09 smallvault22 kernel: [377514.869828] Call Trace:
Jul 21 22:32:09 smallvault22 kernel: [377514.870044] [<ffffffff816b41ac>] schedule+0x3c/0x90
Jul 21 22:32:09 smallvault22 kernel: [377514.870285] [<ffffffffa0ab90b5>] btrfs_tree_read_lock+0xe5/0x140 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.870514] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.870743] [<ffffffffa0a4f584>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.871135] [<ffffffffa0a544db>] btrfs_search_slot+0x70b/0x9d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.871363] [<ffffffff810ae526>] ? trace_hardirqs_on_caller+0x16/0x230
Jul 21 22:32:09 smallvault22 kernel: [377514.871595] [<ffffffffa0a5b175>] lookup_inline_extent_backref+0xc5/0x5b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.871969] [<ffffffff810812d2>] ? __might_sleep+0x52/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.872209] [<ffffffffa0a4f16a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.872444] [<ffffffffa0a5d701>] __btrfs_free_extent.isra.67+0xf1/0xcb0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.872844] [<ffffffffa0ac985d>] ? btrfs_merge_delayed_refs+0x8d/0x630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.873236] [<ffffffffa0a61982>] ? __btrfs_run_delayed_refs+0x192/0x1630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.873619] [<ffffffffa0a620c0>] __btrfs_run_delayed_refs+0x8d0/0x1630 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.873992] [<ffffffff810d4ea4>] ? lock_timer_base.isra.35+0x54/0x70
Jul 21 22:32:09 smallvault22 kernel: [377514.874238] [<ffffffffa0a7a2eb>] ? join_transaction.isra.13+0x2b/0x440 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.874622] [<ffffffffa0a65c12>] btrfs_run_delayed_refs+0x82/0x2c0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.874856] [<ffffffffa0a7b5f3>] btrfs_commit_transaction+0x43/0xb10 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.875247] [<ffffffffa0a769ac>] transaction_kthread+0x22c/0x290 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.875483] [<ffffffffa0a76780>] ? btrfs_cleanup_transaction+0x650/0x650 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.875857] [<ffffffff8107a37d>] kthread+0x11d/0x140
Jul 21 22:32:09 smallvault22 kernel: [377514.876083] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.876315] [<ffffffff816ba8ff>] ret_from_fork+0x3f/0x70
Jul 21 22:32:09 smallvault22 kernel: [377514.876537] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.876760] INFO: lockdep is turned off.
Jul 21 22:32:09 smallvault22 kernel: [377514.876977] INFO: task umount:31696 blocked for more than 120 seconds.
Jul 21 22:32:09 smallvault22 kernel: [377514.877210] Tainted: G O 4.4.70-backup2 #93
Jul 21 22:32:09 smallvault22 kernel: [377514.877432] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 21 22:32:09 smallvault22 kernel: [377514.877804] umount D ffff88025234b7b8 0 31696 31315 0x00000000
Jul 21 22:32:09 smallvault22 kernel: [377514.878163] ffff88025234b7b8 ffff8800411adcb0 ffff88025234b780 0000000000000000
Jul 21 22:32:09 smallvault22 kernel: [377514.878781] ffff88046bb1b400 ffff88044b59ce00 ffff88025234c000 ffff88018289b840
Jul 21 22:32:09 smallvault22 kernel: [377514.879408] ffff88001910a400 ffff8800411adcb0 ffff8800411adc00 ffff88025234b7d0
Jul 21 22:32:09 smallvault22 kernel: [377514.880025] Call Trace:
Jul 21 22:32:09 smallvault22 kernel: [377514.880250] [<ffffffff816b41ac>] schedule+0x3c/0x90
Jul 21 22:32:09 smallvault22 kernel: [377514.880482] [<ffffffffa0a59736>] cache_block_group+0x136/0x3d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.880707] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
Jul 21 22:32:09 smallvault22 kernel: [377514.880936] [<ffffffffa0a68c9f>] find_free_extent+0x7bf/0xf40 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.881180] [<ffffffffa0a69487>] btrfs_reserve_extent+0x67/0x150 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.881416] [<ffffffffa0a69669>] btrfs_alloc_tree_block+0xf9/0x480 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.881648] [<ffffffffa0a4ff0c>] __btrfs_cow_block+0x12c/0x5a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.881879] [<ffffffffa0a50516>] btrfs_cow_block+0x116/0x2a0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.882121] [<ffffffffa0a53fa6>] btrfs_search_slot+0x1d6/0x9d0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.882357] [<ffffffffa0a703bf>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.882606] [<ffffffffa0a7e7de>] btrfs_update_inode_item+0x4e/0x100 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.882839] [<ffffffffa0a8181b>] btrfs_update_inode+0x6b/0x100 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.883071] [<ffffffffa0a66798>] cache_save_setup+0x98/0x3b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.883313] [<ffffffffa0a69d9a>] ? btrfs_start_dirty_block_groups+0x1aa/0x3b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.883695] [<ffffffffa0a69da9>] btrfs_start_dirty_block_groups+0x1b9/0x3b0 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.884077] [<ffffffffa0a7b6fe>] btrfs_commit_transaction+0x14e/0xb10 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.884467] [<ffffffffa0a45bc4>] btrfs_sync_fs+0x84/0x200 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.884692] [<ffffffff8120ff41>] sync_filesystem+0x71/0xa0
Jul 21 22:32:09 smallvault22 kernel: [377514.884914] [<ffffffff811d7d47>] generic_shutdown_super+0x27/0x100
Jul 21 22:32:09 smallvault22 kernel: [377514.885145] [<ffffffff811d8092>] kill_anon_super+0x12/0x20
Jul 21 22:32:09 smallvault22 kernel: [377514.885375] [<ffffffffa0a49b88>] btrfs_kill_super+0x18/0x120 [btrfs]
Jul 21 22:32:09 smallvault22 kernel: [377514.885599] [<ffffffff811d8323>] deactivate_locked_super+0x43/0x70
Jul 21 22:32:09 smallvault22 kernel: [377514.885822] [<ffffffff811d877c>] deactivate_super+0x5c/0x60
Jul 21 22:32:09 smallvault22 kernel: [377514.886045] [<ffffffff811f8e5f>] cleanup_mnt+0x3f/0x90
Jul 21 22:32:09 smallvault22 kernel: [377514.886274] [<ffffffff811f8ef2>] __cleanup_mnt+0x12/0x20
Jul 21 22:32:09 smallvault22 kernel: [377514.886497] [<ffffffff810787e1>] task_work_run+0x81/0xc0
Jul 21 22:32:09 smallvault22 kernel: [377514.886721] [<ffffffff81002230>] exit_to_usermode_loop+0xc0/0xd0
Jul 21 22:32:09 smallvault22 kernel: [377514.886943] [<ffffffff81002c33>] syscall_return_slowpath+0xf3/0x220
Jul 21 22:32:09 smallvault22 kernel: [377514.887177] [<ffffffff816ba6b9>] int_ret_from_sys_call+0x25/0x9f
* Re: Btrfs umount hang
From: Nikolay Borisov @ 2017-08-07 15:30 UTC (permalink / raw)
To: Angel Shtilianov, linux-btrfs
On 7.08.2017 17:12, Angel Shtilianov wrote:
> Hi there,
> I'm investigating sporadic hanging during btrfs umount. The FS is
> contained in a loop mounted file.
> I have no reproduction scenario and the issue may happen once a day or
> once a month. It is rare, but frustrating.
> I have a crashdump (the server has been manually crashed and collected
> a crashdump), so I could take look through the data structures.
> What happens is that umount is getting in D state and a the kernel
> complains about hung tasks. We are using kernel 4.4.y The actual back
> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
> used (4.4.30 through 4.4.70).
> Tasks like:
> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
> are getting blocked waiting for btrfs_tree_read_lock, which is owned
> by task umount:31696 (which is also blocked for more than 120 seconds)
> regarding the lock debug.
>
> umount is hung in "cache_block_group", see the '>' mark:
> while (cache->cached == BTRFS_CACHE_FAST) {
> struct btrfs_caching_control *ctl;
>
> ctl = cache->caching_ctl;
> atomic_inc(&ctl->count);
> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
> spin_unlock(&cache->lock);
>
>> schedule();
>
> finish_wait(&ctl->wait, &wait);
> put_caching_control(ctl);
> spin_lock(&cache->lock);
> }
>
> The complete backtraces could be found in the attached log.
>
> Do you have any ideas ?
> Any help will be greatly appreciated.
So by the looks of it, while writing the dirty block groups and needing
a free block for the CoW process, cache_block_group() kicks off a caching
thread which should just go and read in the respective block group. The
newly spawned caching_thread should then wake up cache_block_group,
either on success, once it manages to find 2 MiB:

    if (total_found > (1024 * 1024 * 2)) {

or, on failure, after the out label. In both cases it sets cache->cached
to something other than BTRFS_CACHE_FAST, so the loop ought to exit.
But from your description of the issue I take it the process never comes
back from schedule(), meaning it missed the wakeup from caching_thread,
and at the moment I cannot see how that could happen. Can you print the
state of the 'cache' parameter of cache_block_group?
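
Just to spell out the handshake I'm describing, here is a minimal
userspace model of it (plain pthreads standing in for the wait queue and
cache->lock; the names only mirror the kernel ones - this is not btrfs
code, just the shape of the intended wait/wake pairing):

#include <pthread.h>
#include <stdio.h>

enum cache_state { CACHE_FAST, CACHE_STARTED, CACHE_FINISHED, CACHE_ERROR };

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;  /* ~ cache->lock */
static pthread_cond_t  cache_wait = PTHREAD_COND_INITIALIZER;   /* ~ caching_ctl->wait */
static enum cache_state cached = CACHE_FAST;

/* caching thread analogue: finishes (or fails) and wakes the waiter */
static void *cacher(void *arg)
{
	pthread_mutex_lock(&cache_lock);
	cached = CACHE_FINISHED;            /* or CACHE_ERROR on failure */
	pthread_cond_broadcast(&cache_wait);/* ~ wake_up(&caching_ctl->wait) */
	pthread_mutex_unlock(&cache_lock);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, cacher, NULL);

	/* waiter analogue of the loop quoted above */
	pthread_mutex_lock(&cache_lock);
	while (cached == CACHE_FAST)
		pthread_cond_wait(&cache_wait, &cache_lock);
	pthread_mutex_unlock(&cache_lock);

	pthread_join(t, NULL);
	printf("cached state moved past CACHE_FAST, loop exited\n");
	return 0;
}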
>
> Best regards,
> Angel Shtilianov
>
* Re: Btrfs umount hang
From: Jeff Mahoney @ 2017-08-07 17:19 UTC (permalink / raw)
To: Angel Shtilianov, linux-btrfs
On 8/7/17 10:12 AM, Angel Shtilianov wrote:
> Hi there,
> I'm investigating sporadic hanging during btrfs umount. The FS is
> contained in a loop mounted file.
> I have no reproduction scenario and the issue may happen once a day or
> once a month. It is rare, but frustrating.
> I have a crashdump (the server has been manually crashed and collected
> a crashdump), so I could take look through the data structures.
> What happens is that umount is getting in D state and a the kernel
> complains about hung tasks. We are using kernel 4.4.y The actual back
> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
> used (4.4.30 through 4.4.70).
> Tasks like:
> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
> are getting blocked waiting for btrfs_tree_read_lock, which is owned
> by task umount:31696 (which is also blocked for more than 120 seconds)
> regarding the lock debug.
>
> umount is hung in "cache_block_group", see the '>' mark:
> while (cache->cached == BTRFS_CACHE_FAST) {
> struct btrfs_caching_control *ctl;
>
> ctl = cache->caching_ctl;
> atomic_inc(&ctl->count);
> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
> spin_unlock(&cache->lock);
>
>> schedule();
>
> finish_wait(&ctl->wait, &wait);
> put_caching_control(ctl);
> spin_lock(&cache->lock);
> }
>
> The complete backtraces could be found in the attached log.
>
> Do you have any ideas ?
Hi Angel -
In your log, it says lockdep is disabled. What tripped it earlier?
Lockdep really should be catching locking deadlocks in situations like
this, if that's really the underlying cause.
-Jeff
--
Jeff Mahoney
SUSE Labs
* Re: Btrfs umount hang
From: Jeff Mahoney @ 2017-08-07 18:10 UTC (permalink / raw)
To: Angel Shtilianov, linux-btrfs
On 8/7/17 1:19 PM, Jeff Mahoney wrote:
> On 8/7/17 10:12 AM, Angel Shtilianov wrote:
>> Hi there,
>> I'm investigating sporadic hanging during btrfs umount. The FS is
>> contained in a loop mounted file.
>> I have no reproduction scenario and the issue may happen once a day or
>> once a month. It is rare, but frustrating.
>> I have a crashdump (the server has been manually crashed and collected
>> a crashdump), so I could take look through the data structures.
>> What happens is that umount is getting in D state and a the kernel
>> complains about hung tasks. We are using kernel 4.4.y The actual back
>> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
>> used (4.4.30 through 4.4.70).
>> Tasks like:
>> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
>> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
>> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
>> are getting blocked waiting for btrfs_tree_read_lock, which is owned
>> by task umount:31696 (which is also blocked for more than 120 seconds)
>> regarding the lock debug.
>>
>> umount is hung in "cache_block_group", see the '>' mark:
>> while (cache->cached == BTRFS_CACHE_FAST) {
>> struct btrfs_caching_control *ctl;
>>
>> ctl = cache->caching_ctl;
>> atomic_inc(&ctl->count);
>> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
>> spin_unlock(&cache->lock);
>>
>>> schedule();
>>
>> finish_wait(&ctl->wait, &wait);
>> put_caching_control(ctl);
>> spin_lock(&cache->lock);
>> }
>>
>> The complete backtraces could be found in the attached log.
>>
>> Do you have any ideas ?
>
> Hi Angel -
>
> In your log, it says lockdep is disabled. What tripped it earlier?
> Lockdep really should be catching locking deadlocks in situations like
> this, if that's really the underlying cause.
Actually, I'm not sure if lockdep would catch this one. Here's my
hypothesis:
kworker/u32:9 is waiting on a read lock while reading the free space
cache, which means it owns the cache->cached value and will issue the
wakeup when it completes.
umount is waiting for the wakeup from kworker/u32:9 but is holding
some tree locks in write mode.
If kworker/u32:9 is waiting on the locks that umount holds, we have a
deadlock.
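
To make the shape of that concrete, here is a minimal userspace model of
the inversion (plain pthreads as an analogy only - the rwlock stands in
for the tree lock and the condvar for caching_ctl->wait; none of this is
btrfs code):

#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_rwlock_t tree_lock = PTHREAD_RWLOCK_INITIALIZER; /* ~ the eb tree lock */
static pthread_mutex_t  mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   done = PTHREAD_COND_INITIALIZER;        /* ~ caching_ctl->wait */
static int cached;

/* kworker analogue: must read-lock the tree before it can finish and wake the waiter */
static void *worker(void *arg)
{
	pthread_rwlock_rdlock(&tree_lock);  /* blocks while "umount" holds the write lock */
	pthread_mutex_lock(&mtx);
	cached = 1;
	pthread_cond_signal(&done);
	pthread_mutex_unlock(&mtx);
	pthread_rwlock_unlock(&tree_lock);
	return NULL;
}

int main(void)
{
	pthread_t t;
	struct timespec ts;

	/* umount analogue: write-locks the tree, then waits for the "caching" to finish */
	pthread_rwlock_wrlock(&tree_lock);
	pthread_create(&t, NULL, worker, NULL);

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += 2;
	pthread_mutex_lock(&mtx);
	while (!cached) {
		if (pthread_cond_timedwait(&done, &mtx, &ts)) {
			printf("stuck: the worker can't signal, it needs the lock we hold\n");
			break;
		}
	}
	pthread_mutex_unlock(&mtx);

	pthread_rwlock_unlock(&tree_lock);  /* only now can the worker make progress */
	pthread_join(t, NULL);
	return 0;
}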
Can you dump the extent buffer that kworker/u32:9 is waiting on? Part
of that will contain the PID of the holder, and if it matches umount,
we've found the cause.
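
Something along these lines from the crash prompt should do it (the eb
address is a placeholder you would pick out of the btrfs_tree_read_lock
frame in the raw stack; member names are from the 4.4 struct
extent_buffer, exact crash syntax from memory):

crash> bt -f 27574
crash> struct extent_buffer.lock_owner,write_locks,read_locks,blocking_writers <eb address>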
-Jeff
--
Jeff Mahoney
SUSE Labs
* Re: Btrfs umount hang
From: Angel Shtilianov @ 2017-08-08 11:30 UTC (permalink / raw)
To: Jeff Mahoney; +Cc: Angel Shtilianov, linux-btrfs
crash> bt -f 31625
PID: 31625 TASK: ffff88046a833400 CPU: 7 COMMAND: "btrfs-transacti"
wants to acquire struct extent_buffer ffff88000460aca0 lock, whose
lock_owner is 27574.
here is pid 27574:
PID: 27574 TASK: ffff88038b469a00 CPU: 4 COMMAND: "kworker/u32:9"
which in turn is trying to acquire eb lock ffff8802598b6200, and here
the owner is 31696.
31696 is
PID: 31696 TASK: ffff88044b59ce00 CPU: 5 COMMAND: "umount"
So this is definitely a deadlock: umount holds a lock the workers need
in order to complete, while waiting for them to complete.
Lockdep wouldn't complain about that.
I still have to investigate what triggered/disabled lockdep earlier;
I need to obtain the log from the machine, and that will take some time.
Jeff, you were right.
Could you help demystify how we ended up here?
Best regards,
Angel
On Mon, Aug 7, 2017 at 9:10 PM, Jeff Mahoney <jeffm@suse.com> wrote:
> On 8/7/17 1:19 PM, Jeff Mahoney wrote:
>> On 8/7/17 10:12 AM, Angel Shtilianov wrote:
>>> Hi there,
>>> I'm investigating sporadic hanging during btrfs umount. The FS is
>>> contained in a loop mounted file.
>>> I have no reproduction scenario and the issue may happen once a day or
>>> once a month. It is rare, but frustrating.
>>> I have a crashdump (the server has been manually crashed and collected
>>> a crashdump), so I could take look through the data structures.
>>> What happens is that umount is getting in D state and a the kernel
>>> complains about hung tasks. We are using kernel 4.4.y The actual back
>>> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
>>> used (4.4.30 through 4.4.70).
>>> Tasks like:
>>> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
>>> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
>>> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
>>> are getting blocked waiting for btrfs_tree_read_lock, which is owned
>>> by task umount:31696 (which is also blocked for more than 120 seconds)
>>> regarding the lock debug.
>>>
>>> umount is hung in "cache_block_group", see the '>' mark:
>>> while (cache->cached == BTRFS_CACHE_FAST) {
>>> struct btrfs_caching_control *ctl;
>>>
>>> ctl = cache->caching_ctl;
>>> atomic_inc(&ctl->count);
>>> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
>>> spin_unlock(&cache->lock);
>>>
>>>> schedule();
>>>
>>> finish_wait(&ctl->wait, &wait);
>>> put_caching_control(ctl);
>>> spin_lock(&cache->lock);
>>> }
>>>
>>> The complete backtraces could be found in the attached log.
>>>
>>> Do you have any ideas ?
>>
>> Hi Angel -
>>
>> In your log, it says lockdep is disabled. What tripped it earlier?
>> Lockdep really should be catching locking deadlocks in situations like
>> this, if that's really the underlying cause.
>
> Actually, I'm not sure if lockdep would catch this one. Here's my
> hypothesis:
>
> kworker/u32:9 is waiting on a read lock while reading the free space
> cache, which means it owns the cache->cached value and will issue the
> wakeup when it completes.
>
> umount is waiting on for the wakeup from kworker/u32:9 but is holding
> some tree locks in write mode.
>
> If kworker/u32:9 is waiting on the locks that umount holds, we have a
> deadlock.
>
> Can you dump the extent buffer that kworker/u32:9 is waiting on? Part
> of that will contain the PID of the holder, and if matches umount, we
> found the cause.
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs
>
* Re: Btrfs umount hang
From: Jeff Mahoney @ 2017-08-09 15:42 UTC (permalink / raw)
To: Angel Shtilianov; +Cc: linux-btrfs
On 8/8/17 7:30 AM, Angel Shtilianov wrote:
> crash> bt -f 31625
> PID: 31625 TASK: ffff88046a833400 CPU: 7 COMMAND: "btrfs-transacti"
> wants to acquire struct extent_buffer ffff88000460aca0 lock, whose
> lock_owner is 27574.
>
> here is pid 27574:
> PID: 27574 TASK: ffff88038b469a00 CPU: 4 COMMAND: "kworker/u32:9"
> which is also is trying to acquire eb lock ffff8802598b6200, and here
> the owner is 31696.
>
> 31696 is
> PID: 31696 TASK: ffff88044b59ce00 CPU: 5 COMMAND: "umount"
>
> So definitely here is a kind of deadlock.
> umount holds the lock needed by the workers to complete and waits them
> to complete.
> Lockdep wouldn't complain about that.
> I am still about to investigate what has previously triggered/disabled lockdep.
> I have to obtain the log from the machine, but I need some time to get it.
>
> Jeff, you were right.
> Could you help demystifying how we ended up here?
Hi Angel -
It looks like a regression introduced by 291c7d2f5, but that's a very
old commit. As that commit says, it's a rare occurrence to hit that
wait, and that's probably why we haven't seen this issue sooner.
There's potential for this to happen whenever two threads are modifying
the tree at once and one needs to find a free extent. I'll need to
think a bit on how to fix it.
-Jeff
> Best regards,
> Angel
>
> On Mon, Aug 7, 2017 at 9:10 PM, Jeff Mahoney <jeffm@suse.com> wrote:
>> On 8/7/17 1:19 PM, Jeff Mahoney wrote:
>>> On 8/7/17 10:12 AM, Angel Shtilianov wrote:
>>>> Hi there,
>>>> I'm investigating sporadic hanging during btrfs umount. The FS is
>>>> contained in a loop mounted file.
>>>> I have no reproduction scenario and the issue may happen once a day or
>>>> once a month. It is rare, but frustrating.
>>>> I have a crashdump (the server has been manually crashed and collected
>>>> a crashdump), so I could take look through the data structures.
>>>> What happens is that umount is getting in D state and a the kernel
>>>> complains about hung tasks. We are using kernel 4.4.y The actual back
>>>> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
>>>> used (4.4.30 through 4.4.70).
>>>> Tasks like:
>>>> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
>>>> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
>>>> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
>>>> are getting blocked waiting for btrfs_tree_read_lock, which is owned
>>>> by task umount:31696 (which is also blocked for more than 120 seconds)
>>>> regarding the lock debug.
>>>>
>>>> umount is hung in "cache_block_group", see the '>' mark:
>>>> while (cache->cached == BTRFS_CACHE_FAST) {
>>>> struct btrfs_caching_control *ctl;
>>>>
>>>> ctl = cache->caching_ctl;
>>>> atomic_inc(&ctl->count);
>>>> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
>>>> spin_unlock(&cache->lock);
>>>>
>>>>> schedule();
>>>>
>>>> finish_wait(&ctl->wait, &wait);
>>>> put_caching_control(ctl);
>>>> spin_lock(&cache->lock);
>>>> }
>>>>
>>>> The complete backtraces could be found in the attached log.
>>>>
>>>> Do you have any ideas ?
>>>
>>> Hi Angel -
>>>
>>> In your log, it says lockdep is disabled. What tripped it earlier?
>>> Lockdep really should be catching locking deadlocks in situations like
>>> this, if that's really the underlying cause.
>>
>> Actually, I'm not sure if lockdep would catch this one. Here's my
>> hypothesis:
>>
>> kworker/u32:9 is waiting on a read lock while reading the free space
>> cache, which means it owns the cache->cached value and will issue the
>> wakeup when it completes.
>>
>> umount is waiting on for the wakeup from kworker/u32:9 but is holding
>> some tree locks in write mode.
>>
>> If kworker/u32:9 is waiting on the locks that umount holds, we have a
>> deadlock.
>>
>> Can you dump the extent buffer that kworker/u32:9 is waiting on? Part
>> of that will contain the PID of the holder, and if matches umount, we
>> found the cause.
>>
>> -Jeff
>>
>> --
>> Jeff Mahoney
>> SUSE Labs
>>
--
Jeff Mahoney
SUSE Labs
* Re: Btrfs umount hang
From: Angel Shtilianov @ 2017-08-09 22:30 UTC (permalink / raw)
To: Jeff Mahoney; +Cc: Angel Shtilianov, linux-btrfs
Hi Jeff,
That's exactly what I am observing in the backtraces.
umount reaches cache_block_group by passing through btrfs_search_slot,
which by the looks of it took the write lock (to be confirmed tomorrow
when I'm in the office), and all the other btrfs worker threads are
fighting for the read lock in btrfs_tree_read_lock.
umount is hanging on the cache_block_group wait, all the others are
hanging on the read lock. So caching seems to be the bottleneck
blocking everything else. Good catch! I'll take a look at the commit
tomorrow morning (with a large cup of coffee, of course).
By the way, looking through the logs I can see one more pattern -
umount after a snapshot drop reaches a similar state. The common element
is that the snapshot drop also calls btrfs_search_slot, takes the read
lock, drops it and takes the write lock, while at the same time umount
triggers a concurrent fs sync. I've seen at least three such hangs today
and have started copying the corresponding crashdumps to my machine.
Tomorrow I'll take a look at them. Again the common element seems to be
that umount reaches the same place - cache_block_group - and hangs.
Tomorrow I'll also check the workers' state in the crash dumps.
Thank you,
Angel
On Wed, Aug 9, 2017 at 6:42 PM, Jeff Mahoney <jeffm@suse.com> wrote:
> On 8/8/17 7:30 AM, Angel Shtilianov wrote:
>> crash> bt -f 31625
>> PID: 31625 TASK: ffff88046a833400 CPU: 7 COMMAND: "btrfs-transacti"
>> wants to acquire struct extent_buffer ffff88000460aca0 lock, whose
>> lock_owner is 27574.
>>
>> here is pid 27574:
>> PID: 27574 TASK: ffff88038b469a00 CPU: 4 COMMAND: "kworker/u32:9"
>> which is also is trying to acquire eb lock ffff8802598b6200, and here
>> the owner is 31696.
>>
>> 31696 is
>> PID: 31696 TASK: ffff88044b59ce00 CPU: 5 COMMAND: "umount"
>>
>> So definitely here is a kind of deadlock.
>> umount holds the lock needed by the workers to complete and waits them
>> to complete.
>> Lockdep wouldn't complain about that.
>> I am still about to investigate what has previously triggered/disabled lockdep.
>> I have to obtain the log from the machine, but I need some time to get it.
>>
>> Jeff, you were right.
>> Could you help demystifying how we ended up here?
>
> Hi Angel -
>
> It looks like a regression introduced by 291c7d2f5, but that's a very
> old commit. As that commit says, it's a rare occurence to hit that
> wait, and that's probably why we haven't seen this issue sooner.
>
> There's potential for this to happen whenever two threads are modifying
> the tree at once and one needs to find a free extent. I'll need to
> think a bit on how to fix it.
>
> -Jeff
>
>> Best regards,
>> Angel
>>
>> On Mon, Aug 7, 2017 at 9:10 PM, Jeff Mahoney <jeffm@suse.com> wrote:
>>> On 8/7/17 1:19 PM, Jeff Mahoney wrote:
>>>> On 8/7/17 10:12 AM, Angel Shtilianov wrote:
>>>>> Hi there,
>>>>> I'm investigating sporadic hanging during btrfs umount. The FS is
>>>>> contained in a loop mounted file.
>>>>> I have no reproduction scenario and the issue may happen once a day or
>>>>> once a month. It is rare, but frustrating.
>>>>> I have a crashdump (the server has been manually crashed and collected
>>>>> a crashdump), so I could take look through the data structures.
>>>>> What happens is that umount is getting in D state and a the kernel
>>>>> complains about hung tasks. We are using kernel 4.4.y The actual back
>>>>> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
>>>>> used (4.4.30 through 4.4.70).
>>>>> Tasks like:
>>>>> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
>>>>> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
>>>>> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
>>>>> are getting blocked waiting for btrfs_tree_read_lock, which is owned
>>>>> by task umount:31696 (which is also blocked for more than 120 seconds)
>>>>> regarding the lock debug.
>>>>>
>>>>> umount is hung in "cache_block_group", see the '>' mark:
>>>>> while (cache->cached == BTRFS_CACHE_FAST) {
>>>>> struct btrfs_caching_control *ctl;
>>>>>
>>>>> ctl = cache->caching_ctl;
>>>>> atomic_inc(&ctl->count);
>>>>> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
>>>>> spin_unlock(&cache->lock);
>>>>>
>>>>>> schedule();
>>>>>
>>>>> finish_wait(&ctl->wait, &wait);
>>>>> put_caching_control(ctl);
>>>>> spin_lock(&cache->lock);
>>>>> }
>>>>>
>>>>> The complete backtraces could be found in the attached log.
>>>>>
>>>>> Do you have any ideas ?
>>>>
>>>> Hi Angel -
>>>>
>>>> In your log, it says lockdep is disabled. What tripped it earlier?
>>>> Lockdep really should be catching locking deadlocks in situations like
>>>> this, if that's really the underlying cause.
>>>
>>> Actually, I'm not sure if lockdep would catch this one. Here's my
>>> hypothesis:
>>>
>>> kworker/u32:9 is waiting on a read lock while reading the free space
>>> cache, which means it owns the cache->cached value and will issue the
>>> wakeup when it completes.
>>>
>>> umount is waiting on for the wakeup from kworker/u32:9 but is holding
>>> some tree locks in write mode.
>>>
>>> If kworker/u32:9 is waiting on the locks that umount holds, we have a
>>> deadlock.
>>>
>>> Can you dump the extent buffer that kworker/u32:9 is waiting on? Part
>>> of that will contain the PID of the holder, and if matches umount, we
>>> found the cause.
>>>
>>> -Jeff
>>>
>>> --
>>> Jeff Mahoney
>>> SUSE Labs
>>>
>
>
> --
> Jeff Mahoney
> SUSE Labs
>
* Re: Btrfs umount hang
From: Angel Shtilianov @ 2017-08-10 14:49 UTC (permalink / raw)
To: Angel Shtilianov; +Cc: Jeff Mahoney, linux-btrfs
Here is a fresh snapshot hang example:
kworker/u24:0 is blocked waiting in btrfs_tree_read_lock, again
underneath btrfs_search_slot after it has gone through
cache_block_group.
The other two tasks are blocked at the same place in cache_block_group,
so I can confirm it hangs the same way as with umount.
Here is the struct extent_buffer passed to btrfs_tree_read_lock for
kworker/u24:0:
struct extent_buffer {
...
lock_owner = 23286,
write_locks = {
counter = 1
},
read_locks = {
counter = 0
},
blocking_writers = {
counter = 1
},
..
}
So again the ioctl has taken the lock needed for its underlying operation.
The btrfs task with pid 23286 is going to perform a write, so it can't
skip locking - and not just read locking, but write locking.
The fact that it owns the write lock is fine; skip_locking in its
btrfs_path struct is zero, so up to this point everything is correctly
locked. The problem arises later, if the cache isn't available and we
end up calling cache_block_group. Even if it does not hang on the wait
at the beginning of cache_block_group (as in the current case):
    while (cache->cached == BTRFS_CACHE_FAST) {
            struct btrfs_caching_control *ctl;

            ctl = cache->caching_ctl;
            atomic_inc(&ctl->count);
            prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
            spin_unlock(&cache->lock);

    >       schedule();

            finish_wait(&ctl->wait, &wait);
            put_caching_control(ctl);
            spin_lock(&cache->lock);
    }
it seems there is a high chance to hang later in btrfs_tree_read_lock,
even with absolutely no concurrent work going on, via the chain:

    cache_block_group
      -> load_free_space_cache
        -> lookup_free_space_inode
          -> __lookup_free_space_inode
            -> btrfs_search_slot
              -> btrfs_read_lock_root_node
                -> btrfs_tree_read_lock

since we are already holding the write lock. In other words, if we call
cache_block_group from find_free_extent while already underneath
btrfs_search_slot, we can deadlock as well whenever the data isn't
cached.
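
As a userspace analogy of that single-task concern (plain POSIX rwlocks;
the btrfs tree locks are a custom implementation with their own nesting
rules, so take this purely as an illustration of the pattern, not proof
that the kernel path deadlocks):

#include <pthread.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	pthread_rwlock_t root_lock;  /* stands in for the tree root extent_buffer lock */
	int err;

	pthread_rwlock_init(&root_lock, NULL);
	pthread_rwlock_wrlock(&root_lock);   /* outer btrfs_search_slot: write lock held */

	/* nested btrfs_search_slot via load_free_space_cache: wants a read lock;
	 * a blocking rdlock here would simply deadlock the task against itself */
	err = pthread_rwlock_tryrdlock(&root_lock);
	if (err)
		printf("nested read lock refused: %s\n", strerror(err));
	else
		pthread_rwlock_unlock(&root_lock);  /* not expected with a standard rwlock */

	pthread_rwlock_unlock(&root_lock);
	return 0;
}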
It looks scary. I hope that I've missed something.
Best regards,
Angel
On Thu, Aug 10, 2017 at 1:30 AM, Angel Shtilianov <kernel@kyup.com> wrote:
> Hi Jeff,
> That's exactly what I am observing in the backtraces.
> umount reaches cache_block_group passing through btrfs_search_slot,
> which looks like got the wr_lock (to be confirmed tomorrow when I'm in
> the office) and all other btrfs worker threads are fighting for the
> btrfs_tree_read_lock.
> Umount is hanging on the cache_block_group wait, all others are
> hanging on the read_lock. So caching seems to be the bottleneck,
> blocking all others. Good catch! I'll take a look at the commit
> tomorrow morning (with large cup of coffee of course).
>
> By the way, looking through the logs I can see one more pattern -
> umount after drop snapshot reaches similar state. The common is that
> drop snapshot will also call search slot, get the readlock, drop it
> and get the write lock. In the same time there is umount causing
> fs_sync going concurently. I've seen at least three such hangs today
> and I've started the corresponding crashdumps to get copied to my
> machine. Tomorrow I'll take a look at them. It seems the common is
> that umount again reaches to the same place - cache_block_group and
> hangs. Tomorrow I'll take a look at the crash dump for the workers
> state.
>
> Thank you,
> Angel
>
> On Wed, Aug 9, 2017 at 6:42 PM, Jeff Mahoney <jeffm@suse.com> wrote:
>> On 8/8/17 7:30 AM, Angel Shtilianov wrote:
>>> crash> bt -f 31625
>>> PID: 31625 TASK: ffff88046a833400 CPU: 7 COMMAND: "btrfs-transacti"
>>> wants to acquire struct extent_buffer ffff88000460aca0 lock, whose
>>> lock_owner is 27574.
>>>
>>> here is pid 27574:
>>> PID: 27574 TASK: ffff88038b469a00 CPU: 4 COMMAND: "kworker/u32:9"
>>> which is also is trying to acquire eb lock ffff8802598b6200, and here
>>> the owner is 31696.
>>>
>>> 31696 is
>>> PID: 31696 TASK: ffff88044b59ce00 CPU: 5 COMMAND: "umount"
>>>
>>> So definitely here is a kind of deadlock.
>>> umount holds the lock needed by the workers to complete and waits them
>>> to complete.
>>> Lockdep wouldn't complain about that.
>>> I am still about to investigate what has previously triggered/disabled lockdep.
>>> I have to obtain the log from the machine, but I need some time to get it.
>>>
>>> Jeff, you were right.
>>> Could you help demystifying how we ended up here?
>>
>> Hi Angel -
>>
>> It looks like a regression introduced by 291c7d2f5, but that's a very
>> old commit. As that commit says, it's a rare occurence to hit that
>> wait, and that's probably why we haven't seen this issue sooner.
>>
>> There's potential for this to happen whenever two threads are modifying
>> the tree at once and one needs to find a free extent. I'll need to
>> think a bit on how to fix it.
>>
>> -Jeff
>>
>>> Best regards,
>>> Angel
>>>
>>> On Mon, Aug 7, 2017 at 9:10 PM, Jeff Mahoney <jeffm@suse.com> wrote:
>>>> On 8/7/17 1:19 PM, Jeff Mahoney wrote:
>>>>> On 8/7/17 10:12 AM, Angel Shtilianov wrote:
>>>>>> Hi there,
>>>>>> I'm investigating sporadic hanging during btrfs umount. The FS is
>>>>>> contained in a loop mounted file.
>>>>>> I have no reproduction scenario and the issue may happen once a day or
>>>>>> once a month. It is rare, but frustrating.
>>>>>> I have a crashdump (the server has been manually crashed and collected
>>>>>> a crashdump), so I could take look through the data structures.
>>>>>> What happens is that umount is getting in D state and a the kernel
>>>>>> complains about hung tasks. We are using kernel 4.4.y The actual back
>>>>>> trace is from 4.4.70, but this happens with all the 4.4 kernels I've
>>>>>> used (4.4.30 through 4.4.70).
>>>>>> Tasks like:
>>>>>> INFO: task kworker/u32:9:27574 blocked for more than 120 seconds.
>>>>>> INFO: task kworker/u32:12:27575 blocked for more than 120 seconds.
>>>>>> INFO: task btrfs-transacti:31625 blocked for more than 120 seconds.
>>>>>> are getting blocked waiting for btrfs_tree_read_lock, which is owned
>>>>>> by task umount:31696 (which is also blocked for more than 120 seconds)
>>>>>> regarding the lock debug.
>>>>>>
>>>>>> umount is hung in "cache_block_group", see the '>' mark:
>>>>>> while (cache->cached == BTRFS_CACHE_FAST) {
>>>>>> struct btrfs_caching_control *ctl;
>>>>>>
>>>>>> ctl = cache->caching_ctl;
>>>>>> atomic_inc(&ctl->count);
>>>>>> prepare_to_wait(&ctl->wait, &wait, TASK_UNINTERRUPTIBLE);
>>>>>> spin_unlock(&cache->lock);
>>>>>>
>>>>>>> schedule();
>>>>>>
>>>>>> finish_wait(&ctl->wait, &wait);
>>>>>> put_caching_control(ctl);
>>>>>> spin_lock(&cache->lock);
>>>>>> }
>>>>>>
>>>>>> The complete backtraces could be found in the attached log.
>>>>>>
>>>>>> Do you have any ideas ?
>>>>>
>>>>> Hi Angel -
>>>>>
>>>>> In your log, it says lockdep is disabled. What tripped it earlier?
>>>>> Lockdep really should be catching locking deadlocks in situations like
>>>>> this, if that's really the underlying cause.
>>>>
>>>> Actually, I'm not sure if lockdep would catch this one. Here's my
>>>> hypothesis:
>>>>
>>>> kworker/u32:9 is waiting on a read lock while reading the free space
>>>> cache, which means it owns the cache->cached value and will issue the
>>>> wakeup when it completes.
>>>>
>>>> umount is waiting on for the wakeup from kworker/u32:9 but is holding
>>>> some tree locks in write mode.
>>>>
>>>> If kworker/u32:9 is waiting on the locks that umount holds, we have a
>>>> deadlock.
>>>>
>>>> Can you dump the extent buffer that kworker/u32:9 is waiting on? Part
>>>> of that will contain the PID of the holder, and if matches umount, we
>>>> found the cause.
>>>>
>>>> -Jeff
>>>>
>>>> --
>>>> Jeff Mahoney
>>>> SUSE Labs
>>>>
>>
>>
>> --
>> Jeff Mahoney
>> SUSE Labs
>>
[-- Attachment #2: lockup2.log --]
[-- Type: text/x-log, Size: 7855 bytes --]
[2003789.023934] sysrq: SysRq : Show Blocked State
[2003789.024159] task PC stack pid father
[2003789.024613] kworker/u24:0 D ffff88037d363470 0 17905 2 0x00000000
[2003789.025027] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[2003789.025378] ffff88037d363470 ffff88026bbd1850 ffff88037d363438 ffffffff00000000
[2003789.025857] ffff88046bf78000 ffff88000543b400 ffff88037d364000 ffff88026bbd1850
[2003789.026341] ffff88000543b400 ffff88000543b400 ffff88037d3634b0 ffff88037d363488
[2003789.026817] Call Trace:
[2003789.026992] [<ffffffff816b41ac>] schedule+0x3c/0x90
[2003789.027190] [<ffffffffa0ac30b5>] btrfs_tree_read_lock+0xe5/0x140 [btrfs]
[2003789.027399] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
[2003789.027594] [<ffffffffa0a59584>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
[2003789.030864] [<ffffffffa0a5e4db>] btrfs_search_slot+0x70b/0x9d0 [btrfs]
[2003789.031046] [<ffffffff810812d2>] ? __might_sleep+0x52/0xb0
[2003789.031244] [<ffffffffa0a7a3bf>] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
[2003789.031439] [<ffffffffa0a925e7>] btrfs_iget+0xd7/0x6a0 [btrfs]
[2003789.031629] [<ffffffffa0acb1df>] __lookup_free_space_inode+0xdf/0x150 [btrfs]
[2003789.031925] [<ffffffffa0accf0b>] lookup_free_space_inode+0x6b/0xe0 [btrfs]
[2003789.032109] [<ffffffffa0ace4ca>] load_free_space_cache+0x7a/0x1b0 [btrfs]
[2003789.032300] [<ffffffffa0a637b7>] cache_block_group+0x1b7/0x3d0 [btrfs]
[2003789.032474] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
[2003789.032656] [<ffffffffa0a72c9f>] find_free_extent+0x7bf/0xf40 [btrfs]
[2003789.032838] [<ffffffffa0a73487>] btrfs_reserve_extent+0x67/0x150 [btrfs]
[2003789.033020] [<ffffffffa0a73669>] btrfs_alloc_tree_block+0xf9/0x480 [btrfs]
[2003789.033210] [<ffffffffa0a59f0c>] __btrfs_cow_block+0x12c/0x5a0 [btrfs]
[2003789.033399] [<ffffffffa0a5a516>] btrfs_cow_block+0x116/0x2a0 [btrfs]
[2003789.033586] [<ffffffffa0a5dfa6>] btrfs_search_slot+0x1d6/0x9d0 [btrfs]
[2003789.033774] [<ffffffffa0a5ff71>] btrfs_insert_empty_items+0x71/0xc0 [btrfs]
[2003789.034075] [<ffffffff811b69d3>] ? kmem_cache_alloc+0x203/0x2f0
[2003789.034267] [<ffffffffa0a6001e>] btrfs_insert_item+0x5e/0xd0 [btrfs]
[2003789.034458] [<ffffffffa0a6fa91>] btrfs_create_pending_block_groups+0x111/0x210 [btrfs]
[2003789.034764] [<ffffffffa0a86814>] __btrfs_end_transaction+0x94/0x340 [btrfs]
[2003789.035070] [<ffffffffa0a86ad0>] btrfs_end_transaction+0x10/0x20 [btrfs]
[2003789.035267] [<ffffffffa0a70f20>] flush_space+0xf0/0x4b0 [btrfs]
[2003789.035460] [<ffffffffa0a7140a>] ? btrfs_async_reclaim_metadata_space+0x12a/0x1c0 [btrfs]
[2003789.035767] [<ffffffffa0a7140a>] ? btrfs_async_reclaim_metadata_space+0x12a/0x1c0 [btrfs]
[2003789.036072] [<ffffffffa0a71470>] ? btrfs_async_reclaim_metadata_space+0x190/0x1c0 [btrfs]
[2003789.036384] [<ffffffffa0a71402>] btrfs_async_reclaim_metadata_space+0x122/0x1c0 [btrfs]
[2003789.036680] [<ffffffff81073766>] process_one_work+0x1e6/0x6b0
[2003789.036855] [<ffffffff810736dd>] ? process_one_work+0x15d/0x6b0
[2003789.037028] [<ffffffff81073c99>] worker_thread+0x69/0x480
[2003789.037214] [<ffffffff81073c30>] ? process_one_work+0x6b0/0x6b0
[2003789.037396] [<ffffffff8107a37d>] kthread+0x11d/0x140
[2003789.037573] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
[2003789.037751] [<ffffffff816ba8ff>] ret_from_fork+0x3f/0x70
[2003789.037925] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
[2003789.038117] btrfs-transacti D ffff88039dcfb6a8 0 23285 2 0x00000000
[2003789.038513] ffff88039dcfb6a8 ffff88046a26d4b0 ffff88039dcfb670 0000000000000000
[2003789.038991] ffff88046bf59a00 ffff880003ac3400 ffff88039dcfc000 ffff880312a1c040
[2003789.039476] ffff880126a6ba00 ffff88046a26d4b0 ffff88046a26d400 ffff88039dcfb6c0
[2003789.039953] Call Trace:
[2003789.040124] [<ffffffff816b41ac>] schedule+0x3c/0x90
[2003789.040316] [<ffffffffa0a63736>] cache_block_group+0x136/0x3d0 [btrfs]
[2003789.040495] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
[2003789.040681] [<ffffffffa0a72c9f>] find_free_extent+0x7bf/0xf40 [btrfs]
[2003789.040868] [<ffffffffa0a73487>] btrfs_reserve_extent+0x67/0x150 [btrfs]
[2003789.041055] [<ffffffffa0a73669>] btrfs_alloc_tree_block+0xf9/0x480 [btrfs]
[2003789.041252] [<ffffffffa0a59f0c>] __btrfs_cow_block+0x12c/0x5a0 [btrfs]
[2003789.041441] [<ffffffffa0a5a516>] btrfs_cow_block+0x116/0x2a0 [btrfs]
[2003789.041627] [<ffffffffa0a5dfa6>] btrfs_search_slot+0x1d6/0x9d0 [btrfs]
[2003789.041813] [<ffffffffa0a65175>] lookup_inline_extent_backref+0xc5/0x5b0 [btrfs]
[2003789.042107] [<ffffffff810812d2>] ? __might_sleep+0x52/0xb0
[2003789.042301] [<ffffffffa0a5916a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
[2003789.042492] [<ffffffffa0a67701>] __btrfs_free_extent.isra.67+0xf1/0xcb0 [btrfs]
[2003789.042800] [<ffffffffa0ad385d>] ? btrfs_merge_delayed_refs+0x8d/0x630 [btrfs]
[2003789.043102] [<ffffffffa0a6b982>] ? __btrfs_run_delayed_refs+0x192/0x1630 [btrfs]
[2003789.043416] [<ffffffffa0a6c0c0>] __btrfs_run_delayed_refs+0x8d0/0x1630 [btrfs]
[2003789.043715] [<ffffffff810d4ea4>] ? lock_timer_base.isra.35+0x54/0x70
[2003789.043906] [<ffffffffa0a842eb>] ? join_transaction.isra.13+0x2b/0x440 [btrfs]
[2003789.044218] [<ffffffffa0a6fc12>] btrfs_run_delayed_refs+0x82/0x2c0 [btrfs]
[2003789.044412] [<ffffffffa0a855f3>] btrfs_commit_transaction+0x43/0xb10 [btrfs]
[2003789.044719] [<ffffffffa0a809ac>] transaction_kthread+0x22c/0x290 [btrfs]
[2003789.044907] [<ffffffffa0a80780>] ? btrfs_cleanup_transaction+0x650/0x650 [btrfs]
[2003789.045210] [<ffffffff8107a37d>] kthread+0x11d/0x140
[2003789.045385] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
[2003789.045560] [<ffffffff816ba8ff>] ret_from_fork+0x3f/0x70
[2003789.045735] [<ffffffff8107a260>] ? kthread_create_on_node+0x220/0x220
[2003789.045910] btrfs D ffff88026b22b880 0 23286 23202 0x00000000
[2003789.046291] ffff88026b22b880 ffff88046a26d4b0 ffff88026b22b848 0000000000000000
[2003789.046767] ffff88046bf5b400 ffff8800378e3400 ffff88026b22c000 ffff880312a1c040
[2003789.047241] ffff88046572b400 ffff88046a26d4b0 ffff88046a26d400 ffff88026b22b898
[2003789.047720] Call Trace:
[2003789.047890] [<ffffffff816b41ac>] schedule+0x3c/0x90
[2003789.048076] [<ffffffffa0a63736>] cache_block_group+0x136/0x3d0 [btrfs]
[2003789.048260] [<ffffffff810a3390>] ? wait_woken+0xb0/0xb0
[2003789.048449] [<ffffffffa0a72c9f>] find_free_extent+0x7bf/0xf40 [btrfs]
[2003789.048638] [<ffffffffa0a73487>] btrfs_reserve_extent+0x67/0x150 [btrfs]
[2003789.048827] [<ffffffffa0a73669>] btrfs_alloc_tree_block+0xf9/0x480 [btrfs]
[2003789.049015] [<ffffffffa0a59f0c>] __btrfs_cow_block+0x12c/0x5a0 [btrfs]
[2003789.049211] [<ffffffffa0a5a516>] btrfs_cow_block+0x116/0x2a0 [btrfs]
[2003789.049414] [<ffffffffa0a5dfa6>] btrfs_search_slot+0x1d6/0x9d0 [btrfs]
[2003789.049601] [<ffffffffa0a5ff71>] btrfs_insert_empty_items+0x71/0xc0 [btrfs]
[2003789.049909] [<ffffffffa0ac3758>] btrfs_insert_orphan_item+0x58/0x80 [btrfs]
[2003789.050227] [<ffffffffa0abae48>] btrfs_ioctl_snap_destroy+0x4b8/0x7e0 [btrfs]
[2003789.050542] [<ffffffffa0ac0e89>] btrfs_ioctl+0x899/0x27e0 [btrfs]
[2003789.050720] [<ffffffff816b9ba7>] ? _raw_spin_unlock+0x27/0x40
[2003789.050897] [<ffffffff8119068d>] ? handle_mm_fault+0xdbd/0x1b80
[2003789.051073] [<ffffffff8118f926>] ? handle_mm_fault+0x56/0x1b80
[2003789.051257] [<ffffffff811ea6ef>] do_vfs_ioctl+0x30f/0x580
[2003789.051434] [<ffffffff810aa26f>] ? up_read+0x1f/0x40
[2003789.051605] [<ffffffff811f6796>] ? __fget_light+0x66/0x90
[2003789.051775] [<ffffffff811ea9d9>] SyS_ioctl+0x79/0x90
[2003789.051948] [<ffffffff816ba55b>] entry_SYSCALL_64_fastpath+0x16/0x73
[2003848.780864] sysrq: SysRq : Trigger a crash
[2003848.781093] BUG: unable to handle kernel NULL pointer dereference at (null)