Linux Btrfs filesystem development
 help / color / mirror / Atom feed
* kernel hangs during balance
@ 2017-12-19 16:09 Rich Rauenzahn
  2017-12-19 17:08 ` Rich Rauenzahn
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-19 16:09 UTC (permalink / raw)
  To: Btrfs BTRFS

I'm running 4.4.106-1.el7.elrepo.x86_64 and I do a btrfs balance everynight.

Every night I'm getting a kernel hang, sometimes caught by my
watchdog, sometimes not.  Last night's hang was on the balance of DATA
on / at 70.

I'm not sure how to further trace this down to help you -- the console
by the time I notice just has lots of messages on it without the
initial ones.   The last items in /var/log/message aren't helpful, but
I'm pretty sure it is the nightly balance.

I've run btrfs check on / with no issues recently.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-19 16:09 kernel hangs during balance Rich Rauenzahn
@ 2017-12-19 17:08 ` Rich Rauenzahn
  2017-12-19 17:14   ` Hans van Kranenburg
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-19 17:08 UTC (permalink / raw)
  To: Btrfs BTRFS

What's also confusing is I just ran a manual balance on the fs using
defaults (which are aggressive) and it completed with no problems.
It smells more like a race condition than a particular corruption.

On Tue, Dec 19, 2017 at 8:09 AM, Rich Rauenzahn <rrauenza@gmail.com> wrote:
> I'm running 4.4.106-1.el7.elrepo.x86_64 and I do a btrfs balance everynight.
>
> Every night I'm getting a kernel hang, sometimes caught by my
> watchdog, sometimes not.  Last night's hang was on the balance of DATA
> on / at 70.
>
> I'm not sure how to further trace this down to help you -- the console
> by the time I notice just has lots of messages on it without the
> initial ones.   The last items in /var/log/message aren't helpful, but
> I'm pretty sure it is the nightly balance.
>
> I've run btrfs check on / with no issues recently.
>
> Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-19 17:08 ` Rich Rauenzahn
@ 2017-12-19 17:14   ` Hans van Kranenburg
  2017-12-19 17:48     ` Rich Rauenzahn
  0 siblings, 1 reply; 10+ messages in thread
From: Hans van Kranenburg @ 2017-12-19 17:14 UTC (permalink / raw)
  To: Rich Rauenzahn, Btrfs BTRFS

On 12/19/2017 06:08 PM, Rich Rauenzahn wrote:
> What's also confusing is I just ran a manual balance on the fs using
> defaults (which are aggressive) and it completed with no problems.
> It smells more like a race condition than a particular corruption.

Just wild first guess... are you also using btrfs send/receive
functionality where the system having problems is the sending part?

> On Tue, Dec 19, 2017 at 8:09 AM, Rich Rauenzahn <rrauenza@gmail.com> wrote:
>> I'm running 4.4.106-1.el7.elrepo.x86_64 and I do a btrfs balance everynight.
>>
>> Every night I'm getting a kernel hang, sometimes caught by my
>> watchdog, sometimes not.  Last night's hang was on the balance of DATA
>> on / at 70.
>>
>> I'm not sure how to further trace this down to help you -- the console
>> by the time I notice just has lots of messages on it without the
>> initial ones.

Capturing more logs is definitely the first thing to do.

Look if the output of `dmesg` still shows the btrfs errors. Otherwise,
if something is spamming there, turn that off, or if you don't have the
errors in a log file because the log files are on the same btrfs, then
you have to find out another way to capture them. E.g. make the kernel
buffer for messages bigger, use netconsole or just pragmatic things like
ssh from another server and `dmesg -w` and store it on the other machine.

>>   The last items in /var/log/message aren't helpful, but
>> I'm pretty sure it is the nightly balance.
>>
>> I've run btrfs check on / with no issues recently.

-- 
Hans van Kranenburg

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-19 17:14   ` Hans van Kranenburg
@ 2017-12-19 17:48     ` Rich Rauenzahn
  2017-12-20 16:24       ` Rich Rauenzahn
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-19 17:48 UTC (permalink / raw)
  To: Hans van Kranenburg; +Cc: Btrfs BTRFS

On Tue, Dec 19, 2017 at 9:14 AM, Hans van Kranenburg
<hans.van.kranenburg@mendix.com> wrote:
> Just wild first guess... are you also using btrfs send/receive
> functionality where the system having problems is the sending part?

No.

>>> Every night I'm getting a kernel hang, sometimes caught by my
>>> watchdog, sometimes not.  Last night's hang was on the balance of DATA
>>> on / at 70.
>>>
>>> I'm not sure how to further trace this down to help you -- the console
>>> by the time I notice just has lots of messages on it without the
>>> initial ones.
>
> Capturing more logs is definitely the first thing to do.
>
> Look if the output of `dmesg` still shows the btrfs errors. Otherwise,

But I can't ...  I have to hit the reset button (or the watchdog does)
before I can get into the system again.

> if something is spamming there, turn that off, or if you don't have the
> errors in a log file because the log files are on the same btrfs, then
> you have to find out another way to capture them. E.g. make the kernel
> buffer for messages bigger, use netconsole or just pragmatic things like
> ssh from another server and `dmesg -w` and store it on the other machine.

Ah... yes, a permanent connection from another might work.

I also have a pfsense box -- I think it can take a syslogd daemon.
Maybe I can netconsole to it?

Will investigate.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-19 17:48     ` Rich Rauenzahn
@ 2017-12-20 16:24       ` Rich Rauenzahn
  2017-12-20 19:02         ` Chris Murphy
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-20 16:24 UTC (permalink / raw)
  To: Hans van Kranenburg; +Cc: Btrfs BTRFS

Ok, caught the hung tasks last night.  I don't *think* this is
related, because I pretty sure this isn't happening on the same
filesystem, but I do have a loopback swap on one btrfs drive.

The hang might have occurred after the btrfs balance was finished ....
which is confusing.  I'm adding timestamps to my next btrfs job.


Dec 20 01:14:50 tendo [64680.942928] INFO: task systemd:1 blocked for
more than 120 seconds.
Dec 20 01:14:50 tendo [64680.945844]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:50 tendo [64680.948900] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:50 tendo [64680.951987]  ffff88040d6730d8
Dec 20 01:14:50 tendo ffffffff81a81500
Dec 20 01:14:50 tendo ffff88040d5f8000
Dec 20 01:14:50 tendo ffff88040d674000
Dec 20 01:14:50 tendo
Dec 20 01:14:50 tendo [64680.955107]  0000000000000000
Dec 20 01:14:50 tendo 7fffffffffffffff
Dec 20 01:14:50 tendo ffff8800bd19a648
Dec 20 01:14:50 tendo ffffe8ffffc03980
Dec 20 01:14:50 tendo
Dec 20 01:14:50 tendo [64680.957871]  ffff88040d6730f0
Dec 20 01:14:50 tendo ffffffff81700435
Dec 20 01:14:50 tendo ffff88041fa16cc0
Dec 20 01:14:50 tendo ffff88040d6731a0
Dec 20 01:14:50 tendo
Dec 20 01:14:50 tendo [64680.960618] Call Trace:
Dec 20 01:14:50 tendo [64680.963342]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:50 tendo [64680.966052]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:50 tendo [64680.968490]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:50 tendo [64680.970900]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:50 tendo [64680.973211]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:50 tendo [64680.975383]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:50 tendo [64680.977540]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:50 tendo [64680.979708]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:50 tendo [64680.981871]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:50 tendo [64680.984004]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:50 tendo [64680.986122]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:50 tendo [64680.988207]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:50 tendo [64680.990137]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:50 tendo [64680.992041]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:50 tendo [64680.993949]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:50 tendo [64680.995853]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:50 tendo [64680.997720]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:50 tendo [64680.999595]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:50 tendo [64681.001291]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:50 tendo [64681.002975]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:50 tendo [64681.004652]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:50 tendo [64681.006311]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:50 tendo [64681.007967]  [<ffffffff8119f6e4>]
do_try_to_free_pages+0x164/0x430
Dec 20 01:14:50 tendo [64681.009617]  [<ffffffff8119aace>] ?
throttle_direct_reclaim+0x14e/0x240
Dec 20 01:14:50 tendo [64681.011141]  [<ffffffff8119fa85>]
try_to_free_pages+0xd5/0x190
Dec 20 01:14:51 tendo [64681.012659]  [<ffffffff81202310>]
__alloc_pages_slowpath.constprop.89+0x3a0/0x6e8
Dec 20 01:14:51 tendo [64681.014191]  [<ffffffff810afc22>] ?
sched_clock_cpu+0x72/0xa0
Dec 20 01:14:51 tendo [64681.015701]  [<ffffffff81192fc3>]
__alloc_pages_nodemask+0x253/0x260
Dec 20 01:14:51 tendo [64681.017215]  [<ffffffff811e8ff9>] ?
__kmalloc_node_track_caller+0x249/0x2b0
Dec 20 01:14:51 tendo [64681.018392]  [<ffffffff811dc445>]
alloc_pages_vma+0xa5/0x220
Dec 20 01:14:51 tendo [64681.019296]  [<ffffffff811b9951>]
handle_pte_fault+0xd21/0x1490
Dec 20 01:14:51 tendo [64681.020195]  [<ffffffff81211a5e>] ?
legitimize_path.isra.23+0x2e/0x60
Dec 20 01:14:51 tendo [64681.021100]  [<ffffffff813419ba>] ? put_dec+0x1a/0x80
Dec 20 01:14:51 tendo [64681.021992]  [<ffffffff8134384d>] ?
number.isra.13+0x2dd/0x310
Dec 20 01:14:51 tendo [64681.022883]  [<ffffffff811bafa0>]
handle_mm_fault+0x250/0x540
Dec 20 01:14:51 tendo [64681.023757]  [<ffffffff810688e8>]
__do_page_fault+0x188/0x3f0
Dec 20 01:14:51 tendo [64681.024626]  [<ffffffff81068b80>]
do_page_fault+0x30/0x80
Dec 20 01:14:51 tendo [64681.025486]  [<ffffffff812829c7>] ?
kernfs_path_locked+0x37/0x80
Dec 20 01:14:51 tendo [64681.026350]  [<ffffffff817061c8>] page_fault+0x28/0x30
Dec 20 01:14:51 tendo [64681.027201]  [<ffffffff813455be>] ?
copy_user_enhanced_fast_string+0xe/0x20
Dec 20 01:14:51 tendo [64681.028034]  [<ffffffff8122ae45>] ?
seq_read+0x295/0x390
Dec 20 01:14:51 tendo [64681.028832]  [<ffffffff81206997>] __vfs_read+0x37/0x100
Dec 20 01:14:51 tendo [64681.029618]  [<ffffffff812b3383>] ?
security_file_permission+0xa3/0xc0
Dec 20 01:14:51 tendo [64681.030342]  [<ffffffff81206f22>] ?
rw_verify_area+0x52/0xd0
Dec 20 01:14:51 tendo [64681.031055]  [<ffffffff8120701f>] vfs_read+0x7f/0x130
Dec 20 01:14:51 tendo [64681.031761]  [<ffffffff81207e75>] SyS_read+0x55/0xc0
Dec 20 01:14:51 tendo [64681.032441]  [<ffffffff81703fee>]
entry_SYSCALL_64_fastpath+0x12/0x71
Dec 20 01:14:51 tendo [64681.033119] INFO: task khugepaged:68 blocked
for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.033815]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.034518] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.035238]  ffff88040cf4f318
Dec 20 01:14:51 tendo ffff88040d6c2c80
Dec 20 01:14:51 tendo ffff88040cf50000
Dec 20 01:14:51 tendo ffff88040cf50000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.035971]  0000000000000000
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800bd19a7c8
Dec 20 01:14:51 tendo ffffe8ffffc43980
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.036711]  ffff88040cf4f330
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88041fa56cc0
Dec 20 01:14:51 tendo ffff88040cf4f3e0
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.037454] Call Trace:
Dec 20 01:14:51 tendo [64681.038184]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.038925]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.039651]  [<ffffffff811e4c2f>] ?
get_partial_node.isra.61+0x8f/0x1d0
Dec 20 01:14:51 tendo [64681.040367]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:51 tendo [64681.041081]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:51 tendo [64681.041783]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.042495]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:51 tendo [64681.043204]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:51 tendo [64681.043914]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:51 tendo [64681.044624]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:51 tendo [64681.045331]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:51 tendo [64681.046046]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:51 tendo [64681.046758]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:51 tendo [64681.047471]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:51 tendo [64681.048187]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:51 tendo [64681.048917]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:51 tendo [64681.049627]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:51 tendo [64681.050330]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:51 tendo [64681.051022]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:51 tendo [64681.051710]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:51 tendo [64681.052387]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:51 tendo [64681.053069]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:51 tendo [64681.053745]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:51 tendo [64681.054417]  [<ffffffff8119f6e4>]
do_try_to_free_pages+0x164/0x430
Dec 20 01:14:51 tendo [64681.055086]  [<ffffffff8119fa85>]
try_to_free_pages+0xd5/0x190
Dec 20 01:14:51 tendo [64681.055751]  [<ffffffff81202310>]
__alloc_pages_slowpath.constprop.89+0x3a0/0x6e8
Dec 20 01:14:51 tendo [64681.056427]  [<ffffffff810b7445>] ?
put_prev_entity+0x35/0x670
Dec 20 01:14:51 tendo [64681.057105]  [<ffffffff81192fc3>]
__alloc_pages_nodemask+0x253/0x260
Dec 20 01:14:51 tendo [64681.057784]  [<ffffffff811eec60>]
khugepaged_scan_mm_slot+0x580/0xfe0
Dec 20 01:14:51 tendo [64681.058476]  [<ffffffff8135607d>] ? list_del+0xd/0x30
Dec 20 01:14:51 tendo [64681.059163]  [<ffffffff811ef7eb>]
khugepaged+0x12b/0x430
Dec 20 01:14:51 tendo [64681.059833]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.060494]  [<ffffffff811ef6c0>] ?
khugepaged_scan_mm_slot+0xfe0/0xfe0
Dec 20 01:14:51 tendo [64681.061157]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.061816]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.062476]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.063129]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.063781] INFO: task kswapd0:77 blocked for
more than 120 seconds.
Dec 20 01:14:51 tendo [64681.064436]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.065105] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.065791]  ffff880408d435c8
Dec 20 01:14:51 tendo ffff88003f7c9640
Dec 20 01:14:51 tendo ffff88040c8b5900
Dec 20 01:14:51 tendo ffff880408d44000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.066485]  0000000000000000
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800bd19a688
Dec 20 01:14:51 tendo ffffe8ffffc83980
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.067178]  ffff880408d435e0
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88041fa96cc0
Dec 20 01:14:51 tendo ffff880408d43690
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.067860] Call Trace:
Dec 20 01:14:51 tendo [64681.068518]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.069191]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.069838]  [<ffffffff8130eb06>] ?
blk_flush_plug_list+0xb6/0x200
Dec 20 01:14:51 tendo [64681.070475]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:51 tendo [64681.071117]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:51 tendo [64681.071752]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.072391]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:51 tendo [64681.073026]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:51 tendo [64681.073662]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:51 tendo [64681.074298]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:51 tendo [64681.074930]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:51 tendo [64681.075564]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:51 tendo [64681.076200]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:51 tendo [64681.076840]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:51 tendo [64681.077473]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:51 tendo [64681.078114]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:51 tendo [64681.078752]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:51 tendo [64681.079384]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:51 tendo [64681.080007]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:51 tendo [64681.080616]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:51 tendo [64681.081229]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:51 tendo [64681.081837]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:51 tendo [64681.082443]  [<ffffffff811f682b>] ?
mem_cgroup_iter+0x12b/0x310
Dec 20 01:14:51 tendo [64681.083052]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:51 tendo [64681.083661]  [<ffffffff811a043a>] kswapd+0x4ea/0x9d0
Dec 20 01:14:51 tendo [64681.084269]  [<ffffffff8119ff50>] ?
mem_cgroup_shrink_node_zone+0x1c0/0x1c0
Dec 20 01:14:51 tendo [64681.084888]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.085505]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.086119]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.086726]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.087338] INFO: task btrfs-transacti:423
blocked for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.087953]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.088568] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.089205]  ffff88040745b990
Dec 20 01:14:51 tendo ffff88040d6c42c0
Dec 20 01:14:51 tendo ffff880406970000
Dec 20 01:14:51 tendo ffff88040745c000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.089840]  00000000ffffff00
Dec 20 01:14:51 tendo ffff880406970000
Dec 20 01:14:51 tendo ffff88040745b9d8
Dec 20 01:14:51 tendo ffff88030abc5a40
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.090470]  ffff88040745b9a8
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88030abc5aa8
Dec 20 01:14:51 tendo ffff88040745ba18
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.091096] Call Trace:
Dec 20 01:14:51 tendo [64681.091715]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.092361]  [<ffffffffa0519167>]
btrfs_tree_read_lock+0xd7/0x140 [btrfs]
Dec 20 01:14:51 tendo [64681.092995]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.093637]  [<ffffffffa04b2594>]
btrfs_read_lock_root_node+0x34/0x50 [btrfs]
Dec 20 01:14:51 tendo [64681.094285]  [<ffffffffa04b75b2>]
btrfs_search_slot+0x532/0xa10 [btrfs]
Dec 20 01:14:51 tendo [64681.094925]  [<ffffffffa04bd6ee>]
lookup_inline_extent_backref+0xee/0x5b0 [btrfs]
Dec 20 01:14:51 tendo [64681.095559]  [<ffffffffa04b22fa>] ?
btrfs_alloc_path+0x1a/0x20 [btrfs]
Dec 20 01:14:51 tendo [64681.096192]  [<ffffffffa04c0700>]
__btrfs_free_extent.isra.67+0x110/0xc90 [btrfs]
Dec 20 01:14:51 tendo [64681.096824]  [<ffffffffa0529627>] ?
btrfs_merge_delayed_refs+0x67/0x5e0 [btrfs]
Dec 20 01:14:51 tendo [64681.097448]  [<ffffffff81021b19>] ?
sched_clock+0x9/0x10
Dec 20 01:14:51 tendo [64681.098074]  [<ffffffffa04c4c85>]
__btrfs_run_delayed_refs.constprop.72+0xa25/0x1250 [btrfs]
Dec 20 01:14:51 tendo [64681.098704]  [<ffffffff81021b19>] ?
sched_clock+0x9/0x10
Dec 20 01:14:51 tendo [64681.099337]  [<ffffffff810b7445>] ?
put_prev_entity+0x35/0x670
Dec 20 01:14:51 tendo [64681.099967]  [<ffffffff810ea174>] ?
lock_timer_base.isra.33+0x54/0x70
Dec 20 01:14:51 tendo [64681.100587]  [<ffffffffa04c81cd>]
btrfs_run_delayed_refs+0x7d/0x2b0 [btrfs]
Dec 20 01:14:51 tendo [64681.101216]  [<ffffffffa04dd596>]
btrfs_commit_transaction+0x56/0xa70 [btrfs]
Dec 20 01:14:51 tendo [64681.101839]  [<ffffffffa04d8959>]
transaction_kthread+0x1a9/0x230 [btrfs]
Dec 20 01:14:51 tendo [64681.102466]  [<ffffffffa04d87b0>] ?
btrfs_cleanup_transaction+0x560/0x560 [btrfs]
Dec 20 01:14:51 tendo [64681.103093]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.103719]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.104343]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.104968]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.105586] INFO: task systemd-journal:505
blocked for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.106218]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.106851] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.107500]  ffff88040c16bbd0
Dec 20 01:14:51 tendo ffff88040d6c2c80
Dec 20 01:14:51 tendo ffff8804089542c0
Dec 20 01:14:51 tendo ffff88040c16c000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.108160]  ffff88014d0dac60
Dec 20 01:14:51 tendo 0000000000000001
Dec 20 01:14:51 tendo 0000000003b42000
Dec 20 01:14:51 tendo 0000000004000000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.108818]  ffff88040c16bbe8
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88014d0dabe0
Dec 20 01:14:51 tendo ffff88040c16bc58
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.109488] Call Trace:
Dec 20 01:14:51 tendo [64681.110137]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.110792]  [<ffffffffa04f901f>]
btrfs_start_ordered_extent+0x10f/0x140 [btrfs]
Dec 20 01:14:51 tendo [64681.111453]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.112125]  [<ffffffffa04ee0dc>]
btrfs_page_mkwrite+0x17c/0x320 [btrfs]
Dec 20 01:14:51 tendo [64681.112781]  [<ffffffff811b6174>]
do_page_mkwrite+0x54/0xa0
Dec 20 01:14:51 tendo [64681.113447]  [<ffffffff811b873f>]
do_wp_page+0x3cf/0x590
Dec 20 01:14:51 tendo [64681.114112]  [<ffffffff8118a918>] ?
filemap_map_pages+0x1f8/0x240
Dec 20 01:14:51 tendo [64681.114775]  [<ffffffff811b96d8>]
handle_pte_fault+0xaa8/0x1490
Dec 20 01:14:51 tendo [64681.115435]  [<ffffffff810d673b>] ?
devkmsg_read+0x16b/0x2d0
Dec 20 01:14:51 tendo [64681.116096]  [<ffffffff811bafa0>]
handle_mm_fault+0x250/0x540
Dec 20 01:14:51 tendo [64681.116751]  [<ffffffff810688e8>]
__do_page_fault+0x188/0x3f0
Dec 20 01:14:51 tendo [64681.117412]  [<ffffffff81068b80>]
do_page_fault+0x30/0x80
Dec 20 01:14:51 tendo [64681.118055]  [<ffffffff817061c8>] page_fault+0x28/0x30
Dec 20 01:14:51 tendo [64681.118698] INFO: task btrfs-transacti:756
blocked for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.119336]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.119965] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.120598]  ffff8803eec139a8
Dec 20 01:14:51 tendo ffff88040d6c42c0
Dec 20 01:14:51 tendo ffff8804068c0000
Dec 20 01:14:51 tendo ffff8803eec14000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.121243]  00000000ffffff00
Dec 20 01:14:51 tendo ffff8804068c0000
Dec 20 01:14:51 tendo ffff8803eec139f0
Dec 20 01:14:51 tendo ffff88015afc0460
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.121879]  ffff8803eec139c0
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88015afc04c8
Dec 20 01:14:51 tendo ffff8803eec13a30
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.122514] Call Trace:
Dec 20 01:14:51 tendo [64681.123143]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.123785]  [<ffffffffa0519167>]
btrfs_tree_read_lock+0xd7/0x140 [btrfs]
Dec 20 01:14:51 tendo [64681.124425]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.125070]  [<ffffffffa04b2594>]
btrfs_read_lock_root_node+0x34/0x50 [btrfs]
Dec 20 01:14:51 tendo [64681.125716]  [<ffffffffa04b75b2>]
btrfs_search_slot+0x532/0xa10 [btrfs]
Dec 20 01:14:51 tendo [64681.126358]  [<ffffffffa04d45ce>] ?
btree_set_page_dirty+0xe/0x10 [btrfs]
Dec 20 01:14:51 tendo [64681.127003]  [<ffffffffa05015cd>] ?
set_extent_buffer_dirty+0x6d/0xb0 [btrfs]
Dec 20 01:14:51 tendo [64681.127640]  [<ffffffffa04d113b>]
btrfs_del_csums+0xeb/0x2e0 [btrfs]
Dec 20 01:14:51 tendo [64681.128278]  [<ffffffffa0500efb>] ?
free_extent_buffer+0x4b/0x90 [btrfs]
Dec 20 01:14:51 tendo [64681.128916]  [<ffffffffa04c0cb3>]
__btrfs_free_extent.isra.67+0x6c3/0xc90 [btrfs]
Dec 20 01:14:51 tendo [64681.129521]  [<ffffffffa04c46a1>]
__btrfs_run_delayed_refs.constprop.72+0x441/0x1250 [btrfs]
Dec 20 01:14:51 tendo [64681.130134]  [<ffffffff810b7445>] ?
put_prev_entity+0x35/0x670
Dec 20 01:14:51 tendo [64681.130747]  [<ffffffff81017695>] ?
__switch_to+0x1e5/0x490
Dec 20 01:14:51 tendo [64681.131357]  [<ffffffff810ea174>] ?
lock_timer_base.isra.33+0x54/0x70
Dec 20 01:14:51 tendo [64681.131973]  [<ffffffffa04c81cd>]
btrfs_run_delayed_refs+0x7d/0x2b0 [btrfs]
Dec 20 01:14:51 tendo [64681.132590]  [<ffffffffa04dd596>]
btrfs_commit_transaction+0x56/0xa70 [btrfs]
Dec 20 01:14:51 tendo [64681.133207]  [<ffffffffa04d8959>]
transaction_kthread+0x1a9/0x230 [btrfs]
Dec 20 01:14:51 tendo [64681.133820]  [<ffffffffa04d87b0>] ?
btrfs_cleanup_transaction+0x560/0x560 [btrfs]
Dec 20 01:14:51 tendo [64681.134436]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.135053]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.135668]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.136276]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.136881] INFO: task btrfs-transacti:820
blocked for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.137478]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.138073] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.138680]  ffff8800bc7af118
Dec 20 01:14:51 tendo ffffffff81a81500
Dec 20 01:14:51 tendo ffff8804070d0000
Dec 20 01:14:51 tendo ffff8800bc7b0000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.139292]  0000000000000000
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800bd19a648
Dec 20 01:14:51 tendo ffffe8ffffc03980
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.139890]  ffff8800bc7af130
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88041fa16cc0
Dec 20 01:14:51 tendo ffff8800bc7af1e0
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.140480] Call Trace:
Dec 20 01:14:51 tendo [64681.141060]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.141641]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.142221]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:51 tendo [64681.142799]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:51 tendo [64681.143375]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.143954]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:51 tendo [64681.144535]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:51 tendo [64681.145119]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:51 tendo [64681.145704]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:51 tendo [64681.146291]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:51 tendo [64681.146880]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:51 tendo [64681.147469]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:51 tendo [64681.148055]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:51 tendo [64681.148620]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:51 tendo [64681.149194]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:51 tendo [64681.149753]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:51 tendo [64681.150304]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:51 tendo [64681.150852]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:51 tendo [64681.151392]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:51 tendo [64681.151929]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:51 tendo [64681.152462]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:51 tendo [64681.152991]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:51 tendo [64681.153512]  [<ffffffff8119f6e4>]
do_try_to_free_pages+0x164/0x430
Dec 20 01:14:51 tendo [64681.154035]  [<ffffffff8119fa85>]
try_to_free_pages+0xd5/0x190
Dec 20 01:14:51 tendo [64681.154555]  [<ffffffff81202310>]
__alloc_pages_slowpath.constprop.89+0x3a0/0x6e8
Dec 20 01:14:51 tendo [64681.155079]  [<ffffffff81192fc3>]
__alloc_pages_nodemask+0x253/0x260
Dec 20 01:14:51 tendo [64681.155600]  [<ffffffff811da4b8>]
alloc_pages_current+0x88/0x120
Dec 20 01:14:51 tendo [64681.156119]  [<ffffffff8118a1e4>]
__page_cache_alloc+0xb4/0xc0
Dec 20 01:14:51 tendo [64681.156633]  [<ffffffff8118ac59>]
pagecache_get_page+0x59/0x1c0
Dec 20 01:14:51 tendo [64681.157146]  [<ffffffffa05226cd>] ?
io_ctl_init+0xdd/0x190 [btrfs]
Dec 20 01:14:51 tendo [64681.157654]  [<ffffffffa05227d0>]
io_ctl_prepare_pages+0x50/0x140 [btrfs]
Dec 20 01:14:51 tendo [64681.158168]  [<ffffffffa0522ec3>]
__btrfs_write_out_cache.isra.24+0xa3/0x430 [btrfs]
Dec 20 01:14:51 tendo [64681.158683]  [<ffffffffa052381d>]
btrfs_write_out_cache+0x8d/0x110 [btrfs]
Dec 20 01:14:51 tendo [64681.159214]  [<ffffffffa04c91ae>]
btrfs_write_dirty_block_groups+0x1de/0x270 [btrfs]
Dec 20 01:14:51 tendo [64681.159718]  [<ffffffffa0555f6f>]
commit_cowonly_roots+0x215/0x2bd [btrfs]
Dec 20 01:14:51 tendo [64681.160222]  [<ffffffffa04dda56>]
btrfs_commit_transaction+0x516/0xa70 [btrfs]
Dec 20 01:14:51 tendo [64681.160726]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.161249]  [<ffffffffa04d8959>]
transaction_kthread+0x1a9/0x230 [btrfs]
Dec 20 01:14:51 tendo [64681.161770]  [<ffffffffa04d87b0>] ?
btrfs_cleanup_transaction+0x560/0x560 [btrfs]
Dec 20 01:14:51 tendo [64681.162296]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.162819]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.163337]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.163849]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.164339] INFO: task ksmtuned:1052 blocked
for more than 120 seconds.
Dec 20 01:14:51 tendo [64681.164825]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.165312] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.165804]  ffff880396daf368
Dec 20 01:14:51 tendo ffff88040d6c2c80
Dec 20 01:14:51 tendo ffff8800b5cec2c0
Dec 20 01:14:51 tendo ffff880396db0000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.166306]  0000000000000000
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800bd19a748
Dec 20 01:14:51 tendo ffffe8ffffc43980
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.166809]  ffff880396daf380
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88041fa56cc0
Dec 20 01:14:51 tendo ffff880396daf430
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.167317] Call Trace:
Dec 20 01:14:51 tendo [64681.167816]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.168322]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.168825]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:51 tendo [64681.169332]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:51 tendo [64681.169834]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.170359]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:51 tendo [64681.170869]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:51 tendo [64681.171378]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:51 tendo [64681.171885]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:51 tendo [64681.172379]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:51 tendo [64681.172892]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:51 tendo [64681.173403]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:51 tendo [64681.173905]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:51 tendo [64681.174406]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:51 tendo [64681.174913]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:51 tendo [64681.175421]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:51 tendo [64681.175933]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:51 tendo [64681.176437]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:51 tendo [64681.176949]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:51 tendo [64681.177461]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:51 tendo [64681.177973]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:51 tendo [64681.178480]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:51 tendo [64681.178998]  [<ffffffff8119f6e4>]
do_try_to_free_pages+0x164/0x430
Dec 20 01:14:51 tendo [64681.179500]  [<ffffffff8119aace>] ?
throttle_direct_reclaim+0x14e/0x240
Dec 20 01:14:51 tendo [64681.180004]  [<ffffffff8119fa85>]
try_to_free_pages+0xd5/0x190
Dec 20 01:14:51 tendo [64681.180507]  [<ffffffff81202310>]
__alloc_pages_slowpath.constprop.89+0x3a0/0x6e8
Dec 20 01:14:51 tendo [64681.181019]  [<ffffffff81192fc3>]
__alloc_pages_nodemask+0x253/0x260
Dec 20 01:14:51 tendo [64681.181523]  [<ffffffff81193191>]
alloc_kmem_pages_node+0x51/0xd0
Dec 20 01:14:51 tendo [64681.182046]  [<ffffffff8107c3d0>]
copy_process+0x180/0x1d20
Dec 20 01:14:51 tendo [64681.182548]  [<ffffffff811e53d2>] ?
kmem_cache_alloc+0x1e2/0x200
Dec 20 01:14:51 tendo [64681.183050]  [<ffffffff81208e2c>] ?
get_empty_filp+0x5c/0x1a0
Dec 20 01:14:51 tendo [64681.183542]  [<ffffffff8107e116>] _do_fork+0x96/0x330
Dec 20 01:14:51 tendo [64681.184018]  [<ffffffff810038e9>] ?
syscall_trace_enter_phase1+0x139/0x150
Dec 20 01:14:51 tendo [64681.184499]  [<ffffffff810796ce>] ?
syscall_slow_exit_work+0x3f/0xbd
Dec 20 01:14:51 tendo [64681.184984]  [<ffffffff8107e459>] SyS_clone+0x19/0x20
Dec 20 01:14:51 tendo [64681.185487]  [<ffffffff81703fee>]
entry_SYSCALL_64_fastpath+0x12/0x71
Dec 20 01:14:51 tendo [64681.185990] INFO: task pyfan:1158 blocked for
more than 120 seconds.
Dec 20 01:14:51 tendo [64681.186486]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.186987] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.187498]  ffff8803353ab368
Dec 20 01:14:51 tendo ffff88040d6c5900
Dec 20 01:14:51 tendo ffff880396520000
Dec 20 01:14:51 tendo ffff8803353ac000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.188023]  0000000000000000
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800bd19a748
Dec 20 01:14:51 tendo ffffe8ffffcc3980
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.188550]  ffff8803353ab380
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo ffff88041fad6cc0
Dec 20 01:14:51 tendo ffff8803353ab430
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.189075] Call Trace:
Dec 20 01:14:51 tendo [64681.189584]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.190093]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.190598]  [<ffffffff816ffa36>]
io_schedule_timeout+0xa6/0x110
Dec 20 01:14:51 tendo [64681.191100]  [<ffffffff8131a4dc>] bt_get+0x14c/0x1d0
Dec 20 01:14:51 tendo [64681.191578]  [<ffffffff810c2940>] ?
prepare_to_wait_event+0xf0/0xf0
Dec 20 01:14:51 tendo [64681.192062]  [<ffffffff8131a9e0>]
blk_mq_get_tag+0xb0/0xe0
Dec 20 01:14:51 tendo [64681.192547]  [<ffffffff813161bb>]
__blk_mq_alloc_request+0x1b/0x1f0
Dec 20 01:14:51 tendo [64681.193036]  [<ffffffff81317f7e>]
blk_mq_map_request+0x19e/0x210
Dec 20 01:14:51 tendo [64681.193522]  [<ffffffff8131901b>]
blk_sq_make_request+0xab/0x3d0
Dec 20 01:14:51 tendo [64681.194020]  [<ffffffff813091df>] ?
generic_make_request_checks+0x29f/0x4f0
Dec 20 01:14:51 tendo [64681.194503]  [<ffffffff8118cc4e>] ?
mempool_alloc+0x6e/0x170
Dec 20 01:14:51 tendo [64681.194988]  [<ffffffff8130d143>]
generic_make_request+0x113/0x2c0
Dec 20 01:14:51 tendo [64681.195475]  [<ffffffff8130d367>] submit_bio+0x77/0x150
Dec 20 01:14:51 tendo [64681.195961]  [<ffffffff81194799>] ?
__test_set_page_writeback+0x149/0x1c0
Dec 20 01:14:51 tendo [64681.196458]  [<ffffffff811cbf9d>]
__swap_writepage+0x20d/0x250
Dec 20 01:14:51 tendo [64681.196948]  [<ffffffff811d155c>] ?
__frontswap_store+0x8c/0x120
Dec 20 01:14:51 tendo [64681.197454]  [<ffffffff811cc019>]
swap_writepage+0x39/0x70
Dec 20 01:14:51 tendo [64681.197941]  [<ffffffff8119b550>]
pageout.isra.43+0x170/0x290
Dec 20 01:14:51 tendo [64681.198426]  [<ffffffff8119d843>]
shrink_page_list+0x353/0x780
Dec 20 01:14:51 tendo [64681.198922]  [<ffffffff8119e2b7>]
shrink_inactive_list+0x217/0x530
Dec 20 01:14:51 tendo [64681.199417]  [<ffffffff8119eed8>]
shrink_lruvec+0x588/0x740
Dec 20 01:14:51 tendo [64681.199911]  [<ffffffff8119f17b>]
shrink_zone+0xeb/0x2e0
Dec 20 01:14:51 tendo [64681.200400]  [<ffffffff8119f6e4>]
do_try_to_free_pages+0x164/0x430
Dec 20 01:14:51 tendo [64681.200892]  [<ffffffff8119aace>] ?
throttle_direct_reclaim+0x14e/0x240
Dec 20 01:14:51 tendo [64681.201383]  [<ffffffff8119fa85>]
try_to_free_pages+0xd5/0x190
Dec 20 01:14:51 tendo [64681.201881]  [<ffffffff81202310>]
__alloc_pages_slowpath.constprop.89+0x3a0/0x6e8
Dec 20 01:14:51 tendo [64681.202386]  [<ffffffff81192fc3>]
__alloc_pages_nodemask+0x253/0x260
Dec 20 01:14:51 tendo [64681.202897]  [<ffffffff81193191>]
alloc_kmem_pages_node+0x51/0xd0
Dec 20 01:14:51 tendo [64681.203389]  [<ffffffff8107c3d0>]
copy_process+0x180/0x1d20
Dec 20 01:14:51 tendo [64681.203878]  [<ffffffff811e53d2>] ?
kmem_cache_alloc+0x1e2/0x200
Dec 20 01:14:51 tendo [64681.204376]  [<ffffffff81208e2c>] ?
get_empty_filp+0x5c/0x1a0
Dec 20 01:14:51 tendo [64681.204849]  [<ffffffff8107e116>] _do_fork+0x96/0x330
Dec 20 01:14:51 tendo [64681.205313]  [<ffffffff810038e9>] ?
syscall_trace_enter_phase1+0x139/0x150
Dec 20 01:14:51 tendo [64681.205810]  [<ffffffff8107e459>] SyS_clone+0x19/0x20
Dec 20 01:14:51 tendo [64681.206309]  [<ffffffff81703fee>]
entry_SYSCALL_64_fastpath+0x12/0x71
Dec 20 01:14:51 tendo [64681.206809] INFO: task loop0:1183 blocked for
more than 120 seconds.
Dec 20 01:14:51 tendo [64681.207302]       Not tainted
4.4.106-1.el7.elrepo.x86_64 #1
Dec 20 01:14:51 tendo [64681.207796] "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 20 01:14:51 tendo [64681.208302]  ffff8800a1d2f998
Dec 20 01:14:51 tendo ffff88040d6c42c0
Dec 20 01:14:51 tendo ffff880337ae42c0
Dec 20 01:14:51 tendo ffff8800a1d30000
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.208815]  ffff8801f3af59e0
Dec 20 01:14:51 tendo ffff8801f3af59d8
Dec 20 01:14:51 tendo ffff880337ae42c0
Dec 20 01:14:51 tendo ffff8803eaf1d290
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.209342]  ffff8800a1d2f9b0
Dec 20 01:14:51 tendo ffffffff81700435
Dec 20 01:14:51 tendo 7fffffffffffffff
Dec 20 01:14:51 tendo ffff8800a1d2fa58
Dec 20 01:14:51 tendo
Dec 20 01:14:51 tendo [64681.209848] Call Trace:
Dec 20 01:14:51 tendo [64681.210364]  [<ffffffff81700435>] schedule+0x35/0x80
Dec 20 01:14:51 tendo [64681.210874]  [<ffffffff81703147>]
schedule_timeout+0x237/0x2d0
Dec 20 01:14:51 tendo [64681.211376]  [<ffffffff810aa3d5>] ?
wake_up_process+0x15/0x20
Dec 20 01:14:51 tendo [64681.211875]  [<ffffffff810946a4>] ?
wake_up_worker+0x24/0x30
Dec 20 01:14:51 tendo [64681.212366]  [<ffffffff81095482>] ?
insert_work+0x62/0xa0
Dec 20 01:14:51 tendo [64681.212834]  [<ffffffff81700f31>]
wait_for_completion+0xf1/0x130
Dec 20 01:14:51 tendo [64681.213325]  [<ffffffff810aa460>] ? wake_up_q+0x80/0x80
Dec 20 01:14:51 tendo [64681.213803]  [<ffffffffa04c26e7>]
btrfs_async_run_delayed_refs+0x127/0x150 [btrfs]
Dec 20 01:14:51 tendo [64681.214272]  [<ffffffffa04de786>]
__btrfs_end_transaction+0x1f6/0x340 [btrfs]
Dec 20 01:14:51 tendo [64681.214745]  [<ffffffffa04de8e0>]
btrfs_end_transaction+0x10/0x20 [btrfs]
Dec 20 01:14:51 tendo [64681.215241]  [<ffffffffa04e3b6d>]
btrfs_dirty_inode+0x6d/0xe0 [btrfs]
Dec 20 01:14:51 tendo [64681.215739]  [<ffffffffa04e3c44>]
btrfs_update_time+0x64/0xc0 [btrfs]
Dec 20 01:14:51 tendo [64681.216223]  [<ffffffff81223638>] touch_atime+0xa8/0xd0
Dec 20 01:14:51 tendo [64681.216700]  [<ffffffff8118bc51>]
generic_file_read_iter+0x581/0x660
Dec 20 01:14:51 tendo [64681.217189]  [<ffffffff8118cb37>] ?
mempool_free_slab+0x17/0x20
Dec 20 01:14:51 tendo [64681.217694]  [<ffffffff8118cf7b>] ?
mempool_free+0x2b/0x80
Dec 20 01:14:51 tendo [64681.218182]  [<ffffffff81206874>]
vfs_iter_read+0x74/0xb0
Dec 20 01:14:51 tendo [64681.218657]  [<ffffffffa097719e>]
lo_read_simple.isra.27+0xbe/0x200 [loop]
Dec 20 01:14:51 tendo [64681.219138]  [<ffffffffa0977850>]
loop_queue_work+0x570/0x72c [loop]
Dec 20 01:14:51 tendo [64681.219615]  [<ffffffff81017695>] ?
__switch_to+0x1e5/0x490
Dec 20 01:14:51 tendo [64681.220091]  [<ffffffff816ffde9>] ?
__schedule+0x349/0x960
Dec 20 01:14:51 tendo [64681.220565]  [<ffffffff8109deeb>]
kthread_worker_fn+0x7b/0x170
Dec 20 01:14:51 tendo [64681.221056]  [<ffffffff8109de70>] ?
kthread_create_on_node+0x1a0/0x1a0
Dec 20 01:14:51 tendo [64681.221533]  [<ffffffff8109d8e5>] kthread+0xe5/0x100
Dec 20 01:14:51 tendo [64681.222009]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60
Dec 20 01:14:51 tendo [64681.222485]  [<ffffffff8170434f>]
ret_from_fork+0x3f/0x70
Dec 20 01:14:51 tendo [64681.222961]  [<ffffffff8109d800>] ?
kthread_park+0x60/0x60

On Tue, Dec 19, 2017 at 9:48 AM, Rich Rauenzahn <rrauenza@gmail.com> wrote:
> On Tue, Dec 19, 2017 at 9:14 AM, Hans van Kranenburg
> <hans.van.kranenburg@mendix.com> wrote:
>> Just wild first guess... are you also using btrfs send/receive
>> functionality where the system having problems is the sending part?
>
> No.
>
>>>> Every night I'm getting a kernel hang, sometimes caught by my
>>>> watchdog, sometimes not.  Last night's hang was on the balance of DATA
>>>> on / at 70.
>>>>
>>>> I'm not sure how to further trace this down to help you -- the console
>>>> by the time I notice just has lots of messages on it without the
>>>> initial ones.
>>
>> Capturing more logs is definitely the first thing to do.
>>
>> Look if the output of `dmesg` still shows the btrfs errors. Otherwise,
>
> But I can't ...  I have to hit the reset button (or the watchdog does)
> before I can get into the system again.
>
>> if something is spamming there, turn that off, or if you don't have the
>> errors in a log file because the log files are on the same btrfs, then
>> you have to find out another way to capture them. E.g. make the kernel
>> buffer for messages bigger, use netconsole or just pragmatic things like
>> ssh from another server and `dmesg -w` and store it on the other machine.
>
> Ah... yes, a permanent connection from another might work.
>
> I also have a pfsense box -- I think it can take a syslogd daemon.
> Maybe I can netconsole to it?
>
> Will investigate.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-20 16:24       ` Rich Rauenzahn
@ 2017-12-20 19:02         ` Chris Murphy
  2017-12-20 19:58           ` Holger Hoffstätte
  0 siblings, 1 reply; 10+ messages in thread
From: Chris Murphy @ 2017-12-20 19:02 UTC (permalink / raw)
  To: Rich Rauenzahn; +Cc: Hans van Kranenburg, Btrfs BTRFS

I don't know if it's the sending MUA or the list server, but the line
wrapping makes this much harder to follow. I suggest putting it in a
text file and attaching the text file. It's definitely not on the
receiving side, I see it here also:
https://www.spinics.net/lists/linux-btrfs/msg72872.html

Also anytime you get a hang, it's best to issue sysrq+w which dumps
extra information into kernel messages that the developers sometimes
find useful.


Chris Murphy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-20 19:02         ` Chris Murphy
@ 2017-12-20 19:58           ` Holger Hoffstätte
  2017-12-20 21:11             ` Rich Rauenzahn
  2017-12-21  6:26             ` Duncan
  0 siblings, 2 replies; 10+ messages in thread
From: Holger Hoffstätte @ 2017-12-20 19:58 UTC (permalink / raw)
  To: Chris Murphy, Rich Rauenzahn, Btrfs BTRFS; +Cc: Hans van Kranenburg

On 12/20/17 20:02, Chris Murphy wrote:
> I don't know if it's the sending MUA or the list server, but the line
> wrapping makes this much harder to follow. I suggest putting it in a
> text file and attaching the text file. It's definitely not on the
> receiving side, I see it here also:
> https://www.spinics.net/lists/linux-btrfs/msg72872.html

You can see enough to suggest that blk-mq is hanging, which is
"unsurprising" (being kind here) with such an old kernel.
Rich, build your kernel with CONFIG_SCSI_MQ_DEFAULT=n or
boot with scsi_mod.use_blk_mq=n as kernel prarameter.

-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-20 19:58           ` Holger Hoffstätte
@ 2017-12-20 21:11             ` Rich Rauenzahn
  2017-12-21  6:26             ` Duncan
  1 sibling, 0 replies; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-20 21:11 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: Chris Murphy, Btrfs BTRFS, Hans van Kranenburg

I switched to the LT kernel because of this issue.  I was running
mainline and thought that LT would get me stability.  I can switch
back to LT while we RCA.

At the risk of changing two things, I could add that
(scsi_mod.use_blk_mq=n) to my boot and also switch back to ML.  I do
notice that disk IO is high at the time of the hang -- I've assumed it
is btrfs balance, but if I read the logs right, it may have finished
before the hang.  Adding more logging.

I've also put the kernel logs into a pastebin:  https://pastebin.com/80qRXA9c

...wish I could get the watchdog to force a dump/trace.  Will look at that, too.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-20 19:58           ` Holger Hoffstätte
  2017-12-20 21:11             ` Rich Rauenzahn
@ 2017-12-21  6:26             ` Duncan
  2017-12-23 19:59               ` Rich Rauenzahn
  1 sibling, 1 reply; 10+ messages in thread
From: Duncan @ 2017-12-21  6:26 UTC (permalink / raw)
  To: linux-btrfs

Holger Hoffstätte posted on Wed, 20 Dec 2017 20:58:14 +0100 as excerpted:

> On 12/20/17 20:02, Chris Murphy wrote:
>> I don't know if it's the sending MUA or the list server, but the line
>> wrapping makes this much harder to follow. I suggest putting it in a
>> text file and attaching the text file. It's definitely not on the
>> receiving side, I see it here also:
>> https://www.spinics.net/lists/linux-btrfs/msg72872.html
> 
> You can see enough to suggest that blk-mq is hanging, which is
> "unsurprising" (being kind here) with such an old kernel. Rich, build
> your kernel with CONFIG_SCSI_MQ_DEFAULT=n or boot with
> scsi_mod.use_blk_mq=n as kernel prarameter.

Well, the kernel is 4.4 "el repo".  4.4 /is/ an LTS (and elrepo is AFAIK 
a red hat backports repo, not sure how official, but useful for people on 
red hat), but it's now the second-back LTS, with 4.9 and the new 4.14 
being newer LTS series.

The thing is that this is the btrfs list, and we're development and thus 
rather forward focused here.  As such, we normally want at /least/ the 
second newest (first back) lts series kernel for best chance at 
reasonable support.  While I understand people who want to stick with LTS 
being reluctant to go with the /newest/ LTS before even a single current 
release has passed, making that LTS still a bit new as such things go, 
certainly the one-back LTS, 4.9, should be reasonable.

So yes, tho 4.4 is at least an LTS, for purposes of btrfs and this list, 
it really is rather old now, and an upgrade, presumably to 4.9 in keeping 
with the LTS theme, would be recommended.  If the issue can be confirmed 
on the current and LTS 4.14, so much the better, but certainly, 4.9 is a 
recommended upgrade, and a bug there would still be in the concern for a 
fix range, while 4.4... really is just out of the focus range for this 
list, tho various longer focus distros will of course still provide 
support for it.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernel hangs during balance
  2017-12-21  6:26             ` Duncan
@ 2017-12-23 19:59               ` Rich Rauenzahn
  0 siblings, 0 replies; 10+ messages in thread
From: Rich Rauenzahn @ 2017-12-23 19:59 UTC (permalink / raw)
  To: Btrfs BTRFS

I finally got a full stack trace via sysrq.   A lot of stacks seem to
end in page fault -- I wonder if its because my page file is a
loopback on btrfs.

https://pastebin.com/GyWAu1EP

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.14.8-1.el7.elrepo.x86_64
root=UUID=35f0ce3f-0902-47a3-8ad8-86179d1f3e3a ro
rootflags=subvol=root00 vconsole.keymap=us crashkernel=128M
vconsole.font=latarcyrheb-sun16 consoleblank=600 scsi_mod.use_blk_mq=n
LANG=en_US.UTF-8

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-12-23 19:59 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-19 16:09 kernel hangs during balance Rich Rauenzahn
2017-12-19 17:08 ` Rich Rauenzahn
2017-12-19 17:14   ` Hans van Kranenburg
2017-12-19 17:48     ` Rich Rauenzahn
2017-12-20 16:24       ` Rich Rauenzahn
2017-12-20 19:02         ` Chris Murphy
2017-12-20 19:58           ` Holger Hoffstätte
2017-12-20 21:11             ` Rich Rauenzahn
2017-12-21  6:26             ` Duncan
2017-12-23 19:59               ` Rich Rauenzahn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox