public inbox for linux-ext4@vger.kernel.org
* Yet another ext4/jbd2 lockup
@ 2015-07-29 13:33 Nikolay Borisov
  2015-08-03 15:15 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Nikolay Borisov @ 2015-07-29 13:33 UTC (permalink / raw)
  To: linux-ext4; +Cc: Theodore Ts'o, Marian Marinov

Hello, 

Again, I have come up against a rather annoying ext4/jbd2
lockup and I'm not sure whether it's the same issue as the last one
I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2).
Again, all of this is happening on a 3.12.28 kernel. I did grep
through the ext4 commits that have come in afterwards but none of
them appears to fix anything similar. This time the blocked process
is an rsync which has been running OUTSIDE of a container, i.e. in
the root (/) cgroup, but it was copying data from a dm-thin volume
acting as the root device for a cgroup:

rsync           D ffffffff81622100     0 33158      1 0x00000004
 ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640
 ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640
 ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280
Call Trace:
 [<ffffffff81573f24>] schedule+0x24/0x70
 [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0
 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
 [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660
 [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30
 [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150
 [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0
 [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0
 [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330
 [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40
 [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400
 [<ffffffff8115de05>] ? generic_permission+0xc5/0x230
 [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100
 [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100
 [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20
 [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b

Sampling echo "w" > /proc/sysrq-trigger output, I found some other
blocked processes which might be related to the issue. In particular:

jbd2/dm-24-8    D 0000000000000000     0 31263      2 0x00000000
 ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640
 ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640
 ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000
Call Trace:
 [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70
 [<ffffffff81573f24>] schedule+0x24/0x70
 [<ffffffff81573ff7>] io_schedule+0x87/0xd0
 [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10
 [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0
 [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170
 [<ffffffff810f4fb2>] __lock_page+0x62/0x70
 [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40
 [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30
 [<ffffffff810fffe1>] write_cache_pages+0x391/0x470
 [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60
 [<ffffffff8110010c>] generic_writepages+0x4c/0x70
 [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790
 [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70
 [<ffffffff81235d12>] kjournald2+0xc2/0x250
 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
 [<ffffffff81235c50>] ? commit_timeout+0x10/0x10
 [<ffffffff81097456>] kthread+0xc6/0xd0
 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
 [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60

kworker/u81:90  D 0000000000000000     0 36695      2 0x00000000
Workqueue: writeback bdi_writeback_workfn (flush-253:24)
 ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640
 ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640
 ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080
Call Trace:
 [<ffffffff81573f24>] schedule+0x24/0x70
 [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130
 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
 [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70
 [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0
 [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0
 [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50
 [<ffffffff8110014b>] do_writepages+0x1b/0x30
 [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130
 [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0
 [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0
 [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0
 [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0
 [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0
 [<ffffffff810a4312>] ? wake_up_process+0x22/0x40
 [<ffffffff8109078b>] process_one_work+0x13b/0x460
 [<ffffffff81091a09>] worker_thread+0x119/0x3f0
 [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0
 [<ffffffff81097456>] kthread+0xc6/0xd0
 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
 [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60

Basically, dm-24 is the device this rsync was syncing from, and the
kworker is running as part of writeback for the same device
(253:24 is the major:minor of this particular device).
Furthermore, I see there have been a number of "sync" processes
executing, and their stack traces look like the following:

sync            D ffffffff81622100     0 17596  17532 0x00000000
 ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640
 ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640
 ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0
Call Trace:
 [<ffffffff81573f24>] schedule+0x24/0x70
 [<ffffffff81571955>] schedule_timeout+0x185/0x1c0
 [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150
 [<ffffffff81574984>] wait_for_completion+0x94/0x100
 [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0
 [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0
 [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
 [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0
 [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
 [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20
 [<ffffffff81157b29>] iterate_supers+0xa9/0xe0
 [<ffffffff8117e880>] sys_sync+0x30/0x90
 [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b

I believe it is this sync which has triggered the bdi_writeback_workfn
for the particular dm device. Essentially, what I gather is that the
rsync is blocked because it wants to do some filesystem activity and
thus needs to start a transaction, but it is held up by an unfinished
transaction - presumably the one from the writeback path. However,
the writeback is itself blocked in jbd2_log_wait_commit, waiting for
its own transaction to complete, and for some reason it is not making
progress. There haven't been any OOM messages for the container in
question and the global OOM killer hasn't triggered. Here is the
memory state of the machine when this was observed; there is still
some free memory:

SysRq : Show Memory
Mem-Info:
Node 0 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
CPU    4: hi:    0, btch:   1 usd:   0
CPU    5: hi:    0, btch:   1 usd:   0
CPU    6: hi:    0, btch:   1 usd:   0
CPU    7: hi:    0, btch:   1 usd:   0
CPU    8: hi:    0, btch:   1 usd:   0
CPU    9: hi:    0, btch:   1 usd:   0
CPU   10: hi:    0, btch:   1 usd:   0
CPU   11: hi:    0, btch:   1 usd:   0
CPU   12: hi:    0, btch:   1 usd:   0
CPU   13: hi:    0, btch:   1 usd:   0
CPU   14: hi:    0, btch:   1 usd:   0
CPU   15: hi:    0, btch:   1 usd:   0
CPU   16: hi:    0, btch:   1 usd:   0
CPU   17: hi:    0, btch:   1 usd:   0
CPU   18: hi:    0, btch:   1 usd:   0
CPU   19: hi:    0, btch:   1 usd:   0
CPU   20: hi:    0, btch:   1 usd:   0
CPU   21: hi:    0, btch:   1 usd:   0
CPU   22: hi:    0, btch:   1 usd:   0
CPU   23: hi:    0, btch:   1 usd:   0
CPU   24: hi:    0, btch:   1 usd:   0
CPU   25: hi:    0, btch:   1 usd:   0
CPU   26: hi:    0, btch:   1 usd:   0
CPU   27: hi:    0, btch:   1 usd:   0
CPU   28: hi:    0, btch:   1 usd:   0
CPU   29: hi:    0, btch:   1 usd:   0
CPU   30: hi:    0, btch:   1 usd:   0
CPU   31: hi:    0, btch:   1 usd:   0
CPU   32: hi:    0, btch:   1 usd:   0
CPU   33: hi:    0, btch:   1 usd:   0
CPU   34: hi:    0, btch:   1 usd:   0
CPU   35: hi:    0, btch:   1 usd:   0
CPU   36: hi:    0, btch:   1 usd:   0
CPU   37: hi:    0, btch:   1 usd:   0
CPU   38: hi:    0, btch:   1 usd:   0
CPU   39: hi:    0, btch:   1 usd:   0
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 177
CPU    1: hi:  186, btch:  31 usd: 129
CPU    2: hi:  186, btch:  31 usd: 145
CPU    3: hi:  186, btch:  31 usd:  60
CPU    4: hi:  186, btch:  31 usd: 105
CPU    5: hi:  186, btch:  31 usd:  96
CPU    6: hi:  186, btch:  31 usd:  71
CPU    7: hi:  186, btch:  31 usd:  53
CPU    8: hi:  186, btch:  31 usd: 121
CPU    9: hi:  186, btch:  31 usd: 171
CPU   10: hi:  186, btch:  31 usd:  77
CPU   11: hi:  186, btch:  31 usd:   0
CPU   12: hi:  186, btch:  31 usd:   0
CPU   13: hi:  186, btch:  31 usd:   0
CPU   14: hi:  186, btch:  31 usd:  29
CPU   15: hi:  186, btch:  31 usd:  36
CPU   16: hi:  186, btch:  31 usd:   0
CPU   17: hi:  186, btch:  31 usd:   0
CPU   18: hi:  186, btch:  31 usd:   0
CPU   19: hi:  186, btch:  31 usd:  22
CPU   20: hi:  186, btch:  31 usd:  76
CPU   21: hi:  186, btch:  31 usd: 161
CPU   22: hi:  186, btch:  31 usd:  73
CPU   23: hi:  186, btch:  31 usd:  78
CPU   24: hi:  186, btch:  31 usd: 125
CPU   25: hi:  186, btch:  31 usd:  73
CPU   26: hi:  186, btch:  31 usd:  72
CPU   27: hi:  186, btch:  31 usd:  38
CPU   28: hi:  186, btch:  31 usd: 118
CPU   29: hi:  186, btch:  31 usd:  63
CPU   30: hi:  186, btch:  31 usd:   0
CPU   31: hi:  186, btch:  31 usd:   0
CPU   32: hi:  186, btch:  31 usd:   0
CPU   33: hi:  186, btch:  31 usd:   0
CPU   34: hi:  186, btch:  31 usd:   0
CPU   35: hi:  186, btch:  31 usd:   0
CPU   36: hi:  186, btch:  31 usd:   0
CPU   37: hi:  186, btch:  31 usd:   0
CPU   38: hi:  186, btch:  31 usd:   0
CPU   39: hi:  186, btch:  31 usd:   0
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 188
CPU    1: hi:  186, btch:  31 usd: 174
CPU    2: hi:  186, btch:  31 usd: 116
CPU    3: hi:  186, btch:  31 usd:  96
CPU    4: hi:  186, btch:  31 usd: 169
CPU    5: hi:  186, btch:  31 usd: 137
CPU    6: hi:  186, btch:  31 usd: 102
CPU    7: hi:  186, btch:  31 usd:  96
CPU    8: hi:  186, btch:  31 usd: 149
CPU    9: hi:  186, btch:  31 usd: 203
CPU   10: hi:  186, btch:  31 usd: 179
CPU   11: hi:  186, btch:  31 usd: 181
CPU   12: hi:  186, btch:  31 usd: 159
CPU   13: hi:  186, btch:  31 usd: 160
CPU   14: hi:  186, btch:  31 usd: 179
CPU   15: hi:  186, btch:  31 usd: 158
CPU   16: hi:  186, btch:  31 usd: 180
CPU   17: hi:  186, btch:  31 usd: 155
CPU   18: hi:  186, btch:  31 usd: 156
CPU   19: hi:  186, btch:  31 usd: 173
CPU   20: hi:  186, btch:  31 usd: 161
CPU   21: hi:  186, btch:  31 usd: 163
CPU   22: hi:  186, btch:  31 usd: 123
CPU   23: hi:  186, btch:  31 usd: 157
CPU   24: hi:  186, btch:  31 usd:  19
CPU   25: hi:  186, btch:  31 usd: 159
CPU   26: hi:  186, btch:  31 usd: 176
CPU   27: hi:  186, btch:  31 usd: 140
CPU   28: hi:  186, btch:  31 usd: 180
CPU   29: hi:  186, btch:  31 usd: 169
CPU   30: hi:  186, btch:  31 usd: 158
CPU   31: hi:  186, btch:  31 usd: 181
CPU   32: hi:  186, btch:  31 usd: 181
CPU   33: hi:  186, btch:  31 usd: 159
CPU   34: hi:  186, btch:  31 usd: 168
CPU   35: hi:  186, btch:  31 usd: 173
CPU   36: hi:  186, btch:  31 usd: 161
CPU   37: hi:  186, btch:  31 usd: 179
CPU   38: hi:  186, btch:  31 usd: 184
CPU   39: hi:  186, btch:  31 usd: 182
Node 1 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 164
CPU    1: hi:  186, btch:  31 usd: 167
CPU    2: hi:  186, btch:  31 usd: 160
CPU    3: hi:  186, btch:  31 usd: 171
CPU    4: hi:  186, btch:  31 usd: 175
CPU    5: hi:  186, btch:  31 usd: 184
CPU    6: hi:  186, btch:  31 usd: 169
CPU    7: hi:  186, btch:  31 usd: 159
CPU    8: hi:  186, btch:  31 usd: 171
CPU    9: hi:  186, btch:  31 usd: 185
CPU   10: hi:  186, btch:  31 usd:  86
CPU   11: hi:  186, btch:  31 usd:  73
CPU   12: hi:  186, btch:  31 usd:  29
CPU   13: hi:  186, btch:  31 usd:  62
CPU   14: hi:  186, btch:  31 usd:  91
CPU   15: hi:  186, btch:  31 usd: 114
CPU   16: hi:  186, btch:  31 usd:  69
CPU   17: hi:  186, btch:  31 usd: 117
CPU   18: hi:  186, btch:  31 usd:  17
CPU   19: hi:  186, btch:  31 usd:  20
CPU   20: hi:  186, btch:  31 usd: 163
CPU   21: hi:  186, btch:  31 usd: 161
CPU   22: hi:  186, btch:  31 usd: 175
CPU   23: hi:  186, btch:  31 usd: 172
CPU   24: hi:  186, btch:  31 usd: 177
CPU   25: hi:  186, btch:  31 usd: 171
CPU   26: hi:  186, btch:  31 usd: 157
CPU   27: hi:  186, btch:  31 usd: 171
CPU   28: hi:  186, btch:  31 usd: 182
CPU   29: hi:  186, btch:  31 usd: 183
CPU   30: hi:  186, btch:  31 usd:  40
CPU   31: hi:  186, btch:  31 usd: 157
CPU   32: hi:  186, btch:  31 usd:  76
CPU   33: hi:  186, btch:  31 usd: 157
CPU   34: hi:  186, btch:  31 usd:  63
CPU   35: hi:  186, btch:  31 usd:  62
CPU   36: hi:  186, btch:  31 usd:  58
CPU   37: hi:  186, btch:  31 usd:  36
CPU   38: hi:  186, btch:  31 usd:  19
CPU   39: hi:  186, btch:  31 usd:  47
active_anon:6762239 inactive_anon:12245334 isolated_anon:0
 active_file:10705451 inactive_file:11321321 isolated_file:0
 unevictable:5516901 dirty:457441 writeback:0 unstable:0
 free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537
 mapped:1567688 shmem:904717 pagetables:362061 bounce:0
 free_cma:0
Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 1668 128642 128642
Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 126973 126973
Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB
Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB
Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB
Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB
23967743 total pagecache pages
1034858 pages in swap cache
Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552
Free swap  = 16464476kB
Total swap = 41943036kB
67100065 pages RAM
0 pages HighMem/MovableOnly
524961 pages reserved


* Re: Yet another ext4/jbd2 lockup
  2015-07-29 13:33 Yet another ext4/jbd2 lockup Nikolay Borisov
@ 2015-08-03 15:15 ` Jan Kara
       [not found]   ` <55C4616A.10100@kyup.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2015-08-03 15:15 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-ext4, Theodore Ts'o, Marian Marinov

  Hello,

On Wed 29-07-15 16:33:18, Nikolay Borisov wrote:
> Again, I have come up against a rather annoying ext4/jbd2
> lockup and I'm not sure whether it's the same issue as the last one
> I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2).
> Again, all of this is happening on a 3.12.28 kernel. I did grep
> through the ext4 commits that have come in afterwards but none of
> them appears to fix anything similar. This time the blocked process
> is an rsync which has been running OUTSIDE of a container, i.e. in
> the root (/) cgroup, but it was copying data from a dm-thin volume
> acting as the root device for a cgroup:
> 
> rsync           D ffffffff81622100     0 33158      1 0x00000004
>  ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640
>  ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640
>  ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280
> Call Trace:
>  [<ffffffff81573f24>] schedule+0x24/0x70
>  [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0
>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>  [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660
>  [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30
>  [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150
>  [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0
>  [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0
>  [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330
>  [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40
>  [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400
>  [<ffffffff8115de05>] ? generic_permission+0xc5/0x230
>  [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100
>  [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100
>  [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20
>  [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b

So this is waiting to start a new transaction in the journal.

> Sampling echo "w" > /proc/sysrq-trigger I found some other 

You can just attach the full output of the above command. That way we
can be sure we see all the pieces...

> blocked processes which might be related to the issue. In particular: 
> 
> jbd2/dm-24-8    D 0000000000000000     0 31263      2 0x00000000
>  ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640
>  ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640
>  ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000
> Call Trace:
>  [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70
>  [<ffffffff81573f24>] schedule+0x24/0x70
>  [<ffffffff81573ff7>] io_schedule+0x87/0xd0
>  [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10
>  [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0
>  [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170
>  [<ffffffff810f4fb2>] __lock_page+0x62/0x70
>  [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40
>  [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30
>  [<ffffffff810fffe1>] write_cache_pages+0x391/0x470
>  [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60
>  [<ffffffff8110010c>] generic_writepages+0x4c/0x70
>  [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790
>  [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70
>  [<ffffffff81235d12>] kjournald2+0xc2/0x250
>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>  [<ffffffff81235c50>] ? commit_timeout+0x10/0x10
>  [<ffffffff81097456>] kthread+0xc6/0xd0
>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60

So jbd2 thread is waiting for page lock. Interesting...

> kworker/u81:90  D 0000000000000000     0 36695      2 0x00000000
> Workqueue: writeback bdi_writeback_workfn (flush-253:24)
>  ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640
>  ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640
>  ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080
> Call Trace:
>  [<ffffffff81573f24>] schedule+0x24/0x70
>  [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130
>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>  [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70
>  [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0
>  [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0
>  [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50
>  [<ffffffff8110014b>] do_writepages+0x1b/0x30
>  [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130
>  [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0
>  [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0
>  [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0
>  [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0
>  [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0
>  [<ffffffff810a4312>] ? wake_up_process+0x22/0x40
>  [<ffffffff8109078b>] process_one_work+0x13b/0x460
>  [<ffffffff81091a09>] worker_thread+0x119/0x3f0
>  [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0
>  [<ffffffff81097456>] kthread+0xc6/0xd0
>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60

And this is likely the culprit - the flusher thread is waiting for a
journal commit in jbd2_journal_stop(), at which point it may be holding
page locks which are blocking the jbd2 thread. Now, waiting for a
transaction commit in jbd2_journal_stop() is really unexpected. That
happens only if the handle was a sync one, and the only way I can see a
handle becoming a sync one from the writeback path is when it would be
the first file over 4 GB in the filesystem, which looks really
unlikely... Can you post the output of dumpe2fs for the filesystem?
Thanks!

								Honza

> 67100065 pages RAM
> 0 pages HighMem/MovableOnly
> 524961 pages reserved
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Yet another ext4/jbd2 lockup
       [not found]   ` <55C4616A.10100@kyup.com>
@ 2015-08-10 13:58     ` Jan Kara
  2015-08-10 15:45       ` Marian Marinov
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2015-08-10 13:58 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Jan Kara, linux-ext4, Theodore Ts'o, Marian Marinov

[-- Attachment #1: Type: text/plain, Size: 19514 bytes --]

  Hello,

On Fri 07-08-15 10:42:34, Nikolay Borisov wrote:
> I have attached an XZ archive of the dumpe2fs command.

Thanks. So I don't see anything suspicious in the dumpe2fs output. I've looked
again at the code and I still don't see how a handle can be marked as sync in
ext4_writepages(). I wrote the attached patch to dump the stack in case this
happens. The dumping doesn't trigger for me. Can you reproduce the problem?
Can you run a kernel with the attached patch applied and report any stack
dumps that appear in the kernel log?

								Honza

> On 08/03/2015 06:15 PM, Jan Kara wrote:
> >   Hello,
> > 
> > On Wed 29-07-15 16:33:18, Nikolay Borisov wrote:
> >> Again, I have come up against a rather annoying ext4/jbd2 
> >> lock and I'm not sure it's the same issue as the last one 
> >> I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). 
> >> Again, all of this is happening on an 3.12.28 kernel. I did grep for
> >> the ext4 commits that have come afterwards but none of them allude to
> >> fixing something similar. This time the blocked process is an rsync 
> >> which has been running OUTSIDE of a container e.g. in the / cgroup 
> >> but it was copying stuff from a dm-thin volume acting as the root 
> >> device for a cgroup: 
> >>
> >> rsync           D ffffffff81622100     0 33158      1 0x00000004
> >>  ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640
> >>  ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640
> >>  ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280
> >> Call Trace:
> >>  [<ffffffff81573f24>] schedule+0x24/0x70
> >>  [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0
> >>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
> >>  [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660
> >>  [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30
> >>  [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150
> >>  [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0
> >>  [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0
> >>  [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330
> >>  [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40
> >>  [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400
> >>  [<ffffffff8115de05>] ? generic_permission+0xc5/0x230
> >>  [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100
> >>  [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100
> >>  [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20
> >>  [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b
> > 
> > So this is waiting to start a new transaction in the journal.
> > 
> >> Sampling echo "w" > /proc/sysrq-trigger I found some other 
> > 
> > You can just attach full output from the above command. That way we can be
> > sure we see all the pieces...
> > 
> >> blocked process which might be related to the issue. In particular: 
> >>
> >> jbd2/dm-24-8    D 0000000000000000     0 31263      2 0x00000000
> >>  ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640
> >>  ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640
> >>  ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000
> >> Call Trace:
> >>  [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70
> >>  [<ffffffff81573f24>] schedule+0x24/0x70
> >>  [<ffffffff81573ff7>] io_schedule+0x87/0xd0
> >>  [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10
> >>  [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0
> >>  [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170
> >>  [<ffffffff810f4fb2>] __lock_page+0x62/0x70
> >>  [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40
> >>  [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30
> >>  [<ffffffff810fffe1>] write_cache_pages+0x391/0x470
> >>  [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60
> >>  [<ffffffff8110010c>] generic_writepages+0x4c/0x70
> >>  [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790
> >>  [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70
> >>  [<ffffffff81235d12>] kjournald2+0xc2/0x250
> >>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
> >>  [<ffffffff81235c50>] ? commit_timeout+0x10/0x10
> >>  [<ffffffff81097456>] kthread+0xc6/0xd0
> >>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
> >>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
> >>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
> > 
> > So jbd2 thread is waiting for page lock. Interesting...
> > 
> >> kworker/u81:90  D 0000000000000000     0 36695      2 0x00000000
> >> Workqueue: writeback bdi_writeback_workfn (flush-253:24)
> >>  ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640
> >>  ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640
> >>  ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080
> >> Call Trace:
> >>  [<ffffffff81573f24>] schedule+0x24/0x70
> >>  [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130
> >>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
> >>  [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70
> >>  [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0
> >>  [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0
> >>  [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50
> >>  [<ffffffff8110014b>] do_writepages+0x1b/0x30
> >>  [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130
> >>  [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0
> >>  [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0
> >>  [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0
> >>  [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0
> >>  [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0
> >>  [<ffffffff810a4312>] ? wake_up_process+0x22/0x40
> >>  [<ffffffff8109078b>] process_one_work+0x13b/0x460
> >>  [<ffffffff81091a09>] worker_thread+0x119/0x3f0
> >>  [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0
> >>  [<ffffffff81097456>] kthread+0xc6/0xd0
> >>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
> >>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
> >>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
> > 
> > And this is likely the culprit - flusher thread is waiting for journal
> > commit from jbd2_journal_stop() at which point it may be holding page locks
> > which are blocking jbd2 thread. Now waiting from transaction commit from
> > jbd2_journal_stop() is really unexpected. That happens only if the
> > handle was a sync one and the only way I can see handle becoming a sync one
> > from a writeback path is when this would be the first file over 4 GB in the
> > filesystem which looks really unlikely... Can you post output of dumpe2fs
> > for the filesystem? Thanks!
> > 
> > 								Honza
> > 
> >> Basically dm-24 is the device this rsync was syncing from  and the 
> >> kworker is running as part of writeback for the same device 
> >> (253:24 is the major:minor for this particular device). 
> >> Furthermore, I see there have been a number of "sync" executed and 
> >> their stack traces look like the following: 
> >>
> >> sync            D ffffffff81622100     0 17596  17532 0x00000000
> >>  ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640
> >>  ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640
> >>  ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0
> >> Call Trace:
> >>  [<ffffffff81573f24>] schedule+0x24/0x70
> >>  [<ffffffff81571955>] schedule_timeout+0x185/0x1c0
> >>  [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150
> >>  [<ffffffff81574984>] wait_for_completion+0x94/0x100
> >>  [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0
> >>  [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0
> >>  [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
> >>  [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0
> >>  [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
> >>  [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20
> >>  [<ffffffff81157b29>] iterate_supers+0xa9/0xe0
> >>  [<ffffffff8117e880>] sys_sync+0x30/0x90
> >>  [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b
> >>
> >> I believe it is this sync which has triggered the bdi_writeback_workfn 
> >> for the particular dm device. Essentially what I gather is that the 
> >> rsync is locked since it wants to do some filesystem activity, hence 
> >> it wants to start a transaction but it's blocked due to an unfinished 
> >> transaction - presumably the one from the writeback path. However, 
> >> the writeback is blocked in jbd2_log_wait_commit, waiting for its 
> >> own transaction to complete and it is not progressing due to some 
> >> reason. There haven't been any OOM messages for the container in 
> >> question and the global OOM killer hasn't triggered. Below is the 
> >> memory state of the machine when this was observed; there is 
> >> still some free memory: 
> >>
> >> SysRq : Show Memory
> >> Mem-Info:
> >> Node 0 DMA per-cpu:
> >> CPU    0: hi:    0, btch:   1 usd:   0
> >> CPU    1: hi:    0, btch:   1 usd:   0
> >> CPU    2: hi:    0, btch:   1 usd:   0
> >> CPU    3: hi:    0, btch:   1 usd:   0
> >> CPU    4: hi:    0, btch:   1 usd:   0
> >> CPU    5: hi:    0, btch:   1 usd:   0
> >> CPU    6: hi:    0, btch:   1 usd:   0
> >> CPU    7: hi:    0, btch:   1 usd:   0
> >> CPU    8: hi:    0, btch:   1 usd:   0
> >> CPU    9: hi:    0, btch:   1 usd:   0
> >> CPU   10: hi:    0, btch:   1 usd:   0
> >> CPU   11: hi:    0, btch:   1 usd:   0
> >> CPU   12: hi:    0, btch:   1 usd:   0
> >> CPU   13: hi:    0, btch:   1 usd:   0
> >> CPU   14: hi:    0, btch:   1 usd:   0
> >> CPU   15: hi:    0, btch:   1 usd:   0
> >> CPU   16: hi:    0, btch:   1 usd:   0
> >> CPU   17: hi:    0, btch:   1 usd:   0
> >> CPU   18: hi:    0, btch:   1 usd:   0
> >> CPU   19: hi:    0, btch:   1 usd:   0
> >> CPU   20: hi:    0, btch:   1 usd:   0
> >> CPU   21: hi:    0, btch:   1 usd:   0
> >> CPU   22: hi:    0, btch:   1 usd:   0
> >> CPU   23: hi:    0, btch:   1 usd:   0
> >> CPU   24: hi:    0, btch:   1 usd:   0
> >> CPU   25: hi:    0, btch:   1 usd:   0
> >> CPU   26: hi:    0, btch:   1 usd:   0
> >> CPU   27: hi:    0, btch:   1 usd:   0
> >> CPU   28: hi:    0, btch:   1 usd:   0
> >> CPU   29: hi:    0, btch:   1 usd:   0
> >> CPU   30: hi:    0, btch:   1 usd:   0
> >> CPU   31: hi:    0, btch:   1 usd:   0
> >> CPU   32: hi:    0, btch:   1 usd:   0
> >> CPU   33: hi:    0, btch:   1 usd:   0
> >> CPU   34: hi:    0, btch:   1 usd:   0
> >> CPU   35: hi:    0, btch:   1 usd:   0
> >> CPU   36: hi:    0, btch:   1 usd:   0
> >> CPU   37: hi:    0, btch:   1 usd:   0
> >> CPU   38: hi:    0, btch:   1 usd:   0
> >> CPU   39: hi:    0, btch:   1 usd:   0
> >> Node 0 DMA32 per-cpu:
> >> CPU    0: hi:  186, btch:  31 usd: 177
> >> CPU    1: hi:  186, btch:  31 usd: 129
> >> CPU    2: hi:  186, btch:  31 usd: 145
> >> CPU    3: hi:  186, btch:  31 usd:  60
> >> CPU    4: hi:  186, btch:  31 usd: 105
> >> CPU    5: hi:  186, btch:  31 usd:  96
> >> CPU    6: hi:  186, btch:  31 usd:  71
> >> CPU    7: hi:  186, btch:  31 usd:  53
> >> CPU    8: hi:  186, btch:  31 usd: 121
> >> CPU    9: hi:  186, btch:  31 usd: 171
> >> CPU   10: hi:  186, btch:  31 usd:  77
> >> CPU   11: hi:  186, btch:  31 usd:   0
> >> CPU   12: hi:  186, btch:  31 usd:   0
> >> CPU   13: hi:  186, btch:  31 usd:   0
> >> CPU   14: hi:  186, btch:  31 usd:  29
> >> CPU   15: hi:  186, btch:  31 usd:  36
> >> CPU   16: hi:  186, btch:  31 usd:   0
> >> CPU   17: hi:  186, btch:  31 usd:   0
> >> CPU   18: hi:  186, btch:  31 usd:   0
> >> CPU   19: hi:  186, btch:  31 usd:  22
> >> CPU   20: hi:  186, btch:  31 usd:  76
> >> CPU   21: hi:  186, btch:  31 usd: 161
> >> CPU   22: hi:  186, btch:  31 usd:  73
> >> CPU   23: hi:  186, btch:  31 usd:  78
> >> CPU   24: hi:  186, btch:  31 usd: 125
> >> CPU   25: hi:  186, btch:  31 usd:  73
> >> CPU   26: hi:  186, btch:  31 usd:  72
> >> CPU   27: hi:  186, btch:  31 usd:  38
> >> CPU   28: hi:  186, btch:  31 usd: 118
> >> CPU   29: hi:  186, btch:  31 usd:  63
> >> CPU   30: hi:  186, btch:  31 usd:   0
> >> CPU   31: hi:  186, btch:  31 usd:   0
> >> CPU   32: hi:  186, btch:  31 usd:   0
> >> CPU   33: hi:  186, btch:  31 usd:   0
> >> CPU   34: hi:  186, btch:  31 usd:   0
> >> CPU   35: hi:  186, btch:  31 usd:   0
> >> CPU   36: hi:  186, btch:  31 usd:   0
> >> CPU   37: hi:  186, btch:  31 usd:   0
> >> CPU   38: hi:  186, btch:  31 usd:   0
> >> CPU   39: hi:  186, btch:  31 usd:   0
> >> Node 0 Normal per-cpu:
> >> CPU    0: hi:  186, btch:  31 usd: 188
> >> CPU    1: hi:  186, btch:  31 usd: 174
> >> CPU    2: hi:  186, btch:  31 usd: 116
> >> CPU    3: hi:  186, btch:  31 usd:  96
> >> CPU    4: hi:  186, btch:  31 usd: 169
> >> CPU    5: hi:  186, btch:  31 usd: 137
> >> CPU    6: hi:  186, btch:  31 usd: 102
> >> CPU    7: hi:  186, btch:  31 usd:  96
> >> CPU    8: hi:  186, btch:  31 usd: 149
> >> CPU    9: hi:  186, btch:  31 usd: 203
> >> CPU   10: hi:  186, btch:  31 usd: 179
> >> CPU   11: hi:  186, btch:  31 usd: 181
> >> CPU   12: hi:  186, btch:  31 usd: 159
> >> CPU   13: hi:  186, btch:  31 usd: 160
> >> CPU   14: hi:  186, btch:  31 usd: 179
> >> CPU   15: hi:  186, btch:  31 usd: 158
> >> CPU   16: hi:  186, btch:  31 usd: 180
> >> CPU   17: hi:  186, btch:  31 usd: 155
> >> CPU   18: hi:  186, btch:  31 usd: 156
> >> CPU   19: hi:  186, btch:  31 usd: 173
> >> CPU   20: hi:  186, btch:  31 usd: 161
> >> CPU   21: hi:  186, btch:  31 usd: 163
> >> CPU   22: hi:  186, btch:  31 usd: 123
> >> CPU   23: hi:  186, btch:  31 usd: 157
> >> CPU   24: hi:  186, btch:  31 usd:  19
> >> CPU   25: hi:  186, btch:  31 usd: 159
> >> CPU   26: hi:  186, btch:  31 usd: 176
> >> CPU   27: hi:  186, btch:  31 usd: 140
> >> CPU   28: hi:  186, btch:  31 usd: 180
> >> CPU   29: hi:  186, btch:  31 usd: 169
> >> CPU   30: hi:  186, btch:  31 usd: 158
> >> CPU   31: hi:  186, btch:  31 usd: 181
> >> CPU   32: hi:  186, btch:  31 usd: 181
> >> CPU   33: hi:  186, btch:  31 usd: 159
> >> CPU   34: hi:  186, btch:  31 usd: 168
> >> CPU   35: hi:  186, btch:  31 usd: 173
> >> CPU   36: hi:  186, btch:  31 usd: 161
> >> CPU   37: hi:  186, btch:  31 usd: 179
> >> CPU   38: hi:  186, btch:  31 usd: 184
> >> CPU   39: hi:  186, btch:  31 usd: 182
> >> Node 1 Normal per-cpu:
> >> CPU    0: hi:  186, btch:  31 usd: 164
> >> CPU    1: hi:  186, btch:  31 usd: 167
> >> CPU    2: hi:  186, btch:  31 usd: 160
> >> CPU    3: hi:  186, btch:  31 usd: 171
> >> CPU    4: hi:  186, btch:  31 usd: 175
> >> CPU    5: hi:  186, btch:  31 usd: 184
> >> CPU    6: hi:  186, btch:  31 usd: 169
> >> CPU    7: hi:  186, btch:  31 usd: 159
> >> CPU    8: hi:  186, btch:  31 usd: 171
> >> CPU    9: hi:  186, btch:  31 usd: 185
> >> CPU   10: hi:  186, btch:  31 usd:  86
> >> CPU   11: hi:  186, btch:  31 usd:  73
> >> CPU   12: hi:  186, btch:  31 usd:  29
> >> CPU   13: hi:  186, btch:  31 usd:  62
> >> CPU   14: hi:  186, btch:  31 usd:  91
> >> CPU   15: hi:  186, btch:  31 usd: 114
> >> CPU   16: hi:  186, btch:  31 usd:  69
> >> CPU   17: hi:  186, btch:  31 usd: 117
> >> CPU   18: hi:  186, btch:  31 usd:  17
> >> CPU   19: hi:  186, btch:  31 usd:  20
> >> CPU   20: hi:  186, btch:  31 usd: 163
> >> CPU   21: hi:  186, btch:  31 usd: 161
> >> CPU   22: hi:  186, btch:  31 usd: 175
> >> CPU   23: hi:  186, btch:  31 usd: 172
> >> CPU   24: hi:  186, btch:  31 usd: 177
> >> CPU   25: hi:  186, btch:  31 usd: 171
> >> CPU   26: hi:  186, btch:  31 usd: 157
> >> CPU   27: hi:  186, btch:  31 usd: 171
> >> CPU   28: hi:  186, btch:  31 usd: 182
> >> CPU   29: hi:  186, btch:  31 usd: 183
> >> CPU   30: hi:  186, btch:  31 usd:  40
> >> CPU   31: hi:  186, btch:  31 usd: 157
> >> CPU   32: hi:  186, btch:  31 usd:  76
> >> CPU   33: hi:  186, btch:  31 usd: 157
> >> CPU   34: hi:  186, btch:  31 usd:  63
> >> CPU   35: hi:  186, btch:  31 usd:  62
> >> CPU   36: hi:  186, btch:  31 usd:  58
> >> CPU   37: hi:  186, btch:  31 usd:  36
> >> CPU   38: hi:  186, btch:  31 usd:  19
> >> CPU   39: hi:  186, btch:  31 usd:  47
> >> active_anon:6762239 inactive_anon:12245334 isolated_anon:0
> >>  active_file:10705451 inactive_file:11321321 isolated_file:0
> >>  unevictable:5516901 dirty:457441 writeback:0 unstable:0
> >>  free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537
> >>  mapped:1567688 shmem:904717 pagetables:362061 bounce:0
> >>  free_cma:0
> >> Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> >> lowmem_reserve[]: 0 1668 128642 128642
> >> Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >> lowmem_reserve[]: 0 0 126973 126973
> >> Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >> lowmem_reserve[]: 0 0 0 0
> >> Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> >> lowmem_reserve[]: 0 0 0 0
> >> Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB
> >> Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB
> >> Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB
> >> Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB
> >> 23967743 total pagecache pages
> >> 1034858 pages in swap cache
> >> Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552
> >> Free swap  = 16464476kB
> >> Total swap = 41943036kB
> >> 67100065 pages RAM
> >> 0 pages HighMem/MovableOnly
> >> 524961 pages reserved


-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-jbd2-Debug-sync-handle-from-writeback.patch --]
[-- Type: text/x-patch, Size: 905 bytes --]

From 1d98bc7c3d2940c25ac2ca0cb0a3f6101a942671 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.com>
Date: Mon, 10 Aug 2015 11:06:11 +0200
Subject: [PATCH] jbd2: Debug sync handle from writeback

Signed-off-by: Jan Kara <jack@suse.com>
---
 fs/ext4/ext4_jbd2.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index 9c5b49fb281e..0a63b49caf4e 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -278,8 +278,13 @@ static inline int ext4_handle_valid(handle_t *handle)
 
 static inline void ext4_handle_sync(handle_t *handle)
 {
-	if (ext4_handle_valid(handle))
+	if (ext4_handle_valid(handle)) {
 		handle->h_sync = 1;
+		if (handle->h_type == EXT4_HT_WRITE_PAGE) {
+			printk(KERN_ERR "ext4: Writeback handle is sync!\n");
+			dump_stack();
+		}
+	}
 }
 
 static inline int ext4_handle_is_aborted(handle_t *handle)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: Yet another ext4/jbd2 lockup
  2015-08-10 13:58     ` Jan Kara
@ 2015-08-10 15:45       ` Marian Marinov
  0 siblings, 0 replies; 4+ messages in thread
From: Marian Marinov @ 2015-08-10 15:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, Theodore Ts'o, Marian Marinov

On 08/10/2015 04:58 PM, Jan Kara wrote:
>   Hello,
> 
> On Fri 07-08-15 10:42:34, Nikolay Borisov wrote:
>> I have attached an XZ archive of the dumpe2fs command.
> 
> Thanks. So I don't see anything suspicious in the dumpe2fs output. I've looked
> again at the code and I still don't see how a handle can be marked as sync in
> ext4_writepages(). I wrote the attached patch to dump the stack in case this
> happens. The dumping doesn't trigger for me. Can you reproduce the problem?
> Can you run a kernel with the attached patch applied and report any stack
> dumps that appear in the kernel log?
> 
> 								Honza

Thank you for the patch. We will deploy it on one of our servers and watch for the issue.
Our next maintenance window is next week; we will merge the patch into our 4.1 kernel branch.

Unfortunately, it is very hard to reproduce this in our test lab. We have been trying for the past two weeks,
running fuzzers and stress tests inside containers and waiting to see the same issue.

Marian


> 
>> On 08/03/2015 06:15 PM, Jan Kara wrote:
>>>   Hello,
>>>
>>> On Wed 29-07-15 16:33:18, Nikolay Borisov wrote:
>>>> Again, I have come up against a rather annoying ext4/jbd2 
>>>> lock and I'm not sure it's the same issue as the last one 
>>>> I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). 
>>>> Again, all of this is happening on an 3.12.28 kernel. I did grep for
>>>> the ext4 commits that have come afterwards but none of them allude to
>>>> fixing something similar. This time the blocked process is an rsync 
>>>> which has been running OUTSIDE of a container e.g. in the / cgroup 
>>>> but it was copying stuff from a dm-thin volume acting as the root 
>>>> device for a cgroup: 
>>>>
>>>> rsync           D ffffffff81622100     0 33158      1 0x00000004
>>>>  ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640
>>>>  ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640
>>>>  ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280
>>>> Call Trace:
>>>>  [<ffffffff81573f24>] schedule+0x24/0x70
>>>>  [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0
>>>>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>>>>  [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660
>>>>  [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30
>>>>  [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150
>>>>  [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0
>>>>  [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0
>>>>  [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330
>>>>  [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40
>>>>  [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400
>>>>  [<ffffffff8115de05>] ? generic_permission+0xc5/0x230
>>>>  [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100
>>>>  [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100
>>>>  [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20
>>>>  [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b
>>>
>>> So this is waiting to start a new transaction in the journal.
>>>
>>>> Sampling echo "w" > /proc/sysrq-trigger I found some other 
>>>
>>> You can just attach full output from the above command. That way we can be
>>> sure we see all the pieces...
>>>
>>>> blocked process which might be related to the issue. In particular: 
>>>>
>>>> jbd2/dm-24-8    D 0000000000000000     0 31263      2 0x00000000
>>>>  ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640
>>>>  ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640
>>>>  ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000
>>>> Call Trace:
>>>>  [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70
>>>>  [<ffffffff81573f24>] schedule+0x24/0x70
>>>>  [<ffffffff81573ff7>] io_schedule+0x87/0xd0
>>>>  [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10
>>>>  [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0
>>>>  [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170
>>>>  [<ffffffff810f4fb2>] __lock_page+0x62/0x70
>>>>  [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40
>>>>  [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30
>>>>  [<ffffffff810fffe1>] write_cache_pages+0x391/0x470
>>>>  [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60
>>>>  [<ffffffff8110010c>] generic_writepages+0x4c/0x70
>>>>  [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790
>>>>  [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70
>>>>  [<ffffffff81235d12>] kjournald2+0xc2/0x250
>>>>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>>>>  [<ffffffff81235c50>] ? commit_timeout+0x10/0x10
>>>>  [<ffffffff81097456>] kthread+0xc6/0xd0
>>>>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>>>>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
>>>>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>>>
>>> So jbd2 thread is waiting for page lock. Interesting...
>>>
>>>> kworker/u81:90  D 0000000000000000     0 36695      2 0x00000000
>>>> Workqueue: writeback bdi_writeback_workfn (flush-253:24)
>>>>  ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640
>>>>  ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640
>>>>  ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080
>>>> Call Trace:
>>>>  [<ffffffff81573f24>] schedule+0x24/0x70
>>>>  [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130
>>>>  [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40
>>>>  [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70
>>>>  [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0
>>>>  [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0
>>>>  [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50
>>>>  [<ffffffff8110014b>] do_writepages+0x1b/0x30
>>>>  [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130
>>>>  [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0
>>>>  [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0
>>>>  [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0
>>>>  [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0
>>>>  [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0
>>>>  [<ffffffff810a4312>] ? wake_up_process+0x22/0x40
>>>>  [<ffffffff8109078b>] process_one_work+0x13b/0x460
>>>>  [<ffffffff81091a09>] worker_thread+0x119/0x3f0
>>>>  [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0
>>>>  [<ffffffff81097456>] kthread+0xc6/0xd0
>>>>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>>>>  [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0
>>>>  [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60
>>>
>>> And this is likely the culprit - flusher thread is waiting for journal
>>> commit from jbd2_journal_stop() at which point it may be holding page locks
>>> which are blocking jbd2 thread. Now waiting from transaction commit from
>>> jbd2_journal_stop() is really unexpected. That happens only if the
>>> handle was a sync one and the only way I can see handle becoming a sync one
>>> from a writeback path is when this would be the first file over 4 GB in the
>>> filesystem which looks really unlikely... Can you post output of dumpe2fs
>>> for the filesystem? Thanks!
>>>
>>> 								Honza
>>>
>>>> Basically dm-24 is the device this rsync was syncing from  and the 
>>>> kworker is running as part of writeback for the same device 
>>>> (253:24 is the major:minor for this particular device). 
>>>> Furthermore, I see there have been a number of "sync" executed and 
>>>> their stack traces look like the following: 
>>>>
>>>> sync            D ffffffff81622100     0 17596  17532 0x00000000
>>>>  ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640
>>>>  ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640
>>>>  ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0
>>>> Call Trace:
>>>>  [<ffffffff81573f24>] schedule+0x24/0x70
>>>>  [<ffffffff81571955>] schedule_timeout+0x185/0x1c0
>>>>  [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150
>>>>  [<ffffffff81574984>] wait_for_completion+0x94/0x100
>>>>  [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0
>>>>  [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0
>>>>  [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
>>>>  [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0
>>>>  [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20
>>>>  [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20
>>>>  [<ffffffff81157b29>] iterate_supers+0xa9/0xe0
>>>>  [<ffffffff8117e880>] sys_sync+0x30/0x90
>>>>  [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b
>>>>
>>>> I believe it is this sync which has triggered the bdi_writeback_workfn 
>>>> for the particular dm device. Essentially, what I gather is that the 
>>>> rsync is blocked because it wants to do some filesystem activity, 
>>>> hence it wants to start a transaction, but it is blocked behind an 
>>>> unfinished transaction - presumably the one from the writeback path. 
>>>> However, the writeback is blocked in jbd2_log_wait_commit, waiting 
>>>> for its own transaction to complete, and it is not progressing for 
>>>> some reason. There haven't been any OOM messages for the container 
>>>> in question, and the global OOM killer hasn't triggered. Here is the 
>>>> memory state of the machine when this was observed; there is still 
>>>> some free memory: 
>>>>
>>>> SysRq : Show Memory
>>>> Mem-Info:
>>>> Node 0 DMA per-cpu:
>>>> CPU    0: hi:    0, btch:   1 usd:   0
>>>> CPU    1: hi:    0, btch:   1 usd:   0
>>>> CPU    2: hi:    0, btch:   1 usd:   0
>>>> CPU    3: hi:    0, btch:   1 usd:   0
>>>> CPU    4: hi:    0, btch:   1 usd:   0
>>>> CPU    5: hi:    0, btch:   1 usd:   0
>>>> CPU    6: hi:    0, btch:   1 usd:   0
>>>> CPU    7: hi:    0, btch:   1 usd:   0
>>>> CPU    8: hi:    0, btch:   1 usd:   0
>>>> CPU    9: hi:    0, btch:   1 usd:   0
>>>> CPU   10: hi:    0, btch:   1 usd:   0
>>>> CPU   11: hi:    0, btch:   1 usd:   0
>>>> CPU   12: hi:    0, btch:   1 usd:   0
>>>> CPU   13: hi:    0, btch:   1 usd:   0
>>>> CPU   14: hi:    0, btch:   1 usd:   0
>>>> CPU   15: hi:    0, btch:   1 usd:   0
>>>> CPU   16: hi:    0, btch:   1 usd:   0
>>>> CPU   17: hi:    0, btch:   1 usd:   0
>>>> CPU   18: hi:    0, btch:   1 usd:   0
>>>> CPU   19: hi:    0, btch:   1 usd:   0
>>>> CPU   20: hi:    0, btch:   1 usd:   0
>>>> CPU   21: hi:    0, btch:   1 usd:   0
>>>> CPU   22: hi:    0, btch:   1 usd:   0
>>>> CPU   23: hi:    0, btch:   1 usd:   0
>>>> CPU   24: hi:    0, btch:   1 usd:   0
>>>> CPU   25: hi:    0, btch:   1 usd:   0
>>>> CPU   26: hi:    0, btch:   1 usd:   0
>>>> CPU   27: hi:    0, btch:   1 usd:   0
>>>> CPU   28: hi:    0, btch:   1 usd:   0
>>>> CPU   29: hi:    0, btch:   1 usd:   0
>>>> CPU   30: hi:    0, btch:   1 usd:   0
>>>> CPU   31: hi:    0, btch:   1 usd:   0
>>>> CPU   32: hi:    0, btch:   1 usd:   0
>>>> CPU   33: hi:    0, btch:   1 usd:   0
>>>> CPU   34: hi:    0, btch:   1 usd:   0
>>>> CPU   35: hi:    0, btch:   1 usd:   0
>>>> CPU   36: hi:    0, btch:   1 usd:   0
>>>> CPU   37: hi:    0, btch:   1 usd:   0
>>>> CPU   38: hi:    0, btch:   1 usd:   0
>>>> CPU   39: hi:    0, btch:   1 usd:   0
>>>> Node 0 DMA32 per-cpu:
>>>> CPU    0: hi:  186, btch:  31 usd: 177
>>>> CPU    1: hi:  186, btch:  31 usd: 129
>>>> CPU    2: hi:  186, btch:  31 usd: 145
>>>> CPU    3: hi:  186, btch:  31 usd:  60
>>>> CPU    4: hi:  186, btch:  31 usd: 105
>>>> CPU    5: hi:  186, btch:  31 usd:  96
>>>> CPU    6: hi:  186, btch:  31 usd:  71
>>>> CPU    7: hi:  186, btch:  31 usd:  53
>>>> CPU    8: hi:  186, btch:  31 usd: 121
>>>> CPU    9: hi:  186, btch:  31 usd: 171
>>>> CPU   10: hi:  186, btch:  31 usd:  77
>>>> CPU   11: hi:  186, btch:  31 usd:   0
>>>> CPU   12: hi:  186, btch:  31 usd:   0
>>>> CPU   13: hi:  186, btch:  31 usd:   0
>>>> CPU   14: hi:  186, btch:  31 usd:  29
>>>> CPU   15: hi:  186, btch:  31 usd:  36
>>>> CPU   16: hi:  186, btch:  31 usd:   0
>>>> CPU   17: hi:  186, btch:  31 usd:   0
>>>> CPU   18: hi:  186, btch:  31 usd:   0
>>>> CPU   19: hi:  186, btch:  31 usd:  22
>>>> CPU   20: hi:  186, btch:  31 usd:  76
>>>> CPU   21: hi:  186, btch:  31 usd: 161
>>>> CPU   22: hi:  186, btch:  31 usd:  73
>>>> CPU   23: hi:  186, btch:  31 usd:  78
>>>> CPU   24: hi:  186, btch:  31 usd: 125
>>>> CPU   25: hi:  186, btch:  31 usd:  73
>>>> CPU   26: hi:  186, btch:  31 usd:  72
>>>> CPU   27: hi:  186, btch:  31 usd:  38
>>>> CPU   28: hi:  186, btch:  31 usd: 118
>>>> CPU   29: hi:  186, btch:  31 usd:  63
>>>> CPU   30: hi:  186, btch:  31 usd:   0
>>>> CPU   31: hi:  186, btch:  31 usd:   0
>>>> CPU   32: hi:  186, btch:  31 usd:   0
>>>> CPU   33: hi:  186, btch:  31 usd:   0
>>>> CPU   34: hi:  186, btch:  31 usd:   0
>>>> CPU   35: hi:  186, btch:  31 usd:   0
>>>> CPU   36: hi:  186, btch:  31 usd:   0
>>>> CPU   37: hi:  186, btch:  31 usd:   0
>>>> CPU   38: hi:  186, btch:  31 usd:   0
>>>> CPU   39: hi:  186, btch:  31 usd:   0
>>>> Node 0 Normal per-cpu:
>>>> CPU    0: hi:  186, btch:  31 usd: 188
>>>> CPU    1: hi:  186, btch:  31 usd: 174
>>>> CPU    2: hi:  186, btch:  31 usd: 116
>>>> CPU    3: hi:  186, btch:  31 usd:  96
>>>> CPU    4: hi:  186, btch:  31 usd: 169
>>>> CPU    5: hi:  186, btch:  31 usd: 137
>>>> CPU    6: hi:  186, btch:  31 usd: 102
>>>> CPU    7: hi:  186, btch:  31 usd:  96
>>>> CPU    8: hi:  186, btch:  31 usd: 149
>>>> CPU    9: hi:  186, btch:  31 usd: 203
>>>> CPU   10: hi:  186, btch:  31 usd: 179
>>>> CPU   11: hi:  186, btch:  31 usd: 181
>>>> CPU   12: hi:  186, btch:  31 usd: 159
>>>> CPU   13: hi:  186, btch:  31 usd: 160
>>>> CPU   14: hi:  186, btch:  31 usd: 179
>>>> CPU   15: hi:  186, btch:  31 usd: 158
>>>> CPU   16: hi:  186, btch:  31 usd: 180
>>>> CPU   17: hi:  186, btch:  31 usd: 155
>>>> CPU   18: hi:  186, btch:  31 usd: 156
>>>> CPU   19: hi:  186, btch:  31 usd: 173
>>>> CPU   20: hi:  186, btch:  31 usd: 161
>>>> CPU   21: hi:  186, btch:  31 usd: 163
>>>> CPU   22: hi:  186, btch:  31 usd: 123
>>>> CPU   23: hi:  186, btch:  31 usd: 157
>>>> CPU   24: hi:  186, btch:  31 usd:  19
>>>> CPU   25: hi:  186, btch:  31 usd: 159
>>>> CPU   26: hi:  186, btch:  31 usd: 176
>>>> CPU   27: hi:  186, btch:  31 usd: 140
>>>> CPU   28: hi:  186, btch:  31 usd: 180
>>>> CPU   29: hi:  186, btch:  31 usd: 169
>>>> CPU   30: hi:  186, btch:  31 usd: 158
>>>> CPU   31: hi:  186, btch:  31 usd: 181
>>>> CPU   32: hi:  186, btch:  31 usd: 181
>>>> CPU   33: hi:  186, btch:  31 usd: 159
>>>> CPU   34: hi:  186, btch:  31 usd: 168
>>>> CPU   35: hi:  186, btch:  31 usd: 173
>>>> CPU   36: hi:  186, btch:  31 usd: 161
>>>> CPU   37: hi:  186, btch:  31 usd: 179
>>>> CPU   38: hi:  186, btch:  31 usd: 184
>>>> CPU   39: hi:  186, btch:  31 usd: 182
>>>> Node 1 Normal per-cpu:
>>>> CPU    0: hi:  186, btch:  31 usd: 164
>>>> CPU    1: hi:  186, btch:  31 usd: 167
>>>> CPU    2: hi:  186, btch:  31 usd: 160
>>>> CPU    3: hi:  186, btch:  31 usd: 171
>>>> CPU    4: hi:  186, btch:  31 usd: 175
>>>> CPU    5: hi:  186, btch:  31 usd: 184
>>>> CPU    6: hi:  186, btch:  31 usd: 169
>>>> CPU    7: hi:  186, btch:  31 usd: 159
>>>> CPU    8: hi:  186, btch:  31 usd: 171
>>>> CPU    9: hi:  186, btch:  31 usd: 185
>>>> CPU   10: hi:  186, btch:  31 usd:  86
>>>> CPU   11: hi:  186, btch:  31 usd:  73
>>>> CPU   12: hi:  186, btch:  31 usd:  29
>>>> CPU   13: hi:  186, btch:  31 usd:  62
>>>> CPU   14: hi:  186, btch:  31 usd:  91
>>>> CPU   15: hi:  186, btch:  31 usd: 114
>>>> CPU   16: hi:  186, btch:  31 usd:  69
>>>> CPU   17: hi:  186, btch:  31 usd: 117
>>>> CPU   18: hi:  186, btch:  31 usd:  17
>>>> CPU   19: hi:  186, btch:  31 usd:  20
>>>> CPU   20: hi:  186, btch:  31 usd: 163
>>>> CPU   21: hi:  186, btch:  31 usd: 161
>>>> CPU   22: hi:  186, btch:  31 usd: 175
>>>> CPU   23: hi:  186, btch:  31 usd: 172
>>>> CPU   24: hi:  186, btch:  31 usd: 177
>>>> CPU   25: hi:  186, btch:  31 usd: 171
>>>> CPU   26: hi:  186, btch:  31 usd: 157
>>>> CPU   27: hi:  186, btch:  31 usd: 171
>>>> CPU   28: hi:  186, btch:  31 usd: 182
>>>> CPU   29: hi:  186, btch:  31 usd: 183
>>>> CPU   30: hi:  186, btch:  31 usd:  40
>>>> CPU   31: hi:  186, btch:  31 usd: 157
>>>> CPU   32: hi:  186, btch:  31 usd:  76
>>>> CPU   33: hi:  186, btch:  31 usd: 157
>>>> CPU   34: hi:  186, btch:  31 usd:  63
>>>> CPU   35: hi:  186, btch:  31 usd:  62
>>>> CPU   36: hi:  186, btch:  31 usd:  58
>>>> CPU   37: hi:  186, btch:  31 usd:  36
>>>> CPU   38: hi:  186, btch:  31 usd:  19
>>>> CPU   39: hi:  186, btch:  31 usd:  47
>>>> active_anon:6762239 inactive_anon:12245334 isolated_anon:0
>>>>  active_file:10705451 inactive_file:11321321 isolated_file:0
>>>>  unevictable:5516901 dirty:457441 writeback:0 unstable:0
>>>>  free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537
>>>>  mapped:1567688 shmem:904717 pagetables:362061 bounce:0
>>>>  free_cma:0
>>>> Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>> lowmem_reserve[]: 0 1668 128642 128642
>>>> Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>> lowmem_reserve[]: 0 0 126973 126973
>>>> Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>> lowmem_reserve[]: 0 0 0 0
>>>> Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
>>>> lowmem_reserve[]: 0 0 0 0
>>>> Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB
>>>> Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB
>>>> Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB
>>>> Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB
>>>> 23967743 total pagecache pages
>>>> 1034858 pages in swap cache
>>>> Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552
>>>> Free swap  = 16464476kB
>>>> Total swap = 41943036kB
>>>> 67100065 pages RAM
>>>> 0 pages HighMem/MovableOnly
>>>> 524961 pages reserved

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-08-10 16:32 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-29 13:33 Yet another ext4/jbd2 lockup Nikolay Borisov
2015-08-03 15:15 ` Jan Kara
     [not found]   ` <55C4616A.10100@kyup.com>
2015-08-10 13:58     ` Jan Kara
2015-08-10 15:45       ` Marian Marinov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox