* Yet another ext4/jbd2 lockup @ 2015-07-29 13:33 Nikolay Borisov 2015-08-03 15:15 ` Jan Kara 0 siblings, 1 reply; 4+ messages in thread From: Nikolay Borisov @ 2015-07-29 13:33 UTC (permalink / raw) To: linux-ext4; +Cc: Theodore Ts'o, Marian Marinov Hello, Again, I have come up against a rather annoying ext4/jbd2 lock and I'm not sure it's the same issue as the last one I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). Again, all of this is happening on an 3.12.28 kernel. I did grep for the ext4 commits that have come afterwards but none of them allude to fixing something similar. This time the blocked process is an rsync which has been running OUTSIDE of a container e.g. in the / cgroup but it was copying stuff from a dm-thin volume acting as the root device for a cgroup: rsync D ffffffff81622100 0 33158 1 0x00000004 ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640 ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640 ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280 Call Trace: [<ffffffff81573f24>] schedule+0x24/0x70 [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660 [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30 [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150 [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0 [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0 [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330 [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40 [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400 [<ffffffff8115de05>] ? generic_permission+0xc5/0x230 [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100 [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100 [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20 [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b Sampling echo "w" > /proc/sysrq-trigger I found some other blocked process which might be related to the issue. In particular: jbd2/dm-24-8 D 0000000000000000 0 31263 2 0x00000000 ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640 ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640 ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000 Call Trace: [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70 [<ffffffff81573f24>] schedule+0x24/0x70 [<ffffffff81573ff7>] io_schedule+0x87/0xd0 [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10 [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0 [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170 [<ffffffff810f4fb2>] __lock_page+0x62/0x70 [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40 [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30 [<ffffffff810fffe1>] write_cache_pages+0x391/0x470 [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60 [<ffffffff8110010c>] generic_writepages+0x4c/0x70 [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790 [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70 [<ffffffff81235d12>] kjournald2+0xc2/0x250 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 [<ffffffff81235c50>] ? commit_timeout+0x10/0x10 [<ffffffff81097456>] kthread+0xc6/0xd0 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 kworker/u81:90 D 0000000000000000 0 36695 2 0x00000000 Workqueue: writeback bdi_writeback_workfn (flush-253:24) ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640 ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640 ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080 Call Trace: [<ffffffff81573f24>] schedule+0x24/0x70 [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130 [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70 [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0 [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0 [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50 [<ffffffff8110014b>] do_writepages+0x1b/0x30 [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130 [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0 [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0 [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0 [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0 [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0 [<ffffffff810a4312>] ? wake_up_process+0x22/0x40 [<ffffffff8109078b>] process_one_work+0x13b/0x460 [<ffffffff81091a09>] worker_thread+0x119/0x3f0 [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0 [<ffffffff81097456>] kthread+0xc6/0xd0 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 Basically dm-24 is the device this rsync was syncing from and the kworker is running as part of writeback for the same device (253:24 is the major:minor for this particular device). Furthermore, I see there have been a number of "sync" executed and their stack traces look like the following: sync D ffffffff81622100 0 17596 17532 0x00000000 ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640 ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640 ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0 Call Trace: [<ffffffff81573f24>] schedule+0x24/0x70 [<ffffffff81571955>] schedule_timeout+0x185/0x1c0 [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150 [<ffffffff81574984>] wait_for_completion+0x94/0x100 [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0 [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0 [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0 [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20 [<ffffffff81157b29>] iterate_supers+0xa9/0xe0 [<ffffffff8117e880>] sys_sync+0x30/0x90 [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b I believe it is this sync which has triggered the bdi_writeback_workfn for the particular dm device. Essentially what I gather is that the rsync is locked since it wants to do some filesystem activity, hence it wants to start a transaction but it's blocked due to an unfinished transaction - presumably the one from the writeback path. However, the writeback is blocked in jbd2_log_wait_commit, waiting for its own transaction to complete and it is not progressing due to some reason. There hasn't been any OOM messages for the container in question and the global OOM killer hasn't triggered. The memory state of the machine when this was observed. There still some memory: SysRq : Show Memory Mem-Info: Node 0 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 CPU 1: hi: 0, btch: 1 usd: 0 CPU 2: hi: 0, btch: 1 usd: 0 CPU 3: hi: 0, btch: 1 usd: 0 CPU 4: hi: 0, btch: 1 usd: 0 CPU 5: hi: 0, btch: 1 usd: 0 CPU 6: hi: 0, btch: 1 usd: 0 CPU 7: hi: 0, btch: 1 usd: 0 CPU 8: hi: 0, btch: 1 usd: 0 CPU 9: hi: 0, btch: 1 usd: 0 CPU 10: hi: 0, btch: 1 usd: 0 CPU 11: hi: 0, btch: 1 usd: 0 CPU 12: hi: 0, btch: 1 usd: 0 CPU 13: hi: 0, btch: 1 usd: 0 CPU 14: hi: 0, btch: 1 usd: 0 CPU 15: hi: 0, btch: 1 usd: 0 CPU 16: hi: 0, btch: 1 usd: 0 CPU 17: hi: 0, btch: 1 usd: 0 CPU 18: hi: 0, btch: 1 usd: 0 CPU 19: hi: 0, btch: 1 usd: 0 CPU 20: hi: 0, btch: 1 usd: 0 CPU 21: hi: 0, btch: 1 usd: 0 CPU 22: hi: 0, btch: 1 usd: 0 CPU 23: hi: 0, btch: 1 usd: 0 CPU 24: hi: 0, btch: 1 usd: 0 CPU 25: hi: 0, btch: 1 usd: 0 CPU 26: hi: 0, btch: 1 usd: 0 CPU 27: hi: 0, btch: 1 usd: 0 CPU 28: hi: 0, btch: 1 usd: 0 CPU 29: hi: 0, btch: 1 usd: 0 CPU 30: hi: 0, btch: 1 usd: 0 CPU 31: hi: 0, btch: 1 usd: 0 CPU 32: hi: 0, btch: 1 usd: 0 CPU 33: hi: 0, btch: 1 usd: 0 CPU 34: hi: 0, btch: 1 usd: 0 CPU 35: hi: 0, btch: 1 usd: 0 CPU 36: hi: 0, btch: 1 usd: 0 CPU 37: hi: 0, btch: 1 usd: 0 CPU 38: hi: 0, btch: 1 usd: 0 CPU 39: hi: 0, btch: 1 usd: 0 Node 0 DMA32 per-cpu: CPU 0: hi: 186, btch: 31 usd: 177 CPU 1: hi: 186, btch: 31 usd: 129 CPU 2: hi: 186, btch: 31 usd: 145 CPU 3: hi: 186, btch: 31 usd: 60 CPU 4: hi: 186, btch: 31 usd: 105 CPU 5: hi: 186, btch: 31 usd: 96 CPU 6: hi: 186, btch: 31 usd: 71 CPU 7: hi: 186, btch: 31 usd: 53 CPU 8: hi: 186, btch: 31 usd: 121 CPU 9: hi: 186, btch: 31 usd: 171 CPU 10: hi: 186, btch: 31 usd: 77 CPU 11: hi: 186, btch: 31 usd: 0 CPU 12: hi: 186, btch: 31 usd: 0 CPU 13: hi: 186, btch: 31 usd: 0 CPU 14: hi: 186, btch: 31 usd: 29 CPU 15: hi: 186, btch: 31 usd: 36 CPU 16: hi: 186, btch: 31 usd: 0 CPU 17: hi: 186, btch: 31 usd: 0 CPU 18: hi: 186, btch: 31 usd: 0 CPU 19: hi: 186, btch: 31 usd: 22 CPU 20: hi: 186, btch: 31 usd: 76 CPU 21: hi: 186, btch: 31 usd: 161 CPU 22: hi: 186, btch: 31 usd: 73 CPU 23: hi: 186, btch: 31 usd: 78 CPU 24: hi: 186, btch: 31 usd: 125 CPU 25: hi: 186, btch: 31 usd: 73 CPU 26: hi: 186, btch: 31 usd: 72 CPU 27: hi: 186, btch: 31 usd: 38 CPU 28: hi: 186, btch: 31 usd: 118 CPU 29: hi: 186, btch: 31 usd: 63 CPU 30: hi: 186, btch: 31 usd: 0 CPU 31: hi: 186, btch: 31 usd: 0 CPU 32: hi: 186, btch: 31 usd: 0 CPU 33: hi: 186, btch: 31 usd: 0 CPU 34: hi: 186, btch: 31 usd: 0 CPU 35: hi: 186, btch: 31 usd: 0 CPU 36: hi: 186, btch: 31 usd: 0 CPU 37: hi: 186, btch: 31 usd: 0 CPU 38: hi: 186, btch: 31 usd: 0 CPU 39: hi: 186, btch: 31 usd: 0 Node 0 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 188 CPU 1: hi: 186, btch: 31 usd: 174 CPU 2: hi: 186, btch: 31 usd: 116 CPU 3: hi: 186, btch: 31 usd: 96 CPU 4: hi: 186, btch: 31 usd: 169 CPU 5: hi: 186, btch: 31 usd: 137 CPU 6: hi: 186, btch: 31 usd: 102 CPU 7: hi: 186, btch: 31 usd: 96 CPU 8: hi: 186, btch: 31 usd: 149 CPU 9: hi: 186, btch: 31 usd: 203 CPU 10: hi: 186, btch: 31 usd: 179 CPU 11: hi: 186, btch: 31 usd: 181 CPU 12: hi: 186, btch: 31 usd: 159 CPU 13: hi: 186, btch: 31 usd: 160 CPU 14: hi: 186, btch: 31 usd: 179 CPU 15: hi: 186, btch: 31 usd: 158 CPU 16: hi: 186, btch: 31 usd: 180 CPU 17: hi: 186, btch: 31 usd: 155 CPU 18: hi: 186, btch: 31 usd: 156 CPU 19: hi: 186, btch: 31 usd: 173 CPU 20: hi: 186, btch: 31 usd: 161 CPU 21: hi: 186, btch: 31 usd: 163 CPU 22: hi: 186, btch: 31 usd: 123 CPU 23: hi: 186, btch: 31 usd: 157 CPU 24: hi: 186, btch: 31 usd: 19 CPU 25: hi: 186, btch: 31 usd: 159 CPU 26: hi: 186, btch: 31 usd: 176 CPU 27: hi: 186, btch: 31 usd: 140 CPU 28: hi: 186, btch: 31 usd: 180 CPU 29: hi: 186, btch: 31 usd: 169 CPU 30: hi: 186, btch: 31 usd: 158 CPU 31: hi: 186, btch: 31 usd: 181 CPU 32: hi: 186, btch: 31 usd: 181 CPU 33: hi: 186, btch: 31 usd: 159 CPU 34: hi: 186, btch: 31 usd: 168 CPU 35: hi: 186, btch: 31 usd: 173 CPU 36: hi: 186, btch: 31 usd: 161 CPU 37: hi: 186, btch: 31 usd: 179 CPU 38: hi: 186, btch: 31 usd: 184 CPU 39: hi: 186, btch: 31 usd: 182 Node 1 Normal per-cpu: CPU 0: hi: 186, btch: 31 usd: 164 CPU 1: hi: 186, btch: 31 usd: 167 CPU 2: hi: 186, btch: 31 usd: 160 CPU 3: hi: 186, btch: 31 usd: 171 CPU 4: hi: 186, btch: 31 usd: 175 CPU 5: hi: 186, btch: 31 usd: 184 CPU 6: hi: 186, btch: 31 usd: 169 CPU 7: hi: 186, btch: 31 usd: 159 CPU 8: hi: 186, btch: 31 usd: 171 CPU 9: hi: 186, btch: 31 usd: 185 CPU 10: hi: 186, btch: 31 usd: 86 CPU 11: hi: 186, btch: 31 usd: 73 CPU 12: hi: 186, btch: 31 usd: 29 CPU 13: hi: 186, btch: 31 usd: 62 CPU 14: hi: 186, btch: 31 usd: 91 CPU 15: hi: 186, btch: 31 usd: 114 CPU 16: hi: 186, btch: 31 usd: 69 CPU 17: hi: 186, btch: 31 usd: 117 CPU 18: hi: 186, btch: 31 usd: 17 CPU 19: hi: 186, btch: 31 usd: 20 CPU 20: hi: 186, btch: 31 usd: 163 CPU 21: hi: 186, btch: 31 usd: 161 CPU 22: hi: 186, btch: 31 usd: 175 CPU 23: hi: 186, btch: 31 usd: 172 CPU 24: hi: 186, btch: 31 usd: 177 CPU 25: hi: 186, btch: 31 usd: 171 CPU 26: hi: 186, btch: 31 usd: 157 CPU 27: hi: 186, btch: 31 usd: 171 CPU 28: hi: 186, btch: 31 usd: 182 CPU 29: hi: 186, btch: 31 usd: 183 CPU 30: hi: 186, btch: 31 usd: 40 CPU 31: hi: 186, btch: 31 usd: 157 CPU 32: hi: 186, btch: 31 usd: 76 CPU 33: hi: 186, btch: 31 usd: 157 CPU 34: hi: 186, btch: 31 usd: 63 CPU 35: hi: 186, btch: 31 usd: 62 CPU 36: hi: 186, btch: 31 usd: 58 CPU 37: hi: 186, btch: 31 usd: 36 CPU 38: hi: 186, btch: 31 usd: 19 CPU 39: hi: 186, btch: 31 usd: 47 active_anon:6762239 inactive_anon:12245334 isolated_anon:0 active_file:10705451 inactive_file:11321321 isolated_file:0 unevictable:5516901 dirty:457441 writeback:0 unstable:0 free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537 mapped:1567688 shmem:904717 pagetables:362061 bounce:0 free_cma:0 Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes lowmem_reserve[]: 0 1668 128642 128642 Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 126973 126973 Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no lowmem_reserve[]: 0 0 0 0 Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB 23967743 total pagecache pages 1034858 pages in swap cache Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552 Free swap = 16464476kB Total swap = 41943036kB 67100065 pages RAM 0 pages HighMem/MovableOnly 524961 pages reserved ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Yet another ext4/jbd2 lockup 2015-07-29 13:33 Yet another ext4/jbd2 lockup Nikolay Borisov @ 2015-08-03 15:15 ` Jan Kara [not found] ` <55C4616A.10100@kyup.com> 0 siblings, 1 reply; 4+ messages in thread From: Jan Kara @ 2015-08-03 15:15 UTC (permalink / raw) To: Nikolay Borisov; +Cc: linux-ext4, Theodore Ts'o, Marian Marinov Hello, On Wed 29-07-15 16:33:18, Nikolay Borisov wrote: > Again, I have come up against a rather annoying ext4/jbd2 > lock and I'm not sure it's the same issue as the last one > I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). > Again, all of this is happening on an 3.12.28 kernel. I did grep for > the ext4 commits that have come afterwards but none of them allude to > fixing something similar. This time the blocked process is an rsync > which has been running OUTSIDE of a container e.g. in the / cgroup > but it was copying stuff from a dm-thin volume acting as the root > device for a cgroup: > > rsync D ffffffff81622100 0 33158 1 0x00000004 > ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640 > ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640 > ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280 > Call Trace: > [<ffffffff81573f24>] schedule+0x24/0x70 > [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0 > [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660 > [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30 > [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150 > [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0 > [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0 > [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330 > [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40 > [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400 > [<ffffffff8115de05>] ? generic_permission+0xc5/0x230 > [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100 > [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100 > [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20 > [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b So this is waiting to start a new transaction in the journal. > Sampling echo "w" > /proc/sysrq-trigger I found some other You can just attach full output from the above command. That way we can be sure we see all the pieces... > blocked process which might be related to the issue. In particular: > > jbd2/dm-24-8 D 0000000000000000 0 31263 2 0x00000000 > ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640 > ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640 > ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000 > Call Trace: > [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70 > [<ffffffff81573f24>] schedule+0x24/0x70 > [<ffffffff81573ff7>] io_schedule+0x87/0xd0 > [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10 > [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0 > [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170 > [<ffffffff810f4fb2>] __lock_page+0x62/0x70 > [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40 > [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30 > [<ffffffff810fffe1>] write_cache_pages+0x391/0x470 > [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60 > [<ffffffff8110010c>] generic_writepages+0x4c/0x70 > [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790 > [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70 > [<ffffffff81235d12>] kjournald2+0xc2/0x250 > [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > [<ffffffff81235c50>] ? commit_timeout+0x10/0x10 > [<ffffffff81097456>] kthread+0xc6/0xd0 > [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 > [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 So jbd2 thread is waiting for page lock. Interesting... > kworker/u81:90 D 0000000000000000 0 36695 2 0x00000000 > Workqueue: writeback bdi_writeback_workfn (flush-253:24) > ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640 > ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640 > ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080 > Call Trace: > [<ffffffff81573f24>] schedule+0x24/0x70 > [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130 > [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70 > [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0 > [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0 > [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50 > [<ffffffff8110014b>] do_writepages+0x1b/0x30 > [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130 > [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0 > [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0 > [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0 > [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0 > [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0 > [<ffffffff810a4312>] ? wake_up_process+0x22/0x40 > [<ffffffff8109078b>] process_one_work+0x13b/0x460 > [<ffffffff81091a09>] worker_thread+0x119/0x3f0 > [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0 > [<ffffffff81097456>] kthread+0xc6/0xd0 > [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 > [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 And this is likely the culprit - flusher thread is waiting for journal commit from jbd2_journal_stop() at which point it may be holding page locks which are blocking jbd2 thread. Now waiting from transaction commit from jbd2_journal_stop() is really unexpected. That happens only if the handle was a sync one and the only way I can see handle becoming a sync one from a writeback path is when this would be the first file over 4 GB in the filesystem which looks really unlikely... Can you post output of dumpe2fs for the filesystem? Thanks! Honza > Basically dm-24 is the device this rsync was syncing from and the > kworker is running as part of writeback for the same device > (253:24 is the major:minor for this particular device). > Furthermore, I see there have been a number of "sync" executed and > their stack traces look like the following: > > sync D ffffffff81622100 0 17596 17532 0x00000000 > ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640 > ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640 > ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0 > Call Trace: > [<ffffffff81573f24>] schedule+0x24/0x70 > [<ffffffff81571955>] schedule_timeout+0x185/0x1c0 > [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150 > [<ffffffff81574984>] wait_for_completion+0x94/0x100 > [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0 > [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0 > [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 > [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0 > [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 > [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20 > [<ffffffff81157b29>] iterate_supers+0xa9/0xe0 > [<ffffffff8117e880>] sys_sync+0x30/0x90 > [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b > > I believe it is this sync which has triggered the bdi_writeback_workfn > for the particular dm device. Essentially what I gather is that the > rsync is locked since it wants to do some filesystem activity, hence > it wants to start a transaction but it's blocked due to an unfinished > transaction - presumably the one from the writeback path. However, > the writeback is blocked in jbd2_log_wait_commit, waiting for its > own transaction to complete and it is not progressing due to some > reason. There hasn't been any OOM messages for the container in > question and the global OOM killer hasn't triggered. The memory > state of the machine when this was observed. There still some > memory: > > SysRq : Show Memory > Mem-Info: > Node 0 DMA per-cpu: > CPU 0: hi: 0, btch: 1 usd: 0 > CPU 1: hi: 0, btch: 1 usd: 0 > CPU 2: hi: 0, btch: 1 usd: 0 > CPU 3: hi: 0, btch: 1 usd: 0 > CPU 4: hi: 0, btch: 1 usd: 0 > CPU 5: hi: 0, btch: 1 usd: 0 > CPU 6: hi: 0, btch: 1 usd: 0 > CPU 7: hi: 0, btch: 1 usd: 0 > CPU 8: hi: 0, btch: 1 usd: 0 > CPU 9: hi: 0, btch: 1 usd: 0 > CPU 10: hi: 0, btch: 1 usd: 0 > CPU 11: hi: 0, btch: 1 usd: 0 > CPU 12: hi: 0, btch: 1 usd: 0 > CPU 13: hi: 0, btch: 1 usd: 0 > CPU 14: hi: 0, btch: 1 usd: 0 > CPU 15: hi: 0, btch: 1 usd: 0 > CPU 16: hi: 0, btch: 1 usd: 0 > CPU 17: hi: 0, btch: 1 usd: 0 > CPU 18: hi: 0, btch: 1 usd: 0 > CPU 19: hi: 0, btch: 1 usd: 0 > CPU 20: hi: 0, btch: 1 usd: 0 > CPU 21: hi: 0, btch: 1 usd: 0 > CPU 22: hi: 0, btch: 1 usd: 0 > CPU 23: hi: 0, btch: 1 usd: 0 > CPU 24: hi: 0, btch: 1 usd: 0 > CPU 25: hi: 0, btch: 1 usd: 0 > CPU 26: hi: 0, btch: 1 usd: 0 > CPU 27: hi: 0, btch: 1 usd: 0 > CPU 28: hi: 0, btch: 1 usd: 0 > CPU 29: hi: 0, btch: 1 usd: 0 > CPU 30: hi: 0, btch: 1 usd: 0 > CPU 31: hi: 0, btch: 1 usd: 0 > CPU 32: hi: 0, btch: 1 usd: 0 > CPU 33: hi: 0, btch: 1 usd: 0 > CPU 34: hi: 0, btch: 1 usd: 0 > CPU 35: hi: 0, btch: 1 usd: 0 > CPU 36: hi: 0, btch: 1 usd: 0 > CPU 37: hi: 0, btch: 1 usd: 0 > CPU 38: hi: 0, btch: 1 usd: 0 > CPU 39: hi: 0, btch: 1 usd: 0 > Node 0 DMA32 per-cpu: > CPU 0: hi: 186, btch: 31 usd: 177 > CPU 1: hi: 186, btch: 31 usd: 129 > CPU 2: hi: 186, btch: 31 usd: 145 > CPU 3: hi: 186, btch: 31 usd: 60 > CPU 4: hi: 186, btch: 31 usd: 105 > CPU 5: hi: 186, btch: 31 usd: 96 > CPU 6: hi: 186, btch: 31 usd: 71 > CPU 7: hi: 186, btch: 31 usd: 53 > CPU 8: hi: 186, btch: 31 usd: 121 > CPU 9: hi: 186, btch: 31 usd: 171 > CPU 10: hi: 186, btch: 31 usd: 77 > CPU 11: hi: 186, btch: 31 usd: 0 > CPU 12: hi: 186, btch: 31 usd: 0 > CPU 13: hi: 186, btch: 31 usd: 0 > CPU 14: hi: 186, btch: 31 usd: 29 > CPU 15: hi: 186, btch: 31 usd: 36 > CPU 16: hi: 186, btch: 31 usd: 0 > CPU 17: hi: 186, btch: 31 usd: 0 > CPU 18: hi: 186, btch: 31 usd: 0 > CPU 19: hi: 186, btch: 31 usd: 22 > CPU 20: hi: 186, btch: 31 usd: 76 > CPU 21: hi: 186, btch: 31 usd: 161 > CPU 22: hi: 186, btch: 31 usd: 73 > CPU 23: hi: 186, btch: 31 usd: 78 > CPU 24: hi: 186, btch: 31 usd: 125 > CPU 25: hi: 186, btch: 31 usd: 73 > CPU 26: hi: 186, btch: 31 usd: 72 > CPU 27: hi: 186, btch: 31 usd: 38 > CPU 28: hi: 186, btch: 31 usd: 118 > CPU 29: hi: 186, btch: 31 usd: 63 > CPU 30: hi: 186, btch: 31 usd: 0 > CPU 31: hi: 186, btch: 31 usd: 0 > CPU 32: hi: 186, btch: 31 usd: 0 > CPU 33: hi: 186, btch: 31 usd: 0 > CPU 34: hi: 186, btch: 31 usd: 0 > CPU 35: hi: 186, btch: 31 usd: 0 > CPU 36: hi: 186, btch: 31 usd: 0 > CPU 37: hi: 186, btch: 31 usd: 0 > CPU 38: hi: 186, btch: 31 usd: 0 > CPU 39: hi: 186, btch: 31 usd: 0 > Node 0 Normal per-cpu: > CPU 0: hi: 186, btch: 31 usd: 188 > CPU 1: hi: 186, btch: 31 usd: 174 > CPU 2: hi: 186, btch: 31 usd: 116 > CPU 3: hi: 186, btch: 31 usd: 96 > CPU 4: hi: 186, btch: 31 usd: 169 > CPU 5: hi: 186, btch: 31 usd: 137 > CPU 6: hi: 186, btch: 31 usd: 102 > CPU 7: hi: 186, btch: 31 usd: 96 > CPU 8: hi: 186, btch: 31 usd: 149 > CPU 9: hi: 186, btch: 31 usd: 203 > CPU 10: hi: 186, btch: 31 usd: 179 > CPU 11: hi: 186, btch: 31 usd: 181 > CPU 12: hi: 186, btch: 31 usd: 159 > CPU 13: hi: 186, btch: 31 usd: 160 > CPU 14: hi: 186, btch: 31 usd: 179 > CPU 15: hi: 186, btch: 31 usd: 158 > CPU 16: hi: 186, btch: 31 usd: 180 > CPU 17: hi: 186, btch: 31 usd: 155 > CPU 18: hi: 186, btch: 31 usd: 156 > CPU 19: hi: 186, btch: 31 usd: 173 > CPU 20: hi: 186, btch: 31 usd: 161 > CPU 21: hi: 186, btch: 31 usd: 163 > CPU 22: hi: 186, btch: 31 usd: 123 > CPU 23: hi: 186, btch: 31 usd: 157 > CPU 24: hi: 186, btch: 31 usd: 19 > CPU 25: hi: 186, btch: 31 usd: 159 > CPU 26: hi: 186, btch: 31 usd: 176 > CPU 27: hi: 186, btch: 31 usd: 140 > CPU 28: hi: 186, btch: 31 usd: 180 > CPU 29: hi: 186, btch: 31 usd: 169 > CPU 30: hi: 186, btch: 31 usd: 158 > CPU 31: hi: 186, btch: 31 usd: 181 > CPU 32: hi: 186, btch: 31 usd: 181 > CPU 33: hi: 186, btch: 31 usd: 159 > CPU 34: hi: 186, btch: 31 usd: 168 > CPU 35: hi: 186, btch: 31 usd: 173 > CPU 36: hi: 186, btch: 31 usd: 161 > CPU 37: hi: 186, btch: 31 usd: 179 > CPU 38: hi: 186, btch: 31 usd: 184 > CPU 39: hi: 186, btch: 31 usd: 182 > Node 1 Normal per-cpu: > CPU 0: hi: 186, btch: 31 usd: 164 > CPU 1: hi: 186, btch: 31 usd: 167 > CPU 2: hi: 186, btch: 31 usd: 160 > CPU 3: hi: 186, btch: 31 usd: 171 > CPU 4: hi: 186, btch: 31 usd: 175 > CPU 5: hi: 186, btch: 31 usd: 184 > CPU 6: hi: 186, btch: 31 usd: 169 > CPU 7: hi: 186, btch: 31 usd: 159 > CPU 8: hi: 186, btch: 31 usd: 171 > CPU 9: hi: 186, btch: 31 usd: 185 > CPU 10: hi: 186, btch: 31 usd: 86 > CPU 11: hi: 186, btch: 31 usd: 73 > CPU 12: hi: 186, btch: 31 usd: 29 > CPU 13: hi: 186, btch: 31 usd: 62 > CPU 14: hi: 186, btch: 31 usd: 91 > CPU 15: hi: 186, btch: 31 usd: 114 > CPU 16: hi: 186, btch: 31 usd: 69 > CPU 17: hi: 186, btch: 31 usd: 117 > CPU 18: hi: 186, btch: 31 usd: 17 > CPU 19: hi: 186, btch: 31 usd: 20 > CPU 20: hi: 186, btch: 31 usd: 163 > CPU 21: hi: 186, btch: 31 usd: 161 > CPU 22: hi: 186, btch: 31 usd: 175 > CPU 23: hi: 186, btch: 31 usd: 172 > CPU 24: hi: 186, btch: 31 usd: 177 > CPU 25: hi: 186, btch: 31 usd: 171 > CPU 26: hi: 186, btch: 31 usd: 157 > CPU 27: hi: 186, btch: 31 usd: 171 > CPU 28: hi: 186, btch: 31 usd: 182 > CPU 29: hi: 186, btch: 31 usd: 183 > CPU 30: hi: 186, btch: 31 usd: 40 > CPU 31: hi: 186, btch: 31 usd: 157 > CPU 32: hi: 186, btch: 31 usd: 76 > CPU 33: hi: 186, btch: 31 usd: 157 > CPU 34: hi: 186, btch: 31 usd: 63 > CPU 35: hi: 186, btch: 31 usd: 62 > CPU 36: hi: 186, btch: 31 usd: 58 > CPU 37: hi: 186, btch: 31 usd: 36 > CPU 38: hi: 186, btch: 31 usd: 19 > CPU 39: hi: 186, btch: 31 usd: 47 > active_anon:6762239 inactive_anon:12245334 isolated_anon:0 > active_file:10705451 inactive_file:11321321 isolated_file:0 > unevictable:5516901 dirty:457441 writeback:0 unstable:0 > free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537 > mapped:1567688 shmem:904717 pagetables:362061 bounce:0 > free_cma:0 > Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes > lowmem_reserve[]: 0 1668 128642 128642 > Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > lowmem_reserve[]: 0 0 126973 126973 > Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > lowmem_reserve[]: 0 0 0 0 > Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > lowmem_reserve[]: 0 0 0 0 > Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB > Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB > Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB > Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB > 23967743 total pagecache pages > 1034858 pages in swap cache > Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552 > Free swap = 16464476kB > Total swap = 41943036kB > 67100065 pages RAM > 0 pages HighMem/MovableOnly > 524961 pages reserved > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <55C4616A.10100@kyup.com>]
* Re: Yet another ext4/jbd2 lockup [not found] ` <55C4616A.10100@kyup.com> @ 2015-08-10 13:58 ` Jan Kara 2015-08-10 15:45 ` Marian Marinov 0 siblings, 1 reply; 4+ messages in thread From: Jan Kara @ 2015-08-10 13:58 UTC (permalink / raw) To: Nikolay Borisov; +Cc: Jan Kara, linux-ext4, Theodore Ts'o, Marian Marinov [-- Attachment #1: Type: text/plain, Size: 19514 bytes --] Hello, On Fri 07-08-15 10:42:34, Nikolay Borisov wrote: > I have attached an XZ archive of the dumpe2fs command. Thanks. So I don't see anything suspicious in the dumpe2fs. I've looked again at the code and I still don't see how handle can be marked as sync in ext4_writepages(). I wrote attached patch to dump stack in case this happens. The dumping doesn't trigger for me. Can you reproduce the problem? Can you run a kernel with the attached patch applied and report stack dumps appearing in the kernel log? Honza > On 08/03/2015 06:15 PM, Jan Kara wrote: > > Hello, > > > > On Wed 29-07-15 16:33:18, Nikolay Borisov wrote: > >> Again, I have come up against a rather annoying ext4/jbd2 > >> lock and I'm not sure it's the same issue as the last one > >> I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). > >> Again, all of this is happening on an 3.12.28 kernel. I did grep for > >> the ext4 commits that have come afterwards but none of them allude to > >> fixing something similar. This time the blocked process is an rsync > >> which has been running OUTSIDE of a container e.g. in the / cgroup > >> but it was copying stuff from a dm-thin volume acting as the root > >> device for a cgroup: > >> > >> rsync D ffffffff81622100 0 33158 1 0x00000004 > >> ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640 > >> ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640 > >> ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280 > >> Call Trace: > >> [<ffffffff81573f24>] schedule+0x24/0x70 > >> [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0 > >> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > >> [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660 > >> [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30 > >> [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150 > >> [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0 > >> [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0 > >> [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330 > >> [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40 > >> [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400 > >> [<ffffffff8115de05>] ? generic_permission+0xc5/0x230 > >> [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100 > >> [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100 > >> [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20 > >> [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b > > > > So this is waiting to start a new transaction in the journal. > > > >> Sampling echo "w" > /proc/sysrq-trigger I found some other > > > > You can just attach full output from the above command. That way we can be > > sure we see all the pieces... > > > >> blocked process which might be related to the issue. In particular: > >> > >> jbd2/dm-24-8 D 0000000000000000 0 31263 2 0x00000000 > >> ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640 > >> ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640 > >> ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000 > >> Call Trace: > >> [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70 > >> [<ffffffff81573f24>] schedule+0x24/0x70 > >> [<ffffffff81573ff7>] io_schedule+0x87/0xd0 > >> [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10 > >> [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0 > >> [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170 > >> [<ffffffff810f4fb2>] __lock_page+0x62/0x70 > >> [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40 > >> [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30 > >> [<ffffffff810fffe1>] write_cache_pages+0x391/0x470 > >> [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60 > >> [<ffffffff8110010c>] generic_writepages+0x4c/0x70 > >> [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790 > >> [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70 > >> [<ffffffff81235d12>] kjournald2+0xc2/0x250 > >> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > >> [<ffffffff81235c50>] ? commit_timeout+0x10/0x10 > >> [<ffffffff81097456>] kthread+0xc6/0xd0 > >> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > >> [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 > >> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > > > > So jbd2 thread is waiting for page lock. Interesting... > > > >> kworker/u81:90 D 0000000000000000 0 36695 2 0x00000000 > >> Workqueue: writeback bdi_writeback_workfn (flush-253:24) > >> ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640 > >> ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640 > >> ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080 > >> Call Trace: > >> [<ffffffff81573f24>] schedule+0x24/0x70 > >> [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130 > >> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 > >> [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70 > >> [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0 > >> [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0 > >> [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50 > >> [<ffffffff8110014b>] do_writepages+0x1b/0x30 > >> [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130 > >> [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0 > >> [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0 > >> [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0 > >> [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0 > >> [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0 > >> [<ffffffff810a4312>] ? wake_up_process+0x22/0x40 > >> [<ffffffff8109078b>] process_one_work+0x13b/0x460 > >> [<ffffffff81091a09>] worker_thread+0x119/0x3f0 > >> [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0 > >> [<ffffffff81097456>] kthread+0xc6/0xd0 > >> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > >> [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 > >> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 > > > > And this is likely the culprit - flusher thread is waiting for journal > > commit from jbd2_journal_stop() at which point it may be holding page locks > > which are blocking jbd2 thread. Now waiting from transaction commit from > > jbd2_journal_stop() is really unexpected. That happens only if the > > handle was a sync one and the only way I can see handle becoming a sync one > > from a writeback path is when this would be the first file over 4 GB in the > > filesystem which looks really unlikely... Can you post output of dumpe2fs > > for the filesystem? Thanks! > > > > Honza > > > >> Basically dm-24 is the device this rsync was syncing from and the > >> kworker is running as part of writeback for the same device > >> (253:24 is the major:minor for this particular device). > >> Furthermore, I see there have been a number of "sync" executed and > >> their stack traces look like the following: > >> > >> sync D ffffffff81622100 0 17596 17532 0x00000000 > >> ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640 > >> ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640 > >> ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0 > >> Call Trace: > >> [<ffffffff81573f24>] schedule+0x24/0x70 > >> [<ffffffff81571955>] schedule_timeout+0x185/0x1c0 > >> [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150 > >> [<ffffffff81574984>] wait_for_completion+0x94/0x100 > >> [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0 > >> [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0 > >> [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 > >> [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0 > >> [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 > >> [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20 > >> [<ffffffff81157b29>] iterate_supers+0xa9/0xe0 > >> [<ffffffff8117e880>] sys_sync+0x30/0x90 > >> [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b > >> > >> I believe it is this sync which has triggered the bdi_writeback_workfn > >> for the particular dm device. Essentially what I gather is that the > >> rsync is locked since it wants to do some filesystem activity, hence > >> it wants to start a transaction but it's blocked due to an unfinished > >> transaction - presumably the one from the writeback path. However, > >> the writeback is blocked in jbd2_log_wait_commit, waiting for its > >> own transaction to complete and it is not progressing due to some > >> reason. There hasn't been any OOM messages for the container in > >> question and the global OOM killer hasn't triggered. The memory > >> state of the machine when this was observed. There still some > >> memory: > >> > >> SysRq : Show Memory > >> Mem-Info: > >> Node 0 DMA per-cpu: > >> CPU 0: hi: 0, btch: 1 usd: 0 > >> CPU 1: hi: 0, btch: 1 usd: 0 > >> CPU 2: hi: 0, btch: 1 usd: 0 > >> CPU 3: hi: 0, btch: 1 usd: 0 > >> CPU 4: hi: 0, btch: 1 usd: 0 > >> CPU 5: hi: 0, btch: 1 usd: 0 > >> CPU 6: hi: 0, btch: 1 usd: 0 > >> CPU 7: hi: 0, btch: 1 usd: 0 > >> CPU 8: hi: 0, btch: 1 usd: 0 > >> CPU 9: hi: 0, btch: 1 usd: 0 > >> CPU 10: hi: 0, btch: 1 usd: 0 > >> CPU 11: hi: 0, btch: 1 usd: 0 > >> CPU 12: hi: 0, btch: 1 usd: 0 > >> CPU 13: hi: 0, btch: 1 usd: 0 > >> CPU 14: hi: 0, btch: 1 usd: 0 > >> CPU 15: hi: 0, btch: 1 usd: 0 > >> CPU 16: hi: 0, btch: 1 usd: 0 > >> CPU 17: hi: 0, btch: 1 usd: 0 > >> CPU 18: hi: 0, btch: 1 usd: 0 > >> CPU 19: hi: 0, btch: 1 usd: 0 > >> CPU 20: hi: 0, btch: 1 usd: 0 > >> CPU 21: hi: 0, btch: 1 usd: 0 > >> CPU 22: hi: 0, btch: 1 usd: 0 > >> CPU 23: hi: 0, btch: 1 usd: 0 > >> CPU 24: hi: 0, btch: 1 usd: 0 > >> CPU 25: hi: 0, btch: 1 usd: 0 > >> CPU 26: hi: 0, btch: 1 usd: 0 > >> CPU 27: hi: 0, btch: 1 usd: 0 > >> CPU 28: hi: 0, btch: 1 usd: 0 > >> CPU 29: hi: 0, btch: 1 usd: 0 > >> CPU 30: hi: 0, btch: 1 usd: 0 > >> CPU 31: hi: 0, btch: 1 usd: 0 > >> CPU 32: hi: 0, btch: 1 usd: 0 > >> CPU 33: hi: 0, btch: 1 usd: 0 > >> CPU 34: hi: 0, btch: 1 usd: 0 > >> CPU 35: hi: 0, btch: 1 usd: 0 > >> CPU 36: hi: 0, btch: 1 usd: 0 > >> CPU 37: hi: 0, btch: 1 usd: 0 > >> CPU 38: hi: 0, btch: 1 usd: 0 > >> CPU 39: hi: 0, btch: 1 usd: 0 > >> Node 0 DMA32 per-cpu: > >> CPU 0: hi: 186, btch: 31 usd: 177 > >> CPU 1: hi: 186, btch: 31 usd: 129 > >> CPU 2: hi: 186, btch: 31 usd: 145 > >> CPU 3: hi: 186, btch: 31 usd: 60 > >> CPU 4: hi: 186, btch: 31 usd: 105 > >> CPU 5: hi: 186, btch: 31 usd: 96 > >> CPU 6: hi: 186, btch: 31 usd: 71 > >> CPU 7: hi: 186, btch: 31 usd: 53 > >> CPU 8: hi: 186, btch: 31 usd: 121 > >> CPU 9: hi: 186, btch: 31 usd: 171 > >> CPU 10: hi: 186, btch: 31 usd: 77 > >> CPU 11: hi: 186, btch: 31 usd: 0 > >> CPU 12: hi: 186, btch: 31 usd: 0 > >> CPU 13: hi: 186, btch: 31 usd: 0 > >> CPU 14: hi: 186, btch: 31 usd: 29 > >> CPU 15: hi: 186, btch: 31 usd: 36 > >> CPU 16: hi: 186, btch: 31 usd: 0 > >> CPU 17: hi: 186, btch: 31 usd: 0 > >> CPU 18: hi: 186, btch: 31 usd: 0 > >> CPU 19: hi: 186, btch: 31 usd: 22 > >> CPU 20: hi: 186, btch: 31 usd: 76 > >> CPU 21: hi: 186, btch: 31 usd: 161 > >> CPU 22: hi: 186, btch: 31 usd: 73 > >> CPU 23: hi: 186, btch: 31 usd: 78 > >> CPU 24: hi: 186, btch: 31 usd: 125 > >> CPU 25: hi: 186, btch: 31 usd: 73 > >> CPU 26: hi: 186, btch: 31 usd: 72 > >> CPU 27: hi: 186, btch: 31 usd: 38 > >> CPU 28: hi: 186, btch: 31 usd: 118 > >> CPU 29: hi: 186, btch: 31 usd: 63 > >> CPU 30: hi: 186, btch: 31 usd: 0 > >> CPU 31: hi: 186, btch: 31 usd: 0 > >> CPU 32: hi: 186, btch: 31 usd: 0 > >> CPU 33: hi: 186, btch: 31 usd: 0 > >> CPU 34: hi: 186, btch: 31 usd: 0 > >> CPU 35: hi: 186, btch: 31 usd: 0 > >> CPU 36: hi: 186, btch: 31 usd: 0 > >> CPU 37: hi: 186, btch: 31 usd: 0 > >> CPU 38: hi: 186, btch: 31 usd: 0 > >> CPU 39: hi: 186, btch: 31 usd: 0 > >> Node 0 Normal per-cpu: > >> CPU 0: hi: 186, btch: 31 usd: 188 > >> CPU 1: hi: 186, btch: 31 usd: 174 > >> CPU 2: hi: 186, btch: 31 usd: 116 > >> CPU 3: hi: 186, btch: 31 usd: 96 > >> CPU 4: hi: 186, btch: 31 usd: 169 > >> CPU 5: hi: 186, btch: 31 usd: 137 > >> CPU 6: hi: 186, btch: 31 usd: 102 > >> CPU 7: hi: 186, btch: 31 usd: 96 > >> CPU 8: hi: 186, btch: 31 usd: 149 > >> CPU 9: hi: 186, btch: 31 usd: 203 > >> CPU 10: hi: 186, btch: 31 usd: 179 > >> CPU 11: hi: 186, btch: 31 usd: 181 > >> CPU 12: hi: 186, btch: 31 usd: 159 > >> CPU 13: hi: 186, btch: 31 usd: 160 > >> CPU 14: hi: 186, btch: 31 usd: 179 > >> CPU 15: hi: 186, btch: 31 usd: 158 > >> CPU 16: hi: 186, btch: 31 usd: 180 > >> CPU 17: hi: 186, btch: 31 usd: 155 > >> CPU 18: hi: 186, btch: 31 usd: 156 > >> CPU 19: hi: 186, btch: 31 usd: 173 > >> CPU 20: hi: 186, btch: 31 usd: 161 > >> CPU 21: hi: 186, btch: 31 usd: 163 > >> CPU 22: hi: 186, btch: 31 usd: 123 > >> CPU 23: hi: 186, btch: 31 usd: 157 > >> CPU 24: hi: 186, btch: 31 usd: 19 > >> CPU 25: hi: 186, btch: 31 usd: 159 > >> CPU 26: hi: 186, btch: 31 usd: 176 > >> CPU 27: hi: 186, btch: 31 usd: 140 > >> CPU 28: hi: 186, btch: 31 usd: 180 > >> CPU 29: hi: 186, btch: 31 usd: 169 > >> CPU 30: hi: 186, btch: 31 usd: 158 > >> CPU 31: hi: 186, btch: 31 usd: 181 > >> CPU 32: hi: 186, btch: 31 usd: 181 > >> CPU 33: hi: 186, btch: 31 usd: 159 > >> CPU 34: hi: 186, btch: 31 usd: 168 > >> CPU 35: hi: 186, btch: 31 usd: 173 > >> CPU 36: hi: 186, btch: 31 usd: 161 > >> CPU 37: hi: 186, btch: 31 usd: 179 > >> CPU 38: hi: 186, btch: 31 usd: 184 > >> CPU 39: hi: 186, btch: 31 usd: 182 > >> Node 1 Normal per-cpu: > >> CPU 0: hi: 186, btch: 31 usd: 164 > >> CPU 1: hi: 186, btch: 31 usd: 167 > >> CPU 2: hi: 186, btch: 31 usd: 160 > >> CPU 3: hi: 186, btch: 31 usd: 171 > >> CPU 4: hi: 186, btch: 31 usd: 175 > >> CPU 5: hi: 186, btch: 31 usd: 184 > >> CPU 6: hi: 186, btch: 31 usd: 169 > >> CPU 7: hi: 186, btch: 31 usd: 159 > >> CPU 8: hi: 186, btch: 31 usd: 171 > >> CPU 9: hi: 186, btch: 31 usd: 185 > >> CPU 10: hi: 186, btch: 31 usd: 86 > >> CPU 11: hi: 186, btch: 31 usd: 73 > >> CPU 12: hi: 186, btch: 31 usd: 29 > >> CPU 13: hi: 186, btch: 31 usd: 62 > >> CPU 14: hi: 186, btch: 31 usd: 91 > >> CPU 15: hi: 186, btch: 31 usd: 114 > >> CPU 16: hi: 186, btch: 31 usd: 69 > >> CPU 17: hi: 186, btch: 31 usd: 117 > >> CPU 18: hi: 186, btch: 31 usd: 17 > >> CPU 19: hi: 186, btch: 31 usd: 20 > >> CPU 20: hi: 186, btch: 31 usd: 163 > >> CPU 21: hi: 186, btch: 31 usd: 161 > >> CPU 22: hi: 186, btch: 31 usd: 175 > >> CPU 23: hi: 186, btch: 31 usd: 172 > >> CPU 24: hi: 186, btch: 31 usd: 177 > >> CPU 25: hi: 186, btch: 31 usd: 171 > >> CPU 26: hi: 186, btch: 31 usd: 157 > >> CPU 27: hi: 186, btch: 31 usd: 171 > >> CPU 28: hi: 186, btch: 31 usd: 182 > >> CPU 29: hi: 186, btch: 31 usd: 183 > >> CPU 30: hi: 186, btch: 31 usd: 40 > >> CPU 31: hi: 186, btch: 31 usd: 157 > >> CPU 32: hi: 186, btch: 31 usd: 76 > >> CPU 33: hi: 186, btch: 31 usd: 157 > >> CPU 34: hi: 186, btch: 31 usd: 63 > >> CPU 35: hi: 186, btch: 31 usd: 62 > >> CPU 36: hi: 186, btch: 31 usd: 58 > >> CPU 37: hi: 186, btch: 31 usd: 36 > >> CPU 38: hi: 186, btch: 31 usd: 19 > >> CPU 39: hi: 186, btch: 31 usd: 47 > >> active_anon:6762239 inactive_anon:12245334 isolated_anon:0 > >> active_file:10705451 inactive_file:11321321 isolated_file:0 > >> unevictable:5516901 dirty:457441 writeback:0 unstable:0 > >> free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537 > >> mapped:1567688 shmem:904717 pagetables:362061 bounce:0 > >> free_cma:0 > >> Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes > >> lowmem_reserve[]: 0 1668 128642 128642 > >> Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > >> lowmem_reserve[]: 0 0 126973 126973 > >> Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > >> lowmem_reserve[]: 0 0 0 0 > >> Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no > >> lowmem_reserve[]: 0 0 0 0 > >> Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB > >> Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB > >> Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB > >> Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB > >> 23967743 total pagecache pages > >> 1034858 pages in swap cache > >> Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552 > >> Free swap = 16464476kB > >> Total swap = 41943036kB > >> 67100065 pages RAM > >> 0 pages HighMem/MovableOnly > >> 524961 pages reserved > >> -- > >> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > >> the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- Jan Kara <jack@suse.com> SUSE Labs, CR [-- Attachment #2: 0001-jbd2-Debug-sync-handle-from-writeback.patch --] [-- Type: text/x-patch, Size: 905 bytes --] >From 1d98bc7c3d2940c25ac2ca0cb0a3f6101a942671 Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@suse.com> Date: Mon, 10 Aug 2015 11:06:11 +0200 Subject: [PATCH] jbd2: Debug sync handle from writeback Signed-off-by: Jan Kara <jack@suse.com> --- fs/ext4/ext4_jbd2.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h index 9c5b49fb281e..0a63b49caf4e 100644 --- a/fs/ext4/ext4_jbd2.h +++ b/fs/ext4/ext4_jbd2.h @@ -278,8 +278,13 @@ static inline int ext4_handle_valid(handle_t *handle) static inline void ext4_handle_sync(handle_t *handle) { - if (ext4_handle_valid(handle)) + if (ext4_handle_valid(handle)) { handle->h_sync = 1; + if (handle->h_type == EXT4_HT_WRITE_PAGE) { + printk(KERN_ERR "ext4: Writeback handle is sync!\n"); + dump_stack(); + } + } } static inline int ext4_handle_is_aborted(handle_t *handle) -- 2.1.4 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Yet another ext4/jbd2 lockup 2015-08-10 13:58 ` Jan Kara @ 2015-08-10 15:45 ` Marian Marinov 0 siblings, 0 replies; 4+ messages in thread From: Marian Marinov @ 2015-08-10 15:45 UTC (permalink / raw) To: Jan Kara; +Cc: linux-ext4, Theodore Ts'o, Marian Marinov On 08/10/2015 04:58 PM, Jan Kara wrote: > Hello, > > On Fri 07-08-15 10:42:34, Nikolay Borisov wrote: >> I have attached an XZ archive of the dumpe2fs command. > > Thanks. So I don't see anything suspicious in the dumpe2fs. I've looked > again at the code and I still don't see how handle can be marked as sync in > ext4_writepages(). I wrote attached patch to dump stack in case this > happens. The dumping doesn't trigger for me. Can you reproduce the problem? > Can you run a kernel with the attached patch applied and report stack dumps > appearing in the kernel log? > > Honza Thank you for the patch. We will deploy it on one of our servers and look for the issues. Our next maintenance window will be next week and we will merge it with out 4.1 kernel branch. Unfortunately it is very hard to reproduce this in our test lab. We have been trying to reproduce it the past two weeks. We are running fuzzers and stress tests inside containers and waiting to see the same issue. Marian > >> On 08/03/2015 06:15 PM, Jan Kara wrote: >>> Hello, >>> >>> On Wed 29-07-15 16:33:18, Nikolay Borisov wrote: >>>> Again, I have come up against a rather annoying ext4/jbd2 >>>> lock and I'm not sure it's the same issue as the last one >>>> I posted (http://marc.info/?l=linux-ext4&m=143522748527527&w=2). >>>> Again, all of this is happening on an 3.12.28 kernel. I did grep for >>>> the ext4 commits that have come afterwards but none of them allude to >>>> fixing something similar. This time the blocked process is an rsync >>>> which has been running OUTSIDE of a container e.g. in the / cgroup >>>> but it was copying stuff from a dm-thin volume acting as the root >>>> device for a cgroup: >>>> >>>> rsync D ffffffff81622100 0 33158 1 0x00000004 >>>> ffff882ad6257b18 0000000000000086 ffff883cfb3e5280 0000000000011640 >>>> ffff882ad6257fd8 ffff882ad6256010 0000000000011640 0000000000011640 >>>> ffff882ad6257fd8 0000000000011640 ffff881fd3213180 ffff883cfb3e5280 >>>> Call Trace: >>>> [<ffffffff81573f24>] schedule+0x24/0x70 >>>> [<ffffffff8122f64c>] wait_transaction_locked+0x7c/0xb0 >>>> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 >>>> [<ffffffff8122faf2>] start_this_handle+0x2c2/0x660 >>>> [<ffffffff8116a8fd>] ? d_rehash+0x1d/0x30 >>>> [<ffffffff8113b7f5>] ? kmem_cache_alloc+0xb5/0x150 >>>> [<ffffffff81230229>] jbd2__journal_start+0x109/0x1b0 >>>> [<ffffffff8120fe31>] __ext4_journal_start_sb+0x61/0xb0 >>>> [<ffffffff811ebbe7>] __ext4_new_inode+0x5b7/0x1330 >>>> [<ffffffff8115db8e>] ? __lookup_hash+0x1e/0x40 >>>> [<ffffffff811fb7ec>] ext4_mkdir+0x18c/0x400 >>>> [<ffffffff8115de05>] ? generic_permission+0xc5/0x230 >>>> [<ffffffff8115f018>] vfs_mkdir+0xb8/0x100 >>>> [<ffffffff811614e2>] SyS_mkdirat+0xb2/0x100 >>>> [<ffffffff8115dc44>] SyS_mkdir+0x14/0x20 >>>> [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b >>> >>> So this is waiting to start a new transaction in the journal. >>> >>>> Sampling echo "w" > /proc/sysrq-trigger I found some other >>> >>> You can just attach full output from the above command. That way we can be >>> sure we see all the pieces... >>> >>>> blocked process which might be related to the issue. In particular: >>>> >>>> jbd2/dm-24-8 D 0000000000000000 0 31263 2 0x00000000 >>>> ffff882af46df9f8 0000000000000046 ffff882643db0000 0000000000011640 >>>> ffff882af46dffd8 ffff882af46de010 0000000000011640 0000000000011640 >>>> ffff882af46dffd8 0000000000011640 ffff8827c519ca40 ffff882643db0000 >>>> Call Trace: >>>> [<ffffffff810f4fc0>] ? __lock_page+0x70/0x70 >>>> [<ffffffff81573f24>] schedule+0x24/0x70 >>>> [<ffffffff81573ff7>] io_schedule+0x87/0xd0 >>>> [<ffffffff810f4fc9>] sleep_on_page+0x9/0x10 >>>> [<ffffffff81571a45>] __wait_on_bit_lock+0x55/0xb0 >>>> [<ffffffff810f4b0a>] ? find_get_pages_tag+0xca/0x170 >>>> [<ffffffff810f4fb2>] __lock_page+0x62/0x70 >>>> [<ffffffff81097af0>] ? wake_atomic_t_function+0x40/0x40 >>>> [<ffffffff81101c30>] ? pagevec_lookup_tag+0x20/0x30 >>>> [<ffffffff810fffe1>] write_cache_pages+0x391/0x470 >>>> [<ffffffff810fec20>] ? set_page_dirty+0x60/0x60 >>>> [<ffffffff8110010c>] generic_writepages+0x4c/0x70 >>>> [<ffffffff81231b0b>] jbd2_journal_commit_transaction+0x41b/0x1790 >>>> [<ffffffff81083be7>] ? lock_timer_base+0x37/0x70 >>>> [<ffffffff81235d12>] kjournald2+0xc2/0x250 >>>> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 >>>> [<ffffffff81235c50>] ? commit_timeout+0x10/0x10 >>>> [<ffffffff81097456>] kthread+0xc6/0xd0 >>>> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 >>>> [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 >>>> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 >>> >>> So jbd2 thread is waiting for page lock. Interesting... >>> >>>> kworker/u81:90 D 0000000000000000 0 36695 2 0x00000000 >>>> Workqueue: writeback bdi_writeback_workfn (flush-253:24) >>>> ffff880155ee7888 0000000000000046 ffff881ea1a71080 0000000000011640 >>>> ffff880155ee7fd8 ffff880155ee6010 0000000000011640 0000000000011640 >>>> ffff880155ee7fd8 0000000000011640 ffff882119215ac0 ffff881ea1a71080 >>>> Call Trace: >>>> [<ffffffff81573f24>] schedule+0x24/0x70 >>>> [<ffffffff81235bc5>] jbd2_log_wait_commit+0xb5/0x130 >>>> [<ffffffff81097a70>] ? wake_up_bit+0x40/0x40 >>>> [<ffffffff810a04ce>] ? __wake_up+0x4e/0x70 >>>> [<ffffffff8122f43b>] jbd2_journal_stop+0x1bb/0x2c0 >>>> [<ffffffff8120fda8>] __ext4_journal_stop+0x78/0xa0 >>>> [<ffffffff811f366f>] ext4_writepages+0x45f/0xa50 >>>> [<ffffffff8110014b>] do_writepages+0x1b/0x30 >>>> [<ffffffff811794cd>] __writeback_single_inode+0x3d/0x130 >>>> [<ffffffff8117a637>] writeback_sb_inodes+0x207/0x3f0 >>>> [<ffffffff8117a8b6>] __writeback_inodes_wb+0x96/0xc0 >>>> [<ffffffff8117aabb>] wb_writeback+0x1db/0x1f0 >>>> [<ffffffff8117ac61>] wb_do_writeback+0x191/0x1a0 >>>> [<ffffffff8117ace8>] bdi_writeback_workfn+0x78/0x1b0 >>>> [<ffffffff810a4312>] ? wake_up_process+0x22/0x40 >>>> [<ffffffff8109078b>] process_one_work+0x13b/0x460 >>>> [<ffffffff81091a09>] worker_thread+0x119/0x3f0 >>>> [<ffffffff810918f0>] ? manage_workers+0x2c0/0x2c0 >>>> [<ffffffff81097456>] kthread+0xc6/0xd0 >>>> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 >>>> [<ffffffff815762fc>] ret_from_fork+0x7c/0xb0 >>>> [<ffffffff81097390>] ? kthread_freezable_should_stop+0x60/0x60 >>> >>> And this is likely the culprit - flusher thread is waiting for journal >>> commit from jbd2_journal_stop() at which point it may be holding page locks >>> which are blocking jbd2 thread. Now waiting from transaction commit from >>> jbd2_journal_stop() is really unexpected. That happens only if the >>> handle was a sync one and the only way I can see handle becoming a sync one >>> from a writeback path is when this would be the first file over 4 GB in the >>> filesystem which looks really unlikely... Can you post output of dumpe2fs >>> for the filesystem? Thanks! >>> >>> Honza >>> >>>> Basically dm-24 is the device this rsync was syncing from and the >>>> kworker is running as part of writeback for the same device >>>> (253:24 is the major:minor for this particular device). >>>> Furthermore, I see there have been a number of "sync" executed and >>>> their stack traces look like the following: >>>> >>>> sync D ffffffff81622100 0 17596 17532 0x00000000 >>>> ffff8827cf1fdd78 0000000000000086 ffff88348291b9c0 0000000000011640 >>>> ffff8827cf1fdfd8 ffff8827cf1fc010 0000000000011640 0000000000011640 >>>> ffff8827cf1fdfd8 0000000000011640 ffff881fd32118c0 ffff88348291b9c0 >>>> Call Trace: >>>> [<ffffffff81573f24>] schedule+0x24/0x70 >>>> [<ffffffff81571955>] schedule_timeout+0x185/0x1c0 >>>> [<ffffffff810904c2>] ? __queue_delayed_work+0x82/0x150 >>>> [<ffffffff81574984>] wait_for_completion+0x94/0x100 >>>> [<ffffffff810a42d0>] ? try_to_wake_up+0x2b0/0x2b0 >>>> [<ffffffff81179a28>] ? bdi_queue_work+0x48/0xb0 >>>> [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 >>>> [<ffffffff81179c56>] sync_inodes_sb+0xa6/0x1c0 >>>> [<ffffffff8117e830>] ? fdatawrite_one_bdev+0x20/0x20 >>>> [<ffffffff8117e845>] sync_inodes_one_sb+0x15/0x20 >>>> [<ffffffff81157b29>] iterate_supers+0xa9/0xe0 >>>> [<ffffffff8117e880>] sys_sync+0x30/0x90 >>>> [<ffffffff815763a2>] system_call_fastpath+0x16/0x1b >>>> >>>> I believe it is this sync which has triggered the bdi_writeback_workfn >>>> for the particular dm device. Essentially what I gather is that the >>>> rsync is locked since it wants to do some filesystem activity, hence >>>> it wants to start a transaction but it's blocked due to an unfinished >>>> transaction - presumably the one from the writeback path. However, >>>> the writeback is blocked in jbd2_log_wait_commit, waiting for its >>>> own transaction to complete and it is not progressing due to some >>>> reason. There hasn't been any OOM messages for the container in >>>> question and the global OOM killer hasn't triggered. The memory >>>> state of the machine when this was observed. There still some >>>> memory: >>>> >>>> SysRq : Show Memory >>>> Mem-Info: >>>> Node 0 DMA per-cpu: >>>> CPU 0: hi: 0, btch: 1 usd: 0 >>>> CPU 1: hi: 0, btch: 1 usd: 0 >>>> CPU 2: hi: 0, btch: 1 usd: 0 >>>> CPU 3: hi: 0, btch: 1 usd: 0 >>>> CPU 4: hi: 0, btch: 1 usd: 0 >>>> CPU 5: hi: 0, btch: 1 usd: 0 >>>> CPU 6: hi: 0, btch: 1 usd: 0 >>>> CPU 7: hi: 0, btch: 1 usd: 0 >>>> CPU 8: hi: 0, btch: 1 usd: 0 >>>> CPU 9: hi: 0, btch: 1 usd: 0 >>>> CPU 10: hi: 0, btch: 1 usd: 0 >>>> CPU 11: hi: 0, btch: 1 usd: 0 >>>> CPU 12: hi: 0, btch: 1 usd: 0 >>>> CPU 13: hi: 0, btch: 1 usd: 0 >>>> CPU 14: hi: 0, btch: 1 usd: 0 >>>> CPU 15: hi: 0, btch: 1 usd: 0 >>>> CPU 16: hi: 0, btch: 1 usd: 0 >>>> CPU 17: hi: 0, btch: 1 usd: 0 >>>> CPU 18: hi: 0, btch: 1 usd: 0 >>>> CPU 19: hi: 0, btch: 1 usd: 0 >>>> CPU 20: hi: 0, btch: 1 usd: 0 >>>> CPU 21: hi: 0, btch: 1 usd: 0 >>>> CPU 22: hi: 0, btch: 1 usd: 0 >>>> CPU 23: hi: 0, btch: 1 usd: 0 >>>> CPU 24: hi: 0, btch: 1 usd: 0 >>>> CPU 25: hi: 0, btch: 1 usd: 0 >>>> CPU 26: hi: 0, btch: 1 usd: 0 >>>> CPU 27: hi: 0, btch: 1 usd: 0 >>>> CPU 28: hi: 0, btch: 1 usd: 0 >>>> CPU 29: hi: 0, btch: 1 usd: 0 >>>> CPU 30: hi: 0, btch: 1 usd: 0 >>>> CPU 31: hi: 0, btch: 1 usd: 0 >>>> CPU 32: hi: 0, btch: 1 usd: 0 >>>> CPU 33: hi: 0, btch: 1 usd: 0 >>>> CPU 34: hi: 0, btch: 1 usd: 0 >>>> CPU 35: hi: 0, btch: 1 usd: 0 >>>> CPU 36: hi: 0, btch: 1 usd: 0 >>>> CPU 37: hi: 0, btch: 1 usd: 0 >>>> CPU 38: hi: 0, btch: 1 usd: 0 >>>> CPU 39: hi: 0, btch: 1 usd: 0 >>>> Node 0 DMA32 per-cpu: >>>> CPU 0: hi: 186, btch: 31 usd: 177 >>>> CPU 1: hi: 186, btch: 31 usd: 129 >>>> CPU 2: hi: 186, btch: 31 usd: 145 >>>> CPU 3: hi: 186, btch: 31 usd: 60 >>>> CPU 4: hi: 186, btch: 31 usd: 105 >>>> CPU 5: hi: 186, btch: 31 usd: 96 >>>> CPU 6: hi: 186, btch: 31 usd: 71 >>>> CPU 7: hi: 186, btch: 31 usd: 53 >>>> CPU 8: hi: 186, btch: 31 usd: 121 >>>> CPU 9: hi: 186, btch: 31 usd: 171 >>>> CPU 10: hi: 186, btch: 31 usd: 77 >>>> CPU 11: hi: 186, btch: 31 usd: 0 >>>> CPU 12: hi: 186, btch: 31 usd: 0 >>>> CPU 13: hi: 186, btch: 31 usd: 0 >>>> CPU 14: hi: 186, btch: 31 usd: 29 >>>> CPU 15: hi: 186, btch: 31 usd: 36 >>>> CPU 16: hi: 186, btch: 31 usd: 0 >>>> CPU 17: hi: 186, btch: 31 usd: 0 >>>> CPU 18: hi: 186, btch: 31 usd: 0 >>>> CPU 19: hi: 186, btch: 31 usd: 22 >>>> CPU 20: hi: 186, btch: 31 usd: 76 >>>> CPU 21: hi: 186, btch: 31 usd: 161 >>>> CPU 22: hi: 186, btch: 31 usd: 73 >>>> CPU 23: hi: 186, btch: 31 usd: 78 >>>> CPU 24: hi: 186, btch: 31 usd: 125 >>>> CPU 25: hi: 186, btch: 31 usd: 73 >>>> CPU 26: hi: 186, btch: 31 usd: 72 >>>> CPU 27: hi: 186, btch: 31 usd: 38 >>>> CPU 28: hi: 186, btch: 31 usd: 118 >>>> CPU 29: hi: 186, btch: 31 usd: 63 >>>> CPU 30: hi: 186, btch: 31 usd: 0 >>>> CPU 31: hi: 186, btch: 31 usd: 0 >>>> CPU 32: hi: 186, btch: 31 usd: 0 >>>> CPU 33: hi: 186, btch: 31 usd: 0 >>>> CPU 34: hi: 186, btch: 31 usd: 0 >>>> CPU 35: hi: 186, btch: 31 usd: 0 >>>> CPU 36: hi: 186, btch: 31 usd: 0 >>>> CPU 37: hi: 186, btch: 31 usd: 0 >>>> CPU 38: hi: 186, btch: 31 usd: 0 >>>> CPU 39: hi: 186, btch: 31 usd: 0 >>>> Node 0 Normal per-cpu: >>>> CPU 0: hi: 186, btch: 31 usd: 188 >>>> CPU 1: hi: 186, btch: 31 usd: 174 >>>> CPU 2: hi: 186, btch: 31 usd: 116 >>>> CPU 3: hi: 186, btch: 31 usd: 96 >>>> CPU 4: hi: 186, btch: 31 usd: 169 >>>> CPU 5: hi: 186, btch: 31 usd: 137 >>>> CPU 6: hi: 186, btch: 31 usd: 102 >>>> CPU 7: hi: 186, btch: 31 usd: 96 >>>> CPU 8: hi: 186, btch: 31 usd: 149 >>>> CPU 9: hi: 186, btch: 31 usd: 203 >>>> CPU 10: hi: 186, btch: 31 usd: 179 >>>> CPU 11: hi: 186, btch: 31 usd: 181 >>>> CPU 12: hi: 186, btch: 31 usd: 159 >>>> CPU 13: hi: 186, btch: 31 usd: 160 >>>> CPU 14: hi: 186, btch: 31 usd: 179 >>>> CPU 15: hi: 186, btch: 31 usd: 158 >>>> CPU 16: hi: 186, btch: 31 usd: 180 >>>> CPU 17: hi: 186, btch: 31 usd: 155 >>>> CPU 18: hi: 186, btch: 31 usd: 156 >>>> CPU 19: hi: 186, btch: 31 usd: 173 >>>> CPU 20: hi: 186, btch: 31 usd: 161 >>>> CPU 21: hi: 186, btch: 31 usd: 163 >>>> CPU 22: hi: 186, btch: 31 usd: 123 >>>> CPU 23: hi: 186, btch: 31 usd: 157 >>>> CPU 24: hi: 186, btch: 31 usd: 19 >>>> CPU 25: hi: 186, btch: 31 usd: 159 >>>> CPU 26: hi: 186, btch: 31 usd: 176 >>>> CPU 27: hi: 186, btch: 31 usd: 140 >>>> CPU 28: hi: 186, btch: 31 usd: 180 >>>> CPU 29: hi: 186, btch: 31 usd: 169 >>>> CPU 30: hi: 186, btch: 31 usd: 158 >>>> CPU 31: hi: 186, btch: 31 usd: 181 >>>> CPU 32: hi: 186, btch: 31 usd: 181 >>>> CPU 33: hi: 186, btch: 31 usd: 159 >>>> CPU 34: hi: 186, btch: 31 usd: 168 >>>> CPU 35: hi: 186, btch: 31 usd: 173 >>>> CPU 36: hi: 186, btch: 31 usd: 161 >>>> CPU 37: hi: 186, btch: 31 usd: 179 >>>> CPU 38: hi: 186, btch: 31 usd: 184 >>>> CPU 39: hi: 186, btch: 31 usd: 182 >>>> Node 1 Normal per-cpu: >>>> CPU 0: hi: 186, btch: 31 usd: 164 >>>> CPU 1: hi: 186, btch: 31 usd: 167 >>>> CPU 2: hi: 186, btch: 31 usd: 160 >>>> CPU 3: hi: 186, btch: 31 usd: 171 >>>> CPU 4: hi: 186, btch: 31 usd: 175 >>>> CPU 5: hi: 186, btch: 31 usd: 184 >>>> CPU 6: hi: 186, btch: 31 usd: 169 >>>> CPU 7: hi: 186, btch: 31 usd: 159 >>>> CPU 8: hi: 186, btch: 31 usd: 171 >>>> CPU 9: hi: 186, btch: 31 usd: 185 >>>> CPU 10: hi: 186, btch: 31 usd: 86 >>>> CPU 11: hi: 186, btch: 31 usd: 73 >>>> CPU 12: hi: 186, btch: 31 usd: 29 >>>> CPU 13: hi: 186, btch: 31 usd: 62 >>>> CPU 14: hi: 186, btch: 31 usd: 91 >>>> CPU 15: hi: 186, btch: 31 usd: 114 >>>> CPU 16: hi: 186, btch: 31 usd: 69 >>>> CPU 17: hi: 186, btch: 31 usd: 117 >>>> CPU 18: hi: 186, btch: 31 usd: 17 >>>> CPU 19: hi: 186, btch: 31 usd: 20 >>>> CPU 20: hi: 186, btch: 31 usd: 163 >>>> CPU 21: hi: 186, btch: 31 usd: 161 >>>> CPU 22: hi: 186, btch: 31 usd: 175 >>>> CPU 23: hi: 186, btch: 31 usd: 172 >>>> CPU 24: hi: 186, btch: 31 usd: 177 >>>> CPU 25: hi: 186, btch: 31 usd: 171 >>>> CPU 26: hi: 186, btch: 31 usd: 157 >>>> CPU 27: hi: 186, btch: 31 usd: 171 >>>> CPU 28: hi: 186, btch: 31 usd: 182 >>>> CPU 29: hi: 186, btch: 31 usd: 183 >>>> CPU 30: hi: 186, btch: 31 usd: 40 >>>> CPU 31: hi: 186, btch: 31 usd: 157 >>>> CPU 32: hi: 186, btch: 31 usd: 76 >>>> CPU 33: hi: 186, btch: 31 usd: 157 >>>> CPU 34: hi: 186, btch: 31 usd: 63 >>>> CPU 35: hi: 186, btch: 31 usd: 62 >>>> CPU 36: hi: 186, btch: 31 usd: 58 >>>> CPU 37: hi: 186, btch: 31 usd: 36 >>>> CPU 38: hi: 186, btch: 31 usd: 19 >>>> CPU 39: hi: 186, btch: 31 usd: 47 >>>> active_anon:6762239 inactive_anon:12245334 isolated_anon:0 >>>> active_file:10705451 inactive_file:11321321 isolated_file:0 >>>> unevictable:5516901 dirty:457441 writeback:0 unstable:0 >>>> free:1028680 slab_reclaimable:3791349 slab_unreclaimable:13461537 >>>> mapped:1567688 shmem:904717 pagetables:362061 bounce:0 >>>> free_cma:0 >>>> Node 0 DMA free:15532kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes >>>> lowmem_reserve[]: 0 1668 128642 128642 >>>> Node 0 DMA32 free:509204kB min:1164kB low:1452kB high:1744kB active_anon:6016kB inactive_anon:13164kB active_file:2584kB inactive_file:5756kB unevictable:1112kB isolated(anon):0kB isolated(file):0kB present:2045980kB managed:1708904kB mlocked:1224kB dirty:20kB writeback:0kB mapped:4164kB shmem:3900kB slab_reclaimable:463020kB slab_unreclaimable:671072kB kernel_stack:1224kB pagetables:4268kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no >>>> lowmem_reserve[]: 0 0 126973 126973 >>>> Node 0 Normal free:189488kB min:88804kB low:111004kB high:133204kB active_anon:16411204kB inactive_anon:24056744kB active_file:19949588kB inactive_file:22435040kB unevictable:8645768kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:130020968kB mlocked:8657300kB dirty:1773000kB writeback:0kB mapped:2878728kB shmem:1624252kB slab_reclaimable:6834156kB slab_unreclaimable:29014116kB kernel_stack:65736kB pagetables:711944kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no >>>> lowmem_reserve[]: 0 0 0 0 >>>> Node 1 Normal free:3393536kB min:90240kB low:112800kB high:135360kB active_anon:10638288kB inactive_anon:24911428kB active_file:22869632kB inactive_file:22844488kB unevictable:13420724kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132117884kB mlocked:13427472kB dirty:56744kB writeback:0kB mapped:3387860kB shmem:1990716kB slab_reclaimable:7868220kB slab_unreclaimable:24160960kB kernel_stack:90600kB pagetables:731280kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no >>>> lowmem_reserve[]: 0 0 0 0 >>>> Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 1*32kB (U) 0*64kB 1*128kB (U) 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (U) 3*4096kB (MR) = 15532kB >>>> Node 0 DMA32: 612*4kB (UEM) 1081*8kB (UEM) 295*16kB (UEM) 412*32kB (UEM) 288*64kB (UEM) 118*128kB (UEM) 63*256kB (UEM) 49*512kB (UEM) 30*1024kB (UEM) 29*2048kB (UEM) 77*4096kB (UMR) = 509256kB >>>> Node 0 Normal: 44891*4kB (UE) 0*8kB 0*16kB 5*32kB (R) 4*64kB (R) 2*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 186124kB >>>> Node 1 Normal: 467218*4kB (UEM) 145540*8kB (UEM) 18995*16kB (UEM) 1482*32kB (UMR) 5*64kB (R) 1*128kB (R) 2*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 3391128kB >>>> 23967743 total pagecache pages >>>> 1034858 pages in swap cache >>>> Swap cache stats: add 154167590, delete 153132732, find 2698616740/2708213552 >>>> Free swap = 16464476kB >>>> Total swap = 41943036kB >>>> 67100065 pages RAM >>>> 0 pages HighMem/MovableOnly >>>> 524961 pages reserved >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-08-10 16:32 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-29 13:33 Yet another ext4/jbd2 lockup Nikolay Borisov
2015-08-03 15:15 ` Jan Kara
[not found] ` <55C4616A.10100@kyup.com>
2015-08-10 13:58 ` Jan Kara
2015-08-10 15:45 ` Marian Marinov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox