From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoph Hellwig Date: Mon, 26 Jul 2021 16:00:58 +0200 Subject: [Cluster-devel] gfs2 hang in xfstests generic/361 Message-ID: <20210726140058.GA9206@lst.de> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit I noticed this hang while testing the iomap_iter series on gfs2, but it also reproduces on 5.14-rc3. This is running locally with "-O -p lock_nolock": generic/361 files ... [ 1479.222703] run fstests generic/361 at 2021-07-26 13:57:10 [ 1481.188746] gfs2: fsid=vdc: Trying to join cluster "lock_nolock", "vdc" [ 1481.190270] gfs2: fsid=vdc: Now mounting FS (format 1801)... [ 1481.199160] gfs2: fsid=vdc.0: journal 0 mapped with 1 extents in 0ms [ 1481.202347] gfs2: fsid=vdc.0: jid=0, already locked for use [ 1481.203368] gfs2: fsid=vdc.0: jid=0: Looking at journal... [ 1481.269554] gfs2: fsid=vdc.0: jid=0: Journal head lookup took 66ms [ 1481.270730] gfs2: fsid=vdc.0: jid=0: Done [ 1481.271802] gfs2: fsid=vdc.0: first mount done, others may mount [ 1481.371768] loop0: detected capacity change from 0 to 2097152 [ 1482.799919] gfs2: fsid=loop0: Trying to join cluster "lock_nolock", "loop0" [ 1482.803425] gfs2: fsid=loop0: Now mounting FS (format 1801)... [ 1482.815215] gfs2: fsid=loop0.0: journal 0 mapped with 1 extents in 0ms [ 1482.819376] gfs2: fsid=loop0.0: jid=0, already locked for use [ 1482.822181] gfs2: fsid=loop0.0: jid=0: Looking at journal... [ 1483.005770] gfs2: fsid=loop0.0: jid=0: Journal head lookup took 183ms [ 1483.007077] gfs2: fsid=loop0.0: jid=0: Done [ 1483.007898] gfs2: fsid=loop0.0: first mount done, others may mount [ 1491.612990] loop: Write error at byte offset 468144128, length 4096. [ 1491.615612] blk_update_request: I/O error, dev loop0, sector 913152 op 0x1:(WRITE) fl0 [ 1491.617029] loop: Write error at byte offset 468844544, length 4096. [ 1491.623654] blk_update_request: I/O error, dev loop0, sector 915712 op 0x1:(WRITE) fl0 [ 1491.625416] loop: Write error at byte offset 469622784, length 4096. [ 1491.630162] loop0: writeback error on inode 16708, offset 398209024, sector 917224 [ 1491.631682] blk_update_request: I/O error, dev loop0, sector 917232 op 0x1:(WRITE) fl0 [ 1491.633448] loop: Write error at byte offset 470933504, length 4096. [ 1491.641790] blk_update_request: I/O error, dev loop0, sector 919792 op 0x1:(WRITE) fl0 [ 1491.642736] loop: Write error at byte offset 471711744, length 4096. [ 1491.647210] loop0: writeback error on inode 16708, offset 400293888, sector 921304 [ 1491.648540] blk_update_request: I/O error, dev loop0, sector 921312 op 0x1:(WRITE) fl0 [ 1491.649673] loop: Write error at byte offset 473022464, length 4096. [ 1491.655308] blk_update_request: I/O error, dev loop0, sector 923872 op 0x1:(WRITE) fl0 [ 1491.656176] loop: Write error at byte offset 473800704, length 4096. [ 1491.658274] loop0: writeback error on inode 16708, offset 402378752, sector 925384 [ 1491.659257] blk_update_request: I/O error, dev loop0, sector 925392 op 0x1:(WRITE) fl0 [ 1491.660052] loop: Write error at byte offset 475111424, length 4096. [ 1491.663825] blk_update_request: I/O error, dev loop0, sector 927952 op 0x1:(WRITE) fl0 [ 1491.663872] loop: Write error at byte offset 475889664, length 4096. [ 1491.666004] loop0: writeback error on inode 16708, offset 404463616, sector 929464 [ 1491.666880] blk_update_request: I/O error, dev loop0, sector 929472 op 0x1:(WRITE) fl0 [ 1491.667670] loop: Write error at byte offset 477200384, length 4096. [ 1491.671450] blk_update_request: I/O error, dev loop0, sector 932032 op 0x1:(WRITE) fl0 [ 1491.674066] loop0: writeback error on inode 16708, offset 406548480, sector 933544 [ 1491.674488] loop0: writeback error on inode 16708, offset 408633344, sector 937624 [ 1491.676163] loop0: writeback error on inode 16708, offset 410718208, sector 941704 [ 1491.678042] loop0: writeback error on inode 16708, offset 412803072, sector 945784 [ 1491.679677] loop0: writeback error on inode 16708, offset 414887936, sector 949864 [ 1491.681342] loop0: writeback error on inode 16708, offset 416972800, sector 953944 [ 1491.713431] Buffer I/O error on dev loop0, logical block 17192, lost async page write [ 1491.716453] Buffer I/O error on dev loop0, logical block 17219, lost async page write [ 1491.717967] Buffer I/O error on dev loop0, logical block 17729, lost async page write [ 1491.719021] Buffer I/O error on dev loop0, logical block 77399, lost async page write [ 1491.719333] Buffer I/O error on dev loop0, logical block 18239, lost async page write [ 1491.720966] Buffer I/O error on dev loop0, logical block 77933, lost async page write [ 1491.722209] Buffer I/O error on dev loop0, logical block 18749, lost async page write [ 1491.723602] Buffer I/O error on dev loop0, logical block 78443, lost async page write [ 1491.724989] Buffer I/O error on dev loop0, logical block 19259, lost async page write [ 1491.726488] Buffer I/O error on dev loop0, logical block 78953, lost async page write [ 1491.744459] gfs2: fsid=loop0.0: fatal: I/O error [ 1491.744459] block = 17192 [ 1491.744459] function = gfs2_ail1_empty_one, file = fs/gfs2/log.c, line = 323 [ 1491.747491] gfs2: fsid=loop0.0: fatal: I/O error(s) [ 1491.748477] gfs2: fsid=loop0.0: about to withdraw this file system [ 1491.752284] [ 1491.752587] ============================= [ 1491.753403] [ BUG: Invalid wait context ] [ 1491.754122] 5.14.0-rc2+ #47 Not tainted [ 1491.754860] ----------------------------- [ 1491.755563] kworker/2:1H/1975 is trying to lock: [ 1491.756370] ffff8881048d0888 (&wq->mutex){+.+.}-{3:3}, at: flush_workqueue+0xc9/0x5f0 [ 1491.757736] other info that might help us debug this: [ 1491.758622] context-{4:4} [ 1491.759087] 4 locks held by kworker/2:1H/1975: [ 1491.759863] #0: ffff888101717b38 ((wq_completion)glock_workqueue){+.+.}-{0:0}, at: p0 [ 1491.761623] #1: ffffc900040dfe78 ((work_completion)(&(&gl->gl_work)->work)){+.+.}-{00 [ 1491.763528] #2: ffff88811ce6b000 (&sdp->sd_log_flush_lock){++++}-{3:3}, at: gfs2_log0 [ 1491.765284] #3: ffff88811ce6ae28 (&sdp->sd_log_lock){+.+.}-{2:2}, at: gfs2_flush_rev0 [ 1491.767064] stack backtrace: [ 1491.767629] CPU: 2 PID: 1975 Comm: kworker/2:1H Not tainted 5.14.0-rc2+ #47 [ 1491.769000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/04 [ 1491.770641] Workqueue: glock_workqueue glock_work_func [ 1491.771635] Call Trace: [ 1491.772101] dump_stack_lvl+0x45/0x59 [ 1491.772777] __lock_acquire.cold+0x2a2/0x2be [ 1491.773529] ? save_trace+0x3e/0x380 [ 1491.774160] lock_acquire+0xc9/0x2f0 [ 1491.774815] ? flush_workqueue+0xc9/0x5f0 [ 1491.775521] __mutex_lock+0x75/0x870 [ 1491.776151] ? flush_workqueue+0xc9/0x5f0 [ 1491.776856] ? flush_workqueue+0xc9/0x5f0 [ 1491.777560] ? lock_release+0x13c/0x2e0 [ 1491.778234] flush_workqueue+0xc9/0x5f0 [ 1491.779012] gfs2_make_fs_ro+0x2b/0x2b0 [ 1491.779687] gfs2_withdraw.cold+0x16f/0x4bd [ 1491.780424] ? gfs2_freeze_lock+0x24/0x60 [ 1491.781129] gfs2_ail1_empty+0x305/0x310 [ 1491.781821] gfs2_flush_revokes+0x29/0x40 [ 1491.782526] revoke_lo_before_commit+0x12/0x1c0 [ 1491.783324] gfs2_log_flush+0x337/0xb00 [ 1491.784001] inode_go_sync+0x8e/0x200 [ 1491.784663] do_xmote+0xd2/0x380 [ 1491.785268] glock_work_func+0x57/0x130 [ 1491.785944] process_one_work+0x237/0x560 [ 1491.786660] worker_thread+0x4d/0x3b0 [ 1491.787307] ? process_one_work+0x560/0x560 [ 1491.788042] kthread+0x13f/0x170 [ 1491.788617] ? set_kthread_struct+0x40/0x40 [ 1491.789352] ret_from_fork+0x22/0x30 [ 1491.790084] BUG: scheduling while atomic: kworker/2:1H/1975/0x00000002 [ 1491.791316] INFO: lockdep is turned off. [ 1491.792119] Modules linked in: [ 1491.792734] Preemption disabled at: [ 1491.792735] [<0000000000000000>] 0x0 [ 1491.793971] CPU: 2 PID: 1975 Comm: kworker/2:1H Not tainted 5.14.0-rc2+ #47 [ 1491.795160] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/04 [ 1491.796618] Workqueue: glock_workqueue glock_work_func [ 1491.797531] Call Trace: [ 1491.797968] dump_stack_lvl+0x45/0x59 [ 1491.798621] __schedule_bug.cold+0x9c/0xad [ 1491.799340] __schedule+0x8e2/0xbb0 [ 1491.799956] ? lock_release+0x201/0x2e0 [ 1491.800633] schedule+0x48/0xf0 [ 1491.801241] schedule_timeout+0xaf/0x140 [ 1491.802045] ? trace_hardirqs_on+0x1b/0xe0 [ 1491.802840] ? destroy_timer_on_stack+0x10/0x10 [ 1491.803658] gfs2_make_fs_ro+0x249/0x2b0 [ 1491.804417] ? do_wait_intr_irq+0xa0/0xa0 [ 1491.805141] gfs2_withdraw.cold+0x16f/0x4bd [ 1491.805885] ? gfs2_freeze_lock+0x24/0x60 [ 1491.806685] gfs2_ail1_empty+0x305/0x310 [ 1491.807495] gfs2_flush_revokes+0x29/0x40 [ 1491.808281] revoke_lo_before_commit+0x12/0x1c0 [ 1491.809198] gfs2_log_flush+0x337/0xb00 [ 1491.809976] inode_go_sync+0x8e/0x200 [ 1491.810716] do_xmote+0xd2/0x380 [ 1491.811357] glock_work_func+0x57/0x130 [ 1491.812116] process_one_work+0x237/0x560 [ 1491.812882] worker_thread+0x4d/0x3b0 [ 1491.813612] ? process_one_work+0x560/0x560 [ 1491.814342] kthread+0x13f/0x170 [ 1491.814918] ? set_kthread_struct+0x40/0x40 [ 1491.815647] ret_from_fork+0x22/0x30 [ 1496.964925] gfs2: fsid=loop0.0: Journal recovery skipped for 0 until next mount. [ 1496.970088] gfs2: fsid=loop0.0: Glock dequeues delayed: 0 [ 1496.971948] gfs2: fsid=loop0.0: File system withdrawn [ 1496.973558] CPU: 2 PID: 1975 Comm: kworker/2:1H Tainted: G W 5.14.0-rc7 [ 1496.976692] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/04 [ 1496.979972] Workqueue: glock_workqueue glock_work_func [ 1496.981980] Call Trace: [ 1496.983015] dump_stack_lvl+0x45/0x59 [ 1496.984674] gfs2_withdraw.cold+0xd3/0x4bd [ 1496.986840] gfs2_ail1_empty+0x305/0x310 [ 1496.988888] gfs2_flush_revokes+0x29/0x40 [ 1496.990914] revoke_lo_before_commit+0x12/0x1c0 [ 1496.992965] gfs2_log_flush+0x337/0xb00 [ 1496.995501] inode_go_sync+0x8e/0x200 [ 1496.998030] do_xmote+0xd2/0x380 [ 1496.999681] glock_work_func+0x57/0x130 [ 1497.001480] process_one_work+0x237/0x560 [ 1497.003512] worker_thread+0x4d/0x3b0 [ 1497.005249] ? process_one_work+0x560/0x560 [ 1497.007208] kthread+0x13f/0x170 [ 1497.008761] ? set_kthread_struct+0x40/0x40 [ 1497.010758] ret_from_fork+0x22/0x30 [ 1502.086199] gfs2: fsid=loop0.0: warning: assertion "gfs2_log_is_empty(sdp)" failed at6 [ 1502.093118] CPU: 1 PID: 31026 Comm: mount Tainted: G W 5.14.0-rc2+ #47 [ 1502.096754] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/04 [ 1502.100580] Call Trace: [ 1502.101744] dump_stack_lvl+0x45/0x59 [ 1502.103470] gfs2_assert_warn_i.cold+0x2e/0x77 [ 1502.105537] gfs2_make_fs_ro+0x2a3/0x2b0 [ 1502.107381] ? do_wait_intr_irq+0xa0/0xa0 [ 1502.109256] gfs2_reconfigure+0x290/0x3c0 [ 1502.111144] ? gfs2_freeze_lock+0x24/0x60 [ 1502.113008] reconfigure_super+0x9f/0x220 [ 1502.114909] path_mount+0x96a/0xb40 [ 1502.116549] __x64_sys_mount+0xfe/0x140 [ 1502.118358] do_syscall_64+0x3b/0x90 [ 1502.120101] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 1502.122652] RIP: 0033:0x7f591889dfea [ 1502.124426] Code: 48 8b 0d a9 0e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 08 [ 1502.133077] RSP: 002b:00007ffde8ccc818 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [ 1502.136618] RAX: ffffffffffffffda RBX: 000055d76690c970 RCX: 00007f591889dfea [ 1502.139919] RDX: 000055d76690cb80 RSI: 000055d766910280 RDI: 000055d7669133d0 [ 1502.143309] RBP: 00007f5918beb1c4 R08: 000055d76690cba0 R09: 00007f5918960130 [ 1502.146999] R10: 0000000000200021 R11: 0000000000000246 R12: 0000000000000000 [ 1502.151191] R13: 0000000000200021 R14: 000055d7669133d0 R15: 000055d76690cb80