From: Wang Yugui <wangyugui@e16-tech.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: fstests btrfs/057 trigger a deadlock on 5.4.187
Date: Mon, 18 Jul 2022 08:03:34 +0800 [thread overview]
Message-ID: <20220718080300.A3BF.409509F4@e16-tech.com> (raw)
In-Reply-To: <20220611143547.3F40.409509F4@e16-tech.com>
Hi,
Because process_srcu() is 100% busy, so this problem is srcu related.
the only srcu in btrfs is subvol_srcu, but it is dropped in 5.17
c75e839414d3: Josef Bacik: btrfs: kill the subvol_srcu
and this patch is difficult to backport to 5.4.x.
so this problem maybe a little difficult to fix. If we hit this problem
in more frequency, we may upgrade to 5.10.x
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/07/18
> Hi,
>
>
> > this dead-lock is triggered by fstests btrfs/057 again on the lastest 5.4(5.4.197).
> >
> > Frequency: about 1/100.
> >
> > The Call Trace on 5.4.197
> >
> > [ 704.619697] sysrq: Show Blocked State
> > [ 704.624312] task PC stack pid father
> > [ 704.632499] btrfs-cleaner D 0 33205 2 0x80004000
> > [ 704.639055] Call Trace:
> > [ 704.642239] __schedule+0x284/0x6d0
> > [ 704.646595] schedule+0x2f/0xa0
> > [ 704.650669] wait_current_trans+0xb3/0xf0 [btrfs]
> > [ 704.656407] ? finish_wait+0x80/0x80
> > [ 704.660916] start_transaction+0x41c/0x4f0 [btrfs]
> > [ 704.666767] btrfs_drop_snapshot+0x548/0x840 [btrfs]
> > [ 704.672778] ? btrfs_run_defrag_inodes+0x79/0x380 [btrfs]
> > [ 704.679247] ? down_write+0xe/0x40
> > [ 704.683524] ? btrfs_delete_unused_bgs+0x35/0x5b0 [btrfs]
> > [ 704.690065] ? btree_submit_bio_start+0x10/0x10 [btrfs]
> > [ 704.696366] btrfs_clean_one_deleted_snapshot+0xba/0x110 [btrfs]
> > [ 704.703513] cleaner_kthread+0xfa/0x120 [btrfs]
> > [ 704.709054] kthread+0x112/0x130
> > [ 704.713104] ? __kthread_cancel_work+0x40/0x40
> > [ 704.718511] ret_from_fork+0x1f/0x40
> >
> > [ 704.722978] btrfs-transacti D 0 33206 2 0x80004000
> > [ 704.729585] Call Trace:
> > [ 704.732809] __schedule+0x284/0x6d0
> > [ 704.737156] schedule+0x2f/0xa0
> > [ 704.741123] wait_current_trans+0xb3/0xf0 [btrfs]
> > [ 704.746823] ? finish_wait+0x80/0x80
> > [ 704.751270] start_transaction+0x315/0x4f0 [btrfs]
> > [ 704.757099] transaction_kthread+0xa2/0x180 [btrfs]
> > [ 704.763001] ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]
> > [ 704.769776] kthread+0x112/0x130
> > [ 704.773811] ? __kthread_cancel_work+0x40/0x40
> > [ 704.779218] ret_from_fork+0x1f/0x40
> >
> > [ 704.783650] umount D 0 33243 33239 0x00004000
> > [ 704.790253] Call Trace:
> > [ 704.793423] __schedule+0x284/0x6d0
> > [ 704.797788] schedule+0x2f/0xa0
> > [ 704.801733] schedule_timeout+0x20d/0x340
> > [ 704.806634] wait_for_completion+0x11f/0x190
> > [ 704.811866] ? wake_up_q+0x70/0x70
> > [ 704.816099] __synchronize_srcu.part.20+0x81/0xb0
> > [ 704.821784] ? __bpf_trace_rcu_utilization+0x10/0x10
> > [ 704.827787] btrfs_drop_and_free_fs_root+0x87/0xe0 [btrfs]
> > [ 704.834350] switch_commit_roots+0x186/0x1c0 [btrfs]
> > [ 704.840373] btrfs_commit_transaction+0x5da/0xa20 [btrfs]
> > [ 704.846842] ? finish_wait+0x80/0x80
> > [ 704.851265] sync_filesystem+0x71/0x90
> > [ 704.855907] generic_shutdown_super+0x22/0x100
> > [ 704.861342] kill_anon_super+0x14/0x30
> > [ 704.866001] btrfs_kill_super+0x12/0xa0 [btrfs]
> > [ 704.871487] deactivate_locked_super+0x34/0x70
> > [ 704.876890] cleanup_mnt+0xb8/0x140
> > [ 704.881190] task_work_run+0x8a/0xb0
> > [ 704.885586] exit_to_usermode_loop+0x103/0x120
> > [ 704.890981] do_syscall_64+0x198/0x1a0
> > [ 704.895604] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > [ 704.901698] RIP: 0033:0x7fc630348ba7
> > [ 704.906098] Code: Bad RIP value.
> > [ 704.910097] RSP: 002b:00007ffd612e8a18 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> > [ 704.918967] RAX: 0000000000000000 RBX: 0000555b85889060 RCX: 00007fc630348ba7
> > [ 704.927386] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000555b8588fb10
> > [ 704.935800] RBP: 0000555b8588fb10 R08: 0000000000000000 R09: 000000000000000d
> > [ 704.944185] R10: 00007ffd612e84a0 R11: 0000000000000246 R12: 00007fc6310f2ee4
> > [ 704.952559] R13: 0000000000000000 R14: 0000555b858892d0 R15: 00000000ffffffff
>
> In addition to the 3 state D theads,
> another thread(PID: 1560) is 100% busy
>
> this is the call trace reported by 'sysrq l'
>
> [ 1269.337878] CPU: 31 PID: 1560 Comm: kworker/31:2 Not tainted 5.4.197-2.el7.x86_64 #1
> [ 1269.337878] Hardware name: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.14.0 01/25/2022
> [ 1269.337879] Workqueue: rcu_gp process_srcu
> [ 1269.337879] RIP: 0010:delay_tsc+0x27/0x60
> [ 1269.337880] Code: 00 00 00 0f 1f 44 00 00 65 44 8b 0d a3 c3 64 54 0f 01 f9 66 90 48 c1 e2 20 48 09 c2 49 89 d0 eb 11 f3 90 65 8b 35 89 c3 64 54 <41> 39 f1 75 1b 41 89 f1 0f 01 f9 66 90 48 c1 e2 20 48 09 d0 48 89
> [ 1269.337880] RSP: 0018:ffffc3e30e1bbde0 EFLAGS: 00000283
> [ 1269.337880] RAX: 000007443487c36f RBX: 00000000000042bc RCX: 000000000000101f
> [ 1269.337881] RDX: 00000000000032a8 RSI: 000000000000001f RDI: 00000000000032c5
> [ 1269.337881] RBP: 0000000000000000 R08: 00000744348790c7 R09: 000000000000001f
> [ 1269.337881] R10: 8080808080808080 R11: ffffffffaca52ff8 R12: 0000000000000000
> [ 1269.337882] R13: 00000000000042bf R14: 00000000ffffffff R15: ffffa01a770b0498
> [ 1269.337882] FS: 0000000000000000(0000) GS:ffffa05aff7c0000(0000) knlGS:0000000000000000
> [ 1269.337882] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1269.337882] CR2: 0000555b8589e838 CR3: 0000007f7006e002 CR4: 00000000003606e0
> [ 1269.337883] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1269.337883] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1269.337883] Call Trace:
> [ 1269.337883] try_check_zero+0xc2/0xf0
> [ 1269.337883] process_srcu+0x15e/0x4a0
> [ 1269.337884] ? __queue_work+0x13d/0x3e0
> [ 1269.337884] process_one_work+0x1a7/0x360
> [ 1269.337884] worker_thread+0x30/0x390
> [ 1269.337884] ? create_worker+0x1a0/0x1a0
> [ 1269.337884] kthread+0x112/0x130
> [ 1269.337885] ? __kthread_cancel_work+0x40/0x40
> [ 1269.337885] ret_from_fork+0x1f/0x40
>
>
> Best Regards
> Wang Yugui (wangyugui@e16-tech.com)
> 2022/06/11
>
next prev parent reply other threads:[~2022-07-18 0:03 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-10 9:22 fstests btrfs/057 trigger a deadlock on 5.4.187 Wang Yugui
2022-06-11 6:26 ` Wang Yugui
2022-06-11 6:35 ` Wang Yugui
2022-07-18 0:03 ` Wang Yugui [this message]
2024-02-12 12:17 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220718080300.A3BF.409509F4@e16-tech.com \
--to=wangyugui@e16-tech.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox