From: Vlastimil Babka <vbabka@suse.cz>
To: Alan Huang <mmpgouride@gmail.com>,
Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org,
syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com,
linux-mm@kvack.org, Tejun Heo <tj@kernel.org>,
Dennis Zhou <dennis@kernel.org>, Christoph Lameter <cl@linux.com>,
Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH] bcachefs: Use alloc_percpu_gfp to avoid deadlock
Date: Thu, 20 Feb 2025 18:16:43 +0100
Message-ID: <78d954b5-e33f-4bbc-855b-e91e96278bef@suse.cz>
In-Reply-To: <25FBAAE5-8BC6-41F3-9A6D-65911BA5A5D7@gmail.com>
On 2/20/25 11:57, Alan Huang wrote:
> Ping
>
>> On Feb 12, 2025, at 22:27, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>>
>> Adding pcpu people to the CC
>>
>> On Wed, Feb 12, 2025 at 06:06:25PM +0800, Alan Huang wrote:
>>> The cycle:
>>>
>>> CPU0: CPU1:
>>> bc->lock pcpu_alloc_mutex
>>> pcpu_alloc_mutex bc->lock
>>>
>>> Reported-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>> Tested-by: syzbot+fe63f377148a6371a9db@syzkaller.appspotmail.com
>>> Signed-off-by: Alan Huang <mmpgouride@gmail.com>
>>
>> So pcpu_alloc_mutex -> fs_reclaim?
>>
>> That's really awkward; seems like something that might invite more
>> issues. We can apply your fix if we need to, but I want to hear what the
>> percpu people have to say first.
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 6.14.0-rc2-syzkaller-00039-g09fbf3d50205 #0 Not tainted
>> ------------------------------------------------------
>> syz.0.21/5625 is trying to acquire lock:
>> ffffffff8ea19608 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x293/0x1760 mm/percpu.c:1782
>>
>> but task is already holding lock:
>> ffff888051401c68 (&bc->lock){+.+.}-{4:4}, at: bch2_btree_node_mem_alloc+0x559/0x16f0 fs/bcachefs/btree_cache.c:804
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #2 (&bc->lock){+.+.}-{4:4}:
>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>> __mutex_lock_common kernel/locking/mutex.c:585 [inline]
>> __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
>> bch2_btree_cache_scan+0x184/0xec0 fs/bcachefs/btree_cache.c:482
>> do_shrink_slab+0x72d/0x1160 mm/shrinker.c:437
>> shrink_slab+0x1093/0x14d0 mm/shrinker.c:664
>> shrink_one+0x43b/0x850 mm/vmscan.c:4868
>> shrink_many mm/vmscan.c:4929 [inline]
>> lru_gen_shrink_node mm/vmscan.c:5007 [inline]
>> shrink_node+0x37c5/0x3e50 mm/vmscan.c:5978
>> kswapd_shrink_node mm/vmscan.c:6807 [inline]
>> balance_pgdat mm/vmscan.c:6999 [inline]
>> kswapd+0x20f3/0x3b10 mm/vmscan.c:7264
>> kthread+0x7a9/0x920 kernel/kthread.c:464
>> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
>>
>> -> #1 (fs_reclaim){+.+.}-{0:0}:
>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>> __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
>> fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3867
>> might_alloc include/linux/sched/mm.h:318 [inline]
>> slab_pre_alloc_hook mm/slub.c:4066 [inline]
>> slab_alloc_node mm/slub.c:4144 [inline]
>> __do_kmalloc_node mm/slub.c:4293 [inline]
>> __kmalloc_noprof+0xae/0x4c0 mm/slub.c:4306
>> kmalloc_noprof include/linux/slab.h:905 [inline]
>> kzalloc_noprof include/linux/slab.h:1037 [inline]
>> pcpu_mem_zalloc mm/percpu.c:510 [inline]
>> pcpu_alloc_chunk mm/percpu.c:1430 [inline]
>> pcpu_create_chunk+0x57/0xbc0 mm/percpu-vm.c:338
>> pcpu_balance_populated mm/percpu.c:2063 [inline]
>> pcpu_balance_workfn+0xc4d/0xd40 mm/percpu.c:2200
>> process_one_work kernel/workqueue.c:3236 [inline]
>> process_scheduled_works+0xa66/0x1840 kernel/workqueue.c:3317
>> worker_thread+0x870/0xd30 kernel/workqueue.c:3398
>> kthread+0x7a9/0x920 kernel/kthread.c:464
>> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:148
>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Seeing this as part of the chain (fs reclaim from a worker doing
pcpu_balance_workfn) makes me think Michal's patch could be a fix to this:

https://lore.kernel.org/all/20250206122633.167896-1-mhocko@kernel.org/

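To make the chain concrete, here is a rough userspace sketch in C. It is not
kernel code and is far simpler than what lockdep actually does. It records
"held -> wanted" edges between the three lock classes named in the report and
refuses any edge that would close a cycle. The enum, record_dependency() and
replay_report() are names made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock classes named in the report; this enum is hypothetical. */
enum lock_class { PCPU_ALLOC_MUTEX, FS_RECLAIM, BC_LOCK, NR_CLASSES };

/* edge[a][b] is true once b has been acquired while a was held. */
static bool edge[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' over recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (edge[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is taken while held is held".
 * Returns false, and records nothing, if the new edge would close a cycle.
 */
static bool record_dependency(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	assert(held < NR_CLASSES && wanted < NR_CLASSES);
	if (reachable(wanted, held, seen))
		return false;
	edge[held][wanted] = true;
	return true;
}

/* Replay the three dependencies; returns the index of the rejected one, or -1. */
static int replay_report(void)
{
	/* #1: pcpu_balance_workfn allocates under pcpu_alloc_mutex */
	if (!record_dependency(PCPU_ALLOC_MUTEX, FS_RECLAIM))
		return 0;
	/* #2: the bcachefs shrinker takes bc->lock from reclaim */
	if (!record_dependency(FS_RECLAIM, BC_LOCK))
		return 1;
	/* #0: per-CPU allocation attempted with bc->lock held */
	if (!record_dependency(BC_LOCK, PCPU_ALLOC_MUTEX))
		return 2;
	return -1;
}
```

In this model the first two edges are accepted and the third is rejected,
which mirrors the point at which the report fires. Removing the
pcpu_alloc_mutex -> fs_reclaim edge would let the third edge through.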
>> -> #0 (pcpu_alloc_mutex){+.+.}-{4:4}:
>> check_prev_add kernel/locking/lockdep.c:3163 [inline]
>> check_prevs_add kernel/locking/lockdep.c:3282 [inline]
>> validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
>> __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>> __mutex_lock_common kernel/locking/mutex.c:585 [inline]
>> __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
>> pcpu_alloc_noprof+0x293/0x1760 mm/percpu.c:1782
>> __six_lock_init+0x104/0x150 fs/bcachefs/six.c:876
>> bch2_btree_lock_init+0x38/0x100 fs/bcachefs/btree_locking.c:12
>> bch2_btree_node_mem_alloc+0x565/0x16f0 fs/bcachefs/btree_cache.c:807
>> __bch2_btree_node_alloc fs/bcachefs/btree_update_interior.c:304 [inline]
>> bch2_btree_reserve_get+0x2df/0x1890 fs/bcachefs/btree_update_interior.c:532
>> bch2_btree_update_start+0xe56/0x14e0 fs/bcachefs/btree_update_interior.c:1230
>> bch2_btree_split_leaf+0x121/0x880 fs/bcachefs/btree_update_interior.c:1851
>> bch2_trans_commit_error+0x212/0x1380 fs/bcachefs/btree_trans_commit.c:908
>> __bch2_trans_commit+0x812b/0x97a0 fs/bcachefs/btree_trans_commit.c:1085
>> bch2_trans_commit fs/bcachefs/btree_update.h:183 [inline]
>> bch2_trans_mark_metadata_bucket+0x47a/0x17b0 fs/bcachefs/buckets.c:1043
>> bch2_trans_mark_metadata_sectors fs/bcachefs/buckets.c:1060 [inline]
>> __bch2_trans_mark_dev_sb fs/bcachefs/buckets.c:1100 [inline]
>> bch2_trans_mark_dev_sb+0x3f6/0x820 fs/bcachefs/buckets.c:1128
>> bch2_trans_mark_dev_sbs_flags+0x6be/0x720 fs/bcachefs/buckets.c:1138
>> bch2_fs_initialize+0xba0/0x1610 fs/bcachefs/recovery.c:1149
>> bch2_fs_start+0x36d/0x610 fs/bcachefs/super.c:1042
>> bch2_fs_get_tree+0xd8d/0x1740 fs/bcachefs/fs.c:2203
>> vfs_get_tree+0x90/0x2b0 fs/super.c:1814
>> do_new_mount+0x2be/0xb40 fs/namespace.c:3560
>> do_mount fs/namespace.c:3900 [inline]
>> __do_sys_mount fs/namespace.c:4111 [inline]
>> __se_sys_mount+0x2d6/0x3c0 fs/namespace.c:4088
>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> other info that might help us debug this:
>>
>> Chain exists of:
>> pcpu_alloc_mutex --> fs_reclaim --> &bc->lock
>>
>> Possible unsafe locking scenario:
>>
>> CPU0 CPU1
>> ---- ----
>> lock(&bc->lock);
>> lock(fs_reclaim);
>> lock(&bc->lock);
>> lock(pcpu_alloc_mutex);
>>
>> *** DEADLOCK ***
>>
>> 4 locks held by syz.0.21/5625:
>> #0: ffff888051400278 (&c->state_lock){+.+.}-{4:4}, at: bch2_fs_start+0x45/0x610 fs/bcachefs/super.c:1010
>> #1: ffff888051404378 (&c->btree_trans_barrier){.+.+}-{0:0}, at: srcu_lock_acquire include/linux/srcu.h:164 [inline]
>> #1: ffff888051404378 (&c->btree_trans_barrier){.+.+}-{0:0}, at: srcu_read_lock include/linux/srcu.h:256 [inline]
>> #1: ffff888051404378 (&c->btree_trans_barrier){.+.+}-{0:0}, at: __bch2_trans_get+0x7e4/0xd30 fs/bcachefs/btree_iter.c:3377
>> #2: ffff8880514266d0 (&c->gc_lock){.+.+}-{4:4}, at: bch2_btree_update_start+0x682/0x14e0 fs/bcachefs/btree_update_interior.c:1180
>> #3: ffff888051401c68 (&bc->lock){+.+.}-{4:4}, at: bch2_btree_node_mem_alloc+0x559/0x16f0 fs/bcachefs/btree_cache.c:804
>>
>> stack backtrace:
>> CPU: 0 UID: 0 PID: 5625 Comm: syz.0.21 Not tainted 6.14.0-rc2-syzkaller-00039-g09fbf3d50205 #0
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>> Call Trace:
>> <TASK>
>> __dump_stack lib/dump_stack.c:94 [inline]
>> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
>> print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2076
>> check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2208
>> check_prev_add kernel/locking/lockdep.c:3163 [inline]
>> check_prevs_add kernel/locking/lockdep.c:3282 [inline]
>> validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3906
>> __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
>> lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
>> __mutex_lock_common kernel/locking/mutex.c:585 [inline]
>> __mutex_lock+0x19c/0x1010 kernel/locking/mutex.c:730
>> pcpu_alloc_noprof+0x293/0x1760 mm/percpu.c:1782
>> __six_lock_init+0x104/0x150 fs/bcachefs/six.c:876
>> bch2_btree_lock_init+0x38/0x100 fs/bcachefs/btree_locking.c:12
>> bch2_btree_node_mem_alloc+0x565/0x16f0 fs/bcachefs/btree_cache.c:807
>> __bch2_btree_node_alloc fs/bcachefs/btree_update_interior.c:304 [inline]
>> bch2_btree_reserve_get+0x2df/0x1890 fs/bcachefs/btree_update_interior.c:532
>> bch2_btree_update_start+0xe56/0x14e0 fs/bcachefs/btree_update_interior.c:1230
>> bch2_btree_split_leaf+0x121/0x880 fs/bcachefs/btree_update_interior.c:1851
>> bch2_trans_commit_error+0x212/0x1380 fs/bcachefs/btree_trans_commit.c:908
>> __bch2_trans_commit+0x812b/0x97a0 fs/bcachefs/btree_trans_commit.c:1085
>> bch2_trans_commit fs/bcachefs/btree_update.h:183 [inline]
>> bch2_trans_mark_metadata_bucket+0x47a/0x17b0 fs/bcachefs/buckets.c:1043
>> bch2_trans_mark_metadata_sectors fs/bcachefs/buckets.c:1060 [inline]
>> __bch2_trans_mark_dev_sb fs/bcachefs/buckets.c:1100 [inline]
>> bch2_trans_mark_dev_sb+0x3f6/0x820 fs/bcachefs/buckets.c:1128
>> bch2_trans_mark_dev_sbs_flags+0x6be/0x720 fs/bcachefs/buckets.c:1138
>> bch2_fs_initialize+0xba0/0x1610 fs/bcachefs/recovery.c:1149
>> bch2_fs_start+0x36d/0x610 fs/bcachefs/super.c:1042
>> bch2_fs_get_tree+0xd8d/0x1740 fs/bcachefs/fs.c:2203
>> vfs_get_tree+0x90/0x2b0 fs/super.c:1814
>> do_new_mount+0x2be/0xb40 fs/namespace.c:3560
>> do_mount fs/namespace.c:3900 [inline]
>> __do_sys_mount fs/namespace.c:4111 [inline]
>> __se_sys_mount+0x2d6/0x3c0 fs/namespace.c:4088
>> do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>> do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> RIP: 0033:0x7fcaed38e58a
>> Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:00007fcaec5fde68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
>> RAX: ffffffffffffffda RBX: 00007fcaec5fdef0 RCX: 00007fcaed38e58a
>> RDX: 00004000000000c0 RSI: 0000400000000180 RDI: 00007fcaec5fdeb0
>> RBP: 00004000000000c0 R08: 00007fcaec5fdef0 R09: 0000000000000000
>>
>>> ---
>>> fs/bcachefs/six.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/bcachefs/six.c b/fs/bcachefs/six.c
>>> index 7e7c66a1e1a6..ccdc6d496910 100644
>>> --- a/fs/bcachefs/six.c
>>> +++ b/fs/bcachefs/six.c
>>> @@ -873,7 +873,7 @@ void __six_lock_init(struct six_lock *lock, const char *name,
>>> * failure if they wish by checking lock->readers, but generally
>>> * will not want to treat it as an error.
>>> */
>>> - lock->readers = alloc_percpu(unsigned);
>>> + lock->readers = alloc_percpu_gfp(unsigned, GFP_NOWAIT|__GFP_NOWARN);
>>> }
>>> #endif
>>> }
>>> --
>>> 2.47.0
>>>
>
>
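The comment in the quoted hunk says callers generally do not treat a failed
lock->readers allocation as an error. A minimal userspace sketch of that idea
follows, assuming the lock falls back to a shared counter when the per-CPU
counters are missing. Everything here (struct model_lock, NR_SLOTS,
alloc_slots_nowait() and the model_* helpers) is hypothetical and
single-threaded; it is not the real six lock code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

/* Hypothetical userspace model; not the real struct six_lock. */
struct model_lock {
	unsigned shared_readers;	/* fallback counter, always available */
	unsigned *readers;		/* per-slot counters, NULL if allocation failed */
};

/* Stand-in for a non-blocking allocation that fails rather than waits. */
static unsigned *alloc_slots_nowait(bool simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(NR_SLOTS, sizeof(unsigned));
}

static void model_lock_init(struct model_lock *lock, bool simulate_failure)
{
	assert(lock != NULL);
	lock->shared_readers = 0;
	/* Failure is tolerated: the lock works without per-slot counters. */
	lock->readers = alloc_slots_nowait(simulate_failure);
}

static void model_read_lock(struct model_lock *lock, unsigned slot)
{
	if (lock->readers)
		lock->readers[slot % NR_SLOTS]++;	/* scalable path */
	else
		lock->shared_readers++;			/* fallback path */
}

static unsigned model_reader_count(const struct model_lock *lock)
{
	unsigned total = lock->shared_readers;

	if (lock->readers)
		for (size_t i = 0; i < NR_SLOTS; i++)
			total += lock->readers[i];
	return total;
}

static void model_lock_exit(struct model_lock *lock)
{
	free(lock->readers);
	lock->readers = NULL;
}
```

Both variants report the same reader count. The only difference is where the
counts live, which is why a failed non-blocking allocation costs scalability
rather than correctness in this model.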
Thread overview: 9+ messages
[not found] <20250212100625.55860-1-mmpgouride@gmail.com>
2025-02-12 14:27 ` [PATCH] bcachefs: Use alloc_percpu_gfp to avoid deadlock Kent Overstreet
2025-02-20 10:57 ` Alan Huang
2025-02-20 12:40 ` Kent Overstreet
2025-02-20 12:44 ` Alan Huang
2025-02-20 17:16 ` Vlastimil Babka [this message]
2025-02-20 20:37 ` Kent Overstreet
2025-02-21 2:46 ` Dennis Zhou
2025-02-21 7:21 ` Vlastimil Babka
2025-02-21 19:44 ` Alan Huang