From: "Lai, Yi" <yi1.lai@linux.intel.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Peter Zijlstra <peterz@infradead.org>,
Waiman Long <longman@redhat.com>,
Boqun Feng <boqun.feng@gmail.com>, Ingo Molnar <mingo@redhat.com>,
Will Deacon <will@kernel.org>,
linux-kernel@vger.kernel.org,
Bart Van Assche <bvanassche@acm.org>,
yi1.lai@intel.com, syzkaller-bugs@googlegroups.com
Subject: Re: [PATCH V2 3/3] block: model freeze & enter queue as lock for supporting lockdep
Date: Wed, 30 Oct 2024 18:39:13 +0800
Message-ID: <ZyIM0dWzxC9zBIuf@ly-workstation>
In-Reply-To: <CAFj5m9+bL23T7mMwR7g_8umTzkNJa14n8AhR3_g6QjB2YCcc5A@mail.gmail.com>

On Wed, Oct 30, 2024 at 05:50:15PM +0800, Ming Lei wrote:
> On Wed, Oct 30, 2024 at 4:51 PM Lai, Yi <yi1.lai@linux.intel.com> wrote:
> >
> > On Wed, Oct 30, 2024 at 03:13:09PM +0800, Ming Lei wrote:
> > > On Wed, Oct 30, 2024 at 02:45:03PM +0800, Lai, Yi wrote:
> ...
> > >
> > > It should be addressed by the following patch:
> > >
> > > https://lore.kernel.org/linux-block/ZyEGLdg744U_xBjp@fedora/
> > >
> >
> > I have applied the proposed fix on top of next-20241029. The issue can
> > still be reproduced.
> >
> > It seems the dependency chain in my log is different from the one in Marek's log.
>
> Can you post the new log, since the q->q_usage_counter(io) -> fs_reclaim
> dependency from blk_mq_init_sched is cut by the patch?
>
Here is the new possible-deadlock log with the patch applied:
[ 52.485023] repro: page allocation failure: order:1, mode:0x10cc0(GFP_KERNEL|__GFP_NORETRY), nodemask=(null),cpuset=/,mems0
[ 52.486074] CPU: 1 UID: 0 PID: 635 Comm: repro Not tainted 6.12.0-rc5-next-20241029-kvm-dirty #6
[ 52.486752] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/014
[ 52.487616] Call Trace:
[ 52.487820] <TASK>
[ 52.488001] dump_stack_lvl+0x121/0x150
[ 52.488345] dump_stack+0x19/0x20
[ 52.488616] warn_alloc+0x218/0x350
[ 52.488913] ? __pfx_warn_alloc+0x10/0x10
[ 52.489263] ? __alloc_pages_direct_compact+0x130/0xa10
[ 52.489699] ? __pfx___alloc_pages_direct_compact+0x10/0x10
[ 52.490151] ? __drain_all_pages+0x27b/0x480
[ 52.490522] __alloc_pages_slowpath.constprop.0+0x14d6/0x21e0
[ 52.491018] ? __pfx___alloc_pages_slowpath.constprop.0+0x10/0x10
[ 52.491519] ? lock_is_held_type+0xef/0x150
[ 52.491875] ? __pfx_get_page_from_freelist+0x10/0x10
[ 52.492291] ? lock_acquire+0x80/0xb0
[ 52.492619] __alloc_pages_noprof+0x5d4/0x6f0
[ 52.492992] ? __pfx___alloc_pages_noprof+0x10/0x10
[ 52.493405] ? __sanitizer_cov_trace_switch+0x58/0xa0
[ 52.493830] ? policy_nodemask+0xf9/0x450
[ 52.494169] alloc_pages_mpol_noprof+0x30a/0x580
[ 52.494561] ? __pfx_alloc_pages_mpol_noprof+0x10/0x10
[ 52.494982] ? sysvec_apic_timer_interrupt+0x6a/0xd0
[ 52.495396] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 52.495845] alloc_pages_noprof+0xa9/0x180
[ 52.496201] kimage_alloc_pages+0x79/0x240
[ 52.496558] kimage_alloc_control_pages+0x1cb/0xa60
[ 52.496982] ? __pfx_kimage_alloc_control_pages+0x10/0x10
[ 52.497437] ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
[ 52.497897] do_kexec_load+0x3a6/0x8e0
[ 52.498228] ? __pfx_do_kexec_load+0x10/0x10
[ 52.498593] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 52.499035] ? _copy_from_user+0xb6/0xf0
[ 52.499371] __x64_sys_kexec_load+0x1cc/0x240
[ 52.499740] x64_sys_call+0xf0f/0x20d0
[ 52.500055] do_syscall_64+0x6d/0x140
[ 52.500367] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 52.500778] RIP: 0033:0x7f310423ee5d
[ 52.501077] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c88
[ 52.502494] RSP: 002b:00007fffcecca558 EFLAGS: 00000207 ORIG_RAX: 00000000000000f6
[ 52.503087] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f310423ee5d
[ 52.503644] RDX: 0000000020000040 RSI: 0000000000000009 RDI: 0000000000000000
[ 52.504198] RBP: 00007fffcecca560 R08: 00007fffcecc9fd0 R09: 00007fffcecca590
[ 52.504767] R10: 0000000000000000 R11: 0000000000000207 R12: 00007fffcecca6d8
[ 52.505345] R13: 0000000000401c72 R14: 0000000000403e08 R15: 00007f3104469000
[ 52.505949] </TASK>
[ 52.506239] Mem-Info:
[ 52.506449] active_anon:119 inactive_anon:14010 isolated_anon:0
[ 52.506449] active_file:17895 inactive_file:87 isolated_file:0
[ 52.506449] unevictable:0 dirty:15 writeback:0
[ 52.506449] slab_reclaimable:6957 slab_unreclaimable:20220
[ 52.506449] mapped:11598 shmem:1150 pagetables:766
[ 52.506449] sec_pagetables:0 bounce:0
[ 52.506449] kernel_misc_reclaimable:0
[ 52.506449] free:13776 free_pcp:99 free_cma:0
[ 52.509456] Node 0 active_anon:476kB inactive_anon:56040kB active_file:71580kB inactive_file:348kB unevictable:0kB isolateo
[ 52.511881] Node 0 DMA free:440kB boost:0kB min:440kB low:548kB high:656kB reserved_highatomic:0KB active_anon:0kB inactivB
[ 52.513883] lowmem_reserve[]: 0 1507 0 0 0
[ 52.514269] Node 0 DMA32 free:54664kB boost:0kB min:44612kB low:55764kB high:66916kB reserved_highatomic:0KB active_anon:4B
[ 52.516485] lowmem_reserve[]: 0 0 0 0 0
[ 52.516831] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 1*32kB (U) 0*64kB 1*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4B
[ 52.517895] Node 0 DMA32: 2970*4kB (UME) 1123*8kB (UME) 532*16kB (UME) 280*32kB (UM) 126*64kB (UM) 27*128kB (UME) 9*256kB B
[ 52.519279] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[ 52.519387]
[ 52.519971] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[ 52.520138] ======================================================
[ 52.520805] 19113 total pagecache pages
[ 52.521702] WARNING: possible circular locking dependency detected
[ 52.522016] 0 pages in swap cache
[ 52.522022] Free swap = 124984kB
[ 52.522027] Total swap = 124996kB
[ 52.523720] 6.12.0-rc5-next-20241029-kvm-dirty #6 Not tainted
[ 52.523741] ------------------------------------------------------
[ 52.524059] 524158 pages RAM
[ 52.525050] kswapd0/56 is trying to acquire lock:
[ 52.525452] 0 pages HighMem/MovableOnly
[ 52.525461] 129765 pages reserved
[ 52.525465] 0 pages cma reserved
[ 52.525469] 0 pages hwpoisoned
[ 52.527163] ffff8880104374e8 (&q->q_usage_counter(io)#25){++++}-{0:0}, at: __submit_bio+0x39f/0x550
[ 52.532396]
[ 52.532396] but task is already holding lock:
[ 52.533293] ffffffff872322a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xb0f/0x1500
[ 52.534508]
[ 52.534508] which lock already depends on the new lock.
[ 52.534508]
[ 52.535723]
[ 52.535723] the existing dependency chain (in reverse order) is:
[ 52.536818]
[ 52.536818] -> #2 (fs_reclaim){+.+.}-{0:0}:
[ 52.537705] lock_acquire+0x80/0xb0
[ 52.538337] fs_reclaim_acquire+0x116/0x160
[ 52.539076] blk_mq_alloc_and_init_hctx+0x4df/0x1200
[ 52.539906] blk_mq_realloc_hw_ctxs+0x4cf/0x610
[ 52.540676] blk_mq_init_allocated_queue+0x3da/0x11b0
[ 52.541547] blk_mq_alloc_queue+0x22c/0x300
[ 52.542279] __blk_mq_alloc_disk+0x34/0x100
[ 52.543011] loop_add+0x4c9/0xbd0
[ 52.543622] loop_init+0x133/0x1a0
[ 52.544248] do_one_initcall+0x114/0x5d0
[ 52.544954] kernel_init_freeable+0xab0/0xeb0
[ 52.545732] kernel_init+0x28/0x2f0
[ 52.546366] ret_from_fork+0x56/0x90
[ 52.547009] ret_from_fork_asm+0x1a/0x30
[ 52.547698]
[ 52.547698] -> #1 (&q->sysfs_lock){+.+.}-{4:4}:
[ 52.548625] lock_acquire+0x80/0xb0
[ 52.549276] __mutex_lock+0x17c/0x1540
[ 52.549958] mutex_lock_nested+0x1f/0x30
[ 52.550664] queue_attr_store+0xea/0x180
[ 52.551360] sysfs_kf_write+0x11f/0x180
[ 52.552036] kernfs_fop_write_iter+0x40e/0x630
[ 52.552808] vfs_write+0xc59/0x1140
[ 52.553446] ksys_write+0x14f/0x290
[ 52.554068] __x64_sys_write+0x7b/0xc0
[ 52.554728] x64_sys_call+0x1685/0x20d0
[ 52.555397] do_syscall_64+0x6d/0x140
[ 52.556029] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 52.556865]
[ 52.556865] -> #0 (&q->q_usage_counter(io)#25){++++}-{0:0}:
[ 52.557963] __lock_acquire+0x2ff8/0x5d60
[ 52.558667] lock_acquire.part.0+0x142/0x390
[ 52.559427] lock_acquire+0x80/0xb0
[ 52.560057] blk_mq_submit_bio+0x1cbe/0x2590
[ 52.560801] __submit_bio+0x39f/0x550
[ 52.561473] submit_bio_noacct_nocheck+0x647/0xcc0
[ 52.562285] submit_bio_noacct+0x620/0x1e00
[ 52.563017] submit_bio+0xce/0x480
[ 52.563637] __swap_writepage+0x2f1/0xdf0
[ 52.564349] swap_writepage+0x464/0xbc0
[ 52.565022] shmem_writepage+0xdeb/0x1340
[ 52.565745] pageout+0x3bc/0x9b0
[ 52.566353] shrink_folio_list+0x16b9/0x3b60
[ 52.567104] shrink_lruvec+0xd78/0x2790
[ 52.567794] shrink_node+0xb29/0x2870
[ 52.568454] balance_pgdat+0x9c2/0x1500
[ 52.569142] kswapd+0x765/0xe00
[ 52.569741] kthread+0x35a/0x470
[ 52.570340] ret_from_fork+0x56/0x90
[ 52.570993] ret_from_fork_asm+0x1a/0x30
[ 52.571696]
[ 52.571696] other info that might help us debug this:
[ 52.571696]
[ 52.572904] Chain exists of:
[ 52.572904] &q->q_usage_counter(io)#25 --> &q->sysfs_lock --> fs_reclaim
[ 52.572904]
[ 52.574631] Possible unsafe locking scenario:
[ 52.574631]
[ 52.575547] CPU0 CPU1
[ 52.576246] ---- ----
[ 52.576942] lock(fs_reclaim);
[ 52.577467] lock(&q->sysfs_lock);
[ 52.578382] lock(fs_reclaim);
[ 52.579250] rlock(&q->q_usage_counter(io)#25);
[ 52.579974]
[ 52.579974] *** DEADLOCK ***
[ 52.579974]
[ 52.580866] 1 lock held by kswapd0/56:
[ 52.581459] #0: ffffffff872322a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xb0f/0x1500
[ 52.582731]
[ 52.582731] stack backtrace:
[ 52.583404] CPU: 0 UID: 0 PID: 56 Comm: kswapd0 Not tainted 6.12.0-rc5-next-20241029-kvm-dirty #6
[ 52.584735] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/014
[ 52.586439] Call Trace:
[ 52.586836] <TASK>
[ 52.587190] dump_stack_lvl+0xea/0x150
[ 52.587753] dump_stack+0x19/0x20
[ 52.588253] print_circular_bug+0x47f/0x750
[ 52.588872] check_noncircular+0x2f4/0x3e0
[ 52.589492] ? __pfx_check_noncircular+0x10/0x10
[ 52.590180] ? lockdep_lock+0xd0/0x1d0
[ 52.590741] ? __pfx_lockdep_lock+0x10/0x10
[ 52.590790] kexec: Could not allocate control_code_buffer
[ 52.591341] __lock_acquire+0x2ff8/0x5d60
[ 52.592365] ? __pfx___lock_acquire+0x10/0x10
[ 52.593087] ? __pfx_mark_lock.part.0+0x10/0x10
[ 52.593753] ? __kasan_check_read+0x15/0x20
[ 52.594366] lock_acquire.part.0+0x142/0x390
[ 52.594989] ? __submit_bio+0x39f/0x550
[ 52.595554] ? __pfx_lock_acquire.part.0+0x10/0x10
[ 52.596246] ? debug_smp_processor_id+0x20/0x30
[ 52.596900] ? rcu_is_watching+0x19/0xc0
[ 52.597484] ? trace_lock_acquire+0x139/0x1b0
[ 52.598118] lock_acquire+0x80/0xb0
[ 52.598633] ? __submit_bio+0x39f/0x550
[ 52.599191] blk_mq_submit_bio+0x1cbe/0x2590
[ 52.599805] ? __submit_bio+0x39f/0x550
[ 52.600361] ? __kasan_check_read+0x15/0x20
[ 52.600966] ? __pfx_blk_mq_submit_bio+0x10/0x10
[ 52.601632] ? __pfx_mark_lock.part.0+0x10/0x10
[ 52.602285] ? __this_cpu_preempt_check+0x21/0x30
[ 52.602968] ? __this_cpu_preempt_check+0x21/0x30
[ 52.603646] ? lock_release+0x441/0x870
[ 52.604207] __submit_bio+0x39f/0x550
[ 52.604742] ? __pfx___submit_bio+0x10/0x10
[ 52.605364] ? __this_cpu_preempt_check+0x21/0x30
[ 52.606045] ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
[ 52.606940] ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
[ 52.607707] ? kvm_clock_get_cycles+0x43/0x70
[ 52.608345] submit_bio_noacct_nocheck+0x647/0xcc0
[ 52.609045] ? __pfx_submit_bio_noacct_nocheck+0x10/0x10
[ 52.609820] ? __sanitizer_cov_trace_switch+0x58/0xa0
[ 52.610552] submit_bio_noacct+0x620/0x1e00
[ 52.611167] submit_bio+0xce/0x480
[ 52.611677] __swap_writepage+0x2f1/0xdf0
[ 52.612267] swap_writepage+0x464/0xbc0
[ 52.612837] shmem_writepage+0xdeb/0x1340
[ 52.613441] ? __pfx_shmem_writepage+0x10/0x10
[ 52.614090] ? __kasan_check_write+0x18/0x20
[ 52.614716] ? folio_clear_dirty_for_io+0xc1/0x600
[ 52.615403] pageout+0x3bc/0x9b0
[ 52.615894] ? __pfx_pageout+0x10/0x10
[ 52.616471] ? __pfx_folio_referenced_one+0x10/0x10
[ 52.617169] ? __pfx_folio_lock_anon_vma_read+0x10/0x10
[ 52.617918] ? __pfx_invalid_folio_referenced_vma+0x10/0x10
[ 52.618713] shrink_folio_list+0x16b9/0x3b60
[ 52.619346] ? __pfx_shrink_folio_list+0x10/0x10
[ 52.620021] ? __this_cpu_preempt_check+0x21/0x30
[ 52.620713] ? mark_lock.part.0+0xf3/0x17b0
[ 52.621339] ? isolate_lru_folios+0xcb1/0x1250
[ 52.621991] ? __pfx_mark_lock.part.0+0x10/0x10
[ 52.622655] ? __this_cpu_preempt_check+0x21/0x30
[ 52.623335] ? lock_release+0x441/0x870
[ 52.623900] ? __this_cpu_preempt_check+0x21/0x30
[ 52.624573] ? _raw_spin_unlock_irq+0x2c/0x60
[ 52.625204] ? lockdep_hardirqs_on+0x89/0x110
[ 52.625848] shrink_lruvec+0xd78/0x2790
[ 52.626422] ? __pfx_shrink_lruvec+0x10/0x10
[ 52.627040] ? __this_cpu_preempt_check+0x21/0x30
[ 52.627729] ? __this_cpu_preempt_check+0x21/0x30
[ 52.628423] ? trace_lock_acquire+0x139/0x1b0
[ 52.629061] ? trace_lock_acquire+0x139/0x1b0
[ 52.629752] shrink_node+0xb29/0x2870
[ 52.630305] ? __pfx_shrink_node+0x10/0x10
[ 52.630899] ? pgdat_balanced+0x1d4/0x230
[ 52.631490] balance_pgdat+0x9c2/0x1500
[ 52.632055] ? __pfx_balance_pgdat+0x10/0x10
[ 52.632669] ? __this_cpu_preempt_check+0x21/0x30
[ 52.633380] kswapd+0x765/0xe00
[ 52.633861] ? __pfx_kswapd+0x10/0x10
[ 52.634393] ? local_clock_noinstr+0xb0/0xd0
[ 52.635015] ? __pfx_autoremove_wake_function+0x10/0x10
[ 52.635759] ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
[ 52.636525] ? __kthread_parkme+0x15d/0x230
[ 52.637134] kthread+0x35a/0x470
[ 52.637616] ? __pfx_kswapd+0x10/0x10
[ 52.638146] ? __pfx_kthread+0x10/0x10
[ 52.638693] ret_from_fork+0x56/0x90
[ 52.639227] ? __pfx_kthread+0x10/0x10
[ 52.639778] ret_from_fork_asm+0x1a/0x30
[ 52.640391] </TASK>
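
To make the chain in the report above easier to follow, here is a minimal sketch (plain Python, not kernel code and not how lockdep is implemented) that models the "Chain exists of" line as a directed graph and shows that the acquisition kswapd0 attempts closes a cycle. The helper names `with_edge` and `find_cycle` are hypothetical and exist only for this illustration; the lock names are copied from the log.

```python
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def with_edge(edges: Graph, held: str, wanted: str) -> Graph:
    """Return a copy of the graph with the dependency held -> wanted added."""
    new = {k: list(v) for k, v in edges.items()}
    new.setdefault(held, []).append(wanted)
    return new


def find_cycle(edges: Graph, start: str) -> Optional[List[str]]:
    """Depth-first search from start; return a path that leads back to
    start, or None if no such path exists."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for nxt in edges.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


# Existing dependencies, taken from the "Chain exists of" line:
#   &q->q_usage_counter(io)#25 --> &q->sysfs_lock --> fs_reclaim
existing: Graph = {
    "q_usage_counter(io)": ["sysfs_lock"],
    "sysfs_lock": ["fs_reclaim"],
}

# kswapd0 holds fs_reclaim and tries to acquire q_usage_counter(io).
graph = with_edge(existing, "fs_reclaim", "q_usage_counter(io)")

cycle = find_cycle(graph, "fs_reclaim")
print(" --> ".join(cycle) if cycle else "no cycle")
```

Without the new edge the graph is acyclic; adding fs_reclaim --> q_usage_counter(io) is what completes the loop, which matches the "Possible unsafe locking scenario" printed by lockdep.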
> Thanks,
> Ming
>