Re: [syzbot] [block?] possible deadlock in elv_iosched_store

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Nilay Shroff <nilay@linux.ibm.com>
To: Ming Lei <ming.lei@redhat.com>,
	syzbot <syzbot+4c7e0f9b94ad65811efb@syzkaller.appspotmail.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] [block?] possible deadlock in elv_iosched_store
Date: Tue, 1 Apr 2025 17:23:56 +0530	[thread overview]
Message-ID: <462d4e8a-dd95-48fe-b9fe-a558057f9595@linux.ibm.com> (raw)
In-Reply-To: <Z-dUCLvf06SfTOHy@fedora>



On 3/29/25 7:29 AM, Ming Lei wrote:
> On Fri, Mar 28, 2025 at 07:37:25AM -0700, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    1a9239bb4253 Merge tag 'net-next-6.15' of git://git.kernel..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=1384b43f980000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=c7163a109ac459a8
>> dashboard link: https://syzkaller.appspot.com/bug?extid=4c7e0f9b94ad65811efb
>> compiler:       gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=178cfa4c580000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=11a8ca4c580000
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/fc7dc9f0d9a7/disk-1a9239bb.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/f555a3ae03d3/vmlinux-1a9239bb.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/55f6ea74eaf2/bzImage-1a9239bb.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+4c7e0f9b94ad65811efb@syzkaller.appspotmail.com
>>
> 
> ...
> 
>>
>> If you want syzbot to run the reproducer, reply with:
>> #syz test: git://repo/address.git branch-or-commit-hash
>> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index ae8494d88897..d7a103dc258b 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -4465,14 +4465,12 @@ static struct blk_mq_hw_ctx *blk_mq_alloc_and_init_hctx(
>  	return NULL;
>  }
>  
> -static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
> -						struct request_queue *q)
> +static void __blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
> +				     struct request_queue *q)
>  {
>  	struct blk_mq_hw_ctx *hctx;
>  	unsigned long i, j;
>  
> -	/* protect against switching io scheduler  */
> -	mutex_lock(&q->elevator_lock);
>  	for (i = 0; i < set->nr_hw_queues; i++) {
>  		int old_node;
>  		int node = blk_mq_get_hctx_node(set, i);
> @@ -4505,7 +4503,19 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
>  
>  	xa_for_each_start(&q->hctx_table, j, hctx, j)
>  		blk_mq_exit_hctx(q, set, hctx, j);
> -	mutex_unlock(&q->elevator_lock);
> +}
> +
> +static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
> +				   struct request_queue *q, bool lock)
> +{
> +	if (lock) {
> +		/* protect against switching io scheduler  */
> +		mutex_lock(&q->elevator_lock);
> +		__blk_mq_realloc_hw_ctxs(set, q);
> +		mutex_unlock(&q->elevator_lock);
> +	} else {
> +		__blk_mq_realloc_hw_ctxs(set, q);
> +	}
>  
>  	/* unregister cpuhp callbacks for exited hctxs */
>  	blk_mq_remove_hw_queues_cpuhp(q);
> @@ -4537,7 +4547,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
>  
>  	xa_init(&q->hctx_table);
>  
> -	blk_mq_realloc_hw_ctxs(set, q);
> +	blk_mq_realloc_hw_ctxs(set, q, false);
>  	if (!q->nr_hw_queues)
>  		goto err_hctxs;
>  
> @@ -5033,7 +5043,7 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
>  fallback:
>  	blk_mq_update_queue_map(set);
>  	list_for_each_entry(q, &set->tag_list, tag_set_list) {
> -		blk_mq_realloc_hw_ctxs(set, q);
> +		blk_mq_realloc_hw_ctxs(set, q, true);
>  
>  		if (q->nr_hw_queues != set->nr_hw_queues) {
>  			int i = prev_nr_hw_queues;
> 

This patch looks good to me, however after we fix this one, I found another splat. 
I see that these new splats are side effect of commit ffa1e7ada456 ("block: Make 
request_queue lockdep splats show up earlier").

IMO in the block layer code (unless it's in an IO submission path or a path where we 
have already frozen queue) we may still want to allow memory allocation with GFP_KERNEL. 
So in that sense, for example, we may acquire ->elevator_lock followed by fs_reclaim. 
Or in another words, shouldn't it be legitimate to acquire blk layer specific lock and 
then allocate memory using GFP_KERNEL assuming we haven't freezed queue or we're not in 
IO submission path. But this commit ffa1e7ada456 ("block: Make request_queue lockdep
splats show up earlier") now showing up some false-positive splat as well, please see
below:

======================================================
WARNING: possible circular locking dependency detected
6.14.0+ #147 Not tainted
------------------------------------------------------
bash/5903 is trying to acquire lock:
c0000000ba0c6ad8 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0x11c/0x5d4

but task is already holding lock:
c0000000ba0c65b8 (&q->q_usage_counter(io)#20){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x28/0x40

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&q->q_usage_counter(io)#20){++++}-{0:0}:
       blk_alloc_queue+0x3a8/0x3e4
       blk_mq_alloc_queue+0x88/0x11c
       __blk_mq_alloc_disk+0x34/0xd8
       null_add_dev+0x3c8/0x914 [null_blk]
       null_init+0x1e0/0x4bc [null_blk]
       do_one_initcall+0x8c/0x4b8
       do_init_module+0x7c/0x2c4
       init_module_from_file+0xb4/0x108
       idempotent_init_module+0x26c/0x368
       sys_finit_module+0x98/0x150
       system_call_exception+0x134/0x360
       system_call_vectored_common+0x15c/0x2ec

-> #1 (fs_reclaim){+.+.}-{0:0}:
       fs_reclaim_acquire+0xe4/0x120
       kmem_cache_alloc_noprof+0x74/0x570
       __kernfs_new_node+0x98/0x378
       kernfs_new_node+0x80/0xc4
       kernfs_create_dir_ns+0x44/0xec
       sysfs_create_dir_ns+0x94/0x160
       kobject_add_internal+0xf4/0x3c8
       kobject_add+0x70/0x10c
       elv_register_queue+0x70/0x14c
       blk_register_queue+0x1d8/0x2bc
       add_disk_fwnode+0x3b4/0x5d0
       sd_probe+0x3bc/0x5b4 [sd_mod]
       really_probe+0x104/0x4c4
       __driver_probe_device+0xb8/0x200
       driver_probe_device+0x54/0x128
       __driver_attach_async_helper+0x7c/0x150
       async_run_entry_fn+0x60/0x1bc
       process_one_work+0x2ac/0x7e4
       worker_thread+0x238/0x460
       kthread+0x158/0x188
       start_kernel_thread+0x14/0x18

-> #0 (&q->elevator_lock){+.+.}-{4:4}:
       __lock_acquire+0x1b6c/0x2ae0
       lock_acquire+0x140/0x430
       __mutex_lock+0xf0/0xb00
       elv_iosched_store+0x11c/0x5d4
       queue_attr_store+0x12c/0x164
       sysfs_kf_write+0x6c/0xb0
       kernfs_fop_write_iter+0x1ac/0x2a8
       vfs_write+0x410/0x584
       ksys_write+0x84/0x140
       system_call_exception+0x134/0x360
       system_call_vectored_common+0x15c/0x2ec

other info that might help us debug this:

Chain exists of:
  &q->elevator_lock --> fs_reclaim --> &q->q_usage_counter(io)#20

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&q->q_usage_counter(io)#20);
                               lock(fs_reclaim);
                               lock(&q->q_usage_counter(io)#20);
  lock(&q->elevator_lock);

 *** DEADLOCK ***

5 locks held by bash/5903:
 #0: c00000005cb7f400 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x84/0x140
 #1: c000000008711288 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x168/0x2a8
 #2: c00000000a1e2c08 (kn->active#57){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x174/0x2a8
 #3: c0000000ba0c65b8 (&q->q_usage_counter(io)#20){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x28/0x40
 #4: c0000000ba0c65f0 (&q->q_usage_counter(queue)#21){+.+.}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0x28/0x40

stack backtrace:
CPU: 17 UID: 0 PID: 5903 Comm: bash Kdump: loaded Not tainted 6.14.0+ #147 VOLUNTARY 
Hardware name: IBM,9043-MRX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_028) hv:phyp pSeries
Call Trace:
[c0000000955df580] [c0000000011a7ef8] dump_stack_lvl+0x108/0x18c (unreliable)
[c0000000955df5b0] [c000000000225b0c] print_circular_bug+0x448/0x604
[c0000000955df660] [c000000000225f14] check_noncircular+0x24c/0x26c
[c0000000955df730] [c00000000022c3e8] __lock_acquire+0x1b6c/0x2ae0
[c0000000955df860] [c000000000229700] lock_acquire+0x140/0x430
[c0000000955df960] [c0000000011e84e8] __mutex_lock+0xf0/0xb00
[c0000000955dfa90] [c0000000008fb6f8] elv_iosched_store+0x11c/0x5d4
[c0000000955dfb50] [c000000000903ec0] queue_attr_store+0x12c/0x164
[c0000000955dfc60] [c0000000007ca58c] sysfs_kf_write+0x6c/0xb0
[c0000000955dfca0] [c0000000007c8df0] kernfs_fop_write_iter+0x1ac/0x2a8
[c0000000955dfcf0] [c0000000006a8c9c] vfs_write+0x410/0x584
[c0000000955dfdc0] [c0000000006a9148] ksys_write+0x84/0x140
[c0000000955dfe10] [c000000000031814] system_call_exception+0x134/0x360
[c0000000955dfe50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
   
What do you think?

Thanks,
--Nilay

next prev parent reply	other threads:[~2025-04-01 11:54 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-28 14:37 [syzbot] [block?] possible deadlock in elv_iosched_store syzbot
2025-03-29  1:59 ` Ming Lei
2025-04-01 11:53   ` Nilay Shroff [this message]
2025-04-01 12:16     ` Ming Lei
2025-04-01 12:29       ` Nilay Shroff
2025-03-29  2:01 ` Ming Lei
2025-03-29  2:02   ` syzbot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=462d4e8a-dd95-48fe-b9fe-a558057f9595@linux.ibm.com \
    --to=nilay@linux.ibm.com \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=syzbot+4c7e0f9b94ad65811efb@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.