From: Nilay Shroff <nilay@linux.ibm.com>
To: Yi Zhang <yi.zhang@redhat.com>
Cc: linux-block@vger.kernel.org, hch@lst.de, ming.lei@redhat.com,
hare@suse.de, axboe@kernel.dk, sth@linux.ibm.com, gjoyce@ibm.com,
Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context
Date: Wed, 2 Jul 2025 19:47:11 +0530 [thread overview]
Message-ID: <1b1b8767-2e08-496a-89db-385edf592d23@linux.ibm.com> (raw)
In-Reply-To: <CAHj4cs9ABpwaoywocFAHP+3k=oWsqBKbM5GFbCedgjEyxzpChA@mail.gmail.com>
On 7/2/25 7:23 PM, Yi Zhang wrote:
> Hi Nilay
>
> With the patch on the latest linux-block/for-next, I reproduced the
> following WARNING with blktests block/005, here is the full log:
>
> [ 342.845331] run blktests block/005 at 2025-07-02 09:48:55
>
> [ 343.835605] ======================================================
> [ 343.841783] WARNING: possible circular locking dependency detected
> [ 343.847966] 6.16.0-rc4.fix+ #3 Not tainted
> [ 343.852073] ------------------------------------------------------
> [ 343.858250] check/1365 is trying to acquire lock:
> [ 343.862957] ffffffff98141db0 (pcpu_alloc_mutex){+.+.}-{4:4}, at:
> pcpu_alloc_noprof+0x8eb/0xd70
> [ 343.871587]
> but task is already holding lock:
> [ 343.877421] ffff888300cfb040 (&q->elevator_lock){+.+.}-{4:4}, at:
> elevator_change+0x152/0x530
> [ 343.885958]
> which lock already depends on the new lock.
>
> [ 343.894131]
> the existing dependency chain (in reverse order) is:
> [ 343.901609]
> -> #3 (&q->elevator_lock){+.+.}-{4:4}:
> [ 343.907891] __lock_acquire+0x6f1/0xc00
> [ 343.912259] lock_acquire.part.0+0xb6/0x240
> [ 343.916966] __mutex_lock+0x17b/0x1690
> [ 343.921247] elevator_change+0x152/0x530
> [ 343.925692] elv_iosched_store+0x205/0x2f0
> [ 343.930312] queue_attr_store+0x23b/0x300
> [ 343.934853] kernfs_fop_write_iter+0x357/0x530
> [ 343.939829] vfs_write+0x9bc/0xf60
> [ 343.943763] ksys_write+0xf3/0x1d0
> [ 343.947695] do_syscall_64+0x8c/0x3d0
> [ 343.951883] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 343.957462]
> -> #2 (&q->q_usage_counter(io)#4){++++}-{0:0}:
> [ 343.964440] __lock_acquire+0x6f1/0xc00
> [ 343.968799] lock_acquire.part.0+0xb6/0x240
> [ 343.973507] blk_alloc_queue+0x5c5/0x710
> [ 343.977959] blk_mq_alloc_queue+0x14e/0x240
> [ 343.982666] __blk_mq_alloc_disk+0x15/0xd0
> [ 343.987294] nvme_alloc_ns+0x208/0x1690 [nvme_core]
> [ 343.992727] nvme_scan_ns+0x362/0x4c0 [nvme_core]
> [ 343.997978] async_run_entry_fn+0x96/0x4f0
> [ 344.002599] process_one_work+0x8cd/0x1950
> [ 344.007226] worker_thread+0x58d/0xcf0
> [ 344.011499] kthread+0x3d8/0x7a0
> [ 344.015259] ret_from_fork+0x406/0x510
> [ 344.019532] ret_from_fork_asm+0x1a/0x30
> [ 344.023980]
> -> #1 (fs_reclaim){+.+.}-{0:0}:
> [ 344.029654] __lock_acquire+0x6f1/0xc00
> [ 344.034015] lock_acquire.part.0+0xb6/0x240
> [ 344.038727] fs_reclaim_acquire+0x103/0x150
> [ 344.043433] prepare_alloc_pages+0x15f/0x600
> [ 344.048230] __alloc_frozen_pages_noprof+0x14a/0x3a0
> [ 344.053722] __alloc_pages_noprof+0xd/0x1d0
> [ 344.058438] pcpu_alloc_pages.constprop.0+0x104/0x420
> [ 344.064017] pcpu_populate_chunk+0x38/0x80
> [ 344.068644] pcpu_alloc_noprof+0x650/0xd70
> [ 344.073265] iommu_dma_init_fq+0x183/0x730
> [ 344.077893] iommu_dma_init_domain+0x566/0x990
> [ 344.082866] iommu_setup_dma_ops+0xca/0x230
> [ 344.087571] bus_iommu_probe+0x1f8/0x4a0
> [ 344.092020] iommu_device_register+0x153/0x240
> [ 344.096993] iommu_init_pci+0x53c/0x1040
> [ 344.101447] amd_iommu_init_pci+0xb6/0x5c0
> [ 344.106066] state_next+0xaf7/0xff0
> [ 344.110080] iommu_go_to_state+0x21/0x80
> [ 344.114535] amd_iommu_init+0x15/0x70
> [ 344.118728] pci_iommu_init+0x29/0x70
> [ 344.122914] do_one_initcall+0x100/0x5a0
> [ 344.127361] do_initcalls+0x138/0x1d0
> [ 344.131556] kernel_init_freeable+0x8b7/0xbd0
> [ 344.136442] kernel_init+0x1b/0x1f0
> [ 344.140456] ret_from_fork+0x406/0x510
> [ 344.144735] ret_from_fork_asm+0x1a/0x30
> [ 344.149182]
> -> #0 (pcpu_alloc_mutex){+.+.}-{4:4}:
> [ 344.155379] check_prev_add+0xf1/0xce0
> [ 344.159653] validate_chain+0x470/0x580
> [ 344.164019] __lock_acquire+0x6f1/0xc00
> [ 344.168378] lock_acquire.part.0+0xb6/0x240
> [ 344.173085] __mutex_lock+0x17b/0x1690
> [ 344.177365] pcpu_alloc_noprof+0x8eb/0xd70
> [ 344.181984] kyber_queue_data_alloc+0x16d/0x660
> [ 344.187047] kyber_init_sched+0x14/0x90
> [ 344.191413] blk_mq_init_sched+0x264/0x4e0
> [ 344.196033] elevator_switch+0x186/0x6a0
> [ 344.200478] elevator_change+0x305/0x530
> [ 344.204924] elv_iosched_store+0x205/0x2f0
> [ 344.209545] queue_attr_store+0x23b/0x300
> [ 344.214084] kernfs_fop_write_iter+0x357/0x530
> [ 344.219051] vfs_write+0x9bc/0xf60
> [ 344.222976] ksys_write+0xf3/0x1d0
> [ 344.226902] do_syscall_64+0x8c/0x3d0
> [ 344.231088] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 344.236660]
Thanks for the report!
I see that the warning above is different from the one addressed by the
current patchset. In the warning you've reported, the kyber elevator
allocates per-CPU data (pcpu_alloc_noprof via kyber_queue_data_alloc) after
acquiring ->elevator_lock, which introduces a dependency from
pcpu_alloc_mutex on ->elevator_lock.
In contrast, the current patchset addresses a separate issue [1] that arises
from elevator tag allocation. That allocation occurs while both ->freeze_lock
and ->elevator_lock are held; internally, elevator tag allocation sets up the
per-CPU sbitmap->alloc_hint, which introduces a similar per-CPU allocation
dependency on ->elevator_lock.
That said, I plan to address the issue you've reported in a separate
patch once the current patchset is merged.
Thanks,
--Nilay
[1]https://lore.kernel.org/all/0659ea8d-a463-47c8-9180-43c719e106eb@linux.ibm.com/
2025-07-01 8:18 [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context Nilay Shroff
2025-07-01 8:18 ` [PATCHv7 1/3] block: move elevator queue allocation logic into blk_mq_init_sched Nilay Shroff
2025-07-01 8:18 ` [PATCHv7 2/3] block: fix lockdep warning caused by lock dependency in elv_iosched_store Nilay Shroff
2025-07-01 10:52 ` Hannes Reinecke
2025-07-01 8:19 ` [PATCHv7 3/3] block: fix potential deadlock while running nr_hw_queue update Nilay Shroff
2025-07-01 11:00 ` Hannes Reinecke
2025-07-02 13:53 ` [PATCHv7 0/3] block: move sched_tags allocation/de-allocation outside of locking context Yi Zhang
2025-07-02 14:17 ` Nilay Shroff [this message]
2025-07-02 14:41 ` Yi Zhang