public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* blktests failures with v6.19 kernel
@ 2026-02-13  7:57 Shinichiro Kawasaki
  2026-02-13  9:56 ` Daniel Wagner
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Shinichiro Kawasaki @ 2026-02-13  7:57 UTC (permalink / raw)
  To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, nbd@other.debian.org,
	linux-rdma@vger.kernel.org

Hi all,

I ran the latest blktests (git hash: b5b699341102) with the v6.19 kernel and
observed the 6 failures listed below. Compared with the previous report for
the v6.19-rc1 kernel [1], two failures were resolved (nvme/033 and srp) and
three failures are newly observed (nvme/061, zbd/009 and zbd/013). Kmemleak
support was recently introduced to blktests; two of the three new failures
were detected by kmemleak. Your actions to fix the failures will be
appreciated, as always.

[1] https://lore.kernel.org/linux-block/a078671f-10b3-47e7-acbb-4251c8744523@wdc.com/


List of failures
================
#1: nvme/005,063 (tcp transport)
#2: nvme/058 (fc transport)
#3: nvme/061 (rdma transport, siw driver)(new)(kmemleak)
#4: nbd/002
#5: zbd/009 (new)(kmemleak)
#6: zbd/013 (new)


Failure description
===================

#1: nvme/005,063 (tcp transport)

    The test cases nvme/005 and nvme/063 fail for the tcp transport due to a
    lockdep WARN related to the three locks q->q_usage_counter,
    q->elevator_lock and set->srcu. Refer to the nvme/063 failure report for
    the v6.16-rc1 kernel [2].

    [2] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/

#2: nvme/058 (fc transport)

    When the test case nvme/058 is repeated about 50 times for the fc
    transport, it fails. I observed a couple of different symptoms by chance:
    one is a lockdep WARN related to nvmet-wq [3], and the other is a WARN at
    __add_disk [4]. This test case had failed with yet another symptom on the
    v6.19-rc1 kernel (a WARN at blk_mq_unquiesce_queue) [1], but that symptom
    is no longer observed.

#3: nvme/061 (rdma transport, siw driver)

    When the test case nvme/061 is repeated twice for the rdma transport with
    the siw driver on the v6.19 kernel with CONFIG_DEBUG_KMEMLEAK enabled, it
    fails with a kmemleak message [5]. The failure is not observed with the
    rxe driver.

#4: nbd/002

    The test case nbd/002 fails due to a lockdep WARN related to
    mm->mmap_lock, sk_lock-AF_INET6 and fs_reclaim. Refer to the nbd/002
    failure report for the v6.18-rc1 kernel [6].

    [6] https://lore.kernel.org/linux-block/ynmi72x5wt5ooljjafebhcarit3pvu6axkslqenikb2p5txe57@ldytqa2t4i2x/

#5: zbd/009

    When the test case zbd/009 is repeated twice on the v6.19 kernel with
    CONFIG_DEBUG_KMEMLEAK enabled, it fails with a kmemleak message [7].

#6: zbd/013

    The test case zbd/013 fails with a KASAN report [8]. The cause is in the
    task scheduler, and the fix patch has already been applied to the Linus
    master branch [9].

    [8] https://lore.kernel.org/lkml/aYrewLd7QNiPUJT1@shinmob/
    [9] https://lore.kernel.org/lkml/87tsvoa7to.ffs@tglx/


[3] lockdep WARN during nvme/058 with fc transport

[  409.028219] [     T95] ============================================
[  409.029133] [     T95] WARNING: possible recursive locking detected
[  409.030058] [     T95] 6.19.0+ #575 Not tainted
[  409.030892] [     T95] --------------------------------------------
[  409.031801] [     T95] kworker/u16:9/95 is trying to acquire lock:
[  409.032691] [     T95] ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x7e/0x1a0
[  409.033869] [     T95] 
                          but task is already holding lock:
[  409.035254] [     T95] ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0xeef/0x1480
[  409.036383] [     T95] 
                          other info that might help us debug this:
[  409.037769] [     T95]  Possible unsafe locking scenario:

[  409.039113] [     T95]        CPU0
[  409.039781] [     T95]        ----
[  409.040406] [     T95]   lock((wq_completion)nvmet-wq);
[  409.041154] [     T95]   lock((wq_completion)nvmet-wq);
[  409.041898] [     T95] 
                           *** DEADLOCK ***

[  409.043609] [     T95]  May be due to missing lock nesting notation

[  409.044960] [     T95] 3 locks held by kworker/u16:9/95:
[  409.045721] [     T95]  #0: ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0xeef/0x1480
[  409.046845] [     T95]  #1: ffff888109797ca8 ((work_completion)(&assoc->del_work)){+.+.}-{0:0}, at: process_one_work+0x7e4/0x1480
[  409.048063] [     T95]  #2: ffffffffac481480 (rcu_read_lock){....}-{1:3}, at: __flush_work+0xe3/0xc70
[  409.049128] [     T95] 
                          stack backtrace:
[  409.050366] [     T95] CPU: 1 UID: 0 PID: 95 Comm: kworker/u16:9 Not tainted 6.19.0+ #575 PREEMPT(voluntary) 
[  409.050370] [     T95] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
[  409.050373] [     T95] Workqueue: nvmet-wq nvmet_fc_delete_assoc_work [nvmet_fc]
[  409.050384] [     T95] Call Trace:
[  409.050386] [     T95]  <TASK>
[  409.050388] [     T95]  dump_stack_lvl+0x6a/0x90
[  409.050393] [     T95]  print_deadlock_bug.cold+0xc0/0xcd
[  409.050401] [     T95]  __lock_acquire+0x1312/0x2230
[  409.050408] [     T95]  ? lockdep_unlock+0x5e/0xc0
[  409.050410] [     T95]  ? __lock_acquire+0xd03/0x2230
[  409.050413] [     T95]  lock_acquire+0x170/0x300
[  409.050415] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
[  409.050418] [     T95]  ? __flush_work+0x4e8/0xc70
[  409.050420] [     T95]  ? find_held_lock+0x2b/0x80
[  409.050423] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
[  409.050425] [     T95]  touch_wq_lockdep_map+0x97/0x1a0
[  409.050428] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
[  409.050430] [     T95]  ? __flush_work+0x4e8/0xc70
[  409.050432] [     T95]  __flush_work+0x5c1/0xc70
[  409.050434] [     T95]  ? __pfx___flush_work+0x10/0x10
[  409.050436] [     T95]  ? __pfx___flush_work+0x10/0x10
[  409.050439] [     T95]  ? __pfx_wq_barrier_func+0x10/0x10
[  409.050444] [     T95]  ? __pfx___might_resched+0x10/0x10
[  409.050454] [     T95]  nvmet_ctrl_free+0x2e9/0x810 [nvmet]
[  409.050474] [     T95]  ? __pfx___cancel_work+0x10/0x10
[  409.050479] [     T95]  ? __pfx_nvmet_ctrl_free+0x10/0x10 [nvmet]
[  409.050498] [     T95]  nvmet_cq_put+0x136/0x180 [nvmet]
[  409.050515] [     T95]  nvmet_fc_target_assoc_free+0x398/0x2040 [nvmet_fc]
[  409.050522] [     T95]  ? __pfx_nvmet_fc_target_assoc_free+0x10/0x10 [nvmet_fc]
[  409.050528] [     T95]  nvmet_fc_delete_assoc_work.cold+0x82/0xff [nvmet_fc]
[  409.050533] [     T95]  process_one_work+0x868/0x1480
[  409.050538] [     T95]  ? __pfx_process_one_work+0x10/0x10
[  409.050541] [     T95]  ? lock_acquire+0x170/0x300
[  409.050545] [     T95]  ? assign_work+0x156/0x390
[  409.050548] [     T95]  worker_thread+0x5ee/0xfd0
[  409.050553] [     T95]  ? __pfx_worker_thread+0x10/0x10
[  409.050555] [     T95]  kthread+0x3af/0x770
[  409.050560] [     T95]  ? lock_acquire+0x180/0x300
[  409.050563] [     T95]  ? __pfx_kthread+0x10/0x10
[  409.050565] [     T95]  ? __pfx_kthread+0x10/0x10
[  409.050568] [     T95]  ? ret_from_fork+0x6e/0x810
[  409.050576] [     T95]  ? lock_release+0x1ab/0x2f0
[  409.050578] [     T95]  ? rcu_is_watching+0x11/0xb0
[  409.050582] [     T95]  ? __pfx_kthread+0x10/0x10
[  409.050585] [     T95]  ret_from_fork+0x55c/0x810
[  409.050588] [     T95]  ? __pfx_ret_from_fork+0x10/0x10
[  409.050590] [     T95]  ? __switch_to+0x10a/0xda0
[  409.050598] [     T95]  ? __switch_to_asm+0x33/0x70
[  409.050602] [     T95]  ? __pfx_kthread+0x10/0x10
[  409.050605] [     T95]  ret_from_fork_asm+0x1a/0x30
[  409.050610] [     T95]  </TASK>


[4] WARN during nvme/058 fc transport

[ 1410.913156] [   T1369] WARNING: block/genhd.c:463 at __add_disk+0x87b/0xd50, CPU#0: kworker/u16:12/1369
[ 1410.913411] [   T1146] nvme8c15n2: I/O Cmd(0x2) @ LBA 2096240, 8 blocks, I/O Error (sct 0x3 / sc 0x0) MORE 
[ 1410.913866] [   T1369] Modules linked in:
[ 1410.914386] [   T1146] I/O error, dev nvme8c15n2, sector 2096240 op 0x0:(READ) flags 0x2880700 phys_seg 1 prio class 2
[ 1410.914954] [   T1369]  nvme_fcloop nvmet_fc nvmet nvme_fc nvme_fabrics nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc 9pnet_virtio 9pnet pcspkr netfs i2c_piix4 i2c_smbus fuse loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock zram bochs xfs drm_client_lib drm_shmem_helper nvme drm_kms_helper nvme_core sym53c8xx drm floppy e1000 nvme_keyring nvme_auth hkdf scsi_transport_spi serio_raw ata_generic pata_acpi i2c_dev qemu_fw_cfg [last unloaded: nvmet]
[ 1410.918223] [   T1418] nvme nvme15: NVME-FC{10}: io failed due to bad NVMe_ERSP: iu len 8, xfr len 1024 vs 0, status code 0, cmdid 36976 vs 36976
[ 1410.918677] [   T1397] nvme nvme15: rescanning namespaces.
[ 1410.921001] [   T1369] CPU: 0 UID: 0 PID: 1369 Comm: kworker/u16:12 Not tainted 6.19.0 #12 PREEMPT(voluntary) 
[ 1410.921914] [   T1383] nvme nvme15: NVME-FC{10}: transport association event: transport detected io error
[ 1410.922247] [   T1369] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-8.fc42 06/10/2025
[ 1410.922847] [   T1383] nvme nvme15: NVME-FC{10}: resetting controller
[ 1410.923450] [   T1369] Workqueue: nvme-wq nvme_fc_connect_ctrl_work [nvme_fc]
[ 1410.931697] [   T1369] RIP: 0010:__add_disk+0x87b/0xd50
[ 1410.932558] [   T1369] Code: 89 54 24 18 e8 d6 fa 75 ff 44 8b 54 24 18 e9 ca f9 ff ff 4c 89 f7 44 89 54 24 18 e8 bf fa 75 ff 44 8b 54 24 18 e9 7e f9 ff ff <0f> 0b e9 6f fe ff ff 0f 0b e9 68 fe ff ff 48 b8 00 00 00 00 00 fc
[ 1410.935252] [   T1369] RSP: 0018:ffff888124977770 EFLAGS: 00010246
[ 1410.936421] [   T1369] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff110249c1c01
[ 1410.937637] [   T1369] RDX: 0000000000000103 RSI: 0000000000000004 RDI: ffff8881537ca2b0
[ 1410.938806] [   T1369] RBP: ffff888124e0e008 R08: ffffffff8d81870b R09: ffffed102a6f9456
[ 1410.939826] [   T1369] R10: ffffed102a6f9457 R11: ffff8881151b8fb0 R12: ffff8881537ca280
[ 1410.940895] [   T1369] R13: 0000000000000000 R14: ffff888124e0e000 R15: ffff888112227008
[ 1410.942132] [   T1369] FS:  0000000000000000(0000) GS:ffff88841c312000(0000) knlGS:0000000000000000
[ 1410.943347] [   T1369] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1410.944482] [   T1369] CR2: 000055f53bf44f08 CR3: 0000000111996000 CR4: 00000000000006f0
[ 1410.945619] [   T1369] Call Trace:
[ 1410.946429] [   T1369]  <TASK>
[ 1410.947350] [   T1369]  add_disk_fwnode+0x36e/0x590
[ 1410.948386] [   T1369]  ? lock_acquire+0x17a/0x2f0
[ 1410.949325] [   T1369]  nvme_mpath_set_live+0xe0/0x4c0 [nvme_core]
[ 1410.950456] [   T1369]  nvme_update_ana_state+0x342/0x4e0 [nvme_core]
[ 1410.951455] [   T1369]  nvme_parse_ana_log+0x1f5/0x4c0 [nvme_core]
[ 1410.952465] [   T1369]  ? __pfx_nvme_update_ana_state+0x10/0x10 [nvme_core]
[ 1410.953534] [   T1369]  nvme_mpath_update+0xa1/0xf0 [nvme_core]
[ 1410.954617] [   T1369]  ? __pfx_nvme_mpath_update+0x10/0x10 [nvme_core]
[ 1410.955651] [   T1369]  ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 1410.956615] [   T1369]  ? lockdep_hardirqs_on+0x88/0x130
[ 1410.957508] [   T1369]  ? queue_work_on+0x8c/0xc0
[ 1410.958321] [   T1369]  nvme_start_ctrl+0x3ee/0x620 [nvme_core]
[ 1410.959361] [   T1369]  ? __pfx_nvme_start_ctrl+0x10/0x10 [nvme_core]
[ 1410.960318] [   T1369]  ? mark_held_locks+0x40/0x70
[ 1410.961249] [   T1369]  ? nvme_kick_requeue_lists+0x1b5/0x260 [nvme_core]
[ 1410.962258] [   T1369]  ? lock_release+0x1ab/0x2f0
[ 1410.963110] [   T1369]  nvme_fc_connect_ctrl_work.cold+0x758/0x7f3 [nvme_fc]
[ 1410.964131] [   T1369]  ? __pfx_nvme_fc_connect_ctrl_work+0x10/0x10 [nvme_fc]
[ 1410.965179] [   T1369]  ? lock_acquire+0x17a/0x2f0
[ 1410.966026] [   T1369]  ? process_one_work+0x722/0x1490
[ 1410.967510] [   T1369]  ? lock_release+0x1ab/0x2f0
[ 1410.968327] [   T1369]  process_one_work+0x868/0x1490
[ 1410.969235] [   T1369]  ? __pfx_process_one_work+0x10/0x10
[ 1410.970196] [   T1369]  ? lock_acquire+0x16a/0x2f0
[ 1410.971089] [   T1369]  ? assign_work+0x156/0x390
[ 1410.971925] [   T1369]  worker_thread+0x5ee/0xfd0
[ 1410.972725] [   T1369]  ? __pfx_worker_thread+0x10/0x10
[ 1410.973531] [   T1369]  ? __kthread_parkme+0xb3/0x200
[ 1410.974255] [   T1369]  ? __pfx_worker_thread+0x10/0x10
[ 1410.975093] [   T1369]  kthread+0x3af/0x770
[ 1410.975764] [   T1369]  ? lock_acquire+0x17a/0x2f0
[ 1410.976454] [   T1369]  ? __pfx_kthread+0x10/0x10
[ 1410.977209] [   T1369]  ? __pfx_kthread+0x10/0x10
[ 1410.977894] [   T1369]  ? ret_from_fork+0x6e/0x810
[ 1410.978624] [   T1369]  ? lock_release+0x1ab/0x2f0
[ 1410.979282] [   T1369]  ? rcu_is_watching+0x11/0xb0
[ 1410.979992] [   T1369]  ? __pfx_kthread+0x10/0x10
[ 1410.980650] [   T1369]  ret_from_fork+0x55c/0x810
[ 1410.981268] [   T1369]  ? __pfx_ret_from_fork+0x10/0x10
[ 1410.981950] [   T1369]  ? __switch_to+0x10a/0xda0
[ 1410.982640] [   T1369]  ? __switch_to_asm+0x33/0x70
[ 1410.983309] [   T1369]  ? __pfx_kthread+0x10/0x10
[ 1410.984006] [   T1369]  ret_from_fork_asm+0x1a/0x30
[ 1410.984664] [   T1369]  </TASK>
[ 1410.985198] [   T1369] irq event stamp: 2549333
[ 1410.985857] [   T1369] hardirqs last  enabled at (2549345): [<ffffffff8c6dd01e>] __up_console_sem+0x5e/0x70
[ 1410.986852] [   T1369] hardirqs last disabled at (2549356): [<ffffffff8c6dd003>] __up_console_sem+0x43/0x70
[ 1410.987853] [   T1369] softirqs last  enabled at (2549100): [<ffffffff8c51b246>] __irq_exit_rcu+0x126/0x240
[ 1410.988941] [   T1369] softirqs last disabled at (2549095): [<ffffffff8c51b246>] __irq_exit_rcu+0x126/0x240
[ 1410.989908] [   T1369] ---[ end trace 0000000000000000 ]---


[5] kmemleak at nvme/061 with rdma transport and siw driver

unreferenced object 0xffff888114792600 (size 32):
  comm "kworker/2:1H", pid 66, jiffies 4295489358
  hex dump (first 32 bytes):
    c2 f6 83 05 00 ea ff ff 00 00 00 00 00 10 00 00  ................
    00 b0 fd 60 81 88 ff ff 00 10 00 00 00 00 00 00  ...`............
  backtrace (crc 3dbac61d):
    __kmalloc_noprof+0x62f/0x8b0
    sgl_alloc_order+0x74/0x330
    0xffffffffc1b73433
    0xffffffffc1bc1f0d
    0xffffffffc1bc8064
    __ib_process_cq+0x14f/0x3e0 [ib_core]
    ib_cq_poll_work+0x49/0x160 [ib_core]
    process_one_work+0x868/0x1480
    worker_thread+0x5ee/0xfd0
    kthread+0x3af/0x770
    ret_from_fork+0x55c/0x810
    ret_from_fork_asm+0x1a/0x30
unreferenced object 0xffff888114792740 (size 32):
  comm "kworker/2:1H", pid 66, jiffies 4295489358
  hex dump (first 32 bytes):
    82 e8 50 04 00 ea ff ff 00 00 00 00 00 10 00 00  ..P.............
    00 20 3a 14 81 88 ff ff 00 10 00 00 00 00 00 00  . :.............
  backtrace (crc 5e69d517):
    __kmalloc_noprof+0x62f/0x8b0
    sgl_alloc_order+0x74/0x330
    0xffffffffc1b73433
    0xffffffffc1bc1f0d
    0xffffffffc1bc8064
    __ib_process_cq+0x14f/0x3e0 [ib_core]
    ib_cq_poll_work+0x49/0x160 [ib_core]
    process_one_work+0x868/0x1480
    worker_thread+0x5ee/0xfd0
    kthread+0x3af/0x770
    ret_from_fork+0x55c/0x810
    ret_from_fork_asm+0x1a/0x30
unreferenced object 0xffff88815e4a1d80 (size 32):
  comm "kworker/2:1H", pid 66, jiffies 4295490860
  hex dump (first 32 bytes):
    c2 5d 26 04 00 ea ff ff 00 00 00 00 00 10 00 00  .]&.............
    00 70 97 09 81 88 ff ff 00 10 00 00 00 00 00 00  .p..............
  backtrace (crc 6d5ef85b):
    __kmalloc_noprof+0x62f/0x8b0
    sgl_alloc_order+0x74/0x330
    0xffffffffc1b73433
    0xffffffffc1bc1f0d
    0xffffffffc1bc8064
    __ib_process_cq+0x14f/0x3e0 [ib_core]
    ib_cq_poll_work+0x49/0x160 [ib_core]
    process_one_work+0x868/0x1480
    worker_thread+0x5ee/0xfd0
    kthread+0x3af/0x770
    ret_from_fork+0x55c/0x810
    ret_from_fork_asm+0x1a/0x30
unreferenced object 0xffff88815e4a1780 (size 32):
  comm "kworker/2:1H", pid 66, jiffies 4295490860
  hex dump (first 32 bytes):
    02 ca cf 04 00 ea ff ff 00 00 00 00 00 10 00 00  ................
    00 80 f2 33 81 88 ff ff 00 10 00 00 00 00 00 00  ...3............
  backtrace (crc 9b068d98):
    __kmalloc_noprof+0x62f/0x8b0
    sgl_alloc_order+0x74/0x330
    0xffffffffc1b73433
    0xffffffffc1bc1f0d
    0xffffffffc1bc8064
    __ib_process_cq+0x14f/0x3e0 [ib_core]
    ib_cq_poll_work+0x49/0x160 [ib_core]
    process_one_work+0x868/0x1480
    worker_thread+0x5ee/0xfd0
    kthread+0x3af/0x770
    ret_from_fork+0x55c/0x810
    ret_from_fork_asm+0x1a/0x30
unreferenced object 0xffff88815e4a1f80 (size 32):
  comm "kworker/2:1H", pid 66, jiffies 4295490861
  hex dump (first 32 bytes):
    82 2b 84 05 00 ea ff ff 00 00 00 00 00 10 00 00  .+..............
    00 e0 0a 61 81 88 ff ff 00 10 00 00 00 00 00 00  ...a............
  backtrace (crc 1f32d61a):
    __kmalloc_noprof+0x62f/0x8b0
    sgl_alloc_order+0x74/0x330
    0xffffffffc1b73433
    0xffffffffc1bc1f0d
    0xffffffffc1bc8064
    __ib_process_cq+0x14f/0x3e0 [ib_core]
    ib_cq_poll_work+0x49/0x160 [ib_core]
    process_one_work+0x868/0x1480
    worker_thread+0x5ee/0xfd0
    kthread+0x3af/0x770
    ret_from_fork+0x55c/0x810
    ret_from_fork_asm+0x1a/0x30


[7] kmemleak at zbd/009

unreferenced object 0xffff88815f1f1280 (size 32):
  comm "mount", pid 1745, jiffies 4294866235
  hex dump (first 32 bytes):
    6d 65 74 61 64 61 74 61 2d 74 72 65 65 6c 6f 67  metadata-treelog
    00 93 9c fb af bb ae 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc 2ee03cc2):
    __kmalloc_node_track_caller_noprof+0x66b/0x8c0
    kstrdup+0x42/0xc0
    kobject_set_name_vargs+0x44/0x110
    kobject_init_and_add+0xcf/0x140
    btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
    create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
    create_space_info+0x247/0x320 [btrfs]
    btrfs_init_space_info+0x143/0x1b0 [btrfs]
    open_ctree+0x2eed/0x43fe [btrfs]
    btrfs_get_tree.cold+0x90/0x1da [btrfs]
    vfs_get_tree+0x87/0x2f0
    vfs_cmd_create+0xbd/0x280
    __do_sys_fsconfig+0x64f/0xa30
    do_syscall_64+0x95/0x540
    entry_SYSCALL_64_after_hwframe+0x76/0x7e
unreferenced object 0xffff888128d80000 (size 16):
  comm "mount", pid 1745, jiffies 4294866237
  hex dump (first 16 bytes):
    64 61 74 61 2d 72 65 6c 6f 63 00 4b 96 f6 48 82  data-reloc.K..H.
  backtrace (crc 1598f702):
    __kmalloc_node_track_caller_noprof+0x66b/0x8c0
    kstrdup+0x42/0xc0
    kobject_set_name_vargs+0x44/0x110
    kobject_init_and_add+0xcf/0x140
    btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
    create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
    create_space_info+0x211/0x320 [btrfs]
    open_ctree+0x2eed/0x43fe [btrfs]
    btrfs_get_tree.cold+0x90/0x1da [btrfs]
    vfs_get_tree+0x87/0x2f0
    vfs_cmd_create+0xbd/0x280
    __do_sys_fsconfig+0x64f/0xa30
    do_syscall_64+0x95/0x540
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13  7:57 blktests failures with v6.19 kernel Shinichiro Kawasaki
@ 2026-02-13  9:56 ` Daniel Wagner
  2026-02-14 21:19   ` Chaitanya Kulkarni
  2026-02-13 17:44 ` Bart Van Assche
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Daniel Wagner @ 2026-02-13  9:56 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, nbd@other.debian.org,
	linux-rdma@vger.kernel.org

On Fri, Feb 13, 2026 at 07:57:58AM +0000, Shinichiro Kawasaki wrote:
> [3] lockdep WARN during nvme/058 with fc transport
> 
> [  409.028219] [     T95] ============================================
> [  409.029133] [     T95] WARNING: possible recursive locking detected
> [  409.030058] [     T95] 6.19.0+ #575 Not tainted
> [  409.030892] [     T95] --------------------------------------------
> [  409.031801] [     T95] kworker/u16:9/95 is trying to acquire lock:
> [  409.032691] [     T95] ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x7e/0x1a0
> [  409.033869] [     T95] 
>                           but task is already holding lock:
> [  409.035254] [     T95] ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0xeef/0x1480
> [  409.036383] [     T95] 
>                           other info that might help us debug this:
> [  409.037769] [     T95]  Possible unsafe locking scenario:
> 
> [  409.039113] [     T95]        CPU0
> [  409.039781] [     T95]        ----
> [  409.040406] [     T95]   lock((wq_completion)nvmet-wq);
> [  409.041154] [     T95]   lock((wq_completion)nvmet-wq);
> [  409.041898] [     T95] 
>                            *** DEADLOCK ***
> 
> [  409.043609] [     T95]  May be due to missing lock nesting notation
> 
> [  409.044960] [     T95] 3 locks held by kworker/u16:9/95:
> [  409.045721] [     T95]  #0: ffff88813ba54948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0xeef/0x1480
> [  409.046845] [     T95]  #1: ffff888109797ca8 ((work_completion)(&assoc->del_work)){+.+.}-{0:0}, at: process_one_work+0x7e4/0x1480
> [  409.048063] [     T95]  #2: ffffffffac481480 (rcu_read_lock){....}-{1:3}, at: __flush_work+0xe3/0xc70
> [  409.049128] [     T95] 
>                           stack backtrace:
> [  409.050366] [     T95] CPU: 1 UID: 0 PID: 95 Comm: kworker/u16:9 Not tainted 6.19.0+ #575 PREEMPT(voluntary) 
> [  409.050370] [     T95] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
> [  409.050373] [     T95] Workqueue: nvmet-wq nvmet_fc_delete_assoc_work [nvmet_fc]
> [  409.050384] [     T95] Call Trace:
> [  409.050386] [     T95]  <TASK>
> [  409.050388] [     T95]  dump_stack_lvl+0x6a/0x90
> [  409.050393] [     T95]  print_deadlock_bug.cold+0xc0/0xcd
> [  409.050401] [     T95]  __lock_acquire+0x1312/0x2230
> [  409.050408] [     T95]  ? lockdep_unlock+0x5e/0xc0
> [  409.050410] [     T95]  ? __lock_acquire+0xd03/0x2230
> [  409.050413] [     T95]  lock_acquire+0x170/0x300
> [  409.050415] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
> [  409.050418] [     T95]  ? __flush_work+0x4e8/0xc70
> [  409.050420] [     T95]  ? find_held_lock+0x2b/0x80
> [  409.050423] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
> [  409.050425] [     T95]  touch_wq_lockdep_map+0x97/0x1a0
> [  409.050428] [     T95]  ? touch_wq_lockdep_map+0x7e/0x1a0
> [  409.050430] [     T95]  ? __flush_work+0x4e8/0xc70
> [  409.050432] [     T95]  __flush_work+0x5c1/0xc70
> [  409.050434] [     T95]  ? __pfx___flush_work+0x10/0x10
> [  409.050436] [     T95]  ? __pfx___flush_work+0x10/0x10
> [  409.050439] [     T95]  ? __pfx_wq_barrier_func+0x10/0x10
> [  409.050444] [     T95]  ? __pfx___might_resched+0x10/0x10
> [  409.050454] [     T95]  nvmet_ctrl_free+0x2e9/0x810 [nvmet]
> [  409.050474] [     T95]  ? __pfx___cancel_work+0x10/0x10
> [  409.050479] [     T95]  ? __pfx_nvmet_ctrl_free+0x10/0x10 [nvmet]
> [  409.050498] [     T95]  nvmet_cq_put+0x136/0x180 [nvmet]
> [  409.050515] [     T95]  nvmet_fc_target_assoc_free+0x398/0x2040 [nvmet_fc]
> [  409.050522] [     T95]  ? __pfx_nvmet_fc_target_assoc_free+0x10/0x10 [nvmet_fc]
> [  409.050528] [     T95]  nvmet_fc_delete_assoc_work.cold+0x82/0xff [nvmet_fc]
> [  409.050533] [     T95]  process_one_work+0x868/0x1480
> [  409.050538] [     T95]  ? __pfx_process_one_work+0x10/0x10
> [  409.050541] [     T95]  ? lock_acquire+0x170/0x300
> [  409.050545] [     T95]  ? assign_work+0x156/0x390
> [  409.050548] [     T95]  worker_thread+0x5ee/0xfd0
> [  409.050553] [     T95]  ? __pfx_worker_thread+0x10/0x10
> [  409.050555] [     T95]  kthread+0x3af/0x770
> [  409.050560] [     T95]  ? lock_acquire+0x180/0x300
> [  409.050563] [     T95]  ? __pfx_kthread+0x10/0x10
> [  409.050565] [     T95]  ? __pfx_kthread+0x10/0x10
> [  409.050568] [     T95]  ? ret_from_fork+0x6e/0x810
> [  409.050576] [     T95]  ? lock_release+0x1ab/0x2f0
> [  409.050578] [     T95]  ? rcu_is_watching+0x11/0xb0
> [  409.050582] [     T95]  ? __pfx_kthread+0x10/0x10
> [  409.050585] [     T95]  ret_from_fork+0x55c/0x810
> [  409.050588] [     T95]  ? __pfx_ret_from_fork+0x10/0x10
> [  409.050590] [     T95]  ? __switch_to+0x10a/0xda0
> [  409.050598] [     T95]  ? __switch_to_asm+0x33/0x70
> [  409.050602] [     T95]  ? __pfx_kthread+0x10/0x10
> [  409.050605] [     T95]  ret_from_fork_asm+0x1a/0x30
> [  409.050610] [     T95]  </TASK>

nvmet_fc_target_assoc_free runs in the nvmet_wq context and calls

  nvmet_fc_delete_target_queue
    nvmet_cq_put
      nvmet_cq_destroy
        nvmet_ctrl_put
         nvmet_ctrl_free
           flush_work(&ctrl->async_event_work);
           cancel_work_sync(&ctrl->fatal_err_work);
 
The async_event_work could itself be running on nvmet_wq, so this deadlock
is real. No idea how to fix it yet.

> [4] WARN during nvme/058 fc transport
> 
> [ 1410.913156] [   T1369] WARNING: block/genhd.c:463 at __add_disk+0x87b/0xd50, CPU#0: kworker/u16:12/1369

	/*
	 * If the driver provides an explicit major number it also must provide
	 * the number of minors numbers supported, and those will be used to
	 * setup the gendisk.
	 * Otherwise just allocate the device numbers for both the whole device
	 * and all partitions from the extended dev_t space.
	 */
	ret = -EINVAL;
	if (disk->major) {
		if (WARN_ON(!disk->minors))
			goto out;

If I read this correctly, the gendisk is allocated in nvme_mpath_alloc_disk
and then added due to the ANA change in nvme_update_ana_state. Not sure if
this is really fc related, but I haven't fully figured out how this part of
the code works yet.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13  7:57 blktests failures with v6.19 kernel Shinichiro Kawasaki
  2026-02-13  9:56 ` Daniel Wagner
@ 2026-02-13 17:44 ` Bart Van Assche
  2026-02-13 23:38   ` yanjun.zhu
  2026-02-16  9:38 ` Nilay Shroff
  2026-02-24  7:47 ` Johannes Thumshirn
  3 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2026-02-13 17:44 UTC (permalink / raw)
  To: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	nbd@other.debian.org, linux-rdma@vger.kernel.org

On 2/12/26 11:57 PM, Shinichiro Kawasaki wrote:
> [5] kmemleak at nvme/061 with rdma transport and siw driver
> 
> unreferenced object 0xffff888114792600 (size 32):
>    comm "kworker/2:1H", pid 66, jiffies 4295489358
>    hex dump (first 32 bytes):
>      c2 f6 83 05 00 ea ff ff 00 00 00 00 00 10 00 00  ................
>      00 b0 fd 60 81 88 ff ff 00 10 00 00 00 00 00 00  ...`............
>    backtrace (crc 3dbac61d):
>      __kmalloc_noprof+0x62f/0x8b0
>      sgl_alloc_order+0x74/0x330
>      0xffffffffc1b73433
>      0xffffffffc1bc1f0d
>      0xffffffffc1bc8064
>      __ib_process_cq+0x14f/0x3e0 [ib_core]
>      ib_cq_poll_work+0x49/0x160 [ib_core]
>      process_one_work+0x868/0x1480
>      worker_thread+0x5ee/0xfd0
>      kthread+0x3af/0x770
>      ret_from_fork+0x55c/0x810
>      ret_from_fork_asm+0x1a/0x30

There are no sgl_alloc() calls in the RDMA subsystem. I think the above
indicates a memory leak in either the RDMA NVMe target driver or in the
NVMe target core.

> [7] kmemleak at zbd/009
> 
> unreferenced object 0xffff88815f1f1280 (size 32):
>    comm "mount", pid 1745, jiffies 4294866235
>    hex dump (first 32 bytes):
>      6d 65 74 61 64 61 74 61 2d 74 72 65 65 6c 6f 67  metadata-treelog
>      00 93 9c fb af bb ae 00 00 00 00 00 00 00 00 00  ................
>    backtrace (crc 2ee03cc2):
>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>      kstrdup+0x42/0xc0
>      kobject_set_name_vargs+0x44/0x110
>      kobject_init_and_add+0xcf/0x140
>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>      create_space_info+0x247/0x320 [btrfs]
>      btrfs_init_space_info+0x143/0x1b0 [btrfs]
>      open_ctree+0x2eed/0x43fe [btrfs]
>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>      vfs_get_tree+0x87/0x2f0
>      vfs_cmd_create+0xbd/0x280
>      __do_sys_fsconfig+0x64f/0xa30
>      do_syscall_64+0x95/0x540
>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
> unreferenced object 0xffff888128d80000 (size 16):
>    comm "mount", pid 1745, jiffies 4294866237
>    hex dump (first 16 bytes):
>      64 61 74 61 2d 72 65 6c 6f 63 00 4b 96 f6 48 82  data-reloc.K..H.
>    backtrace (crc 1598f702):
>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>      kstrdup+0x42/0xc0
>      kobject_set_name_vargs+0x44/0x110
>      kobject_init_and_add+0xcf/0x140
>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>      create_space_info+0x211/0x320 [btrfs]
>      open_ctree+0x2eed/0x43fe [btrfs]
>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>      vfs_get_tree+0x87/0x2f0
>      vfs_cmd_create+0xbd/0x280
>      __do_sys_fsconfig+0x64f/0xa30
>      do_syscall_64+0x95/0x540
>      entry_SYSCALL_64_after_hwframe+0x76/0x7e

Please report the above to the BTRFS maintainers.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13 17:44 ` Bart Van Assche
@ 2026-02-13 23:38   ` yanjun.zhu
  0 siblings, 0 replies; 10+ messages in thread
From: yanjun.zhu @ 2026-02-13 23:38 UTC (permalink / raw)
  To: Bart Van Assche, Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	nbd@other.debian.org, linux-rdma@vger.kernel.org

On 2/13/26 9:44 AM, Bart Van Assche wrote:
> On 2/12/26 11:57 PM, Shinichiro Kawasaki wrote:
>> [5] kmemleak at nvme/061 with rdma transport and siw driver
>>
>> unreferenced object 0xffff888114792600 (size 32):
>>    comm "kworker/2:1H", pid 66, jiffies 4295489358
>>    hex dump (first 32 bytes):
>>      c2 f6 83 05 00 ea ff ff 00 00 00 00 00 10 00 00  ................
>>      00 b0 fd 60 81 88 ff ff 00 10 00 00 00 00 00 00  ...`............
>>    backtrace (crc 3dbac61d):
>>      __kmalloc_noprof+0x62f/0x8b0
>>      sgl_alloc_order+0x74/0x330
>>      0xffffffffc1b73433
>>      0xffffffffc1bc1f0d
>>      0xffffffffc1bc8064
>>      __ib_process_cq+0x14f/0x3e0 [ib_core]
>>      ib_cq_poll_work+0x49/0x160 [ib_core]
>>      process_one_work+0x868/0x1480
>>      worker_thread+0x5ee/0xfd0
>>      kthread+0x3af/0x770
>>      ret_from_fork+0x55c/0x810
>>      ret_from_fork_asm+0x1a/0x30
> 
> There are no sgl_alloc() calls in the RDMA subsystem. I think the above
> indicates a memory leak in either the RDMA NVMe target driver or in the
> NVMe target core.

3a2c32d357db RDMA/siw: reclassify sockets in order to avoid false 
positives from lockdep
85cb0757d7e1 net: Convert proto_ops connect() callbacks to use 
sockaddr_unsized
0e50474fa514 net: Convert proto_ops bind() callbacks to use sockaddr_unsized

There are only three commits touching the siw driver between v6.18 and 
v6.19. I therefore suspect the issue is more likely in the NVMe side.

Best Regards,
Zhu Yanjun

> 
>> [7] kmemleak at zbd/009
>>
>> unreferenced object 0xffff88815f1f1280 (size 32):
>>    comm "mount", pid 1745, jiffies 4294866235
>>    hex dump (first 32 bytes):
>>      6d 65 74 61 64 61 74 61 2d 74 72 65 65 6c 6f 67  metadata-treelog
>>      00 93 9c fb af bb ae 00 00 00 00 00 00 00 00 00  ................
>>    backtrace (crc 2ee03cc2):
>>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>>      kstrdup+0x42/0xc0
>>      kobject_set_name_vargs+0x44/0x110
>>      kobject_init_and_add+0xcf/0x140
>>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>>      create_space_info+0x247/0x320 [btrfs]
>>      btrfs_init_space_info+0x143/0x1b0 [btrfs]
>>      open_ctree+0x2eed/0x43fe [btrfs]
>>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>>      vfs_get_tree+0x87/0x2f0
>>      vfs_cmd_create+0xbd/0x280
>>      __do_sys_fsconfig+0x64f/0xa30
>>      do_syscall_64+0x95/0x540
>>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> unreferenced object 0xffff888128d80000 (size 16):
>>    comm "mount", pid 1745, jiffies 4294866237
>>    hex dump (first 16 bytes):
>>      64 61 74 61 2d 72 65 6c 6f 63 00 4b 96 f6 48 82  data-reloc.K..H.
>>    backtrace (crc 1598f702):
>>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>>      kstrdup+0x42/0xc0
>>      kobject_set_name_vargs+0x44/0x110
>>      kobject_init_and_add+0xcf/0x140
>>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>>      create_space_info+0x211/0x320 [btrfs]
>>      open_ctree+0x2eed/0x43fe [btrfs]
>>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>>      vfs_get_tree+0x87/0x2f0
>>      vfs_cmd_create+0xbd/0x280
>>      __do_sys_fsconfig+0x64f/0xa30
>>      do_syscall_64+0x95/0x540
>>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Please report the above to the BTRFS maintainers.
> 
> Thanks,
> 
> Bart.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13  9:56 ` Daniel Wagner
@ 2026-02-14 21:19   ` Chaitanya Kulkarni
  2026-02-16 10:26     ` Daniel Wagner
  0 siblings, 1 reply; 10+ messages in thread
From: Chaitanya Kulkarni @ 2026-02-14 21:19 UTC (permalink / raw)
  To: Daniel Wagner, Shinichiro Kawasaki
  Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-scsi@vger.kernel.org, nbd@other.debian.org,
	linux-rdma@vger.kernel.org

On 2/13/26 01:56, Daniel Wagner wrote:
> nvmet_fc_target_assoc_free runs in the nvmet_wq context and calls
>
>    nvmet_fc_delete_target_queue
>      nvmet_cq_put
>        nvmet_cq_destroy
>          nvmet_ctrl_put
>           nvmet_ctrl_free
>             flush_work(&ctrl->async_event_work);
>             cancel_work_sync(&ctrl->fatal_err_work);
>   
> The async_event_work could be running on nvmet_wq. So this deadlock is
> real. No idea how to fix it yet.
>

Can the following patch be a potential fix for the above issue as well?
Totally untested ...
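The self-flush pattern quoted above (a work item running on nvmet-wq waiting for another work item queued to nvmet-wq) can be sketched as a userspace analogy. This is a hedged illustration only: Python's single-worker `ThreadPoolExecutor` stands in for a kernel workqueue, and the names `release_work`/`async_event_work` mirror the trace rather than any real API:

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutTimeout

def demo(use_separate_queue: bool) -> str:
    # nvmet-wq analog: a queue backed by a single worker thread
    nvmet_wq = ThreadPoolExecutor(max_workers=1)
    # with the fix, AEN work goes to its own queue (nvmet-aen-wq analog)
    aen_wq = ThreadPoolExecutor(max_workers=1) if use_separate_queue else nvmet_wq

    def async_event_work() -> str:
        return "done"

    outcome = {}

    def release_work() -> None:
        # worker context: queue the AEN work, then "flush" (wait for) it,
        # mirroring nvmet_ctrl_free() calling flush_work()
        fut = aen_wq.submit(async_event_work)
        try:
            # a bounded wait here so the demo terminates; the kernel's
            # flush_work() would simply hang in the same-queue case
            outcome["aen"] = fut.result(timeout=0.5)
        except FutTimeout:
            outcome["aen"] = "deadlock"

    nvmet_wq.submit(release_work).result()
    nvmet_wq.shutdown(wait=True)
    if use_separate_queue:
        aen_wq.shutdown(wait=True)
    return outcome["aen"]
```

`demo(False)` models the splat: the wait times out because the only nvmet-wq worker is occupied by `release_work` itself, so `async_event_work` can never run. `demo(True)` models the dedicated queue and completes.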

 From ad58e979ab9a2d4a7cc6234d28f2d90c174e4df9 Mon Sep 17 00:00:00 2001
From: Chaitanya Kulkarni <kch@nvidia.com>
Date: Thu, 5 Feb 2026 17:05:27 -0800
Subject: [INTERNAL PATCH] nvmet: move async event work off nvmet-wq

On the target side, nvmet_ctrl_free() flushes ctrl->async_event_work.
If nvmet_ctrl_free() runs on nvmet-wq, the flush re-acquires the
workqueue completion lockdep map already held by the same worker:

A. Async event work queued on nvmet-wq (prior to disconnect):
   nvmet_execute_async_event()
      queue_work(nvmet_wq, &ctrl->async_event_work)

   nvmet_add_async_event()
      queue_work(nvmet_wq, &ctrl->async_event_work)

B. Full pre-work chain (RDMA CM path):
   nvmet_rdma_cm_handler()
      nvmet_rdma_queue_disconnect()
        __nvmet_rdma_queue_disconnect()
          queue_work(nvmet_wq, &queue->release_work)
            process_one_work()
              lock((wq_completion)nvmet-wq)  <--------- 1st
              nvmet_rdma_release_queue_work()

C. Recursive path (same worker):
   nvmet_rdma_release_queue_work()
      nvmet_rdma_free_queue()
        nvmet_sq_destroy()
          nvmet_ctrl_put()
            nvmet_ctrl_free()
              flush_work(&ctrl->async_event_work)
                __flush_work()
                  touch_wq_lockdep_map()
                  lock((wq_completion)nvmet-wq)  <--------- 2nd

Lockdep splat:

   ============================================
   WARNING: possible recursive locking detected
   6.19.0-rc3nvme+ #14 Tainted: G                 N
   --------------------------------------------
   kworker/u192:42/44933 is trying to acquire lock:
   ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90

   but task is already holding lock:
   ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660

   3 locks held by kworker/u192:42/44933:
    #0: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
    #1: ffffc9000e6cbe28 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1c5/0x660
    #2: ffffffff82d4db60 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x62/0x530

   Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
   Call Trace:
    __flush_work+0x268/0x530
    nvmet_ctrl_free+0x140/0x310 [nvmet]
    nvmet_cq_put+0x74/0x90 [nvmet]
    nvmet_rdma_free_queue+0x23/0xe0 [nvmet_rdma]
    nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
    process_one_work+0x206/0x660
    worker_thread+0x184/0x320
    kthread+0x10c/0x240
    ret_from_fork+0x319/0x390

Move async event work to a dedicated nvmet-aen-wq to avoid reentrant
flush on nvmet-wq.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
  drivers/nvme/target/admin-cmd.c |  2 +-
  drivers/nvme/target/core.c      | 13 +++++++++++--
  drivers/nvme/target/nvmet.h     |  1 +
  3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 3da31bb1183e..100d1466ff84 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -1586,7 +1586,7 @@ void nvmet_execute_async_event(struct nvmet_req *req)
  	ctrl->async_event_cmds[ctrl->nr_async_event_cmds++] = req;
  	mutex_unlock(&ctrl->lock);
  
-	queue_work(nvmet_wq, &ctrl->async_event_work);
+	queue_work(nvmet_aen_wq, &ctrl->async_event_work);
  }
  
  void nvmet_execute_keep_alive(struct nvmet_req *req)
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index cc88e5a28c8a..b0883c7fdb8f 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -26,6 +26,7 @@ static DEFINE_IDA(cntlid_ida);
  
  struct workqueue_struct *nvmet_wq;
  EXPORT_SYMBOL_GPL(nvmet_wq);
+struct workqueue_struct *nvmet_aen_wq;
  
  /*
   * This read/write semaphore is used to synchronize access to configuration
@@ -205,7 +206,7 @@ void nvmet_add_async_event(struct nvmet_ctrl *ctrl, u8 event_type,
  	list_add_tail(&aen->entry, &ctrl->async_events);
  	mutex_unlock(&ctrl->lock);
  
-	queue_work(nvmet_wq, &ctrl->async_event_work);
+	queue_work(nvmet_aen_wq, &ctrl->async_event_work);
  }
  
  static void nvmet_add_to_changed_ns_log(struct nvmet_ctrl *ctrl, __le32 nsid)
@@ -1958,9 +1959,14 @@ static int __init nvmet_init(void)
  	if (!nvmet_wq)
  		goto out_free_buffered_work_queue;
  
+	nvmet_aen_wq = alloc_workqueue("nvmet-aen-wq",
+			WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
+	if (!nvmet_aen_wq)
+		goto out_free_nvmet_work_queue;
+
  	error = nvmet_init_debugfs();
  	if (error)
-		goto out_free_nvmet_work_queue;
+		goto out_free_nvmet_aen_work_queue;
  
  	error = nvmet_init_discovery();
  	if (error)
@@ -1976,6 +1982,8 @@ static int __init nvmet_init(void)
  	nvmet_exit_discovery();
  out_exit_debugfs:
  	nvmet_exit_debugfs();
+out_free_nvmet_aen_work_queue:
+	destroy_workqueue(nvmet_aen_wq);
  out_free_nvmet_work_queue:
  	destroy_workqueue(nvmet_wq);
  out_free_buffered_work_queue:
@@ -1993,6 +2001,7 @@ static void __exit nvmet_exit(void)
  	nvmet_exit_discovery();
  	nvmet_exit_debugfs();
  	ida_destroy(&cntlid_ida);
+	destroy_workqueue(nvmet_aen_wq);
  	destroy_workqueue(nvmet_wq);
  	destroy_workqueue(buffered_io_wq);
  	destroy_workqueue(zbd_wq);
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index b664b584fdc8..319d6a5e9cf0 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -501,6 +501,7 @@ extern struct kmem_cache *nvmet_bvec_cache;
  extern struct workqueue_struct *buffered_io_wq;
  extern struct workqueue_struct *zbd_wq;
  extern struct workqueue_struct *nvmet_wq;
+extern struct workqueue_struct *nvmet_aen_wq;
  
  static inline void nvmet_set_result(struct nvmet_req *req, u32 result)
  {
-- 
2.39.5


-ck
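A side note on the patch's error unwinding: the goto labels tear resources down in reverse allocation order (the AEN queue before nvmet-wq). The same discipline can be expressed, purely as an illustrative analogy, with Python's `contextlib.ExitStack`; `alloc`/`destroy` are stand-ins for alloc_workqueue()/destroy_workqueue(), not real APIs:

```python
from contextlib import ExitStack

live = []  # tracks "allocated" resources, most recent last

def alloc(name: str) -> str:
    # stand-in for alloc_workqueue()
    live.append(name)
    return name

def destroy(name: str) -> None:
    # stand-in for destroy_workqueue(); runs in LIFO order on unwind,
    # matching the goto-label order in the patch
    live.remove(name)

def nvmet_init(debugfs_ok: bool) -> list:
    with ExitStack() as stack:
        stack.callback(destroy, alloc("nvmet-wq"))
        stack.callback(destroy, alloc("nvmet-aen-wq"))
        if not debugfs_ok:
            # analog of nvmet_init_debugfs() failing: unwinds via
            # out_free_nvmet_aen_work_queue, then out_free_nvmet_work_queue
            raise RuntimeError("nvmet_init_debugfs failed")
        stack.pop_all()  # success: keep both queues alive
    return list(live)
```

On success both queues survive; on a simulated debugfs failure the stack destroys them in reverse order before the error propagates, which is exactly what the reordered goto chain guarantees in C.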





^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13  7:57 blktests failures with v6.19 kernel Shinichiro Kawasaki
  2026-02-13  9:56 ` Daniel Wagner
  2026-02-13 17:44 ` Bart Van Assche
@ 2026-02-16  9:38 ` Nilay Shroff
  2026-02-16 21:18   ` Chaitanya Kulkarni
  2026-02-24  7:47 ` Johannes Thumshirn
  3 siblings, 1 reply; 10+ messages in thread
From: Nilay Shroff @ 2026-02-16  9:38 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	linux-rdma@vger.kernel.org, nbd@other.debian.org

Hi Chaitanya,

On 2/13/26 1:27 PM, Shinichiro Kawasaki wrote:
> Hi all,
> 
> I ran the latest blktests (git hash: b5b699341102) with the v6.19 kernel. I
> observed 6 failures listed below. Comparing with the previous report with the
> v6.19-rc1 kernel [1], two failures were resolved (nvme/033 and srp) and three
> failures are newly observed (nvme/061, zbd/009 and zbd/013). Recently, kmemleak
> support was introduced to blktests. Two out of the three new failures were
> detected by kmemleak. Your actions to fix the failures will be appreciated as
> always.
> 
> [1] https://lore.kernel.org/linux-block/a078671f-10b3-47e7-acbb-4251c8744523@wdc.com/
> 
> 
> List of failures
> ================
> #1: nvme/005,063 (tcp transport)
> #2: nvme/058 (fc transport)
> #3: nvme/061 (rdma transport, siw driver)(new)(kmemleak)
> #4: nbd/002
> #5: zbd/009 (new)(kmemleak)
> #6: zbd/013 (new)
> 
> 
> Failure description
> ===================
> 
> #1: nvme/005,063 (tcp transport)
> 
>     The test case nvme/005 and 063 fail for tcp transport due to the lockdep
>     WARN related to the three locks q->q_usage_counter, q->elevator_lock and
>     set->srcu. Refer to the nvme/063 failure report for v6.16-rc1 kernel [2].
> 
>     [2] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/

For the lockdep failure reported above for nvme/063, it seems we already had
solution but it appears that it's not yet upstreamed, check this:
https://lore.kernel.org/all/20251125061142.18094-1-ckulkarnilinux@gmail.com/

Can you please update and resend the above patch per the last feedback? I think
this should fix the lockdep reported under nvme/063.

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-14 21:19   ` Chaitanya Kulkarni
@ 2026-02-16 10:26     ` Daniel Wagner
  2026-02-17 21:22       ` Chaitanya Kulkarni
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Wagner @ 2026-02-16 10:26 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	nbd@other.debian.org, linux-rdma@vger.kernel.org

Hi Chaitanya,

On Sat, Feb 14, 2026 at 09:19:47PM +0000, Chaitanya Kulkarni wrote:
> On 2/13/26 01:56, Daniel Wagner wrote:
> > nvmet_fc_target_assoc_free runs in the nvmet_wq context and calls
> >
> >    nvmet_fc_delete_target_queue
> >      nvmet_cq_put
> >        nvmet_cq_destroy
> >          nvmet_ctrl_put
> >           nvmet_ctrl_free
> >             flush_work(&ctrl->async_event_work);
> >             cancel_work_sync(&ctrl->fatal_err_work);
> >   
> > The async_event_work could be running on nvmet_wq. So this deadlock is
> > real. No idea how to fix it yet.
> >
> 
> Can the following patch be a potential fix for the above issue as well?
> Totally untested ...

Yes, this should work. I was not too happy about adding a workqueue for
this, but after looking at nvme, it seems an acceptable approach. Though
I'd make nvmet follow nvme and, instead of adding an AEN workqueue,
rather have a nvmet-reset-wq or nvmet-delete-wq.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-16  9:38 ` Nilay Shroff
@ 2026-02-16 21:18   ` Chaitanya Kulkarni
  0 siblings, 0 replies; 10+ messages in thread
From: Chaitanya Kulkarni @ 2026-02-16 21:18 UTC (permalink / raw)
  To: Nilay Shroff
  Cc: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	linux-rdma@vger.kernel.org, nbd@other.debian.org

On 2/16/26 01:38, Nilay Shroff wrote:
> Hi Chaitanya,
>
> On 2/13/26 1:27 PM, Shinichiro Kawasaki wrote:
>> Hi all,
>>
>> I ran the latest blktests (git hash: b5b699341102) with the v6.19 kernel. I
>> observed 6 failures listed below. Comparing with the previous report with the
>> v6.19-rc1 kernel [1], two failures were resolved (nvme/033 and srp) and three
>> failures are newly observed (nvme/061, zbd/009 and zbd/013). Recently, kmemleak
>> support was introduced to blktests. Two out of the three new failures were
>> detected by kmemleak. Your actions to fix the failures will be appreciated as
>> always.
>>
>> [1] https://lore.kernel.org/linux-block/a078671f-10b3-47e7-acbb-4251c8744523@wdc.com/
>>
>>
>> List of failures
>> ================
>> #1: nvme/005,063 (tcp transport)
>> #2: nvme/058 (fc transport)
>> #3: nvme/061 (rdma transport, siw driver)(new)(kmemleak)
>> #4: nbd/002
>> #5: zbd/009 (new)(kmemleak)
>> #6: zbd/013 (new)
>>
>>
>> Failure description
>> ===================
>>
>> #1: nvme/005,063 (tcp transport)
>>
>>      The test case nvme/005 and 063 fail for tcp transport due to the lockdep
>>      WARN related to the three locks q->q_usage_counter, q->elevator_lock and
>>      set->srcu. Refer to the nvme/063 failure report for v6.16-rc1 kernel [2].
>>
>>      [2] https://lore.kernel.org/linux-block/4fdm37so3o4xricdgfosgmohn63aa7wj3ua4e5vpihoamwg3ui@fq42f5q5t5ic/
> For the lockdep failure reported above for nvme/063, it seems we already had
> solution but it appears that it's not yet upstreamed, check this:
> https://lore.kernel.org/all/20251125061142.18094-1-ckulkarnilinux@gmail.com/
>
> Can you please update and resend the above patch per the last feedback? I think
> this should fix the lockdep reported under nvme/063.
>
> Thanks,
> --Nilay


Thanks for pointing this out.

Please allow me 2-3 days I'll send out a patch.

-ck





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-16 10:26     ` Daniel Wagner
@ 2026-02-17 21:22       ` Chaitanya Kulkarni
  0 siblings, 0 replies; 10+ messages in thread
From: Chaitanya Kulkarni @ 2026-02-17 21:22 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	nbd@other.debian.org, linux-rdma@vger.kernel.org

On 2/16/26 02:26, Daniel Wagner wrote:
> Hi Chaitanya,
>
> On Sat, Feb 14, 2026 at 09:19:47PM +0000, Chaitanya Kulkarni wrote:
>> On 2/13/26 01:56, Daniel Wagner wrote:
>>> nvmet_fc_target_assoc_free runs in the nvmet_wq context and calls
>>>
>>>     nvmet_fc_delete_target_queue
>>>       nvmet_cq_put
>>>         nvmet_cq_destroy
>>>           nvmet_ctrl_put
>>>            nvmet_ctrl_free
>>>              flush_work(&ctrl->async_event_work);
>>>              cancel_work_sync(&ctrl->fatal_err_work);
>>>    
>>> The async_event_work could be running on nvmet_wq. So this deadlock is
>>> real. No idea how to fix it yet.
>>>
>> Can the following patch be a potential fix for the above issue as well?
>> Totally untested ...
> Yes, this should work. I was not too happy about adding a workqueue for
> this, but after looking at nvme, it seems an acceptable approach. Though
> I'd make nvmet follow nvme and, instead of adding an AEN workqueue,
> rather have a nvmet-reset-wq or nvmet-delete-wq.
>
> Thanks,
> Daniel


Thanks for looking into this.

The above patch has a cleanup bug; I've sent a better patch. Please have
a look, and hopefully we can get this merged this week.

-ck



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: blktests failures with v6.19 kernel
  2026-02-13  7:57 blktests failures with v6.19 kernel Shinichiro Kawasaki
                   ` (2 preceding siblings ...)
  2026-02-16  9:38 ` Nilay Shroff
@ 2026-02-24  7:47 ` Johannes Thumshirn
  3 siblings, 0 replies; 10+ messages in thread
From: Johannes Thumshirn @ 2026-02-24  7:47 UTC (permalink / raw)
  To: Shinichiro Kawasaki, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
	nbd@other.debian.org, linux-rdma@vger.kernel.org

On 2/13/26 8:58 AM, Shinichiro Kawasaki wrote:
> [7] kmemleak at zbd/009
>
> unreferenced object 0xffff88815f1f1280 (size 32):
>    comm "mount", pid 1745, jiffies 4294866235
>    hex dump (first 32 bytes):
>      6d 65 74 61 64 61 74 61 2d 74 72 65 65 6c 6f 67  metadata-treelog
>      00 93 9c fb af bb ae 00 00 00 00 00 00 00 00 00  ................
>    backtrace (crc 2ee03cc2):
>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>      kstrdup+0x42/0xc0
>      kobject_set_name_vargs+0x44/0x110
>      kobject_init_and_add+0xcf/0x140
>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>      create_space_info+0x247/0x320 [btrfs]
>      btrfs_init_space_info+0x143/0x1b0 [btrfs]
>      open_ctree+0x2eed/0x43fe [btrfs]
>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>      vfs_get_tree+0x87/0x2f0
>      vfs_cmd_create+0xbd/0x280
>      __do_sys_fsconfig+0x64f/0xa30
>      do_syscall_64+0x95/0x540
>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
> unreferenced object 0xffff888128d80000 (size 16):
>    comm "mount", pid 1745, jiffies 4294866237
>    hex dump (first 16 bytes):
>      64 61 74 61 2d 72 65 6c 6f 63 00 4b 96 f6 48 82  data-reloc.K..H.
>    backtrace (crc 1598f702):
>      __kmalloc_node_track_caller_noprof+0x66b/0x8c0
>      kstrdup+0x42/0xc0
>      kobject_set_name_vargs+0x44/0x110
>      kobject_init_and_add+0xcf/0x140
>      btrfs_sysfs_add_space_info_type+0xf2/0x200 [btrfs]
>      create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
>      create_space_info+0x211/0x320 [btrfs]
>      open_ctree+0x2eed/0x43fe [btrfs]
>      btrfs_get_tree.cold+0x90/0x1da [btrfs]
>      vfs_get_tree+0x87/0x2f0
>      vfs_cmd_create+0xbd/0x280
>      __do_sys_fsconfig+0x64f/0xa30
>      do_syscall_64+0x95/0x540
>      entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
This is clearly a BTRFS bug: we're leaking the space-info's kobject. I
/think/ I know why, but I'm not 100% sure right now.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-02-24  7:47 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-13  7:57 blktests failures with v6.19 kernel Shinichiro Kawasaki
2026-02-13  9:56 ` Daniel Wagner
2026-02-14 21:19   ` Chaitanya Kulkarni
2026-02-16 10:26     ` Daniel Wagner
2026-02-17 21:22       ` Chaitanya Kulkarni
2026-02-13 17:44 ` Bart Van Assche
2026-02-13 23:38   ` yanjun.zhu
2026-02-16  9:38 ` Nilay Shroff
2026-02-16 21:18   ` Chaitanya Kulkarni
2026-02-24  7:47 ` Johannes Thumshirn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox