linux-nvme.lists.infradead.org archive mirror
* [bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
@ 2025-05-29 12:41 Yi Zhang
  2025-06-02  5:52 ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Yi Zhang @ 2025-05-29 12:41 UTC (permalink / raw)
  To: open list:NVM EXPRESS DRIVER; +Cc: Keith Busch, Jens Axboe, Christoph Hellwig

Hi

My regression test found this issue starting from v6.15-rc7. Please help
check it, and let me know if you need any info or tests for it. Thanks.

[ 2313.264089] nvme nvme2: resetting controller
[ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
[ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
[ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=512 bytes
[ 2319.467732] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.477555] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.487377] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.497214] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.507026] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x600 flags=0x0020]
[ 2319.516999] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.526790] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x500 flags=0x0020]
[ 2319.536765] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.546560] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x600 flags=0x0020]
[ 2319.556524] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
domain=0x0033 address=0x0 flags=0x0020]
[ 2319.881675] nvme nvme3: rescanning namespaces.
[ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=32768 bytes
[ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
                    preempt=0x00000000 lock=0->1 RCU=0->0
workfn=async_run_entry_fn
[ 2320.191968] 1 lock held by kworker/u67:3/365:
[ 2320.196395]  #0: ffff8881a06b11f0 (&q->limits_lock){+.+.}-{4:4},
at: nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2320.207301] CPU: 5 UID: 0 PID: 365 Comm: kworker/u67:3 Not tainted
6.15.0 #1 PREEMPT(voluntary)
[ 2320.207309] Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS
2.17.0 12/04/2024
[ 2320.207313] Workqueue: async async_run_entry_fn
[ 2320.207320] Call Trace:
[ 2320.207325]  <TASK>
[ 2320.207331]  dump_stack_lvl+0xac/0xc0
[ 2320.207343]  process_one_work+0xfc8/0x1950
[ 2320.207355]  ? __pfx_async_run_entry_fn+0x10/0x10
[ 2320.207398]  ? __pfx_process_one_work+0x10/0x10
[ 2320.207407]  ? srso_return_thunk+0x5/0x5f
[ 2320.207433]  ? srso_return_thunk+0x5/0x5f
[ 2320.207437]  ? assign_work+0x16c/0x240
[ 2320.207458]  worker_thread+0x58d/0xcf0
[ 2320.207493]  ? __pfx_worker_thread+0x10/0x10
[ 2320.207508]  kthread+0x3d8/0x7a0
[ 2320.207519]  ? __pfx_kthread+0x10/0x10
[ 2320.207532]  ? srso_return_thunk+0x5/0x5f
[ 2320.207536]  ? rcu_is_watching+0x11/0xb0
[ 2320.207548]  ? __pfx_kthread+0x10/0x10
[ 2320.207562]  ret_from_fork+0x30/0x70
[ 2320.207569]  ? __pfx_kthread+0x10/0x10
[ 2320.207577]  ret_from_fork_asm+0x1a/0x30
[ 2320.207625]  </TASK>
[ 2458.962206] INFO: task kworker/u67:0:1471 blocked for more than 122 seconds.
[ 2458.969586]       Not tainted 6.15.0 #1
[ 2458.973464] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2458.981344] task:kworker/u67:0   state:D stack:25400 pid:1471
tgid:1471  ppid:2      task_flags:0x4208060 flags:0x00004000
[ 2458.981370] Workqueue: async async_run_entry_fn
[ 2458.981393] Call Trace:
[ 2458.981401]  <TASK>
[ 2458.981423]  __schedule+0x8fa/0x1c70
[ 2458.981460]  ? __pfx___schedule+0x10/0x10
[ 2458.981479]  ? srso_return_thunk+0x5/0x5f
[ 2458.981495]  ? find_held_lock+0x32/0x90
[ 2458.981515]  ? srso_return_thunk+0x5/0x5f
[ 2458.981525]  ? __lock_release+0x1a2/0x2c0
[ 2458.981535]  ? srso_return_thunk+0x5/0x5f
[ 2458.981554]  ? schedule+0x1e8/0x270
[ 2458.981579]  schedule+0xdc/0x270
[ 2458.981596]  schedule_preempt_disabled+0x14/0x30
[ 2458.981608]  __mutex_lock+0xbf1/0x1690
[ 2458.981633]  ? nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2458.981721]  ? __pfx___mutex_lock+0x10/0x10
[ 2458.981731]  ? srso_return_thunk+0x5/0x5f
[ 2458.981751]  ? srso_return_thunk+0x5/0x5f
[ 2458.981794]  ? srso_return_thunk+0x5/0x5f
[ 2458.981821]  ? nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2458.981851]  ? srso_return_thunk+0x5/0x5f
[ 2458.981861]  nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2458.981928]  ? unwind_next_frame+0x453/0x20c0
[ 2458.981945]  ? srso_return_thunk+0x5/0x5f
[ 2458.981961]  ? srso_return_thunk+0x5/0x5f
[ 2458.981971]  ? unwind_next_frame+0x45d/0x20c0
[ 2458.981981]  ? ret_from_fork_asm+0x1a/0x30
[ 2458.981999]  ? srso_return_thunk+0x5/0x5f
[ 2458.982008]  ? kernel_text_address+0x13/0xd0
[ 2458.982022]  ? ret_from_fork_asm+0x1a/0x30
[ 2458.982032]  ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 2458.982052]  ? srso_return_thunk+0x5/0x5f
[ 2458.982062]  ? arch_stack_walk+0x87/0xf0
[ 2458.982088]  ? __pfx_nvme_update_ns_info_block+0x10/0x10 [nvme_core]
[ 2458.982126]  ? ret_from_fork_asm+0x1a/0x30
[ 2458.982150]  ? srso_return_thunk+0x5/0x5f
[ 2458.982160]  ? stack_trace_save+0x8f/0xc0
[ 2458.982174]  ? srso_return_thunk+0x5/0x5f
[ 2458.982184]  ? stack_depot_save_flags+0x41/0x6c0
[ 2458.982221]  ? srso_return_thunk+0x5/0x5f
[ 2458.982231]  ? kasan_save_stack+0x30/0x40
[ 2458.982244]  ? kasan_save_stack+0x20/0x40
[ 2458.982254]  ? kasan_save_track+0x10/0x30
[ 2458.982265]  ? kasan_save_free_info+0x37/0x60
[ 2458.982275]  ? __kasan_slab_free+0x34/0x50
[ 2458.982286]  ? kfree+0x29d/0x4c0
[ 2458.982298]  ? nvme_ns_info_from_identify+0x312/0x4d0 [nvme_core]
[ 2458.982328]  ? nvme_scan_ns+0x12f/0x4b0 [nvme_core]
[ 2458.982358]  ? async_run_entry_fn+0x96/0x4f0
[ 2458.982369]  ? process_one_work+0x8cd/0x1950
[ 2458.982379]  ? worker_thread+0x58d/0xcf0
[ 2458.982389]  ? kthread+0x3d8/0x7a0
[ 2458.982399]  ? ret_from_fork+0x30/0x70
[ 2458.982410]  ? ret_from_fork_asm+0x1a/0x30
[ 2458.982420]  ? srso_return_thunk+0x5/0x5f
[ 2458.982430]  ? local_clock_noinstr+0x9/0xc0
[ 2458.982445]  ? srso_return_thunk+0x5/0x5f
[ 2458.982455]  ? __lock_release+0x1a2/0x2c0
[ 2458.982473]  ? srso_return_thunk+0x5/0x5f
[ 2458.982496]  ? __pfx_nvme_scan_ns_async+0x10/0x10 [nvme_core]
[ 2458.982526]  nvme_update_ns_info+0x1d9/0xbe0 [nvme_core]
[ 2458.982561]  ? srso_return_thunk+0x5/0x5f
[ 2458.982571]  ? __debug_check_no_obj_freed+0x252/0x510
[ 2458.982605]  ? srso_return_thunk+0x5/0x5f
[ 2458.982615]  ? __lock_acquire+0x573/0xc00
[ 2458.982650]  ? __pfx_nvme_update_ns_info+0x10/0x10 [nvme_core]
[ 2458.982679]  ? find_held_lock+0x32/0x90
[ 2458.982692]  ? local_clock_noinstr+0x9/0xc0
[ 2458.982706]  ? srso_return_thunk+0x5/0x5f
[ 2458.982716]  ? __lock_release+0x1a2/0x2c0
[ 2458.982726]  ? srso_return_thunk+0x5/0x5f
[ 2458.982744]  ? nvme_find_get_ns+0x221/0x2f0 [nvme_core]
[ 2458.982779]  ? srso_return_thunk+0x5/0x5f
[ 2458.982795]  ? srso_return_thunk+0x5/0x5f
[ 2458.982806]  ? nvme_find_get_ns+0x232/0x2f0 [nvme_core]
[ 2458.982848]  ? __pfx_nvme_find_get_ns+0x10/0x10 [nvme_core]
[ 2458.982905]  ? __lock_acquire+0x573/0xc00
[ 2458.982938]  ? __pfx_nvme_scan_ns_async+0x10/0x10 [nvme_core]
[ 2458.982968]  nvme_scan_ns+0x26a/0x4b0 [nvme_core]
[ 2458.983006]  ? __pfx_nvme_scan_ns+0x10/0x10 [nvme_core]
[ 2458.983036]  ? srso_return_thunk+0x5/0x5f
[ 2458.983064]  ? ktime_get+0x16b/0x1f0
[ 2458.983075]  ? srso_return_thunk+0x5/0x5f
[ 2458.983085]  ? lockdep_hardirqs_on+0x78/0x100
[ 2458.983097]  ? srso_return_thunk+0x5/0x5f
[ 2458.983107]  ? srso_return_thunk+0x5/0x5f
[ 2458.983130]  ? __pfx_nvme_scan_ns_async+0x10/0x10 [nvme_core]
[ 2458.983159]  async_run_entry_fn+0x96/0x4f0
[ 2458.983187]  process_one_work+0x8cd/0x1950
[ 2458.983230]  ? srso_return_thunk+0x5/0x5f
[ 2458.983244]  ? __pfx_process_one_work+0x10/0x10
[ 2458.983258]  ? srso_return_thunk+0x5/0x5f
[ 2458.983288]  ? srso_return_thunk+0x5/0x5f
[ 2458.983298]  ? assign_work+0x16c/0x240
[ 2458.983324]  worker_thread+0x58d/0xcf0
[ 2458.983365]  ? __pfx_worker_thread+0x10/0x10
[ 2458.983385]  kthread+0x3d8/0x7a0
[ 2458.983401]  ? __pfx_kthread+0x10/0x10
[ 2458.983419]  ? srso_return_thunk+0x5/0x5f
[ 2458.983429]  ? rcu_is_watching+0x11/0xb0
[ 2458.983446]  ? __pfx_kthread+0x10/0x10
[ 2458.983466]  ret_from_fork+0x30/0x70
[ 2458.983477]  ? __pfx_kthread+0x10/0x10
[ 2458.983491]  ret_from_fork_asm+0x1a/0x30
[ 2458.983544]  </TASK>
[ 2458.983563] INFO: task kworker/u67:0:1471 is blocked on a mutex
likely owned by task kworker/u67:3:365.
[ 2458.992991] task:kworker/u67:3   state:I stack:23576 pid:365
tgid:365   ppid:2      task_flags:0x4208060 flags:0x00004000
[ 2458.993013] Workqueue:  0x0 (events_unbound)
[ 2458.993037] Call Trace:
[ 2458.993045]  <TASK>
[ 2458.993065]  __schedule+0x8fa/0x1c70
[ 2458.993095]  ? __pfx___schedule+0x10/0x10
[ 2458.993114]  ? srso_return_thunk+0x5/0x5f
[ 2458.993128]  ? find_held_lock+0x32/0x90
[ 2458.993147]  ? srso_return_thunk+0x5/0x5f
[ 2458.993156]  ? __lock_release+0x1a2/0x2c0
[ 2458.993166]  ? srso_return_thunk+0x5/0x5f
[ 2458.993184]  ? schedule+0x1e8/0x270
[ 2458.993209]  schedule+0xdc/0x270
[ 2458.993222]  ? worker_thread+0xfb/0xcf0
[ 2458.993234]  worker_thread+0x14f/0xcf0
[ 2458.993275]  ? __pfx_worker_thread+0x10/0x10
[ 2458.993296]  kthread+0x3d8/0x7a0
[ 2458.993312]  ? __pfx_kthread+0x10/0x10
[ 2458.993330]  ? srso_return_thunk+0x5/0x5f
[ 2458.993340]  ? rcu_is_watching+0x11/0xb0
[ 2458.993356]  ? __pfx_kthread+0x10/0x10
[ 2458.993376]  ret_from_fork+0x30/0x70
[ 2458.993387]  ? __pfx_kthread+0x10/0x10
[ 2458.993401]  ret_from_fork_asm+0x1a/0x30
[ 2458.993455]  </TASK>
[ 2458.993463] INFO: task kworker/u67:1:1480 blocked for more than 122 seconds.
[ 2459.000541]       Not tainted 6.15.0 #1
[ 2459.004412] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2459.012264] task:kworker/u67:1   state:D stack:23416 pid:1480
tgid:1480  ppid:2      task_flags:0x4208060 flags:0x00004000
[ 2459.012286] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[ 2459.012321] Call Trace:
[ 2459.012329]  <TASK>
[ 2459.012350]  __schedule+0x8fa/0x1c70
[ 2459.012380]  ? __pfx___schedule+0x10/0x10
[ 2459.012399]  ? srso_return_thunk+0x5/0x5f
[ 2459.012413]  ? find_held_lock+0x32/0x90
[ 2459.012431]  ? srso_return_thunk+0x5/0x5f
[ 2459.012441]  ? __lock_release+0x1a2/0x2c0
[ 2459.012451]  ? srso_return_thunk+0x5/0x5f
[ 2459.012469]  ? schedule+0x1e8/0x270
[ 2459.012493]  schedule+0xdc/0x270
[ 2459.012511]  async_synchronize_cookie_domain+0x1b8/0x220
[ 2459.012523]  ? nvme_remove_invalid_namespaces+0x289/0x470 [nvme_core]
[ 2459.012558]  ? __pfx_async_synchronize_cookie_domain+0x10/0x10
[ 2459.012579]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 2459.012598]  ? __async_schedule_node_domain+0x362/0x550
[ 2459.012609]  ? __kasan_kmalloc+0x7b/0x90
[ 2459.012641]  nvme_scan_ns_list+0x519/0x6a0 [nvme_core]
[ 2459.012672]  ? trace_contention_end+0x150/0x1c0
[ 2459.012711]  ? __pfx_nvme_scan_ns_list+0x10/0x10 [nvme_core]
[ 2459.012756]  ? __pfx___mutex_lock+0x10/0x10
[ 2459.012776]  ? srso_return_thunk+0x5/0x5f
[ 2459.012787]  ? nvme_init_non_mdts_limits+0x24c/0x570 [nvme_core]
[ 2459.012868]  ? nvme_scan_work+0x285/0x540 [nvme_core]
[ 2459.012920]  nvme_scan_work+0x285/0x540 [nvme_core]
[ 2459.012962]  ? __pfx_nvme_scan_work+0x10/0x10 [nvme_core]
[ 2459.012992]  ? srso_return_thunk+0x5/0x5f
[ 2459.013010]  ? srso_return_thunk+0x5/0x5f
[ 2459.013028]  ? srso_return_thunk+0x5/0x5f
[ 2459.013038]  ? rcu_is_watching+0x11/0xb0
[ 2459.013048]  ? srso_return_thunk+0x5/0x5f
[ 2459.013058]  ? srso_return_thunk+0x5/0x5f
[ 2459.013067]  ? lock_acquire+0x10e/0x160
[ 2459.013094]  process_one_work+0x8cd/0x1950
[ 2459.013143]  ? __pfx_process_one_work+0x10/0x10
[ 2459.013158]  ? srso_return_thunk+0x5/0x5f
[ 2459.013188]  ? srso_return_thunk+0x5/0x5f
[ 2459.013198]  ? assign_work+0x16c/0x240
[ 2459.013224]  worker_thread+0x58d/0xcf0
[ 2459.013241]  ? srso_return_thunk+0x5/0x5f
[ 2459.013273]  ? __pfx_worker_thread+0x10/0x10
[ 2459.013294]  kthread+0x3d8/0x7a0
[ 2459.013310]  ? __pfx_kthread+0x10/0x10
[ 2459.013328]  ? srso_return_thunk+0x5/0x5f
[ 2459.013338]  ? rcu_is_watching+0x11/0xb0
[ 2459.013354]  ? __pfx_kthread+0x10/0x10
[ 2459.013374]  ret_from_fork+0x30/0x70
[ 2459.013385]  ? __pfx_kthread+0x10/0x10
[ 2459.013398]  ret_from_fork_asm+0x1a/0x30
[ 2459.013452]  </TASK>
[ 2459.013463] INFO: task nvme:2310 blocked for more than 122 seconds.
[ 2459.019757]       Not tainted 6.15.0 #1
[ 2459.023631] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 2459.031497] task:nvme            state:D stack:25296 pid:2310
tgid:2310  ppid:2309   task_flags:0x400100 flags:0x00004002
[ 2459.031519] Call Trace:
[ 2459.031527]  <TASK>
[ 2459.031548]  __schedule+0x8fa/0x1c70
[ 2459.031578]  ? __pfx___schedule+0x10/0x10
[ 2459.031596]  ? srso_return_thunk+0x5/0x5f
[ 2459.031610]  ? find_held_lock+0x32/0x90
[ 2459.031629]  ? srso_return_thunk+0x5/0x5f
[ 2459.031639]  ? __lock_release+0x1a2/0x2c0
[ 2459.031648]  ? srso_return_thunk+0x5/0x5f
[ 2459.031666]  ? schedule+0x1e8/0x270
[ 2459.031691]  schedule+0xdc/0x270
[ 2459.031708]  schedule_timeout+0x1fe/0x240
[ 2459.031724]  ? __pfx_schedule_timeout+0x10/0x10
[ 2459.031735]  ? lock_acquire.part.0+0xb6/0x240
[ 2459.031747]  ? srso_return_thunk+0x5/0x5f
[ 2459.031756]  ? find_held_lock+0x32/0x90
[ 2459.031766]  ? srso_return_thunk+0x5/0x5f
[ 2459.031776]  ? local_clock_noinstr+0x9/0xc0
[ 2459.031791]  ? srso_return_thunk+0x5/0x5f
[ 2459.031800]  ? __lock_release+0x1a2/0x2c0
[ 2459.031810]  ? srso_return_thunk+0x5/0x5f
[ 2459.031826]  ? srso_return_thunk+0x5/0x5f
[ 2459.031836]  ? rcu_is_watching+0x11/0xb0
[ 2459.031850]  ? _raw_spin_unlock_irq+0x24/0x50
[ 2459.031861]  ? srso_return_thunk+0x5/0x5f
[ 2459.031899]  __wait_for_common+0x1d3/0x4e0
[ 2459.031913]  ? __pfx_schedule_timeout+0x10/0x10
[ 2459.031940]  ? __pfx___wait_for_common+0x10/0x10
[ 2459.031951]  ? start_flush_work+0x4d5/0x9b0
[ 2459.031973]  ? srso_return_thunk+0x5/0x5f
[ 2459.031983]  ? start_flush_work+0x4df/0x9b0
[ 2459.032014]  __flush_work+0x103/0x1a0
[ 2459.032029]  ? __pfx___flush_work+0x10/0x10
[ 2459.032048]  ? __pfx_wq_barrier_func+0x10/0x10
[ 2459.032090]  ? __wait_for_common+0x9d/0x4e0
[ 2459.032121]  ? srso_return_thunk+0x5/0x5f
[ 2459.032133]  ? srso_return_thunk+0x5/0x5f
[ 2459.032143]  ? lockdep_hardirqs_on+0x78/0x100
[ 2459.032169]  nvme_passthru_end+0x24a/0x400 [nvme_core]
[ 2459.032216]  nvme_submit_user_cmd+0x260/0x320 [nvme_core]
[ 2459.032265]  nvme_user_cmd.constprop.0+0x246/0x450 [nvme_core]
[ 2459.032315]  ? __pfx_nvme_user_cmd.constprop.0+0x10/0x10 [nvme_core]
[ 2459.032380]  ? srso_return_thunk+0x5/0x5f
[ 2459.032390]  ? ioctl_has_perm.constprop.0.isra.0+0x27a/0x420
[ 2459.032429]  ? srso_return_thunk+0x5/0x5f
[ 2459.032442]  ? srso_return_thunk+0x5/0x5f
[ 2459.032467]  blkdev_ioctl+0x235/0x5c0
[ 2459.032484]  ? __pfx_blkdev_ioctl+0x10/0x10
[ 2459.032526]  __x64_sys_ioctl+0x132/0x190
[ 2459.032549]  do_syscall_64+0x8c/0x180
[ 2459.032566]  ? srso_return_thunk+0x5/0x5f
[ 2459.032575]  ? __lock_release+0x1a2/0x2c0
[ 2459.032595]  ? count_memcg_events_mm.constprop.0+0xd4/0x200
[ 2459.032609]  ? srso_return_thunk+0x5/0x5f
[ 2459.032619]  ? find_held_lock+0x32/0x90
[ 2459.032631]  ? local_clock_noinstr+0x9/0xc0
[ 2459.032646]  ? srso_return_thunk+0x5/0x5f
[ 2459.032655]  ? __lock_release+0x1a2/0x2c0
[ 2459.032675]  ? exc_page_fault+0x59/0xe0
[ 2459.032690]  ? srso_return_thunk+0x5/0x5f
[ 2459.032706]  ? srso_return_thunk+0x5/0x5f
[ 2459.032716]  ? do_user_addr_fault+0x489/0xb10
[ 2459.032731]  ? srso_return_thunk+0x5/0x5f
[ 2459.032741]  ? rcu_is_watching+0x11/0xb0
[ 2459.032753]  ? srso_return_thunk+0x5/0x5f
[ 2459.032767]  ? srso_return_thunk+0x5/0x5f
[ 2459.032789]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2459.032800] RIP: 0033:0x7faaddb03a8b
[ 2459.032812] RSP: 002b:00007ffd9c90e418 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[ 2459.032828] RAX: ffffffffffffffda RBX: 00007ffd9c90e480 RCX: 00007faaddb03a8b
[ 2459.032837] RDX: 00007ffd9c90e480 RSI: 00000000c0484e41 RDI: 0000000000000003
[ 2459.032846] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 2459.032855] R10: 0000000000008000 R11: 0000000000000246 R12: 0000000000000003
[ 2459.032863] R13: 00000000c0484e41 R14: 0000000000000000 R15: 0000000000000001
[ 2459.032936]  </TASK>
[ 2459.032945]
               Showing all locks held in the system:
[ 2459.039161] 1 lock held by khungtaskd/141:
[ 2459.043289]  #0: ffffffffa1330220 (rcu_read_lock){....}-{1:3}, at:
debug_show_all_locks+0x32/0x1c0
[ 2459.052305] 1 lock held by kworker/u67:3/365:
[ 2459.056695]  #0: ffff8881a06b11f0 (&q->limits_lock){+.+.}-{4:4},
at: nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2459.067582] 1 lock held by systemd-journal/878:
[ 2459.072155] 4 locks held by in:imjournal/1245:
[ 2459.076638] 3 locks held by kworker/u67:0/1471:
[ 2459.081197]  #0: ffff88810192e958
((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0x1109/0x1950
[ 2459.090736]  #1: ffffc900097efd40
((work_completion)(&entry->work)){+.+.}-{0:0}, at:
process_one_work+0x814/0x1950
[ 2459.101140]  #2: ffff8881a06b11f0 (&q->limits_lock){+.+.}-{4:4},
at: nvme_update_ns_info_block+0x2b0/0x19a0 [nvme_core]
[ 2459.112025] 3 locks held by kworker/u67:1/1480:
[ 2459.116590]  #0: ffff888231b52958
((wq_completion)nvme-wq){+.+.}-{0:0}, at:
process_one_work+0x1109/0x1950
[ 2459.126299]  #1: ffffc9000988fd40
((work_completion)(&ctrl->scan_work)){+.+.}-{0:0}, at:
process_one_work+0x814/0x1950
[ 2459.137046]  #2: ffff888100f20430 (&ctrl->scan_lock){+.+.}-{4:4},
at: nvme_scan_work+0x12a/0x540 [nvme_core]
[ 2459.148490] =============================================


-- 
Best Regards,
  Yi Zhang




* Re: [bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
  2025-05-29 12:41 [bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7 Yi Zhang
@ 2025-06-02  5:52 ` Christoph Hellwig
  2025-06-02  7:51   ` John Garry
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2025-06-02  5:52 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvme, Keith Busch, Jens Axboe, Alan Adamson

On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
> Hi
> 
> My regression test found this issue starting from v6.15-rc7. Please help
> check it, and let me know if you need any info or tests for it. Thanks.

Hi Yi,

The new code seems to be missing a queue_limits_cancel_update;
the patch below fixes it.  But what kind of device is this?
PCIe multi-controller subsystems aren't that common, and this
looks like a grave bug, combined with the I/O page fault, which
looks really odd.

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f69a232a000a..4bb3c68b3451 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	 * atomic write capabilities.
 	 */
 	if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
+		queue_limits_cancel_update(ns->disk->queue);
 		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
 		ret = -ENXIO;
 		goto out;
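The invariant the one-line fix restores can be sketched outside the kernel: queue_limits_start_update() takes q->limits_lock, so every call must be paired with either queue_limits_commit_update() or queue_limits_cancel_update() before the work item returns. Below is a minimal userspace model of that pairing; the toy struct, argument list, and -1 return code are hypothetical, and only the three function names mirror the real block-layer API:

```c
#include <assert.h>
#include <pthread.h>

/* Toy model of the block layer's limits-update protocol: start_update
 * takes limits_lock, and commit/cancel release it.  Not kernel code. */
struct queue {
	pthread_mutex_t limits_lock;
	int atomic_write_hw_max;
};

void queue_limits_start_update(struct queue *q)
{
	pthread_mutex_lock(&q->limits_lock);
}

void queue_limits_cancel_update(struct queue *q)
{
	pthread_mutex_unlock(&q->limits_lock);
}

void queue_limits_commit_update(struct queue *q, int new_max)
{
	q->atomic_write_hw_max = new_max;
	pthread_mutex_unlock(&q->limits_lock);
}

/* Mirrors the fixed error path: bail out on an inconsistent atomic
 * write size, but only after dropping limits_lock via cancel_update. */
int update_ns_info(struct queue *q, int ns_bs, int subsys_bs)
{
	queue_limits_start_update(q);
	if (ns_bs > subsys_bs) {
		queue_limits_cancel_update(q);	/* the call v6.15-rc7 missed */
		return -1;			/* stands in for -ENXIO */
	}
	queue_limits_commit_update(q, ns_bs);
	return 0;
}
```

With the cancel in place, a trylock on limits_lock succeeds after either outcome, which is exactly what the "1 lock held by kworker" splat shows was not true before the fix.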



* Re: [bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
  2025-06-02  5:52 ` Christoph Hellwig
@ 2025-06-02  7:51   ` John Garry
       [not found]     ` <CAHj4cs9MxW96b=a6sQOtz_DDc63uKcNX3dat-th__9D0bwRQ9g@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: John Garry @ 2025-06-02  7:51 UTC (permalink / raw)
  To: Christoph Hellwig, Yi Zhang
  Cc: linux-nvme, Keith Busch, Jens Axboe, Alan Adamson,
	martin.petersen

On 02/06/2025 06:52, Christoph Hellwig wrote:

+

> On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
>> Hi
>>
>> My regression test found this issue starting from v6.15-rc7. Please help
>> check it, and let me know if you need any info or tests for it. Thanks.
> 
> Hi Yi,
> 
> The new code seems to be missing a queue_limits_cancel_update;
> the patch below fixes it.  But what kind of device is this?
> PCIe multi-controller subsystems aren't that common, and this
> looks like a grave bug, combined with the I/O page fault, which
> looks really odd.
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f69a232a000a..4bb3c68b3451 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>   	 * atomic write capabilities.
>   	 */
>   	if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
> +		queue_limits_cancel_update(ns->disk->queue);

For that:

Reviewed-by: John Garry <john.g.garry@oracle.com>

>   		blk_mq_unfreeze_queue(ns->disk->queue, memflags);
>   		ret = -ENXIO;
>   		goto out;
> 

But for the scenario which triggers this:


[ 2313.264089] nvme nvme2: resetting controller
[ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
[ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
[ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=512 bytes

....

[ 2319.881675] nvme nvme3: rescanning namespaces.
[ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
Namespace will not be added: Subsystem=4096 bytes,
Controller/Namespace=32768 bytes
[ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
                     preempt=0x00000000 lock=0->1 RCU=0->0
workfn=async_run_entry_fn

Is it overkill to just not add the namespace? I was under the impression 
that this would be a highly unlikely scenario (of inconsistent atomic 
write sizes).
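The "lock=0->1" diagnostic quoted above comes from the workqueue core comparing the held-lock count before and after each work function runs. A standalone toy of that check (the counters and function names here are hypothetical, not the kernel's process_one_work or lockdep internals):

```c
#include <assert.h>

/* Toy lockdep: count locks held by the current "worker". */
int held_locks;

void toy_lock(void)   { held_locks++; }	/* queue_limits_start_update() */
void toy_unlock(void) { held_locks--; }	/* queue_limits_cancel_update() */

typedef void (*work_fn)(void);

/* Mirrors the workqueue sanity check: a work function must return with
 * the same number of locks held as when it started.  A nonzero delta is
 * the "workqueue leaked atomic, lock or RCU ... lock=0->1" condition. */
int run_work(work_fn fn)
{
	int before = held_locks;

	fn();
	return held_locks - before;	/* 0 = clean, nonzero = leaked */
}

/* Buggy pattern from v6.15-rc7: the error path returns with the lock held. */
void buggy_update(void)
{
	toy_lock();
	/* bails out on the size mismatch without cancel/commit */
}

/* Fixed pattern: the error path cancels the update before returning. */
void fixed_update(void)
{
	toy_lock();
	toy_unlock();
}
```

The leaked lock then explains the hung tasks in the original report: the next work item that calls into the same mutex blocks forever on a lock whose owner has already returned to the worker pool.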





* Re: [bug report]BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365] observed from v6.15-rc7
       [not found]     ` <CAHj4cs9MxW96b=a6sQOtz_DDc63uKcNX3dat-th__9D0bwRQ9g@mail.gmail.com>
@ 2025-06-11  2:06       ` Yi Zhang
  0 siblings, 0 replies; 4+ messages in thread
From: Yi Zhang @ 2025-06-11  2:06 UTC (permalink / raw)
  To: John Garry, Christoph Hellwig
  Cc: linux-nvme, Keith Busch, Jens Axboe, Alan Adamson,
	martin.petersen

On Tue, Jun 3, 2025 at 12:25 AM Yi Zhang <yi.zhang@redhat.com> wrote:
>
>
>
> On Mon, Jun 2, 2025 at 3:52 PM John Garry <john.g.garry@oracle.com> wrote:
>>
>> On 02/06/2025 06:52, Christoph Hellwig wrote:
>>
>> +
>>
>> > On Thu, May 29, 2025 at 08:41:36PM +0800, Yi Zhang wrote:
>> >> Hi
>> >>
>> >> My regression test found this issue starting from v6.15-rc7. Please help
>> >> check it, and let me know if you need any info or tests for it. Thanks.
>> >
>> > Hi Yi,
>> >
>> > The new code seems to be missing a queue_limits_cancel_update;
>> > the patch below fixes it.  But what kind of device is this?
>
>
> Yeah, the patch fixed the "BUG: workqueue leaked" issue.
> It's a Micron_9300_MTFDHAL3T8TDP NVMe disk installed in a Dell R6515 server with an AMD EPYC 7232P CPU.
>
>>
>> > PCIe multi-controller subsystems aren't that common, and this
>> > looks like a grave bug, combined with the I/O page fault, which
>> > looks really odd.
>> >
>
>
> Here are the full steps and logs I used to reproduce it:
> # nvme format -l1 -f /dev/nvme3n1
> Success formatting namespace:1
> # nvme reset /dev/nvme3
> # nvme list
> Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev
> --------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
> /dev/nvme0n1          /dev/ng0n1            S795NC0X201793       SAMSUNG MZWLO1T9HCJR-00A07               0x1          0.00   B /   1.92  TB    512   B +  0 B   OPPA4B5Q
> /dev/nvme1n1          /dev/ng1n1                  S39WNA0K201139 Dell Express Flash PM1725a 1.6TB AIC     0x1          1.60  TB /   1.60  TB    512   B +  0 B   1.2.1
> /dev/nvme2n1          /dev/ng2n1            3F50A00H0LR3         KIOXIA KCMYDRUG1T92                      0x1          0.00   B /   1.92  TB    512   B +  0 B   1UET7104
> /dev/nvme3n1          /dev/ng3n1            2135312ADFD1         Micron_9300_MTFDHAL3T8TDP                0x1        480.09  GB /   3.84  TB    512   B +  0 B   11300DY0
> /dev/nvme4n1          /dev/ng4n1            S64FNE0R802879       SAMSUNG MZQL2960HCJR-00A07               0x1        960.20  GB / 960.20  GB    512   B +  0 B   GDC5302Q
> /dev/nvme5n1          /dev/ng5n1            CVFT6011001V1P6DGN   INTEL SSDPEDMD016T4                      0x1          1.60  TB /   1.60  TB    512   B +  0 B   8DV10171
> # nvme format -l0 -f /dev/nvme3n1
> Success formatting namespace:1
> # dmesg
> [ 2390.248457] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 2398.087894] nvme nvme3: resetting controller
> [ 2401.937274] nvme nvme3: D3 entry latency set to 10 seconds
> [ 2402.008719] nvme nvme3: 16/0/0 default/read/poll queues
> [ 2402.029564] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
> [ 2446.470477] amd_iommu_report_page_fault: 3 callbacks suppressed
> [ 2446.470489] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.486293] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.496099] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.505896] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x500 flags=0x0020]
> [ 2446.515863] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.525657] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x500 flags=0x0020]
> [ 2446.535631] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.545431] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x600 flags=0x0020]
> [ 2446.555400] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x0 flags=0x0020]
> [ 2446.565224] nvme 0000:44:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0033 address=0x600 flags=0x0020]
>
>>
>>
>> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> > index f69a232a000a..4bb3c68b3451 100644
>> > --- a/drivers/nvme/host/core.c
>> > +++ b/drivers/nvme/host/core.c
>> > @@ -2388,6 +2388,7 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>> >        * atomic write capabilities.
>> >        */
>> >       if (lim.atomic_write_hw_max > ns->ctrl->subsys->atomic_bs) {
>> > +             queue_limits_cancel_update(ns->disk->queue);
>>
>> For that:
>>
>> Reviewed-by: John Garry <john.g.garry@oracle.com>
>
>
> Tested-by: Yi Zhang <yi.zhang@redhat.com>
>

Hi Christoph

Would you mind sending a formal patch for this issue? Thanks.


>>
>>
>> >               blk_mq_unfreeze_queue(ns->disk->queue, memflags);
>> >               ret = -ENXIO;
>> >               goto out;
>> >
>>
>> But for the scenario which triggers this:
>>
>>
>> [ 2313.264089] nvme nvme2: resetting controller
>> [ 2317.125038] nvme nvme2: D3 entry latency set to 10 seconds
>> [ 2317.201142] nvme nvme2: 16/0/0 default/read/poll queues
>> [ 2319.450561] nvme nvme2: nvme2n1: Inconsistent Atomic Write Size,
>> Namespace will not be added: Subsystem=4096 bytes,
>> Controller/Namespace=512 bytes
>>
>> ....
>>
>> [ 2319.881675] nvme nvme3: rescanning namespaces.
>> [ 2320.163354] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size,
>> Namespace will not be added: Subsystem=4096 bytes,
>> Controller/Namespace=32768 bytes
>> [ 2320.177588] BUG: workqueue leaked atomic, lock or RCU: kworker/u67:3[365]
>>                      preempt=0x00000000 lock=0->1 RCU=0->0
>> workfn=async_run_entry_fn
>>
>> Is it overkill to just not add the namespace? I was under the impression
>> that this would be a highly unlikely scenario (of inconsistent atomic
>> write sizes).
>>
> Yeah, although the kernel shows "Namespace will not be added:", the ns can still be seen after the format operation:
> # nvme format -l1 -f /dev/nvme3n1
> Success formatting namespace:1
> # lsblk
> nvme4n1                     259:0    0 894.3G  0 disk
> nvme2n1                     259:1    0   1.7T  0 disk
> nvme0n1                     259:2    0   1.7T  0 disk
> nvme1n1                     259:3    0   1.5T  0 disk
> nvme3n1                     259:4    0   3.5T  0 disk
> nvme5n1                     259:5    0   1.5T  0 disk
> # dmesg
> [ 3324.943476] nvme nvme3: nvme3n1: Inconsistent Atomic Write Size, Namespace will not be added: Subsystem=512 bytes, Controller/Namespace=4096 bytes
>
>
> --
> Best Regards,
>   Yi Zhang



-- 
Best Regards,
  Yi Zhang



