* [bug report] blktests nvme/tcp nvme/060 hang

From: Yi Zhang @ 2025-08-06 1:57 UTC
To: linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

Hello,
I hit this issue while running blktests nvme/tcp nvme/060 on the latest
linux-block/for-next with RT enabled. Please help check it, and let me
know if you need any further info or testing. Thanks.

[ 195.356104] run blktests nvme/060 at 2025-08-06 01:50:45
[ 195.578659] loop0: detected capacity change from 0 to 2097152
[ 195.620986] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 195.681508] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 195.904991] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 195.919199] nvme nvme1: creating 80 I/O queues.
[ 196.035295] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 196.175655] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 196.195731] nvme (3558) used greatest stack depth: 19248 bytes left
[ 197.786460] nvmet: ctrl 1 fatal error occurred!
[ 197.789337] nvme nvme1: starting error recovery
[ 197.932751] nvme nvme1: Reconnecting in 1 seconds...
[ 198.980348] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 198.985848] nvme nvme1: creating 80 I/O queues.
[ 199.318924] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 199.423818] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 199.556572] nvme nvme1: Failed reconnect attempt 1/600
[ 200.075342] systemd-udevd (3501) used greatest stack depth: 17888 bytes left
[ 200.496781] nvme nvme1: Property Set error: 880, offset 0x14
[ 200.977968] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 200.985964] nvme nvme1: creating 80 I/O queues.
[ 201.076602] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 201.189249] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 201.206511] nvme (3741) used greatest stack depth: 17712 bytes left
[ 201.867978] nvmet: ctrl 1 fatal error occurred!
[ 201.868127] nvme nvme1: starting error recovery
[ 201.986566] nvme nvme1: Reconnecting in 1 seconds...
[ 203.049519] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 203.056878] nvme nvme1: creating 80 I/O queues.
[ 203.399239] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 203.569429] nvme nvme1: Successfully reconnected (attempt 1/600)
[ 203.918129] nvmet: ctrl 1 fatal error occurred!
[ 203.918261] nvme nvme1: starting error recovery
[ 204.026795] nvme nvme1: Reconnecting in 1 seconds...
[ 204.423104] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 205.336507] nvme nvme1: Property Set error: 880, offset 0x14
[ 205.855722] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 205.861038] nvme nvme1: creating 80 I/O queues.
[ 205.944537] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 205.965862] nvmet: ctrl 1 fatal error occurred!
[ 206.053046] nvme nvme1: starting error recovery
[ 206.147337] nvme nvme1: Failed to configure AEN (cfg 900)
[ 206.147495] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 206.176163] nvme nvme1: Reconnecting in 1 seconds...
[ 207.208170] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 207.211901] nvme nvme1: creating 80 I/O queues.
[ 207.336799] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 207.349663] nvme nvme1: Successfully reconnected (attempt 1/600)
[ 208.017068] nvmet: ctrl 1 fatal error occurred!
[ 208.017201] nvme nvme1: starting error recovery
[ 208.126889] nvme nvme1: Reconnecting in 1 seconds...
[ 209.199440] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 209.203685] nvme nvme1: creating 80 I/O queues.
[ 209.382909] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 209.503420] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 209.706412] nvme nvme1: Failed reconnect attempt 1/600
[ 210.617622] nvme nvme1: Property Set error: 880, offset 0x14
[ 211.158138] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 211.163794] nvme nvme1: creating 80 I/O queues.
[ 211.245927] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 211.350477] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 212.097987] nvmet: ctrl 1 fatal error occurred!
[ 212.098122] nvme nvme1: starting error recovery
[ 212.217345] nvme nvme1: Reconnecting in 1 seconds...
[ 213.280774] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 213.284617] nvme nvme1: creating 80 I/O queues.
[ 213.625320] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 213.787167] nvme nvme1: Successfully reconnected (attempt 1/600)
[ 214.147984] nvmet: ctrl 1 fatal error occurred!
[ 214.148113] nvme nvme1: starting error recovery
[ 214.248455] nvme nvme1: Reconnecting in 1 seconds...
[ 214.604818] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 215.431717] nvme nvme1: Property Set error: 880, offset 0x14
[ 216.014626] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 216.020186] nvme nvme1: creating 80 I/O queues.
[ 216.101613] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 216.195783] nvmet: ctrl 1 fatal error occurred!
[ 216.536582] nvme nvme1: failed to send request -32
[ 216.537991] nvme nvme1: failed nvme_keep_alive_end_io error=4
[ 280.196270] nvme nvme1: I/O tag 0 (0000) type 4 opcode 0x7f (Fabrics Cmd) QID 80 timeout
[ 280.198058] nvme nvme1: Connect command failed, error wo/DNR bit: 881
[ 280.198353] nvme nvme1: failed to connect queue: 80 ret=881
[ 285.369827] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 285.375603] nvme nvme1: creating 80 I/O queues.
[ 285.449676] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 285.554119] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 287.344206] nvmet: ctrl 1 fatal error occurred!
[ 287.344349] nvme nvme1: starting error recovery
[ 287.466244] nvme nvme1: Reconnecting in 1 seconds...
[ 288.479300] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 288.483283] nvme nvme1: creating 80 I/O queues.
[ 288.785415] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 288.856766] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 289.066215] nvme nvme1: Failed reconnect attempt 1/600
[ 290.116265] nvme nvme1: Property Set error: 880, offset 0x14
[ 320.802045] loop: module loaded
[ 321.017366] run blktests nvme/060 at 2025-08-06 01:52:51
[ 321.220243] loop0: detected capacity change from 0 to 2097152
[ 321.262034] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 321.322296] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 321.511316] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 321.524993] nvme nvme1: creating 80 I/O queues.
[ 321.636336] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 321.754764] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 323.431491] nvmet: ctrl 1 fatal error occurred!
[ 323.434219] nvme nvme1: starting error recovery
[ 323.566948] nvme nvme1: Reconnecting in 1 seconds...
[ 324.653738] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349.
[ 324.662270] nvme nvme1: creating 80 I/O queues.
[ 324.993712] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 325.052085] nvme nvme1: mapped 80/0/0 default/read/poll queues.
[ 325.481710] nvmet: ctrl 1 fatal error occurred!
[ 327.286111] nvmet: ctrl 1 keep-alive timer (1 seconds) expired!
[ 390.435380] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 390.441500] rcu: Tasks blocked on level-1 rcu_node (CPUs 0-15): P4784/2:b..l P1049/8:b..l
[ 390.449771] rcu: (detected by 13, t=6503 jiffies, g=8845, q=2243 ncpus=80)
[ 390.456724] task:kworker/u329:1 state:D stack:19264 pid:1049 tgid:1049 ppid:2 task_flags:0x4208060 flags:0x00000010
[ 390.467850] Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
[ 390.474378] Call trace:
[ 390.476813] __switch_to+0x1d8/0x330 (T)
[ 390.480731] __schedule+0x860/0x1c30
[ 390.484297] schedule_rtlock+0x30/0x70
[ 390.488037] rtlock_slowlock_locked+0x464/0xf60
[ 390.492559] rt_read_lock+0x2bc/0x3e0
[ 390.496211] nvmet_tcp_listen_data_ready+0x3c/0x118 [nvmet_tcp]
[ 390.502125] nvmet_tcp_data_ready+0x88/0x198 [nvmet_tcp]
[ 390.507429] tcp_data_ready+0xdc/0x3e0
[ 390.511171] tcp_data_queue+0xe30/0x1e08
[ 390.515084] tcp_rcv_established+0x710/0x2328
[ 390.519432] tcp_v4_do_rcv+0x554/0x940
[ 390.523172] tcp_v4_rcv+0x1ec4/0x3078
[ 390.526825] ip_protocol_deliver_rcu+0xc0/0x3f0
[ 390.531347] ip_local_deliver_finish+0x2d4/0x5c0
[ 390.535954] ip_local_deliver+0x17c/0x3c0
[ 390.539953] ip_rcv_finish+0x148/0x238
[ 390.543693] ip_rcv+0xd0/0x2e0
[ 390.546737] __netif_receive_skb_one_core+0x100/0x180
[ 390.551780] __netif_receive_skb+0x2c/0x160
[ 390.555953] process_backlog+0x230/0x6f8
[ 390.559866] __napi_poll.constprop.0+0x9c/0x3e8
[ 390.564386] net_rx_action+0x808/0xb50
[ 390.568125] handle_softirqs.constprop.0+0x23c/0xca0
[ 390.573082] __local_bh_enable_ip+0x260/0x4f0
[ 390.577429] __dev_queue_xmit+0x6f4/0x1bd8
[ 390.581515] neigh_hh_output+0x190/0x2c0
[ 390.585429] ip_finish_output2+0x7c8/0x1788
[ 390.589602] __ip_finish_output+0x2c4/0x4f8
[ 390.593776] ip_finish_output+0x3c/0x2a8
[ 390.597689] ip_output+0x154/0x418
[ 390.601081] __ip_queue_xmit+0x580/0x1108
[ 390.605081] ip_queue_xmit+0x4c/0x78
[ 390.608647] __tcp_transmit_skb+0xfac/0x24e8
[ 390.612908] tcp_write_xmit+0xbec/0x3078
[ 390.616821] __tcp_push_pending_frames+0x90/0x2b8
[ 390.621515] tcp_send_fin+0x108/0x9a8
[ 390.625167] tcp_shutdown+0xd8/0xf8
[ 390.628646] inet_shutdown+0x14c/0x2e8
[ 390.632385] kernel_sock_shutdown+0x5c/0x98
[ 390.636560] __nvme_tcp_stop_queue+0x44/0x220 [nvme_tcp]
[ 390.641865] nvme_tcp_stop_queue_nowait+0x130/0x200 [nvme_tcp]
[ 390.647691] nvme_tcp_setup_ctrl+0x3bc/0x728 [nvme_tcp]
[ 390.652909] nvme_tcp_reconnect_ctrl_work+0x78/0x290 [nvme_tcp]
[ 390.658822] process_one_work+0x80c/0x1a78
[ 390.662910] worker_thread+0x6d0/0xaa8
[ 390.666650] kthread+0x304/0x3a0
[ 390.669869] ret_from_fork+0x10/0x20
[ 390.673437] task:kworker/u322:77 state:D stack:27184 pid:4784 tgid:4784 ppid:2 task_flags:0x4208060 flags:0x00000210
[ 390.684562] Workqueue: nvmet-wq nvmet_tcp_release_queue_work [nvmet_tcp]
[ 390.691256] Call trace:
[ 390.693692] __switch_to+0x1d8/0x330 (T)
[ 390.697605] __schedule+0x860/0x1c30
[ 390.701171] schedule_rtlock+0x30/0x70
[ 390.704911] rwbase_write_lock.constprop.0.isra.0+0x2fc/0xb30
[ 390.710646] rt_write_lock+0x9c/0x138
[ 390.714299] nvmet_tcp_release_queue_work+0x168/0xb48 [nvmet_tcp]
[ 390.720384] process_one_work+0x80c/0x1a78
[ 390.724470] worker_thread+0x6d0/0xaa8
[ 390.728210] kthread+0x304/0x3a0
[ 390.731428] ret_from_fork+0x10/0x20

--
Best Regards,
Yi Zhang

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Maurizio Lombardi @ 2025-08-06 6:16 UTC
To: Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On Wed Aug 6, 2025 at 3:57 AM CEST, Yi Zhang wrote:
> Hello,
> I hit this issue while running blktests nvme/tcp nvme/060 on the latest
> linux-block/for-next with RT enabled. Please help check it, and let me
> know if you need any further info or testing. Thanks.
>
> [ 390.474378] Call trace:
> [ 390.476813] __switch_to+0x1d8/0x330 (T)
> [ 390.480731] __schedule+0x860/0x1c30
> [ 390.484297] schedule_rtlock+0x30/0x70
> [ 390.488037] rtlock_slowlock_locked+0x464/0xf60
> [ 390.492559] rt_read_lock+0x2bc/0x3e0
> [ 390.496211] nvmet_tcp_listen_data_ready+0x3c/0x118 [nvmet_tcp]
> [ 390.502125] nvmet_tcp_data_ready+0x88/0x198 [nvmet_tcp]

I think the problem is that nvmet_tcp_data_ready() calls the
queue->data_ready() callback with sk_callback_lock held.
The data_ready callback points to nvmet_tcp_listen_data_ready(),
which tries to take the same sk_callback_lock, hence the deadlock.

Maybe it can be fixed by deferring the call to queue->data_ready()
to a workqueue.

Maurizio

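For illustration, a minimal sketch of the deferral idea above. This is a
hypothetical sketch, not the actual driver code: the data_ready_work member
and its handler do not exist in nvmet_tcp; only nvmet_wq, queue->sock, and
the queue->data_ready pointer come from the existing code, and everything
else the real callback does is omitted.

/*
 * Hypothetical sketch: run the nested queue->data_ready() callback
 * from nvmet_wq instead of under sk_callback_lock. Assumes a new
 * data_ready_work member in struct nvmet_tcp_queue, set up at queue
 * creation with INIT_WORK(&queue->data_ready_work,
 * nvmet_tcp_data_ready_work).
 */
static void nvmet_tcp_data_ready_work(struct work_struct *w)
{
	struct nvmet_tcp_queue *queue =
		container_of(w, struct nvmet_tcp_queue, data_ready_work);

	/* Process context, sk_callback_lock not held: no nested locking. */
	if (queue->data_ready)
		queue->data_ready(queue->sock->sk);
}

static void nvmet_tcp_data_ready(struct sock *sk)
{
	struct nvmet_tcp_queue *queue;

	trace_sk_data_ready(sk);

	read_lock_bh(&sk->sk_callback_lock);
	queue = sk->sk_user_data;
	if (likely(queue))
		/* Defer instead of invoking queue->data_ready() inline. */
		queue_work(nvmet_wq, &queue->data_ready_work);
	read_unlock_bh(&sk->sk_callback_lock);
}

One caveat with this approach: the work item would have to be flushed or
cancelled before the queue and its socket are torn down, otherwise the
deferred callback could run against freed memory.
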
* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Maurizio Lombardi @ 2025-08-06 6:22 UTC
To: Maurizio Lombardi, Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On Wed Aug 6, 2025 at 8:16 AM CEST, Maurizio Lombardi wrote:
> On Wed Aug 6, 2025 at 3:57 AM CEST, Yi Zhang wrote:
>> [...]
>
> I think the problem is that nvmet_tcp_data_ready() calls the
> queue->data_ready() callback with sk_callback_lock held.
> The data_ready callback points to nvmet_tcp_listen_data_ready(),
> which tries to take the same sk_callback_lock, hence the deadlock.
>
> Maybe it can be fixed by deferring the call to queue->data_ready()
> to a workqueue.

Oops, sorry: those are two read locks. The real problem, then, is that
something is holding the write lock.

Maurizio

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Maurizio Lombardi @ 2025-08-06 6:44 UTC
To: Maurizio Lombardi, Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On Wed Aug 6, 2025 at 8:22 AM CEST, Maurizio Lombardi wrote:
> On Wed Aug 6, 2025 at 8:16 AM CEST, Maurizio Lombardi wrote:
>> [...]
>>
>> Maybe it can be fixed by deferring the call to queue->data_ready()
>> to a workqueue.
>
> Oops, sorry: those are two read locks. The real problem, then, is that
> something is holding the write lock.

OK, I think I see what happens now.

The thread that calls nvmet_tcp_data_ready() (which takes the read lock
twice) and the one that calls nvmet_tcp_release_queue_work() (which
tries to take the write lock) are blocking each other.
So I still think that deferring the call to queue->data_ready() to a
workqueue should fix it.

Maurizio

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Maurizio Lombardi @ 2025-08-06 9:30 UTC
To: Maurizio Lombardi, Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On Wed Aug 6, 2025 at 8:44 AM CEST, Maurizio Lombardi wrote:
> On Wed Aug 6, 2025 at 8:22 AM CEST, Maurizio Lombardi wrote:
>> Oops, sorry: those are two read locks. The real problem, then, is that
>> something is holding the write lock.
>
> OK, I think I see what happens now.
>
> The thread that calls nvmet_tcp_data_ready() (which takes the read lock
> twice) and the one that calls nvmet_tcp_release_queue_work() (which
> tries to take the write lock) are blocking each other.
> So I still think that deferring the call to queue->data_ready() to a
> workqueue should fix it.

I reproduced the issue by creating a reader thread that tries to take
the lock twice and a writer thread that takes the write lock between the
two calls to read_lock():

[ 33.398311] [Reader] Thread started.
[ 33.398410] [Writer] Thread started, waiting for reader to get lock...
[ 33.398577] [Reader] Acquired read_lock successfully.
[ 33.399391] [Reader] Sleeping for a while to allow writer to block...
[ 33.418697] [Writer] Reader has the lock. Attempting to acquire write_lock... THIS SHOULD BLOCK.
[ 41.288105] [Reader] Attempting to acquire a second read_lock... THIS SHOULD BLOCK.
[ 93.388349] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 93.388758] rcu: 7-....: (5999 ticks this GP) idle=9db4/1/0x4000000000000000 softirq=1846/1846 fqs=2444
[ 93.389390] rcu: (t=6001 jiffies g=1917 q=4319 ncpus=8)
[ 93.389745] CPU: 7 UID: 0 PID: 1784 Comm: reader_thread Kdump: loaded Tainted: G OEL ------- --- 6.12.0-116.el10.aarch64 #1 PREEMPT(voluntary)
[ 93.389749] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE, [L]=SOFTLOCKUP
[ 93.389749] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 93.389750] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 93.389752] pc : queued_spin_lock_slowpath+0x78/0x460
[ 93.389757] lr : queued_read_lock_slowpath+0x21c/0x228
[ 93.389759] sp : ffff80008bd6bdd0
[ 93.389760] x29: ffff80008bd6bdd0 x28: 0000000000000000 x27: 0000000000000000
[ 93.389762] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 93.389764] x23: ffffb1c374605008 x22: ffff0000ca9342c0 x21: ffff80008bafb960
[ 93.389766] x20: ffff0000c4735e40 x19: ffffb1c37460701c x18: 0000000000000006
[ 93.389767] x17: 444c554f48532053 x16: ffffb1c3ee73ab48 x15: 636f6c5f64616572
[ 93.389769] x14: 20646e6f63657320 x13: 2e4b434f4c422044 x12: ffffb1c3eff5ec10
[ 93.389771] x11: ffffb1c3efc9ec68 x10: ffffb1c3eff5ec68 x9 : ffffb1c3ee73b4c4
[ 93.389772] x8 : 0000000000000001 x7 : 00000000000bffe8 x6 : c0000000ffff7fff
[ 93.389774] x5 : ffff00112ebe05c8 x4 : 0000000000000000 x3 : 0000000000000000
[ 93.389776] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000001
[ 93.389778] Call trace:
[ 93.389779] queued_spin_lock_slowpath+0x78/0x460 (P)
[ 93.389782] queued_read_lock_slowpath+0x21c/0x228
[ 93.389785] _raw_read_lock+0x60/0x80
[ 93.389787] reader_thread_fn+0x7c/0xc0 [dead]
[ 93.389791] kthread+0x110/0x130
[ 93.389794] ret_from_fork+0x10/0x20

So apparently, under contention, writers take precedence.

Note that the same problem may also affect nvmet_tcp_write_space().

Maurizio

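A minimal sketch of a reproducer along these lines (a hypothetical
standalone test module, not Maurizio's actual code; mdelay() is used as a
busy-wait because rwlock_t readers must not sleep on non-RT kernels):

// SPDX-License-Identifier: GPL-2.0
/*
 * Hypothetical sketch: nested read_lock() racing with a pending writer.
 * Once the writer is queued, the fair (queued) rwlock makes the second
 * read_lock() wait behind it, so reader and writer deadlock, matching
 * the queued_read_lock_slowpath stall in the log above.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/spinlock.h>

static DEFINE_RWLOCK(test_lock);

static int reader_thread_fn(void *data)
{
	pr_info("[Reader] Thread started.\n");
	read_lock(&test_lock);
	pr_info("[Reader] Acquired read_lock successfully.\n");
	pr_info("[Reader] Waiting a while to allow the writer to block...\n");
	mdelay(8000);		/* busy-wait; we hold a spinning lock */
	pr_info("[Reader] Attempting a second read_lock... THIS SHOULD BLOCK.\n");
	read_lock(&test_lock);	/* deadlocks once the writer is queued */
	read_unlock(&test_lock);
	read_unlock(&test_lock);
	return 0;
}

static int writer_thread_fn(void *data)
{
	pr_info("[Writer] Thread started, waiting for reader to get lock...\n");
	msleep(20);
	pr_info("[Writer] Attempting to acquire write_lock... THIS SHOULD BLOCK.\n");
	write_lock(&test_lock);	/* waits behind the reader's first read_lock */
	write_unlock(&test_lock);
	return 0;
}

static int __init rwlock_deadlock_init(void)
{
	kthread_run(reader_thread_fn, NULL, "reader_thread");
	kthread_run(writer_thread_fn, NULL, "writer_thread");
	return 0;
}
module_init(rwlock_deadlock_init);

static void __exit rwlock_deadlock_exit(void)
{
}
module_exit(rwlock_deadlock_exit);

MODULE_LICENSE("GPL");
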
* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Hannes Reinecke @ 2025-08-06 10:54 UTC
To: Maurizio Lombardi, Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On 8/6/25 08:44, Maurizio Lombardi wrote:
> On Wed Aug 6, 2025 at 8:22 AM CEST, Maurizio Lombardi wrote:
>> [...]
>> Oops, sorry: those are two read locks. The real problem, then, is that
>> something is holding the write lock.
>
> OK, I think I see what happens now.
>
> The thread that calls nvmet_tcp_data_ready() (which takes the read lock
> twice) and the one that calls nvmet_tcp_release_queue_work() (which
> tries to take the write lock) are blocking each other.
> So I still think that deferring the call to queue->data_ready() to a
> workqueue should fix it.
>
It's nvmet_tcp_listen_data_ready() which is the problem. The thing is,
we only need to take the lock to access 'sk_user_data' (as it might be
modified while the callback is running). But the 'sk_state' value can
be accessed without a lock, and since we need to look at sk_user_data
only if the socket is in the TCP_LISTEN state (which I hope is not the
case during socket shutdown), we can move the check out of the lock and
avoid this issue.

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Hannes Reinecke @ 2025-08-06 10:52 UTC
To: Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On 8/6/25 03:57, Yi Zhang wrote:
> Hello,
> I hit this issue while running blktests nvme/tcp nvme/060 on the latest
> linux-block/for-next with RT enabled. Please help check it, and let me
> know if you need any further info or testing. Thanks.
>
> [full dmesg log and RCU stall traces trimmed; see the report above]

Can you check if this fixes the problem?
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 688033b88d38..23bdce8dcdf0 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1991,15 +1991,13 @@ static void nvmet_tcp_listen_data_ready(struct sock *sk)
 	struct nvmet_tcp_port *port;
 
 	trace_sk_data_ready(sk);
+	if (sk->sk_state != TCP_LISTEN)
+		return;
 
 	read_lock_bh(&sk->sk_callback_lock);
 	port = sk->sk_user_data;
-	if (!port)
-		goto out;
-
-	if (sk->sk_state == TCP_LISTEN)
+	if (port)
 		queue_work(nvmet_wq, &port->accept_work);
-out:
 	read_unlock_bh(&sk->sk_callback_lock);
 }

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Yi Zhang @ 2025-08-07 1:57 UTC
To: Hannes Reinecke
Cc: linux-block, open list:NVM EXPRESS DRIVER, Shinichiro Kawasaki, Maurizio Lombardi

On Wed, Aug 6, 2025 at 6:52 PM Hannes Reinecke <hare@suse.de> wrote:
>
> On 8/6/25 03:57, Yi Zhang wrote:
> > Hello,
> > I hit this issue while running blktests nvme/tcp nvme/060 on the latest
> > linux-block/for-next with RT enabled. Please help check it, and let me
> > know if you need any further info or testing. Thanks.
> >
> > [full dmesg log and RCU stall traces trimmed]
>
> Can you check if this fixes the problem?

The issue is fixed by the patch now; feel free to add:

Tested-by: Yi Zhang <yi.zhang@redhat.com>

> [quoted patch and signature trimmed; see Hannes's message above]

--
Best Regards,
Yi Zhang

* Re: [bug report] blktests nvme/tcp nvme/060 hang

From: Maurizio Lombardi @ 2025-08-11 14:10 UTC
To: Hannes Reinecke, Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
Cc: Shinichiro Kawasaki, Maurizio Lombardi

On Wed Aug 6, 2025 at 12:52 PM CEST, Hannes Reinecke wrote:
> On 8/6/25 03:57, Yi Zhang wrote:
>> [quoted stall traces trimmed]
>
> Can you check if this fixes the problem?
>
> [patch for drivers/nvme/target/tcp.c trimmed; see above]

Hannes, are you going to send a formal patch?
In case you missed it, the patch is confirmed to work.

Thanks,
Maurizio