linux-rdma.vger.kernel.org archive mirror
* [PATCH for-rc] RDMA/irdma: Fix drain SQ hang with no completion
@ 2022-08-24 15:43 Shiraz Saleem
  2022-08-28  9:44 ` Leon Romanovsky
  0 siblings, 1 reply; 2+ messages in thread
From: Shiraz Saleem @ 2022-08-24 15:43 UTC
  To: jgg, leon; +Cc: linux-rdma, Shiraz Saleem, Kamal Heib

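Software-generated completions for outstanding WRs posted on the SQ
after the QP enters the error state are signaled on the wrong CQ: the
completion handler is invoked for the receive CQ instead of the send
CQ. As a result, ib_drain_sq() hangs waiting for a completion that is
never delivered.

Fix this by invoking the completion handler on the send CQ.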

[  863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds.
[  863.979224]       Not tainted 5.14.0-130.el9.x86_64 #1
[  863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  863.996997] task:kworker/u52:2   state:D stack:    0 pid:  671 ppid:     2 flags:0x00004000
[  864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc]
[  864.014056] Call Trace:
[  864.017575]  __schedule+0x206/0x580
[  864.022296]  schedule+0x43/0xa0
[  864.026736]  schedule_timeout+0x115/0x150
[  864.032185]  __wait_for_common+0x93/0x1d0
[  864.037717]  ? usleep_range_state+0x90/0x90
[  864.043368]  __ib_drain_sq+0xf6/0x170 [ib_core]
[  864.049371]  ? __rdma_block_iter_next+0x80/0x80 [ib_core]
[  864.056240]  ib_drain_sq+0x66/0x70 [ib_core]
[  864.062003]  rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma]
[  864.069365]  ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc]
[  864.076386]  xprt_rdma_close+0xe/0x30 [rpcrdma]
[  864.082593]  xprt_autoclose+0x52/0x100 [sunrpc]
[  864.088718]  process_one_work+0x1e8/0x3c0
[  864.094170]  worker_thread+0x50/0x3b0
[  864.099109]  ? rescuer_thread+0x370/0x370
[  864.104473]  kthread+0x149/0x170
[  864.109022]  ? set_kthread_struct+0x40/0x40
[  864.114713]  ret_from_fork+0x22/0x30

Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error")
Reported-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/hw/irdma/utils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c
index fdf4cc8..c8b9235 100644
--- a/drivers/infiniband/hw/irdma/utils.c
+++ b/drivers/infiniband/hw/irdma/utils.c
@@ -2598,7 +2598,7 @@ void irdma_generate_flush_completions(struct irdma_qp *iwqp)
 		spin_unlock_irqrestore(&iwqp->lock, flags2);
 		spin_unlock_irqrestore(&iwqp->iwscq->lock, flags1);
 		if (compl_generated)
-			irdma_comp_handler(iwqp->iwrcq);
+			irdma_comp_handler(iwqp->iwscq);
 	} else {
 		spin_unlock_irqrestore(&iwqp->iwscq->lock, flags1);
 		mod_delayed_work(iwqp->iwdev->cleanup_wq, &iwqp->dwork_flush,
-- 
1.8.3.1
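
The failure mode in the one-line change above can be illustrated with a small
standalone model. This is only a sketch: the toy_* types and functions are
hypothetical and are not the irdma or ib_core structures. It assumes a consumer
that polls a CQ only after that CQ's completion handler has fired, which is why
notifying the receive CQ leaves a completion queued on the send CQ unnoticed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types, not the irdma structures. */
struct toy_cq {
	int pending;    /* completions queued but not yet polled */
	bool notified;  /* set when the completion handler is invoked */
};

struct toy_qp {
	struct toy_cq *scq;  /* CQ bound to the send queue */
	struct toy_cq *rcq;  /* CQ bound to the receive queue */
};

/* Stand-in for a completion handler: mark the CQ as needing a poll. */
static void toy_comp_handler(struct toy_cq *cq)
{
	cq->notified = true;
}

/* Queue one flush completion on the SQ's CQ, then notify the given CQ. */
static void toy_flush_sq(struct toy_qp *qp, struct toy_cq *notify)
{
	qp->scq->pending++;
	toy_comp_handler(notify);
}

/* A drain waiter makes progress only if the send CQ itself was notified. */
static bool toy_drain_sq_done(struct toy_qp *qp)
{
	struct toy_cq *cq = qp->scq;

	if (!cq->notified || cq->pending == 0)
		return false;
	cq->pending--;
	cq->notified = false;
	return true;
}

/* Walk through the buggy and the fixed notification target. */
static int toy_demo(void)
{
	struct toy_cq scq = { 0, false }, rcq = { 0, false };
	struct toy_qp qp = { &scq, &rcq };

	toy_flush_sq(&qp, qp.rcq);        /* buggy: wrong CQ is notified */
	assert(!toy_drain_sq_done(&qp));  /* waiter would keep sleeping */

	toy_flush_sq(&qp, qp.scq);        /* fixed: send CQ is notified */
	assert(toy_drain_sq_done(&qp));   /* waiter sees its completion */
	return 0;
}
```

Calling toy_demo() from a test harness exercises both cases. In the model the
completion exists in both runs; only the notification target differs, which
mirrors the iwrcq to iwscq change in the hunk.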


* Re: [PATCH for-rc] RDMA/irdma: Fix drain SQ hang with no completion
  2022-08-24 15:43 [PATCH for-rc] RDMA/irdma: Fix drain SQ hang with no completion Shiraz Saleem
@ 2022-08-28  9:44 ` Leon Romanovsky
  0 siblings, 0 replies; 2+ messages in thread
From: Leon Romanovsky @ 2022-08-28  9:44 UTC
  To: Shiraz Saleem; +Cc: jgg, linux-rdma, Kamal Heib

On Wed, Aug 24, 2022 at 10:43:59AM -0500, Shiraz Saleem wrote:
> SW generated completions for outstanding WRs posted on SQ
> after QP is in error target the wrong CQ. This causes the
> ib_drain_sq to hang with no completion.
> 
> Fix this to generate completions on the right CQ.
> 
> [...]
> 
> Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error")
> Reported-by: Kamal Heib <kamalheib1@gmail.com>
> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
> ---
>  drivers/infiniband/hw/irdma/utils.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Thanks, applied.
