All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next v2] RDMA/rxe: Fix incomplete state save in rxe_requester
@ 2023-07-21 20:07 Bob Pearson
  2023-07-31 18:26 ` Jason Gunthorpe
  0 siblings, 1 reply; 2+ messages in thread
From: Bob Pearson @ 2023-07-21 20:07 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson

If a send packet is dropped by the IP layer in rxe_requester()
the call to rxe_xmit_packet() can fail with err == -EAGAIN.
To recover, the state of the wqe is restored to the state before
the packet was sent so it can be resent. However, the routines
that save and restore the state miss a significnt part of the
variable state in the wqe, the dma struct which is used to process
through the sge table. And, the state is not saved before the packet
is built which modifies the dma struct.

Under heavy stress testing with many QPs on a fast node sending
large messages to a slow node dropped packets are observed and
the resent packets are corrupted because the dma struct was not
restored. This patch fixes this behavior and allows the test cases
to succeed.

Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v2:
  Rebased to for-next

 drivers/infiniband/sw/rxe/rxe_req.c | 45 ++++++++++++++++-------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 2171f19494bc..d8c41fd626a9 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -578,10 +578,11 @@ static void save_state(struct rxe_send_wqe *wqe,
 		       struct rxe_send_wqe *rollback_wqe,
 		       u32 *rollback_psn)
 {
-	rollback_wqe->state     = wqe->state;
+	rollback_wqe->state = wqe->state;
 	rollback_wqe->first_psn = wqe->first_psn;
-	rollback_wqe->last_psn  = wqe->last_psn;
-	*rollback_psn		= qp->req.psn;
+	rollback_wqe->last_psn = wqe->last_psn;
+	rollback_wqe->dma = wqe->dma;
+	*rollback_psn = qp->req.psn;
 }
 
 static void rollback_state(struct rxe_send_wqe *wqe,
@@ -589,10 +590,11 @@ static void rollback_state(struct rxe_send_wqe *wqe,
 			   struct rxe_send_wqe *rollback_wqe,
 			   u32 rollback_psn)
 {
-	wqe->state     = rollback_wqe->state;
+	wqe->state = rollback_wqe->state;
 	wqe->first_psn = rollback_wqe->first_psn;
-	wqe->last_psn  = rollback_wqe->last_psn;
-	qp->req.psn    = rollback_psn;
+	wqe->last_psn = rollback_wqe->last_psn;
+	wqe->dma = rollback_wqe->dma;
+	qp->req.psn = rollback_psn;
 }
 
 static void update_state(struct rxe_qp *qp, struct rxe_pkt_info *pkt)
@@ -797,6 +799,9 @@ int rxe_requester(struct rxe_qp *qp)
 	pkt.mask = rxe_opcode[opcode].mask;
 	pkt.wqe = wqe;
 
+	/* save wqe state before we build and send packet */
+	save_state(wqe, qp, &rollback_wqe, &rollback_psn);
+
 	av = rxe_get_av(&pkt, &ah);
 	if (unlikely(!av)) {
 		rxe_dbg_qp(qp, "Failed no address vector\n");
@@ -829,29 +834,29 @@ int rxe_requester(struct rxe_qp *qp)
 	if (ah)
 		rxe_put(ah);
 
-	/*
-	 * To prevent a race on wqe access between requester and completer,
-	 * wqe members state and psn need to be set before calling
-	 * rxe_xmit_packet().
-	 * Otherwise, completer might initiate an unjustified retry flow.
-	 */
-	save_state(wqe, qp, &rollback_wqe, &rollback_psn);
+	/* update wqe state as though we had sent it */
 	update_wqe_state(qp, wqe, &pkt);
 	update_wqe_psn(qp, wqe, &pkt, payload);
 
 	err = rxe_xmit_packet(qp, &pkt, skb);
 	if (err) {
-		qp->need_req_skb = 1;
+		if (err != -EAGAIN) {
+			wqe->status = IB_WC_LOC_QP_OP_ERR;
+			goto err;
+		}
 
+		/* the packet was dropped so reset wqe to the state
+		 * before we sent it so we can try to resend
+		 */
 		rollback_state(wqe, qp, &rollback_wqe, rollback_psn);
 
-		if (err == -EAGAIN) {
-			rxe_sched_task(&qp->req.task);
-			goto exit;
-		}
+		/* force a delay until the dropped packet is freed and
+		 * the send queue is drained below the low water mark
+		 */
+		qp->need_req_skb = 1;
 
-		wqe->status = IB_WC_LOC_QP_OP_ERR;
-		goto err;
+		rxe_sched_task(&qp->req.task);
+		goto exit;
 	}
 
 	update_state(qp, &pkt);

base-commit: b3d2b014b259ba758d72d7026685091bde1cf2d6
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH for-next v2] RDMA/rxe: Fix incomplete state save in rxe_requester
  2023-07-21 20:07 [PATCH for-next v2] RDMA/rxe: Fix incomplete state save in rxe_requester Bob Pearson
@ 2023-07-31 18:26 ` Jason Gunthorpe
  0 siblings, 0 replies; 2+ messages in thread
From: Jason Gunthorpe @ 2023-07-31 18:26 UTC (permalink / raw)
  To: Bob Pearson; +Cc: leon, zyjzyj2000, jhack, linux-rdma

On Fri, Jul 21, 2023 at 03:07:49PM -0500, Bob Pearson wrote:
> If a send packet is dropped by the IP layer in rxe_requester()
> the call to rxe_xmit_packet() can fail with err == -EAGAIN.
> To recover, the state of the wqe is restored to the state before
> the packet was sent so it can be resent. However, the routines
> that save and restore the state miss a significnt part of the
> variable state in the wqe, the dma struct which is used to process
> through the sge table. And, the state is not saved before the packet
> is built which modifies the dma struct.
> 
> Under heavy stress testing with many QPs on a fast node sending
> large messages to a slow node dropped packets are observed and
> the resent packets are corrupted because the dma struct was not
> restored. This patch fixes this behavior and allows the test cases
> to succeed.
> 
> Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> v2:
>   Rebased to for-next
> 
>  drivers/infiniband/sw/rxe/rxe_req.c | 45 ++++++++++++++++-------------
>  1 file changed, 25 insertions(+), 20 deletions(-)

Applied to for-next, thanks

Jason

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-07-31 18:26 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-21 20:07 [PATCH for-next v2] RDMA/rxe: Fix incomplete state save in rxe_requester Bob Pearson
2023-07-31 18:26 ` Jason Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.