Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
From: "Li, Zhijian" <lizhijian@fujitsu.com>
To: Xiao Yang <yangx.jy@fujitsu.com>, <linux-rdma@vger.kernel.org>
Cc: <leon@kernel.org>, <jgg@ziepe.ca>, <rpearsonhpe@gmail.com>,
	<zyjzyj2000@gmail.com>
Subject: Re: [PATCH] RDMA/rxe: Process received packets in time
Date: Tue, 5 Jul 2022 11:43:38 +0800	[thread overview]
Message-ID: <8a034c85-3594-f965-6604-87361645016a@fujitsu.com> (raw)
In-Reply-To: <20220703155625.14497-1-yangx.jy@fujitsu.com>

on 7/3/2022 11:56 PM, Xiao Yang wrote:
> If received packets (i.e. skb) stored in qp->resp_pkts
> cannot be processed in time, they may be ovewritten/reused
> and then lead to abnormal behavior.
>
> For example, running test_qp_ex_rc_atomic_cmp_swp always
> reproduced a panic on my slow vm:
> --------------------------------------------------------
> [39867.797693] rdma_rxe: qp#17 state = GET ACK
> [39867.800667] rdma_rxe: qp#17 state = GET WQE
> [39867.800722] rdma_rxe: qp#17 state = CHECK PSN
> [39867.800739] rdma_rxe: qp#17 state = CHECK ACK
> [39867.800776] rdma_rxe: unexpected opcode
> [39867.800790] rdma_rxe: qp#17 state = ERROR
> ...
> [39867.822361] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [39867.822361] #PF: supervisor read access in kernel mode
> [39867.822361] #PF: error_code(0x0000) - not-present page
> [39867.822361] PGD 0 P4D 0
> [39867.822361] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [39867.822361] CPU: 3 PID: 19605 Comm: python3 Kdump: loaded Tainted: G        W         5.19.0-rc2+ #33
> [39867.822361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
> [39867.822361] RIP: 0010:rxe_completer+0x41f/0xd10 [rdma_rxe]
> [39867.822361] Code: 41 83 ff 0d 0f 84 a7 02 00 00 41 83 ff 0e 0f 85 da 00 00 00 48 85 db 0f 84 74 fe ff ff 48 8b 6b 08 48 8d 7b d8 be 01 00 00 00 <4c> 8b 75 00 e8 68 65 56 dd 48 8d bd a0 01 00 00 e8 8c 47 00 00 4c
> [39867.822361] RSP: 0000:ffff9eab813c7e18 EFLAGS: 00000286
> [39867.822361] RAX: 0000000000022b5a RBX: ffff8adb07efeb28 RCX: 0000000000000006
> [39867.822361] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8adb07efeb00
> [39867.822361] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
> [39867.822361] R10: 0000000000000000 R11: 0000000000011fac R12: ffff8adb06ae11a0
> [39867.822361] R13: 00000000fffffff5 R14: ffff9eab81803e80 R15: 000000000000000c
> [39867.822361] FS:  00007f87f99eb6c0(0000) GS:ffff8adb3dd00000(0000) knlGS:0000000000000000
> [39867.822361] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [39867.822361] CR2: 0000000000000000 CR3: 0000000106c3e000 CR4: 00000000000006e0
> [39867.822361] Call Trace:
> [39867.822361]  <TASK>
> [39867.822361]  ? __local_bh_enable_ip+0x83/0xf0
> [39867.822361]  rxe_do_task+0x67/0xc0 [rdma_rxe]
> [39867.822361]  tasklet_action_common.isra.0+0xe2/0x110
> [39867.822361]  __do_softirq+0xf0/0x4b5
> [39867.822361]  irq_exit_rcu+0xef/0x130
> [39867.822361]  sysvec_apic_timer_interrupt+0x40/0xc0
> [39867.822361]  asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [39867.822361] RIP: 0033:0x7f87f96fcce5
> ...
> --------------------------------------------------------
>
> The root cause is that a received skb stored in qp->resp_pkts
> has been ovewritten by another QP. Like the following logic:
> --------------------------------------------------------
>                      /* Trigger timeout */
> Requester(QP#17) -> retransmit_timer()
>                      -> rxe_completer()
>                            /* No skb (Atomic ACK) is added
>                             * to qp->resp_pkts */
>                            qp->req.need_retry = 1;
>                            rxe_run_task(&qp->req.task, 0);
>
>                      /* First skb (Atomic ACK) arrived &&
>                       * rxe_requester() is not called yet */
> Requester(QP#17) -> rxe_rcv()
>                         /* Add the skb (Atomic ACK) into qp->resp_pkts */
>                      -> rxe_comp_queue_pkt()
>                         -> rxe_completer()
>                               /* qp->req.need_retry == 1 */
>                               if (qp->req.need_retry) {
> 		                ret = -EAGAIN;
>                                  goto done;
> 		             }
>
> Responder(QP#18) -> rxe_rcv()
>                         /* The skb (Atomic ACK) has been reused
>                          * to store the Atomic Comapre_Swap request */

Well, i didn't see how/when another QP can reuse/overwrite this skb.
If this is true, i think we should fix it as well.

Thanks


>                      -> rxe_responder()
>
>                      /* rxe_requester() is called */
> Requester(QP#17) -> rxe_requester()
>                         if (unlikely(qp->req.need_retry)) {
> 		          req_retry(qp);
> 		          qp->req.need_retry = 0;
>                         }
>
> 	            /* Trigger timeout again */
> Requester(QP#17) -> retransmit_timer()
>                      -> rxe_completer()
>                            /* qp->req.need_retry == 0 */
>                            if (qp->req.need_retry) {
> 		             ret = -EAGAIN;
>                               goto done;
> 		          }
>                            /* check_ack() throws "unexpected opcode" */
>                         -> check_ack()
>                            return COMPST_ERROR;
> --------------------------------------------------------
>
> If qp->req.need_retry is set and qp->resp_pkts is not empty,
> Process received packets in time to fix the issue.
>
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_comp.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
> index da3a398053b8..8ffc874a25af 100644
> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
> @@ -580,7 +580,7 @@ int rxe_completer(void *arg)
>   		qp->comp.timeout_retry = 0;
>   	}
>   
> -	if (qp->req.need_retry) {
> +	if (qp->req.need_retry && !skb_queue_len(&qp->resp_pkts)) {
>   		ret = -EAGAIN;
>   		goto done;
>   	}



      parent reply	other threads:[~2022-07-05  3:43 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-03 15:56 [PATCH] RDMA/rxe: Process received packets in time Xiao Yang
2022-07-04 12:55 ` Jason Gunthorpe
2022-07-04 14:23   ` yangx.jy
2022-07-08  6:33     ` yangx.jy
2022-07-08  7:46       ` Pearson, Robert B
2022-07-05  3:43 ` Li, Zhijian [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a034c85-3594-f965-6604-87361645016a@fujitsu.com \
    --to=lizhijian@fujitsu.com \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=yangx.jy@fujitsu.com \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox