public inbox for linux-rdma@vger.kernel.org
From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Devesh Sharma
	<Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org>
Cc: Linux NFS Mailing List
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Trond Myklebust
	<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
Date: Thu, 10 Apr 2014 10:01:01 -0500
Message-ID: <5346B22D.3060706@opengridcomputing.com>
In-Reply-To: <E66D006A-0D04-4602-8BF5-6834CACD2E24-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

On 4/9/2014 7:26 PM, Chuck Lever wrote:
> On Apr 9, 2014, at 7:56 PM, Devesh Sharma <Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org> wrote:
>
>> Hi Chuck and Trond
>>
>> I will resend a v2 for this.
>> What if ib_post_send() fails with an immediate error? In that case DECR_CQCOUNT() will also have been called, but no completion will be reported. Will that not cause any problems?
> We should investigate whether an error return from ib_post_{send,recv} means there will be no completion. But I’ve never seen these verbs fail in practice, so I’m not in a hurry to make work for anyone! ;-)

A synchronous failure from ib_post_* means the WR (or at least one of
them, if more than one was posted) failed and was not submitted to HW.
So there will be no completion for the WRs that failed.
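
To make the accounting consequence concrete, here is a minimal userspace
sketch. It is not the kernel verbs API: struct mock_wr, struct mock_qp,
mock_post_send() and post_and_account() are hypothetical stand-ins. It
assumes the usual verbs convention that a failed post stops at the first
bad WR and reports it through bad_wr, while the WRs ahead of it in the
chain have already been accepted. Under that assumption, only the WRs
before bad_wr should be counted as owing a completion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel verbs API. */
struct mock_wr {
	struct mock_wr *next;
	int valid;		/* 0 makes the mock post fail at this WR */
};

struct mock_qp {
	int posted;		/* WRs accepted by the mock "hardware" */
};

/*
 * Assumed contract, modelled on ib_post_send(): walk the chain, stop at
 * the first WR that cannot be posted, and report it through bad_wr.
 */
static int mock_post_send(struct mock_qp *qp, struct mock_wr *wr,
			  struct mock_wr **bad_wr)
{
	for (; wr; wr = wr->next) {
		if (!wr->valid) {
			*bad_wr = wr;
			return -EINVAL;
		}
		qp->posted++;
	}
	return 0;
}

/* Count the WRs in the chain that precede 'stop' (all of them if NULL). */
static int count_until(const struct mock_wr *wr, const struct mock_wr *stop)
{
	int n = 0;

	for (; wr && wr != stop; wr = wr->next)
		n++;
	return n;
}

/*
 * Post a chain and return how many completions the caller should expect.
 * The post status is stored in *rc.
 */
static int post_and_account(struct mock_qp *qp, struct mock_wr *chain, int *rc)
{
	struct mock_wr *bad_wr = NULL;

	*rc = mock_post_send(qp, chain, &bad_wr);
	if (*rc)
		assert(bad_wr != NULL);	/* a failed post must name the bad WR */
	return count_until(chain, *rc ? bad_wr : NULL);
}
```

With a two-WR chain whose second WR is invalid, post_and_account()
reports one expected completion and a nonzero rc. A caller that had
already decremented its counter for both WRs would be waiting for a
completion that never arrives.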

> However it seems to me the new (!ia->ri_id->qp) checks outside the connect logic are unnecessary.
>
> Clearly, as you noticed, the ib_post_{send,recv} verbs do not check that their “qp” argument is NULL before dereferencing it.
>
> But I don’t understand how xprtrdma can post any operation if the transport isn’t connected. In other words, how would it be possible to call rpcrdma_ep_post_recv() if the connect had failed and there was no QP?
>
> If disconnect wipes ia->ri_id->qp while there are still operations in progress, that would be the real bug.
>
>
>> Also, in rpcrdma_register_frmr_external() I see DECR_CQCOUNT called twice:
>> first at line 1538 (unlikely, however) and second at line 1562. Shouldn't it be called only at 1562?
> if (seg1->mr_chunk.rl_mw->r.frmr.state == FRMR_IS_VALID) then rpcrdma_register_frmr_external() posts two Work Requests (LOCAL_INV then FAST_REG_MR) with one ib_post_send(). Thus it is correct to DECR_CQCOUNT twice in that case because each WR will trigger a separate completion event.
>
>
>> -----Original Message-----
>> From: Chuck Lever [mailto:chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org]
>> Sent: Thursday, April 10, 2014 1:57 AM
>> To: Devesh Sharma
>> Cc: Linux NFS Mailing List; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Trond Myklebust
>> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
>>
>>
>> On Apr 9, 2014, at 4:22 PM, Trond Myklebust <trond.myklebust@primarydata.com> wrote:
>>
>>> Hi Devesh,
>>>
>>> This looks a lot better. I still have a couple of small suggestions, though.
>>>
>>> On Apr 9, 2014, at 14:40, Devesh Sharma <devesh.sharma-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org> wrote:
>>>
>>>> If rdma_create_qp() fails to create a QP because the device firmware is
>>>> in an invalid state, xprtrdma still tries to destroy the non-existent QP
>>>> and ends up in a NULL pointer dereference crash.
>>>> Adding proper checks to validate the QP pointer prevents this.
>>>>
>>>> Signed-off-by: Devesh Sharma <devesh.sharma-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>
>>>> ---
>>>> net/sunrpc/xprtrdma/verbs.c |   29 +++++++++++++++++++++++++----
>>>> 1 files changed, 25 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
>>>> index 9372656..902ac78 100644
>>>> --- a/net/sunrpc/xprtrdma/verbs.c
>>>> +++ b/net/sunrpc/xprtrdma/verbs.c
>>>> @@ -831,10 +831,12 @@ rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
>>>> 	if (ep->rep_connected != 0) {
>>>> 		struct rpcrdma_xprt *xprt;
>>>> retry:
>>>> -		rc = rpcrdma_ep_disconnect(ep, ia);
>>>> -		if (rc && rc != -ENOTCONN)
>>>> -			dprintk("RPC:       %s: rpcrdma_ep_disconnect"
>>>> +		if (ia->ri_id->qp) {
>>>> +			rc = rpcrdma_ep_disconnect(ep, ia);
>>>> +			if (rc && rc != -ENOTCONN)
>>>> +				dprintk("RPC:       %s: rpcrdma_ep_disconnect"
>>>> 				" status %i\n", __func__, rc);
>>>> +		}
>>>> 		rpcrdma_clean_cq(ep->rep_cq);
>>>>
>>>> 		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
>>>> @@ -859,7 +861,9 @@ retry:
>>>> 			goto out;
>>>> 		}
>>>> 		/* END TEMP */
>>>> -		rdma_destroy_qp(ia->ri_id);
>>>> +		if (ia->ri_id->qp) {
>>>> +			rdma_destroy_qp(ia->ri_id);
>>>> +		}
>>> Nit: No need for braces here.
>>>
>>>> 		rdma_destroy_id(ia->ri_id);
>>>> 		ia->ri_id = id;
>>>> 	}
>>>> @@ -1557,6 +1561,13 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
>>>> 	frmr_wr.wr.fast_reg.rkey = seg1->mr_chunk.rl_mw->r.frmr.fr_mr->rkey;
>>>> 	DECR_CQCOUNT(&r_xprt->rx_ep);
>> I don't think you can DECR_CQCOUNT, then exit without posting the send. That will screw up the completion counter and result in a transport hang, won't it?
>>
>>>> +	if (!ia->ri_id->qp) {
>>>> +		rc = -EINVAL;
>>>> +		while (i--)
>>>> +			rpcrdma_unmap_one(ia, --seg);
>>>> +		goto out;
>>>> +	}
>>> Instead of duplicating the rpcrdma_unmap_one() cleanup here, why not
>>> just do
>>>
>>> 	if (ia->ri_id->qp)
>>> 		rc = ib_post_send(...)
>>> 	else
>>> 		rc = -EINVAL;
>>>
>>> BTW: can we not simply test for ia->ri_id->qp before we even call rpcrdma_map_one() and hence bail out before we have to do any cleanup?
>>>
>>>> +
>>>> 	rc = ib_post_send(ia->ri_id->qp, post_wr, &bad_wr);
>>>>
>>>> 	if (rc) {
>>>> @@ -1571,6 +1582,7 @@ rpcrdma_register_frmr_external(struct rpcrdma_mr_seg *seg,
>>>> 		seg1->mr_len = len;
>>>> 	}
>>>> 	*nsegs = i;
>>>> +out:
>>>> 	return rc;
>>>> }
>>>>
>>>> @@ -1592,6 +1604,9 @@ rpcrdma_deregister_frmr_external(struct rpcrdma_mr_seg *seg,
>>>> 	invalidate_wr.ex.invalidate_rkey = seg1->mr_chunk.rl_mw->r.frmr.fr_mr->rkey;
>>>> 	DECR_CQCOUNT(&r_xprt->rx_ep);
>> Ditto.
>>
>>>> +	if (!ia->ri_id->qp)
>>>> +		return -EINVAL;
>>>> +
>>>> 	rc = ib_post_send(ia->ri_id->qp, &invalidate_wr, &bad_wr);
>>>> 	if (rc)
>>>> 		dprintk("RPC:       %s: failed ib_post_send for invalidate,"
>>>> @@ -1923,6 +1938,9 @@ rpcrdma_ep_post(struct rpcrdma_ia *ia,
>>>> 		send_wr.send_flags = IB_SEND_SIGNALED;
>>>> 	}
>> Ditto.
>>
>>>> +	if (!ia->ri_id->qp)
>>>> +		return -EINVAL;
>>>> +
>>>> 	rc = ib_post_send(ia->ri_id->qp, &send_wr, &send_wr_fail);
>>>> 	if (rc)
>>>> 		dprintk("RPC:       %s: ib_post_send returned %i\n", __func__,
>>>> @@ -1951,6 +1969,9 @@ rpcrdma_ep_post_recv(struct rpcrdma_ia *ia,
>>>> 		rep->rr_iov.addr, rep->rr_iov.length, DMA_BIDIRECTIONAL);
>>>>
>>>> 	DECR_CQCOUNT(ep);
>> And here.
>>
>>>> +
>>>> +	if (!ia->ri_id->qp)
>>>> +		return -EINVAL;
>>>> 	rc = ib_post_recv(ia->ri_id->qp, &recv_wr, &recv_wr_fail);
>>>>
>>>> 	if (rc)
>>>> --
>>>> 1.7.1
>>>>
>>> _________________________________
>>> Trond Myklebust
>>> Linux NFS client maintainer, PrimaryData
>>> trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
>>> in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo
>>> info at  http://vger.kernel.org/majordomo-info.html
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com


