From: "Steve Wise" <swise@opengridcomputing.com>
To: "'Devesh Sharma'" <Devesh.Sharma@Emulex.Com>,
"'Chuck Lever'" <chuck.lever@oracle.com>,
<linux-rdma@vger.kernel.org>, <linux-nfs@vger.kernel.org>
Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
Date: Wed, 2 Jul 2014 14:56:54 -0500
Message-ID: <006401cf962f$c5ec49e0$51c4dda0$@opengridcomputing.com>
In-Reply-To: <EE7902D3F51F404C82415C4803930ACD3FE0C5AE@CMEXMB1.ad.emulex.com>
> -----Original Message-----
> From: Devesh Sharma [mailto:Devesh.Sharma@Emulex.Com]
> Sent: Wednesday, July 02, 2014 2:54 PM
> To: Steve Wise; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport disconnect
>
>
>
> > -----Original Message-----
> > From: Steve Wise [mailto:swise@opengridcomputing.com]
> > Sent: Thursday, July 03, 2014 1:21 AM
> > To: Devesh Sharma; 'Chuck Lever'; linux-rdma@vger.kernel.org; linux-
> > nfs@vger.kernel.org
> > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > disconnect
> >
> >
> >
> > > -----Original Message-----
> > > From: Devesh Sharma [mailto:Devesh.Sharma@Emulex.Com]
> > > Sent: Wednesday, July 02, 2014 2:43 PM
> > > To: Steve Wise; Chuck Lever; linux-rdma@vger.kernel.org;
> > > linux-nfs@vger.kernel.org
> > > Subject: RE: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > disconnect
> > >
> > > > -----Original Message-----
> > > > From: Steve Wise [mailto:swise@opengridcomputing.com]
> > > > Sent: Thursday, July 03, 2014 12:59 AM
> > > > To: Devesh Sharma; Chuck Lever; linux-rdma@vger.kernel.org; linux-
> > > > nfs@vger.kernel.org
> > > > Subject: Re: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > > disconnect
> > > >
> > > > On 7/2/2014 2:06 PM, Devesh Sharma wrote:
> > > > > This change is very prone to generating poll_cq errors because
> > > > > of uncleaned completions that still point to non-existent QPs.
> > > > > On the new connection, when these completions are polled,
> > > > > poll_cq will fail because the old QP pointer is already NULL.
> > > > > Did anyone hit this situation during their testing?
> > > >
> > > > Hey Devesh,
> > > >
> > > > iw_cxgb4 will silently toss CQEs if the QP is not active.
> > >
> > > Yes, I just checked that in the mlx and cxgb4 driver code. ocrdma, on
> > > the other hand, asserts a BUG_ON for such CQEs, causing a system panic.
> > > Out of curiosity: how is this change useful here? Does it reduce the
> > > re-connection time? In any case, rpcrdma_clean_cq was discarding the
> > > completions (both flushed and successful).
> > >
> >
> > Well, I don't think there is anything restricting an application from destroying
> > the QP with pending CQEs on its CQs. So I don't think it should cause a
> > BUG_ON(). I'll have to check in the Verbs specs whether destroying a QP
> > kills all the pending CQEs...
>
> To clear up the confusion: in ocrdma, the BUG_ON is hit in poll_cq() after re-connection
> happens and the CQ is polled again.
> The first completion in the CQ still points to the old QP ID, for which ocrdma no longer
> has a valid QP pointer.
>
Right. Which means it’s a stale CQE. I don't think that should cause a BUG_ON.
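
To illustrate the "silently toss" behavior, here is a minimal sketch in plain C. It is a toy model, not code from iw_cxgb4, mlx, or ocrdma: the toy_* types, the QP table, and toy_poll_cq() are hypothetical names made up for this example. The only point is that a CQE naming a QP that no longer exists is skipped and counted, rather than treated as a fatal condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_QPS 8

/* Hypothetical, simplified stand-ins for driver structures. */
struct toy_qp  { uint32_t id; };
struct toy_cqe { uint32_t qp_id; int status; };

struct toy_dev {
	struct toy_qp *qp_tbl[TOY_MAX_QPS];	/* NULL once a QP is destroyed */
};

/*
 * Copy up to max_out CQEs whose QP still exists from ring[] into out[].
 * A CQE that names a destroyed or out-of-range QP is stale: it is
 * counted in *dropped and skipped instead of triggering a crash.
 * Returns the number of CQEs delivered to the caller.
 */
static int toy_poll_cq(const struct toy_dev *dev,
		       const struct toy_cqe *ring, int ring_len,
		       struct toy_cqe *out, int max_out, int *dropped)
{
	int i, n = 0;

	assert(dev && ring && out && dropped);
	*dropped = 0;

	for (i = 0; i < ring_len && n < max_out; i++) {
		uint32_t id = ring[i].qp_id;

		if (id >= TOY_MAX_QPS || dev->qp_tbl[id] == NULL) {
			(*dropped)++;	/* stale CQE: toss it quietly */
			continue;
		}
		out[n++] = ring[i];
	}
	return n;
}
```

In this model, polling after a reconnect returns only the completions that belong to the live QP, and the caller can see how many stale entries were discarded.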
> >
> >
> > > >
> > > >
> > > > >> -----Original Message-----
> > > > >> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-
> > > > >> owner@vger.kernel.org] On Behalf Of Chuck Lever
> > > > >> Sent: Tuesday, June 24, 2014 4:10 AM
> > > > >> To: linux-rdma@vger.kernel.org; linux-nfs@vger.kernel.org
> > > > >> Subject: [PATCH v1 05/13] xprtrdma: Don't drain CQs on transport
> > > > >> disconnect
> > > > >>
> > > > >> CQs are not destroyed until unmount. By draining CQs on transport
> > > > >> disconnect, successful completions that can change the
> > > > >> r.frmr.state field can be missed.
> > >
> > > Aren't those still missed? Those successful completions will still
> > > be dropped after re-connection. Am I missing something in
> > > understanding the motivation?
> > >
> > > > >>
> > > > >> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > >> ---
> > > > >> net/sunrpc/xprtrdma/verbs.c | 5 -----
> > > > >> 1 file changed, 5 deletions(-)
> > > > >>
> > > > >> diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
> > > > >> index 3c7f904..451e100 100644
> > > > >> --- a/net/sunrpc/xprtrdma/verbs.c
> > > > >> +++ b/net/sunrpc/xprtrdma/verbs.c
> > > > >> @@ -873,9 +873,6 @@ retry:
> > > > >>  			dprintk("RPC: %s: rpcrdma_ep_disconnect"
> > > > >>  				" status %i\n", __func__, rc);
> > > > >>
> > > > >> -		rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> > > > >> -		rpcrdma_clean_cq(ep->rep_attr.send_cq);
> > > > >> -
> > > > >>  		xprt = container_of(ia, struct rpcrdma_xprt, rx_ia);
> > > > >>  		id = rpcrdma_create_id(xprt, ia,
> > > > >>  				(struct sockaddr *)&xprt->rx_data.addr);
> > > > >> @@ -985,8 +982,6 @@ rpcrdma_ep_disconnect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
> > > > >>  {
> > > > >>  	int rc;
> > > > >>
> > > > >> -	rpcrdma_clean_cq(ep->rep_attr.recv_cq);
> > > > >> -	rpcrdma_clean_cq(ep->rep_attr.send_cq);
> > > > >>  	rc = rdma_disconnect(ia->ri_id);
> > > > >>  	if (!rc) {
> > > > >>  		/* returns without wait if not connected */
> > > > >>
> > > > >> --
> > > > >> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> > > > >> in the body of a message to majordomo@vger.kernel.org
> > > > >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
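
For reference, the concern in the patch description (draining a CQ can miss successful completions that would have changed r.frmr.state) can be sketched with a toy model in plain C. Everything here is hypothetical and simplified: the toy_* types and functions are invented for illustration and are not the xprtrdma code. The sketch only contrasts "dequeue and discard" with "dequeue and apply the completion".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an FRMR and its completions. */
enum toy_frmr_state { TOY_FRMR_INVALID, TOY_FRMR_VALID };

struct toy_frmr { enum toy_frmr_state state; };

struct toy_wc {
	int success;			/* non-zero for a successful completion */
	struct toy_frmr *frmr;		/* MR this completion refers to */
	enum toy_frmr_state new_state;	/* state to record on success */
};

struct toy_cq { struct toy_wc *ring; int head; int tail; };

/* Dequeue one completion; returns 1 if one was dequeued, 0 if empty. */
static int toy_poll_one(struct toy_cq *cq, struct toy_wc *wc)
{
	assert(cq && wc);
	if (cq->head == cq->tail)
		return 0;
	*wc = cq->ring[cq->head++];
	return 1;
}

/* Drain: dequeue every completion and throw it away unexamined. */
static int toy_drain_cq(struct toy_cq *cq)
{
	struct toy_wc wc;
	int count = 0;

	while (toy_poll_one(cq, &wc))
		count++;
	return count;
}

/* Process: dequeue every completion and apply successful ones. */
static int toy_process_cq(struct toy_cq *cq)
{
	struct toy_wc wc;
	int count = 0;

	while (toy_poll_one(cq, &wc)) {
		if (wc.success && wc.frmr)
			wc.frmr->state = wc.new_state;
		count++;
	}
	return count;
}
```

In this model, draining leaves the recorded MR state out of step with what the completion reported, while processing the same completion updates it. Whether the real transport sees such completions after a reconnect is exactly the question raised in the thread above.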