From: Chuck Lever <chuck.lever@oracle.com>
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Cc: Anna.Schumaker@netapp.com
Subject: [PATCH V3 16/17] xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting
Date: Wed, 30 Apr 2014 15:31:47 -0400
Message-ID: <20140430193147.5663.13351.stgit@manet.1015granger.net>
In-Reply-To: <20140430191433.5663.16217.stgit@manet.1015granger.net>

Devesh Sharma <Devesh.Sharma@Emulex.Com> reports that after a
disconnect, his HCA is failing to create a fresh QP, leaving
ia->ri_id->qp set to NULL. But xprtrdma still allows RPCs to
wake up and post LOCAL_INV as they exit, causing an oops.

rpcrdma_ep_connect() is allowing the wake-up by leaking the QP
creation error code (-EPERM in this case) to the RPC client's
generic layer. xprt_connect_status() does not recognize -EPERM, so
it kills pending RPC tasks immediately rather than retrying the
connect.

Re-arrange the QP creation logic so that when it fails on reconnect,
it leaves ->qp with the old QP rather than NULL. If pending RPC
tasks wake and exit, LOCAL_INV work requests will flush rather than
oops.

On initial connect, leaving ->qp == NULL is OK, since there are no
pending RPCs that might use ->qp. But be sure not to try to destroy
a NULL QP when rpcrdma_ep_connect() is retried.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
net/sunrpc/xprtrdma/verbs.c | 29 ++++++++++++++++++++---------
1 files changed, 20 insertions(+), 9 deletions(-)
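
The reordering described above can be illustrated outside the kernel.
What follows is only a minimal userspace sketch, not the xprtrdma code:
struct model_qp, struct model_id, struct model_conn, model_create_qp()
and model_reconnect() are hypothetical stand-ins invented for this
illustration. It assumes the essence of the change is "create the new
QP on the new id first, and retire the old id only once that succeeds",
and it mirrors the patch's choice of reporting -ENETUNREACH instead of
the raw creation error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a QP and the connection id that owns it. */
struct model_qp { int qp_num; };
struct model_id { struct model_qp *qp; };

/* Plays the role of ia->ri_id in the patch. */
struct model_conn { struct model_id *id; };

/* Hypothetical QP constructor: fails with -EPERM when 'fail' is set. */
static int model_create_qp(struct model_id *id, struct model_qp *qp, int fail)
{
	if (fail)
		return -EPERM;
	id->qp = qp;
	return 0;
}

/*
 * Reconnect: build the QP on the fresh id before touching the old one.
 * On failure the old id and its QP stay in place, and the raw error is
 * replaced with -ENETUNREACH rather than leaked to the caller.
 */
static int model_reconnect(struct model_conn *conn, struct model_id *fresh,
			   struct model_qp *qp, int fail)
{
	int rc;

	assert(conn != NULL && fresh != NULL);

	rc = model_create_qp(fresh, qp, fail);
	if (rc) {
		fresh->qp = NULL;	/* discard only the new id */
		return -ENETUNREACH;
	}

	if (conn->id)
		conn->id->qp = NULL;	/* models destroying the old QP */
	conn->id = fresh;		/* swap only after success */
	return 0;
}
```

A caller that sees model_reconnect() fail can keep using conn->id->qp,
which still points at the old QP. That is the property the patch relies
on so that late LOCAL_INV posts flush instead of dereferencing NULL.
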
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index c80995a..54edf2a 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -867,6 +867,7 @@ rpcrdma_ep_connect(struct rpcrdma_ep *ep, struct rpcrdma_ia *ia)
 	if (ep->rep_connected != 0) {
 		struct rpcrdma_xprt *xprt;
 retry:
+		dprintk("RPC: %s: reconnecting...\n", __func__);
 		rc = rpcrdma_ep_disconnect(ep, ia);
 		if (rc && rc != -ENOTCONN)
 			dprintk("RPC: %s: rpcrdma_ep_disconnect"
@@ -879,7 +880,7 @@ retry:
 		id = rpcrdma_create_id(xprt, ia,
 				(struct sockaddr *)&xprt->rx_data.addr);
 		if (IS_ERR(id)) {
-			rc = PTR_ERR(id);
+			rc = -EHOSTUNREACH;
 			goto out;
 		}
 		/* TEMP TEMP TEMP - fail if new device:
@@ -893,20 +894,30 @@ retry:
 			printk("RPC: %s: can't reconnect on "
 				"different device!\n", __func__);
 			rdma_destroy_id(id);
-			rc = -ENETDOWN;
+			rc = -ENETUNREACH;
 			goto out;
 		}
 		/* END TEMP */
+		rc = rdma_create_qp(id, ia->ri_pd, &ep->rep_attr);
+		if (rc) {
+			dprintk("RPC: %s: rdma_create_qp failed %i\n",
+				__func__, rc);
+			rdma_destroy_id(id);
+			rc = -ENETUNREACH;
+			goto out;
+		}
 		rdma_destroy_qp(ia->ri_id);
 		rdma_destroy_id(ia->ri_id);
 		ia->ri_id = id;
-	}
-
-	rc = rdma_create_qp(ia->ri_id, ia->ri_pd, &ep->rep_attr);
-	if (rc) {
-		dprintk("RPC: %s: rdma_create_qp failed %i\n",
-			__func__, rc);
-		goto out;
+	} else {
+		dprintk("RPC: %s: connecting...\n", __func__);
+		rc = rdma_create_qp(ia->ri_id, ia->ri_pd, &ep->rep_attr);
+		if (rc) {
+			dprintk("RPC: %s: rdma_create_qp failed %i\n",
+				__func__, rc);
+			/* do not update ep->rep_connected */
+			return -ENETUNREACH;
+		}
 	}
 
 	/* XXX Tavor device performs badly with 2K MTU! */