From: Chuck Lever <chuck.lever@oracle.com>
To: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: [PATCH v4 22/24] xprtrdma: Avoid deadlock when credit window is reset
Date: Wed, 21 May 2014 20:57:23 -0400
Message-ID: <20140522005723.27190.90796.stgit@manet.1015granger.net>
In-Reply-To: <20140522004505.27190.58897.stgit@manet.1015granger.net>

Update the cwnd from the server's reply before invoking
__xprt_put_cong(). Otherwise the next task on the xprt_sending queue
is still subject to the old credit window. Currently, no task is
awoken if the old congestion window is still exceeded, even if the
new window is larger, and a deadlock results.

This is an issue during a transport reconnect. Servers don't
normally shrink the credit window, but the client does reset it to
1 when reconnecting so the server can safely grow it again.
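
The ordering problem described above can be sketched in user space. This is a simplified model, not kernel code: struct model_xprt and the model_*/release_* helpers are hypothetical stand-ins for the congestion fields of struct rpc_xprt and for __xprt_put_cong(), and RPC_CWNDSHIFT is assumed to be 8, as in the SUNRPC header.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the SUNRPC definitions (shift of 8, scale of 256). */
#define RPC_CWNDSHIFT 8U
#define RPC_CWNDSCALE (1UL << RPC_CWNDSHIFT)

/* Hypothetical stand-in for the congestion fields of struct rpc_xprt. */
struct model_xprt {
	unsigned long cong;	/* scaled count of requests in flight */
	unsigned long cwnd;	/* scaled congestion (credit) window */
};

/* Mirrors the "cong >= cwnd" test used to decide whether to wake a task. */
bool model_congested(const struct model_xprt *x)
{
	return x->cong >= x->cwnd;
}

/*
 * Release one request's share of the window. Returns true if the next
 * queued task could be woken, false if the transport still looks congested.
 */
bool model_put_cong(struct model_xprt *x)
{
	assert(x->cong >= RPC_CWNDSCALE);	/* guard against underflow */
	x->cong -= RPC_CWNDSCALE;
	return !model_congested(x);
}

/* Old ordering: the slot is released while cwnd still holds the old value. */
bool release_with_stale_window(struct model_xprt x)
{
	return model_put_cong(&x);
}

/* New ordering: cwnd is refreshed from the reply's credits, then released. */
bool release_with_fresh_window(struct model_xprt x, unsigned int credits)
{
	x.cwnd = (unsigned long)credits << RPC_CWNDSHIFT;
	return model_put_cong(&x);
}
```

With illustrative numbers (two requests still counted in flight, a window that was reset to one credit, and a reply granting four credits), release_with_stale_window() reports that no task can be woken, while release_with_fresh_window() reports that the next task can run.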

As a minor optimization, remove the hack of grabbing the initial
cwnd size (which happens to be RPC_CWNDSCALE) and using that value
as the congestion scaling factor. The scaling value is invariant,
and we are better off without the multiplication operation.
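
A small sketch of why the cached scale factor is redundant, assuming RPC_CWNDSHIFT is 8 so that RPC_CWNDSCALE is 256. The helper names cwnd_by_multiply() and cwnd_by_shift() are hypothetical and exist only for this comparison.

```c
#include <assert.h>

/* Assumed to match the SUNRPC definitions. */
#define RPC_CWNDSHIFT 8U
#define RPC_CWNDSCALE (1UL << RPC_CWNDSHIFT)

/* Old approach: multiply the credits by a scale cached at first use. */
unsigned long cwnd_by_multiply(unsigned int credits, unsigned long cached_scale)
{
	assert(cached_scale > 0);	/* the removed code enforced this with BUG_ON() */
	return credits * cached_scale;
}

/* New approach: the scale is a compile-time power of two, so shift instead. */
unsigned long cwnd_by_shift(unsigned int credits)
{
	return (unsigned long)credits << RPC_CWNDSHIFT;
}
```

As long as the cached value equals RPC_CWNDSCALE, both helpers return the same window for any credit count, so the cached field adds no information.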

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/rpc_rdma.c  |    3 +++
 net/sunrpc/xprtrdma/transport.c |   19 +------------------
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 -
 3 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index 1334646..82173c7 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -846,6 +846,9 @@ badheader:
 		break;
 	}
 
+	xprt->cwnd = atomic_read(&r_xprt->rx_buf.rb_credits) << RPC_CWNDSHIFT;
+	xprt_release_rqst_cong(rqst->rq_task);
+
 	dprintk("RPC: %s: xprt_complete_rqst(0x%p, 0x%p, %d)\n",
 			__func__, xprt, rqst, status);
 	xprt_complete_rqst(rqst->rq_task, status);
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 04b7452..65cfaca 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -448,23 +448,6 @@ xprt_rdma_connect(struct rpc_xprt *xprt, struct rpc_task *task)
 	}
 }
 
-static int
-xprt_rdma_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task)
-{
-	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
-	int credits = atomic_read(&r_xprt->rx_buf.rb_credits);
-
-	/* == RPC_CWNDSCALE @ init, but *after* setup */
-	if (r_xprt->rx_buf.rb_cwndscale == 0UL) {
-		r_xprt->rx_buf.rb_cwndscale = xprt->cwnd;
-		dprintk("RPC: %s: cwndscale %lu\n", __func__,
-			r_xprt->rx_buf.rb_cwndscale);
-		BUG_ON(r_xprt->rx_buf.rb_cwndscale <= 0);
-	}
-	xprt->cwnd = credits * r_xprt->rx_buf.rb_cwndscale;
-	return xprt_reserve_xprt_cong(xprt, task);
-}
-
 /*
  * The RDMA allocate/free functions need the task structure as a place
  * to hide the struct rpcrdma_req, which is necessary for the actual send/recv
@@ -686,7 +669,7 @@ static void xprt_rdma_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
  */
 
 static struct rpc_xprt_ops xprt_rdma_procs = {
-	.reserve_xprt		= xprt_rdma_reserve_xprt,
+	.reserve_xprt		= xprt_reserve_xprt_cong,
 	.release_xprt		= xprt_release_xprt_cong, /* sunrpc/xprt.c */
 	.alloc_slot		= xprt_alloc_slot,
 	.release_request	= xprt_release_rqst_cong, /* ditto */
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 0c3b88e..89e7cd4 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -212,7 +212,6 @@ struct rpcrdma_req {
 struct rpcrdma_buffer {
 	spinlock_t	rb_lock;	/* protects indexes */
 	atomic_t	rb_credits;	/* most recent server credits */
-	unsigned long	rb_cwndscale;	/* cached framework rpc_cwndscale */
 	int		rb_max_requests;/* client max requests */
 	struct list_head rb_mws;	/* optional memory windows/fmrs/frmrs */
 	int		rb_send_index;