* Re: [PATCH 01/04] NFS/RDMA client stall patches
@ 2008-05-19 3:50 Peter Leckie
2008-06-10 19:24 ` Trond Myklebust
0 siblings, 1 reply; 5+ messages in thread
From: Peter Leckie @ 2008-05-19 3:50 UTC (permalink / raw)
To: talpey; +Cc: linux-nfs
Don't call __xprt_get_cong() if this is a retransmit.
This prevents __xprt_get_cong() from recursively
incrementing the congestion avoidance window for
retransmitted data.
Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
X-Sgi-Pv: 971446
<http://bugworks/query.cgi/971446>
---
Index: linux-2.6.25.3/net/sunrpc/xprt.c
===================================================================
--- linux-2.6.25.3.orig/net/sunrpc/xprt.c
+++ linux-2.6.25.3/net/sunrpc/xprt.c
@@ -224,7 +224,8 @@ int xprt_reserve_xprt_cong(struct rpc_ta
 			return 1;
 		goto out_sleep;
 	}
-	if (__xprt_get_cong(xprt, task)) {
+	/*If this is a retransmit don't increment cong*/
+	if ((req && req->rq_ntrans) ||__xprt_get_cong(xprt, task)) {
 		xprt->snd_task = task;
 		if (req) {
 			req->rq_bytes_sent = 0;
* Re: [PATCH 01/04] NFS/RDMA client stall patches
2008-05-19 3:50 [PATCH 01/04] NFS/RDMA client stall patches Peter Leckie
@ 2008-06-10 19:24 ` Trond Myklebust
2008-06-11 8:03 ` Peter Leckie
0 siblings, 1 reply; 5+ messages in thread
From: Trond Myklebust @ 2008-06-10 19:24 UTC (permalink / raw)
To: Peter Leckie; +Cc: talpey, linux-nfs
On Mon, 2008-05-19 at 13:50 +1000, Peter Leckie wrote:
> Don't call __xprt_get_cong() if this is a retransmit.
> This prevents __xprt_get_cong() from recursively
> incrementing the congestion avoidance window for
> retransmitted data.
>
> Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
> Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
> X-Sgi-Pv: 971446
> <http://bugworks/query.cgi/971446>---
> Index: linux-2.6.25.3/net/sunrpc/xprt.c
> ===================================================================
> --- linux-2.6.25.3.orig/net/sunrpc/xprt.c
> +++ linux-2.6.25.3/net/sunrpc/xprt.c
> @@ -224,7 +224,8 @@ int xprt_reserve_xprt_cong(struct rpc_ta
> return 1;
> goto out_sleep;
> }
> - if (__xprt_get_cong(xprt, task)) {
> + /*If this is a retransmit don't increment cong*/
> + if ((req && req->rq_ntrans) ||__xprt_get_cong(xprt, task)) {
> xprt->snd_task = task;
> if (req) {
> req->rq_bytes_sent = 0;
>
Why would we not want to increment the congestion avoidance window on
retransmitted data? On timeout, xprt_adjust_cwnd will call
__xprt_put_cong() prior to the retransmission, so I can't see how this
is a 'recursive increment'.
Trond
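The accounting described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: the model_* names and CWND_SCALE are hypothetical, and the halving of the window on timeout is assumed from how the thread describes xprt_adjust_cwnd(). It shows a request taking at most one slot, and a timeout releasing that slot before the retransmit takes it again.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified user-space model; all names here are hypothetical. */
#define CWND_SCALE 256UL            /* size of one request "slot" */

struct model_xprt {
    unsigned long cong;             /* slots currently held */
    unsigned long cwnd;             /* window: limit on slots held */
};

struct model_rqst {
    bool rq_cong;                   /* does this request hold a slot? */
};

/* Take a slot unless the request already holds one or the window is full. */
static bool model_get_cong(struct model_xprt *x, struct model_rqst *r)
{
    if (r->rq_cong)
        return true;                /* no second increment for the same request */
    if (x->cong >= x->cwnd)
        return false;               /* congested: the caller has to wait */
    r->rq_cong = true;
    x->cong += CWND_SCALE;
    return true;
}

/* Give the slot back, if the request holds one. */
static void model_put_cong(struct model_xprt *x, struct model_rqst *r)
{
    if (!r->rq_cong)
        return;
    assert(x->cong >= CWND_SCALE);
    r->rq_cong = false;
    x->cong -= CWND_SCALE;
}

/* Timeout path: halve the window (never below one slot), then release. */
static void model_adjust_cwnd_timeout(struct model_xprt *x,
                                      struct model_rqst *r)
{
    x->cwnd >>= 1;
    if (x->cwnd < CWND_SCALE)
        x->cwnd = CWND_SCALE;
    model_put_cong(x, r);
}
```

In this model, a get / timeout / get sequence for one request leaves cong at a single slot, which is the behaviour the question above relies on.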
* Re: [PATCH 01/04] NFS/RDMA client stall patches
2008-06-10 19:24 ` Trond Myklebust
@ 2008-06-11 8:03 ` Peter Leckie
2008-06-11 13:53 ` Talpey, Thomas
0 siblings, 1 reply; 5+ messages in thread
From: Peter Leckie @ 2008-06-11 8:03 UTC (permalink / raw)
To: Trond Myklebust; +Cc: talpey, linux-nfs
Trond Myklebust wrote:
> On Mon, 2008-05-19 at 13:50 +1000, Peter Leckie wrote:
>
>> Don't call __xprt_get_cong() if this is a retransmit.
>> This prevents __xprt_get_cong() from recursively
>> incrementing the congestion avoidance window for
>> retransmitted data.
>>
>> Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
>> Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
>> X-Sgi-Pv: 971446
>> <http://bugworks/query.cgi/971446>---
>> Index: linux-2.6.25.3/net/sunrpc/xprt.c
>> ===================================================================
>> --- linux-2.6.25.3.orig/net/sunrpc/xprt.c
>> +++ linux-2.6.25.3/net/sunrpc/xprt.c
>> @@ -224,7 +224,8 @@ int xprt_reserve_xprt_cong(struct rpc_ta
>> return 1;
>> goto out_sleep;
>> }
>> - if (__xprt_get_cong(xprt, task)) {
>> + /*If this is a retransmit don't increment cong*/
>> + if ((req && req->rq_ntrans) ||__xprt_get_cong(xprt, task)) {
>> xprt->snd_task = task;
>> if (req) {
>> req->rq_bytes_sent = 0;
>>
>>
>
> Why would we not want to increment the congestion avoidance window on
> retransmitted data? On timeout, xprt_adjust_cwnd will call
> __xprt_put_cong() prior to the retransmission, so I can't see how this
> is a 'recursive increment'.
>
> Trond
That's a good point. I was looking too closely at the TCP equivalent. The
correct fix for this issue would be to implement a timer function for
NFS/RDMA, pretty much identical to xs_udp_timer(), as follows:
Implement xprt_rdma_timer() to be called when an RPC times out.
This is needed to decrement the cong after an RPC times out, preventing
the congestion avoidance from tripping under retransmits.
Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
---
Index: linux-2.6.25.3/net/sunrpc/xprtrdma/transport.c
===================================================================
--- linux-2.6.25.3.orig/net/sunrpc/xprtrdma/transport.c
+++ linux-2.6.25.3/net/sunrpc/xprtrdma/transport.c
@@ -450,6 +450,18 @@ out1:
 }
 
 /*
+ * xprt_rdma_timer - called when a retransmit timeout occurs on a RDMA transport
+ * @task: task that timed out
+ *
+ * Adjust the congestion window after a retransmit timeout has occurred.
+ */
+static void
+xprt_rdma_timer(struct rpc_task *task)
+{
+	xprt_adjust_cwnd(task, -ETIMEDOUT);
+}
+
+/*
  * Close a connection, during shutdown or timeout/reconnect
  */
 static void
@@ -755,7 +767,8 @@ static struct rpc_xprt_ops xprt_rdma_pro
 	.send_request = xprt_rdma_send_request,
 	.close = xprt_rdma_close,
 	.destroy = xprt_rdma_destroy,
-	.print_stats = xprt_rdma_print_stats
+	.print_stats = xprt_rdma_print_stats,
+	.timer = xprt_rdma_timer
 };
 
 static struct xprt_class xprt_rdma = {
Thanks,
Pete
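For readers following along, here is a rough sketch of why an unset .timer entry matters. It assumes (this is not confirmed in the thread) that the generic timeout code calls the transport's timer hook only when one is set; the model_* types and functions are hypothetical stand-ins, not the sunrpc structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an RPC task. */
struct model_task {
    int slots_held;     /* congestion slots this task still holds */
    int status;         /* 0, or -ETIMEDOUT once a timeout has fired */
};

/* Hypothetical stand-in for a transport ops table. */
struct model_xprt_ops {
    void (*timer)(struct model_task *task);   /* optional; may be NULL */
};

/* Hook in the spirit of xprt_rdma_timer(): release what the task holds. */
static void model_release_on_timeout(struct model_task *task)
{
    task->slots_held = 0;
}

/* Generic timeout handler: call the transport hook only if one is set. */
static void model_on_timeout(const struct model_xprt_ops *ops,
                             struct model_task *task)
{
    assert(ops != NULL && task != NULL);
    if (ops->timer)
        ops->timer(task);
    task->status = -ETIMEDOUT;
}
```

With a NULL hook the task is marked as timed out but keeps its slot; with the hook set, the slot is released as part of the timeout.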
* Re: [PATCH 01/04] NFS/RDMA client stall patches
2008-06-11 8:03 ` Peter Leckie
@ 2008-06-11 13:53 ` Talpey, Thomas
[not found] ` <RTPCLUEXC1-PRDaogxL000001eb-rtwIt2gI0FxT+ZUat5FNkAK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Talpey, Thomas @ 2008-06-11 13:53 UTC (permalink / raw)
To: Peter Leckie; +Cc: Trond Myklebust, talpey, linux-nfs
At 04:03 AM 6/11/2008, Peter Leckie wrote:
>That's a good point you raise there I was looking to closely at the tcp
>equivalent, the correct fix for this issue would be to implement a timer
>function for NFS/RDMA pretty much identical to xs_udp_timer(), as follows:
Hmm, in fact that runs into a different issue: retransmitting over RDMA
isn't allowed, since it consumes server credits and therefore will eventually
overrun the connection's receive queue. I have a patch in my queue to
force a disconnect, which is the appropriate action. I will send it
out soon; it's in with some other post-Connectathon work.
I think with your earlier patch to avoid the 5-second pause, the disconnect
action will be prompt and accurate. However, I would still be concerned why
the RPC was timing out in the first place. Was there an issue in the server?
Tom.
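The credit problem can be illustrated with a toy model. This is a sketch under assumptions, not the xprtrdma implementation: it assumes the server posts one receive buffer per granted credit, and that a retransmit is not charged against the client's credit count. All names (model_conn and friends) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one credit-controlled connection. */
struct model_conn {
    int credits;        /* sends the client may still issue */
    int recv_posted;    /* receive buffers the server still has free */
    bool overrun;       /* set when a message arrives with no free buffer */
};

static void model_conn_init(struct model_conn *c, int granted)
{
    assert(granted > 0);
    c->credits = granted;
    c->recv_posted = granted;
    c->overrun = false;
}

/* Deliver one message to the server; it needs a free receive buffer. */
static void model_deliver(struct model_conn *c)
{
    if (c->recv_posted == 0) {
        c->overrun = true;
        return;
    }
    c->recv_posted--;
}

/* First transmission: spend a credit, or refuse when none are left. */
static bool model_send(struct model_conn *c)
{
    if (c->credits == 0)
        return false;
    c->credits--;
    model_deliver(c);
    return true;
}

/* Retransmit of an in-flight request: assumed to bypass the credit check. */
static void model_retransmit(struct model_conn *c)
{
    model_deliver(c);
}
```

Once every credit is in flight, a single retransmit in this model finds no free receive buffer, which is the overrun being warned about.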
>
>
>Implement xprt_rdma_timer() to be called when an RPC times out.
>This is needed to decrement the cong after an rpc times out preventing
>the congestion aviodance from tripping under retransmitts.
>
>Signed-off-by: Peter Leckie <pleckie-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
>Reviewed-by: Greg Banks <gnb-cP1dWloDopni96+mSzHFpQC/G2K4zDHf@public.gmane.org>
>---
>
>Index: linux-2.6.25.3/net/sunrpc/xprtrdma/transport.c
>===================================================================
>--- linux-2.6.25.3.orig/net/sunrpc/xprtrdma/transport.c
>+++ linux-2.6.25.3/net/sunrpc/xprtrdma/transport.c
>@@ -450,6 +450,18 @@ out1:
> }
>
> /*
>+ * xprt_rdma_timer - called when a retransmit timeout occurs on a RDMA
>transport
>+ * @task: task that timed out
>+ *
>+ * Adjust the congestion window after a retransmit timeout has occurred.
>+ */
>+static void
>+xprt_rdma_timer(struct rpc_task *task)
>+{
>+ xprt_adjust_cwnd(task, -ETIMEDOUT);
>+}
>+
>+/*
> * Close a connection, during shutdown or timeout/reconnect
> */
> static void
>@@ -755,7 +767,8 @@ static struct rpc_xprt_ops xprt_rdma_pro
> .send_request = xprt_rdma_send_request,
> .close = xprt_rdma_close,
> .destroy = xprt_rdma_destroy,
>- .print_stats = xprt_rdma_print_stats
>+ .print_stats = xprt_rdma_print_stats,
>+ .timer = xprt_rdma_timer
> };
>
> static struct xprt_class xprt_rdma = {
>
>
* Re: [PATCH 01/04] NFS/RDMA client stall patches
[not found] ` <RTPCLUEXC1-PRDaogxL000001eb-rtwIt2gI0FxT+ZUat5FNkAK/GNPrWCqfQQ4Iyu8u01E@public.gmane.org>
@ 2008-06-12 8:45 ` Peter Leckie
0 siblings, 0 replies; 5+ messages in thread
From: Peter Leckie @ 2008-06-12 8:45 UTC (permalink / raw)
To: Talpey, Thomas; +Cc: Trond Myklebust, talpey, linux-nfs
Talpey, Thomas wrote:
> At 04:03 AM 6/11/2008, Peter Leckie wrote:
>
>> That's a good point you raise there I was looking to closely at the tcp
>> equivalent, the correct fix for this issue would be to implement a timer
>> function for NFS/RDMA pretty much identical to xs_udp_timer(), as follows:
>>
>
> Hmm, in fact that runs into a different issue - retransmitting over RDMA
> isn't allowed, since it consumes server credits and therefore will eventually
> overrun the connection's receive queue. I have a patch in my queue to
> force a disconnect in fact, which is the appropriate action. I will send them
> out soon, it's in with some other post-Connectathon work.
>
> I think with your earlier patch to avoid the 5-second pause, the disconnect
> action will be prompt and accurate. However, I would still be concerned why
> the RPC was timing out in the first place. Was there an issue in the server?
>
So this is not a typical runtime issue; it was just another reason the
NFS/RDMA client failed to reconnect after the server disconnected us.
Under this circumstance the client is stalled on congestion and has not
received the disconnection event. What both of these patches do is allow
the client to try to resend another RPC. This causes rpcrdma_conn_upcall()
to be called with an event of RDMA_CM_EVENT_CONNECT_ERROR, which then
allows the xprt to be disconnected and reconnected. Without this change,
the client sits there forever waiting for the congestion to drop.
This does raise another question: why didn't the DREQ from the server
cause the xprt on the client to disconnect?
I might try to reproduce this error and see what's happening.
Thanks,
Pete