linux-nfs.vger.kernel.org archive mirror
From: Tom Tucker <tom@opengridcomputing.com>
To: "J. Bruce Fields" <bfields@fieldses.org>,
	Steve Wise <swise@opengridcomputing.com>
Cc: trond.myklebust@primarydata.com, linux-nfs@vger.kernel.org
Subject: Re: [PATCH] Fix regression in NFSRDMA server
Date: Fri, 28 Mar 2014 10:21:27 -0500
Message-ID: <53359377.8060502@opengridcomputing.com>
In-Reply-To: <20140328020834.GD27633@fieldses.org>

Hi Bruce,

On 3/27/14 9:08 PM, J. Bruce Fields wrote:
> On Tue, Mar 25, 2014 at 03:14:57PM -0500, Steve Wise wrote:
>> From: Tom Tucker <tom@ogc.us>
>>
>> The server regression was caused by the addition of rq_next_page
>> (afc59400d6c65bad66d4ad0b2daf879cbff8e23e). There were a few places that
>> were missed with the update of the rq_respages array.
> Apologies.  (But, it could happen again--could we set up some regular
> testing?  It doesn't have to be anything fancy, just cthon over
> rdma--really, just read and write over rdma--would probably catch a
> lot.)

I think Chelsio is going to be adding some NFSRDMA regression testing to 
their system test.

> Also: I don't get why all these rq_next_page initializations are
> required.  Why isn't the initialization at the top of svc_process()
> enough?  Is rdma using it before we get to that point?  The only use of
> it I see off hand is in the while loop that you're deleting.

I didn't apply tremendous deductive powers here; I just added updates to
rq_next_page wherever the transport touches rq_respages. That said,
NFS WRITE is the likely culprit: the write is completed as a deferral,
so the request doesn't go through svc_process(). If rq_next_page is
bogus at that point, the cleanup will free or re-use pages that are
still in use by the transport.
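
As a rough illustration of that failure mode, here is a userspace toy model
(not kernel code). It assumes the cleanup walks from rq_next_page back down to
rq_respages, as the loop removed in the quoted patch does. The names toy_rqst,
toy_free_res_pages and toy_release_after_transport, the 8-slot array, and the
slot numbers are all made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 8

/* Toy stand-in for struct svc_rqst: only the three page-tracking fields. */
struct toy_rqst {
	int  *rq_pages[NPAGES];   /* non-NULL slot = page still referenced */
	int **rq_respages;        /* first response page */
	int **rq_next_page;       /* one past the last response page in use */
};

/* Drop every slot in [rq_respages, rq_next_page); return how many were dropped. */
static int toy_free_res_pages(struct toy_rqst *rq)
{
	int dropped = 0;

	/* The model only covers a stale pointer that sits above rq_respages. */
	assert(rq->rq_next_page >= rq->rq_respages);

	while (rq->rq_next_page != rq->rq_respages) {
		int **pp = --rq->rq_next_page;

		if (*pp) {
			*pp = NULL;   /* stands in for releasing the page */
			dropped++;
		}
	}
	return dropped;
}

/*
 * Start with a stale rq_next_page (slot 6), let the "transport" move
 * rq_respages to slot 2, optionally resync rq_next_page the way the
 * patch does, then run the cleanup and report how many slots it dropped.
 */
static int toy_release_after_transport(int resync_next_page)
{
	static int backing[NPAGES];
	struct toy_rqst rq;
	size_t i;

	for (i = 0; i < NPAGES; i++)
		rq.rq_pages[i] = &backing[i];

	rq.rq_next_page = &rq.rq_pages[6];   /* stale value (assumed) */
	rq.rq_respages = &rq.rq_pages[2];    /* args occupy slots 0..1 */
	if (resync_next_page)
		rq.rq_next_page = rq.rq_respages + 1;

	return toy_free_res_pages(&rq);
}
```

In this model, toy_release_after_transport(0) drops four slots (2 through 5),
three of which the cleanup had no business touching, while
toy_release_after_transport(1) drops only the single response slot.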

Tom
> --b.
>
>> Signed-off-by: Tom Tucker <tom@ogc.us>
>> Tested-by: Steve Wise <swise@ogc.us>
>> ---
>>
>>   net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |   12 ++++--------
>>   net/sunrpc/xprtrdma/svc_rdma_sendto.c   |    1 +
>>   2 files changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> index 0ce7552..8d904e4 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>>   		sge_no++;
>>   	}
>>   	rqstp->rq_respages = &rqstp->rq_pages[sge_no];
>> +	rqstp->rq_next_page = rqstp->rq_respages + 1;
>>   
>>   	/* We should never run out of SGE because the limit is defined to
>>   	 * support the max allowed RPC data length
>> @@ -169,6 +170,7 @@ static int map_read_chunks(struct svcxprt_rdma *xprt,
>>   		 */
>>   		head->arg.pages[page_no] = rqstp->rq_arg.pages[page_no];
>>   		rqstp->rq_respages = &rqstp->rq_arg.pages[page_no+1];
>> +		rqstp->rq_next_page = rqstp->rq_respages + 1;
>>   
>>   		byte_count -= sge_bytes;
>>   		ch_bytes -= sge_bytes;
>> @@ -276,6 +278,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,
>>   
>>   	/* rq_respages points one past arg pages */
>>   	rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
>> +	rqstp->rq_next_page = rqstp->rq_respages + 1;
>>   
>>   	/* Create the reply and chunk maps */
>>   	offset = 0;
>> @@ -520,13 +523,6 @@ next_sge:
>>   	for (ch_no = 0; &rqstp->rq_pages[ch_no] < rqstp->rq_respages; ch_no++)
>>   		rqstp->rq_pages[ch_no] = NULL;
>>   
>> -	/*
>> -	 * Detach res pages. If svc_release sees any it will attempt to
>> -	 * put them.
>> -	 */
>> -	while (rqstp->rq_next_page != rqstp->rq_respages)
>> -		*(--rqstp->rq_next_page) = NULL;
>> -
>>   	return err;
>>   }
>>   
>> @@ -550,7 +546,7 @@ static int rdma_read_complete(struct svc_rqst *rqstp,
>>   
>>   	/* rq_respages starts after the last arg page */
>>   	rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
>> -	rqstp->rq_next_page = &rqstp->rq_arg.pages[page_no];
>> +	rqstp->rq_next_page = rqstp->rq_respages + 1;
>>   
>>   	/* Rebuild rq_arg head and tail. */
>>   	rqstp->rq_arg.head[0] = head->arg.head[0];
>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_sendto.c b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> index c1d124d..11e90f8 100644
>> --- a/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> +++ b/net/sunrpc/xprtrdma/svc_rdma_sendto.c
>> @@ -625,6 +625,7 @@ static int send_reply(struct svcxprt_rdma *rdma,
>>   		if (page_no+1 >= sge_no)
>>   			ctxt->sge[page_no+1].length = 0;
>>   	}
>> +	rqstp->rq_next_page = rqstp->rq_respages + 1;
>>   	BUG_ON(sge_no > rdma->sc_max_sge);
>>   	memset(&send_wr, 0, sizeof send_wr);
>>   	ctxt->wr_op = IB_WR_SEND;
>>

