public inbox for linux-nfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Layton <jeff.layton@primarydata.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
	bfields@fieldses.org
Subject: Re: [PATCH] nfsd: Ensure that NFSv4 always drops the connection when dropping a request
Date: Fri, 24 Oct 2014 10:57:58 -0400	[thread overview]
Message-ID: <20141024105758.064f2e14@tlielax.poochiereds.net> (raw)
In-Reply-To: <1414161055.21776.2.camel@leira.trondhjem.org>

On Fri, 24 Oct 2014 17:30:55 +0300
Trond Myklebust <trond.myklebust@primarydata.com> wrote:

> On Fri, 2014-10-24 at 09:34 -0400, Jeff Layton wrote:
> > On Fri, 24 Oct 2014 07:26:44 -0400
> > Jeff Layton <jlayton@primarydata.com> wrote:
> >  
> > > I guess the theory here is that the client sent the request, and got a
> > > TCP ACK and then the server never responded because it dropped the
> > > request? That sounds plausible and we almost certainly want something
> > > like the above for v4.0.
> > > 
> > > If the above patch fixes Christoph's problem with v4.1 though, then I
> > > think we'll need more investigation into why. I don't see a way to get
> > > a dropped request with v4.1. Those mainly occur due to interaction with
> > > the DRC (disabled for v4.1) or with svc_defer (disabled for all v4
> > > requests).
> > > 
> > > Is it possible that one of the nfsd threads is just hung while
> > > processing an RPC and that's why you're not getting a response?
> > > 
> > 
> > Ok, I'm able to reproduce this too and got a network trace. I think
> > Trond is correct as to the cause:
> > 
> > I see a SETATTR call go out and then see a TCP ACK for it a little
> > later, but didn't see a reply sent for it.
> > 
> > I sprinkled in some printks around the "dropit" codepath above and in
> > some other areas however and none of them fired. So while Trond's patch
> > looks like it might be correct, I don't think it'll fix the problem I'm
> > seeing.
> > 
> > Sniffing a little later, it seems like the server replies are being
> > delayed somehow:
> 
> You can get deferrals in the authentication code which trigger drops.
> Here is a variant which tries to deal with that:
> 
> 8<-----------------------------------------------------
> From a9be1292152af2cb651415c44fd3d91b3d017240 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> Date: Fri, 24 Oct 2014 12:54:47 +0300
> Subject: [PATCH v2] nfsd: Ensure that NFSv4 always drops the connection when
>  dropping a request
> 
> Both RFC3530 and RFC5661 impose a requirement on the server that it MUST NOT
> drop a request unless the connection is broken. This is in order to allow
> clients to assume that they can always expect an answer unless the
> connection is broken.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> Cc: stable@vger.kernel.org
> ---
> v2: Also deal with NFSv4 requests dropped due to authentication issues
> 
>  fs/nfsd/nfs4proc.c         |  6 ++++++
>  include/linux/sunrpc/svc.h |  1 +
>  net/sunrpc/svc.c           | 13 +++++++++----
>  3 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index cdeb3cfd6f32..500ac76662a8 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1960,6 +1960,11 @@ static const char *nfsd4_op_name(unsigned opnum)
>  	return "unknown_operation";
>  }
>  
> +static void nfsd4_dropme(struct svc_rqst *rqstp)
> +{
> +	svc_close_xprt(rqstp->rq_xprt);
> +}
> +
>  #define nfsd4_voidres			nfsd4_voidargs
>  struct nfsd4_voidargs { int dummy; };
>  
> @@ -1989,6 +1994,7 @@ struct svc_version	nfsd_version4 = {
>  		.vs_nproc	= 2,
>  		.vs_proc	= nfsd_procedures4,
>  		.vs_dispatch	= nfsd_dispatch,
> +		.vs_dropme	= nfsd4_dropme,
>  		.vs_xdrsize	= NFS4_SVC_XDRSIZE,
>  		.vs_rpcb_optnl	= 1,
>  };
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 21678464883a..824656da1f6d 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -400,6 +400,7 @@ struct svc_version {
>  	 * vs_dispatch == NULL means use default dispatcher.
>  	 */
>  	int			(*vs_dispatch)(struct svc_rqst *, __be32 *);
> +	void			(*vs_dropme)(struct svc_rqst *);
>  };
>  
>  /*
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ca8a7958f4e6..a178496ce2c1 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -1111,9 +1111,13 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>  	rqstp->rq_vers = vers = svc_getnl(argv);	/* version number */
>  	rqstp->rq_proc = proc = svc_getnl(argv);	/* procedure number */
>  
> -	for (progp = serv->sv_program; progp; progp = progp->pg_next)
> -		if (prog == progp->pg_prog)
> +	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> +		if (prog == progp->pg_prog) {
> +			if (vers < progp->pg_nvers)
> +				versp = progp->pg_vers[vers];
>  			break;
> +		}
> +	}
>  
>  	/*
>  	 * Decode auth data, and add verifier to reply buffer.
> @@ -1148,8 +1152,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>  	if (progp == NULL)
>  		goto err_bad_prog;
>  
> -	if (vers >= progp->pg_nvers ||
> -	  !(versp = progp->pg_vers[vers]))
> +	if (versp == NULL)
>  		goto err_bad_vers;
>  
>  	procp = versp->vs_proc + proc;
> @@ -1228,6 +1231,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>   dropit:
>  	svc_authorise(rqstp);	/* doesn't hurt to call this twice */
>  	dprintk("svc: svc_process dropit\n");

I don't think this will fix it either. I turned the above dprintk into
a normal printk and it never fired during the test. As best I can tell,
svc_process_common is not returning 0 when this occurs.

> +	if (versp && versp->vs_dropme)
> +		versp->vs_dropme(rqstp);
>  	return 0;
>  
>  err_short_len:


-- 
Jeff Layton <jlayton@primarydata.com>

  reply	other threads:[~2014-10-24 14:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-20 17:36 xfstests generic/075 failure on recent Linus' tree Christoph Hellwig
2014-10-21 13:12 ` Boaz Harrosh
2014-10-22 11:46 ` Christoph Hellwig
2014-10-22 12:00   ` Trond Myklebust
2014-10-22 13:08     ` Christoph Hellwig
2014-10-24  8:14       ` Trond Myklebust
2014-10-24 10:08         ` [PATCH] nfsd: Ensure that NFSv4 always drops the connection when dropping a request Trond Myklebust
2014-10-24 11:26           ` Jeff Layton
2014-10-24 13:34             ` Jeff Layton
2014-10-24 14:30               ` Trond Myklebust
2014-10-24 14:57                 ` Jeff Layton [this message]
2014-10-24 15:59                   ` Trond Myklebust
2014-10-24 17:08                     ` Jeff Layton
2014-10-27 19:00                       ` Jeff Layton
2014-10-24 13:42           ` Christoph Hellwig
2015-01-16 13:39 ` xfstests generic/075 failure on recent Linus' tree Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141024105758.064f2e14@tlielax.poochiereds.net \
    --to=jeff.layton@primarydata.com \
    --cc=bfields@fieldses.org \
    --cc=hch@infradead.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=trond.myklebust@primarydata.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox