From: Jeff Layton <jeff.layton@primarydata.com>
To: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Christoph Hellwig <hch@infradead.org>,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
bfields@fieldses.org
Subject: Re: [PATCH] nfsd: Ensure that NFSv4 always drops the connection when dropping a request
Date: Fri, 24 Oct 2014 10:57:58 -0400 [thread overview]
Message-ID: <20141024105758.064f2e14@tlielax.poochiereds.net> (raw)
In-Reply-To: <1414161055.21776.2.camel@leira.trondhjem.org>
On Fri, 24 Oct 2014 17:30:55 +0300
Trond Myklebust <trond.myklebust@primarydata.com> wrote:
> On Fri, 2014-10-24 at 09:34 -0400, Jeff Layton wrote:
> > On Fri, 24 Oct 2014 07:26:44 -0400
> > Jeff Layton <jlayton@primarydata.com> wrote:
> >
> > > I guess the theory here is that the client sent the request, and got a
> > > TCP ACK and then the server never responded because it dropped the
> > > request? That sounds plausible and we almost certainly want something
> > > like the above for v4.0.
> > >
> > > If the above patch fixes Christoph's problem with v4.1 though, then I
> > > think we'll need more investigation into why. I don't see a way to get
> > > a dropped request with v4.1. Those mainly occur due to interaction with
> > > the DRC (disabled for v4.1) or with svc_defer (disabled for all v4
> > > requests).
> > >
> > > Is it possible that one of the nfsd threads is just hung while
> > > processing an RPC and that's why you're not getting a response?
> > >
> >
> > Ok, I'm able to reproduce this too and got a network trace. I think
> > Trond is correct as to the cause:
> >
> > I see a SETATTR call go out and then see a TCP ACK for it a little
> > later, but didn't see a reply sent for it.
> >
> > I sprinkled in some printks around the "dropit" codepath above and in
> > some other areas however and none of them fired. So while Trond's patch
> > looks like it might be correct, I don't think it'll fix the problem I'm
> > seeing.
> >
> > Sniffing a little later, it seems like the server replies are being
> > delayed somehow:
>
> You can get deferrals in the authentication code which trigger drops.
> Here is a variant which tries to deal with that:
>
> 8<-----------------------------------------------------
> From a9be1292152af2cb651415c44fd3d91b3d017240 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@primarydata.com>
> Date: Fri, 24 Oct 2014 12:54:47 +0300
> Subject: [PATCH v2] nfsd: Ensure that NFSv4 always drops the connection when
> dropping a request
>
> Both RFC3530 and RFC5661 impose a requirement on the server that it MUST NOT
> drop a request unless the connection is broken. This is in order to allow
> clients to assume that they can always expect an answer unless the
> connection is broken.
>
> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
> Cc: stable@vger.kernel.org
> ---
> v2: Also deal with NFSv4 requests dropped due to authentication issues
>
> fs/nfsd/nfs4proc.c | 6 ++++++
> include/linux/sunrpc/svc.h | 1 +
> net/sunrpc/svc.c | 13 +++++++++----
> 3 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index cdeb3cfd6f32..500ac76662a8 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1960,6 +1960,11 @@ static const char *nfsd4_op_name(unsigned opnum)
> return "unknown_operation";
> }
>
> +static void nfsd4_dropme(struct svc_rqst *rqstp)
> +{
> + svc_close_xprt(rqstp->rq_xprt);
> +}
> +
> #define nfsd4_voidres nfsd4_voidargs
> struct nfsd4_voidargs { int dummy; };
>
> @@ -1989,6 +1994,7 @@ struct svc_version nfsd_version4 = {
> .vs_nproc = 2,
> .vs_proc = nfsd_procedures4,
> .vs_dispatch = nfsd_dispatch,
> + .vs_dropme = nfsd4_dropme,
> .vs_xdrsize = NFS4_SVC_XDRSIZE,
> .vs_rpcb_optnl = 1,
> };
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 21678464883a..824656da1f6d 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -400,6 +400,7 @@ struct svc_version {
> * vs_dispatch == NULL means use default dispatcher.
> */
> int (*vs_dispatch)(struct svc_rqst *, __be32 *);
> + void (*vs_dropme)(struct svc_rqst *);
> };
>
> /*
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ca8a7958f4e6..a178496ce2c1 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -1111,9 +1111,13 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
> rqstp->rq_vers = vers = svc_getnl(argv); /* version number */
> rqstp->rq_proc = proc = svc_getnl(argv); /* procedure number */
>
> - for (progp = serv->sv_program; progp; progp = progp->pg_next)
> - if (prog == progp->pg_prog)
> + for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> + if (prog == progp->pg_prog) {
> + if (vers < progp->pg_nvers)
> + versp = progp->pg_vers[vers];
> break;
> + }
> + }
>
> /*
> * Decode auth data, and add verifier to reply buffer.
> @@ -1148,8 +1152,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
> if (progp == NULL)
> goto err_bad_prog;
>
> - if (vers >= progp->pg_nvers ||
> - !(versp = progp->pg_vers[vers]))
> + if (versp == NULL)
> goto err_bad_vers;
>
> procp = versp->vs_proc + proc;
> @@ -1228,6 +1231,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
> dropit:
> svc_authorise(rqstp); /* doesn't hurt to call this twice */
> dprintk("svc: svc_process dropit\n");
I don't think this will fix it either. I turned the above dprintk into
a normal printk and it never fired during the test. As best I can tell,
svc_process_common is not returning 0 when this occurs.
> + if (versp && versp->vs_dropme)
> + versp->vs_dropme(rqstp);
> return 0;
>
> err_short_len:
--
Jeff Layton <jlayton@primarydata.com>
next prev parent reply other threads:[~2014-10-24 14:58 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-20 17:36 xfstests generic/075 failure on recent Linus' tree Christoph Hellwig
2014-10-21 13:12 ` Boaz Harrosh
2014-10-22 11:46 ` Christoph Hellwig
2014-10-22 12:00 ` Trond Myklebust
2014-10-22 13:08 ` Christoph Hellwig
2014-10-24 8:14 ` Trond Myklebust
2014-10-24 10:08 ` [PATCH] nfsd: Ensure that NFSv4 always drops the connection when dropping a request Trond Myklebust
2014-10-24 11:26 ` Jeff Layton
2014-10-24 13:34 ` Jeff Layton
2014-10-24 14:30 ` Trond Myklebust
2014-10-24 14:57 ` Jeff Layton [this message]
2014-10-24 15:59 ` Trond Myklebust
2014-10-24 17:08 ` Jeff Layton
2014-10-27 19:00 ` Jeff Layton
2014-10-24 13:42 ` Christoph Hellwig
2015-01-16 13:39 ` xfstests generic/075 failure on recent Linus' tree Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141024105758.064f2e14@tlielax.poochiereds.net \
--to=jeff.layton@primarydata.com \
--cc=bfields@fieldses.org \
--cc=hch@infradead.org \
--cc=linux-nfs@vger.kernel.org \
--cc=trond.myklebust@primarydata.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox