Re: nfsd delays between svc_recv and gss_check_seq_num

From: bfields@fieldses.org (J. Bruce Fields)
To: Benjamin Coddington <bcodding@redhat.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: nfsd delays between svc_recv and gss_check_seq_num
Date: Mon, 25 Apr 2016 14:38:02 -0400
Message-ID: <20160425183802.GA20742@fieldses.org>
In-Reply-To: <alpine.OSX.2.19.9992.1604100724080.44797@planck>

On Sun, Apr 10, 2016 at 07:44:45AM -0400, Benjamin Coddington wrote:
> My client hangs on xfstests generic/074 on a krb5 mount, and I've found that
> the linux server is silently discarding one or more RPCs because the GSS
> sequence numbers are outside the sequence window.
> 
> The reason is that sometimes one of the nfsd threads takes a long time
> between receiving the RPC and then checking if the sequence is within the
> window.  That delay allows the other nfsd threads to quickly move the window
> forward out of range.
> 
> If the server discards the RPC, the client then waits forever for a
> response, or until the connection is reset.
> 
> By inserting tracepoints, I think I found two sources of delay:
> 
>  1) gss_svc_searchbyctx() uses dup_to_netobj() which has a kmemdup with
> GFP_KERNEL.  It does this because presumably it doesn't know how big the
> context handle should be.
> 
>  2) gss_verify_mic() uses make_checksum() which eventually gets to
> crypto_alloc_hash() with GFP_KERNEL.
> 
> For the first delay, can we assume the context handles are all going to be
> the same size?  It looks like the handle is assigned by the server, so it
> seems like we should be able to know beforehand how large they are.

It's assigned by the server, but I believe that happens in userland,
either in svcgssd or gss-proxy.  On a quick look I can't find a limit
other than the rpc-imposed limit of 400 bytes for an rpc credential.  So
we'd need a documented agreement with svcgssd and gss-proxy for that.
Probably easy for the former, not sure about the latter.
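
To illustrate what an agreed upper bound would buy us, here is a minimal
userspace sketch, not kernel code.  It assumes the handle never exceeds the
400-byte credential limit mentioned above, so the copy can go into
caller-provided storage with no allocation.  The names handle_buf and
handle_copy are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: handles fit within the 400-byte rpc credential limit. */
#define MAX_HANDLE_LEN 400

/* Hypothetical fixed-size holder for a context handle. */
struct handle_buf {
    size_t len;
    uint8_t data[MAX_HANDLE_LEN];
};

/*
 * Copy a handle into preallocated storage.
 * Returns 0 on success, -1 if the handle exceeds the agreed bound.
 * No memory is allocated, so the copy cannot block on reclaim.
 */
static int handle_copy(struct handle_buf *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len > sizeof(dst->data))
        return -1;
    memcpy(dst->data, src, len);
    dst->len = len;
    return 0;
}
```

An oversized handle is rejected rather than truncated, which is why the bound
would have to be something svcgssd and gss-proxy both commit to.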

> For the second allocation -- I haven't put a lot of thought into what
> could be done to fix it.. seems a bit trickier.  I'll think about both of
> these a bit more, but I thought in the meantime to ask if anyone has
> thoughts about this problem.  Maybe we can do the sequence check before
> verify_mic -- but then a message that fails verification could flip the
> sequence bit..

How often is this happening?  Could we increase the sequence window?
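
For reference, a simplified userspace model of a sliding sequence window
shows why the window size matters here.  This is a sketch under my own
assumptions, not the server's gss_check_seq_num(): the struct, the function
names, and the SEQ_WIN_MAX bound are hypothetical, and there is no locking.
A request whose check is delayed is dropped once other requests have moved
the window past it, and a larger window tolerates a longer delay.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SEQ_WIN_MAX 1024   /* hypothetical upper bound for this sketch */

struct seq_window {
    uint32_t max;            /* highest sequence number accepted so far */
    uint32_t size;           /* window size, in sequence numbers */
    bool seen[SEQ_WIN_MAX];  /* seen[n % size] is set once n is accepted */
};

static void seq_window_init(struct seq_window *w, uint32_t size)
{
    assert(size > 0 && size <= SEQ_WIN_MAX);
    memset(w, 0, sizeof(*w));
    w->size = size;
}

/* Returns true if the request is accepted, false if it is dropped. */
static bool seq_window_check(struct seq_window *w, uint32_t seq)
{
    if (seq > w->max) {
        if (seq - w->max >= w->size) {
            /* Jumped a whole window ahead: forget everything. */
            memset(w->seen, 0, sizeof(w->seen));
            w->max = seq;
        } else {
            /* Slide forward, clearing the slots that get reused. */
            while (w->max < seq) {
                w->max++;
                w->seen[w->max % w->size] = false;
            }
        }
        w->seen[seq % w->size] = true;
        return true;
    }
    if (w->max - seq >= w->size)
        return false;        /* fell behind the window: dropped */
    if (w->seen[seq % w->size])
        return false;        /* already seen: replay, dropped */
    w->seen[seq % w->size] = true;
    return true;
}

/*
 * Model the reported race: request `late` is received first, but its check
 * runs only after requests late+1 .. late+ahead have been checked.
 */
static bool delayed_request_accepted(uint32_t win_size, uint32_t late,
                                     uint32_t ahead)
{
    struct seq_window w;
    uint32_t i;

    seq_window_init(&w, win_size);
    for (i = 1; i < late; i++)
        seq_window_check(&w, i);
    for (i = late + 1; i <= late + ahead; i++)
        seq_window_check(&w, i);
    return seq_window_check(&w, late);
}
```

With a window of 4, delayed_request_accepted(4, 5, 5) is false because five
later requests have already advanced the window; with a window of 16 the same
delayed request is still accepted.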

--b.
