linux-nfs.vger.kernel.org archive mirror
* nfsd delays between svc_recv and gss_check_seq_num
@ 2016-04-10 11:44 Benjamin Coddington
  2016-04-25 18:38 ` J. Bruce Fields
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Benjamin Coddington @ 2016-04-10 11:44 UTC
  To: linux-nfs

My client hangs on xfstests generic/074 on a krb5 mount, and I've found that
the Linux server is silently discarding one or more RPCs because the GSS
sequence numbers are outside the sequence window.

The reason is that sometimes one of the nfsd threads takes a long time
between receiving the RPC and checking whether its sequence number is within
the window.  That delay allows the other nfsd threads to quickly move the
window forward, leaving the delayed RPC out of range.

If the server discards the RPC, the client waits forever for a response, or
until the connection is reset.

By inserting tracepoints, I think I found two sources of delay:

 1) gss_svc_searchbyctx() uses dup_to_netobj(), which calls kmemdup() with
GFP_KERNEL.  It does this presumably because it doesn't know how big the
context handle should be.

 2) gss_verify_mic() uses make_checksum() which eventually gets to
crypto_alloc_hash() with GFP_KERNEL.

For the first delay, can we assume the context handles are all going to be
the same size?  It looks like the handle is assigned by the server, so it
seems like we should be able to know beforehand how large they are.
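
If the handle size really is bounded and known in advance, the copy could go
into a fixed-size buffer with no allocation at all.  A rough userspace sketch
of that idea follows; CTX_HANDLE_MAX, struct ctx_handle, ctx_handle_copy and
ctx_handle_equal are hypothetical names, and the bound of 16 bytes is an
assumption, not a value taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CTX_HANDLE_MAX 16u	/* hypothetical upper bound on handle length */

struct ctx_handle {
	size_t len;
	unsigned char data[CTX_HANDLE_MAX];
};

/*
 * Copy a handle of len bytes into a caller-provided buffer (for example one
 * on the stack).  Nothing is allocated, so the caller cannot block here.
 * Returns false if the handle is larger than the assumed bound.
 */
bool ctx_handle_copy(struct ctx_handle *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > CTX_HANDLE_MAX)
		return false;
	memcpy(dst->data, src, len);
	dst->len = len;
	return true;
}

/* Compare two handles by length and content, as a cache lookup would. */
bool ctx_handle_equal(const struct ctx_handle *a, const struct ctx_handle *b)
{
	return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

A handle longer than the bound is rejected rather than truncated, so an
oversized handle from the wire can never match a cached context by accident.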

For the second allocation, I haven't put a lot of thought into what could be
done to fix it; it seems a bit trickier.  I'll think about both of these a
bit more, but in the meantime I thought I'd ask if anyone has thoughts about
this problem.  Maybe we can do the sequence check before verify_mic, but
then a message that fails verification could flip the sequence bit.

Ben


Thread overview: 7+ messages
2016-04-10 11:44 nfsd delays between svc_recv and gss_check_seq_num Benjamin Coddington
2016-04-25 18:38 ` J. Bruce Fields
2016-04-25 19:23   ` Benjamin Coddington
2016-04-25 20:10     ` J. Bruce Fields
2016-04-25 21:22 ` J. Bruce Fields
2016-07-20 10:33 ` Red Hat
2016-07-20 11:31   ` Benjamin Coddington
