* GSS unwrapping breaks the DRC
@ 2020-04-15 17:05 Chuck Lever
  2020-04-15 19:25 ` Bruce Fields
From: Chuck Lever @ 2020-04-15 17:05 UTC (permalink / raw)
  To: Bruce Fields, Jeff Layton; +Cc: Linux NFS Mailing List

Hi Bruce and Jeff:

Testing intensive workloads with NFSv3 and NFSv4.0 on NFS/RDMA with krb5i
or krb5p results in a pretty quick workload failure. Closer examination
shows that the client is able to overrun the GSS sequence window with
some regularity. When that happens, the server drops the connection.

However, when the client retransmits requests whose replies were lost, the
retransmissions never hit in the DRC, and that results in unexpected
failures of non-idempotent requests.

The retransmitted XIDs are found in the DRC, but the retransmitted request
has a different checksum than the original. We're hitting the "mismatch"
case in nfsd_cache_key_cmp for these requests.
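
To make the "mismatch" case concrete, here is a minimal userspace sketch of a
cache-key comparison. It is not the kernel's nfsd_cache_key_cmp: the struct,
the enum, and every toy_* name are hypothetical, and the real DRC key carries
more fields than an XID and a checksum.

```c
#include <stdint.h>

/* Hypothetical, simplified cache key (the real DRC key has more fields). */
struct toy_key {
	uint32_t xid;   /* RPC transaction ID */
	uint32_t csum;  /* checksum over the start of the request */
};

enum toy_result { TOY_HIT, TOY_XID_MISMATCH, TOY_MISS };

/* Compare an incoming request's key against one cached entry. */
static enum toy_result toy_key_cmp(const struct toy_key *cached,
				   const struct toy_key *incoming)
{
	if (cached->xid != incoming->xid)
		return TOY_MISS;
	/* Same XID but a different checksum: treated as a different request. */
	if (cached->csum != incoming->csum)
		return TOY_XID_MISMATCH;
	return TOY_HIT;
}
```

In this model a retransmission whose checksum changed lands in
TOY_XID_MISMATCH, so the cached reply is not replayed and the request is
executed a second time.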

I tracked down the problem to the way the DRC computes the length of the
part of the buffer it wants to checksum. nfsd_cache_csum uses

  head.iov_len + page_len

and then caps that at RC_CSUMLEN.

That works fine for krb5 and sys, but the GSS unwrap functions
(integ_unwrap_data and priv_unwrap_data) don't appear to update head.iov_len
properly. So nfsd_cache_csum's length computation is significantly larger
than the clear-text message, and that allows stale parts of the xdr_buf
to be included in the checksum.
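
The effect can be modeled in userspace. The sketch below is an illustration
under stated assumptions, not the kernel code: toy_buf stands in for the head
and page lengths of an xdr_buf, TOY_CSUMLEN stands in for RC_CSUMLEN (256 is
an assumed value), and the additive checksum replaces whatever nfsd_cache_csum
actually computes. For brevity only the head bytes are summed.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CSUMLEN 256u  /* stand-in for RC_CSUMLEN; the value is assumed */

/* Minimal stand-in for the head and page parts of an xdr_buf. */
struct toy_buf {
	const unsigned char *head_base;
	size_t head_len;  /* models head.iov_len */
	size_t page_len;  /* models page_len */
};

/* Length to checksum: head.iov_len + page_len, capped at TOY_CSUMLEN. */
static size_t toy_csum_len(const struct toy_buf *buf)
{
	size_t len = buf->head_len + buf->page_len;

	return len < TOY_CSUMLEN ? len : TOY_CSUMLEN;
}

/* Toy additive checksum over the head bytes that fall inside that length. */
static uint32_t toy_csum_head(const struct toy_buf *buf)
{
	size_t n = toy_csum_len(buf);
	uint32_t sum = 0;

	if (n > buf->head_len)
		n = buf->head_len;
	for (size_t i = 0; i < n; i++)
		sum += buf->head_base[i];
	return sum;
}
```

If two buffers hold the same 16 bytes of clear text but head_len still says
64, any difference in the stale bytes 16..63 changes the result; with
head_len set to 16 the two checksums agree.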

Using xdr_buf_subsegment() at the end of integ_unwrap_data sets the xdr_buf
lengths properly and fixes the situation for krb5i.
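
As a rough picture of what "sets the xdr_buf lengths properly" means, the
sketch below trims a buffer's length fields to a given clear-text length. It
is not xdr_buf_subsegment(): toy_xdr_buf and toy_trim are hypothetical names,
only the length bookkeeping is modeled, and the segment is assumed to start
at offset 0.

```c
#include <stddef.h>

/* Length fields only; the real xdr_buf also carries pointers and pages. */
struct toy_xdr_buf {
	size_t head_len;  /* models head[0].iov_len */
	size_t page_len;  /* models page_len */
	size_t tail_len;  /* models tail[0].iov_len */
	size_t len;       /* total message length */
};

static size_t toy_min(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Shrink buf so it describes only its first `len` bytes.
 * Returns 0 on success, -1 if buf holds fewer than `len` bytes
 * (in which case buf is left unchanged).
 */
static int toy_trim(struct toy_xdr_buf *buf, size_t len)
{
	size_t remaining = len;

	if (len > buf->head_len + buf->page_len + buf->tail_len)
		return -1;

	buf->head_len = toy_min(buf->head_len, remaining);
	remaining -= buf->head_len;
	buf->page_len = toy_min(buf->page_len, remaining);
	remaining -= buf->page_len;
	buf->tail_len = remaining;  /* whatever is left fits in the tail */
	buf->len = len;
	return 0;
}
```

After a trim like this, a length computed as head_len + page_len can no
longer reach past the clear-text message.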

I don't see a similar solution for priv_unwrap_data: there's no MIC len
available, and priv_len is not the actual length of the clear-text message.

Moreover, the comment in fix_priv_head() is disturbing. I don't see anywhere
where the relationship between the buf's head/len and how svc_defer works is
authoritatively documented. It's not clear exactly how priv_unwrap_data is
supposed to accommodate svc_defer, or whether integ_unwrap_data also needs
to accommodate it.

So I can't tell if the GSS unwrap functions are wrong or if there's a more
accurate way to compute the message length in nfsd_cache_csum. I suspect
both could use some improvement, but I'm not certain exactly what that
might be.


--
Chuck Lever





