* NFSv4.1 backchannel for RDMA
@ 2015-01-23 21:00 Chuck Lever
From: Chuck Lever @ 2015-01-23 21:00 UTC (permalink / raw)
  To: Linux NFS Mailing List

Hi-

I’d like to restart the discussion in this thread:

  http://marc.info/?l=linux-nfs&m=141348840527766&w=2

It seems to me there are two main points:

1.  Is bi-directional RPC on RPC/RDMA transports desirable?

2.  Is a secondary backchannel-only transport adequate and reliable?

I’ll try to summarize the current thinking.


Question 1:

The main reason to plumb bi-RPC into RPC/RDMA is that no changes to
the NFSv4.1 client upper layers would be needed. I think we also
agree that:

 - There is no performance benefit. CB operations typically lack
   significant payload, are infrequent, and can be long-running.

 - There is no need to penetrate firewalls. Firewall compatibility
   was the original motivation for single-transport NFSv4.1
   operation. Firewalls are not typically found in RDMA-native
   environments.

 - There is no requirement in RFC 5661 for the forward channel
   transport to support bi-directional RPC. Backchannel capability
   is detected via the CREATE_SESSION operation.

 - TCP connectivity will always be available wherever NFS/RDMA is
   deployed. For NFS/RDMA operation, IP address to GUID mapping must
   be provided by the transport layer, below RPC/RDMA.

 - To handle large payloads (possibly required by certain pNFS
   CB operations), an NFSv4.1 client would need to handle
   RDMA_NOMSG type calls over the backchannel. This would require
   the client to perform RDMA READ and WRITE operations against the
   server (the opposite of what happens in the forward channel).
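
As a rough illustration of the CREATE_SESSION detection mentioned above, here is a minimal sketch in Python. The flag values and the csa_flags/csr_flags field names come from RFC 5661, section 18.36; the function name is hypothetical and does not correspond to any existing client code.

```python
# Flag values as defined in RFC 5661, section 18.36 (CREATE_SESSION).
CREATE_SESSION4_FLAG_PERSIST = 0x00000001
CREATE_SESSION4_FLAG_CONN_BACK_CHAN = 0x00000002
CREATE_SESSION4_FLAG_CONN_RDMA = 0x00000004


def backchannel_on_forward_transport(csa_flags: int, csr_flags: int) -> bool:
    """Hypothetical helper: did the server agree to run the backchannel
    on the connection that carried CREATE_SESSION?

    csa_flags is the flag word the client sent, csr_flags the one the
    server returned. The backchannel is usable on this connection only
    if the client asked for it and the server left the bit set.
    """
    requested = bool(csa_flags & CREATE_SESSION4_FLAG_CONN_BACK_CHAN)
    granted = bool(csr_flags & CREATE_SESSION4_FLAG_CONN_BACK_CHAN)
    return requested and granted


if __name__ == "__main__":
    # Client asks for RDMA mode plus a backchannel; the server (assumed
    # here to lack bi-directional RPC/RDMA) clears the backchannel bit.
    asked = CREATE_SESSION4_FLAG_CONN_BACK_CHAN | CREATE_SESSION4_FLAG_CONN_RDMA
    answered = CREATE_SESSION4_FLAG_CONN_RDMA
    print(backchannel_on_forward_transport(asked, answered))
```

In this sketch a False result is the case where a client without bi-directional RPC/RDMA would have to look for another way to provide a backchannel.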

There is some interest in prototyping an RPC/RDMA transport that is
capable of bi-directional RPC. A prototype would help us determine
whether there are subtle problems that make bi-RPC impossible for
RPC/RDMA, and identify any spec gaps that need to be addressed.
Because of the development cost and lack of perceptible benefits, a
prototype has not been attempted so far.

Would it be productive for a bi-capable RPC/RDMA transport prototype
to be pursued in Linux?


Question 2:

The Solaris client and server already implement a sidecar TCP
backchannel for NFSv4.1. This is something that can be tested.
Further, I think we agree that:

 - Servers are required to support a separate backchannel and
   forward channel transport, and both sides can detect what is
   supported with CREATE_SESSION. However, no existing
   implementations have deployed this kind of logic widely.
 
 - The addition of a separate backchannel-only connection is
   considered session trunking, which is regarded as potentially
   hazardous. We haven’t identified exactly what the hazards might
   be when the second connection handles only backchannel activity.

 - Although there are few or no server changes required to support
   a secondary backchannel, clients would have to be modified to
   establish this connection when one or both sides do not support
   a backchannel on the main transport and the server asserts the
   SEQ4_STATUS_CB_PATH_DOWN flag.

 - We have some confidence that creating the second backchannel-only
   connection and then sending BIND_CONN_TO_SESSION is adequate and
   robust. However, the salient recovery edge conditions when a
   secondary backchannel transport is in use still need to be
   identified.
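
The client-side decision described in the points above might be sketched as follows. This is only an illustration under stated assumptions, not the Solaris or Linux logic: the constants and the bctsa_* field names are taken from RFC 5661 (sections 18.34 and 18.46), while both function names and the "connect_tcp" step label are hypothetical.

```python
# Constants as defined in RFC 5661.
SEQ4_STATUS_CB_PATH_DOWN = 0x00000001  # SEQUENCE sr_status_flags bit
CDFC4_BACK = 0x2                       # channel_dir_from_client4: backchannel only


def needs_sidecar_backchannel(main_has_backchannel: bool,
                              sr_status_flags: int) -> bool:
    """Hypothetical check: open a backchannel-only connection when the
    main transport carries no backchannel and the server reports that
    the callback path is down."""
    cb_path_down = bool(sr_status_flags & SEQ4_STATUS_CB_PATH_DOWN)
    return (not main_has_backchannel) and cb_path_down


def sidecar_steps(server_addr: str) -> list:
    """Hypothetical plan: connect over TCP, then bind the new
    connection to the existing session as backchannel only."""
    return [
        ("connect_tcp", {"addr": server_addr}),
        ("BIND_CONN_TO_SESSION", {
            "bctsa_dir": CDFC4_BACK,
            "bctsa_use_conn_in_rdma_mode": False,
        }),
    ]


if __name__ == "__main__":
    if needs_sidecar_backchannel(False, SEQ4_STATUS_CB_PATH_DOWN):
        for name, args in sidecar_steps("192.0.2.1"):
            print(name, args)
```

The sketch deliberately leaves out recovery handling, since those edge conditions are exactly what remains to be identified.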

What further investigation is needed to be confident that the sidecar
solution is adequate and appropriate?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




