From: NeilBrown <neilb@suse.com>
To: Rick Macklem <rmacklem@uoguelph.ca>,
	Olga Kornievskaia <aglo@umich.edu>, Tom Talpey <tom@talpey.com>
Cc: Chuck Lever <chuck.lever@oracle.com>,
	Schumaker Anna <Anna.Schumaker@netapp.com>,
	Trond Myklebust <trondmy@hammerspace.com>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 0/9] Multiple network connections for a single NFS mount.
Date: Fri, 31 May 2019 11:01:37 +1000
Message-ID: <87h89bxwr2.fsf@notabene.neil.brown.name>
In-Reply-To: <QB1PR01MB2643963C3A7EDE1D92C57221DD180@QB1PR01MB2643.CANPRD01.PROD.OUTLOOK.COM>

On Thu, May 30 2019, Rick Macklem wrote:

> Olga Kornievskaia wrote:
>>On Thu, May 30, 2019 at 1:05 PM Tom Talpey <tom@talpey.com> wrote:
>>>
>>> On 5/29/2019 8:41 PM, NeilBrown wrote:
>>> > I've also re-arranged the patches a bit, merged two, and removed the
>>> > restriction to TCP and NFSv4.x, x>=1.  Discussions seemed to suggest
>>> > these restrictions were not needed, and I can see no need.
>>>
>>> I believe the need is for the correctness of retries. Because NFSv2,
>>> NFSv3 and NFSv4.0 have no exactly-once semantics of their own, server
>>> duplicate request caches are important (although often imperfect).
>>> These caches use client XID's, source ports and addresses, sometimes
>>> in addition to other methods, to detect retry. Existing clients are
>>> careful to reconnect with the same source port, to ensure this. And
>>> existing servers won't change.
>>
>>Retries are already bound to the same connection so there shouldn't be
>>an issue of a retransmission coming from a different source port.
> I don't think the above is correct for NFSv4.0 (it may very well be true for NFSv3).

It is correct for the Linux implementation of NFS, though the term
"xprt" is more accurate than "connection".

A "task" is bound to a specific "xprt" which, in the case of TCP, has a
fixed source port.  If the TCP connection breaks, a new one is created
with the same addresses and ports, and this new connection serves the
same xprt.
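
As a rough userspace illustration of the paragraph above (not the kernel's xprt code), the sketch below keeps one source port for the life of a transport object and rebinds to it on every reconnect. The `Xprt` class, its method names, and the zero-linger close used to simulate a broken connection are all assumptions made for this example.

```python
import socket
import struct


class Xprt:
    """Hypothetical stand-in for an RPC transport with a sticky source port."""

    def __init__(self, server_addr):
        self.server_addr = server_addr
        self.src_port = 0      # 0 means "let the kernel pick" on first connect
        self.sock = None

    def connect(self):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow rebinding the port that the previous connection used.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("", self.src_port))
        s.connect(self.server_addr)
        # Remember the port so that later reconnects reuse it.
        self.src_port = s.getsockname()[1]
        self.sock = s

    def break_connection(self):
        # Demo-only assumption: a zero linger timeout makes close() send RST,
        # which avoids leaving the old connection in TIME_WAIT on this side.
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                             struct.pack("ii", 1, 0))
        self.sock.close()
        self.sock = None
```

A caller would invoke `connect()` again after noticing the connection has broken; the server then sees the new connection arrive from the same source port as the old one.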

> Here's what RFC7530 Sec. 3.1.1 says:
> 3.1.1.  Client Retransmission Behavior
>
>    When processing an NFSv4 request received over a reliable transport
>    such as TCP, the NFSv4 server MUST NOT silently drop the request,
>    except if the established transport connection has been broken.
>    Given such a contract between NFSv4 clients and servers, clients MUST
>    NOT retry a request unless one or both of the following are true:
>
>    o  The transport connection has been broken
>
>    o  The procedure being retried is the NULL procedure
>
> If the transport connection is broken, the retry needs to be done on a new TCP
> connection, does it not? (I'm assuming you are referring to a retry of an RPC here.)
> (My interpretation of "broken" is "can't be fixed", so the client must use a different
>  TCP connection.)

Yes, a new connection.  But the Linux client makes sure to use the same
source port.

>
> Also, NFSv4.0 cannot use Sun RPC over UDP, whereas some DRCs only
> work for UDP traffic. (The FreeBSD server does have DRC support for TCP, but
> the algorithm is very different than what is used for UDP, due to the long delay
> before a retried RPC request is received. This can result in significant server
> overheads, so some sites choose to disable the DRC for TCP traffic or tune it
> in such a way that it becomes almost useless.)
> The FreeBSD DRC code for NFS over TCP expects the retry to be from a different
> port# (due to a new connection re: the above) for NFSv4.0. For NFSv3, my best
> recollection is that it doesn't care what the source port# is. (It basically uses a
> hash on the RPC request excluding TCP/IP header to recognize possible
> duplicates.)

Interesting... hopefully the hash is sufficiently strong.
I think it is best to assume the same source port, though no formal
standard requires it.
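
To make the two keying strategies mentioned in this thread concrete, here is a toy sketch of a duplicate request cache keyed on (XID, source address, source port) next to one keyed on a hash of the RPC message alone. Neither class reflects the actual FreeBSD or Linux knfsd code; the class names, the `handle()` signature, and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib


class AddrKeyedDRC:
    """Toy cache keyed on (XID, source address, source port)."""

    def __init__(self):
        self.replies = {}

    def handle(self, src, xid, body, execute):
        key = (xid, src)               # src is an (address, port) tuple
        if key not in self.replies:
            self.replies[key] = execute(body)   # treated as a new request
        return self.replies[key]


class HashKeyedDRC:
    """Toy cache keyed on a hash of the RPC message, ignoring the source."""

    def __init__(self):
        self.replies = {}

    def handle(self, src, xid, body, execute):
        message = xid.to_bytes(4, "big") + body
        key = hashlib.sha256(message).digest()
        if key not in self.replies:
            self.replies[key] = execute(body)
        return self.replies[key]


def count_executions(drc, retry_port):
    """Send one request from port 800, retry it from retry_port, and
    return how many times the (non-idempotent) operation actually ran."""
    runs = []

    def execute(body):
        runs.append(body)
        return b"ok"

    drc.handle(("192.0.2.1", 800), 0x1234, b"REMOVE file", execute)
    drc.handle(("192.0.2.1", retry_port), 0x1234, b"REMOVE file", execute)
    return len(runs)
```

With these toy classes, a retry from the same port is caught by both caches. A retry from a different port is executed a second time by the address-keyed cache but is still recognized by the hash-keyed one.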

Thanks,
NeilBrown


>
> I don't know what other NFS servers choose to do w.r.t. the DRC for NFS over TCP,
> however for some reason I thought that the Linux knfsd only used a DRC for UDP?
> (Someone please clarify this.)
>
> rick
>
>> Multiple connections will result in multiple source ports, and possibly
>> multiple source addresses, meaning retried client requests may be
>> accepted as new, rather than having any chance of being recognized as
>> retries.
>>
>> NFSv4.1+ don't have this issue, but removing the restrictions would
>> seem to break the downlevel mounts.
>>
>> Tom.
>>
