From: Chuck Lever <chuck.lever@oracle.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@redhat.com>,
linux-nfs@vger.kernel.org, steved@redhat.com,
trond.myklebust@fys.uio.no
Subject: Re: [PATCH] mount.nfs: prefer IPv4 addresses over IPv6 (try #2)
Date: Thu, 21 Jan 2010 15:28:46 -0500
Message-ID: <4B58B8FE.80800@oracle.com>
In-Reply-To: <20100121195746.GC22021@fieldses.org>
On 01/21/2010 02:57 PM, J. Bruce Fields wrote:
> On Thu, Jan 21, 2010 at 02:37:58PM -0500, Chuck Lever wrote:
>> On 01/21/2010 02:15 PM, J. Bruce Fields wrote:
>>> On Wed, Jan 20, 2010 at 10:36:36AM -0500, Chuck Lever wrote:
>>>> For the record, we looked at Solaris behavior yesterday. With bi-family
>>>> servers, its mount command tries IPv6 first, but appears smart enough to
>>>> fall back to IPv4. One thing we haven't tried is to see how difficult it
>>>> would be to fix the real problem by adding proper protocol family
>>>> negotiation to our own mount command.
>>>
>>> Sorry, I probably just haven't been following: what's "proper protocol
>>> family negotiation"? I thought the only ways to negotiate were either
>>> rpcbind (v2, v3) or trial and error (v4)?
>>
>> In TI-RPC parlance, a "protocol" is the transport protocol (UDP, for
>> example), and a "protocol family" is the address family ("inet6", for
>> example). A netid represents a particular combination of the two: the
>> netid "udp6" represents UDP over "inet6".
>>
>> The "protocol family" is really the value that is passed to socket(2).
>> This call generally takes PF_INET or something like that as its first
>> argument. All of the PF_FOO thingies have the same integer value as
>> their AF_FOO counterparts. For TI-RPC, we have "inet" and "inet6",
>> which are strings that match up with the AF_FOO and PF_FOO names.
>>
>> rpcb_getaddr(3t) is designed to use the rpcbind protocol to determine
>> the address and transport to use when contacting a remote service. Our
>> mount command has its own negotiation mechanism that is a superset of
>> rpcbind calls, in addition to having a faster timeout than
>> rpcb_getaddr(3t).
>
> What does "is a superset of rpcbind calls" mean?
rpcb_getaddr(3t) performs a single specific rpcbind query with a long
fixed timeout. mount.nfs uses several rpcbind queries, in a particular
order, to identify which NFS-related services are available. mount.nfs
uses individual queries rather than a single DUMPALL in order to enable
firewalls to detect which ports should be opened.
> I still don't
> understand what the proper protocol family negotiation is: what actually
> happens on the wire?
If a particular RPC service (including rpcbind) cannot be contacted via
"inet6," and the server has an "inet" address listed in DNS, then
mount.nfs should be smart enough to try the mount request via the "inet"
address too. (This is in addition to support for rpcbind queries that
can return a netid, which would include information about which protocol
family to use.)
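As a rough illustration of the fallback described above, here is a minimal
Python sketch. It is not mount.nfs code. The function name
mount_with_fallback and the attempt callback are hypothetical, and the
candidate list is assumed to already be ordered with "inet6" addresses
ahead of "inet" ones.

```python
import socket


def mount_with_fallback(candidates, attempt):
    """Try each (family, address) pair in order; return the first that works.

    `candidates` is assumed to be ordered inet6-first, e.g. as derived from
    a DNS lookup. `attempt` is a hypothetical callable that performs the
    contact (rpcbind query or mount request) and raises OSError on failure.
    """
    last_error = None
    for family, address in candidates:
        try:
            attempt(family, address)
            return family, address
        except OSError as exc:
            # Remember the failure and move on to the next address family.
            last_error = exc
    raise last_error or OSError("no addresses to try")
```

With a server that has both address families listed but only answers on
IPv4, the inet6 attempt fails and the inet address is tried next instead
of the whole mount failing.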
Currently our mount.nfs command fails if the target server has at least
one IPv6 address listed in DNS in addition to an IPv4 address, but does
not support NFS/IPv6.
For NFSv4, a server that has an IPv6 address but does not support
NFS/IPv6 will refuse connection to port 2049 over IPv6. In that case,
mount.nfs should tell the kernel to retry the mount with the server's
IPv4 address, if it has one.
For NFSv3, a server that has an IPv6 address, but does not support
NFS/IPv6, will not register any inet6 netids in its rpcbind database.
Thus the mount.nfs command has to be smart enough to retry
PROGNOTREGISTERED results with the server's IPv4 address, if it has one.
If the server has an IPv6 address, but is running portmap instead of
rpcbind, the initial rpcbind query connection will be refused (portmap
does not set up an IPv6 listener). In that case, the mount request
should be retried with the server's IPv4 address, if it has one.
Note that in any of these cases, if an NFS server does not have any IPv6
addresses listed in DNS, then behavior should be the same as before.
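The three cases above boil down to one retry rule, sketched here in
Python. This is only an illustration: the Outcome names and
should_retry_over_inet are hypothetical, and treating every other error
as non-retryable is my assumption, not something decided in this thread.

```python
from enum import Enum, auto


class Outcome(Enum):
    OK = auto()
    CONNECTION_REFUSED = auto()   # NFSv4 port 2049 refused, or portmap with no IPv6 listener
    PROG_NOT_REGISTERED = auto()  # NFSv3: no inet6 netids in the rpcbind database
    OTHER_ERROR = auto()          # assumption: anything else is not retried


# Results that suggest the server simply does not speak NFS over IPv6.
RETRYABLE = {Outcome.CONNECTION_REFUSED, Outcome.PROG_NOT_REGISTERED}


def should_retry_over_inet(tried_family, outcome, has_inet_address):
    """Return True if a failed inet6 attempt should be retried over inet."""
    return (
        tried_family == "inet6"
        and has_inet_address
        and outcome in RETRYABLE
    )
```

A server with no IPv6 addresses in DNS never has an "inet6" attempt, so
the rule never applies and behavior stays as it is today.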
--
chuck[dot]lever[at]oracle[dot]com