* [PATCH 1/1] mount: RDMA processing in the mount command is broken
@ 2010-09-02 20:00 Steve Dickson
2010-09-02 20:15 ` Chuck Lever
2010-09-02 20:35 ` Chuck Lever
0 siblings, 2 replies; 10+ messages in thread
From: Steve Dickson @ 2010-09-02 20:00 UTC (permalink / raw)
To: Linux NFS Mailing list
The mounting code that process RMDA mounts is broken in a few places
First with '-o proto=rdma' was broken because nfs_get_proto()
did not how to convert a netid of 'rdma' in to a AF_INET
address family.
Secondly, '-o rdma' was broken because po_get() was being using
to detect the existence of 'rdma' in the options. With '-o rdma'
there is no value associated with that option so po_get()
was always return NULL.
This patch address both those problems.
Signed-off-by: Steve Dickson <steved@redhat.com>
---
support/nfs/getport.c | 8 ++++++++
utils/mount/stropts.c | 31 ++++++++++++++++++++++---------
2 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/support/nfs/getport.c b/support/nfs/getport.c
index c930539..b4c2f8f 100644
--- a/support/nfs/getport.c
+++ b/support/nfs/getport.c
@@ -216,6 +216,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
struct netconfig *nconf;
struct protoent *proto;
+ if (strcasecmp(netid, "rdma") == 0) {
+ *family = AF_INET;
+ return 1;
+ }
nconf = getnetconfigent(netid);
if (nconf == NULL)
return 0;
@@ -242,6 +246,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
{
struct protoent *proto;
+ if (strcasecmp(netid, "rdma") == 0) {
+ *family = AF_INET;
+ return 1;
+ }
proto = getprotobyname(netid);
if (proto == NULL)
return 0;
diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
index 0241400..d688ee8 100644
--- a/utils/mount/stropts.c
+++ b/utils/mount/stropts.c
@@ -98,6 +98,22 @@ struct nfsmount_info {
child; /* forked bg child? */
};
+/*
+ * Check the options to see if the transport is RDMA
+ */
+static int rdma_enabled(struct mount_options *options)
+{
+ char *option;
+
+ if (po_contains(options, "rdma"))
+ return 1;
+ option = po_get(options, "proto");
+ if (option && strcmp(option, "rdma") == 0)
+ return 1;
+
+ return 0;
+}
+
#ifdef MOUNT_CONFIG
static void nfs_default_version(struct nfsmount_info *mi);
@@ -302,11 +318,8 @@ static int nfs_set_version(struct nfsmount_info *mi)
if (strncmp(mi->type, "nfs4", 4) == 0)
mi->version = 4;
- else {
- char *option = po_get(mi->options, "proto");
- if (option && strcmp(option, "rdma") == 0)
- mi->version = 3;
- }
+ else if (rdma_enabled(mi->options))
+ mi->version = 3;
/*
* If we still don't know, check for version-specific
@@ -346,7 +359,9 @@ static int nfs_validate_options(struct nfsmount_info *mi)
if (!nfs_parse_devname(mi->spec, &mi->hostname, NULL))
return 0;
- if (!nfs_nfs_proto_family(mi->options, &family))
+ if (rdma_enabled(mi->options))
+ family = AF_INET;
+ else if (!nfs_nfs_proto_family(mi->options, &family))
return 0;
hint.ai_family = (int)family;
@@ -491,13 +506,11 @@ nfs_rewrite_pmap_mount_options(struct mount_options *options)
struct sockaddr *mnt_saddr = &mnt_address.sa;
socklen_t mnt_salen = sizeof(mnt_address);
struct pmap mnt_pmap;
- char *option;
/*
* Skip option negotiation for proto=rdma mounts.
*/
- option = po_get(options, "proto");
- if (option && strcmp(option, "rdma") == 0)
+ if (rdma_enabled(options))
goto out;
/*
--
1.7.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 20:00 [PATCH 1/1] mount: RDMA processing in the mount command is broken Steve Dickson
@ 2010-09-02 20:15 ` Chuck Lever
2010-09-02 22:45 ` Steve Dickson
2010-09-02 20:35 ` Chuck Lever
1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2010-09-02 20:15 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
> The mounting code that process RMDA mounts is broken in a few places
>
> First with '-o proto=rdma' was broken because nfs_get_proto()
> did not how to convert a netid of 'rdma' in to a AF_INET
> address family.
The usual solution for that is to add appropriate definitions to /etc/netconfig. Since "rdma" is going to become (or has become) a valid "IETF standard" netid, shouldn't this be handled just like the other netids? At some point "rdma" is going to start showing up in rpcbind, so this is going to have to work like a normal netid.
> Secondly, '-o rdma' was broken because po_get() was being using
> to detect the existence of 'rdma' in the options. With '-o rdma'
> there is no value associated with that option so po_get()
> was always return NULL.
This should be handled just the same way the "udp" and "tcp" mount options are handled, in nfs_nfs_protocol(). Just add an entry for "rdma" in nfs_transport_opttbl.
The problem is we don't have an IPPROTO_ number for RDMA, so you'll have to make one up.
What does the kernel do in these cases?
> This patch address both those problems.
Let's not riddle the mount code with these ad hoc string comparisons when we've already got plenty adequate generic support for adding new transport labels.
> Signed-off-by: Steve Dickson <steved@redhat.com>
> ---
> support/nfs/getport.c | 8 ++++++++
> utils/mount/stropts.c | 31 ++++++++++++++++++++++---------
> 2 files changed, 30 insertions(+), 9 deletions(-)
>
> diff --git a/support/nfs/getport.c b/support/nfs/getport.c
> index c930539..b4c2f8f 100644
> --- a/support/nfs/getport.c
> +++ b/support/nfs/getport.c
> @@ -216,6 +216,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
> struct netconfig *nconf;
> struct protoent *proto;
>
> + if (strcasecmp(netid, "rdma") == 0) {
> + *family = AF_INET;
> + return 1;
> + }
> nconf = getnetconfigent(netid);
> if (nconf == NULL)
> return 0;
> @@ -242,6 +246,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
> {
> struct protoent *proto;
>
> + if (strcasecmp(netid, "rdma") == 0) {
> + *family = AF_INET;
> + return 1;
> + }
> proto = getprotobyname(netid);
> if (proto == NULL)
> return 0;
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index 0241400..d688ee8 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -98,6 +98,22 @@ struct nfsmount_info {
> child; /* forked bg child? */
> };
>
> +/*
> + * Check the options to see if the transport is RDMA
> + */
> +static int rdma_enabled(struct mount_options *options)
> +{
> + char *option;
> +
> + if (po_contains(options, "rdma"))
> + return 1;
> + option = po_get(options, "proto");
> + if (option && strcmp(option, "rdma") == 0)
> + return 1;
> +
> + return 0;
> +}
> +
> #ifdef MOUNT_CONFIG
> static void nfs_default_version(struct nfsmount_info *mi);
>
> @@ -302,11 +318,8 @@ static int nfs_set_version(struct nfsmount_info *mi)
>
> if (strncmp(mi->type, "nfs4", 4) == 0)
> mi->version = 4;
> - else {
> - char *option = po_get(mi->options, "proto");
> - if (option && strcmp(option, "rdma") == 0)
> - mi->version = 3;
> - }
> + else if (rdma_enabled(mi->options))
> + mi->version = 3;
>
> /*
> * If we still don't know, check for version-specific
> @@ -346,7 +359,9 @@ static int nfs_validate_options(struct nfsmount_info *mi)
> if (!nfs_parse_devname(mi->spec, &mi->hostname, NULL))
> return 0;
>
> - if (!nfs_nfs_proto_family(mi->options, &family))
> + if (rdma_enabled(mi->options))
> + family = AF_INET;
> + else if (!nfs_nfs_proto_family(mi->options, &family))
> return 0;
>
> hint.ai_family = (int)family;
> @@ -491,13 +506,11 @@ nfs_rewrite_pmap_mount_options(struct mount_options *options)
> struct sockaddr *mnt_saddr = &mnt_address.sa;
> socklen_t mnt_salen = sizeof(mnt_address);
> struct pmap mnt_pmap;
> - char *option;
>
> /*
> * Skip option negotiation for proto=rdma mounts.
> */
> - option = po_get(options, "proto");
> - if (option && strcmp(option, "rdma") == 0)
> + if (rdma_enabled(options))
> goto out;
>
> /*
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 20:00 [PATCH 1/1] mount: RDMA processing in the mount command is broken Steve Dickson
2010-09-02 20:15 ` Chuck Lever
@ 2010-09-02 20:35 ` Chuck Lever
2010-09-02 22:04 ` Steve Dickson
1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2010-09-02 20:35 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
> The mounting code that process RMDA mounts is broken in a few places
>
> First with '-o proto=rdma' was broken because nfs_get_proto()
> did not how to convert a netid of 'rdma' in to a AF_INET
> address family.
>
> Secondly, '-o rdma' was broken because po_get() was being using
> to detect the existence of 'rdma' in the options. With '-o rdma'
> there is no value associated with that option so po_get()
> was always return NULL.
Looking at nfs(5), "rdma" as a stand-alone option isn't documented. Only "proto=rdma" appears to be mentioned. Thus I would argue that this is already behaving correctly.
Are you proposing to add support for "-o rdma" ? If so what would that mean?
> This patch address both those problems.
>
> Signed-off-by: Steve Dickson <steved@redhat.com>
> ---
> support/nfs/getport.c | 8 ++++++++
> utils/mount/stropts.c | 31 ++++++++++++++++++++++---------
> 2 files changed, 30 insertions(+), 9 deletions(-)
>
> diff --git a/support/nfs/getport.c b/support/nfs/getport.c
> index c930539..b4c2f8f 100644
> --- a/support/nfs/getport.c
> +++ b/support/nfs/getport.c
> @@ -216,6 +216,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
> struct netconfig *nconf;
> struct protoent *proto;
>
> + if (strcasecmp(netid, "rdma") == 0) {
> + *family = AF_INET;
> + return 1;
> + }
> nconf = getnetconfigent(netid);
> if (nconf == NULL)
> return 0;
> @@ -242,6 +246,10 @@ nfs_get_proto(const char *netid, sa_family_t *family, unsigned long *protocol)
> {
> struct protoent *proto;
>
> + if (strcasecmp(netid, "rdma") == 0) {
> + *family = AF_INET;
> + return 1;
> + }
> proto = getprotobyname(netid);
> if (proto == NULL)
> return 0;
> diff --git a/utils/mount/stropts.c b/utils/mount/stropts.c
> index 0241400..d688ee8 100644
> --- a/utils/mount/stropts.c
> +++ b/utils/mount/stropts.c
> @@ -98,6 +98,22 @@ struct nfsmount_info {
> child; /* forked bg child? */
> };
>
> +/*
> + * Check the options to see if the transport is RDMA
> + */
> +static int rdma_enabled(struct mount_options *options)
> +{
> + char *option;
> +
> + if (po_contains(options, "rdma"))
> + return 1;
> + option = po_get(options, "proto");
> + if (option && strcmp(option, "rdma") == 0)
> + return 1;
> +
> + return 0;
> +}
> +
> #ifdef MOUNT_CONFIG
> static void nfs_default_version(struct nfsmount_info *mi);
>
> @@ -302,11 +318,8 @@ static int nfs_set_version(struct nfsmount_info *mi)
>
> if (strncmp(mi->type, "nfs4", 4) == 0)
> mi->version = 4;
> - else {
> - char *option = po_get(mi->options, "proto");
> - if (option && strcmp(option, "rdma") == 0)
> - mi->version = 3;
> - }
> + else if (rdma_enabled(mi->options))
> + mi->version = 3;
>
> /*
> * If we still don't know, check for version-specific
> @@ -346,7 +359,9 @@ static int nfs_validate_options(struct nfsmount_info *mi)
> if (!nfs_parse_devname(mi->spec, &mi->hostname, NULL))
> return 0;
>
> - if (!nfs_nfs_proto_family(mi->options, &family))
> + if (rdma_enabled(mi->options))
> + family = AF_INET;
> + else if (!nfs_nfs_proto_family(mi->options, &family))
> return 0;
>
> hint.ai_family = (int)family;
> @@ -491,13 +506,11 @@ nfs_rewrite_pmap_mount_options(struct mount_options *options)
> struct sockaddr *mnt_saddr = &mnt_address.sa;
> socklen_t mnt_salen = sizeof(mnt_address);
> struct pmap mnt_pmap;
> - char *option;
>
> /*
> * Skip option negotiation for proto=rdma mounts.
> */
> - option = po_get(options, "proto");
> - if (option && strcmp(option, "rdma") == 0)
> + if (rdma_enabled(options))
> goto out;
>
> /*
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 20:35 ` Chuck Lever
@ 2010-09-02 22:04 ` Steve Dickson
2010-09-03 15:01 ` Chuck Lever
0 siblings, 1 reply; 10+ messages in thread
From: Steve Dickson @ 2010-09-02 22:04 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing list
On 09/02/2010 04:35 PM, Chuck Lever wrote:
>
> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>
>> The mounting code that process RMDA mounts is broken in a few places
>>
>> First with '-o proto=rdma' was broken because nfs_get_proto()
>> did not how to convert a netid of 'rdma' in to a AF_INET
>> address family.
>>
>> Secondly, '-o rdma' was broken because po_get() was being using
>> to detect the existence of 'rdma' in the options. With '-o rdma'
>> there is no value associated with that option so po_get()
>> was always return NULL.
>
> Looking at nfs(5), "rdma" as a stand-alone option isn't documented. Only "proto=rdma" appears to be mentioned. Thus I would argue that this is already behaving correctly.
>
> Are you proposing to add support for "-o rdma" ? If so what would that mean?
See Documentation/filesystems/nfs/nfs-rdma.txt
I see no problem in handing this same why we handle 'udp' and 'tcp'
steved.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 20:15 ` Chuck Lever
@ 2010-09-02 22:45 ` Steve Dickson
2010-09-03 16:01 ` Chuck Lever
0 siblings, 1 reply; 10+ messages in thread
From: Steve Dickson @ 2010-09-02 22:45 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing list
Hey Chuck,
On 09/02/2010 04:15 PM, Chuck Lever wrote:
>
> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>
>> The mounting code that process RMDA mounts is broken in a few places
>>
>> First with '-o proto=rdma' was broken because nfs_get_proto()
>> did not how to convert a netid of 'rdma' in to a AF_INET
>> address family.
>
> The usual solution for that is to add appropriate definitions to /etc/netconfig.
> Since "rdma" is going to become (or has become) a valid "IETF standard" netid,
> shouldn't this be handled just like the other netids?
No.. because it is not a netid... when 'rdma' become a netid we'll treat
like one. Until the then let treat like we always have been...
> At some point "rdma" is going to start showing up in rpcbind,
> so this is going to have to work like a normal netid.
When this happen I will be more that willing to accept
the patches that will add support for the new netid.
>
>> Secondly, '-o rdma' was broken because po_get() was being using
>> to detect the existence of 'rdma' in the options. With '-o rdma'
>> there is no value associated with that option so po_get()
>> was always return NULL.
>
> This should be handled just the same way the "udp" and "tcp" mount
> options are handled, in nfs_nfs_protocol(). Just add an entry for
> "rdma" in nfs_transport_opttbl. The problem is we don't have an
> IPPROTO_ number for RDMA, so you'll have to make one up.
Exactly... And I was not about to make up a IPPROTO_ number
I just think that's crazy... Plus it turns out we didn't need to.
All that's needed is for the address family to be set to
AF_INET, from what Tom said. So it makes total sense put the
check in nfs_get_proto()
> What does the kernel do in these cases?
I took a quick look it appears they use IPPROTO_TCP (see xprt_setup_rdma())
but it appears they did not make up a IPPROTO_ number.
>
>> This patch address both those problems.
>
> Let's not riddle the mount code with these ad hoc string comparisons when
> we've already got plenty adequate generic support for adding new transport labels.
Riddle?? I'm condensing two existing (broken) if statements and adding
a third... I'm a bit taken back by your verbiage... I see changes to be
very unobtrusive and clean... with the befit of them working... ;-)
steved.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 22:04 ` Steve Dickson
@ 2010-09-03 15:01 ` Chuck Lever
2010-09-03 19:53 ` Steve Dickson
0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2010-09-03 15:01 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sep 2, 2010, at 6:04 PM, Steve Dickson wrote:
>
>
> On 09/02/2010 04:35 PM, Chuck Lever wrote:
>>
>> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>>
>>> The mounting code that process RMDA mounts is broken in a few places
>>>
>>> First with '-o proto=rdma' was broken because nfs_get_proto()
>>> did not how to convert a netid of 'rdma' in to a AF_INET
>>> address family.
>>>
>>> Secondly, '-o rdma' was broken because po_get() was being using
>>> to detect the existence of 'rdma' in the options. With '-o rdma'
>>> there is no value associated with that option so po_get()
>>> was always return NULL.
>>
>> Looking at nfs(5), "rdma" as a stand-alone option isn't documented. Only "proto=rdma" appears to be mentioned. Thus I would argue that this is already behaving correctly.
>>
>> Are you proposing to add support for "-o rdma" ? If so what would that mean?
> See Documentation/filesystems/nfs/nfs-rdma.txt
>
> I see no problem in handing this same why we handle 'udp' and 'tcp'
I used nfs(5) as the specification for the text-based option parsing support in the mount.nfs command, since that's generally what administrators and packagers look to when trying to figure out how to use mount.
I don't see any evidence in utils/mount/* that a stand-alone "rdma" option was ever supported. The kernel documentation could be wrong, or could be based on the kernel's mount option parser, which does support a stand-alone "rdma" mount option. So, my opinion is that, today, mount.nfs is not misbehaving by rejecting the "rdma" mount option. This is not a bug, it's working as designed.
To add an "rdma" mount option, therefore, is really introducing a new feature to mount.nfs. So, I think such a patch needs appropriate changes to user documentation, and this mode of invoking rdma needs testing, since clearly no-one noticed it was missing until now.
As with the rpc.nfsd changes, these mount option changes are two logically distinct but related changes that belong in separate patches.
--
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-02 22:45 ` Steve Dickson
@ 2010-09-03 16:01 ` Chuck Lever
2010-09-03 19:59 ` Steve Dickson
0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2010-09-03 16:01 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sep 2, 2010, at 6:45 PM, Steve Dickson wrote:
> Hey Chuck,
>
> On 09/02/2010 04:15 PM, Chuck Lever wrote:
>>
>> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>>
>>> The mounting code that process RMDA mounts is broken in a few places
>>>
>>> First with '-o proto=rdma' was broken because nfs_get_proto()
>>> did not how to convert a netid of 'rdma' in to a AF_INET
>>> address family.
>>
>> The usual solution for that is to add appropriate definitions to /etc/netconfig.
>> Since "rdma" is going to become (or has become) a valid "IETF standard" netid,
>> shouldn't this be handled just like the other netids?
> No.. because it is not a netid... when 'rdma' become a netid we'll treat
> like one. Until the then let treat like we always have been...
See the table in standards track RFC 5665 section 5.1.1, which defines both "rdma" and "rdma6" netids. "rdma" is a netid. The way we treat "proto=rdma" today is a hack which we should be rid of as soon as possible.
The real question is whether it is practical for libtirpc and getprotobyname(3) to treat it like a netid at this time. If it isn't, then we will need to continue to special case it in the mount command. That requires some internal documentation of the issues, however. At the very least you need a proper comment in both copies of nfs_get_proto(), because for now this is a very special case and should be treated as exceptional, not the norm.
The larger version of nfs_get_proto() should do something about "rdma6" too.
>> At some point "rdma" is going to start showing up in rpcbind,
>> so this is going to have to work like a normal netid.
> When this happen I will be more that willing to accept
> the patches that will add support for the new netid.
>
>>
>>> Secondly, '-o rdma' was broken because po_get() was being using
>>> to detect the existence of 'rdma' in the options. With '-o rdma'
>>> there is no value associated with that option so po_get()
>>> was always return NULL.
>>
>> This should be handled just the same way the "udp" and "tcp" mount
>> options are handled, in nfs_nfs_protocol(). Just add an entry for
>> "rdma" in nfs_transport_opttbl. The problem is we don't have an
>> IPPROTO_ number for RDMA, so you'll have to make one up.
> Exactly... And I was not about to make up a IPPROTO_ number
> I just think that's crazy...
"That's crazy" is an opinion. Do you have a technical argument for not wanting to create a special value?
It doesn't have to be in the system headers, since only mount.nfs will use it, for now. There isn't an IANA protocol number assigned to it yet, it looks like, so we can't add it outside of mount.nfs.
Basically you are ignoring the existing mechanism which mount.nfs uses for all the other protocol names just because you don't want to add a macro that defines NFSPROTO_RDMA ?
> Plus it turns out we didn't need to.
> All that's needed is for the address family to be set to
> AF_INET, from what Tom said. So it makes total sense put the
> check in nfs_get_proto()
This part of my argument isn't about nfs_get_proto() or about "proto=rdma".
I'm saying if we want a mount option that behaves like "udp" and "tcp" we should treat it that way, and put it in nfs_trans_opttbl[], because that's exactly why I created that table and nfs_nfs_protocol() and po_rightmost() in the first place.
Whenever practical, we should avoid the "slippery slope" of adding special cases for these mount options. I don't yet see a reason why this rule should not apply here.
>> What does the kernel do in these cases?
> I took a quick look it appears they use IPPROTO_TCP (see xprt_setup_rdma())
> but it appears they did not make up a IPPROTO_ number.
The kernel requires IPPROTO_TCP because it copies this to the mount transport protocol value. It could very easily have chosen a unique value.
This very issue is why we have the XPRT_TRANSPORT_ macros in the kernel in the first place.
>>> This patch address both those problems.
>>
>> Let's not riddle the mount code with these ad hoc string comparisons when
>> we've already got plenty adequate generic support for adding new transport labels.
> Riddle?? I'm condensing two existing (broken) if statements and adding
> a third... I'm a bit taken back by your verbiage... I see changes to be
> very unobtrusive and clean... with the befit of them working... ;-)
The only time "it works" is a reason to commit a patch is when "it is the only way to get the job done" is also true. I don't think that's the case here.
I might be equally offended by your use of (broken). Those if statements are working as designed. See my other e-mail.
But to address a specific technical point: if nfs_nfs_protocol() returned a specific value for "proto=rdma" or "rdma" then I don't think you would need rdma_enabled() at all.
Looking at Neil's patches last week, it occurred to me that we might want to grab the protocol settings in nfs_validate_options() and cache that number in the mount_info (just as is done with the version number). That way we would have that value throughout the process and wouldn't need to reparse the mount options multiple times when choosing how to process the mount request.
--
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-03 15:01 ` Chuck Lever
@ 2010-09-03 19:53 ` Steve Dickson
0 siblings, 0 replies; 10+ messages in thread
From: Steve Dickson @ 2010-09-03 19:53 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing list
Hey...
On 09/03/2010 11:01 AM, Chuck Lever wrote:
>
> On Sep 2, 2010, at 6:04 PM, Steve Dickson wrote:
>
>>
>>
>> On 09/02/2010 04:35 PM, Chuck Lever wrote:
>>>
>>> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>>>
>>>> The mounting code that process RMDA mounts is broken in a few places
>>>>
>>>> First with '-o proto=rdma' was broken because nfs_get_proto()
>>>> did not how to convert a netid of 'rdma' in to a AF_INET
>>>> address family.
>>>>
>>>> Secondly, '-o rdma' was broken because po_get() was being using
>>>> to detect the existence of 'rdma' in the options. With '-o rdma'
>>>> there is no value associated with that option so po_get()
>>>> was always return NULL.
>>>
>>> Looking at nfs(5), "rdma" as a stand-alone option isn't documented. Only "proto=rdma" appears to be mentioned. Thus I would argue that this is already behaving correctly.
>>>
>>> Are you proposing to add support for "-o rdma" ? If so what would that mean?
>> See Documentation/filesystems/nfs/nfs-rdma.txt
>>
>> I see no problem in handing this same why we handle 'udp' and 'tcp'
>
> I used nfs(5) as the specification for the text-based option parsing support
> in the mount.nfs command, since that's generally what administrators
> and packagers look to when trying to figure out how to use mount.
I agree.. I'm looking into adding a couple blurbs in the man page.
>
> I don't see any evidence in utils/mount/* that a stand-alone "rdma"
> option was ever supported.
You are the one that added the code:
commit 0e0526cce8127f1c18063ff700f5e4d5c77dc108
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Tue Sep 29 10:38:52 2009 -0400
mount: Support negotiation between v4, v3, and v2
> The kernel documentation could be wrong, or
> could be based on the kernel's mount option parser, which does support
> a stand-alone "rdma" mount option. So, my opinion is that, today, mount.nfs
> is not misbehaving by rejecting the "rdma" mount option. This is not a bug,
> it's working as designed.
At this point the kernel documentation is the only documentation we have
at the moment... So that's the one we need to go with...
As far as it not being a bug... the '-o rdma' flag worked in nfs-utils-1.1.2
and now it does not... So I would say its a bug...
>
> To add an "rdma" mount option, therefore, is really introducing a new
> feature to mount.nfs. So, I think such a patch needs appropriate changes
> to user documentation, and this mode of invoking rdma needs testing,
> since clearly no-one noticed it was missing until now.
Agreed... blurbs will be added to the man page...
>
> As with the rpc.nfsd changes, these mount option changes are two
> logically distinct but related changes that belong in separate patches.
>
I don't understand what your are saying.
steved.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-03 16:01 ` Chuck Lever
@ 2010-09-03 19:59 ` Steve Dickson
2010-09-07 15:51 ` Chuck Lever
0 siblings, 1 reply; 10+ messages in thread
From: Steve Dickson @ 2010-09-03 19:59 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing list
Hey,
On 09/03/2010 12:01 PM, Chuck Lever wrote:
>
> On Sep 2, 2010, at 6:45 PM, Steve Dickson wrote:
>
>> Hey Chuck,
>>
>> On 09/02/2010 04:15 PM, Chuck Lever wrote:
>>>
>>> On Sep 2, 2010, at 4:00 PM, Steve Dickson wrote:
>>>
>>>> The mounting code that process RMDA mounts is broken in a few places
>>>>
>>>> First with '-o proto=rdma' was broken because nfs_get_proto()
>>>> did not how to convert a netid of 'rdma' in to a AF_INET
>>>> address family.
>>>
>>> The usual solution for that is to add appropriate definitions to /etc/netconfig.
>>> Since "rdma" is going to become (or has become) a valid "IETF standard" netid,
>>> shouldn't this be handled just like the other netids?
>> No.. because it is not a netid... when 'rdma' become a netid we'll treat
>> like one. Until the then let treat like we always have been...
>
> See the table in standards track RFC 5665 section 5.1.1, which defines
> both "rdma" and "rdma6" netids. "rdma" is a netid.
I stand corrected... Thank for pointing this RFC out. I looks
real interesting! ;-)
> The way we treat "proto=rdma" today is a hack which we should be rid of
> as soon as possible.
>
> The real question is whether it is practical for libtirpc and getprotobyname(3) to
> treat it like a netid at this time. If it isn't, then we will need to
> continue to special case it in the mount command. That requires some internal
> documentation of the issues, however. At the very least you need a proper
> comment in both copies of nfs_get_proto(), because for now this is a very
> special case and should be treated as exceptional, not the norm.
Fine... A comment will be added.
>
> The larger version of nfs_get_proto() should do something about "rdma6" too.
>
>>> At some point "rdma" is going to start showing up in rpcbind,
>>> so this is going to have to work like a normal netid.
>> When this happen I will be more that willing to accept
>> the patches that will add support for the new netid.
>>
>>>
>>>> Secondly, '-o rdma' was broken because po_get() was being using
>>>> to detect the existence of 'rdma' in the options. With '-o rdma'
>>>> there is no value associated with that option so po_get()
>>>> was always return NULL.
>>>
>>> This should be handled just the same way the "udp" and "tcp" mount
>>> options are handled, in nfs_nfs_protocol(). Just add an entry for
>>> "rdma" in nfs_transport_opttbl. The problem is we don't have an
>>> IPPROTO_ number for RDMA, so you'll have to make one up.
>> Exactly... And I was not about to make up a IPPROTO_ number
>> I just think that's crazy...
>
> "That's crazy" is an opinion. Do you have a technical argument for not wanting
> to create a special value?
Because its simply not needed... I explain later in this email...
>
> It doesn't have to be in the system headers, since only
> mount.nfs will use it, for now. There isn't an IANA protocol number assigned
> to it yet, it looks like, so we can't add it outside of mount.nfs.
When a number is assigned, we will add it and use it..
>
> Basically you are ignoring the existing mechanism which mount.nfs uses for all the
> other protocol names just because you don't want to add a
> macro that defines NFSPROTO_RDMA ?
No. I'm improving code that was already there and again I just
don't see there is a need for a NFSPROTO_RDMA define... It will
not make the code any clear or cleaner that what I'm
currently proposing...
>
>> Plus it turns out we didn't need to.
>> All that's needed is for the address family to be set to
>> AF_INET, from what Tom said. So it makes total sense put the
>> check in nfs_get_proto()
>
> This part of my argument isn't about nfs_get_proto() or about "proto=rdma".
> I'm saying if we want a mount option that behaves like "udp" and "tcp" we
> should treat it that way, and put it in nfs_trans_opttbl[], because
> that's exactly why I created that table and nfs_nfs_protocol()
> and po_rightmost() in the first place.
>From the user standpoint the rdma option will behave like udp and tcp.
>
> Whenever practical, we should avoid the "slippery slope" of adding
> special cases for these mount options.
While I agree special cases can lead to slippery slopes, I don't
agree with of adding a tons and tons of code just for the sake
appearing to adhere to a yet not formed standard... When an
IANA protocol number is assigned we will use it and properly
integrate it into the code. But until then I think added one
9 line routine and 3 if statements is not unreasonable.
> I don't yet see a reason why this rule should not apply here.
My reasoning is this: We are enabling this new technology. If/When
this new technology matures, we (this code) will mature with it.
So adding these few lines (special cases to use your terms) is
need to enable this technology now, which is the right thing to do, IMHO...
>
>>> What does the kernel do in these cases?
>> I took a quick look it appears they use IPPROTO_TCP (see xprt_setup_rdma())
>> but it appears they did not make up a IPPROTO_ number.
>
> The kernel requires IPPROTO_TCP because it copies this to the mount
> transport protocol value. It could very easily have chosen a unique value.
> This very issue is why we have the XPRT_TRANSPORT_ macros in the kernel
> in the first place.
My point was they didn't create unique value...
>
>>>> This patch address both those problems.
>>>
>>> Let's not riddle the mount code with these ad hoc string comparisons when
>>> we've already got plenty adequate generic support for adding new transport labels.
>> Riddle?? I'm condensing two existing (broken) if statements and adding
>> a third... I'm a bit taken back by your verbiage... I see changes to be
>> very unobtrusive and clean... with the befit of them working... ;-)
>
> The only time "it works" is a reason to commit a patch is when "it is
> the only way to get the job done" is also true. I don't think that's
> the case here.
>
> I might be equally offended by your use of (broken). Those if statements are
> working as designed. See my other e-mail.
By no means did I mean to offend you... I was not able to do a rdma mount
with either the "proto=rdma" or "-o rdma" option. When I changed those
if statements, I was able to... That's all I meant by being broken.
>
> But to address a specific technical point: if nfs_nfs_protocol()
> returned a specific value for "proto=rdma" or "rdma" then I don't think
> you would need rdma_enabled() at all.
I took a look doing it this way... sincerely hoping it would be that
easy.. but its not...
nfs_nfs_protocol() returns network established protocols for things like
the portmapper, the configuration file and such... Now we have clearly
established there is no IPPROTO_RDMA protocol, so I would have to teach
all the callers of nfs_nfs_protocol() about IPPROTO_RDMA.
Plus, nfs_get_proto() would still have to be taught about rdma since
the first thing it does is call getnetconfigent() with the given netid and
we've clearly established we currently don't have the support for
a rdma netid...
Now I'm not against doing this work, but let the IANA people to
give use something to work with. When I see the rdma protocol in
/etc/protocols and a IPPROTO_RDMA established, I will be more
than willing to do the work... but at this point, to simply
enable this new technology, its just not worth it...
All that's need, at this point, is for the mount command to
use the correct address family, set the correct protocol version
and not do portmapper negation.. that's it!
>
> Looking at Neil's patches last week, it occurred to me that
> we might want to grab the protocol settings in nfs_validate_options()
> and cache that number in the mount_info (just as is done with the
> version number). That way we would have that value throughout
> the process and wouldn't need to reparse the mount options
> multiple times when choosing how to process the mount request.
>
I have not gotten to his patches yet... I've trying to get NFSoverRDMA
working... ;-)
steved.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] mount: RDMA processing in the mount command is broken
2010-09-03 19:59 ` Steve Dickson
@ 2010-09-07 15:51 ` Chuck Lever
0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2010-09-07 15:51 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sep 3, 2010, at 3:59 PM, Steve Dickson wrote:
> Hey,
>
> On 09/03/2010 12:01 PM, Chuck Lever wrote:
>> But to address a specific technical point: if nfs_nfs_protocol()
>> returned a specific value for "proto=rdma" or "rdma" then I don't think
>> you would need rdma_enabled() at all.
> I took a look doing it this way... sincerely hoping it would be that
> easy.. but its not...
I think you are making this way harder than it needs to be because you want to be "right." I can send you three patches that prove that my suggestion would make this code simpler and obviate the need for rdma_enabled().
> nfs_nfs_protocol() returns network established protocols for things like
> the portmapper, the configuration file and such... Now we have clearly
> established there is no IPPROTO_RDMA protocol, so I would have to teach
> all the callers of nfs_nfs_protocol() about IPPROTO_RDMA.
The nfs_nfs_protocol() function is designed to parse protocol-related mount options. There's no reason to add more functions that do the same thing.
IPPROTO_RDMA is not what is needed, since RDMA is not an IP protocol. A fixed arbitrary local value is all that is required.
> Plus, nfs_get_proto() would still have to be taught about rdma since
> the first thing it does is call getnetconfigent() with the given netid and
> we've clearly established we currently don't have the support for
> a rdma netid...
Yes. Simply done.
> Now I'm not against doing this work, but let the IANA people to
> give use something to work with. When I see the rdma protocol in
> /etc/protocols and a IPPROTO_RDMA established, I will be more
> than willing to do the work... but at this point, to simply
> enable this new technology, its just not worth it...
That claim only makes sense if you believe there would be a lot of work. I'll post my version in a bit and you will see how very simple it is.
> All that's need, at this point, is for the mount command to
> use the correct address family, set the correct protocol version
> and not do portmapper negation.. that's it!
By not setting the *protocol output parameter in nfs_get_proto() you invite future bugs. You basically are breaking the API contract (that both output variables will be filled in) and require that all callers be aware that sometimes *protocol won't be set. That's a layering violation. Isn't this code complicated enough without such a special case?
--
chuck[dot]lever[at]oracle[dot]com
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2010-09-07 15:52 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-02 20:00 [PATCH 1/1] mount: RDMA processing in the mount command is broken Steve Dickson
2010-09-02 20:15 ` Chuck Lever
2010-09-02 22:45 ` Steve Dickson
2010-09-03 16:01 ` Chuck Lever
2010-09-03 19:59 ` Steve Dickson
2010-09-07 15:51 ` Chuck Lever
2010-09-02 20:35 ` Chuck Lever
2010-09-02 22:04 ` Steve Dickson
2010-09-03 15:01 ` Chuck Lever
2010-09-03 19:53 ` Steve Dickson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox