* [PATCH 0/2] Improved node descriptions
@ 2011-02-17 21:30 Michael Heinz
2011-02-17 21:31 ` [PATCH 1/2] " Michael Heinz
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Michael Heinz @ 2011-02-17 21:30 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
michael.heinz-h88ZbnxC6KDQT0dZR+AlfA
The common practice in IB fabrics is to set the description of an HCA to be
the hostname of the machine plus a description (e.g., "myhost hca-1",
"myhost hca-2").
This has a limitation, however: if the machine's
hostname is set via DHCP, the HCA description may be set before the hostname
is, leading to an incorrect description. This can also occur if the machine's
hostname changes for some other reason after boot.
This can cause difficulties and confusion when trying to maintain a large
fabric - if all your nodes are described as "localhost HCA-1" it can be very
difficult to figure out which node is suffering from symbol errors.
This patch addresses the problem by providing a function to build the node
description. If the provided source string for the description contains an
'@' character, the function will substitute the current utsname nodename
(the hostname, cut at the first '.').
This ensures that even after a fabric has been completely initialized, if
a node's hostname changes, that change will be reflected in the next sweep
of the SM, but also maintains compatibility with existing code since the
behavior is unchanged if the description string does not contain an '@'
character.
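A minimal userspace sketch of the substitution described above, for illustration only. It assumes the hostname is passed in as a parameter (the patch reads init_utsname()->nodename), that the template is NUL-terminated, and that the unused tail of the 64-byte field is zero-filled; build_node_desc and NODE_DESC_LEN are names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NODE_DESC_LEN 64 /* size of the NodeDescription payload */

/*
 * Userspace sketch of the '@' expansion. The hostname is an explicit
 * parameter here (an assumption, to keep the sketch testable); the
 * patch itself reads init_utsname()->nodename.
 */
static void build_node_desc(char dest[NODE_DESC_LEN], const char *src,
                            const char *hostname)
{
    size_t out = 0;

    assert(dest && src && hostname);
    memset(dest, 0, NODE_DESC_LEN); /* unused tail stays zero-filled */

    for (; *src && out < NODE_DESC_LEN; src++) {
        if (*src != '@') {
            dest[out++] = *src;
            continue;
        }
        /* Copy the short hostname: everything before the first '.'. */
        for (const char *h = hostname;
             *h && *h != '.' && out < NODE_DESC_LEN; h++)
            dest[out++] = *h;
    }
}
```

With a template of "@ HCA-1" and a hostname of "node17.example.com", the buffer would hold "node17 HCA-1"; a template without '@' is copied unchanged.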
---
Michael Heinz (2):
Improved node descriptions
Making it easier to diagnose fabric problems by improving the node descriptions.
drivers/infiniband/core/mad.c | 18 ++++++++++++++++++
drivers/infiniband/hw/ipath/ipath_mad.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 3 ++-
drivers/infiniband/hw/mthca/mthca_mad.c | 3 ++-
drivers/infiniband/hw/qib/qib_mad.c | 2 +-
include/rdma/ib_mad.h | 8 ++++++++
6 files changed, 32 insertions(+), 4 deletions(-)
--
Signed-off-by: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* [PATCH 1/2] Improved node descriptions
2011-02-17 21:30 [PATCH 0/2] Improved node descriptions Michael Heinz
@ 2011-02-17 21:31 ` Michael Heinz
2011-02-17 21:31 ` [PATCH 2/2] " Michael Heinz
2011-02-17 23:20 ` [PATCH 0/2] " Roland Dreier
2 siblings, 0 replies; 20+ messages in thread
From: Michael Heinz @ 2011-02-17 21:31 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
michael.heinz-h88ZbnxC6KDQT0dZR+AlfA
The common practice in IB fabrics is to set the description of an HCA to be
the hostname of the machine plus a description (e.g., "myhost hca-1",
"myhost hca-2").
This has a limitation, however: if the machine's
hostname is set via DHCP, the HCA description may be set before the hostname
is, leading to an incorrect description. This can also occur if the machine's
hostname changes for some other reason after boot.
This can cause difficulties and confusion when trying to maintain a large
fabric - if all your nodes are described as "localhost HCA-1" it can be very
difficult to figure out which node is suffering from symbol errors.
This patch addresses the problem by providing a function to build the node
description. If the provided source string for the description contains an
'@' character, the function will substitute the current utsname nodename
(the hostname, cut at the first '.').
This ensures that even after a fabric has been completely initialized, if
a node's hostname changes, that change will be reflected in the next sweep
of the SM, but also maintains compatibility with existing code since the
behavior is unchanged if the description string does not contain an '@'
character.
Signed-off-by: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
---
drivers/infiniband/core/mad.c | 18 ++++++++++++++++++
include/rdma/ib_mad.h | 8 ++++++++
2 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c
index 822cfdc..8e4ac68 100644
--- a/drivers/infiniband/core/mad.c
+++ b/drivers/infiniband/core/mad.c
@@ -41,6 +41,7 @@
#include "mad_rmpp.h"
#include "smi.h"
#include "agent.h"
+#include <linux/utsname.h>
MODULE_LICENSE("Dual BSD/GPL");
MODULE_DESCRIPTION("kernel IB MAD API");
@@ -932,6 +933,23 @@ int ib_get_mad_data_offset(u8 mgmt_class)
}
EXPORT_SYMBOL(ib_get_mad_data_offset);
+void ib_build_node_desc(char *dest, char *src)
+{
+ int i;
+ for (i = 0; i < 64;) {
+ if (*src == '@') {
+ char *name = init_utsname()->nodename;
+ for (; *name && (*name != '.') && (i < 64); ++i)
+ *dest++ = *name++;
+ src++;
+ } else {
+ *dest++ = *src++;
+ i++;
+ }
+ }
+}
+EXPORT_SYMBOL(ib_build_node_desc);
+
int ib_is_mad_class_rmpp(u8 mgmt_class)
{
if ((mgmt_class == IB_MGMT_CLASS_SUBN_ADM) ||
diff --git a/include/rdma/ib_mad.h b/include/rdma/ib_mad.h
index d3b9401..5916617 100644
--- a/include/rdma/ib_mad.h
+++ b/include/rdma/ib_mad.h
@@ -637,6 +637,14 @@ int ib_is_mad_class_rmpp(u8 mgmt_class);
int ib_get_mad_data_offset(u8 mgmt_class);
/**
+ * ib_build_node_desc - copies the node description and replaces
+ * any @ markers with the present system node name.
+ * @dest: destination
+ * @src: source
+ */
+void ib_build_node_desc(char *dest, char *src);
+
+/**
* ib_get_rmpp_segment - returns the data buffer for a given RMPP segment.
* @send_buf: Previously allocated send data buffer.
* @seg_num: number of segment to return
* [PATCH 2/2] Improved node descriptions
2011-02-17 21:30 [PATCH 0/2] Improved node descriptions Michael Heinz
2011-02-17 21:31 ` [PATCH 1/2] " Michael Heinz
@ 2011-02-17 21:31 ` Michael Heinz
2011-02-17 23:20 ` [PATCH 0/2] " Roland Dreier
2 siblings, 0 replies; 20+ messages in thread
From: Michael Heinz @ 2011-02-17 21:31 UTC (permalink / raw)
To: roland-DgEjT+Ai2ygdnm+yROfE0A, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
michael.heinz-h88ZbnxC6KDQT0dZR+AlfA
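Adds support for ib_build_node_desc() to the HCA drivers (ipath, mlx4, mthca, and qib), replacing the memcpy() of node_desc in each NodeDescription handler.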
Signed-off-by: Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
---
drivers/infiniband/hw/ipath/ipath_mad.c | 2 +-
drivers/infiniband/hw/mlx4/mad.c | 3 ++-
drivers/infiniband/hw/mthca/mthca_mad.c | 3 ++-
drivers/infiniband/hw/qib/qib_mad.c | 2 +-
4 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/ipath/ipath_mad.c b/drivers/infiniband/hw/ipath/ipath_mad.c
index ceb98ee..6da4750 100644
--- a/drivers/infiniband/hw/ipath/ipath_mad.c
+++ b/drivers/infiniband/hw/ipath/ipath_mad.c
@@ -60,7 +60,7 @@ static int recv_subn_get_nodedescription(struct ib_smp *smp,
if (smp->attr_mod)
smp->status |= IB_SMP_INVALID_FIELD;
- memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+ ib_build_node_desc((char *)smp->data, ibdev->node_desc);
return reply(smp);
}
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 57ffa50..6b35d49 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -196,7 +196,8 @@ static void node_desc_override(struct ib_device *dev,
mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
spin_lock(&to_mdev(dev)->sm_lock);
- memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+ ib_build_node_desc((char *)((struct ib_smp *) mad)->data,
+ dev->node_desc);
spin_unlock(&to_mdev(dev)->sm_lock);
}
}
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 03a5953..7662097 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -153,7 +153,8 @@ static void node_desc_override(struct ib_device *dev,
mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP &&
mad->mad_hdr.attr_id == IB_SMP_ATTR_NODE_DESC) {
mutex_lock(&to_mdev(dev)->cap_mask_mutex);
- memcpy(((struct ib_smp *) mad)->data, dev->node_desc, 64);
+ ib_build_node_desc((char *)((struct ib_smp *) mad)->data,
+ dev->node_desc);
mutex_unlock(&to_mdev(dev)->cap_mask_mutex);
}
}
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index 5ad224e..59d7cbb 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -260,7 +260,7 @@ static int subn_get_nodedescription(struct ib_smp *smp,
if (smp->attr_mod)
smp->status |= IB_SMP_INVALID_FIELD;
- memcpy(smp->data, ibdev->node_desc, sizeof(smp->data));
+ ib_build_node_desc((char *)smp->data, ibdev->node_desc);
return reply(smp);
}
* Re: [PATCH 0/2] Improved node descriptions
2011-02-17 21:30 [PATCH 0/2] Improved node descriptions Michael Heinz
2011-02-17 21:31 ` [PATCH 1/2] " Michael Heinz
2011-02-17 21:31 ` [PATCH 2/2] " Michael Heinz
@ 2011-02-17 23:20 ` Roland Dreier
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2 siblings, 1 reply; 20+ messages in thread
From: Roland Dreier @ 2011-02-17 23:20 UTC (permalink / raw)
To: Michael Heinz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> This patch addresses the problem by providing a function to build the node
> description. If the provided source string for the description contains an
> '@' character, the function will substitute the current utsname.
>
> This ensures that even after a fabric has been completely initialized, if
> a node's hostname changes, that change will be reflected in the next sweep
> of the SM, but also maintains compatibility with existing code since the
> behavior is unchanged if the description string does not contain an '@'
> character.
This looks like a reasonable approach to me, although of course the SM
has no way of knowing it should update a port's node description if a
hostname changes.
Aside from some minor quibbles
- next time please use different subjects for each patch in the thread
- the prototype of ib_build_node_desc() seems to force every call
site to have a cast; maybe the function should take a pointer to
struct ib_smp instead?
- the internals of ib_build_node_desc() look a bit ugly, is there any
way to make it a little cleaner?
I do like this. Does anyone have any feelings about applying this
for 2.6.39? Is this shipping in OFED?
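One way the struct-pointer prototype and simpler internals could look, as a userspace sketch only. struct smp_like is a hypothetical stand-in for the 64-byte data field of struct ib_smp, the hostname is passed in rather than read from utsname, the template is assumed to be NUL-terminated, and only the first '@' is expanded (the posted patch expands every '@').

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the 64-byte data field of struct ib_smp. */
struct smp_like {
    unsigned char data[64];
};

/*
 * Expands the first '@' in tmpl to the short hostname and writes the
 * result into smp->data, so call sites need no cast.
 */
static void build_node_desc_smp(struct smp_like *smp, const char *tmpl,
                                const char *hostname)
{
    char buf[sizeof(smp->data) + 1];
    const char *at;

    assert(smp && tmpl && hostname);
    at = strchr(tmpl, '@');

    if (!at) {
        snprintf(buf, sizeof(buf), "%s", tmpl);
    } else {
        int prefix = (int)(at - tmpl);             /* text before '@' */
        int hostlen = (int)strcspn(hostname, "."); /* short hostname */

        /* snprintf truncates the result to the field size for us. */
        snprintf(buf, sizeof(buf), "%.*s%.*s%s",
                 prefix, tmpl, hostlen, hostname, at + 1);
    }

    memset(smp->data, 0, sizeof(smp->data));
    memcpy(smp->data, buf, strlen(buf));
}
```

Letting snprintf() do the bounds handling removes the nested loops and the manual index, at the cost of a small stack buffer.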
- R.
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-02-18 4:19 ` Hal Rosenstock
[not found] ` <AANLkTikh-8uGccT0tumHAu6cPOBm+k8joCaQ4W-grkHd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 14:09 ` Mike Heinz
` (2 subsequent siblings)
3 siblings, 1 reply; 20+ messages in thread
From: Hal Rosenstock @ 2011-02-18 4:19 UTC (permalink / raw)
To: Roland Dreier; +Cc: Michael Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Thu, Feb 17, 2011 at 6:20 PM, Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
> On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz <michael.heinz@qlogic.com> wrote:
>> This patch addresses the problem by providing a function to build the node
>> description. If the provided source string for the description contains an
>> '@' character, the function will substitute the current utsname.
>>
>> This ensures that even after a fabric has been completely initialized, if
>> a node's hostname changes, that change will be reflected in the next sweep
>> of the SM, but also maintains compatibility with existing code since the
>> behavior is unchanged if the description string does not contain an '@'
>> character.
>
> This looks like a reasonable approach to me, although of course the SM
> has no way of knowing it should update a port's node description if a
> hostname changes.
It does; There's an enhanced trap for this now.
> Aside from some minor quibbles
> - next time please use different subjects for each patch in the thread
> - the prototype of ib_build_node_desc() seems to force every call
> site to have a cast; maybe the function should take a pointer to
> struct ib_smp instead?
> - the internals of ib_build_node_desc() look a bit ugly, is there any
> way to make it a little cleaner?
> I do like this. Does anyone have any feelings about applying this
> for 2.6.39? Is this shipping in OFED?
I need a little time to review this.
-- Hal
> - R.
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTikh-8uGccT0tumHAu6cPOBm+k8joCaQ4W-grkHd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-02-18 5:19 ` Roland Dreier
2011-02-18 16:22 ` Mike Heinz
1 sibling, 0 replies; 20+ messages in thread
From: Roland Dreier @ 2011-02-18 5:19 UTC (permalink / raw)
To: Hal Rosenstock; +Cc: Michael Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Thu, Feb 17, 2011 at 8:19 PM, Hal Rosenstock
<hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> This looks like a reasonable approach to me, although of course the SM
>> has no way of knowing it should update a port's node description if a
>> hostname changes.
> It does; There's an enhanced trap for this now.
Right; sorry for being unclear but I meant that this patch does not
have a mechanism for generating this trap if the hostname changes
or if userspace changes the node description.
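To make the missing piece concrete, here is a hypothetical sketch of the node-side check that would have to exist before any trap could be sent: remember the last description that was built and report when a newly built one differs. struct desc_cache and node_desc_changed are invented for this illustration, and actually sending the trap is out of scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NODE_DESC_LEN 64

/* Hypothetical per-device state: the last description handed out. */
struct desc_cache {
    char last[NODE_DESC_LEN];
    bool valid;
};

/*
 * Returns true when `current` differs from the previously recorded
 * description, i.e. the point at which a node would want to notify
 * the SM. The first call only records the description.
 */
static bool node_desc_changed(struct desc_cache *cache,
                              const char current[NODE_DESC_LEN])
{
    bool changed;

    assert(cache && current);
    changed = cache->valid &&
              memcmp(cache->last, current, NODE_DESC_LEN) != 0;
    memcpy(cache->last, current, NODE_DESC_LEN);
    cache->valid = true;
    return changed;
}
```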
- R.
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 4:19 ` Hal Rosenstock
@ 2011-02-18 14:09 ` Mike Heinz
2011-02-19 7:23 ` Jack Morgenstein
2011-02-19 19:24 ` Jason Gunthorpe
3 siblings, 0 replies; 20+ messages in thread
From: Mike Heinz @ 2011-02-18 14:09 UTC (permalink / raw)
To: Roland Dreier; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Roland,
Thanks for the positive feedback - I'll make the changes ASAP. Sorry about the subject lines, I thought I was doing the "right thing" by making them match.
The change is not in OFED yet, but QLogic has been using it in our own release for a while, and it works extremely well on large fabrics.
This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTikh-8uGccT0tumHAu6cPOBm+k8joCaQ4W-grkHd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 5:19 ` Roland Dreier
@ 2011-02-18 16:22 ` Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B289C7-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
1 sibling, 1 reply; 20+ messages in thread
From: Mike Heinz @ 2011-02-18 16:22 UTC (permalink / raw)
To: Hal Rosenstock, Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hal,
I'm in the process of smoke-testing changes to meet Roland's "quibbles" - I'm going to change the code to pass the structure instead of a char*.
Also, I just want to point out that even in the case where no enhanced trap is used, any change will still be seen by the SM the next time it sweeps. There will still be lag involved, but I don't think it's insurmountable.
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B289C7-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2011-02-18 22:57 ` Hal Rosenstock
0 siblings, 0 replies; 20+ messages in thread
From: Hal Rosenstock @ 2011-02-18 22:57 UTC (permalink / raw)
To: Mike Heinz
Cc: Roland Dreier, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Mike,
On Fri, Feb 18, 2011 at 11:22 AM, Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> Hal,
>
> I'm in the process of smoke-testing changes to meet Roland's "quibbles" - I'm going to change the code to pass the structure instead of a char*.
>
> Also, I just want to point out that even in the case where no enhanced trap is used, any change will still be seen by the SM the next time it sweeps. There will still be lag involved, but I don't think it's insurmountable.
That's not true for all SMs. It's true for a "heavy" sweep but not for a
"light" sweep. There's no requirement to rediscover NodeDescriptions on
every periodic sweep.
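A toy model of that distinction, not taken from any real SM: it assumes only that a heavy sweep re-queries NodeDescription while a light sweep leaves the SM's cached copy alone. struct toy_node and toy_sweep are invented names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NODE_DESC_LEN 64

/* Toy model of one fabric node as seen by a subnet manager. */
struct toy_node {
    char live_desc[NODE_DESC_LEN];   /* what the node would report now */
    char cached_desc[NODE_DESC_LEN]; /* what the SM last recorded */
};

/* A "heavy" sweep re-reads NodeDescription; a "light" one does not. */
static void toy_sweep(struct toy_node *node, bool heavy)
{
    assert(node);
    if (heavy)
        memcpy(node->cached_desc, node->live_desc, NODE_DESC_LEN);
}
```

In this model a hostname change stays invisible to the SM through any number of light sweeps and shows up only after the next heavy one.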
I should have my comments (on your now updated patches) by COB Tuesday
as Monday is a holiday.
-- Hal
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 4:19 ` Hal Rosenstock
2011-02-18 14:09 ` Mike Heinz
@ 2011-02-19 7:23 ` Jack Morgenstein
[not found] ` <201102190923.12641.jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2011-02-19 19:24 ` Jason Gunthorpe
3 siblings, 1 reply; 20+ messages in thread
From: Jack Morgenstein @ 2011-02-19 7:23 UTC (permalink / raw)
To: Roland Dreier
Cc: Michael Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Hal Rosenstock
On Friday 18 February 2011 01:20, Roland Dreier wrote:
> This looks like a reasonable approach to me, although of course the SM
> has no way of knowing it should update a port's node description if a
> hostname changes.
>
What about the problem of multiple HCAs of the same type on a single host?
Won't all of them get the identical node description?
Mike, can you add something to handle this case?
See my original feedback:
http://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg04413.html
and Mike's response:
http://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg04414.html
Mike, I agree that doing something for the upstream is better than doing
nothing, but I would still like to see the multiple-HCA case handled. I think
this can be done by adding a query to the low-level driver to distinguish between
HCAs.
-Jack
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
` (2 preceding siblings ...)
2011-02-19 7:23 ` Jack Morgenstein
@ 2011-02-19 19:24 ` Jason Gunthorpe
[not found] ` <20110219192458.GB4506-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
3 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2011-02-19 19:24 UTC (permalink / raw)
To: Roland Dreier; +Cc: Michael Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA
On Thu, Feb 17, 2011 at 03:20:31PM -0800, Roland Dreier wrote:
> On Thu, Feb 17, 2011 at 1:30 PM, Michael Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> > This patch addresses the problem by providing a function to build the node
> > description. If the provided source string for the description contains an
> > '@' character, the function will substitute the current utsname.
> >
> > This ensures that even after a fabric has been completely initialized, if
> > a node's hostname changes, that change will be reflected in the next sweep
> > of the SM, but also maintains compatibility with existing code since the
> > behavior is unchanged if the description string does not contain an '@'
> > character.
>
> This looks like a reasonable approach to me, although of course the SM
> has no way of knowing it should update a port's node description if a
> hostname changes.
>
> Aside from some minor quibbles
> - next time please use different subjects for each patch in the thread
> - the prototype of ib_build_node_desc() seems to force every call
> site to have a cast; maybe the function should take a pointer to
> struct ib_smp instead?
> - the internals of ib_build_node_desc() look a bit ugly, is there any
> way to make it a little cleaner?
> I do like this. Does anyone have any feelings about applying this
> for 2.6.39? Is this shipping in OFED?
If the main concern is DHCP, what is the problem with using
/etc/dhcp/dhclient-enter-hooks.d/ or the like?
Jason
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <201102190923.12641.jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2011-02-21 18:30 ` Mike Heinz
0 siblings, 0 replies; 20+ messages in thread
From: Mike Heinz @ 2011-02-21 18:30 UTC (permalink / raw)
To: Jack Morgenstein, Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Hal Rosenstock, Todd Rimmer
Jack,
The way we handle that right now is in the openibd start script. What we do is this:
ib_set_node_desc()
{
    # Add node description to sysfs
    ibsysdir="/sys/class/infiniband"
    if [ -d ${ibsysdir} ]; then
        declare -i hca_id=1
        for hca in ${ibsysdir}/*
        do
            if [ -e ${hca}/node_desc ]; then
                logger -i "Set node_desc for $(basename $hca)"
                echo -n "@ HCA-${hca_id}" >> ${hca}/node_desc
            fi
            let hca_id++
        done
    fi
}
Once the kernel modification is accepted, I plan to submit a patch for the openibd script.
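For readers who prefer C, the same numbering logic can be sketched as below. This is an illustrative port, not part of openibd: set_node_desc_templates is a made-up name, the directory is a parameter so the sketch can be pointed at a test tree instead of /sys/class/infiniband, and scandir() with alphasort is used on the assumption that it gives an ordering comparable to the shell glob.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* Skip "." and ".." (and any other dot entries) when scanning. */
static int not_hidden(const struct dirent *e)
{
    return e->d_name[0] != '.';
}

/*
 * For each device directory under sysdir (normally
 * /sys/class/infiniband), write "@ HCA-<n>" to its node_desc file.
 * Returns the number of files written, or -1 if sysdir is unreadable.
 */
static int set_node_desc_templates(const char *sysdir)
{
    struct dirent **devs;
    int written = 0;
    int n;

    assert(sysdir);
    n = scandir(sysdir, &devs, not_hidden, alphasort);
    if (n < 0)
        return -1;

    for (int i = 0; i < n; i++) {
        char path[4096];
        FILE *f;

        snprintf(path, sizeof(path), "%s/%s/node_desc",
                 sysdir, devs[i]->d_name);
        /* "r+" opens only existing files, like the script's -e test. */
        f = fopen(path, "r+");
        if (f) {
            /* The id counts every entry, as hca_id does in the script. */
            fprintf(f, "@ HCA-%d", i + 1);
            fclose(f);
            written++;
        }
        free(devs[i]);
    }
    free(devs);
    return written;
}
```

Each HCA on a host gets a distinct template, so after kernel expansion two devices on "myhost" would report "myhost HCA-1" and "myhost HCA-2".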
-----Original Message-----
From: Jack Morgenstein [mailto:jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org]
Sent: Saturday, February 19, 2011 2:23 AM
To: Roland Dreier
Cc: Mike Heinz; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Hal Rosenstock
Subject: Re: [PATCH 0/2] Improved node descriptions
On Friday 18 February 2011 01:20, Roland Dreier wrote:
> This looks like a reasonable approach to me, although of course the SM
> has no way of knowing it should update a port's node description if a
> hostname changes.
>
What about the problem of multiple HCA's of the same type on a single host?
Won't all of them get the identical node description?
Mike, can you add something to handle this case?
(See my original feedback:
http://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg04413.html
and see Mike's response:
http://www.mail-archive.com/linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg04414.html
Mike, I agree that doing something for the upstream is better than doing
nothing, but I would still like to see the multiple-HCA case handled. I think
this can be done by adding a query to the low-level driver to distinguish between
HCAs).
-Jack
This message and any attached documents contain information from QLogic Corporation or its wholly-owned subsidiaries that may be confidential. If you are not the intended recipient, you may not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <20110219192458.GB4506-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-02-21 19:26 ` Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B28B17-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Mike Heinz @ 2011-02-21 19:26 UTC (permalink / raw)
To: Jason Gunthorpe, Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
The biggest problem with that is that patching existing boot scripts is always going to vary from distro to distro and is always going to have problems when dealing with files that were already edited for site-specific reasons.
IPoIB is a good example of this: I frequently see failures when trying to install the ipoib rpm because it fails to patch the ifup- scripts. I'd rather not add another such dependency.
For example, there is no /etc/dhcp/dhclient-enter-hooks.d on either SLES11 or Red Hat 6. Red Hat 6 uses different files for different hooks and would use a single file (which does not exist by default) called /etc/dhcp/dhclient-enter-hook, whereas SLES11 uses a single script for all DHCP hooks called /etc/dhcpcd.sh.
Trying to create a single RPM capable of hacking into a wide variety of different dhcp mechanisms could be a huge headache. Using something that relies on the openibd script and a kernel-expansion of the hostname is much less prone to problems.
-----Original Message-----
From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
Sent: Saturday, February 19, 2011 2:25 PM
To: Roland Dreier
Cc: Mike Heinz; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 0/2] Improved node descriptions
If the main concern is DHCP what is the problem with using
/etc/dhcp/dhclient-enter-hooks.d/ or alike?
Jason
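A hook of the kind Jason suggests might look roughly like the following. This is only a sketch, not a tested distro integration: it assumes an ISC dhclient-style hook file that is sourced with `reason` and `new_host_name` already set, and that the HCAs appear under /sys/class/infiniband. The function name `update_node_descs` is hypothetical, and the "HCA-N" suffix is borrowed from the openibd fragment earlier in this thread.

```shell
#!/bin/sh
# Hypothetical dhclient enter-hook sketch. Assumes dhclient-script has set
# $reason and $new_host_name before sourcing this file.

# Rewrite node_desc for every HCA directory found under the given sysfs path.
# Arguments: sysfs directory, hostname to use in the description.
update_node_descs() {
    sysdir=$1
    host=$2
    hca_id=1
    for hca in "$sysdir"/*; do
        if [ -e "$hca/node_desc" ]; then
            printf '%s HCA-%d' "$host" "$hca_id" > "$hca/node_desc"
        fi
        hca_id=$((hca_id + 1))
    done
}

# Only act when a lease has just been obtained or refreshed and the
# server actually supplied a hostname.
case ${reason:-} in
    BOUND|RENEW|REBIND|REBOOT)
        if [ -n "${new_host_name:-}" ]; then
            update_node_descs /sys/class/infiniband "$new_host_name"
        fi
        ;;
esac
```

As Mike's reply points out, where such a file has to be installed differs between distributions, so the sketch says nothing about packaging.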
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B28B17-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
@ 2011-02-22 23:13 ` Roland Dreier
[not found] ` <AANLkTi=1rmRckZz1iAXLpakf5bMuBp4koGOyO-FUDz_M-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Roland Dreier @ 2011-02-22 23:13 UTC (permalink / raw)
To: Mike Heinz
Cc: Jason Gunthorpe,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Mon, Feb 21, 2011 at 11:26 AM, Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> The biggest problem with that is that patching existing boot scripts
> is always going to vary from distro to distro and is always going to
> have problems when dealing with files that were already edited for
> site-specific reasons.
This is the wrong way to look at it. Really it would make sense in
the long term to add required support to native distros -- piling hack
on hack in OFED is clearly a long-term disaster. So I don't think
saying "it's hard to make a single RPM" is a very good argument.
With that said I think we should look at where it makes sense to
implement this sort of thing. Just because we can do something with
scripts in userspace doesn't necessarily mean we have to do it there
-- if it's much simpler or more robust in the kernel, then we can
implement it there.
In this case I do think it makes sense to add this support to the
kernel, since the kernel handling is so simple. In fact based on
Jack's question it might make sense to go further and have more
flexible expansion... what if we do something like adding primitive
format expansion, ie
%h --> expands to current hostname
%d --> expands to ib_device->name
Then one could have a trivial script that just does 'for every IB
device, prepend "%h/%d: " to the node description.' And I guess we
could even talk about making that the default kernel policy.
Not sure what Jason or others think about this opinion...
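For illustration, the expansion proposed above can be sketched in user space as a small shell function. This is only a sketch under stated assumptions: the function name `expand_node_desc` is hypothetical, the 64-byte truncation reflects the size of the IB NodeDescription field, and an in-kernel version as discussed in this thread would be written in C rather than as a script.

```shell
#!/bin/sh
# Hypothetical helper: expand %h (hostname) and %d (device name) in a
# node description template, then truncate the result to 64 bytes.
# Arguments: template, hostname, device name.
expand_node_desc() {
    template=$1
    host=$2
    dev=$3
    out=
    while [ -n "$template" ]; do
        case $template in
            %h*)
                out=$out$host
                template=${template#"%h"}
                ;;
            %d*)
                out=$out$dev
                template=${template#"%d"}
                ;;
            *)
                # Copy one character verbatim and advance.
                rest=${template#?}
                out=$out${template%"$rest"}
                template=$rest
                ;;
        esac
    done
    printf '%.64s' "$out"
}
```

A boot script could then call, for example, `expand_node_desc '%h/%d: ' "$(hostname)" mlx4_0`, where the device name mlx4_0 is just an example, and write the result to that device's node_desc file.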
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTi=1rmRckZz1iAXLpakf5bMuBp4koGOyO-FUDz_M-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-02-22 23:43 ` Jason Gunthorpe
[not found] ` <20110222234304.GA21731-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-02-23 21:02 ` Mike Heinz
1 sibling, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2011-02-22 23:43 UTC (permalink / raw)
To: Roland Dreier
Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Feb 22, 2011 at 03:13:47PM -0800, Roland Dreier wrote:
> On Mon, Feb 21, 2011 at 11:26 AM, Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> > The biggest problem with that is that patching existing boot scripts
> > is always going to vary from distro to distro and is always going to
> > have problems when dealing with files that were already edited for
> > site-specific reasons.
>
> This is the wrong way to look at it. Really it would make sense in
> the long term to add required support to native distros -- piling hack
> on hack in OFED is clearly a long-term disaster. So I don't think
> saying "it's hard to make a single RPM" is a very good argument.
Indeed, considering that OFED's entire purpose is to manage packaging
IB stuff for distributions, this doesn't seem like a good argument.
Nor do I think OFED should even try to have one RPM for all distros,
good packaging isn't like that.
> In this case I do think it makes sense to add this support to the
> kernel, since the kernel handling is so simple. In fact based on
> Jack's question it might make sense to go further and have more
> flexible expansion... what if we do something like adding primitive
> format expansion, ie
Doing it in userspace makes generating the node description changed
trap simpler?
What about other drivers? I didn't see ehca in Mike's patch..
I just wonder if this is a big pain to do right, what about charsets, IDN,
and ugly details like that?
Jason
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <20110222234304.GA21731-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-02-23 0:03 ` Roland Dreier
[not found] ` <AANLkTink3ec2O8-ExPuJpJd5j_Y0UtiL=QtM0rrmZR88-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Roland Dreier @ 2011-02-23 0:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Feb 22, 2011 at 3:43 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> Doing it in userspace makes generating the node description changed
> trap simpler?
Hmm... how many places can do sethostname()? Seems easier to
catch in the kernel than hook every place in userspace. (Although
there's no kernel hook right now) I do agree it would be good to
have some idea of how we could generate the "node desc changed"
trap at appropriate times.
> What about other drivers? I didn't see ehca in Mike's patch..
ehca doesn't implement modify_device --> can't set node desc anyway.
> I just wonder if this is a big pain to do right, what about charsets, IDN,
> and ugly details like that?
Does anyone expect to care about non-ASCII node descs?
What can we sensibly do except take what we're given?
- R.
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTink3ec2O8-ExPuJpJd5j_Y0UtiL=QtM0rrmZR88-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-02-23 5:41 ` Jason Gunthorpe
[not found] ` <20110223054151.GA2363-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
0 siblings, 1 reply; 20+ messages in thread
From: Jason Gunthorpe @ 2011-02-23 5:41 UTC (permalink / raw)
To: Roland Dreier
Cc: Mike Heinz, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, Feb 22, 2011 at 04:03:29PM -0800, Roland Dreier wrote:
> On Tue, Feb 22, 2011 at 3:43 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > Doing it in userspace makes generating the node description changed
> > trap simpler?
>
> Hmm... how many places can do sethostname()? Seems easier to
> catch in the kernel than hook every place in userspace. (Although
> there's no kernel hook right now) I do agree it would be good to
> have some idea of how we could generate the "node desc changed"
> trap at appropriate times.
Wasn't thinking about 100% perfection, just if DHCP is the concern it
shouldn't be hard to hook that one place.
> > I just wonder if this is a big pain to do right, what about charsets, IDN,
> > and ugly details like that?
>
> Does anyone expect to care about non-ASCII node descs?
> What can we sensibly do except take what we're given?
node desc is UTF-8, hostname is IDNA, a conversion is required, see RFC
3490.
Does anyone care? Who knows, but very pedantically it is wrong to just copy
the host name byte by byte. I only mention it to point out that it is
trivial to do what Mike did, somewhat harder to do % escaping like you
suggest and solve the multiple HCA problem, harder still to trap
sethostname() and generate a trap, and extra special hard to correctly
handle character sets on top of all that. :)
So, is it still trivial to do it in the kernel?
Jason
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <AANLkTi=1rmRckZz1iAXLpakf5bMuBp4koGOyO-FUDz_M-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-22 23:43 ` Jason Gunthorpe
@ 2011-02-23 21:02 ` Mike Heinz
1 sibling, 0 replies; 20+ messages in thread
From: Mike Heinz @ 2011-02-23 21:02 UTC (permalink / raw)
To: Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Ira Weiny
I had actually submitted a patch back in June of last year that set the default policy to be the hostname and device. Looking back through my records, I can't find a reason it was not accepted: it was never explicitly rejected, it just got limited feedback and was never added to linux-rdma. (I'm not sure how to learn the final disposition of proposed patches.) Maybe it was a coding style or formatting issue?
Right now the HCAs default to a description of the HCA model. I can certainly create a patch that sets the default descriptions to "%h: %d (device description)" if that's what you want. We certainly need something that's widely applicable, not distribution-specific, and that lets large clusters manage their nodes effectively.
-----Original Message-----
From: roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org [mailto:roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org] On Behalf Of Roland Dreier
Sent: Tuesday, February 22, 2011 6:14 PM
To: Mike Heinz
Cc: Jason Gunthorpe; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 0/2] Improved node descriptions
On Mon, Feb 21, 2011 at 11:26 AM, Mike Heinz <michael.heinz-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org> wrote:
> The biggest problem with that is that patching existing boot scripts
> is always going to vary from distro to distro and is always going to
> have problems when dealing with files that were already edited for
> site-specific reasons.
This is the wrong way to look at it. Really it would make sense in
the long term to add required support to native distros -- piling hack
on hack in OFED is clearly a long-term disaster. So I don't think
saying "it's hard to make a single RPM" is a very good argument.
With that said I think we should look at where it makes sense to
implement this sort of thing. Just because we can do something with
scripts in userspace doesn't necessarily mean we have to do it there
-- if it's much simpler or more robust in the kernel, then we can
implement it there.
In this case I do think it makes sense to add this support to the
kernel, since the kernel handling is so simple. In fact based on
Jack's question it might make sense to go further and have more
flexible expansion... what if we do something like adding primitive
format expansion, ie
%h --> expands to current hostname
%d --> expands to ib_device->name
Then one could have a trivial script that just does 'for every IB
device, prepend "%h/%d: " to the node description.' And I guess we
could even talk about making that the default kernel policy.
Not sure what Jason or others think about this opinion...
* Re: [PATCH 0/2] Improved node descriptions
[not found] ` <20110223054151.GA2363-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2011-02-23 21:56 ` Ira Weiny
2011-02-24 2:31 ` Mike Heinz
1 sibling, 0 replies; 20+ messages in thread
From: Ira Weiny @ 2011-02-23 21:56 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Roland Dreier, Mike Heinz,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Tue, 22 Feb 2011 21:41:52 -0800
Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Feb 22, 2011 at 04:03:29PM -0800, Roland Dreier wrote:
> > On Tue, Feb 22, 2011 at 3:43 PM, Jason Gunthorpe
> > <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > > Doing it in userspace makes generating the node description changed
> > > trap simpler?
> >
> > Hmm... how many places can do sethostname()? Seems easier to
> > catch in the kernel than hook every place in userspace. (Although
> > there's no kernel hook right now) I do agree it would be good to
> > have some idea of how we could generate the "node desc changed"
> > trap at appropriate times.
>
> Wasn't thinking about 100% perfection, just if DHCP is the concern it
> shouldn't be hard to hook that one place.
I like the idea of having a way to tie node description to the hostname. It just makes sense. The question in my mind is will upstream accept a hook in sethostname? Generally it seems like a good idea, perhaps other subsystems would want to know about it. But it does seem like it might not get accepted.
>
> > > I just wonder if this is a big pain to do right, what about charsets, IDN,
> > > and ugly details like that?
> >
> > Does anyone expect to care about non-ASCII node descs?
> > What can we sensibly do except take what we're given?
>
> node desc is UTF-8, hostname is IDNA, a conversion is required, see RFC
> 3490.
For what it is worth none of the diags or OpenSM support UTF-8. I'll add it to the list of things to do. ;-)
I don't know much about IDNA but it seems that ASCII is a subset of IDNA as well as UTF-8. Perhaps a check for ASCII in the hostname would suffice to at least make this work for ASCII hostnames only. (That should cover a significant portion of the hosts out there, if not all of them.)
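The check suggested above could look something like the following sketch. It assumes that "ASCII" here means printable, non-space 7-bit characters only; the function name `is_ascii_hostname` is hypothetical, and a kernel implementation would of course be written in C.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the hostname is non-empty and made
# up entirely of printable, non-space 7-bit ASCII characters ('!' to '~').
is_ascii_hostname() {
    [ -n "$1" ] || return 1
    # Delete every allowed byte and count what is left; any remaining
    # byte is a space, a control character, or non-ASCII.
    leftover=$(printf '%s' "$1" | LC_ALL=C tr -d '!-~' | wc -c)
    [ "$((leftover))" -eq 0 ]
}
```

A script that sets node descriptions could call this first and fall back to leaving the existing description alone when the check fails.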
>
> Does anyone care? Who knows, but very pedantically it is wrong to just copy
> the host name byte by byte. I only mention it to point out that it is
> trivial to do what Mike did, somewhat harder to do % escaping like you
> suggest and solve the multiple HCA problem, harder still to trap
> sethostname() and generate a trap, and extra special hard to correctly
> handle character sets on top of all that. :)
>
> So, is it still trivial to do it in the kernel?
Looking at the latest code, it seems that setting the node description for mlx4 and qib will result in a trap 144 being sent. A hook in sethostname which results in the regeneration of the node description and the generation of the trap seems pretty easy. So, from a technical side, I think the answer is "yes", it is pretty easy.
That said getting that hook into sethostname might be harder to get upstream than just writing the code. I can't comment on that because I don't have any idea how hard that would be. :-(
Ira
>
> Jason
--
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
* RE: [PATCH 0/2] Improved node descriptions
[not found] ` <20110223054151.GA2363-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-02-23 21:56 ` Ira Weiny
@ 2011-02-24 2:31 ` Mike Heinz
1 sibling, 0 replies; 20+ messages in thread
From: Mike Heinz @ 2011-02-24 2:31 UTC (permalink / raw)
To: Jason Gunthorpe, Roland Dreier
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Jason,
While I do think supporting the trap would be a great idea going forward, I don't see any reason why it couldn't be combined with this patch when we figure out how to detect the hostname change.
For example, it would be great for me as a developer to be able to flag a group of machines as being dedicated to me, by setting their node descriptions to "%h: %d Mike's MPI Testbed" or similar.
I could echo that into the node description at boot time and, if the host name changes afterwards, still have a trap triggered (somehow) to inform the SM of the change.
-----Original Message-----
From: Jason Gunthorpe [mailto:jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org]
Sent: Wednesday, February 23, 2011 12:42 AM
To: Roland Dreier
Cc: Mike Heinz; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 0/2] Improved node descriptions
On Tue, Feb 22, 2011 at 04:03:29PM -0800, Roland Dreier wrote:
> On Tue, Feb 22, 2011 at 3:43 PM, Jason Gunthorpe
> <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> > Doing it in userspace makes generating the node description changed
> > trap simpler?
>
> Hmm... how many places can do sethostname()? Seems easier to
> catch in the kernel than hook every place in userspace. (Although
> there's no kernel hook right now) I do agree it would be good to
> have some idea of how we could generate the "node desc changed"
> trap at appropriate times.
Wasn't thinking about 100% perfection, just if DHCP is the concern it
shouldn't be hard to hook that one place.
> > I just wonder if this is a big pain to do right, what about charsets, IDN,
> > and ugly details like that?
>
> Does anyone expect to care about non-ASCII node descs?
> What can we sensibly do except take what we're given?
node desc is UTF-8, hostname is IDNA, a conversion is required, see RFC
3490.
Does anyone care? Who knows, but very pedantically it is wrong to just copy
the host name byte by byte. I only mention it to point out that it is
trivial to do what Mike did, somewhat harder to do % escaping like you
suggest and solve the multiple HCA problem, harder still to trap
sethostname() and generate a trap, and extra special hard to correctly
handle character sets on top of all that. :)
So, is it still trivial to do it in the kernel?
Jason
Thread overview: 20+ messages
2011-02-17 21:30 [PATCH 0/2] Improved node descriptions Michael Heinz
2011-02-17 21:31 ` [PATCH 1/2] " Michael Heinz
2011-02-17 21:31 ` [PATCH 2/2] " Michael Heinz
2011-02-17 23:20 ` [PATCH 0/2] " Roland Dreier
[not found] ` <AANLkTim5MrHMVjaNFtHeWBy82dag4XNxdBcjBEW+d1yb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 4:19 ` Hal Rosenstock
[not found] ` <AANLkTikh-8uGccT0tumHAu6cPOBm+k8joCaQ4W-grkHd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-18 5:19 ` Roland Dreier
2011-02-18 16:22 ` Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B289C7-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
2011-02-18 22:57 ` Hal Rosenstock
2011-02-18 14:09 ` Mike Heinz
2011-02-19 7:23 ` Jack Morgenstein
[not found] ` <201102190923.12641.jackm-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2011-02-21 18:30 ` Mike Heinz
2011-02-19 19:24 ` Jason Gunthorpe
[not found] ` <20110219192458.GB4506-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-02-21 19:26 ` Mike Heinz
[not found] ` <4C2744E8AD2982428C5BFE523DF8CDCB4A20B28B17-amwN6d8PyQWXx9kJd3VG2h2eb7JE58TQ@public.gmane.org>
2011-02-22 23:13 ` Roland Dreier
[not found] ` <AANLkTi=1rmRckZz1iAXLpakf5bMuBp4koGOyO-FUDz_M-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-22 23:43 ` Jason Gunthorpe
[not found] ` <20110222234304.GA21731-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-02-23 0:03 ` Roland Dreier
[not found] ` <AANLkTink3ec2O8-ExPuJpJd5j_Y0UtiL=QtM0rrmZR88-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-02-23 5:41 ` Jason Gunthorpe
[not found] ` <20110223054151.GA2363-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2011-02-23 21:56 ` Ira Weiny
2011-02-24 2:31 ` Mike Heinz
2011-02-23 21:02 ` Mike Heinz