linux-rdma.vger.kernel.org archive mirror
* [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset
@ 2010-11-12 22:11 Jim Schutt
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

Hi Sasha,

These patches clean up and add documentation to the
torus-2QoS routing module for OpenSM.  They apply on
top of my previous bug-fix patchset from September
(http://www.spinics.net/lists/linux-rdma/msg05809.html),
which applies to your torus-2qos branch.

Thanks -- Jim


Jim Schutt (13):
  Revert "opensm: Do not require -Q option for torus-2QoS routing
    engine."
  opensm: torus-2QoS requires that QoS be enabled.
  opensm/osm_ucast_mgr.c: ensure osm_ucast_mgr_process() returns
    failure when no routing engine runs.
  opensm: Fill in default QoS values at last possible moment.
  opensm: Cause torus-2QoS to warn if QoS configuration will cause
    issues.
  opensm/osm_torus.c: Also parse DOS line endings in torus-2QoS.conf.
  opensm/osm_torus.c: Use PRIx64 for GUID printing.
  opensm/osm_torus.c: Ignore multiple configurations of torus size.
  opensm/osm_subnet.c: Add torus-2QoS config file option to those
    configurable via opensm config file.
  opensm/main.c:  Add description of "no_fallback" to
    "--routing_engine" option documentation.
  opensm/man/opensm.8.in:  Add references to torus-2QoS.
  opensm: Add torus-2QoS man pages.
  opensm/doc/current-routing.txt: Sync torus-2QoS information with new
    man pages.

 opensm/Makefile.am              |    2 +-
 opensm/configure.in             |    6 +-
 opensm/doc/current-routing.txt  |  141 +++++++++++--
 opensm/man/opensm.8.in          |   29 +++-
 opensm/man/torus-2QoS.8.in      |  476 +++++++++++++++++++++++++++++++++++++++
 opensm/man/torus-2QoS.conf.5.in |  184 +++++++++++++++
 opensm/opensm/main.c            |    3 +
 opensm/opensm/osm_qos.c         |   62 ++++--
 opensm/opensm/osm_subnet.c      |   75 +++----
 opensm/opensm/osm_torus.c       |  319 +++++++++++++++++---------
 opensm/opensm/osm_ucast_mgr.c   |    1 +
 11 files changed, 1105 insertions(+), 193 deletions(-)
 create mode 100644 opensm/man/torus-2QoS.8.in
 create mode 100644 opensm/man/torus-2QoS.conf.5.in


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 01/13] Revert "opensm: Do not require -Q option for torus-2QoS routing engine."
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 02/13] opensm: torus-2QoS requires that QoS be enabled Jim Schutt
                     ` (13 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

This reverts commit b9691580e29c6a8cf1f45995988350c02826786d.

All other routing engines require -Q before SL2VL maps are programmed,
so torus-2QoS should behave the same way.

However, torus-2QoS cannot route correctly unless SL2VL maps are
programmed, so a check that QoS is enabled will need to be added.

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_qos.c    |    7 ++-----
 opensm/opensm/osm_subnet.c |   18 +++++++++---------
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index ab55918..ba198a0 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -314,9 +314,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	int ret = 0;
 	int vlarb_only;
 
-	if (!(p_osm->subn.opt.qos ||
-	      (p_osm->routing_engine_used &&
-	       p_osm->routing_engine_used->update_sl2vl)))
+	if (!p_osm->subn.opt.qos)
 		return 0;
 
 	OSM_LOG_ENTER(&p_osm->log);
@@ -333,8 +331,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 	cl_plock_excl_acquire(&p_osm->lock);
 
 	/* read QoS policy config file */
-	if (p_osm->subn.opt.qos)
-		osm_qos_parse_policy_file(&p_osm->subn);
+	osm_qos_parse_policy_file(&p_osm->subn);
 
 	p_tbl = &p_osm->subn.port_guid_tbl;
 	p_next = cl_qmap_head(p_tbl);
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index f714af7..bc34a0f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -1051,8 +1051,6 @@ static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix,
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 {
-	osm_qos_options_t dflt;
-
 	if (p_opts->lmc > 7) {
 		log_report(" Invalid Cached Option Value:lmc = %u:"
 			   "Using Default:%u\n", p_opts->lmc, OSM_DEFAULT_LMC);
@@ -1103,15 +1101,17 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 		p_opts->console = OSM_DEFAULT_CONSOLE;
 	}
 
+	if (p_opts->qos) {
+		osm_qos_options_t dflt;
 
-	/* the default options in qos_options must be correct.
-	 * every other one need not be, b/c those will default
-	 * back to whatever is in qos_options.
-	 */
-	subn_set_default_qos_options(&dflt);
-	subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
+		/* the default options in qos_options must be correct.
+		 * every other one need not be, b/c those will default
+		 * back to whatever is in qos_options.
+		 */
 
-	if (p_opts->qos) {
+		subn_set_default_qos_options(&dflt);
+
+		subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
 		subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
 				    &p_opts->qos_options);
 		subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-- 
1.6.2.2



* [PATCH 02/13] opensm: torus-2QoS requires that QoS be enabled.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
  2010-11-12 22:11   ` [PATCH 01/13] Revert "opensm: Do not require -Q option for torus-2QoS routing engine." Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 03/13] opensm/osm_ucast_mgr.c: ensure osm_ucast_mgr_process() returns failure when no routing engine runs Jim Schutt
                     ` (12 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

SL2VL maps are only programmed if QoS is enabled.  Require this to be
the case if torus-2QoS is configured, and print a message otherwise.

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 3b67f16..aeb4fe6 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9045,6 +9045,14 @@ int torus_build_lfts(void *context)
 	struct fabric *fabric;
 	struct torus *torus;
 
+	if (!ctx->osm->subn.opt.qos) {
+		OSM_LOG(&ctx->osm->log, OSM_LOG_ERROR,
+			"Error: Routing engine list contains torus-2QoS. "
+			"Enable QoS for correct operation "
+			"(-Q or 'qos TRUE' in opensm.conf).\n");
+		return status;
+	}
+
 	fabric = &ctx->fabric;
 	teardown_fabric(fabric);
 
-- 
1.6.2.2



* [PATCH 03/13] opensm/osm_ucast_mgr.c: ensure osm_ucast_mgr_process() returns failure when no routing engine runs.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
  2010-11-12 22:11   ` [PATCH 01/13] Revert "opensm: Do not require -Q option for torus-2QoS routing engine." Jim Schutt
  2010-11-12 22:11   ` [PATCH 02/13] opensm: torus-2QoS requires that QoS be enabled Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 04/13] opensm: Fill in default QoS values at last possible moment Jim Schutt
                     ` (11 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

If all configured routing engines fail to initialize, the routing engine
list will be empty.  If opensm is also configured for no fallback routing
engine, then osm_ucast_mgr_process() can incorrectly return success even
though no routing engine will run.  As a result, heavy sweeps are
attempted in a tight loop, spamming the opensm logs and the fabric.

With this fix, heavy sweeps occur only after a trap or the sweep interval,
reducing log spam and making it easier to find the reason for the failure.
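
A minimal sketch of the failure mode described above, not the opensm
code itself: struct engine, run_engines() and the two stub engines are
hypothetical names.  The point is only that a status variable assigned
inside a loop must be preset to "failed" in case the loop never runs.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry in a routing engine list. */
struct engine {
	const char *name;
	int (*route)(void);	/* returns 0 on success */
	struct engine *next;
};

/* Stub engines used only to exercise the loop. */
static int always_fails(void) { return -1; }
static int always_works(void) { return 0; }

/*
 * Returns 0 if some engine routed the fabric, nonzero otherwise.
 * Presetting failed to -1 covers the empty-list case, where the
 * loop body never executes and nothing else would set the status.
 */
static int run_engines(struct engine *e)
{
	int failed = -1;

	while (e) {
		failed = e->route();
		if (!failed)
			break;
		e = e->next;
	}
	return failed;
}
```

Calling run_engines(NULL) reports failure, so a caller can wait for the
next trap or sweep interval instead of retrying immediately.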

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_ucast_mgr.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
index 85495eb..3e9c836 100644
--- a/opensm/opensm/osm_ucast_mgr.c
+++ b/opensm/opensm/osm_ucast_mgr.c
@@ -1086,6 +1086,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
 	    ucast_mgr_setup_all_switches(p_mgr->p_subn) < 0)
 		goto Exit;
 
+	failed = -1;
 	p_osm->routing_engine_used = NULL;
 	while (p_routing_eng) {
 		failed = ucast_mgr_route(p_routing_eng, p_osm);
-- 
1.6.2.2



* [PATCH 04/13] opensm: Fill in default QoS values at last possible moment.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (2 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 03/13] opensm/osm_ucast_mgr.c: ensure osm_ucast_mgr_process() returns failure when no routing engine runs Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 05/13] opensm: Cause torus-2QoS to warn if QoS configuration will cause issues Jim Schutt
                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

The comments for struct osm_qos_options in osm_subnet.h describe flag
values that mark a struct member as using its default QoS value.
osm_qos_options structs are initialized with these flag values in
subn_init_qos_options(), but osm_subn_verify_config() then overwrites
them with the actual default values.

It turns out to be easy to wait until qos_build_config() to detect the
flag values and substitute the actual defaults as needed.  osm_qos_setup()
and qos_build_config() already set unconfigured CA, switch port, and
router specific QoS parameters from the configured default QoS
parameters, so the duplicate code can be removed from
osm_subn_verify_config().

Besides simplifying the code, delaying the replacement of flag values
with the actual defaults lets a routing engine detect whether
configured or default values are in use.

For example, torus-2QoS can never use configured qos_sl2vl values,
but should do no more than warn when they are configured.
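
The late-resolution idea can be sketched as follows.  This is an
illustration under assumptions, not the opensm implementation: struct
qos_opts is a trimmed-down hypothetical option set, and the SKETCH_*
defaults are placeholder values rather than the real OSM_DEFAULT_QOS_*
ones.

```c
#include <stddef.h>

/* Hypothetical option set: 0 and NULL are the "not configured" flags. */
struct qos_opts {
	unsigned max_vls;	/* 0 means unset */
	const char *sl2vl;	/* NULL means unset */
};

/* Placeholder built-in defaults, chosen only for this sketch. */
#define SKETCH_DEFAULT_MAX_VLS 15
#define SKETCH_DEFAULT_SL2VL   "0,1,2,3"

/* Target-specific value, else overall configured value, else built-in. */
static unsigned resolve_max_vls(const struct qos_opts *opt,
				const struct qos_opts *dflt)
{
	if (opt->max_vls > 0)
		return opt->max_vls;
	if (dflt->max_vls > 0)
		return dflt->max_vls;
	return SKETCH_DEFAULT_MAX_VLS;
}

static const char *resolve_sl2vl(const struct qos_opts *opt,
				 const struct qos_opts *dflt)
{
	if (opt->sl2vl)
		return opt->sl2vl;
	if (dflt->sl2vl)
		return dflt->sl2vl;
	return SKETCH_DEFAULT_SL2VL;
}

/*
 * Because the flag value is still in place until resolution time, a
 * caller can tell an explicit setting from an untouched default.
 */
static int sl2vl_is_configured(const struct qos_opts *opt)
{
	return opt->sl2vl != NULL;
}
```

Had the defaults been copied in during config verification, the check in
sl2vl_is_configured() would always be true and carry no information.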

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_qos.c    |   55 +++++++++++++++++++++++++++-------
 opensm/opensm/osm_subnet.c |   70 ++++++++++++++------------------------------
 2 files changed, 66 insertions(+), 59 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index ba198a0..afea7bb 100644
--- a/opensm/opensm/osm_qos.c
+++ b/opensm/opensm/osm_qos.c
@@ -376,7 +376,7 @@ int osm_qos_setup(osm_opensm_t * p_osm)
 /*
  *  QoS config stuff
  */
-static int parse_one_unsigned(char *str, char delim, unsigned *val)
+static int parse_one_unsigned(const char *str, char delim, unsigned *val)
 {
 	char *end;
 	*val = strtoul(str, &end, 0);
@@ -385,10 +385,10 @@ static int parse_one_unsigned(char *str, char delim, unsigned *val)
 	return (int)(end - str);
 }
 
-static int parse_vlarb_entry(char *str, ib_vl_arb_element_t * e)
+static int parse_vlarb_entry(const char *str, ib_vl_arb_element_t * e)
 {
 	unsigned val;
-	char *p = str;
+	const char *p = str;
 	p += parse_one_unsigned(p, ':', &val);
 	e->vl = val % 15;
 	p += parse_one_unsigned(p, ',', &val);
@@ -396,10 +396,10 @@ static int parse_vlarb_entry(char *str, ib_vl_arb_element_t * e)
 	return (int)(p - str);
 }
 
-static int parse_sl2vl_entry(char *str, uint8_t * raw)
+static int parse_sl2vl_entry(const char *str, uint8_t * raw)
 {
 	unsigned val1, val2;
-	char *p = str;
+	const char *p = str;
 	p += parse_one_unsigned(p, ',', &val1);
 	p += parse_one_unsigned(p, ',', &val2);
 	*raw = (val1 << 4) | (val2 & 0xf);
@@ -410,18 +410,36 @@ static void qos_build_config(struct qos_config *cfg, osm_qos_options_t * opt,
 			     osm_qos_options_t * dflt)
 {
 	int i;
-	char *p;
+	const char *p;
 
 	memset(cfg, 0, sizeof(*cfg));
 
-	cfg->max_vls = opt->max_vls > 0 ? opt->max_vls : dflt->max_vls;
+	if (opt->max_vls > 0)
+		cfg->max_vls = opt->max_vls;
+	else {
+		if (dflt->max_vls > 0)
+			cfg->max_vls = dflt->max_vls;
+		else
+			cfg->max_vls = OSM_DEFAULT_QOS_MAX_VLS;
+	}
 
 	if (opt->high_limit >= 0)
 		cfg->vl_high_limit = (uint8_t) opt->high_limit;
-	else
-		cfg->vl_high_limit = (uint8_t) dflt->high_limit;
+	else {
+		if (dflt->high_limit >= 0)
+			cfg->vl_high_limit = (uint8_t) dflt->high_limit;
+		else
+			cfg->vl_high_limit = (uint8_t) OSM_DEFAULT_QOS_HIGH_LIMIT;
+	}
 
-	p = opt->vlarb_high ? opt->vlarb_high : dflt->vlarb_high;
+	if (opt->vlarb_high)
+		p = opt->vlarb_high;
+	else {
+		if (dflt->vlarb_high)
+			p = dflt->vlarb_high;
+		else
+			p = OSM_DEFAULT_QOS_VLARB_HIGH;
+	}
 	for (i = 0; i < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK; i++) {
 		p += parse_vlarb_entry(p,
 				       &cfg->vlarb_high[i /
@@ -430,7 +448,14 @@ static void qos_build_config(struct qos_config *cfg, osm_qos_options_t * opt,
 						IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK]);
 	}
 
-	p = opt->vlarb_low ? opt->vlarb_low : dflt->vlarb_low;
+	if (opt->vlarb_low)
+		p = opt->vlarb_low;
+	else {
+		if (dflt->vlarb_low)
+			p = dflt->vlarb_low;
+		else
+			p = OSM_DEFAULT_QOS_VLARB_LOW;
+	}
 	for (i = 0; i < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK; i++) {
 		p += parse_vlarb_entry(p,
 				       &cfg->vlarb_low[i /
@@ -440,6 +465,14 @@ static void qos_build_config(struct qos_config *cfg, osm_qos_options_t * opt,
 	}
 
 	p = opt->sl2vl ? opt->sl2vl : dflt->sl2vl;
+	if (opt->sl2vl)
+		p = opt->sl2vl;
+	else {
+		if (dflt->sl2vl)
+			p = dflt->sl2vl;
+		else
+			p = OSM_DEFAULT_QOS_SL2VL;
+	}
 	for (i = 0; i < IB_MAX_NUM_VLS / 2; i++)
 		p += parse_sl2vl_entry(p, &cfg->sl2vl.raw_vl_by_sl[i]);
 }
diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index bc34a0f..be406ac 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -636,15 +636,6 @@ osm_mgrp_t *osm_get_mgrp_by_mgid(IN osm_subn_t * subn, IN ib_gid_t * mgid)
 	return NULL;
 }
 
-static void subn_set_default_qos_options(IN osm_qos_options_t * opt)
-{
-	opt->max_vls = OSM_DEFAULT_QOS_MAX_VLS;
-	opt->high_limit = OSM_DEFAULT_QOS_HIGH_LIMIT;
-	opt->vlarb_high = OSM_DEFAULT_QOS_VLARB_HIGH;
-	opt->vlarb_low = OSM_DEFAULT_QOS_VLARB_LOW;
-	opt->sl2vl = OSM_DEFAULT_QOS_SL2VL;
-}
-
 static void subn_init_qos_options(osm_qos_options_t *opt, osm_qos_options_t *f)
 {
 	opt->max_vls = 0;
@@ -911,38 +902,37 @@ static ib_api_status_t parse_prefix_routes_file(IN osm_subn_t * p_subn)
 	return (errors == 0) ? IB_SUCCESS : IB_ERROR;
 }
 
-static void subn_verify_max_vls(unsigned *max_vls, const char *prefix, unsigned dflt)
+static void subn_verify_max_vls(unsigned *max_vls, const char *prefix)
 {
 	if (!*max_vls || *max_vls > 15) {
 		if (*max_vls)
 			log_report(" Invalid Cached Option: %s_max_vls=%u: "
 				   "Using Default = %u\n",
-				   prefix, *max_vls, dflt);
-		*max_vls = dflt;
+				   prefix, *max_vls, OSM_DEFAULT_QOS_MAX_VLS);
+		*max_vls = 0;
 	}
 }
 
-static void subn_verify_high_limit(int *high_limit, const char *prefix, int dflt)
+static void subn_verify_high_limit(int *high_limit, const char *prefix)
 {
 	if (*high_limit < 0 || *high_limit > 255) {
 		if (*high_limit > 255)
 			log_report(" Invalid Cached Option: %s_high_limit=%d: "
 				   "Using Default: %d\n",
-				   prefix, *high_limit, dflt);
-		*high_limit = dflt;
+				   prefix, *high_limit,
+				   OSM_DEFAULT_QOS_HIGH_LIMIT);
+		*high_limit = -1;
 	}
 }
 
 static void subn_verify_vlarb(char **vlarb, const char *prefix,
-			      const char *suffix, char *dflt)
+			      const char *suffix)
 {
 	char *str, *tok, *end, *ptr;
 	int count = 0;
 
-	if (*vlarb == NULL) {
-		*vlarb = strdup(dflt);
+	if (*vlarb == NULL)
 		return;
-	}
 
 	str = strdup(*vlarb);
 
@@ -1001,15 +991,13 @@ static void subn_verify_vlarb(char **vlarb, const char *prefix,
 	free(str);
 }
 
-static void subn_verify_sl2vl(char **sl2vl, const char *prefix, char *dflt)
+static void subn_verify_sl2vl(char **sl2vl, const char *prefix)
 {
 	char *str, *tok, *end, *ptr;
 	int count = 0;
 
-	if (*sl2vl == NULL) {
-		*sl2vl = strdup(dflt);
+	if (*sl2vl == NULL)
 		return;
-	}
 
 	str = strdup(*sl2vl);
 
@@ -1039,14 +1027,13 @@ static void subn_verify_sl2vl(char **sl2vl, const char *prefix, char *dflt)
 	free(str);
 }
 
-static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix,
-				osm_qos_options_t *dflt)
+static void subn_verify_qos_set(osm_qos_options_t *set, const char *prefix)
 {
-	subn_verify_max_vls(&set->max_vls, prefix, dflt->max_vls);
-	subn_verify_high_limit(&set->high_limit, prefix, dflt->high_limit);
-	subn_verify_vlarb(&set->vlarb_low, prefix, "low", dflt->vlarb_low);
-	subn_verify_vlarb(&set->vlarb_high, prefix, "high", dflt->vlarb_high);
-	subn_verify_sl2vl(&set->sl2vl, prefix, dflt->sl2vl);
+	subn_verify_max_vls(&set->max_vls, prefix);
+	subn_verify_high_limit(&set->high_limit, prefix);
+	subn_verify_vlarb(&set->vlarb_low, prefix, "low");
+	subn_verify_vlarb(&set->vlarb_high, prefix, "high");
+	subn_verify_sl2vl(&set->sl2vl, prefix);
 }
 
 int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
@@ -1102,24 +1089,11 @@ int osm_subn_verify_config(IN osm_subn_opt_t * p_opts)
 	}
 
 	if (p_opts->qos) {
-		osm_qos_options_t dflt;
-
-		/* the default options in qos_options must be correct.
-		 * every other one need not be, b/c those will default
-		 * back to whatever is in qos_options.
-		 */
-
-		subn_set_default_qos_options(&dflt);
-
-		subn_verify_qos_set(&p_opts->qos_options, "qos", &dflt);
-		subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca",
-				    &p_opts->qos_options);
-		subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0",
-				    &p_opts->qos_options);
-		subn_verify_qos_set(&p_opts->qos_swe_options, "qos_swe",
-				    &p_opts->qos_options);
-		subn_verify_qos_set(&p_opts->qos_rtr_options, "qos_rtr",
-				    &p_opts->qos_options);
+		subn_verify_qos_set(&p_opts->qos_options, "qos");
+		subn_verify_qos_set(&p_opts->qos_ca_options, "qos_ca");
+		subn_verify_qos_set(&p_opts->qos_sw0_options, "qos_sw0");
+		subn_verify_qos_set(&p_opts->qos_swe_options, "qos_swe");
+		subn_verify_qos_set(&p_opts->qos_rtr_options, "qos_rtr");
 	}
 
 #ifdef ENABLE_OSM_PERF_MGR
-- 
1.6.2.2



* [PATCH 05/13] opensm: Cause torus-2QoS to warn if QoS configuration will cause issues.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (3 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 04/13] opensm: Fill in default QoS values at last possible moment Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 06/13] opensm/osm_torus.c: Also parse DOS line endings in torus-2QoS.conf Jim Schutt
                     ` (9 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

Torus-2QoS needs 8 VLs, and complete control over sl2vl maps, in order
to provide 2 QoS levels with routing that is free of credit loops on torus
fabrics.  Warn to this effect if an insufficient max_vls configuration
or a non-default qos_sl2vl configuration is detected.

Also, torus-2QoS needs to use VLs 0-3 to implement one QoS level, and
VLs 4-7 to implement the other.  The VLarb weights for VLs 0-3 should
all have the same value, and similarly for the weights for VLs 4-7.
Otherwise, differences in data rates for different paths may cause
hard-to-diagnose application issues.  Warn when unequal weights are
detected.
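
The weight check described above can be sketched roughly as below.  This
is a simplified standalone version, not the function in the patch: the
name vlarb_weights_uniform() and the SKETCH_NUM_VLS constant are
hypothetical, and the input is assumed to be a "vl:weight,vl:weight,..."
string.

```c
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_NUM_VLS 16

/*
 * Sum the weight given to each VL in a "vl:weight,vl:weight,..."
 * string, then report whether VLs 0-3 share one total weight and
 * VLs 4-7 share another.  Parsing stops at the first malformed entry.
 */
static bool vlarb_weights_uniform(const char *s)
{
	unsigned total[SKETCH_NUM_VLS] = { 0 };
	unsigned i;

	while (*s) {
		char *end;
		unsigned long vl, weight;

		vl = strtoul(s, &end, 0);
		if (end == s || *end != ':')
			break;
		s = end + 1;
		weight = strtoul(s, &end, 0);
		if (end == s)
			break;
		total[vl & 0xf] += (unsigned)(weight & 0xff);
		s = (*end == ',') ? end + 1 : end;
	}
	for (i = 1; i < 4; i++)
		if (total[i] != total[0] || total[i + 4] != total[4])
			return false;
	return true;
}
```

A VL that appears in several entries has its weights summed, so
"0:2,0:2,1:4,2:4,3:4" counts as uniform for VLs 0-3.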

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |   87 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 87 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index aeb4fe6..784955d 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -9038,6 +9038,84 @@ out:
 }
 
 static
+void check_vlarb_config(const char *vlarb_str, bool is_default,
+			const char *str, const char *pri, osm_log_t *log)
+{
+	unsigned total_weight[IB_MAX_NUM_VLS] = {0,};
+	unsigned i = 0, v, vl = 0;
+	char *end;
+	bool uniform;
+
+	while (*vlarb_str && i++ < 2 * IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK) {
+		v = strtoul(vlarb_str, &end, 0);
+		if (*end)
+			end++;
+		vlarb_str = end;
+		if (i & 0x1)
+			vl = v & 0xf;
+		else
+			total_weight[vl] += v & 0xff;
+	}
+	uniform = true;
+	v = total_weight[0];
+	for (i = 1; i < 8; i++) {
+		if (i == 4)
+			v = total_weight[i];
+		if (total_weight[i] != v)
+			uniform = false;
+	}
+	if (!uniform)
+		OSM_LOG(log, OSM_LOG_INFO,
+			"Warning: torus-2QoS requires same VLarb weights for "
+			"VLs 0-3; also for VLs 4-7: not true for %s "
+			"%s_vlarb_%s\n",
+			(is_default ? "default" : "configured"), str, pri);
+}
+
+static
+void check_qos_config(osm_qos_options_t *opt, bool tgt_is_default,
+		      const char *str, osm_log_t *log)
+{
+	const char *vlarb_str;
+	bool is_default;
+
+	if (opt->max_vls > 0 && opt->max_vls < 8)
+		OSM_LOG(log, OSM_LOG_INFO,
+			"Warning: full torus-2QoS functionality not available "
+			"for configured %s_max_vls = %d\n", str, opt->max_vls);
+
+	if (opt->vlarb_high) {
+		is_default = false;
+		vlarb_str = opt->vlarb_high;
+	} else{
+		is_default = true;
+		vlarb_str = OSM_DEFAULT_QOS_VLARB_HIGH;
+	}
+	/*
+	 * Only check values that were actually configured, or the overall
+	 * defaults that target-specific (CA, switch port, etc) defaults
+	 * are set from.
+	 */
+	if (!is_default || tgt_is_default)
+		check_vlarb_config(vlarb_str, is_default, str, "high", log);
+
+	if (opt->vlarb_low) {
+		is_default = false;
+		vlarb_str = opt->vlarb_low;
+	} else {
+		is_default = true;
+		vlarb_str = OSM_DEFAULT_QOS_VLARB_LOW;
+	}
+	if (!is_default || tgt_is_default)
+		check_vlarb_config(vlarb_str, is_default, str, "low", log);
+
+	if (opt->sl2vl)
+		OSM_LOG(log, OSM_LOG_INFO,
+			"Warning: torus-2QoS must override configured "
+			"%s_sl2vl to generate deadlock-free routes\n", str);
+}
+
+static
 int torus_build_lfts(void *context)
 {
 	int status = -1;
@@ -9111,9 +9189,18 @@ out:
 		if (torus)
 			teardown_torus(torus);
 	} else {
+		osm_subn_opt_t *opt = &torus->osm->subn.opt;
+		osm_log_t *log = &torus->osm->log;
+
 		if (ctx->torus)
 			teardown_torus(ctx->torus);
 		ctx->torus = torus;
+
+		check_qos_config(&opt->qos_options, 1, "qos", log);
+		check_qos_config(&opt->qos_ca_options, 0, "qos_ca", log);
+		check_qos_config(&opt->qos_sw0_options, 0, "qos_sw0", log);
+		check_qos_config(&opt->qos_swe_options, 0, "qos_swe", log);
+		check_qos_config(&opt->qos_rtr_options, 0, "qos_rtr", log);
 	}
 	teardown_fabric(fabric);
 	return status;
-- 
1.6.2.2



* [PATCH 06/13] opensm/osm_torus.c: Also parse DOS line endings in torus-2QoS.conf.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (4 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 05/13] opensm: Cause torus-2QoS to warn if QoS configuration will cause issues Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 07/13] opensm/osm_torus.c: Use PRIx64 for GUID printing Jim Schutt
                     ` (8 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt
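
A small sketch of why adding '\015' (octal for carriage return) to the
separator set matters, assuming the config lines are split with
strtok().  The helper last_token() is hypothetical and not part of
osm_torus.c.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the last whitespace-separated token of line, treating a
 * carriage return as a separator so that a DOS "\r\n" line ending
 * does not stick to the final token.  Modifies line in place.
 */
static const char *last_token(char *line)
{
	const char *sep = " \n\t\015";
	const char *last = NULL;
	char *tok;

	for (tok = strtok(line, sep); tok; tok = strtok(NULL, sep))
		last = tok;
	return last;
}
```

Without '\015' in the separator set, the last token of a line ending in
"\r\n" would keep a trailing carriage return and fail later comparisons
or numeric parsing.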


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 784955d..804334f 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -980,7 +980,7 @@ bool parse_config(const char *fn, struct fabric *f, struct torus *t)
 	FILE *fp;
 	char *keyword;
 	char *line_buf = NULL;
-	const char *parse_sep = " \n\t";
+	const char *parse_sep = " \n\t\015";
 	size_t line_buf_sz = 0;
 	size_t line_cntr = 0;
 	ssize_t llen;
-- 
1.6.2.2



* [PATCH 07/13] opensm/osm_torus.c: Use PRIx64 for GUID printing.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (5 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 06/13] opensm/osm_torus.c: Also parse DOS line endings in torus-2QoS.conf Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 08/13] opensm/osm_torus.c: Ignore multiple configurations of torus size Jim Schutt
                     ` (7 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt
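
For reference, a minimal sketch of the PRIx64 idiom from <inttypes.h>.
The helper format_guid() is hypothetical; it takes a GUID already
converted to host byte order.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a host-byte-order 64-bit GUID as hex into buf.  PRIx64
 * expands to the correct length modifier for uint64_t on the target,
 * so no cast to unsigned long long is needed.
 */
static int format_guid(char *buf, size_t len, uint64_t guid)
{
	return snprintf(buf, len, "0x%04" PRIx64, guid);
}
```

The macro is a string literal, so it is concatenated with the adjacent
format string pieces at compile time.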


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |  216 ++++++++++++++++++++++----------------------
 1 files changed, 108 insertions(+), 108 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 804334f..8e0435b 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -60,8 +60,6 @@
 #define SWITCH_MAX_PORTGRPS  (1 + 2 * TORUS_MAX_DIM)
 
 typedef ib_net64_t guid_t;
-#define ntohllu(v_64bit) ((unsigned long long)cl_ntoh64(v_64bit))
-
 
 /*
  * An endpoint terminates a link, and is one of three types:
@@ -584,8 +582,8 @@ bool build_sw_endpoint(struct fabric *f, osm_port_t *osm_port)
 	sw = find_f_sw(f, sw_guid);
 	if (!sw) {
 		OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-			"Error: missing switch w/ GUID 0x%04llx\n",
-			ntohllu(sw_guid));
+			"Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+			cl_ntoh64(sw_guid));
 		goto out;
 	}
 	/*
@@ -598,9 +596,9 @@ bool build_sw_endpoint(struct fabric *f, osm_port_t *osm_port)
 		} else
 			OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
 				"Error: switch port %d has id "
-				"0x%04llx, expected 0x%04llx\n",
-				sw_port, ntohllu(sw->port[sw_port]->n_id),
-				ntohllu(sw_guid));
+				"0x%04"PRIx64", expected 0x%04"PRIx64"\n",
+				sw_port, cl_ntoh64(sw->port[sw_port]->n_id),
+				cl_ntoh64(sw_guid));
 		goto out;
 	}
 	ep = calloc(1, sizeof(*ep));
@@ -657,8 +655,8 @@ bool build_ca_link(struct fabric *f,
 	sw = find_f_sw(f, sw_guid);
 	if (!sw) {
 		OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-			"Error: missing switch w/ GUID 0x%04llx\n",
-			ntohllu(sw_guid));
+			"Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+			cl_ntoh64(sw_guid));
 		goto out;
 	}
 	l = alloc_flink(f);
@@ -713,15 +711,15 @@ bool build_link(struct fabric *f,
 	sw0 = find_f_sw(f, sw_guid0);
 	if (!sw0) {
 		OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-			"Error: missing switch w/ GUID 0x%04llx\n",
-			ntohllu(sw_guid0));
+			"Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+			cl_ntoh64(sw_guid0));
 			goto out;
 	}
 	sw1 = find_f_sw(f, sw_guid1);
 	if (!sw1) {
 		OSM_LOG(&f->osm->log, OSM_LOG_ERROR,
-			"Error: missing switch w/ GUID 0x%04llx\n",
-			ntohllu(sw_guid1));
+			"Error: missing switch w/ GUID 0x%04"PRIx64"\n",
+			cl_ntoh64(sw_guid1));
 			goto out;
 	}
 	l = alloc_flink(f);
@@ -1242,10 +1240,10 @@ void diagnose_fabric(struct fabric *f)
 
 		OSM_LOG(&f->osm->log, OSM_LOG_INFO,
 			"Found non-torus fabric link:"
-			" sw GUID 0x%04llx port %d <->"
-			" sw GUID 0x%04llx port %d\n",
-			ntohllu(l->end[0].n_id), l->end[0].port,
-			ntohllu(l->end[1].n_id), l->end[1].port);
+			" sw GUID 0x%04"PRIx64" port %d <->"
+			" sw GUID 0x%04"PRIx64" port %d\n",
+			cl_ntoh64(l->end[0].n_id), l->end[0].port,
+			cl_ntoh64(l->end[1].n_id), l->end[1].port);
 	}
 	/*
 	 * Report on any switches with ports using endpoints that didn't
@@ -1267,8 +1265,8 @@ void diagnose_fabric(struct fabric *f)
 
 			OSM_LOG(&f->osm->log, OSM_LOG_INFO,
 				"Found non-torus fabric port:"
-				" sw GUID 0x%04llx port %d\n",
-				ntohllu(f->sw[k]->n_id), p);
+				" sw GUID 0x%04"PRIx64" port %d\n",
+				cl_ntoh64(f->sw[k]->n_id), p);
 		}
 }
 
@@ -1423,15 +1421,15 @@ bool connect_tlink(struct port_grp *pg0, struct endpoint *f_ep0,
 	if (pg0->port_cnt == t->portgrp_sz) {
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 			"Error: exceeded port group max "
-			"port count (%d): switch GUID 0x%04llx\n",
-			t->portgrp_sz, ntohllu(pg0->sw->n_id));
+			"port count (%d): switch GUID 0x%04"PRIx64"\n",
+			t->portgrp_sz, cl_ntoh64(pg0->sw->n_id));
 		goto out;
 	}
 	if (pg1->port_cnt == t->portgrp_sz) {
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 			"Error: exceeded port group max "
-			"port count (%d): switch GUID 0x%04llx\n",
-			t->portgrp_sz, ntohllu(pg1->sw->n_id));
+			"port count (%d): switch GUID 0x%04"PRIx64"\n",
+			t->portgrp_sz, cl_ntoh64(pg1->sw->n_id));
 		goto out;
 	}
 	l = alloc_tlink(t);
@@ -1536,10 +1534,11 @@ bool link_tswitches(struct torus *t, int cdir,
 	default:
 	cdir_error:
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR, "Error: "
-			"sw 0x%04llx (%d,%d,%d) <--> sw 0x%04llx (%d,%d,%d) "
+			"sw 0x%04"PRIx64" (%d,%d,%d) <--> "
+			"sw 0x%04"PRIx64" (%d,%d,%d) "
 			"invalid torus %s link orientation\n",
-			ntohllu(t_sw0->n_id), t_sw0->i, t_sw0->j, t_sw0->k,
-			ntohllu(t_sw1->n_id), t_sw1->i, t_sw1->j, t_sw1->k,
+			cl_ntoh64(t_sw0->n_id), t_sw0->i, t_sw0->j, t_sw0->k,
+			cl_ntoh64(t_sw1->n_id), t_sw1->i, t_sw1->j, t_sw1->k,
 			cdir_name);
 		goto out;
 	}
@@ -1550,8 +1549,8 @@ bool link_tswitches(struct torus *t, int cdir,
 	if (!f_sw0 || !f_sw1) {
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 			"Error: missing fabric switches!\n"
-			"  switch GUIDs: 0x%04llx 0x%04llx\n",
-			ntohllu(t_sw0->n_id), ntohllu(t_sw1->n_id));
+			"  switch GUIDs: 0x%04"PRIx64" 0x%04"PRIx64"\n",
+			cl_ntoh64(t_sw0->n_id), cl_ntoh64(t_sw1->n_id));
 		goto out;
 	}
 	pg0 = &t_sw0->ptgrp[2*cdir + 1];
@@ -1586,9 +1585,9 @@ bool link_tswitches(struct torus *t, int cdir,
 		if (!(f_ep0->type == PASSTHRU && f_ep1->type == PASSTHRU)) {
 			OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 				"Error: not interswitch "
-				"link:\n  0x%04llx/%d <-> 0x%04llx/%d\n",
-				ntohllu(f_ep0->n_id), f_ep0->port,
-				ntohllu(f_ep1->n_id), f_ep1->port);
+				"link:\n  0x%04"PRIx64"/%d <-> 0x%04"PRIx64"/%d\n",
+				cl_ntoh64(f_ep0->n_id), f_ep0->port,
+				cl_ntoh64(f_ep1->n_id), f_ep1->port);
 			goto out;
 		}
 		/*
@@ -1664,8 +1663,8 @@ bool link_srcsink(struct torus *t, int i, int j, int k)
 			if (pg->port_cnt == t->portgrp_sz) {
 				OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 					"Error: exceeded port group max port "
-					"count (%d): switch GUID 0x%04llx\n",
-					t->portgrp_sz, ntohllu(tsw->n_id));
+					"count (%d): switch GUID 0x%04"PRIx64"\n",
+					t->portgrp_sz, cl_ntoh64(tsw->n_id));
 				goto out;
 			}
 			fsw->port[p]->sw = tsw;
@@ -1699,8 +1698,8 @@ bool link_srcsink(struct torus *t, int i, int j, int k)
 			if (pg->port_cnt == t->portgrp_sz) {
 				OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 					"Error: exceeded port group max port "
-					"count (%d): switch GUID 0x%04llx\n",
-					t->portgrp_sz, ntohllu(tsw->n_id));
+					"count (%d): switch GUID 0x%04"PRIx64"\n",
+					t->portgrp_sz, cl_ntoh64(tsw->n_id));
 				goto out;
 			}
 			/*
@@ -1711,8 +1710,8 @@ bool link_srcsink(struct torus *t, int i, int j, int k)
 			if (!f_ep1->osm_port) {
 				OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 					"Error: NULL osm_port->priv port "
-					"GUID 0x%04llx\n",
-					ntohllu(f_ep1->n_id));
+					"GUID 0x%04"PRIx64"\n",
+					cl_ntoh64(f_ep1->n_id));
 				goto out;
 			}
 			tl = alloc_tlink(t);
@@ -7261,13 +7260,13 @@ void build_torus(struct fabric *f, struct torus *t)
 	if (!t->seed_idx)
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Using torus seed configured as default "
-			"(seed sw %d,%d,%d GUID 0x%04llx).\n",
-			i, j, k, ntohllu(sw[i][j][k]->n_id));
+			"(seed sw %d,%d,%d GUID 0x%04"PRIx64").\n",
+			i, j, k, cl_ntoh64(sw[i][j][k]->n_id));
 	else
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Using torus seed configured as backup #%u "
-			"(seed sw %d,%d,%d GUID 0x%04llx).\n",
-			t->seed_idx, i, j, k, ntohllu(sw[i][j][k]->n_id));
+			"(seed sw %d,%d,%d GUID 0x%04"PRIx64").\n",
+			t->seed_idx, i, j, k, cl_ntoh64(sw[i][j][k]->n_id));
 
 	/*
 	 * Search the fabric and construct the expected torus topology.
@@ -7315,15 +7314,15 @@ unsigned tsw_changes(struct t_switch *nsw, struct t_switch *osw)
 	if (nsw && !osw) {
 		cnt++;
 		OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
-			"New torus switch %d,%d,%d GUID 0x%04llx\n",
-			nsw->i, nsw->j, nsw->k, ntohllu(nsw->n_id));
+			"New torus switch %d,%d,%d GUID 0x%04"PRIx64"\n",
+			nsw->i, nsw->j, nsw->k, cl_ntoh64(nsw->n_id));
 		goto out;
 	}
 	if (osw && !nsw) {
 		cnt++;
 		OSM_LOG(&osw->torus->osm->log, OSM_LOG_INFO,
-			"Lost torus switch %d,%d,%d GUID 0x%04llx\n",
-			osw->i, osw->j, osw->k, ntohllu(osw->n_id));
+			"Lost torus switch %d,%d,%d GUID 0x%04"PRIx64"\n",
+			osw->i, osw->j, osw->k, cl_ntoh64(osw->n_id));
 		goto out;
 	}
 	if (!(nsw && osw))
@@ -7333,17 +7332,17 @@ unsigned tsw_changes(struct t_switch *nsw, struct t_switch *osw)
 		cnt++;
 		OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
 			"Torus switch %d,%d,%d GUID "
-			"was 0x%04llx, now 0x%04llx\n",
+			"was 0x%04"PRIx64", now 0x%04"PRIx64"\n",
 			nsw->i, nsw->j, nsw->k,
-			ntohllu(osw->n_id), ntohllu(nsw->n_id));
+			cl_ntoh64(osw->n_id), cl_ntoh64(nsw->n_id));
 	}
 
 	if (nsw->port_cnt != osw->port_cnt) {
 		cnt++;
 		OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
-			"Torus switch %d,%d,%d GUID 0x%04llx "
+			"Torus switch %d,%d,%d GUID 0x%04"PRIx64" "
 			"had %d ports, now has %d\n",
-			nsw->i, nsw->j, nsw->k, ntohllu(nsw->n_id),
+			nsw->i, nsw->j, nsw->k, cl_ntoh64(nsw->n_id),
 			osw->port_cnt, nsw->port_cnt);
 	}
 	port_cnt = nsw->port_cnt;
@@ -7373,23 +7372,23 @@ unsigned tsw_changes(struct t_switch *nsw, struct t_switch *osw)
 		if (rnpt && !ropt) {
 			++cnt;
 			OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
-				"Torus switch %d,%d,%d GUID 0x%04llx[%d] "
-				"remote now %s GUID 0x%04llx[%d], "
+				"Torus switch %d,%d,%d GUID 0x%04"PRIx64"[%d] "
+				"remote now %s GUID 0x%04"PRIx64"[%d], "
 				"was missing\n",
-				nsw->i, nsw->j, nsw->k, ntohllu(nsw->n_id), p,
-				rnpt->type == PASSTHRU ? "sw" : "node",
-				ntohllu(rnpt->n_id), rnpt->port);
+				nsw->i, nsw->j, nsw->k, cl_ntoh64(nsw->n_id),
+				p, rnpt->type == PASSTHRU ? "sw" : "node",
+				cl_ntoh64(rnpt->n_id), rnpt->port);
 			continue;
 		}
 		if (ropt && !rnpt) {
 			++cnt;
 			OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
-				"Torus switch %d,%d,%d GUID 0x%04llx[%d] "
+				"Torus switch %d,%d,%d GUID 0x%04"PRIx64"[%d] "
 				"remote now missing, "
-				"was %s GUID 0x%04llx[%d]\n",
-				osw->i, osw->j, osw->k, ntohllu(nsw->n_id), p,
-				ropt->type == PASSTHRU ? "sw" : "node",
-				ntohllu(ropt->n_id), ropt->port);
+				"was %s GUID 0x%04"PRIx64"[%d]\n",
+				osw->i, osw->j, osw->k, cl_ntoh64(nsw->n_id),
+				p, ropt->type == PASSTHRU ? "sw" : "node",
+				cl_ntoh64(ropt->n_id), ropt->port);
 			continue;
 		}
 		if (!(rnpt && ropt))
@@ -7398,14 +7397,14 @@ unsigned tsw_changes(struct t_switch *nsw, struct t_switch *osw)
 		if (rnpt->n_id != ropt->n_id) {
 			++cnt;
 			OSM_LOG(&nsw->torus->osm->log, OSM_LOG_INFO,
-				"Torus switch %d,%d,%d GUID 0x%04llx[%d] "
-				"remote now %s GUID 0x%04llx[%d], "
-				"was %s GUID 0x%04llx[%d]\n",
-				nsw->i, nsw->j, nsw->k, ntohllu(nsw->n_id), p,
-				rnpt->type == PASSTHRU ? "sw" : "node",
-				ntohllu(rnpt->n_id), rnpt->port,
+				"Torus switch %d,%d,%d GUID 0x%04"PRIx64"[%d] "
+				"remote now %s GUID 0x%04"PRIx64"[%d], "
+				"was %s GUID 0x%04"PRIx64"[%d]\n",
+				nsw->i, nsw->j, nsw->k, cl_ntoh64(nsw->n_id),
+				p, rnpt->type == PASSTHRU ? "sw" : "node",
+				cl_ntoh64(rnpt->n_id), rnpt->port,
 				ropt->type == PASSTHRU ? "sw" : "node",
-				ntohllu(ropt->n_id), ropt->port);
+				cl_ntoh64(ropt->n_id), ropt->port);
 			continue;
 		}
 	}
@@ -7474,7 +7473,7 @@ static
 void rpt_torus_missing(struct torus *t, int i, int j, int k,
 		       struct t_switch *sw, int *missing_z)
 {
-	unsigned long long guid_ho;
+	uint64_t guid_ho;
 
 	if (!sw) {
 		/*
@@ -7498,43 +7497,43 @@ void rpt_torus_missing(struct torus *t, int i, int j, int k,
 			"Missing torus switch at %d,%d,%d\n", i, j, k);
 		return;
 	}
-	guid_ho = ntohllu(sw->n_id);
+	guid_ho = cl_ntoh64(sw->n_id);
 
 	if (!(sw->ptgrp[0].port_cnt || (t->x_sz == 1) ||
 	      ((t->flags & X_MESH) && i == 0)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus -x link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 	if (!(sw->ptgrp[1].port_cnt || (t->x_sz == 1) ||
 	      ((t->flags & X_MESH) && (i + 1) == t->x_sz)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus +x link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 	if (!(sw->ptgrp[2].port_cnt || (t->y_sz == 1) ||
 	      ((t->flags & Y_MESH) && j == 0)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus -y link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 	if (!(sw->ptgrp[3].port_cnt || (t->y_sz == 1) ||
 	      ((t->flags & Y_MESH) && (j + 1) == t->y_sz)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus +y link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 	if (!(sw->ptgrp[4].port_cnt || (t->z_sz == 1) ||
 	      ((t->flags & Z_MESH) && k == 0)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus -z link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 	if (!(sw->ptgrp[5].port_cnt || (t->z_sz == 1) ||
 	      ((t->flags & Z_MESH) && (k + 1) == t->z_sz)))
 		OSM_LOG(&t->osm->log, OSM_LOG_INFO,
 			"Missing torus +z link on "
-			"switch %d,%d,%d GUID 0x%04llx\n",
+			"switch %d,%d,%d GUID 0x%04"PRIx64"\n",
 			i, j, k, guid_ho);
 }
 
@@ -7932,9 +7931,9 @@ void torus_update_osm_sl2vl(void *context, osm_physp_t *osm_phys_port,
 
 			guid = osm_node_get_node_guid(node);
 			OSM_LOG(log, OSM_LOG_INFO,
-				"Error: osm_switch (GUID 0x%04llx) "
+				"Error: osm_switch (GUID 0x%04"PRIx64") "
 				"not in our fabric description\n",
-				ntohllu(guid));
+				cl_ntoh64(guid));
 		return;
 		}
 	}
@@ -8192,9 +8191,10 @@ void warn_on_routing(const char *msg,
 		     struct t_switch *sw, struct t_switch *dsw)
 {
 	OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
-		"%s from sw 0x%04llx (%d,%d,%d) to sw 0x%04llx (%d,%d,%d)\n",
-		msg, ntohllu(sw->n_id), sw->i, sw->j, sw->k,
-		ntohllu(dsw->n_id), dsw->i, dsw->j, dsw->k);
+		"%s from sw 0x%04"PRIx64" (%d,%d,%d) "
+		"to sw 0x%04"PRIx64" (%d,%d,%d)\n",
+		msg, cl_ntoh64(sw->n_id), sw->i, sw->j, sw->k,
+		cl_ntoh64(dsw->n_id), dsw->i, dsw->j, dsw->k);
 }
 
 static
@@ -8351,9 +8351,9 @@ no_route:
 	 * We can't get there from here.
 	 */
 	OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
-		"Error: routing on sw 0x%04llx: sending "
-		"traffic for dest sw 0x%04llx to port %u\n",
-		ntohllu(sw->n_id), ntohllu(dsw->n_id), OSM_NO_PATH);
+		"Error: routing on sw 0x%04"PRIx64": sending "
+		"traffic for dest sw 0x%04"PRIx64" to port %u\n",
+		cl_ntoh64(sw->n_id), cl_ntoh64(dsw->n_id), OSM_NO_PATH);
 	return -1;
 }
 
@@ -8367,8 +8367,8 @@ bool get_lid(struct port_grp *pg, unsigned p,
 	if (p >= pg->port_cnt) {
 		OSM_LOG(&pg->sw->torus->osm->log, OSM_LOG_ERROR,
 			"Error: Port group index %u too large: sw "
-			"0x%04llx pt_grp %u pt_grp_cnt %u\n",
-			p, ntohllu(pg->sw->n_id),
+			"0x%04"PRIx64" pt_grp %u pt_grp_cnt %u\n",
+			p, cl_ntoh64(pg->sw->n_id),
 			(unsigned)pg->port_grp, (unsigned)pg->port_cnt);
 		return false;
 	}
@@ -8388,16 +8388,16 @@ bool get_lid(struct port_grp *pg, unsigned p,
 			*ca = true;
 	} else {
 		OSM_LOG(&pg->sw->torus->osm->log, OSM_LOG_ERROR,
-			"Error: Switch 0x%04llx port %d improperly connected\n",
-			ntohllu(pg->sw->n_id), pg->port[p]->port);
+			"Error: Switch 0x%04"PRIx64" port %d improperly connected\n",
+			cl_ntoh64(pg->sw->n_id), pg->port[p]->port);
 		return false;
 	}
 	osm_port = ep->osm_port;
 	if (!(osm_port && osm_port->priv == ep)) {
 		OSM_LOG(&pg->sw->torus->osm->log, OSM_LOG_ERROR,
 			"Error: ep->osm_port->priv != ep "
-			"for sw 0x%04llu port %d\n",
-			ntohllu(((struct t_switch *)(ep->sw))->n_id), ep->port);
+			"for sw 0x%04"PRIx64" port %d\n",
+			cl_ntoh64(((struct t_switch *)(ep->sw))->n_id), ep->port);
 		return false;
 	}
 	*dlid_base = cl_ntoh16(osm_physp_get_base_lid(osm_port->p_physp));
@@ -8422,7 +8422,7 @@ bool torus_lft(struct torus *t, struct t_switch *sw)
 	if (!(sw->osm_switch && sw->osm_switch->priv == sw)) {
 		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
 			"Error: sw->osm_switch->priv != sw "
-			"for sw 0x%04llu\n", ntohllu(sw->n_id));
+			"for sw 0x%04"PRIx64"\n", cl_ntoh64(sw->n_id));
 		return false;
 	}
 	osm_sw = sw->osm_switch;
@@ -8476,16 +8476,16 @@ osm_mtree_node_t *mcast_stree_branch(struct t_switch *sw, osm_switch_t *osm_sw,
 
 	if (osm_sw->priv != sw) {
 		OSM_LOG(&sw->torus->osm->log, OSM_LOG_INFO,
-			"Error: osm_sw (GUID 0x%04llx) "
+			"Error: osm_sw (GUID 0x%04"PRIx64") "
 			"not in our fabric description\n",
-			ntohllu(osm_node_get_node_guid(osm_sw->p_node)));
+			cl_ntoh64(osm_node_get_node_guid(osm_sw->p_node)));
 		goto out;
 	}
 	if (!osm_switch_supports_mcast(osm_sw)) {
 		OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
-			"Error: osm_sw (GUID 0x%04llx) "
+			"Error: osm_sw (GUID 0x%04"PRIx64") "
 			"does not support multicast\n",
-			ntohllu(osm_node_get_node_guid(osm_sw->p_node)));
+			cl_ntoh64(osm_node_get_node_guid(osm_sw->p_node)));
 		goto out;
 	}
 	mtn = osm_mtree_node_new(osm_sw);
@@ -8525,7 +8525,7 @@ osm_mtree_node_t *mcast_stree_branch(struct t_switch *sw, osm_switch_t *osm_sw,
 		      ds_sw->osm_switch == ds_node->sw)) {
 			OSM_LOG(&sw->torus->osm->log, OSM_LOG_ERROR,
 				"Error: stale pointer to osm_sw "
-				"(GUID 0x%04llx)\n", ntohllu(ds_sw->n_id));
+				"(GUID 0x%04"PRIx64")\n", cl_ntoh64(ds_sw->n_id));
 			continue;
 		}
 		mtn->child_array[p] =
@@ -8646,9 +8646,9 @@ ib_api_status_t torus_mcast_stree(void *context, osm_mgrp_box_t *mgb)
 				guid_t id;
 				id = osm_node_get_node_guid(osm_port->p_node);
 				OSM_LOG(&ctx->osm->log, OSM_LOG_ERROR,
-					"Error: osm_port (GUID 0x%04llx) "
+					"Error: osm_port (GUID 0x%04"PRIx64") "
 					"not in our fabric description\n",
-					ntohllu(id));
+					cl_ntoh64(id));
 				continue;
 			}
 		}
@@ -8678,8 +8678,8 @@ ib_api_status_t torus_mcast_stree(void *context, osm_mgrp_box_t *mgb)
 					     t->master_stree_root->n_id);
 	if (!(osm_sw && t->master_stree_root->osm_switch == osm_sw)) {
 		OSM_LOG(&ctx->osm->log, OSM_LOG_ERROR,
-			"Error: stale pointer to osm_sw (GUID 0x%04llx)\n",
-			ntohllu(t->master_stree_root->n_id));
+			"Error: stale pointer to osm_sw (GUID 0x%04"PRIx64")\n",
+			cl_ntoh64(t->master_stree_root->n_id));
 		return IB_ERROR;
 	}
 	mgb->root = mcast_stree_branch(t->master_stree_root, osm_sw,
@@ -8936,9 +8936,9 @@ bool torus_master_stree(struct torus *t)
 
 				success = false;
 				OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
-					"Error: sw 0x%04llx (%d,%d,%d) not in "
+					"Error: sw 0x%04"PRIx64" (%d,%d,%d) not in "
 					"torus multicast master spanning tree\n",
-					ntohllu(sw->n_id), i, j, k);
+					cl_ntoh64(sw->n_id), i, j, k);
 			}
 out:
 	return success;
@@ -8975,9 +8975,9 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
 		if (!sport) {
 			guid = osm_node_get_node_guid(osm_sport->p_node);
 			OSM_LOG(log, OSM_LOG_INFO,
-				"Error: osm_sport (GUID 0x%04llx) "
+				"Error: osm_sport (GUID 0x%04"PRIx64") "
 				"not in our fabric description\n",
-				ntohllu(guid));
+				cl_ntoh64(guid));
 			goto out;
 		}
 	}
@@ -8987,9 +8987,9 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
 		if (!dport) {
 			guid = osm_node_get_node_guid(osm_dport->p_node);
 			OSM_LOG(log, OSM_LOG_INFO,
-				"Error: osm_dport (GUID 0x%04llx) "
+				"Error: osm_dport (GUID 0x%04"PRIx64") "
 				"not in our fabric description\n",
-				ntohllu(guid));
+				cl_ntoh64(guid));
 			goto out;
 		}
 	}
@@ -9000,15 +9000,15 @@ uint8_t torus_path_sl(void *context, uint8_t path_sl_hint,
 	if (sport->type != SRCSINK) {
 		guid = osm_node_get_node_guid(osm_sport->p_node);
 		OSM_LOG(log, OSM_LOG_INFO,
-			"Error: osm_sport (GUID 0x%04llx) "
-			"not a data src/sink port\n", ntohllu(guid));
+			"Error: osm_sport (GUID 0x%04"PRIx64") "
+			"not a data src/sink port\n", cl_ntoh64(guid));
 		goto out;
 	}
 	if (dport->type != SRCSINK) {
 		guid = osm_node_get_node_guid(osm_dport->p_node);
 		OSM_LOG(log, OSM_LOG_INFO,
-			"Error: osm_dport (GUID 0x%04llx) "
-			"not a data src/sink port\n", ntohllu(guid));
+			"Error: osm_dport (GUID 0x%04"PRIx64") "
+			"not a data src/sink port\n", cl_ntoh64(guid));
 		goto out;
 	}
 	/*
-- 
1.6.2.2
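
Note on the conversion above: a minimal sketch, assuming only standard
C99 headers, of why PRIx64 is the portable choice.  "%llx" expects an
unsigned long long argument, whereas a GUID is a fixed-width 64-bit
value; PRIx64 from <inttypes.h> expands to the correct conversion for
uint64_t on the build platform.  format_guid() is a hypothetical helper
for illustration and is not part of OpenSM; the patched code also
converts from network order with cl_ntoh64() first, which this sketch
leaves out.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not an OpenSM function): format a host-order
 * 64-bit GUID with the same "0x%04" PRIx64 conversion the patched
 * log messages use.  Returns the snprintf() result.
 */
static int format_guid(char *buf, size_t len, uint64_t guid_ho)
{
	/* PRIx64 is a string literal, so it concatenates with "0x%04". */
	return snprintf(buf, len, "0x%04" PRIx64, guid_ho);
}
```

With this helper, a value of 0x2a is rendered as "0x002a", since the
field width of 4 only pads short values and never truncates long ones.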


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 08/13] opensm/osm_torus.c: Ignore multiple configurations of torus size.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (6 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 07/13] opensm/osm_torus.c: Use PRIx64 for GUID printing Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 09/13] opensm/osm_subnet.c: Add torus-2QoS config file option to those configurable via opensm config file Jim Schutt
                     ` (6 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 8e0435b..add3cf9 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -782,6 +782,12 @@ bool parse_torus(struct torus *t, const char *parse_sep)
 	char *ptr;
 	bool success = false;
 
+	/*
+	 * There can be only one.  Ignore the imposters.
+	 */
+	if (t->sw_pool)
+		goto out;
+
 	if (!parse_size(&t->x_sz, &t->flags, X_MESH, parse_sep))
 		goto out;
 
-- 
1.6.2.2
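
Note on the guard above: a minimal sketch of the "first configuration
wins" idea, assuming that a non-NULL switch pool means a torus size has
already been accepted.  struct torus_sketch and set_torus_size() are
hypothetical stand-ins, not the real struct torus or parse_torus(), and
how the real caller treats the early exit is not shown in this patch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the torus state; only what the sketch needs. */
struct torus_sketch {
	int x_sz, y_sz, z_sz;
	void *sw_pool;	/* non-NULL once a torus size has been accepted */
};

/*
 * Accept a torus size only the first time it is configured.  Later
 * calls leave the stored dimensions untouched.  Returns true only
 * when the size was applied.  Assumes x, y and z are positive.
 */
static bool set_torus_size(struct torus_sketch *t, int x, int y, int z)
{
	if (t->sw_pool)
		return false;	/* already configured: ignore this one */

	t->sw_pool = calloc((size_t)x * (size_t)y * (size_t)z,
			    sizeof(void *));
	if (!t->sw_pool)
		return false;

	t->x_sz = x;
	t->y_sz = y;
	t->z_sz = z;
	return true;
}
```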



* [PATCH 09/13] opensm/osm_subnet.c: Add torus-2QoS config file option to those configurable via opensm config file.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (7 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 08/13] opensm/osm_torus.c: Ignore multiple configurations of torus size Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 10/13] opensm/main.c: Add description of "no_fallback" to "--routing_engine" option documentation Jim Schutt
                     ` (5 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_subnet.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_subnet.c b/opensm/opensm/osm_subnet.c
index be406ac..f2ca36f 100644
--- a/opensm/opensm/osm_subnet.c
+++ b/opensm/opensm/osm_subnet.c
@@ -352,6 +352,7 @@ static const opt_rec_t opt_tbl[] = {
 	{ "guid_routing_order_file", OPT_OFFSET(guid_routing_order_file), opts_parse_charp, NULL, 0 },
 	{ "sa_db_file", OPT_OFFSET(sa_db_file), opts_parse_charp, NULL, 0 },
 	{ "sa_db_dump", OPT_OFFSET(sa_db_dump), opts_parse_boolean, NULL, 1 },
+	{ "torus_config", OPT_OFFSET(torus_conf_file), opts_parse_charp, NULL, 1 },
 	{ "do_mesh_analysis", OPT_OFFSET(do_mesh_analysis), opts_parse_boolean, NULL, 1 },
 	{ "exit_on_fatal", OPT_OFFSET(exit_on_fatal), opts_parse_boolean, NULL, 1 },
 	{ "honor_guid2lid_file", OPT_OFFSET(honor_guid2lid_file), opts_parse_boolean, NULL, 1 },
@@ -1447,6 +1448,10 @@ int osm_subn_output_conf(FILE *out, IN osm_subn_opt_t * p_opts)
 		p_opts->sa_db_dump ? "TRUE" : "FALSE");
 
 	fprintf(out,
+		"# Torus-2QoS configuration file name\ntorus_config %s\n\n",
+		p_opts->torus_conf_file ? p_opts->torus_conf_file : null_str);
+
+	fprintf(out,
 		"#\n# HANDOVER - MULTIPLE SMs OPTIONS\n#\n"
 		"# SM priority used for deciding who is the master\n"
 		"# Range goes from 0 (lowest priority) to 15 (highest).\n"
-- 
1.6.2.2
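
Note on the option table above: a minimal sketch of a table-driven
option parser, assuming each entry pairs a config-file keyword with the
offset of the field it sets and a parser callback.  All names ending in
_sketch, plus parse_charp() and set_option(), are hypothetical; the real
opt_rec_t has more members than are modelled here.  The sketch keeps
the detail from the patch that the keyword ("torus_config") differs
from the field name (torus_conf_file).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical options struct with just two string-valued fields. */
struct opts_sketch {
	char *sa_db_file;
	char *torus_conf_file;
};

typedef void (*parse_fn)(void *field, const char *value);

/* Hypothetical table entry: keyword, field offset, parser. */
struct opt_rec_sketch {
	const char *name;
	size_t offset;
	parse_fn parse;
};

/* Replace the string held in *field with a heap copy of value. */
static void parse_charp(void *field, const char *value)
{
	char **p = field;
	size_t n = strlen(value) + 1;
	char *copy = malloc(n);

	if (!copy)
		return;
	memcpy(copy, value, n);
	free(*p);
	*p = copy;
}

static const struct opt_rec_sketch opt_tbl_sketch[] = {
	{ "sa_db_file", offsetof(struct opts_sketch, sa_db_file), parse_charp },
	{ "torus_config", offsetof(struct opts_sketch, torus_conf_file), parse_charp },
	{ NULL, 0, NULL }
};

/* Look up key in the table and apply its parser; 1 if found, else 0. */
static int set_option(struct opts_sketch *o, const char *key, const char *val)
{
	const struct opt_rec_sketch *r;

	for (r = opt_tbl_sketch; r->name; r++) {
		if (!strcmp(r->name, key)) {
			r->parse((char *)o + r->offset, val);
			return 1;
		}
	}
	return 0;
}
```

The point of the layout is that adding a new keyword is a one-line
table change, which is what this patch does for torus_config.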



* [PATCH 10/13] opensm/main.c: Add description of "no_fallback" to "--routing_engine" option documentation.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (8 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 09/13] opensm/osm_subnet.c: Add torus-2QoS config file option to those configurable via opensm config file Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 11/13] opensm/man/opensm.8.in: Add references to torus-2QoS Jim Schutt
                     ` (4 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/main.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/main.c b/opensm/opensm/main.c
index e74dc46..756fe6f 100644
--- a/opensm/opensm/main.c
+++ b/opensm/opensm/main.c
@@ -174,6 +174,9 @@ static void show_usage(void)
 	       "          Min Hop algorithm.  Multiple routing engines can be specified\n"
 	       "          separated by commas so that specific ordering of routing\n"
 	       "          algorithms will be tried if earlier routing engines fail.\n"
+	       "          If all configured routing engines fail, OpenSM will always\n"
+	       "          attempt to route with Min Hop unless 'no_fallback' is\n"
+	       "          included in the list of routing engines.\n"
 	       "          Supported engines: updn, file, ftree, lash, dor, torus-2QoS\n\n");
 	printf("--do_mesh_analysis\n"
 	       "          This option enables additional analysis for the lash\n"
-- 
1.6.2.2
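
Note on the usage text above: a minimal sketch of the documented
behaviour, assuming engines are tried in list order and that
"no_fallback" acts as a flag anywhere in the list.  The types and
functions here are hypothetical and do not mirror the real
osm_ucast_mgr code; engine_fails() and engine_succeeds() are stubs
that stand in for routing engines.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef bool (*route_fn)(void);

/* Hypothetical list entry: an engine name and its routing callback. */
struct engine_sketch {
	const char *name;
	route_fn route;
};

/* Stub engines for illustration only. */
static bool engine_fails(void) { return false; }
static bool engine_succeeds(void) { return true; }

/*
 * Try each configured engine in order.  If none succeeds, fall back
 * to minhop unless "no_fallback" was listed.  Returns the name of
 * the engine that routed the fabric, or NULL if nothing did.
 */
static const char *route_with_fallback(const struct engine_sketch *engines,
				       size_t n, route_fn minhop)
{
	bool no_fallback = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strcmp(engines[i].name, "no_fallback")) {
			no_fallback = true;
			continue;
		}
		if (engines[i].route && engines[i].route())
			return engines[i].name;
	}
	if (!no_fallback && minhop())
		return "minhop";
	return NULL;
}
```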



* [PATCH 11/13] opensm/man/opensm.8.in: Add references to torus-2QoS.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (9 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 10/13] opensm/main.c: Add description of "no_fallback" to "--routing_engine" option documentation Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 12/13] opensm: Add torus-2QoS man pages Jim Schutt
                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/man/opensm.8.in |   29 ++++++++++++++++++++++++++---
 1 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index 47dff99..c021026 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -1,4 +1,4 @@
-.TH OPENSM 8 "October 22, 2009" "OpenIB" "OpenIB Management"
+.TH OPENSM 8 "November 3, 2010" "OpenIB" "OpenIB Management"
 
 .SH NAME
 opensm \- InfiniBand subnet manager and administration (SM/SA)
@@ -51,6 +51,7 @@ opensm \- InfiniBand subnet manager and administration (SM/SA)
 [\-\-prefix_routes_file <path>]
 [\-\-consolidate_ipv6_snm_req]
 [\-\-log_prefix <prefix text>]
+[\-\-torus_config <path to file>]
 [\-v(erbose)] [\-V] [\-D <flags>] [\-d(ebug) <number>]
 [\-h(elp)] [\-?]
 
@@ -148,8 +149,10 @@ LID assignments resolving multiple use of same LID.
 This option chooses routing engine(s) to use instead of Min Hop
 algorithm (default).  Multiple routing engines can be specified
 separated by commas so that specific ordering of routing algorithms
-will be tried if earlier routing engines fail.
-Supported engines: minhop, updn, file, ftree, lash, dor
+will be tried if earlier routing engines fail.  If all configured
+routing engines fail, OpenSM will always attempt to route with Min Hop
+unless 'no_fallback' is included in the list of routing engines.
+Supported engines: minhop, updn, file, ftree, lash, dor, torus-2QoS.
 .TP
 \fB\-\-do_mesh_analysis\fR
 This option enables additional analysis for the lash routing engine to
@@ -364,6 +367,11 @@ when two or more instances of OpenSM run in a single node to manage multiple
 fabrics. For example, in a dual-fabric (or dual-rail) IB cluster, the prefix
 for the first fabric could be "mpi" and the other fabric could be "storage".
 .TP
+\fB\-\-torus_config\fR <path to torus\-2QoS config file>
+This option defines the file name for the extra configuration
+information needed for the torus-2QoS routing engine.   The default
+name is \fB\%@OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@\fP
+.TP
 \fB\-v\fR, \fB\-\-verbose\fR
 This option increases the log verbosity level.
 The -v option may be specified multiple times
@@ -1004,6 +1012,14 @@ along the mesh dimension, or the -O option used as an override.
 
 Use '-R dor' option to activate the DOR algorithm.
 
+Torus-2QoS Routing Algorithm
+
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics;
+see torus-2QoS(8) for full documentation.
+
+Use '-R torus-2QoS -Q' or '-R torus-2QoS,no_fallback -Q'
+to activate the torus-2QoS algorithm.
+
 
 Routing References
 
@@ -1113,6 +1129,10 @@ default QOS policy config file
 .B @OPENSM_CONFIG_DIR@/@PREFIX_ROUTES_FILE@
 default prefix routes file.
 
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+default torus-2QoS config file.
+
 .SH AUTHORS
 .TP
 Hal Rosenstock
@@ -1135,3 +1155,6 @@ Ira Weiny
 .TP
 Dale Purdy
 .RI < purdy-sJ/iWh9BUns@public.gmane.org >
+
+.SH SEE ALSO
+torus-2QoS(8), torus-2QoS.conf(5).
-- 
1.6.2.2



* [PATCH 12/13] opensm: Add torus-2QoS man pages.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (10 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 11/13] opensm/man/opensm.8.in: Add references to torus-2QoS Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-12 22:11   ` [PATCH 13/13] opensm/doc/current-routing.txt: Sync torus-2QoS information with new " Jim Schutt
                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/Makefile.am              |    2 +-
 opensm/configure.in             |    6 +-
 opensm/man/torus-2QoS.8.in      |  476 +++++++++++++++++++++++++++++++++++++++
 opensm/man/torus-2QoS.conf.5.in |  184 +++++++++++++++
 4 files changed, 666 insertions(+), 2 deletions(-)
 create mode 100644 opensm/man/torus-2QoS.8.in
 create mode 100644 opensm/man/torus-2QoS.conf.5.in
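
Note for reviewers: the two pseudocode fragments in the UNICAST ROUTING
section of torus-2QoS(8) can be exercised as plain C.  This is a
minimal sketch that only restates those two formulas; path_sl() and
sl2vl_dateline_bit() are hypothetical names, the crosses[] array stands
in for path_crosses_dateline(d), and the cdir argument stands in for
cdir(oport).  The QoS-level bit is not modelled.

```c
#include <stdint.h>

/*
 * Encode into an SL which datelines a path crosses.  crosses[d] is
 * 1 if the path crosses the dateline in dimension d, else 0.
 */
static uint8_t path_sl(const int crosses[], int torus_dimensions)
{
	uint8_t sl = 0;
	int d;

	for (d = 0; d < torus_dimensions; d++)
		sl |= (uint8_t)((crosses[d] & 1) << d);
	return sl;
}

/*
 * Pick the single VL bit for an output port that "points" in torus
 * coordinate direction cdir (0, 1 or 2): it is the SL bit recording
 * whether the path crosses the dateline in that direction.
 */
static uint8_t sl2vl_dateline_bit(uint8_t sl, int cdir)
{
	return 0x1 & (sl >> cdir);
}
```

For a path that crosses the x and z datelines but not the y dateline,
the SL is 5, and the VL bit is 1 on x- and z-pointing ports and 0 on
y-pointing ports.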

diff --git a/opensm/Makefile.am b/opensm/Makefile.am
index 88ff9da..58a682b 100644
--- a/opensm/Makefile.am
+++ b/opensm/Makefile.am
@@ -12,7 +12,7 @@ install-exec-hook:
 	chmod 755 $(DESTDIR)/$(sysconfdir)/init.d/opensmd
 
 
-man_MANS = man/opensm.8 man/osmtest.8
+man_MANS = man/opensm.8 man/osmtest.8 man/torus-2QoS.8 man/torus-2QoS.conf.5
 
 various_scripts = $(wildcard scripts/*)
 docs = doc/performance-manager-HOWTO.txt doc/QoS_management_in_OpenSM.txt \
diff --git a/opensm/configure.in b/opensm/configure.in
index 8695965..aaad999 100644
--- a/opensm/configure.in
+++ b/opensm/configure.in
@@ -196,6 +196,10 @@ AC_DEFINE_UNQUOTED(HAVE_DEFAULT_QOS_POLICY_FILE,
 	[Define a QOS policy config file])
 AC_SUBST(QOS_POLICY_FILE)
 
+dnl For now, this does not need to be configurable
+TORUS2QOS_CONF_FILE=torus-2QoS.conf
+AC_SUBST(TORUS2QOS_CONF_FILE)
+
 dnl Check for a different prefix-routes file
 PREFIX_ROUTES_FILE=prefix-routes.conf
 AC_MSG_CHECKING(for --with-prefix-routes-conf)
@@ -226,7 +230,7 @@ dnl Checks for headers and libraries
 OPENIB_APP_OSMV_CHECK_HEADER
 OPENIB_APP_OSMV_CHECK_LIB
 
-AC_CONFIG_FILES([man/opensm.8 scripts/opensm.init scripts/redhat-opensm.init scripts/sldd.sh])
+AC_CONFIG_FILES([man/opensm.8 man/torus-2QoS.8 man/torus-2QoS.conf.5 scripts/opensm.init scripts/redhat-opensm.init scripts/sldd.sh])
 
 dnl Create the following Makefiles
 AC_OUTPUT([include/opensm/osm_version.h Makefile include/Makefile complib/Makefile libvendor/Makefile opensm/Makefile osmeventplugin/Makefile osmtest/Makefile opensm.spec])
diff --git a/opensm/man/torus-2QoS.8.in b/opensm/man/torus-2QoS.8.in
new file mode 100644
index 0000000..68e2bce
--- /dev/null
+++ b/opensm/man/torus-2QoS.8.in
@@ -0,0 +1,476 @@
+.TH TORUS\-2QOS 8 "November 10, 2010" "OpenIB" "OpenIB Management"
+.
+.SH NAME
+torus\-2QoS \- Routing engine for OpenSM subnet manager
+.
+.SH DESCRIPTION
+.
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
+The torus-2QoS routing engine can provide the following functionality on
+a 2D/3D torus:
+.br
+\" roff illiteracy leads to following brain-dead list implementation
+\"
+.na  \" otherwise line space adjustment can add spaces between dash and text
+.in +2m
+\[en]
+'in +2m
+Routing that is free of credit loops.
+.in
+\[en]
+'in +2m
+Two levels of Quality of Service (QoS), assuming switches and channel
+adapters support eight data VLs.
+.in
+\[en]
+'in +2m
+The ability to route around a single failed switch, and/or multiple failed
+links, without
+.in
+.in +2m
+\[en]
+'in +2
+introducing credit loops, or
+.in
+\[en]
+'in +2m
+changing path SL values.
+.in -4m
+\[en]
+'in +2m
+Very short run times, with good scaling properties as fabric size increases.
+.ad
+.
+.SH UNICAST ROUTING
+.
+Unicast routing in torus-2QoS is based on Dimension Order Routing (DOR).
+It avoids the deadlocks that would otherwise occur in a DOR-routed
+torus using the concept of a dateline for each torus dimension.
+It encodes into a path SL which datelines the path crosses, as follows:
+\f(CR
+.P
+.nf
+    sl = 0;
+    for (d = 0; d < torus_dimensions; d++) {
+        /* path_crosses_dateline(d) returns 0 or 1 */
+        sl |= path_crosses_dateline(d) << d;
+    }
+.fi
+\fR
+.P
+On a 3D torus this consumes three SL bits, leaving one SL bit unused.
+Torus-2QoS uses this SL bit to implement two QoS levels.
+.P
+Torus-2QoS also makes use of the output port
+dependence of switch SL2VL maps to encode into one VL bit the
+information encoded in three SL bits.
+It computes in which torus coordinate direction each inter-switch link
+"points", and writes SL2VL maps for such ports as follows:
+\f(CR
+.P
+.nf
+    for (sl = 0; sl < 16; sl++) {
+        /* cdir(port) computes which torus coordinate direction
+         * a switch port "points" in; returns 0, 1, or 2
+         */
+        sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
+    }
+.fi
+\fR
+.P
+Thus, on a pristine 3D torus,
+\fIi.e.\fR,
+in the absence of failed fabric switches,
+torus-2QoS consumes eight SL values (SL bits 0-2) and
+two VL values (VL bit 0) per QoS level to provide deadlock-free routing.
+.P
+Torus-2QoS routes around link failure by "taking the long way around" any
+1D ring interrupted by link failure.  For example, consider the 2D 6x5
+torus below, where switches are denoted by [+a-zA-Z]:
+.
+.
+\# define macros to start and end ascii art, assuming Roman font.
+\# the start macro takes an argument which is the width in ems of
+\# the ascii art, and is used to center it.
+\#
+.de ascii_art
+.nop \f(CR
+.nr indent_in_ems ((((\\n[.ll] - \\n[.i]) / \\w'm') - \\$1)/2)
+.in +\\n[indent_in_ems]m
+.nf
+..
+.de end_ascii_art
+.fi
+.in
+.nop \fR
+..
+\# end of macro definitions
+.
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  4  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  3  --+----+----+----D----+----+--
+       |    |    |    |    |    |
+  2  --+----+----I----r----+----+--
+       |    |    |    |    |    |
+  1  --m----S----n----T----o----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+For a pristine fabric the path from S to D would be S-n-T-r-D.
+In the event that either link S-n or n-T has failed, torus-2QoS would
+use the path S-m-p-o-T-r-D.
+Note that it can do this without changing the path SL
+value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
+segments using it cannot contribute to deadlock, and the x-direction
+dateline (between, say, x=5 and x=0) can be ignored for path segments on
+that ring.
+.P
+One result of this is that torus-2QoS can route around many simultaneous
+link failures, as long as no 1D ring is broken into disjoint segments.
+For example, if links n-T and T-o have both failed, that ring has been broken
+into two disjoint segments, T and o-p-m-S-n.
+Torus-2QoS checks for such
+issues, reports if they are found, and refuses to route such fabrics.
+.P
+Note that in the case where there are multiple parallel links between a
+pair of switches, torus-2QoS will allocate routes across such links
+in a round-robin fashion, based on ports at the path destination switch that
+are active and not used for inter-switch links.
+Should a link that is one of several such parallel links fail, routes
+are redistributed across the remaining links.
+When the last of such a set of parallel links fails, traffic is rerouted
+as described above.
+.P
+Handling a failed switch under DOR requires introducing into a path at
+least one turn that would be otherwise "illegal",
+\fIi.e.\fR,
+not allowed by DOR rules.
+Torus-2QoS will introduce such a turn as close as possible to the
+failed switch in order to route around it.
+.P
+In the above example, suppose switch T has failed, and consider the path
+from S to D.
+Torus-2QoS will produce the path S-n-I-r-D, rather than the
+S-n-T-r-D path for a pristine torus, by introducing an early turn at n.
+Normal DOR rules will cause traffic arriving at switch I to be forwarded
+to switch r; for traffic arriving from n due to the "early" turn at n,
+this will generate an "illegal" turn at I.
+.P
+Torus-2QoS will also use the input port dependence of SL2VL maps to set VL
+bit 1 (which would be otherwise unused) for y-x, z-x, and z-y turns,
+\fIi.e.\fR,
+those turns that are illegal under DOR.
+This causes the first hop after any such turn to use a separate set of
+VL values, and prevents deadlock in the presence of a single failed switch.
+.P
+For any given path, only the hops after a turn that is illegal under DOR
+can contribute to a credit loop that leads to deadlock.  So in the example
+above with failed switch T, the location of the illegal turn at I in the
+path from S to D requires that any credit loop caused by that turn must
+encircle the failed switch at T.  Thus the second and later hops after the
+illegal turn at I (\fIi.e.\fR, hop r-D) cannot contribute to a credit loop
+because they cannot be used to construct a loop encircling T.  The hop I-r
+uses a separate VL, so it cannot contribute to a credit loop encircling T.
+.P
+Extending this argument shows that in addition to being capable of routing
+around a single switch failure without introducing deadlock, torus-2QoS can
+also route around multiple failed switches on the condition they are
+adjacent in the last dimension routed by DOR.  For example, consider the
+following case on a 6x6 2D torus:
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  5  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  4  --+----+----+----D----+----+--
+       |    |    |    |    |    |
+  3  --+----+----I----u----+----+--
+       |    |    |    |    |    |
+  2  --+----+----q----R----+----+--
+       |    |    |    |    |    |
+  1  --m----S----n----T----o----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Suppose switches T and R have failed, and consider the path from S to D.
+Torus-2QoS will generate the path S-n-q-I-u-D, with an illegal turn at
+switch I, and with hop I-u using a VL with bit 1 set.
+.P
+As a further example, consider a case that torus-2QoS cannot route without
+deadlock: two failed switches adjacent in a dimension that is not the last
+dimension routed by DOR; here the failed switches are O and T:
+.
+.ascii_art 36
+       |    |    |    |    |    |
+  5  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  4  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+  3  --+----+----+----+----D----+--
+       |    |    |    |    |    |
+  2  --+----+----I----q----r----+--
+       |    |    |    |    |    |
+  1  --m----S----n----O----T----p--
+       |    |    |    |    |    |
+y=0  --+----+----+----+----+----+--
+       |    |    |    |    |    |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+In a pristine fabric, torus-2QoS would generate the path from S to D as
+S-n-O-T-r-D.  With failed switches O and T, torus-2QoS will generate the
+path S-n-I-q-r-D, with illegal turn at switch I, and with hop I-q using a
+VL with bit 1 set.  In contrast to the earlier examples, the second hop
+after the illegal turn, q-r, can be used to construct a credit loop
+encircling the failed switches.
+.
+.SH MULTICAST ROUTING
+.
+Since torus-2QoS uses all four available SL bits, and the three data VL
+bits that are typically available in current switches, there is no way
+to use SL/VL values to separate multicast traffic from unicast traffic.
+Thus, torus-2QoS must generate multicast routing such that credit loops
+cannot arise from a combination of multicast and unicast path segments.
+.P
+It turns out that it is possible to construct spanning trees for multicast
+routing that have that property.  For the 2D 6x5 torus example above, here
+is the full-fabric spanning tree that torus-2QoS will construct, where "x"
+is the root switch and each "+" is a non-root switch:
+.
+.ascii_art 36
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |    |    |    |
+  2    +----+----+----x----+----+
+       |    |    |    |    |    |
+  1    +    +    +    +    +    +
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+For multicast traffic routed from root to tip, every turn in the above
+spanning tree is a legal DOR turn.
+.P
+For traffic routed from tip to root, and some traffic routed through the
+root, turns are not legal DOR turns.  However, to construct a credit loop,
+the union of multicast routing on this spanning tree with DOR unicast
+routing can only provide 3 of the 4 turns needed for the loop.
+.P
+In addition, if none of the above spanning tree branches crosses a dateline
+used for unicast credit loop avoidance on a torus, and if multicast traffic
+is confined to SL 0 or SL 8 (recall that torus-2QoS uses SL bit 3 to
+differentiate QoS level), then multicast traffic also cannot contribute to
+the "ring" credit loops that are otherwise possible in a torus.
+.P
+Torus-2QoS uses these ideas to create a master spanning tree.  Every
+multicast group spanning tree will be constructed as a subset of the master
+tree, with the same root as the master tree.
+.P
+Such multicast group spanning trees will in general not be optimal for
+groups which are a subset of the full fabric. However, this compromise must
+be made to enable support for two QoS levels on a torus while preventing
+credit loops.
+.P
+In the presence of link or switch failures that result in a fabric for
+which torus-2QoS can generate credit-loop-free unicast routes, it is also
+possible to generate a master spanning tree for multicast that retains the
+required properties.  For example, consider that same 2D 6x5 torus, with
+the link from (2,2) to (3,2) failed.  Torus-2QoS will generate the following
+master spanning tree:
+.
+.ascii_art 36
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |    |    |    |
+  2  --+----+----+    x----+----+--
+       |    |    |    |    |    |
+  1    +    +    +    +    +    +
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Two things are notable about this master spanning tree.  First, assuming
+the x dateline was between x=5 and x=0, this spanning tree has a branch
+that crosses the dateline.  However, just as for unicast, crossing a
+dateline on a 1D ring (here, the ring for y=2) that is broken by a failure
+cannot contribute to a torus credit loop.
+.P
+Second, this spanning tree is no longer optimal even for multicast groups
+that encompass the entire fabric.  That, unfortunately, is a compromise that
+must be made to retain the other desirable properties of torus-2QoS routing.
+.P
+In the event that a single switch fails, torus-2QoS will generate a master
+spanning tree that has no "extra" turns by appropriately selecting a root
+switch.
+In the 2D 6x5 torus example, assume now that the switch at (3,2),
+\fIi.e.\fR, the root for a pristine fabric, fails.
+Torus-2QoS will generate the
+following master spanning tree for that case:
+.
+.ascii_art 36
+                      |
+  4    +    +    +    +    +    +
+       |    |    |    |    |    |
+  3    +    +    +    +    +    +
+       |    |    |         |    |
+  2    +    +    +         +    +
+       |    |    |         |    |
+  1    +----+----x----+----+----+
+       |    |    |    |    |    |
+y=0    +    +    +    +    +    +
+                      |
+
+     x=0    1    2    3    4    5
+.end_ascii_art
+.P
+Assuming the y dateline was between y=4 and y=0, this spanning tree has
+a branch that crosses a dateline.  However, again this cannot contribute
+to credit loops as it occurs on a 1D ring (the ring for x=3) that is
+broken by a failure, as in the above example.
+.
+.SH TORUS TOPOLOGY DISCOVERY
+.
+The algorithm used by torus-2QoS to construct the torus topology from
+the undirected graph representing the fabric requires that the radix of
+each dimension be configured via torus-2QoS.conf.
+It also requires that the torus topology be "seeded"; for a 3D torus this
+requires configuring four switches that define the three coordinate
+directions of the torus.
+.P
+Given this starting information, the algorithm is to examine the
+cube formed by the eight switch locations bounded by the corners
+(x,y,z) and (x+1,y+1,z+1).
+Based on switches already placed into the torus topology at some of these
+locations, the algorithm examines 4-loops of inter-switch links to find the
+one that is consistent with a face of the cube of switch locations,
+and adds its switches to the discovered topology in the correct locations.
+.P
+Because the algorithm is based on examining the topology of 4-loops of links,
+a torus with one or more radix-4 dimensions requires extra initial
+seed configuration.
+See torus-2QoS.conf(5) for details.
+Torus-2QoS will detect and report when it has insufficient configuration
+for a torus with radix-4 dimensions.
+.P
+In the event the torus is significantly degraded, \fIi.e.\fR, there are
+many missing switches or links, it may happen that torus-2QoS is unable
+to place into the torus some switches and/or links that were discovered
+in the fabric, and will generate a warning in that case.
+A similar condition occurs if torus-2QoS is misconfigured, \fIi.e.\fR,
+the radix of a torus dimension as configured does not match the radix
+of that torus dimension as wired, and many switches/links in the fabric
+will not be placed into the torus.
+.
+.SH QUALITY OF SERVICE CONFIGURATION
+.
+OpenSM will not program switches and channel adapters with
+SL2VL maps or VL arbitration configuration unless it is invoked with -Q.
+Since torus-2QoS depends on such functionality for correct operation,
+always invoke OpenSM with -Q when torus-2QoS is in the list of routing
+engines.
+.P
+Any quality of service configuration method supported by OpenSM will
+work with torus-2QoS, subject to the following limitations and
+considerations.
+.P
+For all routing engines supported by OpenSM except torus-2QoS,
+there is a one-to-one correspondence between QoS level and SL.
+Torus-2QoS can only support two quality of service levels, so only
+the high-order bit of any SL value used for unicast QoS configuration
+will be honored by torus-2QoS.
+.P
+For multicast QoS configuration, only SL values 0 and 8 should be used
+with torus-2QoS.
+.P
+Since SL to VL map configuration must be under the complete control of
+torus-2QoS, any configuration via qos_sl2vl, qos_swe_sl2vl,
+\fIetc.\fR, must and will be ignored, and a warning will be generated.
+.P
+Torus-2QoS uses VL values 0-3 to implement one of its supported QoS
+levels, and VL values 4-7 to implement the other.  Hard-to-diagnose
+application issues may arise if traffic is not delivered fairly
+across each of these two VL ranges.
+Torus-2QoS will detect and warn if VL arbitration is configured
+unfairly across VLs in the range 0-3, and also in the range 4-7.
+Note that the default OpenSM VL arbitration configuration
+does not meet this constraint, so all torus-2QoS users should
+configure VL arbitration via qos_vlarb_high, qos_vlarb_low, \fIetc.\fR
+.
+.SH OPERATIONAL CONSIDERATIONS
+.
+Any routing algorithm for a torus IB fabric must employ path
+SL values to avoid credit loops.
+As a result, all applications run over such fabrics must perform a
+path record query to obtain the correct path SL for connection setup.
+Applications that use \fBrdma_cm\fR for connection setup will automatically
+meet this requirement.
+.P
+If a change in fabric topology causes changes in path SL values required
+to route without credit loops, in general all applications would need
+to repath to avoid message deadlock.  Since torus-2QoS has the ability
+to reroute after a single switch failure without changing path SL values,
+repathing by running applications is not required when the fabric
+is routed with torus-2QoS.
+.P
+Torus-2QoS can provide unchanging path SL values in the presence of
+subnet manager failover provided that all OpenSM instances have the
+same idea of dateline location.  See torus-2QoS.conf(5) for details.
+.P
+Torus-2QoS will detect configurations of failed switches and links
+that prevent routing that is free of credit loops, and will
+log warnings and refuse to route.  If "no_fallback" was configured in the
+list of OpenSM routing engines, then no other routing engine
+will attempt to route the fabric.  In that case all paths that
+do not transit the failed components will continue to work, and
+the subset of paths that are still operational will continue to remain
+free of credit loops.
+OpenSM will continue to attempt to route the fabric after every sweep
+interval, and after any change (such as a link up) in the fabric topology.
+When the fabric components are repaired, full functionality will be
+restored.
+.P
+In the event OpenSM was configured to allow some other engine to
+route the fabric if torus-2QoS fails, then credit loops and message
+deadlock are likely if torus-2QoS had previously routed
+the fabric successfully.
+Even if the other engine is capable of routing a torus
+without credit loops, applications that built connections with
+path SL values granted under torus-2QoS will likely experience
+message deadlock under routing generated by a different engine,
+unless they repath.
+.P
+To verify that a torus fabric is routed free of credit loops,
+use \fBibdmchk\fR to analyze data collected via \fBibdiagnet -vlr\fR.
+.
+.SH FILES
+.TP
+.B @OPENSM_CONFIG_DIR@/@OPENSM_CONFIG_FILE@
+default OpenSM config file.
+.TP
+.B @OPENSM_CONFIG_DIR@/@QOS_POLICY_FILE@
+default QoS policy config file.
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+default torus-2QoS config file.
+.
+.SH SEE ALSO
+.
+opensm(8), torus-2QoS.conf(5), ibdiagnet(1), ibdmchk(1), rdma_cm(7).
diff --git a/opensm/man/torus-2QoS.conf.5.in b/opensm/man/torus-2QoS.conf.5.in
new file mode 100644
index 0000000..147a7b1
--- /dev/null
+++ b/opensm/man/torus-2QoS.conf.5.in
@@ -0,0 +1,184 @@
+.TH TORUS\-2QOS.CONF 5 "November 11, 2010" "OpenIB" "OpenIB Management"
+.
+.SH NAME
+torus\-2QoS.conf \- Torus-2QoS configuration for OpenSM subnet manager
+.
+.SH DESCRIPTION
+.
+The file
+.B torus-2QoS.conf
+contains configuration information that is specific to the OpenSM
+routing engine torus-2QoS.
+Blank lines and lines where the first non-whitespace character is
+"#" are ignored.
+A token is any contiguous group of non-whitespace characters.
+Any tokens on a line following the recognized configuration tokens described
+below are ignored.
+.
+.P
+\fR[\fBtorus\fR|\fBmesh\fR]
+\fIx_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+\fIy_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+\fIz_radix\fR[\fBm\fR|\fBM\fR|\fBt\fR|\fBT\fR]
+.RS
+Either \fBtorus\fR or \fBmesh\fR must be the first keyword in the
+configuration, and sets the topology
+that torus-2QoS will try to construct.
+A 2D topology can be configured by specifying one of
+\fIx_radix\fR, \fIy_radix\fR, or \fIz_radix\fR as 1.
+An individual dimension can be configured as mesh (open) or torus
+(looped) by suffixing its radix specification with one of
+\fBm\fR, \fBM\fR, \fBt\fR, or \fBT\fR.  Thus, "mesh 3T 4 5" and
+"torus 3 4M 5M" both specify the same topology.
+.P
+Note that although torus-2QoS can route mesh fabrics, its ability to
+route around failed components is severely compromised on such fabrics.
+A failed fabric component is very likely to cause a disjoint ring;
+see \fBUNICAST ROUTING\fR in torus-2QoS(8).
+.RE
+.
+.P
+\fBxp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fByp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBzp_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBxm_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBym_link
+\fIsw0_GUID sw1_GUID
+.br
+.ns
+\fBzm_link
+\fIsw0_GUID sw1_GUID
+\fR
+.RS
+These keywords are used to seed the torus/mesh topology.
+For example, "xp_link 0x2000 0x2001" specifies that a link from
+the switch with node GUID 0x2000 to the switch with node GUID 0x2001
+would point in the positive x direction,
+while "xm_link 0x2000 0x2001" specifies that a link from
+the switch with node GUID 0x2000 to the switch with node GUID 0x2001
+would point in the negative x direction.  All the link keywords for
+a given seed must specify the same "from" switch.
+.P
+In general, it is not necessary to configure both the positive and
+negative directions for a given coordinate; either is sufficient.
+However, the algorithm used for topology discovery needs extra information
+for torus dimensions of radix four (see \fBTORUS TOPOLOGY DISCOVERY\fR in
+torus-2QoS(8)).  For such cases both the positive and negative coordinate
+directions must be specified.
+.P
+Based on the topology specified via the \fBtorus\fR/\fBmesh\fR keyword,
+torus-2QoS will detect and log when it has insufficient seed configuration.
+.RE
+.
+.P
+\fBx_dateline
+\fIposition
+.br
+.ns
+\fBy_dateline
+\fIposition
+.br
+.ns
+\fBz_dateline
+\fIposition
+\fR
+.RS
+In order for torus-2QoS to provide the guarantee that path SL values
+do not change under any conditions for which it can still route the fabric,
+its idea of dateline position must not change relative to physical switch
+locations.  The dateline keywords provide the means to configure such
+behavior.
+.P
+The dateline for a torus dimension is always between the switch with
+coordinate 0 and the switch with coordinate radix-1 for that dimension.
+By default, the common switch in a torus seed is taken as the origin of
+the coordinate system used to describe switch location.
+The \fIposition\fR parameter for a dateline keyword moves the origin
+(and hence the dateline) the specified amount relative to the common
+switch in a torus seed.
+.RE
+.
+.P
+\fBnext_seed
+\fR
+.RS
+If any of the switches used to specify a seed were to fail, torus-2QoS
+would be unable to complete topology discovery successfully.
+The \fBnext_seed\fR keyword specifies that the following link and dateline
+keywords apply to a new seed specification.
+.P
+For maximum resiliency, no seed specification should share a switch
+with any other seed specification.
+Multiple seed specifications should use dateline configuration to
+ensure that torus-2QoS can grant path SL values that are constant,
+regardless of which seed was used to initiate topology discovery.
+.RE
+.
+.P
+\fBportgroup_max_ports
+\fImax_ports
+\fR
+.RS
+This keyword specifies the maximum number of parallel inter-switch
+links, and also the maximum number of host ports per switch, that
+torus-2QoS can accommodate.
+The default value is 16.
+Torus-2QoS will log an error message during topology discovery if this
+parameter needs to be increased.
+If this keyword appears multiple times, the last instance prevails.
+.RE
+.
+.SH EXAMPLE
+.
+\f(CR
+.nf
+# Look for a 2D (since x radix is one) 4x5 torus.
+torus 1 4 5
+
+# y is radix-4 torus dimension, need both
+# ym_link and yp_link configuration.
+yp_link 0x200000 0x200005  # sw @ y=0,z=0 -> sw @ y=1,z=0
+ym_link 0x200000 0x20000f  # sw @ y=0,z=0 -> sw @ y=3,z=0
+
+# z is not radix-4 torus dimension, only need one of
+# zm_link or zp_link configuration.
+zp_link 0x200000 0x200001  # sw @ y=0,z=0 -> sw @ y=0,z=1
+
+next_seed
+
+yp_link 0x20000b 0x200010  # sw @ y=2,z=1 -> sw @ y=3,z=1
+ym_link 0x20000b 0x200006  # sw @ y=2,z=1 -> sw @ y=1,z=1
+zp_link 0x20000b 0x20000c  # sw @ y=2,z=1 -> sw @ y=2,z=2
+
+y_dateline -2  # Move the dateline for this seed
+z_dateline -1  # back to its original position.
+
+# If OpenSM failover is configured, for maximum resiliency
+# one instance should run on a host attached to a switch
+# from the first seed, and another instance should run
+# on a host attached to a switch from the second seed.
+# Both instances should use this torus-2QoS.conf to ensure
+# path SL values do not change in the event of SM failover.
+.fi
+\fR
+.
+.SH FILES
+.TP
+.B @OPENSM_CONFIG_DIR@/@TORUS2QOS_CONF_FILE@
+Default torus-2QoS config file.
+.
+.SH SEE ALSO
+.
+opensm(8), torus-2QoS(8).
-- 
1.6.2.2
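
As an illustration of the radix syntax documented in torus-2QoS.conf(5)
above, here is a minimal C sketch of how a radix token such as "4M" could
be split into a radix and a mesh/torus flag.  It is not taken from
osm_torus.c: parse_radix() and struct dim_cfg are hypothetical names, and
treating the radix as a decimal number is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-dimension result; not an OpenSM structure. */
struct dim_cfg {
    unsigned radix;  /* number of switches along this dimension */
    bool looped;     /* true: torus (looped), false: mesh (open) */
};

/*
 * Parse one radix token, e.g. "5", "4M" or "3t".
 * default_looped comes from the leading "torus"/"mesh" keyword and is
 * overridden by an m/M (mesh) or t/T (torus) suffix.
 * Returns true on success; *out may be partly written on failure.
 */
static bool parse_radix(const char *tok, bool default_looped,
                        struct dim_cfg *out)
{
    char *end;
    unsigned long radix;

    assert(tok != NULL && out != NULL);

    /* Reject signs and empty tokens before strtoul() sees them. */
    if (!isdigit((unsigned char)tok[0]))
        return false;

    radix = strtoul(tok, &end, 10);  /* assumption: decimal radix */
    if (radix == 0 || radix > UINT_MAX)
        return false;

    out->radix = (unsigned)radix;
    out->looped = default_looped;

    if (*end == 'm' || *end == 'M') {
        out->looped = false;
        end++;
    } else if (*end == 't' || *end == 'T') {
        out->looped = true;
        end++;
    }
    /* Anything left over makes the token malformed. */
    return *end == '\0';
}
```

With this sketch, the tokens of "mesh 3T 4 5" and "torus 3 4M 5M" parse to
the same three (radix, looped) pairs, which matches the statement in the
man page that both lines specify the same topology.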




* [PATCH 13/13] opensm/doc/current-routing.txt: Sync torus-2QoS information with new man pages.
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (11 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 12/13] opensm: Add torus-2QoS man pages Jim Schutt
@ 2010-11-12 22:11   ` Jim Schutt
  2010-11-30 15:23   ` [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset Sasha Khapyorsky
  2011-01-30 17:12   ` Sasha Khapyorsky
  14 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2010-11-12 22:11 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt


Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/doc/current-routing.txt |  141 +++++++++++++++++++++++++++++++++++----
 1 files changed, 126 insertions(+), 15 deletions(-)
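
For readers who want to experiment with the SL and SL2VL rules described
in the text below, here is a minimal, compilable C sketch of them.  It is
not code from osm_torus.c.  The function names and the in_dir/out_dir
parameters (torus direction 0, 1 or 2 of the input and output port, with
in_dir = -1 for a port that is not an inter-switch link) are hypothetical.
Mapping QoS level 1 to VLs 4-7 is an assumption; the documentation only
says that one level uses VLs 0-3 and the other VLs 4-7.

```c
#include <assert.h>

enum { TORUS_DIMS = 3 };

/*
 * Path SL: bit d is set when the path crosses the dateline of
 * dimension d; bit 3 carries the QoS level (0 or 1).
 */
static unsigned path_sl(const int crosses_dateline[TORUS_DIMS],
                        unsigned qos_level)
{
    unsigned sl = 0;

    assert(qos_level <= 1);
    for (int d = 0; d < TORUS_DIMS; d++)
        sl |= (unsigned)(crosses_dateline[d] != 0) << d;
    return sl | (qos_level << 3);
}

/*
 * SL2VL entry for one (input port, output port, SL) combination.
 * VL bit 0: dateline bit for the direction the output port points in.
 * VL bit 1: set for y-x, z-x and z-y turns, i.e. turns illegal under DOR.
 * VL bit 2: QoS level (assumed here: level 1 uses VLs 4-7).
 */
static unsigned sl2vl(int in_dir, int out_dir, unsigned sl)
{
    unsigned vl;

    assert(sl < 16);
    assert(out_dir >= 0 && out_dir < TORUS_DIMS);
    assert(in_dir >= -1 && in_dir < TORUS_DIMS);

    vl = 0x1u & (sl >> out_dir);
    if (in_dir > out_dir)          /* e.g. arriving along y, leaving along x */
        vl |= 0x2u;
    vl |= ((sl >> 3) & 0x1u) << 2;
    return vl;
}
```

For example, a path at QoS level 1 that crosses the x and z datelines gets
SL 13 from path_sl().  On a hop leaving in the x direction after arriving
along y, sl2vl(1, 0, 13) yields VL 7: dateline bit, illegal-turn bit, and
QoS bit are all set.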

diff --git a/opensm/doc/current-routing.txt b/opensm/doc/current-routing.txt
index 4eaf861..5048c55 100644
--- a/opensm/doc/current-routing.txt
+++ b/opensm/doc/current-routing.txt
@@ -399,7 +399,7 @@ Use '-R dor' option to activate the DOR algorithm.
 Torus-2QoS Routing Algorithm
 ----------------------------
 
-Torus-2QoS is routing algorithm designed for large-scale 2D/3D torus fabrics.
+Torus-2QoS is a routing algorithm designed for large-scale 2D/3D torus fabrics.
 The torus-2QoS routing engine can provide the following functionality on
 a 2D/3D torus:
 - routing that is free of credit loops
@@ -411,6 +411,8 @@ a 2D/3D torus:
 - very short run times, with good scaling properties as fabric size
     increases
 
+Unicast Routing:
+
 Torus-2QoS is a DOR-based algorithm that avoids deadlocks that would otherwise
 occur in a torus using the concept of a dateline for each torus dimension.
 It encodes into a path SL which datelines the path crosses as follows:
@@ -423,17 +425,18 @@ It encodes into a path SL which datelines the path crosses as follows:
 For a 3D torus, that leaves one SL bit free, which torus-2QoS uses to
 implement two QoS levels.
 
-This is possible because torus-2QoS also makes use of the output port
-dependence of the switch SL2VL maps.  It computes in which torus coordinate
-direction each interswitch link "points", and writes SL2VL maps for such
-ports as follows:
+Torus-2QoS also makes use of the output port dependence of switch SL2VL
+maps to encode into one VL bit the information encoded in three SL bits.
+It computes in which torus coordinate direction each inter-switch link
+"points", and writes SL2VL maps for such ports as follows:
 
   for (sl = 0; sl < 16; sl ++)
     /* cdir(port) reports which torus coordinate direction a switch port
      * "points" in, and returns 0, 1, or 2 */
     sl2vl(iport,oport,sl) = 0x1 & (sl >> cdir(oport));
 
-Thus torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
+Thus, on a pristine 3D torus, i.e., in the absence of failed fabric switches,
+torus-2QoS consumes 8 SL values (SL bits 0-2) and 2 VL values (VL bit 0)
 per QoS level to provide deadlock-free routing on a 3D torus.
 
 Torus-2QoS routes around link failure by "taking the long way around" any
@@ -454,7 +457,7 @@ torus below, where switches are denoted by [+a-zA-Z]:
 
       x=0    1    2    3    4    5
 
-For a pristine fabric the path from S to D would be S-n-T-r-d.  In the
+For a pristine fabric the path from S to D would be S-n-T-r-D.  In the
 event that either link S-n or n-T has failed, torus-2QoS would use the path
 S-m-p-o-T-r-D.  Note that it can do this without changing the path SL
 value; once the 1D ring m-S-n-T-o-p-m has been broken by failure, path
@@ -463,11 +466,19 @@ dateline (between, say, x=5 and x=0) can be ignored for path segments on
 that ring.
 
 One result of this is that torus-2QoS can route around many simultaneous
-link failures, as long as no 1D ring is broken into disjoint regions.  For
+link failures, as long as no 1D ring is broken into disjoint segments.  For
 example, if links n-T and T-o have both failed, that ring has been broken
-into two disjoint regions, T and o-p-m-S-n.  Torus-2QoS checks for such
+into two disjoint segments, T and o-p-m-S-n.  Torus-2QoS checks for such
 issues, reports if they are found, and refuses to route such fabrics.
 
+Note that in the case where there are multiple parallel links between a pair
+of switches, torus-2QoS will allocate routes across such links in a round-
+robin fashion, based on ports at the path destination switch that are active
+and not used for inter-switch links.  Should a link that is one of several
+such parallel links fail, routes are redistributed across the remaining
+links.  When the last of such a set of parallel links fails, traffic is
+rerouted as described above.
+
 Handling a failed switch under DOR requires introducing into a path at
 least one turn that would be otherwise "illegal", i.e. not allowed by DOR
 rules.  Torus-2QoS will introduce such a turn as close as possible to the
@@ -476,8 +487,9 @@ failed switch in order to route around it.
 In the above example, suppose switch T has failed, and consider the path
 from S to D.  Torus-2QoS will produce the path S-n-I-r-D, rather than the
 S-n-T-r-D path for a pristine torus, by introducing an early turn at n.
-For traffic arriving at switch I from n, normal DOR rules will generate an
-illegal turn in the path from S to D at I, and a legal turn at r.
+Normal DOR rules will cause traffic arriving at switch I to be forwarded
+to switch r; for traffic arriving from n due to the "early" turn at n,
+this will generate an "illegal" turn at I.
 
 Torus-2QoS will also use the input port dependence of SL2VL maps to set VL
 bit 1 (which would be otherwise unused) for y-x, z-x, and z-y turns, i.e.,
@@ -549,6 +561,8 @@ VL with bit 1 set.  In contrast to the earlier examples, the second hop
 after the illegal turn, q-r, can be used to construct a credit loop
 encircling the failed switches.
 
+Multicast Routing:
+
 Since torus-2QoS uses all four available SL bits, and the three data VL
 bits that are typically available in current switches, there is no way
 to use SL/VL values to separate multicast traffic from unicast traffic.
@@ -649,7 +663,104 @@ a branch that crosses a dateline.  However, again this cannot contribute
 to credit loops as it occurs on a 1D ring (the ring for x=3) that is
 broken by a failure, as in the above example.
 
-Due to the use made by torus-2QoS of SLs and VLs, QoS configuration should
-only employ SL values 0 and 8, for both multicast and unicast.  Also,
-SL to VL map configuration must be under the complete control of torus-2QoS,
-so any user-supplied configuration must and will be ignored.
+Torus Topology Discovery:
+
+The algorithm used by torus-2QoS to construct the torus topology from the
+undirected graph representing the fabric requires that the radix of each
+dimension be configured via torus-2QoS.conf. It also requires that the
+torus topology be "seeded"; for a 3D torus this requires configuring four
+switches that define the three coordinate directions of the torus.
+
+Given this starting information, the algorithm is to examine the cube
+formed by the eight switch locations bounded by the corners (x,y,z) and
+(x+1,y+1,z+1).  Based on switches already placed into the torus topology at
+some of these locations, the algorithm examines 4-loops of inter-switch
+links to find the one that is consistent with a face of the cube of switch
+locations, and adds its switches to the discovered topology in the correct
+locations.
+
+Because the algorithm is based on examining the topology of 4-loops of links,
+a torus with one or more radix-4 dimensions requires extra initial seed
+configuration.  See torus-2QoS.conf(5) for details. Torus-2QoS will detect
+and report when it has insufficient configuration for a torus with radix-4
+dimensions.
+
+In the event the torus is significantly degraded, i.e., there are many
+missing switches or links, it may happen that torus-2QoS is unable to place
+into the torus some switches and/or links that were discovered in the
+fabric, and will generate a warning in that case.  A similar condition
+occurs if torus-2QoS is misconfigured, i.e., the radix of a torus dimension
+as configured does not match the radix of that torus dimension as wired,
+and many switches/links in the fabric will not be placed into the torus.
+
+Quality Of Service Configuration:
+
+OpenSM will not program switches and channel adapters with SL2VL maps or VL
+arbitration configuration unless it is invoked with -Q.  Since torus-2QoS
+depends on such functionality for correct operation, always invoke OpenSM
+with -Q when torus-2QoS is in the list of routing engines.
+
+Any quality of service configuration method supported by OpenSM will work
+with torus-2QoS, subject to the following limitations and considerations.
+
+For all routing engines supported by OpenSM except torus-2QoS, there is a
+one-to-one correspondence between QoS level and SL. Torus-2QoS can only
+support two quality of service levels, so only the high-order bit of any SL
+value used for unicast QoS configuration will be honored by torus-2QoS.
+
+For multicast QoS configuration, only SL values 0 and 8 should be used with
+torus-2QoS.
+
+Since SL to VL map configuration must be under the complete control of
+torus-2QoS, any configuration via qos_sl2vl, qos_swe_sl2vl, etc., must and
+will be ignored, and a warning will be generated.
+
+Torus-2QoS uses VL values 0-3 to implement one of its supported QoS levels,
+and VL values 4-7 to implement the other.  Hard-to-diagnose application
+issues may arise if traffic is not delivered fairly across each of these
+two VL ranges. Torus-2QoS will detect and warn if VL arbitration is
+configured unfairly across VLs in the range 0-3, and also in the range
+4-7. Note that the default OpenSM VL arbitration configuration does not
+meet this constraint, so all torus-2QoS users should configure VL
+arbitration via qos_vlarb_high, qos_vlarb_low, etc.
+
+Operational Considerations:
+
+Any routing algorithm for a torus IB fabric must employ path SL values to
+avoid credit loops. As a result, all applications run over such fabrics
+must perform a path record query to obtain the correct path SL for
+connection setup. Applications that use rdma_cm for connection setup will
+automatically meet this requirement.
+
+If a change in fabric topology causes changes in path SL values required to
+route without credit loops, in general all applications would need to
+repath to avoid message deadlock. Since torus-2QoS has the ability to
+reroute after a single switch failure without changing path SL values,
+repathing by running applications is not required when the fabric is routed
+with torus-2QoS.
+
+Torus-2QoS can provide unchanging path SL values in the presence of subnet
+manager failover provided that all OpenSM instances have the same idea of
+dateline location. See torus-2QoS.conf(5) for details.
+
+Torus-2QoS will detect configurations of failed switches and links that
+prevent routing that is free of credit loops, and will log warnings and
+refuse to route. If "no_fallback" was configured in the list of OpenSM
+routing engines, then no other routing engine will attempt to route the
+fabric. In that case all paths that do not transit the failed components
+will continue to work, and the subset of paths that are still operational
+will continue to remain free of credit loops. OpenSM will continue to
+attempt to route the fabric after every sweep interval, and after any
+change (such as a link up) in the fabric topology. When the fabric
+components are repaired, full functionality will be restored.
+
+In the event OpenSM was configured to allow some other engine to route the
+fabric if torus-2QoS fails, then credit loops and message deadlock are
+likely if torus-2QoS had previously routed the fabric successfully. Even if
+the other engine is capable of routing a torus without credit loops,
+applications that built connections with path SL values granted under
+torus-2QoS will likely experience message deadlock under routing generated
+by a different engine, unless they repath.
+
+To verify that a torus fabric is routed free of credit loops, use ibdmchk
+to analyze data collected via ibdiagnet -vlr.
-- 
1.6.2.2
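
For readers of the archive, here is a minimal sketch of the QoS rules
described in the "Quality Of Service Configuration" text above. It is not
code from OpenSM. The function names are hypothetical, the assignment of
QoS level 0 to VLs 0-3 is an assumption (the text only says each level gets
one of the two ranges), and "fair" is assumed here to mean equal weight for
every VL within a range of a single arbitration table. The "VL:weight"
comma-separated form is the one used by qos_vlarb_high and qos_vlarb_low.

```python
def qos_level_for_sl(sl: int) -> int:
    """Return the QoS level (0 or 1) that a 4-bit SL selects.

    Only the high-order bit of the SL is honored, so SLs 0-7 share one
    level and SLs 8-15 share the other.
    """
    if not 0 <= sl <= 15:
        raise ValueError("SL must be in the range 0-15")
    return (sl >> 3) & 1


def vl_range_for_level(level: int) -> range:
    """Return the VLs backing a QoS level.

    Assumption: level 0 uses VLs 0-3 and level 1 uses VLs 4-7.
    """
    if level not in (0, 1):
        raise ValueError("torus-2QoS supports only two QoS levels")
    return range(0, 4) if level == 0 else range(4, 8)


def parse_vlarb(spec: str) -> dict:
    """Parse a 'VL:weight,VL:weight,...' string into {vl: total_weight}."""
    weights = {}
    for entry in spec.split(","):
        vl_text, weight_text = entry.strip().split(":")
        vl = int(vl_text)
        weights[vl] = weights.get(vl, 0) + int(weight_text)
    return weights


def is_fair(weights: dict, vls: range) -> bool:
    """True if every VL in vls has the same total weight (assumed criterion)."""
    return len({weights.get(vl, 0) for vl in vls}) == 1


if __name__ == "__main__":
    # Illustrative table only: VL 0 is favored over VLs 1-3.
    table = parse_vlarb("0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0")
    for level in (0, 1):
        vls = vl_range_for_level(level)
        print(level, list(vls), is_fair(table, vls))
```

A table that weights one VL in a range differently from its neighbors fails
this check for that range, which is the kind of configuration the patch
text says torus-2QoS warns about. The real check in OpenSM may combine the
high and low priority tables differently.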


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (12 preceding siblings ...)
  2010-11-12 22:11   ` [PATCH 13/13] opensm/doc/current-routing.txt: Sync torus-2QoS information with new " Jim Schutt
@ 2010-11-30 15:23   ` Sasha Khapyorsky
  2011-01-30 17:12   ` Sasha Khapyorsky
  14 siblings, 0 replies; 17+ messages in thread
From: Sasha Khapyorsky @ 2010-11-30 15:23 UTC (permalink / raw)
  To: Jim Schutt; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 15:11 Fri 12 Nov     , Jim Schutt wrote:
> Hi Sasha,
> 
> These patches clean up and add documentation to the
> torus-2QoS routing module for OpenSM.  They apply on
> top of my previous bug-fix patchset from September
> (http://www.spinics.net/lists/linux-rdma/msg05809.html),
> which applies to your torus-2qos branch.

Applied. Thanks.

Sasha

* Re: [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset
       [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
                     ` (13 preceding siblings ...)
  2010-11-30 15:23   ` [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset Sasha Khapyorsky
@ 2011-01-30 17:12   ` Sasha Khapyorsky
  2011-01-31 15:35     ` Jim Schutt
  14 siblings, 1 reply; 17+ messages in thread
From: Sasha Khapyorsky @ 2011-01-30 17:12 UTC (permalink / raw)
  To: Jim Schutt; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Alex Netes

Hi Jim,

On 15:11 Fri 12 Nov     , Jim Schutt wrote:
> 
> These patches clean up and add documentation to the
> torus-2QoS routing module for OpenSM.  They apply on
> top of my previous bug-fix patchset from September
> (http://www.spinics.net/lists/linux-rdma/msg05809.html),
> which applies to your torus-2qos branch.

Following your and others' feedback, I've merged the torus-2qos branch
upstream. Thanks.

Sasha

* Re: [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset
  2011-01-30 17:12   ` Sasha Khapyorsky
@ 2011-01-31 15:35     ` Jim Schutt
  0 siblings, 0 replies; 17+ messages in thread
From: Jim Schutt @ 2011-01-31 15:35 UTC (permalink / raw)
  To: Sasha Khapyorsky
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Alex Netes

Hi Sasha,

On Sun, 2011-01-30 at 10:12 -0700, Sasha Khapyorsky wrote:
> Hi Jim,
> 
> On 15:11 Fri 12 Nov     , Jim Schutt wrote:
> > 
> > These patches clean up and add documentation to the
> > torus-2QoS routing module for OpenSM.  They apply on
> > top of my previous bug-fix patchset from September
> > (http://www.spinics.net/lists/linux-rdma/msg05809.html),
> > which applies to your torus-2qos branch.
> 
> Following your and others' feedback, I've merged the torus-2qos branch
> upstream. Thanks.

Great news! Thanks.

-- Jim

> 
> Sasha
> 




Thread overview: 17+ messages
2010-11-12 22:11 [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset Jim Schutt
     [not found] ` <1289599882-15165-1-git-send-email-jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
2010-11-12 22:11   ` [PATCH 01/13] Revert "opensm: Do not require -Q option for torus-2QoS routing engine." Jim Schutt
2010-11-12 22:11   ` [PATCH 02/13] opensm: torus-2QoS requires that QoS be enabled Jim Schutt
2010-11-12 22:11   ` [PATCH 03/13] opensm/osm_ucast_mgr.c: ensure osm_ucast_mgr_process() returns failure when no routing engine runs Jim Schutt
2010-11-12 22:11   ` [PATCH 04/13] opensm: Fill in default QoS values at last possible moment Jim Schutt
2010-11-12 22:11   ` [PATCH 05/13] opensm: Cause torus-2QoS to warn if QoS configuration will cause issues Jim Schutt
2010-11-12 22:11   ` [PATCH 06/13] opensm/osm_torus.c: Also parse DOS line endings in torus-2QoS.conf Jim Schutt
2010-11-12 22:11   ` [PATCH 07/13] opensm/osm_torus.c: Use PRIx64 for GUID printing Jim Schutt
2010-11-12 22:11   ` [PATCH 08/13] opensm/osm_torus.c: Ignore multiple configurations of torus size Jim Schutt
2010-11-12 22:11   ` [PATCH 09/13] opensm/osm_subnet.c: Add torus-2QoS config file option to those configurable via opensm config file Jim Schutt
2010-11-12 22:11   ` [PATCH 10/13] opensm/main.c: Add description of "no_fallback" to "--routing_engine" option documentation Jim Schutt
2010-11-12 22:11   ` [PATCH 11/13] opensm/man/opensm.8.in: Add references to torus-2QoS Jim Schutt
2010-11-12 22:11   ` [PATCH 12/13] opensm: Add torus-2QoS man pages Jim Schutt
2010-11-12 22:11   ` [PATCH 13/13] opensm/doc/current-routing.txt: Sync torus-2QoS information with new " Jim Schutt
2010-11-30 15:23   ` [PATCH 00/13] opensm: Cleanups and more documentation for torus-2QoS patchset Sasha Khapyorsky
2011-01-30 17:12   ` Sasha Khapyorsky
2011-01-31 15:35     ` Jim Schutt
