public inbox for linux-rdma@vger.kernel.org
* [PATCH 0/2] opensm: Bug fixes for torus-2QoS patchset
@ 2010-09-17 17:03 Jim Schutt
From: Jim Schutt @ 2010-09-17 17:03 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

Hi Sasha,

These patches fix bugs discovered during further testing of the
torus-2QoS routing module for OpenSM.  They apply to your
torus-2qos branch.

Thanks -- Jim


Jim Schutt (2):
  opensm/osm_torus.c: Add check for invalid topology discovery due to
    user misconfiguration.
  opensm/osm_torus.c: Handle calloc() failure on routing engine context
    creation.

 opensm/opensm/osm_torus.c |   24 +++++++++++++++++++++++-
 1 files changed, 23 insertions(+), 1 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH 1/2] opensm/osm_torus.c: Add check for invalid topology discovery due to user misconfiguration.
From: Jim Schutt @ 2010-09-17 17:03 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

Hal Rosenstock found a way to make torus-2QoS segfault: the fabric
contains a torus dimension with radix 4, but the configuration in
torus-2QoS.conf does not say so.  This patch detects the result of
such a misconfiguration, logs an error, and makes link_srcsink()
return false instead of continuing with a null switch pointer.

Tested-by: Hal Rosenstock <hal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 0b7741d..12b480d 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -1623,6 +1623,22 @@ bool link_srcsink(struct torus *t, int i, int j, int k)
 		return true;
 
 	fsw = tsw->tmp;
+	/*
+	 * link_srcsink is supposed to get called once for every switch in
+	 * the fabric.  At this point every fsw we encounter must have a
+	 * non-null osm_switch.  Otherwise something has gone horribly
+	 * wrong with topology discovery; the most likely reason is that
+	 * the fabric contains a radix-4 torus dimension, but the user gave
+	 * a config that didn't say so, breaking all the checking in
+	 * safe_x_perpendicular and friends.
+	 */
+	if (!(fsw && fsw->osm_switch)) {
+		OSM_LOG(&t->osm->log, OSM_LOG_ERROR,
+			"Error: Invalid topology discovery. "
+			"Verify torus-2QoS.conf contents.\n");
+		return false;
+	}
+
 	pg = &tsw->ptgrp[2 * TORUS_MAX_DIM];
 	pg->type = SRCSINK;
 	tsw->osm_switch = fsw->osm_switch;
-- 
1.6.2.2
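
For readers without the OpenSM tree at hand, the guard added above can be
sketched in isolation.  This is only an illustration under stated
assumptions: the *_stub types and link_srcsink_sketch() are hypothetical
stand-ins, not the real osm_torus.c definitions, and fprintf() stands in
for OSM_LOG().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real osm_torus.c types. */
struct osm_switch_stub {
	int id;
};

struct fabric_switch_stub {
	/* NULL if topology discovery never matched a real switch. */
	struct osm_switch_stub *osm_switch;
};

struct torus_switch_stub {
	/* Fabric switch placed at this torus coordinate, or NULL. */
	struct fabric_switch_stub *tmp;
	struct osm_switch_stub *osm_switch;
};

/*
 * Validate the discovered switch before using it, and report failure
 * to the caller rather than dereferencing a null pointer.
 */
static bool link_srcsink_sketch(struct torus_switch_stub *tsw)
{
	struct fabric_switch_stub *fsw = tsw->tmp;

	if (!(fsw && fsw->osm_switch)) {
		/* The real code logs through OSM_LOG(); stderr is an assumption. */
		fprintf(stderr, "Error: Invalid topology discovery. "
			"Verify torus-2QoS.conf contents.\n");
		return false;
	}

	tsw->osm_switch = fsw->osm_switch;
	return true;
}
```

The point of the pattern is that both a missing fabric switch and a fabric
switch without an osm_switch take the same error path, so the caller only
has to check one boolean.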



* [PATCH 2/2] opensm/osm_torus.c: Handle calloc() failure on routing engine context creation.
From: Jim Schutt @ 2010-09-17 17:03 UTC (permalink / raw)
  To: sashak-smomgflXvOZWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jim Schutt

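Hal Rosenstock pointed out that this calloc() could fail, in which
case torus_context_create() would write through a NULL pointer.  Log
the failure there, and have osm_ucast_torus2QoS_setup() return -1
when no context could be created.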

Signed-off-by: Jim Schutt <jaschut-4OHPYypu0djtX7QSmKvirg@public.gmane.org>
---
 opensm/opensm/osm_torus.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/opensm/opensm/osm_torus.c b/opensm/opensm/osm_torus.c
index 12b480d..3b67f16 100644
--- a/opensm/opensm/osm_torus.c
+++ b/opensm/opensm/osm_torus.c
@@ -410,7 +410,11 @@ struct torus_context *torus_context_create(osm_opensm_t *osm)
 	struct torus_context *ctx;
 
 	ctx = calloc(1, sizeof(*ctx));
-	ctx->osm = osm;
+	if (ctx)
+		ctx->osm = osm;
+	else
+		OSM_LOG(&osm->log, OSM_LOG_ERROR,
+			"Error: calloc: %s\n", strerror(errno));
 
 	return ctx;
 }
@@ -9113,6 +9117,8 @@ int osm_ucast_torus2QoS_setup(struct osm_routing_engine *r,
 	struct torus_context *ctx;
 
 	ctx = torus_context_create(osm);
+	if (!ctx)
+		return -1;
 
 	r->context = ctx;
 	r->ucast_build_fwd_tables = torus_build_lfts;
-- 
1.6.2.2
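
The two hunks above work as a pair: the constructor reports the failure
and returns NULL, and the setup routine turns that NULL into an error
code.  A minimal standalone sketch of that pairing follows.  All names
here (context_stub, engine_stub, calloc_fn, failing_calloc and the
*_sketch functions) are hypothetical, and the allocator is passed in as
a parameter only so the failure path can be exercised; the real code
calls calloc() directly and logs through OSM_LOG().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for torus_context and osm_routing_engine. */
struct context_stub {
	void *owner;
};

struct engine_stub {
	void *context;
};

/* Allocator signature, injectable so a failure can be simulated. */
typedef void *(*calloc_fn)(size_t, size_t);

/* Returns NULL on allocation failure, after logging the reason. */
static struct context_stub *context_create_sketch(void *owner, calloc_fn alloc)
{
	struct context_stub *ctx = alloc(1, sizeof(*ctx));

	if (ctx)
		ctx->owner = owner;
	else
		fprintf(stderr, "Error: calloc: %s\n", strerror(errno));

	return ctx;
}

/* Returns 0 on success, -1 if the context could not be created. */
static int engine_setup_sketch(struct engine_stub *r, void *owner,
			       calloc_fn alloc)
{
	struct context_stub *ctx = context_create_sketch(owner, alloc);

	if (!ctx)
		return -1;

	r->context = ctx;
	return 0;
}

/* Test helper: an allocator that always fails with ENOMEM. */
static void *failing_calloc(size_t n, size_t size)
{
	(void)n;
	(void)size;
	errno = ENOMEM;
	return NULL;
}
```

Calling engine_setup_sketch(&r, owner, failing_calloc) leaves r.context
untouched and returns -1, while passing the standard calloc returns 0
with the context installed.  Note that strerror(errno) is only
meaningful where calloc() sets errno on failure; POSIX specifies ENOMEM,
but ISO C alone does not require it.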



* Re: [PATCH 0/2] opensm: Bug fixes for torus-2QoS patchset
From: Sasha Khapyorsky @ 2010-11-30 14:53 UTC (permalink / raw)
  To: Jim Schutt; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 11:03 Fri 17 Sep     , Jim Schutt wrote:
> Hi Sasha,
> 
> These patches fix bugs discovered during further testing of the
> torus-2QoS routing module for OpenSM.  They apply to your
> torus-2qos branch.
> 
> Thanks -- Jim
> 
> 
> Jim Schutt (2):
>   opensm/osm_torus.c: Add check for invalid topology discovery due to
>     user misconfiguration.
>   opensm/osm_torus.c: Handle calloc() failure on routing engine context
>     creation.
> 
>  opensm/opensm/osm_torus.c |   24 +++++++++++++++++++++++-
>  1 files changed, 23 insertions(+), 1 deletions(-)

Applied. Thanks.

Sasha