public inbox for linux-rdma@vger.kernel.org
* [PATCH 0/2] opensm partition.conf issues
@ 2013-10-17 11:10 Bernd Schubert
  2013-10-17 11:10 ` [PATCH opensm 1/2] reduce log level for missing partition configuration file Bernd Schubert
  2013-10-17 11:10 ` [PATCH opensm 2/2] Try default partition config if parsing the partitions.conf failed Bernd Schubert
  0 siblings, 2 replies; 5+ messages in thread
From: Bernd Schubert @ 2013-10-17 11:10 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb

Without a partitions.conf file, opensm fills the log file with error messages
on every sweep:

Sep 27 18:21:47 917409 [6A60D700] 0x01 -> osm_prtn_make_partitions: Partition configuration /etc/opensm/partitions.conf is not accessible (No such file or directory)
Sep 27 18:21:47 918994 [6A60D700] 0x02 -> SUBNET UP
Sep 27 18:22:07 917436 [6A60D700] 0x01 -> osm_prtn_make_partitions: Partition configuration /etc/opensm/partitions.conf is not accessible (No such file or directory)
Sep 27 18:22:07 918979 [6A60D700] 0x02 -> SUBNET UP

With an empty partition file, opensm didn't bring up the fabric properly,
resulting in endless kernel messages:
ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22

The opensm.log file had lots of these messages:
Oct 17 09:31:38 461345 [67EC4700] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: ff12:401b:ffff::ffff:ffff from port 0x0002c9030036dd21 (MT25408 ConnectX Mellanox Technologies)
Oct 17 09:31:38 505122 [66EC2700] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B11: method = SubnAdmSet, scope_state = 0x1, component mask = 0x0000000000010083, expected comp mask = 0x00000000000130c7, MGID: ff12:401b:ffff::ffff:ffff from port 0x0002c90300f4d6d1 (MT25408 ConnectX Mellanox Technologies)
Oct 17 09:31:46 294201 [5D6AF700] 0x02 -> SUBNET UP

So here are two patches to reduce the log level and to try the default 
partition configuration if there is not a single valid config line.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* [PATCH opensm 1/2] reduce log level for missing partition configuration file.
  2013-10-17 11:10 [PATCH 0/2] opensm partition.conf issues Bernd Schubert
@ 2013-10-17 11:10 ` Bernd Schubert
  2013-10-22  5:59   ` Hal Rosenstock
  2013-10-17 11:10 ` [PATCH opensm 2/2] Try default partition config if parsing the partitions.conf failed Bernd Schubert
  1 sibling, 1 reply; 5+ messages in thread
From: Bernd Schubert @ 2013-10-17 11:10 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb

A missing non-mandatory file is not an error.

Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
---
 opensm/osm_prtn.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/opensm/osm_prtn.c b/opensm/osm_prtn.c
index 24a1fe3..e76e2e1 100644
--- a/opensm/osm_prtn.c
+++ b/opensm/osm_prtn.c
@@ -383,7 +383,7 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
 	file_name = p_subn->opt.partition_config_file ?
 	    p_subn->opt.partition_config_file : OSM_DEFAULT_PARTITION_CONFIG_FILE;
 	if (stat(file_name, &statbuf)) {
-		OSM_LOG(p_log, OSM_LOG_ERROR, "Partition configuration "
+		OSM_LOG(p_log, OSM_LOG_VERBOSE, "Partition configuration "
 			"%s is not accessible (%s)\n", file_name,
 			strerror(errno));
 		is_config = FALSE;
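
Editorial illustration, not part of the patch: the change above lowers the
log level for every stat() failure. A possible variation, shown here only as
a minimal sketch, would keep the quiet level for a file that simply does not
exist (ENOENT) and stay loud for other failures such as permission problems.
The enum and the function name are hypothetical and are not opensm
identifiers.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical log levels; opensm defines its own OSM_LOG_* values. */
enum sketch_log_level { SKETCH_LOG_ERROR, SKETCH_LOG_VERBOSE };

/*
 * Probe an optional configuration file.
 * Returns 1 if stat() succeeds and 0 otherwise. On failure, *level is set
 * to the severity this sketch would use for the message: a missing file is
 * treated as normal, any other failure as an error.
 */
static int sketch_probe_optional_file(const char *file_name,
				      enum sketch_log_level *level)
{
	struct stat statbuf;
	int saved_errno;

	if (stat(file_name, &statbuf) == 0)
		return 1;

	saved_errno = errno;	/* keep errno before calling anything else */
	*level = (saved_errno == ENOENT) ? SKETCH_LOG_VERBOSE
					 : SKETCH_LOG_ERROR;
	fprintf(stderr, "Partition configuration %s is not accessible (%s)\n",
		file_name, strerror(saved_errno));
	return 0;
}
```

A caller would use the return value the way the patched function uses
is_config, and pass the chosen level to its logging routine.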


* [PATCH opensm 2/2] Try default partition config if parsing the partitions.conf failed
  2013-10-17 11:10 [PATCH 0/2] opensm partition.conf issues Bernd Schubert
  2013-10-17 11:10 ` [PATCH opensm 1/2] reduce log level for missing partition configuration file Bernd Schubert
@ 2013-10-17 11:10 ` Bernd Schubert
  2013-10-22  6:08   ` Hal Rosenstock
  1 sibling, 1 reply; 5+ messages in thread
From: Bernd Schubert @ 2013-10-17 11:10 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb

If partitions.conf is invalid or empty for some reason, try again
with the default configuration.

This will re-use the default configuration created by prtn_make_default(),
but osm_prtn_make_new() will automatically overwrite the initial default.

Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
---
 opensm/osm_prtn.c        |   11 ++++++++++-
 opensm/osm_prtn_config.c |   11 ++++++++++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/opensm/osm_prtn.c b/opensm/osm_prtn.c
index e76e2e1..4db7e7a 100644
--- a/opensm/osm_prtn.c
+++ b/opensm/osm_prtn.c
@@ -376,6 +376,7 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
 	struct stat statbuf;
 	const char *file_name;
 	boolean_t is_config = TRUE;
+	boolean_t is_wrong_config = FALSE;
 	ib_api_status_t status = IB_SUCCESS;
 	cl_map_item_t *p_next;
 	osm_prtn_t *p;
@@ -389,6 +390,7 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
 		is_config = FALSE;
 	}
 
+retry_default:
 	/* clean up current port maps */
 	p_next = cl_qmap_head(&p_subn->prtn_pkey_tbl);
 	while (p_next != cl_qmap_end(&p_subn->prtn_pkey_tbl)) {
@@ -404,9 +406,11 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
 	if (status != IB_SUCCESS)
 		goto _err;
 
-	if (is_config && osm_prtn_config_parse_file(p_log, p_subn, file_name))
+	if (is_config && osm_prtn_config_parse_file(p_log, p_subn, file_name)) {
 		OSM_LOG(p_log, OSM_LOG_VERBOSE, "Partition configuration "
 			"was not fully processed\n");
+		is_wrong_config = TRUE;
+	}
 
 	/* and now clean up empty partitions */
 	p_next = cl_qmap_head(&p_subn->prtn_pkey_tbl);
@@ -421,6 +425,11 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
 		}
 	}
 
+	if (is_config && is_wrong_config) {
+		is_config = FALSE;
+		goto retry_default;
+	}
+
 _err:
 	return status;
 }
diff --git a/opensm/osm_prtn_config.c b/opensm/osm_prtn_config.c
index 8f4a673..e916582 100644
--- a/opensm/osm_prtn_config.c
+++ b/opensm/osm_prtn_config.c
@@ -696,6 +696,9 @@ done:
 	return len;
 }
 
+/**
+ * @return -1 on error, 0 on success
+ */
 int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
 			       const char *file_name)
 {
@@ -703,6 +706,7 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
 	struct part_conf *conf = NULL;
 	FILE *file;
 	int lineno;
+	boolean_t is_parse_success = FALSE;
 
 	file = fopen(file_name, "r");
 	if (!file) {
@@ -753,6 +757,8 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
 				break;
 			}
 
+			is_parse_success = TRUE;
+
 			p += len;
 
 			if (q) {
@@ -764,5 +770,8 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
 
 	fclose(file);
 
-	return 0;
+	if (is_parse_success)
+		return 0;
+	else
+		return -1;
 }
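
Editorial illustration, not part of the patch: the control flow added to
osm_prtn_make_partitions() can be modelled in isolation. The sketch below is
an assumption-laden toy: the struct, the parser callbacks and all sketch_*
names are hypothetical, the partition table is reduced to a counter, and a
do/while loop stands in for the patch's "goto retry_default". It only shows
that a failed parse leads to exactly one more pass with the file ignored.

```c
/* Hypothetical stand-in for the subnet's partition table. */
struct sketch_subnet {
	int n_partitions;
};

/* Hypothetical parser callback: 0 on success, non-zero on failure. */
typedef int (*sketch_parse_fn)(struct sketch_subnet *subn);

static int sketch_parse_ok(struct sketch_subnet *subn)
{
	subn->n_partitions++;	/* pretend the file configured one partition */
	return 0;
}

static int sketch_parse_bad(struct sketch_subnet *subn)
{
	(void)subn;		/* nothing usable in the file */
	return -1;
}

/* Builds the partitions and returns how many passes were needed. */
static int sketch_make_partitions(struct sketch_subnet *subn,
				  sketch_parse_fn parse, int is_config)
{
	int passes = 0;
	int is_wrong_config;

	do {
		is_wrong_config = 0;
		passes++;

		subn->n_partitions = 0;	/* clean up state from an earlier pass */
		subn->n_partitions = 1;	/* always create the default partition */

		if (is_config && parse(subn) != 0)
			is_wrong_config = 1;

		/* A bad file disables itself, so at most one retry happens. */
		if (is_config && is_wrong_config)
			is_config = 0;
		else
			is_wrong_config = 0;
	} while (is_wrong_config);

	return passes;
}
```

With sketch_parse_bad the function takes two passes and ends with only the
default partition; with sketch_parse_ok it takes one pass.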


* Re: [PATCH opensm 1/2] reduce log level for missing partition configuration file.
  2013-10-17 11:10 ` [PATCH opensm 1/2] reduce log level for missing partition configuration file Bernd Schubert
@ 2013-10-22  5:59   ` Hal Rosenstock
  0 siblings, 0 replies; 5+ messages in thread
From: Hal Rosenstock @ 2013-10-22  5:59 UTC
  To: Bernd Schubert; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 10/17/2013 7:10 AM, Bernd Schubert wrote:
> A missing non-mandatory file is not an error.
> 
> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>

Thanks. Applied.

-- Hal


* Re: [PATCH opensm 2/2] Try default partition config if parsing the partitions.conf failed
  2013-10-17 11:10 ` [PATCH opensm 2/2] Try default partition config if parsing the partitions.conf failed Bernd Schubert
@ 2013-10-22  6:08   ` Hal Rosenstock
  0 siblings, 0 replies; 5+ messages in thread
From: Hal Rosenstock @ 2013-10-22  6:08 UTC
  To: Bernd Schubert; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hi Bernd,

On 10/17/2013 7:10 AM, Bernd Schubert wrote:
> If partitions.conf is for some reasons invalid or empty, try again
> with the default configuration.
> 
> This will re-use the default configuration created by prtn_make_default(),
> but osm_prtn_make_new() will automatically overwrite the initial default.

This seems like a better "policy". The admin will now need to notice
that he might not have gotten the partitioning he was trying to instill
in the subnet. Before, he would not miss this because the result was more
severe.

> Signed-off-by: Bernd Schubert <bernd.schubert-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
> ---
>  opensm/osm_prtn.c        |   11 ++++++++++-
>  opensm/osm_prtn_config.c |   11 ++++++++++-
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/opensm/osm_prtn.c b/opensm/osm_prtn.c
> index e76e2e1..4db7e7a 100644
> --- a/opensm/osm_prtn.c
> +++ b/opensm/osm_prtn.c
> @@ -376,6 +376,7 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
>  	struct stat statbuf;
>  	const char *file_name;
>  	boolean_t is_config = TRUE;
> +	boolean_t is_wrong_config = FALSE;
>  	ib_api_status_t status = IB_SUCCESS;
>  	cl_map_item_t *p_next;
>  	osm_prtn_t *p;
> @@ -389,6 +390,7 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
>  		is_config = FALSE;
>  	}
>  
> +retry_default:
>  	/* clean up current port maps */
>  	p_next = cl_qmap_head(&p_subn->prtn_pkey_tbl);
>  	while (p_next != cl_qmap_end(&p_subn->prtn_pkey_tbl)) {
> @@ -404,9 +406,11 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
>  	if (status != IB_SUCCESS)
>  		goto _err;
>  
> -	if (is_config && osm_prtn_config_parse_file(p_log, p_subn, file_name))
> +	if (is_config && osm_prtn_config_parse_file(p_log, p_subn, file_name)) {
>  		OSM_LOG(p_log, OSM_LOG_VERBOSE, "Partition configuration "
>  			"was not fully processed\n");
> +		is_wrong_config = TRUE;
> +	}
>  
>  	/* and now clean up empty partitions */
>  	p_next = cl_qmap_head(&p_subn->prtn_pkey_tbl);
> @@ -421,6 +425,11 @@ ib_api_status_t osm_prtn_make_partitions(osm_log_t * p_log, osm_subn_t * p_subn)
>  		}
>  	}
>  
> +	if (is_config && is_wrong_config) {
> +		is_config = FALSE;
> +		goto retry_default;
> +	}
> +
>  _err:
>  	return status;
>  }
> diff --git a/opensm/osm_prtn_config.c b/opensm/osm_prtn_config.c
> index 8f4a673..e916582 100644
> --- a/opensm/osm_prtn_config.c
> +++ b/opensm/osm_prtn_config.c
> @@ -696,6 +696,9 @@ done:
>  	return len;
>  }
>  
> +/**
> + * @return -1 on error, 0 on success
> + */
>  int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
>  			       const char *file_name)
>  {
> @@ -703,6 +706,7 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
>  	struct part_conf *conf = NULL;
>  	FILE *file;
>  	int lineno;
> +	boolean_t is_parse_success = FALSE;
>  
>  	file = fopen(file_name, "r");
>  	if (!file) {
> @@ -753,6 +757,8 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
>  				break;
>  			}
>  
> +			is_parse_success = TRUE;

Doesn't this set is_parse_success on the first good parseable line in the
partition config? I think the easiest change is to set is_parse_success to
TRUE at the top of the routine and set it to FALSE in the 2 places in this
routine where parsing can fail. Does that make sense?

-- Hal

> +
>  			p += len;
>  
>  			if (q) {
> @@ -764,5 +770,8 @@ int osm_prtn_config_parse_file(osm_log_t * p_log, osm_subn_t * p_subn,
>  
>  	fclose(file);
>  
> -	return 0;
> +	if (is_parse_success)
> +		return 0;
> +	else
> +		return -1;
>  }
> 
> 
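
Editorial illustration, not part of the thread: the difference between the
flag handling in the posted patch and the one suggested in this reply can be
seen on a toy parser. Everything below is a sketch under stated assumptions:
the sketch_* names are hypothetical, input lines come from an array instead
of a file, bad lines are simply skipped, and "a line is valid if it contains
'='" is an invented rule that has nothing to do with the real partition
syntax.

```c
#include <stddef.h>
#include <string.h>

/* Invented validity rule, for illustration only. */
static int sketch_line_ok(const char *line)
{
	return strchr(line, '=') != NULL;
}

/*
 * Flag handling as in the posted patch: start at "failed" and flip to
 * "success" as soon as any single line parses.
 * Returns 0 on success, -1 on error.
 */
static int sketch_parse_any_good(const char *const *lines, size_t n)
{
	int is_parse_success = 0;

	for (size_t i = 0; i < n; i++) {
		if (!sketch_line_ok(lines[i]))
			continue;	/* bad line, flag is left untouched */
		is_parse_success = 1;
	}
	return is_parse_success ? 0 : -1;
}

/*
 * Flag handling as suggested in the review: start at "success" and clear
 * the flag wherever parsing fails.
 * Returns 0 on success, -1 on error.
 */
static int sketch_parse_all_good(const char *const *lines, size_t n)
{
	int is_parse_success = 1;

	for (size_t i = 0; i < n; i++) {
		if (!sketch_line_ok(lines[i]))
			is_parse_success = 0;	/* one bad line fails the file */
	}
	return is_parse_success ? 0 : -1;
}
```

For an input with one good and one bad line, the first variant reports
success and the second reports an error. For an input with no lines at all,
the first variant reports an error and the second reports success, so in
this sketch the empty-file case from the cover letter would still need its
own check under the second scheme.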

