All of lore.kernel.org
 help / color / mirror / Atom feed
* [Cluster-devel] [DLM][PATCH] lowcomms init fixes
@ 2007-04-20 20:25 Josef Bacik
  2007-04-20 22:51 ` Josef Bacik
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2007-04-20 20:25 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello,

This patch resolves 2 problems.  First, if we tear everything down, the
connection cache and everything, and then go to create it again because we've
registered a new lockspace, we will get a garbage connection for the first
connection, because we never destroy the idr, which will still be holding
references to memory.  So this patch init's the connections_idr everytime
lowcomms starts so we can make sure we don't have this problem.  The second
problem is that when we go to destroy all of the connections and remove the
dlm_conn kmem cache, it will complain because there are still things allocated.
This is because we generally do this

for (i = 0; i < max_nodeid; i++)

where max_nodeid is set to whatever the highest nodeid is requested, which
starts@0, so we will always leave 1 thing allocated.  This patch makes us set
max_nodid appropriately, and fixes an inconsistent for loop in the code.  Thank
you,

Josef

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 2b32f3c..66986be 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -196,7 +196,7 @@ static struct connection *__nodeid2con(i
 	}
 
 	if (nodeid > max_nodeid)
-		max_nodeid = nodeid;
+		max_nodeid = nodeid + 1;
 
 	return con;
 }
@@ -380,7 +380,7 @@ static void sctp_init_failed(void)
 	struct connection *con;
 
 	down(&connections_lock);
-	for (i=1; i<=max_nodeid; i++) {
+	for (i=1; i<max_nodeid; i++) {
 		con = __nodeid2con(i, 0);
 		if (!con)
 			continue;
@@ -1424,6 +1424,7 @@ int dlm_lowcomms_start(void)
 	int error = -EINVAL;
 	struct connection *con;
 
+	idr_init(&connections_idr);
 	init_local();
 	if (!dlm_local_count) {
 		log_print("no local IP address has been set");
@@ -1437,6 +1438,9 @@ int dlm_lowcomms_start(void)
 	if (!con_cache)
 		goto out;
 
+	/* set a default max_nodeid, it will be reset if we need more */
+	max_nodeid = 0;
+
 	/* Set some sysctl minima */
 	if (sysctl_rmem_max < NEEDED_RMEM)
 		sysctl_rmem_max = NEEDED_RMEM;



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Cluster-devel] [DLM][PATCH] lowcomms init fixes
  2007-04-20 20:25 [Cluster-devel] [DLM][PATCH] lowcomms init fixes Josef Bacik
@ 2007-04-20 22:51 ` Josef Bacik
  2007-04-23 10:19   ` Patrick Caulfield
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2007-04-20 22:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Fri, Apr 20, 2007 at 04:25:52PM -0400, Josef Bacik wrote:
> Hello,
> 
> This patch resolves 2 problems.  First, if we tear everything down, the
> connection cache and everything, and then go to create it again because we've
> registered a new lockspace, we will get a garbage connection for the first
> connection, because we never destroy the idr, which will still be holding
> references to memory.  So this patch init's the connections_idr everytime
> lowcomms starts so we can make sure we don't have this problem.  The second
> problem is that when we go to destroy all of the connections and remove the
> dlm_conn kmem cache, it will complain because there are still things allocated.
> This is because we generally do this
> 
> for (i = 0; i < max_nodeid; i++)
> 
> where max_nodeid is set to whatever the highest nodeid is requested, which
> starts at 0, so we will always leave 1 thing allocated.  This patch makes us set
> max_nodid appropriately, and fixes an inconsistent for loop in the code.  Thank
> you,

Ok I uncovered some other ugliness, when you go to run dlm_lowcomms_close when
shutting down, you will get stuck in an infinite loop, because we had previously
run dlm_lowcomms_stop when clvmd stopped, so we have freed all of our
connections and we pull a garbage connection off of the idr, because we don't
init it until we go to start the lowcomms again.  So this is the same patch as
before, except instead of init'ing the idr everytime we start lowcomms, we init
it when we stop lowcomms so we don't have to worry about this kind of problem.
Thank you,

Josef

Signed-off-by: Josef Bacik <jwhiter@redhat.com>

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 2b32f3c..6fe250c 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -196,7 +196,7 @@ static struct connection *__nodeid2con(i
 	}
 
 	if (nodeid > max_nodeid)
-		max_nodeid = nodeid;
+		max_nodeid = nodeid + 1;
 
 	return con;
 }
@@ -380,7 +380,7 @@ static void sctp_init_failed(void)
 	struct connection *con;
 
 	down(&connections_lock);
-	for (i=1; i<=max_nodeid; i++) {
+	for (i=1; i<max_nodeid; i++) {
 		con = __nodeid2con(i, 0);
 		if (!con)
 			continue;
@@ -1417,6 +1417,7 @@ void dlm_lowcomms_stop(void)
 	}
 	up(&connections_lock);
 	kmem_cache_destroy(con_cache);
+	idr_init(&connections_idr);
 }
 
 int dlm_lowcomms_start(void)
@@ -1437,6 +1438,9 @@ int dlm_lowcomms_start(void)
 	if (!con_cache)
 		goto out;
 
+	/* set a default max_nodeid, it will be reset if we need more */
+	max_nodeid = 0;
+
 	/* Set some sysctl minima */
 	if (sysctl_rmem_max < NEEDED_RMEM)
 		sysctl_rmem_max = NEEDED_RMEM;



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Cluster-devel] [DLM][PATCH] lowcomms init fixes
  2007-04-20 22:51 ` Josef Bacik
@ 2007-04-23 10:19   ` Patrick Caulfield
  2007-04-23 15:15     ` Josef Bacik
  0 siblings, 1 reply; 4+ messages in thread
From: Patrick Caulfield @ 2007-04-23 10:19 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Thanks for finding & fixing those bugs Josef.
I've changed the patch a little, making the comparison always <= max_nodeid, I
prefer max_modeid to be what it says it as and not 1 more than the max node ID.

I'm a little concerned about the need for idr_init() in lowcomms_close(). Surely
if the idr was properly cleaned out it would not be necessary - though somehow
it seems to be, Do you know what's actually happening here ?

Anyway, here's the modified patch. it seems to work for me, but I'd be grateful
if you could run it through your tests too, cos I'm supposed to be working on
something else@the moment!

Steve, can you wait till Josef has tested this patch before adding it to the git
tree? thanks.

--
Patrick

Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street,
Windsor, Berkshire, SL4 ITE, UK.
Registered in England and Wales under Company Registration No. 3798903
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lowcomms-fix.patch
Type: text/x-patch
Size: 2509 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20070423/dea123b8/attachment.bin>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Cluster-devel] [DLM][PATCH] lowcomms init fixes
  2007-04-23 10:19   ` Patrick Caulfield
@ 2007-04-23 15:15     ` Josef Bacik
  0 siblings, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2007-04-23 15:15 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Apr 23, 2007 at 11:19:52AM +0100, Patrick Caulfield wrote:
> Thanks for finding & fixing those bugs Josef.
> I've changed the patch a little, making the comparison always <= max_nodeid, I
> prefer max_modeid to be what it says it as and not 1 more than the max node ID.
> 
> I'm a little concerned about the need for idr_init() in lowcomms_close(). Surely
> if the idr was properly cleaned out it would not be necessary - though somehow
> it seems to be, Do you know what's actually happening here ?
> 
> Anyway, here's the modified patch. it seems to work for me, but I'd be grateful
> if you could run it through your tests too, cos I'm supposed to be working on
> something else at the moment!
> 
> Steve, can you wait till Josef has tested this patch before adding it to the git
> tree? thanks.
> 

K I tested it and it works great, thanks much Patrick.

Josef



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-04-23 15:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-20 20:25 [Cluster-devel] [DLM][PATCH] lowcomms init fixes Josef Bacik
2007-04-20 22:51 ` Josef Bacik
2007-04-23 10:19   ` Patrick Caulfield
2007-04-23 15:15     ` Josef Bacik

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.