* [Ocfs2-devel] ocfs2_controld.cman
@ 2009-04-08 21:33 David Teigland
  2009-04-08 22:22 ` Joel Becker
  0 siblings, 1 reply; 10+ messages in thread
From: David Teigland @ 2009-04-08 21:33 UTC (permalink / raw)
  To: ocfs2-devel

If I start ocfs2_controld.cman in parallel on a few nodes, only one of them
starts up, the others exit with one of these errors:

call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
call_section_read at 387: Checkpoint "ocfs2:controld" does not have a section named "daemon_protocol"

call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
call_section_read at 397: Unable to read section "daemon_protocol" from checkpoint "ocfs2:controld": Object does not exist

It does work ok if I remove those two checks.

Another thing I noticed while looking in the code is that it assumes a single
node will become the first member of a cpg on its own when a bunch of nodes
join at once: daemon_joined(daemon_group.cg_member_count == 1);

This isn't a correct assumption.  It's possible that two or more nodes joining
at once will become initial members together.  (I realize that it's a very
convenient assumption to make after using it in previous pre-cpg programs, and
it may take a fair amount of work to do without.)

Dave


* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-08 21:33 [Ocfs2-devel] ocfs2_controld.cman David Teigland
@ 2009-04-08 22:22 ` Joel Becker
  2009-04-09 11:38   ` Andrew Beekhof
                     ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-08 22:22 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Apr 08, 2009 at 04:33:17PM -0500, David Teigland wrote:
> If I start ocfs2_controld.cman in parallel on a few nodes, only one of them
> starts up, the others exit with one of these errors:
> 
> call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> call_section_read at 387: Checkpoint "ocfs2:controld" does not have a section named "daemon_protocol"
> 
> call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> call_section_read at 397: Unable to read section "daemon_protocol" from checkpoint "ocfs2:controld": Object does not exist
> 
> It does work ok if I remove those two checks.

	These checks are required - otherwise you end up with unsync'd
daemons, which is crap.
	I've changed the daemon to wait indefinitely, and that's
something lmb was testing.  See the controld-fixes branch of
ocfs2-tools.git.  That should fix these problems.

> Another thing I noticed while looking in the code is that it assumes a single
> node will become the first member of a cpg on its own when a bunch of nodes
> join at once: daemon_joined(daemon_group.cg_member_count == 1);
> 
> This isn't a correct assumption.  It's possible that two or more nodes joining
> at once will become initial members together.  (I realize that it's a very
> convenient assumption to make after using it in previous pre-cpg programs, and
> it may take a fair amount of work to do without.)

	Well, this is going to be fun.  I have to figure out which
daemon is the "first", and now it's just racy.  I could swear that
someone told me cpg would guarantee i see the joins in order, not at the
same time.

Joel

-- 

"Three o'clock is always too late or too early for anything you
 want to do."
        - Jean-Paul Sartre

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-08 22:22 ` Joel Becker
@ 2009-04-09 11:38   ` Andrew Beekhof
  2009-04-09 16:11     ` David Teigland
  2009-04-09 18:45     ` Joel Becker
  2009-04-09 16:22   ` David Teigland
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Andrew Beekhof @ 2009-04-09 11:38 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
>
>         Well, this is going to be fun.  I have to figure out which
> daemon is the "first", and now it's just racy.  I could swear that
> someone told me cpg would guarantee i see the joins in order, not at the
> same time.

"In order" does not necessarily imply "one node at a time".

I don't consider it unreasonable for two nodes starting (effectively)
simultaneously to appear in the first membership.
I believe Heartbeat had the same property.

Why not just take a lock when you want to create the daemon_protocol
section (and allow the second guy to fail gracefully)?
Perhaps cpg even has something like this built in...


* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-09 11:38   ` Andrew Beekhof
@ 2009-04-09 16:11     ` David Teigland
  2009-04-09 18:44       ` Joel Becker
  2009-04-09 18:45     ` Joel Becker
  1 sibling, 1 reply; 10+ messages in thread
From: David Teigland @ 2009-04-09 16:11 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> >
> >         Well, this is going to be fun.  I have to figure out which
> > daemon is the "first", and now it's just racy.  I could swear that
> > someone told me cpg would guarantee i see the joins in order, not at the
> > same time.
> 
> "In order" does not necessarily imply "one node at a time".
> 
> I don't consider it unreasonable for two nodes starting (effectively)
> simultaneously to appear in the first membership.
> I believe Heartbeat had the same property.

Right, confchg order is guaranteed, but doesn't imply one node per confchg.

> Why not just take a lock when you want to create the daemon_protocol
> section (and allow the second guy to fail gracefully)?
> Perhaps cpg even has something like this built in...

You probably want to use cpg messages to order things.  So, for example,
everyone sends a message proposing that it create the section, and the node
whose message arrives first does it.  If you're coordinating things with
messages like this anyway, it's not much more work to include protocol
information in the message and eliminate checkpoints.

Dave
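
[Editor's sketch] The message-ordering idea above can be illustrated
without any cluster library.  This is a minimal sketch, not
ocfs2_controld code: it assumes every member is handed the same
messages in the same order (the property the thread attributes to cpg),
and the names struct msg, struct election, election_deliver() and
election_i_create() are hypothetical.  Each node feeds delivered
messages into its own copy of the state; the sender of the first
"propose" message becomes the creator everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds; only the proposal matters for the election. */
enum msg_type { MSG_PROPOSE_CREATE, MSG_OTHER };

struct msg {
	uint32_t sender;	/* node id of the sending daemon */
	enum msg_type type;
};

/* Per-node election state, updated only from delivered messages. */
struct election {
	int decided;		/* nonzero once a creator has been chosen */
	uint32_t creator;	/* valid only when decided is nonzero */
};

/*
 * Feed one delivered message into the state.  Returns 1 if this
 * message decided the election, 0 otherwise.  Because every node is
 * assumed to see the same delivery order, every node picks the same
 * creator.
 */
int election_deliver(struct election *e, const struct msg *m)
{
	assert(e != NULL && m != NULL);

	if (e->decided || m->type != MSG_PROPOSE_CREATE)
		return 0;

	e->decided = 1;
	e->creator = m->sender;
	return 1;
}

/* True if the election is decided and this node is the creator. */
int election_i_create(const struct election *e, uint32_t our_nodeid)
{
	assert(e != NULL);
	return e->decided && e->creator == our_nodeid;
}
```

A node would send its own proposal after joining, then call
election_deliver() for every message it receives, including its own,
and create the section only once election_i_create() returns true.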


* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-08 22:22 ` Joel Becker
  2009-04-09 11:38   ` Andrew Beekhof
@ 2009-04-09 16:22   ` David Teigland
  2009-04-09 18:46     ` Joel Becker
  2009-04-10  0:11   ` Joel Becker
  2009-04-14 23:39   ` [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join Joel Becker
  3 siblings, 1 reply; 10+ messages in thread
From: David Teigland @ 2009-04-09 16:22 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> > This isn't a correct assumption.  It's possible that two or more nodes
> > joining at once will become initial members together.  (I realize that
> > it's a very convenient assumption to make after using it in previous
> > pre-cpg programs, and it may take a fair amount of work to do without.)
> 
> Well, this is going to be fun.  I have to figure out which daemon is
> the "first", and now it's just racy.  I could swear that someone told
> me cpg would guarantee i see the joins in order, not at the same time.

It may just work to have both race to create the checkpoint, the loser should
get an error back from create (I haven't tried it, but I'd expect it to work
that way.)

Dave
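
[Editor's sketch] The "race to create, loser gets an error" pattern can
be shown with an ordinary file standing in for the checkpoint.  This is
an analogy only: it uses POSIX open() with O_CREAT | O_EXCL, not the
checkpoint API, and try_create_exclusive() is a hypothetical helper
name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Try to create `path` exclusively.
 * Returns 1 if we created it (we won the race),
 *         0 if it already existed (someone else won, or it is stale),
 *        -1 on any other error.
 */
int try_create_exclusive(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fd >= 0) {
		close(fd);
		return 1;
	}

	return errno == EEXIST ? 0 : -1;
}
```

A return value of 0 cannot tell "another node just created it" apart
from "an old copy has not been cleaned up yet", so the pattern is only
safe when leftovers from a previous run cannot exist.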


* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-09 16:11     ` David Teigland
@ 2009-04-09 18:44       ` Joel Becker
  0 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:44 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Apr 09, 2009 at 11:11:37AM -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> > On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> > >
> > >         Well, this is going to be fun.  I have to figure out which
> > > daemon is the "first", and now it's just racy.  I could swear that
> > > someone told me cpg would guarantee i see the joins in order, not at the
> > > same time.
> > 
> > "In order" does not necessarily imply "one node at a time".
> > 
> > I don't consider it unreasonable for two nodes starting (effectively)
> > simultaneously to appear in the first membership.
> > I believe Heartbeat had the same property.
> 
> Right, confchg order is guaranteed, but doesn't imply one node per confchg.

	I have no problem with more than one join at the same time, but
somehow I had the idea that the first joiner would be alone.  Consider
me corrected.

> > Why not just take a lock when you want to create the daemon_protocol
> > section (and allow the second guy to fail gracefully)?
> > Perhaps cpg even has something like this built in...
> 
> You probably want to use cpg messages to order things.  So, for example,
> everyone sends a message proposing that it create the section, and the node
> whose message arrives first does it.  If you're coordinating things with
> messages like this anyway, it's not much more work to include protocol
> information in the message and eliminate checkpoints.

	Ugh ugh ugh.  This code is already a complex world of states
that are hard to keep in your head.  It only gets worse the more things
in flight.  Checkpoints give us a nice way to look up data about other
nodes without this hassle - they only give us a little pain in this
setup phase.

Joel

-- 

"I am working for the time when unqualified blacks, browns, and
 women join the unqualified men in running our government."
	- Sissy Farenthold



* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-09 11:38   ` Andrew Beekhof
  2009-04-09 16:11     ` David Teigland
@ 2009-04-09 18:45     ` Joel Becker
  1 sibling, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:45 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> >
> >         Well, this is going to be fun.  I have to figure out which
> > daemon is the "first", and now it's just racy.  I could swear that
> > someone told me cpg would guarantee i see the joins in order, not at the
> > same time.
> 
> "In order" does not necessarily imply "one node at a time".
> 
> I don't consider it unreasonable for two nodes starting (effectively)
> simultaneously to appear in the first membership.
> I believe Heartbeat had the same property.
> 
> Why not just take a lock when you want to create the daemon_protocol
> section (and allow the second guy to fail gracefully)?
> Perhaps cpg even has something like this built in...

	I don't want to rely on dlm in this daemon.  These control
daemons are complex enough, they are our only connection between the fs
and the stack, and we need to make them correct.

Joel

-- 

"Ninety feet between bases is perhaps as close as man has ever come
 to perfection."
	- Red Smith



* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-09 16:22   ` David Teigland
@ 2009-04-09 18:46     ` Joel Becker
  0 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:46 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Apr 09, 2009 at 11:22:28AM -0500, David Teigland wrote:
> On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> > > This isn't a correct assumption.  It's possible that two or more nodes
> > > joining at once will become initial members together.  (I realize that
> > > it's a very convenient assumption to make after using it in previous
> > > pre-cpg programs, and it may take a fair amount of work to do without.)
> > 
> > Well, this is going to be fun.  I have to figure out which daemon is
> > the "first", and now it's just racy.  I could swear that someone told
> > me cpg would guarantee i see the joins in order, not at the same time.
> 
> It may just work to have both race to create the checkpoint, the loser should
> get an error back from create (I haven't tried it, but I'd expect it to work
> that way.)

	If only OpenAIS wasn't so loose here.  If my daemon dies and
restarts, the checkpoints I previously created might not have gone away
yet.  So I get EEXIST for a short while until CKPT is done disposing of
them.  ocfs2_controld handles this, but it means we can't rely on
EEXIST.

Joel

-- 

"Baby, even the losers
 Get lucky sometimes.
 Even the losers
 Keep a little bit of pride."



* [Ocfs2-devel] ocfs2_controld.cman
  2009-04-08 22:22 ` Joel Becker
  2009-04-09 11:38   ` Andrew Beekhof
  2009-04-09 16:22   ` David Teigland
@ 2009-04-10  0:11   ` Joel Becker
  2009-04-14 23:39   ` [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join Joel Becker
  3 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-10  0:11 UTC (permalink / raw)
  To: ocfs2-devel

On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> On Wed, Apr 08, 2009 at 04:33:17PM -0500, David Teigland wrote:
> > If I start ocfs2_controld.cman in parallel on a few nodes, only one of them
> > starts up, the others exit with one of these errors:
> > 
> > call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> > call_section_read at 387: Checkpoint "ocfs2:controld" does not have a section named "daemon_protocol"
> > 
> > call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> > call_section_read at 397: Unable to read section "daemon_protocol" from checkpoint "ocfs2:controld": Object does not exist
> > 
> > It does work ok if I remove those two checks.
> 
> 	These checks are required - otherwise you end up with unsync'd
> daemons, which is crap.
> 	I've changed the daemon to wait indefinitely, and that's
> something lmb was testing.  See the controld-fixes branch of
> ocfs2-tools.git.  That should fix these problems.

	These changes are now in the master branch of ocfs2-tools.git.

Joel

-- 

"To fall in love is to create a religion that has a fallible god."
        -Jorge Luis Borges



* [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join.
  2009-04-08 22:22 ` Joel Becker
                     ` (2 preceding siblings ...)
  2009-04-10  0:11   ` Joel Becker
@ 2009-04-14 23:39   ` Joel Becker
  3 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-14 23:39 UTC (permalink / raw)
  To: ocfs2-devel

More than one ocfs2_controld can join the cpg group at the same time,
even for the first members of a group.  So 'joiners == 1' is not a valid
check to see if this controld is the first one.

We change the code to pick the controld with the lowest node number to
set up the daemon control group.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
---
 ocfs2_controld/cpg.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/ocfs2_controld/cpg.c b/ocfs2_controld/cpg.c
index a2d8ae1..5e166b0 100644
--- a/ocfs2_controld/cpg.c
+++ b/ocfs2_controld/cpg.c
@@ -728,6 +728,8 @@ out:
 
 static void daemon_set_cgroup(struct cgroup *cg, void *user_data)
 {
+	int i, first = 1;
+
 	void (*daemon_joined)(int first) = user_data;
 
 	if (cg != &daemon_group) {
@@ -735,7 +737,16 @@ static void daemon_set_cgroup(struct cgroup *cg, void *user_data)
 		return;
 	}
 
-	daemon_joined(daemon_group.cg_member_count == 1);
+	if (daemon_group.cg_cb_member_count != daemon_group.cg_cb_joined_count)
+		first = 0;
+	else {
+		for (i = 0; i < daemon_group.cg_cb_joined_count; i++) {
+			if (daemon_group.cg_cb_joined[i].nodeid < our_nodeid)
+				first = 0;
+		}
+	}
+
+	daemon_joined(first);
 }
 
 int setup_cpg(void (*daemon_joined)(int first))
-- 
1.6.1.3
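
[Editor's sketch] The rule in the patch can be restated as a standalone
function and compared with the old check.  This is a minimal sketch,
not the ocfs2_controld source: is_first_old() and is_first_lowest() are
hypothetical names, plain arrays stand in for the daemon_group fields,
and uint32_t node ids are an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old rule: a daemon is "first" only if it is alone in the group. */
int is_first_old(size_t member_count)
{
	return member_count == 1;
}

/*
 * New rule, mirroring the logic of the patch:
 *  - if the group has more members than joiners in this callback,
 *    someone was already there, so nobody in this batch is first;
 *  - otherwise every member is an initial joiner, and the one with
 *    the lowest node id is first.
 */
int is_first_lowest(const uint32_t *joined, size_t joined_count,
		    size_t member_count, uint32_t our_nodeid)
{
	size_t i;

	assert(joined != NULL || joined_count == 0);

	if (member_count != joined_count)
		return 0;

	for (i = 0; i < joined_count; i++) {
		if (joined[i] < our_nodeid)
			return 0;
	}

	return 1;
}
```

With nodes 5 and 2 joining an empty group together, is_first_old(2) is
false on both nodes, so no daemon sets up the group.  is_first_lowest()
is true only on node 2, so exactly one daemon does.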

-- 

Life's Little Instruction Book #139

	"Never deprive someone of hope; it might be all they have."


