* [Ocfs2-devel] ocfs2_controld.cman
2009-04-08 22:22 ` Joel Becker
@ 2009-04-09 11:38 ` Andrew Beekhof
2009-04-09 16:11 ` David Teigland
2009-04-09 18:45 ` Joel Becker
2009-04-09 16:22 ` David Teigland
` (2 subsequent siblings)
3 siblings, 2 replies; 10+ messages in thread
From: Andrew Beekhof @ 2009-04-09 11:38 UTC (permalink / raw)
To: ocfs2-devel
On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
>
> Well, this is going to be fun. I have to figure out which
> daemon is the "first", and now it's just racy. I could swear that
> someone told me cpg would guarantee i see the joins in order, not at the
> same time.
"In order" does not necessarily imply "one node at a time".
I don't consider it unreasonable for two nodes starting (effectively)
simultaneously to appear in the first membership.
I believe Heartbeat had the same property.
Why not just take a lock when you want to create the daemon_protocol
section (and allow the second guy to fail gracefully)?
Perhaps cpg even has something like this built in...
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-09 11:38 ` Andrew Beekhof
@ 2009-04-09 16:11 ` David Teigland
2009-04-09 18:44 ` Joel Becker
2009-04-09 18:45 ` Joel Becker
1 sibling, 1 reply; 10+ messages in thread
From: David Teigland @ 2009-04-09 16:11 UTC (permalink / raw)
To: ocfs2-devel
On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> >
> > Well, this is going to be fun. I have to figure out which
> > daemon is the "first", and now it's just racy. I could swear that
> > someone told me cpg would guarantee i see the joins in order, not at the
> > same time.
>
> "In order" does not necessarily imply "one node at a time".
>
> I don't consider it unreasonable for two nodes starting (effectively)
> simultaneously to appear in the first membership.
> I believe Heartbeat had the same property.
Right, confchg order is guaranteed, but doesn't imply one node per confchg.
> Why not just take a lock when you want to create the daemon_protocol
> section (and allow the second guy to fail gracefully)?
> Perhaps cpg even has something like this built in...
You probably want to use cpg messages to order things. So, for example,
everyone sends a message proposing that it create the section, and the node
whose message arrives first does it. If you're coordinating things with
messages like this anyway, it's not much more work to include protocol
information in the message and eliminate checkpoints.
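A minimal sketch of the "first proposal delivered wins" idea, assuming only that every node sees the proposals in the same order. The names `struct proposal`, `struct election`, `election_deliver` and `election_we_won` are hypothetical and are not part of ocfs2_controld or the cpg API; the real daemon would feed this from its cpg deliver callback.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical proposal message; the protocol fields ride along with it. */
struct proposal {
	uint32_t nodeid;	/* sender of the proposal */
	uint32_t proto_major;	/* protocol information carried in the message */
	uint32_t proto_minor;
};

/* Per-node election state; zero-initialise before the first delivery. */
struct election {
	int decided;
	struct proposal winner;
};

/*
 * Call once per delivered proposal, in delivery order.
 * Returns 1 if this delivery decided the election, 0 if it was already decided.
 */
static int election_deliver(struct election *e, const struct proposal *p)
{
	if (e->decided)
		return 0;
	e->winner = *p;
	e->decided = 1;
	return 1;
}

/* Returns nonzero if the local node sent the winning proposal. */
static int election_we_won(const struct election *e, uint32_t our_nodeid)
{
	return e->decided && e->winner.nodeid == our_nodeid;
}
```

Because each node applies the same rule to the same ordered stream, all nodes agree on the winner without a lock, and the winner's protocol fields are available to everyone from the message itself.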
Dave
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-09 16:11 ` David Teigland
@ 2009-04-09 18:44 ` Joel Becker
0 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:44 UTC (permalink / raw)
To: ocfs2-devel
On Thu, Apr 09, 2009 at 11:11:37AM -0500, David Teigland wrote:
> On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> > On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> > >
> > > Well, this is going to be fun. I have to figure out which
> > > daemon is the "first", and now it's just racy. I could swear that
> > > someone told me cpg would guarantee i see the joins in order, not at the
> > > same time.
> >
> > "In order" does not necessarily imply "one node at a time".
> >
> > I don't consider it unreasonable for two nodes starting (effectively)
> > simultaneously to appear in the first membership.
> > I believe Heartbeat had the same property.
>
> Right, confchg order is guaranteed, but doesn't imply one node per confchg.
I have no problem with more than one join at the same time, but
somehow I had the idea that the first joiner would be alone. Consider
me corrected.
> > Why not just take a lock when you want to create the daemon_protocol
> > section (and allow the second guy to fail gracefully)?
> > Perhaps cpg even has something like this built in...
>
> You probably want to use cpg messages to order things. So, for example,
> everyone sends a message proposing that it create the section, and the node
> whose message arrives first does it. If you're coordinating things with
> messages like this anyway, it's not much more work to include protocol
> information in the message and eliminate checkpoints.
Ugh ugh ugh. This code is already a complex world of states
that are hard to keep in your head. It only gets worse the more things
in flight. Checkpoints give us a nice way to look up data about other
nodes without this hassle - they only give us a little pain in this
setup phase.
Joel
--
"I am working for the time when unqualified blacks, browns, and
women join the unqualified men in running our government."
- Sissy Farenthold
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-09 11:38 ` Andrew Beekhof
2009-04-09 16:11 ` David Teigland
@ 2009-04-09 18:45 ` Joel Becker
1 sibling, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:45 UTC (permalink / raw)
To: ocfs2-devel
On Thu, Apr 09, 2009 at 01:38:10PM +0200, Andrew Beekhof wrote:
> On Thu, Apr 9, 2009 at 00:22, Joel Becker <Joel.Becker@oracle.com> wrote:
> >
> > Well, this is going to be fun. I have to figure out which
> > daemon is the "first", and now it's just racy. I could swear that
> > someone told me cpg would guarantee i see the joins in order, not at the
> > same time.
>
> "In order" does not necessarily imply "one node at a time".
>
> I don't consider it unreasonable for two nodes starting (effectively)
> simultaneously to appear in the first membership.
> I believe Heartbeat had the same property.
>
> Why not just take a lock when you want to create the daemon_protocol
> section (and allow the second guy to fail gracefully)?
> Perhaps cpg even has something like this built in...
I don't want to rely on dlm in this daemon. These control
daemons are complex enough, they are our only connection between the fs
and the stack, and we need to make them correct.
Joel
--
"Ninety feet between bases is perhaps as close as man has ever come
to perfection."
- Red Smith
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-08 22:22 ` Joel Becker
2009-04-09 11:38 ` Andrew Beekhof
@ 2009-04-09 16:22 ` David Teigland
2009-04-09 18:46 ` Joel Becker
2009-04-10 0:11 ` Joel Becker
2009-04-14 23:39 ` [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join Joel Becker
3 siblings, 1 reply; 10+ messages in thread
From: David Teigland @ 2009-04-09 16:22 UTC (permalink / raw)
To: ocfs2-devel
On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> > This isn't a correct assumption. It's possible that two or more nodes
> > joining at once will become initial members together. (I realize that
> > it's a very convenient assumption to make after using it in previous
> > pre-cpg programs, and it may take a fair amount of work to do without.)
>
> Well, this is going to be fun. I have to figure out which daemon is
> the "first", and now it's just racy. I could swear that someone told
> me cpg would guarantee i see the joins in order, not at the same time.
It may just work to have both race to create the checkpoint, the loser should
get an error back from create (I haven't tried it, but I'd expect it to work
that way.)
Dave
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-09 16:22 ` David Teigland
@ 2009-04-09 18:46 ` Joel Becker
0 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-09 18:46 UTC (permalink / raw)
To: ocfs2-devel
On Thu, Apr 09, 2009 at 11:22:28AM -0500, David Teigland wrote:
> On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> > > This isn't a correct assumption. It's possible that two or more nodes
> > > joining at once will become initial members together. (I realize that
> > > it's a very convenient assumption to make after using it in previous
> > > pre-cpg programs, and it may take a fair amount of work to do without.)
> >
> > Well, this is going to be fun. I have to figure out which daemon is
> > the "first", and now it's just racy. I could swear that someone told
> > me cpg would guarantee i see the joins in order, not at the same time.
>
> It may just work to have both race to create the checkpoint, the loser should
> get an error back from create (I haven't tried it, but I'd expect it to work
> that way.)
If only OpenAIS wasn't so loose here. If my daemon dies and
restarts, the checkpoints I previously created might not have gone away
yet. So I get EEXIST for a short while until CKPT is done disposing of
them. ocfs2_controld handles this, but it means we can't rely on
EEXIST.
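The create race and its caveat can be illustrated with a POSIX file analogy. This is a sketch only: an exclusive `open()` stands in for checkpoint creation, it does not use the OpenAIS CKPT API, and `create_exclusive` is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to create `path` exclusively.
 * Returns 1 if this caller created it (won the race),
 *         0 if it already existed (EEXIST),
 *        -1 on any other error, with errno set by open().
 */
static int create_exclusive(const char *path)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);

	if (fd >= 0) {
		close(fd);
		return 1;
	}
	return errno == EEXIST ? 0 : -1;
}
```

A return of 0 says only that the object exists. It cannot tell a peer that just won the race apart from a leftover of a previous run that has not been cleaned up yet, which is the ambiguity described above.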
Joel
--
"Baby, even the losers
Get lucky sometimes.
Even the losers
Keep a little bit of pride."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] ocfs2_controld.cman
2009-04-08 22:22 ` Joel Becker
2009-04-09 11:38 ` Andrew Beekhof
2009-04-09 16:22 ` David Teigland
@ 2009-04-10 0:11 ` Joel Becker
2009-04-14 23:39 ` [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join Joel Becker
3 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-10 0:11 UTC (permalink / raw)
To: ocfs2-devel
On Wed, Apr 08, 2009 at 03:22:37PM -0700, Joel Becker wrote:
> On Wed, Apr 08, 2009 at 04:33:17PM -0500, David Teigland wrote:
> > If I start ocfs2_controld.cman in parallel on a few nodes, only one of them
> > starts up, the others exit with one of these errors:
> >
> > call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> > call_section_read at 387: Checkpoint "ocfs2:controld" does not have a section named "daemon_protocol"
> >
> > call_section_read at 370: Reading from section "daemon_protocol" on checkpoint "ocfs2:controld" (try 1)
> > call_section_read at 397: Unable to read section "daemon_protocol" from checkpoint "ocfs2:controld": Object does not exist
> >
> > It does work ok if I remove those two checks.
>
> These checks are required - otherwise you end up with unsync'd
> daemons, which is crap.
> I've changed the daemon to wait indefinitely, and that's
> something lmb was testing. See the controld-fixes branch of
> ocfs2-tools.git. That should fix these problems.
These changes are now in the master branch of ocfs2-tools.git.
Joel
--
"To fall in love is to create a religion that has a fallible god."
-Jorge Luis Borges
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Ocfs2-devel] [PATCH] ocfs2_controld: Handle simultaneous group join.
2009-04-08 22:22 ` Joel Becker
` (2 preceding siblings ...)
2009-04-10 0:11 ` Joel Becker
@ 2009-04-14 23:39 ` Joel Becker
3 siblings, 0 replies; 10+ messages in thread
From: Joel Becker @ 2009-04-14 23:39 UTC (permalink / raw)
To: ocfs2-devel
More than one ocfs2_controld can join the cpg group at the same time,
even for the first members of a group. So 'joiners == 1' is not a valid
check to see if this controld is the first one.
We change the code to pick the controld with the lowest node number to
set up the daemon control group.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
---
ocfs2_controld/cpg.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)
diff --git a/ocfs2_controld/cpg.c b/ocfs2_controld/cpg.c
index a2d8ae1..5e166b0 100644
--- a/ocfs2_controld/cpg.c
+++ b/ocfs2_controld/cpg.c
@@ -728,6 +728,8 @@ out:
 
 static void daemon_set_cgroup(struct cgroup *cg, void *user_data)
 {
+	int i, first = 1;
+
 	void (*daemon_joined)(int first) = user_data;
 
 	if (cg != &daemon_group) {
@@ -735,7 +737,16 @@ static void daemon_set_cgroup(struct cgroup *cg, void *user_data)
 		return;
 	}
 
-	daemon_joined(daemon_group.cg_member_count == 1);
+	if (daemon_group.cg_cb_member_count != daemon_group.cg_cb_joined_count)
+		first = 0;
+	else {
+		for (i = 0; i < daemon_group.cg_cb_joined_count; i++) {
+			if (daemon_group.cg_cb_joined[i].nodeid < our_nodeid)
+				first = 0;
+		}
+	}
+
+	daemon_joined(first);
 }
 
 int setup_cpg(void (*daemon_joined)(int first))
--
1.6.1.3
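The selection rule in the patch can be tried in isolation with the sketch below. It assumes a simplified `struct member` and a hypothetical helper `is_first_daemon`; neither exists in ocfs2_controld, and the parameters stand in for the `cg_cb_*` fields and `our_nodeid` used in the patch.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for one entry of the joined list. */
struct member {
	uint32_t nodeid;
};

/*
 * Returns 1 if the local daemon should set up the daemon control group:
 * every current member joined in this same configuration change, and no
 * joiner has a lower node id than ours. Returns 0 otherwise.
 */
static int is_first_daemon(const struct member *joined, int joined_count,
			   int member_count, uint32_t our_nodeid)
{
	int i;

	/* Some member was already present, so the group was set up earlier. */
	if (member_count != joined_count)
		return 0;

	for (i = 0; i < joined_count; i++) {
		if (joined[i].nodeid < our_nodeid)
			return 0;
	}
	return 1;
}
```

With nodes 1 and 3 joining together, only node 1 gets a nonzero result, so exactly one daemon does the setup even though both see two joiners.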
--
Life's Little Instruction Book #139
"Never deprive someone of hope; it might be all they have."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
^ permalink raw reply related [flat|nested] 10+ messages in thread