From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Wed, 12 Aug 2009 16:48:10 -0500
Subject: [Cluster-devel] disallowed in cluster3
In-Reply-To: <4A7A8175.3010109@redhat.com>
References: <20090805165234.GB17292@redhat.com> <4A7A8175.3010109@redhat.com>
Message-ID: <20090812214810.GF7564@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Aug 06, 2009 at 08:08:37AM +0100, Christine Caulfield wrote:
> On 05/08/09 17:52, David Teigland wrote:
> >When rewriting daemons for cluster3 to remove groupd, I wrote them to not
> >need or use the disallowed-nodes feature from cman for handling remerging
> >of cluster partitions.  In compat mode, however, (the cluster2 code) they
> >would still depend on that cman feature, which is why it still exists in
> >cluster3 cman.
> >
> >I've found, though, that when we do have a partition remerge, cman's
> >disallowed feature gets in the way of the daemons trying to handle it
> >themselves (which I'm testing, it doesn't seem quite right in all
> >partitioning/merging cases yet.)
> >
> >So, I think what we need is for cluster3 cman to turn off the disallowed
> >feature unless the cluster is in compat mode, i.e.
> ><group groupd_compat="1"/> exists.
>
> Thanks, I've been looking for an excuse to add code to cluster3 to
> disable that mode ;-)

Trying this out and testing the daemon's handling of partition merging
situations... I uncovered one of my own comments in fenced:

/* We don't require cman dirty/disallowed to detect and handle cpg merges after
   a partition, because we already do that with started_count checks and our
   own disallowed flag.  But, we do need cman dirty/disallowed to deal with
   correctly skipping victims that rejoin the cluster.  Without cman
   dirty/disallowed, we'd skip fencing a node after a merge of a partition since
   the merged node would be a cman member and a fenced:daemon cpg member.  By
   setting the dirty flag, cman won't report a dirty merged node as a member,
   so we'll continue fencing it. */

So, the primary reason for cman disallowed is gone, but I'd forgotten about
this little unsolved detail around bypassing fencing of a node that has
rebooted and rejoined (distinguishing that from partitioned and rejoined).
There really should be a way for fenced to deal with that without punting it
off to cman disallowed, I'll be working on that.

Dave