From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Thu, 26 Feb 2009 08:33:26 -0600
Subject: [Cluster-devel] unfencing
In-Reply-To: <1235631117.27848.62.camel@cerberus.int.fabbione.net>
References: <20090220214431.GC23911@redhat.com>
 <1235370440.7816.209.camel@cerberus.int.fabbione.net>
 <20090223181530.GB12791@redhat.com>
 <1235413889.7816.256.camel@cerberus.int.fabbione.net>
 <20090223184030.GC12791@redhat.com>
 <1235415175.7816.261.camel@cerberus.int.fabbione.net>
 <20090223190958.GD12791@redhat.com>
 <1235631117.27848.62.camel@cerberus.int.fabbione.net>
Message-ID: <20090226143326.GA8234@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Feb 26, 2009 at 07:51:57AM +0100, Fabio M. Di Nitto wrote:
> On Mon, 2009-02-23 at 13:09 -0600, David Teigland wrote:
> > On Mon, Feb 23, 2009 at 07:52:55PM +0100, Fabio M. Di Nitto wrote:
> > > > A node unfences *itself* when it boots up. As such, power-unfencing doesn't
> > > > make sense; unfencing is only meant to reverse storage fencing.
> > >
> > > What can stop a user to run fence_node -U from another node to do remote
> > > (un)fencing?
> >
> > It would work. Users can do anything they like, that's beside the point.
>
> I was thinking about 2 little points..
>
> Given the time at which fence_node -U will fire, you probably want to
> add a cman_init + cman_is_active + cman_finish loop in fence_node to
> make sure cman is ready to reply to our ccs queries, otherwise we might
> have a race condition at boot time (it might be already there.. didn't
> really check the code). All our daemons do that to give cman time to
> bootstrap.

Yes, good point. I wonder if we'd be better off having cman_tool join
effectively do an is_active wait before exiting? Then we could probably avoid
doing it many other places. (It's also annoying when corosync crashes after
is_active completes, but before I've read what I need from cman/ccs.)

> The second thing would be to set a minimal protection mechanism by
> allowing fence_node -U to be fired only for the node that it is invoking
> it. So if we run on node A, fence_node -U can only execute unfencing
> operations for node A. For testing purposes then we could add a manual
> override such as "--i-understand-this-operation-can-destroy-the-world".

I plan to use "fence_node -U" (no name) to unfence self. I'm inclined to just
allow any node name after that, but not advertise it.
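
For illustration only, the "init + is_active + finish" wait discussed above can be sketched as a small polling loop. This is a rough Python sketch, not the fence_node or cman_tool code: the connect, is_active and disconnect callables are hypothetical stand-ins for cman_init, cman_is_active and cman_finish, and the timeout and interval defaults are assumptions chosen for the example.

```python
import time


def wait_until_active(connect, is_active, disconnect,
                      timeout=30.0, interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the cluster manager reports active, or give up.

    connect, is_active and disconnect are hypothetical stand-ins for
    cman_init, cman_is_active and cman_finish; they are passed in so the
    sketch runs without the real library.
    Returns True once is_active() succeeds, False if the timeout expires.
    """
    deadline = clock() + timeout
    while True:
        handle = connect()          # may fail (None) while the daemon is starting
        if handle is not None:
            try:
                if is_active(handle):
                    return True
            finally:
                disconnect(handle)  # always release the handle before retrying
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would run this once before issuing any ccs queries. Note that it only narrows the boot-time race: as mentioned above, the cluster manager can still go away after the check succeeds, so later queries need their own error handling.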