From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Wed, 3 Oct 2012 13:10:55 -0400
Subject: [Cluster-devel] fence daemon problems
In-Reply-To: <24E144B8C0207547AD09C467A8259F75576A15FF@lisa.maurer-it.com>
References: <24E144B8C0207547AD09C467A8259F755768AE73@lisa.maurer-it.com>
 <24E144B8C0207547AD09C467A8259F755769CF56@lisa.maurer-it.com>
 <20121003144614.GB12614@redhat.com>
 <24E144B8C0207547AD09C467A8259F75576A155B@lisa.maurer-it.com>
 <20121003162411.GC12614@redhat.com>
 <24E144B8C0207547AD09C467A8259F75576A15CB@lisa.maurer-it.com>
 <20121003164433.GD12614@redhat.com>
 <24E144B8C0207547AD09C467A8259F75576A15FF@lisa.maurer-it.com>
Message-ID: <20121003171055.GE12614@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Wed, Oct 03, 2012 at 04:55:55PM +0000, Dietmar Maurer wrote:
> > The difficult cases, which I think you're seeing, are partitions where
> > no group has quorum, e.g. 2/2. In this case we do nothing, and the
> > user has to resolve it by resetting some of the nodes
>
> The problem with that is that those 'difficult' cases are very likely.
> For example a switch reboot results in that state if you do not have
> redundant network (yes, I know that this setup is simply wrong).
>
> And things get worse, because it is not possible to reboot such nodes,
> because rgmanager shutdown simply hangs. Is there any way to avoid that,
> so that it is at least possible to reboot those nodes?

Fabio's checkquorum script will reboot nodes that lose quorum.
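
To illustrate why a 2/2 partition leaves no group with quorum, here is a minimal sketch of the arithmetic. It assumes one vote per node and a strict-majority rule; the function name quorate_partitions is hypothetical and is not part of cman, fenced, or the checkquorum script.

```python
def quorate_partitions(partition_sizes):
    """Return the sizes of the partitions that hold quorum.

    Assumption: every node has one vote, and a partition is quorate
    only if it holds strictly more than half of all votes.
    """
    total_votes = sum(partition_sizes)
    return [size for size in partition_sizes if 2 * size > total_votes]


if __name__ == "__main__":
    # Four nodes split 2/2: neither side has a strict majority.
    print(quorate_partitions([2, 2]))
    # Four nodes split 3/1: only the three-node side is quorate.
    print(quorate_partitions([3, 1]))
```

In the 2/2 case the result is empty, so no side is entitled to fence the other, which matches the "we do nothing" behaviour described in the quoted text.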