* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
From: Jacek Konieczny @ 2012-11-03 14:58 UTC
To: cluster-devel.redhat.com

Hello,

The dlm_stonith fencing helper is really convenient when Pacemaker is in
use. Though, it doesn't quite work as I would expect - when fencing
is needed it requests a node to be turned off instead of rebooting. And
it doesn't handle unfencing - so automatic recovery is not possible
(rebooted node could join the cluster cleanly later, provided quorum
handling is properly configured in the cluster stack).

Preferably this behaviour should be configurable. I have hacked a
work-around by (ab)using argv[0] - when 'dlm_stonith' is called as
'dlm_stonith_reboot' the node would be rebooted instead of halting
- this works for me well-enough, but I don't think this is the right
solution.

Any ideas how to solve that properly? An argument for the helper to be
included in the config file? Or, maybe, just change the default
behaviour?

Greets,
Jacek

* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
From: David Teigland @ 2012-11-05 16:10 UTC
To: cluster-devel.redhat.com

On Sat, Nov 03, 2012 at 03:58:28PM +0100, Jacek Konieczny wrote:
> Hello,
>
> The dlm_stonith fencing helper is really convenient when Pacemaker is in
> use. Though, it doesn't quite work as I would expect - when fencing
> is needed it requests a node to be turned off instead of rebooting. And
> it doesn't handle unfencing - so automatic recovery is not possible
> (rebooted node could join the cluster cleanly later, provided quorum
> handling is properly configured in the cluster stack).
>
> Preferably this behaviour should be configurable. I have hacked a
> work-around by (ab)using argv[0] - when 'dlm_stonith' is called as
> 'dlm_stonith_reboot' the node would be rebooted instead of halting
> - this works for me well-enough, but I don't think this is the right
> solution.

Could you send the patch? Do you think the patch is not right or reboot
is not right? If the latter, what do you think is wrong with reboot?

> Any ideas how to solve that properly? An argument for the helper to be
> included in the config file? Or, maybe, just change the default
> behaviour?
>
> Greets,
> Jacek

* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
From: Jacek Konieczny @ 2012-11-05 17:47 UTC
To: cluster-devel.redhat.com

On Mon, Nov 05, 2012 at 11:10:17AM -0500, David Teigland wrote:
> On Sat, Nov 03, 2012 at 03:58:28PM +0100, Jacek Konieczny wrote:
> > Hello,
> >
> > The dlm_stonith fencing helper is really convenient when Pacemaker is in
> > use. Though, it doesn't quite work as I would expect - when fencing
> > is needed it requests a node to be turned off instead of rebooting. And
> > it doesn't handle unfencing - so automatic recovery is not possible
> > (rebooted node could join the cluster cleanly later, provided quorum
> > handling is properly configured in the cluster stack).
> >
> > Preferably this behaviour should be configurable. I have hacked a
> > work-around by (ab)using argv[0] - when 'dlm_stonith' is called as
> > 'dlm_stonith_reboot' the node would be rebooted instead of halting
> > - this works for me well-enough, but I don't think this is the right
> > solution.
>
> Could you send the patch? Do you think the patch is not right or reboot
> is not right? If the latter, what do you think is wrong with reboot?

I don't like the patch, as it would probably be better to pass an extra
argument to the dlm_stonith invocation in the dlm_controld configuration
file. I did not fix it that way, as I was not sure how arguments are
supposed to be passed from the dlm_controld configuration file to the
fencing handlers. I'll send the patch anyway.

As for whether reboot is wrong: I assumed that 'off' was chosen on
purpose and that my attempt to make it reboot instead could be a mistake.
However, I found no reason why reboot would be worse.

Greets,
Jacek

* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
From: Jacek Konieczny @ 2012-11-05 18:05 UTC
To: cluster-devel.redhat.com

My workaround for the missing functionality (or wrong default):

[PATCH] dlm_stonith_{off,reboot} aliases for fence helper

Greets,
Jacek

* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
From: Jacek Konieczny @ 2012-11-05 18:05 UTC
To: cluster-devel.redhat.com

Not so pretty hack to allow rebooting instead of halting the node fenced
by dlm_stonith.

'dlm_stonith_reboot' can be now used instead of 'dlm_stonith' in the
dlm.conf file to request node reboot. 'dlm_stonith_off' alias is also
provided to explicitly request power-off.

Signed-off-by: Jacek Konieczny <jajcus@jajcus.net>
---
 fence/Makefile         |   10 +++++++++-
 fence/stonith_helper.c |    8 +++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/fence/Makefile b/fence/Makefile
index b4c59dd..2f24677 100644
--- a/fence/Makefile
+++ b/fence/Makefile
@@ -6,6 +6,8 @@ BINDIR=$(PREFIX)/sbin
 BIN_TARGET = dlm_stonith
 #MAN_TARGET = dlm_stonith.8
 
+SYMLINKS = dlm_stonith_off dlm_stonith_reboot
+
 BIN_SOURCE = stonith_helper.c
 
 BIN_CFLAGS += -D_GNU_SOURCE -O2 -ggdb \
@@ -37,11 +39,14 @@ BIN_LDFLAGS += -Wl,-z,now -Wl,-z,relro -pie
 BIN_LDFLAGS += `xml2-config --libs`
 BIN_LDFLAGS += -ldl
 
-all: $(BIN_TARGET)
+all: $(BIN_TARGET) $(SYMLINKS)
 
 $(BIN_TARGET): $(BIN_SOURCE)
 	$(CC) $(BIN_SOURCE) $(BIN_CFLAGS) $(BIN_LDFLAGS) -o $@ -L.
 
+$(SYMLINKS): $(BIN_TARGET)
+	for link in $(SYMLINKS) ; do ln -sf $(BIN_TARGET) $$link ; done
+
 clean:
 	rm -f *.o *.so *.so.* $(BIN_TARGET)
 
@@ -53,5 +58,8 @@ install: all
 	$(INSTALL) -d $(DESTDIR)/$(BINDIR)
 	$(INSTALL) -d $(DESTDIR)/$(MANDIR)/man8
 	$(INSTALL) -c -m 755 $(BIN_TARGET) $(DESTDIR)/$(BINDIR)
+	for link in $(SYMLINKS) ; do \
+		ln -sf $(BIN_TARGET) $(DESTDIR)/$(BINDIR)/$$link ; \
+	done
 #	$(INSTALL) -m 644 $(MAN_TARGET) $(DESTDIR)/$(MANDIR)/man8/
 
diff --git a/fence/stonith_helper.c b/fence/stonith_helper.c
index 5b384c1..73a2245 100644
--- a/fence/stonith_helper.c
+++ b/fence/stonith_helper.c
@@ -16,6 +16,7 @@
 
 int nodeid;
 uint64_t fail_time;
+int turn_off = 1;
 
 #define MAX_ARG_LEN 1024
 
@@ -26,6 +27,11 @@ static int get_options(int argc, char *argv[])
 	char val[MAX_ARG_LEN];
 	char c;
 	int rv;
+	int arg0_l;
+
+	arg0_l = strlen(argv[0]);
+	if (arg0_l>7 && !strcmp(argv[0] + arg0_l - 7, "_reboot")) turn_off = 0;
+	else if (arg0_l>4 && !strcmp(argv[0] + arg0_l - 4, "_off")) turn_off = 1;
 
 	if (argc > 1) {
 		while ((c = getopt(argc, argv, "n:t:")) != -1) {
@@ -77,7 +83,7 @@ int main(int argc, char *argv[])
 	if (t >= fail_time)
 		return 0;
 
-	rv = stonith_api_kick_helper(nodeid, 300, 1);
+	rv = stonith_api_kick_helper(nodeid, 300, turn_off);
 	if (rv) {
 		fprintf(stderr, "kick_helper error %d nodeid %d\n", rv, nodeid);
 		openlog("stonith_helper", LOG_CONS | LOG_PID, LOG_DAEMON);
-- 
1.7.7.4

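Note (not part of the thread): the argv[0] dispatch in the patch above compares the end of the invocation name against hard-coded lengths (7 and 4). The same idea can be sketched with a small suffix helper that derives the lengths from the strings. The function names has_suffix() and turn_off_from_name() are made up for this illustration and are not part of the patch or of dlm. The default of 1 (power off) mirrors the "int turn_off = 1;" initialisation in the patch.

```c
#include <string.h>

/* Return 1 if `name` is strictly longer than `suffix` and ends with it,
 * 0 otherwise. This matches the "length > N && !strcmp(...)" checks in
 * the patch without hard-coding N. */
int has_suffix(const char *name, const char *suffix)
{
	size_t name_len = strlen(name);
	size_t suffix_len = strlen(suffix);

	return name_len > suffix_len &&
	       strcmp(name + name_len - suffix_len, suffix) == 0;
}

/* Map the invocation name (argv[0]) to the "off" flag:
 * "..._reboot" selects reboot (0); "..._off" or any other name keeps
 * the power-off default (1), as in the patch. */
int turn_off_from_name(const char *argv0)
{
	if (has_suffix(argv0, "_reboot"))
		return 0;
	return 1;
}
```

A caller would use it as `turn_off = turn_off_from_name(argv[0]);` before option parsing. Because only the tail of the name is compared, it works whether the helper is started through a bare symlink name or a full path.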
* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
From: David Teigland @ 2012-11-05 19:30 UTC
To: cluster-devel.redhat.com

On Mon, Nov 05, 2012 at 07:05:22PM +0100, Jacek Konieczny wrote:
> -	rv = stonith_api_kick_helper(nodeid, 300, 1);
> +	rv = stonith_api_kick_helper(nodeid, 300, turn_off);

I'd like it to be "reboot", but seeing the arg as "bool off" I figured the
opposite would be "on" ... if you're saying that the opposite of off is
actually reboot, then I'll just make it 0.

* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
From: Jacek Konieczny @ 2012-11-05 20:46 UTC
To: cluster-devel.redhat.com

On Mon, Nov 05, 2012 at 02:30:33PM -0500, David Teigland wrote:
> On Mon, Nov 05, 2012 at 07:05:22PM +0100, Jacek Konieczny wrote:
> > -	rv = stonith_api_kick_helper(nodeid, 300, 1);
> > +	rv = stonith_api_kick_helper(nodeid, 300, turn_off);
>
> I'd like it to be "reboot", but seeing the arg as "bool off" I figured the
> opposite would be "on" ... if you're saying that the opposite of off is
> actually reboot, then I'll just make it 0.

If it is 'off' by mistake and not for any better reason, then I think
the best option is to make this '0' by default. I have checked the
Pacemaker sources twice: 'off=0' means 'reboot'.

The effort to make this configurable can be postponed until somebody is
interested in fencing by power-off.

Greets,
Jacek