* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
@ 2012-11-03 14:58 Jacek Konieczny
2012-11-05 16:10 ` [Cluster-devel] cluster4 dlm dlm_stonith – " David Teigland
0 siblings, 1 reply; 7+ messages in thread
From: Jacek Konieczny @ 2012-11-03 14:58 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hello,
The dlm_stonith fencing helper is really convenient when Pacemaker is in
use. Though, it doesn't quite work as I would expect – when fencing
is needed it requests a node to be turned off instead of rebooting. And
it doesn't handle unfencing – so automatic recovery is not possible
(rebooted node could join the cluster cleanly later, provided quorum
handling is properly configured in the cluster stack).
Preferably this behaviour should be configurable. I have hacked a
work-around by (ab)using argv[0] – when 'dlm_stonith' is called as
'dlm_stonith_reboot' the node would be rebooted instead of halting
– this works for me well-enough, but I don't think this is the right
solution.
Any ideas how to solve that properly? An argument for the helper to be
included in the config file? Or, maybe, just change the default
behaviour?
Greets,
Jacek
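For illustration, the argv[0] work-around described in this message can be sketched as a small standalone helper. This is a minimal sketch under assumptions, not the patch itself: the names `has_suffix` and `off_flag_from_argv0` are hypothetical, and the return value only mirrors the "off" flag discussed in this thread (1 = power off, 0 = reboot).

```c
#include <string.h>

/* Returns 1 if `name` is longer than `suffix` and ends with it, else 0. */
static int has_suffix(const char *name, const char *suffix)
{
	size_t nl = strlen(name);
	size_t sl = strlen(suffix);

	return nl > sl && strcmp(name + nl - sl, suffix) == 0;
}

/*
 * Hypothetical helper: choose the "off" flag from the name the binary
 * was invoked as. A "_reboot" suffix selects reboot (0), an "_off"
 * suffix selects power-off (1), anything else keeps the given default.
 */
static int off_flag_from_argv0(const char *argv0, int default_off)
{
	if (has_suffix(argv0, "_reboot"))
		return 0;
	if (has_suffix(argv0, "_off"))
		return 1;
	return default_off;
}
```

With symlinks named dlm_stonith_reboot and dlm_stonith_off pointing at the same binary, the configuration file selects the behaviour just by naming one alias or the other.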
* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
2012-11-03 14:58 [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off? Jacek Konieczny
@ 2012-11-05 16:10 ` David Teigland
2012-11-05 17:47 ` Jacek Konieczny
0 siblings, 1 reply; 7+ messages in thread
From: David Teigland @ 2012-11-05 16:10 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Sat, Nov 03, 2012 at 03:58:28PM +0100, Jacek Konieczny wrote:
> Hello,
>
> The dlm_stonith fencing helper is really convenient when Pacemaker is in
> use. Though, it doesn't quite work as I would expect – when fencing
> is needed it requests a node to be turned off instead of rebooting. And
> it doesn't handle unfencing – so automatic recovery is not possible
> (rebooted node could join the cluster cleanly later, provided quorum
> handling is properly configured in the cluster stack).
>
> Preferably this behaviour should be configurable. I have hacked a
> work-around by (ab)using argv[0] – when 'dlm_stonith' is called as
> 'dlm_stonith_reboot' the node would be rebooted instead of halting
> – this works for me well-enough, but I don't think this is the right
> solution.
Could you send the patch? Do you think the patch is not right or reboot
is not right? If the latter, what do you think is wrong with reboot?
> Any ideas how to solve that properly? An argument for the helper to be
> included in the config file? Or, maybe, just change the default
> behaviour?
>
> Greets,
> Jacek
* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
2012-11-05 16:10 ` [Cluster-devel] cluster4 dlm dlm_stonith – " David Teigland
@ 2012-11-05 17:47 ` Jacek Konieczny
2012-11-05 18:05 ` Jacek Konieczny
0 siblings, 1 reply; 7+ messages in thread
From: Jacek Konieczny @ 2012-11-05 17:47 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Mon, Nov 05, 2012 at 11:10:17AM -0500, David Teigland wrote:
> On Sat, Nov 03, 2012 at 03:58:28PM +0100, Jacek Konieczny wrote:
> > Hello,
> >
> > The dlm_stonith fencing helper is really convenient when Pacemaker is in
> > use. Though, it doesn't quite work as I would expect – when fencing
> > is needed it requests a node to be turned off instead of rebooting. And
> > it doesn't handle unfencing – so automatic recovery is not possible
> > (rebooted node could join the cluster cleanly later, provided quorum
> > handling is properly configured in the cluster stack).
> >
> > Preferably this behaviour should be configurable. I have hacked a
> > work-around by (ab)using argv[0] – when 'dlm_stonith' is called as
> > 'dlm_stonith_reboot' the node would be rebooted instead of halting
> > – this works for me well-enough, but I don't think this is the right
> > solution.
>
> Could you send the patch? Do you think the patch is not right or reboot
> is not right? If the latter, what do you think is wrong with reboot?
I don't like the patch because it would probably be better to pass an extra
argument to the dlm_stonith invocation in the dlm_controld configuration file.
I did not fix it that way, as I was not sure how arguments are supposed to be
passed from the dlm_controld configuration file to the fencing handlers.
I'll send the patch anyway.
As for whether reboot is wrong: I assumed the 'off' was chosen on purpose and
that my attempt to make it reboot instead could be a mistake. Though, I
found no reason why reboot would be worse.
Greets,
Jacek
* [Cluster-devel] cluster4 dlm dlm_stonith – should it really fence by turning node off?
2012-11-05 17:47 ` Jacek Konieczny
@ 2012-11-05 18:05 ` Jacek Konieczny
2012-11-05 18:05 ` [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper Jacek Konieczny
0 siblings, 1 reply; 7+ messages in thread
From: Jacek Konieczny @ 2012-11-05 18:05 UTC (permalink / raw)
To: cluster-devel.redhat.com
My workaround for the missing functionality (or wrong default):
[PATCH] dlm_stonith_{off,reboot} aliases for fence helper
Greets,
Jacek
* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
2012-11-05 18:05 ` Jacek Konieczny
@ 2012-11-05 18:05 ` Jacek Konieczny
2012-11-05 19:30 ` David Teigland
0 siblings, 1 reply; 7+ messages in thread
From: Jacek Konieczny @ 2012-11-05 18:05 UTC (permalink / raw)
To: cluster-devel.redhat.com
A not-so-pretty hack to allow rebooting instead of halting
the node fenced by dlm_stonith.
'dlm_stonith_reboot' can now be used instead of 'dlm_stonith'
in the dlm.conf file to request a node reboot.
The 'dlm_stonith_off' alias is also provided to explicitly request
power-off.
Signed-off-by: Jacek Konieczny <jajcus@jajcus.net>
---
fence/Makefile | 10 +++++++++-
fence/stonith_helper.c | 8 +++++++-
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/fence/Makefile b/fence/Makefile
index b4c59dd..2f24677 100644
--- a/fence/Makefile
+++ b/fence/Makefile
@@ -6,6 +6,8 @@ BINDIR=$(PREFIX)/sbin
BIN_TARGET = dlm_stonith
#MAN_TARGET = dlm_stonith.8
+SYMLINKS = dlm_stonith_off dlm_stonith_reboot
+
BIN_SOURCE = stonith_helper.c
BIN_CFLAGS += -D_GNU_SOURCE -O2 -ggdb \
@@ -37,11 +39,14 @@ BIN_LDFLAGS += -Wl,-z,now -Wl,-z,relro -pie
BIN_LDFLAGS += `xml2-config --libs`
BIN_LDFLAGS += -ldl
-all: $(BIN_TARGET)
+all: $(BIN_TARGET) $(SYMLINKS)
$(BIN_TARGET): $(BIN_SOURCE)
$(CC) $(BIN_SOURCE) $(BIN_CFLAGS) $(BIN_LDFLAGS) -o $@ -L.
+$(SYMLINKS): $(BIN_TARGET)
+ for link in $(SYMLINKS) ; do ln -sf $(BIN_TARGET) $$link ; done
+
clean:
rm -f *.o *.so *.so.* $(BIN_TARGET)
@@ -53,5 +58,8 @@ install: all
$(INSTALL) -d $(DESTDIR)/$(BINDIR)
$(INSTALL) -d $(DESTDIR)/$(MANDIR)/man8
$(INSTALL) -c -m 755 $(BIN_TARGET) $(DESTDIR)/$(BINDIR)
+ for link in $(SYMLINKS) ; do \
+ ln -sf $(BIN_TARGET) $(DESTDIR)/$(BINDIR)/$$link ; \
+ done
# $(INSTALL) -m 644 $(MAN_TARGET) $(DESTDIR)/$(MANDIR)/man8/
diff --git a/fence/stonith_helper.c b/fence/stonith_helper.c
index 5b384c1..73a2245 100644
--- a/fence/stonith_helper.c
+++ b/fence/stonith_helper.c
@@ -16,6 +16,7 @@
int nodeid;
uint64_t fail_time;
+int turn_off = 1;
#define MAX_ARG_LEN 1024
@@ -26,6 +27,11 @@ static int get_options(int argc, char *argv[])
char val[MAX_ARG_LEN];
char c;
int rv;
+ int arg0_l;
+
+ arg0_l = strlen(argv[0]);
+ if (arg0_l>7 && !strcmp(argv[0] + arg0_l - 7, "_reboot")) turn_off = 0;
+ else if (arg0_l>4 && !strcmp(argv[0] + arg0_l - 4, "_off")) turn_off = 1;
if (argc > 1) {
while ((c = getopt(argc, argv, "n:t:")) != -1) {
@@ -77,7 +83,7 @@ int main(int argc, char *argv[])
if (t >= fail_time)
return 0;
- rv = stonith_api_kick_helper(nodeid, 300, 1);
+ rv = stonith_api_kick_helper(nodeid, 300, turn_off);
if (rv) {
fprintf(stderr, "kick_helper error %d nodeid %d\n", rv, nodeid);
openlog("stonith_helper", LOG_CONS | LOG_PID, LOG_DAEMON);
--
1.7.7.4
* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
2012-11-05 18:05 ` [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper Jacek Konieczny
@ 2012-11-05 19:30 ` David Teigland
2012-11-05 20:46 ` Jacek Konieczny
0 siblings, 1 reply; 7+ messages in thread
From: David Teigland @ 2012-11-05 19:30 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Mon, Nov 05, 2012 at 07:05:22PM +0100, Jacek Konieczny wrote:
> - rv = stonith_api_kick_helper(nodeid, 300, 1);
> + rv = stonith_api_kick_helper(nodeid, 300, turn_off);
I'd like it to be "reboot", but seeing the arg as "bool off" I figured the
opposite would be "on" ... if you're saying that the opposite of off is
actually reboot, then I'll just make it 0.
* [Cluster-devel] [PATCH] dlm_stonith_{off, reboot} aliases for fence helper
2012-11-05 19:30 ` David Teigland
@ 2012-11-05 20:46 ` Jacek Konieczny
0 siblings, 0 replies; 7+ messages in thread
From: Jacek Konieczny @ 2012-11-05 20:46 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Mon, Nov 05, 2012 at 02:30:33PM -0500, David Teigland wrote:
> On Mon, Nov 05, 2012 at 07:05:22PM +0100, Jacek Konieczny wrote:
> > - rv = stonith_api_kick_helper(nodeid, 300, 1);
> > + rv = stonith_api_kick_helper(nodeid, 300, turn_off);
>
> I'd like it to be "reboot", but seeing the arg as "bool off" I figured the
> opposite would be "on" ... if you're saying that the opposite of off is
> actually reboot, then I'll just make it 0.
If it is 'off' by mistake and not for any better reason, then I think
the best option is to make this '0' by default. I have checked the
Pacemaker sources twice: 'off=0' means 'reboot'.
The effort to make this configurable can be postponed until somebody is
interested in fencing by power-off.
Greets,
Jacek
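The meaning of the third argument, as stated in this message, can be summarised in a tiny sketch. It relies only on the claim made here that 'off=0' means 'reboot'; the function `fence_action_name` is hypothetical and is not part of Pacemaker or dlm.

```c
/*
 * Hypothetical mapping of the boolean "off" argument to the fencing
 * action it requests, per the discussion in this thread:
 * non-zero -> power the node off, zero -> reboot it (not "on").
 */
static const char *fence_action_name(int off)
{
	return off ? "off" : "reboot";
}
```

So changing the hard-coded 1 to 0 in the stonith_api_kick_helper() call switches the helper from power-off to reboot, without any new configuration.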