From: Marc Grimme <grimme@atix.de>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] rind-0.8.1 patch
Date: Mon, 4 Feb 2008 18:41:02 +0100
Message-ID: <200802041841.02494.grimme@atix.de>
In-Reply-To: <1196441345.2454.25.camel@localhost.localdomain>
Hi Lon,
I finally had time to look at this patch and adapted your example for the
follow-service a little bit.
Besides the event triggering running as expected, I stumbled over some
minor changes (find patch attached).
1. Isn't it better to organize the configuration as follows:
<event name="followservice_node" class="node"
file="/usr/local/cluster/follow-service.sl">
follow_service("service:ip_a", "service:ip_b", "ip_a",
1);
</event>
Now you can use follow_service as a library function and put the actual call
in cluster.conf (this is already integrated in the patch).
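As a rough illustration of the layout above (not the rgmanager parser itself), the following Python sketch reads such an event element and separates the library file from the inline call. That the file is loaded first and the element body is evaluated afterwards is an assumption here, and `parse_event` is a hypothetical helper, not part of the patch.

```python
import xml.etree.ElementTree as ET

# Event definition as proposed in the mail (cluster.conf fragment).
EVENT_XML = '''
<event name="followservice_node" class="node"
       file="/usr/local/cluster/follow-service.sl">
    follow_service("service:ip_a", "service:ip_b", "ip_a", 1);
</event>
'''

def parse_event(xml_text):
    """Hypothetical helper: return (name, class, library file, inline script)."""
    elem = ET.fromstring(xml_text.strip())
    # Collapse the whitespace around the inline S-Lang call.
    inline = " ".join((elem.text or "").split())
    return elem.get("name"), elem.get("class"), elem.get("file"), inline

name, cls, lib_file, script = parse_event(EVENT_XML)
print(lib_file)  # library that defines follow_service()
print(script)    # call that lives in cluster.conf
```

The point of the split is that the .sl file only defines functions, while the service names and flags stay in the cluster configuration.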
I would also like something like this:
<event name="followservice_node" class="node">
<file="/usr/local/cluster/another-lib.sl">
<file="/usr/local/cluster/follow-service.sl">
follow_service("service:ip_a", "service:ip_b", "ip_a", 1);
</event>
This would make using sl-files very modular. I haven't had time to
implement it yet but wanted to hear what you think.
2. I found that the sl-function nodes_online() also reports a node as online
if the node in question is in the cluster but has no rgmanager running. For me
it was enough to change the line at rgmanager/src/daemons/slang_event.c:606:
- if (membership->cml_members[i].cn_member &&
+ if (membership->cml_members[i].cn_member > 0 &&
But I'm not sure if this is right. For me it worked perfectly well ;-) .
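To make the effect of that one-line change concrete: in C, `if (x)` is true for any nonzero value, while `if (x > 0)` is true only for positive values, so the two tests differ only when cn_member is negative. The Python sketch below mirrors both tests. Whether a cluster member without rgmanager is actually represented by a negative cn_member is an assumption here and is not confirmed in this mail; the function names are hypothetical.

```python
def online_before(cn_member):
    # Mirrors the original C test `if (cn_member)`:
    # any nonzero value counts as online.
    return bool(cn_member)

def online_after(cn_member):
    # Mirrors the patched C test `if (cn_member > 0)`:
    # only positive values count as online.
    return cn_member > 0

# Compare both tests for a positive, a zero, and a negative value.
for value in (1, 0, -1):
    print(value, online_before(value), online_after(value))
```

Only the negative case is treated differently by the two tests, which would explain why the change is harmless for ordinary members.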
Next, I reimplemented your example on follow-service and made it more
general. Some cases might still not be handled, but all my tests (which were
not too many up to now) didn't show any problems. I will hand it over to the
SAP guys this week to let them see if this suits their requirements for
master/slave queue replication (find the example attached).
I hope this feedback helps.
Regards Marc.
On Friday 30 November 2007 17:49:05 Lon Hohberger wrote:
> Minor bugfixes.
>
> -- Lon
--
Gruss / Regards,
Marc Grimme
http://www.atix.de/ http://www.open-sharedroot.org/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rgmanager-rind.patch
Type: text/x-diff
Size: 1498 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/cluster-devel/attachments/20080204/bedcf2fe/attachment.bin>
-------------- next part --------------
%
% Returns a list of nodes for the given service that are online and in the failoverdomain.
%
define nodelist_online(service_name) {
    variable nodes, nofailback, restricted, ordered, node_list;
    nodes = nodes_online();
    (nofailback, restricted, ordered, node_list) = service_domain_info(service_name);
    return intersection(nodes, node_list);
}
%
% Idea:
% General-purpose function for a setup where Service(svc1) and Service(svc2)
% should not be running on the same node, even after failover.
% There are two options to influence the behaviour. If both services would have
% to run on the same node (only one node is left in the failover domain), which
% service is the master, and should both services keep running or should only
% the master service survive? If master is neither svc1 nor svc2, both services
% may run on the same node. If master is either svc1 or svc2, the specified one
% will be the surviving service.
% If followslave is not 0, svc1 always follows svc2. That means it will be
% started on the same node as svc2, and, if another node is available, svc2
% will be relocated to it.
%
define follow_service(svc1, svc2, master, followslave)
{
    variable state, owner_svc1, owner_svc2;
    variable nodes1, nodes2, allowed;

    debug("*** FOLLOW_SERVICE: follow_service(",svc1,", ",svc2,", ", master, ", ", followslave, ")");
    debug("*** FOLLOW_SERVICE: event_type: ", event_type, ", service_name: ", service_name, ", service_state: ", service_state);

    %
    % setup the master
    %
    if ((master != svc1) and (master != svc2)) {
        debug("*** FOLLOW_SERVICE: master=NULL");
        master=NULL;
    }

    % get the information we need to decide further
    (owner_svc1, state) = service_status(svc1);
    (owner_svc2, state) = service_status(svc2);
    nodes1 = nodelist_online(svc1);
    nodes2 = nodelist_online(svc2);
    debug("*** FOLLOW_SERVICE: service_status(",svc1,"): ", service_status(svc1));
    debug("*** FOLLOW_SERVICE: owner_svc1: ", owner_svc1, ", owner_svc2: ", owner_svc2, ", nodes1: ", nodes1, ", nodes2: ", nodes2);

    if ((event_type == EVENT_NODE) and (owner_svc1 == node_id) and
        (node_state == NODE_OFFLINE) and (owner_svc2 >= 0)) {
        %
        % uh oh, the owner of svc1 died. Restart it on the node
        % running svc2 or, if we should not follow svc2, start it
        % somewhere else.
        %
        if (followslave>0) {
            if (master != svc2) {
                ()=service_start(svc1, owner_svc2);
            }
        } else {
            allowed = subtract(nodes1, owner_svc2);
            if (length(allowed) > 0) {
                ()=service_start(svc1, allowed);
            } else if (master == svc1) {
                ()=service_start(svc1, owner_svc2);
                ()=service_stop(svc2);
            } else if (master == NULL) {
                ()=service_start(svc1, owner_svc2);
            }
        }
    }
    else if ((event_type == EVENT_NODE) and (owner_svc2 == node_id) and
             (node_state == NODE_OFFLINE) and (owner_svc1 >= 0)) {
        %
        % uh oh, the owner of svc2 died. Restart it on any other
        % node, but not the one running svc1. If the node running
        % svc1 is the only one left, only start it there if
        % master==svc2 (svc1 is then stopped) or if no master is set.
        %
        allowed=subtract(nodes2, owner_svc1);
        if (length(allowed) > 0) {
            ()=service_start(svc2, allowed);
        } else if (master == svc2) {
            ()=service_start(svc2, owner_svc1);
            ()=service_stop(svc1);
        } else if (master == NULL) {
            ()=service_start(svc2, owner_svc1);
        }
    }
    else if (((event_type == EVENT_SERVICE) and (service_state == "started") and
              (owner_svc2 == owner_svc1) and (owner_svc1 > 0) and (owner_svc2 > 0)) or
             ((event_type == EVENT_CONFIG) and (owner_svc2 == owner_svc1))) {
        allowed=subtract(nodes2, owner_svc1);
        debug("*** FOLLOW_SERVICE: service event started triggered. ", allowed);
        if (length(allowed) > 0) {
            ()=service_stop(svc2);
            ()=service_start(svc2, allowed);
        } else if ((master == svc2) and (owner_svc2 > 0)) {
            debug("*** FOLLOW_SERVICE: will stop service ", svc1);
            ()=service_stop(svc1);
        } else if ((master == svc1) and (owner_svc1 > 0)) {
            debug("*** FOLLOW_SERVICE: will stop service ", svc2);
            ()=service_stop(svc2);
        } else {
            debug("*** FOLLOW_SERVICE: both services running on the same node or only one is running. ", allowed, ", ", master);
        }
    }
    return;
}
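
The placement rule used in the two node-failure branches above can be modelled outside rgmanager. The Python sketch below is only such a model, under stated assumptions: `intersection` and `subtract` are assumed to behave like plain list operations, the script is assumed to try the nodes of a list in order (modelled here by picking the first one), and `place_after_owner_died` is a hypothetical name. It covers the case where followslave is 0.

```python
def intersection(a, b):
    # Assumed semantics: nodes present in both lists, order of `a` kept.
    return [n for n in a if n in b]

def subtract(nodes, node):
    # Assumed semantics: the list without the given node.
    return [n for n in nodes if n != node]

def place_after_owner_died(svc, other, other_owner, candidates, master):
    """Model of one node-failure branch (followslave == 0).

    Returns (node to start `svc` on or None, service to stop or None).
    """
    allowed = subtract(candidates, other_owner)
    if allowed:
        # A node not running the other service is available: use it.
        return allowed[0], None
    if master == svc:
        # Only the other service's node is left and svc is the master:
        # start svc there and stop the other service.
        return other_owner, other
    if master is None:
        # No master configured: both services share the node.
        return other_owner, None
    # The other service is the master: svc stays down.
    return None, None

# Nodes that are online and also in the failover domain of svc1.
online = [1, 2, 3]
domain = [2, 3]
candidates = intersection(online, domain)
print(place_after_owner_died("svc1", "svc2", 2, candidates, None))
```

With node 2 running svc2 and nodes 2 and 3 eligible, the model places svc1 on node 3, which keeps the two services apart.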