* [Cluster-devel] [PATCH] rgmanager: Retry when config is out of sync [RHEL5]
@ 2012-02-29 23:53 Lon Hohberger
2012-03-01 4:58 ` Fabio M. Di Nitto
0 siblings, 1 reply; 2+ messages in thread
From: Lon Hohberger @ 2012-02-29 23:53 UTC (permalink / raw)
To: cluster-devel.redhat.com
[This patch is already in RHEL5]
If you add a service to rgmanager v1 or v2 and that
service fails to start on the first node but succeeds
in its initial stop operation, there is a chance that
the remote instance of rgmanager has not yet reread
the configuration, causing the service to be placed
into the 'recovering' state without further action.
This patch causes the originator of the request to
retry the operation.
Later versions of rgmanager (e.g. the STABLE3 branch and
derivatives) are unlikely to have this problem since
configuration updates are not polled, but rather
delivered to clients.
Update 22-Feb-2012: The above is incorrect; this was
reproduced on an rgmanager v3 installation.
Resolves: rhbz#796272
Signed-off-by: Lon Hohberger <lhh@redhat.com>
---
rgmanager/src/daemons/rg_state.c | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/rgmanager/src/daemons/rg_state.c b/rgmanager/src/daemons/rg_state.c
index 23a4bec..8c5af5b 100644
--- a/rgmanager/src/daemons/rg_state.c
+++ b/rgmanager/src/daemons/rg_state.c
@@ -1801,6 +1801,7 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
rg_state_t svcStatus;
int target = preferred_target, me = my_id();
int ret, x, request = orig_request;
+ int retries;
get_rg_state_local(svcName, &svcStatus);
if (svcStatus.rs_state == RG_STATE_DISABLED ||
@@ -1933,6 +1934,8 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
if (target == me)
goto exhausted;
+ retries = 0;
+retry:
ret = svc_start_remote(svcName, request, target);
switch (ret) {
case RG_ERUN:
@@ -1942,6 +1945,22 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
*new_owner = svcStatus.rs_owner;
free_member_list(allowed_nodes);
return 0;
+ case RG_ENOSERVICE:
+ /*
+ * Configuration update pending on remote node? Give it
+ * a few seconds to sync up. rhbz#568126
+ *
+ * Configuration updates are synchronized in later releases
+ * of rgmanager; this should not be needed.
+ */
+ if (retries++ < 4) {
+ sleep(3);
+ goto retry;
+ }
+ logt_print(LOG_WARNING, "Member #%d has a different "
+ "configuration than I do; trying next "
+ "member.", target);
+ /* Deliberate fallthrough */
case RG_EDEPEND:
case RG_EFAIL:
/* Uh oh - we failed to relocate to this node.
--
1.7.7.6
* [Cluster-devel] [PATCH] rgmanager: Retry when config is out of sync [RHEL5]
2012-02-29 23:53 [Cluster-devel] [PATCH] rgmanager: Retry when config is out of sync [RHEL5] Lon Hohberger
@ 2012-03-01 4:58 ` Fabio M. Di Nitto
0 siblings, 0 replies; 2+ messages in thread
From: Fabio M. Di Nitto @ 2012-03-01 4:58 UTC (permalink / raw)
To: cluster-devel.redhat.com
ACK.
Fabio