From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabio M. Di Nitto
Date: Thu, 01 Mar 2012 05:58:37 +0100
Subject: [Cluster-devel] [PATCH] rgmanager: Retry when config is out of sync [RHEL5]
In-Reply-To: <1330559599-6220-1-git-send-email-lhh@redhat.com>
References: <1330559599-6220-1-git-send-email-lhh@redhat.com>
Message-ID: <4F4F01FD.9090001@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

ACK.

Fabio

On 03/01/2012 12:53 AM, Lon Hohberger wrote:
> [This patch is already in RHEL5]
>
> If you add a service to rgmanager v1 or v2 and that
> service fails to start on the first node but succeeds
> in its initial stop operation, there is a chance that
> the remote instance of rgmanager has not yet reread
> the configuration, causing the service to be placed
> into the 'recovering' state without further action.
>
> This patch causes the originator of the request to
> retry the operation.
>
> Later versions of rgmanager (e.g. the STABLE3 branch and
> derivatives) are unlikely to have this problem since
> configuration updates are not polled, but rather
> delivered to clients.
>
> Update 22-Feb-2012: The above is incorrect; this was
> reproduced on an rgmanager v3 installation.
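
For anyone following the thread, the bounded retry described above can be
sketched outside rgmanager roughly as follows. This is a minimal sketch under
stated assumptions, not the patch itself: start_with_retry(), attempt_fn,
wait_fn, struct fake_remote, fake_start() and the DEMO_* codes are all
hypothetical names standing in for svc_start_remote() and the RG_* return
values. Only the limits (4 retries, 3 seconds apart) are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; they only stand in for rgmanager's RG_* values. */
enum { DEMO_OK = 0, DEMO_ENOSERVICE = -1, DEMO_EFAIL = -2 };

/* Hypothetical callback types: one remote start attempt, and a delay hook. */
typedef int (*attempt_fn)(void *ctx);
typedef void (*wait_fn)(unsigned int seconds);

/*
 * Call attempt() until it returns something other than DEMO_ENOSERVICE,
 * or until max_retries retries have been used up. With max_retries == 4
 * this makes at most 5 attempts and 4 waits, like the loop in the patch.
 * Any other result (success or a different error) is returned at once.
 */
static int
start_with_retry(attempt_fn attempt, void *ctx, int max_retries,
		 unsigned int delay, wait_fn wait)
{
	int retries = 0;
	int ret;

	assert(attempt != NULL);

	for (;;) {
		ret = attempt(ctx);
		if (ret != DEMO_ENOSERVICE)
			return ret;	/* done, or a non-retryable error */
		if (retries++ >= max_retries)
			return ret;	/* still out of sync; caller moves on */
		if (wait != NULL)
			wait(delay);	/* give the remote time to reread config */
	}
}

/* Hypothetical stand-in for a remote node whose config is briefly stale. */
struct fake_remote {
	int stale_replies;	/* how many more times to report "no service" */
	int calls;		/* number of attempts seen so far */
};

static int
fake_start(void *ctx)
{
	struct fake_remote *r = ctx;

	r->calls++;
	if (r->stale_replies > 0) {
		r->stale_replies--;
		return DEMO_ENOSERVICE;
	}
	return DEMO_OK;
}
```

With a fake remote that reports "no such service" twice,
start_with_retry(fake_start, &remote, 4, 3, NULL) makes three attempts and
returns DEMO_OK. Passing a small wrapper around sleep() as the wait argument
would add the real delays; NULL skips them, which is convenient for testing.
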
>
> Resolves: rhbz#796272
>
> Signed-off-by: Lon Hohberger
> ---
>  rgmanager/src/daemons/rg_state.c | 19 +++++++++++++++++++
>  1 files changed, 19 insertions(+), 0 deletions(-)
>
> diff --git a/rgmanager/src/daemons/rg_state.c b/rgmanager/src/daemons/rg_state.c
> index 23a4bec..8c5af5b 100644
> --- a/rgmanager/src/daemons/rg_state.c
> +++ b/rgmanager/src/daemons/rg_state.c
> @@ -1801,6 +1801,7 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
>  	rg_state_t svcStatus;
>  	int target = preferred_target, me = my_id();
>  	int ret, x, request = orig_request;
> +	int retries;
>
>  	get_rg_state_local(svcName, &svcStatus);
>  	if (svcStatus.rs_state == RG_STATE_DISABLED ||
> @@ -1933,6 +1934,8 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
>  		if (target == me)
>  			goto exhausted;
>
> +		retries = 0;
> +retry:
>  		ret = svc_start_remote(svcName, request, target);
>  		switch (ret) {
>  		case RG_ERUN:
> @@ -1942,6 +1945,22 @@ handle_relocate_req(char *svcName, int orig_request, int preferred_target,
>  			*new_owner = svcStatus.rs_owner;
>  			free_member_list(allowed_nodes);
>  			return 0;
> +		case RG_ENOSERVICE:
> +			/*
> +			 * Configuration update pending on remote node?  Give it
> +			 * a few seconds to sync up.  rhbz#568126
> +			 *
> +			 * Configuration updates are synchronized in later releases
> +			 * of rgmanager; this should not be needed.
> +			 */
> +			if (retries++ < 4) {
> +				sleep(3);
> +				goto retry;
> +			}
> +			logt_print(LOG_WARNING, "Member #%d has a different "
> +				   "configuration than I do; trying next "
> +				   "member.", target);
> +			/* Deliberate */
>  		case RG_EDEPEND:
>  		case RG_EFAIL:
>  			/* Uh oh - we failed to relocate to this node.