From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabio M. Di Nitto
Date: Fri, 29 Oct 2010 08:46:33 +0200
Subject: [Cluster-devel] [PATCH 2/2] rgmanager: Work around lockspace release hang
In-Reply-To: <1288300647-8469-2-git-send-email-lhh@redhat.com>
References: <1288300647-8469-1-git-send-email-lhh@redhat.com> <1288300647-8469-2-git-send-email-lhh@redhat.com>
Message-ID: <4CCA6DC9.8010901@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

ACK on both patches.

Fabio

On 10/28/2010 11:17 PM, Lon Hohberger wrote:
> If CMAN dies uncleanly (ex: because of cman_kill_node() call
> on another cluster node), rgmanager would hang trying to
> release the lock space, preventing it from exiting and causing
> it to spin.
>
> This patch works around the hang during unclean shutdown
> situations.
>
> Resolves: rhbz#639961
>
> Signed-off-by: Lon Hohberger
> ---
> rgmanager/src/daemons/main.c | 11 ++++++++---
> 1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/rgmanager/src/daemons/main.c b/rgmanager/src/daemons/main.c
> index 64c32a3..52e38bc 100644
> --- a/rgmanager/src/daemons/main.c
> +++ b/rgmanager/src/daemons/main.c
> @@ -64,7 +64,7 @@ int node_has_fencing(int nodeid);
> int fence_domain_joined(void);
>
> int cluster_timeout = 10;
> -int shutdown_pending = 0, running = 1, need_reconfigure = 0;
> +int shutdown_pending = 0, running = 1, need_reconfigure = 0, dying = 0;
> char debug = 0; /* XXX* */
> static int signalled = 0;
> static int port = RG_PORT;
> @@ -676,12 +676,14 @@ handle_cluster_event(msgctx_t *ctx)
> msg_receive(ctx, NULL, 0, 0);
> clulog(LOG_WARNING, "#67: Shutting down uncleanly\n");
> rg_set_inquorate();
> - rg_doall(RG_INIT, 1, "Emergency stop of %s");
> + rg_doall(RG_INIT, 1, "Emergency stop of %s\n");
> rg_clear_initialized(0);
> #if defined(LIBCMAN_VERSION) && LIBCMAN_VERSION >= 2
> /* cman_replyto_shutdown() */
> #endif
> running = 0;
> + dying = 1; /* XXX Hack to work around hang during
> + unclean lockspace release */
> break;
> }
>
> @@ -1180,7 +1182,10 @@ main(int argc, char **argv)
> cleanup(cluster_ctx);
>
> out_cleanup:
> - clu_lock_finished(rgmanager_lsname);
> + /* XXX - This hangs if CMAN has died, so we skip if we are
> + * exiting uncleanly. */
> + if (!dying)
> + clu_lock_finished(rgmanager_lsname);
>
> out:
> clulog(LOG_NOTICE, "Shutdown complete, exiting\n");
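
For readers skimming the thread, the control flow the quoted patch introduces can be reduced to a small standalone sketch. This is only an illustration of the "set a flag on unclean shutdown, skip the lockspace release at exit" pattern, not the rgmanager code: release_lockspace(), on_unclean_shutdown(), shutdown_daemon() and release_calls are hypothetical stand-ins, and only the name of the dying flag is taken from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Set when the cluster layer went away uncleanly (mirrors the
 * 'dying' flag added by the patch). */
static int dying = 0;

/* Hypothetical counter so the sketch can show whether the release
 * was attempted; not part of rgmanager. */
static int release_calls = 0;

/* Hypothetical stand-in for the real lockspace release, which may
 * block forever once the cluster manager is gone. */
static void release_lockspace(const char *name)
{
	assert(name != NULL);
	release_calls++;
	printf("released lockspace %s\n", name);
}

/* Hypothetical stand-in for the unclean-shutdown branch of the
 * event handler: remember that the release must be skipped. */
static void on_unclean_shutdown(void)
{
	dying = 1;
}

/* Exit path: release the lockspace only on a clean shutdown.
 * Returns 1 if the release was attempted, 0 if it was skipped. */
static int shutdown_daemon(const char *lsname)
{
	if (!dying) {
		release_lockspace(lsname);
		return 1;
	}
	return 0;
}
```

Calling shutdown_daemon() before on_unclean_shutdown() attempts the release; calling it afterwards skips the release, which is the trade-off the patch makes: the lockspace is left unreleased in exchange for the daemon being able to exit.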