From mboxrd@z Thu Jan 1 00:00:00 1970
From: Lon Hohberger
Date: Wed, 27 Oct 2010 17:17:12 -0400
Subject: [Cluster-devel] [PATCH] rgmanager: Halt services if CMAN dies
Message-ID: <1288214232-17790-1-git-send-email-lhh@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

If cman dies because it receives a kill packet (of doom) from other
hosts, rgmanager does not notice.  This can happen if, for example, you
are using qdiskd and it hangs on I/O to the quorum disk due to frequent
trespasses or other SAN interruptions.  The other instance of qdiskd
will ask CMAN to evict the hung node, causing it to be ejected from the
cluster and fenced.  Data is safe (which is the top priority).

If power-cycle fencing is in use, there is no issue at all; the node
reboots and service failover occurs fairly quickly.  However, problems
can arise if, in the same hung-I/O situation:

 * storage-level fencing is in use, and
 * rgmanager has one or more IP addresses in use as part of cluster
   services.

This is because more recent versions of the IP resource agent actually
ping the IP address prior to bringing it online for use by services.
This prevents accidental take-over of IP addresses in use by other
hosts on the network due to an administrator mistake when setting up
the cluster.  Unfortunately, this behavior also prevents service
failover if the presumed-dead host is still online.

This patch causes rgmanager to use poll() instead of select() when
dealing with the baseline CMAN connection it uses for receiving
membership changes and so forth.  If the socket is closed by CMAN
(either because CMAN died or for some other reason), rgmanager now
detects this and treats it as an emergency cluster shutdown request:
it halts all services and exits as quickly as possible.

Unfortunately, there is a race between this emergency action and
recovery on the surviving host.
It is not possible for rgmanager to guarantee that all services will
halt after the node has been fenced from shared storage (but before the
other host attempts to start the service(s)).  Furthermore, a hung
'stop' request caused by loss of access to shared storage may very well
cause rgmanager to hang forever, preventing some services (or parts of
them) from ever actually being killed.  A main use case for
storage-level fencing over power-cycling is the ability to perform
post-mortem RCA of whatever caused the node to die in the first place.
This implies that having rgmanager kill the host would be an incorrect
resolution.

Resolves: rhbz#639961

Signed-off-by: Lon Hohberger
---
 rgmanager/src/clulib/msg_cluster.c |   32 ++++++++++++++++++++++----------
 1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/rgmanager/src/clulib/msg_cluster.c b/rgmanager/src/clulib/msg_cluster.c
index 4ec3750..00f28c3 100644
--- a/rgmanager/src/clulib/msg_cluster.c
+++ b/rgmanager/src/clulib/msg_cluster.c
@@ -34,7 +34,9 @@
 #include
 #include
 #include
+#include

+static void process_cman_event(cman_handle_t handle, void *private, int reason, int arg);

 /* Ripped from ccsd's setup_local_socket */

 int cluster_msg_close(msgctx_t *ctx);
@@ -165,18 +167,17 @@ static int
 poll_cluster_messages(int timeout)
 {
 	int ret = -1;
-	fd_set rfds;
-	int fd, lfd, max;
+	int fd, lfd;
 	struct timeval tv;
 	struct timeval *p = NULL;
 	cman_handle_t ch;
+	struct pollfd fds[2];

 	if (timeout >= 0) {
 		p = &tv;
 		tv.tv_sec = tv.tv_usec = timeout;
 	}

-	FD_ZERO(&rfds);

 	/* This sucks - it could cause other threads trying to get
 	   a membership list to block for a long time.  Now, that should
@@ -195,20 +196,31 @@ poll_cluster_messages(int timeout)
 		cman_unlock(ch);
 		return 0;
 	}
-	FD_SET(fd, &rfds);
-	FD_SET(lfd, &rfds);
-	max = (lfd > fd ? lfd : fd);
-	if (select(max + 1, &rfds, NULL, NULL, p) > 0) {
+	fds[0].fd = lfd;
+	fds[1].fd = fd;
+	fds[0].events = POLLIN | POLLHUP | POLLERR;
+	fds[1].events = POLLIN | POLLHUP | POLLERR;
+
+	if (poll(fds, 2, timeout * 1000) > 0) {
+		/* Someone woke us up */

-		if (FD_ISSET(lfd, &rfds)) {
+		if (fds[0].revents & POLLIN) {
 			cman_unlock(ch);
 			errno = EAGAIN;
 			return -1;
 		}

-		cman_dispatch(ch, 0);
-		ret = 0;
+		if (fds[1].revents & (POLLHUP | POLLERR)) {
+			process_cman_event(ch, NULL,
+					   CMAN_REASON_TRY_SHUTDOWN,
+					   0);
+		}
+
+		if (fds[1].revents & POLLIN) {
+			cman_dispatch(ch, 0);
+			ret = 0;
+		}
 	}

 	cman_unlock(ch);
--
1.7.2.3