From mboxrd@z Thu Jan 1 00:00:00 1970
From: Menyhart Zoltan
Date: Wed, 01 Dec 2010 10:23:25 +0100
Subject: [Cluster-devel] Patch: making DLM more robust
In-Reply-To: <20101130173051.GB27123@redhat.com>
References: <4CEA9ADD.2050109@bull.net> <20101122173442.GA21879@redhat.com> <4CEBD6A2.8090005@bull.net> <20101123171508.GC30147@redhat.com> <4CF52D0E.2020800@bull.net> <20101130173051.GB27123@redhat.com>
Message-ID: <4CF6140D.7060809@bull.net>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

David Teigland wrote:

> Thanks, I'll take a look; as long as it's disabled by default I don't
> expect I'd object much.  There are two main problems with this idea,
> though, that need to be handled before it's generally usable:
>
> 1. The kernel can wait on user space indefinitely during completely normal
>    situations, e.g. the loss of quorum or fencing failures can delay
>    completion indefinitely.

In my eyes, a networked application should indicate a failure within
a "human expectable" time delay. E.g.:
- You can try a DLM_USER_CREATE_LOCKSPACE for 5 seconds
- If it times out, you can log it and display some status telling the
  user that it has already been retried for H hours M minutes and S seconds
- And retry (if configured to do so on its own) if there is no intervention

>    This means you can easily introduce false
>    failures when using a timeout.

If we cannot obtain a given resource within a limited time frame,
then it is a real error for the customer: s/he cannot mount an OCFS2
volume, cannot issue a cluster command, etc.

>    EINTR, since it's driven by user
>    intervention, is a better idea, e.g. killing a mount process.
>
> 2. The difficulty, even with EINTR, is correctly and cleanly unwinding the
>    dlm_controld state.

Let's take this example in dlm/libdlm/libdlm.c:

int create_lockspace_v6(const char *name, uint32_t flags)
{
	char reqbuf[sizeof(struct dlm_write_request) + DLM_LOCKSPACE_LEN];
	struct dlm_write_request *req = (struct dlm_write_request *)reqbuf;
	int namelen = strlen(name);

	memset(reqbuf, 0, sizeof(reqbuf));
	set_version_v6(req);

	req->cmd = DLM_USER_CREATE_LOCKSPACE;
	req->i.lspace.flags = flags;

	if (namelen > DLM_LOCKSPACE_LEN) {
		errno = EINVAL;
		return -1;
	}
	memcpy(req->i.lspace.name, name, namelen);

	return write(control_fd, req, sizeof(*req) + namelen);
}

The caller should already be prepared to unwind everything in case
EINVAL is returned due to a name length error.
"write()" can also return several errors.

We will have two more error codes:

EINTR: there is not much difference whether the signal arrives just
	before we call "write()" or inside the system call...
	If you already ignore it...
	If you already handle it...

ETIMEDOUT: see above

There should be a smooth way out from errors, other than hard resetting
the machine :-)

Thanks,

Zoltan Menyhart
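
P.S. To make the error handling discussed above a bit more concrete, here
is a minimal sketch of a caller-side retry loop. It is only an illustration
under assumptions: retry_on_timeout(), attempt_fn and flaky_attempt() are
hypothetical names that do not exist in libdlm, and ETIMEDOUT is the
proposed error code, not something the current code returns. The loop
retries only on ETIMEDOUT, logs how long it has been trying, and hands
EINTR or any other error straight back so that the caller can unwind.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback: one attempt at the operation, e.g. a thin wrapper
 * around create_lockspace_v6(). Returns >= 0 on success, or -1 with errno
 * set on failure. */
typedef int (*attempt_fn)(void *arg);

/* Call op() until it succeeds, fails with something other than ETIMEDOUT,
 * or max_tries attempts have been made. On failure, returns -1 and leaves
 * errno set to the error of the last attempt. */
int retry_on_timeout(attempt_fn op, void *arg, int max_tries)
{
	time_t start = time(NULL);
	int i;

	assert(op != NULL && max_tries > 0);

	for (i = 1; i <= max_tries; i++) {
		int rc = op(arg);
		int saved_errno = errno;
		long secs;

		if (rc >= 0)
			return rc;

		/* EINTR (user intervention) and real errors such as EINVAL
		 * are not retried: the caller has to unwind. */
		if (saved_errno != ETIMEDOUT) {
			errno = saved_errno;
			return -1;
		}

		/* Status line: "already been retried for H hours M minutes
		 * and S seconds". */
		secs = (long)difftime(time(NULL), start);
		fprintf(stderr, "attempt %d timed out after %ldh %ldm %lds\n",
			i, secs / 3600, (secs / 60) % 60, secs % 60);
		errno = saved_errno;	/* fprintf() may have changed it */
	}
	return -1;	/* still ETIMEDOUT after max_tries attempts */
}

/* Hypothetical stand-in for the real operation, used for demonstration:
 * fails 'failures' times with errno 'err', then succeeds. */
struct flaky {
	int failures;
	int err;
	int calls;
};

int flaky_attempt(void *arg)
{
	struct flaky *f = arg;

	if (f->calls++ < f->failures) {
		errno = f->err;
		return -1;
	}
	return 0;
}
```

A caller would pass its own wrapper instead of flaky_attempt(), and on a
-1 return it would inspect errno to decide between unwinding (EINTR,
EINVAL, ...) and reporting that the resource could not be obtained in time
(ETIMEDOUT).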