From mboxrd@z Thu Jan 1 00:00:00 1970
From: teigland@sourceware.org
Date: 3 Dec 2007 16:40:42 -0000
Subject: [Cluster-devel] cluster/gfs-kernel/src/dlm lock_dlm.h plock.c
Message-ID: <20071203164042.20857.qmail@sourceware.org>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch:		RHEL4
Changes by:	teigland at sourceware.org	2007-12-03 16:40:41

Modified files:
	gfs-kernel/src/dlm: lock_dlm.h plock.c

Log message:
	Posix locks don't work between threads, but it seems some programs
	do involve threads sharing plocks incidentally (with meaningless
	results). Given the plock state we keep in lock_dlm, multiple
	threads accessing the locks would not only be meaningless, but
	could corrupt the state, leaving the threads permanently hung.
	This patch tries to keep threads from stomping on each other,
	hopefully preventing most hangs.
	bz 383391

Patches:
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/gfs-kernel/src/dlm/lock_dlm.h.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.18.2.7&r2=1.18.2.8
http://sourceware.org/cgi-bin/cvsweb.cgi/cluster/gfs-kernel/src/dlm/plock.c.diff?cvsroot=cluster&only_with_tag=RHEL4&r1=1.12.2.4&r2=1.12.2.5

--- cluster/gfs-kernel/src/dlm/Attic/lock_dlm.h	2007/03/22 21:23:18	1.18.2.7
+++ cluster/gfs-kernel/src/dlm/Attic/lock_dlm.h	2007/12/03 16:40:41	1.18.2.8
@@ -152,6 +152,7 @@
 	uint64_t end;
 	int count;
 	int ex;
+	int busy;
 };
 
 #define LFL_NOBLOCK 0
--- cluster/gfs-kernel/src/dlm/Attic/plock.c	2007/03/22 21:23:18	1.12.2.4
+++ cluster/gfs-kernel/src/dlm/Attic/plock.c	2007/12/03 16:40:41	1.12.2.5
@@ -488,8 +488,10 @@
 		if (found) {
 			DLM_ASSERT(po->lp, );
+			po->busy = 1;
 			error = wait_async(po->lp);
 			DLM_ASSERT(!error, );
+			po->busy = 0;
 			goto restart;
 		}
 	}
@@ -1154,8 +1156,15 @@
 	int found = FALSE;
 
 	list_for_each_entry(po, &r->locks, list) {
-		if (po->owner == owner)
-			continue;
+		if (po->owner == owner) {
+			if (!po->busy)
+				continue;
+			log_debug("po busy %llx %llu %u",
+				  (unsigned long long)name->ln_number,
+				  (unsigned long long)owner, po->pid);
+			found = TRUE;
+			break;
+		}
 		if (!ranges_overlap(po->start, po->end, *start, *end))
 			continue;
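
For readers following the last hunk, here is a minimal userspace sketch of
the check it introduces. It is an illustration only, not the lock_dlm code:
struct plock_sketch, ranges_overlap_sketch() and must_wait_sketch() are
hypothetical names, the list is modelled as a plain array, and any
overlapping lock held by a different owner is assumed to conflict (the real
function continues past the lines shown in the diff, and may apply further
checks). The point it shows is that a lock held by the same owner is
normally skipped, but is treated as a conflict while another thread of that
owner has marked it busy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-lock state in lock_dlm.h. */
struct plock_sketch {
	uint64_t owner;
	uint64_t start;
	uint64_t end;
	int busy;	/* nonzero while a thread is waiting on this lock */
};

/* Inclusive byte ranges overlap unless one ends before the other starts. */
static bool ranges_overlap_sketch(uint64_t s1, uint64_t e1,
				  uint64_t s2, uint64_t e2)
{
	assert(s1 <= e1 && s2 <= e2);
	return !(e1 < s2 || s1 > e2);
}

/*
 * Return true if a request by `owner` for [start, end] has to wait.
 * Assumption: every overlapping lock of another owner conflicts.
 */
static bool must_wait_sketch(const struct plock_sketch *locks, size_t n,
			     uint64_t owner, uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++) {
		const struct plock_sketch *po = &locks[i];

		if (po->owner == owner) {
			/* Our own lock: harmless unless a sibling thread
			 * is in the middle of an operation on it. */
			if (!po->busy)
				continue;
			return true;
		}
		if (!ranges_overlap_sketch(po->start, po->end, start, end))
			continue;
		return true;
	}
	return false;
}
```

In this model the busy flag only affects requests from the same owner; a
request from an unrelated owner is still judged purely on range overlap.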