From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Teigland
Date: Fri, 14 Sep 2007 08:37:29 -0500
Subject: [Cluster-devel] [PATCH] [GFS2] bz 276631 : GFS2: chmod hung - TRY 2
In-Reply-To: <1189742683.5632.13.camel@technetium.msp.redhat.com>
References: <1189742683.5632.13.camel@technetium.msp.redhat.com>
Message-ID: <20070914133729.GA18955@redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Sep 13, 2007 at 11:04:43PM -0500, Bob Peterson wrote:
> diff -pur a/fs/gfs2/locking/dlm/thread.c b/fs/gfs2/locking/dlm/thread.c
> --- a/fs/gfs2/locking/dlm/thread.c	2007-09-13 17:33:58.000000000 -0500
> +++ b/fs/gfs2/locking/dlm/thread.c	2007-09-13 22:47:14.000000000 -0500
> @@ -279,8 +279,10 @@ static int gdlm_thread(void *data)
>  	/* Only thread1 is allowed to do blocking callbacks since gfs
>  	   may wait for a completion callback within a blocking cb. */
>
> +	spin_lock(&ls->async_lock);
>  	if (current == ls->thread1)
>  		blist = 1;
> +	spin_unlock(&ls->async_lock);
>
>  	while (!kthread_should_stop()) {
>  		set_current_state(TASK_INTERRUPTIBLE);
> @@ -338,10 +340,12 @@ int gdlm_init_threads(struct gdlm_ls *ls
>  	struct task_struct *p;
>  	int error;
>
> +	spin_lock(&ls->async_lock);
>  	p = kthread_run(gdlm_thread, ls, "lock_dlm1");
>  	error = IS_ERR(p);
>  	if (error) {
>  		log_error("can't start lock_dlm1 thread %d", error);
> +		spin_unlock(&ls->async_lock);
>  		return error;
>  	}
>  	ls->thread1 = p;
> @@ -351,9 +355,11 @@ int gdlm_init_threads(struct gdlm_ls *ls
>  	if (error) {
>  		log_error("can't start lock_dlm2 thread %d", error);
>  		kthread_stop(ls->thread1);
> +		spin_unlock(&ls->async_lock);
>  		return error;
>  	}
>  	ls->thread2 = p;
> +	spin_unlock(&ls->async_lock);

This is strange.  First, it seems very likely to me that kthread_run could
sleep, and almost certain that kthread_stop will sleep.
Second, signaling from one thread to another like this is something you
might do with a mutex or a completion, but not with a spinlock.

I'd suggest one of two alternatives.  Either have the thread check its own
name for "lock_dlm1", or add a new gdlm_thread1 function, i.e.

	int _gdlm_thread(struct gdlm_ls *ls, int blist)
	{
		/* current function goes here */
	}

	int gdlm_thread1(void *data)
	{
		return _gdlm_thread(data, 1);
	}

	int gdlm_thread2(void *data)
	{
		return _gdlm_thread(data, 0);
	}

	kthread_run(gdlm_thread1, ls, "lock_dlm1");
	kthread_run(gdlm_thread2, ls, "lock_dlm2");
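
For anyone who wants to try the second alternative outside the kernel, here
is a rough userspace sketch that uses pthreads in place of kthreads.  The
names demo_ls, demo_thread_common, demo_thread1, demo_thread2 and
demo_init_threads are made up for illustration and are not part of lock_dlm.
The only point is that each thread learns its role from its entry function,
so nothing has to compare against ls->thread1 and no lock is needed around
thread creation.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct gdlm_ls; holds only what the sketch needs. */
struct demo_ls {
	int blist_seen[2];	/* slot 0: thread1's flag, slot 1: thread2's flag */
};

/* Shared body: the real thread loop would go here.  This sketch only
 * records the blist value the thread was started with. */
static void *demo_thread_common(struct demo_ls *ls, int blist)
{
	assert(ls != NULL);
	ls->blist_seen[blist ? 0 : 1] = blist;
	return NULL;
}

/* Entry point for the thread that may do blocking callbacks. */
static void *demo_thread1(void *data)
{
	return demo_thread_common(data, 1);
}

/* Entry point for the thread that must not do blocking callbacks. */
static void *demo_thread2(void *data)
{
	return demo_thread_common(data, 0);
}

/* Start both threads and wait for them to finish.  Returns 0 on success
 * or the pthread_create() error code. */
static int demo_init_threads(struct demo_ls *ls)
{
	pthread_t t1, t2;
	int error;

	ls->blist_seen[0] = -1;
	ls->blist_seen[1] = -1;

	error = pthread_create(&t1, NULL, demo_thread1, ls);
	if (error)
		return error;

	error = pthread_create(&t2, NULL, demo_thread2, ls);
	if (error) {
		pthread_join(t1, NULL);
		return error;
	}

	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}
```

After a successful demo_init_threads() call, blist_seen[0] is 1 and
blist_seen[1] is 0.  Each thread writes only its own slot, and the caller
reads them only after joining both threads.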