From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Whitehouse
Date: Fri, 25 Jun 2010 10:53:33 +0100
Subject: [Cluster-devel] [GFS2 Patch] GFS2: recovery stuck on transaction lock
In-Reply-To: <1065407972.775941277307887194.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
References: <1065407972.775941277307887194.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
Message-ID: <1277459613.2507.3.camel@localhost>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

Now in the -nmw tree. Thanks,

Steve.

On Wed, 2010-06-23 at 11:44 -0400, Bob Peterson wrote:
> Hi,
>
> This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
> transaction lock. We set the frozen flag on the glock when we receive
> a completion that cannot be delivered due to blocked locks. At that
> point we check to see whether the first waiting holder has the noexp
> flag set. If the noexp lock is queued later, then we need to unfreeze
> the glock at that point in time, namely, in the glock work function.
>
> This patch was originally written by Steve Whitehouse, but since
> he's on holiday, I'm submitting it. It's been well tested with a
> complex recovery test called revolver.
>
> Regards,
>
> Bob Peterson
> Red Hat GFS
>
> Signed-off-by: Steve Whitehouse
> Signed-off-by: Bob Peterson
> --
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index ddcdbf4..dbab3fd 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -706,8 +706,18 @@ static void glock_work_func(struct work_struct *work)
>  {
>  	unsigned long delay = 0;
>  	struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_work.work);
> +	struct gfs2_holder *gh;
>  	int drop_ref = 0;
> 
> +	if (unlikely(test_bit(GLF_FROZEN, &gl->gl_flags))) {
> +		spin_lock(&gl->gl_spin);
> +		gh = find_first_waiter(gl);
> +		if (gh && (gh->gh_flags & LM_FLAG_NOEXP) &&
> +		    test_and_clear_bit(GLF_FROZEN, &gl->gl_flags))
> +			set_bit(GLF_REPLY_PENDING, &gl->gl_flags);
> +		spin_unlock(&gl->gl_spin);
> +	}
> +
>  	if (test_and_clear_bit(GLF_REPLY_PENDING, &gl->gl_flags)) {
>  		finish_xmote(gl, gl->gl_reply);
>  		drop_ref = 1;