From mboxrd@z Thu Jan 1 00:00:00 1970
From: Steven Whitehouse
Date: Thu, 12 Jul 2007 09:14:47 +0100
Subject: [Cluster-devel] Re: [PATCH 2 of 2][GFS2] bz #245832: soft lockup detected in databuf_lo_before_commit
In-Reply-To: <1184187323.11507.241.camel@technetium.msp.redhat.com>
References: <1184187323.11507.241.camel@technetium.msp.redhat.com>
Message-ID: <1184228087.8765.273.camel@quoit>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

Now in the -nmw git tree. Thanks,

Steve.

On Wed, 2007-07-11 at 15:55 -0500, Bob Peterson wrote:
> Hi,
>
> This is part 2 of the patch for bug #245832, part 1 of which is already
> in the git tree.
>
> The problem was that sdp->sd_log_num_databuf was not always being
> protected by the gfs2_log_lock spinlock, but the sd_log_le_databuf
> (which it is supposed to reflect) was protected. That meant there
> was a timing window during which gfs2_log_flush called
> databuf_lo_before_commit and the count didn't match what was
> really on the linked list in that window. So when it ran out of
> items on the linked list, it decremented total_dbuf from 0 to -1 and
> thus never left the "while(total_dbuf)" loop.
>
> The solution is to protect the variable sdp->sd_log_num_databuf so
> that the value will always match the contents of the linked list,
> and therefore the number will never go negative, and therefore, the
> loop will be exited properly.
>
> Regards,
>
> Bob Peterson
> Red Hat Cluster Suite
>
> Signed-off-by: Bob Peterson
> --
>  fs/gfs2/lops.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
> index aff70f0..3b395c4 100644
> --- a/fs/gfs2/lops.c
> +++ b/fs/gfs2/lops.c
> @@ -486,8 +486,8 @@ static void databuf_lo_add(struct gfs2_sbd *sdp, struct gfs2_log_element *le)
>  		gfs2_pin(sdp, bd->bd_bh);
>  		tr->tr_num_databuf_new++;
>  	}
> -	sdp->sd_log_num_databuf++;
>  	gfs2_log_lock(sdp);
> +	sdp->sd_log_num_databuf++;
>  	list_add(&le->le_list, &sdp->sd_log_le_databuf);
>  	gfs2_log_unlock(sdp);
>  }
> @@ -523,7 +523,7 @@ static void databuf_lo_before_commit(struct gfs2_sbd *sdp)
>  	struct buffer_head *bh = NULL,*bh1 = NULL;
>  	struct gfs2_log_descriptor *ld;
>  	unsigned int limit;
> -	unsigned int total_dbuf = sdp->sd_log_num_databuf;
> +	unsigned int total_dbuf;
>  	unsigned int total_jdata = sdp->sd_log_num_jdata;
>  	unsigned int num, n;
>  	__be64 *ptr = NULL;
> @@ -535,6 +535,7 @@ static void databuf_lo_before_commit(struct gfs2_sbd *sdp)
>  	 * into the log along with a header
>  	 */
>  	gfs2_log_lock(sdp);
> +	total_dbuf = sdp->sd_log_num_databuf;
>  	bd2 = bd1 = list_prepare_entry(bd1, &sdp->sd_log_le_databuf,
>  				       bd_le.le_list);
>  	while(total_dbuf) {
> @@ -653,6 +654,7 @@ static void databuf_lo_before_commit(struct gfs2_sbd *sdp)
>  				break;
>  		}
>  		bh = NULL;
> +		BUG_ON(total_dbuf < num);
>  		total_dbuf -= num;
>  		total_jdata -= num;
>  	}
>
>
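
The locking rule the patch describes (update the count and the list under
the same lock, and read the count only while holding that lock) can be
sketched in user space roughly as follows. This is only an illustration
under stated assumptions, not the GFS2 code: struct log, struct node,
log_init, log_add and log_drain are hypothetical names, a pthread mutex
stands in for gfs2_log_lock, and assert() stands in for BUG_ON(). Because
the counter is unsigned, subtracting past zero wraps to a very large value
rather than -1, which is why a mismatch keeps a "while (total)" loop
spinning.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the GFS2 structures. */
struct node {
	struct node *next;
};

struct log {
	pthread_mutex_t lock;   /* plays the role of gfs2_log_lock */
	struct node *head;      /* plays the role of sd_log_le_databuf */
	unsigned int count;     /* plays the role of sd_log_num_databuf */
};

static void log_init(struct log *lg)
{
	pthread_mutex_init(&lg->lock, NULL);
	lg->head = NULL;
	lg->count = 0;
}

/* Count and list change together, inside one critical section. */
static void log_add(struct log *lg, struct node *n)
{
	pthread_mutex_lock(&lg->lock);
	lg->count++;
	n->next = lg->head;
	lg->head = n;
	pthread_mutex_unlock(&lg->lock);
}

/*
 * Remove every entry in batches of at most `limit` (must be nonzero).
 * The count is sampled only after the lock is taken, so it matches the
 * list for the whole loop. Returns the number of entries removed.
 */
static unsigned int log_drain(struct log *lg, unsigned int limit)
{
	unsigned int total, done = 0;

	pthread_mutex_lock(&lg->lock);
	total = lg->count;
	while (total) {
		unsigned int num = 0;

		while (lg->head && num < limit) {
			lg->head = lg->head->next;
			num++;
		}
		/* Catch a count/list mismatch instead of wrapping around. */
		assert(num > 0 && total >= num);
		total -= num;
		done += num;
	}
	lg->count = 0;
	pthread_mutex_unlock(&lg->lock);
	return done;
}
```

With five entries added and a batch limit of 2, the loop in this sketch
runs three times (2, 2, 1) and then exits because the sampled count
reaches exactly zero.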