From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ross Lagerwall Date: Fri, 22 Mar 2019 18:15:12 +0000 Subject: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks In-Reply-To: <745330930.13931854.1553021309310.JavaMail.zimbra@redhat.com> References: <745330930.13931854.1553021309310.JavaMail.zimbra@redhat.com> Message-ID: <06bfe342-bebb-e464-4759-eedc641db017@citrix.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On 3/19/19 6:48 PM, Bob Peterson wrote: > ----- Original Message ----- >> Hi, >> >> Occasionally during testing, we see the following assertion failure in >> log_pull_tail(): >> >> [ 1104.061245] gfs2: fsid=xapi-clusterd:2d2cc24c-c48a-ca.0: fatal: >> assertion "atomic_read(&sdp->sd_log_blks_free) <= >> sdp->sd_jdesc->jd_blocks" failed >> [ 1104.061245] function = log_pull_tail, file = fs/gfs2/log.c, line = 510 >> >> It always seems to happen shortly after journal recovery. I added some >> debug logging at the point of the assertion failure and got the following: > (snip) >> Any ideas about this? >> >> Thanks, >> -- >> Ross Lagerwall > > Hi Ross, > > I've been fighting with/debugging multiple recovery problems for a long time now. > I've done countless (well, okay, thousands of) recovery tests and I can tell > you that gfs2 recovery has some major problems. These problems usually don't > occur when you have a few gfs2 mounts: they're much more likely when you have > lots of gfs2 mounts. I'm using 32 mounts. > > The problem you mentioned sounds vaguely familiar, but I can't find anything > directly related. Make sure all your journals are the same size, and see if fsck.gfs2 > complains about the journal. Otherwise, it could be a side effect of one of the > recovery issues I'm working on. Do you have other symptoms? Also, make sure > multiple nodes aren't trying to use the same journal because of lock_nolock or > something...I've made that mistake in the past. > I think I've found the cause of the assertion I was hitting. Recovery sets sd_log_flush_head but does not take locks which means a concurrent call to gfs2_log_flush() can result in sd_log_head being set to sd_log_flush_head. A later call to gfs2_log_flush() will then hit an assertion failure in log_pull_tail() because the mismatch between sd_log_head and sd_log_tail means too many blocks are freed. I've worked around it by taking the log_flush lock in the patch below and it seems to avoid the problem. However, tracing the recovery process I see that it sets sd_log_flush_head and then calls clean_journal() -> gfs2_write_log_header() -> gfs2_log_bmap() -> gfs2_log_incr_head(). This has: BUG_ON((sdp->sd_log_flush_head == sdp->sd_log_tail) && (sdp->sd_log_flush_head != sdp->sd_log_head)); ... but sd_log_tail and sd_log_head have not been set by gfs2_recover_func() so it might still BUG_ON() during recovery if you're particularly unlucky. I had a look at your "GFS2: Withdraw corruption patches [V2]" series but I didn't see anything that might fix this. If you think this patch is useful then I can send it as a proper patch to the list. Thanks, Ross -------------- gfs2: Take log_flush lock during recovery Recovery sets sd_log_flush_head but does not take any locks which means a concurrent call to gfs2_log_flush can result in sd_log_head being set to sd_log_flush_head. A later call to gfs2_log_flush will then hit an assertion failure in log_pull_tail because the mismatch between sd_log_head and sd_log_tail means too many blocks are freed. gfs2: fsid=xapi-clusterd:88a31b8e-4072-b0.1: fatal: assertion "atomic_read(&sdp->sd_log_blks_free) <= sdp->sd_jdesc->jd_blocks" failed function = log_pull_tail, file = fs/gfs2/log.c, line = 510 diff --git a/fs/gfs2/recovery.c b/fs/gfs2/recovery.c index 0f501f938d1c..5f7c3271e909 100644 --- a/fs/gfs2/recovery.c +++ b/fs/gfs2/recovery.c @@ -498,6 +498,8 @@ void gfs2_recover_func(struct work_struct *work) goto fail_gunlock_thaw; } + down_write(&sdp->sd_log_flush_lock); + t_tlck = ktime_get(); fs_info(sdp, "jid=%u: Replaying journal...\n", jd->jd_jid); @@ -512,6 +514,8 @@ void gfs2_recover_func(struct work_struct *work) clean_journal(jd, &head); + up_write(&sdp->sd_log_flush_lock); + gfs2_glock_dq_uninit(&thaw_gh); t_rep = ktime_get(); fs_info(sdp, "jid=%u: Journal replayed in %lldms [jlck:%lldms, "