From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bob Peterson
Date: Tue, 24 Jul 2007 00:07:51 -0500
Subject: [Cluster-devel] [PATCH 0 of 5]Bz #248176: GFS2: invalid metadata block, gfs2_meta_indirect_buffer
Message-ID: <1185253671.517.60.camel@technetium.msp.redhat.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Here is a set of five patches designed to fix the "invalid metadata
block" and hang problems encountered when running the revolver test.
In order, the five patches are:

1. There were still some critical variables being manipulated outside
   the log_lock spinlock. That usually resulted in more hangs.

2. The list_move code previously concocted in log.c for bug #238162
   (see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=238162#c23)
   seems to be causing a problem. That section was reverted. HOWEVER,
   I need to rerun the test cases listed in that bug to make sure
   removing it doesn't cause anything to break. I haven't had time yet.

3. The try_rgrp_unlink code in rgrp.c had an infinite loop. This was
   caused because the bitmap function rgblk_search can return a block
   less than the "goal" block, in which case it was looping. The fix
   is to make it always march forward as needed.

4. There was metadata corruption caused because the clone bitmaps
   weren't being kept in synch with the regular bitmaps in some cases.
   Code was added to keep them in synch.

5. Metadata corruption was occurring because page references weren't
   being removed in all cases. I previously added a function called
   detach_bufdata, but I discovered there already WAS a function out
   there to do the job. It's called gfs2_meta_cache_flush. So I added
   a call to that to remove the page references. Recently I had been
   thinking that this was entirely unnecessary, but when I removed the
   code, the metadata corruption problem returned immediately. It
   might be that there is a timing window where the pages can be
   referenced before gfs2_meta_cache_flush is called and my patch
   cleans them up sooner.
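
To illustrate the looping problem described in patch 3, here is a
minimal userspace sketch. It is not the GFS2 code: toy_search,
scan_forward, NBLOCKS and NOT_FOUND are hypothetical names made up for
this example, and the wrap-around behaviour of toy_search is an
assumption chosen to reproduce "the search can return a block less
than the goal". The point is only that the caller has to keep the goal
marching forward and stop once the search hands back something behind
it.

```c
#include <stddef.h>

#define NBLOCKS   8      /* hypothetical bitmap size for the sketch */
#define NOT_FOUND (-1)   /* hypothetical "no block found" marker */

/*
 * Hypothetical stand-in for a bitmap search that wraps around:
 * it scans from 'goal' to the end of the map, then continues from
 * block 0. It can therefore return a block number below 'goal'.
 */
static int toy_search(const int *map, int goal, int state)
{
    for (int i = 0; i < NBLOCKS; i++) {
        int blk = (goal + i) % NBLOCKS;

        if (map[blk] == state)
            return blk;
    }
    return NOT_FOUND;
}

/*
 * Visit each block in 'state' once and return how many were seen.
 * Without the 'blk < goal' check, a wrapped result would reset the
 * goal to an earlier block and the loop would never terminate.
 */
static int scan_forward(const int *map, int state)
{
    int goal = 0;
    int visited = 0;

    while (goal < NBLOCKS) {
        int blk = toy_search(map, goal, state);

        if (blk == NOT_FOUND || blk < goal)
            break;          /* nothing left ahead of the goal */

        visited++;
        goal = blk + 1;     /* always march forward */
    }
    return visited;
}
```

In this sketch the forward-only rule is enforced by the caller rather
than by the search itself, which leaves the search routine unchanged
for any other users that rely on its wrap-around behaviour.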