From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wendy Cheng Date: Fri, 10 Aug 2007 09:12:14 -0400 Subject: [Cluster-devel] [PATCH 4 of 5] Bz #248176: GFS2: invalid metadata block - REVISED In-Reply-To: <1186734409.8765.757.camel@quoit> References: <1186609929.25269.46.camel@technetium.msp.redhat.com> <46BB1AC4.9030509@redhat.com> <1186673842.25269.81.camel@technetium.msp.redhat.com> <46BB5B2B.2040405@redhat.com> <1186734409.8765.757.camel@quoit> Message-ID: <46BC642E.5080606@redhat.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Steven Whitehouse wrote: >Hi, > >On Thu, 2007-08-09 at 14:21 -0400, Wendy Cheng wrote: > > >>Bob Peterson wrote: >> >> >>>On Thu, 2007-08-09 at 09:46 -0400, Wendy Cheng wrote: >>> >>> >>> >>>>Set aside "after this patch, the problem goes away" thing ... >>>> >>>>I haven't checked previous three patches yet so I may not have the >>>>overall picture ... but why adding the journal flush spin lock here >>>>could prevent the new inode to get re-used before its associated buffer >>>>are flushed to the logs ? Could you elaborate more ? >>>> >>>> >>>> >>>> >>>>>+ down_write(&sdp->sd_log_flush_lock); >>>>> block = rgblk_search(rgd, goal, GFS2_BLKST_UNLINKED, >>>>> GFS2_BLKST_UNLINKED); >>>>>+ up_write(&sdp->sd_log_flush_lock); >>>>> >>>>> >>>>> >>>IIRC, if we don't protect rgblk_search from finding GFS2_BLKST_UNLINKED >>>blocks, a "deleted" inode may be returned to function >>>gfs2_inplace_reserve_i which will do an iput on the inode, >>>which may reference buffers that are being flushed to disk. >>>If almost all blocks in that bitmap are allocated, I think the >>>deleted block may sometimes be reused and the buffer >>>associated with the reused block may be changed before it's >>>actually written out to disk. >>> >>> >>> >>Log flushing is an asynchronous event. I still don't see how this can >>*protect* the condition you just described (i.e., prevents the block >>being assigned to someone else before log flush occurs). Or do I >>understand your statement right (i.e., the log flushing must occur >>before the block is used by someone else) ? It may *reduce* the >>possibility (if log flushing happens at the same time as this >>assignment) but I don't see how it can *stop* the condition. >> >>You may "reduce" the (rare) possibility but the real issue is still >>hanging there ? If this is true, then I don't agree we have to pay the >>price of moving a journal flushing lock into resource handling code. >> >>-- Wendy >> >> >> > >Due to the way in which the locking is defined, the journal lock is also >used to keep other processes out of the rgrp bitmaps. This prevents the >state of the rgrp bitmaps changing while we are scanning them in case a >journal flush might occur. > >The sd_log_flush_lock is an rwsem which is held in read mode by each and >every transaction and in write mode when flushing the journal. Log >flushing ought to be an asynchronous event, but due to the design of the >journaling, it unfortunately isn't in GFS2. It is something that we need >to review in the future, > > > > It is still not clear what exactly does this lock protect ? The unlinked rgrp bitmap itself or the buffers associated with these disk blocks ? If it is later (the buffers as Bob said), it implies for every block GFS2 takes from this unlinked bitmap, journal flush *has* to happen before it can be used ? Could you elaborate more ? -- Wendy