From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kara Subject: Re: [PATCH 2/2] ext3: Fix dirtying of journalled buffers in data=journal mode Date: Thu, 5 Aug 2010 21:31:35 +0200 Message-ID: <20100805193135.GI3535@quack.suse.cz> References: <1281026536-10413-1-git-send-email-jack@suse.cz> <1281026536-10413-3-git-send-email-jack@suse.cz> <20100805120941.d915ea09.akpm@linux-foundation.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Jan Kara , linux-ext4@vger.kernel.org, tytso@mit.edu To: Andrew Morton Return-path: Received: from cantor.suse.de ([195.135.220.2]:54288 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755727Ab0HETcI (ORCPT ); Thu, 5 Aug 2010 15:32:08 -0400 Content-Disposition: inline In-Reply-To: <20100805120941.d915ea09.akpm@linux-foundation.org> Sender: linux-ext4-owner@vger.kernel.org List-ID: On Thu 05-08-10 12:09:41, Andrew Morton wrote: > On Thu, 5 Aug 2010 18:42:16 +0200 > Jan Kara wrote: > > > In data=journal mode, we still use block_write_begin() to prepare page for > > writing. This function can occasionally mark buffer dirty which violates > > journalling assumptions - when a buffer is part of a transaction, it should be > > dirty and a buffer can be already part of a forget list of some transaction > > when block_write_begin() gets called. This violation of journalling assumptions > > then results in "JBD: Spotted dirty metadata buffer..." warnings. > > > > In fact, temporary dirtying the buffer while the page is still locked does not > > really cause problems to the journalling because we won't write the buffer > > until the page gets unlocked. So we just have to make sure to clear dirty bits > > before unlocking the page. > > > > Signed-off-by: Jan Kara > > --- > > fs/ext3/inode.c | 18 +++++++++++++++++- > > 1 files changed, 17 insertions(+), 1 deletions(-) > > > > diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c > > index 735f019..1a84abb 100644 > > --- a/fs/ext3/inode.c > > +++ b/fs/ext3/inode.c > > @@ -1149,9 +1149,25 @@ static int walk_page_buffers( handle_t *handle, > > static int do_journal_get_write_access(handle_t *handle, > > struct buffer_head *bh) > > { > > + int dirty = buffer_dirty(bh); > > + int ret; > > + > > if (!buffer_mapped(bh) || buffer_freed(bh)) > > return 0; > > - return ext3_journal_get_write_access(handle, bh); > > + /* > > + * __block_prepare_write() could have dirtied some buffers. Clean > > + * the dirty bit as jbd2_journal_get_write_access() could complain > > + * otherwise about fs integrity issues. Setting of the dirty bit > > + * by __block_prepare_write() isn't a real problem here as we clear > > + * the bit before releasing a page lock and thus writeback cannot > > + * ever write the buffer. > > + */ > > + if (dirty) > > + clear_buffer_dirty(bh); > > mark_buffer_dirty() can run set_page_dirty() which will set the page > dirty and increment dirty-page accounting. If we then run > clear_buffer_dirty() we can end up with a dirty page which has clean > buffers and an off-by-one in dirty-page accounting. > > Later, writeback will come along and will attempt to write the "dirty" > page. It will discover the cleanness of the buffers, will mark the > page clean without doing any IO and will decrement the dirty-page > accounting. So everything gets fixed up again. > > So I don't see any problem here and this isn't the only place where > this sort of thing occurs. It's just somethnig to be aware of and to > have a think about. Yes, JBD tends to do this a lot (with all the journaled buffers). Thanks for review! Honza -- Jan Kara SUSE Labs, CR