From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: kjournald() with DIO
Date: Thu, 15 Sep 2005 14:26:42 -0700
Message-ID: <20050915142642.05f3d75e.akpm@osdl.org>
References: <20050913160701.355cd46a.akpm@osdl.org>
	<1126718583.4010.6.camel@localhost.localdomain>
	<20050914111809.41c5b395.akpm@osdl.org>
	<1126734025.4010.21.camel@localhost.localdomain>
	<20050914150224.3b6d7051.akpm@osdl.org>
	<1126796604.14837.111.camel@dyn9047017102.beaverton.ibm.com>
	<20050915192225.GJ4122@opteron.random>
	<20050915130018.287270e4.akpm@osdl.org>
	<20050915202019.GK4122@opteron.random>
	<20050915133500.754a8b4d.akpm@osdl.org>
	<20050915210358.GL4122@opteron.random>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: pbadari@us.ibm.com, cmm@us.ibm.com, linux-fsdevel@vger.kernel.org,
	sct@redhat.com
Return-path:
Received: from smtp.osdl.org ([65.172.181.4]:57492 "EHLO smtp.osdl.org")
	by vger.kernel.org with ESMTP id S1030547AbVIOV1k (ORCPT );
	Thu, 15 Sep 2005 17:27:40 -0400
To: Andrea Arcangeli
In-Reply-To: <20050915210358.GL4122@opteron.random>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

Andrea Arcangeli wrote:
>
> On Thu, Sep 15, 2005 at 01:35:00PM -0700, Andrew Morton wrote:
> > I'll need reminding - why was the buffer unfreeable? Because kjournald had
> > a ref on it? Whereabouts is that happening?
>
> Yes, kjournald had a ref (b_count) on it. It's happening when running
> iozone in some way (Badari knows how to reproduce).
>
> I was now wondering why it's a problem to destroy dirty buffers in
> invalidate_inode_pages2? (ok, ignoring the detail that PageDirty may
> return true inside invalidate_complete_page)

Bear in mind that generic_file_direct_IO() has just fsynced that section of
the file so there shouldn't be any dirty buffers unless something funny is
happening. Like, direct-io fell back to buffered IO.
In that case perhaps we should run sync_page_range() before trying the
invalidate_inode_pages2_range()?

> Clearly we can't do that inside releasepage, but invalidate_inode_pages2
> is all about destroying the dirty and uptodate information, since we
> just finished writing with direct-io (or at least the dirty info isn't
> that important anymore, we're still in our write context for those pages
> that we're invalidating [even better with the range]).

Those dirty pages/buffers could be there because
__generic_file_aio_write_nolock() fell back to buffered I/O.

If we run sync_page_range() before invalidate_inode_pages2_range() and we
*still* find dirty pages/buffers then yes, it might be right to simply nuke
those dirty bits.
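
To make the proposed ordering concrete, here is a rough userspace sketch.
It is a toy model and not kernel code: every toy_* name, the struct layouts
and the "stuck" flag are made up for illustration and only stand in for
sync_page_range(), invalidate_inode_pages2_range() and whatever the
"something funny" turns out to be. The idea is: write back the range first,
try a normal invalidate, and only if dirty pages are still found, discard
their dirty bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. NOT kernel code; all names here are hypothetical. */
#define TOY_NPAGES 8

struct toy_page {
    bool cached; /* page is present in the (toy) page cache */
    bool dirty;  /* page holds data that has not been written back */
    bool stuck;  /* assumption: writeback cannot clean this page */
};

struct toy_mapping {
    struct toy_page pages[TOY_NPAGES];
    int writebacks; /* number of pages the toy sync actually cleaned */
};

/* Stand-in for sync_page_range(): clean dirty pages in [start, end]. */
static void toy_sync_range(struct toy_mapping *m, size_t start, size_t end)
{
    for (size_t i = start; i <= end && i < TOY_NPAGES; i++) {
        struct toy_page *p = &m->pages[i];
        if (p->cached && p->dirty && !p->stuck) {
            p->dirty = false;
            m->writebacks++;
        }
    }
}

/*
 * Stand-in for invalidate_inode_pages2_range(): drop cached pages in
 * [start, end]. A dirty page is skipped unless nuke_dirty is set, in
 * which case its dirty bit is discarded first. Returns the number of
 * pages that were found dirty.
 */
static int toy_invalidate_range(struct toy_mapping *m, size_t start,
                                size_t end, bool nuke_dirty)
{
    int found_dirty = 0;

    for (size_t i = start; i <= end && i < TOY_NPAGES; i++) {
        struct toy_page *p = &m->pages[i];
        if (!p->cached)
            continue;
        if (p->dirty) {
            found_dirty++;
            if (!nuke_dirty)
                continue;       /* leave the page alone */
            p->dirty = false;   /* discard the dirty bit */
        }
        p->cached = false;
    }
    return found_dirty;
}

/*
 * The ordering discussed above: sync, then invalidate, and nuke dirty
 * bits only for pages that are *still* dirty after the sync.
 * Returns how many pages needed their dirty bit discarded.
 */
static int toy_finish_direct_write(struct toy_mapping *m, size_t start,
                                   size_t end)
{
    toy_sync_range(m, start, end);
    if (toy_invalidate_range(m, start, end, false) == 0)
        return 0;
    return toy_invalidate_range(m, start, end, true);
}
```

In this model a page dirtied by a buffered fallback is cleaned by the sync
and dropped normally, so the forced path only triggers for pages that stay
dirty for some other reason. Whether the real code should take that forced
path is exactly the open question.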