From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: [RFC] fsblock
Date: Mon, 25 Jun 2007 08:25:21 -0400
Message-ID: <20070625122521.GA12446@think.oraclecorp.com>
References: <20070624014528.GA17609@wotan.suse.de> <467DE00A.9080700@garzik.org> <20070624034755.GA3292@wotan.suse.de> <20070624135126.GA10077@think.oraclecorp.com> <467F67A8.3030408@yahoo.com.au>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Nick Piggin, Jeff Garzik, Linux Kernel Mailing List, Linux Memory Management List, linux-fsdevel@vger.kernel.org
To: Nick Piggin
Return-path:
Received: from rgminet01.oracle.com ([148.87.113.118]:50960 "EHLO rgminet01.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751312AbXFYM2V (ORCPT ); Mon, 25 Jun 2007 08:28:21 -0400
Content-Disposition: inline
In-Reply-To: <467F67A8.3030408@yahoo.com.au>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Mon, Jun 25, 2007 at 04:58:48PM +1000, Nick Piggin wrote:
> >
> >Using buffer heads instead allows the FS to send file data down inside
> >the transaction code, without taking the page lock. So, locking wrt
> >data=ordered is definitely going to be tricky.
> >
> >The best long term option may be making the locking order
> >transaction -> page lock, and change writepage to punt to some other
> >queue when it needs to start a transaction.
>
> Yeah, that's what I would like, and I think it would come naturally
> if we move away from these "pass down a single, locked page APIs"
> in the VM, and let the filesystem do the locking and potentially
> batching of larger ranges.

Definitely.

>
> write_begin/write_end is a step in that direction (and it helps
> OCFS and GFS quite a bit). I think there is also not much reason
> for writepage sites to require the page to lock the page and clear
> the dirty bit themselves (which has seems ugly to me).

If we keep the page mapping information with the page all the time (ie
writepage doesn't have to call get_block ever), it may be possible to
avoid sending down a locked page. But, I don't know the delayed
allocation internals well enough to say for sure if that is true.

Either way, writepage is the easiest of the bunch because it can be
deferred.

-chris
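
To make the "punt to some other queue" idea a bit more concrete, here is
a rough userspace sketch of the ordering being discussed. It is only an
illustration under heavy assumptions, not kernel code: toy_page, toy_fs,
toy_writepage and toy_flush_deferred are all made-up names, the locks are
plain flags, and the "mapped" flag stands in for "the mapping is already
kept with the page, so no get_block and no transaction is needed".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real kernel types or APIs. */
struct toy_page {
    bool locked;   /* models the page lock */
    bool dirty;
    bool mapped;   /* block mapping already known, no allocation needed */
    bool written;
    bool queued;   /* already sitting on the deferred list */
    struct toy_page *next_deferred;
};

struct toy_fs {
    bool in_transaction;            /* models a running transaction */
    struct toy_page *deferred_head; /* pages punted by toy_writepage */
    struct toy_page *deferred_tail;
};

enum toy_result { TOY_WRITTEN, TOY_DEFERRED };

/* Append a page to the deferred list, at most once. */
static void toy_defer(struct toy_fs *fs, struct toy_page *page)
{
    if (page->queued)
        return;
    page->queued = true;
    page->next_deferred = NULL;
    if (fs->deferred_tail)
        fs->deferred_tail->next_deferred = page;
    else
        fs->deferred_head = page;
    fs->deferred_tail = page;
}

/* Called with the page lock held, the way writepage is called today. */
enum toy_result toy_writepage(struct toy_fs *fs, struct toy_page *page)
{
    assert(page->locked);
    if (!page->mapped) {
        /*
         * Writing this page would need a transaction.  Starting one here
         * would give page lock -> transaction, the wrong order, so punt:
         * leave the page dirty, queue it, and drop the page lock.
         */
        toy_defer(fs, page);
        page->locked = false;
        return TOY_DEFERRED;
    }
    /* Mapping is already known: write without any transaction. */
    page->dirty = false;
    page->written = true;
    page->locked = false;
    return TOY_WRITTEN;
}

/* Worker side: take the transaction first, then each page lock. */
size_t toy_flush_deferred(struct toy_fs *fs)
{
    size_t flushed = 0;

    fs->in_transaction = true;              /* transaction first ... */
    while (fs->deferred_head) {
        struct toy_page *page = fs->deferred_head;

        fs->deferred_head = page->next_deferred;
        page->next_deferred = NULL;
        page->queued = false;

        assert(fs->in_transaction && !page->locked);
        page->locked = true;                /* ... then the page lock */
        page->mapped = true;                /* allocate inside the transaction */
        page->dirty = false;
        page->written = true;
        page->locked = false;
        flushed++;
    }
    fs->deferred_tail = NULL;
    fs->in_transaction = false;
    return flushed;
}
```

The only point of the sketch is that the deferred path always takes the
transaction before any page lock, while the direct path never starts a
transaction at all. Real code would also have to deal with pages being
truncated or redirtied while they sit on the queue, which is ignored here.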