From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Mason
Subject: Re: [PATCH 4 of 8] Add flags to control direct IO helpers
Date: Thu, 8 Feb 2007 07:58:03 -0500
Message-ID: <20070208125803.GG11967@think.oraclecorp.com>
References: <04dd7ddd593e9f147723.1170811969@opti.oraclecorp.com> <20070207170845.GA13893@in.ibm.com> <20070207180544.GC11967@think.oraclecorp.com> <20070208040305.GA32642@in.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org
To: Suparna Bhattacharya
Return-path:
Received: from agminet01.oracle.com ([141.146.126.228]:51538 "EHLO agminet01.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1422921AbXBHM7b (ORCPT ); Thu, 8 Feb 2007 07:59:31 -0500
Content-Disposition: inline
In-Reply-To: <20070208040305.GA32642@in.ibm.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu, Feb 08, 2007 at 09:33:05AM +0530, Suparna Bhattacharya wrote:
> On Wed, Feb 07, 2007 at 01:05:44PM -0500, Chris Mason wrote:
> > On Wed, Feb 07, 2007 at 10:38:45PM +0530, Suparna Bhattacharya wrote:
> > > > + * The flags parameter is a bitmask of:
> > > > + *
> > > > + * DIO_PLACEHOLDERS (use placeholder pages for locking)
> > > > + * DIO_CREATE (pass create=1 to get_block for filling holes or extending)
> > >
> > > A little more explanation about why these options are needed, and examples
> > > of when one would specify each of these options would be good.
> >
> > I'll extend the comments in the patch, but for discussion here:
> >
> > DIO_PLACEHOLDERS: placeholders are inserted into the page cache to
> > synchronize the DIO with buffered writes. From a locking point of view,
> > this is similar to inserting and locking pages in the address space
> > corresponding to the DIO.
> >
> > placeholders guard against concurrent allocations and truncates during the DIO.
> > You don't need placeholders if truncates and allocations are
> > impossible (for example, on a block device).
>
> Likewise placeholders may not be needed if the underlying filesystem
> already takes care of locking to synchronize DIO vs buffered.

True, although I don't think any FS covers 100% of the cases right now.

>
> >
> > DIO_CREATE: placeholders make it possible for filesystems to safely fill
> > holes and extend the file via get_block during the DIO. If DIO_CREATE
> > is turned on, get_block will be called with create=1, allowing the FS to
> > allocate blocks during the DIO.
>
> When would one NOT specify DIO_CREATE, and what are the implications ?
> The purpose of having an option of NOT allowing the FS to allocate blocks
> during DIO is not very intuitive from the standpoint of the caller.
> (the block device case could be an example, but then create=1 could not do
> any harm or add extra overhead, so why bother ?)

DIO has fallen back to buffered IO for so long that I wanted filesystems
to explicitly choose create=1 for now. A good example is my patch for
ext3, where the ext3 get_block routine needed to be changed to start a
transaction instead of finding the current transaction in
current->journal_info. The reiserfs DIO get_block needed to be told not
to expect i_mutex to be held, etc etc.

>
> Is there still a valid case where we fall back to buffered IO to fill holes
> - to me that seems to be the only situation where create=0 must be enforced.

Right, when create=0 we fall back, otherwise we don't.

>
> >
> > DIO_DROP_I_MUTEX: If the write is inside of i_size, i_mutex is dropped
> > during the DIO and taken again before returning.
>
> Again an example of when one would not specify this (block device and
> XFS ?) would be useful.

If the FS can't fill a hole or extend the file without i_mutex, or if
the caller has already dropped i_mutex themselves.
I think this is only XFS right now; the long-term goal is to make
placeholders fast enough for XFS to use.

-chris
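
For discussion only, the flag semantics described in this thread can be
sketched in user-space C. This is not code from the patch. The flag names
come from the patch comment quoted above, but the bit values, struct
dio_plan, dio_plan_write(), and the needs_alloc parameter are all
hypothetical names invented for the sketch. It also assumes that "falling
back" applies whenever blocks would have to be allocated (hole or
extension) and create=0.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names are from the thread; the bit values are assumptions. */
#define DIO_PLACEHOLDERS  0x01u  /* use placeholder pages for locking */
#define DIO_CREATE        0x02u  /* pass create=1 to get_block */
#define DIO_DROP_I_MUTEX  0x04u  /* drop i_mutex for writes inside i_size */

/* Hypothetical summary of the decisions; not a kernel structure. */
struct dio_plan {
    bool use_placeholders;      /* insert placeholders into the page cache */
    int  create;                /* value handed to get_block */
    bool fallback_to_buffered;  /* allocation needed but create=0 */
    bool drop_i_mutex;          /* i_mutex dropped during the DIO */
};

/*
 * Work out what a direct IO write would do for a given flag mask.
 * needs_alloc stands in for "get_block found a hole or the write
 * extends the file"; in the kernel this comes from the filesystem.
 */
static struct dio_plan dio_plan_write(unsigned int flags,
                                      long long pos, long long len,
                                      long long i_size, bool needs_alloc)
{
    struct dio_plan p;
    bool inside_i_size;

    assert(pos >= 0 && len >= 0 && i_size >= 0);
    inside_i_size = pos + len <= i_size;

    p.use_placeholders = (flags & DIO_PLACEHOLDERS) != 0;
    p.create = (flags & DIO_CREATE) ? 1 : 0;
    /* With create=0 the FS cannot allocate, so fall back to buffered IO. */
    p.fallback_to_buffered = needs_alloc && p.create == 0;
    /* i_mutex is only dropped when the write stays inside i_size. */
    p.drop_i_mutex = (flags & DIO_DROP_I_MUTEX) != 0 && inside_i_size;
    return p;
}
```

In this sketch, a filesystem that passes all three flags gets create=1,
no fallback, and i_mutex dropped for writes inside i_size. A caller that
passes no flags falls back to buffered IO as soon as a hole is hit.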