Date: Tue, 27 Oct 2015 09:19:30 +1100
From: Dave Chinner
To: Dan Williams
Cc: Jan Kara, "linux-kernel@vger.kernel.org", "jmoyer@redhat.com", "hch@lst.de", "axboe@fb.com", "akpm@linux-foundation.org", "linux-nvdimm@lists.01.org", "willy@linux.intel.com", "ross.zwisler@linux.intel.com"
Subject: Re: [PATCH 5/5] block: enable dax for raw block devices
Message-ID: <20151026221930.GL19199@dastard>
References: <20151022064142.12700.11849.stgit@dwillia2-desk3.amr.corp.intel.com> <20151022064211.12700.77105.stgit@dwillia2-desk3.amr.corp.intel.com> <20151022093549.GE14445@quack.suse.cz> <1445529945.17208.4.camel@intel.com> <20151022210818.GC8670@quack.suse.cz> <20151025212247.GI19199@dastard> <20151026062319.GJ19199@dastard>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 26, 2015 at 05:56:30PM +0900, Dan Williams wrote:
> On Mon, Oct 26, 2015 at 3:23 PM, Dave Chinner wrote:
> > Also, DAX access isn't a property of mmap - it's a property
> > of the inode. We cannot do DAX access via mmap while mixing page
> > cache based access through file descriptor based interfaces.
> > This is why I'm adding an inode attribute (on disk) to enable per-file DAX
> > capabilities - either everything is via the DAX paths, or nothing
> > is.
> >
>
> Per-inode control sounds very useful, I'll look at a similar mechanism
> for the raw block case.
>
> However, still not quite convinced page-cache control is an inode-only
> property, especially when direct-i/o is not an inode-property. That
> said, I agree the complexity of handling mixed mappings of the same
> file is prohibitive.

We didn't get that choice with direct IO - support via O_DIRECT was
kinda inherited from other OSes(*). We still have all sorts of
coherency problems between buffered/mmap/direct IO on the same file,
and I'd really, really like to avoid making that same mistake again
with DAX.

i.e. We have a choice with DAX right now that will allow us to avoid
coherency problems that we know exist and can't solve right now.
Making DAX an inode property rather than an application context
property avoids those coherency problems, as all access will play by
the same rules....

(*) That said, some other OSes did O_DIRECT as an inode property (e.g.
Solaris), where O_DIRECT was only done if no other cached operations
were required (e.g. mmap), and so the fd would transparently shift
between buffered and O_DIRECT depending on external accesses to the
inode. This was not liked because of its unpredictable effect on CPU
usage and IO latency....

> Sounds good, get blkdev_issue_flush() functional first and then worry
> about building a more efficient solution on top.

*nod*

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
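
The per-fd versus per-inode distinction discussed in this message can be illustrated with a toy model. This is only a sketch and not kernel code: `Inode`, `File`, the `dax` and `direct` flags, and `demo()` are all hypothetical names made up for illustration. The model deliberately omits any page cache invalidation, so it shows the class of coherency problem rather than what the kernel actually does on a direct IO write.

```python
class Inode:
    """Toy inode: backing 'media' plus a lazily populated page cache."""

    def __init__(self, data, dax=False):
        self.media = bytearray(data)
        self.page_cache = None  # filled on first buffered access
        self.dax = dax          # hypothetical per-inode flag: always bypass the cache


class File:
    """Toy open file; 'direct' is a per-open property, loosely like O_DIRECT."""

    def __init__(self, inode, direct=False):
        self.inode = inode
        self.direct = direct

    def _bypass_cache(self):
        # Per-inode flag applies to every opener; per-fd flag only to this one.
        return self.inode.dax or self.direct

    def _fill_cache(self):
        if self.inode.page_cache is None:
            self.inode.page_cache = bytearray(self.inode.media)

    def read(self):
        if self._bypass_cache():
            return bytes(self.inode.media)
        self._fill_cache()
        return bytes(self.inode.page_cache)

    def write(self, data):
        if self._bypass_cache():
            # Straight to media; this toy performs no cache invalidation.
            self.inode.media[:len(data)] = data
        else:
            self._fill_cache()
            self.inode.page_cache[:len(data)] = data


def demo():
    # Per-fd mode: a buffered opener and a direct opener share one inode.
    ino = Inode(b"old")
    buffered, direct = File(ino), File(ino, direct=True)
    buffered.read()        # page cache now holds b"old"
    direct.write(b"new")   # media updated, cache left stale
    mixed = (buffered.read(), direct.read())

    # Per-inode mode: every opener takes the same path.
    ino2 = Inode(b"old", dax=True)
    a, b = File(ino2), File(ino2)
    a.read()
    b.write(b"new")
    uniform = (a.read(), b.read())
    return mixed, uniform
```

In this toy, the two openers in the per-fd case disagree about the file contents after the write, while the per-inode case cannot diverge because there is only one access path.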