Date: Wed, 21 Jun 2017 12:19:03 +1000
From: Dave Chinner
To: "Darrick J. Wong"
Cc: Dan Williams, Christoph Hellwig, Andy Lutomirski, Ross Zwisler,
    "Rudoff, Andy", Andrew Morton, Jan Kara, linux-nvdimm, Linux API,
    "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", Jeff Moyer,
    Linux FS Devel
Subject: Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem
Message-ID: <20170621021903.GM17542@dastard>
References: <20170619132107.GG11993@dastard> <20170620004653.GI17542@dastard>
    <20170620084924.GA9752@lst.de> <20170620235346.GK17542@dastard>
    <20170621012403.GB4730@birch.djwong.org>
In-Reply-To: <20170621012403.GB4730@birch.djwong.org>

On Tue, Jun 20, 2017 at 06:24:03PM -0700, Darrick J.
Wong wrote:
> On Wed, Jun 21, 2017 at 09:53:46AM +1000, Dave Chinner wrote:
> > On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote:
> > > An immutable-extent DAX-file and a reflink-capable DAX-file are not
> > > mutually exclusive,
> >
> > Actually, they are mutually exclusive: when the immutable extent DAX
> > inode is breaking the extent sharing done during the reflink
> > operation, the copy-on-write operation requires allocating and
> > freeing extents on the inode that has immutable extents. Which, if
> > the inode really has immutable extents, cannot be done.
> >
> > That said, if the extent sharing is broken on the other side of the
> > reflink (i.e. the non-immutable inode created by the reflink) then
> > the extent map of the inode with immutable extents will remain
> > unchanged. i.e. there are two sides to this, and if you only see one
> > side you might come to the wrong conclusion.
> >
> > However, we cannot guarantee that no writes occur to the inode with
> > immutable extent maps (especially as the whole point is to allow
> > userspace writes and commits without the kernel being involved), so
> > extent sharing on immutable extent maps cannot be allowed...
>
> Just to play devil's advocate...
>
> /If/ you have rmap and /if/ you discover that there's only one
> IOMAP_IMMUTABLE file owning this same block and /if/ you're willing to
> relocate every other mapping on the whole filesystem, /then/ you could
> /in theory/ support shared daxfiles.

I figured that nobody apart from experienced filesystem developers
would understand the complexities of rmap and refcounts and how they
could be abused to do this. I also assumed that people like you
would understand this is possible but completely impractical....

> However, that's so many on-disk metadata lookups to shove into a
> pagefault handler that I don't think anyone in XFSland would entertain
> such an ugly fantasy.
> You'd be making a lot of metadata requests, and
> you'd have to lock the rmapbt while grabbing inodes, which is insane.

Exactly. But while I understand this, consider the amount of assumed
filesystem and XFS knowledge in that one simple paragraph. Most
non-experts would have stopped *understanding* at "/If/ you have
rmap" and gone away with the wrong ideas in their heads.

Hence I now tend to omit mentioning "possible but impractical"
things in mixed expertise discussions....

> Much easier to have a per-inode flag that says "the block map of this
> file does not change" and put up with the restricted semantics.

In a nutshell.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
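
For readers who are not filesystem developers, the "two sides" of the
reflink argument quoted above can be sketched as a toy model. This is
only an illustration in Python, not XFS code: the Inode and Extent
classes, the reflink() and write() helpers, the block numbers and the
"immutable" field are all hypothetical stand-ins for the per-inode flag
and the copy-on-write behaviour discussed in the thread.

```python
from dataclasses import dataclass, field


class ImmutableExtentError(Exception):
    """Raised when an operation would change an immutable block map."""


@dataclass
class Extent:
    block: int          # hypothetical physical block number
    refcount: int = 1   # number of inodes that map this extent


@dataclass
class Inode:
    name: str
    immutable: bool = False   # stand-in for "block map does not change"
    extents: list = field(default_factory=list)


_next_block = [100]  # trivial allocator state for the toy model


def alloc_extent() -> Extent:
    _next_block[0] += 1
    return Extent(block=_next_block[0])


def reflink(src: Inode, name: str) -> Inode:
    """Create a clone that shares every extent with src."""
    for ext in src.extents:
        ext.refcount += 1
    return Inode(name=name, extents=list(src.extents))


def write(inode: Inode, index: int) -> None:
    """Write to one extent, breaking sharing via copy-on-write if needed."""
    ext = inode.extents[index]
    if ext.refcount == 1:
        return  # unshared: overwrite in place, block map unchanged
    if inode.immutable:
        # CoW would have to allocate a new extent and remap this inode.
        raise ImmutableExtentError(f"{inode.name}: CoW would change the block map")
    ext.refcount -= 1
    inode.extents[index] = alloc_extent()


# Side 1: sharing is broken on the clone, the immutable map is untouched.
dax = Inode("daxfile", immutable=True, extents=[alloc_extent()])
original_map = [e.block for e in dax.extents]
clone = reflink(dax, "clone")
write(clone, 0)
assert [e.block for e in dax.extents] == original_map

# Side 2: sharing would have to be broken on the immutable inode itself.
other = reflink(dax, "other")
try:
    write(dax, 0)
except ImmutableExtentError as err:
    print("refused:", err)
```

In this model write() can at least refuse the second case. In the
situation described in the thread, userspace writes to the DAX mapping
without the kernel being involved, so there is no equivalent point at
which to refuse, which is why extent sharing on immutable extent maps
is ruled out up front.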