From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from verein.lst.de ([213.95.11.211]:44371 "EHLO newverein.lst.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750738AbdAWSDQ (ORCPT ); Mon, 23 Jan 2017 13:03:16 -0500
Date: Mon, 23 Jan 2017 19:03:14 +0100
From: Christoph Hellwig
To: Dan Williams
Cc: Christoph Hellwig, Matthew Wilcox,
	"linux-nvdimm@lists.01.org", Tony Luck, Jan Kara, Toshi Kani,
	Mike Snitzer, "linux-kernel@vger.kernel.org", "x86@kernel.org",
	Jeff Moyer, Jens Axboe, "dm-devel@redhat.com", Ingo Molnar,
	Al Viro, "H. Peter Anvin", "linux-fsdevel@vger.kernel.org",
	Thomas Gleixner, Linus Torvalds, Ross Zwisler
Subject: Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm
Message-ID: <20170123180314.GA23073@lst.de>
References: <20170122162910.GA5267@lst.de>
	<20170122183046.GA7359@lst.de>
	<20170122184439.GA7603@lst.de>
	<20170123160009.GB517@lst.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Jan 23, 2017 at 09:14:04AM -0800, Dan Williams wrote:
> The use case that we have now is distinguishing volatile vs persistent
> memory (brd vs pmem).

brd is a development tool, so until we have other reasons for this
abstraction (which I'm pretty sure will show up sooner rather than
later) I would not worry about it too much.

> I took a look at mtd layering approach and the main difference is that
> layers above the block layer do not appear to know anything about mtd
> specifics.

Or the block layer itself for that matter.  And that's exactly where
I want DAX to be in the future.

> For fs/dax.c we currently need some path to retrieve a dax
> anchor object through the block device.

We have a need to retrieve the anchor object.  We currently do it
through the block layer for historical reasons, but it doesn't have
to be that way.

> > In the longer run I like your dax_operations, but they need to be
> > separate from the block layer.
>
> I'll move them from block_device_operations to dax data hanging off of
> the bdev_inode, or is there a better way to go from bdev-to-dax?

I don't think that's any better.  What we really want is a way to find
the underlying persistent memory / DAX / whatever we call it node
without going through a block device.  E.g. a library function that
returns that object for a given path name, where the path name could be
either that of the /dev/pmemN or the /dev/daxN device.

If the file system for now still needs a block device as well, it
will only accept the /dev/pmemN name, and open both the low-level
pmem device and the block device.  Once that file system doesn't
need block code (and I think we could do that easily for XFS,
never mind any new FS) it won't have to deal with the block
device at all.

pmem.c then becomes a consumer of the dax_ops just like the file system.
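
To make the lookup idea above a bit more concrete, here is a rough
userspace sketch in C.  It is not kernel code and not taken from the
patch series: the layout of struct dax_device, the contents of
struct dax_operations, the toy registry, and the functions
dax_lookup_by_path() and fs_open_dax() are all hypothetical names made
up for illustration.  The only point is that a consumer gets the DAX
object from a path name, and a file system that still needs block code
restricts itself to the /dev/pmemN name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096UL

struct dax_device;

/* Hypothetical ops table; the real dax_operations may look different. */
struct dax_operations {
	/* Returns the number of pages addressable from pgoff, or -1. */
	long (*direct_access)(struct dax_device *dev, unsigned long pgoff,
			      void **kaddr);
};

/* Hypothetical DAX object, found by path name, no block device involved. */
struct dax_device {
	const char *path;
	const struct dax_operations *ops;
	char *base;
	unsigned long nr_pages;
};

static long toy_direct_access(struct dax_device *dev, unsigned long pgoff,
			      void **kaddr)
{
	if (pgoff >= dev->nr_pages)
		return -1;
	*kaddr = dev->base + pgoff * TOY_PAGE_SIZE;
	return (long)(dev->nr_pages - pgoff);
}

static const struct dax_operations toy_dax_ops = {
	.direct_access = toy_direct_access,
};

/* Ordinary memory standing in for persistent memory ranges. */
static char pmem0_backing[4 * TOY_PAGE_SIZE];
static char dax0_backing[2 * TOY_PAGE_SIZE];

static struct dax_device dax_registry[] = {
	{ "/dev/pmem0", &toy_dax_ops, pmem0_backing, 4 },
	{ "/dev/dax0",  &toy_dax_ops, dax0_backing,  2 },
};

/* The "library function": accepts either a pmem or a dax device path. */
struct dax_device *dax_lookup_by_path(const char *path)
{
	size_t i;

	for (i = 0; i < sizeof(dax_registry) / sizeof(dax_registry[0]); i++)
		if (strcmp(dax_registry[i].path, path) == 0)
			return &dax_registry[i];
	return NULL;
}

/*
 * A file system that still needs a block device only accepts the
 * /dev/pmemN name; one without block code accepts either name.
 */
struct dax_device *fs_open_dax(const char *path, int needs_block_dev)
{
	if (needs_block_dev && strncmp(path, "/dev/pmem", 9) != 0)
		return NULL;
	return dax_lookup_by_path(path);
}
```

In this sketch a block-dependent file system would call
fs_open_dax("/dev/pmem0", 1) and open the block device separately,
while a block-free one could pass either path with needs_block_dev
set to 0.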