From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-fsdevel-owner@vger.kernel.org>
Received: from verein.lst.de ([213.95.11.211]:44371 "EHLO newverein.lst.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750738AbdAWSDQ (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
        Mon, 23 Jan 2017 13:03:16 -0500
Date: Mon, 23 Jan 2017 19:03:14 +0100
From: Christoph Hellwig <hch@lst.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
        Matthew Wilcox <mawilcox@microsoft.com>,
        "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
        Tony Luck <tony.luck@intel.com>, Jan Kara <jack@suse.cz>,
        Toshi Kani <toshi.kani@hpe.com>,
        Mike Snitzer <snitzer@redhat.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "x86@kernel.org" <x86@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
        Jens Axboe <axboe@fb.com>,
        "dm-devel@redhat.com" <dm-devel@redhat.com>,
        Ingo Molnar <mingo@redhat.com>,
        Al Viro <viro@zeniv.linux.org.uk>,
        "H. Peter Anvin" <hpa@zytor.com>,
        "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
        Thomas Gleixner <tglx@linutronix.de>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Ross Zwisler <ross.zwisler@linux.intel.com>
Subject: Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to
        libnvdimm
Message-ID: <20170123180314.GA23073@lst.de>
References: <BY2PR21MB00367799FE7B7E8302A99260CB730@BY2PR21MB0036.namprd21.prod.outlook.com> <20170122162910.GA5267@lst.de> <BY2PR21MB00368E95CBB533D4C761C882CB730@BY2PR21MB0036.namprd21.prod.outlook.com> <20170122183046.GA7359@lst.de> <BY2PR21MB0036CC7935BFE438EA001763CB730@BY2PR21MB0036.namprd21.prod.outlook.com> <20170122184439.GA7603@lst.de> <BY2PR21MB0036CA85562DDD21814C0B27CB720@BY2PR21MB0036.namprd21.prod.outlook.com> <CAPcyv4h83wk5SGVPY+8nUZ1e4Gq9fz61GE3BhAdzGi=HFAuDcg@mail.gmail.com> <20170123160009.GB517@lst.de> <CAPcyv4gAbwS9yKNgAN9ytpDg7Jqh1FubZbGSfbFP0f-DdXPpCg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAPcyv4gAbwS9yKNgAN9ytpDg7Jqh1FubZbGSfbFP0f-DdXPpCg@mail.gmail.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Mon, Jan 23, 2017 at 09:14:04AM -0800, Dan Williams wrote:
> The use case that we have now is distinguishing volatile vs persistent
> memory (brd vs pmem).

brd is a development tool, so until we have other reasons for this
abstraction (which I'm pretty sure will show up sooner rather than later)
I would not worry about it too much.

> I took a look at mtd layering approach and the main difference is that
> layers above the block layer do not appear to know anything about mtd
> specifics.

Or the block layer itself for that matter.  And that's exactly where
I want DAX to be in the future.

> For fs/dax.c we currently need some path to retrieve a dax
> anchor object through the block device.

We have a need to retrieve the anchor object.  We currently do it
through the block layer for historical reasons, but it doesn't have
to be that way.

> > In the longer run I like your dax_operations, but they need to be
> > separate from the block layer.
> 
> I'll move them from block_device_operations to dax data hanging off of
> the bdev_inode, or is there a better way to go from bdev-to-dax?

I don't think that's any better.  What we really want is a way
to find the underlying persistent memory / DAX / whatever we call
it node without going through a block device.  E.g. a library function
to give that object for a given path name, where the path name could
be either that of the /dev/pmemN or the /dev/daxN device.
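
As a rough illustration only, such a lookup could look something like
the userspace mock below.  Every name in it (struct dax_node, struct
dax_ops, dax_get_by_path) is made up for this sketch and is not an
existing kernel interface, and the static table merely stands in for
real device resolution.

```c
/* Userspace mock; all names are hypothetical, not existing kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dax_node;

/* Operations the low-level persistent memory object would export. */
struct dax_ops {
	/* Translate an offset into a directly addressable pointer. */
	void *(*direct_access)(struct dax_node *node, size_t offset);
};

struct dax_node {
	const char *name;		/* e.g. "pmem0" or "dax0" */
	const struct dax_ops *ops;
	unsigned char *base;		/* stand-in for the mapped memory */
	size_t size;
};

static void *mock_direct_access(struct dax_node *node, size_t offset)
{
	return offset < node->size ? node->base + offset : NULL;
}

static const struct dax_ops mock_dax_ops = {
	.direct_access = mock_direct_access,
};

static unsigned char mock_mem[2][4096];

/* Stand-in registry; a real version would resolve the device node. */
static struct dax_node dax_nodes[] = {
	{ "pmem0", &mock_dax_ops, mock_mem[0], sizeof(mock_mem[0]) },
	{ "dax0",  &mock_dax_ops, mock_mem[1], sizeof(mock_mem[1]) },
};

/*
 * Return the DAX object for a path such as /dev/pmem0 or /dev/dax0,
 * or NULL if the path does not name one.  No block device is involved.
 */
struct dax_node *dax_get_by_path(const char *path)
{
	static const char prefix[] = "/dev/";
	size_t i;

	assert(path != NULL);
	if (strncmp(path, prefix, sizeof(prefix) - 1) != 0)
		return NULL;
	path += sizeof(prefix) - 1;

	for (i = 0; i < sizeof(dax_nodes) / sizeof(dax_nodes[0]); i++)
		if (strcmp(path, dax_nodes[i].name) == 0)
			return &dax_nodes[i];
	return NULL;
}
```

The point of the sketch is only that the caller gets the same kind of
object back for either device name and never touches a block device
to do so.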

If the file system for now still needs a block device as well it
will only accept the /dev/pmemN name, and open both the low-level
pmem device and the block device.  Once that file system doesn't
need block code (and I think we could do that easily for XFS,
never mind any new FS) it won't have to deal with the block
device at all.

pmem.c then becomes a consumer of the dax_ops just like the file system.