From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Christoph Hellwig <hch@lst.de>, Andy Lutomirski <luto@kernel.org>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	"Rudoff, Andy" <andy.rudoff@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>, linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux API <linux-api@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Jeff Moyer <jmoyer@redhat.com>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem
Date: Wed, 21 Jun 2017 12:19:03 +1000
Message-ID: <20170621021903.GM17542@dastard>
In-Reply-To: <20170621012403.GB4730@birch.djwong.org>

On Tue, Jun 20, 2017 at 06:24:03PM -0700, Darrick J. Wong wrote:
> On Wed, Jun 21, 2017 at 09:53:46AM +1000, Dave Chinner wrote:
> > On Tue, Jun 20, 2017 at 09:17:36AM -0700, Dan Williams wrote:
> > > An immutable-extent DAX-file and a reflink-capable DAX-file are not
> > > mutually exclusive,
> > 
> > Actually, they are mutually exclusive: when the immutable extent DAX
> > inode is breaking the extent sharing done during the reflink
> > operation, the copy-on-write operation requires allocating and
> > freeing extents on the inode that has immutable extents. Which, if
> > the inode really has immutable extents, cannot be done.
> > 
> > That said, if the extent sharing is broken on the other side of the
> > reflink (i.e. the non-immutable inode created by the reflink) then
> > the extent map of the inode with immutable extents will remain
> > unchanged. i.e. there are two sides to this, and if you only see one
> > side you might come to the wrong conclusion.
> > 
> > However, we cannot guarantee that no writes occur to the inode with
> > immutable extent maps (especially as the whole point is to allow
> > userspace writes and commits without the kernel being involved), so
> > extent sharing on immutable extent maps cannot be allowed...
> 
> Just to play devil's advocate...
> 
> /If/ you have rmap and /if/ you discover that there's only one
> IOMAP_IMMUTABLE file owning this same block and /if/ you're willing to
> relocate every other mapping on the whole filesystem, /then/ you could
> /in theory/ support shared daxfiles.

I figured that nobody apart from experienced filesystem developers
would understand the complexities of rmap and refcounts and how they
could be abused to do this. I also assumed that people like you
would understand this is possible but completely impractical....

> However, that's so many on-disk metadata lookups to shove into a
> pagefault handler that I don't think anyone in XFSland would entertain
> such an ugly fantasy.  You'd be making a lot of metadata requests, and
> you'd have to lock the rmapbt while grabbing inodes, which is insane.

Exactly. But while I understand this, consider the amount of assumed
filesystem and XFS knowledge in that one simple paragraph. Most
non-experts would have stopped *understanding* at "/If/ you have
rmap" and gone away with the wrong ideas in their heads. Hence I now
tend to omit mentioning "possible but impractical" things in
mixed-expertise discussions....

> Much easier to have a per-inode flag that says "the block map of this
> file does not change" and put up with the restricted semantics.

In a nutshell.
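
For readers without the filesystem background, the restricted
semantics can be illustrated with a toy userspace model. This is only
a sketch under stated assumptions: it is not kernel or XFS code, every
name in it (toy_block, toy_inode, toy_reflink, toy_write,
toy_set_immutable) is hypothetical, each inode maps exactly one block,
and the errno values are arbitrary choices for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT kernel code. */
struct toy_block {
	int refcount;		/* number of inodes mapping this block */
};

struct toy_inode {
	struct toy_block *blk;	/* the single "extent" this inode maps */
	bool immutable_map;	/* "the block map of this file does not change" */
};

/*
 * Share src's block with dst. Refused when either side has an immutable
 * map: dst's map would change now, and a later write to src would need
 * copy-on-write, which changes src's map.
 */
static int toy_reflink(struct toy_inode *src, struct toy_inode *dst)
{
	assert(src && dst && src->blk);
	if (src->immutable_map || dst->immutable_map)
		return -EPERM;	/* assumed error code */
	if (dst->blk)
		dst->blk->refcount--;
	dst->blk = src->blk;
	src->blk->refcount++;
	return 0;
}

/*
 * Write through an inode. A shared block must be unshared first, which
 * remaps the writer onto 'spare'. An immutable map cannot be remapped.
 */
static int toy_write(struct toy_inode *ip, struct toy_block *spare)
{
	assert(ip && ip->blk && spare);
	if (ip->blk->refcount > 1) {
		if (ip->immutable_map)
			return -EPERM;	/* COW would change the block map */
		ip->blk->refcount--;
		spare->refcount = 1;
		ip->blk = spare;
	}
	return 0;	/* write in place, block map unchanged */
}

/* The flag can only be set while nothing is shared. */
static int toy_set_immutable(struct toy_inode *ip)
{
	assert(ip && ip->blk);
	if (ip->blk->refcount > 1)
		return -EBUSY;	/* assumed error code */
	ip->immutable_map = true;
	return 0;
}
```

In this model the flag and sharing never coexist on one block, so a
write through the flagged inode never needs to touch its block map.
The rmap-based alternative would instead have to find and relocate
every other owner at write time, which is the part nobody wants in a
page fault handler.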

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
