From: Jan Kara <jack@suse.cz>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Jan Kara <jack@suse.cz>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
Dave Chinner <david@fromorbit.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
linux-xfs@vger.kernel.org, Jeff Moyer <jmoyer@redhat.com>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andy Lutomirski <luto@kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Linux API <linux-api@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap
Date: Mon, 14 Aug 2017 14:40:59 +0200
Message-ID: <20170814124059.GC17820@quack2.suse.cz>
In-Reply-To: <CAPcyv4ixTgSWG9K2Eg3XJmOvqJht81qL+Z3njoOjcXCD7XMpZw@mail.gmail.com>
On Sun 13-08-17 13:31:45, Dan Williams wrote:
> On Sun, Aug 13, 2017 at 2:24 AM, Christoph Hellwig <hch@lst.de> wrote:
> > That being said I think we absolutely should support RDMA memory
> > registrations for DAX mappings. I'm just not sure how S_IOMAP_IMMUTABLE
> > helps with that. We'll want a MAP_SYNC | MAP_POPULATE to make sure
> > all the blocks are populated and all ptes are set up. Second we need
> > to make sure get_user_pages works, which for now means we'll need a
> > struct page mapping for the region (which will be really annoying
> > for PCIe mappings, like the upcoming NVMe persistent memory region),
> > and we need to guarantee that the extent mapping won't change while
> > the get_user_pages holds the pages inside it. I think that is true
> > due to side effects even with the current DAX code, but we'll need to
> > make it explicit. And maybe that's where we need to converge -
> > "sealing" the extent map makes sense as such a temporary measure
> > that is not persisted on disk, which automatically gets released
> > when the holding process exits, because we sort of already do this
> > implicitly. It might also make sense to have explicitly breakable
> > seals similar to what I do for the pNFS blocks kernel server, as
> > any userspace RDMA file server would also need those semantics.
>
> Ok, how about a MAP_DIRECT flag that arranges for faults to that range to:
>
> 1/ only succeed if the fault can be satisfied without page cache
>
> 2/ only install a pte for the fault if it can do so without
> triggering block map updates
>
> So, I think it would still end up setting an inode flag to make
> xfs_bmapi_write() fail while any process has a MAP_DIRECT mapping
> active. However, it would not record that state in the on-disk
> metadata and it would automatically clear at munmap time. That should
> be enough to support the host-persistent-memory, and
> NVMe-persistent-memory use cases (provided we have struct page for
> NVMe). Although, we need more safety infrastructure in the NVMe case
> where we would need to software manage I/O coherence.
Hum, this proposal (and the problems you are trying to deal with) seems very
similar to Peter Zijlstra's mpin() proposal from 2014 [1], just moved to
the DAX area (and so additionally complicated by the fact that filesystems
now have to care). The patch set was not merged, due to lack of interest I
think, but it looked sensible and the proposed API would make sense for more
than just DAX, so maybe it would be better than a MAP_DIRECT flag?

[1] https://lwn.net/Articles/600502/

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
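
Editorial sketch, not part of the original message: the two MAP_DIRECT fault
rules quoted above, plus the in-memory (not on-disk) inode state that is
dropped at munmap time, can be modelled as a small user-space toy in C. Every
name below (fault_request, map_direct_fault, toy_inode and its helpers) is
hypothetical and chosen for illustration only. None of it is kernel API, and
MAP_DIRECT itself is only a proposal in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of a fault on a MAP_DIRECT range. */
enum fault_result { FAULT_INSTALL_PTE, FAULT_REFUSE };

/* Hypothetical summary of what satisfying a fault would require. */
struct fault_request {
    bool needs_page_cache;       /* cannot be served directly from storage */
    bool needs_block_map_update; /* would allocate or convert extents */
};

/* Rule 1: refuse if page cache is needed.
 * Rule 2: refuse if installing the pte would trigger a block map update. */
enum fault_result map_direct_fault(const struct fault_request *req)
{
    if (req->needs_page_cache)
        return FAULT_REFUSE;
    if (req->needs_block_map_update)
        return FAULT_REFUSE;
    return FAULT_INSTALL_PTE;
}

/* Toy inode: counts active MAP_DIRECT mappings. The count lives only in
 * memory, mirroring the "not recorded in on-disk metadata" requirement. */
struct toy_inode {
    unsigned int direct_mappings;
};

void toy_mmap_direct(struct toy_inode *inode)
{
    inode->direct_mappings++;
}

void toy_munmap_direct(struct toy_inode *inode)
{
    assert(inode->direct_mappings > 0);
    inode->direct_mappings--; /* state clears automatically at munmap */
}

/* Stand-in for the check that would make a block map write fail while
 * any MAP_DIRECT mapping is active. */
bool toy_block_map_write_allowed(const struct toy_inode *inode)
{
    return inode->direct_mappings == 0;
}
```

In this model a block map write is allowed again as soon as the last mapping
is removed, which is the "automatically gets released" behaviour discussed in
the quoted text. A real implementation would also have to handle process exit
and concurrent faults, which the toy ignores.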