linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Boaz Harrosh <boaz@plexistor.com>, Jan Kara <jack@suse.cz>,
	Mike Snitzer <snitzer@redhat.com>, Neil Brown <neilb@suse.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Chris Mason <clm@fb.com>, Paul Mackerras <paulus@samba.org>,
	"H. Peter Anvin" <hpa@zytor.com>, Christoph Hellwig <hch@lst.de>,
	Alasdair Kergon <agk@redhat.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Mel Gorman <mgorman@suse.de>,
	Matthew Wilcox <willy@linux.intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Rik van Riel <riel@redhat.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Jens Axboe <axboe@kernel.dk>, Theodore Ts'o <tytso@mit.edu>,
	"Martin K. Petersen" <martin.petersen@ora
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t
Date: Thu, 7 May 2015 19:36:41 +0200	[thread overview]
Message-ID: <20150507173641.GA21781@gmail.com> (raw)
In-Reply-To: <CAPcyv4i5957h0PW5JL0rv4iurL=c7ppEFj3cDrrrMc4MQmJQGQ@mail.gmail.com>


* Dan Williams <dan.j.williams@intel.com> wrote:

> > Anyway, I did want to say that while I may not be convinced about 
> > the approach, I think the patches themselves don't look horrible. 
> > I actually like your "__pfn_t". So while I (very obviously) have 
> > some doubts about this approach, it may be that the most 
> > convincing argument is just in the code.
> 
> Ok, I'll keep thinking about this and come back when we have a 
> better story about passing mmap'd persistent memory around in 
> userspace.

So is there anything fundamentally wrong about creating struct page 
backing at mmap() time (and making sure aliased mmaps share struct 
page arrays)?

Because if that is done, then the DMA agent won't even know about the 
memory being persistent RAM. It's just a regular struct page, that 
happens to point to persistent RAM. Same goes for all the high level 
VM APIs, futexes, etc. Everything will Just Work.

It will also be relatively fast: mmap() is a relative slowpath, 
comparatively.

As far as RAID is concerned: that's a relatively easy situation, as 
there's only a single user of the devices, the RAID context that 
manages all component devices exclusively. Device to device DMA can 
use the block layer directly, i.e. most of the patches you've got here 
in this series, except:

74287   C May 06 Dan Williams    ( 232) ├─>[PATCH v2 09/10] dax: convert to __pfn_t

I think DAX mmap()s need struct page backing.

I think there's a simple rule: if a page is visible to user-space via 
the MMU then it needs struct page backing. If it's "hidden", like 
behind a RAID abstraction, it probably doesn't.

With the remaining patches a high level RAID driver ought to be able 
to send pfn-to-sector and sector-to-pfn requests to other block 
drivers, without any unnecessary struct page allocation overhead, 
right?

As long as the pfn concept remains a clever way to reuse our 
ram<->sector interfaces to implement sector<->sector IO, in the cases 
where the IO has no serialization or MMU concerns, not using struct 
page and using pfn_t looks natural.

The moment it starts reaching user space APIs, like in the DAX case, 
and especially if it becomes user-MMU visible, it's a mistake to not 
have struct page backing, I think.

(In that sense the current DAX mmap() code is already a partial 
mistake.)

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2015-05-07 17:36 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-06 20:04 [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t Dan Williams
2015-05-06 20:04 ` [PATCH v2 01/10] arch: introduce __pfn_t for persistent memory i/o Dan Williams
2015-05-07 14:55   ` Stephen Rothwell
2015-05-08  0:21     ` Dan Williams
2015-05-06 20:05 ` [PATCH v2 02/10] block: add helpers for accessing a bio_vec page Dan Williams
2015-05-08 15:59   ` Dan Williams
2015-05-06 20:05 ` [PATCH v2 03/10] block: convert .bv_page to .bv_pfn bio_vec Dan Williams
2015-05-06 20:05 ` [PATCH v2 04/10] dma-mapping: allow archs to optionally specify a ->map_pfn() operation Dan Williams
2015-05-06 20:05 ` [PATCH v2 05/10] scatterlist: use sg_phys() Dan Williams
2015-05-06 20:05 ` [PATCH v2 06/10] scatterlist: support "page-less" (__pfn_t only) entries Dan Williams
2015-05-06 20:05 ` [PATCH v2 07/10] x86: support dma_map_pfn() Dan Williams
2015-05-06 20:05 ` [PATCH v2 08/10] x86: support kmap_atomic_pfn_t() for persistent memory Dan Williams
2015-05-06 20:20   ` [Linux-nvdimm] " Dan Williams
2015-05-06 20:05 ` [PATCH v2 09/10] dax: convert to __pfn_t Dan Williams
2015-05-06 20:05 ` [PATCH v2 10/10] block: base support for pfn i/o Dan Williams
2015-05-06 20:50 ` [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t Al Viro
2015-05-06 22:10 ` Linus Torvalds
2015-05-06 23:47   ` Dan Williams
2015-05-07  0:19     ` Linus Torvalds
2015-05-07  2:36       ` Dan Williams
2015-05-07  9:02         ` Ingo Molnar
2015-05-07 14:42           ` Ingo Molnar
2015-05-07 15:52             ` Dan Williams
2015-05-07 17:52               ` Ingo Molnar
2015-05-07 15:00         ` Linus Torvalds
2015-05-07 15:40           ` Dan Williams
2015-05-07 15:58             ` Linus Torvalds
2015-05-07 16:03               ` Dan Williams
2015-05-07 17:36                 ` Ingo Molnar [this message]
2015-05-07 17:42                   ` Dan Williams
2015-05-07 17:56                     ` Dave Hansen
2015-05-07 19:11                       ` Ingo Molnar
2015-05-07 19:36                         ` Jerome Glisse
2015-05-07 19:48                           ` Ingo Molnar
2015-05-07 19:53                             ` Ingo Molnar
2015-05-07 20:18                               ` Jerome Glisse
2015-05-08  5:37                                 ` Ingo Molnar
2015-05-08  9:20                                   ` Al Viro
2015-05-08  9:26                                     ` Ingo Molnar
2015-05-08 10:00                                       ` Al Viro
2015-05-08 13:45                         ` Rik van Riel
2015-05-08 14:05                           ` Ingo Molnar
2015-05-08 14:40                             ` John Stoffel
2015-05-08 15:54                               ` Linus Torvalds
2015-05-08 16:28                                 ` Al Viro
2015-05-08 16:59                                 ` Rik van Riel
2015-05-09  1:14                                   ` Linus Torvalds
2015-05-09  3:02                                     ` Rik van Riel
2015-05-09  3:52                                       ` Linus Torvalds
2015-05-09 21:56                                       ` Dave Chinner
2015-05-09  8:45                                   ` "Directly mapped persistent memory page cache" Ingo Molnar
2015-05-09 15:51                                     ` Eric W. Biederman
2015-05-10 10:07                                       ` Ingo Molnar
2015-05-09 18:24                                     ` Dan Williams
2015-05-10  9:46                                       ` Ingo Molnar
2015-05-10 17:29                                         ` Dan Williams
2015-05-11  8:25                                     ` Dave Chinner
2015-05-11  9:18                                       ` Ingo Molnar
2015-05-11 10:12                                         ` Zuckerman, Boris
2015-05-11 10:38                                           ` Ingo Molnar
2015-05-11 14:51                                             ` Jeff Moyer
2015-05-12  0:53                                         ` Dave Chinner
2015-05-12 14:47                                           ` Jerome Glisse
2015-06-05  5:43                                             ` Dan Williams
2015-05-11 14:31                                     ` Matthew Wilcox
2015-05-11 20:01                                       ` Jerome Glisse
2015-05-08 20:40                                 ` [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t John Stoffel
2015-05-08 14:54                             ` Rik van Riel
2015-05-07 17:43                 ` Linus Torvalds
2015-05-07 20:06                   ` Dan Williams
2015-05-07 16:18       ` Christoph Hellwig
2015-05-07 16:41         ` Dan Williams
2015-05-07 18:40           ` Ingo Molnar
2015-05-07 19:44             ` Dan Williams
2015-05-07 17:30         ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150507173641.GA21781@gmail.com \
    --to=mingo@kernel.org \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=benh@kernel.crashing.org \
    --cc=boaz@plexistor.com \
    --cc=clm@fb.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hch@lst.de \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=martin.petersen@ora \
    --cc=mgorman@suse.de \
    --cc=neilb@suse.de \
    --cc=paulus@samba.org \
    --cc=riel@redhat.com \
    --cc=ross.zwisler@linux.intel.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=snitzer@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=tytso@mit.edu \
    --cc=willy@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).