From: Ingo Molnar <mingo@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Boaz Harrosh <boaz@plexistor.com>, Jan Kara <jack@suse.cz>,
Mike Snitzer <snitzer@redhat.com>, Neil Brown <neilb@suse.de>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Dave Hansen <dave.hansen@linux.intel.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Chris Mason <clm@fb.com>, Paul Mackerras <paulus@samba.org>,
"H. Peter Anvin" <hpa@zytor.com>, Christoph Hellwig <hch@lst.de>,
Alasdair Kergon <agk@redhat.com>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
Mel Gorman <mgorman@suse.de>,
Matthew Wilcox <willy@linux.intel.com>,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Rik van Riel <riel@redhat.com>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Jens Axboe <axboe@kernel.dk>, Theodore Ts'o <tytso@mit.edu>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Julia Lawall <Julia.Lawall@lip6.fr>, Tejun Heo <tj@kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t
Date: Thu, 7 May 2015 16:42:25 +0200
Message-ID: <20150507144225.GA20491@gmail.com>
In-Reply-To: <20150507090217.GA4467@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
> [...]
>
> For anything more complex, that maps any of this storage to
> user-space, or exposes it to higher level struct page based APIs,
> etc., where references matter and it's more of a cache with
> potentially multiple users, not an IO space, the natural API is
> struct page.
Let me walk back on this:
> I'd say that this particular series mostly addresses the 'pfn as
> sector_t' side of the equation, where persistent memory is IO space,
> not memory space, and as such it is the more natural and thus also
> the cheaper/faster approach.
... but that does not appear to be the case: this series replaces a
'struct page' interface with a pure pfn interface for the express
purpose of being able to DMA to/from 'memory areas' that are not
struct page backed.
> Linus probably disagrees? :-)
[ and he'd disagree rightfully ;-) ]
So what this patch set tries to achieve is (sector_t -> sector_t) IO
between storage devices (i.e. a rare and somewhat weird use case), and
it does so by squeezing one device's storage address into our formerly
struct page backed descriptor, via a pfn.
That looks like a layering violation and a mistake to me. If we want
to do direct (sector_t -> sector_t) IO, with no serialization worries,
it should have its own (simple) API - which things like hierarchical
RAID or RDMA APIs could use.
If what we want to do is to support, say, an mmap() of a file on
persistent storage, and then read() into that file from another device
via DMA, then I think we should have allocated struct page backing at
mmap() time already, and all regular syscall APIs would 'just work'
from that point on - far above what page-less, pfn-based APIs can do.
The temporary struct page backing can then be freed at munmap() time.
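As a user-space illustration of the mmap() case above (a minimal sketch, not taken from the patch series): the helper below maps a destination file and read()s from another fd directly into that mapping. The name read_into_mapping() and its parameters are hypothetical. Whether the destination really sits on persistent memory, and whether the kernel can DMA straight into it, depends on the page backing discussed above and is outside this code's control.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from src_fd straight into a
 * shared mapping of dst_path. Returns the number of bytes read, or -1 on
 * error (including len == 0, which mmap() rejects).
 */
ssize_t read_into_mapping(const char *dst_path, int src_fd, size_t len)
{
    int dst_fd = open(dst_path, O_RDWR | O_CREAT, 0600);
    if (dst_fd < 0)
        return -1;

    /* Size the file so the whole mapping is backed by file blocks. */
    if (ftruncate(dst_fd, (off_t)len) < 0) {
        close(dst_fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                     dst_fd, 0);
    close(dst_fd);              /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    size_t done = 0;
    while (done < len) {
        /* The regular syscall API: the target buffer is the mapping. */
        ssize_t n = read(src_fd, map + done, len - done);
        if (n < 0) {
            munmap(map, len);
            return -1;
        }
        if (n == 0)             /* source hit EOF early */
            break;
        done += (size_t)n;
    }

    munmap(map, len);
    return (ssize_t)done;
}
```

From user-space this is just a read() into an ordinary buffer; the point made above is that the kernel side only 'just works' if that buffer is struct page backed.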
And if the usage is pure fd based, we don't really have fd-to-fd APIs
beyond the rarely used splice variants (and even those don't do pure
cross-IO, they use a pipe as an intermediary), so there's no problem
to solve I suspect.
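To make the remark about splice concrete, here is a minimal Linux-only sketch of an fd-to-fd copy. splice(2) requires one side of each call to be a pipe, so the data takes two hops. The helper name copy_via_pipe() is hypothetical, and error handling is kept deliberately simple.

```c
#define _GNU_SOURCE             /* for splice() */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd to out_fd using
 * splice(2). A pipe is needed as the intermediary because splice() only
 * works when one of its two fds is a pipe.
 * Returns the number of bytes moved, or -1 on error.
 */
ssize_t copy_via_pipe(int in_fd, int out_fd, size_t len)
{
    int p[2];
    size_t total = 0;

    if (pipe(p) < 0)
        return -1;

    while (total < len) {
        /* Hop 1: source fd -> pipe (at most one pipe buffer's worth). */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, len - total,
                           SPLICE_F_MOVE);
        if (n < 0)
            goto fail;
        if (n == 0)             /* source hit EOF */
            break;

        /* Hop 2: pipe -> destination fd, until the pipe is drained. */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)left,
                               SPLICE_F_MOVE);
            if (m <= 0)
                goto fail;
            left -= m;
        }
        total += (size_t)n;
    }

    close(p[0]);
    close(p[1]);
    return (ssize_t)total;

fail:
    close(p[0]);
    close(p[1]);
    return -1;
}
```

Both hops pass through the pipe's buffers, which is the sense in which this is not pure cross-IO between the two fds.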
Thanks,
Ingo