From: Boaz Harrosh <boaz@plexistor.com>
To: Ingo Molnar <mingo@kernel.org>, Christoph Hellwig <hch@lst.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Matthew Wilcox <matthew.r.wilcox@intel.com>
Subject: Re: [GIT PULL] PMEM driver for v4.1
Date: Mon, 13 Apr 2015 15:21:30 +0300
Message-ID: <552BB4CA.6060808@plexistor.com>
In-Reply-To: <20150413104531.GB30556@gmail.com>
On 04/13/2015 01:45 PM, Ingo Molnar wrote:
>
> * Christoph Hellwig <hch@lst.de> wrote:
>
>> On Mon, Apr 13, 2015 at 11:33:09AM +0200, Ingo Molnar wrote:
>>> Limitations: this is a regular block device, and since the pmem areas
>>> are not struct page backed, they are invisible to the rest of the
>>> system (other than the block IO device), so direct IO to/from pmem
>>> areas, direct mmap() or XIP is not possible yet. The page cache will
>>> also shadow and double buffer pmem contents, etc.
>>
>> Unless you use the DAX support in ext2/4 and soon XFS, in which case
>> we avoid that double buffering when doing read/write and mmap
>
> Indeed, I missed that DAX support just went upstream in v4.0 - nice!
>
> DAX may have some other limitations though that comes from not having
> struct page * backing and using VM_MIXEDMAP, the following APIs might
> not work on DAX files:
>
> - splice
splice works fine. I also sent a cleanup in this area to Andrew; it will
be in for 4.1.
> - zero copy O_DIRECT into DAX areas.
DAX is always O_DIRECT.
What does not work is to mmap a DAX file and then use that pointer in an
O_DIRECT operation on another device (unless that is also a DAX device).
The same goes for mmap of a DAX file combined with RDMA or direct
networking: these will need a copy.
All of this is fixable by applying my page-struct patch for pmem.
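As a rough user-space illustration of the "will need a copy" case above,
here is a minimal sketch of a bounce-buffer write. It is not taken from any
patch in this thread: write_via_bounce() and BOUNCE_ALIGN are hypothetical
names, and the 4096-byte alignment is an assumption about what the
destination's O_DIRECT path requires. The source pointer would be the
mmap() of a DAX file, and dst_fd another device opened with O_DIRECT.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment for O_DIRECT buffers and lengths (hypothetical value). */
#define BOUNCE_ALIGN 4096

/*
 * Hypothetical helper: write len bytes from src to dst_fd at dst_off by
 * first copying them into an ordinary, page-backed, aligned buffer.
 * src may point into an mmap() of a DAX file.
 * Returns 0 on success, -1 on error with errno set.
 */
static int write_via_bounce(int dst_fd, off_t dst_off,
                            const void *src, size_t len)
{
    void *bounce = NULL;
    size_t done = 0;
    int err;

    /* Keep the sketch simple: only whole aligned blocks are accepted. */
    if (len == 0 || len % BOUNCE_ALIGN != 0) {
        errno = EINVAL;
        return -1;
    }

    /* posix_memalign() returns the error code instead of setting errno. */
    err = posix_memalign(&bounce, BOUNCE_ALIGN, len);
    if (err != 0) {
        errno = err;
        return -1;
    }

    /* This memcpy is the extra copy that struct page backing would avoid. */
    memcpy(bounce, src, len);

    while (done < len) {
        ssize_t n = pwrite(dst_fd, (char *)bounce + done, len - done,
                           dst_off + (off_t)done);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry */
        if (n <= 0) {
            int saved = (n == 0) ? EIO : errno;
            free(bounce);
            errno = saved;
            return -1;
        }
        done += (size_t)n;
    }

    free(bounce);
    return 0;
}
```

A caller would pass the mmap()ed DAX address as src, for example
write_via_bounce(dst_fd, 0, dax_ptr, 4096), where dst_fd and dax_ptr are
placeholders for the caller's own descriptor and mapping.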
> - futexes
>
> - ( AFAICS hugetlbs won't work on DAX mmap()s yet - although with
> the current nocache mapping that's probable the least of the
> performance issues for now. )
>
> Btw., what's the future design plan here? Enable struct page backing,
> or provide special codepaths for all DAX uses like the special pte
> based approach for mmap()s?
>
I'm hoping for struct page: 4k pages at first and 2M pages later on,
which needs more work in the IO stacks, where I need this most.
> Thanks,
> Ingo
>
Thanks
Boaz