From: Boaz Harrosh <boaz@plexistor.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-nvdimm <linux-nvdimm@ml01.01.org>,
Matthew Wilcox <willy@linux.intel.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Dave Chinner <david@fromorbit.com>,
Oleg Nesterov <oleg@redhat.com>, Mel Gorman <mgorman@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm <linux-mm@kvack.org>, Arnd Bergmann <arnd@arndb.de>
Subject: Re: [RFC 0/2] New MAP_PMEM_AWARE mmap flag
Date: Mon, 22 Feb 2016 13:05:09 +0200
Message-ID: <56CAEB65.3080807@plexistor.com>
In-Reply-To: <CAPcyv4gQV9Oh9OpHTGuGfTJ_s1C_L7J-VGyto3JMdAcgqyVeAw@mail.gmail.com>
On 02/22/2016 12:03 AM, Dan Williams wrote:
> On Sun, Feb 21, 2016 at 1:23 PM, Boaz Harrosh <boaz@plexistor.com> wrote:
<>
>> I have manually tested all this and it seems to work. Can you see
>> a theoretical scenario where it would not?
>
> I'm worried about the scenario where the pmem aware app assumes that
> none of the cachelines in its mapping are dirty when it goes to issue
> pcommit. We'll have two applications with different perceptions of
> when writes are durable.
Warning, rant: Rrrr, the theoretical pcommit. We have built mountains
on a nonexistent CPU. Show me a pcommit already.
But yes, pcommit changes nothing.
> Maybe it's not a problem in practice, at
> least current generation x86 cpus flush existing dirty cachelines when
> performing non-temporal stores. However, it bothers me that there are
> cpus where a pmem-unaware app could prevent a pmem-aware app from
> making writes durable. It seems if one app has established a
> MAP_PMEM_AWARE mapping it needs guarantees that all apps participating
> in that shared mapping have the same awareness.
>
But we are not breaking any current POSIX guarantees. You are thinking
memory, but this is POSIX filesystem semantics. This is all up to the
application.
Consider a regular page-cached FS and your two applications above
(which, BTW, do not exist exactly because). Both are doing a write,
not to a cacheline but to a whole page even:
App 1                 App 2
- write block X       ...
- sync                write block X
- POWER OFF
There is no guarantee that app 1's version is what will be read
after mount. Any random amount of app 2's changes can be seen.
In fact, even while the pages are in DMA they can change.
All that is guaranteed is that the page will be marked dirty,
because app 2 dirtied it even though app 1 submitted it to be
cleaned.
And that is what we have here. If app 2 is pmem-unaware, the page is
added to the radix tree, and come sync time it will be cl_flushed.
In any case, after the write storm ends and a final
sync is performed, we should have an image of the very last
writes. This is POSIX, and this is kept here.
So no, there is no need for "shared mapping have the same awareness".
[BTW: coming from the NFS world, all this is one big laugh,
because there we don't even have a read vs. concurrent write
guarantee, let alone a write vs. write guarantee.]
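
The "final sync" argument above can be sketched in userspace C. This is only an illustration under stated assumptions: two MAP_SHARED mappings of one temporary file stand in for app 1 and app 2, the function name and the /tmp path are made up for the example, and an ordinary page-cached file is used rather than a DAX one. It cannot reproduce the power-off window itself; it only shows that after the last writer and a final msync(), the file holds the very last write.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

/*
 * Hypothetical helper: returns the first byte of "block X" as read back
 * from the file after the final sync, or -1 on any error.
 */
int last_writer_after_final_sync(void)
{
    char path[] = "/tmp/block-x-XXXXXX";
    char *app1 = MAP_FAILED, *app2 = MAP_FAILED;
    char byte;
    int ret = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                       /* file disappears once fd is closed */
    if (ftruncate(fd, BLOCK_SIZE) != 0)
        goto out;

    /* Two independent shared mappings stand in for the two applications. */
    app1 = mmap(NULL, BLOCK_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    app2 = mmap(NULL, BLOCK_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (app1 == MAP_FAILED || app2 == MAP_FAILED)
        goto out;

    memset(app1, 'A', BLOCK_SIZE);              /* app 1: write block X */
    if (msync(app1, BLOCK_SIZE, MS_SYNC) != 0)  /* app 1: sync */
        goto out;
    memset(app2, 'B', BLOCK_SIZE);              /* app 2: write block X */
    /* A power cut at this point could leave any mix of 'A' and 'B' on media. */
    if (msync(app2, BLOCK_SIZE, MS_SYNC) != 0)  /* the final sync */
        goto out;

    /* Read back through the file descriptor, not through either mapping. */
    if (pread(fd, &byte, 1, 0) == 1)
        ret = byte;
out:
    if (app1 != MAP_FAILED)
        munmap(app1, BLOCK_SIZE);
    if (app2 != MAP_FAILED)
        munmap(app2, BLOCK_SIZE);
    close(fd);
    return ret;
}
```

Calling last_writer_after_final_sync() from a small main() and comparing the result with 'B' is enough to try it out; neither writer needs to know how the other one flushes.
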
> Another potential issue is that MAP_PMEM_AWARE is not enough on its
> own. If the filesystem or inode does not support DAX the application
> needs to assume page cache semantics. At a minimum MAP_PMEM_AWARE
> requests would need to fail if DAX is not available.
>
Yes good idea, will do.
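
A rough sketch of what such a check could look like, written as a standalone C function so it can be compiled outside the kernel. Everything named here is an assumption for illustration: the flag bit MAP_PMEM_AWARE_SKETCH is a placeholder (the RFC's real value is not shown in this message), check_pmem_aware() is a hypothetical helper, and -EOPNOTSUPP is just one plausible error code, not what the patches will necessarily return.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder bit for illustration only, not the value used by the RFC. */
#define MAP_PMEM_AWARE_SKETCH (1UL << 30)

/*
 * Hypothetical check: returns 0 if the mmap request may proceed,
 * or a negative errno if MAP_PMEM_AWARE was requested without DAX.
 */
int check_pmem_aware(unsigned long flags, bool inode_is_dax)
{
    if ((flags & MAP_PMEM_AWARE_SKETCH) && !inode_is_dax)
        return -EOPNOTSUPP;     /* error code is an assumption */
    return 0;                   /* no flag, or DAX is available */
}
```

With a failure like this, an application could retry without the flag and fall back to plain page cache semantics with msync().
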
Shalom
Boaz