linux-fsdevel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@linux.intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>, linux-fsdevel@vger.kernel.org
Subject: Re: direct_access, pinning and truncation
Date: Thu, 9 Oct 2014 11:25:24 -0400
Message-ID: <20141009152524.GN5098@wil.cx>
In-Reply-To: <20141009011038.GA4376@dastard>

On Thu, Oct 09, 2014 at 12:10:38PM +1100, Dave Chinner wrote:
> On Wed, Oct 08, 2014 at 03:05:23PM -0400, Matthew Wilcox wrote:
> > 
> > One of the things on my todo list is making O_DIRECT work to a
> > memory-mapped direct_access file.
> 
> I don't understand the motivation or the use case: O_DIRECT is
> purely for bypassing the page cache, and DAX already bypasses the
> page cache.  What difference is there between the DAX read/write
> path and a DAX-based O_DIRECT IO path, and why doesn't just ignoring
> O_DIRECT for DAX enabled filesystems simply do what you need?

There are two filesystems involved ... if both (or neither!) are DAX,
everything's fine.  The problem comes when you do things this way around:

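/* The cache file lives on a DAX filesystem; the data file does not. */
int cachefd = open("/dax/cache", O_RDWR);
int datafd = open("/nfs/bigdata", O_RDWR | O_DIRECT);
void *cache = mmap(NULL, 1024 * 1024 * 1024, PROT_READ | PROT_WRITE,
		MAP_SHARED, cachefd, 0);
/* O_DIRECT read from the non-DAX file straight into the DAX mapping. */
read(datafd, cache, 1024 * 1024);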

The non-DAX filesystem needs to pin pages from the DAX filesystem while
they're under I/O.


Another attempt to solve this problem might be to bounce the I/O through
DRAM: an O_DIRECT read becomes a read into a page of DRAM, followed by a
copy from DRAM to PMEM.  Conversely, a write becomes a copy to DRAM
followed by a page-based write.
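
As a rough userspace analogue of that copy approach (not the kernel
implementation, which would live in the filesystem's direct I/O path),
the sketch below reads through a page-aligned DRAM buffer and then copies
into the destination mapping.  The helper name bounce_read and the 4096-byte
BOUNCE_SIZE are assumptions made for illustration only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed bounce buffer size: one 4 KiB page.  Real page size may differ. */
#define BOUNCE_SIZE 4096

/*
 * Hypothetical helper: read up to `len` bytes at offset `off` from `fd`
 * into `dst`, staging every chunk in a page-aligned DRAM buffer first.
 * `dst` could be a mapping of a DAX file.  If `fd` was opened with
 * O_DIRECT, `off` is assumed to be suitably aligned.
 * Returns the number of bytes copied, or -1 on error.
 */
ssize_t bounce_read(int fd, void *dst, size_t len, off_t off)
{
	void *bounce;
	size_t done = 0;

	assert(dst != NULL);
	if (posix_memalign(&bounce, BOUNCE_SIZE, BOUNCE_SIZE) != 0)
		return -1;

	while (done < len) {
		/* Always request a full aligned chunk, copy only what is needed. */
		ssize_t n = pread(fd, bounce, BOUNCE_SIZE, off + (off_t)done);
		size_t copy;

		if (n < 0) {
			if (errno == EINTR)
				continue;
			free(bounce);
			return -1;
		}
		if (n == 0)
			break;	/* end of file */

		copy = (size_t)n < len - done ? (size_t)n : len - done;
		memcpy((char *)dst + done, bounce, copy);	/* DRAM -> destination */
		done += copy;
	}

	free(bounce);
	return (ssize_t)done;
}
```

With the earlier example, the final read() would become something like
bounce_read(datafd, cache, 1024 * 1024, 0).  The cost is an extra memcpy
per chunk; the benefit is that the non-DAX filesystem only ever touches
ordinary DRAM pages.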


You also elided the paragraphs where I point out that this is an example
of a more general problem; there really are people who want to do RDMA
to DAX memory (the HPC crowd, of course), and we need to not open up
security holes when enabling that.  Since it's a potentially long-duration
and bi-directional mapping, the copy solution isn't going to work here
(without going all the way to solution 1).


Thread overview: 14+ messages
2014-10-08 19:05 direct_access, pinning and truncation Matthew Wilcox
2014-10-08 23:21 ` Zach Brown
2014-10-09 16:44   ` Matthew Wilcox
2014-10-09 19:14     ` Zach Brown
2014-10-10 10:01       ` Jan Kara
2014-10-09  1:10 ` Dave Chinner
2014-10-09 15:25   ` Matthew Wilcox [this message]
2014-10-13  1:19     ` Dave Chinner
2014-10-19  9:51     ` Boaz Harrosh
2014-10-10 13:08 ` Jan Kara
2014-10-10 14:24   ` Matthew Wilcox
2014-10-19 11:08     ` Boaz Harrosh
2014-10-19 23:01       ` Dave Chinner
2014-10-21  9:17         ` Boaz Harrosh
