public inbox for linux-fsdevel@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Boaz Harrosh <openosd@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>, Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org
Subject: Re: direct_access, pinning and truncation
Date: Mon, 20 Oct 2014 10:01:52 +1100
Message-ID: <20141019230152.GM17506@dastard>
In-Reply-To: <54439B97.7020305@gmail.com>

On Sun, Oct 19, 2014 at 02:08:07PM +0300, Boaz Harrosh wrote:
> On 10/10/2014 05:24 PM, Matthew Wilcox wrote:
> <>
> > 
> > I'm assuming that we come up with *some* way to solve the missing struct
> > page problem.  Whether it's restructuring splice, O_DIRECT and RDMA to do
> > without struct pages, 
> 
> That makes no sense to me, where will it end? You are doubling the size of the
> code to have two paths, and there will always be a subsystem you did not touch
> and is missing support. And why? page was already invented to do exactly what you
> want, track state of a PFN.
.....
> > whether it's coming up
> > with some other data structure that takes the place of struct page for
> > DAX ... 
> 
> Again. Why reinvent the wheel when the old one works perfectly and does
> everything you want, including the most important aspect. Not adding any
> new infrastructure, and/or modifying any code. So why even think about it?
> 
> > doesn't matter for this part of the conversation.
> > 
> 
> I agree, this does not solve the reference problem, in this case DAX will
> need a new entry into the FS to communicate delayed free-block. But as Jan
> pointed out this is not against current FS structure.
> 
> I think lots of current DAX problems and performance shortcomings can be
> solved very nicely if we assume we have struct-page for pmem. For example
> the use of the page-lock instead of the i_mutex we take today.

Which makes me look at what DAX is intended for.

DAX is an enabler, allowing us to get direct access to PMEM with
*existing filesystem technologies*.  I don't want to have to add new
extent management functions to XFS to add temporary references to
allow DAX to hold onto extents after an inode has been freed because
some RDMA app has pinned the PMEM and forgot to let it go. That way
lies madness for existing filesystems - yes, we can add such warts
to them, but it's ugly, nasty and needed only by a very, very small
lunatic fringe of users.

IMO, this proposal is way outside the original DAX-replaces-XIP scope;
I really don't think that requiring extensive modifications to
filesystems to use DAX is a good idea. Apart from it being contrary to the
original architectural goal of DAX (which was "enable direct access
with minimal filesystem implementation impact"), we risk significant
impact on non-DAX users by requiring architectural changes to the
underlying filesystems to support DAX.

So my question is this: at what point do we say "out of scope for
DAX, make this work with a native PMEM filesystem"?  DAX as it
stands fills the "95% of what people need" goal with minimal effort;
our efforts should be focussed on merging what we have, not creeping
the scope and making it harder to implement and get merged.

If we want RDMA into PMEM devices or direct IO to/from persistent
memory, then I'd suggest that this is functionality that belongs in
native PMEM storage devices/filesystems and should be designed to be
efficient in that environment from the ground up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 14+ messages
2014-10-08 19:05 direct_access, pinning and truncation Matthew Wilcox
2014-10-08 23:21 ` Zach Brown
2014-10-09 16:44   ` Matthew Wilcox
2014-10-09 19:14     ` Zach Brown
2014-10-10 10:01       ` Jan Kara
2014-10-09  1:10 ` Dave Chinner
2014-10-09 15:25   ` Matthew Wilcox
2014-10-13  1:19     ` Dave Chinner
2014-10-19  9:51     ` Boaz Harrosh
2014-10-10 13:08 ` Jan Kara
2014-10-10 14:24   ` Matthew Wilcox
2014-10-19 11:08     ` Boaz Harrosh
2014-10-19 23:01       ` Dave Chinner [this message]
2014-10-21  9:17         ` Boaz Harrosh
