From: Yasunori Goto <y-goto@jp.fujitsu.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
NVDIMM-ML <linux-nvdimm@lists.01.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: Question about Experimental of Filesystem DAX.
Date: Mon, 04 Jun 2018 10:44:15 +0900
Message-ID: <20180604104406.4A13.E1E9C6FF@jp.fujitsu.com>
In-Reply-To: <20180531202556.GA28256@linux.intel.com>
> On Thu, May 31, 2018 at 11:26:43AM -0700, Dan Williams wrote:
> > On Thu, May 31, 2018 at 10:46 AM, Darrick J. Wong
> > <darrick.wong@oracle.com> wrote:
> > > On Thu, May 31, 2018 at 09:29:15AM -0700, Dan Williams wrote:
> > >> On Thu, May 31, 2018 at 8:07 AM, Ross Zwisler
> > >> <ross.zwisler@linux.intel.com> wrote:
> > >> > On Thu, May 31, 2018 at 11:27:33AM +0900, Yasunori Goto wrote:
> > >> >> Hello,
> > >> >>
> > >> >>
> > >> >> I would like to know about the Experimental message of Filesystem DAX.
> > >> >> --------------------------------------------------------
> > >> >> DAX enabled. Warning: EXPERIMENTAL, use at your own risk
> > >> >> --------------------------------------------------------
> > >> >>
> > >> >> AFAIK, the final issue of Filesystem DAX is the metadata update problem,
> > >> >> and it is (or will be?) solved by the great effort on MAP_SYNC and the
> > >> >> "fix dma vs truncate/hole-punch" patch set.
> > >> >> So, I suppose that the Experimental message can be removed,
> > >> >> but I'm not sure.
> > >> >>
> > >> >> Is it possible?
> > >> >> Otherwise, are there any other issues in Filesystem DAX yet?
> > >> >>
> > >> >> If this is a silly question, sorry for the noise.
> > >> >>
> > >> >> Thanks,
> > >> >> ---
> > >> >> Yasunori Goto
> > >> >
> > >> > Adding in the XFS and ext4 developers, as it's really their call when to
> > >> > remove this notice.
> > >> >
> > >> > We've talked about this off and on for a long while, but IMHO we should remove
> > >> > the EXPERIMENTAL warning. The last few things that we had on our TODO list
> > >> > before this was removed were:
> > >> >
> > >> > 1) Get consistent handling of the DAX mount option. We currently have this,
> > >> > as both filesystems will behave the same and fall back and remove the DAX
> > >> > mount option if it is unsupported by the block device, etc.
> > >
> > > <nod>
> > >
> > > As an aside, I wonder if Christoph's musings about "just have the kernel
> > > determine the appropriate dax/non-dax setting from the acpi tables and
> > > skip the inode flag entirely" ever got resolved?
> > >
> > >> > 2) Get consistent handling of the DAX inode option. We currently have this,
> > >> > as all DAX behavior now happens through the mount option. If/when we
> > >> > re-enable the per-inode DAX flag we should do it consistently for all DAX
> > >> > enabled filesystems.
> > >
> > > The behavior of the inode flag isn't all that consistent. ext4 doesn't
> > > support it at all. On XFS, you can set or clear FS_XFLAG_DAX on a
> > > directory which will propagate the setting to any files created in that
> > > directory.
> > >
> > > However, if you set or clear it on a file we update the on-disk inode
> > > but we can't change the in-core state flag (S_DAX) until the next
> > > in-core inode instantiation. It's weird that users can change the flag
> > > but the intended behavior changes won't happen until some ... time ...
> > > in the future??
> > >
> > >> > 3) Make DAX work with other XFS features like reflink, etc. This one isn't
> > >> > done, but we at least disallow DAX with XFS features like reflink where it
> > >> > could be an issue. Darrick, do you still feel like we need to get these
> > >> > working together to remove EXPERIMENTAL, or are you happy enough that we're
> > >> > keeping them separated and that we're keeping user data safe?
> > >
> > > Yes, reflink and dax still need to work together. I've not heard any
> > > good arguments for why page sharing + copy on write are fundamentally
> > > incompatible with the dax model, or why dax users will never, ever
> > > require reflink.
> >
> > Right, but that's separate from DAX being scream in your face
> > "EXPERIMENTAL!". It's just an additional feature that can be added on
> > once all the normal expectations of a userspace mapping work. I think
> > reliable rmap is the last of those requirements.
> >
> > > The recent thread between Jan and Dan makes me wonder if making mappings
> > > share struct pages is going to be a nightmare to add to the mm code,
> > > though...
> >
> > It's going to be a bit messy because a singular page->mapping
> > association is fundamentally incompatible with DAX. Perhaps a linked
> > list of mapping "siblings"?
> >
> > > Also: ideally XFS would also be able to consume poison event
> > > notifications from the pmem so that it can try to deal with metadata
> > > loss, but that's probably a separate effort.
> >
> > Right, not a gating item for declaring DAX ready for prime time.
>
> Yep, I think that the very loud EXPERIMENTAL message is essentially telling
> users "your data is at risk if you use this". I totally agree that we still
> have lots of work to do. However, I don't think that these feature
> enhancements should gate removal of the EXPERIMENTAL notice. IMHO that
> should only exist as long as we have issues that we know could corrupt data,
> crash the box, etc. As far as I know those are basically the 2 items on Dan's
> list from a few mails ago (poison recovery & DMA vs truncate).
Everyone,
Thank you very much for your information and opinions.
I now understand not only the status of the "experimental" message,
but also what work still remains to be done.
Thanks a lot!
---
Yasunori Goto