From: Dave Chinner <david@fromorbit.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
Dan Williams <dan.j.williams@intel.com>,
Eric Sandeen <sandeen@redhat.com>,
Lukas Czerner <lczerner@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Theodore Ts'o <tytso@mit.edu>, Christoph Hellwig <hch@lst.de>,
Jan Kara <jack@suse.cz>, linux-ext4 <linux-ext4@vger.kernel.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH 0/9] add ext4 per-inode DAX flag
Date: Fri, 8 Sep 2017 08:12:01 +1000
Message-ID: <20170907221201.GZ17782@dastard>
In-Reply-To: <20170907215148.GA12669@linux.intel.com>

On Thu, Sep 07, 2017 at 03:51:48PM -0600, Ross Zwisler wrote:
> On Thu, Sep 07, 2017 at 03:26:10PM -0600, Andreas Dilger wrote:
> > However, I wonder if this could
> > be prevented at runtime, and only allow S_DAX to be set when the inode is
> > first instantiated, and wouldn't be allowed to change after that? Setting
> > or clearing the per-inode DAX flag might still be allowed, but it wouldn't
> > be enabled until the inode is next fetched into cache? Similarly, for
> > inodes that have conflicting features (e.g. inline data or encryption)
> > would not be allowed to enable S_DAX.
>
> Ooh, this seems interesting. This would ensure that S_DAX transitions
> couldn't ever race with I/Os or mmaps(). I had some other ideas for how to
> handle this, but I think your idea is more promising. :)

IMO, that's an awful admin interface - it can't be done on demand
(i.e. when needed) because we can't force an inode to be evicted
from the cache. And then we have the "why the hell did that just
change" problem if an inode is evicted due to memory pressure and
then immediately reinstantiated by the running workload. That's a
recipe for driving admins insane...

> I guess with this solution we'd need:
>
> a) A good way of letting the user detect the state where they had set the DAX
> inode flag, but that it wasn't yet in use by the inode.
>
> b) A reliable way of flushing the inode from the filesystem cache, so that the
> next time an open() happens they get the new behavior. The way I usually do
> this is via umount/remount, but there is probably already a way to do this?

Not if it's referenced. And if it's not referenced, then the only
hammer we have is Brutus^Wdrop_caches. That's not an option for
production machines.
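
For reference, the drop_caches hammer is the /proc/sys/vm/drop_caches
sysctl. Writing 2 to it asks the kernel to reclaim unused dentries and
inodes across the whole system, which is why it cannot target one file
and why a referenced inode survives it. A minimal sketch, assuming
Linux and root privileges; the helper name and its path parameter are
hypothetical, introduced only for illustration:

```python
import os

# Documented sysctl; writing "2" frees reclaimable dentries and inodes.
DROP_CACHES = "/proc/sys/vm/drop_caches"


def drop_dentries_and_inodes(path=DROP_CACHES):
    """Ask the kernel to reclaim unused dentries and inodes, system-wide.

    Hypothetical helper. Inodes that are still referenced (for example
    by an open file) are not evicted, so this does not guarantee that
    any particular inode leaves the cache.
    """
    # Dirty objects are not freeable; syncing first lets more be reclaimed.
    os.sync()
    with open(path, "w") as f:
        f.write("2\n")


# Usage (as root): drop_dentries_and_inodes()
```

Nothing here selects an individual inode: the write affects every
filesystem on the machine.
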
Neat idea, but one I'd already thought of and discarded as "not
practical from an admin perspective".

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com