From: "Williams, Dan J" <dan.j.williams@intel.com>
To: "jack@suse.cz" <jack@suse.cz>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
"willy@linux.intel.com" <willy@linux.intel.com>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>,
"jack@suse.com" <jack@suse.com>,
"david@fromorbit.com" <david@fromorbit.com>
Subject: Re: [PATCH 4/8] mm, dax: truncate dax mappings at bdev or fs shutdown
Date: Thu, 19 Nov 2015 00:22:14 +0000
Message-ID: <1447892533.13153.8.camel@intel.com>
In-Reply-To: <20151118150945.GE6097@quack.suse.cz>
On Wed, 2015-11-18 at 16:09 +0100, Jan Kara wrote:
> Hum, I don't get this. truncate_inode_pages_final() gets called when inode
> has no more users. So there are no mappings of the inode. So how could
> truncate_pagecache() possibly make a difference?
True. I confirmed with more focused testing that the change to
truncate_inode_pages_final() is not necessary. After
invalidate_inodes() does unmap_mapping_range() we are protected by
future calls to get_block() and blk_queue_enter() failing when there
are attempts to re-establish a mapping after the block device has been
torn down.
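To make that reasoning concrete, here is a toy userspace model (not
kernel code) of why a fault after shutdown cannot bring the mapping
back. Every toy_* name is hypothetical: toy_queue_enter() only stands in
for the blk_queue_enter() check, and clearing the flag stands in for
unmap_mapping_range().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct toy_bdev {
	bool dying;		/* set once the device has been torn down */
};

struct toy_mapping {
	struct toy_bdev *bdev;
	bool mapped;		/* models a live user mapping of device media */
};

/* Models the blk_queue_enter() check: no new users once the queue is dying. */
int toy_queue_enter(struct toy_bdev *bdev)
{
	return bdev->dying ? -1 : 0;
}

/* Models a dax fault: the mapping is only established on a live queue. */
int toy_fault(struct toy_mapping *m)
{
	if (toy_queue_enter(m->bdev))
		return -1;	/* the fault fails instead of mapping media */
	m->mapped = true;
	return 0;
}

/* Models shutdown: mark the device dead, then zap the existing mapping. */
void toy_shutdown(struct toy_mapping *m)
{
	m->bdev->dying = true;
	m->mapped = false;	/* stands in for unmap_mapping_range() */
}
```

In this model a toy_fault() before toy_shutdown() succeeds and sets
mapped, while a toy_fault() afterwards returns an error and leaves the
mapping absent, which is the property the unmap relies on.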
Here's a revised patch. Note that the call to truncate_pagecache() is
replaced with a call to unmap_mapping_range() since it is fine to
access zero pages that might still be in the page cache.
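The revised patch collects in-use dax inodes on a private list while
s_inode_list_lock is held and only unmaps them once the lock has been
dropped (unmap_list() calls cond_resched(), so it has to run in a
context that may sleep). A rough userspace sketch of that two-phase
pattern, assuming hypothetical toy_* types in place of struct inode and
the superblock, and a plain boolean in place of the spinlock:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct inode. */
struct toy_inode {
	bool is_dax;
	int count;			/* models i_count */
	bool mapped;			/* models live user mappings */
	struct toy_inode *next_sb;	/* models the sb->s_inodes linkage */
	struct toy_inode *next_unmap;	/* models reusing i_lru for the private list */
};

struct toy_sb {
	struct toy_inode *inodes;
	bool locked;			/* models s_inode_list_lock being held */
};

/* Phase 2: must run without the list lock, since the real work may sleep. */
int toy_unmap_list(struct toy_sb *sb, struct toy_inode *head)
{
	int n = 0;

	while (head) {
		struct toy_inode *inode = head;

		head = inode->next_unmap;
		inode->next_unmap = NULL;
		if (sb->locked)
			return -1;	/* would be sleeping under a spinlock */
		inode->mapped = false;	/* stands in for unmap_mapping_range() */
		inode->count--;		/* stands in for iput() */
		n++;
	}
	return n;
}

/* Phase 1 then phase 2: pin and collect under the lock, unmap after it. */
int toy_invalidate(struct toy_sb *sb)
{
	struct toy_inode *unmap = NULL, *inode;

	sb->locked = true;
	for (inode = sb->inodes; inode; inode = inode->next_sb) {
		if (inode->is_dax && inode->count > 0) {
			inode->count++;	/* stands in for __iget() */
			inode->next_unmap = unmap;
			unmap = inode;
		}
	}
	sb->locked = false;

	return toy_unmap_list(sb, unmap);
}
```

The extra reference taken in phase 1 keeps each inode alive between
dropping the lock and the unmap; it is released again in phase 2, so the
caller's reference count ends up unchanged.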
8<----
Subject: mm, dax: unmap dax mappings at bdev or fs shutdown
From: Dan Williams <dan.j.williams@intel.com>
Currently dax mappings survive block_device shutdown. While page cache
pages are permitted to be read/written after the block_device is torn
down, this is not acceptable in the dax case, as all media access must
end when the device is disabled. The pfn backing a dax mapping is
permitted to be invalidated after bdev shutdown, and this is indeed the
case with brd.

When a dax-capable block_device driver calls del_gendisk() in its
shutdown path, del_gendisk() needs to ensure that all DAX pfns are
unmapped. This is different from the pagecache-backed case, where the
disk is protected by the queue being torn down, which ends I/O to the
device. Since dax bypasses the page cache, we need to unconditionally
unmap the inode.
Cc: <stable@vger.kernel.org>
Cc: Jan Kara <jack@suse.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
[honza: drop changes to truncate_inode_pages_final]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
fs/inode.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/fs/inode.c b/fs/inode.c
index 1be5f9003eb3..dcb31d2c15e6 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -579,6 +579,18 @@ static void dispose_list(struct list_head *head)
 	}
 }
 
+static void unmap_list(struct list_head *head)
+{
+	struct inode *inode, *_i;
+
+	list_for_each_entry_safe(inode, _i, head, i_lru) {
+		list_del_init(&inode->i_lru);
+		unmap_mapping_range(&inode->i_data, 0, 0, 1);
+		iput(inode);
+		cond_resched();
+	}
+}
+
 /**
  * evict_inodes	- evict all evictable inodes for a superblock
  * @sb:		superblock to operate on
@@ -642,6 +654,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	int busy = 0;
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
+	LIST_HEAD(unmap);
 
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
@@ -655,6 +668,19 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 			busy = 1;
 			continue;
 		}
+		if (IS_DAX(inode) && atomic_read(&inode->i_count)) {
+			/*
+			 * dax mappings can't live past this invalidation event
+			 * as there is no page cache present to allow the data
+			 * to remain accessible.
+			 */
+			__iget(inode);
+			inode_lru_list_del(inode);
+			spin_unlock(&inode->i_lock);
+			list_add(&inode->i_lru, &unmap);
+			busy = 1;
+			continue;
+		}
 		if (atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
 			busy = 1;
@@ -669,6 +695,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	spin_unlock(&sb->s_inode_list_lock);
 
 	dispose_list(&dispose);
+	unmap_list(&unmap);
 
 	return busy;
 }