From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-nvdimm@lists.01.org, NeilBrown <neilb@suse.com>, Wilcox,
Subject: Re: [PATCH 05/10] dax: Allow DAX code to replace exceptional entries
Date: Wed, 23 Mar 2016 11:52:58 -0600 [thread overview]
Message-ID: <20160323175258.GD5544@linux.intel.com> (raw)
In-Reply-To: <1458566575-28063-6-git-send-email-jack@suse.cz>
On Mon, Mar 21, 2016 at 02:22:50PM +0100, Jan Kara wrote:
> Currently we forbid page_cache_tree_insert() to replace exceptional radix
> tree entries for DAX inodes. However to make DAX faults race free we will
> lock radix tree entries and when hole is created, we need to replace
> such locked radix tree entry with a hole page. So modify
> page_cache_tree_insert() to allow that.
Perhaps this is addressed later in the series, but at first glance this seems
unsafe to me - what happens of we had tasks waiting on the locked entry? Are
they woken up when we replace the locked entry with a page cache entry?
If they are woken up, will they be alright with the fact that they were
sleeping on a lock for an exceptional entry, and now that lock no longer
exists? The radix tree entry will now be filled with a zero page hole, and so
they can't acquire the lock they were sleeping for - instead if they wanted to
lock that page they'd need to lock the zero page page lock, right?
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> include/linux/dax.h | 6 ++++++
> mm/filemap.c | 18 +++++++++++-------
> 2 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 7c45ac7ea1d1..4b63923e1f8d 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -3,8 +3,14 @@
>
> #include <linux/fs.h>
> #include <linux/mm.h>
> +#include <linux/radix-tree.h>
> #include <asm/pgtable.h>
>
> +/*
> + * Since exceptional entries do not use indirect bit, we reuse it as a lock bit
> + */
> +#define DAX_ENTRY_LOCK RADIX_TREE_INDIRECT_PTR
> +
> ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t,
> get_block_t, dio_iodone_t, int flags);
> int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size);
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 7c00f105845e..fbebedaf719e 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -597,14 +597,18 @@ static int page_cache_tree_insert(struct address_space *mapping,
> if (!radix_tree_exceptional_entry(p))
> return -EEXIST;
>
> - if (WARN_ON(dax_mapping(mapping)))
> - return -EINVAL;
> -
> - if (shadowp)
> - *shadowp = p;
> mapping->nrexceptional--;
> - if (node)
> - workingset_node_shadows_dec(node);
> + if (!dax_mapping(mapping)) {
> + if (shadowp)
> + *shadowp = p;
> + if (node)
> + workingset_node_shadows_dec(node);
> + } else {
> + /* DAX can replace empty locked entry with a hole */
> + WARN_ON_ONCE(p !=
> + (void *)(RADIX_TREE_EXCEPTIONAL_ENTRY |
> + DAX_ENTRY_LOCK));
> + }
> }
> radix_tree_replace_slot(slot, page);
> mapping->nrpages++;
> --
> 2.6.2
>
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
next prev parent reply other threads:[~2016-03-23 17:54 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-21 13:22 [RFC v2] [PATCH 0/10] DAX page fault locking Jan Kara
2016-03-21 13:22 ` [PATCH 01/10] DAX: move RADIX_DAX_ definitions to dax.c Jan Kara
2016-03-21 17:25 ` Matthew Wilcox
2016-03-21 13:22 ` [PATCH 02/10] radix-tree: make 'indirect' bit available to exception entries Jan Kara
2016-03-21 17:34 ` Matthew Wilcox
2016-03-22 9:12 ` Jan Kara
2016-03-22 9:27 ` Matthew Wilcox
2016-03-22 10:37 ` Jan Kara
2016-03-23 16:41 ` Ross Zwisler
2016-03-24 12:31 ` Jan Kara
2016-03-21 13:22 ` [PATCH 03/10] dax: Remove complete_unwritten argument Jan Kara
2016-03-23 17:12 ` Ross Zwisler
2016-03-24 12:32 ` Jan Kara
2016-03-21 13:22 ` [PATCH 04/10] dax: Fix data corruption for written and mmapped files Jan Kara
2016-03-23 17:39 ` Ross Zwisler
2016-03-24 12:51 ` Jan Kara
2016-03-29 15:17 ` Ross Zwisler
2016-03-21 13:22 ` [PATCH 05/10] dax: Allow DAX code to replace exceptional entries Jan Kara
2016-03-23 17:52 ` Ross Zwisler [this message]
2016-03-24 10:42 ` Jan Kara
2016-03-21 13:22 ` [PATCH 06/10] dax: Remove redundant inode size checks Jan Kara
2016-03-23 21:08 ` Ross Zwisler
2016-03-21 13:22 ` [PATCH 07/10] dax: Disable huge page handling Jan Kara
2016-03-23 20:50 ` Ross Zwisler
2016-03-24 12:56 ` Jan Kara
2016-03-21 13:22 ` [PATCH 08/10] dax: New fault locking Jan Kara
2016-03-29 21:57 ` Ross Zwisler
2016-03-31 16:27 ` Jan Kara
2016-03-21 13:22 ` [PATCH 09/10] dax: Use radix tree entry lock to protect cow faults Jan Kara
2016-03-21 19:11 ` Matthew Wilcox
2016-03-22 7:03 ` Jan Kara
2016-03-29 22:18 ` Ross Zwisler
2016-03-21 13:22 ` [PATCH 10/10] dax: Remove i_mmap_lock protection Jan Kara
2016-03-29 22:17 ` Ross Zwisler
2016-03-21 17:41 ` [RFC v2] [PATCH 0/10] DAX page fault locking Matthew Wilcox
2016-03-23 15:09 ` Jan Kara
2016-03-23 20:50 ` Matthew Wilcox
2016-03-24 10:00 ` Matthew Wilcox
2016-03-22 19:32 ` Ross Zwisler
2016-03-22 21:07 ` Toshi Kani
2016-03-22 21:15 ` Dave Chinner
2016-03-23 9:45 ` Jan Kara
2016-03-23 15:11 ` Toshi Kani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160323175258.GD5544@linux.intel.com \
--to=ross.zwisler@linux.intel.com \
--cc=jack@suse.cz \
--cc=linux-nvdimm@lists.01.org \
--cc=neilb@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox