Linux-NVDIMM Archive on lore.kernel.org
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-nvdimm@lists.01.org, NeilBrown <neilb@suse.com>, Wilcox,
Subject: Re: [PATCH 07/10] dax: Disable huge page handling
Date: Wed, 23 Mar 2016 14:50:00 -0600
Message-ID: <20160323205000.GE5544@linux.intel.com>
In-Reply-To: <1458566575-28063-8-git-send-email-jack@suse.cz>

On Mon, Mar 21, 2016 at 02:22:52PM +0100, Jan Kara wrote:
> Currently the handling of huge pages for DAX is racy. For example the
> following can happen:
> 
> CPU0 (THP write fault)			CPU1 (normal read fault)
> 
> __dax_pmd_fault()			__dax_fault()
>   get_block(inode, block, &bh, 0) -> not mapped
> 					get_block(inode, block, &bh, 0)
> 					  -> not mapped
>   if (!buffer_mapped(&bh) && write)
>     get_block(inode, block, &bh, 1) -> allocates blocks
>   truncate_pagecache_range(inode, lstart, lend);
> 					dax_load_hole();
> 
> This results in data corruption since process on CPU1 won't see changes
> into the file done by CPU0.
> 
> The race can happen even if two normal faults race however with THP the
> situation is even worse because the two faults don't operate on the same
> entries in the radix tree and we want to use these entries for
> serialization. So disable THP support in DAX code for now.
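
To make the interleaving above concrete, here is a toy userspace replay of it.
This is only a sketch, not kernel code: struct toy_block, toy_get_block() and
the two replay functions are made-up names, and the "lock" in the second
function is just an assumption standing in for whatever serializes the two
faults.

```c
#include <stdbool.h>

/* Toy model of one file block; not a kernel structure. */
struct toy_block {
	bool allocated;
};

/*
 * Hypothetical stand-in for get_block(inode, block, &bh, create):
 * allocates when 'create' is set and reports whether the block is mapped.
 */
static bool toy_get_block(struct toy_block *b, bool create)
{
	if (create)
		b->allocated = true;
	return b->allocated;
}

/*
 * Replays the quoted interleaving step by step.  Returns true when CPU1
 * ends up installing a hole page although CPU0 has allocated the block.
 */
static bool toy_replay_race(void)
{
	struct toy_block blk = { .allocated = false };

	bool cpu0_mapped = toy_get_block(&blk, false);	/* CPU0: not mapped */
	bool cpu1_mapped = toy_get_block(&blk, false);	/* CPU1: not mapped */

	if (!cpu0_mapped)
		toy_get_block(&blk, true);		/* CPU0: allocates blocks */
	/* CPU0: truncate_pagecache_range() would run here, too early */

	bool cpu1_hole_page = !cpu1_mapped;	/* CPU1: dax_load_hole() on stale result */

	return blk.allocated && cpu1_hole_page;
}

/*
 * Same faults, but CPU1 only looks the block up after CPU0 has finished,
 * as if both had taken the same (assumed) lock.
 */
static bool toy_replay_serialized(void)
{
	struct toy_block blk = { .allocated = false };

	if (!toy_get_block(&blk, false))
		toy_get_block(&blk, true);		/* CPU0: whole fault */

	bool cpu1_mapped = toy_get_block(&blk, false);	/* CPU1: sees the mapping */
	bool cpu1_hole_page = !cpu1_mapped;

	return blk.allocated && cpu1_hole_page;
}
```

In this model toy_replay_race() reports the stale hole page and
toy_replay_serialized() does not, which is the property the shared radix tree
entry is supposed to give us.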

Yep, I agree that we should disable PMD faults until we get the multi-order
radix tree work finished and integrated with this locking.

I do agree with Dan, though, that it would be preferable to disable PMD faults
by having CONFIG_FS_DAX_PMD depend on BROKEN.  That change seems smaller, and
it makes it easier to switch PMD faults back on for testing.
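
As a rough illustration of why a config symbol is easier to flip than '#if 0',
here is a toy userspace sketch.  It assumes the PMD fault path is keyed off
CONFIG_FS_DAX_PMD; the toy_* names are made up, and TOY_FAULT_FALLBACK merely
stands in for the kernel's fallback return value.

```c
/* Toy model, not kernel code. */
enum toy_fault_result {
	TOY_FAULT_HANDLED,
	TOY_FAULT_FALLBACK,
};

/*
 * With "depends on BROKEN" the symbol cannot be selected in a normal
 * build, so it is simply undefined here.
 */
#ifdef CONFIG_FS_DAX_PMD
#define TOY_PMD_ENABLED 1
#else
#define TOY_PMD_ENABLED 0
#endif

/* Hypothetical PMD fault entry point: falls back to PTE faults when disabled. */
static enum toy_fault_result toy_pmd_fault(void)
{
	if (!TOY_PMD_ENABLED)
		return TOY_FAULT_FALLBACK;
	return TOY_FAULT_HANDLED;
}
```

Building the sketch with -DCONFIG_FS_DAX_PMD turns the PMD path back on
without touching the source, which is the kind of switch I'd like to keep for
testing.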

> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/dax.c            | 2 +-
>  include/linux/dax.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 0329ec0bee2e..444e9dd079ca 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -719,7 +719,7 @@ int dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
>  }
>  EXPORT_SYMBOL_GPL(dax_fault);
>  
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if 0
>  /*
>   * The 'colour' (ie low bits) within a PMD of a page offset.  This comes up
>   * more often than one might expect in the below function.
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 4b63923e1f8d..fd28d824254b 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -29,7 +29,7 @@ static inline struct page *read_dax_sector(struct block_device *bdev,
>  }
>  #endif
>  
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if 0
>  int dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *,
>  				unsigned int flags, get_block_t);
>  int __dax_pmd_fault(struct vm_area_struct *, unsigned long addr, pmd_t *,
> -- 
> 2.6.2
> 

