From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: [PATCH v2 0/4] Write protect DAX PMDs in *sync path
Date: Thu, 22 Dec 2016 14:18:52 -0700
Message-ID: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler, Alexander Viro, Andrew Morton, Arnd Bergmann, Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org
List-Id: linux-arch.vger.kernel.org

Currently dax_mapping_entry_mkclean() fails to clean and write protect the
pmd_t of a DAX PMD entry during an *sync operation. This can result in
data loss, as detailed in patch 4.

You can find a working tree here:

https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2

This series applies cleanly to mmotm-2016-12-19-16-31.

Changes since v1:
 - Included Dan's patch to kill DAX support for UML.
 - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in
   an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case
   when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. (Dan & Jan)

Dan Williams (1):
  dax: kill uml support

Ross Zwisler (3):
  dax: add stub for pmdp_huge_clear_flush()
  mm: add follow_pte_pmd()
  dax: wrprotect pmd_t in dax_mapping_entry_mkclean

 fs/Kconfig                    |  2 +-
 fs/dax.c                      | 49 ++++++++++++++++++++++++++++++-------------
 include/asm-generic/pgtable.h | 10 +++++++++
 include/linux/mm.h            |  4 ++--
 mm/memory.c                   | 41 ++++++++++++++++++++++++++++--------
 5 files changed, 79 insertions(+), 27 deletions(-)

-- 
2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: [PATCH v2 1/4] dax: kill uml support
Date: Thu, 22 Dec 2016 14:18:53 -0700
Message-ID: <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: linux-kernel@vger.kernel.org
Cc: Dan Williams, Alexander Viro, Andrew Morton, Arnd Bergmann, Christoph Hellwig, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org, Ross Zwisler
List-Id: linux-arch.vger.kernel.org

From: Dan Williams

The lack of common transparent-huge-page helpers for UML is becoming
increasingly painful for fs/dax.c now that it is growing more pmd
functionality. Add UML to the list of unsupported architectures.

Cc: Jan Kara
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Dave Hansen
Cc: Matthew Wilcox
Cc: Alexander Viro
Cc: Ross Zwisler
Signed-off-by: Dan Williams
[rez: squashed #ifdef removal into another patch in the series ]
Signed-off-by: Ross Zwisler
---
 fs/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index c2a377c..661931f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -37,7 +37,7 @@ source "fs/f2fs/Kconfig"
 config FS_DAX
 	bool "Direct Access (DAX) support"
 	depends on MMU
-	depends on !(ARM || MIPS || SPARC)
+	depends on !(ARM || MIPS || SPARC || UML)
 	help
 	  Direct Access (DAX) can be used on memory-backed block devices.
 	  If the block device supports DAX and the filesystem supports DAX,
-- 
2.7.4

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: [PATCH v2 4/4] dax: wrprotect pmd_t in dax_mapping_entry_mkclean
Date: Thu, 22 Dec 2016 14:18:56 -0700
Message-ID: <1482441536-14550-5-git-send-email-ross.zwisler@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jan Kara, Andrew Morton, Arnd Bergmann, Matthew Wilcox, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org, Dave Chinner, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Dave Hansen, Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-arch.vger.kernel.org

Currently dax_mapping_entry_mkclean() fails to clean and write protect the
pmd_t of a DAX PMD entry during an *sync operation. This can result in
data loss in the following sequence:

1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the
   pmd_t dirty and writeable
2) fsync, flushing out PMD data and cleaning the radix tree entry. We
   currently fail to mark the pmd_t as clean and write protected.
3) more mmap writes to the PMD. These don't cause any page faults since
   the pmd_t is dirty and writeable. The radix tree entry remains clean.
4) fsync, which fails to flush the dirty PMD data because the radix tree
   entry was clean.
5) crash - dirty data that should have been fsync'd as part of 4) could
   still have been in the processor cache, and is lost.
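
The five steps above can be replayed in a small userspace model. This is
only a sketch, not kernel code: struct dax_model and the model_*() helpers
are hypothetical names, the radix tree entry and the pmd_t are each reduced
to a single boolean, and the wrprotect argument selects whether the sync
step also write protects the mapping.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one DAX PMD mapping; every name here is illustrative. */
struct dax_model {
	bool entry_dirty;	/* radix tree entry is tagged dirty */
	bool pmd_writable;	/* pmd_t lets writes through without a fault */
	int cached;		/* writes still sitting in the CPU cache */
	int durable;		/* writes that have been flushed to media */
};

/* A store through the mapping. It faults only when write protected. */
static void model_mmap_write(struct dax_model *m)
{
	assert(m);
	if (!m->pmd_writable) {
		/* Write fault: re-dirty the tracking entry, allow writes. */
		m->entry_dirty = true;
		m->pmd_writable = true;
	}
	m->cached++;
}

/* fsync flushes only when the tracking entry says there is dirty data. */
static void model_fsync(struct dax_model *m, bool wrprotect)
{
	assert(m);
	if (!m->entry_dirty)
		return;
	m->durable += m->cached;
	m->cached = 0;
	m->entry_dirty = false;
	if (wrprotect)
		m->pmd_writable = false;	/* next write must fault */
}

/* Replays steps 1-4 and returns how many writes a crash at 5) loses. */
int model_lost_writes(bool wrprotect)
{
	struct dax_model m = { 0 };

	model_mmap_write(&m);		/* 1) first write faults */
	model_fsync(&m, wrprotect);	/* 2) flush and clean the entry */
	model_mmap_write(&m);		/* 3) second write */
	model_fsync(&m, wrprotect);	/* 4) second fsync */
	return m.cached;		/* 5) unflushed data at crash time */
}
```

In this model, model_lost_writes(false) leaves one write unflushed, while
model_lost_writes(true) leaves none, because the second write faults and
re-dirties the entry before the second fsync runs.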

Fix this by marking the pmd_t clean and write protected in
dax_mapping_entry_mkclean(), which is called as part of the fsync
operation 2). This will cause the writes in step 3) above to generate page
faults where we'll re-dirty the PMD radix tree entry, resulting in flushes
in the fsync that happens in step 4).

Signed-off-by: Ross Zwisler
Cc: Jan Kara
Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush")
Reviewed-by: Jan Kara
---
 fs/dax.c           | 49 ++++++++++++++++++++++++++++++++++---------------
 include/linux/mm.h |  2 --
 mm/memory.c        |  4 ++--
 3 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 5c74f60..62b3ed4 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -691,8 +691,8 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 		pgoff_t index, unsigned long pfn)
 {
 	struct vm_area_struct *vma;
-	pte_t *ptep;
-	pte_t pte;
+	pte_t pte, *ptep = NULL;
+	pmd_t *pmdp = NULL;
 	spinlock_t *ptl;
 	bool changed;
 
@@ -707,21 +707,40 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 
 		address = pgoff_address(index, vma);
 		changed = false;
-		if (follow_pte(vma->vm_mm, address, &ptep, &ptl))
+		if (follow_pte_pmd(vma->vm_mm, address, &ptep, &pmdp, &ptl))
 			continue;
-		if (pfn != pte_pfn(*ptep))
-			goto unlock;
-		if (!pte_dirty(*ptep) && !pte_write(*ptep))
-			goto unlock;
 
-		flush_cache_page(vma, address, pfn);
-		pte = ptep_clear_flush(vma, address, ptep);
-		pte = pte_wrprotect(pte);
-		pte = pte_mkclean(pte);
-		set_pte_at(vma->vm_mm, address, ptep, pte);
-		changed = true;
-unlock:
-		pte_unmap_unlock(ptep, ptl);
+		if (pmdp) {
+			pmd_t pmd;
+
+			if (pfn != pmd_pfn(*pmdp))
+				goto unlock_pmd;
+			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
+				goto unlock_pmd;
+
+			flush_cache_page(vma, address, pfn);
+			pmd = pmdp_huge_clear_flush(vma, address, pmdp);
+			pmd = pmd_wrprotect(pmd);
+			pmd = pmd_mkclean(pmd);
+			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
+			changed = true;
+unlock_pmd:
+			spin_unlock(ptl);
+		} else {
+			if (pfn != pte_pfn(*ptep))
+				goto unlock_pte;
+			if (!pte_dirty(*ptep) && !pte_write(*ptep))
+				goto unlock_pte;
+
+			flush_cache_page(vma, address, pfn);
+			pte = ptep_clear_flush(vma, address, ptep);
+			pte = pte_wrprotect(pte);
+			pte = pte_mkclean(pte);
+			set_pte_at(vma->vm_mm, address, ptep, pte);
+			changed = true;
+unlock_pte:
+			pte_unmap_unlock(ptep, ptl);
+		}
 
 		if (changed)
 			mmu_notifier_invalidate_page(vma->vm_mm, address);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ff0e1c1..f4de7fa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1210,8 +1210,6 @@ int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
 			struct vm_area_struct *vma);
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
-int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp,
-	       spinlock_t **ptlp);
 int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 			     pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index 29edd91..ddcf979 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3826,8 +3826,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 	return -EINVAL;
 }
 
-int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp,
-	       spinlock_t **ptlp)
+static inline int follow_pte(struct mm_struct *mm, unsigned long address,
+			     pte_t **ptepp, spinlock_t **ptlp)
 {
 	int res;
 
-- 
2.7.4

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: [PATCH v2 3/4] mm: add follow_pte_pmd()
Date: Thu, 22 Dec 2016 14:18:55 -0700
Message-ID: <1482441536-14550-4-git-send-email-ross.zwisler@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: linux-kernel@vger.kernel.org
Cc: Ross Zwisler, Alexander Viro, Andrew Morton, Arnd Bergmann, Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org
List-Id: linux-arch.vger.kernel.org

Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a
huge page PMD leaf to be found and returned.

Signed-off-by: Ross Zwisler
Suggested-by: Dave Hansen
---
 include/linux/mm.h |  2 ++
 mm/memory.c        | 37 ++++++++++++++++++++++++++++++-------
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4424784..ff0e1c1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1212,6 +1212,8 @@ void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
 int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp,
 	       spinlock_t **ptlp);
+int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+			     pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn);
 int follow_phys(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index 455c3e6..29edd91 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3779,8 +3779,8 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-static int __follow_pte(struct mm_struct *mm, unsigned long address,
-		pte_t **ptepp, spinlock_t **ptlp)
+static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+		pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	pud_t *pud;
@@ -3797,11 +3797,20 @@ static int __follow_pte(struct mm_struct *mm, unsigned long address,
 
 	pmd = pmd_offset(pud, address);
 	VM_BUG_ON(pmd_trans_huge(*pmd));
-	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
-		goto out;
 
-	/* We cannot handle huge page PFN maps. Luckily they don't exist. */
-	if (pmd_huge(*pmd))
+	if (pmd_huge(*pmd)) {
+		if (!pmdpp)
+			goto out;
+
+		*ptlp = pmd_lock(mm, pmd);
+		if (pmd_huge(*pmd)) {
+			*pmdpp = pmd;
+			return 0;
+		}
+		spin_unlock(*ptlp);
+	}
+
+	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;
 
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
@@ -3824,9 +3833,23 @@ int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp,
 
 	/* (void) is needed to make gcc happy */
 	(void) __cond_lock(*ptlp,
-			   !(res = __follow_pte(mm, address, ptepp, ptlp)));
+			   !(res = __follow_pte_pmd(mm, address, ptepp, NULL,
+					   ptlp)));
+	return res;
+}
+
+int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+			     pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
+{
+	int res;
+
+	/* (void) is needed to make gcc happy */
+	(void) __cond_lock(*ptlp,
+			   !(res = __follow_pte_pmd(mm, address, ptepp, pmdpp,
+					   ptlp)));
 	return res;
 }
+EXPORT_SYMBOL(follow_pte_pmd);
 
 /**
  * follow_pfn - look up PFN at a user virtual address
-- 
2.7.4
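
The calling convention that follow_pte_pmd() introduces in the patch above
(two output pointers, at most one of which is filled in, and a NULL pmdpp
meaning "the caller cannot handle huge leaves") can be sketched in
userspace. Everything below is an assumption made for illustration:
struct leaf, struct mid and model_follow() are hypothetical names, the
page table is collapsed to one level, and the locking and the recheck
under the lock that the real function performs are left out.

```c
#include <assert.h>
#include <stddef.h>

#define LEAVES_PER_TABLE 4

/* Stand-in for a PTE: just the page frame number it maps. */
struct leaf {
	unsigned long pfn;
};

/* Stand-in for a PMD slot: either a huge leaf or a table of small leaves. */
struct mid {
	int is_huge;
	unsigned long huge_pfn;	/* meaningful when is_huge is set */
	struct leaf *table;	/* meaningful when is_huge is clear */
};

/*
 * Returns 0 and fills in exactly one of *leafpp or *midpp on success,
 * -1 otherwise. Callers pass midpp == NULL when they only understand
 * small leaves, which mirrors how follow_pte() wraps the shared helper.
 */
int model_follow(struct mid *m, size_t idx,
		 struct leaf **leafpp, struct mid **midpp)
{
	assert(m && leafpp);

	if (m->is_huge) {
		if (!midpp)
			return -1;	/* huge leaf, but caller cannot use it */
		*midpp = m;
		return 0;
	}

	if (!m->table || idx >= LEAVES_PER_TABLE)
		return -1;		/* nothing mapped at this index */

	*leafpp = &m->table[idx];
	return 0;
}
```

A caller in this model initialises both output pointers to NULL and then
branches on which one came back non-NULL, which is the same shape as the
pmdp check in the dax_mapping_entry_mkclean() change.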

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush()
Date: Thu, 22 Dec 2016 14:18:54 -0700
Message-ID: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jan Kara, Andrew Morton, Arnd Bergmann, Matthew Wilcox, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org, Dave Chinner, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Dave Hansen, Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
List-Id: linux-arch.vger.kernel.org

Add a pmdp_huge_clear_flush() stub for configs that don't define
CONFIG_TRANSPARENT_HUGEPAGE.

We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at
least we do want this to compile successfully even for configs without
CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code
gets called, based on whether we find DAX PMD entries in our tree. We
shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE
configs, so this function should never be called.
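
The trade-off described above (let the caller build on every config and
warn at run time if the stub is ever reached) can be sketched in userspace
C. The names are hypothetical: HAVE_HUGE_ENTRIES stands in for
CONFIG_TRANSPARENT_HUGEPAGE and is deliberately left undefined,
warn_once() is a simplified imitation of WARN_ON_ONCE() that keeps one
global flag instead of one per call site, and clean_entry() plays the
role of the DAX caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical config switch; leave it undefined to build the stub. */
/* #define HAVE_HUGE_ENTRIES 1 */

int stub_warnings;	/* number of warnings actually emitted */

/* Report the first unexpected call only, then stay quiet. */
static void warn_once(const char *what)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	stub_warnings++;
	fprintf(stderr, "WARNING: %s() called unexpectedly\n", what);
}

#ifdef HAVE_HUGE_ENTRIES
/* The real implementation would live in another file. */
unsigned long huge_clear_flush(unsigned long *entryp);
#else
/* Stub: compiles everywhere, complains if it is ever reached. */
static inline unsigned long huge_clear_flush(unsigned long *entryp)
{
	warn_once(__func__);
	return *entryp;
}
#endif

/* The caller builds on every config; the branch is taken at run time. */
unsigned long clean_entry(unsigned long *entryp, bool is_huge)
{
	assert(entryp);
	if (is_huge)
		return huge_clear_flush(entryp);
	*entryp &= ~1UL;	/* pretend bit 0 is the dirty bit */
	return *entryp;
}
```

A compile-time check in place of the warning would break the build of
clean_entry() whenever the switch is off, even though the huge branch is
never expected to run there.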

Signed-off-by: Ross Zwisler
---
 include/asm-generic/pgtable.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 18af2bc..65e9536 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma,
 #endif
 
 #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
 			      unsigned long address,
 			      pmd_t *pmdp);
+#else
+static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
+					  unsigned long address,
+					  pmd_t *pmdp)
+{
+	WARN_ON_ONCE(1);
+	return *pmdp;
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
 #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
-- 
2.7.4

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kara
Subject: Re: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush()
Date: Fri, 23 Dec 2016 14:44:57 +0100
Message-ID: <20161223134457.GG22679@quack2.suse.cz>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1482441536-14550-3-git-send-email-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: Ross Zwisler
Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jan Kara, Arnd Bergmann, Matthew Wilcox, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org, Dave Chinner, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Christoph Hellwig, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Dave Hansen, Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Andrew Morton
List-Id: linux-arch.vger.kernel.org

On Thu 22-12-16 14:18:54, Ross Zwisler wrote:
> Add a pmdp_huge_clear_flush() stub for configs that don't define
> CONFIG_TRANSPARENT_HUGEPAGE.
> 
> We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at
> least we do want this to compile successfully even for configs without
> CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code
> gets called, based on whether we find DAX PMD entries in our tree. We
> shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE
> configs, so this function should never be called.
> 
> Signed-off-by: Ross Zwisler

Looks good. You can add:

Reviewed-by: Jan Kara

								Honza

> ---
>  include/asm-generic/pgtable.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 18af2bc..65e9536 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma,
>  #endif
>  
>  #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
>  			      unsigned long address,
>  			      pmd_t *pmdp);
> +#else
> +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma,
> +					  unsigned long address,
> +					  pmd_t *pmdp)
> +{
> +	WARN_ON_ONCE(1);
> +	return *pmdp;
> +}
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif
>  
>  #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT
> -- 
> 2.7.4
> 
-- 
Jan Kara
SUSE Labs, CR

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kara
Subject: Re: [PATCH v2 1/4] dax: kill uml support
Date: Fri, 23 Dec 2016 14:45:39 +0100
Message-ID: <20161223134539.GH22679@quack2.suse.cz>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <1482441536-14550-2-git-send-email-ross.zwisler-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Errors-To: linux-nvdimm-bounces-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
Sender: "Linux-nvdimm"
To: Ross Zwisler
Cc: linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Jan Kara, Andrew Morton, Arnd Bergmann, Matthew Wilcox, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org, Dave Chinner, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Dave Hansen, Alexander Viro, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Christoph Hellwig
List-Id: linux-arch.vger.kernel.org

On Thu 22-12-16 14:18:53, Ross Zwisler wrote:
> From: Dan Williams
> 
> The lack of common transparent-huge-page helpers for UML is becoming
> increasingly painful for fs/dax.c now that it is growing more pmd
> functionality. Add UML to the list of unsupported architectures.
> 
> Cc: Jan Kara
> Cc: Christoph Hellwig
> Cc: Dave Chinner
> Cc: Dave Hansen
> Cc: Matthew Wilcox
> Cc: Alexander Viro
> Cc: Ross Zwisler
> Signed-off-by: Dan Williams
> [rez: squashed #ifdef removal into another patch in the series ]
> Signed-off-by: Ross Zwisler

Fine by me. You can add:

Acked-by: Jan Kara

								Honza

> ---
>  fs/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/Kconfig b/fs/Kconfig
> index c2a377c..661931f 100644
> --- a/fs/Kconfig
> +++ b/fs/Kconfig
> @@ -37,7 +37,7 @@ source "fs/f2fs/Kconfig"
>  config FS_DAX
>  	bool "Direct Access (DAX) support"
>  	depends on MMU
> -	depends on !(ARM || MIPS || SPARC)
> +	depends on !(ARM || MIPS || SPARC || UML)
>  	help
>  	  Direct Access (DAX) can be used on memory-backed block devices.
>  	  If the block device supports DAX and the filesystem supports DAX,
> -- 
> 2.7.4
> 
-- 
Jan Kara
SUSE Labs, CR

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path
Date: Tue, 3 Jan 2017 17:13:49 -0700
Message-ID: <20170104001349.GA8176@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com>
Sender: owner-linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Alexander Viro, Andrew Morton, Arnd Bergmann, Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org
List-Id: linux-arch.vger.kernel.org

On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote:
> Currently dax_mapping_entry_mkclean() fails to clean and write protect the
> pmd_t of a DAX PMD entry during an *sync operation. This can result in
> data loss, as detailed in patch 4.
> 
> You can find a working tree here:
> 
> https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2
> 
> This series applies cleanly to mmotm-2016-12-19-16-31.
> 
> Changes since v1:
>  - Included Dan's patch to kill DAX support for UML.
>  - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in
>    an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case
>    when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. (Dan & Jan)
> 
> Dan Williams (1):
>   dax: kill uml support
> 
> Ross Zwisler (3):
>   dax: add stub for pmdp_huge_clear_flush()
>   mm: add follow_pte_pmd()
>   dax: wrprotect pmd_t in dax_mapping_entry_mkclean
> 
>  fs/Kconfig                    |  2 +-
>  fs/dax.c                      | 49 ++++++++++++++++++++++++++++++-------------
>  include/asm-generic/pgtable.h | 10 +++++++++
>  include/linux/mm.h            |  4 ++--
>  mm/memory.c                   | 41 ++++++++++++++++++++++++++++--------
>  5 files changed, 79 insertions(+), 27 deletions(-)

Well, 0-day found another architecture that doesn't define pmd_pfn() et al.,
so we'll need some more fixes. (Thank you, 0-day, for the coverage!)

I have to apologize, I didn't understand that Dan intended his "dax: kill uml
support" patch to land in v4.11. I thought he intended it as a cleanup to my
series, which really needs to land in v4.10. That's why I folded them
together into this v2, along with the wrapper suggested by Jan.

Andrew, does it work for you to just keep v1 of this series, and eventually
send that to Linus for v4.10?

https://lkml.org/lkml/2016/12/20/649

You've already pulled that one into -mm, and it does correctly solve the data
loss issue.

That would let us deal with getting rid of the #ifdef, blacklisting
architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on
series for v4.11.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path
Date: Thu, 5 Jan 2017 17:27:34 -0800
Message-ID: <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <20170104001349.GA8176@linux.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
In-Reply-To: <20170104001349.GA8176@linux.intel.com>
Sender: owner-linux-mm@kvack.org
To: Ross Zwisler
Cc: linux-kernel@vger.kernel.org, Alexander Viro, Arnd Bergmann, Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org
List-Id: linux-arch.vger.kernel.org

On Tue, 3 Jan 2017 17:13:49 -0700 Ross Zwisler wrote:

> On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote:
> > Currently dax_mapping_entry_mkclean() fails to clean and write protect the
> > pmd_t of a DAX PMD entry during an *sync operation. This can result in
> > data loss, as detailed in patch 4.
> > 
> > You can find a working tree here:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2
> > 
> > This series applies cleanly to mmotm-2016-12-19-16-31.
> > 
> > Changes since v1:
> >  - Included Dan's patch to kill DAX support for UML.
> >  - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in
> >    an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case
> >    when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. (Dan & Jan)
> > 
> > Dan Williams (1):
> >   dax: kill uml support
> > 
> > Ross Zwisler (3):
> >   dax: add stub for pmdp_huge_clear_flush()
> >   mm: add follow_pte_pmd()
> >   dax: wrprotect pmd_t in dax_mapping_entry_mkclean
> > 
> >  fs/Kconfig                    |  2 +-
> >  fs/dax.c                      | 49 ++++++++++++++++++++++++++++++-------------
> >  include/asm-generic/pgtable.h | 10 +++++++++
> >  include/linux/mm.h            |  4 ++--
> >  mm/memory.c                   | 41 ++++++++++++++++++++++++++++--------
> >  5 files changed, 79 insertions(+), 27 deletions(-)
> 
> Well, 0-day found another architecture that doesn't define pmd_pfn() et al.,
> so we'll need some more fixes. (Thank you, 0-day, for the coverage!)
> 
> I have to apologize, I didn't understand that Dan intended his "dax: kill uml
> support" patch to land in v4.11. I thought he intended it as a cleanup to my
> series, which really needs to land in v4.10. That's why I folded them
> together into this v2, along with the wrapper suggested by Jan.
> 
> Andrew, does it work for you to just keep v1 of this series, and eventually
> send that to Linus for v4.10?
> 
> https://lkml.org/lkml/2016/12/20/649
> 
> You've already pulled that one into -mm, and it does correctly solve the data
> loss issue.
> 
> That would let us deal with getting rid of the #ifdef, blacklisting
> architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on
> series for v4.11.

I have mm-add-follow_pte_pmd.patch and
dax-wrprotect-pmd_t-in-dax_mapping_entry_mkclean.patch queued for 4.10.
Please (re)send any additional patches, indicating for each one
whether you believe it should also go into 4.10?

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ross Zwisler
Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path
Date: Fri, 6 Jan 2017 11:18:19 -0700
Message-ID: <20170106181819.GA3486@linux.intel.com>
References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <20170104001349.GA8176@linux.intel.com> <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Received: from mga14.intel.com ([192.55.52.115]:13267 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758472AbdAFSSW (ORCPT ); Fri, 6 Jan 2017 13:18:22 -0500
Content-Disposition: inline
In-Reply-To: <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org>
Sender: linux-arch-owner@vger.kernel.org
To: Andrew Morton
Cc: Ross Zwisler, linux-kernel@vger.kernel.org, Alexander Viro, Arnd Bergmann, Christoph Hellwig, Dan Williams, Dave Chinner, Dave Hansen, Jan Kara, Matthew Wilcox, linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org

On Thu, Jan 05, 2017 at 05:27:34PM -0800, Andrew Morton wrote:
> On Tue, 3 Jan 2017 17:13:49 -0700 Ross Zwisler wrote:
> 
> > On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote:
> > > Currently dax_mapping_entry_mkclean() fails to clean and write protect the
> > > pmd_t of a DAX PMD entry during an *sync operation. This can result in
> > > data loss, as detailed in patch 4.
> > > 
> > > You can find a working tree here:
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2
> > > 
> > > This series applies cleanly to mmotm-2016-12-19-16-31.
> > > 
> > > Changes since v1:
> > >  - Included Dan's patch to kill DAX support for UML.
> > >  - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in
> > >    an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case
> > >    when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. (Dan & Jan)
> > > 
> > > Dan Williams (1):
> > >   dax: kill uml support
> > > 
> > > Ross Zwisler (3):
> > >   dax: add stub for pmdp_huge_clear_flush()
> > >   mm: add follow_pte_pmd()
> > >   dax: wrprotect pmd_t in dax_mapping_entry_mkclean
> > > 
> > >  fs/Kconfig                    |  2 +-
> > >  fs/dax.c                      | 49 ++++++++++++++++++++++++++++++-------------
> > >  include/asm-generic/pgtable.h | 10 +++++++++
> > >  include/linux/mm.h            |  4 ++--
> > >  mm/memory.c                   | 41 ++++++++++++++++++++++++++++--------
> > >  5 files changed, 79 insertions(+), 27 deletions(-)
> > 
> > Well, 0-day found another architecture that doesn't define pmd_pfn() et al.,
> > so we'll need some more fixes. (Thank you, 0-day, for the coverage!)
> > 
> > I have to apologize, I didn't understand that Dan intended his "dax: kill uml
> > support" patch to land in v4.11. I thought he intended it as a cleanup to my
> > series, which really needs to land in v4.10. That's why I folded them
> > together into this v2, along with the wrapper suggested by Jan.
> > 
> > Andrew, does it work for you to just keep v1 of this series, and eventually
> > send that to Linus for v4.10?
> > 
> > https://lkml.org/lkml/2016/12/20/649
> > 
> > You've already pulled that one into -mm, and it does correctly solve the data
> > loss issue.
> > 
> > That would let us deal with getting rid of the #ifdef, blacklisting
> > architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on
> > series for v4.11.
> 
> I have mm-add-follow_pte_pmd.patch and
> dax-wrprotect-pmd_t-in-dax_mapping_entry_mkclean.patch queued for 4.10.
> Please (re)send any additional patches, indicating for each one
> whether you believe it should also go into 4.10?
The two patches that you already have queued are correct, and no additional patches are necessary for v4.10 for this issue. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966807AbcLVVTJ (ORCPT ); Thu, 22 Dec 2016 16:19:09 -0500 From: Ross Zwisler Subject: [PATCH v2 3/4] mm: add follow_pte_pmd() Date: Thu, 22 Dec 2016 14:18:55 -0700 Message-ID: <1482441536-14550-4-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: <20161222211855.XNuoHhos_T0SGXmpp3AcgUDJ7NOf43Rk0EITrx7DZ40@z> Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a huge page PMD leaf to be found and returned. 
Signed-off-by: Ross Zwisler Suggested-by: Dave Hansen --- include/linux/mm.h | 2 ++ mm/memory.c | 37 ++++++++++++++++++++++++++++++------- 2 files changed, 32 insertions(+), 7 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 4424784..ff0e1c1 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1212,6 +1212,8 @@ void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, spinlock_t **ptlp); +int follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn); int follow_phys(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index 455c3e6..29edd91 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3779,8 +3779,8 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address) } #endif /* __PAGETABLE_PMD_FOLDED */ -static int __follow_pte(struct mm_struct *mm, unsigned long address, - pte_t **ptepp, spinlock_t **ptlp) +static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) { pgd_t *pgd; pud_t *pud; @@ -3797,11 +3797,20 @@ static int __follow_pte(struct mm_struct *mm, unsigned long address, pmd = pmd_offset(pud, address); VM_BUG_ON(pmd_trans_huge(*pmd)); - if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) - goto out; - /* We cannot handle huge page PFN maps. Luckily they don't exist. 
*/ - if (pmd_huge(*pmd)) + if (pmd_huge(*pmd)) { + if (!pmdpp) + goto out; + + *ptlp = pmd_lock(mm, pmd); + if (pmd_huge(*pmd)) { + *pmdpp = pmd; + return 0; + } + spin_unlock(*ptlp); + } + + if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) goto out; ptep = pte_offset_map_lock(mm, pmd, address, ptlp); @@ -3824,9 +3833,23 @@ int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, /* (void) is needed to make gcc happy */ (void) __cond_lock(*ptlp, - !(res = __follow_pte(mm, address, ptepp, ptlp))); + !(res = __follow_pte_pmd(mm, address, ptepp, NULL, + ptlp))); + return res; +} + +int follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) +{ + int res; + + /* (void) is needed to make gcc happy */ + (void) __cond_lock(*ptlp, + !(res = __follow_pte_pmd(mm, address, ptepp, pmdpp, + ptlp))); return res; } +EXPORT_SYMBOL(follow_pte_pmd); /** * follow_pfn - look up PFN at a user virtual address -- 2.7.4 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966872AbcLVVTU (ORCPT ); Thu, 22 Dec 2016 16:19:20 -0500 From: Ross Zwisler Subject: [PATCH v2 4/4] dax: wrprotect pmd_t in dax_mapping_entry_mkclean Date: Thu, 22 Dec 2016 14:18:56 -0700 Message-ID: <1482441536-14550-5-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: 
<20161222211856.YSM0rqRY669jRLgbtrOd1RbgxOWSqtPmR-Z5sEiQLqk@z> Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an *sync operation. This can result in data loss in the following sequence: 1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the pmd_t dirty and writeable 2) fsync, flushing out PMD data and cleaning the radix tree entry. We currently fail to mark the pmd_t as clean and write protected. 3) more mmap writes to the PMD. These don't cause any page faults since the pmd_t is dirty and writeable. The radix tree entry remains clean. 4) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean. 5) crash - dirty data that should have been fsync'd as part of 4) could still have been in the processor cache, and is lost. Fix this by marking the pmd_t clean and write protected in dax_mapping_entry_mkclean(), which is called as part of the fsync operation 2). This will cause the writes in step 3) above to generate page faults where we'll re-dirty the PMD radix tree entry, resulting in flushes in the fsync that happens in step 4). 
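The five steps above can be walked through with a small user-space model. This is only a sketch under simplifying assumptions: struct pmd_model and the model_* functions are hypothetical, one boolean stands in for each piece of state (pmd_t write permission, pmd_t dirty bit, radix tree dirty tag, unflushed data in the processor cache), and no real page tables or caches are involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of one DAX PMD mapping; not kernel code. */
struct pmd_model {
	bool writable;		/* pmd_t allows writes without faulting */
	bool hw_dirty;		/* pmd_t dirty bit */
	bool tree_dirty;	/* radix tree entry is tagged dirty */
	bool unflushed;		/* data may still sit in the CPU cache */
};

/* An mmap store: takes a fault only when the mapping is write protected. */
void model_write(struct pmd_model *m)
{
	if (!m->writable) {
		/* Write fault: re-dirty the tree entry, then allow writes. */
		m->tree_dirty = true;
		m->writable = true;
	}
	m->hw_dirty = true;
	m->unflushed = true;
}

/* fsync: flushes only entries that the radix tree says are dirty. */
void model_fsync(struct pmd_model *m, bool wrprotect_fix)
{
	if (!m->tree_dirty)
		return;
	m->unflushed = false;
	m->tree_dirty = false;
	if (wrprotect_fix) {
		/* What the fix adds: clean and write protect the pmd_t. */
		m->writable = false;
		m->hw_dirty = false;
	}
}

/* Runs steps 1) to 5); returns true if the crash would lose data. */
bool sequence_loses_data(bool wrprotect_fix)
{
	struct pmd_model m = { false, false, false, false };

	model_write(&m);			/* 1) first mmap write */
	model_fsync(&m, wrprotect_fix);		/* 2) fsync */
	model_write(&m);			/* 3) more mmap writes */
	model_fsync(&m, wrprotect_fix);		/* 4) fsync */
	return m.unflushed;			/* 5) crash */
}
```

In this model, sequence_loses_data(false) reports lost data because the second fsync sees a clean tree entry, while sequence_loses_data(true) does not, because the write in step 3) faults and re-dirties the entry.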
Signed-off-by: Ross Zwisler Cc: Jan Kara Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush") Reviewed-by: Jan Kara --- fs/dax.c | 49 ++++++++++++++++++++++++++++++++++--------------- include/linux/mm.h | 2 -- mm/memory.c | 4 ++-- 3 files changed, 36 insertions(+), 19 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5c74f60..62b3ed4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -691,8 +691,8 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, pgoff_t index, unsigned long pfn) { struct vm_area_struct *vma; - pte_t *ptep; - pte_t pte; + pte_t pte, *ptep = NULL; + pmd_t *pmdp = NULL; spinlock_t *ptl; bool changed; @@ -707,21 +707,40 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, address = pgoff_address(index, vma); changed = false; - if (follow_pte(vma->vm_mm, address, &ptep, &ptl)) + if (follow_pte_pmd(vma->vm_mm, address, &ptep, &pmdp, &ptl)) continue; - if (pfn != pte_pfn(*ptep)) - goto unlock; - if (!pte_dirty(*ptep) && !pte_write(*ptep)) - goto unlock; - flush_cache_page(vma, address, pfn); - pte = ptep_clear_flush(vma, address, ptep); - pte = pte_wrprotect(pte); - pte = pte_mkclean(pte); - set_pte_at(vma->vm_mm, address, ptep, pte); - changed = true; -unlock: - pte_unmap_unlock(ptep, ptl); + if (pmdp) { + pmd_t pmd; + + if (pfn != pmd_pfn(*pmdp)) + goto unlock_pmd; + if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp)) + goto unlock_pmd; + + flush_cache_page(vma, address, pfn); + pmd = pmdp_huge_clear_flush(vma, address, pmdp); + pmd = pmd_wrprotect(pmd); + pmd = pmd_mkclean(pmd); + set_pmd_at(vma->vm_mm, address, pmdp, pmd); + changed = true; +unlock_pmd: + spin_unlock(ptl); + } else { + if (pfn != pte_pfn(*ptep)) + goto unlock_pte; + if (!pte_dirty(*ptep) && !pte_write(*ptep)) + goto unlock_pte; + + flush_cache_page(vma, address, pfn); + pte = ptep_clear_flush(vma, address, ptep); + pte = pte_wrprotect(pte); + pte = pte_mkclean(pte); + set_pte_at(vma->vm_mm, address, ptep, pte); + changed = true; +unlock_pte: + 
pte_unmap_unlock(ptep, ptl); + } if (changed) mmu_notifier_invalidate_page(vma->vm_mm, address); diff --git a/include/linux/mm.h b/include/linux/mm.h index ff0e1c1..f4de7fa 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1210,8 +1210,6 @@ int copy_page_range(struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *vma); void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp); int follow_pte_pmd(struct mm_struct *mm, unsigned long address, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index 29edd91..ddcf979 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3826,8 +3826,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, return -EINVAL; } -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp) +static inline int follow_pte(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, spinlock_t **ptlp) { int res; -- 2.7.4
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966782AbcLVVTI (ORCPT ); Thu, 22 Dec 2016 16:19:08 -0500 From: Ross Zwisler Subject: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Date: Thu, 22 Dec 2016 14:18:54 -0700 Message-ID: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: <20161222211854.KgQTuyvRdkyKRVFQ2cxr-lJUAJ9EPWhk9tYbRyU0e_o@z> Add a pmdp_huge_clear_flush() stub for configs that don't define CONFIG_TRANSPARENT_HUGEPAGE. We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at least we do want this to compile successfully even for configs without CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code gets called, based on whether we find DAX PMD entries in our tree. We shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be called.
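The stub approach described above can be sketched in user space as follows. All names here are hypothetical: TOY_CONFIG_HUGEPAGE stands in for CONFIG_TRANSPARENT_HUGEPAGE, TOY_WARN_ONCE() is only a rough imitation of WARN_ON_ONCE(1), and toy_pmd_clear_flush() is not the kernel helper. The point is that callers compile in both configurations, and the stub complains at most once if it is ever reached at run time.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for pmd_t. */
typedef struct { unsigned long val; } toy_pmd_t;

/* Counts how many times the stub warning was actually printed. */
static int toy_warn_count;

/* Rough imitation of WARN_ON_ONCE(1): report at most once per call site. */
#define TOY_WARN_ONCE() do { \
	static int warned; \
	if (!warned) { \
		warned = 1; \
		toy_warn_count++; \
		fprintf(stderr, "warning: stub reached at %s:%d\n", \
			__FILE__, __LINE__); \
	} \
} while (0)

#ifdef TOY_CONFIG_HUGEPAGE
/* The real implementation would live in another translation unit. */
toy_pmd_t toy_pmd_clear_flush(toy_pmd_t *pmdp);
#else
/* Stub: keeps callers compiling, warns once if it is ever called. */
static inline toy_pmd_t toy_pmd_clear_flush(toy_pmd_t *pmdp)
{
	TOY_WARN_ONCE();
	return *pmdp;
}
#endif
```

A build-time failure, the rough equivalent of BUILD_BUG(), would instead break every caller in configurations without the feature, even when the call can never be reached at run time.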
Signed-off-by: Ross Zwisler --- include/asm-generic/pgtable.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 18af2bc..65e9536 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH +#ifdef CONFIG_TRANSPARENT_HUGEPAGE extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); +#else +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, + unsigned long address, + pmd_t *pmdp) +{ + WARN_ON_ONCE(1); + return *pmdp; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT -- 2.7.4 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de ([195.135.220.15]:42484 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751500AbcLWNpC (ORCPT ); Fri, 23 Dec 2016 08:45:02 -0500 Date: Fri, 23 Dec 2016 14:44:57 +0100 From: Jan Kara Subject: Re: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Message-ID: <20161223134457.GG22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: <20161223134457.3Ufed8DNJnbfN7b8eLGOHvvA8d0oCq-UnoA9cnxGDdI@z> On Thu 22-12-16 14:18:54, Ross Zwisler wrote: > 
Add a pmdp_huge_clear_flush() stub for configs that don't define > CONFIG_TRANSPARENT_HUGEPAGE. > > We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at > least we do want this to compile successfully even for configs without > CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this > code gets called, based on whether we find DAX PMD entries in our > tree. We shouldn't ever find such PMD entries for > !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be > called. > > Signed-off-by: Ross Zwisler Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > include/asm-generic/pgtable.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index 18af2bc..65e9536 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, > #endif > > #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > unsigned long address, > pmd_t *pmdp); > +#else > +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > + unsigned long address, > + pmd_t *pmdp) > +{ > + WARN_ON_ONCE(1); > + return *pmdp; > +} > +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > #endif > > #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de ([195.135.220.15]:42512 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759834AbcLWNpl (ORCPT ); Fri, 23 Dec 2016 08:45:41 -0500 Date: Fri, 23 Dec 2016 14:45:39 +0100 From: Jan Kara Subject: Re: [PATCH v2 1/4] dax: kill uml support Message-ID: <20161223134539.GH22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> 
<1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Dan Williams , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: <20161223134539.0Z6fJRqPle6eusA-6nyBo4B2bxszDjaZbUhNPjM4go8@z> On Thu 22-12-16 14:18:53, Ross Zwisler wrote: > From: Dan Williams > > The lack of common transparent-huge-page helpers for UML is becoming > increasingly painful for fs/dax.c now that it is growing more pmd > functionality. Add UML to the list of unsupported architectures. > > Cc: Jan Kara > Cc: Christoph Hellwig > Cc: Dave Chinner > Cc: Dave Hansen > Cc: Matthew Wilcox > Cc: Alexander Viro > Cc: Ross Zwisler > Signed-off-by: Dan Williams > [rez: squashed #ifdef removal into another patch in the series ] > Signed-off-by: Ross Zwisler Fine by me. You can add: Acked-by: Jan Kara Honza > --- > fs/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/Kconfig b/fs/Kconfig > index c2a377c..661931f 100644 > --- a/fs/Kconfig > +++ b/fs/Kconfig > @@ -37,7 +37,7 @@ source "fs/f2fs/Kconfig" > config FS_DAX > bool "Direct Access (DAX) support" > depends on MMU > - depends on !(ARM || MIPS || SPARC) > + depends on !(ARM || MIPS || SPARC || UML) > help > Direct Access (DAX) can be used on memory-backed block devices. 
> If the block device supports DAX and the filesystem supports DAX, > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga07.intel.com ([134.134.136.100]:62577 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760051AbdADAON (ORCPT ); Tue, 3 Jan 2017 19:14:13 -0500 Date: Tue, 3 Jan 2017 17:13:49 -0700 From: Ross Zwisler Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path Message-ID: <20170104001349.GA8176@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Andrew Morton Cc: linux-kernel@vger.kernel.org, Alexander Viro , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Message-ID: <20170104001349.jdBaA2Eqb5qON4mrC8edJjGGj8YBvd33HiSpsuuYZpc@z> On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote: > Currently dax_mapping_entry_mkclean() fails to clean and write protect the > pmd_t of a DAX PMD entry during an *sync operation. This can result in > data loss, as detailed in patch 4. > > You can find a working tree here: > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2 > > This series applies cleanly to mmotm-2016-12-19-16-31. > > Changes since v1: > - Included Dan's patch to kill DAX support for UML. > - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in > an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case > when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. 
(Dan & Jan) > > Dan Williams (1): > dax: kill uml support > > Ross Zwisler (3): > dax: add stub for pmdp_huge_clear_flush() > mm: add follow_pte_pmd() > dax: wrprotect pmd_t in dax_mapping_entry_mkclean > > fs/Kconfig | 2 +- > fs/dax.c | 49 ++++++++++++++++++++++++++++++------------- > include/asm-generic/pgtable.h | 10 +++++++++ > include/linux/mm.h | 4 ++-- > mm/memory.c | 41 ++++++++++++++++++++++++++++-------- > 5 files changed, 79 insertions(+), 27 deletions(-) Well, 0-day found another architecture that doesn't define pmd_pfn() et al., so we'll need some more fixes. (Thank you, 0-day, for the coverage!) I have to apologize, I didn't understand that Dan intended his "dax: kill uml support" patch to land in v4.11. I thought he intended it as a cleanup to my series, which really needs to land in v4.10. That's why I folded them together into this v2, along with the wrapper suggested by Jan. Andrew, does it work for you to just keep v1 of this series, and eventually send that to Linus for v4.10? https://lkml.org/lkml/2016/12/20/649 You've already pulled that one into -mm, and it does correctly solve the data loss issue. That would let us deal with getting rid of the #ifdef, blacklisting architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on series for v4.11. 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linuxfoundation.org ([140.211.169.12]:42010 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751068AbdAFB0R (ORCPT ); Thu, 5 Jan 2017 20:26:17 -0500 Date: Thu, 5 Jan 2017 17:27:34 -0800 From: Andrew Morton Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path Message-ID: <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org> In-Reply-To: <20170104001349.GA8176@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <20170104001349.GA8176@linux.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-arch-owner@vger.kernel.org List-ID: To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Alexander Viro , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Message-ID: <20170106012734.IDx6bfaKYAMIuFr2yWiichC8ZbUY8neGsvuJK5unFK8@z> On Tue, 3 Jan 2017 17:13:49 -0700 Ross Zwisler wrote: > On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote: > > Currently dax_mapping_entry_mkclean() fails to clean and write protect the > > pmd_t of a DAX PMD entry during an *sync operation. This can result in > > data loss, as detailed in patch 4. > > > > You can find a working tree here: > > > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2 > > > > This series applies cleanly to mmotm-2016-12-19-16-31. > > > > Changes since v1: > > - Included Dan's patch to kill DAX support for UML. > > - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in > > an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case > > when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. 
(Dan & Jan) > > > > Dan Williams (1): > > dax: kill uml support > > > > Ross Zwisler (3): > > dax: add stub for pmdp_huge_clear_flush() > > mm: add follow_pte_pmd() > > dax: wrprotect pmd_t in dax_mapping_entry_mkclean > > > > fs/Kconfig | 2 +- > > fs/dax.c | 49 ++++++++++++++++++++++++++++++------------- > > include/asm-generic/pgtable.h | 10 +++++++++ > > include/linux/mm.h | 4 ++-- > > mm/memory.c | 41 ++++++++++++++++++++++++++++-------- > > 5 files changed, 79 insertions(+), 27 deletions(-) > > Well, 0-day found another architecture that doesn't define pmd_pfn() et al., > so we'll need some more fixes. (Thank you, 0-day, for the coverage!) > > I have to apologize, I didn't understand that Dan intended his "dax: kill uml > support" patch to land in v4.11. I thought he intended it as a cleanup to my > series, which really needs to land in v4.10. That's why I folded them > together into this v2, along with the wrapper suggested by Jan. > > Andrew, does it work for you to just keep v1 of this series, and eventually > send that to Linus for v4.10? > > https://lkml.org/lkml/2016/12/20/649 > > You've already pulled that one into -mm, and it does correctly solve the data > loss issue. > > That would let us deal with getting rid of the #ifdef, blacklisting > architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on > series for v4.11. I have mm-add-follow_pte_pmd.patch and dax-wrprotect-pmd_t-in-dax_mapping_entry_mkclean.patch queued for 4.10. Please (re)send any additional patches, indicating for each one whether you believe it should also go into 4.10. 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 51344820AD for ; Fri, 23 Dec 2016 05:45:02 -0800 (PST) Date: Fri, 23 Dec 2016 14:44:57 +0100 From: Jan Kara Subject: Re: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Message-ID: <20161223134457.GG22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" 
Content-Transfer-Encoding: 7bit Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" To: Ross Zwisler Cc: linux-arch@vger.kernel.org, Jan Kara , Arnd Bergmann , Matthew Wilcox , linux-nvdimm@lists.01.org, Dave Chinner , linux-kernel@vger.kernel.org, Christoph Hellwig , linux-mm@kvack.org, Dave Hansen , Alexander Viro , linux-fsdevel@vger.kernel.org, Andrew Morton List-ID: On Thu 22-12-16 14:18:54, Ross Zwisler wrote: > Add a pmdp_huge_clear_flush() stub for configs that don't define > CONFIG_TRANSPARENT_HUGEPAGE. > > We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at > least we do want this compile successfully even for configs without > CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether we call > this code gets called, based on whether we find DAX PMD entries in our > tree. We shouldn't ever find such PMD entries for > !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be > called. > > Signed-off-by: Ross Zwisler Looks good. 
You can add: Reviewed-by: Jan Kara Honza > --- > include/asm-generic/pgtable.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index 18af2bc..65e9536 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, > #endif > > #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > unsigned long address, > pmd_t *pmdp); > +#else > +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > + unsigned long address, > + pmd_t *pmdp) > +{ > + WARN_ON_ONCE(1); > + return *pmdp; > +} > +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > #endif > > #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 1E8F2823B7 for ; Thu, 22 Dec 2016 13:19:07 -0800 (PST) From: Ross Zwisler Subject: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Date: Thu, 22 Dec 2016 14:18:54 -0700 Message-Id: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" To: linux-kernel@vger.kernel.org 
Cc: linux-arch@vger.kernel.org, Jan Kara , Andrew Morton , Arnd Bergmann , Matthew Wilcox , linux-nvdimm@lists.01.org, Dave Chinner , Christoph Hellwig , linux-mm@kvack.org, Dave Hansen , Alexander Viro , linux-fsdevel@vger.kernel.org List-ID: Add a pmdp_huge_clear_flush() stub for configs that don't define CONFIG_TRANSPARENT_HUGEPAGE. We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at least we do want this to compile successfully even for configs without CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code gets called, based on whether we find DAX PMD entries in our tree. We shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be called. Signed-off-by: Ross Zwisler --- include/asm-generic/pgtable.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 18af2bc..65e9536 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH +#ifdef CONFIG_TRANSPARENT_HUGEPAGE extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); +#else +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, + unsigned long address, + pmd_t *pmdp) +{ + WARN_ON_ONCE(1); + return *pmdp; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT -- 2.7.4 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 95D4C823CE for ; Thu, 22
Dec 2016 13:19:09 -0800 (PST) From: Ross Zwisler Subject: [PATCH v2 4/4] dax: wrprotect pmd_t in dax_mapping_entry_mkclean Date: Thu, 22 Dec 2016 14:18:56 -0700 Message-Id: <1482441536-14550-5-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" To: linux-kernel@vger.kernel.org Cc: linux-arch@vger.kernel.org, Jan Kara , Andrew Morton , Arnd Bergmann , Matthew Wilcox , linux-nvdimm@lists.01.org, Dave Chinner , Christoph Hellwig , linux-mm@kvack.org, Dave Hansen , Alexander Viro , linux-fsdevel@vger.kernel.org List-ID: Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an *sync operation. This can result in data loss in the following sequence: 1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the pmd_t dirty and writeable 2) fsync, flushing out PMD data and cleaning the radix tree entry. We currently fail to mark the pmd_t as clean and write protected. 3) more mmap writes to the PMD. These don't cause any page faults since the pmd_t is dirty and writeable. The radix tree entry remains clean. 4) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean. 5) crash - dirty data that should have been fsync'd as part of 4) could still have been in the processor cache, and is lost. Fix this by marking the pmd_t clean and write protected in dax_mapping_entry_mkclean(), which is called as part of the fsync operation 2). This will cause the writes in step 3) above to generate page faults where we'll re-dirty the PMD radix tree entry, resulting in flushes in the fsync that happens in step 4). 
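The five steps above can be replayed in a small userspace model. This is only a sketch under simplifying assumptions: struct model_state and the model_* functions are hypothetical, a single integer stands in for the PMD's data, and "flush" just copies the cached value to the durable copy. It is not the fs/dax.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one DAX PMD mapping; not kernel data structures. */
struct model_state {
	bool pmd_dirty;   /* dirty bit in the pmd_t */
	bool pmd_write;   /* write permission in the pmd_t */
	bool radix_dirty; /* dirty tag on the radix tree entry */
	int cpu_cache;    /* last value stored through the mapping */
	int media;        /* value that would survive a crash */
};

/* A store through mmap. Returns true if it took a write fault. */
static bool model_mmap_write(struct model_state *s, int value)
{
	bool faulted = false;

	if (!s->pmd_write) {
		/* Write fault: re-dirty the radix entry, allow writes. */
		s->radix_dirty = true;
		s->pmd_write = true;
		faulted = true;
	}
	s->pmd_dirty = true;
	s->cpu_cache = value;
	return faulted;
}

/* fsync flushes only when the radix tree entry is tagged dirty. */
static void model_fsync(struct model_state *s, bool wrprotect_pmd)
{
	if (!s->radix_dirty)
		return;
	s->media = s->cpu_cache;
	s->radix_dirty = false;
	if (wrprotect_pmd) {
		/* The fix: also clean and write protect the pmd_t. */
		s->pmd_dirty = false;
		s->pmd_write = false;
	}
}

/* Replays steps 1) to 5) and returns what is left after the crash. */
static int model_run_sequence(bool wrprotect_pmd)
{
	struct model_state s = { false, false, false, 0, 0 };

	model_mmap_write(&s, 1);        /* 1) first mmap write */
	model_fsync(&s, wrprotect_pmd); /* 2) fsync */
	model_mmap_write(&s, 2);        /* 3) more mmap writes */
	model_fsync(&s, wrprotect_pmd); /* 4) fsync */
	return s.media;                 /* 5) crash: only media survives */
}
```

In this model, running the sequence without write protection leaves the first value on media after the crash, while running it with write protection leaves the second value, because the write in step 3) faults and re-tags the radix tree entry.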
Signed-off-by: Ross Zwisler Cc: Jan Kara Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush") Reviewed-by: Jan Kara --- fs/dax.c | 49 ++++++++++++++++++++++++++++++++++--------------- include/linux/mm.h | 2 -- mm/memory.c | 4 ++-- 3 files changed, 36 insertions(+), 19 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5c74f60..62b3ed4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -691,8 +691,8 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, pgoff_t index, unsigned long pfn) { struct vm_area_struct *vma; - pte_t *ptep; - pte_t pte; + pte_t pte, *ptep = NULL; + pmd_t *pmdp = NULL; spinlock_t *ptl; bool changed; @@ -707,21 +707,40 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, address = pgoff_address(index, vma); changed = false; - if (follow_pte(vma->vm_mm, address, &ptep, &ptl)) + if (follow_pte_pmd(vma->vm_mm, address, &ptep, &pmdp, &ptl)) continue; - if (pfn != pte_pfn(*ptep)) - goto unlock; - if (!pte_dirty(*ptep) && !pte_write(*ptep)) - goto unlock; - flush_cache_page(vma, address, pfn); - pte = ptep_clear_flush(vma, address, ptep); - pte = pte_wrprotect(pte); - pte = pte_mkclean(pte); - set_pte_at(vma->vm_mm, address, ptep, pte); - changed = true; -unlock: - pte_unmap_unlock(ptep, ptl); + if (pmdp) { + pmd_t pmd; + + if (pfn != pmd_pfn(*pmdp)) + goto unlock_pmd; + if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp)) + goto unlock_pmd; + + flush_cache_page(vma, address, pfn); + pmd = pmdp_huge_clear_flush(vma, address, pmdp); + pmd = pmd_wrprotect(pmd); + pmd = pmd_mkclean(pmd); + set_pmd_at(vma->vm_mm, address, pmdp, pmd); + changed = true; +unlock_pmd: + spin_unlock(ptl); + } else { + if (pfn != pte_pfn(*ptep)) + goto unlock_pte; + if (!pte_dirty(*ptep) && !pte_write(*ptep)) + goto unlock_pte; + + flush_cache_page(vma, address, pfn); + pte = ptep_clear_flush(vma, address, ptep); + pte = pte_wrprotect(pte); + pte = pte_mkclean(pte); + set_pte_at(vma->vm_mm, address, ptep, pte); + changed = true; +unlock_pte: + 
pte_unmap_unlock(ptep, ptl); + } if (changed) mmu_notifier_invalidate_page(vma->vm_mm, address); diff --git a/include/linux/mm.h b/include/linux/mm.h index ff0e1c1..f4de7fa 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1210,8 +1210,6 @@ int copy_page_range(struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *vma); void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp); int follow_pte_pmd(struct mm_struct *mm, unsigned long address, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index 29edd91..ddcf979 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3826,8 +3826,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, return -EINVAL; } -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp) +static inline int follow_pte(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, spinlock_t **ptlp) { int res; -- 2.7.4 _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id CF24F81911 for ; Tue, 3 Jan 2017 16:13:50 -0800 (PST) Date: Tue, 3 Jan 2017 17:13:49 -0700 From: Ross Zwisler Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path Message-ID: <20170104001349.GA8176@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: 
<1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" To: Andrew Morton Cc: linux-arch@vger.kernel.org, Jan Kara , Arnd Bergmann , Matthew Wilcox , linux-nvdimm@lists.01.org, Dave Chinner , linux-kernel@vger.kernel.org, Christoph Hellwig , linux-mm@kvack.org, Dave Hansen , Alexander Viro , linux-fsdevel@vger.kernel.org List-ID: On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote: > Currently dax_mapping_entry_mkclean() fails to clean and write protect the > pmd_t of a DAX PMD entry during an *sync operation. This can result in > data loss, as detailed in patch 4. > > You can find a working tree here: > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2 > > This series applies cleanly to mmotm-2016-12-19-16-31. > > Changes since v1: > - Included Dan's patch to kill DAX support for UML. > - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in > an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case > when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. (Dan & Jan) > > Dan Williams (1): > dax: kill uml support > > Ross Zwisler (3): > dax: add stub for pmdp_huge_clear_flush() > mm: add follow_pte_pmd() > dax: wrprotect pmd_t in dax_mapping_entry_mkclean > > fs/Kconfig | 2 +- > fs/dax.c | 49 ++++++++++++++++++++++++++++++------------- > include/asm-generic/pgtable.h | 10 +++++++++ > include/linux/mm.h | 4 ++-- > mm/memory.c | 41 ++++++++++++++++++++++++++++-------- > 5 files changed, 79 insertions(+), 27 deletions(-) Well, 0-day found another architecture that doesn't define pmd_pfn() et al., so we'll need some more fixes. (Thank you, 0-day, for the coverage!) 
I have to apologize, I didn't understand that Dan intended his "dax: kill uml support" patch to land in v4.11. I thought he intended it as a cleanup to my series, which really needs to land in v4.10. That's why I folded them together into this v2, along with the wrapper suggested by Jan. Andrew, does it work for you to just keep v1 of this series, and eventually send that to Linus for v4.10? https://lkml.org/lkml/2016/12/20/649 You've already pulled that one into -mm, and it does correctly solve the data loss issue. That would let us deal with getting rid of the #ifdef, blacklisting architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on series for v4.11. _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 23 Dec 2016 14:44:57 +0100 From: Jan Kara To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Subject: Re: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Message-ID: <20161223134457.GG22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> Sender: owner-linux-mm@kvack.org List-ID: On Thu 22-12-16 14:18:54, Ross Zwisler wrote: > Add a pmdp_huge_clear_flush() stub for configs that don't define > CONFIG_TRANSPARENT_HUGEPAGE.
> > We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at > least we do want this compile successfully even for configs without > CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether we call > this code gets called, based on whether we find DAX PMD entries in our > tree. We shouldn't ever find such PMD entries for > !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be > called. > > Signed-off-by: Ross Zwisler Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > include/asm-generic/pgtable.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index 18af2bc..65e9536 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, > #endif > > #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > unsigned long address, > pmd_t *pmdp); > +#else > +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > + unsigned long address, > + pmd_t *pmdp) > +{ > + WARN_ON_ONCE(1); > + return *pmdp; > +} > +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > #endif > > #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: Ross Zwisler To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Subject: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Date: Thu, 22 Dec 2016 14:18:54 -0700 Message-Id: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: owner-linux-mm@kvack.org List-ID: Add a pmdp_huge_clear_flush() stub for configs that don't define CONFIG_TRANSPARENT_HUGEPAGE. We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at least we do want this to compile successfully even for configs without CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code gets called, based on whether we find DAX PMD entries in our tree. We shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be called.
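A rough userspace illustration of the trade-off described above, assuming hypothetical names throughout: MODEL_HAVE_HUGEPAGE stands in for CONFIG_TRANSPARENT_HUGEPAGE (left undefined here), MODEL_WARN_ON_ONCE() is a simplified analogue of WARN_ON_ONCE(), and model_pmd_t is just an integer. The caller builds in either configuration, and the stub only complains if the huge branch is actually reached at runtime.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical config switch; define it to model a hugepage-enabled build. */
/* #define MODEL_HAVE_HUGEPAGE 1 */

/* Counts warnings so that callers and tests can observe them. */
static int model_warn_count;

/* Simplified analogue of WARN_ON_ONCE(): report only the first hit per site. */
#define MODEL_WARN_ON_ONCE(cond) do {                                  \
		static int warned_;                                    \
		if ((cond) && !warned_) {                              \
			warned_ = 1;                                   \
			model_warn_count++;                            \
			fprintf(stderr, "warning at %s:%d\n",          \
				__FILE__, __LINE__);                   \
		}                                                      \
	} while (0)

typedef unsigned long model_pmd_t;

#ifdef MODEL_HAVE_HUGEPAGE
/* A real implementation would be provided elsewhere in this configuration. */
model_pmd_t model_pmdp_huge_clear_flush(model_pmd_t *pmdp);
#else
/* Stub: keeps callers compiling, warns once if ever reached. */
static inline model_pmd_t model_pmdp_huge_clear_flush(model_pmd_t *pmdp)
{
	MODEL_WARN_ON_ONCE(1);
	return *pmdp;
}
#endif

/* Compiles in both configurations; the huge branch is a runtime decision. */
static model_pmd_t model_clean_entry(model_pmd_t *pmdp, int entry_is_huge)
{
	if (entry_is_huge)
		return model_pmdp_huge_clear_flush(pmdp);
	return *pmdp;
}
```

A compile-time assertion in the stub would instead break every build that merely contains a call to it, even when that call can never execute, which is the situation the commit message wants to avoid.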
Signed-off-by: Ross Zwisler --- include/asm-generic/pgtable.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 18af2bc..65e9536 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH +#ifdef CONFIG_TRANSPARENT_HUGEPAGE extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); +#else +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, + unsigned long address, + pmd_t *pmdp) +{ + WARN_ON_ONCE(1); + return *pmdp; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT -- 2.7.4 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 23 Dec 2016 14:45:39 +0100 From: Jan Kara To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Dan Williams , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Subject: Re: [PATCH v2 1/4] dax: kill uml support Message-ID: <20161223134539.GH22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> Sender: owner-linux-mm@kvack.org List-ID: On Thu 22-12-16 14:18:53, Ross Zwisler wrote: > From: Dan Williams > > The lack of common transparent-huge-page helpers for UML is becoming > 
increasingly painful for fs/dax.c now that it is growing more pmd > functionality. Add UML to the list of unsupported architectures. > > Cc: Jan Kara > Cc: Christoph Hellwig > Cc: Dave Chinner > Cc: Dave Hansen > Cc: Matthew Wilcox > Cc: Alexander Viro > Cc: Ross Zwisler > Signed-off-by: Dan Williams > [rez: squashed #ifdef removal into another patch in the series ] > Signed-off-by: Ross Zwisler Fine by me. You can add: Acked-by: Jan Kara Honza > --- > fs/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/Kconfig b/fs/Kconfig > index c2a377c..661931f 100644 > --- a/fs/Kconfig > +++ b/fs/Kconfig > @@ -37,7 +37,7 @@ source "fs/f2fs/Kconfig" > config FS_DAX > bool "Direct Access (DAX) support" > depends on MMU > - depends on !(ARM || MIPS || SPARC) > + depends on !(ARM || MIPS || SPARC || UML) > help > Direct Access (DAX) can be used on memory-backed block devices. > If the block device supports DAX and the filesystem supports DAX, > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 6 Jan 2017 11:18:19 -0700 From: Ross Zwisler To: Andrew Morton Cc: Ross Zwisler , linux-kernel@vger.kernel.org, Alexander Viro , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path Message-ID: <20170106181819.GA3486@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <20170104001349.GA8176@linux.intel.com> <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170105172734.23a7603ff19006b49e9ba01a@linux-foundation.org> Sender: owner-linux-mm@kvack.org List-ID: On Thu, Jan 05, 2017 at 05:27:34PM -0800, Andrew Morton wrote: > On Tue, 3 Jan 2017 17:13:49 -0700 Ross Zwisler wrote: > > > On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote: > > > Currently dax_mapping_entry_mkclean() fails to clean and write protect the > > > pmd_t of a DAX PMD entry during an *sync operation. This can result in > > > data loss, as detailed in patch 4. > > > > > > You can find a working tree here: > > > > > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2 > > > > > > This series applies cleanly to mmotm-2016-12-19-16-31. > > > > > > Changes since v1: > > > - Included Dan's patch to kill DAX support for UML. > > > - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in > > > an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case > > > when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. 
(Dan & Jan) > > > > > > Dan Williams (1): > > > dax: kill uml support > > > > > > Ross Zwisler (3): > > > dax: add stub for pmdp_huge_clear_flush() > > > mm: add follow_pte_pmd() > > > dax: wrprotect pmd_t in dax_mapping_entry_mkclean > > > > > > fs/Kconfig | 2 +- > > > fs/dax.c | 49 ++++++++++++++++++++++++++++++------------- > > > include/asm-generic/pgtable.h | 10 +++++++++ > > > include/linux/mm.h | 4 ++-- > > > mm/memory.c | 41 ++++++++++++++++++++++++++++-------- > > > 5 files changed, 79 insertions(+), 27 deletions(-) > > > > Well, 0-day found another architecture that doesn't define pmd_pfn() et al., > > so we'll need some more fixes. (Thank you, 0-day, for the coverage!) > > > > I have to apologize, I didn't understand that Dan intended his "dax: kill uml > > support" patch to land in v4.11. I thought he intended it as a cleanup to my > > series, which really needs to land in v4.10. That's why I folded them > > together into this v2, along with the wrapper suggested by Jan. > > > > Andrew, does it work for you to just keep v1 of this series, and eventually > > send that to Linus for v4.10? > > > > https://lkml.org/lkml/2016/12/20/649 > > > > You've already pulled that one into -mm, and it does correctly solve the data > > loss issue. > > > > That would let us deal with getting rid of the #ifdef, blacklisting > > architectures and introducing the pmdp_huge_clear_flush() strub in a follow-on > > series for v4.11. > > I have mm-add-follow_pte_pmd.patch and > dax-wrprotect-pmd_t-in-dax_mapping_entry_mkclean.patch queued for 4.10. > Please (re)send any additional patches, indicating for each one > whether you believe it should also go into 4.10? The two patches that you already have queued are correct, and no additional patches are necessary for v4.10 for this issue. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Tue, 3 Jan 2017 17:13:49 -0700 From: Ross Zwisler To: Andrew Morton Cc: linux-kernel@vger.kernel.org, Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org Subject: Re: [PATCH v2 0/4] Write protect DAX PMDs in *sync path Message-ID: <20170104001349.GA8176@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: owner-linux-mm@kvack.org List-ID: On Thu, Dec 22, 2016 at 02:18:52PM -0700, Ross Zwisler wrote: > Currently dax_mapping_entry_mkclean() fails to clean and write protect the > pmd_t of a DAX PMD entry during an *sync operation. This can result in > data loss, as detailed in patch 4. > > You can find a working tree here: > > https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean_v2 > > This series applies cleanly to mmotm-2016-12-19-16-31. > > Changes since v1: > - Included Dan's patch to kill DAX support for UML. > - Instead of wrapping the DAX PMD code in dax_mapping_entry_mkclean() in > an #ifdef, we now create a stub for pmdp_huge_clear_flush() for the case > when CONFIG_TRANSPARENT_HUGEPAGE isn't defined. 
(Dan & Jan) > > Dan Williams (1): > dax: kill uml support > > Ross Zwisler (3): > dax: add stub for pmdp_huge_clear_flush() > mm: add follow_pte_pmd() > dax: wrprotect pmd_t in dax_mapping_entry_mkclean > > fs/Kconfig | 2 +- > fs/dax.c | 49 ++++++++++++++++++++++++++++++------------- > include/asm-generic/pgtable.h | 10 +++++++++ > include/linux/mm.h | 4 ++-- > mm/memory.c | 41 ++++++++++++++++++++++++++++-------- > 5 files changed, 79 insertions(+), 27 deletions(-) Well, 0-day found another architecture that doesn't define pmd_pfn() et al., so we'll need some more fixes. (Thank you, 0-day, for the coverage!) I have to apologize, I didn't understand that Dan intended his "dax: kill uml support" patch to land in v4.11. I thought he intended it as a cleanup to my series, which really needs to land in v4.10. That's why I folded them together into this v2, along with the wrapper suggested by Jan. Andrew, does it work for you to just keep v1 of this series, and eventually send that to Linus for v4.10? https://lkml.org/lkml/2016/12/20/649 You've already pulled that one into -mm, and it does correctly solve the data loss issue. That would let us deal with getting rid of the #ifdef, blacklisting architectures and introducing the pmdp_huge_clear_flush() stub in a follow-on series for v4.11. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966865AbcLVVTK (ORCPT ); Thu, 22 Dec 2016 16:19:10 -0500 Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966807AbcLVVTJ (ORCPT ); Thu, 22 Dec 2016 16:19:09 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,390,1477983600"; d="scan'208";a="1075369277" From: Ross Zwisler To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: [PATCH v2 3/4] mm: add follow_pte_pmd() Date: Thu, 22 Dec 2016 14:18:55 -0700 Message-Id: <1482441536-14550-4-git-send-email-ross.zwisler@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a huge page PMD leaf to be found and returned.
Signed-off-by: Ross Zwisler Suggested-by: Dave Hansen --- include/linux/mm.h | 2 ++ mm/memory.c | 37 ++++++++++++++++++++++++++++++------- 2 files changed, 32 insertions(+), 7 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 4424784..ff0e1c1 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1212,6 +1212,8 @@ void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, spinlock_t **ptlp); +int follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn); int follow_phys(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index 455c3e6..29edd91 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3779,8 +3779,8 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address) } #endif /* __PAGETABLE_PMD_FOLDED */ -static int __follow_pte(struct mm_struct *mm, unsigned long address, - pte_t **ptepp, spinlock_t **ptlp) +static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) { pgd_t *pgd; pud_t *pud; @@ -3797,11 +3797,20 @@ static int __follow_pte(struct mm_struct *mm, unsigned long address, pmd = pmd_offset(pud, address); VM_BUG_ON(pmd_trans_huge(*pmd)); - if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) - goto out; - /* We cannot handle huge page PFN maps. Luckily they don't exist. 
*/ - if (pmd_huge(*pmd)) + if (pmd_huge(*pmd)) { + if (!pmdpp) + goto out; + + *ptlp = pmd_lock(mm, pmd); + if (pmd_huge(*pmd)) { + *pmdpp = pmd; + return 0; + } + spin_unlock(*ptlp); + } + + if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) goto out; ptep = pte_offset_map_lock(mm, pmd, address, ptlp); @@ -3824,9 +3833,23 @@ int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, /* (void) is needed to make gcc happy */ (void) __cond_lock(*ptlp, - !(res = __follow_pte(mm, address, ptepp, ptlp))); + !(res = __follow_pte_pmd(mm, address, ptepp, NULL, + ptlp))); + return res; +} + +int follow_pte_pmd(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp) +{ + int res; + + /* (void) is needed to make gcc happy */ + (void) __cond_lock(*ptlp, + !(res = __follow_pte_pmd(mm, address, ptepp, pmdpp, + ptlp))); return res; } +EXPORT_SYMBOL(follow_pte_pmd); /** * follow_pfn - look up PFN at a user virtual address -- 2.7.4
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S941150AbcLVVTX (ORCPT ); Thu, 22 Dec 2016 16:19:23 -0500 Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966872AbcLVVTU (ORCPT ); Thu, 22 Dec 2016 16:19:20 -0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,390,1477983600"; d="scan'208";a="1075369282" From: Ross Zwisler To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: [PATCH v2 4/4] dax: wrprotect pmd_t in dax_mapping_entry_mkclean Date: Thu, 22 Dec 2016 14:18:56 -0700 Message-Id: <1482441536-14550-5-git-send-email-ross.zwisler@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an *sync operation. This can result in data loss in the following sequence: 1) mmap write to DAX PMD, dirtying PMD radix tree entry and making the pmd_t dirty and writeable 2) fsync, flushing out PMD data and cleaning the radix tree entry. We currently fail to mark the pmd_t as clean and write protected. 3) more mmap writes to the PMD. These don't cause any page faults since the pmd_t is dirty and writeable. The radix tree entry remains clean. 4) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean.
5) crash - dirty data that should have been fsync'd as part of 4) could still have been in the processor cache, and is lost. Fix this by marking the pmd_t clean and write protected in dax_mapping_entry_mkclean(), which is called as part of the fsync operation 2). This will cause the writes in step 3) above to generate page faults where we'll re-dirty the PMD radix tree entry, resulting in flushes in the fsync that happens in step 4). Signed-off-by: Ross Zwisler Cc: Jan Kara Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush") Reviewed-by: Jan Kara --- fs/dax.c | 49 ++++++++++++++++++++++++++++++++++--------------- include/linux/mm.h | 2 -- mm/memory.c | 4 ++-- 3 files changed, 36 insertions(+), 19 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 5c74f60..62b3ed4 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -691,8 +691,8 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, pgoff_t index, unsigned long pfn) { struct vm_area_struct *vma; - pte_t *ptep; - pte_t pte; + pte_t pte, *ptep = NULL; + pmd_t *pmdp = NULL; spinlock_t *ptl; bool changed; @@ -707,21 +707,40 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping, address = pgoff_address(index, vma); changed = false; - if (follow_pte(vma->vm_mm, address, &ptep, &ptl)) + if (follow_pte_pmd(vma->vm_mm, address, &ptep, &pmdp, &ptl)) continue; - if (pfn != pte_pfn(*ptep)) - goto unlock; - if (!pte_dirty(*ptep) && !pte_write(*ptep)) - goto unlock; - flush_cache_page(vma, address, pfn); - pte = ptep_clear_flush(vma, address, ptep); - pte = pte_wrprotect(pte); - pte = pte_mkclean(pte); - set_pte_at(vma->vm_mm, address, ptep, pte); - changed = true; -unlock: - pte_unmap_unlock(ptep, ptl); + if (pmdp) { + pmd_t pmd; + + if (pfn != pmd_pfn(*pmdp)) + goto unlock_pmd; + if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp)) + goto unlock_pmd; + + flush_cache_page(vma, address, pfn); + pmd = pmdp_huge_clear_flush(vma, address, pmdp); + pmd = pmd_wrprotect(pmd); + pmd = pmd_mkclean(pmd); + 
set_pmd_at(vma->vm_mm, address, pmdp, pmd); + changed = true; +unlock_pmd: + spin_unlock(ptl); + } else { + if (pfn != pte_pfn(*ptep)) + goto unlock_pte; + if (!pte_dirty(*ptep) && !pte_write(*ptep)) + goto unlock_pte; + + flush_cache_page(vma, address, pfn); + pte = ptep_clear_flush(vma, address, ptep); + pte = pte_wrprotect(pte); + pte = pte_mkclean(pte); + set_pte_at(vma->vm_mm, address, ptep, pte); + changed = true; +unlock_pte: + pte_unmap_unlock(ptep, ptl); + } if (changed) mmu_notifier_invalidate_page(vma->vm_mm, address); diff --git a/include/linux/mm.h b/include/linux/mm.h index ff0e1c1..f4de7fa 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1210,8 +1210,6 @@ int copy_page_range(struct mm_struct *dst, struct mm_struct *src, struct vm_area_struct *vma); void unmap_mapping_range(struct address_space *mapping, loff_t const holebegin, loff_t const holelen, int even_cows); -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp); int follow_pte_pmd(struct mm_struct *mm, unsigned long address, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, diff --git a/mm/memory.c b/mm/memory.c index 29edd91..ddcf979 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3826,8 +3826,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address, return -EINVAL; } -int follow_pte(struct mm_struct *mm, unsigned long address, pte_t **ptepp, - spinlock_t **ptlp) +static inline int follow_pte(struct mm_struct *mm, unsigned long address, + pte_t **ptepp, spinlock_t **ptlp) { int res; -- 2.7.4 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966992AbcLVVUF (ORCPT ); Thu, 22 Dec 2016 16:20:05 -0500 Received: from mga03.intel.com ([134.134.136.65]:2589 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966782AbcLVVTI (ORCPT ); Thu, 22 Dec 2016 16:19:08 
-0500 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.33,390,1477983600"; d="scan'208";a="1075369269" From: Ross Zwisler To: linux-kernel@vger.kernel.org Cc: Ross Zwisler , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Date: Thu, 22 Dec 2016 14:18:54 -0700 Message-Id: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Add a pmdp_huge_clear_flush() stub for configs that don't define CONFIG_TRANSPARENT_HUGEPAGE. We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at least we do want this to compile successfully even for configs without CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether this code gets called, based on whether we find DAX PMD entries in our tree. We shouldn't ever find such PMD entries for !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be called.
Signed-off-by: Ross Zwisler --- include/asm-generic/pgtable.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 18af2bc..65e9536 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH +#ifdef CONFIG_TRANSPARENT_HUGEPAGE extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); +#else +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, + unsigned long address, + pmd_t *pmdp) +{ + WARN_ON_ONCE(1); + return *pmdp; +} +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #endif #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT -- 2.7.4 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758635AbcLWNpE (ORCPT ); Fri, 23 Dec 2016 08:45:04 -0500 Received: from mx2.suse.de ([195.135.220.15]:42484 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751500AbcLWNpC (ORCPT ); Fri, 23 Dec 2016 08:45:02 -0500 Date: Fri, 23 Dec 2016 14:44:57 +0100 From: Jan Kara To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dan Williams , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: Re: [PATCH v2 2/4] dax: add stub for pmdp_huge_clear_flush() Message-ID: <20161223134457.GG22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-3-git-send-email-ross.zwisler@linux.intel.com> User-Agent: Mutt/1.5.24 (2015-08-30) 
Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu 22-12-16 14:18:54, Ross Zwisler wrote: > Add a pmdp_huge_clear_flush() stub for configs that don't define > CONFIG_TRANSPARENT_HUGEPAGE. > > We use a WARN_ON_ONCE() instead of a BUILD_BUG() because in the DAX code at > least we do want this compile successfully even for configs without > CONFIG_TRANSPARENT_HUGEPAGE. It'll be a runtime decision whether we call > this code gets called, based on whether we find DAX PMD entries in our > tree. We shouldn't ever find such PMD entries for > !CONFIG_TRANSPARENT_HUGEPAGE configs, so this function should never be > called. > > Signed-off-by: Ross Zwisler Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > include/asm-generic/pgtable.h | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index 18af2bc..65e9536 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -178,9 +178,19 @@ extern pte_t ptep_clear_flush(struct vm_area_struct *vma, > #endif > > #ifndef __HAVE_ARCH_PMDP_HUGE_CLEAR_FLUSH > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > extern pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > unsigned long address, > pmd_t *pmdp); > +#else > +static inline pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, > + unsigned long address, > + pmd_t *pmdp) > +{ > + WARN_ON_ONCE(1); > + return *pmdp; > +} > +#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > #endif > > #ifndef __HAVE_ARCH_PTEP_SET_WRPROTECT > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965201AbcLWNpn (ORCPT ); Fri, 23 Dec 2016 08:45:43 -0500 Received: from mx2.suse.de ([195.135.220.15]:42512 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1759834AbcLWNpl (ORCPT ); Fri, 23 Dec 2016 08:45:41 -0500 
Date: Fri, 23 Dec 2016 14:45:39 +0100 From: Jan Kara To: Ross Zwisler Cc: linux-kernel@vger.kernel.org, Dan Williams , Alexander Viro , Andrew Morton , Arnd Bergmann , Christoph Hellwig , Dave Chinner , Dave Hansen , Jan Kara , Matthew Wilcox , linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@ml01.01.org Subject: Re: [PATCH v2 1/4] dax: kill uml support Message-ID: <20161223134539.GH22679@quack2.suse.cz> References: <1482441536-14550-1-git-send-email-ross.zwisler@linux.intel.com> <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1482441536-14550-2-git-send-email-ross.zwisler@linux.intel.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu 22-12-16 14:18:53, Ross Zwisler wrote: > From: Dan Williams > > The lack of common transparent-huge-page helpers for UML is becoming > increasingly painful for fs/dax.c now that it is growing more pmd > functionality. Add UML to the list of unsupported architectures. > > Cc: Jan Kara > Cc: Christoph Hellwig > Cc: Dave Chinner > Cc: Dave Hansen > Cc: Matthew Wilcox > Cc: Alexander Viro > Cc: Ross Zwisler > Signed-off-by: Dan Williams > [rez: squashed #ifdef removal into another patch in the series ] > Signed-off-by: Ross Zwisler Fine by me. You can add: Acked-by: Jan Kara Honza > --- > fs/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/Kconfig b/fs/Kconfig > index c2a377c..661931f 100644 > --- a/fs/Kconfig > +++ b/fs/Kconfig > @@ -37,7 +37,7 @@ source "fs/f2fs/Kconfig" > config FS_DAX > bool "Direct Access (DAX) support" > depends on MMU > - depends on !(ARM || MIPS || SPARC) > + depends on !(ARM || MIPS || SPARC || UML) > help > Direct Access (DAX) can be used on memory-backed block devices. 
> If the block device supports DAX and the filesystem supports DAX, > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR
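For reference, the five-step data-loss sequence in the changelog of patch 4/4 (mmap write, fsync, more mmap writes, fsync, crash) can be modelled in userspace. This is only a minimal sketch under stated assumptions, not kernel code: struct dax_model, mmap_write(), model_fsync() and run_sequence() are hypothetical names that stand in for the radix tree dirty tag, the pmd_t dirty/write bits, and the flush performed by fsync.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the state involved; not kernel code. */
struct dax_model {
	bool tree_dirty;   /* stands in for the dirty tag on the radix tree entry */
	bool pmd_dirty;    /* stands in for the dirty bit of the pmd_t */
	bool pmd_write;    /* stands in for the write bit of the pmd_t */
	int cache_data;    /* latest data, possibly still in the CPU cache */
	int durable_data;  /* data that has been flushed to media */
};

/*
 * A store through the mapping. If the pmd is write protected, the store
 * faults first and the (modelled) fault handler re-dirties the tree entry.
 */
static void mmap_write(struct dax_model *m, int value)
{
	if (!m->pmd_write) {
		m->tree_dirty = true;
		m->pmd_write = true;
	}
	m->pmd_dirty = true;
	m->cache_data = value;
}

/*
 * fsync only flushes when the tree entry is tagged dirty. With
 * wrprotect_pmd set, it also cleans and write protects the pmd, which is
 * the behaviour the fix adds in this model.
 */
static void model_fsync(struct dax_model *m, bool wrprotect_pmd)
{
	if (!m->tree_dirty)
		return;
	m->durable_data = m->cache_data;
	m->tree_dirty = false;
	if (wrprotect_pmd) {
		m->pmd_dirty = false;
		m->pmd_write = false;
	}
}

/* Runs steps 1) to 5) and returns what survives the crash. */
int run_sequence(bool wrprotect_pmd)
{
	struct dax_model m = {0};

	mmap_write(&m, 1);              /* step 1 */
	model_fsync(&m, wrprotect_pmd); /* step 2 */
	mmap_write(&m, 2);              /* step 3 */
	model_fsync(&m, wrprotect_pmd); /* step 4 */
	return m.durable_data;          /* step 5: only flushed data survives */
}
```

Under this model, run_sequence(false) returns 1: the second write is lost because the tree entry stayed clean and the second fsync flushed nothing. run_sequence(true) returns 2, since write protecting the pmd forces a fault on the next store, which re-dirties the entry.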