* [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()"
2015-10-02 21:02 [PATCH 0/3] Revert locking changes in DAX for v4.3 Ross Zwisler
@ 2015-10-02 21:02 ` Ross Zwisler
2015-10-02 21:11 ` Dan Williams
2015-10-02 21:02 ` [PATCH 2/3] Revert "mm: take i_mmap_lock in unmap_mapping_range() for DAX" Ross Zwisler
2015-10-02 21:02 ` [PATCH 3/3] Revert "dax: fix race between simultaneous faults" Ross Zwisler
2 siblings, 1 reply; 7+ messages in thread
From: Ross Zwisler @ 2015-10-02 21:02 UTC (permalink / raw)
To: linux-kernel
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox, linux-fsdevel,
Andrew Morton, Dan Williams, Dave Chinner, Jan Kara,
Kirill A. Shutemov, linux-nvdimm
This reverts commit 8346c416d17bf5b4ea1508662959bb62e73fd6a5.
This commit did fix the issue it intended to fix, but it turns out that
the locking changes introduced by these two commits:
commit 843172978bb9 ("dax: fix race between simultaneous faults")
commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
had other issues as well, so they need to just be reverted.
The list of issues in DAX after these commits (some newly introduced by
the commits, some preexisting) can be found here:
https://lkml.org/lkml/2015/9/25/602
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
fs/dax.c | 13 +------------
1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index bcfb14b..7ae6df7 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -569,20 +569,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
goto fallback;
- sector = bh.b_blocknr << (blkbits - 9);
-
if (buffer_unwritten(&bh) || buffer_new(&bh)) {
int i;
-
- length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
- bh.b_size);
- if (length < 0) {
- result = VM_FAULT_SIGBUS;
- goto out;
- }
- if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
- goto fallback;
-
for (i = 0; i < PTRS_PER_PMD; i++)
clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
wmb_pmem();
@@ -635,6 +623,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
result = VM_FAULT_NOPAGE;
spin_unlock(ptl);
} else {
+ sector = bh.b_blocknr << (blkbits - 9);
length = bdev_direct_access(bh.b_bdev, sector, &kaddr, &pfn,
bh.b_size);
if (length < 0) {
--
2.1.0
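
The line this patch moves, `sector = bh.b_blocknr << (blkbits - 9)`, converts a filesystem block number into a 512-byte sector number. A minimal userspace sketch of that arithmetic follows; the `sector_t` typedef and the helper name `block_to_sector` are stand-ins invented for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's sector_t. */
typedef uint64_t sector_t;

/*
 * Convert a filesystem block number to a 512-byte sector number.
 * blkbits is log2 of the filesystem block size; 9 is log2(512).
 */
static sector_t block_to_sector(sector_t blocknr, unsigned int blkbits)
{
	assert(blkbits >= 9);	/* blocks smaller than a sector are not modelled */
	return blocknr << (blkbits - 9);
}
```

With 4 KiB blocks (blkbits = 12) each block spans eight sectors, so block 1 starts at sector 8.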
* Re: [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()"
2015-10-02 21:02 ` [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()" Ross Zwisler
@ 2015-10-02 21:11 ` Dan Williams
2015-10-02 23:28 ` Ross Zwisler
0 siblings, 1 reply; 7+ messages in thread
From: Dan Williams @ 2015-10-02 21:11 UTC (permalink / raw)
To: Ross Zwisler
Cc: linux-kernel@vger.kernel.org, Alexander Viro, Matthew Wilcox,
linux-fsdevel, Andrew Morton, Dave Chinner, Jan Kara,
Kirill A. Shutemov, linux-nvdimm@lists.01.org
On Fri, Oct 2, 2015 at 2:02 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> This reverts commit 8346c416d17bf5b4ea1508662959bb62e73fd6a5.
>
> This commit did fix the issue it intended to fix, but it turns out that
> the locking changes introduced by these two commits:
>
> commit 843172978bb9 ("dax: fix race between simultaneous faults")
> commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
>
> had other issues as well, so they need to just be reverted.
Wait, why introduce two points in the kernel history where we have a
known uninitialized variable? I'd say fix up the revert of "mm: take
i_mmap_lock in unmap_mapping_range() for DAX" to address the conflict
with the fix, one less patch and keeps the stability rolling forward.
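
To make the objection concrete: as the diff in patch 1 reads, once the early bdev_direct_access() call is removed, the zeroing loop runs before anything in __dax_pmd_fault() has assigned kaddr. The sketch below is a userspace model of the ordering the original fix established (look up the address first, zero afterwards). Every `toy_*` name and the tiny page size are assumptions made up for the example, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16	/* deliberately tiny, for illustration only */
#define TOY_PAGES 4

static unsigned char toy_media[TOY_PAGE_SIZE * TOY_PAGES];

/* Hypothetical stand-in for bdev_direct_access(): reports the mapping address. */
static long toy_direct_access(void **kaddr)
{
	*kaddr = toy_media;
	return (long)sizeof(toy_media);
}

/* Returns 0 on success, -1 if the lookup did not cover the whole range. */
static int toy_zero_new_extent(void)
{
	void *kaddr = NULL;
	long length;
	int i;

	/* The address must be obtained before the zeroing loop uses it. */
	length = toy_direct_access(&kaddr);
	if (length < (long)sizeof(toy_media))
		return -1;

	for (i = 0; i < TOY_PAGES; i++)
		memset((unsigned char *)kaddr + i * TOY_PAGE_SIZE, 0, TOY_PAGE_SIZE);
	return 0;
}
```

Swapping the lookup and the loop in this model would zero through a NULL pointer, which is the kind of intermediate breakage a bisect could land on.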
* Re: [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()"
2015-10-02 21:11 ` Dan Williams
@ 2015-10-02 23:28 ` Ross Zwisler
2015-10-05 8:49 ` Jan Kara
0 siblings, 1 reply; 7+ messages in thread
From: Ross Zwisler @ 2015-10-02 23:28 UTC (permalink / raw)
To: Dan Williams
Cc: Ross Zwisler, linux-kernel@vger.kernel.org, Alexander Viro,
Matthew Wilcox, linux-fsdevel, Andrew Morton, Dave Chinner,
Jan Kara, Kirill A. Shutemov, linux-nvdimm@lists.01.org
On Fri, Oct 02, 2015 at 02:11:03PM -0700, Dan Williams wrote:
> On Fri, Oct 2, 2015 at 2:02 PM, Ross Zwisler
> <ross.zwisler@linux.intel.com> wrote:
> > This reverts commit 8346c416d17bf5b4ea1508662959bb62e73fd6a5.
> >
> > This commit did fix the issue it intended to fix, but it turns out that
> > the locking changes introduced by these two commits:
> >
> > commit 843172978bb9 ("dax: fix race between simultaneous faults")
> > commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
> >
> > had other issues as well, so they need to just be reverted.
>
> Wait, why introduce two points in the kernel history where we have a
> known uninitialized variable? I'd say fix up the revert of "mm: take
> i_mmap_lock in unmap_mapping_range() for DAX" to address the conflict
> with the fix, one less patch and keeps the stability rolling forward.
Essentially because I wasn't sure about the rules regarding reverts, if there
are any. I assumed (perhaps incorrectly) that you'd want a 1:1 relationship
between original commits and reverts. If it's better to not have intermediate
breakage, sure, let's squash them.
* Re: [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()"
2015-10-02 23:28 ` Ross Zwisler
@ 2015-10-05 8:49 ` Jan Kara
0 siblings, 0 replies; 7+ messages in thread
From: Jan Kara @ 2015-10-05 8:49 UTC (permalink / raw)
To: Ross Zwisler
Cc: Dan Williams, linux-kernel@vger.kernel.org, Alexander Viro,
Matthew Wilcox, linux-fsdevel, Andrew Morton, Dave Chinner,
Jan Kara, Kirill A. Shutemov, linux-nvdimm@lists.01.org
On Fri 02-10-15 17:28:42, Ross Zwisler wrote:
> On Fri, Oct 02, 2015 at 02:11:03PM -0700, Dan Williams wrote:
> > On Fri, Oct 2, 2015 at 2:02 PM, Ross Zwisler
> > <ross.zwisler@linux.intel.com> wrote:
> > > This reverts commit 8346c416d17bf5b4ea1508662959bb62e73fd6a5.
> > >
> > > This commit did fix the issue it intended to fix, but it turns out that
> > > the locking changes introduced by these two commits:
> > >
> > > commit 843172978bb9 ("dax: fix race between simultaneous faults")
> > > commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
> > >
> > > had other issues as well, so they need to just be reverted.
> >
> > Wait, why introduce two points in the kernel history where we have a
> > known uninitialized variable? I'd say fix up the revert of "mm: take
> > i_mmap_lock in unmap_mapping_range() for DAX" to address the conflict
> > with the fix, one less patch and keeps the stability rolling forward.
>
> Essentially because I wasn't sure about the rules regarding reverts, if there
> are any. I assumed (perhaps incorrectly) that you'd want a 1:1 relationship
> between original commits and reverts. If it's better to not have intermediate
> breakage, sure, let's squash them.
Well, reverts aren't any special commits after all. So if it is simple
enough to just revert part of the patch that is broken, then just reverting
that part is fine.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* [PATCH 2/3] Revert "mm: take i_mmap_lock in unmap_mapping_range() for DAX"
2015-10-02 21:02 [PATCH 0/3] Revert locking changes in DAX for v4.3 Ross Zwisler
2015-10-02 21:02 ` [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()" Ross Zwisler
@ 2015-10-02 21:02 ` Ross Zwisler
2015-10-02 21:02 ` [PATCH 3/3] Revert "dax: fix race between simultaneous faults" Ross Zwisler
2 siblings, 0 replies; 7+ messages in thread
From: Ross Zwisler @ 2015-10-02 21:02 UTC (permalink / raw)
To: linux-kernel
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox, linux-fsdevel,
linux-mm, Andrew Morton, Dan Williams, Dave Chinner, Jan Kara,
Kirill A. Shutemov, linux-nvdimm
This reverts commit 46c043ede4711e8d598b9d63c5616c1fedb0605e.
The following two locking commits in the DAX code:
commit 843172978bb9 ("dax: fix race between simultaneous faults")
commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
introduced a number of deadlocks and other issues, and need to be
reverted for the v4.3 kernel. The list of issues in DAX after these
commits (some newly introduced by the commits, some preexisting) can be
found here:
https://lkml.org/lkml/2015/9/25/602
This revert keeps the PMEM API changes to the zeroing code in
__dax_pmd_fault(), which were added by this commit:
commit d77e92e270ed ("dax: update PMD fault handler with PMEM API")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
fs/dax.c | 37 +++++++++++++++++--------------------
mm/memory.c | 11 +++++++++--
2 files changed, 26 insertions(+), 22 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 7ae6df7..de3f53e 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -569,26 +569,6 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
if (!buffer_size_valid(&bh) || bh.b_size < PMD_SIZE)
goto fallback;
- if (buffer_unwritten(&bh) || buffer_new(&bh)) {
- int i;
- for (i = 0; i < PTRS_PER_PMD; i++)
- clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
- wmb_pmem();
- count_vm_event(PGMAJFAULT);
- mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
- result |= VM_FAULT_MAJOR;
- }
-
- /*
- * If we allocated new storage, make sure no process has any
- * zero pages covering this hole
- */
- if (buffer_new(&bh)) {
- i_mmap_unlock_write(mapping);
- unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, PMD_SIZE, 0);
- i_mmap_lock_write(mapping);
- }
-
/*
* If a truncate happened while we were allocating blocks, we may
* leave blocks allocated to the file that are beyond EOF. We can't
@@ -603,6 +583,13 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
if ((pgoff | PG_PMD_COLOUR) >= size)
goto fallback;
+ /*
+ * If we allocated new storage, make sure no process has any
+ * zero pages covering this hole
+ */
+ if (buffer_new(&bh))
+ unmap_mapping_range(mapping, pgoff << PAGE_SHIFT, PMD_SIZE, 0);
+
if (!write && !buffer_mapped(&bh) && buffer_uptodate(&bh)) {
spinlock_t *ptl;
pmd_t entry;
@@ -633,6 +620,16 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
if ((length < PMD_SIZE) || (pfn & PG_PMD_COLOUR))
goto fallback;
+ if (buffer_unwritten(&bh) || buffer_new(&bh)) {
+ int i;
+ for (i = 0; i < PTRS_PER_PMD; i++)
+ clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
+ wmb_pmem();
+ count_vm_event(PGMAJFAULT);
+ mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
+ result |= VM_FAULT_MAJOR;
+ }
+
result |= vmf_insert_pfn_pmd(vma, address, pmd, pfn, write);
}
diff --git a/mm/memory.c b/mm/memory.c
index 9cb2747..5ec066f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2426,10 +2426,17 @@ void unmap_mapping_range(struct address_space *mapping,
if (details.last_index < details.first_index)
details.last_index = ULONG_MAX;
- i_mmap_lock_write(mapping);
+
+ /*
+ * DAX already holds i_mmap_lock to serialise file truncate vs
+ * page fault and page fault vs page fault.
+ */
+ if (!IS_DAX(mapping->host))
+ i_mmap_lock_write(mapping);
if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap)))
unmap_mapping_range_tree(&mapping->i_mmap, &details);
- i_mmap_unlock_write(mapping);
+ if (!IS_DAX(mapping->host))
+ i_mmap_unlock_write(mapping);
}
EXPORT_SYMBOL(unmap_mapping_range);
--
2.1.0
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
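
The mm/memory.c hunk in this patch restores a conditional lock in unmap_mapping_range(). Per the comment it adds, the DAX fault paths already hold i_mmap_lock when they call in, so taking it again unconditionally would block on a lock the caller owns. The following toy model illustrates that hazard with a non-recursive lock; `toy_lock`, `toy_trylock`, `toy_unlock` and `toy_unmap` are hypothetical names, and the model uses a try-lock so the failure is observable instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a hypothetical model, not the kernel's i_mmap_rwsem. */
struct toy_lock {
	bool held;
};

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;	/* a blocking lock would wait here forever */
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Models unmap_mapping_range(): takes the lock itself unless told
 * that the caller already holds it (the IS_DAX() case in the patch).
 */
static bool toy_unmap(struct toy_lock *l, bool caller_holds_lock)
{
	if (!caller_holds_lock && !toy_trylock(l))
		return false;	/* would self-deadlock with a blocking lock */
	/* ... the unmap work would go here ... */
	if (!caller_holds_lock)
		toy_unlock(l);
	return true;
}
```

In this model a caller that holds the lock must either pass `caller_holds_lock = true` or drop the lock around the call, which corresponds to the two approaches visible in the diffs of this series.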
* [PATCH 3/3] Revert "dax: fix race between simultaneous faults"
2015-10-02 21:02 [PATCH 0/3] Revert locking changes in DAX for v4.3 Ross Zwisler
2015-10-02 21:02 ` [PATCH 1/3] Revert "dax: fix NULL pointer in __dax_pmd_fault()" Ross Zwisler
2015-10-02 21:02 ` [PATCH 2/3] Revert "mm: take i_mmap_lock in unmap_mapping_range() for DAX" Ross Zwisler
@ 2015-10-02 21:02 ` Ross Zwisler
2 siblings, 0 replies; 7+ messages in thread
From: Ross Zwisler @ 2015-10-02 21:02 UTC (permalink / raw)
To: linux-kernel
Cc: Ross Zwisler, Alexander Viro, Matthew Wilcox, linux-fsdevel,
linux-mm, Andrew Morton, Dan Williams, Dave Chinner, Jan Kara,
Kirill A. Shutemov, linux-nvdimm
This reverts commit 843172978bb92997310d2f7fbc172ece423cfc02.
The following two locking commits in the DAX code:
commit 843172978bb9 ("dax: fix race between simultaneous faults")
commit 46c043ede471 ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
introduced a number of deadlocks and other issues, and need to be
reverted for the v4.3 kernel. The list of issues in DAX after these
commits (some newly introduced by the commits, some preexisting) can be
found here:
https://lkml.org/lkml/2015/9/25/602
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
fs/dax.c | 33 ++++++++++++++++-----------------
mm/memory.c | 11 +++--------
2 files changed, 19 insertions(+), 25 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index de3f53e..f364c90 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -285,6 +285,7 @@ static int copy_user_bh(struct page *to, struct buffer_head *bh,
static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
+ struct address_space *mapping = inode->i_mapping;
sector_t sector = bh->b_blocknr << (inode->i_blkbits - 9);
unsigned long vaddr = (unsigned long)vmf->virtual_address;
void __pmem *addr;
@@ -292,6 +293,8 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
pgoff_t size;
int error;
+ i_mmap_lock_read(mapping);
+
/*
* Check truncate didn't happen while we were allocating a block.
* If it did, this block may or may not be still allocated to the
@@ -321,6 +324,8 @@ static int dax_insert_mapping(struct inode *inode, struct buffer_head *bh,
error = vm_insert_mixed(vma, vaddr, pfn);
out:
+ i_mmap_unlock_read(mapping);
+
return error;
}
@@ -382,17 +387,15 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
* from a read fault and we've raced with a truncate
*/
error = -EIO;
- goto unlock;
+ goto unlock_page;
}
- } else {
- i_mmap_lock_write(mapping);
}
error = get_block(inode, block, &bh, 0);
if (!error && (bh.b_size < PAGE_SIZE))
error = -EIO; /* fs corruption? */
if (error)
- goto unlock;
+ goto unlock_page;
if (!buffer_mapped(&bh) && !buffer_unwritten(&bh) && !vmf->cow_page) {
if (vmf->flags & FAULT_FLAG_WRITE) {
@@ -403,9 +406,8 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
if (!error && (bh.b_size < PAGE_SIZE))
error = -EIO;
if (error)
- goto unlock;
+ goto unlock_page;
} else {
- i_mmap_unlock_write(mapping);
return dax_load_hole(mapping, page, vmf);
}
}
@@ -417,15 +419,17 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
else
clear_user_highpage(new_page, vaddr);
if (error)
- goto unlock;
+ goto unlock_page;
vmf->page = page;
if (!page) {
+ i_mmap_lock_read(mapping);
/* Check we didn't race with truncate */
size = (i_size_read(inode) + PAGE_SIZE - 1) >>
PAGE_SHIFT;
if (vmf->pgoff >= size) {
+ i_mmap_unlock_read(mapping);
error = -EIO;
- goto unlock;
+ goto out;
}
}
return VM_FAULT_LOCKED;
@@ -461,8 +465,6 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
WARN_ON_ONCE(!(vmf->flags & FAULT_FLAG_WRITE));
}
- if (!page)
- i_mmap_unlock_write(mapping);
out:
if (error == -ENOMEM)
return VM_FAULT_OOM | major;
@@ -471,14 +473,11 @@ int __dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
return VM_FAULT_SIGBUS | major;
return VM_FAULT_NOPAGE | major;
- unlock:
+ unlock_page:
if (page) {
unlock_page(page);
page_cache_release(page);
- } else {
- i_mmap_unlock_write(mapping);
}
-
goto out;
}
EXPORT_SYMBOL(__dax_fault);
@@ -556,10 +555,10 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
block = (sector_t)pgoff << (PAGE_SHIFT - blkbits);
bh.b_size = PMD_SIZE;
- i_mmap_lock_write(mapping);
length = get_block(inode, block, &bh, write);
if (length)
return VM_FAULT_SIGBUS;
+ i_mmap_lock_read(mapping);
/*
* If the filesystem isn't willing to tell us the length of a hole,
@@ -634,11 +633,11 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
}
out:
+ i_mmap_unlock_read(mapping);
+
if (buffer_unwritten(&bh))
complete_unwritten(&bh, !(result & VM_FAULT_ERROR));
- i_mmap_unlock_write(mapping);
-
return result;
fallback:
diff --git a/mm/memory.c b/mm/memory.c
index 5ec066f..deb679c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2427,16 +2427,11 @@ void unmap_mapping_range(struct address_space *mapping,
details.last_index = ULONG_MAX;
- /*
- * DAX already holds i_mmap_lock to serialise file truncate vs
- * page fault and page fault vs page fault.
- */
- if (!IS_DAX(mapping->host))
- i_mmap_lock_write(mapping);
+ /* DAX uses i_mmap_lock to serialise file truncate vs page fault */
+ i_mmap_lock_write(mapping);
if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap)))
unmap_mapping_range_tree(&mapping->i_mmap, &details);
- if (!IS_DAX(mapping->host))
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
}
EXPORT_SYMBOL(unmap_mapping_range);
--
2.1.0
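
The truncate check that this patch places under i_mmap_lock_read() in __dax_fault() rounds the file size up to whole pages and rejects any fault whose page offset lies at or beyond that count. A small userspace sketch of the arithmetic, assuming 4 KiB pages; the `TOY_*` constants and both helper names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE (1ULL << TOY_PAGE_SHIFT)

/* Number of pages needed to cover i_size bytes, rounding up. */
static uint64_t size_in_pages(uint64_t i_size)
{
	return (i_size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;
}

/* True when a fault at page offset pgoff falls beyond end of file. */
static bool fault_beyond_eof(uint64_t pgoff, uint64_t i_size)
{
	return pgoff >= size_in_pages(i_size);
}
```

For an 8192-byte file the valid page offsets are 0 and 1, so a fault at offset 2 after a racing truncate would take the -EIO path shown in the diff.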