* FAILED: patch "[PATCH] dax: fix data corruption when fault races with write" failed to apply to 4.11-stable tree
@ 2017-05-18 7:26 gregkh
2017-05-18 23:14 ` Ross Zwisler
0 siblings, 1 reply; 3+ messages in thread
From: gregkh @ 2017-05-18 7:26 UTC (permalink / raw)
To: jack, akpm, dan.j.williams, ross.zwisler, stable, torvalds; +Cc: stable
The patch below does not apply to the 4.11-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 13e451fdc1af05568ea379d71c02a126295d2244 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 12 May 2017 15:46:57 -0700
Subject: [PATCH] dax: fix data corruption when fault races with write
Currently DAX read fault can race with write(2) in the following way:
CPU1 - write(2) CPU2 - read fault
dax_iomap_pte_fault()
->iomap_begin() - sees hole
dax_iomap_rw()
iomap_apply()
->iomap_begin - allocates blocks
dax_iomap_actor()
invalidate_inode_pages2_range()
- there's nothing to invalidate
grab_mapping_entry()
- we add zero page in the radix tree
and map it to page tables
The result is that hole page is mapped into page tables (and thus zeros
are seen in mmap) while file has data written in that place.
Fix the problem by locking exception entry before mapping blocks for the
fault. That way we are sure invalidate_inode_pages2_range() call for
racing write will either block on entry lock waiting for the fault to
finish (and unmap stale page tables after that) or read fault will see
already allocated blocks by write(2).
Fixes: 9f141d6ef6258a3a37a045842d9ba7e68f368956
Link: http://lkml.kernel.org/r/20170510085419.27601-5-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/fs/dax.c b/fs/dax.c
index 123d9903c77d..32f020c9cedf 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1148,6 +1148,12 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
if ((vmf->flags & FAULT_FLAG_WRITE) && !vmf->cow_page)
flags |= IOMAP_WRITE;
+ entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
+ if (IS_ERR(entry)) {
+ vmf_ret = dax_fault_return(PTR_ERR(entry));
+ goto out;
+ }
+
/*
* Note that we don't bother to use iomap_apply here: DAX required
* the file system block size to be equal the page size, which means
@@ -1156,17 +1162,11 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap);
if (error) {
vmf_ret = dax_fault_return(error);
- goto out;
+ goto unlock_entry;
}
if (WARN_ON_ONCE(iomap.offset + iomap.length < pos + PAGE_SIZE)) {
- vmf_ret = dax_fault_return(-EIO); /* fs corruption? */
- goto finish_iomap;
- }
-
- entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
- if (IS_ERR(entry)) {
- vmf_ret = dax_fault_return(PTR_ERR(entry));
- goto finish_iomap;
+ error = -EIO; /* fs corruption? */
+ goto error_finish_iomap;
}
sector = dax_iomap_sector(&iomap, pos);
@@ -1188,13 +1188,13 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
}
if (error)
- goto error_unlock_entry;
+ goto error_finish_iomap;
__SetPageUptodate(vmf->cow_page);
vmf_ret = finish_fault(vmf);
if (!vmf_ret)
vmf_ret = VM_FAULT_DONE_COW;
- goto unlock_entry;
+ goto finish_iomap;
}
switch (iomap.type) {
@@ -1214,7 +1214,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
case IOMAP_HOLE:
if (!(vmf->flags & FAULT_FLAG_WRITE)) {
vmf_ret = dax_load_hole(mapping, &entry, vmf);
- goto unlock_entry;
+ goto finish_iomap;
}
/*FALLTHRU*/
default:
@@ -1223,10 +1223,8 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
break;
}
- error_unlock_entry:
+ error_finish_iomap:
vmf_ret = dax_fault_return(error) | major;
- unlock_entry:
- put_locked_mapping_entry(mapping, vmf->pgoff, entry);
finish_iomap:
if (ops->iomap_end) {
int copied = PAGE_SIZE;
@@ -1241,7 +1239,9 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
*/
ops->iomap_end(inode, pos, PAGE_SIZE, copied, flags, &iomap);
}
-out:
+ unlock_entry:
+ put_locked_mapping_entry(mapping, vmf->pgoff, entry);
+ out:
trace_dax_pte_fault_done(inode, vmf, vmf_ret);
return vmf_ret;
}
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: FAILED: patch "[PATCH] dax: fix data corruption when fault races with write" failed to apply to 4.11-stable tree
2017-05-18 7:26 FAILED: patch "[PATCH] dax: fix data corruption when fault races with write" failed to apply to 4.11-stable tree gregkh
@ 2017-05-18 23:14 ` Ross Zwisler
2017-05-23 10:25 ` Greg KH
0 siblings, 1 reply; 3+ messages in thread
From: Ross Zwisler @ 2017-05-18 23:14 UTC (permalink / raw)
To: gregkh; +Cc: jack, akpm, dan.j.williams, ross.zwisler, stable, torvalds
On Thu, May 18, 2017 at 09:26:04AM +0200, gregkh@linuxfoundation.org wrote:
>
> The patch below does not apply to the 4.11-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
>
> thanks,
>
> greg k-h
The following patch from Jan was sent in v3 of his series before he had merged
with my tracepoint changes, and applies cleanly to v4.11.1. This is for the
following upstream commit:
commit 13e451fdc1af ("dax: fix data corruption when fault races with write")
--- >8 ---
>From d23bb209e613befbf568c54298bb3add847653a0 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 9 May 2017 14:18:37 +0200
Subject: [PATCH] dax: Fix data corruption when fault races with write
Currently DAX read fault can race with write(2) in the following way:
CPU1 - write(2) CPU2 - read fault
dax_iomap_pte_fault()
->iomap_begin() - sees hole
dax_iomap_rw()
iomap_apply()
->iomap_begin - allocates blocks
dax_iomap_actor()
invalidate_inode_pages2_range()
- there's nothing to invalidate
grab_mapping_entry()
- we add zero page in the radix tree
and map it to page tables
The result is that hole page is mapped into page tables (and thus zeros
are seen in mmap) while file has data written in that place.
Fix the problem by locking exception entry before mapping blocks for the
fault. That way we are sure invalidate_inode_pages2_range() call for
racing write will either block on entry lock waiting for the fault to
finish (and unmap stale page tables after that) or read fault will see
already allocated blocks by write(2).
Fixes: 9f141d6ef6258a3a37a045842d9ba7e68f368956
CC: stable@vger.kernel.org
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
---
fs/dax.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index 85abd74..d1d2cd9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1153,23 +1153,23 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
if ((vmf->flags & FAULT_FLAG_WRITE) && !vmf->cow_page)
flags |= IOMAP_WRITE;
+ entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
+ if (IS_ERR(entry))
+ return dax_fault_return(PTR_ERR(entry));
+
/*
* Note that we don't bother to use iomap_apply here: DAX required
* the file system block size to be equal the page size, which means
* that we never have to deal with more than a single extent here.
*/
error = ops->iomap_begin(inode, pos, PAGE_SIZE, flags, &iomap);
- if (error)
- return dax_fault_return(error);
- if (WARN_ON_ONCE(iomap.offset + iomap.length < pos + PAGE_SIZE)) {
- vmf_ret = dax_fault_return(-EIO); /* fs corruption? */
- goto finish_iomap;
+ if (error) {
+ vmf_ret = dax_fault_return(error);
+ goto unlock_entry;
}
-
- entry = grab_mapping_entry(mapping, vmf->pgoff, 0);
- if (IS_ERR(entry)) {
- vmf_ret = dax_fault_return(PTR_ERR(entry));
- goto finish_iomap;
+ if (WARN_ON_ONCE(iomap.offset + iomap.length < pos + PAGE_SIZE)) {
+ error = -EIO; /* fs corruption? */
+ goto error_finish_iomap;
}
sector = dax_iomap_sector(&iomap, pos);
@@ -1191,13 +1191,13 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
}
if (error)
- goto error_unlock_entry;
+ goto error_finish_iomap;
__SetPageUptodate(vmf->cow_page);
vmf_ret = finish_fault(vmf);
if (!vmf_ret)
vmf_ret = VM_FAULT_DONE_COW;
- goto unlock_entry;
+ goto finish_iomap;
}
switch (iomap.type) {
@@ -1217,7 +1217,7 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
case IOMAP_HOLE:
if (!(vmf->flags & FAULT_FLAG_WRITE)) {
vmf_ret = dax_load_hole(mapping, &entry, vmf);
- goto unlock_entry;
+ goto finish_iomap;
}
/*FALLTHRU*/
default:
@@ -1226,10 +1226,8 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
break;
}
- error_unlock_entry:
+ error_finish_iomap:
vmf_ret = dax_fault_return(error) | major;
- unlock_entry:
- put_locked_mapping_entry(mapping, vmf->pgoff, entry);
finish_iomap:
if (ops->iomap_end) {
int copied = PAGE_SIZE;
@@ -1244,6 +1242,8 @@ static int dax_iomap_pte_fault(struct vm_fault *vmf,
*/
ops->iomap_end(inode, pos, PAGE_SIZE, copied, flags, &iomap);
}
+ unlock_entry:
+ put_locked_mapping_entry(mapping, vmf->pgoff, entry);
return vmf_ret;
}
--
2.9.4
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: FAILED: patch "[PATCH] dax: fix data corruption when fault races with write" failed to apply to 4.11-stable tree
2017-05-18 23:14 ` Ross Zwisler
@ 2017-05-23 10:25 ` Greg KH
0 siblings, 0 replies; 3+ messages in thread
From: Greg KH @ 2017-05-23 10:25 UTC (permalink / raw)
To: Ross Zwisler; +Cc: jack, akpm, dan.j.williams, stable, torvalds
On Thu, May 18, 2017 at 05:14:40PM -0600, Ross Zwisler wrote:
> On Thu, May 18, 2017 at 09:26:04AM +0200, gregkh@linuxfoundation.org wrote:
> >
> > The patch below does not apply to the 4.11-stable tree.
> > If someone wants it applied there, or to any other stable or longterm
> > tree, then please email the backport, including the original git commit
> > id to <stable@vger.kernel.org>.
> >
> > thanks,
> >
> > greg k-h
>
> The following patch from Jan was sent in v3 of his series before he had merged
> with my tracepoint changes, and applies cleanly to v4.11.1. This is for the
> following upstream commit:
>
> commit 13e451fdc1af ("dax: fix data corruption when fault races with write")
Thanks, now applied.
greg k-h
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-05-23 10:25 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-18 7:26 FAILED: patch "[PATCH] dax: fix data corruption when fault races with write" failed to apply to 4.11-stable tree gregkh
2017-05-18 23:14 ` Ross Zwisler
2017-05-23 10:25 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).