From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Ross Zwisler <ross.zwisler@linux.intel.com>,
Jan Kara <jack@suse.cz>, Dan Williams <dan.j.williams@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.10 76/93] dax: prevent invalidation of mapped DAX entries
Date: Thu, 18 May 2017 12:47:45 +0200
Message-ID: <20170518104746.155382907@linuxfoundation.org>
In-Reply-To: <20170518104743.163522815@linuxfoundation.org>
4.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ross Zwisler <ross.zwisler@linux.intel.com>
commit 4636e70bb0a8b871998b6841a2e4b205cf2bc863 upstream.
Patch series "mm,dax: Fix data corruption due to mmap inconsistency",
v4.
This series fixes data corruption that can happen on DAX mounts when
page faults race with write(2). As a result of the race, page tables get
out of sync with block mappings in the filesystem, so data seen through
mmap differs from data seen through read(2).
The series passes testing with the t_mmap_stale test program from Ross
and with other mmap-related tests on a DAX filesystem.
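
As a rough illustration of the symptom described above (and not the
t_mmap_stale program itself), the following user-space sketch compares the
view of a file through mmap with the view through pread(2). The helper name
mmap_matches_read() and its 4096-byte limit are assumptions made for this
example only. It does not reproduce the race; it only shows the kind of
check that would reveal stale mmap data.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): returns 1 if the first
 * `len` bytes of `path` are identical when seen through mmap and through
 * pread(2), 0 if they differ, and -1 on error.
 */
int mmap_matches_read(const char *path, size_t len)
{
	char buf[4096];
	int ret = -1;
	char *map;
	int fd;

	if (len == 0 || len > sizeof(buf))
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Only touch the mapping if the file really holds `len` bytes. */
	if (pread(fd, buf, len, 0) == (ssize_t)len)
		ret = memcmp(map, buf, len) == 0;

	munmap(map, len);
out_close:
	close(fd);
	return ret;
}
```

On a consistent filesystem this is expected to return 1; a return value of
0 would mean the two views disagree.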
This patch (of 4):
dax_invalidate_mapping_entry() currently removes DAX exceptional entries
only if they are clean and unlocked. This is done via:
  invalidate_mapping_pages()
    invalidate_exceptional_entry()
      dax_invalidate_mapping_entry()
However, for page cache pages removed in invalidate_mapping_pages()
there is an additional criterion: the page must not be mapped. This is
noted in the comments above invalidate_mapping_pages() and is checked in
invalidate_inode_page().
For DAX entries this means that we can end up in a situation where a
DAX exceptional entry, either a huge zero page or a regular DAX entry,
is mapped but has no associated radix tree entry. This is inconsistent
with the rest of the DAX code and with what happens in the page cache
case.
We aren't able to unmap the DAX exceptional entry because according to
its comments invalidate_mapping_pages() isn't allowed to block, and
unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem.
Since we essentially never have unmapped DAX entries to evict from the
radix tree, just remove dax_invalidate_mapping_entry().
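
The change in behaviour can be sketched with a small user-space model. This
is only an illustration under simplifying assumptions: the enum, the struct
and all function names below are hypothetical, and the return value models
"was the radix tree entry removed", not the return value of the real
invalidate_exceptional_entry().

```c
#include <stdbool.h>

/* Toy stand-ins for kernel state; every name here is hypothetical. */
enum mapping_kind { MAPPING_PAGECACHE, MAPPING_SHMEM, MAPPING_DAX };

struct toy_entry {
	bool dirty;	/* tagged DIRTY or TOWRITE */
	bool locked;	/* entry slot is locked */
	bool mapped;	/* still mapped into some page table */
};

/* Before the patch: clean, unlocked DAX entries were evicted, mapped or not. */
bool old_evicts(enum mapping_kind kind, const struct toy_entry *e)
{
	if (kind == MAPPING_SHMEM)
		return false;		/* handled by shmem itself */
	if (kind == MAPPING_DAX)
		return !e->dirty && !e->locked;
	return true;			/* shadow entry is cleared */
}

/* After the patch: DAX entries are left alone on this path. */
bool new_evicts(enum mapping_kind kind, const struct toy_entry *e)
{
	(void)e;
	if (kind == MAPPING_SHMEM || kind == MAPPING_DAX)
		return false;
	return true;
}

/* The inconsistent state: mapped, but the radix tree entry is gone. */
bool mapped_without_entry(bool evicted, const struct toy_entry *e)
{
	return evicted && e->mapped;
}
```

In this model, a clean, unlocked, mapped DAX entry reaches the inconsistent
state with old_evicts() but not with new_evicts().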
Fixes: c6dcf52c23d2 ("mm: Invalidate DAX radix tree entries only if appropriate")
Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/dax.c            |   29 -----------------------------
 include/linux/dax.h |    1 -
 mm/truncate.c       |    9 +++------
 3 files changed, 3 insertions(+), 36 deletions(-)

--- a/fs/dax.c
+++ b/fs/dax.c
@@ -503,35 +503,6 @@ int dax_delete_mapping_entry(struct addr
 }
 
 /*
- * Invalidate exceptional DAX entry if easily possible. This handles DAX
- * entries for invalidate_inode_pages() so we evict the entry only if we can
- * do so without blocking.
- */
-int dax_invalidate_mapping_entry(struct address_space *mapping, pgoff_t index)
-{
-	int ret = 0;
-	void *entry, **slot;
-	struct radix_tree_root *page_tree = &mapping->page_tree;
-
-	spin_lock_irq(&mapping->tree_lock);
-	entry = __radix_tree_lookup(page_tree, index, NULL, &slot);
-	if (!entry || !radix_tree_exceptional_entry(entry) ||
-	    slot_locked(mapping, slot))
-		goto out;
-	if (radix_tree_tag_get(page_tree, index, PAGECACHE_TAG_DIRTY) ||
-	    radix_tree_tag_get(page_tree, index, PAGECACHE_TAG_TOWRITE))
-		goto out;
-	radix_tree_delete(page_tree, index);
-	mapping->nrexceptional--;
-	ret = 1;
-out:
-	spin_unlock_irq(&mapping->tree_lock);
-	if (ret)
-		dax_wake_mapping_entry_waiter(mapping, index, entry, true);
-	return ret;
-}
-
-/*
  * Invalidate exceptional DAX entry if it is clean.
  */
 int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -41,7 +41,6 @@ ssize_t dax_iomap_rw(struct kiocb *iocb,
 int dax_iomap_fault(struct vm_area_struct *vma, struct vm_fault *vmf,
 			struct iomap_ops *ops);
 int dax_delete_mapping_entry(struct address_space *mapping, pgoff_t index);
-int dax_invalidate_mapping_entry(struct address_space *mapping, pgoff_t index);
 int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
 				      pgoff_t index);
 void dax_wake_mapping_entry_waiter(struct address_space *mapping,
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -66,17 +66,14 @@ static void truncate_exceptional_entry(s
 
 /*
  * Invalidate exceptional entry if easily possible. This handles exceptional
- * entries for invalidate_inode_pages() so for DAX it evicts only unlocked and
- * clean entries.
+ * entries for invalidate_inode_pages().
  */
 static int invalidate_exceptional_entry(struct address_space *mapping,
 					pgoff_t index, void *entry)
 {
-	/* Handled by shmem itself */
-	if (shmem_mapping(mapping))
+	/* Handled by shmem itself, or for DAX we do nothing. */
+	if (shmem_mapping(mapping) || dax_mapping(mapping))
 		return 1;
-	if (dax_mapping(mapping))
-		return dax_invalidate_mapping_entry(mapping, index);
 	clear_shadow_entry(mapping, index, entry);
 	return 1;
 }