From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>,
	Bill O'Donnell <billodo@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH AUTOSEL 4.19 22/45] vfs: fix page locking deadlocks when deduping files
Date: Thu, 29 Aug 2019 14:15:22 -0400
Message-ID: <20190829181547.8280-22-sashal@kernel.org>
In-Reply-To: <20190829181547.8280-1-sashal@kernel.org>

From: "Darrick J. Wong" <darrick.wong@oracle.com>

[ Upstream commit edc58dd0123b552453a74369bd0c8d890b497b4b ]

When dedupe uses the page cache to compare parts of two files, we must
be very careful to handle locking correctly. The current code doesn't
do this. We must lock and unlock the page only once if the two pages are
the same, since the overlapping range check doesn't catch this when
blocksize < pagesize. If the pages are distinct but from the same file,
we must observe page locking order and lock them in order of increasing
offset to avoid clashing with writeback locking.

Fixes: 876bec6f9bbfcb3 ("vfs: refactor clone/dedupe_file_range common functions")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/read_write.c | 49 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 85fd7a8ee29eb..5fb5ee5b8cd70 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1888,10 +1888,7 @@ int vfs_clone_file_range(struct file *file_in, loff_t pos_in,
 }
 EXPORT_SYMBOL(vfs_clone_file_range);
 
-/*
- * Read a page's worth of file data into the page cache. Return the page
- * locked.
- */
+/* Read a page's worth of file data into the page cache. */
 static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
 {
 	struct address_space *mapping;
@@ -1907,10 +1904,32 @@ static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
 		put_page(page);
 		return ERR_PTR(-EIO);
 	}
-	lock_page(page);
 	return page;
 }
 
+/*
+ * Lock two pages, ensuring that we lock in offset order if the pages are from
+ * the same file.
+ */
+static void vfs_lock_two_pages(struct page *page1, struct page *page2)
+{
+	/* Always lock in order of increasing index. */
+	if (page1->index > page2->index)
+		swap(page1, page2);
+
+	lock_page(page1);
+	if (page1 != page2)
+		lock_page(page2);
+}
+
+/* Unlock two pages, being careful not to unlock the same page twice. */
+static void vfs_unlock_two_pages(struct page *page1, struct page *page2)
+{
+	unlock_page(page1);
+	if (page1 != page2)
+		unlock_page(page2);
+}
+
 /*
  * Compare extents of two files to see if they are the same.
  * Caller must have locked both inodes to prevent write races.
@@ -1948,10 +1967,24 @@ int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
 		dest_page = vfs_dedupe_get_page(dest, destoff);
 		if (IS_ERR(dest_page)) {
 			error = PTR_ERR(dest_page);
-			unlock_page(src_page);
 			put_page(src_page);
 			goto out_error;
 		}
+
+		vfs_lock_two_pages(src_page, dest_page);
+
+		/*
+		 * Now that we've locked both pages, make sure they're still
+		 * mapped to the file data we're interested in. If not,
+		 * someone is invalidating pages on us and we lose.
+		 */
+		if (!PageUptodate(src_page) || !PageUptodate(dest_page) ||
+		    src_page->mapping != src->i_mapping ||
+		    dest_page->mapping != dest->i_mapping) {
+			same = false;
+			goto unlock;
+		}
+
 		src_addr = kmap_atomic(src_page);
 		dest_addr = kmap_atomic(dest_page);
 
@@ -1963,8 +1996,8 @@ int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
 
 		kunmap_atomic(dest_addr);
 		kunmap_atomic(src_addr);
-		unlock_page(dest_page);
-		unlock_page(src_page);
+unlock:
+		vfs_unlock_two_pages(src_page, dest_page);
 		put_page(dest_page);
 		put_page(src_page);
 
--
2.20.1
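
The two cases named in the commit message can be illustrated with a minimal userspace sketch. This is not kernel code: `struct demo_page` and the `demo_*` helpers are hypothetical names, and pthread mutexes merely stand in for page locks. The sketch assumes a 4096-byte page and 1024-byte blocks to show how two non-overlapping byte ranges can still land on the same page, and then mirrors the ordering and same-page checks that the patch adds in `vfs_lock_two_pages()` and `vfs_unlock_two_pages()`.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page: just an index and a lock. */
struct demo_page {
	uint64_t index;
	pthread_mutex_t lock;
};

/* Index of the page that covers a byte offset, for a given page size. */
static uint64_t demo_page_index(uint64_t offset, uint64_t page_size)
{
	return offset / page_size;
}

/* True if the byte ranges [a, a + len) and [b, b + len) intersect. */
static bool demo_ranges_overlap(uint64_t a, uint64_t b, uint64_t len)
{
	return a < b + len && b < a + len;
}

/*
 * Lock two pages in order of increasing index, taking the lock only
 * once when both pointers refer to the same page.
 */
static void demo_lock_two(struct demo_page *p1, struct demo_page *p2)
{
	if (p1->index > p2->index) {
		struct demo_page *tmp = p1;

		p1 = p2;
		p2 = tmp;
	}

	pthread_mutex_lock(&p1->lock);
	if (p1 != p2)
		pthread_mutex_lock(&p2->lock);
}

/* Unlock two pages, never unlocking the same page twice. */
static void demo_unlock_two(struct demo_page *p1, struct demo_page *p2)
{
	pthread_mutex_unlock(&p1->lock);
	if (p1 != p2)
		pthread_mutex_unlock(&p2->lock);
}
```

With these assumed sizes, offsets 0 and 1024 with a length of 1024 do not overlap as byte ranges, yet both map to page index 0, so a byte-range overlap check alone cannot rule out locking the same page twice. Calling `demo_lock_two(&a, &a)` returns normally because the second lock is skipped, whereas two unconditional lock calls on a default mutex would hang.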