public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.4 28/62] btrfs: fix potential deadlock in the search ioctl
Date: Fri, 11 Sep 2020 14:46:11 +0200	[thread overview]
Message-ID: <20200911122503.793716871@linuxfoundation.org> (raw)
In-Reply-To: <20200911122502.395450276@linuxfoundation.org>

From: Josef Bacik <josef@toxicpanda.com>

[ Upstream commit a48b73eca4ceb9b8a4b97f290a065335dbcd8a04 ]

With the conversion of the tree locks to rwsem I got the following
lockdep splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
  ------------------------------------------------------
  compsize/11122 is trying to acquire lock:
  ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90

  but task is already holding lock:
  ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-fs-00){++++}-{3:3}:
	 down_write_nested+0x3b/0x70
	 __btrfs_tree_lock+0x24/0x120
	 btrfs_search_slot+0x756/0x990
	 btrfs_lookup_inode+0x3a/0xb4
	 __btrfs_update_delayed_inode+0x93/0x270
	 btrfs_async_run_delayed_root+0x168/0x230
	 btrfs_work_helper+0xd4/0x570
	 process_one_work+0x2ad/0x5f0
	 worker_thread+0x3a/0x3d0
	 kthread+0x133/0x150
	 ret_from_fork+0x1f/0x30

  -> #1 (&delayed_node->mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_delayed_update_inode+0x50/0x440
	 btrfs_update_inode+0x8a/0xf0
	 btrfs_dirty_inode+0x5b/0xd0
	 touch_atime+0xa1/0xd0
	 btrfs_file_mmap+0x3f/0x60
	 mmap_region+0x3a4/0x640
	 do_mmap+0x376/0x580
	 vm_mmap_pgoff+0xd5/0x120
	 ksys_mmap_pgoff+0x193/0x230
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&mm->mmap_lock#2){++++}-{3:3}:
	 __lock_acquire+0x1272/0x2310
	 lock_acquire+0x9e/0x360
	 __might_fault+0x68/0x90
	 _copy_to_user+0x1e/0x80
	 copy_to_sk.isra.32+0x121/0x300
	 search_ioctl+0x106/0x200
	 btrfs_ioctl_tree_search_v2+0x7b/0xf0
	 btrfs_ioctl+0x106f/0x30a0
	 ksys_ioctl+0x83/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(btrfs-fs-00);
				 lock(&delayed_node->mutex);
				 lock(btrfs-fs-00);
    lock(&mm->mmap_lock#2);

   *** DEADLOCK ***

  1 lock held by compsize/11122:
   #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  stack backtrace:
  CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
  Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
  Call Trace:
   dump_stack+0x78/0xa0
   check_noncircular+0x165/0x180
   __lock_acquire+0x1272/0x2310
   lock_acquire+0x9e/0x360
   ? __might_fault+0x3e/0x90
   ? find_held_lock+0x72/0x90
   __might_fault+0x68/0x90
   ? __might_fault+0x3e/0x90
   _copy_to_user+0x1e/0x80
   copy_to_sk.isra.32+0x121/0x300
   ? btrfs_search_forward+0x2a6/0x360
   search_ioctl+0x106/0x200
   btrfs_ioctl_tree_search_v2+0x7b/0xf0
   btrfs_ioctl+0x106f/0x30a0
   ? __do_sys_newfstat+0x5a/0x70
   ? ksys_ioctl+0x83/0xc0
   ksys_ioctl+0x83/0xc0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x50/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem is we're doing a copy_to_user() while holding tree locks,
which can deadlock if we have to do a page fault for the copy_to_user().
This exists even without my locking changes, so it needs to be fixed.
Rework the search ioctl to do the pre-fault and then
copy_to_user_nofault for the copying.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/extent_io.c |  8 ++++----
 fs/btrfs/extent_io.h |  6 +++---
 fs/btrfs/ioctl.c     | 27 ++++++++++++++++++++-------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 2f9f738ecf84a..97a80238fdee3 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5431,9 +5431,9 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 	}
 }
 
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dstv,
-			       unsigned long start, unsigned long len)
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dstv,
+				       unsigned long start, unsigned long len)
 {
 	size_t cur;
 	size_t offset;
@@ -5454,7 +5454,7 @@ int read_extent_buffer_to_user(const struct extent_buffer *eb,
 
 		cur = min(len, (PAGE_CACHE_SIZE - offset));
 		kaddr = page_address(page);
-		if (copy_to_user(dst, kaddr + offset, cur)) {
+		if (probe_user_write(dst, kaddr + offset, cur)) {
 			ret = -EFAULT;
 			break;
 		}
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 751435967724e..9631be7fc9e24 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -313,9 +313,9 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 void read_extent_buffer(const struct extent_buffer *eb, void *dst,
 			unsigned long start,
 			unsigned long len);
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dst, unsigned long start,
-			       unsigned long len);
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dst, unsigned long start,
+				       unsigned long len);
 void write_extent_buffer(struct extent_buffer *eb, const void *src,
 			 unsigned long start, unsigned long len);
 void copy_extent_buffer(struct extent_buffer *dst, struct extent_buffer *src,
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 245a50f490f63..91a45ef69152d 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2017,9 +2017,14 @@ static noinline int copy_to_sk(struct btrfs_root *root,
 		sh.len = item_len;
 		sh.transid = found_transid;
 
-		/* copy search result header */
-		if (copy_to_user(ubuf + *sk_offset, &sh, sizeof(sh))) {
-			ret = -EFAULT;
+		/*
+		 * Copy search result header. If we fault then loop again so we
+		 * can fault in the pages and -EFAULT there if there's a
+		 * problem. Otherwise we'll fault and then copy the buffer in
+		 * properly this next time through
+		 */
+		if (probe_user_write(ubuf + *sk_offset, &sh, sizeof(sh))) {
+			ret = 0;
 			goto out;
 		}
 
@@ -2027,10 +2032,14 @@ static noinline int copy_to_sk(struct btrfs_root *root,
 
 		if (item_len) {
 			char __user *up = ubuf + *sk_offset;
-			/* copy the item */
-			if (read_extent_buffer_to_user(leaf, up,
-						       item_off, item_len)) {
-				ret = -EFAULT;
+			/*
+			 * Copy the item, same behavior as above, but reset the
+			 * * sk_offset so we copy the full thing again.
+			 */
+			if (read_extent_buffer_to_user_nofault(leaf, up,
+						item_off, item_len)) {
+				ret = 0;
+				*sk_offset -= sizeof(sh);
 				goto out;
 			}
 
@@ -2120,6 +2129,10 @@ static noinline int search_ioctl(struct inode *inode,
 	key.offset = sk->min_offset;
 
 	while (1) {
+		ret = fault_in_pages_writeable(ubuf, *buf_size - sk_offset);
+		if (ret)
+			break;
+
 		ret = btrfs_search_forward(root, &key, path, sk->min_transid);
 		if (ret != 0) {
 			if (ret > 0)
-- 
2.25.1




  parent reply	other threads:[~2020-09-11 17:30 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-11 12:45 [PATCH 4.4 00/62] 4.4.236-rc1 review Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 01/62] HID: core: Correctly handle ReportSize being zero Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 02/62] HID: core: Sanitize event code and type when mapping input Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 03/62] perf record/stat: Explicitly call out event modifiers in the documentation Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 04/62] mm, page_alloc: remove unnecessary variable from free_pcppages_bulk Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 05/62] hwmon: (applesmc) check status earlier Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 06/62] ceph: dont allow setlease on cephfs Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 07/62] s390: dont trace preemption in percpu macros Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 08/62] xen/xenbus: Fix granting of vmallocd memory Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 09/62] dmaengine: of-dma: Fix of_dma_router_xlates of_dma_xlate handling Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 10/62] batman-adv: Avoid uninitialized chaddr when handling DHCP Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 11/62] batman-adv: bla: use netif_rx_ni when not in interrupt context Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 12/62] dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate() Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 13/62] netfilter: nf_tables: incorrect enum nft_list_attributes definition Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 14/62] netfilter: nf_tables: fix destination register zeroing Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 15/62] dmaengine: pl330: Fix burst length if burst size is smaller than bus width Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.4 16/62] bnxt_en: Check for zero dir entries in NVRAM Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 17/62] fix regression in "epoll: Keep a reference on files added to the check list" Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 18/62] tg3: Fix soft lockup when tg3_reset_task() fails Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 19/62] iommu/vt-d: Serialize IOMMU GCMD register modifications Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 20/62] thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 21/62] include/linux/log2.h: add missing () around n in roundup_pow_of_two() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 22/62] btrfs: drop path before adding new uuid tree entry Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 23/62] btrfs: Remove redundant extent_buffer_get in get_old_root Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 24/62] btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 25/62] btrfs: set the lockdep class for log tree extent buffers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 26/62] uaccess: Add non-pagefault user-space read functions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 27/62] uaccess: Add non-pagefault user-space write function Greg Kroah-Hartman
2020-09-11 12:46 ` Greg Kroah-Hartman [this message]
2020-09-11 12:46 ` [PATCH 4.4 29/62] net: qmi_wwan: MDM9x30 specific power management Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 30/62] net: qmi_wwan: support "raw IP" mode Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 31/62] net: qmi_wwan: should hold RTNL while changing netdev type Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 32/62] net: qmi_wwan: ignore bogus CDC Union descriptors Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 33/62] Add Dell Wireless 5809e Gobi 4G HSPA+ Mobile Broadband Card (rev3) to qmi_wwan Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 34/62] qmi_wwan: Added support for Gemaltos Cinterion PHxx WWAN interface Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 35/62] qmi_wwan: add support for Quectel EC21 and EC25 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 36/62] NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 37/62] drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 38/62] usb: qmi_wwan: add D-Link DWM-222 A2 device ID Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 39/62] net: usb: qmi_wwan: add Telit ME910 support Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 40/62] net: usb: qmi_wwan: add Telit 0x1050 composition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 41/62] ALSA: ca0106: fix error code handling Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 42/62] ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 43/62] dm cache metadata: Avoid returning cmd->bm wild pointer on error Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 44/62] dm thin " Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 45/62] net: refactor bind_bucket fastreuse into helper Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 46/62] net: initialize fastreuse on inet_inherit_port Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 47/62] checkpatch: fix the usage of capture group ( ... ) Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 48/62] mm/hugetlb: fix a race between hugetlb sysctl handlers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 49/62] cfg80211: regulatory: reject invalid hints Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 50/62] net: usb: Fix uninit-was-stored issue in asix_read_phy_addr() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 51/62] ALSA: firewire-digi00x: add support for console models of Digi00x series Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 52/62] ALSA: firewire-digi00x: exclude Avid Adrenaline from detection Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 53/62] ALSA; firewire-tascam: exclude Tascam FE-8 " Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 54/62] fs/affs: use octal for permissions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 55/62] affs: fix basic permission bits to actually work Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 56/62] ravb: Fixed to be able to unload modules Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 57/62] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 58/62] bnxt_en: Failure to update PHY is not fatal condition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 59/62] bnxt: dont enable NAPI until rings are ready Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 60/62] net: usb: dm9601: Add USB ID of Keenetic Plus DSL Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 61/62] sctp: not disable bh in the whole sctp_get_port_local() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.4 62/62] net: disable netpoll on fresh napis Greg Kroah-Hartman
2020-09-11 22:36 ` [PATCH 4.4 00/62] 4.4.236-rc1 review Shuah Khan
2020-09-12  2:15 ` Guenter Roeck
2020-09-12  8:08 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200911122503.793716871@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox