From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Larry Chen <lchen@suse.com>, Mark Fasheh <mark@fasheh.com>,
	Joel Becker <jlbec@evilplan.org>,
	Junxiao Bi <junxiao.bi@oracle.com>,
	Joseph Qi <jiangqi903@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.4 29/33] ocfs2: fix deadlock caused by ocfs2_defrag_extent()
Date: Wed,  5 Dec 2018 04:51:27 -0500
Message-ID: <20181205095131.7685-29-sashal@kernel.org>
In-Reply-To: <20181205095131.7685-1-sashal@kernel.org>

From: Larry Chen <lchen@suse.com>

[ Upstream commit e21e57445a64598b29a6f629688f9b9a39e7242a ]

ocfs2_defrag_extent() may deadlock.

ocfs2_ioctl_move_extents
  ocfs2_move_extents
    ocfs2_defrag_extent
      ocfs2_lock_allocators_move_extents
        ocfs2_reserve_clusters
          inode_lock GLOBAL_BITMAP_SYSTEM_INODE

      __ocfs2_flush_truncate_log
        inode_lock GLOBAL_BITMAP_SYSTEM_INODE

As the backtrace above shows, ocfs2_reserve_clusters() calls inode_lock
on the global bitmap if the local allocator does not have sufficient
clusters. If the global bitmap can meet the demand,
ocfs2_reserve_clusters() returns success with the global bitmap locked.

After ocfs2_reserve_clusters() returns, if the truncate log is full,
__ocfs2_flush_truncate_log() will deadlock because it needs to
inode_lock the global bitmap, which is already locked.

To fix this bug, remove the code that locks the global allocator from
ocfs2_lock_allocators_move_extents() and place it after
__ocfs2_flush_truncate_log().

ocfs2_lock_allocators_move_extents() has two callers: one is here, and
the other does not need the data allocator context, so this patch does
not affect that caller so far.
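
The ordering problem described above can be reproduced in miniature in
userspace. The following is only a sketch, not ocfs2 code: it assumes a
POSIX error-checking mutex as a stand-in for the global bitmap lock, so
that a second acquisition by the same thread returns EDEADLK instead of
hanging. All names (global_bitmap_lock, reserve_clusters,
flush_truncate_log, defrag_old_order, defrag_new_order) are hypothetical
and only mirror the roles of the kernel functions.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the global bitmap inode lock. */
static pthread_mutex_t global_bitmap_lock;

/* Create the lock as an error-checking mutex so relocking is reported. */
static int init_lock(void)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&global_bitmap_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Plays the role of ocfs2_reserve_clusters(): returns with the lock held. */
static int reserve_clusters(void)
{
	return pthread_mutex_lock(&global_bitmap_lock);
}

/* Drops the lock taken by reserve_clusters(). */
static int free_reservation(void)
{
	return pthread_mutex_unlock(&global_bitmap_lock);
}

/* Plays the role of __ocfs2_flush_truncate_log(): takes and drops the lock. */
static int flush_truncate_log(void)
{
	int ret = pthread_mutex_lock(&global_bitmap_lock);

	if (ret)
		return ret;	/* EDEADLK if this thread already holds it */
	return pthread_mutex_unlock(&global_bitmap_lock);
}

/* Ordering before the patch: reserve first, then flush. */
static int defrag_old_order(void)
{
	int ret = reserve_clusters();

	if (ret)
		return ret;
	ret = flush_truncate_log();	/* lock is already held here */
	free_reservation();
	return ret;
}

/* Ordering after the patch: flush first, then reserve. */
static int defrag_new_order(void)
{
	int ret = flush_truncate_log();

	if (ret)
		return ret;
	ret = reserve_clusters();
	if (ret)
		return ret;
	return free_reservation();
}
```

After a successful init_lock(), defrag_old_order() reports EDEADLK on
the second acquisition, while defrag_new_order() returns 0. In the
kernel the lock is not error-checking, so the old ordering blocks
instead of returning an error.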

Link: http://lkml.kernel.org/r/20181101071422.14470-1-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ocfs2/move_extents.c | 47 +++++++++++++++++++++++------------------
 1 file changed, 26 insertions(+), 21 deletions(-)

diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c
index 124471d26a73..c1a83c58456e 100644
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -156,18 +156,14 @@ static int __ocfs2_move_extent(handle_t *handle,
 }
 
 /*
- * lock allocators, and reserving appropriate number of bits for
- * meta blocks and data clusters.
- *
- * in some cases, we don't need to reserve clusters, just let data_ac
- * be NULL.
+ * lock allocator, and reserve appropriate number of bits for
+ * meta blocks.
  */
-static int ocfs2_lock_allocators_move_extents(struct inode *inode,
+static int ocfs2_lock_meta_allocator_move_extents(struct inode *inode,
 					struct ocfs2_extent_tree *et,
 					u32 clusters_to_move,
 					u32 extents_to_split,
 					struct ocfs2_alloc_context **meta_ac,
-					struct ocfs2_alloc_context **data_ac,
 					int extra_blocks,
 					int *credits)
 {
@@ -192,13 +188,6 @@ static int ocfs2_lock_allocators_move_extents(struct inode *inode,
 		goto out;
 	}
 
-	if (data_ac) {
-		ret = ocfs2_reserve_clusters(osb, clusters_to_move, data_ac);
-		if (ret) {
-			mlog_errno(ret);
-			goto out;
-		}
-	}
 
 	*credits += ocfs2_calc_extend_credits(osb->sb, et->et_root_el);
 
@@ -260,10 +249,10 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
-	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, *len, 1,
-						 &context->meta_ac,
-						 &context->data_ac,
-						 extra_blocks, &credits);
+	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+						*len, 1,
+						&context->meta_ac,
+						extra_blocks, &credits);
 	if (ret) {
 		mlog_errno(ret);
 		goto out;
@@ -286,6 +275,21 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
+	/*
+	 * Make sure ocfs2_reserve_cluster is called after
+	 * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
+	 *
+	 * If ocfs2_reserve_cluster is called
+	 * before __ocfs2_flush_truncate_log, dead lock on global bitmap
+	 * may happen.
+	 *
+	 */
+	ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
+	if (ret) {
+		mlog_errno(ret);
+		goto out_unlock_mutex;
+	}
+
 	handle = ocfs2_start_trans(osb, credits);
 	if (IS_ERR(handle)) {
 		ret = PTR_ERR(handle);
@@ -606,9 +610,10 @@ static int ocfs2_move_extent(struct ocfs2_move_extents_context *context,
 		}
 	}
 
-	ret = ocfs2_lock_allocators_move_extents(inode, &context->et, len, 1,
-						 &context->meta_ac,
-						 NULL, extra_blocks, &credits);
+	ret = ocfs2_lock_meta_allocator_move_extents(inode, &context->et,
+						len, 1,
+						&context->meta_ac,
+						extra_blocks, &credits);
 	if (ret) {
 		mlog_errno(ret);
 		goto out;
-- 
2.17.1
