From mboxrd@z Thu Jan 1 00:00:00 1970 From: Wengang Date: Fri, 25 Jul 2014 13:57:05 +0800 Subject: [Ocfs2-devel] [PATCH] ocfs2: reflink: fix slow unlink for refcounted file In-Reply-To: <53D1EFF3.8090606@oracle.com> References: <1406266367-12188-1-git-send-email-junxiao.bi@oracle.com> <53D1EFF3.8090606@oracle.com> Message-ID: <53D1F1B1.5040801@oracle.com> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: ocfs2-devel@oss.oracle.com You may want to add one more parameter to ocfs2_remove_btree_range indicating if the lock is already taken outside or not. If not taken, do the lock and unlock inside; otherwise, skip lock/unlock Will that work? thanks, wengang ? 2014?07?25? 13:49, Wengang ??: > Will it be safe to do the moving? > > There are three calls of ocfs2_remove_btree_range > > 1 7124 fs/ocfs2/alloc.c <> > status = ocfs2_remove_btree_range(inode, &et, trunc_cpos, > 2 4479 fs/ocfs2/dir.c <> > ret = ocfs2_remove_btree_range(dir, &et, cpos, p_cpos, clen, 0, > 3 1805 fs/ocfs2/file.c <> > ret = ocfs2_remove_btree_range(inode, &et, trunc_cpos, > > Obviously, the 2nd and the 3rd are not sub-routines of > ocfs2_commit_truncate(). > > I think this moving is not safe. > > thanks, > wengang > > ? 2014?07?25? 13:32, Junxiao Bi ??: >> When running ocfs2 test suite multiple nodes reflink stress test, for >> a 4 nodes cluster, every unlink() for refcounted file needs about 700s. >> >> The slow unlink is caused by the contention of refcount tree lock since all >> nodes are unlink files using the same refcount tree. When the unlinking file >> have many extents(over 1600 in our test), most of the extents has refcounted >> flag set. In ocfs2_commit_truncate(), it will execute the following call trace >> for every extents. This means it needs get and released refcount tree lock >> about 1600 times. And when several nodes are do this at the same time, the >> performance will be very low. >> . >> ocfs2_remove_btree_range() >> ----ocfs2_lock_refcount_tree() >> ------ocfs2_refcount_lock() >> --------__ocfs2_cluster_lock() >> >> ocfs2_refcount_lock() is costly, move it to ocfs2_commit_truncate() to do >> lock/unlock once can improve a lot performance. >> >> Signed-off-by: Junxiao Bi >> --- >> fs/ocfs2/alloc.c | 22 ++++++++++++---------- >> 1 file changed, 12 insertions(+), 10 deletions(-) >> >> diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c >> index 9d8fcf2..a764363 100644 >> --- a/fs/ocfs2/alloc.c >> +++ b/fs/ocfs2/alloc.c >> @@ -5667,13 +5667,6 @@ int ocfs2_remove_btree_range(struct inode *inode, >> BUG_ON(!(OCFS2_I(inode)->ip_dyn_features & >> OCFS2_HAS_REFCOUNT_FL)); >> >> - ret = ocfs2_lock_refcount_tree(osb, refcount_loc, 1, >> - &ref_tree, NULL); >> - if (ret) { >> - mlog_errno(ret); >> - goto bail; >> - } >> - >> ret = ocfs2_prepare_refcount_change_for_del(inode, >> refcount_loc, >> phys_blkno, >> @@ -5755,9 +5748,6 @@ bail: >> if (meta_ac) >> ocfs2_free_alloc_context(meta_ac); >> >> - if (ref_tree) >> - ocfs2_unlock_refcount_tree(osb, ref_tree, 1); >> - >> return ret; >> } >> >> @@ -7012,6 +7002,7 @@ int ocfs2_commit_truncate(struct ocfs2_super *osb, >> u64 refcount_loc = le64_to_cpu(di->i_refcount_loc); >> struct ocfs2_extent_tree et; >> struct ocfs2_cached_dealloc_ctxt dealloc; >> + struct ocfs2_refcount_tree *ref_tree = NULL; >> >> ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh); >> ocfs2_init_dealloc_ctxt(&dealloc); >> @@ -7121,6 +7112,15 @@ start: >> >> phys_cpos = ocfs2_blocks_to_clusters(inode->i_sb, blkno); >> >> + if ((flags & OCFS2_EXT_REFCOUNTED) && trunc_len && !ref_tree) { >> + status = ocfs2_lock_refcount_tree(osb, refcount_loc, 1, >> + &ref_tree, NULL); >> + if (status) { >> + mlog_errno(status); >> + goto bail; >> + } >> + } >> + >> status = ocfs2_remove_btree_range(inode, &et, trunc_cpos, >> phys_cpos, trunc_len, flags, &dealloc, >> refcount_loc); >> @@ -7138,6 +7138,8 @@ start: >> goto start; >> >> bail: >> + if (ref_tree) >> + ocfs2_unlock_refcount_tree(osb, ref_tree, 1); >> >> ocfs2_schedule_truncate_log_flush(osb, 1); >> > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel at oss.oracle.com > https://oss.oracle.com/mailman/listinfo/ocfs2-devel