linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] xfs: defer agfl block frees
@ 2018-04-10 17:46 Brian Foster
  2018-04-10 17:46 ` [PATCH 1/7] xfs: create agfl block free helper function Brian Foster
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

Hi all,

Here's a non-rfc version of the defer AGFL frees patches. As a quick
summary, the purpose of this series is to reduce per-tx log reservation
consumption of the block allocation code. Under certain conditions, the
pre-allocation AGFL fix up code can end up performing multiple block
free operations that can result in multiple btree shape changes before
the originally requested allocation even begins. This leads to log
reservation overrun problems because the number of AGFL block frees
required for a particular transaction cannot be determined in advance.
Therefore, process frees of AGFL blocks using deferred operations in the
same way that data blocks are freed.

In conjunction, clean up some of the disparate dfops callchain plumbing
by allowing a transaction to carry a pointer to an associated dop
structure. This eliminates the need to plumb xfs_defer_ops through yet
another callchain for the purpose of deferring AGFL frees. It also
facilitates cleanups of similar codepaths where dfops is already passed
around separately from the transaction that will ultimately log the
associated intents.

There aren't that many code changes since rfcv2. The series has been
rebased to for-next and the assignment of ->t_dfops occurs within
xfs_defer_init() rather than being open-coded. Also, the series has seen
more testing and drops some of the related test code from the rfc.

Thoughts, reviews, flames appreciated.

Brian

v1:
- Rebased to for-next.
- Attach dfops to tp via xfs_defer_init().
- Fix up xfs_defer_finish() to restore original pointer.
- Dropped debug/test patches.
rfcv2: https://marc.info/?l=linux-xfs&m=151681946702093&w=2
- Use transaction to carry deferred ops struct to allocation context.
- Remove dfops param from dir interfaces where possible.
- Defer AGFL block frees from more contexts.
- Define runtime stat for transaction regrant and logcount patch (to be
  potentially removed).
rfcv1: https://marc.info/?l=linux-xfs&m=151267309423883&w=2

Brian Foster (7):
  xfs: create agfl block free helper function
  xfs: allow attach of dfops to transaction
  xfs: defer agfl block frees when dfops is available
  xfs: defer agfl block frees from deferred ops processing context
  xfs: defer agfl frees from inode inactivation
  xfs: defer frees from common inode allocation paths
  xfs: defer agfl frees from directory op transactions

 fs/xfs/libxfs/xfs_alloc.c       | 82 +++++++++++++++++++++++++++++++++++------
 fs/xfs/libxfs/xfs_alloc.h       |  2 +
 fs/xfs/libxfs/xfs_attr.c        | 18 ++++-----
 fs/xfs/libxfs/xfs_attr_remote.c |  6 +--
 fs/xfs/libxfs/xfs_bmap.c        |  4 +-
 fs/xfs/libxfs/xfs_defer.c       | 17 ++++++++-
 fs/xfs/libxfs/xfs_defer.h       |  4 +-
 fs/xfs/libxfs/xfs_dir2.c        | 16 ++++----
 fs/xfs/libxfs/xfs_dir2.h        |  9 ++---
 fs/xfs/libxfs/xfs_ialloc.c      |  6 +--
 fs/xfs/libxfs/xfs_ialloc.h      |  1 -
 fs/xfs/libxfs/xfs_refcount.c    |  2 +-
 fs/xfs/xfs_bmap_util.c          | 12 +++---
 fs/xfs/xfs_dquot.c              |  2 +-
 fs/xfs/xfs_inode.c              | 76 +++++++++++++++++++-------------------
 fs/xfs/xfs_inode.h              |  3 +-
 fs/xfs/xfs_iomap.c              |  6 +--
 fs/xfs/xfs_log_recover.c        |  2 +-
 fs/xfs/xfs_reflink.c            |  8 ++--
 fs/xfs/xfs_rtalloc.c            |  2 +-
 fs/xfs/xfs_symlink.c            |  6 +--
 fs/xfs/xfs_trace.h              |  2 +
 fs/xfs/xfs_trans.c              | 11 ++++--
 fs/xfs/xfs_trans.h              |  1 +
 fs/xfs/xfs_trans_extfree.c      | 70 +++++++++++++++++++++++++++++++++++
 25 files changed, 259 insertions(+), 109 deletions(-)

-- 
2.13.6


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/7] xfs: create agfl block free helper function
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-15  7:38   ` Christoph Hellwig
  2018-04-10 17:46 ` [PATCH 2/7] xfs: allow attach of dfops to transaction Brian Foster
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

Refactor the AGFL block free code into a new helper such that it can
be invoked from deferred context. No functional changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 37 +++++++++++++++++++++++++++----------
 fs/xfs/libxfs/xfs_alloc.h |  2 ++
 2 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 4bcc095fe44a..193a5b4909c5 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2060,6 +2060,30 @@ xfs_alloc_space_available(
 	return true;
 }
 
+int
+xfs_free_agfl_block(
+	struct xfs_trans	*tp,
+	xfs_agnumber_t		agno,
+	xfs_agblock_t		agbno,
+	struct xfs_buf		*agbp,
+	struct xfs_owner_info	*oinfo)
+{
+	int			error;
+	struct xfs_buf		*bp;
+
+	error = xfs_free_ag_extent(tp, agbp, agno, agbno, 1, oinfo,
+				   XFS_AG_RESV_AGFL);
+	if (error)
+		return error;
+
+	bp = xfs_btree_get_bufs(tp->t_mountp, tp, agno, agbno, 0);
+	if (!bp)
+		return -EFSCORRUPTED;
+	xfs_trans_binval(tp, bp);
+
+	return 0;
+}
+
 /*
  * Check the agfl fields of the agf for inconsistency or corruption. The purpose
  * is to detect an agfl header padding mismatch between current and early v5
@@ -2247,21 +2271,14 @@ xfs_alloc_fix_freelist(
 	else
 		xfs_rmap_ag_owner(&targs.oinfo, XFS_RMAP_OWN_AG);
 	while (!(flags & XFS_ALLOC_FLAG_NOSHRINK) && pag->pagf_flcount > need) {
-		struct xfs_buf	*bp;
-
 		error = xfs_alloc_get_freelist(tp, agbp, &bno, 0);
 		if (error)
 			goto out_agbp_relse;
-		error = xfs_free_ag_extent(tp, agbp, args->agno, bno, 1,
-					   &targs.oinfo, XFS_AG_RESV_AGFL);
+
+		error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
+					    &targs.oinfo);
 		if (error)
 			goto out_agbp_relse;
-		bp = xfs_btree_get_bufs(mp, tp, args->agno, bno, 0);
-		if (!bp) {
-			error = -EFSCORRUPTED;
-			goto out_agbp_relse;
-		}
-		xfs_trans_binval(tp, bp);
 	}
 
 	targs.tp = tp;
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index cbf789ea5a4e..949e21326066 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -223,6 +223,8 @@ int xfs_read_agf(struct xfs_mount *mp, struct xfs_trans *tp,
 			xfs_agnumber_t agno, int flags, struct xfs_buf **bpp);
 int xfs_alloc_read_agfl(struct xfs_mount *mp, struct xfs_trans *tp,
 			xfs_agnumber_t agno, struct xfs_buf **bpp);
+int xfs_free_agfl_block(struct xfs_trans *, xfs_agnumber_t, xfs_agblock_t,
+			struct xfs_buf *, struct xfs_owner_info *);
 int xfs_alloc_fix_freelist(struct xfs_alloc_arg *args, int flags);
 int xfs_free_extent_fix_freelist(struct xfs_trans *tp, xfs_agnumber_t agno,
 		struct xfs_buf **agbp);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/7] xfs: allow attach of dfops to transaction
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
  2018-04-10 17:46 ` [PATCH 1/7] xfs: create agfl block free helper function Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-15  7:44   ` Christoph Hellwig
  2018-04-10 17:46 ` [PATCH 3/7] xfs: defer agfl block frees when dfops is available Brian Foster
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

xfs_defer_ops is allocated and managed separately from the xfs_trans
on which it depends. The former is typically allocated on the stack
and processed before the associated thread escapes the scope that
defines the structure.

While this currently works fine, there are many places where
xfs_dfops is plumbed through deep callchains and/or subsystem data
structures (e.g., struct xfs_da_args) to other contexts. Since
deferred operations require a transaction, the scope of
xfs_defer_ops is essentially a subset of that of a transaction. An
upcoming enhancement to defer AGFL block frees will introduce yet
another context that requires plumbing of xfs_defer_ops (struct
xfs_alloc_arg) where a transaction is already present.

Rather than continue to pass dfops around independently, support the
ability to optionally carry an xfs_defer_ops structure in the
transaction itself. This facilitates the addition of deferred AGFL
block free behavior from selective contexts as well as incremental
clean up of other callchains to set/use the transaction reference
rather than plumb the dfops through explicitly.

Note that this patch does not change behavior nor dictate any
changes to how dfops structs are allocated (on the stack) and so all
existing rules apply. Changes to how dfops are allocated can be
considered once all paths are converted and thus would use a
consistent allocation pattern.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 18 +++++++++---------
 fs/xfs/libxfs/xfs_attr_remote.c |  6 +++---
 fs/xfs/libxfs/xfs_bmap.c        |  4 ++--
 fs/xfs/libxfs/xfs_defer.c       |  5 ++++-
 fs/xfs/libxfs/xfs_defer.h       |  3 ++-
 fs/xfs/libxfs/xfs_refcount.c    |  2 +-
 fs/xfs/xfs_bmap_util.c          | 12 ++++++------
 fs/xfs/xfs_dquot.c              |  2 +-
 fs/xfs/xfs_inode.c              | 12 ++++++------
 fs/xfs/xfs_iomap.c              |  6 +++---
 fs/xfs/xfs_log_recover.c        |  2 +-
 fs/xfs/xfs_reflink.c            |  8 ++++----
 fs/xfs/xfs_rtalloc.c            |  2 +-
 fs/xfs/xfs_symlink.c            |  4 ++--
 fs/xfs/xfs_trans.c              | 11 ++++++++---
 fs/xfs/xfs_trans.h              |  1 +
 16 files changed, 54 insertions(+), 44 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index ce4a34a2751d..afc441b13c97 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -327,7 +327,7 @@ xfs_attr_set(
 		 * It won't fit in the shortform, transform to a leaf block.
 		 * GROT: another possible req'mt for a double-split btree op.
 		 */
-		xfs_defer_init(args.dfops, args.firstblock);
+		xfs_defer_init(NULL, args.dfops, args.firstblock);
 		error = xfs_attr_shortform_to_leaf(&args, &leaf_bp);
 		if (error)
 			goto out_defer_cancel;
@@ -603,7 +603,7 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
 		 * Commit that transaction so that the node_addname() call
 		 * can manage its own transactions.
 		 */
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		error = xfs_attr3_leaf_to_node(args);
 		if (error)
 			goto out_defer_cancel;
@@ -692,7 +692,7 @@ xfs_attr_leaf_addname(xfs_da_args_t *args)
 		 * If the result is small enough, shrink it all into the inode.
 		 */
 		if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
-			xfs_defer_init(args->dfops, args->firstblock);
+			xfs_defer_init(NULL, args->dfops, args->firstblock);
 			error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
 			/* bp is gone due to xfs_da_shrink_inode */
 			if (error)
@@ -756,7 +756,7 @@ xfs_attr_leaf_removename(xfs_da_args_t *args)
 	 * If the result is small enough, shrink it all into the inode.
 	 */
 	if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
 		/* bp is gone due to xfs_da_shrink_inode */
 		if (error)
@@ -884,7 +884,7 @@ xfs_attr_node_addname(xfs_da_args_t *args)
 			 */
 			xfs_da_state_free(state);
 			state = NULL;
-			xfs_defer_init(args->dfops, args->firstblock);
+			xfs_defer_init(NULL, args->dfops, args->firstblock);
 			error = xfs_attr3_leaf_to_node(args);
 			if (error)
 				goto out_defer_cancel;
@@ -910,7 +910,7 @@ xfs_attr_node_addname(xfs_da_args_t *args)
 		 * in the index/blkno/rmtblkno/rmtblkcnt fields and
 		 * in the index2/blkno2/rmtblkno2/rmtblkcnt2 fields.
 		 */
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		error = xfs_da3_split(state);
 		if (error)
 			goto out_defer_cancel;
@@ -1008,7 +1008,7 @@ xfs_attr_node_addname(xfs_da_args_t *args)
 		 * Check to see if the tree needs to be collapsed.
 		 */
 		if (retval && (state->path.active > 1)) {
-			xfs_defer_init(args->dfops, args->firstblock);
+			xfs_defer_init(NULL, args->dfops, args->firstblock);
 			error = xfs_da3_join(state);
 			if (error)
 				goto out_defer_cancel;
@@ -1132,7 +1132,7 @@ xfs_attr_node_removename(xfs_da_args_t *args)
 	 * Check to see if the tree needs to be collapsed.
 	 */
 	if (retval && (state->path.active > 1)) {
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		error = xfs_da3_join(state);
 		if (error)
 			goto out_defer_cancel;
@@ -1164,7 +1164,7 @@ xfs_attr_node_removename(xfs_da_args_t *args)
 			goto out;
 
 		if ((forkoff = xfs_attr_shortform_allfit(bp, dp))) {
-			xfs_defer_init(args->dfops, args->firstblock);
+			xfs_defer_init(NULL, args->dfops, args->firstblock);
 			error = xfs_attr3_leaf_to_shortform(bp, args, forkoff);
 			/* bp is gone due to xfs_da_shrink_inode */
 			if (error)
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 21be186067a2..3577585a3e11 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -492,7 +492,7 @@ xfs_attr_rmtval_set(
 		 * extent and then crash then the block may not contain the
 		 * correct metadata after log recovery occurs.
 		 */
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		nmap = 1;
 		error = xfs_bmapi_write(args->trans, dp, (xfs_fileoff_t)lblkno,
 				  blkcnt, XFS_BMAPI_ATTRFORK, args->firstblock,
@@ -534,7 +534,7 @@ xfs_attr_rmtval_set(
 
 		ASSERT(blkcnt > 0);
 
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		nmap = 1;
 		error = xfs_bmapi_read(dp, (xfs_fileoff_t)lblkno,
 				       blkcnt, &map, &nmap,
@@ -638,7 +638,7 @@ xfs_attr_rmtval_remove(
 	blkcnt = args->rmtblkcnt;
 	done = 0;
 	while (!done) {
-		xfs_defer_init(args->dfops, args->firstblock);
+		xfs_defer_init(NULL, args->dfops, args->firstblock);
 		error = xfs_bunmapi(args->trans, args->dp, lblkno, blkcnt,
 				    XFS_BMAPI_ATTRFORK, 1, args->firstblock,
 				    args->dfops, &done);
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 6a7c2f03ea11..4d93e4562e7e 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1108,7 +1108,7 @@ xfs_bmap_add_attrfork(
 	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
 	ip->i_afp->if_flags = XFS_IFEXTENTS;
 	logflags = 0;
-	xfs_defer_init(&dfops, &firstblock);
+	xfs_defer_init(NULL, &dfops, &firstblock);
 	switch (ip->i_d.di_format) {
 	case XFS_DINODE_FMT_LOCAL:
 		error = xfs_bmap_add_attrfork_local(tp, ip, &firstblock, &dfops,
@@ -6004,7 +6004,7 @@ xfs_bmap_split_extent(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
-	xfs_defer_init(&dfops, &firstfsb);
+	xfs_defer_init(NULL, &dfops, &firstfsb);
 
 	error = xfs_bmap_split_extent_at(tp, ip, split_fsb,
 			&firstfsb, &dfops);
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index 087fea02c389..f3aef54257d1 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -525,6 +525,7 @@ xfs_defer_init_op_type(
 /* Initialize a deferred operation. */
 void
 xfs_defer_init(
+	struct xfs_trans		*tp,
 	struct xfs_defer_ops		*dop,
 	xfs_fsblock_t			*fbp)
 {
@@ -532,5 +533,7 @@ xfs_defer_init(
 	*fbp = NULLFSBLOCK;
 	INIT_LIST_HEAD(&dop->dop_intake);
 	INIT_LIST_HEAD(&dop->dop_pending);
-	trace_xfs_defer_init(NULL, dop);
+	if (tp)
+		tp->t_dfops = dop;
+	trace_xfs_defer_init(tp ? tp->t_mountp : NULL, dop);
 }
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 045beacdd37d..e263b3973cb6 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -76,7 +76,8 @@ void xfs_defer_add(struct xfs_defer_ops *dop, enum xfs_defer_ops_type type,
 		struct list_head *h);
 int xfs_defer_finish(struct xfs_trans **tp, struct xfs_defer_ops *dop);
 void xfs_defer_cancel(struct xfs_defer_ops *dop);
-void xfs_defer_init(struct xfs_defer_ops *dop, xfs_fsblock_t *fbp);
+void xfs_defer_init(struct xfs_trans *tp, struct xfs_defer_ops *dop,
+		    xfs_fsblock_t *fbp);
 bool xfs_defer_has_unfinished_work(struct xfs_defer_ops *dop);
 int xfs_defer_ijoin(struct xfs_defer_ops *dop, struct xfs_inode *ip);
 int xfs_defer_bjoin(struct xfs_defer_ops *dop, struct xfs_buf *bp);
diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
index 560e28473024..561a08e44289 100644
--- a/fs/xfs/libxfs/xfs_refcount.c
+++ b/fs/xfs/libxfs/xfs_refcount.c
@@ -1648,7 +1648,7 @@ xfs_refcount_recover_cow_leftovers(
 		trace_xfs_refcount_recover_extent(mp, agno, &rr->rr_rrec);
 
 		/* Free the orphan record */
-		xfs_defer_init(&dfops, &fsb);
+		xfs_defer_init(NULL, &dfops, &fsb);
 		agbno = rr->rr_rrec.rc_startblock - XFS_REFC_COW_START;
 		fsb = XFS_AGB_TO_FSB(mp, agno, agbno);
 		error = xfs_refcount_free_cow_extent(mp, &dfops, fsb,
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 8cd8c412f52d..78a17c43aab8 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -754,7 +754,7 @@ xfs_bmap_punch_delalloc_range(
 		 * allocated or freed for a delalloc extent and hence we need
 		 * don't cancel or finish them after the xfs_bunmapi() call.
 		 */
-		xfs_defer_init(&dfops, &firstblock);
+		xfs_defer_init(NULL, &dfops, &firstblock);
 		error = xfs_bunmapi(NULL, ip, start_fsb, 1, 0, 1, &firstblock,
 					&dfops, &done);
 		if (error)
@@ -1000,7 +1000,7 @@ xfs_alloc_file_space(
 
 		xfs_trans_ijoin(tp, ip, 0);
 
-		xfs_defer_init(&dfops, &firstfsb);
+		xfs_defer_init(NULL, &dfops, &firstfsb);
 		error = xfs_bmapi_write(tp, ip, startoffset_fsb,
 					allocatesize_fsb, alloc_type, &firstfsb,
 					resblks, imapp, &nimaps, &dfops);
@@ -1070,7 +1070,7 @@ xfs_unmap_extent(
 
 	xfs_trans_ijoin(tp, ip, 0);
 
-	xfs_defer_init(&dfops, &firstfsb);
+	xfs_defer_init(NULL, &dfops, &firstfsb);
 	error = xfs_bunmapi(tp, ip, startoffset_fsb, len_fsb, 0, 2, &firstfsb,
 			&dfops, done);
 	if (error)
@@ -1358,7 +1358,7 @@ xfs_collapse_file_space(
 			goto out_trans_cancel;
 		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
-		xfs_defer_init(&dfops, &first_block);
+		xfs_defer_init(NULL, &dfops, &first_block);
 		error = xfs_bmap_collapse_extents(tp, ip, &next_fsb, shift_fsb,
 				&done, &first_block, &dfops);
 		if (error)
@@ -1433,7 +1433,7 @@ xfs_insert_file_space(
 
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
-		xfs_defer_init(&dfops, &first_block);
+		xfs_defer_init(NULL, &dfops, &first_block);
 		error = xfs_bmap_insert_extents(tp, ip, &next_fsb, shift_fsb,
 				&done, stop_fsb, &first_block, &dfops);
 		if (error)
@@ -1619,7 +1619,7 @@ xfs_swap_extent_rmap(
 
 		/* Unmap the old blocks in the source file. */
 		while (tirec.br_blockcount) {
-			xfs_defer_init(&dfops, &firstfsb);
+			xfs_defer_init(NULL, &dfops, &firstfsb);
 			trace_xfs_swap_extent_rmap_remap_piece(tip, &tirec);
 
 			/* Read extent from the source file */
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index a7daef9e16bf..6f63e179b20c 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -314,7 +314,7 @@ xfs_qm_dqalloc(
 	/*
 	 * Initialize the bmap freelist prior to calling bmapi code.
 	 */
-	xfs_defer_init(&dfops, &firstblock);
+	xfs_defer_init(NULL, &dfops, &firstblock);
 	xfs_ilock(quotip, XFS_ILOCK_EXCL);
 	/*
 	 * Return if this type of quotas is turned off while we didn't
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c09774ca53ae..0c2e53fbc0f0 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1195,7 +1195,7 @@ xfs_create(
 	xfs_ilock(dp, XFS_ILOCK_EXCL | XFS_ILOCK_PARENT);
 	unlock_dp_on_error = true;
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 
 	/*
 	 * Reserve disk quota and the inode.
@@ -1450,7 +1450,7 @@ xfs_link(
 			goto error_return;
 	}
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 
 	/*
 	 * Handle initial link state of O_TMPFILE inode
@@ -1578,7 +1578,7 @@ xfs_itruncate_extents(
 	ASSERT(first_unmap_block < last_block);
 	unmap_len = last_block - first_unmap_block + 1;
 	while (!done) {
-		xfs_defer_init(&dfops, &first_block);
+		xfs_defer_init(NULL, &dfops, &first_block);
 		error = xfs_bunmapi(tp, ip,
 				    first_unmap_block, unmap_len,
 				    xfs_bmapi_aflag(whichfork),
@@ -1808,7 +1808,7 @@ xfs_inactive_ifree(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 	error = xfs_ifree(tp, ip, &dfops);
 	if (error) {
 		/*
@@ -2644,7 +2644,7 @@ xfs_remove(
 	if (error)
 		goto out_trans_cancel;
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
 					&first_block, &dfops, resblks);
 	if (error) {
@@ -3011,7 +3011,7 @@ xfs_rename(
 		goto out_trans_cancel;
 	}
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 
 	/* RENAME_EXCHANGE is unique from here on. */
 	if (flags & RENAME_EXCHANGE)
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 046469fcc1b8..c942ac25547d 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -266,7 +266,7 @@ xfs_iomap_write_direct(
 	 * From this point onwards we overwrite the imap pointer that the
 	 * caller gave to us.
 	 */
-	xfs_defer_init(&dfops, &firstfsb);
+	xfs_defer_init(NULL, &dfops, &firstfsb);
 	nimaps = 1;
 	error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
 				bmapi_flags, &firstfsb, resblks, imap,
@@ -728,7 +728,7 @@ xfs_iomap_write_allocate(
 			xfs_ilock(ip, XFS_ILOCK_EXCL);
 			xfs_trans_ijoin(tp, ip, 0);
 
-			xfs_defer_init(&dfops, &first_block);
+			xfs_defer_init(NULL, &dfops, &first_block);
 
 			/*
 			 * it is possible that the extents have changed since
@@ -889,7 +889,7 @@ xfs_iomap_write_unwritten(
 		/*
 		 * Modify the unwritten extent state of the buffer.
 		 */
-		xfs_defer_init(&dfops, &firstfsb);
+		xfs_defer_init(NULL, &dfops, &firstfsb);
 		nimaps = 1;
 		error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb,
 					XFS_BMAPI_CONVERT, &firstfsb, resblks,
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 2b2383f1895e..4674a7cefb8b 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -4889,7 +4889,7 @@ xlog_recover_process_intents(
 #if defined(DEBUG) || defined(XFS_WARN)
 	last_lsn = xlog_assign_lsn(log->l_curr_cycle, log->l_curr_block);
 #endif
-	xfs_defer_init(&dfops, &firstfsb);
+	xfs_defer_init(NULL, &dfops, &firstfsb);
 	while (lip != NULL) {
 		/*
 		 * We're done when we see something other than an intent.
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index cdbd342a5249..467248660028 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -444,7 +444,7 @@ xfs_reflink_allocate_cow(
 
 	xfs_trans_ijoin(tp, ip, 0);
 
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 	nimaps = 1;
 
 	/* Allocate the entire reservation as unwritten blocks. */
@@ -593,7 +593,7 @@ xfs_reflink_cancel_cow_blocks(
 				break;
 		} else if (del.br_state == XFS_EXT_UNWRITTEN || cancel_real) {
 			xfs_trans_ijoin(*tpp, ip, 0);
-			xfs_defer_init(&dfops, &firstfsb);
+			xfs_defer_init(NULL, &dfops, &firstfsb);
 
 			/* Free the CoW orphan record. */
 			error = xfs_refcount_free_cow_extent(ip->i_mount,
@@ -776,7 +776,7 @@ xfs_reflink_end_cow(
 			goto prev_extent;
 
 		/* Unmap the old blocks in the data fork. */
-		xfs_defer_init(&dfops, &firstfsb);
+		xfs_defer_init(NULL, &dfops, &firstfsb);
 		rlen = del.br_blockcount;
 		error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1,
 				&firstfsb, &dfops);
@@ -1125,7 +1125,7 @@ xfs_reflink_remap_extent(
 	/* Unmap the old blocks in the data fork. */
 	rlen = unmap_len;
 	while (rlen) {
-		xfs_defer_init(&dfops, &firstfsb);
+		xfs_defer_init(NULL, &dfops, &firstfsb);
 		error = __xfs_bunmapi(tp, ip, destoff, &rlen, 0, 1,
 				&firstfsb, &dfops);
 		if (error)
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 488719d43ca8..62c40dd96648 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -795,7 +795,7 @@ xfs_growfs_rt_alloc(
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
-		xfs_defer_init(&dfops, &firstblock);
+		xfs_defer_init(NULL, &dfops, &firstblock);
 		/*
 		 * Allocate blocks to the bitmap file.
 		 */
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 5b66ac12913c..e4c4ea25af61 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -258,7 +258,7 @@ xfs_symlink(
 	 * Initialize the bmap freelist prior to calling either
 	 * bmapi or the directory create code.
 	 */
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 
 	/*
 	 * Allocate an inode for the symlink.
@@ -454,7 +454,7 @@ xfs_inactive_symlink_rmt(
 	 * Find the block(s) so we can inval and unmap them.
 	 */
 	done = 0;
-	xfs_defer_init(&dfops, &first_block);
+	xfs_defer_init(NULL, &dfops, &first_block);
 	nmaps = ARRAY_SIZE(mval);
 	error = xfs_bmapi_read(ip, 0, xfs_symlink_blocks(mp, size),
 				mval, &nmaps, 0);
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index d6d8f9d129a7..c7e82dbd760e 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -31,6 +31,7 @@
 #include "xfs_log.h"
 #include "xfs_trace.h"
 #include "xfs_error.h"
+#include "xfs_defer.h"
 
 kmem_zone_t	*xfs_trans_zone;
 kmem_zone_t	*xfs_log_item_desc_zone;
@@ -94,11 +95,11 @@ xfs_trans_free(
  * blocks.  Locks and log items, however, are no inherited.  They must
  * be added to the new transaction explicitly.
  */
-STATIC xfs_trans_t *
+STATIC struct xfs_trans *
 xfs_trans_dup(
-	xfs_trans_t	*tp)
+	struct xfs_trans	*tp)
 {
-	xfs_trans_t	*ntp;
+	struct xfs_trans	*ntp;
 
 	ntp = kmem_zone_zalloc(xfs_trans_zone, KM_SLEEP);
 
@@ -127,6 +128,7 @@ xfs_trans_dup(
 	ntp->t_rtx_res = tp->t_rtx_res - tp->t_rtx_res_used;
 	tp->t_rtx_res = tp->t_rtx_res_used;
 	ntp->t_pflags = tp->t_pflags;
+	ntp->t_dfops = tp->t_dfops;
 
 	xfs_trans_dup_dqinfo(tp, ntp);
 
@@ -936,6 +938,9 @@ __xfs_trans_commit(
 	int			error = 0;
 	int			sync = tp->t_flags & XFS_TRANS_SYNC;
 
+	ASSERT(!tp->t_dfops || !xfs_defer_has_unfinished_work(tp->t_dfops) ||
+	       regrant);
+
 	/*
 	 * If there is nothing to be logged by the transaction,
 	 * then unlock all of the items associated with the
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 9d542dfe0052..220b5b65f524 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -111,6 +111,7 @@ typedef struct xfs_trans {
 	struct xlog_ticket	*t_ticket;	/* log mgr ticket */
 	struct xfs_mount	*t_mountp;	/* ptr to fs mount struct */
 	struct xfs_dquot_acct   *t_dqinfo;	/* acctg info for dquots */
+	struct xfs_defer_ops	*t_dfops;	/* deferred ops reference */
 	unsigned int		t_flags;	/* misc flags */
 	int64_t			t_icount_delta;	/* superblock icount change */
 	int64_t			t_ifree_delta;	/* superblock ifree change */
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/7] xfs: defer agfl block frees when dfops is available
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
  2018-04-10 17:46 ` [PATCH 1/7] xfs: create agfl block free helper function Brian Foster
  2018-04-10 17:46 ` [PATCH 2/7] xfs: allow attach of dfops to transaction Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-15  7:46   ` Christoph Hellwig
  2018-04-10 17:46 ` [PATCH 4/7] xfs: defer agfl block frees from deferred ops processing context Brian Foster
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

The AGFL fixup code executes before every block allocation/free and
rectifies the AGFL based on the current, dynamic allocation
requirements of the fs. The AGFL must hold a minimum number of
blocks to satisfy a worst case split of the free space btrees caused
by the impending allocation operation. The AGFL is also updated to
maintain the implicit requirement for a minimum number of free slots
to satisfy a worst case join of the free space btrees.

Since the AGFL caches individual blocks, AGFL reduction typically
involves multiple, single block frees. We've had reports of
transaction overrun problems during certain workloads that boil down
to AGFL reduction freeing multiple blocks and consuming more space
in the log than was reserved for the transaction.

Since the objective of freeing AGFL blocks is to ensure free AGFL
free slots are available for the upcoming allocation, one way to
address this problem is to release surplus blocks from the AGFL
immediately but defer the free of those blocks (similar to how
file-mapped blocks are unmapped from the file in one transaction and
freed via a deferred operation) until the transaction is rolled.
This turns AGFL reduction into an operation with predictable log
reservation consumption.

Add the capability to defer AGFL block frees when a deferred ops
list is available to the AGFL fixup code. While deferring AGFL frees
is currently a conditional behavior (based on whether the caller has
populated the new dfops field of the transaction structure), the
long term objective is to convert all codepaths to use tp->t_dfops
and thus defer AGFL block frees consistently.

A bit of customization is required to handle deferred completion
processing because AGFL blocks are accounted against a separate
reservation pool and AGFL blocks are not inserted into the extent
busy list when freed (they are inserted when used and released back
to the AGFL). Reuse the majority of the existing deferred extent
free infrastructure and customize it appropriately to handle AGFL
blocks.

Note that this patch only adds infrastructure. It does not change
behavior because no callers have been updated to pass ->t_dfops into
the allocation code.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c  | 51 ++++++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_defer.h  |  1 +
 fs/xfs/xfs_trace.h         |  2 ++
 fs/xfs/xfs_trans_extfree.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 120 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 193a5b4909c5..acf45e1aebd0 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -39,6 +39,9 @@
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
 #include "xfs_ag_resv.h"
+#include "xfs_bmap.h"
+
+extern kmem_zone_t	*xfs_bmap_free_item_zone;
 
 struct workqueue_struct *xfs_alloc_wq;
 
@@ -2172,6 +2175,40 @@ xfs_agfl_reset(
 }
 
 /*
+ * Defer an AGFL block free. This is effectively equivalent to
+ * xfs_bmap_add_free() with some special handling particular to AGFL blocks.
+ *
+ * Deferring AGFL frees helps prevent log reservation overruns due to too many
+ * allocation operations in a transaction. AGFL frees are prone to this problem
+ * because for one they are always freed one at a time. Further, an immediate
+ * AGFL block free can cause a btree join and require another block free before
+ * the real allocation can proceed. Deferring the free disconnects freeing up
+ * the AGFL slot from freeing the block.
+ */
+STATIC void
+xfs_defer_agfl_block(
+	struct xfs_mount		*mp,
+	struct xfs_defer_ops		*dfops,
+	xfs_agnumber_t			agno,
+	xfs_fsblock_t			agbno,
+	struct xfs_owner_info		*oinfo)
+{
+	struct xfs_extent_free_item	*new;		/* new element */
+
+	ASSERT(xfs_bmap_free_item_zone != NULL);
+	ASSERT(oinfo != NULL);
+
+	new = kmem_zone_alloc(xfs_bmap_free_item_zone, KM_SLEEP);
+	new->xefi_startblock = XFS_AGB_TO_FSB(mp, agno, agbno);
+	new->xefi_blockcount = 1;
+	new->xefi_oinfo = *oinfo;
+
+	trace_xfs_agfl_free_defer(mp, agno, 0, agbno, 1);
+
+	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_AGFL_FREE, &new->xefi_list);
+}
+
+/*
  * Decide whether to use this allocation group for this allocation.
  * If so, fix up the btree freelist's size.
  */
@@ -2275,10 +2312,16 @@ xfs_alloc_fix_freelist(
 		if (error)
 			goto out_agbp_relse;
 
-		error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
-					    &targs.oinfo);
-		if (error)
-			goto out_agbp_relse;
+		/* defer agfl frees if dfops is provided */
+		if (tp->t_dfops) {
+			xfs_defer_agfl_block(mp, tp->t_dfops, args->agno, bno,
+					     &targs.oinfo);
+		} else {
+			error = xfs_free_agfl_block(tp, args->agno, bno, agbp,
+						    &targs.oinfo);
+			if (error)
+				goto out_agbp_relse;
+		}
 	}
 
 	targs.tp = tp;
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index e263b3973cb6..a3a108e1f322 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_REFCOUNT,
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
+	XFS_DEFER_OPS_TYPE_AGFL_FREE,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 8955254b900e..00e6aaea6565 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2433,6 +2433,8 @@ DEFINE_DEFER_PENDING_EVENT(xfs_defer_pending_abort);
 #define DEFINE_BMAP_FREE_DEFERRED_EVENT DEFINE_PHYS_EXTENT_DEFERRED_EVENT
 DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_bmap_free_defer);
 DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_bmap_free_deferred);
+DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_agfl_free_defer);
+DEFINE_BMAP_FREE_DEFERRED_EVENT(xfs_agfl_free_deferred);
 
 /* rmap tracepoints */
 DECLARE_EVENT_CLASS(xfs_rmap_class,
diff --git a/fs/xfs/xfs_trans_extfree.c b/fs/xfs/xfs_trans_extfree.c
index ab438647592a..f5620796ae25 100644
--- a/fs/xfs/xfs_trans_extfree.c
+++ b/fs/xfs/xfs_trans_extfree.c
@@ -231,9 +231,79 @@ static const struct xfs_defer_op_type xfs_extent_free_defer_type = {
 	.cancel_item	= xfs_extent_free_cancel_item,
 };
 
+/*
+ * AGFL blocks are accounted differently in the reserve pools and are not
+ * inserted into the busy extent list.
+ */
+STATIC int
+xfs_agfl_free_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_defer_ops		*dop,
+	struct list_head		*item,
+	void				*done_item,
+	void				**state)
+{
+	struct xfs_mount		*mp = tp->t_mountp;
+	struct xfs_efd_log_item		*efdp = done_item;
+	struct xfs_extent_free_item	*free;
+	struct xfs_extent		*extp;
+	struct xfs_buf			*agbp;
+	int				error;
+	xfs_agnumber_t			agno;
+	xfs_agblock_t			agbno;
+	uint				next_extent;
+
+	free = container_of(item, struct xfs_extent_free_item, xefi_list);
+	ASSERT(free->xefi_blockcount == 1);
+	agno = XFS_FSB_TO_AGNO(mp, free->xefi_startblock);
+	agbno = XFS_FSB_TO_AGBNO(mp, free->xefi_startblock);
+
+	trace_xfs_agfl_free_deferred(mp, agno, 0, agbno, free->xefi_blockcount);
+
+	error = xfs_alloc_read_agf(mp, tp, agno, 0, &agbp);
+	if (!error)
+		error = xfs_free_agfl_block(tp, agno, agbno, agbp,
+					    &free->xefi_oinfo);
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the EFI and frees the EFD
+	 * 2.) shuts down the filesystem
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	efdp->efd_item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	next_extent = efdp->efd_next_extent;
+	ASSERT(next_extent < efdp->efd_format.efd_nextents);
+	extp = &(efdp->efd_format.efd_extents[next_extent]);
+	extp->ext_start = free->xefi_startblock;
+	extp->ext_len = free->xefi_blockcount;
+	efdp->efd_next_extent++;
+
+	kmem_free(free);
+	return error;
+}
+
+
+/* sub-type with special handling for AGFL deferred frees */
+static const struct xfs_defer_op_type xfs_agfl_free_defer_type = {
+	.type		= XFS_DEFER_OPS_TYPE_AGFL_FREE,
+	.max_items	= XFS_EFI_MAX_FAST_EXTENTS,
+	.diff_items	= xfs_extent_free_diff_items,
+	.create_intent	= xfs_extent_free_create_intent,
+	.abort_intent	= xfs_extent_free_abort_intent,
+	.log_item	= xfs_extent_free_log_item,
+	.create_done	= xfs_extent_free_create_done,
+	.finish_item	= xfs_agfl_free_finish_item,
+	.cancel_item	= xfs_extent_free_cancel_item,
+};
+
 /* Register the deferred op type. */
 void
 xfs_extent_free_init_defer_op(void)
 {
 	xfs_defer_init_op_type(&xfs_extent_free_defer_type);
+	xfs_defer_init_op_type(&xfs_agfl_free_defer_type);
 }
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/7] xfs: defer agfl block frees from deferred ops processing context
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
                   ` (2 preceding siblings ...)
  2018-04-10 17:46 ` [PATCH 3/7] xfs: defer agfl block frees when dfops is available Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-10 17:46 ` [PATCH 5/7] xfs: defer agfl frees from inode inactivation Brian Foster
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

Now that AGFL block frees can be deferred when dfops is carried
through to the allocator via the transaction, we want to start
deferring AGFL block frees from the contexts that are known to push
the limits of existing log reservations.

The first such context is deferred operation processing itself. This
primarily targets deferred extent frees (such as file extents and
inode chunks), but in doing so covers all allocation operations that
occur in deferred operation processing context.

Update xfs_defer_finish() to set and reset ->t_dfops across the
processing sequence. This means that any AGFL block frees due to
allocation events result in the addition of new EFIs to the dfops
rather than being processed immediately. xfs_defer_finish() rolls
the transaction at least once more to process the frees of the AGFL
blocks back to the allocation btrees and returns once the AGFL is
rectified.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_defer.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index f3aef54257d1..493fdc86f54a 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -352,11 +352,22 @@ xfs_defer_finish(
 	void				*state;
 	int				error = 0;
 	void				(*cleanup_fn)(struct xfs_trans *, void *, int);
+	struct xfs_defer_ops		*orig_dop;
 
 	ASSERT((*tp)->t_flags & XFS_TRANS_PERM_LOG_RES);
 
 	trace_xfs_defer_finish((*tp)->t_mountp, dop);
 
+	/*
+	 * Attach dfops to the transaction during deferred ops processing. This
+	 * explicitly causes calls into the allocator to defer AGFL block frees.
+	 * Note that this code can go away once all dfops users attach to the
+	 * associated tp.
+	 */
+	ASSERT(!(*tp)->t_dfops || ((*tp)->t_dfops == dop));
+	orig_dop = (*tp)->t_dfops;
+	(*tp)->t_dfops = dop;
+
 	/* Until we run out of pending work to finish... */
 	while (xfs_defer_has_unfinished_work(dop)) {
 		/* Log intents for work items sitting in the intake. */
@@ -428,6 +439,7 @@ xfs_defer_finish(
 	}
 
 out:
+	(*tp)->t_dfops = orig_dop;
 	if (error)
 		trace_xfs_defer_finish_error((*tp)->t_mountp, dop, error);
 	else
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/7] xfs: defer agfl frees from inode inactivation
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
                   ` (3 preceding siblings ...)
  2018-04-10 17:46 ` [PATCH 4/7] xfs: defer agfl block frees from deferred ops processing context Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-10 17:46 ` [PATCH 6/7] xfs: defer frees from common inode allocation paths Brian Foster
  2018-04-10 17:46 ` [PATCH 7/7] xfs: defer agfl frees from directory op transactions Brian Foster
  6 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

XFS inode chunks are already freed via deferred operations (which
now also defer AGFL block frees), but inode btree blocks are freed
directly in the associated context. This has been known to lead to
log reservation overruns in particular workloads where an inobt
block free may require several AGFL block frees (and thus several
allocation btree modifications) before the inobt block itself is
actually freed.

To avoid this problem, we want to defer the frees of any AGFL blocks
that must be dropped before the inobt block free can take place.
This requires passing the dfops from xfs_inactive_ifree() down
through the inobt ->[alloc|free]_block() callouts, which essentially
only requires to attach the dfops to the transaction since it is
already carried all the way through to the inobt update and
allocation.

Since dfops is attached to the transaction, we no longer have to
pass it down explicitly through xfs_difree() and friends to defer
the physical inode chunk free. Kill two stones with one bird and
clean up the entire codepath to reference ->t_dfops.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_ialloc.c |  6 ++----
 fs/xfs/libxfs/xfs_ialloc.h |  1 -
 fs/xfs/xfs_inode.c         | 16 ++++++++++------
 fs/xfs/xfs_inode.h         |  3 +--
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index de627fa19168..c7d8295f3ec1 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -1897,7 +1897,6 @@ xfs_difree_inobt(
 	struct xfs_trans		*tp,
 	struct xfs_buf			*agbp,
 	xfs_agino_t			agino,
-	struct xfs_defer_ops		*dfops,
 	struct xfs_icluster		*xic,
 	struct xfs_inobt_rec_incore	*orec)
 {
@@ -1984,7 +1983,7 @@ xfs_difree_inobt(
 			goto error0;
 		}
 
-		xfs_difree_inode_chunk(mp, agno, &rec, dfops);
+		xfs_difree_inode_chunk(mp, agno, &rec, tp->t_dfops);
 	} else {
 		xic->deleted = false;
 
@@ -2129,7 +2128,6 @@ int
 xfs_difree(
 	struct xfs_trans	*tp,		/* transaction pointer */
 	xfs_ino_t		inode,		/* inode to be freed */
-	struct xfs_defer_ops	*dfops,		/* extents to free */
 	struct xfs_icluster	*xic)	/* cluster info if deleted */
 {
 	/* REFERENCED */
@@ -2181,7 +2179,7 @@ xfs_difree(
 	/*
 	 * Fix up the inode allocation btree.
 	 */
-	error = xfs_difree_inobt(mp, tp, agbp, agino, dfops, xic, &rec);
+	error = xfs_difree_inobt(mp, tp, agbp, agino, xic, &rec);
 	if (error)
 		goto error0;
 
diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
index c5402bb4ce0c..884562efd708 100644
--- a/fs/xfs/libxfs/xfs_ialloc.h
+++ b/fs/xfs/libxfs/xfs_ialloc.h
@@ -94,7 +94,6 @@ int					/* error */
 xfs_difree(
 	struct xfs_trans *tp,		/* transaction pointer */
 	xfs_ino_t	inode,		/* inode to be freed */
-	struct xfs_defer_ops *dfops,	/* extents to free */
 	struct xfs_icluster *ifree);	/* cluster info if deleted */
 
 /*
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 0c2e53fbc0f0..117dced8bf66 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1808,8 +1808,13 @@ xfs_inactive_ifree(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
-	xfs_defer_init(NULL, &dfops, &first_block);
-	error = xfs_ifree(tp, ip, &dfops);
+	/*
+	 * Attach dfops to tp to defer any physical inode chunk frees and/or
+	 * AGFL block frees that occur as a result of inobt block allocation
+	 * operations.
+	 */
+	xfs_defer_init(tp, &dfops, &first_block);
+	error = xfs_ifree(tp, ip);
 	if (error) {
 		/*
 		 * If we fail to free the inode, shut down.  The cancel
@@ -2431,9 +2436,8 @@ xfs_ifree_local_data(
  */
 int
 xfs_ifree(
-	xfs_trans_t	*tp,
-	xfs_inode_t	*ip,
-	struct xfs_defer_ops	*dfops)
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip)
 {
 	int			error;
 	struct xfs_icluster	xic = { 0 };
@@ -2452,7 +2456,7 @@ xfs_ifree(
 	if (error)
 		return error;
 
-	error = xfs_difree(tp, ip->i_ino, dfops, &xic);
+	error = xfs_difree(tp, ip->i_ino, &xic);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 1eebc53df7d7..9e1ec46f31a2 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -413,8 +413,7 @@ uint		xfs_ilock_data_map_shared(struct xfs_inode *);
 uint		xfs_ilock_attr_map_shared(struct xfs_inode *);
 
 uint		xfs_ip2xflags(struct xfs_inode *);
-int		xfs_ifree(struct xfs_trans *, xfs_inode_t *,
-			   struct xfs_defer_ops *);
+int		xfs_ifree(struct xfs_trans *, struct xfs_inode *);
 int		xfs_itruncate_extents(struct xfs_trans **, struct xfs_inode *,
 				      int, xfs_fsize_t);
 void		xfs_iext_realloc(xfs_inode_t *, int, int);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/7] xfs: defer frees from common inode allocation paths
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
                   ` (4 preceding siblings ...)
  2018-04-10 17:46 ` [PATCH 5/7] xfs: defer agfl frees from inode inactivation Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  2018-04-10 17:46 ` [PATCH 7/7] xfs: defer agfl frees from directory op transactions Brian Foster
  6 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

Inode allocation can require block allocation for physical inode
chunk allocation, inode btree record insertion, and/or directory
block allocation for entry insertion. Any of these block allocation
requests can require AGFL fixups prior to the actual allocation.
Update the common file creation transacions to defer AGFL frees from
these contexts to avoid too much log reservation consumption
per-transaction.

Since these transactions are already passed down through the btree
cursors and da_args structure, this simply requires to attach dfops
to the transaction. Note that this covers tr_create, tr_mkdir and
tr_symlink. Other transactions such as tr_create_tmpfile do not
already make use of deferred operations and so are left alone for
the time being.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_inode.c   | 2 +-
 fs/xfs/xfs_symlink.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 117dced8bf66..fdf5d276b44c 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1195,7 +1195,7 @@ xfs_create(
 	xfs_ilock(dp, XFS_ILOCK_EXCL | XFS_ILOCK_PARENT);
 	unlock_dp_on_error = true;
 
-	xfs_defer_init(NULL, &dfops, &first_block);
+	xfs_defer_init(tp, &dfops, &first_block);
 
 	/*
 	 * Reserve disk quota and the inode.
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index e4c4ea25af61..0f3564741fa9 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -258,7 +258,7 @@ xfs_symlink(
 	 * Initialize the bmap freelist prior to calling either
 	 * bmapi or the directory create code.
 	 */
-	xfs_defer_init(NULL, &dfops, &first_block);
+	xfs_defer_init(tp, &dfops, &first_block);
 
 	/*
 	 * Allocate an inode for the symlink.
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/7] xfs: defer agfl frees from directory op transactions
  2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
                   ` (5 preceding siblings ...)
  2018-04-10 17:46 ` [PATCH 6/7] xfs: defer frees from common inode allocation paths Brian Foster
@ 2018-04-10 17:46 ` Brian Foster
  6 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-10 17:46 UTC (permalink / raw)
  To: linux-xfs

Directory operations can perform block allocations as entries are
added/removed from directories. Defer AGFL block frees from the
remaining directory operation transactions. This covers the hard
link, remove and rename operations.

Since this fixes up all callers of the dir2 create, remove and
replace functions to pass dfops via the transaction, clean up the
associated interfaces to no longer require a dfops parameter.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_dir2.c | 16 +++++++-------
 fs/xfs/libxfs/xfs_dir2.h |  9 +++-----
 fs/xfs/xfs_inode.c       | 56 ++++++++++++++++++++++--------------------------
 fs/xfs/xfs_symlink.c     |  2 +-
 4 files changed, 38 insertions(+), 45 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 92f94e190f04..3a6e126212ec 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -256,7 +256,6 @@ xfs_dir_createname(
 	struct xfs_name		*name,
 	xfs_ino_t		inum,		/* new entry inode number */
 	xfs_fsblock_t		*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t		total)		/* bmap's total block count */
 {
 	struct xfs_da_args	*args;
@@ -282,11 +281,12 @@ xfs_dir_createname(
 	args->hashval = dp->i_mount->m_dirnameops->hashname(name);
 	args->inumber = inum;
 	args->dp = dp;
-	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops || !first);
+	args->dfops = tp->t_dfops;
+	args->firstblock = first;
 	args->op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
 	if (!inum)
 		args->op_flags |= XFS_DA_OP_JUSTCHECK;
@@ -433,7 +433,6 @@ xfs_dir_removename(
 	struct xfs_name	*name,
 	xfs_ino_t	ino,
 	xfs_fsblock_t	*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t	total)		/* bmap's total block count */
 {
 	struct xfs_da_args *args;
@@ -455,10 +454,11 @@ xfs_dir_removename(
 	args->inumber = ino;
 	args->dp = dp;
 	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops);
+	args->dfops = tp->t_dfops;
 
 	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
 		rval = xfs_dir2_sf_removename(args);
@@ -495,7 +495,6 @@ xfs_dir_replace(
 	struct xfs_name	*name,		/* name of entry to replace */
 	xfs_ino_t	inum,		/* new inode number */
 	xfs_fsblock_t	*first,		/* bmap's firstblock */
-	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
 	xfs_extlen_t	total)		/* bmap's total block count */
 {
 	struct xfs_da_args *args;
@@ -520,10 +519,11 @@ xfs_dir_replace(
 	args->inumber = inum;
 	args->dp = dp;
 	args->firstblock = first;
-	args->dfops = dfops;
 	args->total = total;
 	args->whichfork = XFS_DATA_FORK;
 	args->trans = tp;
+	ASSERT(tp->t_dfops);
+	args->dfops = tp->t_dfops;
 
 	if (dp->i_d.di_format == XFS_DINODE_FMT_LOCAL) {
 		rval = xfs_dir2_sf_replace(args);
@@ -559,7 +559,7 @@ xfs_dir_canenter(
 	xfs_inode_t	*dp,
 	struct xfs_name	*name)		/* name of entry to add */
 {
-	return xfs_dir_createname(tp, dp, name, 0, NULL, NULL, 0);
+	return xfs_dir_createname(tp, dp, name, 0, NULL, 0);
 }
 
 /*
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index 989e95a53db2..d9e508988fd8 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -130,19 +130,16 @@ extern int xfs_dir_init(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_inode *pdp);
 extern int xfs_dir_createname(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t *inum,
 				struct xfs_name *ci_name);
 extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t ino,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
-				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				xfs_fsblock_t *first, xfs_extlen_t tot);
 extern int xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index fdf5d276b44c..9ca198dbd419 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1224,8 +1224,8 @@ xfs_create(
 	xfs_trans_ijoin(tp, dp, XFS_ILOCK_EXCL);
 	unlock_dp_on_error = false;
 
-	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
-					&first_block, &dfops, resblks ?
+	error = xfs_dir_createname(tp, dp, name, ip->i_ino, &first_block,
+				   resblks ?
 					resblks - XFS_IALLOC_SPACE_RES(mp) : 0);
 	if (error) {
 		ASSERT(error != -ENOSPC);
@@ -1450,7 +1450,7 @@ xfs_link(
 			goto error_return;
 	}
 
-	xfs_defer_init(NULL, &dfops, &first_block);
+	xfs_defer_init(tp, &dfops, &first_block);
 
 	/*
 	 * Handle initial link state of O_TMPFILE inode
@@ -1462,7 +1462,7 @@ xfs_link(
 	}
 
 	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, resblks);
 	if (error)
 		goto error_return;
 	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
@@ -2648,9 +2648,9 @@ xfs_remove(
 	if (error)
 		goto out_trans_cancel;
 
-	xfs_defer_init(NULL, &dfops, &first_block);
-	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
-					&first_block, &dfops, resblks);
+	xfs_defer_init(tp, &dfops, &first_block);
+	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
+				   resblks);
 	if (error) {
 		ASSERT(error != -ENOENT);
 		goto out_bmap_cancel;
@@ -2738,9 +2738,9 @@ xfs_sort_for_rename(
 
 static int
 xfs_finish_rename(
-	struct xfs_trans	*tp,
-	struct xfs_defer_ops	*dfops)
+	struct xfs_trans	*tp)
 {
+	struct xfs_defer_ops	*dfops = tp->t_dfops;
 	int			error;
 
 	/*
@@ -2774,7 +2774,6 @@ xfs_cross_rename(
 	struct xfs_inode	*dp2,
 	struct xfs_name		*name2,
 	struct xfs_inode	*ip2,
-	struct xfs_defer_ops	*dfops,
 	xfs_fsblock_t		*first_block,
 	int			spaceres)
 {
@@ -2784,16 +2783,14 @@ xfs_cross_rename(
 	int		dp2_flags = 0;
 
 	/* Swap inode number for dirent in first parent */
-	error = xfs_dir_replace(tp, dp1, name1,
-				ip2->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp1, name1, ip2->i_ino, first_block,
+				spaceres);
 	if (error)
 		goto out_trans_abort;
 
 	/* Swap inode number for dirent in second parent */
-	error = xfs_dir_replace(tp, dp2, name2,
-				ip1->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp2, name2, ip1->i_ino, first_block,
+				spaceres);
 	if (error)
 		goto out_trans_abort;
 
@@ -2808,7 +2805,7 @@ xfs_cross_rename(
 		if (S_ISDIR(VFS_I(ip2)->i_mode)) {
 			error = xfs_dir_replace(tp, ip2, &xfs_name_dotdot,
 						dp1->i_ino, first_block,
-						dfops, spaceres);
+						spaceres);
 			if (error)
 				goto out_trans_abort;
 
@@ -2835,7 +2832,7 @@ xfs_cross_rename(
 		if (S_ISDIR(VFS_I(ip1)->i_mode)) {
 			error = xfs_dir_replace(tp, ip1, &xfs_name_dotdot,
 						dp2->i_ino, first_block,
-						dfops, spaceres);
+						spaceres);
 			if (error)
 				goto out_trans_abort;
 
@@ -2874,10 +2871,10 @@ xfs_cross_rename(
 	}
 	xfs_trans_ichgtime(tp, dp1, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, dp1, XFS_ILOG_CORE);
-	return xfs_finish_rename(tp, dfops);
+	return xfs_finish_rename(tp);
 
 out_trans_abort:
-	xfs_defer_cancel(dfops);
+	xfs_defer_cancel(tp->t_dfops);
 	xfs_trans_cancel(tp);
 	return error;
 }
@@ -3015,13 +3012,13 @@ xfs_rename(
 		goto out_trans_cancel;
 	}
 
-	xfs_defer_init(NULL, &dfops, &first_block);
+	xfs_defer_init(tp, &dfops, &first_block);
 
 	/* RENAME_EXCHANGE is unique from here on. */
 	if (flags & RENAME_EXCHANGE)
 		return xfs_cross_rename(tp, src_dp, src_name, src_ip,
 					target_dp, target_name, target_ip,
-					&dfops, &first_block, spaceres);
+					&first_block, spaceres);
 
 	/*
 	 * Set up the target.
@@ -3043,7 +3040,7 @@ xfs_rename(
 		 */
 		error = xfs_dir_createname(tp, target_dp, target_name,
 						src_ip->i_ino, &first_block,
-						&dfops, spaceres);
+						spaceres);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3082,8 +3079,7 @@ xfs_rename(
 		 * name at the destination directory, remove it first.
 		 */
 		error = xfs_dir_replace(tp, target_dp, target_name,
-					src_ip->i_ino,
-					&first_block, &dfops, spaceres);
+					src_ip->i_ino, &first_block, spaceres);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3117,8 +3113,8 @@ xfs_rename(
 		 * directory.
 		 */
 		error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
-					target_dp->i_ino,
-					&first_block, &dfops, spaceres);
+					target_dp->i_ino, &first_block,
+					spaceres);
 		ASSERT(error != -EEXIST);
 		if (error)
 			goto out_bmap_cancel;
@@ -3157,10 +3153,10 @@ xfs_rename(
 	 */
 	if (wip) {
 		error = xfs_dir_replace(tp, src_dp, src_name, wip->i_ino,
-					&first_block, &dfops, spaceres);
+					&first_block, spaceres);
 	} else
 		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
-					   &first_block, &dfops, spaceres);
+					   &first_block, spaceres);
 	if (error)
 		goto out_bmap_cancel;
 
@@ -3195,7 +3191,7 @@ xfs_rename(
 	if (new_parent)
 		xfs_trans_log_inode(tp, target_dp, XFS_ILOG_CORE);
 
-	error = xfs_finish_rename(tp, &dfops);
+	error = xfs_finish_rename(tp);
 	if (wip)
 		IRELE(wip);
 	return error;
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 0f3564741fa9..264b4196d540 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -350,7 +350,7 @@ xfs_symlink(
 	 * Create the directory entry for the symlink.
 	 */
 	error = xfs_dir_createname(tp, dp, link_name, ip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, resblks);
 	if (error)
 		goto out_bmap_cancel;
 	xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
-- 
2.13.6


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/7] xfs: create agfl block free helper function
  2018-04-10 17:46 ` [PATCH 1/7] xfs: create agfl block free helper function Brian Foster
@ 2018-04-15  7:38   ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2018-04-15  7:38 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/7] xfs: allow attach of dfops to transaction
  2018-04-10 17:46 ` [PATCH 2/7] xfs: allow attach of dfops to transaction Brian Foster
@ 2018-04-15  7:44   ` Christoph Hellwig
  2018-04-16 13:33     ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2018-04-15  7:44 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

> diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> index 087fea02c389..f3aef54257d1 100644
> --- a/fs/xfs/libxfs/xfs_defer.c
> +++ b/fs/xfs/libxfs/xfs_defer.c
> @@ -525,6 +525,7 @@ xfs_defer_init_op_type(
>  /* Initialize a deferred operation. */
>  void
>  xfs_defer_init(
> +	struct xfs_trans		*tp,
>  	struct xfs_defer_ops		*dop,
>  	xfs_fsblock_t			*fbp)
>  {
> @@ -532,5 +533,7 @@ xfs_defer_init(
>  	*fbp = NULLFSBLOCK;
>  	INIT_LIST_HEAD(&dop->dop_intake);
>  	INIT_LIST_HEAD(&dop->dop_pending);
> -	trace_xfs_defer_init(NULL, dop);
> +	if (tp)
> +		tp->t_dfops = dop;
> +	trace_xfs_defer_init(tp ? tp->t_mountp : NULL, dop);

there is really no oint un doing this massive change of prototypes
everywhere just to move an assignment into xfs_defer_init.  Just keep
that assignment in the caller.


But in general I'm really concerned about making the dops in the
transaction conditional behavior.  This is just going down the rathole
of inconsistent behavior where some callers expect it in the transaction
vs other explcitit and every time something really to defered ops is
touched it will need a deep code audit.  I'd rather switch everything
over in a go and be done with it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/7] xfs: defer agfl block frees when dfops is available
  2018-04-10 17:46 ` [PATCH 3/7] xfs: defer agfl block frees when dfops is available Brian Foster
@ 2018-04-15  7:46   ` Christoph Hellwig
  2018-04-16 13:42     ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2018-04-15  7:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

> Add the capability to defer AGFL block frees when a deferred ops
> list is available to the AGFL fixup code.

When is one available and when not?  Why are we fine not delaying it
in some cases?

> Note that this patch only adds infrastructure. It does not change
> behavior because no callers have been updated to pass ->t_dfops into
> the allocation code.

So the plan is to eventually have all callers pass it?  If so that is
another argument to first always pass defer_ops through the transaction,
then make sure all callers have it where required, and only then move
to actually use it.

> +{
> +	struct xfs_extent_free_item	*new;		/* new element */
> +
> +	ASSERT(xfs_bmap_free_item_zone != NULL);
> +	ASSERT(oinfo != NULL);

No need for either assert, we get nice clean NULL ptr dereferences when
this isn't true.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/7] xfs: allow attach of dfops to transaction
  2018-04-15  7:44   ` Christoph Hellwig
@ 2018-04-16 13:33     ` Brian Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-16 13:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Sun, Apr 15, 2018 at 12:44:00AM -0700, Christoph Hellwig wrote:
> > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > index 087fea02c389..f3aef54257d1 100644
> > --- a/fs/xfs/libxfs/xfs_defer.c
> > +++ b/fs/xfs/libxfs/xfs_defer.c
> > @@ -525,6 +525,7 @@ xfs_defer_init_op_type(
> >  /* Initialize a deferred operation. */
> >  void
> >  xfs_defer_init(
> > +	struct xfs_trans		*tp,
> >  	struct xfs_defer_ops		*dop,
> >  	xfs_fsblock_t			*fbp)
> >  {
> > @@ -532,5 +533,7 @@ xfs_defer_init(
> >  	*fbp = NULLFSBLOCK;
> >  	INIT_LIST_HEAD(&dop->dop_intake);
> >  	INIT_LIST_HEAD(&dop->dop_pending);
> > -	trace_xfs_defer_init(NULL, dop);
> > +	if (tp)
> > +		tp->t_dfops = dop;
> > +	trace_xfs_defer_init(tp ? tp->t_mountp : NULL, dop);
> 
> there is really no oint un doing this massive change of prototypes
> everywhere just to move an assignment into xfs_defer_init.  Just keep
> that assignment in the caller.
> 

I was doing that in the previous versions and thought this might be more
clear in terms of ensuring new code initialized structures properly.
That said, I don't much care which approach is used so I can change it
back and (potentially) revisit when more code is switched over.

> 
> But in general I'm really concerned about making the dops in the
> transaction conditional behavior.  This is just going down the rathole
> of inconsistent behavior where some callers expect it in the transaction
> vs other explcitit and every time something really to defered ops is
> touched it will need a deep code audit.  I'd rather switch everything
> over in a go and be done with it.

Deep code audit, eh? :/ There is no such confusion in the allocator
context because there's only one way to pass the dfops: in the
transaction. The reason for doing this is essentially to optimize the
step out of manually passing down the dfops and then immediately
refactoring all that cruft out because the needed dfops-passing
callchains currently don't exist.

The purpose of this patchset is a functional fix: defer AGFL frees to
reduce tx reservation overuse from problematic contexts. It introduces
the concept of the broader refactoring (that Dave proposed on a previous
version, IIRC) purely for the scope of the fix. For example, consider
that ->t_dfops were renamed to something like ->t_agfl_dfops for the
purpose of this patch series.

Not using the transaction means we'd have to pass dfops manually into
the allocator and deep into the btree code (via da_args and btree
cursors). I could technically do that, but it creates a bunch of ugly
code that results in the same functional logic/behavior of this patch
series as is and essentially is a waste of time.

The temporary confusion that does remain is the inconsistency that some
callchains use ->t_dfops while others do not (yet). The reason/fix for
that is mostly aesthetic/mechanical and so really shouldn't be that
confusing IMO. I certainly don't think that justifies plumbing dfops
through manually only to delete it in followup patches. We could try to
reduce that confusion by dropping the xfs_defer_init() bits and renaming
->t_dfops as noted above, but note that technically drops the other
dfops-passing callchain cleanups in the series because ->t_agfl_dfops
should only be used for AGFL fixups. That would have to go away until
->t_agfl_dfops were renamed back to ->t_dfops, but would further
separate the refactoring from the functional change without creating
unnecessary work. Thoughts?

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/7] xfs: defer agfl block frees when dfops is available
  2018-04-15  7:46   ` Christoph Hellwig
@ 2018-04-16 13:42     ` Brian Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2018-04-16 13:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Sun, Apr 15, 2018 at 12:46:58AM -0700, Christoph Hellwig wrote:
> > Add the capability to defer AGFL block frees when a deferred ops
> > list is available to the AGFL fixup code.
> 
> When is one available and when not?  Why are we fine not delaying it
> in some cases?
> 

Because not all transactions either have this problem or generally may
not have a need for deferred operations at all. This mechanism doesn't
dictate that all allocator callers must use deferred operations if they
haven't necessarily demonstrated the problem.

> > Note that this patch only adds infrastructure. It does not change
> > behavior because no callers have been updated to pass ->t_dfops into
> > the allocation code.
> 
> So the plan is to eventually have all callers pass it?  If so that is
> another argument to first always pass defer_ops through the transaction,
> then make sure all callers have it where required, and only then move
> to actually use it.
> 

The plan is that ultimately any transaction that already uses deferred
operations or any transaction that explicitly requires deferred agfl
frees (those modified in this patchset) will pass it along to the
allocator context. I wasn't planning to add dfops for any allocator
callers that don't currently use/need one, but we can revisit that
later.

As noted in my previous mail, this is essentially equivalent to passing
down the dfops on the stack and optimizing out the step of immediately
refactoring that code away. Further, doing broader refactoring first
puts the cleanup ahead of the fix, which is backwards. I want to fix the
problem first and clean up the code second.

> > +{
> > +	struct xfs_extent_free_item	*new;		/* new element */
> > +
> > +	ASSERT(xfs_bmap_free_item_zone != NULL);
> > +	ASSERT(oinfo != NULL);
> 
> No need for either assert, we get nice clean NULL ptr dereferences when
> this isn't true.

Ok.

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-04-16 13:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-04-10 17:46 [PATCH 0/7] xfs: defer agfl block frees Brian Foster
2018-04-10 17:46 ` [PATCH 1/7] xfs: create agfl block free helper function Brian Foster
2018-04-15  7:38   ` Christoph Hellwig
2018-04-10 17:46 ` [PATCH 2/7] xfs: allow attach of dfops to transaction Brian Foster
2018-04-15  7:44   ` Christoph Hellwig
2018-04-16 13:33     ` Brian Foster
2018-04-10 17:46 ` [PATCH 3/7] xfs: defer agfl block frees when dfops is available Brian Foster
2018-04-15  7:46   ` Christoph Hellwig
2018-04-16 13:42     ` Brian Foster
2018-04-10 17:46 ` [PATCH 4/7] xfs: defer agfl block frees from deferred ops processing context Brian Foster
2018-04-10 17:46 ` [PATCH 5/7] xfs: defer agfl frees from inode inactivation Brian Foster
2018-04-10 17:46 ` [PATCH 6/7] xfs: defer frees from common inode allocation paths Brian Foster
2018-04-10 17:46 ` [PATCH 7/7] xfs: defer agfl frees from directory op transactions Brian Foster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).