[PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes
@ 2015-08-06 17:44 Brian Foster
  2015-08-06 17:44 ` [PATCH 01/11] xfs: disentagle EFI release from the extent count Brian Foster
                   ` (10 more replies)
  0 siblings, 11 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Hi all,

This series includes a bunch of bug fixes from doing some log recovery
testing. It started with an RFC of the EFI/EFD bits and expanded to
include more fixes as problems were discovered.

Patch 1 removes the superfluous EFI/EFD extent count reference tracking.
Patches 2-6 make a couple cleanups and fix the explicit EFI/EFD
reference counting to avoid leaving EFI items on the AIL in the event of
error. Patches 7-8 fix handling of cancelled icreate transactions in log
recovery. Patch 9 is a repost of a previous log recovery fix. Patches
10-11 fix a couple more mount related crash/hang issues.

Just about all of these problems were reproduced by running repeated
fsstress/godown workloads against XFS. This set has gone ~150 iterations
of said testing before hitting some kind of recovery time directory
corruption in the form of a dir. block write verifier failure. I'll see
if it is reproducible, but I don't think it's related to anything
here since I haven't touched directory code...

Brian

v1:
- Updated EFI/EFD tracking semantics based on rfc comments.
- Added more cleanups/fixes to series.
rfc: http://oss.sgi.com/pipermail/xfs/2015-July/042617.html

Brian Foster (11):
  xfs: disentagle EFI release from the extent count
  xfs: return committed status from xfs_trans_roll()
  xfs: fix efi/efd error handling to avoid fs shutdown hangs
  xfs: ensure EFD trans aborts on log recovery extent free failure
  xfs: use EFI refcount consistently in log recovery
  xfs: don't leave EFIs on AIL on mount failure
  xfs: icreate log item recovery and cancellation tracepoints
  xfs: fix broken icreate log item cancellation
  xfs: checksum log record ext headers based on record size
  xfs: clean up root inode properly on mount failure
  xfs: fix btree cursor error cleanups

 fs/xfs/libxfs/xfs_ialloc.c |   2 +-
 fs/xfs/xfs_bmap_util.c     |  84 +++++++++++++++-----------
 fs/xfs/xfs_extfree_item.c  | 118 +++++++++++++++++-------------------
 fs/xfs/xfs_extfree_item.h  |   1 +
 fs/xfs/xfs_itable.c        |   3 +-
 fs/xfs/xfs_log.c           |  23 +++++--
 fs/xfs/xfs_log.h           |   2 +-
 fs/xfs/xfs_log_priv.h      |   2 +
 fs/xfs/xfs_log_recover.c   | 147 +++++++++++++++++++++++++++++++++------------
 fs/xfs/xfs_mount.c         |  37 ++++++++----
 fs/xfs/xfs_trace.h         |  34 +++++++++++
 fs/xfs/xfs_trans.c         |  15 ++++-
 fs/xfs/xfs_trans.h         |   2 +-
 13 files changed, 307 insertions(+), 163 deletions(-)

-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 01/11] xfs: disentagle EFI release from the extent count
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  7:36   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 02/11] xfs: return committed status from xfs_trans_roll() Brian Foster
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Release of the EFI either occurs based on the reference count or the
extent count. The extent count used is either the count tracked in the
EFI or EFD, depending on the particular situation. In either case, the
count is initialized to the final value and thus always matches the
current efi_next_extent value once the EFI is completely constructed.

Given this along with the fact that the EFI is always completely
constructed before error can occur, eliminate the unnecessary extent
count manipulation and release the EFI directly based on reference
count. The efi_next_extent counter remains because it is still used to
track the slot to log the next extent to free.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_extfree_item.c | 20 ++++++++------------
 fs/xfs/xfs_extfree_item.h |  1 +
 fs/xfs/xfs_log_recover.c  |  2 +-
 fs/xfs/xfs_trans.h        |  1 -
 4 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index adc8f8f..42c9b05 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -149,7 +149,7 @@ xfs_efi_item_unpin(
 		xfs_efi_item_free(efip);
 		return;
 	}
-	__xfs_efi_release(efip);
+	xfs_efi_release(efip);
 }
 
 /*
@@ -307,18 +307,14 @@ xfs_efi_copy_format(xfs_log_iovec_t *buf, xfs_efi_log_format_t *dst_efi_fmt)
  * by this efi item we can free the efi item.
  */
 void
-xfs_efi_release(xfs_efi_log_item_t	*efip,
-		uint			nextents)
+xfs_efi_release(struct xfs_efi_log_item	*efip)
 {
-	ASSERT(atomic_read(&efip->efi_next_extent) >= nextents);
-	if (atomic_sub_and_test(nextents, &efip->efi_next_extent)) {
-		/* recovery needs us to drop the EFI reference, too */
-		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
-			__xfs_efi_release(efip);
-
+	/* recovery needs us to drop the EFI reference, too */
+	if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
 		__xfs_efi_release(efip);
-		/* efip may now have been freed, do not reference it again. */
-	}
+
+	__xfs_efi_release(efip);
+	/* efip may now have been freed, do not reference it again. */
 }
 
 static inline struct xfs_efd_log_item *EFD_ITEM(struct xfs_log_item *lip)
@@ -442,7 +438,7 @@ xfs_efd_item_committed(
 	 * EFI got unpinned and freed before the EFD got aborted.
 	 */
 	if (!(lip->li_flags & XFS_LI_ABORTED))
-		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
+		xfs_efi_release(efdp->efd_efip);
 
 	xfs_efd_item_free(efdp);
 	return (xfs_lsn_t)-1;
diff --git a/fs/xfs/xfs_extfree_item.h b/fs/xfs/xfs_extfree_item.h
index 0ffbce3..399562e 100644
--- a/fs/xfs/xfs_extfree_item.h
+++ b/fs/xfs/xfs_extfree_item.h
@@ -77,5 +77,6 @@ xfs_efd_log_item_t	*xfs_efd_init(struct xfs_mount *, xfs_efi_log_item_t *,
 int			xfs_efi_copy_format(xfs_log_iovec_t *buf,
 					    xfs_efi_log_format_t *dst_efi_fmt);
 void			xfs_efi_item_free(xfs_efi_log_item_t *);
+void			xfs_efi_release(struct xfs_efi_log_item *);
 
 #endif	/* __XFS_EXTFREE_ITEM_H__ */
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index c674b40..9d8f242 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3766,7 +3766,7 @@ xlog_recover_process_efi(
 			 * free the memory associated with it.
 			 */
 			set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
-			xfs_efi_release(efip, efip->efi_format.efi_nextents);
+			xfs_efi_release(efip);
 			return -EIO;
 		}
 	}
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 3b21b4e..f48e839 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -213,7 +213,6 @@ void		xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint);
 void		xfs_trans_log_buf(xfs_trans_t *, struct xfs_buf *, uint, uint);
 void		xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint);
 struct xfs_efi_log_item	*xfs_trans_get_efi(xfs_trans_t *, uint);
-void		xfs_efi_release(struct xfs_efi_log_item *, uint);
 void		xfs_trans_log_efi_extent(xfs_trans_t *,
 					 struct xfs_efi_log_item *,
 					 xfs_fsblock_t,
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 02/11] xfs: return committed status from xfs_trans_roll()
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
  2015-08-06 17:44 ` [PATCH 01/11] xfs: disentagle EFI release from the extent count Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  7:37   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Some callers need to make error handling decisions based on whether the
current transaction successfully committed or not. Rename
xfs_trans_roll(), add a new parameter and provide a wrapper to preserve
existing callers.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_trans.c | 15 +++++++++++++--
 fs/xfs/xfs_trans.h |  1 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 0582a27..a0ab1da 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -1019,9 +1019,10 @@ xfs_trans_cancel(
  * chunk we've been working on and get a new transaction to continue.
  */
 int
-xfs_trans_roll(
+__xfs_trans_roll(
 	struct xfs_trans	**tpp,
-	struct xfs_inode	*dp)
+	struct xfs_inode	*dp,
+	int			*committed)
 {
 	struct xfs_trans	*trans;
 	struct xfs_trans_res	tres;
@@ -1052,6 +1053,7 @@ xfs_trans_roll(
 	if (error)
 		return error;
 
+	*committed = 1;
 	trans = *tpp;
 
 	/*
@@ -1074,3 +1076,12 @@ xfs_trans_roll(
 		xfs_trans_ijoin(trans, dp, 0);
 	return 0;
 }
+
+int
+xfs_trans_roll(
+	struct xfs_trans	**tpp,
+	struct xfs_inode	*dp)
+{
+	int			committed = 0;
+	return __xfs_trans_roll(tpp, dp, &committed);
+}
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index f48e839..ba1660b 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -225,6 +225,7 @@ void		xfs_trans_log_efd_extent(xfs_trans_t *,
 					 xfs_fsblock_t,
 					 xfs_extlen_t);
 int		xfs_trans_commit(struct xfs_trans *);
+int		__xfs_trans_roll(struct xfs_trans **, struct xfs_inode *, int *);
 int		xfs_trans_roll(struct xfs_trans **, struct xfs_inode *);
 void		xfs_trans_cancel(xfs_trans_t *);
 int		xfs_trans_ail_init(struct xfs_mount *);
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
  2015-08-06 17:44 ` [PATCH 01/11] xfs: disentagle EFI release from the extent count Brian Foster
  2015-08-06 17:44 ` [PATCH 02/11] xfs: return committed status from xfs_trans_roll() Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-07  0:41   ` Dave Chinner
  2015-08-09  7:46   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure Brian Foster
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Freeing an extent in XFS involves logging an EFI (extent free
intention), freeing the actual extent, and logging an EFD (extent free
done). The EFI object is created with a reference count of 2: one for
the current transaction and one for the subsequently created EFD. Under
normal circumstances, the first reference is dropped when the EFI is
unpinned and the second reference is dropped when the EFD is committed
to the on-disk log.

In event of errors or filesystem shutdown, there are various potential
cleanup scenarios depending on the state of the EFI/EFD. The cleanup
scenarios are confusing and racy, as demonstrated by the following test
sequence:

	# mount $dev $mnt
	# fsstress -d $mnt -n 99999 -p 16 -z -f fallocate=1 \
		-f punch=1 -f creat=1 -f unlink=1 &
	# sleep 5
	# killall -9 fsstress; wait
	# godown -f $mnt
	# umount

... in which the final umount can hang due to the AIL being pinned
indefinitely by one or more EFI items. This can occur due to several
conditions. For example, if the shutdown occurs after the EFI is
committed to the on-disk log and the EFD committed to the CIL, but
before the EFD committed to the log, the EFD iop_committed() abort
handler does not drop its reference to the EFI. Alternatively, manual
error injection in the xfs_bmap_finish() codepath shows that if an error
occurs after the EFI transaction is committed but before the EFD is
constructed and logged, the EFI is never released from the AIL.

Update the EFI/EFD item handling code to use a more straightforward and
reliable approach to error handling. If an error occurs after the EFI
transaction is committed and before the EFD is constructed, release the
EFI explicitly from xfs_bmap_finish(). If the EFI transaction is
cancelled, release the EFI in the unlock handler.

Once the EFD is constructed, it is responsible for releasing the EFI
under any circumstances (including whether the EFI item aborts due to
log I/O error). Update the EFD item handlers to release the EFI if the
transaction is cancelled or aborts due to log I/O error. Finally, update
xfs_bmap_finish() to log at least one EFD extent to the transaction
before xfs_free_extent() errors are handled to ensure the transaction is
dirty and EFD item error handling is triggered.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_bmap_util.c    | 84 +++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_extfree_item.c | 71 ++++++++++++++++++++++-----------------
 2 files changed, 90 insertions(+), 65 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0f34886c..fa65f67 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -67,16 +67,15 @@ xfs_fsb_to_db(struct xfs_inode *ip, xfs_fsblock_t fsb)
  */
 int						/* error */
 xfs_bmap_finish(
-	xfs_trans_t		**tp,		/* transaction pointer addr */
-	xfs_bmap_free_t		*flist,		/* i/o: list extents to free */
-	int			*committed)	/* xact committed or not */
+	struct xfs_trans		**tp,	/* transaction pointer addr */
+	struct xfs_bmap_free		*flist,	/* i/o: list extents to free */
+	int				*committed)/* xact committed or not */
 {
-	xfs_efd_log_item_t	*efd;		/* extent free data */
-	xfs_efi_log_item_t	*efi;		/* extent free intention */
-	int			error;		/* error return value */
-	xfs_bmap_free_item_t	*free;		/* free extent item */
-	xfs_mount_t		*mp;		/* filesystem mount structure */
-	xfs_bmap_free_item_t	*next;		/* next item on free list */
+	struct xfs_efd_log_item		*efd;	/* extent free data */
+	struct xfs_efi_log_item		*efi;	/* extent free intention */
+	int				error;	/* error return value */
+	struct xfs_bmap_free_item	*free;	/* free extent item */
+	struct xfs_bmap_free_item	*next;	/* next item on free list */
 
 	ASSERT((*tp)->t_flags & XFS_TRANS_PERM_LOG_RES);
 	if (flist->xbf_count == 0) {
@@ -88,40 +87,55 @@ xfs_bmap_finish(
 		xfs_trans_log_efi_extent(*tp, efi, free->xbfi_startblock,
 			free->xbfi_blockcount);
 
-	error = xfs_trans_roll(tp, NULL);
-	*committed = 1;
-	/*
-	 * We have a new transaction, so we should return committed=1,
-	 * even though we're returning an error.
-	 */
-	if (error)
+	error = __xfs_trans_roll(tp, NULL, committed);
+	if (error) {
+		/*
+		 * If the transaction was committed, drop the EFD reference
+		 * since we're bailing out of here. The other reference is
+		 * dropped when the EFI hits the AIL.
+		 *
+		 * If the transaction was not committed, the EFI is freed by the
+		 * EFI item unlock handler on abort. Also, we have a new
+		 * transaction so we should return committed=1 even though we're
+		 * returning an error.
+		 */
+		if (*committed) {
+			xfs_efi_release(efi);
+			xfs_force_shutdown((*tp)->t_mountp,
+				(error == -EFSCORRUPTED) ?
+					SHUTDOWN_CORRUPT_INCORE :
+					SHUTDOWN_META_IO_ERROR);
+		} else {
+			*committed = 1;
+		}
+
 		return error;
+	}
 
 	efd = xfs_trans_get_efd(*tp, efi, flist->xbf_count);
 	for (free = flist->xbf_first; free != NULL; free = next) {
 		next = free->xbfi_next;
-		if ((error = xfs_free_extent(*tp, free->xbfi_startblock,
-				free->xbfi_blockcount))) {
-			/*
-			 * The bmap free list will be cleaned up at a
-			 * higher level.  The EFI will be canceled when
-			 * this transaction is aborted.
-			 * Need to force shutdown here to make sure it
-			 * happens, since this transaction may not be
-			 * dirty yet.
-			 */
-			mp = (*tp)->t_mountp;
-			if (!XFS_FORCED_SHUTDOWN(mp))
-				xfs_force_shutdown(mp,
-						   (error == -EFSCORRUPTED) ?
-						   SHUTDOWN_CORRUPT_INCORE :
-						   SHUTDOWN_META_IO_ERROR);
-			return error;
-		}
+
+		/*
+		 * Free the extent and log the EFD to dirty the transaction
+		 * before handling errors. This ensures that the transaction is
+		 * aborted, which:
+		 *
+		 * 1.) releases the EFI and frees the EFD
+		 * 2.) shuts down the filesystem
+		 *
+		 * The bmap free list is cleaned up at a higher level.
+		 */
+		error = xfs_free_extent(*tp, free->xbfi_startblock,
+					free->xbfi_blockcount);
 		xfs_trans_log_efd_extent(*tp, efd, free->xbfi_startblock,
-			free->xbfi_blockcount);
+					 free->xbfi_blockcount);
+		if (error)
+			return error;
+
 		xfs_bmap_del_free(flist, NULL, free);
 	}
+
 	return 0;
 }
 
diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index 42c9b05..aceb54f 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -61,9 +61,15 @@ __xfs_efi_release(
 
 	if (atomic_dec_and_test(&efip->efi_refcount)) {
 		spin_lock(&ailp->xa_lock);
-		/* xfs_trans_ail_delete() drops the AIL lock. */
-		xfs_trans_ail_delete(ailp, &efip->efi_item,
-				     SHUTDOWN_LOG_IO_ERROR);
+		/*
+		 * We don't know whether the EFI made it to the AIL. Remove it
+		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
+		 */
+		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
+			xfs_trans_ail_delete(ailp, &efip->efi_item,
+					     SHUTDOWN_LOG_IO_ERROR);
+		else
+			spin_unlock(&ailp->xa_lock);
 		xfs_efi_item_free(efip);
 	}
 }
@@ -128,12 +134,12 @@ xfs_efi_item_pin(
 }
 
 /*
- * While EFIs cannot really be pinned, the unpin operation is the last place at
- * which the EFI is manipulated during a transaction.  If we are being asked to
- * remove the EFI it's because the transaction has been cancelled and by
- * definition that means the EFI cannot be in the AIL so remove it from the
- * transaction and free it.  Otherwise coordinate with xfs_efi_release()
- * to determine who gets to free the EFI.
+ * The unpin operation is the last place an EFI is manipulated in the log. It is
+ * either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the EFI transaction has been successfully committed to make it
+ * this far. Therefore, we expect whoever committed the EFI to either construct
+ * and commit the EFD or drop the EFD's reference in the event of error. Simply
+ * drop the log's EFI reference now that the log is done with it.
  */
 STATIC void
 xfs_efi_item_unpin(
@@ -141,14 +147,6 @@ xfs_efi_item_unpin(
 	int			remove)
 {
 	struct xfs_efi_log_item	*efip = EFI_ITEM(lip);
-
-	if (remove) {
-		ASSERT(!(lip->li_flags & XFS_LI_IN_AIL));
-		if (lip->li_desc)
-			xfs_trans_del_item(lip);
-		xfs_efi_item_free(efip);
-		return;
-	}
 	xfs_efi_release(efip);
 }
 
@@ -167,6 +165,11 @@ xfs_efi_item_push(
 	return XFS_ITEM_PINNED;
 }
 
+/*
+ * The EFI has been either committed or aborted if the transaction has been
+ * cancelled. If the transaction was cancelled, an EFD isn't going to be
+ * constructed and thus we free the EFI here directly.
+ */
 STATIC void
 xfs_efi_item_unlock(
 	struct xfs_log_item	*lip)
@@ -411,20 +414,27 @@ xfs_efd_item_push(
 	return XFS_ITEM_PINNED;
 }
 
+/*
+ * The EFD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the EFI and free the EFD.
+ */
 STATIC void
 xfs_efd_item_unlock(
 	struct xfs_log_item	*lip)
 {
-	if (lip->li_flags & XFS_LI_ABORTED)
-		xfs_efd_item_free(EFD_ITEM(lip));
+	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
+
+	if (lip->li_flags & XFS_LI_ABORTED) {
+		xfs_efi_release(efdp->efd_efip);
+		xfs_efd_item_free(efdp);
+	}
 }
 
 /*
- * When the efd item is committed to disk, all we need to do
- * is delete our reference to our partner efi item and then
- * free ourselves.  Since we're freeing ourselves we must
- * return -1 to keep the transaction code from further referencing
- * this item.
+ * When the efd item is committed to disk, all we need to do is delete our
+ * reference to our partner efi item and then free ourselves. Since we're
+ * freeing ourselves we must return -1 to keep the transaction code from further
+ * referencing this item.
  */
 STATIC xfs_lsn_t
 xfs_efd_item_committed(
@@ -434,14 +444,15 @@ xfs_efd_item_committed(
 	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
 
 	/*
-	 * If we got a log I/O error, it's always the case that the LR with the
-	 * EFI got unpinned and freed before the EFD got aborted.
+	 * Drop the EFI reference regardless of whether the EFD has been
+	 * aborted. Once the EFD transaction is constructed, it is the sole
+	 * responsibility of the EFD to release the EFI (even if the EFI is
+	 * aborted due to log I/O error).
 	 */
-	if (!(lip->li_flags & XFS_LI_ABORTED))
-		xfs_efi_release(efdp->efd_efip);
-
+	xfs_efi_release(efdp->efd_efip);
 	xfs_efd_item_free(efdp);
-	return (xfs_lsn_t)-1;
+
+	return (xfs_lsn_t) -1;
 }
 
 /*
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (2 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  7:53   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 05/11] xfs: use EFI refcount consistently in log recovery Brian Foster
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Log recovery attempts to free extents with leftover EFIs in the AIL
after initial processing. If the extent free fails (e.g., due to
unrelated fs corruption), the transaction is cancelled, though it might
not be dirtied at the time. If this is the case, the EFD does not abort
and thus does not release the EFI. This can lead to hangs as the EFI
pins the AIL.

Update xlog_recover_process_efi() to log the EFD in the transaction
before xfs_free_extent() errors are handled to ensure the transaction is
dirty, aborts the EFD and releases the EFI on error. This is consistent
with how EFDs are handled at runtime in xfs_bmap_finish().

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 9d8f242..3433b7b 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3779,11 +3779,16 @@ xlog_recover_process_efi(
 
 	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
 		extp = &(efip->efi_format.efi_extents[i]);
+		/*
+		 * Log the EFD before error handling to ensure the EFD aborts
+		 * (and releases the EFI) on error.
+		 */
 		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
-		if (error)
-			goto abort_error;
 		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
 					 extp->ext_len);
+		if (error)
+			goto abort_error;
+
 	}
 
 	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 05/11] xfs: use EFI refcount consistently in log recovery
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (3 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-07  1:51   ` Dave Chinner
  2015-08-09  7:56   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure Brian Foster
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

The EFI is initialized with a reference count of 2. One for the EFI to
ensure the item makes it to the AIL and one for the subsequently created
EFD to release the EFI once the EFD is committed. Log recovery uses the
EFI in a similar manner, but implements a hack to remove both references
in one call once the EFD is handled.

Update log recovery to use EFI reference counting in a manner consistent
with the log. When an EFI is encountered during recovery, an EFI item is
allocated and inserted to the AIL directly. Since the EFI reference is
typically dropped when the EFI is unpinned and this is analogous with
AIL insertion, drop the EFI reference at this point.

When a corresponding EFD is encountered in the log, this indicates that
the extents were freed, no processing is required and the EFI can be
dropped. Update xlog_recover_efd_pass2() to simply drop the EFD
reference at this point rather than open code the AIL removal and EFI
free.

Remaining EFIs (i.e., with no corresponding EFD) are processed in
xlog_recover_finish(). An EFD transaction is allocated and the extents
are freed, which transfers ownership of the EFI reference to the EFD
item in the log.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_extfree_item.c | 57 +++++++++++++++++------------------------------
 fs/xfs/xfs_log_recover.c  | 43 ++++++++++++++++++-----------------
 2 files changed, 43 insertions(+), 57 deletions(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index aceb54f..2b2acac 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -47,34 +47,6 @@ xfs_efi_item_free(
 }
 
 /*
- * Freeing the efi requires that we remove it from the AIL if it has already
- * been placed there. However, the EFI may not yet have been placed in the AIL
- * when called by xfs_efi_release() from EFD processing due to the ordering of
- * committed vs unpin operations in bulk insert operations. Hence the reference
- * count to ensure only the last caller frees the EFI.
- */
-STATIC void
-__xfs_efi_release(
-	struct xfs_efi_log_item	*efip)
-{
-	struct xfs_ail		*ailp = efip->efi_item.li_ailp;
-
-	if (atomic_dec_and_test(&efip->efi_refcount)) {
-		spin_lock(&ailp->xa_lock);
-		/*
-		 * We don't know whether the EFI made it to the AIL. Remove it
-		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
-		 */
-		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
-			xfs_trans_ail_delete(ailp, &efip->efi_item,
-					     SHUTDOWN_LOG_IO_ERROR);
-		else
-			spin_unlock(&ailp->xa_lock);
-		xfs_efi_item_free(efip);
-	}
-}
-
-/*
  * This returns the number of iovecs needed to log the given efi item.
  * We only need 1 iovec for an efi item.  It just logs the efi_log_format
  * structure.
@@ -304,20 +276,31 @@ xfs_efi_copy_format(xfs_log_iovec_t *buf, xfs_efi_log_format_t *dst_efi_fmt)
 }
 
 /*
- * This is called by the efd item code below to release references to the given
- * efi item.  Each efd calls this with the number of extents that it has
- * logged, and when the sum of these reaches the total number of extents logged
- * by this efi item we can free the efi item.
+ * Freeing the efi requires that we remove it from the AIL if it has already
+ * been placed there. However, the EFI may not yet have been placed in the AIL
+ * when called by xfs_efi_release() from EFD processing due to the ordering of
+ * committed vs unpin operations in bulk insert operations. Hence the reference
+ * count to ensure only the last caller frees the EFI.
  */
 void
 xfs_efi_release(struct xfs_efi_log_item	*efip)
 {
-	/* recovery needs us to drop the EFI reference, too */
-	if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
-		__xfs_efi_release(efip);
+	struct xfs_ail		*ailp = efip->efi_item.li_ailp;
 
-	__xfs_efi_release(efip);
-	/* efip may now have been freed, do not reference it again. */
+	if (atomic_dec_and_test(&efip->efi_refcount)) {
+		spin_lock(&ailp->xa_lock);
+		/*
+		 * We don't know whether the EFI made it to the AIL. Remove it
+		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
+		 */
+		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
+			xfs_trans_ail_delete(ailp, &efip->efi_item,
+					     SHUTDOWN_LOG_IO_ERROR);
+		else
+			spin_unlock(&ailp->xa_lock);
+
+		xfs_efi_item_free(efip);
+	}
 }
 
 static inline struct xfs_efd_log_item *EFD_ITEM(struct xfs_log_item *lip)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 3433b7b..a74ff68 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2933,16 +2933,16 @@ xlog_recover_efi_pass2(
 	struct xlog_recover_item	*item,
 	xfs_lsn_t			lsn)
 {
-	int			error;
-	xfs_mount_t		*mp = log->l_mp;
-	xfs_efi_log_item_t	*efip;
-	xfs_efi_log_format_t	*efi_formatp;
+	int				error;
+	struct xfs_mount		*mp = log->l_mp;
+	struct xfs_efi_log_item		*efip;
+	struct xfs_efi_log_format	*efi_formatp;
 
 	efi_formatp = item->ri_buf[0].i_addr;
 
 	efip = xfs_efi_init(mp, efi_formatp->efi_nextents);
-	if ((error = xfs_efi_copy_format(&(item->ri_buf[0]),
-					 &(efip->efi_format)))) {
+	error = xfs_efi_copy_format(&(item->ri_buf[0]), &(efip->efi_format));
+	if (error) {
 		xfs_efi_item_free(efip);
 		return error;
 	}
@@ -2950,20 +2950,23 @@ xlog_recover_efi_pass2(
 
 	spin_lock(&log->l_ailp->xa_lock);
 	/*
-	 * xfs_trans_ail_update() drops the AIL lock.
+	 * The EFI has two references. One for the EFD and one for EFI to ensure
+	 * it makes it into the AIL. Insert the EFI into the AIL directly and
+	 * drop the EFI reference. Note that xfs_trans_ail_update() drops the
+	 * AIL lock.
 	 */
 	xfs_trans_ail_update(log->l_ailp, &efip->efi_item, lsn);
+	xfs_efi_release(efip);
 	return 0;
 }
 
 
 /*
- * This routine is called when an efd format structure is found in
- * a committed transaction in the log.  It's purpose is to cancel
- * the corresponding efi if it was still in the log.  To do this
- * it searches the AIL for the efi with an id equal to that in the
- * efd format structure.  If we find it, we remove the efi from the
- * AIL and free it.
+ * This routine is called when an EFD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding EFI if it
+ * was still in the log. To do this it searches the AIL for the EFI with an id
+ * equal to that in the EFD format structure. If we find it we drop the EFD
+ * reference, which removes the EFI from the AIL and frees it.
  */
 STATIC int
 xlog_recover_efd_pass2(
@@ -2985,8 +2988,8 @@ xlog_recover_efd_pass2(
 	efi_id = efd_formatp->efd_efi_id;
 
 	/*
-	 * Search for the efi with the id in the efd format structure
-	 * in the AIL.
+	 * Search for the EFI with the id in the EFD format structure in the
+	 * AIL.
 	 */
 	spin_lock(&ailp->xa_lock);
 	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
@@ -2995,18 +2998,18 @@ xlog_recover_efd_pass2(
 			efip = (xfs_efi_log_item_t *)lip;
 			if (efip->efi_format.efi_id == efi_id) {
 				/*
-				 * xfs_trans_ail_delete() drops the
-				 * AIL lock.
+				 * Drop the EFD reference to the EFI. This
+				 * removes the EFI from the AIL and frees it.
 				 */
-				xfs_trans_ail_delete(ailp, lip,
-						     SHUTDOWN_CORRUPT_INCORE);
-				xfs_efi_item_free(efip);
+				spin_unlock(&ailp->xa_lock);
+				xfs_efi_release(efip);
 				spin_lock(&ailp->xa_lock);
 				break;
 			}
 		}
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
 	}
+
 	xfs_trans_ail_cursor_done(&cur);
 	spin_unlock(&ailp->xa_lock);
 
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (4 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 05/11] xfs: use EFI refcount consistently in log recovery Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-07  2:23   ` Dave Chinner
  2015-08-09  8:01   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints Brian Foster
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Log recovery occurs in two phases at mount time. In the first phase,
EFIs and EFDs are processed and potentially cancelled out. EFIs without
EFD objects are inserted into the AIL for processing and recovery in the
second phase. xfs_mountfs() runs various other operations between the
phases and is thus subject to failure. If failure occurs after the first
phase but before the second, pending EFIs sit on the AIL, pin it and
cause the mount to hang.

Update the mount sequence to ensure that pending EFIs are cancelled in
the event of failure. Add a recovery cancellation mechanism to iterate
the AIL and cancel all EFI items when requested. Plumb cancellation
support through the log mount finish helper and update xfs_mountfs() to
invoke cancellation in the event of failure after recovery has started.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c         | 16 +++++++++++-----
 fs/xfs/xfs_log.h         |  2 +-
 fs/xfs/xfs_log_priv.h    |  2 ++
 fs/xfs/xfs_log_recover.c | 41 ++++++++++++++++++++++++++++++++++++-----
 fs/xfs/xfs_mount.c       | 35 +++++++++++++++++++++++------------
 5 files changed, 73 insertions(+), 23 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 6b5a84a..522286c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -740,19 +740,25 @@ out:
  * it.
  */
 int
-xfs_log_mount_finish(xfs_mount_t *mp)
+xfs_log_mount_finish(
+	struct xfs_mount	*mp,
+	bool			cancel)
 {
 	int	error = 0;
 
-	if (!(mp->m_flags & XFS_MOUNT_NORECOVERY)) {
+	if (mp->m_flags & XFS_MOUNT_NORECOVERY) {
+		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
+		return 0;
+	}
+
+	if (cancel) {
+		error = xlog_recover_cancel(mp->m_log);
+	} else {
 		error = xlog_recover_finish(mp->m_log);
 		if (!error)
 			xfs_log_work_queue(mp);
-	} else {
-		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
 	}
 
-
 	return error;
 }
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index fa27aae..425e880 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -146,7 +146,7 @@ int	  xfs_log_mount(struct xfs_mount	*mp,
 			struct xfs_buftarg	*log_target,
 			xfs_daddr_t		start_block,
 			int		 	num_bblocks);
-int	  xfs_log_mount_finish(struct xfs_mount *mp);
+int	  xfs_log_mount_finish(struct xfs_mount *mp, bool);
 xfs_lsn_t xlog_assign_tail_lsn(struct xfs_mount *mp);
 xfs_lsn_t xlog_assign_tail_lsn_locked(struct xfs_mount *mp);
 void	  xfs_log_space_wake(struct xfs_mount *mp);
diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index 1c87c8a..950f3f9 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -426,6 +426,8 @@ xlog_recover(
 extern int
 xlog_recover_finish(
 	struct xlog		*log);
+extern int
+xlog_recover_cancel(struct xlog *);
 
 extern __le32	 xlog_cksum(struct xlog *log, struct xlog_rec_header *rhead,
 			    char *dp, int size);
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index a74ff68..a7ba078 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3823,10 +3823,11 @@ abort_error:
  */
 STATIC int
 xlog_recover_process_efis(
-	struct xlog	*log)
+	struct xlog		*log,
+	bool			cancel)
 {
-	xfs_log_item_t		*lip;
-	xfs_efi_log_item_t	*efip;
+	struct xfs_log_item	*lip;
+	struct xfs_efi_log_item	*efip;
 	int			error = 0;
 	struct xfs_ail_cursor	cur;
 	struct xfs_ail		*ailp;
@@ -3847,10 +3848,24 @@ xlog_recover_process_efis(
 			break;
 		}
 
+		efip = (struct xfs_efi_log_item *) lip;
+
+		/*
+		 * A cancel occurs when the mount has failed and we're bailing
+		 * out. Release all pending EFIs so they don't pin the AIL.
+		 */
+		if (cancel) {
+			spin_unlock(&ailp->xa_lock);
+			xfs_efi_release(efip);
+			spin_lock(&ailp->xa_lock);
+
+			lip = xfs_trans_ail_cursor_next(ailp, &cur);
+			continue;
+		}
+
 		/*
 		 * Skip EFIs that we've already processed.
 		 */
-		efip = (xfs_efi_log_item_t *)lip;
 		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags)) {
 			lip = xfs_trans_ail_cursor_next(ailp, &cur);
 			continue;
@@ -4617,8 +4632,13 @@ xlog_recover_finish(
 	 */
 	if (log->l_flags & XLOG_RECOVERY_NEEDED) {
 		int	error;
-		error = xlog_recover_process_efis(log);
+		error = xlog_recover_process_efis(log, false);
 		if (error) {
+			/*
+			 * We could still have pending EFIs. Cancel them so we
+			 * don't hang...
+			 */
+			xlog_recover_process_efis(log, true);
 			xfs_alert(log->l_mp, "Failed to recover EFIs");
 			return error;
 		}
@@ -4644,6 +4664,17 @@ xlog_recover_finish(
 	return 0;
 }
 
+int
+xlog_recover_cancel(
+	struct xlog	*log)
+{
+	int		error = 0;
+
+	if (log->l_flags & XLOG_RECOVERY_NEEDED)
+		error = xlog_recover_process_efis(log, true);
+
+	return error;
+}
 
 #if defined(DEBUG)
 /*
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 461e791..45a33d5 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -615,14 +615,15 @@ xfs_default_resblks(xfs_mount_t *mp)
  */
 int
 xfs_mountfs(
-	xfs_mount_t	*mp)
+	struct xfs_mount	*mp)
 {
-	xfs_sb_t	*sbp = &(mp->m_sb);
-	xfs_inode_t	*rip;
-	__uint64_t	resblks;
-	uint		quotamount = 0;
-	uint		quotaflags = 0;
-	int		error = 0;
+	struct xfs_sb		*sbp = &(mp->m_sb);
+	struct xfs_inode	*rip;
+	__uint64_t		resblks;
+	uint			quotamount = 0;
+	uint			quotaflags = 0;
+	int			error = 0;
+	bool			log_mount_cancel = false;
 
 	xfs_sb_mount_common(mp, sbp);
 
@@ -799,7 +800,10 @@ xfs_mountfs(
 	}
 
 	/*
-	 * log's mount-time initialization. Perform 1st part recovery if needed
+	 * Log's mount-time initialization. The first part of recovery can place
+	 * some items on the AIL, to be handled when recovery is finished. Set
+	 * the cancel flag to ensure that the recovery is cancelled should we
+	 * fail before then.
 	 */
 	error = xfs_log_mount(mp, mp->m_logdev_targp,
 			      XFS_FSB_TO_DADDR(mp, sbp->sb_logstart),
@@ -808,6 +812,7 @@ xfs_mountfs(
 		xfs_warn(mp, "log mount failed");
 		goto out_fail_wait;
 	}
+	log_mount_cancel = true;
 
 	/*
 	 * Now the log is mounted, we know if it was an unclean shutdown or
@@ -910,11 +915,15 @@ xfs_mountfs(
 	}
 
 	/*
-	 * Finish recovering the file system.  This part needed to be
-	 * delayed until after the root and real-time bitmap inodes
-	 * were consistently read in.
+	 * Finish recovering the file system.  This part needed to be delayed
+	 * until after the root and real-time bitmap inodes were consistently
+	 * read in.
+	 *
+	 * Reset the cancel flag as the finish cleans up after itself on
+	 * failure.
 	 */
-	error = xfs_log_mount_finish(mp);
+	log_mount_cancel = false;
+	error = xfs_log_mount_finish(mp, false);
 	if (error) {
 		xfs_warn(mp, "log mount finish failed");
 		goto out_rtunmount;
@@ -956,6 +965,8 @@ xfs_mountfs(
  out_rele_rip:
 	IRELE(rip);
  out_log_dealloc:
+	if (log_mount_cancel)
+		xfs_log_mount_finish(mp, true);
 	xfs_log_unmount(mp);
  out_fail_wait:
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (5 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  8:02   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 08/11] xfs: fix broken icreate log item cancellation Brian Foster
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Various log items have recovery tracepoints to identify whether a
particular log item is recovered or cancelled. Add the equivalent
tracepoints for the icreate transaction.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c |  5 ++++-
 fs/xfs/xfs_trace.h       | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index a7ba078..76248bf 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3105,9 +3105,12 @@ xlog_recover_do_icreate_pass2(
 	 * done easily.
 	 */
 	if (xlog_check_buffer_cancelled(log,
-			XFS_AGB_TO_DADDR(mp, agno, agbno), length, 0))
+			XFS_AGB_TO_DADDR(mp, agno, agbno), length, 0)) {
+		trace_xfs_log_recover_icreate_cancel(log, icl);
 		return 0;
+	}
 
+	trace_xfs_log_recover_icreate_recover(log, icl);
 	xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno, length,
 			      be32_to_cpu(icl->icl_gen));
 	return 0;
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 8d916d3..9aeeb21 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -2089,6 +2089,40 @@ DEFINE_LOG_RECOVER_INO_ITEM(xfs_log_recover_inode_recover);
 DEFINE_LOG_RECOVER_INO_ITEM(xfs_log_recover_inode_cancel);
 DEFINE_LOG_RECOVER_INO_ITEM(xfs_log_recover_inode_skip);
 
+DECLARE_EVENT_CLASS(xfs_log_recover_icreate_item_class,
+	TP_PROTO(struct xlog *log, struct xfs_icreate_log *in_f),
+	TP_ARGS(log, in_f),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_agnumber_t, agno)
+		__field(xfs_agblock_t, agbno)
+		__field(unsigned int, count)
+		__field(unsigned int, isize)
+		__field(xfs_agblock_t, length)
+		__field(unsigned int, gen)
+	),
+	TP_fast_assign(
+		__entry->dev = log->l_mp->m_super->s_dev;
+		__entry->agno = be32_to_cpu(in_f->icl_ag);
+		__entry->agbno = be32_to_cpu(in_f->icl_agbno);
+		__entry->count = be32_to_cpu(in_f->icl_count);
+		__entry->isize = be32_to_cpu(in_f->icl_isize);
+		__entry->length = be32_to_cpu(in_f->icl_length);
+		__entry->gen = be32_to_cpu(in_f->icl_gen);
+	),
+	TP_printk("dev %d:%d agno %u agbno %u count %u isize %u length %u "
+		  "gen %u", MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->agno, __entry->agbno, __entry->count, __entry->isize,
+		  __entry->length, __entry->gen)
+)
+#define DEFINE_LOG_RECOVER_ICREATE_ITEM(name) \
+DEFINE_EVENT(xfs_log_recover_icreate_item_class, name, \
+	TP_PROTO(struct xlog *log, struct xfs_icreate_log *in_f), \
+	TP_ARGS(log, in_f))
+
+DEFINE_LOG_RECOVER_ICREATE_ITEM(xfs_log_recover_icreate_cancel);
+DEFINE_LOG_RECOVER_ICREATE_ITEM(xfs_log_recover_icreate_recover);
+
 DECLARE_EVENT_CLASS(xfs_discard_class,
 	TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno,
 		 xfs_agblock_t agbno, xfs_extlen_t len),
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/11] xfs: fix broken icreate log item cancellation
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (6 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-06 17:44 ` [PATCH 09/11] xfs: checksum log record ext headers based on record size Brian Foster
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

Inode cluster buffers are invalidated and cancelled when inode chunks
are freed to notify log recovery that previous logged updates to the
metadata buffer should be skipped. This ensures that log recovery does
not overwrite buffers that might have already been reused.

On v4 filesystems, inode chunk allocation and inode updates are logged
via the cluster buffers and thus cancellation is easily detected via
buffer cancellation items. v5 filesystems use the new icreate
transaction, which uses logical logging and ordered buffers to log a
full inode chunk allocation at once. The resulting icreate item often
spans multiple inode cluster buffers.

Log recovery checks for cancelled buffers when processing icreate log
items, but it has a couple problems. First, it uses the full length of
the inode chunk rather than the cluster size. Second, it uses the length
in FSB units rather than BB units. Either of these problems prevent
icreate recovery from identifying cancelled buffers and thus inode
initialization proceeds unconditionally.

Update xlog_recover_do_icreate_pass2() to iterate the icreate range in
cluster sized increments and check each increment for cancellation.
Since icreate is currently only used for the minimum atomic inode chunk
allocation, we expect that either all or none of the buffers will be
cancelled. Cancel the icreate if at least one buffer is cancelled to
avoid making a bad situation worse by initializing a partial inode
chunk, but detect such anomalies and warn the user.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 49 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 76248bf..62a9b57 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3037,6 +3037,11 @@ xlog_recover_do_icreate_pass2(
 	unsigned int		count;
 	unsigned int		isize;
 	xfs_agblock_t		length;
+	int			blks_per_cluster;
+	int			bb_per_cluster;
+	int			cancel_count;
+	int			nbufs;
+	int			i;
 
 	icl = (struct xfs_icreate_log *)item->ri_buf[0].i_addr;
 	if (icl->icl_type != XFS_LI_ICREATE) {
@@ -3095,25 +3100,45 @@ xlog_recover_do_icreate_pass2(
 	}
 
 	/*
-	 * Inode buffers can be freed. Do not replay the inode initialisation as
-	 * we could be overwriting something written after this inode buffer was
-	 * cancelled.
+	 * The icreate transaction can cover multiple cluster buffers and these
+	 * buffers could have been freed and reused. Check the individual
+	 * buffers for cancellation so we don't overwrite anything written after
+	 * a cancellation.
+	 */
+	blks_per_cluster = xfs_icluster_size_fsb(mp);
+	bb_per_cluster = XFS_FSB_TO_BB(mp, blks_per_cluster);
+	nbufs = length / blks_per_cluster;
+	for (i = 0, cancel_count = 0; i < nbufs; i++) {
+		xfs_daddr_t	daddr;
+
+		daddr = XFS_AGB_TO_DADDR(mp, agno,
+					 agbno + i * blks_per_cluster);
+		if (xlog_check_buffer_cancelled(log, daddr, bb_per_cluster, 0))
+			cancel_count++;
+	}
+
+	/*
+	 * We currently only use icreate for a single allocation at a time. This
+	 * means we should expect either all or none of the buffers to be
+	 * cancelled. Be conservative and skip replay if at least one buffer is
+	 * cancelled, but warn the user that something is awry if the buffers
+	 * are not consistent.
 	 *
-	 * XXX: we need to iterate all buffers and only init those that are not
-	 * cancelled. I think that a more fine grained factoring of
-	 * xfs_ialloc_inode_init may be appropriate here to enable this to be
-	 * done easily.
+	 * XXX: This must be refined to only skip cancelled clusters once we use
+	 * icreate for multiple chunk allocations.
 	 */
-	if (xlog_check_buffer_cancelled(log,
-			XFS_AGB_TO_DADDR(mp, agno, agbno), length, 0)) {
+	ASSERT(!cancel_count || cancel_count == nbufs);
+	if (cancel_count) {
+		if (cancel_count != nbufs)
+			xfs_warn(mp,
+	"WARNING: partial inode chunk cancellation, skipped icreate.");
 		trace_xfs_log_recover_icreate_cancel(log, icl);
 		return 0;
 	}
 
 	trace_xfs_log_recover_icreate_recover(log, icl);
-	xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno, length,
-			      be32_to_cpu(icl->icl_gen));
-	return 0;
+	return xfs_ialloc_inode_init(mp, NULL, buffer_list, count, agno, agbno,
+				     length, be32_to_cpu(icl->icl_gen));
 }
 
 STATIC void
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/11] xfs: checksum log record ext headers based on record size
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (7 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 08/11] xfs: fix broken icreate log item cancellation Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-06 17:44 ` [PATCH 10/11] xfs: clean up root inode properly on mount failure Brian Foster
  2015-08-06 17:44 ` [PATCH 11/11] xfs: fix btree cursor error cleanups Brian Foster
  10 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

The first 4 bytes of every basic block in the physical log is stamped
with the current lsn. To support this mechanism, the log record header
(first block of each new log record) contains space for the original
first byte of each log record block before it is replaced with the lsn.
The log record header has space for 32k worth of blocks. The version 2
log adds new extended record headers for each additional 32k worth of
blocks beyond what is supported by the record header.

The log record checksum incorporates the log record header, the extended
headers and the record payload. xlog_cksum() checksums the extended
headers based on log->l_iclog_heads, which specifies the number of
extended headers in a log record based on the log buffer size mount
option. The log buffer size is variable, however, and thus means the
checksum can be calculated differently based on how a filesystem is
mounted. This is problematic if a filesystem crashes and recovery occurs
on a subsequent mount using a different log buffer size. For example,
crash an active filesystem that is mounted with the default (32k)
logbsize, attempt remount/recovery using '-o logbsize=64k' and the mount
fails on or warns about log checksum failures.

To avoid this problem, update xlog_cksum() to calculate the checksum
based on the size of the log buffer according to the log record. The
size is already included in the h_size field of the log record header
and thus is available at log recovery time. Extended log record headers
are also only written when the log record is large enough to require
them. This makes checksum calculation of log records consistent with the
extended record header mechanism as well as how on-disk records are
checksummed with various log buffer size mount options.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 522286c..49cc57e 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1660,8 +1660,13 @@ xlog_cksum(
 	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
 		union xlog_in_core2 *xhdr = (union xlog_in_core2 *)rhead;
 		int		i;
+		int		xheads;

-		for (i = 1; i < log->l_iclog_heads; i++) {
+		xheads = size / XLOG_HEADER_CYCLE_SIZE;
+		if (size % XLOG_HEADER_CYCLE_SIZE)
+			xheads++;
+
+		for (i = 1; i < xheads; i++) {
 			crc = crc32c(crc, &xhdr[i].hic_xheader,
 				     sizeof(struct xlog_rec_ext_header));
 		}
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/11] xfs: clean up root inode properly on mount failure
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (8 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 09/11] xfs: checksum log record ext headers based on record size Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  8:03   ` Christoph Hellwig
  2015-08-06 17:44 ` [PATCH 11/11] xfs: fix btree cursor error cleanups Brian Foster
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

The root inode is read as part of the xfs_mountfs() sequence and the
reference is dropped in the event of failure after we grab the inode.
The reference drop doesn't necessarily free the inode, however. It marks
it for reclaim and potentially kicks off the reclaim workqueue.  The
workqueue is destroyed further up the error path, which means we are
subject to crash if the workqueue job runs after this point or a memory
leak which is identified if the xfs_inode_zone is destroyed (e.g., on
module removal). Both of these outcomes are reproducible via manual
instrumentation of a mount error after the root inode xfs_iget() call in
xfs_mountfs().

Update the xfs_mountfs() error path to cancel any potential reclaim work
items and to run a synchronous inode reclaim if the root inode is marked
for reclaim. This ensures that no jobs remain on the queue before it is
destroyed and that the root inode is freed before the reclaim mechanism
is torn down.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_mount.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 45a33d5..bcdd3bf 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -964,6 +964,8 @@ xfs_mountfs(
 	xfs_rtunmount_inodes(mp);
  out_rele_rip:
 	IRELE(rip);
+	cancel_delayed_work_sync(&mp->m_reclaim_work);
+	xfs_reclaim_inodes(mp, SYNC_WAIT);
  out_log_dealloc:
 	if (log_mount_cancel)
 		xfs_log_mount_finish(mp, true);
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 11/11] xfs: fix btree cursor error cleanups
  2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
                   ` (9 preceding siblings ...)
  2015-08-06 17:44 ` [PATCH 10/11] xfs: clean up root inode properly on mount failure Brian Foster
@ 2015-08-06 17:44 ` Brian Foster
  2015-08-09  8:03   ` Christoph Hellwig
  10 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-06 17:44 UTC (permalink / raw)
  To: xfs

The btree cursor cleanup function takes an error parameter that affects
how buffers are released from the cursor. All buffers are released in
the event of error. Several callers do not specify the XFS_BTREE_ERROR
flag in the event of error, however. This can cause buffers to hang
around locked or with an elevated hold count and thus lead to umount
hangs in the event of errors.

Fix up the xfs_btree_del_cursor() callers to pass XFS_BTREE_ERROR if the
cursor is being torn down due to error.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_ialloc.c | 2 +-
 fs/xfs/xfs_itable.c        | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index ce63e04..54deb2d 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -2233,7 +2233,7 @@ xfs_imap_lookup(
 	}
 
 	xfs_trans_brelse(tp, agbp);
-	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+	xfs_btree_del_cursor(cur, error ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index f41b0c3..930ebd8 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -473,7 +473,8 @@ xfs_bulkstat(
 		 * pending error, then we are done.
 		 */
 del_cursor:
-		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		xfs_btree_del_cursor(cur, error ?
+					  XFS_BTREE_ERROR : XFS_BTREE_NOERROR);
 		xfs_buf_relse(agbp);
 		if (error)
 			break;
-- 
2.1.0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs
  2015-08-06 17:44 ` [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
@ 2015-08-07  0:41   ` Dave Chinner
  2015-08-07 12:09     ` Brian Foster
  2015-08-09  7:46   ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2015-08-07  0:41 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:24PM -0400, Brian Foster wrote:
....
> Update the EFI/EFD item handling code to use a more straightforward and
> reliable approach to error handling. If an error occurs after the EFI
> transaction is committed and before the EFD is constructed, release the
> EFI explicitly from xfs_bmap_finish(). If the EFI transaction is
> cancelled, release the EFI in the unlock handler.
> 
> Once the EFD is constructed, it is responsible for releasing the EFI
> under any circumstances (including whether the EFI item aborts due to
> log I/O error).

Can you document these rules in a comment at the top of
fs/xfs/xfs_extfree_item.c?

> diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> index 42c9b05..aceb54f 100644
> --- a/fs/xfs/xfs_extfree_item.c
> +++ b/fs/xfs/xfs_extfree_item.c
> @@ -61,9 +61,15 @@ __xfs_efi_release(
>  
>  	if (atomic_dec_and_test(&efip->efi_refcount)) {
>  		spin_lock(&ailp->xa_lock);
> -		/* xfs_trans_ail_delete() drops the AIL lock. */
> -		xfs_trans_ail_delete(ailp, &efip->efi_item,
> -				     SHUTDOWN_LOG_IO_ERROR);
> +		/*
> +		 * We don't know whether the EFI made it to the AIL. Remove it
> +		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
> +		 */
> +		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
> +			xfs_trans_ail_delete(ailp, &efip->efi_item,
> +					     SHUTDOWN_LOG_IO_ERROR);
> +		else
> +			spin_unlock(&ailp->xa_lock);
>  		xfs_efi_item_free(efip);
>  	}

We now have a lot of this pattern:

	spin_lock(&ailp->xa_lock);
	if (lip->li_flags & XFS_LI_IN_AIL)
		xfs_trans_ail_delete(ailp, lip, shutdown_reason);
	else
		spin_unlock(&ailp->xa_lock);

occurring in the code. Can you look into adding a helper function
for this, say xfs_trans_ail_remove()? (separate patch, of course!)

The rest of the patch looks fine - getting rid of almost all of
those aborted flag special cases is a big win :)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/11] xfs: use EFI refcount consistently in log recovery
  2015-08-06 17:44 ` [PATCH 05/11] xfs: use EFI refcount consistently in log recovery Brian Foster
@ 2015-08-07  1:51   ` Dave Chinner
  2015-08-07 12:09     ` Brian Foster
  2015-08-09  7:56   ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2015-08-07  1:51 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:26PM -0400, Brian Foster wrote:
> @@ -2933,16 +2933,16 @@ xlog_recover_efi_pass2(
>  	struct xlog_recover_item	*item,
>  	xfs_lsn_t			lsn)
>  {
> -	int			error;
> -	xfs_mount_t		*mp = log->l_mp;
> -	xfs_efi_log_item_t	*efip;
> -	xfs_efi_log_format_t	*efi_formatp;
> +	int				error;
> +	struct xfs_mount		*mp = log->l_mp;
> +	struct xfs_efi_log_item		*efip;
> +	struct xfs_efi_log_format	*efi_formatp;
>  
>  	efi_formatp = item->ri_buf[0].i_addr;
>  
>  	efip = xfs_efi_init(mp, efi_formatp->efi_nextents);
> -	if ((error = xfs_efi_copy_format(&(item->ri_buf[0]),
> -					 &(efip->efi_format)))) {
> +	error = xfs_efi_copy_format(&(item->ri_buf[0]), &(efip->efi_format));
> +	if (error) {

You can kill the extra (..) around the variables here.

....

> @@ -2995,18 +2998,18 @@ xlog_recover_efd_pass2(
>  			efip = (xfs_efi_log_item_t *)lip;
>  			if (efip->efi_format.efi_id == efi_id) {
>  				/*
> -				 * xfs_trans_ail_delete() drops the
> -				 * AIL lock.
> +				 * Drop the EFD reference to the EFI. This
> +				 * removes the EFI from the AIL and frees it.
>  				 */
> -				xfs_trans_ail_delete(ailp, lip,
> -						     SHUTDOWN_CORRUPT_INCORE);
> -				xfs_efi_item_free(efip);
> +				spin_unlock(&ailp->xa_lock);
> +				xfs_efi_release(efip);
>  				spin_lock(&ailp->xa_lock);

Need to call xfs_efi_release() outside the ailp->xa_lock as
xfs_efi_release() now does the removal of the log item from the AIL
and so can deadlock when taking the ailp->xa_lock.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure
  2015-08-06 17:44 ` [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure Brian Foster
@ 2015-08-07  2:23   ` Dave Chinner
  2015-08-07 12:10     ` Brian Foster
  2015-08-09  8:01   ` Christoph Hellwig
  1 sibling, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2015-08-07  2:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:27PM -0400, Brian Foster wrote:
> Log recovery occurs in two phases at mount time. In the first phase,
> EFIs and EFDs are processed and potentially cancelled out. EFIs without
> EFD objects are inserted into the AIL for processing and recovery in the
> second phase. xfs_mountfs() runs various other operations between the
> phases and is thus subject to failure. If failure occurs after the first
> phase but before the second, pending EFIs sit on the AIL, pin it and
> cause the mount to hang.
> 
> Update the mount sequence to ensure that pending EFIs are cancelled in
> the event of failure. Add a recovery cancellation mechanism to iterate
> the AIL and cancel all EFI items when requested. Plumb cancellation
> support through the log mount finish helper and update xfs_mountfs() to
> invoke cancellation in the event of failure after recovery has started.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_log.c         | 16 +++++++++++-----
>  fs/xfs/xfs_log.h         |  2 +-
>  fs/xfs/xfs_log_priv.h    |  2 ++
>  fs/xfs/xfs_log_recover.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  fs/xfs/xfs_mount.c       | 35 +++++++++++++++++++++++------------
>  5 files changed, 73 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 6b5a84a..522286c 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -740,19 +740,25 @@ out:
>   * it.
>   */
>  int
> -xfs_log_mount_finish(xfs_mount_t *mp)
> +xfs_log_mount_finish(
> +	struct xfs_mount	*mp,
> +	bool			cancel)
>  {
>  	int	error = 0;
>  
> -	if (!(mp->m_flags & XFS_MOUNT_NORECOVERY)) {
> +	if (mp->m_flags & XFS_MOUNT_NORECOVERY) {
> +		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> +		return 0;
> +	}
> +
> +	if (cancel) {
> +		error = xlog_recover_cancel(mp->m_log);
> +	} else {
>  		error = xlog_recover_finish(mp->m_log);
>  		if (!error)
>  			xfs_log_work_queue(mp);
> -	} else {
> -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
>  	}
>  
> -
>  	return error;
>  }

As I mentioned on #xfs, I don't think the error/cancel path needs to
go through this function. Add a new function xfs_log_mount_cancel(),
and put the contents of xlog_recover_cancel() inside it, along with
the log unmount function...

(General rule: prefix "xfs_log_....()" is for functions that call
into the log from outside, passing a struct xfs_mount. prefix
"xlog_...()" is for internal calls, passing a struct xlog.)

> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index a74ff68..a7ba078 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3823,10 +3823,11 @@ abort_error:
>   */
>  STATIC int
>  xlog_recover_process_efis(
> -	struct xlog	*log)
> +	struct xlog		*log,
> +	bool			cancel)
>  {
> -	xfs_log_item_t		*lip;
> -	xfs_efi_log_item_t	*efip;
> +	struct xfs_log_item	*lip;
> +	struct xfs_efi_log_item	*efip;
>  	int			error = 0;
>  	struct xfs_ail_cursor	cur;
>  	struct xfs_ail		*ailp;
> @@ -3847,10 +3848,24 @@ xlog_recover_process_efis(
>  			break;
>  		}
>  
> +		efip = (struct xfs_efi_log_item *) lip;
> +
> +		/*
> +		 * A cancel occurs when the mount has failed and we're bailing
> +		 * out. Release all pending EFIs so they don't pin the AIL.
> +		 */
> +		if (cancel) {
> +			spin_unlock(&ailp->xa_lock);
> +			xfs_efi_release(efip);
> +			spin_lock(&ailp->xa_lock);
> +
> +			lip = xfs_trans_ail_cursor_next(ailp, &cur);
> +			continue;
> +		}

This might be better to separate into a xlog_recover_cancel_efis()
function because this is much simpler than the rest of the loop.
It may be more code, but its simpler to read and understand.

> +
>  		/*
>  		 * Skip EFIs that we've already processed.
>  		 */
> -		efip = (xfs_efi_log_item_t *)lip;
>  		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags)) {
>  			lip = xfs_trans_ail_cursor_next(ailp, &cur);
>  			continue;
> @@ -4617,8 +4632,13 @@ xlog_recover_finish(
>  	 */
>  	if (log->l_flags & XLOG_RECOVERY_NEEDED) {
>  		int	error;
> -		error = xlog_recover_process_efis(log);
> +		error = xlog_recover_process_efis(log, false);
>  		if (error) {
> +			/*
> +			 * We could still have pending EFIs. Cancel them so we
> +			 * don't hang...
> +			 */
> +			xlog_recover_process_efis(log, true);

Not actually necessary. We don't clear the XLOG_RECOVERY_NEEDED
flag, so when the error stack is run in xfs_mountfs(), we will
call xfs_log_mount_cancel(), it will see the XLOG_RECOVERY_NEEDED
and run xlog_recover_cancel_efis()....

>  			xfs_alert(log->l_mp, "Failed to recover EFIs");
>  			return error;
>  		}
> @@ -4644,6 +4664,17 @@ xlog_recover_finish(
>  	return 0;
>  }
>  
> +int
> +xlog_recover_cancel(
> +	struct xlog	*log)
> +{
> +	int		error = 0;
> +
> +	if (log->l_flags & XLOG_RECOVERY_NEEDED)
> +		error = xlog_recover_process_efis(log, true);
> +
> +	return error;
> +}

void
xfs_log_mount_cancel(
	struct xfs_mount	*mp)
{
	int		error = 0;

	if (log->l_flags & XLOG_RECOVERY_NEEDED)
		xlog_recover_cancel_efis(log);

	xfs_log_unmount(mp);
	return error;
}

> @@ -799,7 +800,10 @@ xfs_mountfs(
>  	}
>  
>  	/*
> -	 * log's mount-time initialization. Perform 1st part recovery if needed
> +	 * Log's mount-time initialization. The first part of recovery can place
> +	 * some items on the AIL, to be handled when recovery is finished. Set
> +	 * the cancel flag to ensure that the recovery is cancelled should we
> +	 * fail before then.
>  	 */
>  	error = xfs_log_mount(mp, mp->m_logdev_targp,
>  			      XFS_FSB_TO_DADDR(mp, sbp->sb_logstart),
> @@ -808,6 +812,7 @@ xfs_mountfs(
>  		xfs_warn(mp, "log mount failed");
>  		goto out_fail_wait;
>  	}
> +	log_mount_cancel = true;

At this point, XLOG_RECOVERY_NEEDED will be set if EFI processing is
required, so a variable to track this here is not needed.

> @@ -910,11 +915,15 @@ xfs_mountfs(
>  	}
>  
>  	/*
> -	 * Finish recovering the file system.  This part needed to be
> -	 * delayed until after the root and real-time bitmap inodes
> -	 * were consistently read in.
> +	 * Finish recovering the file system.  This part needed to be delayed
> +	 * until after the root and real-time bitmap inodes were consistently
> +	 * read in.
> +	 *
> +	 * Reset the cancel flag as the finish cleans up after itself on
> +	 * failure.
>  	 */
> -	error = xfs_log_mount_finish(mp);
> +	log_mount_cancel = false;
> +	error = xfs_log_mount_finish(mp, false);
>  	if (error) {
>  		xfs_warn(mp, "log mount finish failed");
>  		goto out_rtunmount;

xfs_log_mount_finish() will clear the XLOG_RECOVERY_NEEDED. On
error, we jump into the error stack above the call to
xfs_log_mount_cancel(), so it will handle the cleanup....

> @@ -956,6 +965,8 @@ xfs_mountfs(
>   out_rele_rip:
>  	IRELE(rip);
>   out_log_dealloc:
> +	if (log_mount_cancel)
> +		xfs_log_mount_finish(mp, true);
>  	xfs_log_unmount(mp);

And this just becomes a call to xfs_log_mount_cancel().

I hope this is a bit clearer that mud ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs
  2015-08-07  0:41   ` Dave Chinner
@ 2015-08-07 12:09     ` Brian Foster
  2015-08-07 22:45       ` Dave Chinner
  0 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-07 12:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Fri, Aug 07, 2015 at 10:41:00AM +1000, Dave Chinner wrote:
> On Thu, Aug 06, 2015 at 01:44:24PM -0400, Brian Foster wrote:
> ....
> > Update the EFI/EFD item handling code to use a more straightforward and
> > reliable approach to error handling. If an error occurs after the EFI
> > transaction is committed and before the EFD is constructed, release the
> > EFI explicitly from xfs_bmap_finish(). If the EFI transaction is
> > cancelled, release the EFI in the unlock handler.
> > 
> > Once the EFD is constructed, it is responsible for releasing the EFI
> > under any circumstances (including whether the EFI item aborts due to
> > log I/O error).
> 
> Can you document these rules in a comment at the top of
> fs/xfs/xfs_extfree_item.c?
> 

Sure.

> > diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> > index 42c9b05..aceb54f 100644
> > --- a/fs/xfs/xfs_extfree_item.c
> > +++ b/fs/xfs/xfs_extfree_item.c
> > @@ -61,9 +61,15 @@ __xfs_efi_release(
> >  
> >  	if (atomic_dec_and_test(&efip->efi_refcount)) {
> >  		spin_lock(&ailp->xa_lock);
> > -		/* xfs_trans_ail_delete() drops the AIL lock. */
> > -		xfs_trans_ail_delete(ailp, &efip->efi_item,
> > -				     SHUTDOWN_LOG_IO_ERROR);
> > +		/*
> > +		 * We don't know whether the EFI made it to the AIL. Remove it
> > +		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
> > +		 */
> > +		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
> > +			xfs_trans_ail_delete(ailp, &efip->efi_item,
> > +					     SHUTDOWN_LOG_IO_ERROR);
> > +		else
> > +			spin_unlock(&ailp->xa_lock);
> >  		xfs_efi_item_free(efip);
> >  	}
> 
> We now have a lot of this pattern:
> 
> 	spin_lock(&ailp->xa_lock);
> 	if (lip->li_flags & XFS_LI_IN_AIL)
> 		xfs_trans_ail_delete(ailp, lip, shutdown_reason);
> 	else
> 		spin_unlock(&ailp->xa_lock);
> 
> occurring in the code. Can you look into adding a helper function
> for this, say xfs_trans_ail_remove()? (separate patch, of course!)
> 

There was only one other instance of this with regard to the EFI that is
ultimately replaced with an xfs_efi_release() call. I'll look into
whether there's enough instances of this for other items to justify a
helper.

Brian

> The rest of the patch looks fine - getting rid of almost all of
> those aborted flag special cases is a big win :)
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/11] xfs: use EFI refcount consistently in log recovery
  2015-08-07  1:51   ` Dave Chinner
@ 2015-08-07 12:09     ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-07 12:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Fri, Aug 07, 2015 at 11:51:33AM +1000, Dave Chinner wrote:
> On Thu, Aug 06, 2015 at 01:44:26PM -0400, Brian Foster wrote:
> > @@ -2933,16 +2933,16 @@ xlog_recover_efi_pass2(
> >  	struct xlog_recover_item	*item,
> >  	xfs_lsn_t			lsn)
> >  {
> > -	int			error;
> > -	xfs_mount_t		*mp = log->l_mp;
> > -	xfs_efi_log_item_t	*efip;
> > -	xfs_efi_log_format_t	*efi_formatp;
> > +	int				error;
> > +	struct xfs_mount		*mp = log->l_mp;
> > +	struct xfs_efi_log_item		*efip;
> > +	struct xfs_efi_log_format	*efi_formatp;
> >  
> >  	efi_formatp = item->ri_buf[0].i_addr;
> >  
> >  	efip = xfs_efi_init(mp, efi_formatp->efi_nextents);
> > -	if ((error = xfs_efi_copy_format(&(item->ri_buf[0]),
> > -					 &(efip->efi_format)))) {
> > +	error = xfs_efi_copy_format(&(item->ri_buf[0]), &(efip->efi_format));
> > +	if (error) {
> 
> You can kill the extra (..) around the variables here.
> 

Ok.

> ....
> 
> > @@ -2995,18 +2998,18 @@ xlog_recover_efd_pass2(
> >  			efip = (xfs_efi_log_item_t *)lip;
> >  			if (efip->efi_format.efi_id == efi_id) {
> >  				/*
> > -				 * xfs_trans_ail_delete() drops the
> > -				 * AIL lock.
> > +				 * Drop the EFD reference to the EFI. This
> > +				 * removes the EFI from the AIL and frees it.
> >  				 */
> > -				xfs_trans_ail_delete(ailp, lip,
> > -						     SHUTDOWN_CORRUPT_INCORE);
> > -				xfs_efi_item_free(efip);
> > +				spin_unlock(&ailp->xa_lock);
> > +				xfs_efi_release(efip);
> >  				spin_lock(&ailp->xa_lock);
> 
> Need to call xfs_efi_release() outside the ailp->xa_lock as
> xfs_efi_release() now does the removal of the log item from the AIL
> and so can deadlock when taking the ailp->xa_lock.
> 

It drops and reacquires the lock, hence the removal of the comment about
xfs_trans_ail_delete() dropping it. ;)

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure
  2015-08-07  2:23   ` Dave Chinner
@ 2015-08-07 12:10     ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-07 12:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Fri, Aug 07, 2015 at 12:23:36PM +1000, Dave Chinner wrote:
> On Thu, Aug 06, 2015 at 01:44:27PM -0400, Brian Foster wrote:
> > Log recovery occurs in two phases at mount time. In the first phase,
> > EFIs and EFDs are processed and potentially cancelled out. EFIs without
> > EFD objects are inserted into the AIL for processing and recovery in the
> > second phase. xfs_mountfs() runs various other operations between the
> > phases and is thus subject to failure. If failure occurs after the first
> > phase but before the second, pending EFIs sit on the AIL, pin it and
> > cause the mount to hang.
> > 
> > Update the mount sequence to ensure that pending EFIs are cancelled in
> > the event of failure. Add a recovery cancellation mechanism to iterate
> > the AIL and cancel all EFI items when requested. Plumb cancellation
> > support through the log mount finish helper and update xfs_mountfs() to
> > invoke cancellation in the event of failure after recovery has started.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_log.c         | 16 +++++++++++-----
> >  fs/xfs/xfs_log.h         |  2 +-
> >  fs/xfs/xfs_log_priv.h    |  2 ++
> >  fs/xfs/xfs_log_recover.c | 41 ++++++++++++++++++++++++++++++++++++-----
> >  fs/xfs/xfs_mount.c       | 35 +++++++++++++++++++++++------------
> >  5 files changed, 73 insertions(+), 23 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 6b5a84a..522286c 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -740,19 +740,25 @@ out:
> >   * it.
> >   */
> >  int
> > -xfs_log_mount_finish(xfs_mount_t *mp)
> > +xfs_log_mount_finish(
> > +	struct xfs_mount	*mp,
> > +	bool			cancel)
> >  {
> >  	int	error = 0;
> >  
> > -	if (!(mp->m_flags & XFS_MOUNT_NORECOVERY)) {
> > +	if (mp->m_flags & XFS_MOUNT_NORECOVERY) {
> > +		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> > +		return 0;
> > +	}
> > +
> > +	if (cancel) {
> > +		error = xlog_recover_cancel(mp->m_log);
> > +	} else {
> >  		error = xlog_recover_finish(mp->m_log);
> >  		if (!error)
> >  			xfs_log_work_queue(mp);
> > -	} else {
> > -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> >  	}
> >  
> > -
> >  	return error;
> >  }
> 
> As I mentioned on #xfs, I don't think the error/cancel path needs to
> go through this function. Add a new function xfs_log_mount_cancel(),
> and put the contents of xlog_recover_cancel() inside it, along with
> the log unmount function...
> 

Ok.

> (General rule: prefix "xfs_log_....()" is for functions that call
> into the log from outside, passing a struct xfs_mount. prefix
> "xlog_...()" is for internal calls, passing a struct xlog.)
> 

Sure, xlog_recover_cancel() was just intended to be consistent with
xlog_recover_finish().

> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index a74ff68..a7ba078 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -3823,10 +3823,11 @@ abort_error:
> >   */
> >  STATIC int
> >  xlog_recover_process_efis(
> > -	struct xlog	*log)
> > +	struct xlog		*log,
> > +	bool			cancel)
> >  {
> > -	xfs_log_item_t		*lip;
> > -	xfs_efi_log_item_t	*efip;
> > +	struct xfs_log_item	*lip;
> > +	struct xfs_efi_log_item	*efip;
> >  	int			error = 0;
> >  	struct xfs_ail_cursor	cur;
> >  	struct xfs_ail		*ailp;
> > @@ -3847,10 +3848,24 @@ xlog_recover_process_efis(
> >  			break;
> >  		}
> >  
> > +		efip = (struct xfs_efi_log_item *) lip;
> > +
> > +		/*
> > +		 * A cancel occurs when the mount has failed and we're bailing
> > +		 * out. Release all pending EFIs so they don't pin the AIL.
> > +		 */
> > +		if (cancel) {
> > +			spin_unlock(&ailp->xa_lock);
> > +			xfs_efi_release(efip);
> > +			spin_lock(&ailp->xa_lock);
> > +
> > +			lip = xfs_trans_ail_cursor_next(ailp, &cur);
> > +			continue;
> > +		}
> 
> This might be better to separate into a xlog_recover_cancel_efis()
> function because this is much simpler than the rest of the loop.
> It may be more code, but its simpler to read and understand.
> 

That was my first instinct, I just didn't want to duplicate the list
processing and whatnot once I looked at it. I'll see how it looks.

> > +
> >  		/*
> >  		 * Skip EFIs that we've already processed.
> >  		 */
> > -		efip = (xfs_efi_log_item_t *)lip;
> >  		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags)) {
> >  			lip = xfs_trans_ail_cursor_next(ailp, &cur);
> >  			continue;
> > @@ -4617,8 +4632,13 @@ xlog_recover_finish(
> >  	 */
> >  	if (log->l_flags & XLOG_RECOVERY_NEEDED) {
> >  		int	error;
> > -		error = xlog_recover_process_efis(log);
> > +		error = xlog_recover_process_efis(log, false);
> >  		if (error) {
> > +			/*
> > +			 * We could still have pending EFIs. Cancel them so we
> > +			 * don't hang...
> > +			 */
> > +			xlog_recover_process_efis(log, true);
> 
> Not actually necessary. We don't clear the XLOG_RECOVERY_NEEDED
> flag, so when the error stack is run in xfs_mountfs(), we will
> call xfs_log_mount_cancel(), it will see the XLOG_RECOVERY_NEEDED
> and run xlog_recover_cancel_efis()....
> 

Ok.

> >  			xfs_alert(log->l_mp, "Failed to recover EFIs");
> >  			return error;
> >  		}
> > @@ -4644,6 +4664,17 @@ xlog_recover_finish(
> >  	return 0;
> >  }
> >  
> > +int
> > +xlog_recover_cancel(
> > +	struct xlog	*log)
> > +{
> > +	int		error = 0;
> > +
> > +	if (log->l_flags & XLOG_RECOVERY_NEEDED)
> > +		error = xlog_recover_process_efis(log, true);
> > +
> > +	return error;
> > +}
> 
> void
> xfs_log_mount_cancel(
> 	struct xfs_mount	*mp)
> {
> 	int		error = 0;
> 
> 	if (log->l_flags & XLOG_RECOVERY_NEEDED)
> 		xlog_recover_cancel_efis(log);
> 
> 	xfs_log_unmount(mp);
> 	return error;
> }
> 
> > @@ -799,7 +800,10 @@ xfs_mountfs(
> >  	}
> >  
> >  	/*
> > -	 * log's mount-time initialization. Perform 1st part recovery if needed
> > +	 * Log's mount-time initialization. The first part of recovery can place
> > +	 * some items on the AIL, to be handled when recovery is finished. Set
> > +	 * the cancel flag to ensure that the recovery is cancelled should we
> > +	 * fail before then.
> >  	 */
> >  	error = xfs_log_mount(mp, mp->m_logdev_targp,
> >  			      XFS_FSB_TO_DADDR(mp, sbp->sb_logstart),
> > @@ -808,6 +812,7 @@ xfs_mountfs(
> >  		xfs_warn(mp, "log mount failed");
> >  		goto out_fail_wait;
> >  	}
> > +	log_mount_cancel = true;
> 
> At this point, XLOG_RECOVERY_NEEDED will be set if EFI processing is
> required, so a variable to track this here is not needed.
> 

Indeed, I didn't notice how XLOG_RECOVERY_NEEDED was used here. This
seems much cleaner.

> > @@ -910,11 +915,15 @@ xfs_mountfs(
> >  	}
> >  
> >  	/*
> > -	 * Finish recovering the file system.  This part needed to be
> > -	 * delayed until after the root and real-time bitmap inodes
> > -	 * were consistently read in.
> > +	 * Finish recovering the file system.  This part needed to be delayed
> > +	 * until after the root and real-time bitmap inodes were consistently
> > +	 * read in.
> > +	 *
> > +	 * Reset the cancel flag as the finish cleans up after itself on
> > +	 * failure.
> >  	 */
> > -	error = xfs_log_mount_finish(mp);
> > +	log_mount_cancel = false;
> > +	error = xfs_log_mount_finish(mp, false);
> >  	if (error) {
> >  		xfs_warn(mp, "log mount finish failed");
> >  		goto out_rtunmount;
> 
> xfs_log_mount_finish() will clear the XLOG_RECOVERY_NEEDED. On
> error, we jump into the error stack above the call to
> xfs_log_mount_cancel(), so it will handle the cleanup....
> 
> > @@ -956,6 +965,8 @@ xfs_mountfs(
> >   out_rele_rip:
> >  	IRELE(rip);
> >   out_log_dealloc:
> > +	if (log_mount_cancel)
> > +		xfs_log_mount_finish(mp, true);
> >  	xfs_log_unmount(mp);
> 
> And this just becomes a call to xfs_log_mount_cancel().
> 
> I hope this is a bit clearer that mud ;)
> 

I think I follow.. thanks.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs
  2015-08-07 12:09     ` Brian Foster
@ 2015-08-07 22:45       ` Dave Chinner
  0 siblings, 0 replies; 35+ messages in thread
From: Dave Chinner @ 2015-08-07 22:45 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Aug 07, 2015 at 08:09:34AM -0400, Brian Foster wrote:
> On Fri, Aug 07, 2015 at 10:41:00AM +1000, Dave Chinner wrote:
> > On Thu, Aug 06, 2015 at 01:44:24PM -0400, Brian Foster wrote:
> > > diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
> > > index 42c9b05..aceb54f 100644
> > > --- a/fs/xfs/xfs_extfree_item.c
> > > +++ b/fs/xfs/xfs_extfree_item.c
> > > @@ -61,9 +61,15 @@ __xfs_efi_release(
> > >  
> > >  	if (atomic_dec_and_test(&efip->efi_refcount)) {
> > >  		spin_lock(&ailp->xa_lock);
> > > -		/* xfs_trans_ail_delete() drops the AIL lock. */
> > > -		xfs_trans_ail_delete(ailp, &efip->efi_item,
> > > -				     SHUTDOWN_LOG_IO_ERROR);
> > > +		/*
> > > +		 * We don't know whether the EFI made it to the AIL. Remove it
> > > +		 * if so. Note that xfs_trans_ail_delete() drops the AIL lock.
> > > +		 */
> > > +		if (efip->efi_item.li_flags & XFS_LI_IN_AIL)
> > > +			xfs_trans_ail_delete(ailp, &efip->efi_item,
> > > +					     SHUTDOWN_LOG_IO_ERROR);
> > > +		else
> > > +			spin_unlock(&ailp->xa_lock);
> > >  		xfs_efi_item_free(efip);
> > >  	}
> > 
> > We now have a lot of this pattern:
> > 
> > 	spin_lock(&ailp->xa_lock);
> > 	if (lip->li_flags & XFS_LI_IN_AIL)
> > 		xfs_trans_ail_delete(ailp, lip, shutdown_reason);
> > 	else
> > 		spin_unlock(&ailp->xa_lock);
> > 
> > occurring in the code. Can you look into adding a helper function
> > for this, say xfs_trans_ail_remove()? (separate patch, of course!)
> > 
> 
> There was only one other instance of this with regard to the EFI that is
> ultimately replaced with an xfs_efi_release() call. I'll look into
> whether there's enough instances of this for other items to justify a
> helper.

A quick look shows at least another 4 outside the EFI code:

xfs_buf_item_unlock
xfs_qm_dqflush_done
xfs_qm_dqflush
xfs_iflush_abort

That's enough for a helper, I think.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] xfs: disentagle EFI release from the extent count
  2015-08-06 17:44 ` [PATCH 01/11] xfs: disentagle EFI release from the extent count Brian Foster
@ 2015-08-09  7:36   ` Christoph Hellwig
  2015-08-10 12:37     ` Brian Foster
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  7:36 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:22PM -0400, Brian Foster wrote:

-       __xfs_efi_release(efip);
+	xfs_efi_release(efip);

Can you explain in the changelog why this is safe?


> -xfs_efi_release(xfs_efi_log_item_t	*efip,
> -		uint			nextents)
> +xfs_efi_release(struct xfs_efi_log_item	*efip)

Can you use normal XFS function formatting here? e.g.

xfs_efi_release(
	struct xfs_efi_log_item		*efip)


As a follow on we should be able to remove atomic_inc_return and
replace it with a local iterator in xfs_bmap_finish().

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 02/11] xfs: return committed status from xfs_trans_roll()
  2015-08-06 17:44 ` [PATCH 02/11] xfs: return committed status from xfs_trans_roll() Brian Foster
@ 2015-08-09  7:37   ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  7:37 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:23PM -0400, Brian Foster wrote:
> Some callers need to make error handling decisions based on whether the
> current transaction successfully committed or not. Rename
> xfs_trans_roll(), add a new parameter and provide a wrapper to preserve
> existing callers.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs
  2015-08-06 17:44 ` [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
  2015-08-07  0:41   ` Dave Chinner
@ 2015-08-09  7:46   ` Christoph Hellwig
  1 sibling, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  7:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

>  STATIC void
>  xfs_efi_item_unpin(
> @@ -141,14 +147,6 @@ xfs_efi_item_unpin(
>  	int			remove)
>  {
>  	struct xfs_efi_log_item	*efip = EFI_ITEM(lip);
> -
> -	if (remove) {
> -		ASSERT(!(lip->li_flags & XFS_LI_IN_AIL));
> -		if (lip->li_desc)
> -			xfs_trans_del_item(lip);
> -		xfs_efi_item_free(efip);
> -		return;
> -	}
>  	xfs_efi_release(efip);

After this the buf iterm is the last one looking a the remove
argument.  I wonder if it might need similar treatment.

> -	return (xfs_lsn_t)-1;
> +
> +	return (xfs_lsn_t) -1;

This changes the style away from the most common style in Linux and XFS.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure
  2015-08-06 17:44 ` [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure Brian Foster
@ 2015-08-09  7:53   ` Christoph Hellwig
  2015-08-10 12:38     ` Brian Foster
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  7:53 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

> +		/*
> +		 * Log the EFD before error handling to ensure the EFD aborts
> +		 * (and releases the EFI) on error.
> +		 */
>  		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
>  		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
>  					 extp->ext_len);

Given that we now always need to log the extents in the EFD even on
error maybe it's time to move the call to xfs_free_extent into
xfs_trans_log_efd_extent and rename it to xfs_trans_free_extent?

Or even better convert xfs_bmap_finish to also walk the extents in the
EFI instead of xfs_bmap_free list and have a xfs_trans_free_extents helper
ala:

int
xfs_trans_free_extents(
	struct xfs_trans	*tp,
	struct xfs_efi_log_item	*efip)
{
	struct xfs_efd_log_item	*efdp;
	int			error = 0, i;

	efdp = xfs_trans_get_efd(tp, efip, efip->efi_format.efi_nextents);
	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
		struct xfs_extent *extp = &efip->efi_format.efi_extents[i];

		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
					extp->ext_len);

		if (error)
			break;
	}

	return error;
}

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/11] xfs: use EFI refcount consistently in log recovery
  2015-08-06 17:44 ` [PATCH 05/11] xfs: use EFI refcount consistently in log recovery Brian Foster
  2015-08-07  1:51   ` Dave Chinner
@ 2015-08-09  7:56   ` Christoph Hellwig
  2015-08-10 12:37     ` Brian Foster
  1 sibling, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  7:56 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

> +	error = xfs_efi_copy_format(&(item->ri_buf[0]), &(efip->efi_format));
> +	if (error) {
>  		xfs_efi_item_free(efip);

Shou;dn't we switch all users of xfs_item_free except for
xfs_efi_release to xfs_efi_release now for consistency?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure
  2015-08-06 17:44 ` [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure Brian Foster
  2015-08-07  2:23   ` Dave Chinner
@ 2015-08-09  8:01   ` Christoph Hellwig
  2015-08-10 12:38     ` Brian Foster
  1 sibling, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  8:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

>  STATIC int
>  xlog_recover_process_efis(
> -	struct xlog	*log)
> +	struct xlog		*log,
> +	bool			cancel)

There is almost no code shared between cancelation and processing,
so please add a new xlog_cancel_efis helper instead.

> +		efip = (struct xfs_efi_log_item *) lip;

Please use container_of to get to a containing structure.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints
  2015-08-06 17:44 ` [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints Brian Foster
@ 2015-08-09  8:02   ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  8:02 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 11/11] xfs: fix btree cursor error cleanups
  2015-08-06 17:44 ` [PATCH 11/11] xfs: fix btree cursor error cleanups Brian Foster
@ 2015-08-09  8:03   ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  8:03 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:32PM -0400, Brian Foster wrote:
> The btree cursor cleanup function takes an error parameter that affects
> how buffers are released from the cursor. All buffers are released in
> the event of error. Several callers do not specify the XFS_BTREE_ERROR
> flag in the event of error, however. This can cause buffers to hang
> around locked or with an elevated hold count and thus lead to umount
> hangs in the event of errors.
> 
> Fix up the xfs_btree_del_cursor() callers to pass XFS_BTREE_ERROR if the
> cursor is being torn down due to error.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>

I wonder if we should change the second parameter of
xfs_btree_del_cursor to "int error" to make the life esier for the
callers.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 10/11] xfs: clean up root inode properly on mount failure
  2015-08-06 17:44 ` [PATCH 10/11] xfs: clean up root inode properly on mount failure Brian Foster
@ 2015-08-09  8:03   ` Christoph Hellwig
  0 siblings, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-09  8:03 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Thu, Aug 06, 2015 at 01:44:31PM -0400, Brian Foster wrote:
> The root inode is read as part of the xfs_mountfs() sequence and the
> reference is dropped in the event of failure after we grab the inode.
> The reference drop doesn't necessarily free the inode, however. It marks
> it for reclaim and potentially kicks off the reclaim workqueue.  The
> workqueue is destroyed further up the error path, which means we are
> subject to crash if the workqueue job runs after this point or a memory
> leak which is identified if the xfs_inode_zone is destroyed (e.g., on
> module removal). Both of these outcomes are reproducible via manual
> instrumentation of a mount error after the root inode xfs_iget() call in
> xfs_mountfs().
> 
> Update the xfs_mountfs() error path to cancel any potential reclaim work
> items and to run a synchronous inode reclaim if the root inode is marked
> for reclaim. This ensures that no jobs remain on the queue before it is
> destroyed and that the root inode is freed before the reclaim mechanism
> is torn down.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] xfs: disentagle EFI release from the extent count
  2015-08-09  7:36   ` Christoph Hellwig
@ 2015-08-10 12:37     ` Brian Foster
  2015-08-12  7:15       ` Christoph Hellwig
  0 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-10 12:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sun, Aug 09, 2015 at 12:36:41AM -0700, Christoph Hellwig wrote:
> On Thu, Aug 06, 2015 at 01:44:22PM -0400, Brian Foster wrote:
> 
> -       __xfs_efi_release(efip);
> +	xfs_efi_release(efip);
> 
> Can you explain in the changelog why this is safe?
> 

I thought it did. I'll try to elaborate.

> 
> > -xfs_efi_release(xfs_efi_log_item_t	*efip,
> > -		uint			nextents)
> > +xfs_efi_release(struct xfs_efi_log_item	*efip)
> 
> Can you use normal XFS function formatting here? e.g.
> 
> xfs_efi_release(
> 	struct xfs_efi_log_item		*efip)
> 

Ok.

> 
> As a follow on we should be able to remove atomic_inc_return and
> replace it with a local iterator in xfs_bmap_finish().
> 

I'm not sure what you mean here...

Brian

> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/11] xfs: use EFI refcount consistently in log recovery
  2015-08-09  7:56   ` Christoph Hellwig
@ 2015-08-10 12:37     ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-10 12:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sun, Aug 09, 2015 at 12:56:40AM -0700, Christoph Hellwig wrote:
> > +	error = xfs_efi_copy_format(&(item->ri_buf[0]), &(efip->efi_format));
> > +	if (error) {
> >  		xfs_efi_item_free(efip);
> 
> Shou;dn't we switch all users of xfs_item_free except for
> xfs_efi_release to xfs_efi_release now for consistency?
> 

I don't think that's necessarily correct, at least as a one-to-one
conversion. We'd have to release two references here as well as in the
unlock handler if the transaction is aborted.

E.g., the reference count is only relevant once the EFI enters the
transaction subsystem.

Brian

> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure
  2015-08-09  7:53   ` Christoph Hellwig
@ 2015-08-10 12:38     ` Brian Foster
  2015-08-10 13:39       ` Brian Foster
  0 siblings, 1 reply; 35+ messages in thread
From: Brian Foster @ 2015-08-10 12:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sun, Aug 09, 2015 at 12:53:11AM -0700, Christoph Hellwig wrote:
> > +		/*
> > +		 * Log the EFD before error handling to ensure the EFD aborts
> > +		 * (and releases the EFI) on error.
> > +		 */
> >  		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
> >  		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
> >  					 extp->ext_len);
> 
> Given that we now always need to log the extents in the EFD even on
> error maybe it's time to move the call to xfs_free_extent into
> xfs_trans_log_efd_extent and rename it to xfs_trans_free_extent?
> 
> Or even better convert xfs_bmap_finish to also walk the extents in the
> EFI instead of xfs_bmap_free list and have a xfs_trans_free_extents helper
> ala:
> 

Good idea, I'll incorporate something like this.

Brian

> int
> xfs_trans_free_extents(
> 	struct xfs_trans	*tp,
> 	struct xfs_efi_log_item	*efip)
> {
> 	struct xfs_efd_log_item	*efdp;
> 	int			error = 0, i;
> 
> 	efdp = xfs_trans_get_efd(tp, efip, efip->efi_format.efi_nextents);
> 	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
> 		struct xfs_extent *extp = &efip->efi_format.efi_extents[i];
> 
> 		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
> 		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
> 					extp->ext_len);
> 
> 		if (error)
> 			break;
> 	}
> 
> 	return error;
> }
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure
  2015-08-09  8:01   ` Christoph Hellwig
@ 2015-08-10 12:38     ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-10 12:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sun, Aug 09, 2015 at 01:01:50AM -0700, Christoph Hellwig wrote:
> >  STATIC int
> >  xlog_recover_process_efis(
> > -	struct xlog	*log)
> > +	struct xlog		*log,
> > +	bool			cancel)
> 
> There is almost no code shared between cancelation and processing,
> so please add a new xlog_cancel_efis helper instead.
> 

See v2.

> > +		efip = (struct xfs_efi_log_item *) lip;
> 
> Please use container_of to get to a containing structure.
> 

Ok.

Brian

> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure
  2015-08-10 12:38     ` Brian Foster
@ 2015-08-10 13:39       ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-10 13:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Mon, Aug 10, 2015 at 08:38:04AM -0400, Brian Foster wrote:
> On Sun, Aug 09, 2015 at 12:53:11AM -0700, Christoph Hellwig wrote:
> > > +		/*
> > > +		 * Log the EFD before error handling to ensure the EFD aborts
> > > +		 * (and releases the EFI) on error.
> > > +		 */
> > >  		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
> > >  		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
> > >  					 extp->ext_len);
> > 
> > Given that we now always need to log the extents in the EFD even on
> > error maybe it's time to move the call to xfs_free_extent into
> > xfs_trans_log_efd_extent and rename it to xfs_trans_free_extent?
> > 
> > Or even better convert xfs_bmap_finish to also walk the extents in the
> > EFI instead of xfs_bmap_free list and have a xfs_trans_free_extents helper
> > ala:
> > 
> 
> Good idea, I'll incorporate something like this.
> 
> Brian
> 

After taking a closer look at this, I think I'll go with the first idea
to push down xfs_free_extent(). The code below to iterate based on the
EFI, while a nice cleanup, complicates handling the free list on error.
E.g., we currently remove each bmap_free_item as it is successfully
processed so the resulting list is consistent with the items that were
not processed before the error.

It might not matter functionally at the moment since the caller probably
cancels the free list, but I consider that a landmine should we decide
to do something more with the unprocessed entries down the road.

Brian

> > int
> > xfs_trans_free_extents(
> > 	struct xfs_trans	*tp,
> > 	struct xfs_efi_log_item	*efip)
> > {
> > 	struct xfs_efd_log_item	*efdp;
> > 	int			error = 0, i;
> > 
> > 	efdp = xfs_trans_get_efd(tp, efip, efip->efi_format.efi_nextents);
> > 	for (i = 0; i < efip->efi_format.efi_nextents; i++) {
> > 		struct xfs_extent *extp = &efip->efi_format.efi_extents[i];
> > 
> > 		error = xfs_free_extent(tp, extp->ext_start, extp->ext_len);
> > 		xfs_trans_log_efd_extent(tp, efdp, extp->ext_start,
> > 					extp->ext_len);
> > 
> > 		if (error)
> > 			break;
> > 	}
> > 
> > 	return error;
> > }
> > 
> > _______________________________________________
> > xfs mailing list
> > xfs@oss.sgi.com
> > http://oss.sgi.com/mailman/listinfo/xfs
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] xfs: disentagle EFI release from the extent count
  2015-08-10 12:37     ` Brian Foster
@ 2015-08-12  7:15       ` Christoph Hellwig
  2015-08-12 12:47         ` Brian Foster
  0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2015-08-12  7:15 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, xfs

On Mon, Aug 10, 2015 at 08:37:27AM -0400, Brian Foster wrote:
> > As a follow on we should be able to remove atomic_inc_return and
> > replace it with a local iterator in xfs_bmap_finish().
> > 
> 
> I'm not sure what you mean here...

efi_next_extent is not only used is to find a 'slot' for the extent
in xfs_trans_log_efi_extent.  For that we don't need an atomic_t
in the EFI, but can have a local variable in xfs_bmap_finish.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 01/11] xfs: disentagle EFI release from the extent count
  2015-08-12  7:15       ` Christoph Hellwig
@ 2015-08-12 12:47         ` Brian Foster
  0 siblings, 0 replies; 35+ messages in thread
From: Brian Foster @ 2015-08-12 12:47 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Wed, Aug 12, 2015 at 12:15:18AM -0700, Christoph Hellwig wrote:
> On Mon, Aug 10, 2015 at 08:37:27AM -0400, Brian Foster wrote:
> > > As a follow on we should be able to remove atomic_inc_return and
> > > replace it with a local iterator in xfs_bmap_finish().
> > > 
> > 
> > I'm not sure what you mean here...
> 
> efi_next_extent is not only used is to find a 'slot' for the extent
> in xfs_trans_log_efi_extent.  For that we don't need an atomic_t
> in the EFI, but can have a local variable in xfs_bmap_finish.

Ok, I see. efi_next_extent isn't used for much at this point. We could
kill it entirely, we'd just have to add an index parameter to the EFI
logging function (or change up the params). We do still have asserts in
the logging and item format handlers that help ensure a properly
constructed EFI, so there is some use I suppose. I'm not sure it needed
to be atomic though, even prior to the changes in this series...

Brian

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2015-08-12 12:48 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-06 17:44 [PATCH 00/11] xfs: miscellaneous logging, recovery, umount fixes Brian Foster
2015-08-06 17:44 ` [PATCH 01/11] xfs: disentagle EFI release from the extent count Brian Foster
2015-08-09  7:36   ` Christoph Hellwig
2015-08-10 12:37     ` Brian Foster
2015-08-12  7:15       ` Christoph Hellwig
2015-08-12 12:47         ` Brian Foster
2015-08-06 17:44 ` [PATCH 02/11] xfs: return committed status from xfs_trans_roll() Brian Foster
2015-08-09  7:37   ` Christoph Hellwig
2015-08-06 17:44 ` [PATCH 03/11] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
2015-08-07  0:41   ` Dave Chinner
2015-08-07 12:09     ` Brian Foster
2015-08-07 22:45       ` Dave Chinner
2015-08-09  7:46   ` Christoph Hellwig
2015-08-06 17:44 ` [PATCH 04/11] xfs: ensure EFD trans aborts on log recovery extent free failure Brian Foster
2015-08-09  7:53   ` Christoph Hellwig
2015-08-10 12:38     ` Brian Foster
2015-08-10 13:39       ` Brian Foster
2015-08-06 17:44 ` [PATCH 05/11] xfs: use EFI refcount consistently in log recovery Brian Foster
2015-08-07  1:51   ` Dave Chinner
2015-08-07 12:09     ` Brian Foster
2015-08-09  7:56   ` Christoph Hellwig
2015-08-10 12:37     ` Brian Foster
2015-08-06 17:44 ` [PATCH 06/11] xfs: don't leave EFIs on AIL on mount failure Brian Foster
2015-08-07  2:23   ` Dave Chinner
2015-08-07 12:10     ` Brian Foster
2015-08-09  8:01   ` Christoph Hellwig
2015-08-10 12:38     ` Brian Foster
2015-08-06 17:44 ` [PATCH 07/11] xfs: icreate log item recovery and cancellation tracepoints Brian Foster
2015-08-09  8:02   ` Christoph Hellwig
2015-08-06 17:44 ` [PATCH 08/11] xfs: fix broken icreate log item cancellation Brian Foster
2015-08-06 17:44 ` [PATCH 09/11] xfs: checksum log record ext headers based on record size Brian Foster
2015-08-06 17:44 ` [PATCH 10/11] xfs: clean up root inode properly on mount failure Brian Foster
2015-08-09  8:03   ` Christoph Hellwig
2015-08-06 17:44 ` [PATCH 11/11] xfs: fix btree cursor error cleanups Brian Foster
2015-08-09  8:03   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox