linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] xfs: free the EFI entries from AIL on forced shutdown
@ 2014-03-14 16:37 Mark Tinguely
  2014-03-17  5:29 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Tinguely @ 2014-03-14 16:37 UTC (permalink / raw)
  To: xfs

[-- Attachment #1: xfs-remove-efi-entry-before-log-unmount.patch --]
[-- Type: text/plain, Size: 8628 bytes --]

The extent free intention (EFI) and extent free done (EFD)
log items are in separate transactions. It is possible that
the EFI can be pushed to the AIL before a forced shutdown
where it gets stuck for following reasons:

 No EFD. If freeing the extent fails in xfs_bmap_finish() or
 xfs_recover_process_efi(), then the corresponding extent
 free done (EFD) entry will not be created.

 EFD IOP Abort processing. If xfs_trans_cancel() is called with
 an abort flag, or if the xfs_trans_commit() is called when the
 file system is in forced shutdown or if the log buffer write fails,
 then the EFD iop commands will not remove the EFI from the AIL.

If the EFI entry is left on the AIL, xfs_ail_push all_sync() will
fail to complete, the unmount will hang, and a reboot is required
to get the complete the unmount.

A forced shutdown pushs the CIL, so the EFI/EFD on the CIL before
the forced shutdown will be processed normally. Aborted EFI/EFD will
remove themself. All that remains to be done to is to remove the EFI
entries from the AIL in the above listed shutdown cases.

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
---
Note to Christoph about the xfs_recover_process_efi part of patch:
  From b90935eaba9eb13c67101e5d723513bc6ca6e722 Mon Sep 17 00:00:00 2001
  Date: Sat, 23 Nov 2013 20:11:09 +0100
  Subject: [PATCH] xfs: simplify EFI/EFD recovery

  Use a cancellation table, similar to how we handle buffers instead of
  abusing the AIL during recovery.

 http://oss.sgi.com/archives/xfs/2013-12/msg00438.html

will place the EFI onto the AIL in xfs_recover_process_efi() by the time
the xfs_free_extent() error happens. Therefore the changes
in xfs_recover_process_efis() would be the same with or without
your new feature.

Tested:
 Recovery - metadata with bad extent.
 xfs_bmap_finish()
	- placed errors as if the free extent call failed.
	- forced shutdown (as if from a different thread) before EFD created.
	- forced shutdown (as if from a different thread) after EFD created
	   but before xfs_trans_commit()
	- forced shutdown after xfs_trans_commit()
	- xfstests ...

 fs/xfs/xfs_bmap_util.c    |    9 +++++--
 fs/xfs/xfs_extfree_item.c |   55 ++++++++++++++++++++++++++++++++++++++++++----
 fs/xfs/xfs_extfree_item.h |    1 
 fs/xfs/xfs_log_recover.c  |   27 ++++++++++------------
 4 files changed, 71 insertions(+), 21 deletions(-)

Index: b/fs/xfs/xfs_bmap_util.c
===================================================================
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -124,8 +124,12 @@ xfs_bmap_finish(
 				free->xbfi_blockcount))) {
 			/*
 			 * The bmap free list will be cleaned up at a
-			 * higher level.  The EFI will be canceled when
-			 * this transaction is aborted.
+			 * higher level. If the EFI is still on the CIL,
+			 * then it will be canceled when this transaction
+			 * is aborted. But if the EFI has been pushed onto
+			 * the AIL, then it must be manually removed from
+			 * the AIL.
+			 *
 			 * Need to force shutdown here to make sure it
 			 * happens, since this transaction may not be
 			 * dirty yet.
@@ -136,6 +140,7 @@ xfs_bmap_finish(
 						   (error == EFSCORRUPTED) ?
 						   SHUTDOWN_CORRUPT_INCORE :
 						   SHUTDOWN_META_IO_ERROR);
+			xfs_efi_clear_ail(mp->m_ail);
 			return error;
 		}
 		xfs_trans_log_efd_extent(ntp, efd, free->xbfi_startblock,
Index: b/fs/xfs/xfs_extfree_item.c
===================================================================
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -420,8 +420,15 @@ STATIC void
 xfs_efd_item_unlock(
 	struct xfs_log_item	*lip)
 {
-	if (lip->li_flags & XFS_LI_ABORTED)
+	/*
+	 * Clear the EFI if on AIL when aborting xfs_bmap_finish.
+	 * The forced shutdown will force the log, so other EFDs
+	 * should not be processed from the CIL.
+	 */
+	if (lip->li_flags & XFS_LI_ABORTED) {
+		xfs_efi_clear_ail(lip->li_ailp);
 		xfs_efd_item_free(EFD_ITEM(lip));
+	}
 }
 
 /*
@@ -439,10 +446,15 @@ xfs_efd_item_committed(
 	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
 
 	/*
-	 * If we got a log I/O error, it's always the case that the LR with the
-	 * EFI got unpinned and freed before the EFD got aborted.
+	 * If we got a log I/O error and the EFI is also in this buffer, it
+	 * will be unpinned and freed before the EFD got aborted. But the EFI
+	 * is in an earlier transaction and could be on the AIL when the log
+	 * I/O error happened for this EFD. In that case, manually remove the
+	 * remaining EFIs from the AIL.
 	 */
-	if (!(lip->li_flags & XFS_LI_ABORTED))
+	if (lip->li_flags & XFS_LI_ABORTED)
+		xfs_efi_clear_ail(lip->li_ailp);
+	else
 		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
 
 	xfs_efd_item_free(efdp);
@@ -506,3 +518,38 @@ xfs_efd_init(
 
 	return efdp;
 }
+
+/* Remove all the EFI entries from the AIL using a cursor.
+ * An error that forces down the filesystem will leave the EFI on
+ * the AIL and thus prevening xfs_ail_push_all_sync to complete hanging
+ * the filesystem.
+ *
+ * xfs_free_extent failures in xfs_bmap_finish will not even have an EFD
+ * to match the one on the AIL.
+ */
+void
+xfs_efi_clear_ail(
+	struct xfs_ail		*ailp)
+{
+	struct xfs_log_item	*lip;
+	struct xfs_efi_log_item	*efip;
+	struct xfs_ail_cursor	cur;
+
+	spin_lock(&ailp->xa_lock);
+
+	for (lip = xfs_trans_ail_cursor_first(ailp, &cur, 0); lip != NULL;
+	     lip = xfs_trans_ail_cursor_next(ailp, &cur)) {
+		/* Ignore non EFI entries.  */
+		if (lip->li_type != XFS_LI_EFI)
+			continue;
+
+		/* Remove the EFI entry from the AIL */
+		efip = (xfs_efi_log_item_t *)lip;
+		spin_unlock(&ailp->xa_lock);
+		xfs_efi_release(efip, efip->efi_format.efi_nextents);
+		spin_lock(&ailp->xa_lock);
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+	xfs_trans_ail_cursor_done(ailp, &cur);
+	spin_unlock(&ailp->xa_lock);
+}
Index: b/fs/xfs/xfs_extfree_item.h
===================================================================
--- a/fs/xfs/xfs_extfree_item.h
+++ b/fs/xfs/xfs_extfree_item.h
@@ -77,5 +77,6 @@ xfs_efd_log_item_t	*xfs_efd_init(struct
 int			xfs_efi_copy_format(xfs_log_iovec_t *buf,
 					    xfs_efi_log_format_t *dst_efi_fmt);
 void			xfs_efi_item_free(xfs_efi_log_item_t *);
+void			xfs_efi_clear_ail(struct xfs_ail *ailp);
 
 #endif	/* __XFS_EXTFREE_ITEM_H__ */
Index: b/fs/xfs/xfs_log_recover.c
===================================================================
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3634,20 +3634,23 @@ xlog_recover_process_data(
 /*
  * Process an extent free intent item that was recovered from
  * the log.  We need to free the extents that it describes.
+ * The caller will free all EFI entries on error.
  */
 STATIC int
 xlog_recover_process_efi(
-	xfs_mount_t		*mp,
-	xfs_efi_log_item_t	*efip)
+	struct xfs_mount	*mp,
+	struct xfs_efi_log_item	*efip)
 {
-	xfs_efd_log_item_t	*efdp;
-	xfs_trans_t		*tp;
+	struct xfs_efd_log_item	*efdp;
+	struct xfs_trans	*tp;
 	int			i;
 	int			error = 0;
 	xfs_extent_t		*extp;
 	xfs_fsblock_t		startblock_fsb;
 
 	ASSERT(!test_bit(XFS_EFI_RECOVERED, &efip->efi_flags));
+	/* All paths need the XFS_EFI_RECOVERED flag set. Do it here. */
+	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
 
 	/*
 	 * First check the validity of the extents described by the
@@ -3662,12 +3665,6 @@ xlog_recover_process_efi(
 		    (extp->ext_len == 0) ||
 		    (startblock_fsb >= mp->m_sb.sb_dblocks) ||
 		    (extp->ext_len >= mp->m_sb.sb_agblocks)) {
-			/*
-			 * This will pull the EFI from the AIL and
-			 * free the memory associated with it.
-			 */
-			set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
-			xfs_efi_release(efip, efip->efi_format.efi_nextents);
 			return XFS_ERROR(EIO);
 		}
 	}
@@ -3687,7 +3684,6 @@ xlog_recover_process_efi(
 					 extp->ext_len);
 	}
 
-	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
 	error = xfs_trans_commit(tp, 0);
 	return error;
 
@@ -3718,8 +3714,8 @@ STATIC int
 xlog_recover_process_efis(
 	struct xlog	*log)
 {
-	xfs_log_item_t		*lip;
-	xfs_efi_log_item_t	*efip;
+	struct xfs_log_item	*lip;
+	struct xfs_efi_log_item	*efip;
 	int			error = 0;
 	struct xfs_ail_cursor	cur;
 	struct xfs_ail		*ailp;
@@ -3753,12 +3749,13 @@ xlog_recover_process_efis(
 		error = xlog_recover_process_efi(log->l_mp, efip);
 		spin_lock(&ailp->xa_lock);
 		if (error)
-			goto out;
+			break;
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
 	}
-out:
 	xfs_trans_ail_cursor_done(ailp, &cur);
 	spin_unlock(&ailp->xa_lock);
+	if (error)
+		xfs_efi_clear_ail(ailp);
 	return error;
 }
 


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: free the EFI entries from AIL on forced shutdown
  2014-03-14 16:37 [PATCH] xfs: free the EFI entries from AIL on forced shutdown Mark Tinguely
@ 2014-03-17  5:29 ` Dave Chinner
  2014-03-20 19:47   ` Mark Tinguely
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2014-03-17  5:29 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On Fri, Mar 14, 2014 at 11:37:01AM -0500, Mark Tinguely wrote:
> The extent free intention (EFI) and extent free done (EFD)
> log items are in separate transactions. It is possible that
> the EFI can be pushed to the AIL before a forced shutdown
> where it gets stuck for following reasons:
> 
>  No EFD. If freeing the extent fails in xfs_bmap_finish() or
>  xfs_recover_process_efi(), then the corresponding extent
>  free done (EFD) entry will not be created.

You don't handle the xfs_bmap_finish() case properly - the patch
only addresses the case where the EFD was created and contains a
reference to the EFI. The only time we can have an EFI in the AIL
without an EFD pointing to it is if xfs_trans_reserve() call in
xfs_bmap_finish() fails due to a shutdown. That failure case
leaks the EFI reference that is supposed to be passed to the EFD...

>  EFD IOP Abort processing. If xfs_trans_cancel() is called with
>  an abort flag, or if the xfs_trans_commit() is called when the
>  file system is in forced shutdown or if the log buffer write fails,
>  then the EFD iop commands will not remove the EFI from the AIL.

Which they should, because after xfs_trans_get_efd(), the EFD owns
the reference to the EFI. Hence aborting the EFD should release the
EFI reference it owns.

> Index: b/fs/xfs/xfs_extfree_item.c
> ===================================================================
> --- a/fs/xfs/xfs_extfree_item.c
> +++ b/fs/xfs/xfs_extfree_item.c
> @@ -420,8 +420,15 @@ STATIC void
>  xfs_efd_item_unlock(
>  	struct xfs_log_item	*lip)
>  {
> -	if (lip->li_flags & XFS_LI_ABORTED)
> +	/*
> +	 * Clear the EFI if on AIL when aborting xfs_bmap_finish.
> +	 * The forced shutdown will force the log, so other EFDs
> +	 * should not be processed from the CIL.
> +	 */
> +	if (lip->li_flags & XFS_LI_ABORTED) {
> +		xfs_efi_clear_ail(lip->li_ailp);
>  		xfs_efd_item_free(EFD_ITEM(lip));
> +	}
>  }

So, we abort one EFI, so we kill every EFI in the filesystem? We
want to be able to cancel and rollback transactions eventually, and
this will prevent us from being able to do that as it will kill EFIs
unrelated to the EFD being aborted.

AFAICT, all this needs to do is:

	if (!(lip->li_flags & XFS_LI_ABORTED))
		return;

	xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
	xfs_efd_item_free(efdp);

i.e. before freeing the EFD, release the reference to the EFI
that it was passed.

> @@ -439,10 +446,15 @@ xfs_efd_item_committed(
>  	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
>  
>  	/*
> -	 * If we got a log I/O error, it's always the case that the LR with the
> -	 * EFI got unpinned and freed before the EFD got aborted.
> +	 * If we got a log I/O error and the EFI is also in this buffer, it
> +	 * will be unpinned and freed before the EFD got aborted. But the EFI
> +	 * is in an earlier transaction and could be on the AIL when the log
> +	 * I/O error happened for this EFD. In that case, manually remove the
> +	 * remaining EFIs from the AIL.
>  	 */
> -	if (!(lip->li_flags & XFS_LI_ABORTED))
> +	if (lip->li_flags & XFS_LI_ABORTED)
> +		xfs_efi_clear_ail(lip->li_ailp);
> +	else
>  		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);

Same here - we can simply always call xfs_efi_release() and we'll
end up doing the right thing w.r.t. the EFi reference count.

And if we do the right thing with the EFI reference count when
aborting the EFI (i.e. call __xfs_efi_release() rather than freeing
it) the aborting of transactions will correctly free all the
references to the EFI and remove them from the AIL. The only case we
then have to handle specially is the xfs_trans_reserve failure in
xfs_bmap_finish(), where we need to release the EFI directly in the
error path.

That gets rid of the "big hammer" error handling for normal runtime
shutdown that xfs_efi_clear_ail(). i.e. we need to fix the reference
count handling, not work around it.


> Index: b/fs/xfs/xfs_log_recover.c
> ===================================================================
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3634,20 +3634,23 @@ xlog_recover_process_data(
>  /*
>   * Process an extent free intent item that was recovered from
>   * the log.  We need to free the extents that it describes.
> + * The caller will free all EFI entries on error.
>   */
>  STATIC int
>  xlog_recover_process_efi(
> -	xfs_mount_t		*mp,
> -	xfs_efi_log_item_t	*efip)
> +	struct xfs_mount	*mp,
> +	struct xfs_efi_log_item	*efip)
>  {
> -	xfs_efd_log_item_t	*efdp;
> -	xfs_trans_t		*tp;
> +	struct xfs_efd_log_item	*efdp;
> +	struct xfs_trans	*tp;
>  	int			i;
>  	int			error = 0;
>  	xfs_extent_t		*extp;
>  	xfs_fsblock_t		startblock_fsb;
>  
>  	ASSERT(!test_bit(XFS_EFI_RECOVERED, &efip->efi_flags));
> +	/* All paths need the XFS_EFI_RECOVERED flag set. Do it here. */
> +	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
>  
>  	/*
>  	 * First check the validity of the extents described by the
> @@ -3662,12 +3665,6 @@ xlog_recover_process_efi(
>  		    (extp->ext_len == 0) ||
>  		    (startblock_fsb >= mp->m_sb.sb_dblocks) ||
>  		    (extp->ext_len >= mp->m_sb.sb_agblocks)) {
> -			/*
> -			 * This will pull the EFI from the AIL and
> -			 * free the memory associated with it.
> -			 */
> -			set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
> -			xfs_efi_release(efip, efip->efi_format.efi_nextents);
>  			return XFS_ERROR(EIO);
>  		}
>  	}
> @@ -3687,7 +3684,6 @@ xlog_recover_process_efi(
>  					 extp->ext_len);
>  	}
>  
> -	set_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
>  	error = xfs_trans_commit(tp, 0);
>  	return error;

This is basically saying "xfs_efi_release() should drop both the AIL
and the EFD reference as we really only only have one reference". I
think we should simply remove the XFS_EFI_RECOVERED bit from the
reference counting code, and simply subtract one of the EFI
references in xlog_recover_efi_pass2() where we insert the EFI into
the AIL, so that it behaves exactly like the runtime case when
later processing the EFDs.

That is, in xlog_recover_efd_pass2() when we find a matching EFD we
simply call xfs_efi_release() and that does all the freeing because
we're removing the EFD reference which is the only remaining
reference.

Then in xlog_recover_process_efi(), we simply call xfs_efi_release()
for the error cases to drop the last reference to the EFI, otherwise
the commit of the EFD will free it because it calls
xfs_efi_release() appropriately.

>  
> @@ -3718,8 +3714,8 @@ STATIC int
>  xlog_recover_process_efis(
>  	struct xlog	*log)
>  {
> -	xfs_log_item_t		*lip;
> -	xfs_efi_log_item_t	*efip;
> +	struct xfs_log_item	*lip;
> +	struct xfs_efi_log_item	*efip;
>  	int			error = 0;
>  	struct xfs_ail_cursor	cur;
>  	struct xfs_ail		*ailp;
> @@ -3753,12 +3749,13 @@ xlog_recover_process_efis(
>  		error = xlog_recover_process_efi(log->l_mp, efip);
>  		spin_lock(&ailp->xa_lock);
>  		if (error)
> -			goto out;
> +			break;
>  		lip = xfs_trans_ail_cursor_next(ailp, &cur);
>  	}
> -out:
>  	xfs_trans_ail_cursor_done(ailp, &cur);
>  	spin_unlock(&ailp->xa_lock);
> +	if (error)
> +		xfs_efi_clear_ail(ailp);
>  	return error;
>  }

If we fix the reference counting, then I think all we need here is:

	int	error = 0;

	while (ail cursor next) {
		....
		if (!error)
			error = xlog_recover_process_efi()
		else
			xfs_efi_release(efip)
		....
	}

So that any error will trigger freeing of all the other unprocessed
EFIs on the AIL that are pending recovery without needing any more
special looping....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: free the EFI entries from AIL on forced shutdown
  2014-03-17  5:29 ` Dave Chinner
@ 2014-03-20 19:47   ` Mark Tinguely
  2014-03-20 20:30     ` Mark Tinguely
  2014-03-20 23:11     ` Dave Chinner
  0 siblings, 2 replies; 5+ messages in thread
From: Mark Tinguely @ 2014-03-20 19:47 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 03/17/14 00:29, Dave Chinner wrote:
> On Fri, Mar 14, 2014 at 11:37:01AM -0500, Mark Tinguely wrote:
>> The extent free intention (EFI) and extent free done (EFD)
>> log items are in separate transactions. It is possible that
>> the EFI can be pushed to the AIL before a forced shutdown
>> where it gets stuck for following reasons:
>>
>>   No EFD. If freeing the extent fails in xfs_bmap_finish() or
>>   xfs_recover_process_efi(), then the corresponding extent
>>   free done (EFD) entry will not be created.
>
> You don't handle the xfs_bmap_finish() case properly - the patch
> only addresses the case where the EFD was created and contains a
> reference to the EFI. The only time we can have an EFI in the AIL
> without an EFD pointing to it is if xfs_trans_reserve() call in
> xfs_bmap_finish() fails due to a shutdown. That failure case
> leaks the EFI reference that is supposed to be passed to the EFD...
>
>>   EFD IOP Abort processing. If xfs_trans_cancel() is called with
>>   an abort flag, or if the xfs_trans_commit() is called when the
>>   file system is in forced shutdown or if the log buffer write fails,
>>   then the EFD iop commands will not remove the EFI from the AIL.
>
> Which they should, because after xfs_trans_get_efd(), the EFD owns
> the reference to the EFI. Hence aborting the EFD should release the
> EFI reference it owns.

Nod.

>
>> Index: b/fs/xfs/xfs_extfree_item.c
>> ===================================================================
>> --- a/fs/xfs/xfs_extfree_item.c
>> +++ b/fs/xfs/xfs_extfree_item.c
>> @@ -420,8 +420,15 @@ STATIC void
>>   xfs_efd_item_unlock(
>>   	struct xfs_log_item	*lip)
>>   {
>> -	if (lip->li_flags&  XFS_LI_ABORTED)
>> +	/*
>> +	 * Clear the EFI if on AIL when aborting xfs_bmap_finish.
>> +	 * The forced shutdown will force the log, so other EFDs
>> +	 * should not be processed from the CIL.
>> +	 */
>> +	if (lip->li_flags&  XFS_LI_ABORTED) {
>> +		xfs_efi_clear_ail(lip->li_ailp);
>>   		xfs_efd_item_free(EFD_ITEM(lip));
>> +	}
>>   }
>
> So, we abort one EFI, so we kill every EFI in the filesystem? We
> want to be able to cancel and rollback transactions eventually, and
> this will prevent us from being able to do that as it will kill EFIs
> unrelated to the EFD being aborted.
>
> AFAICT, all this needs to do is:
>
> 	if (!(lip->li_flags&  XFS_LI_ABORTED))
> 		return;
>
> 	xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
> 	xfs_efd_item_free(efdp);
>
> i.e. before freeing the EFD, release the reference to the EFI
> that it was passed.

Nod.

>> @@ -439,10 +446,15 @@ xfs_efd_item_committed(
>>   	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
>>
>>   	/*
>> -	 * If we got a log I/O error, it's always the case that the LR with the
>> -	 * EFI got unpinned and freed before the EFD got aborted.
>> +	 * If we got a log I/O error and the EFI is also in this buffer, it
>> +	 * will be unpinned and freed before the EFD got aborted. But the EFI
>> +	 * is in an earlier transaction and could be on the AIL when the log
>> +	 * I/O error happened for this EFD. In that case, manually remove the
>> +	 * remaining EFIs from the AIL.
>>   	 */
>> -	if (!(lip->li_flags&  XFS_LI_ABORTED))
>> +	if (lip->li_flags&  XFS_LI_ABORTED)
>> +		xfs_efi_clear_ail(lip->li_ailp);
>> +	else
>>   		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
>
> Same here - we can simply always call xfs_efi_release() and we'll
> end up doing the right thing w.r.t. the EFi reference count.
>
> And if we do the right thing with the EFI reference count when
> aborting the EFI (i.e. call __xfs_efi_release() rather than freeing
> it) the aborting of transactions will correctly free all the
> references to the EFI and remove them from the AIL.


This is fine if the EFI made it to the AIL before the log abort - which 
it may or may not have done.

Log errors will not force the log. The call to 
xfs_trans_committed_bulk() with the abort flag set will not place the 
EFI on the AIL (takes the iop_unpin/continue path).

Your proposal of changing xfs_efi_committed() from freeing the EFI to a 
call to __xfs_efi_release() and changing xfs_efd_committed() to 
xfs_efi_release() will do the right thing for the counters and removing 
both the EFI/EFD entries, but it will also want to remove the 
non-existent EFI entry from the AIL. I do not like adding the EFI to the 
AIL in the abort path.

> The only case we
> then have to handle specially is the xfs_trans_reserve failure in
> xfs_bmap_finish(), where we need to release the EFI directly in the
> error path.

any reason to not move the efd creation earlier to not special case it?

> That gets rid of the "big hammer" error handling for normal runtime
> shutdown that xfs_efi_clear_ail(). i.e. we need to fix the reference
> count handling, not work around it.
>
>
>> Index: b/fs/xfs/xfs_log_recover.c
>> ===================================================================
>> --- a/fs/xfs/xfs_log_recover.c
>> +++ b/fs/xfs/xfs_log_recover.c
>> @@ -3634,20 +3634,23 @@ xlog_recover_process_data(
>>   /*
>>    * Process an extent free intent item that was recovered from
>>    * the log.  We need to free the extents that it describes.
>> + * The caller will free all EFI entries on error.
>>    */
>>   STATIC int
>>   xlog_recover_process_efi(
>> -	xfs_mount_t		*mp,
>> -	xfs_efi_log_item_t	*efip)
>> +	struct xfs_mount	*mp,
>> +	struct xfs_efi_log_item	*efip)
>>   {
>> -	xfs_efd_log_item_t	*efdp;
>> -	xfs_trans_t		*tp;
>> +	struct xfs_efd_log_item	*efdp;
>> +	struct xfs_trans	*tp;
>>   	int			i;
>>   	int			error = 0;
>>   	xfs_extent_t		*extp;
>>   	xfs_fsblock_t		startblock_fsb;
>>
>>   	ASSERT(!test_bit(XFS_EFI_RECOVERED,&efip->efi_flags));
>> +	/* All paths need the XFS_EFI_RECOVERED flag set. Do it here. */
>> +	set_bit(XFS_EFI_RECOVERED,&efip->efi_flags);
>>
>>   	/*
>>   	 * First check the validity of the extents described by the
>> @@ -3662,12 +3665,6 @@ xlog_recover_process_efi(
>>   		    (extp->ext_len == 0) ||
>>   		    (startblock_fsb>= mp->m_sb.sb_dblocks) ||
>>   		    (extp->ext_len>= mp->m_sb.sb_agblocks)) {
>> -			/*
>> -			 * This will pull the EFI from the AIL and
>> -			 * free the memory associated with it.
>> -			 */
>> -			set_bit(XFS_EFI_RECOVERED,&efip->efi_flags);
>> -			xfs_efi_release(efip, efip->efi_format.efi_nextents);
>>   			return XFS_ERROR(EIO);
>>   		}
>>   	}
>> @@ -3687,7 +3684,6 @@ xlog_recover_process_efi(
>>   					 extp->ext_len);
>>   	}
>>
>> -	set_bit(XFS_EFI_RECOVERED,&efip->efi_flags);
>>   	error = xfs_trans_commit(tp, 0);
>>   	return error;
>
> This is basically saying "xfs_efi_release() should drop both the AIL
> and the EFD reference as we really only only have one reference". I
> think we should simply remove the XFS_EFI_RECOVERED bit from the
> reference counting code, and simply subtract one of the EFI
> references in xlog_recover_efi_pass2() where we insert the EFI into
> the AIL, so that it behaves exactly like the runtime case when
> later processing the EFDs.
>
> That is, in xlog_recover_efd_pass2() when we find a matching EFD we
> simply call xfs_efi_release() and that does all the freeing because
> we're removing the EFD reference which is the only remaining
> reference.
>
> Then in xlog_recover_process_efi(), we simply call xfs_efi_release()
> for the error cases to drop the last reference to the EFI, otherwise
> the commit of the EFD will free it because it calls
> xfs_efi_release() appropriately.

Works for me.

>
>>
>> @@ -3718,8 +3714,8 @@ STATIC int
>>   xlog_recover_process_efis(
>>   	struct xlog	*log)
>>   {
>> -	xfs_log_item_t		*lip;
>> -	xfs_efi_log_item_t	*efip;
>> +	struct xfs_log_item	*lip;
>> +	struct xfs_efi_log_item	*efip;
>>   	int			error = 0;
>>   	struct xfs_ail_cursor	cur;
>>   	struct xfs_ail		*ailp;
>> @@ -3753,12 +3749,13 @@ xlog_recover_process_efis(
>>   		error = xlog_recover_process_efi(log->l_mp, efip);
>>   		spin_lock(&ailp->xa_lock);
>>   		if (error)
>> -			goto out;
>> +			break;
>>   		lip = xfs_trans_ail_cursor_next(ailp,&cur);
>>   	}
>> -out:
>>   	xfs_trans_ail_cursor_done(ailp,&cur);
>>   	spin_unlock(&ailp->xa_lock);
>> +	if (error)
>> +		xfs_efi_clear_ail(ailp);
>>   	return error;
>>   }
>
> If we fix the reference counting, then I think all we need here is:
>
> 	int	error = 0;
>
> 	while (ail cursor next) {
> 		....
> 		if (!error)
> 			error = xlog_recover_process_efi()
> 		else
> 			xfs_efi_release(efip)
> 		....
> 	}
>
> So that any error will trigger freeing of all the other unprocessed
> EFIs on the AIL that are pending recovery without needing any more
> special looping....
>

Nod.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: free the EFI entries from AIL on forced shutdown
  2014-03-20 19:47   ` Mark Tinguely
@ 2014-03-20 20:30     ` Mark Tinguely
  2014-03-20 23:11     ` Dave Chinner
  1 sibling, 0 replies; 5+ messages in thread
From: Mark Tinguely @ 2014-03-20 20:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 03/20/14 14:47, Mark Tinguely wrote:
> On 03/17/14 00:29, Dave Chinner wrote:

>
>> The only case we
>> then have to handle specially is the xfs_trans_reserve failure in
>> xfs_bmap_finish(), where we need to release the EFI directly in the
>> error path.
>
> any reason to not move the efd creation earlier to not special case it?

Umm, other than no ticket or space for the log_item. Never mind, we have 
to special case it.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: free the EFI entries from AIL on forced shutdown
  2014-03-20 19:47   ` Mark Tinguely
  2014-03-20 20:30     ` Mark Tinguely
@ 2014-03-20 23:11     ` Dave Chinner
  1 sibling, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2014-03-20 23:11 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On Thu, Mar 20, 2014 at 02:47:45PM -0500, Mark Tinguely wrote:
> On 03/17/14 00:29, Dave Chinner wrote:
> >On Fri, Mar 14, 2014 at 11:37:01AM -0500, Mark Tinguely wrote:
> >>@@ -439,10 +446,15 @@ xfs_efd_item_committed(
> >>  	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
> >>
> >>  	/*
> >>-	 * If we got a log I/O error, it's always the case that the LR with the
> >>-	 * EFI got unpinned and freed before the EFD got aborted.
> >>+	 * If we got a log I/O error and the EFI is also in this buffer, it
> >>+	 * will be unpinned and freed before the EFD got aborted. But the EFI
> >>+	 * is in an earlier transaction and could be on the AIL when the log
> >>+	 * I/O error happened for this EFD. In that case, manually remove the
> >>+	 * remaining EFIs from the AIL.
> >>  	 */
> >>-	if (!(lip->li_flags&  XFS_LI_ABORTED))
> >>+	if (lip->li_flags&  XFS_LI_ABORTED)
> >>+		xfs_efi_clear_ail(lip->li_ailp);
> >>+	else
> >>  		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
> >
> >Same here - we can simply always call xfs_efi_release() and we'll
> >end up doing the right thing w.r.t. the EFi reference count.
> >
> >And if we do the right thing with the EFI reference count when
> >aborting the EFI (i.e. call __xfs_efi_release() rather than freeing
> >it) the aborting of transactions will correctly free all the
> >references to the EFI and remove them from the AIL.
> 
> This is fine if the EFI made it to the AIL before the log abort -
> which it may or may not have done.
> 
> Log errors will not force the log. The call to
> xfs_trans_committed_bulk() with the abort flag set will not place
> the EFI on the AIL (takes the iop_unpin/continue path).

Sure - we don't want it added to the AIL on abort. Instead, we want
the reference count that was intended for the AIL to be dropped when
we abort the object so that we can free the object when the last
reference count goes away.

> Your proposal of changing xfs_efi_committed() from freeing the EFI
> to a call to __xfs_efi_release() and changing xfs_efd_committed() to
> xfs_efi_release() will do the right thing for the counters and
> removing both the EFI/EFD entries, but it will also want to remove
> the non-existent EFI entry from the AIL. I do not like adding the
> EFI to the AIL in the abort path.

If the EFI is not in the AIL, then don't try to remove it from the
AIL.  We do this for all sorts of objects on abort/shutdown. e.g.
the abort path in xfs_buf_item_unlock() checks if the item is in the
AIL before removing it. We do the same when flushing a dquot and a
shutdown is encountered. Same for aborting an inode flush, or
completing an inode cluster freeing.

> >The only case we
> >then have to handle specially is the xfs_trans_reserve failure in
> >xfs_bmap_finish(), where we need to release the EFI directly in the
> >error path.
> 
> any reason to not move the efd creation earlier to not special case it?

You can't join objects to a transaction before you've reserved space
for the transaction.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-03-20 23:14 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-14 16:37 [PATCH] xfs: free the EFI entries from AIL on forced shutdown Mark Tinguely
2014-03-17  5:29 ` Dave Chinner
2014-03-20 19:47   ` Mark Tinguely
2014-03-20 20:30     ` Mark Tinguely
2014-03-20 23:11     ` Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).