* [PATCH 0/2] xfs: fix a class of shutdown hangs @ 2012-03-23 1:47 Dave Chinner 2012-03-23 1:47 ` [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk Dave Chinner 2012-03-23 1:47 ` [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() Dave Chinner 0 siblings, 2 replies; 10+ messages in thread From: Dave Chinner @ 2012-03-23 1:47 UTC (permalink / raw) To: xfs The shutdown hang these two patches fix was tripped over by RH QA when some other (unknown) problem resulting in trying to remove an EFI log item from the AIL when it was not in the AIL at all. The shutdown was triggered from log IO completion context, and hung trying to wait for log IO completion. IOWs, the patches will allow the shutdown to finish and the filesystem to be unmounted, but do nothing to fix the unknown problem that caused the shutdown to be triggered. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk 2012-03-23 1:47 [PATCH 0/2] xfs: fix a class of shutdown hangs Dave Chinner @ 2012-03-23 1:47 ` Dave Chinner 2012-03-24 17:01 ` Christoph Hellwig 2012-03-23 1:47 ` [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() Dave Chinner 1 sibling, 1 reply; 10+ messages in thread From: Dave Chinner @ 2012-03-23 1:47 UTC (permalink / raw) To: xfs From: Dave Chinner <dchinner@redhat.com> xfs_trans_ail_delete_bulk() can be called from different contexts so if the itemis not in the AIL we need different shutdown for each context. Move the shutdown to the call locations to prepare for changing the shutdown methods where appropriate. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/xfs_buf_item.c | 12 ++++++++++-- fs/xfs/xfs_dquot.c | 9 ++++++--- fs/xfs/xfs_dquot_item.c | 5 ++++- fs/xfs/xfs_extfree_item.c | 7 ++++++- fs/xfs/xfs_iget.c | 9 ++++++--- fs/xfs/xfs_inode_item.c | 16 +++++++++++++--- fs/xfs/xfs_log_recover.c | 12 +++++++----- fs/xfs/xfs_trans_ail.c | 10 ++++++---- fs/xfs/xfs_trans_priv.h | 6 +++--- 9 files changed, 61 insertions(+), 25 deletions(-) diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index eac97ef..3e5f654 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -454,8 +454,13 @@ xfs_buf_item_unpin( bp->b_fspriv = NULL; bp->b_iodone = NULL; } else { + int error; + spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)bip); + error = xfs_trans_ail_delete(ailp, lip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); xfs_buf_item_relse(bp); ASSERT(bp->b_fspriv == NULL); } @@ -1030,6 +1035,7 @@ xfs_buf_iodone( struct xfs_log_item *lip) { struct xfs_ail *ailp = lip->li_ailp; + int error; ASSERT(BUF_ITEM(lip)->bli_buf == bp); @@ -1045,6 +1051,8 @@ xfs_buf_iodone( * Either way, AIL is useless if we're forcing a shutdown. */ spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, lip); + error = xfs_trans_ail_delete(ailp, lip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, SHUTDOWN_CORRUPT_INCORE); xfs_buf_item_free(BUF_ITEM(lip)); } diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index 4be16a0..cd5bd4b 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -856,9 +856,12 @@ xfs_qm_dqflush_done( /* xfs_trans_ail_delete() drops the AIL lock. */ spin_lock(&ailp->xa_lock); - if (lip->li_lsn == qip->qli_flush_lsn) - xfs_trans_ail_delete(ailp, lip); - else + if (lip->li_lsn == qip->qli_flush_lsn) { + int error = xfs_trans_ail_delete(ailp, lip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); + } else spin_unlock(&ailp->xa_lock); } diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 34baeae..69a098c 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -448,13 +448,16 @@ xfs_qm_qoffend_logitem_committed( struct xfs_qoff_logitem *qfe = QOFF_ITEM(lip); struct xfs_qoff_logitem *qfs = qfe->qql_start_lip; struct xfs_ail *ailp = qfs->qql_item.li_ailp; + int error; /* * Delete the qoff-start logitem from the AIL. * xfs_trans_ail_delete() drops the AIL lock. */ spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)qfs); + error = xfs_trans_ail_delete(ailp, (struct xfs_log_item *)qfs); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, SHUTDOWN_CORRUPT_INCORE); kmem_free(qfs); kmem_free(qfe); diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 35c2aff..4ccf2b6 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -62,9 +62,14 @@ __xfs_efi_release( struct xfs_ail *ailp = efip->efi_item.li_ailp; if (!test_and_clear_bit(XFS_EFI_COMMITTED, &efip->efi_flags)) { + int error; + spin_lock(&ailp->xa_lock); /* xfs_trans_ail_delete() drops the AIL lock. */ - xfs_trans_ail_delete(ailp, &efip->efi_item); + error = xfs_trans_ail_delete(ailp, &efip->efi_item); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); xfs_efi_item_free(efip); } } diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c index bcc6c24..4700ba4 100644 --- a/fs/xfs/xfs_iget.c +++ b/fs/xfs/xfs_iget.c @@ -135,9 +135,12 @@ xfs_inode_free( XFS_FORCED_SHUTDOWN(ip->i_mount)); if (lip->li_flags & XFS_LI_IN_AIL) { spin_lock(&ailp->xa_lock); - if (lip->li_flags & XFS_LI_IN_AIL) - xfs_trans_ail_delete(ailp, lip); - else + if (lip->li_flags & XFS_LI_IN_AIL) { + int error = xfs_trans_ail_delete(ailp, lip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); + } else spin_unlock(&ailp->xa_lock); } xfs_inode_item_destroy(ip); diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index 05d924e..b0a813f 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -837,7 +837,9 @@ xfs_iflush_done( */ if (need_ail) { struct xfs_log_item *log_items[need_ail]; - int i = 0; + int i = 0; + int error; + spin_lock(&ailp->xa_lock); for (blip = lip; blip; blip = blip->li_bio_list) { iip = INODE_ITEM(blip); @@ -848,7 +850,10 @@ xfs_iflush_done( ASSERT(i <= need_ail); } /* xfs_trans_ail_delete_bulk() drops the AIL lock. */ - xfs_trans_ail_delete_bulk(ailp, log_items, i); + error = xfs_trans_ail_delete_bulk(ailp, log_items, i); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); } @@ -887,8 +892,13 @@ xfs_iflush_abort( if (iip->ili_item.li_flags & XFS_LI_IN_AIL) { spin_lock(&ailp->xa_lock); if (iip->ili_item.li_flags & XFS_LI_IN_AIL) { + int error; /* xfs_trans_ail_delete() drops the AIL lock. */ - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)iip); + error = xfs_trans_ail_delete(ailp, + (xfs_log_item_t *)iip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); } else spin_unlock(&ailp->xa_lock); } diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 7c75c73..63df121 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2638,11 +2638,13 @@ xlog_recover_efd_pass2( if (lip->li_type == XFS_LI_EFI) { efip = (xfs_efi_log_item_t *)lip; if (efip->efi_format.efi_id == efi_id) { - /* - * xfs_trans_ail_delete() drops the - * AIL lock. - */ - xfs_trans_ail_delete(ailp, lip); + int error; + + /* xfs_trans_ail_delete() drops the AIL lock. */ + error = xfs_trans_ail_delete(ailp, lip); + if (error == EFSCORRUPTED) + xfs_force_shutdown(ailp->xa_mount, + SHUTDOWN_CORRUPT_INCORE); xfs_efi_item_free(efip); spin_lock(&ailp->xa_lock); break; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 1dead07..4c94a1f 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -694,9 +694,10 @@ xfs_trans_ail_update_bulk( * of traffic on the lock, especially during IO completion. * * This function must be called with the AIL lock held. The lock is dropped - * before returning. + * before returning. If EFSCORRUPTED is returned, the caller must shut down the + * filesystem. */ -void +int xfs_trans_ail_delete_bulk( struct xfs_ail *ailp, struct xfs_log_item **log_items, @@ -718,9 +719,9 @@ xfs_trans_ail_delete_bulk( xfs_alert_tag(mp, XFS_PTAG_AILDELETE, "%s: attempting to delete a log item that is not in the AIL", __func__); - xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + return EFSCORRUPTED; } - return; + return 0; } xfs_ail_delete(ailp, lip); @@ -735,6 +736,7 @@ xfs_trans_ail_delete_bulk( xlog_assign_tail_lsn(ailp->xa_mount); xfs_log_space_wake(ailp->xa_mount); } + return 0; } /* diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h index 8ab2ced..8ca636a 100644 --- a/fs/xfs/xfs_trans_priv.h +++ b/fs/xfs/xfs_trans_priv.h @@ -89,15 +89,15 @@ xfs_trans_ail_update( xfs_trans_ail_update_bulk(ailp, NULL, &lip, 1, lsn); } -void xfs_trans_ail_delete_bulk(struct xfs_ail *ailp, +int xfs_trans_ail_delete_bulk(struct xfs_ail *ailp, struct xfs_log_item **log_items, int nr_items) __releases(ailp->xa_lock); -static inline void +static inline int xfs_trans_ail_delete( struct xfs_ail *ailp, xfs_log_item_t *lip) __releases(ailp->xa_lock) { - xfs_trans_ail_delete_bulk(ailp, &lip, 1); + return xfs_trans_ail_delete_bulk(ailp, &lip, 1); } void xfs_ail_push(struct xfs_ail *, xfs_lsn_t); -- 1.7.9 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk 2012-03-23 1:47 ` [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk Dave Chinner @ 2012-03-24 17:01 ` Christoph Hellwig 0 siblings, 0 replies; 10+ messages in thread From: Christoph Hellwig @ 2012-03-24 17:01 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs > spin_lock(&ailp->xa_lock); > - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)qfs); > + error = xfs_trans_ail_delete(ailp, (struct xfs_log_item *)qfs); please make this a: error = xfs_trans_ail_delete(ailp, &qfs->qli_item); while you're at it. > if (lip->li_flags & XFS_LI_IN_AIL) { > spin_lock(&ailp->xa_lock); > - if (lip->li_flags & XFS_LI_IN_AIL) > - xfs_trans_ail_delete(ailp, lip); > - else > + if (lip->li_flags & XFS_LI_IN_AIL) { > + int error = xfs_trans_ail_delete(ailp, lip); > + if (error == EFSCORRUPTED) > + xfs_force_shutdown(ailp->xa_mount, > + SHUTDOWN_CORRUPT_INCORE); > + } else Given that we already have two checks for XFS_LI_IN_AIL around the call to xfs_trans_ail_delete I don't think we need to handle the error here. > @@ -887,8 +892,13 @@ xfs_iflush_abort( > if (iip->ili_item.li_flags & XFS_LI_IN_AIL) { > spin_lock(&ailp->xa_lock); > if (iip->ili_item.li_flags & XFS_LI_IN_AIL) { > + int error; > /* xfs_trans_ail_delete() drops the AIL lock. */ > - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)iip); > + error = xfs_trans_ail_delete(ailp, > + (xfs_log_item_t *)iip); > + if (error == EFSCORRUPTED) > + xfs_force_shutdown(ailp->xa_mount, > + SHUTDOWN_CORRUPT_INCORE); Same argument here. Also please pass in &iip->>ili_item instead of the cast here. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() 2012-03-23 1:47 [PATCH 0/2] xfs: fix a class of shutdown hangs Dave Chinner 2012-03-23 1:47 ` [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk Dave Chinner @ 2012-03-23 1:47 ` Dave Chinner 2012-03-24 17:03 ` Christoph Hellwig 1 sibling, 1 reply; 10+ messages in thread From: Dave Chinner @ 2012-03-23 1:47 UTC (permalink / raw) To: xfs From: Dave Chinner <dchinner@redhat.com> When a shutdown is triggered from failing to find an item in the AIL during delete, we can be called from either metadata IO completion context or from log IO completion context. In the case of log IO completion context, we must indicate that this is a log error so that the forced shutdown does not attempt to flush the log. To flush the log whilst in log IO completion will cause a deadlock as the shutdown won't proceed until log IO completes, and log Io cannot complete because it has blocked waiting for itself to complete.... We delete items in the AIL from log IO completion when we are unpinning in-memory only items, or items that do not require writeback to remove from the AIL (e.g. EFI/EFD items). Hence there are several locations that need this treatment. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/xfs_buf_item.c | 2 +- fs/xfs/xfs_dquot_item.c | 2 +- fs/xfs/xfs_extfree_item.c | 2 +- fs/xfs/xfs_inode.c | 4 ++-- fs/xfs/xfs_inode_item.c | 6 ++++-- fs/xfs/xfs_inode_item.h | 2 +- 6 files changed, 10 insertions(+), 8 deletions(-) diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 3e5f654..07fac37 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -460,7 +460,7 @@ xfs_buf_item_unpin( error = xfs_trans_ail_delete(ailp, lip); if (error == EFSCORRUPTED) xfs_force_shutdown(ailp->xa_mount, - SHUTDOWN_CORRUPT_INCORE); + SHUTDOWN_LOG_IO_ERROR); xfs_buf_item_relse(bp); ASSERT(bp->b_fspriv == NULL); } diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 69a098c..452fb24 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -457,7 +457,7 @@ xfs_qm_qoffend_logitem_committed( spin_lock(&ailp->xa_lock); error = xfs_trans_ail_delete(ailp, (struct xfs_log_item *)qfs); if (error == EFSCORRUPTED) - xfs_force_shutdown(ailp->xa_mount, SHUTDOWN_CORRUPT_INCORE); + xfs_force_shutdown(ailp->xa_mount, SHUTDOWN_LOG_IO_ERROR); kmem_free(qfs); kmem_free(qfe); diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 4ccf2b6..40d1b0e 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -69,7 +69,7 @@ __xfs_efi_release( error = xfs_trans_ail_delete(ailp, &efip->efi_item); if (error == EFSCORRUPTED) xfs_force_shutdown(ailp->xa_mount, - SHUTDOWN_CORRUPT_INCORE); + SHUTDOWN_LOG_IO_ERROR); xfs_efi_item_free(efip); } } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index bc46c0a..4fb2e99 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2377,7 +2377,7 @@ cluster_corrupt_out: /* * Unlocks the flush lock */ - xfs_iflush_abort(iq); + xfs_iflush_abort(iq, false); kmem_free(ilist); xfs_perag_put(pag); return XFS_ERROR(EFSCORRUPTED); @@ -2503,7 +2503,7 @@ cluster_corrupt_out: /* * Unlocks the flush lock */ - xfs_iflush_abort(ip); + xfs_iflush_abort(ip, false); return XFS_ERROR(EFSCORRUPTED); } diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index b0a813f..bd7fd73 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -883,7 +883,8 @@ xfs_iflush_done( */ void xfs_iflush_abort( - xfs_inode_t *ip) + struct xfs_inode *ip, + bool stale) { xfs_inode_log_item_t *iip = ip->i_itemp; @@ -898,6 +899,7 @@ xfs_iflush_abort( (xfs_log_item_t *)iip); if (error == EFSCORRUPTED) xfs_force_shutdown(ailp->xa_mount, + stale ? SHUTDOWN_LOG_IO_ERROR : SHUTDOWN_CORRUPT_INCORE); } else spin_unlock(&ailp->xa_lock); @@ -925,7 +927,7 @@ xfs_istale_done( struct xfs_buf *bp, struct xfs_log_item *lip) { - xfs_iflush_abort(INODE_ITEM(lip)->ili_inode); + xfs_iflush_abort(INODE_ITEM(lip)->ili_inode, true); } /* diff --git a/fs/xfs/xfs_inode_item.h b/fs/xfs/xfs_inode_item.h index 41d61c3..376d4d0 100644 --- a/fs/xfs/xfs_inode_item.h +++ b/fs/xfs/xfs_inode_item.h @@ -165,7 +165,7 @@ extern void xfs_inode_item_init(struct xfs_inode *, struct xfs_mount *); extern void xfs_inode_item_destroy(struct xfs_inode *); extern void xfs_iflush_done(struct xfs_buf *, struct xfs_log_item *); extern void xfs_istale_done(struct xfs_buf *, struct xfs_log_item *); -extern void xfs_iflush_abort(struct xfs_inode *); +extern void xfs_iflush_abort(struct xfs_inode *, bool); extern int xfs_inode_item_format_convert(xfs_log_iovec_t *, xfs_inode_log_format_t *); -- 1.7.9 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() 2012-03-23 1:47 ` [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() Dave Chinner @ 2012-03-24 17:03 ` Christoph Hellwig 2012-03-24 22:46 ` Dave Chinner 0 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2012-03-24 17:03 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On Fri, Mar 23, 2012 at 12:47:43PM +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > When a shutdown is triggered from failing to find an item in the AIL > during delete, we can be called from either metadata IO completion > context or from log IO completion context. In the case of log IO > completion context, we must indicate that this is a log error so > that the forced shutdown does not attempt to flush the log. > > To flush the log whilst in log IO completion will cause a deadlock > as the shutdown won't proceed until log IO completes, and log Io > cannot complete because it has blocked waiting for itself to > complete.... > > We delete items in the AIL from log IO completion when we are > unpinning in-memory only items, or items that do not require > writeback to remove from the AIL (e.g. EFI/EFD items). Hence there > are several locations that need this treatment. Looks good. I wonder if it might be simple to simply pass a flags argument to xfs_ail_delete(_bulk) which tells which kind of shutdown to do. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() 2012-03-24 17:03 ` Christoph Hellwig @ 2012-03-24 22:46 ` Dave Chinner 2012-03-25 11:28 ` Christoph Hellwig 0 siblings, 1 reply; 10+ messages in thread From: Dave Chinner @ 2012-03-24 22:46 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Sat, Mar 24, 2012 at 01:03:02PM -0400, Christoph Hellwig wrote: > On Fri, Mar 23, 2012 at 12:47:43PM +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > When a shutdown is triggered from failing to find an item in the AIL > > during delete, we can be called from either metadata IO completion > > context or from log IO completion context. In the case of log IO > > completion context, we must indicate that this is a log error so > > that the forced shutdown does not attempt to flush the log. > > > > To flush the log whilst in log IO completion will cause a deadlock > > as the shutdown won't proceed until log IO completes, and log Io > > cannot complete because it has blocked waiting for itself to > > complete.... > > > > We delete items in the AIL from log IO completion when we are > > unpinning in-memory only items, or items that do not require > > writeback to remove from the AIL (e.g. EFI/EFD items). Hence there > > are several locations that need this treatment. > > Looks good. I wonder if it might be simple to simply pass a flags > argument to xfs_ail_delete(_bulk) which tells which kind of shutdown > to do. I thought about doing that, but it seems strange and unusual to tell code how to handle errors internally instead of returning the error and letting the caller handle it... I don't mind either way, though. If you prefer I pass in the shutdown flag, I can change it all to do that.... Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() 2012-03-24 22:46 ` Dave Chinner @ 2012-03-25 11:28 ` Christoph Hellwig 2012-03-27 9:45 ` [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) Dave Chinner 0 siblings, 1 reply; 10+ messages in thread From: Christoph Hellwig @ 2012-03-25 11:28 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs On Sun, Mar 25, 2012 at 09:46:49AM +1100, Dave Chinner wrote: > > Looks good. I wonder if it might be simple to simply pass a flags > > argument to xfs_ail_delete(_bulk) which tells which kind of shutdown > > to do. > > I thought about doing that, but it seems strange and unusual to tell > code how to handle errors internally instead of returning the error > and letting the caller handle it... > > I don't mind either way, though. If you prefer I pass in the > shutdown flag, I can change it all to do that.... I don't have a strong opinion. Moving the shutdown to the caller is cleaner for sure, but it's a lot of churn when the other version would _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) 2012-03-25 11:28 ` Christoph Hellwig @ 2012-03-27 9:45 ` Dave Chinner 2012-03-27 9:57 ` Christoph Hellwig 2012-04-06 17:42 ` Mark Tinguely 0 siblings, 2 replies; 10+ messages in thread From: Dave Chinner @ 2012-03-27 9:45 UTC (permalink / raw) To: Christoph Hellwig; +Cc: xfs On Sun, Mar 25, 2012 at 07:28:01AM -0400, Christoph Hellwig wrote: > On Sun, Mar 25, 2012 at 09:46:49AM +1100, Dave Chinner wrote: > > > Looks good. I wonder if it might be simple to simply pass a flags > > > argument to xfs_ail_delete(_bulk) which tells which kind of shutdown > > > to do. > > > > I thought about doing that, but it seems strange and unusual to tell > > code how to handle errors internally instead of returning the error > > and letting the caller handle it... > > > > I don't mind either way, though. If you prefer I pass in the > > shutdown flag, I can change it all to do that.... > > I don't have a strong opinion. Moving the shutdown to the caller is > cleaner for sure, but it's a lot of churn when the other version would Ok, Here's a "pass-in-the-shutdown-method" version of the change. Cheers, Dave. -- Dave Chinner david@fromorbit.com xfs: pass shutdown method into xfs_trans_ail_delete_bulk From: Dave Chinner <dchinner@redhat.com> xfs_trans_ail_delete_bulk() can be called from different contexts so if the item is not in the AIL we need different shutdown for each context. Pass in the shutdown method needed so the correct action can be taken. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/xfs_buf_item.c | 4 ++-- fs/xfs/xfs_dquot.c | 2 +- fs/xfs/xfs_dquot_item.c | 2 +- fs/xfs/xfs_extfree_item.c | 3 ++- fs/xfs/xfs_iget.c | 3 ++- fs/xfs/xfs_inode.c | 4 ++-- fs/xfs/xfs_inode_item.c | 23 +++++++++++++---------- fs/xfs/xfs_inode_item.h | 2 +- fs/xfs/xfs_log_recover.c | 3 ++- fs/xfs/xfs_trans_ail.c | 5 +++-- fs/xfs/xfs_trans_priv.h | 8 +++++--- 11 files changed, 34 insertions(+), 25 deletions(-) diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index eac97ef..8111a65 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -455,7 +455,7 @@ xfs_buf_item_unpin( bp->b_iodone = NULL; } else { spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)bip); + xfs_trans_ail_delete(ailp, lip, SHUTDOWN_LOG_IO_ERROR); xfs_buf_item_relse(bp); ASSERT(bp->b_fspriv == NULL); } @@ -1045,6 +1045,6 @@ xfs_buf_iodone( * Either way, AIL is useless if we're forcing a shutdown. */ spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, lip); + xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE); xfs_buf_item_free(BUF_ITEM(lip)); } diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index 1155208..abbdb7a 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -857,7 +857,7 @@ xfs_qm_dqflush_done( /* xfs_trans_ail_delete() drops the AIL lock. */ spin_lock(&ailp->xa_lock); if (lip->li_lsn == qip->qli_flush_lsn) - xfs_trans_ail_delete(ailp, lip); + xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE); else spin_unlock(&ailp->xa_lock); } diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index 34baeae..4e3127b 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -454,7 +454,7 @@ xfs_qm_qoffend_logitem_committed( * xfs_trans_ail_delete() drops the AIL lock. */ spin_lock(&ailp->xa_lock); - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)qfs); + xfs_trans_ail_delete(ailp, &qfs->qql_item, SHUTDOWN_LOG_IO_ERROR); kmem_free(qfs); kmem_free(qfe); diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 35c2aff..be61d1d 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -64,7 +64,8 @@ __xfs_efi_release( if (!test_and_clear_bit(XFS_EFI_COMMITTED, &efip->efi_flags)) { spin_lock(&ailp->xa_lock); /* xfs_trans_ail_delete() drops the AIL lock. */ - xfs_trans_ail_delete(ailp, &efip->efi_item); + xfs_trans_ail_delete(ailp, &efip->efi_item, + SHUTDOWN_LOG_IO_ERROR); xfs_efi_item_free(efip); } } diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c index bcc6c24..b9df823 100644 --- a/fs/xfs/xfs_iget.c +++ b/fs/xfs/xfs_iget.c @@ -136,7 +136,8 @@ xfs_inode_free( if (lip->li_flags & XFS_LI_IN_AIL) { spin_lock(&ailp->xa_lock); if (lip->li_flags & XFS_LI_IN_AIL) - xfs_trans_ail_delete(ailp, lip); + xfs_trans_ail_delete(ailp, lip, + SHUTDOWN_CORRUPT_INCORE); else spin_unlock(&ailp->xa_lock); } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index bc46c0a..4fb2e99 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2377,7 +2377,7 @@ cluster_corrupt_out: /* * Unlocks the flush lock */ - xfs_iflush_abort(iq); + xfs_iflush_abort(iq, false); kmem_free(ilist); xfs_perag_put(pag); return XFS_ERROR(EFSCORRUPTED); @@ -2503,7 +2503,7 @@ cluster_corrupt_out: /* * Unlocks the flush lock */ - xfs_iflush_abort(ip); + xfs_iflush_abort(ip, false); return XFS_ERROR(EFSCORRUPTED); } diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index 05d924e..696f565 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -848,7 +848,8 @@ xfs_iflush_done( ASSERT(i <= need_ail); } /* xfs_trans_ail_delete_bulk() drops the AIL lock. */ - xfs_trans_ail_delete_bulk(ailp, log_items, i); + xfs_trans_ail_delete_bulk(ailp, log_items, i, + SHUTDOWN_CORRUPT_INCORE); } @@ -869,16 +870,15 @@ xfs_iflush_done( } /* - * This is the inode flushing abort routine. It is called - * from xfs_iflush when the filesystem is shutting down to clean - * up the inode state. - * It is responsible for removing the inode item - * from the AIL if it has not been re-logged, and unlocking the inode's - * flush lock. + * This is the inode flushing abort routine. It is called from xfs_iflush when + * the filesystem is shutting down to clean up the inode state. It is + * responsible for removing the inode item from the AIL if it has not been + * re-logged, and unlocking the inode's flush lock. */ void xfs_iflush_abort( - xfs_inode_t *ip) + xfs_inode_t *ip, + bool stale) { xfs_inode_log_item_t *iip = ip->i_itemp; @@ -888,7 +888,10 @@ xfs_iflush_abort( spin_lock(&ailp->xa_lock); if (iip->ili_item.li_flags & XFS_LI_IN_AIL) { /* xfs_trans_ail_delete() drops the AIL lock. */ - xfs_trans_ail_delete(ailp, (xfs_log_item_t *)iip); + xfs_trans_ail_delete(ailp, &iip->ili_item, + stale ? + SHUTDOWN_LOG_IO_ERROR : + SHUTDOWN_CORRUPT_INCORE); } else spin_unlock(&ailp->xa_lock); } @@ -915,7 +918,7 @@ xfs_istale_done( struct xfs_buf *bp, struct xfs_log_item *lip) { - xfs_iflush_abort(INODE_ITEM(lip)->ili_inode); + xfs_iflush_abort(INODE_ITEM(lip)->ili_inode, true); } /* diff --git a/fs/xfs/xfs_inode_item.h b/fs/xfs/xfs_inode_item.h index 41d61c3..376d4d0 100644 --- a/fs/xfs/xfs_inode_item.h +++ b/fs/xfs/xfs_inode_item.h @@ -165,7 +165,7 @@ extern void xfs_inode_item_init(struct xfs_inode *, struct xfs_mount *); extern void xfs_inode_item_destroy(struct xfs_inode *); extern void xfs_iflush_done(struct xfs_buf *, struct xfs_log_item *); extern void xfs_istale_done(struct xfs_buf *, struct xfs_log_item *); -extern void xfs_iflush_abort(struct xfs_inode *); +extern void xfs_iflush_abort(struct xfs_inode *, bool); extern int xfs_inode_item_format_convert(xfs_log_iovec_t *, xfs_inode_log_format_t *); diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 7c75c73..41f0694 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2642,7 +2642,8 @@ xlog_recover_efd_pass2( * xfs_trans_ail_delete() drops the * AIL lock. */ - xfs_trans_ail_delete(ailp, lip); + xfs_trans_ail_delete(ailp, lip, + SHUTDOWN_CORRUPT_INCORE); xfs_efi_item_free(efip); spin_lock(&ailp->xa_lock); break; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 1dead07..a6016e4 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -700,7 +700,8 @@ void xfs_trans_ail_delete_bulk( struct xfs_ail *ailp, struct xfs_log_item **log_items, - int nr_items) __releases(ailp->xa_lock) + int nr_items, + int shutdown_type) __releases(ailp->xa_lock) { xfs_log_item_t *mlip; int mlip_changed = 0; @@ -718,7 +719,7 @@ xfs_trans_ail_delete_bulk( xfs_alert_tag(mp, XFS_PTAG_AILDELETE, "%s: attempting to delete a log item that is not in the AIL", __func__); - xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + xfs_force_shutdown(mp, shutdown_type); } return; } diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h index 8ab2ced..449c881 100644 --- a/fs/xfs/xfs_trans_priv.h +++ b/fs/xfs/xfs_trans_priv.h @@ -90,14 +90,16 @@ xfs_trans_ail_update( } void xfs_trans_ail_delete_bulk(struct xfs_ail *ailp, - struct xfs_log_item **log_items, int nr_items) + struct xfs_log_item **log_items, int nr_items, + int shutdown_type) __releases(ailp->xa_lock); static inline void xfs_trans_ail_delete( struct xfs_ail *ailp, - xfs_log_item_t *lip) __releases(ailp->xa_lock) + xfs_log_item_t *lip, + int shutdown_type) __releases(ailp->xa_lock) { - xfs_trans_ail_delete_bulk(ailp, &lip, 1); + xfs_trans_ail_delete_bulk(ailp, &lip, 1, shutdown_type); } void xfs_ail_push(struct xfs_ail *, xfs_lsn_t); _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) 2012-03-27 9:45 ` [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) Dave Chinner @ 2012-03-27 9:57 ` Christoph Hellwig 2012-04-06 17:42 ` Mark Tinguely 1 sibling, 0 replies; 10+ messages in thread From: Christoph Hellwig @ 2012-03-27 9:57 UTC (permalink / raw) To: Dave Chinner; +Cc: Christoph Hellwig, xfs This one looks a lot simpler to me, so it gets my vote. Reviewed-by: Christoph Hellwig <hch@lst.de> Alex, can we push this to Linus soonish? I'll have to rebase the AIL emptying changes on top of this one, so getting it in soon would make my life a lot easier. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) 2012-03-27 9:45 ` [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) Dave Chinner 2012-03-27 9:57 ` Christoph Hellwig @ 2012-04-06 17:42 ` Mark Tinguely 1 sibling, 0 replies; 10+ messages in thread From: Mark Tinguely @ 2012-04-06 17:42 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On 03/27/12 04:45, Dave Chinner wrote: > > xfs: pass shutdown method into xfs_trans_ail_delete_bulk > > From: Dave Chinner<dchinner@redhat.com> > > xfs_trans_ail_delete_bulk() can be called from different contexts so > if the item is not in the AIL we need different shutdown for each > context. Pass in the shutdown method needed so the correct action > can be taken. > > Signed-off-by: Dave Chinner<dchinner@redhat.com> Looks good. Reviewed-by: Mark Tinguely <tinguely@sgi.com> _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2012-04-06 17:42 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-03-23 1:47 [PATCH 0/2] xfs: fix a class of shutdown hangs Dave Chinner 2012-03-23 1:47 ` [PATCH 1/2] xfs: move shutdown out of xfs_trans_ail_delete_bulk Dave Chinner 2012-03-24 17:01 ` Christoph Hellwig 2012-03-23 1:47 ` [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait() Dave Chinner 2012-03-24 17:03 ` Christoph Hellwig 2012-03-24 22:46 ` Dave Chinner 2012-03-25 11:28 ` Christoph Hellwig 2012-03-27 9:45 ` [PATCH v2] shutdown hang fix (was Re: [PATCH 2/2] xfs: avoid shutdown hang in xlog_wait()) Dave Chinner 2012-03-27 9:57 ` Christoph Hellwig 2012-04-06 17:42 ` Mark Tinguely
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox