All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH RFC 1/2] xfs: return committed status from xfs_trans_roll()
Date: Wed, 29 Jul 2015 07:47:52 -0400	[thread overview]
Message-ID: <20150729114751.GB61108@bfoster.bfoster> (raw)
In-Reply-To: <20150728215140.GC24249@dastard>

On Wed, Jul 29, 2015 at 07:51:40AM +1000, Dave Chinner wrote:
> On Tue, Jul 28, 2015 at 09:40:06AM -0400, Brian Foster wrote:
> > On Tue, Jul 28, 2015 at 10:40:09AM +1000, Dave Chinner wrote:
> > > [ reply to both patches in one reply, because it's related. ]
> > > 
> > > On Thu, Jul 23, 2015 at 04:13:29PM -0400, Brian Foster wrote:
...
> > > >  			free->xbfi_blockcount);
> > > >  
> > > > -	error = xfs_trans_roll(tp, NULL);
> > > > -	*committed = 1;
> > > > +	error = __xfs_trans_roll(tp, NULL, committed);
> > > > +
> > > >  	/*
> > > > -	 * We have a new transaction, so we should return committed=1,
> > > > -	 * even though we're returning an error.
> > > > +	 * We have a new transaction, so we should return committed=1, even
> > > > +	 * though we're returning an error. If an error was returned after the
> > > > +	 * original transaction was committed, defer the error handling until
> > > > +	 * the EFD is logged. We do this because a committed EFI requires an EFD
> > > > +	 * transaction to be processed to ensure the EFI is released.
> > > >  	 */
> > > > -	if (error)
> > > > +	if (error && *committed == 0) {
> > > > +		*committed = 1;
> > > >  		return error;
> > > > +	}
> > > 
> > > So if we failed to commit the EFI, we say we did and then return.
> > > Why do we need to do that?
> > > 
> > > /me goes an looks at all the callers.
> > > 
> > > Hmm - only xfs_itruncate_extents() relies on that behaviour, but it
> > > could just as easily do an inode join on error, because on success
> > > the inode has already been joined to the new transaction by
> > > xfs_trans_roll().
> > > 
> > 
> > Interesting, though I don't follow why this caller even depends on it.
> > It doesn't transfer lock ownership to the transaction. What difference
> > does it make in the error path if the inode is joined?
> 
> Callers of xfs_itruncate_extents() expect it to be locked and joined
> on return, even on error.
> 

All of the callers I see cancel the transaction and unlock the inode
separately on error. I could be glossing over some obvious point here,
but so far I'm not seeing it...

> > > Looking further, we have quite a bit of inconsistency in the error
> > > handling of xfs_bmap_finish() - some callers issue a
> > > xfs_bmap_cancel() in the error path, and some don't. failing to
> > > cancel the freelist on error looks to me like a memory leak, because
> > > we don't free the extents from the free list until the EFD for the
> > > extent has been logged. If we error out earlier, we still have items
> > > on the free list that haven't been processed.
> > > 
> > > So it looks to me like we need fixes there.
> > > 
> > 
> > Heh, not too surprising. I'll make a note to make a pass through these.
> > 
> > > Further, it appears to me that there is really one xfs_bmap_finish()
> > > caller that requires the committed flag: xfs_qm_dqalloc(). All the
> > > others either use it for an assert or joining the inode to the
> > > transaction when committed = 1, which xfs_trans_roll() will have
> > > already done if we return committed = 1....
> > > 
> > 
> > Assuming xfs_trans_reserve() hasn't failed, which could cause *committed
> > == 1 without the inode joined. We could probably change this in
> > __xfs_trans_roll() since the inode is presumably already locked.
> 
> I don't think we can join the inode until after the reservation is
> done. It could still be done in __xfs_trans_roll() regardless.
> 

Hmm, I don't see anything obvious that prevents it. It probably defies
convention though. Anyways, we're probably a few levels of indirection
away from the bug fixes and work that we know needs to happen at this
point. I'll try to get the EFI/EFD stuff worked out, along with some of
the other issues I'm reproducing, then we can go from there...

Brian

> > > > +	return error;
> > > 
> > > This loop doesn't obviously do the right thing now. It
> > > will log the first extent into the EFD and then trigger a shutdown
> > > and return. The extent count in the EFD may not match the
> > > extent count on the EFI, so releasing the EFD at this point may not
> > > release all the extents in the EFI and hence not release the EFI.
> > > 
> > 
> > The EFD unlock handler forcibly releases the EFI on abort. It drops the
> > EFI extent count reference by whatever the extent count on the EFD is,
> > and that is determined on EFD initialization (xfs_trans_get_efd())
> > regardless of how many extents were logged to the EFD.
> > 
> > That said, the error handling here is certainly not obvious because it
> > depends on the lifecycle of the associated log items. The broader goal
> > is to reduce that dependency so the code here is more straightforward...
> 
> *nod*
> 
> > > I think I'd prefer to see a xfs_trans_cancel_efi() call to handle
> > > this error path rather than having to go through the efd to release
> > > the reference on the EFI. i.e.
> > > 
> > > 	error = __xfs_trans_roll(tp, NULL, &committed);
> > > 	if (error) {
> > > 		if (committed) {
> > > 			if (!XFS_FORCED_SHUTDOWN(mp))
> > > 				xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > > 			xfs_efi_cancel(tp, efi);
> > > 		}
> > > 		return error;
> > > 	}
> > > 
> > 
> > That's a nice idea. It pulls some error handling out of the log item
> > handling code explicitly. An EFD version might be useful for the
> > unlogged EFD error case as well. IMO, the more of these cases that are
> > handled explicitly in xfs_bmap_finish() rather than implicitly via the
> > transaction management code, the more reliable and robust to future
> > change it will be. I'll explore it further...
> 
> Yes, that mirrors my thinking exactly - the EFI/EFD error handling
> has always been problematic with the reliance on reference counting
> via transaction commit/abort callbacks to handle it.
> 
> > > Hmmm - something I just noticed: if we only have one EFD per EFI,
> > > why do we do we have that layer of extent counting before dropping
> > > real references?
> > > 
> > 
> > I wondered this myself, but hadn't made it deep enough to see if we used
> > the reference count elsewhere.
> > 
> > > >  xfs_efd_item_unlock(
> > > >  	struct xfs_log_item	*lip)
> > > >  {
> > > > -	if (lip->li_flags & XFS_LI_ABORTED)
> > > > -		xfs_efd_item_free(EFD_ITEM(lip));
> > > > +	struct xfs_efd_log_item	*efdp = EFD_ITEM(lip);
> > > > +
> > > > +	if (lip->li_flags & XFS_LI_ABORTED) {
> > > > +		xfs_efi_release(efdp->efd_efip, efdp->efd_format.efd_nextents);
> > > > +		xfs_efd_item_free(efdp);
> > > > +	}
> > > >  }
> > > 
> > > i.e. we always call xfs_efi_release() with efi_nextents or
> > > efd_nextents, which are always the same, and so we never partially
> > > complete an EFI. Should we just kill that layer, as it does tend to
> > > complicate the EFI release code?
> > > 
> > 
> > Yeah, that might be a good idea if we don't use the reference count
> > elsewhere. I'll look into that as a subsequent cleanup as well.
> 
> Excellent!
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2015-07-29 11:47 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-23 20:13 [PATCH RFC 0/2] xfs: fix up EFI/EFD error handling Brian Foster
2015-07-23 20:13 ` [PATCH RFC 1/2] xfs: return committed status from xfs_trans_roll() Brian Foster
2015-07-28  0:40   ` Dave Chinner
2015-07-28 13:40     ` Brian Foster
2015-07-28 21:51       ` Dave Chinner
2015-07-29 11:47         ` Brian Foster [this message]
2015-07-30 17:21       ` Christoph Hellwig
2015-08-10 18:55     ` Brian Foster
2015-07-23 20:13 ` [PATCH RFC 2/2] xfs: fix efi/efd error handling to avoid fs shutdown hangs Brian Foster
2015-07-29 22:18   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150729114751.GB61108@bfoster.bfoster \
    --to=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.