public inbox for linux-xfs@vger.kernel.org
From: Ben Myers <bpm@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: don't free EFIs before the EFDs are committed
Date: Fri, 5 Apr 2013 13:31:05 -0500
Message-ID: <20130405183105.GC22182@sgi.com>
In-Reply-To: <1364958561-12440-1-git-send-email-david@fromorbit.com>

On Wed, Apr 03, 2013 at 02:09:21PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Filesystems are occasionally being shut down with this error:
> 
> xfs_trans_ail_delete_bulk: attempting to delete a log item that is
> not in the AIL.
> 
> It was diagnosed to be related to the EFI/EFD commit order when the
> EFI and EFD are in different checkpoints and the EFD is committed
> before the EFI here:
> 
> http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
> 
> The real problem is that a single bit cannot fully describe the
> states that the EFI/EFD processing can be in. These completion
> states are:
> 
> EFI                   EFI in AIL      EFD             Result
> committed/unpinned    Yes             committed       OK
> committed/pinned      No              committed       Shutdown
> uncommitted           No              committed       Shutdown
> 
> 
> Note that the "result" field is what should happen, not what does
> happen. The current logic is broken and handles the first two cases
> correctly by luck.  That is, the code will free the EFI if the
> XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
> inverted logic "works" because if both EFI and EFD are committed,
> then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
> bit, and the second frees the EFI item. Hence as long as
> xfs_efi_item_committed() has been called, everything appears to be
> fine.
> 
> It is the third case where the logic fails - where
> xfs_efd_item_committed() is called before xfs_efi_item_committed(),
> and that results in the EFI being freed before it has been
> committed. That is the bug that triggered the shutdown, and hence
> keeping track of whether the EFI has been committed or not is
> insufficient to correctly order the EFI/EFD operations w.r.t. the
> AIL.
> 
> What we really want is this: the EFI is always placed into the
> AIL before the last reference goes away. The only way to guarantee
> that is that the EFI is not freed until after it has been unpinned
> *and* the EFD has been committed. That is, restructure the logic so
> that the only case that can occur is the first case.
> 
> This can be done easily by replacing the XFS_EFI_COMMITTED with an
> EFI reference count. The EFI is initialised with its own count, and
> that is not released until it is unpinned. However, there is a
> complication to this method - the high level EFI/EFD code in
> xfs_bmap_finish() does not hold direct references to the EFI
> structure, and runs a transaction commit between the EFI and EFD
> processing. Hence the EFI can be freed even before the EFD is
> created using such a method.
> 
> Further, log recovery uses the AIL for tracking EFI/EFDs that need
> to be recovered, but it uses the AIL *differently* to the EFI
> transaction commit. Hence log recovery never pins or unpins EFIs, so
> we can't drop the EFI reference count indirectly to free the EFI.
> 
> However, this doesn't prevent us from using a reference count here.
> There is a 1:1 relationship between EFIs and EFDs, so when we
> initialise the EFI we can take a reference count for the EFD as
> well. This solves the xfs_bmap_finish() issue - the EFI will never
> be freed until the EFD is processed. In terms of log recovery,
> during the committing of the EFD we can look for the
> XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
> thereby ensuring everything works correctly there as well.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Applied.
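
For anyone reading this in the archive, the reference counting described
in the commit message can be modelled roughly as below. This is only a
userspace sketch, not the patch itself: struct efi_model and the functions
efi_init(), efi_put(), efi_unpin(), efd_committed(), run_order() and
run_recovery() are hypothetical names made up for illustration, and the
model assumes that log recovery places the EFI in the AIL directly and
never unpins it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of an EFI; not the kernel structure. */
struct efi_model {
	int  refcount;          /* one reference for the EFI, one for its EFD */
	bool in_ail;            /* EFI has been inserted into the AIL */
	bool recovered;         /* models the XFS_EFI_RECOVERED bit */
	bool freed;             /* last reference has gone away */
	bool freed_outside_ail; /* would correspond to the shutdown case */
};

void efi_init(struct efi_model *efi, bool recovered)
{
	efi->refcount = 2;
	/* Assumption: recovery tracks the EFI in the AIL without pinning. */
	efi->in_ail = recovered;
	efi->recovered = recovered;
	efi->freed = false;
	efi->freed_outside_ail = false;
}

/* Drop one reference; the EFI is freed only when the count hits zero. */
void efi_put(struct efi_model *efi)
{
	assert(efi->refcount > 0);
	if (--efi->refcount == 0) {
		efi->freed = true;
		efi->freed_outside_ail = !efi->in_ail;
	}
}

/* EFI committed and unpinned: it enters the AIL, then drops its own ref. */
void efi_unpin(struct efi_model *efi)
{
	efi->in_ail = true;
	efi_put(efi);
}

/*
 * EFD committed: drop the reference taken on behalf of the EFD. In
 * recovery no unpin will ever happen, so drop the EFI's own reference too.
 */
void efd_committed(struct efi_model *efi)
{
	if (efi->recovered)
		efi_put(efi);
	efi_put(efi);
}

/* Run both completion events in either order; true if the EFI was freed
 * only after both events, and while it was in the AIL. */
bool run_order(bool efd_first)
{
	struct efi_model efi;

	efi_init(&efi, false);
	if (efd_first)
		efd_committed(&efi);
	else
		efi_unpin(&efi);
	if (efi.freed)
		return false;   /* freed after only one of the two events */
	if (efd_first)
		efi_unpin(&efi);
	else
		efd_committed(&efi);
	return efi.freed && !efi.freed_outside_ail;
}

/* Recovery path: only the EFD commit runs, and it drops both references. */
bool run_recovery(void)
{
	struct efi_model efi;

	efi_init(&efi, true);
	efd_committed(&efi);
	return efi.freed && !efi.freed_outside_ail;
}
```

In this model run_order(false), run_order(true) and run_recovery() all
return true, i.e. whichever of the EFI unpin and the EFD commit comes
first, the EFI is only freed once both have happened and it is in the AIL.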

Regards,
	Ben

