From: Ben Myers <bpm@sgi.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: don't free EFIs before the EFDs are committed
Date: Fri, 5 Apr 2013 13:31:05 -0500 [thread overview]
Message-ID: <20130405183105.GC22182@sgi.com> (raw)
In-Reply-To: <1364958561-12440-1-git-send-email-david@fromorbit.com>
On Wed, Apr 03, 2013 at 02:09:21PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> Filesystems are occasionally being shut down with this error:
>
> xfs_trans_ail_delete_bulk: attempting to delete a log item that is
> not in the AIL.
>
> It was diagnosed to be related to the EFI/EFD commit order when the
> EFI and EFD are in different checkpoints and the EFD is committed
> before the EFI here:
>
> http://oss.sgi.com/archives/xfs/2013-01/msg00082.html
>
> The real problem is that a single bit cannot fully describe the
> states that the EFI/EFD processing can be in. These completion
> states are:
>
> EFI EFI in AIL EFD Result
> committed/unpinned Yes committed OK
> committed/pinned No committed Shutdown
> uncommitted No committed Shutdown
>
>
> Note that the "result" field is what should happen, not what does
> happen. The current logic is broken and handles the first two cases
> correctly by luck. That is, the code will free the EFI if the
> XFS_EFI_COMMITTED bit is *not* set, rather than if it is set. The
> inverted logic "works" because if both EFI and EFD are committed,
> then the first __xfs_efi_release() call clears the XFS_EFI_COMMITTED
> bit, and the second frees the EFI item. Hence as long as
> xfs_efi_item_committed() has been called, everything appears to be
> fine.
>
> It is the third case where the logic fails - where
> xfs_efd_item_committed() is called before xfs_efi_item_committed(),
> and that results in the EFI being freed before it has been
> committed. That is the bug that triggered the shutdown, and hence
> keeping track of whether the EFI has been committed or not is
> insufficient to correctly order the EFI/EFD operations w.r.t. the
> AIL.
>
> What we really want is this: the EFI is always placed into the
> AIL before the last reference goes away. The only way to guarantee
> that is that the EFI is not freed until after it has been unpinned
> *and* the EFD has been committed. That is, restructure the logic so
> that the only case that can occur is the first case.
>
> This can be done easily by replacing the XFS_EFI_COMMITTED with an
> EFI reference count. The EFI is initialised with it's own count, and
> that is not released until it is unpinned. However, there is a
> complication to this method - the high level EFI/EFD code in
> xfs_bmap_finish() does not hold direct references to the EFI
> structure, and runs a transaction commit between the EFI and EFD
> processing. Hence the EFI can be freed even before the EFD is
> created using such a method.
>
> Further, log recovery uses the AIL for tracking EFI/EFDs that need
> to be recovered, but it uses the AIL *differently* to the EFI
> transaction commit. Hence log recovery never pins or unpins EFIs, so
> we can't drop the EFI reference count indirectly to free the EFI.
>
> However, this doesn't prevent us from using a reference count here.
> There is a 1:1 relationship between EFIs and EFDs, so when we
> initialise the EFI we can take a reference count for the EFD as
> well. This solves the xfs_bmap_finish() issue - the EFI will never
> be freed until the EFD is processed. In terms of log recovery,
> during the committing of the EFD we can look for the
> XFS_EFI_RECOVERED bit being set and drop the EFI reference as well,
> thereby ensuring everything works correctly there as well.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
Applied.
Regards,
Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
prev parent reply other threads:[~2013-04-05 18:31 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-03 3:09 [PATCH] xfs: don't free EFIs before the EFDs are committed Dave Chinner
2013-04-03 19:12 ` Mark Tinguely
2013-04-03 19:46 ` Eric Sandeen
2013-04-03 21:02 ` Eric Sandeen
2013-04-03 21:45 ` Mark Tinguely
2013-04-04 1:31 ` Dave Chinner
2013-04-04 1:14 ` Dave Chinner
2013-04-04 22:06 ` Mark Tinguely
2013-04-05 0:45 ` Dave Chinner
2013-04-05 18:31 ` Ben Myers [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130405183105.GC22182@sgi.com \
--to=bpm@sgi.com \
--cc=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox