From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Dave Jones <davej@redhat.com>, CAI Qian <caiqian@redhat.com>,
xfs@oss.sgi.com
Subject: Re: xfs_efi_item slab corruption. (v3.9-10936-g51a26ae)
Date: Wed, 8 May 2013 09:54:58 +1000
Message-ID: <20130507235458.GG24635@dastard>
In-Reply-To: <51898400.8000900@sgi.com>

On Tue, May 07, 2013 at 05:45:20PM -0500, Mark Tinguely wrote:
> On 05/07/13 17:22, Dave Chinner wrote:
> >On Tue, May 07, 2013 at 03:24:28PM -0500, Mark Tinguely wrote:
> >>On 05/07/13 15:22, Dave Jones wrote:
> >>>On Tue, May 07, 2013 at 03:04:33PM -0500, Mark Tinguely wrote:
> >>> > On 05/07/13 14:59, Dave Jones wrote:
> >>> > > On Tue, May 07, 2013 at 02:58:15PM -0500, Mark Tinguely wrote:
> >>> > >
> >>> > > > > I can hit this almost instantly with fsx. I'll do a bisect, though
> >>> > > > > it sounds like you already have a suspect.
> >>> > > > >
> >>> > > >
> >>> > > > If you want to try kmem debug of Linux 3.8 that would help.
> >>> > >
> >>> > > I'm not sure what that is.
> >>> >
> >>> > Sorry, if you would test Linux 3.8 with "CONFIG_DEBUG_SLAB=y".
> >>>
> >>>Ah, done that. (I pretty much always run with it).
> >>>
> >>>This is something new. Even 3.9 was fine. It's only since
> >>>the recent xfs merge.
> >>>
> >>> Dave
> >>>
> >>
> >>git revert 666d644cd72a9ec58b353209ff191d7430f3b357
> >
> >That won't prevent the use after free. That commit fixed a problem
> >that could lead to a use after free, but what we are seeing here is
> >that it has ultimately exposed a previously unknown issue that
> >causes the use after free.
> >
> >Basically what is happening is that there are two commits for the
> >EFD being processed, when there should only be one. I'm not sure how
> >this is happening yet, but these three traces came out from my debug
> >sequentially when running generic/006:
>
> Sorry for the misleading statement. Yes, I agree that patch is a
> good thing. I meant that Dave and only Dave revert it and only to
> test if that patch was the change that caused the new symptom -
> which we know now that it is.

Sure, I realise that, and it turns out I'm wrong - it is a bug in
commit 666d644cd. Poisoning turns a "will probably never occur"
problem into an instant reproducer, because it sets a bit in the efi
structure that is normally zero when the EFI is freed and hence
triggers a second free of the EFI when reading it after the first
free....

Dave, the patch below should fix the problem.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

xfs: Don't reference the EFI after it is freed

From: Dave Chinner <dchinner@redhat.com>

Checking the EFI for whether it is being released from recovery
after we've already released the known active reference is a mistake
worthy of a brown paper bag. Fix the (now) obvious use after free
that it can cause.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_extfree_item.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c
index c0f3750..98c437d 100644
--- a/fs/xfs/xfs_extfree_item.c
+++ b/fs/xfs/xfs_extfree_item.c
@@ -305,10 +305,22 @@ xfs_efi_release(xfs_efi_log_item_t *efip,
 {
 	ASSERT(atomic_read(&efip->efi_next_extent) >= nextents);
 	if (atomic_sub_and_test(nextents, &efip->efi_next_extent)) {
+		int recovered;
+
+		/*
+		 * __xfs_efi_release() can release the last reference to the EFI
+		 * and free it, so it is unsafe to reference it after we've
+		 * released the reference. The only case this is safe to do is
+		 * if we are in recovery and the XFS_EFI_RECOVERED bit is set,
+		 * meaning that we have two references to release. Check the
+		 * recovered bit before the initial release, as we cannot
+		 * reliably check it afterwards.
+		 */
+		recovered = test_bit(XFS_EFI_RECOVERED, &efip->efi_flags);
 		__xfs_efi_release(efip);
 
 		/* recovery needs us to drop the EFI reference, too */
-		if (test_bit(XFS_EFI_RECOVERED, &efip->efi_flags))
+		if (recovered)
 			__xfs_efi_release(efip);
 	}
 }
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs