public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Lachlan McIlroy <lachlan@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: Filesystem corruption writing out unlinked inodes
Date: Thu, 4 Sep 2008 19:08:35 +1000	[thread overview]
Message-ID: <20080904090835.GE15950@disturbed> (raw)
In-Reply-To: <48BF33EC.7080406@sgi.com>

On Thu, Sep 04, 2008 at 11:03:40AM +1000, Lachlan McIlroy wrote:
> Dave Chinner wrote:
>> On Tue, Sep 02, 2008 at 03:58:58PM +1000, Lachlan McIlroy wrote:
>>> I'm just not sure about the assumption
>>> that if the flush lock cannot be acquired in xfs_ifree_cluster() then
>>> the inode must be in the process of being flushed. The flush could
>>> be aborted due to the inode being pinned or some other case and the
>>> inode never gets marked as stale.
>>
>> Did that happen?
>>
>> Basically I'm asking what the sequence of events is that leads up
>> to this problem - we need to identify the actual race condition
>> before speculating on potential fixes....
>>
>
> In the trace below pid 7731 is unlinking an inode and it's not the last
> inode so it doesn't go through xfs_ifree_cluster() and mark the other
> inodes as stale.  At the same time pid 12269 unlinks the final inode in
> the cluster and calls xfs_ifree_cluster() but fails to lock the inode
> held by pid 7731 so it skips it.  Pid 12269 deallocates the inode cluster
> and the disk space is reallocated as user data (the data "temp28/file00006"
> is what the test writes into it's files).  Meanwhile pid 7731 finally
> continues and tries to flush the inode.

Ah - how are we unlinking two inodes in the one AG at the same time?
That's supposed to be serialised by the AGI buffer lock....

Ah - I see - we hold the inode across the transaction commit in
xfs_inactive(). That means that the AGI is unlocked well before the
inode is unlocked, which allows the racing inode inactivate to lock
the AGI and call xfs_icluster_free() before the inode is unlocked
after the transaction commit.

Ok, now we understand the race condition....

> Looks like xfs_ifree_cluster() should do a blocking wait on the ilock and
> maybe move the setting of XFS_ISTALE outside the flock.

No, we can't do a blocking wait on the ilock - we already hold the
ilock on other inodes and so we could deadlock by doing that.

Hmmmm - I wonder what the reason for the holding of the inode lock
over the transaction commit is.... Perhaps it is to make the
detatching of the dquots atomic with the inactivation (seems like
a valid reason to me).

Perhap we should also hold the AGI buffer across the transaction
commit as well and only release that after the inode is
unlocked so the cluster free does not make progress until after
the inode inactivation of all inodes in the cluster is complete....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2008-09-04  9:07 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-02  4:48 Filesystem corruption writing out unlinked inodes Lachlan McIlroy
2008-09-02  5:15 ` Dave Chinner
2008-09-02  5:58   ` Lachlan McIlroy
2008-09-02  6:21     ` Dave Chinner
2008-09-04  1:03       ` Lachlan McIlroy
2008-09-04  9:08         ` Dave Chinner [this message]
2008-09-05  6:23           ` Lachlan McIlroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080904090835.GE15950@disturbed \
    --to=david@fromorbit.com \
    --cc=lachlan@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox