From: David Chinner <dgc@sgi.com>
To: Masayuki Saito <m-saito@tnes.nec.co.jp>
Cc: Nathan Scott <nathans@sgi.com>, David Chinner <dgc@sgi.com>,
xfs@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] xfs: i_state of inode is changed after the inode is freed
Date: Mon, 14 Aug 2006 12:59:02 +1000 [thread overview]
Message-ID: <20060814025901.GE51703024@melbourne.sgi.com> (raw)
In-Reply-To: <20060724170133m-saito@mail.aom.tnes.nec.co.jp>
Masayuki,
Sorry for taking so long to get back to you - travelling and vacation
left a mountain of email for me to delete :/
On Mon, Jul 24, 2006 at 05:01:33PM +0900, Masayuki Saito wrote:
> >I think a fix is going to be much more invasive than just adding
> >reference as my fixes appear to have only narrowed the race window
> >and not solved it. The addition of the lock in the original patch
> >solves the atomic xfs_iunpin()/xfs_reclaim() execution problem,
> >but it does not solve the problems with the i_flags field. Adding
> >a new lock may be our only option here.
>
> I'm considering the solution which fixes two problems([a] i_state of
> the inode is changed while the inode is freed in xfs filesystem and
> [b] the above i_flags problem)
>
> the solution:
> (1)Add new spin_lock(i_flags_lock) for all refernece and change
> places of all i_flags.
> (2)Add igrab()/iput() for xfs_iunpin().
>
> It makes sure that mark_inode_dirty_sync() is never called if
> xfs_iunpin() runs after I_CLEAR is set. Because XFS_IRECLAIM
> or XFS_IRECLAIMABLE is set/checked within the spin_lock.
*nod*
> And there is the reason that igrab()/iput() is needed even if I add
> new spin_lock for xfs_iunpin(). We can prevent the following case
> by adding them.
> * After passing (I_NEW|I_FREEING|I_CLEAR) check in xfs_iunpin(),
> I_FREEING is set.
> * Then mark_inode_dirty_sync() is called and i_state is changed.
> * Hit BUG_ON(!(inode->i_state & I_FREEING)) in clear_inode().
*nod*
> If these ideas seem to be correct, I'll make patches for above (1),(2).
> Any comment?
>
>
> (The following is a part of my thinking patch. Only xfs_iunpin().)
>
> --- linux-2.6.17.6/fs/xfs/xfs_inode.c.orig 2006-07-22 08:07:50.194236144 +0900
> +++ linux-2.6.17.6/fs/xfs/xfs_inode.c 2006-07-25 06:07:18.062853045 +0900
> @@ -2729,6 +2729,8 @@ void
> xfs_iunpin(
> xfs_inode_t *ip)
> {
> + int need_unlock;
> +
> ASSERT(atomic_read(&ip->i_pincount) > 0);
>
> if (atomic_dec_and_test(&ip->i_pincount)) {
> @@ -2744,6 +2746,8 @@ xfs_iunpin(
> * call as the inode reclaim may be blocked waiting for
> * the inode to become unpinned.
> */
> + spin_lock(&ip->i_flags_lock);
> + need_unlock = 1;
> if (!(ip->i_flags & (XFS_IRECLAIM|XFS_IRECLAIMABLE))) {
> vnode_t *vp = XFS_ITOV_NULL(ip);
>
> @@ -2751,10 +2755,22 @@ xfs_iunpin(
> if (vp) {
> struct inode *inode = vn_to_inode(vp);
>
> - if (!(inode->i_state & I_NEW))
> - mark_inode_dirty_sync(inode);
> + if (!(inode->i_state &
> + (I_NEW|I_FREEING|I_CLEAR))) {
> + inode = igrab(inode);
> + if (inode != NULL) {
> + mark_inode_dirty_sync(inode);
> + spin_unlock(&ip->i_flags_lock);
> + need_unlock = 0;
> + iput(inode);
> + }
> + }
> }
> }
> + if (need_unlock) {
> + spin_unlock(&ip->i_flags_lock);
> + need_unlock = 0;
> + }
> wake_up(&ip->i_ipin_wait);
> }
> }
Hmmm - Idon't think we should iput() before we wake up any pinned waiters.
When we have a waiter on i_ipin_wait (called from xfs_iflush()), we have
a thread sleeping with the inode locked.
If we then call iput() and it drops the last reference, we can call back
into the filesystem and start transactions. Those transactions will need
to lock the inode. Hence I think the above can deadlock when racing against
an inode flush.
The code should probably read:
if (dropped last pincount) {
int need_iput = 0;
struct inode *inode;
spin_lock(i_flags_lock)
if (!reclaimable) {
if (!vp) {
if (!(i_state & (NEW|CLEAR))) {
inode = igrab(inode)
if (inode) {
need_iput = 1
mark_inode_dirty_sync(inode)
}
}
}
}
spin_unlock(i_flags_lock)
wake_up(&ip->i_ipin_wait)
if (need_iput)
iput(inode);
}
to avoid this possible deadlock.
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
next prev parent reply other threads:[~2006-08-14 3:01 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20060710103740.B1674239@wobbly.melbourne.sgi.com>
2006-07-14 10:25 ` [PATCH] xfs: i_state of inode is changed after the inode is freed Masayuki Saito
2006-07-17 11:05 ` Nathan Scott
2006-07-17 14:07 ` David Chinner
2006-07-24 8:01 ` Masayuki Saito
2006-08-14 2:59 ` David Chinner [this message]
2006-08-22 7:48 ` Masayuki Saito
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060814025901.GE51703024@melbourne.sgi.com \
--to=dgc@sgi.com \
--cc=linux-kernel@vger.kernel.org \
--cc=m-saito@tnes.nec.co.jp \
--cc=nathans@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox