public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 1/3] xfs: push down inactive transaction mgmt for remote symlinks
Date: Fri, 20 Sep 2013 09:23:56 +1000	[thread overview]
Message-ID: <20130919232356.GN9901@dastard> (raw)
In-Reply-To: <523AF184.3030002@redhat.com>

On Thu, Sep 19, 2013 at 08:43:48AM -0400, Brian Foster wrote:
> On 09/18/2013 06:51 PM, Dave Chinner wrote:
> > On Wed, Sep 18, 2013 at 02:06:26PM -0400, Brian Foster wrote:
> >> I just want to call out one thing here in case it isn't noticed on
> >> review... the safety of this is something I was curious about.
> >> Specifically, note that I've removed the inode locking from
> >> xfs_inactive(), which previously covered xfs_inactive_symlink() (for
> >> xfs_idata_realloc()), down into xfs_inactive_symlink_rmt().
> > 
> > see my comments about idata_realloc() in the previous email. It
> > might be safe, but it's better not to leave a landmine if we add
> > some other caller to the function in the future.
> > 
> >> My assumption was that this is currently ok since at this point we have
> >> an inode with di_nlink == 0.
> > 
> > It's not an entirely correct assumption. The end result is likely
> > the same, but di_nlink has no influence here. i.e. the inode
> > lifecycle is rather complex and there is an interesting condition
> > that covers inodes going through xfs_inactive().
> > 
> > xfs_inactive() is called when the VFS reference count goes to zero
> > and the VFS inode is being reclaimed, but the XFS_IRECLAIMABLE flag
> > is not yet set on it. This doesn't happen until after xfs_inactive()
> > completes and the VFS calls ->destroy_inode. Hence the inode is in a
> > limbo state where calls to igrab() will fail but the inode can be
> > found in the inode radix trees without being marked as "under
> > reclaim conditions".
> > 
> > We handle this with xfs_iget_cache_hit() by the use of igrab(),
> > which will fail on such an inode, and we use the same logic in
> > xfs_inode_ag_walk_grab() to avoid this hole. That said,
> > xfs_reclaim_inode_grab() does no such thing - it only checks for
> > XFS_IRECLAIMABLE under an RCU traversal, and so may find inodes for
> > which that the radix tree reclaimable tag is stale. hence that
> > check is always done under a spinlock.
> > 
> > IOWs, the only thing that protects us from outside interference in
> > xfs_inactive() is the logic in the XFS inode cache lookups
> > specifically avoiding inodes in the transient state that
> > xfs_inactive() is called under. It doesn't matter what the contents
> > of the inode are - it's the safe transition from one lifecycle state
> > to the next that is important at this point.
> > 
> > So, like I said in the previous email, we have to be careful with
> > cache lookups to prevent races with the work xfs_inactive() is
> > doing, but that doesn't mean we shouldn't still lock the inodes
> > correctly when modifying them...
> > 
> 
> So broadly speaking, the inode states are more granular than my di_nlink
> based assumption.

More that the inode lifecycle states are not related to the value of
di_nlink at all ;)

> We have to account for access via internal caches,
> even if the inode is in the process of being torn down in the vfs. I'll
> have to wade through the caching code much more to understand the
> intricacies. ;) Thanks for the breakdown.

First you need to understand the VFS inode lifecycle, as the XFS
inode lifecycle mostly wraps around the outside of the VFS inode
lifecycle. ;)

> With regard to the locking here, any preference as to whether
> xfs_inactive_symlink() takes the lock and hands it to
> xfs_inactive_symlink_rmt() or the former locks/unlocks and the latter
> continues to work as implemented in this patch (save other comments to
> be addressed)?
> 
> Actually now that I look at the code, xfs_inactive_symlink_rmt() does
> the transaction allocation and reservation now, so for that reason I
> think the lock/unlock pattern is required.

Right, they are effectively two different cases with different
locking constraints.

Cheers,

Dave.

> 
> Brian
> 
> >> If that's not accurate or not expected to
> >> remain so after O_TMPFILE related work, I suppose I could pull the
> >> locking back up into xfs_inactive_symlink().
> > 
> > O_TMPFILE itself won't change anything - they will look just like
> > any other unlinked inode going through xfs_inactive() on their way
> > to the XFS_IRECLAIMABLE state.
> > 
> > It's when we start separating the xfs_inactive() work into multiple
> > distinct stages to allow for optimisation of inode freeing that we
> > need to be careful as these introduce new states into the lifecycle
> > state machine. These will most likely involve new state flags and
> > radix tree tags and walks, but AFAICT, overall concept that
> > xfs_inactive/xfs_inactive_symlink is run from the same special
> > isolated "limbo" context should not change....
> > 
> > Cheers,
> > 
> > Dave.
> > 
> 
> 

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2013-09-19 23:24 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-18 16:15 [PATCH 0/3] xfs: rework xfs_inactive() Brian Foster
2013-09-18 16:15 ` [PATCH 1/3] xfs: push down inactive transaction mgmt for remote symlinks Brian Foster
2013-09-18 18:06   ` Brian Foster
2013-09-18 22:51     ` Dave Chinner
2013-09-19 12:43       ` Brian Foster
2013-09-19 23:23         ` Dave Chinner [this message]
2013-09-18 22:17   ` Dave Chinner
2013-09-19 12:55     ` Brian Foster
2013-09-20 13:05     ` Brian Foster
2013-09-18 16:15 ` [PATCH 2/3] xfs: push down inactive transaction mgmt for truncate Brian Foster
2013-09-18 23:00   ` Dave Chinner
2013-09-18 16:16 ` [PATCH 3/3] xfs: push down inactive transaction mgmt for ifree Brian Foster
2013-09-18 23:06   ` Dave Chinner
2013-09-19 12:44     ` Brian Foster
2013-09-19 23:29       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130919232356.GN9901@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox