public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 1/3] xfs: push down inactive transaction mgmt for remote symlinks
Date: Thu, 19 Sep 2013 08:51:20 +1000	[thread overview]
Message-ID: <20130918225120.GC9901@dastard> (raw)
In-Reply-To: <5239EBA2.4070207@redhat.com>

On Wed, Sep 18, 2013 at 02:06:26PM -0400, Brian Foster wrote:
> On 09/18/2013 12:15 PM, Brian Foster wrote:
> > Push down the transaction management for remote symlinks from
> > xfs_inactive() down to xfs_inactive_symlink_rmt(). The latter is
> > cleaned up to avoid transaction management intended for the
> > calling context (i.e., trans duplication, reservation, item
> > attachment).
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_inode.c   | 15 ++++++------
> >  fs/xfs/xfs_symlink.c | 64 ++++++++++++++++++----------------------------------
> >  fs/xfs/xfs_symlink.h |  2 +-
> >  3 files changed, 31 insertions(+), 50 deletions(-)
> > 
> ...
> > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> > index f622a97..f85f6f2 100644
> > --- a/fs/xfs/xfs_symlink.c
> > +++ b/fs/xfs/xfs_symlink.c
> > @@ -424,8 +424,7 @@ xfs_symlink(
> >   */
> ...
> >  
> > @@ -563,16 +546,13 @@ xfs_inactive_symlink_rmt(
> >   */
> >  int
> >  xfs_inactive_symlink(
> > -	struct xfs_inode	*ip,
> > -	struct xfs_trans	**tp)
> > +	struct xfs_inode	*ip)
> >  {
> >  	struct xfs_mount	*mp = ip->i_mount;
> >  	int			pathlen;
> >  
> >  	trace_xfs_inactive_symlink(ip);
> >  
> > -	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> > -
> 
> I just want to call out one thing here in case it isn't noticed on
> review... the safety of this is something I was curious about.
> Specifically, note that I've removed the inode locking from
> xfs_inactive(), which previously covered xfs_inactive_symlink() (for
> xfs_idata_realloc()), down into xfs_inactive_symlink_rmt().

see my comments about idata_realloc() in the previous email. It
might be safe, but it's better not to leave a landmine if we add
some other caller to the function in the future.

> My assumption was that this is currently ok since at this point we have
> an inode with di_nlink == 0.

It's not an entirely correct assumption. The end result is likely
the same, but di_nlink has no influence here. i.e. the inode
lifecycle is rather complex and there is an interesting condition
that covers inodes going through xfs_inactive().

xfs_inactive() is called when the VFS reference count goes to zero
and the VFS inode is being reclaimed, but the XFS_IRECLAIMABLE flag
is not yet set on it. This doesn't happen until after xfs_inactive()
completes and the VFS calls ->destroy_inode. Hence the inode is in a
limbo state where calls to igrab() will fail but the inode can be
found in the inode radix trees without being marked as "under
reclaim conditions".

We handle this with xfs_iget_cache_hit() by the use of igrab(),
which will fail on such an inode, and we use the same logic in
xfs_inode_ag_walk_grab() to avoid this hole. That said,
xfs_reclaim_inode_grab() does no such thing - it only checks for
XFS_IRECLAIMABLE under an RCU traversal, and so may find inodes for
which that the radix tree reclaimable tag is stale. hence that
check is always done under a spinlock.

IOWs, the only thing that protects us from outside interference in
xfs_inactive() is the logic in the XFS inode cache lookups
specifically avoiding inodes in the transient state that
xfs_inactive() is called under. It doesn't matter what the contents
of the inode are - it's the safe transition from one lifecycle state
to the next that is important at this point.

So, like I said in the previous email, we have to be careful with
cache lookups to prevent races with the work xfs_inactive() is
doing, but that doesn't mean we shouldn't still lock the inodes
correctly when modifying them...

> If that's not accurate or not expected to
> remain so after O_TMPFILE related work, I suppose I could pull the
> locking back up into xfs_inactive_symlink().

O_TMPFILE itself won't change anything - they will look just like
any other unlinked inode going through xfs_inactive() on their way
to the XFS_IRECLAIMABLE state.

It's when we start separating the xfs_inactive() work into multiple
distinct stages to allow for optimisation of inode freeing that we
need to be careful as these introduce new states into the lifecycle
state machine. These will most likely involve new state flags and
radix tree tags and walks, but AFAICT, overall concept that
xfs_inactive/xfs_inactive_symlink is run from the same special
isolated "limbo" context should not change....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2013-09-18 22:51 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-18 16:15 [PATCH 0/3] xfs: rework xfs_inactive() Brian Foster
2013-09-18 16:15 ` [PATCH 1/3] xfs: push down inactive transaction mgmt for remote symlinks Brian Foster
2013-09-18 18:06   ` Brian Foster
2013-09-18 22:51     ` Dave Chinner [this message]
2013-09-19 12:43       ` Brian Foster
2013-09-19 23:23         ` Dave Chinner
2013-09-18 22:17   ` Dave Chinner
2013-09-19 12:55     ` Brian Foster
2013-09-20 13:05     ` Brian Foster
2013-09-18 16:15 ` [PATCH 2/3] xfs: push down inactive transaction mgmt for truncate Brian Foster
2013-09-18 23:00   ` Dave Chinner
2013-09-18 16:16 ` [PATCH 3/3] xfs: push down inactive transaction mgmt for ifree Brian Foster
2013-09-18 23:06   ` Dave Chinner
2013-09-19 12:44     ` Brian Foster
2013-09-19 23:29       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130918225120.GC9901@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox