From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: evict all inodes involved with log redo item recovery
Date: Thu, 10 Aug 2017 13:54:25 -0400 [thread overview]
Message-ID: <20170810175425.GF3777@bfoster.bfoster> (raw)
In-Reply-To: <20170810171851.GW24087@magnolia>
On Thu, Aug 10, 2017 at 10:18:51AM -0700, Darrick J. Wong wrote:
> On Thu, Aug 10, 2017 at 10:51:26AM -0400, Brian Foster wrote:
> > On Wed, Aug 09, 2017 at 10:23:51PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > When we introduced the bmap redo log items, we set MS_ACTIVE on the
> > > mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
> > > from being truncated prematurely during log recovery. This also had the
> > > effect of putting linked inodes on the lru instead of evicting them.
> > >
> > > Unfortunately, we neglected to find all those unreferenced lru inodes
> > > and evict them after finishing log recovery, which means that we leak
> > > them if anything goes wrong in the rest of xfs_mountfs, because the lru
> > > is only cleaned out on unmount.
> > >
> > > Therefore, look for unreferenced inodes in the lru list immediately
> > > after clearing MS_ACTIVE. If we find any, igrab/iput them so that
> > > they're evicted from memory.
> > >
> > > Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > fs/xfs/xfs_mount.c | 29 +++++++++++++++++++++++++++++
> > > 1 file changed, 29 insertions(+)
> > >
> > >
> > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > > index d63a367..8da5155 100644
> > > --- a/fs/xfs/xfs_mount.c
> > > +++ b/fs/xfs/xfs_mount.c
> > > @@ -616,6 +616,34 @@ xfs_default_resblks(xfs_mount_t *mp)
> > > }
> > >
> > > /*
> > > + * Find inodes that were put on the LRU during the last part of log
> > > + * recovery that are no longer referenced by anything, and igrab/iput
> > > + * them so that they'll be evicted now.
> > > + *
> > > + * We let all inodes involved in redo item processing end up on the LRU
> > > + * instead of being evicted immediately so that if we do something to an
> > > + * unlinked inode, the irele won't cause premature truncation and
> > > + * freeing of the inode, which results in log recovery failure. We have
> > > + * to evict the unreferenced lru inodes here because we don't otherwise
> > > + * clean up the lru if there's a subsequent failure in xfs_mountfs,
> > > + * which leads to us leaking the inodes if nothing else (e.g. quotacheck)
> > > + * references the inodes before the failure occurs.
> > > + */
> > > +STATIC void
> > > +xfs_mountfs_evict_log_redo_inodes(
> > > + struct xfs_mount *mp)
> > > +{
> > > + struct super_block *sb = mp->m_super;
> > > + struct inode *inode;
> > > + struct inode *next;
> > > +
> >
> > Perhaps assert that MS_ACTIVE is cleared here as a bit of
> > self-documentation..?
>
> Huh? That's the previous patch; MS_ACTIVE should already be clear when
> we get here.
>
Right.. I mean add something like the following at the top of the
function:
ASSERT(!(mp->m_super->s_flags & MS_ACTIVE));
... so it's clear that this function only serves a purpose when the
filesystem is not active.
> > > + list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
> > > + if (!atomic_read(&inode->i_count))
> > > + iput(igrab(inode));
> > > + }
> >
> > I assume we don't need s_inodes_list_lock here since nothing else should
> > be messing with the list..?
>
> Nothing should be messing with the list, though the inode_lru_list_del
> in iput_final will take s_inode_list_lock.
>
> > If so, this looks fine otherwise. Though any reason you didn't end up
> > calling evict_inodes() here? Because it's not exported..?
>
> I felt there's less messing around with vfs internals by sticking to
> igrab and iput (and letting them deal with the internal inode state)
> since that's what xfs uses elsewhere to manage inode life cycles.
>
Ok, sounds good:
Reviewed-by: Brian Foster <bfoster@redhat.com>
> --D
>
> >
> > Brian
> >
> > > +}
> > > +
> > > +/*
> > > * This function does the following on an initial mount of a file system:
> > > * - reads the superblock from disk and init the mount struct
> > > * - if we're a 32-bit kernel, do a size check on the superblock
> > > @@ -960,6 +988,7 @@ xfs_mountfs(
> > > mp->m_super->s_flags |= MS_ACTIVE;
> > > error = xfs_log_mount_finish(mp);
> > > mp->m_super->s_flags &= ~MS_ACTIVE;
> > > + xfs_mountfs_evict_log_redo_inodes(mp);
> > > if (error) {
> > > xfs_warn(mp, "log mount finish failed");
> > > goto out_rtunmount;
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2017-08-10 17:54 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-10 5:23 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
2017-08-10 5:23 ` [PATCH 2/3] xfs: evict all inodes involved with log redo item recovery Darrick J. Wong
2017-08-10 14:51 ` Brian Foster
2017-08-10 17:18 ` Darrick J. Wong
2017-08-10 17:54 ` Brian Foster [this message]
2017-08-11 11:22 ` Christoph Hellwig
2017-08-11 16:51 ` Darrick J. Wong
2017-08-11 19:50 ` [PATCH v2 " Darrick J. Wong
2017-08-11 23:42 ` Dave Chinner
2017-08-11 23:59 ` Darrick J. Wong
2017-08-10 5:23 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
2017-08-10 14:51 ` Brian Foster
2017-08-11 11:19 ` Christoph Hellwig
2017-08-11 19:48 ` [PATCH v3 " Darrick J. Wong
2017-08-14 12:40 ` Brian Foster
2017-08-10 18:15 ` [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Allison Henderson
2017-08-11 11:13 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170810175425.GF3777@bfoster.bfoster \
--to=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.