From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/3] xfs: evict all inodes involved with log redo item recovery
Date: Thu, 10 Aug 2017 13:54:25 -0400 [thread overview]
Message-ID: <20170810175425.GF3777@bfoster.bfoster> (raw)
In-Reply-To: <20170810171851.GW24087@magnolia>
On Thu, Aug 10, 2017 at 10:18:51AM -0700, Darrick J. Wong wrote:
> On Thu, Aug 10, 2017 at 10:51:26AM -0400, Brian Foster wrote:
> > On Wed, Aug 09, 2017 at 10:23:51PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > When we introduced the bmap redo log items, we set MS_ACTIVE on the
> > > mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
> > > from being truncated prematurely during log recovery. This also had the
> > > effect of putting linked inodes on the lru instead of evicting them.
> > >
> > > Unfortunately, we neglected to find all those unreferenced lru inodes
> > > and evict them after finishing log recovery, which means that we leak
> > > them if anything goes wrong in the rest of xfs_mountfs, because the lru
> > > is only cleaned out on unmount.
> > >
> > > Therefore, look for unreferenced inodes in the lru list immediately
> > > after clearing MS_ACTIVE. If we find any, igrab/iput them so that
> > > they're evicted from memory.
> > >
> > > Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > fs/xfs/xfs_mount.c | 29 +++++++++++++++++++++++++++++
> > > 1 file changed, 29 insertions(+)
> > >
> > >
> > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > > index d63a367..8da5155 100644
> > > --- a/fs/xfs/xfs_mount.c
> > > +++ b/fs/xfs/xfs_mount.c
> > > @@ -616,6 +616,34 @@ xfs_default_resblks(xfs_mount_t *mp)
> > > }
> > >
> > > /*
> > > + * Find inodes that were put on the LRU during the last part of log
> > > + * recovery that are no longer referenced by anything, and igrab/iput
> > > + * them so that they'll be evicted now.
> > > + *
> > > + * We let all inodes involved in redo item processing end up on the LRU
> > > + * instead of being evicted immediately so that if we do something to an
> > > + * unlinked inode, the irele won't cause premature truncation and
> > > + * freeing of the inode, which results in log recovery failure. We have
> > > + * to evict the unreferenced lru inodes here because we don't otherwise
> > > + * clean up the lru if there's a subsequent failure in xfs_mountfs,
> > > + * which leads to us leaking the inodes if nothing else (e.g. quotacheck)
> > > + * references the inodes before the failure occurs.
> > > + */
> > > +STATIC void
> > > +xfs_mountfs_evict_log_redo_inodes(
> > > + struct xfs_mount *mp)
> > > +{
> > > + struct super_block *sb = mp->m_super;
> > > + struct inode *inode;
> > > + struct inode *next;
> > > +
> >
> > Perhaps assert that MS_ACTIVE is cleared here as a bit of
> > self-documentation..?
>
> Huh? That's the previous patch; MS_ACTIVE should already be clear when
> we get here.
>
Right.. I mean add something like the following at the top of the
function:
ASSERT(!(mp->m_super->s_flags & MS_ACTIVE));
... so it's clear that this function only serves a purpose when the
filesystem is not active.
> > > + list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
> > > + if (!atomic_read(&inode->i_count))
> > > + iput(igrab(inode));
> > > + }
> >
> > I assume we don't need s_inodes_list_lock here since nothing else should
> > be messing with the list..?
>
> Nothing should be messing with the list, though the inode_lru_list_del
> in iput_final will take s_inode_list_lock.
>
> > If so, this looks fine otherwise. Though any reason you didn't end up
> > calling evict_inodes() here? Because it's not exported..?
>
> I felt there's less messing around with vfs internals by sticking to
> igrab and iput (and letting them deal with the internal inode state)
> since that's what xfs uses elsewhere to manage inode life cycles.
>
Ok, sounds good:
Reviewed-by: Brian Foster <bfoster@redhat.com>
> --D
>
> >
> > Brian
> >
> > > +}
> > > +
> > > +/*
> > > * This function does the following on an initial mount of a file system:
> > > * - reads the superblock from disk and init the mount struct
> > > * - if we're a 32-bit kernel, do a size check on the superblock
> > > @@ -960,6 +988,7 @@ xfs_mountfs(
> > > mp->m_super->s_flags |= MS_ACTIVE;
> > > error = xfs_log_mount_finish(mp);
> > > mp->m_super->s_flags &= ~MS_ACTIVE;
> > > + xfs_mountfs_evict_log_redo_inodes(mp);
> > > if (error) {
> > > xfs_warn(mp, "log mount finish failed");
> > > goto out_rtunmount;
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2017-08-10 17:54 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-10 5:23 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
2017-08-10 5:23 ` [PATCH 2/3] xfs: evict all inodes involved with log redo item recovery Darrick J. Wong
2017-08-10 14:51 ` Brian Foster
2017-08-10 17:18 ` Darrick J. Wong
2017-08-10 17:54 ` Brian Foster [this message]
2017-08-11 11:22 ` Christoph Hellwig
2017-08-11 16:51 ` Darrick J. Wong
2017-08-11 19:50 ` [PATCH v2 " Darrick J. Wong
2017-08-11 23:42 ` Dave Chinner
2017-08-11 23:59 ` Darrick J. Wong
2017-08-10 5:23 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
2017-08-10 14:51 ` Brian Foster
2017-08-11 11:19 ` Christoph Hellwig
2017-08-11 19:48 ` [PATCH v3 " Darrick J. Wong
2017-08-14 12:40 ` Brian Foster
2017-08-10 18:15 ` [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Allison Henderson
2017-08-11 11:13 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170810175425.GF3777@bfoster.bfoster \
--to=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox