* [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
@ 2017-08-09 1:04 Darrick J. Wong
2017-08-09 1:06 ` [PATCH 2/3] xfs: don't leak linked inodes during log recovery Darrick J. Wong
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 1:04 UTC (permalink / raw)
To: xfs
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.
To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release. Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes. So, clear
MS_ACTIVE immediately after we finish the log recovery so that the quota
inodes will be torn down properly if we abort the mount.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
fs/xfs/xfs_mount.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 40d4e8b..d463ab3 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -949,7 +949,9 @@ xfs_mountfs(
* iput to behave like they do for an active filesystem.
* xfs_fs_drop_inode needs to be able to prevent the deletion
* of inodes before we're done replaying log items on those
- * inodes.
+ * inodes. Turn it off immediately after xfs_log_mount_finish
+ * so that we don't leak the quota inodes if subsequent mount
+ * activities fail.
*/
mp->m_super->s_flags |= MS_ACTIVE;
@@ -959,6 +961,7 @@ xfs_mountfs(
* read in.
*/
error = xfs_log_mount_finish(mp);
+ mp->m_super->s_flags &= ~MS_ACTIVE;
if (error) {
xfs_warn(mp, "log mount finish failed");
goto out_rtunmount;
@@ -1028,7 +1031,6 @@ xfs_mountfs(
out_quota:
xfs_qm_unmount_quotas(mp);
out_rtunmount:
- mp->m_super->s_flags &= ~MS_ACTIVE;
xfs_rtunmount_inodes(mp);
out_rele_rip:
IRELE(rip);
^ permalink raw reply related [flat|nested] 14+ messages in thread* [PATCH 2/3] xfs: don't leak linked inodes during log recovery
2017-08-09 1:04 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
@ 2017-08-09 1:06 ` Darrick J. Wong
2017-08-09 12:36 ` Brian Foster
2017-08-09 1:07 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 1:06 UTC (permalink / raw)
To: xfs
When we introduced the bmap redo log items, we set MS_ACTIVE on the
mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
from being truncated prematurely during log recovery. However, we
neglected to drop linked inodes that are recovered, and if we don't use
the inode between recovery and unmount, the inode will never be marked
reclaimable and thus we fail to free it at umount time. If we're in
log recovery but IRECOVERY is /not/ set, the inode is linked and can be
reclaimed.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
fs/xfs/xfs_super.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 38aaacd..9b06ca2 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1040,6 +1040,13 @@ xfs_fs_drop_inode(
if (ip->i_flags & XFS_IRECOVERY) {
ASSERT(ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED);
return 0;
+ } else if (ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED) {
+ /*
+ * This inode was loaded during recovery but is not
+ * being unlinked, so we can free it without fear of
+ * premature truncation.
+ */
+ return 1;
}
return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH 2/3] xfs: don't leak linked inodes during log recovery
2017-08-09 1:06 ` [PATCH 2/3] xfs: don't leak linked inodes during log recovery Darrick J. Wong
@ 2017-08-09 12:36 ` Brian Foster
2017-08-09 16:49 ` Darrick J. Wong
0 siblings, 1 reply; 14+ messages in thread
From: Brian Foster @ 2017-08-09 12:36 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: xfs
On Tue, Aug 08, 2017 at 06:06:50PM -0700, Darrick J. Wong wrote:
> When we introduced the bmap redo log items, we set MS_ACTIVE on the
> mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
> from being truncated prematurely during log recovery. However, we
> neglected to drop linked inodes that are recovered, and if we don't use
> the inode between recovery and unmount, the inode will never be marked
> reclaimable and thus we fail to free it at umount time. If we're in
> log recovery but IRECOVERY is /not/ set, the inode is linked and can be
> reclaimed.
>
I follow the change in behavior in the previous commit and how this
restores the original behavior for linked inodes, so this patch makes
sense from that perspective. I'm not following where/how the leak occurs
from the description, however. Wouldn't the inode end up on the lru to
be shrunk/evicted/reclaimed at a later point? What happens if the inode
is subsequently used that prevents the leak? (Whatever I'm missing, it
would be nice to elaborate on in the commit log.)
Also, if there is a memory leak vector for !drop linked inodes here,
does that not apply to XFS_IRECOVERY inodes if log recovery itself
happens to fail between bui recovery and iunlink processing?
Brian
> Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_super.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 38aaacd..9b06ca2 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1040,6 +1040,13 @@ xfs_fs_drop_inode(
> if (ip->i_flags & XFS_IRECOVERY) {
> ASSERT(ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED);
> return 0;
> + } else if (ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED) {
> + /*
> + * This inode was loaded during recovery but is not
> + * being unlinked, so we can free it without fear of
> + * premature truncation.
> + */
> + return 1;
> }
>
> return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 2/3] xfs: don't leak linked inodes during log recovery
2017-08-09 12:36 ` Brian Foster
@ 2017-08-09 16:49 ` Darrick J. Wong
2017-08-09 17:17 ` Brian Foster
0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 16:49 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Wed, Aug 09, 2017 at 08:36:49AM -0400, Brian Foster wrote:
> On Tue, Aug 08, 2017 at 06:06:50PM -0700, Darrick J. Wong wrote:
> > When we introduced the bmap redo log items, we set MS_ACTIVE on the
> > mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
> > from being truncated prematurely during log recovery. However, we
> > neglected to drop linked inodes that are recovered, and if we don't use
> > the inode between recovery and unmount, the inode will never be marked
> > reclaimable and thus we fail to free it at umount time. If we're in
> > log recovery but IRECOVERY is /not/ set, the inode is linked and can be
> > reclaimed.
> >
>
> I follow the change in behavior in the previous commit and how this
> restores the original behavior for linked inodes, so this patch makes
> sense from that perspective. I'm not following where/how the leak occurs
> from the description, however.
Linked inodes are inode_add_lru()'d, but nothing ever calls
evict_inodes() to clean up that lru (sb->s_inodes).
> Wouldn't the inode end up on the lru to be shrunk/evicted/reclaimed at
> a later point?
> What happens if the inode is subsequently used that prevents the leak?
> (Whatever I'm missing, it would be nice to elaborate on in the commit
> log.)
If we make it all the way to a successful mount, then an unmount can
call generic_shutdown_super -> evict_inodes to clean up all the inodes
on the lru list.
> Also, if there is a memory leak vector for !drop linked inodes here,
> does that not apply to XFS_IRECOVERY inodes if log recovery itself
> happens to fail between bui recovery and iunlink processing?
Ugh, I forgot about that possibility. I think the solution is to
evict_inodes right after we clear MS_ACTIVE but before we see if
xfs_log_mount_finish actually failed.
--D
>
> Brian
>
> > Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > fs/xfs/xfs_super.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 38aaacd..9b06ca2 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1040,6 +1040,13 @@ xfs_fs_drop_inode(
> > if (ip->i_flags & XFS_IRECOVERY) {
> > ASSERT(ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED);
> > return 0;
> > + } else if (ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED) {
> > + /*
> > + * This inode was loaded during recovery but is not
> > + * being unlinked, so we can free it without fear of
> > + * premature truncation.
> > + */
> > + return 1;
> > }
> >
> > return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 2/3] xfs: don't leak linked inodes during log recovery
2017-08-09 16:49 ` Darrick J. Wong
@ 2017-08-09 17:17 ` Brian Foster
0 siblings, 0 replies; 14+ messages in thread
From: Brian Foster @ 2017-08-09 17:17 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: xfs
On Wed, Aug 09, 2017 at 09:49:29AM -0700, Darrick J. Wong wrote:
> On Wed, Aug 09, 2017 at 08:36:49AM -0400, Brian Foster wrote:
> > On Tue, Aug 08, 2017 at 06:06:50PM -0700, Darrick J. Wong wrote:
> > > When we introduced the bmap redo log items, we set MS_ACTIVE on the
> > > mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
> > > from being truncated prematurely during log recovery. However, we
> > > neglected to drop linked inodes that are recovered, and if we don't use
> > > the inode between recovery and unmount, the inode will never be marked
> > > reclaimable and thus we fail to free it at umount time. If we're in
> > > log recovery but IRECOVERY is /not/ set, the inode is linked and can be
> > > reclaimed.
> > >
> >
> > I follow the change in behavior in the previous commit and how this
> > restores the original behavior for linked inodes, so this patch makes
> > sense from that perspective. I'm not following where/how the leak occurs
> > from the description, however.
>
> Linked inodes are inode_add_lru()'d, but nothing ever calls
> evict_inodes() to clean up that lru (sb->s_inodes).
>
> > Wouldn't the inode end up on the lru to be shrunk/evicted/reclaimed at
> > a later point?
>
> > What happens if the inode is subsequently used that prevents the leak?
> > (Whatever I'm missing, it would be nice to elaborate on in the commit
> > log.)
>
> If we make it all the way to a successful mount, then an unmount can
> call generic_shutdown_super -> evict_inodes to clean up all the inodes
> on the lru list.
>
Ok, so you are saying that the leak occurs only if the mount ends up
failing because the vfs doesn't expect to have to evict lru inodes until
it sets the sb active, right? If so, that makes sense.. thanks.
FWIW, I think it's just more that the patch description is a bit
confusing here than anything. It doesn't mention mount failure and I'm
still a little confused about what you mean by: "if we don't use the
inode between recovery and unmount, the inode will never be marked
reclaimable and thus we fail to free it at umount time." Does that mean
that assuming imminent mount failure, the inode is leaked unless
something else happens to lookup/release the inode after we clear
MS_ACTIVE but before the mount failure sequence completes?
> > Also, if there is a memory leak vector for !drop linked inodes here,
> > does that not apply to XFS_IRECOVERY inodes if log recovery itself
> > happens to fail between bui recovery and iunlink processing?
>
> Ugh, I forgot about that possibility. I think the solution is to
> evict_inodes right after we clear MS_ACTIVE but before we see if
> xfs_log_mount_finish actually failed.
>
That sounds reasonable. If we do that, I suppose that means we can drop
the additional logic in ->drop_inode() because that would cover both
linked and unlinked inodes that make it onto the lru during recovery.
Brian
> --D
>
> >
> > Brian
> >
> > > Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > > fs/xfs/xfs_super.c | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > > index 38aaacd..9b06ca2 100644
> > > --- a/fs/xfs/xfs_super.c
> > > +++ b/fs/xfs/xfs_super.c
> > > @@ -1040,6 +1040,13 @@ xfs_fs_drop_inode(
> > > if (ip->i_flags & XFS_IRECOVERY) {
> > > ASSERT(ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED);
> > > return 0;
> > > + } else if (ip->i_mount->m_log->l_flags & XLOG_RECOVERY_NEEDED) {
> > > + /*
> > > + * This inode was loaded during recovery but is not
> > > + * being unlinked, so we can free it without fear of
> > > + * premature truncation.
> > > + */
> > > + return 1;
> > > }
> > >
> > > return generic_drop_inode(inode) || (ip->i_flags & XFS_IDONTCACHE);
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails
2017-08-09 1:04 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
2017-08-09 1:06 ` [PATCH 2/3] xfs: don't leak linked inodes during log recovery Darrick J. Wong
@ 2017-08-09 1:07 ` Darrick J. Wong
2017-08-09 12:36 ` Brian Foster
2017-08-09 6:31 ` [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Nikolay Borisov
2017-08-09 12:36 ` Brian Foster
3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 1:07 UTC (permalink / raw)
To: xfs
If we fail a mount on account of cow recovery errors, it's possible that
a previous quotacheck left some dquots in memory. The bailout clause of
xfs_mountfs forgets to purge these, and so we leak them. Fix that.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
fs/xfs/xfs_mount.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index d463ab3..8a12118 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1029,6 +1029,8 @@ xfs_mountfs(
out_agresv:
xfs_fs_unreserve_ag_blocks(mp);
out_quota:
+ /* Clean out dquots that might be in memory after quotacheck. */
+ xfs_qm_unmount(mp);
xfs_qm_unmount_quotas(mp);
out_rtunmount:
xfs_rtunmount_inodes(mp);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails
2017-08-09 1:07 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
@ 2017-08-09 12:36 ` Brian Foster
2017-08-09 16:06 ` Darrick J. Wong
0 siblings, 1 reply; 14+ messages in thread
From: Brian Foster @ 2017-08-09 12:36 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: xfs
On Tue, Aug 08, 2017 at 06:07:00PM -0700, Darrick J. Wong wrote:
> If we fail a mount on account of cow recovery errors, it's possible that
> a previous quotacheck left some dquots in memory. The bailout clause of
> xfs_mountfs forgets to purge these, and so we leak them. Fix that.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_mount.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index d463ab3..8a12118 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1029,6 +1029,8 @@ xfs_mountfs(
> out_agresv:
> xfs_fs_unreserve_ag_blocks(mp);
> out_quota:
> + /* Clean out dquots that might be in memory after quotacheck. */
> + xfs_qm_unmount(mp);
> xfs_qm_unmount_quotas(mp);
Should these calls be reversed? It looks like qm_unmount() can free
m_quotainfo before qm_unmount_quotas() has a chance to release quota
inodes and whatnot.
Brian
> out_rtunmount:
> xfs_rtunmount_inodes(mp);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails
2017-08-09 12:36 ` Brian Foster
@ 2017-08-09 16:06 ` Darrick J. Wong
0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 16:06 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Wed, Aug 09, 2017 at 08:36:58AM -0400, Brian Foster wrote:
> On Tue, Aug 08, 2017 at 06:07:00PM -0700, Darrick J. Wong wrote:
> > If we fail a mount on account of cow recovery errors, it's possible that
> > a previous quotacheck left some dquots in memory. The bailout clause of
> > xfs_mountfs forgets to purge these, and so we leak them. Fix that.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > fs/xfs/xfs_mount.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > index d463ab3..8a12118 100644
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -1029,6 +1029,8 @@ xfs_mountfs(
> > out_agresv:
> > xfs_fs_unreserve_ag_blocks(mp);
> > out_quota:
> > + /* Clean out dquots that might be in memory after quotacheck. */
> > + xfs_qm_unmount(mp);
> > xfs_qm_unmount_quotas(mp);
>
> Should these calls be reversed? It looks like qm_unmount() can free
> m_quotainfo before qm_unmount_quotas() has a chance to release quota
> inodes and whatnot.
xfs_qm_unmount -> xfs_qm_destroy_quotainfo will free the quota inodes,
but you're right that we need (at least in theory) to call
xfs_qm_unmount_quotas first so that we can xfs_qm_dqdetach the
root/rbm/rsum inodes. I don't think we've actually attached dquots to
those inodes at that point in the mount process (and the slab leak
complaints have stopped) so at a practical level it might not matter.
Will change & resubmit,
--D
>
> Brian
>
> > out_rtunmount:
> > xfs_rtunmount_inodes(mp);
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
2017-08-09 1:04 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
2017-08-09 1:06 ` [PATCH 2/3] xfs: don't leak linked inodes during log recovery Darrick J. Wong
2017-08-09 1:07 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
@ 2017-08-09 6:31 ` Nikolay Borisov
2017-08-09 12:36 ` Brian Foster
3 siblings, 0 replies; 14+ messages in thread
From: Nikolay Borisov @ 2017-08-09 6:31 UTC (permalink / raw)
To: Darrick J. Wong, xfs
On 9.08.2017 04:04, Darrick J. Wong wrote:
> Way back when we established inode block-map redo log items, it was
> discovered that we needed to prevent the VFS from evicting inodes during
> log recovery because any given inode might be have bmap redo items to
> replay even if the inode has no link count and is ultimately deleted,
> and any eviction of an unlinked inode causes the inode to be truncated
> and freed too early.
>
> To make this possible, we set MS_ACTIVE so that inodes would not be torn
> down immediately upon release. Unfortunately, this also results in the
> quota inodes not being released at all if a later part of the mount
> process should fail, because we never reclaim the inodes. So, clear
> MS_ACTIVE immediately after we finish the log recovery so that the quota
> inodes will be torn down properly if we abort the mount.
>
> Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
I think the whole series warrants:
Cc: <stable@vger.kernel.org> # 4.9
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
2017-08-09 1:04 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
` (2 preceding siblings ...)
2017-08-09 6:31 ` [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Nikolay Borisov
@ 2017-08-09 12:36 ` Brian Foster
2017-08-09 15:46 ` Darrick J. Wong
3 siblings, 1 reply; 14+ messages in thread
From: Brian Foster @ 2017-08-09 12:36 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: xfs
On Tue, Aug 08, 2017 at 06:04:44PM -0700, Darrick J. Wong wrote:
> Way back when we established inode block-map redo log items, it was
> discovered that we needed to prevent the VFS from evicting inodes during
> log recovery because any given inode might be have bmap redo items to
> replay even if the inode has no link count and is ultimately deleted,
> and any eviction of an unlinked inode causes the inode to be truncated
> and freed too early.
>
> To make this possible, we set MS_ACTIVE so that inodes would not be torn
> down immediately upon release. Unfortunately, this also results in the
> quota inodes not being released at all if a later part of the mount
> process should fail, because we never reclaim the inodes. So, clear
> MS_ACTIVE immediately after we finish the log recovery so that the quota
> inodes will be torn down properly if we abort the mount.
>
> Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> fs/xfs/xfs_mount.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 40d4e8b..d463ab3 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -949,7 +949,9 @@ xfs_mountfs(
> * iput to behave like they do for an active filesystem.
> * xfs_fs_drop_inode needs to be able to prevent the deletion
> * of inodes before we're done replaying log items on those
> - * inodes.
> + * inodes. Turn it off immediately after xfs_log_mount_finish
> + * so that we don't leak the quota inodes if subsequent mount
> + * activities fail.
> */
> mp->m_super->s_flags |= MS_ACTIVE;
>
> @@ -959,6 +961,7 @@ xfs_mountfs(
> * read in.
> */
> error = xfs_log_mount_finish(mp);
> + mp->m_super->s_flags &= ~MS_ACTIVE;
Just an aesthetic nit, but could we combine these two above hunks and
the associated comments so the intent is very obvious? E.g., so it looks
something like this:
/*
* Set MS_ACTIVE around log recovery ...
*/
mp->m_super->s_flags |= MS_ACTIVE;
error = xfs_log_mount_finish(mp);
mp->m_super->s_flags &= ~MS_ACTIVE;
...
Otherwise this looks fine:
Reviewed-by: Brian Foster <bfoster@redhat.com>
> if (error) {
> xfs_warn(mp, "log mount finish failed");
> goto out_rtunmount;
> @@ -1028,7 +1031,6 @@ xfs_mountfs(
> out_quota:
> xfs_qm_unmount_quotas(mp);
> out_rtunmount:
> - mp->m_super->s_flags &= ~MS_ACTIVE;
> xfs_rtunmount_inodes(mp);
> out_rele_rip:
> IRELE(rip);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
2017-08-09 12:36 ` Brian Foster
@ 2017-08-09 15:46 ` Darrick J. Wong
0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-09 15:46 UTC (permalink / raw)
To: Brian Foster; +Cc: xfs
On Wed, Aug 09, 2017 at 08:36:03AM -0400, Brian Foster wrote:
> On Tue, Aug 08, 2017 at 06:04:44PM -0700, Darrick J. Wong wrote:
> > Way back when we established inode block-map redo log items, it was
> > discovered that we needed to prevent the VFS from evicting inodes during
> > log recovery because any given inode might be have bmap redo items to
> > replay even if the inode has no link count and is ultimately deleted,
> > and any eviction of an unlinked inode causes the inode to be truncated
> > and freed too early.
> >
> > To make this possible, we set MS_ACTIVE so that inodes would not be torn
> > down immediately upon release. Unfortunately, this also results in the
> > quota inodes not being released at all if a later part of the mount
> > process should fail, because we never reclaim the inodes. So, clear
> > MS_ACTIVE immediately after we finish the log recovery so that the quota
> > inodes will be torn down properly if we abort the mount.
> >
> > Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > fs/xfs/xfs_mount.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > index 40d4e8b..d463ab3 100644
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -949,7 +949,9 @@ xfs_mountfs(
> > * iput to behave like they do for an active filesystem.
> > * xfs_fs_drop_inode needs to be able to prevent the deletion
> > * of inodes before we're done replaying log items on those
> > - * inodes.
> > + * inodes. Turn it off immediately after xfs_log_mount_finish
> > + * so that we don't leak the quota inodes if subsequent mount
> > + * activities fail.
> > */
> > mp->m_super->s_flags |= MS_ACTIVE;
> >
> > @@ -959,6 +961,7 @@ xfs_mountfs(
> > * read in.
> > */
> > error = xfs_log_mount_finish(mp);
> > + mp->m_super->s_flags &= ~MS_ACTIVE;
>
> Just an aesthetic nit, but could we combine these two above hunks and
> the associated comments so the intent is very obvious? E.g., so it looks
> something like this:
>
> /*
> * Set MS_ACTIVE around log recovery ...
> */
> mp->m_super->s_flags |= MS_ACTIVE;
> error = xfs_log_mount_finish(mp);
> mp->m_super->s_flags &= ~MS_ACTIVE;
> ...
Yes, that's fine.
--D
>
> Otherwise this looks fine:
>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
>
> > if (error) {
> > xfs_warn(mp, "log mount finish failed");
> > goto out_rtunmount;
> > @@ -1028,7 +1031,6 @@ xfs_mountfs(
> > out_quota:
> > xfs_qm_unmount_quotas(mp);
> > out_rtunmount:
> > - mp->m_super->s_flags &= ~MS_ACTIVE;
> > xfs_rtunmount_inodes(mp);
> > out_rele_rip:
> > IRELE(rip);
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
@ 2017-08-10 5:23 Darrick J. Wong
2017-08-10 18:15 ` Allison Henderson
2017-08-11 11:13 ` Christoph Hellwig
0 siblings, 2 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-10 5:23 UTC (permalink / raw)
To: linux-xfs; +Cc: Brian Foster
From: Darrick J. Wong <darrick.wong@oracle.com>
Way back when we established inode block-map redo log items, it was
discovered that we needed to prevent the VFS from evicting inodes during
log recovery because any given inode might be have bmap redo items to
replay even if the inode has no link count and is ultimately deleted,
and any eviction of an unlinked inode causes the inode to be truncated
and freed too early.
To make this possible, we set MS_ACTIVE so that inodes would not be torn
down immediately upon release. Unfortunately, this also results in the
quota inodes not being released at all if a later part of the mount
process should fail, because we never reclaim the inodes. So, clear
MS_ACTIVE immediately after we finish the log recovery so that the quota
inodes will be torn down properly if we abort the mount.
Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
v2: wrap the xfs_log_mount_finish more tightly with the MS_ACTIVE mods
---
fs/xfs/xfs_mount.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 40d4e8b..d63a367 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -945,20 +945,21 @@ xfs_mountfs(
}
/*
+ * Finish recovering the file system. This part needed to be delayed
+ * until after the root and real-time bitmap inodes were consistently
+ * read in.
+ *
* During the second phase of log recovery, we need iget and
* iput to behave like they do for an active filesystem.
* xfs_fs_drop_inode needs to be able to prevent the deletion
* of inodes before we're done replaying log items on those
- * inodes.
+ * inodes. Turn it off immediately after xfs_log_mount_finish
+ * so that we don't leak the quota inodes if subsequent mount
+ * activities fail.
*/
mp->m_super->s_flags |= MS_ACTIVE;
-
- /*
- * Finish recovering the file system. This part needed to be delayed
- * until after the root and real-time bitmap inodes were consistently
- * read in.
- */
error = xfs_log_mount_finish(mp);
+ mp->m_super->s_flags &= ~MS_ACTIVE;
if (error) {
xfs_warn(mp, "log mount finish failed");
goto out_rtunmount;
@@ -1028,7 +1029,6 @@ xfs_mountfs(
out_quota:
xfs_qm_unmount_quotas(mp);
out_rtunmount:
- mp->m_super->s_flags &= ~MS_ACTIVE;
xfs_rtunmount_inodes(mp);
out_rele_rip:
IRELE(rip);
^ permalink raw reply related [flat|nested] 14+ messages in thread* Re: [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
2017-08-10 5:23 Darrick J. Wong
@ 2017-08-10 18:15 ` Allison Henderson
2017-08-11 11:13 ` Christoph Hellwig
1 sibling, 0 replies; 14+ messages in thread
From: Allison Henderson @ 2017-08-10 18:15 UTC (permalink / raw)
To: Darrick J. Wong, linux-xfs; +Cc: Brian Foster
On 8/9/2017 10:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Way back when we established inode block-map redo log items, it was
> discovered that we needed to prevent the VFS from evicting inodes during
> log recovery because any given inode might be have bmap redo items to
> replay even if the inode has no link count and is ultimately deleted,
> and any eviction of an unlinked inode causes the inode to be truncated
> and freed too early.
>
> To make this possible, we set MS_ACTIVE so that inodes would not be torn
> down immediately upon release. Unfortunately, this also results in the
> quota inodes not being released at all if a later part of the mount
> process should fail, because we never reclaim the inodes. So, clear
> MS_ACTIVE immediately after we finish the log recovery so that the quota
> inodes will be torn down properly if we abort the mount.
>
> Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
These look good to me. You can add my review to your set. Thanks!
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> ---
> v2: wrap the xfs_log_mount_finish more tightly with the MS_ACTIVE mods
> ---
> fs/xfs/xfs_mount.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
>
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 40d4e8b..d63a367 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -945,20 +945,21 @@ xfs_mountfs(
> }
>
> /*
> + * Finish recovering the file system. This part needed to be delayed
> + * until after the root and real-time bitmap inodes were consistently
> + * read in.
> + *
> * During the second phase of log recovery, we need iget and
> * iput to behave like they do for an active filesystem.
> * xfs_fs_drop_inode needs to be able to prevent the deletion
> * of inodes before we're done replaying log items on those
> - * inodes.
> + * inodes. Turn it off immediately after xfs_log_mount_finish
> + * so that we don't leak the quota inodes if subsequent mount
> + * activities fail.
> */
> mp->m_super->s_flags |= MS_ACTIVE;
> -
> - /*
> - * Finish recovering the file system. This part needed to be delayed
> - * until after the root and real-time bitmap inodes were consistently
> - * read in.
> - */
> error = xfs_log_mount_finish(mp);
> + mp->m_super->s_flags &= ~MS_ACTIVE;
> if (error) {
> xfs_warn(mp, "log mount finish failed");
> goto out_rtunmount;
> @@ -1028,7 +1029,6 @@ xfs_mountfs(
> out_quota:
> xfs_qm_unmount_quotas(mp);
> out_rtunmount:
> - mp->m_super->s_flags &= ~MS_ACTIVE;
> xfs_rtunmount_inodes(mp);
> out_rele_rip:
> IRELE(rip);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=ZRJN-nlKvtwRQMyfqfP0fbTnWqpvu10UW0FgWde9x3s&s=9TJPrWuX_B_5LmwXgPPbM_1TBzmmtjFE3RKmTgccKLk&e=
>
^ permalink raw reply [flat|nested] 14+ messages in thread* Re: [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak
2017-08-10 5:23 Darrick J. Wong
2017-08-10 18:15 ` Allison Henderson
@ 2017-08-11 11:13 ` Christoph Hellwig
1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2017-08-11 11:13 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: linux-xfs, Brian Foster
This looks much more reasonable than what we had before:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2017-08-11 11:13 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-09 1:04 [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Darrick J. Wong
2017-08-09 1:06 ` [PATCH 2/3] xfs: don't leak linked inodes during log recovery Darrick J. Wong
2017-08-09 12:36 ` Brian Foster
2017-08-09 16:49 ` Darrick J. Wong
2017-08-09 17:17 ` Brian Foster
2017-08-09 1:07 ` [PATCH 3/3] xfs: don't leak quotacheck dquots when cow recovery fails Darrick J. Wong
2017-08-09 12:36 ` Brian Foster
2017-08-09 16:06 ` Darrick J. Wong
2017-08-09 6:31 ` [PATCH 1/3] xfs: clear MS_ACTIVE after finishing log recovery to avoid inode leak Nikolay Borisov
2017-08-09 12:36 ` Brian Foster
2017-08-09 15:46 ` Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2017-08-10 5:23 Darrick J. Wong
2017-08-10 18:15 ` Allison Henderson
2017-08-11 11:13 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox