* [PATCH] xfs: during log recovery, destroy the unlinked inodes even for read-only mount
From: Hou Tao @ 2016-12-06 9:00 UTC
To: linux-xfs; +Cc: david, stable
During the second stage of log recovery, if the filesystem is first
mounted read-only, the unlinked inodes are not destroyed and the
unlinked lists in the AGIs are not cleared. Even after a read-write
remount or an unmount, the unlinked inodes remain valid on disk, and
the reported free space is incorrect.

To fix the problem, force xfs_inactive() to destroy the unlinked
inodes even when the filesystem is mounted read-only: clear the
XFS_MOUNT_RDONLY flag temporarily before recovering the unlinked
inodes, and restore the flag after the recovery is done.
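
As a rough illustration of the behaviour described above, the following
user-space toy model shows why a read-only flag check in the inactivation
path leaves unlinked inodes behind, and how a temporary save/clear/restore
of the flag changes that. Every name here (toy_mount, TOY_MOUNT_RDONLY,
toy_inactive, and so on) is hypothetical and only stands in for the kernel
structures; it is a sketch, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MOUNT_RDONLY (1u << 0)  /* hypothetical stand-in for XFS_MOUNT_RDONLY */

struct toy_mount {
	unsigned int flags;
	int unlinked;               /* inodes still on the unlinked list */
};

/* Models the early return: on a read-only mount nothing is freed. */
static bool toy_inactive(struct toy_mount *mp)
{
	if (mp->flags & TOY_MOUNT_RDONLY)
		return false;
	if (mp->unlinked > 0)
		mp->unlinked--;
	return true;
}

/* Unpatched behaviour: walk the list and try to inactivate each inode. */
static void toy_process_iunlinks(struct toy_mount *mp)
{
	int n = mp->unlinked;

	for (int i = 0; i < n; i++)
		toy_inactive(mp);
}

/* Patched behaviour: save the flag, clear it, recover, then restore it. */
static void toy_process_iunlinks_patched(struct toy_mount *mp)
{
	unsigned int ro_mount = mp->flags & TOY_MOUNT_RDONLY;

	mp->flags &= ~TOY_MOUNT_RDONLY;
	toy_process_iunlinks(mp);
	mp->flags |= ro_mount;

	/* The caller must see the same read-only state as before. */
	assert((mp->flags & TOY_MOUNT_RDONLY) == ro_mount);
}
```

With three unlinked inodes on a read-only toy mount, the unpatched walk
leaves all three in place, while the patched walk frees them and still
returns with the read-only flag set.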
The problem can be reproduced with the following steps:
1. mount an XFS filesystem in a KVM VM
2. in the VM, launch an application that does the following:
     open(xfs_file); unlink(xfs_file);
     while(1) { write(xfs_file, 2MB); sleep(1); }
3. wait 5 seconds, sync the XFS filesystem, and wait another 5 seconds
4. terminate the VM
5. start the VM and mount the XFS filesystem read-only
6. remount the filesystem read-write, or unmount it
7. check the unlinked list and the available free space
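
The application in step 2 could look roughly like the sketch below. The
function name, its parameters, and the bounded-rounds option are
assumptions added for illustration; the steps above only specify the
open/unlink/write/sleep sequence. Passing a negative round count loops
forever, matching the while(1) in step 2.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create `path`, unlink it while keeping the descriptor open, then write
 * `chunk` bytes every `delay_sec` seconds. A negative `rounds` means
 * "loop forever". Returns the total bytes written, or -1 on error.
 */
long write_to_unlinked_file(const char *path, size_t chunk, int rounds,
			    unsigned int delay_sec)
{
	long total = 0;
	char *buf;
	int fd;

	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	/* The inode now has no name but stays alive through the open fd. */
	if (unlink(path) < 0) {
		close(fd);
		return -1;
	}

	buf = malloc(chunk);
	if (!buf) {
		close(fd);
		return -1;
	}
	memset(buf, 0xab, chunk);

	while (rounds != 0) {
		ssize_t n = write(fd, buf, chunk);

		if (n < 0) {
			total = -1;
			break;
		}
		total += n;
		if (rounds > 0)
			rounds--;
		sleep(delay_sec);
	}

	free(buf);
	close(fd);
	return total;
}
```

For the reproducer, one would call it with a 2 MB chunk, a one-second
delay, and a negative round count, then terminate the VM while it is
still running so that the descriptor is never closed cleanly.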
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
fs/xfs/xfs_log_recover.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 9b3d7c7..fcc83e0 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -5025,6 +5025,7 @@ xlog_recover_process_iunlinks(
 	int		bucket;
 	int		error;
 	uint		mp_dmevmask;
+	int		ro_mount;
 	mp = log->l_mp;
@@ -5034,6 +5035,11 @@ xlog_recover_process_iunlinks(
 	mp_dmevmask = mp->m_dmevmask;
 	mp->m_dmevmask = 0;
+	/* Destroy the unlinked inodes even for read-only mount */
+	ro_mount = mp->m_flags & XFS_MOUNT_RDONLY;
+	if (ro_mount)
+		mp->m_flags &= ~XFS_MOUNT_RDONLY;
+
 	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
 		/*
 		 * Find the agi for this ag.
@@ -5070,6 +5076,9 @@ xlog_recover_process_iunlinks(
 		xfs_buf_rele(agibp);
 	}
+	if (ro_mount)
+		mp->m_flags |= XFS_MOUNT_RDONLY;
+
 	mp->m_dmevmask = mp_dmevmask;
 }
--
2.5.0
* Re: [PATCH] xfs: during log recovery, destroy the unlinked inodes even for read-only mount
From: Dave Chinner @ 2016-12-07 6:30 UTC
To: Hou Tao; +Cc: linux-xfs, stable
On Tue, Dec 06, 2016 at 05:00:47PM +0800, Hou Tao wrote:
> During the second stage of log recovery, if the filesystem is first
> mounted read-only, the unlinked inodes are not destroyed and the
> unlinked lists in the AGIs are not cleared. Even after a read-write
> remount or an unmount, the unlinked inodes remain valid on disk, and
> the reported free space is incorrect.
>
> To fix the problem, force xfs_inactive() to destroy the unlinked
> inodes even when the filesystem is mounted read-only: clear the
> XFS_MOUNT_RDONLY flag temporarily before recovering the unlinked
> inodes, and restore the flag after the recovery is done.
>
> The problem can be reproduced with the following steps:
> 1. mount an XFS filesystem in a KVM VM
> 2. in the VM, launch an application that does the following:
>      open(xfs_file); unlink(xfs_file);
>      while(1) { write(xfs_file, 2MB); sleep(1); }
> 3. wait 5 seconds, sync the XFS filesystem, and wait another 5 seconds
> 4. terminate the VM
> 5. start the VM and mount the XFS filesystem read-only
> 6. remount the filesystem read-write, or unmount it
> 7. check the unlinked list and the available free space

This is only papering over the larger problem.

I was talking to Eric about this larger "recovery on read-only
mount" problem last week on IRC - I can't find it in my logs right now,
but IIRC I'd suggested that we should always run xfs_mountfs()
in read-write mode if the underlying device can be written to,
and then, once that is complete, do a rw->ro transition exactly as we
do for a remount,ro operation.

That way we remove all the special "write on read only mount" hacks
we currently have throughout the code to enable log recovery to run
on read-only mounts.

Essentially, it requires moving the device read-only check from the
log code to xfs_fs_fill_super(), handling the no-recovery flag
there before we call xfs_mountfs(), and adding the rw->ro state
transition after we return. This will be much simpler and much more
reliable than trying to turn off "read only" state around certain
operations...
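
The control flow proposed above might be modelled, very loosely, by the
user-space sketch below. All names (toy_mount, toy_fill_super,
toy_mountfs, the TOY_* flags) are hypothetical, and the handling of a
read-only device with pending recovery work is an assumption of this
sketch rather than something spelled out in the proposal.

```c
#include <stdbool.h>

#define TOY_MOUNT_RDONLY     (1u << 0)  /* hypothetical read-only flag */
#define TOY_MOUNT_NORECOVERY (1u << 1)  /* hypothetical no-recovery flag */

struct toy_mount {
	unsigned int flags;
	bool dev_readonly;      /* underlying device cannot be written */
	int unlinked;           /* pending recovery work */
};

/* Stand-in for the mount path: it never sees a read-only flag. */
static void toy_mountfs(struct toy_mount *mp)
{
	mp->unlinked = 0;       /* recovery frees every unlinked inode */
}

/* Stand-in for the fill_super step. Returns 0 on success, -1 on refusal. */
static int toy_fill_super(struct toy_mount *mp, bool want_ro, bool norecovery)
{
	mp->flags = 0;

	if (norecovery) {
		/* Skipping recovery only makes sense on a read-only mount. */
		if (!want_ro)
			return -1;
		mp->flags = TOY_MOUNT_RDONLY | TOY_MOUNT_NORECOVERY;
		return 0;
	}

	if (mp->dev_readonly) {
		/* Assumption: refuse when recovery would need to write. */
		if (!want_ro || mp->unlinked > 0)
			return -1;
		mp->flags = TOY_MOUNT_RDONLY;
		return 0;
	}

	/* Writable device: always mount read-write first ... */
	toy_mountfs(mp);

	/* ... then do the rw->ro transition, as remount,ro would. */
	if (want_ro)
		mp->flags |= TOY_MOUNT_RDONLY;
	return 0;
}
```

The point of the design is that the recovery path contains no read-only
special cases at all; the read-only state is applied in exactly one
place, after the mount has completed.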
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] xfs: during log recovery, destroy the unlinked inodes even for read-only mount
From: Eric Sandeen @ 2016-12-15 4:07 UTC
To: Dave Chinner, Hou Tao; +Cc: linux-xfs, stable
On 12/7/16 12:30 AM, Dave Chinner wrote:
> On Tue, Dec 06, 2016 at 05:00:47PM +0800, Hou Tao wrote:
>> During the second stage of log recovery, if the filesystem is first
>> mounted read-only, the unlinked inodes are not destroyed and the
>> unlinked lists in the AGIs are not cleared. Even after a read-write
>> remount or an unmount, the unlinked inodes remain valid on disk, and
>> the reported free space is incorrect.
>>
>> To fix the problem, force xfs_inactive() to destroy the unlinked
>> inodes even when the filesystem is mounted read-only: clear the
>> XFS_MOUNT_RDONLY flag temporarily before recovering the unlinked
>> inodes, and restore the flag after the recovery is done.
>>
>> The problem can be reproduced with the following steps:
>> 1. mount an XFS filesystem in a KVM VM
>> 2. in the VM, launch an application that does the following:
>>      open(xfs_file); unlink(xfs_file);
>>      while(1) { write(xfs_file, 2MB); sleep(1); }
>> 3. wait 5 seconds, sync the XFS filesystem, and wait another 5 seconds
>> 4. terminate the VM
>> 5. start the VM and mount the XFS filesystem read-only
>> 6. remount the filesystem read-write, or unmount it
>> 7. check the unlinked list and the available free space
>
> This is only papering over the larger problem.
>
> I was talking to Eric about this larger "recovery on read-only
> mount" problem last week on IRC - I can't find it in my logs right now,
> but IIRC I'd suggested that we should always run xfs_mountfs()
> in read-write mode if the underlying device can be written to,
> and then once that is complete do a rw->ro transition exactly as we
> do for a remount,ro operation.
Yeah, I have a larger patchset to try to handle this and other
related processing that wasn't happening on ro mounts. I got
derailed because my regression test for it ran into all kinds
of unexpected new & unrelated bugs. So I haven't sent it yet...
There were lots of little bits here and there stemming, I think,
from old Irix code that didn't do /any/ device IO on a ro mount.
-Eric
> That way we remove all the special "write on read only mount" hacks
> we currently have throughout the code to enable log recovery to run
> on read-only mounts.
>
> Essentially, it requires moving the device read-only check from the
> log code to xfs_fs_fill_super() and handling the no-recovery flag
> there before we call xfs_mountfs() and adding the rw->ro state
> transition after we return. This will be much simpler and much more
> reliable than trying to turn off "read only" state around certain
> operations...
>
> Cheers,
>
> Dave.
>