* [PATCH] xfs: dummy transactions should not dirty VFS state
@ 2010-08-03 9:10 Dave Chinner
2010-08-03 9:55 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2010-08-03 9:10 UTC (permalink / raw)
To: xfs
From: Dave Chinner <dchinner@redhat.com>
When we need to cover the log, we issue dummy transactions to ensure
the current log tail is on disk. Unfortunately we currently use the
root inode in the dummy transaction, and the act of committing the
transaction dirties the inode at the VFS level.
As a result, the VFS writeback of the dirty inode will prevent the
filesystem from idling long enough for the log covering state
machine to complete. The state machine gets stuck in a loop issuing
new dummy transactions to cover the log and never makes progress.
To avoid this problem, the dummy transactions should not cause
externally visible state changes. To ensure this occurs, make sure
that dummy transactions log an unchanging field in the superblock as
its state is never propagated outside the filesystem. This allows
the log covering state machine to complete successfully and the
filesystem now correctly enters a fully idle state about 90s after
the last modification was made.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/linux-2.6/xfs_sync.c | 37 +++----------------------------------
fs/xfs/xfs_fsops.c | 24 ++++++++++++++----------
2 files changed, 17 insertions(+), 44 deletions(-)
diff --git a/fs/xfs/linux-2.6/xfs_sync.c b/fs/xfs/linux-2.6/xfs_sync.c
index dfcbd98..bec6539 100644
--- a/fs/xfs/linux-2.6/xfs_sync.c
+++ b/fs/xfs/linux-2.6/xfs_sync.c
@@ -34,6 +34,7 @@
#include "xfs_inode_item.h"
#include "xfs_quota.h"
#include "xfs_trace.h"
+#include "xfs_fsops.h"
#include <linux/kthread.h>
#include <linux/freezer.h>
@@ -341,38 +342,6 @@ xfs_sync_attr(
}
STATIC int
-xfs_commit_dummy_trans(
- struct xfs_mount *mp,
- uint flags)
-{
- struct xfs_inode *ip = mp->m_rootip;
- struct xfs_trans *tp;
- int error;
-
- /*
- * Put a dummy transaction in the log to tell recovery
- * that all others are OK.
- */
- tp = xfs_trans_alloc(mp, XFS_TRANS_DUMMY1);
- error = xfs_trans_reserve(tp, 0, XFS_ICHANGE_LOG_RES(mp), 0, 0, 0);
- if (error) {
- xfs_trans_cancel(tp, 0);
- return error;
- }
-
- xfs_ilock(ip, XFS_ILOCK_EXCL);
-
- xfs_trans_ijoin(tp, ip);
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
- error = xfs_trans_commit(tp, 0);
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
-
- /* the log force ensures this transaction is pushed to disk */
- xfs_log_force(mp, (flags & SYNC_WAIT) ? XFS_LOG_SYNC : 0);
- return error;
-}
-
-STATIC int
xfs_sync_fsdata(
struct xfs_mount *mp)
{
@@ -432,7 +401,7 @@ xfs_quiesce_data(
/* mark the log as covered if needed */
if (xfs_log_need_covered(mp))
- error2 = xfs_commit_dummy_trans(mp, SYNC_WAIT);
+ error2 = xfs_fs_log_dummy(mp);
/* flush data-only devices */
if (mp->m_rtdev_targp)
@@ -578,7 +547,7 @@ xfs_sync_worker(
/* dgc: errors ignored here */
error = xfs_qm_sync(mp, SYNC_TRYLOCK);
if (xfs_log_need_covered(mp))
- error = xfs_commit_dummy_trans(mp, 0);
+ error = xfs_fs_log_dummy(mp);
}
mp->m_sync_seq++;
wake_up(&mp->m_wait_single_sync_task);
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index dbca5f5..9e15623 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -604,6 +604,15 @@ out:
return 0;
}
+/*
+ * Dump a transaction into the log that contains no real change. This is needed
+ * to be able to make the log dirty or stamp the current tail LSN into the log
+ * during the covering operation.
+ *
+ * We cannot use an inode here for this - that will push dirty state back up
+ * into the VFS and then periodic inode flushing will prevent log covering from
+ * making progress. Hence we log a field in the superblock instead.
+ */
int
xfs_fs_log_dummy(
xfs_mount_t *mp)
@@ -613,22 +622,17 @@ xfs_fs_log_dummy(
int error;
tp = _xfs_trans_alloc(mp, XFS_TRANS_DUMMY1, KM_SLEEP);
- error = xfs_trans_reserve(tp, 0, XFS_ICHANGE_LOG_RES(mp), 0, 0, 0);
+ error = xfs_trans_reserve(tp, 0, mp->m_sb.sb_sectsize + 128, 0, 0,
+ XFS_DEFAULT_LOG_COUNT);
if (error) {
xfs_trans_cancel(tp, 0);
return error;
}
- ip = mp->m_rootip;
- xfs_ilock(ip, XFS_ILOCK_EXCL);
-
- xfs_trans_ijoin(tp, ip);
- xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+ /* log the UUID because it is an unchanging field */
+ xfs_mod_sb(tp, XFS_SB_UUID);
xfs_trans_set_sync(tp);
- error = xfs_trans_commit(tp, 0);
-
- xfs_iunlock(ip, XFS_ILOCK_EXCL);
- return error;
+ return xfs_trans_commit(tp, 0);
}
int
--
1.7.1
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: [PATCH] xfs: dummy transactions should not dirty VFS state
2010-08-03 9:10 [PATCH] xfs: dummy transactions should not dirty VFS state Dave Chinner
@ 2010-08-03 9:55 ` Christoph Hellwig
2010-08-03 12:07 ` Dave Chinner
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2010-08-03 9:55 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
It looks good to me, but there are two subtle differences between
xfs_commit_dummy_trans and xfs_fs_log_dummy that get lost. For one,
xfs_commit_dummy_trans doesn't actually commit a synchronous transaction
(or rather force out the log) unless SYNC_WAIT is set. In addition
to that, xfs_fs_log_dummy uses _xfs_trans_alloc, which doesn't get
blocked by the filesystem freezing.
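
A minimal userspace sketch of the first difference, for illustration only. It is not kernel code: `model_log`, `model_commit_dummy_trans`, `model_fs_log_dummy` and `MODEL_SYNC_WAIT` are hypothetical stand-ins, and the counters only model whether the caller waits for the log to reach disk. The old helper waits only when the caller passes the wait flag; the new one marks the transaction synchronous, so every call waits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag standing in for SYNC_WAIT; the value is illustrative. */
#define MODEL_SYNC_WAIT 0x1u

/* Counters standing in for what the log would see. */
struct model_log {
	int commits;     /* dummy transactions committed */
	int sync_waits;  /* times the caller waited for the log to hit disk */
};

/* Old behaviour: commit, then force the log; wait only with the flag set. */
static void model_commit_dummy_trans(struct model_log *log, unsigned int flags)
{
	assert(log != NULL);
	log->commits++;
	if (flags & MODEL_SYNC_WAIT)
		log->sync_waits++;
}

/* New behaviour: the transaction is synchronous, so every call waits. */
static void model_fs_log_dummy(struct model_log *log)
{
	assert(log != NULL);
	log->commits++;
	log->sync_waits++;
}
```

With this model, a periodic caller that used to pass no flags now waits on every dummy transaction, which is the behaviour change being pointed out.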
* Re: [PATCH] xfs: dummy transactions should not dirty VFS state
2010-08-03 9:55 ` Christoph Hellwig
@ 2010-08-03 12:07 ` Dave Chinner
2010-08-03 13:50 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2010-08-03 12:07 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Tue, Aug 03, 2010 at 05:55:18AM -0400, Christoph Hellwig wrote:
> It looks good to me, but there are two subtle differences between
> xfs_commit_dummy_trans and xfs_fs_log_dummy that get lost.
Yes, I noticed those things. Especially as I modified the wrong
one in the first place and realised both need fixing and the
duplication of code seems completely unnecessary. We should have
only one copy of this code, not two copies that do slightly
different things.
> For one
> xfs_commit_dummy_trans doesn't actually commit a synchronous transaction
> (or rather forces out the log) unless SYNC_WAIT is set,
I don't think that we really _need_ a non-blocking version - waiting
for a single sync transaction in xfssyncd once every 36s is hardly
going to kill performance.
> in addition
> to that xfs_fs_log_dummy uses _xfs_trans_alloc, which doesn't get
> blocked by the filesystem freezing.
Everything will be clean on a frozen filesystem, so all the current
code does is block the xfssyncd until the filesystem is
unfrozen. Given that we can still read everything on the frozen
filesystem, inode caches can still grow and hence we still need to
run regular reclaiming. If the xfssyncd is blocked then only memory
pressure can free up inodes.
If we want to keep all these little differences, then we still
need to kill one of the two versions. Let me know which you
prefer...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH] xfs: dummy transactions should not dirty VFS state
2010-08-03 12:07 ` Dave Chinner
@ 2010-08-03 13:50 ` Christoph Hellwig
0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2010-08-03 13:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Tue, Aug 03, 2010 at 10:07:28PM +1000, Dave Chinner wrote:
> Yes, I noticed those things. Especially as I modified the wrong
> one in the first place and realised both need fixing and the
> duplication of code seems completely unnecessary. We should have
> only one copy of this code, not two copies that do slightly
> different things.
Yes, having one copy is much better.
> > For one
> > xfs_commit_dummy_trans doesn't actually commit a synchronous transaction
> > (or rather forces out the log) unless SYNC_WAIT is set,
>
> I don't think that we really _need_ a non-blocking version - waiting
> for a single sync transaction in xfssyncd once every 36s is hardly
> going to kill performance.
Sounds fair, but it needs documentation in the changelog, and possibly
in the source code as well.
> > in addition
> > to that xfs_fs_log_dummy uses _xfs_trans_alloc, which doesn't get
> > blocked by the filesystem freezing.
>
> Everything will be clean on a frozen filesystem, so all the current
> code does is block the xfssyncd until the filesystem is
> unfrozen. Given that we can still read everything on the frozen
> filesystem, inode caches can still grow and hence we still need to
> run regular reclaiming. If the xfssyncd is blocked then only memory
> pressure can free up inodes.
That's a reason not to wait. But given the bugs we had in this area
I'd rather not blindly start the transaction here.
Instead we could check s_frozen manually to not bother even doing
the calls to write the dummy record, plus maybe an assert so that it
trips up for debug builds.
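
A rough sketch of that idea as a userspace model, not kernel code. Everything named `model_*` or `MODEL_*` is a hypothetical stand-in: `frozen` plays the role of s_frozen, `need_covered` the role of xfs_log_need_covered(), and `dummy_commits` just counts how many dummy records would have been written. The caller skips the dummy transaction while the filesystem is frozen, and the helper asserts that it is never reached in that state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical freeze levels; only "unfrozen vs. not" matters here. */
enum model_frozen {
	MODEL_UNFROZEN = 0,
	MODEL_FREEZE_WRITE,
	MODEL_FREEZE_TRANS
};

struct model_mount {
	enum model_frozen frozen;  /* stand-in for s_frozen */
	bool need_covered;         /* stand-in for xfs_log_need_covered() */
	int dummy_commits;         /* dummy records written so far */
};

/* Writes the dummy record; must never run on a frozen filesystem. */
static int model_log_dummy(struct model_mount *mp)
{
	assert(mp->frozen == MODEL_UNFROZEN);  /* trips in debug builds */
	mp->dummy_commits++;
	return 0;
}

/* Periodic worker: check the freeze state first instead of blocking. */
static int model_sync_worker(struct model_mount *mp)
{
	if (mp->frozen != MODEL_UNFROZEN)
		return 0;  /* frozen: nothing to cover, skip the transaction */
	if (!mp->need_covered)
		return 0;
	return model_log_dummy(mp);
}
```

The worker keeps running while frozen, so other periodic work such as inode reclaim is not held up, and no transaction is started behind the freeze.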