From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 04/37] xfs: implement freezing by emptying the AIL
Date: Mon, 23 Apr 2012 15:58:34 +1000 [thread overview]
Message-ID: <1335160747-17254-5-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1335160747-17254-1-git-send-email-david@fromorbit.com>
From: Christoph Hellwig <hch@infradead.org>
Now that we write back all metadata either synchronously or through
the AIL we can simply implement metadata freezing in terms of
emptying the AIL.
The implementation for this is fairly simply and straight-forward:
A new routine is added that asks the xfsaild to push the AIL to the
end and waits for it to complete and send a wakeup. The routine will
then loop if the AIL is not actually empty, and continue to do so
until the AIL is compeltely empty.
We keep an inode reclaim pass in the freeze process to avoid having
memory pressure have to reclaim inodes that require dirtying the
filesystem to be reclaimed after the freeze has completed. This
means we can also treat unmount in the exact same way as freeze.
As an upside we can now remove the radix tree based inode writeback
and xfs_unmountfs_writesb.
[ Dave Chinner:
- Cleaned up commit message.
- Added inode reclaim passes back into freeze.
- Cleaned up wakeup mechanism to avoid the use of a new
sleep counter variable. ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_mount.c | 56 ++++++---------------------
fs/xfs/xfs_mount.h | 1 -
fs/xfs/xfs_sync.c | 96 +++++++----------------------------------------
fs/xfs/xfs_trans_ail.c | 36 ++++++++++++++----
fs/xfs/xfs_trans_priv.h | 2 +
5 files changed, 56 insertions(+), 135 deletions(-)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 5aa7444..f7f3e97 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -22,6 +22,7 @@
#include "xfs_log.h"
#include "xfs_inum.h"
#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
#include "xfs_sb.h"
#include "xfs_ag.h"
#include "xfs_dir2.h"
@@ -1475,15 +1476,15 @@ xfs_unmountfs(
xfs_log_force(mp, XFS_LOG_SYNC);
/*
- * Do a delwri reclaim pass first so that as many dirty inodes are
- * queued up for IO as possible. Then flush the buffers before making
- * a synchronous path to catch all the remaining inodes are reclaimed.
- * This makes the reclaim process as quick as possible by avoiding
- * synchronous writeout and blocking on inodes already in the delwri
- * state as much as possible.
+ * Flush all pending changes from the AIL.
+ */
+ xfs_ail_push_all_sync(mp->m_ail);
+
+ /*
+ * And reclaim all inodes. At this point there should be no dirty
+ * inode, and none should be pinned or locked, but use synchronous
+ * reclaim just to be sure.
*/
- xfs_reclaim_inodes(mp, 0);
- xfs_flush_buftarg(mp->m_ddev_targp, 1);
xfs_reclaim_inodes(mp, SYNC_WAIT);
xfs_qm_unmount(mp);
@@ -1519,15 +1520,12 @@ xfs_unmountfs(
if (error)
xfs_warn(mp, "Unable to update superblock counters. "
"Freespace may not be correct on next mount.");
- xfs_unmountfs_writesb(mp);
/*
- * Make sure all buffers have been flushed and completed before
- * unmounting the log.
+ * At this point we might have modified the superblock again and thus
+ * added an item to the AIL, thus flush it again.
*/
- error = xfs_flush_buftarg(mp->m_ddev_targp, 1);
- if (error)
- xfs_warn(mp, "%d busy buffers during unmount.", error);
+ xfs_ail_push_all_sync(mp->m_ail);
xfs_wait_buftarg(mp->m_ddev_targp);
xfs_log_unmount_write(mp);
@@ -1588,36 +1586,6 @@ xfs_log_sbcount(xfs_mount_t *mp)
return error;
}
-int
-xfs_unmountfs_writesb(xfs_mount_t *mp)
-{
- xfs_buf_t *sbp;
- int error = 0;
-
- /*
- * skip superblock write if fs is read-only, or
- * if we are doing a forced umount.
- */
- if (!((mp->m_flags & XFS_MOUNT_RDONLY) ||
- XFS_FORCED_SHUTDOWN(mp))) {
-
- sbp = xfs_getsb(mp, 0);
-
- XFS_BUF_UNDONE(sbp);
- XFS_BUF_UNREAD(sbp);
- xfs_buf_delwri_dequeue(sbp);
- XFS_BUF_WRITE(sbp);
- XFS_BUF_UNASYNC(sbp);
- ASSERT(sbp->b_target == mp->m_ddev_targp);
- xfsbdstrat(mp, sbp);
- error = xfs_buf_iowait(sbp);
- if (error)
- xfs_buf_ioerror_alert(sbp, __func__);
- xfs_buf_relse(sbp);
- }
- return error;
-}
-
/*
* xfs_mod_sb() can be used to copy arbitrary changes to the
* in-core superblock into the superblock buffer to be logged.
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 73f6c7a..dbd9d42 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -375,7 +375,6 @@ extern __uint64_t xfs_default_resblks(xfs_mount_t *mp);
extern int xfs_mountfs(xfs_mount_t *mp);
extern void xfs_unmountfs(xfs_mount_t *);
-extern int xfs_unmountfs_writesb(xfs_mount_t *);
extern int xfs_mod_incore_sb(xfs_mount_t *, xfs_sb_field_t, int64_t, int);
extern int xfs_mod_incore_sb_batch(xfs_mount_t *, xfs_mod_sb_t *,
uint, int);
diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
index c07cbb4..7163ca8 100644
--- a/fs/xfs/xfs_sync.c
+++ b/fs/xfs/xfs_sync.c
@@ -241,45 +241,6 @@ xfs_sync_inode_data(
return error;
}
-STATIC int
-xfs_sync_inode_attr(
- struct xfs_inode *ip,
- struct xfs_perag *pag,
- int flags)
-{
- int error = 0;
-
- xfs_ilock(ip, XFS_ILOCK_SHARED);
- if (xfs_inode_clean(ip))
- goto out_unlock;
- if (!xfs_iflock_nowait(ip)) {
- if (!(flags & SYNC_WAIT))
- goto out_unlock;
- xfs_iflock(ip);
- }
-
- if (xfs_inode_clean(ip)) {
- xfs_ifunlock(ip);
- goto out_unlock;
- }
-
- error = xfs_iflush(ip, flags);
-
- /*
- * We don't want to try again on non-blocking flushes that can't run
- * again immediately. If an inode really must be written, then that's
- * what the SYNC_WAIT flag is for.
- */
- if (error == EAGAIN) {
- ASSERT(!(flags & SYNC_WAIT));
- error = 0;
- }
-
- out_unlock:
- xfs_iunlock(ip, XFS_ILOCK_SHARED);
- return error;
-}
-
/*
* Write out pagecache data for the whole filesystem.
*/
@@ -300,19 +261,6 @@ xfs_sync_data(
return 0;
}
-/*
- * Write out inode metadata (attributes) for the whole filesystem.
- */
-STATIC int
-xfs_sync_attr(
- struct xfs_mount *mp,
- int flags)
-{
- ASSERT((flags & ~SYNC_WAIT) == 0);
-
- return xfs_inode_ag_iterator(mp, xfs_sync_inode_attr, flags);
-}
-
STATIC int
xfs_sync_fsdata(
struct xfs_mount *mp)
@@ -350,7 +298,7 @@ xfs_sync_fsdata(
* First stage of freeze - no writers will make progress now we are here,
* so we flush delwri and delalloc buffers here, then wait for all I/O to
* complete. Data is frozen at that point. Metadata is not frozen,
- * transactions can still occur here so don't bother flushing the buftarg
+ * transactions can still occur here so don't bother emptying the AIL
* because it'll just get dirty again.
*/
int
@@ -379,33 +327,6 @@ xfs_quiesce_data(
return error ? error : error2;
}
-STATIC void
-xfs_quiesce_fs(
- struct xfs_mount *mp)
-{
- int count = 0, pincount;
-
- xfs_reclaim_inodes(mp, 0);
- xfs_flush_buftarg(mp->m_ddev_targp, 0);
-
- /*
- * This loop must run at least twice. The first instance of the loop
- * will flush most meta data but that will generate more meta data
- * (typically directory updates). Which then must be flushed and
- * logged before we can write the unmount record. We also so sync
- * reclaim of inodes to catch any that the above delwri flush skipped.
- */
- do {
- xfs_reclaim_inodes(mp, SYNC_WAIT);
- xfs_sync_attr(mp, SYNC_WAIT);
- pincount = xfs_flush_buftarg(mp->m_ddev_targp, 1);
- if (!pincount) {
- delay(50);
- count++;
- }
- } while (count < 2);
-}
-
/*
* Second stage of a quiesce. The data is already synced, now we have to take
* care of the metadata. New transactions are already blocked, so we need to
@@ -421,8 +342,12 @@ xfs_quiesce_attr(
while (atomic_read(&mp->m_active_trans) > 0)
delay(100);
- /* flush inodes and push all remaining buffers out to disk */
- xfs_quiesce_fs(mp);
+ /* reclaim inodes to do any IO before the freeze completes */
+ xfs_reclaim_inodes(mp, 0);
+ xfs_reclaim_inodes(mp, SYNC_WAIT);
+
+ /* flush all pending changes from the AIL */
+ xfs_ail_push_all_sync(mp->m_ail);
/*
* Just warn here till VFS can correctly support
@@ -436,7 +361,12 @@ xfs_quiesce_attr(
xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
"Frozen image may not be consistent.");
xfs_log_unmount_write(mp);
- xfs_unmountfs_writesb(mp);
+
+ /*
+ * At this point we might have modified the superblock again and thus
+ * added an item to the AIL, thus flush it again.
+ */
+ xfs_ail_push_all_sync(mp->m_ail);
}
static void
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 77acc53..0425ca1 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -383,9 +383,8 @@ xfsaild_push(
spin_lock(&ailp->xa_lock);
}
- target = ailp->xa_target;
lip = xfs_trans_ail_cursor_first(ailp, &cur, ailp->xa_last_pushed_lsn);
- if (!lip || XFS_FORCED_SHUTDOWN(mp)) {
+ if (!lip) {
/*
* AIL is empty or our push has reached the end.
*/
@@ -408,6 +407,7 @@ xfsaild_push(
* lots of contention on the AIL lists.
*/
lsn = lip->li_lsn;
+ target = ailp->xa_target;
while ((XFS_LSN_CMP(lip->li_lsn, target) <= 0)) {
int lock_result;
/*
@@ -466,11 +466,6 @@ xfsaild_push(
}
spin_lock(&ailp->xa_lock);
- /* should we bother continuing? */
- if (XFS_FORCED_SHUTDOWN(mp))
- break;
- ASSERT(mp->m_log);
-
count++;
/*
@@ -611,6 +606,30 @@ xfs_ail_push_all(
}
/*
+ * Push out all items in the AIL immediately and wait until the AIL is empty.
+ */
+void
+xfs_ail_push_all_sync(
+ struct xfs_ail *ailp)
+{
+ struct xfs_log_item *lip;
+ DEFINE_WAIT(wait);
+
+ spin_lock(&ailp->xa_lock);
+ while ((lip = xfs_ail_max(ailp)) != NULL) {
+ prepare_to_wait(&ailp->xa_empty, &wait, TASK_UNINTERRUPTIBLE);
+ ailp->xa_target = lip->li_lsn;
+ wake_up_process(ailp->xa_task);
+ spin_unlock(&ailp->xa_lock);
+ schedule();
+ spin_lock(&ailp->xa_lock);
+ }
+ spin_unlock(&ailp->xa_lock);
+
+ finish_wait(&ailp->xa_empty, &wait);
+}
+
+/*
* xfs_trans_ail_update - bulk AIL insertion operation.
*
* @xfs_trans_ail_update takes an array of log items that all need to be
@@ -737,6 +756,8 @@ xfs_trans_ail_delete_bulk(
if (mlip_changed) {
if (!XFS_FORCED_SHUTDOWN(ailp->xa_mount))
xlog_assign_tail_lsn_locked(ailp->xa_mount);
+ if (list_empty(&ailp->xa_ail))
+ wake_up_all(&ailp->xa_empty);
spin_unlock(&ailp->xa_lock);
xfs_log_space_wake(ailp->xa_mount);
@@ -773,6 +794,7 @@ xfs_trans_ail_init(
INIT_LIST_HEAD(&ailp->xa_ail);
INIT_LIST_HEAD(&ailp->xa_cursors);
spin_lock_init(&ailp->xa_lock);
+ init_waitqueue_head(&ailp->xa_empty);
ailp->xa_task = kthread_run(xfsaild, ailp, "xfsaild/%s",
ailp->xa_mount->m_fsname);
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index 46a1ebd..218304a 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -71,6 +71,7 @@ struct xfs_ail {
spinlock_t xa_lock;
xfs_lsn_t xa_last_pushed_lsn;
int xa_log_flush;
+ wait_queue_head_t xa_empty;
};
/*
@@ -102,6 +103,7 @@ xfs_trans_ail_delete(
void xfs_ail_push(struct xfs_ail *, xfs_lsn_t);
void xfs_ail_push_all(struct xfs_ail *);
+void xfs_ail_push_all_sync(struct xfs_ail *);
struct xfs_log_item *xfs_ail_min(struct xfs_ail *ailp);
xfs_lsn_t xfs_ail_min_lsn(struct xfs_ail *ailp);
--
1.7.9.5
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2012-04-23 5:59 UTC|newest]
Thread overview: 96+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-23 5:58 [PATCH 00/37] xfs: current 3.4 patch queue Dave Chinner
2012-04-23 5:58 ` [PATCH 01/37] xfs: remove log item from AIL in xfs_qm_dqflush after a shutdown Dave Chinner
2012-04-23 5:58 ` [PATCH 02/37] xfs: remove log item from AIL in xfs_iflush " Dave Chinner
2012-04-23 15:39 ` Mark Tinguely
2012-04-23 5:58 ` [PATCH 03/37] xfs: allow assigning the tail lsn with the AIL lock held Dave Chinner
2012-04-23 5:58 ` Dave Chinner [this message]
2012-04-23 15:40 ` [PATCH 04/37] xfs: implement freezing by emptying the AIL Mark Tinguely
2012-04-29 21:43 ` Christoph Hellwig
2012-04-23 5:58 ` [PATCH 05/37] xfs: don't flush inodes from background inode reclaim Dave Chinner
2012-04-23 5:58 ` [PATCH 06/37] xfs: do not write the buffer from xfs_iflush Dave Chinner
2012-04-23 5:58 ` [PATCH 07/37] xfs: do not write the buffer from xfs_qm_dqflush Dave Chinner
2012-04-23 5:58 ` [PATCH 08/37] xfs: do not add buffers to the delwri queue until pushed Dave Chinner
2012-04-23 5:58 ` [PATCH 09/37] xfs: on-stack delayed write buffer lists Dave Chinner
2012-04-25 18:34 ` Mark Tinguely
2012-04-29 21:44 ` Christoph Hellwig
2012-04-23 5:58 ` [PATCH 10/37] xfs: remove some obsolete comments in xfs_trans_ail.c Dave Chinner
2012-04-23 15:41 ` Mark Tinguely
2012-04-23 5:58 ` [PATCH 11/37] xfs: pass shutdown method into xfs_trans_ail_delete_bulk Dave Chinner
2012-04-23 5:58 ` [PATCH 12/37] xfs: Do background CIL flushes via a workqueue Dave Chinner
2012-04-23 7:54 ` [PATCH 12/37 V2] " Dave Chinner
2012-04-29 21:46 ` Christoph Hellwig
2012-04-23 5:58 ` [PATCH 13/37] xfs: page type check in writeback only checks last buffer Dave Chinner
2012-04-23 5:58 ` [PATCH 14/37] xfs: Use preallocation for inodes with extsz hints Dave Chinner
2012-04-29 21:47 ` Christoph Hellwig
2012-04-23 5:58 ` [PATCH 15/37] xfs: fix buffer lookup race on allocation failure Dave Chinner
2012-04-23 5:58 ` [PATCH 16/37] xfs: check for buffer errors before waiting Dave Chinner
2012-04-23 5:58 ` [PATCH 17/37] xfs: fix incorrect b_offset initialisation Dave Chinner
2012-04-23 5:58 ` [PATCH 18/37] xfs: use kmem_zone_zalloc for buffers Dave Chinner
2012-04-23 5:58 ` [PATCH 19/37] xfs: clean up buffer get/read call API Dave Chinner
2012-04-23 5:58 ` [PATCH 20/37] xfs: kill b_file_offset Dave Chinner
2012-04-23 5:58 ` [PATCH 21/37] xfs: use blocks for counting length of buffers Dave Chinner
2012-04-23 5:58 ` [PATCH 22/37] xfs: use blocks for storing the desired IO size Dave Chinner
2012-04-23 5:58 ` [PATCH 23/37] xfs: kill xfs_buf_btoc Dave Chinner
2012-04-23 5:58 ` [PATCH 24/37] xfs: kill XBF_LOCK Dave Chinner
2012-04-23 5:58 ` [PATCH 25/37] xfs: kill xfs_read_buf() Dave Chinner
2012-04-23 5:58 ` [PATCH 26/37] xfs: kill XBF_DONTBLOCK Dave Chinner
2012-04-23 5:58 ` [PATCH 27/37] xfs: use iolock on XFS_IOC_ALLOCSP calls Dave Chinner
2012-04-23 5:58 ` [PATCH 28/37] xfs: move xfsagino_t to xfs_types.h Dave Chinner
2012-04-23 15:43 ` Mark Tinguely
2012-04-24 15:10 ` Mark Tinguely
2012-04-29 21:49 ` Christoph Hellwig
2012-04-30 0:32 ` Dave Chinner
2012-04-23 5:58 ` [PATCH 29/37] xfs: move busy extent handling to it's own file Dave Chinner
2012-04-23 17:57 ` Ben Myers
2012-04-24 0:25 ` [PATCH 29/37 V2] " Dave Chinner
2012-04-24 15:56 ` Mark Tinguely
2012-04-24 18:10 ` Mark Tinguely
2012-04-29 10:39 ` [PATCH 29/37 V3] " Dave Chinner
2012-04-29 21:50 ` Christoph Hellwig
2012-04-30 0:36 ` Dave Chinner
2012-04-30 2:17 ` Dave Chinner
2012-04-23 5:59 ` [PATCH 30/37] xfs: clean up busy extent naming Dave Chinner
2012-04-24 18:11 ` Mark Tinguely
2012-04-29 10:41 ` [PATCH 30/37 V2] " Dave Chinner
2012-04-29 21:50 ` Christoph Hellwig
2012-04-23 5:59 ` [PATCH 31/37] xfs: move xfs_fsb_to_db to xfs_bmap.h Dave Chinner
2012-04-24 19:24 ` Mark Tinguely
2012-04-29 21:53 ` Christoph Hellwig
2012-04-30 2:31 ` Dave Chinner
2012-04-23 5:59 ` [PATCH 32/37] xfs: move xfs_get_extsz_hint() and kill xfs_rw.h Dave Chinner
2012-04-24 19:30 ` Mark Tinguely
2012-04-29 21:53 ` Christoph Hellwig
2012-04-23 5:59 ` [PATCH 33/37] xfs: move xfs_do_force_shutdown() and kill xfs_rw.c Dave Chinner
2012-04-24 19:37 ` Mark Tinguely
2012-04-29 21:54 ` Christoph Hellwig
2012-04-30 2:38 ` Dave Chinner
2012-04-23 5:59 ` [PATCH 34/37] xfs: clean up xfs_bit.h includes Dave Chinner
2012-04-24 19:44 ` Mark Tinguely
2012-04-29 21:55 ` Christoph Hellwig
2012-04-30 2:40 ` Dave Chinner
2012-04-23 5:59 ` [PATCH 35/37] xfs: Properly exclude IO type flags from buffer flags Dave Chinner
2012-04-24 20:02 ` Mark Tinguely
2012-04-29 21:55 ` Christoph Hellwig
2012-04-23 5:59 ` [PATCH 36/37] xfs: flush outstanding buffers on log mount failure Dave Chinner
2012-04-23 15:47 ` Mark Tinguely
2012-04-29 21:55 ` Christoph Hellwig
2012-04-23 5:59 ` [PATCH 37/37] xfs: make XBF_MAPPED the default behaviour Dave Chinner
2012-04-25 18:35 ` Mark Tinguely
2012-04-25 20:09 ` Mark Tinguely
2012-04-25 22:33 ` Dave Chinner
2012-04-29 21:57 ` Christoph Hellwig
2012-04-30 2:45 ` Dave Chinner
2012-04-23 18:01 ` [PATCH 00/37] xfs: current 3.4 patch queue Ben Myers
2012-04-23 23:29 ` Dave Chinner
2012-04-30 14:24 ` Ben Myers
2012-04-28 2:15 ` Ben Myers
2012-04-28 21:28 ` Ben Myers
2012-04-29 0:21 ` Dave Chinner
2012-04-29 0:14 ` Dave Chinner
2012-04-30 14:44 ` Ben Myers
2012-04-30 23:04 ` Dave Chinner
2012-04-30 14:32 ` Assertion failed: RB_EMPTY_NODE(&bp->b_rbnode) Ben Myers
2012-04-30 23:12 ` Dave Chinner
2012-04-30 14:34 ` [PATCH 00/37] xfs: current 3.4 patch queue Ben Myers
2012-04-30 23:20 ` Dave Chinner
2012-04-30 19:25 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1335160747-17254-5-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox