public inbox for linux-xfs@vger.kernel.org
From: Christoph Hellwig <hch@infradead.org>
To: xfs@oss.sgi.com
Cc: Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [PATCH] xfs: improve sync behaviour in face of aggressive dirtying
Date: Mon, 20 Jun 2011 04:18:02 -0400	[thread overview]
Message-ID: <20110620081802.GA27111@infradead.org> (raw)
In-Reply-To: <20110617131401.GC2141@infradead.org>

As confirmed by Wu, moving to a workqueue flush as in the test patch
below brings our performance on par with other filesystems.  But there
are two issues with that, one minor and one major.

The minor one is that we always flush all work items and not just those
for the filesystem being synced.  This might become an issue for larger
systems, or when we apply a similar scheme to fsync, which has the same
underlying issue.

The major one is that flush_workqueue only flushes work items that were
queued before it was called, but we can requeue completions when we fail
to get the ilock in xfs_setfilesize.  When that happens the requeued
completion escapes the flush, and we can lose i_size updates.

I see two ways to fix this:  either we implement our own workqueue
look-alike based on the old workqueue code.  This would allow flushing
queues per-sb or even per-inode, and would let us special-case flushing
requeues as well before returning.

Or we copy the scheme ext4 uses for fsync (it completely fails to flush
the completion queue for plain sync), that is, add a list of pending
items to the inode, and a lock to protect it.  I don't like this because
it a) bloats the inode, b) adds a lot of complexity, and c) adds yet
another lock to the irq I/O completion path.

Any other ideas?


Index: xfs/fs/xfs/linux-2.6/xfs_sync.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_sync.c	2011-06-17 14:16:18.442399481 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_sync.c	2011-06-18 17:55:44.864025123 +0200
@@ -359,14 +359,16 @@ xfs_quiesce_data(
 {
 	int			error, error2 = 0;
 
-	/* push non-blocking */
-	xfs_sync_data(mp, 0);
 	xfs_qm_sync(mp, SYNC_TRYLOCK);
-
-	/* push and block till complete */
-	xfs_sync_data(mp, SYNC_WAIT);
 	xfs_qm_sync(mp, SYNC_WAIT);
 
+	/* flush all pending size updates and unwritten extent conversions */
+	flush_workqueue(xfsconvertd_workqueue);
+	flush_workqueue(xfsdatad_workqueue);
+
+	/* force out the newly dirtied log buffers */
+	xfs_log_force(mp, XFS_LOG_SYNC);
+
 	/* write superblock and hoover up shutdown errors */
 	error = xfs_sync_fsdata(mp);
 
Index: xfs/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:51:05.660705925 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_super.c	2011-06-18 17:52:50.107367305 +0200
@@ -929,45 +929,12 @@ xfs_fs_write_inode(
 		 * ->sync_fs call do that for thus, which reduces the number
 		 * of synchronous log foces dramatically.
 		 */
-		xfs_ioend_wait(ip);
 		xfs_ilock(ip, XFS_ILOCK_SHARED);
-		if (ip->i_update_core) {
+		if (ip->i_update_core)
 			error = xfs_log_inode(ip);
-			if (error)
-				goto out_unlock;
-		}
-	} else {
-		/*
-		 * We make this non-blocking if the inode is contended, return
-		 * EAGAIN to indicate to the caller that they did not succeed.
-		 * This prevents the flush path from blocking on inodes inside
-		 * another operation right now, they get caught later by
-		 * xfs_sync.
-		 */
-		if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED))
-			goto out;
-
-		if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip))
-			goto out_unlock;
-
-		/*
-		 * Now we have the flush lock and the inode is not pinned, we
-		 * can check if the inode is really clean as we know that
-		 * there are no pending transaction completions, it is not
-		 * waiting on the delayed write queue and there is no IO in
-		 * progress.
-		 */
-		if (xfs_inode_clean(ip)) {
-			xfs_ifunlock(ip);
-			error = 0;
-			goto out_unlock;
-		}
-		error = xfs_iflush(ip, SYNC_TRYLOCK);
+		xfs_iunlock(ip, XFS_ILOCK_SHARED);
 	}
 
- out_unlock:
-	xfs_iunlock(ip, XFS_ILOCK_SHARED);
- out:
 	/*
 	 * if we failed to write out the inode then mark
 	 * it dirty again so we'll try again later.


Thread overview:
2011-06-17 13:14 [PATCH] xfs: improve sync behaviour in face of aggressive dirtying Christoph Hellwig
2011-06-20  8:18 ` Christoph Hellwig [this message]
2011-06-21  0:33   ` Dave Chinner
2011-06-21  7:21     ` Michael Monnerie
2011-06-22  0:19       ` Dave Chinner
2011-06-21  9:29     ` Christoph Hellwig
2011-06-22  1:09       ` Dave Chinner
2011-06-22  6:51         ` Christoph Hellwig
