public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] XFS update for 3.0-stable
@ 2011-11-19 18:13 Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 1/9] [PATCH 1/9] "xfs: fix error handling for synchronous writes" Christoph Hellwig
                   ` (11 more replies)
  0 siblings, 12 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: xfs

This is the series of XFS fixes from current mainline which is important
enough for 3.0-stable.  All but the first three would also be needed
for 3.1-stable, but given the limited resources I plan to concentrate
on 3.0-stable.  If anyone wants to take care for 3.1 by building a kernel
with the remaining patches and run QA on them you are more welcome to
help out with that tree.

Note that while the description of

        xfs: don't serialise direct IO reads on page cache checks

doesn't mention that is is a regression fix we later noticed that a
large part of the speedups wasn't in fact new, but fixed a performance
regression introduced in Linux 2.6.38 with commit:

	xfs: introduce xfs_rw_lock() helpers for locking the inode

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/9] [PATCH 1/9] "xfs: fix error handling for synchronous writes"
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 2/9] [PATCH 2/9] xfs: fix xfs_mark_inode_dirty during umount Christoph Hellwig
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, Ajeet Yadav, xfs

[-- Attachment #1: 0001-xfs-fix-error-handling-for-synchronous-writes-revisi.patch revisited --]
[-- Type: text/plain, Size: 1296 bytes --]

xfs: fix for hang during synchronous buffer write error

If removed storage while synchronous buffer write underway,
"xfslogd" hangs.

Detailed log http://oss.sgi.com/archives/xfs/2011-07/msg00740.html

Related work bfc60177f8ab509bc225becbb58f7e53a0e33e81
"xfs: fix error handling for synchronous writes"

Given that xfs_bwrite actually does the shutdown already after
waiting for the b_iodone completion and given that we actually
found that calling xfs_force_shutdown from inside
xfs_buf_iodone_callbacks was a major contributor the problem
it better to drop this call.

Signed-off-by: Ajeet Yadav <ajeet.yadav.77@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/xfs_buf_item.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index a7342e8..7888a75 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1023,7 +1023,6 @@ xfs_buf_iodone_callbacks(
 	XFS_BUF_UNDELAYWRITE(bp);
 
 	trace_xfs_buf_error_relse(bp, _RET_IP_);
-	xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
 
 do_callbacks:
 	xfs_buf_do_callbacks(bp);
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/9] [PATCH 2/9] xfs: fix xfs_mark_inode_dirty during umount
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 1/9] [PATCH 1/9] "xfs: fix error handling for synchronous writes" Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 3/9] [PATCH 3/9] xfs: fix ->write_inode return values Christoph Hellwig
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, xfs

[-- Attachment #1: 0002-xfs-fix-xfs_mark_inode_dirty-during-umount.patch --]
[-- Type: text/plain, Size: 2247 bytes --]

During umount we do not add a dirty inode to the lru and wait for it to
become clean first, but force writeback of data and metadata with
I_WILL_FREE set.  Currently there is no way for XFS to detect that the
inode has been redirtied for metadata operations, as we skip the
mark_inode_dirty call during teardown.  Fix this by setting i_update_core
nanually in that case, so that the inode gets flushed during inode reclaim.

Alternatively we could enable calling mark_inode_dirty for inodes in
I_WILL_FREE state, and let the VFS dirty tracking handle this.  I decided
against this as we will get better I/O patterns from reclaim compared to
the synchronous writeout in write_inode_now, and always marking the inode
dirty in some way from xfs_mark_inode_dirty is a better safetly net in
either case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856)

Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_iops.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index d44d92c..a9b3e1e 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -69,9 +69,8 @@ xfs_synchronize_times(
 }
 
 /*
- * If the linux inode is valid, mark it dirty.
- * Used when committing a dirty inode into a transaction so that
- * the inode will get written back by the linux code
+ * If the linux inode is valid, mark it dirty, else mark the dirty state
+ * in the XFS inode to make sure we pick it up when reclaiming the inode.
  */
 void
 xfs_mark_inode_dirty_sync(
@@ -81,6 +80,10 @@ xfs_mark_inode_dirty_sync(
 
 	if (!(inode->i_state & (I_WILL_FREE|I_FREEING)))
 		mark_inode_dirty_sync(inode);
+	else {
+		barrier();
+		ip->i_update_core = 1;
+	}
 }
 
 void
@@ -91,6 +94,11 @@ xfs_mark_inode_dirty(
 
 	if (!(inode->i_state & (I_WILL_FREE|I_FREEING)))
 		mark_inode_dirty(inode);
+	else {
+		barrier();
+		ip->i_update_core = 1;
+	}
+
 }
 
 /*
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/9] [PATCH 3/9] xfs: fix ->write_inode return values
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 1/9] [PATCH 1/9] "xfs: fix error handling for synchronous writes" Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 2/9] [PATCH 2/9] xfs: fix xfs_mark_inode_dirty during umount Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache Christoph Hellwig
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, xfs

[-- Attachment #1: 0003-xfs-fix-write_inode-return-values.patch --]
[-- Type: text/plain, Size: 2902 bytes --]

Currently we always redirty an inode that was attempted to be written out
synchronously but has been cleaned by an AIL pushed internall, which is
rather bogus.  Fix that by doing the i_update_core check early on and
return 0 for it.  Also include async calls for it, as doing any work for
those is just as pointless.  While we're at it also fix the sign for the
EIO return in case of a filesystem shutdown, and fix the completely
non-sensical locking around xfs_log_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97)

Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_super.c |   34 +++++++++-------------------------
 1 files changed, 9 insertions(+), 25 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 347cae9..28de70b 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -878,33 +878,17 @@ xfs_log_inode(
 	struct xfs_trans	*tp;
 	int			error;
 
-	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 	tp = xfs_trans_alloc(mp, XFS_TRANS_FSYNC_TS);
 	error = xfs_trans_reserve(tp, 0, XFS_FSYNC_TS_LOG_RES(mp), 0, 0, 0);
-
 	if (error) {
 		xfs_trans_cancel(tp, 0);
-		/* we need to return with the lock hold shared */
-		xfs_ilock(ip, XFS_ILOCK_SHARED);
 		return error;
 	}
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
-
-	/*
-	 * Note - it's possible that we might have pushed ourselves out of the
-	 * way during trans_reserve which would flush the inode.  But there's
-	 * no guarantee that the inode buffer has actually gone out yet (it's
-	 * delwri).  Plus the buffer could be pinned anyway if it's part of
-	 * an inode in another recent transaction.  So we play it safe and
-	 * fire off the transaction anyway.
-	 */
-	xfs_trans_ijoin(tp, ip);
+	xfs_trans_ijoin_ref(tp, ip, XFS_ILOCK_EXCL);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-	error = xfs_trans_commit(tp, 0);
-	xfs_ilock_demote(ip, XFS_ILOCK_EXCL);
-
-	return error;
+	return xfs_trans_commit(tp, 0);
 }
 
 STATIC int
@@ -919,7 +903,9 @@ xfs_fs_write_inode(
 	trace_xfs_write_inode(ip);
 
 	if (XFS_FORCED_SHUTDOWN(mp))
-		return XFS_ERROR(EIO);
+		return -XFS_ERROR(EIO);
+	if (!ip->i_update_core)
+		return 0;
 
 	if (wbc->sync_mode == WB_SYNC_ALL) {
 		/*
@@ -930,12 +916,10 @@ xfs_fs_write_inode(
 		 * of synchronous log foces dramatically.
 		 */
 		xfs_ioend_wait(ip);
-		xfs_ilock(ip, XFS_ILOCK_SHARED);
-		if (ip->i_update_core) {
-			error = xfs_log_inode(ip);
-			if (error)
-				goto out_unlock;
-		}
+		error = xfs_log_inode(ip);
+		if (error)
+			goto out;
+		return 0;
 	} else {
 		/*
 		 * We make this non-blocking if the inode is contended, return
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (2 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 3/9] [PATCH 3/9] xfs: fix ->write_inode return values Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-22 21:34   ` Greg KH
  2011-11-19 18:13 ` [PATCH 5/9] [PATCH 5/9] xfs: avoid direct I/O write vs buffered I/O race Christoph Hellwig
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, Dave Chinner, xfs

[-- Attachment #1: 0004-xfs-don-t-serialise-direct-IO-reads-on-page-cache-ch.patch checks --]
[-- Type: text/plain, Size: 2307 bytes --]

There is no need to grab the i_mutex of the IO lock in exclusive
mode if we don't need to invalidate the page cache. Taking these
locks on every direct IO effective serialises them as taking the IO
lock in exclusive mode has to wait for all shared holders to drop
the lock. That only happens when IO is complete, so effective it
prevents dispatch of concurrent direct IO reads to the same inode.

Fix this by taking the IO lock shared to check the page cache state,
and only then drop it and take the IO lock exclusively if there is
work to be done. Hence for the normal direct IO case, no exclusive
locking will occur.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Joern Engel <joern@logfs.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_file.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
index 7f782af2..93cc02d 100644
--- a/fs/xfs/linux-2.6/xfs_file.c
+++ b/fs/xfs/linux-2.6/xfs_file.c
@@ -309,7 +309,19 @@ xfs_file_aio_read(
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
-	if (unlikely(ioflags & IO_ISDIRECT)) {
+	/*
+	 * Locking is a bit tricky here. If we take an exclusive lock
+	 * for direct IO, we effectively serialise all new concurrent
+	 * read IO to this file and block it behind IO that is currently in
+	 * progress because IO in progress holds the IO lock shared. We only
+	 * need to hold the lock exclusive to blow away the page cache, so
+	 * only take lock exclusively if the page cache needs invalidation.
+	 * This allows the normal direct IO case of no page cache pages to
+	 * proceeed concurrently without serialisation.
+	 */
+	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
+	if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
+		xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
 		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
 
 		if (inode->i_mapping->nrpages) {
@@ -322,8 +334,7 @@ xfs_file_aio_read(
 			}
 		}
 		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
-	} else
-		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
+	}
 
 	trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
 
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/9] [PATCH 5/9] xfs: avoid direct I/O write vs buffered I/O race
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (3 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 6/9] [PATCH 6/9] xfs: Return -EIO when xfs_vn_getattr() failed Christoph Hellwig
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, xfs

[-- Attachment #1: 0005-xfs-avoid-direct-I-O-write-vs-buffered-I-O-race.patch --]
[-- Type: text/plain, Size: 2213 bytes --]

Currently a buffered reader or writer can add pages to the pagecache
while we are waiting for the iolock in xfs_file_dio_aio_write.  Prevent
this by re-checking mapping->nrpages after we got the iolock, and if
nessecary upgrade the lock to exclusive mode.  To simplify this a bit
only take the ilock inside of xfs_file_aio_write_checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_file.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
index 93cc02d..b679198 100644
--- a/fs/xfs/linux-2.6/xfs_file.c
+++ b/fs/xfs/linux-2.6/xfs_file.c
@@ -669,6 +669,7 @@ xfs_file_aio_write_checks(
 	xfs_fsize_t		new_size;
 	int			error = 0;
 
+	xfs_rw_ilock(ip, XFS_ILOCK_EXCL);
 	error = generic_write_checks(file, pos, count, S_ISBLK(inode->i_mode));
 	if (error) {
 		xfs_rw_iunlock(ip, XFS_ILOCK_EXCL | *iolock);
@@ -760,14 +761,24 @@ xfs_file_dio_aio_write(
 		*iolock = XFS_IOLOCK_EXCL;
 	else
 		*iolock = XFS_IOLOCK_SHARED;
-	xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock);
+	xfs_rw_ilock(ip, *iolock);
 
 	ret = xfs_file_aio_write_checks(file, &pos, &count, iolock);
 	if (ret)
 		return ret;
 
+	/*
+	 * Recheck if there are cached pages that need invalidate after we got
+	 * the iolock to protect against other threads adding new pages while
+	 * we were waiting for the iolock.
+	 */
+	if (mapping->nrpages && *iolock == XFS_IOLOCK_SHARED) {
+		xfs_rw_iunlock(ip, *iolock);
+		*iolock = XFS_IOLOCK_EXCL;
+		xfs_rw_ilock(ip, *iolock);
+	}
+
 	if (mapping->nrpages) {
-		WARN_ON(*iolock != XFS_IOLOCK_EXCL);
 		ret = -xfs_flushinval_pages(ip, (pos & PAGE_CACHE_MASK), -1,
 							FI_REMAPF_LOCKED);
 		if (ret)
@@ -812,7 +823,7 @@ xfs_file_buffered_aio_write(
 	size_t			count = ocount;
 
 	*iolock = XFS_IOLOCK_EXCL;
-	xfs_rw_ilock(ip, XFS_ILOCK_EXCL | *iolock);
+	xfs_rw_ilock(ip, *iolock);
 
 	ret = xfs_file_aio_write_checks(file, &pos, &count, iolock);
 	if (ret)
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/9] [PATCH 6/9] xfs: Return -EIO when xfs_vn_getattr() failed
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (4 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 5/9] [PATCH 5/9] xfs: avoid direct I/O write vs buffered I/O race Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 7/9] [PATCH 7/9] xfs: fix buffer flushing during unmount Christoph Hellwig
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, Mitsuo Hayasaka, xfs

[-- Attachment #1: 0006-xfs-Return-EIO-when-xfs_vn_getattr-failed.patch --]
[-- Type: text/plain, Size: 1161 bytes --]

An attribute of inode can be fetched via xfs_vn_getattr() in XFS.
Currently it returns EIO, not negative value, when it failed.  As a
result, the system call returns not negative value even though an
error occured. The stat(2), ls and mv commands cannot handle this
error and do not work correctly.

This patch fixes this bug, and returns -EIO, not EIO when an error
is detected in xfs_vn_getattr().

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_iops.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index a9b3e1e..f5b697b 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -464,7 +464,7 @@ xfs_vn_getattr(
 	trace_xfs_getattr(ip);
 
 	if (XFS_FORCED_SHUTDOWN(mp))
-		return XFS_ERROR(EIO);
+		return -XFS_ERROR(EIO);
 
 	stat->size = XFS_ISIZE(ip);
 	stat->dev = inode->i_sb->s_dev;
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/9] [PATCH 7/9] xfs: fix buffer flushing during unmount
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (5 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 6/9] [PATCH 6/9] xfs: Return -EIO when xfs_vn_getattr() failed Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 8/9] [PATCH 8/9] xfs: Fix possible memory corruption in xfs_readlink Christoph Hellwig
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, xfs

[-- Attachment #1: 0007-xfs-fix-buffer-flushing-during-unmount.patch --]
[-- Type: text/plain, Size: 3212 bytes --]

The code to flush buffers in the umount code is a bit iffy: we first
flush all delwri buffers out, but then might be able to queue up a
new one when logging the sb counts.  On a normal shutdown that one
would get flushed out when doing the synchronous superblock write in
xfs_unmountfs_writesb, but we skip that one if the filesystem has
been shut down.

Fix this by moving the delwri list flushing until just before unmounting
the log, and while we're at it also remove the superflous delwri list
and buffer lru flusing for the rt and log device that can never have
cached or delwri buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Tested-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
---
 fs/xfs/linux-2.6/xfs_buf.h |    1 -
 fs/xfs/xfs_mount.c         |   29 ++++++++++-------------------
 2 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_buf.h b/fs/xfs/linux-2.6/xfs_buf.h
index 50a7d5f..36d6ee4 100644
--- a/fs/xfs/linux-2.6/xfs_buf.h
+++ b/fs/xfs/linux-2.6/xfs_buf.h
@@ -346,7 +346,6 @@ extern struct list_head *xfs_get_buftarg_list(void);
 #define xfs_getsize_buftarg(buftarg)	block_size((buftarg)->bt_bdev)
 #define xfs_readonly_buftarg(buftarg)	bdev_read_only((buftarg)->bt_bdev)
 
-#define xfs_binval(buftarg)		xfs_flush_buftarg(buftarg, 1)
 #define XFS_bflush(buftarg)		xfs_flush_buftarg(buftarg, 1)
 
 #endif	/* __XFS_BUF_H__ */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index b49b823..9afdd49 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -44,9 +44,6 @@
 #include "xfs_trace.h"
 
 
-STATIC void	xfs_unmountfs_wait(xfs_mount_t *);
-
-
 #ifdef HAVE_PERCPU_SB
 STATIC void	xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
 						int);
@@ -1507,11 +1504,6 @@ xfs_unmountfs(
 	 */
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
-	xfs_binval(mp->m_ddev_targp);
-	if (mp->m_rtdev_targp) {
-		xfs_binval(mp->m_rtdev_targp);
-	}
-
 	/*
 	 * Unreserve any blocks we have so that when we unmount we don't account
 	 * the reserved free space as used. This is really only necessary for
@@ -1537,7 +1529,16 @@ xfs_unmountfs(
 		xfs_warn(mp, "Unable to update superblock counters. "
 				"Freespace may not be correct on next mount.");
 	xfs_unmountfs_writesb(mp);
-	xfs_unmountfs_wait(mp); 		/* wait for async bufs */
+
+	/*
+	 * Make sure all buffers have been flushed and completed before
+	 * unmounting the log.
+	 */
+	error = xfs_flush_buftarg(mp->m_ddev_targp, 1);
+	if (error)
+		xfs_warn(mp, "%d busy buffers during unmount.", error);
+	xfs_wait_buftarg(mp->m_ddev_targp);
+
 	xfs_log_unmount_write(mp);
 	xfs_log_unmount(mp);
 	xfs_uuid_unmount(mp);
@@ -1548,16 +1549,6 @@ xfs_unmountfs(
 	xfs_free_perag(mp);
 }
 
-STATIC void
-xfs_unmountfs_wait(xfs_mount_t *mp)
-{
-	if (mp->m_logdev_targp != mp->m_ddev_targp)
-		xfs_wait_buftarg(mp->m_logdev_targp);
-	if (mp->m_rtdev_targp)
-		xfs_wait_buftarg(mp->m_rtdev_targp);
-	xfs_wait_buftarg(mp->m_ddev_targp);
-}
-
 int
 xfs_fs_writable(xfs_mount_t *mp)
 {
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/9] [PATCH 8/9] xfs: Fix possible memory corruption in xfs_readlink
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (6 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 7/9] [PATCH 7/9] xfs: fix buffer flushing during unmount Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:13 ` [PATCH 9/9] [PATCH 9/9] xfs: use doalloc flag in xfs_qm_dqattach_one() Christoph Hellwig
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Alex Elder, Carlos Maiolino, xfs

[-- Attachment #1: 0008-xfs-Fix-possible-memory-corruption-in-xfs_readlink.patch --]
[-- Type: text/plain, Size: 2091 bytes --]

Fixes a possible memory corruption when the link is larger than
MAXPATHLEN and XFS_DEBUG is not enabled. This also remove the
S_ISLNK assert, since the inode mode is checked previously in
xfs_readlink_by_handle() and via VFS.

Updated to address concerns raised by Ben Hutchings about the loose
attention paid to 32- vs 64-bit values, and the lack of handling a
potentially negative pathlen value:
 - Changed type of "pathlen" to be xfs_fsize_t, to match that of
   ip->i_d.di_size
 - Added checking for a negative pathlen to the too-long pathlen
   test, and generalized the message that gets reported in that case
   to reflect the change
As a result, if a negative pathlen were encountered, this function
would return EFSCORRUPTED (and would fail an assertion for a debug
build)--just as would a too-long pathlen.

Signed-off-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_vnodeops.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 6197207..6cc4d41 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -535,7 +535,7 @@ xfs_readlink(
 	char		*link)
 {
 	xfs_mount_t	*mp = ip->i_mount;
-	int		pathlen;
+	xfs_fsize_t	pathlen;
 	int		error = 0;
 
 	trace_xfs_readlink(ip);
@@ -545,13 +545,19 @@ xfs_readlink(
 
 	xfs_ilock(ip, XFS_ILOCK_SHARED);
 
-	ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFLNK);
-	ASSERT(ip->i_d.di_size <= MAXPATHLEN);
-
 	pathlen = ip->i_d.di_size;
 	if (!pathlen)
 		goto out;
 
+	if (pathlen < 0 || pathlen > MAXPATHLEN) {
+		xfs_alert(mp, "%s: inode (%llu) bad symlink length (%lld)",
+			 __func__, (unsigned long long) ip->i_ino,
+			 (long long) pathlen);
+		ASSERT(0);
+		return XFS_ERROR(EFSCORRUPTED);
+	}
+
+
 	if (ip->i_df.if_flags & XFS_IFINLINE) {
 		memcpy(link, ip->i_df.if_u1.if_data, pathlen);
 		link[pathlen] = '\0';
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 9/9] [PATCH 9/9] xfs: use doalloc flag in xfs_qm_dqattach_one()
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (7 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 8/9] [PATCH 8/9] xfs: Fix possible memory corruption in xfs_readlink Christoph Hellwig
@ 2011-11-19 18:13 ` Christoph Hellwig
  2011-11-19 18:59 ` [PATCH 0/9] XFS update for 3.0-stable Greg KH
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2011-11-19 18:13 UTC (permalink / raw)
  To: stable; +Cc: Ben Myers, Alex Elder, Mitsuo Hayasaka, xfs

[-- Attachment #1: 0009-xfs-use-doalloc-flag-in-xfs_qm_dqattach_one.patch --]
[-- Type: text/plain, Size: 1572 bytes --]

The doalloc arg in xfs_qm_dqattach_one() is a flag that indicates
whether a new area to handle quota information will be allocated
if needed. Originally, it was passed to xfs_qm_dqget(), but has
been removed by the following commit (probably by mistake):

	commit 8e9b6e7fa4544ea8a0e030c8987b918509c8ff47
	Author: Christoph Hellwig <hch@lst.de>
	Date:   Sun Feb 8 21:51:42 2009 +0100

	xfs: remove the unused XFS_QMOPT_DQLOCK flag

As the result, xfs_qm_dqget() called from xfs_qm_dqattach_one()
never allocates the new area even if it is needed.

This patch gives the doalloc arg to xfs_qm_dqget() in
xfs_qm_dqattach_one() to fix this problem.

Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
---
 fs/xfs/quota/xfs_qm.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/quota/xfs_qm.c b/fs/xfs/quota/xfs_qm.c
index b94dace..e70c7fc 100644
--- a/fs/xfs/quota/xfs_qm.c
+++ b/fs/xfs/quota/xfs_qm.c
@@ -714,7 +714,8 @@ xfs_qm_dqattach_one(
 	 * disk and we didn't ask it to allocate;
 	 * ESRCH if quotas got turned off suddenly.
 	 */
-	error = xfs_qm_dqget(ip->i_mount, ip, id, type, XFS_QMOPT_DOWARN, &dqp);
+	error = xfs_qm_dqget(ip->i_mount, ip, id, type,
+			     doalloc | XFS_QMOPT_DOWARN, &dqp);
 	if (error)
 		return error;
 
-- 
1.7.7


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/9] XFS update for 3.0-stable
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (8 preceding siblings ...)
  2011-11-19 18:13 ` [PATCH 9/9] [PATCH 9/9] xfs: use doalloc flag in xfs_qm_dqattach_one() Christoph Hellwig
@ 2011-11-19 18:59 ` Greg KH
  2011-11-21  0:46 ` Dave Chinner
  2011-11-21 16:05 ` Ben Myers
  11 siblings, 0 replies; 15+ messages in thread
From: Greg KH @ 2011-11-19 18:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: stable, xfs

On Sat, Nov 19, 2011 at 01:13:36PM -0500, Christoph Hellwig wrote:
> This is the series of XFS fixes from current mainline which is important
> enough for 3.0-stable.  All but the first three would also be needed
> for 3.1-stable, but given the limited resources I plan to concentrate
> on 3.0-stable.  If anyone wants to take care for 3.1 by building a kernel
> with the remaining patches and run QA on them you are more welcome to
> help out with that tree.
> 
> Note that while the description of
> 
>         xfs: don't serialise direct IO reads on page cache checks
> 
> doesn't mention that is is a regression fix we later noticed that a
> large part of the speedups wasn't in fact new, but fixed a performance
> regression introduced in Linux 2.6.38 with commit:
> 
> 	xfs: introduce xfs_rw_lock() helpers for locking the inode

Thanks for the series of patches, I'll queue them up after this latest
3.0 release happens on Monday.

greg k-h

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/9] XFS update for 3.0-stable
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (9 preceding siblings ...)
  2011-11-19 18:59 ` [PATCH 0/9] XFS update for 3.0-stable Greg KH
@ 2011-11-21  0:46 ` Dave Chinner
  2011-11-21 16:05 ` Ben Myers
  11 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2011-11-21  0:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: stable, xfs

On Sat, Nov 19, 2011 at 01:13:36PM -0500, Christoph Hellwig wrote:
> This is the series of XFS fixes from current mainline which is important
> enough for 3.0-stable.  All but the first three would also be needed
> for 3.1-stable, but given the limited resources I plan to concentrate
> on 3.0-stable.  If anyone wants to take care for 3.1 by building a kernel
> with the remaining patches and run QA on them you are more welcome to
> help out with that tree.
> 
> Note that while the description of
> 
>         xfs: don't serialise direct IO reads on page cache checks
> 
> doesn't mention that is is a regression fix we later noticed that a
> large part of the speedups wasn't in fact new, but fixed a performance
> regression introduced in Linux 2.6.38 with commit:
> 
> 	xfs: introduce xfs_rw_lock() helpers for locking the inode

The series of patches for 3.0 looks good. It passes my tests here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/9] XFS update for 3.0-stable
  2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
                   ` (10 preceding siblings ...)
  2011-11-21  0:46 ` Dave Chinner
@ 2011-11-21 16:05 ` Ben Myers
  11 siblings, 0 replies; 15+ messages in thread
From: Ben Myers @ 2011-11-21 16:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: stable, xfs

Hey Christoph,

On Sat, Nov 19, 2011 at 01:13:36PM -0500, Christoph Hellwig wrote:
> This is the series of XFS fixes from current mainline which is important
> enough for 3.0-stable.  All but the first three would also be needed
> for 3.1-stable, but given the limited resources I plan to concentrate
> on 3.0-stable.  If anyone wants to take care for 3.1 by building a kernel
> with the remaining patches and run QA on them you are more welcome to
> help out with that tree.

I'll work on 3.1...

Thanks,
-Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache
  2011-11-19 18:13 ` [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache Christoph Hellwig
@ 2011-11-22 21:34   ` Greg KH
  2011-11-22 21:35     ` Greg KH
  0 siblings, 1 reply; 15+ messages in thread
From: Greg KH @ 2011-11-22 21:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Alex Elder, Dave Chinner, stable, xfs

On Sat, Nov 19, 2011 at 01:13:40PM -0500, Christoph Hellwig wrote:
> There is no need to grab the i_mutex of the IO lock in exclusive
> mode if we don't need to invalidate the page cache. Taking these
> locks on every direct IO effective serialises them as taking the IO
> lock in exclusive mode has to wait for all shared holders to drop
> the lock. That only happens when IO is complete, so effective it
> prevents dispatch of concurrent direct IO reads to the same inode.
> 
> Fix this by taking the IO lock shared to check the page cache state,
> and only then drop it and take the IO lock exclusively if there is
> work to be done. Hence for the normal direct IO case, no exclusive
> locking will occur.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Tested-by: Joern Engel <joern@logfs.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Alex Elder <aelder@sgi.com>

What is the git commit id that matches this patch in Linus's tree?

thanks,

greg k-h

> ---
>  fs/xfs/linux-2.6/xfs_file.c |   17 ++++++++++++++---
>  1 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
> index 7f782af2..93cc02d 100644
> --- a/fs/xfs/linux-2.6/xfs_file.c
> +++ b/fs/xfs/linux-2.6/xfs_file.c
> @@ -309,7 +309,19 @@ xfs_file_aio_read(
>  	if (XFS_FORCED_SHUTDOWN(mp))
>  		return -EIO;
>  
> -	if (unlikely(ioflags & IO_ISDIRECT)) {
> +	/*
> +	 * Locking is a bit tricky here. If we take an exclusive lock
> +	 * for direct IO, we effectively serialise all new concurrent
> +	 * read IO to this file and block it behind IO that is currently in
> +	 * progress because IO in progress holds the IO lock shared. We only
> +	 * need to hold the lock exclusive to blow away the page cache, so
> +	 * only take lock exclusively if the page cache needs invalidation.
> +	 * This allows the normal direct IO case of no page cache pages to
> +	 * proceeed concurrently without serialisation.
> +	 */
> +	xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	if ((ioflags & IO_ISDIRECT) && inode->i_mapping->nrpages) {
> +		xfs_rw_iunlock(ip, XFS_IOLOCK_SHARED);
>  		xfs_rw_ilock(ip, XFS_IOLOCK_EXCL);
>  
>  		if (inode->i_mapping->nrpages) {
> @@ -322,8 +334,7 @@ xfs_file_aio_read(
>  			}
>  		}
>  		xfs_rw_ilock_demote(ip, XFS_IOLOCK_EXCL);
> -	} else
> -		xfs_rw_ilock(ip, XFS_IOLOCK_SHARED);
> +	}
>  
>  	trace_xfs_file_read(ip, size, iocb->ki_pos, ioflags);
>  
> -- 
> 1.7.7
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache
  2011-11-22 21:34   ` Greg KH
@ 2011-11-22 21:35     ` Greg KH
  0 siblings, 0 replies; 15+ messages in thread
From: Greg KH @ 2011-11-22 21:35 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Alex Elder, Dave Chinner, stable, xfs

On Tue, Nov 22, 2011 at 01:34:25PM -0800, Greg KH wrote:
> On Sat, Nov 19, 2011 at 01:13:40PM -0500, Christoph Hellwig wrote:
> > There is no need to grab the i_mutex of the IO lock in exclusive
> > mode if we don't need to invalidate the page cache. Taking these
> > locks on every direct IO effective serialises them as taking the IO
> > lock in exclusive mode has to wait for all shared holders to drop
> > the lock. That only happens when IO is complete, so effective it
> > prevents dispatch of concurrent direct IO reads to the same inode.
> > 
> > Fix this by taking the IO lock shared to check the page cache state,
> > and only then drop it and take the IO lock exclusively if there is
> > work to be done. Hence for the normal direct IO case, no exclusive
> > locking will occur.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > Tested-by: Joern Engel <joern@logfs.org>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Alex Elder <aelder@sgi.com>
> 
> What is the git commit id that matches this patch in Linus's tree?

Nevermind, I found it, 0c38a2512df272b14ef4238b476a2e4f70da1479, right?

Next time, please include the git commit id of the patch in Linus's tree
so I don't have to dig it out like I did for this series.

thanks,

greg k-h

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-11-22 22:35 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-19 18:13 [PATCH 0/9] XFS update for 3.0-stable Christoph Hellwig
2011-11-19 18:13 ` [PATCH 1/9] [PATCH 1/9] "xfs: fix error handling for synchronous writes" Christoph Hellwig
2011-11-19 18:13 ` [PATCH 2/9] [PATCH 2/9] xfs: fix xfs_mark_inode_dirty during umount Christoph Hellwig
2011-11-19 18:13 ` [PATCH 3/9] [PATCH 3/9] xfs: fix ->write_inode return values Christoph Hellwig
2011-11-19 18:13 ` [PATCH 4/9] [PATCH 4/9] xfs: dont serialise direct IO reads on page cache Christoph Hellwig
2011-11-22 21:34   ` Greg KH
2011-11-22 21:35     ` Greg KH
2011-11-19 18:13 ` [PATCH 5/9] [PATCH 5/9] xfs: avoid direct I/O write vs buffered I/O race Christoph Hellwig
2011-11-19 18:13 ` [PATCH 6/9] [PATCH 6/9] xfs: Return -EIO when xfs_vn_getattr() failed Christoph Hellwig
2011-11-19 18:13 ` [PATCH 7/9] [PATCH 7/9] xfs: fix buffer flushing during unmount Christoph Hellwig
2011-11-19 18:13 ` [PATCH 8/9] [PATCH 8/9] xfs: Fix possible memory corruption in xfs_readlink Christoph Hellwig
2011-11-19 18:13 ` [PATCH 9/9] [PATCH 9/9] xfs: use doalloc flag in xfs_qm_dqattach_one() Christoph Hellwig
2011-11-19 18:59 ` [PATCH 0/9] XFS update for 3.0-stable Greg KH
2011-11-21  0:46 ` Dave Chinner
2011-11-21 16:05 ` Ben Myers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox