public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Utako Kusaka <u-kusaka@wm.jp.nec.com>
Cc: Timothy Shimmin <tes@sgi.com>, xfs <xfs@oss.sgi.com>
Subject: [PATCH, RFC] Re: atime not written to disk
Date: Wed, 22 Oct 2008 19:17:10 +1100	[thread overview]
Message-ID: <20081022081710.GL18495@disturbed> (raw)
In-Reply-To: <48FD7B69.3090600@wm.jp.nec.com>

On Tue, Oct 21, 2008 at 03:49:13PM +0900, Utako Kusaka wrote:
> Hi,
>
> Your problem seems the same as mine.
> http://oss.sgi.com/archives/xfs/2007-10/msg00168.html
>
> Utako.
>
> Timothy Shimmin wrote:
>> Hi,
>>
>> Before I investigate further ;-),
>> it appears that in XFS (seen in recent xfs-dev tree and on older issp release
>> on default mkfs/mount options),
>> that the atime is not being written out to disk in xfs,
>> at least, in the simple scenario below.
>>
>> emu:/home/tes # echo bill >/mnt/test/bill
>> emu:/home/tes # ls -l /mnt/test/bill
>> -rw-r--r-- 1 root root 5 2008-10-21 16:03 /mnt/test/bill
>> emu:/home/tes # ls -lu /mnt/test/bill
>> -rw-r--r-- 1 root root 5 2008-10-21 16:03 /mnt/test/bill
>>
>> ... wait a bit to change the atime...
>>
>> emu:/home/tes # cat /mnt/test/bill
>> bill
>> emu:/home/tes # ls -lu /mnt/test/bill
>> -rw-r--r-- 1 root root 5 2008-10-21 16:11 /mnt/test/bill
>>
>> emu:/home/tes # cd /
>> emu:/ # umount /mnt/test
>> emu:/ # mount /mnt/test
>> emu:/mnt/test # ls -lu /mnt/test/bill
>> -rw-r--r-- 1 root root 5 2008-10-21 16:03 /mnt/test/bill

As I mentioned on IRC, Tim, the following patch fixes the above test
case. It will make XFS behave like other filesystems w.r.t. atime,
instead of defaulting to relatime-like behaviour. This will have
performance impact unless ppl now add the relatime mount option.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


XFS: Implement ->dirty_inode callout

Hook up ->dirty_inode so that when the VFS dirties an inode we
can mark the XFS inode as "dirty with unlogged changes". This
allows events such as touch_atime() to propagate the dirty state
right through to XFS so it gets written back to disk.

Signed-off-by: Dave Chinner <david@fromorbit.com>
---
 fs/xfs/linux-2.6/xfs_aops.c  |    1 -
 fs/xfs/linux-2.6/xfs_iops.c  |   14 +++-----------
 fs/xfs/linux-2.6/xfs_super.c |   23 +++++++++++++++++++++++
 fs/xfs/xfs_inode_item.c      |   14 +++++++++-----
 4 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 8fbc97d..6f4ebd0 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -189,7 +189,6 @@ xfs_setfilesize(
 
 	if (ip->i_d.di_size < isize) {
 		ip->i_d.di_size = isize;
-		ip->i_update_core = 1;
 		ip->i_update_size = 1;
 		xfs_mark_inode_dirty_sync(ip);
 	}
diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 37bb101..b7deff9 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -117,19 +117,11 @@ xfs_ichgtime(
 	}
 
 	/*
-	 * We update the i_update_core field _after_ changing
-	 * the timestamps in order to coordinate properly with
-	 * xfs_iflush() so that we don't lose timestamp updates.
-	 * This keeps us from having to hold the inode lock
-	 * while doing this.  We use the SYNCHRONIZE macro to
-	 * ensure that the compiler does not reorder the update
-	 * of i_update_core above the timestamp updates above.
+	 * Update complete - now make sure everyone knows that the inode
+	 * is dirty.
 	 */
-	if (sync_it) {
-		SYNCHRONIZE();
-		ip->i_update_core = 1;
+	if (sync_it)
 		xfs_mark_inode_dirty_sync(ip);
-	}
 }
 
 /*
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 3ae8051..a5570b5 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -928,6 +928,28 @@ xfs_fs_inode_init_once(
 }
 
 /*
+ * Dirty the XFS inode when mark_inode_dirty_sync() is called so that
+ * we catch unlogged VFS level updates to the inode. Care must be taken
+ * here - the transaction code calls mark_inode_dirty_sync() to mark the
+ * VFS inode dirty in a transaction and clears the i_update_core field;
+ * it must clear the field after calling mark_inode_dirty_sync() to
+ * correctly indicate that the dirty state has been propagated into the
+ * inode log item.
+ *
+ * We need the barrier() to maintain correct ordering between unlogged
+ * updates and the transaction commit code that clears the i_update_core
+ * field. This requires all updates to be completed before marking the
+ * inode dirty.
+ */
+STATIC void
+xfs_fs_dirty_inode(
+	struct inode    *inode)
+{
+	barrier();
+	XFS_I(inode)->i_update_core = 1;
+}
+
+/*
  * Attempt to flush the inode, this will actually fail
  * if the inode is pinned, but we dirty the inode again
  * at the point when it is unpinned after a log write,
@@ -1712,6 +1734,7 @@ xfs_fs_get_sb(
 static struct super_operations xfs_super_operations = {
 	.alloc_inode		= xfs_fs_alloc_inode,
 	.destroy_inode		= xfs_fs_destroy_inode,
+	.dirty_inode		= xfs_fs_dirty_inode,
 	.write_inode		= xfs_fs_write_inode,
 	.clear_inode		= xfs_fs_clear_inode,
 	.put_super		= xfs_fs_put_super,
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index aa9bf05..89d480c 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -232,6 +232,15 @@ xfs_inode_item_format(
 	nvecs	     = 1;
 
 	/*
+	 * Make sure the linux inode is dirty. We do this before
+	 * clearing i_update_core as the VFS will call back into
+	 * XFS here and set i_update_core, so we need to dirty the
+	 * inode first so that the ordering of i_update_core and
+	 * unlogged modifications still works as described below.
+	 */
+	xfs_mark_inode_dirty_sync(ip);
+
+	/*
 	 * Clear i_update_core if the timestamps (or any other
 	 * non-transactional modification) need flushing/logging
 	 * and we're about to log them with the rest of the core.
@@ -275,11 +284,6 @@ xfs_inode_item_format(
 	 */
 	xfs_synchronize_atime(ip);
 
-	/*
-	 * make sure the linux inode is dirty
-	 */
-	xfs_mark_inode_dirty_sync(ip);
-
 	vecp->i_addr = (xfs_caddr_t)&ip->i_d;
 	vecp->i_len  = sizeof(xfs_dinode_core_t);
 	XLOG_VEC_SET_TYPE(vecp, XLOG_REG_TYPE_ICORE);

  reply	other threads:[~2008-10-22  8:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-21  6:21 atime not written to disk Timothy Shimmin
2008-10-21  6:49 ` Utako Kusaka
2008-10-22  8:17   ` Dave Chinner [this message]
2008-10-23  2:52     ` [PATCH, RFC] " Niv Sardi
2008-10-23  2:53     ` Niv Sardi
2008-10-23  4:20       ` Dave Chinner
2008-10-24  5:37         ` Niv Sardi
2008-10-24  6:08           ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081022081710.GL18495@disturbed \
    --to=david@fromorbit.com \
    --cc=tes@sgi.com \
    --cc=u-kusaka@wm.jp.nec.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox