* [PATCH 1/2] make sure data is on disk before calling ->write_inode
From: Christoph Hellwig @ 2010-01-11 17:30 UTC
To: viro, akpm; +Cc: linux-fsdevel, Trond.Myklebust, dhowells
Similar to the fsync issue fixed a while ago in commit
2daea67e966dc0c42067ebea015ddac6834cef88, we need to wait for data to
actually hit the disk before writing out the metadata to guarantee
data integrity for filesystems that modify the inode in the data I/O
completion path. Currently XFS and NFS handle this manually, and AFS
has a write_inode method that does nothing but wait for data, while
others are possibly missing out on this.
Fortunately this change has a lot less impact than the fsync change
as none of the write_inode methods starts data writeout of any form
by itself.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: linux-2.6/fs/afs/internal.h
===================================================================
--- linux-2.6.orig/fs/afs/internal.h 2010-01-11 15:54:33.776254187 +0100
+++ linux-2.6/fs/afs/internal.h 2010-01-11 15:54:36.556006423 +0100
@@ -733,7 +733,6 @@ extern int afs_write_end(struct file *fi
struct page *page, void *fsdata);
extern int afs_writepage(struct page *, struct writeback_control *);
extern int afs_writepages(struct address_space *, struct writeback_control *);
-extern int afs_write_inode(struct inode *, int);
extern void afs_pages_written_back(struct afs_vnode *, struct afs_call *);
extern ssize_t afs_file_write(struct kiocb *, const struct iovec *,
unsigned long, loff_t);
Index: linux-2.6/fs/afs/super.c
===================================================================
--- linux-2.6.orig/fs/afs/super.c 2010-01-11 15:54:39.129253854 +0100
+++ linux-2.6/fs/afs/super.c 2010-01-11 15:54:42.315256021 +0100
@@ -48,7 +48,6 @@ struct file_system_type afs_fs_type = {
static const struct super_operations afs_super_ops = {
.statfs = afs_statfs,
.alloc_inode = afs_alloc_inode,
- .write_inode = afs_write_inode,
.destroy_inode = afs_destroy_inode,
.clear_inode = afs_clear_inode,
.put_super = afs_put_super,
Index: linux-2.6/fs/afs/write.c
===================================================================
--- linux-2.6.orig/fs/afs/write.c 2010-01-11 15:54:13.757253993 +0100
+++ linux-2.6/fs/afs/write.c 2010-01-11 15:54:30.085006295 +0100
@@ -585,27 +585,6 @@ int afs_writepages(struct address_space
 }
 
 /*
- * write an inode back
- */
-int afs_write_inode(struct inode *inode, int sync)
-{
-	struct afs_vnode *vnode = AFS_FS_I(inode);
-	int ret;
-
-	_enter("{%x:%u},", vnode->fid.vid, vnode->fid.vnode);
-
-	ret = 0;
-	if (sync) {
-		ret = filemap_fdatawait(inode->i_mapping);
-		if (ret < 0)
-			__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
-	}
-
-	_leave(" = %d", ret);
-	return ret;
-}
-
-/*
  * completion of write to server
  */
 void afs_pages_written_back(struct afs_vnode *vnode, struct afs_call *call)
Index: linux-2.6/fs/fs-writeback.c
===================================================================
--- linux-2.6.orig/fs/fs-writeback.c 2010-01-11 15:53:15.462272627 +0100
+++ linux-2.6/fs/fs-writeback.c 2010-01-11 15:54:09.662006283 +0100
@@ -461,15 +461,20 @@ writeback_single_inode(struct inode *ino
 
 	ret = do_writepages(mapping, wbc);
 
-	/* Don't write the inode if only I_DIRTY_PAGES was set */
-	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
-		int err = write_inode(inode, wait);
+	/*
+	 * Make sure to wait on the data before writing out the metadata.
+	 * This is important for filesystems that modify metadata on data
+	 * I/O completion.
+	 */
+	if (wait) {
+		int err = filemap_fdatawait(mapping);
 		if (ret == 0)
 			ret = err;
 	}
 
-	if (wait) {
-		int err = filemap_fdatawait(mapping);
+	/* Don't write the inode if only I_DIRTY_PAGES was set */
+	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
+		int err = write_inode(inode, wait);
 		if (ret == 0)
 			ret = err;
 	}
Index: linux-2.6/fs/nfs/inode.c
===================================================================
--- linux-2.6.orig/fs/nfs/inode.c 2010-01-11 15:54:45.848003872 +0100
+++ linux-2.6/fs/nfs/inode.c 2010-01-11 15:55:13.095006486 +0100
@@ -101,12 +101,7 @@ int nfs_write_inode(struct inode *inode,
 {
 	int ret;
 
-	if (sync) {
-		ret = filemap_fdatawait(inode->i_mapping);
-		if (ret == 0)
-			ret = nfs_commit_inode(inode, FLUSH_SYNC);
-	} else
-		ret = nfs_commit_inode(inode, 0);
+	ret = nfs_commit_inode(inode, sync ? FLUSH_SYNC : 0);
 	if (ret >= 0)
 		return 0;
 	__mark_inode_dirty(inode, I_DIRTY_DATASYNC);
Index: linux-2.6/fs/xfs/linux-2.6/xfs_super.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_super.c 2010-01-11 15:55:16.917003903 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_super.c 2010-01-11 15:55:24.764006427 +0100
@@ -1044,12 +1044,6 @@ xfs_fs_write_inode(
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return XFS_ERROR(EIO);
 
-	if (sync) {
-		error = xfs_wait_on_pages(ip, 0, -1);
-		if (error)
-			goto out;
-	}
-
 	/*
 	 * Bypass inodes which have already been cleaned by
 	 * the inode flush clustering code inside xfs_iflush
* Re: [PATCH 1/2] make sure data is on disk before calling ->write_inode
From: Dave Chinner @ 2010-01-12 0:41 UTC
To: Christoph Hellwig; +Cc: viro, akpm, linux-fsdevel, Trond.Myklebust, dhowells
On Mon, Jan 11, 2010 at 06:30:47PM +0100, Christoph Hellwig wrote:
> Similar to the fsync issue fixed a while ago in commit
> 2daea67e966dc0c42067ebea015ddac6834cef88, we need to wait for data to
> actually hit the disk before writing out the metadata to guarantee
> data integrity for filesystems that modify the inode in the data I/O
> completion path. Currently XFS and NFS handle this manually, and AFS
> has a write_inode method that does nothing but wait for data, while
> others are possibly missing out on this.
>
> Fortunately this change has a lot less impact than the fsync change
> as none of the write_inode methods starts data writeout of any form
> by itself.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
The generic and XFS bits look OK to me.
Acked-by: Dave Chinner <david@fromorbit.com>
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH 1/2] make sure data is on disk before calling ->write_inode
From: Andrew Morton @ 2010-01-14 0:21 UTC
To: Christoph Hellwig; +Cc: viro, linux-fsdevel, Trond.Myklebust, dhowells
On Mon, 11 Jan 2010 18:30:47 +0100
Christoph Hellwig <hch@lst.de> wrote:
> --- linux-2.6.orig/fs/fs-writeback.c 2010-01-11 15:53:15.462272627 +0100
> +++ linux-2.6/fs/fs-writeback.c 2010-01-11 15:54:09.662006283 +0100
> @@ -461,15 +461,20 @@ writeback_single_inode(struct inode *ino
>
>  	ret = do_writepages(mapping, wbc);
>
> -	/* Don't write the inode if only I_DIRTY_PAGES was set */
> -	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
> -		int err = write_inode(inode, wait);
> +	/*
> +	 * Make sure to wait on the data before writing out the metadata.
> +	 * This is important for filesystems that modify metadata on data
> +	 * I/O completion.
> +	 */
> +	if (wait) {
> +		int err = filemap_fdatawait(mapping);
>  		if (ret == 0)
>  			ret = err;
>  	}
>
> -	if (wait) {
> -		int err = filemap_fdatawait(mapping);
> +	/* Don't write the inode if only I_DIRTY_PAGES was set */
> +	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
> +		int err = write_inode(inode, wait);
>  		if (ret == 0)
>  			ret = err;
>  	}
hm, yeah, it's hard to see how this reordering can harm throughput much.
nfs_write_inode() has vanished in linux-next so I just dropped all the
nfs parts of these two patches. You might want to check that.
* Re: [PATCH 1/2] make sure data is on disk before calling ->write_inode
From: Christoph Hellwig @ 2010-01-14 6:21 UTC
To: Andrew Morton
Cc: Christoph Hellwig, viro, linux-fsdevel, Trond.Myklebust, dhowells
On Wed, Jan 13, 2010 at 04:21:54PM -0800, Andrew Morton wrote:
> hm, yeah, it's hard to see how this reordering can harm throughput much.
>
> nfs_write_inode() has vanished in linux-next so I just dropped all the
> nfs parts of these two patches. You might want to check that.
That's because linux-next has an ugly workaround for the old calling
conventions. The nfs updates in -next should be dropped / reworked
instead.