All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boaz Harrosh <bharrosh@panasas.com>
To: npiggin@kernel.dk
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 5/7] fs: ext2 inode sync fix
Date: Tue, 30 Nov 2010 13:26:12 +0200	[thread overview]
Message-ID: <4CF4DF54.7020602@panasas.com> (raw)
In-Reply-To: <20101123140707.991528524@kernel.dk>

On 11/23/2010 04:06 PM, npiggin@kernel.dk wrote:
> There is a big fuckup with inode metadata writeback (I suspect in a lot
> of filesystems, but I've only dared to look at a couple as yet).
> 
> ext2 relies on ->write_inode being called from sync_inode_metadata in
> fsync in order to sync the inode. However I_DIRTY_SYNC gets cleared
> after a call to this guy, and it doesn't actually write back and wait
> on the inode block unless it is called for sync. This means that write_inode
> from background writeback can kill the inode dirty bits without the data
> getting to disk. Fsync will subsequently miss it.
> 
> The fix is for ->write_inode to dirty the buffers/cache, and then ->fsync to
> write back the dirty data. In the full filesystem sync case, buffercache
> writeback in the generic code will write back the dirty data. Other
> filesystems could use ->sync_fs for this.
> 
> Signed-off-by: Nick Piggin <npiggin@kernel.dk>
> 
> Index: linux-2.6/fs/ext2/inode.c
> ===================================================================
> --- linux-2.6.orig/fs/ext2/inode.c	2010-11-23 21:56:05.000000000 +1100
> +++ linux-2.6/fs/ext2/inode.c	2010-11-23 22:05:02.000000000 +1100
> @@ -1211,7 +1211,7 @@ static int ext2_setsize(struct inode *in
>  	return 0;
>  }
>  
> -static struct ext2_inode *ext2_get_inode(struct super_block *sb, ino_t ino,
> +struct ext2_inode *ext2_get_inode(struct super_block *sb, ino_t ino,
>  					struct buffer_head **p)
>  {
>  	struct buffer_head * bh;
> @@ -1505,16 +1505,8 @@ static int __ext2_write_inode(struct ino
>  	} else for (n = 0; n < EXT2_N_BLOCKS; n++)
>  		raw_inode->i_block[n] = ei->i_data[n];
>  	mark_buffer_dirty(bh);
> -	if (do_sync) {
> -		sync_dirty_buffer(bh);
> -		if (buffer_req(bh) && !buffer_uptodate(bh)) {
> -			printk ("IO error syncing ext2 inode [%s:%08lx]\n",
> -				sb->s_id, (unsigned long) ino);
> -			err = -EIO;
> -		}
> -	}

Is .fsync the only case when .write_inode is called with do_sync==true ?

> -	ei->i_state &= ~EXT2_STATE_NEW;
>  	brelse (bh);
> +	ei->i_state &= ~EXT2_STATE_NEW;
>  	return err;
>  }
>  
> Index: linux-2.6/fs/ext2/file.c
> ===================================================================
> --- linux-2.6.orig/fs/ext2/file.c	2010-11-23 21:56:05.000000000 +1100
> +++ linux-2.6/fs/ext2/file.c	2010-11-23 22:05:02.000000000 +1100
> @@ -21,6 +21,7 @@
>  #include <linux/time.h>
>  #include <linux/pagemap.h>
>  #include <linux/quotaops.h>
> +#include <linux/buffer_head.h>
>  #include "ext2.h"
>  #include "xattr.h"
>  #include "acl.h"
> @@ -43,16 +44,33 @@ static int ext2_release_file (struct ino
>  int ext2_fsync(struct file *file, int datasync)
>  {
>  	int ret;
> -	struct super_block *sb = file->f_mapping->host->i_sb;
> -	struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
> +	struct inode *inode = file->f_mapping->host;
> +	ino_t ino = inode->i_ino;
> +	struct super_block *sb = inode->i_sb;
> +	struct address_space *sb_mapping = sb->s_bdev->bd_inode->i_mapping;
> +	struct buffer_head *bh;
> +	struct ext2_inode *raw_inode;
>  
>  	ret = generic_file_fsync(file, datasync);
> -	if (ret == -EIO || test_and_clear_bit(AS_EIO, &mapping->flags)) {
> +	if (ret == -EIO || test_and_clear_bit(AS_EIO, &sb_mapping->flags)) {
>  		/* We don't really know where the IO error happened... */
>  		ext2_error(sb, __func__,
>  			   "detected IO error when writing metadata buffers");
> +		return -EIO;
> +	}
> +
> +	raw_inode = ext2_get_inode(sb, ino, &bh);

You use the Read-inode-from-disk-if-not-present function to associate
a bufferhead with the inode, relying on it being in memory for sure.
That's ugly. Maybe a comment?

But do you need it, since above __ext2_write_inode did mark_buffer_dirty(bh);
is it not already included in the buffers flushed to disk?

> +	if (IS_ERR(raw_inode))
> + 		return -EIO;
> +
> +	sync_dirty_buffer(bh);
> +	if (buffer_req(bh) && !buffer_uptodate(bh)) {
> +		printk ("IO error syncing ext2 inode [%s:%08lx]\n",
> +			sb->s_id, (unsigned long) ino);

Should you not use ext2_error(sb, __func__, like above?

Boaz
>  		ret = -EIO;
>  	}
> +	brelse (bh);
> +
>  	return ret;
>  }
>  
> Index: linux-2.6/fs/ext2/ext2.h
> ===================================================================
> --- linux-2.6.orig/fs/ext2/ext2.h	2010-11-23 21:56:05.000000000 +1100
> +++ linux-2.6/fs/ext2/ext2.h	2010-11-23 22:05:02.000000000 +1100
> @@ -124,6 +124,8 @@ extern int ext2_get_block(struct inode *
>  extern int ext2_setattr (struct dentry *, struct iattr *);
>  extern void ext2_set_inode_flags(struct inode *inode);
>  extern void ext2_get_inode_flags(struct ext2_inode_info *);
> +extern struct ext2_inode *ext2_get_inode(struct super_block *sb, ino_t ino,
> +					struct buffer_head **p);
>  extern int ext2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>  		       u64 start, u64 len);
>  
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


  reply	other threads:[~2010-11-30 11:26 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-23 14:06 [patch 0/7] icache dirty / sync fixes npiggin
2010-11-23 14:06 ` [patch 1/7] fs: mark_inode_dirty barrier fix npiggin
2010-11-23 14:06 ` [patch 2/7] fs: simple fsync race fix npiggin
2010-11-29 14:58   ` Christoph Hellwig
2010-11-30  0:05     ` Nick Piggin
2010-11-23 14:06 ` [patch 3/7] fs: introduce inode writeback helpers npiggin
2010-11-29 15:13   ` Christoph Hellwig
2010-11-30  0:22     ` Nick Piggin
2010-11-23 14:06 ` [patch 4/7] fs: preserve inode dirty bits on failed metadata writeback npiggin
2010-11-29 14:59   ` Christoph Hellwig
2010-11-30  0:08     ` Nick Piggin
2010-11-23 14:06 ` [patch 5/7] fs: ext2 inode sync fix npiggin
2010-11-30 11:26   ` Boaz Harrosh [this message]
2010-11-23 14:06 ` [patch 6/7] fs: fsync optimisations npiggin
2010-11-29 15:03   ` Christoph Hellwig
2010-11-30  0:11     ` Nick Piggin
2010-11-23 14:06 ` [patch 7/7] fs: fix or note I_DIRTY handling bugs in filesystems npiggin
2010-11-23 15:04   ` Steven Whitehouse
2010-11-23 22:51   ` Dave Chinner
2010-11-24  0:23     ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4CF4DF54.7020602@panasas.com \
    --to=bharrosh@panasas.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@kernel.dk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.