linux-fsdevel.vger.kernel.org archive mirror
* [PATCH 1/3] block: Add block_flush_device()
From: Fernando Luis Vázquez Cao @ 2009-02-16  7:25 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Jan Kara, Alan Cox, Pavel Machek, kernel list, Jens Axboe,
	sandeen, fernando, rwheeler, linux-fsdevel

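This patch adds a helper function, block_flush_device(), for filesystems that
need to flush the write cache of the underlying block device on
fsync()/fdatasync(). A device that does not support flushing (-EOPNOTSUPP) is
treated as success.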

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.29-rc4-orig/fs/buffer.c linux-2.6.29-rc4/fs/buffer.c
--- linux-2.6.29-rc4-orig/fs/buffer.c	2009-02-16 14:45:11.000000000 +0900
+++ linux-2.6.29-rc4/fs/buffer.c	2009-02-16 14:53:26.000000000 +0900
@@ -165,6 +165,17 @@ void end_buffer_write_sync(struct buffer
 	put_bh(bh);
 }
 
+/* Issue flush of write caches on the block device */
+int block_flush_device(struct super_block *sb)
+{
+	int ret = 0;
+
+	ret = blkdev_issue_flush(sb->s_bdev, NULL);
+
+	return (ret == -EOPNOTSUPP) ? 0 : ret;
+}
+EXPORT_SYMBOL(block_flush_device);
+
 /*
  * Write out and wait upon all the dirty data associated with a block
  * device via its mapping.  Does not take the superblock lock.
diff -urNp linux-2.6.29-rc4-orig/include/linux/buffer_head.h linux-2.6.29-rc4/include/linux/buffer_head.h
--- linux-2.6.29-rc4-orig/include/linux/buffer_head.h	2009-02-16 14:45:12.000000000 +0900
+++ linux-2.6.29-rc4/include/linux/buffer_head.h	2009-02-16 14:48:28.000000000 +0900
@@ -238,6 +238,7 @@ int nobh_write_end(struct file *, struct
 int nobh_truncate_page(struct address_space *, loff_t, get_block_t *);
 int nobh_writepage(struct page *page, get_block_t *get_block,
                         struct writeback_control *wbc);
+int block_flush_device(struct super_block *sb);
 
 void buffer_init(void);
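
The return policy of block_flush_device() can be tried out in userspace.
The following is only a minimal sketch: the flush_fn type and the three
flush_* stubs are hypothetical stand-ins for blkdev_issue_flush(), which
is not available outside the kernel. Only the -EOPNOTSUPP mapping is
taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a call to blkdev_issue_flush(). */
typedef int (*flush_fn)(void);

/* Stub: device has no write cache flush support. */
static int flush_unsupported(void) { return -EOPNOTSUPP; }

/* Stub: flush completed successfully. */
static int flush_ok(void) { return 0; }

/* Stub: flush failed with a real I/O error. */
static int flush_io_error(void) { return -EIO; }

/*
 * Same policy as block_flush_device() in the patch: "not supported"
 * is reported as success, every other result is passed through.
 */
static int flush_or_ignore_unsupported(flush_fn issue_flush)
{
	int ret = issue_flush();

	return (ret == -EOPNOTSUPP) ? 0 : ret;
}
```

With these stubs, flush_or_ignore_unsupported(flush_unsupported) yields 0,
while flush_or_ignore_unsupported(flush_io_error) still yields -EIO, so
genuine I/O errors are not hidden from the caller.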
 




* [2/3] ext3: call block_flush_device() on fsync
From: Fernando Luis Vázquez Cao @ 2009-02-16  7:29 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Jan Kara, Alan Cox, Pavel Machek, kernel list, Jens Axboe,
	sandeen, fernando, rwheeler, linux-fsdevel

To ensure that data is truly on disk after an fsync or fdatasync, we
should explicitly force a disk flush when there is dirty data/metadata
and the journal did not emit a write barrier (either because metadata is
not being synced or because barriers are disabled).

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.29-rc5-orig/fs/ext3/fsync.c linux-2.6.29-rc5/fs/ext3/fsync.c
--- linux-2.6.29-rc5-orig/fs/ext3/fsync.c	2008-12-25 08:26:37.000000000 +0900
+++ linux-2.6.29-rc5/fs/ext3/fsync.c	2009-02-16 15:56:05.000000000 +0900
@@ -45,6 +45,8 @@
 int ext3_sync_file(struct file * file, struct dentry *dentry, int datasync)
 {
 	struct inode *inode = dentry->d_inode;
+	journal_t *journal = EXT3_SB(inode->i_sb)->s_journal;
+	unsigned long i_state = inode->i_state;
 	int ret = 0;
 
 	J_ASSERT(ext3_journal_current_handle() == NULL);
@@ -69,23 +71,30 @@ int ext3_sync_file(struct file * file, s
 	 */
 	if (ext3_should_journal_data(inode)) {
 		ret = ext3_force_commit(inode->i_sb);
-		goto out;
+		if (!(journal->j_flags & JFS_BARRIER))
+			block_flush_device(inode->i_sb);
+		return ret;
 	}
 
-	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
-		goto out;
+	if (datasync && !(i_state & I_DIRTY_DATASYNC)) {
+		if (i_state & I_DIRTY_PAGES)
+			block_flush_device(inode->i_sb);
+		return ret;
+	}
 
 	/*
 	 * The VFS has written the file data.  If the inode is unaltered
 	 * then we need not start a commit.
 	 */
-	if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
+	if (i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
 		struct writeback_control wbc = {
 			.sync_mode = WB_SYNC_ALL,
 			.nr_to_write = 0, /* sys_fsync did this */
 		};
 		ret = sync_inode(inode, &wbc);
+		if (journal && !(journal->j_flags & JFS_BARRIER))
+			block_flush_device(inode->i_sb);
 	}
-out:
+
 	return ret;
 }
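
The three branches of ext3_sync_file() after this patch can be summarized
as one predicate: "does fsync need an explicit device flush?". The sketch
below is a userspace model only. The SK_* flag values and struct
sk_fsync_ctx are hypothetical placeholders for inode->i_state, the
JFS_BARRIER journal flag and the result of ext3_should_journal_data(), and
it assumes a journal is always present.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits; these are not the kernel's I_DIRTY_* values. */
enum {
	SK_DIRTY_SYNC     = 1 << 0,	/* models I_DIRTY_SYNC */
	SK_DIRTY_DATASYNC = 1 << 1,	/* models I_DIRTY_DATASYNC */
	SK_DIRTY_PAGES    = 1 << 2,	/* models I_DIRTY_PAGES */
};

/* Hypothetical snapshot of the inputs ext3_sync_file() looks at. */
struct sk_fsync_ctx {
	bool journal_data;	/* ext3_should_journal_data() is true */
	bool barrier_enabled;	/* JFS_BARRIER set in journal->j_flags */
	bool datasync;		/* fdatasync() rather than fsync() */
	unsigned long i_state;	/* inode state sampled on entry */
};

/* Returns true where the patched code calls block_flush_device(). */
static bool sk_needs_explicit_flush(const struct sk_fsync_ctx *c)
{
	/* data=journal: the commit covers the data; flush if no barrier */
	if (c->journal_data)
		return !c->barrier_enabled;

	/* fdatasync with clean metadata: no commit, so flush dirty pages */
	if (c->datasync && !(c->i_state & SK_DIRTY_DATASYNC))
		return (c->i_state & SK_DIRTY_PAGES) != 0;

	/* inode was written out: flush only if no barrier was emitted */
	if (c->i_state & (SK_DIRTY_SYNC | SK_DIRTY_DATASYNC))
		return !c->barrier_enabled;

	return false;
}
```

For example, an fdatasync() on an inode with only SK_DIRTY_PAGES set
returns true regardless of the barrier setting, because no journal commit
(and hence no barrier) is issued on that path.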




* [PATCH 3/3] ext4: call block_flush_device() on fsync
From: Fernando Luis Vázquez Cao @ 2009-02-16  7:31 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Jan Kara, Alan Cox, Pavel Machek, kernel list, Jens Axboe,
	sandeen, fernando, rwheeler, linux-fsdevel

To ensure that data is truly on disk after an fsync or fdatasync, we
should explicitly force a disk flush when there is dirty data/metadata
and the journal did not emit a write barrier (either because metadata is
not being synced or because barriers are disabled).

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---

diff -urNp linux-2.6.29-rc5-orig/fs/ext4/fsync.c linux-2.6.29-rc5/fs/ext4/fsync.c
--- linux-2.6.29-rc5-orig/fs/ext4/fsync.c	2008-12-25 08:26:37.000000000 +0900
+++ linux-2.6.29-rc5/fs/ext4/fsync.c	2009-02-16 15:52:56.000000000 +0900
@@ -48,6 +48,7 @@ int ext4_sync_file(struct file *file, st
 {
 	struct inode *inode = dentry->d_inode;
 	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
+	unsigned long i_state = inode->i_state;
 	int ret = 0;
 
 	J_ASSERT(ext4_journal_current_handle() == NULL);
@@ -76,25 +77,30 @@ int ext4_sync_file(struct file *file, st
 	 */
 	if (ext4_should_journal_data(inode)) {
 		ret = ext4_force_commit(inode->i_sb);
-		goto out;
+		if (!(journal->j_flags & JBD2_BARRIER))
+			block_flush_device(inode->i_sb);
+		return ret;
 	}
 
-	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
-		goto out;
+	if (datasync && !(i_state & I_DIRTY_DATASYNC)) {
+		if (i_state & I_DIRTY_PAGES)
+			block_flush_device(inode->i_sb);
+		return ret;
+	}
 
 	/*
 	 * The VFS has written the file data.  If the inode is unaltered
 	 * then we need not start a commit.
 	 */
-	if (inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
+	if (i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC)) {
 		struct writeback_control wbc = {
 			.sync_mode = WB_SYNC_ALL,
 			.nr_to_write = 0, /* sys_fsync did this */
 		};
 		ret = sync_inode(inode, &wbc);
-		if (journal && (journal->j_flags & JBD2_BARRIER))
-			blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
+		if (journal && !(journal->j_flags & JBD2_BARRIER))
+			block_flush_device(inode->i_sb);
 	}
-out:
+
 	return ret;
 }




* Re: vfs: Add MS_FLUSHONFSYNC mount flag
From: Fernando Luis Vázquez Cao @ 2009-02-16  7:47 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Christoph Hellwig, Jeff Garzik, Eric Sandeen, Jan Kara, Alan Cox,
	Pavel Machek, kernel list, Jens Axboe, fernando, Ric Wheeler,
	linux-fsdevel

On Sun, 2009-02-15 at 17:54 -0500, Theodore Tso wrote:
> On Sun, Feb 15, 2009 at 04:23:26PM +0900, Fernando Luis Vázquez Cao wrote:
> > You mentioned "we should integrate this with the barrier settings". Do
> > you imply we should make it a per-device tunable too? Should we keep the
> > barrier-related mount options some filesystems provide?
> > 
> 
> Making barriers a per-device tunable makes sense.  The only
> reason why we kept it as a mount option in ext4 is for benchmarking
> purposes, and in ext3, because the filesystem predated the barrier
> code, and there was a desire to be able to benchmark with and without
> the old behavior --- and because akpm is still worried about the
> performance hit of the barrier code, so he's been resistant to
> changing the default for ext3.

Ok, I'll turn both barriers and flushonfsync into a sysfs-exported
per-device knob and see how it turns out.

By the way, should we also add/keep a mount option for "benchmarking
purposes"? I guess that once we get the per-device tunable we probably
will not need it anymore.

Regards,

Fernando
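
For benchmarking against a per-device knob, a test harness would need to
read the current setting. The sketch below shows one way to do that from
userspace, assuming the knob is exported as a text file containing "0" or
"1". No such attribute exists yet, so any path passed to read_bool_knob()
is hypothetical, as is the helper itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a boolean knob from a sysfs-style text file.
 * Returns 1 or 0 for the knob value, or -1 if the file cannot be
 * opened or does not contain exactly 0 or 1 as its first token.
 */
static int read_bool_knob(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%d", &value) == 1);
	fclose(f);

	if (!ok || (value != 0 && value != 1))
		return -1;

	return value;
}
```

A benchmark script could call this before each run and record the value
alongside the results, so that runs with and without flushing are not
mixed up.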


