From: Mingming Cao <cmm@us.ibm.com>
To: djwong@us.ibm.com
Cc: "Theodore Ts'o" <tytso@mit.edu>,
linux-ext4 <linux-ext4@vger.kernel.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Keith Mannthey <kmannth@us.ibm.com>,
Mingming Cao <mcao@us.ibm.com>
Subject: Re: [RFC] ext4: Don't send extra barrier during fsync if there are no dirty pages.
Date: Mon, 03 May 2010 17:57:47 -0700
Message-ID: <1272934667.2544.3.camel@mingming-laptop>
In-Reply-To: <20100429235102.GC15607@tux1.beaverton.ibm.com>

On Thu, 2010-04-29 at 16:51 -0700, Darrick J. Wong wrote:
> Hmm. A while ago I was complaining that an evil program that calls fsync() in
> a loop will send a continuous stream of write barriers to the hard disk. Ted
> theorized that it might be possible to set a flag in ext4_writepage and clear
> it in ext4_sync_file; if we happen to enter ext4_sync_file and the flag isn't
> set (meaning that nothing has been dirtied since the last fsync()) then we
> could skip issuing the barrier.
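
For reference, the kind of program described above can be as small as the
sketch below. It is only an illustration: the function name fsync_loop() and
its parameters are made up here and do not come from the thread. It writes one
byte and then calls fsync() repeatedly, so after the first call there should be
no new dirty data for the later calls to flush.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical reproducer (not from the original mail): write one byte to
 * `path`, then call fsync() `iterations` times in a row.
 * Returns the number of fsync() calls that succeeded, or -1 if the file
 * could not be created or written.
 */
int fsync_loop(const char *path, int iterations)
{
	int fd, i, ok = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	if (write(fd, "x", 1) != 1) {
		close(fd);
		return -1;
	}

	/* Nothing is written between iterations, so later calls are redundant. */
	for (i = 0; i < iterations; i++)
		if (fsync(fd) == 0)
			ok++;

	close(fd);
	return ok;
}
```

Running something like fsync_loop("testfile", 100000) under blktrace is one way
to see whether each call still produces a flush on a given kernel.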
>
> Here's an experimental patch to do something sort of like that. From a quick
> run with blktrace, it seems to skip the redundant barriers and improves the ffsb
> mail server scores. However, I haven't done extensive power failure testing to
> see how much data it can destroy. For that matter I'm not even 100% sure it's
> correct at what it aims to do.
>
> Just throwing this out there, though. Nothing's blown up ... yet. :P
> ---
> Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
> ---
>
> fs/ext4/ext4.h | 2 ++
> fs/ext4/fsync.c | 7 +++++--
> fs/ext4/inode.c | 5 +++++
> 3 files changed, 12 insertions(+), 2 deletions(-)
>
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index bf938cf..3b70195 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -1025,6 +1025,8 @@ struct ext4_sb_info {
>
>  	/* workqueue for dio unwritten */
>  	struct workqueue_struct *dio_unwritten_wq;
> +
> +	atomic_t unflushed_writes;
>  };
>
Just wondering, is this a per-filesystem flag? I think it would be nicer to make
it a per-inode flag: when there is no dirty data in flight for this inode
(rather than for the whole fs), there is no need to issue a barrier in
ext4_sync_file().
Mingming
> static inline struct ext4_sb_info *EXT4_SB(struct super_block *sb)
> diff --git a/fs/ext4/fsync.c b/fs/ext4/fsync.c
> index 0d0c323..441f872 100644
> --- a/fs/ext4/fsync.c
> +++ b/fs/ext4/fsync.c
> @@ -52,7 +52,8 @@ int ext4_sync_file(struct file *file, struct dentry *dentry, int datasync)
>  {
>  	struct inode *inode = dentry->d_inode;
>  	struct ext4_inode_info *ei = EXT4_I(inode);
> -	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
> +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> +	journal_t *journal = sbi->s_journal;
>  	int ret;
>  	tid_t commit_tid;
...
> @@ -102,7 +103,9 @@ int ext4_sync_file(struct file *file, struct dentry *dentry, int datasync)
>  		    (journal->j_flags & JBD2_BARRIER))
>  			blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
>  		jbd2_log_wait_commit(journal, commit_tid);
> -	} else if (journal->j_flags & JBD2_BARRIER)
> +	} else if (journal->j_flags & JBD2_BARRIER && atomic_read(&sbi->unflushed_writes)) {
> +		atomic_set(&sbi->unflushed_writes, 0);
>  		blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
> +	}
>  	return ret;
>  }
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 5381802..e501abd 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2718,6 +2718,7 @@ static int ext4_writepage(struct page *page,
>  	unsigned int len;
>  	struct buffer_head *page_bufs = NULL;
>  	struct inode *inode = page->mapping->host;
> +	struct ext4_sb_info *sbi = EXT4_SB(page->mapping->host->i_sb);
>
>  	trace_ext4_writepage(inode, page);
>  	size = i_size_read(inode);
> @@ -2726,6 +2727,8 @@ static int ext4_writepage(struct page *page,
>  	else
>  		len = PAGE_CACHE_SIZE;
>
> +	atomic_set(&sbi->unflushed_writes, 1);
> +
>  	if (page_has_buffers(page)) {
>  		page_bufs = page_buffers(page);
>  		if (walk_page_buffers(NULL, page_bufs, 0, len, NULL,
> @@ -2872,6 +2875,8 @@ static int ext4_da_writepages(struct address_space *mapping,
>  	if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX)
>  		range_whole = 1;
>
> +	atomic_set(&sbi->unflushed_writes, 1);
> +
>  	range_cyclic = wbc->range_cyclic;
>  	if (wbc->range_cyclic) {
>  		index = mapping->writeback_index;