* [PATCH] ext4: Speedup WB_SYNC_ALL pass
@ 2014-03-03 22:53 Jan Kara
From: Jan Kara @ 2014-03-03 22:53 UTC
To: tytso; +Cc: linux-ext4, Jan Kara
When doing a filesystem-wide sync, there's no need to force a transaction
commit (or synchronously write the inode buffer) separately for each inode
because ext4_sync_fs() takes care of forcing the commit at the end (and the
VFS takes care of flushing the buffer cache, respectively). Most of the time
this slowness doesn't manifest because the preceding WB_SYNC_NONE writeback
doesn't leave much to write, but when there are processes aggressively
creating new files and several filesystems to sync, the sync slowness
can be noticeable. With the following test script, sync(1) takes around 6
minutes when there are two ext4 filesystems mounted on a standard SATA
drive. After this patch, sync takes a couple of seconds, so we have about
two orders of magnitude improvement.
function run_writers
{
  for (( i = 0; i < 10; i++ )); do
    mkdir $1/dir$i
    for (( j = 0; j < 40000; j++ )); do
      dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
    done &
  done
}

for dir in "$@"; do
  run_writers $dir
done

sleep 40
time sync
Signed-off-by: Jan Kara <jack@suse.cz>
---
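For readers skimming the diff, here is a minimal user-space sketch of the two
conditions the patch below changes. It is not kernel code: the enum and struct
are simplified stand-ins for the kernel definitions (for_sync is modelled as a
plain bool), and the helper names needs_per_inode_commit() and
needs_sync_inode_buffer() are hypothetical, introduced only for illustration.
The first models the check guarding ext4_force_commit(), the second the check
guarding sync_dirty_buffer().

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel definitions, for illustration only. */
enum writeback_sync_modes { WB_SYNC_NONE, WB_SYNC_ALL };

struct writeback_control {
	enum writeback_sync_modes sync_mode;
	bool for_sync;	/* assumed: set when the writeback comes from sync(2) */
};

/* Hypothetical helper: models the patched check before ext4_force_commit(). */
static inline bool needs_per_inode_commit(const struct writeback_control *wbc)
{
	/* The patch returns early (no commit) in exactly this case. */
	if (wbc->sync_mode != WB_SYNC_ALL || wbc->for_sync)
		return false;
	return true;
}

/* Hypothetical helper: models the patched check before sync_dirty_buffer(). */
static inline bool needs_sync_inode_buffer(const struct writeback_control *wbc)
{
	return wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync;
}

/* Walks the sync_mode/for_sync combinations and checks the model. */
static inline void check_model(void)
{
	struct writeback_control fsync_like = { WB_SYNC_ALL, false };
	struct writeback_control sync_like = { WB_SYNC_ALL, true };
	struct writeback_control background = { WB_SYNC_NONE, false };

	/* Data-integrity writeback of a single inode keeps the old behavior. */
	assert(needs_per_inode_commit(&fsync_like));
	assert(needs_sync_inode_buffer(&fsync_like));

	/* Filesystem-wide sync skips the per-inode work. */
	assert(!needs_per_inode_commit(&sync_like));
	assert(!needs_sync_inode_buffer(&sync_like));

	/* WB_SYNC_NONE never forced anything, before or after the patch. */
	assert(!needs_per_inode_commit(&background));
	assert(!needs_sync_inode_buffer(&background));
}
```

With for_sync set, both helpers return false even for WB_SYNC_ALL, which is the
case the patch targets; a WB_SYNC_ALL pass without for_sync behaves as before.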
fs/ext4/inode.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6e39895a91b8..7850584b0679 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4443,7 +4443,12 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 			return -EIO;
 		}
 
-		if (wbc->sync_mode != WB_SYNC_ALL)
+		/*
+		 * No need to force transaction in WB_SYNC_NONE mode. Also
+		 * ext4_sync_fs() will force the commit after everything is
+		 * written.
+		 */
+		if (wbc->sync_mode != WB_SYNC_ALL || wbc->for_sync)
 			return 0;
 
 		err = ext4_force_commit(inode->i_sb);
@@ -4453,7 +4458,11 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 		err = __ext4_get_inode_loc(inode, &iloc, 0);
 		if (err)
 			return err;
-		if (wbc->sync_mode == WB_SYNC_ALL)
+		/*
+		 * sync(2) will flush the whole buffer cache. No need to do
+		 * it here separately for each inode.
+		 */
+		if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
 			sync_dirty_buffer(iloc.bh);
 		if (buffer_req(iloc.bh) && !buffer_uptodate(iloc.bh)) {
 			EXT4_ERROR_INODE_BLOCK(inode, iloc.bh->b_blocknr,
--
1.8.1.4
* Re: [PATCH] ext4: Speedup WB_SYNC_ALL pass
@ 2014-03-04 15:54 Theodore Ts'o
From: Theodore Ts'o @ 2014-03-04 15:54 UTC
To: Jan Kara; +Cc: linux-ext4
On Mon, Mar 03, 2014 at 11:53:28PM +0100, Jan Kara wrote:
> When doing a filesystem-wide sync, there's no need to force a transaction
> commit (or synchronously write the inode buffer) separately for each inode
> because ext4_sync_fs() takes care of forcing the commit at the end (and the
> VFS takes care of flushing the buffer cache, respectively). Most of the time
> this slowness doesn't manifest because the preceding WB_SYNC_NONE writeback
> doesn't leave much to write, but when there are processes aggressively
> creating new files and several filesystems to sync, the sync slowness
> can be noticeable. With the following test script, sync(1) takes around 6
> minutes when there are two ext4 filesystems mounted on a standard SATA
> drive. After this patch, sync takes a couple of seconds, so we have about
> two orders of magnitude improvement.
>
> function run_writers
> {
>   for (( i = 0; i < 10; i++ )); do
>     mkdir $1/dir$i
>     for (( j = 0; j < 40000; j++ )); do
>       dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
>     done &
>   done
> }
>
> for dir in "$@"; do
>   run_writers $dir
> done
>
> sleep 40
> time sync
>
> Signed-off-by: Jan Kara <jack@suse.cz>
Looks good, thanks for the patch!
- Ted
* [PATCH 0/2 v2] Fix data corruption when blocksize < pagesize for mmapped data
@ 2014-10-10 14:23 Jan Kara
From: Jan Kara @ 2014-10-10 14:23 UTC
To: linux-fsdevel
Cc: linux-ext4, Dave Chinner, xfs, cluster-devel, Steven Whitehouse,
Mark Fasheh, Joel Becker, ocfs2-devel, reiserfs-devel,
Jeff Mahoney, Dave Kleikamp, jfs-discussion, tytso, viro,
Jan Kara
Hello,
this is the second version of the patches to fix data corruption in mmapped
data when blocksize < pagesize, as tested by the xfstests generic/030 test.
The patchset fixes XFS and ext4. I've checked, and btrfs doesn't need fixing
because it doesn't support blocksize < pagesize. If that ever changes,
btrfs will likely need similar treatment. ocfs2, ext2, and ext3 are
OK since they happily allocate blocks during writeback. For other filesystems
like gfs2, ubifs, nilfs, ceph,... I'm not sure whether they support blocksize <
pagesize at all. NFS is also interesting: it may care, but I don't understand
its ->page_mkwrite() handler well enough to judge.
Changes since v1:
- changed the helper function name and moved it to mm/truncate.c. I originally
thought we could make the helper function update i_size to simplify the
interface, but that's actually impossible due to generic_write_end() lock
ordering constraints.
- used round_up() instead of ALIGN()
- taught truncate_setsize() to use the helper function
Honza
* [PATCH] ext4: Speedup WB_SYNC_ALL pass
@ 2014-10-10 14:23 Jan Kara
From: Jan Kara @ 2014-10-10 14:23 UTC
To: linux-fsdevel
Cc: Dave Kleikamp, jfs-discussion, tytso, Jeff Mahoney, Mark Fasheh,
reiserfs-devel, xfs, cluster-devel, Joel Becker, Jan Kara,
linux-ext4, Steven Whitehouse, ocfs2-devel, viro
When doing a filesystem-wide sync, there's no need to force a transaction
commit (or synchronously write the inode buffer) separately for each inode
because ext4_sync_fs() takes care of forcing the commit at the end (and the
VFS takes care of flushing the buffer cache, respectively). Most of the time
this slowness doesn't manifest because the preceding WB_SYNC_NONE writeback
doesn't leave much to write, but when there are processes aggressively
creating new files and several filesystems to sync, the sync slowness
can be noticeable. With the following test script, sync(1) takes around 6
minutes when there are two ext4 filesystems mounted on a standard SATA
drive. After this patch, sync takes a couple of seconds, so we have about
two orders of magnitude improvement.
function run_writers
{
  for (( i = 0; i < 10; i++ )); do
    mkdir $1/dir$i
    for (( j = 0; j < 40000; j++ )); do
      dd if=/dev/zero of=$1/dir$i/$j bs=4k count=4 &>/dev/null
    done &
  done
}

for dir in "$@"; do
  run_writers $dir
done

sleep 40
time sync
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/ext4/inode.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6e39895a91b8..7850584b0679 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4443,7 +4443,12 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 			return -EIO;
 		}
 
-		if (wbc->sync_mode != WB_SYNC_ALL)
+		/*
+		 * No need to force transaction in WB_SYNC_NONE mode. Also
+		 * ext4_sync_fs() will force the commit after everything is
+		 * written.
+		 */
+		if (wbc->sync_mode != WB_SYNC_ALL || wbc->for_sync)
 			return 0;
 
 		err = ext4_force_commit(inode->i_sb);
@@ -4453,7 +4458,11 @@ int ext4_write_inode(struct inode *inode, struct writeback_control *wbc)
 		err = __ext4_get_inode_loc(inode, &iloc, 0);
 		if (err)
 			return err;
-		if (wbc->sync_mode == WB_SYNC_ALL)
+		/*
+		 * sync(2) will flush the whole buffer cache. No need to do
+		 * it here separately for each inode.
+		 */
+		if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync)
 			sync_dirty_buffer(iloc.bh);
 		if (buffer_req(iloc.bh) && !buffer_uptodate(iloc.bh)) {
 			EXT4_ERROR_INODE_BLOCK(inode, iloc.bh->b_blocknr,
--
1.8.1.4