From: Jens Axboe <jens.axboe@oracle.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Linux Kernel Developers List <linux-kernel@vger.kernel.org>,
Ext4 Developers List <linux-ext4@vger.kernel.org>
Subject: Re: [GIT PULL] Ext3 latency fixes
Date: Mon, 6 Apr 2009 08:15:59 +0200 [thread overview]
Message-ID: <20090406061558.GL5178@kernel.dk> (raw)
In-Reply-To: <20090404232222.GA7480@mit.edu>
On Sat, Apr 04 2009, Theodore Tso wrote:
> On Sat, Apr 04, 2009 at 08:01:08PM +0200, Jens Axboe wrote:
> >
> > Unless you make all journal writes sync, ext3 fsync will always suck big
> > time. But I get your point.
> >
>
> We have already made all other journal writes sync when triggered by
> fsync() --- except for the commit record which is being written by
> sync_dirty_buffer(). I've verified this via blktrace.
>
> However, you're right. We do have an performance regression using the
> regression test provided in commit 78f707bf. Taking the average of at
> least 5 runs, I found:
>
> stock 2.6.29 1687 ms
> 2.6.29 w/ ext3-latency-fixes 7337 ms
> 2.6.29 w/ full-latency-fixes 13722 ms
>
> "ext3-latency-fixes" are the patches which Linus has already pulled
> into the 2.6.29 tree. "full-latency-fixes" are ext3-latency-fixes
> plus the sync_dirty_buffes() patch.
Ugh yes, not everyone will appreciate the better latency and for them an
8x slowdown is veeery painful.
> Looking at blktrace of stock 2.6.29 running the sqlite performance
> measurement, we see this:
>
> 254,2 1 13 0.005120554 7183 Q W 199720 + 8 [sqlite-fsync-te]
> 254,2 1 15 0.005666573 6947 Q W 10277200 + 8 [kjournald]
> 254,2 1 16 0.005691576 6947 Q W 10277208 + 8 [kjournald]
> 254,2 1 18 0.006145684 6947 Q W 10277216 + 8 [kjournald]
> 254,2 1 21 0.006685348 7183 Q W 199720 + 8 [sqlite-fsync-te]
> 254,2 1 24 0.007162644 6947 Q W 10277224 + 8 [kjournald]
> 254,2 1 25 0.007187857 6947 Q W 10277232 + 8 [kjournald]
> 254,2 0 27 0.007616473 6947 Q W 10277240 + 8 [kjournald]
>
> Looking at a blktrace of 2.6.29 plus the full-latency-fixes, we see this:
>
> 254,2 0 13 0.013744556 7205 Q WS 199208 + 8 [sqlite-fsync-te]
> 254,2 0 16 0.019270748 6965 Q WS 10301896 + 8 [kjournald]
> 254,2 0 17 0.019400024 6965 Q WS 10301904 + 8 [kjournald]
> 254,2 1 23 0.019892824 6965 Q WS 10301912 + 8 [kjournald]
> 254,2 0 20 0.020450995 7205 Q WS 199208 + 8 [sqlite-fsync-te]
> 254,2 1 26 0.025954909 6965 Q WS 10301920 + 8 [kjournald]
> 254,2 1 27 0.026084814 6965 Q WS 10301928 + 8 [kjournald]
> 254,2 0 24 0.026612884 6965 Q WS 10301936 + 8 [kjournald]
>
> Looking at the deltas between the two iterations of the sqlite
> analogue, we see that stock 2.6.29 is 1.56ms, where as with the full
> latency fixes, it's 6.70ms.
It would be interesting with the full set of information, like unplug
and merge. The above definitely shows longer between queue-to-queue, I
wonder what that is due to.
> However, the full latency fixes all the writes are synchronous, so it
> must be the case that the delays are caused by the fact that queue is
> getting implicitly unplugged after the synchronous write, and the
> problem is no longer the mixing of WRITE and WRITE_SYNC requests as
> posted in the commit log for 78f707bf. If we remove the automatic
> unplug for WRITE_SYNC requests, and add an explicit unplug where it is
> needed, that should fix the performance regression for this particular
> sqlite test case. (Which isn't a throughput issue, since the test is
> basically fopen/write/fsync/fclose over and over again, with no
> background load.)
We definitely should add a WRITE_SYNC variant that doesn't unplug, for
the cases where you know you are going to submit more than one IO before
waiting. Whether that should just be WRITE_SYNC or we should add a
WRITE_SYNC_UNPLUG, I don't have a really strong preference either way.
But if the delays are due to premature unplugging, it should be visible
in the trace. Care to send the "full" thing?
> The scenario which Linus and I had been focused on is one where there
> was a heavy background load writing asynchronously. We do want to
> make sure that a series of fsync() calls in a tight loop is also
> reasonable as well, so Jens, I do think you are absolutely right that
> this is something that we need to pay attention to.
Most definitely!
> I had missed the commit 78f707bf when you originally submitted it, so
> I didn't do this test before submitting the patch. And I guess you
> had missed my patch proposal to LKML, and I didn't think to explicitly
> CC you on my patches. Apologies for the communication faults, but
> hopefully we can fix this performance issue for both cases and get
> these problems behind us.
We still have a long cycle ahead of us, so I'm sure we can get it nailed
down before release.
--
Jens Axboe
next prev parent reply other threads:[~2009-04-06 6:16 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-03 7:01 [GIT PULL] Ext3 latency fixes Theodore Ts'o
2009-04-03 7:01 ` [PATCH 1/4] block_write_full_page: Use synchronous writes for WBC_SYNC_ALL writebacks Theodore Ts'o
2009-04-03 7:01 ` [PATCH 2/4] ext3: Use WRITE_SYNC for commits which are caused by fsync() Theodore Ts'o
2009-04-03 7:01 ` [PATCH 3/4] ext3: Add replace-on-truncate hueristics for data=writeback mode Theodore Ts'o
2009-04-03 7:01 ` [PATCH 4/4] ext3: Add replace-on-rename " Theodore Ts'o
2009-04-03 15:01 ` EXT4 in embedded systems Nick Hennenfent (nhennefe)
2009-04-03 16:06 ` Eric Sandeen
2009-04-03 17:15 ` Nick Hennenfent (nhennefe)
2009-04-03 18:24 ` [GIT PULL] Ext3 latency fixes Linus Torvalds
2009-04-03 18:47 ` Jens Axboe
2009-04-03 19:13 ` Theodore Tso
2009-04-03 21:01 ` Chris Mason
2009-04-03 19:02 ` Linus Torvalds
2009-04-03 20:41 ` Linus Torvalds
2009-04-04 13:57 ` Theodore Tso
2009-04-04 15:16 ` Jens Axboe
2009-04-04 15:57 ` Linus Torvalds
2009-04-04 16:06 ` Linus Torvalds
2009-04-04 17:36 ` Jens Axboe
2009-04-04 17:34 ` Jens Axboe
2009-04-04 17:44 ` Linus Torvalds
2009-04-04 18:00 ` Trenton D. Adams
2009-04-04 18:01 ` Jens Axboe
2009-04-04 18:10 ` Linus Torvalds
2009-04-04 23:22 ` Theodore Tso
2009-04-04 23:33 ` Arjan van de Ven
2009-04-05 0:10 ` Theodore Tso
2009-04-05 15:05 ` Arjan van de Ven
2009-04-05 17:01 ` Linus Torvalds
2009-04-05 17:15 ` Mark Lord
2009-04-05 20:57 ` Jeff Garzik
2009-04-05 23:48 ` Arjan van de Ven
2009-04-06 2:32 ` Mark Lord
2009-04-06 5:47 ` Jeff Garzik
2009-04-07 18:18 ` Linus Torvalds
2009-04-07 18:22 ` Linus Torvalds
2009-04-06 8:13 ` Jens Axboe
2009-04-05 18:56 ` Arjan van de Ven
2009-04-05 19:34 ` Linus Torvalds
2009-04-05 20:06 ` Arjan van de Ven
2009-04-06 6:25 ` Jens Axboe
2009-04-06 6:05 ` Theodore Tso
2009-04-06 6:23 ` Jens Axboe
2009-04-06 8:16 ` Jens Axboe
2009-04-06 14:48 ` Linus Torvalds
2009-04-06 15:09 ` Jens Axboe
2009-04-06 6:15 ` Jens Axboe [this message]
2009-04-04 20:18 ` Ingo Molnar
2009-04-06 21:50 ` Lennart Sorensen
2009-04-07 13:31 ` Mark Lord
2009-04-07 14:48 ` Lennart Sorensen
2009-04-07 19:21 ` Mark Lord
2009-04-07 19:57 ` Lennart Sorensen
2009-04-04 20:56 ` Arjan van de Ven
2009-04-06 7:06 ` Jens Axboe
2009-04-07 15:39 ` Indan Zupancic
2009-04-04 19:18 ` Theodore Tso
2009-04-06 8:12 ` Jens Axboe
2009-04-04 22:13 ` Linus Torvalds
2009-04-04 22:19 ` Linus Torvalds
2009-04-05 0:20 ` Theodore Tso
2009-04-03 19:54 ` Theodore Tso
-- strict thread matches above, loose matches on Subject: below --
2009-04-08 23:40 Theodore Ts'o
2009-04-09 15:49 ` Linus Torvalds
2009-04-09 16:23 ` Chris Mason
2009-04-09 17:49 ` Jan Kara
2009-04-09 18:10 ` Chris Mason
2009-04-09 19:04 ` Jan Kara
2009-04-09 17:36 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090406061558.GL5178@kernel.dk \
--to=jens.axboe@oracle.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).