From: Ric Wheeler <ric@emc.com>
To: David Chinner <dgc@sgi.com>
Cc: Josef Bacik <jbacik@redhat.com>, "Theodore Ts'o" <tytso@mit.edu>,
adilger@sun.com, jack@ucw.cz, "Feld, Andy" <Feld_Andy@emc.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: background on the ext3 batching performance issue
Date: Fri, 29 Feb 2008 09:52:56 -0500
Message-ID: <47C81C48.1030706@emc.com>
In-Reply-To: <20080228175422.GU155259@sgi.com>
David Chinner wrote:
> On Thu, Feb 28, 2008 at 08:09:57AM -0500, Ric Wheeler wrote:
>> One more thought - what we really want here is to have a sense of the
>> latency of the device. In the S-ATA disk case, this optimization works
>> well for batching since we "spend" an extra 4ms worst case in the chance
>> of combining multiple, slow 18ms operations.
>>
>> With the clariion box we tested, the optimization fails badly since the
>> cost is only 1.3 ms so we optimize by waiting 3-4 times longer than it
>> would take to do the operation immediately.
>>
>> This has also seemed to me to be the same problem that IO
>> schedulers face with plugging - we want to dynamically figure out when to
>> plug and unplug here without hard-coding device-specific tunings.
>>
>> If we bypass the snippet for multi-threaded writers, we would probably
>> slow down this workload on normal S-ATA/ATA drives (or even higher
>> performance non-RAID disks).
>
> It's the self-tuning aspect of this problem that makes it hard. In
> the case of XFS, the way this tuning is done is that we look at the
> state of the previous log I/O buffer to check if it is still syncing
> to disk. If it is still syncing to disk, we go to sleep waiting for that log
> buffer I/O to complete. This holds the current buffer open to
> aggregate more transactions before syncing it to disk and hence
> allows parallel fsyncs to be issued in the one log write. The fact
> that it waits for the previous log I/O to complete means it
> self-tunes to the latency of the underlying storage medium.....
>
> Cheers,
>
> Dave.
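As a rough illustration of the scheme described above (not the XFS code itself), here is a toy model in which a log buffer stays open for as long as the previous log write is in flight, and every commit arriving in that window joins the same write. The function name, the millisecond units and the single-device assumption are all made up for this sketch.

```python
def batch_log_writes(commit_times_ms, device_latency_ms):
    """Group commit requests into log writes (toy model, one device).

    The current buffer is held open while the previous log write is
    still in flight, so every commit that arrives in that window is
    aggregated into the next write. Returns a list of
    (start_time_ms, commits_in_write) tuples.
    """
    writes = []
    device_free_at = 0.0  # time at which the previous log write completes
    pending = sorted(commit_times_ms)
    i = 0
    while i < len(pending):
        # A write starts once it has at least one commit and the
        # previous log I/O has completed, whichever happens later.
        start = max(pending[i], device_free_at)
        # Every commit that has arrived by then shares this write.
        j = i
        while j < len(pending) and pending[j] <= start:
            j += 1
        writes.append((start, j - i))
        device_free_at = start + device_latency_ms
        i = j
    return writes


# Hypothetical workload: one fsync per millisecond for 20 ms.
commits = list(range(20))
slow = batch_log_writes(commits, 18)    # latency figure quoted for S-ATA
fast = batch_log_writes(commits, 1.3)   # latency figure quoted for the clariion
```

There is no wait constant anywhere in the model: on the slow device the batches grow because the previous write takes longer to finish, and on the fast device they shrink to one or two commits per write.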
This sounds like a really clean way to self-tune without having any hard-coded
assumptions (like the current 1HZ wait)...
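To put rough numbers on why a fixed wait is a problem, here is a back-of-the-envelope model using only the figures quoted above (4 ms worst-case wait, 18 ms per S-ATA operation, 1.3 ms on the clariion). The function names and the assumption that one batch amortizes the wait evenly over its writers are illustrative, not measurements.

```python
def per_fsync_cost_ms(device_latency_ms, wait_ms, writers_batched):
    """Amortized cost per fsync when `writers_batched` commits share
    one write that was delayed by `wait_ms` (simplified model)."""
    return (wait_ms + device_latency_ms) / writers_batched


def break_even_writers(device_latency_ms, wait_ms):
    """Batch size at which waiting costs the same per fsync as
    issuing each write immediately with no wait."""
    return (wait_ms + device_latency_ms) / device_latency_ms


WAIT_MS = 4.0  # worst-case extra wait quoted in the thread

sata = break_even_writers(18.0, WAIT_MS)      # about 1.2 writers
clariion = break_even_writers(1.3, WAIT_MS)   # about 4.1 writers
```

Under this model the wait pays for itself on the S-ATA disk as soon as two writers are combined, while on the clariion more than four writers have to land in the same batch before waiting beats writing immediately.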
ric