From: Jan Kara <jack@suse.cz>
To: Chris Mason <chris.mason@oracle.com>
Cc: Ric Wheeler <ric@emc.com>, Josef Bacik <jbacik@redhat.com>,
Theodore Ts'o <tytso@mit.edu>,
adilger@sun.com, David Chinner <dgc@sgi.com>,
"Feld, Andy" <Feld_Andy@emc.com>,
linux-fsdevel@vger.kernel.org
Subject: Re: background on the ext3 batching performance issue
Date: Thu, 28 Feb 2008 19:15:00 +0100
Message-ID: <20080228181500.GA1738@duck.suse.cz>
In-Reply-To: <200802281235.18226.chris.mason@oracle.com>
On Thu 28-02-08 12:35:17, Chris Mason wrote:
> On Thursday 28 February 2008, Jan Kara wrote:
> > > On Thursday 28 February 2008, Ric Wheeler wrote:
> > >
> > > [ fsync batching can be slow ]
> > >
> > > > One more thought - what we really want here is to have a sense of the
> > > > latency of the device. In the S-ATA disk case, this optimization works
> > > > well for batching since we "spend" an extra 4ms worst case in the
> > > > chance of combining multiple, slow 18ms operations.
> > > >
> > > > With the clariion box we tested, the optimization fails badly since the
> > > > cost is only 1.3 ms so we optimize by waiting 3-4 times longer than it
> > > > would take to do the operation immediately.
> > > >
> > > > This has also seemed to me to be the same problem that IO
> > > > schedulers face with plugging - we want to dynamically figure out when to
> > > > plug and unplug here without hard coding in device specific tunings.
> > > >
> > > > If we bypass the snippet for multi-threaded writers, we would probably
> > > > slow down this workload on normal S-ATA/ATA drives (or even higher
> > > > performance non-RAID disks).
> > >
> > > It probably makes sense to keep track of the average number of writers we
> > > are able to gather into a transaction. There are lots of similar
> > > workloads where we have a pool of procs doing fsyncs and the size of the
> > > transaction or the number of times we joined a running transaction will
> > > be fairly constant.
> >
> > I'm probably missing something, but what are you trying to say? Either we
> > wait for writers and the number of writes is higher, or we don't wait and
> > the number of writes in a transaction is lower...
>
> The common workload would be N mail server threads servicing incoming requests
> at a fairly constant rate. Right now we sleep for a bit and wait for the
> number of writers to increase.
>
> My guess is that if we record the average number of times a writer joins an
> existing transaction, or if we record the average size of the transactions,
> we'll end up with a fairly constant number.
>
> So, we can skip the sleep if the transaction has already grown close to that
> number. This would avoid the latencies Ric is seeing.
OK, I see. Interesting idea, but in Ric's case, you'd find out that two
writers always joined the transaction and so you'd always wait for them and
nothing changes, does it?
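
To make the concern concrete, here is a rough toy model of the proposed heuristic, not ext3/JBD code. It assumes a simple moving average of writers per transaction; the names BatchStats, simulate and weight are made up for illustration. It also assumes each fsync starts with a single writer, and that any further writers show up only while we sleep.

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    """Running average of how many writers end up in each transaction."""
    avg_writers: float = 0.0
    weight: float = 0.25  # smoothing factor, an arbitrary assumption

    def record(self, writers: int) -> None:
        # Exponential moving average; the first sample seeds the average.
        if self.avg_writers == 0.0:
            self.avg_writers = float(writers)
        else:
            self.avg_writers += self.weight * (writers - self.avg_writers)

    def should_sleep(self, writers_now: int) -> bool:
        # Skip the sleep once the transaction is already as large as usual.
        return writers_now < self.avg_writers


def simulate(joiners_during_sleep: int, commits: int) -> int:
    """Return how many of `commits` commits still sleep.

    Every commit starts with one writer; `joiners_during_sleep` more
    writers join only if we decide to sleep (assumed workload model).
    """
    stats = BatchStats()
    sleeps = 0
    for _ in range(commits):
        writers = 1
        # With no history yet, fall back to the current behaviour: sleep.
        if stats.avg_writers == 0.0 or stats.should_sleep(writers):
            sleeps += 1
            writers += joiners_during_sleep
        stats.record(writers)
    return sleeps


if __name__ == "__main__":
    print("one joiner per sleep:", simulate(1, 100), "sleeps out of 100")
    print("no joiners:", simulate(0, 100), "sleeps out of 100")
```

Under these assumptions, when one extra writer joins during each sleep the average settles at two, a lone writer is always below it, and every commit keeps sleeping. Only when nobody ever joins does the average drop to one and the sleep get skipped.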
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR