From: Chris Mason <chris.mason@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: Ric Wheeler <ric@emc.com>, Josef Bacik <jbacik@redhat.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	adilger@sun.com, David Chinner <dgc@sgi.com>,
	"Feld, Andy" <Feld_Andy@emc.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: background on the ext3 batching performance issue
Date: Thu, 28 Feb 2008 12:35:17 -0500
Message-ID: <200802281235.18226.chris.mason@oracle.com>
In-Reply-To: <20080228171314.GD25029@atrey.karlin.mff.cuni.cz>

On Thursday 28 February 2008, Jan Kara wrote:
> > On Thursday 28 February 2008, Ric Wheeler wrote:
> >
> > [ fsync batching can be slow ]
> >
> > > One more thought - what we really want here is to have a sense of the
> > > latency of the device. In the S-ATA disk case, this optimization works
> > > well for batching since we "spend" an extra 4ms worst case in the
> > > chance of combining multiple, slow 18ms operations.
> > >
> > > With the clariion box we tested, the optimization fails badly since the
> > > cost is only 1.3 ms so we optimize by waiting 3-4 times longer than it
> > > would take to do the operation immediately.
> > >
> > > This problem has also seemed to me to be the same problem that IO
> > > schedulers do with plugging - we want to dynamically figure out when to
> > > plug and unplug here without hard coding in device specific tunings.
> > >
> > > If we bypass the snippet for multi-threaded writers, we would probably
> > > slow down this workload on normal S-ATA/ATA drives (or even higher
> > > performance non-RAID disks).
> >
> > It probably makes sense to keep track of the average number of writers we
> > are able to gather into a transaction.  There are lots of similar
> > workloads where we have a pool of procs doing fsyncs and the size of the
> > transaction or the number of times we joined a running transaction will
> > be fairly constant.
>
>   I'm probably missing something, but what are you trying to say? Either we
> wait for writers and the number of writes is higher, or we don't wait and
> the number of writes in a transaction is lower...

The common workload would be N mail server threads servicing incoming requests 
at a fairly constant rate.  Right now we sleep for a bit and wait for the 
number of writers to increase.  

My guess is that if we record the average number of times a writer joins an 
existing transaction, or if we record the average size of the transactions, 
we'll end up with a fairly constant number.

So, we can skip the sleep if the transaction has already grown close to that 
number.  This would avoid the latencies Ric is seeing.
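
A minimal sketch of the heuristic described above, written as userspace
Python rather than jbd code. The class name, the exponentially weighted
average, the smoothing weight, and the 0.9 "close to" threshold are all
assumptions made for illustration; none of them come from ext3.

```python
from dataclasses import dataclass


@dataclass
class BatchingHeuristic:
    """Running average of how many writers end up in one transaction."""

    avg_writers: float = 1.0  # assumed starting estimate
    weight: float = 0.25      # assumed smoothing factor for the average
    threshold: float = 0.9    # assumed meaning of "close to" the average

    def should_sleep(self, current_writers: int) -> bool:
        # Sleep only while the transaction is still well below its usual
        # size; once it is close to the average, commit right away.
        return current_writers < self.threshold * self.avg_writers

    def record_commit(self, writers: int) -> None:
        # Fold the size of the transaction just committed into the average.
        self.avg_writers += self.weight * (writers - self.avg_writers)


if __name__ == "__main__":
    h = BatchingHeuristic()
    for _ in range(50):
        h.record_commit(8)   # steady pool of 8 fsync-ing threads
    print(h.should_sleep(3))  # transaction still small
    print(h.should_sleep(8))  # transaction already at its usual size
```

One caveat the sketch ignores: the observed average itself depends on
whether we slept, so a real implementation would have to deal with that
feedback.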

-chris
