From: Ric Wheeler <rwheeler@redhat.com>
To: Josef Bacik <jbacik@redhat.com>, jens.axboe@oracle.com
Cc: linux-ext4@vger.kernel.org
Subject: Re: transaction batching performance & multi-threaded synchronous writers
Date: Mon, 14 Jul 2008 13:26:01 -0400
Message-ID: <487B8C29.3000908@redhat.com>
In-Reply-To: <20080714165858.GA10268@unused.rdu.redhat.com>

Josef Bacik wrote:
> On Mon, Jul 14, 2008 at 12:15:23PM -0400, Ric Wheeler wrote:
>   
>> Here is a pointer to the older patch & some results:
>>
>> http://www.spinics.net/lists/linux-fsdevel/msg13121.html
>>
>> I will retry this on some updated kernels, but would not expect to see a 
>> difference since the code has not been changed ;-)
>>
>>     
>
> I've been thinking, the problem with this for slower disks is that with the
> patch I provided we're not really allowing multiple things to be batched, since
> one thread will come up, do the sync and wait for the sync to finish.  In the
> meantime the next thread will come up and do the log_wait_commit() in order to
> let more threads join the transaction, but in the case of fs_mark with only 2
> threads there won't be another one, since the original is waiting for the log to
> commit.  So when the log finishes committing, thread 1 gets woken up to do its
> thing, and thread 2 gets woken up as well, it does its commit and waits for it
> to finish, and thread 1 comes in and gets stuck in log_wait_commit().  So this
> essentially kills the optimization, which is why on faster disks this makes
> everything go better, as the faster disks don't need the original optimization.
>
> So this is what I was thinking about.  Perhaps we track the average time a
> commit takes, and then if the current transaction has been running for less
> than the average commit time we sleep and wait for more things to join the
> transaction, and then we commit.  How does that idea sound?  Thanks,
>
> Josef
>   
I think that this is moving in the right direction. If you think about
it, we are basically trying to do the same kind of thing that the IO
scheduler does: anticipate future requests and plug the file system
level queue for a reasonable bit of time. The problem space is very
similar: devices of widely varying speeds and a need to self-tune the
batching dynamically.

It would be great to be able to share the approach (if not the actual 
code) ;-)
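
To make the averaging idea above concrete, here is a rough user-space
sketch in Python. It is not journal code and not the patch under
discussion. The class name, the method names, and the smoothing weight
are all made up for illustration, and the use of a weighted running
average is an assumption; the thread only says "track the average time
a commit takes".

```python
from dataclasses import dataclass


@dataclass
class CommitBatcher:
    """Toy model of the batching heuristic; every name here is hypothetical."""

    avg_commit_ns: float = 0.0  # running average of observed commit durations
    weight: float = 0.25        # assumed smoothing factor, not from the thread

    def record_commit(self, duration_ns: float) -> None:
        # Fold the duration of a finished commit into the running average.
        if self.avg_commit_ns == 0.0:
            self.avg_commit_ns = duration_ns
        else:
            self.avg_commit_ns += self.weight * (duration_ns - self.avg_commit_ns)

    def should_wait(self, transaction_age_ns: float) -> bool:
        # A transaction younger than one average commit is held open so
        # that other writers can join it; an older one is committed now.
        return transaction_age_ns < self.avg_commit_ns


# Same 1 ms old transaction, two devices of different speed.
fast_disk = CommitBatcher()
fast_disk.record_commit(100_000)      # commits take about 0.1 ms
slow_disk = CommitBatcher()
slow_disk.record_commit(10_000_000)   # commits take about 10 ms

print(fast_disk.should_wait(1_000_000))  # commit right away
print(slow_disk.should_wait(1_000_000))  # hold the transaction open
```

The point of the sketch is only that the wait decision follows the
measured commit time, so a fast device stops batching on its own while
a slow one keeps the transaction open. How long to sleep, and any upper
bound on the wait, are left out on purpose.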

ric


