From: Shaohua Li <shli@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk, shli@fusionio.com
Subject: Re: [patch 2/4] raid1: percpu dispatch for write request if bitmap supported
Date: Thu, 24 May 2012 13:50:15 +0800 [thread overview]
Message-ID: <20120524055015.GA5579@kernel.org> (raw)
In-Reply-To: <20120524153412.106d5a0c@notabene.brown>
On Thu, May 24, 2012 at 03:34:12PM +1000, NeilBrown wrote:
> On Thu, 24 May 2012 13:17:09 +0800 Shaohua Li <shli@kernel.org> wrote:
>
> > On Thu, May 24, 2012 at 01:34:02PM +1000, NeilBrown wrote:
> > > On Wed, 23 May 2012 15:26:21 +0800 Shaohua Li <shli@kernel.org> wrote:
> > >
> > > > In raid1, all write requests are dispatched by the raid1d thread. On fast storage,
> > > > the raid1d thread is a bottleneck because it dispatches requests too slowly. The
> > > > raid1d thread also migrates freely, so a request's completion CPU doesn't match
> > > > its submission CPU even when the driver/block layer supports that. This causes
> > > > bad cache behaviour.
> > > >
> > > > If bitmap support is enabled, write requests can only be dispatched after the dirty
> > > > bitmap is flushed out. Once the bitmap is flushed, how the write requests are
> > > > dispatched doesn't affect correctness, so a natural idea is to distribute the
> > > > dispatch across several threads. With this patch, requests are first added to a
> > > > percpu list; after the bitmap is flushed, the requests on the percpu lists are
> > > > dispatched from a workqueue. This removes the bottleneck described above.
> > > >
> > > > In a 4k random-write test on a two-disk setup, the patch below gives a 10% ~ 50%
> > > > performance improvement depending on NUMA binding.
> > >
> > > Those numbers are quite impressive so there is certainly room for improvement
> > > here. I'm not sure that I'm entirely comfortable with the approach though.
> > >
> > > Passing the request to a per-cpu thread does make sense, but a generic
> > > per-cpu thread feels dangerous as we don't know what else might be queued to
> > > that thread, and because of the potential for deadlocks between memory
> > > allocation and generic_make_request (as mentioned in a previous email) I find
> > > it hard to convince myself that this approach is entirely safe.
> >
> > No problem, we can use a separate workqueue.
>
> A separate set of per-cpu workqueues for each md array doesn't sound like a
> good idea to me - if we can possibly avoid it.
It can be a separate workqueue shared by all md arrays.
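Roughly something like this (untested sketch; all names are invented, and
initialisation and locking are omitted):

/* One workqueue shared by all md arrays, created once at module init. */
static struct workqueue_struct *md_write_wq;

struct md_percpu_writes {
	struct bio_list		pending;	/* writes queued on this CPU */
	struct work_struct	work;
};
static DEFINE_PER_CPU(struct md_percpu_writes, md_percpu_writes);

static void md_dispatch_percpu_writes(struct work_struct *work)
{
	struct md_percpu_writes *pw =
		container_of(work, struct md_percpu_writes, work);
	struct bio *bio;

	/* The bitmap was flushed before this work was queued, so the
	 * order of dispatch no longer matters for correctness. */
	while ((bio = bio_list_pop(&pw->pending)))
		generic_make_request(bio);
}

/* Called after bitmap_unplug() has flushed the dirty bitmap. */
static void md_flush_percpu_writes(void)
{
	int cpu;

	for_each_online_cpu(cpu)
		queue_work_on(cpu, md_write_wq,
			      &per_cpu(md_percpu_writes, cpu).work);
}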
> > > I wonder if we might take a very different approach and try to do everything
> > > in the one process. i.e. don't hand tasks off to other threads at all - at
> > > least in the common case.
> > > So we could change plugger_unplug (in md.c) so that:
> > >
> > > - if current->bio_list is NULL, (meaning all requests have been submitted
> > > and there is no risk of deadlock) we call bitmap_unplug and then submit
> > > all the queued writes.
> > > - if current->bio_list is not NULL, then we just wakeup the md thread to
> > > do the work.
> >
> > The current->bio_list check does make sense. I'm going to add the check to
> > patches 1 and 3.
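(To be concrete, for patches 1 and 3 I mean something along these lines in
make_request() -- an untested fragment, details from memory:)

	if (!mddev->bitmap && !current->bio_list) {
		/* No bitmap to flush and we are not nested inside another
		 * make_request call, so dispatching directly is safe. */
		generic_make_request(mbio);
	} else {
		/* Queue for raid1d as before. */
		spin_lock_irqsave(&conf->device_lock, flags);
		bio_list_add(&conf->pending_bio_list, mbio);
		conf->pending_count++;
		spin_unlock_irqrestore(&conf->device_lock, flags);
		md_wakeup_thread(mddev->thread);
	}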
> >
> > But I believe we can't call generic_make_request in unplug. For example,
> > schedule()->unplug->generic_make_request->get_request_wait()->schedule(). At
>
> My idea was to only submit the queued writes from a normal call to
> blk_finish_plug. Calls that come from schedule would use the current
> approach of signalling the md thread to take over. There must be a way to
> tell the difference - maybe just a new flag to the unplug callback.
I think it would be the same. For example, in a blk_finish_plug unplug the
plug callback may generate several bios, but for the first bio
get_request_wait() sleeps and flushes the plug list again; the later bios
don't actually get a chance to be queued until the next time the task runs.
The unplug then loses its meaning.
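To make the concern concrete (pseudo-kernel code: the from_schedule flag is
the new flag you mention, the queued-write helpers are invented, and the
plug_cnt/kfree handling of the real plugger_unplug is left out):

static void plugger_unplug(struct blk_plug_cb *cb, bool from_schedule)
{
	struct mddev *mddev = container_of(cb, struct md_plug_cb, cb)->mddev;

	if (from_schedule || current->bio_list) {
		/* Called from schedule() or nested inside
		 * generic_make_request(): punt to the md thread as today. */
		md_wakeup_thread(mddev->thread);
		return;
	}

	bitmap_unplug(mddev->bitmap);
	while (md_has_queued_writes(mddev)) {			/* invented helper */
		struct bio *bio = md_pop_queued_write(mddev);	/* invented helper */
		/*
		 * Inside this call get_request_wait() can sleep; schedule()
		 * then flushes the plug list again, and the remaining bios
		 * are only submitted the next time this task runs.
		 */
		generic_make_request(bio);
	}
}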
Thanks,
Shaohua
Thread overview: 14+ messages
2012-05-23 7:26 [patch 0/4] MD: improve raid1/10 write performance for fast storage Shaohua Li
2012-05-23 7:26 ` [patch 1/4] raid1: directly dispatch write request if no bitmap Shaohua Li
2012-05-24 2:21 ` NeilBrown
2012-05-24 2:54 ` Shaohua Li
2012-05-24 3:09 ` NeilBrown
2012-05-24 3:31 ` Shaohua Li
2012-05-24 3:51 ` NeilBrown
2012-05-23 7:26 ` [patch 2/4] raid1: percpu dispatch for write request if bitmap supported Shaohua Li
2012-05-24 3:34 ` NeilBrown
2012-05-24 5:17 ` Shaohua Li
2012-05-24 5:34 ` NeilBrown
2012-05-24 5:50 ` Shaohua Li [this message]
2012-05-23 7:26 ` [patch 3/4] raid10: directly dispatch write request if no bitmap Shaohua Li
2012-05-23 7:26 ` [patch 4/4] raid10: percpu dispatch for write request if bitmap supported Shaohua Li