From: Shaohua Li <shli@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk, shli@fusionio.com
Subject: Re: [patch 1/4] raid1: directly dispatch write request if no bitmap
Date: Thu, 24 May 2012 10:54:25 +0800
Message-ID: <20120524025425.GA1190@kernel.org>
In-Reply-To: <20120524122112.05b4d6e3@notabene.brown>

On Thu, May 24, 2012 at 12:21:12PM +1000, NeilBrown wrote:
> On Wed, 23 May 2012 15:26:20 +0800 Shaohua Li <shli@kernel.org> wrote:
> 
> > In raid1, all write requests are dispatched by the raid1d thread. On fast
> > storage, the raid1d thread is a bottleneck because it dispatches requests too
> > slowly. The raid1d thread also migrates freely, so the request completion CPU
> > doesn't match the submission CPU even when the driver/block layer supports
> > such affinity. This causes poor cache behavior.
> > 
> > If there is no bitmap, there is no point in queuing a bio to a thread and
> > dispatching it there. Dispatching the bio directly doesn't affect correctness
> > and removes the above bottleneck.
> > 
> > Having multiple threads dispatch requests could potentially reduce request
> > merging and increase lock contention. For slow storage, we mainly worry about
> > request merging. The caller of .make_request should already have the correct
> > block plug set, which takes care of request merging and locking just as when
> > accessing the raw device, so we don't need to worry about this too much.
> > 
> > In a 4k randwrite test with a 2-disk setup, the patch below gives a 20% ~ 50%
> > performance improvement depending on NUMA binding.
> > 
> > Signed-off-by: Shaohua Li <shli@fusionio.com>
> > ---
> >  drivers/md/raid1.c |   11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > Index: linux/drivers/md/raid1.c
> > ===================================================================
> > --- linux.orig/drivers/md/raid1.c	2012-05-22 13:50:26.989820654 +0800
> > +++ linux/drivers/md/raid1.c	2012-05-22 13:56:46.117054559 +0800
> > @@ -1187,10 +1187,13 @@ read_again:
> >  		mbio->bi_private = r1_bio;
> >  
> >  		atomic_inc(&r1_bio->remaining);
> > -		spin_lock_irqsave(&conf->device_lock, flags);
> > -		bio_list_add(&conf->pending_bio_list, mbio);
> > -		conf->pending_count++;
> > -		spin_unlock_irqrestore(&conf->device_lock, flags);
> > +		if (bitmap) {
> > +			spin_lock_irqsave(&conf->device_lock, flags);
> > +			bio_list_add(&conf->pending_bio_list, mbio);
> > +			conf->pending_count++;
> > +			spin_unlock_irqrestore(&conf->device_lock, flags);
> > +		} else
> > +			generic_make_request(mbio);
> >  	}
> >  	/* Mustn't call r1_bio_write_done before this next test,
> >  	 * as it could result in the bio being freed.
> 
> This looks like it should be 'obviously correct' but unfortunately it isn't.
> 
> If this raid1 is beneath a dm device (e.g. LVM), then generic_make_request
> will queue the request internally and not actually start it.
> In particular this means that the cloned bio has no chance of being released
> before the next clone_bio call, which is made the next time around the loop.
> 
> This can lead to a deadlock as the clone_bio() might be waiting for that
> first cloned bio to be released (if memory is really tight).
> 
> i.e. when allocating multiple bios from a mempool, we need to arrange for
> them to be submitted by a separate thread.
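
If I understand correctly, the scenario you describe is roughly the
following (a simplified sketch only; the names mirror the raid1 code
but the details are elided):

	for (i = 0; i < raid_disks; i++) {
		/*
		 * Allocates from a mempool; can block until an
		 * earlier clone is freed.
		 */
		mbio = bio_clone_mddev(bio, GFP_NOIO, mddev);
		...
		/*
		 * Beneath dm, this only queues the bio on
		 * current->bio_list and returns; nothing is
		 * dispatched until the dm layer unwinds.
		 */
		generic_make_request(mbio);
	}

i.e. if the mempool is exhausted, the second bio_clone_mddev() waits
for the first mbio to be freed, while that mbio still sits
undispatched on current->bio_list.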

If there is no block plug, generic_make_request will dispatch the request
directly. If there is one, then when clone_bio() waits for memory it will
sleep/schedule, which triggers a block unplug and dispatches the first bio, so
I see no deadlock here (see the sketch below). Am I misunderstanding you?
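
To illustrate the plugging behaviour I am relying on (again a
simplified sketch; in reality the plug is set up by the caller of
.make_request rather than here, and the names just mirror the raid1
code):

	struct blk_plug plug;

	blk_start_plug(&plug);
	for (i = 0; i < raid_disks; i++) {
		/* may sleep waiting for mempool space */
		mbio = bio_clone_mddev(bio, GFP_NOIO, mddev);
		generic_make_request(mbio);
	}
	blk_finish_plug(&plug);

When bio_clone_mddev() sleeps waiting for the mempool, schedule()
calls blk_schedule_flush_plug(), which flushes the plug, so the bios
submitted earlier are dispatched and can complete and refill the
mempool.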

Thanks,
Shaohua

Thread overview: 14+ messages
2012-05-23  7:26 [patch 0/4] MD: improve raid1/10 write performance for fast storage Shaohua Li
2012-05-23  7:26 ` [patch 1/4] raid1: directly dispatch write request if no bitmap Shaohua Li
2012-05-24  2:21   ` NeilBrown
2012-05-24  2:54     ` Shaohua Li [this message]
2012-05-24  3:09       ` NeilBrown
2012-05-24  3:31         ` Shaohua Li
2012-05-24  3:51           ` NeilBrown
2012-05-23  7:26 ` [patch 2/4] raid1: percpu dispatch for write request if bitmap supported Shaohua Li
2012-05-24  3:34   ` NeilBrown
2012-05-24  5:17     ` Shaohua Li
2012-05-24  5:34       ` NeilBrown
2012-05-24  5:50         ` Shaohua Li
2012-05-23  7:26 ` [patch 3/4] raid10: directly dispatch write request if no bitmap Shaohua Li
2012-05-23  7:26 ` [patch 4/4] raid10: percpu dispatch for write request if bitmap supported Shaohua Li
