From: Shaohua Li <shli@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: NeilBrown <neilb@suse.de>,
	linux-raid@vger.kernel.org, axboe@kernel.dk, shli@fusionio.com
Subject: Re: [patch 8/8] raid5: create multiple threads to handle stripes
Date: Thu, 21 Jun 2012 18:09:55 +0800
Message-ID: <20120621100955.GA255@kernel.org>
In-Reply-To: <CAA9_cmc0C3V9SG49UpxW+GAhWo+O8+cnyac6OxROkiTeqcrRcA@mail.gmail.com>

On Tue, Jun 12, 2012 at 09:08:17PM -0700, Dan Williams wrote:
> On Wed, Jun 6, 2012 at 11:45 PM, Shaohua Li <shli@kernel.org> wrote:
> > On Thu, Jun 07, 2012 at 11:39:58AM +1000, NeilBrown wrote:
> >> On Mon, 04 Jun 2012 16:02:00 +0800 Shaohua Li <shli@kernel.org> wrote:
> >>
> >> > Like raid 1/10, raid5 uses one thread to handle stripes. On fast storage, the
> >> > thread becomes a bottleneck. raid5 can offload calculations like checksums to
> >> > async threads. But if storage is fast, scheduling and running async work
> >> > introduces heavy workqueue lock contention, which makes such an
> >> > optimization useless. And calculation isn't the only bottleneck. For example,
> >> > in my test the raid5 thread must handle > 450k requests per second. Just doing
> >> > dispatch and completion overwhelms the raid5 thread. The only chance to
> >> > scale is using several threads to handle stripes.
> >> >
> >> > With this patch, the user can create several extra threads to handle stripes.
> >> > The best number of threads depends on the number of disks, so the thread
> >> > number can be changed from userspace. By default, the thread number is 0,
> >> > which means no extra threads.
> >> >
> >> > In a 3-disk raid5 setup, 2 extra threads can provide a 130% throughput
> >> > improvement (with stripe_cache_size doubled), and the throughput is pretty
> >> > close to the theoretical value. With >=4 disks, the improvement is even
> >> > bigger, for example 200% for a 4-disk setup, but the throughput is far below
> >> > the theoretical value. This is caused by several factors such as request queue
> >> > lock contention, cache issues, and latency introduced by how a stripe is
> >> > handled across different disks. Those factors need further investigation.
> >> >
> >> > Signed-off-by: Shaohua Li <shli@fusionio.com>
> >>
> >> I think it is great that you have got RAID5 to the point where multiple
> >> threads improve performance.
> >> I really don't like the idea of having to configure that number of threads.
> >>
> >> It would be great if it would auto-configure.
> >> Maybe the main thread could fork aux threads when it notices a high load.
> >> e.g. if it has been servicing requests for more than 100ms without a break,
> >> and the number of threads is less than the number of CPUs, then it forks a new
> >> helper and resets the timer.
> >>
> >> If a thread has been idle for more than 30 minutes, it exits.
> >>
> >> Might that be reasonable?
> >
> > Yep, I bet this patch needs more discussion. Auto-configuration is preferred,
> > and your idea is worth doing. However, the concern is that with automatic
> > forking/killing of threads, the user can't do NUMA binding, which is important
> > for high-speed storage. Maybe have a reasonable default thread number, like
> > one thread per disk? This needs more investigation; I'm open to any suggestion
> > here.
> 
> The last time I looked at this the btrfs thread pool looked like a
> good candidate:
> 
>   http://marc.info/?l=linux-raid&m=126944260704907&w=2
> 
> ...have not looked if Tejun has made this available as a generic workqueue mode.

I tried creating an UNBOUND workqueue with max active set to the number of
CPUs, so each CPU handles one work item, and each work item handles 8 stripes.
The throughput is relatively OK, but CPU utilization is very high compared to
just creating 3 or 4 threads as the patch does. There is heavy contention on
the block queue_lock, since every CPU now dispatches requests. There are other
issues too, such as cache effects and more contention on the raid5 device_lock.
It appears that using too many threads to handle stripes isn't as good as
expected.

Thanks,
Shaohua
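
The batching arrangement described in the message above (a pool of workers,
each taking 8 stripes at a time from shared state guarded by one lock) can be
illustrated with a toy userspace model. This is only a sketch under stated
assumptions, not kernel code and not the patch itself: the names run_pool,
BATCH and pending are hypothetical, and a Python lock merely stands in for a
shared lock such as device_lock. It counts lock acquisitions rather than
measuring contention.

```python
import threading
from collections import deque

BATCH = 8  # stripes taken per lock acquisition (assumed, mirrors "8 stripes")


def run_pool(num_workers, num_stripes, batch=BATCH):
    """Drain num_stripes from one shared queue using num_workers threads.

    Returns (stripes_handled, lock_acquisitions).
    """
    pending = deque(range(num_stripes))
    lock = threading.Lock()  # stand-in for a single shared lock
    stats = {"handled": 0, "acquisitions": 0}

    def worker():
        while True:
            with lock:
                stats["acquisitions"] += 1
                count = min(batch, len(pending))
                grabbed = [pending.popleft() for _ in range(count)]
                stats["handled"] += len(grabbed)
            if not grabbed:
                return  # queue is empty, so this worker exits
            # Real stripe handling would happen here, outside the lock.

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["handled"], stats["acquisitions"]


if __name__ == "__main__":
    print(run_pool(4, 1000))           # batches of 8
    print(run_pool(4, 1000, batch=1))  # one stripe per lock acquisition
```

In this model the number of lock acquisitions is ceil(stripes / batch) plus one
final empty check per worker, so batching cuts how often the lock is taken, but
every additional worker is one more thread competing for the same lock. The
model says nothing about cache effects or queue_lock behaviour in the block
layer.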
