From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk,
dan.j.williams@intel.com, shli@fusionio.com
Subject: Re: [patch 8/8] raid5: create multiple threads to handle stripes
Date: Thu, 7 Jun 2012 11:39:58 +1000
Message-ID: <20120607113958.04b9d5b8@notabene.brown>
In-Reply-To: <20120604080354.686050081@kernel.org>
On Mon, 04 Jun 2012 16:02:00 +0800 Shaohua Li <shli@kernel.org> wrote:
> Like raid1/raid10, raid5 uses a single thread to handle stripes. On fast
> storage, that thread becomes a bottleneck. raid5 can offload calculations
> like checksumming to async threads, but if the storage is fast, scheduling
> and running the async work introduces heavy lock contention in the
> workqueue, which makes that optimization useless. And calculation isn't the
> only bottleneck: in my test the raid5 thread must handle > 450k requests
> per second, and dispatch and completion alone are enough to saturate it.
> The only way to scale is to use several threads to handle stripes.
>
> With this patch, the user can create several extra threads to handle
> stripes. The best number of threads depends on the number of disks, so the
> thread count can be changed from userspace. By default it is 0, meaning no
> extra threads.
>
> In a 3-disk raid5 setup, 2 extra threads give a 130% throughput improvement
> (with stripe_cache_size doubled), and the throughput is pretty close to the
> theoretical value. With >= 4 disks the improvement is even bigger, for
> example 200% for a 4-disk setup, but the throughput is far below the
> theoretical value. That is caused by several factors: request-queue lock
> contention, cache issues, and the latency introduced by how a stripe is
> handled across the different disks. Those factors need further
> investigation.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
I think it is great that you have got RAID5 to the point where multiple
threads improve performance.
I really don't like the idea of having to configure that number of threads.
It would be great if it would auto-configure.
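(With the patch as posted, I gather tuning it means something like

    echo 2 > /sys/block/md0/md/auxthread_number

-- path assumed from where the other raid5 sysfs attributes live, I
haven't tested it -- which is exactly the manual step I'd like to avoid.)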
Maybe the main thread could fork aux threads when it notices a high load:
e.g. if it has been servicing requests for more than 100ms without a break,
and the number of threads is less than the number of CPUs, it forks a new
helper and resets the timer.
If a thread has been idle for more than 30 minutes, it exits.
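In rough code, something like this -- only a sketch: busy_since,
active_aux, last_active and raid5_spawn_aux_thread() are invented for
illustration, they aren't in your patch:

	/* Called from raid5d's main loop while it is handling stripes. */
	static void raid5d_check_load(struct r5conf *conf)
	{
		/* Servicing requests for more than 100ms without a
		 * break, and fewer helpers than CPUs: fork another
		 * helper and reset the timer. */
		if (time_after(jiffies, conf->busy_since +
					msecs_to_jiffies(100)) &&
		    conf->active_aux < num_online_cpus()) {
			raid5_spawn_aux_thread(conf);
			conf->busy_since = jiffies;
		}
	}

	/* And in raid5auxd: no work for 30 minutes -> let the
	 * helper exit and have raid5d reap it. */
	if (time_after(jiffies, last_active + 30 * 60 * HZ))
		return;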
Might that be reasonable?
Thanks,
NeilBrown
> ---
> drivers/md/raid5.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> drivers/md/raid5.h | 2 ++
> 2 files changed, 120 insertions(+)
>
> Index: linux/drivers/md/raid5.c
> ===================================================================
> --- linux.orig/drivers/md/raid5.c 2012-06-01 15:23:56.001992200 +0800
> +++ linux/drivers/md/raid5.c 2012-06-04 11:54:25.775331361 +0800
> @@ -4602,6 +4602,41 @@ static int __get_stripe_batch(struct r5c
> return batch->count;
> }
>
> +static void raid5auxd(struct mddev *mddev)
> +{
> + struct r5conf *conf = mddev->private;
> + struct blk_plug plug;
> + struct stripe_head_batch batch;
> + int handled, i;
> +
> + pr_debug("+++ raid5auxd active\n");
> +
> + blk_start_plug(&plug);
> + handled = 0;
> + spin_lock_irq(&conf->device_lock);
> + while (1) {
> + if (!__get_stripe_batch(conf, &batch))
> + break;
> + spin_unlock_irq(&conf->device_lock);
> +
> + for (i = 0; i < batch.count; i++) {
> + handled++;
> + handle_stripe(batch.stripes[i]);
> + }
> +
> + release_stripe_flush_batch(&batch);
> + cond_resched();
> +
> + spin_lock_irq(&conf->device_lock);
> + }
> + pr_debug("%d stripes handled\n", handled);
> +
> + spin_unlock_irq(&conf->device_lock);
> + blk_finish_plug(&plug);
> +
> + pr_debug("--- raid5auxd inactive\n");
> +}
> +
> /*
> * This is our raid5 kernel thread.
> *
> @@ -4651,6 +4686,8 @@ static void raid5d(struct mddev *mddev)
>
> if (!__get_stripe_batch(conf, &batch))
> break;
> + for (i = 0; i < conf->aux_thread_num; i++)
> + md_wakeup_thread(conf->aux_threads[i]);
> spin_unlock_irq(&conf->device_lock);
>
> for (i = 0; i < batch.count; i++) {
> @@ -4784,10 +4821,86 @@ stripe_cache_active_show(struct mddev *m
> static struct md_sysfs_entry
> raid5_stripecache_active = __ATTR_RO(stripe_cache_active);
>
> +static ssize_t
> +raid5_show_auxthread_number(struct mddev *mddev, char *page)
> +{
> + struct r5conf *conf = mddev->private;
> + if (conf)
> + return sprintf(page, "%d\n", conf->aux_thread_num);
> + else
> + return 0;
> +}
> +
> +static ssize_t
> +raid5_store_auxthread_number(struct mddev *mddev, const char *page, size_t len)
> +{
> + struct r5conf *conf = mddev->private;
> + unsigned long new;
> + int i;
> + struct md_thread **threads;
> +
> + if (len >= PAGE_SIZE)
> + return -EINVAL;
> + if (!conf)
> + return -ENODEV;
> +
> + if (strict_strtoul(page, 10, &new))
> + return -EINVAL;
> +
> + if (new == conf->aux_thread_num)
> + return len;
> +
> + if (new > conf->aux_thread_num) {
> +
> + threads = kmalloc(sizeof(struct md_thread *) * new, GFP_KERNEL);
> + if (!threads)
> + return -ENOMEM;
> +
> + i = conf->aux_thread_num;
> + while (i < new) {
> + char name[10];
> +
> + sprintf(name, "aux%d", i);
> + threads[i] = md_register_thread(raid5auxd, mddev, name);
> + if (!threads[i])
> + goto error;
> + i++;
> + }
> + memcpy(threads, conf->aux_threads,
> + sizeof(struct md_thread *) * conf->aux_thread_num);
> + spin_lock_irq(&conf->device_lock);
> + kfree(conf->aux_threads);
> + conf->aux_threads = threads;
> + conf->aux_thread_num = new;
> + spin_unlock_irq(&conf->device_lock);
> + } else {
> + int old = conf->aux_thread_num;
> +
> + spin_lock_irq(&conf->device_lock);
> + conf->aux_thread_num = new;
> + spin_unlock_irq(&conf->device_lock);
> + for (i = new; i < old; i++)
> + md_unregister_thread(&conf->aux_threads[i]);
> + }
> +
> + return len;
> +error:
> + while (--i >= conf->aux_thread_num)
> + md_unregister_thread(&threads[i]);
> + kfree(threads);
> + return -ENOMEM;
> +}
> +
> +static struct md_sysfs_entry
> +raid5_auxthread_number = __ATTR(auxthread_number, S_IRUGO|S_IWUSR,
> + raid5_show_auxthread_number,
> + raid5_store_auxthread_number);
> +
> static struct attribute *raid5_attrs[] = {
> &raid5_stripecache_size.attr,
> &raid5_stripecache_active.attr,
> &raid5_preread_bypass_threshold.attr,
> + &raid5_auxthread_number.attr,
> NULL,
> };
> static struct attribute_group raid5_attrs_group = {
> @@ -4835,6 +4948,7 @@ static void raid5_free_percpu(struct r5c
>
> static void free_conf(struct r5conf *conf)
> {
> + kfree(conf->aux_threads);
> shrink_stripes(conf);
> raid5_free_percpu(conf);
> kfree(conf->disks);
> @@ -5391,6 +5505,10 @@ abort:
> static int stop(struct mddev *mddev)
> {
> struct r5conf *conf = mddev->private;
> + int i;
> +
> + for (i = 0; i < conf->aux_thread_num; i++)
> + md_unregister_thread(&conf->aux_threads[i]);
>
> md_unregister_thread(&mddev->thread);
> if (mddev->queue)
> Index: linux/drivers/md/raid5.h
> ===================================================================
> --- linux.orig/drivers/md/raid5.h 2012-06-01 15:23:56.017991998 +0800
> +++ linux/drivers/md/raid5.h 2012-06-01 15:27:12.515521685 +0800
> @@ -463,6 +463,8 @@ struct r5conf {
> * the new thread here until we fully activate the array.
> */
> struct md_thread *thread;
> + int aux_thread_num;
> + struct md_thread **aux_threads;
> };
>
> /*