From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk,
dan.j.williams@intel.com, shli@fusionio.com
Subject: Re: [patch 03/10 v3] raid5: add a per-stripe lock
Date: Mon, 2 Jul 2012 10:50:46 +1000
Message-ID: <20120702105046.56cd47ec@notabene.brown>
In-Reply-To: <20120625072613.620625574@kernel.org>
On Mon, 25 Jun 2012 15:24:50 +0800 Shaohua Li <shli@kernel.org> wrote:
> Add a per-stripe lock to protect stripe specific data, like dev->read,
> written, ... The purpose is to reduce lock contention of conf->device_lock.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
I had hoped to avoid having a per-stripe lock again, but it does look like it
is needed.
However I don't like the way you have split up these three patches - it makes
them a little hard to review.
I would like to see one patch which converts the bi_phys_segments access to
be atomic and also removes all the spin_lock calls that were just for
protecting that.
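A minimal userland sketch of that first step, assuming C11 atomics stand in for the kernel's atomic_t. The struct and the helper names (fake_bio, bio_segments_*) are hypothetical and are not the raid5 code; the point is only that the count is updated without taking any spinlock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a bio whose segment field is reused as a count. */
struct fake_bio {
    atomic_uint phys_segments;  /* number of stripes still referencing the bio */
};

static void bio_segments_init(struct fake_bio *bio, unsigned int n)
{
    atomic_init(&bio->phys_segments, n);
}

/* Take one more reference; no lock is needed around this. */
static void bio_segments_inc(struct fake_bio *bio)
{
    atomic_fetch_add(&bio->phys_segments, 1);
}

/*
 * Drop one reference and return the count that remains.
 * A return of 0 means this caller dropped the last reference.
 */
static unsigned int bio_segments_dec(struct fake_bio *bio)
{
    unsigned int old = atomic_fetch_sub(&bio->phys_segments, 1);

    assert(old != 0);  /* dropping a reference that was never taken */
    return old - 1;
}
```

With the count handled this way, any lock that was held only to make the increment or decrement safe has nothing left to protect and can be dropped in the same patch.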
Then another patch which adds the new stripe_lock, clearly documenting
exactly what it protects (not just "like dev->read" but an explicit list)
and also removes any spin_lock of device_lock that is no longer needed.
Then I could see what is being added and what is being removed all in the one
patch and I can be sure that they balance.
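As a rough userland illustration of what "clearly documenting" could look like: the sketch below puts the list of protected fields in a comment next to the lock. The list itself is an assumption made for the example (the real list is what the revised patch would have to state), the types are hypothetical stand-ins, and a C11 atomic_flag spin loop replaces the kernel spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct fake_bio { struct fake_bio *next; };

struct fake_dev {
    struct fake_bio *toread, *read, *towrite, *written;
};

struct fake_stripe {
    /*
     * stripe_lock protects, for each dev[i] (illustrative list, an
     * assumption of this sketch): toread, read, towrite, written.
     * It protects nothing else.
     */
    atomic_flag stripe_lock;
    struct fake_dev dev[2];
};

static void stripe_lock(struct fake_stripe *sh)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&sh->stripe_lock,
                                             memory_order_acquire))
        ;
}

static void stripe_unlock(struct fake_stripe *sh)
{
    atomic_flag_clear_explicit(&sh->stripe_lock, memory_order_release);
}

static void stripe_init(struct fake_stripe *sh)
{
    atomic_flag_clear(&sh->stripe_lock);
    for (size_t i = 0; i < 2; i++)
        sh->dev[i] = (struct fake_dev){0};
}

/* Move pending writes to "written", mirroring the ops_run_biodrain hunk. */
static struct fake_bio *stripe_take_writes(struct fake_stripe *sh, size_t i)
{
    struct fake_bio *chosen;

    stripe_lock(sh);
    chosen = sh->dev[i].towrite;
    sh->dev[i].towrite = NULL;
    assert(sh->dev[i].written == NULL);  /* plays the role of BUG_ON() */
    sh->dev[i].written = chosen;
    stripe_unlock(sh);
    return chosen;
}
```

Note that only the per-stripe lock is taken here. In the quoted patch every hunk still nests stripe_lock inside device_lock, which is why the additions and removals are hard to balance by eye.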
Thanks,
NeilBrown
> ---
> drivers/md/raid5.c | 17 +++++++++++++++++
> drivers/md/raid5.h | 1 +
> 2 files changed, 18 insertions(+)
>
> Index: linux/drivers/md/raid5.c
> ===================================================================
> --- linux.orig/drivers/md/raid5.c 2012-06-25 14:36:57.280096788 +0800
> +++ linux/drivers/md/raid5.c 2012-06-25 14:37:13.651888057 +0800
> @@ -751,6 +751,7 @@ static void ops_complete_biofill(void *s
>
> /* clear completed biofills */
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> for (i = sh->disks; i--; ) {
> struct r5dev *dev = &sh->dev[i];
>
> @@ -776,6 +777,7 @@ static void ops_complete_biofill(void *s
> }
> }
> }
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> clear_bit(STRIPE_BIOFILL_RUN, &sh->state);
>
> @@ -800,8 +802,10 @@ static void ops_run_biofill(struct strip
> if (test_bit(R5_Wantfill, &dev->flags)) {
> struct bio *rbi;
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> dev->read = rbi = dev->toread;
> dev->toread = NULL;
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> while (rbi && rbi->bi_sector <
> dev->sector + STRIPE_SECTORS) {
> @@ -1139,10 +1143,12 @@ ops_run_biodrain(struct stripe_head *sh,
> struct bio *wbi;
>
> spin_lock_irq(&sh->raid_conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> chosen = dev->towrite;
> dev->towrite = NULL;
> BUG_ON(dev->written);
> wbi = dev->written = chosen;
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&sh->raid_conf->device_lock);
>
> while (wbi && wbi->bi_sector <
> @@ -1448,6 +1454,8 @@ static int grow_one_stripe(struct r5conf
> init_waitqueue_head(&sh->ops.wait_for_ops);
> #endif
>
> + spin_lock_init(&sh->stripe_lock);
> +
> if (grow_buffers(sh)) {
> shrink_buffers(sh);
> kmem_cache_free(conf->slab_cache, sh);
> @@ -2329,6 +2337,7 @@ static int add_stripe_bio(struct stripe_
>
>
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> if (forwrite) {
> bip = &sh->dev[dd_idx].towrite;
> if (*bip == NULL && sh->dev[dd_idx].written == NULL)
> @@ -2362,6 +2371,7 @@ static int add_stripe_bio(struct stripe_
> if (sector >= sh->dev[dd_idx].sector + STRIPE_SECTORS)
> set_bit(R5_OVERWRITE, &sh->dev[dd_idx].flags);
> }
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
>
> pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n",
> @@ -2378,6 +2388,7 @@ static int add_stripe_bio(struct stripe_
>
> overlap:
> set_bit(R5_Overlap, &sh->dev[dd_idx].flags);
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> return 0;
> }
> @@ -2429,6 +2440,7 @@ handle_failed_stripe(struct r5conf *conf
> }
> }
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> /* fail all writes first */
> bi = sh->dev[i].towrite;
> sh->dev[i].towrite = NULL;
> @@ -2490,6 +2502,7 @@ handle_failed_stripe(struct r5conf *conf
> bi = nextbi;
> }
> }
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> if (bitmap_end)
> bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> @@ -2697,6 +2710,7 @@ static void handle_stripe_clean_event(st
> int bitmap_end = 0;
> pr_debug("Return write for disc %d\n", i);
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> wbi = dev->written;
> dev->written = NULL;
> while (wbi && wbi->bi_sector <
> @@ -2711,6 +2725,7 @@ static void handle_stripe_clean_event(st
> }
> if (dev->towrite == NULL)
> bitmap_end = 1;
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> if (bitmap_end)
> bitmap_endwrite(conf->mddev->bitmap,
> @@ -3170,6 +3185,7 @@ static void analyse_stripe(struct stripe
> /* Now to look around and see what can be done */
> rcu_read_lock();
> spin_lock_irq(&conf->device_lock);
> + spin_lock(&sh->stripe_lock);
> for (i=disks; i--; ) {
> struct md_rdev *rdev;
> sector_t first_bad;
> @@ -3315,6 +3331,7 @@ static void analyse_stripe(struct stripe
> do_recovery = 1;
> }
> }
> + spin_unlock(&sh->stripe_lock);
> spin_unlock_irq(&conf->device_lock);
> if (test_bit(STRIPE_SYNCING, &sh->state)) {
> /* If there is a failed device being replaced,
> Index: linux/drivers/md/raid5.h
> ===================================================================
> --- linux.orig/drivers/md/raid5.h 2012-06-25 14:36:13.940638627 +0800
> +++ linux/drivers/md/raid5.h 2012-06-25 14:37:13.651888057 +0800
> @@ -210,6 +210,7 @@ struct stripe_head {
> int disks; /* disks in stripe */
> enum check_states check_state;
> enum reconstruct_states reconstruct_state;
> + spinlock_t stripe_lock;
> /**
> * struct stripe_operations
> * @target - STRIPE_OP_COMPUTE_BLK target