From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk, dan.j.williams@intel.com
Subject: Re: [patch 3/3]raid5: remove unnecessary bitmap write optimization
Date: Wed, 4 Jul 2012 13:17:13 +1000 [thread overview]
Message-ID: <20120704131713.0feae049@notabene.brown> (raw)
In-Reply-To: <20120703075111.GC23488@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 4505 bytes --]
On Tue, 3 Jul 2012 15:51:11 +0800 Shaohua Li <shli@kernel.org> wrote:
> Neil pointed out the bitmap write optimization in handle_stripe_clean_event()
> is unnecessary, because the chance one stripe gets written twice in the mean
> time is rare. We can always do a bitmap_startwrite when a write request is
> added to a stripe and bitmap_endwrite after write request is done. Delete the
> optimization. With it, we can delete some cases of stripe_lock.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
> ---
> drivers/md/raid5.c | 28 ++++++++--------------------
> 1 file changed, 8 insertions(+), 20 deletions(-)
>
> Index: linux/drivers/md/raid5.c
> ===================================================================
> --- linux.orig/drivers/md/raid5.c 2012-07-03 14:58:51.241382361 +0800
> +++ linux/drivers/md/raid5.c 2012-07-03 15:04:48.568889733 +0800
> @@ -2350,7 +2350,7 @@ static int add_stripe_bio(struct stripe_
> spin_lock_irq(&sh->stripe_lock);
> if (forwrite) {
> bip = &sh->dev[dd_idx].towrite;
> - if (*bip == NULL && sh->dev[dd_idx].written == NULL)
> + if (*bip == NULL)
> firstwrite = 1;
> } else
> bip = &sh->dev[dd_idx].toread;
> @@ -2427,7 +2427,6 @@ handle_failed_stripe(struct r5conf *conf
> int i;
> for (i = disks; i--; ) {
> struct bio *bi;
> - int bitmap_end = 0;
>
> if (test_bit(R5_ReadError, &sh->dev[i].flags)) {
> struct md_rdev *rdev;
> @@ -2451,10 +2450,9 @@ handle_failed_stripe(struct r5conf *conf
> /* fail all writes first */
> bi = sh->dev[i].towrite;
> sh->dev[i].towrite = NULL;
> - if (bi) {
> + if (bi)
> s->to_write--;
> - bitmap_end = 1;
> - }
> + spin_unlock_irq(&sh->stripe_lock);
>
> if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags))
> wake_up(&conf->wait_for_overlap);
> @@ -2473,7 +2471,6 @@ handle_failed_stripe(struct r5conf *conf
> /* and fail all 'written' */
> bi = sh->dev[i].written;
> sh->dev[i].written = NULL;
> - if (bi) bitmap_end = 1;
> while (bi && bi->bi_sector <
> sh->dev[i].sector + STRIPE_SECTORS) {
> struct bio *bi2 = r5_next_bio(bi, sh->dev[i].sector);
> @@ -2509,10 +2506,8 @@ handle_failed_stripe(struct r5conf *conf
> bi = nextbi;
> }
> }
> - spin_unlock_irq(&sh->stripe_lock);
> - if (bitmap_end)
> - bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> - STRIPE_SECTORS, 0, 0);
> + bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> + STRIPE_SECTORS, 0, 0);
> /* If we were in the middle of a write the parity block might
> * still be locked - so just clear all R5_LOCKED flags
> */
Thanks.
However this section - handle_failed_stripe - isn't correct.
bitmap_startwrite and bitmap_endwrite increment and decrement a
counter and so must be balanced.
We are now counting once for each list that is on either ->towrite or
->written. We bitmap_startwrite when we set ->towrite, we then move that to
->written. Then when we remove from ->written we bitmap_endwrite.
In the handle_failed_stripe case we may remove 0, 1, or 2 lists. So we
need to call bitmap_endwrite 0, 1, or 2 times.
So if towrite was not NULL, we want to call bitmap_endwrite
then if written was not NULL we want to call it again.
Also I think I'd prefer it if this patch were before "add a per-stripe lock".
It is best to first get rid of use of device_lock first, then change some of
the remaining ones to stripe_lock. Changing some to stripe_lock, then
discarding them seems messy.
Thanks,
NeilBrown
> @@ -2713,9 +2708,7 @@ static void handle_stripe_clean_event(st
> test_bit(R5_UPTODATE, &dev->flags)) {
> /* We can return any write requests */
> struct bio *wbi, *wbi2;
> - int bitmap_end = 0;
> pr_debug("Return write for disc %d\n", i);
> - spin_lock_irq(&sh->stripe_lock);
> wbi = dev->written;
> dev->written = NULL;
> while (wbi && wbi->bi_sector <
> @@ -2728,15 +2721,10 @@ static void handle_stripe_clean_event(st
> }
> wbi = wbi2;
> }
> - if (dev->towrite == NULL)
> - bitmap_end = 1;
> - spin_unlock_irq(&sh->stripe_lock);
> - if (bitmap_end)
> - bitmap_endwrite(conf->mddev->bitmap,
> - sh->sector,
> - STRIPE_SECTORS,
> + bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> + STRIPE_SECTORS,
> !test_bit(STRIPE_DEGRADED, &sh->state),
> - 0);
> + 0);
> }
> }
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2012-07-04 3:17 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-03 7:51 [patch 3/3]raid5: remove unnecessary bitmap write optimization Shaohua Li
2012-07-04 3:17 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120704131713.0feae049@notabene.brown \
--to=neilb@suse.de \
--cc=axboe@kernel.dk \
--cc=dan.j.williams@intel.com \
--cc=linux-raid@vger.kernel.org \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.