From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org, axboe@kernel.dk, dan.j.williams@intel.com
Subject: Re: [patch 3/3]raid5: remove unnecessary bitmap write optimization
Date: Wed, 4 Jul 2012 13:17:13 +1000 [thread overview]
Message-ID: <20120704131713.0feae049@notabene.brown> (raw)
In-Reply-To: <20120703075111.GC23488@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 4505 bytes --]
On Tue, 3 Jul 2012 15:51:11 +0800 Shaohua Li <shli@kernel.org> wrote:
> Neil pointed out the bitmap write optimization in handle_stripe_clean_event()
> is unnecessary, because the chance one stripe gets written twice in the mean
> time is rare. We can always do a bitmap_startwrite when a write request is
> added to a stripe and bitmap_endwrite after write request is done. Delete the
> optimization. With it, we can delete some cases of stripe_lock.
>
> Signed-off-by: Shaohua Li <shli@fusionio.com>
> ---
> drivers/md/raid5.c | 28 ++++++++--------------------
> 1 file changed, 8 insertions(+), 20 deletions(-)
>
> Index: linux/drivers/md/raid5.c
> ===================================================================
> --- linux.orig/drivers/md/raid5.c 2012-07-03 14:58:51.241382361 +0800
> +++ linux/drivers/md/raid5.c 2012-07-03 15:04:48.568889733 +0800
> @@ -2350,7 +2350,7 @@ static int add_stripe_bio(struct stripe_
> spin_lock_irq(&sh->stripe_lock);
> if (forwrite) {
> bip = &sh->dev[dd_idx].towrite;
> - if (*bip == NULL && sh->dev[dd_idx].written == NULL)
> + if (*bip == NULL)
> firstwrite = 1;
> } else
> bip = &sh->dev[dd_idx].toread;
> @@ -2427,7 +2427,6 @@ handle_failed_stripe(struct r5conf *conf
> int i;
> for (i = disks; i--; ) {
> struct bio *bi;
> - int bitmap_end = 0;
>
> if (test_bit(R5_ReadError, &sh->dev[i].flags)) {
> struct md_rdev *rdev;
> @@ -2451,10 +2450,9 @@ handle_failed_stripe(struct r5conf *conf
> /* fail all writes first */
> bi = sh->dev[i].towrite;
> sh->dev[i].towrite = NULL;
> - if (bi) {
> + if (bi)
> s->to_write--;
> - bitmap_end = 1;
> - }
> + spin_unlock_irq(&sh->stripe_lock);
>
> if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags))
> wake_up(&conf->wait_for_overlap);
> @@ -2473,7 +2471,6 @@ handle_failed_stripe(struct r5conf *conf
> /* and fail all 'written' */
> bi = sh->dev[i].written;
> sh->dev[i].written = NULL;
> - if (bi) bitmap_end = 1;
> while (bi && bi->bi_sector <
> sh->dev[i].sector + STRIPE_SECTORS) {
> struct bio *bi2 = r5_next_bio(bi, sh->dev[i].sector);
> @@ -2509,10 +2506,8 @@ handle_failed_stripe(struct r5conf *conf
> bi = nextbi;
> }
> }
> - spin_unlock_irq(&sh->stripe_lock);
> - if (bitmap_end)
> - bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> - STRIPE_SECTORS, 0, 0);
> + bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> + STRIPE_SECTORS, 0, 0);
> /* If we were in the middle of a write the parity block might
> * still be locked - so just clear all R5_LOCKED flags
> */
Thanks.
However this section - handle_failed_stripe - isn't correct.
bitmap_startwrite and bitmap_endwrite increment and decrement a
counter and so must be balanced.
We are now counting once for each list that is on either ->towrite or
->written. We bitmap_startwrite when we set ->towrite, we then move that to
->written. Then when we remove from ->written we bitmap_endwrite.
In the handle_failed_stripe case we may remove 0, 1, or 2 lists. So we
need to call bitmap_endwrite 0, 1, or 2 times.
So if towrite was not NULL, we want to call bitmap_endwrite
then if written was not NULL we want to call it again.
Also I think I'd prefer it if this patch were before "add a per-stripe lock".
It is best to first get rid of use of device_lock first, then change some of
the remaining ones to stripe_lock. Changing some to stripe_lock, then
discarding them seems messy.
Thanks,
NeilBrown
> @@ -2713,9 +2708,7 @@ static void handle_stripe_clean_event(st
> test_bit(R5_UPTODATE, &dev->flags)) {
> /* We can return any write requests */
> struct bio *wbi, *wbi2;
> - int bitmap_end = 0;
> pr_debug("Return write for disc %d\n", i);
> - spin_lock_irq(&sh->stripe_lock);
> wbi = dev->written;
> dev->written = NULL;
> while (wbi && wbi->bi_sector <
> @@ -2728,15 +2721,10 @@ static void handle_stripe_clean_event(st
> }
> wbi = wbi2;
> }
> - if (dev->towrite == NULL)
> - bitmap_end = 1;
> - spin_unlock_irq(&sh->stripe_lock);
> - if (bitmap_end)
> - bitmap_endwrite(conf->mddev->bitmap,
> - sh->sector,
> - STRIPE_SECTORS,
> + bitmap_endwrite(conf->mddev->bitmap, sh->sector,
> + STRIPE_SECTORS,
> !test_bit(STRIPE_DEGRADED, &sh->state),
> - 0);
> + 0);
> }
> }
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2012-07-04 3:17 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-03 7:51 [patch 3/3]raid5: remove unnecessary bitmap write optimization Shaohua Li
2012-07-04 3:17 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120704131713.0feae049@notabene.brown \
--to=neilb@suse.de \
--cc=axboe@kernel.dk \
--cc=dan.j.williams@intel.com \
--cc=linux-raid@vger.kernel.org \
--cc=shli@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).