From: NeilBrown <neilb@suse.de>
To: majianpeng <majianpeng@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: About RAID1/10 io barrier between normal io and resync/recovery io.
Date: Thu, 1 Nov 2012 19:24:44 +1100 [thread overview]
Message-ID: <20121101192444.2ac1050a@notabene.brown> (raw)
In-Reply-To: <2012110115580062797811@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5952 bytes --]
On Thu, 1 Nov 2012 15:58:04 +0800 majianpeng <majianpeng@gmail.com> wrote:
> Hi Neil:
> At present, there are barriers in raid1/10.About barrier,you described: "Sometimes we need to suspend IO while we do something else, either some resync/recovery, or reconfigure the array"
> So when do normal request in func make_request, it call wait_barrier.And in sync_request it call raise_barrier.
> Of course,there are some place to call raise_barrier to do other things.
> But there, i only wanted to talk about normal io and resync/recovery.
>
> 1:I think you add io barrier is because raid1/10 didn't stripe like raid456.So it only for the situation which normal io and sync io had same content. Is that right?
Same sector, yes.
> 2: If normal ios outside the resync/recovery window, i think there is no necessary to use io barrier.
correct.
> I wanted to find the reason by reviewing the git log.But at present, git log of kernel is only after kernel 2.6.12.
> So i looked into the code about kernel 2.4.And found something like i thought. There were resync window.
> But why did you remove or modify ?
I didn't. Ingo Molnar did.
http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=0925bad356699684440720f2a908030f98bc3284
It wouldn't have generalised to raid10 very well (because resync and recovery
are handled very differently in raid10).
> 3: I think using resync window to control normal io is perfect but complicated.So i only used one positon which before mddev->curr_resync_completed.
Looks good, though I'll have a proper look next week when I have a bit more
time.
You only need the barrier for write operations, but I think we raise it for
read operations too. We can probably skip that.
Thanks,
NeilBrown
> The code is:
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 8034fbd..c45a769 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -234,6 +234,7 @@ static void call_bio_endio(struct r1bio *r1_bio)
> {
> struct bio *bio = r1_bio->master_bio;
> int done;
> + int skip_barrier = test_bit(R1BIO_SKIP_BARRIER, &r1_bio->state);
> struct r1conf *conf = r1_bio->mddev->private;
>
> if (bio->bi_phys_segments) {
> @@ -253,7 +254,8 @@ static void call_bio_endio(struct r1bio *r1_bio)
> * Wake up any possible resync thread that waits for the device
> * to go idle.
> */
> - allow_barrier(conf);
> + if (!skip_barrier)
> + allow_barrier(conf);
> }
> }
>
> @@ -1007,6 +1009,7 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> int first_clone;
> int sectors_handled;
> int max_sectors;
> + int skip_barrier = 0;
>
> /*
> * Register the new request and wait if the reconstruction
> @@ -1036,7 +1039,13 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> finish_wait(&conf->wait_barrier, &w);
> }
>
> - wait_barrier(conf);
> + if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
> + (bio->bi_sector + bio->bi_size/512 <=
> + mddev->curr_resync_completed))
> + skip_barrier = 1;
> +
> + if (!skip_barrier)
> + wait_barrier(conf);
>
> bitmap = mddev->bitmap;
>
> @@ -1053,6 +1062,8 @@ static void make_request(struct mddev *mddev, struct bio * bio)
> r1_bio->mddev = mddev;
> r1_bio->sector = bio->bi_sector;
>
> + if (skip_barrier)
> + set_bit(R1BIO_SKIP_BARRIER, &r1_bio->state);
> /* We might need to issue multiple reads to different
> * devices if there are bad blocks around, so we keep
> * track of the number of reads in bio->bi_phys_segments.
> @@ -1229,10 +1240,15 @@ read_again:
> for (j = 0; j < i; j++)
> if (r1_bio->bios[j])
> rdev_dec_pending(conf->mirrors[j].rdev, mddev);
> - r1_bio->state = 0;
> - allow_barrier(conf);
> + if (!skip_barrier)
> + r1_bio->state = 0;
> + else
> + set_bit(R1BIO_SKIP_BARRIER, &r1_bio->state);
> + if (!skip_barrier)
> + allow_barrier(conf);
> md_wait_for_blocked_rdev(blocked_rdev, mddev);
> - wait_barrier(conf);
> + if (!skip_barrier)
> + wait_barrier(conf);
> goto retry_write;
> }
>
> diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
> index 0ff3715..fab9fda 100644
> --- a/drivers/md/raid1.h
> +++ b/drivers/md/raid1.h
> @@ -158,6 +158,8 @@ struct r1bio {
> #define R1BIO_MadeGood 7
> #define R1BIO_WriteError 8
>
> +#define R1BIO_SKIP_BARRIER 9
> +
> extern int md_raid1_congested(struct mddev *mddev, int bits);
>
> #endif
>
> At present, the code only for normal io and sync io.But etc stop/remove disk /add disk/fix_read_error also used io barrier.
> I used fio to test.
> The fio script is:
> [global]
> filename=/dev/md0
> direct=1
> bs=512K
>
> [write]
> rw=write
> runtime=600
>
> [read]
> stonewall
> rw=read
> runtime=600
>
> I firstly started resync/recovery for some times in order to normal io position is before curr_resync_completed.
> And i also keep the resync/recovery speed is about 30MB/s throught sys_speed_min.
> The result is:
> w/ above code:
> write recovery/resync 30MB/S; fio 70MB/s
> read recovery/resync 30MB/s; fio 90MB/s
>
> w/o above code:
> write recovery/resync 30MB/s; fio 29MB/s
> read recovery/resync 30MB/s; fio 31MB/s
>
> The result is remarkable.
>
>
> Jianpeng
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2012-11-01 8:24 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-01 7:58 About RAID1/10 io barrier between normal io and resync/recovery io majianpeng
2012-11-01 8:24 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121101192444.2ac1050a@notabene.brown \
--to=neilb@suse.de \
--cc=linux-raid@vger.kernel.org \
--cc=majianpeng@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.