From: majianpeng <majianpeng@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Re: [PATCH 4/4] raid1: Rewrite the implementation of iobarrier.
Date: Fri, 15 Nov 2013 10:29:56 +0800 [thread overview]
Message-ID: <201311151029513246491@gmail.com> (raw)
In-Reply-To: 20131114174438.7f2db666@notabene.brown
>On Thu, 31 Oct 2013 11:20:55 +0800 majianpeng <majianpeng@gmail.com> wrote:
>
>Sorry for the delayed reply.
>
>
>> >On Tue, 29 Oct 2013 09:30:14 +0800 majianpeng <majianpeng@gmail.com> wrote:
>> >
>> >
>> >Nearly there!! Just a few more details. See below.
>> >
>> >
>> >> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>> >> ---
>> >> drivers/md/raid1.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++------
>> >> drivers/md/raid1.h | 14 ++++++
>> >> 2 files changed, 124 insertions(+), 12 deletions(-)
>> >>
>> >> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> >> index b4a6dcd..5b311c0 100644
>> >> --- a/drivers/md/raid1.c
>> >> +++ b/drivers/md/raid1.c
>> >> @@ -66,7 +66,8 @@
>> >> */
>> >> static int max_queued_requests = 1024;
>> >>
>> >> -static void allow_barrier(struct r1conf *conf);
>> >> +static void allow_barrier(struct r1conf *conf, sector_t start_next_window,
>> >> + sector_t bi_sector);
>> >> static void lower_barrier(struct r1conf *conf);
>> >>
>> >> static void * r1bio_pool_alloc(gfp_t gfp_flags, void *data)
>> >> @@ -227,6 +228,8 @@ static void call_bio_endio(struct r1bio *r1_bio)
>> >> struct bio *bio = r1_bio->master_bio;
>> >> int done;
>> >> struct r1conf *conf = r1_bio->mddev->private;
>> >> + sector_t start_next_window = r1_bio->start_next_window;
>> >> + sector_t bi_sector = bio->bi_sector;
>> >
>> >This should be r1_bio->sector, not bio->bi_sector.
>> >They are often the same but if multiple r1_bios are needed for some reason
>> >(e.g. bad blocks) they may not be.
>> >
>> No, allow_barrier() only for bio not for r1bio.It only do when bio->bi_phys_segments == 0.
>
>Yes, I see that now, thanks.
>
>
>> If this value is r1_bio->sector, the value may has different value as you said.
>> So in allow_barrier():
>> if (start_next_window) {
>> if (start_next_window == conf->start_next_window) {
>> if (conf->start_next_window + NEXT_NORMALIO_DISTANCE
>> <= bi_sector)
>> conf->next_window_requests--;
>> else
>> conf->current_window_requests--;
>> } else
>> It will has differnt result.
>>
>> >>
>> >> if (bio->bi_phys_segments) {
>> >> unsigned long flags;
>> >> @@ -234,6 +237,11 @@ static void call_bio_endio(struct r1bio *r1_bio)
>> >> bio->bi_phys_segments--;
>> >> done = (bio->bi_phys_segments == 0);
>> >> spin_unlock_irqrestore(&conf->device_lock, flags);
>> >> + /*
>> >> + * make_request() might be waiting for
>> >> + * bi_phys_segments to decrease
>> >> + */
>> >> + wake_up(&conf->wait_barrier);
>> >> } else
>> >> done = 1;
>> >>
>> >> @@ -245,7 +253,7 @@ static void call_bio_endio(struct r1bio *r1_bio)
>> >> * Wake up any possible resync thread that waits for the device
>> >> * to go idle.
>> >> */
>> >> - allow_barrier(conf);
>> >> + allow_barrier(conf, start_next_window, bi_sector);
>> >> }
>> >> }
>> >>
>> >> @@ -827,10 +835,18 @@ static void raise_barrier(struct r1conf *conf)
>> >> /* block any new IO from starting */
>> >> conf->barrier++;
>> >>
>> >> - /* Now wait for all pending IO to complete */
>> >> + /* For those conditions we must wait:
>> >> + * A:while the array is in frozen state
>> >> + * B:while barrier >= RESYNC_DEPTH, meaning resync reach
>> >> + * the max count which allowed.
>> >> + * C:next_resync + RESYNC_SECTORS > start_next_window, meaning
>> >> + * next resync will reach to window which normal bios are handling.
>> >> + */
>> >> wait_event_lock_irq(conf->wait_barrier,
>> >> !conf->array_frozen &&
>> >> - !conf->nr_pending && conf->barrier < RESYNC_DEPTH,
>> >> + conf->barrier < RESYNC_DEPTH &&
>> >> + (conf->start_next_window >=
>> >> + conf->next_resync + RESYNC_SECTORS),
>> >> conf->resync_lock);
>> >
>> >You've removed the test on conf->nr_pending here, which I think is correct.
>> >It counts 'read' requests as well. Testing start_next_window serves the same
>> >purpose as it is increased whenever current_window_requests reaches zero.
>> >
>> In func wait_barrier():
>> if (need_to_wait_for_sync(conf, bio)) {
>> [snip]
>> } else if (bio_data_dir(bio) == WRITE) {
>> if (conf->next_resync + NEXT_NORMALIO_DISTANCE
>> <= bio->bi_sector) {
>> if (conf->start_next_window == MaxSector)
>> conf->start_next_window =
>> conf->next_resync +
>> NEXT_NORMALIO_DISTANCE;
>>
>> if ((conf->start_next_window + NEXT_NORMALIO_DISTANCE)
>> <= bio->bi_sector)
>> conf->next_window_requests++;
>> else
>> conf->current_window_requests++;
>> }
>> if (bio->bi_sector >= conf->start_next_window)
>> sector = conf->start_next_window;
>> }
>> start_next_window only for bio_data_dir(bio) == WRITE. So resync can't wait read io to complete.
>>
>> >However you having modified as similar test on nr_pending in wait_barrier().
>I meant: haven't modified a similar test ....
>> >That worries me a bit. Should that be changed to a test on start_next_window
>> >to match the above change?
>> For this condition, i only copy it.For this condition, i'm not know clearly.
>> Can you give me more details about this?
>
>
>Before your patch we know if there are pending requests (submitted but not
>completed) which stop use from doing resync if conf->nr_pending > 0.
>After your patch we cannot use that that test as it counts READ requests, and
>requests that are outside the range being resynced.
>
>The test is now a bit more complicated. I think it is:
>
> conf->start_next_window <
> conf->next_resync + RESYNC_SECTORS
>
>
>i.e. while start_next_window is less than next_resync+RESYNC_SECTORS there
>cold be outstanding writes that are blocking resync. As soon as those writes
>complete, start_next_window will increase and the resync can complete.
>So if start_next_window < next_resync+RESYNC_SECTORS and there are pending
>request in current->bio_list, then then there is no point waiting for resync
>(it cannot progress anyway) so we may as well submit another write.
>
>So change:
>
> wait_event_lock_irq(conf->wait_barrier,
> !conf->array_frozen &&
> (!conf->barrier ||
> (conf->nr_pending &&
> current->bio_list &&
> !bio_list_empty(current->bio_list))),
> conf->resync_lock);
>
>in wait barrier to
>
> wait_event_lock_irq(conf->wait_barrier,
> !conf->array_frozen &&
> (!conf->barrier ||
> (conf->start_next_window < conf->next_resync + RESYNC_SECTORS &&
> current->bio_list &&
> !bio_list_empty(current->bio_list))),
> conf->resync_lock);
>
>
Ok, apply this.
>> >
>> >
>> >>
>> >> spin_unlock_irq(&conf->resync_lock);
>> >> @@ -846,10 +862,33 @@ static void lower_barrier(struct r1conf *conf)
>> >> wake_up(&conf->wait_barrier);
>> >> }
>> >>
>> >> -static void wait_barrier(struct r1conf *conf)
>> >> +static bool need_to_wait_for_sync(struct r1conf *conf, struct bio *bio)
>> >> +{
>> >> + bool wait = false;
>> >> +
>> >> + if (conf->array_frozen || !bio)
>> >> + wait = true;
>> >> + else if (conf->barrier && bio_data_dir(bio) == WRITE) {
>> >> + if (conf->next_resync < RESYNC_WINDOW_SECTORS)
>> >> + wait = true;
>> >> + else if ((conf->next_resync - RESYNC_WINDOW_SECTORS
>> >> + >= bio_end_sector(bio)) ||
>> >> + (conf->next_resync + NEXT_NORMALIO_DISTANCE
>> >> + <= bio->bi_sector))
>> >> + wait = false;
>> >> + else
>> >> + wait = true;
>> >> + }
>> >> +
>> >> + return wait;
>> >> +}
>> >> +
>> >> +static sector_t wait_barrier(struct r1conf *conf, struct bio *bio)
>> >> {
>> >> + sector_t sector = 0;
>> >> +
>> >> spin_lock_irq(&conf->resync_lock);
>> >> - if (conf->barrier) {
>> >> + if (need_to_wait_for_sync(conf, bio)) {
>> >> conf->nr_waiting++;
>> >> /* Wait for the barrier to drop.
>> >> * However if there are already pending
>> >> @@ -868,16 +907,57 @@ static void wait_barrier(struct r1conf *conf)
>> >> !bio_list_empty(current->bio_list))),
>> >> conf->resync_lock);
>> >> conf->nr_waiting--;
>> >> + } else if (bio_data_dir(bio) == WRITE) {
>> >> + if (conf->next_resync + NEXT_NORMALIO_DISTANCE
>> >> + <= bio->bi_sector) {
>> >> + if (conf->start_next_window == MaxSector)
>> >> + conf->start_next_window =
>> >> + conf->next_resync +
>> >> + NEXT_NORMALIO_DISTANCE;
>> >> +
>> >> + if ((conf->start_next_window + NEXT_NORMALIO_DISTANCE)
>> >> + <= bio->bi_sector)
>> >> + conf->next_window_requests++;
>> >> + else
>> >> + conf->current_window_requests++;
>> >> + }
>> >> + if (bio->bi_sector >= conf->start_next_window)
>> >> + sector = conf->start_next_window;
>> >
>> >You aren't setting 'sector' if we needed to wait. I don't think that is
>> >correct, is it?
>> >
>> Yes.How about those code:
>> spin_lock_irq(&conf->resync_lock);
>> retry_check:
>> if (need_to_wait_for_sync(conf, bio)) {
>> conf->nr_waiting++;
>> /* Wait for the barrier to drop.
>> * However if there are already pending
>> * requests (preventing the barrier from
>> * rising completely), and the
>> * pre-process bio queue isn't empty,
>> * then don't wait, as we need to empty
>> * that queue to get the nr_pending
>> * count down.
>> */
>> wait_event_lock_irq(conf->wait_barrier,
>> !conf->array_frozen &&
>> (!conf->barrier ||
>> (conf->nr_pending &&
>> current->bio_list &&
>> !bio_list_empty(current->bio_list))),
>> conf->resync_lock);
>> conf->nr_waiting--;
>> goto retry_check;
>> >
>
>It would look a lot better if you made it a while loop:
>
> while (need_to_wait....) {
> nr_waiting++
> wait_event.....
> nr_waiting--
> if (bio_data_dir.....
>
For this, i think it don;t need while() or recheck.
The code should:
if (nedd_to_wait...) {
nr_waiting++
wait_event.....
nr_waiting--
}
if (bio && bio_data_dir(bio) == WRITE) {
do;
}
I think again need_to_wait...(),it must return false. So we don't recheck this.
Or am I missing something?
Thanks!
Jianpeng Ma
next prev parent reply other threads:[~2013-11-15 2:29 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-28 11:40 [PATCH 4/4] raid1: Rewrite the implementation of iobarrier majianpeng
2013-10-24 1:50 ` NeilBrown
2013-10-29 1:30 ` majianpeng
2013-10-31 2:33 ` NeilBrown
2013-10-31 3:20 ` majianpeng
2013-11-14 6:44 ` NeilBrown
2013-11-15 2:29 ` majianpeng [this message]
2013-11-15 3:42 ` NeilBrown
2013-11-15 6:55 ` majianpeng
2013-11-19 4:25 ` NeilBrown
2013-11-19 7:53 ` majianpeng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201311151029513246491@gmail.com \
--to=majianpeng@gmail.com \
--cc=linux-raid@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).