* [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk.
@ 2013-02-28 7:50 majianpeng
2013-03-04 2:04 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: majianpeng @ 2013-02-28 7:50 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
Replacement is a fullsync which don't depent on bitmap.So regardless of
the presence and absence of bitmap, it do full resync.
If offset of normal io is larger than offset of resync,it will write
again when resync arrived this offset.
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
---
drivers/md/raid1.c | 4 +++-
drivers/md/raid10.c | 4 +++-
drivers/md/raid5.c | 3 ++-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index d5bddfc..142a5fa 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1173,7 +1173,9 @@ read_again:
set_bit(R1BIO_Degraded, &r1_bio->state);
continue;
}
-
+ if (test_bit(Replacement, &rdev->flags) &&
+ conf->mddev->curr_resync < r1_bio->sector)
+ continue;
atomic_inc(&rdev->nr_pending);
if (test_bit(WriteErrorSeen, &rdev->flags)) {
sector_t first_bad;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 64d4824..bb11cfb 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1337,7 +1337,9 @@ retry_write:
|| test_bit(Unmerged, &rdev->flags)))
rdev = NULL;
if (rrdev && (test_bit(Faulty, &rrdev->flags)
- || test_bit(Unmerged, &rrdev->flags)))
+ || test_bit(Unmerged, &rrdev->flags) ||
+ (test_bit(Replacement, &rrdev->flags) &&
+ conf->mddev->curr_resync < r10_bio->sector)))
rrdev = NULL;
r10_bio->devs[i].bio = NULL;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index bd49623..e0a2a39 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -602,7 +602,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
rdev = NULL;
if (rdev)
atomic_inc(&rdev->nr_pending);
- if (rrdev && test_bit(Faulty, &rrdev->flags))
+ if (rrdev && (test_bit(Faulty, &rrdev->flags) ||
+ conf->mddev->curr_resync < sh->sector))
rrdev = NULL;
if (rrdev)
atomic_inc(&rrdev->nr_pending);
--
1.7.9.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk.
2013-02-28 7:50 [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk majianpeng
@ 2013-03-04 2:04 ` NeilBrown
2013-03-04 2:24 ` majianpeng
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-03-04 2:04 UTC (permalink / raw)
To: majianpeng; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 2494 bytes --]
On Thu, 28 Feb 2013 15:50:37 +0800 majianpeng <majianpeng@gmail.com> wrote:
> Replacement is a fullsync which don't depent on bitmap.So regardless of
> the presence and absence of bitmap, it do full resync.
> If offset of normal io is larger than offset of resync,it will write
> again when resync arrived this offset.
This might be OK for RAID1 and RAID10 as recover is paused when writes
happen, but that is not the case for RAID5, so it isn't safe to test against
curr_resync - it gets updated a bit too later.
Also you messed up the formatting in raid10.c
I'm not convinced this optimisation is really worth it.
NeilBrown
>
> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> ---
> drivers/md/raid1.c | 4 +++-
> drivers/md/raid10.c | 4 +++-
> drivers/md/raid5.c | 3 ++-
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index d5bddfc..142a5fa 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1173,7 +1173,9 @@ read_again:
> set_bit(R1BIO_Degraded, &r1_bio->state);
> continue;
> }
> -
> + if (test_bit(Replacement, &rdev->flags) &&
> + conf->mddev->curr_resync < r1_bio->sector)
> + continue;
> atomic_inc(&rdev->nr_pending);
> if (test_bit(WriteErrorSeen, &rdev->flags)) {
> sector_t first_bad;
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 64d4824..bb11cfb 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1337,7 +1337,9 @@ retry_write:
> || test_bit(Unmerged, &rdev->flags)))
> rdev = NULL;
> if (rrdev && (test_bit(Faulty, &rrdev->flags)
> - || test_bit(Unmerged, &rrdev->flags)))
> + || test_bit(Unmerged, &rrdev->flags) ||
> + (test_bit(Replacement, &rrdev->flags) &&
> + conf->mddev->curr_resync < r10_bio->sector)))
> rrdev = NULL;
>
> r10_bio->devs[i].bio = NULL;
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index bd49623..e0a2a39 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -602,7 +602,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> rdev = NULL;
> if (rdev)
> atomic_inc(&rdev->nr_pending);
> - if (rrdev && test_bit(Faulty, &rrdev->flags))
> + if (rrdev && (test_bit(Faulty, &rrdev->flags) ||
> + conf->mddev->curr_resync < sh->sector))
> rrdev = NULL;
> if (rrdev)
> atomic_inc(&rrdev->nr_pending);
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Re: [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk.
2013-03-04 2:04 ` NeilBrown
@ 2013-03-04 2:24 ` majianpeng
2013-03-04 5:30 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: majianpeng @ 2013-03-04 2:24 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
>On Thu, 28 Feb 2013 15:50:37 +0800 majianpeng <majianpeng@gmail.com> wrote:
>
>> Replacement is a fullsync which don't depent on bitmap.So regardless of
>> the presence and absence of bitmap, it do full resync.
>> If offset of normal io is larger than offset of resync,it will write
>> again when resync arrived this offset.
>
>This might be OK for RAID1 and RAID10 as recover is paused when writes
>happen, but that is not the case for RAID5, so it isn't safe to test against
>curr_resync - it gets updated a bit too later.
>
->curr_resync + STRIPE_SECTORS is the next stripe which willbe replaced.
How about the ->curr_resync+STRIPE_SECTORS?
>Also you messed up the formatting in raid10.c
>
Can you explain in detail?
>I'm not convinced this optimisation is really worth it.
>
Maybe for HDD disk, it only improve speed by reducing some write operation.
But for ssd disk, it can reduce one write.
Thanks!
Jianpeng Ma
>NeilBrown
>
>
>>
>> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>> ---
>> drivers/md/raid1.c | 4 +++-
>> drivers/md/raid10.c | 4 +++-
>> drivers/md/raid5.c | 3 ++-
>> 3 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index d5bddfc..142a5fa 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -1173,7 +1173,9 @@ read_again:
>> set_bit(R1BIO_Degraded, &r1_bio->state);
>> continue;
>> }
>> -
>> + if (test_bit(Replacement, &rdev->flags) &&
>> + conf->mddev->curr_resync < r1_bio->sector)
>> + continue;
>> atomic_inc(&rdev->nr_pending);
>> if (test_bit(WriteErrorSeen, &rdev->flags)) {
>> sector_t first_bad;
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 64d4824..bb11cfb 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1337,7 +1337,9 @@ retry_write:
>> || test_bit(Unmerged, &rdev->flags)))
>> rdev = NULL;
>> if (rrdev && (test_bit(Faulty, &rrdev->flags)
>> - || test_bit(Unmerged, &rrdev->flags)))
>> + || test_bit(Unmerged, &rrdev->flags) ||
>> + (test_bit(Replacement, &rrdev->flags) &&
>> + conf->mddev->curr_resync < r10_bio->sector)))
>> rrdev = NULL;
>>
>> r10_bio->devs[i].bio = NULL;
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index bd49623..e0a2a39 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -602,7 +602,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>> rdev = NULL;
>> if (rdev)
>> atomic_inc(&rdev->nr_pending);
>> - if (rrdev && test_bit(Faulty, &rrdev->flags))
>> + if (rrdev && (test_bit(Faulty, &rrdev->flags) ||
>> + conf->mddev->curr_resync < sh->sector))
>> rrdev = NULL;
>> if (rrdev)
>> atomic_inc(&rrdev->nr_pending);
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk.
2013-03-04 2:24 ` majianpeng
@ 2013-03-04 5:30 ` NeilBrown
2013-03-05 2:53 ` majianpeng
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2013-03-04 5:30 UTC (permalink / raw)
To: majianpeng; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 4018 bytes --]
On Mon, 4 Mar 2013 10:24:16 +0800 majianpeng <majianpeng@gmail.com> wrote:
> >On Thu, 28 Feb 2013 15:50:37 +0800 majianpeng <majianpeng@gmail.com> wrote:
> >
> >> Replacement is a fullsync which don't depent on bitmap.So regardless of
> >> the presence and absence of bitmap, it do full resync.
> >> If offset of normal io is larger than offset of resync,it will write
> >> again when resync arrived this offset.
> >
> >This might be OK for RAID1 and RAID10 as recover is paused when writes
> >happen, but that is not the case for RAID5, so it isn't safe to test against
> >curr_resync - it gets updated a bit too later.
> >
> ->curr_resync + STRIPE_SECTORS is the next stripe which willbe replaced.
> How about the ->curr_resync+STRIPE_SECTORS?
If you aren't certain, then neither am I.
As resync and normal writes can be intermingled you would need some guarantee
that a write wouldn't be missed, and that almost certainly means a test under
a lock against some value which is updated under a lock.
>
> >Also you messed up the formatting in raid10.c
> >
> Can you explain in detail?
>> if (rrdev && (test_bit(Faulty, &rrdev->flags)
>> - || test_bit(Unmerged, &rrdev->flags)))
>> + || test_bit(Unmerged, &rrdev->flags) ||
>> + (test_bit(Replacement, &rrdev->flags) &&
>> + conf->mddev->curr_resync < r10_bio->sector)))
Text that is inside parentheses (or other brackets) should never be to the
left of the opening parenthesis unless that parenthesis is at the end of a
line.
In the original code, the "|| test_bit(Unmerged....." was to the right of the
'('. In your version it starts to the left, and the lines you added also
start to the left.
> >I'm not convinced this optimisation is really worth it.
> >
> Maybe for HDD disk, it only improve speed by reducing some write operation.
> But for ssd disk, it can reduce one write.
Still doesn't sound convincing.
NeilBrown
>
> Thanks!
> Jianpeng Ma
> >NeilBrown
> >
> >
> >>
> >> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
> >> ---
> >> drivers/md/raid1.c | 4 +++-
> >> drivers/md/raid10.c | 4 +++-
> >> drivers/md/raid5.c | 3 ++-
> >> 3 files changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> >> index d5bddfc..142a5fa 100644
> >> --- a/drivers/md/raid1.c
> >> +++ b/drivers/md/raid1.c
> >> @@ -1173,7 +1173,9 @@ read_again:
> >> set_bit(R1BIO_Degraded, &r1_bio->state);
> >> continue;
> >> }
> >> -
> >> + if (test_bit(Replacement, &rdev->flags) &&
> >> + conf->mddev->curr_resync < r1_bio->sector)
> >> + continue;
> >> atomic_inc(&rdev->nr_pending);
> >> if (test_bit(WriteErrorSeen, &rdev->flags)) {
> >> sector_t first_bad;
> >> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> >> index 64d4824..bb11cfb 100644
> >> --- a/drivers/md/raid10.c
> >> +++ b/drivers/md/raid10.c
> >> @@ -1337,7 +1337,9 @@ retry_write:
> >> || test_bit(Unmerged, &rdev->flags)))
> >> rdev = NULL;
> >> if (rrdev && (test_bit(Faulty, &rrdev->flags)
> >> - || test_bit(Unmerged, &rrdev->flags)))
> >> + || test_bit(Unmerged, &rrdev->flags) ||
> >> + (test_bit(Replacement, &rrdev->flags) &&
> >> + conf->mddev->curr_resync < r10_bio->sector)))
> >> rrdev = NULL;
> >>
> >> r10_bio->devs[i].bio = NULL;
> >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> >> index bd49623..e0a2a39 100644
> >> --- a/drivers/md/raid5.c
> >> +++ b/drivers/md/raid5.c
> >> @@ -602,7 +602,8 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >> rdev = NULL;
> >> if (rdev)
> >> atomic_inc(&rdev->nr_pending);
> >> - if (rrdev && test_bit(Faulty, &rrdev->flags))
> >> + if (rrdev && (test_bit(Faulty, &rrdev->flags) ||
> >> + conf->mddev->curr_resync < sh->sector))
> >> rrdev = NULL;
> >> if (rrdev)
> >> atomic_inc(&rrdev->nr_pending);
> >
> >
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Re: [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk.
2013-03-04 5:30 ` NeilBrown
@ 2013-03-05 2:53 ` majianpeng
0 siblings, 0 replies; 5+ messages in thread
From: majianpeng @ 2013-03-05 2:53 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
>On Mon, 4 Mar 2013 10:24:16 +0800 majianpeng <majianpeng@gmail.com> wrote:
>
>> >On Thu, 28 Feb 2013 15:50:37 +0800 majianpeng <majianpeng@gmail.com> wrote:
>> >
>> >> Replacement is a fullsync which don't depent on bitmap.So regardless of
>> >> the presence and absence of bitmap, it do full resync.
>> >> If offset of normal io is larger than offset of resync,it will write
>> >> again when resync arrived this offset.
>> >
>> >This might be OK for RAID1 and RAID10 as recover is paused when writes
>> >happen, but that is not the case for RAID5, so it isn't safe to test against
>> >curr_resync - it gets updated a bit too later.
>> >
>> ->curr_resync + STRIPE_SECTORS is the next stripe which willbe replaced.
>> How about the ->curr_resync+STRIPE_SECTORS?
>
>If you aren't certain, then neither am I.
>
>As resync and normal writes can be intermingled you would need some guarantee
>that a write wouldn't be missed, and that almost certainly means a test under
>a lock against some value which is updated under a lock.
>
Yes as you said, we must not lost to write.
Because the normal io and replace io don't get the same stripe at the same time.
So there are two cases:
A: if (sh->sector < curr_resync)
For this case,withot doubt it must be write.
B: if (sh->sector >= curr_resync)
For sh->sector == curr_resync, this can't happen. Becase replace must wait this stripe.
For sh->sector > curr_resync, although the replace increate the ->curr_resync.But it can't
got the stripe which sh->sector == curr_resync.
So i think there isn't need some locks to protect.
For the above result, the base is the resync and normal io don't get the same stripe at the same time.
>>
>> >Also you messed up the formatting in raid10.c
>> >
>> Can you explain in detail?
>
>
>>> if (rrdev && (test_bit(Faulty, &rrdev->flags)
>>> - || test_bit(Unmerged, &rrdev->flags)))
>>> + || test_bit(Unmerged, &rrdev->flags) ||
>>> + (test_bit(Replacement, &rrdev->flags) &&
>>> + conf->mddev->curr_resync < r10_bio->sector)))
>
>Text that is inside parentheses (or other brackets) should never be to the
>left of the opening parenthesis unless that parenthesis is at the end of a
>line.
>In the original code, the "|| test_bit(Unmerged....." was to the right of the
>'('. In your version it starts to the left, and the lines you added also
>start to the left.
>
Thanks for the advise.
>> >I'm not convinced this optimisation is really worth it.
>> >
>> Maybe for HDD disk, it only improve speed by reducing some write operation.
>> But for ssd disk, it can reduce one write.
>
>Still doesn't sound convincing.
>
Maybe i do some test.But I still insist that write useless data is useless.
Thanks!
Jianpeng Ma
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-03-05 2:53 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-02-28 7:50 [PATCH 3/4] md: Don't do normal-write on unresync area of replacement-disk majianpeng
2013-03-04 2:04 ` NeilBrown
2013-03-04 2:24 ` majianpeng
2013-03-04 5:30 ` NeilBrown
2013-03-05 2:53 ` majianpeng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).