* [PATCH] dm raid1: "mirror" target doesn't use all available legs on multiple failures
@ 2016-10-10 16:48 Heinz Mauelshagen
2016-10-10 20:41 ` Mike Snitzer
0 siblings, 1 reply; 3+ messages in thread
From: Heinz Mauelshagen @ 2016-10-10 16:48 UTC (permalink / raw)
To: dm-devel; +Cc: Heinz Mauelshagen, snitzer
In case legs of a "mirror" target fail, any read will cause a new,
operational default leg to be selected and the read be resubmitted
to it. If that new default leg fails the read too, no other still
accessible legs are used to resubmit the read again thus failing
the io.
Fix by allowing the read to get resubmitted until there's no
operational legs any more.
Resolves: rhbz1383444
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
drivers/md/dm-raid1.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 7a6254d..dd31019 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -1266,16 +1266,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
goto out;
if (unlikely(error)) {
- if (!bio_record->details.bi_bdev) {
- /*
- * There wasn't enough memory to record necessary
- * information for a retry or there was no other
- * mirror in-sync.
- */
- DMERR_LIMIT("Mirror read failed.");
- return -EIO;
- }
-
m = bio_record->m;
DMERR("Mirror read failed from %s. Trying alternative device.",
@@ -1291,7 +1281,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
bd = &bio_record->details;
dm_bio_restore(bd, bio);
- bio_record->details.bi_bdev = NULL;
bio->bi_error = 0;
queue_bio(ms, bio, rw);
--
2.7.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: dm raid1: "mirror" target doesn't use all available legs on multiple failures
2016-10-10 16:48 [PATCH] dm raid1: "mirror" target doesn't use all available legs on multiple failures Heinz Mauelshagen
@ 2016-10-10 20:41 ` Mike Snitzer
2016-10-11 14:56 ` Heinz Mauelshagen
0 siblings, 1 reply; 3+ messages in thread
From: Mike Snitzer @ 2016-10-10 20:41 UTC (permalink / raw)
To: Heinz Mauelshagen; +Cc: dm-devel
On Mon, Oct 10 2016 at 12:48pm -0400,
Heinz Mauelshagen <heinzm@redhat.com> wrote:
> In case legs of a "mirror" target fail, any read will cause a new,
> operational default leg to be selected and the read be resubmitted
> to it. If that new default leg fails the read too, no other still
> accessible legs are used to resubmit the read again thus failing
> the io.
>
> Fix by allowing the read to get resubmitted until there's no
> operational legs any more.
>
> Resolves: rhbz1383444
>
> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Nothing seems to be checking bio_record->details.bi_bdev anymore.
(The one you've removed really seems like a complete hack to begin with)
Shouldn't this patch go further by removing the other 2 places that set
bio_record->details.bi_bdev = NULL; ?
> ---
> drivers/md/dm-raid1.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index 7a6254d..dd31019 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -1266,16 +1266,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
> goto out;
>
> if (unlikely(error)) {
> - if (!bio_record->details.bi_bdev) {
> - /*
> - * There wasn't enough memory to record necessary
> - * information for a retry or there was no other
> - * mirror in-sync.
> - */
> - DMERR_LIMIT("Mirror read failed.");
> - return -EIO;
> - }
> -
> m = bio_record->m;
>
> DMERR("Mirror read failed from %s. Trying alternative device.",
> @@ -1291,7 +1281,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
> bd = &bio_record->details;
>
> dm_bio_restore(bd, bio);
> - bio_record->details.bi_bdev = NULL;
> bio->bi_error = 0;
>
> queue_bio(ms, bio, rw);
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: dm raid1: "mirror" target doesn't use all available legs on multiple failures
2016-10-10 20:41 ` Mike Snitzer
@ 2016-10-11 14:56 ` Heinz Mauelshagen
0 siblings, 0 replies; 3+ messages in thread
From: Heinz Mauelshagen @ 2016-10-11 14:56 UTC (permalink / raw)
To: Mike Snitzer; +Cc: dm-devel
On 10/10/2016 10:41 PM, Mike Snitzer wrote:
> On Mon, Oct 10 2016 at 12:48pm -0400,
> Heinz Mauelshagen <heinzm@redhat.com> wrote:
>
>> In case legs of a "mirror" target fail, any read will cause a new,
>> operational default leg to be selected and the read be resubmitted
>> to it. If that new default leg fails the read too, no other still
>> accessible legs are used to resubmit the read again thus failing
>> the io.
>>
>> Fix by allowing the read to get resubmitted until there's no
>> operational legs any more.
>>
>> Resolves: rhbz1383444
>>
>> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
> Nothing seems to be checking bio_record->details.bi_bdev anymore.
> (The one you've removed really seems like a complete hack to begin with)
The very point of removing "if (!bio_record->details.bi_bdev) { ..." is
to allow for resubmission of read bios to any other still operational leg
after an initial resubmission failed.
do_reads() will conditionally error the bio in case there are none or
no in sync ones available.
The "mirror" processing for read bios is:
1) bio gets remapped to the default mirror in mirror_map() if region is
in sync our queued to the worker for submission
2) mirror_end_io() spots read error and requeues to the worker
3) worker
- submits bio to new, selected default if respective region is in sync
or
- fails bio if it aims at a non synchronized region
4) if not already failed in do_reads(),
mirror_end_io() processes like in step 1 _but_
errors the read because of the "if (!bio_record->details.bi_bdev) {
..."
My patch changes step 4 by removing the conditional to resubmit it
again in case any other operational leg is available. If none's available,
mirror_end_io() errors the read bios or the do_reads error logic applies
in case of a resubmission.
Why do you think this to be a hack?
>
> Shouldn't this patch go further by removing the other 2 places that set
> bio_record->details.bi_bdev = NULL; ?
Yes, that's a good cleanup.
Let's first settle any still open issues like other unspotted side effects
and I'll do a patch v2 with it.
>
>> ---
>> drivers/md/dm-raid1.c | 11 -----------
>> 1 file changed, 11 deletions(-)
>>
>> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
>> index 7a6254d..dd31019 100644
>> --- a/drivers/md/dm-raid1.c
>> +++ b/drivers/md/dm-raid1.c
>> @@ -1266,16 +1266,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
>> goto out;
>>
>> if (unlikely(error)) {
>> - if (!bio_record->details.bi_bdev) {
>> - /*
>> - * There wasn't enough memory to record necessary
>> - * information for a retry or there was no other
>> - * mirror in-sync.
>> - */
>> - DMERR_LIMIT("Mirror read failed.");
>> - return -EIO;
>> - }
>> -
>> m = bio_record->m;
>>
>> DMERR("Mirror read failed from %s. Trying alternative device.",
>> @@ -1291,7 +1281,6 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
>> bd = &bio_record->details;
>>
>> dm_bio_restore(bd, bio);
>> - bio_record->details.bi_bdev = NULL;
>> bio->bi_error = 0;
>>
>> queue_bio(ms, bio, rw);
>> --
>> 2.7.4
>>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-10-11 14:56 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-10 16:48 [PATCH] dm raid1: "mirror" target doesn't use all available legs on multiple failures Heinz Mauelshagen
2016-10-10 20:41 ` Mike Snitzer
2016-10-11 14:56 ` Heinz Mauelshagen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.