* [PATCH] Don't lose writes if errors are not handled and log fails
@ 2010-01-31 20:34 Mikulas Patocka
2010-02-01 17:26 ` Takahiro Yasui
0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2010-01-31 20:34 UTC (permalink / raw)
To: dm-devel; +Cc: Alasdair G Kergon
Hi
This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
Please submit this patch before 2.6.33 goes out. It fixes a bug when old
LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to
dm-raid1.
It doesn't need to be backported to RHEL 5.5, because lvm always passes a
flag to handle errors there.
Mikulas
Don't lose writes if errors are not handled and log fails
If the log fails and errors are not handled by dmeventd, the writes
are successfully finished without being actually written to the device.
This code path is taken:
do_writes:
bio_list_merge(&ms->failures, &sync);
do_failures:
if (!get_valid_mirror(ms)) (false)
else if (errors_handled(ms)) (false)
else bio_endio(bio, 0);
The logic in do_failures is based on presuming that the write was already
tried --- if it succeeded at least on one leg and errors are not handled,
it is reported as success.
However, bio can be added to the failures queue without being submitted, in
do_writes.
This patch changes it so that bios are added to the failures list only if
errors are handled --- then, they will be held with hold_bio() called from
do_failures.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
---
drivers/md/dm-raid1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.33-rc5-devel.orig/drivers/md/dm-raid1.c 2010-01-29 20:29:37.000000000 +0100
+++ linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c 2010-01-30 00:04:18.000000000 +0100
@@ -724,7 +724,7 @@ static void do_writes(struct mirror_set
/*
* Dispatch io.
*/
- if (unlikely(ms->log_failure)) {
+ if (unlikely(ms->log_failure) && errors_handled(ms)) {
spin_lock_irq(&ms->lock);
bio_list_merge(&ms->failures, &sync);
spin_unlock_irq(&ms->lock);
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH] Don't lose writes if errors are not handled and log fails 2010-01-31 20:34 [PATCH] Don't lose writes if errors are not handled and log fails Mikulas Patocka @ 2010-02-01 17:26 ` Takahiro Yasui 2010-02-02 13:00 ` Mikulas Patocka 0 siblings, 1 reply; 6+ messages in thread From: Takahiro Yasui @ 2010-02-01 17:26 UTC (permalink / raw) To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon > This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197 > > Please submit this patch before 2.6.33 goes out. It fixes a bug when old > LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to > dm-raid1. > > It doesn't need to be backported to RHEL 5.5, because lvm always passes a > flag to handle errors there. Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1, we don't need to backport it to RHEL 5.5. But if not, we need to. > Don't lose writes if errors are not handled and log fails > > If the log fails and errors are not handled by dmeventd, the writes > are successfully finished without being actually written to the device. > > This code path is taken: > do_writes: > bio_list_merge(&ms->failures, &sync); > do_failures: > if (!get_valid_mirror(ms)) (false) > else if (errors_handled(ms)) (false) > else bio_endio(bio, 0); > > The logic in do_failures is based on presuming that the write was already > tried --- if it succeeded at least on one leg and errors are not handled, > it is reported as success. > > However, bio can be added to the failures queue without being submitted, in > do_writes. > > This patch changes it so that bios are added to the failures list only if > errors are handled --- then, they will be held with hold_bio() called from > do_failures. I agree that bios should be issued by do_write() when ms->log_failures is set, but do we need to add bios to the failures queue? As you mentioned, the failures queue should be used to bios which are already handled. Therefore, I think bios are better to handled directly by hold_bio() instead of adding them to the failures queue as bios for nosync regions are done. - if (unlikely(ms->log_failure)) { - spin_lock_irq(&ms->lock); - bio_list_merge(&ms->failures, &sync); - spin_unlock_irq(&ms->lock); - wakeup_mirrord(ms); - } else - while ((bio = bio_list_pop(&sync))) + while ((bio = bio_list_pop(&sync))) + if (unlikely(ms->log_failure) && errors_handled(ms)) + hold_bio(ms, bio); + else do_write(ms, bio); The policy to treat bios when ms->log_failures == 1 is different, but the above code is based on the following patch. https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html Thanks, Taka ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Don't lose writes if errors are not handled and log fails 2010-02-01 17:26 ` Takahiro Yasui @ 2010-02-02 13:00 ` Mikulas Patocka 2010-02-02 18:17 ` [PATCH] If all legs fail, return an error Mikulas Patocka 2010-02-02 19:46 ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui 0 siblings, 2 replies; 6+ messages in thread From: Mikulas Patocka @ 2010-02-02 13:00 UTC (permalink / raw) To: Takahiro Yasui; +Cc: dm-devel, Alasdair G Kergon On Mon, 1 Feb 2010, Takahiro Yasui wrote: > > This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197 > > > > Please submit this patch before 2.6.33 goes out. It fixes a bug when old > > LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to > > dm-raid1. > > > > It doesn't need to be backported to RHEL 5.5, because lvm always passes a > > flag to handle errors there. > > Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1, > we don't need to backport it to RHEL 5.5. But if not, we need to. > > > Don't lose writes if errors are not handled and log fails > > > > If the log fails and errors are not handled by dmeventd, the writes > > are successfully finished without being actually written to the device. > > > > This code path is taken: > > do_writes: > > bio_list_merge(&ms->failures, &sync); > > do_failures: > > if (!get_valid_mirror(ms)) (false) > > else if (errors_handled(ms)) (false) > > else bio_endio(bio, 0); > > > > The logic in do_failures is based on presuming that the write was already > > tried --- if it succeeded at least on one leg and errors are not handled, > > it is reported as success. > > > > However, bio can be added to the failures queue without being submitted, in > > do_writes. > > > > This patch changes it so that bios are added to the failures list only if > > errors are handled --- then, they will be held with hold_bio() called from > > do_failures. > > I agree that bios should be issued by do_write() when ms->log_failures is set, > but do we need to add bios to the failures queue? As you mentioned, the failures > queue should be used to bios which are already handled. Therefore, I think > bios are better to handled directly by hold_bio() instead of adding them to > the failures queue as bios for nosync regions are done. > > - if (unlikely(ms->log_failure)) { > - spin_lock_irq(&ms->lock); > - bio_list_merge(&ms->failures, &sync); > - spin_unlock_irq(&ms->lock); > - wakeup_mirrord(ms); > - } else > - while ((bio = bio_list_pop(&sync))) > + while ((bio = bio_list_pop(&sync))) > + if (unlikely(ms->log_failure) && errors_handled(ms)) > + hold_bio(ms, bio); > + else > do_write(ms, bio); I thought about this too, but I'd decided to put the bios on the failures queue rather than holding them for this reason: if all the legs fail, it is better to terminate the bio with -EIO than to hold it. If all the legs fail, you can't save anything anyway and the less things you are holding, the less possibility for deadlocks exists. So I put the bios to the failure queue and do_failures will terminate them with -EIO if all the legs failred and hold them if we use dmeventd and there is at least one live leg. Mikulas > The policy to treat bios when ms->log_failures == 1 is different, but > the above code is based on the following patch. > > https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html > > Thanks, > Taka > ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] If all legs fail, return an error 2010-02-02 13:00 ` Mikulas Patocka @ 2010-02-02 18:17 ` Mikulas Patocka 2010-02-02 19:47 ` Takahiro Yasui 2010-02-02 19:46 ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui 1 sibling, 1 reply; 6+ messages in thread From: Mikulas Patocka @ 2010-02-02 18:17 UTC (permalink / raw) To: Alasdair G Kergon; +Cc: dm-devel On Tue, 2 Feb 2010, Mikulas Patocka wrote: > > > On Mon, 1 Feb 2010, Takahiro Yasui wrote: > > > > This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197 > > > > > > Please submit this patch before 2.6.33 goes out. It fixes a bug when old > > > LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to > > > dm-raid1. > > > > > > It doesn't need to be backported to RHEL 5.5, because lvm always passes a > > > flag to handle errors there. > > > > Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1, > > we don't need to backport it to RHEL 5.5. But if not, we need to. > > > > > Don't lose writes if errors are not handled and log fails > > > > > > If the log fails and errors are not handled by dmeventd, the writes > > > are successfully finished without being actually written to the device. > > > > > > This code path is taken: > > > do_writes: > > > bio_list_merge(&ms->failures, &sync); > > > do_failures: > > > if (!get_valid_mirror(ms)) (false) > > > else if (errors_handled(ms)) (false) > > > else bio_endio(bio, 0); > > > > > > The logic in do_failures is based on presuming that the write was already > > > tried --- if it succeeded at least on one leg and errors are not handled, > > > it is reported as success. > > > > > > However, bio can be added to the failures queue without being submitted, in > > > do_writes. > > > > > > This patch changes it so that bios are added to the failures list only if > > > errors are handled --- then, they will be held with hold_bio() called from > > > do_failures. > > > > I agree that bios should be issued by do_write() when ms->log_failures is set, > > but do we need to add bios to the failures queue? As you mentioned, the failures > > queue should be used to bios which are already handled. Therefore, I think > > bios are better to handled directly by hold_bio() instead of adding them to > > the failures queue as bios for nosync regions are done. > > > > - if (unlikely(ms->log_failure)) { > > - spin_lock_irq(&ms->lock); > > - bio_list_merge(&ms->failures, &sync); > > - spin_unlock_irq(&ms->lock); > > - wakeup_mirrord(ms); > > - } else > > - while ((bio = bio_list_pop(&sync))) > > + while ((bio = bio_list_pop(&sync))) > > + if (unlikely(ms->log_failure) && errors_handled(ms)) > > + hold_bio(ms, bio); > > + else > > do_write(ms, bio); > > I thought about this too, but I'd decided to put the bios on the failures > queue rather than holding them for this reason: if all the legs fail, it > is better to terminate the bio with -EIO than to hold it. If all the legs > fail, you can't save anything anyway and the less things you are holding, > the less possibility for deadlocks exists. > > So I put the bios to the failure queue and do_failures will terminate them > with -EIO if all the legs failred and hold them if we use dmeventd and > there is at least one live leg. > > Mikulas BTW. if you want to follow this principle (abort bio if all legs fail), you can also apply this patch, that does it for out-of-sync regions. It is not critical, so it can wait for 2.6.34. Mikulas --- If all legs fail, return an error rather than holding the bio Add the bio to the failures list instead of holding it. It is processed in do_failures, do_failures tests first if all legs failed and returns the bio with -EIO in this case. If there is some leg alive and errors are handled by dmeventd, do_failures calls hold_bio. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> --- drivers/md/dm-raid1.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) Index: linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c =================================================================== --- linux-2.6.33-rc5-devel.orig/drivers/md/dm-raid1.c 2010-02-02 17:23:08.000000000 +0100 +++ linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c 2010-02-02 17:23:40.000000000 +0100 @@ -737,9 +737,12 @@ static void do_writes(struct mirror_set dm_rh_delay(ms->rh, bio); while ((bio = bio_list_pop(&nosync))) { - if (unlikely(ms->leg_failure) && errors_handled(ms)) - hold_bio(ms, bio); - else { + if (unlikely(ms->leg_failure) && errors_handled(ms)) { + spin_lock_irq(&ms->lock); + bio_list_add(&ms->failures, bio); + spin_unlock_irq(&ms->lock); + wakeup_mirrord(ms); + } else { map_bio(get_default_mirror(ms), bio); generic_make_request(bio); } ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] If all legs fail, return an error 2010-02-02 18:17 ` [PATCH] If all legs fail, return an error Mikulas Patocka @ 2010-02-02 19:47 ` Takahiro Yasui 0 siblings, 0 replies; 6+ messages in thread From: Takahiro Yasui @ 2010-02-02 19:47 UTC (permalink / raw) To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon On 02/02/10 13:17, Mikulas Patocka wrote: > BTW. if you want to follow this principle (abort bio if all legs fail), > you can also apply this patch, that does it for out-of-sync regions. It is > not critical, so it can wait for 2.6.34. Looks good to me. Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Thanks, Taka ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] Don't lose writes if errors are not handled and log fails 2010-02-02 13:00 ` Mikulas Patocka 2010-02-02 18:17 ` [PATCH] If all legs fail, return an error Mikulas Patocka @ 2010-02-02 19:46 ` Takahiro Yasui 1 sibling, 0 replies; 6+ messages in thread From: Takahiro Yasui @ 2010-02-02 19:46 UTC (permalink / raw) To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon On 02/02/10 08:00, Mikulas Patocka wrote: > On Mon, 1 Feb 2010, Takahiro Yasui wrote: > >>> This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197 >>> >>> Please submit this patch before 2.6.33 goes out. It fixes a bug when old >>> LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to >>> dm-raid1. >>> >>> It doesn't need to be backported to RHEL 5.5, because lvm always passes a >>> flag to handle errors there. >> >> Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1, >> we don't need to backport it to RHEL 5.5. But if not, we need to. >> >>> Don't lose writes if errors are not handled and log fails >>> >>> If the log fails and errors are not handled by dmeventd, the writes >>> are successfully finished without being actually written to the device. >>> >>> This code path is taken: >>> do_writes: >>> bio_list_merge(&ms->failures, &sync); >>> do_failures: >>> if (!get_valid_mirror(ms)) (false) >>> else if (errors_handled(ms)) (false) >>> else bio_endio(bio, 0); >>> >>> The logic in do_failures is based on presuming that the write was already >>> tried --- if it succeeded at least on one leg and errors are not handled, >>> it is reported as success. >>> >>> However, bio can be added to the failures queue without being submitted, in >>> do_writes. >>> >>> This patch changes it so that bios are added to the failures list only if >>> errors are handled --- then, they will be held with hold_bio() called from >>> do_failures. >> >> I agree that bios should be issued by do_write() when ms->log_failures is set, >> but do we need to add bios to the failures queue? As you mentioned, the failures >> queue should be used to bios which are already handled. Therefore, I think >> bios are better to handled directly by hold_bio() instead of adding them to >> the failures queue as bios for nosync regions are done. >> >> - if (unlikely(ms->log_failure)) { >> - spin_lock_irq(&ms->lock); >> - bio_list_merge(&ms->failures, &sync); >> - spin_unlock_irq(&ms->lock); >> - wakeup_mirrord(ms); >> - } else >> - while ((bio = bio_list_pop(&sync))) >> + while ((bio = bio_list_pop(&sync))) >> + if (unlikely(ms->log_failure) && errors_handled(ms)) >> + hold_bio(ms, bio); >> + else >> do_write(ms, bio); > > I thought about this too, but I'd decided to put the bios on the failures > queue rather than holding them for this reason: if all the legs fail, it > is better to terminate the bio with -EIO than to hold it. If all the legs > fail, you can't save anything anyway and the less things you are holding, > the less possibility for deadlocks exists. Thank you for the explanation. With your following patch, we can keep consistency of the policy for sync and nosync region. https://www.redhat.com/archives/dm-devel/2010-February/msg00014.html Reviewed-by: Takahiro Yasui <tyasui@redhat.com> Thanks, Taka > So I put the bios to the failure queue and do_failures will terminate them > with -EIO if all the legs failred and hold them if we use dmeventd and > there is at least one live leg. >> https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html >> >> Thanks, >> Taka >> ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-02-02 19:47 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-31 20:34 [PATCH] Don't lose writes if errors are not handled and log fails Mikulas Patocka 2010-02-01 17:26 ` Takahiro Yasui 2010-02-02 13:00 ` Mikulas Patocka 2010-02-02 18:17 ` [PATCH] If all legs fail, return an error Mikulas Patocka 2010-02-02 19:47 ` Takahiro Yasui 2010-02-02 19:46 ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.