All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Don't lose writes if errors are not handled and log fails
@ 2010-01-31 20:34 Mikulas Patocka
  2010-02-01 17:26 ` Takahiro Yasui
  0 siblings, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2010-01-31 20:34 UTC (permalink / raw)
  To: dm-devel; +Cc: Alasdair G Kergon

Hi

This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197

Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
dm-raid1.

It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
flag to handle errors there.

Mikulas

Don't lose writes if errors are not handled and log fails

If the log fails and errors are not handled by dmeventd, the writes
are successfully finished without being actually written to the device.

This code path is taken:
do_writes:
	bio_list_merge(&ms->failures, &sync);
do_failures:
	if (!get_valid_mirror(ms)) (false)
	else if (errors_handled(ms)) (false)
	else bio_endio(bio, 0);

The logic in do_failures is based on presuming that the write was already
tried --- if it succeeded at least on one leg and errors are not handled,
it is reported as success.

However, bio can be added to the failures queue without being submitted, in
do_writes.

This patch changes it so that bios are added to the failures list only if
errors are handled --- then, they will be held with hold_bio() called from
do_failures.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-raid1.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.33-rc5-devel.orig/drivers/md/dm-raid1.c	2010-01-29 20:29:37.000000000 +0100
+++ linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c	2010-01-30 00:04:18.000000000 +0100
@@ -724,7 +724,7 @@ static void do_writes(struct mirror_set 
 	/*
 	 * Dispatch io.
 	 */
-	if (unlikely(ms->log_failure)) {
+	if (unlikely(ms->log_failure) && errors_handled(ms)) {
 		spin_lock_irq(&ms->lock);
 		bio_list_merge(&ms->failures, &sync);
 		spin_unlock_irq(&ms->lock);

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Don't lose writes if errors are not handled and log fails
  2010-01-31 20:34 [PATCH] Don't lose writes if errors are not handled and log fails Mikulas Patocka
@ 2010-02-01 17:26 ` Takahiro Yasui
  2010-02-02 13:00   ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: Takahiro Yasui @ 2010-02-01 17:26 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon

> This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
> 
> Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
> LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
> dm-raid1.
> 
> It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
> flag to handle errors there.

Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1,
we don't need to backport it to RHEL 5.5. But if not, we need to.

> Don't lose writes if errors are not handled and log fails
> 
> If the log fails and errors are not handled by dmeventd, the writes
> are successfully finished without being actually written to the device.
> 
> This code path is taken:
> do_writes:
> 	bio_list_merge(&ms->failures, &sync);
> do_failures:
> 	if (!get_valid_mirror(ms)) (false)
> 	else if (errors_handled(ms)) (false)
> 	else bio_endio(bio, 0);
> 
> The logic in do_failures is based on presuming that the write was already
> tried --- if it succeeded at least on one leg and errors are not handled,
> it is reported as success.
> 
> However, bio can be added to the failures queue without being submitted, in
> do_writes.
> 
> This patch changes it so that bios are added to the failures list only if
> errors are handled --- then, they will be held with hold_bio() called from
> do_failures.

I agree that bios should be issued by do_write() when ms->log_failures is set,
but do we need to add bios to the failures queue? As you mentioned, the failures
queue should be used to bios which are already handled. Therefore, I think
bios are better to handled directly by hold_bio() instead of adding them to
the failures queue as bios for nosync regions are done.

-	if (unlikely(ms->log_failure)) {
-		spin_lock_irq(&ms->lock);
-		bio_list_merge(&ms->failures, &sync);
-		spin_unlock_irq(&ms->lock);
-		wakeup_mirrord(ms);
-	} else
-		while ((bio = bio_list_pop(&sync)))
+	while ((bio = bio_list_pop(&sync)))
+		if (unlikely(ms->log_failure) && errors_handled(ms))
+			hold_bio(ms, bio);
+		else
 			do_write(ms, bio);

The policy to treat bios when ms->log_failures == 1 is different, but
the above code is based on the following patch.

https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html

Thanks,
Taka

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Don't lose writes if errors are not handled and log fails
  2010-02-01 17:26 ` Takahiro Yasui
@ 2010-02-02 13:00   ` Mikulas Patocka
  2010-02-02 18:17     ` [PATCH] If all legs fail, return an error Mikulas Patocka
  2010-02-02 19:46     ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui
  0 siblings, 2 replies; 6+ messages in thread
From: Mikulas Patocka @ 2010-02-02 13:00 UTC (permalink / raw)
  To: Takahiro Yasui; +Cc: dm-devel, Alasdair G Kergon



On Mon, 1 Feb 2010, Takahiro Yasui wrote:

> > This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
> > 
> > Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
> > LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
> > dm-raid1.
> > 
> > It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
> > flag to handle errors there.
> 
> Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1,
> we don't need to backport it to RHEL 5.5. But if not, we need to.
> 
> > Don't lose writes if errors are not handled and log fails
> > 
> > If the log fails and errors are not handled by dmeventd, the writes
> > are successfully finished without being actually written to the device.
> > 
> > This code path is taken:
> > do_writes:
> > 	bio_list_merge(&ms->failures, &sync);
> > do_failures:
> > 	if (!get_valid_mirror(ms)) (false)
> > 	else if (errors_handled(ms)) (false)
> > 	else bio_endio(bio, 0);
> > 
> > The logic in do_failures is based on presuming that the write was already
> > tried --- if it succeeded at least on one leg and errors are not handled,
> > it is reported as success.
> > 
> > However, bio can be added to the failures queue without being submitted, in
> > do_writes.
> > 
> > This patch changes it so that bios are added to the failures list only if
> > errors are handled --- then, they will be held with hold_bio() called from
> > do_failures.
> 
> I agree that bios should be issued by do_write() when ms->log_failures is set,
> but do we need to add bios to the failures queue? As you mentioned, the failures
> queue should be used to bios which are already handled. Therefore, I think
> bios are better to handled directly by hold_bio() instead of adding them to
> the failures queue as bios for nosync regions are done.
> 
> -	if (unlikely(ms->log_failure)) {
> -		spin_lock_irq(&ms->lock);
> -		bio_list_merge(&ms->failures, &sync);
> -		spin_unlock_irq(&ms->lock);
> -		wakeup_mirrord(ms);
> -	} else
> -		while ((bio = bio_list_pop(&sync)))
> +	while ((bio = bio_list_pop(&sync)))
> +		if (unlikely(ms->log_failure) && errors_handled(ms))
> +			hold_bio(ms, bio);
> +		else
>  			do_write(ms, bio);

I thought about this too, but I'd decided to put the bios on the failures 
queue rather than holding them for this reason: if all the legs fail, it 
is better to terminate the bio with -EIO than to hold it. If all the legs 
fail, you can't save anything anyway and the less things you are holding, 
the less possibility for deadlocks exists.

So I put the bios to the failure queue and do_failures will terminate them 
with -EIO if all the legs failred and hold them if we use dmeventd and 
there is at least one live leg.

Mikulas

> The policy to treat bios when ms->log_failures == 1 is different, but
> the above code is based on the following patch.
> 
> https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html
> 
> Thanks,
> Taka
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] If all legs fail, return an error
  2010-02-02 13:00   ` Mikulas Patocka
@ 2010-02-02 18:17     ` Mikulas Patocka
  2010-02-02 19:47       ` Takahiro Yasui
  2010-02-02 19:46     ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui
  1 sibling, 1 reply; 6+ messages in thread
From: Mikulas Patocka @ 2010-02-02 18:17 UTC (permalink / raw)
  To: Alasdair G Kergon; +Cc: dm-devel



On Tue, 2 Feb 2010, Mikulas Patocka wrote:

> 
> 
> On Mon, 1 Feb 2010, Takahiro Yasui wrote:
> 
> > > This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
> > > 
> > > Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
> > > LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
> > > dm-raid1.
> > > 
> > > It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
> > > flag to handle errors there.
> > 
> > Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1,
> > we don't need to backport it to RHEL 5.5. But if not, we need to.
> > 
> > > Don't lose writes if errors are not handled and log fails
> > > 
> > > If the log fails and errors are not handled by dmeventd, the writes
> > > are successfully finished without being actually written to the device.
> > > 
> > > This code path is taken:
> > > do_writes:
> > > 	bio_list_merge(&ms->failures, &sync);
> > > do_failures:
> > > 	if (!get_valid_mirror(ms)) (false)
> > > 	else if (errors_handled(ms)) (false)
> > > 	else bio_endio(bio, 0);
> > > 
> > > The logic in do_failures is based on presuming that the write was already
> > > tried --- if it succeeded at least on one leg and errors are not handled,
> > > it is reported as success.
> > > 
> > > However, bio can be added to the failures queue without being submitted, in
> > > do_writes.
> > > 
> > > This patch changes it so that bios are added to the failures list only if
> > > errors are handled --- then, they will be held with hold_bio() called from
> > > do_failures.
> > 
> > I agree that bios should be issued by do_write() when ms->log_failures is set,
> > but do we need to add bios to the failures queue? As you mentioned, the failures
> > queue should be used to bios which are already handled. Therefore, I think
> > bios are better to handled directly by hold_bio() instead of adding them to
> > the failures queue as bios for nosync regions are done.
> > 
> > -	if (unlikely(ms->log_failure)) {
> > -		spin_lock_irq(&ms->lock);
> > -		bio_list_merge(&ms->failures, &sync);
> > -		spin_unlock_irq(&ms->lock);
> > -		wakeup_mirrord(ms);
> > -	} else
> > -		while ((bio = bio_list_pop(&sync)))
> > +	while ((bio = bio_list_pop(&sync)))
> > +		if (unlikely(ms->log_failure) && errors_handled(ms))
> > +			hold_bio(ms, bio);
> > +		else
> >  			do_write(ms, bio);
> 
> I thought about this too, but I'd decided to put the bios on the failures 
> queue rather than holding them for this reason: if all the legs fail, it 
> is better to terminate the bio with -EIO than to hold it. If all the legs 
> fail, you can't save anything anyway and the less things you are holding, 
> the less possibility for deadlocks exists.
> 
> So I put the bios to the failure queue and do_failures will terminate them 
> with -EIO if all the legs failred and hold them if we use dmeventd and 
> there is at least one live leg.
> 
> Mikulas

BTW. if you want to follow this principle (abort bio if all legs fail), 
you can also apply this patch, that does it for out-of-sync regions. It is 
not critical, so it can wait for 2.6.34.

Mikulas

---

If all legs fail, return an error rather than holding the bio

Add the bio to the failures list instead of holding it. It is processed
in do_failures, do_failures tests first if all legs failed and returns
the bio with -EIO in this case. If there is some leg alive and errors
are handled by dmeventd, do_failures calls hold_bio.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-raid1.c |    9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Index: linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.33-rc5-devel.orig/drivers/md/dm-raid1.c	2010-02-02 17:23:08.000000000 +0100
+++ linux-2.6.33-rc5-devel/drivers/md/dm-raid1.c	2010-02-02 17:23:40.000000000 +0100
@@ -737,9 +737,12 @@ static void do_writes(struct mirror_set 
 		dm_rh_delay(ms->rh, bio);
 
 	while ((bio = bio_list_pop(&nosync))) {
-		if (unlikely(ms->leg_failure) && errors_handled(ms))
-			hold_bio(ms, bio);
-		else {
+		if (unlikely(ms->leg_failure) && errors_handled(ms)) {
+			spin_lock_irq(&ms->lock);
+			bio_list_add(&ms->failures, bio);
+			spin_unlock_irq(&ms->lock);
+			wakeup_mirrord(ms);
+		} else {
 			map_bio(get_default_mirror(ms), bio);
 			generic_make_request(bio);
 		}

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Don't lose writes if errors are not handled and log fails
  2010-02-02 13:00   ` Mikulas Patocka
  2010-02-02 18:17     ` [PATCH] If all legs fail, return an error Mikulas Patocka
@ 2010-02-02 19:46     ` Takahiro Yasui
  1 sibling, 0 replies; 6+ messages in thread
From: Takahiro Yasui @ 2010-02-02 19:46 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon

On 02/02/10 08:00, Mikulas Patocka wrote:
> On Mon, 1 Feb 2010, Takahiro Yasui wrote:
> 
>>> This fixes bug https://bugzilla.redhat.com/show_bug.cgi?id=555197
>>>
>>> Please submit this patch before 2.6.33 goes out. It fixes a bug when old 
>>> LVM (<= 2.02.51) is used, that doesn't pass errors_handled flag to 
>>> dm-raid1.
>>>
>>> It doesn't need to be backported to RHEL 5.5, because lvm always passes a 
>>> flag to handle errors there.
>>
>> Don't we need to backport it to RHEL 5.5? If lvm is the only user of dm-raid1,
>> we don't need to backport it to RHEL 5.5. But if not, we need to.
>>
>>> Don't lose writes if errors are not handled and log fails
>>>
>>> If the log fails and errors are not handled by dmeventd, the writes
>>> are successfully finished without being actually written to the device.
>>>
>>> This code path is taken:
>>> do_writes:
>>> 	bio_list_merge(&ms->failures, &sync);
>>> do_failures:
>>> 	if (!get_valid_mirror(ms)) (false)
>>> 	else if (errors_handled(ms)) (false)
>>> 	else bio_endio(bio, 0);
>>>
>>> The logic in do_failures is based on presuming that the write was already
>>> tried --- if it succeeded at least on one leg and errors are not handled,
>>> it is reported as success.
>>>
>>> However, bio can be added to the failures queue without being submitted, in
>>> do_writes.
>>>
>>> This patch changes it so that bios are added to the failures list only if
>>> errors are handled --- then, they will be held with hold_bio() called from
>>> do_failures.
>>
>> I agree that bios should be issued by do_write() when ms->log_failures is set,
>> but do we need to add bios to the failures queue? As you mentioned, the failures
>> queue should be used to bios which are already handled. Therefore, I think
>> bios are better to handled directly by hold_bio() instead of adding them to
>> the failures queue as bios for nosync regions are done.
>>
>> -	if (unlikely(ms->log_failure)) {
>> -		spin_lock_irq(&ms->lock);
>> -		bio_list_merge(&ms->failures, &sync);
>> -		spin_unlock_irq(&ms->lock);
>> -		wakeup_mirrord(ms);
>> -	} else
>> -		while ((bio = bio_list_pop(&sync)))
>> +	while ((bio = bio_list_pop(&sync)))
>> +		if (unlikely(ms->log_failure) && errors_handled(ms))
>> +			hold_bio(ms, bio);
>> +		else
>>  			do_write(ms, bio);
> 
> I thought about this too, but I'd decided to put the bios on the failures 
> queue rather than holding them for this reason: if all the legs fail, it 
> is better to terminate the bio with -EIO than to hold it. If all the legs 
> fail, you can't save anything anyway and the less things you are holding, 
> the less possibility for deadlocks exists.

Thank you for the explanation. With your following patch, we can keep
consistency of the policy for sync and nosync region.

https://www.redhat.com/archives/dm-devel/2010-February/msg00014.html

Reviewed-by: Takahiro Yasui <tyasui@redhat.com>

Thanks,
Taka


> So I put the bios to the failure queue and do_failures will terminate them 
> with -EIO if all the legs failred and hold them if we use dmeventd and 
> there is at least one live leg.
>> https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html
>>
>> Thanks,
>> Taka
>>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] If all legs fail, return an error
  2010-02-02 18:17     ` [PATCH] If all legs fail, return an error Mikulas Patocka
@ 2010-02-02 19:47       ` Takahiro Yasui
  0 siblings, 0 replies; 6+ messages in thread
From: Takahiro Yasui @ 2010-02-02 19:47 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: dm-devel, Alasdair G Kergon

On 02/02/10 13:17, Mikulas Patocka wrote:
> BTW. if you want to follow this principle (abort bio if all legs fail), 
> you can also apply this patch, that does it for out-of-sync regions. It is 
> not critical, so it can wait for 2.6.34.

Looks good to me.

Reviewed-by: Takahiro Yasui <tyasui@redhat.com>

Thanks,
Taka

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-02-02 19:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-31 20:34 [PATCH] Don't lose writes if errors are not handled and log fails Mikulas Patocka
2010-02-01 17:26 ` Takahiro Yasui
2010-02-02 13:00   ` Mikulas Patocka
2010-02-02 18:17     ` [PATCH] If all legs fail, return an error Mikulas Patocka
2010-02-02 19:47       ` Takahiro Yasui
2010-02-02 19:46     ` [PATCH] Don't lose writes if errors are not handled and log fails Takahiro Yasui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.