Re: [PATCH 7/7] Hold all write bios when errors are handled

From: malahal@us.ibm.com
To: dm-devel@redhat.com
Subject: Re: [PATCH 7/7] Hold all write bios when errors are handled
Date: Tue, 24 Nov 2009 11:17:04 -0800
Message-ID: <20091124191704.GB7971@us.ibm.com>
In-Reply-To: <Pine.LNX.4.64.0911240638250.27545@hs20-bc2-1.build.redhat.com>

Mikulas Patocka [mpatocka@redhat.com] wrote:
> Yes, writes after the failed request are processed, but it is not a 
> problem --- if the write succeeded on all legs, it is returned as success 
> --- in this case, resynchronization can't corrupt written data. If the 
> write succeeded only on some legs, it is held again.
> 
> So in practice, if some leg fails completely, all writes will be held.
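
The rule quoted above can be sketched as follows. This is only an
illustration in Python, not the dm-raid1 code: classify_write() and
WriteOutcome are hypothetical names, and the per-leg results are modelled
as plain booleans. The quoted text does not say what happens when a write
fails on every leg, so the sketch simply treats any failure as "hold".

```python
from enum import Enum


class WriteOutcome(Enum):
    COMPLETE_OK = "complete_ok"  # report success to the submitter
    HOLD = "hold"                # keep the bio until the error is handled


def classify_write(leg_results):
    """Decide what to do with a write, given one bool per mirror leg.

    leg_results[i] is True if the write succeeded on leg i.
    Hypothetical helper: models only the rule from the quoted text.
    """
    if not leg_results:
        raise ValueError("a mirror needs at least one leg")
    # Success on all legs: resynchronization cannot overwrite this data,
    # so it is safe to report the write as complete.
    if all(leg_results):
        return WriteOutcome.COMPLETE_OK
    # Success on only some legs: hold the bio again (assumption: the
    # all-legs-failed case is treated the same way in this sketch).
    return WriteOutcome.HOLD
```

With one leg failed completely, every write carries at least one False
entry, so every write ends up held, which matches the "in practice"
remark above.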

I need to look at the code again, but I thought any new writes to a
failed region go to a surviving leg. In that case, we end up returning
I/Os to the application after writing to only a single leg.
 
> > Also, we do need to do the above work only if "primary" leg fails. We
> > can continue to work just like the old code if "secondary" legs fail,
> > right? Not sure if this is worth optimizing though, but I would like to
> > see it implemented as it is just a few extra checks. We can have
> > primary_failure field like log_failure field.
 
> I thought about it too, but concluded that we need to hold bios even if 
> the secondary leg fails.
> 
> Imagine this scenario:
> * secondary leg fails
> * write fails on the secondary leg and succeeds on the primary leg 
> and is successfully completed
> * the computer crashes
> * after a reboot, the primary leg is inaccessible and the secondary leg is 
> back online --- now raid1 would be returning stale data.

The software can detect this case. We can fail this completely or use
the data from the secondary that could be "stale" with help from admin. 
Let us call this method 1.
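
A rough sketch of what I mean by method 1, in Python for brevity. It
assumes the failure of the secondary was recorded persistently before
the write was completed; the marked_failed flag, the Leg type, and
activation_decision() are all hypothetical names for illustration, not
existing dm-raid1 or LVM interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    name: str
    accessible: bool
    # Hypothetical persistent flag: this leg missed at least one write.
    marked_failed: bool


def activation_decision(legs, admin_allows_stale=False):
    """Decide how to bring up the mirror after a reboot (method 1 sketch)."""
    usable = [leg for leg in legs if leg.accessible]
    in_sync = [leg for leg in usable if not leg.marked_failed]
    if in_sync:
        # At least one accessible leg has all committed writes.
        return "activate"
    if usable and admin_allows_stale:
        # Only possibly stale legs remain; the admin chose to use them.
        return "activate_stale"
    # Either nothing is accessible, or only stale data without consent.
    return "refuse"
```

For the scenario quoted above (primary inaccessible, secondary back
online but marked as failed), this returns "refuse" by default and
"activate_stale" only when the admin explicitly asks for it.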

> If we hold the bios if the secondary leg fails (as the patch does), one of 
> these two scenarios happens:
> 
> * secondary leg fails
> * write succeeds on the primary leg and is held
> * the computer crashes
> * after a reboot, the primary leg is inaccessible and the secondary leg is
> back online --- but we haven't completed the write, so the transaction 
> wasn't reported as committed
> 
> or
> 
> * secondary leg fails
> * write succeeds on the primary leg and is held
> * dmeventd removes the secondary leg and the write succeeds
> * the computer crashes
> * after a reboot, the primary leg is inaccessible, the secondary leg was 
> already removed by dmeventd, so the array is considered inaccessible. So 
> it doesn't work but at least it doesn't revert already committed 
> transaction.

How is this latter case (which doesn't need a crash anyway) different
from, or better than, the case where we detect that the 'primary' is
missing and ask the admin whether he wants to use the data on the
secondary? At least the admin has a choice with "method 1"; this
approach doesn't offer that choice.

Thanks, Malahal.
