Re: feature re-quest for "re-write"

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: NeilBrown <neilb@suse.de>
To: Brad Campbell <lists2009@fnarfbargle.com>
Cc: Mikael Abrahamsson <swmike@swm.pp.se>, linux-raid@vger.kernel.org
Subject: Re: feature re-quest for "re-write"
Date: Tue, 25 Feb 2014 13:10:17 +1100	[thread overview]
Message-ID: <20140225131017.6e71fa5a@notabene.brown> (raw)
In-Reply-To: <530AAD64.4030701@fnarfbargle.com>

[-- Attachment #1: Type: text/plain, Size: 3222 bytes --]

On Mon, 24 Feb 2014 10:24:36 +0800 Brad Campbell <lists2009@fnarfbargle.com>
wrote:

> On 22/02/14 02:09, Mikael Abrahamsson wrote:
> >
> > Hi,
> >
> > we have "check", "repair", "replacement" and other operations on raid
> > volumes.
> >
> > I am not a programmer, but I was wondering how much work it would
> > require to take current code and implement "rewrite", basically
> > re-writing every block in the md raid level. Since "repair" and "check"
> > doesn't seem to properly detect a few errors, wouldn't it make sense to
> > try least existance / easiest implementation route to just re-write all
> > data on the entire array? If reads fail, re-calculate from parity, if
> > reads work, just write again.
> 
> Now, this is after 3 minutes of looking at raid5.c, so if I've missed 
> something obvious please feel free to yell at me. I'm not much of a 
> programmer. Having said that -
> 
> Can someone check my understanding of this bit of code?
> 
> static void handle_parity_checks6(struct r5conf *conf, struct 
> stripe_head *sh,
>                                    struct stripe_head_state *s,
>                                    int disks)
> <....>
> 
>          switch (sh->check_state) {
>          case check_state_idle:
>                  /* start a new check operation if there are < 2 failures */
>                  if (s->failed == s->q_failed) {
>                          /* The only possible failed device holds Q, so it
>                           * makes sense to check P (If anything else 
> were failed,
>                           * we would have used P to recreate it).
>                           */
>                          sh->check_state = check_state_run;
>                  }
>                  if (!s->q_failed && s->failed < 2) {
>                          /* Q is not failed, and we didn't use it to 
> generate
>                           * anything, so it makes sense to check it
>                           */
>                          if (sh->check_state == check_state_run)
>                                  sh->check_state = check_state_run_pq;
>                          else
>                                  sh->check_state = check_state_run_q;
>                  }
> 
>
> So we get passed a stripe. If it's not being checked we :
> 
> - If Q has failed we initiate check_state_run (which checks only P)
> 
> - If we have less than 2 failed drives (lets say we have none), if we 
> are already checking P (check_state_run) we upgrade that to 
> check_state_run_pq (and therefore check both).
> 
> However
> 
> - If we were check_state_idle, beacuse we had 0 failed drives, then we 
> only mark check_state_run_q and therefore skip checking P ??

This code is obviously too subtle.

If 0 drives have failed, then 's->failed' is 0 (it is the count of failed
drives), and  's->q_failed' is also 0 (it is a boolean flag, and q clearly
hasn't failed as nothing has).
So the first 'if' branch will be followed (as "0 == 0") and check_state set to
check_state_run.
Then as q_failed is still 0 and failed < 2, check_state gets set to
check_state_run_pq.

So it does check both p and q.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

next prev parent reply	other threads:[~2014-02-25  2:10 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-21 18:09 feature re-quest for "re-write" Mikael Abrahamsson
2014-02-24  1:30 ` Brad Campbell
2014-02-24  1:46   ` Eyal Lebedinsky
2014-02-24  2:11     ` Brad Campbell
2014-02-24  3:40       ` Eyal Lebedinsky
2014-02-24 14:14         ` Wilson Jonathan
2014-02-24 20:39           ` Eyal Lebedinsky
2014-02-25  3:16             ` NeilBrown
2014-02-25  5:58               ` Eyal Lebedinsky
2014-02-25  7:05                 ` Stan Hoeppner
2014-02-25  7:45                   ` Eyal Lebedinsky
2014-02-25  7:58               ` Eyal Lebedinsky
2014-02-25  8:35                 ` NeilBrown
2014-02-25 11:08                   ` Eyal Lebedinsky
2014-02-25 11:28                     ` Mikael Abrahamsson
2014-02-25 12:05                       ` Eyal Lebedinsky
2014-02-25 12:17                         ` Mikael Abrahamsson
2014-02-25 12:32                           ` Eyal Lebedinsky
2014-02-24  2:42   ` Mikael Abrahamsson
2014-02-24  2:24 ` Brad Campbell
2014-02-25  2:10   ` NeilBrown [this message]
2014-02-25  2:26     ` Brad Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140225131017.6e71fa5a@notabene.brown \
    --to=neilb@suse.de \
    --cc=linux-raid@vger.kernel.org \
    --cc=lists2009@fnarfbargle.com \
    --cc=swmike@swm.pp.se \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).