Re: [RFC] MD: Allow restarting an interrupted incremental recovery.


From: NeilBrown <neilb@suse.de>
To: "Andrei E. Warkentin" <andrey.warkentin@gmail.com>
Cc: Andrei Warkentin <andreiw@vmware.com>, linux-raid@vger.kernel.org
Subject: Re: [RFC] MD: Allow restarting an interrupted incremental recovery.
Date: Mon, 17 Oct 2011 14:20:39 +1100
Message-ID: <20111017142039.5a07b12f@notabene.brown>
In-Reply-To: <CANz0V+4-qWrc0HfRB5BD8jYFstSmRBr36YwZw5f9t5wwYggYaA@mail.gmail.com>

On Thu, 13 Oct 2011 21:18:43 -0400 "Andrei E. Warkentin"
<andrey.warkentin@gmail.com> wrote:

> 2011/10/12 Andrei Warkentin <andreiw@vmware.com>:
> > If an incremental recovery was interrupted, a subsequent
> > re-add will result in a full recovery, even though an
> > incremental should be possible (seen with raid1).
> >
> > Solve this problem by not updating the superblock on the
> > recovering device until array is not degraded any longer.
> >
> > Cc: Neil Brown <neilb@suse.de>
> > Signed-off-by: Andrei Warkentin <andreiw@vmware.com>
> 
> FWIW it appears to me that this idea seems to work well, for the
> following reasons:
> 
> 1) The recovering sb is not touched until the array is not degraded
> (only for incremental sync).
> 2) The events_cleared count isn't updated in the active bitmap sb
> until array is not degraded. This implies that if the incremental was
> interrupted, recovering_sb->events is NOT less than
> active_bitmap->events_cleared.
> 3) The bitmaps (and sb) are updated on all drives at all times, as
> they were before.
> 
> How I tested it:
> 1) Create RAID1 array with bitmap.
> 2) Degrade array by removing a drive.
> 3) Write a bunch of data (Gigs...)
> 4) Re-add removed drive - an incremental recovery is started.
> 5) Interrupt the incremental.
> 6) Write some more data.
> 7) MD5sum the data.
> 8) Re-add removed drive - an incremental recovery is restarted (I
> verified it starts at sec 0, just like you mentioned it should be, to
> avoid consistency issues). Verified that, indeed, only changed blocks
> (as noted by write-intent) are synced.
> 9) Remove other half.
> 10) MD5sum data - hashes match.
> 
> Without this fix, you would of course have to deal with a full resync
> after the interrupted incremental.
> 
> Is there anything you think I'm missing here?
> 
> A
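
For reference, point 2 of the quoted reasoning can be sketched as a small check. This is only an illustration of the comparison described above, not the md code itself: the function name and its parameters are hypothetical, and the assumption is that a bitmap-based (incremental) recovery is usable only while the re-added device's event count has not fallen behind the bitmap's events_cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function.
 * rdev_events:           event count stored in the re-added device's sb
 * bitmap_events_cleared: events_cleared stored in the active bitmap sb
 *
 * Assumption: bits in the bitmap are trustworthy for this device only if
 * they were not cleared after the device's last superblock update.
 */
bool bitmap_recovery_possible(uint64_t rdev_events,
                              uint64_t bitmap_events_cleared)
{
    return rdev_events >= bitmap_events_cleared;
}

void example_events_check(void)
{
    /* events_cleared held back while degraded: incremental still possible */
    assert(bitmap_recovery_possible(100, 100));
    assert(bitmap_recovery_possible(120, 100));
    /* events_cleared moved past the device: a full recovery is needed */
    assert(!bitmap_recovery_possible(100, 120));
}
```

Holding back both the recovering superblock and events_cleared, as the patch proposes, is what keeps the first two cases true after an interruption.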

Not much, it looks good, and your testing is of course a good sign.

My only thought is whether we really need the new InIncremental flag.
You set it exactly when saved_raid_disk is set, and clear it exactly when
saved_raid_disk is cleared (set to -1).
So maybe we can just use saved_raid_disk.
If you look at it that way, you might notice that saved_raid_disk is also set
in slot_store, so InIncremental should probably be set there as well.  That
might be the one thing you missed.

Could you respin the patch without adding InIncremental, testing
  rdev->saved_raid_disk >= 0
instead, then check whether you agree that should work and perform a similar
test?  (Is that asking too much?)

If you agree that works I would like to go with that version.
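
To make the suggestion concrete, here is a minimal sketch of the test standing in for the flag. The struct is a cut-down, hypothetical stand-in for the real rdev (only saved_raid_disk is modelled), and the helper name is made up for illustration; the only thing taken from the discussion above is the condition itself and that -1 means "cleared".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's rdev; one field only. */
struct rdev_sketch {
    int saved_raid_disk;    /* previous slot, or -1 when cleared */
};

/* Replaces a separate InIncremental flag with the suggested test. */
bool rdev_in_incremental(const struct rdev_sketch *rdev)
{
    return rdev->saved_raid_disk >= 0;
}

void example_saved_raid_disk(void)
{
    struct rdev_sketch rdev = { .saved_raid_disk = -1 };

    assert(!rdev_in_incremental(&rdev));

    /* Set on re-add, and also (per the note above) in slot_store. */
    rdev.saved_raid_disk = 1;
    assert(rdev_in_incremental(&rdev));

    /* Cleared once recovery has completed. */
    rdev.saved_raid_disk = -1;
    assert(!rdev_in_incremental(&rdev));
}
```

Deriving the state from the existing field means every path that sets saved_raid_disk is covered automatically, which a separately maintained flag would not guarantee.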

Thanks,
NeilBrown

Thread overview: 8+ messages
2011-10-12 19:41 [RFC] MD: Restart an interrupted incremental recovery Andrei Warkentin
2011-10-12 23:04 ` Andrei E. Warkentin
2011-10-12 23:05   ` [RFC] MD: Allow restarting " Andrei Warkentin
2011-10-14  1:18     ` Andrei E. Warkentin
2011-10-17  3:20       ` NeilBrown [this message]
2011-10-17 17:26         ` Andrei Warkentin
2011-10-17 20:58           ` NeilBrown
2011-10-14  1:07 ` [RFC] MD: Restart " Andrei E. Warkentin