Re: [PATCH 03/18] md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash

From: Bill Davidsen <davidsen@tmr.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 03/18] md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash
Date: Fri, 13 Feb 2009 11:20:24 -0500
Message-ID: <49959DC8.1000603@tmr.com>
In-Reply-To: <20090212031010.23983.74842.stgit@notabene.brown>

NeilBrown wrote:
> Version 1.x metadata has the ability to record the status of a
> partially completed drive recovery.
> However we only update that record on a clean shutdown.
> It would be nice to update it on unclean shutdowns too, particularly
> when using a bitmap that removes much of the 'sync' effort after an
> unclean shutdown.
>
> One complication with checkpointing recovery is that we only know
> where we are up to in terms of IO requests started, not which ones
> have completed.  And we need to know what has completed to record
> how much is recovered.  So occasionally pause the recovery until all
> submitted requests are completed, then update the record of where
> we are up to.
>
> When we have a bitmap, we already do that pause occasionally to keep
> the bitmap up-to-date.  So enhance that code to record the recovery
> offset and schedule a superblock update.
> And when there is no bitmap, just pause 16 times during the resync to
> do a checkpoint.
> '16' is a fairly arbitrary number.  But we don't really have any good
> way to judge how often is acceptable, and it seems like a reasonable
> number for now.
>   
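
To make the pause-and-checkpoint idea in the quoted description concrete,
here is a minimal sketch in Python. It is a simulation, not md code: the
function name, the sector counts and the "drain" step are assumptions made
for illustration. It shows why the pause matters: the offset is recorded
only at a point where nothing is outstanding, so "started" and "completed"
coincide.

```python
def run_recovery(total_sectors, request_sectors, checkpoints=16):
    """Simulate a recovery that drains outstanding IO before each checkpoint.

    Returns the list of offsets that would have been recorded along the way.
    All names and numbers here are hypothetical, for illustration only.
    """
    interval = max(request_sectors, total_sectors // checkpoints)
    submitted = 0        # sectors for which IO has been started
    in_flight = []       # requests started but not known to be complete
    recorded = []        # offsets written out at each checkpoint
    next_mark = interval

    while submitted < total_sectors:
        length = min(request_sectors, total_sectors - submitted)
        in_flight.append((submitted, length))
        submitted += length

        if next_mark <= submitted < total_sectors:
            # Stand-in for "pause until all submitted requests complete".
            in_flight.clear()
            # Nothing is outstanding, so 'submitted' is a safe offset.
            recorded.append(submitted)
            next_mark += interval

    return recorded
```

With 1600 sectors, 10-sector requests and the default of 16 intervals, the
sketch records an offset every 100 sectors; the final completion would be
recorded by whatever marks the recovery as finished.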

Since the object of this code is to save time on shutdown and restart, 
a fixed count of 16 has little relation to time. I would think that doing 
this update on a time basis would more reasonably reflect the goal. I 
would like to see a fairly short interval, say ten minutes, since the cost 
of a save is low, and ten minutes seems like a reasonable lower bound on 
the amount of recovery that is "worth the effort to save".
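
A time-based policy along these lines could be sketched as follows. This is
plain Python for illustration, not kernel code; the function names are
hypothetical and the 600-second default is simply the ten minutes suggested
above.

```python
CHECKPOINT_INTERVAL_S = 600  # ten minutes, as suggested above (an assumption)


def checkpoint_due(now_s, last_checkpoint_s, interval_s=CHECKPOINT_INTERVAL_S):
    """Return True once at least interval_s seconds have passed since the
    last checkpoint."""
    return now_s - last_checkpoint_s >= interval_s


def checkpoint_times(recovery_duration_s, interval_s=CHECKPOINT_INTERVAL_S):
    """Seconds from the start of recovery at which a checkpoint would be
    taken, for a recovery lasting recovery_duration_s seconds."""
    return list(range(interval_s, recovery_duration_s, interval_s))
```

With this policy the number of checkpoints scales with how long the recovery
runs, so the work lost to a crash is bounded by the interval rather than by
a fraction of the array size.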

As arrays get larger even a 16th of the recovery time can be a pretty 
long time, particularly if the min recovery speed is set fairly low to 
avoid impact on a production server.
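
As a rough worked example of that point, the sketch below computes the gap
between checkpoints when there are 16 of them. The figures are assumptions
chosen for illustration (1 TB to recover, a sustained rate of 1000 KiB/s),
not measurements.

```python
def checkpoint_gap_hours(bytes_to_recover, speed_kib_per_s, checkpoints=16):
    """Hours between checkpoints if the recovery is split into 'checkpoints'
    equal parts and proceeds at a constant speed_kib_per_s."""
    total_seconds = bytes_to_recover / (speed_kib_per_s * 1024)
    return total_seconds / checkpoints / 3600


# Illustrative values only: 1 TB recovered at a steady 1000 KiB/s.
gap = checkpoint_gap_hours(10**12, 1000)
print(f"{gap:.1f} hours between checkpoints")
```

Under those assumptions a crash could throw away most of a day's recovery,
which is far more than a ten minute interval would.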

Thought for comment: I already move a lot of overhead to the 2-6am slot 
of low load. Would changing the rebuild speeds during prime load be 
desirable? The con is longer degraded operation; the pro is less impact 
on performance.
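
One way to picture this is a small userspace helper that picks a rebuild
speed limit from the hour of day. The sketch below is an illustration only:
the function names and both speed values are assumptions, and the off-peak
window is the 2-6am slot mentioned above. The md speed limit is exposed as
/proc/sys/dev/raid/speed_limit_max, which is what the apply step writes to.

```python
OFF_PEAK_HOURS = range(2, 6)  # 02:00 to 05:59, the low-load slot above


def rebuild_speed_limit(hour, peak_kib_s=5000, off_peak_kib_s=200000):
    """Pick a rebuild speed limit for the given hour (0-23).

    The two default limits are illustrative values, not recommendations.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return off_peak_kib_s if hour in OFF_PEAK_HOURS else peak_kib_s


def apply_limit(limit_kib_s, path="/proc/sys/dev/raid/speed_limit_max"):
    """Write the limit to the md sysctl file. Needs root; not called here."""
    with open(path, "w") as f:
        f.write(f"{limit_kib_s}\n")
```

Run hourly (from cron, say), something like
apply_limit(rebuild_speed_limit(current_hour)) would throttle rebuilds
during prime load and open them up overnight, at the cost of a longer
degraded window.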

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck
