Re: [PATCH 03/18] md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Bill Davidsen <davidsen@tmr.com>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 03/18] md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash
Date: Fri, 13 Feb 2009 11:20:24 -0500	[thread overview]
Message-ID: <49959DC8.1000603@tmr.com> (raw)
In-Reply-To: <20090212031010.23983.74842.stgit@notabene.brown>

NeilBrown wrote:
> Version 1.x metadata has the ability to record the status of a
> partially completed drive recovery.
> However we only update that record on a clean shutdown.
> It would be nice to update it on unclean shutdowns too, particularly
> when using a bitmap that removes much to the 'sync' effort after an
> unclean shutdown.
>
> One complication with checkpointing recovery is that we only know
> where we are up to in terms of IO requests started, not which ones
> have completed.  And we need to know what has completed to record
> how much is recovered.  So occasionally pause the recovery until all
> submitted requests are completed, then update the record of where
> we are up to.
>
> When we have a bitmap, we already do that pause occasionally to keep
> the bitmap up-to-date.  So enhance that code to record the recovery
> offset and schedule a superblock update.
> And when there is no bitmap, just pause 16 times during the resync to
> do a checkpoint.
> '16' is a fairly arbitrary number.  But we don't really have any good
> way to judge how often is acceptable, and it seems like a reasonable
> number for now.
>   

Since the object of this code is to save time on shutdown and restart, 
16 has little relation to time. I would think that having this update on 
a time basis would more reasonably reflect this. I would like to see a 
fairly short time, say ten minutes, since the cost of a save is low, and 
ten minutes seems like a reasonable lower bound on "worth effort to 
save" recovery.

As arrays get larger even a 16th of the recovery time can be a pretty 
long time, particularly if the min recovery speed is set fairly low to 
avoid impact on a production server.

Thought for comment: I already move a lot of overhead to the 2-6am slot 
of low load, would changing the rebuild speeds during prime load be 
desirable? The con is longer degraded operation, the pro is less impact 
on performance.

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark

next prev parent reply	other threads:[~2009-02-13 16:20 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-12  3:10 [PATCH 00/18] Assorted md patches headed for 2.6.30 NeilBrown
2009-02-12  3:10 ` [PATCH 03/18] md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash NeilBrown
2009-02-12 17:26   ` John Stoffel
2009-02-13 16:20   ` Bill Davidsen [this message]
2009-02-13 16:34     ` Jon Nelson
2009-02-12  3:10 ` [PATCH 02/18] md: write bitmap information to devices that are undergoing recovery NeilBrown
2009-02-12  3:10 ` [PATCH 06/18] md: Represent raid device size in sectors NeilBrown
2009-02-12  3:10 ` [PATCH 01/18] md: never clear bit from the write-intent bitmap when the array is degraded NeilBrown
2009-02-12  3:10 ` [PATCH 08/18] md/raid5: change raid5_compute_sector and stripe_to_pdidx to take a 'previous' argument NeilBrown
2009-02-12  3:10 ` [PATCH 05/18] md: Make mddev->size sector-based NeilBrown
2009-02-12  3:10 ` [PATCH 07/18] md/raid5: simplify interface for init_stripe and get_active_stripe NeilBrown
2009-02-12  3:10 ` [PATCH 04/18] md: be more consistent about setting WriteMostly flag when adding a drive to an array NeilBrown
2009-02-12  3:10 ` [PATCH 12/18] md/raid5: finish support for DDF/raid6 NeilBrown
2009-02-12  3:10 ` [PATCH 15/18] md: hopefully enable suspend/resume of md devices NeilBrown
2009-02-12  3:10 ` [PATCH 13/18] md/raid5: refactor raid5 "run" NeilBrown
2009-02-12  3:10 ` [PATCH 18/18] md/raid5: allow layout/chunksize to be changed on an active2-drive raid5 NeilBrown
2009-02-12  3:10 ` [PATCH 17/18] md: add ->takeover method for raid5 to be able to take over raid1 NeilBrown
2009-02-12  3:10 ` [PATCH 16/18] md: add ->takeover method to support changing the personality managing an array NeilBrown
2009-02-12  3:10 ` [PATCH 11/18] md/raid5: Add support for new layouts for raid5 and raid6 NeilBrown
2009-02-12  3:10 ` [PATCH 14/18] md: md_unregister_thread should cope with being passed NULL NeilBrown
2009-02-12  3:10 ` [PATCH 09/18] md/raid6: remove expectation that Q device is immediately after P device NeilBrown
2009-02-12 16:56   ` Andre Noll
2009-02-13 22:19     ` Dan Williams
2009-02-16  0:08     ` Neil Brown
2009-02-13 16:37   ` Bill Davidsen
2009-02-16  5:15     ` Neil Brown
2009-02-12  3:10 ` [PATCH 10/18] md/raid5: simplify raid5_compute_sector interface NeilBrown
2009-02-12  8:11 ` [PATCH 00/18] Assorted md patches headed for 2.6.30 Keld Jørn Simonsen
2009-02-12  9:13   ` Steve Fairbairn
2009-02-12  9:46     ` Keld Jørn Simonsen
2009-02-12 10:52       ` NeilBrown
2009-02-12 11:16         ` Keld Jørn Simonsen
2009-02-12 10:53       ` Julian Cowley
2009-02-13 16:54         ` Bill Davidsen
2009-02-16  5:35           ` Neil Brown
2009-02-16 17:31             ` Nagilum
2009-02-12 22:57     ` Dan Williams
2009-02-13 16:56     ` Bill Davidsen
2009-02-12  9:21   ` NeilBrown
2009-02-12  9:53     ` Keld Jørn Simonsen
2009-02-12 10:45       ` NeilBrown
2009-02-12 11:11         ` Keld Jørn Simonsen
2009-02-12 15:28         ` Wil Reichert
2009-02-12 17:44           ` Keld Jørn Simonsen
2009-02-12  9:42 ` Farkas Levente
2009-02-12 10:40   ` NeilBrown
2009-02-12 11:17     ` Farkas Levente
2009-02-13 17:02       ` Bill Davidsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=49959DC8.1000603@tmr.com \
    --to=davidsen@tmr.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.