linux-raid.vger.kernel.org archive mirror
From: Thomas Fjellstrom <thomas@fjellstrom.ca>
To: linux-raid@vger.kernel.org
Subject: Re: Thought about delayed sync
Date: Sun, 9 Oct 2011 06:34:57 -0600
Message-ID: <201110090634.57750.thomas@fjellstrom.ca>
In-Reply-To: <20111009120407.GB25344@animx.eu.org>

On October 9, 2011, Wakko Warner wrote:
> Thomas Fjellstrom wrote:
> > On October 8, 2011, Wakko Warner wrote:
> > > A few days ago, I thought about creating raid arrays w/o syncing.  I
> > > understand why sync is needed.  Please correct me if I'm wrong in any
> > > of my statements.
> > > 
> > > Currently, if someone uses large disks (1tb or larger), the initial
> > > sync can take a long time and until it has completed, the array isn't
> > > fully protected.  I noted that a raid1 on a pair of 1tb disks took
> > > hours to complete when there was no activity.
> > > 
> > > Here is my thought.  There is already a bitmap to indicate which blocks
> > > are dirty.  Thus, when a disk drops out (accidentally or
> > > intentionally), a resync only syncs those blocks that the bitmap knows
> > > were dirtied.
> > > 
> > > What if another bitmap could be utilized?  This would be an "in use"
> > > bitmap. The purpose of this could be that there would never be an
> > > initial sync. When data is written to an area that has not been
> > > synced, a sync will happen of that region.  Once the sync is complete,
> > > that region will be marked as synced in the bitmap.  Only the parts
> > > that have been written to will be synced.  The other data is of no
> > > consequence.  As with the current bitmap, this would have to be asked
> > > for.
> > > 
> > > Let's say someone has been using this array for some time and a disk
> > > dropped out and had to be replaced.  Let's also say that the actual
> > > usage was about 25-30% of the array (of course, that would be wasted
> > > space).  With the "in use" bitmap, they would replace the disk and
> > > only the areas that had been written to would be resynced over to the
> > > new disk.  The rest, since it had not been used, would not need to be.
> > > 
> > > A side effect of this would be that a check or a resync could use this
> > > to check the real data (IE on a weekly basis) and take less time.
> > > 
> > > Overall, depending on the usage, this can keep the wear and tear on a
> > > disk down.  I'm speaking of personal experience with my systems.  I
> > > have arrays that are not 100% or even 80% used.  I have some
> > > production servers that have extra space for expansion and not fully
> > > used.
> > > 
> > > I'm sure this would take some time to implement if someone does this. 
> > > As I mentioned at the beginning, this was just a thought, but I think
> > > it could benefit people if it were implemented.
> > > 
> > > I am on the list, but feel free to keep me in the CC.
> > 
> > I think there's at least one, probably fatal, problem with that idea. There
> > is currently no reliable way for md to tell which areas are actually in
> > use. That is, once a section is written to the first time, it will stay
> > in use, even if it isn't. "Now what about TRIM?" you ask? Not all file
> > systems support it, and I /think/ (based on a quick search of the list)
> > mdraid doesn't fully support TRIM either. LVM may not either. (a quick
> > search also suggested lvm2 doesn't pass on trim properly/at-all).
> 
> Actually, I was completely aware of this before I wrote my thought to the
> list.  I don't know exactly how it could be told.  I thought about a
> program that could read lvm data and tell MD what blocks are not in use. 
> It could go further and attempt to read the filesystem.  TRIM is a nice
> idea, but as you already mentioned, not all filesystems support it and not
> all layers support passing it.
> 
> > I've been using the current bitmap support on my raid5 array for some
> > time, and it has made the few resyncs that were needed very fast
> > compared to a full resync. Instead of 15+ hours, they finished in 20
> > minutes or less. I call that a win.
> 
> Try this instead.  Create a raid5 (or 6) on 4 2tb drives.  Add about 100gb
> of data to it and replace one of the disks with a fresh disk.  You'll
> notice you have to resync the entire array.  The current bitmap only tells
> which blocks have changed and a resync of an existing member is quick. 
> But a new member has no known in sync blocks and has to resync the whole
> thing.  I know, I already had this happen to me last month.

Yeah, after reading the link to Neil's blog, it hit me how useful it could be.
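
To make the proposal concrete, here is a minimal sketch of how I read the
"in use" bitmap idea for a two-disk raid1. It is a toy model in Python, not
md code: the LazyMirror class, its method names and the region layout are
all hypothetical, and it ignores the TRIM problem entirely (a region never
goes back to "unused").

```python
class LazyMirror:
    """Toy two-way mirror that skips the initial sync (hypothetical, not md)."""

    def __init__(self, num_regions, region_size):
        self.region_size = region_size
        # Two members whose contents differ on purpose: nothing was synced.
        self.disks = [
            [bytes([d + 1]) * region_size for _ in range(num_regions)]
            for d in range(2)
        ]
        # The proposed "in use" bitmap, one flag per region.
        self.in_use = [False] * num_regions

    def write(self, region, offset, data):
        if not self.in_use[region]:
            # First write to this region: sync it from member 0, then mark it.
            self.disks[1][region] = self.disks[0][region]
            self.in_use[region] = True
        # Apply the write to both members.
        for disk in self.disks:
            buf = bytearray(disk[region])
            buf[offset:offset + len(data)] = data
            disk[region] = bytes(buf)

    def replace_member(self, index):
        """Swap in a blank disk, copy only in-use regions, return the count."""
        source = self.disks[1 - index]
        fresh = [bytes(self.region_size) for _ in source]
        copied = 0
        for region, used in enumerate(self.in_use):
            if used:
                fresh[region] = source[region]
                copied += 1
        self.disks[index] = fresh
        return copied

    def regions_to_check(self):
        """Regions a periodic check would need to compare."""
        return [r for r, used in enumerate(self.in_use) if used]


mirror = LazyMirror(num_regions=8, region_size=16)
mirror.write(2, 4, b"data")
mirror.write(5, 0, b"more")
print(mirror.regions_to_check())
print(mirror.replace_member(1), "of 8 regions copied to the new disk")
```

Only the two regions that were written get copied to the replacement disk,
which is the saving being described. A real implementation would also have
to persist the bitmap and order the bitmap update safely against the data
write, none of which is modelled here.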

> On another note, I used this feature to clean the dust out of my disk array
> in another system.  Fail a drive, read the array to verify which drive I
> physically failed, remove it, clean the dust off, add it back, wait for
> resync to complete and then do another disk.  Resync on that was quick for
> the 750gb member.  Without a bitmap, resync time on that system is 3 hours.

Try it on a 7-drive raid5 of 1TB disks. Fun times. I imagine it's much worse
with 2, 3 or 4 TB drives (though probably not many people have a bunch of
internal 4TB drives).
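
As a rough back-of-the-envelope check, the figures quoted above (3 hours to
resync a 750gb member without a bitmap, and 25-30% of an array actually in
use) can be turned into an estimate. This is only a sketch: it assumes the
rebuild rate is constant and that rebuild time scales linearly with the
amount copied, and the function names are made up for illustration.

```python
def throughput_mb_s(size_gb, hours):
    """Average rate implied by resyncing size_gb (decimal GB) in hours."""
    return size_gb * 1000 / (hours * 3600)


def rebuild_hours(size_gb, rate_mb_s, used_fraction=1.0):
    """Hours to copy the in-use fraction of a member at a constant rate."""
    return size_gb * 1000 * used_fraction / rate_mb_s / 3600


# Rate implied by the 750gb member that took 3 hours.
rate = throughput_mb_s(750, 3)
# A 1TB member at the same rate: full rebuild vs. only 30% in use.
full = rebuild_hours(1000, rate)
partial = rebuild_hours(1000, rate, used_fraction=0.30)
print(f"{rate:.1f} MB/s, full {full:.1f} h, 30% in use {partial:.1f} h")
```

Under those assumptions a 1TB member goes from about 4 hours to a bit over
an hour. Real rebuilds on a busy array will be slower than this.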

> Thanks for your input though.


-- 
Thomas Fjellstrom
thomas@fjellstrom.ca
