From: Shaohua Li <shli@kernel.org>
To: NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: [RFC 1/2]raid1: only write mismatch sectors in sync
Date: Wed, 31 Oct 2012 11:25:33 +0800
Message-ID: <20121031032533.GA1487@kernel.org>
In-Reply-To: <20121018133657.1bd012f6@notabene.brown>

On Thu, Oct 18, 2012 at 01:36:57PM +1100, NeilBrown wrote:
> On Thu, 18 Oct 2012 10:01:34 +0800 Shaohua Li <shli@kernel.org> wrote:
> 
> > On Thu, Oct 18, 2012 at 12:29:59PM +1100, NeilBrown wrote:
> > > On Thu, 18 Oct 2012 09:17:35 +0800 Shaohua Li <shli@kernel.org> wrote:
> > >  
> > > > > > Neil,
> > > > > > any further comments on this? This is a usable feature, I hope we can have some
> > > > > > agreements.
> > > > > 
> > > > > You still haven't answered my main question, which possibly means I haven't
> > > > > asked it very clearly.
> > > > > 
> > > > > You are saying that this new behaviour should not be the default and I think
> > > > > I agree.
> > > > > > So the question is:  how is it selected?
> > > > > 
> > > > > You cannot expect the user to explicitly enable it any time a resync or
> > > > > recovery starts that should use this new feature.  You must have some
> > > > > automatic, or semi-automatic, way for the feature to be activated, otherwise
> > > > > it will never be used.
> > > > > 
> > > > > > I'm not asking "when should the feature be used" - you've answered that
> > > > > > question a few times and it really isn't an issue.
> > > > > > The question is "What is the exact process by which the feature is turned on
> > > > > > for any particular resync or recovery?"
> > > > 
> > > > > So you are worried that users won't know how to select the feature correctly. An
> > > > > experienced user knows this; the usage scenarios I mentioned describe how to make
> > > > > the decision. For example, a resync after system crash should enable the
> > > > feature. I admit an inexperienced user doesn't know how to select it, but this
> > > > isn't a big problem to me. There are a lot of tunables in the kernel (even MD),
> > > > which can significantly impact kernel behavior. These tunables are just for
> > > > experienced users.
> > > > 
> > > > Thanks,
> > > > Shaohua
> > > 
> > > 
> > > You still aren't answering my question.
> > > 
> > > What exactly, precisely, specifically, will an "experienced user" do?
> > 
> > Set something to a sysfs entry to enable the feature (like my RFC patch does to
> > have a new sysfs entry for the feature), and readd disk. resync then does 'only
> > write mismatch data'. Is this what you asked?

Sorry for the delay.
 
> Yes, that is the sort of thing I was asking for.
> When you say "readd disk" I assume you mean to use the --readd option to
> mdadm.
> That only works when there is a bitmap active on the array, so relatively few
> blocks will be resynced. So does it really matter which approach is taken?
> Always copy, or read-and-test?
> 
> Though maybe you really mean to "--add" the device.  In that case it would
> probably make sense to add some other option to mdadm to say "enable
> read-mostly recovery".  I wonder what a good name would be.
> --minimize-writes ??

Yep, it's the '--add' case. For the '--readd' with bitmap case, the bitmap can
already avoid a lot of writes. The usage case is something like:
one disk is broken; trim the whole of a new disk; add the new disk.
If the source disk has a lot of zeroes and we only write mismatched data, we can
avoid a lot of writes.
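
To make the saving concrete, here is a rough userspace sketch of 'only write
mismatch data'. It is not the RFC patch: the function name and the 4096-byte
block size are made up for illustration, and it assumes a trimmed disk reads
back as zeroes, which not every device guarantees.

```python
def sync_write_mismatch(source, target, block_size=4096):
    """Make target identical to source, writing only the blocks that differ.

    source is a bytes-like object, target a bytearray of the same length.
    Returns the number of blocks that had to be written.
    """
    if len(source) != len(target):
        raise ValueError("source and target must be the same size")
    written = 0
    for off in range(0, len(source), block_size):
        src_block = source[off:off + block_size]
        # Read-and-test: compare first, write only on mismatch.
        if target[off:off + block_size] != src_block:
            target[off:off + block_size] = src_block
            written += 1
    return written


# Source disk: four blocks, only the second one holds non-zero data.
source = bytes(4096) + b"\x01" * 4096 + bytes(4096 * 2)
# New disk after trim, assumed to read back as all zeroes.
target = bytearray(4096 * 4)
blocks_written = sync_write_mismatch(source, target)
```

In this toy case only one of the four blocks is written; an always-copy
recovery would have written all four.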

I believe we need such a mechanism for '--create' too, if the first disk has some
data but the second disk is empty.
 
> You earlier gave a list of scenarios in which you thought this would be
> useful.  It was:
> 
> > > > For 'compare and avoid write if equal' case:
> > > > 1. update SSD firmware. This doesn't change the data, but we need take one disk
> > > > off from the raid one time.
> > > > 2. One disk has errors, but these errors don't ruin most of the data (for
> > > > example, a pcie error)
> > > > 3. driver/os crash.
> > > > In all these cases, two raid disks must be resynced, and they have almost identical
> > > > data. Write avoidance will be very helpful for these.
> 
> 
> For case '3', it would be a "resync" rather than a "recovery".  How would you
> expect an "advanced user" to choose read-and-test recovery in that case?
> There is no "readd" command happening.

If there is a bitmap, maybe we don't need to do read-and-test, so this one isn't
really necessary at the current stage. If there is no bitmap, what I suggest is:
1. user suspends resync (writes something to a sysfs file)
2. user enables read-and-test (again, writes a sysfs file)
3. user resumes resync
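
The three steps could be scripted roughly as below. This is only a sketch
under assumptions: the attribute name `read_and_test` is hypothetical (the
final name of the tunable is not settled), and I am assuming that writing
`frozen` and then `idle` to md's `sync_action` file is an acceptable way to
suspend the resync and let it restart. The directory is a parameter so the
sequence can be tried against a scratch directory instead of a real array.

```python
from pathlib import Path


def write_attr(md_dir, name, value):
    """Write one value to a sysfs-style attribute file under md_dir."""
    (Path(md_dir) / name).write_text(value + "\n")


def enable_read_and_test(md_dir, tunable="read_and_test"):
    """Suspend resync, switch on read-and-test, then let resync continue.

    md_dir would be something like /sys/block/md0/md on a real system.
    The tunable name is hypothetical, chosen only for this sketch.
    """
    # 1. suspend resync (assumed: 'frozen' stops it and prevents a restart)
    write_attr(md_dir, "sync_action", "frozen")
    # 2. enable read-and-test through the hypothetical tunable
    write_attr(md_dir, tunable, "1")
    # 3. resume (assumed: 'idle' clears the frozen state so a pending
    #    resync can start again)
    write_attr(md_dir, "sync_action", "idle")
```

Calling `enable_read_and_test("/sys/block/md0/md")` as root would be the
intended use once such a tunable exists.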

Thanks,
Shaohua
