linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: [RFC 1/2]raid1: only write mismatch sectors in sync
Date: Wed, 31 Oct 2012 16:43:36 +1100
Message-ID: <20121031164336.3828a6ca@notabene.brown>
In-Reply-To: <20121031032533.GA1487@kernel.org>

On Wed, 31 Oct 2012 11:25:33 +0800 Shaohua Li <shli@kernel.org> wrote:

> On Thu, Oct 18, 2012 at 01:36:57PM +1100, NeilBrown wrote:
> > On Thu, 18 Oct 2012 10:01:34 +0800 Shaohua Li <shli@kernel.org> wrote:
> > 
> > > On Thu, Oct 18, 2012 at 12:29:59PM +1100, NeilBrown wrote:
> > > > On Thu, 18 Oct 2012 09:17:35 +0800 Shaohua Li <shli@kernel.org> wrote:
> > > >  
> > > > > > > Neil,
> > > > > > > any further comments on this? This is a usable feature, and I hope we can reach some
> > > > > > > agreement.
> > > > > > 
> > > > > > You still haven't answered my main question, which possibly means I haven't
> > > > > > asked it very clearly.
> > > > > > 
> > > > > > You are saying that this new behaviour should not be the default and I think
> > > > > > I agree.
> > > > > > > So the question is:  how is it selected?
> > > > > > 
> > > > > > You cannot expect the user to explicitly enable it any time a resync or
> > > > > > recovery starts that should use this new feature.  You must have some
> > > > > > automatic, or semi-automatic, way for the feature to be activated, otherwise
> > > > > > it will never be used.
> > > > > > 
> > > > > > > I'm not asking "when should the feature be used" - you've answered that
> > > > > > > question a few times and it really isn't an issue.
> > > > > > > The question is "What is the exact process by which the feature is turned on
> > > > > > > for any particular resync or recovery?"
> > > > > 
> > > > > > So you are worried that users won't know how to select the feature correctly. An
> > > > > > experienced user knows this; the usage scenarios I mentioned describe how to make
> > > > > > the decision. For example, a resync after a system crash should enable the
> > > > > > feature. I admit an inexperienced user won't know how to select it, but this
> > > > > > isn't a big problem to me. There are a lot of tunables in the kernel (even in MD)
> > > > > > which can significantly impact kernel behavior. These tunables are just for
> > > > > > experienced users.
> > > > > 
> > > > > Thanks,
> > > > > Shaohua
> > > > 
> > > > 
> > > > You still aren't answering my question.
> > > > 
> > > > What exactly, precisely, specifically, will an "experienced user" do?
> > > 
> > > Set something to a sysfs entry to enable the feature (like my RFC patch does to
> > > have a new sysfs entry for the feature), and readd disk. resync then does 'only
> > > write mismatch data'. Is this what you asked?
> 
> sorry for the delay.
>  
> > Yes, that is the sort of thing I was asking for.
> > When you say "readd disk" I assume you mean to use the --readd option to
> > mdadm.
> > That only works when there is a bitmap active on the array, so relatively few
> > blocks will be resynced. So does it really matter which approach is taken?
> > Always copy, or read-and-test?
> > 
> > Though maybe you really mean to "--add" the device.  In that case it would
> > probably make sense to add some other option to mdadm to say "enable
> > read-mostly recovery".  I wonder what a good name would be.
> > --minimize-writes ??
> 
> Yep, it's the '--add' case. For the '--readd' with bitmap case, the bitmap can already
> avoid a lot of writes. The use case is something like:
> one disk is broken; trim the whole of a new disk; add the new disk.
> If the source disk has a lot of zeros and we only write mismatched data, we can avoid
> a lot of writes.
> 
> I believe we need such a mechanism for '--create' too, if the first disk has some
> data but the second disk is empty.
>  
> > You earlier gave a list of scenarios in which you thought this would be
> > useful.  It was:
> > 
> > > > > For 'compare and avoid write if equal' case:
> > > > > > 1. Updating SSD firmware. This doesn't change the data, but we need to take one disk
> > > > > > out of the raid at a time.
> > > > > > 2. One disk has errors, but these errors don't ruin most of the data (for
> > > > > > example, a pcie error).
> > > > > > 3. Driver/OS crash.
> > > > > > In all these cases, the two raid disks must be resynced, and they have almost identical
> > > > > > data. Write avoidance will be very helpful for these.
> > 
> > 
> > For case '3', it would be a "resync" rather than a "recovery".  How would you
> > expect an "advanced user" to choose read-and-test recovery in that case?
> > There is no "readd" command happening.
> 
> If there is a bitmap, maybe we don't need to do read-and-test, so this one isn't
> very necessary at the current stage. If not, what I suggest is:
> 1. user suspends resync (write something to a sysfs file)
> 2. user enables read-and-test (again, write a sysfs file)
> 3. resume resync

So you are happy for the resync to start doing the wrong thing, and expect
the sysadmin to notice, and then take some obscure action to stop it doing
the wrong thing and start it doing the right thing.
Certainly possible, but very error prone I would think.

NeilBrown
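
For readers following the thread, here is a minimal sketch of the two
approaches being compared above: "always copy" versus "read-and-test"
(write only on mismatch). It is an in-memory illustration, not the md
implementation. The function name sync_mirror, the compare_first flag and
the 4096-byte chunk size are hypothetical, and the example assumes a
freshly trimmed disk reads back as zeros, which not every device
guarantees.

```python
def sync_mirror(source: bytes, target: bytearray,
                chunk: int = 4096, compare_first: bool = True) -> int:
    """Make `target` identical to `source`; return the number of chunks written.

    compare_first=True  models "read-and-test": read both sides, write only
                        the chunks that differ.
    compare_first=False models "always copy": write every chunk.
    (Hypothetical names; a userspace illustration only.)
    """
    if len(source) != len(target):
        raise ValueError("mirror members must be the same size")
    if chunk <= 0:
        raise ValueError("chunk size must be positive")

    writes = 0
    for off in range(0, len(source), chunk):
        src = source[off:off + chunk]
        if compare_first and target[off:off + chunk] == src:
            continue  # chunk already matches: skip the write
        target[off:off + chunk] = src
        writes += 1
    return writes


if __name__ == "__main__":
    # Source disk: mostly zeros, with one chunk of real data.
    source = bytes(4096 * 15) + b"\x01" * 4096
    # New disk after a whole-device trim (assumed to read back as zeros).
    trimmed = bytearray(len(source))
    print("read-and-test writes:", sync_mirror(source, trimmed))
    print("always-copy writes:  ",
          sync_mirror(source, bytearray(len(source)), compare_first=False))
```

The trade-off is that read-and-test reads both members for every chunk, so
it only pays off when most chunks already match, as in the '--add' onto a
trimmed disk case described above.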
