Re: raid1 - mismatches after resuming interrupted recovery


From: Nate Dailey <nate.dailey@stratus.com>
To: Jes Sorensen <Jes.Sorensen@redhat.com>, NeilBrown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid1 - mismatches after resuming interrupted recovery
Date: Fri, 30 Oct 2015 12:57:22 -0400
Message-ID: <5633A172.3020804@stratus.com>
In-Reply-To: <wrfj1tccpcuc.fsf@carbonite.lan.trained-monkey.org>

This is the same issue as "ignore recovery_offset if bitmap exists"; this
report describes how I hit the problem, before I attempted to put together a
patch to fix it.

Nate



On 10/30/2015 11:58 AM, Jes Sorensen wrote:
> Nate Dailey <nate.dailey@stratus.com> writes:
>> I've found that if I interrupt a recovery by removing the target
>> device, do IO before the recovery checkpoint, then re-add the device
>> and let the recovery complete, the mismatch_cnt is non-zero after
>> doing a check.
> Neil,
>
> While I am on the nagging path, here is another one.
>
> Jes
>
>> Here's exactly what I'm doing:
>>
>> - create a 5 GB raid1 with internal bitmap
>>
>> - do a check, verify zero mismatch_cnt
>>
>> - remove one member device
>>
>> - dd 256MB with 2GB seek
>>
>> - lower sync_speed_min/max to 500
>>
>> - re-add removed device
>>
>> - wait 15 sec
>>
>> - remove the same member device again
>>
>> - dd 1MB with 1 GB seek
>>
>> - restore sync_speed_min/max to system defaults
>>
>> - re-add removed device
>>
>> - when recovery completes, do another check
>>
>> At this point the mismatch_cnt is non-zero.
>>
>>
>> I originally hit this on RHEL 7.1, but tested 4.1.1 from kernel.org
>> and it happens there too.
>>
>> I'm out of my league in terms of trying to fix this, but would be
>> happy to test a fix. I wonder if it's really necessary to resume a
>> bitmap recovery from the checkpoint? Wouldn't the bitmap always
>> reflect what needs to be copied?
>>
>> Nate
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
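
The quoted reproduction steps can be sketched as a shell script. This is only
an illustration under stated assumptions, not the script used in the original
report: the array and member names (/dev/md0, /dev/sdX1, /dev/sdY1) are
placeholders, "remove" is assumed to mean mdadm --fail/--remove, and the dd
offsets assume bs=1M so that seek=2048 is 2 GB and seek=1024 is 1 GB. The
member devices are assumed to be about 5 GB each. By default the script only
prints the commands; set RUN=1 to execute them as root on scratch devices.

```shell
#!/bin/sh
# Sketch of the quoted reproduction steps. All device names are placeholders
# (assumptions). Commands are printed only, unless RUN=1 is set.
MD=${MD:-/dev/md0}
DEV_A=${DEV_A:-/dev/sdX1}
DEV_B=${DEV_B:-/dev/sdY1}
SYSFS=/sys/block/${MD##*/}/md

run() {                       # print a command, execute it only if RUN=1
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

sysfs_write() {               # sysfs_write VALUE ATTRIBUTE
    echo "+ echo $1 > $SYSFS/$2"
    if [ "${RUN:-0}" = 1 ]; then echo "$1" > "$SYSFS/$2"; fi
}

check() {                     # start a check, wait, then show mismatch_cnt
    sysfs_write check sync_action
    run mdadm --wait "$MD"
    run cat "$SYSFS/mismatch_cnt"
}

remove_member() {
    run mdadm "$MD" --fail "$DEV_B" --remove "$DEV_B"
}

reproduce() {
    # create the raid1 with an internal bitmap and verify it is clean
    run mdadm --create "$MD" --run --level=1 --raid-devices=2 \
        --bitmap=internal "$DEV_A" "$DEV_B"
    run mdadm --wait "$MD"
    check

    # first removal: dirty 256 MB at a 2 GB offset, then start a slow recovery
    remove_member
    run dd if=/dev/urandom of="$MD" bs=1M count=256 seek=2048 oflag=direct
    sysfs_write 500 sync_speed_min
    sysfs_write 500 sync_speed_max
    run mdadm "$MD" --re-add "$DEV_B"
    run sleep 15

    # second removal: dirty 1 MB at a 1 GB offset, before the checkpoint
    remove_member
    run dd if=/dev/urandom of="$MD" bs=1M count=1 seek=1024 oflag=direct
    sysfs_write system sync_speed_min
    sysfs_write system sync_speed_max
    run mdadm "$MD" --re-add "$DEV_B"
    run mdadm --wait "$MD"

    # final check: the report sees a non-zero mismatch_cnt here
    check
}

reproduce
```

In the default dry-run mode this just lists the commands in order, which makes
it easy to review them before running anything against real devices.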
