Re: RAID scrubbing - Neil Brown

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Neil Brown <neilb@suse.de>
To: Justin Maggard <jmaggard10@gmail.com>
Cc: Michael Evans <mjevans1983@gmail.com>, linux-raid@vger.kernel.org
Subject: Re: RAID scrubbing
Date: Thu, 15 Apr 2010 11:22:06 +1000	[thread overview]
Message-ID: <20100415112206.6fcd3d3f@notabene.brown> (raw)
In-Reply-To: <w2h150c16851004141751w6d798c12k548527dab7420db4@mail.gmail.com>

On Wed, 14 Apr 2010 17:51:11 -0700
Justin Maggard <jmaggard10@gmail.com> wrote:

> On Fri, Apr 9, 2010 at 7:01 PM, Michael Evans <mjevans1983@gmail.com> wrote:
> > On Fri, Apr 9, 2010 at 6:46 PM, Justin Maggard <jmaggard10@gmail.com> wrote:
> >> On Fri, Apr 9, 2010 at 6:41 PM, Michael Evans <mjevans1983@gmail.com> wrote:
> >>> On Fri, Apr 9, 2010 at 6:28 PM, Justin Maggard <jmaggard10@gmail.com> wrote:
> >>>> Hi all,
> >>>>
> >>>> I've got a system using two RAID5 arrays that share some physical
> >>>> devices, combined using LVM.  Oddly, when I "echo repair >
> >>>> /sys/block/md0/md/sync_action", once it finishes, it automatically
> >>>> starts a repair on md1 also, even though I haven't requested it.
> >>>> Also, if I try to stop it using "echo idle >
> >>>> /sys/block/md0/md/sync_action", a repair starts on md1 within a few
> >>>> seconds.  If I stop that md1 repair immediately, sometimes it will
> >>>> respawn and start doing the repair again on md1.  What should I be
> >>>> expecting here?  If I start a repair on one array, is it supposed to
> >>>> automatically go through and do it on all arrays sharing that
> >>>> personality?
> >>>>
> >>>> Thanks!
> >>>> -Justin
> >>>>
> >>>
> >>> Is md1 degraded with an active spare?  It might be delaying resync on
> >>> it until the other devices are idle.
> >>
> >> No, both arrays are redundant.  I'm just trying to do scrubbing
> >> (repair) on md0; no resync is going on anywhere.
> >>
> >> -Justin
> >>
> >
> > First: Reply to all.
> >
> > Second, if you insist that things are not as I suspect:
> >
> > cat /proc/mdstat
> >
> > mdadm -Dvvs
> >
> > mdadm -Evvs
> >
> 
> I insist it's something different. :)  Just ran into it again on
> another system.  Here's the requested output:

Thanks.  Very thorough!


> Apr 14 17:32:23 JMAGGARD kernel: md: requested-resync of RAID array md2
> Apr 14 17:32:23 JMAGGARD kernel: md: minimum _guaranteed_  speed: 1000
> KB/sec/disk.
> Apr 14 17:32:23 JMAGGARD kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for requested-resync.
> Apr 14 17:32:23 JMAGGARD kernel: md: using 128k window, over a total
> of 972041296 blocks.
> Apr 14 17:32:51 JMAGGARD kernel: md: md_do_sync() got signal ... exiting
> Apr 14 17:33:35 JMAGGARD kernel: md: requested-resync of RAID array md3

So we see the requested-resync (repair) of md2 started as you requested,
then finished at 17:32:51 when you write 'idle' to 'sync_action'.

Then 44 seconds later a similar repair started on md3.
44 seconds is too long for it to be a direct consequence of the md2 repair
stopping.  Something *must* have written to md3/md/sync_action.   But what?

Maybe you have "mdadm --monitor" running and it notices when repair on one
array finished and has been told to run a script (--program or PROGRAM in
mdadm.conf) which would then start a repair on the next array???

Seems a bit far-fetched, but I'm quite confident that some program must be
writing to md3/md/sync_action while you're not watching.

NeilBrown


--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2010-04-15  1:22 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-10  1:28 RAID scrubbing Justin Maggard
2010-04-10  1:41 ` Michael Evans
     [not found]   ` <s2y150c16851004091846t94347cf8u9ffd65133061d16b@mail.gmail.com>
2010-04-10  2:01     ` Michael Evans
2010-04-15  0:51       ` Justin Maggard
2010-04-15  1:22         ` Neil Brown [this message]
2010-04-17  0:03           ` Justin Maggard
2010-04-17  0:19             ` Berkey B Walker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100415112206.6fcd3d3f@notabene.brown \
    --to=neilb@suse.de \
    --cc=jmaggard10@gmail.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=mjevans1983@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).