From: Jonathan Tripathy <jonnyt@abpni.co.uk>
To: linux-raid@vger.kernel.org
Subject: Re: Resync Every Sunday
Date: Sun, 01 Jul 2012 13:04:41 +0100
Message-ID: <4FF03CD9.9040409@abpni.co.uk>
In-Reply-To: <4FF0328B.5080103@abpni.co.uk>

On 01/07/2012 12:20, Jonathan Tripathy wrote:
> Hi Everyone,
>
> We have a few servers that use md raid with mdadm. Each server has 4
> arrays (md0,md1,md2,md3). md0,1,2 are small and md3 is very large.
> Every Sunday at 4:22am, the servers will start to resync. Here is some
> text from /var/log/messages for one of the servers:
>
> Jul 1 04:22:01 server1 kernel: md: syncing RAID array md0
> Jul 1 04:22:01 server1 kernel: md: minimum _guaranteed_
> reconstruction speed: 1000 KB/sec/disc.
> Jul 1 04:22:01 server1 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul 1 04:22:01 server1 kernel: md: using 128k window, over a total of
> 104320 blocks.
> Jul 1 04:22:01 server1 kernel: md: delaying resync of md2 until md0
> has finished resync (they share one or more physical units)
> Jul 1 04:22:01 server1 kernel: md: delaying resync of md3 until md0
> has finished resync (they share one or more physical units)
> Jul 1 04:22:05 server1 kernel: md: md0: sync done.
> Jul 1 04:22:05 server1 kernel: md: delaying resync of md3 until md2
> has finished resync (they share one or more physical units)
> Jul 1 04:22:05 server1 kernel: md: delaying resync of md2 until md3
> has finished resync (they share one or more physical units)
> Jul 1 04:22:05 server1 kernel: md: syncing RAID array md3
> Jul 1 04:22:05 server1 kernel: md: minimum _guaranteed_
> reconstruction speed: 1000 KB/sec/disc.
> Jul 1 04:22:05 server1 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul 1 04:22:05 server1 kernel: md: using 128k window, over a total of
> 1888295936 blocks.
>
> /proc/mdstat shows a progress bar for the array that is currently
> "re-syncing" (in the above case, md3). However, the disks in the
> servers seem fine, and it always seems to happen in the early hours of
> Sunday morning at 4:22am.
>
> The issue gets further complicated as not all arrays are re-synced and
> I can't seem to find a pattern as to what's selected. All I know is that
> at 4:22, mdadm will "come alive" and attempt to do re-syncing of some
> (or all) of the arrays. On each of the servers, 3 of the arrays are
> small and one is large; this leads to the phenomenon that when we wake
> up on Sunday morning, a "random" selection of the servers will still
> be syncing (as mdadm has decided to "pick" the large md3 array to
> resync).
>
> Here is output from /var/log/messages on a server that has only
> decided to re-sync 2 small arrays (md0 and md2):
>
> Jul 1 04:22:01 server3 kernel: md: syncing RAID array md0
> Jul 1 04:22:01 server3 kernel: md: minimum _guaranteed_
> reconstruction speed: 1000 KB/sec/disc.
> Jul 1 04:22:01 server3 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul 1 04:22:01 server3 kernel: md: using 128k window, over a total of
> 104320 blocks.
> Jul 1 04:22:01 server3 kernel: md: delaying resync of md2 until md0
> has finished resync (they share one or more physical units)
> Jul 1 04:22:02 server3 kernel: md: md0: sync done.
> Jul 1 04:22:02 server3 kernel: md: syncing RAID array md2
> Jul 1 04:22:02 server3 kernel: md: minimum _guaranteed_
> reconstruction speed: 1000 KB/sec/disc.
> Jul 1 04:22:02 server3 kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul 1 04:22:02 server3 kernel: md: using 128k window, over a total of
> 1052160 blocks.
> Jul 1 04:22:15 server3 kernel: md: md2: sync done
>
> What's going on? Am I missing something here? Is data on the arrays at
> risk? We're using CentOS 5 with mdadm v2.6.9. Kernel version is
> 2.6.18-274.18.1.el5
>
> Any help is appreciated.
>
>
Upon further reading, I've discovered that these "resyncs" are triggered by
the raid-check cron job. However, most of my questions still stand:
- Why aren't all arrays checked?
- Why are the checked arrays different each week? (Although md0 and md2
seem to be favorites!)
- Is data at risk during these check times? If not, why does mdstat
report them as "resyncing" and not simply "checking"?
- Is it safe to disable these checks? Would monitoring the SMART status
of the disks serve as a good substitute?
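
On the "resyncing" versus "checking" question, one way to see what md itself reports is the per-array sync_action attribute in sysfs. The following is a minimal sketch rather than a tested tool. It assumes Python 3 is installed (CentOS 5 ships an older Python by default) and that the kernel exposes /sys/block/<array>/md/sync_action; the helper names sync_action and describe are hypothetical.

```python
from pathlib import Path


def sync_action(array, sysfs_root="/sys/block"):
    """Return the md sync_action string for an array such as 'md3'.

    Returns 'unknown' if the sysfs file is missing or unreadable.
    """
    path = Path(sysfs_root) / array / "md" / "sync_action"
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"


def describe(action):
    """Give a short plain-English meaning for a sync_action value."""
    meanings = {
        "idle": "no sync activity",
        "check": "scrub: redundancy is read and compared, mismatches are counted",
        "repair": "scrub that also rewrites mismatched redundancy",
        "resync": "redundancy is being rebuilt, e.g. after an unclean shutdown",
        "recover": "data is being rebuilt onto a spare or replaced device",
        "reshape": "array layout or size is being changed",
    }
    return meanings.get(action, "unrecognised state")


if __name__ == "__main__":
    # Array names taken from the message above.
    for name in ("md0", "md1", "md2", "md3"):
        action = sync_action(name)
        print(f"{name}: {action} ({describe(action)})")
```

If an array reports "check" here while the kernel log says "syncing", that would suggest the log wording is shared between the two operations, though that is an inference and not something confirmed in this thread.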
Any help in answering these questions is appreciated.
Thanks
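
To look for a pattern in which arrays get picked each week, the "syncing RAID array" kernel lines quoted above could be tallied per host. This is a rough sketch only: it assumes the syslog line layout shown in the quoted excerpts, and the function name arrays_synced is hypothetical. The sample lines are copied from the server3 log above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Jul 1 04:22:01 server3 kernel: md: syncing RAID array md0
START = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(\S+)\s+kernel: md: syncing RAID array (\w+)"
)


def arrays_synced(lines):
    """Map hostname -> list of arrays whose sync started, in log order."""
    result = defaultdict(list)
    for line in lines:
        match = START.match(line)
        if match:
            host, array = match.groups()
            result[host].append(array)
    return dict(result)


SAMPLE = [
    "Jul 1 04:22:01 server3 kernel: md: syncing RAID array md0",
    "Jul 1 04:22:01 server3 kernel: md: delaying resync of md2 until md0",
    "Jul 1 04:22:02 server3 kernel: md: md0: sync done.",
    "Jul 1 04:22:02 server3 kernel: md: syncing RAID array md2",
]

if __name__ == "__main__":
    print(arrays_synced(SAMPLE))  # {'server3': ['md0', 'md2']}
```

Running the same function over several weeks of /var/log/messages would show whether the selection is really random or follows something like array state at the time of the check.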