linux-raid.vger.kernel.org archive mirror
From: Jonathan Tripathy <jonnyt@abpni.co.uk>
To: linux-raid@vger.kernel.org
Subject: Re: Resync Every Sunday
Date: Sun, 01 Jul 2012 13:04:41 +0100
Message-ID: <4FF03CD9.9040409@abpni.co.uk>
In-Reply-To: <4FF0328B.5080103@abpni.co.uk>


On 01/07/2012 12:20, Jonathan Tripathy wrote:
> Hi Everyone,
>
> We have a few servers that use md raid with mdadm. Each server has 4 
> arrays (md0,md1,md2,md3). md0,1,2 are small and md3 is very large. 
> Every Sunday at 4:22am, the servers will start to resync. Here is some 
> text from /var/log/messages for one of the servers:
>
> Jul  1 04:22:01 server1 kernel: md: syncing RAID array md0
> Jul  1 04:22:01 server1 kernel: md: minimum _guaranteed_ 
> reconstruction speed: 1000 KB/sec/disc.
> Jul  1 04:22:01 server1 kernel: md: using maximum available idle IO 
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul  1 04:22:01 server1 kernel: md: using 128k window, over a total of 
> 104320 blocks.
> Jul  1 04:22:01 server1 kernel: md: delaying resync of md2 until md0 
> has finished resync (they share one or more physical units)
> Jul  1 04:22:01 server1 kernel: md: delaying resync of md3 until md0 
> has finished resync (they share one or more physical units)
> Jul  1 04:22:05 server1 kernel: md: md0: sync done.
> Jul  1 04:22:05 server1 kernel: md: delaying resync of md3 until md2 
> has finished resync (they share one or more physical units)
> Jul  1 04:22:05 server1 kernel: md: delaying resync of md2 until md3 
> has finished resync (they share one or more physical units)
> Jul  1 04:22:05 server1 kernel: md: syncing RAID array md3
> Jul  1 04:22:05 server1 kernel: md: minimum _guaranteed_ 
> reconstruction speed: 1000 KB/sec/disc.
> Jul  1 04:22:05 server1 kernel: md: using maximum available idle IO 
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul  1 04:22:05 server1 kernel: md: using 128k window, over a total of 
> 1888295936 blocks.
>
> /proc/mdstat shows a progress bar for the array that is currently 
> "re-syncing" (in the above case, md3). However, the disks in the 
> servers seem fine, and it always seems to happen in the early hours of 
> Sunday morning at 4:22am.
>
> The issue gets further complicated as not all arrays are re-synced and 
> I can't seem to find a pattern as to what's selected. All I know is that 
> at 4:22, mdadm will "come alive" and attempt to do re-syncing of some 
> (or all) of the arrays. On each of the servers, 3 of the arrays are 
> small and one is large; this leads to the phenomenon that when we wake 
> up on Sunday morning, a "random" selection of the servers will still 
> be syncing (as mdadm has decided to "pick" the large md3 array to 
> resync).
>
> Here is output from /var/log/messages on a server that has only 
> decided to re-sync 2 small arrays (md0 and md2):
>
> Jul  1 04:22:01 server3 kernel: md: syncing RAID array md0
> Jul  1 04:22:01 server3 kernel: md: minimum _guaranteed_ 
> reconstruction speed: 1000 KB/sec/disc.
> Jul  1 04:22:01 server3 kernel: md: using maximum available idle IO 
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul  1 04:22:01 server3 kernel: md: using 128k window, over a total of 
> 104320 blocks.
> Jul  1 04:22:01 server3 kernel: md: delaying resync of md2 until md0 
> has finished resync (they share one or more physical units)
> Jul  1 04:22:02 server3 kernel: md: md0: sync done.
> Jul  1 04:22:02 server3 kernel: md: syncing RAID array md2
> Jul  1 04:22:02 server3 kernel: md: minimum _guaranteed_ 
> reconstruction speed: 1000 KB/sec/disc.
> Jul  1 04:22:02 server3 kernel: md: using maximum available idle IO 
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jul  1 04:22:02 server3 kernel: md: using 128k window, over a total of 
> 1052160 blocks.
> Jul  1 04:22:15 server3 kernel: md: md2: sync done
>
> What's going on? Am I missing something here? Is data on the arrays at 
> risk? We're using CentOS 5 with mdadm v2.6.9. Kernel version is 
> 2.6.18-274.18.1.el5
>
> Any help is appreciated.
>
>
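The kernel messages quoted above can be summarised per array to see which arrays were touched and for how long. This is a minimal sketch, not a definitive tool: the function name sync_durations and the year 2012 are assumptions (syslog timestamps carry no year), the sample lines are the server3 messages from above with the mail-wrapped lines rejoined, and the month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# server3 messages quoted above, with wrapped lines rejoined.
SAMPLE = """\
Jul  1 04:22:01 server3 kernel: md: syncing RAID array md0
Jul  1 04:22:01 server3 kernel: md: delaying resync of md2 until md0 has finished resync (they share one or more physical units)
Jul  1 04:22:02 server3 kernel: md: md0: sync done.
Jul  1 04:22:02 server3 kernel: md: syncing RAID array md2
Jul  1 04:22:15 server3 kernel: md: md2: sync done
"""

TIMESTAMP = r"(\w{3}\s+\d+ \d\d:\d\d:\d\d)"
START = re.compile(r"^" + TIMESTAMP + r" \S+ kernel: md: syncing RAID array (md\d+)")
DONE = re.compile(r"^" + TIMESTAMP + r" \S+ kernel: md: (md\d+): sync done")


def parse_ts(text, year):
    # The year is supplied by the caller because syslog omits it.
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def sync_durations(lines, year=2012):
    """Return {array: seconds between 'syncing' and 'sync done'}."""
    started = {}
    durations = {}
    for line in lines:
        match = START.match(line)
        if match:
            started[match.group(2)] = parse_ts(match.group(1), year)
            continue
        match = DONE.match(line)
        if match and match.group(2) in started:
            begin = started.pop(match.group(2))
            end = parse_ts(match.group(1), year)
            durations[match.group(2)] = (end - begin).total_seconds()
    return durations


if __name__ == "__main__":
    for array, seconds in sync_durations(SAMPLE.splitlines()).items():
        print(f"{array}: {seconds:.0f}s")
```

Arrays that never log "syncing RAID array" in a given week simply do not appear in the result, which makes it easier to compare servers and weeks.
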
Upon further reading, I've discovered that these "resyncs" are due to 
the cron raid-checks that occur. However, most of my questions still stand:

- Why aren't all arrays checked?
- Why are the checked arrays different each week? (Although md0 and md2 
seem to be favorites!)
- Is data at risk during these check times? If not, why does mdstat 
report them as "resyncing" and not simply "checking"?
- Is it safe to disable these checks? Would monitoring the SMART status 
of the disks serve as a good substitute?
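
One possible way to tell a scheduled check apart from a real rebuild is to read the md driver's sysfs attributes while the operation is running. The following is a minimal sketch under assumptions: it assumes the running kernel exposes /sys/block/mdX/md/sync_action and /sys/block/mdX/md/mismatch_cnt (both are described in the kernel's md documentation, but this has not been verified on 2.6.18-274.18.1.el5), and the helper name read_md_state is made up for illustration.

```python
from pathlib import Path


def read_md_state(array, sysfs_root="/sys/block"):
    """Return the sync_action and mismatch_cnt strings for one md array.

    A value of None means the attribute file was not found, for example
    because the array does not exist or the kernel does not expose it.
    """
    md_dir = Path(sysfs_root) / array / "md"
    state = {}
    for name in ("sync_action", "mismatch_cnt"):
        path = md_dir / name
        state[name] = path.read_text().strip() if path.exists() else None
    return state


if __name__ == "__main__":
    # md0..md3 are the four arrays mentioned in the original message.
    for array in ("md0", "md1", "md2", "md3"):
        print(array, read_md_state(array))
```

Per the md documentation, sync_action reads as one of idle, resync, recover, check or repair, so a value of "check" during the Sunday run would indicate a read-only scrub rather than a rebuild.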

Any help in answering these questions is appreciated.

Thanks
