Linux RAID subsystem development
From: Doug Ledford <dledford@redhat.com>
To: max@maxeaves.co.uk
Cc: linux-raid@vger.kernel.org
Subject: Re: Problems with RAID 6 across 15 disks
Date: Thu, 01 Apr 2010 09:49:21 -0400
Message-ID: <4BB4A461.5030704@redhat.com>
In-Reply-To: <4BB49E4D.1090809@maxeaves.co.uk>

On 04/01/2010 09:23 AM, Max Eaves wrote:
> Hi there,
> 
> I hope this gets through....my first posting on this dist.list.
> 
> I am running Centos 5.4 with a 2.6.18-164.15.1.el5 kernel (x86_64)
> kernel using a rather "homebrew" backblaze system
> (http://blog.backblaze.com/) system.
> 
> The mdadm version is: mdadm - v2.6.9 - 10th March 2009
> 
> It uses a number of Silicon Image 3124 (sIL 3124) cards and a number of
> multiplier port cards (sIL3132) to read a large number of disks.
> 
> I have 45 disks arranged into 3 mdadm raid sets of 15 disks.  These 15
> disks are raided using RAID6.
> 
> The problem I have is this:
> 
> At random times, the RAID decides that it needs to resynchronise
> /dev/md10 /dev/md11 and /dev/md12.  There is no error or log event in
> /var/log/messages, but the first thing I notice is that the performance
> of the RAID array drops, and checking out "cat /proc/mdstat" shows all
> three RAID arrays resynchronising themselves.
> 
> ARRAY /dev/md0 level=raid1 num-devices=2
> uuid=7d7b19e6:56cc90cc:3cb166bd:b8086f29 (system boot) (not a problem)
> ARRAY /dev/md1 level=raid1 num-devices=2
> uuid=3782d93d:a491ffd4:f32c1014:94a2b3f7 (system LVM) (not a problem)
> ARRAY /dev/md10 level=raid6 num-devices=15
> uuid=5ca86e2a-3b86-4c0b-9a7a-59143bdcd0f1 (partition 1) (problem)
> ARRAY /dev/md11 level=raid6 num-devices=15
> uuid=61188c90-4825-44c5-8fac-9bc82a5799fe (partition 2) (problem)
> ARRAY /dev/md12 level=raid6 num-devices=15
> uuid=fa939816-1d0f-4eaa-98dd-c131449c3921 (partition 3) (problem)
> 
> These re-synchronisation events take about a week to complete (the RAID
> is 18TB a pop)
> 
> I know that the performance of this system is not great, but I wonder if
> this resynchronisation is occurring because of some I/O time-out.
> 
> Oddly enough, a restart of the server fixes the problem for a couple of
> days, and then problem occurs again (humm - not good).
> 
> I'm happy to post logs etc....just let me know what you need.

Disable /etc/cron.weekly/99-raid-check.  The arrays aren't
resynchronizing; they are just checking themselves for consistency.
Because the 2.6.18 kernel doesn't use a different word for a check in
the output of /proc/mdstat, it looks like a resync.  I can't remember
whether the version of mdadm in CentOS 5.4 has the
/etc/sysconfig/raid-check config file, but if it does, it's easy to
disable the weekly check there.
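
One way to tell a check from a real resync without relying on the
wording in /proc/mdstat is to read the md sysfs attribute sync_action
for each array.  The following is a minimal sketch, not something
tested on this system: it assumes the running kernel exposes
/sys/block/<array>/md/sync_action, and the function names and the
sysfs_root parameter are made up for illustration.

```python
from pathlib import Path

# Meanings of the common sync_action values reported by the md driver.
DESCRIPTIONS = {
    "idle": "no sync activity",
    "check": "read-only consistency check (what a weekly raid-check job starts)",
    "repair": "consistency check that also rewrites mismatched stripes",
    "resync": "genuine resync, for example after an unclean shutdown",
    "recover": "rebuild onto a spare or replaced disk",
}


def read_sync_action(array, sysfs_root="/sys/block"):
    """Return the current sync_action string for an array such as 'md10'.

    sysfs_root is a hypothetical parameter so the function can be
    pointed at a test directory instead of the real sysfs tree.
    """
    path = Path(sysfs_root) / array / "md" / "sync_action"
    return path.read_text().strip()


def describe(action):
    """Map a sync_action value to a short human-readable explanation."""
    return DESCRIPTIONS.get(action, "unrecognised state: " + action)


if __name__ == "__main__":
    # Array names taken from the original report.
    for name in ("md10", "md11", "md12"):
        try:
            action = read_sync_action(name)
        except OSError as exc:
            print(f"{name}: cannot read sync_action ({exc})")
            continue
        print(f"{name}: {action} ({describe(action)})")
```

If this prints "check" for all three arrays while /proc/mdstat shows
what looks like a resync, that would be consistent with the weekly
cron job being the trigger.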


-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
	      http://people.redhat.com/dledford

Infiniband specific RPMs available at
	      http://people.redhat.com/dledford/Infiniband

