From: "Janos Haar" <janos.haar@netcenter.hu>
To: David Lethe <david@santools.com>, cry_regarder@yahoo.com
Cc: linux-raid@vger.kernel.org
Subject: Re: Re: Two Drive Failure on RAID-5
Date: Tue, 20 May 2008 21:40:34 +0200
Message-ID: <044c01c8bab1$6382e6c0$9300a8c0@dcccs>
In-Reply-To: A20315AE59B5C34585629E258D76A97C71AC52@34093-C3-EVS3.exchange.rackspace.com
----- Original Message -----
From: "David Lethe" <david@santools.com>
To: "Cry" <cry_regarder@yahoo.com>; <linux-raid@vger.kernel.org>
Sent: Tuesday, May 20, 2008 7:18 PM
Subject: RE: Re: Two Drive Failure on RAID-5
>
>
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org
> [mailto:linux-raid-owner@vger.kernel.org] On Behalf Of Cry
> Sent: Tuesday, May 20, 2008 10:32 AM
> To: linux-raid@vger.kernel.org
> Subject: Re: Two Drive Failure on RAID-5
>
> David Greaves <david <at> dgreaves.com> writes:
>
>>
>> Yep. Don't panic and don't do anything else yet if you're not confident
>> about what you're doing.
>>
>> I'll follow up with more info in a short while.
>>
>> Info you can provide:
>> kernel version
>> mdadm version
>> cat /proc/mdstat
>> mdadm --examine /dev/sd[abcdef]1 (or whatever your array components are)
>> relevant smartctl info on the bad drive(s)
>> dmesg info about the drive failures
>>
>> Assuming genuine hardware failure:
>> Do you have any spare drives that you can use to replace the components?
>>
>> David
>
> Thanks for the info. I was able to do a --force --assemble on the array and I
> copied off my most critical data. At the moment, I don't have enough drives to
> take all the data on the array, so I'm going to be at a bit of a standstill
> until new hardware arrives.
>
> Since the copy of that data (about 500 GB of about 2 TB) went so well, I
> decided to try to sync up the spare again, and it died at the same point and
> the RAID system pulled down the array. I'm trying to decide if I should follow
> your suggestion in the sister post to copy the failed drive onto my spare or if
> I should just format the spare and try to recover another 500 GB of data off
> the array.
>
> Is there an mdadm or other command to tell the RAID system to stay up in the
> face of errors? Can the array be assembled in a way that doesn't change the
> array in any way (completely read-only)?
>
> I've got the older failed drive also (about 15 hours older). Can that be
> leveraged also?
>
> The server isn't networked right now, but I'll try to get the above requested
> logs tonight.
>
> By the way, I'm thinking about buying five of these:
>
> Seagate Barracuda 7200.11 1TB ST31000340AS SATA-II 32MB Cache
>
> and one of these:
>
> Supermicro CSE-M35T-1 Hot-Swappable SATA HDD Enclosure
>
> http://www.supermicro.com/products/accessories/mobilerack/CSE-M35T-1.cfm
>
> and building a raid-6 array. I'll convert the surviving drives into a backup
> for the primary array. Any feedback on the above? Is there a suggestion on an
> inexpensive controller to give more SATA ports that is very software raid
> compatible?
>
> Any suggestions for optimal configuration (ext3) and tuning for the new array?
> My load consists of serving a photo gallery via apache and gallery2 as well as
> a local media (audio/video) server, so file sizes tend to be large.
>
> Thanks,
>
> Joel
> ===============
> Joel:
>
> Respectfully .. are you nuts???
>
> Don't buy the 7200.11 disks. You bought a bunch of desktop class
> drives, and they crapped out on you, and you are about to make the same
> mistake again. Get the server-class disk that is designed to run a 24x7
> duty cycle, which in your case would be the Barracuda ES.2.
>
> Sorry about the soapbox, but it never ceases to amaze me how people try
> to save by buying disk drives architected with lowest possible cost in
> mind, and don't investigate the higher-quality disks that are designed
> for extended reliability and data integrity.
>
> David
David and Joel,
Let me remind you about the power supply!
This is really important too!
Systems that run 24x7 need a good-quality power supply, cables, and
connectors for the hard drives.
One poor (Y) cable or connector can easily cause one, two, or more drives to
fail at the same time!
SMART can monitor the drive's current state, but it cannot detect a bad
connection and/or noise on the supply voltage.
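
Since SMART will not point at a shared cable or power problem, one rough way to
spot it is to look at when the drives failed. The sketch below is only an
illustration: the function names, the five-minute window, and the sample
timestamps are all made up for the example and do not come from Joel's logs.
It groups failure events that happen close together, and reports the groups
where two or more different drives dropped out at nearly the same time.

```python
from datetime import datetime, timedelta


def cluster_failures(events, window=timedelta(minutes=5)):
    """Group (timestamp, drive) events; an event joins the current group
    when it occurs within `window` of the previous event."""
    clusters = []
    for ts, drive in sorted(events):
        if clusters and ts - clusters[-1][-1][0] <= window:
            clusters[-1].append((ts, drive))
        else:
            clusters.append([(ts, drive)])
    return clusters


def suspicious_clusters(events, window=timedelta(minutes=5)):
    """Return the drive names of every group in which two or more distinct
    drives failed together (a hint at a common cause, not proof)."""
    result = []
    for cluster in cluster_failures(events, window):
        drives = sorted({drive for _, drive in cluster})
        if len(drives) >= 2:
            result.append(drives)
    return result


# Hypothetical sample events, invented for illustration only.
events = [
    (datetime(2008, 5, 19, 3, 0, 0), "sdb"),
    (datetime(2008, 5, 19, 3, 0, 40), "sdc"),
    (datetime(2008, 5, 19, 18, 0, 0), "sde"),
]

print(suspicious_clusters(events))
```

Drives that land in the same group are worth checking for a shared Y cable,
connector, or power rail before blaming the disks themselves. Failures many
hours apart, like the third sample event, do not end up grouped.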
Cheers,
Janos