From: Stan Hoeppner <stan@hardwarefreak.com>
To: Barrett Lewis <barrett.lewis.mitsi@gmail.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
Phil Turmel <philip@turmel.org>
Subject: Re: Mdadm server eating drives
Date: Wed, 03 Jul 2013 12:05:56 -0500
Message-ID: <51D459F4.7050309@hardwarefreak.com>
In-Reply-To: <CAPSPcXjn4=mxMLtu=88Fm9kFvh6_ujLwrTvQeqQuef3BG_c26Q@mail.gmail.com>
On 7/3/2013 12:26 AM, Barrett Lewis wrote:
...
> This is all about my dedicated server. The external enclosure with
> the 4 drives, 3 of which are in a raid0, is just something I used for
> creating an emergency backup, and was plugged directly into the server
> via USB (it has its own power supply too). The server is using the
> onboard video card on the Asrock z77 extreme 4.
Got it.
...
> The other 2 drives in the picture are the source drives that had the
> original data that the array was initially populated with.
Got it. These questions were simply to get a handle on how much +12V
power you needed before recommending a PSU.
...
> I have been really curious about this "beeping" issue since
> it is so bizarre. Anyway like I said only 2 of those original 6 (they
> were seagate ST2000DM001) remain.
When power supplies go bad you may witness all kinds of weird things.
If the voltage to the speaker drive circuit fluctuates wildly it can
cause leakage on the output drive, which causes the speaker to make
random noises.
> Cheap alternate PSU seemed to work OK so I went to buy a decent
> permanent replacement. I couldn't find either of the two you
> suggested at the store (they were closing and I wanted to get this
> done). So I ended up going with a 750W Corsair CX750M. Like magic,
> with a new power supply most of the drives seem to be back working,
> except the first two that failed out yesterday. It seems like maybe
> the event counters (or something) are too far behind to assemble them
> back. That said, md0 mounts fine and fsck returned clean, so that
> deserves some kinda hooray!
The key thing is whether drives keep showing errors in dmesg and
dropping out of the array. If not, your problem is likely solved. :)
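One way to keep an eye on this is to scan saved dmesg output for the md
messages that report a drive being dropped at assembly. The sketch below
is only an illustration: the function name `kicked_devices` is
hypothetical, the sample lines are copied from the dmesg excerpt quoted
later in this message, and it only recognizes the "kicking non-fresh"
wording. Other failure messages (ATA link resets, I/O errors) would need
their own patterns.

```python
import re

# Matches kernel lines such as "md: kicking non-fresh sdf from array!"
KICK_RE = re.compile(r"md: kicking non-fresh (\w+) from array")


def kicked_devices(dmesg_text):
    """Return device names md reported as non-fresh, in log order, no duplicates."""
    seen = []
    for line in dmesg_text.splitlines():
        match = KICK_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample lines taken from the dmesg excerpt in this thread.
sample = """\
[ 4481.357395] md: kicking non-fresh sdf from array!
[ 4481.357400] md: unbind<sdf>
[ 4481.374484] md: kicking non-fresh sdd from array!
[ 4481.396164] md/raid:md0: device sdb operational as raid disk 0
"""

print(kicked_devices(sample))
```

Running the same check against fresh dmesg output after a few days of
uptime would show whether any further drives have been kicked.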
> Here is some data about the two (sdd and sdf) that won't socialize
> with the other disks.
>
> sudo mdadm --assemble --force --verbose /dev/md0 /dev/sd[a-f]
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
> mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 5.
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 2.
> mdadm: added /dev/sdd to /dev/md0 as 1 (possibly out of date)
> mdadm: added /dev/sdf to /dev/md0 as 2 (possibly out of date)
> mdadm: added /dev/sde to /dev/md0 as 3
> mdadm: added /dev/sda to /dev/md0 as 4
> mdadm: added /dev/sdc to /dev/md0 as 5
> mdadm: added /dev/sdb to /dev/md0 as 0
> mdadm: /dev/md0 has been started with 4 drives (out of 6).
>
>
> and from dmesg
> [ 4481.356723] md: bind<sdd>
> [ 4481.356850] md: bind<sdf>
> [ 4481.357007] md: bind<sde>
> [ 4481.357134] md: bind<sda>
> [ 4481.357248] md: bind<sdc>
> [ 4481.357365] md: bind<sdb>
> [ 4481.357395] md: kicking non-fresh sdf from array!
> [ 4481.357400] md: unbind<sdf>
> [ 4481.374480] md: export_rdev(sdf)
> [ 4481.374484] md: kicking non-fresh sdd from array!
> [ 4481.374488] md: unbind<sdd>
> [ 4481.394486] md: export_rdev(sdd)
> [ 4481.396164] md/raid:md0: device sdb operational as raid disk 0
> [ 4481.396168] md/raid:md0: device sdc operational as raid disk 5
> [ 4481.396171] md/raid:md0: device sda operational as raid disk 4
> [ 4481.396173] md/raid:md0: device sde operational as raid disk 3
> [ 4481.396571] md/raid:md0: allocated 6384kB
> [ 4481.396805] md/raid:md0: raid level 6 active with 4 out of 6 devices, algorithm 2
> [ 4481.396808] RAID conf printout:
> [ 4481.396810] --- level:6 rd:6 wd:4
> [ 4481.396812] disk 0, o:1, dev:sdb
> [ 4481.396814] disk 3, o:1, dev:sde
> [ 4481.396815] disk 4, o:1, dev:sda
> [ 4481.396817] disk 5, o:1, dev:sdc
> [ 4481.396848] md0: detected capacity change from 0 to 8001056407552
> [ 4481.426011] md0: unknown partition table
>
> sudo mdadm -E /dev/sd[a-f] | nopaste
> http://pastie.org/8105693
>
> sudo smartctl -x /dev/sdd | nopaste
> http://pastie.org/8105706
>
> sudo smartctl -x /dev/sdf | nopaste
> http://pastie.org/8105707
>
>
> Are sdd and sdf just too out of sync? Should I zero the superblocks
> and re-add them to the array? Or I could replace them (I have two
> unopened WD reds here, but I'd like to return them if I don't really
> need them right now).
I'm not an expert on recovery when things go this far south. Phil and
others are much more knowledgeable about this, so I'll pass the thread
back to them now.
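For anyone following along, the "possibly out of date" devices can
usually be spotted by comparing the Events counters in the `mdadm -E`
output. The sketch below is an assumption-laden illustration, not a
recovery procedure: the helper names `event_counts` and `stale_devices`
are hypothetical, the sample numbers are made up (they are not taken from
the pastie links above), and it assumes each device section starts with a
`/dev/...:` line and that Events is printed as a plain integer.

```python
import re


def event_counts(examine_text):
    """Map each device in `mdadm -E` output to its Events counter."""
    counts = {}
    device = None
    for line in examine_text.splitlines():
        # A section header looks like "/dev/sdb:" on a line by itself.
        header = re.match(r"^(/dev/\S+):\s*$", line)
        if header:
            device = header.group(1)
            continue
        field = re.match(r"^\s*Events\s*:\s*(\d+)", line)
        if field and device is not None:
            counts[device] = int(field.group(1))
    return counts


def stale_devices(counts):
    """Return devices whose counter is below the highest one seen."""
    if not counts:
        return []
    newest = max(counts.values())
    return sorted(dev for dev, n in counts.items() if n < newest)


# Hypothetical numbers for illustration only.
sample = """\
/dev/sdb:
          Magic : a92b4efc
         Events : 1200
/dev/sdd:
          Magic : a92b4efc
         Events : 1100
/dev/sdf:
          Magic : a92b4efc
         Events : 1150
"""

counts = event_counts(sample)
print(stale_devices(counts))
```

How large a gap is still safe to force-assemble across is a judgment
call for the recovery experts on the list; the script only reports which
devices are behind.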
> Thanks for the advice about the PSU, I would have never dreamed it
> would cause behaviour like that.
You're welcome. I've spent just a little time around hardware, as you
might have guessed based on my email address. Started in 1986, so
that's, what, 26 years now? Damn I'm getting old...
--
Stan