From: Andrew Dunn <andrew.g.dunn@gmail.com>
To: landman@scalableinformatics.com
Cc: linux-raid list <linux-raid@vger.kernel.org>
Subject: Re: RAID down, dont know why!
Date: Sun, 08 Nov 2009 09:21:20 -0500
Message-ID: <4AF6D3E0.40707@gmail.com>
In-Reply-To: <4AF6D265.7090104@scalableinformatics.com>
storrgie@ALEXANDRIA:~$ sudo mdadm -D /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Fri Nov 6 07:06:34 2009
Raid Level : raid6
Array Size : 6837318656 (6520.58 GiB 7001.41 GB)
Used Dev Size : 976759808 (931.51 GiB 1000.20 GB)
Raid Devices : 9
Total Devices : 9
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Nov 8 09:17:55 2009
State : clean, degraded, recovering
Active Devices : 8
Working Devices : 9
Failed Devices : 0
Spare Devices : 1
Chunk Size : 1024K
Rebuild Status : 0% complete
UUID : 397e0b3f:34cbe4cc:613e2239:070da8c8 (local to host ALEXANDRIA)
Events : 0.56
    Number   Major   Minor   RaidDevice   State
       0       8       65        0        active sync        /dev/sde1
       1       8       81        1        active sync        /dev/sdf1
       2       8       97        2        active sync        /dev/sdg1
       3       8      113        3        active sync        /dev/sdh1
       4       8      129        4        active sync        /dev/sdi1
       5       8      145        5        active sync        /dev/sdj1
       9       8      161        6        spare rebuilding   /dev/sdk1
       7       8      177        7        active sync        /dev/sdl1
       8       8      193        8        active sync        /dev/sdm1
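With nine members it is easy to miss the one row that differs. A minimal sketch for pulling out any member that is not in "active sync" from `mdadm -D` output follows. The function name `not_in_sync` is made up for illustration, and it assumes the device path is the last field on each member row, as in the listing above.

```shell
#!/bin/sh
# not_in_sync (hypothetical helper): read `mdadm -D` output on stdin and
# print the member devices whose state is anything other than "active sync".
# Assumes member rows start with a number and end with a /dev/... path.
not_in_sync() {
    awk '$1 ~ /^[0-9]+$/ && $NF ~ /^\/dev\// && !/active sync/ { print $NF }'
}

# Usage (needs root and a real array):
#   sudo mdadm -D /dev/md0 | not_in_sync
```

Fed the listing above, this should print only /dev/sdk1, the member currently being rebuilt.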
I ran:
sudo mdadm --assemble --force /dev/md0 /dev/sd[efghijklm]1
Now it's rebuilding? Why did it go down in the first place?
Power and connections are fine, and SMART reports:
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sde | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdf | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdg | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdh | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdi | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdj | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdk | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdl | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdm | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
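The nine checks above could also be run as one loop. This is only a sketch: `smart_health` and the `SMARTCTL` override variable are names invented here so the loop can be dry-run without real disks, and it uses `smartctl -H` (the health-only report) instead of filtering the full `-a` output.

```shell
#!/bin/sh
# smart_health (hypothetical helper): print the overall-health line for
# each device passed as an argument.
# SMARTCTL is an assumed override; it defaults to the real smartctl binary.
smart_health() {
    for dev in "$@"; do
        printf '%s: ' "$dev"
        "${SMARTCTL:-smartctl}" -H "$dev" \
            | grep 'SMART overall-health' \
            || echo 'no health line found'
    done
}

# Usage (as root, against the array members from this thread):
#   smart_health /dev/sd[e-m]
```

A PASSED result only reflects the drive's own self-assessment, so it does not rule out cabling, power, or controller problems.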
Joe Landman wrote:
> Andrew Dunn wrote:
>> storrgie@ALEXANDRIA:~$ lsscsi | grep sd[ijkl]
>> [11:0:0:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdi
>> [11:0:1:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdj
>> [11:0:2:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdk
>> [11:0:3:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdl
>>
>
> Does smartctl report drive failure?
>
> smartctl -a /dev/sdi | grep "SMART overall-health"
> smartctl -a /dev/sdj | grep "SMART overall-health"
> smartctl -a /dev/sdk | grep "SMART overall-health"
> smartctl -a /dev/sdl | grep "SMART overall-health"
>
>>
>> Joe Landman wrote:
>>> Andrew Dunn wrote:
>>>> I just copied 4+ TiB of information to this array, restarted 5 times
>>>> and tried to access it.... What is going on?
>>> It looks like you have 4 failed drives. sdl,sdi,sdj,sdk
>>>
>>> Is it possible you lost power or connectivity to those drives?
>>>
>>> If you have lsscsi installed, what does lsscsi tell you about this?
>>>
>>> lsscsi | grep 'sd[ijkl]'
>>>
>>> Given the proximity of the drives in ordering, I'd suspect a power
>>> loss, or cable seating, or similar to those drives.
>>>
>>> Reseat power/signal cables on the drive bays, and see if this helps.
>>>
>>>
>>> Joe
>>>
>>
>
>
--
Andrew Dunn
http://agdunn.net