From: Andrew Dunn <andrew.g.dunn@gmail.com>
To: landman@scalableinformatics.com
Cc: linux-raid list <linux-raid@vger.kernel.org>
Subject: Re: RAID down, dont know why!
Date: Sun, 08 Nov 2009 09:21:20 -0500
Message-ID: <4AF6D3E0.40707@gmail.com>
In-Reply-To: <4AF6D265.7090104@scalableinformatics.com>
storrgie@ALEXANDRIA:~$ sudo mdadm -D /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Fri Nov 6 07:06:34 2009
Raid Level : raid6
Array Size : 6837318656 (6520.58 GiB 7001.41 GB)
Used Dev Size : 976759808 (931.51 GiB 1000.20 GB)
Raid Devices : 9
Total Devices : 9
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Nov 8 09:17:55 2009
State : clean, degraded, recovering
Active Devices : 8
Working Devices : 9
Failed Devices : 0
Spare Devices : 1
Chunk Size : 1024K
Rebuild Status : 0% complete
UUID : 397e0b3f:34cbe4cc:613e2239:070da8c8 (local to host ALEXANDRIA)
Events : 0.56
Number   Major   Minor   RaidDevice   State
   0       8       65        0        active sync        /dev/sde1
   1       8       81        1        active sync        /dev/sdf1
   2       8       97        2        active sync        /dev/sdg1
   3       8      113        3        active sync        /dev/sdh1
   4       8      129        4        active sync        /dev/sdi1
   5       8      145        5        active sync        /dev/sdj1
   9       8      161        6        spare rebuilding   /dev/sdk1
   7       8      177        7        active sync        /dev/sdl1
   8       8      193        8        active sync        /dev/sdm1
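
The reported Array Size is consistent with a nine-device RAID6: two devices' worth of capacity goes to parity, so usable space is (9 - 2) times the Used Dev Size. A minimal sketch of that check, assuming both mdadm figures are in KiB (the function name raid6_usable_kib is only illustrative):

```shell
#!/bin/sh
# RAID6 usable capacity = (raid_devices - 2) * used_dev_size.
# Arguments: number of raid devices, per-device size in KiB (assumed unit).
raid6_usable_kib() {
    raid_devices=$1
    used_dev_size_kib=$2
    echo $(( (raid_devices - 2) * used_dev_size_kib ))
}

# Figures taken from the mdadm -D output above.
raid6_usable_kib 9 976759808   # prints 6837318656
```

The result matches the Array Size line, so the array geometry itself looks intact.
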
I ran:
sudo mdadm --assemble --force /dev/md0 /dev/sd[efghijklm]1
Now it's rebuilding? Why did it go down in the first place?
Power and connections are fine, and SMART reports:
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sde | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdf | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdg | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdh | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdi | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdj | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdk | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdl | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
storrgie@ALEXANDRIA:~$ sudo smartctl -a /dev/sdm | grep "SMART overall-health"
SMART overall-health self-assessment test result: PASSED
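
The nine per-drive checks above can be collapsed into one loop. This is only a sketch: the function name check_smart_health and the SMARTCTL override variable are illustrative, and it assumes smartctl is installed and the function is run with enough privileges to query the drives.

```shell
#!/bin/sh
# Print the SMART overall-health line for each device given as an argument.
# SMARTCTL may be set to an alternative command (hypothetical hook, useful for dry runs).
check_smart_health() {
    for dev in "$@"; do
        # Same command and filter as the manual checks above.
        result=$("${SMARTCTL:-smartctl}" -a "$dev" | grep "SMART overall-health")
        printf '%s: %s\n' "$dev" "${result:-no overall-health line reported}"
    done
}

# Example (as root): check_smart_health /dev/sd[efghijklm]
```

Prefixing each result with the device name keeps the output readable when every drive reports the same line.
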
Joe Landman wrote:
> Andrew Dunn wrote:
>> storrgie@ALEXANDRIA:~$ lsscsi | grep sd[ijkl]
>> [11:0:0:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdi
>> [11:0:1:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdj
>> [11:0:2:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdk
>> [11:0:3:0] disk ATA WDC WD1001FALS-0 0K05 /dev/sdl
>>
>
> Does smartctl report drive failure?
>
> smartctl -a /dev/sdi | grep "SMART overall-health"
> smartctl -a /dev/sdj | grep "SMART overall-health"
> smartctl -a /dev/sdk | grep "SMART overall-health"
> smartctl -a /dev/sdl | grep "SMART overall-health"
>
>>
>> Joe Landman wrote:
>>> Andrew Dunn wrote:
>>>> I just copied 4+ TiB of information to this array, restarted 5 times
>>>> and tried to access it.... What is going on?
>>> It looks like you have 4 failed drives. sdl,sdi,sdj,sdk
>>>
>>> Is it possible you lost power or connectivity to those drives?
>>>
>>> If you have lsscsi installed, what does lsscsi tell you about this?
>>>
>>> lsscsi | grep sd[ijkl]
>>>
>>> Given the proximity of the drives in ordering, I'd suspect a power
>>> loss, or cable seating, or similar to those drives.
>>>
>>> Reseat power/signal cables on the drive bays, and see if this helps.
>>>
>>>
>>> Joe
>>>
>>
>
>
--
Andrew Dunn
http://agdunn.net