From: Vincent Schut <schut@sarvision.nl>
To: linux-raid@vger.kernel.org
Subject: Re: RAID 6 Failure follow up
Date: Wed, 11 Nov 2009 13:46:41 +0100
Message-ID: <4AFAB231.6020306@sarvision.nl>
In-Reply-To: <4AFAAF42.2000503@gmail.com>
Andrew Dunn wrote:
> Thanks for your help, so far without smartctl installed I have had no
> issues... but it has only been about 12 hours.
I also had no issues when not running smartd/smartctl. It seems to be
the combination of kernel, backplane SAS driver, and SMART that triggers
the trouble...
>
> Could you send me your smartd.conf?
It's pretty much default, there's just one uncommented line in it:
DEVICESCAN -d scsi -a -o on -S on -s (S/../.././02|L/../../6/03) -W 4,45,55 -R 5 -m my@mail.address -M exec /usr/share/smartmontools/smartd-runner
I plan to replace the devicescan with explicit /dev/sd.. items, but as
I'm currently regularly adding and removing (usb) drives, I kept the
auto devicescan statement.
The rest means: enable SMART on all drives, schedule daily short and
weekly long self-tests, warn when the temperature gets too high or
changes by 4 degrees or more, and mail warnings/errors to me.
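For reference, here is a commented sketch of how the same directives might look with explicit device entries instead of DEVICESCAN. The device names /dev/sda and /dev/sdb are placeholders and not taken from this setup. The directives are copied from the line above. The comments reflect my reading of smartd.conf(5), so check them against the man page for the installed version. Some of these directives are documented as ATA-only and may have no effect when the disk is addressed with '-d scsi'.

```
# smartd.conf sketch: explicit devices instead of DEVICESCAN.
# /dev/sda and /dev/sdb are hypothetical names; substitute the real ones.
#
#   -d scsi       address the disk as a SCSI device (the workaround
#                 discussed in this thread for the LSI backplane)
#   -a            turn on the default set of monitoring checks
#   -o on         enable automatic offline testing
#   -S on         enable attribute autosave
#   -s REGEX      self-test schedule, matched against T/MM/DD/d/HH:
#                   S/../.././02   short test every day at 02:00
#                   L/../../6/03   long test on day 6 (Saturday) at 03:00
#   -W 4,45,55    report temperature changes of 4 degrees or more,
#                 log at 45 degrees, warn at 55 degrees
#   -R 5          report changes in the raw value of attribute 5
#   -m ADDRESS    mail warnings to ADDRESS
#   -M exec PATH  run PATH instead of the default mail command
/dev/sda -d scsi -a -o on -S on -s (S/../.././02|L/../../6/03) -W 4,45,55 -R 5 -m my@mail.address -M exec /usr/share/smartmontools/smartd-runner
/dev/sdb -d scsi -a -o on -S on -s (S/../.././02|L/../../6/03) -W 4,45,55 -R 5 -m my@mail.address -M exec /usr/share/smartmontools/smartd-runner
```

Each entry has to stay on a single line (or be continued with a trailing backslash), and smartd only monitors the devices that are listed, so drives that come and go would need their own entries.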
VS.
>
> Vincent Schut wrote:
>> Andrew Dunn wrote:
>>> I am able to reproduce this smart error now. I have done it twice, so
>>> maybe other things are causing this also.
>>>
>>> When I scanned the devices this morning with smartctl via webmin I lost
>>> 8 of the 9 drives. They are, however, still in my /dev folder.
>>>
>>> Now I sent out my logs from the first failure last night, smartctl was
>>> on the system... I don't know if Ubuntu Server's default smartd
>>> configuration makes it do periodic scans because I didn't change
>>> anything.
>>>
>>> I would hate to move back to 9.10 and see this problem again.
>>>
>>> Should I just not install smartmontools? This seems like a bad solution
>>> because now I won't be able to check the drives in advance for failures.
>>>
>>> Have you installed LSI's linux drivers? Some people say this solves
>>> their issue.
>>>
>>> From the logs sent out last night do you think it could be something
>>> else?
>>>
>>> Thanks a ton,
>> FWIW, I encountered the same issue, and seem to have found a viable
>> workaround by accessing the SATA disks on that LSI backplane as scsi
>> devices, e.g. by adding '-d scsi' to my smartctl/smartd.conf lines. No
>> more errors in the logs, no more drives being kicked out.
>> Less info is available that way than when using the SATA driver
>> ('-d sat', or autodetected); temperature, for example, is unavailable.
>> But it does allow me to initiate the selftests and get their result,
>> and to monitor generic SMART status of the drives. Quite enough for me.
>>
>> YMMV, though.
>>
>> Vincent.
>>> Gabor Gombas wrote:
>>>> On Mon, Nov 09, 2009 at 05:08:23AM -0500, Andrew Dunn wrote:
>>>>
>>>>
>>>>> does it momentarily offline the disks? like they re-appear in /dev
>>>>> within moments? That would be similar behavior to what I am
>>>>> experiencing, the disks drop from the array, but they are in /dev
>>>>> by the time I get a chance to see them.
>>>>>
>>>> No, either the disks need to be physically removed and re-inserted, or
>>>> the machine needs to be rebooted.
>>>>
>>>> Gabor
>>>>
>>>>
>>
>