linux-raid.vger.kernel.org archive mirror
From: Vincent Schut <schut@sarvision.nl>
To: linux-raid@vger.kernel.org
Subject: Re: RAID 6 Failure follow up
Date: Wed, 11 Nov 2009 13:46:41 +0100
Message-ID: <4AFAB231.6020306@sarvision.nl>
In-Reply-To: <4AFAAF42.2000503@gmail.com>

Andrew Dunn wrote:
> Thanks for your help, so far without smartctl installed I have had no
> issues... but it has only been about 12 hours.
I also had no issues when not running smartd/smartctl. It seems to be the
combination of kernel, backplane SAS driver, and SMART that triggers
the trouble...
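
For illustration, the '-d scsi' workaround quoted further down amounts to
smartctl calls like the ones this small Python sketch builds. The device
path /dev/sdb and the helper name smartctl_scsi_commands are placeholders,
not taken from either system; -d scsi, -H, -t short and -l selftest are
standard smartctl options. The sketch only prints the commands, it does
not run them.

```python
import shlex


def smartctl_scsi_commands(device):
    """Build smartctl argument lists that address `device` as a SCSI device.

    `device` is a placeholder path such as /dev/sdb (an assumption, not a
    device named in this thread).
    """
    base = ["smartctl", "-d", "scsi"]
    return {
        # Overall SMART health status
        "health": base + ["-H", device],
        # Start a short self-test
        "start_short_test": base + ["-t", "short", device],
        # Read back the self-test log
        "selftest_log": base + ["-l", "selftest", device],
    }


if __name__ == "__main__":
    for name, argv in smartctl_scsi_commands("/dev/sdb").items():
        print(f"{name}: {shlex.join(argv)}")
```

Whether this avoids the dropped drives on a given kernel/driver
combination is something only testing on that machine can tell.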
> 
> Could you send me your smartd.conf?

It's pretty much the default; there's just one uncommented line in it:

DEVICESCAN -d scsi -a -o on -S on -s (S/../.././02|L/../../6/03) -W 4,45,55 -R 5 -m my@mail.address -M exec /usr/share/smartmontools/smartd-runner

I plan to replace the DEVICESCAN with explicit /dev/sd.. items, but as
I'm currently adding and removing (USB) drives regularly, I kept the
automatic DEVICESCAN statement.
The rest means: enable SMART on all drives, schedule daily short and
weekly long self-tests, warn when a drive gets too hot or its temperature
changes by 4 degrees or more (the first value of -W), and mail
warnings/errors to me.
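
To make the -s part less cryptic: smartd matches that regular expression
against a string of the form T/MM/DD/d/HH (test type, month, day of
month, day of week with 1 = Monday, hour). Below is a minimal Python
sketch of that matching. It assumes a full-string match and uses Python's
re module in place of smartd's own POSIX regex engine; due_tests is a
made-up helper name, not part of smartmontools.

```python
import re
from datetime import datetime

# The -s expression from the DEVICESCAN line above.
SCHEDULE = r"(S/../.././02|L/../../6/03)"


def due_tests(when, schedule=SCHEDULE):
    """Return the self-test types ('S' short, 'L' long) due at `when`."""
    due = []
    for test_type in ("S", "L"):
        # T/MM/DD/d/HH; isoweekday() gives 1 = Monday ... 7 = Sunday
        stamp = f"{test_type}/{when:%m}/{when:%d}/{when.isoweekday()}/{when:%H}"
        if re.fullmatch(schedule, stamp):
            due.append(test_type)
    return due


if __name__ == "__main__":
    print(due_tests(datetime(2009, 11, 11, 2)))  # Wednesday 02:00 -> ['S']
    print(due_tests(datetime(2009, 11, 14, 3)))  # Saturday 03:00 -> ['L']
    print(due_tests(datetime(2009, 11, 11, 3)))  # Wednesday 03:00 -> []
```

So the short test runs every day in the 02:00 hour and the long test on
Saturdays in the 03:00 hour.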

VS.
> 
> Vincent Schut wrote:
>> Andrew Dunn wrote:
>>> I am able to reproduce this smart error now. I have done it twice, so
>>> maybe other things are causing this also.
>>>
>>> When I scanned the devices this morning with smartctl via webmin I lost
>>> 8 of the 9 drives. They are, however, still in my /dev folder.
>>>
>>> I sent out my logs from the first failure last night, when smartctl was
>>> on the system... I don't know if Ubuntu Server's default smartd
>>> configuration makes it do periodic scans, because I didn't change
>>> anything.
>>>
>>> I would hate to move back to 9.10 and see this problem again.
>>>
>>> Should I just not install smartmontools? This seems like a bad solution
>>> because then I won't be able to check the drives in advance for failures.
>>>
>>> Have you installed LSI's Linux drivers? Some people say this solves
>>> their issue.
>>>
>>> From the logs sent out last night do you think it could be something
>>> else?
>>>
>>> Thanks a ton,
>> FWIW, I encountered the same issue, and seem to have found a viable
>> workaround by accessing the SATA disks on that LSI backplane as SCSI
>> devices, i.e. by adding '-d scsi' to my smartctl/smartd.conf lines. No
>> more errors in the logs, no more drives being kicked out.
>> Though not as much info is available that way as when using the SATA
>> driver ('-d sat', or automatically), e.g. temperature is unavailable,
>> it does allow me to initiate the self-tests and get their results, and
>> to monitor the generic SMART status of the drives. Quite enough for me.
>>
>> YMMV, though.
>>
>> Vincent.
>>> Gabor Gombas wrote:
>>>> On Mon, Nov 09, 2009 at 05:08:23AM -0500, Andrew Dunn wrote:
>>>>
>>>>  
>>>>> Does it momentarily offline the disks? Like, do they re-appear in /dev
>>>>> within moments? That would be similar to the behavior I am
>>>>> experiencing: the disks drop from the array, but they are in /dev
>>>>> by the time I get a chance to see them.
>>>>>     
>>>> No, either the disks need to be physically removed and re-inserted, or
>>>> the machine needs to be rebooted.
>>>>
>>>> Gabor
>>>>
>>>>   
>>
> 



