From: Nix <nix@esperi.org.uk>
To: Pierre Beck <mail@pierre-beck.de>
Cc: Chris Murphy <lists@colorremedies.com>,
Linux RAID <linux-raid@vger.kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: Areca hardware RAID / first-ever SCSI bus reset: am I about to lose this disk controller?
Date: Tue, 02 Oct 2012 01:10:12 +0100
Message-ID: <87mx057mzf.fsf@spindle.srvr.nix>
In-Reply-To: <506A0BFB.60606@pierre-beck.de> (Pierre Beck's message of "Mon, 01 Oct 2012 23:33:15 +0200")
On 1 Oct 2012, Pierre Beck stated:
> On 23.09.2012 17:42, Nix wrote:
>> On 19 Sep 2012, Chris Murphy outgrape:
>>
>>> On Sep 19, 2012, at 12:52 PM, Nix wrote:
>>>
>>>> So I have this x86-64 server running Linux 3.5.1 with a SATA-on-PCIe
>>>> Areca 1210 hardware RAID-5 controller
>>> Did you find this? Same controller family. Weird that this just shows
>>> up now, but perhaps instead of it being "bad hardware" out the gate,
>>> something's happened to it and now it's failing as you suspect.
>> Hm, it's possible I suppose. Just as possible that a disk is dying.
>>
>>
>> It looks to have been a one-off transient -- no recurrence yet, touch
>> wood :)
>>
> Check the SMART values of the disks if possible. Watch for command
> timeouts and the usual bad sector stuff. I've had similar issues with
> Adaptec controllers. Bad disks seem to cause havoc. The outstanding
> operation isn't answered within [SCSI Timeout, default 30,
> /sys/block/sdX/device/timeout] seconds, so Linux performs a loop
> reset, eventually resetting the controller. That means between 60 and
> 120 seconds of zero I/O operation, varying between controllers and
> disk array sizes. It's particularly annoying when in RAID and the disk
> could've simply been kicked within a few seconds. Something that needs
> improvement IMHO.
The problem has not recurred in more than three weeks. SMART says no
problems... so I guess the controller dropped off the bus for some
reason. Probably some sort of subtle firmware bug or something. (There
are hints in the driver that such bugs exist, hence the enormous amount
of code the driver devotes to resetting the thing when it goes silent.)
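
For anyone wanting to see what timeout their own disks are running with,
here is a minimal Python sketch that reads the per-device SCSI command
timeout from the sysfs attribute Pierre mentions
(/sys/block/sdX/device/timeout). The function name scsi_timeouts and its
sys_block parameter are illustrative only, not part of any existing tool,
and devices without that attribute (loop, md and the like) are simply
skipped.

```python
from pathlib import Path


def scsi_timeouts(sys_block="/sys/block"):
    """Map each block device name to its SCSI command timeout in seconds.

    sys_block is a hypothetical parameter so the function can be pointed
    at a test directory instead of the real sysfs tree.
    """
    timeouts = {}
    root = Path(sys_block)
    if not root.is_dir():
        return timeouts
    for dev in sorted(root.iterdir()):
        timeout_file = dev / "device" / "timeout"
        try:
            timeouts[dev.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            # No such attribute (not SCSI-backed) or unreadable contents.
            continue
    return timeouts


if __name__ == "__main__":
    for name, seconds in scsi_timeouts().items():
        print(f"{name}: {seconds}s")
```

This only reports the values. Changing one means writing a new number to
the same file as root, and whether a shorter timeout is wise behind a
hardware RAID controller is a separate question.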
--
NULL && (void)