From: Roger Heflin <rogerheflin@gmail.com>
To: Redeeman <redeeman@metanurb.dk>
Cc: Kyle Liddell <kyle@foobox.homelinux.net>, linux-raid@vger.kernel.org
Subject: Re: forcing check of RAID1 arrays causes lockup
Date: Thu, 28 May 2009 18:25:35 -0500
Message-ID: <4A1F1D6F.2000205@gmail.com>
In-Reply-To: <1243514523.5740.90.camel@localhost>
Redeeman wrote:
> On Thu, 2009-05-28 at 05:17 -0400, Kyle Liddell wrote:
>> On Tue, May 26, 2009 at 08:33:30PM -0500, Roger Heflin wrote:
>>> Not what you are going to want to hear but badly designed hardware.
>>>
>>> On a machine I had with 4 disks (2 on a built-in via, 2 on other
>>> ports--either a built-in promise, or a sil pci card), when the 2
>>> built-in via sata ports got used heavily at the same time as any
>> ...
>>> It appeared to me that, as designed, the via chipsets (and I think
>>> your chipset is pretty close to the one I was using) did not deal
>>> with high levels of traffic to several devices at once, and would
>>> become unstable.
>>>
>>> Once I figured out the issue, I could duplicate it in under 5 minutes,
>>> and the only working solution was to not use the via ports.
>>>
>>> My mb at the time was a Asus k8v se deluxe with a K8T800 chipset, and
>>> so long as it was not heavily used it was stable, but under heavy use
>>> it was junk.
>> That does sound like my problem, and the hardware is similar. However, I don't think it's the VIA controller that's the problem here: I moved the two drives off the on-board VIA controller and placed them as slaves on the Promise card. I was able to install fedora, which was an improvement, but once installed, I was able to bring the system down again by forcing a check. I've got a spare Promise IDE controller, so I tried swapping it out, with no change.
>>
>> I suppose it's a weird hardware bug, although it still is strange that certain combinations of kernels (which makes a little sense) + distributions (which makes no sense) will work. I just went back to debian on the machine, and it works fine.
>>
>> I'm trying to reproduce the problem on another machine, but I'm not too hopeful.
>
> I have a system with 6 drives in raid5, on such a k8v se deluxe board
> with the via controller, and an additional PCI controller.
>
> I am experiencing some weird problems on this system too, when doing
> lots of read/write it will freeze for up to 10 seconds sometimes, what
> happens is that one of the disks gets the bus soft reset.
>
> Now with the old IDE driver, this would f*** up completely, and the box
> had to be rebooted, however, libata was able to recover it nicely and
> just continue after the freeze.
With the VIA SATA ports and the sata_via driver, under heavy enough
load it did not recover. Under lighter loads it did get some odd
errors, but they were not fatal.
>
> I was never able to solve the issue, and since it wasnt a major problem
> for the use, i have just ignored it.
On mine it was not an issue until I tried a full rebuild (after a
crash). With 2 disks on the via, and 2 on either the promise or a sil
pci card, it crashed quickly under a rebuild. The machine had been
mostly stable (a few odd things happened) for a couple of months.
After moving away from the via ports the few odd things quit
happening, so I believe the odd behavior was also caused by the via
ports (video capture cards appearing to lose parts of their video
data, video capture cards locking up internally; note that 2
completely different pci card designs/chipsets were both doing funny
things).
>
> do you suppose adding another pci card with IDE ports and discontinuing
> use of the via controller entirely would fix it?
On mine, moving everything off of the sata_via ports made things
slower but stable. The via pata/sata ports are on pci66x32bit, while
the pci cards share a single pci33x32bit bus. So in the first setup
(2via/2pcibus) the slowest disk's share of bandwidth is in theory
66 MB/second, and in the second setup (4pcibus) each disk's share of
bandwidth is in theory 33 MB/second. I could see a large difference
in the raid5 read/writes in one case vs. the other, but the faster
case was unstable and so useless.
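
For anyone who wants to check the theoretical figures above, here is a
rough sketch of the arithmetic. It assumes nominal PCI clocks of
33.33 and 66.66 MHz, a 32-bit bus, and bandwidth split evenly between
the disks sharing a bus; the function and variable names are just for
illustration, and real throughput will be lower than these numbers.

```python
# Theoretical per-disk share of a shared 32-bit PCI-style bus.
# Assumptions (not measurements): nominal clocks, even split between disks,
# no protocol overhead.

PCI33_MHZ = 100 / 3   # nominal 33.33 MHz
PCI66_MHZ = 200 / 3   # nominal 66.66 MHz


def per_disk_share_mb_s(clock_mhz, width_bits, disks_on_bus):
    """Peak bus bandwidth in MB/s divided evenly among the disks on it."""
    bus_mb_s = clock_mhz * width_bits / 8
    return bus_mb_s / disks_on_bus


# First setup: 2 disks on the via ports (pci66x32bit), 2 on the pci bus.
setup_2via_2pci = {
    "via": per_disk_share_mb_s(PCI66_MHZ, 32, 2),
    "pci": per_disk_share_mb_s(PCI33_MHZ, 32, 2),
}

# Second setup: all 4 disks on the single pci33x32bit bus.
setup_4pci = {
    "pci": per_disk_share_mb_s(PCI33_MHZ, 32, 4),
}

# A raid5 stripe is limited by its slowest member, so compare the minimums.
slowest_first = min(setup_2via_2pci.values())
slowest_second = min(setup_4pci.values())

print(f"2via/2pcibus: slowest disk share ~{slowest_first:.1f} MB/s")
print(f"4pcibus:      slowest disk share ~{slowest_second:.1f} MB/s")
```

The slowest disk's share works out to roughly the 66 and 33 MB/second
quoted above, which is why the second setup is about half the speed.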
>
> though, i have actually just replaced this box with a newer, so it wont
> do me much good now, however, the small amount of money a pci ide
> controller costs, would be worth it just to actually find out what was
> wrong.
>
I have since replaced mine (it is no longer a server needing more than
1 drive), so it is no longer causing issues.