linux-raid.vger.kernel.org archive mirror
From: Roger Heflin <rogerheflin@gmail.com>
To: Redeeman <redeeman@metanurb.dk>
Cc: Kyle Liddell <kyle@foobox.homelinux.net>, linux-raid@vger.kernel.org
Subject: Re: forcing check of RAID1 arrays causes lockup
Date: Thu, 28 May 2009 18:25:35 -0500
Message-ID: <4A1F1D6F.2000205@gmail.com>
In-Reply-To: <1243514523.5740.90.camel@localhost>

Redeeman wrote:
> On Thu, 2009-05-28 at 05:17 -0400, Kyle Liddell wrote:
>> On Tue, May 26, 2009 at 08:33:30PM -0500, Roger Heflin wrote:
>>> Not what you are going to want to hear but badly designed hardware.
>>>
>>> On a machine I had with 4 disks (2 on a build-in via, 2 on other 
>>> ports--either a built-in promise, or a sil pci card), when the 2 
>>> build-in via sata ports got used heavily at the same times as any 
>> ...
>>>    It appeared to me as designed the via chipsets (And think your 
>>> chipset is pretty close to the one I was using) did not appear to deal 
>>> with with high levels of traffic to several devices at once, and would 
>>> become unstable.
>>>
>>> Once I figured out the issue, I could duplicate it in under 5 minutes, 
>>> and the only working solution was to not use the via ports.
>>>
>>> My mb at the time was a Asus k8v se deluxe with a K8T800 chipset, and 
>>> so long as it was not heavily used it was stable, but under heavy use 
>>> it was junk.
>> That does sound like my problem, and the hardware is similar.  However, I don't think it's the VIA controller that's the problem here:  I moved the two drives off the on-board VIA controller and placed them as slaves on the Promise card.  I was able to install fedora, which was an improvement, but once installed, I was able to bring the system down again by forcing a check.  I've got a spare Promise IDE controller, so I tried swapping it out, with no change.
>>
>> I suppose it's a weird hardware bug, although it still is strange that certain combinations of kernels (which makes a little sense) + distributions (which makes no sense) will work.  I just went back to debian on the machine, and it works fine.  
>>
>> I'm trying to reproduce the problem on another machine, but I'm not too hopeful.
> 
> I have a system with 6 drives in raid5, on such a k8v se deluxe board
> with the via controller, and an additional PCI controller.
> 
> I am experiencing some weird problems on this system too, when doing
> lots of read/write it will freeze for up to 10 seconds sometimes, what
> happens is that one of the disks gets the bus soft reset.
> 
> Now with the old IDE driver, this would f*** up completely, and the box
> had to be rebooted, however, libata was able to recover it nicely and
> just continue after the freeze.

With the via sata ports and the sata_via driver, it did not recover 
under heavy enough loads.  Under lighter loads it did get some odd 
errors that were not fatal.

> 
> I was never able to solve the issue, and since  it wasnt a major problem
> for the use, i have just ignored it.

On mine it was not an issue until I tried a full rebuild (after a 
crash).  With 2 disks on the via and 2 on either the promise or a sil 
pci card, it crashed quickly under a rebuild.  The machine had been 
mostly stable (a few odd things happened) for a couple of months.  
After moving away from the via ports the few odd things quit 
happening, so I believe the odd behavior (video capture cards 
appearing to lose parts of their video data, video capture cards 
locking up internally; note that 2 completely different pci card 
designs/chipsets were both doing funny things) was also caused by the 
via ports.

> 
> do you suppose adding another pci card with IDE ports and discontinuing
> use of the via controller entirely would fix it?

On mine, moving everything off of the sata_via ports made things 
slower but stable.  The via pata/sata ports are on pci66x32bit, 
whereas the pci cards all share a single pci33x32bit bus.  So in the 
first setup (2via/2pcibus) the slowest disk's share of bandwidth is in 
theory 66MB/second, and in the second setup (4pcibus) each disk's 
share is in theory 33MB/second.  I could see a large difference in the 
raid5 read/writes between the two, but the faster setup was unstable 
and so useless.
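
For anyone who wants to check the arithmetic behind those per-disk 
figures, here is a rough sketch.  It assumes a 32-bit PCI bus moves 4 
bytes per clock, that the nominal clocks are 33.33 MHz and 66.67 MHz, 
and that disks on the same bus split the peak evenly.  The function 
names are made up for illustration, and real throughput will be lower 
than these theoretical peaks.

```python
# Theoretical per-disk bandwidth share on shared 32-bit PCI buses.
# Assumptions (not measured): 4 bytes transferred per clock, and an
# even split of the bus peak between the disks sharing that bus.

BUS_WIDTH_BYTES = 4          # 32-bit bus
PCI33_HZ = 33_333_333        # nominal 33 MHz PCI clock
PCI66_HZ = 66_666_666        # nominal 66 MHz clock (the via ports)


def bus_peak_mb_s(clock_hz):
    """Peak bandwidth of one 32-bit bus in MB/second."""
    return clock_hz * BUS_WIDTH_BYTES / 1_000_000


def slowest_share_mb_s(layout):
    """layout is a list of (bus_clock_hz, disks_on_that_bus) pairs.

    Returns the smallest per-disk share across all buses, which is
    what limits a raid5 array that touches every disk at once.
    """
    return min(bus_peak_mb_s(hz) / disks for hz, disks in layout)


# First setup: 2 disks on the via ports, 2 on the shared pci bus.
mixed = slowest_share_mb_s([(PCI66_HZ, 2), (PCI33_HZ, 2)])

# Second setup: all 4 disks on the shared pci bus.
all_pci = slowest_share_mb_s([(PCI33_HZ, 4)])

print(f"2via/2pcibus: about {int(mixed)} MB/second per slowest disk")
print(f"4pcibus:      about {int(all_pci)} MB/second per disk")
```

In the mixed setup the two disks on the pci33 bus are the bottleneck, 
which is where the 66 figure comes from, not from the via ports.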

> 
> though, i have actually just replaced this box with a newer, so it wont
> do me much good now, however, the small amount of money a pci ide
> controller costs, would be worth it just to actually find out what was
> wrong.
>

I have since replaced mine (it is no longer a server needing more than 
1 drive), so mine is no longer causing issues.

Thread overview: 5+ messages
2009-05-26 23:45 forcing check of RAID1 arrays causes lockup kyle
2009-05-27  1:33 ` Roger Heflin
2009-05-28  9:17   ` Kyle Liddell
2009-05-28 12:42     ` Redeeman
2009-05-28 23:25       ` Roger Heflin [this message]
