From: Bill Davidsen <davidsen@tmr.com>
To: Ralf Herrmann <Ralf.Herrmann@TU-Ilmenau.de>
Cc: Neil Brown <neilb@suse.de>, linux-raid@vger.kernel.org
Subject: Re: PROBLEM: system crash on AMD64 with 2.6.17.11 while accessing 3TB Software-RAID5
Date: Mon, 04 Sep 2006 13:06:34 -0400
Message-ID: <44FC5D1A.6020809@tmr.com>
In-Reply-To: <44F6E962.40900@TU-Ilmenau.de>

Ralf Herrmann wrote:
> Dear Mr. Brown,
>
>> Yes...... you are hitting some pretty serious BUGs. And this is in
>> code that is not specific to RAID at all, so if there really were bugs
>> there, we would expect to have seen them well before now.
>
>
> You are absolutely right, it doesn't seem to be in RAID at all,
> but as of now, it has only happened when doing something with /dev/md0.
>
>
>> It really looks to me like a hardware problem. Somehow various bits
>> of memory sometimes have bad values and cause a problem.
>>
>> How long did you run memtest? I would suggest running it for at
>> least 24 hours, because my best guess is that it is bad memory, even
>> though your tests so far don't show that.
>
>
> I ran it for about 16h, with all tests enabled; no error occurred.
>
> I was always wondering why it worked before the change and
> not now. The only difference was the larger drives. And I've read so
> many reports of people running much larger RAID5 partitions than we do,
> so why should it fail in this case?
>
> So my best bet at the moment would be a hardware problem, too.
> I continued looking at the kernel oops messages, and sometimes
> disassembly of the code where it broke gave invalid opcodes.
> This also looks pretty much like a hardware issue.
> But tests of single components did not reveal any error.
>
> It seems to me that it only happens when many system components
> are involved: several HDDs, the whole RAM, the NIC and so on.
> That leads me to another idea I'm currently testing.
>
> It could very well be a bad power supply.
> Maybe this box was already running at the full load of the power supply
> before, and now, with the new drives, it consumes more power than the
> supply can deliver when all system components are used at once.
> I switched to a better power supply; tests are running as I write this.
>
> I'm sorry if I wasted your time; I should have checked this before
> writing to the list. But power supply problems are pretty odd
> and hard to identify. Anyway, I'm not sure if that solves the problem.
>
> OK, I'll write the results of the current tests when they are finished.
>
> Thanks for your consideration.
It certainly is a legitimate question, and marginal power would have
been at the end of my list as well. However, if all else fails, try
partitioning the new drives so that each uses only the capacity of the
old drives (RAID on small partitions) and see if that works. If so, you
may have found some rare size-related bug.
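
A rough sketch of that experiment, under some assumptions: instead of
repartitioning by hand, it uses mdadm's --size option (per-device size in
KiB) to cap how much of each member is used. The old drive capacity, the
device names, the chunk size and the reserve are all hypothetical, since
none of them are stated in this thread. The script only builds the command
line; it does not run anything.

```python
# Hypothetical values throughout: the thread does not give the old drive
# capacity, the device names, or the chunk size.

def capped_component_kib(old_drive_bytes, chunk_kib=64, reserve_kib=1024):
    """Per-device size in KiB that fits inside the old drive capacity.

    A small reserve is subtracted (an assumption, to leave headroom), and
    the result is rounded down to a whole number of chunks.
    """
    usable_kib = old_drive_bytes // 1024 - reserve_kib
    if usable_kib < chunk_kib:
        raise ValueError("old drive capacity is too small for a single chunk")
    return (usable_kib // chunk_kib) * chunk_kib


def mdadm_create_command(devices, size_kib, array="/dev/md0", chunk_kib=64):
    """Build (but do not execute) an mdadm command for a size-capped RAID5."""
    return [
        "mdadm", "--create", array,
        "--level=5",
        f"--raid-devices={len(devices)}",
        f"--chunk={chunk_kib}",
        f"--size={size_kib}",  # KiB to use from each member device
        *devices,
    ]


if __name__ == "__main__":
    old_drive_bytes = 250 * 10**9                      # hypothetical old drive
    members = [f"/dev/sd{c}1" for c in "abcd"]         # hypothetical devices
    size_kib = capped_component_kib(old_drive_bytes)
    print(" ".join(mdadm_create_command(members, size_kib)))
```

If the array is stable at the old size and fails only at full size, that
would point toward a size-related problem rather than power or memory.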
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Thread overview (6+ messages):
2006-08-28 22:10 PROBLEM: system crash on AMD64 with 2.6.17.11 while accessing 3TB Software-RAID5 Ralf Herrmann
2006-08-31 3:56 ` Neil Brown
2006-08-31 13:51 ` Re[2]: " Ralf Herrmann
2006-09-04 17:06 ` Bill Davidsen [this message]
2006-09-08 2:02 ` Ralf Herrmann
2006-09-14 22:37 ` Bill Davidsen