From: Otto Solares <solca@guug.org>
To: Ryan Anderson <ryan@autoweb.net>
Cc: Andrew Morton <akpm@osdl.org>, linux-scsi@vger.kernel.org
Subject: Re: Fw: [Bugme-new] [Bug 3651] New: dell poweredge 4600 aacraid PERC 3/Di Container goes offline
Date: Tue, 23 Nov 2004 16:58:58 -0600
Message-ID: <20041123225858.GC1401@guug.org>
In-Reply-To: <1101246101.26294.76.camel@ryan2.internal.autoweb.net>

On Tue, Nov 23, 2004 at 04:41:41PM -0500, Ryan Anderson wrote:
> On Thu, 2004-10-28 at 00:53 -0700, Andrew Morton wrote:
> > Subject: [Bugme-new] [Bug 3651] New: dell poweredge 4600 aacraid PERC 3/Di Container goes offline
> > 
> > 
> > http://bugme.osdl.org/show_bug.cgi?id=3651
> > 
> >            Summary: dell poweredge 4600 aacraid PERC 3/Di Container goes
> >                     offline
> >     Kernel Version: 2.6.10-rc1, 2.6.9, 2.6.8, 2.6.7, 2.6.6
> >             Status: NEW
> >           Severity: high
> >              Owner: andmike@us.ibm.com
> >          Submitter: oliver.polterauer@ewave.at
> >                 CC: oliver.polterauer@ewave.at
> 
> Is there any update on this problem?
> To reiterate my particular hardware involved that can trigger this
> problem:
> 
> Dell 2650, Dual 2.4GHz Xeon processors (hyperthreading no, though the
> problem occurred in 2.4.20 without hyperthreading disabled via "noht")
> 
> 4 GB of ram
> Only load is PostgreSQL related (i.e, network queries, plus twice daily
> dumps of the database to a NFS store, and a rsync back to the server for
> a second copy)
> 
> Under load, I repeatedly saw containers go offline.
> 
> Dell's recommended hardware diagnostics do not turn up anything (at
> all!)
> 
> The hard drives are Fujitsu drives, so the Seagate Firmware issue should
> not affect them.
> 
> I have since taken this server out of production.  Unfortunately, this
> makes the error much harder to trigger (i.e, I have failed so far to
> trigger it, even with multiple bonnie++ runs)
> 
> Suggestions, diagnostics, etc, would be greatly appreciated.

I used to have this very same problem with exactly the same hardware as you:

2 x 2.4GHz Xeon processor
4GB RAM
PERC 3/Di
4 x Fujitsu MAP3147NC Rev 5608 10K RPM disks.

I tried all kernels on earth (2.4/2.6) and the controller always died with
the container going offline (search this list for the past 15 days and
you'll find my problem).

Currently I'm running 2.6.10-rc1-bk20-adaptec-1.1.5-2370 _WITHOUT_ANY_
problems (ACPI on, HT enabled). My controller reports:

Red Hat/Adaptec aacraid driver (1.1-5[2370])
ACPI: PCI interrupt 0000:04:08.1[A] -> GSI 30 (level, low) -> IRQ 201
AAC0: kernel 2.8-0[6092] 
AAC0: monitor 2.8-0[6092]
AAC0: bios 2.8-0[6092]
AAC0: serial 3520a1d3
aacraid_setup("")
nondasd=-1 dacmode=-1 commit=-1 coalescethreshold=16 acbsize=-1
scsi0 : percraid
  Vendor: DELL      Model: PERC RAID5        Rev: V1.0
  Type:   Direct-Access                      ANSI SCSI revision: 02
SCSI device sda: 860149632 512-byte hdwr sectors (440397 MB)
SCSI device sda: drive cache: write through
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
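
For anyone comparing their own box against the log above, here is a minimal
sketch in Python that pulls the driver and firmware versions out of the
aacraid banner lines. It assumes the banner format is exactly as shown in
this log; the function name parse_aacraid_banner and the SAMPLE_LOG constant
are hypothetical, made up for illustration, and not part of any driver or
tool.

```python
import re

# Lines copied from the boot log above. On a live system they would come
# from the kernel log instead (assumption: same banner format).
SAMPLE_LOG = """\
Red Hat/Adaptec aacraid driver (1.1-5[2370])
AAC0: kernel 2.8-0[6092]
AAC0: monitor 2.8-0[6092]
AAC0: bios 2.8-0[6092]
"""

# Driver banner, e.g. "aacraid driver (1.1-5[2370])".
DRIVER_RE = re.compile(
    r"aacraid driver \((?P<version>[\d.]+-\d+)\[(?P<build>\d+)\]\)"
)

# Firmware lines, e.g. "AAC0: kernel 2.8-0[6092]".
FIRMWARE_RE = re.compile(
    r"^AAC\d+: (?P<part>kernel|monitor|bios) "
    r"(?P<version>[\d.]+-\d+)\[(?P<build>\d+)\]",
    re.MULTILINE,
)


def parse_aacraid_banner(log_text):
    """Return the driver and firmware (version, build) pairs found in log_text."""
    info = {"driver": None, "firmware": {}}
    match = DRIVER_RE.search(log_text)
    if match:
        info["driver"] = (match.group("version"), int(match.group("build")))
    for match in FIRMWARE_RE.finditer(log_text):
        info["firmware"][match.group("part")] = (
            match.group("version"),
            int(match.group("build")),
        )
    return info


if __name__ == "__main__":
    print(parse_aacraid_banner(SAMPLE_LOG))
```

If the driver entry shows a build other than 2370, the box is not running
the same driver build as the one reported here.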

uptime:

 16:48:22 up 13 days, 29 min,  2 users,  load average: 1.15, 1.27, 1.31

I know 13 days is not much for a server, but this server used to die
within 1-2 days, so it is a huge improvement.

You should try that driver, it works for me.

I have to thank Mark Salyzyn from Adaptec for the updated driver. It is my
opinion that this "enhanced driver" should make it into Linus' kernel.

-otto
