From: "David M. Strang" <dstrang@shellpower.net>
To: linux-kernel@vger.kernel.org
Subject: LSI Logic MegaRAID SATA 150-4 / LSI Logic New Generation RAID Device Drivers (MEGARAID_NEWGEN) problems (megaraid abort: scsi cmd:14600, do now own)
Date: Mon, 18 Feb 2008 21:09:22 -0500
Message-ID: <47BA3A52.3020106@shellpower.net>
Greetings -
A couple of months back I purchased an LSI Logic MegaRAID SATA 150-4
controller, as well as 3 Seagate 500GB SATA-II hard drives to use in my
system. Previously, I was using a pair of WD4000YR's in software raid,
which seemed to work well. I've only just gotten around to migrating
my data to these new drives + controller, and it's giving me some
issues. Like many others, I'm seeing severe performance problems:
buffered reads from the array are slower than from a single standalone
disk. Before getting into the details, here is a quick overview of my
configuration:
System:
Tyan Tiger i7320/R (S5350) System Board
2x Intel Xeon 3.0 GHz
4GB RAM
LSI Logic MegaRAID SATA 150-4 controller - Firmware Revision: 713S
3x Seagate 7200.10 (Perpendicular Recording) ST3500630AS 500GB SATA-II
drives configured as a RAID-1 array with a HotSpare.
Also, connected to the onboard controller is a WD4000YR, where all of my
data currently resides.
I'm running Gentoo Hardened AMD64 MultiLib
(/usr/portage/profiles/hardened/amd64/multilib)
My current kernel revision is 2.6.23-hardened-r7.
Here are some (possibly) relevant snippets from dmesg during startup:
...
megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
megaraid: 2.20.5.1 (Release Date: Thu Nov 16 15:32:35 EST 2006)
megaraid: probe new device 0x1000:0x1960:0x1000:0x4523: bus 3:slot 3:func 0
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 24 (level, low) -> IRQ 24
megaraid: fw version:[713S] bios version:[G121]
scsi0 : LSI Logic MegaRAID driver
scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices
scsi[0]: scanning scsi channel 1 [virtual] for logical drives
scsi 0:1:0:0: Direct-Access MegaRAID LD 0 RAID1 476G 713S PQ: 0 ANSI: 2
sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00
sd 0:1:0:0: [sda] Asking for cache data failed
sd 0:1:0:0: [sda] Assuming drive cache: write through
sd 0:1:0:0: [sda] 976762880 512-byte hardware sectors (500103 MB)
sd 0:1:0:0: [sda] Write Protect is off
sd 0:1:0:0: [sda] Mode Sense: 00 00 00 00
sd 0:1:0:0: [sda] Asking for cache data failed
sd 0:1:0:0: [sda] Assuming drive cache: write through
sda: sda1 sda2 sda3 sda4
sd 0:1:0:0: [sda] Attached SCSI disk
ata_piix 0000:00:1f.2: version 2.12
ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi1 : ata_piix
scsi2 : ata_piix
ata1: SATA max UDMA/133 cmd 0x00000000000114a0 ctl 0x000000000001149a
bmdma 0x0000000000011470 irq 18
ata2: SATA max UDMA/133 cmd 0x0000000000011490 ctl 0x0000000000011486
bmdma 0x0000000000011478 irq 18
ata1.00: ATA-7: WDC WD4000YR-01PLB0, 01.06A01, max UDMA/133
ata1.00: 781422768 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access ATA WDC WD4000YR-01P 01.0 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sd 1:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sdb: sdb1 sdb2 sdb3 sdb4
sd 1:0:0:0: [sdb] Attached SCSI disk
...
My controller is configured for Write Back Caching, Adaptive Read Ahead,
and Direct I/O (I've also tried cached I/O but it scared me...)
The first thing I'm noticing is the horrible performance on the raid
disk, compared to the single standalone hard disk. Here is the output
from hdparm -tT on the single disk:
-(root@server)-(~)- # hdparm -tT /dev/sdb1
/dev/sdb1:
Timing cached reads: 1670 MB in 2.00 seconds = 835.00 MB/sec
Timing buffered disk reads: 140 MB in 3.01 seconds = 46.45 MB/sec
And then, the output from the raid-1 array:
-(root@server)-(~)- # hdparm -tT /dev/sda1
/dev/sda1:
Timing cached reads: 1718 MB in 2.00 seconds = 859.65 MB/sec
Timing buffered disk reads: 92 MB in 3.09 seconds = 29.76 MB/sec
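As a quick sanity check on the two results above, here is a minimal
Python sketch (not part of the original test run; the helper name
mb_per_sec is only for illustration) that pulls the MB/sec figure out of
each buffered-read line and computes how much slower the array is. With
these numbers the array comes out roughly 36% slower than the single
disk.

```python
import re

# Buffered-read lines copied from the two hdparm runs above.
SINGLE_DISK = "Timing buffered disk reads: 140 MB in 3.01 seconds = 46.45 MB/sec"
RAID1_ARRAY = "Timing buffered disk reads: 92 MB in 3.09 seconds = 29.76 MB/sec"


def mb_per_sec(line):
    """Return the MB/sec figure at the end of an hdparm timing line."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", line)
    if match is None:
        raise ValueError("not an hdparm timing line: %r" % line)
    return float(match.group(1))


single = mb_per_sec(SINGLE_DISK)
raid = mb_per_sec(RAID1_ARRAY)

# Fraction of single-disk throughput lost on the array, as a percentage.
slowdown_pct = (1.0 - raid / single) * 100.0

print("single disk : %.2f MB/sec" % single)
print("raid-1 array: %.2f MB/sec" % raid)
print("array is %.1f%% slower" % slowdown_pct)
```

A single hdparm run is noisy, so repeating each measurement a few times
and comparing averages would give a fairer picture.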
I'm not sure why buffered disk reads are so much WORSE on the array
than on a single disk. Poor performance is a concern, but what's more
alarming are the messages showing up in dmesg. When I first tried
Cached I/O, performance seemed good... except that dmesg was littered
with these errors (?):
megaraid: aborting-14610 cmd=2a <c=1 t=0 l=0>
megaraid abort: scsi cmd:14610, do now own
megaraid: aborting-14612 cmd=2a <c=1 t=0 l=0>
megaraid abort: scsi cmd:14612, do now own
megaraid: aborting-14614 cmd=2a <c=1 t=0 l=0>
megaraid abort: scsi cmd:14614, do now own
...
megaraid: 38 outstanding commands. Max wait 300 sec
megaraid mbox: Wait for 38 commands to complete:300
megaraid mbox: reset sequence completed sucessfully
I'm not certain what these mean... why am I getting aborts?
So, I rebooted the box and switched back to direct I/O instead of
cached... and while not as prevalent as before, I still get the errors
listed above, as well as these:
megaraid abort: 14687:62[255:128], fw owner
megaraid: aborting-14689 cmd=2a <c=1 t=0 l=0>
megaraid abort: 14689:25[255:128], fw owner
megaraid: aborting-14691 cmd=2a <c=1 t=0 l=0>
megaraid abort: 14691:40[255:128], fw owner
megaraid: aborting-14693 cmd=2a <c=1 t=0 l=0>
megaraid abort: 14693:10[255:128], fw owner
megaraid: aborting-14695 cmd=2a <c=1 t=0 l=0>
megaraid abort: 14695:9[255:128], fw owner
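To get a feel for how often each kind of abort occurs, a small Python
sketch like the following could tally the messages. The regular
expressions are written against the exact lines quoted in this mail, and
the labels (abort_requested, not_owned, fw_owner) are my own names, not
driver terminology. One thing that can be said with certainty: cmd=2a is
the SCSI WRITE(10) opcode, so every aborted command shown here is a
write.

```python
import re
from collections import Counter

# A few lines copied from the dmesg excerpts in this mail.
SAMPLE = """\
megaraid: aborting-14610 cmd=2a <c=1 t=0 l=0>
megaraid abort: scsi cmd:14610, do now own
megaraid abort: 14687:62[255:128], fw owner
megaraid: aborting-14689 cmd=2a <c=1 t=0 l=0>
megaraid abort: 14689:25[255:128], fw owner
"""

# Labels are illustrative; patterns match the message text seen above.
PATTERNS = {
    "abort_requested": re.compile(r"megaraid: aborting-(\d+) cmd=([0-9a-f]+)"),
    "not_owned": re.compile(r"megaraid abort: scsi cmd:(\d+), do now own"),
    "fw_owner": re.compile(r"megaraid abort: (\d+):\d+\[\d+:\d+\], fw owner"),
}


def tally(log_text):
    """Count how many lines match each known megaraid abort message."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line matches at most one pattern
    return counts


counts = tally(SAMPLE)
for label in PATTERNS:
    print("%-16s %d" % (label, counts[label]))
```

To run it against a live system, the text of `dmesg` could be passed to
tally() in place of SAMPLE.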
I'm also a bit concerned by the dmesg output from the drive
initialization; I have it set for Write Back caching, but this shows up:
sd 0:1:0:0: [sda] Asking for cache data failed
sd 0:1:0:0: [sda] Assuming drive cache: write through
Why?
I would really like to get my data over to my hardware mirror, but
frankly - I'm nervous about this controller's behavior. Other than error
messages in dmesg, and high cpu during file i/o -- it SEEMS ok, but is
it really? I somehow don't think I should be getting these types of
messages. I've searched the list archive, and I see similar messages
followed by failed resets, but my reset sequence always completes
successfully.
Regards,
David M. Strang