* mptbase Code={Abort}
From: Sebastian Wasilewski @ 2008-04-13 1:36 UTC
To: linux-scsi
Hello Everyone,
I have a problem with a new custom-built server. The machine has three
LSI SAS1068 PCI-X SAS cards with 10 Seagate ST31000340NS disks connected
to them. On top of them I have configured a software RAID6 with a stripe
size of 128kB.
The problem is that during a heavy load (for example RAID rebuilding)
the following messages are logged by syslog:
mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort},
SubCode(0x3000)
After each RAID rebuild I have around 100 of them. The RAID itself works
fine.
Does anyone have any idea what might be wrong, and what these messages
mean? I have updated the cards' firmware to the newest version, but
nothing has changed.
Thanks. Technical details below.
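As a reading aid, the 32-bit LogInfo word can be split into the fields
that the driver prints. The sketch below is a minimal helper, assuming
the usual SAS LogInfo layout (type in bits 31-28, originator in bits
27-24, code in bits 23-16, subcode in bits 15-0). The function name
decode_loginfo is made up for this example, and the numeric-to-name
tables (such as 0x1 being printed as PL) are not reproduced here.

```shell
# Split an MPT SAS LogInfo word into its four fields.
# ASSUMPTION: layout is type[31:28] originator[27:24] code[23:16]
# subcode[15:0]; decode_loginfo is a hypothetical helper name.
decode_loginfo() {
    v=$(( $1 ))    # accepts hex input such as 0x31123000
    printf 'type=0x%x originator=0x%x code=0x%02x subcode=0x%04x\n' \
        $(( (v >> 28) & 0xF )) \
        $(( (v >> 24) & 0xF )) \
        $(( (v >> 16) & 0xFF )) \
        $(( v & 0xFFFF ))
}

# Usage: decode_loginfo 0x31123000
```

For 0x31123000 this prints type=0x3 originator=0x1 code=0x12
subcode=0x3000, which lines up with the Originator={PL}, Code={Abort},
SubCode(0x3000) text in the log line above.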
Sebastian
Details of the machine:
System: Fedora 8 x86_64
Kernel: 2.6.21.7-2.fc8xen-local (recompiled with the newest LSI MPT
Fusion driver (4.00.21.00))
HW: 2x Intel Xeon E5320, 16GB RAM
lspci output:
----------------
00:00.0 Host bridge: Intel Corporation 5000V Chipset Memory Controller
Hub (rev b1)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8
Port 2-3 (rev b1)
00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4
Port 3 (rev b1)
00:08.0 System peripheral: Intel Corporation 5000 Series Chipset DMA
Engine (rev b1)
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers
(rev b1)
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers
(rev b1)
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers
(rev b1)
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved
Registers (rev b1)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved
Registers (rev b1)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers
(rev b1)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers
(rev b1)
00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI
Express Root Port 1 (rev 09)
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset
UHCI USB Controller #1 (rev 09)
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset
UHCI USB Controller #2 (rev 09)
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset
UHCI USB Controller #3 (rev 09)
00:1d.3 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset
UHCI USB Controller #4 (rev 09)
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset
EHCI USB2 Controller (rev 09)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC
Interface Controller (rev 09)
00:1f.3 SMBus: Intel Corporation 631xESB/632xESB/3100 Chipset SMBus
Controller (rev 09)
01:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express
Upstream Port (rev 01)
01:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to
PCI-X Bridge (rev 01)
02:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express
Downstream Port E1 (rev 01)
02:02.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express
Downstream Port E3 (rev 01)
04:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
Ethernet Controller (Copper) (rev 01)
04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
Ethernet Controller (Copper) (rev 01)
05:01.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X
Fusion-MPT SAS (rev 01)
05:02.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X
Fusion-MPT SAS (rev 01)
05:03.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X
Fusion-MPT SAS (rev 01)
07:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet
Controller (rev 06)
07:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet
Controller (rev 06)
08:01.0 PCI bridge: Intel Corporation 21154 PCI-to-PCI Bridge
08:02.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
09:04.0 Ethernet controller: Intel Corporation 82557/8/9 Ethernet Pro
100 (rev 0d)
09:05.0 Ethernet controller: Intel Corporation 82557/8/9 Ethernet Pro
100 (rev 0d)
* Re: mptbase Code={Abort}
From: Richard Scobie @ 2008-04-13 19:29 UTC
To: linux-scsi
A "me too" here as well.
Fedora 8 x86_64 2.6.24.4-64.fc8, 8GB RAM, LSI SAS1068 B1 with latest
BIOS and FW, attached to 15 x WD7500AYYS SATA via a Vitesse port expander.
While doing initial sync up on an md RAID 5 of the first eight drives:
Apr 13 13:53:16 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:18 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:21 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:21 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:23 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:24 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:34 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:34 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:36 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:38 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:42 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:42 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:44 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Apr 13 13:53:46 flash2 kernel: mptbase: ioc0: LogInfo(0x31120403):
Originator={PL}, Code={Abort}, SubCode(0x0403)
Apr 13 13:53:50 flash2 kernel: mptbase: ioc0: LogInfo(0x31123000):
Originator={PL}, Code={Abort}, SubCode(0x3000)
Will try compiling latest mpt fusion driver today.
Have sent a similar mail, including lsiutil dump offlist to eric at LSI.
Regards,
Richard
* Re: mptbase Code={Abort}
From: Sebastian Wasilewski @ 2008-04-13 20:24 UTC
To: linux-scsi
>
> Will try compiling latest mpt fusion driver today.
I have already done it. Nothing has changed.
>
> Have sent a similar mail, including lsiutil dump offlist to eric at LSI.
Could you let me know when he replies, please? I would like to know
whether it is something serious or can be ignored.
I have noticed that turning off NCQ significantly reduces the number of
errors.
For each disk ($dev = sda, sdb, ...) I have set:
echo 1 > /sys/block/$dev/device/queue_depth
and the same value for "SATA Maximum Queue Depth" using lsiutil.
That is better, but the error messages have not disappeared.
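The per-disk setting above can be applied in one pass with a small
loop. This is only a sketch: the function name set_queue_depth and its
optional second argument (a sysfs block directory, handy for a dry run
against a scratch tree) are inventions for this example, and the glob
matches every sd* disk, including any that are not part of the array.

```shell
# Write one queue depth to every sd* disk under a sysfs block directory.
# $1 = depth, $2 = block directory (defaults to /sys/block).
# set_queue_depth is a hypothetical helper, not an existing tool.
set_queue_depth() {
    depth=$1
    root=${2:-/sys/block}
    for f in "$root"/sd*/device/queue_depth; do
        [ -w "$f" ] || continue          # skip missing or read-only entries
        echo "$depth" > "$f"
        echo "$f: $(cat "$f")"           # read back the value now in effect
    done
}

# Usage (as root): set_queue_depth 1
```

Reading the value back after writing shows what the kernel actually
accepted for each device.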
Regards,
Sebastian
* Re: mptbase Code={Abort}
From: Richard Scobie @ 2008-04-13 20:47 UTC
To: linux-scsi
> Could you let me know when he replies, please? I would like to know
> whether it is something serious or can be ignored.
Certainly. I think this is impacting performance badly, judging by the
LEDs on each drive: each time an error occurs, it can take up to 10
seconds for them to settle back to a stable pattern of activity.
> I have noticed that turning off NCQ significantly reduces the number
> of errors.
I have already tried this and found that the errors were slightly more
frequent, but I had updated the HBA BIOS and firmware before I set the
queues to 1, so I'm not sure which change had an effect.
Regards,
Richard
* Re: mptbase Code={Abort}
From: Sebastian Wasilewski @ 2008-04-13 20:54 UTC
To: linux-scsi
Richard Scobie wrote:
>
> I have already tried this and found that the errors were slightly more
> frequent, but I had updated the HBA BIOS and firmware before I set the
> queues to 1, so I'm not sure which change had an effect.
In my case, setting it reduced the number of errors from ~100 to ~30
during a full RAID rebuild. Even so, that is still 30 too many.
I updated the BIOS and firmware first thing after connecting the
controllers, so unfortunately I cannot compare the results with the old
BIOS.
Regards,
Sebastian
* Re: mptbase Code={Abort} (LSI SAS1068 PCI-X Fusion-MPT)
From: Sebastian Wasilewski @ 2008-04-20 13:14 UTC
To: linux-scsi
Hi All,
The problem still exists, with NCQ turned off as well as with NCQ turned on.
# dmesg |grep 'Code={Abort}, SubCode(0x3000)'
mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort},
SubCode(0x3000)
mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort},
SubCode(0x3000)
mptbase: ioc0: LogInfo(0x31123000): Originator={PL}, Code={Abort},
SubCode(0x3000)
(...)
# dmesg |grep 'Code={Abort}, SubCode(0x3000)' |wc -l
377
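Since more than one SubCode has been seen in this thread (0x3000 and
0x0403), a per-value tally may be more telling than a single count.
A minimal sketch, assuming each kernel message sits on one line in the
real log (the wrapping above comes from the mail client); the function
name count_loginfo is made up for this example.

```shell
# Count LogInfo events per distinct 32-bit value, most frequent first.
# Reads kernel log text on stdin; count_loginfo is a hypothetical helper.
count_loginfo() {
    grep -o 'LogInfo(0x[0-9a-fA-F]*)' | sort | uniq -c | sort -rn
}

# Usage: dmesg | count_loginfo
```

Each output line is a count followed by the LogInfo value it belongs
to, so a change in the mix of values between test runs is easy to spot.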
Does anyone know what these messages mean?
Regards,
Sebastian