* mptsas problem
@ 2008-04-06 19:56 Wakko Warner
[not found] ` <0631C836DBF79F42B5A60C8C8D4E8229F64FC1@NAMAIL2.ad.lsil.com>
0 siblings, 1 reply; 21+ messages in thread
From: Wakko Warner @ 2008-04-06 19:56 UTC (permalink / raw)
To: linux-scsi
I recall seeing some of this problem on the list a few days ago, but I can't
remember what the subject was and was unable to find the thread.
I'm seeing this:
Apr 5 17:10:07 ani kernel: [519452.504149] mptscsih: ioc0: attempting task abort! (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519452.504157] sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 44 57 eb 99 00 00 28 00
Apr 5 17:10:36 ani kernel: [519452.760082] mptscsih: ioc0: task abort: FAILED (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519452.760092] mptscsih: ioc0: attempting target reset! (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519452.760100] sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 44 57 eb 99 00 00 28 00
Apr 5 17:10:36 ani kernel: [519453.016015] mptscsih: ioc0: target reset: SUCCESS (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519454.164003] mptbase: ioc0: LogInfo(0x31110e00): Originator={PL}, Code={Reset}, SubCode(0x0e00)
Apr 5 17:10:36 ani kernel: [519454.164016] mptscsih: ioc0: attempting bus reset! (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519454.164019] sd 0:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 44 57 eb 99 00 00 28 00
Apr 5 17:10:36 ani kernel: [519454.419666] mptscsih: ioc0: bus reset: SUCCESS (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519464.427314] mptbase: ioc0: LogInfo(0x31110e00): Originator={PL}, Code={Reset}, SubCode(0x0e00)
Apr 5 17:10:36 ani kernel: [519464.427327] mptscsih: ioc0: attempting host reset! (sc=ebd80500)
Apr 5 17:10:36 ani kernel: [519464.427331] mptbase: ioc0: Initiating recovery
Apr 5 17:10:36 ani kernel: [519481.736841] mptscsih: ioc0: host reset: SUCCESS (sc=ebd80500)
Apr 5 17:10:37 ani kernel: [519481.736847] sd 0:0:0:0: Device offlined - not ready after error recovery
Apr 5 17:10:37 ani kernel: [519481.736860] sd 0:0:0:0: [sda] Result: hostbyte=0x00 driverbyte=0x06
Apr 5 17:10:37 ani kernel: [519481.736864] end_request: I/O error, dev sda, sector 1146612633
Apr 5 17:10:37 ani kernel: [519481.736883] sd 0:0:0:0: rejecting I/O to offline device
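Two values in the log above can be decoded by hand, which helps confirm that the aborted command and the I/O error refer to the same place on disk. The sketch below is only an illustration and not part of the original post. It assumes the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and the LogInfo field split that the driver's own message suggests (bus type, originator, code, subcode). The function names are made up for this example.

```python
# Originator names as printed by the driver message in the log (assumed mapping).
SAS_ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_read10(cdb_hex: str) -> dict:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # number of blocks to read
    }


def decode_sas_loginfo(value: int) -> dict:
    """Split a 32-bit SAS LogInfo word into its four fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": SAS_ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "subcode": value & 0xFFFF,
    }


if __name__ == "__main__":
    print(decode_read10("28 00 44 57 eb 99 00 00 28 00"))
    # {'lba': 1146612633, 'blocks': 40}
    print(decode_sas_loginfo(0x31110E00))
    # {'bus_type': 3, 'originator': 'PL', 'code': 17, 'subcode': 3584}
```

The decoded LBA, 1146612633, is the sector named in the end_request line, and a 40-block read is 20 KiB with 512-byte sectors. The LogInfo fields agree with the driver's own rendering: Originator={PL} and SubCode(0x0e00).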
This is from boot up:
Mar 30 16:53:48 ani kernel: [ 84.772554] mptbase: ioc0: Initiating bringup
Mar 30 16:53:48 ani kernel: [ 85.474583] ioc0: LSISAS1068 B0: Capabilities={Initiator}
Mar 30 16:53:48 ani kernel: [ 98.944823] scsi0 : ioc0: LSISAS1068 B0, FwRev=01060000h, Ports=1, MaxQ=511, IRQ=16
Mar 30 16:53:48 ani kernel: [ 98.957157] scsi 0:0:0:0: Direct-Access ATA ST3750640AS E PQ: 0 ANSI: 5
Mar 30 16:53:48 ani kernel: [ 98.957626] sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:48 ani kernel: [ 98.958701] sd 0:0:0:0: [sda] Write Protect is off
Mar 30 16:53:48 ani kernel: [ 98.958747] sd 0:0:0:0: [sda] Mode Sense: 67 00 00 08
Mar 30 16:53:48 ani kernel: [ 98.960814] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:48 ani kernel: [ 98.961067] sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:48 ani kernel: [ 98.962139] sd 0:0:0:0: [sda] Write Protect is off
Mar 30 16:53:48 ani kernel: [ 98.962184] sd 0:0:0:0: [sda] Mode Sense: 67 00 00 08
Mar 30 16:53:48 ani kernel: [ 98.964235] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:48 ani kernel: [ 98.964293] sda: sda1 sda2
Mar 30 16:53:48 ani kernel: [ 98.984038] sd 0:0:0:0: [sda] Attached SCSI disk
Mar 30 16:53:48 ani kernel: [ 98.988380] scsi 0:0:1:0: Direct-Access ATA ST3750640AS E PQ: 0 ANSI: 5
Mar 30 16:53:48 ani kernel: [ 98.988853] sd 0:0:1:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:48 ani kernel: [ 98.989922] sd 0:0:1:0: [sdb] Write Protect is off
Mar 30 16:53:48 ani kernel: [ 98.989968] sd 0:0:1:0: [sdb] Mode Sense: 67 00 00 08
Mar 30 16:53:48 ani kernel: [ 98.992016] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:48 ani kernel: [ 98.992257] sd 0:0:1:0: [sdb] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:48 ani kernel: [ 98.993326] sd 0:0:1:0: [sdb] Write Protect is off
Mar 30 16:53:48 ani kernel: [ 98.993371] sd 0:0:1:0: [sdb] Mode Sense: 67 00 00 08
Mar 30 16:53:48 ani kernel: [ 98.995419] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:48 ani kernel: [ 98.995476] sdb: sdb1 sdb2
Mar 30 16:53:48 ani kernel: [ 99.019001] sd 0:0:1:0: [sdb] Attached SCSI disk
Mar 30 16:53:48 ani kernel: [ 99.023408] scsi 0:0:2:0: Direct-Access ATA ST3750640AS E PQ: 0 ANSI: 5
Mar 30 16:53:49 ani kernel: [ 99.023877] sd 0:0:2:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:49 ani kernel: [ 99.024955] sd 0:0:2:0: [sdc] Write Protect is off
Mar 30 16:53:49 ani kernel: [ 99.025000] sd 0:0:2:0: [sdc] Mode Sense: 67 00 00 08
Mar 30 16:53:49 ani kernel: [ 99.027040] sd 0:0:2:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:49 ani kernel: [ 99.027283] sd 0:0:2:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB)
Mar 30 16:53:49 ani kernel: [ 99.028352] sd 0:0:2:0: [sdc] Write Protect is off
Mar 30 16:53:49 ani kernel: [ 99.028398] sd 0:0:2:0: [sdc] Mode Sense: 67 00 00 08
Mar 30 16:53:49 ani kernel: [ 99.030434] sd 0:0:2:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:49 ani kernel: [ 99.030491] sdc: sdc1 sdc2
Mar 30 16:53:49 ani kernel: [ 99.056963] sd 0:0:2:0: [sdc] Attached SCSI disk
Mar 30 16:53:49 ani kernel: [ 99.058503] scsi 0:0:3:0: Direct-Access ATA WDC WD800JD-75MS 1E04 PQ: 0 ANSI: 5
Mar 30 16:53:49 ani kernel: [ 99.058990] sd 0:0:3:0: [sdd] 156250000 512-byte hardware sectors (80000 MB)
Mar 30 16:53:49 ani kernel: [ 99.059669] sd 0:0:3:0: [sdd] Write Protect is off
Mar 30 16:53:49 ani kernel: [ 99.059715] sd 0:0:3:0: [sdd] Mode Sense: 67 00 00 08
Mar 30 16:53:49 ani kernel: [ 99.060975] sd 0:0:3:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:49 ani kernel: [ 99.061247] sd 0:0:3:0: [sdd] 156250000 512-byte hardware sectors (80000 MB)
Mar 30 16:53:49 ani kernel: [ 99.061921] sd 0:0:3:0: [sdd] Write Protect is off
Mar 30 16:53:49 ani kernel: [ 99.061966] sd 0:0:3:0: [sdd] Mode Sense: 67 00 00 08
Mar 30 16:53:49 ani kernel: [ 99.063235] sd 0:0:3:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 30 16:53:49 ani kernel: [ 99.063291] sdd: unknown partition table
Mar 30 16:53:49 ani kernel: [ 99.091033] sd 0:0:3:0: [sdd] Attached SCSI disk
My configuration:
LSI SAS PCI-X LSISAS1068 controller
SuperMicro X5DAE motherboard (Intel E7505 chipset)
4GB memory
Pristine Kernel 2.6.24.3
3x Seagate ST3750640AS SATA (Not sure what the firmware is, ident just shows
"E") sda sdb and sdc. These disks are in RAID1 (partition 1) and RAID5
(partition 2)
1x WDC WD800JD-75MS SATA sdd.
I've had sda, sdb, and sdc stop working as shown in the above log for sda. I
checked /sys/block/sd[abc]/device/queue_depth. All 4 of my drives were 64.
(NOTE: I have 6 of the same seagate drives on a supermicro X7DA3 with an
adaptec sas onboard. Those show 31 in queue_depth). I have since changed
queue_depth to 1.
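For reference, the queue_depth check and change described above can be scripted. This is a minimal sketch rather than what was actually run on this machine: the helper names and the `sysfs_root` parameter are invented for illustration (the parameter only exists so the code can be pointed at a scratch directory), and writing to the real file requires root.

```python
from pathlib import Path


def queue_depth_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Return the sysfs file holding the queue depth for a disk such as 'sda'."""
    return Path(sysfs_root) / "block" / disk / "device" / "queue_depth"


def get_queue_depth(disk: str, sysfs_root: str = "/sys") -> int:
    """Read the current queue depth."""
    return int(queue_depth_path(disk, sysfs_root).read_text().strip())


def set_queue_depth(disk: str, depth: int, sysfs_root: str = "/sys") -> None:
    """Write a new queue depth (needs root on a real system)."""
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    queue_depth_path(disk, sysfs_root).write_text(f"{depth}\n")


if __name__ == "__main__":
    # Only report the current values; skip disks that are not present.
    for disk in ("sda", "sdb", "sdc"):
        if queue_depth_path(disk).exists():
            print(disk, get_queue_depth(disk))
```

A value written this way does not survive a reboot, so it would have to be reapplied at boot time.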
The last time the disk dropped out was when I was compressing a large file
to a usb device.
One more thing that I tried (which didn't affect queue_depth): I changed
CONFIG_FUSION_MAX_SGE from the max (128, I think) to the minimum of 16.
Anyone tell me what's going on?
lspci output:
00:00.0 Host bridge: Intel Corporation E7505 Memory Controller Hub (rev 03)
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, fast devsel, latency 0
Memory at dc000000 (32-bit, prefetchable) [size=64M]
Capabilities: <access denied>
00:00.1 Class ff00: Intel Corporation E7505/E7205 Series RAS Controller (rev 03)
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: fast devsel
00:01.0 PCI bridge: Intel Corporation E7505/E7205 PCI-to-AGP Bridge (rev 03) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, fast devsel, latency 96
Memory at e0000000 (32-bit, prefetchable) [size=64M]
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 00003000-00003fff
Memory behind bridge: d8100000-d81fffff
Prefetchable memory behind bridge: e8000000-f7ffffff
Capabilities: <access denied>
00:02.0 PCI bridge: Intel Corporation E7505 Hub Interface B PCI-to-PCI Bridge (rev 03) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, fast devsel, latency 64
Bus: primary=00, secondary=02, subordinate=04, sec-latency=0
I/O behind bridge: 00004000-00005fff
Memory behind bridge: d8200000-d84fffff
Prefetchable memory behind bridge: d4000000-d41fffff
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 18
I/O ports at 2440 [size=32]
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 02) (prog-if 00 [UHCI])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 19
I/O ports at 2460 [size=32]
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 02) (prog-if 00 [UHCI])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 20
I/O ports at 2480 [size=32]
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 02) (prog-if 20 [EHCI])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 21
Memory at d8000000 (32-bit, non-prefetchable) [size=1K]
Capabilities: <access denied>
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 82) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=05, subordinate=05, sec-latency=64
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 02)
Flags: bus master, medium devsel, latency 0
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 02) (prog-if 8a [Master SecP PriP])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 20
I/O ports at 01f0 [size=8]
I/O ports at 03f4 [size=1]
I/O ports at 0170 [size=8]
I/O ports at 0374 [size=1]
I/O ports at 24a0 [size=16]
Memory at d8000400 (32-bit, non-prefetchable) [size=1K]
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 02)
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: medium devsel, IRQ 22
I/O ports at 1100 [size=32]
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 02)
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, medium devsel, latency 0, IRQ 22
I/O ports at 2000 [size=256]
I/O ports at 2400 [size=64]
Memory at d8000c00 (32-bit, non-prefetchable) [size=512]
Memory at d8000800 (32-bit, non-prefetchable) [size=256]
Capabilities: <access denied>
01:00.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01) (prog-if 00 [VGA])
Subsystem: Hightech Information System Ltd. Unknown device 320e
Flags: bus master, fast Back2Back, 66MHz, medium devsel, latency 66, IRQ 23
Memory at e8000000 (32-bit, prefetchable) [size=128M]
I/O ports at 3000 [size=256]
Memory at d8100000 (32-bit, non-prefetchable) [size=64K]
[virtual] Expansion ROM at d8120000 [disabled] [size=128K]
Capabilities: <access denied>
01:00.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (Secondary) (rev 01)
Subsystem: Hightech Information System Ltd. Unknown device 320f
Flags: fast Back2Back, 66MHz, medium devsel
Memory at f0000000 (32-bit, prefetchable) [size=128M]
Memory at d8110000 (32-bit, non-prefetchable) [size=64K]
Capabilities: <access denied>
02:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, 66MHz, fast devsel, latency 0
Memory at d8200000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
02:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, fast devsel, latency 64
Bus: primary=02, secondary=03, subordinate=03, sec-latency=64
I/O behind bridge: 00004000-00004fff
Memory behind bridge: d8300000-d83fffff
Capabilities: <access denied>
02:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC])
Subsystem: Super Micro Computer Inc Unknown device 4080
Flags: bus master, 66MHz, fast devsel, latency 0
Memory at d8201000 (32-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
02:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, fast devsel, latency 64
Bus: primary=02, secondary=04, subordinate=04, sec-latency=64
I/O behind bridge: 00005000-00005fff
Memory behind bridge: d8400000-d84fffff
Prefetchable memory behind bridge: 00000000d4000000-00000000d41fffff
Capabilities: <access denied>
03:03.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
Subsystem: Intel Corporation Unknown device 1011
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 17
Memory at d8300000 (64-bit, non-prefetchable) [size=128K]
I/O ports at 4000 [size=64]
Capabilities: <access denied>
04:02.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
Subsystem: LSI Logic / Symbios Logic Unknown device 3030
Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 16
I/O ports at 5000 [disabled] [size=256]
Memory at d8410000 (64-bit, non-prefetchable) [size=16K]
Memory at d8400000 (64-bit, non-prefetchable) [size=64K]
[virtual] Expansion ROM at d4000000 [disabled] [size=2M]
Capabilities: <access denied>
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: mptsas problem
@ 2008-04-06 23:40 Richard Scobie
2008-04-07 1:04 ` Wakko Warner
0 siblings, 1 reply; 21+ messages in thread
From: Richard Scobie @ 2008-04-06 23:40 UTC (permalink / raw)
To: linux-scsi
> I recall seeing some of this problem on the list a few days ago, but I
> can't remember what the subject was and was unable to find the thread.
You are perhaps thinking of this:
http://marc.info/?l=linux-scsi&m=120696978819085&w=2
but it seems to be a different issue.
From the message you posted, it looks as though there may be a problem
with sda.
Does smartctl -a -d ata /dev/sda show any obvious problems?
Regards,
Richard
* Re: mptsas problem
2008-04-06 23:40 mptsas problem Richard Scobie
@ 2008-04-07 1:04 ` Wakko Warner
2008-04-13 13:00 ` Douglas Gilbert
2008-04-13 14:31 ` James Bottomley
0 siblings, 2 replies; 21+ messages in thread
From: Wakko Warner @ 2008-04-07 1:04 UTC (permalink / raw)
To: Richard Scobie; +Cc: linux-scsi
Richard Scobie wrote:
> > I recall seeing some of this problem on the list a few days ago, but I
> > can't remember what the subject was and was unable to find the thread.
>
> You are perhaps thinking of this:
>
> http://marc.info/?l=linux-scsi&m=120696978819085&w=2
Yes.
> but it seems to be a different issue.
Ok.
> From the message you posted, it looks as though there may be a problem
> with sda.
It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
as stated before, it's 64)
I performed the same copy again with queue_depth=1 after the array rebuilt.
It worked fine then. No errors.
> Does smartctl -a -d ata /dev/sda show any obvious problems?
smartctl doesn't work on sd[a-d] at all:
# smartctl -a -d ata /dev/sdd
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
A mandatory SMART command failed: exiting. To continue, add one or more '-T
permissive' options.
# smartctl -a -d ata -T permissive /dev/sdd
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
=== START OF INFORMATION SECTION ===
Device Model: [No Information Found]
Serial Number: [No Information Found]
Firmware Version: [No Information Found]
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 1
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sun Apr 6 21:02:45 2008 EDT
SMART is only available in ATA Version 3 Revision 3 or greater.
We will try to proceed in spite of this.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
Checking for SMART support by trying SMART ENABLE command.
Error SMART Enable failed: Invalid argument
SMART ENABLE failed - this establishes that this device lacks SMART functionality.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
#
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
[not found] ` <0631C836DBF79F42B5A60C8C8D4E8229F64FC1@NAMAIL2.ad.lsil.com>
@ 2008-04-07 1:07 ` Wakko Warner
2008-04-07 11:36 ` Bernd Schubert
2008-04-07 15:16 ` Moore, Eric
0 siblings, 2 replies; 21+ messages in thread
From: Wakko Warner @ 2008-04-07 1:07 UTC (permalink / raw)
To: Moore, Eric; +Cc: linux-scsi
Moore, Eric wrote:
> The other fellow was having his controller go into fault state, that is not
> the case here. Here commands are not completing back from [0:0:0:0]
> device in the timeout period, typically 30 seconds. Its hard to tell from
Is it possible that a queue_depth > 31 was causing the problem? I read
something about that (32 would cause the device to look like it was hot
unplugged or something. I did not experience this though, or if I did, it
never said the device came back as I had read)
> the information provided what is going on. A sas trace and/or pci trace
> would help in root cause whether the commands got lost out on the device or
> controller firmware.
I don't know how to do a trace of either.
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
2008-04-07 1:07 ` Wakko Warner
@ 2008-04-07 11:36 ` Bernd Schubert
2008-04-07 15:16 ` Moore, Eric
1 sibling, 0 replies; 21+ messages in thread
From: Bernd Schubert @ 2008-04-07 11:36 UTC (permalink / raw)
To: Wakko Warner; +Cc: Moore, Eric, linux-scsi, Lars Täuber, James Bottomley
Hi all,
On Monday 07 April 2008 03:07:38 Wakko Warner wrote:
> Moore, Eric wrote:
> > The other fellow was having his controller go into fault state, that is
> > not the case here. Here commands are not completing back from [0:0:0:0]
> > device in the timeout period, typically 30 seconds. Its hard to tell
> > from
From my point of view, the error handler is far too quiet by default. Below
are some patches to improve the situation. I tested these patches with
Infortrend SCSI RAID systems and LSI SCSI HBAs.
With these patches, the error handler will now report why it was activated. We
also had the problem that the error handler was activated in an endless loop;
several of these patches are meant to prevent this.
http://www.pci.uni-heidelberg.de/tc/usr/bernd/downloads/scsi/2.6.22-eh-patches.tar.bz2
From the series file:
print_eh_activation.patch
scsi_error_limit.patch
scsi_error_state.patch
soft_error_requeue.patch
starget_quiesce_ignore_offlined.patch
scsi_eh_did_no_connect.patch
fusion_tip.patch
All except fusion_tip.patch are patches for the error handler.
fusion_tip.patch replaces the in-kernel MPT Fusion driver with a more recent
version I got from Eric.
Cheers,
Bernd
--
Bernd Schubert
Q-Leap Networks GmbH
* RE: mptsas problem
2008-04-07 1:07 ` Wakko Warner
2008-04-07 11:36 ` Bernd Schubert
@ 2008-04-07 15:16 ` Moore, Eric
2008-04-07 16:54 ` Wakko Warner
1 sibling, 1 reply; 21+ messages in thread
From: Moore, Eric @ 2008-04-07 15:16 UTC (permalink / raw)
To: Wakko Warner; +Cc: linux-scsi
On Sunday, April 06, 2008 7:08 PM, Wakko Warner wrote:
> Moore, Eric wrote:
> > The other fellow was having his controller go into fault
> state, that is not
> > the case here. Here commands are not completing back from
> [0:0:0:0]
> > device in the timeout period, typically 30 seconds. Its
> hard to tell from
>
> Is it possible that with queue_depth > 31 was causing the
> problem? I read
> something about that (32 would cause the device to look like
> it was hot
> unplugged or something. I did not experience this though, or
> if I did, it
> never said the device came back as I had read)
>
Can you please download lsiutil from here:
ftp://ftp.lsi.com/HostAdapterDrivers/lsiutil
# ./lsiutil 100 > output
And send me the 'output'. I need to see whether NCQ was enabled, and
what the SATAMaxQDepth was set to.
Eric Moore
* Re: mptsas problem
2008-04-07 15:16 ` Moore, Eric
@ 2008-04-07 16:54 ` Wakko Warner
0 siblings, 0 replies; 21+ messages in thread
From: Wakko Warner @ 2008-04-07 16:54 UTC (permalink / raw)
To: Moore, Eric; +Cc: linux-scsi
Moore, Eric wrote:
> On Sunday, April 06, 2008 7:08 PM, Wakko Warner wrote:
> > Moore, Eric wrote:
> > > The other fellow was having his controller go into fault
> > state, that is not
> > > the case here. Here commands are not completing back from
> > [0:0:0:0]
> > > device in the timeout period, typically 30 seconds. Its
> > hard to tell from
> >
> > Is it possible that with queue_depth > 31 was causing the
> > problem? I read
> > something about that (32 would cause the device to look like
> > it was hot
> > unplugged or something. I did not experience this though, or
> > if I did, it
> > never said the device came back as I had read)
> >
>
> Can you please download lsiutil from here:
> ftp://ftp.lsi.com/HostAdapterDrivers/lsiutil
ftp.lsi.com doesn't exist.
> # ./lsiutil 100 > output
>
> And send me the 'output'. I need to see whether NCQ was enabled, and
> what the SATAMaxQDepth was set to.
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
2008-04-07 1:04 ` Wakko Warner
@ 2008-04-13 13:00 ` Douglas Gilbert
2008-04-13 14:39 ` James Bottomley
2008-04-13 14:31 ` James Bottomley
1 sibling, 1 reply; 21+ messages in thread
From: Douglas Gilbert @ 2008-04-13 13:00 UTC (permalink / raw)
To: Wakko Warner; +Cc: Richard Scobie, linux-scsi
Wakko Warner wrote:
> Richard Scobie wrote:
>>> I recall seeing some of this problem on the list a few days ago, but I
>>> can't remember what the subject was and was unable to find the thread.
>>
>> You are perhaps thinking of this:
>>
>> http://marc.info/?l=linux-scsi&m=120696978819085&w=2
>
> Yes.
>
>> but it seems to be a different issue.
>
> Ok.
>
>> From the message you posted, it looks as though there may be a problem
>> with sda.
>
> It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> as stated before, it's 64)
>
> I performed the same copy again with queue_depth=1 after the array rebuilt.
> It worked fine then. No errors.
>
>> Does smartctl -a -d ata /dev/sda show any obvious problems?
>
> smartctl doesn't work on sd[a-d] at all:
> # smartctl -a -d ata /dev/sdd
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
>
> A mandatory SMART command failed: exiting. To continue, add one or more '-T
> permissive' options.
> # smartctl -a -d ata -T permissive /dev/sdd
As a smartmontools developer I know that '-d ata' is
incorrect in this context. Either the '-d <command_set>' option
should be omitted, or '-d sat' should be given.
This is difficult to explain, and hence difficult to put
into an easily understood option. As far as Linux is concerned, SCSI
commands are being issued to a SCSI device (on a SCSI
transport). But those SCSI commands are mostly instances
of the SCSI ATA PASS-THROUGH command.
It is further complicated by the fact that a properly
implemented SAT layer (and I have never met one)
implements the SCSI commands used for SMART support.
If that were the case, 'smartctl -a -d scsi /dev/sdd' would
yield useful output, which would be a subset of what '-d sat'
would yield.
Clear as mud?
Doug Gilbert
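To make the advice above concrete, here is a small sketch that builds the smartctl command line with '-d sat' (or with no '-d' at all) instead of '-d ata'. It is only an illustration: the helper names are hypothetical, and whether the controller firmware actually passes the ATA commands through is something only a real run can show.

```python
import subprocess
from typing import List, Optional


def smartctl_argv(device: str,
                  device_type: Optional[str] = "sat",
                  info_only: bool = False) -> List[str]:
    """Build a smartctl command line; device_type=None omits '-d' entirely."""
    argv = ["smartctl", "-i" if info_only else "-a"]
    if device_type is not None:
        argv += ["-d", device_type]
    argv.append(device)
    return argv


def run_smartctl(device: str, device_type: Optional[str] = "sat") -> str:
    """Run smartctl and return its standard output as text."""
    # smartctl uses its exit status as a bit mask, so a non-zero status
    # does not necessarily mean the command failed; do not raise on it.
    result = subprocess.run(smartctl_argv(device, device_type),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe anywhere.
    print(" ".join(smartctl_argv("/dev/sda")))        # smartctl -a -d sat /dev/sda
    print(" ".join(smartctl_argv("/dev/sda", None)))  # smartctl -a /dev/sda
```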
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
>
> === START OF INFORMATION SECTION ===
> Device Model: [No Information Found]
> Serial Number: [No Information Found]
> Firmware Version: [No Information Found]
> Device is: Not in smartctl database [for details use: -P showall]
> ATA Version is: 1
> ATA Standard is: Exact ATA specification draft version not indicated
> Local Time is: Sun Apr 6 21:02:45 2008 EDT
> SMART is only available in ATA Version 3 Revision 3 or greater.
> We will try to proceed in spite of this.
> SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
> Checking for SMART support by trying SMART ENABLE command.
> Error SMART Enable failed: Invalid argument
> SMART ENABLE failed - this establishes that this device lacks SMART functionality.
> A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
> #
>
* Re: mptsas problem
2008-04-07 1:04 ` Wakko Warner
2008-04-13 13:00 ` Douglas Gilbert
@ 2008-04-13 14:31 ` James Bottomley
2008-04-13 16:48 ` Wakko Warner
1 sibling, 1 reply; 21+ messages in thread
From: James Bottomley @ 2008-04-13 14:31 UTC (permalink / raw)
To: Wakko Warner; +Cc: Richard Scobie, linux-scsi
On Sun, 2008-04-06 at 21:04 -0400, Wakko Warner wrote:
> > From the message you posted, it looks as though there may be a problem
> > with sda.
>
> It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> as stated before, it's 64)
>
> I performed the same copy again with queue_depth=1 after the array rebuilt.
> It worked fine then. No errors.
Actually, I'd say this is a signal for NCQ errors with the drive.
I'm afraid only LSI would be able to say for certain, because the mptsas
implements its NCQ handling in firmware. libata-core doesn't show any
special workarounds for your device (ST3750640AS) but that doesn't mean
there isn't a problem. If it's really an NCQ implementation issue, then
clamping the queue depth to 1 is about the only fix, I'm afraid.
James
* Re: mptsas problem
2008-04-13 13:00 ` Douglas Gilbert
@ 2008-04-13 14:39 ` James Bottomley
0 siblings, 0 replies; 21+ messages in thread
From: James Bottomley @ 2008-04-13 14:39 UTC (permalink / raw)
To: dougg; +Cc: Wakko Warner, Richard Scobie, linux-scsi
On Sun, 2008-04-13 at 23:00 +1000, Douglas Gilbert wrote:
> Wakko Warner wrote:
> > Richard Scobie wrote:
> >>> I recall seeing some of this problem on the list a few days ago, but I
> >>> can't remember what the subject was and was unable to find the thread.
> >>
> >> You are perhaps thinking of this:
> >>
> >> http://marc.info/?l=linux-scsi&m=120696978819085&w=2
> >
> > Yes.
> >
> >> but it seems to be a different issue.
> >
> > Ok.
> >
> >> From the message you posted, it looks as though there may be a problem
> >> with sda.
> >
> > It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> > as stated before, it's 64)
> >
> > I performed the same copy again with queue_depth=1 after the array rebuilt.
> > It worked fine then. No errors.
> >
> >> Does smartctl -a -d ata /dev/sda show any obvious problems?
> >
> > smartctl doesn't work on sd[a-d] at all:
> > # smartctl -a -d ata /dev/sdd
> > smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> > Home page is http://smartmontools.sourceforge.net/
> >
> > Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
> >
> > A mandatory SMART command failed: exiting. To continue, add one or more '-T
> > permissive' options.
> > # smartctl -a -d ata -T permissive /dev/sdd
>
> As a smartmontools developer I know that '-d ata' is
> incorrect in this context. The 'd <command_set> should
> either not be given or '-d sat' should be given.
>
> This is difficult to explain and hence put in an easily
> understood option. As far as linux is concerned SCSI
> commands are being issued to a SCSI device (on a SCSI
> transport). But those SCSI commands are mostly instances
> of the SCSI ATA PASS-THROUGH command.
This is really what you need to use, although implementation of ATA_12
and ATA_16, which are the prescribed taskfile carriers, is also done in
the SAT layer.
> It is further complicated by the fact that a properly
> implemented SAT layer (and I have never met one)
> implements the SCSI commands used for SMART support.
> If that was the case 'smartctl -a -d scsi /dev/sdd' would
> yield useful output which would be a subset of what 'd sat'
> would yield.
The problem with this, of course, is that the SCSI stats are less
comprehensive than the ATA SMART ones, so it's usually better to use the
pass-through commands if you know you're dealing with an ATA device.
Actually, I'm afraid it's even more complex than this: libata has a SAT
layer for all SATA devices it attaches to; libsas also uses this, so
you'd think that we only have one SAT layer in Linux. Unfortunately,
mptsas and other "smart" SAS cards actually implement their own SAT
layer in the firmware which bypasses the linux one.
The Linux one definitely doesn't implement the SMART piece of the SAT
spec; I'm not at all sure about the various firmware ones.
> Clear as mud?
Not quite ... but hopefully I've stirred it enough ...
James
* Re: mptsas problem
2008-04-13 14:31 ` James Bottomley
@ 2008-04-13 16:48 ` Wakko Warner
2008-04-13 16:58 ` James Bottomley
0 siblings, 1 reply; 21+ messages in thread
From: Wakko Warner @ 2008-04-13 16:48 UTC (permalink / raw)
To: James Bottomley; +Cc: Richard Scobie, linux-scsi
James Bottomley wrote:
> On Sun, 2008-04-06 at 21:04 -0400, Wakko Warner wrote:
> > > From the message you posted, it looks as though there may be a problem
> > > with sda.
> >
> > It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> > as stated before, it's 64)
> >
> > I performed the same copy again with queue_depth=1 after the array rebuilt.
> > It worked fine then. No errors.
>
> Actually, I'd say this is a signal for NCQ errors with the drive.
Unless it's this specific drive firmware, I'd have to disagree. I have 6 of
the exact same drives (can't confirm firmware is the same though) in RAID5
on an aic9410 SAS controller w/o problems. The queue_depth for those is
31. I considered setting that value on the ones I'm having problems with,
but I really don't want to go through another 4-hour rebuild.
> I'm afraid only LSI would be able to say for certain, because the mptsas
> implements its NCQ handling in firmware. libata-core doesn't show any
> special workarounds for your device (ST3750640AS) but that doesn't mean
> there isn't a problem. If it's really an NCQ implementation issue, then
> clamping the queue depth to 1 is about the only fix, I'm afraid.
If it survives another week, I'd say using depth of 1 worked.
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
2008-04-13 16:48 ` Wakko Warner
@ 2008-04-13 16:58 ` James Bottomley
2008-04-13 17:06 ` Wakko Warner
0 siblings, 1 reply; 21+ messages in thread
From: James Bottomley @ 2008-04-13 16:58 UTC (permalink / raw)
To: Wakko Warner; +Cc: Richard Scobie, linux-scsi
On Sun, 2008-04-13 at 12:48 -0400, Wakko Warner wrote:
> James Bottomley wrote:
> > On Sun, 2008-04-06 at 21:04 -0400, Wakko Warner wrote:
> > > > From the message you posted, it looks as though there may be a problem
> > > > with sda.
> > >
> > > It's working fine with /sys/block/sd[abc]/device/queue_depth = 1 (on boot up,
> > > as stated before, it's 64)
> > >
> > > I performed the same copy again with queue_depth=1 after the array rebuilt.
> > > It worked fine then. No errors.
> >
> > Actually, I'd say this is a signal for NCQ errors with the drive.
>
> Unless it's this specific drive firmware, I'd have to disagree. I have 6 of
> the exact same drives (can't confirm firmware is the same though) in raid5
> on an aic9410 sas controller w/o problems. The queue_depth for those is
> 31. I considered setting that value on the ones I'm having problems with,
> but I really don't want to go through another 4 hour rebuild.
Well, yes, different revs of the firmware can behave differently. The
libata-core blacklist includes the firmware version as part of the
pattern matching.
There's an easy way to verify: smartctl -i will print the firmware
version string.
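
A minimal sketch of pulling just that string out of the smartctl output,
assuming the drive is reported in ATA style with a "Firmware Version:"
line. The helper name fw_version is made up for illustration; it is not
part of smartmontools.

```shell
# Print the firmware revision from `smartctl -i` output read on stdin.
# Assumes an ATA-style "Firmware Version:" line; prints nothing if the
# line is absent (for example when the device answers only as plain SCSI).
fw_version() {
    sed -n 's/^Firmware Version:[[:space:]]*//p'
}
```

Typical use would be `smartctl -i /dev/sda | fw_version`, run once per
drive so the strings can be compared side by side.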
> > I'm afraid only LSI would be able to say for certain, because the mptsas
> > implements its NCQ handling in firmware. libata-core doesn't show any
> > special workarounds for your device (ST3750640AS) but that doesn't mean
> > there isn't a problem. If it's really an NCQ implementation issue, then
> > clamping the queue depth to 1 is about the only fix, I'm afraid.
>
> If it survives another week, I'd say using depth of 1 worked.
James
* Re: mptsas problem
2008-04-13 16:58 ` James Bottomley
@ 2008-04-13 17:06 ` Wakko Warner
2008-04-13 17:37 ` James Bottomley
0 siblings, 1 reply; 21+ messages in thread
From: Wakko Warner @ 2008-04-13 17:06 UTC (permalink / raw)
To: James Bottomley; +Cc: Richard Scobie, linux-scsi
James Bottomley wrote:
> On Sun, 2008-04-13 at 12:48 -0400, Wakko Warner wrote:
> > > Actually, I'd say this is a signal for NCQ errors with the drive.
> >
> > Unless it's this specific drive firmware, I'd have to disagree. I have 6 of
> > the exact same drives (can't confirm firmware is the same though) in raid5
> > on an aic9410 sas controller w/o problems. The queue_depth for those is
> > 31. I considered setting that value on the ones I'm having problems with,
> > but I really don't want to go through another 4 hour rebuild.
>
> Well, yes, different revs of the firmware can behave differently. The
> libata-core blacklist includes the firmware version as part of the
> pattern matching.
> There's an easy way to verify: smartctl -i will print the firmware
> version string.
The aic sas one shows 3.AAE
The lsi mptsas can't be queried:
# smartctl -d scsi -i /dev/sdc
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Serial number: 3QD076X8
Device type: disk
Local Time is: Sun Apr 13 13:02:36 2008 EDT
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
#
I tried -d ata, -d sat, and omitting -d; none of them provided any
information. -T permissive didn't work either.
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
2008-04-13 17:06 ` Wakko Warner
@ 2008-04-13 17:37 ` James Bottomley
2008-04-13 20:49 ` Wakko Warner
2008-04-13 22:22 ` Douglas Gilbert
0 siblings, 2 replies; 21+ messages in thread
From: James Bottomley @ 2008-04-13 17:37 UTC (permalink / raw)
To: Wakko Warner; +Cc: Richard Scobie, linux-scsi
On Sun, 2008-04-13 at 13:06 -0400, Wakko Warner wrote:
> > There's an easy way to verify: smartctl -i will print the firmware
> > version string.
>
> The aic sas one shows 3.AAE
> The lsi mptsas can't be queried:
> # smartctl -d scsi -i /dev/sdc
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Serial number: 3QD076X8
> Device type: disk
> Local Time is: Sun Apr 13 13:02:36 2008 EDT
> Device supports SMART and is Enabled
> Temperature Warning Disabled or Not Supported
> #
>
> I tried -d ata, -d sat and not using -d, provided no information. -T
> permissive didn't work either.
That's a bit unfortunate ... it means the LSI firmware SAT layer doesn't
support ATA_16. You can try ATA_12 just to make sure (little chance it
will work, but just in case):
smartctl -i -d sat,12 /dev/sd<x>
If that doesn't work, I'm afraid you'll need to transfer the drive to a
card that does support the command (like the aic).
However, it doesn't have to be a drive firmware fault, it could be some
type of corner case NCQ failure triggered by the NCQ handler inside the
LSI firmware ... in which case, your only hope for fixing it lies with
LSI.
There is one final thought: the reason the aic94xx has a queue depth of
31 is because that's the maximum NCQ can support (well, it's 32 max, but
we need one command for error handling). So, if the LSI shows a queue
depth of 64 it may be queueing internally as well. You could try
lowering the lsi queue to 31 and seeing if it makes a difference.
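
A minimal sketch of that change, using the same sysfs queue_depth
attribute mentioned earlier in the thread. The function name and the
optional third argument (an alternate sysfs root, there only so the helper
can be tried on a scratch directory) are assumptions for illustration.

```shell
# Write a new queue depth for one disk and print the value read back.
# $1 = device name (e.g. sdc), $2 = depth, $3 = sysfs root (default /sys).
set_queue_depth() {
    dev=$1 depth=$2 root=${3:-/sys}
    attr="$root/block/$dev/device/queue_depth"
    # Refuse early if the attribute is missing or not writable (needs root).
    [ -w "$attr" ] || { echo "cannot write $attr" >&2; return 1; }
    echo "$depth" > "$attr" || return 1
    # Reading back shows what the kernel actually accepted.
    cat "$attr"
}
```

For example, `set_queue_depth sdc 31` as root. The setting does not
survive a reboot, so it has to be reapplied after each boot.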
James
* Re: mptsas problem
2008-04-13 17:37 ` James Bottomley
@ 2008-04-13 20:49 ` Wakko Warner
2008-04-14 18:16 ` Moore, Eric
2008-04-13 22:22 ` Douglas Gilbert
1 sibling, 1 reply; 21+ messages in thread
From: Wakko Warner @ 2008-04-13 20:49 UTC (permalink / raw)
To: James Bottomley; +Cc: Richard Scobie, linux-scsi
James Bottomley wrote:
> On Sun, 2008-04-13 at 13:06 -0400, Wakko Warner wrote:
> > The aic sas one shows 3.AAE
> > The lsi mptsas can't be queried:
> > # smartctl -d scsi -i /dev/sdc
> > smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> > Home page is http://smartmontools.sourceforge.net/
> >
> > Serial number: 3QD076X8
> > Device type: disk
> > Local Time is: Sun Apr 13 13:02:36 2008 EDT
> > Device supports SMART and is Enabled
> > Temperature Warning Disabled or Not Supported
> > #
> >
> > I tried -d ata, -d sat and not using -d, provided no information. -T
> > permissive didn't work either.
>
> That's a bit unfortunate ... it means the LSI firmware SAT layer doesn't
> support ATA_16. You can try ATA_12 just to make sure (little chance it
> will work, but just in case):
>
> smartctl -i -d sat,12 /dev/sd<x>
Same thing.
> If that doesn't work, I'm afraid you'll need to transfer the drive to a
> card that does support the command (like the aic).
The aic is not in the same place as the lsi. I'm not willing to shut down
this machine to do this right now.
> However, it doesn't have to be a drive firmware fault, it could be some
> type of corner case NCQ failure triggered by the NCQ handler inside the
> LSI firmware ... in which case, your only hope for fixing it lies with
> LSI.
Somehow, I don't think it is the drive (but it could be).
> There is one final thought: the reason the aic94xx has a queue depth of
> 31 is because that's the maximum NCQ can support (well, it's 32 max, but
> we need one command for error handling). So, if the LSI shows a queue
That's what I've read.
> depth of 64 it may be queueing internally as well. You could try
> lowering the lsi queue to 31 and seeing if it makes a difference.
IIRC, someone said the LSIs do queue in firmware.
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* Re: mptsas problem
2008-04-13 17:37 ` James Bottomley
2008-04-13 20:49 ` Wakko Warner
@ 2008-04-13 22:22 ` Douglas Gilbert
2008-04-13 23:51 ` James Bottomley
1 sibling, 1 reply; 21+ messages in thread
From: Douglas Gilbert @ 2008-04-13 22:22 UTC (permalink / raw)
To: James Bottomley; +Cc: Wakko Warner, Richard Scobie, linux-scsi
James Bottomley wrote:
> On Sun, 2008-04-13 at 13:06 -0400, Wakko Warner wrote:
>>> There's an easy way to verify: smartctl -i will print the firmware
>>> version string.
>> The aic sas one shows 3.AAE
>> The lsi mptsas can't be queried:
>> # smartctl -d scsi -i /dev/sdc
>> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
>> Home page is http://smartmontools.sourceforge.net/
>>
>> Serial number: 3QD076X8
>> Device type: disk
>> Local Time is: Sun Apr 13 13:02:36 2008 EDT
>> Device supports SMART and is Enabled
>> Temperature Warning Disabled or Not Supported
>> #
>>
>> I tried -d ata, -d sat and not using -d, provided no information. -T
>> permissive didn't work either.
>
> That's a bit unfortunate ... it means the LSI firmware SAT layer doesn't
> support ATA_16. You can try ATA_12 just to make sure (little chance it
> will work, but just in case):
>
> smartctl -i -d sat,12 /dev/sd<x>
>
> If that doesn't work, I'm afraid you'll need to transfer the drive to a
> card that does support the command (like the aic).
>
> However, it doesn't have to be a drive firmware fault, it could be some
> type of corner case NCQ failure triggered by the NCQ handler inside the
> LSI firmware ... in which case, your only hope for fixing it lies with
> LSI.
>
> There is one final thought: the reason the aic94xx has a queue depth of
> 31 is because that's the maximum NCQ can support (well, it's 32 max, but
> we need one command for error handling). So, if the LSI shows a queue
> depth of 64 it may be queueing internally as well. You could try
> lowering the lsi queue to 31 and seeing if it makes a difference.
If memory serves, I needed to upgrade the firmware on my
MPT SAS fusion cards before the SCSI ATA PASS-THROUGH commands
worked (properly). I'm a long way away from my hardware
at the moment so I can't provide any further information
(such as firmware version numbers).
I have just asked Bruce Allen to place a date in the
smartctl copyright line because there are so many
versions of version 5.38 floating around! smartmontools
version 5.38 was recently released but beta (and sometimes
partially broken) versions of 5.38 have been available via
CVS for over 18 months. That said, the 5.38 versions
that mention 2008 (as in "2002-8") are most likely the
final release version.
Doug Gilbert
* Re: mptsas problem
2008-04-13 22:22 ` Douglas Gilbert
@ 2008-04-13 23:51 ` James Bottomley
0 siblings, 0 replies; 21+ messages in thread
From: James Bottomley @ 2008-04-13 23:51 UTC (permalink / raw)
To: dougg; +Cc: Wakko Warner, Richard Scobie, linux-scsi
On Mon, 2008-04-14 at 08:22 +1000, Douglas Gilbert wrote:
> > There is one final thought: the reason the aic94xx has a queue depth of
> > 31 is because that's the maximum NCQ can support (well, it's 32 max, but
> > we need one command for error handling). So, if the LSI shows a queue
> > depth of 64 it may be queueing internally as well. You could try
> > lowering the lsi queue to 31 and seeing if it makes a difference.
>
> If memory serves, I needed to upgrade the firmware on my
> MPT SAS fusion cards before the SCSI ATA PASS-THROUGH commands
> worked (properly). I'm a long way away from my hardware
> at the moment so I can't provide any further information
> (such as firmware version numbers).
OK, this prompted me to try mine, and it does work with my mptsas. This
is all of my information from the sysfs files in the host directory:
active_mode: Initiator
board_assembly: 03-01088-03B
board_name: SAS3800X
board_tracer: P185695005
can_queue: 127
cmd_per_lun: 7
debug_level: 00000000h
device_delay: 00
host_busy: 0
io_delay: 00
proc_name: mptsas
sg_tablesize: 128
state: running
supported_mode: Initiator
unchecked_isa_dma: 0
unique_id: 2
version_bios: 00.00.00.00
version_fw: 01.24.00.00
version_mpi: 105
version_nvdata_default: 2d02h
version_nvdata_persistent: 2d02h
version_product: LSISAS1068 B0
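
A listing in this "name: value" form can be produced with a short loop
over the host directory. This is only a sketch: the helper name dump_attrs
is made up, and the host number in the usage line below is an assumption
that will differ per machine.

```shell
# Print "name: value" for every readable regular file in a directory,
# e.g. the attributes under /sys/class/scsi_host/hostN.
dump_attrs() {
    for f in "$1"/*; do
        # Skip subdirectories and anything we cannot read.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s: %s\n' "${f##*/}" "$(cat "$f" 2>/dev/null)"
    done
}
```

For example, `dump_attrs /sys/class/scsi_host/host0`. Comparing the
version_fw line between two machines is a quick way to see whether the
firmware levels differ.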
James
* RE: mptsas problem
2008-04-13 20:49 ` Wakko Warner
@ 2008-04-14 18:16 ` Moore, Eric
2008-04-14 21:34 ` Wakko Warner
0 siblings, 1 reply; 21+ messages in thread
From: Moore, Eric @ 2008-04-14 18:16 UTC (permalink / raw)
To: Wakko Warner, James Bottomley; +Cc: Richard Scobie, linux-scsi
On Sunday, April 13, 2008 2:49 PM, Wakko Warner wrote:
> > depth of 64 it may be queueing internally as well. You could try
> > lowering the lsi queue to 31 and seeing if it makes a difference.
>
> IIRC, someone said the LSIs do queue in firmware.
>
It was probably me that told you that. Anyway, the lsiutil 100
output you sent me last week indicated your controller has NCQ
disabled. You need new firmware and controller chip rev in order to
support NCQ. You should contact LSI support to find out which versions of
firmware and chip rev are required. Meanwhile your controller will be
queuing commands for SATA, only sending one command across the wire at a
time. For SAS, you will not have this issue.
Eric
* Re: mptsas problem
2008-04-14 18:16 ` Moore, Eric
@ 2008-04-14 21:34 ` Wakko Warner
2008-04-15 17:26 ` Moore, Eric
0 siblings, 1 reply; 21+ messages in thread
From: Wakko Warner @ 2008-04-14 21:34 UTC (permalink / raw)
To: Moore, Eric; +Cc: James Bottomley, Richard Scobie, linux-scsi
Moore, Eric wrote:
> On Sunday, April 13, 2008 2:49 PM, Wakko Warner wrote:
> > > depth of 64 it may be queueing internally as well. You could try
> > > lowering the lsi queue to 31 and seeing if it makes a difference.
> >
> > IIRC, someone said the LSIs do queue in firmware.
>
> It was probably me that told you that. Anyway, the lsiutil 100
> output you sent me last week indicated your controller has NCQ
> disabled. You need new firmware and controller chip rev in order to
> support NCQ. You should contact LSI support to find out which versions of
> firmware and chip rev are required. Meanwhile your controller will be
> queuing commands for SATA, only sending one command across the wire at a
> time. For SAS, you will not have this issue.
Does "chip rev" mean software/firmware or an actual chip? Is there any way
the driver can see this and set the queue_depth to 1?
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???
* RE: mptsas problem
2008-04-14 21:34 ` Wakko Warner
@ 2008-04-15 17:26 ` Moore, Eric
2008-04-15 22:49 ` Wakko Warner
0 siblings, 1 reply; 21+ messages in thread
From: Moore, Eric @ 2008-04-15 17:26 UTC (permalink / raw)
To: Wakko Warner; +Cc: James Bottomley, Richard Scobie, linux-scsi
> Does "chip rev" mean software/firmware or an actual chip? Is there any way
> the driver can see this and set the queue_depth to 1?
>
By chip, I'm referring to hardware. The revision of your
chip (1064/1068/1078) is located at offset 8 in PCI config space. This
is also known as PCI_CLASS_REVISION. You can get this by using the tool
called lspci.
Eric
* Re: mptsas problem
2008-04-15 17:26 ` Moore, Eric
@ 2008-04-15 22:49 ` Wakko Warner
0 siblings, 0 replies; 21+ messages in thread
From: Wakko Warner @ 2008-04-15 22:49 UTC (permalink / raw)
To: Moore, Eric; +Cc: James Bottomley, Richard Scobie, linux-scsi
Moore, Eric wrote:
>
> > Does "chip rev" mean software/firmware or an actual chip? Is there any way
> > the driver can see this and set the queue_depth to 1?
>
> By chip, I'm referring to hardware. The revision of your
> chip (1064/1068/1078) is located at offset 8 in PCI config space. This
> is also known as PCI_CLASS_REVISION. You can get this by using the tool
> called lspci.
04:02.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
00: 00 10 54 00 56 01 30 02 01 00 00 01 08 48 00 00
10: 01 50 00 00 04 00 41 d8 00 00 00 00 04 00 40 d8
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 10 30 30
30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 01 40 0a
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 01 98 02 06 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 07 b0 60 10 10 04 43 13
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 05 68 80 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 00 00 00 01 20 00 00 01 30 00 00 00 00 00 00
I'm not that familiar with the pci config space. Are you saying that I need
a new chip for the new firmware? If so, considering everything is soldered
on the board, I'd have a hard time replacing it! =)
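
For reference, the byte at offset 8 can be picked out of a dump like the
one above with a one-line filter. This is a sketch only: the helper name
pci_revision is made up, and it assumes the hex-dump layout that lspci
prints with its -x option (an "00:" row label followed by 16 bytes).

```shell
# Read an lspci hex dump on stdin and print the byte at config offset
# 0x08, which is the revision ID. On the "00:" row, field 1 is the row
# label and fields 2..17 are bytes 0x00..0x0f, so byte 0x08 is field 10.
pci_revision() {
    awk '$1 == "00:" { print $10; exit }'
}
```

Something like `lspci -s 04:02.0 -x | pci_revision` should print 01 for
this card, the same value lspci shows as "(rev 01)" on the device line.
The three bytes after it (00 00 01) are the class code, read low byte
first.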
--
Lab tests show that use of micro$oft causes cancer in lab animals
Got Gas???