From: Tim Small <tim@seoss.co.uk>
To: Gabor Gombas <gombasg@sztaki.hu>
Cc: smartmontools-support@lists.sourceforge.net,
linux-scsi@vger.kernel.org, Linux-PowerEdge@dell.com
Subject: Re: [smartmontools-support] SMART causes disks to go offline on an LSI SAS1068 controller - Dell SAS 5/iR
Date: Tue, 27 Oct 2009 17:30:40 +0000
Message-ID: <4AE72E40.2000903@seoss.co.uk>
In-Reply-To: <20090914142939.GE14072@boogie.lpds.sztaki.hu>
Hello,
Just to say that I'm seeing this bug as well, with smartmontools 5.38
and smartctl 5.39 2009-10-10 r2955 on Debian lenny. The machine is a
Dell PowerEdge 860. I'm guessing that this is either a firmware or
driver issue.
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
        Subsystem: Dell SAS 5/iR Adapter RAID Controller
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 1275
        I/O ports at ec00 [disabled] [size=256]
        Memory at fe9fc000 (64-bit, non-prefetchable) [size=16K]
        Memory at fe9e0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fea00000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
        Capabilities: [98] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
        Capabilities: [68] PCI-X non-bridge device
        Capabilities: [b0] MSI-X: Enable- Mask- TabSize=1
        Kernel driver in use: mptsas
        Kernel modules: mptsas
# modinfo mptsas
filename:       /lib/modules/2.6.26-2-openvz-amd64/kernel/drivers/message/fusion/mptsas.ko
version:        3.04.06
license:        GPL
description:    Fusion MPT SAS Host driver
author:         LSI Corporation
The errors look like this:
[  428.524463] mptscsih: ioc0: attempting task abort! (sc=ffff81021b950940)
[  428.524471] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  433.199851] mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
[  433.199851] mptsas: ioc0: removing sata device, channel 0, id 0, phy 0
[  433.199851] port-0:0: mptsas: ioc0: delete port (0)
[  433.199851] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  433.348856] mptscsih: ioc0: task abort: SUCCESS (sc=ffff81021b950940)
[  433.348868] mptscsih: ioc0: attempting task abort! (sc=ffff81021b950440)
[  433.348873] sd 0:0:0:0: [sda] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[  433.348885] mptscsih: ioc0: task abort: SUCCESS (sc=ffff81021b950440)
[  433.348893] mptscsih: ioc0: attempting target reset! (sc=ffff81021b950940)
[  433.348896] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  433.605026] mptscsih: ioc0: target reset: SUCCESS (sc=ffff81021b950940)
[  433.605034] mptscsih: ioc0: attempting bus reset! (sc=ffff81021b950940)
[  433.605037] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  434.157594] mptscsih: ioc0: bus reset: SUCCESS (sc=ffff81021b950940)
[  444.546154] mptscsih: ioc0: attempting host reset! (sc=ffff81021b950940)
[  444.546162] mptbase: ioc0: Initiating recovery
[  461.540429] mptscsih: ioc0: host reset: SUCCESS (sc=ffff81021b950940)
[  461.540437] sd 0:0:0:0: Device offlined - not ready after error recovery
[  461.540440] sd 0:0:0:0: Device offlined - not ready after error recovery
[  461.540475] end_request: I/O error, dev sda, sector 15631039
[  461.540480] md: super_written gets error=-5, uptodate=0
[  461.540485] raid1: Disk failure on sda1, disabling device.
and the drives are:
Model Family: Seagate Barracuda ES
Device Model: ST3250620NS
Serial Number: 9QE3L9E0
Firmware Version: 3BKS
and are in JBOD mode (+ sw RAID with md).
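For reference, the 16-byte CDB that the kernel prints with each abort can be
decoded by hand. The sketch below is not part of smartmontools or the driver;
it assumes the standard SAT ATA PASS-THROUGH (16) field layout, and the
function and dictionary names are made up for illustration. Read that way, the
command that gets aborted looks like a SMART READ LOG (feature 0xd5) of log
address 0x09, sent as a one-sector PIO Data-In transfer.

```python
# Illustrative decoder for a SAT ATA PASS-THROUGH (16) CDB.
# Field positions follow the SAT layout; all names here are hypothetical.

PROTOCOLS = {3: "Non-data", 4: "PIO Data-In", 5: "PIO Data-Out", 6: "DMA"}
SMART_FEATURES = {
    0xD0: "SMART READ DATA",
    0xD5: "SMART READ LOG",
    0xD8: "SMART ENABLE OPERATIONS",
    0xDA: "SMART RETURN STATUS",
}


def decode_ata_pt16(cdb_hex):
    """Split a hex-dumped ATA PASS-THROUGH (16) CDB into its fields."""
    b = bytes.fromhex(cdb_hex)
    if len(b) != 16 or b[0] != 0x85:
        raise ValueError("not a 16-byte ATA PASS-THROUGH (16) CDB")
    return {
        "protocol": (b[1] >> 1) & 0x0F,   # byte 1, bits 4..1
        "extend": b[1] & 0x01,            # 1 = 48-bit command
        "ck_cond": (b[2] >> 5) & 0x01,
        "t_dir": (b[2] >> 3) & 0x01,      # 1 = transfer from device
        "byte_block": (b[2] >> 2) & 0x01, # 1 = length counted in blocks
        "t_length": b[2] & 0x03,          # 2 = length is in sector count
        "features": b[4],
        "sector_count": b[6],
        "lba_low": b[8],                  # log address for SMART READ LOG
        "lba_mid": b[10],                 # 0x4f and 0xc2 form the SMART
        "lba_high": b[12],                # signature in LBA mid/high
        "device": b[13],
        "command": b[14],                 # 0xb0 = SMART
    }


def describe(cdb_hex):
    """Return a one-line, human-readable summary of the CDB."""
    f = decode_ata_pt16(cdb_hex)
    proto = PROTOCOLS.get(f["protocol"], "protocol %d" % f["protocol"])
    feat = SMART_FEATURES.get(f["features"], "feature 0x%02x" % f["features"])
    return "command 0x%02x, %s, %s, log/LBA low 0x%02x, count %d" % (
        f["command"], feat, proto, f["lba_low"], f["sector_count"])


print(describe("85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00"))
```

Only the SMART subcommands listed in SMART_FEATURES are named; anything else
is printed as a raw feature value.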
lsiutil says:
Current active firmware version is 0.10.51
Firmware image's version is MPTFW-00.10.51.00-IE
LSI Logic
x86 BIOS image's version is MPTBIOS-6.12.05.00 (2007.09.29)
... which is the latest on Dell's download pages for this server.
The kernel is 2.6.26-2-openvz-amd64 from Debian Lenny (same behaviour
with non-openvz kernel). Running smartd makes the drives disappear
after a few hours, but doing this:
while true ; do smartctl -T permissive -d sat -a /dev/sda > /dev/null && echo -n . ; done
seems to knock them out in about a minute.
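To put a number on "about a minute", the same loop can be run with a counter
and a clock. This is only a sketch: run_until_failure and its parameters are
hypothetical names, and unlike the shell loop above it stops at the first
non-zero exit status. Note that smartctl's exit status is a bit mask, so a
non-zero value does not by itself prove that the disk has gone offline; the
kernel log still has to be checked.

```python
import subprocess
import time


def run_until_failure(cmd, max_seconds=300.0, max_runs=None):
    """Run cmd repeatedly until it exits non-zero or a limit is reached.

    Returns (successful_runs, elapsed_seconds, last_exit_status).
    """
    start = time.monotonic()
    runs = 0
    while time.monotonic() - start < max_seconds:
        if max_runs is not None and runs >= max_runs:
            break
        # Discard the output, as the shell loop does with > /dev/null.
        result = subprocess.run(cmd,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            return runs, time.monotonic() - start, result.returncode
        runs += 1
    return runs, time.monotonic() - start, 0
```

With the command from the loop above it would be called as
run_until_failure(["smartctl", "-T", "permissive", "-d", "sat", "-a", "/dev/sda"]),
which gives up after five minutes if nothing has failed.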
Subjectively, 5.38 seemed to upset the controller a lot quicker than
5.39 r2955 does. For good measure I'm currently stress-testing a PE1950
with a SAS 6/iR (SAS1068E) in the same way (although that machine uses a
RAID array set up through the BIOS).
smartctl 5.39-pre needs '-T permissive' on the PE860, but 5.38 doesn't
seem to require it.
Is it worth trying a newer mptsas driver?
Regards,
Tim.
--
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309