public inbox for linux-scsi@vger.kernel.org
* I/O errors on 6TB device
@ 2006-12-16 12:12 Bernd Schubert
  2006-12-16 15:28 ` James Bottomley
  0 siblings, 1 reply; 4+ messages in thread
From: Bernd Schubert @ 2006-12-16 12:12 UTC (permalink / raw)
  To: linux-scsi

Hi,

Recently I asked what problems to expect with >2TB devices, and Douglas's
answer made me hope we wouldn't run into any this time. Unfortunately, we
get I/O errors when accessing the device.

The unit is called transtec PV610S, which is actually an Infortrend EonStor
A16U-G2421-1 device.

At present it is running in test mode on our failover system, attached to an
MPT SCSI controller. Once it has been thoroughly tested, it is supposed to
become the main storage device on our primary server, which has an AIC79XX
controller.

I have no knowledge of the SCSI protocol and its error numbers, but it looks
as if the device is claiming to have more sectors than it actually has,
doesn't it?

[17179724.816000] Fusion MPT base driver 3.03.07
[17179724.816000] Copyright (c) 1999-2005 LSI Logic Corporation
[17179724.832000] Fusion MPT SPI Host driver 3.03.07
[17179724.840000] ACPI: PCI Interrupt 0000:02:0a.0[A] -> GSI 24 (level, low) -> IRQ 19
[17179724.840000] mptbase: Initiating ioc0 bringup
[17179725.312000] ioc0: 53C1030: Capabilities={Initiator}
[17179725.800000] scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=19
[17179726.064000]   Vendor: Transtec  Model: PV610S16R1B       Rev: 347G
[17179726.064000]   Type:   Direct-Access                      ANSI SCSI revision: 05
[17179726.084000] sda : very big device. try to use READ CAPACITY(16).
[17179726.084000] SCSI device sda: 12691101696 512-byte hdwr sectors (6497844 MB)
[17179726.088000] sda: Write Protect is off
[17179726.088000] sda: Mode Sense: 9b 00 00 08
[17179726.088000] SCSI device sda: drive cache: write back
[17179726.100000] sda : very big device. try to use READ CAPACITY(16).
[17179726.100000] SCSI device sda: 12691101696 512-byte hdwr sectors (6497844 MB)
[17179726.100000] sda: Write Protect is off
[17179726.100000] sda: Mode Sense: 9b 00 00 08
[17179726.100000] SCSI device sda: drive cache: write back
[17179726.100000]  sda:<6>sd 0:0:1:0: SCSI error: return code = 0xb0000
[17179726.132000] end_request: I/O error, dev sda, sector 12691101688
[17179726.132000] Buffer I/O error on device sda, logical block 1586387711
[17179726.136000] sd 0:0:1:0: SCSI error: return code = 0xb0000
[17179726.136000] end_request: I/O error, dev sda, sector 12691101688
[17179726.136000] Buffer I/O error on device sda, logical block 1586387711
[17179726.136000] Alternate GPT is invalid, using primary GPT.
[17179726.136000]
[17179726.136000] sd 0:0:1:0: Attached scsi disk sda
[17179726.160000] sd 0:0:1:0: Attached scsi generic sg0 type 0
[17179729.412000] ACPI: PCI Interrupt 0000:02:0a.1[B] -> GSI 25 (level, low) -> IRQ 20
[17179729.660000] mptbase: Initiating ioc1 bringup
[17179730.132000] ioc1: 53C1030: Capabilities={Initiator}
[17179730.620000] scsi1 : ioc1: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=20
/10

When I now use parted to set up a proper partition table, parted complains
about an I/O error and dmesg shows the messages below.

[17179939.464000] sd 0:0:1:0: SCSI error: return code = 0xb0000
[17179939.464000] end_request: I/O error, dev sda, sector 12691101688
[17180107.112000] sd 0:0:1:0: SCSI error: return code = 0xb0000
[17180107.116000] end_request: I/O error, dev sda, sector 12691101688
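
The numbers in these messages can be cross-checked with a little arithmetic.
The following is a minimal sketch in Python that uses only values from the
logs above. The 4096-byte block size is an assumption inferred from the ratio
between the failing sector and the logical block number, and the helper names
are made up for illustration. It suggests that the failing request targets
the last 8 sectors (the final 4 KiB) of the reported capacity. The backup GPT
header occupies the last LBA of a disk, so this fits the "Alternate GPT is
invalid" message. It does not show by itself whether the array over-reports
its size or whether the error arises elsewhere in the I/O path.

```python
# All constants are copied from the dmesg output quoted in this message.
SECTOR_SIZE = 512             # "512-byte hdwr sectors"
TOTAL_SECTORS = 12691101696   # capacity obtained via READ CAPACITY(16)
FAILED_SECTOR = 12691101688   # "end_request: I/O error, dev sda, sector ..."
FAILED_BLOCK = 1586387711     # "Buffer I/O error ... logical block ..."
BLOCK_SIZE = 4096             # ASSUMPTION: buffer-cache block size, not in the log


def capacity_mb(sectors, sector_size=SECTOR_SIZE):
    """Capacity in decimal megabytes (10**6 bytes), rounded down."""
    return sectors * sector_size // 10**6


def sectors_from_end(sector, total=TOTAL_SECTORS):
    """How many sectors lie between `sector` and the end of the device."""
    return total - sector


def block_to_sector(block, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """First 512-byte sector covered by a given logical block."""
    return block * (block_size // sector_size)


def needs_read_capacity_16(sectors):
    """READ CAPACITY(10) has a 32-bit LBA field, so larger devices need (16)."""
    return sectors > 2**32


if __name__ == "__main__":
    print("capacity (MB):           ", capacity_mb(TOTAL_SECTORS))
    print("sectors before the end:  ", sectors_from_end(FAILED_SECTOR))
    print("block maps to sector:    ", block_to_sector(FAILED_BLOCK))
    print("needs READ CAPACITY(16): ", needs_read_capacity_16(TOTAL_SECTORS))
```

The computed capacity agrees with the 6497844 MB printed by the kernel, and
the failing logical block maps back to exactly the failing sector, so the two
error lines describe the same request.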


Any help is appreciated.

Thanks in advance,
Bernd



* Re: I/O errors on 6TB device
  2006-12-16 12:12 I/O errors on 6TB device Bernd Schubert
@ 2006-12-16 15:28 ` James Bottomley
  2006-12-16 16:28   ` Bernd Schubert
       [not found]   ` <664A4EBB07F29743873A87CF62C26D702A9967@NAMAIL4.ad.lsil.com>
  0 siblings, 2 replies; 4+ messages in thread
From: James Bottomley @ 2006-12-16 15:28 UTC (permalink / raw)
  To: Bernd Schubert; +Cc: linux-scsi, Moore, Eric

On Sat, 2006-12-16 at 13:12 +0100, Bernd Schubert wrote:
> Hi,
> 
> Recently I asked what problems to expect with >2TB devices, and Douglas's
> answer made me hope we wouldn't run into any this time. Unfortunately, we
> get I/O errors when accessing the device.
> 
> The unit is called transtec PV610S, which is actually an Infortrend EonStor
> A16U-G2421-1 device.
> 
> At present it is running in test mode on our failover system, attached to an
> MPT SCSI controller. Once it has been thoroughly tested, it is supposed to
> become the main storage device on our primary server, which has an AIC79XX
> controller.
> 
> I have no knowledge of the SCSI protocol and its error numbers, but it looks
> as if the device is claiming to have more sectors than it actually has,
> doesn't it?
> 
> [17179724.816000] Fusion MPT base driver 3.03.07
> [17179724.816000] Copyright (c) 1999-2005 LSI Logic Corporation
> [17179724.832000] Fusion MPT SPI Host driver 3.03.07
> [17179724.840000] ACPI: PCI Interrupt 0000:02:0a.0[A] -> GSI 24 (level, low) -> IRQ 19
> [17179724.840000] mptbase: Initiating ioc0 bringup
> [17179725.312000] ioc0: 53C1030: Capabilities={Initiator}
> [17179725.800000] scsi0 : ioc0: LSI53C1030, FwRev=01030600h, Ports=1, MaxQ=255, IRQ=19
> [17179726.064000]   Vendor: Transtec  Model: PV610S16R1B       Rev: 347G
> [17179726.064000]   Type:   Direct-Access                      ANSI SCSI revision: 05
> [17179726.084000] sda : very big device. try to use READ CAPACITY(16).
> [17179726.084000] SCSI device sda: 12691101696 512-byte hdwr sectors (6497844 MB)
> [17179726.088000] sda: Write Protect is off
> [17179726.088000] sda: Mode Sense: 9b 00 00 08
> [17179726.088000] SCSI device sda: drive cache: write back
> [17179726.100000] sda : very big device. try to use READ CAPACITY(16).
> [17179726.100000] SCSI device sda: 12691101696 512-byte hdwr sectors (6497844 MB)
> [17179726.100000] sda: Write Protect is off
> [17179726.100000] sda: Mode Sense: 9b 00 00 08
> [17179726.100000] SCSI device sda: drive cache: write back
> [17179726.100000]  sda:<6>sd 0:0:1:0: SCSI error: return code = 0xb0000
> [17179726.132000] end_request: I/O error, dev sda, sector 12691101688

This is definitely a fusion driver error: it's DID_SOFT_ERROR, which
that driver returns for a variety of firmware related conditions or
transfer underruns.
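
The decoding of 0xb0000 can be sketched as follows. This is a minimal
illustration in Python, assuming the usual layout of the Linux SCSI result
word (status byte in bits 0-7, message byte in bits 8-15, host byte in bits
16-23, driver byte in bits 24-31). The table lists only a subset of the host
byte codes, and the function name is hypothetical.

```python
# Subset of Linux SCSI host byte codes (DID_*); not an exhaustive list.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0A: "DID_PASSTHROUGH",
    0x0B: "DID_SOFT_ERROR",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte-wide fields."""
    host = (result >> 16) & 0xFF
    return {
        "status_byte": result & 0xFF,         # SCSI status from the target
        "msg_byte": (result >> 8) & 0xFF,     # message byte
        "host_byte": host,                    # set by the low-level driver
        "driver_byte": (result >> 24) & 0xFF, # set by the mid-layer
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


if __name__ == "__main__":
    # Value taken from "SCSI error: return code = 0xb0000" in the log.
    for field, value in decode_scsi_result(0xB0000).items():
        print(f"{field:12s} {value}")
```

Only the host byte is non-zero here, which is why the error points at the
host adapter driver rather than at a status returned by the target.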

I've cc'd the fusion people to see if they can help you diagnose it
further.

James





* Re: I/O errors on 6TB device
  2006-12-16 15:28 ` James Bottomley
@ 2006-12-16 16:28   ` Bernd Schubert
       [not found]   ` <664A4EBB07F29743873A87CF62C26D702A9967@NAMAIL4.ad.lsil.com>
  1 sibling, 0 replies; 4+ messages in thread
From: Bernd Schubert @ 2006-12-16 16:28 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Moore, Eric

> > [17179726.100000]  sda:<6>sd 0:0:1:0: SCSI error: return code = 0xb0000
> > [17179726.132000] end_request: I/O error, dev sda, sector 12691101688
>
> This is definitely a fusion driver error: it's DID_SOFT_ERROR, which
> that driver returns for a variety of firmware related conditions or
> transfer underruns.
>
> I've cc'd the fusion people to see if they can help you diagnose it
> further.
>
> James

James, many thanks for your help. I will try to do a firmware update on
Monday; maybe that helps. Of course, we would also appreciate any further
help from LSI/MPT.

Thanks again,
Bernd

PS: Since the controller hardware might be important now, it's an onboard
controller on a Tyan S2880 mainboard.

0000:02:0a.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1000
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 19
        I/O ports at a400 [size=256]
        Memory at fc9d0000 (64-bit, non-prefetchable) [size=64K]
        Memory at fc9c0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fc700000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [68]
0000:02:0a.1 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)
        Subsystem: LSI Logic / Symbios Logic: Unknown device 1000
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 20
        I/O ports at a800 [size=256]
        Memory at fc9f0000 (64-bit, non-prefetchable) [size=64K]
        Memory at fc9e0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fc800000 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [68]

-- 
Bernd Schubert
PCI / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: I/O errors on 6TB device
       [not found]   ` <664A4EBB07F29743873A87CF62C26D702A9967@NAMAIL4.ad.lsil.com>
@ 2006-12-16 23:00     ` Bernd Schubert
  0 siblings, 0 replies; 4+ messages in thread
From: Bernd Schubert @ 2006-12-16 23:00 UTC (permalink / raw)
  To: Moore, Eric; +Cc: James Bottomley, linux-scsi

On Saturday 16 December 2006 22:03, Moore, Eric wrote:
> On Sat 12/16/2006 8:28 AM,  James Bottomley wrote:
> >> [17179724.816000] Fusion MPT base driver 3.03.07
> >> [17179724.816000] Copyright (c) 1999-2005 LSI Logic Corporation
> >> [17179724.832000] Fusion MPT SPI Host driver 3.03.07
>
> 3.03.07 driver is about a year old.  Which kernel and distro
> are you on?

Eric, thanks for your help. It's 2.6.16.36 on Debian Sarge; using the most
recent kernel version is not always a good idea on server systems (too many
regressions).
Attached is the dmesg output from 2.6.19.1.

>
> >> [17179726.100000]  sda:<6>sd 0:0:1:0: SCSI error: return code = 0xb0000
> >> [17179726.132000] end_request: I/O error, dev sda, sector 12691101688
> >
> > This is definitely a fusion driver error: it's DID_SOFT_ERROR, which
> > that driver returns for a variety of firmware related conditions or
> > transfer underruns.
>
> Correct, DID_SOFT_ERROR is returned in many cases.
>
> Can you recompile the driver with some debug messages enabled
> in the driver Makefile so I can observe the return codes from firmware?
> You will need to uncomment MPT_DEBUG_REPLY in the Makefile.

Sure, it's enabled now.

>
>
> FwRev=01030600h is version 1.03.06, which is quite old.  Try
> obtaining a newer one from the lsi download site.  I believe
> the newest fw internally is 1.03.34.
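
The correspondence between FwRev=01030600h and version 1.03.06 can be
sketched as follows. This is a minimal illustration in Python that assumes
(nothing in this thread confirms it) that the revision word packs one version
field per byte, most significant byte first, and that each field is printed
as a two-digit decimal number. The function name is hypothetical.

```python
def fw_version(fwrev):
    """Format a 32-bit FwRev word as 'major.minor.unit'.

    ASSUMPTION: one field per byte, most significant byte first; the
    lowest byte is treated as a development/build field and is omitted.
    """
    major = (fwrev >> 24) & 0xFF
    minor = (fwrev >> 16) & 0xFF
    unit = (fwrev >> 8) & 0xFF
    return f"{major}.{minor:02d}.{unit:02d}"


if __name__ == "__main__":
    # Value taken from "FwRev=01030600h" in the dmesg output.
    print(fw_version(0x01030600))
```

Under the same assumption, a 1.03.34 firmware would report 01032200h, since
decimal 34 is 22 in hexadecimal.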

Until Monday I don't have physical access to the machine, so an update from
DOS is not possible now. However, I see there is also a Linux mptflash; I
just can't get it to compile.

gcc -g -O -Wall -I..   -c -o mptflash.o mptflash.c
In file included from ../mptbase.h:58,
                 from mptflash.c:13:
../linux_compat.h:9:30: error: scsi/scsi_device.h: No such file or directory
../linux_compat.h:10:28: error: scsi/scsi_cmnd.h: No such file or directory

The kernel_headers in Debian don't have these files, and including the header
files directly from the kernel tree doesn't work.
Do you have newer sources (the most recent version I could find is from
mptlinux-3.02.60) or a working binary?


Thanks,
Bernd

-- 
Bernd Schubert
PCI / Theoretische Chemie
Universität Heidelberg
INF 229
69120 Heidelberg


[-- Attachment #2: dmesg.log.gz --]
[-- Type: application/x-gzip, Size: 9438 bytes --]
