linux-ide.vger.kernel.org archive mirror
* Sporadic device disconnections
@ 2006-04-01 14:50 Dan Aloni
  2006-04-02 11:33 ` Jeff Garzik
From: Dan Aloni @ 2006-04-01 14:50 UTC
  To: Linux Kernel List
  Cc: Mark Lord, IDE/ATA development list, Jens Axboe, Jeff Garzik,
	sander, dror

Hello,

We are seeing some odd behavior with the Marvell 6081
controller, which I thought you might have also experienced
or might find interesting.

Basically, the problem involves sporadic device disconnection
during driver load time and sometimes during continuous use. It
was seen both with sata_mv and with the driver provided by Marvell
(version 3.6.1), so I have reason to believe it's a hardware
problem (for which we can hopefully find good software-based
workarounds).

The system seen below has 2 Marvell 6081 controllers harboring
a total of 14 hard-drives.

Roughly once in every few hundred insmods of sata_mv, the
controller "misses" one of the drives, as seen below:

Mar 31 10:41:10 14.10.240.6 kernel: ata10: dev 0 ATA-7, max UDMA/133, 976773168 sectors: LBA48  
Mar 31 10:41:10 14.10.240.6 kernel: ata10: dev 0 configured for UDMA/133 
Mar 31 10:41:10 14.10.240.6 kernel: scsi9 : sata_mv 
Mar 31 10:41:11 14.10.240.6 kernel: ata11: dev 0 ATA-7, max UDMA/133, 976773168 sectors: LBA48  
Mar 31 10:41:11 14.10.240.6 kernel: ata11: dev 0 configured for UDMA/133 
Mar 31 10:41:11 14.10.240.6 kernel: scsi10 : sata_mv 
Mar 31 10:41:12 14.10.240.6 kernel: ata12: no device found (phy stat 00000000) 
Mar 31 10:41:12 14.10.240.6 kernel: ata12: no device found (phy stat 00000101) 
Mar 31 10:41:12 14.10.240.6 kernel: ATA: abnormal status 0x7F on port 0xFFFFC200106A811C 
Mar 31 10:41:12 14.10.240.6 kernel: ata12: dev 0 failed to IDENTIFY (I/O error) 
Mar 31 10:41:12 14.10.240.6 kernel: ATA: abnormal status 0x7F on port 0xFFFFC200106A811C 
Mar 31 10:41:12 14.10.240.6 kernel: ata12: dev 0 failed to IDENTIFY (I/O error) 
Mar 31 10:41:12 14.10.240.6 kernel: ATA: abnormal status 0x7F on port 0xFFFFC200106A811C 
Mar 31 10:41:12 14.10.240.6 kernel: ata12: dev 0 failed to IDENTIFY (I/O error) 
Mar 31 10:41:12 14.10.240.6 kernel: scsi11 : sata_mv 
Mar 31 10:41:13 14.10.240.6 kernel: ata13: dev 0 ATA-7, max UDMA/133, 976773168 sectors: LBA48  
Mar 31 10:41:13 14.10.240.6 kernel: ata13: dev 0 configured for UDMA/133 
Mar 31 10:41:13 14.10.240.6 kernel: scsi12 : sata_mv 
Mar 31 10:41:14 14.10.240.6 kernel: ata14: dev 0 ATA-7, max UDMA/133, 976773168 sectors: LBA48  
Mar 31 10:41:14 14.10.240.6 kernel: ata14: dev 0 configured for UDMA/133 
Mar 31 10:41:14 14.10.240.6 kernel: scsi13 : sata_mv 

Rest assured that the drive is there; we haven't pulled it out.

On Linux 2.4.27 with the Marvell driver (3.6.1), using the
add-single-device command on the (now deprecated) /proc/scsi/scsi
interface, we managed to bring the drive back to life...
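
For readers unfamiliar with that interface, here is a minimal sketch of
issuing the command from C. The /proc/scsi/scsi file accepts a line of
the form "scsi add-single-device <host> <channel> <id> <lun>". The helper
names are hypothetical and not part of either driver, and the numbers
passed in are placeholders that must match the missing drive on the
system at hand.

```c
#include <stdio.h>

/*
 * Format the legacy "add-single-device" command line.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the buffer is too small.
 */
int scsi_format_add_single_device(char *buf, size_t len,
                                  int host, int channel, int id, int lun)
{
    int n = snprintf(buf, len, "scsi add-single-device %d %d %d %d\n",
                     host, channel, id, lun);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Write the command to the given proc file (normally /proc/scsi/scsi,
 * which requires root). Returns 0 on success, -1 on any error.
 */
int scsi_add_single_device(const char *proc_path,
                           int host, int channel, int id, int lun)
{
    char cmd[64];
    FILE *f;
    int rc;

    if (scsi_format_add_single_device(cmd, sizeof(cmd),
                                      host, channel, id, lun) < 0)
        return -1;

    f = fopen(proc_path, "w");
    if (!f)
        return -1;

    rc = (fputs(cmd, f) == EOF) ? -1 : 0;
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

A call such as scsi_add_single_device("/proc/scsi/scsi", 11, 0, 0, 0)
would ask the SCSI layer to probe host 11, channel 0, id 0, lun 0; the
host number here is only an example.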

The drives are 500GB Maxtor models. This problem has also occurred
with drives from a different manufacturer.

Mar 31 10:41:09 14.10.240.6 kernel:   Vendor: ATA       Model: Maxtor 7H500F0    Rev: HA43 
Mar 31 10:41:09 14.10.240.6 kernel:   Type:   Direct-Access                      ANSI SCSI revision: 05 

-- 
Dan Aloni, Linux specialist
XIV LTD, http://www.xivstorage.com
da-x@monatomic.org, da-x@colinux.org, da-x@gmx.net, dan@xiv.co.il


* Re: Sporadic device disconnections
  2006-04-01 14:50 Sporadic device disconnections Dan Aloni
@ 2006-04-02 11:33 ` Jeff Garzik
From: Jeff Garzik @ 2006-04-02 11:33 UTC
  To: Dan Aloni
  Cc: Linux Kernel List, Mark Lord, IDE/ATA development list,
	Jens Axboe, sander, dror

Dan Aloni wrote:
> Mar 31 10:41:12 14.10.240.6 kernel: ata12: no device found (phy stat 00000101) 

Well, with 0x101, the hardware is telling us "device presence detected,
but phy communications not yet established".
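
As a rough illustration of how that value breaks down, the sketch below
splits an SStatus word into its DET, SPD and IPM fields following the
layout in the Serial ATA specification. The macro and function names are
made up for this example and are not taken from libata or sata_mv.

```c
#include <stdint.h>

/* SStatus field layout per the Serial ATA specification. */
#define SSTATUS_DET(s) ((s) & 0xFu)          /* bits 3:0,  device detection */
#define SSTATUS_SPD(s) (((s) >> 4) & 0xFu)   /* bits 7:4,  negotiated speed */
#define SSTATUS_IPM(s) (((s) >> 8) & 0xFu)   /* bits 11:8, power management */

/* Human-readable meaning of the DET field. */
const char *sstatus_det_str(uint32_t sstatus)
{
    switch (SSTATUS_DET(sstatus)) {
    case 0x0:
        return "no device detected, phy communication not established";
    case 0x1:
        return "device presence detected, phy communication not established";
    case 0x3:
        return "device presence detected, phy communication established";
    case 0x4:
        return "phy offline";
    default:
        return "reserved";
    }
}

/* Non-zero only when a device is present and the link is up. */
int sstatus_link_up(uint32_t sstatus)
{
    return SSTATUS_DET(sstatus) == 0x3;
}
```

For 0x101 this gives DET = 1, SPD = 0 (no speed negotiated) and IPM = 1
(interface active), which matches the reading above. The 0x00000000
value printed first in the log decodes to DET = 0, i.e. nothing detected
at that moment.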

So, my first instinct would be to look at the code block in
__mv_phy_reset() just above the comment /* work around errata */, and
increase the length of the timeout from 200ms to 1-5 seconds.

My second instinct would be to increase the number of retries from 5.
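
To make those two knobs concrete, here is a rough user-space sketch of a
reset-and-poll loop with a configurable timeout and retry count. It is
NOT the sata_mv code: every name in it (phy_reset_params, phy_ops,
fake_phy and so on) is hypothetical, the hardware is replaced by a
simulated port, and the retry policy (retry whenever the link is not up)
is an assumption chosen for simplicity.

```c
#include <stdint.h>

/* Hypothetical tunables corresponding to the two suggestions above. */
struct phy_reset_params {
    unsigned int poll_timeout_ms;  /* wait per attempt for the link */
    unsigned int max_retries;      /* extra resets after the first one */
};

/* Hypothetical hardware hooks, so the loop can be run off-target. */
struct phy_ops {
    void     (*issue_reset)(void *ctx);
    uint32_t (*read_sstatus)(void *ctx);
    void     (*sleep_ms)(void *ctx, unsigned int ms);
};

/* Reset the phy, poll SStatus until DET == 3 or the timeout expires,
 * and retry the reset up to max_retries times. Returns the last SStatus. */
uint32_t phy_reset_with_retries(const struct phy_ops *ops, void *ctx,
                                const struct phy_reset_params *p)
{
    uint32_t sstatus = 0;
    unsigned int attempt, waited;

    for (attempt = 0; attempt <= p->max_retries; attempt++) {
        ops->issue_reset(ctx);
        for (waited = 0; waited < p->poll_timeout_ms; waited++) {
            sstatus = ops->read_sstatus(ctx);
            if ((sstatus & 0xFu) == 0x3u)
                return sstatus;          /* link established */
            ops->sleep_ms(ctx, 1);
        }
    }
    return sstatus;                      /* link never came up */
}

/* Simulated port: reports 0x101 until ready_after_ms have been slept
 * in total, then 0x113. Resets do not restart the clock here. */
struct fake_phy {
    unsigned int elapsed_ms;
    unsigned int ready_after_ms;
    unsigned int resets;
};

static void fake_reset(void *ctx)
{
    ((struct fake_phy *)ctx)->resets++;
}

static uint32_t fake_read(void *ctx)
{
    struct fake_phy *f = ctx;

    return f->elapsed_ms >= f->ready_after_ms ? 0x113u : 0x101u;
}

static void fake_sleep(void *ctx, unsigned int ms)
{
    ((struct fake_phy *)ctx)->elapsed_ms += ms;
}

const struct phy_ops fake_phy_ops = {
    .issue_reset  = fake_reset,
    .read_sstatus = fake_read,
    .sleep_ms     = fake_sleep,
};
```

With a simulated drive that needs 1500ms to come up, a 200ms timeout
with 5 retries gives up with 0x101, while a 2000ms timeout succeeds on
the first reset. Note that with this simple policy an empty port costs
the full timeout on every attempt, so a longer timeout and more retries
both slow down probing of unpopulated ports.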

	Jeff


