linux-kernel.vger.kernel.org archive mirror
* DMA timeouts on Promise 20267 IDE card
@ 2003-01-09 20:43 James Curbo
  2003-01-09 21:52 ` Ross Biro
  0 siblings, 1 reply; 6+ messages in thread
From: James Curbo @ 2003-01-09 20:43 UTC (permalink / raw)
  To: linux-kernel

[please cc: me as I am not subscribed to lkml]

I've recently started getting errors like this (this example is from
2.4.20-pre3-ac2):

Jan  9 14:20:48 carthage kernel: hda: dma_timer_expiry: dma status == 0x61
Jan  9 14:20:48 carthage kernel: hdc: dma_timer_expiry: dma status == 0x21
Jan  9 14:20:58 carthage kernel: hda: timeout waiting for DMA
Jan  9 14:20:58 carthage kernel: PDC202XX: Primary channel reset.
Jan  9 14:20:58 carthage kernel: PDC202XX: Secondary channel reset.
Jan  9 14:20:58 carthage kernel: hda: DMA disabled
Jan  9 14:20:58 carthage kernel: hda: timeout waiting for DMA
Jan  9 14:20:58 carthage kernel: blk: queue c03c2860, I/O limit 4095Mb (mask 0xffffffff)
Jan  9 14:20:58 carthage kernel: hdc: timeout waiting for DMA
Jan  9 14:20:58 carthage kernel: PDC202XX: Secondary channel reset.
Jan  9 14:20:58 carthage kernel: PDC202XX: Primary channel reset.
Jan  9 14:20:58 carthage kernel: hdc: DMA disabled
Jan  9 14:20:58 carthage kernel: hdc: timeout waiting for DMA
Jan  9 14:20:58 carthage kernel: blk: queue c03c2cac, I/O limit 4095Mb (mask 0xffffffff)

I have a Promise 20267 PCI IDE controller card on an Epox 8RDA motherboard.
The motherboard is brand new and I never got these kinds of errors with
my previous MSI K7T Turbo board. There are two drives on the card:

hda: WDC WD400BB-00AUA1, ATA DISK drive
hdc: WDC WD400BB-00DEA0, ATA DISK drive

which are both alone on separate controllers. I've tried both 2.4
and 2.5 kernels (2.4.20, 2.4.20-ac2, 2.4.20-pre3-ac2, 2.5.[53-55]) and
get the same errors.

Does anyone have any idea what is causing this? I can offer more
information (.config etc.) if necessary.

-- 
James Curbo <hannibal@adtrw.org> <phoenix@sandwich.net>
http://www.adtrw.org/blogs/hannibal/


* Re: DMA timeouts on Promise 20267 IDE card
  2003-01-09 20:43 DMA timeouts on Promise 20267 IDE card James Curbo
@ 2003-01-09 21:52 ` Ross Biro
  2003-01-10  5:12   ` James Curbo
  0 siblings, 1 reply; 6+ messages in thread
From: Ross Biro @ 2003-01-09 21:52 UTC (permalink / raw)
  To: James Curbo; +Cc: linux-kernel

James Curbo wrote:

>[please cc: me as I am not subscribed to lkml]
>
>I've recently started getting errors like this (this example is from
>2.4.20-pre3-ac2):
>
>Jan  9 14:20:48 carthage kernel: hda: dma_timer_expiry: dma status ==
>0x61
>Jan  9 14:20:48 carthage kernel: hdc: dma_timer_expiry: dma status ==
>0x21
>  
>
I believe the low bit set in the dma_status means that the DMA transfer
is still in progress.  Since the timer has expired, that means it's been
in progress for 10 seconds.  Odds are the drive has stopped responding.
Since it's a Western Digital drive, it probably needs to be power-cycled
to come back.

I don't think this is a problem with the controller card, but I could be 
wrong.

    Ross
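
To make the "low bit" reading above concrete, here is a minimal sketch
that decodes the two status bytes from the log. It assumes the register
follows the standard bus-master IDE (SFF-8038i) status layout; the names
BM_STATUS_BITS and decode_dma_status are made up for this illustration
and are not kernel identifiers.

```python
# Bit layout of the bus-master IDE status register, assuming the
# standard SFF-8038i convention (an assumption, not taken from the thread).
BM_STATUS_BITS = {
    0x01: "DMA active",
    0x02: "DMA error",
    0x04: "interrupt raised",
    0x20: "drive 0 DMA capable",
    0x40: "drive 1 DMA capable",
    0x80: "simplex only",
}


def decode_dma_status(status):
    """Return the names of all known bits set in a status byte."""
    return [name for bit, name in BM_STATUS_BITS.items() if status & bit]


# The two values reported by dma_timer_expiry in the original log.
for value in (0x61, 0x21):
    print(hex(value), decode_dma_status(value))
```

Under that assumption both values have the "DMA active" bit set and the
"DMA error" bit clear, which matches the reading that the transfer was
still pending when the timer fired.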





* Re: DMA timeouts on Promise 20267 IDE card
  2003-01-09 21:52 ` Ross Biro
@ 2003-01-10  5:12   ` James Curbo
  0 siblings, 0 replies; 6+ messages in thread
From: James Curbo @ 2003-01-10  5:12 UTC (permalink / raw)
  To: Ross Biro; +Cc: linux-kernel

On Jan 09, Ross Biro wrote:

> I believe the low bit set in the dma_status means that the DMA transfer 
> is still in progress.  Since the timer has expired, that means it's been 
> in progress for 10 seconds.  Odds are the drive has stopped responding. 
> Since it's a Western Digital drive, it probably needs to be power-cycled
> to come back.
> 
> I don't think this is a problem with the controller card, but I could be 
> wrong.
> 
>    Ross
> 

Well, I have had the first drive for about a year and a half (I think) 
and the second drive since August. I never had any problems out of them
with the same controller card on my previous motherboard (MSI K7T
Turbo). The problems didn't arise until the other day when I got my new
board.

The errors occur over and over; the drive will come back for a few
seconds and then the error will occur again. I usually reboot at this
point.

-- 
James Curbo <hannibal@adtrw.org> <phoenix@sandwich.net>
http://www.adtrw.org/blogs/hannibal/


* Re: DMA timeouts on Promise 20267 IDE card
       [not found] <233C89823A37714D95B1A891DE3BCE5202AB1B6D@xch-a.win.zambeel.com>
@ 2003-01-10  5:14 ` James Curbo
  0 siblings, 0 replies; 6+ messages in thread
From: James Curbo @ 2003-01-10  5:14 UTC (permalink / raw)
  To: linux-kernel

On Jan 09, Manish Lachwani wrote:
> Can you also get the SMART data from the drives using smartctl? Also, it
> looks like the errors are happening on both the drives. Which UDMA mode are
> you operating in?
> 
> Thanks
> Manish

UDMA 5 for both of them. Here is the smartctl data:

carthage:/home/james# smartctl -v /dev/hda
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                    Flag     Value Worst Threshold Raw Value
(  1)Raw Read Error Rate     0x000b   200   199   051       0
(  3)Spin Up Time            0x0007   102   095   021       3733
(  4)Start Stop Count        0x0032   100   100   040       419
(  5)Reallocated Sector Ct   0x0032   160   160   112       160
(  7)Seek Error Rate         0x000b   100   253   051       0
(  9)Power On Hours          0x0032   079   079   000       15858
( 10)Spin Retry Count        0x0013   100   099   051       2
( 11)Calibration Retry Count 0x0013   100   100   051       0
( 12)Power Cycle Count       0x0032   100   100   000       358
(196)Reallocated Event Count 0x0032   126   126   000       74
(197)Current Pending Sector  0x0012   200   200   000       1
(198)Offline Uncorrectable   0x0012   200   200   000       0
(199)UDMA CRC Error Count    0x000a   200   253   000       65884
(200)Unknown Attribute       0x0009   200   199   051       1

carthage:/home/james# smartctl -v /dev/hdc
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                    Flag     Value Worst Threshold Raw Value
(  1)Raw Read Error Rate     0x000b   200   200   051       0
(  3)Spin Up Time            0x0007   101   093   021       2300
(  4)Start Stop Count        0x0032   100   100   040       96
(  5)Reallocated Sector Ct   0x0033   200   200   140       0
(  7)Seek Error Rate         0x000b   200   200   051       0
(  9)Power On Hours          0x0032   096   096   000       3325
( 10)Spin Retry Count        0x0013   100   253   051       0
( 11)Calibration Retry Count 0x0013   100   253   051       0
( 12)Power Cycle Count       0x0032   100   100   000       93
(196)Reallocated Event Count 0x0032   200   200   000       0
(197)Current Pending Sector  0x0012   200   200   000       0
(198)Offline Uncorrectable   0x0012   200   200   000       0
(199)UDMA CRC Error Count    0x000a   200   253   000       0
(200)Unknown Attribute       0x0009   200   200   051       0



-- 
James Curbo <hannibal@adtrw.org> <phoenix@sandwich.net>
http://www.adtrw.org/blogs/hannibal/


* Re: DMA timeouts on Promise 20267 IDE card
       [not found] <233C89823A37714D95B1A891DE3BCE5202AB1B7F@xch-a.win.zambeel.com>
@ 2003-01-10  6:47 ` Manish Lachwani
  0 siblings, 0 replies; 6+ messages in thread
From: Manish Lachwani @ 2003-01-10  6:47 UTC (permalink / raw)
  To: linux-kernel


Check the CRC count on the first drive, hda. It's 65884! That's huge.
CRC errors like these result in UDMA downgrades. Also, check the
reallocated sector count. A high value here means possible timeouts.
With a high reallocated sector count, there could be multiple mappings
the drive has to look through to get to the proper sector. You should
change the drive hda and also the cable. Then try again.
 
 Thanks
 Manish
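
The check described above can be sketched as a small script. This is
only an illustration: the regular expression targets the old-style
attribute table quoted earlier in this thread, not current smartctl
output, and the names ROW, WATCHED_IDS, parse_attributes and suspicious
are hypothetical. Treating any non-zero raw value as worth a look is
also an assumption, not a diagnosis.

```python
import re

# One attribute row of the old-style smartctl table quoted in this thread:
# "(ID)Name   Flag   Value Worst Threshold Raw"
ROW = re.compile(
    r"\(\s*(\d+)\)(.+?)\s+(0x[0-9a-fA-F]{4})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)

# Attribute IDs singled out in the reply above (illustrative watch list):
# 5 = Reallocated Sector Ct, 199 = UDMA CRC Error Count.
WATCHED_IDS = (5, 199)


def parse_attributes(text):
    """Return {attribute_id: (name, raw_value)} for every matching row."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            attrs[int(m.group(1))] = (m.group(2).strip(), int(m.group(7)))
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is non-zero."""
    return {i: attrs[i] for i in WATCHED_IDS if i in attrs and attrs[i][1] > 0}


# Rows copied from the smartctl output posted for hda and hdc.
HDA = """\
(  5)Reallocated Sector Ct   0x0032   160   160   112       160
(199)UDMA CRC Error Count    0x000a   200   253   000       65884
"""
HDC = """\
(  5)Reallocated Sector Ct   0x0033   200   200   140       0
(199)UDMA CRC Error Count    0x000a   200   253   000       0
"""

print("hda:", suspicious(parse_attributes(HDA)))
print("hdc:", suspicious(parse_attributes(HDC)))
```

Run against the posted tables, only hda is flagged; hdc reports zero for
both attributes.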
 




* Re: DMA timeouts on Promise 20267 IDE card
       [not found] <233C89823A37714D95B1A891DE3BCE5202AB1B7A@xch-a.win.zambeel.com>
@ 2003-01-10 10:33 ` James Curbo
  0 siblings, 0 replies; 6+ messages in thread
From: James Curbo @ 2003-01-10 10:33 UTC (permalink / raw)
  To: Manish Lachwani; +Cc: linux-kernel

On Jan 09, Manish Lachwani wrote:

> You should change the drive hda and also the cable. Then try again. 
> 

Oops! One of my IDE cables wasn't seated properly. Thanks for the help!
At least it wasn't a kernel bug :)

-- 
James Curbo <hannibal@adtrw.org> <phoenix@sandwich.net>
http://www.adtrw.org/blogs/hannibal/

