linux-ide.vger.kernel.org archive mirror
* ide dma bug?
@ 2005-09-04 20:18 Trevor Cordes
  2005-09-05 11:39 ` Christoph Burger-Scheidlin
  0 siblings, 1 reply; 3+ messages in thread
From: Trevor Cordes @ 2005-09-04 20:18 UTC (permalink / raw)
  To: linux-ide

I'm seeing the following messages a couple of times every few days:

Sep  1 07:09:53 piles kernel: hdq: dma_intr: bad DMA status (dma_stat=75)
Sep  1 07:09:53 piles kernel: hdq: dma_intr: status=0x50 { DriveReady SeekComplete }
Sep  1 07:09:53 piles kernel: ide: failed opcode was: unknown
Sep  1 07:10:26 piles kernel: hdr: dma_intr: bad DMA status (dma_stat=75)
Sep  1 07:10:26 piles kernel: hdr: dma_intr: status=0x50 { DriveReady SeekComplete }
Sep  1 07:10:26 piles kernel: ide: failed opcode was: unknown
Sep  3 19:41:36 piles kernel: hdt: dma_intr: bad DMA status (dma_stat=75)
Sep  3 19:41:36 piles kernel: hdt: dma_intr: status=0x50 { DriveReady SeekComplete }
Sep  3 19:41:36 piles kernel: ide: failed opcode was: unknown

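For reference, those two hex values can be decoded bit-by-bit. The sketch below uses the standard SFF-8038i bus-master IDE status bits and the ATA status register bits (the bit names are from those specs, not from this thread); the key point is that a clean DMA completion has only the INTERRUPT bit set in the low three bits, whereas dma_stat=0x75 still has ACTIVE set when the interrupt fires:

```python
# Bus-master IDE (SFF-8038i) status register bits (bits 3-4 are
# reserved/vendor-specific and ignored here):
BMIDE_BITS = {0: "ACTIVE", 1: "ERROR", 2: "INTERRUPT",
              5: "DRIVE0_DMA_CAPABLE", 6: "DRIVE1_DMA_CAPABLE",
              7: "SIMPLEX"}

# ATA status register bits:
ATA_BITS = {0: "ERR", 3: "DRQ", 4: "DSC (SeekComplete)",
            5: "DF", 6: "DRDY (DriveReady)", 7: "BSY"}

def decode(value, table):
    """Return the names of the bits set in value."""
    return [name for bit, name in table.items() if value & (1 << bit)]

# A "good" completion is (dma_stat & 7) == 4: INTERRUPT set, ACTIVE and
# ERROR clear.  Here 0x75 & 7 == 5, so ACTIVE is still set when the
# interrupt arrives -> the driver logs "bad DMA status".
print(decode(0x75, BMIDE_BITS))
# -> ['ACTIVE', 'INTERRUPT', 'DRIVE0_DMA_CAPABLE', 'DRIVE1_DMA_CAPABLE']

# status=0x50 is just DRDY|DSC, no ERR bit -- the drive itself reports
# no error, matching "{ DriveReady SeekComplete }" in the log.
print(decode(0x50, ATA_BITS))
# -> ['DSC (SeekComplete)', 'DRDY (DriveReady)']
```

So the controller-side DMA engine reports an odd state while the drives report they are fine, which matches the symptoms described below.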
It's my franken-computer file-server that has 12 IDE PATA drives running a
combo of RAID6 and RAID0 (and a little RAID1).  They're running off of the
onboard Intel 845D IDE, onboard Promise PDC20276 (used only as a simple
IDE controller), 1 Promise Ultra100 TX2 PDC20268, and 2 CMD680-based
cards.  8 drives are 250GB (3 diff brands -- on purpose!), 4 are 160GB. 
The 8 250's are all IDE masters.  The 4 160's share 2 IDE channels
(master+slave).

The 160's are hdq, hdr, hds, and hdt.  Those are the ONLY drives that show
these intermittent dma errors.  Obviously the problem has to do with using
some drives as slaves.  I was reluctant to put in another IDE card (having
so many already scares me), but have one on order right now.  I'm hoping
having them all on master will alleviate this problem.

Note, besides the log messages, I see no other symptoms.  The RAID array 
stays up and things seem to run fine.  However, a friend said the 
following to me, and so I'm writing this email:

"Not good... means you've got DMA problems.  Possibly related to 
having too many scatter-gather operations pending simultaneously all 
being triggered off one or two IRQs... ???  I'd post to either ATA or 
Kernel or RAID mailing lists and ask for opinions, since that's a 
potential BIG problem in terms of CPU usage."

"What you're telling me about those disks tells me that there's a slight 
bug in the driver code - basically a straggler interrupt handler looks for 
DMA results, sees bogosity, checks the drive for errors, and prints a message 
about bad-DMA-but-drives-are-fine."

My kernel is 2.6.12-1.1372_FC3

My hdparm settings are all identical:
#hdparm /dev/hds
/dev/hds:
 multcount    = 16 (on)
 IO_support   =  1 (32-bit)
 unmaskirq    =  1 (on)
 using_dma    =  1 (on)
 keepsettings =  0 (off)
 readonly     =  0 (off)
 readahead    = 256 (on)


Thread overview: 3+ messages
2005-09-04 20:18 ide dma bug? Trevor Cordes
2005-09-05 11:39 ` Christoph Burger-Scheidlin
2005-09-05 16:00   ` Mark Lord
