linux-ide.vger.kernel.org archive mirror
From: Sergei Shtylyov <sshtylyov@ru.mvista.com>
To: linas@austin.ibm.com
Cc: Stuart_Hayes@Dell.com, linux-ide@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [BUG] ide dma_timer_expiry, then hard lockup
Date: Tue, 19 Jun 2007 18:07:07 +0400
Message-ID: <4677E30B.4020101@ru.mvista.com>
In-Reply-To: <DFEF91B22ED07447AB6AA4B237F913F9B18A05@ausx3mpc125.aus.amer.dell.com>

Hello.

Stuart_Hayes@Dell.com wrote:
> I think reading the IDE status register clears the interrupt in the IDE
> device, which might be causing the drive to think it's OK to generate
> another interrupt.

    This is not how IDE drives are supposed to act -- they won't proceed any 
further until the "interrupt pending" condition is cleared, so interrupts aren't 
supposed to be "stacked". This behavior is not strictly specified by the 
ATA standards IIRC, but I can't readily imagine such a situation anyway unless 
tagged command queueing (which is not supported by the IDE core) and/or ATAPI 
command overlapping is in action...

>  This could either cause it to get stuck trying to
> service an interrupt that is never getting cleared as you suggested, or
> possibly when the next IRQ comes in the IDE IRQ handler gets stuck
> waiting for a spinlock that the code you're looking at already owns...?

    I could also imagine the HPT366 chip going mad and stalling the reads of 
the taskfile regs forever because of the incomplete DMA, or even the drive 
going mad and not replying to I/O cycles with the proper -IORDY handshake (i.e. 
holding it low all the time)...

> Perhaps a printk in the IDE IRQ handler would be informative?  It
> wouldn't help you figure out how it got where it is, but it might help
> you figure out why the system is hanging.

> Stuart

> -----Original Message-----
> From: linux-ide-owner@vger.kernel.org
> [mailto:linux-ide-owner@vger.kernel.org] On Behalf Of Linas Vepstas
> Sent: Monday, June 18, 2007 12:57 PM
> To: linux-ide@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [BUG] ide dma_timer_expiry, then hard lockup

> I've got a hard lockup in the ide subsystem, probably due to some irq
> spew or something like that.
> 
> I've just bought a brand new Maxtor 320GB disk drive for the insane
> price of $70 US to replace another failing drive. It works well under
> light load; I was able to copy about 60GB to it. However, under heavy
> load, such as reconstruction of an MD
> RAID-1 array, it'll lock up the kernel.  Which means that my system
> won't boot :-(
> 
> I'm running 2.6.21.1, although the problem seems to occur in 2.6.19 and
> 2.6.18 too; it's been there a while; I vaguely remember similar problems
> in 2.6.5 or 2.6.10.
> 
> I get an
> "hdc: dma_timer_expiry: dma status == 0x21" 

    This means "DMA not complete".

> and 10 seconds later,

    The above condition makes the driver wait for another 10-second timeout...
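    To make the 0x21 reading concrete, here is a minimal user-space sketch that decodes a bus-master IDE status byte. The bit layout is the standard SFF-8038i one; the names (BM_STAT_*, bm_classify) are made up for illustration, and the order of checks only approximates what I recall the 2.6-era dma_timer_expiry() doing, so treat it as an assumption rather than the actual driver code.

```c
#include <stdint.h>
#include <stdio.h>

/* Bus-master IDE status register bits (SFF-8038i layout). */
#define BM_STAT_ACTIVE   0x01  /* DMA engine still running */
#define BM_STAT_ERROR    0x02  /* DMA transfer error */
#define BM_STAT_INTR     0x04  /* IDE interrupt latched */
#define BM_STAT_DRV0_DMA 0x20  /* drive 0 is DMA capable */
#define BM_STAT_DRV1_DMA 0x40  /* drive 1 is DMA capable */
#define BM_STAT_SIMPLEX  0x80  /* only one channel may do DMA at a time */

/* Hypothetical verdicts; the real driver returns wait times or -1. */
enum bm_verdict { BM_RESET, BM_KEEP_WAITING, BM_FAIL };

/*
 * Assumed order of checks: error first, then "still active",
 * then "interrupt seen"; anything else is treated as unknown.
 */
enum bm_verdict bm_classify(uint8_t dma_stat)
{
    if (dma_stat & BM_STAT_ERROR)
        return BM_FAIL;
    if (dma_stat & BM_STAT_ACTIVE)
        return BM_KEEP_WAITING;   /* DMA not complete: wait once more */
    if (dma_stat & BM_STAT_INTR)
        return BM_KEEP_WAITING;
    return BM_RESET;
}

/* Print the flags that are set in a status byte. */
void bm_describe(uint8_t dma_stat)
{
    printf("dma status 0x%02x:%s%s%s%s%s%s\n", dma_stat,
           (dma_stat & BM_STAT_ACTIVE)   ? " ACTIVE"   : "",
           (dma_stat & BM_STAT_ERROR)    ? " ERROR"    : "",
           (dma_stat & BM_STAT_INTR)     ? " INTR"     : "",
           (dma_stat & BM_STAT_DRV0_DMA) ? " DRV0_DMA" : "",
           (dma_stat & BM_STAT_DRV1_DMA) ? " DRV1_DMA" : "",
           (dma_stat & BM_STAT_SIMPLEX)  ? " SIMPLEX"  : "");
}
```

    For 0x21 only the ACTIVE and DRV0_DMA bits are set: the engine is still running, with no error and no interrupt latched, which is why the sketch classifies it as "keep waiting" rather than as a failure.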

> "hdc: DMA Timeout error"

> at which point the system is locked up hard.
> Magic sysreq does not work at all. The hard drive activity light stays
> fully lit.  Inserting printk's into the kernel, I find the hang to be in
> a surprising place: 

> ide_dma_timeout_retry() in ide-io.c 
>   prints the "hdc: DMA Timeout error" then calls
>   HWIF(drive)->ide_dma_end(drive);
>     which returns, and then calls 
>   hwif->INB(IDE_STATUS_REG) which is needed as an argument to
> ide_error()

> But this hangs! -- The INB never returns.
> Now:  hwif->INB = ide_inb; in ide-iops.c

> So putting a printk into ide_inb() shows that
> the printk before the readb() is printed, and the
> printk after the readb is not (!!)

> I find this rather surprising, as I can't imagine how the
> readb can fail. My current vague theory is that doing this
> readb makes the hard drive go really nuts, and it probably

    As I said, this is not the only way it all might have gone nuts... :-)

> ties some interrupt line high, and so the linux kernel 
> gets stuck trying to handle the irq flood. I just don't know
> enough about the i386 architecture, or about interrupts, to 
> prove or disprove this.

> Any suggestions, experiments, experimental patches, data gathering,
> etc. is welcome. The sooner, the better... 

> --linas

MBR, Sergei

