From: Jeff Garzik <jeff@garzik.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-ide@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [git patches] libata updates for 2.6.34
Date: Tue, 09 Mar 2010 17:12:02 -0500
Message-ID: <4B96C7B2.3080008@garzik.org>
In-Reply-To: <alpine.LFD.2.00.1003091304400.3583@i5.linux-foundation.org>

On 03/09/2010 04:17 PM, Linus Torvalds wrote:
>
> Jeff,
>   this is a new machine, so I don't know when it started, but it was
> running a couple of Fedora 2.6.31/32 kernels for a while with no trouble.
> So I _think_ it's recent.
>
> I'd guess it's due to commit 27943620cb ("libata: implement spurious irq
> handling for SFF and apply it to piix"), in fact.
>
> With current -git I got a 30 second pause, and it was accompanied with
> this kernel log:
>
> 	Mar  9 12:51:05 i5 kernel: [    7.040194] ata4: clearing spurious IRQ
> 	Mar  9 12:51:05 i5 kernel: [   37.978933] ata4: lost interrupt (Status 0x50)
> 	Mar  9 12:51:05 i5 kernel: [   37.978948] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 frozen
> 	Mar  9 12:51:05 i5 kernel: [   37.978951] ata4.01: failed command: READ DMA
> 	Mar  9 12:51:05 i5 kernel: [   37.978954] ata4.01: cmd c8/00:08:ef:44:47/00:00:00:00:00/f0 tag 0 dma 4096 in
> 	Mar  9 12:51:05 i5 kernel: [   37.978955]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> 	Mar  9 12:51:05 i5 kernel: [   37.978957] ata4.01: status: { DRDY }
> 	Mar  9 12:51:05 i5 kernel: [   37.978963] ata4.00: hard resetting link
> 	Mar  9 12:51:05 i5 kernel: [   38.306451] ata4.01: hard resetting link
> 	Mar  9 12:51:05 i5 kernel: [   38.785773] ata4.00: SATA link down (SStatus 0 SControl 300)
> 	Mar  9 12:51:05 i5 kernel: [   38.785787] ata4.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> 	Mar  9 12:51:05 i5 kernel: [   38.809900] ata4.01: configured for UDMA/133
> 	Mar  9 12:51:05 i5 kernel: [   38.809903] ata4.01: device reported invalid CHS sector 0
> 	Mar  9 12:51:05 i5 kernel: [   38.809907] ata4: EH complete

Coincidentally, it looks like someone else just reported the same
problem, with 2.6.34-rc1.

It definitely sounds like a race.  READ DMA is a DMA command as the name 
implies, so that eliminates the possibility of polling-related paths in 
ata_sff_interrupt (libata-sff.c).

I'll flip some of my machines to the icky slow boring piix mode, rather 
than sexy AHCI mode :) to see if I can reproduce.  I have had a feeling 
that we needed a more sophisticated IRQ handling setup; this may be what
was needed.  Lost interrupt recovery should occur faster than 30 seconds 
in any case, and should not require a hard reset if the hardware 
functions just fine outside of the lost-interrupt / race that just occurred.
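
For what it's worth, the "lost interrupt (Status 0x50)" line in the log above can be decoded by hand. The snippet below is only a rough sketch: decode_status() is a hypothetical helper, not anything in libata, and the bit names are the traditional ATA status register names (some of which newer ATA revisions mark obsolete). It shows BSY and ERR both clear and DRDY set, i.e. the device looks idle and ready, which fits a lost interrupt better than a wedged drive.

```python
# Traditional ATA status register bits, highest bit first.
# Assumption: older ATA naming; several of these are obsolete in newer specs.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete (historical name)
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data (historical name)
    0x02: "IDX",   # index (historical name)
    0x01: "ERR",   # error
}


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS.items() if status & mask]


if __name__ == "__main__":
    print(decode_status(0x50))  # status from "lost interrupt (Status 0x50)"
    print(decode_status(0x40))  # status byte from the "res 40/..." line
```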

If it helps, this wiki page explains the error output a bit more:
http://ata.wiki.kernel.org/index.php/Libata_error_messages

though in this case, it is clearly a timeout, so looking at the input 
and output taskfile register blocks will not be as informative as in 
other error situations.
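
For anyone who wants to read the "cmd" line anyway, here is a minimal sketch of pulling it apart. It assumes the field order is command/feature:nsect:lbal:lbam:lbah/hob fields/device, which is my understanding of the libata error output, and it assumes 512-byte sectors; parse_cmd() and its dictionary keys are made up for illustration. For the line in the log above, the LBA28 address and the 8-sector count work out to the same 4096 bytes the log reports.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def parse_cmd(taskfile):
    """Split a 'cmd' taskfile string such as
    'c8/00:08:ef:44:47/00:00:00:00:00/f0' into named fields.

    Assumed layout: command / feature:nsect:lbal:lbam:lbah /
    hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah / device
    """
    command, low, _hob, device = taskfile.split("/")
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    device = int(device, 16)

    # LBA28: bits 27:24 live in the low nibble of the device register.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    # For 28-bit commands a sector count of 0 means 256 sectors.
    sectors = nsect if nsect != 0 else 256

    return {
        "command": int(command, 16),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_SIZE,
        "lba_mode": bool(device & 0x40),    # device register bit 6
        "device_sel": (device >> 4) & 0x1,  # device register bit 4
    }


if __name__ == "__main__":
    fields = parse_cmd("c8/00:08:ef:44:47/00:00:00:00:00/f0")
    for key, value in fields.items():
        print(key, value)
```

The device select bit comes out as 1, which lines up with the failing device being ata4.01 rather than ata4.00.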

	Jeff



