From: Jeff Garzik <jeff@garzik.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-ide@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
Tejun Heo <tj@kernel.org>
Subject: Re: [git patches] libata updates for 2.6.34
Date: Tue, 09 Mar 2010 17:12:02 -0500
Message-ID: <4B96C7B2.3080008@garzik.org>
In-Reply-To: <alpine.LFD.2.00.1003091304400.3583@i5.linux-foundation.org>

On 03/09/2010 04:17 PM, Linus Torvalds wrote:
>
> Jeff,
> this is a new machine, so I don't know when it started, but it was
> running a couple of Fedora 2.6.31/32 kernels for a while with no trouble.
> So I _think_ it's recent.
>
> I'd guess it's due to commit 27943620cb ("libata: implement spurious irq
> handling for SFF and apply it to piix"), in fact.
>
> With current -git I got a 30 second pause, and it was accompanied with
> this kernel log:
>
> Mar 9 12:51:05 i5 kernel: [ 7.040194] ata4: clearing spurious IRQ
> Mar 9 12:51:05 i5 kernel: [ 37.978933] ata4: lost interrupt (Status 0x50)
> Mar 9 12:51:05 i5 kernel: [ 37.978948] ata4.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0 frozen
> Mar 9 12:51:05 i5 kernel: [ 37.978951] ata4.01: failed command: READ DMA
> Mar 9 12:51:05 i5 kernel: [ 37.978954] ata4.01: cmd c8/00:08:ef:44:47/00:00:00:00:00/f0 tag 0 dma 4096 in
> Mar 9 12:51:05 i5 kernel: [ 37.978955] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Mar 9 12:51:05 i5 kernel: [ 37.978957] ata4.01: status: { DRDY }
> Mar 9 12:51:05 i5 kernel: [ 37.978963] ata4.00: hard resetting link
> Mar 9 12:51:05 i5 kernel: [ 38.306451] ata4.01: hard resetting link
> Mar 9 12:51:05 i5 kernel: [ 38.785773] ata4.00: SATA link down (SStatus 0 SControl 300)
> Mar 9 12:51:05 i5 kernel: [ 38.785787] ata4.01: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Mar 9 12:51:05 i5 kernel: [ 38.809900] ata4.01: configured for UDMA/133
> Mar 9 12:51:05 i5 kernel: [ 38.809903] ata4.01: device reported invalid CHS sector 0
> Mar 9 12:51:05 i5 kernel: [ 38.809907] ata4: EH complete

Coincidentally, it looks like someone else just reported the same
problem, with 2.6.34-rc1.

It definitely sounds like a race.  READ DMA is a DMA command, as the name
implies, so that eliminates the possibility of polling-related paths in
ata_sff_interrupt (libata-sff.c).

I'll flip some of my machines to the icky slow boring piix mode, rather
than sexy AHCI mode :) to see if I can reproduce.  I have had a feeling
that we needed a more sophisticated IRQ handling setup, and this may be
the case that calls for it.  Lost interrupt recovery should occur faster
than 30 seconds in any case, and should not require a hard reset if the
hardware functions just fine outside of the lost-interrupt race that
just occurred.

If it helps, this wiki page explains the error output a bit more:
http://ata.wiki.kernel.org/index.php/Libata_error_messages
though in this case, it is clearly a timeout, so looking at the input
and output taskfile register blocks will not be as informative as in
other error situations.
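
As a rough illustration of how the numbers in the log above can be read,
here is a small standalone sketch.  It is not libata code: the helper
names (decode_status, lba28, transfer_bytes) are made up for this
example, and it assumes the usual ATA status-bit layout, an LBA28
command, and 512-byte logical sectors.

```c
#include <stdint.h>
#include <stdio.h>

/* Standard ATA status register bits, most significant first. */
struct status_bit {
	uint8_t mask;
	const char *name;
};

static const struct status_bit status_bits[] = {
	{ 0x80, "BSY"  },	/* device busy */
	{ 0x40, "DRDY" },	/* device ready */
	{ 0x20, "DF"   },	/* device fault */
	{ 0x10, "DSC"  },	/* seek complete */
	{ 0x08, "DRQ"  },	/* data request */
	{ 0x04, "CORR" },	/* corrected data (obsolete) */
	{ 0x02, "IDX"  },	/* index (obsolete) */
	{ 0x01, "ERR"  },	/* error */
};

/* Hypothetical helper: write the names of the set bits into buf. */
static char *decode_status(uint8_t status, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (i = 0; i < sizeof(status_bits) / sizeof(status_bits[0]); i++) {
		int n;

		if (!(status & status_bits[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " " : "", status_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* out of room; keep what fits */
		used += (size_t)n;
	}
	return buf;
}

/*
 * Hypothetical helper: rebuild a 28-bit LBA from the taskfile bytes.
 * The low nibble of the device register carries LBA bits 27:24.
 */
static uint32_t lba28(uint8_t lbal, uint8_t lbam, uint8_t lbah,
		      uint8_t device)
{
	return ((uint32_t)(device & 0x0f) << 24) |
	       ((uint32_t)lbah << 16) |
	       ((uint32_t)lbam << 8) |
	       (uint32_t)lbal;
}

/*
 * Hypothetical helper: byte count for an LBA28 transfer.  A sector
 * count of 0 means 256 sectors; 512-byte sectors are an assumption.
 */
static uint32_t transfer_bytes(uint8_t nsect)
{
	return (nsect ? nsect : 256u) * 512u;
}
```

Fed the values from the log, decode_status(0x50, ...) gives "DRDY DSC",
i.e. the device is ready, not busy, and has no error bit set.  For the
"cmd c8/00:08:ef:44:47/.../f0" line, lba28(0xef, 0x44, 0x47, 0xf0) gives
LBA 0x4744ef and transfer_bytes(0x08) gives 4096, which agrees with the
"dma 4096 in" the kernel printed.
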
Jeff