From: Rich West <Rich.West@wesmo.com>
To: Jeff Garzik <jeff@garzik.org>
Cc: linux-kernel@vger.kernel.org,
Linux IDE mailing list <linux-ide@vger.kernel.org>
Subject: Re: sata_via
Date: Fri, 04 Apr 2008 19:44:21 -0400
Message-ID: <47F6BD55.3090708@wesmo.com>
In-Reply-To: <47F6A5E8.1030602@garzik.org>
Jeff Garzik wrote:
> Rich West wrote:
>> On my mythtv backend system, the recordings volume tends to get pounded
>> rather hard (up to 5 recordings (some HD) at once with multiple frontend
>> systems reading from that same volume). I recently (4 months ago)
>> upgraded the system to a motherboard that happened to have the VIA
>> chipset on it.
>>
>> Since that time, I have had some bizarre problems with that volume.
>> After a seemingly random amount of time, the kernel would report an
>> error with the volume and put it in read-only mode. However, it would
>> not really be in read-only mode, but it would be completely
>> inaccessible. Unmounting the volume would be successful, but
>> re-mounting the volume would fail.
>>
>> I've replaced the drive (with an identical one), tested memory, changed
>> filesystems (it was LVM + ext3, then just ext3) and the problem
>> persists.
>>
>> Running 2.6.24.4-64 (Fedora 8).
>>
>> A larger snippet from the messages log is (dmesg gets cleared after
>> reboot):
>> Apr 3 16:47:27 mythtv1 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Apr 3 16:47:27 mythtv1 kernel: ata4.00: cmd c8/00:00:77:31:21/00:00:00:00:00/e1 tag 0 dma 131072 in
>> Apr 3 16:47:27 mythtv1 kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Apr 3 16:47:27 mythtv1 kernel: ata4.00: status: { DRDY }
>> Apr 3 16:47:27 mythtv1 kernel: ata4: soft resetting link
>> Apr 3 16:47:57 mythtv1 kernel: ata4.00: qc timeout (cmd 0x27)
>> Apr 3 16:47:57 mythtv1 kernel: ata4.00: failed to read native max address (err_mask=0x4)
>> Apr 3 16:47:57 mythtv1 kernel: ata4.00: HPA support seems broken, will skip HPA handling
>> Apr 3 16:47:57 mythtv1 kernel: ata4.00: revalidation failed (errno=-5)
>> Apr 3 16:47:57 mythtv1 kernel: ata4: failed to recover some devices, retrying in 5 secs
>> Apr 3 16:48:02 mythtv1 kernel: ata4: soft resetting link
>> Apr 3 16:48:02 mythtv1 kernel: ata4.00: configured for UDMA/133
>> Apr 3 16:48:02 mythtv1 kernel: ata4: EH complete
>> Apr 3 16:49:02 mythtv1 kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>> Apr 3 16:49:02 mythtv1 kernel: ata4.00: cmd c8/00:00:77:31:21/00:00:00:00:00/e1 tag 0 dma 131072 in
>> Apr 3 16:49:02 mythtv1 kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Apr 3 16:49:02 mythtv1 kernel: ata4.00: status: { DRDY }
>> Apr 3 16:49:02 mythtv1 kernel: ata4: soft resetting link
>> Apr 3 16:49:03 mythtv1 kernel: ata4.00: configured for UDMA/133
>> Apr 3 16:49:03 mythtv1 kernel: ata4: EH complete
>>
>> It is almost as if I am hitting some bug that is causing the drive to
>> fall off, but I really don't know where else to look or where else to
>> turn...
>>
>> I'm tempted to just go back to using a PATA drive (smaller. :( ) to
>> avoid the problem. I'm just at a loss as to how it can actually be
>> solved.
>
> This timeout/DRDY message has been a common one recently. Some of the
> issues causing this may be resolved in 2.6.25-rc, can you try that?
>
> Also, if you could build and test some older kernels to see when this
> behavior first appeared, that would be quite helpful.
>
> Overall, a timeout _might_ be a problem with libata (the kernel SATA
> drivers), or it _might_ be a problem with your system's interrupt
> delivery (sometimes an ACPI or BIOS problem). Try booting with
> 'noapic' or 'acpi=off'.
>
Thanks for the quick response.

I know this problem was happening with all of the Fedora 7 supplied
kernels (from initial release up until about a week ago) and has
happened with each of the Fedora 8 supplied kernels. I'll try rolling
2.6.25-rc to see if the problem resurfaces. Unfortunately, I don't know
what collision of events causes this problem to erupt, but it usually
happens within 7 days of a reboot (sometimes within hours of a reboot).

I'll give noapic a try, but (dumb question) what does acpi=off buy?

-Rich
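
For anyone reading the log above, the "cmd" taskfile dump can be decoded by hand to see which request timed out. Below is a minimal sketch in Python, not a libata tool. It assumes the dump lists command/feature:nsect:lbal:lbam:lbah/(five HOB bytes)/device, that the command is a 28-bit LBA one, and that sectors are 512 bytes. The function name decode_taskfile and the opcode table are made up for illustration and cover only the two opcodes that appear in this log.

```python
# Sketch: decode a libata "cmd" taskfile dump as printed in the log above.
# Assumed field order:
#   command/feature:nsect:lbal:lbam:lbah/hob_feature:...:hob_lbah/device

# Only the two opcodes seen in this log are named (illustrative table).
ATA_OPCODES = {
    0xC8: "READ DMA",
    0x27: "READ NATIVE MAX ADDRESS EXT",
}

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def decode_taskfile(dump):
    """Decode an LBA28 taskfile dump string (hypothetical helper)."""
    command, low, _hob, device = dump.split("/")
    command = int(command, 16)
    device = int(device, 16)
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))

    # LBA28: address bits 27:24 sit in the low nibble of the device register.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal

    # For 28-bit read/write commands a sector count of 0 means 256 sectors.
    sectors = nsect or 256

    return {
        "command": ATA_OPCODES.get(command, "unknown (0x%02x)" % command),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * SECTOR_SIZE,
    }


if __name__ == "__main__":
    info = decode_taskfile("c8/00:00:77:31:21/00:00:00:00:00/e1")
    print(info)
```

Under those assumptions the failing request decodes to a 256-sector READ DMA, which agrees with the "dma 131072 in" in the same log line. The identical cmd line in both exceptions suggests the same read was retried and timed out again.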