From: Andrew Morton <akpm@osdl.org>
To: Phillip Susi <psusi@cfl.rr.com>
Cc: linux-kernel@vger.kernel.org, Jeff Garzik <jgarzik@pobox.com>
Subject: Re: VIA SATA Raid needs a long time to recover from suspend
Date: Wed, 16 Nov 2005 17:06:42 -0800 [thread overview]
Message-ID: <20051116170642.313aeada.akpm@osdl.org> (raw)
In-Reply-To: <437AA996.9080505@cfl.rr.com>
Phillip Susi <psusi@cfl.rr.com> wrote:
>
> I have been debugging a power management problem for a few days now, and
> I believe I have finally solved the problem. Because it involved
> patching the kernel, I felt I should share the fix here in hopes that it
> can be improved and/or integrated into future kernels. Right now I am
> running 2.6.14.2 on amd64, compiled myself, with the ubuntu breezy amd64
> distribution.
>
> First I'll state the fix. It involved changing two lines in
> include/linux/libata.h:
>
> static inline u8 ata_busy_wait(struct ata_port *ap, unsigned int bits,
> unsigned int max)
> {
> u8 status;
>
> do {
> udelay(100); <-- changed to 100
> from 10
> status = ata_chk_status(ap);
> max--;
> } while ((status & bits) && (max > 0));
>
> return status;
> }
>
> and:
>
> static inline u8 ata_wait_idle(struct ata_port *ap)
> {
> u8 status = ata_busy_wait(ap, ATA_BUSY | ATA_DRQ,
> 10000); <-- changed to 10,000 from 1,000
>
> if (status & (ATA_BUSY | ATA_DRQ)) {
> unsigned long l = ap->ioaddr.status_addr;
> printk(KERN_WARNING
> "ATA: abnormal status 0x%X on port 0x%lX\n",
> status, l);
> }
>
> return status;
> }
This change will increase the minimum delay in both ata_wait_idle() and
ata_busy_wait() from 10 usec to 100 usec, which is not a good change.
It would be less damaging to increase the delay in ata_wait_idle() from
1000 to 100,000. A one second spin is a bit sad, but the hardware's bust,
so that's the least of the user's worries.
The best fix would be to identify the specific _callers_ of these functions
which need their delay increased. You can do that by adding:
if (max == 0)
dump_stack();
at the end of ata_busy_wait(). The resulting stack dumps will tell you
where the offending callsites are. With luck, it's just one. If we know
which one, that might point us at some chipset-level delay which we're
forgetting to do, or something like that.
> The problem seems to be that my VIA SATA raid controller requires more
> time to recover from being suspended. It looks like the code in
> sata_via.c restores the task file after a resume, then calls
> ata_wait_idle to wait for the busy bit to clear. The problem was that
> this function timed out before the busy bit cleared, resulting in
> messages like this:
>
> ATA: abnormal status 0x80 on port 0xE007
>
> Then if there was an IO request made immediately after resuming, it
> would timeout and fail, because it was issued before the hardware was
> ready. Changing the timeout resolved this. I tried changing both the
> udelay and ata_busy_wait lines to increase the timeout, and it did not
> seem to matter which I changed, as long as the total timeout was
> increased by a factor of 100.
>
> Since increasing the maximum timeout, suspend and hibernate work great
> for me. While experiencing this bug, it may have exposed another bug,
> which I will mention now in passing. As I said before, after a resume,
> if there was an IO request made immediately ( before the busy bit
> finally did clear ) it would timeout and fail. It seemed the kernel
> filled the buffer cache for the requested block with garbage rather than
> retry the read. It seems to me that at some point, the read should have
> been retried.
Or should have returned an IO error.
Yes, this might be a driver bug. Again, if we know precisely which
callsite is experiencing the timeout then we're better situated to fix it.
next prev parent reply other threads:[~2005-11-17 1:07 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-11-16 3:37 VIA SATA Raid needs a long time to recover from suspend Phillip Susi
2005-11-17 1:06 ` Andrew Morton [this message]
2005-11-17 3:55 ` Phillip Susi
2005-11-18 0:08 ` Mark Lord
2005-11-19 23:32 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20051116170642.313aeada.akpm@osdl.org \
--to=akpm@osdl.org \
--cc=jgarzik@pobox.com \
--cc=linux-kernel@vger.kernel.org \
--cc=psusi@cfl.rr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox