From: Sergei Shtylyov <sshtylyov@ru.mvista.com>
To: Linas Vepstas <linas@austin.ibm.com>
Cc: linux-ide@vger.kernel.org
Subject: Re: [RFT] hpt366: reset DMA state machine on timeouts
Date: Fri, 22 Jun 2007 19:32:44 +0400
Message-ID: <467BEB9C.1070407@ru.mvista.com>
In-Reply-To: <20070622151359.GD8840@austin.ibm.com>
Hello.
Linas Vepstas wrote:
>>Reset HPT36x's DMA state machine on a DMA timeout the way it's done for HPT370.
>>Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
>>---
>>Linas, here's what I've come up with -- this should apply against 2.6.21.y.
>>Compile-tested only, not for merging.
>> drivers/ide/pci/hpt366.c | 24 +++++++++++++++++++++++-
> This worked great! The patch is good. But it raises another interesting
> issue, one of those akpm ZFS "violates boundaries" issues.
> However.. When raid goes to reconstruct the partition, I get one
> of the Drive Ready Seek Complete etc. messages. Your handler recovers
I hope you meant those messages were preceded by DMA timeouts (otherwise
this code wouldn't come into action).
> from it (I put in a printk to verify this).
You mean into my ide_dma_timeout() method?
> And so these printk's
> try to get logged into /var/log/messages ... which trigger more
> errors. At a very high rate ... sometimes hundreds a second, sometimes
> less. The system remains usable, but at one point, it hit 60% cpu usage
> spewing these messages to the screen.
Hm...
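For illustration only, and not part of the patch: the flood described above is the kind of thing message rate limiting is meant to contain (the kernel has printk_ratelimit() for this purpose). Below is a minimal userspace sketch of the idea as a token bucket. The names log_limiter, log_limiter_init() and log_allowed() are hypothetical, and the interval and burst values are arbitrary assumptions rather than kernel defaults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token-bucket limiter; not a kernel API. */
struct log_limiter {
	long interval;	/* seconds needed to earn one message */
	int burst;	/* most messages allowed back to back */
	int tokens;	/* messages currently allowed */
	long last;	/* time of the last refill, in seconds */
};

static void log_limiter_init(struct log_limiter *l, long interval,
			     int burst, long now)
{
	assert(interval > 0 && burst > 0);
	l->interval = interval;
	l->burst = burst;
	l->tokens = burst;	/* start with a full bucket */
	l->last = now;
}

/* Returns true if a message may be logged at time 'now'. */
static bool log_allowed(struct log_limiter *l, long now)
{
	long earned = (now - l->last) / l->interval;

	if (earned > 0) {
		/* Refill, but never beyond the burst size. */
		if (earned >= l->burst - l->tokens)
			l->tokens = l->burst;
		else
			l->tokens += (int)earned;
		l->last += earned * l->interval;
	}
	if (l->tokens == 0)
		return false;	/* bucket empty: drop this message */
	l->tokens--;
	return true;
}
```

A caller would guard each error message with log_allowed() and skip the message when it returns false, so a burst of errors costs at most 'burst' log lines per refill period.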
> I'd like to see several things.
> 1) This patch should go in. It converts a system that hangs into
> one that doesn't hang.
What's strange is that it never seemed to be necessary before your great
new drive... ;-)
So, providing its data certainly wouldn't hurt -- perhaps we should just
blacklist it instead -- maybe there's a UDMA speed at which this wouldn't
happen, and we could just limit the drive to it.
> 2) There needs to be a way of failing the disk when there's a high
> number of errors. e.g. if there are more than 100 errors per minute
> then the disk needs to be marked "failed" in the raid array.
> Note it should be stopped only if the rate is high: if there is
> only 1 error per minute, this might be very annoying, but acceptable,
> esp. if one is just trying to copy data off the disk.
> I'm not sure what to do if this had been the only disk in the system.
> Maybe if the error rate exceeds 100/minute, then dma is turned off
> permanently?
In fact, it should be turned off after 3 DMA errors (causing PIO retries).
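To make the two thresholds in this exchange concrete, here is a rough userspace sketch, not driver code: fall back to PIO after 3 DMA errors, and treat the drive as failed when more than 100 errors land within one minute. All names (drive_err_state, record_dma_error() and the constants) are hypothetical, and the fixed one-minute window is an assumption about how "per minute" would be counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy knobs; the values come from the discussion above. */
#define DMA_ERRORS_BEFORE_PIO	3	/* fall back to PIO after 3 DMA errors */
#define MAX_ERRORS_PER_WINDOW	100	/* "more than 100 errors per minute" */
#define WINDOW_SECONDS		60

enum drive_action { KEEP_DMA, FALL_BACK_TO_PIO, FAIL_DRIVE };

struct drive_err_state {
	unsigned int dma_errors;	/* DMA errors seen since start */
	unsigned int window_errors;	/* errors in the current window */
	long window_start;		/* start of the window, in seconds */
	bool window_valid;		/* false until the first error */
};

/* Record one DMA error at time 'now' and say what the policy suggests. */
static enum drive_action record_dma_error(struct drive_err_state *s, long now)
{
	assert(s != NULL);

	/* Open a fresh window on the first error or once a minute passed. */
	if (!s->window_valid || now - s->window_start >= WINDOW_SECONDS) {
		s->window_start = now;
		s->window_errors = 0;
		s->window_valid = true;
	}
	s->window_errors++;
	s->dma_errors++;

	if (s->window_errors > MAX_ERRORS_PER_WINDOW)
		return FAIL_DRIVE;	/* error rate too high */
	if (s->dma_errors >= DMA_ERRORS_BEFORE_PIO)
		return FALL_BACK_TO_PIO;
	return KEEP_DMA;
}
```

With a zero-initialized state, the first two errors leave DMA on, the third suggests PIO, and only a burst of more than 100 errors inside one window suggests failing the drive; a slow trickle of errors never reaches that last step.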
> --linas
MBR, Sergei