From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
To: Linux Kernel mailing list <linux-kernel@vger.kernel.org>
Subject: [RFC] How drivers notice a HW error?
Date: Thu, 27 Nov 2003 17:28:02 +0900
Message-ID: <023401c3b4c0$5fb40660$a8647c0a@seto>
Hi all,
This is a request for comments, especially comments from driver developers.
On some platforms, for example IA64, the chipset detects an error caused by a
driver's operation, such as an I/O read, and reports it to the kernel. The Linux
kernel analyzes the error and decides whether to kill the driver or, at worst,
reboot.
I want to convey the error information to the offending driver, so that the
driver can recover from the failed operation.
So, as a rough plan, I am thinking about a readb_check function that can
return an error value if an error occurred on the read. Drivers could use
readb_check instead of the usual readb, and could decide from its return value
whether a retry is required.
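To make the intended driver-side usage concrete, here is a minimal user-space
sketch. Everything in it is an assumption for illustration: the signature of
readb_check (0 on success, -1 on a reported hardware error), the fake_dev
structure that simulates a faulty device, and the read_reg_retry helper. It
is not a proposed kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulated device: the next `faults_left` reads fail. */
struct fake_dev {
    uint8_t reg;      /* value held by the device register */
    int faults_left;  /* number of reads that will still report an error */
};

/*
 * Hypothetical readb_check: returns 0 and stores the byte in *val on
 * success, or -1 if a hardware error was reported for this read.
 */
static int readb_check(struct fake_dev *dev, uint8_t *val)
{
    assert(dev != NULL && val != NULL);
    if (dev->faults_left > 0) {
        dev->faults_left--;
        return -1;      /* error reported, *val is left untouched */
    }
    *val = dev->reg;
    return 0;
}

/* Driver-side helper: retry the read up to max_tries times. */
static int read_reg_retry(struct fake_dev *dev, uint8_t *val, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (readb_check(dev, val) == 0)
            return 0;   /* read succeeded */
    }
    return -1;          /* give up, let the caller fail the operation */
}
```

A driver would call read_reg_retry where it calls readb today, and treat a
-1 result as a failed operation instead of consuming bad data.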
To realize this, I am considering the following two approaches:
+ readb_check in the driver (with Notifier)
[Outline]:
- The hardware error handler (for example, the MCA handler on IA64) has a
Notifier as a hook point.
- A driver may register a hook function with the Notifier.
- The Notifier calls the registered functions in turn when an error occurs.
- A called hook function checks the address of the error, and if the error
seems to concern its own driver, it raises an internal error flag and
stops the Notifier by returning OK.
- The hardware error handler looks at the result of the Notifier and decides
whether the system can resume.
- The resumed driver may check the error flag after a read, and may retry the
read if the flag is set.
[Issue]:
- Some interfaces, such as one for registering hooks, would be required.
- Writing a hook function would be a burden for driver developers.
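The Notifier flow outlined above could look roughly like the following
user-space sketch. All names here (hw_err_notifier, hw_err_register,
hw_err_notify, drv_state, drv_hook, HOOK_OK, HOOK_PASS) are hypothetical and
exist only for illustration; the sketch does not use any existing kernel
notifier API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HOOK_MAX 8
enum { HOOK_PASS = 0, HOOK_OK = 1 };  /* hook return values (assumed) */

typedef int (*hw_err_hook)(uintptr_t err_addr, void *ctx);

/* Hypothetical Notifier: a fixed-size list of registered hooks. */
struct hw_err_notifier {
    hw_err_hook fn[HOOK_MAX];
    void *ctx[HOOK_MAX];
    int count;
};

/* A driver registers its hook; returns 0 on success, -1 if the list is full. */
static int hw_err_register(struct hw_err_notifier *n, hw_err_hook fn, void *ctx)
{
    assert(n != NULL && fn != NULL);
    if (n->count >= HOOK_MAX)
        return -1;
    n->fn[n->count] = fn;
    n->ctx[n->count] = ctx;
    n->count++;
    return 0;
}

/*
 * Called by the error handler. Walks the hooks until one returns HOOK_OK.
 * Returns 1 if a driver claimed the error (system may resume), else 0.
 */
static int hw_err_notify(struct hw_err_notifier *n, uintptr_t err_addr)
{
    for (int i = 0; i < n->count; i++) {
        if (n->fn[i](err_addr, n->ctx[i]) == HOOK_OK)
            return 1;
    }
    return 0;
}

/* Per-driver state: the I/O range it owns and its internal error flag. */
struct drv_state {
    uintptr_t base;
    size_t len;
    volatile int err_flag;
};

/* Driver hook: claim the error only if the address is in our range. */
static int drv_hook(uintptr_t err_addr, void *ctx)
{
    struct drv_state *d = ctx;

    if (err_addr >= d->base && err_addr - d->base < d->len) {
        d->err_flag = 1;   /* the driver checks this after its read */
        return HOOK_OK;    /* stop the Notifier */
    }
    return HOOK_PASS;      /* not ours, let the next hook look at it */
}
```

After being resumed, the driver would test err_flag following each read,
clear it, and retry the read if it was set.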
+ readb_check in the kernel
[Outline]:
- The kernel provides a readb_check function.
- Drivers may use readb_check instead of the usual readb.
- The hardware error handler checks the address of the error, and if the error
occurred in readb_check, it changes the return value of readb_check and
resumes the interrupted context.
- The driver may check the return value to notice an error in the last read.
[Issue]:
- Some overhead would be involved. (Possibly it is negligible, since
I/O reads are already horribly slow.)
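The kernel-side variant can be sketched in user space as follows. This is
only a simulation under stated assumptions: a real handler would compare the
interrupted instruction address against readb_check, whereas this sketch
uses an "in flight" marker as a stand-in, and the fault is injected by a
parameter. The names sim_readb_check, sim_raw_readb and sim_hw_error_handler
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State shared between the checked read and the simulated error handler. */
static volatile int in_checked_read; /* nonzero while a checked read runs */
static volatile int read_failed;     /* set by the handler on an error */

/*
 * Simulated hardware error handler. Returns 1 if the interrupted context
 * may be resumed (the error hit a checked read), 0 if it is not recoverable.
 */
static int sim_hw_error_handler(void)
{
    if (!in_checked_read)
        return 0;        /* error outside readb_check: cannot recover */
    read_failed = 1;     /* "change the return value" of the checked read */
    return 1;
}

/* Simulated raw read; inject_fault stands in for a real hardware error. */
static uint8_t sim_raw_readb(const uint8_t *reg, int inject_fault)
{
    if (inject_fault) {
        sim_hw_error_handler();
        return 0xff;     /* data is not trustworthy after an error */
    }
    return *reg;
}

/* Checked read: 0 on success with *val set, -1 if an error was reported. */
static int sim_readb_check(const uint8_t *reg, uint8_t *val, int inject_fault)
{
    uint8_t tmp;

    assert(reg != NULL && val != NULL);
    read_failed = 0;
    in_checked_read = 1;
    tmp = sim_raw_readb(reg, inject_fault);
    in_checked_read = 0;

    if (read_failed)
        return -1;       /* *val is left untouched */
    *val = tmp;
    return 0;
}
```

The extra cost on the success path is two flag writes and one test per read,
which is the kind of overhead the issue above refers to.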
IMO, this is a general-purpose function that should be available on many
platforms. I have also heard that Solaris has a similar implementation.
If you have any comments on this feature, or a different idea,
please tell me.
Best regards,
------
H.Seto <seto.hidetoshi@jp.fujitsu.com>