From: Jeff Garzik <jgarzik@pobox.com>
To: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Linux Kernel list <linux-kernel@vger.kernel.org>,
linux-pci@atrey.karlin.mff.cuni.cz, linux-ia64@vger.kernel.org,
Linus Torvalds <torvalds@osdl.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Linas Vepstas <linas@austin.ibm.com>,
"Luck, Tony" <tony.luck@intel.com>
Subject: Re: [PATCH/RFC] I/O-check interface for driver's error handling
Date: Tue, 01 Mar 2005 11:37:24 -0500
Message-ID: <42249A44.4020507@pobox.com>
In-Reply-To: <422428EC.3090905@jp.fujitsu.com>
Hidetoshi Seto wrote:
> Hi, long time no see :-)
>
> Currently, I/O errors are not a leading cause of system failure.
> However, as Linux scalability improves and an ever larger number
> of PCI devices is connected to a single high-performance server,
> the risk of I/O errors is increasing day by day.
>
> For example, a PCI parity error is one of the most common errors
> in the hardware world. However, the major cause of parity errors
> is not a permanent hardware fault but a transient (soft) one: low
> voltage, humidity, natural radiation, etc. Even so, some platforms
> are sensitive enough to parity errors that they shut down the
> system immediately on such an error. So if device drivers can retry
> a transaction that results in an error, we can reduce the risk of
> I/O errors.
>
> So I'd like to suggest new interfaces that enable drivers to
> check for errors and retry their I/O transactions easily.

I have been thinking about PCI system and parity errors, and how to
handle them. I do not think this is the correct approach.

A simple retry is... too simple. If you are having a massive problem on
your PCI bus, more action should be taken than a retry.

In my opinion, each driver needs to be aware of PCI system/parity errors
and handle them. For network drivers, this is rather simple: check the
hardware, then restart the DMA engine, possibly turning off
TSO/checksum offload to guarantee that bad packets are not accepted.
For SATA and SCSI drivers, this is more complex, as one must retry a
number of queued disk commands after resetting the hardware.

A new API handles none of this.
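
To make the per-driver handling described above a little more concrete, here is a minimal userspace sketch in C. It is not kernel code and not anything proposed in this thread: every type and function name (`fake_nic`, `fake_hba`, `fake_nic_recover`, `fake_hba_recover`, the `max_retries` limit) is hypothetical, and real hardware access is replaced by plain struct fields. It only illustrates the two shapes of recovery mentioned: a network-style "check, then restart DMA with offloads disabled", and a storage-style "reset, then requeue the commands that were in flight".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Everything below is hypothetical; none of it is a real kernel API. */

/* Simulated network device state. */
struct fake_nic {
    bool bus_error;     /* latched PCI system/parity error */
    bool dma_running;
    bool tso_enabled;
    bool csum_offload;
};

/* Network-style recovery: check the hardware, then restart the DMA
 * engine with offloads disabled so bad packets are not trusted.
 * Returns true if a recovery was performed. */
bool fake_nic_recover(struct fake_nic *nic)
{
    assert(nic != NULL);
    if (!nic->bus_error)
        return false;            /* nothing latched, nothing to do */

    nic->dma_running = false;    /* quiesce DMA before touching state */
    nic->tso_enabled = false;    /* fall back to software segmentation */
    nic->csum_offload = false;   /* fall back to software checksums */
    nic->bus_error = false;      /* acknowledge the latched error */
    nic->dma_running = true;     /* restart the DMA engine */
    return true;
}

enum cmd_state { CMD_QUEUED, CMD_IN_FLIGHT, CMD_DONE, CMD_FAILED };

struct fake_cmd {
    enum cmd_state state;
    int retries;
};

/* Simulated storage host adapter with a table of outstanding commands. */
struct fake_hba {
    bool bus_error;
    bool dma_running;
    struct fake_cmd *cmds;
    size_t ncmds;
};

/* Storage-style recovery: reset the adapter, then requeue every command
 * that was in flight. Commands that already used up max_retries are
 * failed instead. Returns the number of commands requeued. */
size_t fake_hba_recover(struct fake_hba *hba, int max_retries)
{
    size_t requeued = 0;

    assert(hba != NULL);
    if (!hba->bus_error)
        return 0;

    hba->dma_running = false;    /* stands in for a hardware reset */
    hba->bus_error = false;

    for (size_t i = 0; i < hba->ncmds; i++) {
        struct fake_cmd *c = &hba->cmds[i];

        if (c->state != CMD_IN_FLIGHT)
            continue;            /* queued or finished: leave untouched */
        if (c->retries >= max_retries) {
            c->state = CMD_FAILED;
            continue;
        }
        c->retries++;
        c->state = CMD_QUEUED;   /* will be reissued after restart */
        requeued++;
    }

    hba->dma_running = true;
    return requeued;
}
```

Even in this toy form, the storage path has to track per-command state and a retry budget, which is the kind of driver-specific bookkeeping a generic "retry the transaction" call would not cover.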
Jeff