From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Linas Vepstas <linas@austin.ibm.com>
Cc: Yanmin Zhang <yanmin.zhang@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Rajesh Shah <rajesh.shah@intel.com>,
	linuxppc-dev@ozlabs.org,
	linux-pci maillist <linux-pci@atrey.karlin.mff.cuni.cz>
Subject: Re: pci error recovery procedure
Date: Thu, 07 Sep 2006 09:56:19 +0800
Message-ID: <1157594179.20092.451.camel@ymzhang-perf.sh.intel.com>
In-Reply-To: <20060906200155.GL7139@austin.ibm.com>

On Thu, 2006-09-07 at 04:01, Linas Vepstas wrote:
> On Wed, Sep 06, 2006 at 09:26:56AM +0800, Zhang, Yanmin wrote:
> > > > The
> > > > error_detected of the drivers in the latest kernel who support err handlers
> > > > always returns PCI_ERS_RESULT_NEED_RESET. They are typical examples.
> > > 
> > > Just because the current drivers do it this way does not mean that this is
> > > the best way to do things.
> >
> > If it's not the best way, why did you choose to reset slot for e1000/e100/ipr
> > error handlers? They are typical widely-used devices. To make it easier to
> > add error handlers?
> 
> I did it that way just to get going, get something working. I do not
> have hardware specs for any of these devices, and do not have much of 
> an idea of what they are capable of;
Yes, it's difficult for people who are not the driver developers to add
fine-grained error handlers.

>  the recovery code I wrote is of
> "brute force, hit it with a hammer"-nature.  Driver writers who 
> know their hardware well, and are interested in a more refined
> approach are encouraged to actually use a more refined approach.
I guess almost no driver developer is happy to spend lots of time adding
refined steps. They would rather focus on the normal code path (for the
feeling of achievement? :) ).
In addition, if they use fine-grained steps in error handlers, all these
steps might have to be rewritten when the device spec is upgraded.
Fine-grained steps in error handlers are also more difficult to debug.

It's impossible for you to develop error handlers for all device drivers.

The error handlers look a little like suspend/resume. Of course, they are more
complicated. If we could keep them as simple as suspend/resume, that would be
more welcome.

PCI errors shouldn't happen frequently. And when one happens, I think it mostly
involves an endpoint device rather than a bridge. If we always choose to reset
the slot, performance could be degraded, but not by much. This is only my
deduction; I haven't tested it on a machine with hundreds of devices.
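
As a rough illustration of the "always reset the slot" flow discussed here,
the following is a userspace sketch, not kernel code. The enum mirrors the
kernel's PCI_ERS_RESULT_* names, and the handler table mirrors the
error_detected/slot_reset/resume callbacks of struct pci_error_handlers.
struct mock_dev, mock_recover() and the brute_* functions are hypothetical
names made up for this example, and the real platform code does more (for
instance, it performs the actual slot reset before calling slot_reset).

```c
/* Userspace sketch only: simplified stand-ins for kernel types. */
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's PCI_ERS_RESULT_* values by name only. */
enum ers_result {
    ERS_RESULT_NONE,
    ERS_RESULT_CAN_RECOVER,
    ERS_RESULT_NEED_RESET,
    ERS_RESULT_DISCONNECT,
    ERS_RESULT_RECOVERED
};

/* Hypothetical device state, just enough to observe the flow. */
struct mock_dev {
    int io_enabled;   /* 1 while the driver may touch the device */
    int resets;       /* number of slot resets seen so far */
};

/* Simplified analogue of struct pci_error_handlers. */
struct mock_err_handlers {
    enum ers_result (*error_detected)(struct mock_dev *dev);
    enum ers_result (*slot_reset)(struct mock_dev *dev);
    void (*resume)(struct mock_dev *dev);
};

/* "Hit it with a hammer": stop I/O and always ask for a slot reset. */
static enum ers_result brute_error_detected(struct mock_dev *dev)
{
    dev->io_enabled = 0;
    return ERS_RESULT_NEED_RESET;
}

/* Called after the (simulated) reset: reinitialize from scratch. */
static enum ers_result brute_slot_reset(struct mock_dev *dev)
{
    dev->resets++;
    return ERS_RESULT_RECOVERED;
}

/* Restart normal operation, much like a resume callback. */
static void brute_resume(struct mock_dev *dev)
{
    dev->io_enabled = 1;
}

const struct mock_err_handlers brute_handlers = {
    .error_detected = brute_error_detected,
    .slot_reset     = brute_slot_reset,
    .resume         = brute_resume,
};

/* Hypothetical stand-in for the platform's recovery sequence. */
enum ers_result mock_recover(const struct mock_err_handlers *h,
                             struct mock_dev *dev)
{
    enum ers_result r;

    assert(h != NULL && dev != NULL);
    r = h->error_detected(dev);
    if (r == ERS_RESULT_NEED_RESET)
        r = h->slot_reset(dev);   /* a real platform resets the slot first */
    if (r == ERS_RESULT_RECOVERED)
        h->resume(dev);
    return r;
}
```

Calling mock_recover(&brute_handlers, &dev) on a device with I/O enabled
walks error_detected, slot_reset and resume in that order. The driver side
stays about as small as a suspend/resume pair, at the cost of a full reset
on every error.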
