linux-hotplug.vger.kernel.org archive mirror
* Re: How to find hotplug slot of PCI dev?
@ 2005-06-29 23:00 Linas Vepstas
  2005-06-29 23:26 ` Greg KH
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Linas Vepstas @ 2005-06-29 23:00 UTC (permalink / raw)
  To: linux-hotplug

On Wed, Jun 29, 2005 at 03:44:38PM -0700, Greg KH was heard to remark:
> On Wed, Jun 29, 2005 at 05:05:08PM -0500, Linas Vepstas wrote:
> > 
> > Hi,
> > 
> > I can't think of any easy way of (generically) finding the pointer to
> > struct hotplug_slot  if I have a pointer to struct pci_dev in hand.
> 
> There is no way, sorry.  Remember, multiple pci_dev can point to a
> single hotplug slot.

Right. I have a pointer to a pci_dev. I want to find the pointer to the
hotplug slot it's in.

> > I am sorely tempted to write a generic patch to
> > drivers/pci/hotplug/pci_hotplug_core.c to add this support
> > (getting it from each of the hotplug arch'es).  Should I?
> > If I submit such a patch, would it get rejected out of hand
> > as a bad idea?
> > 
> > I need this for the generic fallback PCI error recovery support;
> 
> Why?

I don't understand the question...

Because after discussions on the LKML, everyone decided that this
code should be written in as generic a way as possible, rather
than arch-dependent, because there will soon be a variety of PCI
chipsets that will be able to detect PCI errors. 

> > I have a pointer to the device; if it's in a hotplug slot,
> > I want to toggle power to it.
> 
> No you do not.  That's a userspace policy, not something the kernel
> should do.

Argh.  I thought we've run around the table a few times on this issue
already.  It is impossible for userspace to recover from PCI errors.
The canonical example is the failure of a disk system underneath a block
device. 

All other bus errors are handled in the device drivers; for example,
scsi errors are handled by scsi drivers and/or scsi-generic code.
fiber-channel errors are handled by the fiber channel controllers.
Ethernet hangs are handled by ethernet watchdogs in each ethernet
driver.  I think it's unrealistic at this point to try to turn PCI
error recovery into a userspace policy.


--linas



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_idt77&alloc_id\x16492&op=click
_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel


* Re: How to find hotplug slot of PCI dev?
  2005-06-29 23:00 How to find hotplug slot of PCI dev? Linas Vepstas
@ 2005-06-29 23:26 ` Greg KH
  2005-06-30  0:14 ` Linas Vepstas
  2005-06-30  0:36 ` Greg KH
  2 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2005-06-29 23:26 UTC (permalink / raw)
  To: linux-hotplug

On Wed, Jun 29, 2005 at 06:00:19PM -0500, Linas Vepstas wrote:
> On Wed, Jun 29, 2005 at 03:44:38PM -0700, Greg KH was heard to remark:
> > On Wed, Jun 29, 2005 at 05:05:08PM -0500, Linas Vepstas wrote:
> > > 
> > > Hi,
> > > 
> > > I can't think of any easy way of (generically) finding the pointer to
> > > struct hotplug_slot  if I have a pointer to struct pci_dev in hand.
> > 
> > There is no way, sorry.  Remember, multiple pci_dev can point to a
> > single hotplug slot.
> 
> Right. I have a pointer to a pci_dev. I want to find the pointer to the
> hotplug slot it's in.

Again, not possible.

> > > I am sorely tempted to write a generic patch to
> > > drivers/pci/hotplug/pci_hotplug_core.c to add this support
> > > (getting it from each of the hotplug arch'es).  Should I?
> > > If I submit such a patch, would it get rejected out of hand
> > > as a bad idea?
> > > 
> > > I need this for the generic fallback PCI error recovery support;
> > 
> > Why?
> 
> I don't understand the question...
> 
> Because after discussions on the LKML, everyone decided that this
> code should be written in as generic a way as possible, rather
> than arch-dependent, because there will soon be a variety of PCI
> chipsets that will be able to detect PCI errors. 

True.

> > > I have a pointer to the device; if it's in a hotplug slot,
> > > I want to toggle power to it.
> > 
> > No you do not.  That's a userspace policy, not something the kernel
> > should do.
> 
> Argh.  I thought we've run around the table a few times on this issue
> already.  It is impossible for userspace to recover from PCI errors.
> The canonical example is the failure of a disk system underneath a block
> device. 

Yes, but don't go power-cycling the whole pci slot on me.  That's just
insane.

> All other bus errors are handled in the device drivers; for example,
> scsi errors are handled by scsi drivers and/or scsi-generic code.
> fiber-channel errors are handled by the fiber channel controllers.
> Ethernet hangs are handled by ethernet watchdogs in each ethernet
> driver.  I think it's unrealistic at this point to try to turn PCI
> error recovery into a userspace policy.

Ok, but again, realize that multiple pci_dev can point to the same pci
hotplug slot.  Are you going to want to power-cycle all of them (think
multi-port ethernet card, multi-controller usb device, multi-device scsi
card, etc.)?  That's just a bad idea.

We did go back and forth a few times about this, yes, and I still think
that you need to notify userspace that something bad is happening, and
let it do the complex stuff if it wants to.  And, if you think you can
do some simple things in your driver, do that.  But please, don't go
power-cycling a pci device, that is just mean, and is probably against
the PCI hotplug spec also.

thanks,

greg k-h



* Re: How to find hotplug slot of PCI dev?
  2005-06-29 23:00 How to find hotplug slot of PCI dev? Linas Vepstas
  2005-06-29 23:26 ` Greg KH
@ 2005-06-30  0:14 ` Linas Vepstas
  2005-06-30  0:36 ` Greg KH
  2 siblings, 0 replies; 4+ messages in thread
From: Linas Vepstas @ 2005-06-30  0:14 UTC (permalink / raw)
  To: linux-hotplug

On Wed, Jun 29, 2005 at 04:26:33PM -0700, Greg KH was heard to remark:
> On Wed, Jun 29, 2005 at 06:00:19PM -0500, Linas Vepstas wrote:
> > On Wed, Jun 29, 2005 at 03:44:38PM -0700, Greg KH was heard to remark:
> > > On Wed, Jun 29, 2005 at 05:05:08PM -0500, Linas Vepstas wrote:
> > > > 
> > > > Hi,
> > > > 
> > > > I can't think of any easy way of (generically) finding the pointer to
> > > > struct hotplug_slot  if I have a pointer to struct pci_dev in hand.
> > > 
> > > There is no way, sorry.  Remember, multiple pci_dev can point to a
> > > single hotplug slot.
> > 
> > Right. I have a pointer to a pci_dev. I want to find the pointer to the
> > hotplug slot it's in.
> 
> Again, not possible.

? There is an existing call, called "rpaphp_find_hotplug_slot" and it
works fine.  I was toying with the idea of wrapping some generic code
around it, say, for example

struct hotplug_slot *pci_hp_find_slot (struct pci_dev *);

So I'm not sure what you mean by "not possible".  I skimmed the other
hotplug systems, it seemed quite possible to implement on those as well.
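
For illustration only, here is a minimal userspace sketch of what such a wrapper might look like, assuming each hotplug backend registers its own lookup callback. Only the pci_hp_find_slot prototype and the name rpaphp_find_hotplug_slot come from this thread. The struct definitions are simplified stand-ins for the kernel's, and pci_hp_register_finder, hp_find_fn and demo_find_slot are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: the real
 * struct pci_dev and struct hotplug_slot carry far more state). */
struct hotplug_slot { const char *name; };
struct pci_dev { unsigned int bus; unsigned int devfn; };

/* Hypothetical per-backend lookup, in the spirit of
 * rpaphp_find_hotplug_slot(): returns NULL if the backend does not
 * own the device. */
typedef struct hotplug_slot *(*hp_find_fn)(struct pci_dev *);

#define MAX_FINDERS 8
static hp_find_fn finders[MAX_FINDERS];
static size_t nr_finders;

/* Hypothetical registration hook; returns 0 on success, -1 if full. */
int pci_hp_register_finder(hp_find_fn fn)
{
	assert(fn != NULL);
	if (nr_finders == MAX_FINDERS)
		return -1;
	finders[nr_finders++] = fn;
	return 0;
}

/* The generic wrapper proposed above: ask each backend in turn. */
struct hotplug_slot *pci_hp_find_slot(struct pci_dev *dev)
{
	size_t i;

	if (dev == NULL)
		return NULL;
	for (i = 0; i < nr_finders; i++) {
		struct hotplug_slot *slot = finders[i](dev);
		if (slot)
			return slot;
	}
	return NULL; /* not in any hotplug slot */
}

/* Toy backend: one slot that owns every function on bus 1. */
static struct hotplug_slot demo_slot = { "slot1" };

struct hotplug_slot *demo_find_slot(struct pci_dev *dev)
{
	return dev->bus == 1 ? &demo_slot : NULL;
}
```

In this sketch the lookup is many-to-one: every function on the same bus resolves to the same slot pointer, which is the direction of lookup being asked for here.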

> Yes, but don't go power-cycling the whole pci slot on me.  That's just
> insane.

Why is that insane? It works, and it works well, for those devices that
are able to listen to hotplug events and do something with them.

> > All other bus errors are handled in the device drivers; for example,
> > scsi errors are handled by scsi drivers and/or scsi-generic code.
> > fiber-channel errors are handled by the fiber channel controllers.
> > Ethernet hangs are handled by ethernet watchdogs in each ethernet
> > driver.  I think it's unrealistic at this point to try to turn PCI
> > error recovery into a userspace policy.
> 
> Ok, but again, realize that multiple pci_dev can point to the same pci
> hotplug slot.  Are you going to want to power-cycle all of them (think
> multi-port ethernet card, multi-controller usb device, multi-device scsi
> card, etc.)?  That's just a bad idea.

Yes, but a PCI error will take out *all* of the functions plugged into a
slot.  It might even take out multiple slots, if the error occurred on a
cable connecting the CPU to the drawer with the pci cards in it.  

As to multi-device scsi cards, scsi device drivers already have a reset
sequence that takes down the entire scsi bus and, if needed,
reboots the scsi host adapter as well; it's been there since kernel-1.0
at least.

As to multi-port ethernet cards, they multi-port burp as well.  If you
have a 4-port Intel e100, it will have four instances of the e100 
driver attached to it. If one of the e100 drivers detects a problem, it
will reset the card, which will affect *all four* ports, and not just
one.

Don't know anything about USB, but I assume that there is such a thing
as a USB controller reset, and that will take out the entire chain
until the reset completes.

> We did go back and forth a few times about this, yes, and I still think
> that you need to notify userspace that something bad is happening, and
> let it do the complex stuff if it wants to.  And, if you think you can
> do some simple things in your driver, do that.  But please, don't go
> power-cycling a pci device, that is just mean, and is probably against
> the PCI hotplug spec also.

It's not all that much meaner than asserting the #RST line.  The point
was that toggling the power was an effective way of dealing with devices
for which the device drivers don't support PCI error recovery.  But now
that I've got 5 drivers that do handle recovery, and some more in the
works, maybe in fact I don't much need hotplug-based recovery any more, 
and I could just let that drop.

--linas



* Re: How to find hotplug slot of PCI dev?
  2005-06-29 23:00 How to find hotplug slot of PCI dev? Linas Vepstas
  2005-06-29 23:26 ` Greg KH
  2005-06-30  0:14 ` Linas Vepstas
@ 2005-06-30  0:36 ` Greg KH
  2 siblings, 0 replies; 4+ messages in thread
From: Greg KH @ 2005-06-30  0:36 UTC (permalink / raw)
  To: linux-hotplug

On Wed, Jun 29, 2005 at 07:14:43PM -0500, Linas Vepstas wrote:
> Yes, but a PCI error will take out *all* of the functions plugged into a
> slot.  It might even take out multiple slots, if the error occurred on a
> cable connecting the CPU to the drawer with the pci cards in it.  

Then all of those pci_dev will get notified, right?  And if they all
start to try to power cycle the same card, bad things will probably
happen, right?

Again, refer to the PCI Hotplug spec for the fact that I do not think
this is allowable behaviour.

> Don't know anything about USB, but I assume that there is such a thing
> as a USB controller reset, and that will take out the entire chain
> until the reset completes.

There can be multiple USB controllers on a single PCI card, that are
independent of each other (much like network controllers can be.)

And USB 2.0 controllers have a built-in pci bridge with a USB 1.1
controller attached to it.  So it can be quite deep just at the PCI
layer.

> It's not all that much meaner than asserting the #RST line.  The point
> was that toggling the power was an effective way of dealing with devices
> for which the device drivers don't support PCI error recovery.  But now
> that I've got 5 drivers that do handle recovery, and some more in the
> works, maybe in fact I don't much need hotplug-based recovery any more, 
> and I could just let that drop.

Sounds like a wise thing :)

thanks,

greg k-h


