linux-hotplug.vger.kernel.org archive mirror
* hotplug remove vs. device driver close
@ 2004-06-02 23:14 linas
  2004-06-02 23:28 ` Greg KH
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: linas @ 2004-06-02 23:14 UTC (permalink / raw)
  To: linux-hotplug

Hi,

We are hitting a situation where we are hot-plug removing a pci card before
closing the device driver.  This seems to lead to kernel memory leaks if not
outright crashes. I'm trying to understand what the correct solution to this
is supposed to be.

For example: 'ifup eth0' and 'ifdown eth0' are what usually cause an ethernet
device driver to be opened/closed.  Separately, we have a userland tool that
can be used to power off the pci slot, and thus perform a hotplug unconfigure 
in the kernel (i.e. calls pci_remove_bus_device()).   Thus, the sysadmin 
currently has the power to hot-remove a device without first closing the 
device driver.  Surely, this is bad. (Right?)  But how is this supposed to
be handled?

Please don't tell me that a good sysadmin should never do that ... in the 
hothouse of the server room, crazy stuff happens and it should not result 
in a server crash so easily ... 

I'm hoping that the answer also isn't that 'the hotplug scripts should 
do that', since hotplug scripts can be buggy, or can crash for many reasons;
such events shouldn't bring down the kernel.

So I conclude two possibilities:

-- All device drivers should watch for hotplug remove, and close themselves
   down in such an event

-- The syscall that allows the pci slot to be powered off should also 
   go through the steps of closing the device driver first. 

Is there another possibility?  What's the right way of handling this?


--linas
 






-------------------------------------------------------
This SF.Net email is sponsored by the new InstallShield X.
From Windows to Linux, servers to mobile, InstallShield X is the one
installation-authoring solution that does it all. Learn more and
evaluate today! http://www.installshield.com/Dev2Dev/0504
_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
@ 2004-06-02 23:28 ` Greg KH
  2004-06-03  1:40 ` Anton Blanchard
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-02 23:28 UTC (permalink / raw)
  To: linux-hotplug

On Wed, Jun 02, 2004 at 06:14:55PM -0500, linas@austin.ibm.com wrote:
> 
> Hi,
> 
> We are hitting a situation where we are hot-plug removing a pci card before
> closing the device driver.  This seems to lead to kernel memory leaks if not
> outright crashes. I'm trying to understand what the correct solution to this
> is supposed to be.

To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"

> For example: 'ifup eth0' and 'ifdown eth0' are what usually cause an ethernet
> device driver to be opened/closed.  Separately, we have a userland tool that
> can be used to power off the pci slot, and thus perform a hotplug unconfigure
> in the kernel (i.e. calls pci_remove_bus_device()).   Thus, the sysadmin
> currently has the power to hot-remove a device without first closing the
> device driver.  Surely, this is bad. (Right?)  But how is this supposed to
> be handled?

Again, do not do that.

> Please don't tell me that a good sysadmin should never do that ... in the
> hothouse of the server room, crazy stuff happens and it should not result
> in a server crash so easily ...

Tough, do not do that.

That being said, a lot of the PCI drivers can recover from this as they
also work for PCMCIA devices, and they need to be able to handle this.
It is possible, and pretty simple to fix within the driver itself.

> I'm hoping that the answer also isn't that 'the hotplug scripts should
> do that', since hotplug scripts can be buggy, or can crash for many reasons;
> such events shouldn't bring down the kernel.

No, it's not a hotplug script issue.

> So I conclude two possibilities:
> 
> -- All device drivers should watch for hotplug remove, and close themselves
>    down in such an event

No, they should watch for errors when trying to read and write from
their devices and if that happens, handle it properly.  The kernel will
tell them at some time that the device is really gone by calling the
disconnect() callback (for a PCI driver, the remove() method of its
struct pci_driver).

> -- The syscall that allows the pci slot to be powered off should also
>    go through the steps of closing the device driver first.

No.  Read the PCI Hotplug spec.

> Is there another possibility?  What's the right way of handling this?

See above.

What driver is dying for you?  It should be quite easy to fix.

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
  2004-06-02 23:28 ` Greg KH
@ 2004-06-03  1:40 ` Anton Blanchard
  2004-06-03 16:20 ` Greg KH
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Anton Blanchard @ 2004-06-03  1:40 UTC (permalink / raw)
  To: linux-hotplug


Hi,

> > We are hitting a situation where we are hot-plug removing a pci card
> > before closing the device driver.  This seems to lead to kernel
> > memory leaks if not outright crashes. I'm trying to understand what
> > the correct solution to this is supposed to be.
> 
> To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"

How do you currently guarantee this on cardbus?

Anton



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
  2004-06-02 23:28 ` Greg KH
  2004-06-03  1:40 ` Anton Blanchard
@ 2004-06-03 16:20 ` Greg KH
  2004-06-03 18:50 ` linas
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-03 16:20 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 11:40:04AM +1000, Anton Blanchard wrote:
> 
> Hi,
> 
> > > We are hitting a situation where we are hot-plug removing a pci card
> > > before closing the device driver.  This seems to lead to kernel
> > > memory leaks if not outright crashes. I'm trying to understand what
> > > the correct solution to this is supposed to be.
> > 
> > To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"
> 
> How do you currently guarantee this on cardbus?

We make no such guarantee.  As I stated, the Cardbus/PCMCIA drivers handle
this quite easily, so it is pretty simple to fix up a PCI driver to also
handle this.

But the main answer is that the PCI Hotplug spec states that the OS does
NOT have to protect against this happening to regular PCI devices.

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (2 preceding siblings ...)
  2004-06-03 16:20 ` Greg KH
@ 2004-06-03 18:50 ` linas
  2004-06-03 19:02 ` Greg KH
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: linas @ 2004-06-03 18:50 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 09:20:20AM -0700, Greg KH wrote:
> On Thu, Jun 03, 2004 at 11:40:04AM +1000, Anton Blanchard wrote:
> > 
> > > > We are hitting a situation where we are hot-plug removing a pci card
> > > > before closing the device driver.  This seems to lead to kernel
> > > > memory leaks if not outright crashes. I'm trying to understand what
> > > > the correct solution to this is supposed to be.
> > > 
> > > To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"
> > 
> > How do you currently guarantee this on cardbus?
> 
> We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> quite easily, so it is pretty simple to fix up a PCI driver to also
> handle this.
>
> But the main answer is that the PCI Hotplug spec states that the OS does
> NOT have to protect for this happening to regular PCI devices.

So if I understand what you are saying: if the OS crashes because of 
a sysadmin error or a script error during pci hotplug remove, that's 
considered OK?

I understand why the PCI spec would say that: they have no desire 
to over-burden already struggling OS developers: the PCI spec 
committee probably thinks in terms of "provide function not policy".
That's normal and as it should be.

But in the five-9's world of high availability, automatic failover, 
etc. etc. this sure sounds like a great way of putting executives
on a warpath.  I humbly suggest that the Linux kernel policy should 
be that we do better than the PCI spec, and attempt to minimize damage
due to operator error.  If not all drivers or tools or subsystems
adhere to this policy, so be it, but robustness should be a goal.


--linas






^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (3 preceding siblings ...)
  2004-06-03 18:50 ` linas
@ 2004-06-03 19:02 ` Greg KH
  2004-06-03 19:23 ` Don Fry
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-03 19:02 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 01:50:44PM -0500, linas@austin.ibm.com wrote:
> On Thu, Jun 03, 2004 at 09:20:20AM -0700, Greg KH wrote:
> > On Thu, Jun 03, 2004 at 11:40:04AM +1000, Anton Blanchard wrote:
> > > 
> > > > > We are hitting a situation where we are hot-plug removing a pci card
> > > > > before closing the device driver.  This seems to lead to kernel
> > > > > memory leaks if not outright crashes. I'm trying to understand what
> > > > > the correct solution to this is supposed to be.
> > > > 
> > > > To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"
> > > 
> > > How do you currently guarantee this on cardbus?
> > 
> > We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> > quite easily, so it is pretty simple to fix up a PCI driver to also
> > handle this.
> >
> > But the main answer is that the PCI Hotplug spec states that the OS does
> > NOT have to protect for this happening to regular PCI devices.
> 
> So if I understand what you are saying: if the OS crashes because of 
> a sysadmin error or a script error during pci hotplug remove, that's 
> considered OK?

As sysadmin I can delete your whole root fs, and reboot the box into
oblivion.  Are you considering changing this ability too?  :)

If you are really worried about this, then look into a different
permission model for Linux like SELinux.

Or you can simply fix up your PCI driver to properly handle reading all
FF when the device has been removed.  That seems to be what you need to
do to solve this for your small subset of drivers on your platform,
correct?

> I understand why the PCI spec would say that: they have no desire 
> to over-burden already struggling OS developers: the PCI spec 
> committee probably thinks in terms of "provide function not policy".
> That's normal and as it should be.

That's also what the kernel provides, function not policy.  Put your
policy in userspace and force your admin to use a tool that ensures that
the device has properly shutdown anything that is bound to that device
before it tells the kernel to remove it from the system.
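
A minimal userspace sketch of that kind of policy check, assuming the
bound device is a network interface.  It only decides whether a power-off
should be allowed; the actual write to the slot's power file is left out.
The helper names get_if_flags() and may_power_off() are hypothetical, and
refusing when the interface state is unknown is an illustrative choice,
not something the thread prescribes.  SIOCGIFFLAGS and IFF_UP are the
standard Linux interfaces for reading the flags.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: return the interface flags, or -1 on any failure. */
static int get_if_flags(const char *ifname)
{
    struct ifreq ifr;
    int fd, ret;

    assert(ifname != NULL);
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ret = ioctl(fd, SIOCGIFFLAGS, &ifr);
    close(fd);
    if (ret < 0)
        return -1;

    /* ifr_flags is a short; widen it without sign extension. */
    return (unsigned short)ifr.ifr_flags;
}

/* Hypothetical policy: allow power-off only when the interface is down. */
static int may_power_off(int if_flags)
{
    if (if_flags < 0)
        return 0;   /* state unknown: refuse (an assumption of this sketch) */
    return (if_flags & IFF_UP) == 0;
}
```

A tool built on this would call may_power_off(get_if_flags("eth0")) and
only then tell the kernel to remove the device.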

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (4 preceding siblings ...)
  2004-06-03 19:02 ` Greg KH
@ 2004-06-03 19:23 ` Don Fry
  2004-06-03 19:28 ` Greg KH
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Don Fry @ 2004-06-03 19:23 UTC (permalink / raw)
  To: linux-hotplug

> > > We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> > > quite easily, so it is pretty simple to fix up a PCI driver to also
> > > handle this.
> > >
> > > But the main answer is that the PCI Hotplug spec states that the OS does
> > > NOT have to protect for this happening to regular PCI devices.
> >
> > So if I understand what you are saying: if the OS crashes because of
> > a sysadmin error or a script error during pci hotplug remove, that's
> > considered OK?
> 
> As sysadmin I can delete your whole root fs, and reboot the box into
> oblivion.  Are you considering changing this ability too?  :)
> 
> If you are really worried about this, then look into a different
> permission model for Linux like SELinux.
> 
> Or you can simply fix up your PCI driver to properly handle reading all
> FF when the device has been removed.  That seems to be what you need to
> do to solve this for your small subset of drivers on your platform,
> correct?
> 

The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
but that does not include doing a 'close' prior to being removed.  The
driver could keep some state around so that if its remove routine was
called without close first, it would cleanup, but I don't know of any
network driver that does this.

The remove without a close is where the leak/crash might occur.
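
A minimal userspace sketch of the "keep some state around" idea, under
the assumption that a single opened flag is enough to tell whether close
has already run.  The struct and the nic_* functions are hypothetical
stand-ins, not pcnet32 code: nic_remove() closes the device itself if
nobody did, and nic_close() is safe to call again afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-device state; names are illustrative only. */
struct toy_nic {
    bool opened;     /* set by open, cleared by close */
    int *rx_ring;    /* resource that exists only while opened */
    int close_calls; /* counts real teardowns, for demonstration */
};

static int nic_open(struct toy_nic *nic)
{
    if (nic->opened)
        return 0;
    nic->rx_ring = calloc(16, sizeof(*nic->rx_ring));
    if (!nic->rx_ring)
        return -1;
    nic->opened = true;
    return 0;
}

/* Idempotent: does nothing if the device is already closed. */
static void nic_close(struct toy_nic *nic)
{
    if (!nic->opened)
        return;
    free(nic->rx_ring);
    nic->rx_ring = NULL;
    nic->opened = false;
    nic->close_calls++;
}

/* Remove path: close first if nobody did, then release the rest. */
static void nic_remove(struct toy_nic *nic)
{
    nic_close(nic);
    assert(!nic->opened);
    /* ...release the remaining per-device resources here... */
}
```

With this shape, remove-then-close and close-then-remove both tear the
ring down exactly once.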

> > I understand why the PCI spec would say that: they have no desire
> > to over-burden already struggling OS developers: the PCI spec
> > committee probably thinks in terms of "provide function not policy".
> > That's normal and as it should be.
> 
> That's also what the kernel provides, function not policy.  Put your
> policy in userspace and force your admin to use a tool that ensures that
> the device has properly shutdown anything that is bound to that device
> before it tells the kernel to remove it from the system.
> 
> thanks,
> 
> greg k-h
-- 
Don Fry
brazilnut@us.ibm.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (5 preceding siblings ...)
  2004-06-03 19:23 ` Don Fry
@ 2004-06-03 19:28 ` Greg KH
  2004-06-03 19:34 ` linas
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-03 19:28 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 12:23:04PM -0700, Don Fry wrote:
> > > > We make no such guarantee.  As I stated, the Cardbus/PCMCIA handle this
> > > > quite easily, so it is pretty simple to fix up a PCI driver to also
> > > > handle this.
> > > >
> > > > But the main answer is that the PCI Hotplug spec states that the OS does
> > > > NOT have to protect for this happening to regular PCI devices.
> > >
> > > So if I understand what you are saying: if the OS crashes because of
> > > a sysadmin error or a script error during pci hotplug remove, that's
> > > considered OK?
> > 
> > As sysadmin I can delete your whole root fs, and reboot the box into
> > oblivion.  Are you considering changing this ability too?  :)
> > 
> > If you are really worried about this, then look into a different
> > permission model for Linux like SELinux.
> > 
> > Or you can simply fix up your PCI driver to properly handle reading all
> > FF when the device has been removed.  That seems to be what you need to
> > do to solve this for your small subset of drivers on your platform,
> > correct?
> > 
> 
> The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> but that does not include doing a 'close' prior to being removed.  The
> driver could keep some state around so that if its remove routine was
> called without close first, it would cleanup, but I don't know of any
> network driver that does this.
> 
> The remove without a close is where the leak/crash might occur.

That's up to the upper layer, above the network driver to do, right?
It's the same way for all USB and SCSI/block devices.

Remember, the driver isn't unloaded at device removal time, it should
always be bound to memory until the userspace "open" goes away.
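
The lifetime rule can be illustrated with a small reference-counting
sketch.  This is a userspace model applied to a per-device structure, not
the kernel's module refcounting; struct toy_dev, the rc_* functions and
the freed_count counter are all hypothetical.  Removal marks the device
absent and drops the bus's reference, but the memory survives until the
last open handle is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_dev {
    int refs;      /* one for the bus binding, one per open handle */
    bool present;  /* false once the hardware is gone */
};

static int freed_count; /* demonstration only: how many devices were freed */

static struct toy_dev *rc_probe(void)
{
    struct toy_dev *d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->refs = 1;        /* reference held by the bus */
    d->present = true;
    return d;
}

static void rc_put(struct toy_dev *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0) {
        free(d);
        freed_count++;
    }
}

/* Open takes a reference; it fails once the device has been removed. */
static struct toy_dev *rc_open(struct toy_dev *d)
{
    if (!d->present)
        return NULL;
    d->refs++;
    return d;
}

static void rc_release(struct toy_dev *d)
{
    rc_put(d);
}

/* Removal: no new opens, and the bus gives up its reference. */
static void rc_remove(struct toy_dev *d)
{
    d->present = false;
    rc_put(d);
}
```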

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (6 preceding siblings ...)
  2004-06-03 19:28 ` Greg KH
@ 2004-06-03 19:34 ` linas
  2004-06-03 19:39 ` linas
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: linas @ 2004-06-03 19:34 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 12:02:06PM -0700, Greg KH wrote:
> 
> Or you can simply fix up your PCI driver to properly handle reading all
> FF when the device has been removed.  That seems to be what you need to
> do to solve this for your small subset of drivers on your platform,
> correct?

Yes. Well, specifically, the check will be 
"if (0xff && pci_unplugged) then close();"

Due to the nature of the existing PPC64 code, this check will occur 
in the ppc64 inb()/inw() macros, and thus solves the problem for my 
little neck of the woods.   

Is there something in writing somewhere that discusses this as a 
guideline for Linux device driver authors?  Something that I can 
throw at the driver authors and say "here do this, this fixes it 
for all arches"?

--linas



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (7 preceding siblings ...)
  2004-06-03 19:34 ` linas
@ 2004-06-03 19:39 ` linas
  2004-06-03 20:02 ` Don Fry
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: linas @ 2004-06-03 19:39 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 12:23:04PM -0700, Don Fry wrote:
> 
> The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> but that does not include doing a 'close' prior to being removed.  The
> driver could keep some state around so that if its remove routine was
> called without close first, it would cleanup, but I don't know of any
> network driver that does this.

What I get out of this thread is that pcnet32, and in fact, all drivers, 
should keep sufficient state around so that close() can be called either 
after or before remove().

--linas



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (8 preceding siblings ...)
  2004-06-03 19:39 ` linas
@ 2004-06-03 20:02 ` Don Fry
  2004-06-03 20:39 ` Greg KH
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Don Fry @ 2004-06-03 20:02 UTC (permalink / raw)
  To: linux-hotplug

> > 
> > The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> > but that does not include doing a 'close' prior to being removed.  The
> > driver could keep some state around so that if its remove routine was
> > called without close first, it would cleanup, but I don't know of any
> > network driver that does this.
> 
> What I get out of this thread is that pcnet32, and in fact, all drivers, 
> should keep sufficient state around so that close() can be called either 
> after or before remove().
> 

Today in 2.6.6 if I try and do a rmmod pcnet32 and something is still using
the device, the rmmod will wait until the device is closed, and then it
goes away.  If the unplug does the same thing, and doesn't complete until
the close occurs, then I would not expect to leak anything, or crash either.

> --linas
> 
> 


-- 
Don Fry
brazilnut@us.ibm.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (9 preceding siblings ...)
  2004-06-03 20:02 ` Don Fry
@ 2004-06-03 20:39 ` Greg KH
  2004-06-03 22:25 ` linas
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-03 20:39 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 01:02:27PM -0700, Don Fry wrote:
> > > 
> > > The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> > > but that does not include doing a 'close' prior to being removed.  The
> > > driver could keep some state around so that if its remove routine was
> > > called without close first, it would cleanup, but I don't know of any
> > > network driver that does this.
> > 
> > What I get out of this thread is that pcnet32, and in fact, all drivers, 
> > should keep sufficient state around so that close() can be called either 
> > after or before remove().
> > 
> 
> Today in 2.6.6 if I try and do a rmmod pcnet32 and something is still using
> the device, the rmmod will wait until the device is closed, and then it
> goes away.

Yes, that's a "feature" of the network stack :)

> If the unplug does the same thing, and doesn't complete until the
> close occurs, then I would not expect to leak anything, or crash
> either.

The device unplug knows nothing about the upper layers that the device
might have attached to, sorry.

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (10 preceding siblings ...)
  2004-06-03 20:39 ` Greg KH
@ 2004-06-03 22:25 ` linas
  2004-06-04  3:58 ` Benjamin Herrenschmidt
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: linas @ 2004-06-03 22:25 UTC (permalink / raw)
  To: linux-hotplug


Greg,

My apologies, I think I needed to read the source before posting to the 
mailing list.  (yes, my middle name really is 'Luke', no kidding). I might have 
been on a wild goose chase.  See below.

On Thu, Jun 03, 2004 at 01:02:27PM -0700, Don Fry wrote:
> > > 
> > > The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> > > but that does not include doing a 'close' prior to being removed.  The
> > > driver could keep some state around so that if its remove routine was
> > > called without close first, it would cleanup, but I don't know of any
> > > network driver that does this.
> > 
> > What I get out of this thread is that pcnet32, and in fact, all drivers, 
> > should keep sufficient state around so that close() can be called either 
> > after or before remove().
> 
> Today in 2.6.6 if I try and do a rmmod pcnet32 and something is still using
> the device, the rmmod will wait until the device is closed, and then it
> goes away.  

Where?  pcnet32_cleanup_module() doesn't seem to wait for anything.

However, I'm reading the source, tracing through a power-off, and I don't
see any device driver memory leaks, at least not in the 'typical' pcnet32
driver.  What I do see is that the device driver close() routine is
called before the device driver memory is released.

Anatomy of a device remove,
---------------------------
starting with an

echo 1 > /sysfs/bus/ .. power file ... 

I trace this through the ppc64 rpaphp hotplug code, and the pcnet32 code.
I did this manually, I hope there aren't any mistakes.

power_write_file () // in /drivers/pci/hotplug/pci_hotplug_core.c
{
  calls
  slot->ops->disable_slot() which is just
  struct hotplug_slot_ops ->disable_slot() which is just
  disable_slot (struct hotplug_slot *) // in /drivers/pci/hotplug/rpaphp_core.c
  {
    which calls
    rpa_php_unconfig_pci_adapter (struct slot *)  // in rpaphp_pci.c
    {
      calls
      pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c
      { 
        calls
        pci_destroy_dev (struct pci_dev *) 
        {
          calls 
          device_unregister (&dev->dev) // in /drivers/base/core.c
          {
            calls
            device_del (struct device *)
            {
              calls 
              bus_remove_device() // in /drivers/base/bus.c
              {
                calls 
                device_release_driver()
                {
                  calls 
                  struct device_driver->remove() which is just
                  pci_device_remove()  // in /drivers/pci/pci_driver.c
                  {
                    calls
                    struct pci_driver->remove() which is just
                    pcnet32_remove_one() // in /drivers/net/pcnet32.c  
                    {
                      calls
                      unregister_netdev() // in /net/core/dev.c
                      {
                        calls 
                        dev_close()  // in /net/core/dev.c
                        { 
                           calls dev->stop();
                           which is just pcnet32_close() // in pcnet32.c
                           {
                             which does what you wanted
                             to stop the device
                           }
                        }
                      }
                      which then
                      frees pcnet32 device driver memory
                    }
     }}}}}}}}}}
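
The ordering in the trace above can be condensed into a small userspace
model.  This is a sketch only: the chain_* functions and the call log are
hypothetical, and most of the intermediate layers are collapsed.  It
shows the layered dispatch through function pointers, with the stop
routine running before the driver memory would be freed.

```c
#include <assert.h>
#include <string.h>

static char call_log[128]; /* records the order of calls, '>' separated */

static void log_call(const char *name)
{
    if (call_log[0] != '\0')
        strncat(call_log, ">", sizeof(call_log) - strlen(call_log) - 1);
    strncat(call_log, name, sizeof(call_log) - strlen(call_log) - 1);
}

struct toy_netdev {
    int up;                              /* nonzero while the interface is open */
    void (*stop)(struct toy_netdev *);   /* plays the role of dev->stop() */
};

struct toy_driver {
    void (*remove)(struct toy_netdev *); /* plays the role of ->remove() */
};

static void chain_stop(struct toy_netdev *nd)
{
    log_call("stop");
    nd->up = 0;
}

/* Like unregister_netdev() in the trace: close the device if still up. */
static void chain_unregister(struct toy_netdev *nd)
{
    log_call("unregister");
    assert(nd->stop != NULL);
    if (nd->up)
        nd->stop(nd);
}

/* Like pcnet32_remove_one() in the trace: unregister, then free. */
static void chain_remove_one(struct toy_netdev *nd)
{
    log_call("remove");
    chain_unregister(nd);
    log_call("free");
}

/* Like disable_slot(): the hotplug layer only knows the remove hook. */
static void chain_disable_slot(const struct toy_driver *drv,
                               struct toy_netdev *nd)
{
    log_call("disable_slot");
    drv->remove(nd);
}
```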








^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (11 preceding siblings ...)
  2004-06-03 22:25 ` linas
@ 2004-06-04  3:58 ` Benjamin Herrenschmidt
  2004-06-04 16:24 ` Greg KH
  2004-06-04 17:26 ` linas
  14 siblings, 0 replies; 16+ messages in thread
From: Benjamin Herrenschmidt @ 2004-06-04  3:58 UTC (permalink / raw)
  To: linux-hotplug

On Fri, 2004-06-04 at 05:39, linas@austin.ibm.com wrote:
> On Thu, Jun 03, 2004 at 12:23:04PM -0700, Don Fry wrote:
> >
> > The pcnet32 driver tries to do the 'right thing' when it reads 0xffff,
> > but that does not include doing a 'close' prior to being removed.  The
> > driver could keep some state around so that if its remove routine was
> > called without close first, it would cleanup, but I don't know of any
> > network driver that does this.
> 
> What I get out of this thread is that pcnet32, and in fact, all drivers,
> should keep sufficient state around so that close() can be called either
> after or before remove().

I think the problem is more specific to the netdev interface, no? Isn't
it just that unregister_netdevice fails when it's open? In which case,
remove should fail ... which may not be what you want, but I don't see
a proper solution unless we fix the network core.

Hrm... looking at the code, unregister_netdevice is supposed to do
a close... Maybe something isn't working properly there...
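
For reference, the "keep some state around" idea quoted above could look
roughly like the following user-space sketch. All names here
(struct sketch_priv, sketch_open, sketch_close, sketch_remove) are
hypothetical and are not taken from pcnet32 or any real driver. The
assumption is that close is made idempotent and skips hardware access
once the device has been removed, so it can run before or after remove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device driver state; not from any real driver. */
struct sketch_priv {
    bool opened;      /* resources taken in open are still held */
    bool removed;     /* device is gone; do not touch registers */
    int  rx_buffers;  /* stand-in for memory allocated in open */
    int  hw_writes;   /* counts simulated register writes in close */
};

static void sketch_open(struct sketch_priv *p)
{
    assert(p != NULL);
    p->rx_buffers = 4;
    p->opened = true;
}

/* Idempotent close: safe to call twice, and safe after remove. */
static void sketch_close(struct sketch_priv *p)
{
    assert(p != NULL);
    if (!p->opened)
        return;               /* nothing left to release */
    if (!p->removed)
        p->hw_writes++;       /* stand-in for stopping the chip */
    p->rx_buffers = 0;        /* release what open allocated */
    p->opened = false;
}

/* Remove marks the device gone, then reuses close for cleanup. */
static void sketch_remove(struct sketch_priv *p)
{
    assert(p != NULL);
    p->removed = true;
    sketch_close(p);
}
```

With this arrangement, open followed by remove followed by close leaves
no buffers held and never touches the (absent) hardware, while the usual
open, close, remove order behaves as before.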

Ben.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (12 preceding siblings ...)
  2004-06-04  3:58 ` Benjamin Herrenschmidt
@ 2004-06-04 16:24 ` Greg KH
  2004-06-04 17:26 ` linas
  14 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2004-06-04 16:24 UTC (permalink / raw)
  To: linux-hotplug

On Thu, Jun 03, 2004 at 02:34:23PM -0500, linas@austin.ibm.com wrote:
> 
> Is there something in writing somewhere that discusses this as a 
> guideline for Linux device driver authors?  Something that I can 
> throw at the driver authors and say "here do this, this fixes it 
> for all arches"?

Point them at this thread :)

Or point them to me.

Or send me the wording you want to see added to Documentation/pci.txt
that would help you out.

thanks,

greg k-h



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: hotplug remove vs. device driver close
  2004-06-02 23:14 hotplug remove vs. device driver close linas
                   ` (13 preceding siblings ...)
  2004-06-04 16:24 ` Greg KH
@ 2004-06-04 17:26 ` linas
  14 siblings, 0 replies; 16+ messages in thread
From: linas @ 2004-06-04 17:26 UTC (permalink / raw)
  To: linux-hotplug

On Fri, Jun 04, 2004 at 01:58:23PM +1000, Benjamin Herrenschmidt wrote:
> Hrm... looking at the code, unregister_netdevice is supposed to do
> a close... Maybe something isn't working properly there...

Err, yes, I discovered this yesterday. There's a delay in mail delivery 
somewhere, possibly at my end.   Seems the issues I raised aren't a problem;
it's just that some of the device drivers might be buggy. Will investigate 
offline.

--linas



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2004-06-04 17:26 UTC | newest]

Thread overview: 16+ messages
2004-06-02 23:14 hotplug remove vs. device driver close linas
2004-06-02 23:28 ` Greg KH
2004-06-03  1:40 ` Anton Blanchard
2004-06-03 16:20 ` Greg KH
2004-06-03 18:50 ` linas
2004-06-03 19:02 ` Greg KH
2004-06-03 19:23 ` Don Fry
2004-06-03 19:28 ` Greg KH
2004-06-03 19:34 ` linas
2004-06-03 19:39 ` linas
2004-06-03 20:02 ` Don Fry
2004-06-03 20:39 ` Greg KH
2004-06-03 22:25 ` linas
2004-06-04  3:58 ` Benjamin Herrenschmidt
2004-06-04 16:24 ` Greg KH
2004-06-04 17:26 ` linas
