From mboxrd@z Thu Jan 1 00:00:00 1970
From: Greg KH
Date: Wed, 02 Jun 2004 23:28:51 +0000
Subject: Re: hotplug remove vs. device driver close
Message-Id: <20040602232851.GA24169@kroah.com>
List-Id:
References: <20040602181455.C17544@forte.austin.ibm.com>
In-Reply-To: <20040602181455.C17544@forte.austin.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

On Wed, Jun 02, 2004 at 06:14:55PM -0500, linas@austin.ibm.com wrote:
>
> Hi,
>
> We are hitting a situation where we are hot-plug removing a pci card before
> closing the device driver. This seems to lead to kernel memory leaks if not
> outright crashes. I'm trying to understand what the correct solution to this
> is supposed to be.

To paraphrase from the PCI Hotplug spec, "DO NOT DO THAT!"

> For example: 'ifup eth0' and 'ifdown eth0' are what usually cause an ethernet
> device driver to be opened/closed. Separately, we have a userland tool that
> can be used to power off the pci slot, and thus perform a hotplug unconfigure
> in the kernel (i.e. calls pci_remove_bus_device()). Thus, the sysadmin
> currently has the power to hot-remove a device without first closing the
> device driver. Surely, this is bad. (Right?) But how is this supposed to
> be handled?

Again, do not do that.

> Please don't tell me that a good sysadmin should never do that ... in the
> hothouse of the server room, crazy stuff happens and it should not result
> in a server crash so easily ...

Tough, do not do that.

That being said, a lot of the PCI drivers can recover from this as they
also work for PCMCIA devices, and they need to be able to handle this.
It is possible, and pretty simple, to fix within the driver itself.

> I'm hoping that the answer also isn't that 'the hotplug scripts should
> do that', since hotplug scripts can be buggy, or can crash for many reasons;
> such events shouldn't bring down the kernel.

No, it's not a hotplug script issue.

> So I conclude two possibilities:
>
> -- All device drivers should watch for hotplug remove, and close themselves
>    down in such an event

No, they should watch for errors when reading from or writing to their
devices and, if that happens, handle it properly. The kernel will tell
them at some time that the device is really gone by calling the
disconnect() callback.

> -- The syscall that allows the pci slot to be powered off should also
>    go through the steps of closing the device driver first.

No. Read the PCI Hotplug spec.

> Is there another possibility? What's the right way of handling this?

See above.

What driver is dying for you? It should be quite easy to fix.

thanks,

greg k-h

_______________________________________________
Linux-hotplug-devel mailing list  http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel
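
Editor's sketch (not part of the original message): the advice in the
reply, "watch for errors when reading and writing, then clean up when the
kernel says the device is gone", might look roughly like the userspace
simulation below. Every name in it (fake_card, fake_drv, card_read32,
drv_open, drv_poll, drv_remove, drv_close) is hypothetical and chosen only
for illustration; a real driver would use the kernel's MMIO accessors, and
for PCI drivers the removal notification arrives through the remove member
of struct pci_driver. The sketch relies on the fact that a read from an
absent PCI device completes with all bits set, and it assumes the modelled
register never legitimately reads as all ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PCI card with a single status register. */
struct fake_card {
    bool     present;  /* false once the slot has been powered off */
    uint32_t status;   /* register value while the card is present */
};

/* Hypothetical per-device driver state. */
struct fake_drv {
    struct fake_card *card;
    bool     gone;     /* set once the driver knows the hardware vanished */
    bool     open;     /* true between drv_open() and release */
    uint8_t *ring;     /* resource allocated at open time */
};

/* A read from an absent PCI device completes with all bits set. */
uint32_t card_read32(const struct fake_card *card)
{
    return card->present ? card->status : UINT32_MAX;
}

/* Open: refuse if the device is already known to be gone. */
int drv_open(struct fake_drv *drv)
{
    assert(drv != NULL);
    if (drv->gone)
        return -ENODEV;
    drv->ring = malloc(64);
    if (drv->ring == NULL)
        return -ENOMEM;
    drv->open = true;
    return 0;
}

/* Free per-open resources; idempotent, so remove and close may both call it. */
void drv_release(struct fake_drv *drv)
{
    free(drv->ring);
    drv->ring = NULL;
    drv->open = false;
}

/* I/O path: an all-ones read is treated as "device gone" and reported. */
int drv_poll(struct fake_drv *drv, uint32_t *out)
{
    uint32_t v;

    if (drv->gone)
        return -ENODEV;
    v = card_read32(drv->card);
    if (v == UINT32_MAX) {
        drv->gone = true;      /* stop touching the hardware from now on */
        return -ENODEV;
    }
    *out = v;
    return 0;
}

/* Removal callback: may arrive while the device is still open. */
void drv_remove(struct fake_drv *drv)
{
    drv->gone = true;
    drv->card = NULL;
    drv_release(drv);          /* nothing leaks even if close never comes */
}

/* Close: harmless after removal because release is idempotent. */
void drv_close(struct fake_drv *drv)
{
    drv_release(drv);
}
```

In this sketch the order of events does not matter: drv_poll() starts
failing with -ENODEV as soon as the card disappears, drv_remove() frees
whatever the open path allocated, and a late drv_close() (the 'ifdown'
that arrives after the slot was powered off) finds nothing left to free.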