From mboxrd@z Thu Jan 1 00:00:00 1970
From: Doug Ledford
Subject: Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
Date: Thu, 23 Jan 2003 16:34:23 -0500
Sender: linux-scsi-owner@vger.kernel.org
Message-ID: <20030123213423.GA26415@redhat.com>
References: <200301232040.41862.oliver@neukum.name> <20030123202835.GA25838@redhat.com> <200301232159.28656.oliver@neukum.name>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To: <200301232159.28656.oliver@neukum.name>
List-Id: linux-scsi@vger.kernel.org
To: Oliver Neukum
Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel@lists.sourceforge.net, Linux SCSI list

On Thu, Jan 23, 2003 at 09:59:28PM +0100, Oliver Neukum wrote:
> Hi Doug
>
> > Actually, I would have both complicated and simple transports call
> > scsi_set_device_offline() and for two reasons. 1) you have to provide
> > that function for simple drivers so duplicating other detection code in
> > the scsi completion handler is a waste. 2) pretty much all transports
> > will learn of the device being offline while they are in their interrupt
> > handler and should already be holding the lock for the device, which means
>
> This is not the case for USB and IEEE1394. I am not sure about PCMCIA.
> We are in context of a kernel thread while we learn about device removal.

No. You might be in a kernel thread context when you decode an interrupt
down to determining that a device was removed, but somewhere along the line
you took an interrupt that told you the device was removed (or else the
command simply timed out and you are in the error handler for the command
already). Are you saying that the USB subsystem queues up those interrupt
packets and decodes them later (which is fine, I just want to be clear on
the point)?

> > that calling scsi_set_device_offline() won't race with scsi_request_fn()
> > which also needs the device lock (which in reality is the host lock).
> > Saving this race is convenient enough IMHO to warrant saying that's the
> > way things need to be.
> >
> > > > scsi_set_device_offline(dev) calls a high-level kernel function to
> > > > start higher level things (block queue cut off, etc) which *may* need
> > > > to be done.
> >
> > No, scsi_set_device_offline() schedules the error handler thread for that
> > host to be woken up.
> >
> > > How do you differentiate between real failure and device removal?
> >
> > We don't, and we shouldn't. Device removal *is* a real failure.
>
> Well shouldn't a device removal remove the device as a logical
> entity and a failure should not?

No. That's what the user space hot plug manager is for. If you want this
type of behaviour, you take an interrupt to tell you that the device is
gone, you mark it gone, the error handler cleans up any outstanding
commands, then once the device no longer has any commands outstanding
*then* the hot plug manager can successfully umount/unattach/whatever the
device and then tell the kernel to actually remove it. Putting this into
the scsi stack when it's already in place elsewhere makes no sense to me.

> > If the LLDD is the type such that it knows the device is gone (aka, in my
> > driver if I get a selection timeout then I know something is fishy and can
> > proceed from there, iSCSI may not be so lucky), then it has one of two
> > choices. 1) it may flush any commands that it can out of the hardware and
> > return them immediately with the same error condition as the one that it
> > is already returning. 2) it can sit and wait for the commands to timeout
> > one by one if that's what it wants.
> > Since the device has already been marked offline by
> > scsi_set_device_offline() and the error handler thread is already
> > scheduled to run for the device, 2 is probably the easiest thing for
> > the driver to do. The error handler will call the abort/reset
>
> Again not for USB and IEEE1394. We'd have to wait for the error handler
> to finish. Doing it ourselves is easier.

OK, are you reading my comments or not? I said "since the error handler
thread is already scheduled to run for the device, 2 is probably easiest".
In other words, you don't have to wait for anything, it's gonna happen
post-haste. So since you should already have proper error handling
functions in place (You do have proper error handler functions in place,
don't you?), duplicating that code here won't really buy you anything.

> > Once all the commands are gone and no more are arriving, then if, and only
> > if, someone actually removes the device from the scsi subsystem (maybe
> > hotplug manager or something) then you will get the typical
> > slave_destroy() call to tell you that it is safe to release all resources
> > related to this device. Otherwise, the device will hang around as an
> > offline device until someone does echo "scsi-remove-single-device a b c d"
>
> Eek. That part I must strongly object to. The device is physically gone.
> Ever bothering the LLDD with it is very inconvenient.

OK, let's look at this realistically. I'm saying you get an interrupt
telling you that the device is gone and you tell the scsi core the same
thing. Immediately after that the scsi core calls your error handler
routines to clean up any pending commands on the device. Once all those
pending commands are cleaned up, the hot plug manager is free to remove the
device from the system. Once the hot plug manager calls for the free to
happen, you get a slave_destroy() call and you free the instances. This all
happens in a span of a few milliseconds most likely. Is that really so
inconvenient for you?

> > > /proc/scsi/scsi to remove it.
> >
> > Basically, as I see it, we need a new function scsi_set_device_offline()
> > that marks the device offline, we need an offline check in
>
> These functions are needed for a whole bus as well. USB needs it.
>
> > As far as plugging back in, the answer is simple. Until the old instance
> > is dead *and removed* a new one can't be added at the same ID, aka you
> > simply ignore the hot plug until the hot remove has completed.
>
> What do you mean? It is dead because it is removed. How can a device be
> anything other than dead if it has been unplugged? Please elaborate.

I said "old instance", aka the internal data structs (struct scsi_device
for that device). A device can be dead but not removed from the scsi subsys
if no one has cleaned up after the removal by unmounting any filesystems
that were on it and removing the scsi device itself. That would be the job
of the hotplug manager.

> And who should ignore a hot addition, the LLDD or SCSI core?
> If the former, again I must object.

The scsi core doesn't allow two devices with the same complete ID set. You
would either have to attach the device at a different ID (aka khubd could
set the reattached device to a higher SCSI ID or something) or wait for the
hot plug manager to complete the old instance of the device's removal
before adding the device back in again.

-- 
Doug Ledford 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606