* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found] ` <20030116173539.GA31235@kroah.com>
@ 2003-01-16 19:43 ` Matthew Dharm
  2003-01-16 19:53   ` Greg KH
       [not found]   ` <20030116195306.GA32697@kroah.com>
  0 siblings, 2 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-16 19:43 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1367 bytes --]

Well, we only create the host when the device is first attached.  After
that, if it goes away and comes back, we re-connect it to the old SCSI
host.

But, while the device is gone, you've created an association between a SCSI
node that exists and a non-existent USB device.  Basically, you've got a
pointer that is no longer valid.  And when the device is re-attached, there
isn't code to re-establish the correct SCSI<->USB association.

Something like this would probably make sense if the hot-unplugging code
for SCSI hosts was really stable -- then we could unregister the host when
the device went away, and this relation would be disconnected
automatically.

Matt

On Thu, Jan 16, 2003 at 09:35:39AM -0800, Greg KH wrote:
> On Thu, Jan 16, 2003 at 09:31:12AM -0800, Matthew Dharm wrote:
> > Hrm... doesn't this all fall to pot when the device is unplugged and
> > replugged?
>
> Um, how?  This seems to work for me, but I don't have a lot of devices
> here.  And if there is a problem, you might want to tell the scsi
> people, as they are the ones advocating this call be added.

--
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

You are needink to look more evil.  You likink very strong coffee?
                                        -- Pitr to Dust Puppy
User Friendly, 10/16/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 19:43 ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 Matthew Dharm
@ 2003-01-16 19:53   ` Greg KH
       [not found]   ` <20030116195306.GA32697@kroah.com>
  1 sibling, 0 replies; 106+ messages in thread
From: Greg KH @ 2003-01-16 19:53 UTC (permalink / raw)
  To: linux-usb-devel, Linux SCSI list

On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
> Well, we only create the host when the device is first attached.  After
> that, if it goes away and comes back, we re-connect it to the old SCSI
> host.

Ick, so when the device is gone, where does the SCSI host go?  Is it
still represented in sysfs and in the SCSI core properly?

> But, while the device is gone, you've created an association between a SCSI
> node that exists and a non-existent USB device.  Basically, you've got a
> pointer that is no longer valid.  And when the device is re-attached, there
> isn't code to re-establish the correct SCSI<->USB association.

Why not?  I'm guessing that you re-establish this association, right?
Then you might have to add this same call to wherever that is done.

> Something like this would probably make sense if the hot-unplugging code
> for SCSI hosts was really stable -- then we could unregister the host when
> the device went away, and this relation would be disconnected
> automatically.

Well, push back on the SCSI people to fix this then, hot-unplug should
work properly on the SCSI layer too :)

thanks,

greg k-h

^ permalink raw reply [flat|nested] 106+ messages in thread
[parent not found: <20030116195306.GA32697@kroah.com>]
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found]   ` <20030116195306.GA32697@kroah.com>
@ 2003-01-16 20:10     ` Linus Torvalds
  2003-01-16 20:43       ` greg kh
                         ` (2 more replies)
  2003-01-16 20:40     ` David Brownell
  1 sibling, 3 replies; 106+ messages in thread
From: Linus Torvalds @ 2003-01-16 20:10 UTC (permalink / raw)
  To: linux-scsi

In article <20030116195306.GA32697@kroah.com>, Greg KH <greg@kroah.com> wrote:
>On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
>> Well, we only create the host when the device is first attached.  After
>> that, if it goes away and comes back, we re-connect it to the old SCSI
>> host.
>
>Ick, so when the device is gone, where does the SCSI host go?  Is it
>still represented in sysfs and in the SCSI core properly?

This is pure and utter USB storage stupidity, and nothing else.

When the USB storage device is unplugged, the device should be
unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
an ex-device.

The fact that USB storage still keeps track of devices that do not exist
is WRONG.  It has resulted in problems in real life multiple times with
devices that get re-attached and have a new serial number (quite common
as far as I can tell in cheap flash readers), where the _stupid_ rule of
trying to keep track of what has been attached results in the device
moving from /dev/sda to sdb to sdc as it is unplugged and re-plugged.

>> Something like this would probably make sense if the hot-unplugging code
>> for SCSI hosts was really stable -- then we could unregister the host when
>> the device went away, and this relation would be disconnected
>> automatically.
>
>Well, push back on the SCSI people to fix this then, hot-unplug should
>work properly on the SCSI layer too :)

IT IS NOT A SCSI LAYER PROBLEM!

It's purely a USB problem.

		Linus

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10     ` Linus Torvalds
@ 2003-01-16 20:43       ` greg kh
  2003-01-16 21:41       ` Linus Torvalds
  2003-01-16 22:51       ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: greg kh @ 2003-01-16 20:43 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-usb-devel

Copied to linux-usb-devel, as they should also see this...

On Thu, 16 Jan 2003 12:10:18 -0800, Linus Torvalds wrote:
> In article <20030116195306.GA32697@kroah.com>, Greg KH <greg@kroah.com>
> wrote:
>>On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
>>> Well, we only create the host when the device is first attached.  After
>>> that, if it goes away and comes back, we re-connect it to the old SCSI
>>> host.
>>
>>Ick, so when the device is gone, where does the SCSI host go?  Is it
>>still represented in sysfs and in the SCSI core properly?
>
> This is pure and utter USB storage stupidity, and nothing else.
>
> When the USB storage device is unplugged, the device should be
> unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
> an ex-device.

Agreed, I thought most of that logic had been removed from the
usb-storage driver as the SCSI layer can now handle removing devices
just fine (or so Mike Anderson tells me :)

thanks,

greg k-h

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10     ` Linus Torvalds
  2003-01-16 20:43       ` greg kh
@ 2003-01-16 21:41       ` Linus Torvalds
  2003-01-16 22:51       ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: Linus Torvalds @ 2003-01-16 21:41 UTC (permalink / raw)
  To: linux-scsi

In article <b073ja$12i$1@penguin.transmeta.com>,
Linus Torvalds <torvalds@transmeta.com> wrote:
>
>When the USB storage device is unplugged, the device should be
>unregistered.  It should be _gone_.  It isn't sleeping, it's dead.  It's
>an ex-device.

As a follow-up on my rant: if people want reliable static naming over
removal/reinsertion of a device, we actually already have exactly that,
in user space.  Using the hotplug agents.

There's a mostly unrelated problem we have from a kernel perspective
which is a device that is actually in use when the disconnect happens -
say as a part of a sleep sequence (which will cause a forced disconnect/
reconnect event).

That's a generic hotplug issue, and still should not be a reason for
trying to (on a driver level) keep track of devices that are gone.  It
should really be up to upper layers to be able to re-associate things
properly, since doing it on a driver level simply isn't even possible in
the generic case.

		Linus

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:10     ` Linus Torvalds
  2003-01-16 20:43       ` greg kh
  2003-01-16 21:41       ` Linus Torvalds
@ 2003-01-16 22:51       ` Matthew Dharm
  2 siblings, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-16 22:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-scsi

[-- Attachment #1: Type: text/plain, Size: 2676 bytes --]

On Thu, Jan 16, 2003 at 08:10:18PM +0000, Linus Torvalds wrote:
> In article <20030116195306.GA32697@kroah.com>, Greg KH <greg@kroah.com> wrote:
> >On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
> >> Well, we only create the host when the device is first attached.  After
> >> that, if it goes away and comes back, we re-connect it to the old SCSI
> >> host.
> >
> >Ick, so when the device is gone, where does the SCSI host go?  Is it
> >still represented in sysfs and in the SCSI core properly?
>
> This is pure and utter USB storage stupidity, and nothing else.

Well, I happen to agree.  But it was a necessary evil.

Up until recently, an attempt to hot-unplug a SCSI host would result in
gross system instability.

Also, up until recently, the debounce on USB ports meant that a
usb-storage device could get disconnected and reconnected when a completely
unrelated device was attached to a hub.  Newer hubs fixed this.

Also, the _vast_ majority of devices keep sane serial numbers.  You happen
to have one of the broken ones, which tends to skew your perception.  But
of the two dozen or so storage devices I have, only one exhibits that
problem (and only with the old firmware loaded).

I'd like to fix this.  I've been wanting to fix this for a while.  I want
SCSI hot-unplug to work well enough to rely on it.  I want to be able to
take a USB disk, unplug it while writing to it, plug in a new USB disk, and
have a guarantee that the SCSI layer won't get confused and try to write
the data for the old disk to the new one (think scsi0 goes away, to be
replaced by a new scsi0, without properly stopping command initiators).

Heck, it was only in early 2.4.x that the SCSI mid-layer fixed an
off-by-one error that made it queue one-too-many commands to a HBA.

So, here's how it is:  We all want it fixed.  Linus, if you're willing to
deal with a major change for this at this late date in the 2.5.x lifecycle,
then we'll do it.  Simple as that.  That's pretty much the only reason I
haven't started already -- the SCSI add-ons that support the hot-unplug
were too new and too late for me to be comfortable making a major change
like this.

Matt

P.S. BTW, this is also the original behavior of usb-storage, all the way
back to before I was working on it (and before it was called usb-storage).
I've never liked it this way...

--
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

Hey Chief.  We've figured out how to save the technical department.  We
need to be committed.
                                        -- The Techs
User Friendly, 1/22/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
       [not found]   ` <20030116195306.GA32697@kroah.com>
  2003-01-16 20:10     ` Linus Torvalds
@ 2003-01-16 20:40     ` David Brownell
  2003-01-16 20:48       ` Mike Anderson
  1 sibling, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-16 20:40 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1023 bytes --]

Greg KH wrote:
> On Thu, Jan 16, 2003 at 11:43:23AM -0800, Matthew Dharm wrote:
>
>>Well, we only create the host when the device is first attached.  After
>>that, if it goes away and comes back, we re-connect it to the old SCSI
>>host.
>
>
> Ick, so when the device is gone, where does the SCSI host go?  Is it
> still represented in sysfs and in the SCSI core properly?

Just for the record ... I think usb-storage is the only USB driver
that tries to keep state about devices across disconnect/reconnect.

I agree with Greg that hotplugging (including unplug/replug) should
work well with SCSI.  But given the problems of determining "identity"
of a disk (or volume or whatever) I'm sort of curious what working
well should really mean.

Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
does is make sure the right drivers are loaded, it doesn't have a
clue yet about whether/how/where to mount disks or do other stuff.

- Dave

[-- Attachment #2: scsi.agent --]
[-- Type: text/plain, Size: 1022 bytes --]

#!/bin/bash
#
# SCSI hotplug agent for 2.5 kernels
#
#	ACTION=add
#	DEVPATH=devices/scsi0/0:0:0:0
#

cd /etc/hotplug
. hotplug.functions

case $ACTION in

add)
    # 2.5.50 kernel bug: this happens sometimes
    if [ ! -d /sys/$DEVPATH ]; then
        mesg "bogus sysfs DEVPATH=$DEVPATH"
        exit 1
    fi

    TYPE=$(cat /sys/$DEVPATH/type)
    case "$TYPE" in
    # 2.5.51 style attributes; <scsi/scsi.h> TYPE_* constants
    0)  TYPE=disk ; MODULE=sd_mod ;;
    # FIXME some tapes use 'osst' not 'st'
    1)  TYPE=tape ; MODULE=st ;;
    2)  TYPE=printer ;;
    3)  TYPE=processor ;;
    4)  TYPE=worm ; MODULE=sr_mod ;;
    5)  TYPE=cdrom ; MODULE=sr_mod ;;
    6)  TYPE=scanner ;;
    7)  TYPE=mod ; MODULE=sd_mod ;;
    8)  TYPE=changer ;;
    9)  TYPE=comm ;;
    14) TYPE=enclosure ;;
    esac

    if [ "$MODULE" != "" ]; then
        mesg "$TYPE at $DEVPATH"
        modprobe $MODULE
    else
        mesg "how to add device type=$TYPE at $DEVPATH ??"
    fi
    ;;

*)
    debug_mesg SCSI $ACTION event not supported
    exit 1
    ;;

esac

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:40     ` David Brownell
@ 2003-01-16 20:48       ` Mike Anderson
  2003-01-16 23:43         ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-16 20:48 UTC (permalink / raw)
  To: David Brownell; +Cc: Greg KH, linux-usb-devel, Linux SCSI list

David Brownell [david-b@pacbell.net] wrote:
> Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> does is make sure the right drivers are loaded, it doesn't have a
> clue yet about whether/how/where to mount disks or do other stuff.

SCSI has added newer interfaces for host drivers to use that allow
single adds and removes.  The API document is located in
/Documentation/scsi/scsi_mid_low_api.txt.  I believe it is fairly
current.

I have been using the scsi_debug driver (which uses the newer interface)
to add and remove pseudo adapters, but have not checked the results
through user space hotplug.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 20:48       ` Mike Anderson
@ 2003-01-16 23:43         ` Oliver Neukum
  2003-01-17  8:50           ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-16 23:43 UTC (permalink / raw)
  To: Mike Anderson, David Brownell; +Cc: Greg KH, linux-usb-devel, Linux SCSI list

On Thursday, 16 January 2003 21:48, Mike Anderson wrote:
> David Brownell [david-b@pacbell.net] wrote:
> > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > does is make sure the right drivers are loaded, it doesn't have a
> > clue yet about whether/how/where to mount disks or do other stuff.
>
> SCSI has added newer interfaces for host drivers to use that allow
> single adds and removes.  The api document is located in
> /Documentation/scsi/scsi_mid_low_api.txt.  I believe it is fairly
> current.

This is good news in principle.
But what use is a function like scsi_remove_host() if it can fail?
If a device is gone, it is gone and all the complaining in the world
won't alter that.
Could you explain how a LLD can ensure that this function will always
succeed?

	Regards
		Oliver

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-16 23:43         ` Oliver Neukum
@ 2003-01-17  8:50           ` Mike Anderson
  2003-01-17 10:55             ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-17 8:50 UTC (permalink / raw)
  To: Oliver Neukum; +Cc: David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Oliver Neukum [oliver@neukum.name] wrote:
> On Thursday, 16 January 2003 21:48, Mike Anderson wrote:
> > David Brownell [david-b@pacbell.net] wrote:
> > > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > > does is make sure the right drivers are loaded, it doesn't have a
> > > clue yet about whether/how/where to mount disks or do other stuff.
> >
> > SCSI has added newer interfaces for host drivers to use that allow
> > single adds and removes.  The api document is located in
> > /Documentation/scsi/scsi_mid_low_api.txt.  I believe it is fairly
> > current.
>
> This is good news in principle.
> But what use is a function like scsi_remove_host() if it can fail?
> If a device is gone, it is gone and all the complaining in the world
> won't alter that.
> Could you explain how a LLD can ensure that this function will always
> succeed?

The interface was focused on clean removes.  There may be layers above
the SCSI subsystem using the device that may need user space
intervention to release.  Currently the request_queue exported to the
block layer is tied to the scsi device and cannot be disassociated.

(NOTE: in looking at this in 2.5.58 it appears that there may be some
checks missing in scsi_check_device_busy that would make
scsi_remove_host succeed more often than it should).

If the device disappears, the host needs to return all I/Os in flight
with an error, and we need to ensure new ones do not start (setting the
device offline and maybe host_self_blocked).  This is a start on getting
scsi_remove_host to succeed.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17  8:50           ` Mike Anderson
@ 2003-01-17 10:55             ` Oliver Neukum
  2003-01-17 15:06               ` Alan Stern
  2003-01-17 18:54               ` Matthew Dharm
  0 siblings, 2 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 10:55 UTC (permalink / raw)
  To: Mike Anderson; +Cc: David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

On Friday, 17 January 2003 09:50, Mike Anderson wrote:
> Oliver Neukum [oliver@neukum.name] wrote:
> > On Thursday, 16 January 2003 21:48, Mike Anderson wrote:
> > > David Brownell [david-b@pacbell.net] wrote:
> > > > Have any of the SCSI people been looking much at SCSI hotplug on 2.5?
> > > > I attach "/etc/hotplug/scsi.agent" from one of my desktops; all it
> > > > does is make sure the right drivers are loaded, it doesn't have a
> > > > clue yet about whether/how/where to mount disks or do other stuff.
> > >
> > > SCSI has added newer interfaces for host drivers to use that allow
> > > single adds and removes.  The api document is located in
> > > /Documentation/scsi/scsi_mid_low_api.txt.  I believe it is fairly
> > > current.
> >
> > This is good news in principle.
> > But what use is a function like scsi_remove_host() if it can fail?
> > If a device is gone, it is gone and all the complaining in the world
> > won't alter that.
> > Could you explain how a LLD can ensure that this function will always
> > succeed?
>
> The interface was focused on clean removes.  There may be layers above
> the SCSI subsystem using the device that may need user space
> intervention to release.  Currently the request_queue exported to the
> block layer is tied to the scsi device and cannot be disassociated.
>
> (NOTE: in looking at this in 2.5.58 it appears that there may be some
> checks missing in scsi_check_device_busy that would make
> scsi_remove_host succeed more often than it should).

That is simply wrong.  Reporting somebody having pulled a plug must
not fail.  What are you supposed to do with an error here?

There must be a way for a LLD to report that reliably.
If the answer is, take that lock, call that function, error all pending
requests, release that lock and call that function, it's OK.

But it must work in all cases.

> If the device disappears, the host needs to return all I/Os in flight
> with an error, and we need to ensure new ones do not start (setting the
> device offline and maybe host_self_blocked).  This is a start on getting
> scsi_remove_host to succeed.

Could you provide some details?

	Regards
		Oliver

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 10:55             ` Oliver Neukum
@ 2003-01-17 15:06               ` Alan Stern
  2003-01-17 18:54               ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Alan Stern @ 2003-01-17 15:06 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, 17 Jan 2003, Oliver Neukum wrote:

> On Friday, 17 January 2003 09:50, Mike Anderson wrote:
> >
> > If the device disappears, the host needs to return all I/Os in flight
> > with an error, and we need to ensure new ones do not start (setting the
> > device offline and maybe host_self_blocked).  This is a start on getting
> > scsi_remove_host to succeed.
>
> Could you provide some details?

Usb-storage does most of this already.  When the device is unplugged, all
current transactions return an error and any new ones return a
device-not-ready code.  But that doesn't solve the problem of removing the
device's representation within SCSI and sysfs.

Alan Stern

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 10:55             ` Oliver Neukum
  2003-01-17 15:06               ` Alan Stern
@ 2003-01-17 18:54               ` Matthew Dharm
  2003-01-17 20:25                 ` Mike Anderson
                                   ` (2 more replies)
  1 sibling, 3 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-01-17 18:54 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

On Fri, Jan 17, 2003 at 11:55:36AM +0100, Oliver Neukum wrote:
> That is simply wrong.  Reporting somebody having pulled a plug must
> not fail.  What are you supposed to do with an error here?
>
> There must be a way for a LLD to report that reliably.
> If the answer is, take that lock, call that function, error all pending
> requests, release that lock and call that function, it's OK.
>
> But it must work in all cases.

I absolutely agree.  The device is gone.  I can't do anything about it.
If the SCSI layer decides it can't let go, what am I supposed to do about
it?

In a separate discussion with Mike, he mentioned that you can't
scsi_remove_device() unless there are no pending commands.

How the hell is an LLD supposed to assure that!?!?

The minute I error a command and call scsi_done(), I can get a new one.
Unless I lock out requests with scsi_block_requests(), but that comes with
major warnings about needing to get unblocked.

The way this should work is that the LLD calls scsi_remove_device(), and
that cuts off the flow of commands.  The LLD can promise to error-out any
pending commands in the device command queue.

That is, unless scsi_block_requests() and scsi_unblock_requests() are more
useful than the documentation suggests... block(), error all commands,
unregister()... that would make some sense.  We could call
scsi_block_requests() as soon as we know the unit is gone, and unregister()
as soon as the queue is empty.

Matt

--
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

A:  The most ironic oxymoron wins ...
DP: "Microsoft Works"
A:  Uh, okay, you win.
                                        -- A.J. & Dust Puppy
User Friendly, 1/18/1998

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply [flat|nested] 106+ messages in thread
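
The sequence proposed in the message above (block new requests, error out
everything still pending, then unregister once the queue is empty) can be
modeled in user space.  What follows is only a minimal Python sketch under
stated assumptions, not kernel code: ToyHost and its methods are
hypothetical names invented for illustration and do not correspond to the
SCSI mid-layer API or to anything in usb-storage.

```python
from collections import deque


class ToyHost:
    """Hypothetical stand-in for a SCSI host; not a kernel data structure."""

    def __init__(self):
        self.blocked = False     # models the effect of blocking new requests
        self.registered = True   # models the host still being known upstream
        self.pending = deque()   # commands queued to the low-level driver
        self.completed = []      # (command, status) pairs handed back upward

    def queue_command(self, cmd):
        """Upper layer submits a command; refused once the host is blocked."""
        if self.blocked or not self.registered:
            return False
        self.pending.append(cmd)
        return True

    def unplug(self):
        """The device is physically gone: block, error out, unregister."""
        self.blocked = True                  # 1. no new commands can arrive
        while self.pending:                  # 2. fail everything in flight
            self.completed.append((self.pending.popleft(), "ERROR"))
        self.registered = False              # 3. queue is empty, so let go


host = ToyHost()
host.queue_command("READ_10")
host.queue_command("WRITE_10")
host.unplug()
print(host.completed)
print(host.queue_command("WRITE_10"))
```

The point of the ordering is that step 1 closes the window described in the
message: once the host is blocked, completing a command with an error can no
longer cause a new one to arrive, so the queue is guaranteed to drain.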
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 18:54               ` Matthew Dharm
@ 2003-01-17 20:25                 ` Mike Anderson
  2003-01-17 22:07                   ` Oliver Neukum
  2003-01-17 20:26                 ` [linux-usb-devel] " Oliver Neukum
  2003-01-20 17:36                 ` Luben Tuikov
  2 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-01-17 20:25 UTC (permalink / raw)
  To: Oliver Neukum, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Oliver and Alan,
	I am trying to catch up on this thread so I did not reply
directly to your concerns, but I think they are covered below.

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> On Fri, Jan 17, 2003 at 11:55:36AM +0100, Oliver Neukum wrote:
> > That is simply wrong.  Reporting somebody having pulled a plug must
> > not fail.  What are you supposed to do with an error here?
> >
> > There must be a way for a LLD to report that reliably.
> > If the answer is, take that lock, call that function, error all pending
> > requests, release that lock and call that function, it's OK.
> >
> > But it must work in all cases.
>
> I absolutely agree.  The device is gone.  I can't do anything about it.
> If the SCSI layer decides it can't let go, what am I supposed to do about
> it?
>
> In a separate discussion with Mike, he mentioned that you can't
> scsi_remove_device() unless there are no pending commands.
>
> How the hell is an LLD supposed to assure that!?!?
>

I believe that the scsi_remove_host function, the way it currently is,
is not the correct function.  SCSI needs to separate "device gone" from
freeing resources.  There may be some unbounded cleanup, as the
request_queue that is exported to the block layer is part of the
scsi_device, which is a child of the virtual USB SCSI adapter.  The only
way to reduce the unbounded time is possibly by reorganizing some sysfs
tree object layouts.

> The minute I error a command and call scsi_done(), I can get a new one.
> Unless I lock out requests with scsi_block_requests(), but that comes with
> major warnings about needing to get unblocked.
>

Well, in the case of the device really being gone, does the LLD need to
worry about being unblocked?  I get the feeling from this thread that
this is probably the wrong interface.

> The way this should work is that the LLD calls scsi_remove_device(), and
> that cuts off the flow of commands.  The LLD can promise to error-out any
> pending commands in the device command queue.
>
> That is, unless scsi_block_requests() and scsi_unblock_requests() are more
> useful than the documentation suggests... block(), error all commands,
> unregister()... that would make some sense.  We could call
> scsi_block_requests() as soon as we know the unit is gone, and unregister()
> as soon as the queue is empty.

We should really ensure that we have good separation between stopping
device I/O, device gone, and releasing resources.

	- SCSI seems to have the flags to stop the I/O, but instead of
	  scsi_block_requests we may want to export the setting of
	  device online.  This can be done from sysfs now, but
	  not from the driver (the driver does have a handle to the
	  device, but it would be better to have an interface in case we
	  need to do some additional operations).

	- Possibly add a scsi_remove_device that would always succeed and
	  a version of scsi_remove_host that calls scsi_remove_device for
	  all devices.  Though with the recent change to SCSI remove host to
	  allow non-sysfs device registration I do not believe we could
	  ensure devices would be cleaned up.

	- SCSI would need to support ref counting so that resources are
	  not removed too soon.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply [flat|nested] 106+ messages in thread
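
The three-way separation listed in the message above (stopping device I/O,
device gone, releasing resources) together with the reference-counting
point can be illustrated with a small user-space model.  This is a hedged
Python sketch, not the kernel's object model: ToyDevice, get(), put() and
remove() are hypothetical names chosen for the illustration, and the
"resources" are reduced to a single flag.

```python
class ToyDevice:
    """Hypothetical device object; all names are illustrative only."""

    def __init__(self):
        self.online = True      # phase 1 flag: may new I/O be started?
        self.gone = False       # phase 2 flag: the hardware has disappeared
        self.released = False   # phase 3 flag: resources have been freed
        self.refs = 1           # reference held by the low-level driver

    def get(self):
        """An upper layer (for example an open file) takes a reference."""
        self.refs += 1

    def put(self):
        """Drop a reference; free resources only when the last user is done."""
        self.refs -= 1
        if self.refs == 0:
            self.released = True

    def start_io(self):
        """New I/O is accepted only while the device is online."""
        return self.online

    def remove(self):
        """Never fails: stop I/O, mark gone, drop the driver's reference."""
        if self.gone:
            return
        self.online = False
        self.gone = True
        self.put()


dev = ToyDevice()
dev.get()                  # an upper layer still has the device open
dev.remove()               # unplug is reported and always succeeds
print(dev.start_io(), dev.released)
dev.put()                  # the upper layer finally closes the device
print(dev.released)
```

In this model the removal call cannot fail, which is the property asked for
earlier in the thread, while the actual freeing is deferred until the last
reference is dropped, so resources are not removed too soon.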
* Re: Re: [PATCH] USB changes for 2.5.58
  2003-01-17 20:25                 ` Mike Anderson
@ 2003-01-17 22:07                   ` Oliver Neukum
  0 siblings, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 22:07 UTC (permalink / raw)
  To: Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

> > The way this should work is that the LLD calls scsi_remove_device(), and
> > that cuts off the flow of commands.  The LLD can promise to error-out any
> > pending commands in the device command queue.
> >
> > That is, unless scsi_block_requests() and scsi_unblock_requests() are
> > more useful than the documentation suggests... block(), error all
> > commands, unregister()... that would make some sense.  We could call
> > scsi_block_requests() as soon as we know the unit is gone, and
> > unregister() as soon as the queue is empty.
>
> We should really ensure that we have good separation between stopping
> device I/O, device gone, and releasing resources.

Very good.

> 	- SCSI seems to have the flags to stop the I/O, but instead of
> 	  scsi_block_requests we may want to export the setting of
> 	  device online.  This can be done from sysfs now, but

Yes, extremely good, we _need_ this.

> 	  not from the driver (the driver does have a handle to the
> 	  device, but it would be better to have an interface in case we
> 	  need to do some additional operations).
>
> 	- Possibly add a scsi_remove_device that would always succeed and
> 	  a version of scsi_remove_host that calls scsi_remove_device for
> 	  all devices.  Though with the recent change to SCSI remove host to
> 	  allow non-sysfs device registration I do not believe we could
> 	  ensure devices would be cleaned up.

Meaning?
If memory stays tied up, we have a beautiful DoS attack, if disconnection
can be faked by software.

	Regards
		Oliver

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 18:54               ` Matthew Dharm
  2003-01-17 20:25                 ` Mike Anderson
@ 2003-01-17 20:26                 ` Oliver Neukum
  2003-01-17 20:49                   ` Mike Anderson
  2003-01-20 17:36                 ` Luben Tuikov
  2 siblings, 1 reply; 106+ messages in thread
From: Oliver Neukum @ 2003-01-17 20:26 UTC (permalink / raw)
  To: Matthew Dharm
  Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

> In a separate discussion with Mike, he mentioned that you can't
> scsi_remove_device() unless there are no pending commands.
>
> How the hell is an LLD supposed to assure that!?!?
>
> The minute I error a command and call scsi_done(), I can get a new one.
> Unless I lock out requests with scsi_block_requests(), but that comes with
> major warnings about needing to get unblocked.

If I understand the scsi code correctly, doing that will result in a memory
leak at least.  Perhaps exporting a function to declare a host's devices
offline might do the trick.  But as yet I haven't found out where the scsi
layer actually checks that flag.

> The way this should work is that the LLD calls scsi_remove_device(), and
> that cuts off the flow of commands.  The LLD can promise to error-out any
> pending commands in the device command queue.
>
> That is, unless scsi_block_requests() and scsi_unblock_requests() are more
> useful than the documentation suggests... block(), error all commands,
> unregister()... that would make some sense.  We could call
> scsi_block_requests() as soon as we know the unit is gone, and unregister()
> as soon as the queue is empty.

Sounds reasonable.

	Regards
		Oliver

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-17 20:26                 ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-17 20:49                   ` Mike Anderson
  0 siblings, 0 replies; 106+ messages in thread
From: Mike Anderson @ 2003-01-17 20:49 UTC (permalink / raw)
  To: Oliver Neukum
  Cc: Matthew Dharm, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Oliver Neukum [oliver@neukum.name] wrote:
>
> > In a separate discussion with Mike, he mentioned that you can't
> > scsi_remove_device() unless there are no pending commands.
> >
> > How the hell is an LLD supposed to assure that!?!?
> >
> > The minute I error a command and call scsi_done(), I can get a new one.
> > Unless I lock out requests with scsi_block_requests(), but that comes with
> > major warnings about needing to get unblocked.
>
> If I understand the scsi code correctly, doing that will result in a memory
> leak at least.  Perhaps exporting a function to declare a host's devices
> offline might do the trick.  But as yet I haven't found out where the scsi
> layer actually checks that flag.

It is returned by scsi_block_when_processing_errors (upper-level driver
opens, ioctls, etc.).

It is checked in scsi_decide_disposition on the scsi_softirq / scsi_done
side.

It is checked in the command init of the upper-level drivers during
scsi_prep_fn.

-andmike
--
Michael Anderson
andmike@us.ibm.com

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-17 18:54 ` Matthew Dharm 2003-01-17 20:25 ` Mike Anderson 2003-01-17 20:26 ` [linux-usb-devel] " Oliver Neukum @ 2003-01-20 17:36 ` Luben Tuikov 2003-01-20 18:23 ` Oliver Neukum 2003-01-20 20:08 ` David Brownell 2 siblings, 2 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-20 17:36 UTC (permalink / raw) To: Matthew Dharm Cc: Oliver Neukum, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list Matthew Dharm wrote: > In a separate discussion with Mike, he mentioned that you can't > scsi_remove_device() unless there are no pending commands. > > How the hell is an LLD supposed to assure that!?!? ABORT TASK/ABORT TASK SET. For a year now I've been trying to get something of the sort scsi_cancel_task/set(). It will send the aforementioned task management functions (TMF) (depending on the abilities of the device) to the device server (LLD). After which the initiator should NOT get a response to any already queued commands in the LLD. LLDD should be smarter if they do their own queuing and snoop this and act accordingly. After sending such a TMF to the LLD, one can clean all ULP queues (scsi, block, etc), knowing that there'd be no response to a command (which is now gone), and then actually remove the device. In my own drivers/mini-scsi-core, I do something like this: 1. mark the device off (stop queuing anything to it, return error or whatever), 2. send the aforementioned TMF, 2a) wait for current transfers to complete 3. cancel ULP queues. Now the device is cleanly off, and one can remove it/restart it/whatever. Note that this method is cleanly reversible (1. turn on, 2. LUN/device RESET (scsi layer), 3. start queuing (block layer)). (Note as well that I make distinction between LLD and LLDD, where the last D stands for ``driver'' in LLDD.) > The way this should work is that the LLD calls scsi_remove_device(), and > that cuts off the flow of commands. 
The LLD can promise to error-out any > pending commands in the device command queue. I take it you mean that the transport will tell the LLDD that the device is gone and it (LLDD) call the one above, SCSI Core to remove the device. Hmm, more thinking needs to be done here, as shouldn't this be handled by hotplugging? I.e. Targets do not *initiate* events. The transport can notify that the device is gone, but an ULP entity will call scsi_remove_device() not the other way around. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 17:36 ` Luben Tuikov @ 2003-01-20 18:23 ` Oliver Neukum 2003-01-20 18:56 ` Luben Tuikov 2003-01-21 3:31 ` Alan 2003-01-20 20:08 ` David Brownell 1 sibling, 2 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-20 18:23 UTC (permalink / raw) To: Luben Tuikov, Matthew Dharm Cc: Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list > I take it you mean that the transport will tell the LLDD that the device > is gone and it (LLDD) call the one above, SCSI Core to remove the device. > > Hmm, more thinking needs to be done here, as shouldn't this be handled > by hotplugging? I.e. Targets do not *initiate* events. > > The transport can notify that the device is gone, but an ULP entity will > call scsi_remove_device() not the other way around. NO! This is an insanely complicated scheme. We have no notification beforehand. User yanks out cable. That's it. No preparation at all. We as the writers of device drivers need a way to get rid of the device as we are notified of the physical disconnect. It is not our job to maintain devices in an undead state. And a scheme that goes subsystem driver -> hotplugging -> script finding corresponding devices -> script doing proc magic -> scsi layer notifying low level driver is _not_ sensible. It triples the amount of complexity. We need a simple scheme like 1. block further requests 2. kill old requests 3. remove device Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
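The three-step scheme in the message above (block further requests, kill old requests, remove device) can be sketched as a small user-space toy model in C. This is only an illustration of the ordering under stated assumptions: every name here (toy_dev, toy_submit, toy_disconnect) is hypothetical and is not a SCSI mid-layer or USB API, and the model is single-threaded, so it ignores the races the thread is worried about.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

/* Hypothetical toy device; not a kernel structure. */
struct toy_dev {
    bool   blocked;                  /* set in step 1: refuse new requests */
    bool   removed;                  /* set in step 3: device is torn down */
    int    status[TOY_MAX_PENDING];  /* 0 = in flight, <0 = completed with error */
    size_t npending;                 /* number of requests still in flight */
};

/* Queue one request. Fails with -ENODEV once the device is blocked or gone. */
int toy_submit(struct toy_dev *d)
{
    assert(d != NULL);
    if (d->blocked || d->removed)
        return -ENODEV;
    if (d->npending == TOY_MAX_PENDING)
        return -EBUSY;
    d->status[d->npending++] = 0;
    return 0;
}

/* Disconnect in the order proposed in the thread.
 * Returns how many in-flight requests were errored out. */
size_t toy_disconnect(struct toy_dev *d)
{
    size_t i, killed;

    assert(d != NULL);
    d->blocked = true;               /* 1. block further requests */

    killed = d->npending;            /* 2. kill old requests */
    for (i = 0; i < killed; i++)
        d->status[i] = -ENODEV;
    d->npending = 0;

    d->removed = true;               /* 3. remove device */
    return killed;
}
```

Because step 1 happens first, nothing can be queued between erroring out the old requests and removing the device, which is the property the scheme depends on.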
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-20 18:23 ` Oliver Neukum @ 2003-01-20 18:56 ` Luben Tuikov 2003-01-20 19:10 ` [linux-usb-devel] " Oliver Neukum 2003-01-20 19:50 ` David Brownell 2003-01-21 3:31 ` Alan 1 sibling, 2 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-20 18:56 UTC (permalink / raw) To: Oliver Neukum Cc: Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: >>I take it you mean that the transport will tell the LLDD that the device >>is gone and it (LLDD) call the one above, SCSI Core to remove the device. >> >>Hmm, more thinking needs to be done here, as shouldn't this be handled >>by hotplugging? I.e. Targets do not *initiate* events. >> >>The transport can notify that the device is gone, but an ULP entity will >>call scsi_remove_device() not the other way around. > > > NO! We're probably talking about two different things. > This is an insanely complicated scheme. > We have no notification beforehand. User yanks out cable. > That's it. No preparation at all. So now you have two possibilities: a) the transport supports this event notification, b) the transport doesn't support this event notification. Let me just elaborate a bit more here: since the transport/LLDD would know that the device has just disappeared (a)), it *will* return an error in due time when someone is trying to use it (and this is the same error as if there had never been such a device). But a removal of a device would probably have to start in a top-down approach, to free/release/etc resources/etc, rather than a bottom-up approach (I just cannot see how this would work...) In fact, this is the whole point of hotplugging, as there may be other closely related things which would have to be done. > We as the writers of device drivers need a way to get rid of the device > as we are notified of the physical disconnect. Yes, and as I explained earlier: you *may* get notified by the transport. 
> It is not our job to maintain devices in an undead state. Yes, Linus has said this here before and it's pointless for you to repeat it here. Furthermore, nothing I've said suggests that. Everyone agrees with this. > And a scheme that goes subsystem driver -> hotplugging -> script finding > corresponding devices -> script doing proc magic -> scsi layer notifying > low level driver is _not_ sensible. It triples the amount of complexity. > > We need a simple scheme like > 1. block further requests > 2. kill old requests > 3. remove device But there's nothing non-trivial about this scheme -- i.e. it's no brainer -- everyone knows that *those* are the minimum set of steps. Q: who initiates step 1? If it's the user, then we're wasting time discussing trivialities here (or maybe we're showing that we're working :-)) ). But if it's not the user, then... In general, I wasn't really discussing hotplugging, I was basically hinting that SCSI Core could use a few more functionalities, and that some things need to go away. -- Luben ------------------------------------------------------- This SF.NET email is sponsored by: FREE SSL Guide from Thawte are you planning your Web Server Security? Click here to get a FREE Thawte SSL guide and find the answers to all your SSL security issues. http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0026en _______________________________________________ linux-usb-devel@lists.sourceforge.net To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 18:56 ` Luben Tuikov @ 2003-01-20 19:10 ` Oliver Neukum 2003-01-20 19:50 ` David Brownell 1 sibling, 0 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-20 19:10 UTC (permalink / raw) To: Luben Tuikov Cc: Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list > > We as the writers of device drivers need a way to get rid of the device > > as we are notified of the physical disconnect. > > Yes, and as I explained earlier: you *may* get notified by the transport. That exactly is the point. There must be no maybe. Gone is gone. Failure is not an option here. Only LLDD knows reliably whether a device is gone and there can be no second guessing by higher levels. Consequently, the LLDD reports it and the higher layers delete the device as gently as possible, but delete it in any case. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 18:56 ` Luben Tuikov 2003-01-20 19:10 ` [linux-usb-devel] " Oliver Neukum @ 2003-01-20 19:50 ` David Brownell 1 sibling, 0 replies; 106+ messages in thread From: David Brownell @ 2003-01-20 19:50 UTC (permalink / raw) To: Luben Tuikov Cc: Oliver Neukum, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Luben Tuikov wrote: > But a removal of a device would probably have to start in top-down > approach, to free/release/etc resources/etc, rather than a bottom-up > approach (I just cannot see how this would work...) Well, the driver model core covers key parts of that. Or it was at least intended to ... in fact, today I think the bus drivers (like USB and PCI) need to own the top-down logic for cases like resume and hub/bridge/adapter (dis)connect. (And their bottom-up analogues for suspension.) > In general, I wasn't really discussing hotplugging, I was basically > hinting that SCSI Core could use a few more functionalities, and that > some things need to go away. There does seem to be some consensus on that point, which is a good place to start! - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 18:23 ` Oliver Neukum 2003-01-20 18:56 ` Luben Tuikov @ 2003-01-21 3:31 ` Alan 2003-01-21 7:17 ` Oliver Neukum 2003-01-21 13:30 ` James Bottomley 1 sibling, 2 replies; 106+ messages in thread From: Alan @ 2003-01-21 3:31 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list On Mon, 2003-01-20 at 18:23, Oliver Neukum wrote: > > The transport can notify that the device is gone, but an ULP entity will > > call scsi_remove_device() not the other way around. > > NO! > > This is an insanely complicated scheme. > We have no notification beforehand. User yanks out cable. > That's it. No preparation at all. > > We as the writers of device drivers need a way to get rid of the device > as we are notified of the physical disconnect. It is not our job to maintain > devices in an undead state. If you think about it rationally there isn't a lot that can be done higher up. At the point the hardware vanishes there may be other threads of execution already in your driver, so undead state is a reality you have to live with, at least briefly. Providing you refcount objects and defer freeing of resources it's not normally too terrible ^ permalink raw reply [flat|nested] 106+ messages in thread
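The "refcount objects and defer freeing" idea from the message above can be sketched roughly as follows. This is a user-space toy, not kernel code: toy_obj and its helpers are hypothetical names, the counter is a plain int rather than an atomic, and a real driver would need locking around the "gone" check. It only shows how an object can stay briefly "undead" after the hardware vanishes and be freed when the last user lets go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a device structure. */
struct toy_obj {
    int  refs;   /* number of holders, including the creator */
    bool gone;   /* hardware has vanished; refuse new holders */
};

/* Allocate an object with one reference held by the caller. */
struct toy_obj *toy_obj_new(void)
{
    struct toy_obj *o = calloc(1, sizeof(*o));
    if (o)
        o->refs = 1;
    return o;
}

/* Take a reference; returns NULL once the device has been marked gone. */
struct toy_obj *toy_obj_get(struct toy_obj *o)
{
    assert(o->refs > 0);
    if (o->gone)
        return NULL;
    o->refs++;
    return o;
}

/* Called on physical disconnect: existing holders keep working on a
 * still-valid object, but nobody new can attach to it. */
void toy_obj_mark_gone(struct toy_obj *o)
{
    o->gone = true;
}

/* Drop a reference. The memory is freed only when the last holder
 * lets go; returns true if this call freed the object. */
bool toy_obj_put(struct toy_obj *o)
{
    assert(o->refs > 0);
    if (--o->refs > 0)
        return false;
    free(o);
    return true;
}
```

The disconnect path only marks the object and drops its own reference; whoever is still inside the driver finishes with valid memory and triggers the actual free on its final put.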
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-21 3:31 ` Alan @ 2003-01-21 7:17 ` Oliver Neukum 2003-01-21 11:57 ` [linux-usb-devel] " Douglas Gilbert 2003-01-21 13:30 ` James Bottomley 1 sibling, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-21 7:17 UTC (permalink / raw) To: Alan Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list On Tuesday, 21 January 2003 04:31, Alan wrote: > On Mon, 2003-01-20 at 18:23, Oliver Neukum wrote: > > > The transport can notify that the device is gone, but an ULP entity > > > will call scsi_remove_device() not the other way around. > > > > NO! > > > > This is an insanely complicated scheme. > > We have no notification beforehand. User yanks out cable. > > That's it. No preparation at all. > > > > We as the writers of device drivers need a way to get rid of the device > > as we are notified of the physical disconnect. It is not our job to > > maintain devices in an undead state. > > If you think about it rationally there isn't a lot that can be done higher > up. At the point the hardware vanishes there may be other threads of > execution already in your driver, so undead state is a reality you have to > live with, at least briefly. No problem with that. I have a problem with notifying the SCSI layer and then waiting for an unlimited time until maybe the SCSI layer decides to inform me of a success. You see, disconnection has to work. Having to wait for an unlimited time is a kind of failure. I simply don't trust the SCSI layer. I've had too much trouble with it already. 
Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 7:17 ` Oliver Neukum @ 2003-01-21 11:57 ` Douglas Gilbert 2003-01-21 13:48 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: Douglas Gilbert @ 2003-01-21 11:57 UTC (permalink / raw) To: Oliver Neukum Cc: Alan, Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > <snip/> > > I simply don't trust the SCSI layer. I've had too much trouble with it > already. Hopefully we have some better building blocks and clearer code in 2.5 to look at this problem in the SCSI subsystem again. The first thing a LLDD should do when it _knows_ the device is gone is set scsi_device::online=0 ** which should stop all new commands being queued. Now if scsi_device::access_count is zero then we have no problems *** and most of the code we need is in place in latter half of scsi_remove_single_device(). The hard case is when scsi_device::access_count>0 which means open()s or mounts are active on that device. So sd, sr, st, osst and/or sg know about a file descriptor (or the block equivalent) that is associated with that "departed" scsi_device instance. I have code in sg in lk 2.4 to partially handle the case when detach is called on a device for which sg holds an open fd. Sg can handle this because it shadows scsi_device and scsi_cmnd instances. The next time an app tries to access that fd it gets a ENODEV (even sends out a SIGIO/POLL_HUP for advanced apps). I suspect life would not be so simple for sd and sr due to their close binding with the block subsystem. Another approach to this problem is to keep the scsi_device instances for departed devices around until the access_count drops to zero. 
One silly idea I had was to change the seldom used channel number to 1024 (or the next free number above that) to maintain the uniqueness of the host/channel/id/lun tuple **** and keep the original tuple available for the re-appearance of the departed device. Compounding the hard case is when commands are queued. Can these simply be ENODEV-ed back to the apps that own them? If so, that may help the access_count drop to zero facilitating scsi_device removal. Question: do we need to worry about hot unplugging of hosts? ** Since 'online' already has a usage (i.e. error recovery couldn't resurrect this device) perhaps something stronger like 'departed' is needed as well (or some sysfs mechanism). BTW sg sidesteps 'online' when a file descriptor is opened non-blocking. *** As Oliver has pointed out to me before, there are still opportunities for races when access_count is zero and an open()/mount is about to happen. **** We need a new, enhanced version of the SCSI_IOCTL_GET_IDLUN ioctl which as it stands can only convey 8 bit quantities for host, channel, target and lun; especially if we go to 64 bit luns. ... just some random thoughts ... fire away Doug Gilbert ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 11:57 ` [linux-usb-devel] " Douglas Gilbert @ 2003-01-21 13:48 ` Oliver Neukum 2003-01-21 18:22 ` Luben Tuikov 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-21 13:48 UTC (permalink / raw) To: dougg Cc: Alan, Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list On Tuesday, 21 January 2003 12:57, Douglas Gilbert wrote: > Oliver Neukum wrote: > > <snip/> > > > > I simply don't trust the SCSI layer. I've had too much trouble with it > > already. > > Hopefully we have some better building blocks and > clearer code in 2.5 to look at this problem in the SCSI > subsystem again. I surely hope so. Let's discuss it. > The first thing a LLDD should do when it _knows_ the device > is gone is set scsi_device::online=0 ** which should stop all The SCSI layer should export functions to do that and to do it to all a host's devices. > new commands being queued. Now if scsi_device::access_count > is zero then we have no problems *** and most of the code > we need is in place in latter half of scsi_remove_single_device(). Yes. And scsi_device_get() should check for a device having been unplugged. > The hard case is when scsi_device::access_count>0 which means > open()s or mounts are active on that device. So sd, sr, st, osst > and/or sg know about a file descriptor (or the block equivalent) What is the block equivalent? > that is associated with that "departed" scsi_device instance. I > have code in sg in lk 2.4 to partially handle the case when > detach is called on a device for which sg holds an open fd. > Sg can handle this because it shadows scsi_device and > scsi_cmnd instances. The next time an app tries to access > that fd it gets a ENODEV (even sends out a SIGIO/POLL_HUP for > advanced apps). I suspect life would not be so simple for sd > and sr due to their close binding with the block subsystem. 
> Another approach to this problem is to keep the scsi_device > instances for departed devices around until the access_count > drops to zero. One silly idea I had was to change the seldom That looks like a workable approach. > used channel number to 1024 (or the next free number above that) > to maintain the uniqueness of the host/channel/id/lun tuple **** > and keep the original tuple available for the re-appearance > of the departed device. > > Compounding the hard case is when commands are queued. Can > these simply be ENODEV-ed back to the apps that own them? > If so, that may help the access_count drop to zero facilitating > scsi_device removal. The LLDD can certainly return an error for the commands already queued. > Question: do we need to worry about hot unplugging of hosts? Yes, definitely yes. A USB storage device is a virtual host, since scsi ids would otherwise collide. Besides, PCMCIA SCSI hosts are not exactly brand new either. > ** Since 'online' already has a usage (i.e. error recovery > couldn't resurrect this device) perhaps something stronger > like 'departed' is needed as well (or some sysfs mechanism). > BTW sg sidesteps 'online' when a file descriptor is opened > non-blocking. Good idea. So a disconnection would look like this: scsi_set_offline_host(...); synchronize_kernel(); error_queued_commands(...); scsi_remove_host(); Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 13:48 ` Oliver Neukum @ 2003-01-21 18:22 ` Luben Tuikov 0 siblings, 0 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-21 18:22 UTC (permalink / raw) To: Oliver Neukum Cc: dougg, Alan, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > > So a disconnection would look like this: > 1. > scsi_set_offline_host(...); 2. > synchronize_kernel(); 3. > error_queued_commands(...); 4. > scsi_remove_host(); Not quite. You want to do 3 before 2, to get 2 going as soon as possible. Furthermore, 3 is partly in LLDD. 1 would take care of 3 in SCSI Core. See my previous (by date) post. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
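The reordering suggested in the message above (steps 1, 3, 2, 4) can be written down as a tiny user-space sketch that records the order in which the steps run. The four toy_* helpers are hypothetical stand-ins for the calls named in the thread (scsi_set_offline_host(), error_queued_commands(), synchronize_kernel(), scsi_remove_host()); the first two of those were only proposals in this discussion, and nothing here calls into a real kernel.

```c
#include <assert.h>
#include <string.h>

/* Log of the steps taken, e.g. "offline>error>sync>remove". */
static char toy_log[64];

/* Append one step name to the log, separated by '>'. */
static void toy_step(const char *name)
{
    /* +2 leaves room for the separator and the terminating NUL. */
    assert(strlen(toy_log) + strlen(name) + 2 <= sizeof(toy_log));
    if (toy_log[0] != '\0')
        strcat(toy_log, ">");
    strcat(toy_log, name);
}

/* Hypothetical stand-ins for the four steps discussed in the thread. */
static void toy_set_offline(void)  { toy_step("offline"); } /* step 1 */
static void toy_error_queued(void) { toy_step("error"); }   /* step 3 */
static void toy_synchronize(void)  { toy_step("sync"); }    /* step 2 */
static void toy_remove_host(void)  { toy_step("remove"); }  /* step 4 */

/* Run the disconnect in the suggested order: mark the host offline,
 * error out queued commands, wait for in-flight users, then remove. */
const char *toy_disconnect_order(void)
{
    toy_log[0] = '\0';
    toy_set_offline();
    toy_error_queued();
    toy_synchronize();
    toy_remove_host();
    return toy_log;
}
```

In this ordering the queued commands are failed back before the wait starts, so the wait in the third call only has to cover code paths that were already running when the host went offline.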
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 3:31 ` Alan 2003-01-21 7:17 ` Oliver Neukum @ 2003-01-21 13:30 ` James Bottomley 1 sibling, 0 replies; 106+ messages in thread From: James Bottomley @ 2003-01-21 13:30 UTC (permalink / raw) To: Alan Cc: Oliver Neukum, Luben Tuikov, Matthew Dharm, Mike Anderson, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list alan@lxorguk.ukuu.org.uk said: > Providing you refcount objects and defer freeing of resources its not > normally too terrible We already have a struct device, which is a ref counted object precisely for this purpose, embedded inside Scsi_Device. One of the issues doing this is fast reattachment: the device goes away then comes back before we've cleared the outstanding command queue. In the latter case, I think we could use some of the work Luben Tuikov has been doing to make the cmnd structures tie more closely to the device: as long as we remove the device from user visibility as soon as it is removed, we can keep the Scsi_Device object around until the ref count falls to zero, and even create a new one while this is going on. Commands attached to the old device still error out. It would still be nice if the trigger for Scsi_device removal came from a USB hotplug event at user level, but I'm not too bothered about that. James ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-20 17:36 ` Luben Tuikov 2003-01-20 18:23 ` Oliver Neukum @ 2003-01-20 20:08 ` David Brownell 2003-01-20 20:48 ` [linux-usb-devel] " Oliver Neukum 2003-01-20 22:16 ` Luben Tuikov 1 sibling, 2 replies; 106+ messages in thread From: David Brownell @ 2003-01-20 20:08 UTC (permalink / raw) To: Luben Tuikov Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Luben Tuikov wrote: >> The way this should work is that the LLD calls scsi_remove_device(), and >> that cuts off the flow of commands. The LLD can promise to error-out any >> pending commands in the device command queue. > > > I take it you mean that the transport will tell the LLDD that the device > is gone and it (LLDD) call the one above, SCSI Core to remove the device. > > Hmm, more thinking needs to be done here, as shouldn't this be handled > by hotplugging? I.e. Targets do not *initiate* events. Not exactly, but the bus driver ("transport"?) certainly does initiate reports like "here's a new device on the bus" or "that device is gone". That's when hotplugging kicks in (both in-kernel and in-userland). And the only way to access a device ("target") on the bus is to give a request to that bus driver. If, when servicing that request, the bus driver notices the device is gone ... that can act a lot like a device initiating a "device gone" event would look. > The transport can notify that the device is gone, but an ULP entity will > call scsi_remove_device() not the other way around. That's how USB works today: khubd shuts things down. Device drivers get disconnect() callbacks, just as when their modules are removed. EXCEPT that "khubd" is part of usbcore (roughly analogous to parts of the scsi mid-layer) ... so the drivers acting as host side proxies for the target hardware ("usb device") are purely reactive. 
Their only roles in hotplug scenarios are to bind to devices (when a new one appears, using probe callbacks) or unbind from them (when one goes away, using disconnect callbacks). Those disconnect() callbacks have a few key responsibilities, very much including shutting down the entire higher level I/O queue to that device. I think you're saying that SCSI drivers don't have such a responsibility (unlike USB or PCI) ... if so, that would seem to be worth changing. - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 20:08 ` David Brownell @ 2003-01-20 20:48 ` Oliver Neukum 2003-01-20 21:24 ` David Brownell 2003-01-20 22:16 ` Luben Tuikov 1 sibling, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-20 20:48 UTC (permalink / raw) To: David Brownell, Luben Tuikov Cc: Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Monday, 20 January 2003 21:08, David Brownell wrote: > Luben Tuikov wrote: > >> The way this should work is that the LLD calls scsi_remove_device(), and > >> that cuts off the flow of commands. The LLD can promise to error-out > >> any pending commands in the device command queue. > > > > I take it you mean that the transport will tell the LLDD that the device > > is gone and it (LLDD) call the one above, SCSI Core to remove the device. > > > > Hmm, more thinking needs to be done here, as shouldn't this be handled > > by hotplugging? I.e. Targets do not *initiate* events. > > Not exactly, but the bus driver ("transport"?) certainly does initiate > reports like "here's a new device on the bus" or "that device is gone". > That's when hotplugging kicks in (both in-kernel and in-userland). > > And the only way to access a device ("target") on the bus is to give a > request to that bus driver. If, when servicing that request, the bus > driver notices the device is gone ... that can act a lot like a device > initiating a "device gone" event would look. Correct. As a LLDD is the lowest layer, these are equivalent things. Only a LLDD can positively detect a device or a bus going away. > > The transport can notify that the device is gone, but an ULP entity will > > call scsi_remove_device() not the other way around. > > That's how USB works today: khubd shuts things down. Device drivers > get disconnect() callbacks, just as when their modules are removed. > > EXCEPT that "khubd" is part of usbcore (roughly analogous to parts > of the scsi mid-layer) ... 
so the drivers acting as host side proxies > for the target hardware ("usb device") are purely reactive. Their > only roles in hotplug scenarios are to bind to devices (when a new > one appears, using probe callbacks) or unbind from them (when one > goes away, using disconnect callbacks). That model cannot be applied to SCSI as it is much more diverse in the number of bus types it supports. USB can do it, because it knows about hubs. SCSI cannot, as there are no hubs in SCSI. > Those disconnect() callbacks have a few key responsibilities, very > much including shutting down the entire higher level I/O queue to > that device. I think you're saying that SCSI drivers don't have > such a responsibility (unlike USB or PCI) ... if so, that would > seem to be worth changing. If the scsi layer cannot on its own detect that a device or a bus is gone, there'll be no sense in having a callback. It's just a complication. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-20 20:48 ` [linux-usb-devel] " Oliver Neukum @ 2003-01-20 21:24 ` David Brownell 2003-01-20 21:51 ` [linux-usb-devel] " Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: David Brownell @ 2003-01-20 21:24 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > That model cannot be applied to SCSI as it is much more diverse > in the number of bus types it supports. > USB can do it, because it knows about hubs. SCSI cannot, > as there are no hubs in SCSI. Hubs are irrelevant here, the key functionality is noticing hardware addition/disconnect. Parts of it can be done in bus adapter code, parts of it can't. SCSI probes LUNS in much the same way khubd probes hub ports, and as I recall most of that logic isn't any more specific to the adapter than virtual root hub code is for USB. >>Those disconnect() callbacks have a few key responsibilities, very >>much including shutting down the entire higher level I/O queue to >>that device. I think you're saying that SCSI drivers don't have >>such a responsibility (unlike USB or PCI) ... if so, that would >>seem to be worth changing. > > > If the scsi layer cannot on its own detect that a device or a bus is gone, > there'll be no sense in having a callback. It's just a complication. Erm ... which of the three SCSI layers are you talking about? I was talking about the highest level, which is precisely the layer I think has been identified as already needing to know when to shut down the I/O queues (sd_mod and friends). 
- Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 21:24 ` David Brownell @ 2003-01-20 21:51 ` Oliver Neukum 2003-01-20 22:26 ` David Brownell 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-20 21:51 UTC (permalink / raw) To: David Brownell Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Monday, 20 January 2003 22:24, David Brownell wrote: > Oliver Neukum wrote: > > That model cannot be applied to SCSI as it is much more diverse > > in the number of bus types it supports. > > USB can do it, because it knows about hubs. SCSI cannot, > > as there are no hubs in SCSI. > > Hubs are irrelevant here, the key functionality is noticing > hardware addition/disconnect. Parts of it can be done in bus > adapter code, parts of it can't. SCSI probes LUNS in much > the same way khubd probes hub ports, and as I recall most of > that logic isn't any more specific to the adapter than virtual > root hub code is for USB. I should be more specific. The SCSI is different for several reasons: - we are talking about bus as well as device detection - there's no common way to probe for devices (the probing of LUNs works only on conventional busses) - many SCSI devices are not really SCSI devices. They just use the command set SCSI hotplug detection doesn't work for the same reason that USB can handle only detection of devices by itself. Busses on the other hand are not handled by USB itself. > >>Those disconnect() callbacks have a few key responsibilities, very > >>much including shutting down the entire higher level I/O queue to > >>that device. I think you're saying that SCSI drivers don't have > >>such a responsibility (unlike USB or PCI) ... if so, that would > >>seem to be worth changing. > > > > If the scsi layer cannot on its own detect that a device or a bus is > > gone, there'll be no sense in having a callback. It's just a > > complication. > > Erm ... 
which of the three SCSI layers are you talking about? I was > talking about the highest level, which is precisely the layer I think > has been identified as already needing to know when to shut down the > I/O queues (sd_mod and friends). In SCSI view device and bus disconnection is recognised by the lowest level. As it knows nothing about the high layers, it notifies the midlayer which in turn notifies the high level drivers. What should a callback do? The low level driver cannot do more than notify. I don't see what the midlayer could do with a callback, but I defer judgement here to the SCSI people, but definitely the LLDD has no use for a callback. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 21:51 ` [linux-usb-devel] " Oliver Neukum @ 2003-01-20 22:26 ` David Brownell 2003-01-20 23:00 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: David Brownell @ 2003-01-20 22:26 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > I should be more specific. Yes... > The SCSI is different for several reasons: > - we are talking about bus as well as device detection > - there's no common way to probe for devices > (the probing of LUNs works only on conventional busses) So the SCSI stack needs to support more than one model for device/bus detection. This can't be news. And some of them have to handle "conventional" busses, like USB and PCI; maybe even handle booting off them... > - many SCSI devices are not really SCSI devices. They just use > the command set > > SCSI hotplug detection doesn't work for the same reason that USB > can handle only detection of devices by itself. Busses on the other > hand are not handled by USB itself. Just how is it that USB doesn't handle USB? "B" == "Bus" ... :) If you mean that HCs hook up to a different bus (often PCI), with its own hotplug support, that doesn't seem so different from SCSI HBAs hooking up to such busses (often PCI) and cascading the same hotplug support... >>Erm ... which of the three SCSI layers are you talking about? I was >>talking about the highest level,.. > > > In SCSI view device and bus disconnection is recognised by the lowest > level. As it knows nothing about the high layers, it notifies the midlayer So you were talking past what I said about notifying that highest level, not disagreeing with it. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 22:26 ` David Brownell @ 2003-01-20 23:00 ` Oliver Neukum 2003-01-21 0:44 ` David Brownell 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-20 23:00 UTC (permalink / raw) To: David Brownell Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > So the SCSI stack needs to support more than one model for > device/bus detection. This can't be news. And some of them > have to handle "conventional" busses, like USB and PCI; maybe > even handle booting off them... Until very recently it was news. And they still haven't fully comprehended the implications. > If you mean that HCs hook up to a different bus (often PCI), with its > own hotplug support, that doesn't seem so different from SCSI HBAs > hooking up to such busses (often PCI) and cascading the same hotplug > support... Right. Only that in SCSI it's that way for devices as well, not just busses. Therefore removal detection and notification is a strict bottom to top process. > >>Erm ... which of the three SCSI layers are you talking about? I was > >>talking about the highest level,.. > > > > In SCSI view device and bus disconnection is recognised by the lowest > > level. As it knows nothing about the high layers, it notifies the > > midlayer > > So you were talking past what I said about notifying that highest level, > not disagreeing with it. I was trying to make the point that callbacks have no place in that process. It must go bottom to top and that's it. And there must be no error conditions on the way. Refusing to take notice of a device removal is just not an option. This is exactly what the current SCSI idea of an API to do bus removal does. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-20 23:00 ` Oliver Neukum @ 2003-01-21 0:44 ` David Brownell 2003-01-21 0:50 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: David Brownell @ 2003-01-21 0:44 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: >>So you were talking past what I said about notifying that highest level, >>not disagreeing with it. > > > I was trying to make the point that callbacks have no place in that process. If so, you didn't persuade me ... > It must go bottom to top and that's it. ... because those disconnect() callbacks are exactly how USB and PCI deliver that notification to the "top" level, and you've already agreed that SCSI needs to accommodate those models. So clearly they have at least that much of a place. > Refusing to take notice of a device removal is just not an option. > This is exactly what the current SCSI idea of an API to do bus removal does. I perceive violent agreement that a change is needed in that area. But the next step there would seem to be a patch to the SCSI APIs, unless I misunderstood what Matt was saying about the issue he ran into when making usb-storage use the enumeration facilities in the current SCSI mid/low layers. - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 0:44 ` David Brownell @ 2003-01-21 0:50 ` Oliver Neukum 2003-01-21 18:16 ` Luben Tuikov 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-21 0:50 UTC (permalink / raw) To: David Brownell Cc: Luben Tuikov, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Tuesday, 21 January 2003 01:44, David Brownell wrote: > Oliver Neukum wrote: > >>So you were talking past what I said about notifying that highest level, > >>not disagreeing with it. > > > > I was trying to make the point that callbacks have no place in that > > process. > > If so, you didn't persuade me ... > > > It must go bottom to top and that's it. > > ... because those disconnect() callbacks are exactly how USB and PCI > deliver that notification to the "top" level, and you've already agreed > that SCSI needs to accommodate those models. So clearly they have at least > that much of a place. Disconnect is not really a callback. There's a distinct lack of a back movement here. khubd -> usbcore -> disconnect() in driver -> [layer on top] The proposed API in SCSI looks like: <bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with destroy_slave() and that's not OK. > > Refusing to take notice of a device removal is just not an option. > > This is exactly what the current SCSI idea of an API to do bus removal > > does. > > I perceive violent agreement that a change is needed in that area. > > But the next step there would seem to be a patch to the SCSI APIs, > unless I misunderstood what Matt was saying about the issue he ran > into when making usb-storage use the enumeration facilities in the > current SCSI mid/low layers. I have some horrible notions when I see what APIs grace the SCSI layer. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 0:50 ` Oliver Neukum @ 2003-01-21 18:16 ` Luben Tuikov 2003-01-21 19:00 ` Oliver Neukum 2003-01-22 21:30 ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell 0 siblings, 2 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-21 18:16 UTC (permalink / raw) To: Oliver Neukum Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > > Disconnect is not really a callback. There's a distinct lack of a back movement here. > khubd -> usbcore -> disconnect() in driver -> [layer on top] > > The proposed API in SCSI looks like: > <bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with destroy_slave() > and that's not OK. No, not quite. When the Low Level Device Driver (LLDD), being the transport portal, notices that the device is going away or has gone away from the ``fabric'' (wlg), it will fire a device-gone event with the kernel. *Not* necessarily with SCSI Core, in fact I'd rather it didn't, but with a well defined kernel entry for device-gone events. At the same time the LLDD will start returning TARGET gone, or whatever is appropriate to newly queued commands, and error out all internally queued commands (if it does its own queuing). (I've seen this work nicely on mount and read/write(2) and fsck.) I.e. the ``synchronization'' has started already by the LLDD erroring out commands, new and queued. All the while the kernel has started higher level cleaning up, decrementing ref counts, etc, stuff which may not be so easy to be cleaned up just by LLDD returning TARGET error. Even though, good design dictates that complete cleaning up should happen just by the LLDD returning TARGET error (e.g. on mount), we *have* to allow for this immediate high level entry point (as I mentioned above) notification, which will be kind of ``meeting place'' for events like this. 
Depending on what needs to be done at those ``higher'' levels, the event will eventually bubble down to the SCSI Core with something like scsi_remove_device() which will do slave_destroy() in the driver. The point is that at that point in time, it will be *safe* to do scsi_remove_device() as all ULP have already been notified, and they've relinquished their use of the LLD (Low Level Device), thus the safety. But there's no such thing as ``waiting around indefinitely'' or ``blocking wait'' as you've suggested in some of your emails. Even if this UL entry point doesn't do anything, ref counts should go to zero, after all users error out on this device, at which point the user can remove the device from *the system* by hand/old method through proc or whatever finalizes for 2.6. Those are more or less my thoughts on the subject. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-21 18:16 ` Luben Tuikov @ 2003-01-21 19:00 ` Oliver Neukum 2003-01-21 20:02 ` [linux-usb-devel] " Luben Tuikov 2003-01-22 21:30 ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell 1 sibling, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-21 19:00 UTC (permalink / raw) To: Luben Tuikov Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Tuesday, 21 January 2003 19:16, Luben Tuikov wrote: > Oliver Neukum wrote: > > Disconnect is not really a callback. There's a distinct lack of a back > > movement here. khubd -> usbcore -> disconnect() in driver -> [layer on > > top] > > > > The proposed API in SCSI looks like: > > <bus system> -> LLD -> midlayer -> top layer -> midlayer -> LLD with > > destroy_slave() and that's not OK. > > No, not quite. > > When the Low Level Device Driver (LLDD), being the transport portal, > notices that the device is going away or has gone away from the > ``fabric'' (wlg), it will fire a device-gone event with the kernel. > *Not* necessarily with SCSI Core, in fact I'd rather it didn't, > but with a well defined kernel entry for device-gone events. Well, we are in feature freeze. I see no alternative but to notify the mid layer. Who else but the mid layer knows what a physical device is logically associated with? > At the same time the LLDD will start returning TARGET gone, or > whatever is appropriate to newly queued commands, and error out > all internally queued commands (if it does its own queuing). > (I've seen this work nicely on mount and read/write(2) and fsck.) Right. > I.e. the ``synchronization'' has started already by the LLDD erroring > out commands, new and queued. > > All the while the kernel has started higher level cleaning up, > decrementing ref counts, etc, stuff which may not be so easy to be > cleaned up just by LLDD returning TARGET error. 
Even though, You cannot really make anything depend on errors returned, because there simply may not be any commands queued. You can make it a requirement for an LLDD to return all commands in flight with an error, but you can do little with these errors. Basically you have to treat them like uncorrectable errors, except maybe for the error code returned to user space. But the processing of the disconnect itself should be triggered by the LLDD's notification, because it's the only indication of an unplug event you are sure to get. > good design dictates that complete cleaning up should happen just > by the LLDD returning TARGET error (e.g. on mount), we *have* to allow > for this immediate high level entry point (as I mentioned above) > notification, which will be kind of ``meeting place'' for events like this. That I don't understand. It would seem to me to be cleanest to have just one path to process a disconnect event. > Depending on what needs to be done at those ``higher'' levels, the > event will eventually bubble down to the SCSI Core with something like > scsi_remove_device() which will do slave_destroy() in the driver. > > The point is that at that point in time, it will be *safe* to do > scsi_remove_device() as all ULP have alreay been notified, and they've > relinquished their use of the LLD (Low Level Device), thus the safety. But there can be no users of the LLDD at this point. There can of course be references to devices and hosts, but not really uses. After we have done a notification of the event the first things to do are to make further opening of the device fail and make sure no more commands are sent to the device. Likewise all queued commands have returned with an error. So at this point it's impossible to use an unplugged device. > But there's no such thing as ``waiting around indefinitely'' or > ``blocking wait'' as you've suggested in some of your emails. 
> > Even if this UL entry point doesn't do anything, ref counts should > go to zero, after all users error out on this device, at which point > the user can remove the device from *the system* by hand/old method > through proc or whatever finalizes for 2.6. You cannot be sure that reference counts will go to zero ever. You can be sure that they won't increase as you can fail any operation that would cause them to increase, but you cannot force userland to close its fds. And waiting for somebody to remove a device is wrong. It's gone physically. There's no choice but to remove it. The refcounts can tell you when to free data structures associated with devices, but what else do you want them to do? Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
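A minimal user-space sketch of the reference-count behaviour Oliver describes above: once the device is marked gone, attempts to take a new reference fail, so the count can only fall, and the data structure is freed only when the last holder lets go, however long that takes. All names here (struct dev_model, dev_get, dev_put, dev_unplug) are hypothetical and are not kernel or SCSI APIs; the model is single-threaded and ignores locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a device object; not a kernel structure. */
struct dev_model {
    int refcount;   /* references held by the driver, openers, mounts */
    bool gone;      /* set as soon as the hardware is unplugged */
};

/* Allocate a device holding one reference on behalf of the driver. */
struct dev_model *dev_alloc(void)
{
    struct dev_model *d = calloc(1, sizeof(*d));
    if (d)
        d->refcount = 1;
    return d;
}

/* New references are refused after the unplug, so the count cannot grow. */
bool dev_get(struct dev_model *d)
{
    if (d->gone)
        return false;
    d->refcount++;
    return true;
}

/* Drop one reference; free on the last one. Returns true if freed. */
bool dev_put(struct dev_model *d)
{
    if (--d->refcount > 0)
        return false;
    free(d);
    return true;
}

/* Unplug: mark the device gone immediately, then drop the driver's
 * reference. The memory survives while other holders remain. */
bool dev_unplug(struct dev_model *d)
{
    d->gone = true;
    return dev_put(d);
}
```

In this sketch the removal itself never waits on the count; the count decides only when the memory is released, which is the limited role Oliver assigns to it.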
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 19:00 ` Oliver Neukum @ 2003-01-21 20:02 ` Luben Tuikov 2003-01-21 21:02 ` Alan Stern 0 siblings, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-21 20:02 UTC (permalink / raw) To: Oliver Neukum Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > On Tuesday, 21 January 2003 19:16, Luben Tuikov wrote: > >> >>When the Low Level Device Driver (LLDD), being the transport portal, >>notices that the device is going away or has gone away from the >>``fabric'' (wlg), it will fire a device-gone event with the kernel. >>*Not* necessarily with SCSI Core, in fact I'd rather it didn't, >>but with a well defined kernel entry for device-gone events. > > > Well, we are in feature freeze. I see no alternative but to notify > the mid layer. Who else but the mid layer knows what a physical device > is logically associated with? Yes, we're in feature freeze. I realize this and the fact that this may be 2.7 work, but it's nevertheless worth brainstorming the issue. I think one needs to notify at a higher level -- (some) decision making may/will be made there. SCSI Core will be notified eventually, or maybe right away. For all we know, the policy of removing a device could be to just go into SCSI Core with the removal -- but the point is that you need to notify at a higher level. In due time, SCSI Core has no problem with a device disappearing. As I mentioned already, the event will ``bubble down'' to SCSI Core, at some point or immediately. > >>At the same time the LLDD will start returning TARGET gone, or >>whatever is appropriate to newly queued commands, and error out >>all internally queued commands (if it does its own queuing). >>(I've seen this work nicely on mount and read/write(2) and fsck.) > > > Right. I've been saying (repeating) this for my last 3-4 emails. Glad to hear we've come to some kind of agreement. :-) >>I.e. 
the ``synchronization'' has started already by the LLDD erroring >>out commands, new and queued. >> >>All the while the kernel has started higher level cleaning up, >>decrementing ref counts, etc, stuff which may not be so easy to be >>cleaned up just by LLDD returning TARGET error. Even though, > > > You cannot really make anything depend on errors returned, because > there simply may not be any commands queued. You can make it a Exactly. The more reason to have a notification even at a higher level, because *if* you had users and whatnot using the device then you'd want to let them/it know. You need a higher level hook. I can see a ton of uses for such a higher level hook. > requirement for an LLDD to return all commands in flight with an error, > but you can do little with these errors. Basically you have to treat them As I've said, I've seen this method work nicely with mount and fsck -- they time out almost right away, with different errors of course, but LLDD returns TARGET error all the while. So, either way (users or none), a higher level hook would seem like a more general approach. > like uncorrectable errors, except maybe for the error code returned to > user space. But the processing of the disconnect itself should be triggered > by the LLDD's notification, because it's the only indication of an unplug > event you are sure to get. I think this is the first thing I mentioned yesterday when I wrote ``transport initiated event''. >>good design dictates that complete cleaning up should happen just >>by the LLDD returning TARGET error (e.g. on mount), we *have* to allow >>for this immediate high level entry point (as I mentioned above) >>notification, which will be kind of ``meeting place'' for events like this. > > > That I don't understand. It would seem to me to be cleanest to have just > one path to process a disconnect event. 
I also think that there should be one path: LLDD starts returning TARGET error and all the while cleaning up has started from the top. >>Depending on what needs to be done at those ``higher'' levels, the >>event will eventually bubble down to the SCSI Core with something like >>scsi_remove_device() which will do slave_destroy() in the driver. >> >>The point is that at that point in time, it will be *safe* to do >>scsi_remove_device() as all ULP have alreay been notified, and they've >>relinquished their use of the LLD (Low Level Device), thus the safety. > > > But there can be no users of the LLDD at this point. There can of > course be references to devices and hosts, but not really uses. The more reason for a higher level hook -- you see, it generalizes the cases of users and no users using the device -- you have it covered both ways. See my comments above. > After we have done a notification of the event the first things to do > are to make further opening of the device fail and make sure no more > commands are sent to the device. Likewise all queued commands have > returned with an error. So at this point it's impossible to use an unplugged > device. So here I take it you agree with me. >>But there's no such thing as ``waiting around indefinitely'' or >>``blocking wait'' as you've suggested in some of your emails. >> >>Even if this UL entry point doesn't do anything, ref counts should >>go to zero, after all users error out on this device, at which point >>the user can remove the device from *the system* by hand/old method >>through proc or whatever finalizes for 2.6. > > > You cannot be sure that reference counts will go to zero ever. > You can be sure that they won't increase as you can fail any operation that > would cause them to increase, but you cannot force userland to close its fds. > And waiting for somebody to remove a device is wrong. It's gone physically. > There's no choice but to remove it. 
The refcounts can tell you when to free > data structures associated with devices, but what else do you want them to do? I agree with all this. What I was saying is the flexibility of the policy. Yes, it is correct that we cannot force userland to close its fd's. Just as you cannot force a parent process to collect child exit status :-) . (Idea!) I'm glad to see we're coming to an agreement. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 20:02 ` [linux-usb-devel] " Luben Tuikov @ 2003-01-21 21:02 ` Alan Stern 2003-01-22 21:50 ` Luben Tuikov 0 siblings, 1 reply; 106+ messages in thread From: Alan Stern @ 2003-01-21 21:02 UTC (permalink / raw) To: Luben Tuikov Cc: Oliver Neukum, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Here's another question to add to your discussion. When a device is unplugged, the system's representation of that device can't be removed immediately; there may be open fd's, mounts, pointers, and so on. Until the time comes when all these handles are released, all interaction with the device has to fail, one way or another. Whose responsibility is it to fail these interactions? For something simple, like a USB serial port, it might turn out that the low-level device driver gets all these requests and then fails them. That means the driver has to keep track of the fact that the device is no longer connected until some reference count goes to 0. For SCSI and emulated SCSI devices, it might be one of the SCSI layers that keeps track of the fact that the device has disconnected. Or it might be somewhere else in the kernel. It would be nice to have some sort of coherent plan for how to handle this. In fact, it ought to be part of the device-driver model that underlies sysfs. But so far as I am aware, there is currently nothing in the sysfs documentation to address the problem. Alan Stern ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 21:02 ` Alan Stern @ 2003-01-22 21:50 ` Luben Tuikov 2003-01-22 22:46 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-22 21:50 UTC (permalink / raw) To: Alan Stern Cc: Oliver Neukum, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern wrote: > Here's another question to add to your discussion. > > When a device is unplugged, the system's representation of that device > can't be removed immediately; there may be open fd's, mounts, pointers, > and so on. Until the time comes when all these handles are released, all > interaction with the device has to fail, one way or another. > > Whose responsibility is it to fail these interactions? The transport. When a device is plugged to the SAN/fabric (wlg) it may not be the case that all initiators will know about it. For this reason the transport itself, not SCSI Core, not LLDD*, will decide if the CDB is deliverable. * An LLDD may keep a table of seen devices (depending on the transport it provides a portal to), and thus the decision may be made there, but this doesn't have to be the case. (Just think of SANs/IP SANs.) > For something simple, like a USB serial port, it might turn out that the > low-level device driver gets all these requests and then fails them. That > means the driver has to keep track of the fact that the device is no > longer connected until some reference count goes to 0. A LLDD doesn't have to keep reference counts. In the simple case you mention above, it will check that the device is no longer reachable and will return TARGET error, which will bubble up the layers, or the Execute Command remote procedure will end with Service Delivery Failure as the Service Response -- exactly the same effect as far as SCSI Core is concerned. The Service Response is Service Delivery Failure, in which case the Status byte is undefined. 
I've been wanting to include a Service Response into scsi_cmnd and rename ``result'' into ``status'' to be closer to the SCSI Architecture, for some time now, but we'll see when this will happen. Newer drivers will make use of Service Response code, and be able to address only by (target, lun) rather than (bus, target, lun), and target may not be an int anymore. But this is 2.7 stuff, or maybe a separately distributed SCSI Core and LLDDs subsystem... > For SCSI and emulated SCSI devices, it might be the one of the SCSI layers > that keeps track of the fact that the device has disconnected. Or it > might be somewhere else in the kernel. Right. For this reason I'm thinking that a higher level hook on device disconnect (if reported by the transport) might be needed. This doesn't mean that it has to do higher-level things, :-) , it might just call SCSI Core, but as long as it goes through a higher layer. > It would nice to have some sort of coherent plan for how to handle this. > In fact, it ought to be part of the device-driver model that underlies > sysfs. But so far as I am aware, there is currently nothing in the sysfs > documentation to address the problem. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-22 21:50 ` Luben Tuikov @ 2003-01-22 22:46 ` Oliver Neukum 2003-01-23 17:46 ` Luben Tuikov 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-22 22:46 UTC (permalink / raw) To: Luben Tuikov, Alan Stern Cc: David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > > Whose responsibility is it to fail these interactions? > > The transport. > > When a device is plugged to the SAN/fabric (wlg) it may not be the case > that all initiators will know about it. For this reason the transport > itself, not SCSI Core, not LLDD*, will decide if the CDB is deliverable. > > * An LLDD may keep a table of seen devices (depending on the transport > it provides a portal to), and thus the decision may be made there, but > this doesn't have to be the case. > > (Just think of SANs/IP SANs.) Not all the world is a SAN. USB has no possibility to even try an interaction after the device is gone. We have to handle this flexibly. In fact, if a device can vanish without a LLDD knowing about it, this is purely a problem of the SCSI layer. > > For something simple, like a USB serial port, it might turn out that the > > low-level device driver gets all these requests and then fails them. > > That means the driver has to keep track of the fact that the device is no > > longer connected until some reference count goes to 0. > > A LLDD doesn't have to keep reference counts. In the simple case > you mention above, it will check that the device is no longer reachable > and will return TARGET error, which will bubble up the layers, or the That is something that is impossible to some LLDDs. We have to keep a record about devices and busses we can reach and can delete these records only after we positively know that no more commands will come down to the LLDD. The alternative would be to check a table of available devices for every command. 
That means that we have to have a way to ensure that no more commands will reach the LLDD which can be triggered without any commands to be executed at all. This functionality has to come from the scsi mid layer. > Newer drivers will make use of Service Response code, and be able to > address only by (target, lun) rather than (bus, target, lun), and > target may not be an int anymore. But this is 2.7 stuff, or maybe > a separately distributed SCSI Core and LLDDs subsystem... Yes, but we need a solution for 2.6. And it has to be reasonably simple. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-22 22:46 ` Oliver Neukum @ 2003-01-23 17:46 ` Luben Tuikov 2003-01-23 18:19 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-23 17:46 UTC (permalink / raw) To: Oliver Neukum Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > > Not all the world is a SAN. USB has no possibility to even try an interaction > after the device is gone. We have to handle this flexibly. Thus the example in the original post. I.e. for simple transports whose portals get notified when a device is plugged off (USB), the LLDD can notify SCSI Core, by setting a state variable in scsi_device. In which case SCSI Core can answer with the proper TARGET error code. (This was outlined before, scsi_command->online:1 ...) > In fact, if a device > can vanish without a LLDD knowing about it, this is purely a problem of the > SCSI layer. No, of course not. (Think of IP.) When a device vanishes and LLDD doesn't know about it (more complicated transports), the CDB will return with the proper Service Response, since the transport(s) won't be able to deliver it. This will bubble up through SCSI Core and the error returned will have to be the same as that of the simpler transports, as outlined above. >>>For something simple, like a USB serial port, it might turn out that the >>>low-level device driver gets all these requests and then fails them. >>>That means the driver has to keep track of the fact that the device is no >>>longer connected until some reference count goes to 0. >> >>A LLDD doesn't have to keep reference counts. In the simple case >>you mention above, it will check that the device is no longer reachable >>and will return TARGET error, which will bubble up the layers, or the > > > That is something that is impossible to some LLDDs. 
We have to keep > a record about devices and busses we can reach and can delete these > records only after we positively know that no more commands will come > down to the LLDD. But USB does keep such a record, doesn't it? *Even if it doesn't*, as outlined above, it can set a state variable in scsi_device and SCSI Core can take over for error return values. > The alternative would be to check a table of available devices for every > command. A command is destined to a device, at SCSI Core queuing logic a check can be made... No need to go through tables of devices. > That means that we have to have a way to ensure that no more commands > will reach the LLDD which can be triggered without any commands to be > executed at all. This functionality has to come from the scsi mid layer. For simple transports yes; for more complicated ones, the CDB will not be able to be delivered, and will return with error. >>Newer drivers will make use of Service Response code, and be able to >>address only by (target, lun) rather than (bus, target, lun), and >>target may not be an int anymore. But this is 2.7 stuff, or maybe >>a separately distributed SCSI Core and LLDDs subsystem... > > > Yes, but we need a solution for 2.6. > And it has to be reasonably simple. I think we have enough ideas to implement a reasonable one. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
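A small sketch of the scheme Luben outlines above, under stated assumptions: a simple transport (USB-like) clears a per-device state flag when it learns of the unplug, and the core fails commands at queuing time without calling the driver; a more complicated transport (SAN-like) never sets the flag and simply fails delivery. Both paths yield the same error code. The names (struct sdev_model, core_queue_command, lldd_report_unplug, CMD_TARGET_GONE) are invented for illustration and are not the real scsi_device fields or SCSI Core functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; the point is that both paths share one. */
enum cmd_result { CMD_OK = 0, CMD_TARGET_GONE = 1 };

/* Hypothetical per-device state kept by the core. */
struct sdev_model {
    bool online;      /* cleared by a transport that notices the unplug */
    bool reachable;   /* what the transport would find on delivery */
    int delivered;    /* how many commands were handed to the transport */
};

/* Transport side: try to deliver, report failure if the target is gone. */
enum cmd_result transport_deliver(struct sdev_model *sdev)
{
    sdev->delivered++;
    return sdev->reachable ? CMD_OK : CMD_TARGET_GONE;
}

/* Core side: one flag check at queuing time, no device tables to search. */
enum cmd_result core_queue_command(struct sdev_model *sdev)
{
    if (!sdev->online)
        return CMD_TARGET_GONE;   /* failed without calling the driver */
    return transport_deliver(sdev);
}

/* Simple transports call this when they are told about the unplug. */
void lldd_report_unplug(struct sdev_model *sdev)
{
    sdev->online = false;
    sdev->reachable = false;
}
```

The upper layers see CMD_TARGET_GONE either way, which is the property Luben asks for: the error must not depend on whether the transport knew about the disconnect in advance.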
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 17:46 ` Luben Tuikov @ 2003-01-23 18:19 ` Oliver Neukum 2003-01-23 19:07 ` Luben Tuikov 2003-01-23 20:41 ` A different look at block device hotswap in the Linux kernel Steven Dake 0 siblings, 2 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-23 18:19 UTC (permalink / raw) To: Luben Tuikov Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thursday, 23 January 2003 18:46, Luben Tuikov wrote: > Oliver Neukum wrote: > > Not all the world is a SAN. USB has no possibility to even try an > > interaction after the device is gone. We have to handle this flexibly. > > Thus the example in the original post. I.e. for simple transports whose > portals get notified when a device is plugged off (USB), the LLDD > can notify SCSI Core, by setting a state variable in scsi_device. > In which case SCSI Core can answer with the proper TARGET error code. > (This was outlined before, scsi_command->online:1 ...) Very well, so you agree that the SCSI layer should export to the LLDD a function to set devices offline? > > In fact, if a device > > can vanish without a LLDD knowing about it, this is purely a problem of > > the SCSI layer. > > No, of course not. (Think of IP.) When a device vanishes and LLDD doesn't > know about it (more complicated transports), the CDB will return with > the proper Service Response, since the transport(s) won't be able to > deliver it. This will bubble up through SCSI Core and the error returned > will have to be the same as that of the simpler transports, as outlined > above. Yes, sorry. To be precise, this means that the LLDD has to do nothing special, as it has to implement checking for a failing command anyway. But it's not entirely the same. If a command cannot be delivered it may or may not be appropriate to start error recovery. 
After the LLDD has told the SCSI layer that it has noticed a device going away, there must be no error recovery. > > That means that we have to have a way to ensure that no more commands > > will reach the LLDD which can be triggered without any commands to be > > executed at all. This functionality has to come from the scsi mid layer. > > For simple transports yes; for more complicated ones, the CDB will > not be able to be delivered, and will return with error. Good. So the first thing a LLDD has to do after it has learned about a device being removed is to have the device block. 1. set device offline But commands may still be in flight. IMHO it is not right to assume that all commands now in flight to a device have failed, as some may have completed successfully in time, or failed for reasons other than unplugging. So it should be the LLDD's responsibility to finish the outstanding commands. Furthermore, there's a window for commands already having passed the check for offline but not yet being noticed by the LLDD. The simplest solution is to use a waiting primitive from RCU. So we are at: 1. set device offline 2. synchronize the kernel 3. finish all pending commands With me so far? The LLDD could now forget about the device and be done with it. However, there's a problem left. The device may come back. What happens if a device with the same ID is reconnected? Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
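The three-step ordering in the message above (set the device offline, wait out the submission window, finish pending commands) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is single-threaded, so step 2 has nothing to wait for, and every name in it (`model_dev`, `model_queue`, `model_remove`, the `CMD_*` results) is hypothetical and does not come from the SCSI mid layer or any LLDD. It also models the point that commands which completed before the unplug keep their status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical result codes for the model, not kernel status values. */
enum cmd_result { CMD_PENDING, CMD_OK, CMD_NO_DEVICE };

struct model_cmd {
    enum cmd_result result;
};

struct model_dev {
    bool online;
    struct model_cmd *pending[MAX_PENDING];
    size_t npending;
};

/* Submission path: refuse new commands once the device is offline. */
static bool model_queue(struct model_dev *dev, struct model_cmd *cmd)
{
    if (!dev->online) {
        cmd->result = CMD_NO_DEVICE;
        return false;
    }
    assert(dev->npending < MAX_PENDING);
    cmd->result = CMD_PENDING;
    dev->pending[dev->npending++] = cmd;
    return true;
}

/* Removal path; returns how many commands were failed by the removal. */
static size_t model_remove(struct model_dev *dev)
{
    size_t failed = 0;

    /* Step 1: set device offline, so model_queue() rejects new commands. */
    dev->online = false;

    /*
     * Step 2: a real driver would wait here until no submitter can still
     * be between the offline check and the queue. A single-threaded model
     * has no such window, so there is nothing to wait for.
     */

    /* Step 3: finish pending commands; completed ones keep their status. */
    for (size_t i = 0; i < dev->npending; i++) {
        if (dev->pending[i]->result == CMD_PENDING) {
            dev->pending[i]->result = CMD_NO_DEVICE;
            failed++;
        }
    }
    dev->npending = 0;
    return failed;
}
```

After `model_remove()` returns, the model holds no command pointers at all, which is the property the driver needs before it can free per-device memory.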
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 18:19 ` Oliver Neukum @ 2003-01-23 19:07 ` Luben Tuikov 2003-01-23 19:40 ` Oliver Neukum 2003-01-23 20:41 ` A different look at block device hotswap in the Linux kernel Steven Dake 1 sibling, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-23 19:07 UTC (permalink / raw) To: Oliver Neukum Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver Neukum wrote: > > Very well, so you agree that the SCSI layer should export to the LLDD > a function to set devices offline? I've never really disagreed -- simpler transports will make use of such a function. The important point to note is that the error return value for simpler and more complicated transports has to be the same (i.e. ones which know about the device disconnect and others which send out the CDB and which will return with error). I forgot to mention this with my previous email: think of a LLDD more as part of the transport than of SCSI Core. > Yes, sorry. To be precise, this means that the LLDD has to do nothing > special, as it has to implement checking for a failing command anyway. > But it's not entirely the same. If a command cannot be delivered it may or may > not be appropriate to start error recovery. After the LLDD has told > the SCSI layer that it has noticed a device going away, there must be no > error recovery. Error recovery should not be that complicated for device being disconnected, just error out all commands new and old -- as I've said so many times. The command structs will return back to LLDD and all will be good. (Simple transports.) More complicated transports will just return Service Delivery Failure. (See below.) > Good. > So the first thing a LLDD has to do after it has learned about a device > being removed is to have the device block. ``block'' (verb) is such a strong word. 
* Simple transports: call scsi_set_device_offline(dev) or something like this. * More complicated transports: SCSI Core sees Service Response of Service Delivery Failure and it itself calls scsi_set_device_offline(dev). scsi_set_device_offline(dev) calls a high-level kernel function to start higher level things (block queue cut off, etc) which *may* need to be done. The control path will eventually bubble down to SCSI Core which will error out already queued commands (unless they've returned already with the appropriate error code), remove the device, etc, etc, etc. > 1. set device offline > But commands may still be in flight.IMHO it is not right to assume that > all commands now in flight to a device have failed, as some may have > completed successfully in time, or failed for other reasons than unplugging. They will just return with ok status and after a certain point in time, all others will return with the appropriate error -- in which case see above. > So it should be the LLDD's responsibility to finish the outstanding commands. LLDD cannot really ``finish'' outstanding commands, it's just a transport portal. > Furthermore, there's a window for commands already having passed the check > for offline but not yet being noticed by the LLDD. They will return with an appropriate error. > The simplest solution is to > use a waiting primitive from RCU. So we are at: > > 1. set device offline > 2. synchronize the kernel > 3. finish all pending commands I told you before: 3 starts *before* 2 and 3 is *part* of 2. Furthermore, after 1 has happened in time, all pending commands will error out (wrt a time line). 2 is what I call ``higher-level hook'', but it's not really ``synchronization''. Synchronization will take delta-time, it will not happen instantaneously. > So far with me? > The LLDD could now forget about the device and be done with it. 
Some LLDD will not have the concept of device -- they'll just set up the remote procedure call Execute Command and initiate it, given a target and LUN. Who knows what happens after that? I mean the command may go through several transports..., the LUN may get translated a few times, etc. We have to keep this in mind. And some other transports will know about devices. I.e. you have to allow for the possibility of a command being sent to a non-existent device through LLDD, in which case the LLDD/transport will have to error it out. > However there's a problem left. The device may come back. > What happens if a device with the same ID is reconnected? -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 19:07 ` Luben Tuikov @ 2003-01-23 19:40 ` Oliver Neukum 2003-01-23 20:28 ` Doug Ledford 0 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-23 19:40 UTC (permalink / raw) To: Luben Tuikov Cc: Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thursday, 23 January 2003 20:07, Luben Tuikov wrote: > Oliver Neukum wrote: > > Very well, so you agree that the SCSI layer should export to the LLDD > > a function to set devices offline? > > I've never really disagreed -- simpler transports will make use > of such a function. The important point to note is that the error Good. > return value for simpler and more complicated transports has to > be the same (i.e. ones which know about the device disconnect and > others which send out the CDB and which will return with error). Why? It throws away information needlessly. If the LLDD knows that the reason is unplugging, why not report it? A LLDD that doesn't know about devices going away, on the other hand, can just report an error. Can the higher layers simply assume that the device was unplugged? IMHO they can't and should at least try to recover from the error. > I forgot to mention this with my previous email: think of a LLDD > more as part of the transport than of SCSI Core. Hard to do. The scsi mid layer does timing out and error handling. There's a relatively tight connection. > > So the first thing a LLDD has to do after it has learned about a device > > being removed is to have the device block. > > ``block'' (verb) is such a strong word. What do you prefer? ;-) I'll certainly use another word if you'd like me to do so. > * Simple transports: call scsi_set_device_offline(dev) or something like > this. > > * More complicated transports: SCSI Core sees Service Response of Service > Delivery Failure and it itself calls scsi_set_device_offline(dev).
> > scsi_set_device_offline(dev) calls a high-level kernel function to start > higher level things (block queue cut off, etc) which *may* need to be done. How do you differentiate between real failure and device removal? > > So it should be the LLDD's responsibility to finish the outstanding > > commands. > > LLDD cannot really ``finish'' outstanding commands, it's just a transport > portal. Well, report back the results, if you prefer, thus returning ownership to higher layers. > > Furthermore, there's a window for commands already having passed the > > check for offline but not yet being noticed by the LLDD. > > They will return with an appropriate error. Not quite so simple. Some LLDDs need to know at some point that no more commands will arrive for sure and none are still in flight. > > The simplest solution is to > > use a waiting primitive from RCU. So we are at: > > > > 1. set device offline > > 2. synchronize the kernel > > 3. finish all pending commands > > I told you before: 3 starts *before* 2 and 3 is *part* of 2. > Furthermore, after 1 has happened in time, all pending commands > will error out (wrt a time line). That's not enough. The LLDD has to know that they've errored in order to free associated data structures. The simplest way to do so is returning them to higher layers. > 2 is what I call ``higher-level hook'', but it's not really > ``synchronization''. Synchronization will take delta-time, it > will not happen instantaneously. Please explain. > I.e. you have to allow for the possibility of a command > being sent to a non-existent device through LLDD, in which > case the LLDD/transport will have to error it out. Nonexistent is OK. For a device already flagged offline it is not OK. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 19:40 ` Oliver Neukum @ 2003-01-23 20:28 ` Doug Ledford 2003-01-23 20:59 ` Oliver Neukum ` (2 more replies) 0 siblings, 3 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-23 20:28 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, Jan 23, 2003 at 08:40:41PM +0100, Oliver Neukum wrote: > On Thursday, 23 January 2003 20:07, Luben Tuikov wrote: > > > return value for simpler and more complicated transports has to > > be the same (i.e. ones which know about the device disconnect and > > others which send out the CDB and which will return with error). > > Why? It throws away information needlessly. If the LLDD knows > that the reason is unplugging why not report it? What does it matter? If you know that the device was unplugged, are you going to then wait for it to be plugged back in and if it's plugged back in (and confirmed to be the same device via serial number or some such) pick back up where you left off like nothing happened? That would be the only reason to care about whether it was unplugged or died. And if you did that you would probably violate the rule of least surprise with people unplugging their hard disks in a fit when they realize they did rm -fr on the drive and then when they plug it back in to check out how much damage they did it picks back up where it left off! > A LLDD that doesn't > know about devices going away on the other hand can just report > an error. For SPI that would be DID_TIMEOUT, aka the device wasn't there. For iSCSI it would be timeout as well. Pretty much anything is simply going to be either timeout or if you know it's gone you could return some other error. > Can the higher layers simply assume that the device was unplugged? > IMHO they can't and should at least try to recover from the error. Correct. And this isn't a problem.
> > I forgot to mention this with my previous email: think of a LLDD > > more as part of the transport than of SCSI Core. > > Hard to do. The scsi mid layer does timing out and error handling. > There's a relatively tight connection. > > > > So the first thing a LLDD has to do after it has learned about a device > > > being removed is to have the device block. > > > > ``block'' (verb) is such a strong word. > > What do you prefer ? ;-) I'll certainly use another word if you like me to > do so. > > > * Simple transports: call scsi_set_device_offline(dev) or something like > > this. > > > > * More complicated transports: SCSI Core sees Service Response of Service > > Delivery Failure and it itself calls scsi_set_device_offline(dev). Actually, I would have both complicated and simple transports call scsi_set_device_offline() and for two reasons. 1) you have to provide that function for simple drivers so duplicating other detection code in the scsi completion handler is a waste. 2) pretty much all transports will learn of the device being offline while they are in their interrupt handler and should already be holding the lock for the device, which means that calling scsi_set_device_offline() won't race with scsi_request_fn() which also needs the device lock (which in reality is the host lock). Saving this race is convenient enough IMHO to warrant saying that's the way things need to be. > > scsi_set_device_offline(dev) calls a high-level kernel function to start > > higher level things (block queue cut off, etc) which *may* need to be done. No, scsi_set_device_offline() schedules the error handler thread for that host to be woken up. > How do you differentiate between real failure and device removal? We don't, and we shouldn't. Device removal *is* a real failure. > > > So it should be the LLDD's responsibility to finish the outstanding > > > commands. > > > > LLDD cannot really ``finish'' outstanding commands, it's just a transport > > portal. 
> > Well, report back the results, if you prefer, thus returning ownership to > higher layers. If the LLDD is the type such that it knows the device is gone (aka, in my driver if I get a selection timeout then I know something is fishy and can proceed from there, iSCSI may not be so lucky), then it has one of two choices. 1) it may flush any commands that it can out of the hardware and return them immediately with the same error condition as the one that it is already returning. 2) it can sit and wait for the commands to time out one by one if that's what it wants. Since the device has already been marked offline by scsi_set_device_offline() and the error handler thread is already scheduled to run for the device, 2 is probably the easiest thing for the driver to do. The error handler will call the abort/reset routine for each command still outstanding and the LLDD can just clean up one at a time and return them as it would under any other error condition. > > > Furthermore, there's a window for commands already having passed the > > > check for offline but not yet being noticed by the LLDD. No, not if you handle things in the interrupt handler and if your interrupt handler holds the host lock like it's supposed to. If you want to go without using these locking methods in your LLDD then you are free to do so, but that means *you* need to handle this situation in your driver, the mid layer shouldn't be trying to solve this problem for LLDDs that want to be lock free. > > They will return with an appropriate error. > > Not quite so simple. Some LLDDs need to know at some point that > no more commands will arrive for sure and none are still in flight. Follow the simple rule above and when you call scsi_set_device_offline() then you will *never* get called in your queuecommand() for that device again. That is separate from all commands being cleaned up. That will happen after the error handler thread has run and cleaned your driver out.
Once all the commands are gone and no more are arriving, then if, and only if, someone actually removes the device from the scsi subsystem (maybe hotplug manager or something) then you will get the typical slave_destroy() call to tell you that it is safe to release all resources related to this device. Otherwise, the device will hang around as an offline device until someone does echo "scsi-remove-single-device a b c d" > /proc/scsi/scsi to remove it. Basically, as I see it, we need a new function scsi_set_device_offline() that marks the device offline, we need an offline check in scsi_request_fn(), and we need scsi_set_device_offline() to schedule the error handler thread for wakeup (and it should flag the device that needs to be recovered so that the error handler thread knows what to do), then the error handler thread routine needs to be modified to understand what to do with a device that's been offlined with commands outstanding, and once all the commands are returned it should signal the higher layer (block or whatever) that the device is offlined. Sounds like about an afternoon's worth of work to me and should solve the issues you are bringing up. As far as plugging back in, the answer is simple. Until the old instance is dead *and removed* a new one can't be added at the same ID, aka you simply ignore the hot plug until the hot remove has completed. -- Doug Ledford <dledford@redhat.com> 919-754-3700 x44233 Red Hat, Inc. 1801 Varsity Dr. Raleigh, NC 27606 ^ permalink raw reply [flat|nested] 106+ messages in thread
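The flow proposed in the message above (mark offline under the host lock, reject in the request function, let the error handler return outstanding commands, then notify the upper layer) can be sketched as a user-space model. It is an illustration under assumptions, not mid-layer code: the host lock is reduced to a `lock_held` flag that both entry points assert, the error handler thread is an ordinary function, and all names (`eh_host`, `eh_dev`, `eh_set_device_offline`, `eh_request_fn`, `eh_run`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the host and device, not kernel structures. */
struct eh_host {
    bool lock_held;      /* models "the caller holds the host lock" */
    bool eh_scheduled;   /* the error handler has been asked to run */
};

struct eh_dev {
    struct eh_host *host;
    bool online;
    int outstanding;     /* commands handed to the driver, not yet returned */
    bool upper_notified; /* upper layer was told the device is offline */
};

/* Mark the device offline and schedule the error handler. */
static void eh_set_device_offline(struct eh_dev *dev)
{
    assert(dev->host->lock_held);
    dev->online = false;
    dev->host->eh_scheduled = true;
}

/* Request path: takes the same lock, so it cannot race with the above. */
static bool eh_request_fn(struct eh_dev *dev)
{
    assert(dev->host->lock_held);
    if (!dev->online)
        return false;    /* offline check: the driver never sees this one */
    dev->outstanding++;
    return true;
}

/*
 * Error handler: return each outstanding command, then notify upward.
 * Returns the number of commands it handed back.
 */
static int eh_run(struct eh_dev *dev)
{
    int returned = 0;

    if (!dev->host->eh_scheduled)
        return 0;
    while (dev->outstanding > 0) {
        dev->outstanding--;  /* a driver abort/reset hook would run here */
        returned++;
    }
    dev->upper_notified = true;
    dev->host->eh_scheduled = false;
    return returned;
}
```

In this model, releasing per-device resources would only be safe once `upper_notified` is set and `outstanding` is zero, which corresponds to the point where the message expects the `slave_destroy()` call.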
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 20:28 ` Doug Ledford @ 2003-01-23 20:59 ` Oliver Neukum 2003-01-23 21:34 ` Doug Ledford 2003-01-24 0:15 ` Patrick Mansfield 2003-01-24 8:33 ` David Brownell 2 siblings, 1 reply; 106+ messages in thread From: Oliver Neukum @ 2003-01-23 20:59 UTC (permalink / raw) To: Doug Ledford Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Hi Doug > Actually, I would have both complicated and simple transports call > scsi_set_device_offline() and for two reasons. 1) you have to provide > that function for simple drivers so duplicating other detection code in > the scsi completion handler is a waste. 2) pretty much all transports > will learn of the device being offline while they are in their interrupt > handler and should already be holding the lock for the device, which means This is not the case for USB and IEEE1394. I am not sure about PCMCIA. We are in context of a kernel thread while we learn about device removal. > that calling scsi_set_device_offline() won't race with scsi_request_fn() > which also needs the device lock (which in reality is the host lock). > Saving this race is convenient enough IMHO to warrant saying that's the > way things need to be. > > > > scsi_set_device_offline(dev) calls a high-level kernel function to > > > start higher level things (block queue cut off, etc) which *may* need > > > to be done. > > No, scsi_set_device_offline() schedules the error handler thread for that > host to be woken up. > > > How do you differentiate between real failure and device removal? > > We don't, and we shouldn't. Device removal *is* a real failure. Well shouldn't a device removal remove the device as a logical entity and a failure should not? 
> If the LLDD is the type such that it knows the device is gone (aka, in my > driver if I get a selection timeout then I know something is fishy and can > proceed from there, iSCSI may not be so lucky), then it has one of two > choices. 1) it may flush any commands that it can out of the hardware and > return them immediately with the same error condition as the one that it > is already returning. 2) it can sit and wait for the commands to timeout > one by one if that's what it wants. Since the device has already been > marked offline by scsi_set_device_offline() and the error handler thread > is already scheduled to run for the device, 2 is probably the easiest > thing for the driver to do. The error handler will call the abort/reset Again, not for USB and IEEE1394. We'd have to wait for the error handler to finish. Doing it ourselves is easier. > Once all the commands are gone and no more are arriving, then if, and only > if, someone actually removes the device from the scsi subsystem (maybe > hotplug manager or something) then you will get the typical > slave_destroy() call to tell you that it is safe to release all resources > related to this device. Otherwise, the device will hang around as an > offline device until someone does echo "scsi-remove-single-device a b c d" Eek. That part I must strongly object to. The device is physically gone. Ever bothering the LLDD with it is very inconvenient. > > /proc/scsi/scsi to remove it. > > Basically, as I see it, we need a new function scsi_set_device_offline() > that marks the device offline, we need an offline check in These functions are needed for a whole bus as well. USB needs them. > As far as plugging back in, the answer is simple. Until the old instance > is dead *and removed* a new one can't be added at the same ID, aka you > simply ignore the hot plug until the hot remove has completed. What do you mean? It is dead because it is removed. How can a device be anything other than dead if it has been unplugged?
Please elaborate. And who should ignore a hot addition, the LLDD or SCSI core? If the former, again I must object. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 20:59 ` Oliver Neukum @ 2003-01-23 21:34 ` Doug Ledford 2003-01-23 22:39 ` Oliver Neukum 0 siblings, 1 reply; 106+ messages in thread From: Doug Ledford @ 2003-01-23 21:34 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, Jan 23, 2003 at 09:59:28PM +0100, Oliver Neukum wrote: > Hi Doug > > > Actually, I would have both complicated and simple transports call > > scsi_set_device_offline() and for two reasons. 1) you have to provide > > that function for simple drivers so duplicating other detection code in > > the scsi completion handler is a waste. 2) pretty much all transports > > will learn of the device being offline while they are in their interrupt > > handler and should already be holding the lock for the device, which means > > This is not the case for USB and IEEE1394. I am not sure about PCMCIA. > We are in context of a kernel thread while we learn about device removal. No. You might be in a kernel thread context when you decode an interrupt down to determining that a device was removed, but somewhere along the line you took an interrupt that told you the device was removed (or else the command simply timed out and you are in the error handler for the command already). Are you saying that the USB subsystem queues up those interrupt packets and decodes them later (which is fine, I just want to be clear on the point)? > > that calling scsi_set_device_offline() won't race with scsi_request_fn() > > which also needs the device lock (which in reality is the host lock). > > Saving this race is convenient enough IMHO to warrant saying that's the > > way things need to be. > > > > > > scsi_set_device_offline(dev) calls a high-level kernel function to > > > > start higher level things (block queue cut off, etc) which *may* need > > > > to be done. 
> > > > No, scsi_set_device_offline() schedules the error handler thread for that > > host to be woken up. > > > > > How do you differentiate between real failure and device removal? > > > > We don't, and we shouldn't. Device removal *is* a real failure. > > Well shouldn't a device removal remove the device as a logical > entity and a failure should not? No. That's what the user space hot plug manager is for. If you want this type of behaviour, you take an interrupt to tell you that the device is gone, you mark it gone, the error handler cleans up any outstanding commands, then once the device no longer has any commands outstanding *then* the hot plug manager can successfully umount/unattach/whatever the device and then tell the kernel to actually remove it. Putting this into the scsi stack when it's already in place elsewhere makes no sense to me. > > If the LLDD is the type such that it knows the device is gone (aka, in my > > driver if I get a selection timeout then I know something is fishy and can > > proceed from there, iSCSI may not be so lucky), then it has one of two > > choices. 1) it may flush any commands that it can out of the hardware and > > return them immediately with the same error condition as the one that it > > is already returning. 2) it can sit and wait for the commands to timeout > > one by one if that's what it wants. Since the device has already been > > marked offline by scsi_set_device_offline() and the error handler thread > > is already scheduled to run for the device, 2 is probably the easiest > > thing for the driver to do. The error handler will call the abort/reset > > Again not for USB and IEEE1394. We'd have to wait for the error handler > to finish. Doing it ourselves is easier. OK, are you reading my comments or not? I said "since the error handler thread is already scheduled to run for the device, 2 is probably easiest". In other words, you don't have to wait for anything, it's gonna happen post-haste. 
So since you should already have proper error handling functions in place (You do have proper error handler functions in place, don't you?), duplicating that code here won't really buy you anything. > > Once all the commands are gone and no more are arriving, then if, and only > > if, someone actually removes the device from the scsi subsystem (maybe > > hotplug manager or something) then you will get the typical > > slave_destroy() call to tell you that it is safe to release all resources > > related to this device. Otherwise, the device will hang around as an > > offline device until someone does echo "scsi-remove-single-device a b c d" > > Eek. That part I must strongly object to. The device is physically gone. > Ever bothering the LLDD with it is very inconvinient. OK, let's look at this realistically. I'm saying you get an interrupt telling you that the device is gone and you tell the scsi core the same thing. Immediately after that the scsi core calls your error handler routines to clean up any pending commands on the device. Once all those pending commands are cleaned up, the hot plug manager is free to remove the device from the system. Once the hot plug manager calls for the free to happen, you get a slave_destroy() call and you free the instances. This all happens in a span of a few milliseconds most likely. Is that really so inconvenient for you? > > > /proc/scsi/scsi to remove it. > > > > Basically, as I see it, we need a new function scsi_set_device_offline() > > that marks the device offline, we need an offline check in > > These functions are needed for a whole bus as well. USB needs it. > > > As far as plugging back in, the answer is simple. Until the old instance > > is dead *and removed* a new one can't be added at the same ID, aka you > > simply ignore the hot plug until the hot remove has completed. > > What do you mean? It is dead because it is removed. How can a device be > anything than dead if it has been unplugged? Please elaborate. 
I said "old instance", aka the internal data structs (struct scsi_device for that device). A device can be dead but not removed from the scsi subsys if no one has cleaned up after the removal by unmounting any filesystems that were on it and removing the scsi device itself. That would be the job of the hotplug manager. > And who should ignore a hot addition, the LLDD or SCSI core. > If the former, again I must object. The scsi core doesn't allow two devices with the same complete ID set. You would either have to attach the device at a different ID (aka khubd could set the reattached device to a higher SCSI ID or something) or wait for the hot plug manager to complete the old instance of the device's removal before adding the device back in again. -- Doug Ledford <dledford@redhat.com> 919-754-3700 x44233 Red Hat, Inc. 1801 Varsity Dr. Raleigh, NC 27606 ^ permalink raw reply [flat|nested] 106+ messages in thread
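The same-ID rule described in the message above can be modelled as a small slot table: an unplugged device keeps its slot as an offline instance until something removes it, and until then a re-plugged device either waits or attaches at a different ID. This is a sketch under assumptions; the table, its size, and the names (`id_table`, `id_add`, `id_unplug`, `id_remove`, `id_add_first_free`) are hypothetical, and a real "complete ID set" has more components than the single integer used here.

```c
#include <assert.h>

#define NSLOTS 4

/* SLOT_OFFLINE: device is unplugged, but its old instance still exists. */
enum slot_state { SLOT_FREE, SLOT_ONLINE, SLOT_OFFLINE };

struct id_table {
    enum slot_state slot[NSLOTS];
};

/* Attach at a specific ID; returns the ID, or -1 if an instance exists. */
static int id_add(struct id_table *t, int id)
{
    assert(id >= 0 && id < NSLOTS);
    if (t->slot[id] != SLOT_FREE)
        return -1;       /* old instance, even an offline one, blocks it */
    t->slot[id] = SLOT_ONLINE;
    return id;
}

/* Device physically went away; the instance stays until it is removed. */
static void id_unplug(struct id_table *t, int id)
{
    assert(id >= 0 && id < NSLOTS);
    if (t->slot[id] == SLOT_ONLINE)
        t->slot[id] = SLOT_OFFLINE;
}

/* Cleanup finished (the hotplug manager's job in the message above). */
static void id_remove(struct id_table *t, int id)
{
    assert(id >= 0 && id < NSLOTS);
    t->slot[id] = SLOT_FREE;
}

/* Alternative: attach at the lowest free ID; returns -1 if none is free. */
static int id_add_first_free(struct id_table *t)
{
    for (int id = 0; id < NSLOTS; id++) {
        if (t->slot[id] == SLOT_FREE)
            return id_add(t, id);
    }
    return -1;
}
```

The two add functions correspond to the two options named in the message: wait for the old instance's removal to complete, or attach the returning device at a different ID.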
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 21:34 ` Doug Ledford @ 2003-01-23 22:39 ` Oliver Neukum 2003-01-23 23:23 ` Doug Ledford 2003-01-23 23:25 ` Matthew Dharm 0 siblings, 2 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-23 22:39 UTC (permalink / raw) To: Doug Ledford Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > No. You might be in a kernel thread context when you decode an interrupt > down to determining that a device was removed, but somewhere along the > line you took an interrupt that told you the device was removed (or else > the command simply timed out and you are in the error handler for the > command already). Are you saying that the USB subsystem queues up those > interrupt packets and decodes them later (which is fine, I just want to be > clear on the point)? Yes, an interrupt occurs, it's queued, a kernel thread woken up, it decodes the interrupts and notifies the device driver. > No. That's what the user space hot plug manager is for. If you want this > type of behaviour, you take an interrupt to tell you that the device is > gone, you mark it gone, the error handler cleans up any outstanding > commands, then once the device no longer has any commands outstanding > *then* the hot plug manager can successfully umount/unattach/whatever the > device and then tell the kernel to actually remove it. Putting this into > the scsi stack when it's already in place elsewhere makes no sense to me. Well, it's a SCSI matter. > > > If the LLDD is the type such that it knows the device is gone (aka, in > > > my driver if I get a selection timeout then I know something is fishy > > > and can proceed from there, iSCSI may not be so lucky), then it has one > > > of two choices. 1) it may flush any commands that it can out of the > > > hardware and return them immediately with the same error condition as > > > the one that it is already returning. 
2) it can sit and wait for the > > > commands to timeout one by one if that's what it wants. Since the > > > device has already been marked offline by scsi_set_device_offline() and > > > the error handler thread is already scheduled to run for the device, 2 > > > is probably the easiest thing for the driver to do. The error handler > > > will call the abort/reset > > > > Again not for USB and IEEE1394. We'd have to wait for the error handler > > to finish. Doing it ourselves is easier. > > OK, are you reading my comments or not? I said "since the error handler Oh, I do. Some of them just seem impractical from a USB point of view. > thread is already scheduled to run for the device, 2 is probably easiest". > In other words, you don't have to wait for anything, it's gonna happen > post-haste. So since you should already have proper error handling > functions in place (You do have proper error handler functions in place, > don't you?), duplicating that code here won't really buy you anything. I have memory to free. I can do that only after the last command is gone. I'd have to yield in a loop and count commands. All quite messy. > OK, let's look at this realistically. I'm saying you get an interrupt > telling you that the device is gone and you tell the scsi core the same > thing. Immediately after that the scsi core calls your error handler > routines to clean up any pending commands on the device. Once all those > pending commands are cleaned up, the hot plug manager is free to remove > the device from the system. Once the hot plug manager calls for the free > to happen, you get a slave_destroy() call and you free the instances. > This all happens in a span of a few milliseconds most likely. Is that > really so inconvenient for you? Yes. There is no such thing as "most likely". I have to code for the worst case. So "most likely" means "maybe never". Either do or don't. We cannot have a device removal fail for any reason. 
It drives up complexity by an order of magnitude. Besides having a callback going from kernel code through user space back to kernel code is incredibly ugly. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 22:39 ` Oliver Neukum @ 2003-01-23 23:23 ` Doug Ledford 2003-01-23 23:25 ` Matthew Dharm 1 sibling, 0 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-23 23:23 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, Jan 23, 2003 at 11:39:40PM +0100, Oliver Neukum wrote: > > > No. You might be in a kernel thread context when you decode an interrupt > > down to determining that a device was removed, but somewhere along the > > line you took an interrupt that told you the device was removed (or else > > the command simply timed out and you are in the error handler for the > > command already). Are you saying that the USB subsystem queues up those > > interrupt packets and decodes them later (which is fine, I just want to be > > clear on the point)? > > Yes, an interrupt occurs, it's queued, a kernel thread woken up, it decodes > the interrupts and notifies the device driver. Fine. In that code, when it detects a device being removed, it would do this: unsigned long flags; spin_lock_irqsave(&device->host->host_lock, flags); scsi_set_device_offline(device); spin_unlock_irqrestore(&device->host->host_lock, flags); and it will, from that point on, never get another command for that device. Now, what would you normally do after that? > > No. That's what the user space hot plug manager is for. If you want this > > type of behaviour, you take an interrupt to tell you that the device is > > gone, you mark it gone, the error handler cleans up any outstanding > > commands, then once the device no longer has any commands outstanding > > *then* the hot plug manager can successfully umount/unattach/whatever the > > device and then tell the kernel to actually remove it. Putting this into > > the scsi stack when it's already in place elsewhere makes no sense to me. > > Well, it's a SCSI matter. 
Unmounting a mounted filesystem is not a scsi matter. It's not even clear that it's what we want in all cases. As I mentioned in a separate private email, I would find it cool as hell if I could mount my USB2.0 mp3 player that looks like a regular hard disk and have it configured as a permanent mount that simply deferred I/O errors indefinitely. Then I would be free to write new files to the filesystem and if the device was plugged in they would get sent immediately and if it wasn't then the writes would just be buffered until the next time the hard disk was plugged in. Then, when it did get a hotplug event, if there are any buffered up events in the device request queue we could just kick the request queue and everything would get written out. That would just be cool as hell, but not very feasible if I follow your plan. That's why I think the user space hot plug manager should decide what to do in these cases. > Oh, I do. Some of them just seem impractical from a USB point of view. > > > thread is already scheduled to run for the device, 2 is probably easiest". > > In other words, you don't have to wait for anything, it's gonna happen > > post-haste. So since you should already have proper error handling > > functions in place (You do have proper error handler functions in place, > > don't you?), duplicating that code here won't really buy you anything. > > I have memory to free. I can do that only after the last command is gone. > I'd have to yield in a loop and count commands. All quite messy. No. You would implement a slave_destroy() entry point in your driver and *forget* about counting commands or yielding in a loop. It *simplifies* low level drivers when they use the facilities provided. The infrastructure is all there, but if you want to reimplement it yourself in your driver, go right ahead. -- Doug Ledford <dledford@redhat.com> 919-754-3700 x44233 Red Hat, Inc. 1801 Varsity Dr. Raleigh, NC 27606 ^ permalink raw reply [flat|nested] 106+ messages in thread
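[Editor's sketch] The sequence Doug describes in this message (mark the device offline, stop receiving new commands, return the outstanding ones, and free per-device memory only when the mid-layer asks for it) can be modelled in a few lines of user-space C. This is only a toy model under stated assumptions: every model_* name below is hypothetical, model_set_offline() merely stands in for scsi_set_device_offline(), model_try_destroy() stands in for the point at which the mid-layer would call the driver's slave_destroy(), and no locking is modelled.

```c
/* User-space toy model; none of the model_* names are real kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_device {
    bool  online;        /* cleared once the LLDD reports the removal */
    int   outstanding;   /* commands handed to the LLDD and not yet returned */
    void *lldd_private;  /* per-device memory owned by the LLDD */
};

/* Allocate a device in the "plugged in, idle" state. */
struct model_device *model_device_create(void)
{
    struct model_device *d = malloc(sizeof *d);
    if (d == NULL)
        return NULL;
    d->online = true;
    d->outstanding = 0;
    d->lldd_private = malloc(64);
    if (d->lldd_private == NULL) {
        free(d);
        return NULL;
    }
    return d;
}

/* Stand-in for scsi_set_device_offline(): no new commands after this. */
void model_set_offline(struct model_device *d)
{
    d->online = false;
}

/* Mid-layer side: 0 if the command reached the LLDD, -1 if it was
 * failed immediately because the device is offline. */
int model_queuecommand(struct model_device *d)
{
    if (!d->online)
        return -1;
    d->outstanding++;
    return 0;
}

/* The LLDD returns one command, normally or via its error handler. */
void model_complete(struct model_device *d)
{
    assert(d->outstanding > 0);
    d->outstanding--;
}

/* Mid-layer side: run the slave_destroy() step only when the device is
 * offline and every command has come back. Returns true if it ran. */
bool model_try_destroy(struct model_device *d)
{
    if (d->online || d->outstanding > 0)
        return false;
    free(d->lldd_private);
    d->lldd_private = NULL;
    return true;
}
```

In this model the driver never counts commands or yields in a loop itself; it only answers the completion and destroy calls, which is the simplification Doug is arguing for. Whether the real mid-layer of that era behaved this way in every case is exactly what the thread is debating.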
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 22:39 ` Oliver Neukum 2003-01-23 23:23 ` Doug Ledford @ 2003-01-23 23:25 ` Matthew Dharm 2003-01-24 15:34 ` Alan Stern ` (2 more replies) 1 sibling, 3 replies; 106+ messages in thread From: Matthew Dharm @ 2003-01-23 23:25 UTC (permalink / raw) To: Oliver Neukum Cc: Doug Ledford, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list [-- Attachment #1: Type: text/plain, Size: 3909 bytes --] Well, I've been watching this go on for days. I hate to weigh in now, but I think someone needs to understand what the guy writing the code that is facing this problem really wants. First, let me say that a USB storage device shows up as a HBA. That's because some devices are actually USB/SCSI bridges. But, since I work at the 'emulated host' level, that's where I'm focused. What I want: I want to be able to free resources associated with a device within a finite (and well bounded) amount of time after I am notified that the device is gone. I want to be able to inform the SCSI mid-layer, which will then inform higher layers, that the device is gone. This is so that all may deal with this however they want. I really don't care who does what, as long as we don't crash. This implies that all block-type drivers will need to become hotplug aware, or the SCSI mid-layer will have to fake command failures. I want to be able to do as little command-trickery as possible. If I have to do it, then that means the next hotplug-capable LLDD must do it also. Duplication of code is bad -- it should all be handled in the mid-layer. As yet, the interface I have to the SCSI mid-layer fails on all three points here. And now, some of my opinions on how this should all work: It would be nice if the user informed us about removing the device before they did it. But we shouldn't crash if they don't. 
I don't want to be hanging around after a device is gone, spinning my wheels because some other part of the kernel can't handle the fact that the device is gone. My driver is a passthru between the a SCSI emualted host and a physical USB device -- if my device is gone, I want to be out of there. (Oddly enough, I'm starting to think there may be a DoS attack here if you force the LLDD to stay -- after all, it consumes memory....) Remember, the physical plug doesn't ask me if it's okay, and I don't get to ask the SCSI mid-layer if it's okay. Yes, starting with the user clicking to tell us would be nice, but I don't get to see that. All I get to see is an indication that the plug is pulled. I don't really give a rat's a** about 'how SCSI works' or how it's specified or CAM models or any of that. I try to live in the real world as much as possible. In that world, I'm not asking to remove an HBA -- I'm telling you it's been removed. I can't call it back. I can't even fake a command (other than perhaps INQUIRY) in any meaningful way. THERE IS NOTHING I CAN DO BUT KEEP INSISTING THAT THE DEVICE IS GONE! It would be nice if the user could inform various parts of the kernel that this device was going away, and then all sorts of cleanup could happen. But I really don't care -- all I'm trying to do is exit without a resource leak under all circumstances. It would be nice if the SCSI mid-layer kept track of what commands were in what stages in who's queues. After all, if I hot-unplug a PCI SCSI controller, the controller really isn't going to be able to complete those commands for us -- we have to assume that commands queued by a LLDD are really just being sent to the hardware for queuing. If that wasn't the case, then having LLDD queue capability doesn't make sense. Now, here's the kicker -- this is what I think Linus wants: Linus said to me, with a degree of annoyance, that he doesn't want usb-storage to keep any associations of departed devices with SCSI emulated hosts. 
That means that I need to be able to add and remove hosts at the will of the end-user. In the end, what drives the entire process is what Linus' hand does when it's placed on his USB flashcard reader. Matt -- Matthew Dharm Home: mdharm-usb@one-eyed-alien.net Maintainer, Linux USB Mass Storage Driver It was a new hope. -- Dust Puppy User Friendly, 12/25/1998 [-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --] ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 23:25 ` Matthew Dharm @ 2003-01-24 15:34 ` Alan Stern 2003-01-24 16:06 ` Oliver Neukum ` (2 more replies) 2003-01-24 19:10 ` Luben Tuikov 2003-01-24 21:48 ` Doug Ledford 2 siblings, 3 replies; 106+ messages in thread From: Alan Stern @ 2003-01-24 15:34 UTC (permalink / raw) To: Matthew Dharm Cc: Oliver Neukum, Doug Ledford, Luben Tuikov, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, 23 Jan 2003, Matthew Dharm wrote: > What I want: > > I want to be able to free resources associated with a device within a > finite (and well bounded) amount of time after I am notified that the > device is gone. > > I want to be able to inform the SCSI mid-layer, which will then inform > higher layers, that the device is gone. This is so that all may deal with > this however they want. I really don't care who does what, as long as we > don't crash. This implies that all block-type drivers will need to become > hotplug aware, or the SCSI mid-layer will have to fake command failures. > > I want to be able to do as little command-trickery as possible. If I have > to do it, then that means the next hotplug-capable LLDD must do it also. > Duplication of code is bad -- it should all be handled in the mid-layer. Matt's current proposed patch has the USB LLDD calling scsi_unregister_host() (because the device is represented as an emulated host adapter) when it learns that the device is gone. Provided that routine doesn't block for very long, provided it handles all the details of hotplug notifications, and provided it guarantees that after it returns there will be no more calls to the emulated adapter, I don't see any problem. The LLDD can go ahead and remove all records of the former device, secure in the knowledge that all pointers to data structures and entry points have been erased. Isn't this exactly what everyone has been asking for and debating about? 
Alan Stern ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-24 15:34 ` Alan Stern @ 2003-01-24 16:06 ` Oliver Neukum 2003-01-24 17:58 ` [linux-usb-devel] " Doug Ledford 2003-01-24 19:00 ` Luben Tuikov 2 siblings, 0 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-24 16:06 UTC (permalink / raw) To: Alan Stern, Matthew Dharm Cc: Doug Ledford, Luben Tuikov, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > Matt's current proposed patch has the USB LLDD calling > scsi_unregister_host() (because the device is respresented as an emulated > host adapter) when it learns that the device is gone. Provided that > routine doesn't block for very long, provided it handles all the details > of hotplug notifications, and provided it guarantees that after it returns > there will be no more calls to the emulated adapter, I don't see any > problem. The LLDD can go ahead and remove all records of the former > device, secure in the knowledge that all pointers to data structures and > entry points have been erased. > > Isn't this exactly what everyone has been asking for and debating about? If it only met the requirements, we'd be happy. But it doesn't. By a large margin. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 15:34 ` Alan Stern 2003-01-24 16:06 ` Oliver Neukum @ 2003-01-24 17:58 ` Doug Ledford 2003-01-24 19:00 ` Luben Tuikov 2 siblings, 0 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-24 17:58 UTC (permalink / raw) To: Alan Stern Cc: Matthew Dharm, Oliver Neukum, Luben Tuikov, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Fri, Jan 24, 2003 at 10:34:11AM -0500, Alan Stern wrote: > On Thu, 23 Jan 2003, Matthew Dharm wrote: > > > What I want: > > > > I want to be able to free resources associated with a device within a > > finite (and well bounded) amount of time after I am notified that the > > device is gone. > > > > I want to be able to inform the SCSI mid-layer, which will then inform > > higher layers, that the device is gone. This is so that all may deal with > > this however they want. I really don't care who does what, as long as we > > don't crash. This implies that all block-type drivers will need to become > > hotplug aware, or the SCSI mid-layer will have to fake command failures. > > > > I want to be able to do as little command-trickery as possible. If I have > > to do it, then that means the next hotplug-capable LLDD must do it also. > > Duplication of code is bad -- it should all be handled in the mid-layer. > > Matt's current proposed patch has the USB LLDD calling > scsi_unregister_host() (because the device is respresented as an emulated > host adapter) when it learns that the device is gone. Provided that > routine doesn't block for very long, provided it handles all the details > of hotplug notifications, and provided it guarantees that after it returns > there will be no more calls to the emulated adapter, There is no such guarantee of this. Last I checked, scsi_unregister_host() can fail if devices are busy. In fact, I would say that's exactly what happened on my system when I unplugged a USB2.0 CD-ROM that was in use. 
The USB stack called this code, it failed because the device was busy, the USB stack didn't care and blew its state away, and I now have a permanent CD-ROM device that I can no longer clean up, so I had to reboot my machine. > I don't see any > problem. The LLDD can go ahead and remove all records of the former > device, secure in the knowledge that all pointers to data structures and > entry points have been erased. > > Isn't this exactly what everyone has been asking for and debating about? > > Alan Stern > -- Doug Ledford <dledford@redhat.com> 919-754-3700 x44233 Red Hat, Inc. 1801 Varsity Dr. Raleigh, NC 27606 ^ permalink raw reply [flat|nested] 106+ messages in thread
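[Editor's sketch] The failure Doug reports comes down to a caller ignoring the result of an unregister call and freeing its own state anyway. The toy user-space model below shows the pattern of checking that result before tearing anything down. All model_* names are hypothetical, and returning -EBUSY is an assumption made for illustration; the exact return convention of scsi_unregister_host() in 2.5.58 is not confirmed anywhere in this thread.

```c
/* User-space toy model; none of the model_* names are real kernel APIs. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_host {
    int  busy_devices;   /* devices that still have users (e.g. mounted) */
    bool registered;
};

struct model_lldd {
    struct model_host host;
    void *state;         /* driver state that calls into the LLDD rely on */
};

/* Assumed behaviour: refuse to unregister while any device is busy. */
int model_unregister_host(struct model_host *h)
{
    if (h->busy_devices > 0)
        return -EBUSY;
    h->registered = false;
    return 0;
}

/* Disconnect path: free driver state only if unregistering succeeded. */
int model_disconnect(struct model_lldd *l)
{
    int rc = model_unregister_host(&l->host);
    if (rc != 0)
        return rc;       /* host still registered: keep state valid */
    free(l->state);
    l->state = NULL;
    return 0;
}
```

A caller that gets a non-zero result here still has a registered host pointing at live driver state, so it has to retry later or hand the cleanup to someone else. The model does not say who that should be, which is the open question in the surrounding messages.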
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 15:34 ` Alan Stern 2003-01-24 16:06 ` Oliver Neukum 2003-01-24 17:58 ` [linux-usb-devel] " Doug Ledford @ 2003-01-24 19:00 ` Luben Tuikov 2003-01-24 22:23 ` Oliver.Neukum 2 siblings, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-24 19:00 UTC (permalink / raw) To: Alan Stern Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern wrote: > > Matt's current proposed patch has the USB LLDD calling > scsi_unregister_host() (because the device is represented as an emulated A LLDD should and must *not* call scsi_unregister_host(). This breaks all hierarchy. Let's not make a distinction between a USB LLDD and any LLDD wrt hotplugging, else we'll have a big mess and plenty of code duplication. When a device gets unplugged and the LLDD notices it, it will set the device to off in its own tables, call scsi_set_device_offline(dev) and from that point on *if* any commands for that device step in through the queuecommand() method, they will return with the appropriate error. scsi_set_device_offline(dev) will do whatever it has to do, which IMHO is to, as Doug suggested, start error recovery (i.e. less code, less code duplication in LLDD), and call an upper level hook.* * Though I'm not quite certain where that error recovery should be started... Maybe that upper level hook, after doing whatever it has to do, will start the error recovery, or maybe it doesn't matter for now. But, an upper level hook call is, IMHO, mandatory for many reasons. Now, here we have the alternatives: scsi_set_device_offline() calls slave_destroy(), or slave_destroy() is called later on when upper level structs have been cleaned up. In fact, it wouldn't matter, if SCSI Core takes over error return, via post scsi_set_device_offline(dev) call. > host adapter) when it learns that the device is gone. 
Provided that > routine doesn't block for very long, provided it handles all the details > of hotplug notifications, and provided it guarantees that after it returns > there will be no more calls to the emulated adapter, I don't see any > problem. The LLDD can go ahead and remove all records of the former > device, secure in the knowledge that all pointers to data structures and > entry points have been erased. A LLDD shouldn't be so concerned with ``blocking'', unless *it* sets timers on calls to an upper level like SCSI Core functions. A LLDD has quite a clean and clear function it's supposed to do. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 19:00 ` Luben Tuikov @ 2003-01-24 22:23 ` Oliver.Neukum 0 siblings, 0 replies; 106+ messages in thread From: Oliver.Neukum @ 2003-01-24 22:23 UTC (permalink / raw) To: Luben Tuikov Cc: Alan Stern, Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Fri, 24 Jan 2003, Luben Tuikov wrote: > Alan Stern wrote: > > > > Matt's current proposed patch has the USB LLDD calling > > scsi_unregister_host() (because the device is represented as an emulated > > A LLDD should and must *not* call scsi_unregister_host(). This breaks > all hierarchy. What then is supposed to happen when you remove a PCMCIA host adapter or a SCSI to USB bridge or ... ? There must be a way to report that a host was unplugged. Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-23 23:25 ` Matthew Dharm 2003-01-24 15:34 ` Alan Stern @ 2003-01-24 19:10 ` Luben Tuikov 2003-01-24 19:56 ` [linux-usb-devel] " Alan Stern 2003-01-24 21:48 ` Doug Ledford 2 siblings, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-24 19:10 UTC (permalink / raw) To: Matthew Dharm Cc: Oliver Neukum, Doug Ledford, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Matthew Dharm wrote: > > It would be nice if the SCSI mid-layer kept track of what commands were in > what stages in whose queues. My mini-scsi-core does exactly that. Moving commands between queues is atomic. The whole thing is completely reentrant and multithreaded capable, etc, etc. It has a simple interface of send_command() and cancel_command(); doesn't have device discovery though. I tried to sell those features/ideas to SCSI Core, but we're not there yet. Oh, I forgot to mention in my previous by date letter: The entity which calls scsi_register_host() should by good design call scsi_unregister_host(). It may be the case that when there are no more devices associated with a particular host, then SCSI Core can call scsi_unregister_host(), reminiscent of SCSI Core early initialization. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 19:10 ` Luben Tuikov @ 2003-01-24 19:56 ` Alan Stern 2003-01-24 20:11 ` Luben Tuikov 2003-01-24 21:09 ` Luben Tuikov 0 siblings, 2 replies; 106+ messages in thread From: Alan Stern @ 2003-01-24 19:56 UTC (permalink / raw) To: Luben Tuikov Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Are you aware that you are contradicting yourself? On Fri, 24 Jan 2003, Luben Tuikov wrote: > A LLDD should and must *not* call scsi_unregister_host(). This breaks > all hierarchy. On Fri, 24 Jan 2003, Luben Tuikov wrote: > Oh, I forgot to mention in my previous by date letter: > The entity which calls scsi_register_host() should by good design call > scsi_unregister_host(). In usb-storage, it _is_ the LLDD that calls scsi_register_host(). Alan Stern ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 19:56 ` [linux-usb-devel] " Alan Stern @ 2003-01-24 20:11 ` Luben Tuikov 2003-01-24 21:09 ` Luben Tuikov 1 sibling, 0 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-24 20:11 UTC (permalink / raw) To: Alan Stern Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern wrote: > Are you aware that you are contradicting yourself? Oops, yes, sorry, I was thinking about something else, .... doh! Correction noted. I've been known to do things like this -- think about something else and write total crap :-) . -- Luben P.S. Plus, it's Friday and after a beer at lunch... this is what we get :-) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 19:56 ` [linux-usb-devel] " Alan Stern 2003-01-24 20:11 ` Luben Tuikov @ 2003-01-24 21:09 ` Luben Tuikov 2003-01-24 21:55 ` Alan Stern 1 sibling, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-24 21:09 UTC (permalink / raw) To: Alan Stern Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern wrote: > Are you aware that you are contradicting yourself? > > On Fri, 24 Jan 2003, Luben Tuikov wrote: > > >>A LLDD should and must *not* call scsi_unregister_host(). This breaks >>all hierarchy. > What I probably meant is the detect()/release() pair; release() itself normally calls scsi_unregister(host); the point is that it got nudged from ``above'', i.e. SCSI Core. How can a LLDD be certain that it can safely call scsi_unregister_host() whenever it wishes? As Doug pointed out, this leads to problems. Furthermore, are we talking about scsi_unregister_host() or scsi_unregister(host)? The former does drivers and the latter does hosts. This would mean that my original statement was nevertheless correct: how can a LLDD decide to unload itself safely? -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 21:09 ` Luben Tuikov @ 2003-01-24 21:55 ` Alan Stern 2003-01-24 22:03 ` Luben Tuikov 2003-01-24 23:21 ` Mike Anderson 0 siblings, 2 replies; 106+ messages in thread From: Alan Stern @ 2003-01-24 21:55 UTC (permalink / raw) To: Luben Tuikov Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Fri, 24 Jan 2003, Luben Tuikov wrote: > >>A LLDD should and must *not* call scsi_unregister_host(). This breaks > >>all hierarchy. > > > > What I probably meant is the detect()/release() pair; release() itself > normally calls scsi_unregister(host); the point is that it got nudged > from ``above'', i.e. SCSI Core. > > How can a LLDD be certain that it can safely call scsi_unregister_host() > whenever it wishes? As Doug pointed out this leads to problems. Apparently it can't. I don't mean to say that this was the right thing to do; I just meant that this is what Matt's currently-proposed patch does. Personally, I'm not very familiar with the details of the SCSI subsystem, and I don't know what preconditions are required for calling the various APIs. > Furthermore, are we talking about scsi_unregister_host() or > scsi_unregister(host)? The former does drivers and the latter > does hosts. This would mean that my original statement was > nevertheless correct, how can a LLDD decide to unload itself safely? I did indeed type it wrong. The code first calls scsi_remove_host(host) and then it calls scsi_unregister(host). Alan Stern ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 21:55 ` Alan Stern @ 2003-01-24 22:03 ` Luben Tuikov 2003-01-24 23:21 ` Mike Anderson 1 sibling, 0 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-24 22:03 UTC (permalink / raw) To: Alan Stern Cc: Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern wrote: > > Apparently it can't. I don't mean to say that this was the right thing to > do; I just meant that this is what Matt's currently-proposed patch does. > Personally, I'm not very familiar with the details of the SCSI subsystem, > and I don't know what preconditions are required for calling the various > API's. Ok, no problem. Doug just explained it very good in his most recent email. Take heed in his words. Everything he said is quite doable. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 21:55 ` Alan Stern 2003-01-24 22:03 ` Luben Tuikov @ 2003-01-24 23:21 ` Mike Anderson 1 sibling, 0 replies; 106+ messages in thread From: Mike Anderson @ 2003-01-24 23:21 UTC (permalink / raw) To: Alan Stern Cc: Luben Tuikov, Matthew Dharm, Oliver Neukum, Doug Ledford, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list Alan Stern [stern@rowland.harvard.edu] wrote: > On Fri, 24 Jan 2003, Luben Tuikov wrote: > > > >>A LLDD should and must *not* call scsi_unregister_host(). This breaks > > >>all hierarchy. > > > > > > > What I probably meant is the detect()/release() pair; release() itself > > normally calls scsi_unregister(host); the point is that it got nudged > > from ``above'', i.e. SCSI Core. > > > > How can a LLDD be certain that it can safely call scsi_unregister_host() > > whenever it wishes? As Doug pointed out this leads to problems. > > Apparently it can't. I don't mean to say that this was the right thing to > do; I just meant that this is what Matt's currently-proposed patch does. > Personally, I'm not very familiar with the details of the SCSI subsystem, > and I don't know what preconditions are required for calling the various > APIs. > > > Furthermore, are we talking about scsi_unregister_host() or > > scsi_unregister(host)? The former does drivers and the latter > > does hosts. This would mean that my original statement was > > nevertheless correct, how can a LLDD decide to unload itself safely? > > I did indeed type it wrong. The code first calls scsi_remove_host(host) > and then it calls scsi_unregister(host). > I probably should have looked at Matt's patch more closely, sorry. If a LLDD is going to be using scsi_add_host and scsi_remove_host, the driver should not use scsi_register_host / scsi_unregister_host. If the driver is updated to the sysfs driver model then:
1.) The driver's probe routine should call into the SCSI mid-layer with: scsi_register(...); scsi_add_host(...);
2.) The driver's remove routine should call into the SCSI mid-layer with: scsi_remove_host(...); scsi_unregister(...); (scsi_remove_host is part of this current discussion).
The event that calls probe / remove is the device_register / device_unregister of the adapter device. The LLDD's device_initcall / module_exit routines will call driver_register and driver_unregister to cause device to driver binding, which will cause probe / remove to be called. -andmike -- Michael Anderson andmike@us.ibm.com ^ permalink raw reply [flat|nested] 106+ messages in thread
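[Editor's sketch] The pairing Mike lays out (register then add in probe, remove then unregister in remove, in mirror order) can be modelled as a small state machine in user-space C. The model_* functions below are hypothetical stand-ins for scsi_register(), scsi_add_host(), scsi_remove_host() and scsi_unregister(); their real signatures and failure behaviour are not given in this message, so the -1 returns for out-of-order calls are an assumption made only to make the ordering checkable.

```c
/* User-space toy model; none of the model_* names are real kernel APIs. */
#include <assert.h>

enum model_host_state { HOST_NONE, HOST_REGISTERED, HOST_ADDED };

struct model_host {
    enum model_host_state state;
};

/* Each step returns 0 on success, or -1 if called out of order. */
int model_register(struct model_host *h)
{
    if (h->state != HOST_NONE)
        return -1;
    h->state = HOST_REGISTERED;
    return 0;
}

int model_add_host(struct model_host *h)
{
    if (h->state != HOST_REGISTERED)
        return -1;
    h->state = HOST_ADDED;
    return 0;
}

int model_remove_host(struct model_host *h)
{
    if (h->state != HOST_ADDED)
        return -1;
    h->state = HOST_REGISTERED;
    return 0;
}

int model_unregister(struct model_host *h)
{
    if (h->state != HOST_REGISTERED)
        return -1;
    h->state = HOST_NONE;
    return 0;
}

/* probe: register, then add; undo the register step if adding fails. */
int model_probe(struct model_host *h)
{
    if (model_register(h) != 0)
        return -1;
    if (model_add_host(h) != 0) {
        model_unregister(h);
        return -1;
    }
    return 0;
}

/* remove: the mirror image of probe. */
int model_remove(struct model_host *h)
{
    if (model_remove_host(h) != 0)
        return -1;
    return model_unregister(h);
}
```

The model deliberately rejects an unregister on a host that is still added, which corresponds to Alan's correction that the patch first calls scsi_remove_host(host) and only then scsi_unregister(host).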
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 23:25 ` Matthew Dharm 2003-01-24 15:34 ` Alan Stern 2003-01-24 19:10 ` Luben Tuikov @ 2003-01-24 21:48 ` Doug Ledford 2003-01-24 22:59 ` Mike Anderson 2003-01-24 23:25 ` Matthew Dharm 2 siblings, 2 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-24 21:48 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, Jan 23, 2003 at 03:25:54PM -0800, Matthew Dharm wrote: > Well, I've been watching this go on for days. I hate to weigh in now, but > I think someone needs to understand what the guy writing the code that is > facing this problem really wants. > > First, let me say that a USB storage device shows up as a HBA. That's > because some devices are actually USB/SCSI bridges. But, since I work at > the 'emulated host' level, that's where I'm focused. > > What I want: > > I want to be able to free resources associated with a device within a > finite (and well bounded) amount of time after I am notified that the > device is gone. This is doable. Partly. You need to do it within the framework of what the scsi subsys needs, but it can be done. Basically, the scsi core has an object allocation oriented paridigm where the lldd is expected to act as a driver that handles creation, use, and destruction of objects at the mid layer's direction. Changing that paridigm would be very difficult. However, we don't need the usb subsys to maintain full state on the device once it has gone away, just enough state to be able to talk to the mid layer and tell it where the outstanding commands are. Once the mid layer is finally done with it, the mid layer will tell your subsys to do a final free of the device by calling your release routine (for hosts) or your slave_destroy routine (for devices). 
So, what you should be doing in order to be a nice scsi host that plays well with the generic mechanism we have in place is this: when you get this removal event, you should be freeing all the state you needed about the usb bus and such and taking this usb device offline or whatever you do. Then let the scsi mid layer clean up at its leisure. You don't need to worry about it because the only thing you will have left is to wait for the scsi subsys to call you when it's time to delete things. You don't even have to keep device references around because we pass those in to your deletion routines anyway.

> I want to be able to inform the SCSI mid-layer, which will then inform
> higher layers, that the device is gone.

scsi_set_device_offline() as we've been discussing.

> This is so that all may deal with
> this however they want. I really don't care who does what, as long as we
> don't crash. This implies that all block-type drivers will need to become
> hotplug aware, or the SCSI mid-layer will have to fake command failures.

As I stated in my proposed design email, this will already be taken care of. As long as you call scsi_set_device_offline() while holding the host lock for this host, there will be no races between this call and the queuecommand() call, which should be *all* that you care about. You will no longer get commands for this device. Now let me inform you of what the scsi subsys cares about. You need to return *all* outstanding commands that this device has in your lldd before you go off and play "my device is gone so I'm freeing everything". That hasn't been the case from what I've seen, and that's what the scsi subsys needs.

If you don't return absolutely *all* outstanding commands before you go around freeing stuff, then the scsi subsys, which needs those commands back in order to free up the scsi structs, calls into the usb stack to get a reset or abort on the command, and you guys have already freed up your stuff and claim not to know what we are talking about. At that point, we have a permanently stuck device. This is the part where I said yesterday that you could handle this at the time you set the device offline by adding the code to return all the commands, or you could just let your already present error handler code get called by the error handler thread and return the outstanding commands that way. But, one way or the other, it has to be done.

> I want to be able to do as little command-trickery as possible. If I have
> to do it, then that means the next hotplug-capable LLDD must do it also.
> Duplication of code is bad -- it should all be handled in the mid-layer.

No one has to do anything of the sort if you follow what I wrote. Once the device is marked offline, any commands already present in the request queue but not yet sent to the device, plus any commands that come in to the request queue, will all be sent back as I/O errors immediately, before ever coming to you.

> As yet, the interface I have to the SCSI mid-layer fails on all three
> points here.

SCSI <-> USB interaction is wrong right now. We know that.

> And now, some of my opinions on how this should all work:
>
> It would be nice if the user informed us about removing the device before
> they did it. But we shouldn't crash if they don't.

Correct.

> I don't want to be hanging around after a device is gone, spinning my
> wheels because some other part of the kernel can't handle the fact that the
> device is gone. My driver is a passthru between a SCSI emulated host
> and a physical USB device -- if my device is gone, I want to be out of
> there. (Oddly enough, I'm starting to think there may be a DoS attack here
> if you force the LLDD to stay -- after all, it consumes memory....)

The device will go away. But, because we clean up multiple things on a host, like an error handler thread that needs to be woken up and that we then have to wait on while it runs and acknowledges the death, instantaneous removal of all the structs is simply unrealistic. Unplugging your internal structs from the actual bus and then letting the scsi subsys clean them up at its leisure is possible though, and that's what I'm asking for.

> Remember, the physical plug doesn't ask me if it's okay, and I don't get to
> ask the SCSI mid-layer if it's okay. Yes, starting with the user clicking
> to tell us would be nice, but I don't get to see that. All I get to see is
> an indication that the plug is pulled.

I don't disagree. But the fact that a plug is pulled has nothing to do with objects that might still hold a reference to your object. Just because you *want* it that way doesn't mean it's realistic to actually try and make it that way. We can delete an object only after the last reference is released.

> I don't really give a rat's a** about 'how SCSI works' or how it's
> specified or CAM models or any of that. I try to live in the real world as
> much as possible. In that world, I'm not asking to remove an HBA -- I'm
> telling you it's been removed. I can't call it back. I can't even fake a
> command (other than perhaps INQUIRY) in any meaningful way. THERE IS
> NOTHING I CAN DO BUT KEEP INSISTING THAT THE DEVICE IS GONE!

I wouldn't suggest otherwise. I'm saying that in the real world of software design, it doesn't matter if your device is gone or not: if there are still references outstanding to the object, then a free of that object is a software bug. Come on now, quit being a prick about all this because you know I'm right on that point. I'm not asking you to leave shit around forever, I'm asking you guys to work within the generic framework of object allocation and freeing that already exists in the scsi subsys to get things done. The real problem is that you guys want this cleanup process to be synchronous and I'm telling you that it's ever so slightly async because we have a couple of schedules that have to take place and possibly a request queue to clean out. No one seems willing to accept that a slightly async process is going to be OK. Whatever.

> It would be nice if the user could inform various parts of the kernel that
> this device was going away, and then all sorts of cleanup could happen.
> But I really don't care -- all I'm trying to do is exit without a resource
> leak under all circumstances.

And so you shall if you follow the guidelines I set forth.

> It would be nice if the SCSI mid-layer kept track of what commands were in
> what stages in whose queues. After all, if I hot-unplug a PCI SCSI
> controller, the controller really isn't going to be able to complete those
> commands for us -- we have to assume that commands queued by a LLDD are
> really just being sent to the hardware for queuing. If that wasn't the
> case, then having LLDD queue capability doesn't make sense.

See, now you're just being a prick again. You know that the reason the scsi mid layer *makes* the lldd tell us it is done with the commands is that if we go around saying "oh, this command was outstanding, let's free it" without getting clearance from the lldd, then the lldd might still have state associated with the command. We can't *ASSUME* jack crap in the mid layer. You have to *TELL* us that you are done with a command. It is the only way to avoid races, resource leaks, and all sorts of other crap. It amazes me that you think you should be excused from the job of cleaning up your queues after something like that happens. Maybe the USB code doesn't allocate any internal command structs to go along with each scsi command, but pretty much every real scsi driver does, and it's imperative that the driver go through all those commands and clean up after an event such as a removal, so what's the big deal about calling scsi_done() to *tell* us that you've cleaned up?

> Now, here's the kicker -- this is what I think Linus wants:
>
> Linus said to me, with a degree of annoyance, that he doesn't want
> usb-storage to keep any associations of departed devices with SCSI emulated
> hosts. That means that I need to be able to add and remove hosts at the
> will of the end-user. In the end, what drives the entire process is what
> Linus' hand does when it's placed on his USB flashcard reader.

Well, the few milliseconds it might take to properly clean this stuff out is well within the limits of how fast a human can go around plugging and unplugging a flashcard reader, so I'm not concerned that my proposal doesn't meet these requirements.

--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
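The rule stated in the message above, that an object can be deleted only after the last reference is released, can be illustrated with a minimal sketch. The `demo_host` type and the `demo_*` helpers are hypothetical stand-ins invented for this illustration; they are not kernel structures or SCSI mid-layer calls, and the plain integer counter is not thread-safe (real kernel code would presumably use an atomic count).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a host object; NOT a kernel structure. */
struct demo_host {
    int refs;                               /* outstanding references */
    int released;                           /* set once release() has run */
    void (*release)(struct demo_host *h);   /* final cleanup hook */
};

/* Stands in for the final free of the object. */
static void demo_release(struct demo_host *h)
{
    h->released = 1;
}

/* The creator (here: the low-level driver) holds the first reference. */
static void demo_host_init(struct demo_host *h)
{
    h->refs = 1;
    h->released = 0;
    h->release = demo_release;
}

/* Anyone who keeps a pointer to the object takes a reference. */
static void demo_host_get(struct demo_host *h)
{
    h->refs++;
}

/* Drop one reference; cleanup runs only when the count reaches zero. */
static void demo_host_put(struct demo_host *h)
{
    assert(h->refs > 0);
    if (--h->refs == 0)
        h->release(h);
}
```

In this sketch, an unplug amounts to the driver calling `demo_host_put()` on its own reference; if an in-flight command still holds one, the cleanup hook runs only when that command drops it, which is the "slightly async" behaviour being argued for.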
* Re: Re: [PATCH] USB changes for 2.5.58 2003-01-24 21:48 ` Doug Ledford @ 2003-01-24 22:59 ` Mike Anderson 2003-01-24 23:17 ` [linux-usb-devel] " Doug Ledford 2003-01-25 0:24 ` Luben Tuikov 2003-01-24 23:25 ` Matthew Dharm 1 sibling, 2 replies; 106+ messages in thread From: Mike Anderson @ 2003-01-24 22:59 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Doug,
I started writing the interface you put forth in your email. I am currently debugging it in UML so I can generate the error conditions in a controlled manner. I still have some stuff to look at in the error handler with it running in this mode, as it previously expected that no one else could be doing operations on the host. That could happen if other LLDDs use this interface and have another device that happens to time out an I/O after a device has been set offline.

A clarification question below.

Doug Ledford [dledford@redhat.com] wrote:
> So, what you should be doing in
> order to be a nice scsi host that plays well with the generic
> mechanism we have in place is when you get this removal event, you should
> be freeing all the state you needed about the usb bus and such and taking
> this usb device offline or whatever you do. Then let the scsi mid layer
> clean up at its leisure. You don't need to worry about it because the
> only thing you will have left is to wait for the scsi subsys to call you
> when it's time to delete things. You don't even have to keep device
> references around because we pass those in to your deletion routines
> anyway.
>
> > I want to be able to inform the SCSI mid-layer, which will then inform
> > higher layers, that the device is gone.
>
> scsi_set_device_offline() as we've been discussing.
>

I assumed that the hotplug event would only come from this function if no commands were outstanding. If there were commands outstanding, the event would not be generated until the error handler gained ownership of all the commands.

-andmike
--
Michael Anderson
andmike@us.ibm.com
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 22:59 ` Mike Anderson @ 2003-01-24 23:17 ` Doug Ledford 2003-01-25 0:24 ` Luben Tuikov 1 sibling, 0 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-24 23:17 UTC (permalink / raw) To: Mike Anderson Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 02:59:10PM -0800, Mike Anderson wrote:
> Doug,
> I started writing the interface you put forth in your email. I
> am currently debugging it in UML so I can generate the error
> conditions in a controlled manner.

Very cool! Thanks Mike.

> I still have some stuff to look
> at in the error handler with it running in this mode as it
> previously expected no one else to be possibly doing operations
> on the host. This could be the case if other LLDD's use this
> interface and have another device that happens to timeout an IO
> post a device being set offline.

Unless someone changed things behind my back, we still have one eh thread per host, don't we? As such, since each device is its own host in USB (and I think in ieee1394 as well), this shouldn't be an issue...

> A clarification question below.
>
>
> Doug Ledford [dledford@redhat.com] wrote:
>
> > So, what you should be doing in
> > order to be a nice scsi host that plays well with the generic
> > mechanism we have in place is when you get this removal event, you should
> > be freeing all the state you needed about the usb bus and such and taking
> > this usb device offline or whatever you do. Then let the scsi mid layer
> > clean up at its leisure. You don't need to worry about it because the
> > only thing you will have left is to wait for the scsi subsys to call you
> > when it's time to delete things. You don't even have to keep device
> > references around because we pass those in to your deletion routines
> > anyway.
> >
> > > I want to be able to inform the SCSI mid-layer, which will then inform
> > > higher layers, that the device is gone.
> >
> > scsi_set_device_offline() as we've been discussing.
> >
>
> I assumed that the hotplug event would only come from this function if
> no commands were outstanding. If there were commands outstanding the
> event would not be generated until the error handler gained ownership of
> all the commands.

Yes. But more than that, we also want to make sure that the current request queue for the device is empty of all commands before sending the hotplug event.

--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
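The condition agreed on in this exchange (send the hotplug event only once the device is offline, no commands are outstanding in the low-level driver, and the request queue is empty) can be written down as a small sketch. Everything here is an assumption made for illustration: `demo_dev` and the `demo_*` functions are hypothetical names, not part of the SCSI mid-layer, and the counters only model the state the thread talks about.

```c
#include <assert.h>

/* Hypothetical per-device bookkeeping; NOT a kernel structure. */
struct demo_dev {
    int offline;      /* set when the device is marked offline */
    int outstanding;  /* commands currently owned by the low-level driver */
    int queued;       /* commands still sitting in the request queue */
    int event_sent;   /* removal event has already been emitted */
};

/* Mark the device offline; no new commands reach the driver after this. */
static void demo_set_offline(struct demo_dev *d)
{
    d->offline = 1;
}

/* The driver hands one outstanding command back to the mid layer. */
static void demo_complete_one(struct demo_dev *d)
{
    if (d->outstanding > 0)
        d->outstanding--;
}

/* Fail everything left in the request queue (non-blocking once offline). */
static void demo_flush_queue(struct demo_dev *d)
{
    d->queued = 0;
}

/*
 * Emit the removal event at most once, and only when the device is
 * offline and fully drained. Returns 1 if the event was sent by this call.
 */
static int demo_maybe_send_event(struct demo_dev *d)
{
    if (!d->offline || d->outstanding > 0 || d->queued > 0 || d->event_sent)
        return 0;
    d->event_sent = 1;
    return 1;
}
```

A caller would invoke `demo_maybe_send_event()` after every state change; the event fires on whichever call happens to drain the last command.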
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 22:59 ` Mike Anderson 2003-01-24 23:17 ` [linux-usb-devel] " Doug Ledford @ 2003-01-25 0:24 ` Luben Tuikov 2003-01-25 1:35 ` Mike Anderson 1 sibling, 1 reply; 106+ messages in thread From: Luben Tuikov @ 2003-01-25 0:24 UTC (permalink / raw) To: Mike Anderson Cc: Oliver Neukum, Alan Stern, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Mike Anderson wrote:
> Doug,
> I started writing the interface you put forth in your email.

Do you mind clarifying? Either it was a private email, or one posted here, in which case there was an interpretation.

--
Luben
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-25 0:24 ` Luben Tuikov @ 2003-01-25 1:35 ` Mike Anderson 0 siblings, 0 replies; 106+ messages in thread From: Mike Anderson @ 2003-01-25 1:35 UTC (permalink / raw) To: Luben Tuikov Cc: Oliver Neukum, Alan Stern, David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

Luben Tuikov [luben@splentec.com] wrote:
> Mike Anderson wrote:
> >Doug,
> > I started writing the interface you put forth in your email.
>
> Do you mind clarifying? Either it was a private email, or
> one posted here, in which case there was an interpretation.

It was posted here at the bottom of this email:
http://marc.theaimsgroup.com/?l=linux-scsi&m=104335366403485&w=2

It is a starting point.

-andmike
--
Michael Anderson
andmike@us.ibm.com
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 21:48 ` Doug Ledford 2003-01-24 22:59 ` Mike Anderson @ 2003-01-24 23:25 ` Matthew Dharm 2003-01-25 0:05 ` Doug Ledford 2003-01-25 1:24 ` Luben Tuikov 1 sibling, 2 replies; 106+ messages in thread From: Matthew Dharm @ 2003-01-24 23:25 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

So, if I read this correctly, you're saying that the correct sequence is:

(1) Get disconnect notification from USB
(2) Call scsi_set_device_offline() (must hold host lock for this)
(3) Call scsi_done() for all commands in queue (max: 1)
(4) Call scsi_remove_host(), which should now work because no commands are outstanding
(5) Call scsi_unregister()

And we're done, all structures can be freed. And, as I understand it, the following is true:

(a) once (2) is done, no more commands will be queued
(b) once (3) is done, (4) is guaranteed to work
(c) there is nothing the user can do to make this sequence take a long time

Though, this does leave me with a couple of questions:

(i) Doesn't scsi_set_device_offline() work on devices, not hosts? How do I map from my host to my device list?
(ii) Do I need to call scsi_set_device_offline() for each device? I presume 'yes'.
(iii) What should I shove into the status field of the scsi command before I scsi_done() it?

Oh, and as for my being a 'prick'.... my big problem is that the documented interface is synchronous. Async is fine with me, but up until this e-mail, all I've seen is people arguing over what the sequence is, and theoretical issues of what users should and should not do. And I also think that a large number of hotpluggable hosts are going to replicate a whole bunch of code to do (2)+(3)+(4) in one, synchronous burst.

If someone will step forward with a 'yes' or 'no' on this sequence, then I'll get it done.
If the answer is 'no', then what did I miss?

Matt

On Fri, Jan 24, 2003 at 04:48:31PM -0500, Doug Ledford wrote:
> [snip]

--
Matthew Dharm Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

Sir, for the hundreth time, we do NOT carry 600-round boxes of belt-fed suction darts!
-- Salesperson to Greg
User Friendly, 12/30/1997
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 23:25 ` Matthew Dharm @ 2003-01-25 0:05 ` Doug Ledford 2003-01-25 0:45 ` Matthew Dharm 2003-02-02 3:49 ` Matthew Dharm 2003-01-25 1:24 ` Luben Tuikov 1 sibling, 2 replies; 106+ messages in thread From: Doug Ledford @ 2003-01-25 0:05 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 03:25:40PM -0800, Matthew Dharm wrote:
> So, if I read this correctly, you're saying that the correct sequence is:
>
> (1) Get disconnect notification from USB
> (2) Call scsi_set_device_offline() (must hold host lock for this)

Yes.

> (3) Call scsi_done() for all commands in queue (max: 1)

Hmmm...only 1? USB limit or driver limit?

> (4) Call scsi_remove_host(), which should now work because no commands are
> outstanding

We may need to add code to scsi_remove_host() to allow it to clean out the request queue of the device when the device is offline and this call is made. Just because we returned the 1 command you had outstanding doesn't mean that there weren't more in the request queue (especially true of hard disks like my mp3 player). However, once the device is offline, cleaning out the queue is a known non-blocking operation; it just takes non-zero time as well. Once the queue is cleaned out, we need to shut it down so that no more commands can come in to the block level.

> (5) Call scsi_unregister()
>
> And we're done, all structures can be freed. And, as I understand it, the
> following is true:
>
> (a) once (2) is done, no more commands will be queued

To your driver, yes. If Mike makes it clean out and disable the request queue at the same time, then we could answer this question as yes at the request queue level as well.

> (b) once (3) is done, (4) is guaranteed to work

No! Remember, command completion is delayed! We have a tasklet that processes your now complete command, and with that processing comes marking the device unbusy, which is also required for (4) to work. That's why I was suggesting waking up the error handler thread and letting it finish this process off. The error handler thread has the luxury of being able to wait for the command completion to happen, and in my opinion it's a slightly better place to do the work of cleaning out the request queue.

> (c) there is nothing the user can do to make this sequence take a long time

True. We need time to do things in our very slightly async way, but the user isn't able to keep us from completing.

> Though, this does leave me with a couple of questions:
>
> (i) Doesn't scsi_set_device_offline() work on devices, not hosts? How do I
> map from my host to my device list?

Well, in hosts.c::scsi_remove_host() we do it thusly:

    list_for_each_entry(sdev, &shost->my_devices, siblings)
        if (scsi_check_device_busy(sdev))
            return 1;

> (ii) Do I need to call scsi_set_device_offline() for each device? I
> presume 'yes'.

Yes. As people pointed out to me, the reason a USB device is done as a host is that it very well may *be* a host with several devices behind it, so it must handle the multiple device scenario correctly: set all devices offline and clean up after every one of them that might be behind this bridge.

> (iii) What should I shove into the status field of the scsi command before
> I scsi_done() it?

Well, to force an error I always put DID_ERROR into the host byte of the result dword, aka:

    cmd->result = DID_ERROR << 16;

> Oh, and as for my being a 'prick'.... my big problem is that the documented
> interface is synchronous. Async is fine with me, but up until this e-mail,
> all I've seen is people arguing over what the sequence is, and theoretical
> issues of what users should and should not do. And I also think that a
> large number of hotpluggable hosts are going to replicate a whole bunch of
> code to do (2)+(3)+(4) in one, synchronous burst.

Which would be wrong BTW. If you can support multiple devices behind a bridge then you can't put (2)+(3)+(4) together in one burst. That's why they aren't that way now. As to sync vs. async, the scsi mid layer quit being fully sync during the 2.4 timeframe. When the old error handling code was dropped from 2.5+, all sync completion code was also dropped.

> If someone will step forward with a 'yes' or 'no' on this sequence, then
> I'll get it done. If the answer is 'no', then what did I miss?

Just the tasklet completion issue.

--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
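Steps (2) and (3) as discussed in this message, applied to a host with several devices behind it, can be sketched in plain user-space C. This is only an illustration under stated assumptions: the `demo_*` types and functions are hypothetical and do not correspond to the real `scsi_set_device_offline()` or `scsi_done()` interfaces, locking is omitted, and `DEMO_DID_ERROR` is assumed to mirror the value of the kernel's `DID_ERROR` host-byte code (0x07). Shifting it left by 16 places it in bits 16-23 of the result word.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_DID_ERROR 0x07   /* assumed to mirror the kernel's DID_ERROR */
#define DEMO_MAX_CMDS  4

/* Hypothetical command: only the fields this sketch needs. */
struct demo_cmd {
    int result;   /* host byte lives in bits 16-23 */
    int done;     /* set once the command has been handed back */
};

/* Hypothetical device behind the emulated host. */
struct demo_device {
    int online;
    struct demo_cmd *pending[DEMO_MAX_CMDS];  /* commands owned by the driver */
    size_t npending;
    struct demo_device *next;                 /* next sibling on the same host */
};

struct demo_host {
    struct demo_device *devices;
};

/* Stands in for telling the mid layer "the driver is finished with this". */
static void demo_done(struct demo_cmd *cmd)
{
    cmd->done = 1;
}

/*
 * On disconnect: take every device behind the host offline, then fail and
 * return every command the driver still owns. Returns the number of
 * commands handed back.
 */
static int demo_disconnect(struct demo_host *host)
{
    int returned = 0;

    for (struct demo_device *d = host->devices; d != NULL; d = d->next) {
        d->online = 0;                                   /* step (2) */
        for (size_t i = 0; i < d->npending; i++) {       /* step (3) */
            d->pending[i]->result = DEMO_DID_ERROR << 16;
            demo_done(d->pending[i]);
            returned++;
        }
        d->npending = 0;
    }
    return returned;
}
```

Note that the sketch stops before host removal: as the message explains, completion processing is deferred, so removal cannot be assumed to succeed immediately after the commands are returned.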
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-25 0:05 ` Doug Ledford @ 2003-01-25 0:45 ` Matthew Dharm 2003-01-25 1:07 ` Doug Ledford 2003-02-02 3:49 ` Matthew Dharm 1 sibling, 1 reply; 106+ messages in thread From: Matthew Dharm @ 2003-01-25 0:45 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Ah... the sweet feeling of progress.

On Fri, Jan 24, 2003 at 07:05:29PM -0500, Doug Ledford wrote:
> On Fri, Jan 24, 2003 at 03:25:40PM -0800, Matthew Dharm wrote:
> > So, if I read this correctly, you're saying that the correct sequence is:
> >
> > (1) Get disconnect notification from USB
> > (2) Call scsi_set_device_offline() (must hold host lock for this)
> > (3) Call scsi_done() for all commands in queue (max: 1)
>
> Hmmm...only 1? USB limit or driver limit?

Driver limit. I added support for queueing, but the queue is fixed at size 1. It's an improvement for the future.

> > (4) Call scsi_remove_host(), which should now work because no commands are
> > outstanding
>
> > (5) Call scsi_unregister()
> >
> > And we're done, all structures can be freed. And, as I understand it, the
> > following is true:
> >
> > (b) once (3) is done, (4) is guaranteed to work
>
> No! Remember, command completion is delayed! We have a tasklet that
> processes your now complete command, and with that processing comes
> marking the device unbusy, which is also required for 4 to work. That's
> why I was suggesting waking up the error handler thread and letting it
> finish this process off. The error handler thread has the luxury of being
> able to wait for the command completion to happen, and in my opinion it's
> a slightly better place to do the work of cleaning out the request queue.

Okay... so what do I do if it fails? Sleep for a while and try again later? Wait on a flag somewhere?

> > Though, this does leave me with a couple of questions:
> >
> > (i) Doesn't scsi_set_device_offline() work on devices, not hosts? How do I
> > map from my host to my device list?
>
> Well, in hosts.c::scsi_remove_host() we do it thusly:
>
>     list_for_each_entry(sdev, &shost->my_devices, siblings)
>         if (scsi_check_device_busy(sdev))
>             return 1;

Right, perfect example.

> > (iii) What should I shove into the status field of the scsi command before
> > I scsi_done() it?
>
> Well, to force an error I always put DID_ERROR into the host byte of
> the result dword, aka:
>
>     cmd->result = DID_ERROR << 16;

Sounds reasonable.

> > Async is fine with me, but up until this e-mail,
> > all I've seen is people arguing over what the sequence is, and theoretical
> > issues of what users should and should not do. And I also think that a
> > large number of hotpluggable hosts are going to replicate a whole bunch of
> > code to do (2)+(3)+(4) in one, synchronous burst.
>
> Which would be wrong BTW. If you can support multiple devices behind a
> bridge then you can't put (2)+(3)+(4) together in one burst. That's why
> they aren't that way now.

Hrm... I can see your point if we're talking about hotplugging an individual device, but I don't see how (2)+(3)+(4) isn't what we want for hotplugging an entire host.

Matt

--
Matthew Dharm Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

You are needink to look more evil. You likink very strong coffee?
-- Pitr to Dust Puppy
User Friendly, 10/16/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:45 ` Matthew Dharm
@ 2003-01-25  1:07 ` Doug Ledford
  2003-02-02 18:13   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Doug Ledford @ 2003-01-25  1:07 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
    Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> Ah... the sweet feeling of progress.

Indeed ;-)

> Driver limit. I added support for queueing, but the queue is fixed at size
> 1. It's an improvement for the future.

OK. Just curious.

> > No! Remember, command completion is delayed! We have a tasklet that
> > processes your now complete command, and with that processing comes
> > marking the device unbusy, which is also required for 4 to work. That's
> > why I was suggesting waking up the error handler thread and letting it
> > finish this process off. The error handler thread has the luxury of being
> > able to wait for the command completion to happen, and in my opinion it's
> > a slightly better place to do the work of cleaning out the request queue.
>
> Okay... so what do I do if it fails? Sleep for a while and try again
> later? Wait on a flag somewhere?

Well, the better option is what I think we are working on. Instead of
trying to remove the host completely, just unhook it from the USB stuff,
then call scsi_set_device_offline(), then send back any outstanding
commands via scsi_done(), then possibly call
scsi_schedule_host_removal() if Mike adds that function. Then return.
Don't do anything else. Take all the remaining code you would normally
run at this point and put it into a function in your source called
usb_release() and in your Scsi_Host_Template struct that you pass to the
scsi layer, init the release pointer with the address of your
usb_release() routine. That way, the scsi layer can do what it's best at,
taking care of the clean up details we've been talking about, and when
it's all done, it will call your usb_release() routine with a single
argument of the host struct you are wanting released. At that point, you
can do all the freeing you would have done in that khubd loop (at least I
think that's the context you are doing the freeing from now) and know for
a fact that not only are you freeing everything, but so is the scsi mid
layer. I think this will solve all the issues you've had, because this
*won't* leak, it won't block your other actions, and it lets the scsi
subsystem clean up properly.

> > > Async is fine with me, but up until this e-mail,
> > > all I've seen is people arguing over what the sequence is, and theoretical
> > > issues of what users should and should not do. And I also think that a
> > > large number of hotpluggable hosts are going to replicate a whole bunch of
> > > code to do (2)+(3)+(4) in one, synchronous burst.
> >
> > Which would be wrong BTW. If you can support multiple devices behind a
> > bridge then you can't put (2)+(3)+(4) together in one burst. That's why
> > they aren't that way now.
>
> Hrm... I can see your point if we're talking about hotplugging an
> individual device, but I don't see how (2)+(3)+(4) isn't what we want for
> hotplugging an entire host.

The numbers are gone from the email now so it's hard to reference, but I
think I was commenting on the fact that if you have a true host device,
then you might well be doing (2)+(3)*(number of devices behind
bridge)+(4) or something like that.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606
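The deferred-release idea described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: every demo_* name is hypothetical, the "mid-layer" is reduced to a counter of commands still being processed, and the real call signatures in the SCSI layer are not reproduced. It shows the driver asking for removal and returning at once, with its release hook running later, once the last delayed completion has been processed.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_host;
typedef void (*demo_release_fn)(struct demo_host *host);

/* Hypothetical stand-in for the mid-layer's view of one host. */
struct demo_host {
    int outstanding;         /* commands handed back but not yet fully processed */
    int removal_pending;     /* set once the driver has asked for removal */
    int released;            /* demo bookkeeping: the release hook has run */
    void *driver_data;       /* driver-private allocation */
    demo_release_fn release; /* driver's cleanup hook */
};

/* Mid-layer side: run the release hook only when nothing is in flight. */
static void demo_try_release(struct demo_host *host)
{
    if (host->removal_pending && host->outstanding == 0 && !host->released)
        host->release(host);
}

/* Driver side: request removal and return immediately, freeing nothing here. */
static void demo_schedule_host_removal(struct demo_host *host)
{
    host->removal_pending = 1;
    demo_try_release(host);
}

/* Mid-layer side: the delayed processing of one completed command. */
static void demo_finish_command(struct demo_host *host)
{
    assert(host->outstanding > 0);
    host->outstanding--;
    demo_try_release(host);
}

/* Driver's release hook: the only place driver-private memory is freed. */
static void demo_usb_release(struct demo_host *host)
{
    free(host->driver_data);
    host->driver_data = NULL;
    host->released = 1;
}
```

The point of the split is that the disconnect path never has to wait or retry: if a command is still in the completion pipeline, cleanup simply happens later from the mid-layer's side.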
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  1:07 ` Doug Ledford
@ 2003-02-02 18:13 ` Matthew Dharm
  2003-02-02 20:06   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02 18:13 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
    Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

So, was any of this ever implemented? As far as I can tell, the required
changes were:

(o) addition of scsi_schedule_host_removal() (possibly optional)
(o) implementation of scsi_set_device_offline() (possibly optional)
(o) change the behavior of the 'hotplug initialization model' to call my
    release function

Matt

On Fri, Jan 24, 2003 at 08:07:29PM -0500, Doug Ledford wrote:
> On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> > Okay... so what do I do if it fails? Sleep for a while and try again
> > later? Wait on a flag somewhere?
>
> Well, the better option is what I think we are working on. Instead of
> trying to remove the host completely, just unhook it from the USB stuff,
> then call scsi_set_device_offline(), then send back any outstanding
> commands via scsi_done(), then possibly call
> scsi_schedule_host_removal() if Mike adds that function. Then return.
> Don't do anything else. Take all the remaining code you would normally
> run at this point and put it into a function in your source called
> usb_release() and in your Scsi_Host_Template struct that you pass to the
> scsi layer, init the release pointer with the address of your
> usb_release() routine. That way, the scsi layer can do what it's best at,
> taking care of the clean up details we've been talking about, and when
> it's all done, it will call your usb_release() routine with a single
> argument of the host struct you are wanting released. At that point, you
> can do all the freeing you would have done in that khubd loop (at least I
> think that's the context you are doing the freeing from now) and know for
> a fact that not only are you freeing everything, but so is the scsi mid
> layer. I think this will solve all the issues you've had, because this
> *won't* leak, it won't block your other actions, and it lets the scsi
> subsystem clean up properly.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

We can customize our colonels.
                                        -- Tux
User Friendly, 12/1/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-02 18:13 ` Matthew Dharm
@ 2003-02-02 20:06 ` Matthew Dharm
  2003-02-03 17:17   ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02 20:06 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
    Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

Willem Riede <wrlk@riede.org> suggested to me that I simply set
sdev->online = 0 for scsi_set_device_offline()

Any reason that isn't good enough?

On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> So, was any of this ever implemented? As far as I can tell, the required
> changes were:
>
> (o) addition of scsi_schedule_host_removal() (possibly optional)
> (o) implementation of scsi_set_device_offline() (possibly optional)
> (o) change the behavior of the 'hotplug initialization model' to call my
>     release function
>
> Matt
>
> On Fri, Jan 24, 2003 at 08:07:29PM -0500, Doug Ledford wrote:
> > On Fri, Jan 24, 2003 at 04:45:53PM -0800, Matthew Dharm wrote:
> > > Okay... so what do I do if it fails? Sleep for a while and try again
> > > later? Wait on a flag somewhere?
> >
> > Well, the better option is what I think we are working on. Instead of
> > trying to remove the host completely, just unhook it from the USB stuff,
> > then call scsi_set_device_offline(), then send back any outstanding
> > commands via scsi_done(), then possibly call
> > scsi_schedule_host_removal() if Mike adds that function. Then return.
> > Don't do anything else. Take all the remaining code you would normally
> > run at this point and put it into a function in your source called
> > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > scsi layer, init the release pointer with the address of your
> > usb_release() routine. That way, the scsi layer can do what it's best at,
> > taking care of the clean up details we've been talking about, and when
> > it's all done, it will call your usb_release() routine with a single
> > argument of the host struct you are wanting released. At that point, you
> > can do all the freeing you would have done in that khubd loop (at least I
> > think that's the context you are doing the freeing from now) and know for
> > a fact that not only are you freeing everything, but so is the scsi mid
> > layer. I think this will solve all the issues you've had, because this
> > *won't* leak, it won't block your other actions, and it lets the scsi
> > subsystem clean up properly.
>
> --
> Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
> Maintainer, Linux USB Mass Storage Driver
>
> We can customize our colonels.
>                                         -- Tux
> User Friendly, 12/1/1998

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

Sir, for the hundredth time, we do NOT carry 600-round boxes of belt-fed
suction darts!
                                        -- Salesperson to Greg
User Friendly, 12/30/1997
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-02 20:06 ` Matthew Dharm
@ 2003-02-03 17:17 ` Mike Anderson
  2003-02-16 21:18   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-03 17:17 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Sorry Matthew, I got side tracked on some issues for the last week. The
scsi_set_device_offline() function has not been added to any of James'
linux-scsi trees. You could add an ifndef in your code until we get the
interface in the tree.

Would scsi_set_device_offline() do more than sdev->online = 0? It
depends on the state of the device at the calling of the function.

I currently have scsi_set_device_offline() trying to do the following:
    1.) set device offline and mark host in_recovery.
    2.) mark all outstanding commands to be canceled and wake up
        error handler.
    3.) flush the request queue.
    4.) Once device is really offline send hotplug event.

(2) needs some changes in the error handler which are needed in other
cases. This is the large part of the change, but it is not directly part of
your request.

(3) needs a cleaner way to flush request specials off the queue. It
would also be nice if there was a method to stop the incoming side of the
request queue.

(4) needs an export of the do_hotplug interface or a method to generate a
call to it for an offline event.

I am working on these changes pretty much in order.

Doug suggested a scsi_schedule_host_removal(), but I thought we could
just change scsi_remove_host() to handle this task unless there is a
side effect that all callers would not want?

-andmike

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Willem Riede <wrlk@riede.org> suggested to me that I simply set
> sdev->online = 0 for scsi_set_device_offline()
>
> Any reason that isn't good enough?
>
> On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> > So, was any of this ever implemented? As far as I can tell, the required
> > changes were:
> >
> > (o) addition of scsi_schedule_host_removal() (possibly optional)
> > (o) implementation of scsi_set_device_offline() (possibly optional)
> > (o) change the behavior of the 'hotplug initialization model' to call my
> >     release function
> >
> > Matt
> >
> > > then call scsi_set_device_offline(), then send back any outstanding
> > > commands via scsi_done(), then possibly call
> > > scsi_schedule_host_removal() if Mike adds that function. Then return.
> > > Don't do anything else. Take all the remaining code you would normally
> > > run at this point and put it into a function in your source called
> > > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > > scsi layer, init the release pointer with the address of your
> > > usb_release() routine. That way, the scsi layer can do what it's best at,
> > > taking care of the clean up details we've been talking about, and when
> > > it's all done, it will call your usb_release() routine with a single
> > > argument of the host struct you are wanting released. At that point, you
> > > can do all the freeing you would have done in that khubd loop (at least I
> > > think that's the context you are doing the freeing from now) and know for
> > > a fact that not only are you freeing everything, but so is the scsi mid
> > > layer. I think this will solve all the issues you've had, because this
> > > *won't* leak, it won't block your other actions, and it lets the scsi
> > > subsystem clean up properly.

-andmike
--
Michael Anderson
andmike@us.ibm.com
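The stopgap mentioned in this message (an ifndef-guarded fallback that only clears the online flag) might look roughly like the sketch below. It is a userspace illustration under assumptions: struct scsi_device here is a one-field stand-in for the kernel structure, and HAVE_SCSI_SET_DEVICE_OFFLINE is a made-up macro name, since the preprocessor cannot test for a function directly and some macro has to signal that the real interface exists. As the message says, the real function is expected to do more than this (cancel outstanding commands, flush the queue, send a hotplug event), so the fallback is only a placeholder.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's struct scsi_device; only the flag is modelled. */
struct scsi_device {
    unsigned online:1;
};

/*
 * HAVE_SCSI_SET_DEVICE_OFFLINE is an assumed, made-up macro. A tree that
 * provides the real function would define it (or something like it), and
 * this fallback would then compile away.
 */
#ifndef HAVE_SCSI_SET_DEVICE_OFFLINE
static inline void scsi_set_device_offline(struct scsi_device *sdev)
{
    /* Minimal behaviour only: mark the device offline, nothing else. */
    sdev->online = 0;
}
#endif
```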
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-03 17:17 ` Mike Anderson
@ 2003-02-16 21:18 ` Matthew Dharm
  2003-02-17 19:37   ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-16 21:18 UTC
To: Mike Anderson
Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Any updates on this? I saw some patches, but they don't seem to be in my
tree (the usb tree, which is synced from Linus' tree).

People are starting to report OOPSes to me because of this being
missing....

Matt

On Mon, Feb 03, 2003 at 09:17:26AM -0800, Mike Anderson wrote:
> Sorry Matthew, I got side tracked on some issues for the last week. The
> scsi_set_device_offline() function has not been added to any of James'
> linux-scsi trees. You could add an ifndef in your code until we get the
> interface in the tree.
>
> Would scsi_set_device_offline() do more than sdev->online = 0? It
> depends on the state of the device at the calling of the function.
>
> I currently have scsi_set_device_offline() trying to do the following:
>     1.) set device offline and mark host in_recovery.
>     2.) mark all outstanding commands to be canceled and wake up
>         error handler.
>     3.) flush the request queue.
>     4.) Once device is really offline send hotplug event.
>
> (2) needs some changes in the error handler which are needed in other
> cases. This is the large part of the change, but it is not directly part of
> your request.
>
> (3) needs a cleaner way to flush request specials off the queue. It
> would also be nice if there was a method to stop the incoming side of the
> request queue.
>
> (4) needs an export of the do_hotplug interface or a method to generate a
> call to it for an offline event.
>
> I am working on these changes pretty much in order.
>
> Doug suggested a scsi_schedule_host_removal(), but I thought we could
> just change scsi_remove_host() to handle this task unless there is a
> side effect that all callers would not want?
>
> -andmike
>
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Willem Riede <wrlk@riede.org> suggested to me that I simply set
> > sdev->online = 0 for scsi_set_device_offline()
> >
> > Any reason that isn't good enough?
> >
> > On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
> > > So, was any of this ever implemented? As far as I can tell, the required
> > > changes were:
> > >
> > > (o) addition of scsi_schedule_host_removal() (possibly optional)
> > > (o) implementation of scsi_set_device_offline() (possibly optional)
> > > (o) change the behavior of the 'hotplug initialization model' to call my
> > >     release function
> > >
> > > Matt
> > >
> > > > then call scsi_set_device_offline(), then send back any outstanding
> > > > commands via scsi_done(), then possibly call
> > > > scsi_schedule_host_removal() if Mike adds that function. Then return.
> > > > Don't do anything else. Take all the remaining code you would normally
> > > > run at this point and put it into a function in your source called
> > > > usb_release() and in your Scsi_Host_Template struct that you pass to the
> > > > scsi layer, init the release pointer with the address of your
> > > > usb_release() routine. That way, the scsi layer can do what it's best at,
> > > > taking care of the clean up details we've been talking about, and when
> > > > it's all done, it will call your usb_release() routine with a single
> > > > argument of the host struct you are wanting released. At that point, you
> > > > can do all the freeing you would have done in that khubd loop (at least I
> > > > think that's the context you are doing the freeing from now) and know for
> > > > a fact that not only are you freeing everything, but so is the scsi mid
> > > > layer. I think this will solve all the issues you've had, because this
> > > > *won't* leak, it won't block your other actions, and it lets the scsi
> > > > subsystem clean up properly.
>
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

My mother not mind to die for stoppink Windows NT! She is rememberink
Stalin!
                                        -- Pitr
User Friendly, 9/6/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-16 21:18 ` Matthew Dharm
@ 2003-02-17 19:37 ` Mike Anderson
  2003-02-17 19:51   ` Patrick Mansfield
  2003-02-23  7:48   ` Matthew Dharm
  0 siblings, 2 replies; 106+ messages in thread
From: Mike Anderson @ 2003-02-17 19:37 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Any updates on this? I saw some patches, but they don't seem to be in my
> tree (the usb tree, which is synced from Linus' tree).
>
> People are starting to report OOPSes to me because of this being
> missing....
>
> Matt
>

The scsi_set_device_offline interface is part of the last patch (scsi
error) I sent to linux-scsi. I updated my patch after some comments from
the list, but I am working on an issue with the patch before I resend.

-andmike
--
Michael Anderson
andmike@us.ibm.com
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-17 19:37 ` Mike Anderson
@ 2003-02-17 19:51 ` Patrick Mansfield
  2003-02-23  7:48 ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Patrick Mansfield @ 2003-02-17 19:51 UTC
To: Mike Anderson
Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

On Mon, Feb 17, 2003 at 11:37:37AM -0800, Mike Anderson wrote:
>
> The scsi_set_device_offline interface is part of the last patch (scsi
> error) I sent to linux-scsi. I updated my patch after some comments from
> the list, but I am working on an issue with the patch before I resend.
>
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com

One point with the interface - can we have a slightly higher level interface
based on what has happened to the adapter, so that if scsi wants to behave
differently in the future the interface need not change?

That is, have an exported scsi_host_removed(struct scsi_host *shost)
versus a scsi_set_device_offline(struct scsi_device *sdev)?

scsi_host_removed can offline all sdevs etc. for now, and if we ever
want to change data structure layouts or the behaviour in the future (i.e.
allow surprise removal/replacement of the same storage without forcing
removal of a scsi_device) the interface need not change.

-- Patrick Mansfield
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-17 19:37 ` Mike Anderson
  2003-02-17 19:51   ` Patrick Mansfield
@ 2003-02-23  7:48 ` Matthew Dharm
  2003-02-26 23:37   ` Mike Anderson
  1 sibling, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-23  7:48 UTC
To: Mike Anderson
Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Okay, I see Linus has now accepted this into his tree. It should propagate
to the USB development trees soon.

One question: What else is needed? We set the device offline,
error/complete all pending commands, and then need to (somehow) make certain
we're in a good state for calling scsi_remove_host(). How do we make that
final guarantee?

There was talk that scsi_set_device_offline() would take care of that for
us by waking up the error handler... there seems to be code to do that....

There was talk of using the release() function from the SCSI template to
actually release resources....

So, what's the plan?

Matt

On Mon, Feb 17, 2003 at 11:37:37AM -0800, Mike Anderson wrote:
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Any updates on this? I saw some patches, but they don't seem to be in my
> > tree (the usb tree, which is synced from Linus' tree).
> >
> > People are starting to report OOPSes to me because of this being
> > missing....
> >
> > Matt
> >
>
> The scsi_set_device_offline interface is part of the last patch (scsi
> error) I sent to linux-scsi. I updated my patch after some comments from
> the list, but I am working on an issue with the patch before I resend.
>
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

But where are the THEMES?! How do you expect me to use an OS without
themes?!
                                        -- Stef
User Friendly, 10/9/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-23  7:48 ` Matthew Dharm
@ 2003-02-26 23:37 ` Mike Anderson
  2003-02-27  1:10   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-26 23:37 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Matthew,
    Sorry for the delay in replying (non coding activities are
    consuming too many hours).

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Okay, I see Linus has now accepted this into his tree. It should propagate
> to the USB development trees soon.
>
> One question: What else is needed? We set the device offline,
> error/complete all pending commands, and then need to (somehow) make certain
> we're in a good state for calling scsi_remove_host(). How do we make that
> final guarantee?
>
> There was talk that scsi_set_device_offline() would take care of that for
> us by waking up the error handler... there seems to be code to do that....
>

Yes, scsi_set_device_offline() will wake up the error handler to abort
outstanding commands.

> There was talk of using the release() function from the SCSI template to
> actually release resources....
>
> So, what's the plan?

There are still a few things on the to do list, but they should not affect the
LLDD interface (at least this is the goal).
    - scsi_request_fn needs a fix for device offline that will
      handle all request types.
    - scsi_remove_host needs to call template release at the correct
      time (ref counting ??).
    - need fix for offline hotplug event.
        - Should do_hotplug be exported or should device states
          be added / fixed ??

Cleanups
    - Change scsi_remove_host from an int to a void function.

-andmike
--
Michael Anderson
andmike@us.ibm.com
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-26 23:37 ` Mike Anderson
@ 2003-02-27  1:10 ` Matthew Dharm
  2003-02-27  6:37   ` Mike Anderson
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-27  1:10 UTC
To: Mike Anderson
Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

On Wed, Feb 26, 2003 at 03:37:02PM -0800, Mike Anderson wrote:
> There are still a few things on the to do list, but they should not affect the
> LLDD interface (at least this is the goal).
>     - scsi_request_fn needs a fix for device offline that will
>       handle all request types.
>     - scsi_remove_host needs to call template release at the correct
>       time (ref counting ??).
>     - need fix for offline hotplug event.
>         - Should do_hotplug be exported or should device states
>           be added / fixed ??

Right... but I removed the release() function because that was marked (in
the documentation) as only for the old-style drivers. So I'll need to
re-introduce it -- but it looks like all it has to do is free some memory.
Does that sound about right?

Matt

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

Da. Am thinkink of carbonated borscht for lonk nights of coding.
                                        -- Pitr
User Friendly, 7/24/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27  1:10 ` Matthew Dharm
@ 2003-02-27  6:37 ` Mike Anderson
  2003-02-27 19:32   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Mike Anderson @ 2003-02-27  6:37 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> Right... but I removed the release() function because that was marked (in
> the documentation) as only for the old-style drivers. So I'll need to
> re-introduce it -- but it looks like all it has to do is free some memory.
> Does that sound about right?

Yes, it was previously removed, but IIRC this is the direction
discussed on this thread. For an idle device the release functionality could
be done in the context of the scsi_remove_host call, but for a busy
device we need to have this call to clean up later.

-andmike
--
Michael Anderson
andmike@us.ibm.com
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27  6:37 ` Mike Anderson
@ 2003-02-27 19:32 ` Matthew Dharm
  2003-03-01  1:41   ` Matthew Dharm
  0 siblings, 1 reply; 106+ messages in thread
From: Matthew Dharm @ 2003-02-27 19:32 UTC
To: Mike Anderson
Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Greg KH,
    linux-usb-devel, Linux SCSI list

This was discussed, but I didn't recall a firm decision.

I'll keep my eyes open for the patch that uses release().

Matt

On Wed, Feb 26, 2003 at 10:37:37PM -0800, Mike Anderson wrote:
> Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > Right... but I removed the release() function because that was marked (in
> > the documentation) as only for the old-style drivers. So I'll need to
> > re-introduce it -- but it looks like all it has to do is free some memory.
> > Does that sound about right?
>
> Yes, it was previously removed, but IIRC this is the direction
> discussed on this thread. For an idle device the release functionality could
> be done in the context of the scsi_remove_host call, but for a busy
> device we need to have this call to clean up later.
>
> -andmike
> --
> Michael Anderson
> andmike@us.ibm.com

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

C: They kicked your ass, didn't they?
S: They were cheating!
                                        -- The Chief and Stef
User Friendly, 11/19/1997
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-02-27 19:32 ` Matthew Dharm
@ 2003-03-01  1:41 ` Matthew Dharm
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-03-01  1:41 UTC
To: Mike Anderson, Oliver Neukum, Luben Tuikov, Alan Stern,
    David Brownell, Greg KH, linux-usb-devel, Linux SCSI list

After conversing with Mike some more, there is a deadlock problem.

I need to hold the host_lock to walk the device list, but I can't hold the
host_lock when I call scsi_set_device_offline().

Mike's looking into this. Until he finds a good answer, no patch for
usb-storage.

Matt

On Thu, Feb 27, 2003 at 11:32:39AM -0800, Matthew Dharm wrote:
> This was discussed, but I didn't recall a firm decision.
>
> I'll keep my eyes open for the patch that uses release().
>
> Matt
>
> On Wed, Feb 26, 2003 at 10:37:37PM -0800, Mike Anderson wrote:
> > Matthew Dharm [mdharm-scsi@one-eyed-alien.net] wrote:
> > > Right... but I removed the release() function because that was marked (in
> > > the documentation) as only for the old-style drivers. So I'll need to
> > > re-introduce it -- but it looks like all it has to do is free some memory.
> > > Does that sound about right?
> >
> > Yes, it was previously removed, but IIRC this is the direction
> > discussed on this thread. For an idle device the release functionality could
> > be done in the context of the scsi_remove_host call, but for a busy
> > device we need to have this call to clean up later.
> >
> > -andmike
> > --
> > Michael Anderson
> > andmike@us.ibm.com
>
> --
> Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
> Maintainer, Linux USB Mass Storage Driver
>
> C: They kicked your ass, didn't they?
> S: They were cheating!
>                                         -- The Chief and Stef
> User Friendly, 11/19/1997

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

It was a new hope.
                                        -- Dust Puppy
User Friendly, 12/25/1998
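The lock-ordering problem described in this message (the list walk needs host_lock, the per-device call must run without it) is often worked around by snapshotting the list under the lock and making the calls after dropping it. The sketch below models that in userspace C. It is an illustration only, not what the thread settled on: all demo_* names are hypothetical, the lock is a plain flag so the misuse shows up as an assertion failure, and the assumption that the offline call takes the lock itself is mine. In real kernel code each snapshotted device would also need a reference held so it cannot vanish once the lock is released.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_DEVS 8

/* Hypothetical stand-ins for a device and its host. */
struct demo_dev {
    int online;
    struct demo_dev *next;
};

struct demo_host {
    int lock_held;            /* models host_lock as a non-recursive lock */
    struct demo_dev *devices; /* singly linked device list */
};

static void demo_lock(struct demo_host *h)
{
    assert(!h->lock_held);    /* taking it twice would be the deadlock */
    h->lock_held = 1;
}

static void demo_unlock(struct demo_host *h)
{
    assert(h->lock_held);
    h->lock_held = 0;
}

/* Assumed behaviour: the offline call acquires the host lock on its own. */
static void demo_set_device_offline(struct demo_host *h, struct demo_dev *d)
{
    demo_lock(h);
    d->online = 0;
    demo_unlock(h);
}

/* Walk the list under the lock, then offline each device without it. */
static size_t demo_offline_all(struct demo_host *h)
{
    struct demo_dev *snap[DEMO_MAX_DEVS];
    struct demo_dev *d;
    size_t n = 0, i;

    demo_lock(h);
    for (d = h->devices; d != NULL && n < DEMO_MAX_DEVS; d = d->next)
        snap[n++] = d;
    demo_unlock(h);

    for (i = 0; i < n; i++)
        demo_set_device_offline(h, snap[i]);
    return n;
}
```

Calling demo_set_device_offline() from inside the locked loop instead would trip the assertion in demo_lock(), which is the userspace analogue of the deadlock Matthew describes.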
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-25  0:05 ` Doug Ledford
  2003-01-25  0:45   ` Matthew Dharm
@ 2003-02-02  3:49 ` Matthew Dharm
  1 sibling, 0 replies; 106+ messages in thread
From: Matthew Dharm @ 2003-02-02  3:49 UTC
To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell,
    Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list

So, I was trying to implement this bit of logic in usb-storage, when I
discovered that scsi_set_device_offline() doesn't exist.

Okay, what did I miss? I'm sure it's obvious, but for the life of me I
don't see it.

Matt

On Fri, Jan 24, 2003 at 07:05:29PM -0500, Doug Ledford wrote:
> > (i) Doesn't scsi_set_device_offline() work on devices, not hosts? How do I
> > map from my host to my device list?
>
> Well, in hosts.c::scsi_remove_host() we do it thusly:
>
>     list_for_each_entry(sdev, &shost->my_devices, siblings)
>         if (scsi_check_device_busy(sdev))
>             return 1;
>
> > (ii) Do I need to call scsi_set_device_offline() for each device? I
> > presume 'yes'.
>
> Yes. As people pointed out to me the reason a USB device is done as a
> host is because it very well may *be* a host with several devices behind
> it, so it must handle the multiple device scenario correctly and set all
> devices offline and clean up after all of them that might be behind this
> bridge.

-- 
Matthew Dharm                              Home: mdharm-usb@one-eyed-alien.net
Maintainer, Linux USB Mass Storage Driver

I'm seen in many forms. Now open your mouth. It's caffeine time.
                                        -- Cola Man to Greg
User Friendly, 10/28/1998
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-24 23:25 ` Matthew Dharm 2003-01-25 0:05 ` Doug Ledford @ 2003-01-25 1:24 ` Luben Tuikov 1 sibling, 0 replies; 106+ messages in thread From: Luben Tuikov @ 2003-01-25 1:24 UTC (permalink / raw) To: Matthew Dharm Cc: Oliver Neukum, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Matthew Dharm wrote: > So, if I read this correctly, you're saying that the correct sequence is: > > (1) get disconnect notification from USB > (2) Call scsi_set_device_offline() (must hold host lock for this) > (3) call scsi_done() for all commands in queue (max: 1) Right. LLDD does (2) and (3), though, *and let's decide on this scsi ppl* I'd rather (3) be initiated by SCSI Core, e.g. in recovery code. > (4) Call scsi_remove_host(), which should now work because no commands are > outstanding > (5) Call scsi_unregister() This is tricky. I'd rather SCSI Core do those things. I.e. when there are no devices left, then SCSI Core could probably initiate the removal of the host, just as it does for devices with slave_alloc(). Please ppl, read Doug's original email *carefully*, especially those who are implementing this now (Mike?). (You may decide to go over your queue and error them out, but the error recovery thread should call your eh_abort() method -- this is a more consistent way of doing this and you'd have *less* code duplication in LLDD.) The whole point is that when shared data is involved, i.e. hosts, commands, devices, LLDD can tell SCSI Core what it (LLDD) wants to be done eventually, (by, say, telling SCSI Core about the unplug event by calling scsi_set_device_offline()), and then when SCSI Core decides that it is safe to do so, it will *remove* the device(s) and/(or) host(s), the former by calling slave_destroy(), and the latter by the release() method. Please note, that a LLDD cannot ``run'' SCSI Core, and this is what you've been inclined to do in all your emails. 
For this I'm inclined to include in SHT/host, a host_volatile:1 flag to mean whether the host is to be removed when there are no devices; 1 to mean that it is to be removed when # dev = 0, 0 to mean that the host stays. Some hosts may decide to stay even if there are no devices attached, e.g. FC/SAN hosts. (since a device may come up any moment now, or that the host has been authenticated and connected to a target whose LUNs got pulled out at this moment in time -- and there's no point in removing the host and then initiating a whole new session/authentication/etc.) USB Storage hosts will have .host_volatile = 1. > And we're done, all structures can be freed. SCSI Core will tell you when you're done, when your release() method is called -- when SCSI Core decides to do it. > And, as I understand it, the > following is true: > > (a) once (2) is done, no more commands will be queued Ok, SCSI Core can take care of this. > (b) once (3) is done, (4) is guaranteed to work You shouldn't care about this -- your driver methods will get called. You shouldn't know about how the layer above you implements it -- this is the whole point of separating SCSI Core and LLDD as different subsystems. > (c) there is nothing the user can do to make this sequence take a long time Right. > Tho, this does leave me with a couple of questions: > > (i) Doesn't scsi_set_device_offline() work on devices, not hosts? How do I > map from my host to my device list? Doug answered this -- but it's even easier when you have one device per host. > (ii) Do I need to call scsi_set_device_offline() for each device? I > presume 'yes'. Mostly it would depend on USB. Is it a normal host with several real devices and just one of them is going away? Or is it a bridge itself which is going away and you must force-unplug all devices? I can imagine this either way, though my knowledge of USB is limited. :-) SCSI ppl: the transport may not be USB so let's not generalize if new code is going into SCSI Core. 
> (iii) What should I shove into the status field of the scsi command before > I scsi_done() it? DID_ERROR or DID_BAD_TARGET, since the device is gone. > Oh, and as for my being a 'prick'.... my big problem is that the documented > interface is synchronous. Async is fine with me, but up until this e-mail, > all I've seen is people arguing over what the sequence is, and theoretical > issues of what users should and should not do. Most of the things said were of interest and concern to SCSI ppl too. > And I also think that a > large number of hotplugable hosts are going to replicate a whole bunch of > code to do (2)+(3)+(4) in one, synchronous burst. No, not quite -- those will be called by SCSI Core, depending on the host. Things may work differently for a host connected to a SAN, don't you think? For this reason we cannot lump up (2), (3) and (4). They'll be separated and SCSI Core will drive things up. -- Luben ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 20:28 ` Doug Ledford 2003-01-23 20:59 ` Oliver Neukum @ 2003-01-24 0:15 ` Patrick Mansfield 2003-01-24 8:33 ` David Brownell 2 siblings, 0 replies; 106+ messages in thread From: Patrick Mansfield @ 2003-01-24 0:15 UTC (permalink / raw) To: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thu, Jan 23, 2003 at 03:28:36PM -0500, Doug Ledford wrote: > On Thu, Jan 23, 2003 at 08:40:41PM +0100, Oliver Neukum wrote: > > No, scsi_set_device_offline() schedules the error handler thread for that > host to be woken up. > Doug - Why would the error handler need to run? If the LLDD fails all outstanding commands with an appropriate error (like DID_NO_CONNECT), the failure is passed to the upper levels. It seems that we could just set the device offline, make sure we do not send commands for an offline device, and let the adapter fail all outstanding commands. -- Patrick Mansfield ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-23 20:28 ` Doug Ledford 2003-01-23 20:59 ` Oliver Neukum 2003-01-24 0:15 ` Patrick Mansfield @ 2003-01-24 8:33 ` David Brownell 2 siblings, 0 replies; 106+ messages in thread From: David Brownell @ 2003-01-24 8:33 UTC (permalink / raw) To: Doug Ledford Cc: Oliver Neukum, Luben Tuikov, Alan Stern, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Doug Ledford wrote: > Once all the commands are gone and no more are arriving, then if, and only > if, someone actually removes the device from the scsi subsystem (maybe > hotplug manager or something) then you will get the typical > slave_destroy() call to tell you that it is safe to release all resources > related to this device. Otherwise, the device will hang around as an > offline device until someone does Does that mean there's no LLD state for that HBA (and/or devices connected to it), so that some _other_ kind of state is representing these zombie devices? Seems like it must. USB physical device model state must go away when the device does, rather promptly ... the disconnect() is invoked in a thread, so it can block for a very short while. Blocking more than a few milliseconds there is extremely antisocial though, so that can't be the device state that will "hang around" until some user mode agent does your suggested removal action. I get the impression these zombie devices are largely what Linus has recently asked to be removed from usb-storage. Which has, so far, been quite keen to re-animate them ... maybe a useful incremental improvement would be just to give up the re-animation part, in such a way that the scsi a/b/c/d identifiers eventually get recycled. > echo "scsi-remove-single-device a b c d" >/proc/scsi/scsi > to remove it. That ought to be trivial for some hotplug agent to do ... but there's the issue of where "a b c d" come from. 
Last I looked, there was no obvious way to associate such data with the SCSI hotplug events; the sysfs state wasn't very helpful. - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
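As a rough illustration of what a hotplug agent would have to produce once it knows the four numbers, here is a small helper that builds the removal command string. It is a sketch only: `toy_format_remove_cmd()` is a hypothetical name, and the spelling `scsi remove-single-device H C I L` (with a space after `scsi`) is the form recalled from the 2.4-era /proc/scsi/scsi documentation, which differs slightly from the spelling quoted above; treat it as an assumption to verify against the kernel in use. The open question of where "a b c d" come from is not addressed here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the removal command for one device into buf.
 * Returns the string length, or -1 if the buffer was too small.
 * The command spelling is an assumption; see the note above.
 */
int toy_format_remove_cmd(char *buf, size_t len,
                          int host, int channel, int id, int lun)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "scsi remove-single-device %d %d %d %d",
                 host, channel, id, lun);
    if (n < 0 || (size_t)n >= len)
        return -1;  /* output error or truncated: do not use the result */
    return n;
}
```

An agent would then write the resulting string to /proc/scsi/scsi; that write is deliberately left out of the sketch.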
* A different look at block device hotswap in the Linux kernel 2003-01-23 18:19 ` Oliver Neukum 2003-01-23 19:07 ` Luben Tuikov @ 2003-01-23 20:41 ` Steven Dake 2003-01-23 21:07 ` Matthew Jacob 2003-01-24 0:07 ` Oliver Neukum 1 sibling, 2 replies; 106+ messages in thread From: Steven Dake @ 2003-01-23 20:41 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oliver and others, In regards to hotswap, any real operating system should be _told_ that a block device is going to be removed from the top. There are several reasons. 1) File mounts should be removed from the filesystem layer 2) files accessing block devices directly should be terminated 3) raid members using that block device should be hot removed 4) I'm sure you can think of others :) The key is that the removal request should come from the top, not the bottom. If someone is stupid enough to surprise remove a device (ie: unplug their USB SCSI device while the device is in use by the OS), they get what they deserve (I/O errors, dirty OS data, queued up requests which never shut down). If they tell the OS that the device is going to be removed, so it may flush the device and shut down I/O to the device, the request should be granted on all accounts (expected removal). The device driver should not be responsible for managing hotswap in any regard. Its only purpose should be to tell the block device removal layer that a surprise extraction was initiated such that the block device removal code can ask the mid layer drivers to shut down error correction routines to the device and dump its pending I/O queue and clean up after the device. 
The main advantage of this technique is simplicity (the LLDDs don't have to have repetitive logic for each device driver) and genericity (the block device removal code can be maintained in one place and be guaranteed to ensure the OS is in a stable state after a device is removed, either surprise or expected); finally, it solves the in-flight I/O problem by stopping new I/O to the device, shutting down I/O to the device, flushing the pending I/O queues, and killing all references in the OS to the device. If you think about what you're suggesting, you're suggesting that the LLDD tells the scsi layer that the device is gone, that then times out errors and leaves the filesystem and sys_open/close file tables, and RAID layers in a state of disarray. We don't want the LLDD knowing about the RAID system and whether it should tell the RAID layer to hot remove, do we? I've developed code to do exactly what I have described here (surprise and expected extractions genericized into one file with one simple call from userland and a method for lower layers to indicate a surprise extraction if they have detected one). I'll post as soon as I have time to make a patch against 2.5. Thanks -steve Oliver Neukum wrote: >On Thursday, 23 January 2003 18:46, Luben Tuikov wrote: > > >>Oliver Neukum wrote: >> >> >>>Not all the world is a SAN. USB has no possibility to even try an >>>interaction after the device is gone. We have to handle this flexibly. >>> >>> >>Thus the example in the original post. I.e. for simple transports whose >>portals get notified when a device is plugged off (USB), the LLDD >>can notify SCSI Core, by setting a state variable in scsi_device. >>In which case SCSI Core can answer with the proper TARGET error code. >>(This was outlined before, scsi_command->online:1 ...) >> >> > >Very well, so you agree that the SCSI layer should export to the LLDD >a function to set devices offline? 
> > > >>>In fact, if a device >>>can vanish without a LLDD knowing about it, this is purely a problem of >>>the SCSI layer. >>> >>> >>No, of course not. (Think of IP.) When a device vanishes and LLDD doesn't >>know about it (more complicated transports), the CDB will return with >>the proper Service Response, since the transport(s) won't be able to >>deliver it. This will bubble up through SCSI Core and the error returned >>will have to be the same as that of the simpler transports, as outlined >>above. >> >> > >Yes, sorry. To be precise, this means that the LLDD has to do nothing >special, as it has to implement checking for a failing command anyway. >But it's not entirely the same. If a command cannot be delivered it may or may >not be appropriate to start error recovery. After the LLDD has told >the SCSI layer that it has noticed a device going away, there must be no >error recovery. > > > >>>That means that we have to have a way to ensure that no more commands >>>will reach the LLDD which can be triggered without any commands to be >>>executed at all. This functionality has to come from the scsi mid layer. >>> >>> >>For simple transports yes; for more complicated ones, the CDB will >>not be able to be delivered, and will return with error. >> >> > >Good. >So the first thing a LLDD has to do after it has learned about a device >being removed is to have the device block. >1. set device offline >But commands may still be in flight. IMHO it is not right to assume that >all commands now in flight to a device have failed, as some may have >completed successfully in time, or failed for other reasons than unplugging. >So it should be the LLDD's responsibility to finish the outstanding commands. >Furthermore, there's a window for commands already having passed the check >for offline but not yet being noticed by the LLDD. The simplest solution is to >use a waiting primitive from RCU. So we are at: > >1. set device offline >2. synchronize the kernel >3. 
finish all pending commands > >So far with me? >The LLDD could now forget about the device and be done with it. >However there's a problem left. The device may come back. >What happens if a device with the same ID is reconnected? > > Regards > Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
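The three steps quoted at the end of the message above (set device offline, synchronize, finish all pending commands) can be modelled in a few lines of single-threaded C. This is a sketch, not driver code: `toy_dev`, `toy_queue()` and `toy_unplug()` are hypothetical names, and step 2 is only a comment because the model has no concurrent submitters to wait for. It does show the point made in the quoted text that commands which already completed keep their own result and only those still in flight are failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CMDS 4

enum toy_cmd_state { TOY_IN_FLIGHT, TOY_DONE_OK, TOY_DONE_NO_CONNECT };

struct toy_dev {
    bool online;
    enum toy_cmd_state cmds[TOY_MAX_CMDS];
    size_t ncmds;
};

/* Submit one command; refused once the device is offline or the queue is full. */
bool toy_queue(struct toy_dev *d)
{
    assert(d != NULL);
    if (!d->online || d->ncmds == TOY_MAX_CMDS)
        return false;
    d->cmds[d->ncmds++] = TOY_IN_FLIGHT;
    return true;
}

/* Unplug handling; returns how many in-flight commands were failed. */
size_t toy_unplug(struct toy_dev *d)
{
    size_t failed = 0;

    assert(d != NULL);
    d->online = false;                    /* 1. set device offline */
    /* 2. a real driver would wait here until no submitter can still be
     *    past the online check; nothing to wait for in this model */
    for (size_t i = 0; i < d->ncmds; i++) {   /* 3. finish pending commands */
        if (d->cmds[i] == TOY_IN_FLIGHT) {
            d->cmds[i] = TOY_DONE_NO_CONNECT;
            failed++;
        }
    }
    return failed;
}
```

The reconnect question raised at the end of the message (a device with the same ID coming back) is outside what this model covers.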
* Re: A different look at block device hotswap in the Linux kernel 2003-01-23 20:41 ` A different look at block device hotswap in the Linux kernel Steven Dake @ 2003-01-23 21:07 ` Matthew Jacob 2003-01-23 21:06 ` Steven Dake 2003-01-24 0:07 ` Oliver Neukum 1 sibling, 1 reply; 106+ messages in thread From: Matthew Jacob @ 2003-01-23 21:07 UTC (permalink / raw) To: Steven Dake Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > > The key is that the removal request should come from the top, not the > bottom. If someone is stupid enough to surprise remove a device (ie: > unplug their USB SCSI device while the device is in use by the OS), they > get what they deserve (I/O errors, dirty OS data, queued up requests > which never shut down). If they tell the OS that the device is going to > be removed, so it may flush the device and shut down I/O to the device, > the request should be granted on all accounts (expected removal). > Hmm? Windows and OS/X cope with this just fine. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-23 21:07 ` Matthew Jacob @ 2003-01-23 21:06 ` Steven Dake 2003-01-23 21:16 ` Matthew Jacob 0 siblings, 1 reply; 106+ messages in thread From: Steven Dake @ 2003-01-23 21:06 UTC (permalink / raw) To: mjacob Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list I can't speak about OS/X, but I have crashed Windows several times (BSOD) while hot removing a USB SCSI CDROM. As you will notice, when you run Windows and attach a device, a program is started that allows you to notify the OS of the removal so that it may properly remove the device from the OS instead of it being yanked. Thanks -steve Matthew Jacob wrote: >>The key is that the removal request should come from the top, not the >>bottom. If someone is stupid enough to surprise remove a device (ie: >>unplug their USB SCSI device while the device is in use by the OS), they >>get what they deserve (I/O errors, dirty OS data, queued up requests >>which never shut down). If they tell the OS that the device is going to >>be removed, so it may flush the device and shut down I/O to the device, >>the request should be granted on all accounts (expected removal). > >Hmm? Windows and OS/X cope with this just fine. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-23 21:06 ` Steven Dake @ 2003-01-23 21:16 ` Matthew Jacob 0 siblings, 0 replies; 106+ messages in thread From: Matthew Jacob @ 2003-01-23 21:16 UTC (permalink / raw) To: Steven Dake Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Oh, well. I've pulled my camera in the middle of reads and just got the usual whininess. I think I was reacting to your "get what they deserve" comment. The end goal of USB should probably *be* an alert that said "oh, dear, that wasn't helpful- please put that memory stick back so I can finish writing it". The message "die, heathen dog luser!" is not exactly the right idea. In the matrix of outcomes of pulling a disk (or a fibre channel cable) in the middle of I/O, there are many entries that are not recoverable, many entries are hard to recover from, and many that are easy. This should be irrelevant to the basic policy decision as to how you want your system to be used- do you want it to require intervention so that it is "safe" to change h/w? do you want I/O to autorestart after (temporary) h/w topology changes? Have these questions been answered or can they be answered via policies? On Thu, 23 Jan 2003, Steven Dake wrote: > I cant speak about OS/X, but I have crashed windows several times (BSOD) > while hot removing a USB SCSI CDROM. As you will notice, when you run > windows and attach a device, there is a program that is started that > allows you to notify the os of the removal so that it may properly > remove the device from the OS instead of it being yanked. > > Thanks > -steve > > Matthew Jacob wrote: > > >>The key is that the removal request should come from the top, not the > >>bottom. 
If someone is stupid enough to surprise remove a device (ie: > >>unplug their USB SCSI device while the device is in use by the OS), they > >>get what they deserve (I/O errors, dirty OS data, queued up requests > >>which never shut down). If they tell the OS that the device is going to > >>be removed, so it may flush the device and shut down I/O to the device, > >>the request should be granted on all accounts (expected removal). > > > >Hmm? Windows and OS/X cope with this just fine. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-23 20:41 ` A different look at block device hotswap in the Linux kernel Steven Dake 2003-01-23 21:07 ` Matthew Jacob @ 2003-01-24 0:07 ` Oliver Neukum 2003-01-24 0:21 ` Matthew Jacob 2003-01-24 0:54 ` Steven Dake 1 sibling, 2 replies; 106+ messages in thread From: Oliver Neukum @ 2003-01-24 0:07 UTC (permalink / raw) To: Steven Dake Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list On Thursday, 23 January 2003 21:41, Steven Dake wrote: > Oliver and others, > > In regards to hotswap, any real operating system should be _told_ that a > block device is going to be removed from the top. There are several > reasons. Users don't do what they should. It is as simple as that. The hotplugging busses are supposed to handle that. > 1) File mounts should be removed from the filesystem layer > 2) files accessing block devices directly should be terminated > 3) raid members using that block device should be hot removed > 4) I'm sure you can think of others :) > > The key is that the removal request should come from the top, not the > bottom. If someone is stupid enough to surprise remove a device (ie: No! You have to be able to handle a sudden failure. If you don't do this you are already buggy. Hardware doesn't send advance notification before failing. Data loss will occur. It's unavoidable. Anything else must not happen. And a failure of hardware can only be recognised at the layer closest to the hardware in the generic case. > The device driver should not be responsible for managing hotswap in any > regard. Its only purpose should be to tell the block device removal Yes. > layer that a surprise extraction was initiated such that the block > device removal code can ask the mid layer drivers to shut down error > correction routines to the device and dump its pending I/O queue and > clean up after the device. 
The main advantage of this technique is Yes. But not ask. Demand. There's no asking here. Do or die. > If you think about what you're suggesting, you're suggesting that the LLDD > tells the scsi layer that the device is gone, that then times out errors > and leaves the filesystem and sys_open/close file tables, and RAID > layers in a state of disarray. We don't want the LLDD knowing about the > RAID system and whether it should tell the RAID layer to hot remove, do we? I want: LLDD to SCSI: device is gone SCSI to LLDD: Ok. I'll handle from here on. LLDD: OK. I am gone. And won't have any contact until the next device is plugged in. The process can be somewhat more complicated, under some conditions: - it never fails - it is done within a finite, bounded, reasonable time Regards Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-24 0:07 ` Oliver Neukum @ 2003-01-24 0:21 ` Matthew Jacob 2003-01-24 7:53 ` David Brownell 2003-01-24 0:54 ` Steven Dake 1 sibling, 1 reply; 106+ messages in thread From: Matthew Jacob @ 2003-01-24 0:21 UTC (permalink / raw) To: Oliver Neukum Cc: Steven Dake, Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > I want: > LLDD to SCSI: device is gone > SCSI to LLDD: Ok. I'll handle from here on. > LLDD: OK. I am gone. And won't have any contact until the next device is > plugged in. > > The process can be somewhat more complicated, under some conditions: > - it never fails > - it is done within a finite, bounded, reasonable time Could this time limit be fixed (or parameterized) known to all LLDDs? This would allow one to try and avoid flooding SCSI with detach/reattach events for the 'same' device. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-24 0:21 ` Matthew Jacob @ 2003-01-24 7:53 ` David Brownell 2003-01-24 15:26 ` Matthew Jacob 0 siblings, 1 reply; 106+ messages in thread From: David Brownell @ 2003-01-24 7:53 UTC (permalink / raw) To: mjacob Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Matthew Jacob wrote: >>I want: >>LLDD to SCSI: device is gone >>SCSI to LLDD: Ok. I'll handle from here on. >>LLDD: OK. I am gone. And won't have any contact until the next device is >>plugged in. >> >>... > > > Could this time limit be fixed (or parameterized) known to all LLDDs? > This would allow one to try and avoid flooding SCSI with detach/reattach > events for the 'same' device. And what exactly is the "same" device? And who's keeping history about devices that have previously been attached? And, says the guy who's full of questions, didn't Linus want to get rid of such history? - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-24 7:53 ` David Brownell @ 2003-01-24 15:26 ` Matthew Jacob 0 siblings, 0 replies; 106+ messages in thread From: Matthew Jacob @ 2003-01-24 15:26 UTC (permalink / raw) To: David Brownell Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list > >>... > > > > > > Could this time limit be fixed (or parameterized) known to all LLDDs? > > This would allow one to try and avoid flooding SCSI with detach/reattach > > events for the 'same' device. > > And what exactly is the "same" device? And who's keeping history > about devices that have previously been attached? And, says the guy > who's full of questions, didn't Linus want to get rid of such history? Hrmm. That's a damned good point. I was going to say things like "the FC HBA driver knows that device XYX left the fabric and now has returned", but if XYZ left the fabric, why am I keeping track of it still? Once gone, it's gone. I had convinced myself that if an FC device (re)appears, it's not up to the HBA to say it's the same (the content may have been changed even if the container tag is the same). Hrm. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: A different look at block device hotswap in the Linux kernel 2003-01-24 0:07 ` Oliver Neukum 2003-01-24 0:21 ` Matthew Jacob @ 2003-01-24 0:54 ` Steven Dake 2003-01-24 2:35 ` [linux-usb-devel] " Matthew Dharm 1 sibling, 1 reply; 106+ messages in thread From: Steven Dake @ 2003-01-24 0:54 UTC (permalink / raw) To: Oliver Neukum Cc: Luben Tuikov, Alan Stern, David Brownell, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list >I want: >LLDD to SCSI: device is gone >SCSI to LLDD: Ok. I'll handle from here on. >LLDD: OK. I am gone. And won't have any contact until the next device is >plugged in. > > The downside of this approach is that the LLDD must now be able to detect insertions and removals when it may not be able to do so. If it is able to do so, then fine, it can tell upper layers about it, but the actual control of removal of a device should occur higher up to fix several problems with the approach of having the LLDD manage the hotswap state of the device. >The process can be somewhat more complicated, under some conditions: >- it never fails >- it is done within a finite, bounded, reasonable time > > Regards > Oliver ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: A different look at block device hotswap in the Linux kernel 2003-01-24 0:54 ` Steven Dake @ 2003-01-24 2:35 ` Matthew Dharm 0 siblings, 0 replies; 106+ messages in thread From: Matthew Dharm @ 2003-01-24 2:35 UTC (permalink / raw) To: Steven Dake Cc: Oliver Neukum, Luben Tuikov, Alan Stern, David Brownell, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list [-- Attachment #1: Type: text/plain, Size: 1622 bytes --] On Thu, Jan 23, 2003 at 05:54:57PM -0700, Steven Dake wrote: > >I want: > >LLDD to SCSI: device is gone > >SCSI to LLDD: Ok. I'll handle from here on. > >LLDD: OK. I am gone. And won't have any contact until the next device is > >plugged in. > > > > > The downside of this approach is that the LLDD must now be able to > detect insertions and removals when it may not be able to do so. If it > is able to do so, then fine, it can tell upper layers about it, but the > actual control of removal of a device should occur higher up to fix > several problems with the approach of having the LLDD manage the hotswap > state of the device. Huh? Aren't we talking about a hotplug scenario? How can you talk about the 'LLDD must now be able to detect... when it may not be able to do so.'? Oh... I see. We keep talking about devices. I'm trying to hotswap an entire host, which is mapped to a single USB device. But the theory is the same, really. In the end, you can only hotswap something that is hotswapable. That means that the driver has to support the hotswap system, whatever it is. If you can't support hotswap detection, then this entire scenario is reduced to 'what happens if I blow a FET on my HD', because it's the exact same thing. Recovering from fatal error is a separate discussion. Matt -- Matthew Dharm Home: mdharm-usb@one-eyed-alien.net Maintainer, Linux USB Mass Storage Driver A: The most ironic oxymoron wins ... DP: "Microsoft Works" A: Uh, okay, you win. -- A.J. 
& Dust Puppy User Friendly, 1/18/1998 [-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --] ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 2003-01-21 18:16 ` Luben Tuikov 2003-01-21 19:00 ` Oliver Neukum @ 2003-01-22 21:30 ` David Brownell 1 sibling, 0 replies; 106+ messages in thread From: David Brownell @ 2003-01-22 21:30 UTC (permalink / raw) To: Luben Tuikov Cc: Oliver Neukum, Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel, Linux SCSI list Luben Tuikov wrote: > > When the Low Level Device Driver (LLDD), being the transport portal, > notices that the device is going away or has gone away from the > ``fabric'' (wlg), it will fire a device-gone event with the kernel. > *Not* necessarily with SCSI Core, in fact I'd rather it didn't, > but with a well defined kernel entry for device-gone events. > > At the same time the LLDD will start returning TARGET gone, or > whatever is appropriate to newly queued commands, and error out > all internally queued commands (if it does it's own queuing). > (I've seen this work nicely on mount and read/write(2) and fsck.) > > I.e. the ``synchronization'' has started already by the LLDD erroring > out commands, new and queued. This model I like, though FWIW the USB code has pretty messy handling of the (a) cancel queued requests and (b) reject new requests parts of that model. All the updates to handle that would be transparent to device drivers (other than HCDs); we've discussed it a bit. > All the while the kernel has started higher level cleaning up, > decrementing ref counts, etc, ... Invoking some hotplug event that might unmount filesystems, so _all_ relevant kernel state can be cleaned up ... > But there's no such thing as ``waiting around indefinitely'' or > ``blocking wait'' as you've suggested in some of your emails. Right. Though I can wish the driver model core actually used its "enum device_state" and had an instance variable of that type in "struct device". That'd help the bus level driver (for SCSI, an LLDD or LLD; for USB, an HCD) with (a) and (b). 
The "no waiting indefinitely" is in part that because knowing both (a) and (b) happen means you know the device quiesces. - Dave ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 20:08 ` David Brownell
  2003-01-20 20:48 ` [linux-usb-devel] " Oliver Neukum
@ 2003-01-20 22:16 ` Luben Tuikov
  2003-01-20 22:51 ` David Brownell
  1 sibling, 1 reply; 106+ messages in thread
From: Luben Tuikov @ 2003-01-20 22:16 UTC (permalink / raw)
  To: David Brownell
  Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH,
      linux-usb-devel, Linux SCSI list

David Brownell wrote:
> Luben Tuikov wrote:
>
>>> The way this should work is that the LLD calls scsi_remove_device(), and
>>> that cuts off the flow of commands. The LLD can promise to error-out any
>>> pending commands in the device command queue.
>>
>> I take it you mean that the transport will tell the LLDD that the device
>> is gone and it (LLDD) calls the one above, SCSI Core, to remove the device.
>>
>> Hmm, more thinking needs to be done here, as shouldn't this be handled
>> by hotplugging? I.e. Targets do not *initiate* events.
>
> Not exactly, but the bus driver ("transport"?) certainly does initiate
> reports like "here's a new device on the bus" or "that device is gone".
> That's when hotplugging kicks in (both in-kernel and in-userland).
>
> And the only way to access a device ("target") on the bus is to give a
> request to that bus driver. If, when servicing that request, the bus
> driver notices the device is gone ... that can act a lot like a device
> initiating a "device gone" event would look.

David, when I said ``... the transport will tell the LLDD that the
device ...'' this is *exactly* what I meant. You're just repeating
it here in a more broken-down way.

By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
and the initiator (aka the initiator port). This terminology is not
really that new, but still not that old, and described in SAM-3.

>> The transport can notify that the device is gone, but an ULP entity will
>> call scsi_remove_device() not the other way around.
>
> That's how USB works today: khubd shuts things down. Device drivers
> get disconnect() callbacks, just as when their modules are removed.

Pardon me, I'm not very familiar with the USB subsystem, but this
only makes sense -- why would anyone do it any other way... :-)

> EXCEPT that "khubd" is part of usbcore (roughly analogous to parts
> of the scsi mid-layer) ... so the drivers acting as host side proxies
> for the target hardware ("usb device") are purely reactive. Their
> only roles in hotplug scenarios are to bind to devices (when a new
> one appears, using probe callbacks) or unbind from them (when one
> goes away, using disconnect callbacks).

Very nice.

> Those disconnect() callbacks have a few key responsibilities, very
> much including shutting down the entire higher level I/O queue to
> that device. I think you're saying that SCSI drivers don't have
> such a responsibility (unlike USB or PCI) ... if so, that would
> seem to be worth changing.

We just cannot let a transport event just wipe out a device,
without consulting hotplugging first -- think security.

SCSI drivers' (LLDD) responsibility is changing. This is inevitable,
due to the reorganization of SAM-3 and SPC-3. There's no longer such
a thing as a ``bus'' in SCSI; a ``bus'' *may* be a concept of the
transport, and then again it may not.

General:
--------

SCSI was never designed to support Target initiated events.
SAM-3 has no provision for it, except passively when the next
command status is returned (e.g. UA).

For this reason, device removal is a *transport*-related event --
it has *nothing* to do with the SCSI target/target device, except
that it's gone :-) . Being pedantic, this would be a
/transport initiated event/ .

When this event takes place, the LLDD will notice it, and let
the kernel know about it, via a callback, all the while the LLDD
will return TARGET error (since it's gone), until it has been told
slave_destroy(), after which it should never be queried about it,
and if it is, it should return the same error.

That is, when a transport event takes place, the LLDD doesn't
have to ``run to'' SCSI Core right away. Just let the kernel
know about this event, and start returning errors, on newly
queued commands.

The kernel will decide what to do about this device going away,
i.e. hotplugging, sysop notification, etc.

I guess we're crossing in this discussion in such a way, just
because of USB and SCSI crossing here. But if we think that
USB is the transport and that it could also be FC, SPI, SSA, iSCSI,
then a general framework of the workings is inevitable.

I.e. when talking about LLDDs we'd concentrate less on ``Target''
and more on ``transport''.

--
Luben

^ permalink raw reply [flat|nested] 106+ messages in thread
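[Editorial sketch, not part of the original message.] The removal sequence described above (the transport notices the device is gone, the LLDD notifies the kernel through a callback, and every command queued afterwards gets a target error, before and after slave_destroy()) can be modeled in a few lines of user-space Python. This is only a toy state machine under stated assumptions, not kernel code: ModelLLDD, DeviceState, TargetGoneError and the "device-gone" event string are hypothetical names invented for illustration; only the slave_destroy() step is taken from the message.

```python
from enum import Enum, auto


class DeviceState(Enum):
    PRESENT = auto()    # device reachable over the transport
    GONE = auto()       # transport reported removal; teardown not yet done
    DESTROYED = auto()  # the slave_destroy()-style teardown has run


class TargetGoneError(Exception):
    """Hypothetical stand-in for the 'TARGET error' returned for a gone device."""


class ModelLLDD:
    """Toy model of one low-level driver instance handling one device."""

    def __init__(self, notify):
        self.state = DeviceState.PRESENT
        self.notify = notify   # callback used to tell "the kernel" about events
        self.completed = []    # commands that actually reached the device

    def transport_device_gone(self):
        # Transport-initiated event: report it once, then just start failing
        # new commands. The driver itself does not tear anything down here.
        if self.state is DeviceState.PRESENT:
            self.state = DeviceState.GONE
            self.notify("device-gone")

    def queue_command(self, cmd):
        # Same error before and after slave_destroy(): the target is gone.
        if self.state is not DeviceState.PRESENT:
            raise TargetGoneError(cmd)
        self.completed.append(cmd)

    def slave_destroy(self):
        # Called by the upper layer once it has decided what to do.
        self.state = DeviceState.DESTROYED
```

In this model the decision about what to do with the vanished device stays entirely with whoever receives the notify callback, which is the division of labor the message argues for.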
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 22:16 ` Luben Tuikov
@ 2003-01-20 22:51 ` David Brownell
  2003-01-20 23:27 ` Oliver Neukum
  0 siblings, 1 reply; 106+ messages in thread
From: David Brownell @ 2003-01-20 22:51 UTC (permalink / raw)
  To: Luben Tuikov
  Cc: Matthew Dharm, Oliver Neukum, Mike Anderson, Greg KH,
      linux-usb-devel, Linux SCSI list

Luben Tuikov wrote:
> David, when I said ``... the transport will tell the LLDD that the
> device ...'' this is *exactly* what I meant. You're just repeating
> it here in a more broken-down way.

OK

> By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
> and the initiator (aka the initiator port). This terminology is not
> really that new, but still not that old, and described in SAM-3.

I was hoping for something described in the 2.5.58 kernel docs,
which only talks about LLD (Documentation/scsi) except in one case
(looked like a typo) ... I remember SAM-3 as a kind of missile!

> We just cannot let a transport event just wipe out a device,
> without consulting hotplugging first -- think security.

Certainly "device gone" would be an auditable event, but this is
primarily an integrity issue: don't free objects until other
components have stopped using them.

If any components attach security policies to that "gone" state
transition, that'd be atypical but purely their own business.
(Like a transport erasing session master keys ... most transports
wouldn't have them, and would likely erase them as soon as the
device is known to be gone, no hotplug involved.)

> That is, when a transport event takes place, the LLDD doesn't
> have to ``run to'' SCSI Core right away. Just let the kernel
> know about this event, and start returning errors, on newly
> queued commands.
>
> The kernel will decide what to do about this device going away,
> i.e. hotplugging, sysop notification, etc.

Sounds right. Except that it'd normally be the SCSI core that
we "let" know about the event. (Not always, I can imagine that
some transports might be able to kick in recovery procedures
and find some other path for accessing the device. But in
such cases, SCSI might never see the device as "gone" ... )

- Dave

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58
  2003-01-20 22:51 ` David Brownell
@ 2003-01-20 23:27 ` Oliver Neukum
  0 siblings, 0 replies; 106+ messages in thread
From: Oliver Neukum @ 2003-01-20 23:27 UTC (permalink / raw)
  To: David Brownell, Luben Tuikov
  Cc: Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
      Linux SCSI list

> > By transport I mean USB, FC, SPI, etc; LLDD is the transport portal
> > and the initiator (aka the initiator port). This terminology is not
> > really that new, but still not that old, and described in SAM-3.
>
> I was hoping for something described in the 2.5.58 kernel docs,
> which only talks about LLD (Documentation/scsi) except in one case
> (looked like a typo) ... I remember SAM-3 as a kind of missile!

Possibly the view people at IBM have of SCSI is more comprehensive
than ordinarily used. We are probably better off talking about
low level drivers (LLDs). Whether these drive devices, busses or
transport mechanisms is not really relevant here.

> > We just cannot let a transport event just wipe out a device,
> > without consulting hotplugging first -- think security.
>
> Certainly "device gone" would be an auditable event, but this is
> primarily an integrity issue: don't free objects until other
> components have stopped using them.

Right. There's nothing wrong with a LLD having to wait for a _limited_
time by making a blocking call to the midlayer. But that call must finish
within a limited, reasonable time and it must succeed.
The complexity for handling hotunplugging belongs squarely in a centralised
place, not in every LLD.

It is true that currently hotplugging is a generic safety problem
(unless you use devfs). The problem is reuse of device nodes leading
to a race with the hotplug scripts. Old permissions may for a time be
applied to a new device.
But that just means that the hotplugging user space notification model
is incomplete.

Now the simple fix with a callback into the LLD which would allow
simply waiting for the unplugging script to run is true madness.
You cannot stall a hotpluggable subsystem waiting on a script,
neither can you do sane error handling.
Problems should be fixed where they arise, e.g. "lock" a new hotplugged
device.

> If any components attach security policies to that "gone" state
> transition, that'd be atypical but purely their own business.
> (Like a transport erasing session master keys ... most transports
> wouldn't have them, and would likely erase them as soon as the
> device is known to be gone, no hotplug involved.)

Right.

> > That is, when a transport event takes place, the LLDD doesn't
> > have to ``run to'' SCSI Core right away. Just let the kernel
> > know about this event, and start returning errors, on newly
> > queued commands.
> >
> > The kernel will decide what to do about this device going away,
> > i.e. hotplugging, sysop notification, etc.
>
> Sounds right. Except that it'd normally be the SCSI core that
> we "let" know about the event. (Not always, I can imagine that
> some transports might be able to kick in recovery procedures
> and find some other path for accessing the device. But in
> such cases, SCSI might never see the device as "gone" ... )

I must disagree. There's no decision involved here.
You handle tasks having the device open, clean things up,
free up the resources and fire off a user space notification.
Decisions get made in user space where policy can be reasonably
implemented.

Regards
Oliver

^ permalink raw reply [flat|nested] 106+ messages in thread
* RE: A different look at block device hotswap in the Linux kernel
@ 2003-01-24 16:36 Cress, Andrew R
2003-01-24 18:01 ` Bryan Henderson
0 siblings, 1 reply; 106+ messages in thread
From: Cress, Andrew R @ 2003-01-24 16:36 UTC (permalink / raw)
To: 'mjacob@feral.com', David Brownell
Cc: Oliver Neukum, Steven Dake, Luben Tuikov, Alan Stern,
Matthew Dharm, Mike Anderson, Greg KH, linux-usb-devel,
Linux SCSI list
My $.02:
Comparing a saved device-list snapshot with the current device list should
be the responsibility of a user-space daemon, provided that the kernel
exposes enough information to uniquely identify the devices (like serial
numbers, or some other UID if no serial number exists).
The kernel would assume that a device is new (not the same) unless told
otherwise by a daemon that is watching.
Andy
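
[Editorial sketch, not part of the original message.] The snapshot comparison suggested above could look roughly like the following in a user-space daemon. It is a minimal sketch that assumes the kernel exposes some unique identifier per device; the function name classify_devices, the dictionary layout (identifier mapped to device node name) and the serial numbers in the usage note are all hypothetical.

```python
def classify_devices(saved, current):
    """Compare two snapshots keyed by a unique device identifier.

    Both arguments map an identifier (e.g. a serial number) to the device
    node name seen at that time. Returns three dicts:
      returned - identifiers present in both snapshots (current node names)
      new      - identifiers seen only in the current snapshot
      missing  - identifiers seen only in the saved snapshot
    """
    returned = {uid: node for uid, node in current.items() if uid in saved}
    new = {uid: node for uid, node in current.items() if uid not in saved}
    missing = {uid: node for uid, node in saved.items() if uid not in current}
    return returned, new, missing
```

For example, with saved = {"SN-A": "sda", "SN-B": "sdb"} and current = {"SN-B": "sdc", "SN-C": "sdb"}, the device "SN-B" is recognized as returning under a different node name, "SN-C" is new even though it reuses the node name "sdb", and "SN-A" is missing. Whether a returning device should be treated as "the same" remains a policy choice for the daemon, not for the kernel.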
-----Original Message-----
From: Matthew Jacob [mailto:mjacob@feral.com]
Sent: Friday, January 24, 2003 10:26 AM
To: David Brownell
Cc: Oliver Neukum; Steven Dake; Luben Tuikov; Alan Stern; Matthew Dharm;
Mike Anderson; Greg KH; linux-usb-devel@lists.sourceforge.net; Linux SCSI
list
Subject: Re: A different look at block device hotswap in the Linux kernel
> >>...
> >
> >
> > Could this time limit be fixed (or parameterized) known to all LLDDs?
> > This would allow one to try and avoid flooding SCSI with detach/reattach
> > events for the 'same' device.
>
> And what exactly is the "same" device? And who's keeping history
> about devices that have previously been attached? And, says the guy
> who's full of questions, didn't Linus want to get rid of such history?
Hrmm. That's a damned good point. I was going to say things like "the
FC HBA driver knows that device XYZ left the fabric and now has
returned", but if XYZ left the fabric, why am I keeping track of it
still? Once gone, it's gone. I had convinced myself that if an FC device
(re)appears, it's not up to the HBA to say it's the same (the content
may have been changed even if the container tag is the same).
Hrm.
^ permalink raw reply [flat|nested] 106+ messages in thread
* RE: A different look at block device hotswap in the Linux kernel
  2003-01-24 16:36 A different look at block device hotswap in the Linux kernel Cress, Andrew R
@ 2003-01-24 18:01 ` Bryan Henderson
  2003-01-24 18:09 ` Matthew Jacob
  0 siblings, 1 reply; 106+ messages in thread
From: Bryan Henderson @ 2003-01-24 18:01 UTC (permalink / raw)
  To: Cress, Andrew R
  Cc: andmike, David Brownell, Greg KH, Linux SCSI list,
      linux-usb-devel, Luben Tuikov, Matthew Dharm,
      'mjacob@feral.com', Oliver Neukum, Steven Dake, Alan Stern

>The comparing of a saved device list snapshot with the current device should
>be the responsibility of

From a usability standpoint, I don't think any such comparing should be
done by anyone. When I unplug a device and then plug it in again, I want a
total reset. I'm willing to take my lumps if I unplug something that isn't
in a state to be safely unplugged.

It's like when I pull the power plug because my system is totally hosed and
I want to start over. I know I can cause damage by doing that, but I would
be upset if the new system booted back to the broken state it was in when I
unplugged it.

^ permalink raw reply [flat|nested] 106+ messages in thread
* RE: A different look at block device hotswap in the Linux kernel
  2003-01-24 18:01 ` Bryan Henderson
@ 2003-01-24 18:09 ` Matthew Jacob
  0 siblings, 0 replies; 106+ messages in thread
From: Matthew Jacob @ 2003-01-24 18:09 UTC (permalink / raw)
  To: Bryan Henderson
  Cc: Cress, Andrew R, andmike, David Brownell, Greg KH,
      Linux SCSI list, linux-usb-devel, Luben Tuikov, Matthew Dharm,
      Oliver Neukum, Steven Dake, Alan Stern

>
> It's like when I pull the power plug because my system is totally hosed and
> I want to start over. I know I can cause damage by doing that, but I would
> be upset if the new system booted back to the broken state it was in when I
> unplugged it.

I had this conversation with Doug off-list: this is a policy choice. You
may want your device to reattach as totally new. You may, on the other
hand, want your device to resume where you left off. I can see valid
reasons for wanting either behaviour (but it can't/shouldn't be deduced by
the OS).

^ permalink raw reply [flat|nested] 106+ messages in thread
end of thread, other threads:[~2003-03-01 1:41 UTC | newest]
Thread overview: 106+ messages
[not found] <10426732153816@kroah.com>
[not found] ` <10426732212871@kroah.com>
[not found] ` <20030116093112.B29001@one-eyed-alien.net>
[not found] ` <20030116173539.GA31235@kroah.com>
2003-01-16 19:43 ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 Matthew Dharm
2003-01-16 19:53 ` Greg KH
[not found] ` <20030116195306.GA32697@kroah.com>
2003-01-16 20:10 ` Linus Torvalds
2003-01-16 20:43 ` greg kh
2003-01-16 21:41 ` Linus Torvalds
2003-01-16 22:51 ` Matthew Dharm
2003-01-16 20:40 ` David Brownell
2003-01-16 20:48 ` Mike Anderson
2003-01-16 23:43 ` Oliver Neukum
2003-01-17 8:50 ` Mike Anderson
2003-01-17 10:55 ` Oliver Neukum
2003-01-17 15:06 ` Alan Stern
2003-01-17 18:54 ` Matthew Dharm
2003-01-17 20:25 ` Mike Anderson
2003-01-17 22:07 ` Oliver Neukum
2003-01-17 20:26 ` [linux-usb-devel] " Oliver Neukum
2003-01-17 20:49 ` Mike Anderson
2003-01-20 17:36 ` Luben Tuikov
2003-01-20 18:23 ` Oliver Neukum
2003-01-20 18:56 ` Luben Tuikov
2003-01-20 19:10 ` [linux-usb-devel] " Oliver Neukum
2003-01-20 19:50 ` David Brownell
2003-01-21 3:31 ` Alan
2003-01-21 7:17 ` Oliver Neukum
2003-01-21 11:57 ` [linux-usb-devel] " Douglas Gilbert
2003-01-21 13:48 ` Oliver Neukum
2003-01-21 18:22 ` Luben Tuikov
2003-01-21 13:30 ` James Bottomley
2003-01-20 20:08 ` David Brownell
2003-01-20 20:48 ` [linux-usb-devel] " Oliver Neukum
2003-01-20 21:24 ` David Brownell
2003-01-20 21:51 ` [linux-usb-devel] " Oliver Neukum
2003-01-20 22:26 ` David Brownell
2003-01-20 23:00 ` Oliver Neukum
2003-01-21 0:44 ` David Brownell
2003-01-21 0:50 ` Oliver Neukum
2003-01-21 18:16 ` Luben Tuikov
2003-01-21 19:00 ` Oliver Neukum
2003-01-21 20:02 ` [linux-usb-devel] " Luben Tuikov
2003-01-21 21:02 ` Alan Stern
2003-01-22 21:50 ` Luben Tuikov
2003-01-22 22:46 ` Oliver Neukum
2003-01-23 17:46 ` Luben Tuikov
2003-01-23 18:19 ` Oliver Neukum
2003-01-23 19:07 ` Luben Tuikov
2003-01-23 19:40 ` Oliver Neukum
2003-01-23 20:28 ` Doug Ledford
2003-01-23 20:59 ` Oliver Neukum
2003-01-23 21:34 ` Doug Ledford
2003-01-23 22:39 ` Oliver Neukum
2003-01-23 23:23 ` Doug Ledford
2003-01-23 23:25 ` Matthew Dharm
2003-01-24 15:34 ` Alan Stern
2003-01-24 16:06 ` Oliver Neukum
2003-01-24 17:58 ` [linux-usb-devel] " Doug Ledford
2003-01-24 19:00 ` Luben Tuikov
2003-01-24 22:23 ` Oliver.Neukum
2003-01-24 19:10 ` Luben Tuikov
2003-01-24 19:56 ` [linux-usb-devel] " Alan Stern
2003-01-24 20:11 ` Luben Tuikov
2003-01-24 21:09 ` Luben Tuikov
2003-01-24 21:55 ` Alan Stern
2003-01-24 22:03 ` Luben Tuikov
2003-01-24 23:21 ` Mike Anderson
2003-01-24 21:48 ` Doug Ledford
2003-01-24 22:59 ` Mike Anderson
2003-01-24 23:17 ` [linux-usb-devel] " Doug Ledford
2003-01-25 0:24 ` Luben Tuikov
2003-01-25 1:35 ` Mike Anderson
2003-01-24 23:25 ` Matthew Dharm
2003-01-25 0:05 ` Doug Ledford
2003-01-25 0:45 ` Matthew Dharm
2003-01-25 1:07 ` Doug Ledford
2003-02-02 18:13 ` Matthew Dharm
2003-02-02 20:06 ` Matthew Dharm
2003-02-03 17:17 ` Mike Anderson
2003-02-16 21:18 ` Matthew Dharm
2003-02-17 19:37 ` Mike Anderson
2003-02-17 19:51 ` Patrick Mansfield
2003-02-23 7:48 ` Matthew Dharm
2003-02-26 23:37 ` Mike Anderson
2003-02-27 1:10 ` Matthew Dharm
2003-02-27 6:37 ` Mike Anderson
2003-02-27 19:32 ` Matthew Dharm
2003-03-01 1:41 ` Matthew Dharm
2003-02-02 3:49 ` Matthew Dharm
2003-01-25 1:24 ` Luben Tuikov
2003-01-24 0:15 ` Patrick Mansfield
2003-01-24 8:33 ` David Brownell
2003-01-23 20:41 ` A different look at block device hotswap in the Linux kernel Steven Dake
2003-01-23 21:07 ` Matthew Jacob
2003-01-23 21:06 ` Steven Dake
2003-01-23 21:16 ` Matthew Jacob
2003-01-24 0:07 ` Oliver Neukum
2003-01-24 0:21 ` Matthew Jacob
2003-01-24 7:53 ` David Brownell
2003-01-24 15:26 ` Matthew Jacob
2003-01-24 0:54 ` Steven Dake
2003-01-24 2:35 ` [linux-usb-devel] " Matthew Dharm
2003-01-22 21:30 ` [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58 David Brownell
2003-01-20 22:16 ` Luben Tuikov
2003-01-20 22:51 ` David Brownell
2003-01-20 23:27 ` Oliver Neukum
2003-01-24 16:36 A different look at block device hotswap in the Linux kernel Cress, Andrew R
2003-01-24 18:01 ` Bryan Henderson
2003-01-24 18:09 ` Matthew Jacob