* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
  [not found] <200210242002.g9OK27W03864@localhost.localdomain>
From: Steven Dake @ 2002-10-24 20:45 UTC
To: James Bottomley; +Cc: linux-scsi, linux-kernel

James,

Some responses below:

James Bottomley wrote:

>sdake@mvista.com said:
>>I plan to produce a new patch that dumps the filesystem interface and
>>replaces it with driverfs files in /sys/bus/scsi. These things take
>>time, but I hope to be finished by October 25th.
>
>OK, that's good, thanks.
>
>>The current remove interface is unmaintained, doesn't contain locking,
>>and requires laborious string processing resulting in slow results.
>
>It is maintained (well, I was planning on looking after it). The locking can
>be added (the 1st part of your patch). It does two in-kernel strncmps.
>That's not really slow by most definitions.

The locking most definitely needs to be added to the kernel. I'm
surprised the original patch didn't contain any locking, but then again,
my first patch didn't either :)

>>Further there is no usage information (which means the usage must
>>come by looking at drivers/scsi/scsi.c which is beyond most typical
>>users).
>
>I don't really think it's the job of the kernel to contain usage information.
>That's the job of the user level documentation.

I've gotten mixed feedback on this. I'll add you to the list that
doesn't like this. Perhaps it should be removed (even though it takes up
minimal memory).

>>Imagine scanning each disk in driverfs looking at its WWN attribute
>>(if it has one) until a match is found. Assume there are 16 FC
>>devices. That is several hundred syscalls just to complete one
>>hotswap operation.
>
>Why is speed so important?

Telecoms and Datacoms have told me in numerous conversations that a
hotswap operation should occur in 20msec. I've arbitrarily set 10msec as
my target to ensure that I meet the worst-case bus-is-loaded responses
during scans, etc.

I can't mention the names of the telecoms, but several with 10000+
employees have mentioned it.

>>This requires the adaptor to maintain a mapping of WWNs to SCSI IDs,
>>however, this is already required by most FibreChannel firmware I've
>>seen (and hence is available in the driver database already).
>
>There will be a point where for a large number of drivers, a linear scan even
>in the kernel will be slower than a good DB lookup in userspace.

This may be true, but most systems will only have at most 4-5 devices.
There's only so much room on PCI for FC devices :)

>>Hotplugs on FibreChannel don't trigger "events". What they can do is
>>LIP (loop initialization procedure) if the device has been configured
>>in its SCSI code pages to do such a thing. Since this is device
>>specific I'd hate to rely on it for hotswap.
>
>They don't now, but they should. The LIP protocol makes the FC driver aware
>of the gain or loss of devices. This should be communicated to the mid-layer
>and then trigger a hotplug event. Someone needs to write this, I was just
>wondering if you might.

I like the idea and it was something I was considering for early next
year. It's driver dependent and until a FC driver is in the kernel,
there's not much point yet :)

Keep in mind also that a LIP is not always generated on an insertion and
isn't generated on a removal at all. This makes insertion easy but
removal still requires user intervention. In Advanced TCA (what spawned
this work) a button is pressed to indicate hotswap removal, which makes
for easy detection of hotswap events. This is why there are kernel
interfaces for removal and insertion (so a kernel driver can be written
to detect the button press, remove the devices from the OS data
structures, and then light a blue LED indicating the board is safe to
remove).

>>I think this would be too slow. 10 msec for my entire hotswap is
>>available. If you calculate 2msec for the actual hotswap disk
>>operation, that leaves 8 msec for the rest of the mess. Scanning
>>through tables or scanning tens or hundreds of files through hundreds
>>of syscalls may be too slow.
>
>Where does the 10ms figure come from?

See above.

Thanks James for reading the code and giving comments!

>James
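The timing argument in the message above can be made concrete with a little arithmetic. The following is only a rough sketch: the 10 msec target, the 2 msec swap cost, and the 16 FC devices come from the message, while the figure of 20 syscalls per device is an assumption picked to land in the "several hundred syscalls" range, not a measurement.

```python
def remaining_budget_ms(total_ms, swap_ms):
    """Time left for lookup work after the disk hotswap operation itself."""
    return total_ms - swap_ms


def per_syscall_budget_us(budget_ms, devices, syscalls_per_device):
    """Average time each syscall may take if the scan must fit the budget."""
    total_syscalls = devices * syscalls_per_device
    return budget_ms * 1000.0 / total_syscalls


# Figures from the thread: 10 msec target, 2 msec for the swap, 16 FC devices.
budget = remaining_budget_ms(10, 2)

# ASSUMPTION: 20 syscalls per device (open/read/close on a few attribute
# files plus directory walking); chosen only for illustration.
per_call = per_syscall_budget_us(budget, devices=16, syscalls_per_device=20)

print(budget, per_call)
```

Under those assumed numbers the scan has 8 msec in total, or an average of 25 microseconds per syscall. Whether that is tight depends on the machine and on how loaded the bus is, which is the point under dispute in the thread.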
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Randy.Dunlap @ 2002-10-24 21:05 UTC
To: Steven Dake; +Cc: James Bottomley, linux-scsi, linux-kernel

On Thu, 24 Oct 2002, Steven Dake wrote:

| James
| Some responses below:
|
| James Bottomley wrote:
|
| >sdake@mvista.com said:

| >I don't really think it's the job of the kernel to contain usage information.
| >That's the job of the user level documentation.
| >
| I've gotten mixed feedback on this. I'll add you to the list that
| doesn't like this.

Add me to that list also.

| perhaps it should be removed (even though it takes up minimal memory).

Yes, I agree.

| >>Imagine scanning each disk in driverfs looking at its WWN attribute
| >>(if it has one) until a match is found. Assume there are 16 FC
| >>devices. That is several hundred syscalls just to complete one
| >>hotswap operation.
| >
| >Why is speed so important?
| >
| Telecoms and Datacoms have told me in numerous conversations that a hotswap
| operation should occur in 20msec. I've arbitrarily set 10msec as my
| target to
| ensure that I meet the worst-case bus-is-loaded responses during scans, etc.
|
| I can't mention the names of the telecoms, but several with 10000+ employees
| have mentioned it.

| >>I think this would be too slow. 10 msec for my entire hotswap is
| >>available. If you calculate 2msec for the actual hotswap disk
| >>operation, that leaves 8 msec for the rest of the mess. Scanning
| >>through tables or scanning tens or hundreds of files through hundreds
| >>of syscalls may be too slow.
| >
| >Where does the 10ms figure come from?
| >
| See above

I've already asked Steve about this and received his answers.
Can't say that I agree with them though, so I asked someone from
a Telecom Equipment Mfr. about this. He said that it's just for
equipment testing, where technicians verify that hotswap works,
and they are too impatient to wait, so they practice surprise removal
instead of coordinated removal. He doesn't think that's how it's
actually done out in the field, just in test labs.

Preface question: does cPCI support surprise removal (in the
PICMG specs, not in some implementation)? I know that PCI hotplug
doesn't support surprise removal, only "coordinated" removal.

So the question that has to be answered IMO is: do we want to
support surprise removal for something like manufacturing test,
which doesn't abide by the coordinated removal protocol?

Or: do we have to support surprise removal, only because it can't
be prevented? I expect that this is the case, but I still don't
see or understand the 20 ms time requirement.

-- 
~Randy
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Steven Dake @ 2002-10-24 21:48 UTC
To: Randy.Dunlap; +Cc: James Bottomley, linux-scsi, linux-kernel

Randy.Dunlap wrote:

>On Thu, 24 Oct 2002, Steven Dake wrote:
>
>| James
>| Some responses below:
>|
>| James Bottomley wrote:
>|
>| >sdake@mvista.com said:
>
>| >I don't really think it's the job of the kernel to contain usage information.
>| >That's the job of the user level documentation.
>| >
>| I've gotten mixed feedback on this. I'll add you to the list that
>| doesn't like this.
>
>add me to that list also.
>
>| perhaps it should be removed (even though it takes up minimal memory).
>yes, i agree.
>
>| >>Imagine scanning each disk in driverfs looking at its WWN attribute
>| >>(if it has one) until a match is found. Assume there are 16 FC
>| >>devices. That is several hundred syscalls just to complete one
>| >>hotswap operation.
>| >
>| >Why is speed so important?
>| >
>| Telecoms and Datacoms have told me in numerous conversations that a hotswap
>| operation should occur in 20msec. I've arbitrarily set 10msec as my
>| target to
>| ensure that I meet the worst-case bus-is-loaded responses during scans, etc.
>|
>| I can't mention the names of the telecoms, but several with 10000+ employees
>| have mentioned it.
>
>| >>I think this would be too slow. 10 msec for my entire hotswap is
>| >>available. If you calculate 2msec for the actual hotswap disk
>| >>operation, that leaves 8 msec for the rest of the mess. Scanning
>| >>through tables or scanning tens or hundreds of files through hundreds
>| >>of syscalls may be too slow.
>| >
>| >Where does the 10ms figure come from?
>| >
>| See above
>
>I've already asked Steve about this and received his answers.
>Can't say that I agree with them though, so I asked someone from
>a Telecom Equipment Mfr. about this. He said that it's just for
>equipment testing, where technicians verify that hotswap works,
>and they are impatient to wait, so they practice surprise removal
>instead of coordinated removal. He doesn't think that's how it's
>actually done out in the field, just in test labs.
>
>Preface question: does cPCI support surprise removal (in the
>PICMG specs, not in some implementation)? I know that PCI hotplug
>doesn't support surprise removal, only "coordinated" removal.

PICMG 2.12 doesn't support surprise removal (the hardware does, the
software doesn't). The latch must first be popped, then the user must
wait for the blue LED. If the blue LED isn't lit, the operating system
isn't ready for the board to be removed.

That said, operators are paid 10 bucks an hour to replace boards and you
know how that goes. :)

For Compact PCI, the surprise removal time is about 100 msec. This is as
fast as the user can rip the board out of a chassis, meaning that if you
can light the blue LED in less than 100 msec, it doesn't matter whether
the extraction is a surprise or not.

>So the question that has to be answered IMO is: do we want to
>support surprise removal for something like manufacturing test,
>which doesn't abide by the coordinated removal protocol?
>
>or: Do we have to support surprise removal, only because it can't
>be prevented? I expect that this is the case, but I still don't
>see or understand the 20 ms time requirement.

For Advanced TCA, there isn't a "latch" that has to be popped before
removing the board. For Compact PCI, the latch must be popped, which
sends a signal to the board. For ATCA, a button is pressed (which is a
major complaint about Advanced TCA: boards can be removed without any
signaling to the OS that the board is being removed).

I'm not sure what the PICMG3 folks are going to do about that problem.
I'm assuming they are going to rework the enumeration of the hotswap
event to be driven by extracting the board instead of by a button. In
that case, extremely fast hotswap times are required, because the board
can be removed very fast in Advanced TCA (vs. the latched method of
Compact PCI).

Perhaps this is where the timing constraints originate.
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Scott Murray @ 2002-10-24 23:00 UTC
To: Randy.Dunlap; +Cc: Steven Dake, James Bottomley, linux-scsi, linux-kernel

On Thu, 24 Oct 2002, Randy.Dunlap wrote:

> Preface question: does cPCI support surprise removal (in the
> PICMG specs, not in some implementation)? I know that PCI hotplug
> doesn't support surprise removal, only "coordinated" removal.

No, according to PICMG 2.1 R2.0, surprise removal is "non-compliant".

> So the question that has to be answered IMO is: do we want to
> support surprise removal for something like manufacturing test,
> which doesn't abide by the coordinated removal protocol?
>
> or: Do we have to support surprise removal, only because it can't
> be prevented? I expect that this is the case, but I still don't
> see or understand the 20 ms time requirement.

I've not implemented it yet, but I'm pretty sure I can detect surprise
extractions in my cPCI driver. The only thing holding me back at the
moment is that there's no clear way to report this status change via
pcihpfs without doing something a bit funky like reporting "-1" in the
"adapter" node.

Scott

-- 
Scott Murray
SOMA Networks, Inc.
Toronto, Ontario
e-mail: scottm@somanetworks.com
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Greg KH @ 2002-10-24 23:22 UTC
To: Scott Murray; +Cc: Randy.Dunlap, Steven Dake, James Bottomley, linux-scsi, linux-kernel

On Thu, Oct 24, 2002 at 07:00:23PM -0400, Scott Murray wrote:
>
> I've not implemented it yet, but I'm pretty sure I can detect surprise
> extractions in my cPCI driver. The only thing holding me back at the
> moment is that there's no clear way to report this status change via
> pcihpfs without doing something a bit funky like reporting "-1" in the
> "adapter" node.

Why would you need to report anything other than if the card is present
or not? What would a "surprise" removal cause you to do differently?

Hm, well I guess we should be extra careful in trying to shut down any
driver bound to that card...

thanks,

greg k-h
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Steven Dake @ 2002-10-24 23:48 UTC
To: Greg KH; +Cc: Scott Murray, Randy.Dunlap, James Bottomley, linux-scsi, linux-kernel

MontaVista has discussed at length Compact PCI hotswap using surprise
removal events.

The key feature of any hotswap operation that happens in a surprise
fashion is that the device driver might want a hint that the hardware is
no longer present, so it can immediately dump its buffers/io maps/etc.
and totally stop accessing the device. An expected removal, on the other
hand, would give the device driver time to flush its buffers (for
example, a SCSI driver could dump its outstanding queued SCSI messages).
Once the driver is done accessing the device, the blue LED on the
CompactPCI board can be lit and the board can be removed.

This is the main difference. Since the driver model of Linux doesn't
support a surprise extract method call for drivers, I don't think it's
been implemented here. Further, the drivers must be modified to actually
use the hint instead of doing their normal shutdown operation.

Surprise extraction is not a simple problem, especially ensuring that
the device drivers exit cleanly without dumping more data on the PCI bus
to a PCI device that may not exist.

Thanks!
-steve

Greg KH wrote:

>On Thu, Oct 24, 2002 at 07:00:23PM -0400, Scott Murray wrote:
>
>>I've not implemented it yet, but I'm pretty sure I can detect surprise
>>extractions in my cPCI driver. The only thing holding me back at the
>>moment is that there's no clear way to report this status change via
>>pcihpfs without doing something a bit funky like reporting "-1" in the
>>"adapter" node.
>
>Why would you need to report anything other than if the card is present
>or not? What would a "surprise" removal cause you to do differently?
>
>Hm, well I guess we should be extra careful in trying to shut down any
>driver bound to that card...
>
>thanks,
>
>greg k-h
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Jeff Garzik @ 2002-10-25 0:20 UTC
To: Steven Dake; +Cc: Greg KH, Scott Murray, Randy.Dunlap, James Bottomley, linux-scsi, linux-kernel

Steven Dake wrote:

> Montavista has discussed at length Compact PCI hotswap using surprise
> removal events.
>
> The key feature of any hotswap operation that happens in a surprise
> fashion is that the device driver might want a hint that the hardware
> is no longer present so it can immediately dump its buffers/io maps/etc
> and totally stop accessing the device. An expected removal, on the
> other hand, would give the device driver time to flush its buffers
> (for example a scsi driver could dump its outstanding queued scsi
> messages). Once the driver is done accessing the device, the blue led
> on the CompactPCI board can be lit and it can be removed.
>
> This is the main difference. Since the driver model of Linux doesn't
> support a surprise extract method call for drivers, I don't think it's
> been implemented here. Further the drivers must be modified to
> actually use the hint instead of doing its normal shutdown operation.

Wrong.

The _only_ supported method so far has been surprise removal. For years
now. This happens every day in the land of CardBus, which was the first
"PCI" hotplug implementation in the Linux kernel. PCI HotPlug introduces
a new, non-surprise removal.

Thus, the current model should be assumed to be surprise removal, and
you need an additional notification from the system if a "nice" removal
is about to occur.

> Surprise extraction is not a simple problem especially to ensure the
> device drivers exit cleanly without dumping more data on the PCI bus
> to a PCI device that may not exist.

PCI is electrically safe. Reads to non-existent areas return 0xffffffff,
etc. Take a look at net drivers some day, we have been handling this for
years.

Surprise removal is actually easier from many perspectives -- you don't
have to worry about quiescing the hardware, you simply have to error out
all I/Os, and clean up the kernel structures that are left behind (host
info, device info, etc.). The non-surprise removal is more annoying, in
that you could potentially have an indefinite wait (and must actively
avoid such a situation) while shutting down the hardware, completing
I/Os, etc.

	Jeff
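The surprise-removal handling Jeff describes (notice that reads come back as 0xffffffff, then error out everything in flight) can be illustrated with a small userspace simulation. This is a minimal sketch, not kernel code: `FakeDevice`, `Driver`, and their methods are hypothetical names invented for the illustration, and a real driver would also have to consider that all-ones can be a legitimate register value.

```python
import errno

ALL_ONES = 0xFFFFFFFF  # what a PCI read returns when no device responds


class FakeDevice:
    """Hypothetical stand-in for a memory-mapped PCI device register."""

    def __init__(self):
        self.present = True
        self.status = 0x00000001

    def read32(self):
        # With the card pulled, nothing drives the bus and reads are all ones.
        return self.status if self.present else ALL_ONES


class Driver:
    """Hypothetical driver state: in-flight request ids and their results."""

    def __init__(self, dev):
        self.dev = dev
        self.in_flight = []
        self.results = {}
        self.gone = False

    def submit(self, req_id):
        if self.gone:
            # Device already known to be missing: fail immediately.
            self.results[req_id] = -errno.ENODEV
            return
        self.in_flight.append(req_id)

    def poll(self):
        # Treat an all-ones status read as "the hardware has vanished".
        if self.dev.read32() == ALL_ONES:
            self.handle_surprise_removal()

    def handle_surprise_removal(self):
        # No quiescing of hardware: just error out every pending request.
        self.gone = True
        for req_id in self.in_flight:
            self.results[req_id] = -errno.EIO
        self.in_flight.clear()


dev = FakeDevice()
drv = Driver(dev)
drv.submit(1)
drv.submit(2)
dev.present = False  # the card is yanked with I/O outstanding
drv.poll()
drv.submit(3)        # arrives after the removal was noticed
print(drv.results)
```

In this sketch requests 1 and 2 complete with -EIO and request 3 is refused with -ENODEV; no step waits on the hardware, which is why Jeff calls this path the easier one.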
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Alan Cox @ 2002-10-25 10:04 UTC
To: Steven Dake; +Cc: Greg KH, Scott Murray, Randy.Dunlap, James Bottomley, linux-scsi, Linux Kernel Mailing List

On Fri, 2002-10-25 at 00:48, Steven Dake wrote:
> Surprise extraction is not a simple problem especially to ensure the
> device drivers exit cleanly without dumping more data on the PCI bus
> to a PCI device that may not exist.

That's primarily about resource handling order: making sure we don't
release the claimed PCI resources in the driver until the driver itself
is sure it has shut up.

I'm doing surprise removal OK with the ThinkPad 600 (in 2.4, with some
limits due to the 2.4 PCI layer). Network stuff seems to be fine. The
block layer has major issues.
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Scott Murray @ 2002-10-25 0:18 UTC
To: Greg KH; +Cc: Randy.Dunlap, Steven Dake, James Bottomley, linux-scsi, linux-kernel

On Thu, 24 Oct 2002, Greg KH wrote:

> On Thu, Oct 24, 2002 at 07:00:23PM -0400, Scott Murray wrote:
> >
> > I've not implemented it yet, but I'm pretty sure I can detect surprise
> > extractions in my cPCI driver. The only thing holding me back at the
> > moment is that there's no clear way to report this status change via
> > pcihpfs without doing something a bit funky like reporting "-1" in the
> > "adapter" node.
>
> Why would you need to report anything other than if the card is present
> or not? What would a "surprise" removal cause you to do differently?

Thinking about it a bit more, my idea to use -1 is indeed unnecessary,
since userspace code can check if the adapter node changes to 0 before
it itself writes a 0 to it. If multiple users/software play around with
the nodes at the same time, they'll get what's coming to them...

> Hm, well I guess we should be extra careful in trying to shut down any
> driver bound to that card...

Yeah, as Steven mentions in his reply, Linux drivers don't handle this
well at the moment.

Scott

-- 
Scott Murray
SOMA Networks, Inc.
Toronto, Ontario
e-mail: scottm@somanetworks.com
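The userspace check Scott describes can be sketched roughly as follows. The node name "adapter" comes from the thread; the file path, the "0"/"1" contents, the idea of writing "0" to request removal, and the function name `request_removal` are all assumptions made for the illustration, and a temporary file stands in for the real pcihpfs node.

```python
import os
import tempfile


def request_removal(adapter_node):
    """Classify a removal by reading the node before writing 0 to it.

    ASSUMPTION: the node holds "1" while a card is present and "0" once it
    is gone, and writing "0" requests a coordinated removal.
    """
    with open(adapter_node) as f:
        state = f.read().strip()
    if state == "0":
        # The node went to 0 before we asked: the card was pulled on us.
        return "surprise"
    with open(adapter_node, "w") as f:
        f.write("0")
    return "coordinated"


# Demo against a temporary file standing in for the pcihpfs "adapter" node.
with tempfile.TemporaryDirectory() as tmp:
    node = os.path.join(tmp, "adapter")
    with open(node, "w") as f:
        f.write("1\n")
    first = request_removal(node)   # card present, so we remove it ourselves
    second = request_removal(node)  # node already 0 when we look again

print(first, second)
```

The read and the write are not atomic, so two programs poking the same node can still race, which is the caveat Scott raises about multiple users touching the nodes at once.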
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: James Bottomley @ 2002-10-24 23:42 UTC
To: Steven Dake; +Cc: linux-scsi, linux-kernel

> The locking most definitely needs to be added to the kernel. I'm
> surprised the original patch didn't contain any locking, but then
> again, my first patch didn't either :)

When I first read the SCSI code many years ago, I found surprise wasn't
adequate and I was forced to resort to astonishment. It's being cleaned
up slowly. Originally hot removal/insertion was the exception, so nobody
tripped over the locking issue. Now it's fast becoming the rule.

> This may be true, but most systems will only have at most 4-5 devices.
>
> There's only so much room on PCI for FC devices :)

I have to think about other SCSI systems as well. Some IBM beasties have
> 256 PCI slots. Infiniband is threatening direct bus-fibre attachment.

> In Advanced TCA (what spawned this work) a button is pressed to
> indicate hotswap removal which makes for easy detection of hotswap
> events. This is why there are kernel interfaces for removal and
> insertion (so a kernel driver can be written to detect the button
> press and remove the devices from the os data structures and then
> light a blue led indicating safe for removal).

OK, I understand what's going on now. It's no different from those
hotplug PCI busses where you press the button and a second or so later
the LED goes out and you can remove the card. 10ms sounds rather a short
maximum time for a technician to wait for a light to go out... I suppose
Telco technicians are rather impatient.

I really think you need to lengthen this interval. The kernel is moving
towards this type of hotplug infrastructure which you can easily
leverage (or even help build), but it's definitely going to be mainly in
user space.

James
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Jeff Garzik @ 2002-10-24 23:52 UTC
To: James Bottomley; +Cc: Steven Dake, linux-scsi, linux-kernel

James Bottomley wrote:

>>In Advanced TCA (what spawned this work) a button is pressed to
>>indicate hotswap removal which makes for easy detection of hotswap
>>events. This is why there are kernel interfaces for removal and
>>insertion (so a kernel driver can be written to detect the button
>>press and remove the devices from the os data structures and then
>>light a blue led indicating safe for removal).
>
>OK, I understand what's going on now. It's no different from those hotplug
>PCI busses where you press the button and a second or so later the LED goes
>out and you can remove the card. 10ms sounds rather a short maximum time for
>a technician to wait for a light to go out....I suppose Telco technicians are
>rather impatient.
>
>I really think you need to lengthen this interval. The kernel is moving
>towards this type of hotplug infrastructure which you can easily leverage (or
>even help build), but it's definitely going to be mainly in user space.

Caveat coder -- you also have to handle the case where the device is
already gone by the time you are notified of the hot-unplug event. Some
ejections are less friendly than others... though from a SCSI
standpoint, hopefully that case is easier -- error out all I/Os in
flight, and unregister the host and device structures associated with
the recently-removed host. The devil, of course, is in the details ;-)

	Jeff
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Rob Landley @ 2002-10-27 15:08 UTC
To: Jeff Garzik, James Bottomley; +Cc: Steven Dake, linux-scsi, linux-kernel

On Thursday 24 October 2002 18:52, Jeff Garzik wrote:
> James Bottomley wrote:
> >>In Advanced TCA (what spawned this work) a button is pressed to
> >>indicate hotswap removal which makes for easy detection of hotswap
> >>events. This is why there are kernel interfaces for removal and
> >>insertion (so a kernel driver can be written to detect the button
> >>press and remove the devices from the os data structures and then
> >>light a blue led indicating safe for removal).
> >
> >OK, I understand what's going on now. It's no different from those
> > hotplug PCI busses where you press the button and a second or so later
> > the LED goes out and you can remove the card. 10ms sounds rather a short
> > maximum time for a technician to wait for a light to go out....I suppose
> > Telco technicians are rather impatient.
> >
> >I really think you need to lengthen this interval. The kernel is moving
> >towards this type of hotplug infrastructure which you can easily leverage
> > (or even help build), but it's definitely going to be mainly in user
> > space.
>
> Caveat coder -- you also have to handle the case where the device is
> already gone, by the time you are notified of the hot-unplug event.
> Some ejections are less friendly than others... though from a SCSI
> standpoint, hopefully that case is easier -- error out all I/Os in
> flight, and unregister the host and device structures associated with
> the recently-removed host. The devil, of course, is in the details ;-)

Hmmm... Not being familiar with the SCSI layer but sticking my nose in
anyway on general block device/mount point hotplug issues:

How hard would it be to write a simple debugging function to lobotomize
a block device? (So that all further I/O to that sucker immediately
returns an error.) Not just simulating a hot extraction (or catastrophic
failure) of a block device, but also something you could use to see how
gracefully filesystems react.

The reason I ask is there was a discussion a while back about the new
lazy unmount (umount -l /blah/foo) not always being quite enough, and
that sometimes what you want is basically "umount -9 /blah/foo" (a la
kill -9). Close all files, reparent all process home directories and
chroot mount points to a dummy inode, flush all I/O, drive a stake
through the superblock's heart, and scatter the ashes at sea. Somebody
posted a patch to actually do this. (Against 2.4, I think.) I could
probably dig it up if you were curious. Let's see...

http://marc.theaimsgroup.com/?l=linux-kernel&m=103443466225915&q=raw

The eject command should certainly have an "umount with shotgun" option,
so zombie processes can't pin your CD in the drive. (Your average
end-user is NOT going to be able to grovel through /proc to figure out
which processes have an open filehandle or home directory under the
cdrom mount point so it can kill them and get the disk out. They're
going to power cycle the machine and eject it while the BIOS is in
charge. I've done this myself a couple of times when I'm in a hurry.)

Anyway, if the block device under the filesystem honestly does go away
for hotplug eject reasons, the obvious thing to do is umount -9 the
sucker immediately so userspace can collapse gracefully (or even
conceivably recover). The main difference here is that the flushing
would all error out and get discarded, and this wouldn't always get
reported to the user, but thanks to write caching that's the case
anyway. (Use some variant of O_DIRECT or fsync if you care.) The errors
userspace does see switch from "all my I/O failed with a media error" to
"all my filehandles closed out from under me" (and the directory I'm in
has been deleted), but that's still relatively logical behavior.

Does this sound like it's off in left field?

> Jeff

Rob

-- 
http://penguicon.sf.net - Terry Pratchett, Eric Raymond, Pete Abrams,
Illiad, CmdrTaco, liquid nitrogen ice cream, and caffeinated jello.
Well why not?
* Re: [PATCH] [RFC] Advanced TCA SCSI Disk Hotswap
From: Randy.Dunlap @ 2002-10-27 20:25 UTC
To: Rob Landley; +Cc: Jeff Garzik, James Bottomley, Steven Dake, linux-scsi, linux-kernel

On Sun, 27 Oct 2002, Rob Landley wrote:

(maybe wrap lines around column 70 ? :)

...

Stephen Tweedie did something like this already (for 2.4.19-pre10),
called "testdrive". It uses loopback over a block device. He says that
it will need modifications to use bio in 2.5.

See here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=102457399020069&w=2

-- 
~Randy

| Hmmm... Not being familiar with the SCSI layer but sticking my nose in anyway
| on general block device/mount point hotplug issues:
|
| How hard would it be to write a simple debugging function to lobotomize a
| block device? (So that all further I/O to that sucker immediately returns an
| error.) Not just simulating a hot extraction (or catastrophic failure) of
| a block device, but also something you could use to see how gracefully
| filesystems react.
|
| The reason I ask is there was a discussion a while back about the new lazy
| unmount (umount -l /blah/foo) not always being quite enough, and that
| sometimes what you want is basically "umount -9 /blah/foo" (ala kill
| -9). Close all files, reparent all process home directories and chroot mount
| points to a dummy inode, flush all I/O, drive a stake through the
| superblock's heart, and scatter the ashes at sea. Somebody posted a patch to
| actually do this. (Against 2.4, I think.) I could probably dig it up if you
| were curious. Let's see...
|
| http://marc.theaimsgroup.com/?l=linux-kernel&m=103443466225915&q=raw
|
| The eject command should certainly have an "umount with shotgun" option, so
| zombie processes can't pin your CD in the drive. (Your average end-user is
| NOT going to be able to grovel through /proc to figure out which processes
| have an open filehandle or home directory under the cdrom mount point so it
| can kill them and get the disk out. They're going to power cycle the machine
| and eject it while the bios is in charge. I've done this myself a couple of
| times when I'm in a hurry.)
|
| Anyway, if the block device under the filesystem honestly does go away for
| hotplug eject reasons, the obvious thing to do is umount -9 the sucker
| immediately so userspace can collapse gracefully (or even conceivably
| recover). The main difference here is that the flushing would all error out
| and get discarded, and this wouldn't always get reported to the user, but
| thanks to write caching that's the case anyway. (Use some variant of
| O_DIRECT or fsync if you care.) The errors userspace does see switch from
| "all my I/O failed with a media error" to "all my filehandles closed out from
| under me" (and the directory I'm in has been deleted), but that's still
| relatively logical behavior.
|
| Does this sound like it's off in left field?
|
| Rob