* Hotplug events during sleep transition @ 2005-12-22 14:28 Alan Stern 2005-12-22 14:34 ` Pavel Machek 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-22 14:28 UTC (permalink / raw) To: Pavel Machek, Greg KH; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 609 bytes --] Pavel & Greg: I recently worked on a patch to unbind USB drivers that don't have suspend/resume methods, and then rebind them when the system comes back up. There was an unexpected effect: In some cases the unbind or rebind operations caused hotplug events, which started up new user processes at a time when everything was supposed to be frozen. Obviously this is not a good thing. Is there some other way these sorts of events can be handled? For instance, can the hotplug system be smart enough to delay creating the new processes? Or can they be created already in a frozen state? Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Hotplug events during sleep transition 2005-12-22 14:28 Hotplug events during sleep transition Alan Stern @ 2005-12-22 14:34 ` Pavel Machek 2005-12-22 18:20 ` Dmitry Torokhov 0 siblings, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-22 14:34 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 794 bytes --] Hi! > I recently worked on a patch to unbind USB drivers that don't have > suspend/resume methods, and then rebind them when the system comes back > up. There was an unexpected effect: In some cases the unbind or rebind > operations caused hotplug events, which started up new user processes at a > time when everything was supposed to be frozen. Obviously this is not a > good thing. > > Is there some other way these sorts of events can be handled? For > instance, can the hotplug system be smart enough to delay creating the new > processes? Or can they be created already in a frozen state? There were some problems with this already... but I do not recall what the solution was. lkml thread, IIRC. Were it PCMCIA cards or something like that? Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-22 14:34 ` Pavel Machek @ 2005-12-22 18:20 ` Dmitry Torokhov 2005-12-22 20:52 ` Patrick Mochel 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-22 18:20 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list On 12/22/05, Pavel Machek <pavel@suse.cz> wrote: > Hi! > > > I recently worked on a patch to unbind USB drivers that don't have > > suspend/resume methods, and then rebind them when the system comes back > > up. There was an unexpected effect: In some cases the unbind or rebind > > operations caused hotplug events, which started up new user processes at a > > time when everything was supposed to be frozen. Obviously this is not a > > good thing. > > > > Is there some other way these sorts of events can be handled? For > > instance, can the hotplug system be smart enough to delay creating the new > > processes? Or can they be created already in a frozen state? > > There were some problems with this already... but I do not recall what > the solution was. lkml thread, IIRC. Were it PCMCIA cards or something > like that? It was serio layer. I just ensured that all hotplug events are generated from serio thread thus making sure that we won't see them during suspend. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-22 18:20 ` Dmitry Torokhov @ 2005-12-22 20:52 ` Patrick Mochel 2005-12-22 20:56 ` Randy.Dunlap 2005-12-22 22:13 ` Alan Stern 0 siblings, 2 replies; 69+ messages in thread From: Patrick Mochel @ 2005-12-22 20:52 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 2360 bytes --] On Thu, 22 Dec 2005, Dmitry Torokhov wrote: > On 12/22/05, Pavel Machek <pavel@suse.cz> wrote: > > Hi! > > > > > I recently worked on a patch to unbind USB drivers that don't have > > > suspend/resume methods, and then rebind them when the system comes back > > > up. There was an unexpected effect: In some cases the unbind or rebind > > > operations caused hotplug events, which started up new user processes at a > > > time when everything was supposed to be frozen. Obviously this is not a > > > good thing. > > > > > > Is there some other way these sorts of events can be handled? For > > > instance, can the hotplug system be smart enough to delay creating the new > > > processes? Or can they be created already in a frozen state? > > > > There were some problems with this already... but I do not recall what > > the solution was. lkml thread, IIRC. Were it PCMCIA cards or something > > like that? > > It was serio layer. I just ensured that all hotplug events are > generated from serio thread thus making sure that we won't see them > during suspend. PCMCIA recently had a similar issue. I can't find an archive for the linux-pcmcia (@lists.infradead.org), but the thread was titled: "uevent / call_usermoderrelated breakage on suspend to disk [Was: Re: PCMCIA-related breakage on suspend to disk]" Though, the pcmcia case was a bit different - the system would hang because the artificial insertion events on resume. My recommendation for the pcmcia case was that a subsystem shouldn't attempt to discover new devices during the device resume sequence. It should first deal with devices that are there (and bound to a driver). Then, it should try to discard devices that still have objects, and finally discover new devices that have been inserted. The final two stages should, IMHO, happen after all other devices have been resumed and userspace is running once again. Implementing this will probably require a per-subsystem thread, leveraging existing code (subsystem threads) if possible. I feel the same about binding/unbinding devices from drivers. In itself, it sounds like a hack, but I realize that USB devices are treated specially during suspend/resume. For the sake of getting it working, I would recommend also waiting until everything else is finished, then process those devices to bind them to their drivers. Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-22 20:52 ` Patrick Mochel @ 2005-12-22 20:56 ` Randy.Dunlap 2005-12-22 22:13 ` Alan Stern 1 sibling, 0 replies; 69+ messages in thread From: Randy.Dunlap @ 2005-12-22 20:56 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 371 bytes --] On Thu, 22 Dec 2005, Patrick Mochel wrote: > PCMCIA recently had a similar issue. I can't find an archive for the > linux-pcmcia (@lists.infradead.org), but the thread was titled: > "uevent / call_usermoderrelated breakage on suspend to disk [Was: Re: > PCMCIA-related breakage on suspend to disk]" FYI: http://lists.infradead.org/pipermail/linux-pcmcia/ -- ~Randy [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-22 20:52 ` Patrick Mochel 2005-12-22 20:56 ` Randy.Dunlap @ 2005-12-22 22:13 ` Alan Stern 2005-12-23 3:49 ` Patrick Mochel 1 sibling, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-22 22:13 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1955 bytes --] On Thu, 22 Dec 2005, Patrick Mochel wrote: > My recommendation for the pcmcia case was that a subsystem shouldn't > attempt to discover new devices during the device resume sequence. It > should first deal with devices that are there (and bound to a driver). > Then, it should try to discard devices that still have objects, and > finally discover new devices that have been inserted. > > The final two stages should, IMHO, happen after all other devices have > been resumed and userspace is running once again. Implementing this will > probably require a per-subsystem thread, leveraging existing code > (subsystem threads) if possible. > > I feel the same about binding/unbinding devices from drivers. In itself, > it sounds like a hack, but I realize that USB devices are treated > specially during suspend/resume. For the sake of getting it working, I > would recommend also waiting until everything else is finished, then > process those devices to bind them to their drivers. While feasible for the resume end, that approach won't work for the suspend side. The driver can't be unbound until its suspend call arrives (because before that the core doesn't know anything is happening), and that's after userspace is already frozen. Conversely, the driver has to be unbound before the suspend call can return. I noticed this problem when testing a version of usb-storage that had the suspend/resume support commented out. Unbinding it caused the child SCSI host to be unregistered, which generated a hotplug event. Likewise, rebinding it registered the SCSI host. Now obviously usb-storage isn't going to cause this problem in the real world, but other drivers might. I'm not so sure this approach really qualifies as a "hack". If a driver doesn't have explicit support for suspend/resume, the best way to quiesce it is to unbind it. The only alternative is to fail all of its I/O requests, which seems much less reliable. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-22 22:13 ` Alan Stern @ 2005-12-23 3:49 ` Patrick Mochel 2005-12-23 3:52 ` Dmitry Torokhov 0 siblings, 1 reply; 69+ messages in thread From: Patrick Mochel @ 2005-12-23 3:49 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 569 bytes --] On Thu, 22 Dec 2005, Alan Stern wrote: > I'm not so sure this approach really qualifies as a "hack". If a driver > doesn't have explicit support for suspend/resume, the best way to quiesce > it is to unbind it. The only alternative is to fail all of its I/O > requests, which seems much less reliable. Hmm, the best way to queisce it seems to be to fix the driver. Sure, that doesn't solve your problems now, but that is the ideal solution, no? Until then, band-aids like unbinding the device from the driver is only a remedy to a symptom; not a cure.. Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 3:49 ` Patrick Mochel @ 2005-12-23 3:52 ` Dmitry Torokhov 2005-12-23 15:20 ` Alan Stern 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-23 3:52 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list On Thursday 22 December 2005 22:49, Patrick Mochel wrote: > > On Thu, 22 Dec 2005, Alan Stern wrote: > > > I'm not so sure this approach really qualifies as a "hack". If a driver > > doesn't have explicit support for suspend/resume, the best way to quiesce > > it is to unbind it. The only alternative is to fail all of its I/O > > requests, which seems much less reliable. > > Hmm, the best way to queisce it seems to be to fix the driver. Sure, that > doesn't solve your problems now, but that is the ideal solution, no? Until > then, band-aids like unbinding the device from the driver is only a remedy > to a symptom; not a cure.. > Quite often unbinding is easiest and also correct way. I would love to be able to just unbind serio port/input device and have it recreated later. Unfortunately X/GPM do not [yet?] support hotplugging of devices so kernel has to compensate. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 3:52 ` Dmitry Torokhov @ 2005-12-23 15:20 ` Alan Stern 2005-12-23 15:35 ` Patrick Mochel 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-23 15:20 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1270 bytes --] On Thu, 22 Dec 2005, Dmitry Torokhov wrote: > On Thursday 22 December 2005 22:49, Patrick Mochel wrote: > > > > On Thu, 22 Dec 2005, Alan Stern wrote: > > > > > I'm not so sure this approach really qualifies as a "hack". If a driver > > > doesn't have explicit support for suspend/resume, the best way to quiesce > > > it is to unbind it. The only alternative is to fail all of its I/O > > > requests, which seems much less reliable. > > > > Hmm, the best way to queisce it seems to be to fix the driver. Sure, that > > doesn't solve your problems now, but that is the ideal solution, no? Until > > then, band-aids like unbinding the device from the driver is only a remedy > > to a symptom; not a cure.. > > > > Quite often unbinding is easiest and also correct way. I would love to be > able to just unbind serio port/input device and have it recreated later. > Unfortunately X/GPM do not [yet?] support hotplugging of devices so kernel > has to compensate. I agree with Dmitry. Many drivers don't need to do anything special for suspend, or only need to cancel an outstanding input request. Rather than go through every single driver and add a minimal (and possibly erroneous) suspend routine, it's much easier just to unbind these drivers. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 15:20 ` Alan Stern @ 2005-12-23 15:35 ` Patrick Mochel 2005-12-23 16:52 ` Alan Stern 2005-12-23 18:32 ` Dmitry Torokhov 0 siblings, 2 replies; 69+ messages in thread From: Patrick Mochel @ 2005-12-23 15:35 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 3305 bytes --] On Fri, 23 Dec 2005, Alan Stern wrote: > On Thu, 22 Dec 2005, Dmitry Torokhov wrote: > > > On Thursday 22 December 2005 22:49, Patrick Mochel wrote: > > > > > > On Thu, 22 Dec 2005, Alan Stern wrote: > > > > > > > I'm not so sure this approach really qualifies as a "hack". If a driver > > > > doesn't have explicit support for suspend/resume, the best way to quiesce > > > > it is to unbind it. The only alternative is to fail all of its I/O > > > > requests, which seems much less reliable. > > > > > > Hmm, the best way to queisce it seems to be to fix the driver. Sure, that > > > doesn't solve your problems now, but that is the ideal solution, no? Until > > > then, band-aids like unbinding the device from the driver is only a remedy > > > to a symptom; not a cure.. > > > > > > > Quite often unbinding is easiest and also correct way. I would love to be > > able to just unbind serio port/input device and have it recreated later. > > Unfortunately X/GPM do not [yet?] support hotplugging of devices so kernel > > has to compensate. > > I agree with Dmitry. Many drivers don't need to do anything special for > suspend, or only need to cancel an outstanding input request. Rather than > go through every single driver and add a minimal (and possibly erroneous) > suspend routine, it's much easier just to unbind these drivers. So why not fix the subsystems to do the equivalent of an unbinding, from a tear-down and reinitialization standpoint? That way, you don't have to force the core to contort itself simply for the fact that you want to push in a thumbtack with a sledgehammer? Think about what is happening. It's been discovered that shutting down the device and reinitializing the device performs all the correct actions to get the device up and running again after a suspend/resume cycle. All you have to do for each suspend routine is mimic that effect. It arguably doesn't require any serious knowledge about the device - it only needs a copy of the ->probe() and ->remove() routines (or the functional equivalent for those devices), without the allocation and freeing of data structures. In fact, this sounds an awful lot like the ->start() / ->stop() methods that have been proposed over the years. I would say that it's time to give that concept some serious thought and start implementing them in drivers. That way, for devices that don't actually do hardware PM, we can still get them down and back up again, without having to worry about those scary ->suspend()/->resume() methods. :-) I don't find justification to your argument in avoiding to fix drivers. We don't change the core when it means the impact would be to change every driver. And, we shouldn't change the core to avoid fixing a subset of drivers. The number of drivers that it affects should not be 100% of them. And, as I mentioned before, it should be +/- simple, and without any of the (unknown) side-effects of binding/unbinding. You said 'possibly erroneous', but I fail to see how this would be any more dangerous than unbinding/binding. If these drivers are maintained, then the maintainers should have no problem doing what's mentioned above. In fact, some janitors could probably do it. If the drivers are unmaintained, then perhaps they don't belong in the kernel anyway? Thanks, Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 15:35 ` Patrick Mochel @ 2005-12-23 16:52 ` Alan Stern 2005-12-23 17:20 ` Pavel Machek ` (2 more replies) 2005-12-23 18:32 ` Dmitry Torokhov 1 sibling, 3 replies; 69+ messages in thread From: Alan Stern @ 2005-12-23 16:52 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 4409 bytes --] On Fri, 23 Dec 2005, Patrick Mochel wrote: > > I agree with Dmitry. Many drivers don't need to do anything special for > > suspend, or only need to cancel an outstanding input request. Rather than > > go through every single driver and add a minimal (and possibly erroneous) > > suspend routine, it's much easier just to unbind these drivers. > > So why not fix the subsystems to do the equivalent of an unbinding, from > a tear-down and reinitialization standpoint? That way, you don't have to > force the core to contort itself simply for the fact that you want to push > in a thumbtack with a sledgehammer? That sounds very dangerous. You would end up with a situation where the driver thinks it has been unbound but the core thinks it is still bound. Not to mention that some drivers, as part of their unbind routines, unbind themselves from other interfaces -- these would be true, actual unbindings. > Think about what is happening. It's been discovered that shutting down the > device and reinitializing the device performs all the correct actions to > get the device up and running again after a suspend/resume cycle. ? I don't follow you at all. What does shutting down the device have to do with what we're talking about? I'm trying to unbind a _driver_, not do anything to the _device_. As far as I know, the device doesn't need any sort of special treatment to work okay with suspend/resume. > All you > have to do for each suspend routine is mimic that effect. It arguably > doesn't require any serious knowledge about the device - it only needs a > copy of the ->probe() and ->remove() routines (or the functional > equivalent for those devices), without the allocation and freeing of data > structures. Why go to all the trouble of copying probe and remove when we can call the actual existing routines? > In fact, this sounds an awful lot like the ->start() / ->stop() methods > that have been proposed over the years. I would say that it's time to give > that concept some serious thought and start implementing them in drivers. > That way, for devices that don't actually do hardware PM, we can still get > them down and back up again, without having to worry about those scary > ->suspend()/->resume() methods. :-) Start and stop are no more or less scary than suspend and resume. _Any_ routine that hasn't been implemented and subjected to testing is going to be less reliable than one that has been around and in use for a long time (such as probe and resume). Furthermore, I don't see any advantage to changing both the core (to add start and stop) as well as changing all the drivers, as opposed to simply changing all the drivers (to add suspend and resume). > I don't find justification to your argument in avoiding to fix drivers. We > don't change the core when it means the impact would be to change every > driver. And, we shouldn't change the core to avoid fixing a subset of > drivers. The number of drivers that it affects should not be 100% of them. The issue of registering and unregistering devices during a sleep transition has come up before. It's not going to go away. We should find a good way to deal with it. Changing the core hotplug support so that new unfrozen processes don't get started up while everything else is frozen seems to me like a fairly small, cheap way of getting things to work properly. Furthermore, Greg KH has stated that many (or even most) USB drivers shouldn't need to have special support for suspend/resume added. I'm not sure whether he intended that these drivers should be unbound during a suspend, but that's the easiest way to handle them. > And, as I mentioned before, it should be +/- simple, and without any of > the (unknown) side-effects of binding/unbinding. You said 'possibly > erroneous', but I fail to see how this would be any more dangerous than > unbinding/binding. Binding and unbinding have been part of every USB driver from the day they were first written. They have gotten _lots_ of testing and are known to work well. The same cannot be said for new suspend/resume routines. > If these drivers are maintained, then the maintainers should have no > problem doing what's mentioned above. In fact, some janitors could > probably do it. If the drivers are unmaintained, then perhaps they don't > belong in the kernel anyway? I'll leave that for Greg to answer. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 16:52 ` Alan Stern @ 2005-12-23 17:20 ` Pavel Machek 2005-12-23 21:22 ` Alan Stern 2005-12-23 19:40 ` Greg KH 2005-12-24 0:23 ` Patrick Mochel 2 siblings, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-23 17:20 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 674 bytes --] Hi! > > All you > > have to do for each suspend routine is mimic that effect. It arguably > > doesn't require any serious knowledge about the device - it only needs a > > copy of the ->probe() and ->remove() routines (or the functional > > equivalent for those devices), without the allocation and freeing of data > > structures. > > Why go to all the trouble of copying probe and remove when we can call the > actual existing routines? Well, if you can find some elegant solution in the core, I think thats the best way. You could set system_state to "suspending" or something like that, and just if() out notifications in that case. Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 17:20 ` Pavel Machek @ 2005-12-23 21:22 ` Alan Stern 2005-12-23 21:28 ` Greg KH 2005-12-23 21:28 ` Pavel Machek 0 siblings, 2 replies; 69+ messages in thread From: Alan Stern @ 2005-12-23 21:22 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 571 bytes --] On Fri, 23 Dec 2005, Pavel Machek wrote: > Well, if you can find some elegant solution in the core, I think thats > the best way. > > You could set system_state to "suspending" or something like that, and > just if() out notifications in that case. How about simply adding a call to try_to_freeze() somewhere inside kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much what I want, in theory. The hotplug processes would get frozen before /sbin/hotplug is exec'ed. I haven't actually tried it yet, so maybe there's a hidden gotcha. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 21:22 ` Alan Stern @ 2005-12-23 21:28 ` Greg KH 2005-12-23 22:09 ` Nigel Cunningham 2005-12-24 15:11 ` Alan Stern 2005-12-23 21:28 ` Pavel Machek 1 sibling, 2 replies; 69+ messages in thread From: Greg KH @ 2005-12-23 21:28 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list, Pavel Machek [-- Attachment #1: Type: text/plain, Size: 751 bytes --] On Fri, Dec 23, 2005 at 04:22:44PM -0500, Alan Stern wrote: > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > Well, if you can find some elegant solution in the core, I think thats > > the best way. > > > > You could set system_state to "suspending" or something like that, and > > just if() out notifications in that case. > > How about simply adding a call to try_to_freeze() somewhere inside > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > what I want, in theory. The hotplug processes would get frozen before > /sbin/hotplug is exec'ed. On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. We use netlink to send the data out, so this might not even be a problem anymore... thanks, greg k-h [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 21:28 ` Greg KH @ 2005-12-23 22:09 ` Nigel Cunningham 2005-12-23 22:31 ` Pavel Machek 2005-12-24 15:11 ` Alan Stern 1 sibling, 1 reply; 69+ messages in thread From: Nigel Cunningham @ 2005-12-23 22:09 UTC (permalink / raw) To: Greg KH; +Cc: Linux-pm mailing list, Pavel Machek [-- Attachment #1: Type: text/plain, Size: 1034 bytes --] Hi. On Sat, 2005-12-24 at 07:28, Greg KH wrote: > On Fri, Dec 23, 2005 at 04:22:44PM -0500, Alan Stern wrote: > > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > > > Well, if you can find some elegant solution in the core, I think thats > > > the best way. > > > > > > You could set system_state to "suspending" or something like that, and > > > just if() out notifications in that case. > > > > How about simply adding a call to try_to_freeze() somewhere inside > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > what I want, in theory. The hotplug processes would get frozen before > > /sbin/hotplug is exec'ed. > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > We use netlink to send the data out, so this might not even be a problem > anymore... At resume time, prior to copying back the original kernel data, events can and do occur. FWIW, I've been using "if(freezer_is_on()) return 0" for a while in call_usermodehelper_keys, to good effect. Regards, Nigel [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 22:09 ` Nigel Cunningham @ 2005-12-23 22:31 ` Pavel Machek 2005-12-23 22:40 ` Nigel Cunningham 0 siblings, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-23 22:31 UTC (permalink / raw) To: Nigel Cunningham; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 955 bytes --] Hi! > > > > Well, if you can find some elegant solution in the core, I think thats > > > > the best way. > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > just if() out notifications in that case. > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > what I want, in theory. The hotplug processes would get frozen before > > > /sbin/hotplug is exec'ed. > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > We use netlink to send the data out, so this might not even be a problem > > anymore... > > At resume time, prior to copying back the original kernel data, events > can and do occur. > > FWIW, I've been using "if(freezer_is_on()) return 0" for a while in > call_usermodehelper_keys, to good effect. Can we have a patch? Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 22:31 ` Pavel Machek @ 2005-12-23 22:40 ` Nigel Cunningham 2005-12-23 22:48 ` Pavel Machek 0 siblings, 1 reply; 69+ messages in thread From: Nigel Cunningham @ 2005-12-23 22:40 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1241 bytes --] Hi. On Sat, 2005-12-24 at 08:31, Pavel Machek wrote: > Hi! > > > > > > Well, if you can find some elegant solution in the core, I think thats > > > > > the best way. > > > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > > just if() out notifications in that case. > > > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > > what I want, in theory. The hotplug processes would get frozen before > > > > /sbin/hotplug is exec'ed. > > > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > > We use netlink to send the data out, so this might not even be a problem > > > anymore... > > > > At resume time, prior to copying back the original kernel data, events > > can and do occur. > > > > FWIW, I've been using "if(freezer_is_on()) return 0" for a while in > > call_usermodehelper_keys, to good effect. > > Can we have a patch? Well the version I'm using depends on my changes to the freezer, so I can't give you one straight off, and I'm not likely to get around to it today, so I was hoping someone else would run with the concept :) Nigel [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 22:40 ` Nigel Cunningham @ 2005-12-23 22:48 ` Pavel Machek 2005-12-23 22:57 ` Nigel Cunningham 2005-12-24 0:29 ` Nigel Cunningham 0 siblings, 2 replies; 69+ messages in thread From: Pavel Machek @ 2005-12-23 22:48 UTC (permalink / raw) To: Nigel Cunningham; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1520 bytes --] On So 24-12-05 08:40:31, Nigel Cunningham wrote: > On Sat, 2005-12-24 at 08:31, Pavel Machek wrote: > > > > > > Well, if you can find some elegant solution in the core, I think thats > > > > > > the best way. > > > > > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > > > just if() out notifications in that case. > > > > > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > > > what I want, in theory. The hotplug processes would get frozen before > > > > > /sbin/hotplug is exec'ed. > > > > > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > > > We use netlink to send the data out, so this might not even be a problem > > > > anymore... > > > > > > At resume time, prior to copying back the original kernel data, events > > > can and do occur. > > > > > > FWIW, I've been using "if(freezer_is_on()) return 0" for a while in > > > call_usermodehelper_keys, to good effect. > > > > Can we have a patch? > > Well the version I'm using depends on my changes to the freezer, so I > can't give you one straight off, and I'm not likely to get around to it > today, so I was hoping someone else would run with the concept :) Ahha, ok. Forget it, then. Or just send a patch without the refrigerator changes, so Alan knows where to hack. It does not have to work in order to be useful :-). Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 22:48 ` Pavel Machek @ 2005-12-23 22:57 ` Nigel Cunningham 2005-12-24 0:29 ` Nigel Cunningham 1 sibling, 0 replies; 69+ messages in thread From: Nigel Cunningham @ 2005-12-23 22:57 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 2291 bytes --] Hi. On Sat, 2005-12-24 at 08:48, Pavel Machek wrote: > On So 24-12-05 08:40:31, Nigel Cunningham wrote: > > On Sat, 2005-12-24 at 08:31, Pavel Machek wrote: > > > > > > > > Well, if you can find some elegant solution in the core, I think thats > > > > > > > the best way. > > > > > > > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > > > > just if() out notifications in that case. > > > > > > > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > > > > what I want, in theory. The hotplug processes would get frozen before > > > > > > /sbin/hotplug is exec'ed. > > > > > > > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > > > > We use netlink to send the data out, so this might not even be a problem > > > > > anymore... > > > > > > > > At resume time, prior to copying back the original kernel data, events > > > > can and do occur. > > > > > > > > FWIW, I've been using "if(freezer_is_on()) return 0" for a while in > > > > call_usermodehelper_keys, to good effect. > > > > > > Can we have a patch? > > > > Well the version I'm using depends on my changes to the freezer, so I > > can't give you one straight off, and I'm not likely to get around to it > > today, so I was hoping someone else would run with the concept :) > > Ahha, ok. Forget it, then. Or just send a patch without the > refrigerator changes, so Alan knows where to hack. It does not have to > work in order to be useful :-). > Pavel Fair enough. Here's what I'm using, FWIW. By the way, hope you all have a great Christmas and festive season in general! Regards, Nigel diff -ruN linux-2.6.15-rc6/kernel/kmod.c build-2.6.15-rc6/kernel/kmod.c --- linux-2.6.15-rc6/kernel/kmod.c 2005-12-20 19:46:36.000000000 +1000 +++ build-2.6.15-rc6/kernel/kmod.c 2005-12-21 10:18:39.000000000 +1000 @@ -36,6 +36,7 @@ #include <linux/mount.h> #include <linux/kernel.h> #include <linux/init.h> +#include <linux/freezer.h> #include <asm/uaccess.h> extern int max_threads; @@ -249,6 +250,9 @@ if (!khelper_wq) return -EBUSY; + if (freezer_is_on()) + return 0; + if (path[0] == '\0') return 0; [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 22:48 ` Pavel Machek 2005-12-23 22:57 ` Nigel Cunningham @ 2005-12-24 0:29 ` Nigel Cunningham 1 sibling, 0 replies; 69+ messages in thread From: Nigel Cunningham @ 2005-12-24 0:29 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 20705 bytes --] Hi. On Sat, 2005-12-24 at 08:48, Pavel Machek wrote: > Ahha, ok. Forget it, then. Or just send a patch without the > refrigerator changes, so Alan knows where to hack. It does not have to > work in order to be useful :-). Maybe it would be helpful if I sent more than that first reply. I'm not suggesting for a moment that you'll want all of this, but it might be helpful anyway. I particularly find taking the freezer definitions out of sched.h helpful - there are too many files dependant on sched.h already, and little changes to freezing should mean recompiling half the kernel. By the way, yes - there are little bits here and there that I can remove. I'm still removing the vestiges of earlier methods of doing things. Regards, Nigel. diff -ruN linux-2.6.15-rc6/include/linux/freezer.h build-2.6.15-rc6/include/linux/freezer.h --- linux-2.6.15-rc6/include/linux/freezer.h 1970-01-01 10:00:00.000000000 +1000 +++ build-2.6.15-rc6/include/linux/freezer.h 2005-12-21 10:18:39.000000000 +1000 @@ -0,0 +1,29 @@ +/* Freezer declarations */ + +#define FREEZER_ON 0 +#define ABORT_FREEZING 1 +#define LRU_FREEZE 2 + +#define FREEZER_KERNEL_THREADS 0 +#define FREEZER_ALL_THREADS 1 + +#ifdef CONFIG_PM +extern unsigned long freezer_state; + +#define test_freezer_state(bit) test_bit(bit, &freezer_state) +#define set_freezer_state(bit) set_bit(bit, &freezer_state) +#define clear_freezer_state(bit) clear_bit(bit, &freezer_state) + +#define freezer_is_on() (test_freezer_state(FREEZER_ON)) + +extern void do_freeze_process(struct notifier_block *nl); + +#else + +#define test_freezer_state(bit) (0) +#define set_freezer_state(bit) do { } while(0) +#define clear_freezer_state(bit) do { } while(0) + +#define freezer_is_on() (0) + +#endif diff -ruN linux-2.6.15-rc6/include/linux/sched.h build-2.6.15-rc6/include/linux/sched.h --- linux-2.6.15-rc6/include/linux/sched.h 2005-12-20 19:46:36.000000000 +1000 +++ build-2.6.15-rc6/include/linux/sched.h 2005-12-21 10:18:39.000000000 +1000 @@ -34,6 +34,7 @@ #include <linux/percpu.h> #include <linux/topology.h> #include <linux/seccomp.h> +#include <linux/notifier.h> #include <linux/auxvec.h> /* For AT_VECTOR_SIZE */ @@ -807,7 +808,10 @@ int (*notifier)(void *priv); void *notifier_data; sigset_t *notifier_mask; - + + /* todo list to be executed in the context of this thread */ + struct notifier_block *todo; + void *security; struct audit_context *audit_context; seccomp_t seccomp; @@ -898,7 +902,6 @@ #define PF_MEMALLOC 0x00000800 /* Allocating memory */ #define PF_FLUSHER 0x00001000 /* responsible for disk writeback */ #define PF_USED_MATH 0x00002000 /* if unset the fpu must be initialized before use */ -#define PF_FREEZE 0x00004000 /* this task is being frozen for suspend now */ #define PF_NOFREEZE 0x00008000 /* this thread should not be frozen */ #define PF_FROZEN 0x00010000 /* frozen for system suspend */ #define PF_FSTRANS 0x00020000 /* inside a filesystem transaction */ @@ -1385,79 +1388,37 @@ #endif -#ifdef CONFIG_PM /* - * Check if a process has been frozen + * Check if there is a todo list request */ -static inline int frozen(struct task_struct *p) +static inline int todo_list_active(void) { - return p->flags & PF_FROZEN; + return current->todo != NULL; } -/* - * Check if there is a request to freeze a process - */ -static inline int freezing(struct task_struct *p) +static inline void run_todo_list(void) { - return p->flags & PF_FREEZE; + notifier_call_chain(¤t->todo, 0, current); } -/* - * Request that a process be frozen - * FIXME: SMP problem. We may not modify other process' flags! - */ -static inline void freeze(struct task_struct *p) +static inline int try_todo_list(void) { - p->flags |= PF_FREEZE; -} - -/* - * Wake up a frozen process - */ -static inline int thaw_process(struct task_struct *p) -{ - if (frozen(p)) { - p->flags &= ~PF_FROZEN; - wake_up_process(p); - return 1; - } - return 0; -} - -/* - * freezing is complete, mark process as frozen - */ -static inline void frozen_process(struct task_struct *p) -{ - p->flags = (p->flags & ~PF_FREEZE) | PF_FROZEN; -} - -extern void refrigerator(void); -extern int freeze_processes(void); -extern void thaw_processes(void); - -static inline int try_to_freeze(void) -{ - if (freezing(current)) { - refrigerator(); + if (todo_list_active()) { + run_todo_list(); return 1; } else return 0; } -#else -static inline int frozen(struct task_struct *p) { return 0; } -static inline int freezing(struct task_struct *p) { return 0; } -static inline void freeze(struct task_struct *p) { BUG(); } -static inline int thaw_process(struct task_struct *p) { return 1; } -static inline void frozen_process(struct task_struct *p) { BUG(); } - -static inline void refrigerator(void) {} -static inline int freeze_processes(void) { BUG(); return 0; } -static inline void thaw_processes(void) {} -static inline int try_to_freeze(void) { return 0; } +/* + * Compatibility definitions to use the suspend checkpoints for the task todo + * list. These may be removed once all uses of try_to_free, refrigerator and + * freezing have been removed. + */ +#define try_to_freeze try_todo_list +#define refrigerator run_todo_list +#define freezing(p) todo_list_active() -#endif /* CONFIG_PM */ #endif /* __KERNEL__ */ #endif diff -ruN linux-2.6.15-rc6/include/linux/suspend.h build-2.6.15-rc6/include/linux/suspend.h --- linux-2.6.15-rc6/include/linux/suspend.h 2005-12-20 19:46:36.000000000 +1000 +++ build-2.6.15-rc6/include/linux/suspend.h 2005-12-21 10:18:41.000000000 +1000 @@ -9,6 +9,7 @@ #include <linux/config.h> #include <linux/init.h> #include <linux/pm.h> +#include <linux/suspend2.h> /* page backup entry */ typedef struct pbe { @@ -50,14 +51,20 @@ extern int pm_prepare_console(void); extern void pm_restore_console(void); +extern int freeze_processes(void); +extern void thaw_processes(int which_threads); #else static inline int software_suspend(void) { printk("Warning: fake suspend called\n"); return -EPERM; } +static inline int freeze_processes(void) { return 0; } +static inline void thaw_processes(int which_threads) { } #endif +extern char resume2_file[256]; + #ifdef CONFIG_SUSPEND_SMP extern void disable_nonboot_cpus(void); extern void enable_nonboot_cpus(void); diff -ruN linux-2.6.15-rc6/kernel/power/process.c build-2.6.15-rc6/kernel/power/process.c --- linux-2.6.15-rc6/kernel/power/process.c 2005-12-20 19:46:36.000000000 +1000 +++ build-2.6.15-rc6/kernel/power/process.c 2005-12-21 10:18:39.000000000 +1000 @@ -1,134 +1,431 @@ /* - * drivers/power/process.c - Functions for starting/stopping processes on - * suspend transitions. + * kernel/power/process.c * - * Originally from swsusp. + * Copyright (C) 1998-2001 Gabor Kuti <seasons@fornax.hu> + * Copyright (C) 1998,2001,2002 Pavel Machek <pavel@suse.cz> + * Copyright (C) 2002-2003 Florent Chabaud <fchabaud@free.fr> + * Copyright (C) 2002-2004 Nigel Cunningham <ncunningham@linuxmail.org> + * + * This file is released under the GPLv2. + * + * Freeze_and_free contains the routines software suspend uses to freeze other + * processes during the suspend cycle and to (if necessary) free up memory in + * accordance with limitations on the image size. + * + * Ideally, the image saved to disk would be an atomic copy of the entire + * contents of all RAM and related hardware state. One of the first + * prerequisites for getting our approximation of this is stopping the activity + * of other processes. We can't stop all other processes, however, since some + * are needed in doing the I/O to save the image. Freeze_and_free.c contains + * the routines that control suspension and resuming of these processes. + * + * Under high I/O load, we need to be careful about the order in which we + * freeze processes. If we freeze processes in the wrong order, we could + * deadlock others. The freeze_order array this specifies the order in which + * critical processes are frozen. All others are suspended after these have + * entered the refrigerator. + * + * Another complicating factor is that freeing memory requires the processes + * to not be frozen, but at the end of freeing memory, they need to be frozen + * so that we can be sure we actually have eaten enough memory. This is why + * freezing and freeing are in the one file. The freezer is not called from + * the main logic, but indirectly, via the code for eating memory. The eat + * memory logic is iterative, first freezing processes and checking the stats, + * then (if necessary) unfreezing them and eating more memory until it looks + * like the criteria are met (at which point processes are frozen & stats + * checked again). */ - -#undef DEBUG - -#include <linux/smp_lock.h> -#include <linux/interrupt.h> #include <linux/suspend.h> +#include <linux/freezer.h> #include <linux/module.h> +#include <linux/mount.h> +#include <linux/namespace.h> +#include <linux/buffer_head.h> +#include <asm/tlbflush.h> + +unsigned long freezer_state = 0; + +#ifdef CONFIG_PM_DEBUG +#define freezer_message(msg, a...) do { printk(msg, ##a); } while(0) +#else +#define freezer_message(msg, a...) do { } while(0) +#endif + +/* Timeouts when freezing */ +#define FREEZER_TOTAL_TIMEOUT (5 * HZ) +#define FREEZER_CHECK_TIMEOUT (HZ / 10) + +DECLARE_COMPLETION(kernelspace_thaw); +DECLARE_COMPLETION(userspace_thaw); +static atomic_t nr_userspace_frozen; +static atomic_t nr_kernelspace_frozen; + +struct frozen_fs +{ + struct list_head fsb_list; + struct super_block *sb; +}; + +LIST_HEAD(frozen_fs_list); + +void freezer_make_fses_rw(void) +{ + struct frozen_fs *fs, *next_fs; + + list_for_each_entry_safe(fs, next_fs, &frozen_fs_list, fsb_list) { + thaw_bdev(fs->sb->s_bdev, fs->sb); + + list_del(&fs->fsb_list); + kfree(fs); + } +} /* - * Timeout for stopping processes + * Done after userspace is frozen, so there should be no danger of + * fses being unmounted while we're in here. */ -#define TIMEOUT (6 * HZ) +int freezer_make_fses_ro(void) +{ + struct frozen_fs *fs; + struct super_block *sb; + + /* Generate the list */ + list_for_each_entry(sb, &super_blocks, s_list) { + if (!sb->s_root || !sb->s_bdev || + (sb->s_frozen == SB_FREEZE_TRANS) || + (sb->s_flags & MS_RDONLY)) + continue; + fs = kmalloc(sizeof(struct frozen_fs), GFP_ATOMIC); + fs->sb = sb; + list_add_tail(&fs->fsb_list, &frozen_fs_list); + }; + + /* Do the freezing in reverse order so filesystems dependant + * upon others are frozen in the right order. (Eg loopback + * on ext3). */ + list_for_each_entry_reverse(fs, &frozen_fs_list, fsb_list) + freeze_bdev(fs->sb->s_bdev); -static inline int freezeable(struct task_struct * p) + return 0; +} + +/* + * freezeable + * + * Description: Determine whether a process should be frozen yet. + * Parameters: struct task_struct * The process to consider. + * int Boolean - 0 = userspace else all. + * Returns: int 0 if don't freeze yet, otherwise do. + */ +static inline int freezeable(struct task_struct * p, int all_freezable) { if ((p == current) || + (p->flags & PF_FROZEN) || (p->flags & PF_NOFREEZE) || (p->exit_state == EXIT_ZOMBIE) || (p->exit_state == EXIT_DEAD) || (p->state == TASK_STOPPED) || - (p->state == TASK_TRACED)) + (p->state == TASK_TRACED) || + (!p->mm && !all_freezable)) return 0; return 1; } -/* Refrigerator is place where frozen processes are stored :-). */ -void refrigerator(void) +static void __freeze_process(struct completion *completion_handler, + atomic_t *nr_frozen) { - /* Hmm, should we be allowed to suspend when there are realtime - processes around? */ long save; - save = current->state; - pr_debug("%s entered refrigerator\n", current->comm); - printk("="); - frozen_process(current); - spin_lock_irq(¤t->sighand->siglock); - recalc_sigpending(); /* We sent fake signal, clean it up */ - spin_unlock_irq(¤t->sighand->siglock); - - while (frozen(current)) { - current->state = TASK_UNINTERRUPTIBLE; - schedule(); - } - pr_debug("%s left refrigerator\n", current->comm); + freezer_message("%s (%d) frozen.\n", + current->comm, current->pid); + save = current->state; + + atomic_inc(nr_frozen); + wait_for_completion(completion_handler); + atomic_dec(nr_frozen); + current->state = save; + freezer_message("%s (%d) leaving freezer.\n", + current->comm, current->pid); } -/* 0 = success, else # of processes that we failed to stop */ -int freeze_processes(void) +/* + * Invoked by the task todo list notifier when the task to be + * frozen is running. + */ +static int freeze_process(struct notifier_block *nl, unsigned long x, void *v) { - int todo; - unsigned long start_time; - struct task_struct *g, *p; unsigned long flags; - printk( "Stopping tasks: " ); - start_time = jiffies; - do { - todo = 0; - read_lock(&tasklist_lock); - do_each_thread(g, p) { - if (!freezeable(p)) - continue; - if (frozen(p)) - continue; + might_sleep(); - freeze(p); - spin_lock_irqsave(&p->sighand->siglock, flags); - signal_wake_up(p, 0); - spin_unlock_irqrestore(&p->sighand->siglock, flags); - todo++; - } while_each_thread(g, p); - read_unlock(&tasklist_lock); - yield(); /* Yield is okay here */ - if (todo && time_after(jiffies, start_time + TIMEOUT)) { - printk( "\n" ); - printk(KERN_ERR " stopping tasks failed (%d tasks remaining)\n", todo ); - break; - } - } while(todo); - - /* This does not unfreeze processes that are already frozen - * (we have slightly ugly calling convention in that respect, - * and caller must call thaw_processes() if something fails), - * but it cleans up leftover PF_FREEZE requests. - */ - if (todo) { - read_lock(&tasklist_lock); - do_each_thread(g, p) - if (freezing(p)) { - pr_debug(" clean up: %s\n", p->comm); - p->flags &= ~PF_FREEZE; - spin_lock_irqsave(&p->sighand->siglock, flags); - recalc_sigpending_tsk(p); - spin_unlock_irqrestore(&p->sighand->siglock, flags); - } - while_each_thread(g, p); - read_unlock(&tasklist_lock); - return todo; + /* Locking to handle race against waking the process in + * freeze threads. */ + spin_lock_irqsave(¤t->sighand->siglock, flags); + current->flags |= PF_FROZEN; + + if (nl) + notifier_chain_unregister(¤t->todo, nl); + + recalc_sigpending(); + spin_unlock_irqrestore(¤t->sighand->siglock, flags); + + if (nl) + kfree(nl); + + if (test_freezer_state(FREEZER_ON)) { + if (current->mm) + __freeze_process(&userspace_thaw, &nr_userspace_frozen); + else + __freeze_process(&kernelspace_thaw, + &nr_kernelspace_frozen); } - printk( "|\n" ); - BUG_ON(in_atomic()); + spin_lock_irqsave(¤t->sighand->siglock, flags); + recalc_sigpending(); + spin_unlock_irqrestore(¤t->sighand->siglock, flags); + + current->flags &= ~PF_FROZEN; + return 0; } -void thaw_processes(void) +void thaw_processes(int do_all_threads) { + if (do_all_threads) { + clear_freezer_state(FREEZER_ON); + clear_freezer_state(ABORT_FREEZING); + } + + complete_all(&kernelspace_thaw); + while (atomic_read(&nr_kernelspace_frozen) > 0) + yield(); + + init_completion(&kernelspace_thaw); + freezer_make_fses_rw(); + + if (do_all_threads) { + complete_all(&userspace_thaw); + while (atomic_read(&nr_userspace_frozen) > 0) + yield(); + init_completion(&userspace_thaw); + } +} + +/* + * num_freezeable + * + * Description: Determine how many processes of our type are still to be + * frozen. As a side effect, update the progress bar too. + * Parameters: int Which type we are trying to freeze. + * int Whether we are displaying our progress. + */ +static int num_freezeable(int do_all_threads) { + struct task_struct *g, *p; + int todo_this_type = 0; - printk( "Restarting tasks..." ); read_lock(&tasklist_lock); do_each_thread(g, p) { - if (!freezeable(p)) + if (freezeable(p, do_all_threads)) + todo_this_type++; + } while_each_thread(g, p); + read_unlock(&tasklist_lock); + + return todo_this_type; +} + +/* + * num_uninterruptible + * + * Description: Determine how many processes of our type are in state + * task uninterruptible. + * Parameters: int Which type we are trying to freeze. + */ +static int num_uninterruptible(int do_all_threads) { + + struct task_struct *g, *p; + int count = 0; + + read_lock(&tasklist_lock); + do_each_thread(g, p) { + if (freezeable(p, do_all_threads) && + p->state == TASK_UNINTERRUPTIBLE) + count++; + } while_each_thread(g, p); + read_unlock(&tasklist_lock); + + return count; +} + +/* + * Tell threads of the type to enter the freezer. + */ +static void signal_threads(int do_all_threads) +{ + struct task_struct *g, *p; + struct notifier_block *n; + + read_lock(&tasklist_lock); + do_each_thread(g, p) { + if (!freezeable(p, do_all_threads)) continue; - if (!thaw_process(p)) - printk(KERN_INFO " Strange, %s not stopped\n", p->comm ); + + n = kmalloc(sizeof(struct notifier_block), + GFP_ATOMIC); + + if (n) { + n->notifier_call = freeze_process; + n->priority = 0; + notifier_chain_register(&p->todo, n); + } } while_each_thread(g, p); + read_unlock(&tasklist_lock); +} + +/* + * Prod processes that haven't entered the refrigerator yet. + */ +static void prod_processes(int do_all_threads) +{ + struct task_struct *g, *p; + unsigned long flags; + + read_lock(&tasklist_lock); + do_each_thread(g, p) { + if (!freezeable(p, do_all_threads)) + continue; + + spin_lock_irqsave(&p->sighand->siglock, flags); + if (!(p->flags & PF_FROZEN)) { + recalc_sigpending(); + signal_wake_up(p, 0); + } + spin_unlock_irqrestore(&p->sighand->siglock, flags); + } while_each_thread(g, p); + read_unlock(&tasklist_lock); +} + +/* + * Freezer failure. + * + * Check whether we failed to freeze all the processes that + * should be frozen. If we find a task that failed to freeze, + * we give useful information on what failed and how. + */ +static int freezer_failure(int do_all_threads) +{ + int result = 0; + struct task_struct *g, *p; + + read_lock(&tasklist_lock); + do_each_thread(g, p) { + if (!freezeable(p, do_all_threads) || + p->state == TASK_UNINTERRUPTIBLE) + continue; + if (!result) { + printk(KERN_ERR "Stopping tasks failed.\n"); + printk(KERN_ERR "Tasks that refused to be " + "refrigerated and haven't since exited:\n"); + set_freezer_state(ABORT_FREEZING); + result = 1; + } + + if ((freezing(p))) { + printk(" - %s (#%d) signalled but " + "didn't enter refrigerator.\n", + p->comm, p->pid); + } else + printk(" - %s (#%d) signalled " + "and todo list empty.\n", + p->comm, p->pid); + } while_each_thread(g, p); read_unlock(&tasklist_lock); - schedule(); - printk( " done\n" ); + + return result; +} + +/* + * freeze_threads + * + * Freeze a set of threads having particular attributes. + * + * Types: + * 2: User threads. + * 3: Kernel threads. + */ +static int freeze_threads(int do_all_threads) +{ + int result = 0, still_to_do; + unsigned long start_time = jiffies; + + if (do_all_threads) + freezer_make_fses_ro(); + + signal_threads(do_all_threads); + + /* Watch them do it, wake them if they ignore us. */ + do { + prod_processes(do_all_threads); + + set_task_state(current, TASK_INTERRUPTIBLE); + schedule_timeout(FREEZER_CHECK_TIMEOUT); + + still_to_do = num_freezeable(do_all_threads) - + num_uninterruptible(do_all_threads); + + } while(still_to_do && (!test_freezer_state(ABORT_FREEZING)) && + !time_after(jiffies, start_time + FREEZER_TOTAL_TIMEOUT)); + + /* + * Did we time out? See if we failed to freeze processes as well. + * + */ + if ((time_after(jiffies, start_time + FREEZER_TOTAL_TIMEOUT)) + && (still_to_do)) + result = freezer_failure(do_all_threads); + + BUG_ON(in_atomic()); + + return 0; +} + +/* + * freeze_processes - Freeze processes prior to saving an image of memory. + * + * Return value: 0 = success, 1 = faulure. + */ +int freeze_processes(void) +{ + enum system_states old_state = system_state; + int result = 0; + + if (!test_freezer_state(FREEZER_ON)) { + /* + * No race. While !FREEZER_ON, processes + * won't enter __freeze_process + */ + init_completion(&userspace_thaw); + init_completion(&kernelspace_thaw); + set_freezer_state(FREEZER_ON); + } + + /* Now freeze processes that were syncing and are still running */ + if (freeze_threads(0) || (test_freezer_state(ABORT_FREEZING))) { + result = 1; + goto out; + } + + /* Freeze kernel threads */ + if (freeze_threads(1) || (test_freezer_state(ABORT_FREEZING))) + result = 1; + +out: + system_state = old_state; + return result; } -EXPORT_SYMBOL(refrigerator); +EXPORT_SYMBOL(freezer_state); [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 21:28 ` Greg KH 2005-12-23 22:09 ` Nigel Cunningham @ 2005-12-24 15:11 ` Alan Stern 2005-12-24 15:28 ` Pavel Machek 2005-12-24 17:02 ` Greg KH 1 sibling, 2 replies; 69+ messages in thread From: Alan Stern @ 2005-12-24 15:11 UTC (permalink / raw) To: Greg KH; +Cc: Nigel Cunningham, Linux-pm mailing list, Pavel Machek [-- Attachment #1: Type: TEXT/PLAIN, Size: 1403 bytes --] On Fri, 23 Dec 2005, Greg KH wrote: > On Fri, Dec 23, 2005 at 04:22:44PM -0500, Alan Stern wrote: > > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > > > Well, if you can find some elegant solution in the core, I think thats > > > the best way. > > > > > > You could set system_state to "suspending" or something like that, and > > > just if() out notifications in that case. > > > > How about simply adding a call to try_to_freeze() somewhere inside > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > what I want, in theory. The hotplug processes would get frozen before > > /sbin/hotplug is exec'ed. > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > We use netlink to send the data out, so this might not even be a problem > anymore... Indeed it might not. On my older FC3 system I end up with unwanted processes, but under FC4 that doesn't happen. (Note however that even on the FC4 system, /sbin/hotplug is not empty and /proc/sys/kernel/hotplug does point to /sbin/hotplug.) The ideal situation would be to have PF_FREEZE automatically set for every new process created while userspace was frozen. Kernel threads could remove the flag and set PF_NOFREEZE if they wanted, while user processes would automatically get frozen before returning to userspace. This approach would eliminate all the loopholes at once. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-24 15:11 ` Alan Stern @ 2005-12-24 15:28 ` Pavel Machek 2005-12-24 17:02 ` Greg KH 1 sibling, 0 replies; 69+ messages in thread From: Pavel Machek @ 2005-12-24 15:28 UTC (permalink / raw) To: Alan Stern; +Cc: Nigel Cunningham, Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 2031 bytes --] On So 24-12-05 10:11:13, Alan Stern wrote: > On Fri, 23 Dec 2005, Greg KH wrote: > > > On Fri, Dec 23, 2005 at 04:22:44PM -0500, Alan Stern wrote: > > > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > > > > > Well, if you can find some elegant solution in the core, I think thats > > > > the best way. > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > just if() out notifications in that case. > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > what I want, in theory. The hotplug processes would get frozen before > > > /sbin/hotplug is exec'ed. > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > We use netlink to send the data out, so this might not even be a problem > > anymore... > > Indeed it might not. On my older FC3 system I end up with unwanted > processes, but under FC4 that doesn't happen. (Note however that even on > the FC4 system, /sbin/hotplug is not empty and /proc/sys/kernel/hotplug > does point to /sbin/hotplug.) > > The ideal situation would be to have PF_FREEZE automatically set for every > new process created while userspace was frozen. Kernel threads could > remove the flag and set PF_NOFREEZE if they wanted, while user processes > would automatically get frozen before returning to userspace. This > approach would eliminate all the loopholes at once. It would also prevent some clever stuff with u-swsusp. We'll want to keep u-swsusp program controlling suspend. I don't see how it needs to exec() today, but eventually it may want to run some properly-audited helpers... ...okay, better not, because that would mean memory allocation while suspended => bad. Anyway, doing check in exec() or something like that would slow down hot path. Just take fix userland exec's during suspend one-by-one. We are doing something wrong if we attempt that, anyway. Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-24 15:11 ` Alan Stern 2005-12-24 15:28 ` Pavel Machek @ 2005-12-24 17:02 ` Greg KH 1 sibling, 0 replies; 69+ messages in thread From: Greg KH @ 2005-12-24 17:02 UTC (permalink / raw) To: Alan Stern; +Cc: Nigel Cunningham, Linux-pm mailing list, Pavel Machek [-- Attachment #1: Type: text/plain, Size: 1290 bytes --] On Sat, Dec 24, 2005 at 10:11:13AM -0500, Alan Stern wrote: > On Fri, 23 Dec 2005, Greg KH wrote: > > > On Fri, Dec 23, 2005 at 04:22:44PM -0500, Alan Stern wrote: > > > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > > > > > Well, if you can find some elegant solution in the core, I think thats > > > > the best way. > > > > > > > > You could set system_state to "suspending" or something like that, and > > > > just if() out notifications in that case. > > > > > > How about simply adding a call to try_to_freeze() somewhere inside > > > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > > > what I want, in theory. The hotplug processes would get frozen before > > > /sbin/hotplug is exec'ed. > > > > On modern distros, /sbin/hotplug is set to NULL, so this isn't an issue. > > We use netlink to send the data out, so this might not even be a problem > > anymore... > > Indeed it might not. On my older FC3 system I end up with unwanted > processes, but under FC4 that doesn't happen. (Note however that even on > the FC4 system, /sbin/hotplug is not empty and /proc/sys/kernel/hotplug > does point to /sbin/hotplug.) I said "modern" :) Rawhide should have this set to NULL, as well as SuSE, Gentoo unstable and Debian unstable. thanks, greg k-h [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 21:22 ` Alan Stern 2005-12-23 21:28 ` Greg KH @ 2005-12-23 21:28 ` Pavel Machek 1 sibling, 0 replies; 69+ messages in thread From: Pavel Machek @ 2005-12-23 21:28 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On Pá 23-12-05 16:22:44, Alan Stern wrote: > On Fri, 23 Dec 2005, Pavel Machek wrote: > > > Well, if you can find some elegant solution in the core, I think thats > > the best way. > > > > You could set system_state to "suspending" or something like that, and > > just if() out notifications in that case. > > How about simply adding a call to try_to_freeze() somewhere inside > kernel/kmod.c:____call_usermodehelper()? That ought to do pretty much > what I want, in theory. The hotplug processes would get frozen before > /sbin/hotplug is exec'ed. > > I haven't actually tried it yet, so maybe there's a hidden gotcha. That would mean quite a lot of "extra" events delivered to userspace, no? ... no, actually not, that should be okay. "resume during suspend" events will be discarded, because they'll not make it into on-disk image, so userspace will get single unplug/replug events after resume. That's okay. OTOH I do not think try_to_freeze() is going to make it. Process freezing is one-time action, so if you start new thread, try_to_freeze() will not attempt to freeze it. Pavel -- Thanks, Sharp! ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 16:52 ` Alan Stern 2005-12-23 17:20 ` Pavel Machek @ 2005-12-23 19:40 ` Greg KH 2005-12-24 0:23 ` Patrick Mochel 2 siblings, 0 replies; 69+ messages in thread From: Greg KH @ 2005-12-23 19:40 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1414 bytes --] On Fri, Dec 23, 2005 at 11:52:03AM -0500, Alan Stern wrote: > > I don't find justification to your argument in avoiding to fix drivers. We > > don't change the core when it means the impact would be to change every > > driver. And, we shouldn't change the core to avoid fixing a subset of > > drivers. The number of drivers that it affects should not be 100% of them. > > The issue of registering and unregistering devices during a sleep > transition has come up before. It's not going to go away. We should > find a good way to deal with it. Changing the core hotplug support so > that new unfrozen processes don't get started up while everything else is > frozen seems to me like a fairly small, cheap way of getting things to > work properly. > > Furthermore, Greg KH has stated that many (or even most) USB drivers > shouldn't need to have special support for suspend/resume added. I'm not > sure whether he intended that these drivers should be unbound during a > suspend, but that's the easiest way to handle them. No, my main point was that we can not abort the syspend sequence just because a USB driver does not have a suspend/resume callback. That is what was happening in the tree at the time, and a lot of different users complained about it. And yes, I don't mind if we just unbind those drivers from devices at suspend, as for USB that's probably the easiest way. thanks, greg k-h [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 16:52 ` Alan Stern 2005-12-23 17:20 ` Pavel Machek 2005-12-23 19:40 ` Greg KH @ 2005-12-24 0:23 ` Patrick Mochel 2005-12-25 2:56 ` Alan Stern 2 siblings, 1 reply; 69+ messages in thread From: Patrick Mochel @ 2005-12-24 0:23 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 8347 bytes --] On Fri, 23 Dec 2005, Alan Stern wrote: > On Fri, 23 Dec 2005, Patrick Mochel wrote: > > > > I agree with Dmitry. Many drivers don't need to do anything special for > > > suspend, or only need to cancel an outstanding input request. Rather than > > > go through every single driver and add a minimal (and possibly erroneous) > > > suspend routine, it's much easier just to unbind these drivers. > > > > So why not fix the subsystems to do the equivalent of an unbinding, from > > a tear-down and reinitialization standpoint? That way, you don't have to > > force the core to contort itself simply for the fact that you want to push > > in a thumbtack with a sledgehammer? > > That sounds very dangerous. You would end up with a situation where the > driver thinks it has been unbound but the core thinks it is still bound. > Not to mention that some drivers, as part of their unbind routines, unbind > themselves from other interfaces -- these would be true, actual > unbindings. What? No, you are trying to make it sound way more complicated than it needs to be. I didn't say anything about unbinding - I was only talking about tearing down the device (aka 'shutting down the device' aka 'quiescing the device' aka 'whatever the ->remove() routine does to the device'). If there was a way to do that, then you wouldn't need to unbind/rebind the driver to the device. See next paragraph. > > Think about what is happening. It's been discovered that shutting down the > > device and reinitializing the device performs all the correct actions to > > get the device up and running again after a suspend/resume cycle. > > ? I don't follow you at all. What does shutting down the device have to > do with what we're talking about? I'm trying to unbind a _driver_, not do > anything to the _device_. As far as I know, the device doesn't need any > sort of special treatment to work okay with suspend/resume. Then what is the point of unbinding in the first place. If the device doesn't need any special treatment, then all you're doing is tinkering with data structures, which should be completely unnecessary (unless for some reason the data structures are left in some incomplete state, which is cause for greater alarm). The reason why it makes sense to unbind/rebind drives is because the device accesses on ->remove() and ->probe() do enough to put the device to bed and bring back up again. (Note that's the context I was speaking with for the last couple of mesages.) Any other reason to unbind/rebind seems very fishy. > > All you have to do for each suspend routine is mimic that effect. It > > arguably doesn't require any serious knowledge about the device - it > > only needs a copy of the ->probe() and ->remove() routines (or the > > functional equivalent for those devices), without the allocation and > > freeing of data structures. > > Why go to all the trouble of copying probe and remove when we can call the > actual existing routines? Because all you want to do is stop and start the device. Unbinding and rebinding do a whole lot more. A device does an entire round trip through the driver core, which does a lot of crap. There are several memory allocations, perhaps dozens of sysfs files created, and a handful of locks taken adn lists traversed. No, it's not on the fast path, but it's just simply inefficient. Not to mention that even if you do get the userspace notification correct, you're unnecessarily creating work to do for devices that have not really been removed and re-added. > > In fact, this sounds an awful lot like the ->start() / ->stop() methods > > that have been proposed over the years. I would say that it's time to give > > that concept some serious thought and start implementing them in drivers. > > That way, for devices that don't actually do hardware PM, we can still get > > them down and back up again, without having to worry about those scary > > ->suspend()/->resume() methods. :-) > > Start and stop are no more or less scary than suspend and resume. It was a joke, based on the fact that you're treating the PM calls as some hideous monster. C'mon, it can't be that hard to implement something that does the exact same device acceses as happen in ->probe() and ->remove(). > _Any_ routine that hasn't been implemented and subjected to testing is > going to be less reliable than one that has been around and in use for a > long time (such as probe and resume). Furthermore, I don't see any > advantage to changing both the core (to add start and stop) as well as > changing all the drivers, as opposed to simply changing all the drivers > (to add suspend and resume). For the record, the changes to the core are trivial. And, I wasn't advocating that we change the core and every driver to add ->start() and ->stop() in lieu of simply adding ->suspend() and ->resume(). I think that they should all be added in due time. > > I don't find justification to your argument in avoiding to fix drivers. We > > don't change the core when it means the impact would be to change every > > driver. And, we shouldn't change the core to avoid fixing a subset of > > drivers. The number of drivers that it affects should not be 100% of them. > > The issue of registering and unregistering devices during a sleep > transition has come up before. It's not going to go away. We should > find a good way to deal with it. Like fix the drivers so that they don't have to be unregistered and re-registered? That seems like a proper long term solution. > Changing the core hotplug support so that new unfrozen processes don't > get started up while everything else is frozen seems to me like a fairly > small, cheap way of getting things to work properly. Sure, it should be easy. Stop the hotplug thread; or just kill it. Regardless, it doesn't change the fact that the drivers are still not working properly. They exhibit working behavior, but they are strictly dependent on the behavior of an independent subsystem. So now we have to make changes to the hotpug code simply to accomodate some other half-implemented code. That's a dangerous situtation, because it could prevent us from changing the hotplug code (or any other subsystem that must be changed to accomodate incomplete PM implementations) in the future. Sure, it's a small change now, and there are "lots of drivers" that would have to be fixed otherwise. But, what happens in 5 years when we want to do something totally different, and people are still ignoring the call to implement suspend/resume support (that is now 10 years old). Breaking all the drivers then will be a lot more painful than buckling down and fixing them now. > Furthermore, Greg KH has stated that many (or even most) USB drivers > shouldn't need to have special support for suspend/resume added. I'm not > sure whether he intended that these drivers should be unbound during a > suspend, but that's the easiest way to handle them. Fine, see below. > > And, as I mentioned before, it should be +/- simple, and without any of > > the (unknown) side-effects of binding/unbinding. You said 'possibly > > erroneous', but I fail to see how this would be any more dangerous than > > unbinding/binding. > > Binding and unbinding have been part of every USB driver from the day they > were first written. They have gotten _lots_ of testing and are known to > work well. The same cannot be said for new suspend/resume routines. And, this is going to help that disparity how? People have been avoiding adding PM support for five years now. By a stroke of luck, USB didn't need it, and things just worked. So, what now? Do we tiptoe around that in the rest of the driver/PM/hotplug core, or do not worry about the shortcomings of some drivers, accept the fact there is going to be some work to do, and do things right? I say the latter. And, I say that there should be no hotplug events during suspend/resume. If some drivers require that they be unbound before suspend, then I have a very simple solution - do it in userspace. It's damn easy to unload the driver module, then to reload it again. Push it on to the distros. Write some scripts. Create a black/white/blue list of drivers that must be unloaded. Problem solved. Not only that, but it also prevents us from getting hotplug events during the suspend transition. Thanks, Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-24 0:23 ` Patrick Mochel @ 2005-12-25 2:56 ` Alan Stern 2005-12-25 8:53 ` Pavel Machek 2005-12-26 21:20 ` Patrick Mochel 0 siblings, 2 replies; 69+ messages in thread From: Alan Stern @ 2005-12-25 2:56 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 7730 bytes --] On Fri, 23 Dec 2005, Patrick Mochel wrote: > > > So why not fix the subsystems to do the equivalent of an unbinding, from > > > a tear-down and reinitialization standpoint? That way, you don't have to > > > force the core to contort itself simply for the fact that you want to push > > > in a thumbtack with a sledgehammer? > > > > That sounds very dangerous. You would end up with a situation where the > > driver thinks it has been unbound but the core thinks it is still bound. > > Not to mention that some drivers, as part of their unbind routines, unbind > > themselves from other interfaces -- these would be true, actual > > unbindings. > > What? No, you are trying to make it sound way more complicated than it > needs to be. I didn't say anything about unbinding - I can't resist pointing out that you _did_ say, just above: "So why not fix the subsystems to do the equivalent of an unbinding...". > I was only talking > about tearing down the device (aka 'shutting down the device' aka > 'quiescing the device' aka 'whatever the ->remove() routine does to the > device'). > > If there was a way to do that, then you wouldn't need to unbind/rebind the > driver to the device. See next paragraph. > Then what is the point of unbinding in the first place. If the device > doesn't need any special treatment, then all you're doing is tinkering > with data structures, which should be completely unnecessary (unless for > some reason the data structures are left in some incomplete state, which > is cause for greater alarm). > > The reason why it makes sense to unbind/rebind drives is because the > device accesses on ->remove() and ->probe() do enough to put the device to > bed and bring back up again. (Note that's the context I was speaking with > for the last couple of mesages.) Any other reason to unbind/rebind seems > very fishy. The way you wrote this indicates you don't understand the problem we're trying to solve, so I'd better explain more fully. We're not worried about quiescing the device, putting it to bed, or whatever. That will happen just fine, all by itself. The problem is that the _driver_ needs to be quiesced. It shouldn't try to do any I/O during a system sleep. Partly because any attempted I/O will fail once the USB device has been suspended, partly because during a short time interval (between suspending the USB interface and suspending the USB device) the I/O will succeed but still shouldn't occur, and partly because the attempt will leave the driver in a not-well-defined error state. Now perhaps things aren't quite so strict. With many drivers it might be okay to let them deal with I/O failures. It's not elegant, though. > Because all you want to do is stop and start the device. Stop and start the _driver_, not the device. But maybe you regard those two phrases as meaning the same thing. > Unbinding and > rebinding do a whole lot more. A device does an entire round trip through > the driver core, which does a lot of crap. There are several memory > allocations, perhaps dozens of sysfs files created, and a handful of locks > taken adn lists traversed. No, you're thinking of unregistering and re-registering a device or driver. Unbinding and rebinding is much simpler, although still somewhat complex. > No, it's not on the fast path, but it's just simply inefficient. Not to > mention that even if you do get the userspace notification correct, you're > unnecessarily creating work to do for devices that have not really been > removed and re-added. Again, this concerns drivers and not devices. But yes, it is extra work. > > The issue of registering and unregistering devices during a sleep > > transition has come up before. It's not going to go away. We should > > find a good way to deal with it. > > Like fix the drivers so that they don't have to be unregistered and > re-registered? That seems like a proper long term solution. Unbound and rebound, not unregistered and re-registered. Yes, changing all the drivers would mean we could avoid unbinding and rebinding them. It's not at all clear that this would an optimal solution, however, in terms of either code size or maintainability. > Sure, it should be easy. Stop the hotplug thread; or just kill it. > > Regardless, it doesn't change the fact that the drivers are still not > working properly. They exhibit working behavior, but they are strictly > dependent on the behavior of an independent subsystem. So now we have to > make changes to the hotpug code simply to accomodate some other > half-implemented code. > > That's a dangerous situtation, because it could prevent us from changing > the hotplug code (or any other subsystem that must be changed to > accomodate incomplete PM implementations) in the future. > > Sure, it's a small change now, and there are "lots of drivers" that would > have to be fixed otherwise. But, what happens in 5 years when we want to > do something totally different, and people are still ignoring the call to > implement suspend/resume support (that is now 10 years old). Breaking all > the drivers then will be a lot more painful than buckling down and fixing > them now. Well, the hotplug thread thing is now a separate issue. And maybe one we don't need to worry about... > > Furthermore, Greg KH has stated that many (or even most) USB drivers > > shouldn't need to have special support for suspend/resume added. I'm not > > sure whether he intended that these drivers should be unbound during a > > suspend, but that's the easiest way to handle them. > > Fine, see below. > > Binding and unbinding have been part of every USB driver from the day they > > were first written. They have gotten _lots_ of testing and are known to > > work well. The same cannot be said for new suspend/resume routines. > > And, this is going to help that disparity how? > > People have been avoiding adding PM support for five years now. By a > stroke of luck, USB didn't need it, and things just worked. So, what now? > Do we tiptoe around that in the rest of the driver/PM/hotplug core, or do > not worry about the shortcomings of some drivers, accept the fact there is > going to be some work to do, and do things right? > > I say the latter. And, I say that there should be no hotplug events during > suspend/resume. > > If some drivers require that they be unbound before suspend, then I have a > very simple solution - do it in userspace. It's damn easy to unload the > driver module, then to reload it again. Push it on to the distros. Write > some scripts. Create a black/white/blue list of drivers that must be > unloaded. Problem solved. Not only that, but it also prevents us from > getting hotplug events during the suspend transition. At this point I'm starting to feel like the proverbial "man-in-the-middle". Recently I submitted a patch adding suspend/resume support to the usblp driver. Greg objected to it because it added explicit checks for whether the driver was suspended before starting up I/O operations. He said that such checks did not belong in USB drivers, they belonged in the USB core. (You can read his original comments in http://marc.theaimsgroup.com/?l=linux-usb-devel&m=113418897301873&w=2 .) So now what should we do? Require userspace to rmmod usblp before suspending? Add suspend and resume methods to usblp, but make it so they only cancel outstanding I/O, while relying on usbcore to fail any new I/O requests? Neither of those feels good to me. The only other options I can think of are: Unbind usblp in lieu of suspending it. Make usblp check whether it is suspended before submitting any I/O requests. These have been ruled out by you and Greg. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 2:56 ` Alan Stern @ 2005-12-25 8:53 ` Pavel Machek 2005-12-25 16:43 ` Alan Stern 2005-12-26 21:20 ` Patrick Mochel 1 sibling, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-25 8:53 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1454 bytes --] Hi! > At this point I'm starting to feel like the proverbial > "man-in-the-middle". Recently I submitted a patch adding Rock and a hard place? > suspend/resume > support to the usblp driver. Greg objected to it because it added > explicit checks for whether the driver was suspended before starting up > I/O operations. He said that such checks did not belong in USB drivers, > they belonged in the USB core. (You can read his original comments in > > http://marc.theaimsgroup.com/?l=linux-usb-devel&m=113418897301873&w=2 > > .) So now what should we do? > > Require userspace to rmmod usblp before suspending? No. > Add suspend and resume methods to usblp, but make it so they > only cancel outstanding I/O, while relying on usbcore to fail > any new I/O requests? It is not *that* bad, actually. In system suspend/resume cases, no new I/O requests can happen, because userspace is frozen. Because of runtime suspend, you should handle I/O errors properly, but you should handle I/O errors properly, anyway, so... looks like a solution to me. > Neither of those feels good to me. The only other options I can think of > are: > > Unbind usblp in lieu of suspending it. If this can be done in reasonable ammount of not-too-ugly code, why not? I think that even Patrick can be convinced by nice patch. > Make usblp check whether it is suspended before submitting any > I/O requests. Ugly. Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 8:53 ` Pavel Machek @ 2005-12-25 16:43 ` Alan Stern 2005-12-25 16:52 ` Dmitry Torokhov 2005-12-25 19:52 ` Pavel Machek 0 siblings, 2 replies; 69+ messages in thread From: Alan Stern @ 2005-12-25 16:43 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 4356 bytes --] On Sun, 25 Dec 2005, Pavel Machek wrote: > Hi! > > > At this point I'm starting to feel like the proverbial > > "man-in-the-middle". Recently I submitted a patch adding > > Rock and a hard place? I guess so. The encryption-attacker simile isn't appropriate... > > suspend/resume > > support to the usblp driver. Greg objected to it because it added > > explicit checks for whether the driver was suspended before starting up > > I/O operations. He said that such checks did not belong in USB drivers, > > they belonged in the USB core. (You can read his original comments in > > > > http://marc.theaimsgroup.com/?l=linux-usb-devel&m=113418897301873&w=2 > > > > .) So now what should we do? > > > > Require userspace to rmmod usblp before suspending? > > No. > > > Add suspend and resume methods to usblp, but make it so they > > only cancel outstanding I/O, while relying on usbcore to fail > > any new I/O requests? > > It is not *that* bad, actually. In system suspend/resume cases, no new > I/O requests can happen, because userspace is frozen. Because of > runtime suspend, you should handle I/O errors properly, but you should > handle I/O errors properly, anyway, so... looks like a solution to me. You're right, it's not really all that bad. Note however that in the PPC implementation, Ben H. does not freeze userspace before suspend to RAM. Making the necessary changes to usbcore won't be very difficult. (With the exception of endpoint 0 -- I think we can afford to ignore that issue.) > > Neither of those feels good to me. The only other options I can think of > > are: > > > > Unbind usblp in lieu of suspending it. > > If this can be done in reasonable ammount of not-too-ugly code, why > not? I think that even Patrick can be convinced by nice patch. This isn't very hard either. A sample patch is given below. I haven't tested it with vanilla 2.6.15-rc6, but a similar patch works okay on Greg's development tree. Apart from the FIXMEs, it's not bad at all. > > Make usblp check whether it is suspended before submitting any > > I/O requests. > > Ugly. Alan Stern Index: linux-2.6.15-rc6/drivers/usb/core/usb.c =================================================================== --- linux-2.6.15-rc6.orig/drivers/usb/core/usb.c +++ linux-2.6.15-rc6/drivers/usb/core/usb.c @@ -1431,8 +1431,17 @@ static int usb_generic_suspend(struct de else mark_quiesced(intf); } else { - // FIXME else if there's no suspend method, disconnect... - dev_warn(dev, "no %s?\n", "suspend"); + dev_warn(dev, "no suspend for %s? unbinding...\n", + driver->name); + up(&dev->sem); + + /* + * FIXME: You're not supposed to do this without holding + * the USB device lock. But we can't just grab the lock + * because our caller might already be holding it. + */ + usb_driver_release_interface(driver, intf); + down(&dev->sem); status = 0; } return status; @@ -1459,9 +1468,22 @@ static int usb_generic_resume(struct dev return usb_resume_device (to_usb_device(dev)); } - if ((dev->driver == NULL) || - (dev->driver_data == &usb_generic_driver_data)) + /* try to bind interfaces that have no driver */ + if (dev->driver == NULL) { + up(&dev->sem); + + /* + * FIXME: You're not supposed to do this without holding + * the USB device lock. But we can't just grab the lock + * because our caller might already be holding it. + */ + device_attach(dev); + down(&dev->sem); return 0; + } + + if (dev->driver_data == &usb_generic_driver_data) + return 0; /* Shouldn't happen */ intf = to_usb_interface(dev); driver = to_usb_driver(dev->driver); @@ -1481,7 +1503,7 @@ static int usb_generic_resume(struct dev mark_quiesced(intf); } } else - dev_warn(dev, "no %s?\n", "resume"); + dev_warn(dev, "no resume for %s?\n", driver->name); return 0; } Index: linux-2.6.15-rc6/drivers/usb/core/hub.c =================================================================== --- linux-2.6.15-rc6.orig/drivers/usb/core/hub.c +++ linux-2.6.15-rc6/drivers/usb/core/hub.c @@ -1838,12 +1838,6 @@ int usb_resume_device(struct usb_device usb_unlock_device(udev); - /* rebind drivers that had no suspend() */ - if (status == 0) { - usb_lock_all_devices(); - bus_rescan_devices(&usb_bus_type); - usb_unlock_all_devices(); - } return status; } [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 16:43 ` Alan Stern @ 2005-12-25 16:52 ` Dmitry Torokhov 2005-12-25 19:54 ` Pavel Machek 2005-12-25 19:52 ` Pavel Machek 1 sibling, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-25 16:52 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On Sunday 25 December 2005 11:43, Alan Stern wrote: > > It is not *that* bad, actually. In system suspend/resume cases, no new > > I/O requests can happen, because userspace is frozen. Because of > > runtime suspend, you should handle I/O errors properly, but you should > > handle I/O errors properly, anyway, so... looks like a solution to me. > > You're right, it's not really all that bad. Note however that in the PPC > implementation, Ben H. does not freeze userspace before suspend to RAM. > Do we freeze processes before STR on i386? -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 16:52 ` Dmitry Torokhov @ 2005-12-25 19:54 ` Pavel Machek 0 siblings, 0 replies; 69+ messages in thread From: Pavel Machek @ 2005-12-25 19:54 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list On Ne 25-12-05 11:52:21, Dmitry Torokhov wrote: > On Sunday 25 December 2005 11:43, Alan Stern wrote: > > > It is not *that* bad, actually. In system suspend/resume cases, no new > > > I/O requests can happen, because userspace is frozen. Because of > > > runtime suspend, you should handle I/O errors properly, but you should > > > handle I/O errors properly, anyway, so... looks like a solution to me. > > > > You're right, it's not really all that bad. Note however that in the PPC > > implementation, Ben H. does not freeze userspace before suspend to RAM. > > > > Do we freeze processes before STR on i386? Yes. Pavel -- Thanks, Sharp! ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 16:43 ` Alan Stern 2005-12-25 16:52 ` Dmitry Torokhov @ 2005-12-25 19:52 ` Pavel Machek 2005-12-26 3:53 ` Alan Stern 1 sibling, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-25 19:52 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 3726 bytes --] Hi! > > > Add suspend and resume methods to usblp, but make it so they > > > only cancel outstanding I/O, while relying on usbcore to fail > > > any new I/O requests? > > > > It is not *that* bad, actually. In system suspend/resume cases, no new > > I/O requests can happen, because userspace is frozen. Because of > > runtime suspend, you should handle I/O errors properly, but you should > > handle I/O errors properly, anyway, so... looks like a solution to me. > > You're right, it's not really all that bad. Note however that in the PPC > implementation, Ben H. does not freeze userspace before suspend to RAM. Perhaps he'll change his mind ;-))))). Yep, for ppc and runtime power management, this is not really nice. But if someone forces our device into off state, perhaps returning errors to userspace is acceptable solution? > > > Unbind usblp in lieu of suspending it. > > > > If this can be done in reasonable ammount of not-too-ugly code, why > > not? I think that even Patrick can be convinced by nice patch. > > This isn't very hard either. A sample patch is given below. I haven't > tested it with vanilla 2.6.15-rc6, but a similar patch works okay on > Greg's development tree. Apart from the FIXMEs, it's not bad at > all. Well, the FIXMEs look quite nasty to my untrained eye... Pavel > Index: linux-2.6.15-rc6/drivers/usb/core/usb.c > =================================================================== > --- linux-2.6.15-rc6.orig/drivers/usb/core/usb.c > +++ linux-2.6.15-rc6/drivers/usb/core/usb.c > @@ -1431,8 +1431,17 @@ static int usb_generic_suspend(struct de > else > mark_quiesced(intf); > } else { > - // FIXME else if there's no suspend method, disconnect... > - dev_warn(dev, "no %s?\n", "suspend"); > + dev_warn(dev, "no suspend for %s? unbinding...\n", > + driver->name); > + up(&dev->sem); > + > + /* > + * FIXME: You're not supposed to do this without holding > + * the USB device lock. But we can't just grab the lock > + * because our caller might already be holding it. > + */ > + usb_driver_release_interface(driver, intf); > + down(&dev->sem); > status = 0; > } > return status; > @@ -1459,9 +1468,22 @@ static int usb_generic_resume(struct dev > return usb_resume_device (to_usb_device(dev)); > } > > - if ((dev->driver == NULL) || > - (dev->driver_data == &usb_generic_driver_data)) > + /* try to bind interfaces that have no driver */ > + if (dev->driver == NULL) { > + up(&dev->sem); > + > + /* > + * FIXME: You're not supposed to do this without holding > + * the USB device lock. But we can't just grab the lock > + * because our caller might already be holding it. > + */ > + device_attach(dev); > + down(&dev->sem); > return 0; > + } > + > + if (dev->driver_data == &usb_generic_driver_data) > + return 0; /* Shouldn't happen */ > > intf = to_usb_interface(dev); > driver = to_usb_driver(dev->driver); > @@ -1481,7 +1503,7 @@ static int usb_generic_resume(struct dev > mark_quiesced(intf); > } > } else > - dev_warn(dev, "no %s?\n", "resume"); > + dev_warn(dev, "no resume for %s?\n", driver->name); > return 0; > } > > Index: linux-2.6.15-rc6/drivers/usb/core/hub.c > =================================================================== > --- linux-2.6.15-rc6.orig/drivers/usb/core/hub.c > +++ linux-2.6.15-rc6/drivers/usb/core/hub.c > @@ -1838,12 +1838,6 @@ int usb_resume_device(struct usb_device > > usb_unlock_device(udev); > > - /* rebind drivers that had no suspend() */ > - if (status == 0) { > - usb_lock_all_devices(); > - bus_rescan_devices(&usb_bus_type); > - usb_unlock_all_devices(); > - } > return status; > } > > -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 19:52 ` Pavel Machek @ 2005-12-26 3:53 ` Alan Stern 0 siblings, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-26 3:53 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1653 bytes --] On Sun, 25 Dec 2005, Pavel Machek wrote: > > > It is not *that* bad, actually. In system suspend/resume cases, no new > > > I/O requests can happen, because userspace is frozen. Because of > > > runtime suspend, you should handle I/O errors properly, but you should > > > handle I/O errors properly, anyway, so... looks like a solution to me. > > > > You're right, it's not really all that bad. Note however that in the PPC > > implementation, Ben H. does not freeze userspace before suspend to RAM. > > Perhaps he'll change his mind ;-))))). > > Yep, for ppc and runtime power management, this is not really > nice. But if someone forces our device into off state, perhaps > returning errors to userspace is acceptable solution? It should be. Provided the driver doesn't get too confused. With usblp that's not a problem. > > > > Unbind usblp in lieu of suspending it. > > > > > > If this can be done in reasonable ammount of not-too-ugly code, why > > > not? I think that even Patrick can be convinced by nice patch. > > > > This isn't very hard either. A sample patch is given below. I haven't > > tested it with vanilla 2.6.15-rc6, but a similar patch works okay on > > Greg's development tree. Apart from the FIXMEs, it's not bad at > > all. > > Well, the FIXMEs look quite nasty to my untrained eye... They're not as bad as they look. The requirement about holding the extra semaphore matters only for the hub driver; no other USB drivers care about it AFAIK. And of course the hub driver _does_ have suspend/resume support, so the new code wouldn't affect it. However, I will go ahead as you and Greg suggest. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-25 2:56 ` Alan Stern 2005-12-25 8:53 ` Pavel Machek @ 2005-12-26 21:20 ` Patrick Mochel 2005-12-26 22:50 ` Dmitry Torokhov 2005-12-26 23:25 ` Alan Stern 1 sibling, 2 replies; 69+ messages in thread From: Patrick Mochel @ 2005-12-26 21:20 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 5381 bytes --] On Sat, 24 Dec 2005, Alan Stern wrote: > On Fri, 23 Dec 2005, Patrick Mochel wrote: > > > > > So why not fix the subsystems to do the equivalent of an unbinding, from > > > > a tear-down and reinitialization standpoint? That way, you don't have to > > > > force the core to contort itself simply for the fact that you want to push > > > > in a thumbtack with a sledgehammer? > > > > > > That sounds very dangerous. You would end up with a situation where the > > > driver thinks it has been unbound but the core thinks it is still bound. > > > Not to mention that some drivers, as part of their unbind routines, unbind > > > themselves from other interfaces -- these would be true, actual > > > unbindings. > > > > What? No, you are trying to make it sound way more complicated than it > > needs to be. I didn't say anything about unbinding - > > I can't resist pointing out that you _did_ say, just above: "So why not > fix the subsystems to do the equivalent of an unbinding...". Fair enough. :) > The way you wrote this indicates you don't understand the problem we're > trying to solve, so I'd better explain more fully. > > We're not worried about quiescing the device, putting it to bed, or > whatever. That will happen just fine, all by itself. > > The problem is that the _driver_ needs to be quiesced. It shouldn't try > to do any I/O during a system sleep. Partly because any attempted I/O > will fail once the USB device has been suspended, partly because during a > short time interval (between suspending the USB interface and suspending > the USB device) the I/O will succeed but still shouldn't occur, and partly > because the attempt will leave the driver in a not-well-defined error > state. > > Now perhaps things aren't quite so strict. With many drivers it might be > okay to let them deal with I/O failures. It's not elegant, though. > > > Because all you want to do is stop and start the device. > > Stop and start the _driver_, not the device. But maybe you regard those > two phrases as meaning the same thing. Basically. I understand your points, and I think that we were talking about slightly different points. Nevertheless, think about both of the things that we are talking about - quiescing the driver and quiescing the device. Both of those things happen by unbinding the driver - the driver stops the I/O requests and the device is put into a non-talkative state. However, there are other things that happen as a side effect, which I think can (and eventually will) produce negative results. I think that we can achieve the things that we talk about in a simpler manner, which will also happen to be more elegant and more efficient. Instead of unbinding, what if we simply issued a request to the device to halt I/O transactions and put the device in a quiescent state? We could call this method ->stop() and put it in struct device_driver. Since it is in the core driver structure, it would be filled in and intercepted by the different bus driver cores. The USB core would receive the call, prevent any future I/O transactions, then call down to the (optionally present) driver-specific ->stop(). [ Of course, it looks good on paper, and you may know some of the nasty details of USB that would prevent something like this. ] Then again, you could initially implement this for USB only, then move it into the core once we all decide that it makes sense to propogate outwards. > > Unbinding and > > rebinding do a whole lot more. A device does an entire round trip through > > the driver core, which does a lot of crap. There are several memory > > allocations, perhaps dozens of sysfs files created, and a handful of locks > > taken adn lists traversed. > > No, you're thinking of unregistering and re-registering a device or > driver. Unbinding and rebinding is much simpler, although still somewhat > complex. Ah, point taken; I was thinking that all USB devices were still disconnected during a suspend transition.. > At this point I'm starting to feel like the proverbial > "man-in-the-middle". Recently I submitted a patch adding suspend/resume > support to the usblp driver. Greg objected to it because it added > explicit checks for whether the driver was suspended before starting up > I/O operations. He said that such checks did not belong in USB drivers, > they belonged in the USB core. (You can read his original comments in > > http://marc.theaimsgroup.com/?l=linux-usb-devel&m=113418897301873&w=2 > > .) So now what should we do? > > Require userspace to rmmod usblp before suspending? > > Add suspend and resume methods to usblp, but make it so they > only cancel outstanding I/O, while relying on usbcore to fail > any new I/O requests? The latter seems to make the most sense - it is partitioning the problem between things which are specific to the device (canceling pending I/O, assuming that it requires issuing commands to the devices) and will be generic to all like devices (preventing future I/O, assuming that's a software flag that is set/checked). Because the USB core intercepts and forwards the ->suspend()/->resume() calls (and possibly ->start()/->stop()), it has the ability to do any pre- and post- work that is generic to all or most devices, which is good because it prevents unnecessary duplication of work in the various drivers. Does that help at all? Thanks, Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-26 21:20 ` Patrick Mochel @ 2005-12-26 22:50 ` Dmitry Torokhov 2005-12-26 23:01 ` Alan Stern 2005-12-27 21:09 ` Patrick Mochel 2005-12-26 23:25 ` Alan Stern 1 sibling, 2 replies; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-26 22:50 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list On Monday 26 December 2005 16:20, Patrick Mochel wrote: > > > Unbinding and > > > rebinding do a whole lot more. A device does an entire round trip through > > > the driver core, which does a lot of crap. There are several memory > > > allocations, perhaps dozens of sysfs files created, and a handful of locks > > > taken adn lists traversed. > > > > No, you're thinking of unregistering and re-registering a device or > > driver. Unbinding and rebinding is much simpler, although still somewhat > > complex. > > Ah, point taken; I was thinking that all USB devices were still > disconnected during a suspend transition.. > Device removal could happen at any point, even during suspend transition. The kernel should be able to handle this scenario therefore implementation that assumes that device tree is frozen in flawed. As far as I understand the only thing that does not work at the moment is invoking hotplug handler. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-26 22:50 ` Dmitry Torokhov @ 2005-12-26 23:01 ` Alan Stern 2005-12-27 21:45 ` Dmitry Torokhov 2005-12-27 21:09 ` Patrick Mochel 1 sibling, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-26 23:01 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1148 bytes --] On Mon, 26 Dec 2005, Dmitry Torokhov wrote: > Device removal could happen at any point, even during suspend transition. > The kernel should be able to handle this scenario therefore implementation > that assumes that device tree is frozen in flawed. As far as I understand > the only thing that does not work at the moment is invoking hotplug handler. The situation isn't all that bad. The device tree can be changed only by calling device_add or device_del (or the corresponding register/unregister routines). That can only be done in process context; hence if processes are frozen it can't happen. Or rather, it can happen only if a suspend() or resume() method tries to do it. If those methods don't try to handle device removal or addition -- which in most cases they shouldn't -- then there's nothing to worry about. It's only oddball situations (like calling the remove method when the driver doesn't define a suspend method) that cause problems. And even then, with most drivers you don't get a hotplug event when the driver is unbound. It happens only with things like usb-storage, that define child devices. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-26 23:01 ` Alan Stern @ 2005-12-27 21:45 ` Dmitry Torokhov 2005-12-27 22:08 ` Pavel Machek 2005-12-29 2:07 ` Alan Stern 0 siblings, 2 replies; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-27 21:45 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On 12/26/05, Alan Stern <stern@rowland.harvard.edu> wrote: > On Mon, 26 Dec 2005, Dmitry Torokhov wrote: > > > Device removal could happen at any point, even during suspend transition. > > The kernel should be able to handle this scenario therefore implementation > > that assumes that device tree is frozen in flawed. As far as I understand > > the only thing that does not work at the moment is invoking hotplug handler. > > The situation isn't all that bad. > > The device tree can be changed only by calling device_add or device_del > (or the corresponding register/unregister routines). That can only be > done in process context; hence if processes are frozen it can't happen. > Or rather, it can happen only if a suspend() or resume() method tries to > do it. > > If those methods don't try to handle device removal or addition -- which > in most cases they shouldn't -- then there's nothing to worry about. It's > only oddball situations (like calling the remove method when the driver > doesn't define a suspend method) that cause problems. > Hmm, I am confused. What should ->resume() method do when it fails to resume a device? I would say kill it, espacially if it is bus-generic method. How else would we clean up after removed devices? And here we have a problem because swsuspend likes to resume all devices before writing the image and drivers don't rellay know if they are woken up for real or just to complete suspend transition. > And even then, with most drivers you don't get a hotplug event when the > driver is unbound. It happens only with things like usb-storage, that > define child devices. > No, I'd say most drivers generate hotplug events because most of them create/destroy class devices nowadays. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:45 ` Dmitry Torokhov @ 2005-12-27 22:08 ` Pavel Machek 2005-12-27 22:21 ` Dmitry Torokhov 2005-12-29 2:07 ` Alan Stern 1 sibling, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-27 22:08 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list On Út 27-12-05 16:45:17, Dmitry Torokhov wrote: > On 12/26/05, Alan Stern <stern@rowland.harvard.edu> wrote: > > On Mon, 26 Dec 2005, Dmitry Torokhov wrote: > > > > > Device removal could happen at any point, even during suspend transition. > > > The kernel should be able to handle this scenario therefore implementation > > > that assumes that device tree is frozen in flawed. As far as I understand > > > the only thing that does not work at the moment is invoking hotplug handler. > > > > The situation isn't all that bad. > > > > The device tree can be changed only by calling device_add or device_del > > (or the corresponding register/unregister routines). That can only be > > done in process context; hence if processes are frozen it can't happen. > > Or rather, it can happen only if a suspend() or resume() method tries to > > do it. > > > > If those methods don't try to handle device removal or addition -- which > > in most cases they shouldn't -- then there's nothing to worry about. It's > > only oddball situations (like calling the remove method when the driver > > doesn't define a suspend method) that cause problems. > > > > Hmm, I am confused. What should ->resume() method do when it fails to > resume a device? I would say kill it, espacially if it is > bus-generic That should basically never happen. You can kill the device, or maybe even panic() whole machine. It just should not happen. Pavel -- Thanks, Sharp! ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 22:08 ` Pavel Machek @ 2005-12-27 22:21 ` Dmitry Torokhov 2005-12-27 22:31 ` Pavel Machek 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-27 22:21 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > On Út 27-12-05 16:45:17, Dmitry Torokhov wrote: > > > > Hmm, I am confused. What should ->resume() method do when it fails to > > resume a device? I would say kill it, espacially if it is > > bus-generic > > That should basically never happen. You can kill the device, or maybe > even panic() whole machine. It just should not happen. It can happen, it did happen and it will continue to happen. You hit your suspend button and stand up and accidentially kick your USB hub or your PS/2 mouse gets loose and kernel fails to resume it. It is no big deal, we just have to be prepared to handle it. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 22:21 ` Dmitry Torokhov @ 2005-12-27 22:31 ` Pavel Machek 2005-12-27 22:39 ` Dmitry Torokhov 0 siblings, 1 reply; 69+ messages in thread From: Pavel Machek @ 2005-12-27 22:31 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list On Út 27-12-05 17:21:28, Dmitry Torokhov wrote: > On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > > On Út 27-12-05 16:45:17, Dmitry Torokhov wrote: > > > > > > Hmm, I am confused. What should ->resume() method do when it fails to > > > resume a device? I would say kill it, espacially if it is > > > bus-generic > > > > That should basically never happen. You can kill the device, or maybe > > even panic() whole machine. It just should not happen. > > It can happen, it did happen and it will continue to happen. You hit > your suspend button and stand up and accidentially kick your USB hub > or your PS/2 mouse gets loose and kernel fails to resume it. It is no > big deal, we just have to be prepared to handle it. Ok, in case of hotpluggable devices, you just handle it in a similar way to surprise removal. Pavel -- Thanks, Sharp! ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 22:31 ` Pavel Machek @ 2005-12-27 22:39 ` Dmitry Torokhov 2005-12-27 22:44 ` Pavel Machek 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-27 22:39 UTC (permalink / raw) To: Pavel Machek; +Cc: Linux-pm mailing list On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > On Út 27-12-05 17:21:28, Dmitry Torokhov wrote: > > On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > > > On Út 27-12-05 16:45:17, Dmitry Torokhov wrote: > > > > > > > > Hmm, I am confused. What should ->resume() method do when it fails to > > > > resume a device? I would say kill it, espacially if it is > > > > bus-generic > > > > > > That should basically never happen. You can kill the device, or maybe > > > even panic() whole machine. It just should not happen. > > > > It can happen, it did happen and it will continue to happen. You hit > > your suspend button and stand up and accidentially kick your USB hub > > or your PS/2 mouse gets loose and kernel fails to resume it. It is no > > big deal, we just have to be prepared to handle it. > > Ok, in case of hotpluggable devices, you just handle it in a similar > way to surprise removal. Yes, and we go back to beginning - currently the driver does not know whether it is allowed to handle surprise removal or not because it does not know if it is resumed for real or just to be able to write suspend image. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 22:39 ` Dmitry Torokhov @ 2005-12-27 22:44 ` Pavel Machek 0 siblings, 0 replies; 69+ messages in thread From: Pavel Machek @ 2005-12-27 22:44 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list On Út 27-12-05 17:39:00, Dmitry Torokhov wrote: > On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > > On Út 27-12-05 17:21:28, Dmitry Torokhov wrote: > > > On 12/27/05, Pavel Machek <pavel@ucw.cz> wrote: > > > > On Út 27-12-05 16:45:17, Dmitry Torokhov wrote: > > > > > > > > > > Hmm, I am confused. What should ->resume() method do when it fails to > > > > > resume a device? I would say kill it, espacially if it is > > > > > bus-generic > > > > > > > > That should basically never happen. You can kill the device, or maybe > > > > even panic() whole machine. It just should not happen. > > > > > > It can happen, it did happen and it will continue to happen. You hit > > > your suspend button and stand up and accidentially kick your USB hub > > > or your PS/2 mouse gets loose and kernel fails to resume it. It is no > > > big deal, we just have to be prepared to handle it. > > > > Ok, in case of hotpluggable devices, you just handle it in a similar > > way to surprise removal. > > Yes, and we go back to beginning - currently the driver does not know > whether it is allowed to handle surprise removal or not because it > does not know if it is resumed for real or just to be able to write > suspend image. Would a flag passed down to the driver, telling it if it is "for real" or only as a part of suspend help? Pavel -- Thanks, Sharp! ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:45 ` Dmitry Torokhov 2005-12-27 22:08 ` Pavel Machek @ 2005-12-29 2:07 ` Alan Stern 1 sibling, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-29 2:07 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1027 bytes --] On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > Hmm, I am confused. What should ->resume() method do when it fails to > resume a device? I would say kill it, espacially if it is bus-generic > method. How else would we clean up after removed devices? And here we > have a problem because swsuspend likes to resume all devices before > writing the image and drivers don't rellay know if they are woken up > for real or just to complete suspend transition. If resume fails to resume a device, it should return a negative error code. It should not try to unregister the device; that action belongs to a different pathway. Cleaning up after removed devices should be handled the same way as we already do it -- that is, totally separate from suspend/resume. As for swsuspend resuming devices in order to write the image... I don't know how it currently handles a resume failure. Offhand it's not clear what could be done about a failure anyway. Probably resume failures at this point should simply be ignored. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-26 22:50 ` Dmitry Torokhov 2005-12-26 23:01 ` Alan Stern @ 2005-12-27 21:09 ` Patrick Mochel 2005-12-27 21:51 ` Dmitry Torokhov 2005-12-29 2:02 ` Alan Stern 1 sibling, 2 replies; 69+ messages in thread From: Patrick Mochel @ 2005-12-27 21:09 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list On Mon, 26 Dec 2005, Dmitry Torokhov wrote: > On Monday 26 December 2005 16:20, Patrick Mochel wrote: > > > > Unbinding and > > > > rebinding do a whole lot more. A device does an entire round trip through > > > > the driver core, which does a lot of crap. There are several memory > > > > allocations, perhaps dozens of sysfs files created, and a handful of locks > > > > taken adn lists traversed. > > > > > > No, you're thinking of unregistering and re-registering a device or > > > driver. Unbinding and rebinding is much simpler, although still somewhat > > > complex. > > > > Ah, point taken; I was thinking that all USB devices were still > > disconnected during a suspend transition.. > > > > Device removal could happen at any point, even during suspend transition. > The kernel should be able to handle this scenario therefore implementation > that assumes that device tree is frozen in flawed. As far as I understand > the only thing that does not work at the moment is invoking hotplug handler. I don't think that the tree should be assumed to be frozen. I just think that we should try to avoid doing a full add or remove of a device (or really, any object) while in a suspend or resume transition. During normal system operation, how does a USB device get removed? Does it happen via the USB hub thread? Forgive me for asking a question that's probably been asked countless times, but is this thread running during a suspend transition? Would it be possible to simply mark the device as 'removed' and ignore it until we resumed, and then clean it up (hotplug events and everything)? Thanks, Patrick ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:09 ` Patrick Mochel @ 2005-12-27 21:51 ` Dmitry Torokhov 2005-12-28 4:36 ` Patrick Mochel 2005-12-29 2:10 ` Alan Stern 2005-12-29 2:02 ` Alan Stern 1 sibling, 2 replies; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-27 21:51 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list On 12/27/05, Patrick Mochel <mochel@digitalimplant.org> wrote: > > On Mon, 26 Dec 2005, Dmitry Torokhov wrote: > > > On Monday 26 December 2005 16:20, Patrick Mochel wrote: > > > > > Unbinding and > > > > > rebinding do a whole lot more. A device does an entire round trip through > > > > > the driver core, which does a lot of crap. There are several memory > > > > > allocations, perhaps dozens of sysfs files created, and a handful of locks > > > > > taken adn lists traversed. > > > > > > > > No, you're thinking of unregistering and re-registering a device or > > > > driver. Unbinding and rebinding is much simpler, although still somewhat > > > > complex. > > > > > > Ah, point taken; I was thinking that all USB devices were still > > > disconnected during a suspend transition.. > > > > > > > Device removal could happen at any point, even during suspend transition. > > The kernel should be able to handle this scenario therefore implementation > > that assumes that device tree is frozen in flawed. As far as I understand > > the only thing that does not work at the moment is invoking hotplug handler. > > I don't think that the tree should be assumed to be frozen. I just think > that we should try to avoid doing a full add or remove of a device (or > really, any object) while in a suspend or resume transition. > > During normal system operation, how does a USB device get removed? Does it > happen via the USB hub thread? Forgive me for asking a question that's > probably been asked countless times, but is this thread running during a > suspend transition? > That's Greg/Alan question, I am ignorant here. > Would it be possible to simply mark the device as 'removed' and ignore it > until we resumed, and then clean it up (hotplug events and everything)? > Unfortunately swsusp resumes devices in the middle of suspend process with everything frozen and drivers don't know if they may clean up or have to postpone doing so. To do it uniformly you'd need to introduce threads and offload cleanups. I doubt it is good idea to require each subsystem to define cleanup thread or [ab]used keventd? -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:51 ` Dmitry Torokhov @ 2005-12-28 4:36 ` Patrick Mochel 2005-12-28 6:11 ` Dmitry Torokhov ` (2 more replies) 2005-12-29 2:10 ` Alan Stern 1 sibling, 3 replies; 69+ messages in thread From: Patrick Mochel @ 2005-12-28 4:36 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 2244 bytes --] On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > On 12/27/05, Patrick Mochel <mochel@digitalimplant.org> wrote: > > Would it be possible to simply mark the device as 'removed' and ignore it > > until we resumed, and then clean it up (hotplug events and everything)? > > > > Unfortunately swsusp resumes devices in the middle of suspend process > with everything frozen and drivers don't know if they may clean up or > have to postpone doing so. To do it uniformly you'd need to introduce > threads and offload cleanups. I doubt it is good idea to require each > subsystem to define cleanup thread or [ab]used keventd? Blech, I forgot about the caveman-like way of making sure the swap device is resumed. :) Nevertheless, it shouldn't matter, and you bring up a point that exists, and is actually more prevalent - the case of trying to resume a device that doesn't exist any more. Last week, I suggested that all devices that are thought to be present should be resumed. Those that have gone away should be removed once the system is resumed, then new devices should be discovered. In this, I am implying that devices that are thought to be there, but really aren't (have been removed, unplugged, or undocked) should be marked as such (somehow) and taken care of later. When swsusp resumes every device before writing the image, it should ignore devices that aren't present anymore. They can still exist in software (in the various object hiearchies), but they should just be skipped. Ditto for when we 'suspend' them again, and on the real resume. When the system is back up and userspace is started again, something should walk the device tree and reap the devices that are no longer present. The core could do this, so long as the flag was in struct device, though I'm not sure that it would know about the hooks to trigger a removal on the device's bus (or would it need to?). Then, we can free all the memory, do all the unregistering, and generate all the hotplug events. Of course, at that point, we'll need to discover any new devices, which is going to be bus-speciifc. But, with a few extra bits of infrastructure (per-bus objects, and a ->scan() method in struct bus_type), this will relatively simple.. Thoughts? Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-28 4:36 ` Patrick Mochel @ 2005-12-28 6:11 ` Dmitry Torokhov 2005-12-28 10:59 ` Rafael J. Wysocki 2005-12-29 2:13 ` Alan Stern 2 siblings, 0 replies; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-28 6:11 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list On Tuesday 27 December 2005 23:36, Patrick Mochel wrote: > > On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > > > On 12/27/05, Patrick Mochel <mochel@digitalimplant.org> wrote: > > > > Would it be possible to simply mark the device as 'removed' and ignore it > > > until we resumed, and then clean it up (hotplug events and everything)? > > > > > > > Unfortunately swsusp resumes devices in the middle of suspend process > > with everything frozen and drivers don't know if they may clean up or > > have to postpone doing so. To do it uniformly you'd need to introduce > > threads and offload cleanups. I doubt it is good idea to require each > > subsystem to define cleanup thread or [ab]used keventd? > > Blech, I forgot about the caveman-like way of making sure the swap device > is resumed. :) > > Nevertheless, it shouldn't matter, and you bring up a point that exists, > and is actually more prevalent - the case of trying to resume a device > that doesn't exist any more. > > Last week, I suggested that all devices that are thought to be present > should be resumed. Those that have gone away should be removed once the > system is resumed, then new devices should be discovered. In this, I am > implying that devices that are thought to be there, but really aren't > (have been removed, unplugged, or undocked) should be marked as such > (somehow) and taken care of later. > > When swsusp resumes every device before writing the image, it should > ignore devices that aren't present anymore. They can still exist in > software (in the various object hiearchies), but they should just be > skipped. Ditto for when we 'suspend' them again, and on the real resume. > > When the system is back up and userspace is started again, something > should walk the device tree and reap the devices that are no longer > present. The core could do this, so long as the flag was in struct device, > though I'm not sure that it would know about the hooks to trigger a > removal on the device's bus (or would it need to?). Then, we can free all > the memory, do all the unregistering, and generate all the hotplug events. > > Of course, at that point, we'll need to discover any new devices, which is > going to be bus-speciifc. But, with a few extra bits of infrastructure > (per-bus objects, and a ->scan() method in struct bus_type), this will > relatively simple.. > > Thoughts? > I suppose... New devices will indeed have to be bus-specific but reaping function I think can be same for everyone - just call device_unregister() for every device that is marked as 'dead'. Unregister usually ignores all errors anyway. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-28 4:36 ` Patrick Mochel 2005-12-28 6:11 ` Dmitry Torokhov @ 2005-12-28 10:59 ` Rafael J. Wysocki 2005-12-29 2:18 ` Alan Stern 2005-12-29 2:13 ` Alan Stern 2 siblings, 1 reply; 69+ messages in thread From: Rafael J. Wysocki @ 2005-12-28 10:59 UTC (permalink / raw) To: linux-pm [-- Attachment #1: Type: text/plain, Size: 3090 bytes --] Hi, On Wednesday, 28 December 2005 05:36, Patrick Mochel wrote: > > On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > > > On 12/27/05, Patrick Mochel <mochel@digitalimplant.org> wrote: > > > > Would it be possible to simply mark the device as 'removed' and ignore it > > > until we resumed, and then clean it up (hotplug events and everything)? > > > > > > > Unfortunately swsusp resumes devices in the middle of suspend process > > with everything frozen and drivers don't know if they may clean up or > > have to postpone doing so. To do it uniformly you'd need to introduce > > threads and offload cleanups. I doubt it is good idea to require each > > subsystem to define cleanup thread or [ab]used keventd? > > Blech, I forgot about the caveman-like way of making sure the swap device > is resumed. :) > > Nevertheless, it shouldn't matter, and you bring up a point that exists, > and is actually more prevalent - the case of trying to resume a device > that doesn't exist any more. > > Last week, I suggested that all devices that are thought to be present > should be resumed. Those that have gone away should be removed once the > system is resumed, then new devices should be discovered. In this, I am > implying that devices that are thought to be there, but really aren't > (have been removed, unplugged, or undocked) should be marked as such > (somehow) and taken care of later. > > When swsusp resumes every device before writing the image, it should > ignore devices that aren't present anymore. They can still exist in > software (in the various object hiearchies), but they should just be > skipped. Ditto for when we 'suspend' them again, and on the real resume. > > When the system is back up and userspace is started again, something > should walk the device tree and reap the devices that are no longer > present. The core could do this, so long as the flag was in struct device, > though I'm not sure that it would know about the hooks to trigger a > removal on the device's bus (or would it need to?). Then, we can free all > the memory, do all the unregistering, and generate all the hotplug events. > > Of course, at that point, we'll need to discover any new devices, which is > going to be bus-speciifc. But, with a few extra bits of infrastructure > (per-bus objects, and a ->scan() method in struct bus_type), this will > relatively simple.. > > Thoughts? I think for a (suspended) device that can be removed, unplugged, undocked, etc. (call it "removable") the most natural place in which we can detect that the device is no longer accessible is the device driver's .resume() routine, at least as far as swsusp is concerned. IMO .resume() should check if the device is still there and if not, it should put the device on a list of "no-longer-present devices" and just return. After the .resume() routines of all devices have completed, the list can be processed by something (a kernel thread?) that will do the changes on whatever level is necessary. Greetings, Rafael -- Beer is proof that God loves us and wants us to be happy - Benjamin Franklin [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-28 10:59 ` Rafael J. Wysocki @ 2005-12-29 2:18 ` Alan Stern 2005-12-29 2:55 ` Nigel Cunningham 2006-01-04 23:31 ` Pavel Machek 0 siblings, 2 replies; 69+ messages in thread From: Alan Stern @ 2005-12-29 2:18 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-pm [-- Attachment #1: Type: TEXT/PLAIN, Size: 1357 bytes --] On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > I think for a (suspended) device that can be removed, unplugged, undocked, > etc. (call it "removable") the most natural place in which we can detect > that the device is no longer accessible is the device driver's .resume() > routine, at least as far as swsusp is concerned. No. The most natural place in which we can detect that a device is no longer accessible is the place where we already do these detections. Not in the resume routine. > IMO .resume() should check if the device is still there and if not, it should > put the device on a list of "no-longer-present devices" and just return. > After the .resume() routines of all devices have completed, the list can > be processed by something (a kernel thread?) that will do the changes > on whatever level is necessary. Often the device-specific resume routine doesn't have the information needed for checking whether the device is still there. Such checks are done by generic bus-oriented routines. Yes, the bus's resume handler can do such a check. Why should it bother, when other parts of the bus code will detect the device removal in the normal course of events? Or alternatively, why shouldn't the bus's resume handler simply invoke those other parts of the bus code when it determines that a device has been removed? Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:18 ` Alan Stern @ 2005-12-29 2:55 ` Nigel Cunningham 2005-12-29 11:29 ` Rafael J. Wysocki 2005-12-30 17:46 ` Alan Stern 2006-01-04 23:31 ` Pavel Machek 1 sibling, 2 replies; 69+ messages in thread From: Nigel Cunningham @ 2005-12-29 2:55 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1636 bytes --] Hi Alan. On Thu, 2005-12-29 at 12:18, Alan Stern wrote: > On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > > > I think for a (suspended) device that can be removed, unplugged, undocked, > > etc. (call it "removable") the most natural place in which we can detect > > that the device is no longer accessible is the device driver's .resume() > > routine, at least as far as swsusp is concerned. > > No. The most natural place in which we can detect that a device is no > longer accessible is the place where we already do these detections. Not > in the resume routine. > > > IMO .resume() should check if the device is still there and if not, it should > > put the device on a list of "no-longer-present devices" and just return. > > After the .resume() routines of all devices have completed, the list can > > be processed by something (a kernel thread?) that will do the changes > > on whatever level is necessary. > > Often the device-specific resume routine doesn't have the information > needed for checking whether the device is still there. Such checks are > done by generic bus-oriented routines. > > Yes, the bus's resume handler can do such a check. Why should it bother, > when other parts of the bus code will detect the device removal in the > normal course of events? Or alternatively, why shouldn't the bus's resume > handler simply invoke those other parts of the bus code when it determines > that a device has been removed? I guess because the resume routine is almost certainly the first piece of code that will try to access the hardware (and possibly choke and die it it's not there). Regards, Nigel [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:55 ` Nigel Cunningham @ 2005-12-29 11:29 ` Rafael J. Wysocki 2005-12-30 17:46 ` Alan Stern 1 sibling, 0 replies; 69+ messages in thread From: Rafael J. Wysocki @ 2005-12-29 11:29 UTC (permalink / raw) To: ncunningham; +Cc: Linux-pm mailing list [-- Attachment #1: Type: text/plain, Size: 1784 bytes --] Hi, On Thursday, 29 December 2005 03:55, Nigel Cunningham wrote: > On Thu, 2005-12-29 at 12:18, Alan Stern wrote: > > On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > > > > > I think for a (suspended) device that can be removed, unplugged, undocked, > > > etc. (call it "removable") the most natural place in which we can detect > > > that the device is no longer accessible is the device driver's .resume() > > > routine, at least as far as swsusp is concerned. > > > > No. The most natural place in which we can detect that a device is no > > longer accessible is the place where we already do these detections. Not > > in the resume routine. > > > > > IMO .resume() should check if the device is still there and if not, it should > > > put the device on a list of "no-longer-present devices" and just return. > > > After the .resume() routines of all devices have completed, the list can > > > be processed by something (a kernel thread?) that will do the changes > > > on whatever level is necessary. > > > > Often the device-specific resume routine doesn't have the information > > needed for checking whether the device is still there. Such checks are > > done by generic bus-oriented routines. > > > > Yes, the bus's resume handler can do such a check. Why should it bother, > > when other parts of the bus code will detect the device removal in the > > normal course of events? Or alternatively, why shouldn't the bus's resume > > handler simply invoke those other parts of the bus code when it determines > > that a device has been removed? > > I guess because the resume routine is almost certainly the first piece > of code that will try to access the hardware (and possibly choke and die > it it's not there). This is what I had in mind. Greetings, Rafael [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:55 ` Nigel Cunningham 2005-12-29 11:29 ` Rafael J. Wysocki @ 2005-12-30 17:46 ` Alan Stern 1 sibling, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-30 17:46 UTC (permalink / raw) To: Nigel Cunningham; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1256 bytes --] On Thu, 29 Dec 2005, Nigel Cunningham wrote: > > Often the device-specific resume routine doesn't have the information > > needed for checking whether the device is still there. Such checks are > > done by generic bus-oriented routines. > > > > Yes, the bus's resume handler can do such a check. Why should it bother, > > when other parts of the bus code will detect the device removal in the > > normal course of events? Or alternatively, why shouldn't the bus's resume > > handler simply invoke those other parts of the bus code when it determines > > that a device has been removed? > > I guess because the resume routine is almost certainly the first piece > of code that will try to access the hardware (and possibly choke and die > it it's not there). Obviously we want the resume routine to be robust enough not to die if the device was removed while the system was suspended. :-) As for the rest, however, it simply is not resume()'s job to check whether the device is still there. That job belongs to another piece of code. Consider this: What would happen if the system _hadn't_ been suspended when the device was removed? Why shouldn't the same thing happen if the system _is_ suspended when the device is removed? Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:18 ` Alan Stern 2005-12-29 2:55 ` Nigel Cunningham @ 2006-01-04 23:31 ` Pavel Machek 2006-01-05 16:07 ` Alan Stern 1 sibling, 1 reply; 69+ messages in thread From: Pavel Machek @ 2006-01-04 23:31 UTC (permalink / raw) To: Alan Stern; +Cc: linux-pm [-- Attachment #1: Type: text/plain, Size: 699 bytes --] On St 28-12-05 21:18:38, Alan Stern wrote: > On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > > > I think for a (suspended) device that can be removed, unplugged, undocked, > > etc. (call it "removable") the most natural place in which we can detect > > that the device is no longer accessible is the device driver's .resume() > > routine, at least as far as swsusp is concerned. > > No. The most natural place in which we can detect that a device is no > longer accessible is the place where we already do these detections. Not > in the resume routine. ...for reasonable buses, like usb. For something like ps/2, .resume is the place to check, I'm afraid. Pavel -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2006-01-04 23:31 ` Pavel Machek @ 2006-01-05 16:07 ` Alan Stern 2006-01-05 19:05 ` Pavel Machek 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2006-01-05 16:07 UTC (permalink / raw) To: Pavel Machek; +Cc: linux-pm [-- Attachment #1: Type: TEXT/PLAIN, Size: 1218 bytes --] On Thu, 5 Jan 2006, Pavel Machek wrote: > On St 28-12-05 21:18:38, Alan Stern wrote: > > On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > > > > > I think for a (suspended) device that can be removed, unplugged, undocked, > > > etc. (call it "removable") the most natural place in which we can detect > > > that the device is no longer accessible is the device driver's .resume() > > > routine, at least as far as swsusp is concerned. > > > > No. The most natural place in which we can detect that a device is no > > longer accessible is the place where we already do these detections. Not > > in the resume routine. > > ...for reasonable buses, like usb. For something like ps/2, .resume is > the place to check, I'm afraid. It's _a_ place to check. On the other hand, since there's no checking anywhere else, why does there need to be a check in .resume? If people insist on detecting device removal during resume, then the driver and PM cores will have to be able to cope with it. Dumping everything into keventd's lap is one approach. Or we could define a list of devices awaiting unregistration; the list could be scanned every time a system resume or runtime resume procedure completes. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2006-01-05 16:07 ` Alan Stern @ 2006-01-05 19:05 ` Pavel Machek 0 siblings, 0 replies; 69+ messages in thread From: Pavel Machek @ 2006-01-05 19:05 UTC (permalink / raw) To: Alan Stern; +Cc: linux-pm [-- Attachment #1: Type: text/plain, Size: 686 bytes --] On Thu 05-01-06 11:07:52, Alan Stern wrote: > On Thu, 5 Jan 2006, Pavel Machek wrote: > > On St 28-12-05 21:18:38, Alan Stern wrote: > > > On Wed, 28 Dec 2005, Rafael J. Wysocki wrote: > > > > ...for reasonable buses, like usb. For something like ps/2, .resume is > > the place to check, I'm afraid. > > It's _a_ place to check. On the other hand, since there's no checking > anywhere else, why does there need to be a check in .resume? You are not normally sending commands to ps/2 mouse, except when initializing it. And you have to re-initialize it during .resume. You'll notice that commands time out... ...well, you *could* just ignore the timeouts :-). -- Thanks, Sharp! [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-28 4:36 ` Patrick Mochel 2005-12-28 6:11 ` Dmitry Torokhov 2005-12-28 10:59 ` Rafael J. Wysocki @ 2005-12-29 2:13 ` Alan Stern 2 siblings, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-29 2:13 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1023 bytes --] On Tue, 27 Dec 2005, Patrick Mochel wrote: > When the system is back up and userspace is started again, something > should walk the device tree and reap the devices that are no longer > present. The core could do this, so long as the flag was in struct device, > though I'm not sure that it would know about the hooks to trigger a > removal on the device's bus (or would it need to?). Then, we can free all > the memory, do all the unregistering, and generate all the hotplug events. Why do we need to do this? Why can't we rely on devices being unregistered by the same mechanism that would operate in the absence of any suspend/resume? > Of course, at that point, we'll need to discover any new devices, which is > going to be bus-speciifc. But, with a few extra bits of infrastructure > (per-bus objects, and a ->scan() method in struct bus_type), this will > relatively simple.. Again, why do we need anything special? What's wrong with the existing method for discovering new hotplugged devices? Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:51 ` Dmitry Torokhov 2005-12-28 4:36 ` Patrick Mochel @ 2005-12-29 2:10 ` Alan Stern 2005-12-29 5:20 ` Dmitry Torokhov 1 sibling, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-29 2:10 UTC (permalink / raw) To: dtor_core; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 952 bytes --] On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > > Would it be possible to simply mark the device as 'removed' and ignore it > > until we resumed, and then clean it up (hotplug events and everything)? > > > > Unfortunately swsusp resumes devices in the middle of suspend process > with everything frozen and drivers don't know if they may clean up or > have to postpone doing so. Drivers should strive to avoid cleaning up after removed devices during their resume methods in any case. Such cleanups shouldn't happen until userspace has been unfrozen. > To do it uniformly you'd need to introduce > threads and offload cleanups. I doubt it is good idea to require each > subsystem to define cleanup thread or [ab]used keventd? Why do you need a new cleanup thread? What's wrong with the existing strategy for handling removed devices -- i.e., the procedure that would have been followed if there was no suspend/resume transition? Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:10 ` Alan Stern @ 2005-12-29 5:20 ` Dmitry Torokhov 2005-12-30 18:26 ` Alan Stern 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-29 5:20 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On Wednesday 28 December 2005 21:10, Alan Stern wrote: > On Tue, 27 Dec 2005, Dmitry Torokhov wrote: > > > > Would it be possible to simply mark the device as 'removed' and ignore it > > > until we resumed, and then clean it up (hotplug events and everything)? > > > > > > > Unfortunately swsusp resumes devices in the middle of suspend process > > with everything frozen and drivers don't know if they may clean up or > > have to postpone doing so. > > Drivers should strive to avoid cleaning up after removed devices during > their resume methods in any case. Such cleanups shouldn't happen until > userspace has been unfrozen. > > > To do it uniformly you'd need to introduce > > threads and offload cleanups. I doubt it is good idea to require each > > subsystem to define cleanup thread or [ab]used keventd? > > Why do you need a new cleanup thread? What's wrong with the existing > strategy for handling removed devices -- i.e., the procedure that would > have been followed if there was no suspend/resume transition? > Sometimes there is no such procedure. For example if you yank PS/2 mouse there is no way for the system to detect it's missing. Only when you try to resume and mouse does not respond you realize it's gone. I bet there are other devices like that. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 5:20 ` Dmitry Torokhov @ 2005-12-30 18:26 ` Alan Stern 2005-12-30 19:18 ` Dmitry Torokhov 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-30 18:26 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 949 bytes --] On Thu, 29 Dec 2005, Dmitry Torokhov wrote: > > Why do you need a new cleanup thread? What's wrong with the existing > > strategy for handling removed devices -- i.e., the procedure that would > > have been followed if there was no suspend/resume transition? > > > > Sometimes there is no such procedure. For example if you yank PS/2 mouse > there is no way for the system to detect it's missing. Only when you try > to resume and mouse does not respond you realize it's gone. I bet there > are other devices like that. There may indeed be other devices like that. If you can't handle device removal while the system is running normally, why should you be able to handle device removal while the system is suspended? Conversely, even if the resume routine does detect somehow that the device was removed, there's nothing it can do with that information if you don't have a mechanism in place for handling device removals. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 18:26 ` Alan Stern @ 2005-12-30 19:18 ` Dmitry Torokhov 2005-12-30 19:26 ` Alan Stern 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-30 19:18 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On Friday 30 December 2005 13:26, Alan Stern wrote: > On Thu, 29 Dec 2005, Dmitry Torokhov wrote: > > > > Why do you need a new cleanup thread? What's wrong with the existing > > > strategy for handling removed devices -- i.e., the procedure that would > > > have been followed if there was no suspend/resume transition? > > > > > > > Sometimes there is no such procedure. For example if you yank PS/2 mouse > > there is no way for the system to detect it's missing. Only when you try > > to resume and mouse does not respond you realize it's gone. I bet there > > are other devices like that. > > There may indeed be other devices like that. If you can't handle device > removal while the system is running normally, why should you be able to > handle device removal while the system is suspended? > Because I can? > Conversely, even if the resume routine does detect somehow that the device > was removed, there's nothing it can do with that information if you don't > have a mechanism in place for handling device removals. > The subsystem can handle device removal, its just hardware does not give any notification when device is removed so subsystem has to rely on some external event (such as resume process) to query the device and check if it is still present. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 19:18 ` Dmitry Torokhov @ 2005-12-30 19:26 ` Alan Stern 2005-12-30 19:42 ` Dmitry Torokhov 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-30 19:26 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list On Fri, 30 Dec 2005, Dmitry Torokhov wrote: > > There may indeed be other devices like that. If you can't handle device > > removal while the system is running normally, why should you be able to > > handle device removal while the system is suspended? > > > > Because I can? > > > Conversely, even if the resume routine does detect somehow that the device > > was removed, there's nothing it can do with that information if you don't > > have a mechanism in place for handling device removals. > > > > The subsystem can handle device removal, its just hardware does not > give any notification when device is removed so subsystem has to rely > on some external event (such as resume process) to query the device > and check if it is still present. In that case, I would say that it is appropriate to use keventd or some other worker thread to handle device removal detected during resume. Since these threads will be frozen along with everything else, the unregister calls wouldn't be made until the system was once again awake. If you were really serious about it, you could even set up a work queue to query each input device once per minute (say), to see whether it had gotten removed. But that's probably overkill. Alan Stern ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 19:26 ` Alan Stern @ 2005-12-30 19:42 ` Dmitry Torokhov 2005-12-30 20:13 ` Alan Stern 0 siblings, 1 reply; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-30 19:42 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list On Friday 30 December 2005 14:26, Alan Stern wrote: > On Fri, 30 Dec 2005, Dmitry Torokhov wrote: > > > > There may indeed be other devices like that. If you can't handle device > > > removal while the system is running normally, why should you be able to > > > handle device removal while the system is suspended? > > > > > > > Because I can? > > > > > Conversely, even if the resume routine does detect somehow that the device > > > was removed, there's nothing it can do with that information if you don't > > > have a mechanism in place for handling device removals. > > > > > > > The subsystem can handle device removal, its just hardware does not > > give any notification when device is removed so subsystem has to rely > > on some external event (such as resume process) to query the device > > and check if it is still present. > > In that case, I would say that it is appropriate to use keventd or some > other worker thread to handle device removal detected during resume. > Since these threads will be frozen along with everything else, the > unregister calls wouldn't be made until the system was once again awake. > Input (well, actually serio) devices are already handled via kseriod thread, but that's because it was already there. I was saying that driver core should not require a reaper thread if there isn't one yet. > If you were really serious about it, you could even set up a work queue > to query each input device once per minute (say), to see whether it had > gotten removed. But that's probably overkill. Not only this is overkill but there are many devices with different ways to check for presence and quite often queies may disturb the device (i.e. right when you decide to tap on your touchpad my thread will decide that it is time for querying and will disable it, query and re-enable and that tap will be lost). -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 19:42 ` Dmitry Torokhov @ 2005-12-30 20:13 ` Alan Stern 0 siblings, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-30 20:13 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: Linux-pm mailing list On Fri, 30 Dec 2005, Dmitry Torokhov wrote: > > In that case, I would say that it is appropriate to use keventd or some > > other worker thread to handle device removal detected during resume. > > Since these threads will be frozen along with everything else, the > > unregister calls wouldn't be made until the system was once again awake. > > > > Input (well, actually serio) devices are already handled via kseriod thread, > but that's because it was already there. I was saying that driver core should > not require a reaper thread if there isn't one yet. The driver core doesn't make this requirement. If you don't want to reap those devices during resume then you don't need the thread. And after all, if you don't reap the devices at other times then why bother reaping them during resume? Alan Stern ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-27 21:09 ` Patrick Mochel 2005-12-27 21:51 ` Dmitry Torokhov @ 2005-12-29 2:02 ` Alan Stern 2005-12-30 18:34 ` Patrick Mochel 1 sibling, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-29 2:02 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1686 bytes --] On Tue, 27 Dec 2005, Patrick Mochel wrote: > I don't think that the tree should be assumed to be frozen. I just think > that we should try to avoid doing a full add or remove of a device (or > really, any object) while in a suspend or resume transition. > > During normal system operation, how does a USB device get removed? Does it > happen via the USB hub thread? Forgive me for asking a question that's > probably been asked countless times, but is this thread running during a > suspend transition? USB device creation and removal is handled by the khubd thread. That thread gets frozen along with everything else during suspends. > Would it be possible to simply mark the device as 'removed' and ignore it > until we resumed, and then clean it up (hotplug events and everything)? Internally that's what the USB core does do. When we know that a device has gone away during a sleep transition (generally because the USB host controller has lost power while it was suspended), the device structure is marked with USB_STATE_NOTATTACHED. All future I/O operations to it will fail, future resume requests will return 0 without actually doing anything, and future suspend requests will fail. When the khubd thread is eventually unfrozen, it goes through the device unregistration procedure. Note that USB device drivers do not receive any special notification when a device goes to NOTATTACHED. Of course they can find out for themselves by checking the state field in the device structure, but I'm not aware of any drivers actually doing so. If asked to perform I/O (to write out a memory image, for example), they would simply generate a bunch of I/O errors. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-29 2:02 ` Alan Stern @ 2005-12-30 18:34 ` Patrick Mochel 2005-12-30 19:21 ` Alan Stern 0 siblings, 1 reply; 69+ messages in thread From: Patrick Mochel @ 2005-12-30 18:34 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 2072 bytes --] On Wed, 28 Dec 2005, Alan Stern wrote: > On Tue, 27 Dec 2005, Patrick Mochel wrote: > > > I don't think that the tree should be assumed to be frozen. I just think > > that we should try to avoid doing a full add or remove of a device (or > > really, any object) while in a suspend or resume transition. > > > > During normal system operation, how does a USB device get removed? Does it > > happen via the USB hub thread? Forgive me for asking a question that's > > probably been asked countless times, but is this thread running during a > > suspend transition? > > USB device creation and removal is handled by the khubd thread. That > thread gets frozen along with everything else during suspends. > > > Would it be possible to simply mark the device as 'removed' and ignore it > > until we resumed, and then clean it up (hotplug events and everything)? > > Internally that's what the USB core does do. When we know that a device > has gone away during a sleep transition (generally because the USB host > controller has lost power while it was suspended), the device structure is > marked with USB_STATE_NOTATTACHED. All future I/O operations to it will > fail, future resume requests will return 0 without actually doing > anything, and future suspend requests will fail. When the khubd thread is > eventually unfrozen, it goes through the device unregistration procedure. Ah, perfect. So, it actually scans the USB device lists on resume? I thought it relied on a notification via an interrupt? > Note that USB device drivers do not receive any special notification when > a device goes to NOTATTACHED. Of course they can find out for themselves > by checking the state field in the device structure, but I'm not aware of > any drivers actually doing so. If asked to perform I/O (to write out a > memory image, for example), they would simply generate a bunch of I/O > errors. That makes sense. We just have to make sure that the driver actually does proper error handling on suspend (and actually fails if the device doesn't respond). Thanks, Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 18:34 ` Patrick Mochel @ 2005-12-30 19:21 ` Alan Stern 2006-01-02 23:49 ` Patrick Mochel 0 siblings, 1 reply; 69+ messages in thread From: Alan Stern @ 2005-12-30 19:21 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1652 bytes --] On Fri, 30 Dec 2005, Patrick Mochel wrote: > > > Would it be possible to simply mark the device as 'removed' and ignore it > > > until we resumed, and then clean it up (hotplug events and everything)? > > > > Internally that's what the USB core does do. When we know that a device > > has gone away during a sleep transition (generally because the USB host > > controller has lost power while it was suspended), the device structure is > > marked with USB_STATE_NOTATTACHED. All future I/O operations to it will > > fail, future resume requests will return 0 without actually doing > > anything, and future suspend requests will fail. When the khubd thread is > > eventually unfrozen, it goes through the device unregistration procedure. > > Ah, perfect. So, it actually scans the USB device lists on resume? I > thought it relied on a notification via an interrupt? Khubd does not exactly scan device lists on resume. What happens is that we set a bit in the device's parent hub's private data structure, indicating that there was a connect-change on the device's port, and we put the data structure on a list of hubs needing attention. Khubd continually scans this hub-event list whenever it isn't asleep, so it sees that the device has been disconnected as soon as it gets unfrozen. In normal operation khubd operates not so much by interrupts as by polling hubs. (The polling is set up once and then handled automatically by the USB host controller hardware.) However for managing root hubs we do use either timer-driven polling or interrupt-driven notifications -- some host controller drivers use one, some use the other. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-30 19:21 ` Alan Stern @ 2006-01-02 23:49 ` Patrick Mochel 0 siblings, 0 replies; 69+ messages in thread From: Patrick Mochel @ 2006-01-02 23:49 UTC (permalink / raw) To: Alan Stern; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 1066 bytes --] On Fri, 30 Dec 2005, Alan Stern wrote: > Khubd does not exactly scan device lists on resume. What happens is that > we set a bit in the device's parent hub's private data structure, > indicating that there was a connect-change on the device's port, and we > put the data structure on a list of hubs needing attention. Khubd > continually scans this hub-event list whenever it isn't asleep, so it sees > that the device has been disconnected as soon as it gets unfrozen. Getting back to the original topic about unbinding and rebinding... IMO, it would be best to not unbind and rebind devices, but rather do the steps necessary to quiesce the driver and/or the device. But, in the event that that can't happen (or in the interim), could a similar mechanism be used for unbind events that are used for removals - i.e. setting a bit in the device's (or related) data structure indicating that the driver has been unbound -- and that work needs to be done -- and then having khubd process it (and generate the appropriate events) on resume? Thanks, Patrick [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-26 21:20 ` Patrick Mochel 2005-12-26 22:50 ` Dmitry Torokhov @ 2005-12-26 23:25 ` Alan Stern 1 sibling, 0 replies; 69+ messages in thread From: Alan Stern @ 2005-12-26 23:25 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list [-- Attachment #1: Type: TEXT/PLAIN, Size: 4009 bytes --] On Mon, 26 Dec 2005, Patrick Mochel wrote: > Nevertheless, think about both of the things that we are talking about - > quiescing the driver and quiescing the device. Both of those things happen > by unbinding the driver - the driver stops the I/O requests and the device > is put into a non-talkative state. However, there are other things that > happen as a side effect, which I think can (and eventually will) produce > negative results. > > I think that we can achieve the things that we talk about in a simpler > manner, which will also happen to be more elegant and more efficient. > Instead of unbinding, what if we simply issued a request to the device to > halt I/O transactions and put the device in a quiescent state? > > We could call this method ->stop() and put it in struct device_driver. > Since it is in the core driver structure, it would be filled in and > intercepted by the different bus driver cores. The USB core would receive > the call, prevent any future I/O transactions, then call down to the > (optionally present) driver-specific ->stop(). > > [ Of course, it looks good on paper, and you may know some of the nasty > details of USB that would prevent something like this. ] In fact I do, although it's not nasty. With USB interfaces (USB devices too, but we're not concerned with them at the moment) _all_ I/O is initiated by the host. There's no need to send a special message to make the interface stop doing I/O; the driver merely needs to stop making transfers. So for USB anyway, your proposed stop() method is redundant. It would do nothing more or less than the suspend() method already does. (I have ignored matters of long-running device activities that get started by an I/O request and then keep on going, like playing an audio CD. For both stop() and suspend() we would want the driver to cancel all such ongoing activities.) There is one question here I'm not clear about. You suggested that the USB core would prevent I/O transactions and then call the driver's method. That raises the possibility that some transactions could fail for no good reason (as far as the driver can tell). Alternatively, the core could call the method first. That leaves the possibility of accepting some I/O that should have been refused. It's a race, and I don't know which way would be best. > Then again, you could initially implement this for USB only, then move it > into the core once we all decide that it makes sense to propogate > outwards. I'm not convinced that start() and stop() methods are generally applicable. Many drivers do their work only when they have something to do. There's no reason ever to stop() a printer driver, for instance -- you just don't send data to that printer. It's only devices like network interfaces (which go ahead doing work all on their own as opposed to when somebody asks for it) that can utilize these methods. Maybe there are good reasons for implementing them with arbitrary devices, but I haven't heard any. > > .) So now what should we do? > > > > Require userspace to rmmod usblp before suspending? > > > > Add suspend and resume methods to usblp, but make it so they > > only cancel outstanding I/O, while relying on usbcore to fail > > any new I/O requests? > > The latter seems to make the most sense - it is partitioning the problem > between things which are specific to the device (canceling pending I/O, > assuming that it requires issuing commands to the devices) and will be > generic to all like devices (preventing future I/O, assuming that's a > software flag that is set/checked). > > Because the USB core intercepts and forwards the ->suspend()/->resume() > calls (and possibly ->start()/->stop()), it has the ability to do any pre- > and post- work that is generic to all or most devices, which is good > because it prevents unnecessary duplication of work in the various > drivers. > > Does that help at all? Yes, okay, I'll work on doing things that way. Alan Stern [-- Attachment #2: Type: text/plain, Size: 0 bytes --] ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: Re: Hotplug events during sleep transition 2005-12-23 15:35 ` Patrick Mochel 2005-12-23 16:52 ` Alan Stern @ 2005-12-23 18:32 ` Dmitry Torokhov 1 sibling, 0 replies; 69+ messages in thread From: Dmitry Torokhov @ 2005-12-23 18:32 UTC (permalink / raw) To: Patrick Mochel; +Cc: Linux-pm mailing list On Friday 23 December 2005 10:35, Patrick Mochel wrote: > > > Quite often unbinding is easiest and also correct way. I would love to be > > > able to just unbind serio port/input device and have it recreated later. > > > Unfortunately X/GPM do not [yet?] support hotplugging of devices so kernel > > > has to compensate. > > > > I agree with Dmitry. Many drivers don't need to do anything special for > > suspend, or only need to cancel an outstanding input request. Rather than > > go through every single driver and add a minimal (and possibly erroneous) > > suspend routine, it's much easier just to unbind these drivers. > > So why not fix the subsystems to do the equivalent of an unbinding, from > a tear-down and reinitialization standpoint? That way, you don't have to > force the core to contort itself simply for the fact that you want to push > in a thumbtack with a sledgehammer? > > Think about what is happening. It's been discovered that shutting down the > device and reinitializing the device performs all the correct actions to > get the device up and running again after a suspend/resume cycle. All you > have to do for each suspend routine is mimic that effect. It arguably > doesn't require any serious knowledge about the device - it only needs a > copy of the ->probe() and ->remove() routines (or the functional > equivalent for those devices), without the allocation and freeing of data > structures. That's all good except that it overly complicates things. Look for example at psmouse driver - it can't just reinitialize the hardware, it also has to make sure that device connected to the port is the same device there was before suspending. And if it is not the same we still have to go though deardown and re-creating input device to make sure that it reports correct capabilities to userspace. Again, I would _love_ just to unbind the driver and have it bind again after resume. "Full" resume is only required when you have to maintain full state of the device in question (like a drive may have outstanding requests to complete). With some devices we can safely drop all outstanding requests. -- Dmitry ^ permalink raw reply [flat|nested] 69+ messages in thread
end of thread, other threads:[~2006-01-05 19:05 UTC | newest] Thread overview: 69+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-12-22 14:28 Hotplug events during sleep transition Alan Stern 2005-12-22 14:34 ` Pavel Machek 2005-12-22 18:20 ` Dmitry Torokhov 2005-12-22 20:52 ` Patrick Mochel 2005-12-22 20:56 ` Randy.Dunlap 2005-12-22 22:13 ` Alan Stern 2005-12-23 3:49 ` Patrick Mochel 2005-12-23 3:52 ` Dmitry Torokhov 2005-12-23 15:20 ` Alan Stern 2005-12-23 15:35 ` Patrick Mochel 2005-12-23 16:52 ` Alan Stern 2005-12-23 17:20 ` Pavel Machek 2005-12-23 21:22 ` Alan Stern 2005-12-23 21:28 ` Greg KH 2005-12-23 22:09 ` Nigel Cunningham 2005-12-23 22:31 ` Pavel Machek 2005-12-23 22:40 ` Nigel Cunningham 2005-12-23 22:48 ` Pavel Machek 2005-12-23 22:57 ` Nigel Cunningham 2005-12-24 0:29 ` Nigel Cunningham 2005-12-24 15:11 ` Alan Stern 2005-12-24 15:28 ` Pavel Machek 2005-12-24 17:02 ` Greg KH 2005-12-23 21:28 ` Pavel Machek 2005-12-23 19:40 ` Greg KH 2005-12-24 0:23 ` Patrick Mochel 2005-12-25 2:56 ` Alan Stern 2005-12-25 8:53 ` Pavel Machek 2005-12-25 16:43 ` Alan Stern 2005-12-25 16:52 ` Dmitry Torokhov 2005-12-25 19:54 ` Pavel Machek 2005-12-25 19:52 ` Pavel Machek 2005-12-26 3:53 ` Alan Stern 2005-12-26 21:20 ` Patrick Mochel 2005-12-26 22:50 ` Dmitry Torokhov 2005-12-26 23:01 ` Alan Stern 2005-12-27 21:45 ` Dmitry Torokhov 2005-12-27 22:08 ` Pavel Machek 2005-12-27 22:21 ` Dmitry Torokhov 2005-12-27 22:31 ` Pavel Machek 2005-12-27 22:39 ` Dmitry Torokhov 2005-12-27 22:44 ` Pavel Machek 2005-12-29 2:07 ` Alan Stern 2005-12-27 21:09 ` Patrick Mochel 2005-12-27 21:51 ` Dmitry Torokhov 2005-12-28 4:36 ` Patrick Mochel 2005-12-28 6:11 ` Dmitry Torokhov 2005-12-28 10:59 ` Rafael J. Wysocki 2005-12-29 2:18 ` Alan Stern 2005-12-29 2:55 ` Nigel Cunningham 2005-12-29 11:29 ` Rafael J. Wysocki 2005-12-30 17:46 ` Alan Stern 2006-01-04 23:31 ` Pavel Machek 2006-01-05 16:07 ` Alan Stern 2006-01-05 19:05 ` Pavel Machek 2005-12-29 2:13 ` Alan Stern 2005-12-29 2:10 ` Alan Stern 2005-12-29 5:20 ` Dmitry Torokhov 2005-12-30 18:26 ` Alan Stern 2005-12-30 19:18 ` Dmitry Torokhov 2005-12-30 19:26 ` Alan Stern 2005-12-30 19:42 ` Dmitry Torokhov 2005-12-30 20:13 ` Alan Stern 2005-12-29 2:02 ` Alan Stern 2005-12-30 18:34 ` Patrick Mochel 2005-12-30 19:21 ` Alan Stern 2006-01-02 23:49 ` Patrick Mochel 2005-12-26 23:25 ` Alan Stern 2005-12-23 18:32 ` Dmitry Torokhov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox