* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912111938310.32493-100000@netrider.rowland.org> @ 2009-12-12 17:35 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-12 17:35 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Saturday 12 December 2009, Alan Stern wrote: > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > Below is a patch I've just tested, but there's a lockdep problem in it I don't > > know how to solve. Namely, lockdep is apparently unhappy with us not releasing > > the lock taken in device_suspend() and it complains we take it twice in a row > > (which we do, but for another device). I need to use down_read_non_owner() > > to make it shut up and then I also need to use up_read_non_owner() in > > __device_suspend(), although there's the comment in include/linux/rwsem.h > > saying exactly this about that: > > > > /* > > * Take/release a lock when not the owner will release it. > > * > > * [ This API should be avoided as much as possible - the > > * proper abstraction for this case is completions. ] > > */ > > > > (I'd like to know your opinion about that). Yet, that's not all, because next > > it complains during resume that __device_resume() releases a lock it didn't > > acquire, which it clearly does, but that is intentional. Unfortunately, > > there's no up_write_non_owner() ... > > Hah! I knew it! > > How come lockdep didn't complain earlier? What's different about this > patch? Only the nesting annotations? Why should adding annotations > make lockdep less happy? I'm not sure. Perhaps I made a mistake during the previous tests. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912201434340.27137-100000@netrider.rowland.org> @ 2009-12-20 19:51 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 19:51 UTC (permalink / raw) To: Alan Stern Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sunday 20 December 2009, Alan Stern wrote: > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > BTW, what's the right place to call device_enable_async_suspend() for USB > > devices? > > For USB devices, it's in drivers/usb/core/hub.c:usb_new_device() > anywhere before the call to usb_device_add(). > > For USB interfaces, it's in > drivers/usb/core/message.c:usb_set_configuration() before the call to > device_add(). > > For USB endpoints, it's in > drivers/usb/core/endpoint.c:usb_create_ep_devs() before the call to > device_register(). Thanks! > However you won't need to do it for interfaces and endpoints if you > automatically treat as async any device without suspend/resume > callbacks. I don't do that right now and I need these settings just for testing at the moment. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912201910.26895.rjw@sisk.pl> @ 2009-12-20 19:38 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-20 19:38 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > BTW, what's the right place to call device_enable_async_suspend() for USB > devices? For USB devices, it's in drivers/usb/core/hub.c:usb_new_device() anywhere before the call to usb_device_add(). For USB interfaces, it's in drivers/usb/core/message.c:usb_set_configuration() before the call to device_add(). For USB endpoints, it's in drivers/usb/core/endpoint.c:usb_create_ep_devs() before the call to device_register(). However you won't need to do it for interfaces and endpoints if you automatically treat as async any device without suspend/resume callbacks. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912201210300.24162-100000@netrider.rowland.org> @ 2009-12-20 18:10 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 18:10 UTC (permalink / raw) To: Alan Stern Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sunday 20 December 2009, Alan Stern wrote: > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > > It's too early to come to this sort of conclusion (i.e., that suspend > > > and resume react very differently to an asynchronous approach). Unless > > > you have some definite _reason_ for thinking that resume will benefit > > > more than suspend, you shouldn't try to generalize so much from tests > > > on only two systems. > > > > In fact I have one reason. Namely, the things that drivers do on suspend and > > resume are evidently quite different and on these two systems I was able to > > test they apparently took different amounts of time to complete. > > > > The very fact that on both systems resume is substantially longer than suspend, > > even if all devices are suspended and resumed synchronously, is quite > > interesting. > > Yes, it is. But it doesn't mean that suspend won't benefit from > asynchronicity; it just means that the benefits might not be as large > as they are for resume. Agreed, although that raises the question whether they are sufficiently significant. I guess time will tell. With the i8042 done asynchronously they are IMO. BTW, what's the right place to call device_enable_async_suspend() for USB devices? Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912201352.07689.rjw@sisk.pl> @ 2009-12-20 17:12 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-20 17:12 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > It's too early to come to this sort of conclusion (i.e., that suspend > > and resume react very differently to an asynchronous approach). Unless > > you have some definite _reason_ for thinking that resume will benefit > > more than suspend, you shouldn't try to generalize so much from tests > > on only two systems. > > In fact I have one reason. Namely, the things that drivers do on suspend and > resume are evidently quite different and on these two systems I was able to > test they apparently took different amounts of time to complete. > > The very fact that on both systems resume is substantially longer than suspend, > even if all devices are suspended and resumed synchronously, is quite > interesting. Yes, it is. But it doesn't mean that suspend won't benefit from asynchronicity; it just means that the benefits might not be as large as they are for resume. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912192232360.6618-100000@netrider.rowland.org> @ 2009-12-20 12:55 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 12:55 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Sunday 20 December 2009, Alan Stern wrote: > On Sat, 19 Dec 2009, Rafael J. Wysocki wrote: > > > On Friday 18 December 2009, Alan Stern wrote: > > > On Fri, 18 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > I didn't manage to do that, but I was able to mark sd and i8042 as async and > > > > see the impact of this. > > > > > > Apparently this didn't do what you wanted. In the nx6325 > > > sd+i8042+async+extra log, the 0:0:0:0 device (which is a SCSI disk) was > > To be precise, the device is an ATA or SATA disk but it is managed by > the sd driver. > > > > suspended by the main thread instead of an async thread. > > > > Hm, that's odd, because there's a noticeable time difference between the > > two cases in which the sd is sync and async. I'll look into it further. > > I don't know what the whole story is, but the PID number tells the > tale. > > > > There's an important point I neglected to mention before. Your logs > > > don't show anything for devices with no suspend callbacks at all. > > > Nevertheless, these devices sit on the device list and prevent other > > > devices from suspending or resuming as soon as they could. > > > > Unless they are async, that is. > > Yes. It would be simpler to make them async. But first we ought to > know what they are. Can you add an extra line to the log for such > devices? Sure, I'll do that. 
> What I'm afraid of is that there might be a "normal" device with a > "normal" ancestor but with "abnormal" devices in between (where > "normal" means there is a suspend or resume routine and "abnormal" > means all the method pointers are NULL). I know that this happens when > there's a USB mass-storage device, for example. If we complete the > intermediate devices immediately, then there won't be anything to > prevent the ancestor from suspending before the device or the device > from resuming before the ancestor. I'm afraid of that too. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
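The ordering hazard raised in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct chain_dev, chain_start_time() and chain_finish_time() are names invented here, the durations are made up, every device is treated as async, and nothing below is PM core code. A device may start suspending once all of its direct children have signalled completion; the complete_early flag models signalling callback-less devices in the first pass without waiting for their children.

```c
#include <stddef.h>

/* Hypothetical model of one device; none of these names exist in the kernel. */
struct chain_dev {
    int parent;         /* index of the parent device, -1 for a root */
    long duration_ms;   /* made-up cost of the suspend callback */
    int has_callbacks;  /* 0 for an "abnormal" device (all pointers NULL) */
};

long chain_finish_time(const struct chain_dev *devs, size_t n, size_t idx,
                       int complete_early);

/* Earliest time device idx may begin suspending: all direct children done. */
long chain_start_time(const struct chain_dev *devs, size_t n, size_t idx,
                      int complete_early)
{
    long start = 0;

    for (size_t j = 0; j < n; j++) {
        if (devs[j].parent == (int)idx) {
            long f = chain_finish_time(devs, n, j, complete_early);
            if (f > start)
                start = f;
        }
    }
    return start;
}

/* Time at which device idx signals completion to its parent. */
long chain_finish_time(const struct chain_dev *devs, size_t n, size_t idx,
                       int complete_early)
{
    if (!devs[idx].has_callbacks) {
        if (complete_early)
            return 0;   /* signalled in the first pass, children ignored */
        /* Nothing to run, but the device still waits for its children. */
        return chain_start_time(devs, n, idx, complete_early);
    }
    return chain_start_time(devs, n, idx, complete_early) +
           devs[idx].duration_ms;
}
```

With a three-device chain (a "normal" ancestor, an "abnormal" device in the middle, a "normal" leaf that takes 1000 ms), complete_early = 1 lets the ancestor start at time 0 while the leaf is still busy. With complete_early = 0 the ancestor cannot start before 1000 ms.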
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912192253200.6618-100000@netrider.rowland.org> @ 2009-12-20 12:52 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 12:52 UTC (permalink / raw) To: Alan Stern Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sunday 20 December 2009, Alan Stern wrote: > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > So, seriously, do you think it makes sense to do asynchronous suspend at all? > > I'm asking, because we're likely to get into trouble like this during suspend > > for other kinds of devices too and without resolving them we won't get any > > significant speedup from asynchronous suspend. > > > > That said, to me it's definitely worth doing asynchronous resume with the > > "start asynch threads upfront" modification, as the results of the tests show > > that quite clearly. I hope you agree. > > It's too early to come to this sort of conclusion (i.e., that suspend > and resume react very differently to an asynchronous approach). Unless > you have some definite _reason_ for thinking that resume will benefit > more than suspend, you shouldn't try to generalize so much from tests > on only two systems. In fact I have one reason. Namely, the things that drivers do on suspend and resume are evidently quite different and on these two systems I was able to test they apparently took different amounts of time to complete. The very fact that on both systems resume is substantially longer than suspend, even if all devices are suspended and resumed synchronously, is quite interesting. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912192241.03991.rjw@sisk.pl> @ 2009-12-20 3:48 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-20 3:48 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Sat, 19 Dec 2009, Rafael J. Wysocki wrote: > On Friday 18 December 2009, Alan Stern wrote: > > On Fri, 18 Dec 2009, Rafael J. Wysocki wrote: > > > > > I didn't manage to do that, but I was able to mark sd and i8042 as async and > > > see the impact of this. > > > > Apparently this didn't do what you wanted. In the nx6325 > > sd+i8042+async+extra log, the 0:0:0:0 device (which is a SCSI disk) was To be precise, the device is an ATA or SATA disk but it is managed by the sd driver. > > suspended by the main thread instead of an async thread. > > Hm, that's odd, because there's a noticeable time difference between the > two cases in which the sd is sync and async. I'll look into it further. I don't know what the whole story is, but the PID number tells the tale. > > There's an important point I neglected to mention before. Your logs > > don't show anything for devices with no suspend callbacks at all. > > Nevertheless, these devices sit on the device list and prevent other > > devices from suspending or resuming as soon as they could. > > Unless they are async, that is. Yes. It would be simpler to make them async. But first we ought to know what they are. Can you add an extra line to the log for such devices? What I'm afraid of is that there might be a "normal" device with a "normal" ancestor but with "abnormal" devices in between (where "normal" means there is a suspend or resume routine and "abnormal" means all the method pointers are NULL). I know that this happens when there's a USB mass-storage device, for example. 
If we complete the intermediate devices immediately, then there won't be anything to prevent the ancestor from suspending before the device or the device from resuming before the ancestor. Forcing the "abnormal" devices to be async, even if they aren't marked that way, would avoid these problems. > > For example, the fingerprint sensor (3-1) took the most time to resume. > > But other devices were delayed until after it finished because it had > > children with no callbacks, and they delayed the devices following > > them in the list. > > > > What would happen if you completed these devices immediately, as part > > of the first pass? > > OK. How is the PM core supposed to check if a device has null suspend > and resume? Check all of the function pointers in the first pass? All the relevant pointers (including the legacy pointers). That is, you check only the suspend pointers during the first suspend pass, and likewise for resume. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
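The first-pass check described in the message above might look roughly like the following. It is a minimal sketch, not the PM core's code: the structures are simplified stand-ins invented for illustration (they are not struct device or struct dev_pm_ops), the three "levels" are an assumption about where callbacks can live, and all sketch_* names are hypothetical. The point is only that the suspend pass looks at suspend pointers, new-style and legacy alike, and the resume pass looks at resume pointers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a table of new-style callbacks. */
struct sketch_pm_ops {
    int (*suspend)(void *dev);
    int (*resume)(void *dev);
};

/* One place callbacks may come from (assumed here: bus, type or class). */
struct sketch_level {
    const struct sketch_pm_ops *pm;     /* new-style table, may be NULL */
    int (*legacy_suspend)(void *dev);   /* legacy pointers, may be NULL */
    int (*legacy_resume)(void *dev);
};

struct sketch_device {
    const struct sketch_level *bus;
    const struct sketch_level *type;
    const struct sketch_level *class_;
};

/* Trivial callback, provided only so the sketch can be exercised. */
int sketch_noop(void *dev)
{
    (void)dev;
    return 0;
}

static bool level_has_suspend(const struct sketch_level *l)
{
    return l && ((l->pm && l->pm->suspend) || l->legacy_suspend);
}

static bool level_has_resume(const struct sketch_level *l)
{
    return l && ((l->pm && l->pm->resume) || l->legacy_resume);
}

/* Used in the suspend pass: only suspend pointers are examined. */
bool sketch_has_suspend_cb(const struct sketch_device *d)
{
    return level_has_suspend(d->bus) || level_has_suspend(d->type) ||
           level_has_suspend(d->class_);
}

/* Used in the resume pass: only resume pointers are examined. */
bool sketch_has_resume_cb(const struct sketch_device *d)
{
    return level_has_resume(d->bus) || level_has_resume(d->type) ||
           level_has_resume(d->class_);
}
```

A device for which sketch_has_suspend_cb() returns false would be a candidate for the "abnormal" treatment during suspend, independently of whether it has resume callbacks.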
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912181205290.2987-100000@iolanthe.rowland.org> @ 2009-12-19 21:41 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 21:41 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Friday 18 December 2009, Alan Stern wrote: > On Fri, 18 Dec 2009, Rafael J. Wysocki wrote: > > > I didn't manage to do that, but I was able to mark sd and i8042 as async and > > see the impact of this. > > Apparently this didn't do what you wanted. In the nx6325 > sd+i8042+async+extra log, the 0:0:0:0 device (which is a SCSI disk) was > suspended by the main thread instead of an async thread. Hm, that's odd, because there's a noticeable time difference between the two cases in which the sd is sync and async. I'll look into it further. > There's an important point I neglected to mention before. Your logs > don't show anything for devices with no suspend callbacks at all. > Nevertheless, these devices sit on the device list and prevent other > devices from suspending or resuming as soon as they could. Unless they are async, that is. > For example, the fingerprint sensor (3-1) took the most time to resume. > But other devices were delayed until after it finished because it had > children with no callbacks, and they delayed the devices following > them in the list. > > What would happen if you completed these devices immediately, as part > of the first pass? OK. How is the PM core supposed to check if a device has null suspend and resume? Check all of the function pointers in the first pass? Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912171444040.2645-100000@iolanthe.rowland.org> @ 2009-12-17 20:36 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-17 20:36 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thursday 17 December 2009, Alan Stern wrote: > On Thu, 17 Dec 2009, Rafael J. Wysocki wrote: > > > That actually is correct. On the nx6325 suspend is totally dominated by disk > > spindown, almost everything else is negligible compared to it (well, except for > > the audio), so we can't go down below 1 s during suspend on this box. > > > > On the Wind, disk spindown time is comparable with serio suspend time, > > so at least in principle we should be able to get .5 s suspend on this box - > > if the disk spindown is async. > > > > In turn, the resume on the Wind is dominated by disk spinup, so we can't > > go below 1.5 s on this box during resume (notice that the "async+extra" > > approach brings us close to this limit, although we could save .5 s more in > > principle by making more devices async). > > > > Resume on the nx6325 is a different story, though, as it is dominated by USB > > and PCI devices, so marking those as async would probably bring us close to > > the limit. > > The implications seem pretty clear. If the following sorts of devices > were async: > > USB (devices and interfaces), PCI, serio, SCSI (hosts, targets, > devices) Plus ACPI battery. > then we would reap close to the maximum benefit -- providing: > > async threads are started in a first pass without waiting > for synchronous devices, and Agreed. > It's not clear that making all these types of devices async will really > work, but it's worth testing. I'm working on it. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912161753540.2643-100000@iolanthe.rowland.org> @ 2009-12-16 23:18 ` Rafael J. Wysocki [not found] ` <200912170018.05175.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 23:18 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thursday 17 December 2009, Alan Stern wrote: > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > I've just put the first set of data, for the HP nx6325 at: > > http://www.sisk.pl/kernel/data/nx6325/ > > > > The *-dmesg.log files contain full dmesg outputs starting from a cold boot and > > including one suspend-resume cycle in each case, with debug_initcall enabled. > > > > The *-suspend.log files are excerpts from the *-dmesg.log files containing > > the suspend messages only, and analogously for *-resume.log. > > I've just started looking at the sync-suspend.log file. What are all > the '+' characters and " @ 3368" strings after the device names? I think the + is necessary for the Arjan's graph-generating script and the @ number is the value of current (ie. the PID of the calling task). > You didn't print out the parent name for each device, so the tree > structure has been lost. That's because the original Arjan's patch doesn't do that, I'm adding it right now. > Why do those "sd 0:0:0:0 [sda]" messages appear in between two > callbacks? The cache-synchronization and the spin-down commands are > not executed asynchronously. Because the data are incomplete. :-( I've just realized that the Arjan's patch only covers bus types and classes that have been converted to dev_pm_ops already, so I'm extending it to the "legacy" ones at the moment. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912170018.05175.rjw@sisk.pl> @ 2009-12-17 1:30 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-17 1:30 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thursday 17 December 2009, Rafael J. Wysocki wrote: > On Thursday 17 December 2009, Alan Stern wrote: > > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > > > I've just put the first set of data, for the HP nx6325 at: > > > http://www.sisk.pl/kernel/data/nx6325/ > > > > > > The *-dmesg.log files contain full dmesg outputs starting from a cold boot and > > > including one suspend-resume cycle in each case, with debug_initcall enabled. > > > > > > The *-suspend.log files are excerpts from the *-dmesg.log files containing > > > the suspend messages only, and analogously for *-resume.log. > > > > I've just started looking at the sync-suspend.log file. What are all > > the '+' characters and " @ 3368" strings after the device names? > > I think the + is necessary for the Arjan's graph-generating script and the > @ number is the value of current (ie. the PID of the calling task). > > > You didn't print out the parent name for each device, so the tree > > structure has been lost. > > That's because the original Arjan's patch doesn't do that, I'm adding it > right now. > > > Why do those "sd 0:0:0:0 [sda]" messages appear in between two > > callbacks? The cache-synchronization and the spin-down commands are > > not executed asynchronously. > > Because the data are incomplete. :-( > > I've just realized that the Arjan's patch only covers bus types and classes > that have been converted to dev_pm_ops already, so I'm extending it to the > "legacy" ones at the moment. 
New data files have been uploaded to: http://www.sisk.pl/kernel/data/nx6325/ http://www.sisk.pl/kernel/data/wind/ Please let me know if you need more information. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912161018100.2909-100000@iolanthe.rowland.org> @ 2009-12-16 19:26 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 19:26 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Wednesday 16 December 2009, Alan Stern wrote: > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > I measured the total time of suspending and resuming devices as shown by the > > code added by this patch: > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > > different and the HP was running 64-bit kernel and user space). > > > I carried out 5 consecutive suspend-resume cycles (started from under X) on > > each box in each case, and the raw data are here (all times in milliseconds): > > http://www.sisk.pl/kernel/data/async-suspend.pdf > > I'd like to see much more detailed data. For each device, let's get > the device name, the parent's name, and the start time, end time, and > duration for suspend or resume. The start time should be measured when > you have finished waiting for the children. The end time should be > measured just before the complete_all(). I'm going to use the Arjan's patch + script to chart the suspend/resume times for individual devices. I can send you the raw data, though. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <alpine.LFD.2.00.0912151337350.14385@localhost.localdomain> @ 2009-12-15 22:27 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-15 22:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Linus Torvalds wrote: > On Tue, 15 Dec 2009, Alan Stern wrote: > > > > Okay. This obviously implies that if/when cardbus bridges are > > converted to async suspend/resume, the driver should make sure that the > > lower-numbered devices wait for their sibling higher-numbered devices > > to suspend (and vice versa for resume). Awkward though it may be. > > Yes. However, this is an excellent case where the whole "the device layer > does things asynchronously" is really rather awkward. > > For cardbus, the nicest model really would be for the _driver_ to decide > to do some things asynchronously, after having done some other things > synchronously (to make sure of ordering). Have you considered the possibility of augmenting the design to allow this? Perhaps reserve a particular return code from the suspend routine to mean that asynchronous operations are still underway, so the PM core shouldn't automatically do the complete_all(). > So I suspect that we _can_ just do cardbus bridges asynchronously too, but > it really needs some care. I suspect to a first approximation we would > want to do the easy cases first, and ignore cardbus as being "known to > possibly have issues". Certainly. Start with the easy things and leave harder devices like cardbus bridges for later. > > > Subtle? Hell yes. > > > > I don't disagree. However the subtlety lies mainly in the matter of > > non-obvious dependencies. > > Yes. But we don't necessarily even _know_ those dependencies. Yep. Both non-obvious and non-known. 
> The Cardbus ones I know about, but really only because I wrote much of > that code initially when converting cardbus to look like the PCI bridge it > largely is. But how many other cases like that do we have that we have > perhaps never even hit, because we've never done anything out of order. > > > The ACPI relations are definitely something to worry about. It would > > be a good idea, at an early stage, to add those dependencies > > explicitly. I don't know enough about them to say more; perhaps Rafael > > does. > > Quite frankly, I would really not want to do ACPI first at all. Dear me, no! I wasn't saying ACPI should be made async; I was saying that ACPI "shadow" devices should be made to wait for their async PCI counterparts. > > Indeed. Perhaps you were too hasty in suggesting that PCI bridges > > should be async. > > Oh, yes. I would suggest that first we do _nothing_ async except for > within just a single USB tree, and perhaps some individual drivers like > the PS/2 keyboard controller (and do even that perhaps only for the PC > version, which we know is on the southbridge and not anywhere else). > > If that ends up meaning that we block due to PCI bridges, so be it. I > really would prefer baby steps over anything more complete. Agreed. I'm not in any hurry. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
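The idea floated in the message above, reserving a return code from the suspend routine to mean "asynchronous work is still underway, don't do the complete_all() for me", can be sketched in userspace as follows. Everything here is an assumption made for illustration: SKETCH_SUSPEND_PENDING and its value, struct sketch_dev, and the sketch_* functions are invented names, and a plain flag stands in for a struct completion. It shows the control flow only, not how the PM core is or would be implemented.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reserved return value; no such code is defined in the thread. */
#define SKETCH_SUSPEND_PENDING 1

struct sketch_dev {
    bool done;  /* stands in for a struct completion */
    int (*suspend)(struct sketch_dev *dev);
};

/* Stand-in for complete_all(): wakes everyone waiting on this device. */
void sketch_complete_all(struct sketch_dev *dev)
{
    dev->done = true;
}

/* Core-side wrapper: signals completion unless the driver asked to defer. */
int sketch_core_suspend(struct sketch_dev *dev)
{
    int ret = dev->suspend ? dev->suspend(dev) : 0;

    if (ret == SKETCH_SUSPEND_PENDING)
        return 0;   /* the driver signals completion itself, later */

    /* Success or error: waiters must not be left hanging. */
    sketch_complete_all(dev);
    return ret;
}

/* Example driver that finishes all of its work before returning. */
int sketch_sync_suspend(struct sketch_dev *dev)
{
    (void)dev;
    return 0;
}

/* Example driver that did the ordered part and left the rest running. */
int sketch_deferred_suspend(struct sketch_dev *dev)
{
    (void)dev;
    return SKETCH_SUSPEND_PENDING;
}

/* Called by the deferring driver once its asynchronous work has ended. */
void sketch_driver_finish(struct sketch_dev *dev)
{
    sketch_complete_all(dev);
}
```

The design question this leaves open is who guarantees that sketch_driver_finish() is eventually called; a driver that forgets would stall every device waiting on it.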
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912152226.22578.rjw@sisk.pl> @ 2009-12-15 22:01 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-15 22:01 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > Ideally we would figure out how to do the slow devices in parallel > > without interference from fast devices having unknown dependencies. > > Unfortunately this may not be possible. > > I really expect to see those "unknown dependencies" in the _noirq > suspend/resume phases and above. [The very fact they exist is worrisome, > because that's why we don't know why things work on one system and don't > work on another, although they appear to be very similar.] This is a good reason for keeping the _noirq phases synchronous. AFAIK they don't take long enough to be worth converting, so there's no loss. > > The real issue is "blockage": synchronous devices preventing > > possible concurrency among async devices. That's what you thought > > making PCI bridges async would help. > > > > In general, blockage arises in suspend when you have an async child > > with a synchronous parent. The parent has to wait for the child, which > > might take a long time, thereby delaying other unrelated devices. > > Exactly, but the Linus' point seems to be that's going to be rare and we > should be able to special case all of the interesting cases. Maybe that's true. Without seeing some examples of actual dpm_list contents, we can't tell. Can you post the interesting parts of the lists from some of your test machines? Maybe with a USB device or two plugged in? (The device names together with the names of their parents should be enough.) > > (This explains why you wanted to make PCI bridges async -- they are the > > parents of USB controllers.) 
For resume it's the opposite: an async > > parent with synchronous children. > > Is that really going to happen in practice? I mean, what would be the point? I don't know. It's all speculation until we see some actual lists. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
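The "blockage" effect described above (an async child with a synchronous parent delaying unrelated devices during suspend) can be made concrete with a toy timing model. The sketch assumes the main thread walks the list in suspend order, children before parents, that an async device gets its own thread when the main thread reaches it, and that any device waits for its children before running. struct sim_dev, sim_suspend_total() and all durations are made up for illustration; no real dpm_list data is used.

```c
#include <stddef.h>

#define SIM_MAX_DEVS 16

/* Hypothetical device record; the array order is the suspend order. */
struct sim_dev {
    int parent;         /* index of the parent (later in the array), or -1 */
    long duration_ms;   /* made-up time taken by the suspend callback */
    int is_async;       /* nonzero if the device is handled in its own thread */
};

/* Returns the time at which the last device finishes, or -1 if n is too big. */
long sim_suspend_total(const struct sim_dev *devs, size_t n)
{
    long finish[SIM_MAX_DEVS];
    long main_time = 0; /* when the main thread is free to move on */
    long total = 0;

    if (n > SIM_MAX_DEVS)
        return -1;

    for (size_t i = 0; i < n; i++) {
        /* The main thread reaches device i at main_time ... */
        long start = main_time;

        /* ... and the device still has to wait for all of its children. */
        for (size_t j = 0; j < i; j++)
            if (devs[j].parent == (int)i && finish[j] > start)
                start = finish[j];

        finish[i] = start + devs[i].duration_ms;

        /* Only a synchronous device holds up the main thread. */
        if (!devs[i].is_async)
            main_time = finish[i];

        if (finish[i] > total)
            total = finish[i];
    }
    return total;
}
```

Take three devices in suspend order: a slow async child (1000 ms), its parent (10 ms), and an unrelated async device (500 ms). With a synchronous parent the unrelated device cannot even start until 1010 ms, for a total of 1510 ms. Marking the parent async brings the total down to 1010 ms, because the unrelated device then overlaps with the slow child.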
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912151444010.2643-100000@iolanthe.rowland.org> @ 2009-12-15 21:26 ` Rafael J. Wysocki 2009-12-15 21:54 ` Linus Torvalds 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-15 21:26 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Tuesday 15 December 2009, Alan Stern wrote: > On Tue, 15 Dec 2009, Linus Torvalds wrote: > > > It's a very subtle theory, and it's not necessarily always 100% true. For > > example, a cardbus bridge is strictly speaking very much a PCI bridge, but > > for cardbus bridges we _do_ have a suspend/resume function. > > > > And perhaps worse than that, cardbus bridges are one of the canonical > > examples where two different PCI devices actually share registers. It's > > quite common that some of the control registers are shared across the two > > subfunctions of a two-slot cardbus controller (and we generally don't even > > have full docs for them!) > > Okay. This obviously implies that if/when cardbus bridges are > converted to async suspend/resume, the driver should make sure that the > lower-numbered devices wait for their sibling higher-numbered devices > to suspend (and vice versa for resume). Awkward though it may be. > > > > The same goes for devices that don't have suspend or resume methods. > > > > Yes and no. > > > > Again, the "async_suspend" flag is done at the generic device layer, but > > 99% of all suspend/resume methods are _not_ done at that level: they are > > bus-specific functions, where the bus has a generic suspend-resume > > function that it exposes to the generic device layer, and that knows about > > the bus-specific rules. 
> > > > So if you are a PCI device (to take just that example - but it's true of > > just about all other buses too), and you don't have any suspend or resume > > methods, it's actually impossible to see that fact from the generic device > > layer. > > Sure. That's why the async_suspend flag is set at the bus/driver > level. > > > And even when you know it's PCI, our rules are actually not simple at all. > > Our rules for PCI devices (and this strictly speaking is true for bridges > > too) are rather complex: > > > > - do we have _any_ legacy PM support (ie the "direct" driver > > suspend/resume functions in the driver ops, rather than having a > > "struct dev_pm_ops" pointer)? If so, call "->suspend()" > > > > - If not - do we have that "dev_pm_ops" thing? If so, call it. > > > > - If not - just disable the device entirely _UNLESS_ you're a PCI bridge. > > > > Notice? The way things are set up, if you have no suspend routine, you'll > > not get suspended, but you will get disabled. > > > > So it's _not_ actually safe to asynchronously suspend a PCI device if that > > device has no driver or no suspend routines - because even in the absense > > of a driver and suspend routines, we'll still least disable it. And if > > there is some subtle dependency on that device that isn't obvious (say, it > > might be used indirectly for some ACPI thing), then that async suspend is > > the wrong thing to do. > > > > Subtle? Hell yes. > > I don't disagree. However the subtlety lies mainly in the matter of > non-obvious dependencies. (The other stuff is all known to the PCI > core.) AFAICS there's otherwise little difference between an async > routine that does nothing and one that disables the device -- both > operations are very fast. > > The ACPI relations are definitely something to worry about. It would > be a good idea, at an early stage, to add those dependencies > explicitly. I don't know enough about them to say more; perhaps Rafael > does. 
It boils down to the fact that for each PCI device known to the ACPI BIOS there is a "shadow" ACPI device that generally has its own suspend/resume callbacks and these "shadow" devices are members of the ACPI subtree of the device tree (ie. they have parents and so on). Now, when I worked on the first version of async suspend/resume, I noticed that if those "shadow" ACPI devices did not wait for their PCI counterparts to suspend, things broke badly. The reason probably wasn't related to what they did in their suspend/resume callbacks, because they are usually empty, but it was rather related to the dependencies between devices in the ACPI subtree (so, generally speaking, it seems the entire ACPI subtree of the device tree should be suspended after the entire PCI subtree). That obviously requires more investigation, though. > As for other non-obvious dependencies... Who knows? Probably the only > way to find them is by experimentation. My guess is that they will > turn out to be connected mostly with "high-level" devices: system > devices, things on the motherboard -- generally speaking, stuff close > to the CPU. Relatively few will be associated with devices below the > level of a PCI device or equivalent. > > Ideally we would figure out how to do the slow devices in parallel > without interference from fast devices having unknown dependencies. > Unfortunately this may not be possible. I really expect to see those "unknown dependencies" in the _noirq suspend/resume phases and above. [The very fact they exist is worrisome, because that's why we don't know why things work on one system and don't work on another, although they appear to be very similar.] > > So the whole thing about "we can do PCI bridges asynchronously because > > they are obviously no-op" is kind of true - except for the "obviously" > > part. It's not obvious at all. It's rather subtle. 
> > > > As an example of this kind of subtlety - iirc PCIE bridges used to have > > suspend and resume bugs when we initially switched over to the "new world" > > suspend/resume exactly because they actually did things at "suspend" time > > (rather than suspend_late), and that broke devices behind them (this was > > not related to async, of course, but the point is that even when you look > > like a PCI bridge, you might be doing odd things). Well, those "pcieport devices" still are the children of PCIe ports, although physically they just correspond to different sets of registers within the ports' config spaces (_that_ is overdesigned IMnsHO) and they are "suspended" during the regular suspend of their PCIe port "parents". > > So just saying "let's do it asynchronously" is _not_ always guaranteed to > > be the right thing at all. It's _probably_ safe for at least regular PCI > > bridges. Cardbus bridges? Probably not, but since most modern laptop have > > just a single slot - and people who have multiple slots seldom use them > > all - most people will probably never see the problems that it _could_ > > introduce. > > > > And PCIE bridges? Should be safe these days, but it wasn't quite as > > obvious, because a PCIE bridge actually has a driver unlike a regular > > plain PCI-PCI bridge. > > > > Subtle, subtle. > > Indeed. Perhaps you were too hasty in suggesting that PCI bridges > should be async. > > It would help a lot to see some device lists for typical machines. (If > there are such things.) Otherwise we are just blowing gas. > > > > There remains a separate question: Should async devices also be forced > > > to wait for their children? I don't see why not. For PCI bridges it > > > won't make any significant difference. As long as the async code > > > doesn't have to do anything, who cares when it runs? > > > > That's why I just set the "async_resume = 1" thing. > > > > But there might actually be reasons why we care. 
Like the fact that we > > actually throttle the amount of parallel work we do in async_schedule(). > > So doing even a "no-op" asynchronously isn't actually a no-op: while it is > > pending (and those things can be pending for a long time, since they have > > to wait for those slow devices underneath them), it can cause _other_ > > async work - that isn't necessarily a no-op at all - to be then done > > synchronously. > > > > Now, admittedly our async throttling limits are high enough that the above > > kind of detail will probably never ever realy matter (default 256 worker > > threads etc). But it's an example of how practice is different from theory > > - in _theory_ it doesn't make any difference if you wait for something > > asynchronously, but in practice it could make a difference under some > > circumstances. > > We certainly shouldn't be worried about side effects of async > throttling as this stage. KISS works both ways: Don't overdesign, and > don't worry about things that might crop up when you expand the design. > > We have strayed off the point of your original objection: not providing > a way for devices to skip waiting for their children. This really is a > separate issue from deciding whether or not to go async. For example, > your proposed patch makes PCI bridges async but doesn't allow them to > avoid waiting for children. IMO that's a good thing. > > The real issue is "blockage": synchronous devices preventing > possible concurrency among async devices. That's what you thought > making PCI bridges async would help. > > In general, blockage arises in suspend when you have an async child > with a synchronous parent. The parent has to wait for the child, which > might take a long time, thereby delaying other unrelated devices. Exactly, but the Linus' point seems to be that's going to be rare and we should be able to special case all of the interesting cases. 
> (This explains why you wanted to make PCI bridges async -- they are the > parents of USB controllers.) For resume it's the opposite: an async > parent with synchronous children. Is that really going to happen in practice? I mean, what would be the point? > Thus, while making PCI bridges async might make suspend faster, it probably > won't help much with resume speed. You'd have to make the children of USB > devices (SCSI hosts, TTYs, and so on) async. Depending on the order of > device registration, of course. > > Apart from all this, there's a glaring hole in the discussion so far. > You and Arjan may not have noticed it, but those of us still using > rotating media have to put up with disk resume times that are a factor > of 100 (!) larger than USB resume times. That's where the greatest > gains are to be found. I guess so. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912151444010.2643-100000@iolanthe.rowland.org> 2009-12-15 21:26 ` Rafael J. Wysocki @ 2009-12-15 21:54 ` Linus Torvalds 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 21:54 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Alan Stern wrote: > > Okay. This obviously implies that if/when cardbus bridges are > converted to async suspend/resume, the driver should make sure that the > lower-numbered devices wait for their sibling higher-numbered devices > to suspend (and vice versa for resume). Awkward though it may be. Yes. However, this is an excellent case where the whole "the device layer does things asynchronously" is really rather awkward. For cardbus, the nicest model really would be for the _driver_ to decide to do some things asynchronously, after having done some other things synchronously (to make sure of ordering). That said, I think we are ok for at least Yenta resume, because the really ordering-critical stuff we tend to do at "resume_early", which wouldn't be asynchronous anyway. But for an idea of what I'm talking about, look at the o2micro stuff in drivers/pcmcia/o2micro.h, and notice how it does certain things only for the "PCI_FUNC(..devfn) == 0" case. So I suspect that we _can_ just do cardbus bridges asynchronously too, but it really needs some care. I suspect to a first approximation we would want to do the easy cases first, and ignore cardbus as being "known to possibly have issues". > > Subtle? Hell yes. > > I don't disagree. However the subtlety lies mainly in the matter of > non-obvious dependencies. Yes. But we don't necessarily even _know_ those dependencies. The Cardbus ones I know about, but really only because I wrote much of that code initially when converting cardbus to look like the PCI bridge it largely is. 
But how many other cases like that do we have that we have perhaps never even hit, because we've never done anything out of order? > The ACPI relations are definitely something to worry about. It would > be a good idea, at an early stage, to add those dependencies > explicitly. I don't know enough about them to say more; perhaps Rafael > does. Quite frankly, I would really not want to do ACPI first at all. We already handle batteries specially, but any random system device? Don't touch it, is my suggestion. There are just too many ways it can fail. Don't tell me that things "should work" - we know for a fact that BIOS tables almost always have every single bug they could possibly have. > > And PCIE bridges? Should be safe these days, but it wasn't quite as > > obvious, because a PCIE bridge actually has a driver unlike a regular > > plain PCI-PCI bridge. > > > > Subtle, subtle. > > Indeed. Perhaps you were too hasty in suggesting that PCI bridges > should be async. Oh, yes. I would suggest that first we do _nothing_ async except for within just a single USB tree, and perhaps some individual drivers like the PS/2 keyboard controller (and do even that perhaps only for the PC version, which we know is on the southbridge and not anywhere else). If that ends up meaning that we block due to PCI bridges, so be it. I really would prefer baby steps over anything more complete. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0912151047410.3566-100000@iolanthe.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912151047410.3566-100000@iolanthe.rowland.org> @ 2009-12-15 16:28 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912150803250.14385@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 16:28 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Alan Stern wrote: > > It doesn't feel like an ugly hack to me. It seems like exactly the > Right Thing To Do: Make as many devices as possible use async > suspend/resume. The reason it's a ugly hack is that it's actually not a simple decision to make. The devil is in the details: > The only reason we don't make every device async is because we don't > know whether it's safe. In the case of PCI bridges we _do_ know -- > because they don't have any work to do outside of > late_suspend/early_resume -- and so they _should_ be async. That's the theory, yes. And it was worth the comment to spell out that theory. But.. It's a very subtle theory, and it's not necessarily always 100% true. For example, a cardbus bridge is strictly speaking very much a PCI bridge, but for cardbus bridges we _do_ have a suspend/resume function. And perhaps worse than that, cardbus bridges are one of the canonical examples where two different PCI devices actually share registers. It's quite common that some of the control registers are shared across the two subfunctions of a two-slot cardbus controller (and we generally don't even have full docs for them!) > The same goes for devices that don't have suspend or resume methods. Yes and no. 
Again, the "async_suspend" flag is done at the generic device layer, but 99% of all suspend/resume methods are _not_ done at that level: they are bus-specific functions, where the bus has a generic suspend-resume function that it exposes to the generic device layer, and that knows about the bus-specific rules. So if you are a PCI device (to take just that example - but it's true of just about all other buses too), and you don't have any suspend or resume methods, it's actually impossible to see that fact from the generic device layer. And even when you know it's PCI, our rules are actually not simple at all. Our rules for PCI devices (and this strictly speaking is true for bridges too) are rather complex: - do we have _any_ legacy PM support (ie the "direct" driver suspend/resume functions in the driver ops, rather than having a "struct dev_pm_ops" pointer)? If so, call "->suspend()" - If not - do we have that "dev_pm_ops" thing? If so, call it. - If not - just disable the device entirely _UNLESS_ you're a PCI bridge. Notice? The way things are set up, if you have no suspend routine, you'll not get suspended, but you will get disabled. So it's _not_ actually safe to asynchronously suspend a PCI device if that device has no driver or no suspend routines - because even in the absence of a driver and suspend routines, we'll still at least disable it. And if there is some subtle dependency on that device that isn't obvious (say, it might be used indirectly for some ACPI thing), then that async suspend is the wrong thing to do. Subtle? Hell yes. So the whole thing about "we can do PCI bridges asynchronously because they are obviously no-op" is kind of true - except for the "obviously" part. It's not obvious at all. It's rather subtle.
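The three PCI rules listed above can be restated as a small decision function. This is only a toy model in Python, not kernel code: the `PciDev` fields and the returned strings are made up for illustration, and the real logic lives in the PCI bus suspend path.

```python
from dataclasses import dataclass


@dataclass
class PciDev:
    # All three fields are hypothetical stand-ins for what the PCI core checks.
    has_legacy_pm: bool = False   # driver has old-style ->suspend()/->resume()
    has_dev_pm_ops: bool = False  # driver provides a struct dev_pm_ops
    is_bridge: bool = False


def suspend_action(dev: PciDev) -> str:
    """Return what the 'suspend' phase would do to this device under the rules above."""
    if dev.has_legacy_pm:
        return "call legacy ->suspend()"
    if dev.has_dev_pm_ops:
        return "call dev_pm_ops"
    if dev.is_bridge:
        return "leave untouched"
    # No driver and no callbacks still means the device gets disabled.
    return "disable device"


print(suspend_action(PciDev()))                # driverless, non-bridge device
print(suspend_action(PciDev(is_bridge=True)))  # plain PCI-PCI bridge
```

The last branch is the one that matters for the argument: "has no suspend routine" is not the same as "nothing happens at suspend time".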
As an example of this kind of subtlety - iirc PCIE bridges used to have suspend and resume bugs when we initially switched over to the "new world" suspend/resume exactly because they actually did things at "suspend" time (rather than suspend_late), and that broke devices behind them (this was not related to async, of course, but the point is that even when you look like a PCI bridge, you might be doing odd things). So just saying "let's do it asynchronously" is _not_ always guaranteed to be the right thing at all. It's _probably_ safe for at least regular PCI bridges. Cardbus bridges? Probably not, but since most modern laptops have just a single slot - and people who have multiple slots seldom use them all - most people will probably never see the problems that it _could_ introduce. And PCIE bridges? Should be safe these days, but it wasn't quite as obvious, because a PCIE bridge actually has a driver unlike a regular plain PCI-PCI bridge. Subtle, subtle. > There remains a separate question: Should async devices also be forced > to wait for their children? I don't see why not. For PCI bridges it > won't make any significant difference. As long as the async code > doesn't have to do anything, who cares when it runs? That's why I just set the "async_resume = 1" thing. But there might actually be reasons why we care. Like the fact that we actually throttle the amount of parallel work we do in async_schedule(). So doing even a "no-op" asynchronously isn't actually a no-op: while it is pending (and those things can be pending for a long time, since they have to wait for those slow devices underneath them), it can cause _other_ async work - that isn't necessarily a no-op at all - to be then done synchronously. Now, admittedly our async throttling limits are high enough that the above kind of detail will probably never ever really matter (default 256 worker threads etc).
But it's an example of how practice is different from theory - in _theory_ it doesn't make any difference if you wait for something asynchronously, but in practice it could make a difference under some circumstances. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
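The throttling concern in the message above can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel's async_schedule(): `ToyAsyncPool`, its `limit` parameter, and the device names are all hypothetical, and the only behaviour modelled is "if too much work is already pending, the new item runs synchronously in the caller".

```python
class ToyAsyncPool:
    """Hypothetical model of a throttled async scheduler."""

    def __init__(self, limit: int) -> None:
        self.limit = limit      # made-up cap on pending items
        self.pending = []       # work that was queued and has not finished

    def schedule(self, name: str) -> str:
        # Over the cap: the caller ends up doing the work itself.
        if len(self.pending) >= self.limit:
            return "sync"
        self.pending.append(name)
        return "async"

    def finish(self, name: str) -> None:
        self.pending.remove(name)


pool = ToyAsyncPool(limit=2)
print(pool.schedule("bridge0"))  # a no-op that sits waiting for its children
print(pool.schedule("bridge1"))  # another pending no-op
print(pool.schedule("disk"))     # real work, but the pool is already full
```

With a cap this small the pending "no-op" bridges push the disk onto the synchronous path; with a realistic cap the effect would rarely show up, which is the point conceded in the message.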
[parent not found: <alpine.LFD.2.00.0912150803250.14385@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912150803250.14385@localhost.localdomain> @ 2009-12-15 18:57 ` Linus Torvalds 2009-12-15 20:26 ` Alan Stern 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 18:57 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Linus Torvalds wrote: > > And even when you know it's PCI, our rules are actually not simple at all. > Our rules for PCI devices (and this strictly speaking is true for bridges > too) are rather complex: > > - do we have _any_ legacy PM support (ie the "direct" driver > suspend/resume functions in the driver ops, rather than having a > "struct dev_pm_ops" pointer)? If so, call "->suspend()" > > - If not - do we have that "dev_pm_ops" thing? If so, call it. > > - If not - just disable the device entirely _UNLESS_ you're a PCI bridge. > > Notice? The way things are set up, if you have no suspend routine, you'll > not get suspended, but you will get disabled. Side note - what I think might be a clean solution for PCI at least is to do something like the following: - move that "disable the device entirely" thing to suspend_late, rather than the earlier suspend phase. Now PCI devices without drivers or PM will not be touched at all in the first suspend phase. - initialize all PCI devices to have 'async_suspend = 1' on discovery - whenever we bind a driver to the PCI device, we'd then look at whether that driver implements suspend/resume callbacks (legacy or new), and clear the async_suspend bit if so. That way we'd have the same old synchronous behavior for all PCI suspend and resume events (unless the driver itself then sets the async_suspend bit at device init time, which it could do, of course), while still always doing async "no-op" events. 
That would avoid the ugly one-liner that just "knows" that PCI bridges are special and don't do anything at suspend time (even though they aren't really - a PCI bridge _could_ have a driver associated with it that does something that might not be happy being asynchronous). Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
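The proposal in the message above (default every PCI device to async, then clear the flag when a driver with suspend/resume callbacks binds) could look roughly like this. It is a Python sketch of the policy only: `ToyDriver`, `ToyPciDevice`, `bind()` and `wants_async` are hypothetical names, and the first step of the proposal (moving the "disable" to suspend_late) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class ToyDriver:
    has_legacy_pm: bool = False   # old-style suspend/resume in the driver ops
    has_dev_pm_ops: bool = False  # new-style struct dev_pm_ops
    wants_async: bool = False     # driver explicitly opts back in at init time


class ToyPciDevice:
    def __init__(self) -> None:
        # On discovery every device starts out async: with no driver bound,
        # nothing would touch it in the first suspend phase.
        self.async_suspend = True

    def bind(self, drv: ToyDriver) -> None:
        # A driver with any suspend/resume callbacks gets the old
        # synchronous behaviour unless it asks for async itself.
        if drv.has_legacy_pm or drv.has_dev_pm_ops:
            self.async_suspend = False
        if drv.wants_async:
            self.async_suspend = True


bridge = ToyPciDevice()   # never bound: stays async
nic = ToyPciDevice()
nic.bind(ToyDriver(has_dev_pm_ops=True))
print(bridge.async_suspend, nic.async_suspend)
```

The attraction of this shape is that no code has to "know" that bridges are special: a bridge that later gains a driver with callbacks would fall back to synchronous handling automatically.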
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912150803250.14385@localhost.localdomain> 2009-12-15 18:57 ` Linus Torvalds @ 2009-12-15 20:26 ` Alan Stern 1 sibling, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-15 20:26 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Linus Torvalds wrote: > It's a very subtle theory, and it's not necessarily always 100% true. For > example, a cardbus bridge is strictly speaking very much a PCI bridge, but > for cardbus bridges we _do_ have a suspend/resume function. > > And perhaps worse than that, cardbus bridges are one of the canonical > examples where two different PCI devices actually share registers. It's > quite common that some of the control registers are shared across the two > subfunctions of a two-slot cardbus controller (and we generally don't even > have full docs for them!) Okay. This obviously implies that if/when cardbus bridges are converted to async suspend/resume, the driver should make sure that the lower-numbered devices wait for their sibling higher-numbered devices to suspend (and vice versa for resume). Awkward though it may be. > > The same goes for devices that don't have suspend or resume methods. > > Yes and no. > > Again, the "async_suspend" flag is done at the generic device layer, but > 99% of all suspend/resume methods are _not_ done at that level: they are > bus-specific functions, where the bus has a generic suspend-resume > function that it exposes to the generic device layer, and that knows about > the bus-specific rules. > > So if you are a PCI device (to take just that example - but it's true of > just about all other buses too), and you don't have any suspend or resume > methods, it's actually impossible to see that fact from the generic device > layer. Sure. That's why the async_suspend flag is set at the bus/driver level. 
> And even when you know it's PCI, our rules are actually not simple at all. > Our rules for PCI devices (and this strictly speaking is true for bridges > too) are rather complex: > > - do we have _any_ legacy PM support (ie the "direct" driver > suspend/resume functions in the driver ops, rather than having a > "struct dev_pm_ops" pointer)? If so, call "->suspend()" > > - If not - do we have that "dev_pm_ops" thing? If so, call it. > > - If not - just disable the device entirely _UNLESS_ you're a PCI bridge. > > Notice? The way things are set up, if you have no suspend routine, you'll > not get suspended, but you will get disabled. > > So it's _not_ actually safe to asynchronously suspend a PCI device if that > device has no driver or no suspend routines - because even in the absense > of a driver and suspend routines, we'll still least disable it. And if > there is some subtle dependency on that device that isn't obvious (say, it > might be used indirectly for some ACPI thing), then that async suspend is > the wrong thing to do. > > Subtle? Hell yes. I don't disagree. However the subtlety lies mainly in the matter of non-obvious dependencies. (The other stuff is all known to the PCI core.) AFAICS there's otherwise little difference between an async routine that does nothing and one that disables the device -- both operations are very fast. The ACPI relations are definitely something to worry about. It would be a good idea, at an early stage, to add those dependencies explicitly. I don't know enough about them to say more; perhaps Rafael does. As for other non-obvious dependencies... Who knows? Probably the only way to find them is by experimentation. My guess is that they will turn out to be connected mostly with "high-level" devices: system devices, things on the motherboard -- generally speaking, stuff close to the CPU. Relatively few will be associated with devices below the level of a PCI device or equivalent. 
Ideally we would figure out how to do the slow devices in parallel without interference from fast devices having unknown dependencies. Unfortunately this may not be possible. > So the whole thing about "we can do PCI bridges asynchronously because > they are obviously no-op" is kind of true - except for the "obviously" > part. It's not obvious at all. It's rather subtle. > > As an example of this kind of subtlety - iirc PCIE bridges used to have > suspend and resume bugs when we initially switched over to the "new world" > suspend/resume exactly because they actually did things at "suspend" time > (rather than suspend_late), and that broke devices behind them (this was > not related to async, of course, but the point is that even when you look > like a PCI bridge, you might be doing odd things). > > So just saying "let's do it asynchronously" is _not_ always guaranteed to > be the right thing at all. It's _probably_ safe for at least regular PCI > bridges. Cardbus bridges? Probably not, but since most modern laptop have > just a single slot - and people who have multiple slots seldom use them > all - most people will probably never see the problems that it _could_ > introduce. > > And PCIE bridges? Should be safe these days, but it wasn't quite as > obvious, because a PCIE bridge actually has a driver unlike a regular > plain PCI-PCI bridge. > > Subtle, subtle. Indeed. Perhaps you were too hasty in suggesting that PCI bridges should be async. It would help a lot to see some device lists for typical machines. (If there are such things.) Otherwise we are just blowing gas. > > There remains a separate question: Should async devices also be forced > > to wait for their children? I don't see why not. For PCI bridges it > > won't make any significant difference. As long as the async code > > doesn't have to do anything, who cares when it runs? > > That's why I just set the "async_resume = 1" thing. > > But there might actually be reasons why we care. 
Like the fact that we > actually throttle the amount of parallel work we do in async_schedule(). > So doing even a "no-op" asynchronously isn't actually a no-op: while it is > pending (and those things can be pending for a long time, since they have > to wait for those slow devices underneath them), it can cause _other_ > async work - that isn't necessarily a no-op at all - to be then done > synchronously. > > Now, admittedly our async throttling limits are high enough that the above > kind of detail will probably never ever really matter (default 256 worker > threads etc). But it's an example of how practice is different from theory > - in _theory_ it doesn't make any difference if you wait for something > asynchronously, but in practice it could make a difference under some > circumstances. We certainly shouldn't be worried about side effects of async throttling at this stage. KISS works both ways: Don't overdesign, and don't worry about things that might crop up when you expand the design. We have strayed off the point of your original objection: not providing a way for devices to skip waiting for their children. This really is a separate issue from deciding whether or not to go async. For example, your proposed patch makes PCI bridges async but doesn't allow them to avoid waiting for children. IMO that's a good thing. The real issue is "blockage": synchronous devices preventing possible concurrency among async devices. That's what you thought making PCI bridges async would help. In general, blockage arises in suspend when you have an async child with a synchronous parent. The parent has to wait for the child, which might take a long time, thereby delaying other unrelated devices. (This explains why you wanted to make PCI bridges async -- they are the parents of USB controllers.) For resume it's the opposite: an async parent with synchronous children. Thus, while making PCI bridges async might make suspend faster, it probably won't help much with resume speed.
You'd have to make the children of USB devices (SCSI hosts, TTYs, and so on) async. Depending on the order of device registration, of course. Apart from all this, there's a glaring hole in the discussion so far. You and Arjan may not have noticed it, but those of us still using rotating media have to put up with disk resume times that are a factor of 100 (!) larger than USB resume times. That's where the greatest gains are to be found. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <Pine.LNX.4.44L0.0912131221210.1111-100000@netrider.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912131221210.1111-100000@netrider.rowland.org> @ 2009-12-13 19:02 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-13 19:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, pm list, LKML On Sun, 13 Dec 2009, Alan Stern wrote: > > Namely that there's no apparent sane way to say "don't wait for children". > > > > PCI bridges that don't suspend at all - or any other device that only > > suspends in the 'suspend_late()' thing, for that matter - don't have any > > reason what-so-ever to wait for children, since they aren't actually > > suspending in the first place. But you make them wait regardless, which > > then serializes things unnecessarily (for example, two unrelated USB > > controllers). > In short, allowing devices to suspend before their children would be > dangerous and probably would not save a significant amount of time. There's more to be said. Even without this "don't wait for children" thing, there can be bad interactions causing unnecessary delays. For example, suppose A (async) is the parent of B (sync), B comes before C (sync) in dpm_list, and C is the parent of D (async). Even if A & B are unrelated to C & D, they will be forced to wait for them. It doesn't matter that A and D are unrelated and so could suspend concurrently. In essence, every synchronous device is treated as though it depends on all the synchronous devices preceding it in dpm_list. That's a lot of unnecessary constraints. At the moment we have no choice, because we have to assume that some of those constraints actually are necessary -- and we don't know which ones. It's an inescapable fact: If there are unnecessary ordering constraints then you generally can't be 100% efficient in carrying out parallel operations.
Compared with all these extra "synchronous" constraints, the relatively small number of "don't need to wait for children" constraints is harmless. I bet that if we got rid of all unnecessary constraints except for making parents always wait for their children, we'd attain more than 95% of the ideal speedup. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
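The A/B/C/D example in the message above can be put into numbers with a deliberately crude timing model. Everything here is an assumption for illustration: the `Dev` class, the cost values, and the simplification that a synchronous device simply occupies the main suspend thread while async devices start as soon as the main thread reaches them and their children are done.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Dev:
    name: str
    is_async: bool
    parent: Optional[str]
    cost: int  # time units the suspend callback takes (made-up numbers)


def suspend_finish_times(dpm_list: List[Dev], list_order_matters: bool) -> Dict[str, int]:
    """Return the time at which each device finishes suspending.

    dpm_list must list parents before children; suspend walks it in reverse.
    """
    finish: Dict[str, int] = {}
    main_clock = 0  # when the main suspend thread is next free
    for dev in reversed(dpm_list):
        children_done = max(
            (finish[d.name] for d in dpm_list if d.parent == dev.name), default=0
        )
        if list_order_matters:
            # Nothing starts before the main thread reaches it in the list.
            start = max(main_clock, children_done)
        else:
            # Idealised case: only the parent/child constraint remains.
            start = children_done
        finish[dev.name] = start + dev.cost
        if list_order_matters and not dev.is_async:
            # A synchronous device occupies the main thread until it is done.
            main_clock = finish[dev.name]
    return finish


devices = [
    Dev("A", True, None, 1),
    Dev("B", False, "A", 1),
    Dev("C", False, None, 1),
    Dev("D", True, "C", 5),  # the slow device
]
current = suspend_finish_times(devices, list_order_matters=True)
ideal = suspend_finish_times(devices, list_order_matters=False)
print(current)  # A finishes at 8: it waited for B, which waited for C, which waited for D
print(ideal)    # A finishes at 2 once the list-order constraint is dropped
```

In this model the parent/child waits alone cost nothing extra (the total is bounded by D followed by C either way); the whole delay to A and B comes from the implied ordering between the two synchronous devices.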
[parent not found: <200912112317.31668.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912112317.31668.rjw@sisk.pl> @ 2009-12-12 0:38 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-12 0:38 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Fri, 11 Dec 2009, Rafael J. Wysocki wrote: > > > .. and I've told you several times that we should simply not do such > > > devices asynchronously. At least not unless there is some _overriding_ > > > reason to. And so far, nobody has suggested anything even remotely > > > likely for that. > > > > Agreed. The fact that async non-tree suspend constraints are difficult > > with rwsems isn't a drawback if nobody needs to use them. > > Well, see my reply to Linus. The only thing that bothers me is that if we use > rwsems, there's no way to handle that even if it turns out that someone > needs them after all. This is now a totally moot point, but I want to make it anyway just to show how perverse life can be. It turns out that by combining some of the worst parts of the rwsem approach and the completion approach, it _is_ possible to have async non-tree suspend constraints with rwsems. The key is to imitate the way the completions work. The resume algorithm doesn't change, but the suspend algorithm does. Currently, when suspending a device you first read-lock the parent (to prevent it from suspending too soon), then you asynchronously write-lock the device and suspend it, and finally read-unlock the parent. Instead, you could first write-lock the device (to prevent the parent and any other dependents from suspending too soon), then asynchronously read-lock each of the children and anything else the device needs to wait for, then suspend the device, and finally write-unlock it. 
This really is analogous to completions: down_write() is like init_completion(), up_write() is like complete_all(), and down_read()+up_read() is like wait_for_completion(). I got the idea from Linus's comment that completions really are nothing but locks initialized in the "locked" state. Of course, you would have to iterate over all the children and deal with lockdep complaints. So this obviously is not to be considered as a serious proposal. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
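[Editorial sketch, not part of the thread.] The analogy drawn above (a completion behaves like a lock that starts out locked) can be modeled in userspace C with a mutex, a condition variable and a flag. This is only a minimal sketch under stated assumptions: struct mini_completion and the mini_* functions are hypothetical names invented for illustration, and this is not how the kernel implements struct completion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct mini_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Plays the role of init_completion(): start in the "locked" state. */
static void mini_init_completion(struct mini_completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Plays the role of complete_all(): release current and future waiters. */
static void mini_complete_all(struct mini_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Plays the role of wait_for_completion(): block until done is set. */
static void mini_wait_for_completion(struct mini_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Plays the role of INIT_COMPLETION(): re-arm before the next cycle. */
static void mini_reinit_completion(struct mini_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = false;
	pthread_mutex_unlock(&c->lock);
}
```

Unlike a userspace rwlock, nothing here cares which thread sets the flag, which is the property the non-owner lock variants try to recover.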
[parent not found: <Pine.LNX.4.44L0.0912102155390.12136-100000@netrider.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912102155390.12136-100000@netrider.rowland.org> @ 2009-12-11 22:17 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-11 22:17 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Friday 11 December 2009, Alan Stern wrote: > Up front: This is my personal view of the matter. Which probably isn't > of much interest to anybody, so I won't bother to defend these views or > comment any further on them. The decision about what version to use is > up to the two of you. The fact is, either implementation would get the > job done. > > On Thu, 10 Dec 2009, Linus Torvalds wrote: > > > Completions really are "locks that were initialized to locked". That is, > > in fact, how completions came to be: we literally used to use semaphores > > for them, and the reason for completions is literally the magic lifetime > > rules they have. > > > > So when you do > > > > INIT_COMPLETION(dev->power.completion); > > > > that really is historically, logically, and conceptually exactly the same > > thing as initializing a lock to the locked state. We literally used to do > > it with the equivalent of > > > > init_MUTEX_LOCKED() > > > > way back when (well, except we didn't have mutexes back then, we had only > > counting semaphores) and instead of "complete()", we had "up()" on the > > semaphore to complete it. > > You think of it that way because you have been closely involved in the > development of the various kinds of locks. Speaking as an outsider who > has relatively little interest in the internal details, completions > appear simpler than rwsems. Mostly because they have a smaller API: > complete() (or complete_all()) and wait_for_completion() as opposed to > down_read(), up_read(), down_write(), and up_write(). Agreed. 
> > > Besides, suppose a device driver wants some off-tree constraints to be > > > satisfied. > > > > .. and I've told you several times that we should simply not do such > > devices asynchronously. At least not unless there is some _overriding_ > > reason to. And so far, nobody has suggested anything even remotely > > likely for that. > > Agreed. The fact that async non-tree suspend constraints are difficult > with rwsems isn't a drawback if nobody needs to use them. Well, see my reply to Linus. The only thing that bothers me is that if we use rwsems, there's no way to handle that even if it turns out that someone needs them after all. > > > Well, why actually do we need to preserve the state of the data structure from > > > one cycle to another? There's no need whatsoever. > > > > My point is, with locks, none of that is necessary. Because they > > automatically do the right thing. > > > > By picking the right concept, you don't have any of those "oh, we need to > > re-initialize things" issues. They just work. > > That's true, but it's not entirely clear. There are subtle questions > about what happens if you stop in the middle or a device gets > unregistered or registered in the middle. They require careful thought > in both approaches. > > Having to reinitialize a completion each time doesn't bother me. It's > merely an indication that each suspend & resume is independent of all > the others. YES! > > > I still don't think there are many places where locks are used in a way you're > > > suggesting. I would even say it's quite unusual to use locks this way. > > > > See above. It's what completions _are_. > > This is almost a philosophical issue. If each A_i must wait for some > B_j's, is the onus on each A_i to test the B_j's it's interested in? > Or is the onus on each B_j to tell the A_i's waiting for it that they > may proceed? As Humpty-Dumpty said, "The question is which is to be > master -- that's all". Agreed. 
> > > Well, I guess your point is that the implementation of completions is much > > > more complicated that we really need, but I'm not sure if that really hurts. > > > > No. The implementation of completions is actually pretty simple, exactly > > because they have that spinlock that is required to protect them. > > > > That wasn't the point. The point was that locks are actually the "normal" > > thing to use. > > > > You are arguing as if completions are somehow the simpler model. That's > > simply not true. Completions are just a _special_case_of_locking_. > > Doesn't that make them simpler by definition? Special cases always > have less to worry about than the general case. Heh, good point. > > So why not just use regular locks instead, when it's actually the natural > > way to do it, and results in simpler code? > > Simpler but also more subtle, IMO. If you didn't already know how the > algorithm worked, figuring it out from the code would be harder with > rwsems than with completions. Indeed. > Partly because of the way readers and > writers exchange roles in suspend vs. resume, and partly because > sometimes devices lock themselves and sometimes they lock other > devices. With completions each device has its own, and each device > waits for other devices' completions -- easier to keep track of > mentally. Agreed again. > (I still think this whole readers vs. writers thing is a red herring. > The essential property is that there are two opposing classes of lock > holders. The fact that multiple writers can't hold the lock at the > same time whereas multiple readers can is of no importance; the > algorithm would work just as well if multiple writers _could_ hold the > lock simultaneously.) > > Balancing the additional conceptual complexity of the rwsem approach is > the conceptual simplicity afforded by not needing to check all the > children. To me this makes it pretty much a toss-up. Yup. Thanks! 
Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
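[Editorial sketch, not part of the thread.] The ordering rule discussed above (on suspend a device waits for its children, on resume it waits for its parent) can be illustrated with a single-threaded model. It assumes a tiny array-based device tree; struct tree_dev, children_done(), parent_done() and run_phase() are hypothetical names, and the repeated scan merely stands in for "whichever async thread is unblocked runs next".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature device: "done" stands in for power.completion. */
struct tree_dev {
	int parent;	/* index of the parent, or -1 for a root */
	bool done;	/* set once this device's callback has finished */
};

/* Suspend rule: proceed only after every child has finished. */
static bool children_done(const struct tree_dev *devs, size_t n, size_t self)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].parent == (int)self && !devs[i].done)
			return false;
	return true;
}

/* Resume rule: proceed only after the parent (if any) has finished. */
static bool parent_done(const struct tree_dev *devs, size_t self)
{
	int p = devs[self].parent;

	return p < 0 || devs[p].done;
}

/* Run one phase and record the order in which devices finish. */
static size_t run_phase(struct tree_dev *devs, size_t n, bool suspend,
			size_t *order)
{
	size_t finished = 0;
	bool progress = true;

	for (size_t i = 0; i < n; i++)
		devs[i].done = false;	/* re-arm, as INIT_COMPLETION() does */

	while (progress) {
		progress = false;
		for (size_t i = 0; i < n; i++) {
			bool ready = suspend ? children_done(devs, n, i)
					     : parent_done(devs, i);

			if (!devs[i].done && ready) {
				devs[i].done = true;
				order[finished++] = i;
				progress = true;
			}
		}
	}
	return finished;
}
```

Each device only ever sets its own flag and reads the flags of others, which is the "easier to keep track of mentally" property mentioned in the message.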
[parent not found: <Pine.LNX.4.44L0.0912101321020.2680-100000@iolanthe.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems)
       [not found] <Pine.LNX.4.44L0.0912101321020.2680-100000@iolanthe.rowland.org>
@ 2009-12-10 23:51 ` Linus Torvalds
  0 siblings, 0 replies; 98+ messages in thread
From: Linus Torvalds @ 2009-12-10 23:51 UTC (permalink / raw)
  To: Alan Stern; +Cc: ACPI Devel Maling List, LKML, pm list

On Thu, 10 Dec 2009, Alan Stern wrote:
>
> You probably didn't look closely at the original code in dpm_suspend()
> and dpm_resume(). It's very awkward; each device is removed from
> dpm_list, operated on, and then added on to a new local list. At the
> end the new list is spliced back into dpm_list.
>
> This approach is better because it doesn't involve changing any list
> pointers while the sleep transition is in progress. At any rate, I
> don't recommend doing it in the same patch as the async stuff; it
> should be done separately. Either before or after -- the two are
> independent.

I do agree with the "independent" part.

But I don't agree about the awkwardness per se. Sure, it moves things back
and forth and has private lists, but that's actually a fairly standard
thing to do in those kinds of situations where you're taking something off
a list, operating on it, and may need to put it back on the same list
eventually. The VM layer does similar things.

So that's why I think your version was actually odder - the existing list
manipulation isn't all that odd. It has that strange "did we get removed
while we dropped the lock and tried to suspend the device" thing, of
course, but that's not entirely unheard of either.

Could it be done more cleanly? I think so, but I agree with you that it's
likely a separate issue. I _suspect_, for example, that we could just do
something like the appended patch to avoid _some_ of the subtlety.

IOW, just move the device to the local list early - and if it gets removed
while being suspended, it will automatically get removed from the local
list (the remover doesn't care _what_ list it is on when it does a
'list_del(power.entry)').

UNTESTED PATCH! This may be total crap, of course. But it _looks_ like an
"ObviousCleanup(tm)" - famous last words.

		Linus

---
 drivers/base/power/main.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 8aa2443..f2bb493 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -687,6 +687,7 @@ static int dpm_suspend(pm_message_t state)
 		struct device *dev = to_device(dpm_list.prev);
 
 		get_device(dev);
+		list_move(&dev->power.entry, &list);
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_suspend(dev, state);
@@ -698,8 +699,6 @@ static int dpm_suspend(pm_message_t state)
 			break;
 		}
 		dev->power.status = DPM_OFF;
-		if (!list_empty(&dev->power.entry))
-			list_move(&dev->power.entry, &list);
 		put_device(dev);
 	}
 	list_splice(&list, dpm_list.prev);

^ permalink raw reply related [flat|nested] 98+ messages in thread
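[Editorial sketch, not part of the thread.] The reason the entry can be moved to the local list early is that unlinking from a circular doubly linked list only touches the entry's own neighbours, so the remover never needs to know which list the entry is on. The sketch below assumes a stripped-down userspace list; struct mini_list and the mini_list_* helpers are hypothetical and only imitate the kernel's list_head API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace imitation of a circular doubly linked list. */
struct mini_list {
	struct mini_list *next, *prev;
};

static void mini_list_init(struct mini_list *h)
{
	h->next = h;
	h->prev = h;
}

static bool mini_list_empty(const struct mini_list *h)
{
	return h->next == h;
}

/* Unlink an entry from whatever list it is on and leave it self-linked. */
static void mini_list_del_init(struct mini_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	mini_list_init(e);
}

/* Insert an entry right after the list head. */
static void mini_list_add(struct mini_list *e, struct mini_list *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Move an entry from its current list to the front of another list. */
static void mini_list_move(struct mini_list *e, struct mini_list *head)
{
	mini_list_del_init(e);
	mini_list_add(e, head);
}
```

With this shape, a device that is unregistered while its suspend callback runs simply drops out of the private list, and the later splice never sees it.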
[parent not found: <Pine.LNX.4.44L0.0912101653120.2680-100000@iolanthe.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912101653120.2680-100000@iolanthe.rowland.org> @ 2009-12-10 23:45 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-10 23:45 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thursday 10 December 2009, Alan Stern wrote: > On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > > > You should see how badly lockdep complains about the rwsems. If it > > > really doesn't like them then using completions makes sense. > > > > It does complain about them, but when the nested _down operations are marked > > as nested, it stops complaining (that's in the version where there's no async > > in the _noirq phases). > > Did you set the async_suspend flag for any devices during the test? Yes. All ACPI, all PCI, all serio, as usual. ;-) > And did you run more than one suspend/resume cycle? Sure. Actually, I test it in the /sys/power/pm_test = core mode, but that shouldn't really matter. > > +extern int __dpm_wait(struct device *dev, void *ign); > > + > > +static inline void dpm_wait(struct device *dev) > > +{ > > + __dpm_wait(dev, NULL); > > +} > > Sorry, I intended to mention this before but forgot. This design is > inelegant. You shouldn't have inlines calling functions with extra > unused arguments; they just waste code space. Make dpm_wait() be a > real routine and add a shim to the device_for_each_child() loop. I thought about that myself, done now. 
> > @@ -366,7 +388,7 @@ void dpm_resume_noirq(pm_message_t state > > > > mutex_lock(&dpm_list_mtx); > > transition_started = false; > > - list_for_each_entry(dev, &dpm_list, power.entry) > > + list_for_each_entry(dev, &dpm_list, power.entry) { > > if (dev->power.status > DPM_OFF) { > > int error; > > > > @@ -375,23 +397,27 @@ void dpm_resume_noirq(pm_message_t state > > if (error) > > pm_dev_err(dev, state, " early", error); > > } > > + /* Needed by the subsequent dpm_resume(). */ > > + INIT_COMPLETION(dev->power.completion); > > You're still doing it. Don't initialize the completions in a totally > different phase! Initialize them directly before they are used. > Namely, at the start of device_resume() and device_suspend(). The idea was to initialize them all at the same time, before entering the phase in which they were used, but I came to the conclusion that this was not necessary, because the dpm_list ordering was such that the devices to be waited for would always have their completions reinitialized before starting __device_suspend() or __device_resume() for the waiting ones. > One more thing. A logical time to check for errors is just after > waiting for the children in __device_suspend(), instead of beforehand > in async_suspend(). After all, if an error occurs then it's likely to > happen while we are waiting. Good idea, done. Updated patch is appended. Rafael --- drivers/base/power/main.c | 106 ++++++++++++++++++++++++++++++++++++++++--- include/linux/device.h | 6 ++ include/linux/pm.h | 7 ++ include/linux/resume-trace.h | 7 ++ 4 files changed, 121 insertions(+), 5 deletions(-) Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/completion.h> /* * Callbacks for platform drivers to implement. 
@@ -412,9 +413,11 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + struct completion completion; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; @@ -508,6 +511,8 @@ extern void __suspend_report_result(cons __suspend_report_result(__func__, fn, ret); \ } while (0) +extern void dpm_wait(struct device *dev); + #else /* !CONFIG_PM_SLEEP */ #define device_pm_lock() do {} while (0) @@ -520,6 +525,8 @@ static inline int dpm_suspend_start(pm_m #define suspend_report_result(fn, ret) do {} while (0) +static inline void dpm_wait(struct device *dev) {} + #endif /* !CONFIG_PM_SLEEP */ /* How to reorder dpm_list after device_move() */ Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> #include "../base.h" #include "power.h" @@ -42,6 +43,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once the preparation of devices for a PM transition has started, reset @@ -56,6 +58,7 @@ static bool transition_started; void device_pm_init(struct device *dev) { dev->power.status = DPM_ON; + init_completion(&dev->power.completion); pm_runtime_init(dev); } @@ -111,6 +114,7 @@ void device_pm_remove(struct device *dev pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); + complete_all(&dev->power.completion); mutex_lock(&dpm_list_mtx); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); @@ -162,6 +166,28 @@ void device_pm_move_last(struct device * } /** + * dpm_wait - Wait for a PM operation to complete. 
+ * @dev: Device to wait for. + */ +void dpm_wait(struct device *dev) +{ + if (dev) + wait_for_completion(&dev->power.completion); +} +EXPORT_SYMBOL_GPL(dpm_wait); + +static int dpm_wait_fn(struct device *dev, void *ignore) +{ + dpm_wait(dev); + return 0; +} + +static void dpm_wait_for_children(struct device *dev) +{ + device_for_each_child(dev, NULL, dpm_wait_fn); +} + +/** * pm_op - Execute the PM operation appropriate for given PM event. * @dev: Device to handle. * @ops: PM operations to choose from. @@ -381,17 +407,18 @@ void dpm_resume_noirq(pm_message_t state EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. */ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state) { int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); + dpm_wait(dev->parent); down(&dev->sem); if (dev->bus) { @@ -426,11 +453,34 @@ static int device_resume(struct device * } End: up(&dev->sem); + complete_all(&dev->power.completion); TRACE_RESUME(error); return error; } +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_resume(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async", error); + put_device(dev); +} + +static int device_resume(struct device *dev) +{ + if (dev->power.async_suspend && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + return __device_resume(dev, pm_transition); +} + /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. 
@@ -444,6 +494,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -451,10 +502,11 @@ static void dpm_resume(pm_message_t stat if (dev->power.status >= DPM_OFF) { int error; + INIT_COMPLETION(dev->power.completion); dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -469,6 +521,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); } /** @@ -623,17 +676,23 @@ int dpm_suspend_noirq(pm_message_t state } EXPORT_SYMBOL_GPL(dpm_suspend_noirq); +static int async_error; + /** * device_suspend - Execute "suspend" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. */ -static int device_suspend(struct device *dev, pm_message_t state) +static int __device_suspend(struct device *dev, pm_message_t state) { int error = 0; + dpm_wait_for_children(dev); down(&dev->sem); + if (async_error) + goto End; + if (dev->class) { if (dev->class->pm) { pm_dev_dbg(dev, state, "class "); @@ -666,12 +725,42 @@ static int device_suspend(struct device suspend_report_result(dev->bus->suspend, error); } } + + if (!error) + dev->power.status = DPM_OFF; + End: up(&dev->sem); + complete_all(&dev->power.completion); return error; } +static void async_suspend(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_suspend(dev, pm_transition); + if (error) { + pm_dev_err(dev, pm_transition, " async", error); + async_error = error; + } + + put_device(dev); +} + +static int device_suspend(struct device *dev, pm_message_t state) +{ + if (dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + return 0; + } + + 
return __device_suspend(dev, pm_transition); +} + /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. @@ -683,10 +772,12 @@ static int dpm_suspend(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); get_device(dev); + INIT_COMPLETION(dev->power.completion); mutex_unlock(&dpm_list_mtx); error = device_suspend(dev, state); @@ -697,13 +788,17 @@ static int dpm_suspend(pm_message_t stat put_device(dev); break; } - dev->power.status = DPM_OFF; if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); + if (async_error) + break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); + if (!error) + error = async_error; return error; } @@ -762,6 +857,7 @@ static int dpm_prepare(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); transition_started = true; + async_error = 0; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); Index: linux-2.6/include/linux/resume-trace.h =================================================================== --- linux-2.6.orig/include/linux/resume-trace.h +++ linux-2.6/include/linux/resume-trace.h @@ -6,6 +6,11 @@ extern int pm_trace_enabled; +static inline int pm_trace_is_enabled(void) +{ + return pm_trace_enabled; +} + struct device; extern void set_trace_device(struct device *); extern void generate_resume_trace(const void *tracedata, unsigned int user); @@ -17,6 +22,8 @@ extern void generate_resume_trace(const #else +static inline int pm_trace_is_enabled(void) { return 0; } + #define TRACE_DEVICE(dev) do { } while (0) #define TRACE_RESUME(dev) do { } while (0) Index: linux-2.6/include/linux/device.h =================================================================== --- linux-2.6.orig/include/linux/device.h +++ 
linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* ^ permalink raw reply [flat|nested] 98+ messages in thread
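[Editorial sketch, not part of the thread.] The error handling in the patch above follows one pattern: a shared error variable is checked after the wait for children, the callback is skipped once any device has failed, and the completion is signalled in every case so that no waiter hangs. The model below is a simplification under stated assumptions: it is single-threaded, it records the error inside the suspend routine itself rather than in a separate async wrapper, and struct err_dev, sketch_async_error, suspend_ok() and suspend_fail() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* First error reported by any suspend callback (0 means none so far). */
static int sketch_async_error;

/* Hypothetical miniature device with a suspend callback and two flags. */
struct err_dev {
	int (*suspend)(void);
	bool suspended;	/* callback ran and succeeded */
	bool completed;	/* stands in for complete_all() having been called */
};

static int suspend_ok(void)
{
	return 0;
}

static int suspend_fail(void)
{
	return -5;	/* arbitrary nonzero error code for illustration */
}

static int sketch_device_suspend(struct err_dev *dev)
{
	int error = 0;

	/* Waiting for the children would happen here, before the check. */
	if (sketch_async_error)
		goto end;	/* someone already failed: skip the callback */

	error = dev->suspend();
	if (error)
		sketch_async_error = error;
	else
		dev->suspended = true;
end:
	dev->completed = true;	/* always signal, even on the skip path */
	return error;
}
```

Devices processed after the first failure are left unsuspended but still marked completed, which matches the intent of unblocking parents without running further callbacks.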
[parent not found: <200912102214.40310.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <200912102214.40310.rjw@sisk.pl> @ 2009-12-10 22:17 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-10 22:17 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > You should see how badly lockdep complains about the rwsems. If it > > really doesn't like them then using completions makes sense. > > It does complain about them, but when the nested _down operations are marked > as nested, it stops complaining (that's in the version where there's no async > in the _noirq phases). Did you set the async_suspend flag for any devices during the test? And did you run more than one suspend/resume cycle? > +extern int __dpm_wait(struct device *dev, void *ign); > + > +static inline void dpm_wait(struct device *dev) > +{ > + __dpm_wait(dev, NULL); > +} Sorry, I intended to mention this before but forgot. This design is inelegant. You shouldn't have inlines calling functions with extra unused arguments; they just waste code space. Make dpm_wait() be a real routine and add a shim to the device_for_each_child() loop. > @@ -366,7 +388,7 @@ void dpm_resume_noirq(pm_message_t state > > mutex_lock(&dpm_list_mtx); > transition_started = false; > - list_for_each_entry(dev, &dpm_list, power.entry) > + list_for_each_entry(dev, &dpm_list, power.entry) { > if (dev->power.status > DPM_OFF) { > int error; > > @@ -375,23 +397,27 @@ void dpm_resume_noirq(pm_message_t state > if (error) > pm_dev_err(dev, state, " early", error); > } > + /* Needed by the subsequent dpm_resume(). */ > + INIT_COMPLETION(dev->power.completion); You're still doing it. Don't initialize the completions in a totally different phase! Initialize them directly before they are used. Namely, at the start of device_resume() and device_suspend(). One more thing. 
A logical time to check for errors is just after waiting for the children in __device_suspend(), instead of beforehand in async_suspend(). After all, if an error occurs then it's likely to happen while we are waiting. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
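[Editorial sketch, not part of the thread.] The suggestion above (make the wait function a real one-argument routine and put a small shim in front of the child iterator) can be shown with a userspace model. Everything here is an assumption made for illustration: struct shim_dev, for_each_child(), wait_for() and wait_fn() are hypothetical, and a counter stands in for the actual blocking wait.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature device with a fixed-size child array. */
struct shim_dev {
	struct shim_dev *children[4];
	int nchildren;
	int waited;	/* counts how often this device was waited for */
};

/* Model of a child iterator taking a two-argument callback; it stops at
 * the first nonzero return value. */
static int for_each_child(struct shim_dev *parent, void *data,
			  int (*fn)(struct shim_dev *, void *))
{
	int ret = 0;

	for (int i = 0; i < parent->nchildren && !ret; i++)
		ret = fn(parent->children[i], data);
	return ret;
}

/* The real routine: one argument, safe to call with NULL (no parent). */
static void wait_for(struct shim_dev *dev)
{
	if (dev)
		dev->waited++;	/* stands in for the blocking wait */
}

/* The shim: adapts wait_for() to the iterator's callback signature. */
static int wait_fn(struct shim_dev *dev, void *ignore)
{
	(void)ignore;
	wait_for(dev);
	return 0;
}

static void wait_for_children(struct shim_dev *dev)
{
	for_each_child(dev, NULL, wait_fn);
}
```

Direct callers use wait_for() with no dummy argument, and only the iteration path pays for the extra parameter.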
[parent not found: <Pine.LNX.4.44L0.0912101010090.2825-100000@iolanthe.rowland.org>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems)
       [not found] <Pine.LNX.4.44L0.0912101010090.2825-100000@iolanthe.rowland.org>
@ 2009-12-10 15:45 ` Linus Torvalds
  2009-12-10 21:14 ` Rafael J. Wysocki
  1 sibling, 0 replies; 98+ messages in thread
From: Linus Torvalds @ 2009-12-10 15:45 UTC (permalink / raw)
  To: Alan Stern; +Cc: ACPI Devel Maling List, LKML, pm list

On Thu, 10 Dec 2009, Alan Stern wrote:
>
> In device_pm_remove():
>
> 	mutex_lock(&dpm_list_mtx);
> 	if (dev == dpm_next)
> 		dpm_next = to_device(dpm_iterate_forward ?
> 			dev->power.entry.next : dev->power.entry.prev);
> 	list_del_init(&dev->power.entry);
> 	mutex_unlock(&dpm_list_mtx);

I'm really not seeing the point - it's much better to hardcode the
ordering in the place you use it (where it is static and the compiler can
generate better code) than to do some dynamic choice that depends on some
fake flag - especially a global one.

Also, quite frankly, error handling needs to be separated out of the whole
async patch, and needs to be thought about a lot more. And I would
seriously argue that if you have any async suspends, then those async
suspends are _not_ allowed to fail. At least not initially.

Having async failures and trying to fix them up is just a disaster. Which
ones actually failed, and which ones were aborted before they even really
got to their suspend routines? Which ones do you try to resume?

IOW, it needs way more thought than what has clearly happened so far. And
once more, I will refuse to merge anything that is complicated for no
actual reason (where reason is "real life, and tested to make a big
difference", not some hand-waving).

		Linus

^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912101010090.2825-100000@iolanthe.rowland.org> 2009-12-10 15:45 ` Linus Torvalds @ 2009-12-10 21:14 ` Rafael J. Wysocki 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-10 21:14 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thursday 10 December 2009, Alan Stern wrote: > On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > > > How about CONFIG_PROVE_LOCKING? If lockdep really does start > > > complaining then switching to completions would be a simple way to > > > appease it. > > > > Ah, that one is not set. I guess I'll try it later, although I've already > > decided to use completions anyway. > > You should see how badly lockdep complains about the rwsems. If it > really doesn't like them then using completions makes sense. It does complain about them, but when the nested _down operations are marked as nested, it stops complaining (that's in the version where there's no async in the _noirq phases). > > Index: linux-2.6/drivers/base/power/main.c > > =================================================================== > > --- linux-2.6.orig/drivers/base/power/main.c > > +++ linux-2.6/drivers/base/power/main.c > > @@ -56,6 +58,7 @@ static bool transition_started; > > void device_pm_init(struct device *dev) > > { > > dev->power.status = DPM_ON; > > + init_completion(&dev->power.completion); > > pm_runtime_init(dev); > > } > > You need a matching complete_all() in device_pm_remove(), in case > someone else is waiting for the device when it gets unregistered. Right, added. > > +/** > > + * dpm_synchronize - Wait for PM callbacks of all devices to complete. 
> > + */ > > +static void dpm_synchronize(void) > > +{ > > + struct device *dev; > > + > > + async_synchronize_full(); > > + > > + mutex_lock(&dpm_list_mtx); > > + list_for_each_entry(dev, &dpm_list, power.entry) > > + INIT_COMPLETION(dev->power.completion); > > + mutex_unlock(&dpm_list_mtx); > > +} > > I agree with Linus, initializing the completions here is weird. You > should initialize them just before using them. I removed that completely and now the INIT_COMPLETION() is always done in the preceding phase. > > @@ -683,6 +786,7 @@ static int dpm_suspend(pm_message_t stat > > > > INIT_LIST_HEAD(&list); > > mutex_lock(&dpm_list_mtx); > > + pm_transition = state; > > while (!list_empty(&dpm_list)) { > > struct device *dev = to_device(dpm_list.prev); > > > > @@ -697,13 +801,18 @@ static int dpm_suspend(pm_message_t stat > > put_device(dev); > > break; > > } > > - dev->power.status = DPM_OFF; > > if (!list_empty(&dev->power.entry)) > > list_move(&dev->power.entry, &list); > > put_device(dev); > > + error = atomic_read(&async_error); > > + if (error) > > + break; > > } > > list_splice(&list, dpm_list.prev); > > Here's something you might want to do in a later patch. These awkward > list-pointer manipulations can be simplified as follows: Well, I'm not sure if that's more straightforward. Anyway, as you said, that's something for a different patch. :-) Below is an updated version of the $subject one. I don't use the atomic_t for async_error any more and (apart from this fixed issue) I don't see any problems in the suspend error path now. 
Rafael --- drivers/base/power/main.c | 113 ++++++++++++++++++++++++++++++++++++++++--- include/linux/device.h | 6 ++ include/linux/pm.h | 12 ++++ include/linux/resume-trace.h | 7 ++ 4 files changed, 131 insertions(+), 7 deletions(-) Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/completion.h> /* * Callbacks for platform drivers to implement. @@ -412,9 +413,11 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + struct completion completion; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; @@ -508,6 +511,13 @@ extern void __suspend_report_result(cons __suspend_report_result(__func__, fn, ret); \ } while (0) +extern int __dpm_wait(struct device *dev, void *ign); + +static inline void dpm_wait(struct device *dev) +{ + __dpm_wait(dev, NULL); +} + #else /* !CONFIG_PM_SLEEP */ #define device_pm_lock() do {} while (0) @@ -520,6 +530,8 @@ static inline int dpm_suspend_start(pm_m #define suspend_report_result(fn, ret) do {} while (0) +static inline void dpm_wait(struct device *dev) {} + #endif /* !CONFIG_PM_SLEEP */ /* How to reorder dpm_list after device_move() */ Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> #include "../base.h" #include "power.h" @@ -42,6 +43,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once 
the preparation of devices for a PM transition has started, reset @@ -56,6 +58,7 @@ static bool transition_started; void device_pm_init(struct device *dev) { dev->power.status = DPM_ON; + init_completion(&dev->power.completion); pm_runtime_init(dev); } @@ -111,6 +114,7 @@ void device_pm_remove(struct device *dev pr_debug("PM: Removing info for %s:%s\n", dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); + complete_all(&dev->power.completion); mutex_lock(&dpm_list_mtx); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); @@ -162,6 +166,24 @@ void device_pm_move_last(struct device * } /** + * __dpm_wait - Wait for a PM operation to complete. + * @dev: Device to wait for. + * @ign: This value is not used by the function. + */ +int __dpm_wait(struct device *dev, void *ign) +{ + if (dev) + wait_for_completion(&dev->power.completion); + return 0; +} +EXPORT_SYMBOL_GPL(__dpm_wait); + +static void dpm_wait_for_children(struct device *dev) +{ + device_for_each_child(dev, NULL, __dpm_wait); +} + +/** * pm_op - Execute the PM operation appropriate for given PM event. * @dev: Device to handle. * @ops: PM operations to choose from. @@ -366,7 +388,7 @@ void dpm_resume_noirq(pm_message_t state mutex_lock(&dpm_list_mtx); transition_started = false; - list_for_each_entry(dev, &dpm_list, power.entry) + list_for_each_entry(dev, &dpm_list, power.entry) { if (dev->power.status > DPM_OFF) { int error; @@ -375,23 +397,27 @@ void dpm_resume_noirq(pm_message_t state if (error) pm_dev_err(dev, state, " early", error); } + /* Needed by the subsequent dpm_resume(). */ + INIT_COMPLETION(dev->power.completion); + } mutex_unlock(&dpm_list_mtx); resume_device_irqs(); } EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. 
*/ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state) { int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); + dpm_wait(dev->parent); down(&dev->sem); if (dev->bus) { @@ -426,11 +452,34 @@ static int device_resume(struct device * } End: up(&dev->sem); + complete_all(&dev->power.completion); TRACE_RESUME(error); return error; } +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_resume(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async", error); + put_device(dev); +} + +static int device_resume(struct device *dev) +{ + if (dev->power.async_suspend && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + return __device_resume(dev, pm_transition); +} + /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. @@ -444,6 +493,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -454,7 +504,7 @@ static void dpm_resume(pm_message_t stat dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -469,6 +519,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); } /** @@ -623,15 +674,18 @@ int dpm_suspend_noirq(pm_message_t state } EXPORT_SYMBOL_GPL(dpm_suspend_noirq); +static int async_error; + /** * device_suspend - Execute "suspend" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. 
*/ -static int device_suspend(struct device *dev, pm_message_t state) +static int __device_suspend(struct device *dev, pm_message_t state) { int error = 0; + dpm_wait_for_children(dev); down(&dev->sem); if (dev->class) { @@ -666,12 +720,48 @@ static int device_suspend(struct device suspend_report_result(dev->bus->suspend, error); } } + + if (!error) + dev->power.status = DPM_OFF; + End: up(&dev->sem); + complete_all(&dev->power.completion); return error; } +static void async_suspend(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + if (async_error) { + complete_all(&dev->power.completion); + goto End; + } + + error = __device_suspend(dev, pm_transition); + if (error) { + pm_dev_err(dev, pm_transition, " async", error); + async_error = error; + } + + End: + put_device(dev); +} + +static int device_suspend(struct device *dev, pm_message_t state) +{ + if (dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + return 0; + } + + return __device_suspend(dev, pm_transition); +} + /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. 
@@ -683,6 +773,7 @@ static int dpm_suspend(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); @@ -697,13 +788,17 @@ static int dpm_suspend(pm_message_t stat put_device(dev); break; } - dev->power.status = DPM_OFF; if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); + if (async_error) + break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); + if (!error) + error = async_error; return error; } @@ -762,6 +857,7 @@ static int dpm_prepare(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); transition_started = true; + async_error = 0; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -793,8 +889,11 @@ static int dpm_prepare(pm_message_t stat break; } dev->power.status = DPM_SUSPENDING; - if (!list_empty(&dev->power.entry)) + if (!list_empty(&dev->power.entry)) { list_move_tail(&dev->power.entry, &list); + /* Needed by the subsequent dpm_suspend(). 
*/ + INIT_COMPLETION(dev->power.completion); + } put_device(dev); } list_splice(&list, &dpm_list); Index: linux-2.6/include/linux/resume-trace.h =================================================================== --- linux-2.6.orig/include/linux/resume-trace.h +++ linux-2.6/include/linux/resume-trace.h @@ -6,6 +6,11 @@ extern int pm_trace_enabled; +static inline int pm_trace_is_enabled(void) +{ + return pm_trace_enabled; +} + struct device; extern void set_trace_device(struct device *); extern void generate_resume_trace(const void *tracedata, unsigned int user); @@ -17,6 +22,8 @@ extern void generate_resume_trace(const #else +static inline int pm_trace_is_enabled(void) { return 0; } + #define TRACE_DEVICE(dev) do { } while (0) #define TRACE_RESUME(dev) do { } while (0) Index: linux-2.6/include/linux/device.h =================================================================== --- linux-2.6.orig/include/linux/device.h +++ linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912100739260.3560@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <alpine.LFD.2.00.0912100739260.3560@localhost.localdomain> @ 2009-12-10 18:37 ` Alan Stern 0 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-10 18:37 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Thu, 10 Dec 2009, Linus Torvalds wrote: > > > On Thu, 10 Dec 2009, Alan Stern wrote: > > > > In device_pm_remove(): > > > > mutex_lock(&dpm_list_mtx); > > if (dev == dpm_next) > > dpm_next = to_device(dpm_iterate_forward ? > > dev->power.entry.next : dev->power.entry.prev); > > list_del_init(&dev->power.entry); > > mutex_unlock(&dpm_list_mtx); > > I'm really not seeing the point - it's much better to hardcode the > ordering in the place you use it (where it is static and the compiler can > generate better code) than to do some dynamic choice that depends on some > fake flag - especially a global one. You probably didn't look closely at the original code in dpm_suspend() and dpm_resume(). It's very awkward; each device is removed from dpm_list, operated on, and then added on to a new local list. At the end the new list is spliced back into dpm_list. This approach is better because it doesn't involve changing any list pointers while the sleep transition is in progress. At any rate, I don't recommend doing it in the same patch as the async stuff; it should be done separately. Either before or after -- the two are independent. > Also, quite frankly, error handling needs to be separated out of the whole > async patch, and needs to be thought about a lot more. And I would > seriously argue that if you have any async suspends, then those async > suspends are _not_ allowed to fail. At least not initially > > Having async failures and trying to fix them up is just a disaster. Which > ones actually failed, and which ones were aborted before they even really > got to their suspend routines? 
Which ones do you try to resume? We record the status of each device; dev->power.status stores different values depending on whether the device suspend succeeded or failed. The value will be correct and up-to-date after async_synchronize_full() returns. The value is used in dpm_resume() to decide which devices need their resume methods called. I don't see any problems there. > IOW, it needs way more thought than what has clearly happened so far. And > once more, I will refuse to merge anything that is complicated for no > actual reason (where reason is "real life, and tested to make a big > difference", not some hand-waving) I don't think the error handling requires more than minimal changes. The whole atomic_t thing was overkill. It probably stemmed from a discussion some time back with Pavel Machek about concurrent writes to a single variable. I claimed that concurrent writes to a properly aligned pointer, int, or long would never create a "mash-up"; that is, readers would see either the original value or one of the new values but never some weird combination of bits. Alan Cox pointed out that while this was technically correct, there's nothing to prevent the compiler from translating a = b + c; into something like: load b, R1 store R1, a load c, R1 add R1, a in which case readers might see the intermediate value. (Okay, the compiler would have to be pretty stupid to do this with such a simple expression, but it could happen with more complicated expressions.) Pavel favored always using atomic types when there could be concurrent writes, and apparently Rafael was following his advice. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
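The intermediate-value hazard Alan Cox raised can be illustrated with a small userspace sketch. This is not kernel code and is not taken from the patch: it assumes a C11 compiler with <stdatomic.h>, and the names shared_value, publish_sum and read_published are hypothetical. The idea is only that the result is computed in a private temporary and then published with exactly one store, so a concurrent reader can observe the old value or the new one but never a half-built one.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared slot that several threads may read or write. */
atomic_int shared_value;

/*
 * Compute b + c in a private temporary, then publish it with a single
 * relaxed store. Because the store is one atomic operation on an
 * atomic_int, the compiler may not split it into "store b, then add c".
 */
void publish_sum(int b, int c)
{
	int tmp = b + c;

	atomic_store_explicit(&shared_value, tmp, memory_order_relaxed);
}

/* A reader sees either the previous value or the fully computed one. */
int read_published(void)
{
	return atomic_load_explicit(&shared_value, memory_order_relaxed);
}
```

Relaxed ordering is used here on purpose: the sketch only addresses whether a single variable can show a torn or intermediate value, not how the write is ordered against other memory accesses.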
[parent not found: <Pine.LNX.4.44L0.0912091729530.2672-100000@iolanthe.rowland.org>]
* Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] <Pine.LNX.4.44L0.0912091729530.2672-100000@iolanthe.rowland.org> @ 2009-12-09 23:18 ` Rafael J. Wysocki [not found] ` <200912100018.19723.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-09 23:18 UTC (permalink / raw) To: Alan Stern; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Wednesday 09 December 2009, Alan Stern wrote: > On Wed, 9 Dec 2009, Rafael J. Wysocki wrote: > > > On Wednesday 09 December 2009, Alan Stern wrote: > > > On Tue, 8 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > For completeness, below is the full async suspend/resume patch with rwlocks, > > > > that has been (very slightly) tested and doesn't seem to break things. > > > > > > > > [Note to Alan: lockdep doesn't seem to complain about the not annotated nested > > > > locks.] > > > > > > I can't imagine why not. And wouldn't lockdep get confused by the fact > > > that in the async case, the rwsems are released by a different process > > > from the one that acquired them? > > > > /me looks at the .config > > > > I have CONFIG_LOCKDEP_SUPPORT set, is there anything else I need to set > > in .config? > > How about CONFIG_PROVE_LOCKING? If lockdep really does start > complaining then switching to completions would be a simple way to > appease it. Ah, that one is not set. I guess I'll try it later, although I've already decided to use completions anyway. ... > > > How about exporting a wait_for_device_to_resume() routine? Drivers > > > could call it for non-tree resume constraints: > > > > > > void wait_for_device_to_resume(struct device *other) > > > { > > > down_read(&other->power.rwsem); > > > up_read(&other->power.rwsem); > > > } > > > > > > Unfortunately there is no equivalent for non-tree suspend constraints. 
> > > > If we use completions, it will be possible to just export something like > > > > dpm_wait(dev) > > { > > if (dev) > > wait_for_completion(dev->power.completion); > > } > > > > I think. It appears that will also work for suspend, unless I'm missing > > something. > > It will. Completions it is, then. Additionally, I've removed the async support from the _noirq parts and moved the setting of power.status on suspend to __device_suspend(). The result is appended. Rafael --- drivers/base/power/main.c | 124 ++++++++++++++++++++++++++++++++++++++++--- include/linux/device.h | 6 ++ include/linux/pm.h | 12 ++++ include/linux/resume-trace.h | 7 ++ 4 files changed, 143 insertions(+), 6 deletions(-) Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/completion.h> /* * Callbacks for platform drivers to implement. 
@@ -412,9 +413,11 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + struct completion completion; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; @@ -508,6 +511,13 @@ extern void __suspend_report_result(cons __suspend_report_result(__func__, fn, ret); \ } while (0) +extern int __dpm_wait(struct device *dev, void *ign); + +static inline void dpm_wait(struct device *dev) +{ + __dpm_wait(dev, NULL); +} + #else /* !CONFIG_PM_SLEEP */ #define device_pm_lock() do {} while (0) @@ -520,6 +530,8 @@ static inline int dpm_suspend_start(pm_m #define suspend_report_result(fn, ret) do {} while (0) +static inline void dpm_wait(struct device *dev) {} + #endif /* !CONFIG_PM_SLEEP */ /* How to reorder dpm_list after device_move() */ Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> #include "../base.h" #include "power.h" @@ -42,6 +43,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once the preparation of devices for a PM transition has started, reset @@ -56,6 +58,7 @@ static bool transition_started; void device_pm_init(struct device *dev) { dev->power.status = DPM_ON; + init_completion(&dev->power.completion); pm_runtime_init(dev); } @@ -162,6 +165,39 @@ void device_pm_move_last(struct device * } /** + * __dpm_wait - Wait for a PM operation to complete. + * @dev: Device to wait for. + * @ign: This value is not used by the function. 
+ */ +int __dpm_wait(struct device *dev, void *ign) +{ + if (dev) + wait_for_completion(&dev->power.completion); + return 0; +} +EXPORT_SYMBOL_GPL(__dpm_wait); + +static void dpm_wait_for_children(struct device *dev) +{ + device_for_each_child(dev, NULL, __dpm_wait); +} + +/** + * dpm_synchronize - Wait for PM callbacks of all devices to complete. + */ +static void dpm_synchronize(void) +{ + struct device *dev; + + async_synchronize_full(); + + mutex_lock(&dpm_list_mtx); + list_for_each_entry(dev, &dpm_list, power.entry) + INIT_COMPLETION(dev->power.completion); + mutex_unlock(&dpm_list_mtx); +} + +/** * pm_op - Execute the PM operation appropriate for given PM event. * @dev: Device to handle. * @ops: PM operations to choose from. @@ -381,17 +417,18 @@ void dpm_resume_noirq(pm_message_t state EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. 
*/ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state) { int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); + dpm_wait(dev->parent); down(&dev->sem); if (dev->bus) { @@ -426,11 +463,34 @@ static int device_resume(struct device * } End: up(&dev->sem); + complete_all(&dev->power.completion); TRACE_RESUME(error); return error; } +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_resume(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async", error); + put_device(dev); +} + +static int device_resume(struct device *dev) +{ + if (dev->power.async_suspend && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + return __device_resume(dev, pm_transition); +} + /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. @@ -444,6 +504,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -454,7 +515,7 @@ static void dpm_resume(pm_message_t stat dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -469,6 +530,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + dpm_synchronize(); } /** @@ -533,6 +595,8 @@ static void dpm_complete(pm_message_t st mutex_unlock(&dpm_list_mtx); } +static atomic_t async_error; + /** * dpm_resume_end - Execute "resume" callbacks and complete system transition. * @state: PM transition of the system being carried out. @@ -628,10 +692,11 @@ EXPORT_SYMBOL_GPL(dpm_suspend_noirq); * @dev: Device to handle. 
* @state: PM transition of the system being carried out. */ -static int device_suspend(struct device *dev, pm_message_t state) +static int __device_suspend(struct device *dev, pm_message_t state) { int error = 0; + dpm_wait_for_children(dev); down(&dev->sem); if (dev->class) { @@ -666,12 +731,50 @@ static int device_suspend(struct device suspend_report_result(dev->bus->suspend, error); } } + + if (!error) + dev->power.status = DPM_OFF; + End: up(&dev->sem); + complete_all(&dev->power.completion); return error; } +static void async_suspend(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error = atomic_read(&async_error); + + if (error) { + complete_all(&dev->power.completion); + goto End; + } + + error = __device_suspend(dev, pm_transition); + if (error) { + pm_dev_err(dev, pm_transition, " async", error); + atomic_set(&async_error, error); + } + + End: + put_device(dev); +} + +static int device_suspend(struct device *dev, pm_message_t state) +{ + int error; + + if (dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + return 0; + } + + return __device_suspend(dev, pm_transition); +} + /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. 
@@ -683,6 +786,7 @@ static int dpm_suspend(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); @@ -697,13 +801,18 @@ static int dpm_suspend(pm_message_t stat put_device(dev); break; } - dev->power.status = DPM_OFF; if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); + error = atomic_read(&async_error); + if (error) + break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); + dpm_synchronize(); + if (!error) + error = atomic_read(&async_error); return error; } @@ -762,6 +871,7 @@ static int dpm_prepare(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); transition_started = true; + atomic_set(&async_error, 0); while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -793,8 +903,10 @@ static int dpm_prepare(pm_message_t stat break; } dev->power.status = DPM_SUSPENDING; - if (!list_empty(&dev->power.entry)) + if (!list_empty(&dev->power.entry)) { list_move_tail(&dev->power.entry, &list); + INIT_COMPLETION(dev->power.completion); + } put_device(dev); } list_splice(&list, &dpm_list); Index: linux-2.6/include/linux/resume-trace.h =================================================================== --- linux-2.6.orig/include/linux/resume-trace.h +++ linux-2.6/include/linux/resume-trace.h @@ -6,6 +6,11 @@ extern int pm_trace_enabled; +static inline int pm_trace_is_enabled(void) +{ + return pm_trace_enabled; +} + struct device; extern void set_trace_device(struct device *); extern void generate_resume_trace(const void *tracedata, unsigned int user); @@ -17,6 +22,8 @@ extern void generate_resume_trace(const #else +static inline int pm_trace_is_enabled(void) { return 0; } + #define TRACE_DEVICE(dev) do { } while (0) #define TRACE_RESUME(dev) do { } while (0) Index: linux-2.6/include/linux/device.h =================================================================== 
--- linux-2.6.orig/include/linux/device.h +++ linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912100018.19723.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912100018.19723.rjw@sisk.pl> @ 2009-12-10 2:51 ` Linus Torvalds 2009-12-10 15:31 ` Alan Stern [not found] ` <alpine.LFD.2.00.0912091835280.3560@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-10 2:51 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > Completions it is, then. What was so hard with the "Try the simple one first" to understand? You had a simpler working patch, why are you making this more complex one without ever having had any problems with the simpler one? Btw, your 'atomic_set()' with errors is pure voodoo programming. That's not how atomics work. They do SMP-atomic addition etc, the 'atomic_set()' and 'atomic_read()' things are not in any way more atomic than any other access. They are meant for racy reads (atomic_read()) and for initializations (atomic_set()), and the way you use them that 'atomic' part is entirely pointless, because it really isn't anything different from an 'int', except that it may be very very expensive on some architectures due to hashed spinlocks etc. So stop this overdesign thing. Start simple. If you _ever_ see real problems, that's when you add stuff. As it is, any time you add complexity, you just add bugs. > +/** > + * dpm_synchronize - Wait for PM callbacks of all devices to complete. > + */ > +static void dpm_synchronize(void) > +{ > + struct device *dev; > + > + async_synchronize_full(); > + > + mutex_lock(&dpm_list_mtx); > + list_for_each_entry(dev, &dpm_list, power.entry) > + INIT_COMPLETION(dev->power.completion); > + mutex_unlock(&dpm_list_mtx); > +} And this, for example, is pretty disgusting. 
Not only is that INIT_COMPLETION purely brought on by the whole problem with completions (they are fundamentally one-shot, but you want to use them over and over so you need to re-initialize them: a nice lock wouldn't have that problem to begin with), but the comment isn't even accurate. Sure, it waits for any async jobs, but that's the _least_ of what the function actually does, so the comment is actively misleading, isn't it? Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912100018.19723.rjw@sisk.pl> 2009-12-10 2:51 ` Linus Torvalds @ 2009-12-10 15:31 ` Alan Stern [not found] ` <alpine.LFD.2.00.0912091835280.3560@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-10 15:31 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > How about CONFIG_PROVE_LOCKING? If lockdep really does start > > complaining then switching to completions would be a simple way to > > appease it. > > Ah, that one is not set. I guess I'll try it later, although I've already > decided to use completions anyway. You should see how badly lockdep complains about the rwsems. If it really doesn't like them then using completions makes sense. > Index: linux-2.6/drivers/base/power/main.c > =================================================================== > --- linux-2.6.orig/drivers/base/power/main.c > +++ linux-2.6/drivers/base/power/main.c > @@ -56,6 +58,7 @@ static bool transition_started; > void device_pm_init(struct device *dev) > { > dev->power.status = DPM_ON; > + init_completion(&dev->power.completion); > pm_runtime_init(dev); > } You need a matching complete_all() in device_pm_remove(), in case someone else is waiting for the device when it gets unregistered. > +/** > + * dpm_synchronize - Wait for PM callbacks of all devices to complete. > + */ > +static void dpm_synchronize(void) > +{ > + struct device *dev; > + > + async_synchronize_full(); > + > + mutex_lock(&dpm_list_mtx); > + list_for_each_entry(dev, &dpm_list, power.entry) > + INIT_COMPLETION(dev->power.completion); > + mutex_unlock(&dpm_list_mtx); > +} I agree with Linus, initializing the completions here is weird. You should initialize them just before using them. 
> @@ -683,6 +786,7 @@ static int dpm_suspend(pm_message_t stat > > INIT_LIST_HEAD(&list); > mutex_lock(&dpm_list_mtx); > + pm_transition = state; > while (!list_empty(&dpm_list)) { > struct device *dev = to_device(dpm_list.prev); > > @@ -697,13 +801,18 @@ static int dpm_suspend(pm_message_t stat > put_device(dev); > break; > } > - dev->power.status = DPM_OFF; > if (!list_empty(&dev->power.entry)) > list_move(&dev->power.entry, &list); > put_device(dev); > + error = atomic_read(&async_error); > + if (error) > + break; > } > list_splice(&list, dpm_list.prev); Here's something you might want to do in a later patch. These awkward list-pointer manipulations can be simplified as follows: static bool dpm_iterate_forward; static struct device *dpm_next; In device_pm_remove(): mutex_lock(&dpm_list_mtx); if (dev == dpm_next) dpm_next = to_device(dpm_iterate_forward ? dev->power.entry.next : dev->power.entry.prev); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); In dpm_resume(): dpm_iterate_forward = true; list_for_each_entry_safe(dev, dpm_next, dpm_list, power.entry) { ... In dpm_suspend(): dpm_iterate_forward = false; list_for_each_entry_safe_reverse(dev, dpm_next, dpm_list, power.entry) { ... Whether this really is better is a matter of opinion; I like it. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
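The dpm_next idea in the message above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the proposed kernel change: struct node stands in for the list entry embedded in struct device, cursor_next and iterate_forward play the roles of dpm_next and dpm_iterate_forward, and every function name (node_add_tail, node_remove, walk, record_and_drop) is hypothetical. Locking is left out entirely, whereas the original holds dpm_list_mtx around the removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *prev, *next;
	int id;
};

/* Circular list with a sentinel head, in the style of list_head. */
static struct node head = { &head, &head, -1 };

/* Iteration state shared between the walker and node_remove(). */
static struct node *cursor_next;
static bool iterate_forward;

void node_add_tail(struct node *n)
{
	n->prev = head.prev;
	n->next = &head;
	head.prev->next = n;
	head.prev = n;
}

void node_remove(struct node *n)
{
	/* If the walker was about to visit n, step the cursor past it. */
	if (n == cursor_next)
		cursor_next = iterate_forward ? n->next : n->prev;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Visit every node; visit() may remove any node, including the next one. */
int walk(bool forward, void (*visit)(struct node *))
{
	struct node *n;
	int count = 0;

	iterate_forward = forward;
	n = forward ? head.next : head.prev;
	while (n != &head) {
		cursor_next = forward ? n->next : n->prev;
		visit(n);
		count++;
		n = cursor_next;
	}
	cursor_next = NULL;
	return count;
}

/* Example callback state: ids seen so far and a node to drop mid-walk. */
struct node *victim;
int seen[8], nseen;

/* Record the visited id and, on the first call, remove the victim. */
void record_and_drop(struct node *n)
{
	if (nseen < 8)
		seen[nseen++] = n->id;
	if (victim) {
		node_remove(victim);
		victim = NULL;
	}
}
```

With nodes 1, 2 and 3 on the list and node 2 set as the victim, a forward walk visits 1 and 3 only, and no list pointers are rewritten for the nodes that stay. Whether a global cursor is preferable to splicing through a local list is exactly the point the thread leaves as a matter of opinion.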
[parent not found: <alpine.LFD.2.00.0912091835280.3560@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912091835280.3560@localhost.localdomain> @ 2009-12-10 19:40 ` Rafael J. Wysocki [not found] ` <200912102040.11063.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-10 19:40 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Thursday 10 December 2009, Linus Torvalds wrote: > > On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > > > Completions it is, then. > > What was so hard with the "Try the simple one first" to understand? You > had a simpler working patch, why are you making this more complex one > without ever having had any problems with the simpler one? OK, why don't you just say you won't merge anything that doesn't use rwsems (although you said before that completions would be fine with you)? That would make things clear, but also it would mean we gave up handling the off-tree dependencies in general. > Btw, your 'atomic_set()' with errors is pure voodoo programming. That's > not how atomics work. They do SMP-atomic addition etc, the 'atomic_set()' > and 'atomic_read()' things are not in any way more atomic than any other > access. > > They are meant for racy reads (atomic_read()) and for initializations > (atomic_set()), and the way you use them that 'atomic' part is entirely > pointless, because it really isn't anything different from an 'int', > except that it may be very very expensive on some architectures due to > hashed spinlocks etc. > > So stop this overdesign thing. Start simple. If you _ever_ see real > problems, that's when you add stuff. As it is, any time you add > complexity, you just add bugs. OK, so that need not be atomic. > > +/** > > + * dpm_synchronize - Wait for PM callbacks of all devices to complete. 
> > + */ > > +static void dpm_synchronize(void) > > +{ > > + struct device *dev; > > + > > + async_synchronize_full(); > > + > > + mutex_lock(&dpm_list_mtx); > > + list_for_each_entry(dev, &dpm_list, power.entry) > > + INIT_COMPLETION(dev->power.completion); > > + mutex_unlock(&dpm_list_mtx); > > +} > > And this, for example, is pretty disgusting. Not only is that > INIT_COMPLETION purely brought on by the whole problem with completions > (they are fundamentally one-shot, but you want to use them over and over Actually, twice. However, since I don't want to do any async handling in the _noirq phases any more, I can get rid of this whole function. Thanks for pointing that out to me. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912102040.11063.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912102040.11063.rjw@sisk.pl> @ 2009-12-10 23:30 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912101507550.3560@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-10 23:30 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: > > OK, why don't you just say you won't merge anything that doesn't use rwsems I did! Here's a quote (and it's pretty much the whole email, so it's not like it was hidden): - alpine.LFD.2.00.0912081309370.3560@localhost.localdomain: "Let me put this simply: I've told you guys how to do it simply, with _zero_ crap. No "iterating over children". No games. No data structures. No new infrastructure. Just a single new rwlock per device, and _trivial_ code. So here's the challenge: try it my simple way first. I've quoted the code about five million times already. If you _actually_ see some problems, explain them. Don't make up stupid "iterate over each child" things. Don't claim totally made-up "leads to difficulties". Don't make it any more complicated than it needs to be. Keep it simple. And once you have tried that simple approach, and you really can show why it doesn't work, THEN you can try something else. But before you try the simple approach and explain why it wouldn't work, I simply will not pull anything more complex. Understood and agreed?" And then later about completions: - alpine.LFD.2.00.0912081416470.3560@localhost.localdomain: "So I think completions should work, if done right. That whole "make the parent wait for all the children to complete" is fine in that sense. And I'll happily take such an approach if my rwlock thing doesn't work." 
IOW, I'll happily take the completions version, but dammit, I refuse to take it when there is a simpler approach that does NOT need to iterate, and does NOT need to re-initialize the data structures each round etc. That's what I've been arguing against the whole time. It started as arguing against complex and unnecessary infrastructure, and trying to show that it _can_ be done so much simpler using existing basic locking. And I get annoyed when you guys continually seem to want to make it more complex than it needs to be. > > And this, for example, is pretty disgusting. Not only is that > > INIT_COMPLETION purely brought on by the whole problem with completions > > (they are fundamentally one-shot, but you want to use them over and over > > Actually, twice. However, since I don't want to do any async handling in the > _noirq phases any more, I can get rid of this whole function. Thanks for > pointing that out to me. Well, my point was that you'll need to do that INIT_COMPLETION(dev->power.completion); thing each suspend and each resume. Exactly because completions are designed to be "one-way" things, so you end up having to reset them each cycle (you just reset them even _more_ than you needed). Again, my point was that using locks is actually a very _natural_ thing to do. I really don't understand what problems you and Alan have with just using locks - we have way more locks in the kernel than we have completions, so they are the "default" thing to do, and they really are very natural to use. [ Ok, so admittedly the actual use of 'struct rw_semaphore' is pretty unusual, but my point is that people are used to locking semantics in general, more so than the semantics of completions ] Completions were literally designed to be used for one-off things - one of the most common uses is that the 'struct completion' is on the _stack_. 
It doesn't get much more one-off than that - and the completions are really very explicitly designed so that you can do a 'complete()' on something that will literally disappear from under you as you do it (because the struct completion might be on the stack of the thing that is waiting for it, and gets de-allocated when the waiter goes ahead). That is why 'wait_for_completion()' always has to take the spinlock, for example - there is no fastpath for completion, because the races for the waiter releasing things too early are too nasty. So completions are actually very subtle things - and you don't need any of that subtlety. I realize that from a user perspective, completions look very simple, but in many ways they actually have subtler semantics than a regular lock has. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
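The "reset them each cycle" point above can be made concrete with a small userspace sketch. This is not kernel code: struct completion_model and the model_* functions are hypothetical stand-ins built on a pthread mutex and condition variable, and the boolean "done" flag is a simplification of whatever the kernel actually stores. It only illustrates that a completion stays "done" until somebody explicitly re-arms it, and that the waiter always takes the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion_model {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	bool            done;     /* set once, stays set until re-armed */
	int             waiters;  /* threads currently blocked in wait */
};

static void model_init_completion(struct completion_model *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->wake, NULL);
	c->done = false;
	c->waiters = 0;
}

/* Analogue of INIT_COMPLETION(): re-arm before the next suspend or resume. */
static void model_reinit_completion(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	assert(c->waiters == 0);  /* re-arming under a waiter would be a bug */
	c->done = false;
	pthread_mutex_unlock(&c->lock);
}

/* Analogue of complete_all(): release current and future waiters. */
static void model_complete_all(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->wake);
	pthread_mutex_unlock(&c->lock);
}

/* Analogue of wait_for_completion(): always takes the lock, no fast path. */
static void model_wait_for_completion(struct completion_model *c)
{
	pthread_mutex_lock(&c->lock);
	c->waiters++;
	while (!c->done)
		pthread_cond_wait(&c->wake, &c->lock);
	c->waiters--;
	pthread_mutex_unlock(&c->lock);
}

/* Snapshot of the state, for illustration only. */
static bool model_completion_done(struct completion_model *c)
{
	bool done;

	pthread_mutex_lock(&c->lock);
	done = c->done;
	pthread_mutex_unlock(&c->lock);
	return done;
}
```

In this model a second suspend/resume cycle that skipped model_reinit_completion() would find the completion already done and would not wait at all, which is the hazard the per-cycle INIT_COMPLETION() call guards against.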
[parent not found: <alpine.LFD.2.00.0912101507550.3560@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912101507550.3560@localhost.localdomain> @ 2009-12-11 1:02 ` Rafael J. Wysocki [not found] ` <200912110202.28536.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-11 1:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Friday 11 December 2009, Linus Torvalds wrote: > > On Thu, 10 Dec 2009, Rafael J. Wysocki wrote: ... > > IOW, I'll happily take the completions version, but dammit, I refuse to > take it when there is a simpler approach that does NOT need to iterate, > and does NOT need to re-initialize the data structures each round etc. I don't think it really is that simple. For example, the fact that the outer lock has to be taken by one thread and released by another is not exactly straightforward. [One might ask what's the critical section in this case.] Besides, suppose a device driver wants some off-tree constraints to be satisfied. What's the driver writer supposed to do? He only can lock the other device, but that will cause lockdep to complain, because this lock is going to be nested. Moreover, it's already too late, because his async thread has started and there's no guarantee that the other device hasn't acquired its rwsem yet. With completions, the driver doesn't have to take any action to prevent another one from suspending too early. Instead, the other one has to wait for its suspend to complete, and for me personally this is a much more natural thing to do. IOW, if I were a driver writer, I'd probably prefer to wait on a completion than to use a lock in a tricky manner. > That's what I've been arguing against the whole time. It started as > arguing against complex and unnecessary infrastructure, and trying to show > that it _can_ be done so much simpler using existing basic locking. 
> > And I get annoyed when you guys continually seem to want to make it more > complex than it needs to be. > > > > And this, for example, is pretty disgusting. Not only is that > > > INIT_COMPLETION purely brought on by the whole problem with completions > > > (they are fundamentally one-shot, but you want to use them over and over > > > > Actually, twice. However, since I don't want to do any async handling in the > > _noirq phases any more, I can get rid of this whole function. Thanks for > > pointing that out to me. > > Well, my point was that you'll need to do that > > INIT_COMPLETION(dev->power.completion); > > thing each suspend and each resume. Exactly because completions are > designed to be "one-way" things, so you end up having to reset them each > cycle (you just reset them even _more_ than you needed). Well, why actually do we need to preserve the state of the data structure from one cycle to another? There's no need whatsoever. > Again, my point was that using locks is actually a very _natural_ thing to > do. I really don't understand what problems you and Alan have with just > using locks - we have way more locks in the kernel than we have > completions, so they are the "default" thing to do, and they really are > very natural to use. > > [ Ok, so admittedly the actual use of 'struct rw_semaphore' is pretty > unusual, but my point is that people are used to locking semantics in > general, more so than the semantics of completions ] I still don't think there are many places where locks are used in a way you're suggesting. I would even say it's quite unusual to use locks this way. > Completions were literally designed to be used for one-off things - one of > the most common uses is that the 'struct completion' is on the _stack_. 
It > doesn't get much more one-off than that - and the completions are really > very explicitly designed so that you can do a 'complete()' on something > that will literally disappear from under you as you do it (because the > struct completion might be on the stack of the thing that is waiting for > it, and gets de-allocated when the waiter goes ahead). We could literally throw away a completion after all of the potentially waiting threads have finished their operations and then allocate it back again when necessary. We only need the synchronization in this particular phase of suspend or resume and it doesn't need to extend to the other phases or other cycles, because all of the concurrent threads we need to synchronize will only live during this one particular phase of suspend or resume. They will all exit when it's finished anyway. > That is why 'wait_for_completion()' always has to take the spinlock, for > example - there is no fastpath for completion, because the races for the > waiter releasing things too early are too nasty. > > So completions are actually very subtle things - and you don't need any of > that subtlety. I realize that from a user perspective, completions look > very simple, but in many ways they actually have subtler semantics than a > regular lock has. Well, I guess your point is that the implementation of completions is much more complicated than we really need, but I'm not sure if that really hurts. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912110202.28536.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912110202.28536.rjw@sisk.pl> @ 2009-12-11 1:25 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912101713440.3560@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-11 1:25 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Fri, 11 Dec 2009, Rafael J. Wysocki wrote: > > I don't think it really is that simple. For example, the fact that the outer > lock has to be taken by one thread and released by another is not exactly > straightforward. [One might ask what's the critical section in this case.] Why is that any different from initializing the completion in one thread, and completing it in another? It's exactly equivalent. Completions really are "locks that were initialized to locked". That is, in fact, how completions came to be: we literally used to use semaphores for them, and the reason for completions is literally the magic lifetime rules they have. So when you do INIT_COMPLETION(dev->power.completion); that really is historically, logically, and conceptually exactly the same thing as initializing a lock to the locked state. We literally used to do it with the equivalent of init_MUTEX_LOCKED() way back when (well, except we didn't have mutexes back then, we had only counting semaphores) and instead of "complete()", we had "up()" on the semaphore to complete it. > Besides, suppose a device driver wants some off-tree constraints to be > satisfied. .. and I've told you several times that we should simply not do such devices asynchronously. At least not unless there is some _overriding_ reason to. And so far, nobody has suggested anything even remotely likely for that. Again - KISS: Keep It Simple, Stupid! Don't try to make up problems. The _only_ subsystem we know wants this is USB, and we know USB is purely a tree. 
> > INIT_COMPLETION(dev->power.completion); > > > > thing each suspend and each resume. Exactly because completions are > > designed to be "one-way" things, so you end up having to reset them each > > cycle (you just reset them even _more_ than you needed). > > Well, why actually do we need to preserve the state of the data structure from > one cycle to another? There's no need whatsoever. My point is, with locks, none of that is necessary. Because they automatically do the right thing. By picking the right concept, you don't have any of those "oh, we need to re-initialize things" issues. They just work. > I still don't think there are many places where locks are used in a way you're > suggesting. I would even say it's quite unusual to use locks this way. See above. It's what completions _are_. > Well, I guess your point is that the implementation of completions is much > more complicated than we really need, but I'm not sure if that really hurts. No. The implementation of completions is actually pretty simple, exactly because they have that spinlock that is required to protect them. That wasn't the point. The point was that locks are actually the "normal" thing to use. You are arguing as if completions are somehow the simpler model. That's simply not true. Completions are just a _special_case_of_locking_. So why not just use regular locks instead, when it's actually the natural way to do it, and results in simpler code? Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912101713440.3560@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912101713440.3560@localhost.localdomain> @ 2009-12-11 3:42 ` Alan Stern 2009-12-11 22:11 ` Rafael J. Wysocki [not found] ` <200912112311.08548.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-11 3:42 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list Up front: This is my personal view of the matter. Which probably isn't of much interest to anybody, so I won't bother to defend these views or comment any further on them. The decision about what version to use is up to the two of you. The fact is, either implementation would get the job done. On Thu, 10 Dec 2009, Linus Torvalds wrote: > Completions really are "locks that were initialized to locked". That is, > in fact, how completions came to be: we literally used to use semaphores > for them, and the reason for completions is literally the magic lifetime > rules they have. > > So when you do > > INIT_COMPLETION(dev->power.completion); > > that really is historically, logically, and conceptually exactly the same > thing as initializing a lock to the locked state. We literally used to do > it with the equivalent of > > init_MUTEX_LOCKED() > > way back when (well, except we didn't have mutexes back then, we had only > counting semaphores) and instead of "complete()", we had "up()" on the > semaphore to complete it. You think of it that way because you have been closely involved in the development of the various kinds of locks. Speaking as an outsider who has relatively little interest in the internal details, completions appear simpler than rwsems. Mostly because they have a smaller API: complete() (or complete_all()) and wait_for_completion() as opposed to down_read(), up_read(), down_write(), and up_write(). > > Besides, suppose a device driver wants some off-tree constraints to be > > satisfied. > > .. 
and I've told you several times that we should simply not do such > devices asynchronously. At least not unless there is some _overriding_ > reason to. And so far, nobody has suggested anything even remotely > likely for that. Agreed. The fact that async non-tree suspend constraints are difficult with rwsems isn't a drawback if nobody needs to use them. > > Well, why actually do we need to preserve the state of the data structure from > > one cycle to another? There's no need whatsoever. > > My point is, with locks, none of that is necessary. Because they > automatically do the right thing. > > By picking the right concept, you don't have any of those "oh, we need to > re-initialize things" issues. They just work. That's true, but it's not entirely clear. There are subtle questions about what happens if you stop in the middle or a device gets unregistered or registered in the middle. They require careful thought in both approaches. Having to reinitialize a completion each time doesn't bother me. It's merely an indication that each suspend & resume is independent of all the others. > > I still don't think there are many places where locks are used in a way you're > > suggesting. I would even say it's quite unusual to use locks this way. > > See above. It's what completions _are_. This is almost a philosophical issue. If each A_i must wait for some B_j's, is the onus on each A_i to test the B_j's it's interested in? Or is the onus on each B_j to tell the A_i's waiting for it that they may proceed? As Humpty-Dumpty said, "The question is which is to be master -- that's all". > > Well, I guess your point is that the implementation of completions is much > > more complicated than we really need, but I'm not sure if that really hurts. > > No. The implementation of completions is actually pretty simple, exactly > because they have that spinlock that is required to protect them. > > That wasn't the point. The point was that locks are actually the "normal" > thing to use. 
> > You are arguing as if completions are somehow the simpler model. That's > simply not true. Completions are just a _special_case_of_locking_. Doesn't that make them simpler by definition? Special cases always have less to worry about than the general case. > So why not just use regular locks instead, when it's actually the natural > way to do it, and results in simpler code? Simpler but also more subtle, IMO. If you didn't already know how the algorithm worked, figuring it out from the code would be harder with rwsems than with completions. Partly because of the way readers and writers exchange roles in suspend vs. resume, and partly because sometimes devices lock themselves and sometimes they lock other devices. With completions each device has its own, and each device waits for other devices' completions -- easier to keep track of mentally. (I still think this whole readers vs. writers thing is a red herring. The essential property is that there are two opposing classes of lock holders. The fact that multiple writers can't hold the lock at the same time whereas multiple readers can is of no importance; the algorithm would work just as well if multiple writers _could_ hold the lock simultaneously.) Balancing the additional conceptual complexity of the rwsem approach is the conceptual simplicity afforded by not needing to check all the children. To me this makes it pretty much a toss-up. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
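The "sometimes devices lock themselves and sometimes they lock other devices" description of the rwsem scheme (suspend side only) can be sketched in userspace as follows. Everything here is a hypothetical model, not the kernel API: struct rwsem_model deliberately records no owner, so a read lock taken by one thread may be dropped by another, and model_down_write_trylock() stands in for the blocking down_write() so the ordering can be observed without threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical owner-less reader/writer semaphore (any thread may release). */
struct rwsem_model {
	pthread_mutex_t lock;
	pthread_cond_t  changed;
	int             readers;  /* number of read holders */
	bool            writer;   /* true while write-held */
};

static void model_init_rwsem(struct rwsem_model *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->changed, NULL);
	s->readers = 0;
	s->writer = false;
}

static void model_down_read(struct rwsem_model *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->writer)
		pthread_cond_wait(&s->changed, &s->lock);
	s->readers++;
	pthread_mutex_unlock(&s->lock);
}

static void model_up_read(struct rwsem_model *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->readers > 0);
	if (--s->readers == 0)
		pthread_cond_broadcast(&s->changed);
	pthread_mutex_unlock(&s->lock);
}

/* Non-blocking write attempt; a real down_write() would sleep instead. */
static bool model_down_write_trylock(struct rwsem_model *s)
{
	bool ok;

	pthread_mutex_lock(&s->lock);
	ok = !s->writer && s->readers == 0;
	if (ok)
		s->writer = true;
	pthread_mutex_unlock(&s->lock);
	return ok;
}

static void model_up_write(struct rwsem_model *s)
{
	pthread_mutex_lock(&s->lock);
	assert(s->writer);
	s->writer = false;
	pthread_cond_broadcast(&s->changed);
	pthread_mutex_unlock(&s->lock);
}

/* Simplified device: just a parent pointer and the per-device semaphore. */
struct dev_model {
	struct dev_model  *parent;
	struct rwsem_model rwsem;
};

/* Main thread, in list order: the child read-locks its *parent*. */
static void model_device_suspend_prepare(struct dev_model *dev)
{
	if (dev->parent)
		model_down_read(&dev->parent->rwsem);
}

/* Possibly another thread: write-lock *ourselves*, suspend, release parent. */
static bool model_device_suspend_try(struct dev_model *dev)
{
	if (!model_down_write_trylock(&dev->rwsem))
		return false;  /* some child still holds a read lock on us */
	/* ... the device's suspend callbacks would run here ... */
	model_up_write(&dev->rwsem);
	if (dev->parent)
		model_up_read(&dev->parent->rwsem);  /* non-owner release */
	return true;
}
```

The parent never looks at its children: it simply cannot write-lock itself until every child that read-locked it has finished. The price is that the read lock is taken in one context and released in another, which is the pattern an ownership-tracking checker objects to.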
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912101713440.3560@localhost.localdomain> 2009-12-11 3:42 ` Alan Stern @ 2009-12-11 22:11 ` Rafael J. Wysocki [not found] ` <200912112311.08548.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-11 22:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Friday 11 December 2009, Linus Torvalds wrote: > > On Fri, 11 Dec 2009, Rafael J. Wysocki wrote: > > > > I don't think it really is that simple. For example, the fact that the outer > > lock has to be taken by one thread and released by another is not exactly > > straightforward. [One might ask what's the critical section in this case.] > > Why is that any different from initializing the completion in one thread, > and completing it in another? > > It's exactly equivalent. > > Completions really are "locks that were initialized to locked". That is, > in fact, how completions came to be: we literally used to use semaphores > for them, and the reason for completions is literally the magic lifetime > rules they have. I don't know how they emerged historically and that's why I look at them in a different way than you do, probably. But fine, say we use the approach based on rwsems and consider suspend and the inner lock. We acquire it using down_write(), because we want to wait for multiple other drivers. Now, in fact we could do literally down_write(dev->power.rwsem); up_write(dev->power.rwsem); because the lock doesn't really protect anything from anyone. What it does is to prevent _us_ from doing something too early. To me, personally, it's not a usual use of locks. 
Moreover, if you think completions should be treated like locks, the up_write() above plays the role of the INIT_COMPLETION() in my last patch (or vice versa), so we reinitialize the data structure to the previous state in this case too, only earlier (and we could do that later just as well). The only real drawback of using completions I can see is that we have to iterate over the children during suspend, but if async suspend is going to save us any time at all, we can easily afford it (resume with completions is actually simpler than with rwsems, because we only have to wait for one device each time). > > Besides, suppose a device driver wants some off-tree constraints to be > > satisfied. > > .. and I've told you several times that we should simply not do such > devices asynchronously. At least not unless there is some _overriding_ > reason to. And so far, nobody has suggested anything even remotely > likely for that. > > Again - KISS: Keep It Simple, Stupid! > > Don't try to make up problems. The _only_ subsystem we know wants this is > USB, and we know USB is purely a tree. Not really. I've already said it once, but let me repeat. Some device objects have those ACPI "shadow" device objects that represent the ACPI view of given "physical" device and have their own suspend and resume routines. It turns out that these ACPI "shadow" devices have to be suspended after their "physical" counterparts and resumed before them, or else things break really badly. I don't know the reason for that, I only verified it experimentally (I also don't like that design, but I didn't invent it and I have to live with it at least for now). So if we don't enforce these constraints doing async suspend and resume, we won't be able to handle _any_ devices with those ACPI "shadow" things asynchronously. Ever. [That includes the majority of PCI devices, at least the "planar" ones (which is unfortunate, but that's how it goes).] 
If we had a clean way of representing off-tree constraints during asynchronous suspend and resume, we'd be able to handle this issue at the bus type level. And even if we don't anticipate it right now, I think the iteration over children during suspend is a fair price for a clean interface that bus types or drivers can use in future. YMMV. > > Well, I guess your point is that the implementation of completions is much > > more complicated than we really need, but I'm not sure if that really hurts. > > No. The implementation of completions is actually pretty simple, exactly > because they have that spinlock that is required to protect them. > > That wasn't the point. The point was that locks are actually the "normal" > thing to use. > > You are arguing as if completions are somehow the simpler model. That's because I think so. > That's simply not true. Completions are just a _special_case_of_locking_. Which doesn't necessarily prevent them from being conceptually simpler than the locking scheme based on rwsems. > So why not just use regular locks instead, when it's actually the natural > way to do it, and results in simpler code? Well, to me, it's way not natural and, quite frankly, in my not so humble opinion, it's a matter of personal preference. But, since your personal preference is what matters in this case, I'm not going to argue any more, because that just plain doesn't make sense. So, if you're not fine with the last patch I sent (http://patchwork.kernel.org/patch/66375/), I'll send one using rwsems instead of completions just to make _you_ happy, not because I think that's what we should do objectively. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
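The ordering rule argued for above (on suspend a device waits for all of its children, plus any off-tree device that must go down first, such as the "physical" counterpart of an ACPI "shadow" device) can be sketched as a plain readiness check. This is only a model of the rule: struct dev_model, MAX_DEPS and both functions are hypothetical, the suspend_done flag stands in for a completed per-device completion, and where the real scheme would block in wait_for_completion() the sketch simply reports "not yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical, simplified device node; not the kernel's struct device. */
struct dev_model {
	const char       *name;
	bool              suspend_done;          /* "completion" has fired */
	struct dev_model *children[MAX_DEPS];    /* tree dependencies */
	size_t            nr_children;
	struct dev_model *extra_deps[MAX_DEPS];  /* off-tree: suspend before us */
	size_t            nr_extra;
};

/* True once every device we would wait on has finished suspending. */
static bool may_suspend(const struct dev_model *dev)
{
	size_t i;

	assert(dev->nr_children <= MAX_DEPS && dev->nr_extra <= MAX_DEPS);

	/* The iteration over children that the completions approach needs. */
	for (i = 0; i < dev->nr_children; i++)
		if (!dev->children[i]->suspend_done)
			return false;

	/* Off-tree constraints are just more things to wait for. */
	for (i = 0; i < dev->nr_extra; i++)
		if (!dev->extra_deps[i]->suspend_done)
			return false;

	return true;
}

/* Suspend if allowed; returns false while a dependency is still pending. */
static bool try_suspend(struct dev_model *dev)
{
	if (!may_suspend(dev))
		return false;
	/* ... suspend callbacks would run here ... */
	dev->suspend_done = true;  /* complete_all() in the real scheme */
	return true;
}
```

Note that the device being waited for takes no action on behalf of its dependents; an extra constraint only adds an entry to the waiting side's list, which is the property the message above attributes to the completions-based interface.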
[parent not found: <200912112311.08548.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912112311.08548.rjw@sisk.pl> @ 2009-12-11 22:31 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912111415160.3922@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-11 22:31 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Fri, 11 Dec 2009, Rafael J. Wysocki wrote: > > But fine, say we use the approach based on rwsems and consider suspend and > the inner lock. We acquire it using down_write(), because we want to wait for > multiple other drivers. Now, in fact we could do literally > > down_write(dev->power.rwsem); > up_write(dev->power.rwsem); > > because the lock doesn't really protect anything from anyone. What it does is > to prevent _us_ from doing something too early. To me, personally, it's not a > usual use of locks. I agree that it's fairly unusual, but on the other hand, it's unusual only because you contrived it to be. If you instead do down_write(dev->power.rwsem); .. do the actual suspend .. up_write(dev->power.rwsem); it doesn't look odd any more, does it? And while you don't _need_ to hold the power lock over the suspend call, it actually does make sense, and gives you some nicer guarantees. For an example of the kinds of guarantees it would give you - I think that you might actually be able to do a partial suspend and then a resume without any other locks, and you'd know that just the per-device locking would already guarantee that no device is ever tried to resume before it has finished its asynchronous suspend. Think about it. In the completion model, the "async_synchronize_full()" will synchronize all async work, and as a result you think that you don't need that level of robustness from the locking itself. 
But think about it this way: if you could abort a failed suspend, and start resuming devices immediately, without doing that "async_synchronize_full()" in between - simply because you know that the node locking itself will just "do the right thing". To me, that's a sign of a _good_ design. Using a rwsem is simply just more robust and natural for the problem in question. Exactly because it's a real lock. > > Don't try to make up problems. The _only_ subsystem we know wants this is > > USB, and we know USB is purely a tree. > > Not really. > > I've already said it once, but let me repeat. Some device objects have those > ACPI "shadow" device objects that represent the ACPI view of given "physical" > device and have their own suspend and resume routines. It turns out that > these ACPI "shadow" devices have to be suspended after their "physical" > counterparts and resumed before them, or else things break really badly. > I don't know the reason for that, I only verified it experimentally (I also > don't like that design, but I didn't invent it and I have to live with it at > least for now). So if we don't enforce these constraints doing async > suspend and resume, we won't be able to handle _any_ devices with those > ACPI "shadow" things asynchronously. Ever. [That includes the majority of > PCI devices, at least the "planar" ones (which is unfortunate, but that's how > it goes).] So? First off, you're wrong. It's not "ever". I'm happy to add complexity later, I just don't want to start out with a complex model. Adding complexity too early "just because we might need it" is the wrong thing to do. Secondly, I repeat: we don't want to do those PCI devices asynchronously anyway. You're again digging yourself deeper by just continually bringing up this total non-issue. I realize you did it for testing, but I'm serious when I say that we should limit these things as much as possible, rather than see it as an opportunity to do crazy things. Solve the problem at hand _first_. 
Solve it as simply as you can. And hope that you never ever will need anything more complex. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912111415160.3922@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912111415160.3922@localhost.localdomain> @ 2009-12-11 23:48 ` Rafael J. Wysocki [not found] ` <200912120048.46180.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-11 23:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Friday 11 December 2009, Linus Torvalds wrote: > > On Fri, 11 Dec 2009, Rafael J. Wysocki wrote: > > > > But fine, say we use the approach based on rwsems and consider suspend and > > the inner lock. We acquire it using down_write(), because we want to wait for > > multiple other drivers. Now, in fact we could do literally > > > > down_write(dev->power.rwsem); > > up_write(dev->power.rwsem); > > > > because the lock doesn't really protect anything from anyone. What it does is > > to prevent _us_ from doing something too early. To me, personally, it's not a > > usual use of locks. > > I agree that it's fairly unusual, but on the other hand, it's unusual only > because you contrived it to be. Whatever. The very fact that you can freely move the up_write() (as long as it's after the down_write()) is fairly unusual. > But think about it this way: if you could abort a failed suspend, and > start resuming devices immediately, without doing that > "async_synchronize_full()" in between - simply because you know that the > node locking itself will just "do the right thing". I'd rather not. :-) > To me, that's a sign of a _good_ design. Using a rwsem is simply just more > robust and natural for the problem in question. Exactly because it's a > real lock. ... > Solve the problem at hand _first_. Solve it as simply as you can. And hope > that you never ever will need anything more complex. Below is a patch I've just tested, but there's a lockdep problem in it I don't know how to solve. 
Namely, lockdep is apparently unhappy with us not releasing the lock taken in device_suspend() and it complains we take it twice in a row (which we do, but for another device). I need to use down_read_non_owner() to make it shut up and then I also need to use up_read_non_owner() in __device_suspend(), although there's the comment in include/linux/rwsem.h saying exactly this about that: /* * Take/release a lock when not the owner will release it. * * [ This API should be avoided as much as possible - the * proper abstraction for this case is completions. ] */ (I'd like to know your opinion about that). Yet, that's not all, because next it complains during resume that __device_resume() releases a lock it didn't acquire, which it clearly does, but that is intentional. Unfortunately, there's no up_write_non_owner() ... So, what am I supposed to do about that? Rafael --- drivers/base/power/main.c | 107 +++++++++++++++++++++++++++++++++++++++---- include/linux/device.h | 6 ++ include/linux/pm.h | 3 + include/linux/resume-trace.h | 7 ++ 4 files changed, 114 insertions(+), 9 deletions(-) Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/rwsem.h> /* * Callbacks for platform drivers to implement. 
@@ -412,9 +413,11 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + struct rw_semaphore rwsem; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> #include "../base.h" #include "power.h" @@ -42,6 +43,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once the preparation of devices for a PM transition has started, reset @@ -56,6 +58,7 @@ static bool transition_started; void device_pm_init(struct device *dev) { dev->power.status = DPM_ON; + init_rwsem(&dev->power.rwsem); pm_runtime_init(dev); } @@ -381,17 +384,22 @@ void dpm_resume_noirq(pm_message_t state EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. */ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state) { + struct device *parent = dev->parent; int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); + /* Wait for the parent's resume to complete, if necessary. */ + if (parent) + down_read_nested(&parent->power.rwsem, SINGLE_DEPTH_NESTING); + down(&dev->sem); if (dev->bus) { @@ -426,11 +434,41 @@ static int device_resume(struct device * } End: up(&dev->sem); + if (parent) + up_read(&parent->power.rwsem); + + /* Allow the children to resume now. 
*/ + up_write(&dev->power.rwsem); TRACE_RESUME(error); return error; } +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_resume(dev, pm_transition); + if (error) + pm_dev_err(dev, pm_transition, " async", error); + put_device(dev); +} + +static int device_resume(struct device *dev) +{ + /* Prevent the children from resuming before us. */ + down_write(&dev->power.rwsem); + + if (dev->power.async_suspend && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + return __device_resume(dev, pm_transition); +} + /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. @@ -444,6 +482,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -454,7 +493,7 @@ static void dpm_resume(pm_message_t stat dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -469,6 +508,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); } /** @@ -584,13 +624,11 @@ static int device_suspend_noirq(struct d { int error = 0; - if (!dev->bus) - return 0; - - if (dev->bus->pm) { + if (dev->bus && dev->bus->pm) { pm_dev_dbg(dev, state, "LATE "); error = pm_noirq_op(dev, dev->bus->pm, state); } + return error; } @@ -623,17 +661,24 @@ int dpm_suspend_noirq(pm_message_t state } EXPORT_SYMBOL_GPL(dpm_suspend_noirq); +static int async_error; + /** * device_suspend - Execute "suspend" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. 
*/ -static int device_suspend(struct device *dev, pm_message_t state) +static int __device_suspend(struct device *dev, pm_message_t state) { int error = 0; + /* Wait for the suspends of the children to complete, if necessary. */ + down_write_nested(&dev->power.rwsem, SINGLE_DEPTH_NESTING); down(&dev->sem); + if (async_error) + goto End; + if (dev->class) { if (dev->class->pm) { pm_dev_dbg(dev, state, "class "); @@ -666,12 +711,50 @@ static int device_suspend(struct device suspend_report_result(dev->bus->suspend, error); } } + + if (!error) + dev->power.status = DPM_OFF; + End: up(&dev->sem); + up_write(&dev->power.rwsem); + + /* Allow the parent to suspend now. */ + if (dev->parent) + up_read_non_owner(&dev->parent->power.rwsem); return error; } +static void async_suspend(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_suspend(dev, pm_transition); + if (error) { + pm_dev_err(dev, pm_transition, " async", error); + async_error = error; + } + + put_device(dev); +} + +static int device_suspend(struct device *dev, pm_message_t state) +{ + /* Prevent the parent from suspending before us. */ + if (dev->parent) + down_read_non_owner(&dev->parent->power.rwsem); + + if (dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + return 0; + } + + return __device_suspend(dev, pm_transition); +} + /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. 
@@ -683,6 +766,7 @@ static int dpm_suspend(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); @@ -697,13 +781,17 @@ static int dpm_suspend(pm_message_t stat put_device(dev); break; } - dev->power.status = DPM_OFF; if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); + if (async_error) + break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); + if (!error) + error = async_error; return error; } @@ -762,6 +850,7 @@ static int dpm_prepare(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); transition_started = true; + async_error = 0; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); Index: linux-2.6/include/linux/resume-trace.h =================================================================== --- linux-2.6.orig/include/linux/resume-trace.h +++ linux-2.6/include/linux/resume-trace.h @@ -6,6 +6,11 @@ extern int pm_trace_enabled; +static inline int pm_trace_is_enabled(void) +{ + return pm_trace_enabled; +} + struct device; extern void set_trace_device(struct device *); extern void generate_resume_trace(const void *tracedata, unsigned int user); @@ -17,6 +22,8 @@ extern void generate_resume_trace(const #else +static inline int pm_trace_is_enabled(void) { return 0; } + #define TRACE_DEVICE(dev) do { } while (0) #define TRACE_RESUME(dev) do { } while (0) Index: linux-2.6/include/linux/device.h =================================================================== --- linux-2.6.orig/include/linux/device.h +++ linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* ^ 
permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912120048.46180.rjw@sisk.pl> @ 2009-12-11 23:53 ` Linus Torvalds 2009-12-12 0:43 ` Alan Stern [not found] ` <alpine.LFD.2.00.0912111552330.3526@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-11 23:53 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > Below is a patch I've just tested, but there's a lockdep problem in it I don't > know how to solve. Namely, lockdep is apparently unhappy with us not releasing > the lock taken in device_suspend() and it complains we take it twice in a row > (which we do, but for another device). I need to use down_read_non_owner() > to make it shut up and then I also need to use up_read_non_owner() in > __device_suspend(), Ok, that I admit is actually a problem. Ok, ok, I'll accept that completion() version, even though I think it's inferior. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912120048.46180.rjw@sisk.pl> 2009-12-11 23:53 ` Linus Torvalds @ 2009-12-12 0:43 ` Alan Stern [not found] ` <alpine.LFD.2.00.0912111552330.3526@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-12 0:43 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > Below is a patch I've just tested, but there's a lockdep problem in it I don't > know how to solve. Namely, lockdep is apparently unhappy with us not releasing > the lock taken in device_suspend() and it complains we take it twice in a row > (which we do, but for another device). I need to use down_read_non_owner() > to make it shut up and then I also need to use up_read_non_owner() in > __device_suspend(), although there's the comment in include/linux/rwsem.h > saying exactly this about that: > > /* > * Take/release a lock when not the owner will release it. > * > * [ This API should be avoided as much as possible - the > * proper abstraction for this case is completions. ] > */ > > (I'd like to know your opinion about that). Yet, that's not all, because next > it complains during resume that __device_resume() releases a lock it didn't > acquire, which it clearly does, but that is intentional. Unfortunately, > there's no up_write_non_owner() ... Hah! I knew it! How come lockdep didn't complain earlier? What's different about this patch? Only the nesting annotations? Why should adding annotations make lockdep less happy? Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912111552330.3526@localhost.localdomain> @ 2009-12-12 17:48 ` Rafael J. Wysocki 2009-12-12 18:54 ` Linus Torvalds 0 siblings, 1 reply; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-12 17:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Saturday 12 December 2009, Linus Torvalds wrote: > > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > > Below is a patch I've just tested, but there's a lockdep problem in it I don't > > know how to solve. Namely, lockdep is apparently unhappy with us not releasing > > the lock taken in device_suspend() and it complains we take it twice in a row > > (which we do, but for another device). I need to use down_read_non_owner() > > to make it shut up and then I also need to use up_read_non_owner() in > > __device_suspend(), > > Ok, that I admit is actually a problem. > > Ok, ok, I'll accept that completion() version, even though I think it's > inferior. Great! :-) I slightly changed it in the meantime to avoid calling wait_for_completion() when both the parent and the child are "synchronous", which prevents the code from choking on some situations when the ordering of dpm_list is wrong (this happens as a result of bugs, but not necessarily fatal, for example if one of the drivers' suspend and resume callbacks are NULL and the bus type doesn't access the hardware directly, so we shouldn't make things worse than they already are IMO). I'd like to put it into my tree in this form, if you don't mind. [Note for Alan: dpm_wait() is not exported for now, we'll export it when there are any users.] Rafael --- From: Rafael J. 
Wysocki <rjw@sisk.pl> Subject: PM: Asynchronous suspend and resume of devices Theoretically, the total time of system sleep transitions (suspend to RAM, hibernation) can be reduced by running suspend and resume callbacks of device drivers in parallel with each other. However, there are dependencies between devices such that we're not allowed to suspend the parent of a device before suspending the device itself. Analogously, we're not allowed to resume a device before resuming its parent. Thus, to make it possible to execute device drivers' suspend and resume callbacks in parallel with each other, introduce (at the PM core level) a synchronization mechanism preventing the dependencies between devices from being violated. First, device drivers that want their suspend and resume callbacks to be run asynchronously need to set the power.async_suspend flags of their devices using device_enable_async_suspend(). Second, for each device with the power.async_suspend flag set the PM core will start async threads to execute its suspend and resume callbacks. The async threads started for different devices are synchronized with each other and with the main suspend (or resume) thread with the help of completions, in the following way: (1) There is a completion, power.completion, for each device object. (2) Each device's completion is reset before starting the async suspend (or resume) thread for the device or, in the case of devices whose power.async_suspend flags are not set, before executing the device's suspend and resume callbacks. (3) During suspend, right before running the bus type, device type and device class suspend callbacks for the device, the PM core waits for the completions of all the device's children to be completed. (4) During resume, right before running the bus type, device type and device class resume callbacks for the device, the PM core waits for the completion of the device's parent to be completed. 
(5) The PM core completes power.completion for each device right after the bus type, device type and device class suspend (or resume) callbacks executed for the device have returned. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- drivers/base/power/main.c | 115 ++++++++++++++++++++++++++++++++++++++++--- include/linux/device.h | 6 ++ include/linux/pm.h | 3 + include/linux/resume-trace.h | 7 ++ 4 files changed, 125 insertions(+), 6 deletions(-) Index: linux-2.6/include/linux/pm.h =================================================================== --- linux-2.6.orig/include/linux/pm.h +++ linux-2.6/include/linux/pm.h @@ -26,6 +26,7 @@ #include <linux/spinlock.h> #include <linux/wait.h> #include <linux/timer.h> +#include <linux/completion.h> /* * Callbacks for platform drivers to implement. @@ -412,9 +413,11 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int should_wakeup:1; + unsigned async_suspend:1; enum dpm_state status; /* Owned by the PM core */ #ifdef CONFIG_PM_SLEEP struct list_head entry; + struct completion completion; #endif #ifdef CONFIG_PM_RUNTIME struct timer_list suspend_timer; Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -25,6 +25,7 @@ #include <linux/resume-trace.h> #include <linux/rwsem.h> #include <linux/interrupt.h> +#include <linux/async.h> #include "../base.h" #include "power.h" @@ -42,6 +43,7 @@ LIST_HEAD(dpm_list); static DEFINE_MUTEX(dpm_list_mtx); +static pm_message_t pm_transition; /* * Set once the preparation of devices for a PM transition has started, reset @@ -56,6 +58,7 @@ static bool transition_started; void device_pm_init(struct device *dev) { dev->power.status = DPM_ON; + init_completion(&dev->power.completion); pm_runtime_init(dev); } @@ -111,6 +114,7 @@ void device_pm_remove(struct device *dev pr_debug("PM: Removing info for %s:%s\n", 
dev->bus ? dev->bus->name : "No Bus", kobject_name(&dev->kobj)); + complete_all(&dev->power.completion); mutex_lock(&dpm_list_mtx); list_del_init(&dev->power.entry); mutex_unlock(&dpm_list_mtx); @@ -162,6 +166,31 @@ void device_pm_move_last(struct device * } /** + * dpm_wait - Wait for a PM operation to complete. + * @dev: Device to wait for. + * @async: If unset, wait only if the device's power.async_suspend flag is set. + */ +static void dpm_wait(struct device *dev, bool async) +{ + if (!dev) + return; + + if (async || dev->power.async_suspend) + wait_for_completion(&dev->power.completion); +} + +static int dpm_wait_fn(struct device *dev, void *async_ptr) +{ + dpm_wait(dev, *((bool *)async_ptr)); + return 0; +} + +static void dpm_wait_for_children(struct device *dev, bool async) +{ + device_for_each_child(dev, &async, dpm_wait_fn); +} + +/** * pm_op - Execute the PM operation appropriate for given PM event. * @dev: Device to handle. * @ops: PM operations to choose from. @@ -381,17 +410,19 @@ void dpm_resume_noirq(pm_message_t state EXPORT_SYMBOL_GPL(dpm_resume_noirq); /** - * device_resume - Execute "resume" callbacks for given device. + * __device_resume - Execute "resume" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. + * @async: If true, the device is being resumed asynchronously. 
*/ -static int device_resume(struct device *dev, pm_message_t state) +static int __device_resume(struct device *dev, pm_message_t state, bool async) { int error = 0; TRACE_DEVICE(dev); TRACE_RESUME(0); + dpm_wait(dev->parent, async); down(&dev->sem); if (dev->bus) { @@ -426,11 +457,36 @@ static int device_resume(struct device * } End: up(&dev->sem); + complete_all(&dev->power.completion); TRACE_RESUME(error); return error; } +static void async_resume(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_resume(dev, pm_transition, true); + if (error) + pm_dev_err(dev, pm_transition, " async", error); + put_device(dev); +} + +static int device_resume(struct device *dev) +{ + INIT_COMPLETION(dev->power.completion); + + if (dev->power.async_suspend && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + return 0; + } + + return __device_resume(dev, pm_transition, false); +} + /** * dpm_resume - Execute "resume" callbacks for non-sysdev devices. * @state: PM transition of the system being carried out. @@ -444,6 +500,7 @@ static void dpm_resume(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.next); @@ -454,7 +511,7 @@ static void dpm_resume(pm_message_t stat dev->power.status = DPM_RESUMING; mutex_unlock(&dpm_list_mtx); - error = device_resume(dev, state); + error = device_resume(dev); mutex_lock(&dpm_list_mtx); if (error) @@ -469,6 +526,7 @@ static void dpm_resume(pm_message_t stat } list_splice(&list, &dpm_list); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); } /** @@ -623,17 +681,24 @@ int dpm_suspend_noirq(pm_message_t state } EXPORT_SYMBOL_GPL(dpm_suspend_noirq); +static int async_error; + /** * device_suspend - Execute "suspend" callbacks for given device. * @dev: Device to handle. * @state: PM transition of the system being carried out. 
+ * @async: If true, the device is being suspended asynchronously. */ -static int device_suspend(struct device *dev, pm_message_t state) +static int __device_suspend(struct device *dev, pm_message_t state, bool async) { int error = 0; + dpm_wait_for_children(dev, async); down(&dev->sem); + if (async_error) + goto End; + if (dev->class) { if (dev->class->pm) { pm_dev_dbg(dev, state, "class "); @@ -666,12 +731,44 @@ static int device_suspend(struct device suspend_report_result(dev->bus->suspend, error); } } + + if (!error) + dev->power.status = DPM_OFF; + End: up(&dev->sem); + complete_all(&dev->power.completion); return error; } +static void async_suspend(void *data, async_cookie_t cookie) +{ + struct device *dev = (struct device *)data; + int error; + + error = __device_suspend(dev, pm_transition, true); + if (error) { + pm_dev_err(dev, pm_transition, " async", error); + async_error = error; + } + + put_device(dev); +} + +static int device_suspend(struct device *dev) +{ + INIT_COMPLETION(dev->power.completion); + + if (dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + return 0; + } + + return __device_suspend(dev, pm_transition, false); +} + /** * dpm_suspend - Execute "suspend" callbacks for all non-sysdev devices. * @state: PM transition of the system being carried out. 
@@ -683,13 +780,15 @@ static int dpm_suspend(pm_message_t stat INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); + pm_transition = state; + async_error = 0; while (!list_empty(&dpm_list)) { struct device *dev = to_device(dpm_list.prev); get_device(dev); mutex_unlock(&dpm_list_mtx); - error = device_suspend(dev, state); + error = device_suspend(dev); mutex_lock(&dpm_list_mtx); if (error) { @@ -697,13 +796,17 @@ static int dpm_suspend(pm_message_t stat put_device(dev); break; } - dev->power.status = DPM_OFF; if (!list_empty(&dev->power.entry)) list_move(&dev->power.entry, &list); put_device(dev); + if (async_error) + break; } list_splice(&list, dpm_list.prev); mutex_unlock(&dpm_list_mtx); + async_synchronize_full(); + if (!error) + error = async_error; return error; } Index: linux-2.6/include/linux/resume-trace.h =================================================================== --- linux-2.6.orig/include/linux/resume-trace.h +++ linux-2.6/include/linux/resume-trace.h @@ -6,6 +6,11 @@ extern int pm_trace_enabled; +static inline int pm_trace_is_enabled(void) +{ + return pm_trace_enabled; +} + struct device; extern void set_trace_device(struct device *); extern void generate_resume_trace(const void *tracedata, unsigned int user); @@ -17,6 +22,8 @@ extern void generate_resume_trace(const #else +static inline int pm_trace_is_enabled(void) { return 0; } + #define TRACE_DEVICE(dev) do { } while (0) #define TRACE_RESUME(dev) do { } while (0) Index: linux-2.6/include/linux/device.h =================================================================== --- linux-2.6.orig/include/linux/device.h +++ linux-2.6/include/linux/device.h @@ -472,6 +472,12 @@ static inline int device_is_registered(s return dev->kobj.state_in_sysfs; } +static inline void device_enable_async_suspend(struct device *dev, bool enable) +{ + if (dev->power.status == DPM_ON) + dev->power.async_suspend = enable; +} + void driver_init(void); /* ^ permalink raw reply [flat|nested] 98+ messages in thread
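As a rough illustration of rules (1)-(5) in the changelog above, here is a minimal user-space sketch of a completion and of the condition dpm_wait() uses to decide whether to block. It is an assumption-laden model, not the kernel code: the completion is rebuilt from a pthread mutex and condition variable, and every model_* name is made up for this example and is not kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct completion (model only). */
struct model_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical, stripped-down device: only what the ordering rules need. */
struct model_dev {
	struct model_dev *parent;
	bool async_suspend;			/* models power.async_suspend */
	struct model_completion completion;	/* models power.completion */
};

/* Rule (1): one completion per device object. */
static void model_completion_init(struct model_completion *c)
{
	assert(c != NULL);
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Rule (2): reset before the device's callbacks are started. */
static void model_completion_reset(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = false;
	pthread_mutex_unlock(&c->lock);
}

/* Rule (5): wake every waiter and stay "done" until the next reset. */
static void model_complete_all(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Rules (3) and (4): block until the other device has signalled. */
static void model_wait_for_completion(struct model_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/*
 * Mirrors the condition in dpm_wait(): the wait is skipped only when
 * there is no device, or when both the waiter (async == false) and the
 * device waited for are synchronous.
 */
static bool model_dpm_must_wait(const struct model_dev *dev, bool async)
{
	return dev && (async || dev->async_suspend);
}

/* Returns true if the caller went through the wait path. */
static bool model_dpm_wait(struct model_dev *dev, bool async)
{
	if (!model_dpm_must_wait(dev, async))
		return false;
	model_wait_for_completion(&dev->completion);
	return true;
}
```

In this model a synchronous child never blocks on a synchronous parent, even if the parent's completion has not been signalled, which is the behaviour the cover text describes as avoiding a hang when the ordering of dpm_list is wrong. A real test of the blocking path would call model_dpm_wait() from one thread and model_complete_all() from another.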
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-12 17:48 ` Rafael J. Wysocki @ 2009-12-12 18:54 ` Linus Torvalds 2009-12-12 22:34 ` Rafael J. Wysocki ` (2 more replies) 0 siblings, 3 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-12 18:54 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > I'd like to put it into my tree in this form, if you don't mind. This version still has a major problem, which is not related to completions vs rwsems, but simply to the fact that you wanted to do this at the generic device layer level rather than do it at the actual low-level suspend/resume level. Namely that there's no apparent sane way to say "don't wait for children". PCI bridges that don't suspend at all - or any other device that only suspends in the 'suspend_late()' thing, for that matter - don't have any reason what-so-ever to wait for children, since they aren't actually suspending in the first place. But you make them wait regardless, which then serializes things unnecessarily (for example, two unrelated USB controllers). And no, making _everything_ be async is _not_ the answer. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-12 18:54 ` Linus Torvalds @ 2009-12-12 22:34 ` Rafael J. Wysocki 2009-12-12 22:40 ` Rafael J. Wysocki ` (2 more replies) 2009-12-13 13:08 ` Rafael J. Wysocki 2009-12-13 17:30 ` Alan Stern 2 siblings, 3 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-12 22:34 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Saturday 12 December 2009, Linus Torvalds wrote: > > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > > I'd like to put it into my tree in this form, if you don't mind. > > This version still has a major problem, which is not related to > completions vs rwsems, but simply to the fact that you wanted to do this > at the generic device layer level rather than do it at the actual > low-level suspend/resume level. > > Namely that there's no apparent sane way to say "don't wait for children". > > PCI bridges that don't suspend at all - or any other device that only > suspends in the 'suspend_late()' thing, for that matter - don't have any > reason what-so-ever to wait for children, since they aren't actually > suspending in the first place. But you make them wait regardless, which > then serializes things unnecessarily (for example, two unrelated USB > controllers). This is a problem that needs to be solved. One solution that we have discussed on linux-pm is to start a bunch of async threads searching for async devices that can be suspended and suspending them (assuming suspend is considered) out of order with respect to dpm_list. For example, leaf async devices can always be suspended at the same time regardless of their positions in dpm_list. This way we could get almost the entire gain resulting from suspending or resuming devices in parallel without bothering drivers with the problem of dependencies that need to be honoured. 
That's something we can add on top of this patch, though, not to complicate things from the start and it surely requires more discussion. > And no, making _everything_ be async is _not_ the answer. I'm not sure what you mean, really. Speaking of PCI bridges, even though they don't "suspend" in the sense of being put into low power states or something, we still need to save their registers on suspend and restore them on resume, and that restore has to be done before we start to access devices below the bridge. There are devices with totally null suspend and resume routines that even the bus type doesn't really handle, but those can be marked as "async" from the start and they won't really get in the way any more (this creates another issue to solve, namely that we shouldn't really start a new async thread for each of them; we have considered that too). Even if we move that all to drivers, the constraints won't go away and someone will have to take care of them. Now, since _we_ have problems with reaching an agreement about how to do it, the driver writers will be even less likely to figure that out. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-12 22:34 ` Rafael J. Wysocki @ 2009-12-12 22:40 ` Rafael J. Wysocki 2009-12-14 18:21 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912141015240.26135@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-12 22:40 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Saturday 12 December 2009, Rafael J. Wysocki wrote: > On Saturday 12 December 2009, Linus Torvalds wrote: > > > > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > ... > > > And no, making _everything_ be async is _not_ the answer. > > I'm not sure what you mean, really. > > Speaking of PCI bridges, even though they don't "suspend" in the sense of > being put into low power states or something, we still need to save their > registers on suspend and restore them on resume, and that restore has to > be done before we start to access devices below the bridge. Of course we restore them at the early stage now so the above remark doesn't apply to the patch in question, sorry. But the one below does. > Even if we move that all to drivers, the constraints won't go away and someone > will have to take care of them. Now, since _we_ have problems with reaching > an agreement about how to do it, the driver writers will be even less likely to > figure that out. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-12 22:34 ` Rafael J. Wysocki 2009-12-12 22:40 ` Rafael J. Wysocki @ 2009-12-14 18:21 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912141015240.26135@localhost.localdomain> 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-14 18:21 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > One solution that we have discussed on linux-pm is to start a bunch of async > threads searching for async devices that can be suspended and suspending > them (assuming suspend is considered) out of order with respect to dpm_list. Ok, guys, stop the crazy. That's another of those "ok, that's just totally stupid and clearly too complex" ideas that I would never pull. I should seriously suggest that people just stop discussing architectural details on the pm list if they all end up being this level of crazy. The sane thing to do is to just totally ignore the async layer on PCI bridges and other things that only have a late-suspend/early-resume thing. No need for the above kind of obviously idiotic crap. However, my point was really that we wouldn't even have _needed_ that kind of special case if we had just decided to let the subsystems do it. But whatever. At worst, the PCI layer can even just mark such devices with just late/early suspend/resume as being asynchronous, even though that ends up resulting in some totally pointless async work that doesn't do anything. But please guys - rein in the crazy ideas on the pm list. It's not like our suspend/resume has gotten so stable as to be boring, and we want it to become unreliable again. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141015240.26135@localhost.localdomain> @ 2009-12-14 22:11 ` Rafael J. Wysocki [not found] ` <200912142311.31658.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-14 22:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Monday 14 December 2009, Linus Torvalds wrote: > > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > > One solution that we have discussed on linux-pm is to start a bunch of async > > threads searching for async devices that can be suspended and suspending > > them (assuming suspend is considered) out of order with respect to dpm_list. > > Ok, guys, stop the crazy. > > That's another of those "ok, that's just totally stupid and clearly too > complex" ideas that I would never pull. > > I should seriously suggest that people just stop discussing architectural > details on the pm list if they all end up being this level of crazy. > > The sane thing to do is to just totally ignore the async layer on PCI > bridges and other things that only have a late-suspend/early-resume thing. > No need for the above kind of obviously idiotic crap. > > However, my point was really that we wouldn't even have _needed_ that kind > of special case if we had just decided to let the subsystems do it. But > whatever. At worst, the PCI layer can even just mark such devices with > just late/early suspend/resume as being asynchronous, even though that > ends up resulting in some totally pointless async work that doesn't do > anything. > > But please guys - rein in the crazy ideas on the pm list. It's not like > our suspend/resume has gotten so stable as to be boring, and we want it to > become unreliable again. Indeed. OK, what about a two-pass approach in which the first pass only inits the completions and starts async threads for leaf "async" devices? 
I think leaf devices are most likely to take much time to suspend, so this will give us a chance to save quite some time. A more aggressive version of this might start the async threads for all async devices in the first pass and then only handle the synchronous ones in the second pass - as long as there are only a few async devices that should be quite efficient. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912142311.31658.rjw@sisk.pl> @ 2009-12-14 22:41 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912141416040.26135@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-14 22:41 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Mon, 14 Dec 2009, Rafael J. Wysocki wrote: > > OK, what about a two-pass approach in which the first pass only inits the > completions and starts async threads for leaf "async" devices? I think leaf > devices are most likely to take much time to suspend, so this will give us > a chance to save quite some time. Why? Really. Again, stop making it harder than it needs to be. Why do you make up these crazy schemes that are way more complex than they need to be? Here's an untested one-liner that has a 10-line comment. I agree it is ugly, but it is ugly exactly because the generic device layer _forces_ us to wait for children even when we don't want to. With this, that unnecessary wait is now done asynchronously. I'd rather do it some other way - perhaps having an explicit flag that says "don't wait for children because I'm not going to suspend myself until 'suspend_late' _anyway_". But at least this is _simple_. Linus --- drivers/pci/probe.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 98ffb2d..4e0ad7b 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -437,6 +437,17 @@ static struct pci_bus *pci_alloc_child_bus(struct pci_bus *parent, } bridge->subordinate = child; + /* + * We don't really suspend PCI buses asynchronously.
+ * + * However, since we don't actually suspend them at all until + * the late phase, we might as well lie to the device layer + * and tell it to do our no-op not-suspend asynchronously, so that + * we end up not synchronizing with any of our child devices + * that might want to be asynchronous. + */ + bridge->dev.power.async_suspend = 1; + return child; } ^ permalink raw reply related [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141416040.26135@localhost.localdomain> @ 2009-12-14 22:43 ` Linus Torvalds 2009-12-14 23:18 ` Rafael J. Wysocki [not found] ` <200912150018.11837.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-14 22:43 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Mon, 14 Dec 2009, Linus Torvalds wrote: > > Here's an untested one-liner that has a 10-line comment. Btw, when I say "untested", in this case I mean that it isn't even compile-tested. I haven't merged your other patches yet, so in my tree that 'async_suspend' flag doesn't even exist, and the patch I sent out definitely doesn't compile. But it _might_ compile (and perhaps even work) in your tree. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141416040.26135@localhost.localdomain> 2009-12-14 22:43 ` Linus Torvalds @ 2009-12-14 23:18 ` Rafael J. Wysocki [not found] ` <200912150018.11837.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-14 23:18 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Monday 14 December 2009, Linus Torvalds wrote: > > On Mon, 14 Dec 2009, Rafael J. Wysocki wrote: > > > > OK, what about a two-pass approach in which the first pass only inits the > > completions and starts async threads for leaf "async" devices? I think leaf > > devices are most likely to take much time to suspend, so this will give us > > a chance to save quite some time. > > Why? > > Really. Because the PCI bridges are not the only case where it matters (I'd say they are really a corner case). Basically, any two async devices separated by a series of sync ones are likely not to be suspended (or resumed) in parallel with each other, because the parent is usually next to its children in dpm_list. So, if the first device suspends, its "synchronous" parent waits for it and the suspend of the second async device won't be started until the first one's suspend has returned. And it doesn't matter at what level we do the async thing, because dpm_list is there anyway. As Alan said, the real problem is that we generally can't change the ordering of dpm_list arbitrarily, because we don't know what's going to happen as a result. The async_suspend flag tells us, basically, which devices can be safely moved to different positions in dpm_list without breaking things, as long as they are not moved behind their parents or in front of their children. 
Starting the async suspends upfront would effectively work in the same way as moving those devices to the beginning of dpm_list without breaking the parent-child chains, which in turn is likely to allow us to save some extra time. That's not only about the PCI bridges, it's more general. As far as your one-liner is concerned, I'm going to test it, because I think we could use it anyway. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912150018.11837.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912150018.11837.rjw@sisk.pl> @ 2009-12-15 0:10 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912141609020.14385@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 0:10 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > Because the PCI bridges are not the only case where it matters (I'd say they > are really a corner case). Basically, any two async devices separated by a > series of sync ones are likely not to be suspended (or resumed) in parallel > with each other, because the parent is usually next to its children in dpm_list. Give a real example that matters. Really. How hard can it be to understand: KISS. Keep It Simple, Stupid. I get really tired of this whole stupid async discussion, because you're overdesigning it. To a first approximation, THE ONLY THING THAT MATTERS IS USB. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912141609020.14385@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141609020.14385@localhost.localdomain> @ 2009-12-15 0:11 ` Linus Torvalds 2009-12-15 11:03 ` Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 0:11 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Mon, 14 Dec 2009, Linus Torvalds wrote: > > I get really tired of this whole stupid async discussion, because you're > overdesigning it. Btw, this is important. I'm not going to pull even your _current_ async stuff if you can't show that you fundamentally UNDERSTAND this fact. Stop making up idiotic complex interfaces. Look at my one-liner patch, and realize that it gets you 99% there - the 99% that matters. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141609020.14385@localhost.localdomain> 2009-12-15 0:11 ` Linus Torvalds @ 2009-12-15 11:03 ` Rafael J. Wysocki [not found] ` <alpine.LFD.2.00.0912141610460.14385@localhost.localdomain> [not found] ` <200912151203.22916.rjw@sisk.pl> 3 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-15 11:03 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tuesday 15 December 2009, Linus Torvalds wrote: > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > Because the PCI bridges are not the only case where it matters (I'd say they > > are really a corner case). Basically, any two async devices separated by a > > series of sync ones are likely not to be suspended (or resumed) in parallel > > with each other, because the parent is usually next to its children in dpm_list. > > Give a real example that matters. I'll try. Let -> denote child-parent relationships and assume dpm_list looks like this: ..., A->B->C, D, E->F->G, ... where A, B, E, F are all async and C, D, G are sync (E, F, G may be USB and A, B, C may be serio input devices and D is a device that just happens to be in dpm_list between them). Say A and C take the majority of the total suspend time and assume we traverse the dpm_list from left to right. Now, during suspend, C waits for B that waits for A and G waits for F that waits for E. Moreover, since C is sync, the PM core won't start the suspend of D until the suspend of C has returned. In turn, since D is sync, the suspend of E won't be started until the suspend of D has returned. So in this situation the gain from the async suspends of A, B, E, F is zero. However, it won't be zero if we start the async suspends of A, B, E, F upfront. I'm not sure if this is sufficiently "real life" for you, but this is how dpm_list looks on one of my test boxes, more or less. > Really. 
> > How hard can it be to understand: KISS. Keep It Simple, Stupid. > > I get really tired of this whole stupid async discussion, because you're > overdesigning it. > > To a first approximation, THE ONLY THING THAT MATTERS IS USB. If this applies to _resume_ only, then I agree, but Arjan's data clearly show that serio devices take much more time to suspend than USB. But if we only talk about resume, the PCI bridges don't really matter, because they are resumed before all devices that depend on them, so they don't really need to wait for anyone anyway. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
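The dpm_list example above (A->B->C, D, E->F->G) can be checked with a small user-space model. This is only a sketch, not kernel code: struct sim_dev, simulate_suspend() and the per-device durations are hypothetical, picked so that A and C dominate as in the example. It assumes that a device starts suspending once its thread has been started and all of its children have finished, and that a sync device blocks the thread walking the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one dpm_list entry; not the kernel's struct device. */
struct sim_dev {
    const char *name;
    int parent;        /* index of the parent in the list, or -1 */
    int is_async;      /* models power.async_suspend */
    long duration_ms;  /* assumed suspend time of the device itself */
};

#define SIM_MAX_DEVS 16

/* The list from the example, children listed before their parents. */
const struct sim_dev example_list[] = {
    { "A", 1, 1, 300 }, { "B", 2, 1, 10 }, { "C", -1, 0, 300 },
    { "D", -1, 0, 10 },
    { "E", 5, 1, 50 }, { "F", 6, 1, 50 }, { "G", -1, 0, 10 },
};

/*
 * Walk the list left to right and return the time at which the last
 * device finishes suspending. If start_upfront is nonzero, all async
 * threads are started at time 0 instead of when the walk reaches them.
 */
long simulate_suspend(const struct sim_dev *devs, size_t n, int start_upfront)
{
    long finish[SIM_MAX_DEVS];
    long main_time = 0;  /* clock of the thread walking the list */
    long total = 0;

    assert(n <= SIM_MAX_DEVS);

    for (size_t i = 0; i < n; i++) {
        /* Earliest moment the device's suspend code can begin running. */
        long start = (devs[i].is_async && start_upfront) ? 0 : main_time;

        /* Every device waits for its children, which precede it. */
        for (size_t c = 0; c < i; c++)
            if (devs[c].parent == (int)i && finish[c] > start)
                start = finish[c];

        finish[i] = start + devs[i].duration_ms;

        /* A sync device blocks the walking thread until it is done. */
        if (!devs[i].is_async)
            main_time = finish[i];

        if (finish[i] > total)
            total = finish[i];
    }
    return total;
}
```

With these assumed durations, simulate_suspend(example_list, 7, 0) returns 730, which equals the sum of all durations, so the async flags gain nothing. With the async threads started upfront (third argument 1) it returns 630, because E and F then overlap with A, B and C.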
[parent not found: <alpine.LFD.2.00.0912141610460.14385@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912141610460.14385@localhost.localdomain> @ 2009-12-15 11:14 ` Rafael J. Wysocki [not found] ` <200912151214.10980.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-15 11:14 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tuesday 15 December 2009, Linus Torvalds wrote: > > On Mon, 14 Dec 2009, Linus Torvalds wrote: > > > > I get really tired of this whole stupid async discussion, because you're > > overdesigning it. > > Btw, this is important. I'm not going to pull even your _current_ async > stuff if you can't show that you fundamentally UNDERSTAND this fact. What fact? The only thing that matters is USB? For resume, it is. For suspend, it clearly isn't. > Stop making up idiotic complex interfaces. Look at my one-liner patch, and > realize that it gets you 99% there - the 99% that matters. I said I was going to use it, but I don't think that's going to be sufficient. [BTW, I'm not sure what you want to achieve by insulting me. Either you may want to scare me, but I'm not scared, or you may want to try to make me so disgusted that I'll just give up and back off, but this is not going to happen either.] Insults aside, I'm going to make some measurements to see how much time we can save. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912151214.10980.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912151214.10980.rjw@sisk.pl> @ 2009-12-15 15:31 ` Linus Torvalds 0 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 15:31 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > What fact? The only thing that matters is USB? For resume, it is. For > suspend, it clearly isn't. For suspend, the only other case we've seen has been the keyboard and mouse controller, which has exactly the same "we can special case it with a single 'let's do _this_ device asynchronously'". Again, it may not be pretty, but it sure is simple. Much simpler than talking about some generic infrastructure changes and about doing "let's do leaves of the tree separately" schemes. And that's why I'm _soo_ unhappy with you, and am insulting you. Because you keep on making the same mistake over and over - overdesigning. Overdesigning is a SIN. It's the archetypal example of what I call "bad taste". I get really upset when a subsystem maintainer starts overdesigning things. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912151203.22916.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912151203.22916.rjw@sisk.pl> @ 2009-12-15 15:26 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912150722310.14385@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-15 15:26 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > Give a real example that matters. > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > like this: No. I mean something real - something like - if you run on a non-PC with two USB buses behind non-PCI controllers. - device xyz. > If this applies to _resume_ only, then I agree, but Arjan's data clearly > show that serio devices take much more time to suspend than USB. I mean in general - something where you actually have hard data that some device really needs anything more than my one-liner, and really _needs_ some complex infrastructure. Not "let's imagine a case like xyz". > But if we only talk about resume, the PCI bridges don't really matter, > because they are resumed before all devices that depend on them, so they don't > really need to wait for anyone anyway. But that's my _point_. That's the whole point of the one-liner patch. Read the comment above that one-liner. My whole point was that by doing the whole "wait for children" in generic code, you also made devices - such as PCI bridges - have to wait for children, even though they don't need to, and don't want to. So I suggested an admittedly ugly hack to take care of it - rather than some complex infrastructure. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912150722310.14385@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912150722310.14385@localhost.localdomain> @ 2009-12-15 15:55 ` Alan Stern 2009-12-16 2:11 ` Rafael J. Wysocki [not found] ` <200912160311.05915.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-15 15:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tue, 15 Dec 2009, Linus Torvalds wrote: > My whole point was that by doing the whole "wait for children" in generic > code, you also made devices - such as PCI bridges - have to wait for > children, even though they don't need to, and don't want to. > > So I suggested an admittedly ugly hack to take care of it - rather than > some complex infrastructure. It doesn't feel like an ugly hack to me. It seems like exactly the Right Thing To Do: Make as many devices as possible use async suspend/resume. The only reason we don't make every device async is because we don't know whether it's safe. In the case of PCI bridges we _do_ know -- because they don't have any work to do outside of late_suspend/early_resume -- and so they _should_ be async. The same goes for devices that don't have suspend or resume methods. There remains a separate question: Should async devices also be forced to wait for their children? I don't see why not. For PCI bridges it won't make any significant difference. As long as the async code doesn't have to do anything, who cares when it runs? Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912150722310.14385@localhost.localdomain> 2009-12-15 15:55 ` Alan Stern @ 2009-12-16 2:11 ` Rafael J. Wysocki [not found] ` <200912160311.05915.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 2:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Tuesday 15 December 2009, Linus Torvalds wrote: > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > > > Give a real example that matters. > > > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > > like this: > > No. > > I mean something real - something like > > - if you run on a non-PC with two USB buses behind non-PCI controllers. > > - device xyz. > > > If this applies to _resume_ only, then I agree, but Arjan's data clearly > > show that serio devices take much more time to suspend than USB. > > I mean in general - something where you actually have hard data that some > device really needs anything more than my one-liner, and really _needs_ > some complex infrastructure. > > Not "let's imagine a case like xyz". As I said I would, I made some measurements. I measured the total time of suspending and resuming devices as shown by the code added by this patch: http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite different and the HP was running 64-bit kernel and user space). 
I took four cases into consideration:
(1) synchronous suspend and resume (/sys/power/pm_async = 0)
(2) asynchronous suspend and resume as introduced by the async branch at:
    http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async
(3) asynchronous suspend and resume like in (2), but with your one-liner setting
    the power.async_suspend flag for PCI bridges on top
(4) asynchronous suspend and resume like in (2), but with an extra patch that
    is appended on top

For those tests I set power.async_suspend for all USB devices, all serio input devices, the ACPI battery and the USB PCI controllers (to see the impact of the one-liner, if any).

I carried out 5 consecutive suspend-resume cycles (started from under X) on each box in each case, and the raw data are here (all times in milliseconds):
http://www.sisk.pl/kernel/data/async-suspend.pdf

The summarized data are below (the "big" numbers are averages and the +/- numbers are standard deviations, all in milliseconds):

                            HP nx6325         MSI Wind U100

sync suspend                1482 (+/- 40)     1180 (+/- 24)
sync resume                 2955 (+/- 2)      3597 (+/- 25)

async suspend               1553 (+/- 49)     1177 (+/- 32)
async resume                2692 (+/- 326)    3556 (+/- 33)

async+one-liner suspend     1600 (+/- 39)     1212 (+/- 41)
async+one-liner resume      2692 (+/- 324)    3579 (+/- 24)

async+extra suspend         1496 (+/- 37)     1217 (+/- 38)
async+extra resume          1859 (+/- 114)    1923 (+/- 35)

So, in my opinion, with the above set of "async" devices, it doesn't make sense to do async suspend at all, because the sync suspend is actually the fastest on both machines.

However, it surely is worth doing async _resume_ with the extra patch appended below, because that allows us to save 1 second or more on both machines with respect to the sync case. The other variants of async resume also produce some time savings, but (on the nx6325) at the expense of huge fluctuations from one cycle to another (so they can actually be slower than the sync resume).
Only the async resume with the extra patch is consistently better than the sync one. The impact of the one-liner is either negligible or slightly negative. Now, what does the extra patch do? Exactly the thing I was talking about, it starts all async suspends and resumes upfront. So, it looks like we both were wrong. I was wrong, because I thought the extra patch would help suspend, but not resume, while in fact it appears to help resume big time. You were wrong, because you thought that the one-liner would have positive impact, while in fact it doesn't. Concluding, at this point I'd opt for implementing asynchronous resume alone, _without_ asynchronous suspend, which is more complicated and doesn't really give us any time savings. At the same time, I'd implement the asynchronous resume in such a way that all of the async resume threads would be started before the synchronous suspend thread, because that would give us the best results. Rafael --- drivers/base/power/main.c | 48 +++++++++++++++++++++++++++++----------------- 1 file changed, 31 insertions(+), 17 deletions(-) Index: linux-2.6/drivers/base/power/main.c =================================================================== --- linux-2.6.orig/drivers/base/power/main.c +++ linux-2.6/drivers/base/power/main.c @@ -523,14 +523,9 @@ static void async_resume(void *data, asy static int device_resume(struct device *dev) { - INIT_COMPLETION(dev->power.completion); - - if (pm_async_enabled && dev->power.async_suspend - && !pm_trace_is_enabled()) { - get_device(dev); - async_schedule(async_resume, dev); + if (dev->power.async_suspend && pm_async_enabled + && !pm_trace_is_enabled()) return 0; - } return __device_resume(dev, pm_transition, false); } @@ -545,14 +540,28 @@ static int device_resume(struct device * static void dpm_resume(pm_message_t state) { struct list_head list; + struct device *dev; ktime_t starttime = ktime_get(); INIT_LIST_HEAD(&list); mutex_lock(&dpm_list_mtx); pm_transition = state; - while 
(!list_empty(&dpm_list)) { - struct device *dev = to_device(dpm_list.next); + list_for_each_entry(dev, &dpm_list, power.entry) { + if (dev->power.status < DPM_OFF) + continue; + + INIT_COMPLETION(dev->power.completion); + + if (dev->power.async_suspend && pm_async_enabled + && !pm_trace_is_enabled()) { + get_device(dev); + async_schedule(async_resume, dev); + } + } + + while (!list_empty(&dpm_list)) { + dev = to_device(dpm_list.next); get_device(dev); if (dev->power.status >= DPM_OFF) { int error; @@ -809,13 +818,8 @@ static void async_suspend(void *data, as static int device_suspend(struct device *dev) { - INIT_COMPLETION(dev->power.completion); - - if (pm_async_enabled && dev->power.async_suspend) { - get_device(dev); - async_schedule(async_suspend, dev); + if (pm_async_enabled && dev->power.async_suspend) return 0; - } return __device_suspend(dev, pm_transition, false); } @@ -827,6 +831,7 @@ static int device_suspend(struct device static int dpm_suspend(pm_message_t state) { struct list_head list; + struct device *dev; ktime_t starttime = ktime_get(); int error = 0; @@ -834,9 +839,18 @@ static int dpm_suspend(pm_message_t stat mutex_lock(&dpm_list_mtx); pm_transition = state; async_error = 0; - while (!list_empty(&dpm_list)) { - struct device *dev = to_device(dpm_list.prev); + list_for_each_entry_reverse(dev, &dpm_list, power.entry) { + INIT_COMPLETION(dev->power.completion); + + if (pm_async_enabled && dev->power.async_suspend) { + get_device(dev); + async_schedule(async_suspend, dev); + } + } + + while (!list_empty(&dpm_list)) { + dev = to_device(dpm_list.prev); get_device(dev); mutex_unlock(&dpm_list_mtx); ^ permalink raw reply [flat|nested] 98+ messages in thread
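The averages quoted in the message above can be turned into per-variant savings with a few lines of arithmetic. The sketch below only restates the numbers from the summary table; struct avg_ms and the two helper functions are hypothetical names introduced for illustration, and the standard deviations are deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>

#define N_VARIANTS 4

/* One row of the summary table: average time in milliseconds per box. */
struct avg_ms {
    const char *variant;
    int hp_nx6325;
    int msi_wind;
};

/* Averages copied from the table in the message above; row 0 is sync. */
const struct avg_ms suspend_avg[N_VARIANTS] = {
    { "sync",            1482, 1180 },
    { "async",           1553, 1177 },
    { "async+one-liner", 1600, 1212 },
    { "async+extra",     1496, 1217 },
};

const struct avg_ms resume_avg[N_VARIANTS] = {
    { "sync",            2955, 3597 },
    { "async",           2692, 3556 },
    { "async+one-liner", 2692, 3579 },
    { "async+extra",     1859, 1923 },
};

/* Time saved relative to the sync row; a negative value means slower. */
int saved_on_hp(const struct avg_ms *table, size_t variant)
{
    assert(variant < N_VARIANTS);
    return table[0].hp_nx6325 - table[variant].hp_nx6325;
}

int saved_on_wind(const struct avg_ms *table, size_t variant)
{
    assert(variant < N_VARIANTS);
    return table[0].msi_wind - table[variant].msi_wind;
}
```

For the async+extra resume row this gives 1096 ms saved on the nx6325 and 1674 ms on the Wind, which is the "1 second or more" mentioned above. For suspend, async+extra is 14 ms slower than sync on the nx6325, and plain async is 3 ms faster than sync on the Wind, a difference well inside the quoted standard deviations.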
[parent not found: <200912160311.05915.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912160311.05915.rjw@sisk.pl> @ 2009-12-16 6:40 ` Dmitry Torokhov 2009-12-16 15:22 ` Alan Stern ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-16 6:40 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Wed, Dec 16, 2009 at 03:11:05AM +0100, Rafael J. Wysocki wrote: > On Tuesday 15 December 2009, Linus Torvalds wrote: > > > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > Give a real example that matters. > > > > > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > > > like this: > > > > No. > > > > I mean something real - something like > > > > - if you run on a non-PC with two USB buses behind non-PCI controllers. > > > > - device xyz. > > > > > If this applies to _resume_ only, then I agree, but Arjan's data clearly > > > show that serio devices take much more time to suspend than USB. > > > > I mean in general - something where you actually have hard data that some > > device really needs anything more than my one-liner, and really _needs_ > > some complex infrastructure. > > > > Not "let's imagine a case like xyz". > > As I said I would, I made some measurements. > > I measured the total time of suspending and resuming devices as shown by the > code added by this patch: > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > different and the HP was running 64-bit kernel and user space). 
> > I took four cases into consideration: > (1) synchronous suspend and resume (/sys/power/pm_async = 0) > (2) asynchronous suspend and resume as introduced by the async branch at: > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async > (3) asynchronous suspend and resume like in (2), but with your one-liner setting > the power.async_suspend flag for PCI bridges on top > (4) asynchronous suspend and resume like in (2), but with an extra patch that > is appended on top > > For those tests I set power.async_suspend for all USB devices, all serio input > devices, the ACPI battery and the USB PCI controllers (to see the impact of the > one-liner, if any). > > I carried out 5 consecutive suspend-resume cycles (started from under X) on > each box in each case, and the raw data are here (all times in milliseconds): > http://www.sisk.pl/kernel/data/async-suspend.pdf > > The summarized data are below (the "big" numbers are averages and the +/- > numbers are standard deviations, all in milliseconds): > > HP nx6325 MSI Wind U100 > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > sync resume 2955 (+/- 2) 3597 (+/- 25) > > async suspend 1553 (+/- 49) 1177 (+/- 32) > async resume 2692 (+/- 326) 3556 (+/- 33) > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > So, in my opinion, with the above set of "async" devices, it doesn't > make sense to do async suspend at all, because the sync suspend is actually > the fastest on both machines. I think the async suspend is not asynchronous enough then - what kind of time do you get if you simply comment out call to psmouse_reset() in drivers/input/mouse/psmouse-base.c:psmouse_cleanup()? (Just for testing purposes only, I don't think we want to do that by default.) -- Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912160311.05915.rjw@sisk.pl> 2009-12-16 6:40 ` Dmitry Torokhov @ 2009-12-16 15:22 ` Alan Stern 2009-12-16 15:47 ` Linus Torvalds [not found] ` <20091216064025.GB2699@core.coreip.homeip.net> 3 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-16 15:22 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > I measured the total time of suspending and resuming devices as shown by the > code added by this patch: > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > different and the HP was running 64-bit kernel and user space). > I carried out 5 consecutive suspend-resume cycles (started from under X) on > each box in each case, and the raw data are here (all times in milliseconds): > http://www.sisk.pl/kernel/data/async-suspend.pdf I'd like to see much more detailed data. For each device, let's get the device name, the parent's name, and the start time, end time, and duration for suspend or resume. The start time should be measured when you have finished waiting for the children. The end time should be measured just before the complete_all(). Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
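The per-device record requested above (device name, parent name, start time, end time, duration) could look roughly like the following. This is a user-space sketch under stated assumptions, not the instrumentation that was actually used: struct pm_sample, its field names, the microsecond unit and the output format are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-device timing record; not a kernel structure. */
struct pm_sample {
    const char *dev_name;
    const char *parent_name;  /* NULL if the device has no parent */
    long start_us;            /* taken after waiting for the children */
    long end_us;              /* taken just before complete_all() */
};

/* Time the device's own suspend or resume callback took. */
long pm_sample_duration_us(const struct pm_sample *s)
{
    assert(s->end_us >= s->start_us);
    return s->end_us - s->start_us;
}

/* Format one record as a single log line; returns snprintf()'s result. */
int pm_sample_format(char *buf, size_t len, const struct pm_sample *s)
{
    return snprintf(buf, len,
                    "%s parent=%s start=%ld end=%ld duration=%ld",
                    s->dev_name,
                    s->parent_name ? s->parent_name : "(none)",
                    s->start_us, s->end_us,
                    pm_sample_duration_us(s));
}
```

Taking the start time only after the wait for the children keeps the time spent blocked on other devices out of the duration, so each record reflects the device's own work.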
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912160311.05915.rjw@sisk.pl> 2009-12-16 6:40 ` Dmitry Torokhov 2009-12-16 15:22 ` Alan Stern @ 2009-12-16 15:47 ` Linus Torvalds 2009-12-16 19:27 ` Rafael J. Wysocki [not found] ` <200912162027.16574.rjw@sisk.pl> [not found] ` <20091216064025.GB2699@core.coreip.homeip.net> 3 siblings, 2 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-16 15:47 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > The summarized data are below (the "big" numbers are averages and the +/- > numbers are standard deviations, all in milliseconds): > > HP nx6325 MSI Wind U100 > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > sync resume 2955 (+/- 2) 3597 (+/- 25) > > async suspend 1553 (+/- 49) 1177 (+/- 32) > async resume 2692 (+/- 326) 3556 (+/- 33) > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > So, in my opinion, with the above set of "async" devices, it doesn't > make sense to do async suspend at all, because the sync suspend is actually > the fastest on both machines. Hmm. I certainly agree - your numbers do not seem to support any async at all. However, I do note that the "extra patch" makes a big difference at resume time. That implies that the resume serializes on some slow device that wasn't marked async - and starting the async ones early avoids that. But without the per-device timings, it's hard to even guess what device that was. But even that doesn't really help the suspend cases, only resume. Do you have any sample timing output with devices listed? Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-16 15:47 ` Linus Torvalds @ 2009-12-16 19:27 ` Rafael J. Wysocki [not found] ` <200912162027.16574.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 19:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Wednesday 16 December 2009, Linus Torvalds wrote: > > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > > The summarized data are below (the "big" numbers are averages and the +/- > > numbers are standard deviations, all in milliseconds): > > > > HP nx6325 MSI Wind U100 > > > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > > sync resume 2955 (+/- 2) 3597 (+/- 25) > > > > async suspend 1553 (+/- 49) 1177 (+/- 32) > > async resume 2692 (+/- 326) 3556 (+/- 33) > > > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > > > So, in my opinion, with the above set of "async" devices, it doesn't > > make sense to do async suspend at all, because the sync suspend is actually > > the fastest on both machines. > > Hmm. I certainly agree - your numbers do not seem to support any async at > all. > > However, I do note that the "extra patch" makes a big difference at > resume time. That implies that the resume serializes on some slow device > that wasn't marked async - and starting the async ones early avoids that. > > But without the per-device timings, it's hard to even guess what device > that was. > > But even that doesn't really help the suspend cases, only resume. > > Do you have any sample timing output with devices listed? I'm going to generate one shortly. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912162027.16574.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912162027.16574.rjw@sisk.pl> @ 2009-12-16 20:59 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912161255080.3556@localhost.localdomain> 1 sibling, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-16 20:59 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > > Do you have any sample timing output with devices listed? > > I'm going to generate one shortly. From my bootup timings, I have this memory of SATA link bringup being noticeable. I wonder if that is the case on resume too... Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912161255080.3556@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912161255080.3556@localhost.localdomain> @ 2009-12-16 21:57 ` Rafael J. Wysocki [not found] ` <200912162257.00771.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 21:57 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Wednesday 16 December 2009, Linus Torvalds wrote: > > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > > > > Do you have any sample timing output with devices listed? > > > > I'm going to generate one shortly. I've just put the first set of data, for the HP nx6325 at: http://www.sisk.pl/kernel/data/nx6325/ The *-dmesg.log files contain full dmesg outputs starting from a cold boot and including one suspend-resume cycle in each case, with initcall_debug enabled. The *-suspend.log files are excerpts from the *-dmesg.log files containing the suspend messages only, and analogously for *-resume.log. The *-times.txt files contain the suspend/resume time for every device, sorted in decreasing order. > From my bootup timings, I have this memory of SATA link bringup being > noticeable. I wonder if that is the case on resume too... There's no SATA in the nx6325, only IDE, so we'd need to wait for the Wind data (in the works). The slowest suspending device in the nx6325 is the audio chip (surprise, surprise); it takes ~220 ms alone. Then - serio, but since i8042 was not async, the async suspend of serio didn't really help (another ~140 ms). Then network, FireWire, MMC, USB, SD host (~15 ms each). [I think we can help suspend a bit by making i8042 async, although I'm not sure that's going to be safe.] The slowest resuming are USB (by far) and then CardBus, audio, USB controllers, FireWire, network and IDE (but that only takes about 7 ms). 
But the main problem with async resume is that the USB devices are at the beginning of dpm_list, so the resume of them is not even started until _all_ of the slow devices behind them are woken up. That's why the extra patch helps so much IMO. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912162257.00771.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912162257.00771.rjw@sisk.pl> @ 2009-12-16 22:11 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912161410120.3556@localhost.localdomain> ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-16 22:11 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, LKML, pm list Btw, what are the timings if you just force everything async? I think that worked on your laptops, no? It would be interesting to know - if only to see what the asymptotic upper bound for all of this is.. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912161410120.3556@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912161410120.3556@localhost.localdomain> @ 2009-12-16 22:33 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-16 22:33 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Wednesday 16 December 2009, Linus Torvalds wrote: > > Btw, what are the timings if you just force everything async? I think that > worked on your laptops, no? No, it didn't. I could make all PCI async, provided that the ACPI subtree was resumed before any PCI devices. [Theoretically I can make that happen by moving ACPI resume to the _noirq phase (just for testing of course). So I can try to make PCI async in addition to serio and USB, plus i8042 perhaps, which should be sufficient for the nx6325 I think.] Making everything async always hung the boxes on resume. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912162257.00771.rjw@sisk.pl> 2009-12-16 22:11 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912161410120.3556@localhost.localdomain> @ 2009-12-16 23:04 ` Alan Stern 2009-12-17 1:49 ` Rafael J. Wysocki 3 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-16 23:04 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > I've just put the first set of data, for the HP nx6325 at: > http://www.sisk.pl/kernel/data/nx6325/ > > The *-dmesg.log files contain full dmesg outputs starting from a cold boot and > including one suspend-resume cycle in each case, with debug_initcall enabled. > > The *-suspend.log files are excerpts from the *-dmesg.log files containing > the suspend messages only, and analogously for *-resume.log. I've just started looking at the sync-suspend.log file. What are all the '+' characters and " @ 3368" strings after the device names? You didn't print out the parent name for each device, so the tree structure has been lost. Why do those "sd 0:0:0:0 [sda]" messages appear in between two callbacks? The cache-synchronization and the spin-down commands are not executed asynchronously. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912162257.00771.rjw@sisk.pl> ` (2 preceding siblings ...) 2009-12-16 23:04 ` Alan Stern @ 2009-12-17 1:49 ` Rafael J. Wysocki 2009-12-17 20:06 ` Alan Stern ` (2 more replies) 3 siblings, 3 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-17 1:49 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Wednesday 16 December 2009, Rafael J. Wysocki wrote: > On Wednesday 16 December 2009, Linus Torvalds wrote: > > > > On Wed, 16 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > Do you have any sample timing output with devices listed? > > > > > > I'm going to generate one shortly. > > I've just put the first set of data, for the HP nx6325 at: > http://www.sisk.pl/kernel/data/nx6325/ As I said in a message to Alan, the data were incomplete, because the original Arjan's patch only covers bus types and device classes converted to dev_pm_ops, which I only noticed earlier today. So I added the appended patch on top of the async tree and I applied a one-liner adding the name of the parent to each device line during (regular) suspend and resume. The new data sets are at: http://www.sisk.pl/kernel/data/nx6325/ http://www.sisk.pl/kernel/data/wind/ and the format is the same as described below. > The *-dmesg.log files contain full dmesg outputs starting from a cold boot and > including one suspend-resume cycle in each case, with debug_initcall enabled. > > The *-suspend.log files are excerpts from the *-dmesg.log files containing > the suspend messages only, and analogously for *-resume.log. > > The *-times.txt files contain suspend/resume time for every device sorted > in the decreasing order. > > > From my bootup timings, I have this memory of SATA link bringup being > > noticeable. I wonder if that is the case on resume too... That actually is correct. 
On the nx6325 suspend is totally dominated by disk spindown; almost everything else is negligible compared to it (well, except for the audio), so we can't go down below 1 s during suspend on this box. On the Wind, disk spindown time is comparable with serio suspend time, so at least in principle we should be able to get .5 s suspend on this box - if the disk spindown is async. In turn, the resume on the Wind is dominated by disk spinup, so we can't go below 1.5 s on this box during resume (notice that the "async+extra" approach brings us close to this limit, although we could save .5 s more in principle by making more devices async). Resume on the nx6325 is a different story, though, as it is dominated by USB and PCI devices, so marking those as async would probably bring us close to the limit. [Surprisingly enough to me, some ACPI devices appear to take quite noticeable amounts of time to resume on both boxes.] Tomorrow I'll try to mark as many devices as reasonably possible as async and see how the total suspend-resume times change. 
Rafael

---
 drivers/base/power/main.c |   97 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 77 insertions(+), 20 deletions(-)

Index: linux-2.6/drivers/base/power/main.c
===================================================================
--- linux-2.6.orig/drivers/base/power/main.c
+++ linux-2.6/drivers/base/power/main.c
@@ -165,6 +165,32 @@ void device_pm_move_last(struct device *
 	list_move_tail(&dev->power.entry, &dpm_list);
 }
 
+static ktime_t initcall_debug_start(struct device *dev)
+{
+	ktime_t calltime = ktime_set(0, 0);
+
+	if (initcall_debug) {
+		pr_info("calling %s_i+ @ %i\n",
+				dev_name(dev), task_pid_nr(current));
+		calltime = ktime_get();
+	}
+
+	return calltime;
+}
+
+static void initcall_debug_report(struct device *dev, ktime_t calltime,
+				  int error)
+{
+	ktime_t delta, rettime;
+
+	if (initcall_debug) {
+		rettime = ktime_get();
+		delta = ktime_sub(rettime, calltime);
+		pr_info("call %s+ returned %d after %Ld usecs\n", dev_name(dev),
+			error, (unsigned long long)ktime_to_ns(delta) >> 10);
+	}
+}
+
 /**
  * dpm_wait - Wait for a PM operation to complete.
  * @dev: Device to wait for.
@@ -201,13 +227,9 @@ static int pm_op(struct device *dev,
 		 pm_message_t state)
 {
 	int error = 0;
-	ktime_t calltime, delta, rettime;
+	ktime_t calltime;
 
-	if (initcall_debug) {
-		pr_info("calling %s+ @ %i\n",
-				dev_name(dev), task_pid_nr(current));
-		calltime = ktime_get();
-	}
+	calltime = initcall_debug_start(dev);
 
 	switch (state.event) {
 #ifdef CONFIG_SUSPEND
@@ -256,12 +278,7 @@ static int pm_op(struct device *dev,
 		error = -EINVAL;
 	}
 
-	if (initcall_debug) {
-		rettime = ktime_get();
-		delta = ktime_sub(rettime, calltime);
-		pr_info("call %s+ returned %d after %Ld usecs\n", dev_name(dev),
-			error, (unsigned long long)ktime_to_ns(delta) >> 10);
-	}
+	initcall_debug_report(dev, calltime, error);
 
 	return error;
 }
@@ -338,8 +355,9 @@ static int pm_noirq_op(struct device *de
 	if (initcall_debug) {
 		rettime = ktime_get();
 		delta = ktime_sub(rettime, calltime);
-		printk("initcall %s_i+ returned %d after %Ld usecs\n", dev_name(dev),
-			error, (unsigned long long)ktime_to_ns(delta) >> 10);
+		printk("initcall %s_i+ returned %d after %Ld usecs\n",
+			dev_name(dev), error,
+			(unsigned long long)ktime_to_ns(delta) >> 10);
 	}
 
 	return error;
@@ -456,6 +474,26 @@ void dpm_resume_noirq(pm_message_t state
 EXPORT_SYMBOL_GPL(dpm_resume_noirq);
 
 /**
+ * legacy_resume - Execute a legacy (bus or class) resume callback for device.
+ * dev: Device to resume.
+ * cb: Resume callback to execute.
+ */
+static int legacy_resume(struct device *dev, int (*cb)(struct device *dev))
+{
+	int error;
+	ktime_t calltime;
+
+	calltime = initcall_debug_start(dev);
+
+	error = cb(dev);
+	suspend_report_result(cb, error);
+
+	initcall_debug_report(dev, calltime, error);
+
+	return error;
+}
+
+/**
  * __device_resume - Execute "resume" callbacks for given device.
  * @dev: Device to handle.
  * @state: PM transition of the system being carried out.
@@ -477,7 +515,7 @@ static int __device_resume(struct device
 			error = pm_op(dev, dev->bus->pm, state);
 		} else if (dev->bus->resume) {
 			pm_dev_dbg(dev, state, "legacy ");
-			error = dev->bus->resume(dev);
+			error = legacy_resume(dev, dev->bus->resume);
 		}
 		if (error)
 			goto End;
@@ -498,7 +536,7 @@ static int __device_resume(struct device
 			error = pm_op(dev, dev->class->pm, state);
 		} else if (dev->class->resume) {
 			pm_dev_dbg(dev, state, "legacy class ");
-			error = dev->class->resume(dev);
+			error = legacy_resume(dev, dev->class->resume);
 		}
 	}
  End:
@@ -734,6 +772,27 @@ EXPORT_SYMBOL_GPL(dpm_suspend_noirq);
 static int async_error;
 
 /**
+ * legacy_suspend - Execute a legacy (bus or class) suspend callback for device.
+ * dev: Device to suspend.
+ * cb: Suspend callback to execute.
+ */
+static int legacy_suspend(struct device *dev, pm_message_t state,
+			  int (*cb)(struct device *dev, pm_message_t state))
+{
+	int error;
+	ktime_t calltime;
+
+	calltime = initcall_debug_start(dev);
+
+	error = cb(dev, state);
+	suspend_report_result(cb, error);
+
+	initcall_debug_report(dev, calltime, error);
+
+	return error;
+}
+
+/**
  * device_suspend - Execute "suspend" callbacks for given device.
  * @dev: Device to handle.
  * @state: PM transition of the system being carried out.
@@ -755,8 +814,7 @@ static int __device_suspend(struct devic
 			error = pm_op(dev, dev->class->pm, state);
 		} else if (dev->class->suspend) {
 			pm_dev_dbg(dev, state, "legacy class ");
-			error = dev->class->suspend(dev, state);
-			suspend_report_result(dev->class->suspend, error);
+			error = legacy_suspend(dev, state, dev->class->suspend);
 		}
 		if (error)
 			goto End;
@@ -777,8 +835,7 @@ static int __device_suspend(struct devic
 			error = pm_op(dev, dev->bus->pm, state);
 		} else if (dev->bus->suspend) {
 			pm_dev_dbg(dev, state, "legacy ");
-			error = dev->bus->suspend(dev, state);
-			suspend_report_result(dev->bus->suspend, error);
+			error = legacy_suspend(dev, state, dev->bus->suspend);
 		}
 	}

^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-17 1:49 ` Rafael J. Wysocki @ 2009-12-17 20:06 ` Alan Stern 2009-12-18 1:51 ` Rafael J. Wysocki [not found] ` <200912180251.22655.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-17 20:06 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Thu, 17 Dec 2009, Rafael J. Wysocki wrote: > That actually is correct. On the nx6325 suspend is totally dominated by disk > spindown, almost everything else is negligible compared to it (well, except for > the audio), so we can't go down below 1 s during suspend on this box. > > On the Wind, disk spindown time is comparable with serio suspend time, > so at least in principle we should be able to get .5 s suspend on this box - > if the disk spindown in async. > > In turn, the resume on the Wind is dominated by disk spinup, so we can't > go below 1.5 s on this box during resume (notice that the "async+extra" > approach brings us close to this limit, although we could save .5 s more in > principle by making more devices async). > > Resume on the nx6325 is a different story, though, as it is dominated by USB > and PCI devices, so marking those as async would probably bring us close to > the limit. The implications seem pretty clear. If the following sorts of devices were async: USB (devices and interfaces), PCI, serio, SCSI (hosts, targets, devices) then we would reap close to the maximum benefit -- providing: async threads are started in a first pass without waiting for synchronous devices, and It's not clear that making all these types of devices async will really work, but it's worth testing. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-17 1:49 ` Rafael J. Wysocki 2009-12-17 20:06 ` Alan Stern @ 2009-12-18 1:51 ` Rafael J. Wysocki [not found] ` <200912180251.22655.rjw@sisk.pl> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-18 1:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list

On Thursday 17 December 2009, Rafael J. Wysocki wrote:
...
> Tomorrow I'll try to mark as many devices as reasonably possible as async
> and see how the total suspend-resume times change.

I didn't manage to do that, but I was able to mark sd and i8042 as async and see the impact of this.

The raw data are in the usual place:

http://www.sisk.pl/kernel/data/async-suspend-resume.pdf

and the individual device timings and logs are in:

http://www.sisk.pl/kernel/data/nx6325/
http://www.sisk.pl/kernel/data/wind/

This is the summary (previous results are included for easier reference):

(times in ms)              HP nx6325          MSI Wind U100

sync suspend               1482 (+/- 40)      1180 (+/- 24)
sync resume                2955 (+/- 2)       3597 (+/- 25)

async suspend              1553 (+/- 49)      1177 (+/- 32)
async resume               2692 (+/- 326)     3556 (+/- 33)

async+one-liner suspend    1600 (+/- 39)      1212 (+/- 41)
async+one-liner resume     2692 (+/- 324)     3579 (+/- 24)

async+extra suspend        1496 (+/- 37)      1217 (+/- 38)
async+extra resume         1859 (+/- 114)     1923 (+/- 35)

with "async" i8042 and sd:

async suspend              1319 (+/- 51)      1045 (+/- 41)
async resume               2929 (+/- 3)       3546 (+/- 27)

async+extra suspend        1327 (+/- 36)      (didn't work)
async+extra resume         1742 (+/- 164)     1896 (+/- 28)

(the summary is also available at: http://www.sisk.pl/kernel/data/results.txt).

So, it actually makes the case for async suspend! Although it's not very strong, with these two additional devices marked as "async" we get noticeable suspend time improvement.

Still, the "extra" patch doesn't help on suspend at all and on the Wind the suspend part of it didn't even work (I'm yet to figure out which of the two devices crashed the suspend). Nevertheless the resume part of the "extra" patch worked in both cases and worked better than without the two additional "async" devices. To me, this means that the suspend part of the "extra" patch is not really useful. However, the resume part of it is _very_ useful, so I'd like to add that part only to the async patchset. The explanation why it helps so much is also straightforward to me. Namely, if slow async devices are last to resume, then without the "extra" patch they need to wait for all of the preceding sync devices and the speedup from executing their resume routines asynchronously is very limited. Now, with the "extra" patch their resume routines start as soon as their parents complete resuming and that may be early enough for the speedup to be significant. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912180251.22655.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912180251.22655.rjw@sisk.pl> @ 2009-12-18 17:26 ` Alan Stern 2009-12-18 23:42 ` Rafael J. Wysocki 1 sibling, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-18 17:26 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: ACPI Devel Maling List, Linus Torvalds, LKML, pm list On Fri, 18 Dec 2009, Rafael J. Wysocki wrote: > I didn't manage to do that, but I was able to mark sd and i8042 as async and > see the impact of this. Apparently this didn't do what you wanted. In the nx6325 sd+i8042+async+extra log, the 0:0:0:0 device (which is a SCSI disk) was suspended by the main thread instead of an async thread. There's an important point I neglected to mention before. Your logs don't show anything for devices with no suspend callbacks at all. Nevertheless, these devices sit on the device list and prevent other devices from suspending or resuming as soon as they could. For example, the fingerprint sensor (3-1) took the most time to resume. But other devices were delayed until after it finished because it had children with no callbacks, and they delayed the devices following them in the list. What would happen if you completed these devices immediately, as part of the first pass? Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912180251.22655.rjw@sisk.pl> 2009-12-18 17:26 ` Alan Stern @ 2009-12-18 23:42 ` Rafael J. Wysocki 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-18 23:42 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Friday 18 December 2009, Rafael J. Wysocki wrote: > On Thursday 17 December 2009, Rafael J. Wysocki wrote: > ... > > Tomorrow I'll try to mark as many devices as reasonably possible as async > > and see how the total suspend-resume times change. > > I didn't manage to do that, but I was able to mark sd and i8042 as async and > see the impact of this. > > The raw data are in the usual place: > > http://www.sisk.pl/kernel/data/async-suspend-resume.pdf > > and the individual device timings and logs are in: > > http://www.sisk.pl/kernel/data/nx6325/ > http://www.sisk.pl/kernel/data/wind/ > > This is the summary (previous results are inculded for easier reference): > > HP nx6325 MSI Wind U100 > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > sync resume 2955 (+/- 2) 3597 (+/- 25) > > async suspend 1553 (+/- 49) 1177 (+/- 32) > async resume 2692 (+/- 326) 3556 (+/- 33) > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > with "async" i8042 and sd: > > async suspend 1319 (+/- 51) 1045 (+/- 41) > async resume 2929 (+/- 3) 3546 (+/- 27) > > async+extra suspend 1327 (+/- 36) (didn't work) > async+extra resume 1742 (+/- 164) 1896 (+/- 28) > > (the summary is also available at: http://www.sisk.pl/kernel/data/results.txt). > > So, it actually makes the case for async suspend! Although it's not very > strong, with these two additional devices marked as "async" we get noticeable > suspend time improvement. 
> > Still, the "extra" patch doesn't help on suspend at all and on the Wind the > suspend part of it didn't even work (I'm yet to figure out which of the two > devices crashed the suspend). Small update. I've just verified that sd was the failing device, although I'm not sure about the reason. Apart from this, I ran some tests on the Wind with i8042 marked as "async" and sd marked as "sync". In that case all of the tests succeeded and I got the following numbers: suspend (i8042 async, full extra patch applied): 1070 (+/- 40) resume (i8042 async, full extra patch applied): 1915,84 (+/- 27) suspend (i8042 async, resume part of extra patch applied): 1050 (+/- 34) First, it looks like the suspend speedup was related to marking i8042 as "async". Since the serio devices, which are the i8042's children, were also "async" (just like in all of the tests before), this means that the speedup resulted from removing a suspend stall caused by a sync parent of async children (i8042 and serio, respectively, in this case). However, the suspend part of the extra patch doesn't really help. In fact, it even makes things worse. So, I still think the resume part of the extra patch is definitely useful, but the suspend part of it is not. IOW, it's worth running async resumes upfront, but it's not worth running async suspends upfront. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <20091216064025.GB2699@core.coreip.homeip.net>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <20091216064025.GB2699@core.coreip.homeip.net> @ 2009-12-18 22:43 ` Rafael J. Wysocki 2009-12-19 19:59 ` Dmitry Torokhov [not found] ` <20091219195935.GB4073@core.coreip.homeip.net> 0 siblings, 2 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-18 22:43 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Wednesday 16 December 2009, Dmitry Torokhov wrote: > On Wed, Dec 16, 2009 at 03:11:05AM +0100, Rafael J. Wysocki wrote: > > On Tuesday 15 December 2009, Linus Torvalds wrote: > > > > > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > > > Give a real example that matters. > > > > > > > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > > > > like this: > > > > > > No. > > > > > > I mean something real - something like > > > > > > - if you run on a non-PC with two USB buses behind non-PCI controllers. > > > > > > - device xyz. > > > > > > > If this applies to _resume_ only, then I agree, but the Arjan's data clearly > > > > show that serio devices take much more time to suspend than USB. > > > > > > I mean in general - something where you actually have hard data that some > > > device really needs anythign more than my one-liner, and really _needs_ > > > some complex infrastructure. > > > > > > Not "let's imagine a case like xyz". > > > > As I said I would, I made some measurements. > > > > I measured the total time of suspending and resuming devices as shown by the > > code added by this patch: > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > > different and the HP was running 64-bit kernel and user space). 
> > > > I took four cases into consideration: > > (1) synchronous suspend and resume (/sys/power/pm_async = 0) > > (2) asynchronous suspend and resume as introduced by the async branch at: > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async > > (3) asynchronous suspend and resume like in (2), but with your one-liner setting > > the power.async_suspend flag for PCI bridges on top > > (4) asynchronous suspend and resume like in (2), but with an extra patch that > > is appended on top > > > > For those tests I set power.async_suspend for all USB devices, all serio input > > devices, the ACPI battery and the USB PCI controllers (to see the impact of the > > one-liner, if any). > > > > I carried out 5 consecutive suspend-resume cycles (started from under X) on > > each box in each case, and the raw data are here (all times in milliseconds): > > http://www.sisk.pl/kernel/data/async-suspend.pdf > > > > The summarized data are below (the "big" numbers are averages and the +/- > > numbers are standard deviations, all in milliseconds): > > > > HP nx6325 MSI Wind U100 > > > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > > sync resume 2955 (+/- 2) 3597 (+/- 25) > > > > async suspend 1553 (+/- 49) 1177 (+/- 32) > > async resume 2692 (+/- 326) 3556 (+/- 33) > > > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > > > So, in my opinion, with the above set of "async" devices, it doesn't > > make sense to do async suspend at all, because the sync suspend is actually > > the fastest on both machines. > > I think the async suspend is not asynchronous enough then - what kind of > time do you get if you simply comment out call to psmouse_reset() in > drivers/input/mouse/psmouse-base.c:psmouse_cleanup()? 
(Just for testing > purposes only, I don't think we want to do that by default.) The problem apparently is that the i8042 suspend/resume is synchronous. Do you think it's safe to mark it as asynchronous? Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-18 22:43 ` Rafael J. Wysocki @ 2009-12-19 19:59 ` Dmitry Torokhov [not found] ` <20091219195935.GB4073@core.coreip.homeip.net> 1 sibling, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-19 19:59 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Fri, Dec 18, 2009 at 11:43:29PM +0100, Rafael J. Wysocki wrote: > On Wednesday 16 December 2009, Dmitry Torokhov wrote: > > On Wed, Dec 16, 2009 at 03:11:05AM +0100, Rafael J. Wysocki wrote: > > > On Tuesday 15 December 2009, Linus Torvalds wrote: > > > > > > > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > > > > > Give a real example that matters. > > > > > > > > > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > > > > > like this: > > > > > > > > No. > > > > > > > > I mean something real - something like > > > > > > > > - if you run on a non-PC with two USB buses behind non-PCI controllers. > > > > > > > > - device xyz. > > > > > > > > > If this applies to _resume_ only, then I agree, but the Arjan's data clearly > > > > > show that serio devices take much more time to suspend than USB. > > > > > > > > I mean in general - something where you actually have hard data that some > > > > device really needs anythign more than my one-liner, and really _needs_ > > > > some complex infrastructure. > > > > > > > > Not "let's imagine a case like xyz". > > > > > > As I said I would, I made some measurements. 
> > > > > > I measured the total time of suspending and resuming devices as shown by the > > > code added by this patch: > > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > > > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > > > different and the HP was running 64-bit kernel and user space). > > > > > > I took four cases into consideration: > > > (1) synchronous suspend and resume (/sys/power/pm_async = 0) > > > (2) asynchronous suspend and resume as introduced by the async branch at: > > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async > > > (3) asynchronous suspend and resume like in (2), but with your one-liner setting > > > the power.async_suspend flag for PCI bridges on top > > > (4) asynchronous suspend and resume like in (2), but with an extra patch that > > > is appended on top > > > > > > For those tests I set power.async_suspend for all USB devices, all serio input > > > devices, the ACPI battery and the USB PCI controllers (to see the impact of the > > > one-liner, if any). 
> > > > > > I carried out 5 consecutive suspend-resume cycles (started from under X) on > > > each box in each case, and the raw data are here (all times in milliseconds): > > > http://www.sisk.pl/kernel/data/async-suspend.pdf > > > > > > The summarized data are below (the "big" numbers are averages and the +/- > > > numbers are standard deviations, all in milliseconds): > > > > > > HP nx6325 MSI Wind U100 > > > > > > sync suspend 1482 (+/- 40) 1180 (+/- 24) > > > sync resume 2955 (+/- 2) 3597 (+/- 25) > > > > > > async suspend 1553 (+/- 49) 1177 (+/- 32) > > > async resume 2692 (+/- 326) 3556 (+/- 33) > > > > > > async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > > > async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > > > > > > async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > > > async+extra resume 1859 (+/- 114) 1923 (+/- 35) > > > > > > So, in my opinion, with the above set of "async" devices, it doesn't > > > make sense to do async suspend at all, because the sync suspend is actually > > > the fastest on both machines. > > > > I think the async suspend is not asynchronous enough then - what kind of > > time do you get if you simply comment out call to psmouse_reset() in > > drivers/input/mouse/psmouse-base.c:psmouse_cleanup()? (Just for testing > > purposes only, I don't think we want to do that by default.) > > The problem apparently is that the i8042 suspend/resume is synchronous. > > Do you think it's safe to mark it as asynchronous? > Umm.. there lie dragons. There is an implicit relationship between i8042 and PNP/ACPI devices representing keyboard and mouse ports, and I am not sure how happy i8042 (and most importantly the BIOS) will be if they get shut down before i8042. Also there is EC which is in theory independent but in practice not so much. -- Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <20091219195935.GB4073@core.coreip.homeip.net>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <20091219195935.GB4073@core.coreip.homeip.net> @ 2009-12-19 21:33 ` Rafael J. Wysocki [not found] ` <200912192233.44575.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 21:33 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Saturday 19 December 2009, Dmitry Torokhov wrote: > On Fri, Dec 18, 2009 at 11:43:29PM +0100, Rafael J. Wysocki wrote: > > On Wednesday 16 December 2009, Dmitry Torokhov wrote: > > > On Wed, Dec 16, 2009 at 03:11:05AM +0100, Rafael J. Wysocki wrote: > > > > On Tuesday 15 December 2009, Linus Torvalds wrote: > > > > > > > > > > On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > > > > > > > > > > > > > > Give a real example that matters. > > > > > > > > > > > > I'll try. Let -> denote child-parent relationships and assume dpm_list looks > > > > > > like this: > > > > > > > > > > No. > > > > > > > > > > I mean something real - something like > > > > > > > > > > - if you run on a non-PC with two USB buses behind non-PCI controllers. > > > > > > > > > > - device xyz. > > > > > > > > > > > If this applies to _resume_ only, then I agree, but the Arjan's data clearly > > > > > > show that serio devices take much more time to suspend than USB. > > > > > > > > > > I mean in general - something where you actually have hard data that some > > > > > device really needs anythign more than my one-liner, and really _needs_ > > > > > some complex infrastructure. > > > > > > > > > > Not "let's imagine a case like xyz". > > > > > > > > As I said I would, I made some measurements. 
> > > > > > > > I measured the total time of suspending and resuming devices as shown by the > > > > code added by this patch: > > > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > > > > on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they are quite > > > > different and the HP was running 64-bit kernel and user space). > > > > > > > > I took four cases into consideration: > > > > (1) synchronous suspend and resume (/sys/power/pm_async = 0) > > > > (2) asynchronous suspend and resume as introduced by the async branch at: > > > > http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async > > > > (3) asynchronous suspend and resume like in (2), but with your one-liner setting > > > > the power.async_suspend flag for PCI bridges on top > > > > (4) asynchronous suspend and resume like in (2), but with an extra patch that > > > > is appended on top > > > > > > > > For those tests I set power.async_suspend for all USB devices, all serio input > > > > devices, the ACPI battery and the USB PCI controllers (to see the impact of the > > > > one-liner, if any). 
> > > >
> > > > I carried out 5 consecutive suspend-resume cycles (started from under X) on
> > > > each box in each case, and the raw data are here (all times in milliseconds):
> > > > http://www.sisk.pl/kernel/data/async-suspend.pdf
> > > >
> > > > The summarized data are below (the "big" numbers are averages and the +/-
> > > > numbers are standard deviations, all in milliseconds):
> > > >
> > > >                            HP nx6325         MSI Wind U100
> > > >
> > > > sync suspend               1482 (+/- 40)     1180 (+/- 24)
> > > > sync resume                2955 (+/- 2)      3597 (+/- 25)
> > > >
> > > > async suspend              1553 (+/- 49)     1177 (+/- 32)
> > > > async resume               2692 (+/- 326)    3556 (+/- 33)
> > > >
> > > > async+one-liner suspend    1600 (+/- 39)     1212 (+/- 41)
> > > > async+one-liner resume     2692 (+/- 324)    3579 (+/- 24)
> > > >
> > > > async+extra suspend        1496 (+/- 37)     1217 (+/- 38)
> > > > async+extra resume         1859 (+/- 114)    1923 (+/- 35)
> > > >
> > > > So, in my opinion, with the above set of "async" devices, it doesn't
> > > > make sense to do async suspend at all, because the sync suspend is actually
> > > > the fastest on both machines.
> > >
> > > I think the async suspend is not asynchronous enough then - what kind of
> > > time do you get if you simply comment out call to psmouse_reset() in
> > > drivers/input/mouse/psmouse-base.c:psmouse_cleanup()? (Just for testing
> > > purposes only, I don't think we want to do that by default.)
> >
> > The problem apparently is that the i8042 suspend/resume is synchronous.
> >
> > Do you think it's safe to mark it as asynchronous?
> >
>
> Umm.. there lie dragons. There is an implicit relationship between i8042
> and PNP/ACPI devices representing keyboard and mouse ports, and I am not
> sure how happy i8042 (and most importantly the BIOS) will be if they get
> shut down before i8042. Also there is EC which is in theory independent
> but in practice not so much.

I see.
Is it possible to identify the ACPI devices that should wait for the i8042 suspend and that should be waited for by it on resume? Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
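For readers who want to compare the four cases in the summary table quoted above, the following is a small sketch that expresses each async case as a percentage change against the synchronous baseline. The averages are copied from the table; the dictionary layout and the function name `change_vs_sync` are only illustrative and are not part of any patch in this thread.

```python
# Average device suspend/resume times in milliseconds, copied from the
# summary table in the quoted message. Standard deviations are omitted.
TIMES_MS = {
    "HP nx6325": {
        "sync": {"suspend": 1482, "resume": 2955},
        "async": {"suspend": 1553, "resume": 2692},
        "async+one-liner": {"suspend": 1600, "resume": 2692},
        "async+extra": {"suspend": 1496, "resume": 1859},
    },
    "MSI Wind U100": {
        "sync": {"suspend": 1180, "resume": 3597},
        "async": {"suspend": 1177, "resume": 3556},
        "async+one-liner": {"suspend": 1212, "resume": 3579},
        "async+extra": {"suspend": 1217, "resume": 1923},
    },
}


def change_vs_sync(box, case, phase):
    """Percentage change of `case` relative to the sync baseline.

    A negative result means the case was faster than synchronous
    suspend/resume on that box; a positive result means it was slower.
    """
    base = TIMES_MS[box]["sync"][phase]
    return 100.0 * (TIMES_MS[box][case][phase] - base) / base


# Print one line per (box, case, phase) combination other than the baseline.
for box, cases in TIMES_MS.items():
    for case, phases in cases.items():
        if case == "sync":
            continue
        for phase in phases:
            print(f"{box:14s} {case:16s} {phase:8s} "
                  f"{change_vs_sync(box, case, phase):+6.1f}%")
```

Read this way, the numbers show the async+extra case resuming noticeably faster than the sync case on both boxes, while on the suspend side the only async average below its sync baseline is the plain async case on the Wind, and that by just 3 ms. Keep in mind that each average covers only 5 cycles, so small differences are within the reported standard deviations.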
[parent not found: <200912192233.44575.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912192233.44575.rjw@sisk.pl> @ 2009-12-19 22:29 ` Rafael J. Wysocki [not found] ` <200912192329.03251.rjw@sisk.pl> ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 22:29 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list
On Saturday 19 December 2009, Rafael J. Wysocki wrote:
> On Saturday 19 December 2009, Dmitry Torokhov wrote:
> > On Fri, Dec 18, 2009 at 11:43:29PM +0100, Rafael J. Wysocki wrote:
> > > The problem apparently is that the i8042 suspend/resume is synchronous.
> > >
> > > Do you think it's safe to mark it as asynchronous?
> > >
> >
> > Umm.. there lie dragons. There is an implicit relationship between i8042
> > and PNP/ACPI devices representing keyboard and mouse ports, and I am not
> > sure how happy i8042 (and most importantly the BIOS) will be if they get
> > shut down before i8042. Also there is EC which is in theory independent
> > but in practice not so much.
>
> I see.
>
> Is it possible to identify ACPI devices that should wait for the i8042
> suspend and that should be waited for by it on resume?

Wait, if you look at the logs at

http://www.sisk.pl/kernel/data/nx6325/
http://www.sisk.pl/kernel/data/wind/

you'll see that the i8042 suspend is called before any ACPI devices are suspended anyway. In fact, it is suspended right after its serio children which is very early in the suspend sequence.

So, it seems, if there were any problems with i8042 vs ACPI, we'd experience them anyway.

Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912192329.03251.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912192329.03251.rjw@sisk.pl> @ 2009-12-19 22:43 ` Dmitry Torokhov 0 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-19 22:43 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list
On Dec 19, 2009, at 2:29 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Saturday 19 December 2009, Rafael J. Wysocki wrote:
>> Is it possible to identify ACPI devices that should wait for the i8042
>> suspend and that should be waited for by it on resume?
>
> Wait, if you look at the logs at
>
> http://www.sisk.pl/kernel/data/nx6325/
> http://www.sisk.pl/kernel/data/wind/
>
> you'll see that the i8042 suspend is called before any ACPI devices are
> suspended anyway. In fact, it is suspended right after its serio children
> which is very early in the suspend sequence.

Right, and we do want to "suspend" i8042 (well, reset to the initial state we found it at bootup) before touching ACPI. If i8042 is async, given the fact that psmouse reset takes a long time, it is possible that we start suspending PNP before we are done with i8042.

--
Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
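The race described in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not the kernel patch, the device names and sleep durations are made up for illustration, and Python's `threading.Event` stands in for a per-device completion. Each device waits for the "completions" of the devices it depends on before running its own suspend step.

```python
import threading
import time


class Device:
    """Hypothetical stand-in for a device taking part in async suspend."""

    def __init__(self, name, suspend_time, wait_for=()):
        self.name = name
        self.suspend_time = suspend_time   # seconds, illustrative only
        self.wait_for = tuple(wait_for)    # devices that must suspend first
        self.done = threading.Event()      # plays the role of a completion


def suspend(dev, log, lock):
    # Block until every device we depend on has finished suspending.
    for other in dev.wait_for:
        other.done.wait()
    time.sleep(dev.suspend_time)           # simulated suspend work
    with lock:
        log.append(dev.name)               # record the finishing order
    dev.done.set()                         # signal our own completion


def async_suspend(devices):
    """Suspend all devices in parallel threads; return the finishing order."""
    log, lock = [], threading.Lock()
    threads = [threading.Thread(target=suspend, args=(d, log, lock))
               for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


def demo(pnp_waits_for_i8042):
    # The slow psmouse reset is modelled as a long suspend of the serio port.
    serio = Device("serio1", 0.2)
    i8042 = Device("i8042", 0.01, wait_for=[serio])
    pnp = Device("pnp-mouse-port", 0.01,
                 wait_for=[i8042] if pnp_waits_for_i8042 else [])
    return async_suspend([pnp, i8042, serio])


print("without extra dependency:", demo(False))
print("with extra dependency:   ", demo(True))
```

Without the extra dependency, the PNP stand-in finishes while i8042 is still waiting for its slow serio child, which is the ordering problem raised above. With the extra wait in place, the PNP device suspends only after i8042 is done. How such a dependency would actually be discovered and registered in the kernel is exactly the open question in this part of the thread.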
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912192233.44575.rjw@sisk.pl> 2009-12-19 22:29 ` Rafael J. Wysocki [not found] ` <200912192329.03251.rjw@sisk.pl> @ 2009-12-19 22:47 ` Dmitry Torokhov [not found] ` <A37A0A6F-3662-40C9-BE1F-B9F6A38CD80B@gmail.com> 3 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-19 22:47 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list
On Dec 19, 2009, at 1:33 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Saturday 19 December 2009, Dmitry Torokhov wrote:
>> Umm.. there lie dragons. There is an implicit relationship between i8042
>> and PNP/ACPI devices representing keyboard and mouse ports, and I am not
>> sure how happy i8042 (and most importantly the BIOS) will be if they get
>> shut down before i8042. Also there is EC which is in theory independent
>> but in practice not so much.
>
> I see.
>
> Is it possible to identify ACPI devices that should wait for the i8042
> suspend and that should be waited for by it on resume?

We could try to add some dependencies while discovering PNP to get KBC addresses in i8042, but we need to make sure we do it even in the presence of i8042.nopnp.

--
Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <A37A0A6F-3662-40C9-BE1F-B9F6A38CD80B@gmail.com>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <A37A0A6F-3662-40C9-BE1F-B9F6A38CD80B@gmail.com> @ 2009-12-19 23:10 ` Rafael J. Wysocki [not found] ` <200912200010.19899.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 23:10 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list
On Saturday 19 December 2009, Dmitry Torokhov wrote:
> On Dec 19, 2009, at 1:33 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > Is it possible to identify ACPI devices that should wait for the i8042
> > suspend and that should be waited for by it on resume?
>
> We could try to add some dependencies while discovering PNP to get KBC
> addresses in i8042 but we need to make sure we do it even in presence
> of i8042.nopnp.

Well, I guess this is the example of the off-tree dependencies that actually matter Linus wanted. :-)

I guess there are quite a few devices that can depend on the i8042 in principle, is this correct?

Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912200010.19899.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200010.19899.rjw@sisk.pl> @ 2009-12-19 23:22 ` Dmitry Torokhov 2009-12-19 23:23 ` Linus Torvalds ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-19 23:22 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list
On Dec 19, 2009, at 3:10 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Saturday 19 December 2009, Dmitry Torokhov wrote:
>> We could try to add some dependencies while discovering PNP to get KBC
>> addresses in i8042 but we need to make sure we do it even in presence
>> of i8042.nopnp.
>
> Well, I guess this is the example of the off-tree dependencies that
> actually matter Linus wanted. :-)
>
> I guess there are quite a few devices that can depend on the i8042 in
> principle, is this correct?

The devices that depend on i8042 are the serio ports that are its children. i8042 itself may have an indirect dependency on a couple of PNP devices.

I hope this answers your question...

--
Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200010.19899.rjw@sisk.pl> 2009-12-19 23:22 ` Dmitry Torokhov @ 2009-12-19 23:23 ` Linus Torvalds [not found] ` <43A402BB-6AB3-4127-A441-D53EDE09F22E@gmail.com> [not found] ` <alpine.LFD.2.00.0912191521180.3712@localhost.localdomain> 3 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-19 23:23 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > Well, I guess this is the example of the off-tree dependencies that actually > matter Linus wanted. :-) It's also the kind of dependency where I say "if we get into these kinds of messes, then the whole async crap isn't worth it". Really. Having to try to match things up with ACPI and PnP is a nightmare. Especially since I doubt Windows does anything like this, which means that there's no reason for BIOS vendors to do the tables so that we'd even know. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <43A402BB-6AB3-4127-A441-D53EDE09F22E@gmail.com>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <43A402BB-6AB3-4127-A441-D53EDE09F22E@gmail.com> @ 2009-12-19 23:33 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 23:33 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sunday 20 December 2009, Dmitry Torokhov wrote: > On Dec 19, 2009, at 3:10 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > > On Saturday 19 December 2009, Dmitry Torokhov wrote: > >> On Dec 19, 2009, at 1:33 PM, "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > >> > >>> On Saturday 19 December 2009, Dmitry Torokhov wrote: > >>>> On Fri, Dec 18, 2009 at 11:43:29PM +0100, Rafael J. Wysocki wrote: > >>>>> On Wednesday 16 December 2009, Dmitry Torokhov wrote: > >>>>>> On Wed, Dec 16, 2009 at 03:11:05AM +0100, Rafael J. Wysocki > >>>>>> wrote: > >>>>>>> On Tuesday 15 December 2009, Linus Torvalds wrote: > >>>>>>>> > >>>>>>>> On Tue, 15 Dec 2009, Rafael J. Wysocki wrote: > >>>>>>>>>> > >>>>>>>>>> Give a real example that matters. > >>>>>>>>> > >>>>>>>>> I'll try. Let -> denote child-parent relationships and assume > >>>>>>>>> dpm_list looks > >>>>>>>>> like this: > >>>>>>>> > >>>>>>>> No. > >>>>>>>> > >>>>>>>> I mean something real - something like > >>>>>>>> > >>>>>>>> - if you run on a non-PC with two USB buses behind non-PCI > >>>>>>>> controllers. > >>>>>>>> > >>>>>>>> - device xyz. > >>>>>>>> > >>>>>>>>> If this applies to _resume_ only, then I agree, but the > >>>>>>>>> Arjan's data clearly > >>>>>>>>> show that serio devices take much more time to suspend than > >>>>>>>>> USB. > >>>>>>>> > >>>>>>>> I mean in general - something where you actually have hard data > >>>>>>>> that some > >>>>>>>> device really needs anythign more than my one-liner, and really > >>>>>>>> _needs_ > >>>>>>>> some complex infrastructure. > >>>>>>>> > >>>>>>>> Not "let's imagine a case like xyz". 
> >>>>>>> > >>>>>>> As I said I would, I made some measurements. > >>>>>>> > >>>>>>> I measured the total time of suspending and resuming devices as > >>>>>>> shown by the > >>>>>>> code added by this patch: > >>>>>>> http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=commitdiff_plain;h=c1b8fc0a8bff7707c10f31f3d26bfa88e18ccd94;hp=087dbf5f079f1b55cbd3964c9ce71268473d5b67 > >>>>>>> on two boxes, HP nx6325 and MSI Wind U100 (hardware-wise they > >>>>>>> are quite > >>>>>>> different and the HP was running 64-bit kernel and user space). > >>>>>>> > >>>>>>> I took four cases into consideration: > >>>>>>> (1) synchronous suspend and resume (/sys/power/pm_async = 0) > >>>>>>> (2) asynchronous suspend and resume as introduced by the async > >>>>>>> branch at: > >>>>>>> http://git.kernel.org/?p=linux/kernel/git/rafael/suspend-2.6.git;a=shortlog;h=refs/heads/async > >>>>>>> (3) asynchronous suspend and resume like in (2), but with your > >>>>>>> one-liner setting > >>>>>>> the power.async_suspend flag for PCI bridges on top > >>>>>>> (4) asynchronous suspend and resume like in (2), but with an > >>>>>>> extra patch that > >>>>>>> is appended on top > >>>>>>> > >>>>>>> For those tests I set power.async_suspend for all USB devices, > >>>>>>> all serio input > >>>>>>> devices, the ACPI battery and the USB PCI controllers (to see > >>>>>>> the impact of the > >>>>>>> one-liner, if any). 
> >>>>>>> > >>>>>>> I carried out 5 consecutive suspend-resume cycles (started from > >>>>>>> under X) on > >>>>>>> each box in each case, and the raw data are here (all times in > >>>>>>> milliseconds): > >>>>>>> http://www.sisk.pl/kernel/data/async-suspend.pdf > >>>>>>> > >>>>>>> The summarized data are below (the "big" numbers are averages > >>>>>>> and the +/- > >>>>>>> numbers are standard deviations, all in milliseconds): > >>>>>>> > >>>>>>> HP nx6325 MSI Wind U100 > >>>>>>> > >>>>>>> sync suspend 1482 (+/- 40) 1180 (+/- 24) > >>>>>>> sync resume 2955 (+/- 2) 3597 (+/- 25) > >>>>>>> > >>>>>>> async suspend 1553 (+/- 49) 1177 (+/- 32) > >>>>>>> async resume 2692 (+/- 326) 3556 (+/- 33) > >>>>>>> > >>>>>>> async+one-liner suspend 1600 (+/- 39) 1212 (+/- 41) > >>>>>>> async+one-liner resume 2692 (+/- 324) 3579 (+/- 24) > >>>>>>> > >>>>>>> async+extra suspend 1496 (+/- 37) 1217 (+/- 38) > >>>>>>> async+extra resume 1859 (+/- 114) 1923 (+/- 35) > >>>>>>> > >>>>>>> So, in my opinion, with the above set of "async" devices, it > >>>>>>> doesn't > >>>>>>> make sense to do async suspend at all, because the sync suspend > >>>>>>> is actually > >>>>>>> the fastest on both machines. > >>>>>> > >>>>>> I think the async suspend is not asynchronous enough then - what > >>>>>> kind of > >>>>>> time do you get if you simply comment out call to psmouse_reset() > >>>>>> in > >>>>>> drivers/input/mouse/psmouse-base.c:psmouse_cleanup()? (Just for > >>>>>> testing > >>>>>> purposes only, I don't think we want to do that by default.) > >>>>> > >>>>> The problem apparently is that the i8042 suspend/resume is > >>>>> synchronous. > >>>>> > >>>>> Do you think it's safe to mark it as asynchronous? > >>>>> > >>>> > >>>> Umm.. there lie dragons. 
There is an implicit relationship between > >>>> i8042 > >>>> and PNP/ACPI devices representing keyboard and mouse ports, and I > >>>> am not > >>>> sure how happy i8042 (and most importantly the BIOS) will be if > >>>> they get > >>>> shut down before i8042. Also there is EC which is in theory > >>>> independent > >>>> but in practice not so much. > >>> > >>> I see. > >>> > >>> Is this possible to identify ACPI devices that should wait for the > >>> i8042 > >>> suspend and that should be waited for by it on resume? > >> > >> We could try to add some dependencies while discovering PNP to get > >> KBC > >> addresses in i8042 but we need tomake sure we do it even in presence > >> of i8042.nopnp. > > > > Well, I guess this is the example of the off-tree dependencies that > > actually > > matter Linus wanted. :-) > > > > I guess there are quite a few devices that can depend on the i8042 in > > principle, is this correct? > > The devices that depend on i8042 are serio ports that are it's > children. That I already knew. :-) > I8042 itself may have indirect dependency on a couple of PNP devices. I was really asking about these. > I hope this answers your question... Yes, thanks. ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912191521180.3712@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191521180.3712@localhost.localdomain> @ 2009-12-19 23:40 ` Rafael J. Wysocki [not found] ` <200912200040.18944.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 23:40 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sunday 20 December 2009, Linus Torvalds wrote: > > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > > Well, I guess this is the example of the off-tree dependencies that actually > > matter Linus wanted. :-) > > It's also the kind of dependency where I say "if we get into these kinds > of messes, then the whole async crap isn't worth it". > > Really. Having to try to match things up with ACPI and PnP is a nightmare. > Especially since I doubt Windows does anything like this, which means that > there's no reason for BIOS vendors to do the tables so that we'd even > know. OK, so this means we can just forget about suspending/resuming i8042 asynchronously, which is a pity, because that gave us some real suspend speedup on my test systems. Well, whatever. So, seriously, do you think it makes sense to do asynchronous suspend at all? I'm asking, because we're likely to get into troubles like this during suspend for other kinds of devices too and without resolving them we won't get any significant speedup from asynchronous suspend. That said, to me it's definitely worth doing asynchronous resume with the "start asynch threads upfront" modification, as the results of the tests show that quite clearly. I hope you agree. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
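To put numbers on the claim above, here is a minimal sketch that computes the relative change between the "sync" rows and the "async+extra" rows of the summary table quoted earlier in this thread. The helper name `reduction_tenths_of_percent` is made up for illustration. It is also an assumption that "async+extra" is the configuration meant by the "start async threads upfront" modification.

```c
#include <assert.h>

/*
 * Relative reduction of measured_ms against baseline_ms, expressed in
 * tenths of a percent (370 means 37.0 %). A negative result means the
 * measured case is slower than the baseline. Integer division truncates
 * toward zero. Hypothetical helper, not part of any kernel code.
 */
int reduction_tenths_of_percent(int baseline_ms, int measured_ms)
{
	assert(baseline_ms > 0);
	return (baseline_ms - measured_ms) * 1000 / baseline_ms;
}
```

With the averages from the table, resume on the HP nx6325 (2955 ms down to 1859 ms) comes out at 370, i.e. 37.0 %, and resume on the MSI Wind U100 (3597 ms down to 1923 ms) at 465, i.e. 46.5 %. Suspend comes out at -9 and -31, roughly 1 % and 3 % slower, which is about the size of the reported standard deviations.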
[parent not found: <200912200040.18944.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200040.18944.rjw@sisk.pl> @ 2009-12-19 23:46 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain> 2009-12-20 3:59 ` Alan Stern 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-19 23:46 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > OK, so this means we can just forget about suspending/resuming i8042 > asynchronously, which is a pity, because that gave us some real suspend > speedup on my test systems. No. What it means is that you shouldn't try to come up with these idiotic scenarios just trying to make trouble for yourself, and using it as an excuse for crap. I suggest you try to treat the i8042 controller async, and see if it is problematic. If it isn't, don't do that then. But we actually have no real reason to believe that it would be problematic, at least on a PC where the actual logic is on the SB (presumably behind the LPC controller). Why would it be? The fact that PnP and ACPI enumerates those devices has exactly _what_ to do with anything? Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain> @ 2009-12-19 23:47 ` Linus Torvalds 2009-12-19 23:53 ` Rafael J. Wysocki ` (2 subsequent siblings) 3 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-19 23:47 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sat, 19 Dec 2009, Linus Torvalds wrote: > > I suggest you try to treat the i8042 controller async, and see if it is > problematic. If it isn't, don't do that then. I obviously meant: "If it _is_ problematic, don't do that then". "Is", not "isn't". Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain> 2009-12-19 23:47 ` Linus Torvalds @ 2009-12-19 23:53 ` Rafael J. Wysocki [not found] ` <alpine.LFD.2.00.0912191546250.3712@localhost.localdomain> [not found] ` <200912200053.45988.rjw@sisk.pl> 3 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 23:53 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sunday 20 December 2009, Linus Torvalds wrote: > > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > > OK, so this means we can just forget about suspending/resuming i8042 > > asynchronously, which is a pity, because that gave us some real suspend > > speedup on my test systems. > > No. What it means is that you shouldn't try to come up with these idiotic > scenarios just trying to make trouble for yourself, I haven't. I've just asked Dmitry for his opinion and got it. The fact that you don't like it doesn't mean it's actually "idiotic". > and using it as an excuse for crap. I'm not sure what you mean exactly, but whatever. > I suggest you try to treat the i8042 controller async, and see if it is > problematic. I already have and I don't see problems with it, but quite obviously I can't test all possible configurations out there. > If it isn't, don't do that then. But we actually have no real > reason to believe that it would be problematic, at least on a PC where the > actual logic is on the SB (presumably behind the LPC controller). > > Why would it be? The embedded controller may depend on it. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912191546250.3712@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191546250.3712@localhost.localdomain> @ 2009-12-19 23:54 ` Rafael J. Wysocki 0 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-19 23:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sunday 20 December 2009, Linus Torvalds wrote: > > On Sat, 19 Dec 2009, Linus Torvalds wrote: > > > > I suggest you try to treat the i8042 controller async, and see if it is > > problematic. If it isn't, don't do that then. > > I obviously meant: "If it _is_ problematic, don't do that then". "Is", not > "isn't". Sure, I understood that was a typo. :-) Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912200053.45988.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200053.45988.rjw@sisk.pl> @ 2009-12-20 0:09 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain> 2009-12-20 2:45 ` Async suspend-resume patch w/ completions (was: Re: Async suspend-resume " Dmitry Torokhov 2 siblings, 0 replies; 98+ messages in thread From: Linus Torvalds @ 2009-12-20 0:09 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > > Why would it be? > > The embedded controller may depend on it. Again, I say "why?" Anything can be true. That doesn't _make_ everything true. There's no real reason why PnP/ACPI suspend/resume should really care. We can try it. Not for 2.6.33, but by the 34 merge window maybe we'll have a patch-series that is ready to be tested, and that aggressively tries to do the devices that matter asynchronously. So instead of you trying to make up some idiotic cross-device worries, just see if those worries have any actual background in reality. So far I haven't actually heard anything but "in theory, anything is possible", which is such a truism that it's not even worth voicing. That said, I still get the feeling that we'd be even better off simply trying to avoid the whole keyboard reset entirely. Apparently we do it for a few HP laptops. It's entirely possible that we'd be better off simply not _doing_ the slow thing in the first place. For example, we may be _much_ better off doing that whole keyboard reset at resume time than at suspend time. That's what we do when we probe things on initialization - and the resume-time keyboard code is actually already asynchronous, it does that atkbd_reconnect asynchronously by queuing it as an event. 
So again, all these problems may not at all be fundamental problems: the keyboard driver does certain things, but there is no guarantee that it _needs_ to do those things. Turning the driver async may be totally the wrong thing to do, when we could potentially fix latency problems at the driver level instead. Linus ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain> @ 2009-12-20 0:35 ` Rafael J. Wysocki 2009-12-20 2:41 ` Dmitry Torokhov [not found] ` <20091220024142.GC4073@core.coreip.homeip.net> 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 0:35 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, pm list On Sunday 20 December 2009, Linus Torvalds wrote: > > On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > > > > > > Why would it be? > > > > The embedded controller may depend on it. > > Again, I say "why?" > > Anything can be true. That doesn't _make_ everything true. There's no real > reason why PnP/ACPI suspend/resume should really care. > > We can try it. Not for 2.6.33, but by the 34 merge window maybe we'll have > a patch-series that is ready to be tested, and that aggressively tries to > do the devices that matter asynchronously. Yes, I'd like to have such a patch series for 2.6.34. So far I've been able to confirm that doing serio+i8042, USB and ACPI battery asynchronously may give us significant time savings, especially during resume. > So instead of you trying to make up some idiotic cross-device worries, > just see if those worries have any actual background in reality. So far I > haven't actually heard anything but "in theory, anything is possible", > which is such a truism that it's not even worth voicing. > > That said, I still get the feeling that we'd be even better off simply > trying to avoid the whole keyboard reset entirely. Apparently we do it for > a few HP laptops. It's entirely possible that we'd be better off simply > not _doing_ the slow thing in the first place. That very well may be the case, but I'm not the right person to confirm or deny that. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain> 2009-12-20 0:35 ` Rafael J. Wysocki @ 2009-12-20 2:41 ` Dmitry Torokhov [not found] ` <20091220024142.GC4073@core.coreip.homeip.net> 2 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-20 2:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: LKML, ACPI Devel Maling List, Vojtech Pavlik, pm list On Sat, Dec 19, 2009 at 04:09:07PM -0800, Linus Torvalds wrote: > > That said, I still get the feeling that we'd be even better off simply > trying to avoid the whole keyboard reset entirely. Apparently we do it for > a few HP laptops. I was mistaken, HP laptops do not like mouse disabled when suspending, not sure about the rest of the state. > It's entirely possible that we'd be better off simply > not _doing_ the slow thing in the first place. > The reset appeared first in 2.5.42. I expect that some BIOSes get very confused when they find mouse speaking something that they do not understand (i.e. synaptics, ALPS or anything else that is not bare PS/2 or intellimouse), but maybe Vojtech remembers better? > For example, we may be _much_ better off doing that whole keyboard reset > at resume time than at suspend time. We do the reset for different reasons - at resume we want the device in known state to ensure that it properly responds to the probes we send to it. At suspend we are trying to reset things into original state so that the firmware will not be confused. If we want to try to live without reset we could do PSMOUSE_CMD_RESET_DIS instead of PSMOUSE_CMD_RESET_BAT which is much heavier. We should probably not wait for .34 then because the bulk of testing will happen only when .33 is close to being released because that's when most of regular users will start using the new code and try to suspend and resume.
Rafael, how long does suspend take if you change call to psmouse_reset() in psmouse_cleanup() to ps2_command(&psmouse->ps2dev, NULL, PSMOUSE_CMD_RESET_DIS)? And do the same for atkbd... BTW, making just serio asynchronous while keeping i8042 synchronous makes no sense because I serialize access to i8042 - the thing does not survive simultaneous [command] access to both keyboard and mouse... -- Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
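A minimal sketch of why PSMOUSE_CMD_RESET_BAT is the heavier of the two commands. It assumes the usual libps2 encoding of command words: opcode in the low byte, number of reply bytes in bits 8-11, number of parameter bytes in bits 12-15. The two numeric values are quoted from memory of drivers/input/mouse/psmouse.h and should be checked against the tree; the `SKETCH_*` and `ps2_sketch_*` names are invented for this example and are not kernel identifiers.

```c
#include <assert.h>

/* Assumed values of PSMOUSE_CMD_RESET_DIS and PSMOUSE_CMD_RESET_BAT. */
#define SKETCH_CMD_RESET_DIS 0x00f6
#define SKETCH_CMD_RESET_BAT 0x02ff

/* Low byte: the opcode actually sent to the device. */
unsigned int ps2_sketch_opcode(unsigned int command)
{
	return command & 0xff;
}

/* Bits 8-11: how many reply bytes the host waits for after the ACK. */
unsigned int ps2_sketch_reply_bytes(unsigned int command)
{
	return (command >> 8) & 0xf;
}

/* Bits 12-15: how many parameter bytes the host sends with the command. */
unsigned int ps2_sketch_param_bytes(unsigned int command)
{
	return (command >> 12) & 0xf;
}
```

Under these assumptions RESET_BAT (opcode 0xff) makes the host wait for two reply bytes, which the device only sends once its self-test has finished, whereas RESET_DIS (opcode 0xf6) expects nothing beyond the acknowledgement. That would be consistent with the suspend-time savings Rafael reports later in the thread.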
[parent not found: <20091220024142.GC4073@core.coreip.homeip.net>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <20091220024142.GC4073@core.coreip.homeip.net> @ 2009-12-20 19:25 ` Rafael J. Wysocki [not found] ` <200912202025.25618.rjw@sisk.pl> 1 sibling, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-20 19:25 UTC (permalink / raw) To: linux-pm Cc: ACPI Devel Maling List, Dmitry Torokhov, Linus Torvalds, LKML, Vojtech Pavlik On Sunday 20 December 2009, Dmitry Torokhov wrote: > On Sat, Dec 19, 2009 at 04:09:07PM -0800, Linus Torvalds wrote: > > > > That said, I still get the feeling that we'd be even better off simply > > trying to avoid the whole keyboard reset entirely. Apparently we do it for > > a few HP laptops. > > I was mistaken, HP laptops do not like mouse disabled when suspending, > not sure about the rest of the state. > > > It's entirely possible that we'd be better off simply > > not _doing_ the slow thing in the first place. > > > > The reset appeared first in 2.5.42. I expect that some BIOSes get very > confused when tehy find mouse speaking something that they do not > unserstand (i.e. synaptics, ALPS or anything else that is not bare PS/2 > or intellimouse), but maybe Vojtech remembers better? > > > For example, we may be _much_ better off doing that whole keyboard reset > > at resume time than at suspend time. > > We do the reset for the different reasons - at resume we want the device > in known state to ensure that it properly responds to the probes we > send to it. At suspend we trying to reset things into original state so > that the firmware will not be confused. > > If we want to try to live without reset we could to PSMOUSE_CMD_RESET_DIS > instead of PSMOUSE_CMD_RESET_BAT which is much heavier. We should > probably not wait for .34 then because the bulk of testing will happen > only when .33 is close to be released because that's when most of > regular users will start using the new code and try to suspend and > resume. 
> > Rafael, how long does suspend take if you change call to psmouse_reset() > in psmouse_cleanup() to ps2_command(&psmouse->ps2dev, NULL, PSMOUSE_CMD_RESET_DIS)? > And do the same for atkbd... On the nx6325 that appears to reduce the suspend time as much so the effect of async is not visible any more. On the Wind it decreases the total suspend time almost by half! Please push this patch to Linus. :-) > BTW, making just serio asynchronous while keeping i8042 synchronous > makes no sense because I serialize access to i8042 - the thing does not > survive simultaneous [command] access to both keyboard and mouse... OK Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <200912202025.25618.rjw@sisk.pl>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912202025.25618.rjw@sisk.pl> @ 2009-12-21 7:39 ` Dmitry Torokhov [not found] ` <20091221073915.GC3234@core.coreip.homeip.net> 1 sibling, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-21 7:39 UTC (permalink / raw) To: Rafael J. Wysocki Cc: ACPI Devel Maling List, linux-pm, Vojtech Pavlik, Linus Torvalds, LKML On Sun, Dec 20, 2009 at 08:25:25PM +0100, Rafael J. Wysocki wrote: > On Sunday 20 December 2009, Dmitry Torokhov wrote: > > On Sat, Dec 19, 2009 at 04:09:07PM -0800, Linus Torvalds wrote: > > > > > > That said, I still get the feeling that we'd be even better off simply > > > trying to avoid the whole keyboard reset entirely. Apparently we do it for > > > a few HP laptops. > > > > I was mistaken, HP laptops do not like mouse disabled when suspending, > > not sure about the rest of the state. > > > > > It's entirely possible that we'd be better off simply > > > not _doing_ the slow thing in the first place. > > > > > > > The reset appeared first in 2.5.42. I expect that some BIOSes get very > > confused when they find mouse speaking something that they do not > > understand (i.e. synaptics, ALPS or anything else that is not bare PS/2 > > or intellimouse), but maybe Vojtech remembers better? > > > > > For example, we may be _much_ better off doing that whole keyboard reset > > > at resume time than at suspend time. > > > > We do the reset for different reasons - at resume we want the device > > in known state to ensure that it properly responds to the probes we > > send to it. At suspend we are trying to reset things into original state so > > that the firmware will not be confused. > > > > If we want to try to live without reset we could do PSMOUSE_CMD_RESET_DIS > > instead of PSMOUSE_CMD_RESET_BAT which is much heavier.
We should > > probably not wait for .34 then because the bulk of testing will happen > > only when .33 is close to be released because that's when most of > > regular users will start using the new code and try to suspend and > > resume. > > > > Rafael, how long does suspend take if you change call to psmouse_reset() > > in psmouse_cleanup() to ps2_command(&psmouse->ps2dev, NULL, PSMOUSE_CMD_RESET_DIS)? > > And do the same for atkbd... > > On the nx6325 that appears to reduce the suspend time as much so the effect > of async is not visible any more. On the Wind it decreases the total suspend > time almost by half! > > Please push this patch to Linus. :-) > Let's see if I manage to solicit some testers first. FWIW it seems to be working on my boxes. But if this works then I am not sure we even want to bother with async suspend of i8042 and serios. And serio already does resume asynchronously through kseriod. -- Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
[parent not found: <20091221073915.GC3234@core.coreip.homeip.net>]
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <20091221073915.GC3234@core.coreip.homeip.net> @ 2009-12-21 11:20 ` Vojtech Pavlik 0 siblings, 0 replies; 98+ messages in thread From: Vojtech Pavlik @ 2009-12-21 11:20 UTC (permalink / raw) To: Dmitry Torokhov; +Cc: ACPI Devel Maling List, linux-pm, Linus Torvalds, LKML On Sun, Dec 20, 2009 at 11:39:15PM -0800, Dmitry Torokhov wrote: > > On the nx6325 that appears to reduce the suspend time as much so the effect > > of async is not visible any more. On the Wind it decreases the total suspend > > time almost by half! > > > > Please push this patch to Linus. :-) > > > > Let's see if I manage to solicit some testers first. FWIW it seems to be > working on my boxes. > > But if this works then I am not sure we even want to bother with async > suspend of i8042 and serios. And serio already does resume > asynchronously through kseriod. I'm kind of wondering where this will break, but I don't remember why the RESET_BAT was put in exactly - the point of making sure the BIOS doesn't get confused by the advanced modes is correct, and is required at least when a keyboard is set to "Set 3", but RESET_BAT is a too heavy hammer anyway - we could just make sure to switch the kbd/mouse to 'default' modes instead of doing a full reset. -- Vojtech Pavlik Director SuSE Labs ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200053.45988.rjw@sisk.pl> 2009-12-20 0:09 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain> @ 2009-12-20 2:45 ` Dmitry Torokhov 2 siblings, 0 replies; 98+ messages in thread From: Dmitry Torokhov @ 2009-12-20 2:45 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sun, Dec 20, 2009 at 12:53:45AM +0100, Rafael J. Wysocki wrote: > On Sunday 20 December 2009, Linus Torvalds wrote: > > > > If it isn't, don't do that then. But we actually have no real > > reason to believe that it would be problematic, at least on a PC where the > > actual logic is on the SB (presumably behind the LPC controller). > > > > Why would it be? > > The embedded controller may depend on it. > No, not really depend but rather weird things may happen if you are accessing both. Witness regressions where touching the embedded controller makes us lose data from the touchpad, I think you are CCed on that bug. -- Dmitry ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) [not found] ` <200912200040.18944.rjw@sisk.pl> 2009-12-19 23:46 ` Linus Torvalds [not found] ` <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain> @ 2009-12-20 3:59 ` Alan Stern 2 siblings, 0 replies; 98+ messages in thread From: Alan Stern @ 2009-12-20 3:59 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Dmitry Torokhov, LKML, ACPI Devel Maling List, Linus Torvalds, pm list On Sun, 20 Dec 2009, Rafael J. Wysocki wrote: > So, seriously, do you think it makes sense to do asynchronous suspend at all? > I'm asking, because we're likely to get into troubles like this during suspend > for other kinds of devices too and without resolving them we won't get any > significant speedup from asynchronous suspend. > > That said, to me it's definitely worth doing asynchronous resume with the > "start asynch threads upfront" modification, as the results of the tests show > that quite clearly. I hope you agree. It's too early to come to this sort of conclusion (i.e., that suspend and resume react very differently to an asynchronous approach). Unless you have some definite _reason_ for thinking that resume will benefit more than suspend, you shouldn't try to generalize so much from tests on only two systems. Alan Stern ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) 2009-12-12 18:54 ` Linus Torvalds 2009-12-12 22:34 ` Rafael J. Wysocki @ 2009-12-13 13:08 ` Rafael J. Wysocki 2009-12-13 17:30 ` Alan Stern 2 siblings, 0 replies; 98+ messages in thread From: Rafael J. Wysocki @ 2009-12-13 13:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list On Saturday 12 December 2009, Linus Torvalds wrote: > > On Sat, 12 Dec 2009, Rafael J. Wysocki wrote: > > > > I'd like to put it into my tree in this form, if you don't mind. > > This version still has a major problem, which is not related to > completions vs rwsems, but simply to the fact that you wanted to do this > at the generic device layer level rather than do it at the actual > low-level suspend/resume level. > > Namely that there's no apparent sane way to say "don't wait for children". There is, if the parent would really do something that could disturb the children. This isn't always the case, but at least in a few important cases it is (think of a USB controller and USB devices behind it, for example). I thought we had this discussion already, but perhaps that was with someone else and in a slightly different context. The main reasons why I think it's useful to do this at the generic device layer level are that, if we do it this way: a. Drivers that don't want to be "asynchronous" don't need to care in any case. b. Drivers whose suspend and resume routines are guaranteed not to disturb anyone else can mark their devices as "async" and be done with it, no other modification of the code is needed (drivers that do nothing in their suspend and resume routines also fall into this category). Now, if it's done at the low-level suspend/resume level, a. will not be true any more in general. Say device A has parent B and the driver of A wants to suspend asynchronously.
It needs to split its suspend into synchronous and asynchronous parts and at one point start an async thread to run the latter. Now assume B has a real reason not to suspend before the suspend of A has finished. Then, the driver of B has to be modified so that it waits for A's async suspend to complete (some sort of synchronization between the two has to be added). So, even if B is "synchronous", its driver has to be modified to handle the asynchronous suspend of A. Similarly, b. will no longer be true if it's done at the low-level suspend/resume level, because now every driver that wants to be "asynchronous" will need to take care of running an async thread etc. Moreover, it will need to make sure that the device parent's driver doesn't need to be modified, because the parent's suspend may do something that will disturb the child's asynchronous suspend. Furthermore, if the parent's driver doesn't need to be modified, it will need to consider the parent of the parent, because that one may potentially disturb the asynchronous suspend of its grandchild and so on up to a device without a parent. That already is a pain for a driver writer, but the problem you're saying would be solved by doing this at the low-level suspend/resume level is still there in general! Namely, go back to the example with devices A and B and say B _really_ has to wait for A's suspend to complete. Then, since B is after A in dpm_list, the PM core will not start the suspend of any device after B until the suspend of B returns. Now, if the suspend of B waits for the suspend of A, then the PM core will effectively wait for the suspend of A to complete before suspending any other devices. Worse yet, if that happens, we can't do anything about it at the low-level suspend/resume level, although at the PM core level we can. Rafael ^ permalink raw reply [flat|nested] 98+ messages in thread
* Re: Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems)
  2009-12-12 18:54 ` Linus Torvalds
  2009-12-12 22:34 ` Rafael J. Wysocki
  2009-12-13 13:08 ` Rafael J. Wysocki
@ 2009-12-13 17:30 ` Alan Stern
  2 siblings, 0 replies; 98+ messages in thread
From: Alan Stern @ 2009-12-13 17:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: ACPI Devel Maling List, LKML, pm list

On Sat, 12 Dec 2009, Linus Torvalds wrote:

> This version still has a major problem, which is not related to
> completions vs rwsems, but simply to the fact that you wanted to do this
> at the generic device layer level rather than do it at the actual
> low-level suspend/resume level.
>
> Namely that there's no apparent sane way to say "don't wait for children".
>
> PCI bridges that don't suspend at all - or any other device that only
> suspends in the 'suspend_late()' thing, for that matter - don't have any
> reason what-so-ever to wait for children, since they aren't actually
> suspending in the first place. But you make them wait regardless, which
> then serializes things unnecessarily (for example, two unrelated USB
> controllers).

In reality this should never be a problem. Consider that ultimately we
want to achieve the following two goals:

	Implement a two-pass algorithm, so that synchronous devices
	can't cause spurious dependencies between two async devices.
	(This will fix the issue of an intermediate PCI bridge
	serializing two unrelated USB controllers.)

	Convert all lengthy suspend/resume operations to async.

Obviously we don't want to do this all at once. But until the goals are
achieved, there's no point worrying about devices being forced to wait
for their children or parents. And after the goals are achieved, it won't
matter.

Why not? Consider the devices which would be delayed. If they use
synchronous suspend/resume then they won't take much time, so delaying
them won't matter. Indeed, based on Arjan's preliminary measurements it's
fair to say that the total time taken by all the synchronous
suspends/resumes put together should be negligible. Even if all of them
were somehow delayed until all the async activities were complete, nobody
would notice or care. (And conversely, if all the async activities could
somehow be forced to wait until all the synchronous suspends/resumes were
done, nobody would notice or care.)

Okay, so consider a case where A comes before B in dpm_list and B is the
parent of C. Suppose B doesn't need to wait for C to suspend, but we
force it to wait anyhow. If A or C is synchronous then we're okay, by the
considerations above.

Suppose A is async. Then it wouldn't be delayed unless it was one of B's
ancestors, so suppose it is. Now we are potentially delaying A more than
necessary.

Or are we? Even though B might not need to wait for C to suspend, there's
an excellent chance that A _does_ need to wait for C. If we allow B to
suspend before C then there would be nothing to prevent A from suspending
too quickly. A's driver would need to wait explicitly for C -- which is
unreasonable since C isn't one of A's children. (Rafael made a similar
point.)

In short, allowing devices to suspend before their children would be
dangerous and probably would not save a significant amount of time.

Alan Stern

^ permalink raw reply [flat|nested] 98+ messages in thread
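[Editorial sketch, not part of the original message.] The effect of the two-pass idea on the "PCI bridge serializing two unrelated USB controllers" case can be illustrated with a toy timeline model. Assumptions are stated loudly: this is userspace C, not the PM core; `struct pm_dev`, `simulate_core()` and `example_two_controllers()` are hypothetical names; the costs (100 ms per controller, 1 ms per bridge) are invented. "Two-pass" is modelled here as "start every async device first, then walk the synchronous ones in list order", which is one reading of the goal in the message and not a confirmed design. In both modes a parent waits for its children before suspending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; children precede parents in the array. */
struct pm_dev {
    const char *name;
    int parent;  /* index of the parent device, or -1 */
    bool async;  /* suspend runs in its own async thread */
    int cost;    /* suspend duration in ms (invented numbers) */
    int done;    /* filled in: time at which suspend has finished */
};

static int max_int(int a, int b) { return a > b ? a : b; }

/* Time at which all children of device i have finished suspending. */
static int children_done(const struct pm_dev *d, size_t i)
{
    int t = 0;

    for (size_t c = 0; c < i; c++)
        if (d[c].parent == (int)i)
            t = max_int(t, d[c].done);
    return t;
}

/* Return the total suspend time for single-pass or two-pass order. */
static int simulate_core(struct pm_dev *d, size_t n, bool two_pass)
{
    int core = 0;  /* time on the single PM core thread */
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Suspend order: every parent comes after its children. */
        assert(d[i].parent == -1 || (size_t)d[i].parent > i);

        int ready = children_done(d, i);
        if (d[i].async) {
            /* Two-pass: all async devices are started at time 0.
             * Single pass: started when the walk reaches them. */
            int dispatched = two_pass ? 0 : core;
            d[i].done = max_int(dispatched, ready) + d[i].cost;
        } else {
            /* A synchronous device occupies the core thread. */
            core = max_int(core, ready) + d[i].cost;
            d[i].done = core;
        }
        total = max_int(total, d[i].done);
    }
    return total;
}

/* Two async USB controllers, each behind its own synchronous bridge. */
static int example_two_controllers(bool two_pass)
{
    struct pm_dev devs[] = {
        { "usb1",     1, true,  100, 0 },
        { "bridge1", -1, false,   1, 0 },
        { "usb2",     3, true,  100, 0 },
        { "bridge2", -1, false,   1, 0 },
    };
    return simulate_core(devs, sizeof(devs) / sizeof(devs[0]), two_pass);
}
```

In this model `example_two_controllers(false)` gives 202 and `example_two_controllers(true)` gives 102: with a single pass, bridge1 stalls the walk until usb1 is done, so usb2 starts late; with the async devices started up front, the two controllers overlap and the synchronous bridges add only their own small cost.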
end of thread, other threads:[~2009-12-21 11:20 UTC | newest]
Thread overview: 98+ messages
[not found] <Pine.LNX.4.44L0.0912111938310.32493-100000@netrider.rowland.org>
2009-12-12 17:35 ` Async suspend-resume patch w/ completions (was: Re: Async suspend-resume patch w/ rwsems) Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912201434340.27137-100000@netrider.rowland.org>
2009-12-20 19:51 ` Rafael J. Wysocki
[not found] <200912201910.26895.rjw@sisk.pl>
2009-12-20 19:38 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912201210300.24162-100000@netrider.rowland.org>
2009-12-20 18:10 ` Rafael J. Wysocki
[not found] <200912201352.07689.rjw@sisk.pl>
2009-12-20 17:12 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912192232360.6618-100000@netrider.rowland.org>
2009-12-20 12:55 ` Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912192253200.6618-100000@netrider.rowland.org>
2009-12-20 12:52 ` Rafael J. Wysocki
[not found] <200912192241.03991.rjw@sisk.pl>
2009-12-20 3:48 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912181205290.2987-100000@iolanthe.rowland.org>
2009-12-19 21:41 ` Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912171444040.2645-100000@iolanthe.rowland.org>
2009-12-17 20:36 ` Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912161753540.2643-100000@iolanthe.rowland.org>
2009-12-16 23:18 ` Rafael J. Wysocki
[not found] ` <200912170018.05175.rjw@sisk.pl>
2009-12-17 1:30 ` Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912161018100.2909-100000@iolanthe.rowland.org>
2009-12-16 19:26 ` Rafael J. Wysocki
[not found] <alpine.LFD.2.00.0912151337350.14385@localhost.localdomain>
2009-12-15 22:27 ` Alan Stern
[not found] <200912152226.22578.rjw@sisk.pl>
2009-12-15 22:01 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912151444010.2643-100000@iolanthe.rowland.org>
2009-12-15 21:26 ` Rafael J. Wysocki
2009-12-15 21:54 ` Linus Torvalds
[not found] <Pine.LNX.4.44L0.0912151047410.3566-100000@iolanthe.rowland.org>
2009-12-15 16:28 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912150803250.14385@localhost.localdomain>
2009-12-15 18:57 ` Linus Torvalds
2009-12-15 20:26 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912131221210.1111-100000@netrider.rowland.org>
2009-12-13 19:02 ` Alan Stern
[not found] <200912112317.31668.rjw@sisk.pl>
2009-12-12 0:38 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912102155390.12136-100000@netrider.rowland.org>
2009-12-11 22:17 ` Rafael J. Wysocki
[not found] <Pine.LNX.4.44L0.0912101321020.2680-100000@iolanthe.rowland.org>
2009-12-10 23:51 ` Linus Torvalds
[not found] <Pine.LNX.4.44L0.0912101653120.2680-100000@iolanthe.rowland.org>
2009-12-10 23:45 ` Rafael J. Wysocki
[not found] <200912102214.40310.rjw@sisk.pl>
2009-12-10 22:17 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912101010090.2825-100000@iolanthe.rowland.org>
2009-12-10 15:45 ` Linus Torvalds
2009-12-10 21:14 ` Rafael J. Wysocki
[not found] <alpine.LFD.2.00.0912100739260.3560@localhost.localdomain>
2009-12-10 18:37 ` Alan Stern
[not found] <Pine.LNX.4.44L0.0912091729530.2672-100000@iolanthe.rowland.org>
2009-12-09 23:18 ` Rafael J. Wysocki
[not found] ` <200912100018.19723.rjw@sisk.pl>
2009-12-10 2:51 ` Linus Torvalds
2009-12-10 15:31 ` Alan Stern
[not found] ` <alpine.LFD.2.00.0912091835280.3560@localhost.localdomain>
2009-12-10 19:40 ` Rafael J. Wysocki
[not found] ` <200912102040.11063.rjw@sisk.pl>
2009-12-10 23:30 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912101507550.3560@localhost.localdomain>
2009-12-11 1:02 ` Rafael J. Wysocki
[not found] ` <200912110202.28536.rjw@sisk.pl>
2009-12-11 1:25 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912101713440.3560@localhost.localdomain>
2009-12-11 3:42 ` Alan Stern
2009-12-11 22:11 ` Rafael J. Wysocki
[not found] ` <200912112311.08548.rjw@sisk.pl>
2009-12-11 22:31 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912111415160.3922@localhost.localdomain>
2009-12-11 23:48 ` Rafael J. Wysocki
[not found] ` <200912120048.46180.rjw@sisk.pl>
2009-12-11 23:53 ` Linus Torvalds
2009-12-12 0:43 ` Alan Stern
[not found] ` <alpine.LFD.2.00.0912111552330.3526@localhost.localdomain>
2009-12-12 17:48 ` Rafael J. Wysocki
2009-12-12 18:54 ` Linus Torvalds
2009-12-12 22:34 ` Rafael J. Wysocki
2009-12-12 22:40 ` Rafael J. Wysocki
2009-12-14 18:21 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912141015240.26135@localhost.localdomain>
2009-12-14 22:11 ` Rafael J. Wysocki
[not found] ` <200912142311.31658.rjw@sisk.pl>
2009-12-14 22:41 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912141416040.26135@localhost.localdomain>
2009-12-14 22:43 ` Linus Torvalds
2009-12-14 23:18 ` Rafael J. Wysocki
[not found] ` <200912150018.11837.rjw@sisk.pl>
2009-12-15 0:10 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912141609020.14385@localhost.localdomain>
2009-12-15 0:11 ` Linus Torvalds
2009-12-15 11:03 ` Rafael J. Wysocki
[not found] ` <alpine.LFD.2.00.0912141610460.14385@localhost.localdomain>
2009-12-15 11:14 ` Rafael J. Wysocki
[not found] ` <200912151214.10980.rjw@sisk.pl>
2009-12-15 15:31 ` Linus Torvalds
[not found] ` <200912151203.22916.rjw@sisk.pl>
2009-12-15 15:26 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912150722310.14385@localhost.localdomain>
2009-12-15 15:55 ` Alan Stern
2009-12-16 2:11 ` Rafael J. Wysocki
[not found] ` <200912160311.05915.rjw@sisk.pl>
2009-12-16 6:40 ` Dmitry Torokhov
2009-12-16 15:22 ` Alan Stern
2009-12-16 15:47 ` Linus Torvalds
2009-12-16 19:27 ` Rafael J. Wysocki
[not found] ` <200912162027.16574.rjw@sisk.pl>
2009-12-16 20:59 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912161255080.3556@localhost.localdomain>
2009-12-16 21:57 ` Rafael J. Wysocki
[not found] ` <200912162257.00771.rjw@sisk.pl>
2009-12-16 22:11 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912161410120.3556@localhost.localdomain>
2009-12-16 22:33 ` Rafael J. Wysocki
2009-12-16 23:04 ` Alan Stern
2009-12-17 1:49 ` Rafael J. Wysocki
2009-12-17 20:06 ` Alan Stern
2009-12-18 1:51 ` Rafael J. Wysocki
[not found] ` <200912180251.22655.rjw@sisk.pl>
2009-12-18 17:26 ` Alan Stern
2009-12-18 23:42 ` Rafael J. Wysocki
[not found] ` <20091216064025.GB2699@core.coreip.homeip.net>
2009-12-18 22:43 ` Rafael J. Wysocki
2009-12-19 19:59 ` Dmitry Torokhov
[not found] ` <20091219195935.GB4073@core.coreip.homeip.net>
2009-12-19 21:33 ` Rafael J. Wysocki
[not found] ` <200912192233.44575.rjw@sisk.pl>
2009-12-19 22:29 ` Rafael J. Wysocki
[not found] ` <200912192329.03251.rjw@sisk.pl>
2009-12-19 22:43 ` Dmitry Torokhov
2009-12-19 22:47 ` Dmitry Torokhov
[not found] ` <A37A0A6F-3662-40C9-BE1F-B9F6A38CD80B@gmail.com>
2009-12-19 23:10 ` Rafael J. Wysocki
[not found] ` <200912200010.19899.rjw@sisk.pl>
2009-12-19 23:22 ` Dmitry Torokhov
2009-12-19 23:23 ` Linus Torvalds
[not found] ` <43A402BB-6AB3-4127-A441-D53EDE09F22E@gmail.com>
2009-12-19 23:33 ` Rafael J. Wysocki
[not found] ` <alpine.LFD.2.00.0912191521180.3712@localhost.localdomain>
2009-12-19 23:40 ` Rafael J. Wysocki
[not found] ` <200912200040.18944.rjw@sisk.pl>
2009-12-19 23:46 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912191542570.3712@localhost.localdomain>
2009-12-19 23:47 ` Linus Torvalds
2009-12-19 23:53 ` Rafael J. Wysocki
[not found] ` <alpine.LFD.2.00.0912191546250.3712@localhost.localdomain>
2009-12-19 23:54 ` Rafael J. Wysocki
[not found] ` <200912200053.45988.rjw@sisk.pl>
2009-12-20 0:09 ` Linus Torvalds
[not found] ` <alpine.LFD.2.00.0912191557320.3712@localhost.localdomain>
2009-12-20 0:35 ` Rafael J. Wysocki
2009-12-20 2:41 ` Dmitry Torokhov
[not found] ` <20091220024142.GC4073@core.coreip.homeip.net>
2009-12-20 19:25 ` Rafael J. Wysocki
[not found] ` <200912202025.25618.rjw@sisk.pl>
2009-12-21 7:39 ` Async suspend-resume patch w/ completions (was: Re: Async suspend-resume " Dmitry Torokhov
[not found] ` <20091221073915.GC3234@core.coreip.homeip.net>
2009-12-21 11:20 ` Vojtech Pavlik
2009-12-20 2:45 ` Async suspend-resume patch w/ completions (was: Re: Async suspend-resume " Dmitry Torokhov
2009-12-20 3:59 ` Alan Stern
2009-12-13 13:08 ` Rafael J. Wysocki
2009-12-13 17:30 ` Alan Stern