Re: A desktop environment[1] kernel wishlist

Linux Power Management development
 help / color / mirror / Atom feed

* Re: A desktop environment[1] kernel wishlist
       [not found]   ` <CAJFHJrp8R+Na73W2Jx6BJ+JnHTGZfd61d+EracgehS+ZQoZOcg@mail.gmail.com>
@ 2015-05-04 22:12     ` Rafael J. Wysocki
  2015-05-04 23:30       ` Chirantan Ekbote
       [not found]     ` <CAAObsKDYV=Hyz0XcPkyD6=hu4WKk7PHGdSMOCZxbh_vMKHS32Q@mail.gmail.com>
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-04 22:12 UTC (permalink / raw)
  To: Chirantan Ekbote, John Stultz
  Cc: Olof Johansson, Bastien Nocera, Linux Kernel Mailing List, snanda,
	Tomeu Vizoso, Linux PM list

On Thursday, April 30, 2015 11:54:51 AM Chirantan Ekbote wrote:
> On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
> > Hi,
> >
> > On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
> >> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
> >>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:

Thanks for CCin me, John!

Let's also CC linux-pm as the people on that list may be generally interested
in this thread.

> >>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
> >>>> wrote:
> >>>> > Hey,
> >>>> >
> >>>> > GNOME has had discussions with kernel developers in the past, and,
> >>>> > fortunately, in some cases we were able to make headway.
> >>>> >
> >>>> > There are however a number of items that we still don't have
> >>>> > solutions
> >>>> > for, items that kernel developers might not realise we'd like to
> >>>> > rely
> >>>> > on, or don't know that we'd make use of if merged.
> >>>> >
> >>>> > I've posted this list at:
> >>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
> >>>> >
> >>>> > Let me know on-list or off-list if you have any comments about
> >>>> > those, so
> >>>> > I can update the list.
> >>>>
> >>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
> >>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
> >>>> documentation'
> >>>>
> >>>> Can you expand more on the rational for the need here? Is this for UI
> >>>> for power debugging, or something else?
> >>>
> >>> This is pretty much what I had in mind:
> >>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
> >>>
> >>> I guess I didn't make myself understood.
> >>
> >> My, admittedly quick skim, of that design document seems to suggest
> >> that lucid sleep would be a new kernel state. That would keep the
> >> kernel in charge of determining the state transitions (ie:
> >> SUSPEND-(alarm)->LUCID-(wakelock
> >> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
> >> userspace would be able to query the current state. This avoids some
> >> of the races I was concerned with trying to detect which irq woke us
> >> from suspend from userspace.
> >>
> 
> Tomeu has been working on making things so that we don't need a new
> kernel state.

Which is good, because adding a new kernel state like that to the mainline is
out of the question as far as I'm concerned.

>  Adding him on cc so he can correct me if I say
> something wrong.  The current idea is to have userspace runtime
> suspend any unneeded devices before starting a suspend.  This way the
> kernel will leave them alone at resume.  This behavior already exists
> in the mainline kernel.  Getting the wakeup reason can be accomplished
> by having the kernel emit a uevent with the wakeup reason.  This is
> the only change that would be necessary.

Well, that needs to be thought through carefully in my view, or it will
always be racy.

You cannot really only rely on wakeup events that have already happened,
because something requiring you to bring up the full UI may happen at
any time.  In particular, it may happen when you're about to suspend again.

For this reason, it looks like you need something along the lines of
the wakeup_count interface, but acting on subsets of devices.

It actually shouldn't be too difficult to split the existing wakeup
counter into a number of subcounters each tracking a subset of wakeup
sources and one of them might be used as the "full UI wakeup" condition
trigger in principle.

> >> That said, the Power Manager section in that document sounds a little
> >> racy as it seems to rely on asking userspace if suspend is ok, rather
> >> then using userspace wakelocks, so I'm not sure how well baked this
> >> doc is.

You cannot be "a little" racy.  Either you are racy, or you aren't.  If you
are, it's only a matter of time until someone hits the race.  How often
that will happen depends on how hard the race is to trigger and how many
users of the feature there are.  Given enough users, quite a number of them
may be unhappy.

> I'm not sure I understand what you are saying here.  If you're saying
> that the kernel is asking userspace if suspend is ok, then I can say
> that that's definitely not the case.  Currently from the kernel's
> perspective a lucid sleep resume isn't really different from a regular
> resume.  We have a hack in each driver that we care about that
> basically boils down to an if statement that skips re-initialization
> if we are entering lucid sleep.  If userspace tries to access that
> device in lucid sleep, it just gets an error.  This has actually
> caused us some headache (see the GPU process section of the doc),
> which is why we'd like to switch to using the runtime suspend approach
> I mentioned above.

That's a good plan, because that's the only way you can satisfy all of the
dependencies that may be involved.

> If instead you're saying that the power manager needs to ask the rest
> of userspace whether suspend is ok, you can think of the current
> mechanism as a sort of timed wake lock.  Daemons that care about lucid
> sleep register with the power manager when they start up.  The power
> manager then waits for these daemons to report readiness while in
> lucid sleep before starting another suspend.  So each daemon
> effectively acquires a wake lock when the system enters lucid sleep
> and releases the wake lock when it reports readiness to the power
> manager or the timeout occurs.

I think what John meant was exactly what I said above: You need a race
free mechanism to verify whether or not it is OK to suspend again (ie.
whether or not there are any unhandled events that would have woken you up
had they happened while suspended) when you're about to.  You *also* need
to be able to determine (in a race free way) whether or not any of them
require you to bring up the UI.

It looks like your use case is actually more complex than the Android's one. :-)


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
       [not found]     ` <CAAObsKDYV=Hyz0XcPkyD6=hu4WKk7PHGdSMOCZxbh_vMKHS32Q@mail.gmail.com>
@ 2015-05-04 22:19       ` Rafael J. Wysocki
  2015-05-05  6:05         ` Tomeu Vizoso
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-04 22:19 UTC (permalink / raw)
  To: Tomeu Vizoso
  Cc: Chirantan Ekbote, Olof Johansson, John Stultz, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Linux PM list

On Friday, May 01, 2015 11:02:19 AM Tomeu Vizoso wrote:
> On 30 April 2015 at 20:54, Chirantan Ekbote <chirantan@chromium.org> wrote:
> > On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
> >> Hi,
> >>
> >> On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
> >>> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
> >>>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:
> >>>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
> >>>>> wrote:
> >>>>> > Hey,
> >>>>> >
> >>>>> > GNOME has had discussions with kernel developers in the past, and,
> >>>>> > fortunately, in some cases we were able to make headway.
> >>>>> >
> >>>>> > There are however a number of items that we still don't have
> >>>>> > solutions
> >>>>> > for, items that kernel developers might not realise we'd like to
> >>>>> > rely
> >>>>> > on, or don't know that we'd make use of if merged.
> >>>>> >
> >>>>> > I've posted this list at:
> >>>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
> >>>>> >
> >>>>> > Let me know on-list or off-list if you have any comments about
> >>>>> > those, so
> >>>>> > I can update the list.
> >>>>>
> >>>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
> >>>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
> >>>>> documentation'
> >>>>>
> >>>>> Can you expand more on the rational for the need here? Is this for UI
> >>>>> for power debugging, or something else?
> >>>>
> >>>> This is pretty much what I had in mind:
> >>>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
> >>>>
> >>>> I guess I didn't make myself understood.
> >>>
> >>> My, admittedly quick skim, of that design document seems to suggest
> >>> that lucid sleep would be a new kernel state. That would keep the
> >>> kernel in charge of determining the state transitions (ie:
> >>> SUSPEND-(alarm)->LUCID-(wakelock
> >>> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
> >>> userspace would be able to query the current state. This avoids some
> >>> of the races I was concerned with trying to detect which irq woke us
> >>> from suspend from userspace.
> >>>
> >
> > Tomeu has been working on making things so that we don't need a new
> > kernel state.  Adding him on cc so he can correct me if I say
> > something wrong.  The current idea is to have userspace runtime
> > suspend any unneeded devices before starting a suspend.  This way the
> > kernel will leave them alone at resume.  This behavior already exists
> > in the mainline kernel.
> 
> This is right, I have one series on flight about removing any
> non-runtime device resumes from my test platform (a nyan-big) and
> another about entering suspend-to-idle with ticks frozen (also on
> Tegra124/nyan-big).
> 
> > Getting the wakeup reason can be accomplished
> > by having the kernel emit a uevent with the wakeup reason.  This is
> > the only change that would be necessary.
> 
> My current opinion is that for ChromeOS there's no need for the kernel
> to communicate a wakeup reason to userspace. From what I know it
> should be enough for powerd/upower/whatever to constantly monitor all
> relevant input devices and that should tell if the wakeup was caused
> by user activity or not.

Dream on.

What about input events that leave no trace in the input buffers?

What about events occuring between the time you've decided to suspend and
you actually suspended?

> Once userspace is thawed, the power management daemon would read any
> pending events from the input devices and would decide whether to stay
> on dark resume or whether to unfreeze renderer daemons and power up
> the display, unpause any audio streams, etc.

And what about if it decideds to suspend again and a wakeup event
requiring you to bring up the UI occurs exactly at that time?

You really have the same race as Android does here, but in addition to that
you want to deterimne how far to go with the resume on the basis of what
wakeup sources are involved.

I'm leaving the below for the benefit of linux-pm readers.

> For the GNOME folks, this is how it's done in ChromeOS:
> 
> https://chromium.googlesource.com/chromiumos/platform2/+/master/power_manager/powerd/system/input_watcher.cc
> 
> Of course, this depends on events not getting lost. In the ChromeOS
> case, the firmware is under the control of the OS developers, so any
> bugs can be fixed.
> 
> For GNOME and other desktop environments who aim to run on systems
> whose firmware cannot be fixed, I think it could make sense for the
> kernel to synthesize such events if it happens to have enough
> information to do so.
> 
> Besides this issue, I think that what "only" remains to be done in the
> kernel is to speed up resumes so no hacks are needed downstream. The
> infrastructure for this already exists in the form of
> power.direct_complete and suspend-to-idle and the work that remains is
> mostly platform-specific.
> 
> This is to say that GNOME could start implementing lucid sleep right
> now, though the user experience might not be ideal yet because resumes
> might take longer than desired. But it might not matter to GNOME as
> much as it does to ChromeOS because they aren't in control of the hw
> anyway?
> 
> Regards,
> 
> Tomeu
> 
> >>> That said, the Power Manager section in that document sounds a little
> >>> racy as it seems to rely on asking userspace if suspend is ok, rather
> >>> then using userspace wakelocks, so I'm not sure how well baked this
> >>> doc is.
> >>>
> >
> > I'm not sure I understand what you are saying here.  If you're saying
> > that the kernel is asking userspace if suspend is ok, then I can say
> > that that's definitely not the case.  Currently from the kernel's
> > perspective a lucid sleep resume isn't really different from a regular
> > resume.  We have a hack in each driver that we care about that
> > basically boils down to an if statement that skips re-initialization
> > if we are entering lucid sleep.  If userspace tries to access that
> > device in lucid sleep, it just gets an error.  This has actually
> > caused us some headache (see the GPU process section of the doc),
> > which is why we'd like to switch to using the runtime suspend approach
> > I mentioned above.
> >
> > If instead you're saying that the power manager needs to ask the rest
> > of userspace whether suspend is ok, you can think of the current
> > mechanism as a sort of timed wake lock.  Daemons that care about lucid
> > sleep register with the power manager when they start up.  The power
> > manager then waits for these daemons to report readiness while in
> > lucid sleep before starting another suspend.  So each daemon
> > effectively acquires a wake lock when the system enters lucid sleep
> > and releases the wake lock when it reports readiness to the power
> > manager or the timeout occurs.
> >
> >>> Olof: Can you comment on who's working on that design doc?   Also the
> >>> discussion around using freezing cgroups separately to distinguish
> >>> between lucid and awake is interesting, but I wonder if we need to
> >>> make wakeup_sources/wakelocks cgroup aware, or has that already been
> >>> done?
> >>
> >
> > Currently cgroup process management is handled by the Chrome browser
> > since the only processes we freeze are Chrome renderers.  A renderer
> > is basically a process that hosts the content for a single tab,
> > extension, plugin, etc.  Freezing occurs when the system is about to
> > enter suspend from the fully awake state and thawing occurs during the
> > reverse transition.  We've made no changes to the cgroup code and have
> > been using it as is.
> >
> >>
> >> Sameer and Chirantan have both been deeply involved in that project,
> >> adding them on cc here.
> >>
> >>
> >> -Olof
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-04 22:12     ` A desktop environment[1] kernel wishlist Rafael J. Wysocki
@ 2015-05-04 23:30       ` Chirantan Ekbote
  2015-05-05 10:46         ` Bastien Nocera
                           ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-04 23:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Mon, May 4, 2015 at 3:12 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, April 30, 2015 11:54:51 AM Chirantan Ekbote wrote:
>> On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
>> > Hi,
>> >
>> > On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
>> >> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
>> >>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:
>
> Thanks for CCin me, John!
>
> Let's also CC linux-pm as the people on that list may be generally interested
> in this thread.
>
>> >>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
>> >>>> wrote:
>> >>>> > Hey,
>> >>>> >
>> >>>> > GNOME has had discussions with kernel developers in the past, and,
>> >>>> > fortunately, in some cases we were able to make headway.
>> >>>> >
>> >>>> > There are however a number of items that we still don't have
>> >>>> > solutions
>> >>>> > for, items that kernel developers might not realise we'd like to
>> >>>> > rely
>> >>>> > on, or don't know that we'd make use of if merged.
>> >>>> >
>> >>>> > I've posted this list at:
>> >>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
>> >>>> >
>> >>>> > Let me know on-list or off-list if you have any comments about
>> >>>> > those, so
>> >>>> > I can update the list.
>> >>>>
>> >>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
>> >>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
>> >>>> documentation'
>> >>>>
>> >>>> Can you expand more on the rational for the need here? Is this for UI
>> >>>> for power debugging, or something else?
>> >>>
>> >>> This is pretty much what I had in mind:
>> >>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
>> >>>
>> >>> I guess I didn't make myself understood.
>> >>
>> >> My, admittedly quick skim, of that design document seems to suggest
>> >> that lucid sleep would be a new kernel state. That would keep the
>> >> kernel in charge of determining the state transitions (ie:
>> >> SUSPEND-(alarm)->LUCID-(wakelock
>> >> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
>> >> userspace would be able to query the current state. This avoids some
>> >> of the races I was concerned with trying to detect which irq woke us
>> >> from suspend from userspace.
>> >>
>>
>> Tomeu has been working on making things so that we don't need a new
>> kernel state.
>
> Which is good, because adding a new kernel state like that to the mainline is
> out of the question as far as I'm concerned.
>
>>  Adding him on cc so he can correct me if I say
>> something wrong.  The current idea is to have userspace runtime
>> suspend any unneeded devices before starting a suspend.  This way the
>> kernel will leave them alone at resume.  This behavior already exists
>> in the mainline kernel.  Getting the wakeup reason can be accomplished
>> by having the kernel emit a uevent with the wakeup reason.  This is
>> the only change that would be necessary.
>
> Well, that needs to be thought through carefully in my view, or it will
> always be racy.
>
> You cannot really only rely on wakeup events that have already happened,
> because something requiring you to bring up the full UI may happen at
> any time.  In particular, it may happen when you're about to suspend again.
>
> For this reason, it looks like you need something along the lines of
> the wakeup_count interface, but acting on subsets of devices.
>
> It actually shouldn't be too difficult to split the existing wakeup
> counter into a number of subcounters each tracking a subset of wakeup
> sources and one of them might be used as the "full UI wakeup" condition
> trigger in principle.
>

In the interest of brevity, I didn't go into the design of suspend /
resume in userspace in my last email but it seems like there's no way
around it.

Ignoring lucid sleep for a moment, here is how a regular suspend works
in the power manager.  The power manager sees a suspend request either
because the user has been idle for X amount of time (usually 15
minutes) or the user explicitly requested it by closing the lid.  The
power manager reads the value of /sys/power/wakeup_count and then
announces an imminent suspend to the rest of the system via DBus.
Interested applications (like the network manager and Chrome) will
have registered with the power manager to be informed about this when
they started up.  For example, this is when Chrome would freeze its
renderer processes.  The power manager will now wait for them to
report back their readiness to suspend.  Once all applications have
reported ready (or the maximum timeout occurs), the power manager
performs some final preparations (like setting the resume brightness
for the display).  The last thing the power manager does, right before
writing "mem" to /sys/power/state, is write the wakeup_count that it
read earlier to /sys/power/wakeup_count.  If the write fails, the
power manager considers the suspend attempt failed, reads the new
wakeup_count, and starts a timer (usually 10 seconds) to retry the
suspend.  The same thing happens if the write to /sys/power/state
fails.

It may be the case that the event that incremented the count happened
because a user was trying to cancel the suspend.  The user could have
pressed some keys on the keyboard, touched the trackpad, opened the
lid, pressed the power button, etc, etc.  For the keyboard and
trackpad these events make their way up to Chrome, which sends a user
activity message to the power manager.  This is a message that Chrome
sends to the power manager even during regular activity, up to five
times a second, to prevent the idle timeout from occuring.  When the
power manager sees this user activity message while waiting to retry
the suspend, it cancels the retry and sends out a DBus signal
informing the system that the suspend is now complete.  For events
like opening the lid or pressing the power button, the power manager
monitors those events in /dev/input and simulates a user activity
message from chrome if any of them fire.

Ok so now I can talk about how lucid sleep fits into all of this.  The
power manager only considers a suspend attempt successful if the write
to /sys/power/state was successful.  If the suspend was successful,
the power manager then reads another sysfs file
(/sys/power/wakeup_type in our kernel) to determine if the wakeup
event was user-triggered.  If the event was user-triggered it sends
out a DBus signal announcing the end of the suspend, Chrome thaws its
renderer processes, the full UI comes back up, and the user can start
working.  If the event was _not_ user-triggred (if it was the RTC or
NIC), the power manager sends out a different DBus signal announcing
that the system is in lucid sleep and will re-suspend soon.  It will
then wait for all registered applications to report readiness to
suspend or for the max timeout to expire.

If it so happens that a user interacts with the system while it is in
this state, the power manager detects it via the user activity message
that Chrome sends.  This could be real (keyboard, trackpad) or
simulated (lid, power button).  Either way, the power manager responds
the same way: it announces the end of the suspend, Chrome thaws its
renderers, the full UI comes back up, and the user starts working.  If
there is no user activity and all applications report ready, the power
manager gets ready to suspend the system again.  Since the NIC is now
a wakeup source, the power manager doesn't read the wakeup_count until
after all applications have reported ready because accessing the
network could increment the wakeup_count and cause false positives.

If either the write to /sys/power/wakeup_count or /sys/power/state
fails from lucid sleep, the power manager re-announces that the system
is in lucid sleep and will re-suspend soon.  It's actually a little
smart about this: it will only re-announce lucid sleep if there was a
wakeup_count mismatch or if the write to /sys/power/state returned
EBUSY.  Other failures only trigger a simple retry and no DBus signal.
We do this because these wakeup events may legitimately trigger lucid
sleep.  For example, more packets may arrive from the network or the
RTC may go off and applications don't perform work until they hear
from the power manager that the system is in lucid sleep.  At this
point the power manager is back to waiting for applications to report
ready (or for the retry timer to fire).  This process may repeat
multiple times if we keep getting wakeup events right as the system
tries to re-suspend.

So that was a little long-winded but hopefully I've addressed all your
concerns about potential race conditions in this code.  I simplified a
few bits because would just complicate the discussion but for the most
part this is how the feature works now.  Having the kernel emit a
uevent with the wakeup event type would take the place of the power
manager reading from /sys/power/wakeup_type in this system but
wouldn't really affect anything else.

>> >> That said, the Power Manager section in that document sounds a little
>> >> racy as it seems to rely on asking userspace if suspend is ok, rather
>> >> then using userspace wakelocks, so I'm not sure how well baked this
>> >> doc is.
>
> You cannot be "a little" racy.  Either you are racy, or you aren't.  If you
> are, it's only a matter of time until someone hits the race.  How often
> that will happen depends on how hard the race is to trigger and how many
> users of the feature there are.  Given enough users, quite a number of them
> may be unhappy.
>
>> I'm not sure I understand what you are saying here.  If you're saying
>> that the kernel is asking userspace if suspend is ok, then I can say
>> that that's definitely not the case.  Currently from the kernel's
>> perspective a lucid sleep resume isn't really different from a regular
>> resume.  We have a hack in each driver that we care about that
>> basically boils down to an if statement that skips re-initialization
>> if we are entering lucid sleep.  If userspace tries to access that
>> device in lucid sleep, it just gets an error.  This has actually
>> caused us some headache (see the GPU process section of the doc),
>> which is why we'd like to switch to using the runtime suspend approach
>> I mentioned above.
>
> That's a good plan, because that's the only way you can satisfy all of the
> dependencies that may be involved.
>
>> If instead you're saying that the power manager needs to ask the rest
>> of userspace whether suspend is ok, you can think of the current
>> mechanism as a sort of timed wake lock.  Daemons that care about lucid
>> sleep register with the power manager when they start up.  The power
>> manager then waits for these daemons to report readiness while in
>> lucid sleep before starting another suspend.  So each daemon
>> effectively acquires a wake lock when the system enters lucid sleep
>> and releases the wake lock when it reports readiness to the power
>> manager or the timeout occurs.
>
> I think what John meant was exactly what I said above: You need a race
> free mechanism to verify whether or not it is OK to suspend again (ie.
> whether or not there are any unhandled events that would have woken you up
> had they happened while suspended) when you're about to.  You *also* need
> to be able to determine (in a race free way) whether or not any of them
> require you to bring up the UI.
>
> It looks like your use case is actually more complex than the Android's one. :-)
>
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-04 22:19       ` Rafael J. Wysocki
@ 2015-05-05  6:05         ` Tomeu Vizoso
  2015-05-05 12:31           ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Tomeu Vizoso @ 2015-05-05  6:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chirantan Ekbote, Olof Johansson, John Stultz, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Linux PM list

On 5 May 2015 at 00:19, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Friday, May 01, 2015 11:02:19 AM Tomeu Vizoso wrote:
>> On 30 April 2015 at 20:54, Chirantan Ekbote <chirantan@chromium.org> wrote:
>> > On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
>> >> Hi,
>> >>
>> >> On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
>> >>> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
>> >>>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:
>> >>>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
>> >>>>> wrote:
>> >>>>> > Hey,
>> >>>>> >
>> >>>>> > GNOME has had discussions with kernel developers in the past, and,
>> >>>>> > fortunately, in some cases we were able to make headway.
>> >>>>> >
>> >>>>> > There are however a number of items that we still don't have
>> >>>>> > solutions
>> >>>>> > for, items that kernel developers might not realise we'd like to
>> >>>>> > rely
>> >>>>> > on, or don't know that we'd make use of if merged.
>> >>>>> >
>> >>>>> > I've posted this list at:
>> >>>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
>> >>>>> >
>> >>>>> > Let me know on-list or off-list if you have any comments about
>> >>>>> > those, so
>> >>>>> > I can update the list.
>> >>>>>
>> >>>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
>> >>>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
>> >>>>> documentation'
>> >>>>>
>> >>>>> Can you expand more on the rational for the need here? Is this for UI
>> >>>>> for power debugging, or something else?
>> >>>>
>> >>>> This is pretty much what I had in mind:
>> >>>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
>> >>>>
>> >>>> I guess I didn't make myself understood.
>> >>>
>> >>> My, admittedly quick skim, of that design document seems to suggest
>> >>> that lucid sleep would be a new kernel state. That would keep the
>> >>> kernel in charge of determining the state transitions (ie:
>> >>> SUSPEND-(alarm)->LUCID-(wakelock
>> >>> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
>> >>> userspace would be able to query the current state. This avoids some
>> >>> of the races I was concerned with trying to detect which irq woke us
>> >>> from suspend from userspace.
>> >>>
>> >
>> > Tomeu has been working on making things so that we don't need a new
>> > kernel state.  Adding him on cc so he can correct me if I say
>> > something wrong.  The current idea is to have userspace runtime
>> > suspend any unneeded devices before starting a suspend.  This way the
>> > kernel will leave them alone at resume.  This behavior already exists
>> > in the mainline kernel.
>>
>> This is right, I have one series on flight about removing any
>> non-runtime device resumes from my test platform (a nyan-big) and
>> another about entering suspend-to-idle with ticks frozen (also on
>> Tegra124/nyan-big).
>>
>> > Getting the wakeup reason can be accomplished
>> > by having the kernel emit a uevent with the wakeup reason.  This is
>> > the only change that would be necessary.
>>
>> My current opinion is that for ChromeOS there's no need for the kernel
>> to communicate a wakeup reason to userspace. From what I know it
>> should be enough for powerd/upower/whatever to constantly monitor all
>> relevant input devices and that should tell if the wakeup was caused
>> by user activity or not.
>
> Dream on.
>
> What about input events that leave no trace in the input buffers?

You mean that it can happen that, barring bugs on hw or fw, the kernel
knows what device "caused" the wakeup but doesn't have enough
information to report the full event (key pressed, touchpad motion,
etc) to userspace?

> What about events occuring between the time you've decided to suspend and
> you actually suspended?

What about them? How would wakeup_type help us there?

Regards,

Tomeu

>> Once userspace is thawed, the power management daemon would read any
>> pending events from the input devices and would decide whether to stay
>> on dark resume or whether to unfreeze renderer daemons and power up
>> the display, unpause any audio streams, etc.
>
> And what about if it decideds to suspend again and a wakeup event
> requiring you to bring up the UI occurs exactly at that time?
>
> You really have the same race as Android does here, but in addition to that
> you want to deterimne how far to go with the resume on the basis of what
> wakeup sources are involved.
>
> I'm leaving the below for the benefit of linux-pm readers.
>
>> For the GNOME folks, this is how it's done in ChromeOS:
>>
>> https://chromium.googlesource.com/chromiumos/platform2/+/master/power_manager/powerd/system/input_watcher.cc
>>
>> Of course, this depends on events not getting lost. In the ChromeOS
>> case, the firmware is under the control of the OS developers, so any
>> bugs can be fixed.
>>
>> For GNOME and other desktop environments who aim to run on systems
>> whose firmware cannot be fixed, I think it could make sense for the
>> kernel to synthesize such events if it happens to have enough
>> information to do so.
>>
>> Besides this issue, I think that what "only" remains to be done in the
>> kernel is to speed up resumes so no hacks are needed downstream. The
>> infrastructure for this already exists in the form of
>> power.direct_complete and suspend-to-idle and the work that remains is
>> mostly platform-specific.
>>
>> This is to say that GNOME could start implementing lucid sleep right
>> now, though the user experience might not be ideal yet because resumes
>> might take longer than desired. But it might not matter to GNOME as
>> much as it does to ChromeOS because they aren't in control of the hw
>> anyway?
>>
>> Regards,
>>
>> Tomeu
>>
>> >>> That said, the Power Manager section in that document sounds a little
>> >>> racy as it seems to rely on asking userspace if suspend is ok, rather
>> >>> then using userspace wakelocks, so I'm not sure how well baked this
>> >>> doc is.
>> >>>
>> >
>> > I'm not sure I understand what you are saying here.  If you're saying
>> > that the kernel is asking userspace if suspend is ok, then I can say
>> > that that's definitely not the case.  Currently from the kernel's
>> > perspective a lucid sleep resume isn't really different from a regular
>> > resume.  We have a hack in each driver that we care about that
>> > basically boils down to an if statement that skips re-initialization
>> > if we are entering lucid sleep.  If userspace tries to access that
>> > device in lucid sleep, it just gets an error.  This has actually
>> > caused us some headache (see the GPU process section of the doc),
>> > which is why we'd like to switch to using the runtime suspend approach
>> > I mentioned above.
>> >
>> > If instead you're saying that the power manager needs to ask the rest
>> > of userspace whether suspend is ok, you can think of the current
>> > mechanism as a sort of timed wake lock.  Daemons that care about lucid
>> > sleep register with the power manager when they start up.  The power
>> > manager then waits for these daemons to report readiness while in
>> > lucid sleep before starting another suspend.  So each daemon
>> > effectively acquires a wake lock when the system enters lucid sleep
>> > and releases the wake lock when it reports readiness to the power
>> > manager or the timeout occurs.
>> >
>> >>> Olof: Can you comment on who's working on that design doc?   Also the
>> >>> discussion around using freezing cgroups separately to distinguish
>> >>> between lucid and awake is interesting, but I wonder if we need to
>> >>> make wakeup_sources/wakelocks cgroup aware, or has that already been
>> >>> done?
>> >>
>> >
>> > Currently cgroup process management is handled by the Chrome browser
>> > since the only processes we freeze are Chrome renderers.  A renderer
>> > is basically a process that hosts the content for a single tab,
>> > extension, plugin, etc.  Freezing occurs when the system is about to
>> > enter suspend from the fully awake state and thawing occurs during the
>> > reverse transition.  We've made no changes to the cgroup code and have
>> > been using it as is.
>> >
>> >>
>> >> Sameer and Chirantan have both been deeply involved in that project,
>> >> adding them on cc here.
>> >>
>> >>
>> >> -Olof
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> > Please read the FAQ at  http://www.tux.org/lkml/
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-04 23:30       ` Chirantan Ekbote
@ 2015-05-05 10:46         ` Bastien Nocera
  2015-05-05 19:22           ` Chirantan Ekbote
  2015-05-05 14:39         ` Alan Stern
  2015-05-05 23:47         ` Rafael J. Wysocki
  2 siblings, 1 reply; 25+ messages in thread
From: Bastien Nocera @ 2015-05-05 10:46 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Mon, 2015-05-04 at 16:30 -0700, Chirantan Ekbote wrote:
> 
<snip>
> In the interest of brevity, I didn't go into the design of suspend /
> resume in userspace in my last email but it seems like there's no way
> around it.
> 
> Ignoring lucid sleep for a moment, here is how a regular suspend 
> works
> in the power manager.  The power manager sees a suspend request 
> either
> because the user has been idle for X amount of time (usually 15
> minutes) or the user explicitly requested it by closing the lid.  The
> power manager reads the value of /sys/power/wakeup_count and then
> announces an imminent suspend to the rest of the system via DBus.
> Interested applications (like the network manager and Chrome) will
> have registered with the power manager to be informed about this when
> they started up.  For example, this is when Chrome would freeze its
> renderer processes.  The power manager will now wait for them to
> report back their readiness to suspend.  Once all applications have
> reported ready (or the maximum timeout occurs), the power manager
> performs some final preparations (like setting the resume brightness
> for the display).

logind in systemd also does this, using block or delay inhibitors over
D-Bus:
http://www.freedesktop.org/wiki/Software/systemd/inhibit/

>   The last thing the power manager does, right before
> writing "mem" to /sys/power/state, is write the wakeup_count that it
> read earlier to /sys/power/wakeup_count.  If the write fails, the
> power manager considers the suspend attempt failed, reads the new
> wakeup_count, and starts a timer (usually 10 seconds) to retry the
> suspend.  The same thing happens if the write to /sys/power/state
> fails.

Is this something that logind should do as well?

> It may be the case that the event that incremented the count happened
> because a user was trying to cancel the suspend.  The user could have
> pressed some keys on the keyboard, touched the trackpad, opened the
> lid, pressed the power button, etc, etc.  For the keyboard and
> trackpad these events make their way up to Chrome, which sends a user
> activity message to the power manager.  This is a message that Chrome
> sends to the power manager even during regular activity, up to five
> times a second, to prevent the idle timeout from occuring.  When the
> power manager sees this user activity message while waiting to retry
> the suspend, it cancels the retry and sends out a DBus signal
> informing the system that the suspend is now complete.  For events
> like opening the lid or pressing the power button, the power manager
> monitors those events in /dev/input and simulates a user activity
> message from chrome if any of them fire.
> 
> 
> Ok so now I can talk about how lucid sleep fits into all of this. 
>  The
> power manager only considers a suspend attempt successful if the 
> write
> to /sys/power/state was successful.  If the suspend was successful,
> the power manager then reads another sysfs file
> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
> event was user-triggered.

This is where your lucid sleep implementation matches up with what I
had in mind for that wishlist.

>   If the event was user-triggered it sends
> out a DBus signal announcing the end of the suspend, Chrome thaws its
> renderer processes, the full UI comes back up, and the user can start
> working.  If the event was _not_ user-triggred (if it was the RTC or
> NIC), the power manager sends out a different DBus signal announcing
> that the system is in lucid sleep and will re-suspend soon.  It will
> then wait for all registered applications to report readiness to
> suspend or for the max timeout to expire.
> 
> If it so happens that a user interacts with the system while it is in
> this state, the power manager detects it via the user activity 
> message
> that Chrome sends.  This could be real (keyboard, trackpad) or
> simulated (lid, power button).  Either way, the power manager 
> responds
> the same way: it announces the end of the suspend, Chrome thaws its
> renderers, the full UI comes back up, and the user starts working. 
>  If
> there is no user activity and all applications report ready, the 
> power
> manager gets ready to suspend the system again.  Since the NIC is now
> a wakeup source, the power manager doesn't read the wakeup_count 
> until
> after all applications have reported ready because accessing the
> network could increment the wakeup_count and cause false positives.
> 
> If either the write to /sys/power/wakeup_count or /sys/power/state
> fails from lucid sleep, the power manager re-announces that the 
> system
> is in lucid sleep and will re-suspend soon.  It's actually a little
> smart about this: it will only re-announce lucid sleep if there was a
> wakeup_count mismatch or if the write to /sys/power/state returned
> EBUSY.  Other failures only trigger a simple retry and no DBus 
> signal.
> We do this because these wakeup events may legitimately trigger lucid
> sleep.  For example, more packets may arrive from the network or the
> RTC may go off and applications don't perform work until they hear
> from the power manager that the system is in lucid sleep.  At this
> point the power manager is back to waiting for applications to report
> ready (or for the retry timer to fire).  This process may repeat
> multiple times if we keep getting wakeup events right as the system
> tries to re-suspend.
> 
> 
> So that was a little long-winded but hopefully I've addressed all 
> your
> concerns about potential race conditions in this code.  I simplified 
> a
> few bits because would just complicate the discussion but for the 
> most
> part this is how the feature works now.  Having the kernel emit a
> uevent with the wakeup event type would take the place of the power
> manager reading from /sys/power/wakeup_type in this system but
> wouldn't really affect anything else.

Could you or Tomeu explain the driver changes that are necessary? Do
only drivers that we want to be usable as wakeup sources need to be
changed? Do you have examples of such a patch?

Cheers


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05  6:05         ` Tomeu Vizoso
@ 2015-05-05 12:31           ` Rafael J. Wysocki
  2015-05-07 16:54             ` One Thousand Gnomes
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-05 12:31 UTC (permalink / raw)
  To: Tomeu Vizoso
  Cc: Chirantan Ekbote, Olof Johansson, John Stultz, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Linux PM list

On Tuesday, May 05, 2015 08:05:32 AM Tomeu Vizoso wrote:
> On 5 May 2015 at 00:19, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Friday, May 01, 2015 11:02:19 AM Tomeu Vizoso wrote:
> >> On 30 April 2015 at 20:54, Chirantan Ekbote <chirantan@chromium.org> wrote:
> >> > On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
> >> >> Hi,
> >> >>
> >> >> On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
> >> >>> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
> >> >>>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:
> >> >>>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
> >> >>>>> wrote:
> >> >>>>> > Hey,
> >> >>>>> >
> >> >>>>> > GNOME has had discussions with kernel developers in the past, and,
> >> >>>>> > fortunately, in some cases we were able to make headway.
> >> >>>>> >
> >> >>>>> > There are however a number of items that we still don't have
> >> >>>>> > solutions
> >> >>>>> > for, items that kernel developers might not realise we'd like to
> >> >>>>> > rely
> >> >>>>> > on, or don't know that we'd make use of if merged.
> >> >>>>> >
> >> >>>>> > I've posted this list at:
> >> >>>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
> >> >>>>> >
> >> >>>>> > Let me know on-list or off-list if you have any comments about
> >> >>>>> > those, so
> >> >>>>> > I can update the list.
> >> >>>>>
> >> >>>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
> >> >>>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
> >> >>>>> documentation'
> >> >>>>>
> >> >>>>> Can you expand more on the rational for the need here? Is this for UI
> >> >>>>> for power debugging, or something else?
> >> >>>>
> >> >>>> This is pretty much what I had in mind:
> >> >>>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
> >> >>>>
> >> >>>> I guess I didn't make myself understood.
> >> >>>
> >> >>> My, admittedly quick skim, of that design document seems to suggest
> >> >>> that lucid sleep would be a new kernel state. That would keep the
> >> >>> kernel in charge of determining the state transitions (ie:
> >> >>> SUSPEND-(alarm)->LUCID-(wakelock
> >> >>> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
> >> >>> userspace would be able to query the current state. This avoids some
> >> >>> of the races I was concerned with trying to detect which irq woke us
> >> >>> from suspend from userspace.
> >> >>>
> >> >
> >> > Tomeu has been working on making things so that we don't need a new
> >> > kernel state.  Adding him on cc so he can correct me if I say
> >> > something wrong.  The current idea is to have userspace runtime
> >> > suspend any unneeded devices before starting a suspend.  This way the
> >> > kernel will leave them alone at resume.  This behavior already exists
> >> > in the mainline kernel.
> >>
> >> This is right, I have one series on flight about removing any
> >> non-runtime device resumes from my test platform (a nyan-big) and
> >> another about entering suspend-to-idle with ticks frozen (also on
> >> Tegra124/nyan-big).
> >>
> >> > Getting the wakeup reason can be accomplished
> >> > by having the kernel emit a uevent with the wakeup reason.  This is
> >> > the only change that would be necessary.
> >>
> >> My current opinion is that for ChromeOS there's no need for the kernel
> >> to communicate a wakeup reason to userspace. From what I know it
> >> should be enough for powerd/upower/whatever to constantly monitor all
> >> relevant input devices and that should tell if the wakeup was caused
> >> by user activity or not.
> >
> > Dream on.
> >
> > What about input events that leave no trace in the input buffers?
> 
> You mean that it can happen that, barring bugs on hw or fw, the kernel
> knows what device "caused" the wakeup but doesn't have enough
> information to report the full event (key pressed, touchpad motion,
> etc) to userspace?

Yes.

For example, when you wake up from S3 on ACPI-based systems, the best you
can get is what devices have generated the wakeup events, but there's
no input available from that (like you won't know which key has been
pressed).  You may not get that even.  You may only know what GPEs have
caused the wakeup to happen and they may be shared.

For PCI wakeup, the wakeup event may be out of band.  You need to walk
the hierarchy and check the PME status bits to identify the wakeup device
and then you need to be careful enough not to reset it while putting into
D0 for the input data associated with the event to be available.  I'm not
sure how many device/driver combinations this actually works for.

For USB wakeup, you get the wakeup event from the controller which may be
a PCI device.  Getting to the USB device itself from there requires some
work and even then the device may not "remember" what exactly happened.

Further, if you wake up via the PC keyboard from suspend-to-idle, the
wakeup key code is not available, the only thing you know is that the
interrupts has occured (that may be changed, but it's how the current
code works).

I guess I could continue with that.

> > What about events occuring between the time you've decided to suspend and
> > you actually suspended?
> 
> What about them? How would wakeup_type help us there?

Well, what do you do to ensure that you don't miss them?  That's what wakeup_count
can be used for (if that's what you mean).

Thanks,
Rafael


> >> Once userspace is thawed, the power management daemon would read any
> >> pending events from the input devices and would decide whether to stay
> >> on dark resume or whether to unfreeze renderer daemons and power up
> >> the display, unpause any audio streams, etc.
> >
> > And what about if it decideds to suspend again and a wakeup event
> > requiring you to bring up the UI occurs exactly at that time?
> >
> > You really have the same race as Android does here, but in addition to that
> > you want to deterimne how far to go with the resume on the basis of what
> > wakeup sources are involved.
> >
> > I'm leaving the below for the benefit of linux-pm readers.
> >
> >> For the GNOME folks, this is how it's done in ChromeOS:
> >>
> >> https://chromium.googlesource.com/chromiumos/platform2/+/master/power_manager/powerd/system/input_watcher.cc
> >>
> >> Of course, this depends on events not getting lost. In the ChromeOS
> >> case, the firmware is under the control of the OS developers, so any
> >> bugs can be fixed.
> >>
> >> For GNOME and other desktop environments who aim to run on systems
> >> whose firmware cannot be fixed, I think it could make sense for the
> >> kernel to synthesize such events if it happens to have enough
> >> information to do so.
> >>
> >> Besides this issue, I think that what "only" remains to be done in the
> >> kernel is to speed up resumes so no hacks are needed downstream. The
> >> infrastructure for this already exists in the form of
> >> power.direct_complete and suspend-to-idle and the work that remains is
> >> mostly platform-specific.
> >>
> >> This is to say that GNOME could start implementing lucid sleep right
> >> now, though the user experience might not be ideal yet because resumes
> >> might take longer than desired. But it might not matter to GNOME as
> >> much as it does to ChromeOS because they aren't in control of the hw
> >> anyway?
> >>
> >> Regards,
> >>
> >> Tomeu
> >>
> >> >>> That said, the Power Manager section in that document sounds a little
> >> >>> racy as it seems to rely on asking userspace if suspend is ok, rather
> >> >>> then using userspace wakelocks, so I'm not sure how well baked this
> >> >>> doc is.
> >> >>>
> >> >
> >> > I'm not sure I understand what you are saying here.  If you're saying
> >> > that the kernel is asking userspace if suspend is ok, then I can say
> >> > that that's definitely not the case.  Currently from the kernel's
> >> > perspective a lucid sleep resume isn't really different from a regular
> >> > resume.  We have a hack in each driver that we care about that
> >> > basically boils down to an if statement that skips re-initialization
> >> > if we are entering lucid sleep.  If userspace tries to access that
> >> > device in lucid sleep, it just gets an error.  This has actually
> >> > caused us some headache (see the GPU process section of the doc),
> >> > which is why we'd like to switch to using the runtime suspend approach
> >> > I mentioned above.
> >> >
> >> > If instead you're saying that the power manager needs to ask the rest
> >> > of userspace whether suspend is ok, you can think of the current
> >> > mechanism as a sort of timed wake lock.  Daemons that care about lucid
> >> > sleep register with the power manager when they start up.  The power
> >> > manager then waits for these daemons to report readiness while in
> >> > lucid sleep before starting another suspend.  So each daemon
> >> > effectively acquires a wake lock when the system enters lucid sleep
> >> > and releases the wake lock when it reports readiness to the power
> >> > manager or the timeout occurs.
> >> >
> >> >>> Olof: Can you comment on who's working on that design doc?   Also the
> >> >>> discussion around using freezing cgroups separately to distinguish
> >> >>> between lucid and awake is interesting, but I wonder if we need to
> >> >>> make wakeup_sources/wakelocks cgroup aware, or has that already been
> >> >>> done?
> >> >>
> >> >
> >> > Currently cgroup process management is handled by the Chrome browser
> >> > since the only processes we freeze are Chrome renderers.  A renderer
> >> > is basically a process that hosts the content for a single tab,
> >> > extension, plugin, etc.  Freezing occurs when the system is about to
> >> > enter suspend from the fully awake state and thawing occurs during the
> >> > reverse transition.  We've made no changes to the cgroup code and have
> >> > been using it as is.
> >> >
> >> >>
> >> >> Sameer and Chirantan have both been deeply involved in that project,
> >> >> adding them on cc here.
> >> >>
> >> >>
> >> >> -Olof
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> > the body of a message to majordomo@vger.kernel.org
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> > Please read the FAQ at  http://www.tux.org/lkml/
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/
> >
> > --
> > I speak only for myself.
> > Rafael J. Wysocki, Intel Open Source Technology Center.
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-04 23:30       ` Chirantan Ekbote
  2015-05-05 10:46         ` Bastien Nocera
@ 2015-05-05 14:39         ` Alan Stern
  2015-05-05 17:58           ` Chirantan Ekbote
  2015-05-05 23:47         ` Rafael J. Wysocki
  2 siblings, 1 reply; 25+ messages in thread
From: Alan Stern @ 2015-05-05 14:39 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Mon, 4 May 2015, Chirantan Ekbote wrote:

> Ok so now I can talk about how lucid sleep fits into all of this.  The
> power manager only considers a suspend attempt successful if the write
> to /sys/power/state was successful.  If the suspend was successful,
> the power manager then reads another sysfs file
> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
> event was user-triggered.  If the event was user-triggered it sends
> out a DBus signal announcing the end of the suspend, Chrome thaws its
> renderer processes, the full UI comes back up, and the user can start
> working.  If the event was _not_ user-triggred (if it was the RTC or
> NIC), the power manager sends out a different DBus signal announcing
> that the system is in lucid sleep and will re-suspend soon.  It will
> then wait for all registered applications to report readiness to
> suspend or for the max timeout to expire.
> 
> If it so happens that a user interacts with the system while it is in
> this state, the power manager detects it via the user activity message
> that Chrome sends.  This could be real (keyboard, trackpad) or
> simulated (lid, power button).  Either way, the power manager responds
> the same way: it announces the end of the suspend, Chrome thaws its
> renderers, the full UI comes back up, and the user starts working.  If
> there is no user activity and all applications report ready, the power
> manager gets ready to suspend the system again.  Since the NIC is now
> a wakeup source, the power manager doesn't read the wakeup_count until
> after all applications have reported ready because accessing the
> network could increment the wakeup_count and cause false positives.

This gives only an implicit description of the difference between
"lucid sleep" and normal resume.  I gather that "lucid sleep" means you
avoid reactivating some parts of the hardware (in particular the parts
used by the renderers), and perhaps also you shorten the inactivity
delay before the next suspend.

Is that right?  The main point of "lucid sleep" is to allow the system 
to do a partial resume, in which some of the most power-hungry parts 
remain suspended?

If that's true then some of the recent work on "direct-complete" may be 
sufficient to do what you want.  See commit f71495f3f0c5 (PM / sleep: 
Update device PM documentation to cover direct_complete) and the 
preceding commits.  The idea is that under some circumstances, devices 
that were in runtime-suspend before a system sleep can remain in 
runtime-suspend when the system wakes up.

If you can force the rendering hardware to go into runtime-suspend when
starting a system suspend, it can remain powered down when the system
wakes up.  Then if it isn't needed for anything (because you suspend
the system again before doing any UI activity), it can remain powered
down the entire time.  No need for a separate "lucid sleep" state.

Alan Stern

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 14:39         ` Alan Stern
@ 2015-05-05 17:58           ` Chirantan Ekbote
  2015-05-05 19:35             ` Alan Stern
  0 siblings, 1 reply; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-05 17:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, May 5, 2015 at 7:39 AM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Mon, 4 May 2015, Chirantan Ekbote wrote:
>
>> Ok so now I can talk about how lucid sleep fits into all of this.  The
>> power manager only considers a suspend attempt successful if the write
>> to /sys/power/state was successful.  If the suspend was successful,
>> the power manager then reads another sysfs file
>> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
>> event was user-triggered.  If the event was user-triggered it sends
>> out a DBus signal announcing the end of the suspend, Chrome thaws its
>> renderer processes, the full UI comes back up, and the user can start
>> working.  If the event was _not_ user-triggred (if it was the RTC or
>> NIC), the power manager sends out a different DBus signal announcing
>> that the system is in lucid sleep and will re-suspend soon.  It will
>> then wait for all registered applications to report readiness to
>> suspend or for the max timeout to expire.
>>
>> If it so happens that a user interacts with the system while it is in
>> this state, the power manager detects it via the user activity message
>> that Chrome sends.  This could be real (keyboard, trackpad) or
>> simulated (lid, power button).  Either way, the power manager responds
>> the same way: it announces the end of the suspend, Chrome thaws its
>> renderers, the full UI comes back up, and the user starts working.  If
>> there is no user activity and all applications report ready, the power
>> manager gets ready to suspend the system again.  Since the NIC is now
>> a wakeup source, the power manager doesn't read the wakeup_count until
>> after all applications have reported ready because accessing the
>> network could increment the wakeup_count and cause false positives.
>
> This gives only an implicit description of the difference between
> "lucid sleep" and normal resume.  I gather that "lucid sleep" means you
> avoid reactivating some parts of the hardware (in particular the parts
> used by the renderers), and perhaps also you shorten the inactivity
> delay before the next suspend.
>
> Is that right?  The main point of "lucid sleep" is to allow the system
> to do a partial resume, in which some of the most power-hungry parts
> remain suspended?
>

Yes, that is correct.

> If that's true then some of the recent work on "direct-complete" may be
> sufficient to do what you want.  See commit f71495f3f0c5 (PM / sleep:
> Update device PM documentation to cover direct_complete) and the
> preceding commits.  The idea is that under some circumstances, devices
> that were in runtime-suspend before a system sleep can remain in
> runtime-suspend when the system wakes up.
>
> If you can force the rendering hardware to go into runtime-suspend when
> starting a system suspend, it can remain powered down when the system
> wakes up.  Then if it isn't needed for anything (because you suspend
> the system again before doing any UI activity), it can remain powered
> down the entire time.  No need for a separate "lucid sleep" state.
>

This is our plan for the next version (see my email earlier in this
thread).  Keeping a hybrid power state with hacks in the drivers isn't
really maintainable, scalable, or upstream-able and has caused us some
headaches already.  Unfortunately we were working with the 3.14 kernel
at the time, which didn't have the framework necessary to do anything
else.  The new version of lucid sleep will have the power manager
runtime suspend power-hungry devices before a suspend so that they
remain powered off at resume time.  The power manager can then decide
to resume those devices based on whether the wakeup event was
user-triggered.

Being able to determine the wakeup type from userspace is the only
functionality we need from the kernel that doesn't already exist in
mainline.

> Alan Stern
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 10:46         ` Bastien Nocera
@ 2015-05-05 19:22           ` Chirantan Ekbote
  2015-05-06 12:41             ` Bastien Nocera
  0 siblings, 1 reply; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-05 19:22 UTC (permalink / raw)
  To: Bastien Nocera
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, May 5, 2015 at 3:46 AM, Bastien Nocera <hadess@hadess.net> wrote:
> On Mon, 2015-05-04 at 16:30 -0700, Chirantan Ekbote wrote:
>>
> <snip>
>> In the interest of brevity, I didn't go into the design of suspend /
>> resume in userspace in my last email but it seems like there's no way
>> around it.
>>
>> Ignoring lucid sleep for a moment, here is how a regular suspend
>> works
>> in the power manager.  The power manager sees a suspend request
>> either
>> because the user has been idle for X amount of time (usually 15
>> minutes) or the user explicitly requested it by closing the lid.  The
>> power manager reads the value of /sys/power/wakeup_count and then
>> announces an imminent suspend to the rest of the system via DBus.
>> Interested applications (like the network manager and Chrome) will
>> have registered with the power manager to be informed about this when
>> they started up.  For example, this is when Chrome would freeze its
>> renderer processes.  The power manager will now wait for them to
>> report back their readiness to suspend.  Once all applications have
>> reported ready (or the maximum timeout occurs), the power manager
>> performs some final preparations (like setting the resume brightness
>> for the display).
>
> logind in systemd also does this, using block or delay inhibitors over
> D-Bus:
> http://www.freedesktop.org/wiki/Software/systemd/inhibit/
>
>>   The last thing the power manager does, right before
>> writing "mem" to /sys/power/state, is write the wakeup_count that it
>> read earlier to /sys/power/wakeup_count.  If the write fails, the
>> power manager considers the suspend attempt failed, reads the new
>> wakeup_count, and starts a timer (usually 10 seconds) to retry the
>> suspend.  The same thing happens if the write to /sys/power/state
>> fails.
>
> Is this something that logind should do as well?
>

We do it to avoid a race condition where a wakeup event occurs after
userspace has started the suspend process but before anything writes
"mem" to /sys/power/state.  I'm guessing that this is something logind
should be doing as well since the chances of missing a wakeup event
increase the longer any given delay inhibitor takes to delay a
suspend.

>> It may be the case that the event that incremented the count happened
>> because a user was trying to cancel the suspend.  The user could have
>> pressed some keys on the keyboard, touched the trackpad, opened the
>> lid, pressed the power button, etc, etc.  For the keyboard and
>> trackpad these events make their way up to Chrome, which sends a user
>> activity message to the power manager.  This is a message that Chrome
>> sends to the power manager even during regular activity, up to five
>> times a second, to prevent the idle timeout from occuring.  When the
>> power manager sees this user activity message while waiting to retry
>> the suspend, it cancels the retry and sends out a DBus signal
>> informing the system that the suspend is now complete.  For events
>> like opening the lid or pressing the power button, the power manager
>> monitors those events in /dev/input and simulates a user activity
>> message from chrome if any of them fire.
>>
>> <snip>
>>
>> So that was a little long-winded but hopefully I've addressed all
>> your
>> concerns about potential race conditions in this code.  I simplified
>> a
>> few bits because would just complicate the discussion but for the
>> most
>> part this is how the feature works now.  Having the kernel emit a
>> uevent with the wakeup event type would take the place of the power
>> manager reading from /sys/power/wakeup_type in this system but
>> wouldn't really affect anything else.
>
> Could you or Tomeu explain the driver changes that are necessary? Do
> only drivers that we want to be usable as wakeup sources need to be
> changed? Do you have examples of such a patch?
>

You will need driver support for anything that needs to trigger a
wakeup event.  The RTC already supports this but for wifi we needed
both driver and firmware support from Intel.  Assuming you choose to
go down the "runtime-suspend devices to keep them suspended on resume"
path, then you would also need runtime suspend support in all the
drivers that you wanted to leave off.  In our case, we only do this
for the display and the USB ports because we don't want the screen to
flash or the lights on any USB-connected device to start blinking.

Here are some of the patches for wifi that we needed to get this working:

6abb9cb cfg80211: make WoWLAN configuration available to drivers
cd8f7cb cfg80211/mac80211: support reporting wakeup reason
a92eecb cfg80211: fix WoWLAN wakeup tracing
8cd4d45 cfg80211: add wowlan net-detect support


> Cheers
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 17:58           ` Chirantan Ekbote
@ 2015-05-05 19:35             ` Alan Stern
  2015-05-05 20:58               ` Chirantan Ekbote
  0 siblings, 1 reply; 25+ messages in thread
From: Alan Stern @ 2015-05-05 19:35 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, 5 May 2015, Chirantan Ekbote wrote:

> This is our plan for the next version (see my email earlier in this
> thread).  Keeping a hybrid power state with hacks in the drivers isn't
> really maintainable, scalable, or upstream-able and has caused us some
> headaches already.  Unfortunately we were working with the 3.14 kernel
> at the time, which didn't have the framework necessary to do anything
> else.  The new version of lucid sleep will have the power manager
> runtime suspend power-hungry devices before a suspend so that they
> remain powered off at resume time.  The power manager can then decide
> to resume those devices based on whether the wakeup event was
> user-triggered.
> 
> Being able to determine the wakeup type from userspace is the only
> functionality we need from the kernel that doesn't already exist in
> mainline.

Maybe you can simplify the problem.  You don't really need to know the 
wakeup type, or to determine which device was responsible for the 
wakeup.  All you really need to know is whether the wakeup was 
user-triggered.  That may be much easier to discover.

Alan Stern


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 19:35             ` Alan Stern
@ 2015-05-05 20:58               ` Chirantan Ekbote
  2015-05-05 23:56                 ` Rafael J. Wysocki
  2015-05-07 17:03                 ` One Thousand Gnomes
  0 siblings, 2 replies; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-05 20:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, May 5, 2015 at 12:35 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> On Tue, 5 May 2015, Chirantan Ekbote wrote:
>
>> This is our plan for the next version (see my email earlier in this
>> thread).  Keeping a hybrid power state with hacks in the drivers isn't
>> really maintainable, scalable, or upstream-able and has caused us some
>> headaches already.  Unfortunately we were working with the 3.14 kernel
>> at the time, which didn't have the framework necessary to do anything
>> else.  The new version of lucid sleep will have the power manager
>> runtime suspend power-hungry devices before a suspend so that they
>> remain powered off at resume time.  The power manager can then decide
>> to resume those devices based on whether the wakeup event was
>> user-triggered.
>>
>> Being able to determine the wakeup type from userspace is the only
>> functionality we need from the kernel that doesn't already exist in
>> mainline.
>
> Maybe you can simplify the problem.  You don't really need to know the
> wakeup type, or to determine which device was responsible for the
> wakeup.  All you really need to know is whether the wakeup was
> user-triggered.  That may be much easier to discover.
>

You are, of course, correct.  Ultimately the only requirement we have
is that there exists a way for userspace to determine if the system
woke up because of a user-triggered event.  The actual mechanism by
which this determination is made isn't something I feel strongly
about.  The reason I had been focusing on exposing the actual wakeup
event to userspace is because classifying wakeup events as
user-triggered or not feels to me like a policy decision that should
be left to userspace.  If the kernel maintainers are ok with doing
this work in the kernel instead and only exposing a binary yes/no bit
to userspace for user-triggered wakeups, that's perfectly fine because
it still meets our requirements.

> Alan Stern
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 23:56                 ` Rafael J. Wysocki
@ 2015-05-05 23:38                   ` David Lang
  2015-05-05 23:51                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: David Lang @ 2015-05-05 23:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chirantan Ekbote, Alan Stern, John Stultz, Olof Johansson,
	Bastien Nocera, Linux Kernel Mailing List, snanda, Tomeu Vizoso,
	Linux PM list

On Wed, 6 May 2015, Rafael J. Wysocki wrote:

>> You are, of course, correct.  Ultimately the only requirement we have
>> is that there exists a way for userspace to determine if the system
>> woke up because of a user-triggered event.  The actual mechanism by
>> which this determination is made isn't something I feel strongly
>> about.  The reason I had been focusing on exposing the actual wakeup
>> event to userspace is because classifying wakeup events as
>> user-triggered or not feels to me like a policy decision that should
>> be left to userspace.  If the kernel maintainers are ok with doing
>> this work in the kernel instead and only exposing a binary yes/no bit
>> to userspace for user-triggered wakeups, that's perfectly fine because
>> it still meets our requirements.
>
> Well, please see the message I've just sent.
>
> All wakeup devices have a wakeup source object associated with them.  In
> principle, we can expose a "priority" attribute from that for user space to
> set as it wants to.  There may be two values of it, like "normal" and "high"
> for example.
>
> Then, what only remains is to introduce separate wakeup counts for the "high"
> priority and "normal" priority wakeup sources and teach the power manager to
> use them.
>
> That leaves no policy in the kernel, but it actually has a chance to work.

how about instead of setting two states and defining that one must be a subset 
of the other you instead have the existing feed of events and then allow 
software that cares to define additional feeds that take the current feed and 
filter it. We allow bpf filters in the kernel, so use those to filter what 
events the additional feed is going to receive.

remember that the interesting numbers in CS are 0, 1, and many, not 2 :-)

don't limit things to two feeds with one always being a subset of the other, 
create a mechanism to allow an arbitrary number of feeds that can be filtered in 
different ways

David Lang

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-04 23:30       ` Chirantan Ekbote
  2015-05-05 10:46         ` Bastien Nocera
  2015-05-05 14:39         ` Alan Stern
@ 2015-05-05 23:47         ` Rafael J. Wysocki
  2015-05-06 17:40           ` Chirantan Ekbote
  2015-05-11 22:12           ` Pavel Machek
  2 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-05 23:47 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Monday, May 04, 2015 04:30:00 PM Chirantan Ekbote wrote:
> On Mon, May 4, 2015 at 3:12 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Thursday, April 30, 2015 11:54:51 AM Chirantan Ekbote wrote:
> >> On Thu, Apr 30, 2015 at 10:23 AM, Olof Johansson <olof@lixom.net> wrote:
> >> > Hi,
> >> >
> >> > On Thu, Apr 30, 2015 at 10:10 AM, John Stultz <john.stultz@linaro.org> wrote:
> >> >> On Thu, Apr 30, 2015 at 9:25 AM, Bastien Nocera <hadess@hadess.net> wrote:
> >> >>> On Tue, 2014-10-21 at 10:04 -0700, John Stultz wrote:
> >
> > Thanks for CCin me, John!
> >
> > Let's also CC linux-pm as the people on that list may be generally interested
> > in this thread.
> >
> >> >>>> On Tue, Oct 21, 2014 at 1:49 AM, Bastien Nocera <hadess@hadess.net>
> >> >>>> wrote:
> >> >>>> > Hey,
> >> >>>> >
> >> >>>> > GNOME has had discussions with kernel developers in the past, and,
> >> >>>> > fortunately, in some cases we were able to make headway.
> >> >>>> >
> >> >>>> > There are however a number of items that we still don't have
> >> >>>> > solutions
> >> >>>> > for, items that kernel developers might not realise we'd like to
> >> >>>> > rely
> >> >>>> > on, or don't know that we'd make use of if merged.
> >> >>>> >
> >> >>>> > I've posted this list at:
> >> >>>> > https://wiki.gnome.org/BastienNocera/KernelWishlist
> >> >>>> >
> >> >>>> > Let me know on-list or off-list if you have any comments about
> >> >>>> > those, so
> >> >>>> > I can update the list.
> >> >>>>
> >> >>>> As for: 'Export of "wake reason" when the system wakes up (rtc alarm,
> >> >>>> lid open, etc.) and wakealarm (/sys/class/rtc/foo/wakealarm)
> >> >>>> documentation'
> >> >>>>
> >> >>>> Can you expand more on the rational for the need here? Is this for UI
> >> >>>> for power debugging, or something else?
> >> >>>
> >> >>> This is pretty much what I had in mind:
> >> >>> https://www.chromium.org/chromium-os/chromiumos-design-docs/lucid-sleep
> >> >>>
> >> >>> I guess I didn't make myself understood.
> >> >>
> >> >> My, admittedly quick skim, of that design document seems to suggest
> >> >> that lucid sleep would be a new kernel state. That would keep the
> >> >> kernel in charge of determining the state transitions (ie:
> >> >> SUSPEND-(alarm)->LUCID-(wakelock
> >> >> release)->SUSPEND-(alarm)->LUCID-(power-button)->AWAKE). Then it seems
> >> >> userspace would be able to query the current state. This avoids some
> >> >> of the races I was concerned with trying to detect which irq woke us
> >> >> from suspend from userspace.
> >> >>
> >>
> >> Tomeu has been working on making things so that we don't need a new
> >> kernel state.
> >
> > Which is good, because adding a new kernel state like that to the mainline is
> > out of the question as far as I'm concerned.
> >
> >>  Adding him on cc so he can correct me if I say
> >> something wrong.  The current idea is to have userspace runtime
> >> suspend any unneeded devices before starting a suspend.  This way the
> >> kernel will leave them alone at resume.  This behavior already exists
> >> in the mainline kernel.  Getting the wakeup reason can be accomplished
> >> by having the kernel emit a uevent with the wakeup reason.  This is
> >> the only change that would be necessary.
> >
> > Well, that needs to be thought through carefully in my view, or it will
> > always be racy.
> >
> > You cannot really only rely on wakeup events that have already happened,
> > because something requiring you to bring up the full UI may happen at
> > any time.  In particular, it may happen when you're about to suspend again.
> >
> > For this reason, it looks like you need something along the lines of
> > the wakeup_count interface, but acting on subsets of devices.
> >
> > It actually shouldn't be too difficult to split the existing wakeup
> > counter into a number of subcounters each tracking a subset of wakeup
> > sources and one of them might be used as the "full UI wakeup" condition
> > trigger in principle.
> >
> 
> In the interest of brevity, I didn't go into the design of suspend /
> resume in userspace in my last email but it seems like there's no way
> around it.

Well, thanks for the effort. :-)

> Ignoring lucid sleep for a moment, here is how a regular suspend works
> in the power manager.  The power manager sees a suspend request either
> because the user has been idle for X amount of time (usually 15
> minutes) or the user explicitly requested it by closing the lid.  The
> power manager reads the value of /sys/power/wakeup_count and then
> announces an imminent suspend to the rest of the system via DBus.
> Interested applications (like the network manager and Chrome) will
> have registered with the power manager to be informed about this when
> they started up.  For example, this is when Chrome would freeze its
> renderer processes.  The power manager will now wait for them to
> report back their readiness to suspend.  Once all applications have
> reported ready (or the maximum timeout occurs), the power manager
> performs some final preparations (like setting the resume brightness
> for the display).  The last thing the power manager does, right before
> writing "mem" to /sys/power/state, is write the wakeup_count that it
> read earlier to /sys/power/wakeup_count.  If the write fails, the
> power manager considers the suspend attempt failed, reads the new
> wakeup_count, and starts a timer (usually 10 seconds) to retry the
> suspend.  The same thing happens if the write to /sys/power/state
> fails.

That sounds reasonable to me.

> It may be the case that the event that incremented the count happened
> because a user was trying to cancel the suspend.  The user could have
> pressed some keys on the keyboard, touched the trackpad, opened the
> lid, pressed the power button, etc, etc.  For the keyboard and
> trackpad these events make their way up to Chrome, which sends a user
> activity message to the power manager.  This is a message that Chrome
> sends to the power manager even during regular activity, up to five
> times a second, to prevent the idle timeout from occuring.  When the
> power manager sees this user activity message while waiting to retry
> the suspend, it cancels the retry and sends out a DBus signal
> informing the system that the suspend is now complete.  For events
> like opening the lid or pressing the power button, the power manager
> monitors those events in /dev/input and simulates a user activity
> message from chrome if any of them fire.

So far, so good.

> Ok so now I can talk about how lucid sleep fits into all of this.  The
> power manager only considers a suspend attempt successful if the write
> to /sys/power/state was successful.  If the suspend was successful,
> the power manager then reads another sysfs file
> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
> event was user-triggered.

Well, that's where things are likely to get ugly depending on how the
/sys/power/wakeup_type attribute works (more below).

> If the event was user-triggered it sends
> out a DBus signal announcing the end of the suspend, Chrome thaws its
> renderer processes, the full UI comes back up, and the user can start
> working.  If the event was _not_ user-triggred (if it was the RTC or
> NIC), the power manager sends out a different DBus signal announcing
> that the system is in lucid sleep and will re-suspend soon.  It will
> then wait for all registered applications to report readiness to
> suspend or for the max timeout to expire.

First let me say that the "user-triggered" vs "non-user-triggered" distinction
seems somewhat artificial to me.  All boils down to having a special class
of wakeup events that are supposed to make the power manager behave differently
after resuming.  Whether or not they are actually triggered by the user
doesn't really matter technically.

> If it so happens that a user interacts with the system while it is in
> this state, the power manager detects it via the user activity message
> that Chrome sends.  This could be real (keyboard, trackpad) or
> simulated (lid, power button).  Either way, the power manager responds
> the same way: it announces the end of the suspend, Chrome thaws its
> renderers, the full UI comes back up, and the user starts working.  If
> there is no user activity and all applications report ready, the power
> manager gets ready to suspend the system again.  Since the NIC is now
> a wakeup source, the power manager doesn't read the wakeup_count until
> after all applications have reported ready because accessing the
> network could increment the wakeup_count and cause false positives.
> 
> If either the write to /sys/power/wakeup_count or /sys/power/state
> fails from lucid sleep, the power manager re-announces that the system
> is in lucid sleep and will re-suspend soon.  It's actually a little
> smart about this: it will only re-announce lucid sleep if there was a
> wakeup_count mismatch or if the write to /sys/power/state returned
> EBUSY.  Other failures only trigger a simple retry and no DBus signal.
> We do this because these wakeup events may legitimately trigger lucid
> sleep.  For example, more packets may arrive from the network or the
> RTC may go off and applications don't perform work until they hear
> from the power manager that the system is in lucid sleep.  At this
> point the power manager is back to waiting for applications to report
> ready (or for the retry timer to fire).  This process may repeat
> multiple times if we keep getting wakeup events right as the system
> tries to re-suspend.
> 
> 
> So that was a little long-winded but hopefully I've addressed all your
> concerns about potential race conditions in this code.  I simplified a
> few bits because would just complicate the discussion but for the most
> part this is how the feature works now.  Having the kernel emit a
> uevent with the wakeup event type would take the place of the power
> manager reading from /sys/power/wakeup_type in this system but
> wouldn't really affect anything else.

Which loops back to my previous remark: Things may get ugly if /sys/power/wakeup_type
doesn't do the right thing (the uevent mechanics you'd like to replace it with
will really need to do the same, so I'm not quite sure it's worth the effort).

Namely, it really has to cover all events that might have woken you up and
happened before stuff has started to be added to the input buffers that Chrome
cares about.  It is difficult to identify the exact point where that takes place
in the resume sequence, but it should be somewhere in dpm_resume_end().  Why so?
Because it really doesn't matter why exactly the system is waking up.  What
matters is whether or not an event that you should react to by bringing up the
UI happens *at* *any* *time* between (and including) the actual wakeup and the
point when you can rely on the input buffers to contain any useful information
consumable by Chrome.

This pretty much means that /sys/power/wakeup_type needs to behave almost like
/sys/power/wakeup_count, but is limited to a subset of wakeup sources.  That's
why I was talking about splitting the wakeup count.

So instead of adding an entirely new mechanics for that, why don't you add
something like "priority" or "weight" to struct wakeup_source and assign
higher values of that to the wakeup sources associated with the events
you want to bring up the UI after resume?  And make those "higher-priority"
wakeup sources use a separate wakeup counter, so you can easily verify if
any of them has triggered by reading that or making it trigger a uevent if
you want to?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 23:38                   ` David Lang
@ 2015-05-05 23:51                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-05 23:51 UTC (permalink / raw)
  To: David Lang
  Cc: Rafael J. Wysocki, Chirantan Ekbote, Alan Stern, John Stultz,
	Olof Johansson, Bastien Nocera, Linux Kernel Mailing List, snanda,
	Tomeu Vizoso, Linux PM list

On Wed, May 6, 2015 at 1:38 AM, David Lang <david@lang.hm> wrote:
> On Wed, 6 May 2015, Rafael J. Wysocki wrote:
>
>>> You are, of course, correct.  Ultimately the only requirement we have
>>> is that there exists a way for userspace to determine if the system
>>> woke up because of a user-triggered event.  The actual mechanism by
>>> which this determination is made isn't something I feel strongly
>>> about.  The reason I had been focusing on exposing the actual wakeup
>>> event to userspace is because classifying wakeup events as
>>> user-triggered or not feels to me like a policy decision that should
>>> be left to userspace.  If the kernel maintainers are ok with doing
>>> this work in the kernel instead and only exposing a binary yes/no bit
>>> to userspace for user-triggered wakeups, that's perfectly fine because
>>> it still meets our requirements.
>>
>>
>> Well, please see the message I've just sent.
>>
>> All wakeup devices have a wakeup source object associated with them.  In
>> principle, we can expose a "priority" attribute from that for user space
>> to
>> set as it wants to.  There may be two values of it, like "normal" and
>> "high"
>> for example.
>>
>> Then, what only remains is to introduce separate wakeup counts for the
>> "high"
>> priority and "normal" priority wakeup sources and teach the power manager
>> to
>> use them.
>>
>> That leaves no policy in the kernel, but it actually has a chance to work.
>
>
> how about instead of setting two states and defining that one must be a
> subset of the other you instead have the existing feed of events and then
> allow software that cares to define additional feeds that take the current
> feed and filter it. We allow bpf filters in the kernel, so use those to
> filter what events the additional feed is going to receive.
>
> remember that the interesting numbers in CS are 0, 1, and many, not 2 :-)
>
> don't limit things to two feeds with one always being a subset of the other,
> create a mechanism to allow an arbitrary number of feeds that can be
> filtered in different ways

In my example above "high" is not a subset of "normal".  They are
separate sets.  Each of them is a subset of the set of all wakeup
sources, but that's obvious.

And yes, you can create more of them, but then you'll need an
interface to manipulate them which will probably be overkill for the
use case at hand.

Do you envision a use case where more than two sets would be necessary?

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 20:58               ` Chirantan Ekbote
@ 2015-05-05 23:56                 ` Rafael J. Wysocki
  2015-05-05 23:38                   ` David Lang
  2015-05-07 17:03                 ` One Thousand Gnomes
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-05 23:56 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Alan Stern, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tuesday, May 05, 2015 01:58:12 PM Chirantan Ekbote wrote:
> On Tue, May 5, 2015 at 12:35 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> > On Tue, 5 May 2015, Chirantan Ekbote wrote:
> >
> >> This is our plan for the next version (see my email earlier in this
> >> thread).  Keeping a hybrid power state with hacks in the drivers isn't
> >> really maintainable, scalable, or upstream-able and has caused us some
> >> headaches already.  Unfortunately we were working with the 3.14 kernel
> >> at the time, which didn't have the framework necessary to do anything
> >> else.  The new version of lucid sleep will have the power manager
> >> runtime suspend power-hungry devices before a suspend so that they
> >> remain powered off at resume time.  The power manager can then decide
> >> to resume those devices based on whether the wakeup event was
> >> user-triggered.
> >>
> >> Being able to determine the wakeup type from userspace is the only
> >> functionality we need from the kernel that doesn't already exist in
> >> mainline.
> >
> > Maybe you can simplify the problem.  You don't really need to know the
> > wakeup type, or to determine which device was responsible for the
> > wakeup.  All you really need to know is whether the wakeup was
> > user-triggered.  That may be much easier to discover.
> >
> 
> You are, of course, correct.  Ultimately the only requirement we have
> is that there exists a way for userspace to determine if the system
> woke up because of a user-triggered event.  The actual mechanism by
> which this determination is made isn't something I feel strongly
> about.  The reason I had been focusing on exposing the actual wakeup
> event to userspace is because classifying wakeup events as
> user-triggered or not feels to me like a policy decision that should
> be left to userspace.  If the kernel maintainers are ok with doing
> this work in the kernel instead and only exposing a binary yes/no bit
> to userspace for user-triggered wakeups, that's perfectly fine because
> it still meets our requirements.

Well, please see the message I've just sent.

All wakeup devices have a wakeup source object associated with them.  In
principle, we can expose a "priority" attribute from that for user space to
set as it wants to.  There may be two values of it, like "normal" and "high"
for example.

Then, what only remains is to introduce separate wakeup counts for the "high"
priority and "normal" priority wakeup sources and teach the power manager to
use them.

That leaves no policy in the kernel, but it actually has a chance to work.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 19:22           ` Chirantan Ekbote
@ 2015-05-06 12:41             ` Bastien Nocera
  0 siblings, 0 replies; 25+ messages in thread
From: Bastien Nocera @ 2015-05-06 12:41 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Rafael J. Wysocki, John Stultz, Olof Johansson,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, 2015-05-05 at 12:22 -0700, Chirantan Ekbote wrote:
> On Tue, May 5, 2015 at 3:46 AM, Bastien Nocera <hadess@hadess.net> 
> wrote:
> > 
> > >   The last thing the power manager does, right before
> > > writing "mem" to /sys/power/state, is write the wakeup_count 
> > > that it
> > > read earlier to /sys/power/wakeup_count.  If the write fails, the
> > > power manager considers the suspend attempt failed, reads the new
> > > wakeup_count, and starts a timer (usually 10 seconds) to retry 
> > > the
> > > suspend.  The same thing happens if the write to /sys/power/state
> > > fails.
> > 
> > Is this something that logind should do as well?
> > 
> 
> We do it to avoid a race condition where a wakeup event occurs after
> userspace has started the suspend process but before anything writes
> "mem" to /sys/power/state.  I'm guessing that this is something 
> logind
> should be doing as well since the chances of missing a wakeup event
> increase the longer any given delay inhibitor takes to delay a
> suspend.

File https://bugzilla.freedesktop.org/show_bug.cgi?id=90339

Cheers


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 23:47         ` Rafael J. Wysocki
@ 2015-05-06 17:40           ` Chirantan Ekbote
  2015-05-07 23:19             ` Rafael J. Wysocki
  2015-05-11 22:12           ` Pavel Machek
  1 sibling, 1 reply; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-06 17:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tue, May 5, 2015 at 4:47 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, May 04, 2015 04:30:00 PM Chirantan Ekbote wrote:
>
> <snip>
>
>> Ok so now I can talk about how lucid sleep fits into all of this.  The
>> power manager only considers a suspend attempt successful if the write
>> to /sys/power/state was successful.  If the suspend was successful,
>> the power manager then reads another sysfs file
>> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
>> event was user-triggered.
>
> Well, that's where things are likely to get ugly depending on how the
> /sys/power/wakeup_type attribute works (more below).
>
>> If the event was user-triggered it sends
>> out a DBus signal announcing the end of the suspend, Chrome thaws its
>> renderer processes, the full UI comes back up, and the user can start
>> working.  If the event was _not_ user-triggred (if it was the RTC or
>> NIC), the power manager sends out a different DBus signal announcing
>> that the system is in lucid sleep and will re-suspend soon.  It will
>> then wait for all registered applications to report readiness to
>> suspend or for the max timeout to expire.
>
> First let me say that the "user-triggered" vs "non-user-triggered" distinction
> seems somewhat artificial to me.  All boils down to having a special class
> of wakeup events that are supposed to make the power manager behave differently
> after resuming.  Whether or not they are actually triggered by the user
> doesn't really matter technically.
>

This is true.  It's just easier to talk about them as user-triggered
or not.  For reference, [1] and [2] are the patches that implement
wakeup_type in our kernel.

>> If it so happens that a user interacts with the system while it is in
>> this state, the power manager detects it via the user activity message
>> that Chrome sends.  This could be real (keyboard, trackpad) or
>> simulated (lid, power button).  Either way, the power manager responds
>> the same way: it announces the end of the suspend, Chrome thaws its
>> renderers, the full UI comes back up, and the user starts working.  If
>> there is no user activity and all applications report ready, the power
>> manager gets ready to suspend the system again.  Since the NIC is now
>> a wakeup source, the power manager doesn't read the wakeup_count until
>> after all applications have reported ready because accessing the
>> network could increment the wakeup_count and cause false positives.
>>
>> If either the write to /sys/power/wakeup_count or /sys/power/state
>> fails from lucid sleep, the power manager re-announces that the system
>> is in lucid sleep and will re-suspend soon.  It's actually a little
>> smart about this: it will only re-announce lucid sleep if there was a
>> wakeup_count mismatch or if the write to /sys/power/state returned
>> EBUSY.  Other failures only trigger a simple retry and no DBus signal.
>> We do this because these wakeup events may legitimately trigger lucid
>> sleep.  For example, more packets may arrive from the network or the
>> RTC may go off and applications don't perform work until they hear
>> from the power manager that the system is in lucid sleep.  At this
>> point the power manager is back to waiting for applications to report
>> ready (or for the retry timer to fire).  This process may repeat
>> multiple times if we keep getting wakeup events right as the system
>> tries to re-suspend.
>>
>>
>> So that was a little long-winded but hopefully I've addressed all your
>> concerns about potential race conditions in this code.  I simplified a
>> few bits because would just complicate the discussion but for the most
>> part this is how the feature works now.  Having the kernel emit a
>> uevent with the wakeup event type would take the place of the power
>> manager reading from /sys/power/wakeup_type in this system but
>> wouldn't really affect anything else.
>
> Which loops back to my previous remark: Things may get ugly if /sys/power/wakeup_type
> doesn't do the right thing (the uevent mechanics you'd like to replace it with
> will really need to do the same, so I'm not quite sure it's worth the effort).
>
> Namely, it really has to cover all events that might have woken you up and
> happened before stuff has started to be added to the input buffers that Chrome
> cares about.  It is difficult to identify the exact point where that takes place
> in the resume sequence, but it should be somewhere in dpm_resume_end().  Why so?
> Because it really doesn't matter why exactly the system is waking up.  What
> matters is whether or not an event that you should react to by bringing up the
> UI happens *at* *any* *time* between (and including) the actual wakeup and the
> point when you can rely on the input buffers to contain any useful information
> consumable by Chrome.
>

So this is something that we don't catch right now.  Our
implementation queries the firmware via an ACPI call to get the wakeup
source and I was assuming that any events after that *would*
eventually make their way up to Chrome or userspace.  But based on
what you are saying, it seems like we do drop any events that occur
between the wakeup and the time when the input buffers contain useful
information.

I'm guessing that this window is pretty small though and when we do
end up missing an event, we can consider ourselves lucky because the
typical user reaction when their computer doesn't wake up is to start
pressing random keys on the keyboard and we would definitely catch
those.

> This pretty much means that /sys/power/wakeup_type needs to behave almost like
> /sys/power/wakeup_count, but is limited to a subset of wakeup sources.  That's
> why I was talking about splitting the wakeup count.
>
> So instead of adding an entirely new mechanics for that, why don't you add
> something like "priority" or "weight" to struct wakeup_source and assign
> higher values of that to the wakeup sources associated with the events
> you want to bring up the UI after resume?  And make those "higher-priority"
> wakeup sources use a separate wakeup counter, so you can easily verify if
> any of them has triggered by reading that or making it trigger a uevent if
> you want to?
>
>

Sounds good to me.



[1] https://chromium-review.googlesource.com/#/c/226932
[2] https://chromium-review.googlesource.com/#/c/226933

> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 12:31           ` Rafael J. Wysocki
@ 2015-05-07 16:54             ` One Thousand Gnomes
  2015-05-07 21:03               ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: One Thousand Gnomes @ 2015-05-07 16:54 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tomeu Vizoso, Chirantan Ekbote, Olof Johansson, John Stultz,
	Bastien Nocera, Linux Kernel Mailing List, snanda, Linux PM list

On Tue, 05 May 2015 14:31:26 +0200

> For example, when you wake up from S3 on ACPI-based systems, the best you
> can get is what devices have generated the wakeup events, but there's
> no input available from that (like you won't know which key has been
> pressed).  You may not get that even.  You may only know what GPEs have
> caused the wakeup to happen and they may be shared.
> 
> For PCI wakeup, the wakeup event may be out of band.  You need to walk
> the hierarchy and check the PME status bits to identify the wakeup device
> and then you need to be careful enough not to reset it while putting into
> D0 for the input data associated with the event to be available.  I'm not
> sure how many device/driver combinations this actually works for.
> 
> For USB wakeup, you get the wakeup event from the controller which may be
> a PCI device.  Getting to the USB device itself from there requires some
> work and even then the device may not "remember" what exactly happened.
> 
> Further, if you wake up via the PC keyboard from suspend-to-idle, the
> wakeup key code is not available, the only thing you know is that the
> interrupts has occured (that may be changed, but it's how the current
> code works).

It's probably got to change, otherwise once machines get able to sleep
between keypresses it's going to suck every time you pause and think for
a minute then begin typing. Remember display being off for suspend is
purely a limitation of most current display panels.


Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 20:58               ` Chirantan Ekbote
  2015-05-05 23:56                 ` Rafael J. Wysocki
@ 2015-05-07 17:03                 ` One Thousand Gnomes
  2015-05-07 18:21                   ` Chirantan Ekbote
  1 sibling, 1 reply; 25+ messages in thread
From: One Thousand Gnomes @ 2015-05-07 17:03 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: Alan Stern, Rafael J. Wysocki, John Stultz, Olof Johansson,
	Bastien Nocera, Linux Kernel Mailing List, snanda, Tomeu Vizoso,
	Linux PM list

> You are, of course, correct.  Ultimately the only requirement we have
> is that there exists a way for userspace to determine if the system
> woke up because of a user-triggered event.  The actual mechanism by

No. That is irrelevant. You need a way to ascertain if a user triggered
event has occurred since you suspended.

The two are not the same thing.

If your box wakes up due to something like a wireless card deciding it
needs to poke the base station and the user hits a key a microsecond
after wakeup then you want the display on.

The question is never "did the user wake the machine" the question is "did
the user do something that takes me out of 'lucid sleep/snooze/whatever'
since I suspended". Every user event could equally occur a microsecond
after a wakeup from a non user source, so every time you must ask the
"since suspend" question.

In fact if you had some kind of hypoethetical event counter incremented
by the device on it causing a wakeup event *or* an event while active
(and no way to tell them apat) that would provide a correct race free
interface to figure out if the display ought to be on

It doesn't solve the powering off as a key is hit race but that's a
different beast.

Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-07 17:03                 ` One Thousand Gnomes
@ 2015-05-07 18:21                   ` Chirantan Ekbote
  0 siblings, 0 replies; 25+ messages in thread
From: Chirantan Ekbote @ 2015-05-07 18:21 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Alan Stern, Rafael J. Wysocki, John Stultz, Olof Johansson,
	Bastien Nocera, Linux Kernel Mailing List, snanda, Tomeu Vizoso,
	Linux PM list

On Thu, May 7, 2015 at 10:03 AM, One Thousand Gnomes
<gnomes@lxorguk.ukuu.org.uk> wrote:
>> You are, of course, correct.  Ultimately the only requirement we have
>> is that there exists a way for userspace to determine if the system
>> woke up because of a user-triggered event.  The actual mechanism by
>
> No. That is irrelevant. You need a way to ascertain if a user triggered
> event has occurred since you suspended.
>
> The two are not the same thing.
>
> If your box wakes up due to something like a wireless card deciding it
> needs to poke the base station and the user hits a key a microsecond
> after wakeup then you want the display on.
>
> The question is never "did the user wake the machine" the question is "did
> the user do something that takes me out of 'lucid sleep/snooze/whatever'
> since I suspended". Every user event could equally occur a microsecond
> after a wakeup from a non user source, so every time you must ask the
> "since suspend" question.
>

Yes, this is what Rafael said earlier and if you had read my reply,
you would have seen that I have already admitted that this is a
situation that our current implementation does not handle properly.
However, this hasn't been a very big issue for us in practice because
1) the window during which an event could get dropped like this is
presumably very small and 2) the standard user behavior when their
computer doesn't wake up is to start pressing random keys, which we do
end up catching.  No, this is not a great user experience and we want
to fix it for the next version.

> In fact if you had some kind of hypoethetical event counter incremented
> by the device on it causing a wakeup event *or* an event while active
> (and no way to tell them apat) that would provide a correct race free
> interface to figure out if the display ought to be on
>

Again, this is basically what Rafael suggested earlier and I've
already said that it sounds like a perfectly reasonable solution to
me.

> It doesn't solve the powering off as a key is hit race but that's a
> different beast.
>
> Alan

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-07 16:54             ` One Thousand Gnomes
@ 2015-05-07 21:03               ` Rafael J. Wysocki
  2015-05-08  7:09                 ` Tomeu Vizoso
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-07 21:03 UTC (permalink / raw)
  To: One Thousand Gnomes
  Cc: Tomeu Vizoso, Chirantan Ekbote, Olof Johansson, John Stultz,
	Bastien Nocera, Linux Kernel Mailing List, snanda, Linux PM list

On Thursday, May 07, 2015 05:54:56 PM One Thousand Gnomes wrote:
> On Tue, 05 May 2015 14:31:26 +0200
> 
> > For example, when you wake up from S3 on ACPI-based systems, the best you
> > can get is what devices have generated the wakeup events, but there's
> > no input available from that (like you won't know which key has been
> > pressed).  You may not get that even.  You may only know what GPEs have
> > caused the wakeup to happen and they may be shared.
> > 
> > For PCI wakeup, the wakeup event may be out of band.  You need to walk
> > the hierarchy and check the PME status bits to identify the wakeup device
> > and then you need to be careful enough not to reset it while putting into
> > D0 for the input data associated with the event to be available.  I'm not
> > sure how many device/driver combinations this actually works for.
> > 
> > For USB wakeup, you get the wakeup event from the controller which may be
> > a PCI device.  Getting to the USB device itself from there requires some
> > work and even then the device may not "remember" what exactly happened.
> > 
> > Further, if you wake up via the PC keyboard from suspend-to-idle, the
> > wakeup key code is not available, the only thing you know is that the
> > interrupts has occured (that may be changed, but it's how the current
> > code works).
> 
> It's probably got to change, otherwise once machines get able to sleep
> between keypresses it's going to suck every time you pause and think for
> a minute then begin typing. Remember display being off for suspend is
> purely a limitation of most current display panels.

Right.

It is just one example, though.

Take a PCI device in D3hot for another one.  It may not even have a buffer
to store input data while in that state.  The only thing it may be able to
do is to signal a PME from it.

Rafael


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-06 17:40           ` Chirantan Ekbote
@ 2015-05-07 23:19             ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-07 23:19 UTC (permalink / raw)
  To: Chirantan Ekbote
  Cc: John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Wednesday, May 06, 2015 10:40:53 AM Chirantan Ekbote wrote:
> On Tue, May 5, 2015 at 4:47 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Monday, May 04, 2015 04:30:00 PM Chirantan Ekbote wrote:
> >
> > <snip>
> >
> >> Ok so now I can talk about how lucid sleep fits into all of this.  The
> >> power manager only considers a suspend attempt successful if the write
> >> to /sys/power/state was successful.  If the suspend was successful,
> >> the power manager then reads another sysfs file
> >> (/sys/power/wakeup_type in our kernel) to determine if the wakeup
> >> event was user-triggered.
> >
> > Well, that's where things are likely to get ugly depending on how the
> > /sys/power/wakeup_type attribute works (more below).
> >
> >> If the event was user-triggered it sends
> >> out a DBus signal announcing the end of the suspend, Chrome thaws its
> >> renderer processes, the full UI comes back up, and the user can start
> >> working.  If the event was _not_ user-triggred (if it was the RTC or
> >> NIC), the power manager sends out a different DBus signal announcing
> >> that the system is in lucid sleep and will re-suspend soon.  It will
> >> then wait for all registered applications to report readiness to
> >> suspend or for the max timeout to expire.
> >
> > First let me say that the "user-triggered" vs "non-user-triggered" distinction
> > seems somewhat artificial to me.  All boils down to having a special class
> > of wakeup events that are supposed to make the power manager behave differently
> > after resuming.  Whether or not they are actually triggered by the user
> > doesn't really matter technically.
> >
> 
> This is true.  It's just easier to talk about them as user-triggered
> or not.  For reference, [1] and [2] are the patches that implement
> wakeup_type in our kernel.
> 
> >> If it so happens that a user interacts with the system while it is in
> >> this state, the power manager detects it via the user activity message
> >> that Chrome sends.  This could be real (keyboard, trackpad) or
> >> simulated (lid, power button).  Either way, the power manager responds
> >> the same way: it announces the end of the suspend, Chrome thaws its
> >> renderers, the full UI comes back up, and the user starts working.  If
> >> there is no user activity and all applications report ready, the power
> >> manager gets ready to suspend the system again.  Since the NIC is now
> >> a wakeup source, the power manager doesn't read the wakeup_count until
> >> after all applications have reported ready because accessing the
> >> network could increment the wakeup_count and cause false positives.
> >>
> >> If either the write to /sys/power/wakeup_count or /sys/power/state
> >> fails from lucid sleep, the power manager re-announces that the system
> >> is in lucid sleep and will re-suspend soon.  It's actually a little
> >> smart about this: it will only re-announce lucid sleep if there was a
> >> wakeup_count mismatch or if the write to /sys/power/state returned
> >> EBUSY.  Other failures only trigger a simple retry and no DBus signal.
> >> We do this because these wakeup events may legitimately trigger lucid
> >> sleep.  For example, more packets may arrive from the network or the
> >> RTC may go off and applications don't perform work until they hear
> >> from the power manager that the system is in lucid sleep.  At this
> >> point the power manager is back to waiting for applications to report
> >> ready (or for the retry timer to fire).  This process may repeat
> >> multiple times if we keep getting wakeup events right as the system
> >> tries to re-suspend.
> >>
> >>
> >> So that was a little long-winded but hopefully I've addressed all your
> >> concerns about potential race conditions in this code.  I simplified a
> >> few bits because would just complicate the discussion but for the most
> >> part this is how the feature works now.  Having the kernel emit a
> >> uevent with the wakeup event type would take the place of the power
> >> manager reading from /sys/power/wakeup_type in this system but
> >> wouldn't really affect anything else.
> >
> > Which loops back to my previous remark: Things may get ugly if /sys/power/wakeup_type
> > doesn't do the right thing (the uevent mechanics you'd like to replace it with
> > will really need to do the same, so I'm not quite sure it's worth the effort).
> >
> > Namely, it really has to cover all events that might have woken you up and
> > happened before stuff has started to be added to the input buffers that Chrome
> > cares about.  It is difficult to identify the exact point where that takes place
> > in the resume sequence, but it should be somewhere in dpm_resume_end().  Why so?
> > Because it really doesn't matter why exactly the system is waking up.  What
> > matters is whether or not an event that you should react to by bringing up the
> > UI happens *at* *any* *time* between (and including) the actual wakeup and the
> > point when you can rely on the input buffers to contain any useful information
> > consumable by Chrome.
> >
> 
> So this is something that we don't catch right now.  Our
> implementation queries the firmware via an ACPI call to get the wakeup
> source and I was assuming that any events after that *would*
> eventually make their way up to Chrome or userspace.  But based on
> what you are saying, it seems like we do drop any events that occur
> between the wakeup and the time when the input buffers contain useful
> information.
> 
> I'm guessing that this window is pretty small though and when we do
> end up missing an event, we can consider ourselves lucky because the
> typical user reaction when their computer doesn't wake up is to start
> pressing random keys on the keyboard and we would definitely catch
> those.

That is, provided that the "user" actually is a human at the keyboard which
need not be the case in the general situation.

> > This pretty much means that /sys/power/wakeup_type needs to behave almost like
> > /sys/power/wakeup_count, but is limited to a subset of wakeup sources.  That's
> > why I was talking about splitting the wakeup count.
> >
> > So instead of adding an entirely new mechanics for that, why don't you add
> > something like "priority" or "weight" to struct wakeup_source and assign
> > higher values of that to the wakeup sources associated with the events
> > you want to bring up the UI after resume?  And make those "higher-priority"
> > wakeup sources use a separate wakeup counter, so you can easily verify if
> > any of them has triggered by reading that or making it trigger a uevent if
> > you want to?
> >
> >
> 
> Sounds good to me.

OK

So I can prototype high-level support for that in the wakeup sources framework.

In addition to that we would need to ensure that the wakeup sources would be
activated in response to the wakeup interrupts handled by the IRQ core between
suspendind and resuming device IRQs.  That is slightly less straightforward
than one might think, but should be doable.

Would you be interested in that?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-07 21:03               ` Rafael J. Wysocki
@ 2015-05-08  7:09                 ` Tomeu Vizoso
  0 siblings, 0 replies; 25+ messages in thread
From: Tomeu Vizoso @ 2015-05-08  7:09 UTC (permalink / raw)
  To: Rafael J. Wysocki, One Thousand Gnomes
  Cc: Chirantan Ekbote, Olof Johansson, John Stultz, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Linux PM list

On 05/07/2015 11:03 PM, Rafael J. Wysocki wrote:
> On Thursday, May 07, 2015 05:54:56 PM One Thousand Gnomes wrote:
>> On Tue, 05 May 2015 14:31:26 +0200
>>
>>> For example, when you wake up from S3 on ACPI-based systems, the best you
>>> can get is what devices have generated the wakeup events, but there's
>>> no input available from that (like you won't know which key has been
>>> pressed).  You may not get that even.  You may only know what GPEs have
>>> caused the wakeup to happen and they may be shared.
>>>
>>> For PCI wakeup, the wakeup event may be out of band.  You need to walk
>>> the hierarchy and check the PME status bits to identify the wakeup device
>>> and then you need to be careful enough not to reset it while putting into
>>> D0 for the input data associated with the event to be available.  I'm not
>>> sure how many device/driver combinations this actually works for.
>>>
>>> For USB wakeup, you get the wakeup event from the controller which may be
>>> a PCI device.  Getting to the USB device itself from there requires some
>>> work and even then the device may not "remember" what exactly happened.
>>>
>>> Further, if you wake up via the PC keyboard from suspend-to-idle, the
>>> wakeup key code is not available, the only thing you know is that the
>>> interrupts has occured (that may be changed, but it's how the current
>>> code works).
>>
>> It's probably got to change, otherwise once machines get able to sleep
>> between keypresses it's going to suck every time you pause and think for
>> a minute then begin typing. Remember display being off for suspend is
>> purely a limitation of most current display panels.
> 
> Right.
> 
> It is just one example, though.
> 
> Take a PCI device in D3hot for another one.  It may not even have a buffer
> to store input data while in that state.  The only thing it may be able to
> do is to signal a PME from it.

Yeah, I tried to make clear that I don't think that this is generally
achievable. But in the ChromeOS hardware that I have here, the input
event is there for userspace to read when it wakes up.

But if there's traction for adding upstream a more generic mechanism
that works in a broader range of machines, I'm all for it.

Regards,

Tomeu


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-05 23:47         ` Rafael J. Wysocki
  2015-05-06 17:40           ` Chirantan Ekbote
@ 2015-05-11 22:12           ` Pavel Machek
  2015-05-12  0:45             ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2015-05-11 22:12 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Chirantan Ekbote, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

Hi!

> > If the event was user-triggered it sends
> > out a DBus signal announcing the end of the suspend, Chrome thaws its
> > renderer processes, the full UI comes back up, and the user can start
> > working.  If the event was _not_ user-triggred (if it was the RTC or
> > NIC), the power manager sends out a different DBus signal announcing
> > that the system is in lucid sleep and will re-suspend soon.  It will
> > then wait for all registered applications to report readiness to
> > suspend or for the max timeout to expire.
> 
> First let me say that the "user-triggered" vs "non-user-triggered" distinction
> seems somewhat artificial to me.  All boils down to having a special class
> of wakeup events that are supposed to make the power manager behave differently
> after resuming.  Whether or not they are actually triggered by the user
> doesn't really matter technically.
...
> > So that was a little long-winded but hopefully I've addressed all your
> > concerns about potential race conditions in this code.  I simplified a
> > few bits because would just complicate the discussion but for the most
> > part this is how the feature works now.  Having the kernel emit a
> > uevent with the wakeup event type would take the place of the power
> > manager reading from /sys/power/wakeup_type in this system but
> > wouldn't really affect anything else.
> 
> Which loops back to my previous remark: Things may get ugly if /sys/power/wakeup_type
> doesn't do the right thing (the uevent mechanics you'd like to replace it with
> will really need to do the same, so I'm not quite sure it's worth the effort).
> 
> Namely, it really has to cover all events that might have woken you up and
> happened before stuff has started to be added to the input buffers that Chrome
> cares about.  It is difficult to identify the exact point where that takes place
> in the resume sequence, but it should be somewhere in dpm_resume_end().  Why so?
> Because it really doesn't matter why exactly the system is waking up.  What
> matters is whether or not an event that you should react to by bringing up the
> UI happens *at* *any* *time* between (and including) the actual wakeup and the
> point when you can rely on the input buffers to contain any useful information
> consumable by Chrome.
> 
> This pretty much means that /sys/power/wakeup_type needs to behave almost like
> /sys/power/wakeup_count, but is limited to a subset of wakeup sources.  That's
> why I was talking about splitting the wakeup count.
> 
> So instead of adding an entirely new mechanics for that, why don't you add
> something like "priority" or "weight" to struct wakeup_source and assign
> higher values of that to the wakeup sources associated with the events
> you want to bring up the UI after resume?  And make those "higher-priority"
> wakeup sources use a separate wakeup counter, so you can easily verify if
> any of them has triggered by reading that or making it trigger a uevent if
> you want to?

Does it do all we want? What if one device wants to generate both
"normal" and "higher-priority" wakeup events? (*)

Should not we have normal interface for keyboard (and similar devices)
where we could ask "did something interesting happen while we were
sleeping"? Actually.. maybe the device can queue the events
that happened during sleep, and deliver them after wakeup? If user
pressed key during sleep, you should have key event waiting on
/dev/input/event3...

									Pavel
(*) Ethernet card might be an example. If machine received wake-on-lan
packet, it will want to wake up with screen on. If machine received
normal packet, it might want to process the packet and get back to sleep.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: A desktop environment[1] kernel wishlist
  2015-05-11 22:12           ` Pavel Machek
@ 2015-05-12  0:45             ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2015-05-12  0:45 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Chirantan Ekbote, John Stultz, Olof Johansson, Bastien Nocera,
	Linux Kernel Mailing List, snanda, Tomeu Vizoso, Linux PM list

On Tuesday, May 12, 2015 12:12:30 AM Pavel Machek wrote:
> Hi!
> 
> > > If the event was user-triggered it sends
> > > out a DBus signal announcing the end of the suspend, Chrome thaws its
> > > renderer processes, the full UI comes back up, and the user can start
> > > working.  If the event was _not_ user-triggred (if it was the RTC or
> > > NIC), the power manager sends out a different DBus signal announcing
> > > that the system is in lucid sleep and will re-suspend soon.  It will
> > > then wait for all registered applications to report readiness to
> > > suspend or for the max timeout to expire.
> > 
> > First let me say that the "user-triggered" vs "non-user-triggered" distinction
> > seems somewhat artificial to me.  All boils down to having a special class
> > of wakeup events that are supposed to make the power manager behave differently
> > after resuming.  Whether or not they are actually triggered by the user
> > doesn't really matter technically.
> ...
> > > So that was a little long-winded but hopefully I've addressed all your
> > > concerns about potential race conditions in this code.  I simplified a
> > > few bits because would just complicate the discussion but for the most
> > > part this is how the feature works now.  Having the kernel emit a
> > > uevent with the wakeup event type would take the place of the power
> > > manager reading from /sys/power/wakeup_type in this system but
> > > wouldn't really affect anything else.
> > 
> > Which loops back to my previous remark: Things may get ugly if /sys/power/wakeup_type
> > doesn't do the right thing (the uevent mechanics you'd like to replace it with
> > will really need to do the same, so I'm not quite sure it's worth the effort).
> > 
> > Namely, it really has to cover all events that might have woken you up and
> > happened before stuff has started to be added to the input buffers that Chrome
> > cares about.  It is difficult to identify the exact point where that takes place
> > in the resume sequence, but it should be somewhere in dpm_resume_end().  Why so?
> > Because it really doesn't matter why exactly the system is waking up.  What
> > matters is whether or not an event that you should react to by bringing up the
> > UI happens *at* *any* *time* between (and including) the actual wakeup and the
> > point when you can rely on the input buffers to contain any useful information
> > consumable by Chrome.
> > 
> > This pretty much means that /sys/power/wakeup_type needs to behave almost like
> > /sys/power/wakeup_count, but is limited to a subset of wakeup sources.  That's
> > why I was talking about splitting the wakeup count.
> > 
> > So instead of adding an entirely new mechanics for that, why don't you add
> > something like "priority" or "weight" to struct wakeup_source and assign
> > higher values of that to the wakeup sources associated with the events
> > you want to bring up the UI after resume?  And make those "higher-priority"
> > wakeup sources use a separate wakeup counter, so you can easily verify if
> > any of them has triggered by reading that or making it trigger a uevent if
> > you want to?
> 
> Does it do all we want?

I believe so.

> What if one device wants to generate both "normal" and "higher-priority"
> wakeup events? (*)
>
> Should not we have normal interface for keyboard (and similar devices)
> where we could ask "did something interesting happen while we were
> sleeping"? Actually.. maybe the device can queue the events
> that happened during sleep, and deliver them after wakeup? If user
> pressed key during sleep, you should have key event waiting on
> /dev/input/event3...

If it can queue up all of them, it can be "normal" priority just fine
and user space can read all the queue and decide what to do then.

The "high-priority" idea is for devices that can't do that at least at one
point during the suspend-resume cycle.  In those cases we can't simply go
back and check what the event was, so we need to rely on the device's
"importance" or "class" (with respect to wakeup).


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-05-12  0:20 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1413881397.30379.7.camel@hadess.net>
     [not found] ` <CAOesGMg6UBuF=OJ-JdAUx-sD5MnLL8+Ag=xeJFmr-Dxezti2YA@mail.gmail.com>
     [not found]   ` <CAJFHJrp8R+Na73W2Jx6BJ+JnHTGZfd61d+EracgehS+ZQoZOcg@mail.gmail.com>
2015-05-04 22:12     ` A desktop environment[1] kernel wishlist Rafael J. Wysocki
2015-05-04 23:30       ` Chirantan Ekbote
2015-05-05 10:46         ` Bastien Nocera
2015-05-05 19:22           ` Chirantan Ekbote
2015-05-06 12:41             ` Bastien Nocera
2015-05-05 14:39         ` Alan Stern
2015-05-05 17:58           ` Chirantan Ekbote
2015-05-05 19:35             ` Alan Stern
2015-05-05 20:58               ` Chirantan Ekbote
2015-05-05 23:56                 ` Rafael J. Wysocki
2015-05-05 23:38                   ` David Lang
2015-05-05 23:51                     ` Rafael J. Wysocki
2015-05-07 17:03                 ` One Thousand Gnomes
2015-05-07 18:21                   ` Chirantan Ekbote
2015-05-05 23:47         ` Rafael J. Wysocki
2015-05-06 17:40           ` Chirantan Ekbote
2015-05-07 23:19             ` Rafael J. Wysocki
2015-05-11 22:12           ` Pavel Machek
2015-05-12  0:45             ` Rafael J. Wysocki
     [not found]     ` <CAAObsKDYV=Hyz0XcPkyD6=hu4WKk7PHGdSMOCZxbh_vMKHS32Q@mail.gmail.com>
2015-05-04 22:19       ` Rafael J. Wysocki
2015-05-05  6:05         ` Tomeu Vizoso
2015-05-05 12:31           ` Rafael J. Wysocki
2015-05-07 16:54             ` One Thousand Gnomes
2015-05-07 21:03               ` Rafael J. Wysocki
2015-05-08  7:09                 ` Tomeu Vizoso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox