From: Grygorii Strashko <grygorii.strashko@ti.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Alan Stern <stern@rowland.harvard.edu>
Cc: Thierry Reding <thierry.reding@gmail.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Tomeu Vizoso <tomeu.vizoso@collabora.com>,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] driver core: Ensure proper suspend/resume ordering
Date: Wed, 16 Sep 2015 16:39:59 +0300 [thread overview]
Message-ID: <55F9712F.1030305@ti.com> (raw)
In-Reply-To: <5347923.qphJvWEeRm@vostro.rjw.lan>
Hi Thierry, Alan, Rafael,
On 09/12/2015 01:28 AM, Rafael J. Wysocki wrote:
> On Friday, September 11, 2015 03:01:14 PM Alan Stern wrote:
>> On Fri, 11 Sep 2015, Thierry Reding wrote:
>>
>>> On Fri, Sep 11, 2015 at 12:08:02AM +0200, Rafael J. Wysocki wrote:
>>>> On Thursday, September 10, 2015 12:19:03 PM Thierry Reding wrote:
>>>>> From: Thierry Reding <treding@nvidia.com>
>>>>>
>>>>> Deferred probe can lead to strange situations where a device that is a
>>>>> dependency for others will be moved to the end of the dpm_list. At the
>>>>> same time the dependers may not be moved because at the time they will
>>>>> be probed the dependee may already have been successfully reprobed and
>>>>> they will not have to defer the probe themselves.
>>>>
>>>> So there's a bug in the implementation of deferred probing IMO.
>>>
>>> Well, yeah. The root problem here is that we don't have dependency
>>> information and deferred probing is supposed to fix that. It does so
>>> fairly well, but it breaks in this particular case.
Actually this is the old problem and deferred probing just makes it more
reproducible.
Lets say, we have device P-producer and device C-consumer.
- devices are created/registered in the following order (It doesn't matter
how they are created - dt, platform,..)
-- device C created
-- device P created
- devices probing order (legacy probing order depends on Makefiles and
initcalls - no deferred probing)
-- device P probed
-- device C probed
- suspend order
-- device P suspended
-- device C suspended (ops, device C may still need device P)
To W/A such issues:
- driver's suspend code can be shifted between suspend layers (from
.suspend() to .suspend_noirq(), for example)
- device's registration order can be changed manually in DT or platform code
- can play with initcalls
May be it was enough before, but now it's not :(, simply because, now
there are much more generic/common frameworks in Kernel:
gpio, regulators, dma ++ clk, syscon, phy, extcon, component, display framework..
As result, It introduces more devices and dependencies between devices and,
therefore, suspend ordering issue can be hit more often.
>>>
>>>>> One example where this happens is the Jetson TK1 board (Tegra124). The
>>>>> gpio-keys driver exposes the power key of the board as an input device
>>>>> that can also be used as a wakeup source. Commit 17cdddf0fb68 ("ARM:
>>>>> tegra: Add gpio-ranges property") results in the gpio-tegra driver
>>>>> deferring probe because one of its dependencies, the pinctrl-tegra
>>>>> driver, has not successfully completed probing. Currently the deferred
>>>>> probe code will move the corresponding gpio-tegra device to the end of
>>>>> the dpm_list, but by the time the gpio-keys device, depending on the
>>>>> gpio-tegra device, is probed, gpio-tegra has already been reprobed, so
>>>>> the gpio-keys device is not moved to the end of dpm_list itself. As a
>>>>> result, the suspend ordering becomes pinctrl-tegra -> gpio-keys ->
>>>>> gpio-tegra. That's problematic because the gpio-keys driver requests
>>>>> the power key to be a wakeup source. However, the programming of the
>>>>> wakeup interrupt registers happens in the gpio-tegra driver's suspend
>>>>> callback, which is now called before that of the gpio-keys driver. The
>>>>> result is that the wrong values are programmed and leaves the system
>>>>> unable to be resumed using the power key.
>>>>>
>>>>> To fix this situation, always move devices to the end of the dpm_list
>>>>> before probing them. Technically this should only be done for devices
>>>>> that have been successfully probed, but that won't work for recursive
>>>>> probing of devices (think an I2C master that instantiates children in
>>>>> its ->probe()). Effectively the dpm_list will end up ordered the same
>>>>> way that devices were probed, hence taking care of dependencies.
>>
>> I'm not worried about the overhead involved in moving a device to the
>> end of the list every time it is probed. That ought to be relatively
>> small.
>>
>> There are a few things to watch out for. Since the dpm_list gets
>> modified during system sleep transitions, we would have to make sure
>> that nothing gets probed during those times. In principle, that's what
>> the "prepare" stage is meant for, but there's still a race. As long as
>> no other kernel thread (such as the deferred probing mechanism) tries
>> to probe a device once everything has been frozen, we should be okay.
>> But if not, there will be trouble -- after the ->prepare callback runs,
>> the device is no longer on the dpm_list and so we don't want this patch
>> to put it back on that list.
>
I think, It should prohibited to probe devices during suspend/hibernation.
And solution introduced in this patch might help to fix it -
in general, we could do :
- add sync point on suspend enter: wait_for_device_probe() and
- prohibit probing: move all devices which will request probing into
deferred_probe list
- one suspend exit: allow probing and do driver_deferred_probe_trigger
>
>> There's also an issue about other types of dependencies. For instance,
>> it's conceivable that device B might be discovered and depend on device
>> A, even before A has been bound to a driver. (B might be discovered by
>> A's subsystem rather than A's driver.) In that case, moving A to the
>> end of the list would cause B to come before A even though B depends on
>> A. Of course, deferred probing already has this problem.
>
> That may actually happen for PCIe ports and devices below them AFAICS.
>
> Devices below PCIe ports are discovered by the PCI subsystem and the PCIe
> ports need not be probed before those devices are probed.
I'd like to mention here that this patch will work only
if dmp_list will be filled according device creation order ("parent<-child" dependencies)
*AND* according device's probing order ("supplier<-consumer").
So, if there is the case when Parent device can be probed AFTER its children
- it will not work, because "parent<-child" dependencies will not be tracked
any more :( Sry, I could not even imagine that such crazy case exist :'(
Are there any other subsystems with the same behavior like PCI?
If not - probably, it could be fixed in PCI subsystem using device_pm_move_after() or
device_move() in PCIe ports probe.
if yes - ... maybe we can scan/re-check and reorder dpm_list on suspend enter and
restore ("parent<-child" dependencies).
>
>> An easy way to check for this sort of thing would be to verify that a
>> device about to be probed doesn't have any children. This wouldn't
>> catch all the potential dependencies, but it would be a reasonable
>> start.
>
> That would address the PCIe ports issue.
>
that might work also.
Truth is that smth. need to be done 100%. Personally, I was hit by this issue also,
and it cost me 3 hours of debugging and I came up with the same patch as
Bill Huang, then spent some time trying to understand what is wrong with PCI
- finally, I've just changed the order of my devices in DT :)
Also, I think, it will be good to have this patch in -next to collect more feedbacks.
--
regards,
-grygorii
WARNING: multiple messages have this Message-ID (diff)
From: Grygorii Strashko <grygorii.strashko@ti.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Alan Stern <stern@rowland.harvard.edu>
Cc: Thierry Reding <thierry.reding@gmail.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Tomeu Vizoso <tomeu.vizoso@collabora.com>,
<linux-pm@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] driver core: Ensure proper suspend/resume ordering
Date: Wed, 16 Sep 2015 16:39:59 +0300 [thread overview]
Message-ID: <55F9712F.1030305@ti.com> (raw)
In-Reply-To: <5347923.qphJvWEeRm@vostro.rjw.lan>
Hi Thierry, Alan, Rafael,
On 09/12/2015 01:28 AM, Rafael J. Wysocki wrote:
> On Friday, September 11, 2015 03:01:14 PM Alan Stern wrote:
>> On Fri, 11 Sep 2015, Thierry Reding wrote:
>>
>>> On Fri, Sep 11, 2015 at 12:08:02AM +0200, Rafael J. Wysocki wrote:
>>>> On Thursday, September 10, 2015 12:19:03 PM Thierry Reding wrote:
>>>>> From: Thierry Reding <treding@nvidia.com>
>>>>>
>>>>> Deferred probe can lead to strange situations where a device that is a
>>>>> dependency for others will be moved to the end of the dpm_list. At the
>>>>> same time the dependers may not be moved because at the time they will
>>>>> be probed the dependee may already have been successfully reprobed and
>>>>> they will not have to defer the probe themselves.
>>>>
>>>> So there's a bug in the implementation of deferred probing IMO.
>>>
>>> Well, yeah. The root problem here is that we don't have dependency
>>> information and deferred probing is supposed to fix that. It does so
>>> fairly well, but it breaks in this particular case.
Actually this is the old problem and deferred probing just makes it more
reproducible.
Lets say, we have device P-producer and device C-consumer.
- devices are created/registered in the following order (It doesn't matter
how they are created - dt, platform,..)
-- device C created
-- device P created
- devices probing order (legacy probing order depends on Makefiles and
initcalls - no deferred probing)
-- device P probed
-- device C probed
- suspend order
-- device P suspended
-- device C suspended (ops, device C may still need device P)
To W/A such issues:
- driver's suspend code can be shifted between suspend layers (from
.suspend() to .suspend_noirq(), for example)
- device's registration order can be changed manually in DT or platform code
- can play with initcalls
May be it was enough before, but now it's not :(, simply because, now
there are much more generic/common frameworks in Kernel:
gpio, regulators, dma ++ clk, syscon, phy, extcon, component, display framework..
As result, It introduces more devices and dependencies between devices and,
therefore, suspend ordering issue can be hit more often.
>>>
>>>>> One example where this happens is the Jetson TK1 board (Tegra124). The
>>>>> gpio-keys driver exposes the power key of the board as an input device
>>>>> that can also be used as a wakeup source. Commit 17cdddf0fb68 ("ARM:
>>>>> tegra: Add gpio-ranges property") results in the gpio-tegra driver
>>>>> deferring probe because one of its dependencies, the pinctrl-tegra
>>>>> driver, has not successfully completed probing. Currently the deferred
>>>>> probe code will move the corresponding gpio-tegra device to the end of
>>>>> the dpm_list, but by the time the gpio-keys device, depending on the
>>>>> gpio-tegra device, is probed, gpio-tegra has already been reprobed, so
>>>>> the gpio-keys device is not moved to the end of dpm_list itself. As a
>>>>> result, the suspend ordering becomes pinctrl-tegra -> gpio-keys ->
>>>>> gpio-tegra. That's problematic because the gpio-keys driver requests
>>>>> the power key to be a wakeup source. However, the programming of the
>>>>> wakeup interrupt registers happens in the gpio-tegra driver's suspend
>>>>> callback, which is now called before that of the gpio-keys driver. The
>>>>> result is that the wrong values are programmed and leaves the system
>>>>> unable to be resumed using the power key.
>>>>>
>>>>> To fix this situation, always move devices to the end of the dpm_list
>>>>> before probing them. Technically this should only be done for devices
>>>>> that have been successfully probed, but that won't work for recursive
>>>>> probing of devices (think an I2C master that instantiates children in
>>>>> its ->probe()). Effectively the dpm_list will end up ordered the same
>>>>> way that devices were probed, hence taking care of dependencies.
>>
>> I'm not worried about the overhead involved in moving a device to the
>> end of the list every time it is probed. That ought to be relatively
>> small.
>>
>> There are a few things to watch out for. Since the dpm_list gets
>> modified during system sleep transitions, we would have to make sure
>> that nothing gets probed during those times. In principle, that's what
>> the "prepare" stage is meant for, but there's still a race. As long as
>> no other kernel thread (such as the deferred probing mechanism) tries
>> to probe a device once everything has been frozen, we should be okay.
>> But if not, there will be trouble -- after the ->prepare callback runs,
>> the device is no longer on the dpm_list and so we don't want this patch
>> to put it back on that list.
>
I think, It should prohibited to probe devices during suspend/hibernation.
And solution introduced in this patch might help to fix it -
in general, we could do :
- add sync point on suspend enter: wait_for_device_probe() and
- prohibit probing: move all devices which will request probing into
deferred_probe list
- one suspend exit: allow probing and do driver_deferred_probe_trigger
>
>> There's also an issue about other types of dependencies. For instance,
>> it's conceivable that device B might be discovered and depend on device
>> A, even before A has been bound to a driver. (B might be discovered by
>> A's subsystem rather than A's driver.) In that case, moving A to the
>> end of the list would cause B to come before A even though B depends on
>> A. Of course, deferred probing already has this problem.
>
> That may actually happen for PCIe ports and devices below them AFAICS.
>
> Devices below PCIe ports are discovered by the PCI subsystem and the PCIe
> ports need not be probed before those devices are probed.
I'd like to mention here that this patch will work only
if dmp_list will be filled according device creation order ("parent<-child" dependencies)
*AND* according device's probing order ("supplier<-consumer").
So, if there is the case when Parent device can be probed AFTER its children
- it will not work, because "parent<-child" dependencies will not be tracked
any more :( Sry, I could not even imagine that such crazy case exist :'(
Are there any other subsystems with the same behavior like PCI?
If not - probably, it could be fixed in PCI subsystem using device_pm_move_after() or
device_move() in PCIe ports probe.
if yes - ... maybe we can scan/re-check and reorder dpm_list on suspend enter and
restore ("parent<-child" dependencies).
>
>> An easy way to check for this sort of thing would be to verify that a
>> device about to be probed doesn't have any children. This wouldn't
>> catch all the potential dependencies, but it would be a reasonable
>> start.
>
> That would address the PCIe ports issue.
>
that might work also.
Truth is that smth. need to be done 100%. Personally, I was hit by this issue also,
and it cost me 3 hours of debugging and I came up with the same patch as
Bill Huang, then spent some time trying to understand what is wrong with PCI
- finally, I've just changed the order of my devices in DT :)
Also, I think, it will be good to have this patch in -next to collect more feedbacks.
--
regards,
-grygorii
next prev parent reply other threads:[~2015-09-16 13:40 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-10 10:19 [PATCH] driver core: Ensure proper suspend/resume ordering Thierry Reding
2015-09-10 10:58 ` Grygorii Strashko
2015-09-10 10:58 ` Grygorii Strashko
2015-09-10 22:08 ` Rafael J. Wysocki
2015-09-11 12:03 ` Thierry Reding
2015-09-11 19:01 ` Alan Stern
2015-09-11 19:01 ` Alan Stern
2015-09-11 22:28 ` Rafael J. Wysocki
2015-09-12 17:40 ` Alan Stern
2015-09-12 17:40 ` Alan Stern
2015-09-15 0:40 ` Rafael J. Wysocki
2015-09-15 14:23 ` Alan Stern
2015-09-16 1:04 ` Rafael J. Wysocki
2015-09-16 13:39 ` Grygorii Strashko [this message]
2015-09-16 13:39 ` Grygorii Strashko
2015-09-16 19:27 ` Alan Stern
2015-09-16 19:27 ` Alan Stern
2015-09-17 0:07 ` Rafael J. Wysocki
2015-09-17 15:48 ` Grygorii Strashko
2015-09-17 15:48 ` Grygorii Strashko
2015-09-17 23:59 ` Rafael J. Wysocki
2015-09-18 0:03 ` Rafael J. Wysocki
2015-10-01 18:14 ` Grygorii Strashko
2015-09-15 16:10 ` Thierry Reding
2015-09-15 19:18 ` Alan Stern
2015-09-15 19:18 ` Alan Stern
2015-09-16 1:28 ` Rafael J. Wysocki
2015-09-16 13:38 ` Grygorii Strashko
2015-09-16 13:38 ` Grygorii Strashko
2015-09-16 16:58 ` Alan Stern
2015-09-16 16:58 ` Alan Stern
2015-09-16 17:08 ` Alan Stern
2015-09-16 17:08 ` Alan Stern
2015-09-16 13:16 ` Thierry Reding
2015-09-16 19:22 ` Alan Stern
2015-09-16 19:22 ` Alan Stern
2015-09-17 0:27 ` Rafael J. Wysocki
2015-09-17 2:04 ` Rafael J. Wysocki
2015-09-17 17:02 ` Alan Stern
2015-09-17 18:43 ` Rafael J. Wysocki
2015-09-17 21:06 ` Alan Stern
2015-09-18 0:13 ` Rafael J. Wysocki
2015-09-18 15:55 ` Thierry Reding
2015-09-18 23:07 ` Rafael J. Wysocki
2015-09-21 8:51 ` Thierry Reding
2015-09-21 14:34 ` Alan Stern
2015-09-22 0:26 ` Rafael J. Wysocki
2015-09-22 1:22 ` Alan Stern
2015-09-22 0:39 ` Rafael J. Wysocki
2015-09-17 5:40 ` Tomeu Vizoso
2015-09-17 18:15 ` Rafael J. Wysocki
2015-09-17 19:21 ` Alan Stern
2015-09-11 22:38 ` Rafael J. Wysocki
2015-09-15 15:53 ` Thierry Reding
2015-09-16 1:18 ` Rafael J. Wysocki
2015-09-16 13:38 ` Grygorii Strashko
2015-09-16 13:38 ` Grygorii Strashko
2015-09-17 0:55 ` Rafael J. Wysocki
2015-09-11 14:05 ` Alan Stern
2015-09-11 14:05 ` Alan Stern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55F9712F.1030305@ti.com \
--to=grygorii.strashko@ti.com \
--cc=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=rafael.j.wysocki@intel.com \
--cc=rjw@rjwysocki.net \
--cc=stern@rowland.harvard.edu \
--cc=thierry.reding@gmail.com \
--cc=tomeu.vizoso@collabora.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.