* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
[not found] ` <CAPwe5RMpdG1ziRAwDhqkxuzHX0x=SdFQRFUbPCVuir1OgE90YQ@mail.gmail.com>
@ 2025-08-27 14:52 ` Alan Stern
2025-08-27 18:56 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2025-08-27 14:52 UTC (permalink / raw)
To: ryan zhou, Rafael J. Wysocki
Cc: Roy Luo, Thinh.Nguyen, gregkh, linux-usb, linux-kernel, linux-pm
Ryan:
You should present your questions to the maintainer of the kernel's
Power Management subsystem, Rafael Wysocki (added to the To: list for
this email).
Alan Stern
On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> Hi Roy,
> Thank you for reviewing my patch.
> >
> > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> >
> No, in the following case, the parent device will not be reviewed
> before resuming the child device.
> Taking the 'imx8mp-dwc3' driver as an example.
> Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> suspend state firstly, followed by
> the parent device imx8mp-dwc3 enters runtime suspend
> flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> Step2.system deep trigger:consistent with the runtime suspend flow,
> child enters pm suspend and followed
> by parent
> flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> task terminated the system suspend process
> . The system will resume from the checkpoint, and resume devices in
> the suspended state in the reverse
> of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> did not execute the suspend process.
>
> >
> >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> >
> Following the above analysis, dwc3_resume calls
> pm_runtime_set_active(dev), it checks the
> parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
>
> >
> >What is the glue driver that's being used here? Knowing what's being done in the glue driver pm callbacks
> >would help in understanding the issue.
> >
> Refer to the driver 'dwc3-imx8mp.c' please, maybe you could help me
> find a better solution.
>
>
> Thanks,
> ryan
>
> Roy Luo <royluo@google.com> 于2025年8月27日周三 02:38写道:
> >
> > On Tue, Aug 26, 2025 at 8:12 AM Ryan Zhou <ryanzhou54@gmail.com> wrote:
> > >
> > > Issue description:
> > > The parent device dwc3_glue has runtime PM enabled and is in the
> > > runtime suspended state. The system enters the deep sleep process
> > > but is interrupted by another task. When resuming dwc3,
> > > console outputs the log 'runtime PM trying to activate child device
> > > xxx.dwc3 but parent is not active'.
> > >
> >
> > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is
> > not active' happen in the first place?
> > What is the glue driver that's being used here? Knowing what's being done in
> > the glue driver pm callbacks would help in understanding the issue.
> >
> > Regards,
> > Roy
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-27 14:52 ` [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active Alan Stern
@ 2025-08-27 18:56 ` Rafael J. Wysocki
2025-08-29 0:43 ` Thinh Nguyen
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-08-27 18:56 UTC (permalink / raw)
To: Alan Stern
Cc: ryan zhou, Rafael J. Wysocki, Roy Luo, Thinh.Nguyen, gregkh,
linux-usb, linux-kernel, linux-pm
On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> Ryan:
>
> You should present your questions to the maintainer of the kernel's
> Power Management subsystem, Rafael Wysocki (added to the To: list for
> this email).
Thanks Alan!
> On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > Hi Roy,
> > Thank you for reviewing my patch.
> > >
> > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > >
> > No, in the following case, the parent device will not be reviewed
> > before resuming the child device.
> > Taking the 'imx8mp-dwc3' driver as an example.
> > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > suspend state firstly, followed by
> > the parent device imx8mp-dwc3 enters runtime suspend
> > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > Step2.system deep trigger:consistent with the runtime suspend flow,
> > child enters pm suspend and followed
> > by parent
> > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > task terminated the system suspend process
> > . The system will resume from the checkpoint, and resume devices in
> > the suspended state in the reverse
> > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > did not execute the suspend process.
> >
> > >
> > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > >
> > Following the above analysis, dwc3_resume calls
I assume that dwc3_pm_resume() is meant here.
> > pm_runtime_set_active(dev), it checks the
> > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
And it does so because enabling runtime PM for the child with
runtime_status == RPM_ACTIVE does not make sense when the parent has
runtime PM enabled and its status is not RPM_ACTIVE.
It looks like the runtime PM status of the parent is not as expected,
but quite frankly I don't quite follow the logic in dwc3_pm_resume().
Why does it disable runtime PM just for the duration of
dwc3_resume_common()? If runtime PM is functional before the
pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
be resumed by calling pm_runtime_resume() on it without disabling
runtime PM. In turn, if runtime PM is not functional at that point,
it should not be enabled.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-27 18:56 ` Rafael J. Wysocki
@ 2025-08-29 0:43 ` Thinh Nguyen
2025-08-29 1:25 ` Alan Stern
0 siblings, 1 reply; 20+ messages in thread
From: Thinh Nguyen @ 2025-08-29 0:43 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Alan Stern, ryan zhou, Roy Luo, Thinh Nguyen,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > Ryan:
> >
> > You should present your questions to the maintainer of the kernel's
> > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > this email).
>
> Thanks Alan!
>
>
> > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > Hi Roy,
> > > Thank you for reviewing my patch.
> > > >
> > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > >
> > > No, in the following case, the parent device will not be reviewed
> > > before resuming the child device.
> > > Taking the 'imx8mp-dwc3' driver as an example.
> > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > suspend state firstly, followed by
> > > the parent device imx8mp-dwc3 enters runtime suspend
> > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > child enters pm suspend and followed
> > > by parent
> > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > task terminated the system suspend process
> > > . The system will resume from the checkpoint, and resume devices in
> > > the suspended state in the reverse
> > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > did not execute the suspend process.
> > >
> > > >
> > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > >
> > > Following the above analysis, dwc3_resume calls
>
> I assume that dwc3_pm_resume() is meant here.
>
> > > pm_runtime_set_active(dev), it checks the
> > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
>
> And it does so because enabling runtime PM for the child with
> runtime_status == RPM_ACTIVE does not make sense when the parent has
> runtime PM enabled and its status is not RPM_ACTIVE.
>
> It looks like the runtime PM status of the parent is not as expected,
So is the scenario Ryan brought up unexpected? What are we missing here
and where should the fix be in?
> but quite frankly I don't quite follow the logic in dwc3_pm_resume().
>
> Why does it disable runtime PM just for the duration of
> dwc3_resume_common()? If runtime PM is functional before the
> pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> be resumed by calling pm_runtime_resume() on it without disabling
> runtime PM. In turn, if runtime PM is not functional at that point,
> it should not be enabled.
Base on git-blame, I hope this will answer your question:
68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
For device mode, if PM runtime autosuspend feature enabled, the
runtime power status of dwc3 may be suspended when run dwc3_resume(),
and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
It would cause gadget connected failed if USB cable has been plugged
before PM resume. So move forward pm_runtime_set_active() to fix it.
In certain platforms, they probably need the phy to be active to perform
dwc3_resume_common().
BR,
Thinh
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 0:43 ` Thinh Nguyen
@ 2025-08-29 1:25 ` Alan Stern
2025-08-29 19:07 ` Thinh Nguyen
2025-08-29 19:23 ` Rafael J. Wysocki
0 siblings, 2 replies; 20+ messages in thread
From: Alan Stern @ 2025-08-29 1:25 UTC (permalink / raw)
To: Thinh Nguyen
Cc: Rafael J. Wysocki, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 12:43:17AM +0000, Thinh Nguyen wrote:
> On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> > On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > >
> > > Ryan:
> > >
> > > You should present your questions to the maintainer of the kernel's
> > > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > > this email).
> >
> > Thanks Alan!
> >
> >
> > > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > > Hi Roy,
> > > > Thank you for reviewing my patch.
> > > > >
> > > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > > >
> > > > No, in the following case, the parent device will not be reviewed
> > > > before resuming the child device.
> > > > Taking the 'imx8mp-dwc3' driver as an example.
> > > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > > suspend state firstly, followed by
> > > > the parent device imx8mp-dwc3 enters runtime suspend
> > > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > > child enters pm suspend and followed
> > > > by parent
> > > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > > task terminated the system suspend process
> > > > . The system will resume from the checkpoint, and resume devices in
> > > > the suspended state in the reverse
> > > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > > did not execute the suspend process.
> > > >
> > > > >
> > > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > > >
> > > > Following the above analysis, dwc3_resume calls
> >
> > I assume that dwc3_pm_resume() is meant here.
> >
> > > > pm_runtime_set_active(dev), it checks the
> > > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
> >
> > And it does so because enabling runtime PM for the child with
> > runtime_status == RPM_ACTIVE does not make sense when the parent has
> > runtime PM enabled and its status is not RPM_ACTIVE.
> >
> > It looks like the runtime PM status of the parent is not as expected,
>
> So is the scenario Ryan brought up unexpected? What are we missing here
> and where should the fix be in?
>
> > but quite frankly I don't quite follow the logic in dwc3_pm_resume().
> >
> > Why does it disable runtime PM just for the duration of
> > dwc3_resume_common()? If runtime PM is functional before the
> > pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> > be resumed by calling pm_runtime_resume() on it without disabling
> > runtime PM. In turn, if runtime PM is not functional at that point,
> > it should not be enabled.
>
> Base on git-blame, I hope this will answer your question:
>
> 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
>
> For device mode, if PM runtime autosuspend feature enabled, the
> runtime power status of dwc3 may be suspended when run dwc3_resume(),
> and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
> It would cause gadget connected failed if USB cable has been plugged
> before PM resume. So move forward pm_runtime_set_active() to fix it.
>
>
> In certain platforms, they probably need the phy to be active to perform
> dwc3_resume_common().
It sounds like the real question is how we should deal with an
interrupted system suspend. Suppose parent device A and child device B
are both in runtime suspend when a system sleep transition begins. The
PM core invokes the ->suspend callback of B (and let's say the callback
doesn't need to do anything because B is already suspended with the
appropriate wakeup setting).
But then before the PM core invokes the ->suspend callback of A, the
system sleep transition is cancelled. So the PM core goes through the
device tree from parents to children, invoking the ->resume callback for
all the devices whose ->suspend callback was called earlier. Thus, A's
->resume is skipped because A's ->suspend wasn't called, but B's
->resume callback _is_ invoked. This callback fails, because it can't
resume B while A is still in runtime suspend.
The same problem arises if A isn't a parent of B but there is a PM
dependency from B to A.
It's been so long since I worked on the system suspend code that I don't
remember how we decided to handle this scenario.
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 1:25 ` Alan Stern
@ 2025-08-29 19:07 ` Thinh Nguyen
2025-08-29 19:26 ` Rafael J. Wysocki
2025-08-29 19:23 ` Rafael J. Wysocki
1 sibling, 1 reply; 20+ messages in thread
From: Thinh Nguyen @ 2025-08-29 19:07 UTC (permalink / raw)
To: Alan Stern, Rafael J. Wysocki
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Thu, Aug 28, 2025, Alan Stern wrote:
> On Fri, Aug 29, 2025 at 12:43:17AM +0000, Thinh Nguyen wrote:
> > On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> > > On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > >
> > > > Ryan:
> > > >
> > > > You should present your questions to the maintainer of the kernel's
> > > > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > > > this email).
> > >
> > > Thanks Alan!
> > >
> > >
> > > > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > > > Hi Roy,
> > > > > Thank you for reviewing my patch.
> > > > > >
> > > > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > > > >
> > > > > No, in the following case, the parent device will not be reviewed
> > > > > before resuming the child device.
> > > > > Taking the 'imx8mp-dwc3' driver as an example.
> > > > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > > > suspend state firstly, followed by
> > > > > the parent device imx8mp-dwc3 enters runtime suspend
> > > > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > > > child enters pm suspend and followed
> > > > > by parent
> > > > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > > > task terminated the system suspend process
> > > > > . The system will resume from the checkpoint, and resume devices in
> > > > > the suspended state in the reverse
> > > > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > > > did not execute the suspend process.
> > > > >
> > > > > >
> > > > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > > > >
> > > > > Following the above analysis, dwc3_resume calls
> > >
> > > I assume that dwc3_pm_resume() is meant here.
> > >
> > > > > pm_runtime_set_active(dev), it checks the
> > > > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
> > >
> > > And it does so because enabling runtime PM for the child with
> > > runtime_status == RPM_ACTIVE does not make sense when the parent has
> > > runtime PM enabled and its status is not RPM_ACTIVE.
> > >
> > > It looks like the runtime PM status of the parent is not as expected,
> >
> > So is the scenario Ryan brought up unexpected? What are we missing here
> > and where should the fix be in?
> >
> > > but quite frankly I don't quite follow the logic in dwc3_pm_resume().
> > >
> > > Why does it disable runtime PM just for the duration of
> > > dwc3_resume_common()? If runtime PM is functional before the
> > > pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> > > be resumed by calling pm_runtime_resume() on it without disabling
> > > runtime PM. In turn, if runtime PM is not functional at that point,
> > > it should not be enabled.
> >
> > Base on git-blame, I hope this will answer your question:
> >
> > 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
> >
> > For device mode, if PM runtime autosuspend feature enabled, the
> > runtime power status of dwc3 may be suspended when run dwc3_resume(),
> > and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
> > It would cause gadget connected failed if USB cable has been plugged
> > before PM resume. So move forward pm_runtime_set_active() to fix it.
> >
> >
> > In certain platforms, they probably need the phy to be active to perform
> > dwc3_resume_common().
>
> It sounds like the real question is how we should deal with an
> interrupted system suspend. Suppose parent device A and child device B
> are both in runtime suspend when a system sleep transition begins. The
> PM core invokes the ->suspend callback of B (and let's say the callback
> doesn't need to do anything because B is already suspended with the
> appropriate wakeup setting).
>
> But then before the PM core invokes the ->suspend callback of A, the
> system sleep transition is cancelled. So the PM core goes through the
> device tree from parents to children, invoking the ->resume callback for
> all the devices whose ->suspend callback was called earlier. Thus, A's
> ->resume is skipped because A's ->suspend wasn't called, but B's
> ->resume callback _is_ invoked. This callback fails, because it can't
> resume B while A is still in runtime suspend.
>
> The same problem arises if A isn't a parent of B but there is a PM
> dependency from B to A.
>
> It's been so long since I worked on the system suspend code that I don't
> remember how we decided to handle this scenario.
>
Alan, Rafael,
What are your thoughts on how we should handle this. Should the fix be
at the PM core? Sounds like the PM core needs to check more than whether
the ->suspend callback was called earlier to determine whether to skip
->resume.
Thanks,
Thinh
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 1:25 ` Alan Stern
2025-08-29 19:07 ` Thinh Nguyen
@ 2025-08-29 19:23 ` Rafael J. Wysocki
2025-08-29 19:58 ` Alan Stern
1 sibling, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-08-29 19:23 UTC (permalink / raw)
To: Alan Stern
Cc: Thinh Nguyen, Rafael J. Wysocki, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 3:25 AM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Fri, Aug 29, 2025 at 12:43:17AM +0000, Thinh Nguyen wrote:
> > On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> > > On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > >
> > > > Ryan:
> > > >
> > > > You should present your questions to the maintainer of the kernel's
> > > > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > > > this email).
> > >
> > > Thanks Alan!
> > >
> > >
> > > > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > > > Hi Roy,
> > > > > Thank you for reviewing my patch.
> > > > > >
> > > > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > > > >
> > > > > No, in the following case, the parent device will not be reviewed
> > > > > before resuming the child device.
> > > > > Taking the 'imx8mp-dwc3' driver as an example.
> > > > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > > > suspend state firstly, followed by
> > > > > the parent device imx8mp-dwc3 enters runtime suspend
> > > > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > > > child enters pm suspend and followed
> > > > > by parent
> > > > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > > > task terminated the system suspend process
> > > > > . The system will resume from the checkpoint, and resume devices in
> > > > > the suspended state in the reverse
> > > > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > > > did not execute the suspend process.
> > > > >
> > > > > >
> > > > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > > > >
> > > > > Following the above analysis, dwc3_resume calls
> > >
> > > I assume that dwc3_pm_resume() is meant here.
> > >
> > > > > pm_runtime_set_active(dev), it checks the
> > > > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
> > >
> > > And it does so because enabling runtime PM for the child with
> > > runtime_status == RPM_ACTIVE does not make sense when the parent has
> > > runtime PM enabled and its status is not RPM_ACTIVE.
> > >
> > > It looks like the runtime PM status of the parent is not as expected,
> >
> > So is the scenario Ryan brought up unexpected? What are we missing here
> > and where should the fix be in?
> >
> > > but quite frankly I don't quite follow the logic in dwc3_pm_resume().
> > >
> > > Why does it disable runtime PM just for the duration of
> > > dwc3_resume_common()? If runtime PM is functional before the
> > > pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> > > be resumed by calling pm_runtime_resume() on it without disabling
> > > runtime PM. In turn, if runtime PM is not functional at that point,
> > > it should not be enabled.
> >
> > Base on git-blame, I hope this will answer your question:
> >
> > 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
> >
> > For device mode, if PM runtime autosuspend feature enabled, the
> > runtime power status of dwc3 may be suspended when run dwc3_resume(),
> > and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
> > It would cause gadget connected failed if USB cable has been plugged
> > before PM resume. So move forward pm_runtime_set_active() to fix it.
> >
> >
> > In certain platforms, they probably need the phy to be active to perform
> > dwc3_resume_common().
>
> It sounds like the real question is how we should deal with an
> interrupted system suspend. Suppose parent device A and child device B
> are both in runtime suspend when a system sleep transition begins. The
> PM core invokes the ->suspend callback of B (and let's say the callback
> doesn't need to do anything because B is already suspended with the
> appropriate wakeup setting).
>
> But then before the PM core invokes the ->suspend callback of A, the
> system sleep transition is cancelled. So the PM core goes through the
> device tree from parents to children, invoking the ->resume callback for
> all the devices whose ->suspend callback was called earlier. Thus, A's
> ->resume is skipped because A's ->suspend wasn't called, but B's
> ->resume callback _is_ invoked. This callback fails, because it can't
> resume B while A is still in runtime suspend.
>
> The same problem arises if A isn't a parent of B but there is a PM
> dependency from B to A.
>
> It's been so long since I worked on the system suspend code that I don't
> remember how we decided to handle this scenario.
We actually have not made any specific decision in that respect. That
is, in the error path, the core will invoke the resume callbacks for
devices whose suspend callbacks were invoked and it won't do anything
beyond that because it has too little information on what would need
to be done.
Arguably, though, the failure case described above is not different
from regular resume during which the driver of A decides to retain the
device in runtime suspend.
I'm not sure if the core can do anything about it.
But at the time when the B's resume callback is invoked, runtime PM is
enabled for A, so the driver of B may as well use runtime_resume() to
resume the device if it wants to do so. It may also decide to do
nothing like in the suspend callback.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 19:07 ` Thinh Nguyen
@ 2025-08-29 19:26 ` Rafael J. Wysocki
2025-08-29 20:13 ` Thinh Nguyen
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-08-29 19:26 UTC (permalink / raw)
To: Thinh Nguyen
Cc: Alan Stern, Rafael J. Wysocki, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 9:07 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
>
> On Thu, Aug 28, 2025, Alan Stern wrote:
> > On Fri, Aug 29, 2025 at 12:43:17AM +0000, Thinh Nguyen wrote:
> > > On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> > > > On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > > >
> > > > > Ryan:
> > > > >
> > > > > You should present your questions to the maintainer of the kernel's
> > > > > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > > > > this email).
> > > >
> > > > Thanks Alan!
> > > >
> > > >
> > > > > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > > > > Hi Roy,
> > > > > > Thank you for reviewing my patch.
> > > > > > >
> > > > > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > > > > >
> > > > > > No, in the following case, the parent device will not be reviewed
> > > > > > before resuming the child device.
> > > > > > Taking the 'imx8mp-dwc3' driver as an example.
> > > > > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > > > > suspend state firstly, followed by
> > > > > > the parent device imx8mp-dwc3 enters runtime suspend
> > > > > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > > > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > > > > child enters pm suspend and followed
> > > > > > by parent
> > > > > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > > > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > > > > task terminated the system suspend process
> > > > > > . The system will resume from the checkpoint, and resume devices in
> > > > > > the suspended state in the reverse
> > > > > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > > > > did not execute the suspend process.
> > > > > >
> > > > > > >
> > > > > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > > > > >
> > > > > > Following the above analysis, dwc3_resume calls
> > > >
> > > > I assume that dwc3_pm_resume() is meant here.
> > > >
> > > > > > pm_runtime_set_active(dev), it checks the
> > > > > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
> > > >
> > > > And it does so because enabling runtime PM for the child with
> > > > runtime_status == RPM_ACTIVE does not make sense when the parent has
> > > > runtime PM enabled and its status is not RPM_ACTIVE.
> > > >
> > > > It looks like the runtime PM status of the parent is not as expected,
> > >
> > > So is the scenario Ryan brought up unexpected? What are we missing here
> > > and where should the fix be in?
> > >
> > > > but quite frankly I don't quite follow the logic in dwc3_pm_resume().
> > > >
> > > > Why does it disable runtime PM just for the duration of
> > > > dwc3_resume_common()? If runtime PM is functional before the
> > > > pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> > > > be resumed by calling pm_runtime_resume() on it without disabling
> > > > runtime PM. In turn, if runtime PM is not functional at that point,
> > > > it should not be enabled.
> > >
> > > Base on git-blame, I hope this will answer your question:
> > >
> > > 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
> > >
> > > For device mode, if PM runtime autosuspend feature enabled, the
> > > runtime power status of dwc3 may be suspended when run dwc3_resume(),
> > > and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
> > > It would cause gadget connected failed if USB cable has been plugged
> > > before PM resume. So move forward pm_runtime_set_active() to fix it.
> > >
> > >
> > > In certain platforms, they probably need the phy to be active to perform
> > > dwc3_resume_common().
> >
> > It sounds like the real question is how we should deal with an
> > interrupted system suspend. Suppose parent device A and child device B
> > are both in runtime suspend when a system sleep transition begins. The
> > PM core invokes the ->suspend callback of B (and let's say the callback
> > doesn't need to do anything because B is already suspended with the
> > appropriate wakeup setting).
> >
> > But then before the PM core invokes the ->suspend callback of A, the
> > system sleep transition is cancelled. So the PM core goes through the
> > device tree from parents to children, invoking the ->resume callback for
> > all the devices whose ->suspend callback was called earlier. Thus, A's
> > ->resume is skipped because A's ->suspend wasn't called, but B's
> > ->resume callback _is_ invoked. This callback fails, because it can't
> > resume B while A is still in runtime suspend.
> >
> > The same problem arises if A isn't a parent of B but there is a PM
> > dependency from B to A.
> >
> > It's been so long since I worked on the system suspend code that I don't
> > remember how we decided to handle this scenario.
> >
>
> Alan, Rafael,
>
> What are your thoughts on how we should handle this.
I'm not really sure what you mean by "this": the scenario described by
Alan or something else?
I was pulled into the thread in the middle of it and I don't know the
full context.
> Should the fix be at the PM core? Sounds like the PM core needs to check
> more than whether the ->suspend callback was called earlier to determine
> whether to skip ->resume.
But the core doesn't know what happened in the ->suspend callback in
the first place, so how can it know what's the right thing to do?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 19:23 ` Rafael J. Wysocki
@ 2025-08-29 19:58 ` Alan Stern
2025-09-01 19:41 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2025-08-29 19:58 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 09:23:12PM +0200, Rafael J. Wysocki wrote:
> On Fri, Aug 29, 2025 at 3:25 AM Alan Stern <stern@rowland.harvard.edu> wrote:
> > It sounds like the real question is how we should deal with an
> > interrupted system suspend. Suppose parent device A and child device B
> > are both in runtime suspend when a system sleep transition begins. The
> > PM core invokes the ->suspend callback of B (and let's say the callback
> > doesn't need to do anything because B is already suspended with the
> > appropriate wakeup setting).
> >
> > But then before the PM core invokes the ->suspend callback of A, the
> > system sleep transition is cancelled. So the PM core goes through the
> > device tree from parents to children, invoking the ->resume callback for
> > all the devices whose ->suspend callback was called earlier. Thus, A's
> > ->resume is skipped because A's ->suspend wasn't called, but B's
> > ->resume callback _is_ invoked. This callback fails, because it can't
> > resume B while A is still in runtime suspend.
> >
> > The same problem arises if A isn't a parent of B but there is a PM
> > dependency from B to A.
> >
> > It's been so long since I worked on the system suspend code that I don't
> > remember how we decided to handle this scenario.
>
> We actually have not made any specific decision in that respect. That
> is, in the error path, the core will invoke the resume callbacks for
> devices whose suspend callbacks were invoked and it won't do anything
> beyond that because it has too little information on what would need
> to be done.
>
> Arguably, though, the failure case described above is not different
> from regular resume during which the driver of A decides to retain the
> device in runtime suspend.
>
> I'm not sure if the core can do anything about it.
>
> But at the time when the B's resume callback is invoked, runtime PM is
> enabled for A, so the driver of B may as well use runtime_resume() to
> resume the device if it wants to do so. It may also decide to do
> nothing like in the suspend callback.
Good point. Since both devices were in runtime suspend before the sleep
transition started, there's no reason they can't remain in runtime
suspend after the sleep transition is cancelled.
On the other hand, it seems clear that this scenario doesn't get very
much testing. I'm pretty sure the USB subsystem in general is
vulnerable to this problem; it doesn't consider suspended devices to be
in different states according to the reason for the suspend. That is, a
USB device suspended for runtime PM is in the same state as a device
suspended for system PM (aside from minor details like wakeup settings).
Consequently the ->resume and ->runtime_resume callbacks do essentially
the same thing, both assuming the parent device is not suspended. As we
have discussed, this assumption isn't always correct.
I'm open to suggestions for how to handle this. Should we keep track of
whether a device was in runtime suspend when a system suspend happens,
so that the ->resume callback can avoid doing anything? Will that work
if the device was the source of a wakeup request?
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 19:26 ` Rafael J. Wysocki
@ 2025-08-29 20:13 ` Thinh Nguyen
2025-08-30 0:46 ` Alan Stern
0 siblings, 1 reply; 20+ messages in thread
From: Thinh Nguyen @ 2025-08-29 20:13 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thinh Nguyen, Alan Stern, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Fri, Aug 29, 2025, Rafael J. Wysocki wrote:
> On Fri, Aug 29, 2025 at 9:07 PM Thinh Nguyen <Thinh.Nguyen@synopsys.com> wrote:
> >
> > On Thu, Aug 28, 2025, Alan Stern wrote:
> > > On Fri, Aug 29, 2025 at 12:43:17AM +0000, Thinh Nguyen wrote:
> > > > On Wed, Aug 27, 2025, Rafael J. Wysocki wrote:
> > > > > On Wed, Aug 27, 2025 at 4:52 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > > > >
> > > > > > Ryan:
> > > > > >
> > > > > > You should present your questions to the maintainer of the kernel's
> > > > > > Power Management subsystem, Rafael Wysocki (added to the To: list for
> > > > > > this email).
> > > > >
> > > > > Thanks Alan!
> > > > >
> > > > >
> > > > > > On Wed, Aug 27, 2025 at 10:09:10PM +0800, ryan zhou wrote:
> > > > > > > Hi Roy,
> > > > > > > Thank you for reviewing my patch.
> > > > > > > >
> > > > > > > > Wouldn't the parent glue dev already resume before resuming the child dwc3?
> > > > > > > >
> > > > > > > No, in the following case, the parent device will not be reviewed
> > > > > > > before resuming the child device.
> > > > > > > Taking the 'imx8mp-dwc3' driver as an example.
> > > > > > > Step 1.usb disconnect trigger: the child device dwc3 enter runtime
> > > > > > > suspend state firstly, followed by
> > > > > > > the parent device imx8mp-dwc3 enters runtime suspend
> > > > > > > flow:dwc3_runtime_suspend->dwc3_imx8mp_runtime_suspend
> > > > > > > Step2.system deep trigger:consistent with the runtime suspend flow,
> > > > > > > child enters pm suspend and followed
> > > > > > > by parent
> > > > > > > flow: dwc3_pm_suspend->dwc3_imx8mp_pm_suspend
> > > > > > > Step3: After dwc3_pm_suspend, and before dwc3_imx8mp_pm_suspend, a
> > > > > > > task terminated the system suspend process
> > > > > > > . The system will resume from the checkpoint, and resume devices in
> > > > > > > the suspended state in the reverse
> > > > > > > of pm suspend, but excluding the parent device imx8mp-dwc3 since it
> > > > > > > did not execute the suspend process.
> > > > > > >
> > > > > > > >
> > > > > > > >Why would 'runtime PM trying to activate child device xxx.dwc3 but parent is not active' happen in the first place?
> > > > > > > >
> > > > > > > Following the above analysis, dwc3_resume calls
> > > > >
> > > > > I assume that dwc3_pm_resume() is meant here.
> > > > >
> > > > > > > pm_runtime_set_active(dev), it checks the
> > > > > > > parent.power->runtime_status is not RPM_ACTIVE and outputs the error log.
> > > > >
> > > > > And it does so because enabling runtime PM for the child with
> > > > > runtime_status == RPM_ACTIVE does not make sense when the parent has
> > > > > runtime PM enabled and its status is not RPM_ACTIVE.
> > > > >
> > > > > It looks like the runtime PM status of the parent is not as expected,
> > > >
> > > > So is the scenario Ryan brought up unexpected? What are we missing here
> > > > and where should the fix be in?
> > > >
> > > > > but quite frankly I don't quite follow the logic in dwc3_pm_resume().
> > > > >
> > > > > Why does it disable runtime PM just for the duration of
> > > > > dwc3_resume_common()? If runtime PM is functional before the
> > > > > pm_runtime_disable() call in dwc3_pm_resume(), the device may as well
> > > > > be resumed by calling pm_runtime_resume() on it without disabling
> > > > > runtime PM. In turn, if runtime PM is not functional at that point,
> > > > > it should not be enabled.
> > > >
> > > > Base on git-blame, I hope this will answer your question:
> > > >
> > > > 68c26fe58182 ("usb: dwc3: set pm runtime active before resume common")
> > > >
> > > > For device mode, if PM runtime autosuspend feature enabled, the
> > > > runtime power status of dwc3 may be suspended when run dwc3_resume(),
> > > > and dwc3 gadget would not be configured in dwc3_gadget_run_stop().
> > > > It would cause gadget connected failed if USB cable has been plugged
> > > > before PM resume. So move forward pm_runtime_set_active() to fix it.
> > > >
> > > >
> > > > In certain platforms, they probably need the phy to be active to perform
> > > > dwc3_resume_common().
> > >
> > > It sounds like the real question is how we should deal with an
> > > interrupted system suspend. Suppose parent device A and child device B
> > > are both in runtime suspend when a system sleep transition begins. The
> > > PM core invokes the ->suspend callback of B (and let's say the callback
> > > doesn't need to do anything because B is already suspended with the
> > > appropriate wakeup setting).
> > >
> > > But then before the PM core invokes the ->suspend callback of A, the
> > > system sleep transition is cancelled. So the PM core goes through the
> > > device tree from parents to children, invoking the ->resume callback for
> > > all the devices whose ->suspend callback was called earlier. Thus, A's
> > > ->resume is skipped because A's ->suspend wasn't called, but B's
> > > ->resume callback _is_ invoked. This callback fails, because it can't
> > > resume B while A is still in runtime suspend.
> > >
> > > The same problem arises if A isn't a parent of B but there is a PM
> > > dependency from B to A.
> > >
> > > It's been so long since I worked on the system suspend code that I don't
> > > remember how we decided to handle this scenario.
> > >
> >
> > Alan, Rafael,
> >
> > What are your thoughts on how we should handle this.
>
> I'm not really sure what you mean by "this": the scenario described by
> Alan or something else?
>
> I was pulled into the thread in the middle of it and I don't know the
> full context.
Thanks for looking into this. The issue Ryan reported is what Alan
described: A system suspend was interrupted for device A and B that are
already in runtime suspend. The roll back transition skipped A ->resume.
B's ->resume was called and received an error message because A was
still in runtime suspend.
For dwc3 however, it resumes B anyway regardless of the error. Because
of the dependency of A, it runs into problems later. This can be fixed
in dwc3 to check for the error, but..
>
> > Should the fix be at the PM core? Sounds like the PM core needs to check
> > more than whether the ->suspend callback was called earlier to determine
> > whether to skip ->resume.
>
> But the core doesn't know what happened in the ->suspend callback in
> the first place, so how can it know what's the right thing to do?
..shouldn't the PM core know that A was runtime suspended to not skip
->resume? (sorry I'm not an expert in the PM core, genuine question
here).
Thanks,
Thinh
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 20:13 ` Thinh Nguyen
@ 2025-08-30 0:46 ` Alan Stern
2025-08-30 1:14 ` Thinh Nguyen
0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2025-08-30 0:46 UTC (permalink / raw)
To: Thinh Nguyen
Cc: Rafael J. Wysocki, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 08:13:07PM +0000, Thinh Nguyen wrote:
> ..shouldn't the PM core know that A was runtime suspended to not skip
> ->resume? (sorry I'm not an expert in the PM core, genuine question
> here).
This doesn't answer your question directly, but I would like to add some
background.
There are subsystems/drivers that do want to resume their devices during
system resume, even if the devices were in runtime suspend originally.
At a minimum, the PM core doesn't want to take this choice away from
them.
In fact, the USB subsystem was designed to run that way back when
support for runtime PM was first added, and it hasn't been changed since
-- although maybe it should be. There are explicit mechanisms for
telling the PM core that a device should be skipped during system
resume; we could use them.
Regardless, I don't recall any discussions of the particular situation
in this thread ever taking place.
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-30 0:46 ` Alan Stern
@ 2025-08-30 1:14 ` Thinh Nguyen
0 siblings, 0 replies; 20+ messages in thread
From: Thinh Nguyen @ 2025-08-30 1:14 UTC (permalink / raw)
To: Alan Stern
Cc: Thinh Nguyen, Rafael J. Wysocki, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Fri, Aug 29, 2025, Alan Stern wrote:
> On Fri, Aug 29, 2025 at 08:13:07PM +0000, Thinh Nguyen wrote:
> > ..shouldn't the PM core know that A was runtime suspended to not skip
> > ->resume? (sorry I'm not an expert in the PM core, genuine question
> > here).
>
> This doesn't answer your question directly, but I would like to add some
> background.
>
> There are subsystems/drivers that do want to resume their devices during
> system resume, even if the devices were in runtime suspend originally.
> At a minimum, the PM core doesn't want to take this choice away from
> them.
>
> In fact, the USB subsystem was designed to run that way back when
> support for runtime PM was first added, and it hasn't been changed since
> -- although maybe it should be. There are explicit mechanisms for
> telling the PM core that a device should be skipped during system
> resume; we could use them.
>
> Regardless, I don't recall any discussions of the particular situation
> in this thread ever taking place.
>
Thank you for the background Alan.
I'm glad that we're having this discussion now. I'll continue to monitor
this thread. Do let me know how we should handle this case.
Thanks,
Thinh
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-08-29 19:58 ` Alan Stern
@ 2025-09-01 19:41 ` Rafael J. Wysocki
2025-09-01 20:40 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-09-01 19:41 UTC (permalink / raw)
To: Alan Stern
Cc: Rafael J. Wysocki, Thinh Nguyen, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Fri, Aug 29, 2025 at 9:58 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Fri, Aug 29, 2025 at 09:23:12PM +0200, Rafael J. Wysocki wrote:
> > On Fri, Aug 29, 2025 at 3:25 AM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > It sounds like the real question is how we should deal with an
> > > interrupted system suspend. Suppose parent device A and child device B
> > > are both in runtime suspend when a system sleep transition begins. The
> > > PM core invokes the ->suspend callback of B (and let's say the callback
> > > doesn't need to do anything because B is already suspended with the
> > > appropriate wakeup setting).
> > >
> > > But then before the PM core invokes the ->suspend callback of A, the
> > > system sleep transition is cancelled. So the PM core goes through the
> > > device tree from parents to children, invoking the ->resume callback for
> > > all the devices whose ->suspend callback was called earlier. Thus, A's
> > > ->resume is skipped because A's ->suspend wasn't called, but B's
> > > ->resume callback _is_ invoked. This callback fails, because it can't
> > > resume B while A is still in runtime suspend.
> > >
> > > The same problem arises if A isn't a parent of B but there is a PM
> > > dependency from B to A.
> > >
> > > It's been so long since I worked on the system suspend code that I don't
> > > remember how we decided to handle this scenario.
> >
> > We actually have not made any specific decision in that respect. That
> > is, in the error path, the core will invoke the resume callbacks for
> > devices whose suspend callbacks were invoked and it won't do anything
> > beyond that because it has too little information on what would need
> > to be done.
> >
> > Arguably, though, the failure case described above is not different
> > from regular resume during which the driver of A decides to retain the
> > device in runtime suspend.
> >
> > I'm not sure if the core can do anything about it.
> >
> > But at the time when the B's resume callback is invoked, runtime PM is
> > enabled for A, so the driver of B may as well use runtime_resume() to
> > resume the device if it wants to do so. It may also decide to do
> > nothing like in the suspend callback.
>
> Good point. Since both devices were in runtime suspend before the sleep
> transition started, there's no reason they can't remain in runtime
> suspend after the sleep transition is cancelled.
>
> On the other hand, it seems clear that this scenario doesn't get very
> much testing.
No, it doesn't in general AFAICS.
> I'm pretty sure the USB subsystem in general is
> vulnerable to this problem; it doesn't consider suspended devices to be
> in different states according to the reason for the suspend. That is, a
> USB device suspended for runtime PM is in the same state as a device
> suspended for system PM (aside from minor details like wakeup settings).
> Consequently the ->resume and ->runtime_resume callbacks do essentially
> the same thing, both assuming the parent device is not suspended. As we
> have discussed, this assumption isn't always correct.
>
> I'm open to suggestions for how to handle this. Should we keep track of
> whether a device was in runtime suspend when a system suspend happens,
> so that the ->resume callback can avoid doing anything? Will that work
> if the device was the source of a wakeup request?
Generally speaking, for proper integration of system suspend with
runtime suspend at all levels, it is necessary to track whether or not
the given device has been suspended prior to system suspend.
In fact, there are even ways to opt-in for assistance from the PM core
and bus types in that respect to some extent.
In the particular case at hand though, the PM core is not involved in
making the decision whether or not to leave the devices in runtime
suspend during system suspend and it all depends on the drivers of A
and B.
Note here that the problematic situation occurs when the suspend of B
has run, but the suspend of A has not run yet and the transition is
aborted between them, so the driver of A cannot do much to help. The
driver of B has a couple of options though.
First off, it might decide to runtime-resume the device in its system
suspend callback (as long as we are talking about the "suspend" phase
and not any later phases of system suspend) before suspending it again
which will also cause A to runtime-resume and aborting system suspend
would not be problematic any more. So that's one of the options, but
it is kind of wasteful and time-consuming.
Another option, which I mentioned before, might be to call
runtime_resume() from the system resume callback of B (again, as long
as we are talking about the "resume" phase, not any of the earlier
phases of system resume). This assumes that runtime PM is enabled at
this point for both A and B and so it should work properly.
Now, if the driver of B needs to do something special to the device in
its system suspend callback, it may want (and likely should) disable
runtime PM prior to this and in that case it will have to check what
the runtime PM status of the device is and adjust its actions
accordingly. That really depends on what those actions are etc, so
I'd rather not talk about it without a specific example.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-01 19:41 ` Rafael J. Wysocki
@ 2025-09-01 20:40 ` Rafael J. Wysocki
2025-09-02 2:41 ` Alan Stern
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-09-01 20:40 UTC (permalink / raw)
To: Alan Stern
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Mon, Sep 1, 2025 at 9:41 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Aug 29, 2025 at 9:58 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > On Fri, Aug 29, 2025 at 09:23:12PM +0200, Rafael J. Wysocki wrote:
> > > On Fri, Aug 29, 2025 at 3:25 AM Alan Stern <stern@rowland.harvard.edu> wrote:
> > > > It sounds like the real question is how we should deal with an
> > > > interrupted system suspend. Suppose parent device A and child device B
> > > > are both in runtime suspend when a system sleep transition begins. The
> > > > PM core invokes the ->suspend callback of B (and let's say the callback
> > > > doesn't need to do anything because B is already suspended with the
> > > > appropriate wakeup setting).
> > > >
> > > > But then before the PM core invokes the ->suspend callback of A, the
> > > > system sleep transition is cancelled. So the PM core goes through the
> > > > device tree from parents to children, invoking the ->resume callback for
> > > > all the devices whose ->suspend callback was called earlier. Thus, A's
> > > > ->resume is skipped because A's ->suspend wasn't called, but B's
> > > > ->resume callback _is_ invoked. This callback fails, because it can't
> > > > resume B while A is still in runtime suspend.
> > > >
> > > > The same problem arises if A isn't a parent of B but there is a PM
> > > > dependency from B to A.
> > > >
> > > > It's been so long since I worked on the system suspend code that I don't
> > > > remember how we decided to handle this scenario.
> > >
> > > We actually have not made any specific decision in that respect. That
> > > is, in the error path, the core will invoke the resume callbacks for
> > > devices whose suspend callbacks were invoked and it won't do anything
> > > beyond that because it has too little information on what would need
> > > to be done.
> > >
> > > Arguably, though, the failure case described above is not different
> > > from regular resume during which the driver of A decides to retain the
> > > device in runtime suspend.
> > >
> > > I'm not sure if the core can do anything about it.
> > >
> > > But at the time when the B's resume callback is invoked, runtime PM is
> > > enabled for A, so the driver of B may as well use runtime_resume() to
> > > resume the device if it wants to do so. It may also decide to do
> > > nothing like in the suspend callback.
> >
> > Good point. Since both devices were in runtime suspend before the sleep
> > transition started, there's no reason they can't remain in runtime
> > suspend after the sleep transition is cancelled.
> >
> > On the other hand, it seems clear that this scenario doesn't get very
> > much testing.
>
> No, it doesn't in general AFAICS.
>
> > I'm pretty sure the USB subsystem in general is
> > vulnerable to this problem; it doesn't consider suspended devices to be
> > in different states according to the reason for the suspend. That is, a
> > USB device suspended for runtime PM is in the same state as a device
> > suspended for system PM (aside from minor details like wakeup settings).
> > Consequently the ->resume and ->runtime_resume callbacks do essentially
> > the same thing, both assuming the parent device is not suspended. As we
> > have discussed, this assumption isn't always correct.
> >
> > I'm open to suggestions for how to handle this. Should we keep track of
> > whether a device was in runtime suspend when a system suspend happens,
> > so that the ->resume callback can avoid doing anything? Will that work
> > if the device was the source of a wakeup request?
>
> Generally speaking, for proper integration of system suspend with
> runtime suspend at all levels, it is necessary to track whether or not
> the given device has been suspended prior to system suspend.
>
> In fact, there are even ways to opt-in for assistance from the PM core
> and bus types in that respect to some extent.
>
> In the particular case at hand though, the PM core is not involved in
> making the decision whether or not to leave the devices in runtime
> suspend during system suspend and it all depends on the drivers of A
> and B.
>
> Note here that the problematic situation occurs when the suspend of B
> has run, but the suspend of A has not run yet and the transition is
> aborted between them, so the driver of A cannot do much to help. The
> driver of B has a couple of options though.
>
> First off, it might decide to runtime-resume the device in its system
> suspend callback (as long as we are talking about the "suspend" phase
> and not any later phases of system suspend) before suspending it again
> which will also cause A to runtime-resume and aborting system suspend
> would not be problematic any more. So that's one of the options, but
> it is kind of wasteful and time-consuming.
>
> Another option, which I mentioned before, might be to call
> runtime_resume() from the system resume callback of B (again, as long
> as we are talking about the "resume" phase, not any of the earlier
> phases of system resume). This assumes that runtime PM is enabled at
> this point for both A and B and so it should work properly.
>
> Now, if the driver of B needs to do something special to the device in
> its system suspend callback, it may want (and likely should) disable
> runtime PM prior to this and in that case it will have to check what
> the runtime PM status of the device is and adjust its actions
> accordingly. That really depends on what those actions are etc, so
> I'd rather not talk about it without a specific example.
Of course, the driver of B may also choose to leave the device in
runtime suspend in its system resume callback. This requires checking
the runtime PM status of the device upfront, but the driver needs to
do that anyway in order to leave the device in runtime suspend during
system suspend, so it can record the fact that the device has been
left in runtime suspend. That record can be used later during system
resume.
The kind of tricky aspect of this is when the device triggers a system
wakeup by generating a wakeup signal. In that case, it is probably
better to resume it during system resume, but the driver should know
that it is the case (it has access to the device's registers after
all). It may, for example, use runtime_resume() for resuming the
device (and its parent etc) then.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-01 20:40 ` Rafael J. Wysocki
@ 2025-09-02 2:41 ` Alan Stern
2025-09-03 11:51 ` ryan zhou
2025-09-03 19:30 ` Rafael J. Wysocki
0 siblings, 2 replies; 20+ messages in thread
From: Alan Stern @ 2025-09-02 2:41 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Mon, Sep 01, 2025 at 10:40:25PM +0200, Rafael J. Wysocki wrote:
> Of course, the driver of B may also choose to leave the device in
> runtime suspend in its system resume callback. This requires checking
> the runtime PM status of the device upfront, but the driver needs to
> do that anyway in order to leave the device in runtime suspend during
> system suspend, so it can record the fact that the device has been
> left in runtime suspend. That record can be used later during system
> resume.
As a general rule, I think this is by default the best approach. That
is, since B was in runtime suspend before the system sleep, you should
just keep it in runtime suspend after the system sleep unless you have
some good reason not to. In other words, strive to leave the entire
system in the same state that it started in, as near as possible.
One good reason not to would obviously be if B is the source of a wakeup
signal...
> The kind of tricky aspect of this is when the device triggers a system
> wakeup by generating a wakeup signal. In that case, it is probably
> better to resume it during system resume, but the driver should know
> that it is the case (it has access to the device's registers after
> all).
Not necessarily. Suppose that C is a child of B, and C is the wakeup
source. B's driver might decide to keep B in runtime suspend
since B wasn't the wakeup source -- but then C's driver would have to
make the same decision and would not have access to C's registers.
> It may, for example, use runtime_resume() for resuming the
> device (and its parent etc) then.
Consider this as a possible heuristic for B's ->resume callback, in the
case where B was in runtime suspend throughout the system sleep:
If B's parent A is not in runtime suspend, test whether B
has a wakeup signal pending. If it does, do a runtime
resume. If not (or if A is in runtime suspend), leave B
in runtime suspend.
At first glance I think that will work fairly well. Even if B is kept
in runtime suspend when it shouldn't be, the normal runtime wakeup
signalling mechanism should kick in without too much of a delay.
The big problem is that this issue applies to all subsystems and
devices. It would be better if we had a uniform solution that could be
implemented in the PM core, not in every single subsystem or device
driver.
Here's another possibility, one that can be implemented in the PM core
during the ->resume, ->resume_early, or ->resume_noirq phase of system
wakeup:
If A and B are both in runtime suspend, do not invoke B's
->resume callback. (Or maybe don't invoke it if A's ->resume
callback wasn't invoked.)
There probably are some detailed reasons why that won't always work, but
maybe you figure out something like it that will be okay.
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-02 2:41 ` Alan Stern
@ 2025-09-03 11:51 ` ryan zhou
2025-09-03 19:30 ` Rafael J. Wysocki
1 sibling, 0 replies; 20+ messages in thread
From: ryan zhou @ 2025-09-03 11:51 UTC (permalink / raw)
To: Alan Stern
Cc: Rafael J. Wysocki, Thinh Nguyen, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Alan Stern <stern@rowland.harvard.edu> 于2025年9月2日周二 10:41写道:
>
> On Mon, Sep 01, 2025 at 10:40:25PM +0200, Rafael J. Wysocki wrote:
> > Of course, the driver of B may also choose to leave the device in
> > runtime suspend in its system resume callback. This requires checking
> > the runtime PM status of the device upfront, but the driver needs to
> > do that anyway in order to leave the device in runtime suspend during
> > system suspend, so it can record the fact that the device has been
> > left in runtime suspend. That record can be used later during system
> > resume.
>
> As a general rule, I think this is by default the best approach. That
> is, since B was in runtime suspend before the system sleep, you should
> just keep it in runtime suspend after the system sleep unless you have
> some good reason not to. In other words, strive to leave the entire
> system in the same state that it started in, as near as possible.
>
Alan, I fully concur with your perspective. Specifically, I maintain that the
device's runtime status should remain consistent before and after
system deep sleep.
To keep parent runtime-active during child wake, use device_link_add
create a link between them. Then dwc3_resume's pm_runtime_set_active
forces parent wake-up first.
However, for the dwc3 driver, both Rafael 's two solutions and
My aforementioned solution introduces new issues. When USB performs
deep sleep wake-up, the USB PHYS fails to initialize properly because
deep sleep wake-up executes runtime resume first, leaving the PHYS
initialized with per-sleep configurations. This ignores Type-C interface
requirements to reconfigure PHYS based on plug orientation.
> One good reason not to would obviously be if B is the source of a wakeup
> signal...
>
> > The kind of tricky aspect of this is when the device triggers a system
> > wakeup by generating a wakeup signal. In that case, it is probably
> > better to resume it during system resume, but the driver should know
> > that it is the case (it has access to the device's registers after
> > all).
>
> Not necessarily. Suppose that C is a child of B, and C is the wakeup
> source. B's driver might decide to keep B in runtime suspend
> since B wasn't the wakeup source -- but then C's driver would have to
> make the same decision and would not have access to C's registers.
>
> > It may, for example, use runtime_resume() for resuming the
> > device (and its parent etc) then.
>
> Consider this as a possible heuristic for B's ->resume callback, in the
> case where B was in runtime suspend throughout the system sleep:
>
> If B's parent A is not in runtime suspend, test whether B
> has a wakeup signal pending. If it does, do a runtime
> resume. If not (or if A is in runtime suspend), leave B
> in runtime suspend.
>
Following your suggestion, I conducted verification. If the child device
is in a runtime suspended state, the suspend and resume callback should
not be invoked. The proposed solution is as follows:
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 20945cad29a1..642bf4b5d3c4 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -2687,6 +2687,9 @@ int dwc3_pm_resume(struct dwc3 *dwc)
struct device *dev = dwc->dev;
int ret = 0;
+ if (pm_runtime_suspended(dev))
+ return ret;
+
pinctrl_pm_select_default_state(dev);
pm_runtime_disable(dev);
> At first glance I think that will work fairly well. Even if B is kept
> in runtime suspend when it shouldn't be, the normal runtime wakeup
> signalling mechanism should kick in without too much of a delay.
>
> The big problem is that this issue applies to all subsystems and
> devices. It would be better if we had a uniform solution that could be
> implemented in the PM core, not in every single subsystem or device
> driver.
>
> Here's another possibility, one that can be implemented in the PM core
> during the ->resume, ->resume_early, or ->resume_noirq phase of system
> wakeup:
>
> If A and B are both in runtime suspend, do not invoke B's
> ->resume callback. (Or maybe don't invoke it if A's ->resume
> callback wasn't invoked.)
>
It is preferable for the PM core to maintain the runtime status
between parent and child devices,
where feasible. Could you advise on the most effective approach to
settle this issue?
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-02 2:41 ` Alan Stern
2025-09-03 11:51 ` ryan zhou
@ 2025-09-03 19:30 ` Rafael J. Wysocki
2025-09-03 21:54 ` Alan Stern
1 sibling, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-09-03 19:30 UTC (permalink / raw)
To: Alan Stern
Cc: Rafael J. Wysocki, Thinh Nguyen, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Tue, Sep 2, 2025 at 4:41 AM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Mon, Sep 01, 2025 at 10:40:25PM +0200, Rafael J. Wysocki wrote:
> > Of course, the driver of B may also choose to leave the device in
> > runtime suspend in its system resume callback. This requires checking
> > the runtime PM status of the device upfront, but the driver needs to
> > do that anyway in order to leave the device in runtime suspend during
> > system suspend, so it can record the fact that the device has been
> > left in runtime suspend. That record can be used later during system
> > resume.
>
> As a general rule, I think this is by default the best approach. That
> is, since B was in runtime suspend before the system sleep, you should
> just keep it in runtime suspend after the system sleep unless you have
> some good reason not to. In other words, strive to leave the entire
> system in the same state that it started in, as near as possible.
That's reasonable IMV.
> One good reason not to would obviously be if B is the source of a wakeup
> signal...
>
> > The kind of tricky aspect of this is when the device triggers a system
> > wakeup by generating a wakeup signal. In that case, it is probably
> > better to resume it during system resume, but the driver should know
> > that it is the case (it has access to the device's registers after
> > all).
>
> Not necessarily. Suppose that C is a child of B, and C is the wakeup
> source. B's driver might decide to keep B in runtime suspend
> since B wasn't the wakeup source -- but then C's driver would have to
> make the same decision and would not have access to C's registers.
>
> > It may, for example, use runtime_resume() for resuming the
> > device (and its parent etc) then.
>
> Consider this as a possible heuristic for B's ->resume callback, in the
> case where B was in runtime suspend throughout the system sleep:
>
> If B's parent A is not in runtime suspend, test whether B
> has a wakeup signal pending. If it does, do a runtime
> resume. If not (or if A is in runtime suspend), leave B
> in runtime suspend.
>
> At first glance I think that will work fairly well. Even if B is kept
> in runtime suspend when it shouldn't be, the normal runtime wakeup
> signalling mechanism should kick in without too much of a delay.
This is not just about the parent, but also about suppliers and things
get fairly complicated at that point, so I would just rely on the
observation that runtime wakeup should trigger shortly.
> The big problem is that this issue applies to all subsystems and
> devices. It would be better if we had a uniform solution that could be
> implemented in the PM core, not in every single subsystem or device
> driver.
>
> Here's another possibility, one that can be implemented in the PM core
> during the ->resume, ->resume_early, or ->resume_noirq phase of system
> wakeup:
>
> If A and B are both in runtime suspend, do not invoke B's
> ->resume callback. (Or maybe don't invoke it if A's ->resume
> callback wasn't invoked.)
>
> There probably are some detailed reasons why that won't always work, but
> maybe you figure out something like it that will be okay.
I personally think that it would be reasonable to simply preserve
device states in error paths unless they have been changed already
before the error (or suspend abort due to a wakeup signal).
By this rule, B would be left in runtime suspend if it were still in
runtime suspend when the error (or suspend abort in general) occurred
and then it doesn't matter what happens to A.
The PM core can do something like that for the drivers opting in for
runtime PM integration assistance, so to speak. That is, drivers that
point their ->suspend() and ->resume() callbacks to
pm_runtime_force_suspend() and pm_runtime_force_resume(),
respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time
which is now feasible). Otherwise, it is hard to say what the
expectations of the driver are and some code between the driver and
the PM core may be involved (say, the PCI bus type).
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-03 19:30 ` Rafael J. Wysocki
@ 2025-09-03 21:54 ` Alan Stern
2025-09-04 14:08 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2025-09-03 21:54 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote:
> I personally think that it would be reasonable to simply preserve
> device states in error paths unless they have been changed already
> before the error (or suspend abort due to a wakeup signal).
The problem is complicated by the interaction between runtime-PM states
and system-sleep states. In the case, we've been considering, B changes
from runtime-suspended to runtime-suspended + system-suspended.
Therefore the error path is allowed to modify B's state.
> By this rule, B would be left in runtime suspend if it were still in
> runtime suspend when the error (or suspend abort in general) occurred
> and then it doesn't matter what happens to A.
More fully, B would be changed from runtime-suspended + system-suspended
back to simply runtime-suspended. Unfortunately, none of the PM
callbacks in the kernel are defined to make this change -- at least, not
without some cooperation from the driver.
> The PM core can do something like that for the drivers opting in for
> runtime PM integration assistance, so to speak. That is, drivers that
> point their ->suspend() and ->resume() callbacks to
> pm_runtime_force_suspend() and pm_runtime_force_resume(),
> respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time
> which is now feasible). Otherwise, it is hard to say what the
> expectations of the driver are and some code between the driver and
> the PM core may be involved (say, the PCI bus type).
Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer.
But there still should be some way the PM core can make resumes easier
for drivers that don't set the flag. Something like: If the device is
in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on
the device's parent (and anything else the device depends on) before
invoking ->resume.
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-03 21:54 ` Alan Stern
@ 2025-09-04 14:08 ` Rafael J. Wysocki
2025-09-04 14:19 ` Alan Stern
0 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-09-04 14:08 UTC (permalink / raw)
To: Alan Stern
Cc: Rafael J. Wysocki, Thinh Nguyen, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Wed, Sep 3, 2025 at 11:54 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote:
> > I personally think that it would be reasonable to simply preserve
> > device states in error paths unless they have been changed already
> > before the error (or suspend abort due to a wakeup signal).
>
> The problem is complicated by the interaction between runtime-PM states
> and system-sleep states. In the case, we've been considering, B changes
> from runtime-suspended to runtime-suspended + system-suspended.
> Therefore the error path is allowed to modify B's state.
Yes, it is, but retaining the B's state in an error path is also fine
so long as no changes have been made to it so far.
If B was runtime-suspended to start with and none of the suspend
callbacks invoked for it so far has done anything to it, then it is de
facto still runtime-suspended and its state need not be changed in an
error path.
> > By this rule, B would be left in runtime suspend if it were still in
> > runtime suspend when the error (or suspend abort in general) occurred
> > and then it doesn't matter what happens to A.
>
> More fully, B would be changed from runtime-suspended + system-suspended
> back to simply runtime-suspended. Unfortunately, none of the PM
> callbacks in the kernel are defined to make this change -- at least, not
> without some cooperation from the driver.
Except when runtime-suspended + system-suspended is effectively the
same as runtime-suspended.
Say this is not the case and say that the device is runtime-suspended
to start with. Then the "suspend" callback has two choices: either
(1) it can runtime-resume the device before doing anything to it,
which will also cause the device's parent and suppliers to
runtime-resume, or (2) it can update the device's state without
resuming it.
If it chooses (1), then "resume" is straightforward. If it chooses
(2), "resume" may just reverse the changes made by "suspend" and
declare that the device is runtime-suspended. And if it really really
wants to resume the device then, why not call runtime_resume() on it?
> > The PM core can do something like that for the drivers opting in for
> > runtime PM integration assistance, so to speak. That is, drivers that
> > point their ->suspend() and ->resume() callbacks to
> > pm_runtime_force_suspend() and pm_runtime_force_resume(),
> > respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time
> > which is now feasible). Otherwise, it is hard to say what the
> > expectations of the driver are and some code between the driver and
> > the PM core may be involved (say, the PCI bus type).
>
> Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer.
>
> But there still should be some way the PM core can make resumes easier
> for drivers that don't set the flag. Something like: If the device is
> in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on
> the device's parent (and anything else the device depends on) before
> invoking ->resume.
Say that ->resume() does nothing to the device (because it is
runtime-suspended and there's no need to resume it). Why would the
core resume the parent etc then?
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-04 14:08 ` Rafael J. Wysocki
@ 2025-09-04 14:19 ` Alan Stern
2025-09-04 14:33 ` Rafael J. Wysocki
0 siblings, 1 reply; 20+ messages in thread
From: Alan Stern @ 2025-09-04 14:19 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Thinh Nguyen, ryan zhou, Roy Luo, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pm@vger.kernel.org
On Thu, Sep 04, 2025 at 04:08:47PM +0200, Rafael J. Wysocki wrote:
> On Wed, Sep 3, 2025 at 11:54 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> >
> > On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote:
> > > I personally think that it would be reasonable to simply preserve
> > > device states in error paths unless they have been changed already
> > > before the error (or suspend abort due to a wakeup signal).
> >
> > The problem is complicated by the interaction between runtime-PM states
> > and system-sleep states. In the case, we've been considering, B changes
> > from runtime-suspended to runtime-suspended + system-suspended.
> > Therefore the error path is allowed to modify B's state.
>
> Yes, it is, but retaining the B's state in an error path is also fine
> so long as no changes have been made to it so far.
>
> If B was runtime-suspended to start with and none of the suspend
> callbacks invoked for it so far has done anything to it, then it is de
> facto still runtime-suspended and its state need not be changed in an
> error path.
>
> > > By this rule, B would be left in runtime suspend if it were still in
> > > runtime suspend when the error (or suspend abort in general) occurred
> > > and then it doesn't matter what happens to A.
> >
> > More fully, B would be changed from runtime-suspended + system-suspended
> > back to simply runtime-suspended. Unfortunately, none of the PM
> > callbacks in the kernel are defined to make this change -- at least, not
> > without some cooperation from the driver.
>
> Except when runtime-suspended + system-suspended is effectively the
> same as runtime-suspended.
>
> Say this is not the case and say that the device is runtime-suspended
> to start with. Then the "suspend" callback has two choices: either
> (1) it can runtime-resume the device before doing anything to it,
> which will also cause the device's parent and suppliers to
> runtime-resume, or (2) it can update the device's state without
> resuming it.
>
> If it chooses (1), then "resume" is straightforward. If it chooses
> (2), "resume" may just reverse the changes made by "suspend" and
> declare that the device is runtime-suspended. And if it really really
> wants to resume the device then, why not call runtime_resume() on it?
That's what I meant by needing "cooperation from the driver". The
driver's ->resume callback needs to do this check to see which pathway
to follow: (1) or (2).
I bet that right now almost none of the drivers in the kernel do
anything like that. I know that the USB drivers always follow (1)
during ->resume, even if they followed (2) during suspend. What do the
PCI drivers do?
> > > The PM core can do something like that for the drivers opting in for
> > > runtime PM integration assistance, so to speak. That is, drivers that
> > > point their ->suspend() and ->resume() callbacks to
> > > pm_runtime_force_suspend() and pm_runtime_force_resume(),
> > > respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time
> > > which is now feasible). Otherwise, it is hard to say what the
> > > expectations of the driver are and some code between the driver and
> > > the PM core may be involved (say, the PCI bus type).
> >
> > Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer.
> >
> > But there still should be some way the PM core can make resumes easier
> > for drivers that don't set the flag. Something like: If the device is
> > in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on
> > the device's parent (and anything else the device depends on) before
> > invoking ->resume.
>
> Say that ->resume() does nothing to the device (because it is
> runtime-suspended and there's no need to resume it). Why would the
> core resume the parent etc then?
You're right. I'm just trying to figure out a way to fix this problem
in general without the need for updating every driver in the kernel.
Maybe that's not possible. :-(
Alan Stern
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active
2025-09-04 14:19 ` Alan Stern
@ 2025-09-04 14:33 ` Rafael J. Wysocki
0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2025-09-04 14:33 UTC (permalink / raw)
To: Alan Stern
Cc: Rafael J. Wysocki, Thinh Nguyen, ryan zhou, Roy Luo,
gregkh@linuxfoundation.org, linux-usb@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
On Thu, Sep 4, 2025 at 4:19 PM Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Thu, Sep 04, 2025 at 04:08:47PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Sep 3, 2025 at 11:54 PM Alan Stern <stern@rowland.harvard.edu> wrote:
> > >
> > > On Wed, Sep 03, 2025 at 09:30:47PM +0200, Rafael J. Wysocki wrote:
> > > > I personally think that it would be reasonable to simply preserve
> > > > device states in error paths unless they have been changed already
> > > > before the error (or suspend abort due to a wakeup signal).
> > >
> > > The problem is complicated by the interaction between runtime-PM states
> > > and system-sleep states. In the case, we've been considering, B changes
> > > from runtime-suspended to runtime-suspended + system-suspended.
> > > Therefore the error path is allowed to modify B's state.
> >
> > Yes, it is, but retaining the B's state in an error path is also fine
> > so long as no changes have been made to it so far.
> >
> > If B was runtime-suspended to start with and none of the suspend
> > callbacks invoked for it so far has done anything to it, then it is de
> > facto still runtime-suspended and its state need not be changed in an
> > error path.
> >
> > > > By this rule, B would be left in runtime suspend if it were still in
> > > > runtime suspend when the error (or suspend abort in general) occurred
> > > > and then it doesn't matter what happens to A.
> > >
> > > More fully, B would be changed from runtime-suspended + system-suspended
> > > back to simply runtime-suspended. Unfortunately, none of the PM
> > > callbacks in the kernel are defined to make this change -- at least, not
> > > without some cooperation from the driver.
> >
> > Except when runtime-suspended + system-suspended is effectively the
> > same as runtime-suspended.
> >
> > Say this is not the case and say that the device is runtime-suspended
> > to start with. Then the "suspend" callback has two choices: either
> > (1) it can runtime-resume the device before doing anything to it,
> > which will also cause the device's parent and suppliers to
> > runtime-resume, or (2) it can update the device's state without
> > resuming it.
> >
> > If it chooses (1), then "resume" is straightforward. If it chooses
> > (2), "resume" may just reverse the changes made by "suspend" and
> > declare that the device is runtime-suspended. And if it really really
> > wants to resume the device then, why not call runtime_resume() on it?
>
> That's what I meant by needing "cooperation from the driver". The
> driver's ->resume callback needs to do this check to see which pathway
> to follow: (1) or (2).
>
> I bet that right now almost none of the drivers in the kernel do
> anything like that. I know that the USB drivers always follow (1)
> during ->resume, even if they followed (2) during suspend. What do the
> PCI drivers do?
If they don't set DPM_FLAG_SMART_SUSPEND, the PCI bus type suspend
callback will runtime-resume their devices.
Calling runtime_resume() in a suspend callback (for the "suspend"
phase) is a popular pattern because it was recommended once upon a
time.
> > > > The PM core can do something like that for the drivers opting in for
> > > > runtime PM integration assistance, so to speak. That is, drivers that
> > > > point their ->suspend() and ->resume() callbacks to
> > > > pm_runtime_force_suspend() and pm_runtime_force_resume(),
> > > > respectively, or set DPM_FLAG_SMART_SUSPEND (or both at the same time
> > > > which is now feasible). Otherwise, it is hard to say what the
> > > > expectations of the driver are and some code between the driver and
> > > > the PM core may be involved (say, the PCI bus type).
> > >
> > > Setting DPM_FLAG_SMART_SUSPEND really does sound like the best answer.
> > >
> > > But there still should be some way the PM core can make resumes easier
> > > for drivers that don't set the flag. Something like: If the device is
> > > in runtime suspend with SMART_SUSPEND clear, perform a runtime resume on
> > > the device's parent (and anything else the device depends on) before
> > > invoking ->resume.
> >
> > Say that ->resume() does nothing to the device (because it is
> > runtime-suspended and there's no need to resume it). Why would the
> > core resume the parent etc then?
>
> You're right. I'm just trying to figure out a way to fix this problem
> in general without the need for updating every driver in the kernel.
> Maybe that's not possible. :-(
Fortunately, in many cases runtime-suspended + system-suspended ==
runtime-suspended.
Many drivers use pm_runtime_force_suspend/resume() as their
suspend/resume callbacks.
There are also drivers without runtime PM support.
It is not all lost I think.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2025-09-04 14:33 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20250826150826.11096-1-ryanzhou54@gmail.com>
[not found] ` <CA+zupgwnbt=5Oh28Chco=YNt9WwKzi2J+0hQ04nqyZG_7WUAYg@mail.gmail.com>
[not found] ` <CAPwe5RMpdG1ziRAwDhqkxuzHX0x=SdFQRFUbPCVuir1OgE90YQ@mail.gmail.com>
2025-08-27 14:52 ` [PATCH] drvier: usb: dwc3: Fix runtime PM trying to activate child device xxx.dwc3 but parent is not active Alan Stern
2025-08-27 18:56 ` Rafael J. Wysocki
2025-08-29 0:43 ` Thinh Nguyen
2025-08-29 1:25 ` Alan Stern
2025-08-29 19:07 ` Thinh Nguyen
2025-08-29 19:26 ` Rafael J. Wysocki
2025-08-29 20:13 ` Thinh Nguyen
2025-08-30 0:46 ` Alan Stern
2025-08-30 1:14 ` Thinh Nguyen
2025-08-29 19:23 ` Rafael J. Wysocki
2025-08-29 19:58 ` Alan Stern
2025-09-01 19:41 ` Rafael J. Wysocki
2025-09-01 20:40 ` Rafael J. Wysocki
2025-09-02 2:41 ` Alan Stern
2025-09-03 11:51 ` ryan zhou
2025-09-03 19:30 ` Rafael J. Wysocki
2025-09-03 21:54 ` Alan Stern
2025-09-04 14:08 ` Rafael J. Wysocki
2025-09-04 14:19 ` Alan Stern
2025-09-04 14:33 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).