public inbox for linux-kernel@vger.kernel.org
From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
To: wuliangfeng <william.wu@rock-chips.com>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"frank.wang@rock-chips.com" <frank.wang@rock-chips.com>,
	"jianwei.zheng@rock-chips.com" <jianwei.zheng@rock-chips.com>,
	"yangbin@rock-chips.com" <yangbin@rock-chips.com>
Subject: Re: [PATCH v2] usb: dwc3: core: Avoid resume dwc3 if already suspended in pm resume
Date: Mon, 2 Oct 2023 18:18:50 +0000
Message-ID: <20231002181852.cfucitjfacf3khnx@synopsys.com>
In-Reply-To: <471bd277-3047-4157-a27d-f2fd203fb9b8@rock-chips.com>

On Fri, Sep 15, 2023, wuliangfeng wrote:
> Hi Thinh,
> 
> On 2023/9/12 8:08, Thinh Nguyen wrote:
> > Hi,
> > 
> > On Mon, Sep 11, 2023, William Wu wrote:
> > > > If we enable PM runtime autosuspend for dwc3 on Rockchip
> > > > platforms (e.g. RK3562), the dwc3 controller can enter
> > > > runtime suspend when the USB cable is detached, and the
> > > > power domain of the controller is then powered off. On system
> > > > resume, if the dwc3 is already runtime suspended,
> > > > dwc3_resume() shouldn't access the dwc3 registers because
> > > > the power domain may be powered off.
> > > > 
> > > > Tested on an RK3562 tablet, this patch helps to avoid a kernel
> > > > panic when accessing the dwc3 registers in dwc3_resume() if
> > > > the dwc3 is runtime suspended and its power domain is
> > > > powered off.
> > The controller should be woken up before this step. Can you provide more
> > detail on what led to this?
> 
> Yes, the power domain of the USB controller is enabled by the generic PM
> domain framework before dwc3 resume if the system enters and exits suspend
> normally. However, in my test case, if the system fails to enter suspend
> because of a problem with some device and then goes to the recovery path,
> the power domain of the USB controller is not enabled before dwc3
> resume.

Ok.

> 
> > e.g. some questions:
> > Who handles the waking up of the controller? Is it the phy driver? Is
> > the phy driver not detecting a resume? Or did the resume fail? Does this
> > occur consistently?
> > 
> > Thanks,
> > Thinh
> 
> This issue occurs occasionally on the RK3562 EVB with Type-C USB when
> autosuspend is enabled for the dwc3 controller.
> 
> Here are the test steps:
> 
> 1. Power on the RK3562 EVB with the Type-C USB interface in the unconnected
> state.
> 
> 2. Make sure the dwc3 controller enters runtime suspend and its power domain
> is disabled.
> 
> 3. Run a system suspend/resume stress test.
> 
> 4. The issue occurs occasionally with the following log:
> 
> [  251.681091][ T4331] PM: suspend entry (deep)
> [  251.778975][ T4331] Filesystems sync: 0.097 seconds
> [  251.779025][ T4331] Freezing user space processes ... (elapsed 0.005 seconds) done.
> [  251.784819][ T4331] OOM killer disabled.
> [  251.784851][ T4331] Freezing remaining freezable tasks ... (elapsed 0.004 seconds) done.
> [  251.792719][  T503] [SKWIFI DBG] skw_suspend: WoW: enabled, skw flags: 0x302
> [  251.803701][ T4331] PM: dpm_run_callback(): platform_pm_suspend.cfi_jt+0x0/0x8 returns -16
> [  251.803779][   T75] PM: PM: Pending Wakeup Sources: alarmtimer.0.auto
> [  251.803789][ T4331] PM: Device alarmtimer.0.a
> [  251.803928][ T4331] PM: Some devices failed to suspend, or early wake event detected
> [  251.804141][   T75] [SKWIFI DBG] skw_resume: skw flags: 0x300
> [  251.804715][    C2] SError Interrupt on CPU2, code 0xbf000000 -- SError
> [  251.804725][    C2] CPU: 2 PID: 4331 Comm: binder:251_4 Tainted: G        WC  E 5.10.157-android13-4-00006-g73f337804fbc-ab9881769 #1
> [  251.804732][    C2] Hardware name: Rockchip RK3562 RK817 TABLET LP4 Board (DT)
> [  251.804738][    C2] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
> [  251.804743][    C2] pc : el1_abort+0x40/0x68
> [  251.804748][    C2] lr : el1_abort+0x28/0x68
> 
> ......
> 
> [  251.804965][    C2] Kernel panic - not syncing: Asynchronous SError Interrupt
> [  251.804974][    C2] CPU: 2 PID: 4331 Comm: binder:251_4 Tainted: G        WC  E 5.10.157-android13-4-00006-g73f337804fbc-ab9881769 #1
> [  251.804980][    C2] Hardware name: Rockchip RK3562 RK817 TABLET LP4 Board (DT)
> [  251.804984][    C2] Call trace:
> [  251.804990][    C2]  dump_backtrace.cfi_jt+0x0/0x8
> [  251.804995][    C2]  dump_stack_lvl+0xc0/0x13c
> [  251.805000][    C2]  panic+0x174/0x468
> [  251.805006][    C2]  arm64_serror_panic+0x1b0/0x200
> [  251.805010][    C2]  do_serror+0x184/0x1e4
> [  251.805016][    C2]  el1_error+0x94/0x118
> [  251.805020][    C2]  el1_abort+0x40/0x68
> [  251.805026][    C2]  el1_sync_handler+0x58/0x88
> [  251.805031][    C2]  el1_sync+0x8c/0x140
> [  251.805035][    C2]  dwc3_readl+0x30/0x1a0
> [  251.805040][    C2]  dwc3_phy_setup+0x38/0x510
> [  251.805045][    C2]  dwc3_core_init+0x68/0xcd4
> [  251.805051][    C2]  dwc3_core_init_for_resume+0x10c/0x25c
> [  251.805056][    C2]  dwc3_resume_common+0x44/0x3d0
> [  251.805061][    C2]  dwc3_resume+0x5c/0xb8
> [  251.805067][    C2]  dpm_run_callback+0x70/0x488
> [  251.805071][    C2]  device_resume+0x250/0x2f8
> [  251.805077][    C2]  dpm_resume+0x258/0x9dc
> [  251.805082][    C2]  suspend_devices_and_enter+0x850/0xcac
> 
> In this case, during the suspend process, the alarmtimer device failed
> to suspend, which aborted the system suspend in the function
> suspend_devices_and_enter(). It then went directly to platform_recover()
> without enabling the power domain of the controller, which triggered the
> kernel panic in dwc3_resume().
> 

Thanks for the details.

> 
> For comparison, in the normal case, if the system enters suspend normally,
> then after the system wakes up, the power domain of the controller is
> enabled by the generic PM domain framework before dwc3 resume.
> 
> The function call stack looks like this:
> 
> suspend_devices_and_enter -->
> 
>     suspend_enter -->
> 
>          dpm_resume_noirq --> dpm_noirq_resume_devices -->
>          device_resume_noirq --> genpd_resume_noirq --> rockchip_pd_power
>          (enable the power domain of the controller)
> 
>     dpm_resume_end -->
> 
>          dpm_resume --> device_resume --> dpm_run_callback --> dwc3_resume
>          (access the controller safely)
> 
>          dpm_complete --> genpd_complete --> genpd_queue_power_off_work
> 
> suspend_finish --> suspend_thaw_processes --> genpd_power_off_work_fn -->
> (disable the power domain of the controller to maintain the original
> runtime suspend state)
> 

At what step do we restore the power domain when this happens? It looks like
there's a missing step in the suspend failure recovery to restore the
power domain. What we're doing here seems more like a workaround for
that, which unfortunately makes the code logic look unclear IMO.

Can this be fixed in the lower layer?

Thanks,
Thinh

Thread overview: 7+ messages
2023-09-11  3:31 [PATCH v2] usb: dwc3: core: Avoid resume dwc3 if already suspended in pm resume William Wu
2023-09-12  0:08 ` Thinh Nguyen
2023-09-15 15:53   ` wuliangfeng
2023-10-02 18:18     ` Thinh Nguyen [this message]
2023-10-10  7:28       ` wuliangfeng
2023-10-12 17:38         ` Thinh Nguyen
2023-09-15 20:17 ` Wesley Cheng
