Linux ARM-MSM sub-architecture
From: Johan Hovold <johan@kernel.org>
To: manivannan.sadhasivam@linaro.org
Cc: mhi@lists.linux.dev, Loic Poulain <loic.poulain@linaro.org>,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org
Subject: Re: [PATCH 2/2] bus: mhi: host: pci_generic: Recover the device synchronously from mhi_pci_runtime_resume()
Date: Wed, 22 Jan 2025 18:25:51 +0100
Message-ID: <Z5EqH95TWIGJhPG9@hovoldconsulting.com>
In-Reply-To: <Z5ENq9EMPlNvxNOF@hovoldconsulting.com>

On Wed, Jan 22, 2025 at 04:24:27PM +0100, Johan Hovold wrote:
> On Wed, Jan 08, 2025 at 07:09:28PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > 
> > Currently, in mhi_pci_runtime_resume(), if the resume fails, recovery_work
> > is started asynchronously and success is returned. But this doesn't align
> > with what PM core expects as documented in
> > Documentation/power/runtime_pm.rst:

> > Cc: stable@vger.kernel.org # 5.13
> > Reported-by: Johan Hovold <johan@kernel.org>
> > Closes: https://lore.kernel.org/mhi/Z2PbEPYpqFfrLSJi@hovoldconsulting.com
> > Fixes: d3800c1dce24 ("bus: mhi: pci_generic: Add support for runtime PM")
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> 
> Reasoning above makes sense, and I do indeed see resume taking five
> seconds longer with this patch as Loic suggested it would.

I forgot to mention the following warnings, which now show up when system
resume succeeds. Recovery also ran before this patch, but the "parent
mhi0 should not be sleeping" warnings are new:

[   68.753288] qcom_mhi_qrtr mhi0_IPCR: failed to prepare for autoqueue transfer -22
[   68.761109] qcom_mhi_qrtr mhi0_IPCR: PM: dpm_run_callback(): qcom_mhi_qrtr_pm_resume_early [qrtr_mhi] returns -22
[   68.771804] qcom_mhi_qrtr mhi0_IPCR: PM: failed to resume early: error -22
[   68.795053] mhi-pci-generic 0005:01:00.0: mhi_pci_resume
[   68.800709] mhi-pci-generic 0005:01:00.0: mhi_pci_runtime_resume
[   68.800794] mhi mhi0: Resuming from non M3 state (RESET)
[   68.800804] mhi-pci-generic 0005:01:00.0: failed to resume device: -22
[   68.819517] mhi-pci-generic 0005:01:00.0: device recovery started
[   68.819532] mhi-pci-generic 0005:01:00.0: __mhi_power_down
[   68.819543] mhi-pci-generic 0005:01:00.0: __mhi_power_down - pm mutex taken
[   68.819554] mhi-pci-generic 0005:01:00.0: __mhi_power_down - pm lock taken
[   68.820060] wwan wwan0: port wwan0qcdm0 disconnected
[   68.824839] nvme nvme0: 12/0/0 default/read/poll queues
[   68.857908] wwan wwan0: port wwan0mbim0 disconnected
[   68.864012] wwan wwan0: port wwan0qmi0 disconnected
[   68.943307] mhi-pci-generic 0005:01:00.0: __mhi_power_down - returns
[   68.956253] mhi mhi0: Requested to power ON
[   68.960753] mhi mhi0: Power on setup success
[   68.965262] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event timeout_ms = 8000
[   73.183086] mhi mhi0: Wait for device to enter SBL or Mission mode
[   73.653462] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event returns, ret = 0
[   73.653752] mhi mhi0_DIAG: PM: parent mhi0 should not be sleeping
[   73.661955] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - returns
[   73.668461] mhi mhi0_MBIM: PM: parent mhi0 should not be sleeping
[   73.674950] mhi-pci-generic 0005:01:00.0: Recovery completed
[   73.681428] mhi mhi0_QMI: PM: parent mhi0 should not be sleeping
[   74.315919] OOM killer enabled.
[   74.316475] wwan wwan0: port wwan0qcdm0 attached
[   74.319206] Restarting tasks ...
[   74.322825] done.
[   74.322870] random: crng reseeded on system resumption
[   74.325956] wwan wwan0: port wwan0mbim0 attached
[   74.334467] wwan wwan0: port wwan0qmi0 attached
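
As a reading aid for the log above, the time spent in recovery can be
computed from the bracketed timestamps. The following is only a minimal
sketch: the helper names (timestamp_of, recovery_seconds) are made up for
illustration, and the two log lines are copied verbatim from the output
above. The result is in line with the roughly five-second resume delay
mentioned earlier in the thread.

```python
import re

# Two lines copied verbatim from the log above.
LOG = """\
[   68.819517] mhi-pci-generic 0005:01:00.0: device recovery started
[   73.674950] mhi-pci-generic 0005:01:00.0: Recovery completed
"""

# Matches the "[ seconds.micros] message" prefix printed by the kernel.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def timestamp_of(log, needle):
    """Return the timestamp (seconds) of the first line containing needle."""
    for line in log.splitlines():
        m = STAMP.match(line)
        if m and needle in m.group(2):
            return float(m.group(1))
    raise ValueError(f"no log line contains {needle!r}")


def recovery_seconds(log):
    """Elapsed time between the recovery start and completion messages."""
    start = timestamp_of(log, "device recovery started")
    end = timestamp_of(log, "Recovery completed")
    return end - start


if __name__ == "__main__":
    print(f"recovery took {recovery_seconds(LOG):.3f} s")
```

The same helper can be pointed at any other pair of messages in a saved
dmesg capture by changing the two search strings.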

> Unfortunately, something else is broken as the recovery code now
> deadlocks again when the modem fails to resume (with both patches
> applied):
> 
> [  729.833701] PM: suspend entry (deep)
> [  729.841377] Filesystems sync: 0.000 seconds
> [  729.867672] Freezing user space processes
> [  729.869494] Freezing user space processes completed (elapsed 0.001 seconds)
> [  729.869499] OOM killer disabled.
> [  729.869501] Freezing remaining freezable tasks
> [  729.870882] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [  730.184254] mhi-pci-generic 0005:01:00.0: mhi_pci_runtime_resume
> [  730.190643] mhi mhi0: Resuming from non M3 state (SYS ERROR)
> [  730.196587] mhi-pci-generic 0005:01:00.0: failed to resume device: -22
> [  730.203412] mhi-pci-generic 0005:01:00.0: device recovery started
> 
> I've reproduced this three times in three different paths (runtime
> resume before suspend; runtime resume during suspend; and during system
> resume).
> 
> I didn't try to figure out what causes the deadlock this time (and
> lockdep does not trigger), but you should be able to reproduce this by
> instrumenting a resume failure.

Johan

Thread overview: 14+ messages
2025-01-08 13:39 [PATCH 0/2] bus: mhi: host: pci_generic: Couple of recovery fixes Manivannan Sadhasivam via B4 Relay
2025-01-08 13:39 ` [PATCH 1/2] bus: mhi: host: pci_generic: Use pci_try_reset_function() to avoid deadlock Manivannan Sadhasivam via B4 Relay
2025-01-08 14:46   ` Loic Poulain
2025-01-22 15:11   ` Johan Hovold
2025-02-19 13:13     ` Manivannan Sadhasivam
2025-02-19 13:52       ` Johan Hovold
2025-02-19 14:14         ` Manivannan Sadhasivam
2025-01-08 13:39 ` [PATCH 2/2] bus: mhi: host: pci_generic: Recover the device synchronously from mhi_pci_runtime_resume() Manivannan Sadhasivam via B4 Relay
2025-01-08 15:19   ` Loic Poulain
2025-01-08 16:02     ` Manivannan Sadhasivam
2025-01-09 20:50       ` Loic Poulain
2025-01-12  4:23         ` Manivannan Sadhasivam
2025-01-22 15:24   ` Johan Hovold
2025-01-22 17:25     ` Johan Hovold [this message]
