From: Johan Hovold <johan@kernel.org>
To: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: mhi@lists.linux.dev, linux-arm-msm@vger.kernel.org,
linux-kernel@vger.kernel.org,
Loic Poulain <loic.poulain@linaro.org>
Subject: Re: mhi resume failure on reboot with 6.13-rc2
Date: Wed, 18 Dec 2024 09:40:45 +0100
Message-ID: <Z2KKjWY2mPen6GPL@hovoldconsulting.com>
In-Reply-To: <20241216141303.2zr5klbgua55agkx@thinkpad>
On Mon, Dec 16, 2024 at 07:43:03PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Dec 16, 2024 at 02:20:09PM +0100, Johan Hovold wrote:
> > On Mon, Dec 16, 2024 at 01:10:21PM +0530, Manivannan Sadhasivam wrote:
> > > On Wed, Dec 11, 2024 at 04:03:59PM +0100, Johan Hovold wrote:
> > I just hit the issue again and can confirm that it does block
> > reboot/shutdown forever (I've been waiting for 20 minutes now).
>
> Ah, that's bad.
>
> > Judging from a quick look at the code, "Wait for device to enter SBL or
> > Mission mode" is printed by mhi_fw_load_handler(), which in turn is only
> > called from the mhi_pm_st_worker() state machine.
> >
> > I can't seem to find anything that makes sure that the next state is
> > ever reached, so regardless of the cause of the modem fw crash
>
> This code will make sure:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/bus/mhi/host/pm.c?h=v6.13-rc1#n1264
>
> But then it doesn't print the error and returns -ETIMEDOUT to the caller after
> powering down MHI. The caller (mhi_pci_recovery_work), in the case of failure,
> unprepares MHI and starts function level recovery.
>
> > (if
> > that's what it is) the hung reboot appears to be a bug in mhi.

I've tracked down the hang to a deadlock on the parent device lock.

Driver core takes the parent device lock before calling shutdown(), and
then mhi_pci_shutdown() waits indefinitely for the recovery thread to
finish.

But the mhi recovery thread ends up trying to take the same parent
device lock in pci_reset_function() when recovery fails:

[ 339.351915] shutdown[1]: Rebooting.
[ 339.724498] arm-smmu 3da0000.iommu: disabling translation
[ 339.760134] mhi mhi0: Resuming from non M3 state (SYS ERROR)
[ 339.766211] mhi-pci-generic 0005:01:00.0: failed to resume device: -22
[ 339.773158] mhi-pci-generic 0005:01:00.0: device recovery started

The recovery thread is already running before shutdown() is called.

[ 339.779638] mhi-pci-generic 0005:01:00.0: __mhi_power_down
[ 339.779650] mhi-pci-generic 0005:01:00.0: mhi_pci_shutdown
[ 339.785422] wwan wwan0: port wwan0qcdm0 disconnected
[ 339.791001] mhi-pci-generic 0005:01:00.0: mhi_pci_remove
[ 339.791006] mhi-pci-generic 0005:01:00.0: mhi_pci_remove - cancel work sync

shutdown() now waits for the recovery thread to finish.

[ 339.825892] wwan wwan0: port wwan0mbim0 disconnected
[ 339.831320] wwan wwan0: port wwan0qmi0 disconnected
[ 339.904249] mhi-pci-generic 0005:01:00.0: __mhi_power_down - returns
[ 340.025390] mhi mhi0: Requested to power ON
[ 340.233771] mhi mhi0: Power on setup success
[ 340.233954] mhi mhi0: Wait for device to enter SBL or Mission mode
[ 340.238272] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event timeout_ms = 8000
[ 348.400082] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event returns, ret = -110

The recovery thread fails to power up the device (-110 is -ETIMEDOUT).

[ 348.419967] mhi-pci-generic 0005:01:00.0: __mhi_power_down
[ 348.472665] mhi-pci-generic 0005:01:00.0: __mhi_power_down - returns
[ 348.725069] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - returns
[ 348.742644] mhi-pci-generic 0005:01:00.0: mhi_pci_recovery_work - mhi unprepare after power down
[ 348.762737] mhi-pci-generic 0005:01:00.0: mhi_pci_recovery_work - pci reset
[ 348.780904] mhi-pci-generic 0005:01:00.0: pci_reset_function

The recovery thread then tries to reset the device, which deadlocks when
pci_reset_function() tries to take the parent (bridge) device lock that
is already held for shutdown().
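
For illustration only, the lock ordering can be reproduced in a small
userspace sketch. This is an analogy, not the kernel code: the names
simulate_shutdown(), recovery_work(), parent_lock and shutdown_entered are
made up for the example, and the timed lock attempt exists only so that the
sketch terminates. The lock attempt in the kernel has no timeout, which is
why the reboot hangs forever.

```python
import threading


def simulate_shutdown(lock_timeout=0.2):
    """Return True if the recovery worker could not take the parent lock."""
    parent_lock = threading.Lock()         # stand-in for the parent (bridge) device lock
    shutdown_entered = threading.Event()   # set once "shutdown" holds the lock
    outcome = {}

    def recovery_work():
        # Stand-in for the recovery work: it is already running when
        # shutdown starts, and its power-up attempt outlasts that point.
        shutdown_entered.wait()
        # Power-up failed, so fall back to a reset, which needs the parent
        # lock. The timeout exists only so that this sketch terminates.
        got_lock = parent_lock.acquire(timeout=lock_timeout)
        if got_lock:
            parent_lock.release()
        outcome["worker_blocked"] = not got_lock

    worker = threading.Thread(target=recovery_work)
    worker.start()

    with parent_lock:            # "driver core" takes the parent lock ...
        shutdown_entered.set()   # ... and calls "shutdown()", which ...
        worker.join()            # ... waits for the worker with the lock held

    return outcome["worker_blocked"]


if __name__ == "__main__":
    print("worker blocked on parent lock:", simulate_shutdown())
```

simulate_shutdown() returns True: the worker cannot take the lock for as
long as the waiter holds it, and the waiter does not release it until the
worker is done.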
Johan