From: Johan Hovold <johan@kernel.org>
To: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: mhi@lists.linux.dev, linux-arm-msm@vger.kernel.org,
linux-kernel@vger.kernel.org,
Loic Poulain <loic.poulain@linaro.org>
Subject: Re: mhi resume failure on reboot with 6.13-rc2
Date: Wed, 18 Dec 2024 09:40:45 +0100
Message-ID: <Z2KKjWY2mPen6GPL@hovoldconsulting.com>
In-Reply-To: <20241216141303.2zr5klbgua55agkx@thinkpad>
On Mon, Dec 16, 2024 at 07:43:03PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Dec 16, 2024 at 02:20:09PM +0100, Johan Hovold wrote:
> > On Mon, Dec 16, 2024 at 01:10:21PM +0530, Manivannan Sadhasivam wrote:
> > > On Wed, Dec 11, 2024 at 04:03:59PM +0100, Johan Hovold wrote:
> > I just hit the issue again and can confirm that it does block
> > reboot/shutdown forever (I've been waiting for 20 minutes now).
>
> Ah, that's bad.
>
> > Judging from a quick look at the code, "Wait for device to enter SBL or
> > Mission mode" is printed by mhi_fw_load_handler(), which in turn is only
> > called from the mhi_pm_st_worker() state machine.
> >
> > I can't seem to find anything that makes sure that the next state is
> > ever reached, so regardless of the cause of the modem fw crash
>
> This code will make sure:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/bus/mhi/host/pm.c?h=v6.13-rc1#n1264
>
> But then it doesn't print the error and returns -ETIMEDOUT to the caller after
> powering down MHI. The caller (mhi_pci_recovery_work), in the case of failure,
> unprepares MHI and starts function level recovery.
>
> > (if
> > that's what it is) the hung reboot appears to be a bug in mhi.
I've tracked down the hang to a deadlock on the parent device lock.

Driver core takes the parent device lock before calling shutdown(), and
then mhi_pci_shutdown() waits indefinitely for the recovery thread to
finish.

But the mhi recovery thread ends up trying to take the same parent
device lock in pci_reset_function() when recovery fails:

	[ 339.351915] shutdown[1]: Rebooting.
	[ 339.724498] arm-smmu 3da0000.iommu: disabling translation
	[ 339.760134] mhi mhi0: Resuming from non M3 state (SYS ERROR)
	[ 339.766211] mhi-pci-generic 0005:01:00.0: failed to resume device: -22
	[ 339.773158] mhi-pci-generic 0005:01:00.0: device recovery started

The recovery thread is running before shutdown() is called.

	[ 339.779638] mhi-pci-generic 0005:01:00.0: __mhi_power_down
	[ 339.779650] mhi-pci-generic 0005:01:00.0: mhi_pci_shutdown
	[ 339.785422] wwan wwan0: port wwan0qcdm0 disconnected
	[ 339.791001] mhi-pci-generic 0005:01:00.0: mhi_pci_remove
	[ 339.791006] mhi-pci-generic 0005:01:00.0: mhi_pci_remove - cancel work sync

shutdown() waits for the recovery thread to finish.

	[ 339.825892] wwan wwan0: port wwan0mbim0 disconnected
	[ 339.831320] wwan wwan0: port wwan0qmi0 disconnected
	[ 339.904249] mhi-pci-generic 0005:01:00.0: __mhi_power_down - returns
	[ 340.025390] mhi mhi0: Requested to power ON
	[ 340.233771] mhi mhi0: Power on setup success
	[ 340.233954] mhi mhi0: Wait for device to enter SBL or Mission mode
	[ 340.238272] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event timeout_ms = 8000
	[ 348.400082] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - wait event returns, ret = -110

The recovery thread fails to power up the device.

	[ 348.419967] mhi-pci-generic 0005:01:00.0: __mhi_power_down
	[ 348.472665] mhi-pci-generic 0005:01:00.0: __mhi_power_down - returns
	[ 348.725069] mhi-pci-generic 0005:01:00.0: mhi_sync_power_up - returns
	[ 348.742644] mhi-pci-generic 0005:01:00.0: mhi_pci_recovery_work - mhi unprepare after power down
	[ 348.762737] mhi-pci-generic 0005:01:00.0: mhi_pci_recovery_work - pci reset
	[ 348.780904] mhi-pci-generic 0005:01:00.0: pci_reset_function

And tries to reset the device, which triggers the deadlock when
trying to take the already held parent (bridge) device lock.
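
For illustration, the lock ordering described here can be modelled in
userspace. This is only a rough sketch in Python, not the mhi or driver
core code: the names shutdown_would_deadlock(), parent_lock and
recovery_work() are made up, and the timeout on the lock acquisition is
an assumption added so that the model terminates (the real lock has no
timeout, which is why the reboot hangs).

```python
import threading


def shutdown_would_deadlock(lock_timeout=0.2):
    """Userspace model of the lock ordering described above; not kernel code.

    Returns True if the modelled recovery worker could not take the parent
    lock while the modelled shutdown path was holding it.
    """
    parent_lock = threading.Lock()  # stands in for the parent (bridge) device lock
    recovery_running = threading.Event()
    shutdown_holds_lock = threading.Event()
    outcome = {}

    def recovery_work():
        # Recovery is already running when shutdown starts (see the log).
        recovery_running.set()
        # Power up times out, so recovery falls back to resetting the device.
        # Waiting here only forces the interleaving seen in the log.
        shutdown_holds_lock.wait()
        # Stands in for the lock taken on the pci_reset_function() path.
        # The timeout is an assumption so that the model terminates.
        got_lock = parent_lock.acquire(timeout=lock_timeout)
        outcome["reset_got_lock"] = got_lock
        if got_lock:
            parent_lock.release()

    worker = threading.Thread(target=recovery_work)
    worker.start()
    recovery_running.wait()

    # Driver core: take the parent lock, then call shutdown().
    with parent_lock:
        shutdown_holds_lock.set()
        # Shutdown path: wait for the recovery work to finish while still
        # holding the parent lock.
        worker.join()

    return not outcome["reset_got_lock"]


if __name__ == "__main__":
    print("would deadlock without the timeout:", shutdown_would_deadlock())
```

In this model the worker can never get the lock while the shutdown path
is waiting for it, which is the same circular wait as in the log above.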
Johan