From: Manivannan Sadhasivam <mani@kernel.org>
To: Baochen Qiang <quic_bqiang@quicinc.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>,
Kalle Valo <kvalo@kernel.org>,
mhi@lists.linux.dev, ath11k@lists.infradead.org,
linux-wireless@vger.kernel.org, quic_cang@quicinc.com,
quic_qianyu@quicinc.com
Subject: Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()
Date: Tue, 23 Jan 2024 21:06:58 +0530
Message-ID: <20240123153658.GF19029@thinkpad>
In-Reply-To: <1d9b8bc6-b1ef-4568-a265-b4e69bf90aa9@quicinc.com>
On Tue, Jan 23, 2024 at 09:44:11AM +0800, Baochen Qiang wrote:
>
>
> On 1/22/2024 9:09 PM, Manivannan Sadhasivam wrote:
> > On Mon, Jan 22, 2024 at 04:09:53PM +0800, Baochen Qiang wrote:
> > >
> > >
> > > On 1/22/2024 2:24 PM, Manivannan Sadhasivam wrote:
> > > > On Thu, Jan 04, 2024 at 11:39:12AM +0530, Manivannan Sadhasivam wrote:
> > > >
> > > > + Can, Qiang
> > > >
> > > > [...]
> > > >
> > > > > > > To me it all sounds like the probe deferral is not handled properly in mac80211
> > > > > > > stack. As you mentioned in the commit message that the dpm_prepare() blocks
> > > > > > > > probing of devices. It gets unblocked and triggered in dpm_complete():
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131
> > > > > > >
> > > > > > > So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
> > > > > > > it is definitely an issue that needs to be fixed properly.
> > > > > > To clarify, ath11k CAN probe the devices at dpm_complete() stage. The
> > > > > > > problem is that the kernel does not wait for all probes to finish, and so we
> > > > > > will face the issue that user space applications are likely to fail because
> > > > > > they get thawed BEFORE WLAN is ready.
> > > > > >
> > > > >
> > > > > Hmm. Please give me some time to reproduce this issue locally. I will get back
> > > > > to this thread with my analysis.
> > > > >
> > > >
> > > > We reproduced the issue with the help of PCIe team (thanks Can). What we found
> > > > > out was, during the resume from hibernation the failure happens in
> > > > ath11k_core_resume(). Precisely here:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/tree/drivers/net/wireless/ath/ath11k/core.c?h=ath11k-hibernation-support#n850
> > > >
> > > > > This code waits for the QMI messages to arrive and eventually times out. But the
> > > > impression I got from the start was that the mhi_power_up() always fails during
> > > > resume. In our investigation, we confirmed that the failure is not happening at
> > > > > the MHI level.
> > > > No, mhi_power_up() never fails as it only downloads PBL, SBL and waits
> > > > for mission mode, no MHI device created hence not affected by the deferred
> > > probe. However in addition to PBL/SBL, ath11k also needs to download m3.bin,
> > > > board.bin and regdb.bin. Those files are part of WLAN firmware and are
> > > downloaded via QMI messages. After mhi_power_up() succeeds
> > > ath11k_core_resume() waits for QMI downloading those files. As you know QMI
> > > relies on MHI channels, these channels are managed by qcom_mhi_qrtr_driver.
> > > Since device probing is deferred, qcom_mhi_qrtr_driver has no chance to run
> > > at this stage. As a result ath11k_core_resume() times out.
> > >
> >
> > Thanks for the info, this clarifies the issue in detail.
> >
> > > >
> > > > I'm not pointing fingers here, but trying to understand why can't you fix
> > > > ath11k_core_resume() to not timeout? IMO this timeout should be handled as a
> > > > deferral case.
> > > Let's see what happens if we do it in a deferral way:
> > > > 1. In ath11k_core_resume() we return success directly without waiting for
> > > QMI downloading other firmware files.
> > > 2. Kernel unblocks device probe and schedules a work item to trigger all
> > > deferred probing. As a result MHI devices are probed by qcom_mhi_qrtr_driver
> > > and finally QMI is online.
> > > > 3. Kernel continues to resume and wakes up userspace applications.
> > > 4. ath11k gets the message, either by kernel PM notification or something
> > > else, that QMI is ready and then downloads other firmware files.
> > >
> > > What happens if userspace applications or network stack immediately initiate
> > > > some WLAN request after resume? Can ath11k handle such requests? The
> > > > answer is, most likely, no. Because there is no guarantee that QMI finishes
> > > > downloading before those requests.
> > >
> >
> > What will happen to userspace if ath11k returns an error like -EBUSY or
> > something? Will the netdev completely go away?
> It depends, and varies from application to application, we can't make the
> assumption.
>
> Besides, it doesn't make sense to return -EBUSY or something like that, if
> ath11k returns success during resume. A WLAN driver is supposed to finish
> everything, at least get back to the state before suspend, in the resume
> callback. If it couldn't, report the error.
>
Ok. So I am getting the feeling that we need to talk to the PM people to get a
proper solution. Clearly fixing the MHI code is not the right thing to do. We
might need a separate callback that gets registered by the drivers like ath11k
to wait for the dependency drivers to get probed.

Can you initiate such a discussion? You can write to linux-pm@vger.kernel.org,
"Rafael J. Wysocki" <rafael@kernel.org> and Pavel Machek <pavel@ucw.cz>.
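
To make the idea a bit more concrete, here is a rough userspace model of the
ordering I have in mind. This is only a sketch: struct model_dev, the
resume_deps_ready hook and model_resume_tail() are hypothetical names made up
for illustration, none of them exist in the kernel, and the real interface is
for the PM folks to decide.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: none of these names exist in the kernel. */
struct model_dev {
	const char *name;
	bool probed;		/* has a driver bound to this device? */
	bool ready;		/* is the device fully resumed? */
	struct model_dev *dep;	/* device whose driver we depend on, or NULL */
	/* proposed hook: called once deferred probing has been run */
	int (*resume_deps_ready)(struct model_dev *dev);
};

/* Stand-in for ath11k waiting on QMI, which needs the QRTR MHI driver. */
static int wlan_resume_deps_ready(struct model_dev *dev)
{
	if (dev->dep && !dev->dep->probed)
		return -ETIMEDOUT;	/* dependency not probed: the wait fails */
	dev->ready = true;		/* stands in for the QMI firmware download */
	return 0;
}

/* Model of the tail of resume: probe, run the hooks, only then thaw. */
static int model_resume_tail(struct model_dev **devs, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++)		/* deferred probing is triggered */
		devs[i]->probed = true;

	for (i = 0; i < n; i++) {
		if (!devs[i]->resume_deps_ready)
			continue;
		ret = devs[i]->resume_deps_ready(devs[i]);
		if (ret)
			return ret;	/* report the error, do not thaw */
	}
	return 0;			/* userspace would be thawed after this */
}
```

In this model, calling wlan_resume_deps_ready() before the dependency is
probed returns -ETIMEDOUT, which mirrors the timeout seen in
ath11k_core_resume(). Calling it from model_resume_tail(), after probing has
been unblocked, succeeds before userspace would be thawed.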
- Mani
--
மணிவண்ணன் சதாசிவம்