Linux ARM-MSM sub-architecture
From: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
To: Muhammad Usama Anjum <usama.anjum@collabora.com>,
	Krishna Chaitanya Chundru <quic_krichai@quicinc.com>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Johannes Berg <johannes@sipsolutions.net>,
	Jeff Johnson <jjohnson@kernel.org>,
	Jeffrey Hugo <quic_jhugo@quicinc.com>,
	Yan Zhen <yanzhen@vivo.com>,
	Youssef Samir <quic_yabdulra@quicinc.com>,
	Qiang Yu <quic_qianyu@quicinc.com>, Alex Elder <elder@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Kunwu Chan <chentao@kylinos.cn>
Cc: kernel@collabora.com, mhi@lists.linux.dev,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-wireless@vger.kernel.org, ath11k@lists.infradead.org
Subject: Re: [PATCH v2] bus: mhi: host: don't free bhie tables during suspend/hibernation
Date: Tue, 22 Apr 2025 08:22:47 -0600
Message-ID: <c1fdbd16-4197-4a2e-a33d-6b29cc285f0a@oss.qualcomm.com>
In-Reply-To: <1bf328cd-d301-4d1f-a8f5-7020d9e25ea5@collabora.com>

On 4/22/2025 1:23 AM, Muhammad Usama Anjum wrote:
> On 4/18/25 7:08 PM, Jeff Hugo wrote:
>> On 4/18/2025 2:10 AM, Muhammad Usama Anjum wrote:
>>> On 4/14/25 7:14 PM, Jeff Hugo wrote:
>>>> On 4/14/2025 1:32 AM, Muhammad Usama Anjum wrote:
>>>>> On 4/12/25 6:22 AM, Krishna Chaitanya Chundru wrote:
>>>>>>
>>>>>> On 4/12/2025 12:02 AM, Muhammad Usama Anjum wrote:
>>>>>>> On 4/11/25 1:39 PM, Krishna Chaitanya Chundru wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 4/11/2025 12:32 PM, Muhammad Usama Anjum wrote:
>>>>>>>>> On 4/11/25 8:37 AM, Krishna Chaitanya Chundru wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/10/2025 8:26 PM, Muhammad Usama Anjum wrote:
>>>>>>>>>>> Fix dma_direct_alloc() failure at resume time during bhie_table
>>>>>>>>>>> allocation. There is a crash report where at resume time, the memory
>>>>>>>>>>> from the dma doesn't get allocated and MHI fails to re-initialize.
>>>>>>>>>>> There may be fragmentation of some kind which fails the allocation
>>>>>>>>>>> call.
>>>>>>>>>>>
>>>>>>>>>>> To fix it, don't free the memory at power down during suspend /
>>>>>>>>>>> hibernation. Instead, use the same allocated memory again after every
>>>>>>>>>>> resume / hibernation. This patch has been tested with resume and
>>>>>>>>>>> hibernation both.
>>>>>>>>>>>
>>>>>>>>>>> The rddm is of constant size for a given hardware. While the fbc_image
>>>>>>>>>>> size depends on the firmware. If the firmware changes, we'll free and
>>>>>>>>>> Will the firmware image change between suspend and resume?
>>>>>>>>> Yes, correct.
>>>>>>>>>
>>>>>>>> Why would the firmware image size change between suspend & resume?
>>>>>>>> Who would update the firmware image after bootup?
>>>>>>>> It is not expected behaviour.
>>>>>>> I was trying to research if the firmware can change or not. I've not
>>>>>>> found any documentation on it.
>>>>>>>
>>>>>>> If the firmware is updated in the filesystem before suspend/hibernate,
>>>>>>> would the new firmware be loaded the next time the kernel resumes, as
>>>>>>> the older firmware is nowhere to be found?
>>>>>>>
>>>>>>> What do you think about this?
>>>>>>>
>>>>>> I don't think firmware can be updated before suspend/hibernate. I don't
>>>>>> see any reason why it can be updated. If you think it can be updated
>>>>>> please quote relevant doc.
>>>>> I've not found any documentation on it. Let's wait for others to review
>>>>> and if it cannot be updated, I'll remove this part.
>>>>>
>>>>
>>>> Wouldn't this be trivial to test?  Boot the device, go modify the
>>>> firmware on the filesystem, then go through a suspend cycle.
>>> I just tested this. I've used an old firmware from last year vs the
>>> latest one.
>>>
>>> Firmware A: old firmware size: 5349376
>>> Firmware B: new firmware size: 5165056
>>>
>>> A here has bigger size.
>>>
>>> 1. I loaded A at boot and then replaced the firmware in the filesystem with
>>> B before suspend. At resume time, B was loaded fine by freeing the
>>> bigger memory area and allocating the smaller one.
>>>
>>> 2. I loaded B and then replaced it with A before suspend. At resume
>>> time, memory was freed and larger memory was allocated. But the driver
>>> wasn't able to initialize correctly:
>>>
>>> [  184.051902] ath11k_pci 0000:03:00.0: timeout while waiting for restart complete
>>> [  184.051916] ath11k_pci 0000:03:00.0: failed to resume core: -110
>>> [  184.051923] ath11k_pci 0000:03:00.0: PM: dpm_run_callback(): pci_pm_resume returns -110
>>> [  184.051945] ath11k_pci 0000:03:00.0: PM: failed to resume async: error -110
>>> [  187.251911] ath11k_pci 0000:03:00.0: wmi command 16387 timeout
>>> [  187.251924] ath11k_pci 0000:03:00.0: failed to send WMI_PDEV_SET_PARAM cmd
>>> [  187.251933] ath11k_pci 0000:03:00.0: failed to enable dynamic bw: -11
>>>
>>> So should we generalize from the above that changing firmware at
>>> suspend/hibernation time isn't supported? If the firmware package is
>>> updated, does the user restart every time?
>>
>> You may want to review how other devices handle this.  I can think of
>> these threads as potential reference
>>
>> https://lore.kernel.org/all/CAPM=9twyvq3EWkwUeoTdMMj76u_sRPmUDHWrzbzEZFQ8eL++BQ@mail.gmail.com/
>> https://lore.kernel.org/all/20250207012531.621369-1-airlied@gmail.com/
> They are talking about the firmware cache, which is not being used in the
> wireless drivers. In my kernel config, the firmware cache is enabled. But
> every time the kernel needs to read the firmware, it reads from the filesystem.
> 
> What can be the way forward for this patch? Assuming my previous
> experiment with changed firmwares across suspend/resume failed, I should
> remove reuse logic and send again?

Perhaps you need to refactor the wireless drivers?

I'm not convinced your patch is valid.  If FW needs to be reloaded due 
to suspend/resume, it seems like the proper thing is to load the same FW 
that was loaded at device boot.  Per your testing, loading changed FW 
can cause a failure.  Even if it doesn't fail, will the changed firmware 
cause a "breakage" from the user perspective by modifying the device 
behavior?

This does not seem to be a problem that is relevant to all MHI devices, 
so whatever the end solution ends up being, I think that it should not 
be blanket applied to all of MHI.

-Jeff
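
For readers following the thread, the reuse-or-reallocate behaviour
described in the quoted patch description (keep the buffer across
suspend, free and reallocate only when the firmware size changes) can be
sketched in userspace roughly as below. This is only an illustration
under stated assumptions, not the MHI code: struct fw_buf,
fw_buf_prepare() and fw_buf_release() are hypothetical names, and
malloc() stands in for the DMA allocation the driver would actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a firmware image buffer (not an MHI struct). */
struct fw_buf {
	void *data;
	size_t size;
};

/*
 * Make sure buf can hold an image of new_size bytes.
 *
 * If a buffer of exactly that size is already held, it is kept and
 * *reused is set to true. Otherwise the old buffer is freed and a new
 * one is allocated, which is the step that can fail at resume time.
 *
 * Returns 0 on success, -1 if the allocation fails.
 */
int fw_buf_prepare(struct fw_buf *buf, size_t new_size, bool *reused)
{
	assert(buf != NULL && reused != NULL);

	if (buf->data != NULL && buf->size == new_size) {
		*reused = true;
		return 0;
	}

	/* Size changed (or nothing held yet): drop the old buffer first. */
	free(buf->data);
	buf->data = NULL;
	buf->size = 0;
	*reused = false;

	buf->data = malloc(new_size);
	if (buf->data == NULL)
		return -1;

	buf->size = new_size;
	return 0;
}

/* Free the buffer for good, e.g. when the device is removed. */
void fw_buf_release(struct fw_buf *buf)
{
	assert(buf != NULL);

	free(buf->data);
	buf->data = NULL;
	buf->size = 0;
}
```

Calling fw_buf_prepare() twice with the size of firmware A (5349376)
takes the reuse path on the second call, while switching to the size of
firmware B (5165056) takes the free-and-reallocate path. The sketch only
models buffer handling; it says nothing about whether the device
accepts a changed image, which is the failure reported in the test.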
