* [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
@ 2025-04-29 12:20 Muhammad Usama Anjum
2025-05-01 16:00 ` Greg Kroah-Hartman
0 siblings, 1 reply; 8+ messages in thread
From: Muhammad Usama Anjum @ 2025-04-29 12:20 UTC
To: Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo, Youssef Samir,
Matthew Leung, Muhammad Usama Anjum, Yan Zhen, Alex Elder,
Jacek Lawrynowicz, Kunwu Chan, Greg Kroah-Hartman, Troy Hanson,
Dr. David Alan Gilbert
Cc: kernel, mhi, linux-arm-msm, linux-kernel, linux-wireless, ath11k,
ath12k
Fix a dma_direct_alloc() failure at resume time during bhie_table
allocation. A crash report shows that, at resume time, the DMA
allocation fails and MHI cannot re-initialize. The system is under
memory fragmentation/pressure, and the failing request is order 7
(see the log below).
To fix it, don't free the memory at power down during suspend /
hibernation. Instead, reuse the already allocated memory on every
resume from suspend or hibernation. This patch has been tested with
both suspend/resume and hibernation.
The rddm size is constant for a given hardware, while the fbc_image
size depends on the firmware. If the firmware size changes, the old
table is freed and a new one is allocated.
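The reuse rule described above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: struct fw_cache,
fw_cache_get() and fw_cache_release() are hypothetical names, and
malloc()/free() stand in for mhi_alloc_bhie_table()/mhi_free_bhie_table().
It is not the driver code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fbc_image / prev_fw_sz pair. */
struct fw_cache {
	void *buf;      /* kept across suspend/resume cycles */
	size_t prev_sz; /* size that buf was allocated with */
};

/*
 * Return a buffer of fw_sz bytes, reusing the cached one when the size
 * is unchanged. Returns NULL on allocation failure.
 */
static void *fw_cache_get(struct fw_cache *c, size_t fw_sz)
{
	/* Firmware size changed: drop the old buffer. */
	if (c->buf && fw_sz != c->prev_sz) {
		free(c->buf);
		c->buf = NULL;
	}

	/* First power-up, or the buffer was just dropped above. */
	if (!c->buf) {
		c->buf = malloc(fw_sz);
		if (!c->buf)
			return NULL;
		c->prev_sz = fw_sz;
	}

	return c->buf;
}

/* Full teardown (not suspend): really free the buffer. */
static void fw_cache_release(struct fw_cache *c)
{
	free(c->buf);
	c->buf = NULL;
	c->prev_sz = 0;
}
```

Calling fw_cache_get() twice with the same size returns the same
pointer without touching the allocator, which is the property the patch
relies on at resume time.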
Here are the crash logs:
[ 3029.338587] mhi mhi0: Requested to power ON
[ 3029.338621] mhi mhi0: Power on setup success
[ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
[ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
[ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
[ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
[ 3029.668717] Call Trace:
[ 3029.668722] <TASK>
[ 3029.668728] dump_stack_lvl+0x4e/0x70
[ 3029.668738] warn_alloc+0x164/0x190
[ 3029.668747] ? srso_return_thunk+0x5/0x5f
[ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
[ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
[ 3029.668774] __alloc_pages_noprof+0x321/0x350
[ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
[ 3029.668790] dma_direct_alloc+0x70/0x270
[ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
[ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
[ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
[ 3029.668844] ? srso_return_thunk+0x5/0x5f
[ 3029.668853] process_one_work+0x17e/0x330
[ 3029.668861] worker_thread+0x2ce/0x3f0
[ 3029.668868] ? __pfx_worker_thread+0x10/0x10
[ 3029.668873] kthread+0xd2/0x100
[ 3029.668879] ? __pfx_kthread+0x10/0x10
[ 3029.668885] ret_from_fork+0x34/0x50
[ 3029.668892] ? __pfx_kthread+0x10/0x10
[ 3029.668898] ret_from_fork_asm+0x1a/0x30
[ 3029.668910] </TASK>
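For scale, order:7 in the log above is a request for 2^7 physically
contiguous pages. A small sketch of the arithmetic, assuming the 4 KiB
base page size of x86-64; order_to_bytes() is a hypothetical helper,
not a kernel function:

```c
#include <stddef.h>

/* Bytes covered by a page allocation of the given order. */
static size_t order_to_bytes(unsigned int order, size_t page_size)
{
	return page_size << order; /* 2^order pages */
}
```

order_to_bytes(7, 4096) is 524288 bytes, that is 128 contiguous pages
or 512 KiB, which is hard to satisfy once memory is fragmented.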
Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Skip freeing the bhie tables only during suspend/hibernation
- Handle fbc_image changed size correctly
- Remove fbc_image getting set to NULL in *free_bhie_table()
Changes since v2:
- Remove the new mhi_partial_unprepare_after_power_down() and instead
update mhi_power_down_keep_dev() to use
mhi_power_down_unprepare_keep_dev() as suggested by Mani
- Update all users of this API such as ath12k (previously only ath11k
was updated)
- Define prev_fw_sz in docs
- Do better alignment of comments
Tested on ath11k.
---
drivers/bus/mhi/host/boot.c | 15 +++++++++++----
drivers/bus/mhi/host/init.c | 5 +++--
drivers/bus/mhi/host/pm.c | 9 +++++++++
drivers/net/wireless/ath/ath11k/mhi.c | 8 ++++----
drivers/net/wireless/ath/ath12k/mhi.c | 8 ++++----
include/linux/mhi.h | 2 ++
6 files changed, 33 insertions(+), 14 deletions(-)
diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
index efa3b6dddf4d2..bc8459798bbee 100644
--- a/drivers/bus/mhi/host/boot.c
+++ b/drivers/bus/mhi/host/boot.c
@@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl)
* device transitioning into MHI READY state
*/
if (fw_load_type == MHI_FW_LOAD_FBC) {
- ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz);
- if (ret) {
- release_firmware(firmware);
- goto error_fw_load;
+ if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) {
+ mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image);
+ mhi_cntrl->fbc_image = NULL;
+ }
+ if (!mhi_cntrl->fbc_image) {
+ ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz);
+ if (ret) {
+ release_firmware(firmware);
+ goto error_fw_load;
+ }
+ mhi_cntrl->prev_fw_sz = fw_sz;
}
/* Load the firmware into BHIE vec table */
diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 13e7a55f54ff4..a7663ad16bfc6 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -1173,8 +1173,9 @@ int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl)
/*
* Allocate RDDM table for debugging purpose if specified
*/
- mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image,
- mhi_cntrl->rddm_size);
+ if (!mhi_cntrl->rddm_image)
+ mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image,
+ mhi_cntrl->rddm_size);
if (mhi_cntrl->rddm_image) {
ret = mhi_rddm_prepare(mhi_cntrl,
mhi_cntrl->rddm_image);
diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
index e6c3ff62bab1d..b726b000d8a5d 100644
--- a/drivers/bus/mhi/host/pm.c
+++ b/drivers/bus/mhi/host/pm.c
@@ -1259,10 +1259,19 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
}
EXPORT_SYMBOL_GPL(mhi_power_down);
+void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
+{
+ mhi_cntrl->bhi = NULL;
+ mhi_cntrl->bhie = NULL;
+
+ mhi_deinit_dev_ctxt(mhi_cntrl);
+}
+
void mhi_power_down_keep_dev(struct mhi_controller *mhi_cntrl,
bool graceful)
{
__mhi_power_down(mhi_cntrl, graceful, false);
+ mhi_power_down_unprepare_keep_dev(mhi_cntrl);
}
EXPORT_SYMBOL_GPL(mhi_power_down_keep_dev);
diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c
index acd76e9392d31..c5dc776b23643 100644
--- a/drivers/net/wireless/ath/ath11k/mhi.c
+++ b/drivers/net/wireless/ath/ath11k/mhi.c
@@ -460,12 +460,12 @@ void ath11k_mhi_stop(struct ath11k_pci *ab_pci, bool is_suspend)
* workaround, otherwise ath11k_core_resume() will timeout
* during resume.
*/
- if (is_suspend)
+ if (is_suspend) {
mhi_power_down_keep_dev(ab_pci->mhi_ctrl, true);
- else
+ } else {
mhi_power_down(ab_pci->mhi_ctrl, true);
-
- mhi_unprepare_after_power_down(ab_pci->mhi_ctrl);
+ mhi_unprepare_after_power_down(ab_pci->mhi_ctrl);
+ }
}
int ath11k_mhi_suspend(struct ath11k_pci *ab_pci)
diff --git a/drivers/net/wireless/ath/ath12k/mhi.c b/drivers/net/wireless/ath/ath12k/mhi.c
index 08f44baf182a5..cb7f789d873f2 100644
--- a/drivers/net/wireless/ath/ath12k/mhi.c
+++ b/drivers/net/wireless/ath/ath12k/mhi.c
@@ -635,12 +635,12 @@ void ath12k_mhi_stop(struct ath12k_pci *ab_pci, bool is_suspend)
* workaround, otherwise ath12k_core_resume() will timeout
* during resume.
*/
- if (is_suspend)
+ if (is_suspend) {
ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF_KEEP_DEV);
- else
+ } else {
ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF);
-
- ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT);
+ ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT);
+ }
}
void ath12k_mhi_suspend(struct ath12k_pci *ab_pci)
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index dd372b0123a6d..6fd218a877855 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -306,6 +306,7 @@ struct mhi_controller_config {
* if fw_image is NULL and fbc_download is true (optional)
* @fw_sz: Firmware image data size for normal booting, used only if fw_image
* is NULL and fbc_download is true (optional)
+ * @prev_fw_sz: Previous firmware image data size, when fbc_download is true
* @edl_image: Firmware image name for emergency download mode (optional)
* @rddm_size: RAM dump size that host should allocate for debugging purpose
* @sbl_size: SBL image size downloaded through BHIe (optional)
@@ -382,6 +383,7 @@ struct mhi_controller {
const char *fw_image;
const u8 *fw_data;
size_t fw_sz;
+ size_t prev_fw_sz;
const char *edl_image;
size_t rddm_size;
size_t sbl_size;
--
2.43.0
* [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
@ 2025-04-29 12:23 Muhammad Usama Anjum
2025-04-29 20:55 ` Sebastian Reichel
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Muhammad Usama Anjum @ 2025-04-29 12:23 UTC
To: Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo, Youssef Samir,
Matthew Leung, Muhammad Usama Anjum, Yan Zhen, Kunwu Chan,
Greg Kroah-Hartman, Dr. David Alan Gilbert, Troy Hanson
Cc: kernel, Carl Vanderlip, Sumit Garg, mhi, linux-arm-msm,
linux-kernel, linux-wireless, ath11k, ath12k
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-04-29 12:23 [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation Muhammad Usama Anjum
@ 2025-04-29 20:55 ` Sebastian Reichel
2025-04-30 14:29 ` kernel test robot
2025-04-30 16:45 ` kernel test robot
2 siblings, 0 replies; 8+ messages in thread
From: Sebastian Reichel @ 2025-04-29 20:55 UTC
To: Muhammad Usama Anjum
Cc: Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo, Youssef Samir,
Matthew Leung, Yan Zhen, Kunwu Chan, Greg Kroah-Hartman,
Dr. David Alan Gilbert, Troy Hanson, kernel, Carl Vanderlip,
Sumit Garg, mhi, linux-arm-msm, linux-kernel, linux-wireless,
ath11k, ath12k
Hi,
On Tue, Apr 29, 2025 at 05:23:35PM +0500, Muhammad Usama Anjum wrote:
> Fix dma_direct_alloc() failure at resume time during bhie_table
> allocation. There is a crash report where at resume time, the memory
> from the dma doesn't get allocated and MHI fails to re-initialize.
> There is fragmentation/memory pressure.
>
> To fix it, don't free the memory at power down during suspend /
> hibernation. Instead, use the same allocated memory again after every
> resume / hibernation. This patch has been tested with resume and
> hibernation both.
>
> The rddm is of constant size for a given hardware. While the fbc_image
> size depends on the firmware. If the firmware changes, we'll free and
> allocate new memory for it.
>
> Here are the crash logs:
>
> [ 3029.338587] mhi mhi0: Requested to power ON
> [ 3029.338621] mhi mhi0: Power on setup success
> [ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
> [ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> [ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> [ 3029.668717] Call Trace:
> [ 3029.668722] <TASK>
> [ 3029.668728] dump_stack_lvl+0x4e/0x70
> [ 3029.668738] warn_alloc+0x164/0x190
> [ 3029.668747] ? srso_return_thunk+0x5/0x5f
> [ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
> [ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> [ 3029.668774] __alloc_pages_noprof+0x321/0x350
> [ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
> [ 3029.668790] dma_direct_alloc+0x70/0x270
> [ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668844] ? srso_return_thunk+0x5/0x5f
> [ 3029.668853] process_one_work+0x17e/0x330
> [ 3029.668861] worker_thread+0x2ce/0x3f0
> [ 3029.668868] ? __pfx_worker_thread+0x10/0x10
> [ 3029.668873] kthread+0xd2/0x100
> [ 3029.668879] ? __pfx_kthread+0x10/0x10
> [ 3029.668885] ret_from_fork+0x34/0x50
> [ 3029.668892] ? __pfx_kthread+0x10/0x10
> [ 3029.668898] ret_from_fork_asm+0x1a/0x30
> [ 3029.668910] </TASK>
>
> Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
> ---
This breaks ath12k on my T14s Snapdragon with WCN785x. After a
suspend/resume cycle the following is in my logs (and the resume
is super slow). Additionally at shutdown ath12k crashes with a
NULL pointer dereference in mhi_deinit_dev_ctxt, which got called
by mhi_unprepare_after_power_down, which got called by ath12k_mhi_stop.
This happens after filesystem umount and I don't have anything
configured right now to get logs from that point, so it is not
included in the log from the suspend/resume cycle down below:
...
[ 28.385370] ath12k_pci 0004:01:00.0: failed to set mhi state INIT(0) in current mhi state (0x1)
[ 28.385379] ath12k_pci 0004:01:00.0: failed to set mhi state: INIT(0)
[ 28.385383] ath12k_pci 0004:01:00.0: failed to start mhi: -22
[ 28.385387] ath12k_pci 0004:01:00.0: failed to power up hif during resume: -22
[ 28.385391] ath12k_pci 0004:01:00.0: failed to early resume core: -22
[ 28.385393] ath12k_pci 0004:01:00.0: PM: dpm_run_callback(): pci_pm_resume_early returns -22
[ 28.385413] ath12k_pci 0004:01:00.0: PM: failed to resume async early: error -22
[ 28.385513] qcom_mhi_qrtr mhi0_IPCR: Current EE: DISABLE Required EE Mask: 0x4
[ 28.385521] qcom_mhi_qrtr mhi0_IPCR: failed to prepare for autoqueue transfer -107
[ 28.385526] qcom_mhi_qrtr mhi0_IPCR: PM: dpm_run_callback(): qcom_mhi_qrtr_pm_resume_early [qrtr_mhi] returns -107
[ 28.385541] qcom_mhi_qrtr mhi0_IPCR: PM: failed to resume early: error -107
[ 50.146823] ath12k_pci 0004:01:00.0: timeout while waiting for restart complete
[ 50.146830] ath12k_pci 0004:01:00.0: failed to resume core: -110
[ 50.146834] ath12k_pci 0004:01:00.0: PM: dpm_run_callback(): pci_pm_resume returns -110
[ 50.146849] ath12k_pci 0004:01:00.0: PM: failed to resume async: error -110
[ 53.218794] ath12k_pci 0004:01:00.0: wmi command 16387 timeout
[ 53.218801] ath12k_pci 0004:01:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[ 53.218808] ath12k_pci 0004:01:00.0: failed to set ac override for ARP: -11
[ 53.218813] ath12k_pci 0004:01:00.0: fail to start mac operations in pdev idx 0 ret -11
[ 53.218817] ------------[ cut here ]------------
[ 53.218820] Hardware became unavailable upon resume. This could be a software issue prior to suspend or a hardware issue.
[ 53.218855] WARNING: CPU: 2 PID: 1958 at net/mac80211/util.c:1829 ieee80211_reconfig+0x37c/0x1718 [mac80211]
[ 53.218936] Modules linked in: reset_gpio snd_soc_wsa884x q6prm_clocks q6apm_dai q6apm_lpass_dais snd_q6dsp_common q6prm michael_mic rfcomm wireguard libchacha20poly1305 chacha_neon libchacha poly1305_neon ip6_udp_tunnel udp_tunnel libcurve25519_generic binfmt_misc qrtr_mhi ath12k mac80211 libarc4 cfg80211 mhi hci_uart btqca btbcm snd_soc_x1e80100 snd_soc_qcom_sdw snd_soc_qcom_common bluetooth ecdh_generic ecc qcom_spmi_temp_alarm rfkill snd_q6apm snd_soc_hdmi_codec fastrpc snd_soc_lpass_va_macro snd_soc_lpass_tx_macro snd_soc_lpass_rx_macro snd_soc_lpass_wsa_macro soundwire_qcom snd_soc_wcd938x slimbus snd_soc_lpass_macro_common snd_soc_wcd938x_sdw pci_pwrctrl_pwrseq regmap_sdw snd_soc_wcd_mbhc coresight_stm coresight_funnel coresight_tmc snd_soc_wcd_classh coresight_cti stm_core coresight_replicator soundwire_bus coresight mux_gpio fuse nfnetlink ip_tables x_tables ipv6 gpio_sbu_mux panel_edp msm hid_multitouch drm_exec ocmem gpu_sched drm_dp_aux_bus rpmsg_ctrl apr rpmsg_char qrtr_smd i2c_hid_of qcom_pd_mapper
[ 53.219100] ps883x phy_nxp_ptn3222 i2c_hid drm_display_helper nvme phy_qcom_qmp_combo leds_qcom_lpg ucsi_glink pmic_glink_altmode nvme_core aux_hpd_bridge typec_ucsi qcom_battmgr sm3_ce sm3 led_class_multicolor qcom_q6v5_pas sha3_ce rtc_pm8xxx phy_qcom_eusb2_repeater qcom_pbs drm_client_lib aux_bridge sha512_ce qcom_pil_info drm_kms_helper qcom_common qcom_pon sha512_arm64 qcom_glink_smem typec qcom_q6v5 nvmem_qcom_spmi_sdam dispcc_x1e80100 drm pwrseq_qcom_wcn pinctrl_sm8550_lpass_lpi pwrseq_core i2c_qcom_geni qcom_stats pinctrl_lpass_lpi phy_qcom_edp phy_qcom_qmp_usb qcom_sysmon tcsrcc_x1e80100 llcc_qcom gpucc_x1e80100 phy_qcom_snps_eusb2 mdt_loader lpasscc_sc8280xp pcie_qcom qcom_cpucp_mbox icc_bwmon phy_qcom_qmp_pcie qrtr pmic_glink pdr_interface qcom_pdr_msg pwm_bl socinfo backlight qmi_helpers
[ 53.219234] CPU: 2 UID: 0 PID: 1958 Comm: kworker/u49:49 Not tainted 6.15.0-rc4+ #95 PREEMPT
[ 53.219241] Hardware name: LENOVO 21N1CTO1WW/21N1CTO1WW, BIOS N42ET85W (2.15 ) 11/22/2024
[ 53.219245] Workqueue: async async_run_entry_fn
[ 53.219258] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 53.219265] pc : ieee80211_reconfig+0x37c/0x1718 [mac80211]
[ 53.219315] lr : ieee80211_reconfig+0x37c/0x1718 [mac80211]
[ 53.219362] sp : ffff8000853ebb30
[ 53.219364] x29: ffff8000853ebbf0 x28: 0000000000000000 x27: 0000000000000000
[ 53.219373] x26: ffff1ce140047428 x25: 0000000000000000 x24: ffff1ce1408f7c05
[ 53.219380] x23: ffff1ce14aaa05b8 x22: 0000000000000010 x21: 00000000fffffff5
[ 53.219387] x20: 0000000000000000 x19: ffff1ce14aaa0900 x18: 00000000fffffffe
[ 53.219394] x17: 72617774666f7320 x16: 6120656220646c75 x15: 6f63207369685420
[ 53.219401] x14: 2e656d7573657220 x13: 0a2e657573736920 x12: 6572617764726168
[ 53.219408] x11: 0000000000000058 x10: 0000000000000018 x9 : ffffdacf6aa7749c
[ 53.219415] x8 : 0000000000000507 x7 : ffffdacf6d031138 x6 : ffffdacf6d031138
[ 53.219422] x5 : ffff1ce8bbe76508 x4 : 0000000000000000 x3 : ffff42194efd1000
[ 53.219429] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff1ce149caa300
[ 53.219437] Call trace:
[ 53.219440] ieee80211_reconfig+0x37c/0x1718 [mac80211] (P)
[ 53.219490] ieee80211_resume+0x54/0x78 [mac80211]
[ 53.219541] wiphy_resume+0x8c/0x200 [cfg80211]
[ 53.219603] dpm_run_callback+0x50/0x188
[ 53.219614] device_resume+0xc4/0x1f8
[ 53.219621] async_resume+0x2c/0x50
[ 53.219628] async_run_entry_fn+0x3c/0x160
[ 53.219634] process_one_work+0x158/0x3c8
[ 53.219643] worker_thread+0x2e0/0x418
[ 53.219650] kthread+0x14c/0x230
[ 53.219657] ret_from_fork+0x10/0x20
[ 53.219666] ---[ end trace 0000000000000000 ]---
[ 53.220154] ------------[ cut here ]------------
[ 53.220158] WARNING: CPU: 2 PID: 1958 at net/mac80211/driver-ops.c:41 drv_stop+0x1cc/0x1e8 [mac80211]
[ 53.220235] Modules linked in: reset_gpio snd_soc_wsa884x q6prm_clocks q6apm_dai q6apm_lpass_dais snd_q6dsp_common q6prm michael_mic rfcomm wireguard libchacha20poly1305 chacha_neon libchacha poly1305_neon ip6_udp_tunnel udp_tunnel libcurve25519_generic binfmt_misc qrtr_mhi ath12k mac80211 libarc4 cfg80211 mhi hci_uart btqca btbcm snd_soc_x1e80100 snd_soc_qcom_sdw snd_soc_qcom_common bluetooth ecdh_generic ecc qcom_spmi_temp_alarm rfkill snd_q6apm snd_soc_hdmi_codec fastrpc snd_soc_lpass_va_macro snd_soc_lpass_tx_macro snd_soc_lpass_rx_macro snd_soc_lpass_wsa_macro soundwire_qcom snd_soc_wcd938x slimbus snd_soc_lpass_macro_common snd_soc_wcd938x_sdw pci_pwrctrl_pwrseq regmap_sdw snd_soc_wcd_mbhc coresight_stm coresight_funnel coresight_tmc snd_soc_wcd_classh coresight_cti stm_core coresight_replicator soundwire_bus coresight mux_gpio fuse nfnetlink ip_tables x_tables ipv6 gpio_sbu_mux panel_edp msm hid_multitouch drm_exec ocmem gpu_sched drm_dp_aux_bus rpmsg_ctrl apr rpmsg_char qrtr_smd i2c_hid_of qcom_pd_mapper
[ 53.220351] ps883x phy_nxp_ptn3222 i2c_hid drm_display_helper nvme phy_qcom_qmp_combo leds_qcom_lpg ucsi_glink pmic_glink_altmode nvme_core aux_hpd_bridge typec_ucsi qcom_battmgr sm3_ce sm3 led_class_multicolor qcom_q6v5_pas sha3_ce rtc_pm8xxx phy_qcom_eusb2_repeater qcom_pbs drm_client_lib aux_bridge sha512_ce qcom_pil_info drm_kms_helper qcom_common qcom_pon sha512_arm64 qcom_glink_smem typec qcom_q6v5 nvmem_qcom_spmi_sdam dispcc_x1e80100 drm pwrseq_qcom_wcn pinctrl_sm8550_lpass_lpi pwrseq_core i2c_qcom_geni qcom_stats pinctrl_lpass_lpi phy_qcom_edp phy_qcom_qmp_usb qcom_sysmon tcsrcc_x1e80100 llcc_qcom gpucc_x1e80100 phy_qcom_snps_eusb2 mdt_loader lpasscc_sc8280xp pcie_qcom qcom_cpucp_mbox icc_bwmon phy_qcom_qmp_pcie qrtr pmic_glink pdr_interface qcom_pdr_msg pwm_bl socinfo backlight qmi_helpers
[ 53.220444] CPU: 2 UID: 0 PID: 1958 Comm: kworker/u49:49 Tainted: G W 6.15.0-rc4+ #95 PREEMPT
[ 53.220452] Tainted: [W]=WARN
[ 53.220455] Hardware name: LENOVO 21N1CTO1WW/21N1CTO1WW, BIOS N42ET85W (2.15 ) 11/22/2024
[ 53.220458] Workqueue: async async_run_entry_fn
[ 53.220467] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 53.220472] pc : drv_stop+0x1cc/0x1e8 [mac80211]
[ 53.220521] lr : ieee80211_stop_device+0x8c/0xa8 [mac80211]
[ 53.220580] sp : ffff8000853eb9f0
[ 53.220582] x29: ffff8000853eb9f0 x28: 0000000000000000 x27: 0000000000000000
[ 53.220591] x26: ffff1ce140047428 x25: ffff8000853eba50 x24: ffff8000853eba50
[ 53.220598] x23: 0000000000000000 x22: 0000000000000001 x21: 0000000000000000
[ 53.220604] x20: 0000000000000000 x19: ffff1ce14aaa0900 x18: 00000000fffffffe
[ 53.220611] x17: ffff42194efd1000 x16: ffff800080010000 x15: 6f63207369685420
[ 53.220618] x14: 000000000000037f x13: 000000000000037f x12: 071c71c71c71c71c
[ 53.220625] x11: ffff1ce8bbe88b8c x10: 1f0348adc6bb8584 x9 : ffffdacf67622b3c
[ 53.220633] x8 : ffff1ce149e1e550 x7 : 0000000000000000 x6 : 000000000000003f
[ 53.220640] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000003
[ 53.220646] x2 : 0000000000000001 x1 : 0000000000000000 x0 : 0000000000000000
[ 53.220652] Call trace:
[ 53.220654] drv_stop+0x1cc/0x1e8 [mac80211] (P)
[ 53.220702] ieee80211_stop_device+0x8c/0xa8 [mac80211]
[ 53.220751] ieee80211_do_stop+0x644/0x830 [mac80211]
[ 53.220798] ieee80211_stop+0x60/0x1b0 [mac80211]
[ 53.220845] __dev_close_many+0xbc/0x1f0
[ 53.220857] dev_close_many+0x94/0x160
[ 53.220863] netif_close+0x78/0xa0
[ 53.220868] dev_close+0x3c/0x70
[ 53.220876] cfg80211_shutdown_all_interfaces+0x4c/0x118 [cfg80211]
[ 53.220935] wiphy_resume+0xc0/0x200 [cfg80211]
[ 53.220985] dpm_run_callback+0x50/0x188
[ 53.220992] device_resume+0xc4/0x1f8
[ 53.220999] async_resume+0x2c/0x50
[ 53.221006] async_run_entry_fn+0x3c/0x160
[ 53.221012] process_one_work+0x158/0x3c8
[ 53.221020] worker_thread+0x2e0/0x418
[ 53.221027] kthread+0x14c/0x230
[ 53.221033] ret_from_fork+0x10/0x20
[ 53.221039] ---[ end trace 0000000000000000 ]---
[ 53.221223] ieee80211 phy0: PM: dpm_run_callback(): wiphy_resume [cfg80211] returns -11
[ 53.221277] ieee80211 phy0: PM: failed to resume async: error -11
[ 53.667179] OOM killer enabled.
[ 53.667182] Restarting tasks ... done.
[ 53.668270] random: crng reseeded on system resumption
[ 53.668317] PM: suspend exit
[ 56.804822] ath12k_pci 0004:01:00.0: wmi command 16387 timeout
[ 56.804845] ath12k_pci 0004:01:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[ 56.804859] ath12k_pci 0004:01:00.0: failed to enable PMF QOS: (-11
[ 56.804872] ath12k_pci 0004:01:00.0: fail to start mac operations in pdev idx 0 ret -11
...
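One possible reading of the first error above, not confirmed anywhere
in this thread: the log shows INIT being rejected while the state word
is 0x1, and the ath12k hunk of the patch no longer runs
ATH12K_MHI_DEINIT on the suspend path. The sketch below models that
interaction with a single INIT bit. struct model_dev, model_init(),
model_deinit() and model_stop() are hypothetical names, and the rule
that INIT is refused while the bit is still set is an assumption
inferred from the log message, not taken from the driver source.

```c
#include <errno.h>

#define MODEL_INIT_BIT (1UL << 0)

/* Hypothetical model of a driver-side MHI state word. */
struct model_dev {
	unsigned long state;
};

/* Assumption: INIT is only accepted while the INIT bit is clear. */
static int model_init(struct model_dev *d)
{
	if (d->state & MODEL_INIT_BIT)
		return -EINVAL; /* -22, the error seen in the log */
	d->state |= MODEL_INIT_BIT;
	return 0;
}

/* DEINIT clears the INIT bit again. */
static void model_deinit(struct model_dev *d)
{
	d->state &= ~MODEL_INIT_BIT;
}

/* Stop path shaped like the patched hunk: no DEINIT when suspending. */
static void model_stop(struct model_dev *d, int is_suspend)
{
	if (!is_suspend)
		model_deinit(d);
}
```

In this model, init followed by a suspend-style stop leaves the state
at 0x1, so the next init returns -EINVAL. A non-suspend stop clears the
bit and init succeeds again.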
-- Sebastian
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-04-29 12:23 [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation Muhammad Usama Anjum
2025-04-29 20:55 ` Sebastian Reichel
@ 2025-04-30 14:29 ` kernel test robot
2025-04-30 16:45 ` kernel test robot
2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2025-04-30 14:29 UTC
To: Muhammad Usama Anjum, Manivannan Sadhasivam, Jeff Johnson,
Jeff Hugo, Youssef Samir, Matthew Leung, Yan Zhen, Kunwu Chan,
Greg Kroah-Hartman, Dr. David Alan Gilbert, Troy Hanson
Cc: oe-kbuild-all, kernel, Carl Vanderlip, Sumit Garg, mhi,
linux-arm-msm, linux-kernel, linux-wireless, ath11k, ath12k
Hi Muhammad,
kernel test robot noticed the following build warnings:
[auto build test WARNING on ath/ath-next]
[also build test WARNING on next-20250430]
[cannot apply to mani-mhi/mhi-next char-misc/char-misc-testing char-misc/char-misc-next char-misc/char-misc-linus staging/staging-testing staging/staging-next staging/staging-linus usb/usb-testing usb/usb-next usb/usb-linus linus/master v6.15-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Muhammad-Usama-Anjum/bus-mhi-host-don-t-free-bhie-tables-during-suspend-hibernation/20250429-202649
base: https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git ath-next
patch link: https://lore.kernel.org/r/20250429122351.108684-1-usama.anjum%40collabora.com
patch subject: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
config: arm-randconfig-001-20250430 (https://download.01.org/0day-ci/archive/20250430/202504302208.7JSH4wb6-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 10.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250430/202504302208.7JSH4wb6-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202504302208.7JSH4wb6-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/bus/mhi/host/pm.c:1246:6: warning: no previous prototype for 'mhi_power_down_unprepare_keep_dev' [-Wmissing-prototypes]
1246 | void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vim +/mhi_power_down_unprepare_keep_dev +1246 drivers/bus/mhi/host/pm.c
1245
> 1246 void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
1247 {
1248 mhi_cntrl->bhi = NULL;
1249 mhi_cntrl->bhie = NULL;
1250
1251 mhi_deinit_dev_ctxt(mhi_cntrl);
1252 }
1253
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-04-29 12:23 [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation Muhammad Usama Anjum
2025-04-29 20:55 ` Sebastian Reichel
2025-04-30 14:29 ` kernel test robot
@ 2025-04-30 16:45 ` kernel test robot
2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2025-04-30 16:45 UTC (permalink / raw)
To: Muhammad Usama Anjum, Manivannan Sadhasivam, Jeff Johnson,
Jeff Hugo, Youssef Samir, Matthew Leung, Yan Zhen, Kunwu Chan,
Greg Kroah-Hartman, Dr. David Alan Gilbert, Troy Hanson
Cc: llvm, oe-kbuild-all, kernel, Carl Vanderlip, Sumit Garg, mhi,
linux-arm-msm, linux-kernel, linux-wireless, ath11k, ath12k
Hi Muhammad,
kernel test robot noticed the following build warnings:
[auto build test WARNING on ath/ath-next]
[also build test WARNING on next-20250430]
[cannot apply to mani-mhi/mhi-next char-misc/char-misc-testing char-misc/char-misc-next char-misc/char-misc-linus staging/staging-testing staging/staging-next staging/staging-linus usb/usb-testing usb/usb-next usb/usb-linus linus/master v6.15-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Muhammad-Usama-Anjum/bus-mhi-host-don-t-free-bhie-tables-during-suspend-hibernation/20250429-202649
base: https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git ath-next
patch link: https://lore.kernel.org/r/20250429122351.108684-1-usama.anjum%40collabora.com
patch subject: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
config: arm64-randconfig-002-20250430 (https://download.01.org/0day-ci/archive/20250501/202505010037.1PMLamw8-lkp@intel.com/config)
compiler: clang version 21.0.0git (https://github.com/llvm/llvm-project f819f46284f2a79790038e1f6649172789734ae8)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250501/202505010037.1PMLamw8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505010037.1PMLamw8-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/bus/mhi/host/pm.c:1246:6: warning: no previous prototype for function 'mhi_power_down_unprepare_keep_dev' [-Wmissing-prototypes]
1246 | void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
| ^
drivers/bus/mhi/host/pm.c:1246:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
1246 | void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
| ^
| static
1 warning generated.
vim +/mhi_power_down_unprepare_keep_dev +1246 drivers/bus/mhi/host/pm.c
1245
> 1246 void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
1247 {
1248 mhi_cntrl->bhi = NULL;
1249 mhi_cntrl->bhie = NULL;
1250
1251 mhi_deinit_dev_ctxt(mhi_cntrl);
1252 }
1253
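For context, both reports flag the same thing: a non-static function is defined without a prior declaration. A minimal userspace sketch of the usual remedy follows. The struct here is a two-field stand-in (an assumption, not the real struct mhi_controller), the call to mhi_deinit_dev_ctxt() is left out, and which kernel header should carry the prototype is not stated in this thread.

```c
#include <stddef.h>

/* Stand-in for the real struct mhi_controller; only the two fields
 * touched by the flagged function are modelled here. */
struct mhi_controller {
	void *bhi;
	void *bhie;
};

/*
 * A declaration that is visible before the definition silences
 * -Wmissing-prototypes. In the kernel it would sit in a header that
 * pm.c includes; the choice of header is left open in this sketch.
 */
void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl);

void mhi_power_down_unprepare_keep_dev(struct mhi_controller *mhi_cntrl)
{
	mhi_cntrl->bhi = NULL;
	mhi_cntrl->bhie = NULL;
	/* mhi_deinit_dev_ctxt(mhi_cntrl) is omitted in this stand-in. */
}
```

If the function were only called from within pm.c, marking it static, as the clang note suggests, would be the alternative.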
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-04-29 12:20 Muhammad Usama Anjum
@ 2025-05-01 16:00 ` Greg Kroah-Hartman
2025-05-02 4:15 ` Muhammad Usama Anjum
0 siblings, 1 reply; 8+ messages in thread
From: Greg Kroah-Hartman @ 2025-05-01 16:00 UTC (permalink / raw)
To: Muhammad Usama Anjum
Cc: Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo, Youssef Samir,
Matthew Leung, Yan Zhen, Alex Elder, Jacek Lawrynowicz,
Kunwu Chan, Troy Hanson, Dr. David Alan Gilbert, kernel, mhi,
linux-arm-msm, linux-kernel, linux-wireless, ath11k, ath12k
On Tue, Apr 29, 2025 at 05:20:56PM +0500, Muhammad Usama Anjum wrote:
> Fix dma_direct_alloc() failure at resume time during bhie_table
> allocation. There is a crash report where at resume time, the memory
> from the dma doesn't get allocated and MHI fails to re-initialize.
> There is fragmentation/memory pressure.
>
> To fix it, don't free the memory at power down during suspend /
> hibernation. Instead, use the same allocated memory again after every
> resume / hibernation. This patch has been tested with resume and
> hibernation both.
>
> The rddm is of constant size for a given hardware. While the fbc_image
> size depends on the firmware. If the firmware changes, we'll free and
> allocate new memory for it.
>
> Here are the crash logs:
>
> [ 3029.338587] mhi mhi0: Requested to power ON
> [ 3029.338621] mhi mhi0: Power on setup success
> [ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
> [ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> [ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> [ 3029.668717] Call Trace:
> [ 3029.668722] <TASK>
> [ 3029.668728] dump_stack_lvl+0x4e/0x70
> [ 3029.668738] warn_alloc+0x164/0x190
> [ 3029.668747] ? srso_return_thunk+0x5/0x5f
> [ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
> [ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> [ 3029.668774] __alloc_pages_noprof+0x321/0x350
> [ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
> [ 3029.668790] dma_direct_alloc+0x70/0x270
> [ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> [ 3029.668844] ? srso_return_thunk+0x5/0x5f
> [ 3029.668853] process_one_work+0x17e/0x330
> [ 3029.668861] worker_thread+0x2ce/0x3f0
> [ 3029.668868] ? __pfx_worker_thread+0x10/0x10
> [ 3029.668873] kthread+0xd2/0x100
> [ 3029.668879] ? __pfx_kthread+0x10/0x10
> [ 3029.668885] ret_from_fork+0x34/0x50
> [ 3029.668892] ? __pfx_kthread+0x10/0x10
> [ 3029.668898] ret_from_fork_asm+0x1a/0x30
> [ 3029.668910] </TASK>
>
> Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
>
> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
What commit id does this fix? Should it go to stable kernel(s)? If so,
how far back?
thanks,
greg k-h
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-05-01 16:00 ` Greg Kroah-Hartman
@ 2025-05-02 4:15 ` Muhammad Usama Anjum
2025-05-02 5:06 ` Greg Kroah-Hartman
0 siblings, 1 reply; 8+ messages in thread
From: Muhammad Usama Anjum @ 2025-05-02 4:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: usama.anjum, Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo,
Youssef Samir, Matthew Leung, Yan Zhen, Alex Elder,
Jacek Lawrynowicz, Kunwu Chan, Troy Hanson,
Dr. David Alan Gilbert, kernel, mhi, linux-arm-msm, linux-kernel,
linux-wireless, ath11k, ath12k
Hi Greg,
On 5/1/25 9:00 PM, Greg Kroah-Hartman wrote:
> On Tue, Apr 29, 2025 at 05:20:56PM +0500, Muhammad Usama Anjum wrote:
>> Fix dma_direct_alloc() failure at resume time during bhie_table
>> allocation. There is a crash report where at resume time, the memory
>> from the dma doesn't get allocated and MHI fails to re-initialize.
>> There is fragmentation/memory pressure.
>>
>> To fix it, don't free the memory at power down during suspend /
>> hibernation. Instead, use the same allocated memory again after every
>> resume / hibernation. This patch has been tested with resume and
>> hibernation both.
>>
>> The rddm is of constant size for a given hardware. While the fbc_image
>> size depends on the firmware. If the firmware changes, we'll free and
>> allocate new memory for it.
>>
>> Here are the crash logs:
>>
>> [ 3029.338587] mhi mhi0: Requested to power ON
>> [ 3029.338621] mhi mhi0: Power on setup success
>> [ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
>> [ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
>> [ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
>> [ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
>> [ 3029.668717] Call Trace:
>> [ 3029.668722] <TASK>
>> [ 3029.668728] dump_stack_lvl+0x4e/0x70
>> [ 3029.668738] warn_alloc+0x164/0x190
>> [ 3029.668747] ? srso_return_thunk+0x5/0x5f
>> [ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
>> [ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
>> [ 3029.668774] __alloc_pages_noprof+0x321/0x350
>> [ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
>> [ 3029.668790] dma_direct_alloc+0x70/0x270
>> [ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
>> [ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
>> [ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
>> [ 3029.668844] ? srso_return_thunk+0x5/0x5f
>> [ 3029.668853] process_one_work+0x17e/0x330
>> [ 3029.668861] worker_thread+0x2ce/0x3f0
>> [ 3029.668868] ? __pfx_worker_thread+0x10/0x10
>> [ 3029.668873] kthread+0xd2/0x100
>> [ 3029.668879] ? __pfx_kthread+0x10/0x10
>> [ 3029.668885] ret_from_fork+0x34/0x50
>> [ 3029.668892] ? __pfx_kthread+0x10/0x10
>> [ 3029.668898] ret_from_fork_asm+0x1a/0x30
>> [ 3029.668910] </TASK>
>>
>> Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
>>
>> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
>
> What commit id does this fix? Should it go to stable kernel(s)? If so,
> how far back?
This patch fixes the dma_alloc_coherent() failure that occurs when there is
memory pressure and the kernel is unable to allocate memory. It's not a bug in
the allocation API or the driver. I think it should be considered an
improvement rather than a fix. Please correct me if I'm wrong.
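To illustrate the approach from the patch description quoted above (keep the buffer across suspend, and free and reallocate only when the firmware size changes), here is a rough userspace sketch. It is not the patch itself: struct image_buf, image_buf_get() and image_buf_release() are hypothetical names, and malloc()/free() stand in for the DMA allocator. For scale, the failing order:7 request in the log is 2^7 = 128 contiguous pages, which is 512 KiB if pages are 4 KiB.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one cached image buffer. */
struct image_buf {
	void *mem;
	size_t len;
};

/*
 * Make sure buf holds exactly len bytes. An existing buffer of the same
 * size is reused as-is; otherwise the old one is freed and a new one is
 * allocated. Returns 0 on success, -1 if the allocation fails.
 */
static int image_buf_get(struct image_buf *buf, size_t len)
{
	if (buf->mem && buf->len == len)
		return 0;	/* same size as before: no new allocation */

	free(buf->mem);		/* free(NULL) is a no-op */
	buf->mem = malloc(len);
	buf->len = buf->mem ? len : 0;
	return buf->mem ? 0 : -1;
}

/* Called only on final teardown, not on suspend or hibernation. */
static void image_buf_release(struct image_buf *buf)
{
	free(buf->mem);
	buf->mem = NULL;
	buf->len = 0;
}
```

In this sketch the resume path calls image_buf_get() with the same length as before and so never reaches the allocator, which is the point where the reported failure occurred.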
--
Regards,
Usama
* Re: [PATCH v3] bus: mhi: host: don't free bhie tables during suspend/hibernation
2025-05-02 4:15 ` Muhammad Usama Anjum
@ 2025-05-02 5:06 ` Greg Kroah-Hartman
0 siblings, 0 replies; 8+ messages in thread
From: Greg Kroah-Hartman @ 2025-05-02 5:06 UTC (permalink / raw)
To: Muhammad Usama Anjum
Cc: Manivannan Sadhasivam, Jeff Johnson, Jeff Hugo, Youssef Samir,
Matthew Leung, Yan Zhen, Alex Elder, Jacek Lawrynowicz,
Kunwu Chan, Troy Hanson, Dr. David Alan Gilbert, kernel, mhi,
linux-arm-msm, linux-kernel, linux-wireless, ath11k, ath12k
On Fri, May 02, 2025 at 09:15:10AM +0500, Muhammad Usama Anjum wrote:
> Hi Greg,
>
> On 5/1/25 9:00 PM, Greg Kroah-Hartman wrote:
> > On Tue, Apr 29, 2025 at 05:20:56PM +0500, Muhammad Usama Anjum wrote:
> >> Fix dma_direct_alloc() failure at resume time during bhie_table
> >> allocation. There is a crash report where at resume time, the memory
> >> from the dma doesn't get allocated and MHI fails to re-initialize.
> >> There is fragmentation/memory pressure.
> >>
> >> To fix it, don't free the memory at power down during suspend /
> >> hibernation. Instead, use the same allocated memory again after every
> >> resume / hibernation. This patch has been tested with resume and
> >> hibernation both.
> >>
> >> The rddm is of constant size for a given hardware. While the fbc_image
> >> size depends on the firmware. If the firmware changes, we'll free and
> >> allocate new memory for it.
> >>
> >> Here are the crash logs:
> >>
> >> [ 3029.338587] mhi mhi0: Requested to power ON
> >> [ 3029.338621] mhi mhi0: Power on setup success
> >> [ 3029.668654] kworker/u33:8: page allocation failure: order:7, mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> >> [ 3029.668682] CPU: 4 UID: 0 PID: 2744 Comm: kworker/u33:8 Not tainted 6.11.11-valve10-1-neptune-611-gb69e902b4338 #1ed779c892334112fb968aaa3facf9686b5ff0bd7
> >> [ 3029.668690] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> >> [ 3029.668694] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> >> [ 3029.668717] Call Trace:
> >> [ 3029.668722] <TASK>
> >> [ 3029.668728] dump_stack_lvl+0x4e/0x70
> >> [ 3029.668738] warn_alloc+0x164/0x190
> >> [ 3029.668747] ? srso_return_thunk+0x5/0x5f
> >> [ 3029.668754] ? __alloc_pages_direct_compact+0xaf/0x360
> >> [ 3029.668761] __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> >> [ 3029.668774] __alloc_pages_noprof+0x321/0x350
> >> [ 3029.668782] __dma_direct_alloc_pages.isra.0+0x14a/0x290
> >> [ 3029.668790] dma_direct_alloc+0x70/0x270
> >> [ 3029.668796] mhi_alloc_bhie_table+0xe8/0x190 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> >> [ 3029.668814] mhi_fw_load_handler+0x1bc/0x310 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> >> [ 3029.668830] mhi_pm_st_worker+0x5c8/0xaa0 [mhi faa917c5aa23a5f5b12d6a2c597067e16d2fedc0]
> >> [ 3029.668844] ? srso_return_thunk+0x5/0x5f
> >> [ 3029.668853] process_one_work+0x17e/0x330
> >> [ 3029.668861] worker_thread+0x2ce/0x3f0
> >> [ 3029.668868] ? __pfx_worker_thread+0x10/0x10
> >> [ 3029.668873] kthread+0xd2/0x100
> >> [ 3029.668879] ? __pfx_kthread+0x10/0x10
> >> [ 3029.668885] ret_from_fork+0x34/0x50
> >> [ 3029.668892] ? __pfx_kthread+0x10/0x10
> >> [ 3029.668898] ret_from_fork_asm+0x1a/0x30
> >> [ 3029.668910] </TASK>
> >>
> >> Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
> >>
> >> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
> >
> > What commit id does this fix? Should it go to stable kernel(s)? If so,
> > how far back?
> This patch fixes the dma_alloc_coherent() failure that occurs when there is
> memory pressure and the kernel is unable to allocate memory. It's not a bug in
> the allocation API or the driver. I think it should be considered an
> improvement rather than a fix. Please correct me if I'm wrong.
You show a kernel crash in the changelog, that's a major issue (i.e.
will get assigned a CVE), so you need to show what commit id it fixes
for people to know how far back to take the fix to.
thanks,
greg k-h