linux-arm-msm.vger.kernel.org archive mirror
* [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles
@ 2025-07-15 13:25 Muhammad Usama Anjum
  2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-15 13:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Jeff Hugo, Muhammad Usama Anjum,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel

When there is memory pressure during resume and no DMA memory is
available, the ath11k driver fails to resume. The driver currently
frees its DMA memory during suspend or hibernate, and attempts to
re-allocate it during resume. However, if the DMA memory has been
consumed by other software in the meantime, these allocations can
fail, leading to critical failures in the WiFi driver. It has been
reported [1].

Although I have recently fixed several instances [2] [3] to ensure
DMA memory is not freed once allocated, we continue to receive
reports of new failures.

In this series, 3 more such cases are being fixed. There are still
some cases which I'm trying to fix. They can be discussed separately.

[1] https://lore.kernel.org/all/ead32f5b-730a-4b81-b38f-93d822f990c6@collabora.com
[2] https://lore.kernel.org/all/20250428080242.466901-1-usama.anjum@collabora.com
[3] https://lore.kernel.org/all/20250516184952.878726-1-usama.anjum@collabora.com

Muhammad Usama Anjum (3):
  bus: mhi: host: keep bhi buffer through suspend cycle
  bus: mhi: host: keep bhie buffer through suspend cycle
  bus: mhi: keep device context through suspend cycles

 drivers/bus/mhi/host/boot.c     | 44 ++++++++++++++++++++-------------
 drivers/bus/mhi/host/init.c     | 41 ++++++++++++++++++++++++++----
 drivers/bus/mhi/host/internal.h |  2 ++
 include/linux/mhi.h             |  2 ++
 4 files changed, 67 insertions(+), 22 deletions(-)

-- 
2.39.5



* [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle
  2025-07-15 13:25 [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Muhammad Usama Anjum
@ 2025-07-15 13:25 ` Muhammad Usama Anjum
  2025-07-16  3:41   ` Baochen Qiang
  2025-07-16  9:34   ` Greg Kroah-Hartman
  2025-07-15 13:25 ` [PATCH v2 2/3] bus: mhi: host: keep bhie " Muhammad Usama Anjum
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-15 13:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Jeff Hugo, Muhammad Usama Anjum,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel, stable

When there is memory pressure, dma_alloc_coherent() returns an error at
resume time, which in turn fails the loading of firmware, and hence the
driver crashes:

kernel: kworker/u33:5: page allocation failure: order:7,
mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
kernel: Call Trace:
kernel:  <TASK>
kernel:  dump_stack_lvl+0x4e/0x70
kernel:  warn_alloc+0x164/0x190
kernel:  ? srso_return_thunk+0x5/0x5f
kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
kernel:  __alloc_pages_noprof+0x321/0x350
kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
kernel:  dma_direct_alloc+0x70/0x270
kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
kernel:  ? srso_return_thunk+0x5/0x5f
kernel:  process_one_work+0x17e/0x330
kernel:  worker_thread+0x2ce/0x3f0
kernel:  ? __pfx_worker_thread+0x10/0x10
kernel:  kthread+0xd2/0x100
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork+0x34/0x50
kernel:  ? __pfx_kthread+0x10/0x10
kernel:  ret_from_fork_asm+0x1a/0x30
kernel:  </TASK>
kernel: Mem-Info:
kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
    active_file:359315 inactive_file:2487001 isolated_file:0
    unevictable:637 dirty:19 writeback:0
    slab_reclaimable:160391 slab_unreclaimable:39729
    mapped:175836 shmem:51039 pagetables:4415
    sec_pagetables:0 bounce:0
    kernel_misc_reclaimable:0
    free:125666 free_pcp:0 free_cma:0

In the above example, summing all the consumed memory gives 15.5GB,
and free memory is ~500MB out of a total of 16GB of RAM. So memory is
still present, but all of the DMA memory has been exhausted or
fragmented.
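
As a rough cross-check of these figures, the page counts in the Mem-Info
dump can be converted to bytes. The following is a minimal sketch that
assumes 4 KiB pages (the dump does not state the page size); the helper
names are hypothetical and not part of this patch. Several counters
overlap (for example, mapped and dirty pages are also counted as anon or
file pages), so the sum is an upper bound rather than an exact figure.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the Mem-Info dump does not state the page size. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Convert a Mem-Info page count to bytes. */
static uint64_t pages_to_bytes(uint64_t pages)
{
	return pages * SKETCH_PAGE_SIZE;
}

/* Size of one physically contiguous allocation of the given order. */
static uint64_t order_to_bytes(unsigned int order)
{
	return SKETCH_PAGE_SIZE << order;
}

/* Sum of every non-zero, non-free counter printed in the dump above. */
static uint64_t consumed_pages(void)
{
	static const uint64_t counters[] = {
		513809,		/* active_anon */
		152,		/* inactive_anon */
		359315,		/* active_file */
		2487001,	/* inactive_file */
		637,		/* unevictable */
		19,		/* dirty */
		160391,		/* slab_reclaimable */
		39729,		/* slab_unreclaimable */
		175836,		/* mapped */
		51039,		/* shmem */
		4415,		/* pagetables */
	};
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < sizeof(counters) / sizeof(counters[0]); i++)
		sum += counters[i];

	return sum;
}
```

Under that assumption, pages_to_bytes(consumed_pages()) comes to about
15.5GB, pages_to_bytes(125666) for the free pages is roughly 515MB, and
the failing order:7 request needs order_to_bytes(7) = 512KiB of
physically contiguous memory.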

Fix it by allocating the buffer only once and then reusing it. Because
the memory is allocated only once, it stays allocated across suspend
cycles.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6

Fixes: cd457afb1667 ("bus: mhi: core: Add support for downloading firmware over BHIe")
Cc: stable@vger.kernel.org
Cc: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Reported here:
https://lore.kernel.org/all/ead32f5b-730a-4b81-b38f-93d822f990c6@collabora.com

A lot more fixes are still required. Hence, I'm not adding a Closes tag.
Changes since v1:
- fix mhi_load_image_bhi()
- Cc stable and fix tested on tag
---
 drivers/bus/mhi/host/boot.c     | 25 +++++++++++++++----------
 drivers/bus/mhi/host/init.c     |  5 +++++
 drivers/bus/mhi/host/internal.h |  2 ++
 include/linux/mhi.h             |  1 +
 4 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
index b3a85aa3c4768..9fc983bc12d49 100644
--- a/drivers/bus/mhi/host/boot.c
+++ b/drivers/bus/mhi/host/boot.c
@@ -302,8 +302,8 @@ static int mhi_fw_load_bhi(struct mhi_controller *mhi_cntrl,
 	return -EIO;
 }
 
-static void mhi_free_bhi_buffer(struct mhi_controller *mhi_cntrl,
-				struct image_info *image_info)
+void mhi_free_bhi_buffer(struct mhi_controller *mhi_cntrl,
+			 struct image_info *image_info)
 {
 	struct mhi_buf *mhi_buf = image_info->mhi_buf;
 
@@ -455,18 +455,23 @@ static enum mhi_fw_load_type mhi_fw_load_type_get(const struct mhi_controller *m
 
 static int mhi_load_image_bhi(struct mhi_controller *mhi_cntrl, const u8 *fw_data, size_t size)
 {
-	struct image_info *image;
+	struct image_info **image = &mhi_cntrl->bhi_image;
 	int ret;
 
-	ret = mhi_alloc_bhi_buffer(mhi_cntrl, &image, size);
-	if (ret)
-		return ret;
+	if (!(*image)) {
+		ret = mhi_alloc_bhi_buffer(mhi_cntrl, image, size);
+		if (ret)
+			return ret;
 
-	/* Load the firmware into BHI vec table */
-	memcpy(image->mhi_buf->buf, fw_data, size);
+		/* Load the firmware into BHI vec table */
+		memcpy((*image)->mhi_buf->buf, fw_data, size);
+	}
 
-	ret = mhi_fw_load_bhi(mhi_cntrl, &image->mhi_buf[image->entries - 1]);
-	mhi_free_bhi_buffer(mhi_cntrl, image);
+	ret = mhi_fw_load_bhi(mhi_cntrl, &(*image)->mhi_buf[(*image)->entries - 1]);
+	if (ret) {
+		mhi_free_bhi_buffer(mhi_cntrl, *image);
+		*image = NULL;
+	}
 
 	return ret;
 }
diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 6e06e4efec765..2e0f18c939e68 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -1228,6 +1228,11 @@ void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl)
 		mhi_cntrl->rddm_image = NULL;
 	}
 
+	if (mhi_cntrl->bhi_image) {
+		mhi_free_bhi_buffer(mhi_cntrl, mhi_cntrl->bhi_image);
+		mhi_cntrl->bhi_image = NULL;
+	}
+
 	mhi_deinit_dev_ctxt(mhi_cntrl);
 }
 EXPORT_SYMBOL_GPL(mhi_unprepare_after_power_down);
diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
index 1054e67bb450d..60b0699323375 100644
--- a/drivers/bus/mhi/host/internal.h
+++ b/drivers/bus/mhi/host/internal.h
@@ -324,6 +324,8 @@ int mhi_alloc_bhie_table(struct mhi_controller *mhi_cntrl,
 			 struct image_info **image_info, size_t alloc_size);
 void mhi_free_bhie_table(struct mhi_controller *mhi_cntrl,
 			 struct image_info *image_info);
+void mhi_free_bhi_buffer(struct mhi_controller *mhi_cntrl,
+			 struct image_info *image_info);
 
 /* Power management APIs */
 enum mhi_pm_state __must_check mhi_tryset_pm_state(
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index 4c567907933a5..593012f779d97 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -391,6 +391,7 @@ struct mhi_controller {
 	size_t reg_len;
 	struct image_info *fbc_image;
 	struct image_info *rddm_image;
+	struct image_info *bhi_image;
 	struct mhi_chan *mhi_chan;
 	struct list_head lpm_chans;
 	int *irq;
-- 
2.39.5
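
The change to mhi_load_image_bhi() in the diff above follows an
allocate-once-and-reuse pattern. The following user-space sketch shows
the same control flow under stated assumptions: malloc()/free() stand in
for the DMA allocation helpers, and struct fw_cache, load_image_cached(),
cache_teardown() and the two transfer stubs are hypothetical names, not
MHI APIs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the cached image pointer in the controller. */
struct fw_cache {
	unsigned char *buf;	/* NULL until the first successful allocation */
	size_t size;
};

/* Hypothetical stand-in for the transfer step (mhi_fw_load_bhi() in the patch). */
typedef int (*transfer_fn)(const unsigned char *buf, size_t size);

/* Example stubs: one transfer that succeeds, one that fails. */
static int transfer_ok(const unsigned char *buf, size_t size)
{
	(void)buf;
	(void)size;
	return 0;
}

static int transfer_fail(const unsigned char *buf, size_t size)
{
	(void)buf;
	(void)size;
	return -1;
}

static int load_image_cached(struct fw_cache *c, const unsigned char *fw,
			     size_t size, transfer_fn transfer)
{
	int ret;

	if (!c->buf) {
		/* First call: allocate and fill the buffer. */
		c->buf = malloc(size);	/* stands in for the DMA allocation */
		if (!c->buf)
			return -1;
		memcpy(c->buf, fw, size);
		c->size = size;
	}

	/* Later calls reuse the cached buffer without touching the allocator. */
	ret = transfer(c->buf, c->size);
	if (ret) {
		/* On failure, drop the cache so that the next call starts clean. */
		free(c->buf);
		c->buf = NULL;
		c->size = 0;
	}

	return ret;
}

/* Final teardown, analogous to the free added in mhi_unprepare_after_power_down(). */
static void cache_teardown(struct fw_cache *c)
{
	free(c->buf);
	c->buf = NULL;
	c->size = 0;
}
```

As in the diff, a second call ignores the fw and size arguments once the
buffer exists; the sketch mirrors that behaviour rather than adding a
size check.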



* [PATCH v2 2/3] bus: mhi: host: keep bhie buffer through suspend cycle
  2025-07-15 13:25 [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Muhammad Usama Anjum
  2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
@ 2025-07-15 13:25 ` Muhammad Usama Anjum
  2025-07-16  9:35   ` Greg Kroah-Hartman
  2025-07-15 13:25 ` [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles Muhammad Usama Anjum
  2025-07-16  3:36 ` [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Baochen Qiang
  3 siblings, 1 reply; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-15 13:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Jeff Hugo, Muhammad Usama Anjum,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel, stable

When there is memory pressure, dma_alloc_coherent() returns an error at
resume time, which in turn fails the loading of the firmware, and hence
the driver crashes.

Fix it by allocating the buffer only once and then reusing it. Because
the memory is allocated only once, it stays allocated across suspend
cycles.

Fixes: f88f1d0998ea ("bus: mhi: host: Add a policy to enable image transfer via BHIe in PBL")
Cc: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Added in v2
---
 drivers/bus/mhi/host/boot.c | 19 ++++++++++++-------
 drivers/bus/mhi/host/init.c |  5 +++++
 include/linux/mhi.h         |  1 +
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c
index 9fc983bc12d49..9f35ce9d670e7 100644
--- a/drivers/bus/mhi/host/boot.c
+++ b/drivers/bus/mhi/host/boot.c
@@ -478,17 +478,22 @@ static int mhi_load_image_bhi(struct mhi_controller *mhi_cntrl, const u8 *fw_dat
 
 static int mhi_load_image_bhie(struct mhi_controller *mhi_cntrl, const u8 *fw_data, size_t size)
 {
-	struct image_info *image;
+	struct image_info **image = &mhi_cntrl->bhie_image;
 	int ret;
 
-	ret = mhi_alloc_bhie_table(mhi_cntrl, &image, size);
-	if (ret)
-		return ret;
+	if (!(*image)) {
+		ret = mhi_alloc_bhie_table(mhi_cntrl, image, size);
+		if (ret)
+			return ret;
 
-	mhi_firmware_copy_bhie(mhi_cntrl, fw_data, size, image);
+		mhi_firmware_copy_bhie(mhi_cntrl, fw_data, size, *image);
+	}
 
-	ret = mhi_fw_load_bhie(mhi_cntrl, &image->mhi_buf[image->entries - 1]);
-	mhi_free_bhie_table(mhi_cntrl, image);
+	ret = mhi_fw_load_bhie(mhi_cntrl, &(*image)->mhi_buf[(*image)->entries - 1]);
+	if (ret) {
+		mhi_free_bhie_table(mhi_cntrl, *image);
+		*image = NULL;
+	}
 
 	return ret;
 }
diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 2e0f18c939e68..46ed0ae2ac285 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -1228,6 +1228,11 @@ void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl)
 		mhi_cntrl->rddm_image = NULL;
 	}
 
+	if (mhi_cntrl->bhie_image) {
+		mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->bhie_image);
+		mhi_cntrl->bhie_image = NULL;
+	}
+
 	if (mhi_cntrl->bhi_image) {
 		mhi_free_bhi_buffer(mhi_cntrl, mhi_cntrl->bhi_image);
 		mhi_cntrl->bhi_image = NULL;
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index 593012f779d97..77986cd66fda3 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -391,6 +391,7 @@ struct mhi_controller {
 	size_t reg_len;
 	struct image_info *fbc_image;
 	struct image_info *rddm_image;
+	struct image_info *bhie_image;
 	struct image_info *bhi_image;
 	struct mhi_chan *mhi_chan;
 	struct list_head lpm_chans;
-- 
2.39.5



* [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles
  2025-07-15 13:25 [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Muhammad Usama Anjum
  2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
  2025-07-15 13:25 ` [PATCH v2 2/3] bus: mhi: host: keep bhie " Muhammad Usama Anjum
@ 2025-07-15 13:25 ` Muhammad Usama Anjum
  2025-07-16  9:35   ` Greg Kroah-Hartman
  2025-07-16  3:36 ` [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Baochen Qiang
  3 siblings, 1 reply; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-15 13:25 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Jeff Hugo, Muhammad Usama Anjum,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel, stable

Don't deinitialize the device context when going into suspend or
hibernation. Otherwise, resume may fail if memory pressure is high at
resume time and no DMA memory is available. At resume, only reset the
read/write ring pointers, as the rings may hold stale data.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6

Fixes: 3000f85b8f47 ("bus: mhi: core: Add support for basic PM operations")
Cc: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
---
Changes since v1:
- Cc stable and fix tested on tag
- Add logic to reset rings at resume
---
 drivers/bus/mhi/host/init.c | 31 ++++++++++++++++++++++++++-----
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 46ed0ae2ac285..6513012311ce3 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -474,6 +474,24 @@ static int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 	return ret;
 }
 
+static void mhi_reset_dev_rings(struct mhi_controller *mhi_cntrl)
+{
+	struct mhi_event *mhi_event = mhi_cntrl->mhi_event;
+	struct mhi_cmd *mhi_cmd = mhi_cntrl->mhi_cmd;
+	struct mhi_ring *ring;
+	int i;
+
+	for (i = 0; i < mhi_cntrl->total_ev_rings; i++, mhi_event++) {
+		ring = &mhi_event->ring;
+		ring->rp = ring->wp = ring->base;
+	}
+
+	for (i = 0; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
+		ring = &mhi_cmd->ring;
+		ring->rp = ring->wp = ring->base;
+	}
+}
+
 int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
 {
 	u32 val;
@@ -1133,9 +1151,13 @@ int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl)
 
 	mutex_lock(&mhi_cntrl->pm_mutex);
 
-	ret = mhi_init_dev_ctxt(mhi_cntrl);
-	if (ret)
-		goto error_dev_ctxt;
+	if (!mhi_cntrl->mhi_ctxt) {
+		ret = mhi_init_dev_ctxt(mhi_cntrl);
+		if (ret)
+			goto error_dev_ctxt;
+	} else {
+		mhi_reset_dev_rings(mhi_cntrl);
+	}
 
 	ret = mhi_read_reg(mhi_cntrl, mhi_cntrl->regs, BHIOFF, &bhi_off);
 	if (ret) {
@@ -1212,8 +1234,6 @@ void mhi_deinit_dev_ctxt(struct mhi_controller *mhi_cntrl)
 {
 	mhi_cntrl->bhi = NULL;
 	mhi_cntrl->bhie = NULL;
-
-	__mhi_deinit_dev_ctxt(mhi_cntrl);
 }
 
 void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl)
@@ -1239,6 +1259,7 @@ void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl)
 	}
 
 	mhi_deinit_dev_ctxt(mhi_cntrl);
+	__mhi_deinit_dev_ctxt(mhi_cntrl);
 }
 EXPORT_SYMBOL_GPL(mhi_unprepare_after_power_down);
 
-- 
2.39.5
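
The ring reset added by mhi_reset_dev_rings() in the diff above can be
illustrated in isolation. This is a minimal user-space sketch:
struct sketch_ring is a hypothetical, simplified stand-in that keeps only
the base, rp and wp pointers used by the patch, and the function names
are not MHI APIs.

```c
#include <stddef.h>

/* Hypothetical, simplified ring; only the fields touched by the reset. */
struct sketch_ring {
	unsigned char *base;	/* start of the ring memory */
	unsigned char *rp;	/* read pointer */
	unsigned char *wp;	/* write pointer */
};

/* A ring whose read and write pointers coincide holds no pending elements. */
static int sketch_ring_is_empty(const struct sketch_ring *ring)
{
	return ring->rp == ring->wp;
}

/* Rewind one ring to its base without freeing or reallocating its memory. */
static void sketch_ring_reset(struct sketch_ring *ring)
{
	ring->rp = ring->wp = ring->base;
}

/* Rewind every ring in an array, as the patch does for event and command rings. */
static void sketch_reset_rings(struct sketch_ring *rings, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		sketch_ring_reset(&rings[i]);
}
```

The reset only rewinds the pointers; the ring memory itself stays
allocated and its old contents are not cleared.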



* Re: [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles
  2025-07-15 13:25 [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Muhammad Usama Anjum
                   ` (2 preceding siblings ...)
  2025-07-15 13:25 ` [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles Muhammad Usama Anjum
@ 2025-07-16  3:36 ` Baochen Qiang
  2025-07-17 10:01   ` Muhammad Usama Anjum
  3 siblings, 1 reply; 12+ messages in thread
From: Baochen Qiang @ 2025-07-16  3:36 UTC (permalink / raw)
  To: Muhammad Usama Anjum, Manivannan Sadhasivam, Jeff Hugo,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel



On 7/15/2025 9:25 PM, Muhammad Usama Anjum wrote:
> When there is memory pressure during resume and no DMA memory is
> available, the ath11k driver fails to resume. The driver currently
> frees its DMA memory during suspend or hibernate, and attempts to
> re-allocate it during resume. However, if the DMA memory has been
> consumed by other software in the meantime, these allocations can
> fail, leading to critical failures in the WiFi driver. It has been
> reported [1].
> 
> Although I have recently fixed several instances [2] [3] to ensure
> DMA memory is not freed once allocated, we continue to receive
> reports of new failures.
> 
> In this series, 3 more such cases are being fixed. There are still
> some cases which I'm trying to fix. They can be discussed separately.
> 
> [1] https://lore.kernel.org/all/ead32f5b-730a-4b81-b38f-93d822f990c6@collabora.com
> [2] https://lore.kernel.org/all/20250428080242.466901-1-usama.anjum@collabora.com
> [3] https://lore.kernel.org/all/20250516184952.878726-1-usama.anjum@collabora.com
> 
> Muhammad Usama Anjum (3):
>   bus: mhi: host: keep bhi buffer through suspend cycle
>   bus: mhi: host: keep bhie buffer through suspend cycle
>   bus: mhi: keep device context through suspend cycles
> 
>  drivers/bus/mhi/host/boot.c     | 44 ++++++++++++++++++++-------------
>  drivers/bus/mhi/host/init.c     | 41 ++++++++++++++++++++++++++----
>  drivers/bus/mhi/host/internal.h |  2 ++
>  include/linux/mhi.h             |  2 ++
>  4 files changed, 67 insertions(+), 22 deletions(-)
> 

changelog missing



* Re: [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle
  2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
@ 2025-07-16  3:41   ` Baochen Qiang
  2025-07-16  9:34   ` Greg Kroah-Hartman
  1 sibling, 0 replies; 12+ messages in thread
From: Baochen Qiang @ 2025-07-16  3:41 UTC (permalink / raw)
  To: Muhammad Usama Anjum, Manivannan Sadhasivam, Jeff Hugo,
	Youssef Samir, Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel, stable



On 7/15/2025 9:25 PM, Muhammad Usama Anjum wrote:
> When there is memory pressure, dma_alloc_coherent() returns an error at
> resume time, which in turn fails the loading of firmware, and hence the
> driver crashes:
> 
> kernel: kworker/u33:5: page allocation failure: order:7,
> mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
> kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> kernel: Call Trace:
> kernel:  <TASK>
> kernel:  dump_stack_lvl+0x4e/0x70
> kernel:  warn_alloc+0x164/0x190
> kernel:  ? srso_return_thunk+0x5/0x5f
> kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
> kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> kernel:  __alloc_pages_noprof+0x321/0x350
> kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
> kernel:  dma_direct_alloc+0x70/0x270
> kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> kernel:  ? srso_return_thunk+0x5/0x5f
> kernel:  process_one_work+0x17e/0x330
> kernel:  worker_thread+0x2ce/0x3f0
> kernel:  ? __pfx_worker_thread+0x10/0x10
> kernel:  kthread+0xd2/0x100
> kernel:  ? __pfx_kthread+0x10/0x10
> kernel:  ret_from_fork+0x34/0x50
> kernel:  ? __pfx_kthread+0x10/0x10
> kernel:  ret_from_fork_asm+0x1a/0x30
> kernel:  </TASK>
> kernel: Mem-Info:
> kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
>     active_file:359315 inactive_file:2487001 isolated_file:0
>     unevictable:637 dirty:19 writeback:0
>     slab_reclaimable:160391 slab_unreclaimable:39729
>     mapped:175836 shmem:51039 pagetables:4415
>     sec_pagetables:0 bounce:0
>     kernel_misc_reclaimable:0
>     free:125666 free_pcp:0 free_cma:0
> 
> In the above example, summing all the consumed memory gives 15.5GB,
> and free memory is ~500MB out of a total of 16GB of RAM. So memory is
> still present, but all of the DMA memory has been exhausted or
> fragmented.
> 
> Fix it by allocating the buffer only once and then reusing it. Because
> the memory is allocated only once, it stays allocated across suspend
> cycles.

The BHI buffer is not needed anymore once the initial firmware has been loaded. So IMO we
cannot keep it just for the purpose of avoiding OOM in the future.



* Re: [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle
  2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
  2025-07-16  3:41   ` Baochen Qiang
@ 2025-07-16  9:34   ` Greg Kroah-Hartman
  2025-07-17 10:00     ` Muhammad Usama Anjum
  1 sibling, 1 reply; 12+ messages in thread
From: Greg Kroah-Hartman @ 2025-07-16  9:34 UTC (permalink / raw)
  To: Muhammad Usama Anjum
  Cc: Manivannan Sadhasivam, Jeff Hugo, Youssef Samir, Matthew Leung,
	Alexander Wilhelm, Kunwu Chan, Krishna Chaitanya Chundru,
	Jacek Lawrynowicz, Yan Zhen, Sujeev Dias, Siddartha Mohanadoss,
	mhi, linux-arm-msm, linux-kernel, kernel, stable

On Tue, Jul 15, 2025 at 06:25:07PM +0500, Muhammad Usama Anjum wrote:
> When there is memory pressure, dma_alloc_coherent() returns an error at
> resume time, which in turn fails the loading of firmware, and hence the
> driver crashes:
> 
> kernel: kworker/u33:5: page allocation failure: order:7,
> mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
> kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> kernel: Call Trace:
> kernel:  <TASK>
> kernel:  dump_stack_lvl+0x4e/0x70
> kernel:  warn_alloc+0x164/0x190
> kernel:  ? srso_return_thunk+0x5/0x5f
> kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
> kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> kernel:  __alloc_pages_noprof+0x321/0x350
> kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
> kernel:  dma_direct_alloc+0x70/0x270
> kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> kernel:  ? srso_return_thunk+0x5/0x5f
> kernel:  process_one_work+0x17e/0x330
> kernel:  worker_thread+0x2ce/0x3f0
> kernel:  ? __pfx_worker_thread+0x10/0x10
> kernel:  kthread+0xd2/0x100
> kernel:  ? __pfx_kthread+0x10/0x10
> kernel:  ret_from_fork+0x34/0x50
> kernel:  ? __pfx_kthread+0x10/0x10
> kernel:  ret_from_fork_asm+0x1a/0x30
> kernel:  </TASK>
> kernel: Mem-Info:
> kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
>     active_file:359315 inactive_file:2487001 isolated_file:0
>     unevictable:637 dirty:19 writeback:0
>     slab_reclaimable:160391 slab_unreclaimable:39729
>     mapped:175836 shmem:51039 pagetables:4415
>     sec_pagetables:0 bounce:0
>     kernel_misc_reclaimable:0
>     free:125666 free_pcp:0 free_cma:0

This is not a "crash", it is a warning that your huge memory allocation
did not succeed.  Properly handle this issue (and if you know it's going
to happen, turn the warning off in your allocation), and you should be
fine.

> In the above example, summing all the consumed memory gives 15.5GB,
> and free memory is ~500MB out of a total of 16GB of RAM. So memory is
> still present, but all of the DMA memory has been exhausted or
> fragmented.

What caused that to happen?

> Fix it by allocating the buffer only once and then reusing it. Because
> the memory is allocated only once, it stays allocated across suspend
> cycles.

As others said, no, don't consume memory for no good reason, that just
means that other devices will fail more frequently.  If all
devices/drivers did this, you wouldn't have memory to work either.

thanks,

greg k-h


* Re: [PATCH v2 2/3] bus: mhi: host: keep bhie buffer through suspend cycle
  2025-07-15 13:25 ` [PATCH v2 2/3] bus: mhi: host: keep bhie " Muhammad Usama Anjum
@ 2025-07-16  9:35   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2025-07-16  9:35 UTC (permalink / raw)
  To: Muhammad Usama Anjum
  Cc: Manivannan Sadhasivam, Jeff Hugo, Youssef Samir, Matthew Leung,
	Alexander Wilhelm, Kunwu Chan, Krishna Chaitanya Chundru,
	Jacek Lawrynowicz, Yan Zhen, Sujeev Dias, Siddartha Mohanadoss,
	mhi, linux-arm-msm, linux-kernel, kernel, stable

On Tue, Jul 15, 2025 at 06:25:08PM +0500, Muhammad Usama Anjum wrote:
> When there is memory pressure, dma_alloc_coherent() returns an error at
> resume time, which in turn fails the loading of the firmware, and hence
> the driver crashes.
> 
> Fix it by allocating the buffer only once and then reusing it. Because
> the memory is allocated only once, it stays allocated across suspend
> cycles.

Again, no, do not waste memory like this.  If all drivers/devices did
this that would be horrible.

greg k-h


* Re: [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles
  2025-07-15 13:25 ` [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles Muhammad Usama Anjum
@ 2025-07-16  9:35   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2025-07-16  9:35 UTC (permalink / raw)
  To: Muhammad Usama Anjum
  Cc: Manivannan Sadhasivam, Jeff Hugo, Youssef Samir, Matthew Leung,
	Alexander Wilhelm, Kunwu Chan, Krishna Chaitanya Chundru,
	Jacek Lawrynowicz, Yan Zhen, Sujeev Dias, Siddartha Mohanadoss,
	mhi, linux-arm-msm, linux-kernel, kernel, stable

On Tue, Jul 15, 2025 at 06:25:09PM +0500, Muhammad Usama Anjum wrote:
> Don't deinitialize the device context when going into suspend or
> hibernation. Otherwise, resume may fail if memory pressure is high at
> resume time and no DMA memory is available. At resume, only reset the
> read/write ring pointers, as the rings may hold stale data.


Same here, don't do this.

thanks,

greg k-h


* Re: [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle
  2025-07-16  9:34   ` Greg Kroah-Hartman
@ 2025-07-17 10:00     ` Muhammad Usama Anjum
  2025-07-17 11:50       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-17 10:00 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Manivannan Sadhasivam, Jeff Hugo, Youssef Samir, Matthew Leung,
	Alexander Wilhelm, Kunwu Chan, Krishna Chaitanya Chundru,
	Jacek Lawrynowicz, Yan Zhen, Sujeev Dias, Siddartha Mohanadoss,
	mhi, linux-arm-msm, linux-kernel, kernel, stable

Hi Greg,

On 7/16/25 2:34 PM, Greg Kroah-Hartman wrote:
> On Tue, Jul 15, 2025 at 06:25:07PM +0500, Muhammad Usama Anjum wrote:
>> When there is memory pressure, dma_alloc_coherent() returns an error at
>> resume time, which in turn fails the loading of firmware, and hence the
>> driver crashes:
>>
>> kernel: kworker/u33:5: page allocation failure: order:7,
>> mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
>> kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
>> kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
>> kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
>> kernel: Call Trace:
>> kernel:  <TASK>
>> kernel:  dump_stack_lvl+0x4e/0x70
>> kernel:  warn_alloc+0x164/0x190
>> kernel:  ? srso_return_thunk+0x5/0x5f
>> kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
>> kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
>> kernel:  __alloc_pages_noprof+0x321/0x350
>> kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
>> kernel:  dma_direct_alloc+0x70/0x270
>> kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
>> kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
>> kernel:  ? srso_return_thunk+0x5/0x5f
>> kernel:  process_one_work+0x17e/0x330
>> kernel:  worker_thread+0x2ce/0x3f0
>> kernel:  ? __pfx_worker_thread+0x10/0x10
>> kernel:  kthread+0xd2/0x100
>> kernel:  ? __pfx_kthread+0x10/0x10
>> kernel:  ret_from_fork+0x34/0x50
>> kernel:  ? __pfx_kthread+0x10/0x10
>> kernel:  ret_from_fork_asm+0x1a/0x30
>> kernel:  </TASK>
>> kernel: Mem-Info:
>> kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
>>     active_file:359315 inactive_file:2487001 isolated_file:0
>>     unevictable:637 dirty:19 writeback:0
>>     slab_reclaimable:160391 slab_unreclaimable:39729
>>     mapped:175836 shmem:51039 pagetables:4415
>>     sec_pagetables:0 bounce:0
>>     kernel_misc_reclaimable:0
>>     free:125666 free_pcp:0 free_cma:0
> 
> This is not a "crash", it is a warning that your huge memory allocation
> did not succeed.  Properly handle this issue (and if you know it's going
> to happen, turn the warning off in your allocation), and you should be
> fine.
Yes, the system is fine. But wifi/sound drivers fail to reinitialize.

> 
>> In the above example, summing all the consumed memory gives 15.5GB,
>> and free memory is ~500MB out of a total of 16GB of RAM. So memory is
>> still present, but all of the DMA memory has been exhausted or
>> fragmented.
> 
> What caused that to happen?
Excessive use of the page cache occurs when user-space applications open
and consume large amounts of file system memory, even if those files are
no longer being actively read. I haven't found any documentation on limiting
the size of the page cache or preventing it from occupying DMA-capable
memory—perhaps the MM developers can provide more insight.

I can reproduce this issue by running stress tests that create and
sequentially read files. On a system with 16GB of RAM, the page cache can
easily grow to 10–12GB. Since the kernel manages the page cache, it's unclear
why it doesn't reclaim inactive cache more aggressively.
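
As a rough cross-check of the numbers in the log above, the sketch below converts the Mem-Info page counters into sizes. It assumes a 4 KiB page size, which the log does not state, and the helper names are made up for illustration. It only shows the arithmetic: roughly how much memory is free, how much sits in file-backed page cache, and how large a single order:7 request is.

```python
# Rough cross-check of the Mem-Info figures quoted in this thread.
# ASSUMPTION: 4 KiB pages (typical for x86-64; the log does not state it).
PAGE_SIZE = 4096

MIB = 1 << 20
GIB = 1 << 30


def pages_to_bytes(pages: int) -> int:
    """Convert a page count into bytes under the assumed page size."""
    return pages * PAGE_SIZE


def order_bytes(order: int) -> int:
    """Size of one physically contiguous allocation of the given order."""
    return pages_to_bytes(1 << order)


# Counters copied from the Mem-Info dump (units: pages).
free_pages = 125666
active_file = 359315
inactive_file = 2487001

free_mib = pages_to_bytes(free_pages) / MIB
file_cache_gib = pages_to_bytes(active_file + inactive_file) / GIB
request_kib = order_bytes(7) / 1024

print(f"free memory:        {free_mib:.0f} MiB")
print(f"file page cache:    {file_cache_gib:.2f} GiB")
print(f"one order:7 request: {request_kib:.0f} KiB, physically contiguous")
```

Under that assumption the failing request is small compared with the free memory, so the failure is consistent with fragmentation (or a lack of suitable GFP_DMA32 memory) rather than with total exhaustion.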

> 
>> Fix it by allocating it only once and then reuse the same allocated
>> memory. As we'll allocate this memory only once, this memory will stay
>> allocated.
> 
> As others said, no, don't consume memory for no good reason, that just
> means that other devices will fail more frequently.  If all
> devices/drivers did this, you wouldn't have memory to work either.
Makes sense.

> 
> thanks,
> 
> greg k-h



* Re: [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles
  2025-07-16  3:36 ` [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Baochen Qiang
@ 2025-07-17 10:01   ` Muhammad Usama Anjum
  0 siblings, 0 replies; 12+ messages in thread
From: Muhammad Usama Anjum @ 2025-07-17 10:01 UTC (permalink / raw)
  To: Baochen Qiang, Manivannan Sadhasivam, Jeff Hugo, Youssef Samir,
	Matthew Leung, Alexander Wilhelm, Kunwu Chan,
	Krishna Chaitanya Chundru, Jacek Lawrynowicz, Yan Zhen,
	Sujeev Dias, Greg Kroah-Hartman, Siddartha Mohanadoss, mhi,
	linux-arm-msm, linux-kernel
  Cc: kernel

On 7/16/25 8:36 AM, Baochen Qiang wrote:
> 
> 
> On 7/15/2025 9:25 PM, Muhammad Usama Anjum wrote:
>> When there is memory pressure during resume and no DMA memory is
>> available, the ath11k driver fails to resume. The driver currently
>> frees its DMA memory during suspend or hibernate, and attempts to
>> re-allocate it during resume. However, if the DMA memory has been
>> consumed by other software in the meantime, these allocations can
>> fail, leading to critical failures in the WiFi driver. It has been
>> reported [1].
>>
>> Although I have recently fixed several instances [2] [3] to ensure
>> DMA memory is not freed once allocated, we continue to receive
>> reports of new failures.
>>
>> In this series, 3 more such cases are being fixed. There are still
>> some cases which I'm trying to fix. They can be discussed separately.
>>
>> [1] https://lore.kernel.org/all/ead32f5b-730a-4b81-b38f-93d822f990c6@collabora.com
>> [2] https://lore.kernel.org/all/20250428080242.466901-1-usama.anjum@collabora.com
>> [3] https://lore.kernel.org/all/20250516184952.878726-1-usama.anjum@collabora.com
>>
>> Muhammad Usama Anjum (3):
>>   bus: mhi: host: keep bhi buffer through suspend cycle
>>   bus: mhi: host: keep bhie buffer through suspend cycle
>>   bus: mhi: keep device context through suspend cycles
>>
>>  drivers/bus/mhi/host/boot.c     | 44 ++++++++++++++++++++-------------
>>  drivers/bus/mhi/host/init.c     | 41 ++++++++++++++++++++++++++----
>>  drivers/bus/mhi/host/internal.h |  2 ++
>>  include/linux/mhi.h             |  2 ++
>>  4 files changed, 67 insertions(+), 22 deletions(-)
>>
> 
> changelog missing
Sorry, I missed the changelog in the cover letter. For now, please find the
changelog in the individual patches. I'll add one to the cover letter if
there is a v3.


> 



* Re: [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle
  2025-07-17 10:00     ` Muhammad Usama Anjum
@ 2025-07-17 11:50       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2025-07-17 11:50 UTC (permalink / raw)
  To: Muhammad Usama Anjum
  Cc: Manivannan Sadhasivam, Jeff Hugo, Youssef Samir, Matthew Leung,
	Alexander Wilhelm, Kunwu Chan, Krishna Chaitanya Chundru,
	Jacek Lawrynowicz, Yan Zhen, Sujeev Dias, Siddartha Mohanadoss,
	mhi, linux-arm-msm, linux-kernel, kernel, stable

On Thu, Jul 17, 2025 at 03:00:14PM +0500, Muhammad Usama Anjum wrote:
> Hi Greg,
> 
> On 7/16/25 2:34 PM, Greg Kroah-Hartman wrote:
> > On Tue, Jul 15, 2025 at 06:25:07PM +0500, Muhammad Usama Anjum wrote:
> >> When there is memory pressure, at resume time dma_alloc_coherent()
> >> returns error which in turn fails the loading of firmware and hence
> >> the driver crashes:
> >>
> >> kernel: kworker/u33:5: page allocation failure: order:7,
> >> mode:0xc04(GFP_NOIO|GFP_DMA32), nodemask=(null),cpuset=/,mems_allowed=0
> >> kernel: CPU: 1 UID: 0 PID: 7693 Comm: kworker/u33:5 Not tainted 6.11.11-valve17-1-neptune-611-g027868a0ac03 #1 3843143b92e9da0fa2d3d5f21f51beaed15c7d59
> >> kernel: Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
> >> kernel: Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
> >> kernel: Call Trace:
> >> kernel:  <TASK>
> >> kernel:  dump_stack_lvl+0x4e/0x70
> >> kernel:  warn_alloc+0x164/0x190
> >> kernel:  ? srso_return_thunk+0x5/0x5f
> >> kernel:  ? __alloc_pages_direct_compact+0xaf/0x360
> >> kernel:  __alloc_pages_slowpath.constprop.0+0xc75/0xd70
> >> kernel:  __alloc_pages_noprof+0x321/0x350
> >> kernel:  __dma_direct_alloc_pages.isra.0+0x14a/0x290
> >> kernel:  dma_direct_alloc+0x70/0x270
> >> kernel:  mhi_fw_load_handler+0x126/0x340 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> >> kernel:  mhi_pm_st_worker+0x5e8/0xac0 [mhi a96cb91daba500cc77f86bad60c1f332dc3babdf]
> >> kernel:  ? srso_return_thunk+0x5/0x5f
> >> kernel:  process_one_work+0x17e/0x330
> >> kernel:  worker_thread+0x2ce/0x3f0
> >> kernel:  ? __pfx_worker_thread+0x10/0x10
> >> kernel:  kthread+0xd2/0x100
> >> kernel:  ? __pfx_kthread+0x10/0x10
> >> kernel:  ret_from_fork+0x34/0x50
> >> kernel:  ? __pfx_kthread+0x10/0x10
> >> kernel:  ret_from_fork_asm+0x1a/0x30
> >> kernel:  </TASK>
> >> kernel: Mem-Info:
> >> kernel: active_anon:513809 inactive_anon:152 isolated_anon:0
> >>     active_file:359315 inactive_file:2487001 isolated_file:0
> >>     unevictable:637 dirty:19 writeback:0
> >>     slab_reclaimable:160391 slab_unreclaimable:39729
> >>     mapped:175836 shmem:51039 pagetables:4415
> >>     sec_pagetables:0 bounce:0
> >>     kernel_misc_reclaimable:0
> >>     free:125666 free_pcp:0 free_cma:0
> > 
> > This is not a "crash", it is a warning that your huge memory allocation
> > did not succeed.  Properly handle this issue (and if you know it's going
> > to happen, turn the warning off in your allocation), and you should be
> > fine.
> Yes, the system is fine. But wifi/sound drivers fail to reinitialize.
> 
> > 
> >> In above example, if we sum all the consumed memory, it comes out
> >> to be 15.5GB and free memory is ~ 500MB from a total of 16GB RAM.
> >> Even though memory is present. But all of the dma memory has been
> >> exhausted or fragmented.
> > 
> > What caused that to happen?
> Excessive use of the page cache occurs when user-space applications open
> and consume large amounts of file system memory, even if those files are
> no longer being actively read. I haven't found any documentation on limiting
> the size of the page cache or preventing it from occupying DMA-capable
> memory—perhaps the MM developers can provide more insight.
> 
> I can reproduce this issue by running stress tests that create and
> sequentially read files. On a system with 16GB of RAM, the page cache can
> easily grow to 10–12GB. Since the kernel manages the page cache, it's unclear
> why it doesn't reclaim inactive cache more aggressively.

It should be reclaiming this, as it's just cache, not really used
memory.  I think something isn't tuned properly for your system, OR your
drivers are asking for way too much memory.  Either way, the correct
solution is NOT to have the drivers consume even more memory, that just
makes the overall system less useful.

good luck!

greg k-h


Thread overview: 12+ messages
2025-07-15 13:25 [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Muhammad Usama Anjum
2025-07-15 13:25 ` [PATCH v2 1/3] bus: mhi: host: keep bhi buffer through suspend cycle Muhammad Usama Anjum
2025-07-16  3:41   ` Baochen Qiang
2025-07-16  9:34   ` Greg Kroah-Hartman
2025-07-17 10:00     ` Muhammad Usama Anjum
2025-07-17 11:50       ` Greg Kroah-Hartman
2025-07-15 13:25 ` [PATCH v2 2/3] bus: mhi: host: keep bhie " Muhammad Usama Anjum
2025-07-16  9:35   ` Greg Kroah-Hartman
2025-07-15 13:25 ` [PATCH v2 3/3] bus: mhi: keep device context through suspend cycles Muhammad Usama Anjum
2025-07-16  9:35   ` Greg Kroah-Hartman
2025-07-16  3:36 ` [PATCH v2 0/3] bus: mhi: keep dma buffers through suspend/hibernation cycles Baochen Qiang
2025-07-17 10:01   ` Muhammad Usama Anjum
